This patch adds a barrier() in futex unqueue_me to avoid aliasing of two
pointers.
On my s390x system I saw the following oops:
Unable to handle kernel pointer dereference at virtual kernel address
0000000000000000
Oops: 0004 [#1]
CPU: 0 Not tainted
Process mytool (pid: 13613, task: 000000003ecb6ac0, ksp: 00000000366bdbd8)
Krnl PSW : 0704d00180000000 00000000003c9ac2 (_spin_lock+0xe/0x30)
Krnl GPRS: 00000000ffffffff 000000003ecb6ac0 0000000000000000 0700000000000000
0000000000000000 0000000000000000 000001fe00002028 00000000000c091f
000001fe00002054 000001fe00002054 0000000000000000 00000000366bddc0
00000000005ef8c0 00000000003d00e8 0000000000144f91 00000000366bdcb8
Krnl Code: ba 4e 20 00 12 44 b9 16 00 3e a7 84 00 08 e3 e0 f0 88 00 04
Call Trace:
([<0000000000144f90>] unqueue_me+0x40/0xe4)
[<0000000000145a0c>] do_futex+0x33c/0xc40
[<000000000014643e>] sys_futex+0x12e/0x144
[<000000000010bb00>] sysc_noemu+0x10/0x16
[<000002000003741c>] 0x2000003741c
The code in question is:
static int unqueue_me(struct futex_q *q)
{
int ret = 0;
spinlock_t *lock_ptr;
/* In the common case we don't take the spinlock, which is nice. */
retry:
lock_ptr = q->lock_ptr;
if (lock_ptr != 0) {
spin_lock(lock_ptr);
/*
* q->lock_ptr can change between reading it and
* spin_lock(), causing us to take the wrong lock. This
* corrects the race condition.
[...]
and my compiler (gcc 4.1.0) makes the following out of it:
00000000000003c8 <unqueue_me>:
3c8: eb bf f0 70 00 24 stmg %r11,%r15,112(%r15)
3ce: c0 d0 00 00 00 00 larl %r13,3ce <unqueue_me+0x6>
3d0: R_390_PC32DBL .rodata+0x2a
3d4: a7 f1 1e 00 tml %r15,7680
3d8: a7 84 00 01 je 3da <unqueue_me+0x12>
3dc: b9 04 00 ef lgr %r14,%r15
3e0: a7 fb ff d0 aghi %r15,-48
3e4: b9 04 00 b2 lgr %r11,%r2
3e8: e3 e0 f0 98 00 24 stg %r14,152(%r15)
3ee: e3 c0 b0 28 00 04 lg %r12,40(%r11)
/* write q->lock_ptr in r12 */
3f4: b9 02 00 cc ltgr %r12,%r12
3f8: a7 84 00 4b je 48e <unqueue_me+0xc6>
/* if r12 is zero then jump over the code.... */
3fc: e3 20 b0 28 00 04 lg %r2,40(%r11)
/* write q->lock_ptr in r2 */
402: c0 e5 00 00 00 00 brasl %r14,402 <unqueue_me+0x3a>
404: R_390_PC32DBL _spin_lock+0x2
/* use r2 as parameter for spin_lock */
So the code becomes more or less:
if (q->lock_ptr != 0) spin_lock(q->lock_ptr)
instead of
if (lock_ptr != 0) spin_lock(lock_ptr)
Which caused the oops from above.
After adding a barrier gcc creates code without this problem:
[...] (the same)
3ee: e3 c0 b0 28 00 04 lg %r12,40(%r11)
3f4: b9 02 00 cc ltgr %r12,%r12
3f8: b9 04 00 2c lgr %r2,%r12
3fc: a7 84 00 48 je 48c <unqueue_me+0xc4>
400: c0 e5 00 00 00 00 brasl %r14,400 <unqueue_me+0x38>
402: R_390_PC32DBL _spin_lock+0x2
As a general note, this code of unqueue_me seems a bit fishy. The retry logic
of unqueue_me only works if we can guarantee, that the original value of
q->lock_ptr is always a spinlock (Otherwise we overwrite kernel memory). We
know that q->lock_ptr can change. I dont know what happens with the original
spinlock, as I am not an expert with the futex code.
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@timesys.com>
Signed-off-by: Christian Borntraeger <borntrae@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We've confirmed that the debug version of write_lock() can get stuck for long
enough to cause NMI watchdog timeouts and hence a crash.
We don't know why, yet. Disable it for now.
Also disable the similar read_lock() code. Just in case.
Thanks to Dave Olson <olson@unixfolk.com> for reporting and testing.
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
With CONFIG_PCI=n:
CC drivers/edac/edac_mc.o
drivers/edac/edac_mc.c: In function ‘add_mc_to_global_list’:
drivers/edac/edac_mc.c:1362: error: implicit declaration of function ‘to_platform_device’
drivers/edac/edac_mc.c:1362: error: invalid type argument of ‘->’
drivers/edac/edac_mc.c: In function ‘edac_mc_add_mc’:
drivers/edac/edac_mc.c:1467: error: invalid type argument of ‘->’
drivers/edac/edac_mc.c: In function ‘edac_mc_del_mc’:
drivers/edac/edac_mc.c:1504: error: invalid type argument of ‘->’
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It should be possible to suspend, either to RAM or to disk, if there's a
traced process that has just reached a breakpoint. However, this is a
special case, because its parent process might have been frozen already and
then we are unable to deliver the "freeze" signal to the traced process.
If this happens, it's better to cancel the freezing of the traced process.
Ref. http://bugzilla.kernel.org/show_bug.cgi?id=6787
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from Martin Michlmayr
Fix the following compilation error in arm/mach-ixp4xx/gtwx5715-setup:
CC arch/arm/mach-ixp4xx/gtwx5715-setup.o
arch/arm/mach-ixp4xx/gtwx5715-setup.c:110: error: expected identifier or '(' before ‘=’ token
arch/arm/mach-ixp4xx/gtwx5715-setup.c:121: error: 'gtwx5715_flash_resource' undeclared here (not in a function)
arch/arm/mach-ixp4xx/gtwx5715-setup.c: In function 'gtwx5715_init':
arch/arm/mach-ixp4xx/gtwx5715-setup.c:133: error: 'flash_resource' undeclared (first use in this function)
arch/arm/mach-ixp4xx/gtwx5715-setup.c:133: error: (Each undeclared identifier is reported only once
arch/arm/mach-ixp4xx/gtwx5715-setup.c:133: error: for each function it appears in.)
make[1]: *** [arch/arm/mach-ixp4xx/gtwx5715-setup.o] Error 1
Signed-off-by: Martin Michlmayr <tbm@cyrius.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from George G. Davis
Fix "WARNING: "rtc_next_alarm_time" [drivers/rtc/rtc-sa1100.ko]
undefined!"
Signed-off-by: George G. Davis <gdavis@mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Juha [êölä
When the block queue is plugged, mq->req must be set to NULL.
Otherwise mq->req might be left non-NULL, even though mmcqd is
not processing a request, thus preventing the MMC queue thread from
being woken up when new requests do arrive.
Signed-off-by: Juha Yrjola <juha.yrjola@solidboot.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
In bug #6954, Norbert Reinartz reported the following issue:
"Function lapb_setparms() in file net/lapb/lapb_iface.c checks if the given
parameters are valid. If the given window size is in the range of 8 .. 127,
lapb_setparms() fails and returns an error value of LAPB_INVALUE, even if bit
LAPB_EXTENDED in parms->mode is set.
If bit LAPB_EXTENDED in parms->mode is set and the window size is in the range
of 8 .. 127, the first check "(parms->mode & LAPB_EXTENDED)" results true and
the second check "(parms->window < 1 || parms->window > 127)" results false.
Both checks in conjunction result to false, thus the third check "(parms->window
< 1 || parms->window > 7)" is done by fault.
This third check results true, so that we leave lapb_setparms() by 'goto out_put'.
Seems that this bug doesn't cause any problems, because lapb_setparms() isn't
used to change the default values of LAPB. We are using kernel lapb in our
software project and also change the default parameters of lapb, so we found
this bug"
He also pasted a fix, that I've transformated into a patch:
Signed-off-by: Diego Calleja <diegocg@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Whenever a transfer is application limited, we are allowed at least
initial window worth of data per window unless cwnd is previously
less than that.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Overflow can occur very easily with 32 bits, e.g., with 1 second
us_idle is approx. 2^20, which leaves only 11-Wlog bits for queue
length. Since the EWMA exponent is typically around 9, queue
lengths larger than 2^2 cause overflow. Whether the affected
branch is taken when us_idle is as high as 1 second, depends on
Scell_log, but with rather reasonable configuration Scell_log is
large enough to cause p->Stab to have zero index, which always
results zero shift (typically also few other small indices result
in zero shift).
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
The datagram interface of LLC is broken in a couple of ways.
These were discovered when trying to use it to build an out-of-kernel
version of STP.
First it didn't pass the source address of the received packet
in recvfrom(). It needs to copy the source address of received LLC packets
into the socket control block. At the same time fix a security issue
because there was uninitialized data leakage. Every recvfrom call
was just copying out old data.
Second, LLC should not merge multiple packets in one receive call
on datagram sockets. LLC should preserve packet boundaries on
SOCK_DGRAM.
This fix goes against the old historical comments about UNIX98 semantics
but without this fix SOCK_DGRAM is broken and useless. So either ANK's
interpretation was incorect or UNIX98 standard was wrong.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Acked-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix code that passes back netlink status messages about
bridge changes. Submitted by Aji_Srinivas@emc.com
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
By using milliseconds instead of jiffies to calculate acceleration
factor we make the code immune to changes in HZ.
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
When emulating button toggle drivers need to send input_sync()
between 'down' and 'up' events, otherwise some users might miss
keypress because device's state is only considered finalized
after EV_SYN/SYN_REPORT is received.
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Trackpoint driver was not sending the magic knock sequence upon resume
causing incorrect device behavior after resuming from disk.
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
MX300 does not have an EXTRA_BTN - it is a simple wheel mouse with
an additional task-switcher button, which is reported as side button
(and not task button).
Signed-off-by: Daniel Drake <dsd@gentoo.org>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
In the error path, ata_device_add()
* dereferences null host_set->ports[] element.
* calls scsi_remove_host() on not-yet-added shost.
This patch fixes both bugs. The first problem was spotted and initial
patch submitted by Dave Jones <davej@redhat.com>. The second problem
was mentioned and fixed by Jeff Garzik <jgarzik@pobox.com> in a larger
cleanup patch.
Cc: Dave Jones <davej@redhat.com>
Cc: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Tejun Heo <htejun@gmail.com>
sata_sil24 doesn't make use of probe_ent->mmio_base and setting this
field causes the area to be released twice on detach. Don't set
probe_ent->mmio_base.
Signed-off-by: Tejun Heo <htejun@gmail.com>
To get host_set->private_data initialized reliably, all pinfos need to
point to the same hpriv. Restore pinfo->private_data after pata pinfo
assignment.
Signed-off-by: Tejun Heo <htejun@gmail.com>
ata_prot_detach() did nothing for old EH ports and thus SCSI hosts
associated with those ports are left dangling after they are detached
leaving stale devices and causing oops eventually. Make
ata_port_detach() remove SCSI hosts for old EH ports.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Drivers expect to be able to call wireless_send_event in arbitrary
contexts. On the other hand, netlink really doesn't like being
invoked in an IRQ context. So we need to postpone the sending of
netlink skb's to a tasklet.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The uncached allocator has a function, uncached_get_new_chunk(), that needs
to be serialized on a per node basis. It also has a global variable,
allocated_granules, which should be defined on a per node basis and protected
by that serialization. Additionally, all error returns from functions called
(like ia64_pal_mc_drain()) should be handled appropriately.
Signed-off-by: Dean Nelson <dcn@sgi.com>
Acked-by: Jes Sorenson <jes@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
* 'fixes' of git://git.linux-nfs.org/pub/linux/nfs-2.6:
SUNRPC: Fix obvious refcounting bugs in rpc_pipefs.
RPC: Ensure that we disconnect TCP socket when client requests error out
NLM/lockd: remove b_done
NFS: make 2 functions static
NFS: Release dcache_lock in an error path of nfs_path
* master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq:
[CPUFREQ] Propagate acpi_processor_preregister_performance return value.
[CPUFREQ] [2/2] demand load governor modules.
[CPUFREQ] [1/2] add __find_governor helper and clean up some error handling.
[CPUFREQ] Longhaul - Rename & fix multipliers table
[CPUFREQ] Longhaul - Fix power state test to do something more useful
[CPUFREQ] Longhaul - Readd accidentally dropped line
[CPUFREQ] Make longhaul_walk_callback() static
[CPUFREQ] X86_GX_SUSPMOD must depend on PCI
[CPUFREQ] Longhaul - Initialise later.
[CPUFREQ] Longhaul - Workaround issues with APIC.
[CPUFREQ] Longhaul - Hook into ACPI C states.
[CPUFREQ] return error when failing to set minfreq
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
[PATCH] ahci: skip protocol test altogether in spurious interrupt code
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/pci-2.6:
PNP: Add missing casts in printk() arguments
PCI: docking station: remove dock uevents
PCI: Unhide the SMBus on Asus PU-DLS
PCI Hotplug: add acpiphp to MAINTAINERS
PCI: pci/search: EXPORTs cannot be __devinit
PCIE: cleanup on probe error
pcie: fix warnings when CONFIG_PM=n
Skip protocol test altogether in spurious interrupt code. If PIOS is received
when it shouldn't, ahci will raise protocol violation.
Signed-off-by: Unicorn Chang <uchang@tw.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
When writing the firmware to the NIC, the FIFO is 256-bytes long,
so we use 256-bytes chunks and a read to wait until the previous
write is done.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Prevent phylib from freeing PHY IRQ twice on closing an eth device:
phy_disconnect() first calls phy_stop_interrupts(), then it calls
phy_stop_machine() which in turn calls phy_stop_interrupts() making the
kernel complain on each bootup...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This patch contains some of the bug fixes and enhancements done to
s2io driver. Following are the brief description of changes
1. code cleanup to handle gso modification better
2. Move repeated code in rx path, to a common function
s2io_chk_rx_buffers()
3. Bug fix in MSI interrupt
4. clear statistics when card is down
5. Avoid linked list traversing in lro aggregation.
6. Use pci_dma_sync_single_for_cpu for buffer0 in case of 2/3
buffer mode.
7. ethtool tso get/set functions to set clear NETIF_F_TSO6
8. Stop LRO aggregation when we receive ECN notification
Signed-off-by: Ananda Raju <ananda.raju@neterion.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This patch contains some of the bug fixes and enhancements done to
s2io driver. Following are the brief description of changes
1. Introduced macro "S2IO_PARM_INT" for declaring integer load parameter
2. UDP_RR test failure, memset txdl after Tx completion
3. PXE boot may leave adapter in unknown state so do reset in probe.
4. Add Tx completion code in netpoll
5. In s2io_vpd_read() move array vpd_data[] to pointer, saves stack memory
6. Fix bug in ethtool online test
Signed-off-by: Ananda Raju <ananda.raju@neterion.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
If we're part way through transmitting a TCP request, and the client
errors, then we need to disconnect and reconnect the TCP socket in order to
avoid confusing the server.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
(cherry picked from 031a50c8b9ea82616abd4a4e18021a25848941ce commit)