Commit graph

191589 commits

Author SHA1 Message Date
Paul Mackerras
0fe1ac48be powerpc/perf_event: Fix oops due to perf_event_do_pending call
Anton Blanchard found that large POWER systems would occasionally
crash in the exception exit path when profiling with perf_events.
The symptom was that an interrupt would occur late in the exit path
when the MSR[RI] (recoverable interrupt) bit was clear.  Interrupts
should be hard-disabled at this point but they were enabled.  Because
the interrupt was not recoverable the system panicked.

The reason is that the exception exit path was calling
perf_event_do_pending after hard-disabling interrupts, and
perf_event_do_pending will re-enable interrupts.

The simplest and cleanest fix for this is to use the same mechanism
that 32-bit powerpc does, namely to cause a self-IPI by setting the
decrementer to 1.  This means we can remove the tests in the exception
exit path and raw_local_irq_restore.

This also makes sure that the call to perf_event_do_pending from
timer_interrupt() happens within irq_enter/irq_exit.  (Note that
calling perf_event_do_pending from timer_interrupt does not mean that
there is a possible 1/HZ latency; setting the decrementer to 1 ensures
that the timer interrupt will happen immediately, i.e. within one
timebase tick, which is a few nanoseconds or 10s of nanoseconds.)

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: stable@kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-12 14:34:00 +10:00
Sage Weil
e84346b726 ceph: preserve seq # on requeued messages after transient transport errors
If the tcp connection drops and we reconnect to reestablish a stateful
session (with the mds), we need to resend previously sent (and possibly
received) messages with the _same_ seq # so that they can be dropped on
the other end if needed.  Only assign a new seq once after the message is
queued.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-11 21:20:38 -07:00
Sage Weil
f818a73674 ceph: fix cap removal races
The iterate_session_caps helper traverses the session caps list and tries
to grab an inode reference.  However, the __ceph_remove_cap was clearing
the inode backpointer _before_ removing itself from the session list,
causing a null pointer dereference.

Clear cap->ci under protection of s_cap_lock to avoid the race, and to
tightly couple the list and backpointer state.  Use a local flag to
indicate whether we are releasing the cap, as cap->session may be modified
by a racing thread in iterate_session_caps.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-11 20:56:31 -07:00
Steve French
aa3e5572c5 [CIFS] drop quota operation stubs
CIFS has stubs for XFS-style quotas without an actual implementation backing
them, hidden behind a config option not visible in Kconfig.  Remove these
stubs for now as the quota operations will see some major changes and this
code simply gets in the way.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@samba.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-05-12 02:24:12 +00:00
Arnaldo Carvalho de Melo
ef7b93a119 perf report: Librarize the annotation code and use it in the newt browser
Now we don't anymore use popen to run 'perf annotate' for the selected
symbol, instead we collect per address samplings when processing samples
in 'perf report' if we're using the newt browser, then we use this data
directly to do annotation.

Done this way we can actually traverse the objdump_line objects
directly, matching the addresses to the collected samples and colouring
them appropriately using lower level slang routines.

The new ui_browser class will be reused for the main, callchain aware,
histogram browser, when it will be made generic and don't assume that
the objects are always instances of the objdump_line class maintained
using list_heads.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-11 23:23:20 -03:00
H. Peter Anvin
c9775b4cc5 x86, fpu: Use static_cpu_has() to implement use_xsave()
use_xsave() is now just a special case of static_cpu_has(), so use
static_cpu_has().

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <1273135546-29690-2-git-send-email-avi@redhat.com>
2010-05-11 17:49:54 -07:00
H. Peter Anvin
a3c8acd043 x86: Add new static_cpu_has() function using alternatives
For CPU-feature-specific code that touches performance-critical paths,
introduce a static patching version of [boot_]cpu_has().  This is run
at alternatives time and is therefore not appropriate for most
initialization code, but on the other hand initialization code is
generally not performance critical.

On gcc 4.5+ this uses the new "asm goto" feature.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <1273135546-29690-2-git-send-email-avi@redhat.com>
2010-05-11 17:47:07 -07:00
Linus Torvalds
cea0d767c2 Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  hwmon: (applesmc) Correct sysfs fan error handling
  hwmon: (asc7621) Bug fixes
2010-05-11 17:38:04 -07:00
Linus Torvalds
b2464ab202 Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  kprobes/x86: Fix removed int3 checking order
  perf: Fix static strings treated like dynamic ones
2010-05-11 17:37:24 -07:00
Andrew Morton
788885ae7a drivers/gpu/drm/i915/i915_irq.c:i915_error_object_create(): use correct kmap-atomic slot
i915_error_object_create() is called from the timer interrupt and hence
can corrupt the KM_USER0 slot.  Use KM_IRQ0 instead.

Reported-by: Jaswinder Singh Rajput <jaswinderlinux@gmail.com>
Tested-by: Jaswinder Singh Rajput <jaswinderlinux@gmail.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-11 17:33:42 -07:00
Oliver Neukum
06efbeb4a4 hp_accel: fix race in device removal
The work queue has to be flushed after the device has been made
inaccessible.  The patch closes a window during which a work queue might
remain active after the device is removed and would then lead to ACPI
calls with undefined behavior.

Signed-off-by: Oliver Neukum <oneukum@suse.de>
Acked-by: Eric Piel <eric.piel@tremplin-utc.net>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Pavel Herrmann <morpheus.ibis@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-11 17:33:42 -07:00
André Goddard Rosa
a3ed2a1571 mqueue: fix kernel BUG caused by double free() on mq_open()
In case of aborting because we reach the maximum amount of memory which
can be allocated to message queues per user (RLIMIT_MSGQUEUE), we would
try to free the message area twice when bailing out: first by the error
handling code itself, and then later when cleaning up the inode through
delete_inode().

Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-11 17:33:42 -07:00
Michael Hennerich
de145b44b9 fbdev: bfin-t350mcqb-fb: fix fbmem allocation with blanking lines
The current allocation does not include the memory required for blanking
lines.  So avoid memory corruption when multiple devices are using the DMA
memory near each other.

Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-11 17:33:42 -07:00
KAMEZAWA Hiroyuki
747388d78a memcg: fix css_is_ancestor() RCU locking
Some callers (in memcontrol.c) calls css_is_ancestor() without
rcu_read_lock.  Because css_is_ancestor() has to access RCU protected
data, it should be under rcu_read_lock().

This makes css_is_ancestor() itself does safe access to RCU protected
area.  (At least, "root" can have refcnt==0 if it's not an ancestor of
"child".  So, we need rcu_read_lock().)

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-11 17:33:42 -07:00
KAMEZAWA Hiroyuki
7f0f154641 memcg: fix css_id() RCU locking for real
Commit ad4ba37537 ("memcg: css_id() must be
called under rcu_read_lock()") modifies memcontol.c for fixing RCU check
message.  But Andrew Morton pointed out that the fix doesn't seems sane
and it was just for hidining lockdep messages.

This is a patch for do proper things.  Checking again, all places,
accessing without rcu_read_lock, that commit fixies was intentional....
all callers of css_id() has reference count on it.  So, it's not necessary
to be under rcu_read_lock().

Considering again, we can use rcu_dereference_check for css_id().  We know
css->id is valid if css->refcnt > 0.  (css->id never changes and freed
after css->refcnt going to be 0.)

This patch makes use of rcu_dereference_check() in css_id/depth and remove
unnecessary rcu-read-lock added by the commit.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-11 17:33:42 -07:00
Vitaliy Gusev
11cad320a4 bsdacct: use del_timer_sync() in acct_exit_ns()
acct_exit_ns --> acct_file_reopen deletes timer without check timer
execution on other CPUs.  So acct_timeout() can change an unmapped memory.

Signed-off-by: Vitaliy Gusev <vgusev@openvz.org>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-11 17:33:42 -07:00
Naoya Horiguchi
ab941e0fff rmap: remove anon_vma check in page_address_in_vma()
Currently page_address_in_vma() compares vma->anon_vma and
page_anon_vma(page) for parameter check, but in 2.6.34 a vma can have
multiple anon_vmas with anon_vma_chain, so current check does not work.
(For anonymous page shared by multiple processes, some verified (page,vma)
pairs return -EFAULT wrongly.)

We can go to checking all anon_vmas in the "same_vma" chain, but it needs
to meet lock requirement.  Instead, we can remove anon_vma check safely
because page_address_in_vma() assumes that page and vma are already
checked to belong to the identical process.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-11 17:33:42 -07:00
Mel Gorman
4a6018f7f4 hugetlbfs: kill applications that use MAP_NORESERVE with SIGBUS instead of OOM-killer
Ordinarily, application using hugetlbfs will create mappings with
reserves.  For shared mappings, these pages are reserved before mmap()
returns success and for private mappings, the caller process is guaranteed
and a child process that cannot get the pages gets killed with sigbus.

An application that uses MAP_NORESERVE gets no reservations and mmap()
will always succeed at the risk the page will not be available at fault
time.  This might be used for example on very large sparse mappings where
the developer is confident the necessary huge pages exist to satisfy all
faults even though the whole mapping cannot be backed by huge pages.
Unfortunately, if an allocation does fail, VM_FAULT_OOM is returned to the
fault handler which proceeds to trigger the OOM-killer.  This is
unhelpful.

Even without hugetlbfs mounted, a user using mmap() can trivially trigger
the OOM-killer because VM_FAULT_OOM is returned (will provide example
program if desired - it's a whopping 24 lines long).  It could be
considered a DOS available to an unprivileged user.

This patch alters hugetlbfs to kill a process that uses MAP_NORESERVE
where huge pages were not available with SIGBUS instead of triggering the
OOM killer.

This change affects hugetlb_cow() as well.  I feel there is a failure case
in there, but I didn't create one.  It would need a fairly specific target
in terms of the faulting application and the hugepage pool size.  The
hugetlb_no_page() path is much easier to hit but both might as well be
closed.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-11 17:33:42 -07:00
Vitaly Mayatskikh
475f9aa6aa kexec: fix OOPS in crash_kernel_shrink
Two "echo 0 > /sys/kernel/kexec_crash_size" OOPSes kernel.  Also content
of this file is invalid after first shrink to zero: it shows 1 instead of
0.

This scenario is unlikely to happen often (root privs, valid crashkernel=
in cmdline, dump-capture kernel not loaded), I hit it only by chance.

This patch fixes it.

Signed-off-by: Vitaly Mayatskikh <v.mayatskih@gmail.com>
Cc: Cong Wang <amwang@redhat.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-11 17:33:42 -07:00
Nicolas Ferre
d586ebbb88 mmc: atmel-mci: fix in debugfs: response value printing
In debugfs, printing of command response reports resp[2] twice: fix it to
resp[3].

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-11 17:33:41 -07:00
Nicolas Ferre
abc2c9fdf6 mmc: atmel-mci: remove data error interrupt after xfer
Disable data error interrupts while we are actually recording that there
is not such errors.  This will prevent, in some cases, the warning message
printed at new request queuing (in atmci_start_request()).

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: <linux-mmc@vger.kernel.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-11 17:33:41 -07:00
Nicolas Ferre
009a891b22 mmc: atmel-mci: prevent kernel oops while removing card
The removing of an SD card in certain circumstances can lead to a kernel
oops if we do not make sure that the "data" field of the host structure is
valid.  This patch adds a test in atmci_dma_cleanup() function and also
calls atmci_stop_dma() before throwing away the reference to data.

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: <linux-mmc@vger.kernel.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-11 17:33:41 -07:00
Nicolas Ferre
ebb1fea9b3 mmc: atmel-mci: fix two parameters swapped
Two parameters were swapped in the calls to atmci_init_slot().

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Reported-by: Anders Grahn <anders.grahn@hd-wireless.se>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: <linux-mmc@vger.kernel.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-11 17:33:41 -07:00
Robin Holt
34441427aa revert "procfs: provide stack information for threads" and its fixup commits
Originally, commit d899bf7b ("procfs: provide stack information for
threads") attempted to introduce a new feature for showing where the
threadstack was located and how many pages are being utilized by the
stack.

Commit c44972f1 ("procfs: disable per-task stack usage on NOMMU") was
applied to fix the NO_MMU case.

Commit 89240ba0 ("x86, fs: Fix x86 procfs stack information for threads on
64-bit") was applied to fix a bug in ia32 executables being loaded.

Commit 9ebd4eba7 ("procfs: fix /proc/<pid>/stat stack pointer for kernel
threads") was applied to fix a bug which had kernel threads printing a
userland stack address.

Commit 1306d603f ('proc: partially revert "procfs: provide stack
information for threads"') was then applied to revert the stack pages
being used to solve a significant performance regression.

This patch nearly undoes the effect of all these patches.

The reason for reverting these is it provides an unusable value in
field 28.  For x86_64, a fork will result in the task->stack_start
value being updated to the current user top of stack and not the stack
start address.  This unpredictability of the stack_start value makes
it worthless.  That includes the intended use of showing how much stack
space a thread has.

Other architectures will get different values.  As an example, ia64
gets 0.  The do_fork() and copy_process() functions appear to treat the
stack_start and stack_size parameters as architecture specific.

I only partially reverted c44972f1 ("procfs: disable per-task stack usage
on NOMMU") .  If I had completely reverted it, I would have had to change
mm/Makefile only build pagewalk.o when CONFIG_PROC_PAGE_MONITOR is
configured.  Since I could not test the builds without significant effort,
I decided to not change mm/Makefile.

I only partially reverted 89240ba0 ("x86, fs: Fix x86 procfs stack
information for threads on 64-bit") .  I left the KSTK_ESP() change in
place as that seemed worthwhile.

Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Stefani Seibold <stefani@seibold.net>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-11 17:33:41 -07:00
Denis Turischev
3c904afd73 it8761e_gpio: fix bug in gpio numbering
The SIO chip contains 16 possible gpio lines, not 14.  The schematic was
not read carefully.

Signed-off-by: Denis Turischev <denis@compulab.co.il>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-11 17:33:41 -07:00
FUJITA Tomonori
f33d7e2d2d dma-mapping: fix dma_sync_single_range_*
dma_sync_single_range_for_cpu() and dma_sync_single_range_for_device() use
a wrong address with a partial synchronization.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-11 17:33:41 -07:00
Paul E. McKenney
72d5a9f7a9 rcu: remove all rcu head initializations, except on_stack initializations
Remove all rcu head inits. We don't care about the RCU head state before passing
it to call_rcu() anyway. Only leave the "on_stack" variants so debugobjects can
keep track of objects on stack.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2010-05-11 16:10:47 -07:00
Sage Weil
45c6ceb547 ceph: zero unused message header, footer fields
We shouldn't leak any prior memory contents to other parties.  And random
data, particularly in the 'version' field, can cause problems down the
line.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-11 15:17:40 -07:00
Arnaldo Carvalho de Melo
3798ed7bc7 perf ui: Add ui_helpline methods
Initially this was just to be able to have a printf like method to
prepare the formatted string and then pass to newtPushHelpLine, but as
we already have for ui_progress, etc, its a step in identifying a
restricted, highlevel set of widgets we can then have implementations
for multiple widget sets (GTK, etc).

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-11 18:01:23 -03:00
Jeff Layton
3d69438031 cifs: guard against hardlinking directories
When we made serverino the default, we trusted that the field sent by the
server in the "uniqueid" field was actually unique. It turns out that it
isn't reliably so.

Samba, in particular, will just put the st_ino in the uniqueid field when
unix extensions are enabled. When a share spans multiple filesystems, it's
quite possible that there will be collisions. This is a server bug, but
when the inodes in question are a directory (as is often the case) and
there is a collision with the root inode of the mount, the result is a
kernel panic on umount.

Fix this by checking explicitly for directory inodes with the same
uniqueid. If that is the case, then we can assume that using server inode
numbers will be a problem and that they should be disabled.

Fixes Samba bugzilla 7407

Signed-off-by: Jeff Layton <jlayton@redhat.com>
CC: Stable <stable@kernel.org>
Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-05-11 20:57:50 +00:00
Linus Torvalds
fc2a093e7a Merge branch 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm/radeon: Fix 3 regressions - since buffer rework
2010-05-11 10:12:18 -07:00
Linus Torvalds
9fc282baa8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  net: Fix FDDI and TR config checks in ipv4 arp and LLC.
  IPv4: unresolved multicast route cleanup
  mac80211: remove association work when processing deauth request
  ar9170: wait for asynchronous firmware loading
  ipv4: udp: fix short packet and bad checksum logging
  phy: Fix initialization in micrel driver.
  sctp: Fix a race between ICMP protocol unreachable and connect()
  veth: Dont kfree_skb() after dev_forward_skb()
  IPv6: fix IPV6_RECVERR handling of locally-generated errors
  net/gianfar: drop recycled skbs on MTU change
  iwlwifi: work around passive scan issue
2010-05-11 10:11:40 -07:00
David Howells
c61ea31dac CacheFiles: Fix occasional EIO on call to vfs_unlink()
Fix an occasional EIO returned by a call to vfs_unlink():

	[ 4868.465413] CacheFiles: I/O Error: Unlink failed
	[ 4868.465444] FS-Cache: Cache cachefiles stopped due to I/O error
	[ 4947.320011] CacheFiles: File cache on md3 unregistering
	[ 4947.320041] FS-Cache: Withdrawing cache "mycache"
	[ 5127.348683] FS-Cache: Cache "mycache" added (type cachefiles)
	[ 5127.348716] CacheFiles: File cache on md3 registered
	[ 7076.871081] CacheFiles: I/O Error: Unlink failed
	[ 7076.871130] FS-Cache: Cache cachefiles stopped due to I/O error
	[ 7116.780891] CacheFiles: File cache on md3 unregistering
	[ 7116.780937] FS-Cache: Withdrawing cache "mycache"
	[ 7296.813394] FS-Cache: Cache "mycache" added (type cachefiles)
	[ 7296.813432] CacheFiles: File cache on md3 registered

What happens is this:

 (1) A cached NFS file is seen to have become out of date, so NFS retires the
     object and immediately acquires a new object with the same key.

 (2) Retirement of the old object is done asynchronously - so the lookup/create
     to generate the new object may be done first.

     This can be a problem as the old object and the new object must exist at
     the same point in the backing filesystem (i.e. they must have the same
     pathname).

 (3) The lookup for the new object sees that a backing file already exists,
     checks to see whether it is valid and sees that it isn't.  It then deletes
     that file and creates a new one on disk.

 (4) The retirement phase for the old file is then performed.  It tries to
     delete the dentry it has, but ext4_unlink() returns -EIO because the inode
     attached to that dentry no longer matches the inode number associated with
     the filename in the parent directory.

The trace below shows this quite well.

	[md5sum] ==> __fscache_relinquish_cookie(ffff88002d12fb58{NFS.fh,ffff88002ce62100},1)
	[md5sum] ==> __fscache_acquire_cookie({NFS.server},{NFS.fh},ffff88002ce62100)

NFS has retired the old cookie and asked for a new one.

	[kslowd] ==> fscache_object_state_machine({OBJ52,OBJECT_ACTIVE,24})
	[kslowd] <== fscache_object_state_machine() [->OBJECT_DYING]
	[kslowd] ==> fscache_object_state_machine({OBJ53,OBJECT_INIT,0})
	[kslowd] <== fscache_object_state_machine() [->OBJECT_LOOKING_UP]
	[kslowd] ==> fscache_object_state_machine({OBJ52,OBJECT_DYING,24})
	[kslowd] <== fscache_object_state_machine() [->OBJECT_RECYCLING]

The old object (OBJ52) is going through the terminal states to get rid of it,
whilst the new object - (OBJ53) - is coming into being.

	[kslowd] ==> fscache_object_state_machine({OBJ53,OBJECT_LOOKING_UP,0})
	[kslowd] ==> cachefiles_walk_to_object({ffff88003029d8b8},OBJ53,@68,)
	[kslowd] lookup '@68'
	[kslowd] next -> ffff88002ce41bd0 positive
	[kslowd] advance
	[kslowd] lookup 'Es0g00og0_Nd_XCYe3BOzvXrsBLMlN6aw16M1htaA'
	[kslowd] next -> ffff8800369faac8 positive

The new object has looked up the subdir in which the file would be in (getting
dentry ffff88002ce41bd0) and then looked up the file itself (getting dentry
ffff8800369faac8).

	[kslowd] validate 'Es0g00og0_Nd_XCYe3BOzvXrsBLMlN6aw16M1htaA'
	[kslowd] ==> cachefiles_bury_object(,'@68','Es0g00og0_Nd_XCYe3BOzvXrsBLMlN6aw16M1htaA')
	[kslowd] remove ffff8800369faac8 from ffff88002ce41bd0
	[kslowd] unlink stale object
	[kslowd] <== cachefiles_bury_object() = 0

It then checks the file's xattrs to see if it's valid.  NFS says that the
auxiliary data indicate the file is out of date (obvious to us - that's why NFS
ditched the old version and got a new one).  CacheFiles then deletes the old
file (dentry ffff8800369faac8).

	[kslowd] redo lookup
	[kslowd] lookup 'Es0g00og0_Nd_XCYe3BOzvXrsBLMlN6aw16M1htaA'
	[kslowd] next -> ffff88002cd94288 negative
	[kslowd] create -> ffff88002cd94288{ffff88002cdaf238{ino=148247}}

CacheFiles then redoes the lookup and gets a negative result in a new dentry
(ffff88002cd94288) which it then creates a file for.

	[kslowd] ==> cachefiles_mark_object_active(,OBJ53)
	[kslowd] <== cachefiles_mark_object_active() = 0
	[kslowd] === OBTAINED_OBJECT ===
	[kslowd] <== cachefiles_walk_to_object() = 0 [148247]
	[kslowd] <== fscache_object_state_machine() [->OBJECT_AVAILABLE]

The new object is then marked active and the state machine moves to the
available state - at which point NFS can start filling the object.

	[kslowd] ==> fscache_object_state_machine({OBJ52,OBJECT_RECYCLING,20})
	[kslowd] ==> fscache_release_object()
	[kslowd] ==> cachefiles_drop_object({OBJ52,2})
	[kslowd] ==> cachefiles_delete_object(,OBJ52{ffff8800369faac8})

The old object, meanwhile, goes on with being retired.  If allocation occurs
first, cachefiles_delete_object() has to wait for dir->d_inode->i_mutex to
become available before it can continue.

	[kslowd] ==> cachefiles_bury_object(,'@68','Es0g00og0_Nd_XCYe3BOzvXrsBLMlN6aw16M1htaA')
	[kslowd] remove ffff8800369faac8 from ffff88002ce41bd0
	[kslowd] unlink stale object
	EXT4-fs warning (device sda6): ext4_unlink: Inode number mismatch in unlink (148247!=148193)
	CacheFiles: I/O Error: Unlink failed
	FS-Cache: Cache cachefiles stopped due to I/O error

CacheFiles then tries to delete the file for the old object, but the dentry it
has (ffff8800369faac8) no longer points to a valid inode for that directory
entry, and so ext4_unlink() returns -EIO when de->inode does not match i_ino.

	[kslowd] <== cachefiles_bury_object() = -5
	[kslowd] <== cachefiles_delete_object() = -5
	[kslowd] <== fscache_object_state_machine() [->OBJECT_DEAD]
	[kslowd] ==> fscache_object_state_machine({OBJ53,OBJECT_AVAILABLE,0})
	[kslowd] <== fscache_object_state_machine() [->OBJECT_ACTIVE]

(Note that the above trace includes extra information beyond that produced by
the upstream code).

The fix is to note when an object that is being retired has had its object
deleted preemptively by a replacement object that is being created, and to
skip the second removal attempt in such a case.

Reported-by: Greg M <gregm@servu.net.au>
Reported-by: Mark Moseley <moseleymark@gmail.com>
Reported-by: Romain DEGEZ <romain.degez@smartjog.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-11 10:07:53 -07:00
Alex Chiang
7d6fb7bd19 ACPI: sleep: eliminate duplicate entries in acpisleep_dmi_table[]
Duplicate entries ended up acpisleep_dmi_table[] by accident.
They don't hurt functionality, but they are ugly, so let's get
rid of them.

Cc: stable@kernel.org
Signed-off-by: Alex Chiang <achiang@canonical.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-11 10:07:53 -07:00
Sage Weil
9abf82b8bc ceph: fix locking for waking session requests after reconnect
The session->s_waiting list is protected by mdsc->mutex, not s_mutex.  This
was causing (rare) s_waiting list corruption.

Fix errors paths too, while we're here.  A more thorough cleanup of this
function is coming soon.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-11 09:53:57 -07:00
Sage Weil
d85b705663 ceph: resubmit requests on pg mapping change (not just primary change)
OSD requests need to be resubmitted on any pg mapping change, not just when
the pg primary changes.  Resending only when the primary changes results in
occasional 'hung' requests during osd cluster recovery or rebalancing.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-11 09:53:56 -07:00
Sage Weil
04d000eb35 ceph: fix open file counting on snapped inodes when mds returns no caps
It's possible the MDS will not issue caps on a snapped inode, in which case
an open request may not __ceph_get_fmode(), botching the open file
counting.  (This is actually a server bug, but the client shouldn't BUG out
in this case.)

Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-11 09:53:55 -07:00
Sage Weil
0ceed5db32 ceph: unregister osd request on failure
The osd request wasn't being unregistered when the osd returned a failure
code, even though the result was returned to the caller.  This would cause
it to eventually time out, and then crash the kernel when it tried to
resend the request using a stale page vector.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-11 09:53:18 -07:00
Changli Gao
a93d2f1744 sched, wait: Use wrapper functions
epoll should not touch flags in wait_queue_t. This patch introduces a new
function __add_wait_queue_exclusive(), for the users, who use wait queue as a
LIFO queue.

__add_wait_queue_tail_exclusive() is introduced too instead of
add_wait_queue_exclusive_locked(). remove_wait_queue_locked() is removed, as
it is a duplicate of __remove_wait_queue(), disliked by users, and with less
users.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Paul Menage <menage@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: <containers@lists.linux-foundation.org>
LKML-Reference: <1273214006-2979-1-git-send-email-xiaosuo@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-11 17:43:58 +02:00
Kyle McMartin
d11c7addfe perf symbols: allow forcing use of cplus_demangle
For Fedora, I want to force perf to link against libiberty.a for
cplus_demangle, rather than libbfd.a for bfd_demangle due to licensing insanity
on binutils. (libiberty is LGPL2, libbfd is GPL3.)

If we just rely on autodetection, we'll end up with libbfd linked against us,
since they're both in binutils-static in the buildroot.

Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <20100510204335.GA7565@bombadil.infradead.org>
Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-11 12:43:11 -03:00
Masami Hiramatsu
6b3c4ef504 perf probe: Check older elfutils and set NO_DWARF
Check whether elfutils is older than 0.138 (from which version checking
routine has been introduced). And if so, set NO_DWARF because it is hard
to check the API dependency without version checking.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Reported-by: Robert Richter <robert.richter@amd.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <20100511045953.9913.19485.stgit@localhost6.localdomain6>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-11 12:43:11 -03:00
Arnaldo Carvalho de Melo
b09e0190ac perf hist: Adopt filter by dso and by thread methods from the newt browser
Those are really not specific to the newt code, can be used by other UI
frontends.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-11 12:43:10 -03:00
Suresh Jayaraman
fdb3603800 cifs: propagate cifs_new_fileinfo() error back to the caller
..otherwise memory allocation errors go undetected.

Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-05-11 15:42:21 +00:00
Joerg Roedel
795e74f7a6 Merge branch 'iommu/largepages' into amd-iommu/2.6.35
Conflicts:
	arch/x86/kernel/amd_iommu.c
2010-05-11 17:40:57 +02:00
Joerg Roedel
a523572596 x86/amd-iommu: Add amd_iommu=off command line option
This patch adds a command line option to tell the AMD IOMMU
driver to not initialize any IOMMU it finds.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2010-05-11 17:12:33 +02:00
Lin Ming
8e6d5573af perf, powerpc: Implement group scheduling transactional APIs
[paulus@samba.org: Set cpuhw->event[i]->hw.config in
power_pmu_commit_txn.]

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20100508102841.GA10650@brick.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-11 17:08:24 +02:00
Peter Zijlstra
96c21a460a perf: Fix exit() vs event-groups
Corey reported that the value scale times of group siblings are not
updated when the monitored task dies.

The problem appears to be that we only update the group leader's
time values, fix it by updating the whole group.

Reported-by: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: <stable@kernel.org> # .34.x
LKML-Reference: <1273588935.1810.6.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-11 17:08:24 +02:00
Peter Zijlstra
050735b08c perf: Fix exit() vs PERF_FORMAT_GROUP
Both Stephane and Corey reported that PERF_FORMAT_GROUP didn't
work as expected if the task the counters were attached to quit
before the read() call.

The cause is that we unconditionally destroy the grouping when
we remove counters from their context. Fix this by splitting off
the group destroy from the list removal such that
perf_event_remove_from_context() does not do this and change
perf_event_release() to do so.

Reported-by: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: <stable@kernel.org> # .34.x
LKML-Reference: <1273571513.5605.3527.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-11 15:46:43 +02:00
Henrik Rydberg
0559a53889 hwmon: (applesmc) Correct sysfs fan error handling
The current code will not remove the sysfs files for fan numbers three
and up. Also, upon exit, fans one and two are removed regardless of
their existence.  This patch cleans up the sysfs error handling for
the fans.

Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2010-05-11 09:17:47 +02:00
Ken Milmore
d1bf8cf6b9 hwmon: (asc7621) Bug fixes
* Allow fan minimum RPM to be set to zero without triggering alarms.
* Fix voltage scaling arithmetic and correct scale factors.
* Correct fan1-fan4 alarm bit shifts.
* Correct register address for temp3_smoothing_enable.
* Read the alarm registers with high priority.

Signed-off-by: Ken Milmore <ken.milmore@googlemail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2010-05-11 09:17:46 +02:00