linux/arch/x86_64/kernel
Tim Hockin e02e68d31e x86_64: support poll() on /dev/mcelog
Background:
 /dev/mcelog is typically polled manually.  This is less than optimal for
 situations where accurate accounting of MCEs is important.  Calling
 poll() on /dev/mcelog does not work.

Description:
 This patch adds support for poll() to /dev/mcelog.  This results in
 immediate wakeup of user apps whenever the poller finds MCEs.  Because
 the exception handler cannot take any locks, it cannot call the wakeup
 itself.  Instead, it uses a thread_info flag (TIF_MCE_NOTIFY) which is
 caught at the next return from interrupt or exit from idle, calling the
 mce_user_notify() routine.  This patch also disables the "fake panic"
 path of mce_panic(), because it results in printk()s in the exception
 handler and crashy systems.

 This patch also does some small cleanup for essentially unused variables,
 and moves the user notification into the body of the poller, so it is
 only called once per poll, rather than once per CPU.
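
 The deferred-notification scheme described above can be illustrated with a
 small user-space sketch.  This is not the kernel code from this patch: the
 names mce_notify_pending, mce_handler_request_notify(),
 mce_notify_if_pending() and wakeups_delivered are hypothetical stand-ins,
 and a C11 atomic flag plays the role of TIF_MCE_NOTIFY.  It only shows the
 pattern: the handler sets a flag without taking locks, and a later, safe
 context consumes the flag and performs the wakeup.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the TIF_MCE_NOTIFY thread_info flag. */
static atomic_bool mce_notify_pending;

/* Hypothetical counter so the effect of a "wakeup" is observable. */
static int wakeups_delivered;

/*
 * Exception-handler side: must not take locks, so it only sets the flag.
 * Setting it several times before it is consumed is harmless.
 */
static void mce_handler_request_notify(void)
{
    atomic_store(&mce_notify_pending, true);
}

/*
 * Safe-context side (in the patch: return from interrupt or exit from
 * idle).  Atomically test-and-clear the flag, and deliver at most one
 * wakeup per pending request.  Returns true if a wakeup was delivered.
 */
static bool mce_notify_if_pending(void)
{
    if (!atomic_exchange(&mce_notify_pending, false))
        return false;

    wakeups_delivered++;    /* placeholder for waking the poll() waiters */
    return true;
}
```

 Because the flag is tested and cleared in a single atomic step, several
 requests raised before the safe context runs collapse into one wakeup.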

Result:
 Applications can now poll() on /dev/mcelog.  When an error is logged
 (whether through the poller or through an exception) the applications are
 woken up promptly.  This should not affect any previous behaviors.  If no
 MCEs are being logged, there is no overhead.
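
 From user space, waiting on the device might look like the sketch below.
 The helper name wait_for_records() is an assumption made for illustration
 and is not part of this patch; it uses only the standard poll() call and
 works on any file descriptor, including one opened on /dev/mcelog.

```c
#include <errno.h>
#include <poll.h>

/*
 * Hypothetical helper: block until fd becomes readable.
 * timeout_ms follows poll() semantics (-1 waits forever, 0 returns at once).
 * Returns 1 if data is readable, 0 on timeout, -1 on error (errno set).
 */
static int wait_for_records(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc;

    /* Restart the wait if a signal interrupts it. */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc <= 0)
        return rc;          /* 0: timed out, -1: poll() failed */

    if (pfd.revents & POLLIN)
        return 1;

    errno = EIO;            /* POLLERR, POLLHUP or POLLNVAL without data */
    return -1;
}
```

 A monitoring daemon would typically open /dev/mcelog read-only, call
 wait_for_records(fd, -1) in a loop, and read() the logged records each
 time it returns 1.  The record format is outside the scope of this sketch.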

Alternatives:
 I considered simply supporting poll() through the poller and not using
 TIF_MCE_NOTIFY at all.  However, the time between an uncorrectable error
 happening and the user application being notified is *the*most* critical
 window for us.  Many uncorrectable errors can be logged to the network if
 given a chance.

 I also considered doing the MCE poll directly from the idle notifier, but
 decided that was overkill.

Testing:
 I used an error-injecting DIMM to create lots of correctable DRAM errors
 and verified that my user app is woken up in sync with the polling interval.
 I also used the northbridge to inject uncorrectable ECC errors, and
 verified (printk() to the rescue) that the notify routine is called and the
 user app does wake up.  I built with PREEMPT on and off, and verified
 that my machine survives MCEs.

[wli@holomorphy.com: build fix]
Signed-off-by: Tim Hockin <thockin@google.com>
Signed-off-by: William Irwin <bill.irwin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:10 -07:00
acpi PM: Integrate beeping flag with existing acpi_sleep flags 2007-07-19 10:04:43 -07:00
cpufreq [CPUFREQ] the overdue removal of X86_SPEEDSTEP_CENTRINO_ACPI 2007-07-13 01:29:51 -04:00
aperture.c x86_64: off-by-two error in aperture.c 2007-05-11 12:53:00 -07:00
apic.c x86_64: apic.c coding style janitor work 2007-07-21 18:37:09 -07:00
asm-offsets.c [PATCH] x86-64: Auto compute __NR_syscall_max at compile time 2007-05-02 19:27:18 +02:00
audit.c [PATCH] audit signal recipients 2007-05-11 05:38:25 -04:00
bugs.c x86_64: Add asm/mtrr.h include for some builds 2007-05-12 09:47:15 -07:00
crash.c move die notifier handling to common code 2007-05-08 11:15:04 -07:00
crash_dump.c [PATCH] kdump: read previous kernel's memory 2006-01-10 08:01:28 -08:00
e820.c x86_64: extract helper function from e820_register_active_regions 2007-07-21 18:37:10 -07:00
early-quirks.c [PATCH] x86: revert x86_64-mm-fix-the-irqbalance-quirk-for-e7320-e7520-e7525 2007-05-02 19:27:04 +02:00
early_printk.c xen: use the hvc console infrastructure for Xen console 2007-07-18 08:47:44 -07:00
entry.S x86_64: support poll() on /dev/mcelog 2007-07-21 18:37:10 -07:00
genapic.c [PATCH] x86: adjust inclusion of asm/fixmap.h 2007-05-02 19:27:04 +02:00
genapic_flat.c [PATCH] x86-64: Fix allnoconfig error in genapic_flat.c 2007-05-02 19:27:21 +02:00
head.S x86: initial fixmap support 2007-07-16 09:05:35 -07:00
head64.c x86_64: display more intuitive error message if kernel is not 2MB aligned 2007-05-11 08:29:32 -07:00
hpet.c x86_64: fiuxp pt_reqs leftovers 2007-07-21 18:37:09 -07:00
i387.c [PATCH] x86-64: use BUILD_BUG_ON in FPU code 2006-12-07 02:14:01 +01:00
i8259.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
init_task.c use the new percpu interface for shared data 2007-07-19 10:04:45 -07:00
io_apic.c x86_64: set the irq_chip name for lapic 2007-06-26 16:54:29 -07:00
ioport.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
irq.c x86_64 irq: use mask/unmask and proper locking in fixup_irqs() 2007-06-26 16:54:29 -07:00
k8.c Avoid zero size allocation in cache_k8_northbridges() 2007-05-23 20:14:12 -07:00
kprobes.c Kprobes: The ON/OFF knob thru debugfs 2007-05-08 11:15:19 -07:00
ldt.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
machine_kexec.c Revert "[PATCH] x86: __pa and __pa_symbol address space separation" 2007-05-07 08:44:24 -07:00
Makefile Use a new CPU feature word to cover features that are spread around 2007-07-12 10:55:54 -07:00
mce.c x86_64: support poll() on /dev/mcelog 2007-07-21 18:37:10 -07:00
mce_amd.c x86_64: Fix APIC typo 2007-07-21 18:37:09 -07:00
mce_intel.c [PATCH] x86: Add a cumulative thermal throttle event counter. 2006-09-26 10:52:42 +02:00
module.c [PATCH] Generic BUG for x86-64 2006-12-08 08:28:39 -08:00
mpparse.c x86_64: remove unused variable maxcpus 2007-07-21 18:37:09 -07:00
nmi.c x86_64: speedup touch_nmi_watchdog 2007-07-17 10:23:04 -07:00
pci-calgary.c [PATCH] x86-64: dma_ops as const 2007-05-02 19:27:06 +02:00
pci-dma.c PCI: remove pci_dac_dma_... APIs 2007-07-11 16:02:11 -07:00
pci-gart.c x86_64: off-by-two error in aperture.c 2007-05-11 12:53:00 -07:00
pci-nommu.c [PATCH] x86-64: dma_ops as const 2007-05-02 19:27:06 +02:00
pci-swiotlb.c [PATCH] x86-64: dma_ops as const 2007-05-02 19:27:06 +02:00
pmtimer.c [PATCH] time: x86_64: convert x86_64 to use GENERIC_TIME 2007-02-16 08:14:00 -08:00
process.c x86_64: Quicklist support for x86_64 2007-07-21 18:37:09 -07:00
ptrace.c Handle bogus %cs selector in single-step instruction decoding 2007-07-18 12:09:01 -07:00
reboot.c Detach sched.h from mm.h 2007-05-21 09:18:19 -07:00
relocate_kernel.S [PATCH] Avoid overwriting the current pgd (V4, x86_64) 2006-09-26 10:52:38 +02:00
setup.c i386: Add L3 cache support to AMD CPUID4 emulation 2007-07-21 18:37:08 -07:00
setup64.c x86_64: Ignore compat mode SYSCALL when IA32_EMULATION is not defined 2007-06-22 18:41:19 -07:00
signal.c x86_64: support poll() on /dev/mcelog 2007-07-21 18:37:10 -07:00
smp.c x86_64: Quicklist support for x86_64 2007-07-21 18:37:09 -07:00
smpboot.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
stacktrace.c simplify the stacktrace code 2007-05-08 11:14:58 -07:00
suspend.c [PATCH] x86: Save and restore the fixed-range MTRRs of the BSP when suspending 2007-05-02 19:27:17 +02:00
suspend_asm.S [PATCH] x86-64: Relocatable Kernel Support 2007-05-02 19:27:07 +02:00
sys_x86_64.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
syscall.c [PATCH] x86-64: Auto compute __NR_syscall_max at compile time 2007-05-02 19:27:18 +02:00
tce.c Remove all inclusions of <linux/config.h> 2006-10-04 03:38:54 -04:00
time.c x86_64: time.c white space wreckage cleanup 2007-07-21 18:37:09 -07:00
trampoline.S [PATCH] x86-64: Move cpu verification code to common file 2007-05-02 19:27:08 +02:00
traps.c drivers/edac: add new nmi rescan 2007-07-19 10:04:53 -07:00
tsc.c x86_64: Remove dead code and other janitor work in tsc.c 2007-07-21 18:37:08 -07:00
tsc_sync.c [PATCH] x86: Log reason why TSC was marked unstable 2007-05-02 19:27:08 +02:00
verify_cpu.S Unify the CPU features vectors between i386 and x86-64 2007-07-12 10:55:54 -07:00
vmlinux.lds.S x86_64: Add vDSO for x86-64 with gettimeofday/clock_gettime/getcpu 2007-07-21 18:37:08 -07:00
vsmp.c [PATCH] Fix build breakage with CONFIG_X86_VSMP 2006-10-12 12:25:27 -07:00
vsyscall.c x86_64: Add vDSO for x86-64 with gettimeofday/clock_gettime/getcpu 2007-07-21 18:37:08 -07:00
x8664_ksyms.c [PATCH] x86: Export _proxy_pda for gcc 4.2 2007-03-16 21:07:36 +01:00