linux/Documentation
Andi Kleen 3c0797925f x86, mce: switch x86 machine check handler to Monarch election.
On Intel platforms machine check exceptions are always broadcast to
all CPUs.  This patch makes the machine check handler synchronize all
these machine checks and elect a Monarch to handle the event.  The
Monarch collects the worst event from all CPUs and processes it first.

This has some advantages:

- When there is a truly data corrupting error the system panics as
  quickly as possible. This improves containment of corrupted
  data and makes sure the corrupted data never hits stable storage.

- The panics are synchronized and do not reenter the panic code
  on multiple CPUs (which currently does not handle this well).

- All the errors are reported. Currently it often happens that
  another CPU does the panic first, but reports useless
  information (empty machine check) because the real error
  happened on another CPU which came in later.
  This is a big advantage on Nehalem, where the 8 threads per CPU
  often lead to the wrong CPU winning the race and dumping
  useless information on a machine check.  The problem also occurs
  in a less severe form on older CPUs.

- The system can detect when no CPUs detected a machine check
  and shut down the system.  This can happen when one CPU is so
  badly hung that it cannot process a machine check anymore
  or when some external agent wants to stop the system by
  asserting the machine check pin.  This follows Intel hardware
  recommendations.

- This matches the error model recommended by the CPU designers.

- The events can be output in true severity order.

- When a panic happens on another CPU, the handler enables interrupts
  so that it is actually able to process the stop IPI.
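
The election and worst-event collection described above can be
sketched in userspace C.  This is only an illustration under stated
assumptions, not the code from the patch: the names call_in(),
is_monarch(), worst_event() and struct cpu_event are hypothetical,
the first CPU to increment a shared counter is assumed to become the
Monarch, and a larger severity number is assumed to mean a worse event.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical per-CPU record; severity semantics are an assumption. */
struct cpu_event {
	int cpu;
	int severity;	/* assumed: larger value means worse event */
};

/* Each CPU entering the handler takes a ticket; tickets start at 1. */
static int call_in(atomic_int *callin)
{
	assert(callin != NULL);
	return atomic_fetch_add(callin, 1) + 1;
}

/* Assumption for this sketch: the first CPU to call in is the Monarch. */
static int is_monarch(int order)
{
	return order == 1;
}

/*
 * The Monarch scans all collected records and returns the worst one.
 * Returns NULL for an empty set; on a tie the earliest record wins.
 */
static const struct cpu_event *worst_event(const struct cpu_event *ev,
					   size_t n)
{
	const struct cpu_event *worst = NULL;

	for (size_t i = 0; i < n; i++)
		if (!worst || ev[i].severity > worst->severity)
			worst = &ev[i];
	return worst;
}
```

In this sketch only the CPU for which is_monarch() is true would call
worst_event(), so the most severe record is handled before the others.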

The code is extremely careful to handle timeouts while waiting
for other CPUs. It can't rely on the normal timing mechanisms
(jiffies, ktime_get) because of its asynchronous/lockless nature,
so it implements its own timeouts using ndelay() and a "SPINUNIT".

The timeout is configurable. By default it waits for up to one
second for the other CPUs.  The timeout can also be disabled.
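
A minimal userspace sketch of such a self-contained timeout follows.
It is not the kernel implementation: wait_for_callin() and
spin_delay_ns() are hypothetical names, spin_delay_ns() is a no-op
stand-in for ndelay(), the SPINUNIT value of 100 ns is assumed, and a
timeout of 0 is assumed to mean that the timeout is disabled.

```c
#include <assert.h>
#include <stdbool.h>

#define SPINUNIT_NS 100LL	/* assumed polling granularity, in ns */

/* Userspace stand-in for ndelay(); a no-op so the sketch runs instantly. */
static void spin_delay_ns(long long ns)
{
	(void)ns;
}

/*
 * Poll *callin until it reaches `expected`, charging SPINUNIT_NS against
 * the budget for every poll instead of reading a clock.
 * timeout_ns == 0 disables the timeout (wait forever).
 * Returns 0 when all CPUs arrived, -1 when the budget ran out.
 */
static int wait_for_callin(const volatile int *callin, int expected,
			   long long timeout_ns)
{
	long long budget = timeout_ns;

	assert(timeout_ns >= 0);
	while (*callin < expected) {
		if (timeout_ns != 0) {
			if (budget < SPINUNIT_NS)
				return -1;	/* gave up waiting */
			budget -= SPINUNIT_NS;
		}
		spin_delay_ns(SPINUNIT_NS);
	}
	return 0;
}
```

Because the budget is decremented by a fixed unit per poll, the loop
needs no timer interrupt or clock source; with the default of one
second, wait_for_callin(&callin, ncpus, 1000000000LL) would give up
after roughly one second of accumulated delays.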

From some informal testing, AMD systems do not seem to broadcast
machine checks, so right now it's always disabled by default on
non-Intel CPUs and also on very old Intel systems.

Includes fixes from Ying Huang
Fixed a "ecception" in a comment (H.Seto)
Moved global_nwo reset later based on suggestion from H.Seto
v2: Avoid duplicate messages

[ Impact: feature, fixes long standing problems. ]

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-06-03 14:45:12 -07:00
ABI slub: add Documentation/ABI/testing/sysfs-kernel-slab 2009-04-28 14:30:35 +03:00
accounting Documentation/accounting/getdelays.c: fix endless loop 2009-01-15 16:39:37 -08:00
acpi ACPI: update debug parameter documentation 2008-11-07 21:45:29 -05:00
aoe
arm Merge branch 'next-s3c-pm' of git://aeryn.fluff.org.uk/bjdooks/linux into devel 2009-03-26 22:44:43 +00:00
auxdisplay .gitignore updates 2008-10-30 11:38:45 -07:00
blackfin Blackfin arch: Add document about bfin-gpio 2009-01-07 23:14:38 +08:00
block block: update biodoc.txt on plugging 2009-04-15 08:28:11 +02:00
blockdev mflash: initial support 2009-04-07 08:12:38 +02:00
cdrom doc/cdrom: Trvial documentation error, file not present 2008-10-10 08:22:44 +02:00
cgroups memcg: fix documentation 2009-04-13 15:04:33 -07:00
connector Documentation/connector/cn_test.c: don't use gfp_any() 2009-02-12 16:47:01 -08:00
console
cpu-freq [CPUFREQ] ondemand/conservative: sanitize sampling_rate restrictions 2009-02-24 22:47:31 -05:00
cpuidle
cris fix random typos 2008-10-16 11:21:30 -07:00
crypto async_tx, dmaengine: document channel allocation and api rework 2009-01-05 18:10:19 -07:00
development-process Fix a typo in the development process document. 2009-01-08 16:32:13 -07:00
device-mapper dm crypt: add documentation 2008-04-25 13:27:03 +01:00
DocBook kgdb: gdb documentation fix 2009-05-15 07:56:25 -05:00
driver-model Driver Core: early platform driver 2009-04-16 16:17:10 -07:00
dvb V4L/DVB (11138): get_dvb_firmware: add support for downloading the cx2584x firmware for pvrusb2 2009-03-30 12:43:31 -03:00
early-userspace
fault-injection
fb uvesafb: documentation update 2009-04-07 08:31:09 -07:00
filesystems hugh: update email address 2009-05-21 13:14:32 -07:00
firmware_class
frv
hwmon hwmon: Update documentation on fan_max 2009-06-01 13:46:50 +02:00
i2c Move the pcf8591 driver to hwmon 2009-03-30 21:46:43 +02:00
i2o
ia64 trivial: Fix misspelling of firmware 2009-03-30 15:21:59 +02:00
ide ide: update warm-plug HOWTO 2009-01-06 17:21:00 +01:00
infiniband IPoIB: Document newish features 2009-04-08 13:52:01 -07:00
input Input: multitouch - augment event semantics documentation 2009-05-23 09:53:26 -07:00
ioctl V4L/DVB (10870a): remove all references for video_decoder.h 2009-03-30 12:43:15 -03:00
isdn Add reference to CAPI 2.0 standard 2009-04-27 05:37:39 -07:00
ja_JP Sync patch for jp_JP/stable_kernel_rules.txt 2009-01-28 15:55:48 -08:00
kbuild kbuild: introduce subdir-ccflags-y 2009-04-19 11:12:12 +02:00
kdump powerpc: Support for relocatable kdump kernel 2008-10-22 15:01:22 +11:00
ko_KR
laptops thinkpad-acpi: bump up version to 0.23 2009-04-18 01:19:54 -04:00
lguest lguest: document 32-bit and PAE requirements 2009-04-19 23:14:02 +09:30
m68k
make
mips ide: remove unused CONFIG_BLK_DEV_IDE_AU1XXX_SEQTS_PER_RQ 2009-01-14 19:19:03 +01:00
misc-devices drivers/misc/isl29003.c: driver for the ISL29003 ambient light sensor 2009-04-01 08:59:18 -07:00
mn10300
mtd [MTD] [NAND] nand_ecc.c: rewrite for improved performance 2008-08-16 10:55:33 +01:00
namespaces
netlabel
networking Doc: fixed descriptions on /proc/sys/net/core/* and /proc/sys/net/unix/* 2009-05-17 21:19:31 -07:00
parisc
PCI PCI MSI: Add example request loop to MSI-HOWTO.txt 2009-03-20 11:35:04 -07:00
pcmcia .gitignore updates 2008-10-30 11:38:45 -07:00
power pm: document use of RTC in pm_trace 2008-10-16 11:21:29 -07:00
powerpc Merge branch 'merge' of git://git.secretlab.ca/git/linux-2.6 into merge 2009-04-22 13:02:09 +10:00
prctl
RCU Doc: Fix spelling in RCU/rculist_nulls.txt. 2009-04-02 01:33:51 -07:00
s390 documentation: update s390 header file paths 2009-01-06 15:59:28 -08:00
scheduler trivial: fix where cgroup documentation is not correctly referred to 2009-03-30 15:22:02 +02:00
scsi [SCSI] aacraid driver update 2009-04-03 09:23:11 -05:00
serial Create/use more directory structure in the Documentation/ tree. 2008-11-14 17:28:53 +00:00
sh sh: Kill off remaining CONFIG_SH_KGDB bits. 2008-12-22 18:44:05 +09:00
sound Merge branch 'fix/pcm-jiffies-check' into for-linus 2009-05-27 16:51:27 +02:00
sparc sparc: Remove Documentation/sparc/sbus_drivers.txt 2008-08-29 02:15:25 -07:00
spi spi: documentation: emphasise spi_master.setup() semantics 2009-04-21 13:41:50 -07:00
sysctl Revert "mm: add /proc controls for pdflush threads" 2009-05-15 11:32:24 +02:00
telephony remove mention of CONFIG_KMOD from documentation 2008-07-22 19:24:29 +10:00
thermal thermal: update the documentation 2008-04-29 02:49:47 -04:00
timers hpet: /dev/hpet - fixes and cleanup 2008-07-31 18:45:41 +02:00
trace tracing: consolidate documents 2009-04-09 07:28:10 +02:00
uml
usb USB: usbmon: Add binary API v1 2009-03-24 16:20:36 -07:00
video4linux V4L/DVB (11373): v4l2-common: add explicit v4l2_device pointer as first arg to new_(probed)_subdev 2009-04-06 21:44:24 -03:00
vm mm: add documentation describing what tsk->active_mm means vs tsk->mm 2009-04-13 15:04:32 -07:00
w1 w1: send status messages after command processing 2009-01-08 08:31:14 -08:00
watchdog .gitignore updates 2008-10-30 11:38:45 -07:00
wimax i2400m: documentation and instructions for usage 2009-01-07 10:00:18 -08:00
x86 x86, mce: switch x86 machine check handler to Monarch election. 2009-06-03 14:45:12 -07:00
zh_CN
00-INDEX trivial: fix where cgroup documentation is not correctly referred to 2009-03-30 15:22:02 +02:00
applying-patches.txt
atomic_ops.txt
bad_memory.txt Document handling of bad memory 2008-12-03 16:09:53 -07:00
basic_profiling.txt
binfmt_misc.txt
braille-console.txt Basic braille screen reader support 2008-04-30 08:29:52 -07:00
bt8xxgpio.txt gpio: add bt8xxgpio driver 2008-07-25 10:53:30 -07:00
BUG-HUNTING
c2port.txt Add c2 port support 2008-11-12 17:17:18 -08:00
cachetlb.txt
Changes x86, mce: document new 32bit mcelog requirement in Documentation/Changes 2009-05-28 09:24:13 -07:00
CodingStyle fix emacs indenting howto filename expansion 2009-01-29 18:19:29 -08:00
cpu-hotplug.txt x86: use possible_cpus=NUM to extend the possible cpus allowed 2008-12-18 12:08:05 +01:00
cpu-load.txt
cputopology.txt cpumask: Use topology_core_cpumask()/topology_thread_cpumask() 2009-01-11 19:12:49 +01:00
credentials.txt CRED: Documentation 2008-11-14 10:39:26 +11:00
dcdbas.txt
debugging-modules.txt
debugging-via-ohci1394.txt
dell_rbu.txt trivial: fix an -> a typos in documentation and comments 2009-01-06 11:28:07 +01:00
devices.txt lanana: assign a device name and numbering for MAX3100 2009-04-07 08:44:05 -07:00
DMA-API.txt dma-debug: Documentation update 2009-03-17 12:56:47 +01:00
DMA-attributes.txt powerpc/cell: Add DMA_ATTR_WEAK_ORDERING dma attribute and use in Cell IOMMU code 2008-07-22 10:39:36 +10:00
DMA-ISA-LPC.txt
DMA-mapping.txt dma-mapping: update the old macro DMA_nBIT_MASK related documentations 2009-04-07 08:31:12 -07:00
dmaengine.txt async_tx, dmaengine: document channel allocation and api rework 2009-01-05 18:10:19 -07:00
dontdiff dontdiff: Fix asm exclude 2009-03-26 15:45:43 -07:00
dynamic-debug-howto.txt Dynamic debug: allow simple quoting of words 2009-03-24 16:38:27 -07:00
edac.txt Documentation cleanup: trivial misspelling, punctuation, and grammar corrections. 2008-07-26 12:00:06 -07:00
eisa.txt
email-clients.txt Documentation/email-clients.txt: add some info about gmail 2008-11-06 15:41:19 -08:00
exception.txt
feature-removal-schedule.txt x86, mce: deprecate old 32bit machine check code 2009-05-28 09:24:13 -07:00
gpio.txt gpio: gpio_{request,free}() now required (feature removal) 2009-04-02 19:04:51 -07:00
highuid.txt
HOWTO Remove Andrew Morton's http://www.zip.com.au/~akpm/ 2008-10-16 11:21:32 -07:00
hw_random.txt
ics932s401 ics932s401: new clock generator chip driver 2008-11-12 17:17:18 -08:00
initrd.txt
Intel-IOMMU.txt Documentation cleanup: trivial misspelling, punctuation, and grammar corrections. 2008-07-26 12:00:06 -07:00
IO-mapping.txt Documentation: move DMA-mapping.txt to Doc/PCI/ 2009-01-29 18:19:29 -08:00
io-mapping.txt io mapping: improve documentation 2008-11-03 18:21:44 +01:00
io_ordering.txt
iostats.txt Documentation cleanup: trivial misspelling, punctuation, and grammar corrections. 2008-07-26 12:00:06 -07:00
IPMI.txt
IRQ-affinity.txt genirq: Expose default irq affinity mask (take 3) 2008-06-05 15:18:30 +02:00
IRQ.txt
irqflags-tracing.txt
isapnp.txt
java.txt
kernel-doc-nano-HOWTO.txt kernel-doc: restrict syntax for private: and public: 2009-05-02 15:36:10 -07:00
kernel-docs.txt doc: update to URL and status of kernel-docs.txt entry 2008-06-06 11:29:10 -07:00
kernel-parameters.txt Merge branch 'linus' into irq/numa 2009-06-01 21:06:21 +02:00
keys-request-key.txt keys: allow the callout data to be passed as a blob rather than a string 2008-04-29 08:06:16 -07:00
keys.txt Documentation cleanup: trivial misspelling, punctuation, and grammar corrections. 2008-07-26 12:00:06 -07:00
kobject.txt kobject: Make Documentation/kobject.txt a little more coherent. 2009-01-06 10:44:32 -08:00
kprobes.txt kprobes: support kretprobe and jprobe per-probe disabling 2009-04-07 08:31:08 -07:00
kref.txt
ldm.txt
leds-class.txt Documentation cleanup: trivial misspelling, punctuation, and grammar corrections. 2008-07-26 12:00:06 -07:00
local_ops.txt documentation: local_ops fix on_each_cpu 2008-12-01 13:51:26 +01:00
lockdep-design.txt locking: Documentation: lockdep-design.txt, fix note of state bits 2009-04-26 18:21:24 +02:00
lockstat.txt lockstat: contend with points 2008-10-20 15:43:10 +02:00
logo.gif Revert "linux.conf.au 2009: Tuz" 2009-04-27 12:00:27 -07:00
logo.txt Revert "linux.conf.au 2009: Tuz" 2009-04-27 12:00:27 -07:00
magic-number.txt documentation: update header file paths 2009-01-06 15:59:28 -08:00
Makefile docsrc: build Documentation/ sources 2008-08-12 16:07:30 -07:00
ManagementStyle docs: fix ManagementStyle book name 2008-10-30 11:38:46 -07:00
markers.txt markers: comment marker_synchronize_unregister() on data dependency 2008-11-28 16:47:41 +01:00
mca.txt
md.txt Documentation/md.txt update 2009-03-31 15:18:37 +11:00
memory-barriers.txt read_barrier_depends arch fixlets 2008-05-14 10:05:18 -07:00
memory-hotplug.txt mm: show node to memory section relationship with symlinks in sysfs 2009-01-06 15:59:00 -08:00
memory.txt
mono.txt
mutex-design.txt
nmi_watchdog.txt x86, nmi-watchdog: update procfs nmi_watchdog file documentation v2 2008-10-30 19:07:04 +01:00
nommu-mmap.txt NOMMU: Make mmap allocation page trimming behaviour configurable. 2009-01-08 12:04:47 +00:00
numastat.txt
oops-tracing.txt Taint kernel after WARN_ON(condition) 2008-04-29 08:05:59 -07:00
parport-lowlevel.txt
parport.txt
pi-futex.txt
pnp.txt
preempt-locking.txt
printk-formats.txt DOC: add printk-formats.txt 2008-11-12 17:17:17 -08:00
prio_tree.txt
rbtree.txt
rfkill.txt rfkill: add master_switch_mode and EPO lock to rfkill and rfkill-input 2008-10-31 19:00:09 -04:00
robust-futex-ABI.txt
robust-futexes.txt
rt-mutex-design.txt
rt-mutex.txt
rtc.txt
SAK.txt Remove Andrew Morton's old email accounts 2008-10-16 11:21:32 -07:00
SecurityBugs
SELinux.txt selinux: add support for installing a dummy policy (v2) 2008-08-27 08:54:08 +10:00
serial-console.txt
sgi-ioc4.txt
sgi-visws.txt
slow-work.txt Document the slow work thread pool 2009-04-03 16:42:35 +01:00
SM501.txt
Smack.txt smack: Add a new '-CIPSO' option to the network address label configuration 2009-03-28 15:01:37 +11:00
sparse.txt Documentation: explain the difference between __bitwise and __bitwise__ 2009-04-11 08:18:11 +02:00
spinlocks.txt
stable_api_nonsense.txt
stable_kernel_rules.txt Update stable tree documentation 2008-10-29 15:03:49 -07:00
SubmitChecklist documentation: explain memory barriers 2008-10-16 11:21:32 -07:00
SubmittingDrivers Remove Andrew Morton's old email accounts 2008-10-16 11:21:32 -07:00
SubmittingPatches Merge branch 'docs' of git://git.lwn.net/linux-2.6 2008-10-16 12:18:16 -07:00
svga.txt
sysfs-rules.txt Doc/sysfs-rules: Swap the order of the words so the sentence makes more sense 2009-05-08 19:22:20 -07:00
sysrq.txt Merge branch 'tracing/core-v2' into tracing-for-linus 2009-04-02 00:49:02 +02:00
tomoyo.txt tomoyo: add Documentation/tomoyo.txt 2009-04-14 09:14:58 +10:00
unaligned-memory-access.txt introduce HAVE_EFFICIENT_UNALIGNED_ACCESS Kconfig symbol 2008-07-25 10:53:27 -07:00
unicode.txt
unshare.txt
VGA-softcursor.txt
video-output.txt
volatile-considered-harmful.txt Documentation cleanup: trivial misspelling, punctuation, and grammar corrections. 2008-07-26 12:00:06 -07:00
voyager.txt
zorro.txt