linux/drivers/pci
Naga Chumbalkar bbfa306a1e PCI: Changing ASPM policy, via /sys, to POWERSAVE could cause NMIs
v3 -> v2: Modified the text that describes the problem
v2 -> v1: Returned -EPERM
v1      : http://marc.info/?l=linux-pci&m=130013194803727&w=2

For servers whose hardware cannot handle ASPM the BIOS ought to set the
FADT bit shown below:
In Sec 5.2.9.3 (IA-PC Boot Arch. Flags) of ACPI4.0a Specification, please
see Table 5-11:
PCIe ASPM Controls: If set, indicates to OSPM that it must not enable
OPSM ASPM control on this platform.

However there are shipping servers whose BIOS did not set this bit. (An
example is the HP ProLiant DL385 G6. A Maintenance BIOS will fix that).
For such servers even if a call is made via pci_no_aspm(), based on _OSC
support in the BIOS, it may be too late because the ASPM code may have
already allocated and filled its "link_list".

So if a user sets the ASPM "policy" to "powersave" via /sys then
pcie_aspm_set_policy() will run through the "link_list" and re-configure
ASPM policy on devices that advertise ASPM L0s/L1 capability:
# echo powersave > /sys/module/pcie_aspm/parameters/policy
# cat /sys/module/pcie_aspm/parameters/policy
default performance [powersave]

That can cause NMIs since the hardware doesn't play well with ASPM:
[ 1651.906015] NMI: PCI system error (SERR) for reason b1 on CPU 0.
[ 1651.906015] Dazed and confused, but trying to continue

Ideally, the BIOS should have set that FADT bit in the first place but we
could be more robust - especially given the fact that Windows doesn't
cause NMIs in the above scenario.

There should be a sanity check to not allow a user to modify ASPM policy
when aspm_disabled is set.

Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-03-21 09:40:57 -07:00
..
hotplug PCI hotplug: acpiphp: set current_state to D0 in register_slot 2011-03-04 10:42:22 -08:00
pcie PCI: Changing ASPM policy, via /sys, to POWERSAVE could cause NMIs 2011-03-21 09:40:57 -07:00
.gitignore
Kconfig PCI: Export ACPI _DSM provided firmware instance number and string name to sysfs 2011-03-04 10:41:56 -08:00
Makefile Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 2011-03-18 10:56:44 -07:00
access.c PCI: output FW warning in pci_read/write_vpd 2010-05-18 15:00:25 -07:00
bus.c Revert "PCI: allocate bus resources from the top down" 2010-12-17 10:00:54 -08:00
dmar.c x86, vt-d: Handle previous faults after enabling fault handling 2010-12-13 16:53:57 -08:00
hotplug-pci.c
hotplug.c
htirq.c ht: Convert to new irq_chip functions 2010-10-12 16:53:37 +02:00
intel-iommu.c Merge git://git.infradead.org/iommu-2.6 2010-09-27 12:25:10 -07:00
intr_remapping.c x86: Speed up the irq_remapped check in hot pathes 2010-10-12 16:53:42 +02:00
intr_remapping.h intr-remap: generic support for remapping HPET MSIs 2009-08-27 23:33:20 +02:00
ioapic.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
iov.c PCI: fix pci_resource_alignment prototype 2010-09-09 13:41:25 -07:00
iova.c intel-iommu: Remove superfluous iova_alloc_lock from IOVA code 2009-07-15 08:17:02 +01:00
irq.c
msi.c PCI: Add mask bit definition for MSI-X table 2010-12-23 12:53:08 -08:00
msi.h PCI: MSI: Move MSI-X entry definition to pci_regs.h 2010-12-23 12:53:07 -08:00
pci-acpi.c PCI/PM: Report wakeup events before resuming devices 2011-01-14 08:55:43 -08:00
pci-driver.c PM: Remove CONFIG_PM_OPS 2011-03-15 00:43:15 +01:00
pci-label.c PCI: label: remove #include of ACPI header to avoid warnings 2011-03-16 10:24:34 -07:00
pci-stub.c PCI: pci-stub: ignore zero-length id parameters 2010-12-23 12:53:52 -08:00
pci-sysfs.c Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 2011-03-18 10:56:44 -07:00
pci.c PCI: PCIe links may not get configured for ASPM under POWERSAVE mode 2011-03-21 09:40:43 -07:00
pci.h PCI: Export ACPI _DSM provided firmware instance number and string name to sysfs 2011-03-04 10:41:56 -08:00
probe.c PCI: Avoid potential NULL pointer dereference in pci_scan_bridge 2011-02-08 13:08:05 -08:00
proc.c BKL: remove extraneous #include <smp_lock.h> 2010-11-17 08:59:32 -08:00
quirks.c PCI: do not create quirk I/O regions below PCIBIOS_MIN_IO for ICH 2011-03-04 10:42:32 -08:00
remove.c PCI: eliminate redundant pci_stop_dev() call from pci_destroy_dev() 2009-06-11 12:04:19 -07:00
rom.c
search.c PCI: use for_each_pci_dev() 2010-07-30 09:47:22 -07:00
setup-bus.c PCI: pre-allocate additional resources to devices only after successful allocation of essential resources. 2011-03-04 10:46:47 -08:00
setup-irq.c PCI: use for_each_pci_dev() 2010-07-30 09:47:22 -07:00
setup-res.c PCI: fix message typo 2010-10-17 20:03:05 -07:00
slot.c PCI: bus speed strings should be const 2010-08-31 15:28:00 -07:00
syscall.c headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
vpd.c pci: Add helper to search for VPD keywords 2010-02-28 00:43:33 -08:00
xen-pcifront.c pci/xen: Cleanup: convert int** to int[] 2011-02-18 12:41:49 -05:00