One regression fix for SR-IOV on PPC and a couple of misc fixes from
Yinghai.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci:
PCI: Fix pci cardbus removal
PCI: set pci sriov page size before reading SRIOV BAR
PCI: workaround hard-wired bus number V2
by the xen-pci[front|back] to conform to the one used in majority of
PCI drivers; Two fixes to make the code more resilient to invalid
configurations.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQEcBAABAgAGBQJPOeReAAoJEFjIrFwIi8fJn9QIANP48kzrGg0uO4bjSf2h/z7G
pp3ISdtVLk7pwMov2POBqskoXSq8E0yQAfNN8se183wqNXo3Dm4rU1DIG7HQFBk9
sdcyfHI8x7pat9JClRhGxpQ23Ig9f1iWkShweCcZCO782vfxZyNd65i6t87X7uLq
7SPtG1XH2RixTX7tHtKKBqdzZ0OMXOEkJ33dgCmyrn+wzohbKrFj5mg+NdOgmzEo
VgsHPVtuq7orDROe+F9d91eAg0TILQ13th8xfWZ59lQATXu/zAlaueYt87tpy1pb
oVQvumsn8Xev+7hct9My9Tw45D4m8YOSFLG2HcekkW2WtNmGhTTbIyMh9PsLugk=
=NDYK
-----END PGP SIGNATURE-----
Merge tag 'stable/for-linus-fixes-3.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Two fixes for VCPU offlining; One to fix the string format exposed
by the xen-pci[front|back] to conform to the one used in majority of
PCI drivers; Two fixes to make the code more resilient to invalid
configurations.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* tag 'stable/for-linus-fixes-3.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xenbus_dev: add missing error check to watch handling
xen/pci[front|back]: Use %d instead of %1x for displaying PCI devfn.
xen pvhvm: do not remap pirqs onto evtchns if !xen_have_vector_callback
xen/smp: Fix CPU online/offline bug triggering a BUG: scheduling while atomic.
xen/bootup: During bootup suppress XENBUS: Unable to read cpu state
During test busn_res allocation with cardbus, found pci card removal is not
working anymore, and it turns out it is broken by:
|commit 79cc9601c3
|Date: Tue Nov 22 21:06:53 2011 -0800
|
| PCI: Only call pci_stop_bus_device() one time for child devices at remove
The above changed the behavior of pci_remove_behind_bridge that
yenta_cardbus depended on. So restore the old behavoir of
pci_remove_behind_bridge (which requires stopping and removing of all
devices) by:
1. rename pci_remove_behind_bridge to __pci_remove_behind_bridge, and let
__pci_remove_bus_device() call it instead.
2. add pci_stop_behind_bridge that will stop devices behind a bridge
3. add back pci_remove_behind_bridge that will stop and remove devices
under bridge.
-v2: update commit description a little bit.
Tested-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
For an SRIOV device, PCI_SRIOV_SYS_PGSIZE should be set before
the PCI_SRIOV_BAR are queried. The sys pagesize defaults to 4k,
so this change is required on powerpc box with 64k base page size.
This is a regression caused due to moving SRIOV init to sriov_enable().
| commit afd24ece5c
| Author: Ram Pai <linuxram@us.ibm.com>
| PCI: delay configuration of SRIOV capability
| The SRIOV capability, namely page size and total_vfs of a device are
| configured during enumeration phase of the device. This can potentially
| interfere with the PCI operations of the platform, if the IOV capability
| of the device is not enabled.
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Acked-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Fixes PCI device detection on IBM xSeries IBM 3850 M2 / x3950 M2
when using ACPI resources (_CRS).
This is default, a manual workaround (without this patch)
would be pci=nocrs boot param.
V2: Add dev_warn if the workaround is hit. This should reveal
how common such setups are (via google) and point to possible
problems if things are still not working as expected.
-> Suggested by Jan Beulich.
Cc: stable@vger.kernel.org
Tested-by: garyhade@us.ibm.com
Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
.. as the rest of the kernel is using that format.
Suggested-by: Марк Коренберг <socketpair@gmail.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Fix new kernel-doc warnings:
Warning(drivers/pci/pci.c:2811): No description found for parameter 'dev'
Warning(drivers/pci/pci.c:2811): Excess function parameter 'pdev' description in 'pci_intx_mask_supported'
Warning(drivers/pci/pci.c:2894): No description found for parameter 'dev'
Warning(drivers/pci/pci.c:2894): Excess function parameter 'pdev' description in 'pci_check_and_mask_intx'
Warning(drivers/pci/pci.c:2908): No description found for parameter 'dev'
Warning(drivers/pci/pci.c:2908): Excess function parameter 'pdev' description in 'pci_check_and_unmask_intx'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://selinuxproject.org/~jmorris/linux-security:
capabilities: remove __cap_full_set definition
security: remove the security_netlink_recv hook as it is equivalent to capable()
ptrace: do not audit capability check when outputing /proc/pid/stat
capabilities: remove task_ns_* functions
capabitlies: ns_capable can use the cap helpers rather than lsm call
capabilities: style only - move capable below ns_capable
capabilites: introduce new has_ns_capabilities_noaudit
capabilities: call has_ns_capability from has_capability
capabilities: remove all _real_ interfaces
capabilities: introduce security_capable_noaudit
capabilities: reverse arguments to security_capable
capabilities: remove the task from capable LSM hook entirely
selinux: sparse fix: fix several warnings in the security server cod
selinux: sparse fix: fix warnings in netlink code
selinux: sparse fix: eliminate warnings for selinuxfs
selinux: sparse fix: declare selinux_disable() in security.h
selinux: sparse fix: move selinux_complete_init
selinux: sparse fix: make selinux_secmark_refcount static
SELinux: Fix RCU deref check warning in sel_netport_insert()
Manually fix up a semantic mis-merge wrt security_netlink_recv():
- the interface was removed in commit fd77846152 ("security: remove
the security_netlink_recv hook as it is equivalent to capable()")
- a new user of it appeared in commit a38f7907b9 ("crypto: Add
userspace configuration API")
causing no automatic merge conflict, but Eric Paris pointed out the
issue.
module_param(bool) used to counter-intuitively take an int. In
fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy
trick.
It's time to remove the int/unsigned int option. For this version
it'll simply give a warning, but it'll break next kernel version.
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci: (80 commits)
x86/PCI: Expand the x86_msi_ops to have a restore MSIs.
PCI: Increase resource array mask bit size in pcim_iomap_regions()
PCI: DEVICE_COUNT_RESOURCE should be equal to PCI_NUM_RESOURCES
PCI: pci_ids: add device ids for STA2X11 device (aka ConneXT)
PNP: work around Dell 1536/1546 BIOS MMCONFIG bug that breaks USB
x86/PCI: amd: factor out MMCONFIG discovery
PCI: Enable ATS at the device state restore
PCI: msi: fix imbalanced refcount of msi irq sysfs objects
PCI: kconfig: English typo in pci/pcie/Kconfig
PCI/PM/Runtime: make PCI traces quieter
PCI: remove pci_create_bus()
xtensa/PCI: convert to pci_scan_root_bus() for correct root bus resources
x86/PCI: convert to pci_create_root_bus() and pci_scan_root_bus()
x86/PCI: use pci_scan_bus() instead of pci_scan_bus_parented()
x86/PCI: read Broadcom CNB20LE host bridge info before PCI scan
sparc32, leon/PCI: convert to pci_scan_root_bus() for correct root bus resources
sparc/PCI: convert to pci_create_root_bus()
sh/PCI: convert to pci_scan_root_bus() for correct root bus resources
powerpc/PCI: convert to pci_create_root_bus()
powerpc/PCI: split PHB part out of pcibios_map_io_space()
...
Fix up conflicts in drivers/pci/msi.c and include/linux/pci_regs.h due
to the same patches being applied in other branches.
* 'stable/for-linus-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (37 commits)
xen/pciback: Expand the warning message to include domain id.
xen/pciback: Fix "device has been assigned to X domain!" warning
xen/pciback: Move the PCI_DEV_FLAGS_ASSIGNED ops to the "[un|]bind"
xen/xenbus: don't reimplement kvasprintf via a fixed size buffer
xenbus: maximum buffer size is XENSTORE_PAYLOAD_MAX
xen/xenbus: Reject replies with payload > XENSTORE_PAYLOAD_MAX.
Xen: consolidate and simplify struct xenbus_driver instantiation
xen-gntalloc: introduce missing kfree
xen/xenbus: Fix compile error - missing header for xen_initial_domain()
xen/netback: Enable netback on HVM guests
xen/grant-table: Support mappings required by blkback
xenbus: Use grant-table wrapper functions
xenbus: Support HVM backends
xen/xenbus-frontend: Fix compile error with randconfig
xen/xenbus-frontend: Make error message more clear
xen/privcmd: Remove unused support for arch specific privcmp mmap
xen: Add xenbus_backend device
xen: Add xenbus device driver
xen: Add privcmd device driver
xen/gntalloc: fix reference counts on multi-page mappings
...
* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits)
reiserfs: Properly display mount options in /proc/mounts
vfs: prevent remount read-only if pending removes
vfs: count unlinked inodes
vfs: protect remounting superblock read-only
vfs: keep list of mounts for each superblock
vfs: switch ->show_options() to struct dentry *
vfs: switch ->show_path() to struct dentry *
vfs: switch ->show_devname() to struct dentry *
vfs: switch ->show_stats to struct dentry *
switch security_path_chmod() to struct path *
vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb
vfs: trim includes a bit
switch mnt_namespace ->root to struct mount
vfs: take /proc/*/mounts and friends to fs/proc_namespace.c
vfs: opencode mntget() mnt_set_mountpoint()
vfs: spread struct mount - remaining argument of next_mnt()
vfs: move fsnotify junk to struct mount
vfs: move mnt_devname
vfs: move mnt_list to struct mount
vfs: switch pnode.h macros to struct mount *
...
The MSI restore function will become a function pointer in an
x86_msi_ops struct. It defaults to the implementation in the
io_apic.c and msi.c. We piggyback on the indirection mechanism
introduced by "x86: Introduce x86_msi_ops".
Cc: x86@kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: linux-pci@vger.kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Skip cpus with apic-ids >= 255 in !x2apic_mode
x86, x2apic: Allow "nox2apic" to disable x2apic mode setup by BIOS
x86, x2apic: Fallback to xapic when BIOS doesn't setup interrupt-remapping
x86, acpi: Skip acpi x2apic entries if the x2apic feature is not present
x86, apic: Add probe() for apic_flat
x86: Simplify code by removing a !SMP #ifdefs from 'struct cpuinfo_x86'
x86: Convert per-cpu counter icr_read_retry_count into a member of irq_stat
x86: Add per-cpu stat counter for APIC ICR read tries
pci, x86/io-apic: Allow PCI_IOAPIC to be user configurable on x86
x86: Fix the !CONFIG_NUMA build of the new CPU ID fixup code support
x86: Add NumaChip support
x86: Add x86_init platform override to fix up NUMA core numbering
x86: Make flat_init_apic_ldr() available
During S3 or S4 resume or PCI reset, ATS regs aren't restored correctly.
This patch enables ATS at the device state restore if PCI device has ATS
capability.
Signed-off-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This warning was recently reported to me:
------------[ cut here ]------------
WARNING: at lib/kobject.c:595 kobject_put+0x50/0x60()
Hardware name: VMware Virtual Platform
kobject: '(null)' (ffff880027b0df40): is not initialized, yet kobject_put() is
being called.
Modules linked in: vmxnet3(+) vmw_balloon i2c_piix4 i2c_core shpchp raid10
vmw_pvscsi
Pid: 630, comm: modprobe Tainted: G W 3.1.6-1.fc16.x86_64 #1
Call Trace:
[<ffffffff8106b73f>] warn_slowpath_common+0x7f/0xc0
[<ffffffff8106b836>] warn_slowpath_fmt+0x46/0x50
[<ffffffff810da293>] ? free_desc+0x63/0x70
[<ffffffff812a9aa0>] kobject_put+0x50/0x60
[<ffffffff812e4c25>] free_msi_irqs+0xd5/0x120
[<ffffffff812e524c>] pci_enable_msi_block+0x24c/0x2c0
[<ffffffffa017c273>] vmxnet3_alloc_intr_resources+0x173/0x240 [vmxnet3]
[<ffffffffa0182e94>] vmxnet3_probe_device+0x615/0x834 [vmxnet3]
[<ffffffff812d141c>] local_pci_probe+0x5c/0xd0
[<ffffffff812d2cb9>] pci_device_probe+0x109/0x130
[<ffffffff8138ba2c>] driver_probe_device+0x9c/0x2b0
[<ffffffff8138bceb>] __driver_attach+0xab/0xb0
[<ffffffff8138bc40>] ? driver_probe_device+0x2b0/0x2b0
[<ffffffff8138bc40>] ? driver_probe_device+0x2b0/0x2b0
[<ffffffff8138a8ac>] bus_for_each_dev+0x5c/0x90
[<ffffffff8138b63e>] driver_attach+0x1e/0x20
[<ffffffff8138b240>] bus_add_driver+0x1b0/0x2a0
[<ffffffffa0188000>] ? 0xffffffffa0187fff
[<ffffffff8138c246>] driver_register+0x76/0x140
[<ffffffff815ca414>] ? printk+0x51/0x53
[<ffffffffa0188000>] ? 0xffffffffa0187fff
[<ffffffff812d2996>] __pci_register_driver+0x56/0xd0
[<ffffffffa018803a>] vmxnet3_init_module+0x3a/0x3c [vmxnet3]
[<ffffffff81002042>] do_one_initcall+0x42/0x180
[<ffffffff810aad71>] sys_init_module+0x91/0x200
[<ffffffff815dccc2>] system_call_fastpath+0x16/0x1b
---[ end trace 44593438a59a9558 ]---
Using INTx interrupt, #Rx queues: 1.
It occurs when populate_msi_sysfs fails, which in turn causes free_msi_irqs to
be called. Because populate_msi_sysfs fails, we never registered any of the
msi irq sysfs objects, but free_msi_irqs still calls kobject_del and kobject_put
on each of them, which gets flagged in the above stack trace.
The fix is pretty straightforward. We can key of the parent pointer in the
kobject. It is only set if the kobject_init_and_add succededs in
populate_msi_sysfs. If anything fails there, each kobject has its parent reset
to NULL
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Bjorn Helgaas <bhelgaas@google.com>
CC: Greg Kroah-Hartman <gregkh@suse.de>
CC: linux-pci@vger.kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
When the runtime PM is activated on PCI, if a device switches state
frequently (e.g. an EHCI controller with autosuspending USB devices
connected) the PCI configuration traces might be very verbose in the
kernel log. Let's guard those traces with DEBUG condition.
Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
All users of pci_create_bus() have been converted to pci_create_root_bus(),
so remove it.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Users of pci_scan_bus_parented() should be converted to use either
pci_scan_root_bus() (preferred, but also calls pci_bus_add_devices)
or
pci_create_root_bus()
pci_scan_child_bus()
Since pci_scan_bus_parented(), I'm marking it deprecated now and will
actually remove it later.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This converts pci_scan_bus_parented() to use pci_create_root_bus()
instead of pci_create_bus(). The new bus still has the default (incorrect)
resources, so this patch doesn't help fix that problem, but it does remove
one more use of pci_create_bus().
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
I plan to deprecate pci_scan_bus_parented(), so use pci_create_root_bus()
directly instead. pci_scan_bus() itself will be removed as soon as all
callers are gone, so this is just an interim step.
v2: export pci_scan_bus
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
"Early" and "header" quirks often use incorrect bus resources because they
see the default resources assigned by pci_create_bus(), before the
architecture fixes them up (typically in pcibios_fixup_bus()). Regions
reserved by these quirks end up with the wrong parents.
Here's the standard path for scanning a PCI root bus:
pci_scan_bus or pci_scan_bus_parented
pci_create_bus <-- A create with default resources
pci_scan_child_bus
pci_scan_slot
pci_scan_single_device
pci_scan_device
pci_setup_device
pci_fixup_device(early) <-- B
pci_device_add
pci_fixup_device(header) <-- C
pcibios_fixup_bus <-- D fill in correct resources
Early and header quirks at B and C use the default (incorrect) root bus
resources rather than those filled in at D.
This patch adds a new pci_scan_root_bus() function that sets the bus
resources correctly from a supplied list of resources.
I intend to remove pci_scan_bus() and pci_scan_bus_parented() after
fixing all callers.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
pci_create_bus() assigns ioport_resource and iomem_resource as the default
bus resources, i.e., the entire address space. Architectures fix these
later, typically in pcibios_fixup_bus() or after pci_scan_bus_parented()
returns, but code that runs in the interim sees incorrect resource
information.
This patch adds a new pci_create_root_bus() that sets the bus resources
correctly from a supplied list of resources.
I intend to remove pci_create_bus() after changing all callers.
Based on original patch by Deng-Cheng Zhu.
Reference: http://www.spinics.net/lists/mips/msg41654.html
Reference: https://lkml.org/lkml/2011/8/26/88
Signed-off-by: Deng-Cheng Zhu <dczhu@mips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Show the bus number and resources for every root bus we create. This
will become more interesting when we supply the correct resources
instead of using the defaults (ioport_resource and iomem_resource).
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We'd like to supply a list of resources when we create a new PCI bus,
e.g., the root bus under a PCI host bridge. These are helpers for
constructing that list.
These are exported because the plan is to replace this exported interface:
pci_scan_bus_parented()
with this one:
pci_add_resource(resources, ...)
pci_scan_root_bus(..., resources)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The SRIOV capability, namely page size and total_vfs of a device are
configured during enumeration phase of the device. This can potentially
interfere with the PCI operations of the platform, if the IOV capability
of the device is not enabled.
The following patch postpones the configuration of the IOV capability of
the device to a later point, when the IOV capability is explicitly
enabled by the device driver.
The patch is tested on x86 and power platform.
Tested-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
During debugging pcie hotplug with SRIOV with pcie switch, I found
pci_stop_bus_device() is called several times for some child devices.
So change original pci_remove_bus_device() to __pci_remove_bus_device(),
and make it only do remove work, and add a new pci_remove_bus_device
that calls pci_stop_bus_device() one time, and then call
__pci_remove_bus_device().
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The latency timer is read-only and hardwired to zero for all PCIe
devices, both Type 0 and Type 1, so don't bother trying to update it
and cluttering the dmesg log with meaningless "setting latency timer
to 64" messages.
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The 'latency timer' of PCI devices, both Type 0 and Type 1,
is setup in architecture-specific code [see: 'pcibios_set_master()'].
There are two approaches being taken by all the architectures - check
if the 'latency timer' is currently set between 16 and 255 and if not
bring it within bounds, or, do nothing (and then there is the
gratuitously different PA-RISC implementation).
There is nothing architecture-specific about PCI's 'latency timer' so
this patch pulls its setup functionality up into the PCI core by
creating a generic 'pcibios_set_master()' function using the '__weak'
attribute which can be used by all architectures as a default which,
if necessary, can then be over-ridden by architecture-specific code.
No functional change.
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
These new PCI services allow to probe for 2.3-compliant INTx masking
support and then use the feature from PCI interrupt handlers. The
services are properly synchronized with concurrent config space access
via sysfs or on device reset.
This enables generic PCI device drivers like uio_pci_generic or KVM's
device assignment to implement the necessary kernel-side IRQ handling
without any knowledge about device-specific interrupt status and control
registers.
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
pci_block_user_cfg_access was designed for the use case that a single
context, the IPR driver, temporarily delays user space accesses to the
config space via sysfs. This assumption became invalid by the time
pci_dev_reset was added as locking instance. Today, if you run two loops
in parallel that reset the same device via sysfs, you end up with a
kernel BUG as pci_block_user_cfg_access detect the broken assumption.
This reworks the pci_block_user_cfg_access to a sleeping service
pci_cfg_access_lock and an atomic-compatible variant called
pci_cfg_access_trylock. The former not only blocks user space access as
before but also waits if access was already locked. The latter service
just returns false in this case, allowing the caller to resolve the
conflict instead of raising a BUG.
Adaptions of the ipr driver were originally written by Brian King.
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Include the driver name and device in warning when a pci driver
supports both legacy pm and new framework as just the stack trace
gives no way to identify the driver.
Signed-off-by: David Fries <David@Fries.net>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
I traced a nasty kexec on panic boot failure to the fact that we had
screaming msi interrupts and we were not disabling the msi messages at
kernel startup. The booting kernel had not enabled those interupts so
was not prepared to handle them.
I can see no reason why we would ever want to leave the msi interrupts
enabled at boot if something else has enabled those interrupts. The pci
spec specifies that msi interrupts should be off by default. Drivers
are expected to enable the msi interrupts if they want to use them. Our
interrupt handling code reprograms the interrupt handlers at boot and
will not be be able to do anything useful with an unexpected interrupt.
This patch applies cleanly all of the way back to 2.6.32 where I noticed
the problem.
Cc: stable@kernel.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Modify pci_acpi_wake_dev() to avoid resuming PME-capable devices
whose PME Status bits are not set, which may happen currently if
several devices are associated with the same wakeup GPE and all
of them are notified whenever at least one of them signals PME.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Use non-ordered workqueue for attention button events.
Attention button events on each slot can be handled asynchronously. So
we should use non-ordered workqueue. This patch also removes ordered
workqueue in pciehp as a result.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Fix improper workqueue cleanup.
In the current pciehp, pcied_cleanup() calls destroy_workqueue()
before calling pcie_port_service_unregister(). This causes kernel oops
because flush_workqueue() is called in the pcie_port_service_unregister()
code path after the workqueue was destroyed. So pcied_cleanup() must call
pcie_port_service_unregister() first before calling destroy_workqueue().
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Right now we forcibly clear ASPM state on all devices if the BIOS indicates
that the feature isn't supported. Based on the Microsoft presentation
"PCI Express In Depth for Windows Vista and Beyond", I'm starting to think
that this may be an error. The implication is that unless the platform
grants full control via _OSC, Windows will not touch any PCIe features -
including ASPM. In that case clearing ASPM state would be an error unless
the platform has granted us that control.
This patch reworks the ASPM disabling code such that the actual clearing
of state is triggered by a successful handoff of PCIe control to the OS.
The general ASPM code undergoes some changes in order to ensure that the
ability to clear the bits isn't overridden by ASPM having already been
disabled. Further, this theoretically now allows for situations where
only a subset of PCIe roots hand over control, leaving the others in the
BIOS state.
It's difficult to know for sure that this is the right thing to do -
there's zero public documentation on the interaction between all of these
components. But enough vendors enable ASPM on platforms and then set this
bit that it seems likely that they're expecting the OS to leave them alone.
Measured to save around 5W on an idle Thinkpad X220.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
These are extended capabilities, rename and move to proper
group for consistency.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch adds a per-pci-device subdirectory in sysfs called:
/sys/bus/pci/devices/<device>/msi_irqs
This sub-directory exports the set of msi vectors allocated by a given
pci device, by creating a numbered sub-directory for each vector beneath
msi_irqs. For each vector various attributes can be exported.
Currently the only attribute is called mode, which tracks the
operational mode of that vector (msi vs. msix)
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
security_capable takes ns, cred, cap. But the LSM capable() hook takes
cred, ns, cap. The capability helper functions also take cred, ns, cap.
Rather than flip argument order just to flip it back, leave them alone.
Heck, this should be a little faster since argument will be in the right
place!
Signed-off-by: Eric Paris <eparis@redhat.com>
The 'name', 'owner', and 'mod_name' members are redundant with the
identically named fields in the 'driver' sub-structure. Rather than
switching each instance to specify these fields explicitly, introduce
a macro to simplify this.
Eliminate further redundancy by allowing the drvname argument to
DEFINE_XENBUS_DRIVER() to be blank (in which case the first entry from
the ID table will be used for .driver.name).
Also eliminate the questionable xenbus_register_{back,front}end()
wrappers - their sole remaining purpose was the checking of the
'owner' field, proper setting of which shouldn't be an issue anymore
when the macro gets used.
v2: Restore DRV_NAME for the driver name in xen-pciback.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
I noticed that hotplug of one setup does not work with recent change in
pci tree.
After checking the bridge conf setup, I noticed that the bridges get
assigned but do not get enabled.
The reason is the following commit, while simply ignores bridge
resources when enabling a pci device:
| commit bbef98ab0f
| Author: Ram Pai <linuxram@us.ibm.com>
| Date: Sun Nov 6 10:33:10 2011 +0800
|
| PCI: defer enablement of SRIOV BARS
|...
| NOTE: Note, there is subtle change in the pci_enable_device() API. Any
| driver that depends on SRIOV BARS to be enabled in pci_enable_device()
| can fail.
Put back bridge resource and ROM resource checking to fix the problem.
That should fix regression like BIOS does not assign correct resource to
bridge.
Discussion can be found at:
http://www.spinics.net/lists/linux-pci/msg12874.html
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
During test of one IB card with guest VM, found that, msi is not
initialized properly.
It turns out __write_msi_msg will do nothing if device current_state is
not PCI_D0. And, that pci device does not have pm_cap in guest VM.
There is an error in setting of power state to PCI_D0 in
pci_enable_device(), but error is not returned for this. Following is
code flow:
pci_enable_device() --> __pci_enable_device_flags() -->
do_pci_enable_device() --> pci_set_power_state() -->
__pci_start_power_transition()
We have following condition inside __pci_start_power_transition():
if (platform_pci_power_manageable(dev)) {
error = platform_pci_set_power_state(dev, state);
if (!error)
pci_update_current_state(dev, state);
} else {
error = -ENODEV;
/* Fall back to PCI_D0 if native PM is not supported */
if (!dev->pm_cap)
dev->current_state = PCI_D0;
}
Here, from platform_pci_set_power_state(), acpi_pci_set_power_state() is
getting called and that is failing with ENODEV because of following
condition:
if (!handle || ACPI_SUCCESS(acpi_get_handle(handle, "_EJ0",&tmp)))
return -ENODEV;
Because of that, pci_update_current_state() is not getting called.
With this patch, if device power state can not be set via
platform_pci_set_power_state and that device does not have native pm
support, then PCI device power state will be set to PCI_D0.
-v2: This also reverts 47e9037ac1, as it's
not needed after this change.
Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Ajaykumar Hotchandani<ajaykumar.hotchandani@oracle.com>
Signed-off-by: Yinghai Lu<yinghai.lu@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Commit 0d52f54e2e (PCI / ACPI: Make
acpiphp ignore root bridges using PCIe native hotplug) added code
that made the acpiphp driver completely ignore PCIe root complexes
for which the kernel had been granted control of the native PCIe
hotplug feature by the BIOS through _OSC. Unfortunately, however,
this was a mistake, because on some systems there were PCI bridges
supporting PCI (non-PCIe) hotplug under such root complexes and
those bridges should have been handled by acpiphp.
For this reason, revert the changes made by the commit mentioned
above and make register_slot() in drivers/pci/hotplug/acpiphp_glue.c
avoid registering hotplug slots for PCIe ports that belong to
root complexes with native PCIe hotplug enabled (which means that
the BIOS has granted the kernel control of this feature for the
given root complex). This is reported to address the original
issue fixed by commit 0d52f54e2e and
to work on the system where that commit broke things.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This adjusts PCI_IOAPIC to be user configurable (possibly as a
module) on x86, since the base architecture code for adding
IO-APICs dynamically isn't there yet (and hence having the code
present everywhere is pretty pointless).
To make this consistent, a MODULE_DEVICE_TABLE() declaration
gets added, the class specifications get corrected (by properly
using PCI_DEVICE_CLASS() intended for purposes like this), and
the probe and remove functions get their sections adjusted.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Link: http://lkml.kernel.org/r/4EDDD71A02000078000659F1@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>