Commit Graph

25 Commits (3954096d4b1adab6fbb3c26de37ddd4f781e072f)

Author SHA1 Message Date
Alexander Gordeev a5a391561b x86/apic: Eliminate cpu_mask_to_apicid() operation
Since there are only two locations where cpu_mask_to_apicid() is
called from, remove the operation and use only
cpu_mask_to_apicid_and() instead.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Suggested-and-acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120614074935.GE3383@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-14 12:53:13 +02:00
Alexander Gordeev 9d8e106676 x86/apic: Factor out default vector_allocation_domain() operation
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120607131449.GC4759@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-08 11:44:27 +02:00
Alexander Gordeev 6398268d2b x86/apic: Factor out default cpu_mask_to_apicid() operations
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120605112340.GA11454@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 10:22:18 +02:00
Alexander Gordeev bf721d3a3b x86/apic: Factor out default target_cpus() operation
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120605112324.GA11449@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 10:22:17 +02:00
Suresh Siddha 0b8255e660 x86/x2apic/cluster: Use all the members of one cluster specified in the smp_affinity mask for the interrupt destination
If the HW implements round-robin interrupt delivery, this
enables multiple cpu's (which are part of the user specified
interrupt smp_affinity mask and belong to the same x2apic
cluster) to service the interrupt.

Also if the platform supports Power Aware Interrupt Routing,
then this enables the interrupt to be routed to an idle cpu or a
busy cpu depending on the perf/power bias tunable.

We are now grouping all the cpu's in a cluster to one vector
domain. So that will limit the total number of interrupt sources
handled by Linux. Previously we support "cpu-count *
available-vectors-per-cpu" interrupt sources but this will now
reduce to "cpu-count/16 * available-vectors-per-cpu".

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: yinghai@kernel.org
Cc: gorcunov@openvz.org
Cc: agordeev@redhat.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1337644682-19854-2-git-send-email-suresh.b.siddha@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 09:51:22 +02:00
Michael S. Tsirkin 0ab711ae6a x86/apic: Implement EIO micro-optimization
We know both register and value for eoi beforehand,
so there's no need to check it and no need to do math
to calculate the msr. Saves instructions/branches
on each EOI when using x2apic.

I looked at the objdump output to verify that the
generated code looks right and actually is shorter.

The real improvemements will be on the KVM guest side
though, those come in a later patch.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: gleb@redhat.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/e019d1a125316f10d3e3a4b2f6bda41473f4fb72.1337184153.git.mst@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-18 09:46:09 +02:00
Michael S. Tsirkin 2a43195d83 x86/apic: Add apic->eoi_write() callback
Add eoi_write callback so that kvm can override
eoi accesses without touching the rest of the apic.
As a side-effect, this will enable a micro-optimization
for apics using msr.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: gleb@redhat.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/0df425d746c49ac2ecc405174df87752869629d2.1337184153.git.mst@redhat.com
[ tidied it up a bit ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-18 09:46:08 +02:00
Greg Pearson ea0dcf903e x86/apic: Use x2apic physical mode based on FADT setting
Provide systems that do not support x2apic cluster mode
a mechanism to select x2apic physical mode using the
FADT FORCE_APIC_PHYSICAL_DESTINATION_MODE bit.

Changes from v1: (based on Suresh's comments)
 - removed #ifdef CONFIG_ACPI
 - removed #include <linux/acpi.h>

Signed-off-by: Greg Pearson <greg.pearson@hp.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1335313436-32020-1-git-send-email-greg.pearson@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-25 12:47:08 +02:00
Steffen Persvold b7157acf42 x86/apic: Add separate apic_id_valid() functions for selected apic drivers
As suggested by Suresh Siddha and Yinghai Lu:

For x2apic pre-enabled systems, apic driver is set already early
through early_acpi_boot_init()/early_acpi_process_madt()/
acpi_parse_madt()/default_acpi_madt_oem_check() path so that
apic_id_valid() checking will be sufficient during MADT and SRAT
parsing.

For non-x2apic pre-enabled systems, all apic ids should be less
than 255.

This allows us to substitute the checks in
arch/x86/kernel/acpi/boot.c::acpi_parse_x2apic() and
arch/x86/mm/srat.c::acpi_numa_x2apic_affinity_init() with
apic->apic_id_valid().

In addition we can avoid feigning the x2apic cpu feature in the
NumaChip apic code.

The following apic drivers have separate apic_id_valid()
functions which will accept x2apic type IDs :

 x2apic_phys
 x2apic_cluster
 x2apic_uv_x
 apic_numachip

Signed-off-by: Steffen Persvold <sp@numascale.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Daniel J Blueman <daniel@numascale-asia.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Jack Steiner <steiner@sgi.com>
Link: http://lkml.kernel.org/r/1331925935-13372-1-git-send-email-sp@numascale.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-03-23 13:28:43 +01:00
Daniel J Blueman fa63030e9c x86/platform: Move APIC ID validity check into platform APIC code
Move APIC ID validity check into platform APIC code, so it can
be overridden when needed. For NumaChip systems, always trust
MADT, as it's constructed with high APIC IDs.

Behaviour verifies on standard x86 systems and on NumaChip
systems with this, and compile-tested with allyesconfig.

Signed-off-by: Daniel J Blueman <daniel@numascale-asia.com>
Reviewed-by: Steffen Persvold <sp@numascale.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1331709454-27966-1-git-send-email-daniel@numascale-asia.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-14 09:49:48 +01:00
Suresh Siddha 1a8880a142 x86, apic: Make apic drivers static
Apic probe now looks at the apic drivers listed in the
.apicdrivers section. Remove apic_probe[] and make each apic
driver static.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Tested-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: steiner@sgi.com
Cc: gorcunov@openvz.org
Cc: yinghai@kernel.org
Link: http://lkml.kernel.org/r/20110521005526.341718626@sbsiddha-MOBL3.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-22 11:48:04 +02:00
Suresh Siddha 107e0e0cd8 x86, apic: Introduce .apicdrivers section to find the list of apic drivers
This will pave the way for each apic driver to be self-contained
and eliminate the need for apic_probe[].

Order in which apic drivers are listed in the .apicdrivers
section is important, as this determines the apic probe order.
And this is enforced by the ordering of apic driver files in the
Makefile and the macros apic_driver()/apic_drivers().

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Tested-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: steiner@sgi.com
Cc: gorcunov@openvz.org
Cc: yinghai@kernel.org
Link: http://lkml.kernel.org/r/20110521005526.068775085@sbsiddha-MOBL3.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-22 11:48:03 +02:00
Cyrill Gorcunov 79deb8e511 x86, x2apic: Move the common bits to x2apic.h
To eliminate code duplication.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: steiner@sgi.com
Cc: yinghai@kernel.org
Link: http://lkml.kernel.org/r/20110519234637.591426753@sbsiddha-MOBL3.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-20 13:41:11 +02:00
Suresh Siddha a27d0b5e7d x86, x2apic: Remove duplicate code for IPI mask routines
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: steiner@sgi.com
Cc: yinghai@kernel.org
Link: http://lkml.kernel.org/r/20110519234637.337024125@sbsiddha-MOBL3.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-20 13:41:06 +02:00
Suresh Siddha 9ebd680bd0 x86, apic: Use probe routines to simplify apic selection
Use the unused probe routine in the apic driver to finalize the
apic model selection. This cleans up the
default_setup_apic_routing() and this probe routine in future
can also be used for doing any apic model specific
initialisation.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: steiner@sgi.com
Cc: yinghai@kernel.org
Link: http://lkml.kernel.org/r/20110519234637.247458931@sbsiddha-MOBL3.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-20 13:41:04 +02:00
Tejun Heo 89e5dc218e x86: Replace apic->apicid_to_node() with ->x86_32_numa_cpu_node()
apic->apicid_to_node() is 32bit specific apic operation which
determines NUMA node for a CPU.  Depending on the APIC
implementation, it can be easier to determine NUMA node from
either physical or logical apicid.  Currently,
->apicid_to_node() takes @logical_apicid and calls
hard_smp_processor_id() if the physical apicid is needed.

This prevents NUMA mapping from being queried from a different
CPU, which in turn makes it impossible to initialize NUMA
mapping before SMP bringup.

This patch replaces apic->apicid_to_node() with
->x86_32_numa_cpu_node() which takes @cpu, from which both
logical and physical apicids can easily be determined.  While at
it, drop duplicate implementations from bigsmp_32 and summit_32,
and use the default one.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Cc: eric.dumazet@gmail.com
Cc: yinghai@kernel.org
Cc: brgerst@gmail.com
Cc: gorcunov@gmail.com
Cc: shaohui.zheng@intel.com
Cc: rientjes@google.com
LKML-Reference: <1295789862-25482-13-git-send-email-tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-28 14:54:08 +01:00
Tejun Heo 7632611f53 x86: Kill apic->cpu_to_logical_apicid()
After the previous patch, apic->cpu_to_logical_apicid() is no
longer used.  Kill it.

For apic types with custom cpu_to_logical_apicid() which is also
used for other purposes, remove the function and modify its
users to do the mapping directly.

#ifdef's on CONFIG_SMP in es7000_32 and summit_32 are ignored
during conversion as they are not used for UP kernels.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: eric.dumazet@gmail.com
Cc: yinghai@kernel.org
Cc: brgerst@gmail.com
Cc: gorcunov@gmail.com
Cc: penberg@kernel.org
Cc: shaohui.zheng@intel.com
Cc: rientjes@google.com
LKML-Reference: <1295789862-25482-7-git-send-email-tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-28 14:54:06 +01:00
Suresh Siddha 18374d89e5 x86, irq: Allow 0xff for /proc/irq/[n]/smp_affinity on an 8-cpu system
John Blackwood reported:
> on an older Dell PowerEdge 6650 system with 8 cpus (4 are hyper-threaded),
> and  32 bit (x86) kernel, once you change the irq smp_affinity of an irq
> to be less than all cpus in the system, you can never change really the
> irq smp_affinity back to be all cpus in the system (0xff) again,
> even though no error status is returned on the "/bin/echo ff >
> /proc/irq/[n]/smp_affinity" operation.
>
> This is due to that fact that BAD_APICID has the same value as
> all cpus (0xff) on 32bit kernels, and thus the value returned from
> set_desc_affinity() via the cpu_mask_to_apicid_and() function is treated
> as a failure in set_ioapic_affinity_irq_desc(), and no affinity changes
> are made.

set_desc_affinity() is already checking if the incoming cpu mask
intersects with the cpu online mask or not. So there is no need
for the apic op cpu_mask_to_apicid_and() to check again
and return BAD_APICID.

Remove the BAD_APICID return value from cpu_mask_to_apicid_and()
and also fix set_desc_affinity() to return -1 instead of using BAD_APICID
to represent error conditions (as cpu_mask_to_apicid_and() can return
logical or physical apicid values and BAD_APICID is really to represent
bad physical apic id).

Reported-by: John Blackwood <john.blackwood@ccur.com>
Root-caused-by: John Blackwood <john.blackwood@ccur.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <1261103386.2535.409.camel@sbs-t61>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-17 22:03:06 -08:00
Yinghai Lu 087d7e56de x86: Fix MSI-X initialization by using online_mask for x2apic target_cpus
found a system where x2apic reports an MSI-X irq initialization
failure:

[  302.859446] igbvf 0000:81:10.4: enabling device (0000 -> 0002)
[  302.874369] igbvf 0000:81:10.4: using 64bit DMA mask
[  302.879023] igbvf 0000:81:10.4: using 64bit consistent DMA mask
[  302.894386] igbvf 0000:81:10.4: enabling bus mastering
[  302.898171] igbvf 0000:81:10.4: setting latency timer to 64
[  302.914050] reserve_memtype added 0xefb08000-0xefb0c000, track uncached-minus, req uncached-minus, ret uncached-minus
[  302.933839] reserve_memtype added 0xefb28000-0xefb29000, track uncached-minus, req uncached-minus, ret uncached-minus
[  302.940367]   alloc irq_desc for 265 on node 4
[  302.956874]   alloc kstat_irqs on node 4
[  302.959452] alloc irq_2_iommu on node 0
[  302.974328] igbvf 0000:81:10.4: irq 265 for MSI/MSI-X
[  302.977778]   alloc irq_desc for 266 on node 4
[  302.980347]   alloc kstat_irqs on node 4
[  302.995312] free_memtype request 0xefb28000-0xefb29000
[  302.998816] igbvf 0000:81:10.4: Failed to initialize MSI-X interrupts.

... it turns out that when trying to enable MSI-X,
__assign_irq_vector(new, cfg_new, apic->target_cpus()) can not
get vector because for x2apic target-cpus returns cpumask_of(0)

Update that to online_mask like xapic.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <4A785AFF.3050902@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-08 17:04:58 +02:00
Yinghai Lu d8c7eb34c2 x86: Don't use current_cpu_data in x2apic phys_pkg_id
One system has socket 1 come up as BSP.

kexeced kernel reports BSP as:

[    1.524550] Initializing cgroup subsys cpuacct
[    1.536064] initial_apicid:20
[    1.537135] ht_mask_width:1
[    1.538128] core_select_mask:f
[    1.539126] core_plus_mask_width:5
[    1.558479] CPU: Physical Processor ID: 0
[    1.559501] CPU: Processor Core ID: 0
[    1.560539] CPU: L1 I cache: 32K, L1 D cache: 32K
[    1.579098] CPU: L2 cache: 256K
[    1.580085] CPU: L3 cache: 24576K
[    1.581108] CPU 0/0x20 -> Node 0
[    1.596193] CPU 0 microcode level: 0xffff0008

It doesn't have correct physical processor id and will get an
error:

[   38.840859] CPU0 attaching sched-domain:
[   38.848287]  domain 0: span 0,8,72 level SIBLING
[   38.851151]   groups: 0 8 72
[   38.858137]   domain 1: span 0,8-15,72-79 level MC
[   38.868944]    groups: 0,8,72 9,73 10,74 11,75 12,76 13,77 14,78 15,79
[   38.881383] ERROR: parent span is not a superset of domain->span
[   38.890724]    domain 2: span 0-7,64-71 level CPU
[   38.899237] ERROR: domain->groups does not contain CPU0
[   38.909229]     groups: 8-15,72-79
[   38.912547] ERROR: groups don't span domain->span
[   38.919665]     domain 3: span 0-127 level NODE
[   38.930739]      groups: 0-7,64-71 8-15,72-79 16-23,80-87 24-31,88-95 32-39,96-103 40-47,104-111 48-55,112-119 56-63,120-127

it turns out: we can not use current_cpu_data in phys_pgd_id
for x2apic.

identify_boot_cpu() is called by check_bugs() before
smp_prepare_cpus() and till smp_prepare_cpus() current_cpu_data
for bsp is assigned with boot_cpu_data.

Just make phys_pkg_id for x2apic is aligned to xapic.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <4A6ADD0D.10002@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-04 16:22:44 +02:00
Suresh Siddha ce4e240c27 x86: add x2apic_wrmsr_fence() to x2apic flush tlb paths
Impact: optimize APIC IPI related barriers

Uncached MMIO accesses for xapic are inherently serializing and hence
we don't need explicit barriers for xapic IPI paths.

x2apic MSR writes/reads don't have serializing semantics and hence need
a serializing instruction or mfence, to make all the previous memory
stores globally visisble before the x2apic msr write for IPI.

Add x2apic_wrmsr_fence() in flush tlb path to x2apic specific paths.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: "steiner@sgi.com" <steiner@sgi.com>
Cc: Nick Piggin <npiggin@suse.de>
LKML-Reference: <1237313814.27006.203.camel@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-18 09:36:14 +01:00
Ingo Molnar 1f5bcabf1b x86: apic: simplify secondary CPU wakeup methods
Impact: cleanup

- rename apic->wakeup_cpu  to apic->wakeup_secondary_cpu, to
  make it apparent that this is an SMP-only method

- handle NULL ->wakeup_secondary_cpus to mean the default INIT
  wakeup sequence - this allows simplification of the APIC
  driver templates.

Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26 13:58:26 +01:00
Yinghai Lu 2b6163bf57 x86: remove update_apic from x86_quirks
Impact: cleanup

x86_quirks->update_apic() calling looks crazy. so try to remove it:

 1. every apic take wakeup_cpu member directly
 2. separate es7000_apic to es7000_apic_cluster
 3. use uv_wakeup_cpu directly

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26 06:32:25 +01:00
Suresh Siddha ef1f87aa7b x86: select x2apic ops in early apic probe only if x2apic mode is enabled
If BIOS hands over the control to OS in legacy xapic mode, select
legacy xapic related ops in the early apic probe and shift to x2apic
ops later in the boot sequence, only after enabling x2apic mode.

If BIOS hands over the control in x2apic mode, select x2apic related
ops in the early apic probe.

This fixes the early boot panic, where we were selecting x2apic ops,
while the cpu is still in legacy xapic mode.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-22 18:20:50 +01:00
Ingo Molnar f62bae5009 x86, apic: move APIC drivers to arch/x86/kernel/apic/*
arch/x86/kernel/ is getting a bit crowded, and the APIC
drivers are scattered into various different files.

Move them to arch/x86/kernel/apic/*, and also remove
the 'gen' prefix from those which had it.

Also move APIC related functionality: the IO-APIC driver,
the NMI and the IPI code.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 18:17:36 +01:00