q3k/linux

Author	SHA1	Message	Date
Ingo Molnar	36d93d88a5	Revert "x86/early_printk: Replace obsolete simple_strtoul() usage with kstrtoint()" This reverts commit `fbd24153c4`. This commit is subtly buggy: kstrto*int() can return an error but it's not checked in every path. simple_strtoul() on the other hand could not fail, so this patch subtly introduces new failure modes. Signed-off-by: Shuah Khan <shuahkhan@gmail.com> Link: http://lkml.kernel.org/r/1338424803.3569.5.camel@lorien2 Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-22 15:47:52 +02:00
Geert Uytterhoeven	2e76c2838a	module.c: spelling s/postition/position/g Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2012-07-20 10:38:35 +02:00
Liu, Jinsong	a8fccdb061	x86, MCE, AMD: Adjust initcall sequence for xen there are 3 funcs which need to be _initcalled in a logic sequence: 1. xen_late_init_mcelog 2. mcheck_init_device 3. threshold_init_device xen_late_init_mcelog must register xen_mce_chrdev_device before native mce_chrdev_device registration if running under xen platform; mcheck_init_device should be inited before threshold_init_device to initialize mce_device, otherwise a NULL ptr dereference will cause panic. so we use following _initcalls 1. device_initcall(xen_late_init_mcelog); 2. device_initcall_sync(mcheck_init_device); 3. late_initcall(threshold_init_device); when running under xen, the initcall order is 1,2,3; on baremetal, we skip 1 and we do only 2 and 3. Acked-and-tested-by: Borislav Petkov <bp@amd64.org> Suggested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-07-19 15:51:37 -04:00
Liu, Jinsong	cef12ee52b	xen/mce: Add mcelog support for Xen platform When MCA error occurs, it would be handled by Xen hypervisor first, and then the error information would be sent to initial domain for logging. This patch gets error information from Xen hypervisor and convert Xen format error into Linux format mcelog. This logic is basically self-contained, not touching other kernel components. By using tools like mcelog tool users could read specific error information, like what they did under native Linux. To test follow directions outlined in Documentation/acpi/apei/einj.txt Acked-and-tested-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Ke, Liping <liping.ke@intel.com> Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-07-19 15:51:36 -04:00
Avi Kivity	d63d3e6217	x86, hyper: fix build with !CONFIG_KVM_GUEST Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2012-07-18 17:01:48 -03:00
Ingo Molnar	a2fe194723	Merge branch 'linus' into perf/core Pick up the latest ring-buffer fixes, before applying a new fix. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-18 11:17:17 +02:00
Michael S. Tsirkin	9053666406	KVM guest: switch to apic_set_eoi_write, apic_write Use apic_set_eoi_write, apic_write to avoid meddling in core apic driver data structures directly. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-07-16 12:51:44 +03:00
Michael S. Tsirkin	1551df646d	apic: add apic_set_eoi_write for PV use KVM PV EOI optimization overrides eoi_write apic op with its own version. Add an API for this to avoid meddling with core x86 apic driver data structures directly. For KVM use, we don't need any guarantees about when the switch to the new op will take place, so it could in theory use this API after SMP init, but it currently doesn't, and restricting callers to early init makes it clear that it's safe as it won't race with actual APIC driver use. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-07-16 12:51:23 +03:00
Will Drewry	09d314425f	vsyscall_64: add missing ifdef CONFIG_SECCOMP vsyscall_seccomp introduced a dependency on __secure_computing. On configurations with CONFIG_SECCOMP disabled, compilation will fail. Reported-by: feng xiangjun <fengxj325@gmail.com> Signed-off-by: Will Drewry <wad@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-07-14 12:01:36 -07:00
Will Drewry	5651721ede	x86/vsyscall: allow seccomp filter in vsyscall=emulate If a seccomp filter program is installed, older static binaries and distributions with older libc implementations (glibc 2.13 and earlier) that rely on vsyscall use will be terminated regardless of the filter program policy when executing time, gettimeofday, or getcpu. This is only the case when vsyscall emulation is in use (vsyscall=emulate is the default). This patch emulates system call entry inside a vsyscall=emulate by populating regs->ax and regs->orig_ax with the system call number prior to calling into seccomp such that all seccomp-dependencies function normally. Additionally, system call return behavior is emulated in line with other vsyscall entrypoints for the trace/trap cases. [ v2: fixed ip and sp on SECCOMP_RET_TRAP/TRACE (thanks to luto@mit.edu) ] Reported-and-tested-by: Owen Kibel <qmewlo@gmail.com> Signed-off-by: Will Drewry <wad@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-07-13 14:25:55 -07:00
Ingo Molnar	bb65a764de	Merge branch 'mce-ripvfix' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/mce Merge memory fault handling fix from Tony Luck. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-11 22:37:48 +02:00
Tony Luck	6751ed65dc	x86/mce: Fix siginfo_t->si_addr value for non-recoverable memory faults In commit `dad1743e59` ("x86/mce: Only restart instruction after machine check recovery if it is safe") we fixed mce_notify_process() to force a signal to the current process if it was not restartable (RIPV bit not set in MCG_STATUS). But doing it here means that the process doesn't get told the virtual address of the fault via siginfo_t->si_addr. This would prevent application level recovery from the fault. Make a new MF_MUST_KILL flag bit for memory_failure() et al. to use so that we will provide the right information with the signal. Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Cc: stable@kernel.org # 3.4+	2012-07-11 10:20:47 -07:00
Prarit Bhargava	fc73373b33	KVM: Add x86_hyper_kvm to complete detect_hypervisor_platform check While debugging I noticed that unlike all the other hypervisor code in the kernel, kvm does not have an entry for x86_hyper which is used in detect_hypervisor_platform() which results in a nice printk in the syslog. This is only really a stub function but it does make kvm more consistent with the other hypervisors. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Cc: Avi Kivity <avi@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Marcelo Tostatti <mtosatti@redhat.com> Cc: kvm@vger.kernel.org Signed-off-by: Avi Kivity <avi@redhat.com>	2012-07-11 19:33:32 +03:00
Ingo Molnar	92254d3144	Linux 3.5-rc6 Merge tag 'v3.5-rc6' into x86/mce Merge Linux 3.5-rc6 before merging more code. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-11 09:41:37 +02:00
Jan Beulich	a7101d1526	x86/mm/mtrr: Slightly simplify print_mtrr_state() high_width can be easily calculated in a single expression when making use of __ffs64(). Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/4FF71053020000780008E1B5@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-10 10:38:15 +02:00
Jan Beulich	1ba9a29414	x86/mm/mtrr: Fix alignment determination in range_to_mtrr() With the variable operated on being of "unsigned long" type, neither ffs() nor fls() are suitable to use on them, as those truncate their arguments to 32 bits. Using __ffs() and __fls() respectively at once eliminates the need to subtract 1 from their results. Additionally, with the alignment value subsequently used as a shift count, it must be enforced to be less than BITS_PER_LONG (and on 64-bit there's no need for it to be any smaller). Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/4FF70D54020000780008E179@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-10 10:38:14 +02:00
Pekka Enberg	c3b7cdf180	perf/x86: Fix intel_perfmon_event_map formatting Use tabs for "intel_perfmon_event_map" formatting in perf_event_intel.c. Signed-off-by: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/1341568786-7045-1-git-send-email-penberg@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-06 13:16:15 +02:00
Suresh Siddha	d872818dbb	x86/apic/x2apic: Use multiple cluster members for the irq destination only with the explicit affinity During boot or driver load etc, interrupt destination is setup using default target cpu's. Later the user (irqbalance etc) or the driver (irq_set_affinity/ irq_set_affinity_hint) can request the interrupt to be migrated to some specific set of cpu's. In the x2apic cluster routing, for the default scenario use single cpu as the interrupt destination and when there is an explicit interrupt affinity request, route the interrupt to multiple members of a x2apic cluster specified in the cpumask of the migration request. This will minimize the vector pressure when there are a lot of interrupt sources and relatively few x2apic clusters (for example a single socket server). This will allow the performance critical interrupts to be routed to multiple cpu's in the x2apic cluster (irqbalance for example uses the cache siblings etc while specifying the interrupt destination) and allow non-critical interrupts to be serviced by a single logical cpu. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Alexander Gordeev <agordeev@redhat.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Link: http://lkml.kernel.org/r/1340656709-11423-4-git-send-email-suresh.b.siddha@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-06 11:00:23 +02:00
Suresh Siddha	1ac322d0b1	x86/apic/x2apic: Limit the vector reservation to the user specified mask For the x2apic cluster mode, vector for an interrupt is currently reserved on all the cpu's that are part of the x2apic cluster. But the interrupts will be routed only to the cluster (derived from the first cpu in the mask) members specified in the mask. So there is no need to reserve the vector in the unused cluster members. Modify __assign_irq_vector() to reserve the vectors based on the user specified irq destination mask. If the new mask is a proper subset of the currently used mask, cleanup the vector allocation on the unused cpu members. Also, allow the apic driver to tune the vector domain based on the affinity mask (which in most cases is the user-specified mask). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Alexander Gordeev <agordeev@redhat.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Link: http://lkml.kernel.org/r/1340656709-11423-3-git-send-email-suresh.b.siddha@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-06 11:00:22 +02:00
Suresh Siddha	b39f25a849	x86/apic: Optimize cpu traversal in __assign_irq_vector() using domain membership Currently __assign_irq_vector() goes through each cpu in the specified mask until it finds a free vector in all the cpu's that are part of the same interrupt domain. We visit all the interrupt domain sibling cpus to reserve the free vector. So, when we fail to find a free vector in an interrupt domain, it is safe to continue our search with a cpu belonging to a new interrupt domain. No need to go through each cpu, if the domain containing that cpu is already visited. Use the irq_cfg's old_domain to track the visited domains and optimize the cpu traversal while finding a free vector in the given cpumask. NOTE: We can also optimize the search by using for_each_cpu() and skip the current cpu, if it is not the first cpu in the mask returned by the vector_allocation_domain(). But re-using the cfg->old_domain to track the visited domains will be slightly faster. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Alexander Gordeev <agordeev@redhat.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Link: http://lkml.kernel.org/r/1340656709-11423-2-git-send-email-suresh.b.siddha@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-06 11:00:21 +02:00
Yan, Zheng	6a67943a18	perf/x86: Uncore filter support for SandyBridge-EP This patch adds C-Box and PCU filter support for SandyBridge-EP uncore. We can filter C-Box events by thread/core ID and filter PCU events by frequency/voltage. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1341381616-12229-5-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 21:56:01 +02:00
Yan, Zheng	4208969724	perf/x86: Detect number of instances of uncore CBox The CBox manages the interface between the core and the LLC, so the number of uncore CBox instances is equal to the number of cores. Reported-by: Andrew Cooks <acooks@gmail.com> Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1341381616-12229-4-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 21:56:00 +02:00
Yan, Zheng	3b19e4c98c	perf/x86: Fix event constraint for SandyBridge-EP C-Box The constraint for C-Box event 0x1f should have overlap flag set. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1340866596-22502-2-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 21:55:59 +02:00
Yan, Zheng	eca26c9950	perf/x86: Use 0xff as pseudo code for fixed uncore event Stephane Eranian suggested using 0xff as pseudo code for fixed uncore event and using the umask value to determine which of the fixed events we want to map to. So far there is at most one fixed counter in a uncore PMU. So just change the definition of UNCORE_FIXED_EVENT to 0xff. Suggested-by: Stephane Eranian <eranian@google.com> Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1340780953-21130-1-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 21:55:58 +02:00
Peter Zijlstra	3e0091e2b6	perf/x86: Save a few bytes in 'struct x86_pmu' All these are basically boolean flags, use a bitfield to save a few bytes. Suggested-by: Borislav Petkov <bp@amd64.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-vsevd5g8lhcn129n3s7trl7r@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 21:55:58 +02:00
Peter Zijlstra	c93dc84cbe	perf/x86: Add a microcode revision check for SNB-PEBS Recent Intel microcode resolved the SNB-PEBS issues, so conditionally enable PEBS on SNB hardware depending on the microcode revision. Thanks to Stephane for figuring out the various microcode revisions. Suggested-by: Stephane Eranian <eranian@google.com> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-v3672ziwh9damwqwh1uz3krm@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 21:55:57 +02:00
Robert Richter	f285f92f7e	perf/x86: Improve debug output in check_hw_exists() It might be of interest which perfctr msr failed. Signed-off-by: Robert Richter <robert.richter@amd.com> [ added hunk to avoid GCC warn ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1340217996-2254-5-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 21:19:42 +02:00
Robert Richter	b1dc3c4820	perf/x86/amd: Unify AMD's generic and family 15h pmus There is no need for keeping separate pmu structs. We can enable amd_{get,put}_event_constraints() functions also for family 15h event. The advantage is that there is only a single pmu struct for all AMD cpus. This patch introduces functions to setup the pmu to enable core performance counters or counter constraints. Also, cpuid checks are used instead of family checks where possible. Thus, it enables the code independently of cpu families if the feature flag is set. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1340217996-2254-4-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 21:19:41 +02:00
Robert Richter	a1eac7ac90	perf/x86: Move Intel specific code to intel_pmu_init() There is some Intel specific code in the generic x86 path. Move it to intel_pmu_init(). Since p4 and p6 pmus don't have fixed counters we may skip the check in case such a pmu is detected. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1340217996-2254-3-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 21:19:40 +02:00
Robert Richter	15c7ad51ad	perf/x86: Rename Intel specific macros There are macros that are Intel specific and not x86 generic. Rename them into INTEL_*. This patch removes X86_PMC_IDX_GENERIC and does: $ sed -i -e 's/X86_PMC_MAX_/INTEL_PMC_MAX_/g' \ arch/x86/include/asm/kvm_host.h \ arch/x86/include/asm/perf_event.h \ arch/x86/kernel/cpu/perf_event.c \ arch/x86/kernel/cpu/perf_event_p4.c \ arch/x86/kvm/pmu.c $ sed -i -e 's/X86_PMC_IDX_FIXED/INTEL_PMC_IDX_FIXED/g' \ arch/x86/include/asm/perf_event.h \ arch/x86/kernel/cpu/perf_event.c \ arch/x86/kernel/cpu/perf_event_intel.c \ arch/x86/kernel/cpu/perf_event_intel_ds.c \ arch/x86/kvm/pmu.c $ sed -i -e 's/X86_PMC_MSK_/INTEL_PMC_MSK_/g' \ arch/x86/include/asm/perf_event.h \ arch/x86/kernel/cpu/perf_event.c Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1340217996-2254-2-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 21:19:39 +02:00
Ingo Molnar	1070505d18	Merge branch 'x86/microcode' into perf/core Merge this branch because we want to rely on the newer (and saner) microcode loading and checking facilities. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 21:13:57 +02:00
Ingo Molnar	b0338e99b2	Merge branch 'x86/cpu' into perf/core Merge this branch because we changed the wrmsr*_safe() API and there's a conflict. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 21:12:11 +02:00
Ingo Molnar	90574ebb7e	Merge branch 'perf/urgent' into perf/core Merge this branch to pick up a fixlet and to update to a more recent base. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 21:10:23 +02:00
Peter Zijlstra	ce5c1fe9a9	perf/x86: Fix USER/KERNEL tagging of samples Several perf interrupt handlers (PEBS,IBS,BTS) re-write regs->ip but do not update the segment registers. So use a regs->ip based test instead of a regs->cs/regs->flags based test. Reported-and-tested-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-xxrt0a1zronm1sm36obwc2vy@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-05 20:59:07 +02:00
Borislav Petkov	3d8986bc7f	x86, microcode: Make reload interface per system The reload interface should be per-system so that a full system ucode reload happens (on each core) when doing echo 1 > /sys/devices/system/cpu/microcode/reload Move it to the cpu subsys directory instead of it being per-cpu. Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/r/1340280437-7718-3-git-send-email-bp@amd64.org Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-07-01 10:24:09 -07:00
Borislav Petkov	c9fc3f778a	x86, microcode: Sanitize per-cpu microcode reloading interface Microcode reloading in a per-core manner is a very bad idea for both major x86 vendors. And the thing is, we have such interface with which we can end up with different microcode versions applied on different cores of an otherwise homogeneous wrt (family,model,stepping) system. So turn off the possibility of doing that per core and allow it only system-wide. This is a minimal fix which we'd like to see in stable too thus the more-or-less arbitrary decision to allow system-wide reloading only on the BSP: $ echo 1 > /sys/devices/system/cpu/cpu0/microcode/reload ... and disable the interface on the other cores: $ echo 1 > /sys/devices/system/cpu/cpu23/microcode/reload -bash: echo: write error: Invalid argument Also, allowing the reload only from one CPU (the BSP in that case) doesn't allow the reload procedure to degenerate into an O(n^2) deal when triggering reloads from all /sys/devices/system/cpu/cpuX/microcode/reload sysfs nodes simultaneously. A more generic fix will follow. Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/r/1340280437-7718-2-git-send-email-bp@amd64.org Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: <stable@vger.kernel.org>	2012-07-01 10:24:05 -07:00
Linus Torvalds	c76760926a	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull ACPI & Power Management patches from Len Brown. * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: acpi_pad: fix power_saving thread deadlock ACPI video: Still use ACPI backlight control if _DOS doesn't exist ACPI, APEI, Avoid too much error reporting in runtime ACPI: Add a quirk for "AMILO PRO V2030" to ignore the timer overriding ACPI: Remove one board specific WARN when ignoring timer overriding ACPI: Make acpi_skip_timer_override cover all source_irq==0 cases ACPI, x86: fix Dell M6600 ACPI reboot regression via DMI ACPI sysfs.c strlen fix	2012-06-30 11:11:58 -07:00
Len Brown	6eca954e25	Merge branches 'acpi_pad-bugzilla-42981', 'apei-bugzilla-43282', 'video-bugzilla-43168', 'bugzilla-40002' and 'bugfix-misc' into release bug fixes	2012-06-30 00:53:50 -04:00
Fenghua Yu	954e482bde	x86/copy_user_generic: Optimize copy_user_generic with CPU erms feature According to Intel 64 and IA-32 SDM and Optimization Reference Manual, beginning with Ivybridge, REG string operation using MOVSB and STOSB can provide both flexible and high-performance REG string operations in cases like memory copy. Enhancement availability is indicated by CPUID.7.0.EBX[9] (Enhanced REP MOVSB/ STOSB). If CPU erms feature is detected, patch copy_user_generic with enhanced fast string version of copy_user_generic. A few new macros are defined to reduce duplicate code in ALTERNATIVE and ALTERNATIVE_2. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1337908785-14015-1-git-send-email-fenghua.yu@intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-06-29 15:33:34 -07:00
Linus Torvalds	15b77435ed	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar. * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, cpufeature: Remove stray %s, add -w to mkcapflags.pl x86, cpufeature: Catch duplicate CPU feature strings x86, cpufeature: Rename X86_FEATURE_DTS to X86_FEATURE_DTHERM x86: Fix kernel-doc warnings x86, compat: Use test_thread_flag(TIF_IA32) in compat signal delivery	2012-06-29 10:29:54 -07:00
Alex Shi	52aec3308d	x86/tlb: replace INVALIDATE_TLB_VECTOR by CALL_FUNCTION_VECTOR There are 32 INVALIDATE_TLB_VECTOR now in kernel. That is quite a large number of vectors in the IDT. But it is still not enough, since modern x86 servers have more CPUs. That still causes heavy lock contention in TLB flushing. The patch uses the generic smp call function to replace it. That saves 32 vector numbers in the IDT, and resolves the lock contention in TLB flushing on large systems. In the NHM EX machine 4P * 8cores * HT = 64 CPUs, hackbench pthread has 3% performance increase. Signed-off-by: Alex Shi <alex.shi@intel.com> Link: http://lkml.kernel.org/r/1340845344-27557-9-git-send-email-alex.shi@intel.com Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-06-27 19:29:13 -07:00
Alex Shi	c4211f42d3	x86/tlb: add tlb_flushall_shift for specific CPU Testing shows that different CPU types (micro architectures and NUMA mode) have different balance points between a full TLB flush and multiple invlpg. There are also cases where the TLB flush change does not help at all. This patch gives an interface to let x86 vendor developers set a different shift for different CPU types. For example, on some machines in my hands, the balance point is 16 entries on Romley-EP, 8 entries on Bloomfield NHM-EP, and 256 on IVB mobile CPU, but on model 15 Core2 Xeon using invlpg does not help. For untested machines, do a conservative optimization, same as NHM CPU. Signed-off-by: Alex Shi <alex.shi@intel.com> Link: http://lkml.kernel.org/r/1340845344-27557-5-git-send-email-alex.shi@intel.com Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-06-27 19:29:10 -07:00
Alex Shi	e0ba94f14f	x86/tlb_info: get last level TLB entry number of CPU For 4KB pages, x86 CPU has 2 or 1 level TLB, first level is data TLB and instruction TLB, second level is shared TLB for both data and instructions. For huge page TLB, usually there is just one level and separated by 2MB/4MB and 1GB. Although each level's TLB size is important for performance tuning, for general and rough optimizing, last level TLB entry number is suitable. And in fact, last level TLB always has the biggest entry number. This patch will get the biggest TLB entry number and use it in future TLB optimizing. According to Borislav's suggestion, except tlb_ll[i/d]_* array, other function and data will be released after system boot up. To be friendly to all x86 vendors, vendor specific code was moved to its specific files. Signed-off-by: Alex Shi <alex.shi@intel.com> Link: http://lkml.kernel.org/r/1340845344-27557-2-git-send-email-alex.shi@intel.com Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-06-27 19:28:24 -07:00
H. Peter Anvin	1b6b7c9ff3	x86, cpufeature: Remove stray %s, add -w to mkcapflags.pl There was a stray %s left from testing, remove it. Add -w to the #! line (which is parsed by Perl even if the Perl interpreter is invoked explicitly on the command line) to catch these kinds of errors in the future. Reported-by: Jean Delvare <khali@linux-fr.org> Link: http://lkml.kernel.org/r/20120626143246.0c9bf301@endymion.delvare Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-06-26 08:02:48 -07:00
H. Peter Anvin	55f6cb9d0b	x86, cpufeature: Catch duplicate CPU feature strings We had a case of duplicate CPU feature strings, a user space ABI violation, for almost two years. Make it a build error so that doesn't happen again. Link: http://lkml.kernel.org/r/4FE34BCB.5050305@linux.intel.com Cc: Jan Beulich <JBeulich@suse.com> Cc: Jean Delvare <khali@linux-fr.org>	2012-06-25 09:02:13 -07:00
H. Peter Anvin	4ad3341130	x86, cpufeature: Rename X86_FEATURE_DTS to X86_FEATURE_DTHERM It makes sense to label "Digital Thermal Sensor" as "DTS", but unfortunately the string "dts" was already used for "Debug Store", and /proc/cpuinfo is a user space ABI. Therefore, rename this to "dtherm". This conflict went into mainline via the hwmon tree without any x86 maintainer ack, and without any kind of hint in the subject. `a4659053` x86/hwmon: fix initialization of coretemp Reported-by: Jean Delvare <khali@linux-fr.org> Link: http://lkml.kernel.org/r/4FE34BCB.5050305@linux.intel.com Cc: Jan Beulich <JBeulich@suse.com> Cc: <stable@vger.kernel.org> v2.6.36..v3.4 Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-06-25 09:01:15 -07:00
Alex Williamson	7d43c2e42c	iommu: Remove group_mf The iommu=group_mf is really no longer needed with the addition of ACS support in IOMMU drivers creating groups. Most multifunction devices will now be grouped already. If a device has gone to the trouble of exposing ACS, trust that it works. We can use the device specific ACS function for fixing devices we trust individually. This largely reverts `bcb71abe`. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2012-06-25 13:48:30 +02:00
Michael S. Tsirkin	ab9cf4996b	KVM guest: guest side for eoi avoidance The idea is simple: there's a bit, per APIC, in guest memory, that tells the guest that it does not need EOI. Guest tests it using a single test and clear operation - this is necessary so that host can detect interrupt nesting - and if set, it can skip the EOI MSR. I run a simple microbenchmark to show exit reduction (note: for testing, need to apply follow-up patch 'kvm: host side for eoi optimization' + a qemu patch I posted separately, on host): Before: Performance counter stats for 'sleep 1s': 47,357 kvm:kvm_entry [99.98%] 0 kvm:kvm_hypercall [99.98%] 0 kvm:kvm_hv_hypercall [99.98%] 5,001 kvm:kvm_pio [99.98%] 0 kvm:kvm_cpuid [99.98%] 22,124 kvm:kvm_apic [99.98%] 49,849 kvm:kvm_exit [99.98%] 21,115 kvm:kvm_inj_virq [99.98%] 0 kvm:kvm_inj_exception [99.98%] 0 kvm:kvm_page_fault [99.98%] 22,937 kvm:kvm_msr [99.98%] 0 kvm:kvm_cr [99.98%] 0 kvm:kvm_pic_set_irq [99.98%] 0 kvm:kvm_apic_ipi [99.98%] 22,207 kvm:kvm_apic_accept_irq [99.98%] 22,421 kvm:kvm_eoi [99.98%] 0 kvm:kvm_pv_eoi [99.99%] 0 kvm:kvm_nested_vmrun [99.99%] 0 kvm:kvm_nested_intercepts [99.99%] 0 kvm:kvm_nested_vmexit [99.99%] 0 kvm:kvm_nested_vmexit_inject [99.99%] 0 kvm:kvm_nested_intr_vmexit [99.99%] 0 kvm:kvm_invlpga [99.99%] 0 kvm:kvm_skinit [99.99%] 57 kvm:kvm_emulate_insn [99.99%] 0 kvm:vcpu_match_mmio [99.99%] 0 kvm:kvm_userspace_exit [99.99%] 2 kvm:kvm_set_irq [99.99%] 2 kvm:kvm_ioapic_set_irq [99.99%] 23,609 kvm:kvm_msi_set_irq [99.99%] 1 kvm:kvm_ack_irq [99.99%] 131 kvm:kvm_mmio [99.99%] 226 kvm:kvm_fpu [100.00%] 0 kvm:kvm_age_page [100.00%] 0 kvm:kvm_try_async_get_page [100.00%] 0 kvm:kvm_async_pf_doublefault [100.00%] 0 kvm:kvm_async_pf_not_present [100.00%] 0 kvm:kvm_async_pf_ready [100.00%] 0 kvm:kvm_async_pf_completed 1.002100578 seconds time elapsed After: Performance counter stats for 'sleep 1s': 28,354 kvm:kvm_entry [99.98%] 0 kvm:kvm_hypercall [99.98%] 0 kvm:kvm_hv_hypercall [99.98%] 1,347 kvm:kvm_pio [99.98%] 0 kvm:kvm_cpuid [99.98%] 1,931 kvm:kvm_apic [99.98%] 29,595 kvm:kvm_exit [99.98%] 24,884 kvm:kvm_inj_virq [99.98%] 0 kvm:kvm_inj_exception [99.98%] 0 kvm:kvm_page_fault [99.98%] 1,986 kvm:kvm_msr [99.98%] 0 kvm:kvm_cr [99.98%] 0 kvm:kvm_pic_set_irq [99.98%] 0 kvm:kvm_apic_ipi [99.99%] 25,953 kvm:kvm_apic_accept_irq [99.99%] 26,132 kvm:kvm_eoi [99.99%] 26,593 kvm:kvm_pv_eoi [99.99%] 0 kvm:kvm_nested_vmrun [99.99%] 0 kvm:kvm_nested_intercepts [99.99%] 0 kvm:kvm_nested_vmexit [99.99%] 0 kvm:kvm_nested_vmexit_inject [99.99%] 0 kvm:kvm_nested_intr_vmexit [99.99%] 0 kvm:kvm_invlpga [99.99%] 0 kvm:kvm_skinit [99.99%] 284 kvm:kvm_emulate_insn [99.99%] 68 kvm:vcpu_match_mmio [99.99%] 68 kvm:kvm_userspace_exit [99.99%] 2 kvm:kvm_set_irq [99.99%] 2 kvm:kvm_ioapic_set_irq [99.99%] 28,288 kvm:kvm_msi_set_irq [99.99%] 1 kvm:kvm_ack_irq [99.99%] 131 kvm:kvm_mmio [100.00%] 588 kvm:kvm_fpu [100.00%] 0 kvm:kvm_age_page [100.00%] 0 kvm:kvm_try_async_get_page [100.00%] 0 kvm:kvm_async_pf_doublefault [100.00%] 0 kvm:kvm_async_pf_not_present [100.00%] 0 kvm:kvm_async_pf_ready [100.00%] 0 kvm:kvm_async_pf_completed 1.002039622 seconds time elapsed We see that # of exits is almost halved. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-06-25 12:38:06 +03:00
Robert Richter	357398e96d	perf/x86: Fix section mismatch in uncore_pci_init() Fix section mismatch in uncore_pci_init(): WARNING: vmlinux.o(.init.text+0x9246): Section mismatch in reference from the function uncore_pci_init() to the function .devexit.text:uncore_pci_remove() The function __init uncore_pci_init() references a function __devexit uncore_pci_remove(). [...] Signed-off-by: Robert Richter <robert.richter@amd.com> Cc: <a.p.zijlstra@chello.nl> Cc: <zheng.z.yan@intel.com> Link: http://lkml.kernel.org/r/20120620163927.GI5046@erda.amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-25 10:32:21 +02:00
H. Peter Anvin	2b1b712f05	x86, reboot: Drop redundant write of reboot_mode We write reboot_mode to BIOS location 0x472 in native_machine_emergency_restart() (reboot.c:542) already, there is no need to then write it again in machine_real_restart(). This means nothing gets written there for MRR_APM, but the APM call is a poweroff call and doesn't use this memory location. Link: http://lkml.kernel.org/n/tip-3i0pfh44c1e3jv5lab0cf7sc@git.kernel.org Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-06-20 21:18:14 -07:00
Jan Beulich	0fa0e2f02e	x86: Move call to print_modules() out of show_regs() Printing the list of loaded modules is really unrelated to what this function is about, and is particularly unnecessary in the context of the SysRQ key handling (gets printed so far over and over). It should really be the caller of the function to decide whether this piece of information is useful (and to avoid redundantly printing it). Signed-off-by: Jan Beulich <jbeulich@suse.com> Link: http://lkml.kernel.org/r/4FDF21A4020000780008A67F@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-20 14:33:48 +02:00
Jan Beulich	e1b6fc55da	x86/microcode: Mark microcode_id[] as __initconst It's not being used for other than creating module aliases (i.e. no loadable section has any reference to it). Signed-off-by: Jan Beulich <jbeulich@suse.com> Link: http://lkml.kernel.org/r/4FDF1EFD020000780008A65D@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-20 14:33:47 +02:00
Li Zhong	0718467c85	x86/nmi: Clean up register_nmi_handler() usage Implement a cleaner and easier to maintain version for the section warning fixes implemented in commit `eeaaa96a3a` ("x86/nmi: Fix section mismatch warnings on 32-bit"). Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: Jan Beulich <JBeulich@suse.com> Link: http://lkml.kernel.org/r/1340049393-17771-1-git-send-email-dzickus@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-20 14:23:17 +02:00
Ingo Molnar	6a991accee	Merge commit 'v3.5-rc3' into x86/debug Merge it in to pick up a fix that we are going to clean up in this branch. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-20 14:22:34 +02:00
Peter Zijlstra	2992c542fc	perf/x86: Lowercase uncore PMU event names Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-ucnds8gkve4x3s4biuukyph3@git.kernel.org [ Trivial build fix ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-18 15:55:52 +02:00
Yan, Zheng	7c94ee2e09	perf/x86: Add Intel Nehalem and Sandy Bridge-EP uncore support The uncore subsystem in Sandy Bridge-EP consists of 8 components: Ubox, Caching Agent, Home Agent, Memory controller, Power Control, QPI Link Layer, R2PCIe, R3QPI. Signed-off-by: Zheng Yan <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1339741902-8449-9-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-18 12:13:23 +02:00
Yan, Zheng	14371cce03	perf: Add generic PCI uncore PMU device support This patch adds generic support for uncore PMUs presented as PCI devices. (These come in addition to the CPU/MSR based uncores.) Signed-off-by: Zheng Yan <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1339741902-8449-8-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-18 12:13:23 +02:00
Yan, Zheng	fcde10e916	perf/x86: Add Intel Nehalem and Sandy Bridge uncore PMU support Signed-off-by: Zheng Yan <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1339741902-8449-7-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-18 12:13:22 +02:00
Yan, Zheng	087bfbb032	perf/x86: Add generic Intel uncore PMU support This patch adds the generic Intel uncore PMU support, including helper functions that add/delete uncore events, a hrtimer that periodically polls the counters to avoid overflow and code that places all events for a particular socket onto a single cpu. The code design is based on the structure of Sandy Bridge-EP's uncore subsystem, which consists of a variety of components, each component contains one or more "boxes". (Tooling support follows in the next patches.) Signed-off-by: Zheng Yan <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1339741902-8449-6-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-18 12:13:22 +02:00
Yan, Zheng	4b4969b144	perf: Export perf_assign_events() Export perf_assign_events() so the uncore code can use it to schedule events. Signed-off-by: Zheng Yan <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1339741902-8449-2-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-18 12:13:20 +02:00
Robert Richter	76958a61e4	perf/x86/amd: Fix RDPMC index calculation for AMD family 15h The RDPMC index calculation is wrong for AMD family 15h (X86_FEATURE_PERFCTR_CORE set). This leads to a #GP when accessing the counter: Pid: 2237, comm: syslog-ng Not tainted 3.5.0-rc1-perf-x86_64-standard-g130ff90 #135 AMD Pike/Pike RIP: 0010:[<ffffffff8100dc33>] [<ffffffff8100dc33>] x86_perf_event_update+0x27/0x66 While the msr address offset is (index << 1) we must use index to select the correct rdpmc. Signed-off-by: Robert Richter <robert.richter@amd.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Vince Weaver <vweaver1@eecs.utk.edu> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-18 11:14:35 +02:00
Ido Yariv	abf71f3066	x86/vsmp: Fix vector_allocation_domain's return value Commit `8637e38a` ("x86/apic: Avoid useless scanning thru a cpumask in assign_irq_vector()") modified vector_allocation_domain() to return a boolean indicating if cpumask is dynamic or static. Adjust vSMP's callback implementation accordingly. Signed-off-by: Ido Yariv <ido@wizery.com> Acked-by: Shai Fultheim <shai@scalemp.com> Cc: Alexander Gordeev <agordeev@redhat.com> Link: http://lkml.kernel.org/r/1339773055-27397-1-git-send-email-ido@wizery.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-18 11:10:23 +02:00
Ingo Molnar	8461689c67	Merge branch 'x86/apic' into x86/platform Merge in x86/apic to solve a vector_allocation_domain() API change semantic merge conflict. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-18 11:09:49 +02:00
Wanpeng Li	c15acff337	x86: Fix kernel-doc warnings Signed-off-by: Wanpeng Li <liwp@linux.vnet.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: Gavin Shan <shangw@linux.vnet.ibm.com> Cc: Wanpeng Li <liwp.linux@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-18 10:53:18 +02:00
H. Peter Anvin	650513979a	x86-64, reboot: Allow reboot=bios and reboot-cpu override on x86-64 With the revamped realmode trampoline code, it is trivial to extend support for reboot=bios to x86-64. Furthermore, while we are at it, remove the restriction that we can only override the reboot CPU on 32 bits. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/n/tip-jopx7y6g6dbcx4tpal8q0jlr@git.kernel.org	2012-06-17 10:51:01 -07:00
Linus Torvalds	56b880e2e3	Merge branch 'fixes-for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping Pull DMA-mapping fixes from Marek Szyprowski: "A set of minor fixes for dma-mapping code (ARM and x86) required for Contiguous Memory Allocator (CMA) patches merged in v3.5-rc1." * 'fixes-for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping: x86: dma-mapping: fix broken allocation when dma_mask has been provided ARM: dma-mapping: fix debug messages in dmabounce code ARM: mm: fix type of the arm_dma_limit global variable ARM: dma-mapping: Add missing static storage class specifier	2012-06-15 17:35:01 -07:00
Linus Torvalds	c83119a980	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar. * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/smp: Fix topology checks on AMD MCM CPUs x86/mm: Fix some kernel-doc warnings x86, um: Correct syscall table type attributes breaking gcc 4.8	2012-06-15 16:59:19 -07:00
Suresh Siddha	7eb9ae0799	irq/apic: Use config_enabled(CONFIG_SMP) checks to clean up irq_set_affinity() for UP Move the ->irq_set_affinity() routines out of the #ifdef CONFIG_SMP sections and use config_enabled(CONFIG_SMP) checks inside those routines. Thus making those routines simple null stubs for !CONFIG_SMP and retaining those routines with no additional runtime overhead for CONFIG_SMP kernels. Cleans up the ifdef CONFIG_SMP in and around routines related to irq_set_affinity in io_apic and irq_remapping subsystems. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: torvalds@linux-foundation.org Cc: joerg.roedel@amd.com Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Link: http://lkml.kernel.org/r/1339723729.3475.63.camel@sbsiddha-desk.sc.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-15 14:17:29 +02:00
Ingo Molnar	879060d574	Merge branch 'x86/cleanups' into x86/apic Merge in the cleanups because a followup x86/apic change relies on them. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-15 14:17:01 +02:00
Ido Yariv	d48daf37a3	x86/vsmp: Fix linker error when CONFIG_PROC_FS is not set set_vsmp_pv_ops() references no_irq_affinity which is undeclared if CONFIG_PROC_FS isn't set. Fix this by adding an #ifdef around this variable's access. Reported-by: Fengguang Wu <wfg@linux.intel.com> Signed-off-by: Ido Yariv <ido@wizery.com> Acked-by: Shai Fultheim <shai@scalemp.com> Link: http://lkml.kernel.org/r/1339688588-12674-1-git-send-email-ido@wizery.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-15 13:54:11 +02:00
Marek Szyprowski	c080e26edc	x86: dma-mapping: fix broken allocation when dma_mask has been provided Commit `0a2b9a6ea9` ("X86: integrate CMA with DMA-mapping subsystem") broke memory allocation with dma_mask. This patch fixes a possible kernel oops caused by not resetting the page variable when jumping to the 'again' label. Reported-by: Konrad Rzeszutek Wilk <konrad@darnok.org> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Michal Nazarewicz <mina86@mina86.com>	2012-06-14 14:01:30 +02:00
Alexander Gordeev	5a0a2a3081	x86/apic/es7000: Make apicid of a cluster (not CPU) from a cpumask cpu_mask_to_apicid_and() always returns apicid of a single CPU, even in case multiple CPUs were requested. This update fixes a typo and forces apicid of a cluster to be returned. Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/20120614075043.GI3383@dhcp-26-207.brq.redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-14 12:53:16 +02:00
Alexander Gordeev	214e270b5f	x86/apic/es7000+summit: Always make valid apicid from a cpumask In case of invalid parameters cpu_mask_to_apicid_and() might return an apicid value of 0 (on Summit) or an uninitialized value (on ES7000), although it is supposed to return the apicid of cpu-0 at least. Fix the operation to always return a valid apicid. Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/20120614075026.GH3383@dhcp-26-207.brq.redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-14 12:53:15 +02:00
Alexander Gordeev	49ad3fd483	x86/apic/es7000+summit: Fix compile warning in cpu_mask_to_apicid() Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/20120614075010.GG3383@dhcp-26-207.brq.redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-14 12:53:15 +02:00
Alexander Gordeev	ea3807ea52	x86/apic: Fix ugly casting and branching in cpu_mask_to_apicid_and() Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/20120614074954.GF3383@dhcp-26-207.brq.redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-14 12:53:14 +02:00
Alexander Gordeev	a5a391561b	x86/apic: Eliminate cpu_mask_to_apicid() operation Since there are only two locations where cpu_mask_to_apicid() is called from, remove the operation and use only cpu_mask_to_apicid_and() instead. Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Suggested-and-acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/20120614074935.GE3383@dhcp-26-207.brq.redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-14 12:53:13 +02:00
Alexander Gordeev	cac4afbc3d	x86/x2apic/cluster: Vector_allocation_domain() should return a value Since commit `8637e38` ("x86/apic: Avoid useless scanning thru a cpumask in assign_irq_vector()") the vector_allocation_domain() operation indicates whether a cpumask is dynamic or static. This update fixes the oversight and makes the operation return a value. Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/20120614103933.GJ3383@dhcp-26-207.brq.redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-14 12:53:12 +02:00
Vlad Zolotarov	0816b0f036	x86: Add read_mostly declaration/definition to variables from smp.h Add "read-mostly" qualifier to the following variables in smp.h: - cpu_sibling_map - cpu_core_map - cpu_llc_shared_map - cpu_llc_id - cpu_number - x86_cpu_to_apicid - x86_bios_cpu_apicid - x86_cpu_to_logical_apicid Since all the variables above are written only during initialization, this change is meant to prevent false sharing. More specifically, on the vSMP Foundation platform x86_cpu_to_apicid shared the same internode_cache_line with the frequently written lapic_events. From the analysis of the first 33 per_cpu variables out of 219 (memories they describe, to be more specific), 8 have a read_mostly nature (tlb_vector_offset, cpu_loops_per_jiffy, xen_debug_irq, etc.) and 25 are frequently written (irq_stack_union, gdt_page, exception_stacks, idt_desc, etc.). Assuming that the spread of the rest of the per_cpu variables is similar, identifying the read mostly memories will make more sense in terms of long-term code maintenance compared to identifying frequently written memories. Signed-off-by: Vlad Zolotarov <vlad@scalemp.com> Acked-by: Shai Fultheim <shai@scalemp.com> Cc: Shai Fultheim (Shai@ScaleMP.com) <Shai@scalemp.com> Cc: ido@wizery.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1719258.EYKzE4Zbq5@vlad Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-14 12:42:11 +02:00
OGAWA Hirofumi	2f74759056	x86/alternatives: Use atomic_xchg() instead atomic_dec_and_test() for stop_machine_text_poke() stop_machine_text_poke() uses atomic_dec_and_test() to select one of the CPUs executing that function to actually modify the code. Since the variable is initialized to 1, subsequent CPUs will make the variable go negative. Since going negative is uncommon/unexpected in typical dec_and_test usage change this user to atomic_xchg(). This was found using a patch that warns on dec_and_test going negative. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Acked-by: Steven Rostedt <rostedt@goodmis.org> [ Rewrote changelog ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/87zk8fgsx9.fsf@devron.myhome.or.jp Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-13 15:08:37 +02:00
Borislav Petkov	161270fc1f	x86/smp: Fix topology checks on AMD MCM CPUs The warning below triggers on AMD MCM packages because physical package IDs on the cores of a _physical_ socket are the same. I.e., this field says which CPUs belong to the same physical package. However, the same two CPUs belong to two different internal, i.e. "logical" nodes in the same physical socket which is reflected in the CPU-to-node map on x86 with NUMA. Which makes this check wrong on the above topologies so circumvent it. [ 0.444413] Booting Node 0, Processors #1 #2 #3 #4 #5 Ok. [ 0.461388] ------------[ cut here ]------------ [ 0.465997] WARNING: at arch/x86/kernel/smpboot.c:310 topology_sane.clone.1+0x6e/0x81() [ 0.473960] Hardware name: Dinar [ 0.477170] sched: CPU #6's mc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency. [ 0.486860] Booting Node 1, Processors #6 [ 0.491104] Modules linked in: [ 0.494141] Pid: 0, comm: swapper/6 Not tainted 3.4.0+ #1 [ 0.499510] Call Trace: [ 0.501946] [<ffffffff8144bf92>] ? topology_sane.clone.1+0x6e/0x81 [ 0.508185] [<ffffffff8102f1fc>] warn_slowpath_common+0x85/0x9d [ 0.514163] [<ffffffff8102f2b7>] warn_slowpath_fmt+0x46/0x48 [ 0.519881] [<ffffffff8144bf92>] topology_sane.clone.1+0x6e/0x81 [ 0.525943] [<ffffffff8144c234>] set_cpu_sibling_map+0x251/0x371 [ 0.532004] [<ffffffff8144c4ee>] start_secondary+0x19a/0x218 [ 0.537729] ---[ end trace 4eaa2a86a8e2da22 ]--- [ 0.628197] #7 #8 #9 #10 #11 Ok. [ 0.807108] Booting Node 3, Processors #12 #13 #14 #15 #16 #17 Ok. [ 0.897587] Booting Node 2, Processors #18 #19 #20 #21 #22 #23 Ok. [ 0.917443] Brought up 24 CPUs We ran a topology sanity check test we have here on it and it all looks ok... hopefully :). 
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20120529135442.GE29157@aftab.osrc.amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-13 14:56:12 +02:00
Sebastian Andrzej Siewior	83452c6a43	x86/PCI: move fixup hooks from __init to __devinit The fixups are executed once the pci-device is found which is during boot process so __init seems fine as long as the platform does not support hotplug. However it is possible to remove the PCI bus at run time and have it rediscovered again via "echo 1 > /sys/bus/pci/rescan" and this will call the fixups again. Cc: x86@kernel.org Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2012-06-12 09:10:54 -06:00
Marcelo Tosatti	e32025a564	x86: kvmclock: remove check_and_clear_guest_paused warning CPU offline path calls the hrtimer interrupt handler with interrupts disabled, without touching preempt_count, triggering this warning. Remove the warning since it is supposed to be used from hrtimer interrupt context only. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2012-06-11 23:18:33 -03:00
Feng Tang	f6b54f083c	ACPI: Add a quirk for "AMILO PRO V2030" to ignore the timer overriding This is the 2nd part of the fix for kernel bugzilla 40002: "IRQ 0 assigned to VGA" https://bugzilla.kernel.org/show_bug.cgi?id=40002 The root cause is the buggy FW, whose ACPI tables assign GSI 16 to two IRQs, 0 and 16 (VGA), and the VGA is the right owner of GSI 16. Adding a quirk to ignore the irq0 override of GSI 16 for the FUJITSU SIEMENS AMILO PRO V2030 platform solves this issue. Reported-and-tested-by: Szymon Kowalczyk <fazerxlo@o2.pl> Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2012-06-11 17:29:44 -04:00
Feng Tang	7f68b4c2e1	ACPI: Remove one board specific WARN when ignoring timer overriding The current WARN msg is only for the ati_ixp4x0 board, while this function is used by multiple platforms. So this one board specific warning is not appropriate any more. Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2012-06-11 17:29:38 -04:00
Feng Tang	ae10ccdc30	ACPI: Make acpi_skip_timer_override cover all source_irq==0 cases Currently when acpi_skip_timer_override is set, it only covers the (source_irq == 0 && global_irq == 2) cases. However, there are also platforms that need this option and whose global_irq is not 2. This patch extends acpi_skip_timer_override to cover all timer overriding cases as long as the source irq is 0. This is the first part of a fix to kernel bugzilla 40002: "IRQ 0 assigned to VGA" https://bugzilla.kernel.org/show_bug.cgi?id=40002 Reported-and-tested-by: Szymon Kowalczyk <fazerxlo@o2.pl> Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2012-06-11 17:29:30 -04:00
Ravikiran Thirumalai	110c1e1f1b	x86/vsmp: Ignore IOAPIC IRQ affinity if possible vSMP can route interrupts more optimally based on internal knowledge the OS does not have. In order to support this optimization, all CPUs must be able to handle all possible IOAPIC interrupts. Fix this by setting the vector allocation domain for all CPUs and by enabling this feature in vSMP. Signed-off-by: Ravikiran Thirumalai <kiran.thirumalai@gmail.com> Signed-off-by: Shai Fultheim <shai@scalemp.com> [ Rebased, simplified, and reworded the commit message. ] Signed-off-by: Ido Yariv <ido@wizery.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-11 10:59:13 +02:00
Shuah Khan	e2b297fcf1	perf/x86: Convert obsolete simple_strtoul() usage to kstrtoul() Signed-off-by: Shuah Khan <shuahkhan@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/1339384421.3025.8.camel@lorien2 Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-11 10:52:12 +02:00
Ingo Molnar	c3e228d59b	Linux 3.5-rc2 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQEcBAABAgAGBQJP0qm4AAoJEHm+PkMAQRiG62QIAJRNJFyVB0ZrsMPgdwLnlX4O 5I86H7GaYXoOK/KMb2s5h4KiFggIODnyEkZi+/39tJOgGo0KrMcDlsh0owB1Iggw LE6iyze9I1z9wQze0+SXe7VAcvUYvsx2vgpOKvoNi97Qgn3B6onL+SAi5U+NAqJl 0NdKmveEd42UIm7JfChHlxl8bm8YB+WcU38OkMGpRpJ/Moz9EbSjYVQg3oHrzJjy duiX6SD/OV4m5yCcXXmu+f41pN+SG7xENJ5r4enyi2ZF8mAyVz2goIyL2bA0AJX2 +GbpD1sxUHkZ6yPg4tf2bmJOj0PkfZNAi8YpFxZDlP4y1pKuCTEDTBp8O2id43w= =Jyn8 -----END PGP SIGNATURE----- Merge tag 'v3.5-rc2' into perf/core Merge in Linux 3.5-rc2 - to pick up fixes. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-11 10:51:35 +02:00
Steven Rostedt	70fb74a542	x86: Save cr2 in NMI in case NMIs take a page fault (for i386) Avi Kivity reported that page faults in NMIs could cause havoc if the NMI preempted another page fault handler: The recent changes to NMI allow exceptions to take place in NMI handlers, but I think that a #PF (say, due to access to vmalloc space) is still problematic. Consider the sequence #PF (cr2 set by processor) NMI ... #PF (cr2 clobbered) do_page_fault() IRET ... IRET do_page_fault() address = read_cr2() The last line reads the overwritten cr2 value. This is the i386 version, which has the luxury of doing the work in C code. Link: http://lkml.kernel.org/r/4FBB8C40.6080304@redhat.com Reported-by: Avi Kivity <avi@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-06-08 18:51:12 -04:00
Steven Rostedt	c7d65a78fc	x86: Remove cmpxchg from i386 NMI nesting code I've been informed by someone on LWN called 'slashdot' that some i386 machines do not support a true cmpxchg. The cmpxchg used by the i386 NMI nesting code must be a true cmpxchg as disabling interrupts will not work for NMIs (which is the work around for i386s that do not have a true cmpxchg). This 'slashdot' character also suggested a fix to the issue. As the state of the nesting NMIs goes as follows: NOT_RUNNING -> EXECUTING EXECUTING -> NOT_RUNNING EXECUTING -> LATCHED LATCHED -> EXECUTING Having these states as enum values of: NOT_RUNNING = 0 EXECUTING = 1 LATCHED = 2 Instead of a cmpxchg to make EXECUTING -> NOT_RUNNING a dec_and_test() would work as well. If the dec_and_test brings the state to NOT_RUNNING, that is the same as a cmpxchg succeeding to change EXECUTING to NOT_RUNNING. If a nested NMI were to come in and change it to LATCHED, the dec_and_test() would convert the state to EXECUTING (what we want it to be in such a case anyway). I asked 'slashdot' to post this as a patch, but it never came to be. I decided to do the work instead. Thanks to H. Peter Anvin for suggesting to use this_cpu_dec_and_return() instead of local_dec_and_test(&__get_cpu_var()). Link: http://lwn.net/Articles/484932/ Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-06-08 18:48:05 -04:00
Linus Torvalds	7249450449	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar. * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix the relax_domain_level boot parameter sched: Validate assumptions in sched_init_numa() sched: Always initialize cpu-power sched: Fix domain iteration sched/rt: Fix lockdep annotation within find_lock_lowest_rq() sched/numa: Load balance between remote nodes sched/x86: Calculate booted cores after construction of sibling_mask	2012-06-08 14:59:29 -07:00
Linus Torvalds	0b35d326f8	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar. * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/nmi: Fix section mismatch warnings on 32-bit x86/uv: Fix UV2 BAU legacy mode x86/mm: Only add extra pages count for the first memory range during pre-allocation early page table space x86, efi stub: Add .reloc section back into image x86/ioapic: Fix NULL pointer dereference on CPU hotplug after disabling irqs x86/reboot: Fix a warning message triggered by stop_other_cpus() x86/intel/moorestown: Change intel_scu_devices_create() to __devinit x86/numa: Set numa_nodes_parsed at acpi_numa_memory_affinity_init() x86/gart: Fix kmemleak warning x86: mce: Add the dropped timer interval init back x86/mce: Fix the MCE poll timer logic	2012-06-08 09:26:55 -07:00
Linus Torvalds	106544d81d	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "A bit larger than what I'd wish for - half of it is due to hw driver updates to Intel Ivy-Bridge which info got recently released, cycles:pp should work there now too, amongst other things. (but we are generally making exceptions for hardware enablement of this type.) There are also callchain fixes in it - responding to mostly theoretical (but valid) concerns. The tooling side sports perf.data endianness/portability fixes which did not make it for the merge window - and various other fixes as well." * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits) perf/x86: Check user address explicitly in copy_from_user_nmi() perf/x86: Check if user fp is valid perf: Limit callchains to 127 perf/x86: Allow multiple stacks perf/x86: Update SNB PEBS constraints perf/x86: Enable/Add IvyBridge hardware support perf/x86: Implement cycles:p for SNB/IVB perf/x86: Fix Intel shared extra MSR allocation x86/decoder: Fix bsr/bsf/jmpe decoding with operand-size prefix perf: Remove duplicate invocation on perf_event_for_each perf uprobes: Remove unnecessary check before strlist__delete perf symbols: Check for valid dso before creating map perf evsel: Fix 32 bit values endianity swap for sample_id_all header perf session: Handle endianity swap on sample_id_all header data perf symbols: Handle different endians properly during symbol load perf evlist: Pass third argument to ioctl explicitly perf tools: Update ioctl documentation for PERF_IOC_FLAG_GROUP perf tools: Make --version show kernel version instead of pull req tag perf tools: Check if callchain is corrupted perf callchain: Make callchain cursors TLS ...	2012-06-08 09:14:46 -07:00
Ingo Molnar	707ecec1dc	AMD thresholding fixes for 3.6 Those are a bunch of patches which give the MCE thresholding code a hard look and a scrubbing to remove a couple of annoyances like sysfs warnings when running CPU off-/online tests and the threshold_bank4 node under /sys/devices/system/machinecheck/ is a symlink. It also gives proper names to the thresholding banks instead of simply enumerating them, like this: /sys/devices/system/machinecheck/machinecheck0/ \|-- bank0 \|-- bank1 \|-- bank2 \|-- bank3 \|-- bank4 \|-- bank5 \|-- bank6 \|-- check_interval \|-- cmci_disabled \|-- combined_unit \| \|-- combined_unit \| \|-- error_count \| \|-- threshold_limit \|-- dont_log_ce \|-- execution_unit \| \|-- execution_unit \| \|-- error_count \| \|-- threshold_limit \|-- ignore_ce \|-- insn_fetch \| \|-- insn_fetch \| \|-- error_count \| \|-- threshold_limit \|-- load_store \| \|-- load_store \| \|-- error_count \| \|-- threshold_limit \|-- monarch_timeout \|-- northbridge \| \|-- dram \| \| \|-- error_count \| \| \|-- interrupt_enable \| \| \|-- threshold_limit \| \|-- ht_links \| \| \|-- error_count \| \| \|-- interrupt_enable \| \| \|-- threshold_limit \| \|-- l3_cache \| \|-- error_count \| \|-- interrupt_enable \| \|-- threshold_limit ... It is tested on all our families >= K8. 
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJP0Jw9AAoJEBLB8Bhh3lVKMa8P/1ZPWkFZVFIdilyViQdSR/1/ 6MPy6BcZAACBl4rgrvjtFhmNv8C2dCGoPYRksHiO9sjgsilhQe/L92rmORifrNB4 kvqR1QfKH2Hw2X1B/0fWXthh7UV37h1TdrVNJNlzhmi3wO+MHlX54iZcwpsaceFx QdzSqdHbaKfkfttojxIdgSfl7M2aCRnkmMOUG4X9HCsIK0C3ChdHLhJDnLT0xYb8 fdA8dkXMktli0GC+KfevOXILZGLhUQuigu4iYKRm689N98N1Ejfa7BvMCVqLr0kF 4fNmC+BtZmdw8MYd7EiuYXhA0Unu+CAg23ADQpn0AEyGQcM5h7/9/4GKvgjjsV1h /2r1WU+UVGZSUQ3FRDbzD37QVAa9FoOv967Gks6Fa31K7kEPC8yIRhWl72wXQXpa hFk+Hf3RlKtaO06iH/2RD2JA+W6xntiFo8CZ+AUMoLWfIQaYSAFP039lpjJp/Hzd CDdNWKCchAaMYI1MBmbRZ65mSgsVLLioNrf55+kdWT/CbuXJua95YxRRmllNFv5k MHjPoTajL0WKZhYxUSjCH87rqZHyNBH5s8iZlIt7wR//kqBGYfmRvGSDe31yMrL8 PH/MgEIBVmrLQSWcojF+pU6ep+sQELVNsbu1+doZd/ruD/hzsZeu+MANWtJgrrVs +rsPRDWTcC3ca/V5Y1UO =XN3W -----END PGP SIGNATURE----- Merge tag 'amd-thresholding-fixes-for-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/mce Pull in AMD MCE thresholding fixes for v3.6, from Borislav Petkov: " Those are a bunch of patches which give the MCE thresholding code a hard look and a scrubbing to remove a couple of annoyances like sysfs warnings when running CPU off-/online tests and the threshold_bank4 node under /sys/devices/system/machinecheck/ is a symlink. 
It also gives proper names to the thresholding banks instead of simply enumerating them, like this: /sys/devices/system/machinecheck/machinecheck0/ \|-- bank0 \|-- bank1 \|-- bank2 \|-- bank3 \|-- bank4 \|-- bank5 \|-- bank6 \|-- check_interval \|-- cmci_disabled \|-- combined_unit \| \|-- combined_unit \| \|-- error_count \| \|-- threshold_limit \|-- dont_log_ce \|-- execution_unit \| \|-- execution_unit \| \|-- error_count \| \|-- threshold_limit \|-- ignore_ce \|-- insn_fetch \| \|-- insn_fetch \| \|-- error_count \| \|-- threshold_limit \|-- load_store \| \|-- load_store \| \|-- error_count \| \|-- threshold_limit \|-- monarch_timeout \|-- northbridge \| \|-- dram \| \| \|-- error_count \| \| \|-- interrupt_enable \| \| \|-- threshold_limit \| \|-- ht_links \| \| \|-- error_count \| \| \|-- interrupt_enable \| \| \|-- threshold_limit \| \|-- l3_cache \| \|-- error_count \| \|-- interrupt_enable \| \|-- threshold_limit ... It is tested on all our families >= K8." Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-08 12:29:47 +02:00
Ananth N Mavinakayanahalli	7eb9ba5ed3	uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() On RISC architectures like powerpc, instructions are fixed size. Instruction analysis on such platforms is just a matter of (insn % 4). Pass the vaddr at which the uprobe is to be inserted so that arch_uprobe_analyze_insn() can flag misaligned registration requests. Signed-off-by: Ananth N Mavinakaynahalli <ananth@in.ibm.com> Cc: michael@ellerman.id.au Cc: antonb@thinktux.localdomain Cc: Paul Mackerras <paulus@samba.org> Cc: benh@kernel.crashing.org Cc: peterz@infradead.org Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: oleg@redhat.com Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20120608093257.GG13409@in.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-08 12:22:27 +02:00
Don Zickus	eeaaa96a3a	x86/nmi: Fix section mismatch warnings on 32-bit It was reported that compiling for 32-bit caused a bunch of section mismatch warnings: VDSOSYM arch/x86/vdso/vdso32-syms.lds LD arch/x86/vdso/built-in.o LD arch/x86/built-in.o WARNING: arch/x86/built-in.o(.data+0x5af0): Section mismatch in reference from the variable test_nmi_ipi_callback_na.10451 to the function .init.text:test_nmi_ipi_callback() [...] WARNING: arch/x86/built-in.o(.data+0x5b04): Section mismatch in reference from the variable nmi_unk_cb_na.10399 to the function .init.text:nmi_unk_cb() The variable nmi_unk_cb_na.10399 references the function __init nmi_unk_cb() [...] Both of these are attributed to the internal representation of the nmiaction struct created during register_nmi_handler. The reason for this is that those structs are not defined in the init section whereas the rest of the code in nmi_selftest.c is. To resolve this, I created a new #define, register_nmi_handler_initonly, that tags the struct as __initdata to resolve the mismatch. This #define should only be used in rare situations where the register/unregister is called during init of the kernel. Big thanks to Jan Beulich for decoding this for me as I didn't have a clue what was going on. Reported-by: Witold Baryluk <baryluk@smp.if.uj.edu.pl> Tested-by: Witold Baryluk <baryluk@smp.if.uj.edu.pl> Cc: Jan Beulich <JBeulich@suse.com> Signed-off-by: Don Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/r/1338991542-23000-1-git-send-email-dzickus@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-08 12:19:27 +02:00
Alexander Gordeev	4988a40c39	x86/apic: Make cpu_mask_to_apicid() operations check cpu_online_mask Currently cpu_mask_to_apicid() should not get an offline CPU in the cpumask. Otherwise some apic drivers might try to access non-existent per-cpu variables (e.g. x2apic). In that regard the cpu_mask_to_apicid() and cpu_mask_to_apicid_and() operations are inconsistent. This fix makes the two operations independent of their callers: they always return the apicid for online CPUs only. As a result, the meaning and implementations of the cpu_mask_to_apicid() and cpu_mask_to_apicid_and() operations become straightforward. Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/20120607131624.GG4759@dhcp-26-207.brq.redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-08 11:44:30 +02:00
Alexander Gordeev	ff16432412	x86/apic: Make cpu_mask_to_apicid() operations return error code The current cpu_mask_to_apicid() and cpu_mask_to_apicid_and() implementations have a few shortcomings: 1. A value returned by cpu_mask_to_apicid() is written to hardware registers unconditionally. Should BAD_APICID ever be returned, it will be written to hardware too. But the value of BAD_APICID is not universal across all hardware in all modes and might cause unexpected results, e.g. interrupts might get routed to CPUs that are not configured to receive them. 2. Because the value of BAD_APICID is not universal, it is counter-intuitive to return it for hardware where it does not make sense (e.g. x2apic). 3. The cpu_mask_to_apicid_and() operation is thought of as a complement to cpu_mask_to_apicid() that only applies an AND mask on top of the cpumask being passed. Yet, as a consequence of commit `18374d8`, the two operations are inconsistent in that: cpu_mask_to_apicid() should not get an offline CPU in the cpumask; cpu_mask_to_apicid_and() should not fail and return BAD_APICID. These limitations are impossible to realize just from looking at the operations' prototypes. Most of these shortcomings are resolved by returning an error code instead of BAD_APICID. As a result, faults are reported back early instead of leaving room for unexpected behaviour (as in [1]). The only exception is the setup_timer_IRQ0_pin() routine. Although obviously contrary to this fix, its existing behaviour is preserved so as not to break the fragile check_timer(), and would be better addressed in a separate fix. Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/20120607131559.GF4759@dhcp-26-207.brq.redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-08 11:44:29 +02:00
Alexander Gordeev	8637e38aff	x86/apic: Avoid useless scanning thru a cpumask in assign_irq_vector() In case of static vector allocation domains (e.g. flat), if all vector numbers are exhausted, an attempt to assign a new vector will lead to useless scans through all CPUs in the cpumask, even though it is known that each new pass would fail. Make this corner case less painful by letting vector_allocation_domain() report whether the vector allocation domain depends on passed arguments or not, and stop scanning early. The same could have been achieved by introducing a static flag to the apic operations. But let's allow vector_allocation_domain() to have more intelligence here and decide dynamically, in case we need it in the future. Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/20120607131542.GE4759@dhcp-26-207.brq.redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-08 11:44:29 +02:00
Alexander Gordeev	1bccd58bff	x86/apic: Try to spread IRQ vectors to different priority levels When assigning a new vector, it is primarily done by adding 8 to the previously given out vector number. Hence, two consecutively allocated vector numbers would likely fall into the same priority level. Try to spread vector numbers to different priority levels better by changing the step from 8 to 16. Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/20120607131514.GD4759@dhcp-26-207.brq.redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-08 11:44:28 +02:00
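
Note on 1bccd58bff: the step change matters because the x86 local APIC takes a vector's priority class from bits 7:4 of the vector number. The following is a minimal user-space sketch of that arithmetic only, not the kernel's allocator; `next_vector()`, `shares_class_with_next()` and `self_check()` are hypothetical helpers introduced purely for illustration.

```c
#include <assert.h>

/* On x86 local APICs, bits 7:4 of a vector select its priority class. */
unsigned int vector_priority_class(unsigned int vector)
{
	return vector >> 4;
}

/*
 * Hypothetical helper (not a kernel function): the vector that follows
 * `vector` when the allocator advances by `step`.
 */
unsigned int next_vector(unsigned int vector, unsigned int step)
{
	return vector + step;
}

/* Returns 1 when two back-to-back allocations land in the same class. */
int shares_class_with_next(unsigned int vector, unsigned int step)
{
	return vector_priority_class(vector) ==
	       vector_priority_class(next_vector(vector, step));
}

/* Small self-check of the arithmetic described in the commit message. */
void self_check(void)
{
	/* Step 8: 0x41 -> 0x49, both in class 4. */
	assert(shares_class_with_next(0x41, 8));
	/* Step 16: 0x41 -> 0x51, classes 4 and 5. */
	assert(!shares_class_with_next(0x41, 16));
}
```

With a step of 8, every second allocation stays within the class of its predecessor; with a step of 16, consecutive allocations always differ in bits 7:4. Wrap-around and skipping of reserved vectors, which a real allocator has to handle, are deliberately left out of this sketch.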

8573 commits