Commit graph

9874 commits

Author SHA1 Message Date
Ingo Molnar
e0b5f80dd4 Merge branch 'iommu/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into x86/urgent 2010-01-27 11:04:40 +01:00
Ingo Molnar
b7a0afb0b4 Merge branch 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile into x86/urgent 2010-01-27 10:52:36 +01:00
Russ Anderson
da482474b8 x86, msr/cpuid: Pass the number of minors when unregistering MSR and CPUID drivers.
Pass the number of minors when unregistering MSR and CPUID drivers.

Reported-by: Dean Nelson <dnelson@redhat.com>
Signed-off-by: Dean Nelson <dnelson@redhat.com>
LKML-Reference: <20100127023722.GA22305@sgi.com>
Signed-off-by: Russ Anderson <rja@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-26 23:52:38 -08:00
Andi Kleen
e83e452b06 oprofile/x86: add Xeon 7500 series support
Add Xeon 7500 series support to oprofile.

Straight forward: it's the same as Core i7, so just detect
the model number. No user space changes needed.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-01-25 15:34:53 +01:00
Suravee Suthikulpanit
d8cc108f4f oprofile/x86: fix crash when profiling more than 28 events
With multiplexing enabled oprofile crashs when profiling more than 28
events. This patch fixes this.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-01-25 15:34:53 +01:00
Wei Yongjun
443c39bc9e KVM: x86: Fix leak of free lapic date in kvm_arch_vcpu_init()
In function kvm_arch_vcpu_init(), if the memory malloc for
vcpu->arch.mce_banks is fail, it does not free the memory
of lapic date. This patch fixed it.

Cc: stable@kernel.org
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-01-25 12:26:40 -02:00
Wei Yongjun
36cb93fd6b KVM: x86: Fix probable memory leak of vcpu->arch.mce_banks
vcpu->arch.mce_banks is malloc in kvm_arch_vcpu_init(), but
never free in any place, this may cause memory leak. So this
patch fixed to free it in kvm_arch_vcpu_uninit().

Cc: stable@kernel.org
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-01-25 12:26:40 -02:00
Marcelo Tosatti
a6085fbaf6 KVM: MMU: bail out pagewalk on kvm_read_guest error
Exit the guest pagetable walk loop if reading gpte failed. Otherwise its
possible to enter an endless loop processing the previous present pte.

Cc: stable@kernel.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-01-25 12:26:38 -02:00
Sheng Yang
82b7005f0e KVM: x86: Fix host_mapping_level()
When found a error hva, should not return PAGE_SIZE but the level...

Also clean up the coding style of the following loop.

Cc: stable@kernel.org
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-01-25 12:26:37 -02:00
Avi Kivity
a5d36f82c4 KVM: Fix race between APIC TMR and IRR
When we queue an interrupt to the local apic, we set the IRR before the TMR.
The vcpu can pick up the IRR and inject the interrupt before setting the TMR,
and perhaps even EOI it, causing incorrect behaviour.

The race is really insignificant since it can only occur on the first
interrupt (usually following interrupts will not change TMR), but it's better
closed than open.

Fixed by reordering setting the TMR vs IRR.

Cc: stable@kernel.org
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-01-25 12:26:36 -02:00
H. Peter Anvin
b160091802 x86: Remove "x86 CPU features in debugfs" (CONFIG_X86_CPU_DEBUG)
CONFIG_X86_CPU_DEBUG, which provides some parsed versions of the x86
CPU configuration via debugfs, has caused boot failures on real
hardware.  The value of this feature has been marginal at best, as all
this information is already available to userspace via generic
interfaces.

Causes crashes that have not been fixed + minimal utility -> remove.

See the referenced LKML thread for more information.

Reported-by: Ozan Çağlayan <ozan@pardus.org.tr>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <alpine.LFD.2.00.1001221755320.13231@localhost.localdomain>
Cc: Jaswinder Singh Rajput <jaswinder@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: <stable@kernel.org>
2010-01-23 18:27:47 -08:00
David S. Miller
51c24aaaca Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-01-23 00:31:06 -08:00
Andreas Herrmann
3b2e3d85ae Revert "x86: ucode-amd: Load ucode-patches once ..."
Commit d1c84f79a6
leads to a regression when microcode_amd.c is compiled into the kernel.
It causes a big boot delay because the firmware is not available.
See http://marc.info/?l=linux-kernel&m=126267290920060

It also renders the reload sysfs attribute useless.
Fixing this is too intrusive for an -rc5 kernel.

Thus I'd like to restore the microcode loading behaviour of kernel
2.6.32.

CC: Gene Heskett <gene.heskett@verizon.net>
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
LKML-Reference: <20100122203456.GB13792@alberich.amd.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-23 06:21:59 +01:00
Pallipadi, Venkatesh
73472a46b5 x86: Disable HPET MSI on ATI SB700/SB800
HPET MSI on platforms with ATI SB700/SB800 as they seem to have some
side-effects on floppy DMA. Do not use HPET MSI on such platforms.

Original problem report from Mark Hounschell
http://lkml.indiana.edu/hypermail/linux/kernel/0912.2/01118.html

[ This patch needs to go to stable as well. But, there are some
  conflicts that prevents the patch from going as is. I can
  rebase/resubmit to stable once the patch goes upstream.
  hpa: still Cc:'ing stable@ as an FYI. ]

Tested-by: Mark Hounschell <markh@compro.net>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: <stable@kernel.org>
LKML-Reference: <20100121190952.GA32523@linux-os.sc.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-23 06:21:58 +01:00
David Rientjes
3a5fc0e40c x86: Set hotpluggable nodes in nodes_possible_map
nodes_possible_map does not currently include nodes that have SRAT
entries that are all ACPI_SRAT_MEM_HOT_PLUGGABLE since the bit is
cleared in nodes_parsed if it does not have an online address range.

Unequivocally setting the bit in nodes_parsed is insufficient since
existing code, such as acpi_get_nodes(), assumes all nodes in the map
have online address ranges.  In fact, all code using nodes_parsed
assumes such nodes represent an address range of online memory.

nodes_possible_map is created by unioning nodes_parsed and
cpu_nodes_parsed; the former represents nodes with online memory and
the latter represents memoryless nodes.  We now set the bit for
hotpluggable nodes in cpu_nodes_parsed so that it also gets set in
nodes_possible_map.

[ hpa: Haicheng Li points out that this makes the naming of the
  variable cpu_nodes_parsed somewhat counterintuitive.  However, leave
  it as is in the interest of keeping the pure bug fix patch small. ]

Signed-off-by: David Rientjes <rientjes@google.com>
Tested-by: Haicheng Li <haicheng.li@linux.intel.com>
LKML-Reference: <alpine.DEB.2.00.1001201152040.30528@chino.kir.corp.google.com>
Cc: <stable@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-23 06:21:57 +01:00
Borislav Petkov
048a8774ca x86, cacheinfo: Calculate L3 indices
We need to know the valid L3 indices interval when disabling them over
/sysfs. Do that when the core is brought online and add boundary checks
to the sysfs .store attribute.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <1264172467-25155-6-git-send-email-bp@amd64.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-22 16:06:31 -08:00
Borislav Petkov
897de50e08 x86, cacheinfo: Add cache index disable sysfs attrs only to L3 caches
The cache_disable_[01] attribute in

/sys/devices/system/cpu/cpu?/cache/index[0-3]/

is enabled on all cache levels although only L3 supports it. Add it only
to the cache level that actually supports it.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <1264172467-25155-5-git-send-email-bp@amd64.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-22 16:06:31 -08:00
Borislav Petkov
dcf39daf3d x86, cacheinfo: Fix disabling of L3 cache indices
* Correct the masks used for writing the cache index disable indices.
* Do not turn off L3 scrubber - it is not necessary.
* Make sure wbinvd is executed on the same node where the L3 is.
* Check for out-of-bounds values written to the registers.
* Make show_cache_disable hex values unambiguous
* Check for Erratum #388

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <1264172467-25155-4-git-send-email-bp@amd64.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-22 16:06:31 -08:00
Borislav Petkov
a7b480e7f3 x86, lib: Add wbinvd smp helpers
Add wbinvd_on_cpu and wbinvd_on_all_cpus stubs for executing wbinvd on a
particular CPU.

[ hpa: renamed lib/smp.c to lib/cache-smp.c ]
[ hpa: wbinvd_on_all_cpus() returns int, but wbinvd() returns
  void.  Thus, the former cannot be a macro for the latter,
  replace with an inline function. ]

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <1264172467-25155-2-git-send-email-bp@amd64.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-22 16:05:42 -08:00
Joerg Roedel
d3ad9373b7 x86/amd-iommu: Fix deassignment of a device from the pt_domain
Deassigning a device from the passthrough domain does not
work and breaks device assignment to kvm guests. This patch
fixes the issue.

Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2010-01-22 17:56:49 +01:00
Joerg Roedel
f532509437 x86/amd-iommu: Fix IOMMU-API initialization for iommu=pt
This patch moves the initialization of the iommu-api out of
the dma-ops initialization code. This ensures that the
iommu-api is initialized even with iommu=pt.

Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2010-01-22 17:44:35 +01:00
Joerg Roedel
2ca762790c x86/amd-iommu: Fix NULL pointer dereference in __detach_device()
In the __detach_device function the reference count for a
device-domain binding may become zero. This results in the
device being removed from the domain and dev_data->domain
will be NULL. This is bad because this pointer is
dereferenced when trying to unlock the domain->lock. This
patch fixes the issue by keeping the domain in a seperate
variable.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2010-01-22 17:32:31 +01:00
Joerg Roedel
d91afd15b0 x86/amd-iommu: Fix possible integer overflow
The variable i in this function could be increased to over
2**32 which would result in an integer overflow when using
int. Fix it by changing i to unsigned long.

Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2010-01-22 16:48:57 +01:00
Linus Torvalds
e80b135985 Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf: x86: Add support for the ANY bit
  perf: Change the is_software_event() definition
  perf: Honour event state for aux stream data
  perf: Fix perf_event_do_pending() fallback callsite
  perf kmem: Print usage help for unknown commands
  perf kmem: Increase "Hit" column length
  hw-breakpoints, perf: Fix broken mmiotrace due to dr6 by reference change
  perf timechart: Use tid not pid for COMM change
2010-01-21 08:50:04 -08:00
Stephane Eranian
b27d515a49 perf: x86: Add support for the ANY bit
Propagate the ANY bit into the fixed counter config for v3 and higher.

Signed-off-by: Stephane Eranian <eranian@google.com>
[a.p.zijlstra@chello.nl: split from larger patch]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4b5430c6.0f975e0a.1bf9.ffff85fe@mx.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-21 13:40:41 +01:00
Len Brown
be6066f34c Merge branch 'misc' into release 2010-01-20 01:23:27 -05:00
Suresh Siddha
bb668da6d6 x86, apic: use logical flat for systems with <= 8 logical cpus
We can use logical flat mode if there are <= 8 logical cpu's
(irrespective of physical apic id values).  This will enable simplified
and efficient IPI and device interrupt routing on such platforms.

This has been tested to work on both Intel and AMD platforms.
Exceptions like IBM summit platform which can't use logical flat mode
are addressed by using OEM platform checks.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Chris McDermott <lcm@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-18 14:15:28 -08:00
Suresh Siddha
dfea91d5a7 x86, apic: use physical mode for IBM summit platforms
Chris McDermott from IBM confirmed that hurricane chipset in IBM summit
platforms doesn't support logical flat mode.  Irrespective of the other
things like apic_id's, total number of logical cpu's, Linux kernel
should default to physical mode for this system.

The 32-bit kernel does so using the OEM checks for the IBM summit
platform.  Add a similar OEM platform check for the 64bit kernel too.

Otherwise the linux kernel boot can hang on this platform under certain
bios/platform settings.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Tested-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Chris McDermott <lcm@linux.vnet.ibm.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-18 14:15:27 -08:00
H. Peter Anvin
1838ef1d78 x86-64, rwsem: 64-bit xadd rwsem implementation
For x86-64, 32767 threads really is not enough.  Change rwsem_count_t
to a signed long, so that it is 64 bits on x86-64.

This required the following changes to the assembly code:

a) %z0 doesn't work on all versions of gcc!  At least gcc 4.4.2 as
   shipped with Fedora 12 emits "ll" not "q" for 64 bits, even for
   integer operands.  Newer gccs apparently do this correctly, but
   avoid this problem by using the _ASM_ macros instead of %z.
b) 64 bits immediates are only allowed in "movq $imm,%reg"
   constructs... no others.  Change some of the constraints to "e",
   and fix the one case where we would have had to use an invalid
   immediate -- in that case, we only care about the upper half
   anyway, so just access the upper half.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <tip-bafaecd11df15ad5b1e598adc7736afcd38ee13d@git.kernel.org>
2010-01-18 14:00:34 -08:00
Thadeu Lima de Souza Cascardo
00097c4fdf x86, trivial: Fix grammo in tsc comment about Geode TSC reliability
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
Cc: marcelo@kvack.org
Cc: dilinger@collabora.co.uk
Cc: trivial@kernel.org
LKML-Reference: <1263764685-9871-1-git-send-email-cascardo@holoscopio.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-18 09:01:39 +01:00
Luca Barbieri
0bb7a95f54 hw-breakpoints, perf: Fix broken mmiotrace due to dr6 by reference change
Commit 62edab9056 (from June 2009
but merged in 2.6.33) changes notify_die to pass dr6 by
reference.

However, it forgets to fix the check for DR_STEP in kmmio.c,
breaking mmiotrace. It also passes a wrong value to the post
handler.

This simple fix makes mmiotrace work again.

Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
Acked-by: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1263634770-14578-1-git-send-email-luca@luca-barbieri.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-17 08:01:44 +01:00
Linus Torvalds
330a518a1a Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, uv: Ensure hub revision set for all ACPI modes.
  x86, uv: Add function retrieving node controller revision number
  x86: xen: 64-bit kernel RPL should be 0
  x86: kernel_thread() -- initialize SS to a known state
  x86/agp: Fix agp_amd64_init and agp_amd64_cleanup
  x86: SGI UV: Fix mapping of MMIO registers
  x86: mce.h: Fix warning in header checks
2010-01-16 12:31:42 -08:00
Russ Anderson
1d2c867c94 x86, uv: Ensure hub revision set for all ACPI modes.
Ensure that UV hub revision is set for all ACPI modes.

Signed-off-by: Russ Anderson <rja@sgi.com>
LKML-Reference: <20100115180908.GB7757@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-15 11:09:04 -08:00
Jack Steiner
7a1110e861 x86, uv: Add function retrieving node controller revision number
Add function for determining the revision id of the SGI UV
node controller chip (HUB). This function is needed in a
subsequent patch.

Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20100112210904.GA24546@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-15 11:08:55 -08:00
Michael S. Tsirkin
3a4d5c94e9 vhost_net: a kernel-level virtio server
What it is: vhost net is a character device that can be used to reduce
the number of system calls involved in virtio networking.
Existing virtio net code is used in the guest without modification.

There's similarity with vringfd, with some differences and reduced scope
- uses eventfd for signalling
- structures can be moved around in memory at any time (good for
  migration, bug work-arounds in userspace)
- write logging is supported (good for migration)
- support memory table and not just an offset (needed for kvm)

common virtio related code has been put in a separate file vhost.c and
can be made into a separate module if/when more backends appear.  I used
Rusty's lguest.c as the source for developing this part : this supplied
me with witty comments I wouldn't be able to write myself.

What it is not: vhost net is not a bus, and not a generic new system
call. No assumptions are made on how guest performs hypercalls.
Userspace hypervisors are supported as well as kvm.

How it works: Basically, we connect virtio frontend (configured by
userspace) to a backend. The backend could be a network device, or a tap
device.  Backend is also configured by userspace, including vlan/mac
etc.

Status: This works for me, and I haven't see any crashes.
Compared to userspace, people reported improved latency (as I save up to
4 system calls per packet), as well as better bandwidth and CPU
utilization.

Features that I plan to look at in the future:
- mergeable buffers
- zero copy
- scalability tuning: figure out the best threading model to use

Note on RCU usage (this is also documented in vhost.h, near
private_pointer which is the value protected by this variant of RCU):
what is happening is that the rcu_dereference() is being used in a
workqueue item.  The role of rcu_read_lock() is taken on by the start of
execution of the workqueue item, of rcu_read_unlock() by the end of
execution of the workqueue item, and of synchronize_rcu() by
flush_workqueue()/flush_work(). In the future we might need to apply
some gcc attribute or sparse annotation to the function passed to
INIT_WORK(). Paul's ack below is for this RCU usage.

(Includes fixes by Alan Cox <alan@linux.intel.com>,
David L Stevens <dlstevens@us.ibm.com>,
Chris Wright <chrisw@redhat.com>)

Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-01-15 01:43:29 -08:00
Linus Torvalds
bafaecd11d x86-64: support native xadd rwsem implementation
This one is much faster than the spinlock based fallback rwsem code,
with certain artifical benchmarks having shown 300%+ improvement on
threaded page faults etc.

Again, note the 32767-thread limit here. So this really does need that
whole "make rwsem_count_t be 64-bit and fix the BIAS values to match"
extension on top of it, but that is conceptually a totally independent
issue.

NOT TESTED! The original patch that this all was based on were tested by
KAMEZAWA Hiroyuki, but maybe I screwed up something when I created the
cleaned-up series, so caveat emptor..

Also note that it _may_ be a good idea to mark some more registers
clobbered on x86-64 in the inline asms instead of saving/restoring them.
They are inline functions, but they are only used in places where there
are not a lot of live registers _anyway_, so doing for example the
clobbers of %r8-%r11 in the asm wouldn't make the fast-path code any
worse, and would make the slow-path code smaller.

(Not that the slow-path really matters to that degree. Saving a few
unnecessary registers is the _least_ of our problems when we hit the slow
path. The instruction/cycle counting really only matters in the fast
path).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <alpine.LFD.2.00.1001121810410.17145@localhost.localdomain>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-13 22:39:50 -08:00
Linus Torvalds
5d0b7235d8 x86: clean up rwsem type system
The fast version of the rwsems (the code that uses xadd) has
traditionally only worked on x86-32, and as a result it mixes different
kinds of types wildly - they just all happen to be 32-bit.  We have
"long", we have "__s32", and we have "int".

To make it work on x86-64, the types suddenly matter a lot more.  It can
be either a 32-bit or 64-bit signed type, and both work (with the caveat
that a 32-bit counter will only have 15 bits of effective write
counters, so it's limited to 32767 users).  But whatever type you
choose, it needs to be used consistently.

This makes a new 'rwsem_counter_t', that is a 32-bit signed type.  For a
64-bit type, you'd need to also update the BIAS values.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <alpine.LFD.2.00.1001121755220.17145@localhost.localdomain>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-13 22:38:51 -08:00
Brian Gerst
3bef444797 x86: Merge show_regs()
Using kernel_stack_pointer() allows 32-bit and 64-bit versions to
be merged.  This is more correct for 64-bit, since the old %rsp is
always saved on the stack.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1263397555-27695-1-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-13 09:23:15 -08:00
Thomas Renninger
557a701c16 [CPUFREQ] Fix use after free of struct powernow_k8_data
Easy fix for a regression introduced in 2.6.31.

On managed CPUs the cpufreq.c core will call driver->exit(cpu) on the
managed cpus and powernow_k8 will free the core's data.

Later driver->get(cpu) function might get called trying to read out the
current freq of a managed cpu and the NULL pointer check does not work on
the freed object -> better set it to NULL.

->get() is unsigned and must return 0 as invalid frequency.

Reference:
http://bugzilla.kernel.org/show_bug.cgi?id=14391

Signed-off-by: Thomas Renninger <trenn@suse.de>
Tested-by: Michal Schmidt <mschmidt@redhat.com>
CC: stable@kernel.org
Signed-off-by: Dave Jones <davej@redhat.com>
2010-01-13 10:55:15 -05:00
Ian Campbell
e68266b700 x86: xen: 64-bit kernel RPL should be 0
Under Xen 64 bit guests actually run their kernel in ring 3,
however the hypervisor takes care of squashing descriptor the
RPLs transparently (in order to allow them to continue to
differentiate between user and kernel space CS using the RPL).
Therefore the Xen paravirt backend should use RPL==0 instead of
1 (or 3). Using RPL==1 causes generic arch code to take
incorrect code paths because it uses "testl $3, <CS>, je foo"
type tests for a userspace CS and this considers 1==userspace.

This issue was previously masked because get_kernel_rpl() was
omitted when setting CS in kernel_thread(). This was fixed when
kernel_thread() was unified with 32 bit in
f443ff4201.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1263377768-19600-2-git-send-email-ian.campbell@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-13 11:23:54 +01:00
Cyrill Gorcunov
864a0922dd x86: kernel_thread() -- initialize SS to a known state
Before the kernel_thread was converted into "C" we had
pt_regs::ss set to __KERNEL_DS (by SAVE_ALL asm macro).

Though I must admit I didn't find any *explicit* load of
%ss from this structure the better to be on a safe side
and set it to a known value.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com>
Cc: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1263377768-19600-1-git-send-email-ian.campbell@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-13 11:23:45 +01:00
FUJITA Tomonori
42590a7501 x86/agp: Fix agp_amd64_init and agp_amd64_cleanup
This fixes the regression introduced by the commit
f405d2c023.

The above commit fixes the following issue:

  http://marc.info/?l=linux-kernel&m=126192729110083&w=2

However, it doesn't work properly when you remove and insert the
agp_amd64 module again.

agp_amd64_init() and agp_amd64_cleanup should be called only
when gart_iommu is not called earlier (that is, the GART IOMMU
is not enabled). We need to use 'gart_iommu_aperture' to see if
GART IOMMU is enabled or not.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: mitov@issp.bas.bg
Cc: davej@redhat.com
LKML-Reference: <20100104161603L.fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-13 11:15:37 +01:00
Mike Travis
fcfbb2b5fa x86: SGI UV: Fix mapping of MMIO registers
This fixes the problem of the initialization code not correctly
mapping the entire MMIO space on a UV system.  A side effect is
the map_high() interface needed to be changed to accommodate
different address and size shifts.

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Mike Habeck <habeck@sgi.com>
Cc: <stable@kernel.org>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <4B479202.7080705@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-13 10:56:27 +01:00
Alan Cox
df39a2e48f x86: mce.h: Fix warning in header checks
Someone isn't reading their build output: Move the definition
out of the exported header.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Cc: linux-kernel@vger.kernelorg
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-13 10:41:22 +01:00
Masami Hiramatsu
aa5add93e9 x86/ptrace: Remove unused regs_get_argument_nth API
Because of dropping function argument syntax from kprobe-tracer,
we don't need this API anymore.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: linuxppc-dev@ozlabs.org
LKML-Reference: <20100105224656.19431.92588.stgit@dhcp-100-2-132.bos.redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-13 10:09:12 +01:00
Frederic Weisbecker
0fb8ee48d9 perf: Drop useless check for ignored frame
The check that ignores the debug and nmi stack frames is useless
now that we have a frame pointer that makes us start at the
right place. We don't anymore have to deal with these.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1262235183-5320-2-git-send-regression-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-13 10:09:08 +01:00
Ingo Molnar
61405fea92 Merge branch 'perf/urgent' into perf/core
Merge reason: queue up dependent patch, update to -rc4

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-13 10:08:50 +01:00
Frederic Weisbecker
c2c5d45d46 perf: Stop stack frame walking off kernel addresses boundaries
While processing kernel perf callchains, an bad entry can be
considered as a valid stack pointer but not as a kernel address.

In this case, we hang in an endless loop. This can happen in an
x86-32 kernel after processing the last entry in a kernel
stacktrace.

Just stop the stack frame walking after we encounter an invalid
kernel address.

This fixes a hard lockup in x86-32.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1262227945-27014-1-git-send-regression-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-13 09:32:54 +01:00
Dave Jones
2ca49b2fcf x86: Macroise x86 cache descriptors
Use a macro to define the cache sizes when cachesize > 1 MB.

This is less typing, and less prone to introducing bugs like we
saw in e02e0e1a13, and means we
don't have to do maths when adding new non-power-of-2 updates
like those seen recently.

Signed-off-by: Dave Jones <davej@redhat.com>
LKML-Reference: <20100104144735.GA18390@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-13 08:57:38 +01:00
Linus Torvalds
59c33fa779 x86-32: clean up rwsem inline asm statements
This makes gcc use the right register names and instruction operand sizes
automatically for the rwsem inline asm statements.

So instead of using "(%%eax)" to specify the memory address that is the
semaphore, we use "(%1)" or similar. And instead of forcing the operation
to always be 32-bit, we use "%z0", taking the size from the actual
semaphore data structure itself.

This doesn't actually matter on x86-32, but if we want to use the same
inline asm for x86-64, we'll need to have the compiler generate the proper
64-bit names for the registers (%rax instead of %eax), and if we want to
use a 64-bit counter too (in order to avoid the 15-bit limit on the
write counter that limits concurrent users to 32767 threads), we'll need
to be able to generate instructions with "q" accesses rather than "l".

Since this header currently isn't enabled on x86-64, none of that matters,
but we do want to use the xadd version of the semaphores rather than have
to take spinlocks to do a rwsem. The mm->mmap_sem can be heavily contended
when you have lots of threads all taking page faults, and the fallback
rwsem code that uses a spinlock performs abysmally badly in that case.

[ hpa: modified the patch to skip size suffixes entirely when they are
  redundant due to register operands. ]

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <alpine.LFD.2.00.1001121613560.17145@localhost.localdomain>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-12 20:43:04 -08:00
Ananth N Mavinakayanahalli
066000dd85 Revert "x86, apic: Use logical flat on intel with <= 8 logical cpus"
Revert commit 2fbd07a5f5, as this commit
breaks an IBM platform with quad-core Xeon cpu's.

According to Suresh, this might be an IBM platform issue, as on other
Intel platforms with <= 8 logical cpu's, logical flat mode works fine
irespective of physical apic id values (inline with the xapic
architecture).

Revert this for now because of the IBM platform breakage.

Another version will be re-submitted after the complete analysis.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-11 16:47:57 -08:00
Avi Kivity
a29815a333 core, x86: make LIST_POISON less deadly
The list macros use LIST_POISON1 and LIST_POISON2 as undereferencable
pointers in order to trap erronous use of freed list_heads.  Unfortunately
userspace can arrange for those pointers to actually be dereferencable,
potentially turning an oops to an expolit.

To avoid this allow architectures (currently x86_64 only) to override
the default values for these pointers with truly-undereferencable values.
This is easy on x86_64 as the virtual address space is large and contains
areas that cannot be mapped.

Other 64-bit architectures will likely find similar unmapped ranges.

[ingo: switch to 0xdead000000000000 as the unmapped area]
[ingo: add comments, cleanup]
[jaswinder: eliminate sparse warnings]

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-11 09:45:37 -08:00
Albin Tonnerre
13510997d6 x86: add support for LZO-compressed kernels
The necessary changes to the x86 Kconfig and boot/compressed to allow the
use of this new compression method

Signed-off-by: Albin Tonnerre <albin.tonnerre@free-electrons.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Tested-by: Wu Zhangjin <wuzhangjin@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Russell King <rmk@arm.linux.org.uk>
Acked-by: Russell King <rmk@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-11 09:34:05 -08:00
Andreas Fenkart
4b529401c5 mm: make totalhigh_pages unsigned long
Makes it consistent with the extern declaration, used when CONFIG_HIGHMEM
is set Removes redundant casts in printout messages

Signed-off-by: Andreas Fenkart <andreas.fenkart@streamunlimited.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-11 09:34:03 -08:00
Linus Torvalds
80e23b7cea Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip:
  x86, irq: Check move_in_progress before freeing the vector mapping
  x86: copy_from_user() should not return -EFAULT
  Revert "x86: Side-step lguest problem by only building cmpxchg8b_emu for pre-Pentium"
  x86/pci: Intel ioh bus num reg accessing fix
  x86: Fix size for ex trampoline with 32bit
2010-01-08 13:55:52 -08:00
Roel Kluin
99659a929d x86, uv: Remove recursion in uv_heartbeat_enable()
The recursion is not needed and does not improve readability.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
LKML-Reference: <4B45F13E.3040202@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-07 11:50:06 -08:00
Jack Steiner
e1e0138d7d x86, uv: uv_global_gru_mmr_address() macro fix
Fix bug in uv_global_gru_mmr_address macro.  Macro failed
to cast an int value to a long prior to a left shift > 32.

Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20100107161240.GA2610@sgi.com>
Cc: <stable@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-07 11:49:57 -08:00
Brian Gerst
5abbbbf0b0 x86: Merge asm/atomic_{32,64}.h
Merge the now identical code from asm/atomic_32.h and asm/atomic_64.h
into asm/atomic.h.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1262883215-4034-4-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-07 11:48:38 -08:00
Brian Gerst
3ce59bb835 x86: Sync asm/atomic_32.h and asm/atomic_64.h
Prepare for merging into asm/atomic.h.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1262883215-4034-3-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-07 11:47:55 -08:00
Brian Gerst
1a3b1d89ed x86: Split atomic64_t functions into seperate headers
Split atomic64_t functions out into separate headers, since they will
not be practical to merge between 32 and 64 bits.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1262883215-4034-2-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-07 11:47:31 -08:00
Len Brown
8558e3943d x86, ACPI: delete acpi_boot_table_init() return value
cleanup only.

setup_arch(), doesn't care care if ACPI initialization succeeded
or failed, so delete acpi_boot_table_init()'s return value.

Signed-off-by: Len Brown <len.brown@intel.com>
2010-01-06 16:18:15 -05:00
Suresh Siddha
7f41c2e152 x86, irq: Check move_in_progress before freeing the vector mapping
With the recent irq migration fixes (post 2.6.32), Gary Hade has noticed
"No IRQ handler for vector" messages during the 2.6.33-rc1 kernel boot on IBM
AMD platforms and root caused the issue to this commit:

> commit 23359a88e7
> Author: Suresh Siddha <suresh.b.siddha@intel.com>
> Date:   Mon Oct 26 14:24:33 2009 -0800
>
>    x86: Remove move_cleanup_count from irq_cfg

As part of this patch, we have removed the move_cleanup_count check
in smp_irq_move_cleanup_interrupt(). With this change, we can run into a
situation where an irq cleanup interrupt on a cpu can cleanup the vector
mappings associated with multiple irqs, of which one of the irq's migration
might be still in progress. As such when that irq hits the old cpu, we get
the "No IRQ handler" messages.

Fix this by checking for the irq_cfg's move_in_progress and if the move
is still in progress delay the vector cleanup to another irq cleanup
interrupt request (which will happen when the irq starts arriving at the
new cpu destination).

Reported-and-tested-by: Gary Hade <garyhade@us.ibm.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <1262804191.2732.7.camel@sbs-t61.sc.intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-06 12:08:43 -08:00
Rusty Russell
db677ffa5f Revert "x86: Side-step lguest problem by only building cmpxchg8b_emu for pre-Pentium"
This reverts commit ae1b22f6e4.

As Linus said in 982d007a6e: "There was something really messy about
cmpxchg8b and clone CPU's, so if you enable it on other CPUs later, do it
carefully."

This breaks lguest for those configs, but we can fix that by emulating
if we have to.

Fixes: http://bugzilla.kernel.org/show_bug.cgi?id=14884
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-05 16:01:35 -08:00
Heiko Carstens
409d02ef6d x86: copy_from_user() should not return -EFAULT
Callers of copy_from_user() expect it to return the number of bytes
it could not copy. In no case it is supposed to return -EFAULT.

In case of a detected buffer overflow just return the requested
length. In addition one could think of a memset that would clear
the size of the target object.

[ hpa: code is not in .32 so not needed for -stable ]

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
LKML-Reference: <20100105131911.GC5480@osiris.boeblingen.de.ibm.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-05 13:45:06 -08:00
Rusty Russell
f4b825bde9 Revert "x86: Side-step lguest problem by only building cmpxchg8b_emu for pre-Pentium"
This reverts commit ae1b22f6e4.

As Linus said in 982d007a6e: "There was something really messy about
cmpxchg8b and clone CPU's, so if you enable it on other CPUs later, do it
carefully."

This breaks lguest for those configs, but we can fix that by emulating
if we have to.

Fixes: http://bugzilla.kernel.org/show_bug.cgi?id=14884
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <201001051248.49700.rusty@rustcorp.com.au>
Cc: stable@kernel.org
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-04 19:35:49 -08:00
Yinghai Lu
a557aae29c x86/pci: Intel ioh bus num reg accessing fix
It is above 0x100 (PCI-Express extended register space), so if mmconf
is not enable, we can't access it.

[ hpa: changed the bound from 0x200 to 0x120, which is the tight
  bound. ]

Reported-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <1261525263-13763-3-git-send-email-yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-04 13:21:51 -08:00
Yinghai Lu
9dad0fd5a7 x86: Fix size for ex trampoline with 32bit
fix for error that is introduced by
| x86: Use find_e820() instead of hard coded trampoline address

it should end with PAGE_SIZE + PAGE_SIZE

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <1261525263-13763-2-git-send-email-yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-04 13:20:11 -08:00
Linus Torvalds
952363c90c Merge branch 'perf-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf: Fix NULL deref in inheritance code
  perf: Pass appropriate frame pointer to dump_trace()
2009-12-31 11:56:24 -08:00
Linus Torvalds
2d959e9565 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86/agp: Fix agp_amd64_init() initialization with CONFIG_GART_IOMMU enabled
  x86: SGI UV: Fix writes to led registers on remote uv hubs
  x86, kmemcheck: Use KERN_WARNING for error reporting
  x86: Use KERN_DEFAULT log-level in __show_regs()
  x86, compress: Force i386 instructions for the decompressor
  x86/amd-iommu: Fix initialization failure panic
  dma-debug: Do not add notifier when dma debugging is disabled.
  x86: Fix objdump version check in chkobjdump.awk for different formats.

Trivial conflicts in arch/x86/include/asm/uv/uv_hub.h due to me having
applied an earlier version of an SGI UV fix.
2009-12-31 11:54:13 -08:00
Frederic Weisbecker
48b5ba9cc9 perf: Pass appropriate frame pointer to dump_trace()
Pass the frame pointer from the regs of the interrupted path
to dump_trace() while processing the stack trace.

Currently, dump_trace() takes the current bp and starts the
callchain from dump_trace() itself. This is wasteful because
we need to walk through the entire NMI/DEBUG stack before
retrieving the interrupted point.

We can fix that by just using the frame pointer from the
captured regs. It points exactly where we want to start.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1262235183-5320-1-git-send-regression-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
2009-12-31 13:11:31 +01:00
Linus Torvalds
08d869aa86 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
  ACPI: introduce kernel parameter acpi_sleep=sci_force_enable
  ACPI: WMI: Survive BIOS with duplicate GUIDs
  dell-wmi - fix condition to abort driver loading
  wmi: check find_guid() return value to prevent oops
  dell-wmi, hp-wmi, msi-wmi: check wmi_get_event_data() return value
  ACPI: hp-wmi, msi-wmi: clarify that wmi_install_notify_handler() returns an acpi_status
  dell-wmi: sys_init_module: 'dell_wmi'->init suspiciously returned 21, it should
  ACPI video: correct error-handling code
  ACPI video: no warning message if "acpi_backlight=vendor" is used
  ACPI: fix ACPI=n allmodconfig build
  thinkpad-acpi: improve Kconfig help text
  thinkpad-acpi: update volume subdriver documentation
  thinkpad-acpi: make volume subdriver optional
  thinkpad-acpi: don't fail to load the entire module due to ALSA problems
  thinkpad-acpi: don't take the first ALSA slot by default
2009-12-30 16:00:24 -08:00
Zhang Rui
d7f0eea9e4 ACPI: introduce kernel parameter acpi_sleep=sci_force_enable
Introduce kernel parameter acpi_sleep=sci_force_enable

some laptop requires SCI_EN being set directly on resume,
or else they hung somewhere in the resume code path.

We already have a blacklist for these laptops but we still need
this option, especially when debugging some suspend/resume problems,
in case there are systems that need this workaround and are not yet
in the blacklist.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-30 18:32:01 -05:00
Linus Torvalds
d661d76b02 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI/cardbus: Add a fixup hook and fix powerpc
  PCI: change PCI nomenclature in drivers/pci/ (non-comment changes)
  PCI: change PCI nomenclature in drivers/pci/ (comment changes)
  PCI: fix section mismatch on update_res()
  PCI: add Intel 82599 Virtual Function specific reset method
  PCI: add Intel USB specific reset method
  PCI: support device-specific reset methods
  PCI: Handle case when no pci device can provide cache line size hint
  PCI/PM: Propagate wake-up enable for PCIe devices too
  vgaarbiter: fix a typo in the vgaarbiter Documentation
2009-12-30 13:13:24 -08:00
Linus Torvalds
b07d41b77e Merge branch 'kvm-updates/2.6.33' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.33' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: get rid of kvm_create_vm() unused label warning on s390
  KVM: powerpc: Fix mtsrin in book3s_64 mmu
  KVM: ia64: fix build breakage due to host spinlock change
  KVM: x86: Extend KVM_SET_VCPU_EVENTS with selective updates
  KVM: LAPIC: make sure IRR bitmap is scanned after vm load
  KVM: Fix possible circular locking in kvm_vm_ioctl_assign_device()
  KVM: MMU: remove prefault from invlpg handler
2009-12-30 12:56:17 -08:00
Mike Travis
9a7262a056 x86_64 SGI UV: Fix writes to led registers on remote uv hubs.
The wrong address was being used to write the SCIR led regs on remote
hubs.  Also, there was an inconsistency between how BIOS and the kernel
indexed these regs.  Standardize on using the lower 6 bits of the APIC
ID as the index.

This patch fixes the problem of writing to an errant address to a
cpu # >= 64.

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Jack Steiner <steiner@sgi.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-30 12:25:26 -08:00
Jan Beulich
7269e8812a x86-64: Modify memcpy()/memset() alternatives mechanism
In order to avoid unnecessary chains of branches, rather than
implementing memcpy()/memset()'s access to their alternative
implementations via a jump, patch the (larger) original function
directly.

The memcpy() part of this is slightly subtle: while alternative
instruction patching does itself use memcpy(), with the
replacement block being less than 64-bytes in size the main loop
of the original function doesn't get used for copying memcpy_c()
over memcpy(), and hence we can safely write over its beginning.

Also note that the CFI annotations are fine for both variants of
each of the functions.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <4B2BB8D30200007800026AF2@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-30 11:57:32 +01:00
Jan Beulich
1b1d925818 x86-64: Modify copy_user_generic() alternatives mechanism
In order to avoid unnecessary chains of branches, rather than
implementing copy_user_generic() as a function consisting of
just a single (possibly patched) branch, instead properly deal
with patching call instructions in the alternative instructions
framework, and move the patching into the callers.

As a follow-on, one could also introduce something like
__EXPORT_SYMBOL_ALT() to avoid patching call sites in modules.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <4B2BB8180200007800026AE7@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-30 11:57:31 +01:00
Jan Beulich
499a5f1efa x86: Lift restriction on the location of FIX_BTMAP_*
The early ioremap fixmap entries cover half (or for 32-bit
non-PAE, a quarter) of a page table, yet they got
uncondtitionally aligned so far to a 256-entry boundary. This is
not necessary if the range of page table entries anyway falls
into a single page table.

This buys back, for (theoretically) 50% of all configurations
(25% of all non-PAE ones), at least some of the lowmem
necessarily lost with commit e621bd1895.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <4B2BB66F0200007800026AD6@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-30 11:57:30 +01:00
Mike Travis
39d3077099 x86: SGI UV: Fix writes to led registers on remote uv hubs
The wrong address was being used to write the SCIR led regs on
remote hubs.  Also, there was an inconsistency between how BIOS
and the kernel indexed these regs.  Standardize on using the
lower 6 bits of the APIC ID as the index.

This patch fixes the problem of writing to an errant address to
a cpu # >= 64.

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Jack Steiner <steiner@sgi.com>
Cc: Robin Holt <holt@sgi.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@kernel.org
LKML-Reference: <4B3922F9.3060905@sgi.com>
[ v2: fix a number of annoying checkpatch artifacts and whitespace noise ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-29 06:47:39 +01:00
Pekka Enberg
c0ca9da442 x86, kmemcheck: Use KERN_WARNING for error reporting
As suggested by Vegard Nossum, use KERN_WARNING for error
reporting to make sure kmemcheck reports end up in syslog.

Suggested-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <1261990935.4641.7.camel@penberg-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-28 10:28:35 +01:00
Pekka Enberg
d015a09298 x86: Use KERN_DEFAULT log-level in __show_regs()
Andrew Morton reported a strange looking kmemcheck warning:

  WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff88004fba6c20)
  0000000000000000310000000000000000000000000000002413000000c9ffff
   u u u u u u u u u u u u u u u u i i i i i i i i u u u u u u u u

   [<ffffffff810af3aa>] kmemleak_scan+0x25a/0x540
   [<ffffffff810afbcb>] kmemleak_scan_thread+0x5b/0xe0
   [<ffffffff8104d0fe>] kthread+0x9e/0xb0
   [<ffffffff81003074>] kernel_thread_helper+0x4/0x10
   [<ffffffffffffffff>] 0xffffffffffffffff

The above printout is missing register dump completely. The
problem here is that the output comes from syslog which doesn't
show KERN_INFO log-level messages. We didn't see this before
because both of us were testing on 32-bit kernels which use the
_default_ log-level.

Fix that up by explicitly using KERN_DEFAULT log-level for
__show_regs() printks.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <1261988819.4641.2.camel@penberg-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-28 09:40:21 +01:00
Naga Chumbalkar
fd2a50a024 x86, perfctr: Remove unused func avail_to_resrv_perfctr_nmi()
avail_to_resrv_perfctr_nmi() is neither EXPORT'd, nor used in
the file. So remove it.

Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: oprofile-list@lists.sf.net
LKML-Reference: <20091224015441.6005.4408.sendpatchset@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-28 09:36:46 +01:00
Ingo Molnar
605c1a187f Merge branch 'iommu/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into x86/urgent 2009-12-28 09:23:13 +01:00
Jan Kiszka
dab4b911a5 KVM: x86: Extend KVM_SET_VCPU_EVENTS with selective updates
User space may not want to overwrite asynchronously changing VCPU event
states on write-back. So allow to skip nmi.pending and sipi_vector by
setting corresponding bits in the flags field of kvm_vcpu_events.

[avi: advertise the bits in KVM_GET_VCPU_EVENTS]

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-27 13:36:33 -02:00
Marcelo Tosatti
6e24a6eff4 KVM: LAPIC: make sure IRR bitmap is scanned after vm load
The vcpus are initialized with irr_pending set to false, but
loading the LAPIC registers with pending IRR fails to reset
the irr_pending variable.

Cc: stable@kernel.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-27 13:36:31 -02:00
Marcelo Tosatti
fb341f572d KVM: MMU: remove prefault from invlpg handler
The invlpg prefault optimization breaks Windows 2008 R2 occasionally.

The visible effect is that the invlpg handler instantiates a pte which
is, microseconds later, written with a different gfn by another vcpu.

The OS could have other mechanisms to prevent a present translation from
being used, which the hypervisor is unaware of.

While the documentation states that the cpu is at liberty to prefetch tlb
entries, it looks like this is not heeded, so remove tlb prefetch from
invlpg.

Cc: stable@kernel.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-27 13:36:30 -02:00
H. Peter Anvin
17a2a9b57a x86, compress: Force i386 instructions for the decompressor
Recently, some distros have started shipping versions of gcc which
default to -march=i686.  This breaks building kernels for pre-i686
machines, even if they have been selected in Kconfig, due to the
generation of CMOV instructions.

There isn't enough benefit to try to preserve the generation of these
instructions even when selected, so simply force -march=i386 for the
decompressor when building a 32-bit kernel.

Reported-and-tested-by: Chris Rankin <rankincj@yahoo.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <219280.97558.qm@web52907.mail.re2.yahoo.com>
2009-12-25 15:40:38 -08:00
Len Brown
fcb11235d3 Merge branch 'misc-2.6.33' into release 2009-12-24 01:19:00 -05:00
Len Brown
da3df858c8 Merge branch 'pdc' into release 2009-12-24 01:17:21 -05:00
Linus Torvalds
2f99f5c8f0 Revert "x86, ucode-amd: Ensure ucode update on suspend/resume after CPU off/online cycle"
This reverts commit 9f15226e75.  It's just
wrong, and broke resume for Rafael even on a non-AMD CPU.

As Rafael says:
 "... it causes microcode_init_cpu() to be called during resume even for
  CPUs for which there's no microcode to apply.  That, in turn, results
  in executing request_firmware() (on Intel CPUs at least) which doesn't
  work at this stage of resume (we have device interrupts disabled, I/O
  devices are still suspended and so on).

  If I'm not mistaken, the "if (uci->valid)" logic means "if that CPU is
  known to us" , so before commit 9f15226e75 microcode_resume_cpu() was
  called for all CPUs already in the system during suspend, which was
  the right thing to do.  The commit changed it so that the CPUs without
  microcode to apply are now treated as "unknown", which is not quite
  right.

  The problem this commit attempted to solve has to be handled
  differently."

Bisected-and -requested-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-23 15:04:53 -08:00
Andrew Morton
4a28395d72 arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c: avoid cross-CPU interrupts by using smp_call_function_any()
Presently acpi-cpufreq will perform the MSR read on the first CPU in the
mask.  That's inefficient if that CPU differs from the current CPU.
Because we have to perform a cross-CPU call, but we could have run the
rdmsr on the current CPU.

So switch to using the new smp_call_function_any(), which will perform the
call on the current CPU if that CPU is present in the mask (it is).

Cc: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jaswinder Singh Rajput <jaswinder@kernel.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-22 15:03:57 -05:00
Alex Chiang
47817254b8 ACPI: processor: unify arch_acpi_processor_cleanup_pdc
The x86 and ia64 implementations of the function in $subject are
exactly the same.

Also, since the arch-specific implementations of setting _PDC have
been completely hollowed out, remove the empty shells.

Cc: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-22 03:24:14 -05:00
Alex Chiang
6c5807d7bc ACPI: processor: finish unifying arch_acpi_processor_init_pdc()
The only thing arch-specific about calling _PDC is what bits get
set in the input obj_list buffer.

There's no need for several levels of indirection to twiddle those
bits. Additionally, since we're just messing around with a buffer,
we can simplify the interface; no need to pass around the entire
struct acpi_processor * just to get at the buffer.

Cc: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-22 03:24:13 -05:00
Alex Chiang
08ea48a326 ACPI: processor: factor out common _PDC settings
Both x86 and ia64 initialize _PDC with mostly common bit settings.

Factor out the common settings and leave the arch-specific ones alone.

Cc: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-22 03:24:12 -05:00
Alex Chiang
407cd87c54 ACPI: processor: unify arch_acpi_processor_init_pdc
The x86 and ia64 implementations of arch_acpi_processor_init_pdc()
are almost exactly the same. The only difference is in what bits
they set in obj_list buffer.

Combine the boilerplate memory management code, and leave the
arch-specific bit twiddling in separate implementations.

Cc: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-22 03:24:11 -05:00
Alex Chiang
1d9cb470a7 ACPI: processor: introduce arch_has_acpi_pdc
arch dependent helper function that tells us if we should attempt to
evaluate _PDC on this machine or not.

The x86 implementation assumes that the CPUs in the machine must be
homogeneous, and that you cannot mix CPUs of different vendors.

Cc: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-22 03:24:10 -05:00
Joerg Roedel
0f76480643 x86/amd-iommu: Fix initialization failure panic
The assumption that acpi_table_parse passes the return value
of the hanlder function to the caller proved wrong
recently. The return value of the handler function is
totally ignored. This makes the initialization code for AMD
IOMMU buggy in a way that could cause a kernel panic on
initialization. This patch fixes the issue in the AMD IOMMU
driver.

Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-12-21 15:51:23 +01:00
Linus Torvalds
eca9dfcd00 Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf session: Make events_stats u64 to avoid overflow on 32-bit arches
  hw-breakpoints: Fix hardware breakpoints -> perf events dependency
  perf events: Dont report side-band events on each cpu for per-task-per-cpu events
  perf events, x86/stacktrace: Fix performance/softlockup by providing a special frame pointer-only stack walker
  perf events, x86/stacktrace: Make stack walking optional
  perf events: Remove unused perf_counter.h header file
  perf probe: Check new event name
  kprobe-tracer: Check new event/group name
  perf probe: Check whether debugfs path is correct
  perf probe: Fix libdwarf include path for Debian
2009-12-19 09:48:42 -08:00
Linus Torvalds
3981e15286 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, irq: Allow 0xff for /proc/irq/[n]/smp_affinity on an 8-cpu system
  Makefile: Unexport LC_ALL instead of clearing it
  x86: Fix objdump version check in arch/x86/tools/chkobjdump.awk
  x86: Reenable TSC sync check at boot, even with NONSTOP_TSC
  x86: Don't use POSIX character classes in gen-insn-attr-x86.awk
  Makefile: set LC_CTYPE, LC_COLLATE, LC_NUMERIC to C
  x86: Increase MAX_EARLY_RES; insufficient on 32-bit NUMA
  x86: Fix checking of SRAT when node 0 ram is not from 0
  x86, cpuid: Add "volatile" to asm in native_cpuid()
  x86, msr: msrs_alloc/free for CONFIG_SMP=n
  x86, amd: Get multi-node CPU info from NodeId MSR instead of PCI config space
  x86: Add IA32_TSC_AUX MSR and use it
  x86, msr/cpuid: Register enough minors for the MSR and CPUID drivers
  initramfs: add missing decompressor error check
  bzip2: Add missing checks for malloc returning NULL
  bzip2/lzma/gzip: pre-boot malloc doesn't return NULL on failure
2009-12-19 09:48:14 -08:00
Masami Hiramatsu
8bee738bb1 x86: Fix objdump version check in chkobjdump.awk for different formats.
Different version of objdump says its version in different way;

GNU objdump 2.16.1

or

GNU objdump version 2.19.51.0.14-1.fc11 20090722

This patch uses the first argument which starts with a number
as version string.

Changes in v2:
 - Remove unneeded increment.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
LKML-Reference: <20091218154012.16960.5113.stgit@dhcp-100-2-132.bos.redhat.com>
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-18 09:26:56 -08:00
Frederic Weisbecker
99e8c5a3b8 hw-breakpoints: Fix hardware breakpoints -> perf events dependency
The kbuild's select command doesn't propagate through the config
dependencies.

Hence the current rules of hardware breakpoint's config can't
ensure perf can never be disabled under us.

We have:

config X86
	selects HAVE_HW_BREAKPOINTS

config HAVE_HW_BREAKPOINTS
	select PERF_EVENTS

config PERF_EVENTS
	[...]

x86 will select the breakpoints but that won't propagate to perf
events. The user can still disable the latter, but it is
necessary for the breakpoints.

What we need is:

 - x86 selects HAVE_HW_BREAKPOINTS and PERF_EVENTS
 - HAVE_HW_BREAKPOINTS depends on PERF_EVENTS

so that we ensure PERF_EVENTS is enabled and frozen for x86.

This fixes the following kind of build errors:

 In file included from arch/x86/kernel/hw_breakpoint.c:31:
 include/linux/hw_breakpoint.h: In function 'hw_breakpoint_addr':
 include/linux/hw_breakpoint.h:39: error: 'struct perf_event' has no member named 'attr'

v2: Select also ANON_INODES from x86, required for perf

Reported-by: Cyrill Gorcunov <gorcunov@gmail.com>
Reported-by: Michal Marek <mmarek@suse.cz>
Reported-by: Andrew Randrianasulu <randrik_a@yahoo.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
LKML-Reference: <1261010034-7786-1-git-send-regression-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-18 13:11:51 +01:00
Suresh Siddha
18374d89e5 x86, irq: Allow 0xff for /proc/irq/[n]/smp_affinity on an 8-cpu system
John Blackwood reported:
> on an older Dell PowerEdge 6650 system with 8 cpus (4 are hyper-threaded),
> and  32 bit (x86) kernel, once you change the irq smp_affinity of an irq
> to be less than all cpus in the system, you can never change really the
> irq smp_affinity back to be all cpus in the system (0xff) again,
> even though no error status is returned on the "/bin/echo ff >
> /proc/irq/[n]/smp_affinity" operation.
>
> This is due to that fact that BAD_APICID has the same value as
> all cpus (0xff) on 32bit kernels, and thus the value returned from
> set_desc_affinity() via the cpu_mask_to_apicid_and() function is treated
> as a failure in set_ioapic_affinity_irq_desc(), and no affinity changes
> are made.

set_desc_affinity() is already checking if the incoming cpu mask
intersects with the cpu online mask or not. So there is no need
for the apic op cpu_mask_to_apicid_and() to check again
and return BAD_APICID.

Remove the BAD_APICID return value from cpu_mask_to_apicid_and()
and also fix set_desc_affinity() to return -1 instead of using BAD_APICID
to represent error conditions (as cpu_mask_to_apicid_and() can return
logical or physical apicid values and BAD_APICID is really to represent
bad physical apic id).

Reported-by: John Blackwood <john.blackwood@ccur.com>
Root-caused-by: John Blackwood <john.blackwood@ccur.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <1261103386.2535.409.camel@sbs-t61>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-17 22:03:06 -08:00
Linus Torvalds
55db493b65 Merge branch 'cpumask-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* 'cpumask-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  cpumask: rename tsk_cpumask to tsk_cpus_allowed
  cpumask: don't recommend set_cpus_allowed hack in Documentation/cpu-hotplug.txt
  cpumask: avoid dereferencing struct cpumask
  cpumask: convert drivers/idle/i7300_idle.c to cpumask_var_t
  cpumask: use modern cpumask style in drivers/scsi/fcoe/fcoe.c
  cpumask: avoid deprecated function in mm/slab.c
  cpumask: use cpu_online in kernel/perf_event.c
2009-12-17 17:00:20 -08:00
akpm@linux-foundation.org
8c63450718 x86: Fix objdump version check in arch/x86/tools/chkobjdump.awk
It says

Warning: objdump version  is older than 2.19
Warning: Skipping posttest.

because it used the wrong field from `objdump -v':

akpm:/usr/src/25> /opt/crosstool/gcc-4.0.2-glibc-2.3.6/x86_64-unknown-linux-gnu/bin/x86_64-unknown-linux-gnu-objdump -v
GNU objdump 2.16.1
Copyright 2005 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License.  This program has absolutely no warranty.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <200912172326.nBHNQaQl024796@imap1.linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
2009-12-17 15:34:29 -08:00
Pallipadi, Venkatesh
6c56ccecf0 x86: Reenable TSC sync check at boot, even with NONSTOP_TSC
Commit 83ce4009 did the following change
If the TSC is constant and non-stop, also set it reliable.

But, there seems to be few systems that will end up with TSC warp across
sockets, depending on how the cpus come out of reset. Skipping TSC sync
test on such systems may result in time inconsistency later.

So, reenable TSC sync test even on constant and non-stop TSC systems.
Set, sched_clock_stable to 1 by default and reset it in
mark_tsc_unstable, if TSC sync fails.

This change still gives perf benefit mentioned in 83ce4009 for systems
where TSC is reliable.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <20091217202702.GA18015@linux-os.sc.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-17 14:44:35 -08:00
Russ Anderson
b76365a18f x86, uv: Add serial number parameter to uv_bios_get_sn_info()
Add system_serial_number to the information returned by
uv_bios_get_sn_info() UV BIOS call.

Signed-off-by: Russ Anderson <rja@sgi.com>
LKML-Reference: <20091217165323.GA30774@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-17 11:33:51 -08:00
Linus Torvalds
5a865c0606 Merge branch 'for-33' of git://repo.or.cz/linux-kbuild
* 'for-33' of git://repo.or.cz/linux-kbuild: (29 commits)
  net: fix for utsrelease.h moving to generated
  gen_init_cpio: fixed fwrite warning
  kbuild: fix make clean after mismerge
  kbuild: generate modules.builtin
  genksyms: properly consider  EXPORT_UNUSED_SYMBOL{,_GPL}()
  score: add asm/asm-offsets.h wrapper
  unifdef: update to upstream revision 1.190
  kbuild: specify absolute paths for cscope
  kbuild: create include/generated in silentoldconfig
  scripts/package: deb-pkg: use fakeroot if available
  scripts/package: add KBUILD_PKG_ROOTCMD variable
  scripts/package: tar-pkg: use tar --owner=root
  Kbuild: clean up marker
  net: add net_tstamp.h to headers_install
  kbuild: move utsrelease.h to include/generated
  kbuild: move autoconf.h to include/generated
  drop explicit include of autoconf.h
  kbuild: move compile.h to include/generated
  kbuild: drop include/asm
  kbuild: do not check for include/asm-$ARCH
  ...

Fixed non-conflicting clean merge of modpost.c as per comments from
Stephen Rothwell (modpost.c had grown an include of linux/autoconf.h
that needed to be changed to generated/autoconf.h)
2009-12-17 07:23:42 -08:00
Linus Torvalds
04a1e62c2c x86/ptrace: make genregs[32]_get/set more robust
The loop condition is fragile: we compare an unsigned value to zero, and
then decrement it by something larger than one in the loop.  All the
callers should be passing in appropriately aligned buffer lengths, but
it's better to just not rely on it, and have some appropriate defensive
loop limits.

Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-17 07:04:56 -08:00
Roland Dreier
4beb3d6d14 x86: Don't use POSIX character classes in gen-insn-attr-x86.awk
Not all awk implementations (including the default awk in Ubuntu 9.10)
support POSIX character classes.  Since x86-opcode-map.txt is plain
ASCII, we can just use explicit ranges for lower case, alphabetic, and
alphanumeric characters instead.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
LKML-Reference: <adabphy750b.fsf@roland-alpha.cisco.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-17 07:03:21 -08:00
Frederic Weisbecker
06d65bda75 perf events, x86/stacktrace: Fix performance/softlockup by providing a special frame pointer-only stack walker
It's just wasteful for stacktrace users like perf to walk
through every entries on the stack whereas these only accept
reliable ones, ie: that the frame pointer validates.

Since perf requires pure reliable stacktraces, it needs a stack
walker based on frame pointers-only to optimize the stacktrace
processing.

This might solve some near-lockup scenarios that can be triggered
by call-graph tracing timer events.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1261024834-5336-2-git-send-regression-fweisbec@gmail.com>
[ v2: fix for modular builds and small detail tidyup ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-17 10:42:52 +01:00
Frederic Weisbecker
61c1917f47 perf events, x86/stacktrace: Make stack walking optional
The current print_context_stack helper that does the stack
walking job is good for usual stacktraces as it walks through
all the stack and reports even addresses that look unreliable,
which is nice when we don't have frame pointers for example.

But we have users like perf that only require reliable
stacktraces, and those may want a more adapted stack walker, so
lets make this function a callback in stacktrace_ops that users
can tune for their needs.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1261024834-5336-1-git-send-regression-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-17 09:56:19 +01:00
Rusty Russell
a4636818f8 cpumask: rename tsk_cpumask to tsk_cpus_allowed
Noone uses this wrapper yet, and Ingo asked that it be kept consistent
with current task_struct usage.

(One user crept in via linux-next: fixed)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Tejun Heo <tj@kernel.org>
2009-12-17 11:43:30 +10:30
Yinghai Lu
6a1e008a09 x86: Increase MAX_EARLY_RES; insufficient on 32-bit NUMA
Due to recent changes wakeup and mptable, we run out of early
reservations on 32-bit NUMA.  Thus, adjust the available number.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4B22D754.2020706@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-16 16:46:23 -08:00
Yinghai Lu
3299625036 x86: Fix checking of SRAT when node 0 ram is not from 0
Found one system that boot from socket1 instead of socket0, SRAT get rejected...

[    0.000000] SRAT: Node 1 PXM 0 0-a0000
[    0.000000] SRAT: Node 1 PXM 0 100000-80000000
[    0.000000] SRAT: Node 1 PXM 0 100000000-2080000000
[    0.000000] SRAT: Node 0 PXM 1 2080000000-4080000000
[    0.000000] SRAT: Node 2 PXM 2 4080000000-6080000000
[    0.000000] SRAT: Node 3 PXM 3 6080000000-8080000000
[    0.000000] SRAT: Node 4 PXM 4 8080000000-a080000000
[    0.000000] SRAT: Node 5 PXM 5 a080000000-c080000000
[    0.000000] SRAT: Node 6 PXM 6 c080000000-e080000000
[    0.000000] SRAT: Node 7 PXM 7 e080000000-10080000000
...
[    0.000000] NUMA: Allocated memnodemap from 500000 - 701040
[    0.000000] NUMA: Using 20 for the hash shift.
[    0.000000] Adding active range (0, 0x2080000, 0x4080000) 0 entries of 3200 used
[    0.000000] Adding active range (1, 0x0, 0x96) 1 entries of 3200 used
[    0.000000] Adding active range (1, 0x100, 0x7f750) 2 entries of 3200 used
[    0.000000] Adding active range (1, 0x100000, 0x2080000) 3 entries of 3200 used
[    0.000000] Adding active range (2, 0x4080000, 0x6080000) 4 entries of 3200 used
[    0.000000] Adding active range (3, 0x6080000, 0x8080000) 5 entries of 3200 used
[    0.000000] Adding active range (4, 0x8080000, 0xa080000) 6 entries of 3200 used
[    0.000000] Adding active range (5, 0xa080000, 0xc080000) 7 entries of 3200 used
[    0.000000] Adding active range (6, 0xc080000, 0xe080000) 8 entries of 3200 used
[    0.000000] Adding active range (7, 0xe080000, 0x10080000) 9 entries of 3200 used
[    0.000000] SRAT: PXMs only cover 917504MB of your 1048566MB e820 RAM. Not used.
[    0.000000] SRAT: SRAT not used.

the early_node_map is not sorted because node0 with non zero start come first.

so try to sort it right away after all regions are registered.

also fixs refression by 8716273c (x86: Export srat physical topology)

-v2: make it more solid to handle cross node case like node0 [0,4g), [8,12g) and node1 [4g, 8g), [12g, 16g)
-v3: update comments.

Reported-and-tested-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4B2579D2.3010201@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-16 16:43:37 -08:00
Suresh Siddha
45a94d7cd4 x86, cpuid: Add "volatile" to asm in native_cpuid()
xsave_cntxt_init() does something like:

	cpuid(0xd, ..);	// find out what features FP/SSE/.. etc are supported

	xsetbv();	// enable the features known to OS

	cpuid(0xd, ..);	// find out the size of the context for features enabled

Depending on what features get enabled in xsetbv(), value of the
cpuid.eax=0xd.ecx=0.ebx changes correspondingly (representing the
size of the context that is enabled).

As we don't have volatile keyword for native_cpuid(), gcc 4.1.2
optimizes away the second cpuid and the kernel continues to use
the cpuid information obtained before xsetbv(), ultimately leading to kernel
crash on processors supporting more state than the legacy FP/SSE.

Add "volatile" for native_cpuid().

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <1261009542.2745.55.camel@sbs-t61.sc.intel.com>
Cc: stable@kernel.org
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-16 16:30:57 -08:00
Borislav Petkov
6ede31e030 x86, msr: msrs_alloc/free for CONFIG_SMP=n
Randy Dunlap reported the following build error:

"When CONFIG_SMP=n, CONFIG_X86_MSR=m:

ERROR: "msrs_free" [drivers/edac/amd64_edac_mod.ko] undefined!
ERROR: "msrs_alloc" [drivers/edac/amd64_edac_mod.ko] undefined!"

This is due to the fact that <arch/x86/lib/msr.c> is conditioned on
CONFIG_SMP and in the UP case we have only the stubs in the header.
Fork off SMP functionality into a new file (msr-smp.c) and build
msrs_{alloc,free} unconditionally.

Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
LKML-Reference: <20091216231625.GD27228@liondog.tnic>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-16 15:36:32 -08:00
Andreas Herrmann
9d260ebc09 x86, amd: Get multi-node CPU info from NodeId MSR instead of PCI config space
Use NodeId MSR to get NodeId and number of nodes per processor.

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
LKML-Reference: <20091216144355.GB28798@alberich.amd.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-16 15:06:23 -08:00
Jiri Slaby
5714868812 PCI: fix section mismatch on update_res()
Remark update_res from __init to __devinit as it is called also
from __devinit functions.

This patch removes the following warning message:

  WARNING: vmlinux.o(.devinit.text+0x774a): Section mismatch
  in reference from the function pci_root_bus_res() to the
  function .init.text:update_res()
  The function __devinit pci_root_bus_res() references
  a function __init update_res().
  If update_res is only used by pci_root_bus_res then
  annotate update_res with a matching annotation.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Aristeu Sergio <arozansk@redhat.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: linux-pci@vger.kernel.org
Cc: x86@kernel.org
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-16 13:37:52 -08:00
Linus Torvalds
288f02bbb6 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (117 commits)
  ACPI processor: Fix section mismatch for processor_add()
  ACPI: Add platform-wide _OSC support.
  ACPI: cleanup pci_root _OSC code.
  ACPI: Add a generic API for _OSC -v2
  msi-wmi: depend on backlight and fix corner-cases problems
  msi-wmi: switch to using input sparse keymap library
  msi-wmi: replace one-condition switch-case with if statement
  msi-wmi: remove unused field 'instance' in key_entry structure
  msi-wmi: remove custom runtime debug implementation
  msi-wmi: rework init
  msi-wmi: remove useless includes
  X86 drivers: Introduce msi-wmi driver
  Toshiba Bluetooth Enabling driver (RFKill handler v3)
  ACPI: fix for lapic_timer_propagate_broadcast()
  acpi_pad: squish warning
  ACPI: dock: minor whitespace and style cleanups
  ACPI: dock: add struct dock_station * directly to platform device data
  ACPI: dock: dock_add - hoist up platform_device_register_simple()
  ACPI: dock: remove global 'dock_device_name'
  ACPI: dock: combine add|alloc_dock_dependent_device (v2)
  ...
2009-12-16 12:33:19 -08:00
Linus Torvalds
bac5e54c29 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (38 commits)
  direct I/O fallback sync simplification
  ocfs: stop using do_sync_mapping_range
  cleanup blockdev_direct_IO locking
  make generic_acl slightly more generic
  sanitize xattr handler prototypes
  libfs: move EXPORT_SYMBOL for d_alloc_name
  vfs: force reval of target when following LAST_BIND symlinks (try #7)
  ima: limit imbalance msg
  Untangling ima mess, part 3: kill dead code in ima
  Untangling ima mess, part 2: deal with counters
  Untangling ima mess, part 1: alloc_file()
  O_TRUNC open shouldn't fail after file truncation
  ima: call ima_inode_free ima_inode_free
  IMA: clean up the IMA counts updating code
  ima: only insert at inode creation time
  ima: valid return code from ima_inode_alloc
  fs: move get_empty_filp() deffinition to internal.h
  Sanitize exec_permission_lite()
  Kill cached_lookup() and real_lookup()
  Kill path_lookup_open()
  ...

Trivial conflicts in fs/direct-io.c
2009-12-16 12:04:02 -08:00
Linus Torvalds
61ecdb84c1 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Fix kprobes build with non-gawk awk
  x86: Split swiotlb initialization into two stages
  x86: Regex support and known-movable symbols for relocs, fix _end
  x86, msr: Remove incorrect, duplicated code in the MSR driver
  x86: Merge kernel_thread()
  x86: Sync 32/64-bit kernel_thread
  x86, 32-bit: Use same regs as 64-bit for kernel_thread_helper
  x86, 64-bit: Use user_mode() to determine new stack pointer in copy_thread()
  x86, 64-bit: Move kernel_thread to C
  x86-64, paravirt: Call set_iopl_mask() on 64 bits
  x86-32: Avoid pipeline serialization in PTREGSCALL1 and 2
  x86: Merge sys_clone
  x86, 32-bit: Convert sys_vm86 & sys_vm86old
  x86: Merge sys_sigaltstack
  x86: Merge sys_execve
  x86: Merge sys_iopl
  x86-32: Add new pt_regs stubs
  cpumask: Use modern cpumask style in arch/x86/kernel/cpu/mcheck/mce-inject.c
2009-12-16 12:02:37 -08:00
Linus Torvalds
74f3ae7434 Merge branch 'module' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* 'module' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  modpost: fix segfault with short symbol names
  module: handle ppc64 relocating kcrctabs when CONFIG_RELOCATABLE=y
  Kbuild: clear marker out of modpost
  module: make MODULE_SYMBOL_PREFIX into a CONFIG option
  ARM: unexport symbols used to implement floating point emulation
  ARM: use unified discard definition in linker script
  x86: don't export inline function
  sparc64: don't export static inline pci_ functions
2009-12-16 10:47:24 -08:00
Al Viro
853b3da10d sanitize do_pipe_flags() callers in arch
* hpux_pipe() - no need to take BKL
* sys32_pipe() in arch/x86/ia32 and xtensa_pipe() in arch/xtensa -
	no need at all, since both functions are open-coded sys_pipe()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:40 -05:00
Akinobu Mita
a66022c457 iommu-helper: use bitmap library
Use bitmap library and kill some unused iommu helper functions.

1. s/iommu_area_free/bitmap_clear/

2. s/iommu_area_reserve/bitmap_set/

3. Use bitmap_find_next_zero_area instead of find_next_zero_area

  This cannot be simple substitution because find_next_zero_area
  doesn't check the last bit of the limit in bitmap

4. Remove iommu_area_free, iommu_area_reserve, and find_next_zero_area

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-16 07:20:18 -08:00
Jack Steiner
56abcf24ff gru: function to generate chipset IPI values
Create a function to generate the value that is written to the UV hub MMR
to cause an IPI interrupt to be sent.  The function will be used in the
GRU message queue error recovery code that sends IPIs to nodes in remote
partitions.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-16 07:20:17 -08:00
Robin Holt
c2c9f11574 x86: uv: update XPC to handle updated BIOS interface
The UV BIOS has moved the location of some of their pointers to the
"partition reserved page" from memory into a uv hub MMR.  The GRU does not
support bcopy operations from MMR space so we need to special case the MMR
addresses using VLOAD operations.

Additionally, the BIOS call for registering a message queue watchlist has
removed the 'blade' value and eliminated the structure that was being
passed in.  This is also reflected in this patch.

Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-16 07:20:14 -08:00
Robin Holt
fae419f2ab x86: uv: introduce uv_gpa_is_mmr
Provide a mechanism for determining if a global physical address is
pointing to a UV hub MMR.

Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-16 07:20:13 -08:00
Robin Holt
729d69e699 x86: uv: introduce a means to translate from gpa -> socket_paddr
The UV BIOS has been updated to implement some of our interface
functionality differently than originally expected.  These patches update
the kernel to the bios implementation and include a few minor bug fixes
which prevent us from doing significant testing on real hardware.

This patch:

For SGI UV systems, translate from a global physical address back to a
socket physical address.  This does nothing to ensure the socket physical
address is actually addressable by the kernel.  That is the responsibility
of the user of the function.

Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-16 07:20:13 -08:00
Jan Beulich
ac2b3e67dd dma-mapping: fix off-by-one error in dma_capable()
dma_mask is, when interpreted as address, the last valid byte, and hence
comparison msut also be done using the last valid of the buffer in
question.

Also fix the open-coded instances in lib/swiotlb.c.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Becky Bruce <beckyb@kernel.crashing.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-16 07:20:12 -08:00
Christoph Hellwig
698ba7b5a3 elf: kill USE_ELF_CORE_DUMP
Currently all architectures but microblaze unconditionally define
USE_ELF_CORE_DUMP.  The microblaze omission seems like an error to me, so
let's kill this ifdef and make sure we are the same everywhere.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: <linux-arch@vger.kernel.org>
Cc: Michal Simek <michal.simek@petalogix.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-16 07:20:12 -08:00
Oleg Nesterov
d519650373 ptrace: x86: change syscall_trace_leave() to rely on tracehook when stepping
Suggested by Roland.

Unlike powepc, x86 always calls tracehook_report_syscall_exit(step) with
step = 0, and sends the trap by hand.

This results in unnecessary SIGTRAP when PTRACE_SINGLESTEP follows the
syscall-exit stop.

Change syscall_trace_leave() to pass the correct "step" argument to
tracehook and remove the send_sigtrap() logic.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Cc: <linux-arch@vger.kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-16 07:20:08 -08:00
Oleg Nesterov
7f38551fc3 ptrace: x86: implement user_single_step_siginfo()
Suggested by Roland.

Implement user_single_step_siginfo() for x86.  Extract this code from
send_sigtrap().

Since x86 calls tracehook_report_syscall_exit(step => 0) the new helper is
not used yet.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Cc: <linux-arch@vger.kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-16 07:20:08 -08:00
Sheng Yang
5df974009f x86: Add IA32_TSC_AUX MSR and use it
Clean up write_tsc() and write_tscp_aux() by replacing
hardcoded values.

No change in functionality.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
LKML-Reference: <1260942485-19156-4-git-send-email-sheng@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 09:02:42 +01:00
Len Brown
0ceafc33af Merge branch 'bugzilla-14700' into release 2009-12-15 22:35:21 -05:00
H. Peter Anvin
0b962d473a x86, msr/cpuid: Register enough minors for the MSR and CPUID drivers
register_chrdev() hardcodes registering 256 minors, presumably to
avoid breaking old drivers.  However, we need to register enough
minors so that we have all possible CPUs.

checkpatch warns on this patch, but the patch is correct: NR_CPUS here
is a static *upper bound* on the *maximum CPU index* (not *number of
CPUs!*) and that is what we want.

Reported-and-tested-by: Russ Anderson <rja@sgi.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <tip-*@git.kernel.org>
2009-12-15 15:13:07 -08:00
Jonathan Nieder
23637568ad x86: Fix kprobes build with non-gawk awk
The instruction attribute table generator fails when run by mawk
or original-awk:

 $ mawk -f arch/x86/tools/gen-insn-attr-x86.awk \
	arch/x86/lib/x86-opcode-map.txt > /dev/null
 Semantic error at 240: Second IMM error
 $ echo $?
 1

Line 240 contains "c8: ENTER Iw,Ib", which indicates that this
instruction has two immediate operands, the second of which is
one byte.  The script loops through the immediate operands using
a for loop.

Unfortunately, there is no guarantee in awk that a for (variable
in array) loop will return the indices in increasing order.
Internally, both original-awk and mawk iterate over a hash table
for this purpose, and both implementations happen to produce the
index 2 before 1.  The supposed second immediate operand is more
than one byte wide, producing the error.

So loop over the indices in increasing order instead.  As a
side-effect, with mawk this means the silly two-entry hash table
never has to be built.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by Masami Hiramatsu <mhiramat@redhat.com>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091213220437.GA27718@progeny.tock>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-15 20:35:49 +01:00
Ingo Molnar
bf08b3b1a1 Merge branch 'x86/mce' into x86/urgent
Merge reason: Leftover mini-topic from the merge window - merge it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-15 20:33:53 +01:00
Ingo Molnar
ab1eebe77d Merge branch 'x86/asm' into x86/urgent
Merge reason: it's stable so lets push it upstream.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-15 20:33:28 +01:00
Linus Torvalds
8f0ddf91f2 Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (26 commits)
  clockevents: Convert to raw_spinlock
  clockevents: Make tick_device_lock static
  debugobjects: Convert to raw_spinlocks
  perf_event: Convert to raw_spinlock
  hrtimers: Convert to raw_spinlocks
  genirq: Convert irq_desc.lock to raw_spinlock
  smp: Convert smplocks to raw_spinlocks
  rtmutes: Convert rtmutex.lock to raw_spinlock
  sched: Convert pi_lock to raw_spinlock
  sched: Convert cpupri lock to raw_spinlock
  sched: Convert rt_runtime_lock to raw_spinlock
  sched: Convert rq->lock to raw_spinlock
  plist: Make plist debugging raw_spinlock aware
  bkl: Fixup core_lock fallout
  locking: Cleanup the name space completely
  locking: Further name space cleanups
  alpha: Fix fallout from locking changes
  locking: Implement new raw_spinlock
  locking: Convert raw_rwlock functions to arch_rwlock
  locking: Convert raw_rwlock to arch_rwlock
  ...
2009-12-15 09:02:01 -08:00
André Goddard Rosa
e7d2860b69 tree-wide: convert open calls to remove spaces to skip_spaces() lib function
Makes use of skip_spaces() defined in lib/string.c for removing leading
spaces from strings all over the tree.

It decreases lib.a code size by 47 bytes and reuses the function tree-wide:
   text    data     bss     dec     hex filename
  64688     584     592   65864   10148 (TOTALS-BEFORE)
  64641     584     592   65817   10119 (TOTALS-AFTER)

Also, while at it, if we see (*str && isspace(*str)), we can be sure to
remove the first condition (*str) as the second one (isspace(*str)) also
evaluates to 0 whenever *str == 0, making it redundant. In other words,
"a char equals zero is never a space".

Julia Lawall tried the semantic patch (http://coccinelle.lip6.fr) below,
and found occurrences of this pattern on 3 more files:
    drivers/leds/led-class.c
    drivers/leds/ledtrig-timer.c
    drivers/video/output.c

@@
expression str;
@@

( // ignore skip_spaces cases
while (*str &&  isspace(*str)) { \(str++;\|++str;\) }
|
- *str &&
isspace(*str)
)

Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Cc: Julia Lawall <julia@diku.dk>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Neil Brown <neilb@suse.de>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: David Howells <dhowells@redhat.com>
Cc: <linux-ext4@vger.kernel.org>
Cc: Samuel Ortiz <samuel@sortiz.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:32 -08:00
Andres Salomon
c95d1e53ed cs5535: drop the Geode-specific MFGPT/GPIO code
With generic modular drivers handling all of this stuff, the
geode-specific code can go away.  The cs5535-gpio, cs5535-mfgpt, and
cs5535-clockevt drivers now handle this.

Signed-off-by: Andres Salomon <dilinger@collabora.co.uk>
Cc: Jordan Crouse <jordan@cosmicpenguin.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Chris Ball <cjb@laptop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:28 -08:00
Andres Salomon
f3a57a60d3 cs5535: define lxfb/gxfb MSRs in linux/cs5535.h
..and include them in the lxfb/gxfb drivers rather than asm/geode.h (where
possible).

Signed-off-by: Andres Salomon <dilinger@collabora.co.uk>
Cc: Jordan Crouse <jordan@cosmicpenguin.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Chris Ball <cjb@laptop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:28 -08:00
Andres Salomon
f060f27007 cs5535: move VSA2 checks into linux/cs5535.h
Signed-off-by: Andres Salomon <dilinger@collabora.co.uk>
Cc: Jordan Crouse <jordan@cosmicpenguin.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Chris Ball <cjb@laptop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:28 -08:00
Andres Salomon
2e8c12436f cs5535: move the DIVIL MSR definition into linux/cs5535.h
The only thing that uses this is the reboot_fixups code.

Signed-off-by: Andres Salomon <dilinger@collabora.co.uk>
Cc: Jordan Crouse <jordan@cosmicpenguin.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Chris Ball <cjb@laptop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:28 -08:00
Andres Salomon
82dca611bb cs5535: add a generic MFGPT driver
This is based on the old code on arch/x86/kernel/mfgpt_32.c, except it's
not x86 specific, it's modular, and it makes use of a PCI BAR rather than
a random MSR.  Currently module unloading is not supported; it's uncertain
whether or not it can be made work with the hardware.

[akpm@linux-foundation.org: add X86 dependency]
Signed-off-by: Andres Salomon <dilinger@collabora.co.uk>
Cc: Jordan Crouse <jordan@cosmicpenguin.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Chris Ball <cjb@laptop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:28 -08:00
Andres Salomon
3c55494670 ALSA: cs5535audio: free OLPC quirks from reliance on MGEODE_LX cpu optimization
Previously, OLPC support for the mic extensions was only enabled in the
ALSA driver if CONFIG_OLPC and CONFIG_MGEODE_LX were both set.  This was
because the old geode GPIO code was written in a manner that assumed
CONFIG_MGEODE_LX.  With the new cs553x-gpio driver, this is no longer the
case; as such, we can drop the requirement on CONFIG_MGEODE_LX and instead
include a requirement on GPIOLIB.

We use the generic GPIO API rather than the cs553x-specific API.

Signed-off-by: Andres Salomon <dilinger@collabora.co.uk>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Jordan Crouse <jordan@cosmicpenguin.net>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:27 -08:00
Andres Salomon
5f0a96b044 cs5535-gpio: add AMD CS5535/CS5536 GPIO driver support
This creates a CS5535/CS5536 GPIO driver which uses a gpio_chip backend
(allowing GPIO users to use the generic GPIO API if desired) while also
allowing architecture-specific users directly (via the cs5535_gpio_*
functions).

Tested on an OLPC machine.  Some Leemotes also use CS5536 (with a mips
cpu), which is why this is in drivers/gpio rather than arch/x86.
Currently, it conflicts with older geode GPIO support; once MFGPT support
is reworked to also be more generic, the older geode code will be removed.

Signed-off-by: Andres Salomon <dilinger@collabora.co.uk>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Jordan Crouse <jordan@cosmicpenguin.net>
Cc: David Brownell <david-b@pacbell.net>
Reviewed-by: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:27 -08:00
Lee Schermerhorn
4e25b2576e hugetlb: add generic definition of NUMA_NO_NODE
Move definition of NUMA_NO_NODE from ia64 and x86_64 arch specific headers
to generic header 'linux/numa.h' for use in generic code.  NUMA_NO_NODE
replaces bare '-1' where it's used in this series to indicate "no node id
specified".  Ultimately, it can be used to replace the -1 elsewhere where
it is used similarly.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Andi Kleen <andi@firstfloor.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:12 -08:00
FUJITA Tomonori
186a25026c x86: Split swiotlb initialization into two stages
The commit f4780ca005 moves
swiotlb initialization before dma32_free_bootmem(). It's
supposed to fix a bug that the commit
75f1cdf1dd introduced, we
initialize SWIOTLB right after dma32_free_bootmem so we wrongly
steal memory area allocated for GART with broken BIOS earlier.

However, the above commit introduced another problem, which
likely breaks machines with huge amount of memory. Such a box
use the majority of DMA32_ZONE so there is no memory for
swiotlb.

With this patch, the x86 IOMMU initialization sequence are:

1. We set swiotlb to 1 in the case of (max_pfn > MAX_DMA32_PFN
   && !no_iommu). If swiotlb usage is forced by the boot option,
   we go to the step 3 and finish (we don't try to detect IOMMUs).

2. We call the detection functions of all the IOMMUs. The
   detection function sets x86_init.iommu.iommu_init to the IOMMU
   initialization function (so we can avoid calling the
   initialization functions of all the IOMMUs needlessly).

3. We initialize swiotlb (and set dma_ops to swiotlb_dma_ops) if
   swiotlb is set to 1.

4. If the IOMMU initialization function doesn't need swiotlb
   (e.g. the initialization is sucessful) then sets swiotlb to zero.

5. If we find that swiotlb is set to zero, we free swiotlb
   resource.

Reported-by: Yinghai Lu <yinghai@kernel.org>
Reported-by: Roland Dreier <rdreier@cisco.com>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
LKML-Reference: <20091215204729A.fujita.tomonori@lab.ntt.co.jp>
Tested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-15 13:01:57 +01:00
Rusty Russell
e642804772 x86: don't export inline function
For CONFIG_PARAVIRT, load_gs_index is an inline function (it's #defined
to native_load_gs_index otherwise).

Exporting an inline function breaks the new assembler-based alphabetical
sorted symbol list:

Today's linux-next build (x86_64 allmodconfig) failed like this:

	.tmp_exports-asm.o: In function `__ksymtab_load_gs_index':
	(__ksymtab_sorted+0x5b40): undefined reference to `load_gs_index'

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
To: x86@kernel.org
Cc: alan-jenkins@tuffmail.co.uk
2009-12-15 16:28:15 +10:30
Zhao Yakui
03a05ed115 ACPI: Use the ARB_DISABLE for the CPU which model id is less than 0x0f.
Currently, ARB_DISABLE is a NOP on all of the recent Intel platforms.
For such platforms, reduce contention on c3_lock by skipping the fake
ARB_DISABLE.

The cpu model id on one laptop is 14. If we disable ARB_DISABLE on this box,
the box can't be booted correctly. But if we still enable ARB_DISABLE on this
box, the box can be booted correctly.

So we still use the ARB_DISABLE for the cpu which mode id is less than 0x0f.

http://bugzilla.kernel.org/show_bug.cgi?id=14700

Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Acked-by: Pallipadi, Venkatesh <venkatesh.pallipadi@intel.com>
cc: stable@kernel.org
Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-14 21:54:30 -05:00
Thomas Gleixner
239007b844 genirq: Convert irq_desc.lock to raw_spinlock
Convert locks which cannot be sleeping locks in preempt-rt to
raw_spinlocks.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 23:55:33 +01:00
Thomas Gleixner
e5931943d0 locking: Convert raw_rwlock functions to arch_rwlock
Name space cleanup for rwlock functions. No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org
2009-12-14 23:55:32 +01:00
Thomas Gleixner
fb3a6bbc91 locking: Convert raw_rwlock to arch_rwlock
Not strictly necessary for -rt as -rt does not have non sleeping
rwlocks, but it's odd to not have a consistent naming convention.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org
2009-12-14 23:55:32 +01:00
Thomas Gleixner
0199c4e68d locking: Convert __raw_spin* functions to arch_spin*
Name space cleanup. No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org
2009-12-14 23:55:32 +01:00
Thomas Gleixner
edc35bd72e locking: Rename __RAW_SPIN_LOCK_UNLOCKED to __ARCH_SPIN_LOCK_UNLOCKED
Further name space cleanup. No functional change

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org
2009-12-14 23:55:32 +01:00
Thomas Gleixner
445c89514b locking: Convert raw_spinlock to arch_spinlock
The raw_spin* namespace was taken by lockdep for the architecture
specific implementations. raw_spin_* would be the ideal name space for
the spinlocks which are not converted to sleeping locks in preempt-rt.

Linus suggested to convert the raw_ to arch_ locks and cleanup the
name space instead of using an artifical name like core_spin,
atomic_spin or whatever

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org
2009-12-14 23:55:32 +01:00
H. Peter Anvin
873b5271f8 x86: Regex support and known-movable symbols for relocs, fix _end
This adds a new category of symbols to the relocs program: symbols
which are known to be relative, even though the linker emits them as
absolute; this is the case for symbols that live in the linker script,
which currently applies to _end.

Unfortunately the previous workaround of putting _end in its own empty
section was defeated by newer binutils, which remove empty sections
completely.

This patch also changes the symbol matching to use regular expressions
instead of hardcoded C for specific patterns.

This is a decidedly non-minimal patch: a modified version of the
relocs program is used as part of the Syslinux build, and this 	is
basically a backport to Linux of some of those changes; they have
thus been well tested.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <4AF86211.3070103@zytor.com>
Acked-by: Michal Marek <mmarek@suse.cz>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
2009-12-14 13:55:20 -08:00
Linus Torvalds
75b08038ce Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, mce: Clean up thermal init by introducing intel_thermal_supported()
  x86, mce: Thermal monitoring depends on APIC being enabled
  x86: Gart: fix breakage due to IOMMU initialization cleanup
  x86: Move swiotlb initialization before dma32_free_bootmem
  x86: Fix build warning in arch/x86/mm/mmio-mod.c
  x86: Remove usedac in feature-removal-schedule.txt
  x86: Fix duplicated UV BAU interrupt vector
  nvram: Fix write beyond end condition; prove to gcc copy is safe
  mm: Adjust do_pages_stat() so gcc can see copy_from_user() is safe
  x86: Limit the number of processor bootup messages
  x86: Remove enabling x2apic message for every CPU
  doc: Add documentation for bootloader_{type,version}
  x86, msr: Add support for non-contiguous cpumasks
  x86: Use find_e820() instead of hard coded trampoline address
  x86, AMD: Fix stale cpuid4_info shared_map data in shared_cpu_map cpumasks

Trivial percpu-naming-introduced conflicts in arch/x86/kernel/cpu/intel_cacheinfo.c
2009-12-14 12:36:46 -08:00
H. Peter Anvin
494c2ebfb2 x86, msr: Remove incorrect, duplicated code in the MSR driver
The MSR driver would compute the values for cpu and c at declaration,
and then again in the body of the function.  This isn't merely
redundant, but unsafe, since cpu might not refer to a valid CPU at
that point.

Remove the unnecessary and dangerous references in the declarations.
This code now matches the equivalent code in the CPUID driver.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-14 10:05:05 -08:00
Linus Torvalds
d0316554d3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits)
  m68k: rename global variable vmalloc_end to m68k_vmalloc_end
  percpu: add missing per_cpu_ptr_to_phys() definition for UP
  percpu: Fix kdump failure if booted with percpu_alloc=page
  percpu: make misc percpu symbols unique
  percpu: make percpu symbols in ia64 unique
  percpu: make percpu symbols in powerpc unique
  percpu: make percpu symbols in x86 unique
  percpu: make percpu symbols in xen unique
  percpu: make percpu symbols in cpufreq unique
  percpu: make percpu symbols in oprofile unique
  percpu: make percpu symbols in tracer unique
  percpu: make percpu symbols under kernel/ and mm/ unique
  percpu: remove some sparse warnings
  percpu: make alloc_percpu() handle array types
  vmalloc: fix use of non-existent percpu variable in put_cpu_var()
  this_cpu: Use this_cpu_xx in trace_functions_graph.c
  this_cpu: Use this_cpu_xx for ftrace
  this_cpu: Use this_cpu_xx in nmi handling
  this_cpu: Use this_cpu operations in RCU
  this_cpu: Use this_cpu ops for VM statistics
  ...

Fix up trivial (famous last words) global per-cpu naming conflicts in
	arch/x86/kvm/svm.c
	mm/slab.c
2009-12-14 09:58:24 -08:00
Hidetoshi Seto
70fe440718 x86, mce: Clean up thermal init by introducing intel_thermal_supported()
It looks better to have a common function. No change in functionality.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
LKML-Reference: <4B25FDDC.407@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
2009-12-14 10:38:41 +01:00
Cyrill Gorcunov
485a2e1973 x86, mce: Thermal monitoring depends on APIC being enabled
Add check if APIC is not disabled since thermal
monitoring depends on it. As only apic gets disabled
we should not try to install "thermal monitor" vector,
print out that thermal monitoring is enabled and etc...

Note that "Intel Correct Machine Check Interrupts" already
has such a check.

Also I decided to not add cpu_has_apic check into
mcheck_intel_therm_init since even if it'll call apic_read on
disabled apic -- it's safe here and allow us to save a few code
bytes.

Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
LKML-Reference: <4B25FDC2.3020401@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 10:38:41 +01:00
Yinghai Lu
f3eee54276 x86: Gart: fix breakage due to IOMMU initialization cleanup
This fixes the following breakage of the commit
75f1cdf1dd:

- GART systems that don't AGP with broken BIOS and more than 4GB
  memory are forced to use swiotlb. They can allocate aperture by
  hand and use GART.

- GART systems without GAP must disable GART on shutdown.

- swiotlb usage is forced by the boot option,
  gart_iommu_hole_init() is not called, so we disable GART
  early_gart_iommu_check().

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
LKML-Reference: <1260759135-6450-3-git-send-email-fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 08:57:40 +01:00
FUJITA Tomonori
f4780ca005 x86: Move swiotlb initialization before dma32_free_bootmem
The commit 75f1cdf1dd introduced a
bug that we initialize SWIOTLB right after dma32_free_bootmem so
we wrongly steal memory area allocated for GART with broken BIOS
earlier.

This moves swiotlb initialization before dma32_free_bootmem().

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: yinghai@kernel.org
LKML-Reference: <1260759135-6450-2-git-send-email-fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 08:57:40 +01:00
Joe Perches
eba11d6da7 x86: Fix build warning in arch/x86/mm/mmio-mod.c
Stephen Rothwell reported these warnings:

 arch/x86/mm/mmio-mod.c: In function 'print_pte':
 arch/x86/mm/mmio-mod.c💯 warning: too many arguments for format
 arch/x86/mm/mmio-mod.c:106: warning: too many arguments for format

The 'fmt' was left out accidentally.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus <torvalds@linux-foundation.org>
LKML-Reference: <1260775443.18538.16.camel@Joe-Laptop.home>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 08:55:43 +01:00
Cliff Wickman
1d865fb728 x86: Fix duplicated UV BAU interrupt vector
Interrupt vector 0xec has been doubly defined in irq_vectors.h

It seems arbitrary whether LOCAL_PENDING_VECTOR or
UV_BAU_MESSAGE is the higher number.  As long as they are
unique. If they are not unique we'll hit a BUG in
alloc_system_vector().

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Cc: <stable@kernel.org>
LKML-Reference: <E1NJ9Pe-0004P7-0Q@eag09.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-13 08:17:40 +01:00
Sam Ravnborg
273b281fa2 kbuild: move utsrelease.h to include/generated
Fix up all users of utsrelease.h

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2009-12-12 13:08:15 +01:00
Sam Ravnborg
9204595405 kbuild: move compile.h to include/generated
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2009-12-12 13:08:14 +01:00
Sam Ravnborg
559df2e021 kbuild: move asm-offsets.h to include/generated
The simplest method was to add an extra asm-offsets.h
file in arch/$ARCH/include/asm that references the generated file.

We can now migrate the architectures one-by-one to reference
the generated file direct - and when done we can delete the
temporary arch/$ARCH/include/asm/asm-offsets.h file.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2009-12-12 13:08:14 +01:00
Linus Torvalds
756300983f Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86/amd-iommu: Fix PCI hotplug with passthrough mode
  x86/amd-iommu: Fix passthrough mode
  x86: mmio-mod.c: Use pr_fmt
  x86: kmmio.c: Add and use pr_fmt(fmt)
  x86: i8254.c: Add pr_fmt(fmt)
  x86: setup_percpu.c: Use pr_<level> and add pr_fmt(fmt)
  x86: es7000_32.c: Use pr_<level> and add pr_fmt(fmt)
  x86: Print DMI_BOARD_NAME as well as DMI_PRODUCT_NAME from __show_regs()
  x86: Factor duplicated code out of __show_regs() into show_regs_common()
  arch/x86/kernel/microcode*: Use pr_fmt() and remove duplicated KERN_ERR prefix
  x86, mce: fix confusion between bank attributes and mce attributes
  x86/mce: Set up timer unconditionally
  x86: Fix bogus warning in apic_noop.apic_write()
  x86: Fix typo in arch/x86/mm/kmmio.c
  x86: ASUS P4S800 reboot=bios quirk
2009-12-11 20:47:59 -08:00
Linus Torvalds
6f696eb17b Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (57 commits)
  x86, perf events: Check if we have APIC enabled
  perf_event: Fix variable initialization in other codepaths
  perf kmem: Fix unused argument build warning
  perf symbols: perf_header__read_build_ids() offset'n'size should be u64
  perf symbols: dsos__read_build_ids() should read both user and kernel buildids
  perf tools: Align long options which have no short forms
  perf kmem: Show usage if no option is specified
  sched: Mark sched_clock() as notrace
  perf sched: Add max delay time snapshot
  perf tools: Correct size given to memset
  perf_event: Fix perf_swevent_hrtimer() variable initialization
  perf sched: Fix for getting task's execution time
  tracing/kprobes: Fix field creation's bad error handling
  perf_event: Cleanup for cpu_clock_perf_event_update()
  perf_event: Allocate children's perf_event_ctxp at the right time
  perf_event: Clean up __perf_event_init_context()
  hw-breakpoints: Modify breakpoints without unregistering them
  perf probe: Update perf-probe document
  perf probe: Support --del option
  trace-kprobe: Support delete probe syntax
  ...
2009-12-11 20:47:30 -08:00
Linus Torvalds
2fe77b81c7 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
  [ACPI/CPUFREQ] Introduce bios_limit per cpu cpufreq sysfs interface
  [CPUFREQ] make internal cpufreq_add_dev_* static
  [CPUFREQ] use an enum for speedstep processor identification
  [CPUFREQ] Document units for transition latency
  [CPUFREQ] Use global sysfs cpufreq structure for conservative governor tunings
  [CPUFREQ] Documentation: ABI: /sys/devices/system/cpu/cpu#/cpufreq/
  [CPUFREQ] powernow-k6: set transition latency value so ondemand governor can be used
  [CPUFREQ] cpumask: don't put a cpumask on the stack in x86...cpufreq/powernow-k8.c
2009-12-11 15:59:23 -08:00
Linus Torvalds
3126c136bc Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (21 commits)
  ext3: PTR_ERR return of wrong pointer in setup_new_group_blocks()
  ext3: Fix data / filesystem corruption when write fails to copy data
  ext4: Support for 64-bit quota format
  ext3: Support for vfsv1 quota format
  quota: Implement quota format with 64-bit space and inode limits
  quota: Move definition of QFMT_OCFS2 to linux/quota.h
  ext2: fix comment in ext2_find_entry about return values
  ext3: Unify log messages in ext3
  ext2: clear uptodate flag on super block I/O error
  ext2: Unify log messages in ext2
  ext3: make "norecovery" an alias for "noload"
  ext3: Don't update the superblock in ext3_statfs()
  ext3: journal all modifications in ext3_xattr_set_handle
  ext2: Explicitly assign values to on-disk enum of filetypes
  quota: Fix WARN_ON in lookup_one_len
  const: struct quota_format_ops
  ubifs: remove manual O_SYNC handling
  afs: remove manual O_SYNC handling
  kill wait_on_page_writeback_range
  vfs: Implement proper O_SYNC semantics
  ...
2009-12-11 15:31:13 -08:00
Linus Torvalds
880188b243 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
  kgdb: Always process the whole breakpoint list on activate or deactivate
  kgdb: continue and warn on signal passing from gdb
  kgdb,x86: do not set kgdb_single_step on x86
  kgdb: allow for cpu switch when single stepping
  kgdb,i386: Fix corner case access to ss with NMI watch dog exception
  kgdb: Replace strstr() by strchr() for single-character needles
  kgdbts: Read buffer overflow
  kgdb: Read buffer overflow
  kgdb,x86: remove redundant test
2009-12-11 15:19:56 -08:00
Mike Travis
2eaad1fddd x86: Limit the number of processor bootup messages
When there are a large number of processors in a system, there
is an excessive amount of messages sent to the system console.
It's estimated that with 4096 processors in a system, and the
console baudrate set to 56K, the startup messages will take
about 84 minutes to clear the serial port.

This set of patches limits the number of repetitious messages
which contain no additional information.  Much of this information
is obtainable from the /proc and /sysfs.   Some of the messages
are also sent to the kernel log buffer as KERN_DEBUG messages so
dmesg can be used to examine more closely any details specific to
a problem.

The new cpu bootup sequence for system_state == SYSTEM_BOOTING:

Booting Node   0, Processors  #1 #2 #3 #4 #5 #6 #7 Ok.
Booting Node   1, Processors  #8 #9 #10 #11 #12 #13 #14 #15 Ok.
...
Booting Node   3, Processors  #56 #57 #58 #59 #60 #61 #62 #63 Ok.
Brought up 64 CPUs

After the system is running, a single line boot message is displayed
when CPU's are hotplugged on:

    Booting Node %d Processor %d APIC 0x%x

Status of the following lines:

    CPU: Physical Processor ID:		printed once (for boot cpu)
    CPU: Processor Core ID:		printed once (for boot cpu)
    CPU: Hyper-Threading is disabled	printed once (for boot cpu)
    CPU: Thermal monitoring enabled	printed once (for boot cpu)
    CPU %d/0x%x -> Node %d:		removed
    CPU %d is now offline:		only if system_state == RUNNING
    Initializing CPU#%d:		KERN_DEBUG

Signed-off-by: Mike Travis <travis@sgi.com>
LKML-Reference: <4B219E28.8080601@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-11 15:16:00 -08:00
Mike Travis
450b1e8dd1 x86: Remove enabling x2apic message for every CPU
Print only once that the system is supporting x2apic mode.

Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
LKML-Reference: <4B226E92.5080904@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-11 14:32:31 -08:00
Linus Torvalds
aad3bf04dc Merge git://git.kernel.org/pub/scm/linux/kernel/git/viro/mmap
* git://git.kernel.org/pub/scm/linux/kernel/git/viro/mmap:
  Add missing alignment check in arch/score sys_mmap()
  fix broken aliasing checks for MAP_FIXED on sparc32, mips, arm and sh
  Get rid of open-coding in ia64_brk()
  sparc_brk() is not needed anymore
  switch do_brk() to get_unmapped_area()
  Take arch_mmap_check() into get_unmapped_area()
  fix a struct file leak in do_mmap_pgoff()
  Unify sys_mmap*
  Cut hugetlb case early for 32bit on ia64
  arch_mmap_check() on mn10300
  Kill ancient crap in s390 compat mmap
  arm: add arch_mmap_check(), get rid of sys_arm_mremap()
  file ->get_unmapped_area() shouldn't duplicate work of get_unmapped_area()
  kill useless checks in sparc mremap variants
  fix pgoff in "have to relocate" case of mremap()
  fix the arch checks in MREMAP_FIXED case
  fix checks for expand-in-place mremap
  do_mremap() untangling, part 3
  do_mremap() untangling, part 2
  untangling do_mremap(), part 1
2009-12-11 12:23:29 -08:00
Linus Torvalds
11bd04f6f3 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (109 commits)
  PCI: fix coding style issue in pci_save_state()
  PCI: add pci_request_acs
  PCI: fix BUG_ON triggered by logical PCIe root port removal
  PCI: remove ifdefed pci_cleanup_aer_correct_error_status
  PCI: unconditionally clear AER uncorr status register during cleanup
  x86/PCI: claim SR-IOV BARs in pcibios_allocate_resource
  PCI: portdrv: remove redundant definitions
  PCI: portdrv: remove unnecessary struct pcie_port_data
  PCI: portdrv: minor cleanup for pcie_port_device_register
  PCI: portdrv: add missing irq cleanup
  PCI: portdrv: enable device before irq initialization
  PCI: portdrv: cleanup service irqs initialization
  PCI: portdrv: check capabilities first
  PCI: portdrv: move PME capability check
  PCI: portdrv: remove redundant pcie type calculation
  PCI: portdrv: cleanup pcie_device registration
  PCI: portdrv: remove redundant pcie_port_device_probe
  PCI: Always set prefetchable base/limit upper32 registers
  PCI: read-modify-write the pcie device control register when initiating pcie flr
  PCI: show dma_mask bits in /sys
  ...

Fixed up conflicts in:
	arch/x86/kernel/amd_iommu_init.c
	drivers/pci/dmar.c
	drivers/pci/hotplug/acpiphp_glue.c
2009-12-11 12:18:16 -08:00
Borislav Petkov
505422517d x86, msr: Add support for non-contiguous cpumasks
The current rd/wrmsr_on_cpus helpers assume that the supplied
cpumasks are contiguous. However, there are machines out there
like some K8 multinode Opterons which have a non-contiguous core
enumeration on each node (e.g. cores 0,2 on node 0 instead of 0,1), see
http://www.gossamer-threads.com/lists/linux/kernel/1160268.

This patch fixes out-of-bounds writes (see URL above) by adding per-CPU
msr structs which are used on the respective cores.

Additionally, two helpers, msrs_{alloc,free}, are provided for use by
the callers of the MSR accessors.

Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
Cc: Aristeu Rozanski <aris@redhat.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <20091211171440.GD31998@aftab>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-11 10:59:21 -08:00
H. Peter Anvin
5c6baba84e Merge commit 'linus/master' into x86/urgent 2009-12-11 10:57:42 -08:00
Jason Wessel
8097551d9a kgdb,x86: do not set kgdb_single_step on x86
On an SMP system the kgdb_single_step flag has the possibility to
indefinitely hang the system in the case.  Consider the case where,
CPU 1 has the schedule lock and CPU 0 is set to single step, there is
no way for CPU 0 to run another task.

The easy way to observe the problem is to make 2 cpus busy, and run
the kgdb test suite.  You will see that it hangs the system very
quickly.

while [ 1 ] ; do find /proc > /dev/null 2>&1 ; done &
while [ 1 ] ; do find /proc > /dev/null 2>&1 ; done &
echo V1 > /sys/module/kgdbts/parameters/kgdbts

The side effect of this patch is that there is the possibility
to miss a breakpoint in the case that a single step operation
was executed to step over a breakpoint in common code.

The trade off of the missed breakpoint is preferred to
hanging the kernel.  This can be fixed in the future by
using kprobes or another strategy to step over planted
breakpoints with out of line execution.

CC: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2009-12-11 08:43:18 -06:00
Jason Wessel
cf6f196d11 kgdb,i386: Fix corner case access to ss with NMI watch dog exception
It is possible for the user_mode_vm(regs) check to return true on the
i368 arch for a non master kgdb cpu or when the master kgdb cpu
handles the NMI watch dog exception.

The solution is simply to select the correct gdb_ss location
based on the check to user_mode_vm(regs).

CC: Ingo Molnar <mingo@elte.hu>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2009-12-11 08:43:16 -06:00
Roel Kluin
a5d09d6833 kgdb,x86: remove redundant test
The for loop starts with a breakno of 0, and ends when it's 4. so
this test is always true.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2009-12-11 08:43:12 -06:00
Al Viro
f8b7256096 Unify sys_mmap*
New helper - sys_mmap_pgoff(); switch syscalls to using it.

Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-11 06:44:29 -05:00
Yinghai Lu
893f38d144 x86: Use find_e820() instead of hard coded trampoline address
Jens found the following crash/regression:

[    0.000000] found SMP MP-table at [ffff8800000fdd80] fdd80
[    0.000000] Kernel panic - not syncing: Overlapping early reservations 12-f011 MP-table mpc to 0-fff BIOS data page

and

[    0.000000] Kernel panic - not syncing: Overlapping early reservations 12-f011 MP-table mpc to 6000-7fff TRAMPOLINE

and bisected it to b24c2a9 ("x86: Move find_smp_config()
earlier and avoid bootmem usage").

It turns out the BIOS is using the first 64k for mptable,
without reserving it.

So try to find good range for the real-mode trampoline instead of
hard coding it, in case some bios tries to use that range for sth.

Reported-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Jens Axboe <jens.axboe@oracle.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
LKML-Reference: <4B21630A.6000308@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-11 09:28:22 +01:00
Prarit Bhargava
ebb682f522 x86, AMD: Fix stale cpuid4_info shared_map data in shared_cpu_map cpumasks
The per_cpu cpuid4_info shared_map can contain stale data when CPUs are added
and removed.

The stale data can lead to a NULL pointer derefernce panic on a remove of a
CPU that has had siblings previously removed.

This patch resolves the panic by verifying a cpu is actually online before
adding it to the shared_cpu_map, only examining cpus that are part of
the same lower level cache, and by updating other siblings lowest level cache
maps when a cpu is added.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
LKML-Reference: <20091209183336.17855.98708.sendpatchset@prarit.bos.redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-10 17:19:03 -08:00
Brian Gerst
df59e7bf43 x86: Merge kernel_thread()
Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1260380084-3707-6-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-10 16:41:31 -08:00
Brian Gerst
f443ff4201 x86: Sync 32/64-bit kernel_thread
Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1260380084-3707-5-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-10 15:55:39 -08:00
Brian Gerst
e840227c14 x86, 32-bit: Use same regs as 64-bit for kernel_thread_helper
The arg should be in %eax, but that is clobbered by the return value
of clone.  The function pointer can be in any register.  Also, don't
push args onto the stack, since regparm(3) is the normal calling
convention now.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1260380084-3707-4-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-10 15:55:36 -08:00
Brian Gerst
fa4b8f8438 x86, 64-bit: Use user_mode() to determine new stack pointer in copy_thread()
Use user_mode() instead of a magic value for sp to determine when returning
to kernel mode.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1260380084-3707-3-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-10 15:55:30 -08:00
Brian Gerst
3bd95dfb18 x86, 64-bit: Move kernel_thread to C
Prepare for merging with 32-bit.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1260380084-3707-2-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-10 15:55:26 -08:00
Linus Torvalds
d71cb81af3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: Add debugobjects support
2009-12-10 09:35:44 -08:00
Linus Torvalds
ab1831b0b8 Merge branch 'bugfix' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen
* 'bugfix' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen:
  xen: try harder to balloon up under memory pressure.
  Xen balloon: fix totalram_pages counting.
  xen: explicitly create/destroy stop_machine workqueues outside suspend/resume region.
  xen: improve error handling in do_suspend.
  xen: don't leak IRQs over suspend/resume.
  xen: call clock resume notifier on all CPUs
  xen: use iret for return from 64b kernel to 32b usermode
  xen: don't call dpm_resume_noirq() with interrupts disabled.
  xen: register runstate info for boot CPU early
  xen: register runstate on secondary CPUs
  xen: register timer interrupt with IRQF_TIMER
  xen: correctly restore pfn_to_mfn_list_list after resume
  xen: restore runstate_info even if !have_vcpu_info_placement
  xen: re-register runstate area earlier on resume.
  xen: wait up to 5 minutes for device connetion
  xen: improvement to wait_for_devices()
  xen: fix is_disconnected_device/exists_disconnected_device
  xen/xenbus: make DEVICE_ATTR()s static
2009-12-10 09:35:02 -08:00
Cyrill Gorcunov
125580380f x86, perf events: Check if we have APIC enabled
Ralf Hildebrandt reported this boot warning:

| Running a vanilla 2.6.32 as Xen DomU, I'm getting:
|
| [    0.000999] CPU: Physical Processor ID: 0
| [    0.000999] CPU: Processor Core ID: 1
| [    0.000999] Performance Events: AMD PMU driver.
| [    0.000999] ------------[ cut here ]------------
| [    0.000999] WARNING: at arch/x86/kernel/apic/apic.c:249 native_apic_write_dummy

So we need to check if APIC functionality is available, and
not just in the P6 driver but elsewhere as well.

Reported-by: Ralf Hildebrandt <Ralf.Hildebrandt@charite.de>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091210165634.GF5086@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-10 18:00:30 +01:00
Xiao Guangrong
5e855db5d8 perf_event: Fix variable initialization in other codepaths
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <4B20BAA6.7010609@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-10 17:23:02 +01:00
Christoph Hellwig
6b2f3d1f76 vfs: Implement proper O_SYNC semantics
While Linux provided an O_SYNC flag basically since day 1, it took until
Linux 2.4.0-test12pre2 to actually get it implemented for filesystems,
since that day we had generic_osync_around with only minor changes and the
great "For now, when the user asks for O_SYNC, we'll actually give
O_DSYNC" comment.  This patch intends to actually give us real O_SYNC
semantics in addition to the O_DSYNC semantics.  After Jan's O_SYNC
patches which are required before this patch it's actually surprisingly
simple, we just need to figure out when to set the datasync flag to
vfs_fsync_range and when not.

This patch renames the existing O_SYNC flag to O_DSYNC while keeping it's
numerical value to keep binary compatibility, and adds a new real O_SYNC
flag.  To guarantee backwards compatiblity it is defined as expanding to
both the O_DSYNC and the new additional binary flag (__O_SYNC) to make
sure we are backwards-compatible when compiled against the new headers.

This also means that all places that don't care about the differences can
just check O_DSYNC and get the right behaviour for O_SYNC, too - only
places that actuall care need to check __O_SYNC in addition.  Drivers and
network filesystems have been updated in a fail safe way to always do the
full sync magic if O_DSYNC is set.  The few places setting O_SYNC for
lower layers are kept that way for now to stay failsafe.

We enforce that O_DSYNC is set when __O_SYNC is set early in the open path
to make sure we always get these sane options.

Note that parisc really screwed up their headers as they already define a
O_DSYNC that has always been a no-op.  We try to repair it by using it for
the new O_DSYNC and redefinining O_SYNC to send both the traditional
O_SYNC numerical value _and_ the O_DSYNC one.

Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andreas Dilger <adilger@sun.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jan Kara <jack@suse.cz>
2009-12-10 15:02:50 +01:00
Ingo Molnar
9cf7826743 Merge branch 'amd-iommu/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into x86/urgent 2009-12-10 14:25:48 +01:00
Joerg Roedel
8638c4914f x86/amd-iommu: Fix PCI hotplug with passthrough mode
The device change notifier is initialized in the dma_ops
initialization path. But this path is never executed for
iommu=pt. Move the notifier initialization to IOMMU hardware
init code to fix this.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-12-10 12:23:47 +01:00
Joerg Roedel
b7cc9554bc x86/amd-iommu: Fix passthrough mode
The data structure changes to use dev->archdata.iommu field
broke the iommu=pt mode because in this case the
dev->archdata.iommu was left uninitialized. This moves the
inititalization of the devices into the main init function
and fixes the problem.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-12-10 12:21:31 +01:00