Currently, the 32-bit and 64-bit atomic operations on ARM do not
include memory constraints in the inline assembly blocks. In the
case of barrier-less operations [for example, atomic_add], this
means that the compiler may constant fold values which have actually
been modified by a call to an atomic operation.
This issue can be observed in the atomic64_test routine in
<kernel root>/lib/atomic64_test.c:
00000000 <test_atomic64>:
0: e1a0c00d mov ip, sp
4: e92dd830 push {r4, r5, fp, ip, lr, pc}
8: e24cb004 sub fp, ip, #4
c: e24dd008 sub sp, sp, #8
10: e24b3014 sub r3, fp, #20
14: e30d000d movw r0, #53261 ; 0xd00d
18: e3011337 movw r1, #4919 ; 0x1337
1c: e34c0001 movt r0, #49153 ; 0xc001
20: e34a1aa3 movt r1, #43683 ; 0xaaa3
24: e16300f8 strd r0, [r3, #-8]!
28: e30c0afe movw r0, #51966 ; 0xcafe
2c: e30b1eef movw r1, #48879 ; 0xbeef
30: e34d0eaf movt r0, #57007 ; 0xdeaf
34: e34d1ead movt r1, #57005 ; 0xdead
38: e1b34f9f ldrexd r4, [r3]
3c: e1a34f90 strexd r4, r0, [r3]
40: e3340000 teq r4, #0
44: 1afffffb bne 38 <test_atomic64+0x38>
48: e59f0004 ldr r0, [pc, #4] ; 54 <test_atomic64+0x54>
4c: e3a0101e mov r1, #30
50: ebfffffe bl 0 <__bug>
54: 00000000 .word 0x00000000
The atomic64_set (0x38-0x44) writes to the atomic64_t, but the
compiler doesn't see this, assumes the test condition is always
false and generates an unconditional branch to __bug. The rest of the
test is optimised away.
This patch adds suitable memory constraints to the atomic operations on ARM
to ensure that the compiler is informed of the correct data hazards. We have
to use the "Qo" constraints to avoid hitting the GCC anomaly described at
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44492 , where the compiler
makes assumptions about the writeback in the addressing mode used by the
inline assembly. These constraints forbid the use of auto{inc,dec} addressing
modes, so it doesn't matter if we don't use the operand exactly once.
Cc: stable@kernel.org
Reviewed-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The atomic64_add_unless function compares an atomic variable with
a given value and, if they are not equal, adds another given value
to the atomic variable. The function returns zero if the addition
did not occur and non-zero otherwise.
On ARM, the return value is initialised to 1 in C code. Inline assembly
code then performs the atomic64_add_unless operation, setting the
return value to 0 iff the addition does not occur. This means that
when the addition *does* occur, the value of ret must be preserved
across the inline assembly and therefore requires a "+r" constraint
rather than the current one of "=&r".
Thanks to Nicolas Pitre for helping to spot this.
Cc: stable@kernel.org
Reviewed-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Linux expects that if a CPU modifies a memory location, then that
modification will eventually become visible to other CPUs in the system.
On an ARM11MPCore processor, loads are prioritised over stores so it is
possible for a store operation to be postponed if a polling loop immediately
follows it. If the variable being polled indirectly depends on the outstanding
store [for example, another CPU may be polling the variable that is pending
modification] then there is the potential for deadlock if interrupts are
disabled. This deadlock occurs in the KGDB testsuire when executing on an
SMP ARM11MPCore configuration.
This patch changes the definition of cpu_relax() to smp_mb() for ARMv6 cores,
forcing a flushing of the write buffer on SMP systems before the next load
takes place. If the Kernel is not compiled for SMP support, this will expand
to a barrier() as before.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
gpio must be int, not u16, otherwise -1 isn't recognised
by gpio_is_valid().
Signed-off-by: Steve Bennett <steveb@workware.net.au>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
There are more architectures that don't support ARCH_HAS_SG_CHAIN than
those that support it. This removes removes ARCH_HAS_SG_CHAIN in
asm-generic/scatterlist.h and lets arhictectures to define it.
It's clearer than defining ARCH_HAS_SG_CHAIN asm-generic/scatterlist.h and
undefing it in arhictectures that don't support it.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch contains the hooks and instrumentation into kernel which
live outside the kernel/debug directory, which the kdb core
will call to run commands like lsmod, dmesg, bt etc...
CC: linux-arch@vger.kernel.org
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Martin Hicks <mort@sgi.com>
This patch adds support for 10 hardirq bits to
the ARM architecture. Needed by the SH-Mobile
ARM processor sh7372 that has more than 512 IRQs.
Signed-off-by: Magnus Damm <damm@opensource.se>
Acked-by: Eric Miao <eric.y.miao@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (44 commits)
vlynq: make whole Kconfig-menu dependant on architecture
add descriptive comment for TIF_MEMDIE task flag declaration.
EEPROM: max6875: Header file cleanup
EEPROM: 93cx6: Header file cleanup
EEPROM: Header file cleanup
agp: use NULL instead of 0 when pointer is needed
rtc-v3020: make bitfield unsigned
PCI: make bitfield unsigned
jbd2: use NULL instead of 0 when pointer is needed
cciss: fix shadows sparse warning
doc: inode uses a mutex instead of a semaphore.
uml: i386: Avoid redefinition of NR_syscalls
fix "seperate" typos in comments
cocbalt_lcdfb: correct sections
doc: Change urls for sparse
Powerpc: wii: Fix typo in comment
i2o: cleanup some exit paths
Documentation/: it's -> its where appropriate
UML: Fix compiler warning due to missing task_struct declaration
UML: add kernel.h include to signal.c
...
* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (224 commits)
ARM: remove 'select GENERIC_TIME'
ARM: 6136/1: ARCH_REQUIRE_GPIOLIB selects GENERIC_GPIO
ARM: 6074/1: oprofile: convert from sysdev to platform device
ARM: 6073/1: oprofile: remove old files and update KConfig
ARM: 6072/1: oprofile: use perf-events framework as backend
ARM: 6071/1: perf-events: allow modules to query the number of hardware counters
ARM: 6070/1: perf-events: add support for xscale PMUs
ARM: 6069/1: perf-events: use numeric ID to identify PMU
ARM: 6064/1: pmu: register IRQs at runtime
ARM: Optionally allow ARMv6 to use 'normal, bufferable' memory for DMA
ARM: 6134/1: Handle instruction cache maintenance fault properly
ARM: nwfpe: allow debugging output to be configured at runtime
ARM: rename mach_cpu_disable() to platform_cpu_disable()
ARM: 6132/1: PL330: Add common core driver
ARM: 6094/1: Extend cache-l2x0 to support the 16-way PL310
ARM: Move memory mapping into mmu.c
ARM: Ensure meminfo is sorted prior to sanity_check_meminfo
ARM: Remove useless linux/bootmem.h includes
ARM: convert /proc/cpu/aligment to seq_file
arm: use asm-generic/scatterlist.h
...
In preparation for removing volatile from the atomic_t definition, this
patch adds a volatile cast to all the atomic read functions.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For OProfile to initialise oprofilefs correctly, it needs to know
the number of counters it can represent.
This patch adds a function to the ARM perf-events backend to return
the number of hardware counters available for the current PMU.
Cc: Jamie Iles <jamie.iles@picochip.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The ARM perf-events framework provides support for a number of different
PMUs using struct arm_pmu. The char *name field of this struct can be
used to identify the PMU, but this is cumbersome if used outside of perf.
This patch replaces the name string for a PMU with an enum, which holds
a unique ID for the PMU being represented. This ID can be used to index
an array of names within perf, so no functionality is lost. The presence
of the ID field, allows other kernel subsystems [currently oprofile] to
use their own mappings for the PMU name.
Cc: Jean Pihet <jpihet@mvista.com>
Acked-by: Jamie Iles <jamie.iles@picochip.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The current PMU infrastructure for ARM requires that the IRQs for the PMU
device are fixed at compile time and are selected based on the ARCH_ or MACH_ flags. This has the disadvantage of tying the Kernel down to a
particular board as far as profiling is concerned.
This patch replaces the compile-time IRQ registration with a runtime mechanism which allows the IRQs to be registered with the framework as
a platform_device.
A further advantage of this change is that there is scope for registering
different types of performance counters in the future by changing the id
of the platform_device and attaching different resources to it.
Acked-by: Jamie Iles <jamie.iles@picochip.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Provide a configuration option to allow the ARMv6 to use normal
bufferable memory for coherent DMA. This option is forced to 'y'
for ARMv7, and offered as a configuration option on ARMv6.
Enabling this option requires drivers to have the necessary barriers
to ensure that data in DMA coherent memory is visible prior to the
DMA operation commencing.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
PL330 is a configurable DMA controller PrimeCell device.
The register map of the device is well defined.
The configuration of a particular implementation can be
read from the six configuration registers CR0-4,Dn.
This patch implements a driver for the specification:-
http://infocenter.arm.com/help/topic/com.arm.doc.ddi0424a/DDI0424A_dmac_pl330_r0p0_trm.pdf
The exported interface should be sufficient to implement
a driver for any DMA API.
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The L310 cache controller's interface is almost identical
to the L210. One major difference is that the PL310 can
have up to 16 ways.
This change uses the cache's part ID and the Associativity
bits in the AUX_CTRL register to determine the number of ways.
Also, this version prints out the CACHE_ID and AUX_CTRL registers.
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Jason S. McMullan <jason.mcmullan@netronome.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This moves the TWD register set of MPcore to a common
existing file so that watchdog driver can access it
Signed-off-by: srinidhi kasagar <srinidhi.kasagar@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The standard I-cache Invalidate All (ICIALLU) and Branch Predication
Invalidate All (BPIALL) operations are not automatically broadcast to
the other CPUs in an ARMv7 MP system. The patch adds the Inner Shareable
variants, ICIALLUIS and BPIALLIS, if ARMv7 and SMP.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Final version of the patch that adds support for RS485 communications to the atmel_serial driver.
The patch has been already sent and discussed on both linux-kernel and linux-arm-kernel mailing lists several times.
Many people collaborated to improve and test the code:
Tested-by: Sebastian Heutling <Sebastian.Heutling@who-ing.de>
Tested-by: Bernhard Roth <br@pwrnet.de>
Reviewed-by: Ryan Mallon <ryan@bluewatersys.com>
Signed-off-by: Claudio Scordino <claudio@evidence.eu.com>
Signed-off-by: Michael Trimarchi <michael@evidence.eu.com>
Signed-off-by: Rick Bronson <rick@efn.org>
Signed-off-by: Sebastian Heutling <Sebastian.Heutling@who-ing.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The only difference between ICST307 and ICST525 are the two arrays
for calculating the S parameter; the code is now identical. Merge
the two files and kill the duplicated code.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This makes the ICST support fit more nicely with the clk API,
eliminating the need to *1000 and /1000 in places.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
These functions were originally implemented for the CLCD driver before
we had clk API support. Since the CLCD driver does not use these
anymore, we can remove them.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The structures for the ICST307 and ICST525 VCO devices are
identical, so merge them together.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
iop32x_defconfig:
In file included from include/linux/elf.h:7,
from kernel/elfcore.c:1:
arch/arm/include/asm/elf.h:101: warning: "struct task_struct" declared inside parameter list
arch/arm/include/asm/elf.h:101: warning: its scope is only this definition or declaration, which is probably not what you want
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch adds an enum describing the potential PMU device types in
preparation for PMU device registration via platform devices.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch adds support for PCI domains on ARM platforms.
Also, protect asm/mach/pci.h from multiple inclustions, otherwise
build fails because of pci_domain_nr() and pci_proc_domain()
redefinitions.
Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
/tmp/ccJ3ssZW.s: Assembler messages:
/tmp/ccJ3ssZW.s:1952: Error: can't resolve `.text' {.text section} - `.LFB1077'
This is caused because:
.section .data
.section .text
.section .text
.previous
does not return us to the .text section, but the .data section; this
makes use of .previous dangerous if the ordering of previous sections
is not known.
Fix up the other users of .previous; .pushsection and .popsection are
a safer pairing to use than .section and .previous.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
From: Imre Deak <imre.deak@nokia.com>
Signal handlers can use floating point, so prevent them to corrupt
the main thread's VFP context. So far there were two signal stack
frame formats defined based on the VFP implementation, but the user
struct used for ptrace covers all posibilities, so use it for the
signal stack too.
Introduce also a new user struct for VFP exception registers. In
this too fields not relevant to the current VFP architecture are
ignored.
Support to save / restore the exception registers was added by
Will Deacon.
Signed-off-by: Imre Deak <imre.deak@nokia.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The VIVT cache of a highmem page is always flushed before the page
is unmapped. This cache flush is explicit through flush_cache_kmaps()
in flush_all_zero_pkmaps(), or through __cpuc_flush_dcache_area() in
kunmap_atomic(). There is also an implicit flush of those highmem pages
that were part of a process that just terminated making those pages free
as the whole VIVT cache has to be flushed on every task switch. Hence
unmapped highmem pages need no cache maintenance in that case.
However unmapped pages may still be cached with a VIPT cache because the
cache is tagged with physical addresses. There is no need for a whole
cache flush during task switching for that reason, and despite the
explicit cache flushes in flush_all_zero_pkmaps() and kunmap_atomic(),
some highmem pages that were mapped in user space end up still cached
even when they become unmapped.
So, we do have to perform cache maintenance on those unmapped highmem
pages in the context of DMA when using a VIPT cache. Unfortunately,
it is not possible to perform that cache maintenance using physical
addresses as all the L1 cache maintenance coprocessor functions accept
virtual addresses only. Therefore we have no choice but to set up a
temporary virtual mapping for that purpose.
And of course the explicit cache flushing when unmapping a highmem page
on a system with a VIPT cache now can go, which should increase
performance.
While at it, because the code in __flush_dcache_page() has to be modified
anyway, let's also make sure the mapped highmem pages are pinned with
kmap_high_get() for the duration of the cache maintenance operation.
Because kunmap() does unmap highmem pages lazily, it was reported by
Gary King <GKing@nvidia.com> that those pages ended up being unmapped
during cache maintenance on SMP causing segmentation faults.
Signed-off-by: Nicolas Pitre <nico@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
clkdev.h is using struct device *. Due to this compilation
warning is comming. Removing this warning.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
irq.h is using struct pt_regs *. Due to this compilation
warning is comming. Removing this warning by adding declaration
of struct pt_regs.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The mandatory barriers (mb, rmb, wmb) are used even on uniprocessor
systems for things like ordering Normal Non-cacheable memory accesses
with DMA transfer (via Device memory writes). The current implementation
uses dmb() for mb() and friends but this is not sufficient. The DMB only
ensures the relative ordering of the observability of accesses by other
processors or devices acting as masters. In case of DMA transfers
started by writes to device memory, the relative ordering is not ensured
because accesses to slave ports of a device are not considered
observable by the DMB definition.
A DSB is required for the data to reach the main memory (even if mapped
as Normal Non-cacheable) before the device receives the notification to
begin the transfer. Furthermore, some L2 cache controllers (like L2x0 or
PL310) buffer stores to Normal Non-cacheable memory and this would need
to be drained with the outer_sync() function call.
The patch also allows platforms to define their own mandatory barriers
implementation by selecting CONFIG_ARCH_HAS_BARRIERS and providing a
mach/barriers.h file.
Note that the SMP barriers are unchanged (being DMBs as before) since
they are only guaranteed to work with Normal Cacheable memory.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch introduces the outer_cache_fns.sync function pointer together
with the OUTER_CACHE_SYNC config option that can be used to drain the
write buffer of the outer cache.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
To avoid #include collisions with subsequent patches in the series, this
patch moves the outer_cache definitions to a separate asm/outercache.h
file.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Convert arm to use GENERIC_TIME via the arch_getoffset() infrastructure,
reducing the amount of arch specific code we need to maintain.
The arm architecture is the last arch that need to be converted.
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Commit 26a26d3296 ("dma-mapping: switch
ARMv7 DMA mappings to retain 'memory' attribute") added a new macro,
pgprot_dmacoherent(), to correctly map DMA memory. The non-mmu pgtable
support code also needs to implement this macro, otherwise when
compiling you get:
CC arch/arm/mm/dma-mapping.o
arch/arm/mm/dma-mapping.c: In function 'dma_alloc_coherent':
arch/arm/mm/dma-mapping.c:320: error: implicit declaration of function 'pgprot_dmacoherent'
arch/arm/mm/dma-mapping.c:320: error: 'pgprot_kernel' undeclared (first use in this function)
arch/arm/mm/dma-mapping.c:320: error: (Each undeclared identifier is reported only once
arch/arm/mm/dma-mapping.c:320: error: for each function it appears in.)
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2.6.34-rc1 added kernel/elfcore.c which includes <asm/elf.h>.
On ARM, this results in:
In file included from include/linux/elf.h:7,
from kernel/elfcore.c:1:
/tmp/linux-2.6.34-rc1/arch/arm/include/asm/elf.h:101: warning: 'struct task_struct' declared inside parameter list
/tmp/linux-2.6.34-rc1/arch/arm/include/asm/elf.h:101: warning: its scope is only this definition or declaration, which is probably not what you want
Including <linux/sched.h> seems a bit heavyweight, so this patch just
adds a tentative declaration of struct task_struct in <asm/elf.h>.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm: (370 commits)
ARM: S3C2443: Add set_rate and round_rate calls for armdiv clock
ARM: S3C2443: Remove #if 0 for clk_mpll
ARM: S3C2443: Update notes on MPLLREF clock
ARM: S3C2443: Further clksrc-clk conversions
ARM: S3C2443: Change to using plat-samsung clksrc-clk implementation
USB: Fix s3c-hsotg build following Samsung platform header moves
ARM: S3C64XX: Reintroduce unconditional build of audio device
ARM: 5961/1: ux500: fix CLKRST addresses
ARM: 5977/1: arm: Enable backtrace printing on oops when PC is corrupted
ASoC: Fix S3C64xx IIS driver for Samsung header reorg
ARM: S3C2440: Fix plat-s3c24xx move of s3c2440/s3c2442 support
[ARM] pxa: fix typo in mxm8x10.h
[ARM] pxa/raumfeld: set GPIO drive bits for LED pins
[ARM] pxa/zeus: Add support for mcp2515 CAN bus
[ARM] pxa/zeus: Add support for onboard max6369 watchdog
[ARM] pxa/zeus: Add Eurotech as the manufacturer
[ARM] pxa/zeus: Correct the USB host initialisation flags
[ARM] pxa/zeus: Allow usage of 8250-compatible UART in uncompress
[ARM] pxa: refactor uncompress.h for non-PXA uarts
[ARM] mmp2: fix incorrect calling of chip->mask_ack() for 2nd level cascaded IRQs
...