Don't alter any txd->srcbus or txd->dstbus values while building the
LLI list. This allows us to see the original dma_addr_t values passed
in via the prep_memcpy() method.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The number of bytes we want to fill into any LLI is the minimum of:
- number of bytes remaining in the transfer
- number of bytes we can transfer in a single LLI
- number of bytes we can transfer without overflowing the source boundary
- number of bytes we can transfer without overflowing the destination boundary
The minimum of the first two is already calculated (target_len). We
limit the boundary calculations to this number of bytes, which will
then give us the number of bytes we can place into this LLI.
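As a rough sketch of this clamping order (illustrative only; the helper and
variable names mirror the text above, not the driver source):

#include <stddef.h>

static size_t min_sz(size_t a, size_t b)
{
        return a < b ? a : b;
}

/* Bytes we may place in this LLI, given the per-boundary limits. */
static size_t bytes_for_lli(size_t remaining, size_t max_per_lli,
                            size_t src_boundary_len, size_t dst_boundary_len)
{
        /* The minimum of the first two is what the text calls target_len. */
        size_t target_len = min_sz(remaining, max_per_lli);

        /* The boundary limits are then applied up to target_len only. */
        return min_sz(target_len, min_sz(src_boundary_len, dst_boundary_len));
}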
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
pl08x_pre_boundary() was unsafe with addresses towards the top of
memory space:
boundary = ((addr >> PL08X_BOUNDARY_SHIFT) + 1)
<< PL08X_BOUNDARY_SHIFT;
This can overflow a 32-bit number, producing zero. When it does:
if (boundary < addr + len)
return boundary - addr;
else
return len;
results in (boundary - addr) returning a large, bogus value. The
calculation also fails if addr + len itself overflows.
We can fix this trivially as the only thing we're actually interested
in is the value of the least significant PL08X_BOUNDARY_SHIFT bits:
boundary_len = PL08X_BOUNDARY_SIZE -
(addr & (PL08X_BOUNDARY_SIZE - 1));
gives us the number of bytes before 'addr' becomes a multiple of
PL08X_BOUNDARY_SIZE. We can then just take the min() of the two
calculated lengths.
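A minimal, compilable demonstration of the overflow and the fix
(PL08X_BOUNDARY_SHIFT is assumed to be 10, i.e. a 1KB boundary, for this
example):

#include <stdint.h>
#include <stdio.h>

#define PL08X_BOUNDARY_SHIFT    10
#define PL08X_BOUNDARY_SIZE     (1u << PL08X_BOUNDARY_SHIFT)

int main(void)
{
        uint32_t addr = 0xffffff80;     /* near the top of 32-bit space */

        /* Old calculation: wraps to 0, so (boundary - addr) is bogus. */
        uint32_t boundary = ((addr >> PL08X_BOUNDARY_SHIFT) + 1)
                                << PL08X_BOUNDARY_SHIFT;

        /* New calculation: only the low bits of addr matter. */
        uint32_t boundary_len = PL08X_BOUNDARY_SIZE -
                                (addr & (PL08X_BOUNDARY_SIZE - 1));

        printf("old boundary: 0x%08x (wrapped to zero)\n", boundary);
        printf("bytes to next boundary: %u\n", boundary_len);
        return 0;
}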
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
We don't need pl08x_fill_lli_for_desc() to return num_llis + 1 as
we know that's what it always does. We can just pass in num_llis
and use post-increment in the caller.
This makes the code slightly easier to read.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Calling the callback handler from the tasklet with its spinlocks held leads
to deadlock when the callback calls back into dmaengine functions:
BUG: spinlock lockup on CPU#0, sh/417, c1870a08
Backtrace:
...
[<c017b408>] (do_raw_spin_lock+0x0/0x154) from [<c02c4b98>] (_raw_spin_lock_irqsave+0x54/0x60)
[<c02c4b44>] (_raw_spin_lock_irqsave+0x0/0x60) from [<c01f5828>] (pl08x_prep_channel_resources+0x718/0x8b4)
[<c01f5110>] (pl08x_prep_channel_resources+0x0/0x8b4) from [<c01f5bb4>] (pl08x_prep_slave_sg+0x120/0x19c)
[<c01f5a94>] (pl08x_prep_slave_sg+0x0/0x19c) from [<c01be7a0>] (pl011_dma_tx_refill+0x164/0x224)
[<c01be63c>] (pl011_dma_tx_refill+0x0/0x224) from [<c01bf1c8>] (pl011_dma_tx_callback+0x7c/0xc4)
[<c01bf14c>] (pl011_dma_tx_callback+0x0/0xc4) from [<c01f4d34>] (pl08x_tasklet+0x60/0x368)
[<c01f4cd4>] (pl08x_tasklet+0x0/0x368) from [<c004d978>] (tasklet_action+0xa0/0x100)
Dan quoted the documentation:
> 2/ Completion callback routines cannot submit new operations. This
> results in recursion in the synchronous case and spin_locks being
> acquired twice in the asynchronous case.
but then followed up to say:
> I should clarify, this is the async_memcpy() api requirement which is
> not used outside of md/raid5. DMA drivers can and do allow new
> submissions from callbacks, and the ones that do so properly move the
> callback outside of the driver lock.
So let's fix it by moving the callback out of the spinlocked region.
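A hedged sketch of the resulting pattern (not the driver's exact code):
snapshot the callback and its argument while holding the lock, drop the
lock, then invoke the callback.

#include <linux/dmaengine.h>
#include <linux/spinlock.h>

static void example_complete_txd(spinlock_t *lock,
                                 struct dma_async_tx_descriptor *txd)
{
        dma_async_tx_callback callback;
        void *param;
        unsigned long flags;

        spin_lock_irqsave(lock, flags);
        /* ... update descriptor/channel state while protected ... */
        callback = txd->callback;
        param = txd->callback_param;
        spin_unlock_irqrestore(lock, flags);

        /* The callback may submit new work, so it runs without the lock. */
        if (callback)
                callback(param);
}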
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Platforms need to be able to control which AHB master interface is used,
as each AHB master interface may be asymmetric. Allow the interfaces
used for fetching LLIs, memory, and each peripheral to be configured
individually.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
As we initialize the default cctl value in the prep_* functions along
with the increment settings, we don't need to repeat the selection of
the AHB ports each time we create a LLI entry. Do this in the prep_*
functions once per transfer.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
We don't need to initialize the cctl increment and protection values
in the runtime_config method - we have all the information needed to set
up these values in prep_slave_sg(). Move their initialization there.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Rather than modifying platform data while preparing a transfer, copy
the cctl value into the txd structure and modify the value there.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
There is no need to wait until we start processing a tx descriptor
before setting up the DMA request selection in the ccfg register.
We know which channel and request will be used in prep_phy_channel(),
so setup the ccfg request selection at txd creation time in
prep_phy_channel().
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The ccfg register is used to configure the channel parameters - the type
and direction of transfer, the flow control signal and IRQ mask enables.
The type and direction of transfer is known in the relevant prep_*
function where a txd is created. The IRQ mask enables are always set,
and the flow control signals are always set when we start processing a
txd according to phychan->signal.
If we store the ccfg value in the txd structure, we can avoid modifying
platform data - and even having it in platform data at all.
So, remove it from platform data too.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
As we now have all the code accessing the phychan {csrc,cdst,clli,cctl,
ccfg} members in one function, there's no point storing the data into
the struct. Get rid of the struct members. Re-order the register dump
in the dev_dbg() to reflect the order we write the registers to the DMA
device.
The txd {csrc,cdst,clli,cctl} values are duplicates of the lli[0]
values, so there's no point duplicating these either. Program the DMAC
registers directly from the lli[0] values.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
There is no need for pl08x_config_phychan_for_txd(), pl08x_set_cregs()
and pl08x_enable_phy_chan() to be separate - they are always called in
sequence. Combine them into one function.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
As the LLI list is an array, we can use maths to locate which LLI
index we're currently at, and then sum up the remaining LLI entries
until we reach the end of the list.
This makes the code much easier to read, and much less susceptible
to falling off the end of the array.
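A sketch of the idea, with assumed structure and field names (the
transfer-size mask and MAX_NUM_LLIS value here are illustrative, and the
summed sizes are in source-width units, not bytes):

#include <linux/types.h>

struct example_lli {
        u32 src;
        u32 dst;
        u32 lli;        /* bus address of the next LLI, 0 on the last one */
        u32 cctl;
};

#define MAX_NUM_LLIS    512

static u32 example_units_left(const struct example_lli *llis_va,
                              dma_addr_t llis_bus, dma_addr_t clli)
{
        /* The hardware's current LLI pointer gives us the array index... */
        size_t i = (clli - llis_bus) / sizeof(struct example_lli);
        u32 units = 0;

        /* ...and we sum the transfer sizes of the entries still to run. */
        for (; i < MAX_NUM_LLIS; i++) {
                units += llis_va[i].cctl & 0xfff;   /* assumed size field */
                if (!llis_va[i].lli)
                        break;
        }
        return units;
}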
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The LLI pointer in the documentation is placed into the LLI register,
so name it LLI rather than 'next'.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Use 'u32' for the LLI structure members, which are defined by hardware
to be 32-bit. dma_addr_t is much more vague about its actual size.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Use size_t for variables denoting lengths throughout, and use the 'z'
qualifier for printing the value. For safety, add a BUG_ON() in
pl08x_fill_lli_for_desc() to catch the remainder potentially becoming
negative.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
llis_bus is the DMA address of the LLI array. Casting it to be a
pointer just to be able to use pointer arithmetic on it is not nice.
We can trivially deal with the places where we do arithmetic on it,
and it's actually cleaner this way.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
We only want to use the address of the LLI pointer when locating the
corresponding structure in memory, so clear the master bus selection
bit.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Tight loops should use cpu_relax() to allow CPUs to reduce power
consumption while waiting for events.
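For illustration, a sketch of the idiom with a hypothetical channel status
register and busy bit (only the cpu_relax() usage is the point):

#include <linux/bitops.h>
#include <linux/io.h>

#define CH_CONFIG       0x10
#define CH_ACTIVE       BIT(0)

static void wait_for_channel_idle(void __iomem *base)
{
        /* Busy-wait, but let the CPU relax while we spin. */
        while (readl(base + CH_CONFIG) & CH_ACTIVE)
                cpu_relax();
}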
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Consolidate duplicated channel release code into release_phy_channel().
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Consolidate code which allocates and initializes txds.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Avoid using 'void *' struct fields when the structs are not defined
in linux/amba/pl08x.h - instead, forward declare the struct names, and
use these instead. This ensures we have proper typechecking.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
We should never modify the vendor data structure so make it const.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The driver already won't initialize a channel with a circular buffer;
the check in pl08x_prep_channel_resources() sees to that. Remove
circular buffer support for the time being.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The tasklet always is initialized with a non-NULL data argument. It
is not possible for it to be called with a NULL data argument (unless
something is very wrong in the tasklet code - in which case lots of
stuff will break). Therefore, as plchan can never be NULL, remove
this unnecessary BUG check.
In pl08x_tasklet(), we've already dereferenced plchan->at, so it can't
be NULL here. Remove this unnecessary BUG check.
pl08x_fill_llis_for_desc() and pl08x_free_txd() are always called with
a non-NULL txd argument - either as a consequence of the code paths or
as a result of other checks already in place. We don't need to repeat
the non-NULL check in these functions.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
We don't need to include linux/pci.h as we aren't a PCI driver. We
aren't doing any processor specific functions, so asm/processor.h is
not required. asm/cacheflush.h shouldn't be used, we have the DMA API
for this. DMA interfaces aren't required as we're only implementing
the dmaengine API and not a platform-private DMA API.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
A driver which emits both decimal and hex numbers in its printk
creates confusion as to what is what. Prefix hex numbers with 0x.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Include the revision number of the PL08x primecell in the boot-time
printk to allow proper identification of the peripheral. Reformat
the announcement printk format to reflect what we do for other primecell
drivers - generally "PLXXX revX at 0xNNNNNNNN irq X".
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Explain the two flow control methods which the PL08x implements, along
with the problem which peripheral flow control presents. This helps
people understand why we are unable to use these DMA controllers with
(eg) the MMCI.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
drivers/dma/amba-pl08x.c:1895:40: warning: Unknown escape '%'
drivers/dma/amba-pl08x.c:1903:40: warning: Unknown escape '%'
drivers/dma/amba-pl08x.c:513:6: warning: symbol 'pl08x_choose_master_bus' was not declared. Should it be static?
drivers/dma/amba-pl08x.c:604:5: warning: symbol 'pl08x_fill_llis_for_desc' was not declared. Should it be static?
drivers/dma/amba-pl08x.c:1442:32: warning: symbol 'pl08x_prep_slave_sg' was not declared. Should it be static?
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Trying to disable a tasklet while holding a spinlock which the tasklet
will take is a recipe for deadlock - tasklet_disable() will wait for the
tasklet to finish running, which it will never do. In any case, there
is not a corresponding tasklet_enable(), so once the tasklet is disabled,
it will never run again until reboot.
It's safe to just remove the tasklet_disable() as we remove all current
and pending descriptors before releasing this spinlock. This means that
the tasklet will find no remaining work if it subsequently runs.
The only remaining issue is that the callback for an already submitted
txd may be in progress, or even called after terminate_all() returns.
There's not much that can be done about that as waiting for the callback
to complete before returning will also lead to deadlocks.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
pl08x_issue_pending() returns with the spinlock locked and interrupts
disabled if the channel is waiting for a physical DMA to become free.
This is wrong - especially as pl08x_issue_pending() is an API function -
as it leads to deadlocks. Fix it to always return with the spinlock
unlocked.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
If we fail to allocate the LLI, the prep_* function will return NULL.
However, the TXD we allocated will not be placed on any list, nor
will it be freed - we'll just drop all references to it. Make sure
we free it rather than leaking TXDs.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Tasklets run with interrupts enabled, and the slave DMA functions can be
called from within IRQ handlers. Taking the spinlock without disabling
interrupts allows an interrupt handler to run, which may try to take the
spinlock again, resulting in deadlock. Fix this by using the irqsave
spinlock variants.
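A sketch of the locking change with a hypothetical channel structure; the
point is spin_lock_irqsave() rather than plain spin_lock():

#include <linux/spinlock.h>

struct example_chan {
        spinlock_t lock;
        int pending;
};

static void example_queue_desc(struct example_chan *ch)
{
        unsigned long flags;

        /*
         * Plain spin_lock() would deadlock if an IRQ handler on this CPU
         * called back into the driver and tried to take ch->lock again.
         */
        spin_lock_irqsave(&ch->lock, flags);
        ch->pending++;
        spin_unlock_irqrestore(&ch->lock, flags);
}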
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The last_issued variable uses an atomic type, but it is only incremented
inside a protected region and merely read everywhere else, so it isn't
using atomic_t correctly and doesn't need to be atomic at all. Moreover,
the DMA engine code already provides us with a variable for this -
chan.cookie. Use chan.cookie instead.
Also, avoid negative dma_cookie_t values - negative returns from
tx_submit() mean failure, yet in reality we always succeed. Restart
from cookie 1, just like other DMA engine drivers do.
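A sketch of the cookie handling, modelled on how other dmaengine drivers of
this era assign cookies (locking omitted; not the exact amba-pl08x code):

#include <linux/dmaengine.h>

static dma_cookie_t example_tx_submit(struct dma_async_tx_descriptor *tx)
{
        struct dma_chan *chan = tx->chan;
        dma_cookie_t cookie = chan->cookie;

        /* Negative cookies indicate failure, so wrap back to 1. */
        if (++cookie < 0)
                cookie = 1;
        chan->cookie = cookie;
        tx->cookie = cookie;

        return cookie;
}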
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
If maxburst was passed in as zero, we would overflow the burst_sizes[]
array. Fix this by checking for this condition, and defaulting to
single transfer 'bursts'.
Improve the readability of the loop using a for() loop rather than
a while() loop with the iterator initialized far from the loop.
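A sketch of the resulting lookup with an abbreviated burst_sizes[] table
(the table contents and register encodings below are illustrative):

#include <linux/kernel.h>
#include <linux/types.h>

static const struct {
        unsigned int burstwords;
        u32 reg;
} burst_sizes[] = {
        { 256, 7 }, { 128, 6 }, { 64, 5 }, { 32, 4 },
        { 16, 3 }, { 8, 2 }, { 4, 1 },
        { 1, 0 },       /* single transfer */
};

static u32 example_burst_to_reg(unsigned int maxburst)
{
        unsigned int i;

        /* maxburst == 0 would walk off the table: use single transfers. */
        if (!maxburst)
                maxburst = 1;

        /* Pick the largest supported burst not exceeding maxburst. */
        for (i = 0; i < ARRAY_SIZE(burst_sizes) - 1; i++)
                if (burst_sizes[i].burstwords <= maxburst)
                        break;

        return burst_sizes[i].reg;
}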
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Correct mis-spellings in comments and printk strings.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The maximum transfer size of the stedma40 is (64k-1) x data-width.
If the transfer size of one element exceeds this limit,
the job is split up and sent as linked transfers.
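A sketch of the split arithmetic only (the names and the 0xffff element
counter limit follow the text above; this is not the driver's code):

#include <linux/kernel.h>

#define EXAMPLE_MAX_SEG_SIZE    0xffff  /* 64k - 1 elements per job */

/* Number of linked jobs needed for 'len' bytes at a given data width. */
static unsigned int example_num_linked_jobs(size_t len,
                                            unsigned int data_width_bytes)
{
        size_t max_bytes = (size_t)EXAMPLE_MAX_SEG_SIZE * data_width_bytes;

        return DIV_ROUND_UP(len, max_bytes);
}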
Signed-off-by: Per Forlin <per.forlin@linaro.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
dmaengine: provide dummy functions for DMA_ENGINE=n
mv_xor: fix race in tasklet function
Use mv_xor_slot_cleanup() instead of __mv_xor_slot_cleanup(), as the former
function acquires the spinlock that is needed to protect the driver's data.
Cc: <stable@kernel.org>
Signed-off-by: Saeed Bishara <saeed@marvell.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Currently completed descriptors are processed in the tasklet. This can
lead to deadlock when CONFIG_NET_DMA is enabled (new requests are
submitted from softirq context and dma_memcpy_to_iovec() busy loops until
the request is submitted). To prevent this we should process completed
descriptors from the allocation failure path in prepare_memcpy too.
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Cc: Piotr Ziecik <kosmo@semihalf.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
MPC8308 has pretty much the same DMA controller as MPC5121 and
this patch adds support for MPC8308 to the mpc512x_dma driver.
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Acked-by: Wolfgang Denk <wd@denx.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Current code clears the interrupt active status _after_ submitting new
transfers. This leaves a possibility of clearing the interrupt for this
new transfer (if it is triggered fast enough) and thus losing the
interrupt. We want to clear the interrupt active status _before_ new
transfers are submitted, and for the current channel only.
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Acked-by: Wolfgang Denk <wd@denx.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
While testing the mpc512x-dma driver with the dmatest module I've found
that I can hang the mpc512x-dma by issuing requests from multiple threads
to a single channel.
insmod dmatest.ko max_channels=1 threads_per_chan=16
After investigating this case I've managed to find that this happens
if and only if we have more than one queued request.
In this case the driver tries to make use of hardware scatter/gather
functionality. I've found two problems with scatter/gather:
1. When the TCD is copied from RAM to the TCD register space with memcpy_io(),
the e_sg bit eventually gets cleared. This results in only the first TCD being
executed. I've added setting of the e_sg bit explicitly in the TCD registers.
BTW, what is the correct way to do this? (How can I use setbits with a
bitfield structure?) After that the hardware loads consecutive TCDs and we hit
the second issue.
2. Existing code clears int_maj bit in the last TCD so we never get
an interrupt on transfer completion.
With these fixes my tests with many threads on a single channel succeed, but
tests that use many channels simultaneously still don't work reliably.
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Acked-by: Wolfgang Denk <wd@denx.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Conflicts:
MAINTAINERS
arch/arm/mach-omap2/pm24xx.c
drivers/scsi/bfa/bfa_fcpim.c
An update was needed to apply fixes for which the old branch was too
outdated.
Presently DMA transfers are interrupted and aborted by the NMI. This
implements some basic logic for more gracefully handling and clearing
each controller's NMIF flag via the NMI die chain, needed to resume
transfers post-NMI.
Reported-by: Michael Szafranek <Michael.Szafranek@emtrion.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Buffer transfer size is the number of transfers to be performed in
relation to the width of the _source_ interface.
So in the DMA_FROM_DEVICE case, it is the register width that should be
taken into account.
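A sketch of the computation (reg_width and mem_width here are log2 byte
widths and the names are illustrative, not the driver's actual fields):

#include <linux/dma-mapping.h>
#include <linux/types.h>

static unsigned int example_xfer_count(size_t len,
                                       enum dma_data_direction dir,
                                       unsigned int reg_width,
                                       unsigned int mem_width)
{
        /* When reading from the device, the peripheral register is the
         * source interface, so its width determines the transfer count. */
        unsigned int src_width = (dir == DMA_FROM_DEVICE) ? reg_width
                                                          : mem_width;

        return len >> src_width;
}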
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Fix slow fsldma performance by initializing the DMA mode register with
bandwidth control. This boosts DMA performance and should work with
85xx boards.
Signed-off-by: Forrest Shi <b29237@freescale.com>
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The firmware framework gets initialized during fs_initcall time, so
we are not allowed to call request_firmware earlier.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Currently, when using scatter/gather mode, the head of the data is not sent
to the destination. The cause is that the second descriptor's address is set
in NEXT, while NEXT must hold the address of the head descriptor.
This patch sets the head descriptor's address in NEXT.
Acked-by: Yong Wang <youg.y.wang@intel.com>
Signed-off-by: Tomoya MORINAGA <tomoya-linux@dsn.okisemi.com>
[dan.j.williams@intel.com: fixed up usage of virt_to_phys()]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The driver's current device_prep_slave_sg() can't be used by DMAC2 even if
the sg list contains only one item; this patch enables DMAC2 to use this
API.
Signed-off-by: Feng Tang <feng.tang@intel.com>
Acked-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Rename intel_mid_dma_pci to intel_mid_dma_pci_driver to pick up the
applied annotations of that suffix.
Reported-by: <major_Lee@wistron.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Currently, while submitting scatterlists with more than one SG entry, the
DMA buffer address from the first SG entry is inserted into all initialized
DMA buffer descriptors. This is due to a typo in the for_each_sg() loop
where the scatterlist pointer is used for obtaining the DMA buffer address
and _not_ the SG list iterator.
As a result all received data will be written only into the first DMA
buffer while reading. While writing, the data from the first DMA buffer is
sent to the device multiple times. This caused filesystem destruction on
the MMC card when using DMA in the mxcmmc driver.
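A sketch of the corrected loop shape; the buffer descriptor type and its
fields below are placeholders, not the actual driver structures:

#include <linux/scatterlist.h>

struct example_bd {
        dma_addr_t buf_addr;
        size_t len;
};

static void example_fill_bds(struct example_bd *bd, struct scatterlist *sgl,
                             unsigned int sg_len)
{
        struct scatterlist *sg;
        int i;

        for_each_sg(sgl, sg, sg_len, i) {
                /* Use the iterator 'sg', not the list head 'sgl'. */
                bd[i].buf_addr = sg_dma_address(sg);
                bd[i].len = sg_dma_len(sg);
        }
}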
Signed-off-by: Anatolij Gustschin <agust@denx.de>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Change the Makefile to use <modules>-y instead of <modules>-objs, following
Documentation/kbuild/makefiles.txt.
Signed-off-by: Tracey Dent <tdent48227@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Expand the dma_mask of fsldma device to 36-bit, indicating that the
DMA engine can deal with 36-bit physical address and does not need
the SWIOTLB to create bounce buffer for it when doing dma_map_*().
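The change boils down to something like the following sketch (the device
pointer and function name are placeholders):

#include <linux/device.h>
#include <linux/dma-mapping.h>

static int example_set_dma_mask(struct device *dev)
{
        /* The engine addresses 36 bits, so no SWIOTLB bouncing is needed. */
        return dma_set_mask(dev, DMA_BIT_MASK(36));
}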
Signed-off-by: Li Yang <leoli@freescale.com>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The SDMA firmware consists of a ROM part and a RAM part.
The ROM part is always present in the SDMA engine and
is sufficient for many cases.
This patch allows passing in platform data containing the script addresses
in ROM, so loading a firmware is now optional.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Use ccflags-y instead of EXTRA_CFLAGS, which is deprecated, according to
Documentation/kbuild/makefiles.txt.
Signed-off-by: Tracey Dent <tdent48227@gmail.com>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
We should not call kfree(dma) in the mid_setup_dma() error path, because
the memory is allocated in intel_mid_dma_probe() and will be freed in the
intel_mid_dma_probe() error path if mid_setup_dma() returns an error.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Otherwise, i will be -1 inside the last iteration of the while loop.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Smatch complains because we dereference "mid" before checking it. It
turns out that "mid" is always a valid pointer here so we can just
remove the check.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Product codenames are OK, but once an actual product name is available,
it should be referenced as well.
http://ark.intel.com/chipset.aspx?familyID=52499
Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: (48 commits)
DMAENGINE: move COH901318 to arch_initcall
dma: imx-dma: fix signedness bug
dma/timberdale: simplify conditional
ste_dma40: remove channel_type
ste_dma40: remove enum for endianess
ste_dma40: remove TIM_FOR_LINK option
ste_dma40: move mode_opt to separate config
ste_dma40: move channel mode to a separate field
ste_dma40: move priority to separate field
ste_dma40: add variable to indicate valid dma_cfg
async_tx: make async_tx channel switching opt-in
move async raid6 test to lib/Kconfig.debug
dmaengine: Add Freescale i.MX1/21/27 DMA driver
intel_mid_dma: change the slave interface
intel_mid_dma: fix the WARN_ONs
intel_mid_dma: Add sg list support to DMA driver
intel_mid_dma: Allow DMAC2 to share interrupt
intel_mid_dma: Allow IRQ sharing
intel_mid_dma: Add runtime PM support
DMAENGINE: define a dummy filter function for ste_dma40
...
NULL-terminate the pci_device_id tables in pch_dma.c and scx200_acb.c so
that MODULE_DEVICE_TABLE can be applied (to publish modaliases).
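For reference, the pattern looks like this sketch (the vendor/device IDs
are placeholders, not pch_dma's real IDs):

#include <linux/module.h>
#include <linux/pci.h>

static const struct pci_device_id example_pci_tbl[] = {
        { PCI_DEVICE(0x1234, 0x5678) },
        { },    /* terminating entry that MODULE_DEVICE_TABLE relies on */
};
MODULE_DEVICE_TABLE(pci, example_pci_tbl);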
Signed-off-by: Dzianis Kahanovich <mahatma@eu.by>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
After moving the PL022 driver to subsys_initcall() due to the need
of having stuff like regulators on the other end of the SPI link,
I noticed that the PL022 driver would now get probed before the COH901318
DMA engine, so move the DMA engine to an arch_initcall().
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
imxdmac->channel was unsigned, so the check (imxdmac->channel < 0) for a
failed imx_dma_request_by_prio() made no sense. Explicitly check signed
values instead.
Also, fix an uninitialized use of ret.
Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
vfs: make no_llseek the default
vfs: don't use BKL in default_llseek
llseek: automatically add .llseek fop
libfs: use generic_file_llseek for simple_attr
mac80211: disallow seeks in minstrel debug code
lirc: make chardev nonseekable
viotape: use noop_llseek
raw: use explicit llseek file operations
ibmasmfs: use generic_file_llseek
spufs: use llseek in all file operations
arm/omap: use generic_file_llseek in iommu_debug
lkdtm: use generic_file_llseek in debugfs
net/wireless: use generic_file_llseek in debugfs
drm: use noop_llseek
Simplify: ((a && b) || (!a && !b)) => (a == b)
Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Acked-by: Jack Stone <jwjstone@fastmail.fm>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
A bool will suffice. The default is little endian.
Acked-by: Jonas Aaberg <jonas.aberg@stericsson.com>
Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Defaults are "basic mode" for physical channels, and "logical source
logical destination" for logical channels.
Acked-by: Jonas Aaberg <jonas.aberg@stericsson.com>
Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
And keep it logical by default.
Acked-by: Jonas Aaberg <jonas.aberg@stericsson.com>
Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
And keep it low priority by default.
Acked-by: Jonas Aaberg <jonas.aberg@stericsson.com>
Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Since we want to reduce the amount of required channel
configuration and remove channel_type, don't depend on it
to indicate whether the configuration is valid.
Acked-by: Jonas Aaberg <jonas.aberg@stericsson.com>
Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
All file_operations should get a .llseek operation so we can make
nonseekable_open the default for future file operations without a
.llseek pointer.
The three cases that we can automatically detect are no_llseek, seq_lseek
and default_llseek. For cases where we can automatically prove that
the file offset is always ignored, we use noop_llseek, which maintains
the current behavior of not returning an error from a seek.
New drivers should normally not use noop_llseek but instead use no_llseek
and call nonseekable_open at open time. Existing drivers can be converted
to do the same when the maintainer knows for certain that no user code
relies on calling seek on the device file.
The generated code is often incorrectly indented and right now contains
comments that clarify for each added line why a specific variant was
chosen. In the version that gets submitted upstream, the comments will
be gone and I will manually fix the indentation, because there does not
seem to be a way to do that using coccinelle.
Some amount of new code is currently sitting in linux-next that should get
the same modifications, which I will do at the end of the merge window.
Many thanks to Julia Lawall for helping me learn to write a semantic
patch that does all this.
===== begin semantic patch =====
// This adds an llseek= method to all file operations,
// as a preparation for making no_llseek the default.
//
// The rules are
// - use no_llseek explicitly if we do nonseekable_open
// - use seq_lseek for sequential files
// - use default_llseek if we know we access f_pos
// - use noop_llseek if we know we don't access f_pos,
// but we still want to allow users to call lseek
//
@ open1 exists @
identifier nested_open;
@@
nested_open(...)
{
<+...
nonseekable_open(...)
...+>
}
@ open exists@
identifier open_f;
identifier i, f;
identifier open1.nested_open;
@@
int open_f(struct inode *i, struct file *f)
{
<+...
(
nonseekable_open(...)
|
nested_open(...)
)
...+>
}
@ read disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
<+...
(
*off = E
|
*off += E
|
func(..., off, ...)
|
E = *off
)
...+>
}
@ read_no_fpos disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
... when != off
}
@ write @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
<+...
(
*off = E
|
*off += E
|
func(..., off, ...)
|
E = *off
)
...+>
}
@ write_no_fpos @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
... when != off
}
@ fops0 @
identifier fops;
@@
struct file_operations fops = {
...
};
@ has_llseek depends on fops0 @
identifier fops0.fops;
identifier llseek_f;
@@
struct file_operations fops = {
...
.llseek = llseek_f,
...
};
@ has_read depends on fops0 @
identifier fops0.fops;
identifier read_f;
@@
struct file_operations fops = {
...
.read = read_f,
...
};
@ has_write depends on fops0 @
identifier fops0.fops;
identifier write_f;
@@
struct file_operations fops = {
...
.write = write_f,
...
};
@ has_open depends on fops0 @
identifier fops0.fops;
identifier open_f;
@@
struct file_operations fops = {
...
.open = open_f,
...
};
// use no_llseek if we call nonseekable_open
////////////////////////////////////////////
@ nonseekable1 depends on !has_llseek && has_open @
identifier fops0.fops;
identifier nso ~= "nonseekable_open";
@@
struct file_operations fops = {
... .open = nso, ...
+.llseek = no_llseek, /* nonseekable */
};
@ nonseekable2 depends on !has_llseek @
identifier fops0.fops;
identifier open.open_f;
@@
struct file_operations fops = {
... .open = open_f, ...
+.llseek = no_llseek, /* open uses nonseekable */
};
// use seq_lseek for sequential files
/////////////////////////////////////
@ seq depends on !has_llseek @
identifier fops0.fops;
identifier sr ~= "seq_read";
@@
struct file_operations fops = {
... .read = sr, ...
+.llseek = seq_lseek, /* we have seq_read */
};
// use default_llseek if there is a readdir
///////////////////////////////////////////
@ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier readdir_e;
@@
// any other fop is used that changes pos
struct file_operations fops = {
... .readdir = readdir_e, ...
+.llseek = default_llseek, /* readdir is present */
};
// use default_llseek if at least one of read/write touches f_pos
/////////////////////////////////////////////////////////////////
@ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read.read_f;
@@
// read fops use offset
struct file_operations fops = {
... .read = read_f, ...
+.llseek = default_llseek, /* read accesses f_pos */
};
@ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write.write_f;
@@
// write fops use offset
struct file_operations fops = {
... .write = write_f, ...
+ .llseek = default_llseek, /* write accesses f_pos */
};
// Use noop_llseek if neither read nor write accesses f_pos
///////////////////////////////////////////////////////////
@ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
identifier write_no_fpos.write_f;
@@
// write fops use offset
struct file_operations fops = {
...
.write = write_f,
.read = read_f,
...
+.llseek = noop_llseek, /* read and write both use no f_pos */
};
@ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write_no_fpos.write_f;
@@
struct file_operations fops = {
... .write = write_f, ...
+.llseek = noop_llseek, /* write uses no f_pos */
};
@ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
@@
struct file_operations fops = {
... .read = read_f, ...
+.llseek = noop_llseek, /* read uses no f_pos */
};
@ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
@@
struct file_operations fops = {
...
+.llseek = noop_llseek, /* no read or write fn */
};
===== End semantic patch =====
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Julia Lawall <julia@diku.dk>
Cc: Christoph Hellwig <hch@infradead.org>
Commit 0793448 "DMAENGINE: generic channel status v2" changed the interface for
how dma channel progress is retrieved. It inadvertently exported an internal
helper function ioat_tx_status() instead of ioat_dma_tx_status(). The latter
polls the hardware to get the latest completion state, while the helper just
evaluates the current state without touching hardware. The effect is that we
end up waiting for completion timeouts or descriptor allocation errors before
the completion state is updated.
iperf (before fix):
[SUM] 0.0-41.3 sec 364 MBytes 73.9 Mbits/sec
iperf (after fix):
[SUM] 0.0- 4.5 sec 499 MBytes 940 Mbits/sec
This is a regression starting with 2.6.35.
Cc: <stable@kernel.org>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Linus Walleij <linus.walleij@stericsson.com>
Cc: Maciej Sosnowski <maciej.sosnowski@intel.com>
Reported-by: Richard Scobie <richard@sauce.co.nz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The majority of drivers in drivers/dma/ will never establish cross
channel operation chains and do not need the extra overhead in struct
dma_async_tx_descriptor. Make channel switching opt-in by default.
Cc: Anatolij Gustschin <agust@denx.de>
Cc: Ira Snyder <iws@ovro.caltech.edu>
Cc: Linus Walleij <linus.walleij@stericsson.com>
Cc: Saeed Bishara <saeed@marvell.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This driver is currently implemented as a user to the old i.MX
DMA API. This allows us to convert each user of the old API to
the dmaengine API one by one. Once this is done the old DMA
driver can be merged into the i.MX dmaengine driver.
V2: remove some debug leftovers and unused variables
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The DMA slave control command was introduced in the 2.6.36 kernel; this
patch converts the intel-mid-dma driver to this new kernel slave
interface.
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Move the WARN_ONs to BUG_ONs, as a WARN_ON, if hit, can cause NULL pointer
dereferences.
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
For very high speed DMA, various peripheral devices need scatter-gather
list support. The DMA hardware supports linked list items, and the list can
also be circular (adding the new flag DMA_PREP_CIRCULAR_LIST).
Right now this flag is in the driver header and should eventually be moved
to the dmaengine header file.
Signed-off-by: Ramesh Babu K V <ramesh.b.k.v@intel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Allow DMAC2 to share its interrupt, since an exclusive interrupt line for
the mrst DMAC2 is not provided on other platforms.
Signed-off-by: Yong Wang <yong.y.wang@intel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The intel_mid_dma driver allows interrupt sharing. Thus it needs to check
whether the IRQ source is the DMA controller and return the appropriate
IRQ return value.
Signed-off-by: Yong Wang <yong.y.wang@intel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This patch adds runtime PM support in this DMA driver for the 4 PCI
controllers. Whenever the driver is idle (no channels grabbed), it can go
to a low power state. It also adds PCI suspend and resume support.
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Remove obsolete pre_transfer hook in stedma40_chan_cfg. The
intent of this hook is merely to handle burst size
compensation for ux500 variant MMCI. Remove obsolete stedma40_set_psize
since it is only called from pre_transfer. DMAEngine device_control
replaces the functionality of stedma40_set_psize.
Signed-off-by: Per Forlin <per.forlin@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Set burst for physical or logical channels respectively.
Convert the values in dma_cfg to dma reg bits
for physical or logical channels.
Signed-off-by: Per Forlin <per.forlin@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Fix some leaks of allocated descriptors in error paths.
Acked-by: Jonas Aaberg <jonas.aberg@stericsson.com>
Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Fix desc_get to alloc a descriptor from the cache if the ones in the
list are waiting for the ack. Also, memzero the descriptor when
allocated from the list to ensure all fields are cleared.
Acked-by: Jonas Aaberg <jonas.aberg@stericsson.com>
Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
clk_get returns an ERR_PTR.
Acked-by: Jonas Aaberg <jonas.aberg@stericsson.com>
Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The value in the array, not the index, specifies the channel to be
disabled.
Acked-by: Jonas Aaberg <jonas.aberg@stericsson.com>
Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Now that the DMAEngine API has support for scatterlist to scatterlist
copy, implement support for the STE DMA40 DMA controller.
Cc: Linus Walleij <linus.ml.walleij@gmail.com>
Acked-by: Per Fridén <per.friden@stericsson.com>
Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Now that the generic DMAEngine API has support for scatterlist to
scatterlist copying, the device_prep_slave_sg() portion of the
DMA_SLAVE API is no longer necessary and has been removed.
However, the device_control() portion of the DMA_SLAVE API is still
useful to control device specific parameters, such as externally
controlled DMA transfers and maximum burst length.
A special dma_ctrl_cmd has been added to enable externally controlled
DMA transfers. This is currently specific to the Freescale DMA
controller, but can easily be made generic when another user is found.
Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>