Both changes in dc97b3409a cause serious
regressions in the nouveau driver.
move_notify() was originally able to presume that bo->mem is the old node,
and new_mem is the new node. The above commit moves the call to
move_notify() to after move() has been done, which means that now, sometimes,
new_mem isn't the new node at all, bo->mem is, and new_mem points at a
stale, possibly-just-been-killed-by-move node.
This is clearly not a good situation. This patch reverts this change, and
replaces it with a cleanup in the move() failure path instead.
The second issue is that the call to move_notify() from cleanup_memtype_use()
causes the TTM ghost objects to get passed into the driver. This is clearly
bad as the driver knows nothing about these "fake" TTM BOs, and ends up
accessing uninitialised memory.
I worked around this in nouveau's move_notify() hook by ensuring the BO
destructor was nouveau's. I don't particularly like this solution, and
would rather TTM never pass the driver these objects. However, I don't
clearly understand the reason why we're calling move_notify() here anyway
and am happy to work around the problem in nouveau instead of breaking the
behaviour expected by other drivers.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Jerome Glisse <j.glisse@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
ttm tt rework modified the way we allocate and populate the
ttm_tt structure, the AGP side was missing some bit to properly
work. Fix those and fix radeon and nouveau AGP support.
Tested on radeon only so far.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
"drm/ttm: callback move_notify any time bo placement change v4" failed to
avoid a NULL pointer dereference in nouveau caused by move_notify being
expected to handle that case now.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Merge in the upstream tree to bring in the mainline fixes.
Conflicts:
drivers/gpu/drm/exynos/exynos_drm_fbdev.c
drivers/gpu/drm/nouveau/nouveau_sgdma.c
Previously we were calling back move_notify in error path when the
bo is returned to it's original position or when destroy the bo.
When destroying the bo set the new mem placement as NULL when calling
back in the driver.
Updating nouveau to deal with NULL placement properly.
v2: reserve the object before calling move_notify in bo destroy path
at that point ttm should be the only piece of code interacting
with the object so atomic_set is safe here.
v3: callback move notify only once the bo is in its new position
call move notify want swaping out the buffer
v4:- don't call move_notify when swapin out bo, assume driver should
do what is appropriate in swap notify
- move move_notify call back to ttm_bo_cleanup_memtype_use for
destroy path
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Provide helper function to compute the kernel memory size needed
for each buffer object. Move all the accounting inside ttm, simplifying
driver and avoiding code duplication accross them.
v2 fix accounting of ghost object, one would have thought that i
would have run into the issue since a longtime but it seems
ghost object are rare when you have plenty of vram ;)
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Move dma data to a superset ttm_dma_tt structure which herit
from ttm_tt. This allow driver that don't use dma functionalities
to not have to waste memory for it.
V2 Rebase on top of no memory account changes (where/when is my
delorean when i need it ?)
V3 Make sure page list is initialized empty
V4 typo/syntax fixes
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
If the card is capable of more than 32-bit, then use the default
TTM page pool code which allocates from anywhere in the memory.
Note: If the 'ttm.no_dma' parameter is set, the override is ignored
and the default TTM pool is used.
V2 use pci_set_consistent_dma_mask
V3 Rebase on top of no memory account changes (where/when is my
delorean when i need it ?)
CC: Ben Skeggs <bskeggs@redhat.com>
CC: Francisco Jerez <currojerez@riseup.net>
CC: Dave Airlie <airlied@redhat.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Move the page allocation and freeing to driver callback and
provide ttm code helper function for those.
Most intrusive change, is the fact that we now only fully
populate an object this simplify some of code designed around
the page fault design.
V2 Rebase on top of memory accounting overhaul
V3 New rebase on top of more memory accouting changes
V4 Rebase on top of no memory account changes (where/when is my
delorean when i need it ?)
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
ttm_backend will only exist with a ttm_tt, and ttm_tt
will only be of interest when bound to a backend. Merge them
to avoid code and data duplication.
V2 Rebase on top of memory accounting overhaul
V3 Rebase on top of more memory accounting changes
V4 Rebase on top of no memory account changes (where/when is my
delorean when i need it ?)
V5 make sure ttm is unbound before destroying, change commit
message on suggestion from Tormod Volden
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Fixes the framebuffer memory allocation failure seen on some
low-memory cards, followed by X refusing to start.
https://bugs.freedesktop.org/show_bug.cgi?id=42384
Reported-by: Chris Paulson-Ellis <chris@edesix.com>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This reverts commit dfadbbdb57.
Further upstream discussion between Marek and Thomas decided this wasn't
fully baked and needed further work, so revert it before it hits mainline.
Signed-off-by: Dave Airlie <airlied@redhat.com>
* 'drm-nouveau-next' of git://git.freedesktop.org/git/nouveau/linux-2.6: (353 commits)
drm/nouveau: remove allocations from gart populate() hook
drm/nvc0/fb: slightly improve PMFB intr handling, move out of nvc0_graph.c
drm/nvc0/fifo: avoid touching missing subfifos
drm/nvd9/disp: bail out of mode_set_base if no fb bound to crtc
drm/nvd9/disp: stub some more api hooks so we don't oops on resume
drm/nouveau: fix printk typo in ioremap failure path
drm/nvc0/pm: minor clock readback fixes
drm/nv40/pm: execute memory reset script from vbios
drm/nv50/gr: refactor initialisation
drm/nouveau: if requested, try harder at disabling sysmem pushbufs
drm/nv50/gr: enable ctxprog xfer only when we need it to save power
drm/nouveau/dp: add support for displayport table 0x30
drm/nouveau/dp: return master dp table pointer too when looking up encoder
drm/nouveau/bios: simplify U/d table hash matching func to just match
drm/nouveau/dp: preserve non-pattern bits in DP_TRAINING_PATTERN_SET
drm/nvc0/gr: remove MODULE_FIRMWARE() lines
drm/nouveau/dp: use alternate lane mask for nvaf
drm/nouveau/dp: link rate scripts are selected with a comparison table
drm/nv40/pm: write nv40-specific reclocking routines
drm/nv40/pm: parse geometric delta clock from vbios
...
Sometimes we want to know whether a buffer is busy and wait for it (bo_wait).
However, sometimes it would be more useful to be able to query whether
a buffer is busy and being either read or written, and wait until it's stopped
being either read or written. The point of this is to be able to avoid
unnecessary waiting, e.g. if a GPU has written something to a buffer and is now
reading that buffer, and a CPU wants to map that buffer for read, it needs to
only wait for the last write. If there were no write, there wouldn't be any
waiting needed.
This, or course, requires user space drivers to send read/write flags
with each relocation (like we have read/write domains in radeon, so we can
actually use those for something useful now).
Now how this patch works:
The read/write flags should passed to ttm_validate_buffer. TTM maintains
separate sync objects of the last read and write for each buffer, in addition
to the sync object of the last use of a buffer. ttm_bo_wait then operates
with one the sync objects.
Signed-off-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Greatly simplifies a number of things, particularly once per-client GPU
address spaces are involved.
May add this back later once I know what things'll look like.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Was previously assuming a page size of 4KiB unless a VMA was present to
override it. Eventually, a buffer won't necessarily have a VMA at all at
some stages of its life, so we need to store this info elsewhere.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
On chipsets using nouveau_vm, the virtual address stays constant, so
the value set at bo creation time is fine.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
* 'intel/drm-intel-next' of ../drm-next: (755 commits)
drm/i915: Only wait on a pending flip if we intend to write to the buffer
drm/i915/dp: Sanity check eDP existence
drm/i915: Rebind the buffer if its alignment constraints changes with tiling
drm/i915: Disable GPU semaphores by default
drm/i915: Do not overflow the MMADDR write FIFO
Revert "drm/i915: fix corruptions on i8xx due to relaxed fencing"
drm/i915: Don't save/restore hardware status page address register
drm/i915: don't store the reg value for HWS_PGA
drm/i915: fix memory corruption with GM965 and >4GB RAM
Linux 2.6.38-rc7
Revert "TPM: Long default timeout fix"
drm/i915: Re-enable GPU semaphores for SandyBridge mobile
drm/i915: Replace vblank PM QoS with "Interrupt-Based AGPBUSY#"
Revert "drm/i915: Use PM QoS to prevent C-State starvation of gen3 GPU"
drm/i915: Allow relocation deltas outside of target bo
drm/i915: Silence an innocuous compiler warning for an unused variable
fs/block_dev.c: fix new kernel-doc warning
ACPI: Fix build for CONFIG_NET unset
mm: <asm-generic/pgtable.h> must include <linux/mm_types.h>
x86: Use u32 instead of long to set reset vector back to 0
...
Conflicts:
drivers/gpu/drm/i915/i915_gem.c
Somehow fixes a misrendering + hang at GDM startup on my NVA8...
My first guess would have been stale TLB entries laying around that a new
bo then accidentally inherits. That doesn't make a great deal of sense
however, as when we mapped the pages for the new bo the TLBs would've
gotten flushed anyway.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The immediate benefit of doing this is that on NV50 and up, the GPU
virtual address of any buffer is now constant, regardless of what
memtype they're placed in.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Upcoming patches are going to enable full support for buffers that keep
a constant GPU virtual address whenever they're validated for use by
the GPU.
In order for this to work properly while keeping support for large pages,
we need to know if it's ever going to be possible for a buffer to end
up in GART, and if so, disable large pages for the buffer's VMA.
This is a new restriction that's not present in earlier kernel's, but
should not break userspace as the current code never attempts to validate
buffers into a memtype other than it was created with.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
'mappable' isn't really used at all, nor is it necessary anymore as the
bo code is capable of moving buffers to mappable vram as required.
'no_vm' isn't necessary anymore either, any places that don't want to be
mapped into a GPU address space should allocate the VRAM directly instead.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
In preparation for the addition of a new nv40 backend, we'll need to be
able to distinguish between a paged dma object and the on-chip GART.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
We free the temporary binding before leaving this function, so we also have
to wait for the move to actually complete.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reported-by: Alex Buell <alex.buell@munted.org.uk>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
As of this commit, it's guaranteed that if an object is in VRAM that its
GPU virtual address will be constant.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This is required on nv50 as we need to be able to have more precise control
over physical VRAM allocations to avoid buffer corruption when using
buffers of mixed memory types.
This removes some nasty overallocation/alignment that we were previously
using to "control" this problem.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The point is to share more code between the PFB/PGRAPH tile region
hooks, and give the hardware specific functions a chance to allocate
per-region resources.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>