Commit Graph

499 Commits (3a2fd4a14112452eb5c1a079ac8b3f4842762afe)

Author SHA1 Message Date
Chris Wilson b5ffc9bc38 drm/i915: Introduce i915_gem_object_finish_gtt()
Like its siblings finish_gpu(), this function clears the object from the
GTT domain forcing it to be trigger a domain invalidation should we ever
need to use via the GTT again.

Note that the most important side-effect of finishing the GTT domain
(aside from clearing the tracking read/write domains) is that it imposes
an memory barrier so that all accesses are complete before it returns,
which is important if you intend to be modifying translation tables
shortly afterwards. The second most important side-effect is that it
tears down the GTT mappings forcing a page-fault and invalidation on
next user access to the object.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2011-06-09 21:51:14 -07:00
Chris Wilson a8198eea15 drm/i915: Introduce i915_gem_object_finish_gpu()
... reincarnated from i915_gem_object_flush_gpu(). The semantic
difference is that after calling finish_gpu() the object no longer
resides in any GPU domain, and so will cause the GPU caches to be
invalidated if it is ever used again.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2011-06-09 11:43:47 -07:00
Daniel Vetter c8ebc2b076 drm/915: fix relaxed tiling on gen2: tile height
A tile on gen2 has a size of 2kb, stride of 128 bytes and 16 rows.

Userspace was broken and assumed 8 rows. Chris Wilson noted that the
kernel unfortunately can't reliable check that because libdrm rounds
up the size to the next bucket.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-06-04 10:41:12 -07:00
Chris Wilson c8cbbb8ba9 drm/i915: s/addr & ~PAGE_MASK/offset_in_page(addr)/
Convert our open coded offset_in_page() to the common macro.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-06-04 10:40:42 -07:00
Ying Han 1495f230fa vmscan: change shrinker API by passing shrink_control struct
Change each shrinker's API by consolidating the existing parameters into
shrink_control struct.  This will simplify any further features added w/o
touching each file of shrinker.

[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: fix warning]
[kosaki.motohiro@jp.fujitsu.com: fix up new shrinker API]
[akpm@linux-foundation.org: fix xfs warning]
[akpm@linux-foundation.org: update gfs2]
Signed-off-by: Ying Han <yinghan@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:26 -07:00
Eric Anholt 25aebfc30b drm/i915: Add support for fence registers on Ivybridge.
The registers are the same as on Sandybridge.  Fixes scrambled display
in X when it does software drawing to the GTT, and scans the results
out as tiled.

Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-05-13 18:12:51 -07:00
Eric Anholt 10ed13e4a5 drm/i915: Use existing function instead of open-coding fence reg clear.
This is once less place to miss a new INTEL_INFO(dev)->gen update now.

Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-05-13 18:12:50 -07:00
Chris Wilson 9c23f7fc4c drm/i915: Do not clflush snooped objects
Rely on the GPU snooping into the CPU cache for appropriately bound
objects on MI_FLUSH. Or perhaps one day we will have a cache-coherent
CPU/GPU package...

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-05-10 13:56:44 -07:00
Chris Wilson 93dfb40cd8 drm/i915: Rename agp_type to cache_level
... to clarify just how we use it inside the driver and remove the
confusion of the poorly matching agp_type names. We still need to
translate through agp_type for interface into the fake AGP driver.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-05-10 13:56:43 -07:00
Chris Wilson f6e47884e7 drm/i915: Avoid unmapping pages from a NULL address space
Found by gem_stress.

As we perform retirement from a workqueue, it is possible for us to free
and unbind objects after the last close on the device, and so after the
address space has been torn down and reset to NULL:

BUG: unable to handle kernel NULL pointer dereference at 00000054
IP: [<c1295a20>] mutex_lock+0xf/0x27
*pde = 00000000
Oops: 0002 [#1] SMP
last sysfs file: /sys/module/vt/parameters/default_utf8

Pid: 5, comm: kworker/u:0 Not tainted 2.6.38+ #214
EIP: 0060:[<c1295a20>] EFLAGS: 00010206 CPU: 1
EIP is at mutex_lock+0xf/0x27
EAX: 00000054 EBX: 00000054 ECX: 00000000 EDX: 00012fff
ESI: 00000028 EDI: 00000000 EBP: f706fe20 ESP: f706fe18
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process kworker/u:0 (pid: 5, ti=f706e000 task=f7060d00 task.ti=f706e000)
Stack:
 f5aa3c60 00000000 f706fe74 c107e7df 00000246 dea55380 00000054 f5aa3c60
 f706fe44 00000061 f70b4000 c13fff84 00000008 f706fe54 00000000 00000000
 00012f00 00012fff 00000028 c109e575 f6b36700 00100000 00000000 f706fe90
Call Trace:
 [<c107e7df>] unmap_mapping_range+0x7d/0x1e6
 [<c109e575>] ? mntput_no_expire+0x52/0xb6
 [<c11c12f6>] i915_gem_release_mmap+0x49/0x58
 [<c11c3449>] i915_gem_object_unbind+0x4c/0x125
 [<c11c353f>] i915_gem_free_object_tail+0x1d/0xdb
 [<c11c55a2>] i915_gem_free_object+0x3d/0x41
 [<c11a6be2>] ? drm_gem_object_free+0x0/0x27
 [<c11a6c07>] drm_gem_object_free+0x25/0x27
 [<c113c3ca>] kref_put+0x39/0x42
 [<c11c0a59>] drm_gem_object_unreference+0x16/0x18
 [<c11c0b15>] i915_gem_object_move_to_inactive+0xba/0xbe
 [<c11c0c87>] i915_gem_retire_requests_ring+0x16e/0x1a5
 [<c11c3645>] i915_gem_retire_requests+0x48/0x63
 [<c11c36ac>] i915_gem_retire_work_handler+0x4c/0x117
 [<c10385d1>] process_one_work+0x140/0x21b
 [<c103734c>] ? __need_more_worker+0x13/0x2a
 [<c10373b1>] ? need_to_create_worker+0x1c/0x35
 [<c11c3660>] ? i915_gem_retire_work_handler+0x0/0x117
 [<c1038faf>] worker_thread+0xd4/0x14b
 [<c1038edb>] ? worker_thread+0x0/0x14b
 [<c103be1b>] kthread+0x68/0x6d
 [<c103bdb3>] ? kthread+0x0/0x6d
 [<c12970f6>] kernel_thread_helper+0x6/0x10
Code: 00 e8 98 fe ff ff 5d c3 55 89 e5 3e 8d 74 26 00 ba 01 00 00 00 e8
84 fe ff ff 5d c3 55 89 e5 53 8d 64 24 fc 3e 8d 74 26 00 89 c3 <f0> ff
08 79 05 e8 ab ff ff ff 89 e0 25 00 e0 ff ff 89 43 10 58
EIP: [<c1295a20>] mutex_lock+0xf/0x27 SS:ESP 0068:f706fe18
CR2: 0000000000000054

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Keith Packard <keithp@keithp.com>
2011-03-23 09:17:03 +00:00
Chris Wilson 26e12f8943 drm/i915: Fix use after free within tracepoint
Detected by scripts/coccinelle/free/kfree.cocci.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Keith Packard <keithp@keithp.com>
2011-03-23 09:17:02 +00:00
Chris Wilson 36d527dead drm/i915: Restore missing command flush before interrupt on BLT ring
We always skipped flushing the BLT ring if the request flush did not
include the RENDER domain. However, this neglects that we try to flush
the COMMAND domain after every batch and before the breadcrumb interrupt
(to make sure the batch is indeed completed prior to the interrupt
firing and so insuring CPU coherency). As a result of the missing flush,
incoherency did indeed creep in, most notable when using lots of command
buffers and so potentially rewritting an active command buffer (i.e.
the GPU was still executing from it even though the following interrupt
had already fired and the request/buffer retired).

As all ring->flush routines now have the same preconditions, de-duplicate
and move those checks up into i915_gem_flush_ring().

Fixes gem_linear_blit.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=35284
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Tested-by: mengmeng.meng@intel.com
2011-03-23 09:17:01 +00:00
Chris Wilson ed0291fd16 drm/i915: Fix computation of pitch for dumb bo creator
Cc: Dave Airlie <airlied@linux.ie>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-03-23 09:17:00 +00:00
Chris Wilson 29c5a58728 drm/i915: Fix tiling corruption from pipelined fencing
... even though it was disabled. A mistake in the handling of fence reuse
caused us to skip the vital delay of waiting for the object to finish
rendering before changing the register. This resulted in us changing the
fence register whilst the bo was active and so causing the blits to
complete using the wrong stride or even the wrong tiling. (Visually the
effect is that small blocks of the screen look like they have been
interlaced). The fix is to wait for the GPU to finish using the memory
region pointed to by the fence before changing it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34584
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[Note for 2.6.38-stable, we need to reintroduce the interruptible passing]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Dave Airlie <airlied@linux.ie>
2011-03-23 09:12:24 +00:00
Herton Ronaldo Krzesinski 09bfa51773 drm/i915: Prevent racy removal of request from client list
When i915_gem_retire_requests_ring calls i915_gem_request_remove_from_client,
the client_list for that request may already be removed in i915_gem_release.
So we may call twice list_del(&request->client_list), resulting in an
oops like this report:

[126167.230394] BUG: unable to handle kernel paging request at 00100104
[126167.230699] IP: [<f8c2ce44>] i915_gem_retire_requests_ring+0xd4/0x240 [i915]
[126167.231042] *pdpt = 00000000314c1001 *pde = 0000000000000000
[126167.231314] Oops: 0002 [#1] SMP
[126167.231471] last sysfs file: /sys/devices/LNXSYSTM:00/device:00/PNP0C0A:00/power_supply/BAT1/current_now
[126167.231901] Modules linked in: snd_seq_dummy nls_utf8 isofs btrfs zlib_deflate libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs exportfs reiserfs cryptd aes_i586 aes_generic binfmt_misc vboxnetadp vboxnetflt vboxdrv parport_pc ppdev snd_hda_codec_hdmi snd_hda_codec_conexant snd_hda_intel snd_hda_codec snd_hwdep arc4 snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq uvcvideo videodev snd_timer snd_seq_device joydev iwlagn iwlcore mac80211 snd cfg80211 soundcore i915 drm_kms_helper snd_page_alloc psmouse drm serio_raw i2c_algo_bit video lp parport usbhid hid sky2 sdhci_pci ahci sdhci libahci
[126167.232018]
[126167.232018] Pid: 1101, comm: Xorg Not tainted 2.6.38-6-generic-pae #34-Ubuntu Gateway                          MC7833U /
[126167.232018] EIP: 0060:[<f8c2ce44>] EFLAGS: 00213246 CPU: 0
[126167.232018] EIP is at i915_gem_retire_requests_ring+0xd4/0x240 [i915]
[126167.232018] EAX: 00200200 EBX: f1ac25b0 ECX: 00000040 EDX: 00100100
[126167.232018] ESI: f1a2801c EDI: e87fc060 EBP: ef4d7dd8 ESP: ef4d7db0
[126167.232018]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[126167.232018] Process Xorg (pid: 1101, ti=ef4d6000 task=f1ba6500 task.ti=ef4d6000)
[126167.232018] Stack:
[126167.232018]  f1a28000 f1a2809c f1a28094 0058bd97 f1aa2400 f1a2801c 0058bd7b 0058bd85
[126167.232018]  f1a2801c f1a28000 ef4d7e38 f8c2e995 ef4d7e30 ef4d7e60 c14d1ebc f6b3a040
[126167.232018]  f1522cc0 000000db 00000000 f1ba6500 ffffffa1 00000000 00000001 f1a29214
[126167.232018] Call Trace:

Unfortunately the call trace reported was cut, but looking at debug
symbols the crash is at __list_del, when probably list_del is called
twice on the same request->client_list, as the dereferenced value is
LIST_POISON1 + 4, and by looking more at the debug symbols before
list_del call it should have being called by
i915_gem_request_remove_from_client

And as I can see in the code, it seems we indeed have the possibility
to remove a request->client_list twice, which would cause the above,
because we do list_del(&request->client_list) on both
i915_gem_request_remove_from_client and i915_gem_release

As Chris Wilson pointed out, it's indeed the case:
"(...) I had thought that the actual insertion/deletion was serialised
under the struct mutex and the intention of the spinlock was to protect
the unlocked list traversal during throttling. However, I missed that
i915_gem_release() is also called without struct mutex and so we do need
the double check for i915_gem_request_remove_from_client()."

This change does the required check to avoid the duplicate remove of
request->client_list.

Bugzilla: http://bugs.launchpad.net/bugs/733780
Cc: stable@kernel.org # 2.6.38
Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-03-23 06:41:12 +00:00
Dave Airlie 34db18abd3 Merge remote branch 'intel/drm-intel-next' of ../drm-next into drm-core-next
* 'intel/drm-intel-next' of ../drm-next: (755 commits)
  drm/i915: Only wait on a pending flip if we intend to write to the buffer
  drm/i915/dp: Sanity check eDP existence
  drm/i915: Rebind the buffer if its alignment constraints changes with tiling
  drm/i915: Disable GPU semaphores by default
  drm/i915: Do not overflow the MMADDR write FIFO
  Revert "drm/i915: fix corruptions on i8xx due to relaxed fencing"
  drm/i915: Don't save/restore hardware status page address register
  drm/i915: don't store the reg value for HWS_PGA
  drm/i915: fix memory corruption with GM965 and >4GB RAM
  Linux 2.6.38-rc7
  Revert "TPM: Long default timeout fix"
  drm/i915: Re-enable GPU semaphores for SandyBridge mobile
  drm/i915: Replace vblank PM QoS with "Interrupt-Based AGPBUSY#"
  Revert "drm/i915: Use PM QoS to prevent C-State starvation of gen3 GPU"
  drm/i915: Allow relocation deltas outside of target bo
  drm/i915: Silence an innocuous compiler warning for an unused variable
  fs/block_dev.c: fix new kernel-doc warning
  ACPI: Fix build for CONFIG_NET unset
  mm: <asm-generic/pgtable.h> must include <linux/mm_types.h>
  x86: Use u32 instead of long to set reset vector back to 0
  ...

Conflicts:
	drivers/gpu/drm/i915/i915_gem.c
2011-03-14 14:15:13 +10:00
Chris Wilson 47ae63e0c2 Merge branch 'drm-intel-fixes' into drm-intel-next
Apply the trivial conflicting regression fixes, but keep GPU semaphores
enabled.

Conflicts:
	drivers/gpu/drm/i915/i915_drv.h
	drivers/gpu/drm/i915/i915_gem_execbuffer.c
2011-03-07 12:35:15 +00:00
Chris Wilson 467cffba85 drm/i915: Rebind the buffer if its alignment constraints changes with tiling
Early gen3 and gen2 chipset do not have the relaxed per-surface tiling
constraints of the later chipsets, so we need to check that the GTT
alignment is correct for the new tiling. If it is not, we need to
rebind.

Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-03-07 11:02:16 +00:00
Chris Wilson ce453d81cb drm/i915: Use a device flag for non-interruptible phases
The code paths for modesetting are growing in complexity as we may need
to move the buffers around in order to fit the scanout in the aperture.
Therefore we face a choice as to whether to thread the interruptible status
through the entire pinning and unbinding code paths or to add a flag to
the device when we may not be interrupted by a signal. This does the
latter and so fixes a few instances of modesetting failures under stress.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-22 15:56:25 +00:00
Chris Wilson c872522663 drm/i915: Protect against drm_gem_object not being the first member
Dave Airlie spotted that we had a potential bug should we ever rearrange
the drm_i915_gem_object so not the base drm_gem_object was not its first
member. He noticed that we often convert the return of
drm_gem_object_lookup() immediately into drm_i915_gem_object and then
check the result for nullity. This is only valid when the base object is
the first member and so the superobject has the same address. Play safe
instead and use the compiler to convert back to the original return
address for sanity testing.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-22 15:55:57 +00:00
Chris Wilson bed636abea drm/i915: i915_mutex_interruptible() returns -EINTR
... so we handle that for i915_gem_fault() in the same manner as
ERESTARTSYS, or we send a SIGBUS to the faulting application.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-11 20:32:44 +00:00
Chris Wilson 8d7e3de1e0 drm/i915: Skip the no-op domain changes when already in CPU|GTT domains
Removes some superfluous fluff from tracing...

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-07 15:24:03 +00:00
Chris Wilson db53a30261 drm/i915: Refine tracepoints
A lot of minor tweaks to fix the tracepoints, improve the outputting for
ftrace, and to generally make the tracepoints useful again. It is a start
and enough to begin identifying performance issues and gaps in our
coverage.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-07 14:59:18 +00:00
Chris Wilson d9bc7e9f32 drm/i915: Fix infinite loop regression from 21dd3734
By returning EAGAIN upon a wedged GPU before attempting to wait, we
would hit an infinite loop of repeating operation without ever
progressing. Instead this needs to be EIO so that userspace knows that
the GPU is truly wedged and not in the process of error recovery.

Similarly, we need to handle the error recovery during i915_gem_fault.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-07 14:33:55 +00:00
Dave Airlie ff72145bad drm: dumb scanout create/mmap for intel/radeon (v3)
This is just an idea that might or might not be a good idea,
it basically adds two ioctls to create a dumb and map a dumb buffer
suitable for scanout. The handle can be passed to the KMS ioctls to create
a framebuffer.

It looks to me like it would be useful in the following cases:
a) in development drivers - we can always provide a shadowfb fallback.
b) libkms users - we can clean up libkms a lot and avoid linking
to libdrm_*.
c) plymouth via libkms is a lot easier.

Userspace bits would be just calls + mmaps. We could probably
mark these handles somehow as not being suitable for acceleartion
so as top stop people who are dumber than dumb.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-02-07 12:16:14 +10:00
Chris Wilson 21dd373486 drm/i915: Defer reporting EIO until we try to use the GPU
Instead of reporting EIO upfront in the entrance of an ioctl that may or
may not attempt to use the GPU, defer the actual detection of an invalid
ioctl to when we issue a GPU instruction. This allows us to continue to
use bo in video memory (via pread/pwrite and mmap) after the GPU has hung.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-27 11:06:07 +00:00
Chris Wilson e110e8d672 drm/i915: Check wedged status before throttling
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-27 11:05:51 +00:00
Chris Wilson 29ee399131 drm/i915: Silence a few -Wunused-but-set-variable
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-25 10:33:11 +00:00
Chris Wilson bee4a186c1 drm/i915,agp/intel: Do not clear stolen entries
We can only utilize the stolen portion of the GTT if we are in sole
charge of the hardware. This is only true if using GEM and KMS,
otherwise VESA continues to access stolen memory.

Reported-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-24 18:26:25 +00:00
Chris Wilson 076e2c0eb8 drm/i915: Fix use of invalid array size for ring->sync_seqno
There are I915_NUM_RINGS-1 inter-ring synchronisation counters, but we
were clearing I915_NUM_RINGS of them. Oops.

Reported-by: Jiri Slaby <jirislaby@gmail.com>
Tested-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-23 12:52:11 +00:00
Chris Wilson 809b63349c drm/i915: If we hit OOM when allocating GTT pages, clear the aperture
Rather than evicting an object at random, which is unlikely to alleviate
the memory pressure sufficient to allow us to continue, zap the entire
aperture. That should give the system long enough to recover and reap
some pages from the evicted objects, forestalling the allocation error
for the new object.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-11 22:55:48 +00:00
Chris Wilson 0a58705b2f drm/i915: Periodically flush the active lists and requests
In order to retire active buffers whilst no client is active, we need to
insert our own flush requests onto the ring.

This is useful for servers that queue up some rendering and then go to
sleep as it allows us to the complete processing of those requests,
potentially making that memory available again much earlier.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-11 22:15:30 +00:00
Chris Wilson 882417851a drm/i915: Propagate error from flushing the ring
... in order to avoid a BUG() and potential unbounded waits.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-11 20:44:50 +00:00
Chris Wilson b72f3acb71 drm/i915: Handle ringbuffer stalls when flushing
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-11 20:43:55 +00:00
Chris Wilson 63256ec534 drm/i915: Enforce write ordering through the GTT
We need to ensure that writes through the GTT land before any
modification to the MMIO registers and so must impose a mandatory write
barrier when flushing the GTT domain. This was revealed by relaxing the
write ordering by experimentally mapping the registers and the GATT as
write-combining.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-11 20:42:53 +00:00
Chris Wilson 72bfa19c8d drm/i915: Allow the application to choose the constant addressing mode
The relative-to-general state default is useless as it means having to
rewrite the streaming kernels for each batch. Relative-to-surface is
more useful, as that stream usually needs to be rewritten for each
batch. And absolute addressing mode, vital if you start streaming
state, is also only available by adjusting the register...

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-20 09:41:36 +00:00
Chris Wilson b5ba177d8d drm/i915: Poll for seqno completion if IRQ is disabled
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32288
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-14 12:19:25 +00:00
Chris Wilson b13c2b96bf drm/i915/ringbuffer: Make IRQ refcnting atomic
In order to enforce the correct memory barriers for irq get/put, we need
to perform the actual counting using atomic operations.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-14 11:34:46 +00:00
Chris Wilson 1a1c69762a Merge branch 'drm-intel-fixes' into drm-intel-next
Conflicts:
	drivers/gpu/drm/i915/i915_gem.c
	drivers/gpu/drm/i915/intel_dp.c
2010-12-07 23:02:08 +00:00
Chris Wilson 7a1948768c drm/i915: Emit a request to clear a flushed and idle ring for unbusy bo
In order for bos to retire eventually, a request must be sent down the
ring. This is expected, for example, by occlusion queries for which mesa
will wait upon (whilst running glean) before issuing more batches and so
the normal activity upon the ring is suspended and we need to emit a
request to clear the idle ring.

Reported-by: Jinjin, Wang <jinjin.wang@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30380
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-07 10:59:14 +00:00
Chris Wilson 0be732841f drm/i915: Wait for the bo if a display flip is pipelined on the other ring
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-06 14:37:27 +00:00
Chris Wilson 0ac74c6b33 drm/i915: Only emit a flush if there is an outstanding gpu write
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-06 14:36:02 +00:00
Chris Wilson 6bda10d152 drm/i915: Completely disable fence pipelining.
I'm still seeing tiling corruption of PutImage and CopyArea (I think)
under mutter on pnv, so obviously the pipelining logic is deeply flawed.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-05 23:19:37 +00:00
Chris Wilson 1ec14ad313 drm/i915: Implement GPU semaphores for inter-ring synchronisation on SNB
The bulk of the change is to convert the growing list of rings into an
array so that the relationship between the rings and the semaphore sync
registers can be easily computed.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-05 00:37:38 +00:00
Chris Wilson 60de2ba51e drm/i915: Kill the get_fence tracepoint
As the tracepoint is now decoupled from when the actual register is
assigned and was never complemented by detailing when the object lost
its fence, it has outlived its limited usefulness. Profiling the actual
stalls is a far more profitable venture anyway.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-02 10:20:47 +00:00
Chris Wilson c6748e09ee drm/i915: Remove inactive LRU tracking from set_domain_ioctl
As the userspace mappings are torn down on every GPU write, we prefer to
track when the buffer is activated (via a fresh i915_gem_fault). This
makes the LRU conceptually simpler. With coherent mappings, the
remaining use-case for set_domain_ioctl is GPU synchronisation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-02 10:16:30 +00:00
Chris Wilson d9e86c0ee6 drm/i915: Pipelined fencing [infrastructure]
With this change, every batchbuffer can use all available fences (save
pinned and scanout, of course) without ever stalling the gpu!

In theory. Currently the actual pipelined update of the register is
disabled due to some stability issues. However, just the deferred update
is a significant win.

Based on a series of patches by Daniel Vetter.

The premise is that before every access to a buffer through the GTT we
have to declare whether we need a register or not. If the access is by
the GPU, a pipelined update to the register is made via the ringbuffer,
and we track the last seqno of the batches that access it. If by the
CPU we wait for the last GPU access and update the register (either
to clear or to set it for the current buffer).

One advantage of being able to pipeline changes is that we can defer the
actual updating of the fence register until we first need to access the
object through the GTT, i.e. we can eliminate the stall on set_tiling.
This is important as the userspace bo cache does not track the tiling
status of active buffers which generate frequent stalls on gen3 when
enabling tiling for an already bound buffer.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2010-12-02 10:07:05 +00:00
Chris Wilson 87ca9c8a7e drm/i915: Prevent stalling for a GTT read back from a read-only GPU target
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-02 10:00:15 +00:00
Chris Wilson 7d2cb39c33 drm/i915: Release fenced GTT mapping on suspend
... so that upon first use after resume we will reacquire the fence reg.

Reported-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-28 16:12:15 +00:00
Chris Wilson 3619df035e Merge branch 'drm-intel-fixes' into drm-intel-next
Conflicts:
	drivers/gpu/drm/i915/i915_gem.c
2010-11-28 15:37:17 +00:00
Daniel Vetter de18a29e0f drm/i915: fix regression due to ba3d8d749b
We don't track gpu flush request in any special way. So even with
obj->write_domain == 0, a gpu flush might be outstanding but no
yet executed. Even worse, the latest request might use the object
only for reading. So and unconditional call to object_wait_rendering
is needed for !pipelined.

Hence revert that patch fully and untangle the flushing from the
synchronization again.

Reported-by: Keith Packard <keithp@keithp.com>
Tested-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-28 09:05:12 +00:00
Chris Wilson 432e58edc9 drm/i915: Avoid allocation for execbuffer object list
Besides the minimal improvement in reducing the execbuffer overhead, the
real benefit is clarifying a few routines.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-25 21:19:26 +00:00
Chris Wilson 54cf91dc4e drm/i915: Split i915_gem_execbuffer into its own file.
A number of dragons have been seen lurking within the execbuffer code.
The first step is then to isolate them from the rest and begin to
scrutinise them in depth. Suggested by Daniel Vetter.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-25 21:19:25 +00:00
Chris Wilson 6299f992c0 drm/i915: Defer accounting until read from debugfs
Simply remove our accounting of objects inside the aperture, keeping
only track of what is in the aperture and its current usage. This
removes the over-complication of BUGs that were attempting to keep the
accounting correct and also removes the overhead of the accounting on
the hot-paths.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-25 15:04:53 +00:00
Chris Wilson 2021746e1d drm/i915: Mark a few functions as __must_check
... to benefit from the compiler checking that we remember to handle
and propagate errors.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-25 15:04:04 +00:00
Chris Wilson 312817a39f drm/i915: Only save and restore fences for UMS
With KMS, we can simply relinquish the fence when we idle the GPU and
reassign it upon first use.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-25 15:03:22 +00:00
Daniel Vetter c6642782b9 drm/i915: Add a mechanism for pipelining fence register updates
Not employed just yet...

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-25 15:01:39 +00:00
Chris Wilson caea7476d4 drm/i915: More accurately track last fence usage by the GPU
Based on a patch by Daniel Vetter.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-24 13:30:52 +00:00
Chris Wilson a7a09aebe8 drm/i915: Rework execbuffer pinning
Avoid evicting buffers that will be used later in the batch in order to
make room for the initial buffers by pinning all bound buffers in a
single pass before binding (and evicting for) fresh buffer.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-24 13:30:51 +00:00
Chris Wilson 919926aeb3 drm/i915: Thread the pipelining ring through the callers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-23 20:19:16 +00:00
Chris Wilson dddbc0e525 drm/i915: Remove a defunct BUG_ON
This used to check the precondition that all fences were to be located
in a mappable area, redundant now as those two parameters are combined
into one.

After pinning, we assert that the buffer is bound into the desired
region.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-23 20:19:15 +00:00
Chris Wilson b6913e4bdb drm/i915: Move the implementation details of PIPE_CONTROL to the ringbuffer
The pipe control object is allocated by the device for the sole use of the
render ringbuffer. Move this detail from the general code to the render
ring buffer initialisation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-23 20:19:14 +00:00
Chris Wilson 92b88aeb1a drm/i915: Not all mappable regions require GTT fence regions
Combining map_and_fenceable revealed a bug in
i915_gem_object_gtt_size() in that it always computed the appropriate
fence size for the object regardless of tiling state which caused us to
over-allocate linear buffers when binding to the GTT.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-23 20:19:13 +00:00
Chris Wilson 05394f3975 drm/i915: Use drm_i915_gem_object as the preferred type
A glorified s/obj_priv/obj/ with a net reduction of over a 100 lines and
many characters!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-23 20:19:10 +00:00
Daniel Vetter 7c2e6fdf45 drm/i915: move gtt handling to i915_gem_gtt.c
No more drm_*_agp in i915_gem.c!

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-23 20:14:47 +00:00
Daniel Vetter 93a37f20ea drm/i915: track objects in the gtt
This is required to restore gtt mappings on resume when agp is gone.

The right way to do this would be to make sturct drm_mm_node embeddable
and use the allocation list maintained by the drm memory manager. But
that's a bigger project. Getting rid of the per bo agp_mem will save
more memory than this wastes, anyway.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-23 20:14:45 +00:00
Daniel Vetter 40ce657510 drm/i915/gtt: call chipset flush directly
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-23 20:14:44 +00:00
Daniel Vetter 23ed992a5e drm/i915|intel-gtt: consolidate intel-gtt.h headers
... and a few other defines.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-23 20:14:43 +00:00
Chris Wilson e384eafc1c Merge branch 'drm-intel-fixes' into drm-intel-next 2010-11-23 20:13:13 +00:00
Chris Wilson bcf50e2775 drm/i915: Handle pagefaults in execbuffer user relocations
Currently if we hit a pagefault when applying a user relocation for the
execbuffer, we bail and return EFAULT to the application. Instead, we
need to unwind, drop the dev->struct_mutex, copy all the relocation
entries to a vmalloc array (to avoid any potential circular deadlocks
when resolving the pagefault), retake the mutex and then apply the
relocations.  Afterwards, we need to again drop the lock and copy the
vmalloc array back to userspace.

v2: Incorporate feedback from Daniel Vetter.

Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2010-11-23 20:11:43 +00:00
Chris Wilson e624ae8e0d Merge branch 'drm-intel-fixes' into drm-intel-next
Conflicts:
	drivers/gpu/drm/i915/i915_gem.c
2010-11-22 08:51:36 +00:00
Chris Wilson d1d788302e drm/i915: Prevent integer overflow when validating the execbuffer
Commit 2549d6c2 removed the vmalloc used for temporary storage of the
relocation lists used during execbuffer. However, our use of vmalloc was
being protected by an integer overflow check which we do want to
preserve!

Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-21 09:30:58 +00:00
Chris Wilson 51311d0a5c drm/i915: Do not hold mutex when faulting in user addresses
Linus Torvalds found that it was rather trivial to trigger a system
freeze:

  In fact, with lockdep, I don't even need to do the sysrq-d thing: it
  shows the bug as it happens. It's the X server taking the same lock
  recursively.

  Here's the problem:

    =============================================
    [ INFO: possible recursive locking detected ]
    2.6.37-rc2-00012-gbdbd01a #7
    ---------------------------------------------
    Xorg/2816 is trying to acquire lock:
     (&dev->struct_mutex){+.+.+.}, at: [<ffffffff812c626c>] i915_gem_fault+0x50/0x17e

    but task is already holding lock:
     (&dev->struct_mutex){+.+.+.}, at: [<ffffffff812c403b>] i915_mutex_lock_interruptible+0x28/0x4a

    other info that might help us debug this:
    2 locks held by Xorg/2816:
     #0:  (&dev->struct_mutex){+.+.+.}, at: [<ffffffff812c403b>] i915_mutex_lock_interruptible+0x28/0x4a
     #1:  (&mm->mmap_sem){++++++}, at: [<ffffffff81022d4f>] page_fault+0x156/0x37b

This recursion was introduced by rearranging the locking to avoid the
double locking on the fast path (4f27b5d and fbd5a26d) and the
introduction of the prefault to encourage the fast paths (b5e4f2b). In
order to undo the problem, we rearrange the code to perform the access
validation upfront, attempt to prefault and then fight for control of the
mutex.  the best case scenario where the mutex is uncontended the
prefaulting is not wasted.

Reported-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-19 09:30:15 +00:00
Chris Wilson c94f28c383 Merge branch 'drm-intel-fixes' into drm-intel-next
Conflicts:
	drivers/gpu/drm/i915/i915_gem.c
	drivers/gpu/drm/i915/intel_ringbuffer.c
2010-11-15 06:49:30 +00:00
Chris Wilson 1bb95834bb Merge remote branch 'airlied/drm-fixes' into drm-intel-fixes 2010-11-15 06:33:11 +00:00
Daniel Vetter 5e78330126 drm/i915: fix relaxed tiling for gen <= 3 && !g33
g33/pineview doesn't have any alignment constrains for unfenced tiled
buffers. But older chips have. Fix this.

Problem introduced in a00b10c360.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2010-11-15 05:22:16 +00:00
Chris Wilson 85345517fe drm/i915: Retire any pending operations on the old scanout when switching
An old and oft reported bug, is that of the GPU hanging on a
MI_WAIT_FOR_EVENT following a mode switch. The cause is that the GPU is
waiting on a scanline counter on an inactive pipe, and so waits for a
very long time until eventually the user reboots his machine.

We can prevent this either by moving the WAIT into the kernel and
thereby incurring considerable cost on every swapbuffers, or by waiting
for the GPU to retire the last batch that accesses the framebuffer
before installing a new one. As mode switches are much rarer than swap
buffers, this looks like an easy choice.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=28964
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=29252
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@kernel.org
2010-11-13 09:49:11 +00:00
Chris Wilson 5d97eb69bd drm/i915: Only add the lazy request if we end up waiting for it.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-10 20:41:16 +00:00
Joe Perches fce7d61be0 drivers/gpu/drm: Update WARN uses
Coalesce long formats.
Align arguments.
Add missing newlines.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-11-09 13:37:15 +10:00
Chris Wilson b47b30ccda drm/i915: Avoid might_fault during pwrite whilst holding our mutex
... and so prevent a potential circular reference:

  [ INFO: possible circular locking dependency detected ]
  2.6.37-rc1-uwe1+ #4
  -------------------------------------------------------
  Xorg/1401 is trying to acquire lock:
   (&mm->mmap_sem){++++++}, at: [<c01e4ddb>] might_fault+0x4b/0xa0

  but task is already holding lock:
   (&dev->struct_mutex){+.+.+.}, at: [<f869c3ac>]
  i915_mutex_lock_interruptible+0x3c/0x60 [i915]

  which lock already depends on the new lock.

When the locking around the pwrite ioctl was simplified, I did not spot
that the phys path never took any locks and so we introduced this
potential circular reference.

Reported-by: Uwe Helm <uwe.helm@googlemail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-08 09:19:11 +00:00
Chris Wilson 045e769ab6 drm/i915: Handle GPU hangs during fault gracefully.
Instead of killing the process, just return no page found and reschedule
the process giving the GPU some time to (hopefully) recover.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-07 09:18:22 +00:00
Daniel Vetter 75e9e9158f drm/i915: kill mappable/fenceable disdinction
a00b10c360 "Only enforce fence limits inside the GTT" also
added a fenceable/mappable disdinction when binding/pinning buffers.
This only complicates the code with no pratical gain:

- In execbuffer this matters on for g33/pineview, as this is the only
  chip that needs fences and has an unmappable gtt area. But fences
  are only possible in the mappable part of the gtt, so need_fence
  implies need_mappable. And need_mappable is only set independantly
  with relocations which implies (for sane userspace) that the buffer
  is untiled.

- The overlay code is only really used on i8xx, which doesn't have
  unmappable gtt. And it doesn't support tiled buffers, currently.

- For all other buffers it's a bug to pass in a tiled bo.

In short, this disdinction doesn't have any practical gain.

I've also reverted mapping the overlay and context pages as possibly
unmappable. It's not worth being overtly clever here, all the big
gains from unmappable are for execbuf bos.

Also add a comment for a clever optimization that confused me
while reading the original patch by Chris Wilson.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-04 19:02:03 +00:00
Chris Wilson 085ce26437 drm/i915: Ensure that if we ever try to pin+fence it is mappable.
When merging Daniel's full-gtt patches I had a set of tweaks which I
thought I had undone. I was half right...

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=31286
Reported-by: jinjin.wang@intel.com
Reported-by: Alexey Fisher <bug-track@fisher-privat.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-03 09:31:57 +00:00
Chris Wilson f2a630bfec Merge branch 'drm-intel-fixes' into drm-intel-next
Conflicts:
	drivers/gpu/drm/i915/i915_gem.c
	drivers/gpu/drm/i915/i915_gem_evict.c
2010-11-01 13:44:41 +00:00
Chris Wilson c6afd65807 drm/i915: Apply big hammer to serialise buffer access between rings
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@kernel.org
2010-11-01 13:39:24 +00:00
Chris Wilson 0f8c6d7ca9 drm/i915: Move the invalidate|flush information out of the device struct
... and into a local structure scoped for the single function in which
it is used.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-01 12:38:44 +00:00
Chris Wilson 13b2928933 drm/i915: Apply big hammer to serialise buffer access between rings
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-01 12:31:19 +00:00
Chris Wilson 5eac3ab459 drm/i915: Evict just the purgeable GTT entries on the first pass
Take two passes to evict everything whilst searching for sufficient free
space to bind the batchbuffer. After searching for sufficient free space
using LRU eviction, evict everything that is purgeable and try again.
Only then if there is insufficient free space (or the GTT is too badly
fragmented) evict everything from the aperture and try one last time.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-31 12:31:30 +00:00
Chris Wilson ff75b9bc48 drm/i915: Fix typo from e5281ccd in i915_gem_attach_phys_object()
Accessing the uninitialised obj->pages instead of the local page lead to
an OOPs.

Reported-by: Xavier Chantry <chantry.xavier@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-30 22:52:31 +01:00
Chris Wilson 872d860c85 drm/i915: Remove the duplicate domain-change tracepoint for GPU flush
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-29 11:15:54 +01:00
Chris Wilson a00b10c360 drm/i915: Only enforce fence limits inside the GTT.
So long as we adhere to the fence registers rules for alignment and no
overlaps (including with unfenced accesses to linear memory) and account
for the tiled access in our size allocation, we do not have to allocate
the full fenced region for the object. This allows us to fight the bloat
tiling imposed on pre-i965 chipsets and frees up RAM for real use. [Inside
the GTT we still suffer the additional alignment constraints, so it doesn't
magic allow us to render larger scenes without stalls -- we need the
expanded GTT and fence pipelining to overcome those...]

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-29 11:15:07 +01:00
Chris Wilson 7465378fd7 drm/i915: Convert BUG_ON(pin_count) from an impossible condition
Also spotted by Dan Carpenter.

obj->pin_count is unsigned so the BUG_ON(obj->pin_count<0) will never
trigger.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-29 10:54:29 +01:00
Chris Wilson bbe2e11a4b drm/i915: Do not return -1 from shrinker when nr_to_scan == 0
The error code is only expected during the actual pruning and not during
the first measurement (nr_to_scan == 0) pass.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-28 22:35:07 +01:00
Chris Wilson 395b70be54 drm/i915: Flush read-only buffers from the active list upon idle as well
It is possible for the active list to only contain a read-only buffer so
that the ring->gpu_write_list remains entry. This leads to an
inconsistency between i915_gpu_is_active() and i915_gpu_idle() causing
an infinite spin during the shrinker and an assertion failure that
i915_gpu_idle() does indeed flush all buffers from the active lists.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-28 21:31:19 +01:00
Chris Wilson 4a684a4117 drm/i915: Kill GTT mappings when moving from GTT domain
In order to force a page-fault on a GTT mapping after we start using it
from the GPU and so enforce correct CPU/GPU synchronisation, we need to
invalidate the mapping.

Pointed out by Owain G. Ainsworth.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-28 20:55:03 +01:00
Chris Wilson e5281ccd2e drm/i915: Eliminate nested get/put pages
By using read_cache_page() for individual pages during pwrite/pread we
can eliminate an unnecessary large allocation (and immediate free) of
obj->pages. Also this eliminates any potential nesting of get/put pages,
simplifying the code and preparing the path for greater things.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-28 20:55:02 +01:00
Chris Wilson 39a01d1fb6 drm/i915: Remove mmap_offset
Since we rarely use the mmap_offset and it is easily computable from the
obj->map_list.hash, remove it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-28 20:55:02 +01:00
Chris Wilson 17250b7155 drm/i915: Make the inactive object shrinker per-device
Eliminate the racy device unload by embedding a shrinker into each
device. Smaller, simpler code.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-28 20:55:01 +01:00
Chris Wilson da761a6edf drm/i915: Bail early if we try to mmap an object too large to be mapped.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-27 23:31:08 +01:00
Daniel Vetter fb7d516af1 drm/i915: add accounting for mappable objects in gtt v2
More precisely: For those that _need_ to be mappable. Also add two
BUG_ONs in fault and pin to check the consistency of the mappable
flag.

Changes in v2:
- Add tracking of gtt mappable space (to notice mappable/unmappable
  balancing issues).
- Improve the mappable working set tracking by tracking fault and pin
  separately.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-27 23:31:08 +01:00