linux

q3k/linux

Author	SHA1	Message	Date
Zou Nan hai	722aab0c3b	sched: fix minimum granularity tunings increase the default minimum granularity some more - this gives us more performance in aim7 benchmarks. also correct some comments: we scale with ilog(ncpus) + 1. Signed-off-by: Zou Nan hai <nanhai.zou@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-11-26 21:21:49 +01:00
Adrian Bunk	518b22e990	sched: make sched_nr_latency static sched_nr_latency can now become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-11-15 20:57:40 +01:00
Srivatsa Vaddagiri	3c90e6e99b	sched: fix copy_namespace() <-> sched_fork() dependency in do_fork Sukadev Bhattiprolu reported a kernel crash with control groups. There are couple of problems discovered by Suka's test: - The test requires the cgroup filesystem to be mounted with atleast the cpu and ns options (i.e both namespace and cpu controllers are active in the same hierarchy). # mkdir /dev/cpuctl # mount -t cgroup -ocpu,ns none cpuctl (or simply) # mount -t cgroup none cpuctl -> Will activate all controllers in same hierarchy. - The test invokes clone() with CLONE_NEWNS set. This causes a a new child to be created, also a new group (do_fork->copy_namespaces->ns_cgroup_clone-> cgroup_clone) and the child is attached to the new group (cgroup_clone-> attach_task->sched_move_task). At this point in time, the child's scheduler related fields are uninitialized (including its on_rq field, which it has inherited from parent). As a result sched_move_task thinks its on runqueue, when it isn't. As a solution to this problem, I moved sched_fork() call, which initializes scheduler related fields on a new task, before copy_namespaces(). I am not sure though whether moving up will cause other side-effects. Do you see any issue? - The second problem exposed by this test is that task_new_fair() assumes that parent and child will be part of the same group (which needn't be as this test shows). As a result, cfs_rq->curr can be NULL for the child. The solution is to test for curr pointer being NULL in task_new_fair(). With the patch below, I could run ns_exec() fine w/o a crash. Reported-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-11-09 22:39:39 +01:00
Ingo Molnar	502d26b524	sched: clean up the wakeup preempt check, #2 clean up the preemption check to not use unnecessary 64-bit variables. This improves code size: text data bss dec hex filename 44227 3326 36 47589 b9e5 sched.o.before 44201 3326 36 47563 b9cb sched.o.after Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-11-09 22:39:39 +01:00
Ingo Molnar	77d9cc44b5	sched: clean up the wakeup preempt check clean up the wakeup preemption check. No code changed: text data bss dec hex filename 44227 3326 36 47589 b9e5 sched.o.before 44227 3326 36 47589 b9e5 sched.o.after Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-11-09 22:39:39 +01:00
Ingo Molnar	8bc6767acb	sched: wakeup preemption fix wakeup preemption fix: do not make it dependent on p->prio. Preemption purely depends on ->vruntime. This improves preemption in mixed-nice-level workloads. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-11-09 22:39:39 +01:00
Ingo Molnar	3e3e13f399	sched: remove PREEMPT_RESTRICT remove PREEMPT_RESTRICT. (this is a separate commit so that any regression related to the removal itself is bisectable) Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-11-09 22:39:39 +01:00
Ingo Molnar	19978ca610	sched: reintroduce SMP tunings again Yanmin Zhang reported an aim7 regression and bisected it down to: \| commit `38ad464d41` \| Author: Ingo Molnar <mingo@elte.hu> \| Date: Mon Oct 15 17:00:02 2007 +0200 \| \| sched: uniform tunings \| \| use the same defaults on both UP and SMP. fix this by reintroducing similar SMP tunings again. This resolves the regression. (also update the comments to match the ilog2(nr_cpus) tuning effect) Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-11-09 22:39:38 +01:00
Peter Zijlstra	b2be5e96dc	sched: reintroduce the sched_min_granularity tunable we lost the sched_min_granularity tunable to a clever optimization that uses the sched_latency/min_granularity ratio - but the ratio is quite unintuitive to users and can also crash the kernel if the ratio is set to 0. So reintroduce the min_granularity tunable, while keeping the ratio maintained internally. no functionality changed. [ mingo@elte.hu: some fixlets. ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-11-09 22:39:37 +01:00
Peter Zijlstra	2cb8600e6b	sched: documentation: place_entity() comments Add a few comments to place_entity(). No code changed. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-11-09 22:39:37 +01:00
Peter Zijlstra	10b777246c	sched: fix vslice vslice was missing a factor NICE_0_LOAD, as weight is in weight*NICE_0_LOAD units. the effect of this bug was larger initial slices and thus latency-noisier forks. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-11-09 22:39:37 +01:00
Ingo Molnar	8eb172d941	sched: fix style of swap() macro in kernel/sched_fair.c fix style of swap() macro in kernel/sched_fair.c. ( this macro should eventually move to a general header, as ext3 uses a similar construct too. ) Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-29 21:18:11 +01:00
Peter Williams	681f3e6854	sched: isolate SMP balancing code a bit more At the moment, a lot of load balancing code that is irrelevant to non SMP systems gets included during non SMP builds. This patch addresses this issue and reduces the binary size on non SMP systems: text data bss dec hex filename 10983 28 1192 12203 2fab sched.o.before 10739 28 1192 11959 2eb7 sched.o.after Signed-off-by: Peter Williams <pwil3058@bigpond.net.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-24 18:23:51 +02:00
Peter Williams	e1d1484f72	sched: reduce balance-tasks overhead At the moment, balance_tasks() provides low level functionality for both move_tasks() and move_one_task() (indirectly) via the load_balance() function (in the sched_class interface) which also provides dual functionality. This dual functionality complicates the interfaces and internal mechanisms and makes the run time overhead of operations that are called with two run queue locks held. This patch addresses this issue and reduces the overhead of these operations. Signed-off-by: Peter Williams <pwil3058@bigpond.net.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-24 18:23:51 +02:00
Srivatsa Vaddagiri	b9dca1e0fc	sched: fix new task startup crash Child task may be added on a different cpu that the one on which parent is running. In which case, task_new_fair() should check whether the new born task's parent entity should be added as well on the cfs_rq. Patch below fixes the problem in task_new_fair. This could fix the put_prev_task_fair() crashes reported. Reported-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Reported-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-17 16:55:11 +02:00
Ingo Molnar	da84d96176	sched: reintroduce cache-hot affinity reintroduce a simplified version of cache-hot/cold scheduling affinity. This improves performance with certain SMP workloads, such as sysbench. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-15 17:00:18 +02:00
Ingo Molnar	e5f32a3856	sched: speed up context-switches a bit speed up context-switches a bit by not clearing p->exec_start. (as a side-effect, this also makes p->exec_start a universal timestamp available to cache-hot estimations.) Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-15 17:00:18 +02:00
Ingo Molnar	91c234b4e3	sched: do not wakeup-preempt with SCHED_BATCH tasks do not wakeup-preempt with SCHED_BATCH tasks, their preemption is batched too, driven by the tick. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-15 17:00:18 +02:00
Ingo Molnar	d274a4cee1	sched: update comment update comment: clarify time-slices and remove obsolete tuning detail. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-15 17:00:14 +02:00
Mike Galbraith	95938a35c5	sched: prevent wakeup over-scheduling Prevent wakeup over-scheduling. Once a task has been preempted by a task of the same or lower priority, it becomes ineligible for repeated preemption by same until it has been ticked, or slept. Instead, the task is marked for preemption at the next tick. Tasks of higher priority still preempt immediately. Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-15 17:00:14 +02:00
Peter Zijlstra	ce6c131131	sched: disable forced preemption by default Implement feature bit to disable forced preemption. This way it can be checked whether a workload is overscheduling or not. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-15 17:00:14 +02:00
Dmitry Adamushko	e62dd02ed0	sched: fix group scheduling for SCHED_BATCH The following patch (sched: disable sleeper_fairness on SCHED_BATCH) seems to break GROUP_SCHED. Although, it may be 'oops'-less due to the possibility of 'p' being always a valid address. Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-15 17:00:14 +02:00
Peter Zijlstra	8ca0e14ffb	sched: disable sleeper_fairness on SCHED_BATCH disable sleeper fairness for batch tasks - they are about batch processing after all. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-15 17:00:14 +02:00
Peter Zijlstra	810e95ccd5	sched: another wakeup_granularity fix unit mis-match: wakeup_gran was used against a vruntime Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-15 17:00:14 +02:00
Ingo Molnar	00bf7bfc2e	sched: fix: move the CPU check into ->task_new_fair() noticed by Peter Zijlstra: fix: move the CPU check into ->task_new_fair(), this way we can call place_entity() and get child ->vruntime right at initial wakeup time. (without this there can be large latencies) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>	2007-10-15 17:00:14 +02:00
Ingo Molnar	0702e3ebc1	sched: cleanup: function prototype cleanups noticed by Thomas Gleixner: cleanup: function prototype cleanups - move into single line wherever possible. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-15 17:00:14 +02:00
Ingo Molnar	06877c33fe	sched: cleanup: rename SCHED_FEAT_USE_TREE_AVG to SCHED_FEAT_TREE_AVG cleanup: rename SCHED_FEAT_USE_TREE_AVG to SCHED_FEAT_TREE_AVG, to make SCHED_FEAT_ names more consistent. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-15 17:00:13 +02:00
Dmitry Adamushko	a2a2d68073	sched: cleanup, make dequeue_entity() and update_stats_wait_end() similar make dequeue_entity() / enqueue_entity() and update_stats_dequeue() / update_stats_enqueue() look similar, structure-wise. zero effect, functionality-wise: text data bss dec hex filename 34550 3026 100 37676 932c sched.o.before 34550 3026 100 37676 932c sched.o.after Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-15 17:00:13 +02:00
Dmitry Adamushko	a03c9061d9	sched: cleanup, remove calc_weighted() remove obsolete code -- calc_weighted() Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-15 17:00:13 +02:00
Alexey Dobriyan	a9957449b0	sched: uninline scheduler * save ~300 bytes * activate_idle_task() was moved to avoid a warning bloat-o-meter output: add/remove: 6/0 grow/shrink: 0/16 up/down: 438/-733 (-295) <=== function old new delta __enqueue_entity - 165 +165 finish_task_switch - 110 +110 update_curr_rt - 79 +79 __load_balance_iterator - 32 +32 __task_rq_unlock - 28 +28 find_process_by_pid - 24 +24 do_sched_setscheduler 133 123 -10 sys_sched_rr_get_interval 176 165 -11 sys_sched_getparam 156 145 -11 normalize_rt_tasks 482 470 -12 sched_getaffinity 112 99 -13 sys_sched_getscheduler 86 72 -14 sched_setaffinity 226 212 -14 sched_setscheduler 666 642 -24 load_balance_start_fair 33 9 -24 load_balance_next_fair 33 9 -24 dequeue_task_rt 133 67 -66 put_prev_task_rt 97 28 -69 schedule_tail 133 50 -83 schedule 682 594 -88 enqueue_entity 499 366 -133 task_new_fair 317 180 -137 Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-15 17:00:13 +02:00
Ingo Molnar	155bb293ae	sched: tweak wakeup granularity tweak wakeup granularity. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-15 17:00:13 +02:00
Dmitry Adamushko	08ec3df510	sched: fix __pick_next_entity() The thing is that __pick_next_entity() must never be called when first_fair(cfs_rq) == NULL. It wouldn't be a problem, should 'run_node' be the very first field of 'struct sched_entity' (and it's the second). The 'nr_running != 0' check is _not_ enough, due to the fact that 'current' is not within the tree. Generic paths are ok (e.g. schedule() as put_prev_task() is called previously)... I'm more worried about e.g. migration_call() -> CPU_DEAD_FROZEN -> migrate_dead_tasks()... if 'current' == rq->idle, no problems.. if it's one of the SCHED_NORMAL tasks (or imagine, some other use-cases in the future -- i.e. we should not make outer world dependent on internal details of sched_fair class) -- it may be "Houston, we've got a problem" case. it's +16 bytes to the ".text". Another variant is to make 'run_node' the first data member of 'struct sched_entity' but an additional check (se ! = NULL) is still needed in pick_next_entity(). Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:13 +02:00
Ingo Molnar	647e7cac2d	sched: vslice fixups for non-0 nice levels Make vslice accurate wrt nice levels, and add some comments while we're at it. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:13 +02:00
Ingo Molnar	5522d5d5f7	sched: mark scheduling classes as const mark scheduling classes as const. The speeds up the code a bit and shrinks it: text data bss dec hex filename 40027 4018 292 44337 ad31 sched.o.before 40190 3842 292 44324 ad24 sched.o.after Signed-off-by: Ingo Molnar <mingo@elte.hu> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:12 +02:00
Srivatsa Vaddagiri	b9fa3df33f	sched: group scheduler, fix latency There is a possibility that because of task of a group moving from one cpu to another, it may gain more cpu time that desired. See http://marc.info/?l=linux-kernel&m=119073197730334 for details. This is an attempt to fix that problem. Basically it simulates dequeue of higher level entities as if they are going to sleep. Similarly it simulate wakeup of higher level entities as if they are waking up from sleep. Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:12 +02:00
Srivatsa Vaddagiri	fad095a7b9	sched: group scheduler, fix bloat Recent fix to check_preempt_wakeup() to check for preemption at higher levels caused a size bloat for !CONFIG_FAIR_GROUP_SCHED. Fix the problem. 42277 10598 320 53195 cfcb kernel/sched.o-before_this_patch 42216 10598 320 53134 cf8e kernel/sched.o-after_this_patch Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:12 +02:00
Ingo Molnar	b39c5dd7f9	sched: cleanup, remove stale comment cleanup, remove stale comment. Signed-off-by: Ingo Molnar <mingo@elte.hu> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:12 +02:00
Peter Zijlstra	5f6d858ecc	sched: speed up and simplify vslice calculations speed up and simplify vslice calculations. [ From: Mike Galbraith <efault@gmx.de>: build fix ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-15 17:00:12 +02:00
Peter Zijlstra	b0ffd246ea	sched: clean up min_vruntime use clean up min_vruntime use. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-15 17:00:12 +02:00
Dmitry Adamushko	2b1e315dd2	sched: yield fix fix yield bugs due to the current-not-in-rbtree changes: the task is not in the rbtree so rbtree-removal is a no-no. [ From: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>: build fix. ] also, nice code size reduction: kernel/sched.o: text data bss dec hex filename 38323 3506 24 41853 a37d sched.o.before 38236 3506 24 41766 a326 sched.o.after Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:12 +02:00
Srivatsa Vaddagiri	8651a86c34	sched: group scheduler wakeup latency fix group scheduler wakeup latency fix: when checking for preemption we must check cross-group too, not just intra-group. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-15 17:00:12 +02:00
Ingo Molnar	57cb499df2	sched: remove set_leftmost() Lee Schermerhorn noticed that set_leftmost() contains dead code, remove this. Reported-by: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:11 +02:00
Peter Zijlstra	368059a977	sched: max_vruntime() simplification max_vruntime() simplification. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>	2007-10-15 17:00:11 +02:00
Ingo Molnar	b8487b9241	sched: fix sign check error in place_entity() fix sign check error in place_entity() - we'd get excessive latencies due to negatives being converted to large u64's. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>	2007-10-15 17:00:11 +02:00
Ingo Molnar	94359f05cb	sched: undo some of the recent changes undo some of the recent changes that are not needed after all, such as last_min_vruntime. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>	2007-10-15 17:00:11 +02:00
Ingo Molnar	dc1f31c90c	sched: remove last_min_vruntime effect remove last_min_vruntime use - prepare to remove it. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>	2007-10-15 17:00:11 +02:00
Ingo Molnar	8465e792e8	sched: entity_key() fix entity_key() fix - we'd occasionally end up with a 0 vruntime in the !initial case. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>	2007-10-15 17:00:11 +02:00
Peter Zijlstra	ddc9729750	sched debug: check spread debug feature: check how well we schedule within a reasonable vruntime 'spread' range. (note that CPU overload can increase the spread, so this is not a hard condition, but normal loads should be within the spread.) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>	2007-10-15 17:00:10 +02:00
Peter Zijlstra	67e9fb2a39	sched: add vslice add vslice: the load-dependent "virtual slice" a task should run ideally, so that the observed latency stays within the sched_latency window. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:10 +02:00
Ingo Molnar	c18b8a7cbc	sched: remove unneeded tunables remove unneeded tunables. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:10 +02:00
Srivatsa Vaddagiri	9b5b77512d	sched: clean up code under CONFIG_FAIR_GROUP_SCHED With the view of supporting user-id based fair scheduling (and not just container-based fair scheduling), this patch renames several functions and makes them independent of whether they are being used for container or user-id based fair scheduling. Also fix a problem reported by KAMEZAWA Hiroyuki (wrt allocating less-sized array for tg->cfs_rq[] and tf->se[]). Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:09 +02:00
Srivatsa Vaddagiri	75c28ace9f	sched: print &rq->cfs stats - Print &rq->cfs statistics as well (useful for group scheduling) Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:09 +02:00
Srivatsa Vaddagiri	72ea22f8fb	sched: fix minor bug in yield - fix a minor bug in yield (seen for CONFIG_FAIR_GROUP_SCHED), group scheduling would skew when yield was called. Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:08 +02:00
Srivatsa Vaddagiri	83b699ed20	sched: revert recent removal of set_curr_task() Revert removal of set_curr_task. Use put_prev_task/set_curr_task when changing groups/policies Signed-off-by: Srivatsa Vaddagiri < vatsa@linux.vnet.ibm.com> Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>	2007-10-15 17:00:08 +02:00
Ingo Molnar	edcb60a309	sched: kernel/sched_fair.c whitespace cleanups some trivial whitespace cleanups. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:08 +02:00
Dmitry Adamushko	f6b53205e1	sched: rework enqueue/dequeue_entity() to get rid of set_curr_task() rework enqueue/dequeue_entity() to get rid of sched_class::set_curr_task(). This simplifies sched_setscheduler(), rt_mutex_setprio() and sched_move_tasks(). text data bss dec hex filename 24330 2734 20 27084 69cc sched.o.before 24233 2730 20 26983 6967 sched.o.after Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com> Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:08 +02:00
Dmitry Adamushko	4530d7ab0f	sched: simplify sched_class::yield_task() the 'p' (task_struct) parameter in the sched_class :: yield_task() is redundant as the caller is always the 'current'. Get rid of it. text data bss dec hex filename 24341 2734 20 27095 69d7 sched.o.before 24330 2734 20 27084 69cc sched.o.after Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:08 +02:00
Dmitry Adamushko	87fefa381e	sched: optimize task_new_fair() due to the fact that we no longer keep the 'current' within the tree, dequeue/enqueue_entity() is useless for the 'current' in task_new_fair(). We are about to reschedule and sched_class->put_prev_task() will put the 'current' back into the tree, based on its new key. text data bss dec hex filename 24388 2734 20 27142 6a06 sched.o.before 24341 2734 20 27095 69d7 sched.o.after Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:08 +02:00
Dmitry Adamushko	30cfdcfc5f	sched: do not keep current in the tree and get rid of sched_entity::fair_key Get rid of 'sched_entity::fair_key'. As a side effect, 'current' is not kept withing the tree for SCHED_NORMAL/BATCH tasks anymore. This simplifies some parts of code (e.g. entity_tick() and yield_task_fair()) and also somewhat optimizes them (e.g. a single update_curr() now vs. dequeue/enqueue() before in entity_tick()). Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:07 +02:00
Dmitry Adamushko	d02e5ed8d5	sched: sched_setscheduler() fix Fix a problem in the 'sched-group' patch for !CONFIG_FAIR_GROUP_SCHED. description: sched_setscheduler() { ... if (task_running()) p->sched_class->put_prev_entity(); [ this one sets up cfs_rq->curr to NULL ] ... if (task_running) p->sched_class->set_curr_task(); [ and this one is a _NOP_ (empty) for !CONFIG_FAIR_GROUP_SCHED ] As a result, the task continues to run with cfs_rq->curr == NULL... no crashes (due to checks for !NULL in place) but e.g. update_curr() effectively becomes a NOP... i.e. runtime statistics for this task is not accounted untill it's rescheduled anew. Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:07 +02:00
Srivatsa Vaddagiri	29f59db3a7	sched: group-scheduler core Add interface to control cpu bandwidth allocation to task-groups. (not yet configurable, due to missing CONFIG_CONTAINERS) Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>	2007-10-15 17:00:07 +02:00
Peter Zijlstra	02e0431a3d	sched: better min_vruntime tracking Better min_vruntime tracking: update it every time 'curr' is updated - not just when a task is enqueued into the tree. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:07 +02:00
Dmitry Adamushko	db36cc7d6d	sched: clean up schedstat block in dequeue_entity() Better placement of #ifdef CONFIG_SCHEDSTAT block in dequeue_entity(). Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:06 +02:00
Ingo Molnar	bbdba7c0e1	sched: remove wait_runtime fields and features remove wait_runtime based fields and features, now that the CFS math has been changed over to the vruntime metric. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:06 +02:00
Ingo Molnar	e22f5bbf86	sched: remove wait_runtime limit remove the wait_runtime-limit fields and the code depending on it, now that the math has been changed over to rely on the vruntime metric. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:06 +02:00
Dmitry Adamushko	495eca494a	sched: clean up struct load_stat 'struct load_stat' is redundant now so let's get rid of it. Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:06 +02:00
Ingo Molnar	7a62eabc4d	sched: debug: update exec_clock only when SCHED_DEBUG micro-optimization: update cfs_rq->exec_clock only if CONFIG_SCHED_DEBUG=y. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:06 +02:00
Peter Zijlstra	9014623c0e	sched: handle vruntime 64-bit overflow Handle vruntime overflow by centering the key space around min_vruntime. ( otherwise we could overflow 64-bit vruntime in a few days with SCHED_IDLE tasks - or in a few years with nice +19. ) Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:05 +02:00
Peter Zijlstra	94dfb5e75e	sched: add tree based averages add support for tree based vruntime averages. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:05 +02:00
Ingo Molnar	28a1f6fa2f	sched: remove SCHED_FEAT_SKIP_INITIAL remove SCHED_FEAT_SKIP_INITIAL - it was off by default and even when enabled it never made any real difference. Signed-off-by: Ingo Molnar <mingo@elte.hu> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:05 +02:00
Peter Zijlstra	aeb73b0403	sched: clean up new task placement clean up new task placement. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de>	2007-10-15 17:00:05 +02:00
Ingo Molnar	2e09bf556f	sched: wakeup granularity increase increase wakeup granularity - we were overscheduling a bit. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de>	2007-10-15 17:00:05 +02:00
Ingo Molnar	5c6b5964a0	sched: simplify check_preempt() methods simplify the check_preempt() methods. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de>	2007-10-15 17:00:05 +02:00
Peter Zijlstra	6d0f0ebd06	sched: simplify adaptive latency simplify adaptive latency. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:05 +02:00
Peter Zijlstra	4d78e7b656	sched: new task placement for vruntime add proper new task placement for the vruntime based math too. ( note: introduces a swap() macro, but the swap token is too widely used in the kernel namespace for a generic version to be added without changing non-scheduler code - so this cleanup will be done separately. ) Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:04 +02:00
Ingo Molnar	6cb5819514	sched: optimize vruntime based scheduling optimize vruntime based scheduling. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:04 +02:00
Ingo Molnar	bf5c91ba8c	sched: move sched_feat() definitions move sched_feat() definitions so that it can be used sooner by generic code too. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:04 +02:00
Ingo Molnar	e9acbff648	sched: introduce se->vruntime introduce se->vruntime as a sum of weighted delta-exec's, and use that as the key into the tree. the idea to use absolute virtual time as the basic metric of scheduling has been first raised by William Lee Irwin, advanced by Tong Li and first prototyped by Roman Zippel in the "Really Fair Scheduler" (RFS) patchset. also see: http://lkml.org/lkml/2007/9/2/76 for a simpler variant of this patch. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:04 +02:00
Ingo Molnar	08e2388aa1	sched: clean up calc_weighted() clean up calc_weighted() - we always use the normalized shift so it's not needed to pass that in. Also, push the non-nice0 branch into the function. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:04 +02:00
Ingo Molnar	19ccd97a03	sched: uninline __enqueue_entity()/__dequeue_entity() suggested by Roman Zippel: uninline __enqueue_entity() and __dequeue_entity(). this reduces code size: text data bss dec hex filename 25385 2386 16 27787 6c8b sched.o.before 25257 2386 16 27659 6c0b sched.o.after Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:04 +02:00
Peter Zijlstra	e59c80c5bb	sched: simplify SCHED_FEAT_* code Peter Zijlstra suggested to simplify SCHED_FEAT_* checks via the sched_feat(x) macro. No code changed: text data bss dec hex filename 38895 3550 24 42469 a5e5 sched.o.before 38895 3550 24 42469 a5e5 sched.o.after Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:03 +02:00
Ingo Molnar	429d43bcc0	sched: cleanup: simplify cfs_rq_curr() methods cleanup: simplify cfs_rq_curr() methods - now that the cfs_rq->curr pointer is unconditionally present, remove the wrappers. kernel/sched.o: text data bss dec hex filename 11784 224 2012 14020 36c4 sched.o.before 11784 224 2012 14020 36c4 sched.o.after Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:03 +02:00
Ingo Molnar	62160e3f4a	sched: track cfs_rq->curr on !group-scheduling too Noticed by Roman Zippel: use cfs_rq->curr in the !group-scheduling case too. Small micro-optimization and cleanup effect: text data bss dec hex filename 36269 3482 24 39775 9b5f sched.o.before 36177 3486 24 39687 9b07 sched.o.after Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:03 +02:00
Ingo Molnar	a25707f3ae	sched: remove precise CPU load CPU load calculations are statistical anyway, and there's little benefit from having it calculated on every scheduling event. So remove this code, it gets rid of a divide from the scheduler wakeup and context-switch fastpath. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:03 +02:00
Ingo Molnar	8ebc91d936	sched: remove stat_gran remove the stat_gran code - it was disabled by default and it causes unnecessary overhead. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:03 +02:00
Ingo Molnar	2bd8e6d422	sched: use constants if !CONFIG_SCHED_DEBUG use constants if !CONFIG_SCHED_DEBUG. this speeds up the code and reduces code-size: text data bss dec hex filename 27464 3014 16 30494 771e sched.o.before 26929 3010 20 29959 7507 sched.o.after Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:02 +02:00
Ingo Molnar	eba1ed4b7e	sched: debug: track maximum 'slice' track the maximum amount of time a task has executed while the CPU load was at least 2x. (i.e. at least two nice-0 tasks were runnable) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:02 +02:00
Ingo Molnar	bb61c21083	sched: resched task in task_new_fair() to get full child-runs-first semantics make sure the parent is rescheduled. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-15 17:00:02 +02:00
Ingo Molnar	30084fbd1c	sched: fix profile=sleep fix sleep profiling - we lost this chunk in the CFS merge. Found-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-02 14:13:08 +02:00
Ingo Molnar	1799e35d5b	sched: add /proc/sys/kernel/sched_compat_yield add /proc/sys/kernel/sched_compat_yield to make sys_sched_yield() more agressive, by moving the yielding task to the last position in the rbtree. with sched_compat_yield=0: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 2539 mingo 20 0 1576 252 204 R 50 0.0 0:02.03 loop_yield 2541 mingo 20 0 1576 244 196 R 50 0.0 0:02.05 loop with sched_compat_yield=1: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 2584 mingo 20 0 1576 248 196 R 99 0.0 0:52.45 loop 2582 mingo 20 0 1576 256 204 R 0 0.0 0:00.00 loop_yield Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>	2007-09-19 23:34:46 +02:00
Peter Zijlstra	1169783085	sched: fix ideal_runtime calculations for reniced tasks fix ideal_runtime: - do not scale it using niced_granularity() it is against sum_exec_delta, so its wall-time, not fair-time. - move the whole check into __check_preempt_curr_fair() so that wakeup preemption can also benefit from the new logic. this also results in code size reduction: text data bss dec hex filename 13391 228 1204 14823 39e7 sched.o.before 13369 228 1204 14801 39d1 sched.o.after Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-09-05 14:32:49 +02:00
Peter Zijlstra	4a55b45036	sched: improve prev_sum_exec_runtime setting Second preparatory patch for fix-ideal runtime: Mark prev_sum_exec_runtime at the beginning of our run, the same spot that adds our wait period to wait_runtime. This seems a more natural location to do this, and it also reduces the code a bit: text data bss dec hex filename 13397 228 1204 14829 39ed sched.o.before 13391 228 1204 14823 39e7 sched.o.after Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-09-05 14:32:49 +02:00
Peter Zijlstra	7c92e54f6f	sched: simplify __check_preempt_curr_fair() Preparatory patch for fix-ideal-runtime: simplify __check_preempt_curr_fair(): get rid of the integer return. text data bss dec hex filename 13404 228 1204 14836 39f4 sched.o.before 13393 228 1204 14825 39e9 sched.o.after functionality is unchanged. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-09-05 14:32:49 +02:00
Ingo Molnar	a206c07213	sched: debug: fix cfs_rq->wait_runtime accounting the cfs_rq->wait_runtime debug/statistics counter was not maintained properly - fix this. this also removes some code: text data bss dec hex filename 13420 228 1204 14852 3a04 sched.o.before 13404 228 1204 14836 39f4 sched.o.after Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>	2007-09-05 14:32:49 +02:00
Ingo Molnar	a0dc72601d	sched: fix niced_granularity() shift fix niced_granularity(). This resulted in under-scheduling for CPU-bound negative nice level tasks (and this in turn caused higher than necessary latencies in nice-0 tasks). Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-09-05 14:32:49 +02:00
Ingo Molnar	9f508f8258	sched: clean up task_new_fair() cleanup: we have the 'se' and 'curr' entity-pointers already, no need to use p->se and current->se. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de>	2007-08-28 12:53:24 +02:00
Ingo Molnar	213c8af67f	sched: small schedstat fix small schedstat fix: the cfs_rq->wait_runtime 'sum of all runtimes' statistics counters missed newly forked tasks and thus had a constant negative skew. Fix this. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de>	2007-08-28 12:53:24 +02:00
Ingo Molnar	b77d69db9f	sched: fix wait_start_fair condition in update_stats_wait_end() Peter Zijlstra noticed the following bug in SCHED_FEAT_SKIP_INITIAL (which is disabled by default at the moment): it relies on se.wait_start_fair being 0 while update_stats_wait_end() did not recognize a 0 value, so instead of 'skipping' the initial interval we gave the new child a maximum boost of +runtime-limit ... (No impact on the default kernel, but nice to fix for completeness.) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de>	2007-08-28 12:53:24 +02:00
Ting Yang	7109c4429a	sched: call update_curr() in task_tick_fair() update the fair-clock before using it for the key value. [ mingo@elte.hu: small cleanups. ] Signed-off-by: Ting Yang <tingy@cs.umass.edu> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>	2007-08-28 12:53:24 +02:00
Ingo Molnar	f6cf891c4d	sched: make the scheduler converge to the ideal latency de-HZ-ification of the granularity defaults unearthed a pre-existing property of CFS: while it correctly converges to the granularity goal, it does not prevent run-time fluctuations in the range of [-gran ... 0 ... +gran]. With the increase of the granularity due to the removal of HZ dependencies, this becomes visible in chew-max output (with 5 tasks running): out: 28 . 27. 32 \| flu: 0 . 0 \| ran: 9 . 13 \| per: 37 . 40 out: 27 . 27. 32 \| flu: 0 . 0 \| ran: 17 . 13 \| per: 44 . 40 out: 27 . 27. 32 \| flu: 0 . 0 \| ran: 9 . 13 \| per: 36 . 40 out: 29 . 27. 32 \| flu: 2 . 0 \| ran: 17 . 13 \| per: 46 . 40 out: 28 . 27. 32 \| flu: 0 . 0 \| ran: 9 . 13 \| per: 37 . 40 out: 29 . 27. 32 \| flu: 0 . 0 \| ran: 18 . 13 \| per: 47 . 40 out: 28 . 27. 32 \| flu: 0 . 0 \| ran: 9 . 13 \| per: 37 . 40 average slice is the ideal 13 msecs and the period is picture-perfect 40 msecs. But the 'ran' field fluctuates around 13.33 msecs and there's no mechanism in CFS to keep that from happening: it's a perfectly valid solution that CFS finds. to fix this we add a granularity/preemption rule that knows about the "target latency", which makes tasks that run longer than the ideal latency run a bit less. The simplest approach is to simply decrease the preemption granularity when a task overruns its ideal latency. For this we have to track how much the task executed since its last preemption. ( this adds a new field to task_struct, but we can eliminate that overhead in 2.6.24 by putting all the scheduler timestamps into an anonymous union. ) with this change in place, chew-max output is fluctuation-less all around: out: 28 . 27. 39 \| flu: 0 . 2 \| ran: 13 . 13 \| per: 41 . 40 out: 28 . 27. 39 \| flu: 0 . 2 \| ran: 13 . 13 \| per: 41 . 40 out: 28 . 27. 39 \| flu: 0 . 2 \| ran: 13 . 13 \| per: 41 . 40 out: 28 . 27. 39 \| flu: 0 . 2 \| ran: 13 . 13 \| per: 41 . 40 out: 28 . 27. 39 \| flu: 0 . 1 \| ran: 13 . 13 \| per: 41 . 40 out: 28 . 27. 39 \| flu: 0 . 1 \| ran: 13 . 13 \| per: 41 . 40 this patch has no impact on any fastpath or on any globally observable scheduling property. (unless you have sharp enough eyes to see millisecond-level ruckles in glxgears smoothness :-) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de>	2007-08-28 12:53:24 +02:00

1 2 3 4

200 commits