linux

Commit Graph

Author	SHA1	Message	Date
Thomas Gleixner	e958b36004	sched: move weighted_cpuload into #ifdef CONFIG_SMP section weighted_cpuload is only used on SMP. move it into the CONFIG_SMP section. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-06-06 15:25:02 +02:00
Max Krasnyansky	68f4f1ec08	sched: Move cpu masks from kernel/sched.c into kernel/cpu.c kernel/cpu.c seems a more logical place for those maps since they do not really have much to do with the scheduler these days. kernel/cpu.c is now built for the UP kernel too, but it does not affect the size the kernel sections. $ size vmlinux before text data bss dec hex filename 3313797 307060 310352 3931209 3bfc49 vmlinux after text data bss dec hex filename 3313797 307060 310352 3931209 3bfc49 vmlinux Signed-off-by: Max Krasnyansky <maxk@qualcomm.com> Cc: pj@sgi.com Cc: menage@google.com Cc: rostedt@goodmis.org Cc: mingo@elte.hu Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-06-06 15:25:01 +02:00
Max Krasnyansky	5c8e1ed1d2	sched: CPU hotplug events must not destroy scheduler domains created by the cpusets First issue is not related to the cpusets. We're simply leaking doms_cur. It's allocated in arch_init_sched_domains() which is called for every hotplug event. So we just keep reallocation doms_cur without freeing it. I introduced free_sched_domains() function that cleans things up. Second issue is that sched domains created by the cpusets are completely destroyed by the CPU hotplug events. For all CPU hotplug events scheduler attaches all CPUs to the NULL domain and then puts them all into the single domain thereby destroying domains created by the cpusets (partition_sched_domains). The solution is simple, when cpusets are enabled scheduler should not create default domain and instead let cpusets do that. Which is exactly what the patch does. Signed-off-by: Max Krasnyansky <maxk@qualcomm.com> Cc: pj@sgi.com Cc: menage@google.com Cc: rostedt@goodmis.org Cc: mingo@elte.hu Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-06-06 15:25:00 +02:00
Ingo Molnar	1100ac91b6	sched: fix cpuprio build bug this patch was not built on !SMP: kernel/sched_rt.c: In function 'inc_rt_tasks': kernel/sched_rt.c:404: error: 'struct rq' has no member named 'online' Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-06 15:19:46 +02:00
Thomas Gleixner	e539d8fcd1	sched: fix the cpuprio count really Peter pointed out that the last version of the "fix" was still one off under certain circumstances. Use BITS_TO_LONG instead to get an accurate result. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-06-06 15:19:44 +02:00
Gregory Haskins	709d4b0c60	sched: fix cpupri priocount A rounding error was pointed out by Peter Zijlstra which would result in the structure holding priorities to be off by one. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-06-06 15:19:43 +02:00
Gregory Haskins	1f11eb6a8b	sched: fix cpupri hotplug support The RT folks over at RedHat found an issue w.r.t. hotplug support which was traced to problems with the cpupri infrastructure in the scheduler: https://bugzilla.redhat.com/show_bug.cgi?id=449676 This bug affects 23-rt12+, 24-rtX, 25-rtX, and sched-devel. This patch applies to 25.4-rt4, though it should trivially apply to most cpupri enabled kernels mentioned above. It turned out that the issue was that offline cpus could get inadvertently registered with cpupri so that they were erroneously selected during migration decisions. The end result would be an OOPS as the offline cpu had tasks routed to it. This patch generalizes the old join/leave domain interface into an online/offline interface, and adjusts the root-domain/hotplug code to utilize it. I was able to easily reproduce the issue prior to this patch, and am no longer able to reproduce it after this patch. I can offline cpus indefinately and everything seems to be in working order. Thanks to Arnaldo (acme), Thomas, and Peter for doing the legwork to point me in the right direction. Also thank you to Peter for reviewing the early iterations of this patch. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-06-06 15:19:42 +02:00
Gautham R Shenoy	099f98c8a1	sched: print the sd->level in sched_domain_debug code While printing out the visual representation of the sched-domains, print the level (MC, SMT, CPU, NODE, ... ) of each of the sched_domains. Credit: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-06 15:19:41 +02:00
Dhaval Giani	6d6bc0ad86	sched: add comments for ifdefs in sched.c make sched.c easier to read. Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-06-06 15:19:38 +02:00
Arjan van de Ven	e21f5b153b	sched: print module list in the "scheduling while atomic" warning For the normal WARN_ON() etc we added a print-the-modules-list already, which is very useful to figure out candidates for certain types of bugs. This patch adds the same print to the "scheduling while atomic" BUG warning, for the same reason: when we get here it's very useful to see which modules are loaded, to narrow down the candidate code list. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: mingo@elte.hu Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-06-06 15:19:37 +02:00
Rabin Vincent	81d41d7ece	sched: fix defined-but-unused warning Fix this warning, which appears with !CONFIG_SMP: kernel/sched.c:1216: warning: `init_hrtick' defined but not used Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-06-06 15:19:34 +02:00
Thomas Gleixner	f7dcd80bbc	namespacecheck: fixes in kernel/sched.c Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-06-06 15:19:32 +02:00
Dmitry Adamushko	d07355f5de	sched: check for SD_SERIALIZE atomically in rebalance_domains() Nothing really serious here, mainly just a matter of nit-picking :-/ From: Dmitry Adamushko <dmitry.adamushko@gmail.com> For CONFIG_SCHED_DEBUG && CONFIG_SYSCT configs, sd->flags can be altered while being manipulated in rebalance_domains(). Let's do an atomic check. We rely here on the atomicity of read/write accesses for aligned words. Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-06-06 15:19:30 +02:00
Gregory Haskins	6d299f1b53	sched: fix SCHED_OTHER balance iterator to include all tasks The currently logic inadvertently skips the last task on the run-queue, resulting in missed balance opportunities. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: David Bahi <dbahi@novell.com> CC: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-06-06 15:19:29 +02:00
Gregory Haskins	6e0534f278	sched: use a 2-d bitmap for searching lowest-pri CPU The current code use a linear algorithm which causes scaling issues on larger SMP machines. This patch replaces that algorithm with a 2-dimensional bitmap to reduce latencies in the wake-up path. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Acked-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-06-06 15:19:28 +02:00
Mike Galbraith	f333fdc909	sched: make !hrtick faster it is safe to ignore timers and flags when the feature is disabled. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-06-06 15:19:27 +02:00
Gregory Haskins	45c01e8249	sched: prioritize non-migratable tasks over migratable ones Dmitry Adamushko pointed out a known flaw in the rt-balancing algorithm that could allow suboptimal balancing if a non-migratable task gets queued behind a running migratable one. It is discussed in this thread: http://lkml.org/lkml/2008/4/22/296 This issue has been further exacerbated by a recent checkin to sched-devel (git-id 5eee63a5ebc19a870ac40055c0be49457f3a89a3). >From a pure priority standpoint, the run-queue is doing the "right" thing. Using Dmitry's nomenclature, if T0 is on cpu1 first, and T1 wakes up at equal or lower priority (affined only to cpu1) later, it should wait for T0 to finish. However, in reality that is likely suboptimal from a system perspective if there are other cores that could allow T0 and T1 to run concurrently. Since T1 can not migrate, the only choice for higher concurrency is to try to move T0. This is not something we addessed in the recent rt-balancing re-work. This patch tries to enhance the balancing algorithm by accomodating this scenario. It accomplishes this by incorporating the migratability of a task into its priority calculation. Within a numerical tsk->prio, a non-migratable task is logically higher than a migratable one. We maintain this by introducing a new per-priority queue (xqueue, or exclusive-queue) for holding non-migratable tasks. The scheduler will draw from the xqueue over the standard shared-queue (squeue) when available. There are several details for utilizing this properly. 1) During task-wake-up, we not only need to check if the priority preempts the current task, but we also need to check for this non-migratable condition. Therefore, if a non-migratable task wakes up and sees an equal priority migratable task already running, it will attempt to preempt it if there is a likelyhood that the current task will find an immediate home. 2) Tasks only get this non-migratable "priority boost" on wake-up. Any requeuing will result in the non-migratable task being queued to the end of the shared queue. This is an attempt to prevent the system from being completely unfair to migratable tasks during things like SCHED_RR timeslicing. I am sure this patch introduces potentially "odd" behavior if you concoct a scenario where a bunch of non-migratable threads could starve migratable ones given the right pattern. I am not yet convinced that this is a problem since we are talking about tasks of equal RT priority anyway, and there never is much in the way of guarantees against starvation under that scenario anyway. (e.g. you could come up with a similar scenario with a specific timing environment verses an affinity environment). I can be convinced otherwise, but for now I think this is "ok". Signed-off-by: Gregory Haskins <ghaskins@novell.com> CC: Dmitry Adamushko <dmitry.adamushko@gmail.com> CC: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-06-06 15:19:25 +02:00
Linus Torvalds	3b5b60b821	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb: kgdbts: Use HW breakpoints with CONFIG_DEBUG_RODATA kgdb: use common ascii helpers and put_unaligned_be32 helper	2008-06-04 08:08:27 -07:00
Linus Torvalds	a7f75d3bed	Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: re-tune NUMA topologies sched: stop wake_affine from causing serious imbalance sched: fix sched_clock_cpu() revert ("sched: fair-group: SMP-nice for group scheduling") sched: cleanup show_schedstat(): fix memleak sched: unite unlikely pairs in rt_policy() and schedule_debug() revert ("sched: fair: weight calculations")	2008-05-29 09:26:17 -07:00
Ingo Molnar	6715930654	Merge commit 'linus/master' into sched-fixes-for-linus	2008-05-29 16:05:05 +02:00
Mike Galbraith	b3137bc8e7	sched: stop wake_affine from causing serious imbalance Prevent short-running wakers of short-running threads from overloading a single cpu via wakeup affinity, and wire up disconnected debug option. Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-29 11:29:20 +02:00
Peter Zijlstra	a381759d6a	sched: fix sched_clock_cpu() Make sched_clock_cpu() return 0 before it has been initialized and avoid corrupting its state due to doing so. This fixes the weird printk timestamp jump reported. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>	2008-05-29 11:29:19 +02:00
Ingo Molnar	6363ca57c7	revert ("sched: fair-group: SMP-nice for group scheduling") Yanmin Zhang reported: Comparing with 2.6.25, volanoMark has big regression with kernel 2.6.26-rc1. It's about 50% on my 8-core stoakley, 16-core tigerton, and Itanium Montecito. With bisect, I located the following patch: \| `18d95a2832` is first bad commit \| commit `18d95a2832` \| Author: Peter Zijlstra <a.p.zijlstra@chello.nl> \| Date: Sat Apr 19 19:45:00 2008 +0200 \| \| sched: fair-group: SMP-nice for group scheduling Revert it so that we get v2.6.25 behavior. Bisected-by: Yanmin Zhang <yanmin_zhang@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-29 11:28:57 +02:00
Ingo Molnar	4285f594f8	sched: cleanup Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-29 11:25:15 +02:00
Adrian Bunk	c6fba5451a	show_schedstat(): fix memleak The Coverity checker spotted a memleak introduced by commit `39106dcf85` (cpumask: use new cpus_scnprintf function). It seems the kfree() got lost between v2 and v3 of this patch... Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Mike Travis <travis@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-29 11:25:15 +02:00
Roel Kluin	3f33a7ce95	sched: unite unlikely pairs in rt_policy() and schedule_debug() Removes obfuscation and may improve assembly. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-29 11:25:14 +02:00
Ingo Molnar	f9305d4a09	revert ("sched: fair: weight calculations") Yanmin Zhang reported: Comparing with kernel 2.6.25, sysbench+mysql(oltp, readonly) has many regressions with 2.6.26-rc1: 1) 8-core stoakley: 28%; 2) 16-core tigerton: 20%; 3) Itanium Montvale: 50%. Bisect located this patch: \| `8f1bc385cf` is first bad commit \| commit `8f1bc385cf` \| Author: Peter Zijlstra <a.p.zijlstra@chello.nl> \| Date: Sat Apr 19 19:45:00 2008 +0200 \| \| sched: fair: weight calculations Revert it to the 2.6.25 state. Bisected-by: Yanmin Zhang <yanmin_zhang@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-29 11:24:01 +02:00
Harvey Harrison	827e609b45	kgdb: use common ascii helpers and put_unaligned_be32 helper Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>	2008-05-28 12:49:56 -05:00
Tom Zanussi	a82c53a0e3	splice: fix sendfile() issue with relay Splice isn't always incrementing the ppos correctly, which broke relay splice. Signed-off-by: Tom Zanussi <zanussi@comcast.net> Tested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-05-28 14:49:27 +02:00
Oleg Nesterov	cbaffba12c	posix timers: discard SI_TIMER signals on exec Based on Roland's patch. This approach was suggested by Austin Clements from the very beginning, and then by Linus. As Austin pointed out, the execing task can be killed by SI_TIMER signal because exec flushes the signal handlers, but doesn't discard the pending signals generated by posix timers. Perhaps not a bug, but people find this surprising. See http://bugzilla.kernel.org/show_bug.cgi?id=10460 Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Austin Clements <amdragon+kernelbugzilla@mit.edu> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-26 10:37:07 -07:00
Oleg Nesterov	c8e85b4f4b	posix timers: sigqueue_free: don't free sigqueue if it is queued Currently sigqueue_free() removes sigqueue from list, but doesn't cancel the pending signal. This is not consistent, the task should either receive the "full" signal along with siginfo_t, or it shouldn't receive the signal at all. Change sigqueue_free() to clear SIGQUEUE_PREALLOC but leave sigqueue on list if it is queued. This is a user-visible change. If the signal is blocked, it stays queued after sys_timer_delete() until unblocked with the "stale" si_code/si_value, and of course it is still counted wrt RLIMIT_SIGPENDING which also limits the number of posix timers. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Austin Clements <amdragon+kernelbugzilla@mit.edu> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-26 10:37:06 -07:00
Cedric Le Goater	5c02b57578	cgroups: remove node_ prefix_from ns subsystem This is a slight change in the namespace cgroup subsystem api. The change is that previously when cgroup_clone() was called (currently only from the unshare path in ns_proxy cgroup, you'd get a new group named "node_$pid" whereas now you'll get a group named after just your pid.) The only users who would notice it are those who are using the ns_proxy cgroup subsystem to auto-create cgroups when namespaces are unshared - something of an experimental feature, which I think really needs more complete container/namespace support in order to be useful. I suspect the only users are Cedric and Serge, or maybe a few others on containers@lists.linux-foundation.org. And in fact it would only be noticed by the users who make the assumption about how the name is generated, rather than getting it from the /proc/<pid>/cgroups file for the process in question. Whether the change is actually needed or not I'm fairly agnostic on, but I guess it is more elegant to just use the pid as the new group name rather than adding a fairly arbitrary "node_" prefix on the front. [menage@google.com: provided changelog] Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: "Paul Menage" <menage@google.com> Cc: "Serge E. Hallyn" <serue@us.ibm.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-24 09:56:14 -07:00
Shi Weihua	7b26655f62	sys_prctl(): fix return of uninitialized value If none of the switch cases match, the PR_SET_PDEATHSIG and PR_SET_DUMPABLE cases of the switch statement will never write to local variable `error'. Signed-off-by: Shi Weihua <shiwh@cn.fujitsu.com> Cc: Andrew G. Morgan <morgan@kernel.org> Acked-by: "Serge E. Hallyn" <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-24 09:56:13 -07:00
Oleg Nesterov	da7978b034	signals: fix sigqueue_free() vs __exit_signal() race __exit_signal() does flush_sigqueue(tsk->pending) outside of ->siglock. This can race with another thread doing sigqueue_free(), we can free the same SIGQUEUE_PREALLOC sigqueue twice or corrupt the pending->list. Note that even sys_exit_group() can trigger this race, not only sys_timer_delete(). Move the callsite of flush_sigqueue(tsk->pending) under ->siglock. This patch doesn't touch flush_sigqueue(->shared_pending) below, it is called when there are no other threads which can play with signals, and sigqueue_free() can't be used outside of our thread group. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-24 09:56:10 -07:00
Christian Borntraeger	3401a61e16	stop_machine: make stop_machine_run more virtualization friendly On kvm I have seen some rare hangs in stop_machine when I used more guest cpus than hosts cpus. e.g. 32 guest cpus on 1 host cpu triggered the hang quite often. I could also reproduce the problem on a 4 way z/VM host with a 64 way guest. It turned out that the guest was consuming all available cpus mostly for spinning on scheduler locks like rq->lock. This is expected as the threads are calling yield all the time. The problem is now, that the host scheduling decisings together with the guest scheduling decisions and spinlocks not being fair managed to create an interesting scenario similar to a live lock. (Sometimes the hang resolved itself after some minutes) Changing stop_machine to yield the cpu to the hypervisor when yielding inside the guest fixed the problem for me. While I am not completely happy with this patch, I think it causes no harm and it really improves the situation for me. I used cpu_relax for yielding to the hypervisor, does that work on all architectures? p.s.: If you want to reproduce the problem, cpu hotplug and kprobes use stop_machine_run and both triggered the problem after some retries. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> CC: Ingo Molnar <mingo@elte.hu> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-05-23 13:09:34 +10:00
Denis V. Lunev	34e4e2fef4	modules: proper cleanup of kobject without CONFIG_SYSFS kobject: '<NULL>' (ffffffffa0104050): is not initialized, yet kobject_put() is being called. ------------[ cut here ]------------ WARNING: at /home/den/src/linux-netns26/lib/kobject.c:583 kobject_put+0x53/0x55() Modules linked in: ipv6 nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs ide_cd_mod cdrom button [last unloaded: pktgen] comm: rmmod Tainted: G W 2.6.26-rc3 #585 Call Trace: [<ffffffff802359ab>] warn_on_slowpath+0x58/0x7a [<ffffffff80236aca>] ? printk+0x67/0x69 [<ffffffff80236aca>] ? printk+0x67/0x69 [<ffffffff80324289>] kobject_put+0x53/0x55 [<ffffffff8025e2ee>] free_module+0x87/0xfa [<ffffffff8025fee5>] sys_delete_module+0x178/0x1e1 [<ffffffff804b1e70>] ? lockdep_sys_exit_thunk+0x35/0x67 [<ffffffff804b1dff>] ? trace_hardirqs_on_thunk+0x35/0x3a [<ffffffff8020c0bb>] system_call_after_swapgs+0x7b/0x80 ---[ end trace 8f5aafa7f6406cf8 ]--- mod->mkobj.kobj is not initialized without CONFIG_SYSFS. Do not call kobject_put in this case. Signed-off-by: Denis V. Lunev <den@openvz.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-05-23 13:09:33 +10:00
Cyrill Gorcunov	c4ea6fcf5a	module loading ELF handling: use SELFMAG instead of numeric constant Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-05-23 13:09:32 +10:00
Linus Torvalds	16ae527bfa	Merge branch 'audit.b51' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current * 'audit.b51' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: [PATCH] list_for_each_rcu must die: audit [patch 1/1] audit_send_reply(): fix error-path memory leak [PATCH] open sessionid permissions	2008-05-19 16:38:10 -07:00
Paul E. McKenney	6793a051fb	[PATCH] list_for_each_rcu must die: audit All uses of list_for_each_rcu() can be profitably replaced by the easier-to-use list_for_each_entry_rcu(). This patch makes this change for the Audit system, in preparation for removing the list_for_each_rcu() API entirely. This time with well-formed SOB. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-05-17 03:30:23 -04:00
Andrew Morton	fcaf1eb868	[patch 1/1] audit_send_reply(): fix error-path memory leak Addresses http://bugzilla.kernel.org/show_bug.cgi?id=10663 Reporter: Daniel Marjamki <danielm77@spray.se> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-05-17 03:30:22 -04:00
Al Viro	eceea0b3df	[PATCH] avoid multiplication overflows and signedness issues for max_fds Limit sysctl_nr_open - we don't want ->max_fds to exceed MAX_INT and we don't want size calculation for ->fd[] to overflow. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-05-16 17:22:52 -04:00
Al Viro	02afc6267f	[PATCH] dup_fd() fixes, part 1 Move the sucker to fs/file.c in preparation to the rest Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-05-16 17:22:26 -04:00
Harvey Harrison	3fc957721d	lib: create common ascii hex array Add a common hex array in hexdump.c so everyone can use it. Add a common hi/lo helper to avoid the shifting masking that is done to get the upper and lower nibbles of a byte value. Pull the pack_hex_byte helper from kgdb as it is opencoded many places in the tree that will be consolidated. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-14 19:11:14 -07:00
Mirco Tischler	0c70814c31	cgroups: fix compile warning Return type of cpu_rt_runtime_write() should be int instead of ssize_t. Signed-off-by: Mirco Tischler <mt-ml@gmx.de> Acked-by: Paul Menage <menage@google.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-14 19:11:14 -07:00
Linus Torvalds	c3921ab715	Add new 'cond_resched_bkl()' helper function It acts exactly like a regular 'cond_resched()', but will not get optimized away when CONFIG_PREEMPT is set. Normal kernel code is already preemptable in the presense of CONFIG_PREEMPT, so cond_resched() is optimized away (see commit `02b67cc3ba` "sched: do not do cond_resched() when CONFIG_PREEMPT"). But when wanting to conditionally reschedule while holding a lock, you need to use "cond_sched_lock(lock)", and the new function is the BKL equivalent of that. Also make fs/locks.c use it. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-11 16:04:48 -07:00
Linus Torvalds	8e3e076c5a	BKL: revert back to the old spinlock implementation The generic semaphore rewrite had a huge performance regression on AIM7 (and potentially other BKL-heavy benchmarks) because the generic semaphores had been rewritten to be simple to understand and fair. The latter, in particular, turns a semaphore-based BKL implementation into a mess of scheduling. The attempt to fix the performance regression failed miserably (see the previous commit `00b41ec261` 'Revert "semaphore: fix"'), and so for now the simple and sane approach is to instead just go back to the old spinlock-based BKL implementation that never had any issues like this. This patch also has the advantage of being reported to fix the regression completely according to Yanmin Zhang, unlike the semaphore hack which still left a couple percentage point regression. As a spinlock, the BKL obviously has the potential to be a latency issue, but it's not really any different from any other spinlock in that respect. We do want to get rid of the BKL asap, but that has been the plan for several years. These days, the biggest users are in the tty layer (open/release in particular) and Alan holds out some hope: "tty release is probably a few months away from getting cured - I'm afraid it will almost certainly be the very last user of the BKL in tty to get fixed as it depends on everything else being sanely locked." so while we're not there yet, we do have a plan of action. Tested-by: Yanmin Zhang <yanmin_zhang@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Andi Kleen <andi@firstfloor.org> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Alexander Viro <viro@ftp.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-10 20:58:02 -07:00
Linus Torvalds	00b41ec261	Revert "semaphore: fix" This reverts commit `bf726eab37`, as it has been reported to cause a regression with processes stuck in __down(), apparently because some missing wakeup. Quoth Sven Wegener: "I'm currently investigating a regression that has showed up with my last git pull yesterday. Bisecting the commits showed bf726e "semaphore: fix" to be the culprit, reverting it fixed the issue. Symptoms: During heavy filesystem usage (e.g. a kernel compile) I get several compiler processes in uninterruptible sleep, blocking all i/o on the filesystem. System is an Intel Core 2 Quad running a 64bit kernel and userspace. Filesystem is xfs on top of lvm. See below for the output of sysrq-w." See http://lkml.org/lkml/2008/5/10/45 for full report. In the meantime, we can just fix the BKL performance regression by reverting back to the good old BKL spinlock implementation instead, since any sleeping lock will generally perform badly, especially if it tries to be fair. Reported-by: Sven Wegener <sven.wegener@stealer.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-10 20:43:22 -07:00
Rusty Russell	91e37a793b	module: don't ignore vermagic string if module doesn't have modversions Linus found a logic bug: we ignore the version number in a module's vermagic string if we have CONFIG_MODVERSIONS set, but modversions also lets through a module with no __versions section for modprobe --force (with tainting, but still). We should only ignore the start of the vermagic string if the module actually has crcs to check. Rather than (say) having an entertaining hissy fit and creating a config option to work around the buggy code. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-09 07:45:18 -07:00
Rusty Russell	a5dd697074	module: be more picky about allowing missing module versions We allow missing __versions sections, because modprobe --force strips it. It makes less sense to allow sections where there's no version for a specific symbol the module uses, so disallow that. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-09 07:45:18 -07:00
Linus Torvalds	c4f51b4662	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-fixes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-fixes: sched: fix weight calculations semaphore: fix	2008-05-08 11:31:07 -07:00

1 2 3 4 5 ...

3920 Commits (e958b3600484533ff801920290468adc8135f89d)