Recently we encountered OOM problems due to memory use of the GEM cache.
Generally a large amuont of Shmem/Tmpfs pages tend to create a memory
shortage problem.
We often use the following calculation to determine the amount of shmem
pages:
shmem = NR_ACTIVE_ANON + NR_INACTIVE_ANON - NR_ANON_PAGES
however the expression does not consider isolated and mlocked pages.
This patch adds explicit accounting for pages used by shmem and tmpfs.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Acked-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The amount of memory allocated to kernel stacks can become significant and
cause OOM conditions. However, we do not display the amount of memory
consumed by stacks.
Add code to display the amount of memory used for stacks in /proc/meminfo.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This allows subsytems to provide devtmpfs with non-default permissions
for the device node. Instead of the default mode of 0600, null, zero,
random, urandom, full, tty, ptmx now have a mode of 0666, which allows
non-privileged processes to access standard device nodes in case no
other userspace process applies the expected permissions.
This also fixes a wrong assignment in pktcdvd and a checkpatch.pl complain.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Devtmpfs lets the kernel create a tmpfs instance called devtmpfs
very early at kernel initialization, before any driver-core device
is registered. Every device with a major/minor will provide a
device node in devtmpfs.
Devtmpfs can be changed and altered by userspace at any time,
and in any way needed - just like today's udev-mounted tmpfs.
Unmodified udev versions will run just fine on top of it, and will
recognize an already existing kernel-created device node and use it.
The default node permissions are root:root 0600. Proper permissions
and user/group ownership, meaningful symlinks, all other policy still
needs to be applied by userspace.
If a node is created by devtmps, devtmpfs will remove the device node
when the device goes away. If the device node was created by
userspace, or the devtmpfs created node was replaced by userspace, it
will no longer be removed by devtmpfs.
If it is requested to auto-mount it, it makes init=/bin/sh work
without any further userspace support. /dev will be fully populated
and dynamic, and always reflect the current device state of the kernel.
With the commonly used dynamic device numbers, it solves the problem
where static devices nodes may point to the wrong devices.
It is intended to make the initial bootup logic simpler and more robust,
by de-coupling the creation of the inital environment, to reliably run
userspace processes, from a complex userspace bootstrap logic to provide
a working /dev.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jan Blunck <jblunck@suse.de>
Tested-By: Harald Hoyer <harald@redhat.com>
Tested-By: Scott James Remnant <scott@ubuntu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When turning class devices into bus devices, we may need to
temporarily add links in sysfs so that user-space applications
are not confused. This is done by adding the following API:
* Functions to register and unregister compatibility classes.
These appear in sysfs at the same location as regular classes, but
instead of class devices, they contain links to bus devices.
* Functions to create and delete such links. Additionally, the caller
can optionally pass a target device to which a "device" link should
point (typically that would be the device's parent), to fully emulate
the original class device.
The i2c subsystem will be the first user of this API, as i2c adapters
are being converted from class devices to bus devices.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Placing dma-coherent.c in driver/base is better than in kernel,
since it contains code to do per-device coherent dma memory
handling.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Let attribute group vectors be declared "const". We'd
like to let most attribute metadata live in read-only
sections... this is a start.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
No one should directly access the driver_data field, so remove the field
and make it private. We dynamically create the private field now if it
is needed, to handle drivers that call get/set before they are
registered with the driver core.
Also update the copyright notices on these files while we are there.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1271) affects when new devices get linked into their
bus's list of devices. Currently this happens after probing, and it
doesn't happen at all if probing fails. Clearly this is wrong,
because at that point quite a few symbolic links have already been
created in sysfs. We are committed to adding the device, so it should
be linked into the bus's list regardless.
In addition, this needs to happen before the uevent announcing the new
device gets issued. Otherwise user programs might try to access the
device before it has been added to the bus.
To fix both these problems, the patch moves the call to
klist_add_tail() forward from bus_attach_device() to bus_add_device().
Since bus_attach_device() now does nothing but probe for drivers, it
has been renamed to bus_probe_device(). And lastly, the kerneldoc is
updated.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
transition_started should be set once the preparation of devices for
a PM has started, reset before starting to resume devices. When
resuming devices, kernel calls dpm_resume_noirq then
dpm_resume_end(dpm_resume). Thus we should reset transition_started
at dpm_resume_noirq.
This patch fixes ACPI warning when resuming from suspend/hibernate:
ACPI: \_SB_.PCI0.IDE1.PRI1.MAS1 - docking
------------[ cut here ]------------
WARNING: at drivers/base/power/main.c:87 device_pm_add+0x8b/0xcc()
Hardware name: OptiPlex 760
Device: acpi
Parentless device registered during a PM transaction
[rjw: Fixed up the changelog.]
Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
The kerneldoc comments in drivers/base/power/main.c are generally
outdated and some of them don't describe the functions very
accurately. Update them and standardize the format to use spaces
instead of tabs.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
This patch adds default Runtime PM callbacks to the dev_pm_ops
belonging to the platform bus. The callbacks are weak symbols
that architecture specific code can override.
Allows Runtime PM even though CONFIG_PM_SLEEP=n.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Introduce a core framework for run-time power management of I/O
devices. Add device run-time PM fields to 'struct dev_pm_info'
and device run-time PM callbacks to 'struct dev_pm_ops'. Introduce
a run-time PM workqueue and define some device run-time PM helper
functions at the core level. Document all these things.
Special thanks to Alan Stern for his help with the design and
multiple detailed reviews of the pereceding versions of this patch
and to Magnus Damm for testing feedback.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@igel.co.jp>
Commit 783ea7d4ee
(Driver Core: Rework platform suspend/resume, print warning)
added a warning message printed for platform drivers that use the
legacy PM callbacks rather than struct dev_pm_ops. Unfortunately,
this resulted in some confusion and made some people try to convert
drivers by replacing the old callbacks with struct dev_pm_ops in
automatic way, which generally is not a good idea.
Remove the platform device runtime dev_pm_ops warning for now,
because it's annoying to users and it's not really necessary right
now.
[rjw: Modified the changelog to be more informative.]
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
If kobject_init_and_add fails, sysdev_register should not send KOBJ_ADD
uevent to userspace.
Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The page pointers array is allocated in fw_realloc_buffer() called by
firmware_data_write(), and should be freed in release function of firmware
device.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
They are not supposed to be modified by drivers, so make them const.
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
This is V2 of the platform driver power management late/early
callback removal patch. The callbacks ->suspend_late() and
->resume_early() are removed since all in-tree users now have
been migrated to dev_pm_ops.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
I just debugged an obscure crash caused by a device_del() of a all NULL'd
out struct device (in usb-serial) and found that a patch like this one would
have saved me time (in addition to improved chances of a bug report from
users hitting similar driver bugs).
[akpm@linux-foundation.org: cleanup]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
PM: Clear -EAGAIN in dpm_prepare
x86: Fix resume from suspend when CONFIG_CC_STACKPROTECTOR
The f_dev in _request_firmware() is allocated via the fw_setup_device()
and fw_register_device() calls and its class set to firmware_class (the
class release function is fw_dev_release).
Commit 6acf70f078 replaced the kfree(dev) in fw_dev_release() with a
put_device() call but my understanding is that the release function is
called via put_device -> kobject_put -> kref_put -> koject_release etc.
and it should call kfree since it's the last to see this device
structure alive.
Because of that, the _request_firmware() function on its -ENOENT error
path only calls device_unregister(f_dev) which would eventually call
fw_dev_release() but there is no kfree (the subsequent put_device call
would just make the kref negative).
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When the last device in the dpm list is unregistered directly after its
prepare() callback returned with -EAGAIN, the return code is passed to
the calling function, resulting in a suspend failure. Prevent this by
clearing the return code after -EAGAIN.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
* akpm: (182 commits)
fbdev: bf54x-lq043fb: use kzalloc over kmalloc/memset
fbdev: *bfin*: fix __dev{init,exit} markings
fbdev: *bfin*: drop unnecessary calls to memset
fbdev: bfin-t350mcqb-fb: drop unused local variables
fbdev: blackfin has __raw I/O accessors, so use them in fb.h
fbdev: s1d13xxxfb: add accelerated bitblt functions
tcx: use standard fields for framebuffer physical address and length
fbdev: add support for handoff from firmware to hw framebuffers
intelfb: fix a bug when changing video timing
fbdev: use framebuffer_release() for freeing fb_info structures
radeon: P2G2CLK_ALWAYS_ONb tested twice, should 2nd be P2G2CLK_DAC_ALWAYS_ONb?
s3c-fb: CPUFREQ frequency scaling support
s3c-fb: fix resource releasing on error during probing
carminefb: fix possible access beyond end of carmine_modedb[]
acornfb: remove fb_mmap function
mb862xxfb: use CONFIG_OF instead of CONFIG_PPC_OF
mb862xxfb: restrict compliation of platform driver to PPC
Samsung SoC Framebuffer driver: add Alpha Channel support
atmel-lcdc: fix pixclock upper bound detection
offb: use framebuffer_alloc() to allocate fb_info struct
...
Manually fix up conflicts due to kmemcheck in mm/slab.c
This adds the nodename callback for struct class, struct device_type and
struct device, to allow drivers to send userspace hints on the device
name and subdirectory that should be used for it.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This removes the
warning: format not a string literal and no format arguments
warnings in the driver core that gcc 4.3.3 complains about.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The firmware loader has a statically allocated 30 bytes long string for
the firmware id (a.k.a. the firmware file name). There is no reason why
we couldnt allocate it dynamically, and avoid having restrictions on the
firmware names lengths.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Cc: Marcel Holtmann <holtmann@linux.intel.com>
Cc: Zhu Yi <yi.zhu@intel.com>,
Cc: John Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
request_firmware_nowait declares it can be called in non-sleep contexts,
but kthead_run called by request_firmware_nowait may sleep. So fix its
documentation and comment to make callers clear about it.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
A patch series to make .shutdown execute asynchronously. Some drivers's
shutdown can take a lot of time. The patches can help save some shutdown
time. The patches use Arjan's async API.
This patch:
synchronize all tasks submitted by .shutdown
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Dmitry Torokhov <dtor@mail.ru>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
sysdev_class_register should check the kobject_set_name return value.
Add the return value checking code.
Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This converts resource and IRQ getbyname functions for the platform
bus to use const char *, I ran into compiler moanings when I tried
using a const char * for looking up a certain resource.
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch adds a new bus notifier event which is emitted _after_ a
device is removed from its driver. This event will be used by the
dma-api debug code to check if a driver has released all dma allocations
for that device.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch reworks the platform driver code for legacy
suspend and resume to avoid installing callbacks in
struct device_driver. A warning is also added telling
users to update the platform driver to use dev_pm_ops.
The functions platform_legacy_suspend()/resume() directly
call suspend and resume callbacks in struct platform_driver
instead of wrapping things in platform_drv_suspend()/resume().
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
This patch removes the legacy callbacks ->suspend() and
->resume() from struct device_type. These callbacks seem
unused, and new code should instead make use of struct
dev_pm_ops.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Remove the ->suspend_late() and ->resume_early() callbacks
from struct bus_type V2. These callbacks are legacy stuff
at this point and since there seem to be no in-tree users
we may as well remove them. New users should use dev_pm_ops.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
This patch (as1241) renames a bunch of functions in the PM core.
Rather than go through a boring list of name changes, suffice it to
say that in the end we have a bunch of pairs of functions:
device_resume_noirq dpm_resume_noirq
device_resume dpm_resume
device_complete dpm_complete
device_suspend_noirq dpm_suspend_noirq
device_suspend dpm_suspend
device_prepare dpm_prepare
in which device_X does the X operation on a single device and dpm_X
invokes device_X for all devices in the dpm_list.
In addition, the old dpm_power_up and device_resume_noirq have been
combined into a single function (dpm_resume_noirq).
Lastly, dpm_suspend_start and dpm_resume_end are the renamed versions
of the former top-level device_suspend and device_resume routines.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Rename the functions performing "_noirq" dev_pm_ops
operations from device_power_down() and device_power_up()
to device_suspend_noirq() and device_resume_noirq().
The new function names are chosen to show that the functions
are responsible for calling the _noirq() versions to finalize
the suspend/resume operation. The current function names do
not perform power down/up anymore so the names may be misleading.
Global function renames:
- device_power_down() -> device_suspend_noirq()
- device_power_up() -> device_resume_noirq()
Static function renames:
- suspend_device_noirq() -> __device_suspend_noirq()
- resume_device_noirq() -> __device_resume_noirq()
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Len Brown <lenb@kernel.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Sysdevs have to be suspended and resumed with interrupts disabled and
things usually break in a way that's difficult to debug if one of
sysdev drivers enables interrupts by mistake during suspend or
resume. Add extra checks that will generate warnings in such cases.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
We also fix a problem with cleaning up properly when initializing
drivers and devices, so checks like this will work successfully.
Portions of the patch by Linus and Greg and Ingo.
Reported-by: Ozan Çağlayan <ozan@pardus.org.tr>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We shouldn't hold dpm_list_mtx while executing
[disable|enable]_nonboot_cpus(), because theoretically this may lead
to a deadlock as shown by the following example (provided by Johannes
Berg):
CPU 3 CPU 2 CPU 1
suspend/hibernate
something:
rtnl_lock() device_pm_lock()
-> mutex_lock(&dpm_list_mtx)
mutex_lock(&dpm_list_mtx)
linkwatch_work
-> rtnl_lock()
disable_nonboot_cpus()
-> flush CPU 3 workqueue
Fortunately, device drivers are supposed to stop any activities that
might lead to the registration of new device objects way before
disable_nonboot_cpus() is called, so it shouldn't be necessary to
hold dpm_list_mtx over the entire late part of device suspend and
early part of device resume.
Thus, during the late suspend and the early resume of devices acquire
dpm_list_mtx only when dpm_list is going to be traversed and release
it right after that.
This patch is reported to fix the regressions tracked as
http://bugzilla.kernel.org/show_bug.cgi?id=13245.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Miles Lane <miles.lane@gmail.com>
Tested-by: Ming Lei <tom.leiming@gmail.com>
Rather than calling vmalloc() repeatedly to grow the firmware image as
we receive data from userspace, just allocate and fill individual pages.
Then vmap() the whole lot in one go when we're done.
A quick test with a 337KiB iwlagn firmware shows the time taken for
request_firmware() going from ~32ms to ~5ms after I apply this patch.
[v2: define PAGE_KERNEL_RO as PAGE_KERNEL where necessary, use min_t()]
[v3: kunmap() takes the struct page *, not the virtual address]
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Tested-by: Sachin Sant <sachinp@in.ibm.com>