Commit Graph

4975 Commits (28dbc4b6a01fb579a9441c7b81e3d3413dc452df)

Author SHA1 Message Date
Balbir Singh 52bc0d8210 memcg: memory cgroup hierarchy documentation
Documentation updates for hierarchy support

Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Cc: Paul Menage <menage@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pavel Emelianov <xemul@openvz.org>
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-08 08:31:05 -08:00
KAMEZAWA Hiroyuki 8c7c6e34a1 memcg: mem+swap controller core
This patch implements per cgroup limit for usage of memory+swap.  However
there are SwapCache, double counting of swap-cache and swap-entry is
avoided.

Mem+Swap controller works as following.
  - memory usage is limited by memory.limit_in_bytes.
  - memory + swap usage is limited by memory.memsw_limit_in_bytes.

This has following benefits.
  - A user can limit total resource usage of mem+swap.

    Without this, because memory resource controller doesn't take care of
    usage of swap, a process can exhaust all the swap (by memory leak.)
    We can avoid this case.

    And Swap is shared resource but it cannot be reclaimed (goes back to memory)
    until it's used. This characteristic can be trouble when the memory
    is divided into some parts by cpuset or memcg.
    Assume group A and group B.
    After some application executes, the system can be..

    Group A -- very large free memory space but occupy 99% of swap.
    Group B -- under memory shortage but cannot use swap...it's nearly full.

    Ability to set appropriate swap limit for each group is required.

Maybe someone wonder "why not swap but mem+swap ?"

  - The global LRU(kswapd) can swap out arbitrary pages. Swap-out means
    to move account from memory to swap...there is no change in usage of
    mem+swap.

    In other words, when we want to limit the usage of swap without affecting
    global LRU, mem+swap limit is better than just limiting swap.

Accounting target information is stored in swap_cgroup which is
per swap entry record.

Charge is done as following.
  map
    - charge  page and memsw.

  unmap
    - uncharge page/memsw if not SwapCache.

  swap-out (__delete_from_swap_cache)
    - uncharge page
    - record mem_cgroup information to swap_cgroup.

  swap-in (do_swap_page)
    - charged as page and memsw.
      record in swap_cgroup is cleared.
      memsw accounting is decremented.

  swap-free (swap_free())
    - if swap entry is freed, memsw is uncharged by PAGE_SIZE.

There are people work under never-swap environments and consider swap as
something bad. For such people, this mem+swap controller extension is just an
overhead.  This overhead is avoided by config or boot option.
(see Kconfig. detail is not in this patch.)

TODO:
 - maybe more optimization can be don in swap-in path. (but not very safe.)
   But we just do simple accounting at this stage.

[nishimura@mxp.nes.nec.co.jp: make resize limit hold mutex]
[hugh@veritas.com: memswap controller core swapcache fixes]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-08 08:31:05 -08:00
KAMEZAWA Hiroyuki c077719be8 memcg: mem+swap controller Kconfig
Config and control variable for mem+swap controller.

This patch adds CONFIG_CGROUP_MEM_RES_CTLR_SWAP
(memory resource controller swap extension.)

For accounting swap, it's obvious that we have to use additional memory to
remember "who uses swap".  This adds more overhead.  So, it's better to
offer "choice" to users.  This patch adds 2 choices.

This patch adds 2 parameters to enable swap extension or not.
  - CONFIG
  - boot option

Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-08 08:31:05 -08:00
KAMEZAWA Hiroyuki d13d144309 memcg: handle swap caches
SwapCache support for memory resource controller (memcg)

Before mem+swap controller, memcg itself should handle SwapCache in proper
way.  This is cut-out from it.

In current memcg, SwapCache is just leaked and the user can create tons of
SwapCache.  This is a leak of account and should be handled.

SwapCache accounting is done as following.

  charge (anon)
	- charged when it's mapped.
	  (because of readahead, charge at add_to_swap_cache() is not sane)
  uncharge (anon)
	- uncharged when it's dropped from swapcache and fully unmapped.
	  means it's not uncharged at unmap.
	  Note: delete from swap cache at swap-in is done after rmap information
	        is established.
  charge (shmem)
	- charged at swap-in. this prevents charge at add_to_page_cache().

  uncharge (shmem)
	- uncharged when it's dropped from swapcache and not on shmem's
	  radix-tree.

  at migration, check against 'old page' is modified to handle shmem.

Comparing to the old version discussed (and caused troubles), we have
advantages of
  - PCG_USED bit.
  - simple migrating handling.

So, situation is much easier than several months ago, maybe.

[hugh@veritas.com: memcg: handle swap caches build fix]
Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Tested-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-08 08:31:05 -08:00
KAMEZAWA Hiroyuki c1e862c1f5 memcg: new force_empty to free pages under group
By memcg-move-all-accounts-to-parent-at-rmdir.patch, there is no leak of
memory usage and force_empty is removed.

This patch adds "force_empty" again, in reasonable manner.

memory.force_empty file works when

  #echo 0 (or some) > memory.force_empty
  and have following function.

  1. only works when there are no task in this cgroup.
  2. free all page under this cgroup as much as possible.
  3. page which cannot be freed will be moved up to parent.
  4. Then, memcg will be empty after above echo returns.

This is much better behavior than old "force_empty" which just forget
all accounts. This patch also check signal_pending() and above "echo"
can be stopped by "Ctrl-C".

[akpm@linux-foundation.org: cleanup]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-08 08:31:04 -08:00
KAMEZAWA Hiroyuki f817ed4853 memcg: move all acccounting to parent at rmdir()
This patch provides a function to move account information of a page
between mem_cgroups and rewrite force_empty to make use of this.

This moving of page_cgroup is done under
 - lru_lock of source/destination mem_cgroup is held.
 - lock_page_cgroup() is held.

Then, a routine which touches pc->mem_cgroup without lock_page_cgroup()
should confirm pc->mem_cgroup is still valid or not.  Typical code can be
following.

(while page is not under lock_page())
	mem = pc->mem_cgroup;
	mz = page_cgroup_zoneinfo(pc)
	spin_lock_irqsave(&mz->lru_lock);
	if (pc->mem_cgroup == mem)
		...../* some list handling */
	spin_unlock_irqrestore(&mz->lru_lock);

Of course, better way is
	lock_page_cgroup(pc);
	....
	unlock_page_cgroup(pc);

But you should confirm the nest of lock and avoid deadlock.

If you treats page_cgroup from mem_cgroup's LRU under mz->lru_lock,
you don't have to worry about what pc->mem_cgroup points to.
moved pages are added to head of lru, not to tail.

Expected users of this routine is:
  - force_empty (rmdir)
  - moving tasks between cgroup (for moving account information.)
  - hierarchy (maybe useful.)

force_empty(rmdir) uses this move_account and move pages to its parent.
This "move" will not cause OOM (I added "oom" parameter to try_charge().)

If the parent is busy (not enough memory), force_empty calls try_to_free_page()
and reduce usage.

Purpose of this behavior is
  - Fix "forget all" behavior of force_empty and avoid leak of accounting.
  - By "moving first, free if necessary", keep pages on memory as much as
    possible.

Adding a switch to change behavior of force_empty to
  - free first, move if necessary
  - free all, if there is mlocked/busy pages, return -EBUSY.
is under consideration. (I'll add if someone requtests.)

This patch also removes memory.force_empty file, a brutal debug-only interface.

Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Tested-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-08 08:31:04 -08:00
Li Zefan 18e7f1f0d3 cgroups: documentation updates
- remove 'releasable' since it has been moved to the debug subsys.
- update lock requirements of subsys callbacks.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-08 08:31:01 -08:00
Linus Torvalds b424e8d3b4 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (98 commits)
  PCI PM: Put PM callbacks in the order of execution
  PCI PM: Run default PM callbacks for all devices using new framework
  PCI PM: Register power state of devices during initialization
  PCI PM: Call pci_fixup_device from legacy routines
  PCI PM: Rearrange code in pci-driver.c
  PCI PM: Avoid touching devices behind bridges in unknown state
  PCI PM: Move pci_has_legacy_pm_support
  PCI PM: Power-manage devices without drivers during suspend-resume
  PCI PM: Add suspend counterpart of pci_reenable_device
  PCI PM: Fix poweroff and restore callbacks
  PCI: Use msleep instead of cpu_relax during ASPM link retraining
  PCI: PCIe portdrv: Add kerneldoc comments to remining core funtions
  PCI: PCIe portdrv: Rearrange code so that related things are together
  PCI: PCIe portdrv: Fix suspend and resume of PCI Express port services
  PCI: PCIe portdrv: Add kerneldoc comments to some core functions
  x86/PCI: Do not use interrupt links for devices using MSI-X
  net: sfc: Use pci_clear_master() to disable bus mastering
  PCI: Add pci_clear_master() as opposite of pci_set_master()
  PCI hotplug: remove redundant test in cpq hotplug
  PCI: pciehp: cleanup register and field definitions
  ...
2009-01-07 15:41:01 -08:00
Linus Torvalds 7c7758f99d Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (123 commits)
  wimax/i2400m: add CREDITS and MAINTAINERS entries
  wimax: export linux/wimax.h and linux/wimax/i2400m.h with headers_install
  i2400m: Makefile and Kconfig
  i2400m/SDIO: TX and RX path backends
  i2400m/SDIO: firmware upload backend
  i2400m/SDIO: probe/disconnect, dev init/shutdown and reset backends
  i2400m/SDIO: header for the SDIO subdriver
  i2400m/USB: TX and RX path backends
  i2400m/USB: firmware upload backend
  i2400m/USB: probe/disconnect, dev init/shutdown and reset backends
  i2400m/USB: header for the USB bus driver
  i2400m: debugfs controls
  i2400m: various functions for device management
  i2400m: RX and TX data/control paths
  i2400m: firmware loading and bootrom initialization
  i2400m: linkage to the networking stack
  i2400m: Generic probe/disconnect, reset and message passing
  i2400m: host/device procotol and core driver definitions
  i2400m: documentation and instructions for usage
  wimax: Makefile, Kconfig and docbook linkage for the stack
  ...
2009-01-07 15:37:24 -08:00
Linus Torvalds c6906a2cb7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes:
  kbuild: fix typos (s/bin_shipped/bin.o_shipped/) in Documentation
  kbuild: add a symlink to the source for separate objdirs
  kconfig: add script to manipulate .config files on the command line
  kbuild: reintroduce ALLSOURCE_ARCHS support for tags/cscope
  bootchart: improve output based on Dave Jones' feedback
  fix modules_install via NFS
  qnx: include <linux/types.h> for definitions of __[us]{8,16,32,64} types
2009-01-07 13:11:28 -08:00
Wolfram Sang baa91878ab kbuild: fix typos (s/bin_shipped/bin.o_shipped/) in Documentation
The text always mentions ...bin.o_shipped, just the example makefiles
actually use ...bin_shipped. It was corrected in one place some time
ago, these ones seem to have been forgotten.

Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2009-01-07 21:44:23 +01:00
Jike Song 4f628248a5 kbuild: reintroduce ALLSOURCE_ARCHS support for tags/cscope
This patch reintroduce the ALLSOURCE_ARCHS support for tags/TAGS/
cscope targets. The Kbuild previously has this feature, but after
moving the targets into scripts/tags.sh, ALLSOURCE_ARCHS disappears.

It's something like this:

	$ make ALLSOURCE_ARCHS="x86 mips arm" tags cscope

Signed-off-by: Jike Song <albcamus@gmail.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2009-01-07 21:44:21 +01:00
Linus Torvalds a0c9f240a9 Merge branch 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc
* 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc:
  proc: remove write-only variable in proc_pident_lookup()
  proc: fix sparse warning
  proc: add /proc/*/stack
  proc: remove '##' usage
  proc: remove useless WARN_ONs
  proc: stop using BKL
2009-01-07 12:01:06 -08:00
Linus Torvalds 5bb47b9ff3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6: (171 commits)
  Blackfin arch: fix bug - BF527 0.2 silicon has different CPUID (DSPID) value
  Blackfin arch: Enlarge flash partition for kenel for bf533/bf537 boards
  Blackfin arch: fix bug: kernel crash when enable SDIO host driver
  Blackfin arch: Print FP at level KERN_NOTICE
  Blackfin arch: drop ad73311 test code
  Blackfin arch: update board default configs
  Blackfin arch: Set PB4 as the default irq for bf548 board v1.4+.
  Blackfin arch: fix typo in early printk bit size processing
  Blackfin arch: enable reprogram cclk and sclk for bf518f-ezbrd
  Blackfin arch: add SDIO host driver platform data
  Blackfin arch: fix bug - kernel stops at initial console
  Blackfin arch: fix bug - kernel crash after config IP for ethernet port
  Blackfin arch: add sdh support for bf518f-ezbrd
  Blackfin arch: fix bug - kernel detects BF532 incorrectly
  Blackfin arch: add () to avoid warnings from gcc
  Blackfin arch: change HWTRACE Kconfig and set it on default
  Blackfin arch: Clean oprofile build path for blackfin
  Blackfin arch: remove hardware PM code, oprofile not use it
  Blackfin arch: rewrite get_sclk()/get_vco()
  Blackfin arch: cleanup and unify the ins functions
  ...
2009-01-07 12:00:25 -08:00
Linus Torvalds 2f2408a88c Merge branch 'hwmon-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6
* 'hwmon-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6: (29 commits)
  hwmon: Fix various typos
  hwmon: Check for ACPI resource conflicts
  hwmon: (lm70) Add TI TMP121 support
  hwmon: (lm70) Code streamlining and cleanup
  hwmon: Deprecate the fscher and fscpos drivers
  hwmon: (fschmd) Add watchdog support
  hwmon: (fschmd) Cleanups for watchdog support
  hwmon: (i5k_amb) Load automatically on all 5000/5400 chipsets
  hwmon: (it87) Add support for the ITE IT8720F
  hwmon: Don't overuse I2C_CLIENT_MODULE_PARM
  hwmon: Add LTC4245 driver
  hwmon: (f71882fg) Fix fan_to/from_reg prototypes
  hwmon: (f71882fg) Printout fan modes
  hwmon: (f71882fg) Add documentation
  hwmon: (f71882fg) Fix auto_channels_temp temp numbering with f8000
  hwmon: (f71882fg) Add missing pwm3 attr for f71862fg
  hwmon: (f71882fg) Add F8000 support
  hwmon: (f71882fg) Remove the fan_mode module option
  hwmon: (f71882fg) Separate max and crit alarm and beep
  hwmon: (f71882fg) Check for hwmon powerdown state
  ...
2009-01-07 11:59:51 -08:00
Linus Torvalds 57c44c5f6f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (24 commits)
  trivial: chack -> check typo fix in main Makefile
  trivial: Add a space (and a comma) to a printk in 8250 driver
  trivial: Fix misspelling of "firmware" in docs for ncr53c8xx/sym53c8xx
  trivial: Fix misspelling of "firmware" in powerpc Makefile
  trivial: Fix misspelling of "firmware" in usb.c
  trivial: Fix misspelling of "firmware" in qla1280.c
  trivial: Fix misspelling of "firmware" in a100u2w.c
  trivial: Fix misspelling of "firmware" in megaraid.c
  trivial: Fix misspelling of "firmware" in ql4_mbx.c
  trivial: Fix misspelling of "firmware" in acpi_memhotplug.c
  trivial: Fix misspelling of "firmware" in ipw2100.c
  trivial: Fix misspelling of "firmware" in atmel.c
  trivial: Fix misspelled firmware in Kconfig
  trivial: fix an -> a typos in documentation and comments
  trivial: fix then -> than typos in comments and documentation
  trivial: update Jesper Juhl CREDITS entry with new email
  trivial: fix singal -> signal typo
  trivial: Fix incorrect use of "loose" in event.c
  trivial: printk: fix indentation of new_text_line declaration
  trivial: rtc-stk17ta8: fix sparse warning
  ...
2009-01-07 11:31:52 -08:00
Ben Hutchings 6a479079c0 PCI: Add pci_clear_master() as opposite of pci_set_master()
During an online device reset it may be useful to disable bus-mastering.
pci_disable_device() does that, and far more besides, so is not suitable
for an online reset.

Add pci_clear_master() which does just this.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:13:23 -08:00
Arjan van de Ven e8de1481fd resource: allow MMIO exclusivity for device drivers
Device drivers that use pci_request_regions() (and similar APIs) have a
reasonable expectation that they are the only ones accessing their device.
As part of the e1000e hunt, we were afraid that some userland (X or some
bootsplash stuff) was mapping the MMIO region that the driver thought it
had exclusively via /dev/mem or via various sysfs resource mappings.

This patch adds the option for device drivers to cause their reserved
regions to the "banned from /dev/mem use" list, so now both kernel memory
and device-exclusive MMIO regions are banned.
NOTE: This is only active when CONFIG_STRICT_DEVMEM is set.

In addition to the config option, a kernel parameter iomem=relaxed is
provided for the cases where developers want to diagnose, in the field,
drivers issues from userspace.

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:12:32 -08:00
Inaky Perez-Gonzalez 3e91029ae0 i2400m: documentation and instructions for usage
The driver for the i2400m is a stacked driver. There is a core driver,
the bus-generic driver that has no knowledge or dependencies on how
the device is connected to the system; it only knows how to speak the
device protocol. Then there are the bus-specific drivers (for USB and
SDIO) that provide backends for the generic driver to communicate with
the device.

The bus generic driver connects to the network and WiMAX stacks on the
top side, and on the bottom to the bus-specific drivers.

Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-07 10:00:18 -08:00
Inaky Perez-Gonzalez b0c83ae1de wimax: Makefile, Kconfig and docbook linkage for the stack
This patch provides Makefile and KConfig for the WiMAX stack,
integrating them into the networking stack's Makefile, Kconfig and
doc-book templates.

Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-07 10:00:17 -08:00
Inaky Perez-Gonzalez 0d695913b0 wimax: documentation for the stack
wimax documentation

Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-07 10:00:16 -08:00
Alan Stern c838ea4626 USB: storage: make the "quirks=" module parameter writable
This patch (as1190) makes usb-storage's "quirks=" module parameter
writable, so that users can add entries for their devices at runtime
with no need to reboot or reload usb-storage.

New codes are added for the SANE_SENSE, CAPACITY_HEURISTICS, and
CAPACITY_OK flags.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-07 10:00:12 -08:00
Alan Stern 9ac39f28b5 USB: add asynchronous autosuspend/autoresume support
This patch (as1160b) adds support routines for asynchronous autosuspend
and autoresume, with accompanying documentation updates.  There
already are several potential users of this interface, and others are
likely to arise as autosuspend support becomes more widespread.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-07 09:59:53 -08:00
Alan Stern d4f373e57d USB: usb-storage: add "quirks=" module parameter
This patch (as1163b) adds a "quirks=" module parameter to usb-storage.
This will allow people to make short-term changes to their
unusual_devs list without rebuilding the entire driver.  Testing will
become much easier, and less-sophisticated users will be able to
access their buggy devices after a simple config-file change instead
of having to wait for a new kernel release.

The patch also adds a documentation entry for usb-storage's
"delay_use" parameter, which has been around for years but but was
never listed among the kernel parameters.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-07 09:59:53 -08:00
Jean Delvare 77fa49d94a hwmon: Fix various typos
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: David Hubbard <david.c.hubbard@gmail.com>
2009-01-07 16:37:35 +01:00
Manuel Lauss c8ac32e471 hwmon: (lm70) Add TI TMP121 support
The Texas Instruments TMP121 is a SPI temperature sensor very similar
to the LM70, with slightly higher resolution.  This patch extends the
LM70 driver to support the TMP121.  The TMP123 differs in pin assign-
ment.

Signed-off-by: Manuel Lauss <mano@roarinelk.homelinux.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2009-01-07 16:37:34 +01:00
Kaiwan N Billimoria 2b7300513b hwmon: (lm70) Code streamlining and cleanup
This fixes a byteswap bug in the LM70 temperature sensor driver,
which was previously covered up by a converse bug in the driver
for the LM70EVAL-LLP board (which is also fixed).

Other fixes:  doc updates, remove an annoying msleep(), and improve
three-wire protocol handling.

Signed-off-by: Kaiwan N Billimoria <kaiwan@designergraphix.com>
[ dbrownell@users.sourceforge.net: doc and whitespace tweaks ]
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2009-01-07 16:37:34 +01:00
Hans de Goede 0589c2de64 hwmon: Deprecate the fscher and fscpos drivers
Now that the new merged fschmd driver has gained support for the watchdog
integrated into these IC's, there is no more reason to keep the old fscher
and fscpos drivers around, so mark them as deprecated.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2009-01-07 16:37:33 +01:00
Jean-Marc Spaggiari b4da93e4b0 hwmon: (it87) Add support for the ITE IT8720F
Allow it87.c to handle IT8720 chipset like IT8718 in order to
retrieve voltage, temperatures and fans speed from sensors
tools. Also updating the related documentation.

Signed-off-by: Jean-Marc Spaggiari <jean-marc@spaggiari.org>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2009-01-07 16:37:32 +01:00
Ira Snyder 6e34b187bc hwmon: Add LTC4245 driver
Add Linux support for the Linear Technology LTC4245 Multiple Supply Hot
Swap controller I2C monitoring interface.

Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2009-01-07 16:37:32 +01:00
Hans de Goede 3b02d332b6 hwmon: (f71882fg) Add documentation
Add some documentation about the f71882fg driver, and update the Kconfig
documentation to report the new supported models.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2009-01-07 16:37:31 +01:00
Graf Yang 5e6d9f511e Blackfin arch: Add document about bfin-gpio
Add document about bfin-gpio when requesting a pin
both as gpio and gpio interrupt.

Signed-off-by: Graf Yang <graf.yang@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-01-07 23:14:38 +08:00
Linus Torvalds db30c70575 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (29 commits)
  Input: i8042 - add Dell Vostro 1510 to nomux list
  Input: gtco - use USB endpoint API
  Input: add support for Maple controller as a joystick
  Input: atkbd - broaden the Dell DMI signatures
  Input: HIL drivers - add MODULE_ALIAS()
  Input: map_to_7segment.h - convert to __inline__ for userspace
  Input: add support for enhanced rotary controller on pxa930 and pxa935
  Input: add support for trackball on pxa930 and pxa935
  Input: add da9034 touchscreen support
  Input: ads7846 - strict_strtoul takes unsigned long
  Input: make some variables and functions static
  Input: add tsc2007 based touchscreen driver
  Input: psmouse - add module parameters to control OLPC touchpad delays
  Input: i8042 - add Gigabyte M912 netbook to noloop exception table
  Input: atkbd - Samsung NC10 key repeat fix
  Input: atkbd - add keyboard quirk for HP Pavilion ZV6100 laptop
  Input: libps2 - handle 0xfc responses from devices
  Input: add support for Wacom W8001 penabled serial touchscreen
  Input: synaptics - report multi-taps only if supported by the device
  Input: add joystick driver for Walkera WK-0701 RC transmitter
  ...
2009-01-06 17:14:01 -08:00
Linus Torvalds c861ea2cb2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
  CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #3]
  Revert "CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #2]"
  SELinux: shrink sizeof av_inhert selinux_class_perm and context
  CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #2]
  keys: fix sparse warning by adding __user annotation to cast
  smack: Add support for unlabeled network hosts and networks
  selinux: Deprecate and schedule the removal of the the compat_net functionality
  netlabel: Update kernel configuration API
2009-01-06 17:11:39 -08:00
Linus Torvalds 40d7ee5d16 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (60 commits)
  uio: make uio_info's name and version const
  UIO: Documentation for UIO ioport info handling
  UIO: Pass information about ioports to userspace (V2)
  UIO: uio_pdrv_genirq: allow custom irq_flags
  UIO: use pci_ioremap_bar() in drivers/uio
  arm: struct device - replace bus_id with dev_name(), dev_set_name()
  libata: struct device - replace bus_id with dev_name(), dev_set_name()
  avr: struct device - replace bus_id with dev_name(), dev_set_name()
  block: struct device - replace bus_id with dev_name(), dev_set_name()
  chris: struct device - replace bus_id with dev_name(), dev_set_name()
  dmi: struct device - replace bus_id with dev_name(), dev_set_name()
  gadget: struct device - replace bus_id with dev_name(), dev_set_name()
  gpio: struct device - replace bus_id with dev_name(), dev_set_name()
  gpu: struct device - replace bus_id with dev_name(), dev_set_name()
  hwmon: struct device - replace bus_id with dev_name(), dev_set_name()
  i2o: struct device - replace bus_id with dev_name(), dev_set_name()
  IA64: struct device - replace bus_id with dev_name(), dev_set_name()
  i7300_idle: struct device - replace bus_id with dev_name(), dev_set_name()
  infiniband: struct device - replace bus_id with dev_name(), dev_set_name()
  ISDN: struct device - replace bus_id with dev_name(), dev_set_name()
  ...
2009-01-06 17:02:07 -08:00
Linus Torvalds 59e3af21e9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (41 commits)
  scc_pata: make use of scc_dma_sff_read_status()
  ide-dma-sff: factor out ide_dma_sff_write_status()
  ide: move read_sff_dma_status() method to 'struct ide_dma_ops'
  ide: don't set hwif->dma_ops in init_dma() method
  Resurrect IT8172 IDE controller driver
  piix: sync ich_laptop[] with ata_piix.c
  ide: update warm-plug HOWTO
  ide: fix ide_port_scan() to do ACPI setup after initializing request queues
  ide: remove now redundant ->cur_dev checks
  ide: remove unused ide_hwif_t.sg_mapped field
  ide: struct ide_atapi_pc - remove unused fields and update documentation
  ide: remove superfluous hwif variable assignment from ide_timer_expiry()
  ide: use ide_pci_is_in_compatibility_mode() helper in setup-pci.c
  ide: make "paranoia" ->handler check in ide_intr() more strict
  ide-cd: convert to ide-atapi facilities
  ide-cd: start DMA before sending the actual packet command
  ide-cd: wait for DRQ to get set per default
  ide: Fix drive's DWORD-IO handling
  ide: add port and host iterators
  ide: dynamic allocation of device structures
  ...
2009-01-06 17:00:50 -08:00
Hidehiro Kawai 4cb0e11b15 coredump_filter: permit changing of the default filter
Introduce a new kernel parameter `coredump_filter'.  Setting a value to
this parameter causes the default bitmask of coredump_filter to be
changed.

It is useful for users to change coredump_filter settings for the whole
system at boot time.  Without this parameter, users have to change
coredump_filter settings for each /proc/<pid>/ in an initializing script.

Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06 15:59:29 -08:00
Randy Dunlap ecb08d8131 doc: reformat some long lines in kernel-parameters.txt
Reformat text to (mostly) stay within 80 columns of text.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06 15:59:28 -08:00
Randy Dunlap 7c4be253d3 docs: add more early params to kernel-parameters.txt
Add some (more) early_param boot options to kernel-parameters.txt.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06 15:59:28 -08:00
Randy Dunlap 28f4d75a61 documentation: how to use DOC: section blocks
Add info on how to use DOC: sections in kernel-doc.  DOC: sections enable
the addition of inline source file comments that are general in nature
instead of being specific to a function, struct, union, enum, or typedef.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06 15:59:28 -08:00
Randy Dunlap 58cc855c39 documentation: update s390 header file paths
Update Documentation/s390/ files to reflect changed header files
locations.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06 15:59:28 -08:00
Randy Dunlap 07983f0e36 documentation: update header file paths
Update several Documentation/ files and a few sub-dir files (only one
change in each) to reflect changed header files locations.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06 15:59:28 -08:00
Randy Dunlap d78dd070cc docs: document how to write @varargs in kernel-doc
Add documentation on how to use kernel-doc for function parameters
that are "..." (varargs).

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06 15:59:28 -08:00
Masami Hiramatsu e8386a0cb2 kprobes: support probing module __exit function
Allows kprobes to probe __exit routine.  This adds flags member to struct
kprobe.  When module is freed(kprobes hooks module_notifier to get this
event), kprobes which probe the functions in that module are set to "Gone"
flag to the flags member.  These "Gone" probes are never be enabled.
Users can check the GONE flag through debugfs.

This also removes mod_refcounted, because we couldn't free a module if
kprobe incremented the refcount of that module.

[akpm@linux-foundation.org: document some locking]
[mhiramat@redhat.com: bugfix: pass aggr_kprobe to arch_remove_kprobe]
[mhiramat@redhat.com: bugfix: release old_p's insn_slot before error return]
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06 15:59:21 -08:00
Darrick J. Wong 89fac11cb3 adt7470: make automatic fan control really work
It turns out that the adt7470's automatic fan control algorithm only works
when the temperature sensors get updated.  This in turn happens only when
someone tells the chip to read its temperature sensors.  Regrettably, this
means that we have to drive the chip periodically.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06 15:59:18 -08:00
Tejun Heo 5f820f648c poll: allow f_op->poll to sleep
f_op->poll is the only vfs operation which is not allowed to sleep.  It's
because poll and select implementation used task state to synchronize
against wake ups, which doesn't have to be the case anymore as wait/wake
interface can now use custom wake up functions.  The non-sleep restriction
can be a bit tricky because ->poll is not called from an atomic context
and the result of accidentally sleeping in ->poll only shows up as
temporary busy looping when the timing is right or rather wrong.

This patch converts poll/select to use custom wake up function and use
separate triggered variable to synchronize against wake up events.  The
only added overhead is an extra function call during wake up and
negligible.

This patch removes the one non-sleep exception from vfs locking rules and
is beneficial to userland filesystem implementations like FUSE, 9p or
peculiar fs like spufs as it's very difficult for those to implement
non-sleeping poll method.

While at it, make the following cosmetic changes to make poll.h and
select.c checkpatch friendly.

* s/type * symbol/type *symbol/		   : three places in poll.h
* remove blank line before EXPORT_SYMBOL() : two places in select.c

Oleg: spotted missing barrier in poll_schedule_timeout()
Davide: spotted missing write barrier in pollwake()

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Brad Boyer <flar@allandria.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Roland McGrath <roland@redhat.com>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06 15:59:12 -08:00
Hugh Dickins 63d6c5ad7f mm: remove try_to_munlock from vmscan
An unfortunate feature of the Unevictable LRU work was that reclaiming an
anonymous page involved an extra scan through the anon_vma: to check that
the page is evictable before allocating swap, because the swap could not
be freed reliably soon afterwards.

Now try_to_free_swap() has replaced remove_exclusive_swap_page(), that's
not an issue any more: remove try_to_munlock() call from
shrink_page_list(), leaving it to try_to_munmap() to discover if the page
is one to be culled to the unevictable list - in which case then
try_to_free_swap().

Update unevictable-lru.txt to remove comments on the try_to_munlock() in
shrink_page_list(), and shorten some lines over 80 columns.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Robin Holt <holt@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06 15:59:03 -08:00
David Rientjes 2da02997e0 mm: add dirty_background_bytes and dirty_bytes sysctls
This change introduces two new sysctls to /proc/sys/vm:
dirty_background_bytes and dirty_bytes.

dirty_background_bytes is the counterpart to dirty_background_ratio and
dirty_bytes is the counterpart to dirty_ratio.

With growing memory capacities of individual machines, it's no longer
sufficient to specify dirty thresholds as a percentage of the amount of
dirtyable memory over the entire system.

dirty_background_bytes and dirty_bytes specify quantities of memory, in
bytes, that represent the dirty limits for the entire system.  If either
of these values is set, its value represents the amount of dirty memory
that is needed to commence either background or direct writeback.

When a `bytes' or `ratio' file is written, its counterpart becomes a
function of the written value.  For example, if dirty_bytes is written to
be 8096, 8K of memory is required to commence direct writeback.
dirty_ratio is then functionally equivalent to 8K / the amount of
dirtyable memory:

	dirtyable_memory = free pages + mapped pages + file cache

	dirty_background_bytes = dirty_background_ratio * dirtyable_memory
		-or-
	dirty_background_ratio = dirty_background_bytes / dirtyable_memory

		AND

	dirty_bytes = dirty_ratio * dirtyable_memory
		-or-
	dirty_ratio = dirty_bytes / dirtyable_memory

Only one of dirty_background_bytes and dirty_background_ratio may be
specified at a time, and only one of dirty_bytes and dirty_ratio may be
specified.  When one sysctl is written, the other appears as 0 when read.

The `bytes' files operate on a page size granularity since dirty limits
are compared with ZVC values, which are in page units.

Prior to this change, the minimum dirty_ratio was 5 as implemented by
get_dirty_limits() although /proc/sys/vm/dirty_ratio would show any user
written value between 0 and 100.  This restriction is maintained, but
dirty_bytes has a lower limit of only one page.

Also prior to this change, the dirty_background_ratio could not equal or
exceed dirty_ratio.  This restriction is maintained in addition to
restricting dirty_background_bytes.  If either background threshold equals
or exceeds that of the dirty threshold, it is implicitly set to half the
dirty threshold.

Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Andrea Righi <righi.andrea@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06 15:59:03 -08:00
Gary Hade c04fc586c1 mm: show node to memory section relationship with symlinks in sysfs
Show node to memory section relationship with symlinks in sysfs

Add /sys/devices/system/node/nodeX/memoryY symlinks for all
the memory sections located on nodeX.  For example:
/sys/devices/system/node/node1/memory135 -> ../../memory/memory135
indicates that memory section 135 resides on node1.

Also revises documentation to cover this change as well as updating
Documentation/ABI/testing/sysfs-devices-memory to include descriptions
of memory hotremove files 'phys_device', 'phys_index', and 'state'
that were previously not described there.

In addition to it always being a good policy to provide users with
the maximum possible amount of physical location information for
resources that can be hot-added and/or hot-removed, the following
are some (but likely not all) of the user benefits provided by
this change.
Immediate:
  - Provides information needed to determine the specific node
    on which a defective DIMM is located.  This will reduce system
    downtime when the node or defective DIMM is swapped out.
  - Prevents unintended onlining of a memory section that was
    previously offlined due to a defective DIMM.  This could happen
    during node hot-add when the user or node hot-add assist script
    onlines _all_ offlined sections due to user or script inability
    to identify the specific memory sections located on the hot-added
    node.  The consequences of reintroducing the defective memory
    could be ugly.
  - Provides information needed to vary the amount and distribution
    of memory on specific nodes for testing or debugging purposes.
Future:
  - Will provide information needed to identify the memory
    sections that need to be offlined prior to physical removal
    of a specific node.

Symlink creation during boot was tested on 2-node x86_64, 2-node
ppc64, and 2-node ia64 systems.  Symlink creation during physical
memory hot-add tested on a 2-node x86_64 system.

Signed-off-by: Gary Hade <garyhade@us.ibm.com>
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06 15:59:00 -08:00
James Morris ac8cc0fa53 Merge branch 'next' into for-linus 2009-01-07 09:58:22 +11:00