Commit Graph

318909 Commits (a9197f903f72a81393932d452379c8847fade544)

Author SHA1 Message Date
J. Bruce Fields 0ec4f431eb locks: fix checking of fcntl_setlease argument
The only checks of the long argument passed to fcntl(fd,F_SETLEASE,.)
are done after converting the long to an int.  Thus some illegal values
may be let through and cause problems in later code.

[ They actually *don't* cause problems in mainline, as of Dave Jones's
  commit 8d657eb3b4 "Remove easily user-triggerable BUG from
  generic_setlease", but we should fix this anyway.  And this patch will
  be necessary to fix real bugs on earlier kernels. ]
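
The ordering problem described above can be sketched in plain userspace C. This is only an illustration, not the kernel code: the enum values and function names are hypothetical, and `long long` is used instead of `long` so that the argument is wider than `int` on every platform.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the lease types accepted by F_SETLEASE. */
enum { LEASE_READ = 0, LEASE_WRITE = 1, LEASE_UNLOCK = 2 };

static bool lease_type_ok(long long arg)
{
	return arg == LEASE_READ || arg == LEASE_WRITE || arg == LEASE_UNLOCK;
}

/* Buggy order: the argument is narrowed to int first, then checked. */
bool check_after_narrowing(long long arg)
{
	/* On common compilers the high bits are simply discarded here. */
	int narrowed = (int)arg;

	return lease_type_ok(narrowed);
}

/* Fixed order: the full-width argument is checked before any narrowing. */
bool check_before_narrowing(long long arg)
{
	return lease_type_ok(arg);
}
```

With an argument such as 0x100000001, the first function sees only the low 32 bits and accepts the value, while the second rejects it.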

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-23 12:46:01 -07:00
Josef Bacik 96c3f4331a Btrfs: flush delayed inodes if we're short on space
Those crazy gentoo guys have been complaining about ENOSPC errors on their
portage volumes.  This is because doing things like untar tends to create
lots of new files which will soak up all the reservation space in the
delayed inodes.  Usually this gets papered over by the fact that we will try
and commit the transaction, however if this happens in the wrong spot or we
choose not to commit the transaction you will be screwed.  So add the
ability to explicitly flush delayed inodes to free up space.  Please test
this out guys to make sure it works since as usual I cannot reproduce.
Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2012-07-23 15:41:40 -04:00
David Sterba b27f7c0c15 btrfs: join DEV_STATS ioctls to one
Commit c11d2c236c (Btrfs: add ioctl to get and reset the device
stats) introduced two ioctls doing almost the same thing distinguished
by just the ioctl number which encodes "do reset after read". I have
suggested

http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg16604.html

to implement it via the ioctl args. This hasn't happened, and I think we
should use a cleaner way to pass flags and should not waste ioctl
numbers.

CC: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: David Sterba <dsterba@suse.cz>
2012-07-23 15:41:40 -04:00
Andrew Mahone a43a211133 btrfs: ignore unfragmented file checks in defrag when compression enabled - rebased
Rebased on btrfs-next and retested.

Inform should_defrag_range if BTRFS_DEFRAG_RANGE_COMPRESS is set. If so, skip
checks for adjacent extents and extent size when deciding whether to defrag,
as these can prevent an uncompressed and unfragmented file from being
compressed as requested.

Signed-off-by: Andrew Mahone <andrew.mahone@gmail.com>
2012-07-23 15:41:39 -04:00
Dan Carpenter e4b50e14c8 Btrfs: small naming cleanup in join_transaction()
"root->fs_info" and "fs_info" are the same, but "fs_info" is preferred
because it is shorter and that's what is used in the rest of the
function.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
2012-07-23 15:41:39 -04:00
Alexander Block 2bc5565286 Btrfs: don't update atime on RO subvolumes
Before the update_time inode operation was introduced, it was
not possible to prevent updates of atime on RO subvolumes. VFS
was only able to check for RO on the mount, but did not know
anything about btrfs subvolumes.

btrfs_update_time now checks whether the root is RO and, if so, skips
updating the times.

Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-07-23 15:41:38 -04:00
Arnd Hannemann 063849eafd Btrfs: allow mount -o remount,compress=no
Btrfs allows turning on compression on a mounted and used filesystem
by issuing mount -o remount,compress=lzo.
This patch allows turning compression off again
while the filesystem is mounted. As suggested by David Sterba,
if the compress-force option was set, it is implicitly cleared
when compression is turned off.

Tested-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Arnd Hannemann <arnd@arndnet.de>
2012-07-23 15:41:38 -04:00
Josef Bacik c5c3c5f31e Btrfs: remove ->dirty_inode
We do all of our inode updating when we change it, and now that we do
->update_time we don't need ->dirty_inode for atime updates anymore, so just
remove it.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
2012-07-23 15:41:38 -04:00
Chris Mason cbea5ac1ee Btrfs: reduce calls to wake_up on uncontended locks
The btrfs locks were unconditionally calling wake_up as the
locks were released.  This led to extra thrashing on the waitqueue,
especially for locks that were dominated by readers.

Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-07-23 15:36:18 -04:00
Chris Mason e39e64ac0c Btrfs: don't wait around for new log writers on an SSD
Waiting on spindles improves performance, but ssds want all the
IO as quickly as we can push it down.

Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-07-23 15:36:17 -04:00
Linus Torvalds a66d2c8f7e Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull the big VFS changes from Al Viro:
 "This one is *big* and changes quite a few things around VFS.  What's in there:

   - the first of two really major architecture changes - death to open
     intents.

     The former is finally there; it was very long in making, but with
     Miklos getting through really hard and messy final push in
     fs/namei.c, we finally have it.  Unlike his variant, this one
     doesn't introduce struct opendata; what we have instead is
     ->atomic_open() taking preallocated struct file * and passing
     everything via its fields.

     Instead of returning struct file *, it returns -E...  on error, 0
     on success and 1 in "deal with it yourself" case (e.g.  symlink
     found on server, etc.).

     See comments before fs/namei.c:atomic_open().  That made a lot of
     goodies finally possible and quite a few are in that pile:
     ->lookup(), ->d_revalidate() and ->create() do not get struct
     nameidata * anymore; ->lookup() and ->d_revalidate() get lookup
     flags instead, ->create() gets "do we want it exclusive" flag.

     With the introduction of new helper (kern_path_locked()) we are rid
     of all struct nameidata instances outside of fs/namei.c; it's still
     visible in namei.h, but not for long.  Come the next cycle,
     declaration will move either to fs/internal.h or to fs/namei.c
     itself.  [me, miklos, hch]

   - The second major change: behaviour of final fput().  Now we have
     __fput() done without any locks held by caller *and* not from deep
     in call stack.

     That obviously lifts a lot of constraints on the locking in there.
     Moreover, it's legal now to call fput() from atomic contexts (which
     has immediately simplified life for aio.c).  We also don't need
     anti-recursion logics in __scm_destroy() anymore.

     There is a price, though - the damn thing has become partially
     asynchronous.  For fput() from normal process we are guaranteed
     that pending __fput() will be done before the caller returns to
     userland, exits or gets stopped for ptrace.

     For kernel threads and atomic contexts it's done via
     schedule_work(), so theoretically we might need a way to make sure
     it's finished; so far only one such place had been found, but there
     might be more.

     There's flush_delayed_fput() (do all pending __fput()) and there's
     __fput_sync() (fput() analog doing __fput() immediately).  I hope
     we won't need them often; see warnings in fs/file_table.c for
     details.  [me, based on task_work series from Oleg merged last
     cycle]

   - sync series from Jan

   - large part of "death to sync_supers()" work from Artem; the only
     bits missing here are exofs and ext4 ones.  As far as I understand,
     those are going via the exofs and ext4 trees resp.; once they are
     in, we can put ->write_super() to the rest, along with the thread
     calling it.

   - preparatory bits from unionmount series (from dhowells).

   - assorted cleanups and fixes all over the place, as usual.

  This is not the last pile for this cycle; there's at least jlayton's
  ESTALE work and fsfreeze series (the latter - in dire need of fixes,
  so I'm not sure it'll make the cut this cycle).  I'll probably throw
  symlink/hardlink restrictions stuff from Kees into the next pile, too.
  Plus there's a lot of misc patches I hadn't thrown into that one -
  it's large enough as it is..."

* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (127 commits)
  ext4: switch EXT4_IOC_RESIZE_FS to mnt_want_write_file()
  btrfs: switch btrfs_ioctl_balance() to mnt_want_write_file()
  switch dentry_open() to struct path, make it grab references itself
  spufs: shift dget/mntget towards dentry_open()
  zoran: don't bother with struct file * in zoran_map
  ecryptfs: don't reinvent the wheels, please - use struct completion
  don't expose I_NEW inodes via dentry->d_inode
  tidy up namei.c a bit
  unobfuscate follow_up() a bit
  ext3: pass custom EOF to generic_file_llseek_size()
  ext4: use core vfs llseek code for dir seeks
  vfs: allow custom EOF in generic_file_llseek code
  vfs: Avoid unnecessary WB_SYNC_NONE writeback during sys_sync and reorder sync passes
  vfs: Remove unnecessary flushing of block devices
  vfs: Make sys_sync writeout also block device inodes
  vfs: Create function for iterating over block devices
  vfs: Reorder operations during sys_sync
  quota: Move quota syncing to ->sync_fs method
  quota: Split dquot_quota_sync() to writeback and cache flushing part
  vfs: Move noop_backing_dev_info check from sync into writeback
  ...
2012-07-23 12:27:27 -07:00
Wanpeng Li 1d00015e26 mm/frontswap: cleanup doc and comment error
Signed-off-by: Wanpeng Li <liwp.linux@gmail.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-07-23 11:16:20 -04:00
Sasha Levin 3389b530a6 mm: frontswap: remove unneeded headers
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
[v1: Rebased with tracing removed]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-07-23 11:16:13 -04:00
Michael Walle 8ceffa7c4a spi/orion: remove uneeded spi_info
This was formerly used to store the tclk value. That value is now discovered
using the clk API, rather than passed as platform data.

Signed-off-by: Michael Walle <michael@walle.cc>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Sebastian Hesselbarth <sebastian.hesselbarth@googlemail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-07-23 14:14:54 +01:00
Florian Fainelli d76ea24ac4 spi/bcm63xx: fix clock configuration selection
We are currently using a less-than-or-equal operator for comparing
the transfer frequency with the clock frequency table. Because of
this, we always end up selecting 20 MHz as the frequency, since the
inequality transfer hz <= 20 MHz is always true. Fix this by
reversing the inequality, which is how the comparison should be done.
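
A minimal sketch of the two comparisons, assuming a clock table sorted from fastest to slowest. The table contents and function names below are hypothetical and are not taken from the driver.

```c
#include <stddef.h>

/* Hypothetical clock table, sorted from fastest to slowest (Hz). */
static const unsigned int freq_table[] = {
	20000000, 12500000, 6250000, 3125000, 1563000, 781000, 391000,
};
#define FREQ_COUNT (sizeof(freq_table) / sizeof(freq_table[0]))

/* Buggy comparison: any request up to 20 MHz matches the first entry. */
size_t pick_clock_buggy(unsigned int transfer_hz)
{
	size_t i;

	for (i = 0; i < FREQ_COUNT; i++)
		if (transfer_hz <= freq_table[i])
			return i;
	return FREQ_COUNT - 1;
}

/* Reversed comparison: first (fastest) entry not exceeding the request. */
size_t pick_clock_fixed(unsigned int transfer_hz)
{
	size_t i;

	for (i = 0; i < FREQ_COUNT; i++)
		if (transfer_hz >= freq_table[i])
			return i;
	return FREQ_COUNT - 1;	/* slower than every entry: use the slowest */
}
```

For a 1 MHz transfer, the buggy variant returns index 0 (20 MHz), whereas the reversed comparison returns the index of the 781 kHz entry.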

Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-07-23 14:14:11 +01:00
Takashi Iwai c1b623d9e4 ASoC: Additional updates for 3.6
A few more fixes for 3.6, some of which are relatively important -
 they've all been in -next for at least some time.
 
 - DAPM fixes for the recent locking changes.
 - Fix for _PRE and _POST widgets (which have been broken for a few
   releases now).
 - A couple of minor driver updates.

Merge tag 'asoc-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-next
2012-07-23 14:34:42 +02:00
Axel Lin 0dd6e4847e watchdog: orion_wdt: Convert driver to watchdog core
Convert orion_wdt driver to use watchdog framework API.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-07-23 12:51:09 +02:00
Sachin Kamat 6b761b2902 watchdog: s3c2410_wdt: Use module_platform_driver()
module_platform_driver() replaces module_init() and module_exit()
and makes the code simpler.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-07-23 12:50:51 +02:00
Wim Van Sebroeck 7732c6b96f watchdog: sch311x_wdt: Fix Polarity when starting watchdog
Some motherboards (the Advantech ARK3400, according to its documentation)
use a non-inverted GPIO pin. We fix this by assuming that
the BIOS will set the Polarity bit for the GPIO correctly
at startup, and we keep the bit setting intact when we start
and stop the watchdog.

Reported-by: Jean-François Deverge <jf.deverge@gmail.com>
Signed-off-by: Dave Mueller <d.mueller@elsoft.ch>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-07-23 12:50:30 +02:00
Lokesh Vutla 41814eed41 Watchdog: OMAP: Fix the runtime pm code to avoid module getting stuck intransition state.
The OMAP watchdog driver was adapted to runtime PM like a general device
driver, but that is not appropriate. It causes a couple of functional
issues.

1. On OMAP4 SYSCLK can't be gated, because of issue with WDTIMER2 module,
which constantly stays in "in transition" state. Value of register
CM_WKUP_WDTIMER2_CLKCTRL is always 0x00010000 in this case.
Issue occurs immediately after first idle, when hwmod framework tries
to disable WDTIMER2 functional clock - "wd_timer2_fck". After this
module falls to "in transition" state, and SYSCLK gating is blocked.

2. Due to runtime PM, watchdog timer may be completely disabled.
In current code base watchdog timer is not disabled only because of
issue 1. Otherwise state of WDTIMER2 module will be "Disabled", and there
will be no interrupts from omap_wdt. In other words watchdog will not
work at all.

The watchdog is a special IP and should not be disabled, otherwise
its very purpose is defeated. The watchdog functional clock should
never be disabled. This patch updates the runtime PM handling in the
driver so that runtime PM is limited to probe/shutdown
and suspend/resume.

The patch fixes issues 1 and 2.

Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-07-23 12:50:11 +02:00
Gerard Snitselaar 0402450f45 watchdog: ie6xx_wdt: section mismatch in ie6xx_wdt_probe()
ie6xx_wdt_probe() calls ie6xx_wdt_debugfs_exit() as part of
its error cleanup path, and ie6xx_wdt_debugfs_exit() is
currently annotated __devexit.

Signed-off-by: Gerard Snitselaar <dev@snitselaar.org>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-07-23 12:49:44 +02:00
Florian Fainelli 5a135f3c72 watchdog: bcm63xx_wdt: fix driver section mismatch
bcm63xx_wdt was used as a platform_driver but was not suffixed with
_driver, thus causing section mismatches, fix that.

Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-07-23 12:49:24 +02:00
Wim Van Sebroeck bff23431fe watchdog: iTCO_wdt.c: convert to watchdog core
This patch converts the iTCO_wdt watchdog driver to use the
generic watchdog framework.

Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-07-23 12:48:41 +02:00
Oskar Schirmer 18cb2ae55f char/ipmi: remove local ioctl defines replaced by generic ones
This watchdog driver had ioctl defines introduced locally
for pre timeout handling, marked to be removed as soon as
a generic replacement would become available.

The latter actually occurred in 2006, at e05b59fe.

Remove the local duplicates for pre timeout handling.

Signed-off-by: Oskar Schirmer <oskar@scara.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2012-07-23 12:48:04 +02:00
Michal Simek 90fe6c608f watchdog: xilinx: Read clock frequency directly from DT node
Do not use clock-frequency property from parent node.
Use it from watchdog node.

Signed-off-by: Michal Simek <monstr@monstr.eu>
Acked-By: Alejandro Cabrera <acabrera@udio.cujae.edu.cu>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-07-23 12:47:00 +02:00
Linus Walleij c362cb597b watchdog: coh901327_wdt: use clk_prepare/unprepare
Make sure we prepare/unprepare the COH901327 watchdog timer
as is required by the clk API especially if you use common
clock.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by : Pankaj Jangra <jangra.pankaj9@gmail.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-07-23 12:46:49 +02:00
Justin Wheeler 3017020dc7 watchdog: f71808e_wdt: Add support for Jetway JNF99 motherboard
The Jetway JNF99 motherboard features a F71869 SuperIO chip, but its
watchdog chipset ID appears to be 1007 (as opposed to 0814).  Some testing
confirmed it behaves the exact same as 0814. So add this chipset ID to the
module's ID list so that the Fintek watchdog driver can correctly identify
and access it.

Signed-off-by: Justin Wheeler <jwheeler@datademons.com>
Acked-by: Giel van Schijndel <me@mortis.eu>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-07-23 12:46:38 +02:00
Andrew Lunn f814f9ac5a spi/orion: add device tree binding
Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Acked-by: Sebastian Hesselbarth <sebastian.hesselbarth@googlemail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-07-23 11:43:39 +01:00
Joerg Roedel 395e51f18d Merge branches 'iommu/fixes', 'x86/amd', 'groups', 'arm/tegra' and 'api/domain-attr' into next
Conflicts:
	drivers/iommu/iommu.c
	include/linux/iommu.h
2012-07-23 12:17:00 +02:00
Cyrus Lien 2d8767bb42 HID: add ASUS AIO keyboard model AK1D
Add Asus All-In-One PC keyboard model AK1D.

BugLink: https://bugs.launchpad.net/bugs/1027789

Signed-off-by: Cyrus Lien <cyrus.lien@canonical.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-07-23 12:10:21 +02:00
Mark Brown 15d47763b3 Merge branch 'for-3.5' into for-3.6 2012-07-23 10:45:07 +01:00
Mark Brown 0ff97ebf08 ASoC: dapm: Fix _PRE and _POST events for DAPM performance improvements
Ever since the DAPM performance improvements we've been marking all widgets
as not dirty after each DAPM run. Since _PRE and _POST events aren't part
of the DAPM graph this has rendered them non-functional, they will never be
marked dirty again and thus will never be run again.

Fix this by skipping them when marking widgets as not dirty.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Liam Girdwood <lrg@ti.com>
Cc: stable@vger.kernel.org
2012-07-23 10:39:54 +01:00
Weiping Pan 06b6a1cf6e rds: set correct msg_namelen
Jay Fenlason (fenlason@redhat.com) found a bug:
recvfrom() on an RDS socket can return the contents of random kernel
memory to userspace if it is called with an address length larger than
sizeof(struct sockaddr_in).
rds_recvmsg() also fails to set the addr_len parameter properly before
returning, but that's just a bug.
There are also a number of cases where recvfrom() can return an entirely bogus
address. Anything in rds_recvmsg() that returns a non-negative value but does
not go through the "sin = (struct sockaddr_in *)msg->msg_name;" code path
at the end of the while(1) loop will return up to 128 bytes of kernel memory
to userspace.

I wrote two test programs to reproduce this bug. You will see that in
rds_server, fromAddr will be overwritten and the following sock_fd will be
destroyed.
Yes, it is the programmer's fault to set msg_namelen incorrectly, but it is
better to make the kernel copy the real length of the address to user space in
such a case.

How to run the test programs?
I tested them on a 32-bit x86 system running 3.5.0-rc7.

1 compile
gcc -o rds_client rds_client.c
gcc -o rds_server rds_server.c

2 run ./rds_server on one console

3 run ./rds_client on another console

4 you will see something like:
server is waiting to receive data...
old socket fd=3
server received data from client:data from client
msg.msg_namelen=32
new socket fd=-1067277685
sendmsg()
: Bad file descriptor

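/***************** rds_client.c ********************/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(void)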
{
	int sock_fd;
	struct sockaddr_in serverAddr;
	struct sockaddr_in toAddr;
	char recvBuffer[128] = "data from client";
	struct msghdr msg;
	struct iovec iov;

	sock_fd = socket(AF_RDS, SOCK_SEQPACKET, 0);
	if (sock_fd < 0) {
		perror("create socket error\n");
		exit(1);
	}

	memset(&serverAddr, 0, sizeof(serverAddr));
	serverAddr.sin_family = AF_INET;
	serverAddr.sin_addr.s_addr = inet_addr("127.0.0.1");
	serverAddr.sin_port = htons(4001);

	if (bind(sock_fd, (struct sockaddr*)&serverAddr, sizeof(serverAddr)) < 0) {
		perror("bind() error\n");
		close(sock_fd);
		exit(1);
	}

	memset(&toAddr, 0, sizeof(toAddr));
	toAddr.sin_family = AF_INET;
	toAddr.sin_addr.s_addr = inet_addr("127.0.0.1");
	toAddr.sin_port = htons(4000);
	msg.msg_name = &toAddr;
	msg.msg_namelen = sizeof(toAddr);
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_iov->iov_base = recvBuffer;
	msg.msg_iov->iov_len = strlen(recvBuffer) + 1;
	msg.msg_control = 0;
	msg.msg_controllen = 0;
	msg.msg_flags = 0;

	if (sendmsg(sock_fd, &msg, 0) == -1) {
		perror("sendto() error\n");
		close(sock_fd);
		exit(1);
	}

	printf("client send data:%s\n", recvBuffer);

	memset(recvBuffer, '\0', 128);

	msg.msg_name = &toAddr;
	msg.msg_namelen = sizeof(toAddr);
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_iov->iov_base = recvBuffer;
	msg.msg_iov->iov_len = 128;
	msg.msg_control = 0;
	msg.msg_controllen = 0;
	msg.msg_flags = 0;
	if (recvmsg(sock_fd, &msg, 0) == -1) {
		perror("recvmsg() error\n");
		close(sock_fd);
		exit(1);
	}

	printf("receive data from server:%s\n", recvBuffer);

	close(sock_fd);

	return 0;
}

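/***************** rds_server.c ********************/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(void)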
{
	struct sockaddr_in fromAddr;
	int sock_fd;
	struct sockaddr_in serverAddr;
	unsigned int addrLen;
	char recvBuffer[128];
	struct msghdr msg;
	struct iovec iov;

	sock_fd = socket(AF_RDS, SOCK_SEQPACKET, 0);
	if(sock_fd < 0) {
		perror("create socket error\n");
		exit(1);
	}

	memset(&serverAddr, 0, sizeof(serverAddr));
	serverAddr.sin_family = AF_INET;
	serverAddr.sin_addr.s_addr = inet_addr("127.0.0.1");
	serverAddr.sin_port = htons(4000);
	if (bind(sock_fd, (struct sockaddr*)&serverAddr, sizeof(serverAddr)) < 0) {
		perror("bind error\n");
		close(sock_fd);
		exit(1);
	}

	printf("server is waiting to receive data...\n");
	msg.msg_name = &fromAddr;

	/*
	 * I add 16 to sizeof(fromAddr), ie 32,
	 * and pay attention to the definition of fromAddr,
	 * recvmsg() will overwrite sock_fd,
	 * since kernel will copy 32 bytes to userspace.
	 *
	 * If you just use sizeof(fromAddr), it works fine.
	 * */
	msg.msg_namelen = sizeof(fromAddr) + 16;
	/* msg.msg_namelen = sizeof(fromAddr); */
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_iov->iov_base = recvBuffer;
	msg.msg_iov->iov_len = 128;
	msg.msg_control = 0;
	msg.msg_controllen = 0;
	msg.msg_flags = 0;

	while (1) {
		printf("old socket fd=%d\n", sock_fd);
		if (recvmsg(sock_fd, &msg, 0) == -1) {
			perror("recvmsg() error\n");
			close(sock_fd);
			exit(1);
		}
		printf("server received data from client:%s\n", recvBuffer);
		printf("msg.msg_namelen=%d\n", msg.msg_namelen);
		printf("new socket fd=%d\n", sock_fd);
		strcat(recvBuffer, "--data from server");
		if (sendmsg(sock_fd, &msg, 0) == -1) {
			perror("sendmsg()\n");
			close(sock_fd);
			exit(1);
		}
	}

	close(sock_fd);
	return 0;
}

Signed-off-by: Weiping Pan <wpan@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-23 01:01:44 -07:00
Dan Carpenter 5b3e7e6cb5 openvswitch: potential NULL deref in sample()
If there is no OVS_SAMPLE_ATTR_ACTIONS set then "acts_list" is NULL and
it leads to a NULL dereference when we call nla_len(acts_list).  This
is a static checker fix, not something I have seen in testing.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-23 00:59:54 -07:00
Eric Dumazet 563d34d057 tcp: dont drop MTU reduction indications
ICMP messages generated in output path if frame length is bigger than
mtu are actually lost because socket is owned by user (doing the xmit)

One example is the ipgre_tunnel_xmit() calling
icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu));

We had a similar case fixed in commit a34a101e1e (ipv6: disable GSO on
sockets hitting dst_allfrag).

The problem with that fix is that it relied on retransmit timers, so short TCP
sessions paid too large a latency penalty.

This patch uses the tcp_release_cb() infrastructure so that MTU
reduction messages (ICMP messages) are not lost, and no extra delay
is added in TCP transmits.

Reported-by: Maciej Żenczykowski <maze@google.com>
Diagnosed-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Tore Anderson <tore@fud.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-23 00:58:46 -07:00
Yuval Mintz c3def943c7 bnx2x: Add new 57840 device IDs
The 57840 boards come in two flavours: 2 x 20G and 4 x 10G.
To better differentiate between the two flavours, a separate device ID
was assigned to each.
The silicon default value is still the currently supported 57840 device ID
(0x168d), and since a user can damage the nvram (e.g., 'ethtool -E')
the driver will still support this device ID to allow the user to amend the
nvram back into a supported configuration.

Notice this patch contains lines longer than 80 characters (strings).

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-23 00:58:21 -07:00
Julian Anastasov 9a0a9502cb tcp: avoid oops in tcp_metrics and reset tcpm_stamp
In tcp_tw_remember_stamp we incorrectly checked tw
instead of tm, it can lead to oops if the cached entry is
not found.

	tcpm_stamp was not updated in tcpm_check_stamp when
tcpm_suck_dst was called, move the update into tcpm_suck_dst,
so that we do not call it infinitely on every next cache hit
after TCP_METRICS_TIMEOUT.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-23 00:57:12 -07:00
Shuah Khan 9b70749e64 niu: Change niu_rbr_fill() to use unlikely() to check niu_rbr_add_page() return value
Change niu_rbr_fill() to use unlikely() to check niu_rbr_add_page() return
value to be consistent with the rest of the checks after niu_rbr_add_page()
calls in this file.
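
As a rough userspace illustration of the pattern, the sketch below wraps an error check in unlikely(). The macro matches the usual kernel definition and relies on a GCC/Clang builtin; add_page() and fill_ring() are hypothetical stand-ins, not the driver's functions.

```c
/* Branch hint: tells the compiler the condition is expected to be false. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-in for a page-adding helper: fails when out of pages. */
static int add_page(int *pages_left)
{
	if (*pages_left <= 0)
		return -1;
	(*pages_left)--;
	return 0;
}

/* Fill `count` slots, bailing out on the first (rare) failure. */
int fill_ring(int count, int pages_left)
{
	int i, err;

	for (i = 0; i < count; i++) {
		err = add_page(&pages_left);
		if (unlikely(err))
			return err;
	}
	return 0;
}
```

The hint does not change the result of the check; it only affects how the compiler lays out the expected and unexpected paths.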

Signed-off-by: Shuah Khan <shuah.khan@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-22 23:31:07 -07:00
Shuah Khan ec2deec1f3 niu: Fix to check for dma mapping errors.
Fix Neptune ethernet driver to check dma mapping error after map_page()
interface returns.

Signed-off-by: Shuah Khan <shuah.khan@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-22 23:31:06 -07:00
Roland Dreier 089117e1ad Merge branches 'cma', 'cxgb4', 'misc', 'mlx4-sriov', 'mlx-cleanups', 'ocrdma' and 'qib' into for-linus 2012-07-22 23:26:17 -07:00
Benjamin Herrenschmidt 574ce79cea powerpc/mpic: Create a revmap with enough entries for IPIs and timers
The current mpic code creates a linear revmap just big enough for all
the sources, which happens to miss the IPIs and timers on some machines.

This will in turn break when the irqdomain code loses the fallback of
doing a linear search when the revmap fails (and really slows down IPIs
otherwise).

This happens for example on the U4 based Apple machines such as the
dual core PowerMac G5s.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-23 14:20:42 +10:00
Jesper Juhl 818810472b net: Fix references to out-of-scope variables in put_cmsg_compat()
In net/compat.c::put_cmsg_compat() we may assign 'data' the address of
either the 'ctv' or 'cts' local variables inside the 'if
(!COMPAT_USE_64BIT_TIME)' branch.

Those variables go out of scope at the end of the 'if' statement, so
when we use 'data' further down in 'copy_to_user(CMSG_COMPAT_DATA(cm),
data, cmlen - sizeof(struct compat_cmsghdr))' there's no telling what
it may be referring to - not good.

Fix the problem by simply giving 'ctv' and 'cts' function scope.
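
The shape of the fix can be sketched in userspace as follows. The struct and function names are hypothetical; the point is only that the temporary is declared at function scope, so a pointer taken inside the 'if' branch is still valid when it is used afterwards.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical narrow time representation. */
struct compat_time {
	int sec;
	int usec;
};

/* Copies either the native or the narrow representation into `out`. */
size_t pack_time(long sec, long usec, int use_compat, unsigned char *out)
{
	struct compat_time ctv;	/* function scope: outlives the 'if' block */
	long native[2];
	const void *data;
	size_t len;

	native[0] = sec;
	native[1] = usec;
	data = native;
	len = sizeof(native);

	if (use_compat) {
		ctv.sec = (int)sec;
		ctv.usec = (int)usec;
		data = &ctv;	/* would dangle if ctv were declared in here */
		len = sizeof(ctv);
	}

	memcpy(out, data, len);
	return len;
}
```

Had `ctv` been declared inside the braces of the 'if', the later memcpy() would read an object whose lifetime had already ended.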

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-22 17:50:49 -07:00
David S. Miller 5e9965c15b Merge branch 'kill_rtcache'
The ipv4 routing cache is non-deterministic, performance wise, and is
subject to reasonably easy to launch denial of service attacks.

The routing cache works great for well behaved traffic, and the world
was a much friendlier place when the tradeoffs that led to the routing
cache's design were considered.

What it boils down to is that the performance of the routing cache is
a product of the traffic patterns seen by a system rather than being a
product of the contents of the routing tables.  The former of which is
controllable by external entities.

Even for "well behaved" legitimate traffic, high volume sites can see
hit rates in the routing cache of only ~10%.

The general flow of this patch series is that first the routing cache
is removed.  We build a completely new rtable entry every lookup
request.

Next we make some simplifications due to the fact that removing the
routing cache causes several members of struct rtable to become no
longer necessary.

Then we need to make some amends such that we can legally cache
pre-constructed routes in the FIB nexthops.  Firstly, we need to
invalidate routes which are hit with nexthop exceptions.  Secondly we
have to change the semantics of rt->rt_gateway such that zero means
that the destination is on-link and non-zero otherwise.
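
One way to picture the new gateway semantics is the small sketch below. The struct is a hypothetical, trimmed-down stand-in for a route entry and next_hop() is an illustrative helper, not a function from the series.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down route entry; only the gateway field matters. */
struct route {
	uint32_t gateway;	/* 0 means the destination is on-link */
};

/* Next hop: the gateway if there is one, else the destination itself. */
uint32_t next_hop(const struct route *rt, uint32_t daddr)
{
	return rt->gateway ? rt->gateway : daddr;
}
```

With this convention a cached route does not need to store the destination in the gateway field, which is one reason a single pre-built entry can serve many destinations.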

Now that the preparations are ready, we start caching precomputed
routes in the FIB nexthops.  Output and input routes need different
kinds of care when determining if we can legally do such caching or
not.  The details are in the commit log messages for those changes.

The patch series then winds down with some more struct rtable
simplifications and other tidy ups that remove unnecessary overhead.

On a SPARC-T3 output route lookups are ~876 cycles.  Input route
lookups are ~1169 cycles with rpfilter disabled, and about ~1468
cycles with rpfilter enabled.

These measurements were taken with the kbench_mod test module in the
net_test_tools GIT tree:

git://git.kernel.org/pub/scm/linux/kernel/git/davem/net_test_tools.git

That GIT tree also includes a udpflood tester tool and stresses
route lookups on packet output.

For example, on the same SPARC-T3 system we can run:

	time ./udpflood -l 10000000 10.2.2.11

with routing cache:
real    1m21.955s       user    0m6.530s        sys     1m15.390s

without routing cache:
real    1m31.678s       user    0m6.520s        sys     1m25.140s

Performance undoubtedly can easily be improved further.

For example fib_table_lookup() performs a lot of excessive
computations with all the masking and shifting, some of it
conditionalized to deal with edge cases.

Also, Eric's no-ref optimization for input route lookups can be
re-instated for the FIB nexthop caching code path.  I would be really
pleased if someone would work on that.

In fact anyone suitably motivated can just fire up perf on the loading
of the net_test_tools benchmark kernel module.  I spend much of
my time going:

bash# perf record insmod ./kbench_mod.ko dst=172.30.42.22 src=74.128.0.1 iif=2
bash# perf report

Thanks to helpful feedback from Joe Perches, Eric Dumazet, Ben
Hutchings, and others.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-22 17:04:15 -07:00
Benjamin Herrenschmidt 668fcb6972 Remove stale .rej file
Commit 9778b696a0 accidentally added
a .rej file (probably my fault), remove it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-23 09:38:53 +10:00
Linus Torvalds a6be1fcbc5 MMC highlights for 3.6:
Core:
  - Rename cd-gpio to slot-gpio and extend it to support more
    slot GPIO functions, such as write-protect.
  - Add a function to get regulators (Vdd and Vccq) for a host.
 
 Drivers:
  - sdhci-pxav2, sdhci-pxav3: Add device tree support.
  - sdhi: Add device tree support.
  - sh_mmcif: Add support for regulators, device tree, slot-gpio.
  - tmio: Add regulator support, use slot-gpio.

Merge tag 'mmc-merge-for-3.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc

Pull MMC updates from Chris Ball:
 "MMC highlights for 3.6:

  Core:
   - Rename cd-gpio to slot-gpio and extend it to support more slot GPIO
     functions, such as write-protect.
   - Add a function to get regulators (Vdd and Vccq) for a host.

  Drivers:
   - sdhci-pxav2, sdhci-pxav3: Add device tree support.
   - sdhi: Add device tree support.
   - sh_mmcif: Add support for regulators, device tree, slot-gpio.
   - tmio: Add regulator support, use slot-gpio."

* tag 'mmc-merge-for-3.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc: (62 commits)
  mmc: sdhci-dove: Prepare for common clock framework
  mmc: sdhci-dove: Add SDHCI_QUIRK_NO_HISPD_BIT
  mmc: omap_hsmmc: ensure probe returns error upon resource failure
  mmc: mxs-mmc: Add wp-inverted property
  mmc: esdhc: Fix DMA_MASK to not break mx25 DMA access
  mmc: core: reset signal voltage on power up
  mmc: sd: Fix sd current limit setting
  mmc: omap_hsmmc: add clk_prepare and clk_unprepare
  mmc: sdhci: When a UHS switch fails, cycle power if regulator is used
  mmc: atmel-mci: modify CLKDIV displaying in debugfs
  mmc: atmel-mci: fix incorrect setting of host->data to NULL
  mmc: sdhci: poll for card even when card is logically unremovable
  mmc: sdhci: Introduce new flag SDHCI_USING_RETUNING_TIMER
  mmc: sdio: Change pr_warning to pr_warn_ratelimited
  mmc: core: Simplify and fix for SD switch processing
  mmc: sdhci: restore host settings when card is removed
  mmc: sdhci: fix incorrect command used in tuning
  mmc: sdhci-pci: CaFe has broken card detection
  mmc: sdhci: Report failure reasons for all cases in sdhci_add_host()
  mmc: s3cmci: Convert s3cmci driver to gpiolib API
  ...
2012-07-22 16:36:08 -07:00
Linus Torvalds 5b160bd426 Merge branch 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/mce changes from Ingo Molnar:
 "This tree improves the AMD thresholding bank code and includes a
  memory fault signal handling fixlet."

* 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce: Fix siginfo_t->si_addr value for non-recoverable memory faults
  x86, MCE, AMD: Update copyrights and boilerplate
  x86, MCE, AMD: Give proper names to the thresholding banks
  x86, MCE, AMD: Make error_count read only
  x86, MCE, AMD: Cleanup reading of error_count
  x86, MCE, AMD: Print decimal thresholding values
  x86, MCE, AMD: Move shared bank to node descriptor
  x86, MCE, AMD: Remove local_allocate_... wrapper
  x86, MCE, AMD: Remove shared banks sysfs linking
  x86, amd_nb: Export model 0x10 and later PCI id
2012-07-22 16:07:45 -07:00
Sebastian Hesselbarth 30b87c60e9 mmc: sdhci-dove: Prepare for common clock framework
As mach-dove is moving towards the common clock framework, prepare
the sdhci driver to grab its clock.

Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@googlemail.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2012-07-22 16:42:48 -04:00
Sebastian Hesselbarth a9ca1d5477 mmc: sdhci-dove: Add SDHCI_QUIRK_NO_HISPD_BIT
The sdio controller on dove doesn't have a bit to indicate
high-speed. Setting the quirk fixes access to high-speed
SD cards.

Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@googlemail.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2012-07-22 16:42:47 -04:00
Kevin Hilman 9c17d08ca1 mmc: omap_hsmmc: ensure probe returns error upon resource failure
If platform_get_resource_by_name() fails, driver probe is aborted and
should return an error so the driver is not bound to the device.

However, in the current error path of platform_get_resource_by_name(),
probe returns zero since the return value (ret) is not properly set.
With a zero return value, the driver core assumes probe was successful
and will bind the driver to the device.

Fix this by ensuring that probe returns an error code in this failure
path.
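
The failure pattern described above can be sketched in isolation.  This
is not the omap_hsmmc code: get_resource_sketch() is a hypothetical
stand-in for platform_get_resource_by_name(), and the choice of -ENXIO
as the error code is an assumption made for illustration.

```c
#include <errno.h>
#include <stddef.h>

struct resource_sketch {
	int unused;
};

/* Hypothetical stand-in for the resource lookup: NULL on failure. */
static struct resource_sketch *get_resource_sketch(int available)
{
	static struct resource_sketch res;

	return available ? &res : NULL;
}

/* Buggy shape: the error path is taken but ret still holds 0. */
static int probe_buggy(int available)
{
	int ret = 0;
	struct resource_sketch *res = get_resource_sketch(available);

	if (!res)
		goto err;	/* ret never set: failure looks like success */
	return 0;
err:
	return ret;
}

/* Fixed shape: set an error code before jumping to the error path. */
static int probe_fixed(int available)
{
	int ret = 0;
	struct resource_sketch *res = get_resource_sketch(available);

	if (!res) {
		ret = -ENXIO;	/* assumed error code, for illustration */
		goto err;
	}
	return 0;
err:
	return ret;
}
```

Here probe_buggy(0) returns 0 even though the lookup failed, which is
what lets the driver core bind the driver, while probe_fixed(0) returns
a negative errno.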

Signed-off-by: Kevin Hilman <khilman@ti.com>
Acked-by: Venkatraman S <svenkatr@ti.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2012-07-22 16:42:47 -04:00
Marek Vasut b6e76f10af mmc: mxs-mmc: Add wp-inverted property
The write-protect GPIO is inverted on some boards. Handle this case.
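
A minimal sketch of the idea, assuming a high GPIO level means
"write-protected" on boards without the property.  The struct and
function names are hypothetical and are not the mxs-mmc driver's own.

```c
#include <stdbool.h>

/* Hypothetical per-board configuration; the field name is illustrative. */
struct wp_config_sketch {
	bool wp_inverted;	/* set when the board wires the WP line inverted */
};

/* Return true when the card should be treated as read-only. */
static bool card_is_read_only(const struct wp_config_sketch *cfg,
			      bool gpio_level)
{
	/* The raw level is flipped only on boards that declare inversion. */
	return gpio_level != cfg->wp_inverted;
}
```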

Signed-off-by: Marek Vasut <marex@denx.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
2012-07-22 16:42:46 -04:00