Commit graph

21401 commits

Author SHA1 Message Date
Rusty Russell
3db1cd5c05 net: fix assignment of 0/1 to bool variables.
DaveM said:
   Please, this kind of stuff rots forever and not using bool properly
   drives me crazy.

Joe Perches <joe@perches.com> gave me the spatch script:

	@@
	bool b;
	@@
	-b = 0
	+b = false
	@@
	bool b;
	@@
	-b = 1
	+b = true

I merely installed coccinelle, read the documentation and took credit.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-19 22:27:29 -05:00
Xi Wang
2692ba61a8 sctp: fix incorrect overflow check on autoclose
Commit 8ffd3208 voids the previous patches f6778aab and 810c0719 for
limiting the autoclose value.  If userspace passes in -1 on 32-bit
platform, the overflow check didn't work and autoclose would be set
to 0xffffffff.

This patch defines a max_autoclose (in seconds) for limiting the value
and exposes it through sysctl, with the following intentions.

1) Avoid overflowing autoclose * HZ.

2) Keep the default autoclose bound consistent across 32- and 64-bit
   platforms (INT_MAX / HZ in this patch).

3) Keep the autoclose value consistent between setsockopt() and
   getsockopt() calls.

Suggested-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-19 16:25:46 -05:00
Eric Dumazet
966567b764 net: two vzalloc() cleanups
We can use vzalloc() helper now instead of __vmalloc() trick

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-19 16:01:38 -05:00
Alex Juncu
9cef310fcd llc: llc_cmsg_rcv was getting called after sk_eat_skb.
Received non stream protocol packets were calling llc_cmsg_rcv that used a
skb after that skb was released by sk_eat_skb. This caused received STP
packets to generate kernel panics.

Signed-off-by: Alexandru Juncu <ajuncu@ixiacom.com>
Signed-off-by: Kunjan Naik <knaik@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-19 15:58:52 -05:00
David S. Miller
447f219190 Revert "net: Remove unused neighbour layer ops."
This reverts commit 5c3ddec73d.

S390 qeth driver actually still uses the setup ops.

Reported-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-19 15:04:41 -05:00
John W. Linville
9763152c94 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/padovan/bluetooth 2011-12-19 14:12:11 -05:00
John W. Linville
9f6e20cee6 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2011-12-19 13:54:26 -05:00
David S. Miller
d1388dacbb Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless 2011-12-19 02:26:39 -05:00
Gustavo F. Padovan
d7660918fc Revert "Bluetooth: Revert: Fix L2CAP connection establishment"
This reverts commit 4dff523a91.

It was reported that this patch cause issues when trying to connect to
legacy devices so reverting it.

Reported-by: David Fries <david@fries.net>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-12-18 22:33:30 -02:00
Mat Martineau
79e654787c Bluetooth: Clear RFCOMM session timer when disconnecting last channel
When the last RFCOMM data channel is closed, a timer is normally set
up to disconnect the control channel at a later time.  If the control
channel disconnect command is sent with the timer pending, the timer
needs to be cancelled.

If the timer is not cancelled in this situation, the reference
counting logic for the RFCOMM session does not work correctly when the
remote device closes the L2CAP connection.  The session is freed at
the wrong time, leading to a kernel panic.

Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-12-18 22:29:35 -02:00
Mat Martineau
36e999a83a Bluetooth: Prevent uninitialized data access in L2CAP configuration
When configuring an ERTM or streaming mode connection, remote devices
are expected to send an RFC option in a successful config response.  A
misbehaving remote device might not send an RFC option, and the L2CAP
code should not access uninitialized data in this case.

Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-12-18 22:16:04 -02:00
Pablo Neira Ayuso
c4042a339f netfilter: ctnetlink: support individual atomic-get-and-reset of counters
This allows to use the get operation to atomically get-and-reset
counters.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-12-18 01:31:49 +01:00
Pablo Neira Ayuso
35dba1d7f3 netfilter: ctnetlink: use expect instead of master tuple in get operation
Use the expect tuple (if possible) instead of the master tuple for
the get operation. If two or more expectations come from the same
master, the returned expectation may not be the one that user-space
is requesting.

This is how it works for the expect deletion operation.

Although I think that nobody has been seriously using this. We
accept both possibilities, using the expect tuple if possible.
I decided to do it like this to avoid breaking backward
compatibility.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-12-18 01:31:47 +01:00
Eric Dumazet
b3e0bfa71b netfilter: nf_conntrack: use atomic64 for accounting counters
We can use atomic64_t infrastructure to avoid taking a spinlock in fast
path, and remove inaccuracies while reading values in
ctnetlink_dump_counters() and connbytes_mt() on 32bit arches.

Suggested by Pablo.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-12-18 01:19:19 +01:00
Igor Maravić
e6373c4c0e net:bridge: use IS_ENABLED
Use IS_ENABLED(CONFIG_FOO)
instead of defined(CONFIG_FOO) || defined (CONFIG_FOO_MODULE)

Signed-off-by: Igor Maravić <igorm@etf.rs>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 15:49:52 -05:00
Igor Maravić
c0cd115667 net:netfilter: use IS_ENABLED
Use IS_ENABLED(CONFIG_FOO)
instead of defined(CONFIG_FOO) || defined (CONFIG_FOO_MODULE)

Signed-off-by: Igor Maravić <igorm@etf.rs>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 15:49:52 -05:00
Igor Maravić
29c3626238 net:x25: use IS_ENABLED
Use IS_ENABLED(CONFIG_FOO)
instead of defined(CONFIG_FOO) || defined (CONFIG_FOO_MODULE)

Signed-off-by: Igor Maravić <igorm@etf.rs>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 15:49:52 -05:00
Igor Maravić
a3bf7ae9ae net:core: use IS_ENABLED
Use IS_ENABLED(CONFIG_FOO)
instead of defined(CONFIG_FOO) || defined (CONFIG_FOO_MODULE)

Signed-off-by: Igor Maravić <igorm@etf.rs>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 15:49:51 -05:00
Eric Dumazet
869aa41044 sch_gred: prefer GFP_KERNEL allocations
In control path, its better to use GFP_KERNEL allocations where
possible.

Before taking qdisc spinlock, we preallocate memory just in case we'll
need it in gred_change_vq()

This is a followup to commit 3f1e6d3fd3 (sch_gred: should not use
GFP_KERNEL while holding a spinlock)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 15:40:33 -05:00
Glauber Costa
36b77a5208 net: fix sleeping while atomic problem in sock mem_cgroup.
We can't scan the proto_list to initialize sock cgroups, as it
holds a rwlock, and we also want to keep the code generic enough to
avoid calling the initialization functions of protocols directly,

Convert proto_list_lock into a mutex, so we can sleep and do the
necessary allocations. This lock is seldom taken, so there shouldn't
be any performance penalties associated with that

Signed-off-by: Glauber Costa <glommer@parallels.com>
CC: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com>
CC: David S. Miller <davem@davemloft.net>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Stephen Rothwell <sfr@canb.auug.org.au>
CC: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 15:35:17 -05:00
Linus Torvalds
24545cf168 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  ipv6: Check dest prefix length on original route not copied one in rt6_alloc_cow().
  sch_gred: should not use GFP_KERNEL while holding a spinlock
  ipip, sit: copy parms.name after register_netdevice
  ipv6: Fix for adding multicast route for loopback device automatically.
  ssb: fix init regression with SoCs
  rtl8192{ce,cu,de,se}: avoid problems because of possible ERFOFF -> ERFSLEEP transition
  mac80211: fix another race in aggregation start
  fsl_pq_mdio: Clean up tbi address configuration
  ppp: fix pptp double release_sock in pptp_bind()
  net/fec: fix the use of pdev->id
  ath9k: fix check for antenna diversity support
  batman-adv: delete global entry in case of roaming
  batman-adv: in case of roaming mark the client with TT_CLIENT_ROAM
  Bluetooth: Correct version check in hci_setup
  btusb: fix a memory leak in btusb_send_frame()
  Bluetooth: bnep: Fix module reference
  Bluetooth: cmtp: Fix module reference
  Bluetooth: btmrvl: support Marvell Bluetooth device SD8797
2011-12-16 12:17:32 -08:00
David S. Miller
220b07e90e batman-adv: Fix merge error.
I didn't resolve the merge properly during the last pull of the net
tree into net-next.

The code in the final resolution should set flags to TT_CLIENT_ROAM
not TT_CLIENT_PENDING.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 15:07:28 -05:00
Ben Hutchings
278bc4296b ethtool: Define and apply a default policy for RX flow hash indirection
All drivers that support modification of the RX flow hash indirection
table initialise it in the same way: RX rings are assigned to table
entries in rotation.  Make that default policy explicit by having them
call a ethtool_rxfh_indir_default() function.

In the ethtool core, add support for a zero size value for
ETHTOOL_SRXFHINDIR, which resets the table to this default.

Partly-suggested-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Shreyas N Bhatewara <sbhatewara@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:53:18 -05:00
Ben Hutchings
7850f63f16 ethtool: Centralise validation of ETHTOOL_{G, S}RXFHINDIR parameters
Add a new ethtool operation (get_rxfh_indir_size) to get the
indirectional table size.  Use this to validate the user buffer size
before calling get_rxfh_indir or set_rxfh_indir.  Use get_rxnfc to get
the number of RX rings, and validate the contents of the new
indirection table before calling set_rxfh_indir.  Remove this
validation from drivers.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Dimitris Michailidis <dm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:52:47 -05:00
Pavel Emelyanov
5d531aaa64 unix_diag: Write it into kbuild
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:48:29 -05:00
Pavel Emelyanov
cbf391958a unix_diag: Receive queue lenght NLA
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:48:29 -05:00
Pavel Emelyanov
2aac7a2cb0 unix_diag: Pending connections IDs NLA
When establishing a unix connection on stream sockets the
server end receives an skb with socket in its receive queue.

Report who is waiting for these ends to be accepted for
listening sockets via NLA.

There's a lokcing issue with this -- the unix sk state lock is
required to access the peer, and it is taken under the listening
sk's queue lock. Strictly speaking the queue lock should be taken
inside the state lock, but since in this case these two sockets
are different it shouldn't lead to deadlock.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:48:28 -05:00
Pavel Emelyanov
ac02be8d96 unix_diag: Unix peer inode NLA
Report the peer socket inode ID as NLA. With this it's finally
possible to find out the other end of an interesting unix connection.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:48:28 -05:00
Pavel Emelyanov
5f7b056946 unix_diag: Unix inode info NLA
Actually, the socket path if it's not anonymous doesn't give
a clue to which file the socket is bound to. Even if the path
is absolute, it can be unlinked and then new socket can be
bound to it.

With this NLA it's possible to check which file a particular
socket is really bound to.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:48:28 -05:00
Pavel Emelyanov
f5248b48a6 unix_diag: Unix socket name NLA
Report the sun_path when requested as NLA. With leading '\0' if
present but without the leading AF_UNIX bits.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:48:28 -05:00
Pavel Emelyanov
5d3cae8bc3 unix_diag: Dumping exact socket core
The socket inode is used as a key for lookup. This is effectively
the only really unique ID of a unix socket, but using this for
search currently has one problem -- it is O(number of sockets) :(

Does it worth fixing this lookup or inventing some other ID for
unix sockets?

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:48:28 -05:00
Pavel Emelyanov
45a96b9be6 unix_diag: Dumping all sockets core
Walk the unix sockets table and fill the core response structure,
which includes type, state and inode.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:48:28 -05:00
Pavel Emelyanov
22931d3b90 unix_diag: Basic module skeleton
Includes basic module_init/_exit functionality, dump/get_exact stubs
and declares the basic API structures for request and response.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:48:27 -05:00
Pavel Emelyanov
fa7ff56f75 af_unix: Export stuff required for diag module
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:48:27 -05:00
Pavel Emelyanov
f65c1b534b sock_diag: Generalize requests cookies managements
The sk address is used as a cookie between dump/get_exact calls.
It will be required for unix socket sdumping, so move it from
inet_diag to sock_diag.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:48:27 -05:00
Pavel Emelyanov
aec8dc62f6 sock_diag: Fix module netlink aliases
I've made a mistake when fixing the sock_/inet_diag aliases :(

1. The sock_diag layer should request the family-based alias,
   not just the IPPROTO_IP one;
2. The inet_diag layer should request for AF_INET+protocol alias,
   not just the protocol one.

Thus fix this.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:48:27 -05:00
Rajkumar Manoharan
5ce543d148 cfg80211: Restore orig channel values upon disconnect
When we restore regulatory settings the world regulatory domain
is properly reset on cfg80211 (or user prefered regulatory domain)
but we were never setting back channel values for drivers that use
WIPHY_FLAG_CUSTOM_REGULATORY. Set these values up again by using
the orig_ channel parameters.

This fixes restoring custom regulatory settings upon disconnect
events.

Cc: compat@orbit-lab.org
Cc: Paul Stewart <pstew@google.com>
Cc: Rajkumar Manoharan <rmanohar@qca.qualcomm.com>
Cc: Senthilkumar Balasubramanian <senthilb@qca.qualcomm.com>
Signed-off-by: Rajkumar Manoharan <rmanohar@qca.qualcomm.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-16 09:30:43 -05:00
Luis R. Rodriguez
061acaae76 cfg80211: allow following country IE power for custom regdom cards
By definition WIPHY_FLAG_STRICT_REGULATORY was intended to allow the
wiphy to adjust itself to the country IE power information if the
card had no regulatory data but we had no way to tell cfg80211 that if
the card also had its own custom regulatory domain (these are typically
custom world regulatory domains) that we want to follow the country IE's
noted values for power for each channel. We add support for this and
document it.

This is not a critical fix but a performance optimization for cards
with custom regulatory domains that associate to an AP with sends
out country IEs with a higher EIRP than the one on the custom
regulatory domain. In practice the only driver affected right now
are the Atheros drivers as they are the only drivers using both
WIPHY_FLAG_STRICT_REGULATORY and WIPHY_FLAG_CUSTOM_REGULATORY --
used on cards that have an Atheros world regulatory domain. Cards
that have been programmed to follow a country specifically will not
follow the country IE power. So although not a stable fix distributions
should consider cherry picking this.

Cc: compat@orbit-lab.org
Cc: Paul Stewart <pstew@google.com>
Cc: Rajkumar Manoharan <rmanohar@qca.qualcomm.com>
Cc: Senthilkumar Balasubramanian <senthilb@qca.qualcomm.com>
Reported-by: Rajkumar Manoharan <rmanohar@qca.qualcomm.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-16 09:30:42 -05:00
David S. Miller
b26e478f8f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/freescale/fsl_pq_mdio.c
	net/batman-adv/translation-table.c
	net/ipv6/route.c
2011-12-16 02:11:14 -05:00
Mohammed Shafi Shajakhan
1478acb392 mac80211: Fix power save in change interface
we found that power save is not getting enabled when we do
change interface in this order STA->IBSS->STA. this is
because ieee80211_setup_sdata clears type-dependent union

Reported-by: Leela Kella <leela@qca.qualcomm.com>
Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-15 14:46:35 -05:00
Mohammed Shafi Shajakhan
9c38a8b491 mac80211: remove an unnecessary paraenthesis
Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-15 14:46:35 -05:00
Helmut Schaa
cf6bb79ad8 mac80211: Use appropriate TID for sending BAR, ADDBA and DELBA frames
Currently BAR, ADDBA and DELBA frames are always sent using AC_VO. If
the TID for which a BA session is established is assigned to a different
queue BAR, ADDBA and DELBA frames can "overtake" frames of the according
BA session.

Hence, always put BA session related frames into the same queue as the
BA sessions data frames.

Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-15 14:46:35 -05:00
Johannes Berg
4d33960bf9 mac80211: reduce station management complexity
Now that IBSS no longer needs to insert stations
from atomic context, we can get rid of all the
special cases for that, and even get rid of the
sta_lock (though it needs to stay as tim_lock.)

This makes the station management code much more
straight-forward.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-15 14:46:35 -05:00
Johannes Berg
8bf11d8d08 mac80211: delay IBSS station insertion
In order to notify drivers and simplify the station
management code, defer IBSS station insertion to a
work item and don't do it directly while receiving
a frame.

This increases the complexity in IBSS a little bit,
but it's pretty straight forward and it allows us
to reduce the station management complexity (next
patch) considerably.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-15 14:46:34 -05:00
Johannes Berg
56544160d4 mac80211: make address arguments to sta_info_alloc const
No real changes, just note that they are const.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-15 14:46:34 -05:00
Johannes Berg
29623892e1 mac80211: count authorized stations per BSS
Currently, each AP interface will send multicast
traffic if any interface has a station entry even
if that station entry is allocated only. With the
new station state management we can easily fix it
by adding a counter that counts each authorized
station only and send multicast traffic only when
the correct interface has at least one authorized
station.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-15 14:46:34 -05:00
Johannes Berg
d9a7ddb05e mac80211: refactor station state transitions
Station entries can have various states, the most
important ones being auth, assoc and authorized.
This patch prepares us for telling the driver about
these states, we don't want to confuse drivers with
strange transitions, so with this we enforce that
they move in the right order between them (back and
forth); some transitions might happen before the
driver even knows about the station, but at least
runtime transitions will be ordered correctly.

As a consequence, IBSS and MESH stations will now
have the ASSOC flag set (so they can transition to
AUTHORIZED), and we can get rid of a special case
in TX processing.

When freeing a station, unwind the state so that
other parts of the code (or drivers later) can rely
on the transitions.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-15 14:46:34 -05:00
Johannes Berg
87be1e1e00 mac80211: use station mutex in configuration
There's no need to use RCU here, we can just lock
the station mutex instead. This allows the code
to sleep, which is necessary for later patches.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-15 14:45:46 -05:00
Johannes Berg
92b62f28d0 mac80211: remove duplicate TDLS peer verification
This is already checked in cfg80211, so no need
to repeat the checks here.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-15 14:45:45 -05:00
Johannes Berg
bdd90d5e36 cfg80211: validate nl80211 station handling better
The nl80211 station handling code is a bit messy
and doesn't do a lot of validation. It seems like
this could be an issue for drivers that don't use
mac80211 to validate everything.

As cfg80211 doesn't keep station state, move the
validation of allowing supported_rates to change
for TDLS only in station mode to mac80211.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-15 14:45:45 -05:00
Johannes Berg
d83023daa2 nl80211: add TDLS peer flag to policy
This was evidently missed in the TDLS patch (07ba55d7).

Cc: Arik Nemtsov <arik@wizery.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-15 14:45:45 -05:00
John W. Linville
42a3b63bb2 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2011-12-15 13:47:58 -05:00
Dan Carpenter
c48e074c7c tcp_memcontrol: fix reversed if condition
We should only dereference the pointer if it's valid, not the other way
round.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-15 11:59:44 -05:00
Samuel Ortiz
d646960f79 NFC: Initial LLCP support
This patch is an initial implementation for the NFC Logical Link Control
protocol. It's also known as NFC peer to peer mode.
This is a basic implementation as it lacks SDP (services Discovery
Protocol), frames aggregation support, and frame rejecion parsing.
Follow up patches will implement those missing features.
This code has been tested against a Nexus S phone implementing LLCP 1.0.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-14 14:50:13 -05:00
Samuel Ortiz
541d920b05 NFC: Set and get DEP general bytes
Without an API for setting and getting the local and remote general bytes,
drivers won't be able to properly establish a DEP link.
This API also allows them to propagate the remote general bytes they get
from the DEP link establishment up to the LLCP layer.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-14 14:50:13 -05:00
Samuel Ortiz
1ed28f6106 NFC: Add a DEP link control netlink command
NFC-DEP (Data Exchange Protocol) is an NFC MAC layer.
This command allows to enable and disable the DEP link on to which e.g.
LLCP can run.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-14 14:50:12 -05:00
Samuel Ortiz
db81a62451 NFC: Atomic socket allocation
rawsock_create() is called with preemption disabled, so we should not
sleep.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-14 14:50:12 -05:00
Samuel Ortiz
94a098da42 NFC: Do not take the genl mutex from the netlink release notifier
The netlink notifier is atomic so we must not sleep in that context.
Also we know that Any netlink packets arriving to us will be purged when
the notifier is called, so we don't need to take the mutex.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-14 14:50:12 -05:00
Samuel Ortiz
7c7cd3bfec NFC: Add tx skb allocation routine
This is a factorization of the current rawsock tx skb allocation routine,
as it will be used by the LLCP code.
We also rename nfc_alloc_skb to nfc_alloc_recv_skb for consistency sake.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-14 14:50:12 -05:00
Samuel Ortiz
52858b51b2 NFC: Add function name to the NFC pr_fmt() routine
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-14 14:50:12 -05:00
Simon Wunderlich
cb71b8d803 mac80211: free skb on error path of ieee80211_ibss_join()
Our new return also created a memleak. The skb should be freed before
returning an error.

Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-14 14:50:11 -05:00
Johannes Berg
00918d33c0 nl80211: accept testmode dump with netdev
All nl80211 commands that need only the wiphy
still allow identifying it by giving an interface
index, except, as Kenny pointed out, the testmode
dump support.

Fix this by looking up the wiphy via the ifidx in
this case as well.

Tested-by: Kenny Hsu <kenny.hsu@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-14 14:50:11 -05:00
John W. Linville
5d22df200b Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
Conflicts:
	drivers/net/wireless/iwlwifi/iwl-agn.c
2011-12-14 14:35:41 -05:00
Eric Dumazet
e6560d4dfe net: ping: remove some sparse errors
net/ipv4/sysctl_net_ipv4.c:78:6: warning: symbol 'inet_get_ping_group_range_table'
was not declared. Should it be static?

net/ipv4/sysctl_net_ipv4.c:119:31: warning: incorrect type in argument 2
(different signedness)
net/ipv4/sysctl_net_ipv4.c:119:31: expected int *range
net/ipv4/sysctl_net_ipv4.c:119:31: got unsigned int *<noident>

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-14 13:34:55 -05:00
Eric Dumazet
3a53943b5a cls_flow: remove one dynamic array
Its better to use a predefined size for this small automatic variable.

Removes a sparse error as well :

net/sched/cls_flow.c:288:13: error: bad constant expression

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-14 13:34:55 -05:00
Eric Dumazet
de93cb2eaf vlan: static functions
commit 6d4cdf47d2 (vlan: add 802.1q netpoll support) forgot to declare
as static some private functions.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-14 02:39:30 -05:00
Dan Carpenter
f9586f79bf vlan: add rtnl_dereference() annotations
The original code generates a Sparse warning:
net/8021q/vlan_core.c:336:9:
	error: incompatible types in comparison expression (different address spaces)

It's ok to dereference __rcu pointers here because we are holding the
RTNL lock.  I've added some calls to rtnl_dereference() to silence the
warning.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-14 02:39:30 -05:00
Eric Dumazet
c63044f0d2 rtnetlink: rtnl_link_register() sanity test
Before adding a struct rtnl_link_ops into link_ops list, check it doesnt
clash with a prior one.

Based on a previous patch from Alexander Smirnov

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-14 02:39:29 -05:00
Linus Torvalds
653f42f6b6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: add missing spin_unlock at ceph_mdsc_build_path()
  ceph: fix SEEK_CUR, SEEK_SET regression
  crush: fix mapping calculation when force argument doesn't exist
  ceph: use i_ceph_lock instead of i_lock
  rbd: remove buggy rollback functionality
  rbd: return an error when an invalid header is read
  ceph: fix rasize reporting by ceph_show_options
2011-12-13 14:59:42 -08:00
David S. Miller
bb3c36863e ipv6: Check dest prefix length on original route not copied one in rt6_alloc_cow().
After commit 8e2ec63917 ("ipv6: don't
use inetpeer to store metrics for routes.") the test in rt6_alloc_cow()
for setting the ANYCAST flag is now wrong.

'rt' will always now have a plen of 128, because it is set explicitly
to 128 by ip6_rt_copy.

So to restore the semantics of the test, check the destination prefix
length of 'ort'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-13 17:35:06 -05:00
David S. Miller
b43faac690 ipv6: If neigh lookup fails during icmp6 dst allocation, propagate error.
Don't just succeed with a route that has a NULL neighbour attached.
This follows the behavior of addrconf_dst_alloc().

Allowing this kind of route to end up with a NULL neigh attached will
result in packet drops on output until the route is somehow
invalidated, since nothing will meanwhile try to lookup the neigh
again.

A statistic is bumped for the case where we see a neigh-less route on
output, but the resulting packet drop is otherwise silent in nature,
and frankly it's a hard error for this to happen and ipv6 should do
what ipv4 does which is say something in the kernel logs.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-13 16:51:51 -05:00
David S. Miller
5c3ddec73d net: Remove unused neighbour layer ops.
It's simpler to just keep these things out until there is a real user
of them, so we can see what the needs actually are, rather than keep
these things around as useless overhead.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-13 16:44:22 -05:00
Eliad Peller
53d69c399a mac80211: don't check sdata_running in vif notifier
The ip address of the vif can be set even before the
vif is up. requiring the vif to be up in the vif
notifier makes the notifer ignore this event, which
causes wrong arp filter configuration later on.

Reported-by: Eyal Shapira <eyal@wizery.com>
Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-13 15:34:15 -05:00
Eliad Peller
0d392e938b mac80211: configure BSS_CHANGED_ARP_FILTER on reconfiguration
Configure arp filtering on sta reconfiguration.

Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-13 15:34:11 -05:00
Dmitry TARNYAGIN
cd6c524e9e mac80211: Do not request FIF_BCN_PRBRESP_PROMISC for HW scan.
ieee80211_configure_filter code used local->scanning as a boolean
value when it was a bit mask. Bits SCAN_COMPLETED, SCAN_ABORTED
should not set FIF_BCN_PRBRESP_PROMISC filter.

SCAN_HW_SCANNING should not set FIF_BCN_PRBRESP_PROMISC either,
as there is no explicit filter configuration request from
scan code. If a driver requires FIF_BCN_PRBRESP_PROMISC mode
during HW scanning, it's up to the driver to temporary enable it.

Similar mistake was fixed also in ieee80211_hw_config (power
configuration code).

Verified-by: Vitaly Wool <vitaly.wool@sonyericsson.com>
Signed-off-by: Dmitry Tarnyagin <dmitry.tarnyagin@stericsson.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-13 15:34:08 -05:00
Rajkumar Manoharan
4a38994f1c cfg80211: notify core hints that helps to restore regd settings
Regulatory updates set by CORE are ignored for custom regulatory cards.
Let us notify the changes to the driver, as some drivers uses core hint
to restore its orig_* reg domain setting.

Cc: Paul Stewart <pstew@google.com>
Signed-off-by: Rajkumar Manoharan <rmanohar@qca.qualcomm.com>
Acked-by: Luis R. Rodriguez <mcgrof@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-13 15:31:04 -05:00
Helmut Schaa
adf5ace5d8 mac80211: Make use of ieee80211_is_* functions in tx status path
Use ieee80211_is_data, ieee80211_is_mgmt and ieee80211_is_first_frag
in the tx status path. This makes the code easier to read and allows us
to remove two local variables: frag and type.

Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-13 15:30:46 -05:00
Yogesh Ashok Powar
42624d4913 mac80211: Purge A-MPDU TX queues before station destructions
When a station leaves suddenly while ampdu traffic to that station is still
running, there is a possibility that the ampdu pending queues are not freed due
to a race condition leading to memory leaks. In '__sta_info_destroy' when we
attempt to destroy the ampdu sessions in 'ieee80211_sta_tear_down_BA_sessions',
the driver calls 'ieee80211_stop_tx_ba_cb_irqsafe' to delete the ampdu
structures (tid_tx) and splice the pending queues and this job gets queued in
sdata workqueue. However, the sta entry can get destroyed before the above work
gets scheduled and hence the race.

Purging the queues and freeing the tid_tx to avoid the leak. The better solution
would be to fix the race, but that can be taken up in a separate patch.

Signed-off-by: Nishant Sarmukadam <nishants@marvell.com>
Signed-off-by: Yogesh Ashok Powar <yogeshp@marvell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-13 15:30:33 -05:00
Vasanthakumar Thiagarajan
adbde344dc cfg80211: Fix race in bss timeout
It is quite possible to run into a race in bss timeout where
the drivers see the bss entry just before notifying cfg80211
of a roaming event but it got timed out by the time rdev->event_work
got scehduled from cfg80211_wq. This would result in the following
WARN-ON() along with the failure to notify the user space of
the roaming. The other situation which is happening with ath6kl
that runs into issue is when the driver reports roam to same AP
event where the AP bss entry already got expired. To fix this,
move cfg80211_get_bss() from __cfg80211_roamed() to cfg80211_roamed().

[158645.538384] WARNING: at net/wireless/sme.c:586
__cfg80211_roamed+0xc2/0x1b1()
[158645.538810] Call Trace:
[158645.538838]  [<c1033527>] warn_slowpath_common+0x65/0x7a
[158645.538917]  [<c14cfacf>] ? __cfg80211_roamed+0xc2/0x1b1
[158645.538946]  [<c103354b>] warn_slowpath_null+0xf/0x13
[158645.539055]  [<c14cfacf>] __cfg80211_roamed+0xc2/0x1b1
[158645.539086]  [<c14beb5b>] cfg80211_process_rdev_events+0x153/0x1cc
[158645.539166]  [<c14bd57b>] cfg80211_event_work+0x26/0x36
[158645.539195]  [<c10482ae>] process_one_work+0x219/0x38b
[158645.539273]  [<c14bd555>] ? wiphy_new+0x419/0x419
[158645.539301]  [<c10486cb>] worker_thread+0xf6/0x1bf
[158645.539379]  [<c10485d5>] ? rescuer_thread+0x1b5/0x1b5
[158645.539407]  [<c104b3e2>] kthread+0x62/0x67
[158645.539484]  [<c104b380>] ? __init_kthread_worker+0x42/0x42
[158645.539514]  [<c151309a>] kernel_thread_helper+0x6/0xd

Reported-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Vasanthakumar Thiagarajan <vthiagar@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-13 15:30:28 -05:00
Dan Carpenter
fb03c5eb8c mac80211: unlock on error path in ieee80211_ibss_join()
We recently introduced a new return here but it needs an unlock first.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-13 15:30:25 -05:00
Michael Maxim
76ad94fc5d IPVS: Modify the SH scheduler to use weights
Modify the algorithm to build the source hashing hash table to add
extra slots for destinations with higher weight. This has the effect
of allowing an IPVS SH user to give more connections to hosts that
have been configured to have a higher weight.

The reason for the Kconfig change is because the size of the hash table
becomes more relevant/important if you decide to use the weights in the
manner this patch lets you. It would be conceivable that someone might
need to increase the size of that table to accommodate their
configuration, so it will be handy to be able to do that through the
regular configuration system instead of editing the source.

Signed-off-by: Michael Maxim <mike@okcupid.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-12-13 11:34:48 +01:00
Florian Westphal
e26f9a480f netfilter: add ipv6 reverse path filter match
This is not merged with the ipv4 match into xt_rpfilter.c
to avoid ipv6 module dependency issues.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-12-13 11:34:43 +01:00
Hagen Paul Pfeifer
90b41a1cd4 netem: add cell concept to simulate special MAC behavior
This extension can be used to simulate special link layer
characteristics. Simulate because packet data is not modified, only the
calculation base is changed to delay a packet based on the original
packet size and artificial cell information.

packet_overhead can be used to simulate a link layer header compression
scheme (e.g. set packet_overhead to -20) or with a positive
packet_overhead value an additional MAC header can be simulated. It is
also possible to "replace" the 14 byte Ethernet header with something
else.

cell_size and cell_overhead can be used to simulate link layer schemes,
based on cells, like some TDMA schemes. Another application area are MAC
schemes using a link layer fragmentation with a (small) header each.
Cell size is the maximum amount of data bytes within one cell. Cell
overhead is an additional variable to change the per-cell-overhead
(e.g.  5 byte header per fragment).

Example (5 kbit/s, 20 byte per packet overhead, cell-size 100 byte, per
cell overhead 5 byte):

  tc qdisc add dev eth0 root netem rate 5kbit 20 100 5

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12 19:44:48 -05:00
David S. Miller
c7c6575f25 Merge branch 'batman-adv/next' of git://git.open-mesh.org/linux-merge 2011-12-12 19:26:07 -05:00
Eric Dumazet
3f1e6d3fd3 sch_gred: should not use GFP_KERNEL while holding a spinlock
gred_change_vq() is called under sch_tree_lock(sch).

This means a spinlock is held, and we are not allowed to sleep in this
context.

We might pre-allocate memory using GFP_KERNEL before taking spinlock,
but this is not suitable for stable material.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12 19:08:54 -05:00
Glauber Costa
0850f0f5c5 Display maximum tcp memory allocation in kmem cgroup
This patch introduces kmem.tcp.max_usage_in_bytes file, living in the
kmem_cgroup filesystem. The root cgroup will display a value equal
to RESOURCE_MAX. This is to avoid introducing any locking schemes in
the network paths when cgroups are not being actively used.

All others, will see the maximum memory ever used by this cgroup.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com>
CC: David S. Miller <davem@davemloft.net>
CC: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12 19:04:11 -05:00
Glauber Costa
ffea59e504 Display current tcp failcnt in kmem cgroup
This patch introduces kmem.tcp.failcnt file, living in the
kmem_cgroup filesystem. Following the pattern in the other
memcg resources, this files keeps a counter of how many times
allocation failed due to limits being hit in this cgroup.
The root cgroup will always show a failcnt of 0.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com>
CC: David S. Miller <davem@davemloft.net>
CC: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12 19:04:11 -05:00
Glauber Costa
5a6dd34377 Display current tcp memory allocation in kmem cgroup
This patch introduces kmem.tcp.usage_in_bytes file, living in the
kmem_cgroup filesystem. It is a simple read-only file that displays the
amount of kernel memory currently consumed by the cgroup.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com>
CC: David S. Miller <davem@davemloft.net>
CC: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12 19:04:11 -05:00
Glauber Costa
3aaabe2342 tcp buffer limitation: per-cgroup limit
This patch uses the "tcp.limit_in_bytes" field of the kmem_cgroup to
effectively control the amount of kernel memory pinned by a cgroup.

This value is ignored in the root cgroup, and in all others,
caps the value specified by the admin in the net namespaces'
view of tcp_sysctl_mem.

If namespaces are being used, the admin is allowed to set a
value bigger than cgroup's maximum, the same way it is allowed
to set pretty much unlimited values in a real box.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com>
CC: David S. Miller <davem@davemloft.net>
CC: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12 19:04:11 -05:00
Glauber Costa
3dc43e3e4d per-netns ipv4 sysctl_tcp_mem
This patch allows each namespace to independently set up
its levels for tcp memory pressure thresholds. This patch
alone does not buy much: we need to make this values
per group of process somehow. This is achieved in the
patches that follows in this patchset.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
CC: David S. Miller <davem@davemloft.net>
CC: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12 19:04:11 -05:00
Glauber Costa
d1a4c0b37c tcp memory pressure controls
This patch introduces memory pressure controls for the tcp
protocol. It uses the generic socket memory pressure code
introduced in earlier patches, and fills in the
necessary data in cg_proto struct.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujtisu.com>
CC: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12 19:04:10 -05:00
Glauber Costa
e1aab161e0 socket: initial cgroup code.
The goal of this work is to move the memory pressure tcp
controls to a cgroup, instead of just relying on global
conditions.

To avoid excessive overhead in the network fast paths,
the code that accounts allocated memory to a cgroup is
hidden inside a static_branch(). This branch is patched out
until the first non-root cgroup is created. So when nobody
is using cgroups, even if it is mounted, no significant performance
penalty should be seen.

This patch handles the generic part of the code, and has nothing
tcp-specific.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujtsu.com>
CC: Kirill A. Shutemov <kirill@shutemov.name>
CC: David S. Miller <davem@davemloft.net>
CC: Eric W. Biederman <ebiederm@xmission.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12 19:04:10 -05:00
Glauber Costa
180d8cd942 foundations of per-cgroup memory pressure controlling.
This patch replaces all uses of struct sock fields' memory_pressure,
memory_allocated, sockets_allocated, and sysctl_mem to acessor
macros. Those macros can either receive a socket argument, or a mem_cgroup
argument, depending on the context they live in.

Since we're only doing a macro wrapping here, no performance impact at all is
expected in the case where we don't have cgroups disabled.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com>
CC: David S. Miller <davem@davemloft.net>
CC: Eric W. Biederman <ebiederm@xmission.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12 19:04:10 -05:00
Ted Feng
72b36015ba ipip, sit: copy parms.name after register_netdevice
Same fix as 731abb9cb2 for ipip and sit tunnel.
Commit 1c5cae815d removed an explicit call to dev_alloc_name in
ipip_tunnel_locate and ipip6_tunnel_locate, because register_netdevice
will now create a valid name, however the tunnel keeps a copy of the
name in the private parms structure. Fix this by copying the name back
after register_netdevice has successfully returned.

This shows up if you do a simple tunnel add, followed by a tunnel show:

$ sudo ip tunnel add mode ipip remote 10.2.20.211
$ ip tunnel
tunl0: ip/ip  remote any  local any  ttl inherit  nopmtudisc
tunl%d: ip/ip  remote 10.2.20.211  local any  ttl inherit
$ sudo ip tunnel add mode sit remote 10.2.20.212
$ ip tunnel
sit0: ipv6/ip  remote any  local any  ttl 64  nopmtudisc 6rd-prefix 2002::/16
sit%d: ioctl 89f8 failed: No such device
sit%d: ipv6/ip  remote 10.2.20.212  local any  ttl inherit

Cc: stable@vger.kernel.org
Signed-off-by: Ted Feng <artisdom@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12 18:50:51 -05:00
Li Wei
4af04aba93 ipv6: Fix for adding multicast route for loopback device automatically.
There is no obvious reason to add a default multicast route for loopback
devices, otherwise there would be a route entry whose dst.error set to
-ENETUNREACH that would blocking all multicast packets.

====================

[ more detailed explanation ]

The problem is that the resulting routing table depends on the sequence
of interface's initialization and in some situation, that would block all
muticast packets. Suppose there are two interfaces on my computer
(lo and eth0), if we initailize 'lo' before 'eth0', the resuting routing
table(for multicast) would be

# ip -6 route show | grep ff00::
unreachable ff00::/8 dev lo metric 256 error -101
ff00::/8 dev eth0 metric 256

When sending multicasting packets, routing subsystem will return the first
route entry which with a error set to -101(ENETUNREACH).

I know the kernel will set the default ipv6 address for 'lo' when it is up
and won't set the default multicast route for it, but there is no reason to
stop 'init' program from setting address for 'lo', and that is exactly what
systemd did.

I am sure there is something wrong with kernel or systemd, currently I preferred
kernel caused this problem.

====================

Signed-off-by: Li Wei <lw@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12 18:48:18 -05:00
Dan Carpenter
f8c141c3e9 nfc: signedness bug in __nci_request()
wait_for_completion_interruptible_timeout() returns -ERESTARTSYS if
interrupted so completion_rc needs to be signed.  The current code
probably returns -ETIMEDOUT if we hit this situation, but after this
patch is applied it will return -ERESTARTSYS.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-12 14:23:27 -05:00
John W. Linville
f2abba4921 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2011-12-12 14:19:43 -05:00
Sage Weil
f1932fc1a6 crush: fix mapping calculation when force argument doesn't exist
If the force argument isn't valid, we should continue calculating a
mapping as if it weren't specified.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-12-12 09:09:45 -08:00
Sven Eckelmann
b5a1eeef04 batman-adv: Only write requested number of byte to user buffer
Don't write more than the requested number of bytes of an batman-adv icmp
packet to the userspace buffer. Otherwise unrelated userspace memory might get
overridden by the kernel.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-12-12 19:11:07 +08:00
Sven Eckelmann
d18eb45332 batman-adv: Directly check read of icmp packet in copy_from_user
The access_ok read check can be directly done in copy_from_user since a failure
of access_ok is handled the same way as an error in __copy_from_user.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-12-12 19:11:06 +08:00