Commit Graph

25893 Commits (fd90b29d757827ab12d6669292612308ec249532)

Author SHA1 Message Date
Eric Dumazet fd90b29d75 tcp: change default tcp hash size
As time passed, available memory increased faster than number of
concurrent tcp sockets.

As a result, a machine with 4GB of ram gets a hash table
with 524288 slots, using 8388608 bytes of memory.

Lets change that by a 16x factor (one slot for 128 KB of ram)

Even if a small machine needs a _lot_ of sockets, tcp lookups are now
very efficient, using one cache line per socket.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-01 11:36:37 -05:00
Eric Dumazet ce43b03e88 net: move inet_dport/inet_num in sock_common
commit 68835aba4d (net: optimize INET input path further)
moved some fields used for tcp/udp sockets lookup in the first cache
line of struct sock_common.

This patch moves inet_dport/inet_num as well, filling a 32bit hole
on 64 bit arches and reducing number of cache line misses in lookups.

Also change INET_MATCH()/INET_TW_MATCH() to perform the ports match
before addresses match, as this check is more discriminant.

Remove the hash check from MATCH() macros because we dont need to
re validate the hash value after taking a refcount on socket, and
use likely/unlikely compiler hints, as the sk_hash/hash check
makes the following conditional tests 100% predicted by cpu.

Introduce skc_addrpair/skc_portpair pair values to better
document the alignment requirements of the port/addr pairs
used in the various MATCH() macros, and remove some casts.

The namespace check can also be done at last.

This slightly improves TCP/UDP lookup times.

IP/TCP early demux needs inet->rx_dst_ifindex and
TCP needs inet->min_ttl, lets group them together in same cache line.

With help from Ben Hutchings & Joe Perches.

Idea of this patch came after Ling Ma proposal to move skc_hash
to the beginning of struct sock_common, and should allow him
to submit a final version of his patch. My tests show an improvement
doing so.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Joe Perches <joe@perches.com>
Cc: Ling Ma <ling.ma.program@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-30 15:02:56 -05:00
Thomas Graf 06a31e2b91 sctp: verify length provided in heartbeat information parameter
If the variable parameter length provided in the mandatory
heartbeat information parameter exceeds the calculated payload
length the packet has been corrupted. Reply with a parameter
length protocol violation message.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-30 12:25:52 -05:00
Rami Rosen c07135633b rtnelink: remove unused parameter from rtnl_create_link().
This patch removes an unused parameter (src_net) from rtnl_create_link()
method and from the method single invocation, in veth.
This parameter was used in the past when calling
ops->get_tx_queues(src_net, tb) in rtnl_create_link().
The get_tx_queues() member of rtnl_link_ops was replaced by two methods,
get_num_tx_queues() and get_num_rx_queues(), which do not get any
parameter. This was done in commit d40156aa5e by
Jiri Pirko ("rtnl: allow to specify different num for rx and tx queue count").

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-30 12:24:40 -05:00
David S. Miller dad52fd964 Included changes:
- Use the new ETH_P_BATMAN define instead of the private BATADV_ETH_P_BATMAN
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIcBAABAgAGBQJQuIa1AAoJEADl0hg6qKeOqN8QAMZ/pEhhIjxQ/8Icde1dccbh
 IuGBtm1Nhx6dvfWxzAKx1JQ5GIJMrKFNR+er4bMEJHgsUtBW08pjMVFAzqzPV3Bb
 YMVQSJtnbl39obzWMjnsvbAhB2cOC04BAnFlCapKOAeQGWmNrYwZHmq1yrjQhq+F
 xOPUqMQ94CrecGGtXOrqCTEJL7Y6VJngu15A8ZGXQBeOKPlmfLUpA4wVG2f7n0cT
 aTv7sD46wmJ4YzFCmqd25ugKWCvtmslMg+ryY+NrZYVXlptMTsuF8JTcfqopne+5
 3KaIsQzdXPciM4PGuNmJC+C83kiQulmoeQav+LNvdq+r/suJLcELsxwsEH58amfc
 Qs80mXfd0Pnop1eORHvaTtNKMd/lkTJ6ydOVIW2dwN32VSPUaCxC0VopeERInKP+
 +1flCMT8e/CB7dFHH4lvjYbg9R30VZU2oQo8G3EFMRWzybbvphrM5ITax5aHdy6e
 sDzgwoBAdg9E6UvqLOcwkCeSVwyG9GYhk/JwpYVm9sKsVeqkCLLR9+mJjHv9iwqn
 h76jqJpwfpDHoDXBEvIQCNcN63B2XqzJGtyhWL3wrS+kAKrxnTONYJbqbugM3sJd
 lJjAFlCHR1xgeP1vZ8D+N2zM4CNR73AOHdKAUPUfyfJhJxcjpQpNpK26CDuKpNC9
 gNJ31BjiZPkjGXRd9NXf
 =hrO7
 -----END PGP SIGNATURE-----

Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge

Included changes:
- Use the new ETH_P_BATMAN define instead of the private BATADV_ETH_P_BATMAN

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-30 12:22:04 -05:00
Tommi Rantala ee3f34e857 sctp: fix CONFIG_SCTP_DBG_MSG=y null pointer dereference in sctp_v6_get_dst()
Trinity (the syscall fuzzer) triggered the following BUG, reproducible
only when the kernel is configured with CONFIG_SCTP_DBG_MSG=y.

When CONFIG_SCTP_DBG_MSG is not set, the null pointer is never
dereferenced.

---[ end trace a4de0bfcb38a3642 ]---
BUG: unable to handle kernel NULL pointer dereference at 0000000000000100
IP: [<ffffffff8136796e>] ip6_string+0x1e/0xa0
PGD 4eead067 PUD 4e472067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in:
CPU 3
Pid: 21324, comm: trinity-child11 Tainted: G        W    3.7.0-rc7+ #61 ASUSTeK Computer INC. EB1012/EB1012
RIP: 0010:[<ffffffff8136796e>]  [<ffffffff8136796e>] ip6_string+0x1e/0xa0
RSP: 0018:ffff88004e4637a0  EFLAGS: 00010046
RAX: ffff88004e4637da RBX: ffff88004e4637da RCX: 0000000000000000
RDX: ffffffff8246e92a RSI: 0000000000000100 RDI: ffff88004e4637da
RBP: ffff88004e4637a8 R08: 000000000000ffff R09: 000000000000ffff
R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff8289d600
R13: ffffffff8289d230 R14: ffffffff8246e928 R15: ffffffff8289d600
FS:  00007fed95153700(0000) GS:ffff88005fd80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000100 CR3: 000000004eeac000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process trinity-child11 (pid: 21324, threadinfo ffff88004e462000, task ffff8800524b0000)
Stack:
 ffff88004e4637da ffff88004e463828 ffffffff81368eee 000000004e4637d8
 ffffffff0000ffff ffff88000000ffff 0000000000000000 000000004e4637f8
 ffffffff826285d8 ffff88004e4637f8 0000000000000000 ffff8800524b06b0
Call Trace:
 [<ffffffff81368eee>] ip6_addr_string.isra.11+0x3e/0xa0
 [<ffffffff81369183>] pointer.isra.12+0x233/0x2d0
 [<ffffffff810a413a>] ? vprintk_emit+0x1ba/0x450
 [<ffffffff8110953d>] ? trace_hardirqs_on_caller+0x10d/0x1a0
 [<ffffffff81369757>] vsnprintf+0x187/0x5d0
 [<ffffffff81369c62>] vscnprintf+0x12/0x30
 [<ffffffff810a4028>] vprintk_emit+0xa8/0x450
 [<ffffffff81e5cb00>] printk+0x49/0x4b
 [<ffffffff81d17221>] sctp_v6_get_dst+0x731/0x780
 [<ffffffff81d16e15>] ? sctp_v6_get_dst+0x325/0x780
 [<ffffffff81d00a96>] sctp_transport_route+0x46/0x120
 [<ffffffff81cff0f1>] sctp_assoc_add_peer+0x161/0x350
 [<ffffffff81d0fd8d>] sctp_sendmsg+0x6cd/0xcb0
 [<ffffffff81b55bf0>] ? inet_create+0x670/0x670
 [<ffffffff81b55cfb>] inet_sendmsg+0x10b/0x220
 [<ffffffff81b55bf0>] ? inet_create+0x670/0x670
 [<ffffffff81a72a64>] ? sock_update_classid+0xa4/0x2b0
 [<ffffffff81a72ab0>] ? sock_update_classid+0xf0/0x2b0
 [<ffffffff81a6ac1c>] sock_sendmsg+0xdc/0xf0
 [<ffffffff8118e9e5>] ? might_fault+0x85/0x90
 [<ffffffff8118e99c>] ? might_fault+0x3c/0x90
 [<ffffffff81a6e12a>] sys_sendto+0xfa/0x130
 [<ffffffff810a9887>] ? do_setitimer+0x197/0x380
 [<ffffffff81e960d5>] ? sysret_check+0x22/0x5d
 [<ffffffff81e960a9>] system_call_fastpath+0x16/0x1b
Code: 01 eb 89 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 f8 31 c9 48 89 e5 53 eb 12 0f 1f 40 00 48 83 c1 01 48 83 c0 04 48 83 f9 08 74 70 <0f> b6 3c 4e 89 fb 83 e7 0f c0 eb 04 41 89 d8 41 83 e0 0f 0f b6
RIP  [<ffffffff8136796e>] ip6_string+0x1e/0xa0
 RSP <ffff88004e4637a0>
CR2: 0000000000000100
---[ end trace a4de0bfcb38a3643 ]---

Signed-off-by: Tommi Rantala <tt.rantala@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-30 12:21:27 -05:00
Alan Ott 92a2ec72a7 mac802154: use kfree_skb() instead of dev_kfree_skb()
kfree_skb() indicates failure, which is where this is being used.

Signed-off-by: Alan Ott <alan@signal11.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-30 12:19:24 -05:00
Alan Ott fcefbe9fcb mac802154: fix memory leaks
kfree_skb() was not getting called in the case of some failures.
This was pointed out by Eric Dumazet.

Signed-off-by: Alan Ott <alan@signal11.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-30 12:19:24 -05:00
Alan Ott b333b7e6ec 6lowpan: consider checksum bytes in fragmentation threshold
Change the threshold for framentation of a lowpan packet from
using the MTU size to now use the MTU size minus the checksum length,
which is added by the hardware. For IEEE 802.15.4, this effectively
changes it from 127 bytes to 125 bytes.

Signed-off-by: Alan Ott <alan@signal11.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-30 12:19:24 -05:00
Yi Zou 6e22ce2c6e 8021q: fix vlan device to inherit the unicast filtering capability flag
This bug is observed on running FCoE over a VLAN device associated w/
a real device that has IFF_UNICAST_FLT set since FCoE would add unicast
address such as FLOGI MAC to the VLAN interface that FCoE is on. Since
currently, VLAN device is not inheriting the IFF_UNICAST_FLT flag from the
parent real device even though the real device is capable of doing unicast
filtering. This forces the VLAN device and its real device go to promiscuous
mode unnecessarily even the added address is actually being added to the
available unicast filter table in real device.

Signed-off-by: Yi Zou <yi.zou@intel.com>
Cc: devel@open-fcoe.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-30 12:07:27 -05:00
David S. Miller e7165030db Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch
Conflicts:
	net/ipv6/exthdrs_core.c

Jesse Gross says:

====================
This series of improvements for 3.8/net-next contains four components:
 * Support for modifying IPv6 headers
 * Support for matching and setting skb->mark for better integration with
   things like iptables
 * Ability to recognize the EtherType for RARP packets
 * Two small performance enhancements

The movement of ipv6_find_hdr() into exthdrs_core.c causes two small merge
conflicts.  I left it as is but can do the merge if you want.  The conflicts
are:
 * ipv6_find_hdr() and ipv6_find_tlv() were both moved to the bottom of
   exthdrs_core.c.  Both should stay.
 * A new use of ipv6_find_hdr() was added to net/netfilter/ipvs/ip_vs_core.c
   after this patch.  The IPVS user has two instances of the old constant
   name IP6T_FH_F_FRAG which has been renamed to IP6_FH_F_FRAG.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-30 12:01:30 -05:00
Antonio Quartulli af5d4f7737 batman-adv: use ETH_P_BATMAN
The ETH_P_BATMAN ethertype is now defined kernel-wide. Use it instead
of the private BATADV_ETH_P_BATMAN define.

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2012-11-30 10:50:22 +01:00
Rami Rosen bb728820fe core: make GRO methods static.
This patch changes three methods to be static and removes their
EXPORT_SYMBOLs in core/dev.c and their external declaration in
netdevice.h. The methods, dev_gro_receive(), napi_frags_finish() and
napi_skb_finish(), which are in the GRO rx path, are not used
outside core/dev.c.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-29 13:18:32 -05:00
David S. Miller 8a2cf062b2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-29 12:51:17 -05:00
David S. Miller a45085f6a7 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch
Two small openswitch fixes from Jesse Gross.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-28 18:00:47 -05:00
David S. Miller 83a9d197c7 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
John W. Linville says:

====================
This pull request is intended for the 3.8 stream.  It is a bit large
-- I guess Thanksgiving got me off track!  At least the code got to
spend some time in linux-next... :-)

This includes the usual batch of pulls for Bluetooth, NFC, and mac80211
as well as iwlwifi.  Also here is an ath6kl pull, and a new driver
in the rtlwifi family.  The brcmfmac, brcmsmac, ath9k, and mwl8k get
their usual levels of attention, and a handful of other updates tag
along as well.

For more detail on the pulls, please see below...

On Bluetooth, Gustavo says:

"Another set of patches for integration in wireless-next. There are two big set
of changes in it: Andrei Emeltchenko and Mat Martineau added more patches
towards a full Bluetooth High Speed support and Johan Hedberg improve the
single mode support for Bluetooth dongles. Apart from that we have small fixes
and improvements."

...and:

"A few patches to 3.8. The majority of the work here is from Andrei on the High
Speed support. Other than that Johan added support for setting LE advertising
data. The rest are fixes and clean ups and small improvements like support for
a new broadcom hardware."

On mac80211, Johannes says:

"This is for mac80211, for -next (3.8). Plenty of changes, as you can see
below. Some fixes for previous changes like the export.h include, the
beacon listener fix from Ben Greear, etc. Overall, no exciting new
features, though hwsim does gain channel context support for people to
try it out and look at."

...and...:

"This one contains the mac80211-next material. Apart from a few small new
features and cleanups I have two fixes for the channel context code. The
RX_END timestamp support will probably be reworked again as Simon Barber
noted the calculations weren't really valid, but the discussions there
are still going on and it's better than what we had before."

...and:

"Please pull (see below) to get the following changes:
 * a fix & a debug aid in IBSS from Antonio,
 * mesh cleanups from Marco,
 * a few bugfixes for some of my previous patches from Arend and myself,
 * and the big initial VHT support patchset"

And on iwlwifi, Johannes says:

"In addition to the previous four patches that I'm not resending,
we have a number of cleanups, message reduction, firmware error
handling improvements (yes yes... we need to fix them instead)
and various other small things all over."

...and:

"In his quest to try to understand the current iwlwifi problems (like
stuck queues etc.) Emmanuel has first cleaned up the PCIe code, I'm
including his changes in this pull request. Other than that I only have
a small cleanup from Sachin Kamat to remove a duplicate include and a
bugfix to turn off MFP if software crypto is enabled, but this isn't
really interesting as MFP isn't supported right now anyway."

On NFC, Samuel says:

"With this one we have:

- A few HCI improvements in preparation for an upcoming HCI chipset support.
- A pn544 code cleanup after the old driver was removed.
- An LLCP improvement for notifying user space when one peer stops ACKing I
  frames."

On ath6kl, Kalle says:

"Major changes this time are firmware recover support to gracefully
handle if firmware crashes, support for changing regulatory domain and
support for new ar6004 hardware revision 1.4. Otherwise there are just
smaller fixes or cleanups from different people."

Thats about it... :-)  Please let me know if there are problems!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-28 17:49:16 -05:00
Jesse Gross 92eb1d4771 openvswitch: Use RCU callback when detaching netdevices.
Currently, each time a device is detached from an OVS datapath
we call synchronize RCU before freeing associated data structures.
However, if a bridge is deleted (which detaches all ports) when
many devices are connected then there can be a long delay.  This
switches to use call_rcu() to group the cost together.

Reported-by: Justin Pettit <jpettit@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-11-28 14:04:34 -08:00
Nicolas Dichtel f4e0b4c5e1 ip6tnl/sit: drop packet if ECN present with not-ECT
This patch reports the change made by Stephen Hemminger in ipip and gre[6] in
commit eccc1bb8d4 (tunnel: drop packet if ECN present with not-ECT).

Goal is to handle RFC6040, Section 4.2:

Default Tunnel Egress Behaviour.
 o If the inner ECN field is Not-ECT, the decapsulator MUST NOT
      propagate any other ECN codepoint onwards.  This is because the
      inner Not-ECT marking is set by transports that rely on dropped
      packets as an indication of congestion and would not understand or
      respond to any other ECN codepoint [RFC4774].  Specifically:

      *  If the inner ECN field is Not-ECT and the outer ECN field is
         CE, the decapsulator MUST drop the packet.

      *  If the inner ECN field is Not-ECT and the outer ECN field is
         Not-ECT, ECT(0), or ECT(1), the decapsulator MUST forward the
         outgoing packet with the ECN field cleared to Not-ECT.

The patch takes benefits from common function added in net/inet_ecn.h.

Like it was done for Xin4 tunnels, it adds logging to allow detecting broken
systems that set ECN bits incorrectly when tunneling (or an intermediate
router might be changing the header). Errors are also tracked via
rx_frame_error.

CC: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-28 11:37:11 -05:00
David S. Miller 52f2ede1ce Merge branch 'master' of git://1984.lsi.us.es/nf
An interface name overflow fix in netfilter via Pablo Neira Ayuso.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-28 11:29:43 -05:00
Tommi Rantala c3b2c25819 irda: irttp: fix memory leak in irttp_open_tsap() error path
Cleanup the memory we allocated earlier in irttp_open_tsap() when we hit
this error path. The leak goes back to at least 1da177e4
("Linux-2.6.12-rc2").

Discovered with Trinity (the syscall fuzzer).

Signed-off-by: Tommi Rantala <tt.rantala@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-28 11:25:29 -05:00
Paolo Valente 462dbc9101 pkt_sched: QFQ Plus: fair-queueing service at DRR cost
This patch turns QFQ into QFQ+, a variant of QFQ that provides the
following two benefits: 1) QFQ+ is faster than QFQ, 2) differently
from QFQ, QFQ+ correctly schedules also non-leaves classes in a
hierarchical setting. A detailed description of QFQ+, plus a
performance comparison with DRR and QFQ, can be found in [1].

[1] P. Valente, "Reducing the Execution Time of Fair-Queueing Schedulers"
http://algo.ing.unimo.it/people/paolo/agg-sched/agg-sched.pdf

Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-28 11:19:35 -05:00
Schoch Christian 92d64c261e sctp: Error in calculation of RTTvar
The calculation of RTTVAR involves the subtraction of two unsigned
numbers which
may causes rollover and results in very high values of RTTVAR when RTT > SRTT.
With this patch it is possible to set RTOmin = 1 to get the minimum of RTO at
4 times the clock granularity.

Change Notes:

v2)
        *Replaced abs() by abs64() and long by __s64, changed patch
description.

Signed-off-by: Christian Schoch <e0326715@student.tuwien.ac.at>
CC: Vlad Yasevich <vyasevich@gmail.com>
CC: Sridhar Samudrala <sri@us.ibm.com>
CC: Neil Horman <nhorman@tuxdriver.com>
CC: linux-sctp@vger.kernel.org
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-28 11:13:40 -05:00
Tommi Rantala 6e51fe7572 sctp: fix -ENOMEM result with invalid user space pointer in sendto() syscall
Consider the following program, that sets the second argument to the
sendto() syscall incorrectly:

 #include <string.h>
 #include <arpa/inet.h>
 #include <sys/socket.h>

 int main(void)
 {
         int fd;
         struct sockaddr_in sa;

         fd = socket(AF_INET, SOCK_STREAM, 132 /*IPPROTO_SCTP*/);
         if (fd < 0)
                 return 1;

         memset(&sa, 0, sizeof(sa));
         sa.sin_family = AF_INET;
         sa.sin_addr.s_addr = inet_addr("127.0.0.1");
         sa.sin_port = htons(11111);

         sendto(fd, NULL, 1, 0, (struct sockaddr *)&sa, sizeof(sa));

         return 0;
 }

We get -ENOMEM:

 $ strace -e sendto ./demo
 sendto(3, NULL, 1, 0, {sa_family=AF_INET, sin_port=htons(11111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ENOMEM (Cannot allocate memory)

Propagate the error code from sctp_user_addto_chunk(), so that we will
tell user space what actually went wrong:

 $ strace -e sendto ./demo
 sendto(3, NULL, 1, 0, {sa_family=AF_INET, sin_port=htons(11111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EFAULT (Bad address)

Noticed while running Trinity (the syscall fuzzer).

Signed-off-by: Tommi Rantala <tt.rantala@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-28 11:11:17 -05:00
Tommi Rantala be364c8c0f sctp: fix memory leak in sctp_datamsg_from_user() when copy from user space fails
Trinity (the syscall fuzzer) discovered a memory leak in SCTP,
reproducible e.g. with the sendto() syscall by passing invalid
user space pointer in the second argument:

 #include <string.h>
 #include <arpa/inet.h>
 #include <sys/socket.h>

 int main(void)
 {
         int fd;
         struct sockaddr_in sa;

         fd = socket(AF_INET, SOCK_STREAM, 132 /*IPPROTO_SCTP*/);
         if (fd < 0)
                 return 1;

         memset(&sa, 0, sizeof(sa));
         sa.sin_family = AF_INET;
         sa.sin_addr.s_addr = inet_addr("127.0.0.1");
         sa.sin_port = htons(11111);

         sendto(fd, NULL, 1, 0, (struct sockaddr *)&sa, sizeof(sa));

         return 0;
 }

As far as I can tell, the leak has been around since ~2003.

Signed-off-by: Tommi Rantala <tt.rantala@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-28 11:10:09 -05:00
John W. Linville 79d38f7d6c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
Conflicts:
	drivers/net/wireless/iwlwifi/pcie/tx.c
2012-11-28 10:56:03 -05:00
Eric Dumazet b49d3c1e1c net: ipmr: limit MRT_TABLE identifiers
Name of pimreg devices are built from following format :

char name[IFNAMSIZ]; // IFNAMSIZ == 16

sprintf(name, "pimreg%u", mrt->id);

We must therefore limit mrt->id to 9 decimal digits
or risk a buffer overflow and a crash.

Restrict table identifiers in [0 ... 999999999] interval.

Reported-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-26 17:36:59 -05:00
Joe Perches 03f52a0a55 ip6mr: Add sizeof verification to MRT6_ASSERT and MT6_PIM
Verify the length of the user-space arguments.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-26 17:35:58 -05:00
Neal Cardwell e1a676424c ipv4: avoid passing NULL to inet_putpeer() in icmpv4_xrlim_allow()
inet_getpeer_v4() can return NULL under OOM conditions, and while
inet_peer_xrlim_allow() is OK with a NULL peer, inet_putpeer() will
crash.

This code path now uses the same idiom as the others from:
1d861aa4b3 ("inet: Minimize use of
cached route inetpeer.").

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-26 17:24:41 -05:00
Brian Haley c91f6df2db sockopt: Change getsockopt() of SO_BINDTODEVICE to return an interface name
Instead of having the getsockopt() of SO_BINDTODEVICE return an index, which
will then require another call like if_indextoname() to get the actual interface
name, have it return the name directly.

This also matches the existing man page description on socket(7) which mentions
the argument being an interface name.

If the value has not been set, zero is returned and optlen will be set to zero
to indicate there is no interface name present.

Added a seqlock to protect this code path, and dev_ifname(), from someone
changing the device name via dev_change_name().

v2: Added seqlock protection while copying device name.

v3: Fixed word wrap in patch.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-26 17:22:14 -05:00
David Woodhouse ae088d663b atm: br2684: Fix excessive queue bloat
There's really no excuse for an additional wmem_default of buffering
between the netdev queue and the ATM device. Two packets (one in-flight,
and one ready to send) ought to be fine. It's not as if it should take
long to get another from the netdev queue when we need it.

If necessary we can make the queue space configurable later, but I don't
think it's likely to be necessary.

cf. commit 9d02daf754 (pppoatm: Fix
excessive queue bloat) which did something very similar for PPPoATM.

Note that there is a tremendously unlikely race condition which may
result in qspace temporarily going negative. If a CPU running the
br2684_pop() function goes off into the weeds for a long period of time
after incrementing qspace to 1, but before calling netdev_wake_queue()...
and another CPU ends up calling br2684_start_xmit() and *stopping* the
queue again before the first CPU comes back, the netdev queue could
end up being woken when qspace has already reached zero.

An alternative approach to coping with this race would be to check in
br2684_start_xmit() for qspace==0 and return NETDEV_TX_BUSY, but just
using '> 0' and '< 1' for comparison instead of '== 0' and '!= 0' is
simpler. It just warranted a mention of *why* we do it that way...

Move the call to atmvcc->send() to happen *after* the accounting and
potentially stopping the netdev queue, in br2684_xmit_vcc(). This matters
if the ->send() call suffers an immediate failure, because it'll call
br2684_pop() with the offending skb before returning. We want that to
happen *after* we've done the initial accounting for the packet in
question. Also make it return an appropriate success/failure indication
while we're at it.

Tested by running 'ping -l 1000 bottomless.aaisp.net.uk' from within my
network, with only a single PPPoE-over-BR2684 link running. And after
setting txqueuelen on the nas0 interface to something low (5, in fact).
Before the patch, we'd see about 15 packets being queued and a resulting
latency of ~56ms being reached. After the patch, we see only about 8,
which is fairly much what we expect. And a max latency of ~36ms. On this
OpenWRT box, wmem_default is 163840.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Reviewed-by: Krzysztof Mazur <krzysiek@podlesie.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-26 17:13:56 -05:00
Ben Hutchings b3422a314c dsa: Hide core config options; make drivers select what they need
Commit 82167cb8c6 ('net: dsa/slave: Fix
compilation warnings') fixed one possible invalid configuration
(NET_DSA enabled with no trailer formats) but added others: drivers
can select NET_DSA without its dependencies being met.

It's not very useful to make either the DSA core or the tagging
formats manually selectable without a driver to use them, so:

1. Define a hidden HAVE_NET_DSA option and move the dependencies of
   NET_DSA to that.  While we're at it, drop the deprecated
   EXPERIMENTAL dependency.
2. Make NET_DSA and the drivers dependent on HAVE_NET_DSA.
3. Hide the tagging format options again.
4. Make drivers select both NET_DSA and the appropriate tagging format
   option.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-26 17:10:44 -05:00
Oliver Hartkopp 81b401100c can: bcm: initialize ifindex for timeouts without previous frame reception
Set in the rx_ifindex to pass the correct interface index in the case of a
message timeout detection. Usually the rx_ifindex value is set at receive
time. But when no CAN frame has been received the RX_TIMEOUT notification
did not contain a valid value.

Cc: linux-stable <stable@vger.kernel.org>
Reported-by: Andre Naujoks <nautsch2@googlemail.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2012-11-26 22:33:59 +01:00
John W. Linville 62c8003ecb Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next 2012-11-26 14:46:41 -05:00
Ansis Atteka 39c7caebc9 openvswitch: add skb mark matching and set action
This patch adds support for skb mark matching and set action.

Signed-off-by: Ansis Atteka <aatteka@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-11-26 11:33:18 -08:00
Johannes Berg ec816087e8 cfg80211: fix some tracing output issues
In some cases, e.g. probe_status, there were spaces
missing so the trace output was confusing. Also make
it more like mac80211 when printing netdevs/wiphys
to make reading a combined log easier.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-11-26 12:48:17 +01:00
Johannes Berg 8bc83c2463 mac80211: support VHT rates in TX info
To achieve this, limit the number of retries to
31 (instead of 255) and use the three bits that
are then free for VHT flags.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-11-26 12:43:00 +01:00
Johannes Berg 5614618ec4 mac80211: support drivers reporting VHT RX
Add support to mac80211 for having drivers report
received VHT MCS information.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-11-26 12:42:59 +01:00
Johannes Berg db9c64cf8d nl80211/cfg80211: add VHT MCS support
Add support for reporting and calculating VHT MCSes.

Note that I'm not completely sure that the bitrate
calculations are correct, nor that they can't be
simplified.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-11-26 12:42:59 +01:00
Johannes Berg 4bf88530be mac80211: convert to channel definition struct
Convert mac80211 (and where necessary, some drivers a
little bit) to the new channel definition struct.

This will allow extending mac80211 for VHT, which is
currently restricted to channel contexts since there
are no drivers using that which makes it easier. As
I also don't care about VHT for drivers not using the
channel context API, I won't convert the previous API
to VHT support.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-11-26 12:42:59 +01:00
Johannes Berg 3d9d1d6656 nl80211/cfg80211: support VHT channel configuration
Change nl80211 to support specifying a VHT (or HT)
using the control channel frequency (as before) and
new attributes for the channel width and first and
second center frequency. The old channel type is of
course still supported for HT.

Also change the cfg80211 channel definition struct
to support these by adding the relevant fields to
it (and removing the _type field.)

This also adds new helper functions:
 - cfg80211_chandef_create to create a channel def
   struct given the control channel and channel type,
 - cfg80211_chandef_identical to check if two channel
   definitions are identical
 - cfg80211_chandef_compatible to check if the given
   channel definitions are compatible, and return the
   wider of the two

This isn't entirely complete, but that doesn't matter
until we have a driver using it. In particular, it's
missing
 - regulatory checks on the usable bandwidth (if that
   even makes sense)
 - regulatory TX power (database can't deal with it)
 - a proper channel compatibility calculation for the
   new channel types

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-11-26 12:42:59 +01:00
Johannes Berg 683b6d3b31 cfg80211: pass a channel definition struct
Instead of passing a channel pointer and channel type
to all functions and driver methods, pass a new channel
definition struct. Right now, this struct contains just
the control channel and channel type, but for VHT this
will change.

Also, add a small inline cfg80211_get_chandef_type() so
that drivers don't need to use the _type field of the
new structure all the time, which will change.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-11-26 12:42:58 +01:00
Johannes Berg 42d97a599e cfg80211: remove remain-on-channel channel type
As mwifiex (and mac80211 in the software case) are the
only drivers actually implementing remain-on-channel
with channel type, userspace can't be relying on it.
This is the case, as it's used only for P2P operations
right now.

Rather than adding a flag to tell userspace whether or
not it can actually rely on it, simplify all the code
by removing the ability to use different channel types.
Leave only the validation of the attribute, so that if
we extend it again later (with the needed capability
flag), it can't break userspace sending invalid data.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-11-26 12:42:58 +01:00
Johannes Berg 028e8da072 mac80211: fix managed mode channel flags handling
If ieee80211_prep_channel() decides that HT should be
disabled (because the HT IEs from the AP were invalid)
it will set the IEEE80211_STA_DISABLE_HT to not send
HT capabilities to the AP when associating. If this
happens during authentication, the flag will be lost
and we send HT frames, even if the channel config was
set up for non-HT. This can lead to issues.

Fix this by always resetting the ifmgd flags to zero
when the channel context is released so that the flag
resetting in ieee80211_mgd_assoc() isn't necessary.

To make the code a bit easier move the call to release
the channel in ieee80211_set_disassoc() to the end of
the function together with the flag resetting (which
needs to be at the end to avoid timers setting flags.)

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-11-26 12:37:38 +01:00
Marco Porsch 453e66f247 mac80211: remove mesh config macros from mesh_plink.c
Use shortcut pointer instead where it is appropriate.

Signed-off-by: Marco Porsch <marco.porsch@etit.tu-chemnitz.de>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-11-26 11:36:42 +01:00
Marco Porsch 40aefedc8b mac80211: refactor ieee80211_set_qos_hdr
Return early if not a QoS Data frame.
Give proper documentation.

Signed-off-by: Marco Porsch <marco.porsch@etit.tu-chemnitz.de>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-11-26 11:36:02 +01:00
Marco Porsch 65821635d2 mac80211: move Mesh Capability field definition to ieee80211.h
Signed-off-by: Marco Porsch <marco.porsch@etit.tu-chemnitz.de>
[prefix with IEEE80211_]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-11-26 11:35:21 +01:00
Antonio Quartulli 7bed20503f mac80211: in ADHOC print debug message for every Auth message
The debug message has to be printed also for an Auth message with
auth_sequence != 1. This helps understanding whether the two Auth
messages are exchanged correctly or not.

Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-11-26 11:32:42 +01:00
Antonio Quartulli e584da5e3c mac80211: in ADHOC don't update last_rx if sta is not authorized
It does not make sense to keep a station alive if it is not authorised
at all. If IBSS/RSN is used it could also be the case that something
went wrong during the keys exchange and the stations ended up in a not
recoverable state.

By not updating last_rx we are giving the station a chance to be
deleted and to start the key exchange once again from scratch.

Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-11-26 11:31:39 +01:00
Arend van Spriel c216e6417f cfg80211: change function signature of cfg80211_get_p2p_attr()
The function cfg80211_get_p2p_attr() can fail and returns
a negative error code. However, the return type is unsigned
int. The largest positive number is determined by desired_len
variable in the function, which is u16. So changing the return
type to int to allow easy error checking. Also change the type
for the attribute to enum for improved type checking.

Signed-off-by: Arend van Spriel <arend@broadcom.com>
[fix indentation, don't use u8 attr variable]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-11-26 11:28:55 +01:00
Joe Perches 53d6841d22 ipv4/ipmr and ipv6/ip6mr: Convert int mroute_do_<foo> to bool
Save a few bytes per table by convert mroute_do_assert and
mroute_do_pim from int to bool.

Remove !! as the compiler does that when assigning int to bool.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-25 16:34:17 -05:00