linux

q3k/linux

Author	SHA1	Message	Date
Pablo Neira Ayuso	5901b6be88	netfilter: nf_nat: support IPv6 in IRC NAT helper Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:23 +02:00
Patrick McHardy	9a66482106	netfilter: nf_nat: support IPv6 in SIP NAT helper Add IPv6 support to the SIP NAT helper. There are no functional differences to IPv4 NAT, just different formats for addresses. Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:22 +02:00
Patrick McHardy	ee6eb96673	netfilter: nf_nat: support IPv6 in amanda NAT helper Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:21 +02:00
Patrick McHardy	d33cbeeb1a	netfilter: nf_nat: support IPv6 in FTP NAT helper Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:20 +02:00
Patrick McHardy	ed72d9e294	netfilter: ip6tables: add NETMAP target Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:19 +02:00
Patrick McHardy	115e23ac78	netfilter: ip6tables: add REDIRECT target Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:19 +02:00
Patrick McHardy	b3f644fc82	netfilter: ip6tables: add MASQUERADE target Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:18 +02:00
Patrick McHardy	58a317f106	netfilter: ipv6: add IPv6 NAT support Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:17 +02:00
Patrick McHardy	2cf545e835	net: core: add function for incremental IPv6 pseudo header checksum updates Add inet_proto_csum_replace16 for incrementally updating IPv6 pseudo header checksums for IPv6 NAT. Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: David S. Miller <davem@davemloft.net>	2012-08-30 03:00:16 +02:00
Patrick McHardy	0ad352cb43	netfilter: ipv6: expand skb head in ip6_route_me_harder after oif change Expand the skb headroom if the oif changed due to rerouting similar to how IPv4 packets are handled. Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:15 +02:00
Patrick McHardy	c7232c9979	netfilter: add protocol independent NAT core Convert the IPv4 NAT implementation to a protocol independent core and address family specific modules. Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:14 +02:00
Patrick McHardy	051966c0c6	netfilter: nf_nat: add protoff argument to packet mangling functions For mangling IPv6 packets the protocol header offset needs to be known by the NAT packet mangling functions. Add a so far unused protoff argument and convert the conntrack and NAT helpers to use it in preparation of IPv6 NAT. Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:13 +02:00
Patrick McHardy	811927ccfe	netfilter: nf_conntrack: restrict NAT helper invocation to IPv4 The NAT helpers currently only handle IPv4 packets correctly. Restrict invocation of the helpers to IPv4 in preparation of IPv6 NAT. Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:12 +02:00
Patrick McHardy	2b60af0178	netfilter: nf_conntrack_ipv6: fix tracking of ICMPv6 error messages containing fragments ICMPv6 error messages are tracked by extracting the conntrack tuple of the inner packet and looking up the corresponding conntrack entry. Tuple extraction uses the ->get_l4proto() callback, which in case of fragments returns NEXTHDR_FRAGMENT instead of the upper protocol, even for the first fragment when the entire next header is present, resulting in a failure to find the correct connection tracking entry. This patch changes ipv6_get_l4proto() to use ipv6_skip_exthdr() instead of nf_ct_ipv6_skip_exthdr() in order to skip fragment headers when the fragment offset is zero. Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:11 +02:00
Patrick McHardy	4cdd34084d	netfilter: nf_conntrack_ipv6: improve fragmentation handling The IPv6 conntrack fragmentation currently has a couple of shortcomings. Fragmentes are collected in PREROUTING/OUTPUT, are defragmented, the defragmented packet is then passed to conntrack, the resulting conntrack information is attached to each original fragment and the fragments then continue their way through the stack. Helper invocation occurs in the POSTROUTING hook, at which point only the original fragments are available. The result of this is that fragmented packets are never passed to helpers. This patch improves the situation in the following way: - If a reassembled packet belongs to a connection that has a helper assigned, the reassembled packet is passed through the stack instead of the original fragments. - During defragmentation, the largest received fragment size is stored. On output, the packet is refragmented if required. If the largest received fragment size exceeds the outgoing MTU, a "packet too big" message is generated, thus behaving as if the original fragments were passed through the stack from an outside point of view. - The ipv6_helper() hook function can't receive fragments anymore for connections using a helper, so it is switched to use ipv6_skip_exthdr() instead of the netfilter specific nf_ct_ipv6_skip_exthdr() and the reassembled packets are passed to connection tracking helpers. The result of this is that we can properly track fragmented packets, but still generate ICMPv6 Packet too big messages if we would have before. This patch is also required as a precondition for IPv6 NAT, where NAT helpers might enlarge packets up to a point that they require fragmentation. In that case we can't generate Packet too big messages since the proper MTU can't be calculated in all cases (f.i. when changing textual representation of a variable amount of addresses), so the packet is transparently fragmented iff the original packet or fragments would have fit the outgoing MTU. IPVS parts by Jesper Dangaard Brouer <brouer@redhat.com>. Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:10 +02:00
Jesper Dangaard Brouer	590e3f79a2	ipvs: IPv6 MTU checking cleanup and bugfix Cleaning up the IPv6 MTU checking in the IPVS xmit code, by using a common helper function __mtu_check_toobig_v6(). The MTU check for tunnel mode can also use this helper as ntohs(old_iph->payload_len) + sizeof(struct ipv6hdr) is qual to skb->len. And the 'mtu' variable have been adjusted before calling helper. Notice, this also fixes a bug, as the the MTU check in ip_vs_dr_xmit_v6() were missing a check for skb_is_gso(). This bug e.g. caused issues for KVM IPVS setups, where different Segmentation Offloading techniques are utilized, between guests, via the virtio driver. This resulted in very bad performance, due to the ICMPv6 "too big" messages didn't affect the sender. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-08-30 02:55:39 +02:00
Amerigo Wang	072a9c4860	netpoll: revert `6bdb7fe310` and fix be_poll() instead Against -net. In the patch "netpoll: re-enable irq in poll_napi()", I tried to fix the following warning: [100718.051041] ------------[ cut here ]------------ [100718.051048] WARNING: at kernel/softirq.c:159 local_bh_enable_ip+0x7d/0xb0() (Not tainted) [100718.051049] Hardware name: ProLiant BL460c G7 ... [100718.051068] Call Trace: [100718.051073] [<ffffffff8106b747>] ? warn_slowpath_common+0x87/0xc0 [100718.051075] [<ffffffff8106b79a>] ? warn_slowpath_null+0x1a/0x20 [100718.051077] [<ffffffff810747ed>] ? local_bh_enable_ip+0x7d/0xb0 [100718.051080] [<ffffffff8150041b>] ? _spin_unlock_bh+0x1b/0x20 [100718.051085] [<ffffffffa00ee974>] ? be_process_mcc+0x74/0x230 [be2net] [100718.051088] [<ffffffffa00ea68c>] ? be_poll_tx_mcc+0x16c/0x290 [be2net] [100718.051090] [<ffffffff8144fe76>] ? netpoll_poll_dev+0xd6/0x490 [100718.051095] [<ffffffffa01d24a5>] ? bond_poll_controller+0x75/0x80 [bonding] [100718.051097] [<ffffffff8144fde5>] ? netpoll_poll_dev+0x45/0x490 [100718.051100] [<ffffffff81161b19>] ? ksize+0x19/0x80 [100718.051102] [<ffffffff81450437>] ? netpoll_send_skb_on_dev+0x157/0x240 by reenabling IRQ before calling ->poll, but it seems more problems are introduced after that patch: http://ozlabs.org/~akpm/stuff/IMG_20120824_122054.jpg http://marc.info/?l=linux-netdev&m=134563282530588&w=2 So it is safe to fix be2net driver code directly. This patch reverts the offending commit and fixes be_poll() by avoid disabling BH there, this is okay because be_poll() can be called either by poll_napi() which already disables IRQ, or by net_rx_action() which already disables BH. Reported-by: Andrew Morton <akpm@linux-foundation.org> Reported-by: Sylvain Munaut <s.munaut@whatever-company.com> Cc: Sylvain Munaut <s.munaut@whatever-company.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Miller <davem@davemloft.net> Cc: Sathya Perla <sathya.perla@emulex.com> Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com> Cc: Ajit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: Cong Wang <amwang@redhat.com> Tested-by: Sylvain Munaut <s.munaut@whatever-company.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-29 15:03:23 -04:00
Vinicius Costa Gomes	d8343f1257	Bluetooth: Fix sending a HCI Authorization Request over LE links In the case that the link is already in the connected state and a Pairing request arrives from the mgmt interface, hci_conn_security() would be called but it was not considering LE links. Reported-by: João Paulo Rechi Vita <jprvita@openbossa.org> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-08-27 08:11:51 -07:00
Vinicius Costa Gomes	cc110922da	Bluetooth: Change signature of smp_conn_security() To make it clear that it may be called from contexts that may not have any knowledge of L2CAP, we change the connection parameter, to receive a hci_conn. This also makes it clear that it is checking the security of the link. Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-08-27 08:07:18 -07:00
Patrick McHardy	5f2d04f1f9	ipv4: fix path MTU discovery with connection tracking IPv4 conntrack defragments incoming packet at the PRE_ROUTING hook and (in case of forwarded packets) refragments them at POST_ROUTING independent of the IP_DF flag. Refragmentation uses the dst_mtu() of the local route without caring about the original fragment sizes, thereby breaking PMTUD. This patch fixes this by keeping track of the largest received fragment with IP_DF set and generates an ICMP fragmentation required error during refragmentation if that size exceeds the MTU. Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: David S. Miller <davem@davemloft.net>	2012-08-26 19:13:55 +02:00
Linus Torvalds	8497ae61d0	Merge branch 'for-3.6' of git://linux-nfs.org/~bfields/linux Pull nfsd bugfixes from J. Bruce Fields: "Particular thanks to Michael Tokarev, Malahal Naineni, and Jamie Heilman for their testing and debugging help." * 'for-3.6' of git://linux-nfs.org/~bfields/linux: svcrpc: fix svc_xprt_enqueue/svc_recv busy-looping svcrpc: sends on closed socket should stop immediately svcrpc: fix BUG() in svc_tcp_clear_pages nfsd4: fix security flavor of NFSv4.0 callback	2012-08-25 11:43:41 -07:00
David S. Miller	e6acb38480	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace This is an initial merge in of Eric Biederman's work to start adding user namespace support to the networking. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-24 18:54:37 -04:00
David S. Miller	85c21049fc	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== This is a batch of updates intended for 3.7. The bulk of it is mac80211 changes, including some mesh work from Thomas Pederson and some multi-channel work from Johannes. A variety of driver updates and other bits are scattered in there as well. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-24 15:18:07 -04:00
David S. Miller	d05cebb915	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== This batch of fixes is intended for 3.6... Johannes Berg gives us a pair of iwlwifi fixes. One corrects some improperly defined ifdefs that lead to crashes and BUG_ONs. The other prevents attempts to read SRAM for devices that aren't actually started. Julia Lawall provides an ipw2100 fix to properly set the return code from a function call before testing it! :-) Thomas Huehn corrects the improper use of a constant related to a power setting in ath5k. Thomas Pedersen offers a mac80211 fix to properly handle destination addresses of unicast frames passing though a mesh gate. Vladimir Zapolskiy provides a brcmsmac fix to properly mark the interface state when the device goes down. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-24 15:15:10 -04:00
Yuchung Cheng	7c4a56fec3	tcp: fix cwnd reduction for non-sack recovery The cwnd reduction in fast recovery is based on the number of packets newly delivered per ACK. For non-sack connections every DUPACK signifies a packet has been delivered, but the sender mistakenly skips counting them for cwnd reduction. The fix is to compute newly_acked_sacked after DUPACKs are accounted in sacked_out for non-sack connections. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Nandita Dukkipati <nanditad@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-24 13:48:58 -04:00
Jiri Pirko	9b361c13ce	vlan: add helper which can be called to see if device is used by vlan also, remove unused vlan_info definition from header CC: Patrick McHardy <kaber@trash.net> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-24 13:46:39 -04:00
Pablo Neira Ayuso	20e1db19db	netlink: fix possible spoofing from non-root processes Non-root user-space processes can send Netlink messages to other processes that are well-known for being subscribed to Netlink asynchronous notifications. This allows ilegitimate non-root process to send forged messages to Netlink subscribers. The userspace process usually verifies the legitimate origin in two ways: a) Socket credentials. If UID != 0, then the message comes from some ilegitimate process and the message needs to be dropped. b) Netlink portID. In general, portID == 0 means that the origin of the messages comes from the kernel. Thus, discarding any message not coming from the kernel. However, ctnetlink sets the portID in event messages that has been triggered by some user-space process, eg. conntrack utility. So other processes subscribed to ctnetlink events, eg. conntrackd, know that the event was triggered by some user-space action. Neither of the two ways to discard ilegitimate messages coming from non-root processes can help for ctnetlink. This patch adds capability validation in case that dst_pid is set in netlink_sendmsg(). This approach is aggressive since existing applications using any Netlink bus to deliver messages between two user-space processes will break. Note that the exception is NETLINK_USERSOCK, since it is reserved for netlink-to-netlink userspace communication. Still, if anyone wants that his Netlink bus allows netlink-to-netlink userspace, then they can set NL_NONROOT_SEND. However, by default, I don't think it makes sense to allow to use NETLINK_ROUTE to communicate two processes that are sending no matter what information that is not related to link/neighbouring/routing. They should be using NETLINK_USERSOCK instead for that. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-24 13:36:09 -04:00
Ben Hutchings	8f4cccbbd9	net: Set device operstate at registration time The operstate of a device is initially IF_OPER_UNKNOWN and is updated asynchronously by linkwatch after each change of carrier state reported by the driver. The default carrier state of a net device is on, and this will never be changed on drivers that do not support carrier detection, thus the operstate remains IF_OPER_UNKNOWN. For devices that do support carrier detection, the driver must set the carrier state to off initially, then poll the hardware state when the device is opened. However, we must not activate linkwatch for a unregistered device, and commit `b473001` ('net: Do not fire linkwatch events until the device is registered.') ensured that we don't. But this means that the operstate for many devices that support carrier detection remains IF_OPER_UNKNOWN when it should be IF_OPER_DOWN. The same issue exists with the dormant state. The proper initialisation sequence, avoiding a race with opening of the device, is: rtnl_lock(); rc = register_netdevice(dev); if (rc) goto out_unlock; netif_carrier_off(dev); /* or netif_dormant_on(dev) */ rtnl_unlock(); but it seems silly that this should have to be repeated in so many drivers. Further, the operstate seen immediately after opening the device may still be IF_OPER_UNKNOWN due to the asynchronous nature of linkwatch. Commit `22604c8` ('net: Fix for initial link state in 2.6.28') attempted to fix this by setting the operstate synchronously, but it was reverted as it could lead to deadlock. This initialises the operstate synchronously at registration time only. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-24 12:46:13 -04:00
Neil Horman	3afa6d00fb	cls_cgroup: Allow classifier cgroups to have their classid reset to 0 The network classifier cgroup initalizes each cgroups instance classid value to 0. However, the sock_update_classid function only updates classid's in sockets if the tasks cgroup classid is not zero, and if it differs from the current classid. The later check is to prevent cache line dirtying, but the former is detrimental, as it prevents resetting a classid for a cgroup to 0. While this is not a common action, it has administrative usefulness (if the admin wants to disable classification of a certain group temporarily for instance). Easy fix, just remove the zero check. Tested successfully by myself Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-24 12:41:17 -04:00
John W. Linville	f20b6213f1	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem	2012-08-24 12:25:30 -04:00
Eric Dumazet	78df76a065	ipv4: take rt_uncached_lock only if needed Multicast traffic allocates dst with DST_NOCACHE, but dst is not inserted into rt_uncached_list. This slowdown multicast workloads on SMP because rt_uncached_lock is contended. Change the test before taking the lock to actually check the dst was inserted into rt_uncached_list. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-24 11:47:48 -04:00
David S. Miller	e6e94e392f	Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge Antonio Quartulli says: ==================== Included changes: - a set of codestyle rearrangements/fixes - new feature to early detect new joining (mesh-unaware) clients - a minor fix for the gw-feature - substitution of shift operations with the BIT() macro - reorganization of the main batman-adv structure (struct batadv_priv) - some more (very) minor cleanups and fixes =================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-24 11:30:50 -04:00
John W. Linville	e72615f6ab	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2012-08-24 11:16:58 -04:00
Fengguang Wu	a0dfb2634e	af_packet: match_fanout_group() can be static cc: Eric Leblond <eric@regit.org> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-23 09:27:12 -07:00
Eric Dumazet	748e2d9396	net: reinstate rtnl in call_netdevice_notifiers() Eric Biederman pointed out that not holding RTNL while calling call_netdevice_notifiers() was racy. This patch is a direct transcription his feedback against commit `0115e8e30d` (net: remove delay at device dismantle) Thanks Eric ! Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Mahesh Bandewar <maheshb@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Gao feng <gaofeng@cn.fujitsu.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-23 09:24:42 -07:00
John W. Linville	c6b6eedc29	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211	2012-08-23 09:51:15 -04:00
John W. Linville	a4881cc45a	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next	2012-08-23 09:49:42 -04:00
Sven Eckelmann	fa4f0afcf4	batman-adv: Start new development cycle Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:20:23 +02:00
Antonio Quartulli	371351731e	batman-adv: change interface_rx to get orig node In order to understand where a broadcast packet is coming from and use this information to detect not yet announced clients, this patch modifies the interface_rx() function by passing a new argument: the orig node corresponding to the node that originated the received packet (if known). This new argument if not NULL for broadcast packets only (other packets does not have source field). Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:20:22 +02:00
Antonio Quartulli	30cfd02b60	batman-adv: detect not yet announced clients With the current TT mechanism a new client joining the network is not immediately able to communicate with other hosts because its MAC address has not been announced yet. This situation holds until the first OGM containing its joining event will be spread over the mesh network. This behaviour can be acceptable in networks where the originator interval is a small value (e.g. 1sec) but if that value is set to an higher time (e.g. 5secs) the client could suffer from several malfunctions like DHCP client timeouts, etc. This patch adds an early detection mechanism that makes nodes in the network able to recognise "not yet announced clients" by means of the broadcast packets they emitted on connection (e.g. ARP or DHCP request). The added client will then be confirmed upon receiving the OGM claiming it or purged if such OGM is not received within a fixed amount of time. Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:20:22 +02:00
Sven Eckelmann	c67893d17a	batman-adv: Reduce accumulated length of simple statements Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:20:21 +02:00
Sven Eckelmann	bbb1f90efb	batman-adv: Don't break statements after assignment operator Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:20:20 +02:00
Sven Eckelmann	8de47de575	batman-adv: Use BIT(x) macro to calculate bit positions Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:20:19 +02:00
Martin Hundebøll	74ee3634dc	batman-adv: Drop tt queries with foreign dest When enabling promiscuous mode, tt queries for other hosts might be received. Before this patch, "foreign" tt queries were processed like any other query and thus forwarded to its destination again and thereby causing a loop. This patch adds a check to drop foreign tt queries. Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:20:19 +02:00
Martin Hundebøll	ff51fd70ad	batman-adv: Move batadv_check_unicast_packet() batadv_check_unicast_packet() is needed in batadv_recv_tt_query(), so move the former to before the latter. Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:20:18 +02:00
Sven Eckelmann	807736f6e0	batman-adv: Split batadv_priv in sub-structures for features The structure batadv_priv grows everytime a new feature is introduced. It gets hard to find the parts of the struct that belongs to a specific feature. This becomes even harder by the fact that not every feature uses a prefix in the member name. The variables for bridge loop avoidence, gateway handling, translation table and visualization server are moved into separate structs that are included in the bat_priv main struct. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:20:13 +02:00
Simon Wunderlich	624463079e	batman-adv: check batadv_orig_hash_add_if() return code If this call fails, some of the orig_nodes spaces may have been resized for the increased number of interface, and some may not. If we would just continue with the larger number of interfaces, this would lead to access to not allocated memory later. We better check the return code, and don't add the interface if no memory is available. OTOH, keeping some of the orig_nodes with too much memory allocated should hurt no one (except for a few too many bytes allocated). Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:02:46 +02:00
Antonio Quartulli	a51fb9b2ac	batman-adv: fix typos in comments the word millisecond is misspelled in several comments. This patch fixes it. Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:02:45 +02:00
Antonio Quartulli	d657e621a0	batman-adv: add reference counting for type batadv_tt_orig_list_entry The batadv_tt_orig_list_entry structure didn't have any refcounting mechanism so far. This patch introduces it and makes the structure being usable in much more complex context. Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:02:44 +02:00
Jonathan Corbet	3a7f291bf6	batman-adv: remove a misleading comment As much as I'm happy to see LWN links sprinkled through the kernel by the dozen, this one in particular reflects a very old state of reality; the associated comment is now incorrect. So just delete it. Signed-off-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:02:44 +02:00
Marek Lindner	1c9b0550f4	batman-adv: convert remaining packet counters to per_cpu_ptr() infrastructure Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Acked-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:02:43 +02:00
Simon Wunderlich	3eb8773e3a	batman-adv: rename bridge loop avoidance claim types for consistency reasons within the code and with the documentation, we should always call it "claim" and "unclaim". Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:02:42 +02:00
Simon Wunderlich	99e966fc96	batman-adv: correct comments in bridge loop avoidance Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:02:41 +02:00
Simon Wunderlich	536a23f119	batman-adv: Add the backbone gateway list to debugfs This is especially useful if there are no claims yet, but we still want to know which gateways are using bridge loop avoidance in the network. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:02:41 +02:00
Antonio Quartulli	c70437289c	batman-adv: move function arguments on one line Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:02:40 +02:00
Pavel Emelyanov	0fa7fa98db	packet: Protect packet sk list with mutex (v2) Change since v1: * Fixed inuse counters access spotted by Eric In patch `eea68e2f` (packet: Report socket mclist info via diag module) I've introduced a "scheduling in atomic" problem in packet diag module -- the socket list is traversed under rcu_read_lock() while performed under it sk mclist access requires rtnl lock (i.e. -- mutex) to be taken. [152363.820563] BUG: scheduling while atomic: crtools/12517/0x10000002 [152363.820573] 4 locks held by crtools/12517: [152363.820581] #0: (sock_diag_mutex){+.+.+.}, at: [<ffffffff81a2dcb5>] sock_diag_rcv+0x1f/0x3e [152363.820613] #1: (sock_diag_table_mutex){+.+.+.}, at: [<ffffffff81a2de70>] sock_diag_rcv_msg+0xdb/0x11a [152363.820644] #2: (nlk->cb_mutex){+.+.+.}, at: [<ffffffff81a67d01>] netlink_dump+0x23/0x1ab [152363.820693] #3: (rcu_read_lock){.+.+..}, at: [<ffffffff81b6a049>] packet_diag_dump+0x0/0x1af Similar thing was then re-introduced by further packet diag patches (fanount mutex and pgvec mutex for rings) :( Apart from being terribly sorry for the above, I propose to change the packet sk list protection from spinlock to mutex. This lock currently protects two modifications: * sklist * prot inuse counters The sklist modifications can be just reprotected with mutex since they already occur in a sleeping context. The inuse counters modifications are trickier -- the __this_cpu_-s are used inside, thus requiring the caller to handle the potential issues with contexts himself. Since packet sockets' counters are modified in two places only (packet_create and packet_release) we only need to protect the context from being preempted. BH disabling is not required in this case. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-22 22:58:27 -07:00
danborkmann@iogearbox.net	9e67030af3	af_packet: use define instead of constant Instead of using a hard-coded value for the status variable, it would make the code more readable to use its destined define from linux/if_packet.h. Signed-off-by: daniel.borkmann@tik.ee.ethz.ch Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-22 22:58:27 -07:00
Ying Xue	bfdc587c5a	rds: Don't disable BH on BH context Since we have already in BH context when _write_space(), _data_ready() as well as *_state_change() are called, it's unnecessary to disable BH. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-22 22:52:04 -07:00
Eric Dumazet	b87fb39e39	ipv6: gre: fix ip6gre_err() ip6gre_err() miscomputes grehlen (sizeof(ipv6h) is 4 or 8, not 40 as expected), and should take into account 'offset' parameter. Also uses pskb_may_pull() to cope with some fragged skbs Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Dmitry Kozlov <xeb@mail.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-22 22:48:32 -07:00
Eric Dumazet	ef8531b64c	xfrm: fix RCU bugs This patch reverts commit `56892261ed` (xfrm: Use rcu_dereference_bh to deference pointer protected by rcu_read_lock_bh), and fixes bugs introduced in commit `418a99ac6a` ( Replace rwlock on xfrm_policy_afinfo with rcu ) 1) We properly use RCU variant in this file, not a mix of RCU/RCU_BH 2) We must defer some writes after the synchronize_rcu() call or a reader can crash dereferencing NULL pointer. 3) Now we use the xfrm_policy_afinfo_lock spinlock only from process context, we no longer need to block BH in xfrm_policy_register_afinfo() and xfrm_policy_unregister_afinfo() 4) Can use RCU_INIT_POINTER() instead of rcu_assign_pointer() in xfrm_policy_unregister_afinfo() 5) Remove a forward inline declaration (xfrm_policy_put_afinfo()), and also move xfrm_policy_get_afinfo() declaration. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Fan Du <fan.du@windriver.com> Cc: Priyanka Jain <Priyanka.Jain@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-22 22:39:46 -07:00
Eric Dumazet	0115e8e30d	net: remove delay at device dismantle I noticed extra one second delay in device dismantle, tracked down to a call to dst_dev_event() while some call_rcu() are still in RCU queues. These call_rcu() were posted by rt_free(struct rtable *rt) calls. We then wait a little (but one second) in netdev_wait_allrefs() before kicking again NETDEV_UNREGISTER. As the call_rcu() are now completed, dst_dev_event() can do the needed device swap on busy dst. To solve this problem, add a new NETDEV_UNREGISTER_FINAL, called after a rcu_barrier(), but outside of RTNL lock. Use NETDEV_UNREGISTER_FINAL with care ! Change dst_dev_event() handler to react to NETDEV_UNREGISTER_FINAL Also remove NETDEV_UNREGISTER_BATCH, as its not used anymore after IP cache removal. With help from Gao feng Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Mahesh Bandewar <maheshb@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-22 21:50:36 -07:00
Eric Dumazet	9b04f35005	ipv4: properly update pmtu Sylvain Munault reported following info : - TCP connection get "stuck" with data in send queue when doing "large" transfers ( like typing 'ps ax' on a ssh connection ) - Only happens on path where the PMTU is lower than the MTU of the interface - Is not present right after boot, it only appears 10-20min after boot or so. (and that's inside the _same_ TCP connection, it works fine at first and then in the same ssh session, it'll get stuck) - Definitely seems related to fragments somehow since I see a router sending ICMP message saying fragmentation is needed. - Exact same setup works fine with kernel 3.5.1 Problem happens when the 10 minutes (ip_rt_mtu_expires) expiration period is over. ip_rt_update_pmtu() calls dst_set_expires() to rearm a new expiration, but dst_set_expires() does nothing because dst.expires is already set. It seems we want to set the expires field to a new value, regardless of prior one. With help from Julian Anastasov. Reported-by: Sylvain Munaut <s.munaut@whatever-company.com> Signed-off-by: Eric Dumazet <edumazet@google.com> CC: Julian Anastasov <ja@ssi.bg> Tested-by: Sylvain Munaut <s.munaut@whatever-company.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-22 19:14:30 -07:00
David S. Miller	bf277b0cce	Merge git://1984.lsi.us.es/nf-next Pablo Neira Ayuso says: ==================== This is the first batch of Netfilter and IPVS updates for your net-next tree. Mostly cleanups for the Netfilter side. They are: * Remove unnecessary RTNL locking now that we have support for namespace in nf_conntrack, from Patrick McHardy. * Cleanup to eliminate unnecessary goto in the initialization path of several Netfilter tables, from Jean Sacren. * Another cleanup from Wu Fengguang, this time to PTR_RET instead of if IS_ERR then return PTR_ERR. * Use list_for_each_entry_continue_rcu in nf_iterate, from Michael Wang. * Add pmtu_disc sysctl option to disable PMTU in their tunneling transmitter, from Julian Anastasov. * Generalize application protocol registration in IPVS and modify IPVS FTP helper to use it, from Julian Anastasov. * update Kconfig. The IPVS FTP helper depends on the Netfilter FTP helper for NAT support, from Julian Anastasov. * Add logic to update PMTU for IPIP packets in IPVS, again from Julian Anastasov. * A couple of sparse warning fixes for IPVS and Netfilter from Claudiu Ghioc and Patrick McHardy respectively. Patrick's IPv6 NAT changes will follow after this batch, I need to flush this batch first before refreshing my tree. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-22 18:48:52 -07:00
Pravin B Shelar	46df7b8145	openvswitch: Add support for network namespaces. Following patch adds support for network namespace to openvswitch. Since it must release devices when namespaces are destroyed, a side effect of this patch is that the module no longer keeps a refcount but instead cleans up any state when it is unloaded. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>	2012-08-22 14:48:55 -07:00
David S. Miller	1304a7343b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2012-08-22 14:21:38 -07:00
Jean Sacren	90efbed18a	netfilter: remove unnecessary goto statement for error recovery Usually it's a good practice to use goto statement for error recovery when initializing the module. This approach could be an overkill if: 1) there is only one fail case; 2) success and failure use the same return statement. For a cleaner approach, remove the unnecessary goto statement and directly implement error recovery. Signed-off-by: Jean Sacren <sakiwit@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-08-22 19:17:38 +02:00
Michael Wang	6705e86724	netfilter: replace list_for_each_continue_rcu with new interface This patch replaces list_for_each_continue_rcu() with list_for_each_entry_continue_rcu() to allow removing list_for_each_continue_rcu(). Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-08-22 19:17:20 +02:00
Linus Torvalds	f753c4ec15	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull ceph fixes from Sage Weil: "Jim's fix closes a narrow race introduced with the msgr changes. One fix resolves problems with debugfs initialization that Yan found when multiple client instances are created (e.g., two clusters mounted, or rbd + cephfs), another one fixes problems with mounting a nonexistent server subdirectory, and the last one fixes a divide by zero error from unsanitized ioctl input that Dan Carpenter found." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: avoid divide by zero in __validate_layout() libceph: avoid truncation due to racing banners ceph: tolerate (and warn on) extraneous dentry from mds libceph: delay debugfs initialization until we learn global_id	2012-08-22 09:58:05 -07:00
Sujith Manoharan	aba4e6fff8	mac80211: Fix AP mode regression Commit mac80211: avoid using synchronize_rcu in ieee80211_set_probe_resp changed the return value when the probe response template is not present. Revert to the earlier value of 1 - this fixes AP mode for drivers like ath9k. Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-22 10:55:04 +02:00
Thomas Pedersen	27f011243a	mac80211: fix DS to MBSS address translation The destination address of unicast frames forwarded through a mesh gate was being replaced with the broadcast address. Instead leave the original destination address as the mesh DA. If the nexthop address is not in the mpath table it will be resolved. If that fails, the frame will be forwarded to known mesh gates. Reported-by: Cedric Voncken <cedric.voncken@acksys.fr> Signed-off-by: Thomas Pedersen <thomas@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-22 09:45:05 +02:00
Jim Schutt	6d4221b537	libceph: avoid truncation due to racing banners Because the Ceph client messenger uses a non-blocking connect, it is possible for the sending of the client banner to race with the arrival of the banner sent by the peer. When ceph_sock_state_change() notices the connect has completed, it schedules work to process the socket via con_work(). During this time the peer is writing its banner, and arrival of the peer banner races with con_work(). If con_work() calls try_read() before the peer banner arrives, there is nothing for it to do, after which con_work() calls try_write() to send the client's banner. In this case Ceph's protocol negotiation can complete succesfully. The server-side messenger immediately sends its banner and addresses after accepting a connect request, before actually attempting to read or verify the banner from the client. As a result, it is possible for the banner from the server to arrive before con_work() calls try_read(). If that happens, try_read() will read the banner and prepare protocol negotiation info via prepare_write_connect(). prepare_write_connect() calls con_out_kvec_reset(), which discards the as-yet-unsent client banner. Next, con_work() calls try_write(), which sends the protocol negotiation info rather than the banner that the peer is expecting. The result is that the peer sees an invalid banner, and the client reports "negotiation failed". Fix this by moving con_out_kvec_reset() out of prepare_write_connect() to its callers at all locations except the one where the banner might still need to be sent. [elder@inktak.com: added note about server-side behavior] Signed-off-by: Jim Schutt <jaschut@sandia.gov> Reviewed-by: Alex Elder <elder@inktank.com>	2012-08-21 15:55:27 -07:00
Eric Dumazet	e0e3cea46d	af_netlink: force credentials passing [CVE-2012-3520] Pablo Neira Ayuso discovered that avahi and potentially NetworkManager accept spoofed Netlink messages because of a kernel bug. The kernel passes all-zero SCM_CREDENTIALS ancillary data to the receiver if the sender did not provide such data, instead of not including any such data at all or including the correct data from the peer (as it is the case with AF_UNIX). This bug was introduced in commit `16e5726269` (af_unix: dont send SCM_CREDENTIALS by default) This patch forces passing credentials for netlink, as before the regression. Another fix would be to not add SCM_CREDENTIALS in netlink messages if not provided by the sender, but it might break some programs. With help from Florian Weimer & Petr Matousek This issue is designated as CVE-2012-3520 Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Petr Matousek <pmatouse@redhat.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-21 14:53:01 -07:00
Eric Dumazet	a9915a1b52	ipv4: fix ip header ident selection in __ip_make_skb() Christian Casteyde reported a kmemcheck 32-bit read from uninitialized memory in __ip_select_ident(). It turns out that __ip_make_skb() called ip_select_ident() before properly initializing iph->daddr. This is a bug uncovered by commit `1d861aa4b3` (inet: Minimize use of cached route inetpeer.) Addresses https://bugzilla.kernel.org/show_bug.cgi?id=46131 Reported-by: Christian Casteyde <casteyde.christian@free.fr> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-21 14:51:06 -07:00
Christoph Paasch	1a7b27c97c	ipv4: Use newinet->inet_opt in inet_csk_route_child_sock() Since `0e73441992` ("ipv4: Use inet_csk_route_child_sock() in DCCP and TCP."), inet_csk_route_child_sock() is called instead of inet_csk_route_req(). However, after creating the child-sock in tcp/dccp_v4_syn_recv_sock(), ireq->opt is set to NULL, before calling inet_csk_route_child_sock(). Thus, inside inet_csk_route_child_sock() opt is always NULL and the SRR-options are not respected anymore. Packets sent by the server won't have the correct destination-IP. This patch fixes it by accessing newinet->inet_opt instead of ireq->opt inside inet_csk_route_child_sock(). Reported-by: Luca Boccassi <luca.boccassi@gmail.com> Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-21 14:49:11 -07:00
Eric Dumazet	144d56e910	tcp: fix possible socket refcount problem Commit `6f458dfb40` (tcp: improve latencies of timer triggered events) added bug leading to following trace : [ 2866.131281] IPv4: Attempt to release TCP socket in state 1 ffff880019ec0000 [ 2866.131726] [ 2866.132188] ========================= [ 2866.132281] [ BUG: held lock freed! ] [ 2866.132281] 3.6.0-rc1+ #622 Not tainted [ 2866.132281] ------------------------- [ 2866.132281] kworker/0:1/652 is freeing memory ffff880019ec0000-ffff880019ec0a1f, with a lock still held there! [ 2866.132281] (sk_lock-AF_INET-RPC){+.+...}, at: [<ffffffff81903619>] tcp_sendmsg+0x29/0xcc6 [ 2866.132281] 4 locks held by kworker/0:1/652: [ 2866.132281] #0: (rpciod){.+.+.+}, at: [<ffffffff81083567>] process_one_work+0x1de/0x47f [ 2866.132281] #1: ((&task->u.tk_work)){+.+.+.}, at: [<ffffffff81083567>] process_one_work+0x1de/0x47f [ 2866.132281] #2: (sk_lock-AF_INET-RPC){+.+...}, at: [<ffffffff81903619>] tcp_sendmsg+0x29/0xcc6 [ 2866.132281] #3: (&icsk->icsk_retransmit_timer){+.-...}, at: [<ffffffff81078017>] run_timer_softirq+0x1ad/0x35f [ 2866.132281] [ 2866.132281] stack backtrace: [ 2866.132281] Pid: 652, comm: kworker/0:1 Not tainted 3.6.0-rc1+ #622 [ 2866.132281] Call Trace: [ 2866.132281] <IRQ> [<ffffffff810bc527>] debug_check_no_locks_freed+0x112/0x159 [ 2866.132281] [<ffffffff818a0839>] ? __sk_free+0xfd/0x114 [ 2866.132281] [<ffffffff811549fa>] kmem_cache_free+0x6b/0x13a [ 2866.132281] [<ffffffff818a0839>] __sk_free+0xfd/0x114 [ 2866.132281] [<ffffffff818a08c0>] sk_free+0x1c/0x1e [ 2866.132281] [<ffffffff81911e1c>] tcp_write_timer+0x51/0x56 [ 2866.132281] [<ffffffff81078082>] run_timer_softirq+0x218/0x35f [ 2866.132281] [<ffffffff81078017>] ? run_timer_softirq+0x1ad/0x35f [ 2866.132281] [<ffffffff810f5831>] ? rb_commit+0x58/0x85 [ 2866.132281] [<ffffffff81911dcb>] ? tcp_write_timer_handler+0x148/0x148 [ 2866.132281] [<ffffffff81070bd6>] __do_softirq+0xcb/0x1f9 [ 2866.132281] [<ffffffff81a0a00c>] ? _raw_spin_unlock+0x29/0x2e [ 2866.132281] [<ffffffff81a1227c>] call_softirq+0x1c/0x30 [ 2866.132281] [<ffffffff81039f38>] do_softirq+0x4a/0xa6 [ 2866.132281] [<ffffffff81070f2b>] irq_exit+0x51/0xad [ 2866.132281] [<ffffffff81a129cd>] do_IRQ+0x9d/0xb4 [ 2866.132281] [<ffffffff81a0a3ef>] common_interrupt+0x6f/0x6f [ 2866.132281] <EOI> [<ffffffff8109d006>] ? sched_clock_cpu+0x58/0xd1 [ 2866.132281] [<ffffffff81a0a172>] ? _raw_spin_unlock_irqrestore+0x4c/0x56 [ 2866.132281] [<ffffffff81078692>] mod_timer+0x178/0x1a9 [ 2866.132281] [<ffffffff818a00aa>] sk_reset_timer+0x19/0x26 [ 2866.132281] [<ffffffff8190b2cc>] tcp_rearm_rto+0x99/0xa4 [ 2866.132281] [<ffffffff8190dfba>] tcp_event_new_data_sent+0x6e/0x70 [ 2866.132281] [<ffffffff8190f7ea>] tcp_write_xmit+0x7de/0x8e4 [ 2866.132281] [<ffffffff818a565d>] ? __alloc_skb+0xa0/0x1a1 [ 2866.132281] [<ffffffff8190f952>] __tcp_push_pending_frames+0x2e/0x8a [ 2866.132281] [<ffffffff81904122>] tcp_sendmsg+0xb32/0xcc6 [ 2866.132281] [<ffffffff819229c2>] inet_sendmsg+0xaa/0xd5 [ 2866.132281] [<ffffffff81922918>] ? inet_autobind+0x5f/0x5f [ 2866.132281] [<ffffffff810ee7f1>] ? trace_clock_local+0x9/0xb [ 2866.132281] [<ffffffff8189adab>] sock_sendmsg+0xa3/0xc4 [ 2866.132281] [<ffffffff810f5de6>] ? rb_reserve_next_event+0x26f/0x2d5 [ 2866.132281] [<ffffffff8103e6a9>] ? native_sched_clock+0x29/0x6f [ 2866.132281] [<ffffffff8103e6f8>] ? sched_clock+0x9/0xd [ 2866.132281] [<ffffffff810ee7f1>] ? trace_clock_local+0x9/0xb [ 2866.132281] [<ffffffff8189ae03>] kernel_sendmsg+0x37/0x43 [ 2866.132281] [<ffffffff8199ce49>] xs_send_kvec+0x77/0x80 [ 2866.132281] [<ffffffff8199cec1>] xs_sendpages+0x6f/0x1a0 [ 2866.132281] [<ffffffff8107826d>] ? try_to_del_timer_sync+0x55/0x61 [ 2866.132281] [<ffffffff8199d0d2>] xs_tcp_send_request+0x55/0xf1 [ 2866.132281] [<ffffffff8199bb90>] xprt_transmit+0x89/0x1db [ 2866.132281] [<ffffffff81999bcd>] ? call_connect+0x3c/0x3c [ 2866.132281] [<ffffffff81999d92>] call_transmit+0x1c5/0x20e [ 2866.132281] [<ffffffff819a0d55>] __rpc_execute+0x6f/0x225 [ 2866.132281] [<ffffffff81999bcd>] ? call_connect+0x3c/0x3c [ 2866.132281] [<ffffffff819a0f33>] rpc_async_schedule+0x28/0x34 [ 2866.132281] [<ffffffff810835d6>] process_one_work+0x24d/0x47f [ 2866.132281] [<ffffffff81083567>] ? process_one_work+0x1de/0x47f [ 2866.132281] [<ffffffff819a0f0b>] ? __rpc_execute+0x225/0x225 [ 2866.132281] [<ffffffff81083a6d>] worker_thread+0x236/0x317 [ 2866.132281] [<ffffffff81083837>] ? process_scheduled_works+0x2f/0x2f [ 2866.132281] [<ffffffff8108b7b8>] kthread+0x9a/0xa2 [ 2866.132281] [<ffffffff81a12184>] kernel_thread_helper+0x4/0x10 [ 2866.132281] [<ffffffff81a0a4b0>] ? retint_restore_args+0x13/0x13 [ 2866.132281] [<ffffffff8108b71e>] ? __init_kthread_worker+0x5a/0x5a [ 2866.132281] [<ffffffff81a12180>] ? gs_change+0x13/0x13 [ 2866.308506] IPv4: Attempt to release TCP socket in state 1 ffff880019ec0000 [ 2866.309689] ============================================================================= [ 2866.310254] BUG TCP (Not tainted): Object already free [ 2866.310254] ----------------------------------------------------------------------------- [ 2866.310254] The bug comes from the fact that timer set in sk_reset_timer() can run before we actually do the sock_hold(). socket refcount reaches zero and we free the socket too soon. timer handler is not allowed to reduce socket refcnt if socket is owned by the user, or we need to change sk_reset_timer() implementation. We should take a reference on the socket in case TCP_DELACK_TIMER_DEFERRED or TCP_DELACK_TIMER_DEFERRED bit are set in tsq_flags Also fix a typo in tcp_delack_timer(), where TCP_WRITE_TIMER_DEFERRED was used instead of TCP_DELACK_TIMER_DEFERRED. For consistency, use same socket refcount change for TCP_MTU_REDUCED_DEFERRED, even if not fired from a timer. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Tested-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-21 14:42:23 -07:00
John W. Linville	01e17dacd4	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Conflicts: drivers/net/wireless/mac80211_hwsim.c	2012-08-21 16:00:21 -04:00
AceLan Kao	06d7de831d	Revert "rfkill: remove dead code" This reverts commit `2e48928d8a`. Those functions are needed and should not be removed, or there is no way to set the rfkill led trigger name. Signed-off-by: AceLan Kao <acelan.kao@canonical.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-21 20:50:25 +02:00
J. Bruce Fields	d10f27a750	svcrpc: fix svc_xprt_enqueue/svc_recv busy-looping The rpc server tries to ensure that there will be room to send a reply before it receives a request. It does this by tracking, in xpt_reserved, an upper bound on the total size of the replies that is has already committed to for the socket. Currently it is adding in the estimate for a new reply before it checks whether there is space available. If it finds that there is not space, it then subtracts the estimate back out. This may lead the subsequent svc_xprt_enqueue to decide that there is space after all. The results is a svc_recv() that will repeatedly return -EAGAIN, causing server threads to loop without doing any actual work. Cc: stable@vger.kernel.org Reported-by: Michael Tokarev <mjt@tls.msk.ru> Tested-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-08-20 18:39:19 -04:00
J. Bruce Fields	f06f00a24d	svcrpc: sends on closed socket should stop immediately svc_tcp_sendto sets XPT_CLOSE if we fail to transmit the entire reply. However, the XPT_CLOSE won't be acted on immediately. Meanwhile other threads could send further replies before the socket is really shut down. This can manifest as data corruption: for example, if a truncated read reply is followed by another rpc reply, that second reply will look to the client like further read data. Symptoms were data corruption preceded by svc_tcp_sendto logging something like kernel: rpc-srv/tcp: nfsd: sent only 963696 when sending 1048708 bytes - shutting down socket Cc: stable@vger.kernel.org Reported-by: Malahal Naineni <malahal@us.ibm.com> Tested-by: Malahal Naineni <malahal@us.ibm.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-08-20 18:38:59 -04:00
J. Bruce Fields	be1e44441a	svcrpc: fix BUG() in svc_tcp_clear_pages Examination of svc_tcp_clear_pages shows that it assumes sk_tcplen is consistent with sk_pages[] (in particular, sk_pages[n] can't be NULL if sk_tcplen would lead us to expect n pages of data). svc_tcp_restore_pages zeroes out sk_pages[] while leaving sk_tcplen. This is OK, since both functions are serialized by XPT_BUSY. However, that means the inconsistency must be repaired before dropping XPT_BUSY. Therefore we should be ensuring that svc_tcp_save_pages repairs the problem before exiting svc_tcp_recv_record on error. Symptoms were a BUG() in svc_tcp_clear_pages. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-08-20 18:38:44 -04:00
Sage Weil	d1c338a509	libceph: delay debugfs initialization until we learn global_id The debugfs directory includes the cluster fsid and our unique global_id. We need to delay the initialization of the debug entry until we have learned both the fsid and our global_id from the monitor or else the second client can't create its debugfs entry and will fail (and multiple client instances aren't properly reflected in debugfs). Reported by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>	2012-08-20 10:03:15 -07:00
Johannes Berg	dcf33963c4	mac80211: clean up ieee80211_subif_start_xmit There's no need to carry around a return value that is always NETDEV_TX_OK anyway. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 14:13:46 +02:00
Johannes Berg	fe94fe05e9	mac80211: pass channel to ieee80211_send_probe_req In multi-channel scenarios, the channel that we will transmit a probe request on isn't always the current channel (which will be NULL anyway) but will instead be the channel that the AP is on. Pass the channel to the ieee80211_send_probe_req() function so it can be used in the different scenarios. The scan code continues to pass the current channel, of course. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 14:13:45 +02:00
Johannes Berg	c0af07340a	mac80211: convert ops checks to WARN_ON There's no need to BUG_ON when a driver registers invalid operations, warn and return an error. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 14:13:44 +02:00
Johannes Berg	9b86487043	mac80211: check operating channel in scan The optimisation of scanning only on the current channel should check the operating channel. Also modify it to compare channel pointer rather than the frequency. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 14:13:43 +02:00
Johannes Berg	4797c7ba93	mac80211: use RX status band instead of current band Even for single-channel devices it is possible that we switch the channel temporarily (e.g. for scanning) but while doing so process a received frame that was still received on the old channel, so checking the current band is racy. Use the band from status instead. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 14:13:43 +02:00
Johannes Berg	3d01be72e6	mac80211: don't assume channel is set in tracing With the move to multi-channel and away from drv_config(), hw.conf.channel will not always be set, only for devices using the current API instead of the new channel context APIs. Check the channel is set before adding its frequency to the trace data. Also break some overly long lines in the code. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 14:13:42 +02:00
Johannes Berg	f9e6e95b63	mac80211: use oper_channel in rate init Using hw.conf.channel is wrong as it could be the temporary channel if the station is added from the workqueue while the device is already on another channel. Use oper_channel instead. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 14:13:41 +02:00
Johannes Berg	9e99a127b5	mac80211: remove freq/chantype from debugfs You can now get these values through iw, and they conflict with multi-channel work. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 14:13:40 +02:00
Johannes Berg	e31583cdf0	mac80211: remove almost unused local variable In ieee80211_beacon_get_tim() we can use the txrc.sband instead of a separate local variable. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 14:13:39 +02:00
Johannes Berg	466f310d10	mac80211: mesh: don't use global channel type Using local->_oper_channel_type in the mesh code is completely wrong as this value is the combination of the various interface channel types and can be a different value from the mesh interface in case there are multiple virtual interfaces. Use sdata->vif.bss_conf.channel_type instead as it tracks the per-vif channel type. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 14:13:38 +02:00
Johannes Berg	d348f69f59	mac80211: simplify buffers in aes_128_cmac_vector There's no need to use a single scratch buffer and calculate offsets into it, just use two separate buffers for the separate variables. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 14:03:18 +02:00
Johannes Berg	6d71117a27	mac80211: add IEEE80211_HW_P2P_DEV_ADDR_FOR_INTF Some devices like the current iwlwifi implementation require that the P2P interface address match the P2P Device address (only one P2P interface is supported.) Add the HW flag IEEE80211_HW_P2P_DEV_ADDR_FOR_INTF that allows drivers to request that P2P Interfaces added while a P2P Device is active get the same MAC address by default. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 13:58:23 +02:00
Johannes Berg	f142c6b906	mac80211: support P2P Device abstraction After cfg80211 got a P2P Device abstraction, add support to mac80211. Whether it really is supported or not will depend on whether or not the driver has support for it, but mac80211 needs to change to be able to support drivers that need a P2P Device. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 13:58:22 +02:00
Johannes Berg	98104fdeda	cfg80211: add P2P Device abstraction In order to support using a different MAC address for the P2P Device address we must first have a P2P Device abstraction that can be assigned a MAC address. This abstraction will also be useful to support offloading P2P operations to the device, e.g. periodic listen for discoverability. Currently, the driver is responsible for assigning a MAC address to the P2P Device, but this could be changed by allowing a MAC address to be given to the NEW_INTERFACE command. As it has no associated netdev, a P2P Device can only be identified by its wdev identifier but the previous patches allowed using the wdev identifier in various APIs, e.g. remain-on-channel. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 13:58:21 +02:00
Johannes Berg	cc74c0c7d6	mac80211: make ieee80211_beacon_connection_loss_work static There's no need to declare the function in the header file since it's only used in a single place, so make it static. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 13:57:50 +02:00
Johannes Berg	5bc1420b11	mac80211: check size of channel switch IE when parsing The channel switch IE has a fixed size, so we can discard it in parsing if it's not the right size and use the right struct pointer. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 13:57:50 +02:00
Johannes Berg	3049000b97	mac80211: fix CSA handling timer The time until the channel switch is in TU, not in milliseconds, so use TU_TO_EXP_TIME() to correctly program the timer. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 13:57:49 +02:00
Johannes Berg	57eebdf3c2	mac80211: clean up CSA handling code Clean up the CSA handling code by moving some of it out of the if and using a C99 initializer for the struct passed to the driver method. While at it, also add a comment that we should wait for a beacon after switching the channel. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 13:57:48 +02:00
Johannes Berg	90bcf867ce	mac80211: remove unneeded 'bssid' variable There's no need to copy the BSSID just to print it, remove the unnecessary variable. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 13:57:47 +02:00
Johannes Berg	4c29867790	mac80211: support A-MPDU status reporting Support getting A-MPDU status information from the drivers and reporting it to userspace via radiotap in the standard fields. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 13:53:22 +02:00
Johannes Berg	48613ece3d	wireless: add radiotap A-MPDU status field Define the A-MPDU status field in radiotap, also update the radiotap parser for it and the MCS field that was apparently missed last time. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 13:53:09 +02:00
Antonio Quartulli	e687f61eed	mac80211: add supported rates change notification in IBSS In IBSS it is possible that the supported rates set for a station changes over time (e.g. it gets first initialised as an empty set because of no available information about rates and updated later). In this case the driver has to be notified about the change in order to update its internal table accordingly (if needed). This behaviour is needed by all those drivers that handle rc internally but leave stations management to mac80211 Reported-by: Gui Iribarren <gui@altermundi.net> Signed-off-by: Antonio Quartulli <ordex@autistici.org> [Johannes - add docs, validate IBSS mode only, fix compilation] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 13:31:43 +02:00
Thomas Pedersen	4bd4c2dd8e	mac80211: clean up mpath_move_to_queue() Use skb_queue_walk_safe instead, and fix a few issues: - didn't free old skbs on moving - didn't react to failed skb alloc - needlessly held a local pointer to the destination frame queue - didn't check destination queue length before adding skb Signed-off-by: Thomas Pedersen <thomas@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 13:25:05 +02:00
Thomas Pedersen	b22bd5221c	mac80211: use skb_queue_walk() in mesh_path_assign_nexthop Since all we really want is just to iterate over all skbs, do just that and avoid (de)queueing to a clusmy tmpq. Signed-off-by: Thomas Pedersen <thomas@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 13:25:05 +02:00
Eyal Shapira	aa7a00809c	mac80211: avoid using synchronize_rcu in ieee80211_set_probe_resp This could take a while (100ms+) and may delay sending assoc resp in AP mode with WPS or P2P GO (as setting the probe resp takes place there). We've encountered situations where the delay was big enough to cause connection problems with devices like Galaxy Nexus. Switch to using call_rcu with a free handler. [Arik - rework to use plain buffer and instead of skb] Signed-off-by: Eyal Shapira <eyal@wizery.com> Signed-off-by: Arik Nemtsov <arik@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 13:20:56 +02:00
Patrick McHardy	2834a6386b	netfilter: nf_conntrack: remove unnecessary RTNL locking Locking the rtnl was added to nf_conntrack_l{3,4}_proto_unregister() for walking the network namespace list. This is not done anymore since we have proper namespace support in the protocols now, so we don't need to take the RTNL anymore. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-08-20 12:46:29 +02:00
Patrick McHardy	fe31d1a860	netfilter: sparse endian fixes Fix a couple of endian annotation in net/netfilter: net/netfilter/nfnetlink_acct.c:82:30: warning: cast to restricted __be64 net/netfilter/nfnetlink_acct.c:86:30: warning: cast to restricted __be64 net/netfilter/nfnetlink_cthelper.c:77:28: warning: cast to restricted __be16 net/netfilter/xt_NFQUEUE.c:46:16: warning: restricted __be32 degrades to integer net/netfilter/xt_NFQUEUE.c:60:34: warning: restricted __be32 degrades to integer net/netfilter/xt_NFQUEUE.c:68:34: warning: restricted __be32 degrades to integer net/netfilter/xt_osf.c:272:55: warning: cast to restricted __be16 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-08-20 12:45:57 +02:00
Patrick McHardy	2dba62c30e	netfilter: nfnetlink_log: fix NLA_PUT macro removal bug Commit `1db20a52` (nfnetlink_log: Stop using NLA_PUT*().) incorrectly converted a NLA_PUT_BE16 macro to nla_put_be32() in nfnetlink_log: - NLA_PUT_BE16(inst->skb, NFULA_HWTYPE, htons(skb->dev->type)); + if (nla_put_be32(inst->skb, NFULA_HWTYPE, htons(skb->dev->type)) Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-08-20 12:40:23 +02:00
Neal Cardwell	fae6ef87fa	net: tcp: move sk_rx_dst_set call after tcp_create_openreq_child() This commit removes the sk_rx_dst_set calls from tcp_create_openreq_child(), because at that point the icsk_af_ops field of ipv6_mapped TCP sockets has not been set to its proper final value. Instead, to make sure we get the right sk_rx_dst_set variant appropriate for the address family of the new connection, we have tcp_v{4,6}_syn_recv_sock() directly call the appropriate function shortly after the call to tcp_create_openreq_child() returns. This also moves inet6_sk_rx_dst_set() to avoid a forward declaration with the new approach. Signed-off-by: Neal Cardwell <ncardwell@google.com> Reported-by: Artem Savkov <artem.savkov@gmail.com> Cc: Eric Dumazet <edumazet@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-20 03:03:33 -07:00
Randy Dunlap	3de7a37b02	net/core/dev.c: fix kernel-doc warning Fix kernel-doc warning: Warning(net/core/dev.c:5745): No description found for parameter 'dev' Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-20 03:00:55 -07:00
Patrick McHardy	9d7b0fc1ef	net: ipv6: fix oops in inet_putpeer() Commit `97bab73f` (inet: Hide route peer accesses behind helpers.) introduced a bug in xfrm6_policy_destroy(). The xfrm_dst's _rt6i_peer member is not initialized, causing a false positive result from inetpeer_ptr_is_peer(), which in turn causes a NULL pointer dereference in inet_putpeer(). Pid: 314, comm: kworker/0:1 Not tainted 3.6.0-rc1+ #17 To Be Filled By O.E.M. To Be Filled By O.E.M./P4S800D-X EIP: 0060:[<c03abf93>] EFLAGS: 00010246 CPU: 0 EIP is at inet_putpeer+0xe/0x16 EAX: 00000000 EBX: f3481700 ECX: 00000000 EDX: 000dd641 ESI: f3481700 EDI: c05e949c EBP: f551def4 ESP: f551def4 DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068 CR0: 8005003b CR2: 00000070 CR3: 3243d000 CR4: 00000750 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 DR6: ffff0ff0 DR7: 00000400 f551df04 c0423de1 00000000 f3481700 f551df18 c038d5f7 f254b9f8 f551df28 f34f85d8 f551df20 c03ef48d f551df3c c0396870 f30697e8 f24e1738 c05e98f4 f5509540 c05cd2b4 f551df7c c0142d2b c043feb5 f5509540 00000000 c05cd2e8 [<c0423de1>] xfrm6_dst_destroy+0x42/0xdb [<c038d5f7>] dst_destroy+0x1d/0xa4 [<c03ef48d>] xfrm_bundle_flo_delete+0x2b/0x36 [<c0396870>] flow_cache_gc_task+0x85/0x9f [<c0142d2b>] process_one_work+0x122/0x441 [<c043feb5>] ? apic_timer_interrupt+0x31/0x38 [<c03967eb>] ? flow_cache_new_hashrnd+0x2b/0x2b [<c0143e2d>] worker_thread+0x113/0x3cc Fix by adding a init_dst() callback to struct xfrm_policy_afinfo to properly initialize the dst's peer pointer. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-20 02:56:56 -07:00
Jesper Juhl	d92c7f8aab	caif: Do not dereference NULL in chnl_recv_cb() In net/caif/chnl_net.c::chnl_recv_cb() we call skb_header_pointer() which may return NULL, but we do not check for a NULL pointer before dereferencing it. This patch adds such a NULL check and properly free's allocated memory and return an error (-EINVAL) on failure - much better than crashing.. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Acked-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-20 02:47:49 -07:00
David S. Miller	6c71bec66a	Merge git://1984.lsi.us.es/nf Pable Neira Ayuso says: ==================== The following five patches contain fixes for 3.6-rc, they are: * Two fixes for message parsing in the SIP conntrack helper, from Patrick McHardy. * One fix for the SIP helper introduced in the user-space cthelper infrastructure, from Patrick McHardy. * fix missing appropriate locking while modifying one conntrack entry from the nfqueue integration code, from myself. * fix possible access to uninitiliazed timer in the nf_conntrack expectation infrastructure, from myself. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-20 02:44:29 -07:00
Eric Leblond	c0de08d042	af_packet: don't emit packet on orig fanout group If a packet is emitted on one socket in one group of fanout sockets, it is transmitted again. It is thus read again on one of the sockets of the fanout group. This result in a loop for software which generate packets when receiving one. This retransmission is not the intended behavior: a fanout group must behave like a single socket. The packet should not be transmitted on a socket if it originates from a socket belonging to the same fanout group. This patch fixes the issue by changing the transmission check to take fanout group info account. Reported-by: Aleksandr Kotov <a1k@mail.ru> Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-20 02:37:29 -07:00
Ying Xue	e6a04b1d3f	tipc: eliminate configuration for maximum number of name publications Gets rid of the need for users to specify the maximum number of name publications supported by TIPC. TIPC now automatically provides support for the maximum number of name publications to 65535. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-20 02:26:31 -07:00
Ying Xue	34f256cc79	tipc: eliminate configuration for maximum number of name subscriptions Gets rid of the need for users to specify the maximum number of name subscriptions supported by TIPC. TIPC now automatically provides support for the maximum number of name subscriptions to 65535. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-20 02:26:31 -07:00
Ying Xue	61cdd4d80b	tipc: add __read_mostly annotations to several global variables Added to the following: - tipc_random - tipc_own_addr - tipc_max_ports - tipc_net_id - tipc_remote_management - handler_enabled The above global variables are read often, but written rarely. Use __read_mostly to prevent them being on the same cacheline as another variable which is written to often, which would cause cacheline bouncing. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-20 02:26:31 -07:00
Ying Xue	f046e7d9be	tipc: convert tipc_nametbl_size type from variable to macro There is nothing changing this variable dynamically, so change it to a macro to make that more obvious when reading the code. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-20 02:26:31 -07:00
Ying Xue	379c0456af	tipc: change tipc_net_start routine return value type Since now tipc_net_start() always returns a success code - 0, its return value type should be changed from integer to void, which can avoid unnecessary check for its return value. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-20 02:26:30 -07:00
Ying Xue	381294331e	tipc: manually inline single use media_name_valid routine After eliminating the mechanism which checks whether all letters in media name string are within a given character set, the media_name_valid routine becomes trivial. It is also only used once, so it is unnecessary to keep it as a separate function. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-20 02:26:30 -07:00
Ying Xue	fc0739385b	tipc: remove pointless name sanity check and tipc_alphabet array There is no real reason to check whether all letters in the given media name and network interface name are within the character set defined in tipc_alphabet array. Even if we eliminate the checking, the rest of checking conditions in tipc_enable_bearer() can ensure we do not enable an invalid or illegal bearer. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-20 02:26:30 -07:00
Ying Xue	4225a398c1	tipc: fix lockdep warning during bearer initialization When the lockdep validator is enabled, it will report the below warning when we enable a TIPC bearer: [ INFO: possible irq lock inversion dependency detected ] --------------------------------------------------------- Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(ptype_lock); local_irq_disable(); lock(tipc_net_lock); lock(ptype_lock); <Interrupt> lock(tipc_net_lock); * DEADLOCK * the shortest dependencies between 2nd lock and 1st lock: -> (ptype_lock){+.+...} ops: 10 { [...] SOFTIRQ-ON-W at: [<c1089418>] __lock_acquire+0x528/0x13e0 [<c108a360>] lock_acquire+0x90/0x100 [<c1553c38>] _raw_spin_lock+0x38/0x50 [<c14651ca>] dev_add_pack+0x3a/0x60 [<c182da75>] arp_init+0x1a/0x48 [<c182dce5>] inet_init+0x181/0x27e [<c1001114>] do_one_initcall+0x34/0x170 [<c17f7329>] kernel_init+0x110/0x1b2 [<c155b6a2>] kernel_thread_helper+0x6/0x10 [...] ... key at: [<c17e4b10>] ptype_lock+0x10/0x20 ... acquired at: [<c108a360>] lock_acquire+0x90/0x100 [<c1553c38>] _raw_spin_lock+0x38/0x50 [<c14651ca>] dev_add_pack+0x3a/0x60 [<c8bc18d2>] enable_bearer+0xf2/0x140 [tipc] [<c8bb283a>] tipc_enable_bearer+0x1ba/0x450 [tipc] [<c8bb3a04>] tipc_cfg_do_cmd+0x5c4/0x830 [tipc] [<c8bbc032>] handle_cmd+0x42/0xd0 [tipc] [<c148e802>] genl_rcv_msg+0x232/0x280 [<c148d3f6>] netlink_rcv_skb+0x86/0xb0 [<c148e5bc>] genl_rcv+0x1c/0x30 [<c148d144>] netlink_unicast+0x174/0x1f0 [<c148ddab>] netlink_sendmsg+0x1eb/0x2d0 [<c1456bc1>] sock_aio_write+0x161/0x170 [<c1135a7c>] do_sync_write+0xac/0xf0 [<c11360f6>] vfs_write+0x156/0x170 [<c11361e2>] sys_write+0x42/0x70 [<c155b0df>] sysenter_do_call+0x12/0x38 [...] } -> (tipc_net_lock){+..-..} ops: 4 { [...] IN-SOFTIRQ-R at: [<c108953a>] __lock_acquire+0x64a/0x13e0 [<c108a360>] lock_acquire+0x90/0x100 [<c15541cd>] _raw_read_lock_bh+0x3d/0x50 [<c8bb874d>] tipc_recv_msg+0x1d/0x830 [tipc] [<c8bc195f>] recv_msg+0x3f/0x50 [tipc] [<c146a5fa>] __netif_receive_skb+0x22a/0x590 [<c146ab0b>] netif_receive_skb+0x2b/0xf0 [<c13c43d2>] pcnet32_poll+0x292/0x780 [<c146b00a>] net_rx_action+0xfa/0x1e0 [<c103a4be>] __do_softirq+0xae/0x1e0 [...] } >From the log, we can see three different call chains between CPU0 and CPU1: Time 0 on CPU0: kernel_init()->inet_init()->dev_add_pack() At time 0, the ptype_lock is held by CPU0 in dev_add_pack(); Time 1 on CPU1: tipc_enable_bearer()->enable_bearer()->dev_add_pack() At time 1, tipc_enable_bearer() first holds tipc_net_lock, and then wants to take ptype_lock to register TIPC protocol handler into the networking stack. But the ptype_lock has been taken by dev_add_pack() on CPU0, so at this time the dev_add_pack() running on CPU1 has to be busy looping. Time 2 on CPU0: netif_receive_skb()->recv_msg()->tipc_recv_msg() At time 2, an incoming TIPC packet arrives at CPU0, hence tipc_recv_msg() will be invoked. In tipc_recv_msg(), it first wants to hold tipc_net_lock. At the moment, below scenario happens: On CPU0, below is our sequence of taking locks: lock(ptype_lock)->lock(tipc_net_lock) On CPU1, our sequence of taking locks looks like: lock(tipc_net_lock)->lock(ptype_lock) Obviously deadlock may happen in this case. But please note the deadlock possibly doesn't occur at all when the first TIPC bearer is enabled. Before enable_bearer() -- running on CPU1 does not hold ptype_lock, so the TIPC receive handler (i.e. recv_msg()) is not registered successfully via dev_add_pack(), so the tipc_recv_msg() cannot be called by recv_msg() even if a TIPC message comes to CPU0. But when the second TIPC bearer is registered, the deadlock can perhaps really happen. To fix it, we will push the work of registering TIPC protocol handler into workqueue context. After the change, both paths taking ptype_lock are always in process contexts, thus, the deadlock should never occur. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-20 02:26:30 -07:00
Ying Xue	fa7f86f1bb	tipc: optimize the initialization of network device notifier Ethernet media initialization is only done when TIPC is started or switched to network mode. So the initialization of the network device notifier structure can be moved out of this function and done statically instead. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-20 02:26:30 -07:00
Pavel Emelyanov	fff3321d75	packet: Report fanout status via diag engine Reported value is the same reported by the FANOUT getsockoption, but unlike it, the absent fanout setup results in absent nlattr, rather than in nlattr with zero value. This is done so, since zero fanout report may mean both -- no fanout, and fanout with both id and type zero. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-20 02:23:14 -07:00
Pavel Emelyanov	16f01365fa	packet: Report rings cfg via diag engine One extension bit may result in two nlattrs -- one per ring type. If some ring type is not configured, then the respective nlatts will be empty. The structure reported contains the data, that is given to the corresponding ring setup socket option. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-20 02:23:14 -07:00
Dan Carpenter	5ef5d6c569	gre: information leak in ip6_tnl_ioctl() There is a one byte hole between p->hop_limit and p->flowinfo where stack memory is leaked to the user. This was introduced in `c12b395a46` "gre: Support GRE over IPv6". Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>	2012-08-20 02:21:30 -07:00
Fan Du	56892261ed	xfrm: Use rcu_dereference_bh to deference pointer protected by rcu_read_lock_bh Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-16 19:39:22 -07:00
Dan Carpenter	898132ae76	ipv6: move dereference after check in fl_free() There is a dereference before checking for NULL bug here. Generally free() functions should accept NULL pointers. For example, fl_create() can pass a NULL pointer to fl_free() on the error path. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-16 16:04:42 -07:00
John Fastabend	476ad154f3	net: netprio: fix cgrp create and write priomap race A race exists where creating cgroups and also updating the priomap may result in losing a priomap update. This is because priomap writers are not protected by rtnl_lock. Move priority writer into rtnl_lock()/rtnl_unlock(). CC: Neil Horman <nhorman@tuxdriver.com> Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-16 14:56:11 -07:00
John Fastabend	48a87cc26c	net: netprio: fd passed in SCM_RIGHTS datagram not set correctly A socket fd passed in a SCM_RIGHTS datagram was not getting updated with the new tasks cgrp prioidx. This leaves IO on the socket tagged with the old tasks priority. To fix this add a check in the scm recvmsg path to update the sock cgrp prioidx with the new tasks value. Thanks to Al Viro for catching this. CC: Neil Horman <nhorman@tuxdriver.com> Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-16 14:56:11 -07:00
John Fastabend	f796c20cf6	net: netprio: fix files lock and remove useless d_path bits Add lock to prevent a race with a file closing and also remove useless and ugly sscanf code. The extra code was never needed and the case it supposedly protected against is in fact handled correctly by sock_from_file as pointed out by Al Viro. CC: Neil Horman <nhorman@tuxdriver.com> Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-16 14:56:11 -07:00
Jason Wang	16c0b164bd	act_mirred: do not drop packets when fails to mirror it We drop packet unconditionally when we fail to mirror it. This is not intended in some cases. Consdier for kvm guest, we may mirror the traffic of the bridge to a tap device used by a VM. When kernel fails to mirror the packet in conditions such as when qemu crashes or stop polling the tap, it's hard for the management software to detect such condition and clean the the mirroring before. This would lead all packets to the bridge to be dropped and break the netowrk of other virtual machines. To solve the issue, the patch does not drop packets when kernel fails to mirror it, and only drop the redirected packets. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-16 14:54:44 -07:00
Dan Carpenter	02644a1745	sctp: fix bogus if statement in sctp_auth_recv_cid() There is an extra semi-colon here, so we always return 0 instead of calling __sctp_auth_cid(). Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-16 13:36:29 -07:00
Ulrich Weber	6932f119bd	sctp: fix compile issue with disabled CONFIG_NET_NS struct seq_net_private has no struct net if CONFIG_NET_NS is not enabled Signed-off-by: Ulrich Weber <ulrich.weber@sophos.com> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-16 13:36:29 -07:00
Pablo Neira Ayuso	2614f86490	netfilter: nf_ct_expect: fix possible access to uninitialized timer In __nf_ct_expect_check, the function refresh_timer returns 1 if a matching expectation is found and its timer is successfully refreshed. This results in nf_ct_expect_related returning 0. Note that at this point: - the passed expectation is not inserted in the expectation table and its timer was not initialized, since we have refreshed one matching/existing expectation. - nf_ct_expect_alloc uses kmem_cache_alloc, so the expectation timer is in some undefined state just after the allocation, until it is appropriately initialized. This can be a problem for the SIP helper during the expectation addition: ... if (nf_ct_expect_related(rtp_exp) == 0) { if (nf_ct_expect_related(rtcp_exp) != 0) nf_ct_unexpect_related(rtp_exp); ... Note that nf_ct_expect_related(rtp_exp) may return 0 for the timer refresh case that is detailed above. Then, if nf_ct_unexpect_related(rtcp_exp) returns != 0, nf_ct_unexpect_related(rtp_exp) is called, which does: spin_lock_bh(&nf_conntrack_lock); if (del_timer(&exp->timeout)) { nf_ct_unlink_expect(exp); nf_ct_expect_put(exp); } spin_unlock_bh(&nf_conntrack_lock); Note that del_timer always returns false if the timer has been initialized. However, the timer was not initialized since setup_timer was not called, therefore, the expectation timer remains in some undefined state. If I'm not missing anything, this may lead to the removal an unexistent expectation. To fix this, the optimization that allows refreshing an expectation is removed. Now nf_conntrack_expect_related looks more consistent to me since it always add the expectation in case that it returns success. Thanks to Patrick McHardy for participating in the discussion of this patch. I think this may be the source of the problem described by: http://marc.info/?l=netfilter-devel&m=134073514719421&w=2 Reported-by: Rafal Fitt <rafalf@aplusc.com.pl> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-08-16 11:49:53 +02:00
Mathias Krause	43da5f2e0d	net: fix info leak in compat dev_ifconf() The implementation of dev_ifconf() for the compat ioctl interface uses an intermediate ifc structure allocated in userland for the duration of the syscall. Though, it fails to initialize the padding bytes inserted for alignment and that for leaks four bytes of kernel stack. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-15 21:36:31 -07:00
Mathias Krause	2d8a041b7b	ipvs: fix info leak in getsockopt(IP_VS_SO_GET_TIMEOUT) If at least one of CONFIG_IP_VS_PROTO_TCP or CONFIG_IP_VS_PROTO_UDP is not set, __ip_vs_get_timeouts() does not fully initialize the structure that gets copied to userland and that for leaks up to 12 bytes of kernel stack. Add an explicit memset(0) before passing the structure to __ip_vs_get_timeouts() to avoid the info leak. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Wensong Zhang <wensong@linux-vs.org> Cc: Simon Horman <horms@verge.net.au> Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-15 21:36:31 -07:00
Mathias Krause	7b07f8eb75	dccp: fix info leak via getsockopt(DCCP_SOCKOPT_CCID_TX_INFO) The CCID3 code fails to initialize the trailing padding bytes of struct tfrc_tx_info added for alignment on 64 bit architectures. It that for potentially leaks four bytes kernel stack via the getsockopt() syscall. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-15 21:36:31 -07:00
Mathias Krause	276bdb82de	dccp: check ccid before dereferencing ccid_hc_rx_getsockopt() and ccid_hc_tx_getsockopt() might be called with a NULL ccid pointer leading to a NULL pointer dereference. This could lead to a privilege escalation if the attacker is able to map page 0 and prepare it with a fake ccid_ops pointer. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-15 21:36:31 -07:00
Mathias Krause	3592aaeb80	llc: fix info leak via getsockname() The LLC code wrongly returns 0, i.e. "success", when the socket is zapped. Together with the uninitialized uaddrlen pointer argument from sys_getsockname this leads to an arbitrary memory leak of up to 128 bytes kernel stack via the getsockname() syscall. Return an error instead when the socket is zapped to prevent the info leak. Also remove the unnecessary memset(0). We don't directly write to the memory pointed by uaddr but memcpy() a local structure at the end of the function that is properly initialized. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-15 21:36:31 -07:00
Mathias Krause	04d4fbca10	l2tp: fix info leak via getsockname() The L2TP code for IPv6 fails to initialize the l2tp_unused member of struct sockaddr_l2tpip6 and that for leaks two bytes kernel stack via the getsockname() syscall. Initialize l2tp_unused with 0 to avoid the info leak. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-15 21:36:31 -07:00
Mathias Krause	792039c73c	Bluetooth: L2CAP - Fix info leak via getsockname() The L2CAP code fails to initialize the l2_bdaddr_type member of struct sockaddr_l2 and the padding byte added for alignment. It that for leaks two bytes kernel stack via the getsockname() syscall. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Marcel Holtmann <marcel@holtmann.org> Cc: Gustavo Padovan <gustavo@padovan.org> Cc: Johan Hedberg <johan.hedberg@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-15 21:36:31 -07:00
Mathias Krause	9344a97296	Bluetooth: RFCOMM - Fix info leak via getsockname() The RFCOMM code fails to initialize the trailing padding byte of struct sockaddr_rc added for alignment. It that for leaks one byte kernel stack via the getsockname() syscall. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Marcel Holtmann <marcel@holtmann.org> Cc: Gustavo Padovan <gustavo@padovan.org> Cc: Johan Hedberg <johan.hedberg@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-15 21:36:31 -07:00
Mathias Krause	f9432c5ec8	Bluetooth: RFCOMM - Fix info leak in ioctl(RFCOMMGETDEVLIST) The RFCOMM code fails to initialize the two padding bytes of struct rfcomm_dev_list_req inserted for alignment before copying it to userland. Additionally there are two padding bytes in each instance of struct rfcomm_dev_info. The ioctl() that for disclosures two bytes plus dev_num times two bytes uninitialized kernel heap memory. Allocate the memory using kzalloc() to fix this issue. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Marcel Holtmann <marcel@holtmann.org> Cc: Gustavo Padovan <gustavo@padovan.org> Cc: Johan Hedberg <johan.hedberg@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-15 21:36:30 -07:00
Mathias Krause	9ad2de43f1	Bluetooth: RFCOMM - Fix info leak in getsockopt(BT_SECURITY) The RFCOMM code fails to initialize the key_size member of struct bt_security before copying it to userland -- that for leaking one byte kernel stack. Initialize key_size with 0 to avoid the info leak. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Marcel Holtmann <marcel@holtmann.org> Cc: Gustavo Padovan <gustavo@padovan.org> Cc: Johan Hedberg <johan.hedberg@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-15 21:36:30 -07:00
Mathias Krause	3f68ba07b1	Bluetooth: HCI - Fix info leak via getsockname() The HCI code fails to initialize the hci_channel member of struct sockaddr_hci and that for leaks two bytes kernel stack via the getsockname() syscall. Initialize hci_channel with 0 to avoid the info leak. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Marcel Holtmann <marcel@holtmann.org> Cc: Gustavo Padovan <gustavo@padovan.org> Cc: Johan Hedberg <johan.hedberg@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-15 21:36:30 -07:00
Mathias Krause	e15ca9a0ef	Bluetooth: HCI - Fix info leak in getsockopt(HCI_FILTER) The HCI code fails to initialize the two padding bytes of struct hci_ufilter before copying it to userland -- that for leaking two bytes kernel stack. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Marcel Holtmann <marcel@holtmann.org> Cc: Gustavo Padovan <gustavo@padovan.org> Cc: Johan Hedberg <johan.hedberg@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-15 21:36:30 -07:00
Mathias Krause	3c0c5cfdcd	atm: fix info leak via getsockname() The ATM code fails to initialize the two padding bytes of struct sockaddr_atmpvc inserted for alignment. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-15 21:36:30 -07:00
Mathias Krause	e862f1a9b7	atm: fix info leak in getsockopt(SO_ATMPVC) The ATM code fails to initialize the two padding bytes of struct sockaddr_atmpvc inserted for alignment. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-15 21:36:30 -07:00
David S. Miller	2ea214929d	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== This is a batch of updates intended for 3.7. The ath9k, mwifiex, and b43 drivers get the bulk of the commits this time, with a handful of other driver bits thrown-in. It is mostly just minor fixes and cleanups, etc. Also included is a Bluetooth pull, with a lot of refactoring. Gustavo says: "These are the changes I queued for 3.7. There are a many small fixes/improvements by Andre Guedes. A l2cap channel refcounting refactor by Jaganath. Bluetooth sockets now appears in /proc/net, by Masatake Yamato and Sachin Kamat changes ours drivers to use devm_kzalloc()." ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-15 15:26:05 -07:00
Razvan Ghitulete	1b3a6926fc	net: remove wrong initialization for snd_wl1 The field tp->snd_wl1 is twice initialized, the second time seems to be wrong as it may overwrite any update in tcp_ack. Signed-off-by: Razvan Ghitulete <rghitulete@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-15 15:23:46 -07:00
Fan Du	65e0736bc2	xfrm: remove redundant parameter "int dir" in struct xfrm_mgr.acquire Sematically speaking, xfrm_mgr.acquire is called when kernel intends to ask user space IKE daemon to negotiate SAs with peers. IOW the direction will always be XFRM_POLICY_OUT, so remove int dir for clarity. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-15 15:13:30 -07:00
Stephen Hemminger	c03307eab6	bridge: fix rcu dereference outside of rcu_read_lock Alternative solution for problem found by Linux Driver Verification project (linuxtesting.org). As it noted in the comment before the br_handle_frame_finish function, this function should be called under rcu_read_lock. The problem callgraph: br_dev_xmit -> br_nf_pre_routing_finish_bridge_slow -> -> br_handle_frame_finish -> br_port_get_rcu -> rcu_dereference And in this case there is no read-lock section. Reported-by: Denis Efremov <yefremov.denis@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-15 15:09:41 -07:00
John W. Linville	16698918cd	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem	2012-08-15 14:29:37 -04:00
Eric W. Biederman	e1fc3b14f9	sctp: Make sysctl tunables per net Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 23:32:16 -07:00
Eric W. Biederman	f53b5b097e	sctp: Push struct net down into sctp_verify_ext_param Add struct net as a parameter to sctp_verify_param so it can be passed to sctp_verify_ext_param where struct net will be needed when the sctp tunables become per net tunables. Add struct net as a parameter to sctp_verify_init so struct net can be passed to sctp_verify_param. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 23:30:37 -07:00
Eric W. Biederman	24cb81a6a9	sctp: Push struct net down into all of the state machine functions There are a handle of state machine functions primarily those dealing with processing INIT packets where there is neither a valid endpoint nor a valid assoication from which to derive a struct net. Therefore add struct net * to the parameter list of sctp_state_fn_t and update all of the state machine functions. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 23:30:37 -07:00
Eric W. Biederman	e7ff4a7037	sctp: Push struct net down into sctp_in_scope struct net will be needed shortly when the tunables are made per network namespace. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 23:30:37 -07:00
Eric W. Biederman	89bf3450cb	sctp: Push struct net down into sctp_transport_init Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 23:30:37 -07:00
Eric W. Biederman	55e26eb95a	sctp: Push struct net down to sctp_chunk_event_lookup This trickles up through sctp_sm_lookup_event up to sctp_do_sm and up further into sctp_primitiv_NAME before the code reaches places where struct net can be reliably found. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 23:30:37 -07:00
Eric W. Biederman	ebb7e95d93	sctp: Add infrastructure for per net sysctls Start with an empty sctp_net_table that will be populated as the various tunable sysctls are made per net. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 23:30:37 -07:00
Eric W. Biederman	b01a24078f	sctp: Make the mib per network namespace Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 23:30:36 -07:00
Eric W. Biederman	bb2db45b54	sctp: Enable sctp in all network namespaces - Fix the sctp_af operations to work in all namespaces - Enable sctp socket creation in all network namespaces. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 23:29:59 -07:00
Eric W. Biederman	13d782f6b4	sctp: Make the proc files per network namespace. - Convert all of the files under /proc/net/sctp to be per network namespace. - Don't print anything for /proc/net/sctp/snmp except in the initial network namespaces as the snmp counters still have to be converted to be per network namespace. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 23:29:53 -07:00
Eric W. Biederman	632c928a6a	sctp: Move the percpu sockets counter out of sctp_proc_init The percpu sctp socket counter has nothing at all to do with the sctp proc files, and having it in the wrong initialization is confusing, and makes network namespace support a pain. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 23:17:26 -07:00
Eric W. Biederman	2ce9550350	sctp: Make the ctl_sock per network namespace - Kill sctp_get_ctl_sock, it is useless now. - Pass struct net where needed so net->sctp.ctl_sock is accessible. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 23:17:26 -07:00
Eric W. Biederman	4db67e8086	sctp: Make the address lists per network namespace - Move the address lists into struct net - Add per network namespace initialization and cleanup - Pass around struct net so it is everywhere I need it. - Rename all of the global variable references into references to the variables moved into struct net Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 23:12:17 -07:00
Eric W. Biederman	4110cc255d	sctp: Make the association hashtable handle multiple network namespaces - Use struct net in the hash calculation - Use sock_net(association.base.sk) in the association lookups. - On receive calculate the network namespace from skb->dev. - Pass struct net from receive down to the functions that actually do the association lookup. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 22:44:12 -07:00
Eric W. Biederman	4cdadcbcb6	sctp: Make the endpoint hashtable handle multiple network namespaces - Use struct net in the hash calculation - Use sock_net(endpoint.base.sk) in the endpoint lookups. - On receive calculate the network namespace from skb->dev. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 22:44:12 -07:00
Eric W. Biederman	f1f4376307	sctp: Make the port hash table use struct net in it's key. - Add struct net into the port hash table hash calculation - Add struct net inot the struct sctp_bind_bucket so there is a memory of which network namespace a port is allocated in. No need for a ref count because sctp_bind_bucket only exists when there are sockets in the hash table and sockets can not change their network namspace, and sockets already ref count their network namespace. - Add struct net into the key comparison when we are testing to see if we have found the port hash table entry we are looking for. With these changes lookups in the port hash table becomes safe to use in multiple network namespaces. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 22:44:12 -07:00
Eric W. Biederman	26711a791e	userns: xt_owner: Add basic user namespace support. - Only allow adding matches from the initial user namespace - Add the appropriate conversion functions to handle matches against sockets in other user namespaces. Cc: Jan Engelhardt <jengelh@medozas.de> Cc: Patrick McHardy <kaber@trash.net> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:55:30 -07:00
Eric W. Biederman	da7428080a	userns xt_recent: Specify the owner/group of ip_list_perms in the initial user namespace xt_recent creates a bunch of proc files and initializes their uid and gids to the values of ip_list_uid and ip_list_gid. When initialize those proc files convert those values to kuids so they can continue to reside on the /proc inode. Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Jan Engelhardt <jengelh@medozas.de> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:55:29 -07:00
Eric W. Biederman	8c6e2a941a	userns: Convert xt_LOG to print socket kuids and kgids as uids and gids xt_LOG always writes messages via sb_add via printk. Therefore when xt_LOG logs the uid and gid of a socket a packet came from the values should be converted to be in the initial user namespace. Thus making xt_LOG as user namespace safe as possible. Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:55:29 -07:00
Eric W. Biederman	a6c6796c71	userns: Convert cls_flow to work with user namespaces enabled The flow classifier can use uids and gids of the sockets that are transmitting packets and do insert those uids and gids into the packet classification calcuation. I don't fully understand the details but it appears that we can depend on specific uids and gids when making traffic classification decisions. To work with user namespaces enabled map from kuids and kgids into uids and gids in the initial user namespace giving raw integer values the code can play with and depend on. To avoid issues of userspace depending on uids and gids in packet classifiers installed from other user namespaces and getting confused deny all packet classifiers that use uids or gids that are not comming from a netlink socket in the initial user namespace. Cc: Patrick McHardy <kaber@trash.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Changli Gao <xiaosuo@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:55:28 -07:00
Eric W. Biederman	af4c6641f5	net sched: Pass the skb into change so it can access NETLINK_CB cls_flow.c plays with uids and gids. Unless I misread that code it is possible for classifiers to depend on the specific uid and gid values. Therefore I need to know the user namespace of the netlink socket that is installing the packet classifiers. Pass in the rtnetlink skb so I can access the NETLINK_CB of the passed packet. In particular I want access to sk_user_ns(NETLINK_CB(in_skb).ssk). Pass in not the user namespace but the incomming rtnetlink skb into the the classifier change routines as that is generally the more useful parameter. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:55:28 -07:00
Eric W. Biederman	9eea9515cb	userns: nfnetlink_log: Report socket uids in the log sockets user namespace At logging instance creation capture the peer netlink socket's user namespace. Use the captured peer user namespace when reporting socket uids to the peer. The peer socket's user namespace is guaranateed to be valid until the user closes the netlink socket. nfnetlink_log removes instances during the final close of a socket. __build_packet_message does not get called after an instance is destroyed. Therefore it is safe to let the peer netlink socket take care of the user namespace reference counting for us. Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:55:27 -07:00
Eric W. Biederman	d06ca95643	userns: Teach inet_diag to work with user namespaces Compute the user namespace of the socket that we are replying to and translate the kuids of reported sockets into that user namespace. Cc: Andrew Vagin <avagin@openvz.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Pavel Emelyanov <xemul@parallels.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:55:20 -07:00
Eric W. Biederman	3fbc290540	netlink: Make the sending netlink socket availabe in NETLINK_CB The sending socket of an skb is already available by it's port id in the NETLINK_CB. If you want to know more like to examine the credentials on the sending socket you have to look up the sending socket by it's port id and all of the needed functions and data structures are static inside of af_netlink.c. So do the simple thing and pass the sending socket to the receivers in the NETLINK_CB. I intend to use this to get the user namespace of the sending socket in inet_diag so that I can report uids in the context of the process who opened the socket, the same way I report uids in the contect of the process who opens files. Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:49:49 -07:00
Eric W. Biederman	d13fda8564	userns: Convert net/ax25 to use kuid_t where appropriate Cc: Ralf Baechle <ralf@linux-mips.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:49:42 -07:00
Eric W. Biederman	4f82f45730	net ip6 flowlabel: Make owner a union of struct pid * and kuid_t Correct a long standing omission and use struct pid in the owner field of struct ip6_flowlabel when the share type is IPV6_FL_S_PROCESS. This guarantees we don't have issues when pid wraparound occurs. Use a kuid_t in the owner field of struct ip6_flowlabel when the share type is IPV6_FL_S_USER to add user namespace support. In /proc/net/ip6_flowlabel capture the current pid namespace when opening the file and release the pid namespace when the file is closed ensuring we print the pid owner value that is meaning to the reader of the file. Similarly use from_kuid_munged to print uid values that are meaningful to the reader of the file. This requires exporting pid_nr_ns so that ipv6 can continue to built as a module. Yoiks what silliness Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:49:25 -07:00
Eric W. Biederman	7064d16e16	userns: Use kgids for sysctl_ping_group_range - Store sysctl_ping_group_range as a paire of kgid_t values instead of a pair of gid_t values. - Move the kgid conversion work from ping_init_sock into ipv4_ping_group_range - For invalid cases reset to the default disabled state. With the kgid_t conversion made part of the original value sanitation from userspace understand how the code will react becomes clearer and it becomes possible to set the sysctl ping group range from something other than the initial user namespace. Cc: Vasiliy Kulikov <segoon@openwall.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:49:10 -07:00
Eric W. Biederman	a7cb5a49bf	userns: Print out socket uids in a user namespace aware fashion. Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Sridhar Samudrala <sri@us.ibm.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:48:06 -07:00
Eric W. Biederman	976d020150	userns: Convert sock_i_uid to return a kuid_t Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:47:34 -07:00
Eric W. Biederman	d04a48b06d	userns: Convert __dev_set_promiscuity to use kuids in audit logs Cc: Klaus Heinrich Kiwi <klausk@br.ibm.com> Cc: Eric Paris <eparis@redhat.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2012-08-14 21:42:40 -07:00
Eric W. Biederman	b2e4f544fd	userns: Convert net/core/scm.c to use kuids and kgids With the existence of kuid_t and kgid_t we can take this further and remove the usage of struct cred altogether, ensuring we don't get cache line misses from reference counts. For now however start simply and do a straight forward conversion I can be certain is correct. In cred_to_ucred use from_kuid_munged and from_kgid_munged as these values are going directly to userspace and we want to use the userspace safe values not -1 when reporting a value that does not map. The earlier conversion that used from_kuid was buggy in that respect. Oops. Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:41:58 -07:00
Andre Guedes	61a0cfb008	Bluetooth: Fix use-after-free bug in SMP If SMP fails, we should always cancel security_timer delayed work. Otherwise, security_timer function may run after l2cap_conn object has been freed. This patch fixes the following warning reported by ODEBUG: WARNING: at lib/debugobjects.c:261 debug_print_object+0x7c/0x8d() Hardware name: Bochs ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x27 Modules linked in: btusb bluetooth Pid: 440, comm: kworker/u:2 Not tainted 3.5.0-rc1+ #4 Call Trace: [<ffffffff81174600>] ? free_obj_work+0x4a/0x7f [<ffffffff81023eb8>] warn_slowpath_common+0x7e/0x97 [<ffffffff81023f65>] warn_slowpath_fmt+0x41/0x43 [<ffffffff811746b1>] debug_print_object+0x7c/0x8d [<ffffffff810394f0>] ? __queue_work+0x241/0x241 [<ffffffff81174fdd>] debug_check_no_obj_freed+0x92/0x159 [<ffffffff810ac08e>] slab_free_hook+0x6f/0x77 [<ffffffffa0019145>] ? l2cap_conn_del+0x148/0x157 [bluetooth] [<ffffffff810ae408>] kfree+0x59/0xac [<ffffffffa0019145>] l2cap_conn_del+0x148/0x157 [bluetooth] [<ffffffffa001b9a2>] l2cap_recv_frame+0xa77/0xfa4 [bluetooth] [<ffffffff810592f9>] ? trace_hardirqs_on_caller+0x112/0x1ad [<ffffffffa001c86c>] l2cap_recv_acldata+0xe2/0x264 [bluetooth] [<ffffffffa0002b2f>] hci_rx_work+0x235/0x33c [bluetooth] [<ffffffff81038dc3>] ? process_one_work+0x126/0x2fe [<ffffffff81038e22>] process_one_work+0x185/0x2fe [<ffffffff81038dc3>] ? process_one_work+0x126/0x2fe [<ffffffff81059f2e>] ? lock_acquired+0x1b5/0x1cf [<ffffffffa00028fa>] ? le_scan_work+0x11d/0x11d [bluetooth] [<ffffffff81036fb6>] ? spin_lock_irq+0x9/0xb [<ffffffff81039209>] worker_thread+0xcf/0x175 [<ffffffff8103913a>] ? rescuer_thread+0x175/0x175 [<ffffffff8103cfe0>] kthread+0x95/0x9d [<ffffffff812c5054>] kernel_threadi_helper+0x4/0x10 [<ffffffff812c36b0>] ? retint_restore_args+0x13/0x13 [<ffffffff8103cf4b>] ? flush_kthread_worker+0xdb/0xdb [<ffffffff812c5050>] ? gs_change+0x13/0x13 This bug can be reproduced using hctool lecc or l2test tools and bluetoothd not running. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-08-15 01:06:23 -03:00
David S. Miller	7bab3ae760	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== Alexey Khoroshilov provides a potential memory leak in rndis_wlan. Bob Copeland gives us an ath5k fix for a lockdep problem. Dan Carpenter fixes a signedness mismatch in at76c50x. Felix Fietkau corrects a regression caused by an earlier commit that can lead to an IRQ storm. Lorenzo Bianconi offers a fix for a bad variable initialization in ath9k that can cause it to improperly mark decrypted frames. Rajkumar Manoharan fixes ath9k to prevent the btcoex time from running when the hardware is asleep. The remainder are Bluetooth fixes, about which Gustavo says: "Here goes some fixes for 3.6-rc1, there are a few fix to thte inquiry code by Ram Malovany, support for 2 new devices, and few others fixes for NULL dereference, possible deadlock and a memory leak." ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 17:03:22 -07:00
Ben Hutchings	4acd4945cd	ipv6: addrconf: Avoid calling netdevice notifiers with RCU read-side lock Cong Wang reports that lockdep detected suspicious RCU usage while enabling IPV6 forwarding: [ 1123.310275] =============================== [ 1123.442202] [ INFO: suspicious RCU usage. ] [ 1123.558207] 3.6.0-rc1+ #109 Not tainted [ 1123.665204] ------------------------------- [ 1123.768254] include/linux/rcupdate.h:430 Illegal context switch in RCU read-side critical section! [ 1123.992320] [ 1123.992320] other info that might help us debug this: [ 1123.992320] [ 1124.307382] [ 1124.307382] rcu_scheduler_active = 1, debug_locks = 0 [ 1124.522220] 2 locks held by sysctl/5710: [ 1124.648364] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff81768498>] rtnl_trylock+0x15/0x17 [ 1124.882211] #1: (rcu_read_lock){.+.+.+}, at: [<ffffffff81871df8>] rcu_lock_acquire+0x0/0x29 [ 1125.085209] [ 1125.085209] stack backtrace: [ 1125.332213] Pid: 5710, comm: sysctl Not tainted 3.6.0-rc1+ #109 [ 1125.441291] Call Trace: [ 1125.545281] [<ffffffff8109d915>] lockdep_rcu_suspicious+0x109/0x112 [ 1125.667212] [<ffffffff8107c240>] rcu_preempt_sleep_check+0x45/0x47 [ 1125.781838] [<ffffffff8107c260>] __might_sleep+0x1e/0x19b [...] [ 1127.445223] [<ffffffff81757ac5>] call_netdevice_notifiers+0x4a/0x4f [...] [ 1127.772188] [<ffffffff8175e125>] dev_disable_lro+0x32/0x6b [ 1127.885174] [<ffffffff81872d26>] dev_forward_change+0x30/0xcb [ 1128.013214] [<ffffffff818738c4>] addrconf_forward_change+0x85/0xc5 [...] addrconf_forward_change() uses RCU iteration over the netdev list, which is unnecessary since it already holds the RTNL lock. We also cannot reasonably require netdevice notifier functions not to sleep. Reported-by: Cong Wang <amwang@redhat.com> Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 17:02:12 -07:00
Pavel Emelyanov	eea68e2f1a	packet: Report socket mclist info via diag module The info is reported as an array of packet_diag_mclist structures. Each includes not only the directly configured values (index, type, etc), but also the "count". Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 16:56:33 -07:00
Pavel Emelyanov	8a360be0c5	packet: Report more packet sk info via diag module This reports in one rtattr message all the other scalar values, that can be set on a packet socket with setsockopt. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 16:56:33 -07:00
Pavel Emelyanov	96ec632714	packet: Diag core and basic socket info dumping The diag module can be built independently from the af_packet.ko one, just like it's done in unix sockets. The core dumping message carries the info available at socket creation time, i.e. family, type and protocol (in the same byte order as shown in the proc file). The socket inode number and cookie is reserved for future per-socket info retrieving. The per-protocol filtering is also reserved for future by requiring the sdiag_protocol to be zero. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 16:56:33 -07:00
Pavel Emelyanov	2787b04b6c	packet: Introduce net/packet/internal.h header The diag module will need to access some private packet_sock data, so move it to a header in advance. This file will be shared between the af_packet.c and the diag.c Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 16:56:33 -07:00
Ben Hutchings	aadf31de16	llc: Fix races between llc2 handler use and (un)registration When registering the handlers, any state they rely on must be completely initialised first. When unregistering, we must wait until they are definitely no longer running. llc_rcv() must also avoid reading the handler pointers again after checking for NULL. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 16:52:02 -07:00
Ben Hutchings	f4f8720feb	llc2: Call llc_station_exit() on llc2_init() failure path Otherwise the station packet handler will remain registered even though the module is unloaded. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 16:51:40 -07:00
Ben Hutchings	6024935f5f	llc2: Fix silent failure of llc_station_init() llc_station_init() creates and processes an event skb with no effect other than to change the state from DOWN to UP. Allocation failure is reported, but then ignored by its caller, llc2_init(). Remove this possibility by simply initialising the state as UP. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 16:51:18 -07:00
Igor Maravic	ad5b310228	net: ipv4: fib_trie: Don't unnecessarily search for already found fib leaf We've already found leaf, don't search for it again. Same is for fib leaf info. Signed-off-by: Igor Maravic <igorm@etf.rs> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 15:02:20 -07:00
Priyanka Jain	418a99ac6a	Replace rwlock on xfrm_policy_afinfo with rcu xfrm_policy_afinfo is read mosly data structure. Write on xfrm_policy_afinfo is done only at the time of configuration. So rwlocks can be safely replaced with RCU. RCUs usage optimizes the performance. Signed-off-by: Priyanka Jain <Priyanka.Jain@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 14:57:37 -07:00
Igor Maravic	4855d6f311	net: ipv6: proc: Fix error handling Fix error handling in case making of dir dev_snmp6 failes Signed-off-by: Igor Maravic <igorm@etf.rs> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 14:45:07 -07:00
Yan, Zheng	7bd86cc282	ipv4: Cache local output routes Commit `caacf05e5a` causes big drop of UDP loop back performance. The cause of the regression is that we do not cache the local output routes. Each time we send a datagram from unconnected UDP socket, the kernel allocates a dst_entry and adds it to the rt_uncached_list. It creates lock contention on the rt_uncached_lock. Reported-by: Alex Shi <alex.shi@intel.com> Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 14:45:07 -07:00

... 2 3 4 5 6 ...

24934 commits