Commit Graph

21180 Commits (b6d640c2286d16b15a21a4475c896cdce6a0bbe0)

Author SHA1 Message Date
Pavel Emelyanov b6d640c228 udp_diag: Implement the dump-all functionality
Do the same as TCP does -- iterate the given udp_table, filter
sockets with bytecode and dump sockets into reply message.

The same filtering as for TCP applies, though only some of the
state bits really matter.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 14:15:00 -05:00
Pavel Emelyanov a925aa00a5 udp_diag: Implement the get_exact dumping functionality
Do the same as TCP does -- lookup a socket in the given udp_table,
check cookie, fill the reply message with existing inet socket dumping
helper and send one back.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 14:15:00 -05:00
Pavel Emelyanov 52b7c59bc3 udp_diag: Basic skeleton
Introduce the transport level diag handler module for UDP (and UDP-lite)
sockets and register (empty for now) callbacks in the inet_diag module.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 14:15:00 -05:00
Pavel Emelyanov fce823381e udp: Export code sk lookup routines
The UDP diag get_exact handler will require them to find a
socket by provided net, [sd]addr-s, [sd]ports and device.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 14:14:08 -05:00
Pavel Emelyanov 1942c518ca inet_diag: Generalize inet_diag dump and get_exact calls
Introduce two callbacks in inet_diag_handler -- one for dumping all
sockets (with filters) and the other one for dumping a single sk.

Replace direct calls to icsk handlers with indirect calls to callbacks
provided by handlers.

Make existing TCP and DCCP handlers use provided helpers for icsk-s.

The UDP diag module will provide its own.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 14:14:08 -05:00
Pavel Emelyanov 3c4d05c805 inet_diag: Introduce the inet socket dumping routine
The existing inet_csk_diag_fill dumps the inet connection sock info
into the netlink inet_diag_message. Prepare this routine to be able
to dump only the inet_sock part of a socket if the icsk part is missing.

This will be used by UDP diag module when dumping UDP sockets.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 14:14:08 -05:00
Pavel Emelyanov 8d07d1518a inet_diag: Introduce the byte-code run on an inet socket
The upcoming UDP module will require exactly this ability, so just
move the existing code to provide one.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 14:14:08 -05:00
Pavel Emelyanov efb3cb428d inet_diag: Split inet_diag_get_exact into parts
Similar to previous patch: the 1st part locks the inet handler
and will get generalized and the 2nd one dumps icsk-s and will
be used by TCP and DCCP handlers.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 14:14:08 -05:00
Pavel Emelyanov 476f7dbff3 inet_diag: Split inet_diag_get_exact into parts
The 1st part locks the inet handler and the 2nd one dump the
inet connection sock.

In the next patches the 1st part will be generalized to call
the socket dumping routine indirectly (i.e. TCP/UDP/DCCP) and
the 2nd part will be used by TCP and DCCP handlers.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 14:14:08 -05:00
Pavel Emelyanov b005ab4ef8 inet_diag: Export inet diag cookie checking routine
The netlink diag susbsys stores sk address bits in the nl message
as a "cookie" and uses one when dumps details about particular
socket.

The same will be required for udp diag module, so introduce a heler
in inet_diag module

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 14:14:08 -05:00
Pavel Emelyanov 87c22ea52e inet_diag: Reduce the number of args for bytecode run routine
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 14:14:07 -05:00
Pavel Emelyanov 7b35eadd7e inet_diag: Remove indirect sizeof from inet diag handlers
There's an info_size value stored on inet_diag_handler, but for existing
code this value is effectively constant, so just use sizeof(struct tcp_info)
where required.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 14:14:07 -05:00
Eric Dumazet a73ed26bba sch_red: generalize accurate MAX_P support to RED/GRED/CHOKE
Now RED uses a Q0.32 number to store max_p (max probability), allow
RED/GRED/CHOKE to use/report full resolution at config/dump time.

Old tc binaries are non aware of new attributes, and still set/get Plog.

New tc binary set/get both Plog and max_p for backward compatibility,
they display "probability value" if they get max_p from new kernels.

# tc -d  qdisc show dev ...
...
qdisc red 10: parent 1:1 limit 360Kb min 30Kb max 90Kb ecn ewma 5
probability 0.09 Scell_log 15

Make sure we avoid potential divides by 0 in reciprocal_value(), if
(max_th - min_th) is big.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 13:46:15 -05:00
John Fastabend 0221cd5154 Revert "net: netprio_cgroup: make net_prio_subsys static"
This reverts commit 865d9f9f74.

This commit breaks the build with CONFIG_NETPRIO_CGROUP=y so
revert it. It does build as a module though. The SUBSYS macro
in the cgroup core code automatically defines a subsys structure
as extern. Long term we should fix the macro. And I need to
fully build test things.

Tested with CONFIG_NETPRIO_CGROUP={y|m|n} with and without
CONFIG_CGROUPS defined.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
CC: Neil Horman <nhorman@tuxdriver.com>
Reported-By: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 13:39:27 -05:00
Dan Carpenter 6f8e4ad0ef sock_diag: off by one checks
These tests are off by one because sock_diag_handlers[] only has AF_MAX
elements.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08 19:58:35 -05:00
John Fastabend 865d9f9f74 net: netprio_cgroup: make net_prio_subsys static
net_prio_subsys can be made static this removes the sparse
warning it was throwing.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08 19:58:35 -05:00
Benjamin LaHaise 6d4cdf47d2 vlan: add 802.1q netpoll support
Add netpoll support to 802.1q vlan devices.  Based on the netpoll support
in the bridging code.  Tested on a forced_eth device with netconsole.

Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08 19:58:30 -05:00
Eric Dumazet 8af2a218de sch_red: Adaptative RED AQM
Adaptative RED AQM for linux, based on paper from Sally FLoyd,
Ramakrishna Gummadi, and Scott Shenker, August 2001 :

http://icir.org/floyd/papers/adaptiveRed.pdf

Goal of Adaptative RED is to make max_p a dynamic value between 1% and
50% to reach the target average queue : (max_th - min_th) / 2

Every 500 ms:
 if (avg > target and max_p <= 0.5)
  increase max_p : max_p += alpha;
 else if (avg < target and max_p >= 0.01)
  decrease max_p : max_p *= beta;

target :[min_th + 0.4*(min_th - max_th),
          min_th + 0.6*(min_th - max_th)].
alpha : min(0.01, max_p / 4)
beta : 0.9
max_P is a Q0.32 fixed point number (unsigned, with 32 bits mantissa)

Changes against our RED implementation are :

max_p is no longer a negative power of two (1/(2^Plog)), but a Q0.32
fixed point number, to allow full range described in Adatative paper.

To deliver a random number, we now use a reciprocal divide (thats really
a multiply), but this operation is done once per marked/droped packet
when in RED_BETWEEN_TRESH window, so added cost (compared to previous
AND operation) is near zero.

dump operation gives current max_p value in a new TCA_RED_MAX_P
attribute.

Example on a 10Mbit link :

tc qdisc add dev $DEV parent 1:1 handle 10: est 1sec 8sec red \
   limit 400000 min 30000 max 90000 avpkt 1000 \
   burst 55 ecn adaptative bandwidth 10Mbit

# tc -s -d qdisc show dev eth3
...
qdisc red 10: parent 1:1 limit 400000b min 30000b max 90000b ecn
adaptative ewma 5 max_p=0.113335 Scell_log 15
 Sent 50414282 bytes 34504 pkt (dropped 35, overlimits 1392 requeues 0)
 rate 9749Kbit 831pps backlog 72056b 16p requeues 0
  marked 1357 early 35 pdrop 0 other 0

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08 19:52:43 -05:00
Jiri Pirko 348a1443cc vlan: introduce functions to do mass addition/deletion of vids by another device
Introduce functions handy to copy vlan ids from one driver's list to
another.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08 19:52:42 -05:00
Jiri Pirko 5b9ea6e022 vlan: introduce vid list with reference counting
This allows to keep track of vids needed to be in rx vlan filters of
devices even if they are used in bond/team etc.

vlan_info as well as vlan_group previously was, is allocated when first
vid is added and dealocated whan last vid is deleted.

vlan_group definition is moved to private header.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08 19:52:42 -05:00
Jiri Pirko 87002b03ba net: introduce vlan_vid_[add/del] and use them instead of direct [add/kill]_vid ndo calls
This patch adds wrapper for ndo_vlan_rx_add_vid/ndo_vlan_rx_kill_vid
functions. Check for NETIF_F_HW_VLAN_FILTER feature is done in this
wrapper.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08 19:52:42 -05:00
Jiri Pirko 7da82c06de vlan: rename vlan_dev_info to vlan_dev_priv
As this structure is priv, name it approprietely. Also for pointer to it
use name "vlan".

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08 19:51:30 -05:00
stephen hemminger 4359881338 bridge: add local MAC address to forwarding table (v2)
If user has configured a MAC address that is not one of the existing
ports of the bridge, then we need to add a special entry in the forwarding
table. This forwarding table entry has no outgoing port so it has to be
treated a little differently. The special entry is reported by the netlink
interface with ifindex of bridge, but ignored by the old interface since there
is no usable way to put it in the ABI.

Reported-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08 19:40:28 -05:00
stephen hemminger 31e8a49c16 bridge: rearrange fdb notifications (v2)
Pass bridge to fdb_notify so it can determine correct namespace based
on namespace of bridge rather than namespace of destination port.
Also makes next patch easier.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08 19:40:28 -05:00
stephen hemminger f58ee4e1a2 bridge: refactor fdb_notify
Move fdb_notify outside of fdb_create. This fixes the problem
that notification of local entries are not flagged correctly.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08 19:40:28 -05:00
David S. Miller 959327c784 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2011-12-06 21:10:05 -05:00
Roar Førde f84ea779c2 caif: Replace BUG_ON with WARN_ON.
BUG_ON is too strict in a number of circumstances,
use WARN_ON instead. Protocol errors should not halt the system.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-06 17:21:47 -05:00
sjur.brandeland@stericsson.com 005b0b076f caif: Bad assert triggering false positive.
Fix bad assert on fragment size triggering false positive.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-06 17:21:47 -05:00
David S. Miller 87a115783e ipv6: Move xfrm_lookup() call down into icmp6_dst_alloc().
And return error pointers.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-06 17:04:13 -05:00
David S. Miller 8f0315190d ipv6: Make third arg to anycast_dst_alloc() bool.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-06 16:48:14 -05:00
Stephen Boyd 681090902e net: Silence seq_scale() unused warning
On a CONFIG_NET=y build

net/core/secure_seq.c:22: warning: 'seq_scale' defined but not
used

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-06 13:59:16 -05:00
Pavel Emelyanov 8ef874bfc7 sock_diag: Move the sock_ code to net/core/
This patch moves the sock_ code from inet_diag.c to generic sock_diag.c
file and provides necessary request_module-s calls and a pointer on
inet_diag_compat dumping routine.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-06 13:58:02 -05:00
Pavel Emelyanov a029fe26b6 inet_diag: Cleanup type2proto last user
Now all the code works with sock_diag_req-compatible structs, so it's
possible to stop using the inet_diag_type2proto in inet_csk_diag_fill.
Pass the inet_diag_req into it and use the sdiag_protocol field. At the
same time remove the explicit ext argument, since it's also on the req.

However, this conversion is still required in _compat code, so just move
this routine, not remove.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-06 13:58:02 -05:00
Pavel Emelyanov d23deaa07b inet_diag: Introduce socket family checks
The new API will specify family to work with. Teach the existing
socket walking code to bypass not interesting ones.

To preserve compatibility with existing behavior the _compat code
sets interesting family to AF_UNSPEC to dump them all.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-06 13:58:02 -05:00
Pavel Emelyanov 25c4cd2b6d inet_diag: Switch the _dump to work with new header
Make inet_diag_dumo work with given header instead of calculating
one from the nl message.

The SOCK_DIAG_BY_FAMILY just passes skb's one through, the compat code
converts the old header to new one.

Also fix the bytecode calculation to find one at proper offset.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-06 13:58:02 -05:00
Pavel Emelyanov fe50ce2846 inet_diag: Switch the _get_exact to work with new header
Make inet_diag_get_exact work with given header instead of calculating
one from the nl message.

The SOCK_DIAG_BY_FAMILY just passes skb's one through, the compat code
converts the old header to new one.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-06 13:58:01 -05:00
Pavel Emelyanov 126fdc3249 inet_diag: Introduce new inet_diag_req header
This one coinsides with the sock_diag_req in the beginning and
contains only used fields from its previous analogue.

The existing code is patched to use the _compat version of it
for now.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-06 13:58:01 -05:00
Pavel Emelyanov d366477a52 sock_diag: Initial skeleton
When receiving the SOCK_DIAG_BY_FAMILY message we have to find the
handler for provided family and pass the nl message to it.

This patch describes an infrastructure to work with such nandlers
and implements stubs for AF_INET(6) ones.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-06 13:58:01 -05:00
Pavel Emelyanov f13c95f0e2 inet_diag: Switch from _GETSOCK to IPPROTO_ numbers
Sorry, but the vger didn't let this message go to the list. Re-sending it with
less spam-filter-prone subject.

When dumping the AF_INET/AF_INET6 sockets user will also specify the protocol,
so prepare the protocol diag handlers to work with IPPROTO_ constants.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-06 13:58:01 -05:00
Pavel Emelyanov 37f352b5e3 inet_diag: Move byte-code finding up the call-stack
Current code calculates it at fixed offset. This offset will change, so
move the BC calculation upper to make the further patching simpler.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-06 13:58:01 -05:00
Pavel Emelyanov 8d34172dfd sock_diag: Introduce new message type
This type will run the family+protocol based socket dumping.
Also prepare the stub function for it.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-06 13:58:01 -05:00
Pavel Emelyanov 7f1fb60c4f inet_diag: Partly rename inet_ to sock_
The ultimate goal is to get the sock_diag module, that works in
family+protocol terms. Currently this is suitable to do on the
inet_diag basis, so rename parts of the code. It will be moved
to sock_diag.c later.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-06 13:57:36 -05:00
David S. Miller 9e998a7550 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next 2011-12-06 13:31:19 -05:00
Peter Pan(潘卫平) 99b53bdd81 ipv4:correct description for tcp_max_syn_backlog
Since commit c5ed63d66f24(tcp: fix three tcp sysctls tuning),
sysctl_max_syn_backlog is determined by tcp_hashinfo->ehash_mask,
and the minimal value is 128, and it will increase in proportion to the
memory of machine.
The original description for tcp_max_syn_backlog and sysctl_max_syn_backlog
are out of date.

Changelog:
V2: update description for sysctl_max_syn_backlog

Signed-off-by: Weiping Pan <panweiping3@gmail.com>
Reviewed-by: Shan Wei <shanwei88@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-06 13:02:28 -05:00
Dan Carpenter f0a98ae8db openvswitch: small potential memory leak in ovs_vport_alloc()
We're unlikely to hit this leak, but the static checkers complain if we
don't take care of it.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-06 12:58:57 -05:00
John W. Linville d39aeaf260 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2011-12-06 10:47:12 -05:00
Igor Maravic 40e4783ee6 ipv4: arp: Cleanup in arp.c
Use "IS_ENABLED(CONFIG_FOO)" macro instead of
"defined(CONFIG_FOO) || defined(CONFIG_FOO_MODULE)"

Signed-off-by: Igor Maravic <igorm@etf.rs>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-06 00:34:40 -05:00
Eric Dumazet 0a5912db7b tcp: remove TCP_OFF and TCP_PAGE macros
As mentioned by Joe Perches, TCP_OFF() and TCP_PAGE() macros are
useless.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-05 18:30:03 -05:00
Eric Dumazet b474ae7760 bql: fix CONFIG_XPS=n build
netdev_queue_release() should be called even if CONFIG_XPS=n
to properly release device reference.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-05 18:30:03 -05:00
Eric Dumazet 4fa48bf3c7 tcp: fix tcp_trim_head()
commit f07d960df3 (tcp: avoid frag allocation for small frames)
breaked assumption in tcp stack that skb is either linear (skb->data_len
== 0), or fully fragged (skb->data_len == skb->len)

tcp_trim_head() made this assumption, we must fix it.

Thanks to Vijay for providing a very detailed explanation.

Reported-by: Vijay Subramanian <subramanian.vijay@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-05 18:30:03 -05:00