Commit graph

20265 commits

Author SHA1 Message Date
Allan Stephens
9aa88c2a50 tipc: Enhance sending of bulk name table messages
Modifies the initial transfer of name table entries to a new neighboring
node so that the messages are enqueued as a unit, rather than individually.

The revised algorithm now locates the link carrying the message only once,
and eliminates unnecessary checks for link congestion, message fragmentation,
and message bundling that are not required when sending these messages.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-09-17 22:55:10 -04:00
Paul Gortmaker
1c553bb52e tipc: relocate/coalesce node cast in tipc_named_node_up
Functions like this are called using unsigned longs from
function pointers.  In this case, the function is passed in
a node which is normally internally treated as a u32 by TIPC.

Rather than add more casts into this function in the future
for each added use of node within, move the cast to a single
place on a local.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-09-17 22:55:09 -04:00
Allan Stephens
149ce37c8d tipc: Prevent fragmented messages during initial name table exchange
Reduces the maximum size of messages sent during the initial exchange
of name table information between two nodes to be no larger than the
MTU of the first link established between the nodes. This ensures that
messages will never need to be fragmented, which would add unnecessary
overhead to the name table synchronization mechanism.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-09-17 22:55:08 -04:00
Allan Stephens
909234cdd2 tipc: Lower limits for number of bearers and media types
Reduces the number of bearers a node can support to 2, which can use
identical or non-identical media. This change won't impact users,
since they are currently limited to a maximum of 2 Ethernet bearers,
and will save memory by eliminating a number of unused entries in
TIPC's media and bearer arrays.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-09-17 22:55:07 -04:00
Allan Stephens
18abf0fb6b tipc: Remove redundant search when enabling bearer
Removes obsolete code that searches for an Ethernet bearer structure entry
to use for a newly enabled bearer, since this search is now performed
at the start of the enabling algorithm.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-09-17 22:55:06 -04:00
Allan Stephens
bcd326e844 tipc: Fix unsafe device list search when enabling bearer
Ensures that the device list lock is held while trying to locate
the Ethernet device used by a newly enabled bearer, so that the
addition or removal of a device does not cause problems.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-09-17 22:55:05 -04:00
Allan Stephens
b4b5610223 tipc: Ensure both nodes recognize loss of contact between them
Enhances TIPC to ensure that a node that loses contact with a
neighboring node does not allow contact to be re-established until
it sees that its peer has also recognized the loss of contact.

Previously, nodes that were connected by two or more links could
encounter a situation in which node A would lose contact with node B
on all of its links, purge its name table of names published by B,
and then fail to repopulate those names once contact with B was restored.
This would happen because B was able to re-establish one or more links
so quickly that it never reached a point where it had no links to A --
meaning that B never saw a loss of contact with A, and consequently
didn't re-publish its names to A.

This problem is now prevented by enhancing the cleanup done by TIPC
following a loss of contact with a neighboring node to ensure that
node A ignores all messages sent by B until it receives a LINK_PROTOCOL
message that indicates B has lost contact with A, thereby preventing
the (re)establishment of links between the nodes. The loss of contact
is recognized when a RESET or ACTIVATE message is received that has
a "redundant link exists" field of 0, indicating that B's sending link
endpoint is in a reset state and that B has no other working links.

Additionally, TIPC now suppresses the sending of (most) link protocol
messages to a neighboring node while it is cleaning up after an earlier
loss of contact with that node. This stops the peer node from prematurely
activating its link endpoint, which would prevent TIPC from later
activating its own end. TIPC still allows outgoing RESET messages to
occur during cleanup, to avoid problems if its own node recognizes
the loss of contact first and tries to notify the peer of the situation.

Finally, TIPC now recognizes an impending loss of contact with a peer node
as soon as it receives a RESET message on a working link that is the
peer's only link to the node, and ensures that the link protocol
suppression mentioned above goes into effect right away -- that is,
even before its own link endpoints have failed. This is necessary to
ensure correct operation when there are redundant links between the nodes,
since otherwise TIPC would send an ACTIVATE message upon receiving a RESET
on its first link and only begin suppressing when a RESET on its second
link was received, instead of initiating suppression with the first RESET
message as it needs to.

Note: The reworked cleanup code also eliminates a check that prevented
a link endpoint's discovery object from responding to incoming messages
while stale name table entries are being purged. This check is now
unnecessary and would have slowed down re-establishment of communication
between the nodes in some situations.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-09-17 22:55:03 -04:00
Allan Stephens
4b3743ef2c tipc: Ensure congested links receive bearer status updates
Modifies code that disables a bearer to ensure that all of its links
are deleted, not just its uncongested links. Similarly, modifies code
that blocks a bearer to ensure that all of its links are reset, not
just its uncongested links.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-09-01 11:16:37 -04:00
Allan Stephens
a0f40f02ef tipc: Prevent rounding issues when saving connect timeout option
Saves a socket's TIPC_CONN_TIMEOUT socket option value in its original
form (milliseconds), rather than jiffies. This ensures that the exact
value set using setsockopt() is always returned by getsockopt(), without
being subject to rounding issues introduced by a ms->jiffies->ms
conversion sequence.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-09-01 11:16:37 -04:00
Allan Stephens
ff60af8c16 tipc: Eliminate redundant check when sending messages
Eliminates code in tipc_send_buf_fast() that handles messages
sent to a destination on the current node, since the only caller
of the routine only passes in messages destined for other nodes.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-09-01 11:16:37 -04:00
Allan Stephens
0f38513d22 tipc: Remove obsolete congestion handling when sending a broadcast NACK
Eliminates obsolete code that handles broadcast bearer congestion when
the broadast link sends a NACK message, since the broadcast pseudo-bearer
never becomes blocked.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-09-01 11:16:37 -04:00
Allan Stephens
9f6bdcd428 tipc: Discard incoming broadcast messages that are unexpected
Modifies TIPC's incoming broadcast packet handler to discard messages
that cannot legally be sent over the broadcast link, including:

- broadcast protocol messages that do no contain state information
- payload messages that are not named multicast messages
- any other form of message except for bundled messages, fragmented
  messages, and name distribution messages.

These checks are needed to prevent TIPC from handing an unexpected
message to a routine that isn't prepared to handle it, which could
lead to incorrect processing (up to and including invalid memory
references caused by attempts to access message fields that aren't
present in the message).

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-09-01 11:16:37 -04:00
Allan Stephens
693d03ae3c tipc: Remove deferred queue head caching during broadcast message reception
Modifies TIPC's incoming broadcast packet handler so that it no longer
pre-reads information about the deferred packet queue, since the cached
value is unreliable once the associated node lock has been released.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-09-01 11:16:36 -04:00
Allan Stephens
5d3c488dfe tipc: Fix node lock problems during broadcast message reception
Modifies TIPC's incoming broadcast packet handler to ensure that the
node lock associated with the sender of the packet is held whenever
node-related data structure fields are accessed. The routine is also
restructured with a single exit point, making it easier to ensure
the node lock is properly released and the incoming packet is properly
disposed of.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-09-01 11:16:36 -04:00
Allan Stephens
169073db44 tipc: Prevent broadcast link stalling when another node fails
Ensure that broadcast link messages that have not been acknowledged
by a newly failed node do not get an implied acknowledgement until the
failed node is removed from the broadcast link's map of reachable nodes.

Previously, a race condition allowed a new broadcast link message to be
sent after the implicit acknowledgement processing was completed, but
before the map of reachable nodes was updated, resulting in the message
having an expected acknowledgement count that required the failed node
to explicitly acknowledge the message. Since this would never occur
the new message would remain in the broadcast link's transmit queue
forever, eventually causing the link to become congested and "stall".
Delaying the implicit acknowledgement processing until after the update
of the map of reachable nodes eliminates this race condition and prevents
stalling.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-09-01 11:16:36 -04:00
Allan Stephens
c5bd4d85d3 tipc: Enhance cleanup of broadcast link when contact with node is lost
Enhances cleanup of broadcast link-related information when contact
with a node is lost.

1) All broadcast link-related cleanup now occurs only if the lost node
   was capable of communicating over the broadcast link.

2) Following cleanup, the lost node is marked as no longer supporting
   the broadcast link, ensuring that any remaining broadcast messages
   received from that node prior to the re-establishment of a normal
   communication link are ignored.

Thanks to Surya [Suryanarayana.Garlapati@emerson.com] for contributing
a prototype version of this patch.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-09-01 11:16:36 -04:00
Allan Stephens
23f0ff906a tipc: Remove non-executable code to handle broadcast bearer congestion
Eliminates code associated with the sending of unsent broadcast link
traffic when the broadcast pseudo-bearer becomes unblocked following a
temporary congestion situation. This code is non-executable because the
broadcast pseudo-bearer never becomes blocked [see tipc_bcbearer_send()].

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-09-01 11:16:36 -04:00
Allan Stephens
2ff9f924a5 tipc: Cosmetic changes to broadcast bearer send routine
Updates the comments in the broadcast bearer send routine to more
accurately describe the processing done by the routine. Also replaces
the improper use of a TIPC payload message error status symbol (in a place
that has nothing to do with such errors) with its numeric equivalent.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-09-01 11:16:36 -04:00
Allan Stephens
2e2d9be845 tipc: Update obsolete references to multicast link
Updates TIPC's broadcast link in a couple of places that were missed
during the transition from its former name ("multicast-link") to its
current name ("broadcast-link"). These changes are essentially cosmetic
and do not affect the overall operation of TIPC.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-09-01 11:16:35 -04:00
Allan Stephens
641c218d12 tipc: Enhance filtering of out-dated link reset messages
Ensure TIPC ignores an out-dated link reset message whose session
number predates the current session number. (Previously, TIPC only
ignored an out-date reset message whose session number was equal
to the current link session number.)

Out-dated link reset messages should not occur under normal circumstances;
however, they can be generated if a link endpoint is unable to send a
link reset message right away and queues it for later delivery, but the
queued message is not sent until after the link is established.

Thanks to Laser [gotolaser@gmail.com] for diagnosing the problem and
contributing a prototype patch.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-09-01 11:16:35 -04:00
Allan Stephens
f882cb7684 tipc: Initialize peer session field of newly created link endpoint
Initializes the peer session number field of a newly created link
endpoint to an invalid value. This eliminates the remote possibility
that it will accidentally match the session number used by the peer
the first time the link is activated, and cause the link to ignore
a valid RESET message.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-09-01 11:16:35 -04:00
Allan Stephens
062b4c99fe tipc: Display meaningful peer interface name during link creation
Sets the peer interface portion of the name of a newly created link
endpoint to "unknown". This ensures that state and statistics information
can be properly displayed during the time between the link endpoint's
creation and the time handshaking with its peer is completed.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-09-01 11:16:35 -04:00
Allan Stephens
ed33a9c4e3 tipc: Eliminate obsolete filter for unexpected unicast messages
Removes a test that ensures unicast link endpoints discard an incoming
message if it will not be consumed by the node itself and cannot be
forwarded to another node, since the preceding test already ensures that
the message is destined for this node and single-cluster TIPC no longer
performs message forwarding.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-09-01 11:16:35 -04:00
Allan Stephens
5adeb17c93 tipc: Remove obsolete manipulation of message re-route count field
Eliminates code that increments and validates the re-route count field
of payload messages, since the elimination of multi-cluster support
means that it is no longer necessary for TIPC to forward incoming messages
to another node. (The obsolete code was incorrect anyway, since it
incorrectly incremented the re-route count field of messages that
originated on the node that forwarded the message.)

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-09-01 11:16:35 -04:00
John W. Linville
ba6e5eb107 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2011-08-29 14:52:20 -04:00
Joe Perches
7ac2ed0cee caif: Remove OOM messages, use kzalloc
Remove per site OOM messages because they duplicate
the generic mm subsystem OOM message.

Use kzalloc instead of kmalloc/memset
when next to the OOM message removals.

Reduces object size (allyesconfig ~2%)

$ size -t drivers/net/caif/built-in.o.old net/caif/built-in.o.old
   text	   data	    bss	    dec	    hex	filename
  32297	    700	   8224	  41221	   a105	drivers/net/caif/built-in.o.old
  72159	   1317	  20552	  94028	  16f4c	net/caif/built-in.o.old
 104456	   2017	  28776	 135249	  21051	(TOTALS)
$ size -t drivers/net/caif/built-in.o.new net/caif/built-in.o.new
   text	   data	    bss	    dec	    hex	filename
  31975	    700	   8184	  40859	   9f9b	drivers/net/caif/built-in.o.new
  70748	   1317	  20152	  92217	  16839	net/caif/built-in.o.new
 102723	   2017	  28336	 133076	  207d4	(TOTALS)

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-28 17:16:13 -04:00
Eric Dumazet
363437f40a net_sched: sfb: optimize enqueue on full queue
In case SFB queue is full (hard limit reached), there is no point
spending time to compute hash and maximum qlen/p_mark.

We instead just early drop packet.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-26 12:55:18 -04:00
chetan loke
bc59ba3991 af_packet: Prefixed tpacket_v3 structs to avoid name space collision
structs introduced in tpacket_v3 implementation are prefixed with 'tpacket'
to avoid namespace collision.

Compile tested.

Signed-off-by: Chetan Loke <loke.chetan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-26 12:38:44 -04:00
Eliad Peller
9533b4ac86 mac80211: add uapsd_queues and max_sp params fields
Add uapsd_queues and max_sp fields to ieee80211_sta.
These fields might be needed by low-level drivers in
order to configure the AP.

Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-26 10:47:56 -04:00
Eliad Peller
c75786c9ef nl80211/cfg80211: add STA WME parameters
Add new NL80211_ATTR_STA_WME nested attribute that contains
wme params needed by the low-level driver (uapsd_queues and
max_sp).

Add these params to the station_parameters struct as well.

Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-26 10:47:56 -04:00
Arik Nemtsov
a21fa87e3a mac80211: allow action frames with unknown BSSID in GO mode
When operating as a P2P GO, we receive some P2P action frames where the
BSSID is set to the peer MAC address. Specifically, this occurs for
invitation responses. These are valid action frames and they should be
passed up.

Signed-off-by: Arik Nemtsov <arik@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-26 10:47:56 -04:00
Guy Eilam
2a33bee275 mac80211: fix race condition between assoc_done and first EAP packet
When associating to an AP, the station might miss the first EAP
packet that the AP sends due to a race condition between the association
success procedure and the rx flow in mac80211.
In such cases, the packet might fall in ieee80211_rx_h_check due to
the fact that the relevant rx->sta wasn't allocated yet.
Allocation of the relevant station info struct before actually
sending the association request and setting it with a new
dummy_sta flag solve this problem.
The station will accept only EAP packets from the AP while it
is in the pre-association/dummy state.
This dummy station entry is not seen by normal sta_info_get()
calls, only by sta_info_get_bss_rx().
The driver is not notified for the first insertion of the
dummy station. The driver is notified only after the association
is complete and the dummy flag is removed from the station entry.
That way, all the rest of the code flow should be untouched by
this change.

Signed-off-by: Guy Eilam <guy@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-26 10:47:55 -04:00
Guy Eilam
8c71df7a2f mac80211: refactor sta_info_insert_rcu to 3 main stages
Divided the sta_info_insert_rcu function to 3 mini-functions:
sta_info_insert_check - the initial checks done when inserting
a new station
sta_info_insert_ibss - the function that handles the station
addition for IBSS interfaces
sta_info_insert_non_ibss - the function that handles the station
addition in other cases

The outer API was not changed.
The refactoring was done for better usage of the different
stages in the station addition in new scenarios added
in the next commit.

Signed-off-by: Guy Eilam <guy@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-26 10:47:55 -04:00
Thomas Pedersen
c613366113 mac80211: mesh gate fixes
Since a v1 of the mesh gate series was accidentally applied, this patch
contains the changes in v2.

These are:
	- automatically make mesh gate a root node.
	- use TU_TO_EXP_TIME macro.
	- initialize timer instead of checking for NULL timer function.
	- cleanups.

Signed-off-by: Thomas Pedersen <thomas@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-26 10:39:23 -04:00
Tim Chen
0856a30409 Scm: Remove unnecessary pid & credential references in Unix socket's send and receive path
Patch series 109f6e39..7361c36c back in 2.6.36 added functionality to
allow credentials to work across pid namespaces for packets sent via
UNIX sockets.  However, the atomic reference counts on pid and
credentials caused plenty of cache bouncing when there are numerous
threads of the same pid sharing a UNIX socket.  This patch mitigates the
problem by eliminating extraneous reference counts on pid and
credentials on both send and receive path of UNIX sockets. I found a 2x
improvement in hackbench's threaded case.

On the receive path in unix_dgram_recvmsg, currently there is an
increment of reference count on pid and credentials in scm_set_cred.
Then there are two decrement of the reference counts.  Once in scm_recv
and once when skb_free_datagram call skb->destructor function
unix_destruct_scm.  One pair of increment and decrement of ref count on
pid and credentials can be eliminated from the receive path.  Until we
destroy the skb, we already set a reference when we created the skb on
the send side.

On the send path, there are two increments of ref count on pid and
credentials, once in scm_send and once in unix_scm_to_skb.  Then there
is a decrement of the reference counts in scm_destroy's call to
scm_destroy_cred at the end of unix_dgram_sendmsg functions.   One pair
of increment and decrement of the reference counts can be removed so we
only need to increment the ref counts once.

By incorporating these changes, for hackbench running on a 4 socket
NHM-EX machine with 40 cores, the execution of hackbench on
50 groups of 20 threads sped up by factor of 2.

Hackbench command used for testing:
./hackbench 50 thread 2000

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-24 19:41:13 -07:00
Michio Honda
6af29ccc22 sctp: Bundle HEAERTBEAT into ASCONF_ACK
With this patch a HEARTBEAT chunk is bundled into the ASCONF-ACK
for ADD IP ADDRESS, confirming the new destination as quickly as
possible.

Signed-off-by: Michio Honda <micchie@sfc.wide.ad.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-24 19:41:13 -07:00
Michio Honda
f207c050fb sctp: HEARTBEAT negotiation after ASCONF
This patch fixes BUG that the ASCONF receiver transmits DATA chunks
to the newly added UNCONFIRMED destination.

Signed-off-by: Michio Honda <micchie@sfc.wide.ad.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-24 19:41:09 -07:00
Nandita Dukkipati
a262f0cdf1 Proportional Rate Reduction for TCP.
This patch implements Proportional Rate Reduction (PRR) for TCP.
PRR is an algorithm that determines TCP's sending rate in fast
recovery. PRR avoids excessive window reductions and aims for
the actual congestion window size at the end of recovery to be as
close as possible to the window determined by the congestion control
algorithm. PRR also improves accuracy of the amount of data sent
during loss recovery.

The patch implements the recommended flavor of PRR called PRR-SSRB
(Proportional rate reduction with slow start reduction bound) and
replaces the existing rate halving algorithm. PRR improves upon the
existing Linux fast recovery under a number of conditions including:
  1) burst losses where the losses implicitly reduce the amount of
outstanding data (pipe) below the ssthresh value selected by the
congestion control algorithm and,
  2) losses near the end of short flows where application runs out of
data to send.

As an example, with the existing rate halving implementation a single
loss event can cause a connection carrying short Web transactions to
go into the slow start mode after the recovery. This is because during
recovery Linux pulls the congestion window down to packets_in_flight+1
on every ACK. A short Web response often runs out of new data to send
and its pipe reduces to zero by the end of recovery when all its packets
are drained from the network. Subsequent HTTP responses using the same
connection will have to slow start to raise cwnd to ssthresh. PRR on
the other hand aims for the cwnd to be as close as possible to ssthresh
by the end of recovery.

A description of PRR and a discussion of its performance can be found at
the following links:
- IETF Draft:
    http://tools.ietf.org/html/draft-mathis-tcpm-proportional-rate-reduction-01
- IETF Slides:
    http://www.ietf.org/proceedings/80/slides/tcpm-6.pdf
    http://tools.ietf.org/agenda/81/slides/tcpm-2.pdf
- Paper to appear in Internet Measurements Conference (IMC) 2011:
    Improving TCP Loss Recovery
    Nandita Dukkipati, Matt Mathis, Yuchung Cheng

Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-24 19:40:40 -07:00
chetan loke
f6fb8f100b af-packet: TPACKET_V3 flexible buffer implementation.
1) Blocks can be configured with non-static frame-size.
2) Read/poll is at a block-level(as opposed to packet-level).
3) Added poll timeout to avoid indefinite user-space wait on idle links.
4) Added user-configurable knobs:
   4.1) block::timeout.
   4.2) tpkt_hdr::sk_rxhash.

Changes:
C1) tpacket_rcv()
    C1.1) packet_current_frame() is replaced by packet_current_rx_frame()
          The bulk of the processing is then moved in the following chain:
          packet_current_rx_frame()
            __packet_lookup_frame_in_block
              fill_curr_block()
              or
                retire_current_block
                dispatch_next_block
              or
              return NULL(queue is plugged/paused)

Signed-off-by: Chetan Loke <loke.chetan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-24 19:40:40 -07:00
Alexander Smirnov
44331fe2aa IEEE802.15.4: 6LoWPAN basic support
This patch provides base support for transmission of IPv6 packets as
well as the formation of IPv6 link-local addresses and statelessly
autoconfigured addresses on top of IEEE 802.15.4 networks.

For more information please look at the RFC4944 "Compression Format
for IPv6 Datagrams in Low Power and Losst Networks (6LoWPAN).

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-24 19:36:06 -07:00
Ian Campbell
804cf14ea5 net: xfrm: convert to SKB frag APIs
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-24 17:52:12 -07:00
Ian Campbell
408dadf03f net: ipv6: convert to SKB frag APIs
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: "Pekka Savola (ipv6)" <pekkas@netcore.fi>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-24 17:52:12 -07:00
Ian Campbell
aff65da0f1 net: ipv4: convert to SKB frag APIs
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: "Pekka Savola (ipv6)" <pekkas@netcore.fi>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-24 17:52:11 -07:00
Ian Campbell
ea2ab69379 net: convert core to skb paged frag APIs
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-24 17:52:11 -07:00
Eric Dumazet
ec5efe7946 rps: support IPIP encapsulation
Skip IPIP header to get proper layer-4 information.

Like GRE tunnels, this only works if rxhash is not already provided by
the device itself (ethtool -K ethX rxhash off), to allow kernel compute
a software rxhash.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-24 16:13:55 -07:00
Samuel Ortiz
e8753043f9 NFC: Reserve tx head and tail room
We can have the NFC core layer allocating the tx head and tail
room for the drivers and avoid 1 or more SKBs copy on write on
the Tx path.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-24 14:41:44 -04:00
Javier Cardona
16dd7267f4 {nl,cfg,mac}80211: let userspace make meshif mesh gate
Allow userspace to set NL80211_MESHCONF_GATE_ANNOUNCEMENTS attribute,
which will advertise this mesh node as being a mesh gate.
NL80211_HWMP_ROOTMODE must be set or this will do nothing.

Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-24 13:59:43 -04:00
Javier Cardona
0507e159a2 {nl,cfg,mac}80211: let userspace set RANN interval
Allow userspace to set Root Announcement Interval for our mesh
interface. Also, RANN interval is now in proper units of TUs.

Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-24 13:59:43 -04:00
Javier Cardona
699403dbd4 {nl,mac}80211: add missing root mode meshconf entries
This fix allows userspace to mark a meshif as a root node.

Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-24 13:59:43 -04:00
Javier Cardona
5ee68e5b39 mac80211: mesh gate implementation
In this implementation, a mesh gate is a root node with a certain bit
set in its RANN flags. The mpath to this root node is marked as a path
to a gate, and added to our list of known gates for this if_mesh. Once a
path discovery process fails, we forward the unresolved frames to a
known gate. Thanks to Luis Rodriguez for refactoring and bug fix help.

Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-24 13:59:42 -04:00