dev->real_num_tx_queues is correctly set already in alloc_netdev_mqs.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As rt_iif represents input device even for packets
coming from loopback with output route, it is not an unique
key specific to input routes. Now rt_route_iif has such role,
it was fl.iif in 2.6.38, so better to change the checks at
some places to save CPU cycles and to restore 2.6.38 semantics.
compare_keys:
- input routes: only rt_route_iif matters, rt_iif is same
- output routes: only rt_oif matters, rt_iif is not
used for matching in __ip_route_output_key
- now we are back to 2.6.38 state
ip_route_input_common:
- matching rt_route_iif implies input route
- compared to 2.6.38 we eliminated one rth->fl.oif check
because it was not needed even for 2.6.38
compare_hash_inputs:
Only the change here is not an optimization, it has
effect only for output routes. I assume I'm restoring
the original intention to ignore oif, it was using fl.iif
- now we are back to 2.6.38 state
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Free the locally allocated table and newinfo as done in adjacent error
handling code.
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Call cipso_v4_doi_putdef in the case of the failure of the allocation of
entry. Reverse the order of the error handling code at the end of the
function and insert more labels in order to reduce the number of
unnecessary calls to kfree.
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch corrects an erroneous update of credential's gid with uid
introduced in commit 257b5358b3 since 2.6.36.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using a gcc 4.4.3, warnings are emitted for a possibly uninitialized use
of ecn_ok.
This can happen if cookie_check_timestamp() returns due to not having
seen a timestamp. Defaulting to ecn off seems like a reasonable thing
to do in this case, so initialized ecn_ok to false.
Signed-off-by: Mike Waychison <mikew@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a PREQ or PREP is received from an intermediate node, it contains
useful information for path selection but it doesn't include the
originator's sequence number. Therefore, when updating the mesh path
to that intermediate node, we should not set the MESH_PATH_SN_VALID
flag. BUT, if the flag is set, it should not be unset as we might have
received a valid sequence number for that intermediate node in the past.
This issue was reported, fixed and tested by Ya Bo (游波) and Pedro
Larbig (ASPj).
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
drivers might assume sta.drv_priv is clear while
the sta is added, so clear it on reconfinguration.
Signed-off-by: Eliad Peller <eliad@wizery.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When user space SME/MLME (e.g., hostapd) is not used in AP mode, the
IEs from the (Re)Association Request frame that was processed in
firmware need to be made available for user space (e.g., RSN IE for
hostapd). Allow this to be done with cfg80211_new_sta().
Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Drivers that support frame transmission with mgmt_tx() may not support
driver-based offchannel TX. Use mgmt_tx_cancel_wait instead of mgmt_tx
when figuring out whether to indicate support for this with
NL80211_ATTR_OFFCHANNEL_TX_OK.
Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
commit 07bd8df5df
(sch_sfq: fix peek() implementation) changed sfq to use generic
peek helper.
This makes HFSC complain about a non-work-conserving child qdisc, if
prio with sfq child is used within hfsc:
hfsc peeks into prio qdisc, which will then peek into sfq.
returned skb is stashed in sch->gso_skb.
Next, hfsc tries to dequeue from prio, but prio will call sfq dequeue
directly, which may return NULL instead of previously peeked-at skb.
Have prio call qdisc_dequeue_peeked, so sfq->dequeue() is
not called in this case.
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Followup of commit f2c31e32b3 (fix NULL dereferences in
check_peer_redir()).
We need to make sure dst neighbour doesnt change in dst_ifdown().
Fix some sparse errors.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This ensures the neighbor entries associated with the bridge
dev are flushed, also invalidating the associated cached L2 headers.
This means we br_add_if/br_del_if ports to implement hand-over and
not wind up with bridge packets going out with stale MAC.
This means we can also change MAC of port device and also not wind
up with bridge packets going out with stale MAC.
This builds on Stephen Hemminger's patch, also handling the br_del_if
case and the port MAC change case.
Cc: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Andrei Warkentin <andreiw@motorola.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Acked-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Some drivers (ath9k for example) are using skb->protocol to treat EAPOL
frames somehow special (disallow aggregation for example).
When running in AP mode hostapd injects the EAPOL frames through a
monitor interface and thus skb->protocol isn't set at all. Hence, if the
injected frame is a data frame and carries a rfc1042 headaer update the
skb->protocol field accordingly.
Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Several uses were missing terminating newlines.
Typo fix and macro neatening.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Make mesh_preq_queue_lock locking consistent with mesh_queue_preq() using
spin_lock_bh().
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
According to 802.11-2007, 7.3.1.14 it is compliant to use a buf_size of
0 in ADDBA requests. But some devices (AVM Fritz Stick N) arn't able to
handle that correctly and will reply with an ADDBA reponse with a
buf_size of 0 which in turn will disallow BA sessions for these
devices.
To work around this problem, initialize hw.max_tx_aggregation_subframes
to the maximum AMPDU buffer size 0x40.
Using 0 as default for the bufsize was introduced in commit
5dd36bc933 (mac80211: allow advertising
correct maximum aggregate size).
Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
If we receive an ADDBA response with status code 0 and a buf_size of 0
we should stop the TX BA session as otherwise we'll end up queuing
frames in ieee80211_tx_prep_agg forever instead of sending them out as
non AMPDUs.
This fixes a problem with AVM Fritz Stick N wireless devices where
frames to this device are not transmitted anymore by mac80211.
Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
For iwlwifi, I decided not to use this API since
it just increased the complexity for little gain.
Since nobody else intends to use it, let's kill
it again. If anybody later needs to have it, we
can always revive it then.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
A lot of code is dedicated to giving drivers the
ability to use cfg80211's wext handlers without
completely converting. However, only orinoco is
currently using this, and it is only partially
using it.
We reduce the size of both the source and binary
by removing those that nobody needs. If a driver
shows up that needs it during conversion, we can
add back those that are needed.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
A lot of drivers erroneously use wext constants
and don't notice since cfg80211.h includes them.
Make this more split up so drivers needing wext
compatibility from cfg80211 need to explicitly
include that from cfg80211-wext.h.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Make sure skb dst has reference when moving to
another context. Currently, I don't see protocols that can
hit it when sending broadcasts/multicasts to loopback using
noref dsts, so it is just a precaution.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
The raw sockets can provide source address for
routing but their privileges are not considered. We
can provide non-local source address, make sure the
FLOWI_FLAG_ANYSRC flag is set if socket has privileges
for this, i.e. based on hdrincl (IP_HDRINCL) and
transparent flags.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
TCP in some cases uses different global (raw) socket
to send RST and ACK. The transparent flag is not set there.
Currently, it is a problem for rerouting after the previous
change.
Fix it by simplifying the checks in ip_route_me_harder
and use FLOWI_FLAG_ANYSRC even for sockets. It looks safe
because the initial routing allowed this source address to
be used and now we just have to make sure the packet is rerouted.
As a side effect this also allows rerouting for normal
raw sockets that use spoofed source addresses which was not possible
even before we eliminated the ip_route_input call.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
IP_PKTOPTIONS is broken for 32-bit applications running
in COMPAT mode on 64-bit kernels.
This happens because msghdr's msg_flags field is always
set to zero. When running in COMPAT mode this should be
set to MSG_CMSG_COMPAT instead.
Signed-off-by: Tiberiu Szocs-Mihai <tszocs@ixiacom.com>
Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
compare_keys and ip_route_input_common rely on
rt_oif for distinguishing of input and output routes
with same keys values. But sometimes the input route has
also same hash chain (keyed by iif != 0) with the output
routes (keyed by orig_oif=0). Problem visible if running
with small number of rhash_entries.
Fix them to use rt_route_iif instead. By this way
input route can not be returned to users that request
output route.
The patch fixes the ip_rt_bug errors that were
reported in ip_local_out context, mostly for 255.255.255.255
destinations.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Computers have become a lot faster since we compromised on the
partial MD4 hash which we use currently for performance reasons.
MD5 is a much safer choice, and is inline with both RFC1948 and
other ISS generators (OpenBSD, Solaris, etc.)
Furthermore, only having 24-bits of the sequence number be truly
unpredictable is a very serious limitation. So the periodic
regeneration and 8-bit counter have been removed. We compute and
use a full 32-bit sequence number.
For ipv6, DCCP was found to use a 32-bit truncated initial sequence
number (it needs 43-bits) and that is fixed here as well.
Reported-by: Dan Kaminsky <dan@doxpara.com>
Tested-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
When support for binding to 'mapped INADDR_ANY (::ffff.0.0.0.0)' was added
in 0f8d3c7ac3 the rest of the code
wasn't told so now it's possible to bind IPv6 datagram socket to
::ffff.0.0.0.0, connect it to another IPv4 address and it will all
work except for getsockhame() which does not return the local address
as expected.
To give getsockname() something to work with check for 'mapped INADDR_ANY'
when connecting and update the in-core source addresses appropriately.
Signed-off-by: Max Matveev <makc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The sendmmsg() introduced by commit 228e548e "net: Add sendmmsg socket system
call" is capable of sending to multiple different destination addresses.
SMACK is using destination's address for checking sendmsg() permission.
However, security_socket_sendmsg() is called for only once even if multiple
different destination addresses are passed to sendmmsg().
Therefore, we need to call security_socket_sendmsg() for each destination
address rather than only the first destination address.
Since calling security_socket_sendmsg() every time when only single destination
address was passed to sendmmsg() is a waste of time, omit calling
security_socket_sendmsg() unless destination address of previous datagram and
that of current datagram differs.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Anton Blanchard <anton@samba.org>
Cc: stable <stable@kernel.org> [3.0+]
Signed-off-by: David S. Miller <davem@davemloft.net>
To limit the amount of time we can spend in sendmmsg, cap the
number of elements to UIO_MAXIOV (currently 1024).
For error handling an application using sendmmsg needs to retry at
the first unsent message, so capping is simpler and requires less
application logic than returning EINVAL.
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable <stable@kernel.org> [3.0+]
Signed-off-by: David S. Miller <davem@davemloft.net>
sendmmsg uses a similar error return strategy as recvmmsg but it
turns out to be a confusing way to communicate errors.
The current code stores the error code away and returns it on the next
sendmmsg call. This means a call with completely valid arguments could
get an error from a previous call.
Change things so we only return an error if no datagrams could be sent.
If less than the requested number of messages were sent, the application
must retry starting at the first failed one and if the problem is
persistent the error will be returned.
This matches the behaviour of other syscalls like read/write - it
is not an error if less than the requested number of elements are sent.
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable <stable@kernel.org> [3.0+]
Signed-off-by: David S. Miller <davem@davemloft.net>
Gergely Kalman reported crashes in check_peer_redir().
It appears commit f39925dbde (ipv4: Cache learned redirect
information in inetpeer.) added a race, leading to possible NULL ptr
dereference.
Since we can now change dst neighbour, we should make sure a reader can
safely use a neighbour.
Add RCU protection to dst neighbour, and make sure check_peer_redir()
can be called safely by different cpus in parallel.
As neighbours are already freed after one RCU grace period, this patch
should not add typical RCU penalty (cache cold effects)
Many thanks to Gergely for providing a pretty report pointing to the
bug.
Reported-by: Gergely Kalman <synapse@hippy.csoma.elte.hu>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When assigning a NULL value to an RCU protected pointer, no barrier
is needed. The rcu_assign_pointer, used to handle that but will soon
change to not handle the special case.
Convert all rcu_assign_pointer of NULL value.
//smpl
@@ expression P; @@
- rcu_assign_pointer(P, NULL)
+ RCU_INIT_POINTER(P, NULL)
// </smpl>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update the code to handle some of the differences between
RFC 3041 and RFC 4941, which obsoletes it. Also a couple
of janitorial fixes.
- Allow router advertisements to increase the lifetime of
temporary addresses. This was not allowed by RFC 3041,
but is specified by RFC 4941. It is useful when RA
lifetimes are lower than TEMP_{VALID,PREFERRED}_LIFETIME:
in this case, the previous code would delete or deprecate
addresses prematurely.
- Change the default of MAX_RETRY to 3 per RFC 4941.
- Add a comment to clarify that the preferred and valid
lifetimes in inet6_ifaddr are relative to the timestamp.
- Shorten lines to 80 characters in a couple of places.
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since skb_copy_bits() is called from assembly, add a fat comment to make
clear we should think twice before changing its prototype.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
My @hp.com will no longer be valid starting August 5, 2011 so an update is
necessary. My new email address is employer independent so we don't have
to worry about doing this again any time soon.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reported-by: Pascal Hambourg <pascal@plouf.fr.eu.org>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
The test is off by one so we'd read past the end of the
wiphy->bands[] array on the next line.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch causes CCID-2 to check the Ack Ratio after reducing the congestion
window. If the Ack Ratio is greater than the congestion window, it is
reduced. This prevents timeouts caused by an Ack Ratio larger than the
congestion window.
In this situation, we choose to set the Ack Ratio to half the congestion window
(or one if that's zero) so that if we loose one ack we don't trigger a timeout.
Signed-off-by: Samuel Jero <sj323707@ohio.edu>
Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
This patch fixes an issue where CCID-2 will not increase the congestion
window for numerous RTTs after an idle period, application-limited period,
or a loss once the algorithm is in Congestion Avoidance.
What happens is that, when CCID-2 is in Congestion Avoidance mode, it will
increase hc->tx_packets_acked by one for every packet and will increment cwnd
every cwnd packets. However, if there is now an idle period in the connection,
cwnd will be reduced, possibly below the slow start threshold. This will
cause the connection to go into Slow Start. However, in Slow Start CCID-2
performs this test to increment cwnd every second ack:
++hc->tx_packets_acked == 2
Unfortunately, this will be incorrect, if cwnd previous to the idle period
was larger than 2 and if tx_packets_acked was close to cwnd. For example:
cwnd=50 and tx_packets_acked=45.
In this case, the current code, will increment tx_packets_acked until it
equals two, which will only be once tx_packets_acked (an unsigned 32-bit
integer) overflows.
My fix is simply to change that test for tx_packets_acked greater than or
equal to two in slow start.
Signed-off-by: Samuel Jero <sj323707@ohio.edu>
Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Add a check to prevent CCID-2 from increasing the cwnd greater than the
Sequence Window.
When the congestion window becomes bigger than the Sequence Window, CCID-2
will attempt to keep more data in the network than the DCCP Sequence Window
code considers possible. This results in the Sequence Window code issuing
a Sync, thereby inducing needless overhead. Further, if this occurs at the
sender, CCID-2 will never detect the problem because the Acks it receives
will indicate no losses. I have seen this cause a drop of 1/3rd in throughput
for a connection.
Also add code to adjust the Sequence Window to be about 5 times the number of
packets in the network (RFC 4340, 7.5.2) and to adjust the Ack Ratio so that
the remote Sequence Window will hold about 5 times the number of packets in
the network. This allows the congestion window to increase correctly without
being limited by the Sequence Window.
Signed-off-by: Samuel Jero <sj323707@ohio.edu>
Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
This uses the new feature-negotiation framework to signal Ack Ratio changes,
as required by RFC 4341, sec. 6.1.2.
That raises some problems with CCID-2, which at the moment can not cope
gracefully with Ack Ratios > 1. Since these issues are not directly related
to feature negotiation, they are marked by a FIXME.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Samuel Jero <sj323707@ohio.edu>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.uk>
If a connection is in the OPEN state, remove feature negotiation Confirm
options from the list of options after sending them once; as such options
are NOT supposed to be retransmitted and are ONLY supposed to be sent in
response to a Change option (RFC 4340 6.2).
Signed-off-by: Samuel Jero <sj323707@ohio.edu>
Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
This patch adds the receiver side and the (fast-path) activation part for
dynamic changes of non-negotiable (NN) parameters in (PART)OPEN state.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Samuel Jero <sj323707@ohio.edu>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.uk>
In contrast to static feature negotiation at the begin of a connection, this
patch introduces support for exchange of dynamically changing options.
Such an update/exchange is necessary in at least two cases:
* CCID-2's Ack Ratio (RFC 4341, 6.1.2) which changes during the connection;
* Sequence Window values that, as per RFC 4340, 7.5.2, should be sent "as
the connection progresses".
Both are non-negotiable (NN) features, which means that no new capabilities
are negotiated, but rather that changes in known parameters are brought
up-to-date at either end.
Thse characteristics are reflected by the implementation:
* only NN options can be exchanged after connection setup;
* an ack is scheduled directly after activation to speed up the update;
* CCIDs may request changes to an NN feature even if a negotiation for that
feature is already underway: this is required by CCID-2, where changes in
cwnd necessitate Ack Ratio changes, such that the previous Ack Ratio (which
is still being negotiated) would cause irrecoverable RTO timeouts (thanks
to work by Samuel Jero).
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Samuel Jero <sj323707@ohio.edu>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.uk>
commit 8efa885406 (sch_sfq: avoid giving spurious NET_XMIT_CN signals)
forgot to call qdisc_tree_decrease_qlen() to signal upper levels that a
packet (from another flow) was dropped, leading to various problems.
With help from Michal Soltys and Michal Pokrywka, who did a bisection.
Bugzilla ref: https://bugzilla.kernel.org/show_bug.cgi?id=39372
Debian ref: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=631945
Reported-by: Lucas Bocchi <lucas.bocchi@gmail.com>
Reported-and-bisected-by: Michal Pokrywka <wolfmoon@o2.pl>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Michal Soltys <soltys@ziu.info>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert array index from the loop bound to the loop index.
A simplified version of the semantic patch that fixes this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression e1,e2,ar;
@@
for(e1 = 0; e1 < e2; e1++) { <...
ar[
- e2
+ e1
]
...> }
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Even using percpu stats, we still hit tunnel dst_entry refcount in
ip6_tnl_xmit2()
Since we are in RCU locked section, we can use skb_dst_set_noref() and
avoid these atomic operations, leaving dst shared on cpus.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ipv6_destopt_rcv() runs with rcu_read_lock(), so there is no need to
take a temporay reference on dst_entry, even if skb is freed by
ip6_parse_tlv()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use RCU to avoid changing dst_entry refcount in fast path.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ICMP and ND are not fast path, but still we can avoid changing idev
refcount, using RCU.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix new kernel-doc warning in sunrpc:
Warning(net/sunrpc/xprt.c:196): No description found for parameter 'xprt'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the min and max bit lengths for AES-CTR (RFC3686) keys.
The number of bits in key spec is the key length (128/256)
plus 32 bits of nonce.
This change takes care of the "Invalid key length" errors
reported by setkey when specifying 288 bit keys for aes-ctr.
Signed-off-by: Tushar Gohad <tgohad@mvista.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (32 commits)
tg3: Remove 5719 jumbo frames and TSO blocks
tg3: Break larger frags into 4k chunks for 5719
tg3: Add tx BD budgeting code
tg3: Consolidate code that calls tg3_tx_set_bd()
tg3: Add partial fragment unmapping code
tg3: Generalize tg3_skb_error_unmap()
tg3: Remove short DMA check for 1st fragment
tg3: Simplify tx bd assignments
tg3: Reintroduce tg3_tx_ring_info
ASIX: Use only 11 bits of header for data size
ASIX: Simplify condition in rx_fixup()
Fix cdc-phonet build
bonding: reduce noise during init
bonding: fix string comparison errors
net: Audit drivers to identify those needing IFF_TX_SKB_SHARING cleared
net: add IFF_SKB_TX_SHARED flag to priv_flags
net: sock_sendmsg_nosec() is static
forcedeth: fix vlans
gianfar: fix bug caused by 87c288c6e9
gro: Only reset frag0 when skb can be pulled
...
After the last patch, We are left in a state in which only drivers calling
ether_setup have IFF_TX_SKB_SHARING set (we assume that drivers touching real
hardware call ether_setup for their net_devices and don't hold any state in
their skbs. There are a handful of drivers that violate this assumption of
course, and need to be fixed up. This patch identifies those drivers, and marks
them as not being able to support the safe transmission of skbs by clearning the
IFF_TX_SKB_SHARING flag in priv_flags
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Karsten Keil <isdn@linux-pingi.de>
CC: "David S. Miller" <davem@davemloft.net>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Patrick McHardy <kaber@trash.net>
CC: Krzysztof Halasa <khc@pm.waw.pl>
CC: "John W. Linville" <linville@tuxdriver.com>
CC: Greg Kroah-Hartman <gregkh@suse.de>
CC: Marcel Holtmann <marcel@holtmann.org>
CC: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pktgen attempts to transmit shared skbs to net devices, which can't be used by
some drivers as they keep state information in skbs. This patch adds a flag
marking drivers as being able to handle shared skbs in their tx path. Drivers
are defaulted to being unable to do so, but calling ether_setup enables this
flag, as 90% of the drivers calling ether_setup touch real hardware and can
handle shared skbs. A subsequent patch will audit drivers to ensure that the
flag is set properly
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Jiri Pirko <jpirko@redhat.com>
CC: Robert Olsson <robert.olsson@its.uu.se>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Alexey Dobriyan <adobriyan@gmail.com>
CC: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'nfs-for-3.1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (44 commits)
NFSv4: Don't use the delegation->inode in nfs_mark_return_delegation()
nfs: don't use d_move in nfs_async_rename_done
RDMA: Increasing RPCRDMA_MAX_DATA_SEGS
SUNRPC: Replace xprt->resend and xprt->sending with a priority queue
SUNRPC: Allow caller of rpc_sleep_on() to select priority levels
SUNRPC: Support dynamic slot allocation for TCP connections
SUNRPC: Clean up the slot table allocation
SUNRPC: Initalise the struct xprt upon allocation
SUNRPC: Ensure that we grab the XPRT_LOCK before calling xprt_alloc_slot
pnfs: simplify pnfs files module autoloading
nfs: document nfsv4 sillyrename issues
NFS: Convert nfs4_set_ds_client to EXPORT_SYMBOL_GPL
SUNRPC: Convert the backchannel exports to EXPORT_SYMBOL_GPL
SUNRPC: sunrpc should not explicitly depend on NFS config options
NFS: Clean up - simplify the switch to read/write-through-MDS
NFS: Move the pnfs write code into pnfs.c
NFS: Move the pnfs read code into pnfs.c
NFS: Allow the nfs_pageio_descriptor to signal that a re-coalesce is needed
NFS: Use the nfs_pageio_descriptor->pg_bsize in the read/write request
NFS: Cache rpc_ops in struct nfs_pageio_descriptor
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
merge fchmod() and fchmodat() guts, kill ancient broken kludge
xfs: fix misspelled S_IS...()
xfs: get rid of open-coded S_ISREG(), etc.
vfs: document locking requirements for d_move, __d_move and d_materialise_unique
omfs: fix (mode & S_IFDIR) abuse
btrfs: S_ISREG(mode) is not mode & S_IFREG...
ima: fmode_t misspelled as mode_t...
pci-label.c: size_t misspelled as mode_t
jffs2: S_ISLNK(mode & S_IFMT) is pointless
snd_msnd ->mode is fmode_t, not mode_t
v9fs_iop_get_acl: get rid of unused variable
vfs: dont chain pipe/anon/socket on superblock s_inodes list
Documentation: Exporting: update description of d_splice_alias
fs: add missing unlock in default_llseek()
This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>
Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (23 commits)
ceph: document unlocked d_parent accesses
ceph: explicitly reference rename old_dentry parent dir in request
ceph: document locking for ceph_set_dentry_offset
ceph: avoid d_parent in ceph_dentry_hash; fix ceph_encode_fh() hashing bug
ceph: protect d_parent access in ceph_d_revalidate
ceph: protect access to d_parent
ceph: handle racing calls to ceph_init_dentry
ceph: set dir complete frag after adding capability
rbd: set blk_queue request sizes to object size
ceph: set up readahead size when rsize is not passed
rbd: cancel watch request when releasing the device
ceph: ignore lease mask
ceph: fix ceph_lookup_open intent usage
ceph: only link open operations to directory unsafe list if O_CREAT|O_TRUNC
ceph: fix bad parent_inode calc in ceph_lookup_open
ceph: avoid carrying Fw cap during write into page cache
libceph: don't time out osd requests that haven't been received
ceph: report f_bfree based on kb_avail rather than diffing.
ceph: only queue capsnap if caps are dirty
ceph: fix snap writeback when racing with writes
...
Just a typo fix changing regulaotry to regulatory.
Signed-off-by: Mihai Moldovan <ionic@ionic.de>
CC: John W. Linville <linville@tuxdriver.com>
CC: Mohammed Shafi <shafi.wireless@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
At the beginning of wiphy_update_regulatory() a check is performed
whether the request is to be ignored. Then the request is sent to
the driver nevertheless. This happens even if last_request points
to NULL, leading to a crash in the driver:
[<bf01d864>] (lbs_set_11d_domain_info+0x28/0x1e4 [libertas]) from [<c03b714c>] (wiphy_update_regulatory+0x4d0/0x4f4)
[<c03b714c>] (wiphy_update_regulatory+0x4d0/0x4f4) from [<c03b4008>] (wiphy_register+0x354/0x420)
[<c03b4008>] (wiphy_register+0x354/0x420) from [<bf01b17c>] (lbs_cfg_register+0x80/0x164 [libertas])
[<bf01b17c>] (lbs_cfg_register+0x80/0x164 [libertas]) from [<bf020e64>] (lbs_start_card+0x20/0x88 [libertas])
[<bf020e64>] (lbs_start_card+0x20/0x88 [libertas]) from [<bf02cbd8>] (if_sdio_probe+0x898/0x9c0 [libertas_sdio])
Fix this by returning early. Also remove the out: label as it is
not any longer needed.
Signed-off-by: Sven Neumann <s.neumann@raumfeld.com>
Cc: linux-wireless@vger.kernel.org
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Daniel Mack <daniel@zonque.org>
Cc: stable@kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Keep track of when an outgoing message is ACKed (i.e., the server fully
received it and, presumably, queued it for processing). Time out OSD
requests only if it's been too long since they've been received.
This prevents timeouts and connection thrashing when the OSDs are simply
busy and are throttling the requests they read off the network.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
Workloads using pipes and sockets hit inode_sb_list_lock contention.
superblock s_inodes list is needed for quota, dirty, pagecache and
fsnotify management. pipe/anon/socket fs are clearly not candidates for
these.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* 'for-3.1' of git://linux-nfs.org/~bfields/linux:
nfsd: don't break lease on CLAIM_DELEGATE_CUR
locks: rename lock-manager ops
nfsd4: update nfsv4.1 implementation notes
nfsd: turn on reply cache for NFSv4
nfsd4: call nfsd4_release_compoundargs from pc_release
nfsd41: Deny new lock before RECLAIM_COMPLETE done
fs: locks: remove init_once
nfsd41: check the size of request
nfsd41: error out when client sets maxreq_sz or maxresp_sz too small
nfsd4: fix file leak on open_downgrade
nfsd4: remember to put RW access on stateid destruction
NFSD: Added TEST_STATEID operation
NFSD: added FREE_STATEID operation
svcrpc: fix list-corrupting race on nfsd shutdown
rpc: allow autoloading of gss mechanisms
svcauth_unix.c: quiet sparse noise
svcsock.c: include sunrpc.h to quiet sparse noise
nfsd: Remove deprecated nfsctl system call and related code.
NFSD: allow OP_DESTROY_CLIENTID to be only op in COMPOUND
Fix up trivial conflicts in Documentation/feature-removal-schedule.txt
* Merge akpm patch series: (122 commits)
drivers/connector/cn_proc.c: remove unused local
Documentation/SubmitChecklist: add RCU debug config options
reiserfs: use hweight_long()
reiserfs: use proper little-endian bitops
pnpacpi: register disabled resources
drivers/rtc/rtc-tegra.c: properly initialize spinlock
drivers/rtc/rtc-twl.c: check return value of twl_rtc_write_u8() in twl_rtc_set_time()
drivers/rtc: add support for Qualcomm PMIC8xxx RTC
drivers/rtc/rtc-s3c.c: support clock gating
drivers/rtc/rtc-mpc5121.c: add support for RTC on MPC5200
init: skip calibration delay if previously done
misc/eeprom: add eeprom access driver for digsy_mtc board
misc/eeprom: add driver for microwire 93xx46 EEPROMs
checkpatch.pl: update $logFunctions
checkpatch: make utf-8 test --strict
checkpatch.pl: add ability to ignore various messages
checkpatch: add a "prefer __aligned" check
checkpatch: validate signature styles and To: and Cc: lines
checkpatch: add __rcu as a sparse modifier
checkpatch: suggest using min_t or max_t
...
Did this as a merge because of (trivial) conflicts in
- Documentation/feature-removal-schedule.txt
- arch/xtensa/include/asm/uaccess.h
that were just easier to fix up in the merge than in the patch series.
We presently define all kinds of notifiers in notifier.h. This is not
necessary at all, since different subsystems use different notifiers, they
are almost non-related with each other.
This can also save much build time. Suppose I add a new netdevice event,
really I don't have to recompile all the source, just network related.
Without this patch, all the source will be recompiled.
I move the notify events near to their subsystem notifier registers, so
that they can be found more easily.
This patch:
It is not necessary to share the same notifier.h.
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: David Miller <davem@davemloft.net>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
No need to use int, its uses are boolean.
May save a few bytes one day.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
From: Randy Dunlap <rdunlap@xenotime.net>
Fix new kernel-doc warning in eth.c:
Warning(net/ethernet/eth.c:237): No description found for parameter 'type'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
If a device event generates gratuitous ARP messages, only primary
address is used for sending. This patch iterates through the whole
list. Tested with 2 IP addresses configuration on bonding interface.
Signed-off-by: Zoltan Kiss <schaman@sch.bme.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Original commit 2bda8a0c8af... "Disable router anycast
address for /127 prefixes" says:
| No need for matching code in addrconf_leave_anycast() as it
| will silently ignore any attempt to leave an unknown anycast
| address.
After analysis, because 1) we may add two or more prefixes on the
same interface, or 2)user may have manually joined that anycast,
we may hit chances to have anycast address which as if we had
generated one by /127 prefix and we should not leave from subnet-
router anycast address unconditionally.
CC: Bjørn Mork <bjorn@mork.no>
CC: Brian Haley <brian.haley@hp.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
fs: Merge split strings
treewide: fix potentially dangerous trailing ';' in #defined values/expressions
uwb: Fix misspelling of neighbourhood in comment
net, netfilter: Remove redundant goto in ebt_ulog_packet
trivial: don't touch files that are removed in the staging tree
lib/vsprintf: replace link to Draft by final RFC number
doc: Kconfig: `to be' -> `be'
doc: Kconfig: Typo: square -> squared
doc: Konfig: Documentation/power/{pm => apm-acpi}.txt
drivers/net: static should be at beginning of declaration
drivers/media: static should be at beginning of declaration
drivers/i2c: static should be at beginning of declaration
XTENSA: static should be at beginning of declaration
SH: static should be at beginning of declaration
MIPS: static should be at beginning of declaration
ARM: static should be at beginning of declaration
rcu: treewide: Do not use rcu_read_lock_held when calling rcu_dereference_check
Update my e-mail address
PCIe ASPM: forcedly -> forcibly
gma500: push through device driver tree
...
Fix up trivial conflicts:
- arch/arm/mach-ep93xx/dma-m2p.c (deleted)
- drivers/gpio/gpio-ep93xx.c (renamed and context nearby)
- drivers/net/r8169.c (just context changes)
Our performance team has noticed that increasing
RPCRDMA_MAX_DATA_SEGS from 8 to 64 significantly
increases throughput when using the RDMA transport.
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (145 commits)
bnx2x: use pci_pcie_cap()
bnx2x: fix bnx2x_stop_on_error flow in bnx2x_sp_rtnl_task
bnx2x: enable internal target-read for 57712 and up only
bnx2x: count statistic ramrods on EQ to prevent MC assert
bnx2x: fix loopback for non 10G link
bnx2x: dcb - send all unmapped priorities to same COS as L2
iwlwifi: Fix build with CONFIG_PM disabled.
gre: fix improper error handling
ipv4: use RT_TOS after some rt_tos conversions
via-velocity: remove duplicated #include
qlge: remove duplicated #include
igb: remove duplicated #include
can: c_can: remove duplicated #include
bnad: remove duplicated #include
net: allow netif_carrier to be called safely from IRQ
bna: Header File Consolidation
bna: HW Error Counter Fix
bna: Add HW Semaphore Unlock Logic
bna: IOC Event Name Change
bna: Mboxq Flush When IOC Disabled
...
Do not set the cr0 enablement bit for iucv by default in head[31|64].S,
move the enablement to iucv_init in the iucv base layer.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix improper protocol err_handler, current implementation is fully
unapplicable and may cause kernel crash due to double kfree_skb.
Signed-off-by: Dmitry Kozlov <xeb@mail.ru>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rt_tos was changed to iph->tos but it must be filtered by RT_TOS
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
msize represents the maximum PDU size that includes P9_IOHDRSZ.
Signed-off-by: Venkateswararao Jujjuri "<jvrao@linux.vnet.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
unlinkat - Remove a directory entry
size[4] Tunlinkat tag[2] dirfid[4] name[s] flag[4]
size[4] Runlinkat tag[2]
older Tremove have the below request format
size[4] Tremove tag[2] fid[4]
The remove message is used to remove a directory entry either file or directory
The remove opreation is actually a directory opertation and should ideally have
dirfid, if not we cannot represent the fid on server with anything other than
name. We will have to derive the directory name from fid in the Tremove request.
NOTE: The operation doesn't clunk the unlink fid.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
renameat - change name of file or directory
size[4] Trenameat tag[2] olddirfid[4] oldname[s] newdirfid[4] newname[s]
size[4] Rrenameat tag[2]
older Trename have the below request format
size[4] Trename tag[2] fid[4] newdirfid[4] name[s]
The rename message is used to change the name of a file, possibly moving it
to a new directory. The rename opreation is actually a directory opertation
and should ideally have olddirfid, if not we cannot represent the fid on server
with anything other than name. We will have to derive the old directory name
from fid in the Trename request.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
free the fid even in case of failed clunk.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Switch to generic kernel hexdump library and cleanup macros to
be more consistent with the way we do normal debug prints.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
There was a BUG_ON to protect against a bad id which could be dealt with
more gracefully.
Reported-by: Natalie Orlin <norlin@us.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (107 commits)
vfs: use ERR_CAST for err-ptr tossing in lookup_instantiate_filp
isofs: Remove global fs lock
jffs2: fix IN_DELETE_SELF on overwriting rename() killing a directory
fix IN_DELETE_SELF on overwriting rename() on ramfs et.al.
mm/truncate.c: fix build for CONFIG_BLOCK not enabled
fs:update the NOTE of the file_operations structure
Remove dead code in dget_parent()
AFS: Fix silly characters in a comment
switch d_add_ci() to d_splice_alias() in "found negative" case as well
simplify gfs2_lookup()
jfs_lookup(): don't bother with . or ..
get rid of useless dget_parent() in btrfs rename() and link()
get rid of useless dget_parent() in fs/btrfs/ioctl.c
fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers
drivers: fix up various ->llseek() implementations
fs: handle SEEK_HOLE/SEEK_DATA properly in all fs's that define their own llseek
Ext4: handle SEEK_HOLE/SEEK_DATA generically
Btrfs: implement our own ->llseek
fs: add SEEK_HOLE and SEEK_DATA flags
reiserfs: make reiserfs default to barrier=flush
...
Fix up trivial conflicts in fs/xfs/linux-2.6/xfs_super.c due to the new
shrinker callout for the inode cache, that clashed with the xfs code to
start the periodic workers later.
As reported by Ben Greer and Froncois Romieu. The code path in
the netif_carrier code leads it to try and disable
a late workqueue to reenable it immediately
netif_carrier_on
-> linkwatch_fire_event
-> linkwatch_schedule_work
-> cancel_delayed_work
-> del_timer_sync
If __cancel_delayed_work is used instead then there is no
problem of waiting for running linkwatch_event.
There is a race between linkwatch_event running re-scheduling
but it is harmless to schedule an extra scan of the linkwatch queue.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some minor cleanups that won't impact code:
1. Remove inline from non-critical functions; compiler will most
likely inline them anyway.
2. Make function args const where possible.
3. Whitespace cleanup
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When STP changes state of interface need to send a new link
message to reflect that change.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If a new device is added to a bridge, the ethernet address of the
bridge network device may change. When the address changes, the
appropriate callback is called, but with the wrong device argument.
The address of the bridge device (ie br0) changes not the address
of the device being passed to add_if (ie eth0).
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the message_age is already greater than the max_age, then the
BPDU is bogus. Linux won't generate BPDU, but conformance tester
or buggy implementation might.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A bridge topology with three systems:
+------+ +------+
| A(2) |--| B(1) |
+------+ +------+
\ /
+------+
| C(3) |
+------+
What is supposed to happen:
* bridge with the lowest ID is elected root (for example: B)
* C detects that A->C is higher cost path and puts in blocking state
What happens. Bridge with lowest id (B) is elected correctly as
root and things start out fine initially. But then config BPDU
doesn't get transmitted from A -> C. Because of that
the link from A-C is transistioned to the forwarding state.
The root cause of this is that the configuration messages
is generated with bogus message age, and dropped before
sending.
In the standardmessage_age is supposed to be:
the time since the generation of the Configuration BPDU by
the Root that instigated the generation of this Configuration BPDU.
Reimplement this by recording the timestamp (age + jiffies) when
recording config information. The old code incorrectly used the time
elapsed on the ageing timer which was incorrect.
See also:
https://bugzilla.vyatta.com/show_bug.cgi?id=7164
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
rcu: Fix wrong check in list_splice_init_rcu()
net,rcu: Convert call_rcu(xt_rateest_free_rcu) to kfree_rcu()
sysctl,rcu: Convert call_rcu(free_head) to kfree
vmalloc,rcu: Convert call_rcu(rcu_free_vb) to kfree_rcu()
vmalloc,rcu: Convert call_rcu(rcu_free_va) to kfree_rcu()
ipc,rcu: Convert call_rcu(ipc_immediate_free) to kfree_rcu()
ipc,rcu: Convert call_rcu(free_un) to kfree_rcu()
security,rcu: Convert call_rcu(sel_netport_free) to kfree_rcu()
security,rcu: Convert call_rcu(sel_netnode_free) to kfree_rcu()
ia64,rcu: Convert call_rcu(sn_irq_info_free) to kfree_rcu()
block,rcu: Convert call_rcu(disk_free_ptbl_rcu_cb) to kfree_rcu()
scsi,rcu: Convert call_rcu(fc_rport_free_rcu) to kfree_rcu()
audit_tree,rcu: Convert call_rcu(__put_tree) to kfree_rcu()
security,rcu: Convert call_rcu(whitelist_item_free) to kfree_rcu()
md,rcu: Convert call_rcu(free_conf) to kfree_rcu()
icmp_route_lookup() uses the wrong flow parameters if the reverse
session route lookup isn't used.
So do not commit to the re-decoded flow until we actually make a
final decision to use a real route saved in 'rt2'.
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Because the ip fragment offset field counts 8-byte chunks, ip
fragments other than the last must contain a multiple of 8 bytes of
payload. ip_ufo_append_data wasn't respecting this constraint and,
depending on the MTU and ip option sizes, could create malformed
non-final fragments.
Google-Bug-Id: 5009328
Signed-off-by: Bill Sommerfeld <wsommerfeld@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPv6 fragment identification generation is way beyond what we use for
IPv4 : It uses a single generator. Its not scalable and allows DOS
attacks.
Now inetpeer is IPv6 aware, we can use it to provide a more secure and
scalable frag ident generator (per destination, instead of system wide)
This patch :
1) defines a new secure_ipv6_id() helper
2) extends inet_getid() to provide 32bit results
3) extends ipv6_select_ident() with a new dest parameter
Reported-by: Fernando Gont <fernando@gont.com.ar>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We currently cow metrics a bit too soon in IPv6 case : All routes are
tied to a single inetpeer entry.
Change ip6_rt_copy() to get destination address as second argument, so
that we fill rt6i_dst before the dst_copy_metrics() call.
icmp6_dst_alloc() must set rt6i_dst before calling dst_metric_set(), or
else the cow is done while rt6i_dst is still NULL.
If orig route points to readonly metrics, we can share the pointer
instead of performing the memory allocation and copy.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This resolves a panic on module removal.
Reported-by: Dave Jones <davej@redhat.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Some drivers (ab)use the ethtool_ops::get_regs operation to expose
only a hardware revision ID. Commit
a77f5db361 ('ethtool: Allocate register
dump buffer with vmalloc()') had the side-effect of breaking these, as
vmalloc() returns a null pointer for size=0 whereas kmalloc() did not.
For backward-compatibility, allow zero-length dumps again.
Reported-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Cc: stable@kernel.org [2.6.37+]
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two problems:
1) "n" was allocated with alloc_skb() so we should free it with
kfree_skb() instead of regular kfree().
2) We return the freed pointer instead of NULL.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since vlan_group_get_device and vlan_group is not going to be accessible
from device drivers, introduce function which substitutes it.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All these are instances of
#define NAME value;
or
#define NAME(params_opt) value;
These of course fail to build when used in contexts like
if(foo $OP NAME)
while(bar $OP NAME)
and may silently generate the wrong code in contexts such as
foo = NAME + 1; /* foo = value; + 1; */
bar = NAME - 1; /* bar = value; - 1; */
baz = NAME & quux; /* baz = value; & quux; */
Reported on comp.lang.c,
Message-ID: <ab0d55fe-25e5-482b-811e-c475aa6065c3@c29g2000yqd.googlegroups.com>
Initial analysis of the dangers provided by Keith Thompson in that thread.
There are many more instances of more complicated macros having unnecessary
trailing semicolons, but this pile seems to be all of the cases of simple
values suffering from the problem. (Thus things that are likely to be found
in one of the contexts above, more complicated ones aren't.)
Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
In net/bridge/netfilter/ebt_ulog.c:ebt_ulog_packet() the 'goto unlock'
before the 'alloc_failure' label is completely redundant. This patch
removes it.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
If overlapping networks with different interfaces was added to
the set, the type did not handle it properly. Example
ipset create test hash:net,iface
ipset add test 192.168.0.0/16,eth0
ipset add test 192.168.0.0/24,eth1
Now, if a packet was sent from 192.168.0.0/24,eth0, the type returned
a match.
In the patch the algorithm is fixed in order to correctly handle
overlapping networks.
Limitation: the same network cannot be stored with more than 64 different
interfaces in a single set.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
The RCU callback xt_rateest_free_rcu() just calls kfree(), so we can
use kfree_rcu() instead of call_rcu(). This also allows us to dispense
with an rcu_barrier() call, speeding up unloading of this module.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Patrick McHardy <kaber@trash.net>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
commit 58389c69150e6032504dfcd3edca6b1975c8b5bc
Author: Johannes Berg <johannes.berg@intel.com>
Date: Mon Jul 18 18:08:35 2011 +0200
cfg80211: allow userspace to control supported rates in scan
made single-band cards crash since it would always
access all wiphy->bands[]. Fix this and reject any
attempts in the new helper ieee80211_get_ratemask()
to do the same, rejecting rates configuration for
unsupported bands.
Reported-by: Pavel Roskin <proski@gnu.org>
Tested-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
ieee80211_stop_rx_ba_session() was calling sta_info_get()
without rcu locking, and the return value was not
checked.
This resulted in the following panic:
[<bf05726c>] (ieee80211_stop_rx_ba_session+0x0/0x60 [mac80211])
[<bf0abd94>] (wl1271_event_handle+0x0/0xdc8 [wl12xx])
[<bf0a7308>] (wl1271_irq+0x0/0x4a0 [wl12xx])
[<c00c40a8>] (irq_thread+0x0/0x254)
[<c00a7398>] (kthread+0x0/0x8c)
Signed-off-by: Eliad Peller <eliad@wizery.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
cfg80211_netdev_notifier_call() is configuring psm in case
of NL80211_IFTYPE_STATION interface type (on NETDEV_UP).
do the same for NL80211_IFTYPE_P2P_CLIENT interface type.
Signed-off-by: Eliad Peller <eliad@wizery.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
In P2P client mode, the GO (AP) to connect to might
have periods of time where it is not available due
to powersave. To allow the driver to sync with it
and send frames to the GO only when it is available
add a new callback tx_sync (and the corresponding
finish_tx_sync). These callbacks can sleep unlike
the actual TX.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
combination of kern_path_parent() and lookup_create(). Does *not*
expose struct nameidata to caller. Syscalls converted to that...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Scanning currently uses the TX rate mask to
restrict the rate set, which is bogus. Make
it use the new set of rates from userspace.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Some P2P scans are not allowed to advertise
11b rates, but that is a rather special case
so instead of having that, allow userspace
to request the rate sets (per band) that are
advertised in scan probe request frames.
Since it's needed in two places now, factor
out some common code parsing a rate array.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
open(2) must always include one of O_RDONLY, O_WRONLY, or O_RDWR. No need
for any O_APPEND special case.
Passing O_WRONLY|O_RDWR is undefined according to the man page, but the
Linux VFS interprets this as O_RDWR, so we'll do the same.
This fixes open(2) with flags O_RDWR|O_APPEND, which was incorrectly being
translated to readonly.
Reported-by: Fyodor Ustinov <ufm@ufm.su>
Signed-off-by: Sage Weil <sage@newdream.net>
Introduces a new nfnetlink type that applies a given
verdict to all queued packets with an id <= the id in the verdict
message.
If a mark is provided it is applied to all matched packets.
This reduces the number of verdicts that have to be sent.
Applications that make use of this feature need to maintain
a timeout to send a batchverdict periodically to avoid starvation.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Packet identifier is currently setup in nfqnl_build_packet_message(),
using one atomic_inc_return().
Problem is that since several cpus might concurrently call
nfqnl_enqueue_packet() for the same queue, we can deliver packets to
consumer in non monotonic way (packet N+1 being delivered after packet
N)
This patch moves the packet id setup from nfqnl_build_packet_message()
to nfqnl_enqueue_packet() to guarantee correct delivery order.
This also removes one atomic operation.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Florian Westphal <fw@strlen.de>
CC: Pablo Neira Ayuso <pablo@netfilter.org>
CC: Eric Leblond <eric@regit.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Add tx_conf array to save the current tx queues
configuration, and reconfig it on resume (ieee80211_reconfig).
On resume, the driver is being reconfigured. Without
reconfiguring the tx queues as well, the driver might
configure the device to use wrong ac params (e.g. ps-poll
instead of uapsd).
Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Compiler is not smart enough to avoid double BSWAP instructions in
ntohl(inet_make_mask(plen)).
Lets cache this value in struct leaf_info, (fill a hole on 64bit arches)
With route cache disabled, this saves ~2% of cpu in udpflood bench on
x86_64 machine.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
nenetlink_queue operations on SMP are not efficent if several queues are
used, because of nfnl_mutex contention when applications give packet
verdict.
Use new call_rcu field in struct nfnl_callback to advertize a callback
that is called under rcu_read_lock instead of nfnl_mutex.
On my 2x4x2 machine, I was able to reach 2.000.000 pps going through
user land returning NF_ACCEPT verdicts without losses, instead of less
than 500.000 pps before patch.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Florian Westphal <fw@strlen.de>
CC: Eric Leblond <eric@regit.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Goal of this patch is to permit nfnetlink providers not mandate
nfnl_mutex being held while nfnetlink_rcv_msg() calls them.
If struct nfnl_callback contains a non NULL call_rcu(), then
nfnetlink_rcv_msg() will use it instead of call() field, holding
rcu_read_lock instead of nfnl_mutex
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Florian Westphal <fw@strlen.de>
CC: Eric Leblond <eric@regit.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
In the future dst entries will be neigh-less. In that environment we
need to have an easy transition point for current users of
dst->neighbour outside of the packet output fast path.
Signed-off-by: David S. Miller <davem@davemloft.net>
It just makes it harder to see 1) what the code is doing
and 2) grep for all users of dst{->,.}neighbour
Signed-off-by: David S. Miller <davem@davemloft.net>
This will get us closer to being able to do "neigh stuff"
completely independent of the underlying dst_entry for
protocols (ipv4/ipv6) that wish to do so.
We will also be able to make dst entries neigh-less.
Signed-off-by: David S. Miller <davem@davemloft.net>