Commit Graph

18051 Commits (b6bf3ca032c9cd517526178f579e7a4e395c6e45)

Author SHA1 Message Date
David S. Miller 3b004569d8 ipv4: Avoid use of signed integers in fib_trie code.
GCC emits all kinds of crazy zero extensions when we go from signed
int, to unsigned short, etc. etc.

This transformation has to be legal because:

1) In tkey_extract_bits() in mask_pfx(), the values are used to
   perform shifts, on which negative values are undefined by C.

2) In fib_table_lookup() we perform comparisons with unsigned
   values, constants, and additions.  None of which should
   encounter negative values.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17 15:49:26 -08:00
David S. Miller 3c7bd1a140 net: Add initial_ref arg to dst_alloc().
This allows avoiding multiple writes to the initial __refcnt.

The most simplest cases of wanting an initial reference of "1"
in ipv4 and ipv6 have been converted, the rest have been left
along and kept at the existing "0".

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17 15:44:00 -08:00
David S. Miller 0c4dcd58fd ipv4: Consolidate ipv4 dst allocation logic.
This also allows us to combine all the dst->flags settings and avoid
read/modify/write sequences to this struct member.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17 15:42:37 -08:00
David S. Miller 010c2708e5 ipv4: Move rcu_read_{lock,unlock}() into ip_route_output_slow().
Simplifies tail of __ip_route_output_key().

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17 15:37:09 -08:00
David S. Miller 5ada552746 ipv4: Simplify output route creation call sequence.
There's a lot of redundancy and unnecessary stack frames
in the output route creation path.

1) Make __mkroute_output() return error pointers.

2) Eliminate ip_mkroute_output() entirely, made possible by #1.

3) Call __mkroute_output() directly and handling the returning error
   pointers in ip_route_output_slow().

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17 15:29:00 -08:00
Michał Mirosław e83d360d9a net: introduce NETIF_F_RXCSUM
Introduce NETIF_F_RXCSUM to replace device-private flags for RX checksum
offload. Integrate it with ndo_fix_features.

ethtool_op_get_rx_csum() is removed altogether as nothing in-tree uses it.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17 14:16:35 -08:00
Michał Mirosław da8ac86c4a net: use ndo_fix_features for ethtool_ops->set_flags
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17 14:16:34 -08:00
Michał Mirosław 86794881c2 net: ethtool: use ndo_fix_features for offload setting
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17 14:16:34 -08:00
Michał Mirosław 5455c6998d net: Introduce new feature setting ops
This introduces a new framework to handle device features setting.
It consists of:
  - new fields in struct net_device:
	+ hw_features - features that hw/driver supports toggling
	+ wanted_features - features that user wants enabled, when possible
  - new netdev_ops:
	+ feat = ndo_fix_features(dev, feat) - API checking constraints for
		enabling features or their combinations
	+ ndo_set_features(dev) - API updating hardware state to match
		changed dev->features
  - new ethtool commands:
	+ ETHTOOL_GFEATURES/ETHTOOL_SFEATURES: get/set dev->wanted_features
		and trigger device reconfiguration if resulting dev->features
		changed
	+ ETHTOOL_GSTRINGS(ETH_SS_FEATURES): get feature bits names (meaning)

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17 14:16:33 -08:00
Michał Mirosław 0a41770477 ethtool: factorize get/set_one_feature
This allows to enable GRO even if RX csum is disabled. GRO will not
be used for packets without hardware checksum anyway.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17 14:16:33 -08:00
Michał Mirosław 340ae1654c ethtool: factorize ethtool_get_strings() and ethtool_get_sset_count()
This is needed for unified offloads patch.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17 14:16:32 -08:00
Michał Mirosław 212b573f55 ethtool: enable GSO and GRO by default
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17 14:16:32 -08:00
Michał Mirosław 9a279ea3a7 ethtool: move EXPORT_SYMBOL(ethtool_op_set_tx_csum) to correct place
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17 14:16:31 -08:00
David S. Miller f878b995b0 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next-2.6 2011-02-15 12:25:19 -08:00
Ben Hutchings 5c56580b74 net: Adjust TX queue kobjects if number of queues changes during unregister
If the root qdisc for a net device is mqprio, and the driver's
ndo_setup_tc() operation dynamically adds and remvoes TX queues,
netif_set_real_num_tx_queues() will be called during device
unregistration to remove the extra TX queues when the qdisc is
destroyed.  Currently this causes the corresponding kobjects
to be leaked, and the device's reference count never drops to 0.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2011-02-15 19:45:33 +00:00
David S. Miller f39925dbde ipv4: Cache learned redirect information in inetpeer.
Note that we do not generate the redirect netevent any longer,
because we don't create a new cached route.

Instead, once the new neighbour is bound to the cached route,
we emit a neigh update event instead.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-14 21:33:27 -08:00
David S. Miller 2c8cec5c10 ipv4: Cache learned PMTU information in inetpeer.
The general idea is that if we learn new PMTU information, we
bump the peer genid.

This triggers the dst_ops->check() code to validate and if
necessary propagate the new PMTU value into the metrics.

Learned PMTU information self-expires.

This means that it is not necessary to kill a cached route
entry just because the PMTU information is too old.

As a consequence:

1) When the path appears unreachable (dst_ops->link_failure
   or dst_ops->negative_advice) we unwind the PMTU state if
   it is out of date, instead of killing the cached route.

   A redirected route will still be invalidated in these
   situations.

2) rt_check_expire(), rt_worker_func(), et al. are no longer
   necessary at all.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-14 21:33:07 -08:00
Bernard Pidoux 68aa3fd551 ROSE: AX25: finding routes simplification
With previous patch, rose_get_neigh() routine
investigates the full list of neighbor nodes
until it finds or not an already connected node whether
it is called locally or through a level 3 transit frame.
If no routes are opened through an adjacent connected node
then a classical connect request is attempted.

Then there is no more reason for an extra loop such
as the one removed by this patch.

Signed-off-by: Bernard Pidoux <f6bvp@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-14 13:33:49 -08:00
Bernard Pidoux c5d8b24ad0 ROSE: rose AX25 packet routing improvement
FPAC AX25 packet application is using Linux kernel ROSE
routing skills in order to connect or send packets to remote stations
knowing their ROSE address via a network of interconnected nodes.

Each FPAC node has a ROSE routing table that Linux ROSE module is
looking at each time a ROSE frame is relayed by the node or when
a connect request to a neighbor node is received.

A previous patch improved the system time response by looking at
already established routes each time the system was looking for a
route to relay a frame. If a neighbor node routing the destination
address was already connected, then the frame would be sent
through him. If not, a connection request would be issued.

The present patch extends the same routing capability to a connect
request asked by a user locally connected into an FPAC node.
Without this patch, a connect request was not well handled unless it
was directed to an immediate connected neighbor of the local node.

Implemented at a number of ROSE FPAC node stations, the present patch
improved dramatically FPAC ROSE routing time response and efficiency.

Signed-off-by: Bernard Pidoux <f6bvp@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-14 13:31:09 -08:00
Eric Dumazet 31d409373c ipv4: fix rcu lock imbalance in fib_select_default()
Commit 0c838ff1ad (ipv4: Consolidate all default route selection
implementations.) forgot to remove one rcu_read_unlock() from
fib_select_default().

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-14 11:23:04 -08:00
Ben Hutchings ac7100ba93 sch_mqprio: Always set num_tc to 0 in mqprio_destroy()
All the cleanup code in mqprio_destroy() is currently conditional on
priv->qdiscs being non-null, but that condition should only apply to
the per-queue qdisc cleanup.  We should always set the number of
traffic classes back to 0 here.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2011-02-14 19:07:58 +00:00
Jiri Pirko afc6151a78 bridge: implement [add/del]_slave ops
add possibility to addif/delif via rtnetlink

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-13 16:58:40 -08:00
Jiri Pirko fbaec0ea54 rtnetlink: implement setting of master device
This patch allows userspace to enslave/release slave devices via netlink
interface using IFLA_MASTER. This introduces generic way to add/remove
underling devices.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-13 16:58:39 -08:00
Jiri Pirko 1765a57533 net: make dev->master general
dev->master is now tightly connected to bonding driver. This patch makes
this pointer more general and ready to be used by others.

 - netdev_set_master() - bond specifics moved to new function
   netdev_set_bond_master()
 - introduced netif_is_bond_slave() to check if device is a bonding slave

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-13 10:42:07 -08:00
Jiri Pirko d59cfde2fb net: remove the unnecessary dance around skb_bond_should_drop
No need to check (master) twice and to drive in and out the header file.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-13 10:42:06 -08:00
Ben Greear 57f89bfa21 network: Allow af_packet to transmit +4 bytes for VLAN packets.
This allows user-space to send a '1500' MTU VLAN packet on a
1500 MTU ethernet frame.  The extra 4 bytes of a VLAN header is
not usually charged against the MTU when other parts of the
network stack is transmitting vlans...

Signed-off-by: Ben Greear <greearb@candelatech.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-11 21:26:32 -08:00
David S. Miller ab889e6607 Merge branch 'batman-adv/next' of git://git.open-mesh.org/ecsv/linux-merge 2011-02-11 21:20:27 -08:00
Linus Lüssing 3878f1f075 batman-adv: Disallow originator addressing within mesh layer
For a host in the mesh network, the batman layer should be transparent.
However, we had one exception, data packets within the mesh network
which have the same destination as a originator are being routed to
that node, although there is no host that node's bat0 interface and
therefore gets dropped anyway. This commit removes this exception.

Signed-off-by: Linus Lüssing <linus.luessing@ascom.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
2011-02-11 23:30:33 +01:00
Linus Lüssing ee1e884194 batman-adv: Remove duplicate types.h inclusions
types.h is included by main.h, which is included at the beginning of any
other c-file anyway. Therefore this commit removes those duplicate
inclussions.

Signed-off-by: Linus Lüssing <linus.luessing@ascom.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
2011-02-11 23:30:29 +01:00
Marek Lindner 1406206416 batman-adv: Split combined variable declarations
Multiple variable declarations in a single statements over multiple lines can
be split into multiple variable declarations without changing the actual
behavior.

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
2011-02-11 23:29:00 +01:00
Sven Eckelmann c2f7f0e7b3 batman-adv: Use successive sequence numbers for fragments
The two fragments of an unicast packet must have successive sequence numbers to
allow the receiver side to detect matching fragments and merge them again. The
current implementation doesn't provide that property because a sequence of two
atomic_inc_return may be interleaved with another sequence which also changes
the variable.

The access to the fragment sequence number pool has either to be protected by
correct locking or it has to reserve two sequence numbers in a single fetch.
The latter one can easily be done by increasing the value of the last used
sequence number by 2 in a single step. The generated window of two currently
unused sequence numbers can now be scattered across the two fragments.

Reported-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
2011-02-11 00:25:10 +01:00
David S. Miller 6431cbc25f inet: Create a mechanism for upward inetpeer propagation into routes.
If we didn't have a routing cache, we would not be able to properly
propagate certain kinds of dynamic path attributes, for example
PMTU information and redirects.

The reason is that if we didn't have a routing cache, then there would
be no way to lookup all of the active cached routes hanging off of
sockets, tunnels, IPSEC bundles, etc.

Consider the case where we created a cached route, but no inetpeer
entry existed and also we were not asked to pre-COW the route metrics
and therefore did not force the creation a new inetpeer entry.

If we later get a PMTU message, or a redirect, and store this
information in a new inetpeer entry, there is no way to teach that
cached route about the newly existing inetpeer entry.

The facilities implemented here handle this problem.

First we create a generation ID.  When we create a cached route of any
kind, we remember the generation ID at the time of attachment.  Any
time we force-create an inetpeer entry in response to new path
information, we bump that generation ID.

The dst_ops->check() callback is where the knowledge of this event
is propagated.  If the global generation ID does not equal the one
stored in the cached route, and the cached route has not attached
to an inetpeer yet, we look it up and attach if one is found.  Now
that we've updated the cached route's information, we update the
route's generation ID too.

This clears the way for implementing PMTU and redirects directly in
the inetpeer cache.  There is absolutely no need to consult cached
route information in order to maintain this information.

At this point nothing bumps the inetpeer genids, that comes in the
later changes which handle PMTUs and redirects using inetpeers.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-10 13:33:41 -08:00
David S. Miller ddd4aa424b inetpeer: Add redirect and PMTU discovery cached info.
Validity of the cached PMTU information is indicated by it's
expiration value being non-zero, just as per dst->expires.

The scheme we will use is that we will remember the pre-ICMP value
held in the metrics or route entry, and then at expiration time
we will restore that value.

In this way PMTU expiration does not kill off the cached route as is
done currently.

Redirect information is permanent, or at least until another redirect
is received.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-10 13:29:30 -08:00
David S. Miller 7a71ed899e inetpeer: Abstract address representation further.
Future changes will add caching information, and some of
these new elements will be addresses.

Since the family is implicit via the ->daddr.family member,
replicating the family in ever address we store is entirely
redundant.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-10 13:22:28 -08:00
Xiaotian Feng b6644cb706 net: rename group sysfs entry to netdev_group
commit a512b92 adds sysfs entry for net device group, but
before this commit, tun also uses group sysfs, so after this
commit checkin, kernel warns like this:
    sysfs: cannot create duplicate filename '/devices/virtual/net/vnet0/group'

Since tun has used this for years, rename sysfs under tun might
break existing userspace, so rename group sysfs entry for net device
group is a better choice.

Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-09 19:16:15 -08:00
David S. Miller 27059746a9 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 2011-02-09 12:39:12 -08:00
David S. Miller 263fb5b1bf Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/e1000e/netdev.c
2011-02-08 17:19:01 -08:00
David S. Miller 8d13a2a9fb net: Kill NETEVENT_PMTU_UPDATE.
Nobody actually does anything in response to the event,
so just kill it off.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-08 16:17:55 -08:00
David S. Miller 8d3bdbd55a net: Fix lockdep regression caused by initializing netdev queues too early.
In commit aa94210411 ("net: init ingress
queue") we moved the allocation and lock initialization of the queues
into alloc_netdev_mq() since register_netdevice() is way too late.

The problem is that dev->type is not setup until the setup()
callback is invoked by alloc_netdev_mq(), and the dev->type is
what determines the lockdep class to use for the locks in the
queues.

Fix this by doing the queue allocation after the setup() callback
runs.

This is safe because the setup() callback is not allowed to make any
state changes that need to be undone on error (memory allocations,
etc.).  It may, however, make state changes that are undone by
free_netdev() (such as netif_napi_add(), which is done by the
ipoib driver's setup routine).

The previous code also leaked a reference to the &init_net namespace
object on RX/TX queue allocation failures.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-08 15:02:50 -08:00
David S. Miller b2df5a8446 net/caif: Fix dangling list pointer in freed object on error.
rtnl_link_ops->setup(), and the "setup" callback passed to alloc_netdev*(),
cannot make state changes which need to be undone on failure.  There is
no cleanup mechanism available at this point.

So we have to add the caif private instance to the global list once we
are sure that register_netdev() has succedded in ->newlink().

Otherwise, if register_netdev() fails, the caller will invoke free_netdev()
and we will have a reference to freed up memory on the chnl_net_list.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-08 14:31:31 -08:00
Nicolas Dichtel fa9921e46f ipsec: allow to align IPv4 AH on 32 bits
The Linux IPv4 AH stack aligns the AH header on a 64 bit boundary
(like in IPv6). This is not RFC compliant (see RFC4302, Section
3.3.3.2.1), it should be aligned on 32 bits.

For most of the authentication algorithms, the ICV size is 96 bits.
The AH header alignment on 32 or 64 bits gives the same results.

However for SHA-256-128 for instance, the wrong 64 bit alignment results
in adding useless padding in IPv4 AH, which is forbidden by the RFC.

To avoid breaking backward compatibility, we use a new flag
(XFRM_STATE_ALIGN4) do change original behavior.

Initial patch from Dang Hongwu <hongwu.dang@6wind.com> and
Christophe Gouault <christophe.gouault@6wind.com>.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-08 14:00:40 -08:00
David S. Miller c0c84ef5c1 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2011-02-08 13:52:31 -08:00
David S. Miller e0985f27dd Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2011-02-08 12:03:54 -08:00
David S. Miller 429a01a70f Merge branch 'batman-adv/merge' of git://git.open-mesh.org/ecsv/linux-merge 2011-02-07 19:54:14 -08:00
Sven Eckelmann 531c9da8c8 batman-adv: Linearize fragment packets before merge
We access the data inside the skbs of two fragments directly using memmove
during the merge. The data of the skb could span over multiple skb pages. An
direct access without knowledge about the pages would lead to an invalid memory
access.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
[lindner_marek@yahoo.de: Move return from function to the end]
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-02-08 00:54:31 +01:00
andrew hendry 95c3043008 x25: possible skb leak on bad facilities
Originally x25_parse_facilities returned
-1 for an error
 0 meaning 0 length facilities
>0 the length of the facilities parsed.

5ef41308f9 ("x25: Prevent crashing when parsing bad X.25 facilities") introduced more
error checking in x25_parse_facilities however used 0 to indicate bad parsing
a6331d6f9a ("memory corruption in X.25 facilities parsing") followed this further for
DTE facilities, again using 0 for bad parsing.

The meaning of 0 got confused in the callers.
If the facilities are messed up we can't determine where the data starts.
So patch makes all parsing errors return -1 and ensures callers close and don't use the skb further.

Reported-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-07 13:41:38 -08:00
Dan Carpenter 3ad97fbcc2 mac80211: remove unneeded check
"ap" is the address of sdata->u.ap so it can never be NULL here.  Also
we dereferenced it on the previous line.  I removed the check.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-02-07 16:18:28 -05:00
Mohammed Shafi Shajakhan 38f37be209 mac80211: Update comments on radiotap MCS index
mac80211 now supports passing MCS index to radiotap, so update the
comments regarding this

Signed-off-by: Mohammed Shafi Shajakhan <mshajakhan@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-02-07 16:18:28 -05:00
Felix Fietkau 4f3123366f mac80211: as a 4-addr station, do not receive packets for other stations
Since 4-addr frames completely override the source address which will
make it into the converted 802.3 frames, receiving frames for other
4-addr stations will confuse the bridging code.

To be able to handle traffic for all connected devices, the bridge
code will automatically turn on promiscuous mode, which triggers
this problem.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Reported-by: Steve Brown <sbrown@cortland.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-02-07 16:18:27 -05:00
Ben Greear 180205bdb2 mac80211: Make some mlme timers module paramaters.
This allows users to tune the connection-loss algorithms
to be more or less lenient.  In particular, larger
null-func retries helps when using lots of virtual
stations on a loaded network.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-02-07 16:18:27 -05:00