Commit graph

145959 commits

Author SHA1 Message Date
Jouni Malinen
9f26a95221 nl80211: Validate NL80211_ATTR_KEY_SEQ length
Validate RSC (NL80211_ATTR_KEY_SEQ) length in nl80211/cfg80211 instead
of having to do this in all the drivers.

Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-20 14:46:25 -04:00
Jouni Malinen
cc65965cbb ath9k: Fix PS mode operation to receive buffered broadcast/multicast frames
The previous implementation was moving back to NETWORK SLEEP state
immediately after receiving a Beacon frame. This means that we are
unlikely to receive all the buffered broadcast/multicast frames that
would be sent after DTIM Beacon frames. Fix this by parsing the Beacon
frame and remaining awake, if needed, to receive the buffered
broadcast/multicast frames. The last buffered frame will trigger the
move back into NETWORK SLEEP state.

If the last broadcast/multicast frame is not received properly (or if
the AP fails to send it), the next Beacon frame will work as a backup
trigger for returning into NETWORK SLEEP.

A new debug type, PS (debug=0x800 module parameter), is added to make
it easier to debug potential power save issues in the
future. Currently, this is only used for the Beacon frame and buffered
broadcast/multicast receiving.

Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-20 14:46:24 -04:00
Jouni Malinen
9d64a3cfaf ath9k: Clean up RX processing a bit
This makes use of the local fc variable in bit more places and uses a
common helper macro. The part of RX process that delivers skb's to
mac80211 is moved to a separate function in preparation for future
changes that will need to do this from two places. The modifications
here should not result in any functional changes.

Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-20 14:46:24 -04:00
Jouni Malinen
d8959fbfba ath9k: Fix a check for multicast address for virtual wiphy
The broadcast bit is in the first, not the last octet..

Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-20 14:46:24 -04:00
Jouni Malinen
92778180f7 mac80211: Cancel pending probereq poll on beacon RX
While the probe request poll is expected to work, it looks like it
does not always result in getting a response. The exact reason for
this is unclear, but anyway, if we do receive a Beacon frame from our
AP, there is no need to disconnect based on the probereq poll. This
seems to help keep the connection bit more stable in cases where
beacon loss is occurring semi-frequently.

Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-20 14:46:24 -04:00
Gábor Stefanik
13bdcd90bb zd1211rw: Replace ZD_CS_MULTICAST with ZD_CS_NO_ACK
According to my tests, all that ZD_CS_MULTICAST does is to
disable retrying/waiting for an ACK. Reflect this by renaming
the bit to ZD_CS_NO_ACK and setting it based on
IEEE80211_TX_CTL_NO_ACK, instead of is_multicast_ether_addr.

Signed-off-by: Gábor Stefanik <netrolller.3d@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-20 14:46:24 -04:00
Senthil Balasubramanian
cccaec98a3 mac80211: Initialize RX's last received sequence number
The STA may drop the very first frame if it happens to be a retried
frame. This is because we maintian the last received sequence number
per TID for QoS frames and it is initialized to zero through kzalloc
during sta_info_alloc and the sequence number of the very first date
frame received would be ZERO (as per IEEE 802.11-2007, 7.1.3.4.1).

If the frame dropped happens to be an EAP Request Identity(very first
frame from the AP), then wpa_supplicnat disconnects the STA and the
whole procedure starts again.

Signed-off-by: Senthil Balasubramanian <senthilkumar@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-20 14:46:23 -04:00
Luis R. Rodriguez
80a3511d70 cfg80211: add debugfs HT40 allow map
Here's a screenshot of what this looks like with ath9k:

mcgrof@pogo /debug/ieee80211/phy0 $ cat ht40allow_map
2412 HT40  +
2417 HT40  +
2422 HT40  +
2427 HT40  +
2432 HT40 -+
2437 HT40 -+
2442 HT40 -+
2447 HT40 -
2452 HT40 -
2457 HT40 -
2462 HT40 -
2467 Disabled
2472 Disabled
2484 Disabled
5180 HT40  +
5200 HT40 -+
5220 HT40 -+
5240 HT40 -+
5260 HT40 -+
5280 HT40 -+
5300 HT40 -+
5320 HT40 -
5500 HT40  +
5520 HT40 -+
5540 HT40 -+
5560 HT40 -+
5580 HT40 -+
5600 HT40 -+
5620 HT40 -+
5640 HT40 -+
5660 HT40 -+
5680 HT40 -+
5700 HT40 -
5745 HT40  +
5765 HT40 -+
5785 HT40 -+
5805 HT40 -+
5825 HT40 -

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-20 14:46:23 -04:00
Luis R. Rodriguez
1ac61302dc mac80211/cfg80211: move wiphy specific debugfs entries to cfg80211
This moves the cfg80211 specific stuff to new cfg80211 debugfs
entries. Non-mac80211 will also get these entries now. There were
only 4 which we take:

rts_threshold
fragmentation_threshold
short_retry_limit
long_retry_limit

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-20 14:46:23 -04:00
Luis R. Rodriguez
294196ab22 cfg80211: check allowed channel type upon userspace requests
Thanks to nl80211 userspace can be very specific upon device
configuration. Before processing the request for the new HT40
channel types (HT40- or HT40+) we need to ensure we can use them
regulatory-wise. This wasn't required with wireless extensions as
specifying the channel type wasn't not available and configuration
was done towards the end implicitly upon association or reception
of beacons from the AP. For the new nl80211 we have to check this
when configuring the interfaces explicitly.

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-20 14:46:23 -04:00
Luis R. Rodriguez
768777ea11 mac80211: check if HT40+/- is allowed before sending assoc
We weren't checking this at all.

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-20 14:46:23 -04:00
Luis R. Rodriguez
689da1b3b8 wireless: rename IEEE80211_CHAN_NO_FAT_* to HT40-/+
This is more consistent with our nl80211 naming convention
for HT40-/+.

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-20 14:46:22 -04:00
Luis R. Rodriguez
038659e7c6 cfg80211: Process regulatory max bandwidth checks for HT40
We are not correctly listening to the regulatory max bandwidth
settings. To actually make use of it we need to redesign things
a bit. This patch does the work for that. We do this to so we
can obey to regulatory rules accordingly for use of HT40.

We end up dealing with HT40 by having two passes for each channel.

The first check will see if a 20 MHz channel fits into the channel's
center freq on a given frequency range. We check for a 20 MHz
banwidth channel as that is the maximum an individual channel
will use, at least for now. The first pass will go ahead and
check if the regulatory rule for that given center of frequency
allows 40 MHz bandwidths and we use this to determine whether
or not the channel supports HT40 or not. So to support HT40 you'll
need at a regulatory rule that allows you to use 40 MHz channels
but you're channel must also be enabled and support 20 MHz by itself.

The second pass is done after we do the regulatory checks over
an device's supported channel list. On each channel we'll check
if the control channel and the extension both:

 o exist
 o are enabled
 o regulatory allows 40 MHz bandwidth on its frequency range

This work allows allows us to idependently check for HT40- and
HT40+.

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-20 14:46:22 -04:00
Roel Kluin
a6c6733978 wireless: beyond ARRAY_SIZE of intf->crypto_stats
Do not go beyond ARRAY_SIZE of intf->crypto_stats

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-20 14:29:55 -04:00
Jay Sternberg
c9d2fbf36d iwlwifi: update 5000 ucode support to version 2 of API
enable iwl driver to support 5000 ucode having version 2 of API

Signed-off-by: Jay Sternberg <jay.e.sternberg@linux.intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-20 14:29:55 -04:00
Luis R. Rodriguez
5078b2e32a cfg80211: fix race between core hint and driver's custom apply
Its possible for cfg80211 to have scheduled the work and for
the global workqueue to not have kicked in prior to a cfg80211
driver's regulatory hint or wiphy_apply_custom_regulatory().

Although this is very unlikely its possible and should fix
this race. When this race would happen you are expected to have
hit a null pointer dereference panic.

Cc: stable@kernel.org
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Tested-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-20 14:29:54 -04:00
John W. Linville
267d493b32 airo: fix airo_get_encode{,ext} buffer overflow like I mean it...
"airo: airo_get_encode{,ext} potential buffer overflow" was actually a
no-op, due to an unrecognized type overflow in an assignment.  Oddly,
gcc only seems to tell me about it when using -Wextra...grrr...

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-20 14:29:54 -04:00
Fabio Rossi
875690c378 ath5k: fix interpolation with equal power levels
When the EEPROM contains weird values for the power levels we have to
fix the interpolation process.

Signed-off-by: Fabio Rossi <rossi.f@inwind.it>
Acked-by: Nick Kossifidis <mickflemm@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-20 14:29:53 -04:00
Reinette Chatre
fbc9f97bbf iwlwifi: do not cancel delayed work inside spin_lock_irqsave
Calling cancel_delayed_work() from inside
spin_lock_irqsave, introduces a potential deadlock.

As explained by Johannes Berg <johannes@sipsolutions.net>

A - lock
T - timer

phase                   CPU 1           CPU 2
---------------------------------------------

some place that calls
cancel_timer_sync()
(which is the | code)
                                        lock-irq(A)
|                                       "lock-irq"(T)
|                                       "unlock"(T)
|                                       wait(T)
                                        unlock(A)

timer softirq
                        "lock"(T)
                        run(T)
                        "unlock"(T)

irq handler
          lock(A)
          unlock(A)

Now all that again, interleaved, leading to deadlock:

                                        lock-irq(A)
                        "lock"(T)
                         run(T)
IRQ during or maybe
before run(T) -->        lock(A)
                                        "lock-irq"(T)
                                        wait(T)

We fix this by moving the call to cancel_delayed_work() into workqueue.
There are cases where the work may not actually be queued or running
at the time we are trying to cancel it, but cancel_delayed_work() is
able to deal with this.

Also cleanup iwl_set_mode related to this call. This function
(iwl_set_mode) is only called when bringing interface up and there will
thus not be any scanning done. No need to try to cancel scanning.

Fixes http://bugzilla.kernel.org/show_bug.cgi?id=13224, which was also
reported at http://marc.info/?l=linux-wireless&m=124081921903223&w=2 .

Tested-by: Miles Lane <miles.lane@gmail.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Acked-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-20 14:29:53 -04:00
Forrest Zhang
a54be5d43a ath5k: fix exp off-by-one when computing OFDM delta slope
Commit e8f055f0c3 ("ath5k: Update reset code") subtly changed the
code that computes floating point values for the PHY3_TIMING register
such that the exponent is off by a decimal point, which can cause
problems with OFDM channel operation.

get_bitmask_order() actually returns the highest bit set plus one,
whereas the previous code wanted the highest bit set.  Instead, use
ilog2 which is what this code is really calculating.  Also check
coef_scaled to handle the (invalid) case where we need log2(0).

Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-20 14:07:51 -04:00
Johannes Berg
88f16db7a2 wext: verify buffer size for SIOCSIWENCODEEXT
Another design flaw in wireless extensions (is anybody
surprised?) in the way it handles the iw_encode_ext
structure: The structure is part of the 'extra' memory
but contains the key length explicitly, instead of it
just being the length of the extra buffer - size of
the struct and using the explicit key length only for
the get operation (which only writes it).

Therefore, we have this layout:

extra: +-------------------------+
       | struct iw_encode_ext  { |
       |     ...                 |
       |     u16 key_len;        |
       |     u8 key[0];          |
       | };                      |
       +-------------------------+
       | key material            |
       +-------------------------+

Now, all drivers I checked use ext->key_len without
checking that both key_len and the struct fit into the
extra buffer that has been copied from userspace. This
leads to a buffer overrun while reading that buffer,
depending on the driver it may be possible to specify
arbitrary key_len or it may need to be a proper length
for the key algorithm specified.

Thankfully, this is only exploitable by root, but root
can actually cause a segfault or use kernel memory as
a key (which you can even get back with siocgiwencode
or siocgiwencodeext from the key buffer).

Fix this by verifying that key_len fits into the buffer
along with struct iw_encode_ext.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-20 14:07:50 -04:00
Pavel Roskin
2b611cb6ee ath5k: fix scanning in AR2424
AR5K_PHY_PLL_40MHZ_5413 should not be ORed with AR5K_PHY_MODE_RAD_RF5112
for 5 GHz channels.

The incorrect PLL value breaks scanning in the countries where 5 GHz
channels are allowed.

Signed-off-by: Pavel Roskin <proski@gnu.org>
Acked-by: Nick Kossifidis <mickflemm@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-05-20 14:07:50 -04:00
Ben Hutchings
97bc54152e sfc: Remove lro module parameter
GRO/LRO can be controlled through ethtool so this is unnecessary.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-19 16:19:08 -07:00
Sascha Hlusiak
645069299a sit: stateless autoconf for isatap
be sent periodically. The rs_delay can be speficied when adding the
PRL entry and defaults to 15 minutes.

The RS is sent from every link local adress that's assigned to the
tunnel interface. It's directed to the (guessed) linklocal address
of the router and is sent through the tunnel.

Better: send to ff02::2 encapsuled in unicast directed to router-v4.

Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-19 16:02:02 -07:00
Sascha Hlusiak
9af28511be addrconf: refuse isatap eui64 for INADDR_ANY
A tunnel with no local ipv4 endpoint would otherwise use the
ISATAP linklocal address fe80::5efe:0:0, which is invalid. Rather not
add a linklocal address at all.

Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-19 16:02:02 -07:00
Sascha Hlusiak
4b27960174 sit: ipip6_tunnel_del_prl: return err
Typo. When deleting a PRL entry, return status to userspace
instead of success.

Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-19 16:02:01 -07:00
Sascha Hlusiak
4fddbf5d78 sit: strictly restrict incoming traffic to tunnel link device
Check link device when looking up a tunnel. When a tunnel is
linked to a interface, traffic from a different interface must not
reach the tunnel.

This also allows creating of multiple tunnels with the same
endpoints, if the link device differs.

Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-19 16:02:00 -07:00
Sascha Hlusiak
8db99e5717 sit: Fail to create tunnel, if it already exists
When locating the tunnel, do not continue if it is found. Otherwise
a different tunnel with similar configuration would be returned and
parts could be overwritten.

Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-19 16:02:00 -07:00
Chris Friesen
9643f45512 ipv4: teach ipconfig about the MTU option in DHCP
The DHCP spec allows the server to specify the MTU.  This can be useful
for netbooting with UDP-based NFS-root on a network using jumbo frames.
This patch allows the kernel IP autoconfiguration to handle this option
correctly.

It would be possible to use initramfs and add a script to set the MTU,
but that seems like a complicated solution if no initramfs is otherwise
necessary, and would bloat the kernel image more than this code would.

This patch was originally submitted to LKML in 2003 by Hans-Peter Jansen.

Signed-off-by: Chris Friesen <cfriesen@nortel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-19 15:36:17 -07:00
Pablo Neira Ayuso
fd2120ca0d net: use NLMSG_DEFAULT_SIZE in nlmsg_new() allocations
nlmsg_new() adds the size of the netlink header to the value
that has been passed as parameter. If NLMSG_GOODSIZE is selected,
we request an allocation of one memory page plus the size of the
header. Instead, NLMSG_DEFAULT_SIZE should be used since it
already substracts the size of the Netlink header.

I have the impression that the similar naming in both constant
is error prone when using it with nlmsg_new(). This is already
documented in include/net/netlink.h

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-19 15:36:16 -07:00
Brice Goglin
e5488ce569 myri10ge: update version to 1.5.0-1.415
Update myri10ge driver version to 1.5.0-1.415.

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-19 15:36:16 -07:00
Brice Goglin
3a0c7d2d2b myri10ge: allow LRO to be enabled via ethtool
Allow myri10ge LRO to be enabled/disabled via ethtool
(and by the stack for packet forwarding).

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-19 15:36:15 -07:00
Eric Dumazet
ab35cd4b8f sch_teql: Use net_device internal stats
We can slightly reduce size of teqlN structure, not duplicating stats
structure in teql_master but using stats field from net_device.stats
for tx_errors and from netdev_queue for tx_bytes/tx_packets/tx_dropped
values.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-19 15:36:15 -07:00
Jesse Brandeburg
0cefafadbb ixgbe: Cleanup feature setup code to make the code more readable
This is purely a cleanup patch.  This collapses some of the code required
when we configure our Tx and Rx feature sets, and makes the code more
readable and maintainable.

Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-19 15:36:14 -07:00
Peter P Waskiewicz Jr
537d58a00a ixgbe: Change Direct Attach Twinax cable detection for SFP+ NICs
The SFF specification for Direct Attach cable detection has now been
ratified.  Previously, DA cable detect was looking at the Twinaxial bit in
byte 9 of the SFP+ EEPROM.  The spec now defines active and passive DA
cables in byte 8 of the SFP+ EEPROM.  This patch changes the cable
detection for both 82598 and 82599 SFP+ adapters to conform to the new
spec.

Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-19 15:36:14 -07:00
Peter P Waskiewicz Jr
aa5aec8885 ixgbe: Add semaphore access for PHY initialization for 82599
The SFP+ NIC (device id 0x10fb) needs a semaphore to serialize
PHY access, so our PHY init code must honor that same semaphore.

Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-19 15:36:13 -07:00
françois romieu
3577aa1bd7 r8169: allow true forced mode setting
Due to mostly historic reasons, including a lack of reliability
of the link handling (especially with the older 8169), the
current r8169 driver emulates forced mode setting by limiting
the advertised modes.

With this change the driver allows real 10/100 forced mode
settings on the 8169 and 8101/8102.

Original idea by Vincent Steenhoute. The RTL_GIGA_MAC_VER_03
tweak was extracted from Realtek's r8169 v6.010.00 driver.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Tested-by: Jean Delvare <jdelvare@suse.de>
Cc: Edward Hsu <edward_hsu@realtek.com.tw>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-19 14:31:28 -07:00
françois romieu
381f05172b r8169: remove useless struct member
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Edward Hsu <edward_hsu@realtek.com.tw>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-19 14:31:27 -07:00
Eric Dumazet
8b2d850db2 ppp: unset IFF_XMIT_DST_RELEASE in ppp_setup()
Jarek pointed pppoe can call back dev_queue_xmit(), and might need
skb->dst, so its safer to unset IFF_XMIT_DST_RELEASE on ppp devices.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-19 14:24:37 -07:00
Eric Dumazet
93f154b594 net: release dst entry in dev_hard_start_xmit()
One point of contention in high network loads is the dst_release() performed
when a transmited skb is freed. This is because NIC tx completion calls
dev_kree_skb() long after original call to dev_queue_xmit(skb).

CPU cache is cold and the atomic op in dst_release() stalls. On SMP, this is
quite visible if one CPU is 100% handling softirqs for a network device,
since dst_clone() is done by other cpus, involving cache line ping pongs.

It seems right place to release dst is in dev_hard_start_xmit(), for most
devices but ones that are virtual, and some exceptions.

David Miller suggested to define a new device flag, set in alloc_netdev_mq()
(so that most devices set it at init time), and carefuly unset in devices
which dont want a NULL skb->dst in their ndo_start_xmit().

List of devices that must clear this flag is :

- loopback device, because it calls netif_rx() and quoting Patrick :
    "ip_route_input() doesn't accept loopback addresses, so loopback packets
     already need to have a dst_entry attached."
- appletalk/ipddp.c : needs skb->dst in its xmit function

- And all devices that call again dev_queue_xmit() from their xmit function
(as some classifiers need skb->dst) : bonding, vlan, macvlan, eql, ifb, hdlc_fr

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-18 22:19:19 -07:00
Eric W. Biederman
496a60cdcd net: FIX bonding sysfs rtnl_lock deadlock
Sysfs files for a network device can not unconditionally take the
rtnl_lock as the bonding sysfs files do.  If someone accesses those
sysfs files while the network device is being unregistered with the
rtnl_lock held we will deadlock.

So use trylock and restart_syscall to avoid this problem.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-18 22:16:00 -07:00
Eric W. Biederman
26574401fe net: Fix ipoib rtnl_lock sysfs deadlock.
Network device sysfs files that grab the rtnl_lock unconditionally
will deadlock if accessed when the network device is being
unregistered.  So use trylock and syscall_restart to avoid this
deadlock.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-18 22:15:59 -07:00
Eric W. Biederman
af38f29895 net: Fix bridgeing sysfs handling of rtnl_lock
Holding rtnl_lock when we are unregistering the sysfs files can
deadlock if we unconditionally take rtnl_lock in a sysfs file.  So fix
it with the now familiar patter of: rtnl_trylock and syscall_restart()

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-18 22:15:59 -07:00
Eric W. Biederman
9b8adb5ea0 net: Fix devinet_sysctl_forward
sysctls are unregistered with the rntl_lock held making
it unsafe to unconditionally grab the the rtnl_lock.  Instead
we need to call rtnl_trylock and restart the system call
if we can not grab it.  Otherwise we could deadlock at unregistration
time.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-18 22:15:58 -07:00
Eric W. Biederman
5007392d85 net: FIX ipv6_forward sysctl restart
Just returning -ERESTARTSYS without a signal pending is not
good that will just leak it to userspace.  We need return
-ERESTARTNOINTR so we always restart and set signal pending
so that we fall of the fast path of syscall return and setup
the system call restart.

So use restart_syscall() which does all of this for us.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-18 22:15:58 -07:00
Eric W. Biederman
336ca57c3b net-sysfs: Use rtnl_trylock in sysfs methods.
The earlier patch to fix the deadlock between a network device going
away and writing to sysfs attributes was incomplete.
- It did not set signal_pending so we would leak ERSTARTSYS to user space.
- It used ERESTARTSYS which only restarts if sigaction configures it to.
- It did not cover store and show for ifalias.

So fix all of these up and use the new helper restart_syscall so we get
the details correct on what it takes.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-18 22:15:57 -07:00
Eric W. Biederman
690cc3ffe3 syscall: Implement a convinience function restart_syscall
Currently when we have a signal pending we have the functionality
to restart that the current system call.  There are other cases
such as nasty lock ordering issues where it makes sense to have
a simple fix that uses try lock and restarts the system call.
Buying time to figure out how to rework the locking strategy.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-18 22:15:56 -07:00
Johann Baudy
69e3c75f4d net: TX_RING and packet mmap
New packet socket feature that makes packet socket more efficient for
transmission.

- It reduces number of system call through a PACKET_TX_RING mechanism,
  based on PACKET_RX_RING (Circular buffer allocated in kernel space
  which is mmapped from user space).

- It minimizes CPU copy using fragmented SKB (almost zero copy).

Signed-off-by: Johann Baudy <johann.baudy@gnu-log.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-18 22:11:22 -07:00
Frans Pop
bc8a539743 ipv4: make default for INET_LRO consistent with help text
Commit e81963b1 ("ipv4: Make INET_LRO a bool instead of tristate.")
changed this config from tristate to bool.  Add default so that it is
consistent with the help text.

Signed-off-by: Frans Pop <elendil@planet.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-18 21:48:38 -07:00
Dhananjay Phadke
f67f340849 netxen: fix msi irq setup
The pdev->irq was not saved in netxen_adapter, causing request_irq()
with invalid irq number.

This was broken in commit be339aee63
("netxen: fix irq tear down and msix leak.").

Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-18 21:46:40 -07:00