Commit Graph

6401 Commits (73afc9069289bdb77cf0c81cb9775dcb63894bbe)

Author SHA1 Message Date
Pavel Emelyanov d535a916cd [IPVS]: Fix compiler warning about unused register_ip_vs_protocol
This is silly, but I have turned the CONFIG_IP_VS to m,
to check the compilation of one (recently sent) fix
and set all the CONFIG_IP_VS_PROTO_XXX options to n to
speed up the compilation.

In this configuration the compiler warns me about

  CC [M]  net/ipv4/ipvs/ip_vs_proto.o
net/ipv4/ipvs/ip_vs_proto.c:49: warning: 'register_ip_vs_protocol' defined but not used

Indeed. With no protocols selected there are no
calls to this function - all are compiled out with
ifdefs.

Maybe the best fix would be to surround this call with
ifdef-s or tune the Kconfig dependences, but I think that
marking this register function as __used is enough. No?

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-20 17:44:01 -08:00
Jonas Danielsson b4a9811c42 [ARP]: Fix arp reply when sender ip 0
Fix arp reply when received arp probe with sender ip 0.

Send arp reply with target ip address 0.0.0.0 and target hardware
address set to hardware address of requester. Previously sent reply
with target ip address and target hardware address set to same as
source fields.

Signed-off-by: Jonas Danielsson <the.sator@gmail.com>
Acked-by: Alexey Kuznetov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-20 17:38:16 -08:00
YOSHIFUJI Hideaki 77adefdc98 [IPV6] TCPMD5: Fix deleting key operation.
Due to the bug, refcnt for md5sig pool was leaked when
an user try to delete a key if we have more than one key.
In addition to the leakage, we returned incorrect return
result value for userspace.

This fix should close Bug #9418, reported by <ming-baini@163.com>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-20 17:31:23 -08:00
YOSHIFUJI Hideaki aacbe8c880 [IPV6] TCPMD5: Check return value of tcp_alloc_md5sig_pool().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-20 17:30:56 -08:00
YOSHIFUJI Hideaki 354faf0977 [IPV4] TCPMD5: Use memmove() instead of memcpy() because we have overlaps.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-20 17:30:31 -08:00
YOSHIFUJI Hideaki a80cc20da4 [IPV4] TCPMD5: Omit redundant NULL check for kfree() argument.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-20 17:30:06 -08:00
David S. Miller 53438e5d04 Merge branch 'fixes-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2007-11-20 17:24:29 -08:00
Guillaume Chazarain 92468c53cf ieee80211: Stop net_ratelimit/IEEE80211_DEBUG_DROP log pollution
if (net_ratelimit())
	IEEE80211_DEBUG_DROP(...)

can pollute the logs with messages like:

printk: 1 messages suppressed.
printk: 2 messages suppressed.
printk: 7 messages suppressed.

if debugging information is disabled. These messages are printed by
net_ratelimit(). Add a wrapper to net_ratelimit() that takes into account
the log level, so that net_ratelimit() is called only when we really want
to print something.

Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-20 16:43:17 -05:00
Bruno Randolf 4b50e388f8 mac80211: add missing space in error message
Signed-off-by: Bruno Randolf <bruno@thinktube.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-20 16:43:17 -05:00
Johannes Berg c1428b3f45 mac80211: fix allmulti/promisc behaviour
When an interface with promisc/allmulti bit is taken down,
the mac80211 state can become confused. This fixes it by
making mac80211 keep track of all *active* interfaces that
have the promisc/allmulti bit set in the sdata, we sync
the interface bit into sdata at set_multicast_list() time
so this works.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-20 16:20:31 -05:00
Johannes Berg b52f2198ac mac80211: fix ieee80211_set_multicast_list
I recently experienced unexplainable behaviour with the b43
driver when I had broken firmware uploaded. The cause may have
been that promisc mode was not correctly enabled or disabled
and this bug may have been the cause.

Note how the values are compared later in the function so
just doing the & will result in the wrong thing being
compared and the test being false almost always.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-20 16:20:30 -05:00
Evgeniy Polyakov 1f305323ff [NETFILTER]: Fix kernel panic with REDIRECT target.
When connection tracking entry (nf_conn) is about to copy itself it can
have some of its extension users (like nat) as being already freed and
thus not required to be copied.

Actually looking at this function I suspect it was copied from
nf_nat_setup_info() and thus bug was introduced.

Report and testing from David <david@unsolicited.net>.

[ Patrick McHardy states:

	I now understand whats happening:

	- new connection is allocated without helper
	- connection is REDIRECTed to localhost
	- nf_nat_setup_info adds NAT extension, but doesn't initialize it yet
	- nf_conntrack_alter_reply performs a helper lookup based on the
	   new tuple, finds the SIP helper and allocates a helper extension,
	   causing reallocation because of too little space
	- nf_nat_move_storage is called with the uninitialized nat extension

	So your fix is entirely correct, thanks a lot :)  ]

Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-20 04:27:35 -08:00
David S. Miller 0a06ea8718 [WIRELESS] WEXT: Fix userspace corruption on 64-bit.
On 64-bit systems sizeof(struct ifreq) is 8 bytes larger than
sizeof(struct iwreq).

For GET calls, the wireless extension code copies back into userspace
using sizeof(struct ifreq) but userspace and elsewhere only allocates
a "struct iwreq".  Thus, this copy writes past the end of the iwreq
object and corrupts whatever sits after it in memory.

Fix the copy_to_user() length.

This particularly hurts the compat case because the wireless compat
code uses compat_alloc_userspace() and right after this allocated
buffer is the current bottom of the user stack, and that's what gets
overwritten by the copy_to_user() call.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-20 03:29:53 -08:00
Christoph Lameter 70cf5035de [S390] Explicitly code allocpercpu calls in iucv
The iucv is the only user of the various functions that are used to bring
parts of cpus up and down. Its the only allocpercpu user that will do
I/O on per cpu objects (which is difficult to do with virtually mapped memory).
And its the only use of allocpercpu where a GFP_DMA allocation is done.

Remove the allocpercpu calls from iucv and code the allocation and freeing
manually. After this patch it is possible to remove a large part of
the allocpercpu API.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-11-20 11:13:47 +01:00
Joe Perches a572da4373 [IRDA]: Add missing "space"
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-19 23:48:30 -08:00
Joe Perches c3c9d45828 [SUNRPC]: Add missing "space"
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-19 23:48:08 -08:00
Joe Perches 9ee46f1d31 [SCTP]: Add missing "space"
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-19 23:47:47 -08:00
Joe Perches 3b6d821c4f [IPV6]: Add missing "space"
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-19 23:47:25 -08:00
Joe Perches 3a47a68b05 [BRIDGE]: Add missing "space"
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-19 23:46:55 -08:00
Joe Perches 464c4f184a [IPV4]: Add missing "space"
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-19 23:46:29 -08:00
Joe Perches 4756daa3b6 [DCCP]: Add missing "space"
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-19 23:46:02 -08:00
Sam Jansen 5487796f0c [TCP]: Problem bug with sysctl_tcp_congestion_control function
From: "Sam Jansen" <sjansen@google.com>

sysctl_tcp_congestion_control seems to have a bug that prevents it
from actually calling the tcp_set_default_congestion_control
function. This is not so apparent because it does not return an error
and generally the /proc interface is used to configure the default TCP
congestion control algorithm.  This is present in 2.6.18 onwards and
probably earlier, though I have not inspected 2.6.15--2.6.17.

sysctl_tcp_congestion_control calls sysctl_string and expects a successful
return code of 0. In such a case it actually sets the congestion control
algorithm with tcp_set_default_congestion_control. Otherwise, it returns the
value returned by sysctl_string. This was correct in 2.6.14, as sysctl_string
returned 0 on success. However, sysctl_string was updated to return 1 on
success around about 2.6.15 and sysctl_tcp_congestion_control was not updated.
Even though sysctl_tcp_congestion_control returns 1, do_sysctl_strategy
converts this return code to '0', so the caller never notices the error.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-19 23:28:21 -08:00
Ilpo Järvinen 6e42141009 [TCP] MTUprobe: fix potential sk_send_head corruption
When the abstraction functions got added, conversion here was
made incorrectly. As a result, the skb may end up pointing
to skb which got included to the probe skb and then was freed.
For it to trigger, however, skb_transmit must fail sending as
well.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-19 23:24:09 -08:00
Pavel Emelyanov 1f8170b0ec [PKTGEN]: Fix double unlock of xfrm_state->lock
The pktgen_output_ipsec() function can unlock this lock twice
due to merged error and plain paths. Remove one of the calls
to spin_unlock.

Other possible solution would be to place "return 0" right 
after the first unlock, but at this place the err is known 
to be 0, so these solutions are the same except for this one
makes the code shorter.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-19 22:51:24 -08:00
Simon Horman 9055fa1f3d [IPVS]: Move remaining sysctl handlers over to CTL_UNNUMBERED
Switch the remaining IPVS sysctl entries over to to use CTL_UNNUMBERED,
I stronly doubt that anyone is using the sys_sysctl interface to
these variables.

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-19 21:51:13 -08:00
Simon Horman 9e103fa6bd [IPVS]: Fix sysctl warnings about missing strategy in schedulers
sysctl table check failed: /net/ipv4/vs/lblc_expiration .3.5.21.19 Missing strategy
[...]
sysctl table check failed: /net/ipv4/vs/lblcr_expiration .3.5.21.20 Missing strategy

Switch these entried over to use CTL_UNNUMBERED as clearly
the sys_syscal portion wasn't working.

This is along the same lines as Christian Borntraeger's patch that fixes
up entries with no stratergy in net/ipv4/ipvs/ip_vs_ctl.c

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-19 21:50:21 -08:00
Christian Borntraeger 611cd55b15 [IPVS]: Fix sysctl warnings about missing strategy
Running the latest git code I get the following messages during boot:
sysctl table check failed: /net/ipv4/vs/drop_entry .3.5.21.4 Missing strategy
[...]		  
sysctl table check failed: /net/ipv4/vs/drop_packet .3.5.21.5 Missing strategy
[...]
sysctl table check failed: /net/ipv4/vs/secure_tcp .3.5.21.6 Missing strategy
[...]
sysctl table check failed: /net/ipv4/vs/sync_threshold .3.5.21.24 Missing strategy

I removed the binary sysctl handler for those messages and also removed
the definitions in ip_vs.h. The alternative would be to implement a 
proper strategy handler, but syscall sysctl is deprecated.

There are other sysctl definitions that are commented out or work with 
the default sysctl_data strategy. I did not touch these. 

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-19 21:49:25 -08:00
Eric Dumazet 483b23ffa3 [NET]: Corrects a bug in ip_rt_acct_read()
It seems that stats of cpu 0 are counted twice, since
for_each_possible_cpu() is looping on all possible cpus, including 0

Before percpu conversion of ip_rt_acct, we should also remove the
assumption that CPU 0 is online (or even possible)

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-18 18:47:38 -08:00
J. Bruce Fields eda4f9b799 sunrpc: rpc_pipe_poll may miss available data in some cases
Pipe messages start out life on a queue on the inode, but when first
read they're moved to the filp's private pointer.  So it's possible for
a poll here to return null even though there's a partially read message
available.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-11-17 13:08:47 -05:00
Kevin Coffman ef338bee3f sunrpc: return error if unsupported enctype or cksumtype is encountered
Return an error from gss_import_sec_context_kerberos if the
negotiated context contains encryption or checksum types not
supported by the kernel code.

This fixes an Oops because success was assumed and later code found
no internal_ctx_id.

Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-11-17 13:08:46 -05:00
Kevin Coffman ffc40f5692 sunrpc: gss_pipe_downcall(), don't assume all errors are transient
Instead of mapping all errors except EACCES to EAGAIN, map all errors
except EAGAIN to EACCES.

An example is user-land negotiating a Kerberos context with an encryption
type that is not supported by the kernel code.  (This can happen due to
mis-configuration or a bug in the Kerberos code that does not honor our
request to limit the encryption types negotiated.)  This failure is not
transient, and returning EAGAIN causes mount to continuously retry rather
than giving up.

Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-11-17 13:08:45 -05:00
Evgeniy Polyakov 7799652557 [NETFILTER]: Fix NULL pointer dereference in nf_nat_move_storage()
Reported by Chuck Ebbert as:

	https://bugzilla.redhat.com/show_bug.cgi?id=259501#c14

This routine is called each time hash should be replaced, nf_conn has
extension list which contains pointers to connection tracking users
(like nat, which is right now the only such user), so when replace takes
place it should copy own extensions. Loop above checks for own
extension, but tries to move higer-layer one, which can lead to above
oops.

Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-15 15:52:32 -08:00
Patrick McHardy 6452a5fde0 [NETFILTER]: fix compat_nf_sockopt typo
It should pass opt to the ->get/->set functions, not ops.

Tested-by: Luca Tettamanti <kronos.it@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-15 14:29:21 -08:00
Pavel Emelyanov dab6ba3688 [INET]: Fix potential kfree on vmalloc-ed area of request_sock_queue
The request_sock_queue's listen_opt is either vmalloc-ed or
kmalloc-ed depending on the number of table entries. Thus it 
is expected to be handled properly on free, which is done in 
the reqsk_queue_destroy().

However the error path in inet_csk_listen_start() calls 
the lite version of reqsk_queue_destroy, called 
__reqsk_queue_destroy, which calls the kfree unconditionally. 

Fix this and move the __reqsk_queue_destroy into a .c file as 
it looks too big to be inline.

As David also noticed, this is an error recovery path only,
so no locking is required and the lopt is known to be not NULL.

reqsk_queue_yank_listen_sk is also now only used in
net/core/request_sock.c so we should move it there too.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-15 02:57:06 -08:00
David S. Miller d06fc1d9b5 Merge branch 'fixes-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2007-11-14 19:44:02 -08:00
Linus Torvalds 6f37ac793d Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [NET]: rt_check_expire() can take a long time, add a cond_resched()
  [ISDN] sc: Really, really fix warning
  [ISDN] sc: Fix sndpkt to have the correct number of arguments
  [TCP] FRTO: Clear frto_highmark only after process_frto that uses it
  [NET]: Remove notifier block from chain when register_netdevice_notifier fails
  [FS_ENET]: Fix module build.
  [TCP]: Make sure write_queue_from does not begin with NULL ptr
  [TCP]: Fix size calculation in sk_stream_alloc_pskb
  [S2IO]: Fixed memory leak when MSI-X vector allocation fails
  [BONDING]: Fix resource use after free
  [SYSCTL]: Fix warning for token-ring from sysctl checker
  [NET] random : secure_tcp_sequence_number should not assume CONFIG_KTIME_SCALAR
  [IWLWIFI]: Not correctly dealing with hotunplug.
  [TCP] FRTO: Plug potential LOST-bit leak
  [TCP] FRTO: Limit snd_cwnd if TCP was application limited
  [E1000]: Fix schedule while atomic when called from mii-tool.
  [NETX]: Fix build failure added by 2.6.24 statistics cleanup.
  [EP93xx_ETH]: Build fix after 2.6.24 NAPI changes.
  [PKT_SCHED]: Check subqueue status before calling hard_start_xmit
2007-11-14 18:51:48 -08:00
Adrian Bunk d5cd97872d sunrpc/xprtrdma/transport.c: fix use-after-free
Fix an obvious use-after-free spotted by the Coverity checker.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14 18:45:41 -08:00
Helmut Schaa a0af5f1454 mac80211: Fix queuing of scan containing a SSID
This patch fixes scanning for specific ssid's which is broken due to the
scan being queued up without respecting the ssid to scan for.

Signed-off-by: Helmut Schaa <hschaa@suse.de>
Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-14 21:16:46 -05:00
Eric Dumazet d90bf5a976 [NET]: rt_check_expire() can take a long time, add a cond_resched()
On commit 39c90ece7565f5c47110c2fa77409d7a9478bd5b:

	[IPV4]: Convert rt_check_expire() from softirq processing to workqueue.

we converted rt_check_expire() from softirq to workqueue, allowing the
function to perform all work it was supposed to do.

When the IP route cache is big, rt_check_expire() can take a long time
to run.  (default settings : 20% of the hash table is scanned at each
invocation)

Adding cond_resched() helps giving cpu to higher priority tasks if
necessary.

Using a "if (need_resched())" test before calling "cond_resched();" is
necessary to avoid spending too much time doing the resched check.
(My tests gave a time reduction from 88 ms to 25 ms per
rt_check_expire() run on my i686 test machine)

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-14 16:14:05 -08:00
Ilpo Järvinen e1cd8f78f8 [TCP] FRTO: Clear frto_highmark only after process_frto that uses it
I broke this in commit 3de96471bd7fb76406e975ef6387abe3a0698149:

	[TCP]: Wrap-safed reordering detection FRTO check

tcp_process_frto should always see a valid frto_highmark. An invalid
frto_highmark (zero) is very likely what ultimately caused a seqno
compare in tcp_frto_enter_loss to do the wrong leading to the LOST-bit
leak.

Having LOST-bits integry ensured like done after commit
23aeeec365dcf8bc87fae44c533e50d0bb4f23cc:

	[TCP] FRTO: Plug potential LOST-bit leak

won't hurt. It may still be useful in some other, possibly legimate,
scenario.

Reported by Chazarain Guillaume <guichaz@yahoo.fr>.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-14 15:55:09 -08:00
Pavel Emelyanov c67625a1ec [NET]: Remove notifier block from chain when register_netdevice_notifier fails
Commit fcc5a03ac42564e9e255c1134dda47442289e466:

	[NET]: Allow netdev REGISTER/CHANGENAME events to fail

makes the register_netdevice_notifier() handle the error from the
NETDEV_REGISTER event, sent to the registering block.

The bad news is that in this case the notifier block is 
not removed from the list, but the error is returned to the 
caller. In case the caller is in module init function and 
handles this error this can abort the module loading. The
notifier block will be then removed from the kernel, but 
will be left in the list. Oops :(

I think that the notifier block should be removed from the
chain in case of error, regardless whether this error is 
handled by the caller or not. In the worst case (the error 
is _not_ handled) module will not receive the events any 
longer.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-14 15:53:16 -08:00
Ilpo Järvinen 96a2d41a3e [TCP]: Make sure write_queue_from does not begin with NULL ptr
NULL ptr can be returned from tcp_write_queue_head to cached_skb
and then assigned to skb if packets_out was zero. Without this,
system is vulnerable to a carefully crafted ACKs which obviously
is remotely triggerable.

Besides, there's very little that needs to be done in sacktag
if there weren't any packets outstanding, just skipping the rest
doesn't hurt.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-14 15:47:18 -08:00
Ilpo Järvinen 23aeeec365 [TCP] FRTO: Plug potential LOST-bit leak
It might be possible that, in some extreme scenario that
I just cannot now construct in my mind, end_seq <=
frto_highmark check does not match causing the lost_out
and LOST bits become out-of-sync due to clearing and
recounting in the loop.

This may fix LOST-bit leak reported by Chazarain Guillaume
<guichaz@yahoo.fr>.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-13 21:03:13 -08:00
Ilpo Järvinen 746aa32d28 [TCP] FRTO: Limit snd_cwnd if TCP was application limited
Otherwise TCP might violate packet ordering principles that FRTO
is based on. If conventional recovery path is chosen, this won't
be significant at all. In practice, any small enough value will
be sufficient to provide proper operation for FRTO, yet other
users of snd_cwnd might benefit from a "close enough" value.

FRTO's formula is now equal to what tcp_enter_cwr() uses.

FRTO used to check application limitedness a bit differently but
I changed that in commit 575ee7140d
and as a result checking for application limitedness became
completely non-existing.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-13 21:01:23 -08:00
Peter P Waskiewicz Jr 5f1a485d59 [PKT_SCHED]: Check subqueue status before calling hard_start_xmit
The only qdiscs that check subqueue state before dequeue'ing are PRIO
and RR.  The other qdiscs, including the default pfifo_fast qdisc,
will allow traffic bound for subqueue 0 through to hard_start_xmit.
The check for netif_queue_stopped() is done above in pkt_sched.h, so
it is unnecessary for qdisc_restart().  However, if the underlying
driver is multiqueue capable, and only sets queue states on subqueues,
this will allow packets to enter the driver when it's currently unable
to process packets, resulting in expensive requeues and driver
entries.  This patch re-adds the check for the subqueue status before
calling hard_start_xmit, so we can try and avoid the driver entry when
the queues are stopped.

Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-13 20:40:55 -08:00
Eric Dumazet 53756524e4 [NETFILTER]: xt_time should not assume CONFIG_KTIME_SCALAR
It is not correct to assume one can get nsec from a ktime directly by
using .tv64 field.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-13 03:49:53 -08:00
Denis V. Lunev 022cbae611 [NET]: Move unneeded data to initdata section.
This patch reverts Eric's commit 2b008b0a8e

It diets .text & .data section of the kernel if CONFIG_NET_NS is not set.
This is safe after list operations cleanup.

Signed-of-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-13 03:23:50 -08:00
Denis V. Lunev ed160e839d [NET]: Cleanup pernet operation without CONFIG_NET_NS
If CONFIG_NET_NS is not set, the only namespace is possible.

This patch removes list of pernet_operations and cleanups code a bit.
This list is not needed if there are no namespaces. We should just call
->init method.

Additionally, the ->exit will be called on module unloading only. This
case is safe - the code is not discarded. For the in/kernel code, ->exit
should never be called.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-13 03:23:21 -08:00
Patrick McHardy 81d9ddae85 [NETFILTER]: bridge: fix double POSTROUTING hook invocation
Packets routed between bridges have the POST_ROUTING hook invoked
twice since bridging mistakes them for bridged packets because
they have skb->nf_bridge set.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-13 02:58:44 -08:00
Pavel Emelyanov 4ce5ba6aec [NETFILTER]: Consolidate nf_sockopt and compat_nf_sockopt
Both lookup the nf_sockopt_ops object to call the get/set callbacks
from, but they perform it in a completely similar way.

Introduce the helper for finding the ops.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-13 02:58:09 -08:00
Li Zefan e0bf9cf15f [NETFILTER]: nf_nat: fix memset error
The size passing to memset is the size of a pointer.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-13 02:57:16 -08:00
Pavel Emelyanov d71209ded2 [INET]: Use list_head-s in inetpeer.c
The inetpeer.c tracks the LRU list of inet_perr-s, but makes
it by hands. Use the list_head-s for this.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-12 21:27:28 -08:00
Adrian Bunk 22649d1afb [IPVS]: Remove unused exports.
This patch removes the following unused EXPORT_SYMBOL's:
- ip_vs_try_bind_dest
- ip_vs_find_dest

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-12 21:25:24 -08:00
Adrian Bunk 6aed42159d [NET]: Unexport sysctl_{r,w}mem_max.
sysctl_{r,w}mem_max can now be unexported.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-12 21:24:14 -08:00
Urs Thuermann be85d4ad8a [AF_PACKET]: Fix minor code duplication
Simplify some code by eliminating duplicate if-else clauses in
packet_do_bind().

Signed-off-by: Urs Thuermann <urs@isnogud.escape.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-12 21:05:20 -08:00
David S. Miller bce943278d Merge branch 'pending' of master.kernel.org:/pub/scm/linux/kernel/git/vxy/lksctp-dev 2007-11-12 18:16:13 -08:00
Trond Myklebust 91cf45f02a [NET]: Add the helper kernel_sock_shutdown()
...and fix a couple of bugs in the NBD, CIFS and OCFS2 socket handlers.

Looking at the sock->op->shutdown() handlers, it looks as if all of them
take a SHUT_RD/SHUT_WR/SHUT_RDWR argument instead of the
RCV_SHUTDOWN/SEND_SHUTDOWN arguments.
Add a helper, and then define the SHUT_* enum to ensure that kernel users
of shutdown() don't get confused.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Mark Fasheh <mark.fasheh@oracle.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-12 18:10:39 -08:00
Pierre Ynard dbb2ed2485 [IPV6]: Add ifindex field to ND user option messages.
Userland neighbor discovery options are typically heavily involved with
the interface on which thay are received: add a missing ifindex field to
the original struct. Thanks to Rémi Denis-Courmont.

Signed-off-by: Pierre Ynard <linkfanel@yahoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-12 17:58:35 -08:00
Jesper Juhl 9abed245a6 Fix memory leak in discard case of sctp_sf_abort_violation()
In net/sctp/sm_statefuns.c::sctp_sf_abort_violation() we may leak
the storage allocated for 'abort' by returning from the function
without using or freeing it. This happens in case
"sctp_auth_recv_cid(SCTP_CID_ABORT, asoc)" is true and we jump to
the 'discard' label.
Spotted by the Coverity checker.

The simple fix is to simply move the creation of the "abort chunk"
to after the possible jump to the 'discard' label. This way we don't
even have to allocate the memory at all in the problem case.

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2007-11-12 10:13:24 -05:00
Denis V. Lunev 2994c63863 [INET]: Small possible memory leak in FIB rules
This patch fixes a small memory leak. Default fib rules can be deleted by
the user if the rule does not carry FIB_RULE_PERMANENT flag, f.e. by
	ip rule flush

Such a rule will not be freed as the ref-counter has 2 on start and becomes
clearly unreachable after removal.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10 22:12:03 -08:00
Alexey Dobriyan 33d36bb83c [NETNS]: init dev_base_lock only once
* it already statically initialized
* reinitializing live global spinlock every time netns is
  setup is also wrong

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10 22:09:25 -08:00
Pavel Emelyanov 284b327be2 [UNIX]: The unix_nr_socks limit can be exceeded
The unix_nr_socks value is limited with the 2 * get_max_files() value,
as seen from the unix_create1(). However, the check and the actual
increment are separated with the GFP_KERNEL allocation, so this limit
can be exceeded under a memory pressure - task may go to sleep freeing
the pages and some other task will be allowed to allocate a new sock
and so on and so forth.

So make the increment before the check (similar thing is done in the
sock_kmalloc) and go to kmalloc after this.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10 22:08:30 -08:00
Pavel Emelyanov 5c80f1ae98 [AF_UNIX]: Convert socks to unix_socks in scan_inflight, not in callbacks
The scan_inflight() routine scans through the unix sockets and calls
some passed callback. The fact is that all these callbacks work with
the unix_sock objects, not the sock ones, so make this conversion in
the scan_inflight() before calling the callbacks.

This removes one unneeded variable from the inc_inflight_move_tail().

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10 22:07:13 -08:00
Pavel Emelyanov 9305cfa444 [AF_UNIX]: Make unix_tot_inflight counter non-atomic
This counter is _always_ modified under the unix_gc_lock spinlock, 
so its atomicity can be provided w/o additional efforts.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10 22:06:01 -08:00
Peter P Waskiewicz Jr 8032b46489 [AF_PACKET]: Allow multicast traffic to be caught by ORIGDEV when bonded
The socket option for packet sockets to return the original ifindex instead
of the bonded ifindex will not match multicast traffic.  Since this socket
option is the most useful for layer 2 traffic and multicast traffic, make
the option multicast-aware.

Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10 22:03:25 -08:00
Johannes Berg d52a60ad38 mac80211: fix MAC80211_RCSIMPLE Kconfig
I meant for this to be selectable only with EMBEDDED, not enabled only
with EMBEDDED. This does it that way. Sorry.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10 22:01:42 -08:00
John W. Linville 7f3ad8943e mac80211: make "decrypt failed" messages conditional upon MAC80211_DEBUG
Make "decrypt failed" and "have no key" debugging messages compile
conditionally upon CONFIG_MAC80211_DEBUG.  They have been useful for
finding certain problems in the past, but in many cases they just
clutter a user's logs.

A typical example is an enviornment where multiple SSIDs are using a
single BSSID but with different protection schemes or different keys
for each SSID.  In such an environment these messages are just noise.
Let's just leave them for those interested enough to turn-on debugging.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10 22:01:34 -08:00
Johannes Berg 5b98b1f7da mac80211: use IW_AUTH_PRIVACY_INVOKED rather than IW_AUTH_KEY_MGMT
In the long bug-hunt for why dynamic WEP networks didn't work it
turned out that mac80211 incorrectly uses IW_AUTH_KEY_MGMT while
it should use IW_AUTH_PRIVACY_INVOKED to determine whether to
associate to protected networks or not.

This patch changes the behaviour to be that way and clarifies the
existing code.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Jouni Malinen <j@w1.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10 22:01:25 -08:00
Johannes Berg 56db6c52bb mac80211: remove unused driver ops
The driver operations set_ieee8021x(), set_port_auth() and
set_privacy_invoked() are not used by any drivers, except
set_privacy_invoked() they aren't even used by mac80211.
Remove them at least until we need to support drivers with
mac80211 that require getting this information.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Michael Wu <flamingice@sourmilk.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10 22:01:15 -08:00
Johannes Berg 8636bf6513 mac80211: remove ieee80211_common.h
Robert pointed out that I missed this file when removing the management
interface. Do it now.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10 22:01:04 -08:00
Michael Buesch 2736622344 rfkill: Fix sparse warning
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10 22:00:28 -08:00
Michael Buesch 7319f1e6bc rfkill: Use mutex_lock() at register and add sanity check
Replace mutex_lock_interruptible() by mutex_lock() in rfkill_register(),
as interruptible doesn't make sense there.

Add a sanity check for rfkill->type, as that's used for an unchecked dereference
in an array and might cause hard to debug crashes if the driver sets this
to an invalid value.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10 22:00:15 -08:00
Johannes Berg 830f903866 mac80211: allow driver to ask for a rate control algorithm
This allows a driver to ask for a specific rate control algorithm.
The rate control algorithm asked for must be registered and be
available as a module or built-in.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10 21:59:54 -08:00
Johannes Berg 999acd9c33 mac80211: don't allow registering the same rate control twice
Previously, mac80211 would allow registering the same rate control
algorithm twice. This is a programming error in the registration
and should not happen; additionally the second version could never
be selected. Disallow this and warn about it.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10 21:59:43 -08:00
Michael Buesch 2bf236d55e rfkill: Use subsys_initcall
We must use subsys_initcall, because we must initialize before a
driver calls rfkill_register().

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10 21:59:33 -08:00
Johannes Berg ac71c691e6 mac80211: make simple rate control algorithm built-in
Too frequently people do not have module autoloading enabled
or fail to install the rate control module correctly, hence
their hardware probing fails due to no rate control algorithm
being available. This makes the 'simple' algorithm built into
the mac80211 module unless EMBEDDED is enabled in which case
it can be disabled (eg. if the wanted driver requires another
rate control algorithm.)

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10 21:59:23 -08:00
Michael Buesch 8a8f1c0437 rfkill: Register LED triggers before registering switch
Registering the switch triggers a LED event, so we must register
LED triggers before the switch.
This has a potential to fix a crash, depending on how the device
driver initializes the rfkill data structure.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10 21:59:11 -08:00
Johannes Berg 94e10bfb8a softmac: fix wext MLME request reason code endianness
The MLME request reason code is host-endian and our passing
it to the low level functions is host-endian as well since
they do the swapping. I noticed that the reason code 768 was
sent (0x300) rather than 3 when wpa_supplicant terminates.
This removes the superfluous cpu_to_le16() call.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10 21:58:41 -08:00
Radu Rendec b226801676 [PKT_SCHED] CLS_U32: Use ffs() instead of C code on hash mask to get first set bit.
Computing the rank of the first set bit in the hash mask (for using later
in u32_hash_fold()) was done with plain C code. Using ffs() instead makes
the code more readable and improves performance (since ffs() is better
optimized in assembler).

Using the conditional operator on hash mask before applying ntohl() also
saves one ntohl() call if mask is 0.

Signed-off-by: Radu Rendec <radu.rendec@ines.ro>
Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10 21:54:50 -08:00
Patrick McHardy 39aaac114e [VLAN]: Allow setting mac address while device is up
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10 21:52:35 -08:00
Patrick McHardy d932e04a5e [VLAN]: Don't synchronize addresses while the vlan device is down
While the VLAN device is down, the unicast addresses are not configured
on the underlying device, so we shouldn't attempt to sync them.

Noticed by Dmitry Butskoy <buc@odusz.so-cdu.ru>

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10 21:51:40 -08:00
Pavel Emelyanov 358352b8b8 [INET]: Cleanup the xfrm4_tunnel_(un)register
Both check for the family to select an appropriate tunnel list.
Consolidate this check and make the for() loop more readable.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10 21:48:54 -08:00
Pavel Emelyanov 99f933263a [INET]: Add missed tunnel64_err handler
The tunnel64_protocol uses the tunnel4_protocol's err_handler and
thus calls the tunnel4_protocol's handlers.

This is not very good, as in case of (icmp) error the wrong error
handlers will be called (e.g. ipip ones instead of sit) and this
won't be noticed at all, because the error is not reported.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10 21:47:39 -08:00
Pavel Emelyanov c2b42336f4 [IPX]: Use existing sock refcnt debugging infrastructure
Just like in the af_packet.c, the ipx_sock_nr variable is used
for debugging purposes.

Switch to using existing infrastructure. Thanks to Arnaldo for
pointing this out.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10 21:39:26 -08:00
Pavel Emelyanov 17ab56a260 [PACKET]: Use existing sock refcnt debugging infrastructure
The packet_socks_nr variable is used purely for debugging
the number of sockets.

As Arnaldo pointed out, there's already an infrastructure
for this purposes, so switch to using it.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10 21:38:48 -08:00
Joe Perches e9671fcb3b [NET]: Fix infinite loop in dev_mc_unsync().
From: Joe Perches <joe@perches.com>

Based upon an initial patch and report by Luis R. Rodriguez.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10 21:36:04 -08:00
Pavel Emelyanov 03f49f3457 [NET]: Make helper to get dst entry and "use" it
There are many places that get the dst entry, increase the
__use counter and set the "lastuse" time stamp.

Make a helper for this.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10 21:28:34 -08:00
Pavel Emelyanov b1667609cd [IPV4]: Remove bugus goto-s from ip_route_input_slow
Both places look like

        if (err == XXX) 
               goto yyy;
   done:

while both yyy targets look like

        err = XXX;
        goto done;

so this is ok to remove the above if-s.

yyy labels are used in other places and are not removed.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10 21:26:41 -08:00
Ilpo Järvinen fbd52eb2bd [TCP]: Split SACK FRTO flag clearing (fixes FRTO corner case bug)
In case we run out of mem when fragmenting, the clearing of
FLAG_ONLY_ORIG_SACKED might get missed which then feeds FRTO
with false information. Move clearing outside skb processing
loop so that it will get executed even if the skb loop
terminates prematurely due to out-of-mem.

Besides, now the core of the loop truly deals with a single
skb only, which also enables creation a more self-contained
of tcp_sacktag_one later on.

In addition, small reorganization of if branches was made.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10 21:24:19 -08:00
Ilpo Järvinen e49aa5d456 [TCP]: Add unlikely() to sacktag out-of-mem in fragment case
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10 21:23:08 -08:00
Ilpo Järvinen c7caf8d3ed [TCP]: Fix reord detection due to snd_una covered holes
Fixes subtle bug like the one with fastpath_cnt_hint happening
due to the way the GSO and hints interact. Because hints are not
reset when just a GSOed skb is partially ACKed, there's no
guarantee that the relevant part of the write queue is going to
be processed in sacktag at all (skbs below snd_una) because
fastpath hint can fast forward the entrypoint.

This was also on the way of future reductions in sacktag's skb
processing. Also future cleanups in sacktag can be made after
this (in 2.6.25).

This may make reordering update in tcp_try_undo_partial
redundant but I'm not too sure so I left it there.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10 21:22:18 -08:00
Ilpo Järvinen 8dd71c5d28 [TCP]: Consider GSO while counting reord in sacktag
Reordering detection fails to take account that the reordered
skb may have pcount larger than 1. In such case the lowest of
them had the largest reordering, the old formula used the
highest of them which is pcount - 1 packets less reordered.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10 21:20:59 -08:00
Vlad Yasevich 7d54dc6876 SCTP: Always flush the queue when uncorcking.
When the code calls uncork, trigger a queue flush, even
if the queue was not corked.  Most callers that explicitely
cork the queue will have additinal checks to see if they 
corked it.  Callers who do not cork the queue expect packets
to flow when they call uncork.

The scneario that showcased this bug happend when we were not
able to bundle DATA with outgoing COOKIE-ECHO.  As a result
the data just sat in the outqueue and did not get transmitted.
The application expected a response, but nothing happened.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2007-11-09 11:43:41 -05:00
Vlad Yasevich cd3ae8e615 SCTP: Fix PR-SCTP to deliver all the accumulated ordered chunks
There is a small bug when we process a FWD-TSN.  We'll deliver
anything upto the current next expected SSN.  However, if the
next expected is already in the queue, it will take another
chunk to trigger its delivery.  The fix is to simply check
the current queued SSN is the next expected one.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2007-11-09 11:43:41 -05:00
Vlad Yasevich 7ab9080467 SCTP: Make sctp_verify_param return multiple indications.
SCTP-AUTH and future ADD-IP updates have a requirement to
do additional verification of parameters and an ability to
ABORT the association if verification fails.  So, introduce
additional return code so that we can clear signal a required
action.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2007-11-09 11:43:41 -05:00
Vlad Yasevich d970dbf845 SCTP: Convert custom hash lists to use hlist.
Convert the custom hash list traversals to use hlist functions.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2007-11-09 11:43:40 -05:00
Vlad Yasevich 123ed979ea SCTP: Use hashed lookup when looking for an association.
A SCTP endpoint may have a lot of associations on them and walking
the list is fairly inefficient.  Instead, use a hashed lookup,
and filter out the hash list based on the endopoing we already have.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2007-11-09 11:41:36 -05:00
Vlad Yasevich 027f6e1ad3 SCTP: Fix a potential race between timers and receive path.
There is a possible race condition where the timer code will
free the association and the next packet in the queue will also
attempt to free the same association.

The example is, when we receive an ABORT at about the same time
as the retransmission timer fires.  If the timer wins the race,
it will free the association.  Once it releases the lock, the
queue processing will recieve the ABORT and will try to free
the association again.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2007-11-07 11:39:27 -05:00
Vlad Yasevich 73d9c4fd1a SCTP: Allow ADD_IP to work with AUTH for backward compatibility.
This patch adds a tunable that will allow ADD_IP to work without
AUTH for backward compatibility.  The default value is off since
the default value for ADD_IP is off as well.  People who need
to use ADD-IP with older implementations take risks of connection
hijacking and should consider upgrading or turning this tunable on.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2007-11-07 11:39:27 -05:00
Vlad Yasevich 88799fe5ec SCTP: Correctly disable ADD-IP when AUTH is not supported.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2007-11-07 11:39:27 -05:00