linux

Commit Graph

Author	SHA1	Message	Date
Ilpo Järvinen	95c4922bf9	[TCP] FRTO: fixes fallback to conventional recovery The FRTO detection did not care how ACK pattern affects to cwnd calculation of the conventional recovery. This caused incorrect setting of cwnd when the fallback becames necessary. The knowledge tcp_process_frto() has about the incoming ACK is now passed on to tcp_enter_frto_loss() in allowed_segments parameter that gives the number of segments that must be added to packets-in-flight while calculating the new cwnd. Instead of snd_una we use FLAG_DATA_ACKED in duplicate ACK detection because RFC4138 states (in Section 2.2): If the first acknowledgment after the RTO retransmission does not acknowledge all of the data that was retransmitted in step 1, the TCP sender reverts to the conventional RTO recovery. Otherwise, a malicious receiver acknowledging partial segments could cause the sender to declare the timeout spurious in a case where data was lost. If the next ACK after RTO is duplicate, we do not retransmit anything, which is equal to what conservative conventional recovery does in such case. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:23:07 -07:00
Ilpo Järvinen	6408d206c7	[TCP] FRTO: Ignore some uninteresting ACKs Handles RFC4138 shortcoming (in step 2); it should also have case c) which ignores ACKs that are not duplicates nor advance window (opposite dir data, winupdate). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:23:06 -07:00
Ilpo Järvinen	7b0eb22b1d	[TCP] FRTO: Use Disorder state during operation instead of Open Retransmission counter assumptions are to be changed. Forcing reason to do this exist: Using sysctl in check would be racy as soon as FRTO starts to ignore some ACKs (doing that in the following patches). Userspace may disable it at any moment giving nice oops if timing is right. frto_counter would be inaccessible from userspace, but with SACK enhanced FRTO retrans_out can include other than head, and possibly leaving it non-zero after spurious RTO, boom again. Luckily, solution seems rather simple: never go directly to Open state but use Disorder instead. This does not really change much, since TCP could anyway change its state to Disorder during FRTO using path tcp_fastretrans_alert -> tcp_try_to_open (e.g., when a SACK block makes ACK dubious). Besides, Disorder seems to be the state where TCP should be if not recovering (in Recovery or Loss state) while having some retransmissions in-flight (see tcp_try_to_open), which is exactly what happens with FRTO. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:23:05 -07:00
Ilpo Järvinen	7487c48c4f	[TCP] FRTO: Consecutive RTOs keep prior_ssthresh and ssthresh In case a latency spike causes more than one RTO, the later should not cause the already reduced ssthresh to propagate into the prior_ssthresh since FRTO declares all such RTOs spurious at once or none of them. In treating of ssthresh, we mimic what tcp_enter_loss() does. The previous state (in frto_counter) must be available until we have checked it in tcp_enter_frto(), and also ACK information flag in process_frto(). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:23:04 -07:00
Ilpo Järvinen	30935cf4f9	[TCP] FRTO: Comment cleanup & improvement Moved comments out from the body of process_frto() to the head (preferred way; see Documentation/CodingStyle). Bonus: it's much easier to read in this compacted form. FRTO algorithm and implementation is described in greater detail. For interested reader, more information is available in RFC4138. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:23:03 -07:00
Ilpo Järvinen	bdaae17da8	[TCP] FRTO: Moved tcp_use_frto from tcp.h to tcp_input.c In addition, removed inline. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:23:02 -07:00
Ilpo Järvinen	9ead9a1d38	[TCP] FRTO: Separated response from FRTO detection algorithm FRTO spurious RTO detection algorithm (RFC4138) does not include response to a detected spurious RTO but can use different response algorithms. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:23:01 -07:00
Ilpo Järvinen	522e7548a9	[TCP] FRTO: Incorrectly clears TCPCB_EVER_RETRANS bit FRTO was slightly too brave... Should only clear TCPCB_SACKED_RETRANS bit. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:23:00 -07:00
Alexey Kuznetsov	1194ed0a3e	[NETLINK]: Infinite recursion in netlink. Reply to NETLINK_FIB_LOOKUP messages were misrouted back to kernel, which resulted in infinite recursion and stack overflow. The bug is present in all kernel versions since the feature appeared. The patch also makes some minimal cleanup: 1. Return something consistent (-ENOENT) when fib table is missing 2. Do not crash when queue is empty (does not happen, but yet) 3. Put result of lookup Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 13:07:28 -07:00
Patrick McHardy	05d224468a	[XFRM]: beet: fix pseudo header length value draft-nikander-esp-beet-mode-07.txt is not entirely clear on how the length value of the pseudo header should be calculated, it states "The Header Length field contains the length of the pseudo header, IPv4 options, and padding in 8 octets units.", but also states "Length in octets (Header Len + 1) * 8". draft-nikander-esp-beet-mode-08-pre1.txt [1] clarifies this, the header length should not include the first 8 byte. This change affects backwards compatibility, but option encapsulation didn't work until very recently anyway. [1] http://users.piuha.net/jmelen/BEET/draft-nikander-esp-beet-mode-08-pre1.txt Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-23 22:39:02 -07:00
Stephen Hemminger	4d4d3d1e88	[TCP]: Congestion control initialization. Change to defer congestion control initialization. If setsockopt() was used to change TCP_CONGESTION before connection is established, then protocols that use sequence numbers to keep track of one RTT interval (vegas, illinois, ...) get confused. Change the init hook to be called after handshake. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-23 22:32:11 -07:00
David S. Miller	49688c8431	[NETFILTER] arp_tables: Fix unaligned accesses. There are two device string comparison loops in arp_packet_match(). The first one goes byte-by-byte but the second one tries to be clever and cast the string to a long and compare by longs. The device name strings in the arp table entries are not guarenteed to be aligned enough to make this value, so just use byte-by-byte for both cases. Based upon a report by <drraid@gmail.com>. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-13 16:37:54 -07:00
Patrick McHardy	01102e7ca2	[NETFILTER]: ipt_ULOG: use put_unaligned Use put_unaligned to fix warnings about unaligned accesses. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-12 14:27:03 -07:00
Jaroslav Kysela	50c9cc2e54	[NETFILTER]: ipt_CLUSTERIP: fix oops in checkentry function The clusterip_config_find_get() already increases entries reference counter, so there is no reason to do it twice in checkentry() callback. This causes the config to be freed before it is removed from the list, resulting in a crash when adding the next rule. Signed-off-by: Jaroslav Kysela <perex@suse.cz> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-10 13:26:48 -07:00
David S. Miller	15d33c070d	[TCP]: slow_start_after_idle should influence cwnd validation too For the cases that slow_start_after_idle are meant to deal with, it is almost a certainty that the congestion window tests will think the connection is application limited and we'll thus decrease the cwnd there too. This defeats the whole point of setting slow_start_after_idle to zero. So test it there too. We do not cancel out the entire tcp_cwnd_validate() function so that if the sysctl is changed we still have the validation state maintained. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-09 13:31:15 -07:00
Patrick McHardy	254d0d24e3	[XFRM]: beet: fix IP option decapsulation Beet mode looks for the beet pseudo header after the outer IP header, which is wrong since that is followed by the ESP header. Additionally it needs to adjust the packet length after removing the pseudo header and point the data pointer to the real data location. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-05 16:03:33 -07:00
Patrick McHardy	d4b1e84062	[XFRM]: beet: fix beet mode decapsulation Beet mode decapsulation fails to properly set up the skb pointers, which only works by accident in combination with CONFIG_NETFILTER, since in that case the skb is fixed up in xfrm4_input before passing it to the netfilter hooks. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-05 15:59:41 -07:00
Patrick McHardy	04fef9893a	[XFRM]: beet: use IPOPT_NOP for option padding draft-nikander-esp-beet-mode-07.txt states "The padding MUST be filled with NOP options as defined in Internet Protocol [1] section 3.1 Internet header format.", so do that. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-05 15:54:39 -07:00
Patrick McHardy	c5027c9a89	[XFRM]: beet: fix IP option encapsulation Beet mode calculates an incorrect value for the transport header location when IP options are present, resulting in encapsulation errors. The correct location is 4 or 8 bytes before the end of the original IP header, depending on whether the pseudo header is padded. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-05 15:54:02 -07:00
John Heffner	84565070e4	[TCP]: Do receiver-side SWS avoidance for rcvbuf < MSS. Signed-off-by: John Heffner <jheffner@psc.edu> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-02 13:56:32 -07:00
Robert Olsson	d562f1f8a9	[IPV4] fib_trie: Document locking. Paul E. McKenney writes: > Those of use who dive into networking only occasionally would much > appreciate this. ;-) No problem here... Acked-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> (but trivial) Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-26 14:22:22 -07:00
Thomas Graf	a0ee18b9b7	[IPv4] fib: Fix out of bound access of fib_props[] Fixes a typo which caused fib_props[] to have the wrong size and makes sure the value used to index the array which is provided by userspace via netlink is checked to avoid out of bound access. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-25 18:48:03 -07:00
Thomas Graf	e1701c68c1	[NET]: Fix fib_rules compatibility breakage Based upon a patch from Patrick McHardy. The fib_rules netlink attribute policy introduced in 2.6.19 broke userspace compatibilty. When specifying a rule with "from all" or "to all", iproute adds a zero byte long netlink attribute, but the policy requires all addresses to have a size equal to sizeof(struct in_addr)/sizeof(struct in6_addr), resulting in a validation error. Check attribute length of FRA_SRC/FRA_DST in the generic framework by letting the family specific rules implementation provide the length of an address. Report an error if address length is non zero but no address attribute is provided. Fix actual bug by checking address length for non-zero instead of relying on availability of attribute. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-25 18:48:00 -07:00
Patrick McHardy	848c29fd64	[NETFILTER]: nat: avoid rerouting packets if only XFRM policy key changed Currently NAT not only reroutes packets in the OUTPUT chain when the routing key changed, but also if only the non-routing part of the IPsec policy key changed. This breaks ping -I since it doesn't use SO_BINDTODEVICE but IP_PKTINFO cmsg to specify the output device, and this information is lost. Only do full rerouting if the routing key changed, and just do a new policy lookup with the old route if only the ports changed. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-22 12:30:29 -07:00
John Heffner	53cdcc04c1	[TCP]: Fix tcp_mem[] initialization. Change tcp_mem initialization function. The fraction of total memory is now a continuous function of memory size, and independent of page size. Signed-off-by: John Heffner <jheffner@psc.edu> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-16 15:04:03 -07:00
Robert Olsson	d5cc4a73a5	[IPV4]: Do not disable preemption in trie_leaf_remove(). Hello, Just discussed this Patrick... We have two users of trie_leaf_remove, fn_trie_flush and fn_trie_delete both are holding RTNL. So there shouldn't be need for this preempt stuff. This is assumed to a leftover from an older RCU-take. > Mhh .. I think I just remembered something - me incorrectly suggesting > to add it there while we were talking about this at OLS :) IIRC the > idea was to make sure tnode_free (which at that time didn't use > call_rcu) wouldn't free memory while still in use in a rcu read-side > critical section. It should have been synchronize_rcu of course, > but with tnode_free using call_rcu it seems to be completely > unnecessary. So I guess we can simply remove it. Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-16 15:00:07 -07:00
Geert Uytterhoeven	08882669e0	[IPV4]: Fix warning in ip_mc_rejoin_group. Kill warning about unused variable `in_dev' when CONFIG_IP_MULTICAST is not set. Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-12 17:02:37 -07:00
Paul Moore	38c8947c1b	[NetLabel]: parse the CIPSO ranged tag on incoming packets Commit `484b366932` added support for the CIPSO ranged categories tag. However, it appears that I made a mistake when rebasing then patch to the latest upstream sources for submission and dropped the part of the patch that actually parses the tag on incoming packets. This patch fixes this mistake by adding the required function call to the cipso_v4_skbuff_getattr() function. I've run this patch over the weekend and have not noticed any problems. Signed-off-by: Paul Moore <paul.moore@hp.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-12 14:38:02 -07:00
Evgeniy Polyakov	c4e38f41e3	[IPV4]: Fix rtm_to_ifaddr() error handling. Return negative error value (embedded in the pointer) instead of returning NULL. Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-09 13:43:24 -08:00
Herbert Xu	d644329bc9	[UDP]: Reread uh pointer after pskb_trim The header may have moved when trimming. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-07 16:08:04 -08:00
Linus Torvalds	5b3c1184e7	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [DCCP]: Set RTO for newly created child socket [DCCP]: Correctly split CCID half connections [NET]: Fix compat_sock_common_getsockopt typo. [NET]: Revert incorrect accept queue backlog changes. [INET]: twcal_jiffie should be unsigned long, not int [GIANFAR]: Fix compile error in latest git [PPPOE]: Use ifindex instead of device pointer in key lookups. [NETFILTER]: ip6_route_me_harder should take into account mark [NETFILTER]: nfnetlink_log: fix reference counting [NETFILTER]: nfnetlink_log: fix module reference counting [NETFILTER]: nfnetlink_log: fix possible NULL pointer dereference [NETFILTER]: nfnetlink_log: fix NULL pointer dereference [NETFILTER]: nfnetlink_log: fix use after free [NETFILTER]: nfnetlink_log: fix reference leak [NETFILTER]: tcp conntrack: accept SYN\|URG as valid [NETFILTER]: nf_conntrack/nf_nat: fix incorrect config ifdefs [NETFILTER]: conntrack: fix {nf,ip}_ct_iterate_cleanup endless loops	2007-03-06 19:53:34 -08:00
Jay Vosburgh	a816c7c712	bonding: Improve IGMP join processing In active-backup mode, the current bonding code duplicates IGMP traffic to all slaves, so that switches are up to date in case of a failover from an active to a backup interface. If bonding then fails back to the original active interface, it is likely that the "active slave" switch's IGMP forwarding for the port will be out of date until some event occurs to refresh the switch (e.g., a membership query). This patch alters the behavior of bonding to no longer flood IGMP to all ports, and to issue IGMP JOINs to the newly active port at the time of a failover. This insures that switches are kept up to date for all cases. "GOELLESCH Niels" <niels.goellesch@eurocontrol.int> originally reported this problem, and included a patch. His original patch was modified by Jay Vosburgh to additionally remove the existing IGMP flood behavior, use RCU, streamline code paths, fix trailing white space, and adjust for style. Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-03-06 06:08:11 -05:00
Patrick McHardy	d3ab4298aa	[NETFILTER]: tcp conntrack: accept SYN\|URG as valid Some stacks apparently send packets with SYN\|URG set. Linux accepts these packets, so TCP conntrack should to. Pointed out by Martijn Posthuma <posthuma@sangine.com>. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-05 13:25:20 -08:00
Patrick McHardy	e281db5cdf	[NETFILTER]: nf_conntrack/nf_nat: fix incorrect config ifdefs The nf_conntrack_netlink config option is named CONFIG_NF_CT_NETLINK, but multiple files use CONFIG_IP_NF_CONNTRACK_NETLINK or CONFIG_NF_CONNTRACK_NETLINK for ifdefs. Fix this and reformat all CONFIG_NF_CT_NETLINK ifdefs to only use a line. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-05 13:25:19 -08:00
Patrick McHardy	ec68e97ded	[NETFILTER]: conntrack: fix {nf,ip}_ct_iterate_cleanup endless loops Fix {nf,ip}_ct_iterate_cleanup unconfirmed list handling: - unconfirmed entries can not be killed manually, they are removed on confirmation or final destruction of the conntrack entry, which means we might iterate forever without making forward progress. This can happen in combination with the conntrack event cache, which holds a reference to the conntrack entry, which is only released when the packet makes it all the way through the stack or a different packet is handled. - taking references to an unconfirmed entry and using it outside the locked section doesn't work, the list entries are not refcounted and another CPU might already be waiting to destroy the entry What the code really wants to do is make sure the references of the hash table to the selected conntrack entries are released, so they will be destroyed once all references from skbs and the event cache are dropped. Since unconfirmed entries haven't even entered the hash yet, simply mark them as dying and skip confirmation based on that. Reported and tested by Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-05 13:25:18 -08:00
Paul Moore	c6387a8694	[NetLabel]: Verify sensitivity level has a valid CIPSO mapping The current CIPSO engine has a problem where it does not verify that the given sensitivity level has a valid CIPSO mapping when the "std" CIPSO DOI type is used. The end result is that bad packets are sent on the wire which should have never been sent in the first place. This patch corrects this problem by verifying the sensitivity level mapping similar to what is done with the category mapping. This patch also changes the returned error code in this case to -EPERM to better match what the category mapping verification code returns. Signed-off-by: Paul Moore <paul.moore@hp.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-02 20:37:36 -08:00
Arnaldo Carvalho de Melo	a9948a7e15	[TCP]: Fix minisock tcp_create_openreq_child() typo. On 2/28/07, KOVACS Krisztian <hidden@balabit.hu> wrote: > > Hi, > > While reading TCP minisock code I've found this suspiciously looking > code fragment: > > - 8< - > struct sock tcp_create_openreq_child(struct sock sk, struct request_sock req, struct sk_buff skb) > { > struct sock newsk = inet_csk_clone(sk, req, GFP_ATOMIC); > > if (newsk != NULL) { > const struct inet_request_sock ireq = inet_rsk(req); > struct tcp_request_sock treq = tcp_rsk(req); > struct inet_connection_sock newicsk = inet_csk(sk); > struct tcp_sock *newtp; > - 8< - > > The above code initializes newicsk to inet_csk(sk), isn't that supposed > to be inet_csk(newsk)? As far as I can tell this might leave > icsk_ack.last_seg_size zero even if we do have received data. Good catch! David, please apply the attached patch. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-28 11:05:56 -08:00
Bernhard Walle	aef8811abb	[XFRM]: Fix oops in xfrm4_dst_destroy() With 2.6.21-rc1, I get an oops when running 'ifdown eth0' and an IPsec connection is active. If I shut down the connection before running 'ifdown eth0', then there's no problem. The critical operation of this script is to kill dhcpd. The problem is probably caused by commit with git identifier `4337226228` (Linus tree) "[IPSEC]: IPv4 over IPv6 IPsec tunnel". This patch fixes that oops. I don't know the network code of the Linux kernel in deep, so if that fix is wrong, please change it. But please fix the oops. :) Signed-off-by: Bernhard Walle <bwalle@suse.de> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-26 12:10:32 -08:00
Arnaldo Carvalho de Melo	e4396b544f	[XFRM_TUNNEL]: Reload header pointer after pskb_may_pull/pskb_expand_head Please consider applying, this was found on your latest net-2.6 tree while playing around with that ip_hdr() + turn skb->nh/h/mac pointers as offsets on 64 bits idea :-) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-26 11:43:01 -08:00
Joe Perches	4c3ae4d7e7	[IPV4]: Use random32() in net/ipv4/multipath Removed local random number generator function Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-26 11:43:00 -08:00
Herbert Xu	8030f54499	[IPV4] devinet: Register inetdev earlier. This patch allocates inetdev at registration for all devices in line with IPv6. This allows sysctl configuration on the devices to occur before they're brought up or addresses are added. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2007-02-26 11:42:56 -08:00
Baruch Even	f4b9479dc5	[IPV4]: Correct links in net/ipv4/Kconfig Correct dead/indirect links in net/ipv4/Kconfig Signed-off-by: Baruch Even <baruch@ev-en.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-26 11:42:51 -08:00
David S. Miller	2c4f6219ac	[TCP]: Fix MD5 signature pool locking. The locking calls assumed that these code paths were only invoked in software interrupt context, but that isn't true. Therefore we need to use spin_{lock,unlock}_bh() throughout. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-26 11:42:48 -08:00
Adrian Bunk	936bb14ce9	correct a dead URL in the IP_MULTICAST help text Reported in kernel Bugzilla #6216. Signed-off-by: Adrian Bunk <bunk@stusta.de>	2007-02-17 19:49:13 +01:00
Robert P. J. Day	d08df601a3	Various typo fixes. Correct mis-spellings of "algorithm", "appear", "consistent" and (shame, shame) "kernel". Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2007-02-17 19:07:33 +01:00
Eric W. Biederman	3fbfa98112	[PATCH] sysctl: remove the proc_dir_entry member for the sysctl tables It isn't needed anymore, all of the users are gone, and all of the ctl_table initializers have been converted to use explicit names of the fields they are initializing. [akpm@osdl.org: NTFS fix] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-14 08:10:00 -08:00
Eric W. Biederman	0b4d414714	[PATCH] sysctl: remove insert_at_head from register_sysctl The semantic effect of insert_at_head is that it would allow new registered sysctl entries to override existing sysctl entries of the same name. Which is pain for caching and the proc interface never implemented. I have done an audit and discovered that none of the current users of register_sysctl care as (excpet for directories) they do not register duplicate sysctl entries. So this patch simply removes the support for overriding existing entries in the sys_sysctl interface since no one uses it or cares and it makes future enhancments harder. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: David Howells <dhowells@redhat.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Andi Kleen <ak@muc.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Corey Minyard <minyard@acm.org> Cc: Neil Brown <neilb@suse.de> Cc: "John W. Linville" <linville@tuxdriver.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Jan Kara <jack@ucw.cz> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: David Chinner <dgc@sgi.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-14 08:09:59 -08:00
Tim Schmielau	cd354f1ae7	[PATCH] remove many unneeded #includes of sched.h After Al Viro (finally) succeeded in removing the sched.h #include in module.h recently, it makes sense again to remove other superfluous sched.h includes. There are quite a lot of files which include it but don't actually need anything defined in there. Presumably these includes were once needed for macros that used to live in sched.h, but moved to other header files in the course of cleaning it up. To ease the pain, this time I did not fiddle with any header files and only removed #includes from .c-files, which tend to cause less trouble. Compile tested against 2.6.20-rc2 and 2.6.20-rc2-mm2 (with offsets) on alpha, arm, i386, ia64, mips, powerpc, and x86_64 with allnoconfig, defconfig, allmodconfig, and allyesconfig as well as a few randconfigs on x86_64 and all configs in arch/arm/configs on arm. I also checked that no new warnings were introduced by the patch (actually, some warnings are removed that were emitted by unnecessarily included header files). Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-14 08:09:54 -08:00
Kazunori MIYAZAWA	c0d56408e3	[IPSEC]: Changing API of xfrm4_tunnel_register. This patch changes xfrm4_tunnel register and deregister interface to prepare for solving the conflict of device tunnels with inter address family IPsec tunnel. Signed-off-by: Kazunori MIYAZAWA <miyazawa@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-13 12:54:47 -08:00
Ilpo Järvinen	600ff0c24b	[TCP]: Prevent pseudo garbage in SYN's advertized window TCP may advertize up to 16-bits window in SYN packets (no window scaling allowed). At the same time, TCP may have rcv_wnd (32-bits) that does not fit to 16-bits without window scaling resulting in pseudo garbage into advertized window from the low-order bits of rcv_wnd. This can happen at least when mss <= (1<<wscale) (see tcp_select_initial_window). This patch fixes the handling of SYN advertized windows (compile tested only). In worst case (which is unlikely to occur though), the receiver advertized window could be just couple of bytes. I'm not sure that such situation would be handled very well at all by the receiver!? Fortunately, the situation normalizes after the first non-SYN ACK is received because it has the correct, scaled window. Alternatively, tcp_select_initial_window could be changed to prevent too large rcv_wnd in the first place. [ tcp_make_synack() has the same bug, and I've added a fix for that to this patch -DaveM ] Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-13 12:42:11 -08:00
Herbert Xu	bbf4a6bc8c	[NETFILTER]: Clear GSO bits for TCP reset packet The TCP reset packet is copied from the original. This includes all the GSO bits which do not apply to the new packet. So we should clear those bits. Spotted by Patrick McHardy. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-13 12:32:58 -08:00
Patrick McHardy	e2e01bfef8	[XFRM]: Fix IPv4 tunnel mode decapsulation with IPV6=n Add missing break when CONFIG_IPV6=n. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-12 20:27:10 -08:00
Stephen Hemminger	9121c77706	[TCP]: cleanup of htcp (resend) Minor non-invasive cleanups: * white space around operators and line wrapping * use const * use __read_mostly Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-12 13:34:03 -08:00
Stephen Hemminger	59758f4459	[TCP]: Use read mostly for CUBIC parameters. These module parameters should be in the read mostly area to avoid cache pollution. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-12 13:15:20 -08:00
Patrick McHardy	a3c941b08d	[NETFILTER]: Kconfig: improve dependency handling Instead of depending on internally needed options and letting users figure out what is needed, select them when needed: - IP_NF_IPTABLES, IP_NF_ARPTABLES and IP6_NF_IPTABLES select NETFILTER_XTABLES - NETFILTER_XT_TARGET_CONNMARK, NETFILTER_XT_MATCH_CONNMARK and IP_NF_TARGET_CLUSTERIP select NF_CONNTRACK_MARK - NETFILTER_XT_MATCH_CONNBYTES selects NF_CT_ACCT Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-12 11:15:02 -08:00
Patrick McHardy	982d9a9ce3	[NETFILTER]: nf_conntrack: properly use RCU for nf_conntrack_destroyed callback Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-12 11:14:11 -08:00
Patrick McHardy	6b48a7d08d	[NETFILTER]: ip_conntrack: properly use RCU for ip_conntrack_destroyed callback Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-12 11:13:58 -08:00
Patrick McHardy	abbaccda4c	[NETFILTER]: ip_conntrack: fix invalid conntrack statistics RCU assumption CONNTRACK_STAT_INC assumes rcu_read_lock in nf_hook_slow disables preemption as well, making it legal to use __get_cpu_var without disabling preemption manually. The assumption is not correct anymore with preemptable RCU, additionally we need to protect against softirqs when not holding ip_conntrack_lock. Add CONNTRACK_STAT_INC_ATOMIC macro, which disables local softirqs, and use where necessary. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-12 11:13:14 -08:00
Patrick McHardy	923f4902fe	[NETFILTER]: nf_conntrack: properly use RCU API for nf_ct_protos/nf_ct_l3protos arrays Replace preempt_{enable,disable} based RCU by proper use of the RCU API and add missing rcu_read_lock/rcu_read_unlock calls in all paths not obviously only used within packet process context (nfnetlink_conntrack). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-12 11:12:57 -08:00
Patrick McHardy	642d628b2c	[NETFILTER]: ip_conntrack: properly use RCU API for ip_ct_protos array Replace preempt_{enable,disable} based RCU by proper use of the RCU API and add missing rcu_read_lock/rcu_read_unlock calls in all paths not obviously only used within packet process context (nfnetlink_conntrack). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-12 11:12:40 -08:00
Patrick McHardy	e22a054869	[NETFILTER]: nf_nat: properly use RCU API for nf_nat_protos array Replace preempt_{enable,disable} based RCU by proper use of the RCU API and add missing rcu_read_lock/rcu_read_unlock calls in paths used outside of packet processing context (nfnetlink_conntrack). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-12 11:12:26 -08:00
Patrick McHardy	a441dfdbb2	[NETFILTER]: ip_nat: properly use RCU API for ip_nat_protos array Replace preempt_{enable,disable} based RCU by proper use of the RCU API and add missing rcu_read_lock/rcu_read_unlock calls in paths used outside of packet processing context (nfnetlink_conntrack). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-12 11:12:09 -08:00
Patrick McHardy	e92ad99c78	[NETFILTER]: nf_log: minor cleanups - rename nf_logging to nf_loggers since its an array of registered loggers - rename nf_log_unregister_logger() to nf_log_unregister() to make it symetrical to nf_log_register() and convert all users Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-12 11:11:55 -08:00
Patrick McHardy	c3a47ab3e5	[NETFILTER]: Properly use RCU in nf_ct_attach Use rcu_assign_pointer/rcu_dereference for ip_ct_attach pointer instead of self-made RCU and use rcu_read_lock to make sure the conntrack module doesn't disappear below us while calling it, since this function can be called from outside the netfilter hooks. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-12 11:09:19 -08:00
Arjan van de Ven	9a32144e9d	[PATCH] mark struct file_operations const 7 Many struct file_operations in the kernel can be "const". Marking them const moves these to the .rodata section, which avoids false sharing with potential dirty data. In addition it'll catch accidental writes at compile time to these shared resources. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:46 -08:00
Linus Torvalds	cb18eccff4	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (45 commits) [IPV4]: Restore multipath routing after rt_next changes. [XFRM] IPV6: Fix outbound RO transformation which is broken by IPsec tunnel patch. [NET]: Reorder fields of struct dst_entry [DECNET]: Convert decnet route to use the new dst_entry 'next' pointer [IPV6]: Convert ipv6 route to use the new dst_entry 'next' pointer [IPV4]: Convert ipv4 route to use the new dst_entry 'next' pointer [NET]: Introduce union in struct dst_entry to hold 'next' pointer [DECNET]: fix misannotation of linkinfo_dn [DECNET]: FRA_{DST,SRC} are le16 for decnet [UDP]: UDP can use sk_hash to speedup lookups [NET]: Fix whitespace errors. [NET] XFRM: Fix whitespace errors. [NET] X25: Fix whitespace errors. [NET] WANROUTER: Fix whitespace errors. [NET] UNIX: Fix whitespace errors. [NET] TIPC: Fix whitespace errors. [NET] SUNRPC: Fix whitespace errors. [NET] SCTP: Fix whitespace errors. [NET] SCHED: Fix whitespace errors. [NET] RXRPC: Fix whitespace errors. ...	2007-02-11 11:38:13 -08:00
Robert P. J. Day	c376222960	[PATCH] Transform kmem_cache_alloc()+memset(0) -> kmem_cache_zalloc(). Replace appropriate pairs of "kmem_cache_alloc()" + "memset(0)" with the corresponding "kmem_cache_zalloc()" call. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: Roland McGrath <roland@redhat.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Greg KH <greg@kroah.com> Acked-by: Joel Becker <Joel.Becker@oracle.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Jan Kara <jack@ucw.cz> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:27 -08:00
Eric Dumazet	5ef213f684	[IPV4]: Restore multipath routing after rt_next changes. I forgot to test build this part of the networking code... Sorry guys. This patch renames u.rt_next to u.dst.rt_next Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-10 23:20:50 -08:00
Eric Dumazet	093c2ca416	[IPV4]: Convert ipv4 route to use the new dst_entry 'next' pointer This patch removes the rt_next pointer from 'struct rtable.u' union, and renames u.rt_next to u.dst_rt_next. It also moves 'struct flowi' right after 'struct dst_entry' to prepare the gain on lookups. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-10 23:20:38 -08:00
Eric Dumazet	95f30b336b	[UDP]: UDP can use sk_hash to speedup lookups In a prior patch, I introduced a sk_hash field (__sk_common.skc_hash) to let tcp lookups use one cache line per unmatched entry instead of two. We can also use sk_hash to speedup UDP part as well. We store in sk_hash the hnum value, and use sk->sk_hash (same cache line than 'next' pointer), instead of inet->num (different cache line) Note : We still have a false sharing problem for SMP machines, because sock_hold(sock) dirties the cache line containing the 'next' pointer. Not counting the udp_hash_lock rwlock. (did someone mentioned RCU ? :) ) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-10 23:20:29 -08:00
YOSHIFUJI Hideaki	e905a9edab	[NET] IPV4: Fix whitespace errors. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-10 23:19:39 -08:00
Eric Dumazet	dbca9b2750	[NET]: change layout of ehash table ehash table layout is currently this one : First half of this table is used by sockets not in TIME_WAIT state Second half of it is used by sockets in TIME_WAIT state. This is non optimal because of for a given hash or socket, the two chain heads are located in separate cache lines. Moreover the locks of the second half are never used. If instead of this halving, we use two list heads in inet_ehash_bucket instead of only one, we probably can avoid one cache miss, and reduce ram usage, particularly if sizeof(rwlock_t) is big (various CONFIG_DEBUG_SPINLOCK, CONFIG_DEBUG_LOCK_ALLOC settings). So we still halves the table but we keep together related chains to speedup lookups and socket state change. In this patch I did not try to align struct inet_ehash_bucket, but a future patch could try to make this structure have a convenient size (a power of two or a multiple of L1_CACHE_SIZE). I guess rwlock will just vanish as soon as RCU is plugged into ehash :) , so maybe we dont need to scratch our heads to align the bucket... Note : In case struct inet_ehash_bucket is not a power of two, we could probably change alloc_large_system_hash() (in case it use __get_free_pages()) to free the unused space. It currently allocates a big zone, but the last quarter of it could be freed. Again, this should be a temporary 'problem'. Patch tested on ipv4 tcp only, but should be OK for IPV6 and DCCP. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 14:16:46 -08:00
Jan Engelhardt	e60a13e030	[NETFILTER]: {ip,ip6}_tables: use struct xt_table instead of redefined structure names Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:20 -08:00
Jan Engelhardt	6709dbbb19	[NETFILTER]: {ip,ip6}_tables: remove x_tables wrapper functions Use the x_tables functions directly to make it better visible which parts are shared between ip_tables and ip6_tables. Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:19 -08:00
Jan Engelhardt	e1fd0586b0	[NETFILTER]: x_tables: fix return values for LOG/ULOG Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:18 -08:00
Eric Leblond	41f4689a7c	[NETFILTER]: NAT: optional source port randomization support This patch adds support to NAT to randomize source ports. Signed-off-by: Eric Leblond <eric@inl.fr> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:17 -08:00
Patrick McHardy	cdd289a2f8	[NETFILTER]: add IPv6-capable TCPMSS target Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:16 -08:00
Patrick McHardy	a8d0f9526f	[NET]: Add UDPLITE support in a few missing spots Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:14 -08:00
Patrick McHardy	efbc597634	[NETFILTER]: nf_nat: remove broken HOOKNAME macro Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:12 -08:00
Patrick McHardy	a09113c2c8	[NETFILTER]: tcp conntrack: do liberal tracking for picked up connections Do liberal tracking (only RSTs need to be in-window) for connections picked up without seeing a SYN to deal with window scaling. Also change logging of invalid packets not to log packets accepted by liberal tracking to avoid spamming the logs. Based on suggestion from James Ralston <ralston@pobox.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:10 -08:00
Stephen Hemminger	22f8cde5bc	[NET]: unregister_netdevice as void There was no real useful information from the unregister_netdevice() return code, the only error occurred in a situation that was a driver bug. So change it to a void function. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:06 -08:00
Alexey Dobriyan	cc63f70b8b	[IPV4/IPV6] multicast: Check add_grhead() return value add_grhead() allocates memory with GFP_ATOMIC and in at least two places skb from it passed to skb_put() without checking. Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:04 -08:00
David S. Miller	f2f2102d1a	[XFRM]: Fix missed error setting in xfrm4_policy.c When we can't find the afinfo we should return EAFNOSUPPORT. GCC warned about the uninitialized 'err' for this path as well. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:03 -08:00
Miika Komu	4337226228	[IPSEC]: IPv4 over IPv6 IPsec tunnel This is the patch to support IPv4 over IPv6 IPsec. Signed-off-by: Miika Komu <miika@iki.fi> Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi> Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:02 -08:00
Miika Komu	c82f963efe	[IPSEC]: IPv6 over IPv4 IPsec tunnel This is the patch to support IPv6 over IPv4 IPsec Signed-off-by: Miika Komu <miika@iki.fi> Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi> Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:01 -08:00
Miika Komu	cdca72652a	[IPSEC]: exporting xfrm_state_afinfo This patch exports xfrm_state_afinfo. Signed-off-by: Miika Komu <miika@iki.fi> Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi> Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:00 -08:00
John Heffner	104439a887	[TCP]: Don't apply FIN exception to full TSO segments. Signed-off-by: John Heffner <jheffner@psc.edu> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:38:51 -08:00
Baruch Even	8a3c3a9727	[TCP]: Check num sacks in SACK fast path We clear the unused parts of the SACK cache, This prevents us from mistakenly taking the cache data if the old data in the SACK cache is the same as the data in the SACK block. This assumes that we never receive an empty SACK block with start and end both at zero. Signed-off-by: Baruch Even <baruch@ev-en.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:38:50 -08:00
Baruch Even	6f74651ae6	[TCP]: Seperate DSACK from SACK fast path Move DSACK code outside the SACK fast-path checking code. If the DSACK determined that the information was too old we stayed with a partial cache copied. Most likely this matters very little since the next packet will not be DSACK and we will find it in the cache. but it's still not good form and there is little reason to couple the two checks. Since the SACK receive cache doesn't need the data to be in host order we also remove the ntohl in the checking loop. Signed-off-by: Baruch Even <baruch@ev-en.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:38:49 -08:00
Baruch Even	fda03fbb56	[TCP]: Advance fast path pointer for first block only Only advance the SACK fast-path pointer for the first block, the fast-path assumes that only the first block advances next time so we should not move the cached skb for the next sack blocks. Signed-off-by: Baruch Even <baruch@ev-en.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:38:48 -08:00
David S. Miller	8eb9086f21	[IPV4/IPV6]: Always wait for IPSEC SA resolution in socket contexts. Do this even for non-blocking sockets. This avoids the silly -EAGAIN that applications can see now, even for non-blocking sockets in some cases (f.e. connect()). With help from Venkat Tekkirala. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:38:45 -08:00
Frederik Deweerdt	ba7808eac1	[TCP]: remove tcp header from tcp_v4_check (take #2 ) The tcphdr struct passed to tcp_v4_check is not used, the following patch removes it from the parameter list. This adds the netfilter modifications missing in the patch I sent for rc3-mm1. Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:38:44 -08:00
Patrick McHardy	26932566a4	[NETLINK]: Don't BUG on undersized allocations Currently netlink users BUG when the allocated skb for an event notification is undersized. While this is certainly a kernel bug, its not critical and crashing the kernel is too drastic, especially when considering that these errors have appeared multiple times in the past and it BUGs even if no listeners are present. This patch replaces BUG by WARN_ON and changes the notification functions to inform potential listeners of undersized allocations using a unique error code (EMSGSIZE). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:38:41 -08:00
Patrick McHardy	40e0cb004a	[NETFILTER]: ctnetlink: fix compile failure with NF_CONNTRACK_MARK=n CC net/netfilter/nf_conntrack_netlink.o net/netfilter/nf_conntrack_netlink.c: In function 'ctnetlink_conntrack_event': net/netfilter/nf_conntrack_netlink.c:392: error: 'struct nf_conn' has no member named 'mark' make[3]: *** [net/netfilter/nf_conntrack_netlink.o] Error 1 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-02 19:33:11 -08:00
Patrick McHardy	adcb471110	[NETFILTER]: SIP conntrack: fix out of bounds memory access When checking for an @-sign in skp_epaddr_len, make sure not to run over the packet boundaries. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-01-30 14:25:24 -08:00
Lars Immisch	7da5bfbb12	[NETFILTER]: SIP conntrack: fix skipping over user info in SIP headers When trying to skip over the username in the Contact header, stop at the end of the line if no @ is found to avoid mangling following headers. We don't need to worry about continuation lines because we search inside a SIP URI. Fixes Netfilter Bugzilla #532. Signed-off-by: Lars Immisch <lars@ibp.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-01-30 14:24:57 -08:00
Robert Olsson	095b8501e4	[IPV4]: Fix single-entry /proc/net/fib_trie output. When main table is just a single leaf this gets printed as belonging to the local table in /proc/net/fib_trie. A fix is below. Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Acked-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-01-26 19:06:01 -08:00
Patrick McHardy	a46bf7d5a8	[NETFILTER]: nf_nat_pptp: fix expectation removal When removing the expectation for the opposite direction, the PPTP NAT helper initializes the tuple for lookup with the addresses of the opposite direction, which makes the lookup fail. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-01-26 01:07:30 -08:00
Patrick McHardy	c72c6b2a29	[NETFILTER]: nf_nat: fix ICMP translation with statically linked conntrack When nf_nat/nf_conntrack_ipv4 are linked statically, nf_nat is initialized before nf_conntrack_ipv4, which makes the nf_ct_l3proto_find_get(AF_INET) call during nf_nat initialization return the generic l3proto instead of the AF_INET specific one. This breaks ICMP error translation since the generic protocol always initializes the IPs in the tuple to 0. Change the linking order and put nf_conntrack_ipv4 first. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-01-26 01:06:47 -08:00
David S. Miller	e89862f4c5	[TCP]: Restore SKB socket owner setting in tcp_transmit_skb(). Revert `931731123a` We can't elide the skb_set_owner_w() here because things like certain netfilter targets (such as owner MATCH) need a socket to be set on the SKB for correct operation. Thanks to Jan Engelhardt and other netfilter list members for pointing this out. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-01-26 01:04:55 -08:00

1 2 3 4 5 ...

1317 Commits (6b2bedc3a659ba228a93afc8e3f008e152abf18a)