q3k/linux

Author	SHA1	Message	Date
Matt LaPlante	44c09201a4	more misc typo fixes Signed-off-by: Adrian Bunk <bunk@stusta.de>	2006-10-03 22:34:14 +02:00
Serge E. Hallyn	96b644bdec	[PATCH] namespaces: utsname: use init_utsname when appropriate In some places, particularly drivers and __init code, the init utsns is the appropriate one to use. This patch replaces those with the init_utsname helper. Changes: Removed several uses of init_utsname(). Hope I picked all the right ones in net/ipv4/ipconfig.c. These are now changed to utsname() (the per-process namespace utsname) in the previous patch (2/7) [akpm@osdl.org: CIFS fix] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Andrey Savochkin <saw@sw.ru> Cc: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-02 07:57:21 -07:00
Serge E. Hallyn	e9ff3990f0	[PATCH] namespaces: utsname: switch to using uts namespaces Replace references to system_utsname with the per-process uts namespace where appropriate. This includes things like uname. Changes: Per Eric Biederman's comments, use the per-process uts namespace for ELF_PLATFORM, sunrpc, and parts of net/ipv4/ipconfig.c [jdike@addtoit.com: UML fix] [clg@fr.ibm.com: cleanup] [akpm@osdl.org: build fix] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Andrey Savochkin <saw@sw.ru> Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-02 07:57:21 -07:00
Ananth N Mavinakayanahalli	3a872d89ba	[PATCH] Kprobes: Make kprobe modules more portable In an effort to make kprobe modules more portable, here is a patch that: o Introduces the "symbol_name" field to struct kprobe. The symbol->address resolution now happens in the kernel in an architecture agnostic manner. 64-bit powerpc users no longer have to specify the ".symbols" o Introduces the "offset" field to struct kprobe to allow a user to specify an offset into a symbol. o The legacy mechanism of specifying the kprobe.addr is still supported. However, if both the kprobe.addr and kprobe.symbol_name are specified, probe registration fails with an -EINVAL. o The symbol resolution code uses kallsyms_lookup_name(). So CONFIG_KPROBES now depends on CONFIG_KALLSYMS o Apparently kprobe modules were the only legitimate out-of-tree user of the kallsyms_lookup_name() EXPORT. Now that the symbol resolution happens in-kernel, remove the EXPORT as suggested by Christoph Hellwig o Modify tcp_probe.c that uses the kprobe interface so as to make it work on multiple platforms (in its earlier form, the code wouldn't work, say, on powerpc) Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-02 07:57:16 -07:00
Peter Zijlstra	6e9a4738c9	[PATCH] completions: lockdep annotate on stack completions All on stack DECLARE_COMPLETIONs should be replaced by: DECLARE_COMPLETION_ONSTACK Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-01 00:39:24 -07:00
Paul Moore	95d4e6be25	[NetLabel]: audit fixups due to delayed feedback Fix some issues Steve Grubb had with the way NetLabel was using the audit subsystem. This should make NetLabel more consistent with other kernel generated audit messages specifying configuration changes. Signed-off-by: Paul Moore <paul.moore@hp.com> Acked-by: Steve Grubb <sgrubb@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-29 17:05:05 -07:00
Paul Moore	32f50cdee6	[NetLabel]: add audit support for configuration changes This patch adds audit support to NetLabel, including six new audit message types shown below. #define AUDIT_MAC_UNLBL_ACCEPT 1406 #define AUDIT_MAC_UNLBL_DENY 1407 #define AUDIT_MAC_CIPSOV4_ADD 1408 #define AUDIT_MAC_CIPSOV4_DEL 1409 #define AUDIT_MAC_MAP_ADD 1410 #define AUDIT_MAC_MAP_DEL 1411 Signed-off-by: Paul Moore <paul.moore@hp.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:03:09 -07:00
John Heffner	8ea333eb5d	[TCP]: Fix and simplify microsecond rtt sampling This changes the microsecond RTT sampling so that samples are taken in the same way that RTT samples are taken for the RTO calculator: on the last segment acknowledged, and only when the segment hasn't been retransmitted. Signed-off-by: John Heffner <jheffner@psc.edu> Acked-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:03:08 -07:00
Wong Hoi Sing Edison	bfbea8a886	[TCP] tcp-lp: prevent chance for oops This patch fixes the chance for tcp_lp_remote_hz_estimator to return 0 if 0 < rhz < 64. It also makes sure the flag LP_VALID_RHZ is set correctly. Signed-off-by: Wong Hoi Sing Edison <hswong3i@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:03:07 -07:00
Al Viro	96d2ca4ec0	[IPVS] bug: endianness breakage in ip_vs_ftp (p[3]<<24) | (p[2]<<16) | (p[1]<<8) | p[0] is not a valid way to spell get_unaligned((__be32 *)p) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:03:05 -07:00
Al Viro	014d730d56	[IPVS]: ipvs annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:03:04 -07:00
Al Viro	d4263cde88	[NETFILTER]: h323 annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:03:03 -07:00
Al Viro	6a19d61472	[NETFILTER]: ipt annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:03:02 -07:00
Al Viro	a76b11dd25	[NETFILTER]: NAT annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:03:01 -07:00
Al Viro	cdcb71bf96	[NETFILTER]: conntrack annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:03:00 -07:00
Al Viro	59b8bfd8fd	[NETFILTER]: netfilter misc annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:59 -07:00
Simon Horman	28b06c380f	[IPVS]: Make sure ip_vs_ftp ports are valid: module_param_array approach I'm not entirely sure what happens in the case of an invalid port; at best it'll be silently ignored. This patch ensures that the port values are unsigned short values, and thus always valid. This is a second take at fixing this problem, it is simpler and arguably more correct than the previous approach that was committed as `3f5af5b353`. Prior to this patch a patch that reverses `3f5af5b353` was sent. Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:56 -07:00
Simon Horman	e44fd82caf	[IPVS]: Reverse valid ip_vs_ftp ports fix: port check approach This patch reverses `3f5af5b353` as a better fix was suggested by Patrick McHardy. Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:55 -07:00
Al Viro	4324a17430	[XFRM]: fl_ipsec_spi is net-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:43 -07:00
Al Viro	6067b2baba	[XFRM]: xfrm_parse_spi() annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:39 -07:00
Al Viro	a94cfd1974	[XFRM]: xfrm_state_lookup() annotations spi argument of xfrm_state_lookup() is net-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:37 -07:00
Al Viro	8f83f23e6d	[XFRM]: ports in struct xfrm_selector annotated Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:33 -07:00
Al Viro	9f8552996d	[IPV4]: inet_diag annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:29 -07:00
Al Viro	82103232ed	[IPV4]: inet_rcv_saddr() annotations inet_rcv_saddr() returns net-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:28 -07:00
Al Viro	23f33c2d4f	[IPV4]: struct inet_timewait_sock annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:27 -07:00
Al Viro	fb99c848e5	[IPV4]: annotate inet_lookup() and friends inet_lookup() annotated along with helper functions (__inet_lookup(), __inet_lookup_established(), inet_lookup_established(), inet_lookup_listener(), __inet_lookup_listener() and inet_ehashfn()) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:26 -07:00
Al Viro	4f765d842f	[IPV4]: INET_MATCH() annotations INET_MATCH() and friends depend on an interesting set of kludges: * there's a pair of adjacent fields in struct inet_sock - __be16 dport followed by __u16 num. We want to search by pair, so we combine the keys into a single 32bit value and compare with 32bit value read from &...->dport. * on 64bit targets we combine comparisons with pair of adjacent __be32 fields in the same way. Make sure that we don't mix those values with anything else and that pairs we form them from have correct types. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:25 -07:00
Al Viro	45d60b9e29	[IPV4]: FRA_{DST,SRC} annotated use be32 netlink accessors for those Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:24 -07:00
Al Viro	81f7bf6cba	[IPV4]: net/ipv4/fib annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:23 -07:00
Al Viro	114c7844f3	[IPV4]: mroute annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:22 -07:00
Al Viro	df7a3b07c2	[TCP] net/ipv4/tcp_output.c: trivial annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:20 -07:00
Al Viro	b03d73e30c	[IPV4] net/ipv4/icmp.c: trivial annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:19 -07:00
Al Viro	734ab87f63	[UDP] net/ipv4/udp.c: trivial annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:17 -07:00
Al Viro	6b72977bd6	[IPV4]: inet_csk_search_req() annotations rport argument is net-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:15 -07:00
Al Viro	ed9bad06ee	[IPV4] net/ipv4/arp.c: trivial annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:14 -07:00
Al Viro	4f3608b787	[TCP] net/ipv4/tcp_input.c: trivial annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:11 -07:00
Al Viro	35986b329f	[IPV4]: ip_icmp_error() annotations port is net-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:09 -07:00
Al Viro	0579016ec4	[IPV4]: ip_local_error() annotations port argument is net-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:08 -07:00
Al Viro	269bd27e66	[TCP]: struct tcp_sack_block annotations Some of the instances of tcp_sack_block are host-endian, some - net-endian. Define struct tcp_sack_block_wire identical to struct tcp_sack_block with u32 replaced with __be32; annotate uses of tcp_sack_block replacing net-endian ones with tcp_sack_block_wire. Change is obviously safe since for cc(1) __be32 is typedefed to u32. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:04 -07:00
Al Viro	63007727e0	[IPV4]: trivial igmp annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:02 -07:00
Al Viro	c0cda068aa	[IPV4]: ip_mc_sf_allow() annotated ip_mc_sf_allow() expects addresses to be passed net-endian. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:01 -07:00
Al Viro	ea4d9e7220	[IPV4]: struct ip_sf_list and struct ip_sf_socklist annotated Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:00 -07:00
Al Viro	8f935bbd7c	[IPV4]: ip_mc_{inc,dec}_group() annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:58 -07:00
Al Viro	4b06a7cf2f	[IPV4]: ip_local_error() ipv4 address argument annotated daddr is net-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:56 -07:00
Al Viro	e25d2ca6b2	[IPV4]: trivial ip_options.c annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:55 -07:00
Al Viro	c1d18f9fa0	[IPV4]: struct ipcm_cookie annotation ->addr is net-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:54 -07:00
Al Viro	3ca3c68e76	[IPV4]: struct ip_options annotations ->faddr is net-endian; annotated as such, variables inferred to be net-endian annotated. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:53 -07:00
Al Viro	7f25afbbef	[IPV4]: inet_csk_search_req() (partial) annotations raddr is net-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:52 -07:00
Al Viro	adaf345b53	[IPV4]: annotate address in inet_request_sock Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:51 -07:00
Olaf Kirch	321efff7c3	[IPV4]: Fix order in inet_init failure path. This is just a minor buglet I came across by accident - when inet_init fails to register raw_prot, it jumps to out_unregister_udp_proto which should unregister UDP _and_ TCP. Signed-off-by: Olaf Kirch <okir@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:48 -07:00
Al Viro	13d8eaa06a	[IPV4]: ip_build_and_send_pkt() annotations saddr and daddr are net-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:19 -07:00
Al Viro	8712f774dc	[IPV4]: ip_options_build() annotations daddr is net-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:18 -07:00
Al Viro	e8192f367c	[IPV4] bug: broken open-coded inet_make_mask() (multipath_wrandom) multipath_wrandom.c::__multipath_lookup_weight() contains open-coded attempt at inet_make_mask(); broken on big-endian. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:17 -07:00
Al Viro	f20f4a60d7	[IPV4] multipath_wrandom.c: trivial annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:16 -07:00
Al Viro	d9cd66e0e5	[IPV4]: multipath_set_nhinfo() annotations multipath_set_nhinfo() (and underlying callback) take net-endian network and netmask. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:15 -07:00
Al Viro	32ab5f8033	[IPV4] fib_trie.c: trivial annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:14 -07:00
Al Viro	1e8aa6f125	[IPV4] bug: open-coded inet_make_mask() in fib_semantic_match() is broken ... and works only on little-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:13 -07:00
Al Viro	1ef1b8c85b	[IPV4]: fib_semantic_match() annotations 'mask' and 'zone' arguments are net-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:12 -07:00
Al Viro	b6e80c6c8b	[IPV4]: trivial fib_hash.c annotations hash key and stored netmask are net-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:11 -07:00
Al Viro	182777700d	[IPV4]: ip_fragment.c endianness annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:09 -07:00
Al Viro	53576d9b99	[IPV4]: inetpeer annotations This one is interesting - we use net-endian value as search key, but order the tree by host-endian comparisons of keys. OK since we only care about lookups. Annotated inet_getpeer() and friends. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:09 -07:00
Al Viro	d878e72e41	[IPV4]: ip_fib_check_default() annotated Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:08 -07:00
Al Viro	fd68322209	[IPV4]: inet_addr_type() annotations argument and inferred net-endian variables in callers annotated. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:07 -07:00
Al Viro	e4883014f4	[IPV4]: icmp_send() annotation The last argument is network-endian (it will go straight into the packet). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:06 -07:00
Al Viro	60cad5da57	[IPV4]: annotate inetdev.h helpers inet_confirm_addr(), inet_ifa_byprefix(), ip_dev_find(), inet_make_mask() and inet_ifa_match() annotated, along with inferred net-endian variables Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:05 -07:00
Al Viro	a7a628c442	[IPV4]: IFA_{LOCAL,ADDRESS,BROADCAST,ANYCAST} on ipv4 annotated use be32 netlink accessors Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:04 -07:00
Al Viro	a144ea4b7a	[IPV4]: annotate struct in_ifaddr ifa_local, ifa_address, ifa_mask, ifa_broadcast and ifa_anycast are net-endian. Annotated them and variables that are inferred to be net-endian. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:00:55 -07:00
Al Viro	6d85c10abe	[IPV4]: struct fib_config IPv4 address fields annotated Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 17:54:17 -07:00
Al Viro	17fb2c6439	[IPV4]: RTA_{DST,SRC,GATEWAY,PREFSRC} annotated these are passed net-endian; use be32 netlink accessors Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 17:54:16 -07:00
Al Viro	e448515c79	[IPV4] net/ipv4/route.c: trivial endianness annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 17:54:15 -07:00
Al Viro	b83738ae00	[IPV4]: FIB_RES_PREFSRC() annotated Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 17:54:13 -07:00
Al Viro	ff428d72c5	[IPV4]: inet_addr_onlink() annotated Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 17:54:11 -07:00
Al Viro	a60c4923da	[IPV4]: ip_check_mc() annotations annotated arguments Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 17:54:10 -07:00
Al Viro	d9c9df8c93	[IPV4]: fib_validate_source() annotations annotated arguments and inferred net-endian variables in callers Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 17:54:09 -07:00
Al Viro	a61ced5d1c	[IPV4]: inet_select_addr() annotations argument and return value are net-endian. Annotated function and inferred net-endian variables in callers. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 17:54:08 -07:00
Al Viro	bada8adc4e	[IPV4]: ip_route_connect() ipv4 address arguments annotated annotated address arguments (port number left alone for now); ditto for inferred net-endian variables in callers. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 17:54:06 -07:00
Al Viro	8c7bc84085	[IPV4]: annotate rt_hash_code() users Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 17:54:04 -07:00
Al Viro	f7655229c0	[IPV4]: ip_rt_redirect() annotations The first 4 arguments of ip_rt_redirect() are net-endian. Annotated. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 17:54:03 -07:00
Al Viro	9e12bb22e3	[IPV4]: ip_route_input() annotations ip_route_input() takes net-endian source and destination address. * Annotated as such. * arguments of its invocations annotated where needed. * local helpers getting the same values passed to by it (ip_route_input_mc(), ip_route_input_slow(), ip_handle_martian_source(), ip_mkroute_input(), ip_mkroute_input_def(), __mkroute_input()) annotated Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 17:54:02 -07:00
Paul Moore	fcd4828064	[NetLabel]: rework the Netlink attribute handling (part 1) At the suggestion of Thomas Graf, rewrite NetLabel's use of Netlink attributes to better follow the common Netlink attribute usage. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-25 15:56:09 -07:00
Paul Moore	609c92feea	[NetLabel]: make the CIPSOv4 cache spinlocks bottom half safe The CIPSOv4 cache traversal routines are triggered by both userspace events (cache invalidation due to DOI removal or updated SELinux policy) and network packet processing events. As a result there is a problem with the existing CIPSOv4 cache spinlocks as they are not bottom-half/softirq safe. This patch converts the CIPSOv4 cache spin_[un]lock() calls into spin_[un]lock_bh() calls to address this problem. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-25 15:52:37 -07:00
Paul Moore	14a72f53fb	[NetLabel]: correct improper handling of non-NetLabel peer contexts Fix a problem where NetLabel would always set the value of sk_security_struct->peer_sid in selinux_netlbl_sock_graft() to the context of the socket, causing problems when users would query the context of the connection. This patch fixes this so that the value in sk_security_struct->peer_sid is only set when the connection is NetLabel based, otherwise the value is untouched. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-25 15:52:01 -07:00
Stephen Hemminger	597811ec16	[TCP]: make cubic the default Change default congestion control used from BIC to the newer CUBIC, which is the successor to BIC but has better properties over long delay links. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-24 20:13:03 -07:00
Stephen Hemminger	3d2573f7eb	[TCP]: default congestion control menu Change how default TCP congestion control is chosen. Don't just use last installed module, instead allow selection during configuration, and make sure and use the default regardless of load order. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-24 20:11:58 -07:00
Al Viro	3e597c6045	[PATCH] fix iptables __user misannotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-24 15:55:03 -07:00
Patrick McHardy	4c5de695cf	[NETFILTER]: PPTP conntrack: fix another GRE keymap leak When the master PPTP connection times out while still having unfulfilled expectations (and a GRE keymap entry) associated with it, the keymap entry is not destroyed. Add a destroy callback to struct ip_conntrack_helper and use it to destroy PPTP siblings when the master is destroyed. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:20:20 -07:00
Patrick McHardy	fd5e3befa4	[NETFILTER]: PPTP conntrack: fix GRE keymap leak When destroying the GRE expectations without having seen the GRE connection the keymap entry is not freed, leading to a memory leak and, in case of a following call within the same session, failure during expectation setup. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:20:19 -07:00
Patrick McHardy	62fbe9c82b	[NETFILTER]: PPTP conntrack: fix PPTP_IN_CALL message types Fix incorrectly used message types and call IDs: - PPTP_IN_CALL_REQUEST (PAC->PNS) contains a PptpInCallRequest (icreq) message and the PAC call ID - PPTP_IN_CALL_REPLY (PNS->PAC) contains a PptpInCallReply (icack) message and the PNS call ID Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:20:18 -07:00
Patrick McHardy	750a584233	[NETFILTER]: PPTP conntrack: check call ID before changing state For rejected calls the state is set to PPTP_CALL_NONE even for non-matching call ids. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:20:17 -07:00
Patrick McHardy	87a0117afd	[NETFILTER]: PPTP conntrack: clean up debugging cruft Also make sure not to hand packets received in an invalid state to the NAT helper since it will mangle the packet with invalid data. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:20:16 -07:00
Patrick McHardy	4c651756d5	[NETFILTER]: PPTP conntrack: consolidate header parsing Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:20:15 -07:00
Patrick McHardy	a1073406a1	[NETFILTER]: PPTP conntrack: consolidate header size checks Also make sure not to pass undersized messages to the NAT helper. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:20:14 -07:00
Patrick McHardy	cf9f81523e	[NETFILTER]: PPTP conntrack: simplify expectation handling Remove duplicated expectation handling in the NAT helper and simplify the remains in the conntrack helper. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:20:13 -07:00
Patrick McHardy	857c06da2b	[NETFILTER]: PPTP conntrack: remove unnecessary cid/pcid header pointers Just the values are needed, not the memory locations. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:20:12 -07:00
Patrick McHardy	6013c0a13e	[NETFILTER]: PPTP conntrack: fix header definitions Fix a few header definitions to match RFC2637. Most importantly the PptpOutCallRequest header included an invalid padding field and a size check was disabled because of this. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:20:11 -07:00
Patrick McHardy	5256f663a0	[NETFILTER]: PPTP conntrack: remove more dead code The calculated sequence numbers are not used for anything. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:20:10 -07:00
Patrick McHardy	a1ad1deed5	[NETFILTER]: PPTP conntrack: remove dead code The call ID in reply packets is never changed, remove the code. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:20:09 -07:00
Patrick McHardy	955b944293	[NETFILTER]: PPTP conntrack: get rid of unnecessary byte order conversions The conntrack structure contains the call ID in host byte order for no reason, get rid of back and forth conversions. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:20:08 -07:00
Patrick McHardy	edd5a329cf	[NETFILTER]: PPTP conntrack: fix whitespace errors Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:20:07 -07:00
Patrick McHardy	127f15dd65	[NETFILTER]: ipt_hashlimit: add compat conversion functions Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:20:06 -07:00
Patrick McHardy	9fa492cdc1	[NETFILTER]: x_tables: simplify compat API Split the xt_compat_match/xt_compat_target into smaller type-safe functions performing just one operation. Handle all alignment and size-related conversions centrally in these function instead of requiring each module to implement a full-blown conversion function. Replace ->compat callback by ->compat_from_user and ->compat_to_user callbacks, responsible for converting just a single private structure. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:20:01 -07:00
Patrick McHardy	79030ed07d	[NETFILTER]: ip_tables: revision support for compat code Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:20:00 -07:00
Patrick McHardy	bec71b1627	[NETFILTER]: ip_tables: fix module refcount leaks in compat error paths Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:59 -07:00
Brian Haley	1192e403e9	[NETFILTER]: make some netfilter globals __read_mostly Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:58 -07:00
George Hansper	c1fe3ca510	[NETFILTER]: TCP conntrack: improve dead connection detection Don't count window updates as retransmissions. Signed-off-by: George Hansper <georgeh@anstat.com.au> Signed-off-by: Patrick McHardy <kaber@trash.net>	2006-09-22 15:19:57 -07:00
Patrick McHardy	ca39df6cdf	[NETFILTER]: ipt_TTL: fix checksum update bug Fix regression introduced by the incremental checksum patches. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:54 -07:00
Pablo Neira Ayuso	5251e2d212	[NETFILTER]: conntrack: fix race condition in early_drop On SMP environments the maximum number of conntracks can be exceeded under heavy stress situations due to an existing race condition. CPU A: atomic_read(), early_drop(), ..., allocate conntrack, atomic_inc(). CPU B: ..., ..., atomic_read(), allocate conntrack, atomic_inc(). This patch moves the counter increment before the early drop stage. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:54 -07:00
Pablo Neira Ayuso	01f348484d	[NETFILTER]: ctnetlink: simplify the code to dump the conntrack table Merge the bits to dump the conntrack table and the ones to dump and zero counters in a single piece of code. This patch does not change the default behaviour if accounting is not enabled. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:52 -07:00
Dmitry Mishin	90d47db4a0	[NETFILTER]: x_tables: small check_entry & module_refcount cleanup While standard_target has target->me == NULL, module_put() should be called for it as for others, because there were try_module_get() before. Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:51 -07:00
Patrick McHardy	ecb70c95c4	[NETFILTER]: ipt_TCPMSS: misc cleanup - remove debugging cruft - remove printk for reallocation failures - remove unused addition Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:49 -07:00
Patrick McHardy	2be344c446	[NETFILTER]: ipt_TCPMSS: remove impossible condition Every skb must have a dst_entry at this point. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:48 -07:00
Patrick McHardy	68e1f188de	[NETFILTER]: ipt_TCPMSS: reformat - fix whitespace error - break lines at 80 characters - reformat some expressions to be more readable Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:47 -07:00
Patrick McHardy	df0933dcb0	[NETFILTER]: kill listhelp.h Kill listhelp.h and use the list.h functions instead. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:45 -07:00
Al Viro	c55e2f4997	[IPV4]: ipip and ip_gre encapsulation bugs Handling of ipip and ip_gre ICMP error relaying is b0rken; it accesses 8bit field + 3 reserved octets as host-endian 32bit, does comparison, subtraction and stuffs the result back. That breaks on big-endian. Fixed, made endian-clean. [ Note that this affected code is permanently commented out with an ifdef, so this error couldn't actually cause problems for anyone. -DaveM ] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:43 -07:00
Patrick McHardy	a1e59abf82	[XFRM]: Fix wildcard as tunnel source Hashing SAs by source address breaks templates with wildcards as tunnel source since the source address used for hashing/lookup is still 0/0. Move source address lookup to xfrm_tmpl_resolve_one() so we can use the real address in the lookup. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:06 -07:00
Alexey Kuznetsov	1ef9696c90	[TCP]: Send ACKs each 2nd received segment. It does not affect either mss-sized connections (obviously) or connections controlled by Nagle (because there is only one small segment in flight). The idea is to record the fact that a small segment arrives on a connection, where one small segment has already been received and still not-ACKed. In this case ACK is forced after tcp_recvmsg() drains receive buffer. In other words, it is a "soft" each-2nd-segment ACK, which is enough to preserve ACK clock even when ABC is enabled. Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:05 -07:00
Brian Haley	94aec08ea4	[NETFILTER]: Change tunables to __read_mostly Change some netfilter tunables to __read_mostly. Also fixed some incorrect file reference comments while I was in there. (this will be my last __read_mostly patch unless someone points out something else that needs it) Signed-off-by: Brian Haley <brian.haley@hp.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:54 -07:00
Jamal Hadi Salim	eb878e8457	[IPSEC]: output mode to take an xfrm state as input param Expose IPSEC modes output path to take an xfrm state as input param. This makes it consistent with the input mode processing (which already takes the xfrm state as a param). Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:48 -07:00
Dmitry Mishin	fda9ef5d67	[NET]: Fix sk->sk_filter field access Function sk_filter() is called from tcp_v{4,6}_rcv() functions with arg needlock = 0, while socket is not locked at that moment. In order to avoid this and similar issues in the future, use rcu for sk->sk_filter field read protection. Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: Kirill Korotaev <dev@openvz.org>	2006-09-22 15:18:47 -07:00
Herbert Xu	ff9b5e0f08	[TCP]: Fix rcv mss estimate for LRO By passing a Linux-generated TSO packet straight back into Linux, Xen becomes our first LRO user :) Unfortunately, there is at least one spot in our stack that needs to be changed to cope with this. The receive MSS estimate is computed from the raw packet size. This is broken if the packet is GSO/LRO. Fortunately the real MSS can be found in gso_size so we simply need to use that if it is non-zero. Real LRO NICs should of course set the gso_size field in future. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:44 -07:00
Stephen Hemminger	9bcfcaf5e9	[NETFILTER] bridge: simplify nf_bridge_pad Do some simple optimization on the nf_bridge_pad() function and don't use magic constants. Eliminate a double call and the #ifdef'd code for CONFIG_BRIDGE_NETFILTER. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:31 -07:00
Thomas Graf	5176f91ea8	[NETLINK]: Make use of NLA_STRING/NLA_NUL_STRING attribute validation Converts existing NLA_STRING attributes to use the new validation features, saving a couple of temporary buffers. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:25 -07:00
David S. Miller	e3b4eadbea	[UDP]: saddr_cmp function should take const socket pointers This also kills a warning while building ipv6: net/ipv6/udp.c: In function ‘udp_v6_get_port’: net/ipv6/udp.c:66: warning: passing argument 3 of ‘udp_get_port’ from incompatible pointer type Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:23 -07:00
David S. Miller	bed53ea7fe	[UDP]: Mark udp_port_rover static. It is not referenced outside of net/ipv4/udp.c any longer. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:22 -07:00
Gerrit Renker	25030a7f9e	[UDP]: Unify UDPv4 and UDPv6 ->get_port() This patch creates one common function which is called by udp_v4_get_port() and udp_v6_get_port(). As a result, * duplicated code is removed * udp_port_rover and local port lookup can now be removed from udp.h * further savings follow since the same function will be used by UDP-Litev4 and UDP-Litev6 In contrast to the patch sent in response to Yoshifuji's comments (fixed by this variant), the code below also removes the EXPORT_SYMBOL(udp_port_rover), since udp_port_rover can now remain local to net/ipv4/udp.c. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:21 -07:00
Alexey Dobriyan	e5d679f339	[NET]: Use SLAB_PANIC Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:19 -07:00
YOSHIFUJI Hideaki	ef047f5e10	[NET]: Use BUILD_BUG_ON() for checking size of skb->cb. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:15 -07:00
Alexey Dobriyan	74975d40b1	[TCP] Congestion control (modulo lp, bic): use BUILD_BUG_ON Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:13 -07:00
Patrick McHardy	bbfb39cbf6	[IPV4]: Add support for fwmark masks in routing rules Add a FRA_FWMASK attribute for fwmark masks. For compatibility a mask of 0xFFFFFFFF is used when a mark value != 0 is sent without a mask. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:10 -07:00
Alexey Dobriyan	65e3d72654	[TCP] tcp_bic: use BUILD_BUG_ON Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:04 -07:00
Alexey Dobriyan	298969727e	[TCP] tcp_lp: use BUILD_BUG_ON Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:03 -07:00
David S. Miller	e4bec827fe	[IPSEC] esp: Defer output IV initialization to first use. First of all, if the xfrm_state only gets used for input packets this entropy is a complete waste. Secondly, it is often the case that a configuration loads many rules (perhaps even dynamically) and they don't all necessarily ever get used. This get_random_bytes() call was showing up in the profiles for xfrm_state inserts which is how I noticed this. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:17:35 -07:00
David S. Miller	9d4a706d85	[XFRM]: Add generation count to xfrm_state and xfrm_dst. Each xfrm_state inserted gets a new generation counter value. When a bundle is created, the xfrm_dst objects get the current generation counter of the xfrm_state they will attach to at dst->xfrm. xfrm_bundle_ok() will return false if it sees an xfrm_dst with a generation count different from the generation count of the xfrm_state that dst points to. This provides a facility by which to passively and cheaply invalidate cached IPSEC routes during SA database changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:08:42 -07:00
David S. Miller	edcd582152	[XFRM]: Pull xfrm_state_by{spi,src} hash table knowledge out of afinfo. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:08:39 -07:00
David S. Miller	2770834c9f	[XFRM]: Pull xfrm_state_bydst hash table knowledge out of afinfo. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:08:38 -07:00
Masahide NAKAMURA	e53820de0f	[XFRM] IPV6: Restrict bundle reusing For outbound transformation, a bundle is checked to see whether it is suitable for reuse by the current flow. In an IPv6 case such as the one below, transformation may apply an incorrect bundle to the flow instead of creating another bundle: - The policy selector has destination prefix length < 128 (two or more addresses can match it) - Its bundle holds the dst entry of a default route whose prefix length < 128 (previous traffic used such a route as next hop) - The policy and the bundle used a transport mode state and this time the flow address does not match the bundled state. This issue was found through Mobile IPv6 usage to protect mobility signaling by IPsec, but it is not Mobile IPv6 specific. This patch adds a strict check to xfrm_bundle_ok() for each state mode and address when the prefix length is less than 128. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:06:44 -07:00
Masahide NAKAMURA	1b5c229987	[XFRM] STATE: Support non-fragment outbound transformation headers. For originated outbound IPv6 packets which will fragment, ip6_append_data() should know length of extension headers before sending them and the length is carried by dst_entry. IPv6 IPsec headers fragment then transformation was designed to place all headers after fragment header. OTOH Mobile IPv6 extension headers do not fragment then it is a good idea to make dst_entry have non-fragment length to tell it to ip6_append_data(). Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:06:41 -07:00
Masahide NAKAMURA	eb2971b68a	[XFRM] STATE: Search by address using source address list. This is a support to search transformation states by its addresses by using source address list for Mobile IPv6 usage. To use it from user-space, it is also added a message type for source address as a xfrm state option. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:06:35 -07:00
Masahide NAKAMURA	6c44e6b7ab	[XFRM] STATE: Add source address list. Support source address based searching. Mobile IPv6 will use it. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:06:34 -07:00
Masahide NAKAMURA	7e49e6de30	[XFRM]: Add XFRM_MODE_xxx for future use. Transformation mode is used as either IPsec transport or tunnel. It is required to add two more items, route optimization and inbound trigger for Mobile IPv6. Based on MIPL2 kernel patch. This patch was also written by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:05:15 -07:00
Patrick McHardy	efa741656e	[NETFILTER]: x_tables: remove unused size argument to check/destroy functions The size is verified by x_tables and isn't needed by the modules anymore. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:34 -07:00
Patrick McHardy	fe1cb10873	[NETFILTER]: x_tables: remove unused argument to target functions Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:33 -07:00
Patrick McHardy	da878c8e5a	[NETFILTER]: replace open coded checksum updates Replace open coded checksum update by nf_csum_update calls and clean up the surrounding code a bit. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:29 -07:00
Pablo Neira Ayuso	1a31526bae	[NETFILTER]: ctnetlink: remove impossible events tests for updates IPCT_HELPER and IPCT_NATINFO bits are never set on updates. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:27 -07:00
Pablo Neira Ayuso	b3a27bfba5	[NETFILTER]: ctnetlink: check for listeners before sending expectation events This patch uses nfnetlink_has_listeners to check for listeners in userspace. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:26 -07:00
Pablo Neira Ayuso	b9a37e0c81	[NETFILTER]: ctnetlink: dump connection mark ctnetlink dumps the mark iif the event mark happened Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:25 -07:00
Daniel De Graaf	b93ff78317	[NETFILTER]: ipt_recent: add module parameter for changing ownership of /proc/net/ipt_recent/* Signed-off-by: Daniel De Graaf <danield@iastate.edu> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:23 -07:00
Yasuyuki Kozakai	a468701db5	[NETFILTER]: x_tables: replace IPv4 DSCP target by address family independent version This replaces IPv4 DSCP target by address family independent version. This also - utilizes dsfield.h to get/mangle DS field in IPv4/IPv6 header - fixes Kconfig help text. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:22 -07:00
Yasuyuki Kozakai	9ba1627617	[NETFILTER]: x_tables: replace IPv4 dscp match by address family independent version This replaces IPv4 dscp match by address family independent version. This also - utilizes dsfield.h to get the DS field in IPv4/IPv6 header, and - checks for the DSCP value from user space. - fixes Kconfig help text. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:21 -07:00
Thomas Graf	d889ce3b29	[IPv4]: Convert route get to new netlink api Fixes various unvalidated netlink attributes causing memory corruptions when left empty by userspace applications. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:06 -07:00
Thomas Graf	be403ea185	[IPv4]: Convert FIB dumping to use new netlink api Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:05 -07:00
Thomas Graf	4e902c5741	[IPv4]: FIB configuration using struct fib_config Introduces struct fib_config replacing the ugly struct kern_rta prone to ordering issues. Avoids creating faked netlink messages for auto generated routes or requests via ioctl. A new interface net/nexthop.h is added to help navigate through nexthop configuration arrays. A new struct nl_info will be used to carry the necessary netlink information to be used for notifications later on. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:04 -07:00
Brian Haley	ab32ea5d8a	[NET/IPV4/IPV6]: Change some sysctl variables to __read_mostly Change net/core, ipv4 and ipv6 sysctl variables to __read_mostly. Couldn't actually measure any performance increase while testing (.3% I consider noise), but seems like the right thing to do. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:03 -07:00
Thomas Graf	f21c7bc5f6	[IPv4] route: Convert route notifications to use rtnl_notify() Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:54 -07:00
Thomas Graf	d6062cbbd1	[IPv4] address: Convert address notification to use rtnl_notify() Adds support for NLM_F_ECHO allowing applications to easily see which addresses have been deleted, added, or promoted. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:53 -07:00
Thomas Graf	2942e90050	[RTNETLINK]: Use rtnl_unicast() for rtnetlink unicasts Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:48 -07:00
Martin Bligh	81aa646cc4	[IPV4]: add the UdpSndbufErrors and UdpRcvbufErrors MIBs Signed-off-by: Martin Bligh <mbligh@google.com> Signed-off-by: Andrew Morton <akpm@osdl.org>	2006-09-22 14:54:41 -07:00
Patrick McHardy	1af5a8c4a1	[IPV4]: Increase number of possible routing tables to 2^32 Increase the number of possible routing tables to 2^32 by replacing the fixed sized array of pointers by a hash table and replacing iterations over all possible table IDs by hash table walking. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:26 -07:00
Patrick McHardy	9e762a4a89	[NET]: Introduce RTA_TABLE/FRA_TABLE attributes Introduce RTA_TABLE route attribute and FRA_TABLE routing rule attribute to hold 32 bit routing table IDs. Userspace compatibility is provided by continuing to accept and send the rtm_table field, but because of its limited size it can only carry the low 8 bits of the table ID. This implies that if larger IDs are used, _all_ userspace programs using them need to use RTA_TABLE. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:25 -07:00
Patrick McHardy	2dfe55b47e	[NET]: Use u32 for routing table IDs Use u32 for routing table IDs in net/ipv4 and net/decnet in preparation of support for a larger number of routing tables. net/ipv6 already uses u32 everywhere and needs no further changes. No functional changes are made by this patch. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:24 -07:00
Dave Jones	bf0d52492d	[NET]: Remove unnecessary config.h includes from net/ config.h is automatically included by kbuild these days. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:21 -07:00
Herbert Xu	8f491069b4	[IPV4]: Use network-order dport for all visible inet_lookup_* Right now most inet_lookup_* functions take a host-order hnum instead of a network-order dport because that's how it is represented internally. This means that users of these functions have to be careful about using the right byte-order. To add more confusion, inet_lookup takes a network-order dport unlike all other functions. So this patch changes all visible inet_lookup functions to take a dport and move all dport->hnum conversion inside them. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:14 -07:00
Stephen Hemminger	832b4c5e18	[IPV4] fib: convert reader/writer to spinlock There is no point in using a more expensive reader/writer lock for a low contention lock like the fib_info_lock. The only reader case is in handling route redirects. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:13 -07:00
Herbert Xu	99a92ff504	[IPV4]: Uninline inet_lookup_listener By modern standards this function is way too big to be inlined. It's even bigger than __inet_lookup_listener :) Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:11 -07:00
Louis Nyffenegger	1a01912ae0	[INET]: Remove is_setbyuser patch The value is_setbyuser from struct ip_options is never used and set only one time (http://linux-net.osdl.org/index.php/TODO#IPV4). This little patch removes it from the kernel source. Signed-off-by: Louis Nyffenegger <louis.nyffenegger@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:10 -07:00
David S. Miller	0298f36a57	[IPV4]: Kill fib4_rules_clean(). As noted by Adrian Bunk this function is totally unused. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:09 -07:00
Adrian Bunk	8ce11e6a9f	[NET]: Make code static. This patch makes needlessly global code static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:07 -07:00
Patrick McHardy	4cf411de49	[NETFILTER]: Get rid of HW checksum invalidation Update hardware checksums incrementally to avoid breaking GSO. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:54 -07:00
Patrick McHardy	84fa7933a3	[NET]: Replace CHECKSUM_HW by CHECKSUM_PARTIAL/CHECKSUM_COMPLETE Replace CHECKSUM_HW by CHECKSUM_PARTIAL (for outgoing packets, whose checksum still needs to be completed) and CHECKSUM_COMPLETE (for incoming packets, device supplied full checksum). Patch originally from Herbert Xu, updated by myself for 2.6.18-rc3. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:53 -07:00
Patrick McHardy	8584d6df39	[NETFILTER]: netbios conntrack: fix compile Fix compile breakage caused by move of IFA_F_SECONDARY to new header file. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:51 -07:00
Thomas Graf	1823730fbc	[IPv4]: Move interface address bits to linux/if_addr.h Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:47 -07:00
Thomas Graf	47f68512d2	[IPV4]: Convert address dumping to new netlink api Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:46 -07:00
Thomas Graf	dfdd5fd4e9	[IPV4]: Convert address deletion to new netlink api Fixes various unvalidated netlink attributes causing memory corruptions when left empty by userspace. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:45 -07:00
Thomas Graf	5c7539781d	[IPV4]: Convert address addition to new netlink api Adds rtm_to_ifaddr() transforming a netlink message to a struct in_ifaddr. Fixes various unvalidated netlink attributes causing memory corruptions when left empty by userspace applications. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:45 -07:00
Thomas Graf	e1ef4bf23b	[IPV4]: Use Protocol Independent Policy Routing Rules Framework Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:42 -07:00
Paul Moore	446fda4f26	[NetLabel]: CIPSOv4 engine Add support for the Commercial IP Security Option (CIPSO) to the IPv4 network stack. CIPSO has become a de-facto standard for trusted/labeled networking amongst existing Trusted Operating Systems such as Trusted Solaris, HP-UX CMW, etc. This implementation is designed to be used with the NetLabel subsystem to provide explicit packet labeling to LSM developers. The CIPSO/IPv4 packet labeling works by the LSM calling a NetLabel API function which attaches a CIPSO label (IPv4 option) to a given socket; this in turn attaches the CIPSO label to every packet leaving the socket without any extra processing on the outbound side. On the inbound side the individual packet's sk_buff is examined through a call to a NetLabel API function to determine if a CIPSO/IPv4 label is present and if so the security attributes of the CIPSO label are returned to the caller of the NetLabel API function. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:33 -07:00
Paul Moore	11a03f78fb	[NetLabel]: core network changes Changes to the core network stack to support the NetLabel subsystem. This includes changes to the IPv4 option handling to support CIPSO labels. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:32 -07:00
Venkat Yekkirala	4237c75c0a	[MLSXFRM]: Auto-labeling of child sockets This automatically labels the TCP, Unix stream, and dccp child sockets as well as openreqs to be at the same MLS level as the peer. This will result in the selection of appropriately labeled IPSec Security Associations. This also uses the sock's sid (as opposed to the isec sid) in SELinux enforcement of secmark in rcv_skb and postroute_last hooks. Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:29 -07:00
Venkat Yekkirala	beb8d13bed	[MLSXFRM]: Add flow labeling This labels the flows that could utilize IPSec xfrms at the points the flows are defined so that IPSec policy and SAs at the right label can be used. The following protos are currently not handled, but they should continue to be able to use single-labeled IPSec like they currently do. ipmr ip_gre ipip igmp sit sctp ip6_tunnel (IPv6 over IPv6 tunnel device) decnet Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:27 -07:00
Herbert Xu	e4d5b79c66	[CRYPTO] users: Use crypto_comp and crypto_has_* This patch converts all users to use the new crypto_comp type and the crypto_has_* functions. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2006-09-21 11:46:22 +10:00
Herbert Xu	07d4ee583e	[IPSEC]: Use HMAC template and hash interface This patch converts IPsec to use the new HMAC template. The names of existing simple digest algorithms may still be used to refer to their HMAC composites. The same structure can be used by other MACs such as AES-XCBC-MAC. This patch also switches from the digest interface to hash. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-21 11:46:18 +10:00
Herbert Xu	6b7326c849	[IPSEC] ESP: Use block ciphers where applicable This patch converts IPSec/ESP to use the new block cipher type where applicable. Similar to the HMAC conversion, existing algorithm names have been kept for compatibility. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2006-09-21 11:46:14 +10:00
Al Viro	888454c57a	[IPV4] fib_trie: missing ntohl() when calling fib_semantic_match() fib_trie.c::check_leaf() passes host-endian where fib_semantic_match() expects (and stores into) net-endian. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-19 13:42:46 -07:00
Wong Hoi Sing Edison	3795da47e8	[TCP] tcp-lp: bug fix for oops in 2.6.18-rc6 Sorry that the patch submitted yesterday still contained a small bug. This version has already been tested for hours with BT connections. The oops is now difficult to reproduce. Signed-off-by: Wong Hoi Sing Edison <hswong3i@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-17 23:21:09 -07:00
Simon Horman	b552216ff1	[IPVS]: remove the debug option from ip_vs_ftp This patch makes the debugging behaviour of this code more consistent with the rest of IPVS. Signed-Off-By: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-17 23:21:06 -07:00
Simon Horman	3f5af5b353	[IPVS]: Make sure ip_vs_ftp ports are valid I'm not entirely sure what happens in the case of an invalid port, at best it'll be silently ignored. This patch ignores them a little more verbosely. Signed-Off-By: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-17 23:21:05 -07:00
Simon Horman	70e76b768b	[IPVS]: auto-help for ip_vs_ftp Fill in a help message for the ports option to ip_vs_ftp Signed-Off-By: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-17 23:21:04 -07:00
Stephen Hemminger	b3a8a40da5	[TCP]: Turn ABC off. Turn Appropriate Byte Count off by default because it unfairly penalizes applications that do small writes. Add better documentation to describe what it is so users will understand why they might want to turn it on. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-17 23:21:02 -07:00
Wei Dong	0668b47205	[IPV4]: Fix SNMPv2 "ipFragFails" counter error When I tested the "ipFragFails" statistic on Linux kernel 2.6.17.7, I found that this counter did not increase correctly. The criterion is RFC2011: RFC2011 ipFragFails OBJECT-TYPE SYNTAX Counter32 MAX-ACCESS read-only STATUS current DESCRIPTION "The number of IP datagrams that have been discarded because they needed to be fragmented at this entity but could not be, e.g., because their Don't Fragment flag was set." ::= { ip 18 } When I send a big IP packet that needs to be fragmented, with the DF bit set to 1, to a router, the router just sends an ICMP error message ICMP_FRAG_NEEDED but does not increment this counter (in the function ip_fragment). Signed-off-by: Wei Dong <weid@nanjing-fnst.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-31 15:24:48 -07:00
Daikichi Osuga	3fdf3f0c99	[TCP]: Two RFC3465 Appropriate Byte Count fixes. 1) fix slow start after retransmit timeout 2) fix case of L=2*SMSS acked bytes comparison Signed-off-by: Daikichi Osuga <osugad@s1.nttdocomo.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-29 21:22:16 -07:00
Stephen Hemminger	316c1592be	[TCP]: Limit window scaling if window is clamped. This small change allows for easy per-route workarounds for broken hosts or middleboxes that are not compliant with TCP standards for window scaling. Rather than having to turn off window scaling globally. This patch allows reducing or disabling window scaling if window clamp is present. Example: Mark Lord reported a problem with 2.6.17 kernel being unable to access http://www.everymac.com # ip route add 216.145.246.23/32 via 10.8.0.1 window 65535 Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-22 14:33:57 -07:00
Patrick McHardy	e0b7cde997	[NETFILTER]: arp_tables: fix table locking in arpt_do_table table->private might change because of ruleset changes, don't use it without holding the lock. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-22 14:33:56 -07:00
Patrick McHardy	8311731afc	[NETFILTER]: ip_tables: fix table locking in ipt_do_table table->private might change because of ruleset changes, don't use it without holding the lock. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-17 18:13:53 -07:00
Patrick McHardy	d205dc4079	[NETFILTER]: ctnetlink: fix deadlock in table dumping ip_conntrack_put must not be called while holding ip_conntrack_lock since destroy_conntrack takes it again. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-17 18:12:38 -07:00
Alexey Kuznetsov	6e8fcbf640	[IPV4]: severe locking bug in fib_semantics.c Found in 2.4 by Yixin Pan <yxpan@hotmail.com>. > When I read fib_semantics.c of Linux-2.4.32, write_lock(&fib_info_lock) is used in fib_release_info() instead of write_lock_bh(&fib_info_lock). Is the following case possible: a BH interrupts fib_release_info() while holding the write lock, and calls ip_check_fib_default() which calls read_lock(&fib_info_lock), and spin forever. Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-17 16:44:46 -07:00
David L Stevens	acd6e00b8e	[MCAST]: Fix filter leak on device removal. This fixes source filter leakage when a device is removed and a process leaves the group thereafter. This also includes corresponding fixes for IPv6 multicast source filters on device removal. Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-17 16:29:57 -07:00
Michal Ruzicka	bb699cbca0	[IPV4]: Possible leak of multicast source filter structure There is a leak of a socket's multicast source filter list structure on closing a socket with a multicast source filter set on an interface that does not exist any more. Signed-off-by: Michal Ruzicka <michal.ruzicka@comstar.cz> Acked-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-17 16:29:49 -07:00
Herbert Xu	e9fa4f7bd2	[INET]: Use pskb_trim_unique when trimming paged unique skbs The IPv4/IPv6 datagram output path was using skb_trim to trim paged packets because they know that the packet has not been cloned yet (since the packet hasn't been given to anything else in the system). This broke because skb_trim no longer allows paged packets to be trimmed. Paged packets must be given to one of the pskb_trim functions instead. This patch adds a new pskb_trim_unique function to cover the IPv4/IPv6 datagram output path scenario and replaces the corresponding skb_trim calls with it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-13 20:12:58 -07:00
Mark Huang	dcb7cd97f1	[NETFILTER]: ulog: fix panic on SMP kernels Fix kernel panic on various SMP machines. The culprit is a null ub->skb in ulog_send(). If ulog_timer() has already been scheduled on one CPU and is spinning on the lock, and ipt_ulog_packet() flushes the queue on another CPU by calling ulog_send() right before it exits, there will be no skbuff when ulog_timer() acquires the lock and calls ulog_send(). Cancelling the timer in ulog_send() doesn't help because it has already been scheduled and is running on the first CPU. Similar problem exists in ebt_ulog.c and nfnetlink_log.c. Signed-off-by: Mark Huang <mlhuang@cs.princeton.edu> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-13 18:57:54 -07:00
Patrick McHardy	0eff66e625	[NETFILTER]: {arp,ip,ip6}_tables: proper error recovery in init path Neither of {arp,ip,ip6}_tables cleans up behind itself when something goes wrong during initialization. Noticed by Rennie deGraaf <degraaf@cpsc.ucalgary.ca> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-13 18:57:28 -07:00
Patrick McHardy	1c7628bd7a	[NETFILTER]: xt_hashlimit: fix limit off-by-one Hashlimit doesn't account for the first packet, which is inconsistent with the limit match. Reported by ryan.castellucci@gmail.com, netfilter bugzilla #500. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-13 18:06:02 -07:00
David S. Miller	18b6fe64d4	[TCP]: Fix botched memory leak fix to tcpprobe_read(). Somehow I clobbered James's original fix and only my subsequent compiler warning change went in for that changeset. Get the real fix in there. Noticed by Jesper Juhl. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-13 18:05:09 -07:00
Wei Yongjun	bd37a08859	[TCP]: SNMPv2 tcpOutSegs counter error Do not count retransmitted segments. Signed-off-by: Wei Yongjun <yjwei@nanjing-fnst.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-07 21:04:15 -07:00
Kirill Korotaev	8d1502de27	[IPV4]: Limit rt cache size properly. From: Kirill Korotaev <dev@sw.ru> During OpenVZ stress testing we found that UDP traffic with random src can generate too much excessive rt hash growing leading finally to OOM and kernel panics. It was found that for 4GB i686 system (having 1048576 total pages and 225280 normal zone pages) kernel allocates the following route hash: syslog: IP route cache hash table entries: 262144 (order: 8, 1048576 bytes) => ip_rt_max_size = 4194304 entries, i.e. max rt size is 4194304 * 256b = 1Gb of RAM > normal_zone Attached the patch which removes HASH_HIGHMEM flag from alloc_large_system_hash() call. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-07 20:44:22 -07:00
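The sizing arithmetic in 8d1502de27 can be checked with a short sketch. The factor of 16 route entries per hash bucket is inferred from the figures quoted in the message (262144 buckets, 4194304 entries), the 4 KiB page size is an assumption for i686, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound on route cache memory:
 * hash buckets * entries allowed per bucket * bytes per entry. */
uint64_t rt_cache_max_bytes(uint64_t hash_buckets, uint64_t entries_per_bucket,
                            uint64_t entry_bytes)
{
    return hash_buckets * entries_per_bucket * entry_bytes;
}

/* Size of a memory zone from its page count and page size. */
uint64_t zone_bytes(uint64_t pages, uint64_t page_bytes)
{
    assert(page_bytes > 0);
    return pages * page_bytes;
}

/* Returns 1 when the cache limit alone exceeds the zone size. */
int rt_cache_can_exhaust_zone(uint64_t cache_bytes, uint64_t zone)
{
    return cache_bytes > zone;
}
```

With the quoted figures, rt_cache_max_bytes(262144, 16, 256) is 1073741824 bytes (1 GiB), while zone_bytes(225280, 4096) is 922746880 bytes (880 MiB), so the cache limit exceeds the normal zone.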
Ilpo Järvinen	d254bcdbf2	[TCP]: Fixes IW > 2 cases when TCP is application limited Whenever a transfer is application limited, we are allowed at least initial window worth of data per window unless cwnd is previously less than that. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-04 22:59:52 -07:00
Alexey Dobriyan	29bbd72d6e	[NET]: Fix more per-cpu typos Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-02 15:02:31 -07:00
Catherine Zhang	dc49c1f94e	[AF_UNIX]: Kernel memory leak fix for af_unix datagram getpeersec patch From: Catherine Zhang <cxzhang@watson.ibm.com> This patch implements a cleaner fix for the memory leak problem of the original unix datagram getpeersec patch. Instead of creating a security context each time a unix datagram is sent, we only create the security context when the receiver requests it. This new design requires modification of the current unix_getsecpeer_dgram LSM hook and addition of two new hooks, namely, secid_to_secctx and release_secctx. The former retrieves the security context and the latter releases it. A hook is required for releasing the security context because it is up to the security module to decide how that's done. In the case of Selinux, it's a simple kfree operation. Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-02 14:12:06 -07:00
Wei Dong	dafee49085	[IPV6]: SNMPv2 "ipv6IfStatsOutFragCreates" counter error When I tested linux kernel 2.6.17.7 about statistics "ipv6IfStatsOutFragCreates", and found that it couldn't increase correctly. The criteria is RFC 2465: ipv6IfStatsOutFragCreates OBJECT-TYPE SYNTAX Counter32 MAX-ACCESS read-only STATUS current DESCRIPTION "The number of output datagram fragments that have been generated as a result of fragmentation at this output interface." ::= { ipv6IfStatsEntry 15 } I think there are two issues in Linux kernel. 1st: RFC2465 specifies the counter is "The number of output datagram fragments...". I think increasing this counter after output a fragment successfully is better. And it should not be increased even though a fragment is created but failed to output. 2nd: If we send a big ICMP/ICMPv6 echo request to a host, and receive ICMP/ICMPv6 echo reply consisted of some fragments. As we know that in Linux kernel first fragmentation occurs in ICMP layer(maybe saying transport layer is better), but this is not the "real" fragmentation, just do some "pre-fragment" -- allocate space for data, and form a frag_list, etc. The "real" fragmentation happens in IP layer -- set offset and MF flag and so on. So I think in "fast path" for ip_fragment/ip6_fragment, if we send a fragment which "pre-fragment" by upper layer we should also increase "ipv6IfStatsOutFragCreates". Signed-off-by: Wei Dong <weid@nanjing-fnst.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-02 13:41:21 -07:00
Patrick McHardy	3ab720881b	[NETFILTER]: xt_hashlimit/xt_string: missing string validation The hashlimit table name and the textsearch algorithm need to be terminated, the textsearch pattern length must not exceed the maximum size. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-02 13:38:29 -07:00
Patrick McHardy	b10866fd7d	[NETFILTER]: SIP helper: expect RTP streams in both directions Since we don't know in which direction the first packet will arrive, we need to create one expectation for each direction, which is currently prevented by max_expected being set to 1. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-02 13:38:28 -07:00
David S. Miller	52499afe40	[TCP]: Process linger2 timeout consistently. Based upon guidance from Alexey Kuznetsov. When linger2 is active, we check to see if the fin_wait2 timeout is longer than the timewait. If it is, we schedule the keepalive timer for the difference between the timewait timeout and the fin_wait2 timeout. When this orphan socket is seen by tcp_keepalive_timer() it will try to transform this fin_wait2 socket into a fin_wait2 mini-socket, again if linger2 is active. Not all paths were setting this initial keepalive timer correctly. The tcp input path was doing it correctly, but tcp_close() wasn't, potentially making the socket linger longer than it really needs to. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-02 13:38:24 -07:00
Tom Tucker	8d71740c56	[NET]: Core net changes to generate netevents Generate netevents for: - neighbour changes - routing redirects - pmtu changes Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-02 13:38:21 -07:00
Wei Yongjun	3687b1dc6f	[TCP]: SNMPv2 tcpAttemptFails counter error Refer to RFC2012, tcpAttemptFails is defined as following: tcpAttemptFails OBJECT-TYPE SYNTAX Counter32 MAX-ACCESS read-only STATUS current DESCRIPTION "The number of times TCP connections have made a direct transition to the CLOSED state from either the SYN-SENT state or the SYN-RCVD state, plus the number of times TCP connections have made a direct transition to the LISTEN state from the SYN-RCVD state." ::= { tcp 7 } When I lookup into RFC793, I found that the state change should occur under the following conditions: 1. SYN-SENT -> CLOSED a) Received ACK,RST segment when SYN-SENT state. 2. SYN-RCVD -> CLOSED b) Received SYN segment when SYN-RCVD state(came from LISTEN). c) Received RST segment when SYN-RCVD state(came from SYN-SENT). d) Received SYN segment when SYN-RCVD state(came from SYN-SENT). 3. SYN-RCVD -> LISTEN e) Received RST segment when SYN-RCVD state(came from LISTEN). In my test, those direct state transition can not be counted to tcpAttemptFails. Signed-off-by: Wei Yongjun <yjwei@nanjing-fnst.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-02 13:38:19 -07:00
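The RFC 2012 definition quoted in 3687b1dc6f reduces to a small predicate over state transitions. The sketch below is only an illustration of that definition, not the kernel's accounting code; the enum and function names are hypothetical.

```c
#include <assert.h>

/* Subset of TCP states needed for the counter definition. */
enum tcp_state { ST_CLOSED, ST_LISTEN, ST_SYN_SENT, ST_SYN_RCVD, ST_ESTABLISHED };

/* Returns 1 if a direct transition from -> to counts toward tcpAttemptFails:
 * SYN-SENT or SYN-RCVD to CLOSED, or SYN-RCVD to LISTEN. */
int counts_as_attempt_fail(enum tcp_state from, enum tcp_state to)
{
    if (to == ST_CLOSED)
        return from == ST_SYN_SENT || from == ST_SYN_RCVD;
    if (to == ST_LISTEN)
        return from == ST_SYN_RCVD;
    return 0;
}

/* Bump the counter only when the transition qualifies. */
void account_transition(unsigned long *attempt_fails,
                        enum tcp_state from, enum tcp_state to)
{
    assert(attempt_fails != 0);
    if (counts_as_attempt_fail(from, to))
        (*attempt_fails)++;
}
```

Cases a) through e) in the message all map onto one of the three qualifying transitions; a transition such as ESTABLISHED to CLOSED leaves the counter unchanged.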
James Morris	118075b3cd	[TCP]: fix memory leak in net/ipv4/tcp_probe.c::tcpprobe_read() Based upon a patch by Jesper Juhl. Signed-off-by: James Morris <jmorris@namei.org> Acked-by: Stephen Hemminger <shemminger@osdl.org> Acked-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-02 13:38:18 -07:00
Tetsuo Handa	f59fc7f30b	[IPV4/IPV6]: Setting 0 for unused port field in RAW IP recvmsg(). From: Tetsuo Handa from-linux-kernel@i-love.sakura.ne.jp The recvmsg() for raw socket seems to return random u16 value from the kernel stack memory since port field is not initialized. But I'm not sure this patch is correct. Does raw socket return any information stored in port field? [ BSD defines RAW IP recvmsg to return a sin_port value of zero. This is described in Stevens' TCP/IP Illustrated Volume 2 on page 1055, which is discussing the BSD rip_input() implementation. ] Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-25 17:05:35 -07:00
Alexey Kuznetsov	7228749092	[IPV4] ipmr: ip multicast route bug fix. IP multicast route code was reusing an skb which causes use after free and double free. From: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Note, it is real skb_clone(), not alloc_skb(). Enqueued skb contains the whole half-prepared netlink message plus room for the rest. It could be also skb_copy(), if we want to be puristic about mangling cloned data, but original copy is really not going to be used. Acked-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-25 16:45:12 -07:00
Guillaume Chazarain	d569f1d72f	[IPV4]: Clear the whole IPCB, this clears also IPCB(skb)->flags. Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 23:45:16 -07:00
Patrick McHardy	8cf8fb5687	[NETFILTER]: SNMP NAT: fix byteorder confusion Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 22:53:35 -07:00
Adrian Bunk	72b5582359	[NETFILTER]: conntrack: fix SYSCTL=n compile Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 22:53:12 -07:00
Patrick McHardy	083edca05a	[NETFILTER]: H.323 helper: fix possible NULL-ptr dereference An RCF message containing a timeout results in a NULL-ptr dereference if no RRQ has been seen before. Noticed by the "SATURN tool", reported by Thomas Dillig <tdillig@stanford.edu> and Isil Dillig <isil@stanford.edu>. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 22:52:10 -07:00
Patrick McHardy	8265abc082	[IPV4]: Fix nexthop realm dumping for multipath routes Routing realms exist per nexthop, but are only returned to userspace for the first nexthop. This is due to the fact that iproute2 only allows to set the realm for the first nexthop and the kernel refuses multipath routes where only a single realm is present. Dump all realms for multipath routes to enable iproute to correctly display them. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-21 15:09:55 -07:00
Panagiotis Issaris	0da974f4f3	[NET]: Conversions from kmalloc+memset to k(z\|c)alloc. Signed-off-by: Panagiotis Issaris <takis@issaris.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-21 14:51:30 -07:00
Herbert Xu	5d9c5a3292	[IPV4]: Get rid of redundant IPCB->opts initialisation Now that we always zero the IPCB->opts in ip_rcv, it is no longer necessary to do so before calling netif_rx for tunneled packets. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-21 14:29:53 -07:00
Stephen Hemminger	53602f92dd	[IPV4]: Clear skb cb on IP input when data arrives at IP through loopback (and possibly other devices). So the field needs to be cleared before it confuses the route code. This was seen when running netem over loopback, but there are probably other device cases. Maybe this should go into stable? Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-14 14:49:32 -07:00
Herbert Xu	b47b2ec198	[IPV4]: Fix error handling for fib_insert_node call The error handling around fib_insert_node was broken because we always zeroed the error before checking it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-12 13:59:04 -07:00
Herbert Xu	da952315c9	[IPCOMP]: Fix truesize after decompression The truesize check has uncovered the fact that we forgot to update truesize after pskb_expand_head. Unfortunately pskb_expand_head can't update it for us because it's used in all sorts of different contexts, some of which would not allow truesize to be updated by itself. So the solution for now is to simply update it in IPComp. This patch also changes skb_put to __skb_put since we've just expanded tailroom by exactly that amount so we know it's there (but gcc does not). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-12 13:58:55 -07:00
Xiaoliang (David) Wei	6150c22e2a	[TCP] tcp_highspeed: Fix AI updates. I think there is still a problem with the AIMD parameter update in HighSpeed TCP code. Line 125~138 of the code (net/ipv4/tcp_highspeed.c): /* Update AIMD parameters */ if (tp->snd_cwnd > hstcp_aimd_vals[ca->ai].cwnd) { while (tp->snd_cwnd > hstcp_aimd_vals[ca->ai].cwnd && ca->ai < HSTCP_AIMD_MAX - 1) ca->ai++; } else if (tp->snd_cwnd < hstcp_aimd_vals[ca->ai].cwnd) { while (tp->snd_cwnd > hstcp_aimd_vals[ca->ai].cwnd && ca->ai > 0) ca->ai--; In fact, the second part (decreasing ca->ai) never decreases since the while loop's inequality is in the reverse direction. This leads to unfairness with multiple flows (once a flow happens to enjoy a higher ca->ai, it keeps enjoying that even its cwnd decreases) Here is a tentative fix (I also added a comment, trying to keep the change clear): Acked-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-12 13:58:50 -07:00
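The loop-direction bug described in 6150c22e2a is easy to reproduce outside the kernel. The sketch below uses a four-entry threshold table with illustrative values and hypothetical function names; `update_ai_quoted` mirrors the loop as quoted in the message, and `update_ai_fixed` is one possible correction, not necessarily the patch that was merged.

```c
#include <assert.h>

#define AIMD_MAX 4

/* Illustrative ascending cwnd thresholds, not the kernel's full table. */
static const unsigned int aimd_cwnd[AIMD_MAX] = { 38, 118, 221, 347 };

/* Loop as quoted in the message: inside the "cwnd below threshold" branch
 * the while condition tests "cwnd above threshold", so it never runs. */
unsigned int update_ai_quoted(unsigned int ai, unsigned int cwnd)
{
    assert(ai < AIMD_MAX);
    if (cwnd > aimd_cwnd[ai]) {
        while (cwnd > aimd_cwnd[ai] && ai < AIMD_MAX - 1)
            ai++;
    } else if (cwnd < aimd_cwnd[ai]) {
        while (cwnd > aimd_cwnd[ai] && ai > 0)
            ai--;
    }
    return ai;
}

/* One possible correction: step down while cwnd also fits under the
 * previous threshold. */
unsigned int update_ai_fixed(unsigned int ai, unsigned int cwnd)
{
    assert(ai < AIMD_MAX);
    if (cwnd > aimd_cwnd[ai]) {
        while (cwnd > aimd_cwnd[ai] && ai < AIMD_MAX - 1)
            ai++;
    } else {
        while (ai > 0 && cwnd <= aimd_cwnd[ai - 1])
            ai--;
    }
    return ai;
}
```

Starting from index 3 with cwnd 100, the quoted loop stays at index 3 while the corrected one settles at index 1, which is the same index reached when growing from index 0 to cwnd 100.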
David S. Miller	c427d27452	[TCP]: Remove TCP Compound This reverts: `f890f92104` The inclusion of TCP Compound needs to be reverted at this time because it is not 100% certain that this code conforms to the requirements of Developer's Certificate of Origin 1.1 paragraph (b). Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-10 14:50:35 -07:00
Herbert Xu	7466d90f85	[IPV4] inetpeer: Get rid of volatile from peer_total The variable peer_total is protected by a lock. The volatile marker makes no sense. This shaves off 20 bytes on i386. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-10 14:50:30 -07:00
Patrick McHardy	26e0fd1ce2	[NET]: Fix IPv4/DECnet routing rule dumping When more rules are present than fit in a single skb, the remaining rules are incorrectly skipped. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-08 13:38:55 -07:00
Herbert Xu	a430a43d08	[NET] gso: Fix up GSO packets with broken checksums Certain subsystems in the stack (e.g., netfilter) can break the partial checksum on GSO packets. Until they're fixed, this patch allows this to work by recomputing the partial checksums through the GSO mechanism. Once they've all been converted to update the partial checksum instead of clearing it, this workaround can be removed. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-08 13:34:56 -07:00
Herbert Xu	89114afd43	[NET] gso: Add skb_is_gso This patch adds the wrapper function skb_is_gso which can be used instead of directly testing skb_shinfo(skb)->gso_size. This makes things a little nicer and allows us to change the primary key for indicating whether an skb is GSO (if we ever want to do that). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-08 13:34:32 -07:00
Herbert Xu	bbcf467dab	[NET]: Verify gso_type too in gso_segment We don't want nasty Xen guests to pass a TCPv6 packet in with gso_type set to TCPv4 or even UDP (or a packet that's both TCP and UDP). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-03 19:38:35 -07:00
Ingo Molnar	c636618485	[PATCH] lockdep: annotate bh_lock_sock() Teach special (recursive) locking code to the lock validator. Has no effect on non-lockdep kernels. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-03 15:27:08 -07:00
Ingo Molnar	6205120044	[PATCH] lockdep: fix RT_HASH_LOCK_SZ On lockdep we have a quite big spinlock_t, so keep the size down. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-03 15:27:05 -07:00
Ingo Molnar	8a25d5debf	[PATCH] lockdep: prove spinlock rwlock locking correctness Use the lock validator framework to prove spinlock and rwlock locking correctness. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-03 15:27:04 -07:00
Ingo Molnar	e4d9191885	[PATCH] lockdep: locking init debugging improvement Locking init improvement: - introduce and use __SPIN_LOCK_UNLOCKED for array initializations, to pass in the name string of locks, used by debugging Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-03 15:27:02 -07:00
Linus Torvalds	e37a72de84	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [IPV6]: Added GSO support for TCPv6 [NET]: Generalise TSO-specific bits from skb_setup_caps [IPV6]: Added GSO support for TCPv6 [IPV6]: Remove redundant length check on input [NETFILTER]: SCTP conntrack: fix crash triggered by packet without chunks [TG3]: Update version and reldate [TG3]: Add TSO workaround using GSO [TG3]: Turn on hw fix for ASF problems [TG3]: Add rx BD workaround [TG3]: Add tg3_netif_stop() in vlan functions [TCP]: Reset gso_segs if packet is dodgy	2006-06-30 15:40:17 -07:00
Herbert Xu	f83ef8c0b5	[IPV6]: Added GSO support for TCPv6 This patch adds GSO support for IPv6 and TCPv6. This is based on a patch by Ananda Raju <Ananda.Raju@neterion.com>. His original description is: This patch enables TSO over IPv6. Currently Linux network stacks restricts TSO over IPv6 by clearing of the NETIF_F_TSO bit from "dev->features". This patch will remove this restriction. This patch will introduce a new flag NETIF_F_TSO6 which will be used to check whether device supports TSO over IPv6. If device support TSO over IPv6 then we don't clear of NETIF_F_TSO and which will make the TCP layer to create TSO packets. Any device supporting TSO over IPv6 will set NETIF_F_TSO6 flag in "dev->features" along with NETIF_F_TSO. In case when user disables TSO using ethtool, NETIF_F_TSO will get cleared from "dev->features". So even if we have NETIF_F_TSO6 we don't get TSO packets created by TCP layer. SKB_GSO_TCPV4 renamed to SKB_GSO_TCP to make it generic GSO packet. SKB_GSO_UDPV4 renamed to SKB_GSO_UDP as UFO is not a IPv4 feature. UFO is supported over IPv6 also The following table shows there is significant improvement in throughput with normal frames and CPU usage for both normal and jumbo. 1500 (normal): TSO OFF thru 2.00, CPU 5.5% id; TSO ON thru 2.63, CPU 78.0 id. 9600 (jumbo): TSO OFF thru 5.66, CPU 20.0% id; TSO ON thru 5.67, CPU 39.0% id. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-30 14:12:10 -07:00
Herbert Xu	bcd7611117	[NET]: Generalise TSO-specific bits from skb_setup_caps This patch generalises the TSO-specific bits from sk_setup_caps by adding the sk_gso_type member to struct sock. This makes sk_setup_caps generic so that it can be used by TCPv6 or UFO. The only catch is that whoever uses this must provide a GSO implementation for their protocol which I think is a fair deal :) For now UFO continues to live without a GSO implementation which is OK since it doesn't use the sock caps field at the moment. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-30 14:12:08 -07:00
Herbert Xu	adcfc7d0b4	[IPV6]: Added GSO support for TCPv6 This patch adds GSO support for IPv6 and TCPv6. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-30 14:12:06 -07:00
Patrick McHardy	dd7271feba	[NETFILTER]: SCTP conntrack: fix crash triggered by packet without chunks When a packet without any chunks is received, the newconntrack variable in sctp_packet contains an out of bounds value that is used to look up an pointer from the array of timeouts, which is then dereferenced, resulting in a crash. Make sure at least a single chunk is present. Problem noticed by George A. Theall <theall@tenablesecurity.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-30 14:12:01 -07:00
Herbert Xu	3820c3f3e4	[TCP]: Reset gso_segs if packet is dodgy I wasn't paranoid enough in verifying GSO information. A bogus gso_segs could upset drivers as much as a bogus header would. Let's reset it in the per-protocol gso_segment functions. I didn't verify gso_size because that can be verified by the source of the dodgy packets. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-30 14:11:47 -07:00
Jörn Engel	6ab3d5624e	Remove obsolete #include <linux/config.h> Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2006-06-30 19:25:36 +02:00
Matt LaPlante	c22751b73a	[NETFILTER] ipv4: Fix typo (Bugzilla #6753) This patch fixes bugzilla #6753, a typo in the netfilter Kconfig Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:58:28 -07:00
Michael Chan	b0da853703	[NET]: Add ECN support for TSO In the current TSO implementation, NETIF_F_TSO and ECN cannot be turned on together in a TCP connection. The problem is that most hardware that supports TSO does not handle CWR correctly if it is set in the TSO packet. Correct handling requires CWR to be set in the first packet only if it is set in the TSO header. This patch adds the ability to turn on NETIF_F_TSO and ECN using GSO if necessary to handle TSO packets with CWR set. Hardware that handles CWR correctly can turn on NETIF_F_TSO_ECN in the dev-> features flag. All TSO packets with CWR set will have the SKB_GSO_TCPV4_ECN set. If the output device does not have the NETIF_F_TSO_ECN feature set, GSO will split the packet up correctly with CWR only set in the first segment. With help from Herbert Xu <herbert@gondor.apana.org.au>. Since ECN can always be enabled with TSO, the SOCK_NO_LARGESEND sock flag is completely removed. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:58:08 -07:00
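The CWR rule stated in b0da853703 (CWR set in the first segment only) can be sketched as a per-segment flag computation. This is a simplified illustration with hypothetical function names; it models only the CWR bit and ignores every other per-segment adjustment a real segmentation routine makes.

```c
#include <assert.h>
#include <stdint.h>

#define TCP_FLAG_CWR_BIT 0x80u  /* CWR bit in the TCP flags byte */

/* Flags byte for segment `index` (0-based) out of `nsegs` when one large
 * TSO frame is split: only the first segment keeps CWR. */
uint8_t segment_flags(uint8_t tso_flags, unsigned int index, unsigned int nsegs)
{
    assert(index < nsegs);
    if (index == 0)
        return tso_flags;
    return (uint8_t)(tso_flags & ~TCP_FLAG_CWR_BIT);
}

/* Returns 1 when software must do the split because the frame carries CWR
 * and the device does not advertise correct CWR handling. */
int needs_software_split(uint8_t tso_flags, int dev_handles_cwr)
{
    return (tso_flags & TCP_FLAG_CWR_BIT) && !dev_handles_cwr;
}
```

A frame with flags 0x90 (CWR plus ACK) yields 0x90 for the first segment and 0x10 for each later one; on a device that handles CWR itself, no software split is needed.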
Sridhar Samudrala	47da8ee681	[TCP]: Export accept queue len of a TCP listening socket via rx_queue While debugging a TCP server hang issue, we noticed that currently there is no way for a user to get the acceptq backlog value for a TCP listen socket. All the standard networking utilities that display socket info like netstat, ss and /proc/net/tcp have 2 fields called rx_queue and tx_queue. These fields do not mean much for listening sockets. This patch uses one of these unused fields(rx_queue) to export the accept queue len for listening sockets. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:57:57 -07:00
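With 47da8ee681 applied, the accept queue length of a listening socket can be read from the rx_queue column of /proc/net/tcp. The sketch below parses a single data line from that file; the function name is hypothetical, and the parser assumes the usual column order (index, local address, remote address, state, tx_queue:rx_queue) with hexadecimal fields.

```c
#include <assert.h>
#include <stdio.h>

#define TCP_LISTEN_STATE 0x0A  /* "st" column value for listening sockets */

/* Parse one data line of /proc/net/tcp. On a listening socket, store the
 * local port and the rx_queue value (the accept queue length after this
 * patch) and return 0. Return -1 for malformed or non-listening lines. */
int parse_listen_backlog(const char *line, unsigned int *port,
                         unsigned int *backlog)
{
    unsigned int local_port, state, rx_queue;
    int n;

    assert(line && port && backlog);
    n = sscanf(line, " %*d: %*[0-9A-Fa-f]:%x %*[0-9A-Fa-f]:%*x %x %*x:%x",
               &local_port, &state, &rx_queue);
    if (n != 3 || state != TCP_LISTEN_STATE)
        return -1;
    *port = local_port;
    *backlog = rx_queue;
    return 0;
}
```

A caller would skip the header line, feed each remaining line to the parser, and report the port and backlog for every line that returns 0.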
Darrel Goeddel	c7bdb545d2	[NETLINK]: Encapsulate eff_cap usage within security framework. This patch encapsulates the usage of eff_cap (in netlink_skb_params) within the security framework by extending security_netlink_recv to include a required capability parameter and converting all direct usage of eff_caps outside of the lsm modules to use the interface. It also updates the SELinux implementation of the security_netlink_send and security_netlink_recv hooks to take advantage of the sid in the netlink_skb_params struct. This also enables SELinux to perform auditing of netlink capability checks. Please apply, for 2.6.18 if possible. Signed-off-by: Darrel Goeddel <dgoeddel@trustedcs.com> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:57:55 -07:00
Herbert Xu	576a30eb64	[NET]: Added GSO header verification When GSO packets come from an untrusted source (e.g., a Xen guest domain), we need to verify the header integrity before passing it to the hardware. Since the first step in GSO is to verify the header, we can reuse that code by adding a new bit to gso_type: SKB_GSO_DODGY. Packets with this bit set can only be fed directly to devices with the corresponding bit NETIF_F_GSO_ROBUST. If the device doesn't have that bit, then the skb is fed to the GSO engine which will allow the packet to be sent to the hardware if it passes the header check. This patch changes the sg flag to a full features flag. The same method can be used to implement TSO ECN support. We simply have to mark packets with CWR set with SKB_GSO_ECN so that only hardware with a corresponding NETIF_F_TSO_ECN can accept them. The GSO engine can either fully segment the packet, or segment the first MTU and pass the rest to the hardware for further segmentation. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:57:53 -07:00
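The dispatch rule in 576a30eb64 amounts to a mask test: a packet may go straight to the device only if every bit in its gso_type has a matching device feature bit. The sketch below uses made-up bit values and hypothetical names; the kernel's actual constants and helper differ.

```c
#include <assert.h>

/* Illustrative gso_type bits (assumed values, not the kernel's). */
#define GSO_TCPV4 0x1u
#define GSO_DODGY 0x2u
#define GSO_ECN   0x4u

/* Matching device feature bits, placed at the same positions for simplicity. */
#define DEV_TSO        GSO_TCPV4
#define DEV_GSO_ROBUST GSO_DODGY
#define DEV_TSO_ECN    GSO_ECN

enum gso_path { PATH_HARDWARE, PATH_SOFTWARE_GSO };

/* Returns 1 when the device advertises a feature for every gso_type bit. */
int dev_can_take_directly(unsigned int gso_type, unsigned int dev_features)
{
    return (dev_features & gso_type) == gso_type;
}

/* Packets the device cannot take as-is go through the software GSO engine,
 * which checks the headers before anything reaches the hardware. */
enum gso_path choose_path(unsigned int gso_type, unsigned int dev_features)
{
    assert(gso_type != 0);
    return dev_can_take_directly(gso_type, dev_features)
               ? PATH_HARDWARE : PATH_SOFTWARE_GSO;
}
```

An untrusted TCPv4 packet (GSO_TCPV4 | GSO_DODGY) sent to a device that advertises only DEV_TSO takes the software path, while the same packet on a device that also sets DEV_GSO_ROBUST goes directly to hardware.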
Patrick McHardy	ef47c6a7b8	[NETFILTER]: ip_queue/nfnetlink_queue: drop bridge port references when dev disappears When a device that is acting as a bridge port is unregistered, the ip_queue/nfnetlink_queue notifier doesn't check if its one of physindev/physoutdev and doesn't release the references if it is. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:57:48 -07:00
Patrick McHardy	da298d3a4f	[NETFILTER]: x_tables: fix xt_register_table error propagation When xt_register_table fails the error is not properly propagated back. Based on patch by Lepton Wu <ytht.net@gmail.com>. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:57:40 -07:00
Herbert Xu	0718bcc09b	[NET]: Fix CHECKSUM_HW GSO problems. Fix checksum problems in the GSO code path for CHECKSUM_HW packets. The ipv4 TCP pseudo header checksum has to be adjusted for GSO segmented packets. The adjustment is needed because the length field in the pseudo-header changes. However, because we have the inequality oldlen > newlen, we know that delta = (u16)~oldlen + newlen is still a 16-bit quantity. This also means that htonl(delta) + th->check still fits in 32 bits. Therefore we don't have to use csum_add on this operations. This is based on a patch by Michael Chan <mchan@broadcom.com>. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-25 23:55:46 -07:00
Paul Mackerras	bfe5d83419	[PATCH] Define __raw_get_cpu_var and use it There are several instances of per_cpu(foo, raw_smp_processor_id()), which is semantically equivalent to __get_cpu_var(foo) but without the warning that smp_processor_id() can give if CONFIG_DEBUG_PREEMPT is enabled. For those architectures with optimized per-cpu implementations, namely ia64, powerpc, s390, sparc64 and x86_64, per_cpu() turns into more and slower code than __get_cpu_var(), so it would be preferable to use __get_cpu_var on those platforms. This defines a __raw_get_cpu_var(x) macro which turns into per_cpu(x, raw_smp_processor_id()) on architectures that use the generic per-cpu implementation, and turns into __get_cpu_var(x) on the architectures that have an optimized per-cpu implementation. Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-25 10:01:01 -07:00
Andrew Morton	fb1bb34d45	[PATCH] remove for_each_cpu() Convert a few stragglers over to for_each_possible_cpu(), remove for_each_cpu(). Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-25 10:00:54 -07:00
Herbert Xu	09b8f7a93e	[IPSEC]: Handle GSO packets This patch segments GSO packets received by the IPsec stack. This can happen when a NIC driver injects GSO packets into the stack which are then forwarded to another host. The primary application of this is going to be Xen where its backend driver may inject GSO packets into dom0. Of course this also can be used by other virtualisation schemes such as VMWare or UML since the tap device could be modified to inject GSO packets received through splice. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-23 02:07:38 -07:00
Herbert Xu	f4c50d990d	[NET]: Add software TSOv4 This patch adds the GSO implementation for IPv4 TCP. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-23 02:07:33 -07:00
Herbert Xu	7967168cef	[NET]: Merge TSO/UFO fields in sk_buff Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not going to scale if we add any more segmentation methods (e.g., DCCP). So let's merge them. They were used to tell the protocol of a packet. This function has been subsumed by the new gso_type field. This is essentially a set of netdev feature bits (shifted by 16 bits) that are required to process a specific skb. As such it's easy to tell whether a given device can process a GSO skb: you just have to AND the gso_type field with the netdev's features field. I've made gso_type a conjunction. The idea is that you have a base type (e.g., SKB_GSO_TCPV4) that can be modified further to support new features. For example, if we add a hardware TSO type that supports ECN, they would declare NETIF_F_TSO \| NETIF_F_TSO_ECN. All TSO packets with CWR set would have a gso_type of SKB_GSO_TCPV4 \| SKB_GSO_TCPV4_ECN while all other TSO packets would be SKB_GSO_TCPV4. This means that only the CWR packets need to be emulated in software. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-23 02:07:29 -07:00
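The capability test described in the commit above (AND the shifted gso_type against the device's feature word) can be sketched in user-space C. This is an illustration only, not the kernel's code: every `SKETCH_`-prefixed name and every bit position below is an assumption made for the example, with only the 16-bit shift taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values for illustration; only the shift of 16 comes from
 * the commit message. The real kernel bit assignments may differ. */
#define SKETCH_GSO_SHIFT 16u

enum {
    SKETCH_GSO_TCPV4     = 1 << 0, /* base type: TCP over IPv4 */
    SKETCH_GSO_TCPV4_ECN = 1 << 1  /* modifier: packet has CWR set */
};

/* Device feature bits are the gso_type bits moved up by the shift. */
#define SKETCH_F_TSO     ((uint32_t)SKETCH_GSO_TCPV4 << SKETCH_GSO_SHIFT)
#define SKETCH_F_TSO_ECN ((uint32_t)SKETCH_GSO_TCPV4_ECN << SKETCH_GSO_SHIFT)

/* Returns true when the device advertises every feature bit that the
 * packet's gso_type requires (gso_type acts as a conjunction). */
bool sketch_gso_ok(uint32_t dev_features, uint32_t gso_type)
{
    /* The type bits must fit in 16 bits to survive the shift. */
    assert((gso_type >> 16) == 0);

    uint32_t required = gso_type << SKETCH_GSO_SHIFT;
    return (dev_features & required) == required;
}
```

Under these assumptions, a device that declares only `SKETCH_F_TSO` accepts plain TCPv4 packets but rejects those that also carry the ECN modifier, so only the latter would fall back to software segmentation.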
Linus Torvalds	4c84a39c8a	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (46 commits) IB/uverbs: Don't serialize with ib_uverbs_idr_mutex IB/mthca: Make all device methods truly reentrant IB/mthca: Fix memory leak on modify_qp error paths IB/uverbs: Factor out common idr code IB/uverbs: Don't decrement usecnt on error paths IB/uverbs: Release lock on error path IB/cm: Use address handle helpers IB/sa: Add ib_init_ah_from_path() IB: Add ib_init_ah_from_wc() IB/ucm: Get rid of duplicate P_Key parameter IB/srp: Factor out common request reset code IB/srp: Support SRP rev. 10 targets [SCSI] srp.h: Add I/O Class values IB/fmr: Use device's max_map_map_per_fmr attribute in FMR pool. IB/mthca: Fill in max_map_per_fmr device attribute IB/ipath: Add client reregister event generation IB/mthca: Add client reregister event generation IB: Move struct port_info from ipath to <rdma/ib_smi.h> IPoIB: Handle client reregister events IB: Add client reregister event type ...	2006-06-19 19:01:59 -07:00
Herbert Xu	8648b3053b	[NET]: Add NETIF_F_GEN_CSUM and NETIF_F_ALL_CSUM The current stack treats NETIF_F_HW_CSUM and NETIF_F_NO_CSUM identically so we test for them in quite a few places. For the sake of brevity, I'm adding the macro NETIF_F_GEN_CSUM for these two. We also test the disjunct of NETIF_F_IP_CSUM and the other two in various places, for that purpose I've added NETIF_F_ALL_CSUM. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 22:06:05 -07:00
David S. Miller	35089bb203	[TCP]: Add tcp_slow_start_after_idle sysctl. A lot of people have asked for a way to disable tcp_cwnd_restart(), and it seems reasonable to add a sysctl to do that. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:30:53 -07:00
Luca De Cicco	bc726a71d2	[TCP] Westwood: reset RTT min after FRTO RTT_min is updated each time a timeout event occurs in order to cope with hard handovers in wireless scenarios such as UMTS. Signed-off-by: Luca De Cicco <ldecicco@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:30:38 -07:00
Luca De Cicco	b3a92eabe5	[TCP] Westwood: bandwidth filter startup The bandwidth estimate filter is now initialized with the first sample in order to have better performance in the case of small file transfers. Signed-off-by: Luca De Cicco <ldecicco@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:30:36 -07:00
Luca De Cicco	b7d7a9e3c9	[TCP] Westwood: comment fixes Cleanup some comments and add more references Signed-off-by: Luca De Cicco <ldecicco@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:30:34 -07:00
Stephen Hemminger	f61e29018a	[TCP] Westwood: fix first sample Need to update send sequence number tracking after first ack. Rework of patch from Luca De Cicco. Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:30:32 -07:00
Stephen Hemminger	bdeb04c6d9	[NET]: net.ipv4.ip_autoconfig sysctl removal The sysctl net.ipv4.ip_autoconfig is a legacy value that is not used. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:30:30 -07:00
Herbert Xu	364c6badde	[NET]: Clean up skb_linearize The linearisation operation doesn't need to be super-optimised. So we can replace __skb_linearize with __pskb_pull_tail which does the same thing but is more general. Also, most users of skb_linearize end up testing whether the skb is linear or not so it helps to make skb_linearize do just that. Some callers of skb_linearize also use it to copy cloned data, so it's useful to have a new function skb_linearize_cow to copy the data if it's either non-linear or cloned. Last but not least, I've removed the gfp argument since nobody uses it anymore. If it's ever needed we can easily add it back. Misc bugs fixed by this patch: * via-velocity error handling (also, no SG => no frags) Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:30:16 -07:00
Patrick McHardy	bf0857ea32	[NETFILTER]: hashlimit match: fix random initialization hashlimit does: if (!ht->rnd) get_random_bytes(&ht->rnd, 4); ignoring that 0 is also a valid random number. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:30:11 -07:00
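The problem named in the commit above is that a seed value of 0 cannot double as an "unseeded" marker, because 0 is itself a valid random number. One way to avoid that is a separate initialization flag, sketched here in user-space C. The struct, the flag name, and the injected generator are all hypothetical; the commit message does not say how the fix was implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the hash table state; not the kernel struct. */
struct sketch_htable {
    uint32_t rnd;         /* hash seed; any value, including 0, is valid */
    bool rnd_initialized; /* tracks seeding separately from the value */
};

typedef uint32_t (*sketch_rng_fn)(void);

/* Counts calls so the example can show the seed is drawn only once. */
unsigned int sketch_rng_calls;

/* A deliberately unlucky generator that always returns 0. */
uint32_t sketch_zero_rng(void)
{
    sketch_rng_calls++;
    return 0;
}

/* Seeds the table on first use only, even when the seed drawn is 0. */
void sketch_seed_once(struct sketch_htable *ht, sketch_rng_fn rng)
{
    assert(ht != NULL && rng != NULL);

    if (!ht->rnd_initialized) {
        ht->rnd = rng();
        ht->rnd_initialized = true;
    }
}
```

With the original `if (!ht->rnd)` test, a generator that returned 0 would cause the seed to be redrawn on every call; with the flag, calling `sketch_seed_once` twice invokes the generator once.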
Patrick McHardy	2b2283d030	[NETFILTER]: recent match: missing refcnt initialization Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:30:09 -07:00
Patrick McHardy	a0e889bb1b	[NETFILTER]: recent match: fix "sleeping function called from invalid context" create_proc_entry must not be called with locks held. Use a mutex instead to protect data only changed in user context. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:30:07 -07:00
James Morris	7c9728c393	[SECMARK]: Add secmark support to conntrack Add a secmark field to IP and NF conntracks, so that security markings on packets can be copied to their associated connections, and also copied back to packets as required. This is similar to the network mark field currently used with conntrack, although it is intended for enforcement of security policy rather than network policy. Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:30:01 -07:00
James Morris	984bc16cc9	[SECMARK]: Add secmark support to core networking. Add a secmark field to the skbuff structure, to allow security subsystems to place security markings on network packets. This is similar to the nfmark field, except that it is intended for implementing security policy, rather than networking policy. This patch was already acked in principle by Dave Miller. Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:29:57 -07:00
David S. Miller	f86502bfc1	[IPV4] icmp: Kill local 'ip' arg in icmp_redirect(). It is typed wrong, and it's only assigned and used once. So just pass in iph->daddr directly which fixes both problems. Based upon a patch by Alexey Dobriyan. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:29:41 -07:00
Alexey Dobriyan	6d74165350	[IPV4]: Right prototype of __raw_v4_lookup() All users pass 32-bit values as addresses and internally they're compared with 32-bit entities. So, change "laddr" and "raddr" types to __be32. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:29:39 -07:00
Alexey Dobriyan	338fcf9886	[IPV4] igmp: Fixup struct ip_mc_list::multiaddr type All users except two expect 32-bit big-endian value. One is of ->multiaddr = ->multiaddr variety. And last one is "%08lX". Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:29:37 -07:00
David S. Miller	70df2311ee	[TCP]: Fix compile warning in tcp_probe.c The suseconds_t et al. are not necessarily any particular type on every platform, so cast to unsigned long so that we can use one printf format string and avoid warnings across the board Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:29:35 -07:00
Stephen Hemminger	738980ffa6	[TCP]: Limited slow start for Highspeed TCP Implementation of RFC3742 limited slow start. Added as part of the TCP highspeed congestion control module. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:29:33 -07:00
Stephen Hemminger	a42e9d6ce8	[TCP]: TCP Probe congestion window tracing This adds a new module for tracking TCP state variables non-intrusively using kprobes. It has a simple /proc interface that outputs one line for each packet received. A sample usage is to collect congestion window and ssthresh over time graphs. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:29:31 -07:00
Stephen Hemminger	72dc5b9225	[TCP]: Minimum congestion window consolidation. Many of the TCP congestion methods all just use ssthresh as the minimum congestion window on decrease. Rather than duplicating the code, just have that be the default if that handle in the ops structure is not set. Minor behaviour change to TCP compound. It probably wants to use this (ssthresh) as lower bound, rather than ssthresh/2 because the latter causes undershoot on loss. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:29:29 -07:00
Stephen Hemminger	a4ed258495	[TCP]: TCP Compound quad root function The original code did a 64 bit divide directly, which won't work on 32 bit platforms. Rather than doing a 64 bit square root twice, just implement a 4th root function in one pass using Newton's method. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:29:27 -07:00
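As a rough illustration of the one-pass idea in the commit above, here is an integer fourth root computed with Newton's iteration, x' = (3x + a / x^3) / 4, in user-space C. It is a sketch under stated assumptions, not the kernel routine: the function name is hypothetical, and it freely uses 64-bit division, which is exactly what the kernel code had to avoid on 32-bit platforms.

```c
#include <assert.h>
#include <stdint.h>

/* Returns floor(a^(1/4)) using integer Newton iteration.
 * Hypothetical helper for illustration; relies on 64-bit division. */
uint32_t sketch_fourth_root(uint64_t a)
{
    if (a == 0)
        return 0;

    /* Start above the answer: 2^16 exceeds the fourth root of any
     * 64-bit value, and (2^16)^3 = 2^48 still fits in 64 bits. */
    uint64_t x = 65536;

    for (;;) {
        /* Newton step for f(x) = x^4 - a, with truncating division. */
        uint64_t y = (3 * x + a / (x * x * x)) / 4;

        /* Starting from above, the iterates decrease until they reach
         * the floor of the root; the first non-decrease means done. */
        if (y >= x)
            break;
        x = y;
    }

    /* x is at most 65535 here, so x^4 cannot overflow. */
    assert(x * x * x * x <= a);
    return (uint32_t)x;
}
```

Starting from an over-estimate keeps every iterate at or above the true integer root, which is why a simple "stopped decreasing" test is enough to terminate.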
Angelo P. Castellani	f890f92104	[TCP]: TCP Compound congestion control TCP Compound is a sender-side only change to TCP that uses a mixed Reno/Vegas approach to calculate the cwnd. For further details look here: ftp://ftp.research.microsoft.com/pub/tr/TR-2005-86.pdf Signed-off-by: Angelo P. Castellani <angelo.castellani@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:29:25 -07:00
Bin Zhou	76f1017757	[TCP]: TCP Veno congestion control TCP Veno module is a new congestion control module to improve TCP performance over wireless networks. The key innovation in TCP Veno is the enhancement of TCP Reno/Sack congestion control algorithm by using the estimated state of a connection based on TCP Vegas. This scheme significantly reduces "blind" reduction of TCP window regardless of the cause of packet loss. This work is based on the research paper "TCP Veno: TCP Enhancement for Transmission over Wireless Access Networks." C. P. Fu, S. C. Liew, IEEE Journal on Selected Areas in Communication, Feb. 2003. Original paper and many latest research works on veno: http://www.ntu.edu.sg/home/ascpfu/veno/veno.html Signed-off-by: Bin Zhou <zhou0022@ntu.edu.sg> Cheng Peng Fu <ascpfu@ntu.edu.sg> Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:29:23 -07:00
Wong Hoi Sing Edison	7c106d7e78	[TCP]: TCP Low Priority congestion control TCP Low Priority is a distributed algorithm whose goal is to utilize only the excess network bandwidth as compared to the ``fair share`` of bandwidth as targeted by TCP. Available from: http://www.ece.rice.edu/~akuzma/Doc/akuzma/TCP-LP.pdf Original Author: Aleksandar Kuzmanovic <akuzma@northwestern.edu> See http://www-ece.rice.edu/networks/TCP-LP/ for their implementation. As of 2.6.13, Linux supports pluggable congestion control algorithms. Due to the limitation of the API, we make the following changes from the original TCP-LP implementation: o We use newReno in most core CA handling. Only add some checking within cong_avoid. o Error correcting in remote HZ, therefore remote HZ will be continually checked and updated. o Handling calculation of One-Way-Delay (OWD) within rtt_sample, since OWD has a similar meaning to RTT. Also correct the buggy formula. o Handle reaction for Early Congestion Indication (ECI) within pkts_acked, as mentioned within pseudo code. o OWD is handled in relative format, where local time stamp will be in tcp_time_stamp format. Port from 2.4.19 to 2.6.16 as module by: Wong Hoi Sing Edison <hswong3i@gmail.com> Hung Hing Lun <hlhung3i@gmail.com> Signed-off-by: Wong Hoi Sing Edison <hswong3i@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:29:21 -07:00
Alexey Dobriyan	c45fb1089e	[NETFILTER]: PPTP helper: fixup gre_keymap_lookup() return type GRE keys are 16-bit wide. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:29:17 -07:00
Patrick McHardy	ae5b7d8ba2	[NETFILTER]: Add SIP connection tracking helper Add SIP connection tracking helper. Originally written by Christian Hentschel <chentschel@arnet.com.ar>, some cleanup, minor fixes and bidirectional SIP support added by myself. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:29:15 -07:00
Patrick McHardy	e44ab66a75	[NETFILTER]: H.323 helper: replace internal_net_addr parameter by routing-based heuristic Call Forwarding doesn't need to create an expectation if both peers can reach each other without our help. The internal_net_addr parameter lets the user explicitly specify a single network where this is true, but is not very flexible and even fails in the common case that calls will both be forwarded to outside parties and inside parties. Use an optional heuristic based on routing instead. The assumption is that if both the outgoing device and the gateway are equal, both peers can reach each other directly. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:29:13 -07:00
Jing Min Zhao	c0d4cfd96d	[NETFILTER]: H.323 helper: Add support for Call Forwarding Signed-off-by: Jing Min Zhao <zhaojingmin@users.sourceforge.net> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:29:11 -07:00
Patrick McHardy	c952616934	[NETFILTER]: amanda helper: convert to textsearch infrastructure When a port number within a packet is replaced by a differently sized number only the packet is resized, but not the copy of the data. Following port numbers are rewritten based on their offsets within the copy, leading to packet corruption. Convert the amanda helper to the textsearch infrastructure to avoid the copy entirely. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:29:09 -07:00
Patrick McHardy	7d8c501817	[NETFILTER]: FTP helper: search optimization Instead of skipping search entries for the wrong direction simply index them by direction. Based on patch by Pablo Neira <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:29:07 -07:00
Patrick McHardy	695ecea329	[NETFILTER]: SNMP helper: fix debug module param type debug is the debug level, not a bool. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:29:05 -07:00
Patrick McHardy	89f2e21883	[NETFILTER]: ctnetlink: change table dumping not to require an unique ID Instead of using the ID to find out where to continue dumping, take a reference to the last entry dumped and try to continue there. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:29:03 -07:00
Patrick McHardy	3726add766	[NETFILTER]: ctnetlink: fix NAT configuration The current configuration only allows configuring one manip and overloads conntrack status flags with netlink semantics. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:29:01 -07:00
Eric Leblond	997ae831ad	[NETFILTER]: conntrack: add fixed timeout flag in connection tracking Add a flag in a connection status to mark a timeout that is not updated. This makes it possible to have connections that automatically die at a given time. Signed-off-by: Eric Leblond <eric@inl.fr> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:28:59 -07:00
Patrick McHardy	39a27a35c5	[NETFILTER]: conntrack: add sysctl to disable checksumming Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:28:57 -07:00
Patrick McHardy	6442f1cf89	[NETFILTER]: conntrack: don't call helpers for related ICMP messages None of the existing helpers expects to get called for related ICMP packets and some even drop them if they can't parse them. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:28:55 -07:00
Patrick McHardy	404bdbfd24	[NETFILTER]: recent match: replace by rewritten version Replace the unmaintainable ipt_recent match by a rewritten version that should be fully compatible. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:28:53 -07:00
Patrick McHardy	957dc80ac3	[NETFILTER]: x_tables: add SCTP/DCCP support where missing Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:28:47 -07:00
Patrick McHardy	3e72b2fe5b	[NETFILTER]: x_tables: remove some unnecessary casts Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:28:45 -07:00
Herbert Xu	31a4ab9302	[IPSEC] proto: Move transport mode input path into xfrm_mode_transport Now that we have xfrm_mode objects we can move the transport mode specific input decapsulation code into xfrm_mode_transport. This removes duplicate code as well as unnecessary header movement in case of tunnel mode SAs since we will discard the original IP header immediately. This also fixes a minor bug for transport-mode ESP where the IP payload length is set to the correct value minus the header length (with extension headers for IPv6). Of course the other neat thing is that we no longer have to allocate temporary buffers to hold the IP headers for ESP and IPComp. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:28:41 -07:00
Herbert Xu	b59f45d0b2	[IPSEC] xfrm: Abstract out encapsulation modes This patch adds the structure xfrm_mode. It is meant to represent the operations carried out by transport/tunnel modes. By doing this we allow additional encapsulation modes to be added without clogging up the xfrm_input/xfrm_output paths. Candidate modes include 4-to-6 tunnel mode, 6-to-4 tunnel mode, and BEET modes. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:28:39 -07:00
Herbert Xu	546be2405b	[IPSEC] xfrm: Undo afinfo lock proliferation The number of locks used to manage afinfo structures can easily be reduced down to one each for policy and state respectively. This is based on the observation that the write locks are only held by module insertion/removal which are very rare events so there is no need to further differentiate between the insertion of modules like ipv6 versus esp6. The removal of the read locks in xfrm4_policy.c/xfrm6_policy.c might look suspicious at first. However, after you realise that nobody ever takes the corresponding write lock you'll feel better :) As far as I can gather it's an attempt to guard against the removal of the corresponding modules. Since neither module can be unloaded at all we can leave it to whoever fixes up IPv6 unloading :) Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:28:37 -07:00
David S. Miller	15986e1aad	[TCP]: tcp_rcv_rtt_measure_ts() call in pure-ACK path is superfluous We only want to take receive RTT measurements for data bearing frames. Here, in the header prediction fast path for a pure-sender, we know that we have a pure-ACK and thus the checks in tcp_rcv_rtt_measure_ts() will not pass. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:26:16 -07:00
Chris Leech	1a2449a87b	[I/OAT]: TCP recv offload to I/OAT Locks down user pages and sets up for DMA in tcp_recvmsg, then calls dma_async_try_early_copy in tcp_v4_do_rcv Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:25:56 -07:00
Chris Leech	9593782585	[I/OAT]: Add a sysctl for tuning the I/OAT offloaded I/O threshold Any socket recv of less than this amount will not be offloaded. Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:25:54 -07:00
Chris Leech	624d116473	[I/OAT]: Make sk_eat_skb I/OAT aware. Add an extra argument to sk_eat_skb, and make it move early copied packets to the async_wait_queue instead of freeing them. Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:25:52 -07:00
Chris Leech	0e4b4992b8	[I/OAT]: Rename cleanup_rbuf to tcp_cleanup_rbuf and make non-static Needed to be able to call tcp_cleanup_rbuf in tcp_input.c for I/OAT Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:25:50 -07:00
Sean Hefty	a1e8733e55	[NET]: Export ip_dev_find() Export ip_dev_find() to allow locating a net_device given an IP address. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-06-17 20:37:28 -07:00
Weidong	42d1d52e69	[IPV4]: Increment ipInHdrErrors when TTL expires. Signed-off-by: Weidong <weid@nanjing-fnst.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-12 13:09:59 -07:00
Aki M Nyrhinen	79320d7e14	[TCP]: continued: reno sacked_out count fix From: Aki M Nyrhinen <anyrhine@cs.helsinki.fi> IMHO the current fix to the problem (in_flight underflow in reno) is incorrect. it treats the symptoms but ignores the problem. the problem is timing out packets other than the head packet when we don't have sack. i try to explain (sorry if explaining the obvious). with sack, scanning the retransmit queue for timed out packets is fine because we know which packets in our retransmit queue have been acked by the receiver. without sack, we know only how many packets in our retransmit queue the receiver has acknowledged, but no idea which packets. think of a "typical" slow-start overshoot case, where for example every third packet in a window gets lost because a router buffer gets full. with sack, we check for timeouts on those every third packet (as the rest have been sacked). the packet counting works out and if there is no reordering, we'll retransmit exactly the packets that were lost. without sack, however, we check for timeout on every packet and end up retransmitting consecutive packets in the retransmit queue. in our slow-start example, 2/3 of those retransmissions are unnecessary. these unnecessary retransmissions eat the congestion window and eventually prevent fast recovery from continuing, if enough packets were lost. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-11 21:18:56 -07:00
Herbert Xu	f291196979	[TCP]: Avoid skb_pull if possible when trimming head Trimming the head of an skb by calling skb_pull can cause the packet to become unaligned if the length pulled is odd. Since the length is entirely arbitrary for a FIN packet carrying data, this is actually quite common. Unaligned data is not the end of the world, but we should avoid it if it's easily done. In this case it is trivial. Since we're discarding all of the head data it doesn't matter whether we move skb->data forward or back. However, it is still possible to have unaligned skb->data in general. So network drivers should be prepared to handle it instead of crashing. This patch also adds an unlikely marking on len < headlen since partial ACKs on head data are extremely rare in the wild. As the return value of __pskb_trim_head is no longer ever NULL that has been removed. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-05 15:03:37 -07:00
Stephen Hemminger	fb80a6e1a5	[TCP] tcp_highspeed: Fix problem observed by Xiaoliang (David) Wei When snd_cwnd is smaller than 38 and the connection is in congestion avoidance phase (snd_cwnd > snd_ssthresh), the snd_cwnd seems to stop growing. The additive increase was confused because C arrays are 0-based. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-02 17:51:08 -07:00
Alexey Dobriyan	7114b0bb6d	[NETFILTER]: PPTP helper: fix sstate/cstate typo Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-28 22:51:05 -07:00
Patrick McHardy	ca3ba88d0c	[NETFILTER]: mark H.323 helper experimental Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-28 22:50:40 -07:00
Marcel Holtmann	6c813c3fe9	[NETFILTER]: Fix small information leak in SO_ORIGINAL_DST (CVE-2006-1343) It appears that sockaddr_in.sin_zero is not zeroed during getsockopt(...SO_ORIGINAL_DST...) operation. This can lead to an information leak (CVE-2006-1343). Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-28 22:50:18 -07:00
Chris Wright	4a06373913	[NETFILTER]: SNMP NAT: fix memleak in snmp_object_decode If kmalloc fails, error path leaks data allocated from asn1_oid_decode(). Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-23 15:15:13 -07:00
Patrick McHardy	4d942d8b39	[NETFILTER]: H.323 helper: fix sequence extension parsing When parsing unknown sequence extensions the "son"-pointer points behind the last known extension for this type, don't try to interpret it. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-23 15:15:10 -07:00
Patrick McHardy	7185989db4	[NETFILTER]: H.323 helper: fix parser error propagation The condition "> H323_ERROR_STOP" can never be true since H323_ERROR_STOP is positive and is the highest possible return code, while real errors are negative, fix the checks. Also only abort on real errors in some spots that were just interpreting any return value != 0 as error. Fixes crashes caused by use of stale data after a parsing error occurred: BUG: unable to handle kernel paging request at virtual address bfffffff printing eip: c01aa0f8 pde = 1a801067 pte = 00000000 Oops: 0000 [#1] PREEMPT Modules linked in: ip_nat_h323 ip_conntrack_h323 nfsd exportfs sch_sfq sch_red cls_fw sch_hfsc xt_length ipt_owner xt_MARK iptable_mangle nfs lockd sunrpc pppoe pppoxx CPU: 0 EIP: 0060:[<c01aa0f8>] Not tainted VLI EFLAGS: 00210646 (2.6.17-rc4 #8) EIP is at memmove+0x19/0x22 eax: d77264e9 ebx: d77264e9 ecx: e88d9b17 edx: d77264e9 esi: bfffffff edi: bfffffff ebp: de6a7680 esp: c0349db8 ds: 007b es: 007b ss: 0068 Process asterisk (pid: 3765, threadinfo=c0349000 task=da068540) Stack: <0>00000006 c0349e5e d77264e3 e09a2b4e e09a38a0 d7726052 d7726124 00000491 00000006 00000006 00000006 00000491 de6a7680 d772601e d7726032 c0349f74 e09a2dc2 00000006 c0349e5e 00000006 00000000 d76dda28 00000491 c0349f74 Call Trace: [<e09a2b4e>] mangle_contents+0x62/0xfe [ip_nat] [<e09a2dc2>] ip_nat_mangle_tcp_packet+0xa1/0x191 [ip_nat] [<e0a2712d>] set_addr+0x74/0x14c [ip_nat_h323] [<e0ad531e>] process_setup+0x11b/0x29e [ip_conntrack_h323] [<e0ad534f>] process_setup+0x14c/0x29e [ip_conntrack_h323] [<e0ad57bd>] process_q931+0x3c/0x142 [ip_conntrack_h323] [<e0ad5dff>] q931_help+0xe0/0x144 [ip_conntrack_h323] ... Found by the PROTOS c07-h2250v4 testsuite. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-23 15:15:08 -07:00
Patrick McHardy	f41d5bb1d9	[NETFILTER]: SNMP NAT: fix memory corruption Fix memory corruption caused by snmp_trap_decode: - When snmp_trap_decode fails before the id and address are allocated, the pointers contain random memory, but are freed by the caller (snmp_parse_mangle). - When snmp_trap_decode fails after allocating just the ID, it tries to free both address and ID, but the address pointer still contains random memory. The caller frees both ID and random memory again. - When snmp_trap_decode fails after allocating both, it frees both, and the caller frees both again. The corruption can be triggered remotely when the ip_nat_snmp_basic module is loaded and traffic on port 161 or 162 is NATed. Found by multiple testcases of the trap-app and trap-enc groups of the PROTOS c06-snmpv1 testsuite. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-22 16:55:14 -07:00
Alexey Dobriyan	4195f81453	[NET]: Fix "ntohl(ntohs" bugs Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-22 16:53:22 -07:00
Solar Designer	2c8ac66bb2	[NETFILTER]: Fix do_add_counters race, possible oops or info leak (CVE-2006-0039) Solar Designer found a race condition in do_add_counters(). The beginning of paddc is supposed to be the same as tmp which was sanity-checked above, but it might not be the same in reality. In case the integer overflow and/or the race condition are triggered, paddc->num_counters might not match the allocation size for paddc. If the check below (t->private->number != paddc->num_counters) nevertheless passes (perhaps this requires the race condition to be triggered), IPT_ENTRY_ITERATE() would read kernel memory beyond the allocation size, potentially causing an oops or leaking sensitive data (e.g., passwords from host system or from another VPS) via counter increments. This requires CAP_NET_ADMIN. Signed-off-by: Solar Designer <solar@openwall.com> Signed-off-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-19 02:16:52 -07:00
Alexey Dobriyan	a467704dcb	[NETFILTER]: GRE conntrack: fix htons/htonl confusion GRE keys are 16 bit. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-19 02:16:29 -07:00
Philip Craig	5c170a09d9	[NETFILTER]: fix format specifier for netfilter log targets The prefix argument for nf_log_packet is a format specifier, so don't pass the user defined string directly to it. Signed-off-by: Philip Craig <philipc@snapgear.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-19 02:15:47 -07:00
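The format-specifier problem above is the classic "user string used as format" pattern. A minimal userspace sketch, using standard `snprintf` rather than the kernel's `nf_log_packet` (whose signature is not reproduced here); `format_prefix` is a hypothetical helper:

```c
#include <stddef.h>
#include <stdio.h>

/* Copy a user-supplied prefix into out. The prefix is passed as an
 * argument to a fixed "%s" format, so any conversion characters it
 * contains are treated as plain text and never interpreted. */
static int format_prefix(char *out, size_t n, const char *user_prefix)
{
    return snprintf(out, n, "%s", user_prefix);
}
```

Calling `snprintf(out, n, user_prefix)` instead would let a prefix such as `"%d%n"` drive the formatter, which is the situation the commit avoids.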
Jesper Juhl	493e2428aa	[NETFILTER]: Fix memory leak in ipt_recent The Coverity checker spotted that we may leak 'hold' in net/ipv4/netfilter/ipt_recent.c::checkentry() when the following is true: if (!curr_table->status_proc) { ... if(!curr_table) { ... return 0; <-- here we leak. Simply moving an existing vfree(hold); up a bit avoids the possible leak. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-19 02:15:13 -07:00
Angelo P. Castellani	8872d8e1c4	[TCP]: reno sacked_out count fix From: "Angelo P. Castellani" <angelo.castellani+lkml@gmail.com> Using NewReno, if a sk_buff is timed out and is accounted as lost_out, it should also be removed from the sacked_out. This is necessary because recovery using NewReno fast retransmit could take many RTTs and the sk_buff RTO can expire without actually being really lost. left_out = sacked_out + lost_out in_flight = packets_out - left_out + retrans_out Using NewReno without this patch, on very large network losses, left_out becomes bigger than packets_out + retrans_out (!!). For this reason unsigned integer in_flight overflows to 2^32 - something. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-16 21:42:11 -07:00
Wei Yongjun	63cbd2fda3	[IPV4]: ip_options_fragment() has no effect on fragmentation Fix the options pointer in ip_options_fragment(): optptr wrongly pointed at the IPv4 header; it should point at the IPv4 options. Signed-off-by: Wei Yongjun <weiyj@soft.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-09 15:18:50 -07:00
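The two formulas in the NewReno message can be worked through in unsigned arithmetic to see the wraparound it mentions. This is only a sketch: `struct reno_counters` and `in_flight` are hypothetical stand-ins for the kernel's per-socket fields, and 32-bit counters are assumed from the "2^32 - something" remark.

```c
#include <stdint.h>

/* Hypothetical container for the counters named in the message. */
struct reno_counters {
    uint32_t packets_out;
    uint32_t sacked_out;
    uint32_t lost_out;
    uint32_t retrans_out;
};

/* left_out  = sacked_out + lost_out
 * in_flight = packets_out - left_out + retrans_out
 * All arithmetic is unsigned, so a left_out larger than
 * packets_out + retrans_out wraps modulo 2^32. */
static uint32_t in_flight(const struct reno_counters *c)
{
    uint32_t left_out = c->sacked_out + c->lost_out;

    return c->packets_out - left_out + c->retrans_out;
}
```

For example, with 10 packets out, 8 counted as sacked and 5 as lost, `left_out` is 13 and the result wraps to 2^32 - 3 instead of a small non-negative count.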
Hua Zhong	0182bd2b1e	[IPV4]: Remove likely in ip_rcv_finish() This is another result from my likely profiling tool (dwalker@mvista.com just sent the patch of the profiling tool to linux-kernel mailing list, which is similar to what I use). On my system (not very busy, normal development machine within a VMWare workstation), I see a 6/5 miss/hit ratio for this "likely". Signed-off-by: Hua Zhong <hzhong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-06 18:11:39 -07:00
John Heffner	5528e568a7	[TCP]: Fix snd_cwnd adjustments in tcp_highspeed.c Xiaoliang (David) Wei wrote: > Hi gurus, > > I am reading the code of tcp_highspeed.c in the kernel and have a > question on the hstcp_cong_avoid function, specifically the following > AI part (line 136~143 in net/ipv4/tcp_highspeed.c ): > > /* Do additive increase */ > if (tp->snd_cwnd < tp->snd_cwnd_clamp) { > tp->snd_cwnd_cnt += ca->ai; > if (tp->snd_cwnd_cnt >= tp->snd_cwnd) { > tp->snd_cwnd++; > tp->snd_cwnd_cnt -= tp->snd_cwnd; > } > } > > In this part, when (tp->snd_cwnd_cnt == tp->snd_cwnd), > snd_cwnd_cnt will be -1... snd_cwnd_cnt is defined as u16, will this > small chance of getting -1 becomes a problem? > Shall we change it by reversing the order of the cwnd++ and cwnd_cnt -= > cwnd? Absolutely correct. Thanks. Signed-off-by: John Heffner <jheffner@psc.edu> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-05 17:41:44 -07:00
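The ordering issue raised in the quoted mail can be reproduced in isolation. The sketch below assumes the field widths stated there (`snd_cwnd_cnt` as a 16-bit unsigned value) and omits the `snd_cwnd_clamp` test; `struct cwnd_state`, `ai_buggy` and `ai_fixed` are hypothetical names.

```c
#include <stdint.h>

struct cwnd_state {
    uint32_t snd_cwnd;
    uint16_t snd_cwnd_cnt;   /* u16, as stated in the quoted mail */
};

/* Order from the quoted code: increment the window first, then
 * subtract the already-incremented window from the counter. */
static void ai_buggy(struct cwnd_state *s, uint16_t ai)
{
    s->snd_cwnd_cnt += ai;
    if (s->snd_cwnd_cnt >= s->snd_cwnd) {
        s->snd_cwnd++;
        s->snd_cwnd_cnt -= s->snd_cwnd;   /* wraps when cnt == old cwnd */
    }
}

/* Reversed order, as the mail suggests: subtract the old window,
 * then increment it. */
static void ai_fixed(struct cwnd_state *s, uint16_t ai)
{
    s->snd_cwnd_cnt += ai;
    if (s->snd_cwnd_cnt >= s->snd_cwnd) {
        s->snd_cwnd_cnt -= s->snd_cwnd;
        s->snd_cwnd++;
    }
}
```

Starting from a window of 10 and a counter of 9 with an increment of 1, the first variant leaves the counter at 65535 while the second leaves it at 0; both end with a window of 11.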
Herbert Xu	75c2d9077c	[TCP]: Fix sock_orphan dead lock Calling sock_orphan inside bh_lock_sock in tcp_close can lead to dead locks. For example, the inet_diag code holds sk_callback_lock without disabling BH. If an inbound packet arrives during that admittedly tiny window, it will cause a dead lock on bh_lock_sock. Another possible path would be through sock_wfree if the network device driver frees the tx skb in process context with BH enabled. We can fix this by moving sock_orphan out of bh_lock_sock. The tricky bit is to work out when we need to destroy the socket ourselves and when it has already been destroyed by someone else. By moving sock_orphan before the release_sock we can solve this problem. This is because as long as we own the socket lock its state cannot change. So we simply record the socket state before the release_sock and then check the state again after we regain the socket lock. If the socket state has transitioned to TCP_CLOSE in the time being, we know that the socket has been destroyed. Otherwise the socket is still ours to keep. Note that I've also moved the increment on the orphan count forward. This may look like a problem as we're increasing it even if the socket is just about to be destroyed where it'll be decreased again. However, this simply enlarges a window that already exists. This also changes the orphan count test by one. Considering what the orphan count is meant to do this is no big deal. This problem was discovered by Ingo Molnar using his lock validator. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-03 23:31:35 -07:00
Patrick McHardy	7800007c1e	[NETFILTER]: x_tables: don't use __copy_{from,to}_user on unchecked memory in compat layer Noticed by Linus Torvalds <torvalds@osdl.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-03 23:20:27 -07:00
Jing Min Zhao	7582e9d17e	[NETFILTER]: H.323 helper: Change author's email address Signed-off-by: Jing Min Zhao <zhaojingmin@users.sourceforge.net> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-03 23:19:59 -07:00
Patrick McHardy	2354feaeb2	[NETFILTER]: NAT: silence unused variable warnings with CONFIG_XFRM=n net/ipv4/netfilter/ip_nat_standalone.c: In function 'ip_nat_out': net/ipv4/netfilter/ip_nat_standalone.c:223: warning: unused variable 'ctinfo' net/ipv4/netfilter/ip_nat_standalone.c:222: warning: unused variable 'ct' Surprisingly no complaints so far .. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-03 23:19:26 -07:00
Patrick McHardy	4228e2a989	[NETFILTER]: H.323 helper: fix use of uninitialized data When a Choice element contains an unsupported choice no error is returned and parsing continues normally, but the choice value is not set and contains data from the last parsed message. This may in turn lead to parsing of more stale data and following crashes. Fixes a crash triggered by testcase 0003243 from the PROTOS c07-h2250v4 testsuite following random other testcases: CPU: 0 EIP: 0060:[<c01a9554>] Not tainted VLI EFLAGS: 00210646 (2.6.17-rc2 #3) EIP is at memmove+0x19/0x22 eax: d7be0307 ebx: d7be0307 ecx: e841fcf9 edx: d7be0307 esi: bfffffff edi: bfffffff ebp: da5eb980 esp: c0347e2c ds: 007b es: 007b ss: 0068 Process events/0 (pid: 4, threadinfo=c0347000 task=dff86a90) Stack: <0>00000006 c0347ea6 d7be0301 e09a6b2c 00000006 da5eb980 d7be003e d7be0052 c0347f6c e09a6d9c 00000006 c0347ea6 00000006 00000000 d7b9a548 00000000 c0347f6c d7b9a548 00000004 e0a1a119 0000028f 00000006 c0347ea6 00000006 Call Trace: [<e09a6b2c>] mangle_contents+0x40/0xd8 [ip_nat] [<e09a6d9c>] ip_nat_mangle_tcp_packet+0xa1/0x191 [ip_nat] [<e0a1a119>] set_addr+0x60/0x14d [ip_nat_h323] [<e0ab6e66>] q931_help+0x2da/0x71a [ip_conntrack_h323] [<e0ab6e98>] q931_help+0x30c/0x71a [ip_conntrack_h323] [<e09af242>] ip_conntrack_help+0x22/0x2f [ip_conntrack] [<c022934a>] nf_iterate+0x2e/0x5f [<c025d357>] xfrm4_output_finish+0x0/0x39f [<c02294ce>] nf_hook_slow+0x42/0xb0 [<c025d357>] xfrm4_output_finish+0x0/0x39f [<c025d732>] xfrm4_output+0x3c/0x4e [<c025d357>] xfrm4_output_finish+0x0/0x39f [<c0230370>] ip_forward+0x1c2/0x1fa [<c022f417>] ip_rcv+0x388/0x3b5 [<c02188f9>] netif_receive_skb+0x2bc/0x2ec [<c0218994>] process_backlog+0x6b/0xd0 [<c021675a>] net_rx_action+0x4b/0xb7 [<c0115606>] __do_softirq+0x35/0x7d [<c0104294>] do_softirq+0x38/0x3f Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-03 23:17:11 -07:00
Patrick McHardy	6fd737031e	[NETFILTER]: H.323 helper: fix endless loop caused by invalid TPKT len When the TPKT len included in the packet is below the lowest valid value of 4 an underflow occurs which results in an endless loop. Found by testcase 0000058 from the PROTOS c07-h2250v4 testsuite. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-03 23:16:29 -07:00
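The TPKT underflow can be sketched as a header check that rejects lengths below the 4-byte minimum the message names before any subtraction happens. The header layout assumed here (version byte 3, a reserved byte, then a 16-bit big-endian length that includes the header) follows RFC 1006; `tpkt_payload_len` and `TPKT_HDR_LEN` are hypothetical names, not the helper's own code.

```c
#include <stddef.h>
#include <stdint.h>

#define TPKT_HDR_LEN 4u   /* lowest valid TPKT length, per the message */

/* Validate a TPKT header at buf (avail bytes readable).
 * On success stores the payload length and returns 0; returns -1 if
 * the header is short, has the wrong version, or carries a length
 * that is below the header size or beyond the available data. */
static int tpkt_payload_len(const uint8_t *buf, size_t avail, size_t *payload)
{
    size_t len;

    if (avail < TPKT_HDR_LEN || buf[0] != 3)
        return -1;

    len = ((size_t)buf[2] << 8) | buf[3];
    if (len < TPKT_HDR_LEN || len > avail)
        return -1;   /* without this, len - 4 would underflow */

    *payload = len - TPKT_HDR_LEN;
    return 0;
}
```

A caller that loops over consecutive TPKT units can then advance by `TPKT_HDR_LEN + *payload`, which is always at least 4, so the loop makes progress.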
Patrick McHardy	e17df688f7	[NETFILTER] SCTP conntrack: fix infinite loop fix infinite loop in the SCTP-netfilter code: check SCTP chunk size to guarantee progress of for_each_sctp_chunk(). (all other uses of for_each_sctp_chunk() are preceded by do_basic_checks(), so this fix should be complete.) Based on patch from Ingo Molnar <mingo@elte.hu> CVE-2006-1527 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-05-02 17:26:39 -07:00
Patrick McHardy	46c5ea3c9a	[NETFILTER] x_tables: fix compat related crash on non-x86 When iptables userspace adds an ipt_standard_target, it calculates the size of the entire entry as: sizeof(struct ipt_entry) + XT_ALIGN(sizeof(struct ipt_standard_target)) ipt_standard_target looks like this: struct xt_standard_target { struct xt_entry_target target; int verdict; }; xt_entry_target contains a pointer, so when compiled for 64 bit the structure gets an extra 4 byte of padding at the end. On 32 bit architectures where iptables aligns to 8 byte it will also have 4 byte padding at the end because it is only 36 bytes large. The compat_ipt_standard_fn in the kernel adjusts the offsets by sizeof(struct ipt_standard_target) - sizeof(struct compat_ipt_standard_target), which will always result in 4, even if the structure from userspace was already padded to a multiple of 8. On x86 this works out by accident because userspace only aligns to 4, on all other architectures this is broken and causes incorrect adjustments to the size and following offsets. Thanks to Linus for lots of debugging help and testing. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-05-01 20:48:32 -07:00
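The size arithmetic in the x_tables message comes down to rounding a structure size up to an alignment boundary. A minimal sketch, assuming a power-of-two alignment; `align_up` is a hypothetical helper standing in for the role `XT_ALIGN` plays in the message, not its actual definition.

```c
#include <stddef.h>

/* Round size up to the next multiple of align.
 * Assumes align is a power of two. */
static size_t align_up(size_t size, size_t align)
{
    return (size + align - 1) & ~(align - 1);
}
```

Using the 36-byte figure from the message: aligning to 8 gives 40 (4 bytes of trailing padding), while aligning to 4 leaves it at 36. A fixed adjustment of 4 is therefore only right when the userspace side did not already pad to a multiple of 8.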
Hua Zhong	83de47cd0c	[TCP]: Fix unlikely usage in tcp_transmit_skb() The following unlikely should be replaced by likely because the condition happens every time unless there is a hard error to transmit a packet. Signed-off-by: Hua Zhong <hzhong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-29 18:33:19 -07:00
Herbert Xu	a76e07acd0	[IPSEC]: Fix IP ID selection I was looking through the xfrm input/output code in order to abstract out the address family specific encapsulation/decapsulation code. During that process I found this bug in the IP ID selection code in xfrm4_output.c. At that point dst is still the xfrm_dst for the current SA which represents an internal flow as far as the IPsec tunnel is concerned. Since the IP ID is going to sit on the outside of the encapsulated packet, we obviously want the external flow which is just dst->child. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-29 18:33:16 -07:00
Heiko Carstens	a536e07787	[IPV4]: inet_init() -> fs_initcall Convert inet_init to an fs_initcall to make sure it's called before any device driver's initcall. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-29 18:33:14 -07:00
Thomas Voegtle	44adf28f4a	[NETFILTER]: ULOG target is not obsolete The backend part is obsoleted, but the target itself is still needed. Signed-off-by: Thomas Voegtle <tv@lio96.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-24 17:27:29 -07:00
Herbert Xu	b60b49ea6a	[TCP]: Account skb overhead in tcp_fragment Make sure that we get the full sizeof(struct sk_buff) plus the data size accounted for in skb->truesize. This will create invariants that will allow adding assertion checks on skb->truesize. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-19 21:35:00 -07:00
Jesper Juhl	63903ca6af	[NET]: Remove redundant NULL checks before [kv]free Redundant NULL check before kfree removal from net/ Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-18 15:57:55 -07:00
Herbert Xu	ef5cb9738b	[TCP]: Fix truesize underflow There is a problem with the TSO packet trimming code. The cause of this lies in the tcp_fragment() function. When we allocate a fragment for a completely non-linear packet the truesize is calculated for a payload length of zero. This means that truesize could in fact be less than the real payload length. When that happens the TSO packet trimming can cause truesize to become negative. This in turn can cause sk_forward_alloc to be -n * PAGE_SIZE which would trigger the warning. I've copied the code DaveM used in tso_fragment which should work here. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-18 15:57:49 -07:00
Stephen Hemminger	d2c962b853	[IPV4]: ip_route_input panic fix This fixes http://bugzilla.kernel.org/show_bug.cgi?id=6388 The bug is caused by ip_route_input dereferencing skb->nh.protocol of the dummy skb passed down from inet_rtm_getroute (Thanks Thomas for seeing it). It only happens if the route requested is for a multicast IP address. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-17 17:27:11 -07:00
Zach Brown	3d9dd7564d	[PATCH] ip_output: account for fraggap when checking to add trailer_len During other work I noticed that ip_append_data() seemed to be forgetting to include the frag gap in its calculation of a fragment that consumes the rest of the payload. Herbert confirmed that this was a bug that snuck in during a previous rework. Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-14 16:04:18 -07:00
Adrian Bunk	6c97e72a16	[IPV4]: Possible cleanups. This patch contains the following possible cleanups: - make the following needlessly global function static: - arp.c: arp_rcv() - remove the following unused EXPORT_SYMBOL's: - devinet.c: devinet_ioctl - fib_frontend.c: ip_rt_ioctl - inet_hashtables.c: inet_bind_bucket_create - inet_hashtables.c: inet_bind_hash - tcp_input.c: sysctl_tcp_abc - tcp_ipv4.c: sysctl_tcp_tw_reuse - tcp_output.c: sysctl_tcp_mtu_probing - tcp_output.c: sysctl_tcp_base_mss Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-14 15:00:20 -07:00
KAMEZAWA Hiroyuki	6f91204225	[PATCH] for_each_possible_cpu: network codes for_each_cpu() actually iterates across all possible CPUs. We've had mistakes in the past where people were using for_each_cpu() where they should have been iterating across only online or present CPUs. This is inefficient and possibly buggy. We're renaming for_each_cpu() to for_each_possible_cpu() to avoid this in the future. This patch replaces for_each_cpu with for_each_possible_cpu under /net Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:31 -07:00
David S. Miller	55c0022e53	[IPV4] ip_fragment: Always compute hash with ipfrag_lock held. Otherwise we could compute an inaccurate hash due to the random seed changing. Noticed by Zach Brown and patch is based upon some feedback from Herbert Xu. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:43:55 -07:00
Patrick McHardy	19910d1aec	[NETFILTER]: Fix DNAT in LOCAL_OUT Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:38:29 -07:00
Patrick McHardy	7a43c99551	[NETFILTER]: H.323 helper: remove changelog Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:25:43 -07:00
Patrick McHardy	96f6bf82ea	[NETFILTER]: Convert conntrack/ipt_REJECT to new checksumming functions Besides removing lots of duplicate code, all converted users benefit from improved HW checksum error handling. Tested with and without HW checksums in almost all combinations. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:25:42 -07:00
Patrick McHardy	422c346fad	[NETFILTER]: Add address family specific checksum helpers Add checksum operation which takes care of verifying the checksum and dealing with HW checksum errors and avoids multiple checksum operations by setting ip_summed to CHECKSUM_UNNECESSARY after successful verification. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:25:41 -07:00

1281 commits