Commit graph

277 commits

Author SHA1 Message Date
Stephen Hemminger
48bc41a49c [IPV4]: Reassembly trim not clearing CHECKSUM_HW
This was found by inspection while looking for checksum problems
with the skge driver that sets CHECKSUM_HW. It did not fix the
problem, but it looks like it is needed.

If IP reassembly is trimming an overlapping fragment, it
should reset (or adjust) the hardware checksum flag on the skb.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 15:51:48 -07:00
Patrick McHardy
e446639939 [NETFILTER]: Missing unlock in TCP connection tracking error path
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 15:11:10 -07:00
Pablo Neira Ayuso
49719eb355 [NETFILTER]: kill __ip_ct_expect_unlink_destroy
The following patch kills __ip_ct_expect_unlink_destroy and export
unlink_expect as ip_ct_unlink_expect. As it was discussed [1], the function
__ip_ct_expect_unlink_destroy is a bit confusing so better do the following
sequence: ip_ct_destroy_expect and ip_conntrack_expect_put.

[1] https://lists.netfilter.org/pipermail/netfilter-devel/2005-August/020794.html

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 15:10:46 -07:00
Pablo Neira Ayuso
91c46e2e60 [NETFILTER]: Don't increase master refcount on expectations
As it's been discussed [1][2]. We shouldn't increase the master conntrack
refcount for non-fulfilled conntracks. During the conntrack destruction,
the expectations are always killed before the conntrack itself, this
guarantees that there won't be any orphan expectation.

[1]https://lists.netfilter.org/pipermail/netfilter-devel/2005-August/020783.html
[2]https://lists.netfilter.org/pipermail/netfilter-devel/2005-August/020904.html

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 15:10:23 -07:00
Patrick McHardy
03486a4f83 [NETFILTER]: Handle NAT module load race
When the NAT module is loaded when connections are already confirmed
it must not change their tuples anymore. This is especially important
with CONFIG_NETFILTER_DEBUG, the netfilter listhelp functions will
refuse to remove an entry from a list when it can not be found on
the list, so when a changed tuple hashes to a new bucket the entry
is kept in the list until and after the conntrack is freed.

Allocate the exact conntrack tuple for NAT for already confirmed
connections or drop them if that fails.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 15:09:43 -07:00
Yasuyuki Kozakai
31c913e7fd [NETFILTER]: Fix CONNMARK Kconfig dependency
Connection mark tracking support is one of the feature in connection
tracking, so IP_NF_CONNTRACK_MARK depends on IP_NF_CONNTRACK.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 15:09:20 -07:00
Patrick McHardy
a2978aea39 [NETFILTER]: Add NetBIOS name service helper
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 15:08:51 -07:00
Patrick McHardy
2248bcfcd8 [NETFILTER]: Add support for permanent expectations
A permanent expectation exists until timeing out and can expect
multiple related connections.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 15:06:42 -07:00
Herbert Xu
fb5f5e6e0c [TCP]: Fix TCP_OFF() bug check introduced by previous change.
The TCP_OFF assignment at the bottom of that if block can indeed set
TCP_OFF without setting TCP_PAGE.  Since there is not much to be
gained from avoiding this situation, we might as well just zap the
offset.  The following patch should fix it.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-05 18:55:48 -07:00
Adrian Bunk
43d60661ac [IPV4]: net/ipv4/ipconfig.c should #include <linux/nfs_fs.h>
Every file should #include the header files containing the prototypes of 
it's global functions.

nfs_fs.h contains the prototype of root_nfs_parse_addr().

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-05 18:05:52 -07:00
David S. Miller
6475be16fd [TCP]: Keep TSO enabled even during loss events.
All we need to do is resegment the queue so that
we record SACK information accurately.  The edges
of the SACK blocks guide our resegmenting decisions.

With help from Herbert Xu.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-01 22:47:01 -07:00
Herbert Xu
ef01578615 [TCP]: Fix sk_forward_alloc underflow in tcp_sendmsg
I've finally found a potential cause of the sk_forward_alloc underflows
that people have been reporting sporadically.

When tcp_sendmsg tacks on extra bits to an existing TCP_PAGE we don't
check sk_forward_alloc even though a large amount of time may have
elapsed since we allocated the page.  In the mean time someone could've
come along and liberated packets and reclaimed sk_forward_alloc memory.

This patch makes tcp_sendmsg check sk_forward_alloc every time as we
do in do_tcp_sendpages.
 
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-01 17:48:59 -07:00
Herbert Xu
d80d99d643 [NET]: Add sk_stream_wmem_schedule
This patch introduces sk_stream_wmem_schedule as a short-hand for
the sk_forward_alloc checking on egress.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-01 17:48:23 -07:00
Jesper Juhl
573dbd9596 [CRYPTO]: crypto_free_tfm() callers no longer need to check for NULL
Since the patch to add a NULL short-circuit to crypto_free_tfm() went in,
there's no longer any need for callers of that function to check for NULL.
This patch removes the redundant NULL checks and also a few similar checks
for NULL before calls to kfree() that I ran into while doing the
crypto_free_tfm bits.

I've succesfuly compile tested this patch, and a kernel with the patch 
applied boots and runs just fine.

When I posted the patch to LKML (and other lists/people on Cc) it drew the
following comments :

 J. Bruce Fields commented
  "I've no problem with the auth_gss or nfsv4 bits.--b."

 Sridhar Samudrala said
  "sctp change looks fine."

 Herbert Xu signed off on the patch.

So, I guess this is ready to be dropped into -mm and eventually mainline.

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-01 17:44:29 -07:00
KOVACS Krisztian
5170dbebbb [NETFILTER]: CLUSTERIP: fix memcpy() length typo
Fix a trivial typo in clusterip_config_init().

Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-01 17:44:06 -07:00
Harald Welte
5f2c3b9107 [NETFILTER]: Add new iptables TTL target
This new iptables target allows manipulation of the TTL of an IPv4 packet.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:13:22 -07:00
Eric Dumazet
ba89966c19 [NET]: use __read_mostly on kmem_cache_t , DEFINE_SNMP_STAT pointers
This patch puts mostly read only data in the right section
(read_mostly), to help sharing of these data between CPUS without
memory ping pongs.

On one of my production machine, tcp_statistics was sitting in a
heavily modified cache line, so *every* SNMP update had to force a
reload.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:11:18 -07:00
David S. Miller
29cb9f9c55 [LIB]: Make TEXTSEARCH_BM plain tristate like the others
And select it when the relevant modules are enabled.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:11:11 -07:00
Robert Olsson
2373ce1ca0 [IPV4]: Convert FIB Trie to RCU.
* Removes RW-lock
* Proteced read functions uses
  rcu_dereference proteced with rcu_read_lock()
* writing of procted pointer w. rcu_assigen_pointer
* Insert/Replace atomic list_replace_rcu
* A BUG_ON condition removed.in trie_rebalance

With help from Paul E. McKenney.

Signed-off-by: Robert Olsson <Robert.Olsson@data.slu.se>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:09:03 -07:00
Robert Olsson
e5b4376074 [IPV4]: Prepare FIB core for RCU.
* RCU versions of hlist_***_rcu
* fib_alias partial rcu port just whats needed now.

Signed-off-by: Robert Olsson <Robert.Olsson@data.slu.se>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:08:31 -07:00
Ralf Baechle
3625796806 [IPV4]: Module export of ip_rcv() no longer needed.
With ip_rcv nowhere outside the IP stack being used anymore it's
EXPORT_SYMBOL is not needed any longer either.

Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:08:23 -07:00
Stephen Hemminger
0c7770c740 [IPV4]: FIB trie cleanup
This is a redo of earlier cleanup stuff:
	* replace DBG() macro with pr_debug()
	* get rid of duplicate extern's that are already in fib_lookup.h
	* use BUG_ON and WARN_ON
	* don't use BUG checks for null pointers where next statement would
	  get a fault anyway
	* remove debug printout when rebalance causes deep tree
	* remove trailing blanks

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:08:01 -07:00
Arnaldo Carvalho de Melo
dc40c7bc76 [ICSK]: Generalise tcp_listen_poll
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:05:32 -07:00
Patrick McHardy
05465343bf [NETFILTER]: Add goto target
Originally written by Henrik Nordstrom <hno@marasystems.com>, taken
from netfilter patch-o-matic and added ip6_tables support.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:04:18 -07:00
Pablo Neira Ayuso
7567662ba8 [NETFILTER]: Add string match
Signed-off-by: Pablo Neira Ayuso <pablo@eurodev.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:04:07 -07:00
Thomas Graf
33d043d65b [IPV4]: ip_finish_output() can be inlined
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:03:10 -07:00
Thomas Graf
9070683bda [IPV4]: Remove some dead code from ip_forward()
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:03:06 -07:00
Thomas Graf
3e192beaf5 [IPV4]: Avoid common branch mispredictions in ip_rcv_finish()
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:03:03 -07:00
Thomas Graf
d245407e75 [IPV4]: Move ip options parsing out of ip_rcv_finish()
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:03:00 -07:00
Thomas Graf
e9c6042273 [IPV4]: Avoid common branch misprediction while checking csum in ip_rcv()
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:02:57 -07:00
Thomas Graf
5861524241 [IPV4]: Consistency and whitespace cleanup of ip_rcv()
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:02:53 -07:00
David S. Miller
bf0ff9e578 [IPVS]: ipv4_table --> ipvs_ipv4_table
Fix conflict with symbol of same name in global
namespace.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:02:29 -07:00
David S. Miller
d179cd1292 [NET]: Implement SKB fast cloning.
Protocols that make extensive use of SKB cloning,
for example TCP, eat at least 2 allocations per
packet sent as a result.

To cut the kmalloc() count in half, we implement
a pre-allocation scheme wherein we allocate
2 sk_buff objects in advance, then use a simple
reference count to free up the memory at the
correct time.

Based upon an initial patch by Thomas Graf and
suggestions from Herbert Xu.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:01:54 -07:00
David S. Miller
ba602a8161 [IPVS]: Rename tcp_{init,exit}() --> ip_vs_tcp_{init,exit}()
Conflicts with global namespace functions with the
same name.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:01:47 -07:00
Arnaldo Carvalho de Melo
4c6ea29d82 [IP]: Introduce ip_options_get_from_user
This variant is needed to satisfy sparse __user annotations.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:01:39 -07:00
Arnaldo Carvalho de Melo
20380731bc [NET]: Fix sparse warnings
Of this type, mostly:

CHECK   net/ipv6/netfilter.c
net/ipv6/netfilter.c:96:12: warning: symbol 'ipv6_netfilter_init' was not declared. Should it be static?
net/ipv6/netfilter.c:101:6: warning: symbol 'ipv6_netfilter_fini' was not declared. Should it be static?

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:01:32 -07:00
Patrick McHardy
066286071d [NETLINK]: Add "groups" argument to netlink_kernel_create
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:01:11 -07:00
Patrick McHardy
ac6d439d20 [NETLINK]: Convert netlink users to use group numbers instead of bitmasks
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:00:54 -07:00
Patrick McHardy
db08052979 [NETLINK]: Remove unused groups member from struct netlink_skb_parms
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:00:39 -07:00
Christoph Hellwig
34b4a4a624 [NETFILTER]: Remove tasklist_lock abuse in ipt{,6}owner
Rip out cmd/sid/pid matching since its unfixable broken and stands in the
way of locking changes to tasklist_lock.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:59:07 -07:00
Gary Wayne Smith
000efe1d86 [NETFILTER]: Make NETMAP target usable in OUTPUT
Signed-off-by: Gary Wayne Smith <gary.w.smith@primeexalia.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:58:41 -07:00
Patrick McHardy
9baa5c67ff [NETFILTER]: Don't exclude local packets from MASQUERADING
Increases consistency in source-address selection.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:58:36 -07:00
Patrick McHardy
a61bbcf28a [NET]: Store skb->timestamp as offset to a base timestamp
Reduces skb size by 8 bytes on 64-bit.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:58:24 -07:00
Patrick McHardy
25ed891019 [NETFILTER]: Nicer names for ipt_connbytes constants
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:58:17 -07:00
Patrick McHardy
8ffde67173 [NETFILTER]: Fix div64_64 in ipt_connbytes
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:58:11 -07:00
Harald Welte
9d810fd2d2 [NETFILTER]: Add new iptables "connbytes" match
This patch ads a new "connbytes" match that utilizes the CONFIG_NF_CT_ACCT
per-connection byte and packet counters.  Using it you can do things like
packet classification on average packet size within a connection.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:58:04 -07:00
Arnaldo Carvalho de Melo
17b085eace [INET_DIAG]: Move the tcp_diag interface to the proper place
With this the previous setup is back, i.e. tcp_diag can be built as a module,
as dccp_diag and both share the infrastructure available in inet_diag.

If one selects CONFIG_INET_DIAG as module CONFIG_INET_TCP_DIAG will also be
built as a module, as will CONFIG_INET_DCCP_DIAG, if CONFIG_IP_DCCP was
selected static or as a module, if CONFIG_INET_DIAG is y, being statically
linked CONFIG_INET_TCP_DIAG will follow suit and CONFIG_INET_DCCP_DIAG will be
built in the same manner as CONFIG_IP_DCCP.

Now to aim at UDP, converting it to use inet_hashinfo, so that we can use
iproute2 for UDP sockets as well.

Ah, just to show an example of this new infrastructure working for DCCP :-)

[root@qemu ~]# ./ss -dane
State      Recv-Q Send-Q Local Address:Port  Peer Address:Port
LISTEN     0      0                  *:5001             *:*     ino:942 sk:cfd503a0
ESTAB      0      0          127.0.0.1:5001     127.0.0.1:32770 ino:943 sk:cfd50a60
ESTAB      0      0          127.0.0.1:32770    127.0.0.1:5001  ino:947 sk:cfd50700
TIME-WAIT  0      0          127.0.0.1:32769    127.0.0.1:5001  timer:(timewait,3.430ms,0) ino:0 sk:cf209620

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:57:54 -07:00
Arnaldo Carvalho de Melo
a8c2190ee7 [INET_DIAG]: Rename tcp_diag.[ch] to inet_diag.[ch]
Next changeset will introduce net/ipv4/tcp_diag.c, moving the code that was put
transitioanlly in inet_diag.c.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:57:48 -07:00
Arnaldo Carvalho de Melo
73c1f4a033 [TCPDIAG]: Just rename everything to inet_diag
Next changeset will rename tcp_diag.[ch] to inet_diag.[ch].

I'm taking this longer route so as to easy review, making clear the changes
made all along the way.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:57:44 -07:00
Arnaldo Carvalho de Melo
4f5736c4c7 [TCPDIAG]: Introduce inet_diag_{register,unregister}
Next changeset will rename tcp_diag to inet_diag and move the tcp_diag code out
of it and into a new tcp_diag.c, similar to the net/dccp/diag.c introduced in
this changeset, completing the transition to a generic inet_diag
infrastructure.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:57:38 -07:00