Commit Graph

210723 Commits (8b5ce5cd44743af84507721fa2cb4125ae67955c)

Author SHA1 Message Date
J. Bruce Fields 8b5ce5cd44 nfsd4: callback program number is per-session
The callback program is allowed to depend on the session which the
callback is going over.

No change in behavior yet, while we still only do callbacks over a
single session for the lifetime of the client.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-21 10:11:54 -04:00
J. Bruce Fields d29c374cd2 nfsd4: track backchannel connections
We need to keep track of which connections are available for use with
the backchannel, which for the forechannel, and which for both.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-21 10:11:53 -04:00
J. Bruce Fields 86c3e16cc7 nfsd4: confirm only on succesful create_session
Following rfc 5661, section 18.36.4: "If the session is not successfully
created, then no changes are made to any client records on the server."
We shouldn't be confirming or incrementing the sequence id in this case.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-21 10:11:52 -04:00
J. Bruce Fields ac7c46f29a nfsd4: make backchannel sequence number per-session
Currently we don't deal well with a client that has multiple sessions
associated with it (even simultaneously, or serially over the lifetime
of the client).

In particular, we don't attempt to keep the backchannel running after
the original session diseappears.

We will fix that soon.

Once we do that, we need the slot sequence number to be per-session;
otherwise, for example, we cannot correctly handle a case like this:

	- All session 1 connections are lost.
	- The client creates session 2.  We use it for the backchannel
	  (since it's the only working choice).
	- The client gives us a new connection to use with session 1.
	- The client destroys session 2.

At this point our only choice is to go back to using session 1.  When we
do so we must use the sequence number that is next for session 1.  We
therefore need to maintain multiple sequence number streams.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-21 10:11:51 -04:00
J. Bruce Fields 90c8145bb6 nfsd4: use client pointer to backchannel session
Instead of copying the sessionid, use the new cl_cb_session pointer,
which indicates which session we're using for the backchannel.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-21 10:11:50 -04:00
J. Bruce Fields edd7678663 nfsd4: move callback setup into session init code
The backchannel should  be associated with a session, it isn't really
global to the client.

We do, however, want a pointer global to the client which tracks which
session we're currently using for client-based callbacks.

This is a first step in that direction; for now, just reshuffling of
code with no significant change in behavior.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-21 10:11:49 -04:00
J. Bruce Fields cd5b814458 nfsd4: don't cache seq_misordered replies
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-21 10:11:48 -04:00
Chuck Lever 9247685088 SUNRPC: Properly initialize sock_xprt.srcaddr in all cases
The source address field in the transport's sock_xprt is initialized
ONLY IF the RPC application passed a pointer to a source address
during the call to rpc_create().  However, xs_bind() subsequently uses
the value of this field without regard to whether the source address
was initialized during transport creation or not.

So far we've been lucky: the uninitialized value of this field is
zeroes.  xs_bind(), until recently, used only the sin[6]_addr field in
this sockaddr, and all zeroes is a valid value for this: it means
ANYADDR.  This is a happy coincidence.

However, xs_bind() now wants to use the sa_family field as well, and
expects it to be initialized to something other than zero.

Therefore, the source address sockaddr field should be fully
initialized at transport create time in _every_ case, not just when
the RPC application wants to use a specific bind address.

Bruce added a workaround for this missing initialization by adjusting
commit 6bc9638a, but the "right" way to do this is to ensure that the
source address sockaddr is always correctly initialized from the
get-go.

This patch doesn't introduce a behavior change.  It's simply a
clean-up of Bruce's fix, to prevent future problems of this kind.  It
may look like overkill, but

  a) it clearly documents the default initial value of this field,

  b) it doesn't assume that the sockaddr_storage memory is first
     initialized to any particular value, and

  c) it will fail verbosely if some unknown address family is passed
     in

Originally introduced by commit d3bc9a1d.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-21 10:11:47 -04:00
Chuck Lever 4232e8634a SUNRPC: Use conventional switch statement when reclassifying sockets
Clean up.

Defensive coding: If "family" is ever something that is neither
AF_INET nor AF_INET6, xs_reclassify_socket6() is not the appropriate
default action.  Choose to do nothing in that case.

Introduced by commit 6bc9638a.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-21 10:11:46 -04:00
Tejun Heo a25e758c5f sunrpc/xprtrdma: clean up workqueue usage
* Create and use svc_rdma_wq instead of using the system workqueue and
  flush_scheduled_work().  This workqueue is necessary to serve as
  flushing domain for rdma->sc_work which is used to destroy itself
  and thus can't be flushed explicitly.

* Replace cancel_delayed_work() + flush_scheduled_work() with
  cancel_delayed_work_sync().

* Implement synchronous connect in xprt_rdma_connect() using
  flush_delayed_work() on the rdma_connect work instead of using
  flush_scheduled_work().

This is to prepare for the deprecation and removal of
flush_scheduled_work().

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-21 10:11:45 -04:00
Pavel Emelyanov 8f3a6de313 sunrpc: Turn list_for_each-s into the ..._entry-s
Saves some lines of code and some branticks when reading one.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-19 10:48:16 -04:00
Pavel Emelyanov 50fa0d40a9 sunrpc: Remove dead "else" branch from bc xprt creation
Since the xprt in question is forcibly set to be bound the else
branch of this check is unneeded.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-19 10:48:16 -04:00
Pavel Emelyanov c636b572e0 sunrpc: Don't return NULL from rpcb_create
> The reason for this is in the future, we may want to support additional
> address family types.  We should, therefore, ensure that every piece of
> code that is sensitive to address families fail in some orderly manner
> to let developers know where a change is needed.

Makes sense. I was under impression, that AF-s other than INET are not
cared about at all :(

Here's a fixed version of the patch.

Log:

Its callers check for ERR_PTR.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-19 10:48:16 -04:00
Pavel Emelyanov f10fef38d2 sunrpc: Remove useless if (task == NULL) from xprt_reserve_xprt
The task in question is dereferenced above (and is actually never NULL).

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-19 10:48:16 -04:00
Pavel Emelyanov 8c14ff2aaf sunrpc: Remove UDP worker wrappers
Same for UDP sockets creation paths.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-19 10:48:15 -04:00
Pavel Emelyanov cdd518d524 sunrpc: Remove TCP worker wrappers
The v4 and the v6 wrappers only pass the respective family
to the xs_tcp_setup_socket. This family can be taken from the
xprt's sockaddr.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-19 10:48:15 -04:00
Pavel Emelyanov 7dfe1fc362 sunrpc: Pass family to setup_socket calls
Now we have a single socket creation routine and can call it
directly from the setup_socket routines.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-19 10:48:15 -04:00
Pavel Emelyanov 6bc9638ab4 sunrpc: Merge xs_create_sock code
After xs_bind is merged it's easy to merge its callers.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
[bfields@redhat.com: fix address family initialization]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-19 10:48:15 -04:00
Pavel Emelyanov beb59b6828 sunrpc: Merge the xs_bind code
There's the only difference betseen the xs_bind4 and the
xs_bind6 - the size of sockaddr structure they use.

Fortunatelly its size can be indirectly get from the transport.

Change since v1:
* use sockaddr_storage instead of sockaddr
* use rpc_set_port instead of manual port assigning

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
[bfields@redhat.com: fix address family initialization]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-19 10:48:15 -04:00
Pavel Emelyanov 573018c07e sunrpc: Call xs_create_sockX directly from setup_socket
Remove now unneeded wrappers that just add type and protocol
to socket creation callback.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-19 10:48:15 -04:00
Pavel Emelyanov 22d44a7d8a sunrpc: Factor out v6 sockets creation
Same patch for v6 protocols.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-19 10:48:15 -04:00
Pavel Emelyanov 22f793268d sunrpc: Factor out v4 sockets creation
The UDPv4 and TCPv4 socket creation callbacks now look very similar.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-19 10:48:14 -04:00
Pavel Emelyanov b65c031061 sunrpc: Factor out udp sockets creation
Make it look like the TCP sockets creation.
Unfortunately the git diff made the patch look messy :(

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-19 10:48:14 -04:00
Pavel Emelyanov 58dddac9c5 sunrpc: Remove duplicate xprt/transport arguments from calls
The xs_tcp_reuse_connection takes the xprt only to pass it down
to the xs_abort_connection. The later one can get it from the given
transport itself.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-19 10:48:14 -04:00
Pavel Emelyanov a9f5f0f7bf sunrpc: Get xprt pointer once in xs_tcp_setup_socket
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-19 10:48:14 -04:00
Pavel Emelyanov baaf4e487a sunrpc: Remove unused sock arg from xs_next_srcport
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-19 10:48:14 -04:00
Pavel Emelyanov 5d4ec93297 sunrpc: Remove unused sock arg from xs_get_srcport
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-19 10:48:14 -04:00
Tom Tucker 4a84386fc2 svcrdma: Cleanup DMA unmapping in error paths.
There are several error paths in the code that do not unmap DMA. This
patch adds calls to svc_rdma_unmap_dma to free these DMA contexts.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-18 19:51:32 -04:00
Tom Tucker b432e6b3d9 svcrdma: Change DMA mapping logic to avoid the page_address kernel API
There was logic in the send path that assumed that a page containing data
to send to the client has a KVA. This is not always the case and can result
in data corruption when page_address returns zero and we end up DMA mapping
zero.

This patch changes the bus mapping logic to avoid page_address() where
necessary and converts all calls from ib_dma_map_single to ib_dma_map_page
in order to keep the map/unmap calls symmetric.

Signed-off-by: Tom Tucker <tom@ogc.us>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-18 19:51:31 -04:00
J. Bruce Fields ecec6e34e1 nfsd4: expire clients more promptly
Expire clients more promptly, at the expense of possibly running the
laundromat thread more frequently.

Though it's not the default, I'd like it to be feasible to run with a
lease time of just a few seconds, at which point a minimum 10 second
wait between laundromat runs seems a little much.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-11 20:00:18 -04:00
Pavel Emelyanov 70dc78da2c sunrpc: Use helper to set v4 mapped addr in ip_map_parse
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-11 20:00:17 -04:00
NeilBrown e33534d54f sunrpc/cache: centralise handling of size limit on deferred list.
We limit the number of 'defer' requests to DFR_MAX.

The imposition of this limit is spread about a bit - sometime we don't
add new things to the list, sometimes we remove old things.

Also it is currently applied to requests which we are 'waiting' for
rather than 'deferring'.  This doesn't seem ideal as 'waiting'
requests are naturally limited by the number of threads.

So gather the DFR_MAX handling code to one place and only apply it to
requests that are actually being deferred.

This means that not all 'cache_deferred_req' structures go on the
'cache_defer_list, so we need to be careful when adding and removing
things.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-11 19:30:28 -04:00
NeilBrown d29068c431 sunrpc: Simplify cache_defer_req and related functions.
The return value from cache_defer_req is somewhat confusing.
Various different error codes are returned, but the single caller is
only interested in success or failure.

In fact it can measure this success or failure itself by checking
CACHE_PENDING, which makes the point of the code more explicit.

So change cache_defer_req to return 'void' and test CACHE_PENDING
after it completes, to see if the request was actually deferred or
not.

Similarly setup_deferral and cache_wait_req don't need a return value,
so make them void and remove some code.

The call to cache_revisit_request (to guard against a race) is only
needed for the second call to setup_deferral, so move it out of
setup_deferral to after that second call.  With the first call the
race is handled differently (by explicitly calling
'wait_for_completion').

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-11 19:30:27 -04:00
J. Bruce Fields 3351514215 nfsd4: return expired on unfound stateid's
Commit 78155ed75f "nfsd4: distinguish
expired from stale stateids" attempted to distinguish expired and stale
stateid's using time information that may not have been completely
reliable, so I reverted it.

That was throwing out the baby with the bathwater; we still do want to
return expired, but let's do that using the simpler approach of just
assuming any stateid is expired if it looks like it was given out by the
current server instance, but we can't find it any more.

This may help clients that are recovering from network partitions.

Reported-by: Bian Naimeng <biannm@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-02 18:49:33 -04:00
J. Bruce Fields 328ead2872 nfsd4: add new connections to session
As long as we're not implementing any session security, we should just
automatically add any new connections that come along to the list of
sessions associated with the session.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01 19:29:45 -04:00
J. Bruce Fields db90681d6e nfsd4: refactor connection allocation
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01 19:29:45 -04:00
J. Bruce Fields 19cf5c026f nfsd4: use callbacks on svc_xprt_deletion
Remove connections from the list when they go down.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:44 -04:00
J. Bruce Fields edc7a89403 nfsd: provide callbacks on svc_xprt deletion
NFSv4.1 needs warning when a client tcp connection goes down, if that
connection is being used as a backchannel, so that it can warn the
client that it has lost the backchannel connection.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:44 -04:00
J. Bruce Fields c7662518c7 nfsd4: keep per-session list of connections
The spec requires us in various places to keep track of the connections
associated with each session.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:44 -04:00
J. Bruce Fields 5b6feee960 nfsd4: clean up session allocation
Changes:
	- make sure session memory reservation is released on failure
	  path.
	- use min_t()/min() for more compact code in several places.
	- break alloc_init_session into smaller pieces.
	- miscellaneous other cleanup.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:44 -04:00
J. Bruce Fields dd93842457 nfsd4: fix alloc_init_session return type
This returns an nfs error, not -ERRNO.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01 19:29:44 -04:00
J. Bruce Fields c23753dac1 nfsd4: fix alloc_init_session BUILD_BUG_ON()
Note we're allocating an array of nfsd4_slot *'s, not nfsd4_slot's.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01 19:29:44 -04:00
J. Bruce Fields 6ff8da0887 nfsd4: Move callback setup to callback queue
Instead of creating the new rpc client from a regular server thread,
set a flag, kick off a null call, and allow the null call to do the work
of setting up the client on the callback workqueue.

Use a spinlock to ensure the callback work gets a consistent view of the
callback parameters.

This allows, for example, changing the callback from contexts where
sleeping is not allowed.  I hope it will also keep the locking simple as
we add more session and trunking features, by serializing most of the
callback-specific work.

This also closes a small race where the the new cb_ident could be used
with an old connection (or vice-versa).

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:44 -04:00
J. Bruce Fields fb00392326 nfsd4: remove separate cb_args struct
I don't see the point of the separate struct.  It seems to just be
getting in the way.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:43 -04:00
J. Bruce Fields cee277d924 nfsd4: use generic callback code in null case
This will eventually allow us, for example, to kick off null callback
from contexts where we can't sleep.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:43 -04:00
J. Bruce Fields 5878453dbd nfsd4: generic callback code
Make the recall callback code more generic, so that other callbacks
will be able to use it too.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:43 -04:00
J. Bruce Fields 1c8556026e nfsd4: rename nfs4_rpc_args->nfsd4_cb_args
With apologies for the gratuitous rename, the new name seems more
helpful to me.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:43 -04:00
J. Bruce Fields 586f36735e nfsd4: combine nfs4_rpc_args and nfsd4_cb_sequence
These two structs don't really need to be distinct as far as I can tell.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:43 -04:00
J. Bruce Fields 07263f1efe nfsd4: minor variable renaming (cb -> conn)
Now that we have both nfsd4_callback and nfsd4_cb_conn structures, I get
confused if variables of both types are always named cb....

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:43 -04:00
J. Bruce Fields 1e7af1b806 nfsd4: remove spkm3
Unfortunately, spkm3 never got very far; while interoperability with one
other implementation was demonstrated at some point, problems were found
with the spec that were deemed not worth fixing.

The kernel code is useless on its own without nfs-utils patches which
were never merged into nfs-utils, and were only ever available from
citi.umich.edu.  They appear not to have been updated since 2005.

Therefore it seems safe to assume that this code has no users, and never
will.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01 18:09:55 -04:00