summaryrefslogtreecommitdiffstats
path: root/net/sunrpc
Commit message (Collapse)AuthorAgeFilesLines
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds2008-10-171-2/+0
|\ | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: net: Remove CONFIG_KMOD from net/ (towards removing CONFIG_KMOD entirely) ipv4: Add a missing rcu_assign_pointer() in routing cache. [netdrvr] ibmtr: PCMCIA IBMTR is ok on 64bit xen-netfront: Avoid unaligned accesses to IP header lmc: copy_*_user under spinlock [netdrvr] myri10ge, ixgbe: remove broken select INTEL_IOATDMA
| * net: Remove CONFIG_KMOD from net/ (towards removing CONFIG_KMOD entirely)Johannes Berg2008-10-161-2/+0
| | | | | | | | | | | | | | | | | | | | | | Some code here depends on CONFIG_KMOD to not try to load protocol modules or similar, replace by CONFIG_MODULES where more than just request_module depends on CONFIG_KMOD and and also use try_then_request_module in ebtables. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'next'Trond Myklebust2008-10-157-268/+616
|\ \
| * | RPC/RDMA: ensure connection attempt is complete before signalling.Tom Talpey2008-10-101-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | The RPC/RDMA connection logic could return early from reconnection attempts, leading to additional spurious retries. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | RPC/RDMA: correct the reconnect timer backoffTom Talpey2008-10-101-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The RPC/RDMA code had a constant 5-second reconnect backoff, and always performed it, even when re-establishing a connection to a server after the RPC layer closed it due to being idle. Make it an geometric backoff (up to 30 seconds), and don't delay idle reconnect. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | RPC/RDMA: optionally emit useful transport info upon connect/disconnect.Tom Talpey2008-10-102-1/+22
| | | | | | | | | | | | | | | Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | RPC/RDMA: reformat a debug printk to keep lines together.Tom Talpey2008-10-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | The send marshaling code split a particular dprintk across two lines, which makes it hard to extract from logfiles. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | RPC/RDMA: harden connection logic against missing/late rdma_cm upcalls.Tom Talpey2008-10-102-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | Add defensive timeouts to wait_for_completion() calls in RDMA address resolution, and make them interruptible. Fix the timeout units to milliseconds (formerly jiffies) and move to private header. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | RPC/RDMA: fix connect/reconnect resource leak.Tom Talpey2008-10-101-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | The RPC/RDMA code can leak RDMA connection manager endpoints in certain error cases on connect. Don't signal unwanted events, and be certain to destroy any allocated qp. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | RPC/RDMA: return a consistent error, when connect fails.Tom Talpey2008-10-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The xprt_connect call path does not expect such errors as ECONNREFUSED to be returned from failed transport connection attempts, otherwise it translates them to EIO and signals fatal errors. For example, mount.nfs prints simply "internal error". Translate all such errors to ENOTCONN from RPC/RDMA to match sockets behavior. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | RPC/RDMA: adhere to protocol for unpadded client trailing write chunks.Tom Talpey2008-10-103-2/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The RPC/RDMA protocol allows clients and servers to avoid RDMA operations for data which is purely the result of XDR padding. On the client, automatically insert the necessary padding for such server replies, and optionally don't marshal such chunks. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | RPC/RDMA: avoid an oops due to disconnect racing with async upcalls.Tom Talpey2008-10-101-11/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | RDMA disconnects yield an upcall from the RDMA connection manager, which can race with rpc transport close, e.g. on ^C of a mount. Ensure any rdma cm_id and qp are fully destroyed before continuing. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | RPC/RDMA: maintain the RPC task bytes-sent statistic.Tom Talpey2008-10-101-0/+1
| | | | | | | | | | | | | | | Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | RPC/RDMA: suppress retransmit on RPC/RDMA clients.Tom Talpey2008-10-103-4/+15
| | | | | | | | | | | | | | | | | | | | | | | | An RPC/RDMA client cannot retransmit on an unbroken connection, doing so violates its flow control with the server. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | RPC/RDMA: fix connection IRD/ORD settingTom Tucker2008-10-101-37/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This logic sets the connection parameter that configures the local device and informs the remote peer how many concurrent incoming RDMA_READ requests are supported. The original logic didn't really do what was intended for two reasons: - The max number supported by the device is typically smaller than any one factor in the calculation used, and - The field in the connection parameter structure where the value is stored is a u8 and always overflows for the default settings. So what really happens is the value requested for responder resources is the left over 8 bits from the "desired value". If the desired value happened to be a multiple of 256, the result was zero and it wouldn't connect at all. Given the above and the fact that max_requests is almost always larger than the max responder resources supported by the adapter, this patch simplifies this logic and simply requests the max supported by the device, subject to a reasonable limit. This bug was found by Jim Schutt at Sandia. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | RPC/RDMA: support FRMR client memory registration.Tom Talpey2008-10-102-6/+167
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Configure, detect and use "fastreg" support from IB/iWARP verbs layer to perform RPC/RDMA memory registration. Make FRMR the default memreg mode (will fall back if not supported by the selected RDMA adapter). This allows full and optimal operation over the cxgb3 adapter, and others. Signed-off-by: Tom Talpey <talpey@netapp.com> Acked-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | RPC/RDMA: check selected memory registration mode at runtime.Tom Talpey2008-10-101-15/+80
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At transport creation, check for, and use, any local dma lkey. Then, check that the selected memory registration mode is in fact supported by the RDMA adapter selected for the mount. Fall back to best alternative if not. Signed-off-by: Tom Talpey <talpey@netapp.com> Acked-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | RPC/RDMA: add data types and new FRMR memory registration enum.Tom Talpey2008-10-101-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | Internal RPC/RDMA structure updates in preparation for FRMR support. Signed-off-by: Tom Talpey <talpey@netapp.com> Acked-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | RPC/RDMA: refactor the inline memory registration code.Tom Talpey2008-10-101-158/+207
| | | | | | | | | | | | | | | | | | | | | | | | | | | Refactor the memory registration and deregistration routines. This saves stack space, makes the code more readable and prepares to add the new FRMR registration methods. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | sunrpc: fix oops in rpc_create when the mount namespace is unsharedCedric Le Goater2008-10-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On a system with nfs mounts, if a task unshares its mount namespace, a oops can occur when the system is rebooted if the task is the last to unreference the nfs mount. It will try to create a rpc request using utsname() which has been invalidated by free_nsproxy(). The patch fixes the issue by using the global init_utsname() which is always valid. the capability of identifying rpc clients per uts namespace stills needs some extra work so this should not be a problem. BUG: unable to handle kernel NULL pointer dereference at 00000004 IP: [<c024c9ab>] rpc_create+0x332/0x42f Oops: 0000 [#1] DEBUG_PAGEALLOC Pid: 1857, comm: uts-oops Not tainted (2.6.27-rc5-00319-g7686ad5 #4) EIP: 0060:[<c024c9ab>] EFLAGS: 00210287 CPU: 0 EIP is at rpc_create+0x332/0x42f EAX: 00000000 EBX: df26adf0 ECX: c0251887 EDX: 00000001 ESI: df26ae58 EDI: c02f293c EBP: dda0fc9c ESP: dda0fc2c DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 Process uts-oops (pid: 1857, ti=dda0e000 task=dd9a0778 task.ti=dda0e000) Stack: c0104532 dda0fffc dda0fcac dda0e000 dda0e000 dd93b7f0 00000009 c02f2880 df26aefc dda0fc68 c01096b7 00000000 c0266ee0 c039a070 c039a070 dda0fc74 c012ca67 c039a064 dda0fc8c c012cb20 c03daf74 00000011 00000000 c0275c90 Call Trace: [<c0104532>] ? dump_trace+0xc2/0xe2 [<c01096b7>] ? save_stack_trace+0x1c/0x3a [<c012ca67>] ? save_trace+0x37/0x8c [<c012cb20>] ? add_lock_to_list+0x64/0x96 [<c0256fc4>] ? rpcb_register_call+0x62/0xbb [<c02570c8>] ? rpcb_register+0xab/0xb3 [<c0252f4d>] ? svc_register+0xb4/0x128 [<c0253114>] ? svc_destroy+0xec/0x103 [<c02531b2>] ? svc_exit_thread+0x87/0x8d [<c01a75cd>] ? lockd_down+0x61/0x81 [<c01a577b>] ? nlmclnt_done+0xd/0xf [<c01941fe>] ? nfs_destroy_server+0x14/0x16 [<c0194328>] ? nfs_free_server+0x4c/0xaa [<c019a066>] ? nfs_kill_super+0x23/0x27 [<c0158585>] ? deactivate_super+0x3f/0x51 [<c01695d1>] ? mntput_no_expire+0x95/0xb4 [<c016965b>] ? release_mounts+0x6b/0x7a [<c01696cc>] ? __put_mnt_ns+0x62/0x70 [<c0127501>] ? free_nsproxy+0x25/0x80 [<c012759a>] ? switch_task_namespaces+0x3e/0x43 [<c01275a9>] ? exit_task_namespaces+0xa/0xc [<c0117fed>] ? do_exit+0x4fd/0x666 [<c01181b3>] ? do_group_exit+0x5d/0x83 [<c011fa8c>] ? get_signal_to_deliver+0x2c8/0x2e0 [<c0102630>] ? do_notify_resume+0x69/0x700 [<c011d85a>] ? do_sigaction+0x134/0x145 [<c0127205>] ? hrtimer_nanosleep+0x8f/0xce [<c0126d1a>] ? hrtimer_wakeup+0x0/0x1c [<c0103488>] ? work_notifysig+0x13/0x1b ======================= Code: 70 20 68 cb c1 2c c0 e8 75 4e 01 00 8b 83 ac 00 00 00 59 3d 00 f0 ff ff 5f 77 63 eb 57 a1 00 80 2d c0 8b 80 a8 02 00 00 8d 73 68 <8b> 40 04 83 c0 45 e8 41 46 f7 ff ba 20 00 00 00 83 f8 21 0f 4c EIP: [<c024c9ab>] rpc_create+0x332/0x42f SS:ESP 0068:dda0fc2c Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: "Serge E. Hallyn" <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | SUNRPC: Fix a memory leak in rpcb_getport_asyncTrond Myklebust2008-10-071-1/+3
| | | | | | | | | | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | SUNRPC: Fix autobind on cloned rpc clientsTrond Myklebust2008-10-071-7/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Despite the fact that cloned rpc clients won't have the cl_autobind flag set, they may still find themselves calling rpcb_getport_async(). For this to happen, it suffices for a _parent_ rpc_clnt to use autobinding, in which case any clone may find itself triggering the !xprt_bound() case in call_bind(). The correct fix for this is to walk back up the tree of cloned rpc clients, in order to find the parent that 'owns' the transport, either because it has clnt->cl_autobind set, or because it originally created the transport... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | sunrpc: do not pin sunrpc module in the memoryDenis V. Lunev2008-10-071-8/+4
| | | | | | | | | | | | | | | | | | | | | | | | Basically, try_module_get here are pretty useless. Any other module using this API will pin sunrpc in memory due using exported symbols. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | | Merge branch 'for-2.6.28' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2008-10-148-207/+989
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-2.6.28' of git://linux-nfs.org/~bfields/linux: (59 commits) svcrdma: Fix IRD/ORD polarity svcrdma: Update svc_rdma_send_error to use DMA LKEY svcrdma: Modify the RPC reply path to use FRMR when available svcrdma: Modify the RPC recv path to use FRMR when available svcrdma: Add support to svc_rdma_send to handle chained WR svcrdma: Modify post recv path to use local dma key svcrdma: Add a service to register a Fast Reg MR with the device svcrdma: Query device for Fast Reg support during connection setup svcrdma: Add FRMR get/put services NLM: Remove unused argument from svc_addsock() function NLM: Remove "proto" argument from lockd_up() NLM: Always start both UDP and TCP listeners lockd: Remove unused fields in the nlm_reboot structure lockd: Add helper to sanity check incoming NOTIFY requests lockd: change nlmclnt_grant() to take a "struct sockaddr *" lockd: Adjust nlmsvc_lookup_host() to accomodate AF_INET6 addresses lockd: Adjust nlmclnt_lookup_host() signature to accomodate non-AF_INET lockd: Support non-AF_INET addresses in nlm_lookup_host() NLM: Convert nlm_lookup_host() to use a single argument svcrdma: Add Fast Reg MR Data Types ...
| * | Merge branch 'from-tomtucker' into for-2.6.28J. Bruce Fields2008-10-083-122/+684
| |\ \
| | * | svcrdma: Fix IRD/ORD polarityTom Tucker2008-10-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The inititator/responder resources in the event have been swapped. They no represent what the local peer would set their values to in order to match the peer. Note that iWARP does not exchange these on the wire and the provider is simply putting in the local device max. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
| | * | svcrdma: Update svc_rdma_send_error to use DMA LKEYTom Tucker2008-10-061-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Update the svc_rdma_send_error code to use the DMA LKEY which is valid regardless of the memory registration strategy in use. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
| | * | svcrdma: Modify the RPC reply path to use FRMR when availableTom Tucker2008-10-062-40/+217
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use FRMR to map local RPC reply data. This allows RDMA_WRITE to send reply data using a single WR. The FRMR is invalidated by linking the LOCAL_INV WR to the RDMA_SEND message used to complete the reply. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
| | * | svcrdma: Modify the RPC recv path to use FRMR when availableTom Tucker2008-10-062-22/+170
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RPCRDMA requests that specify a read-list are fetched with RDMA_READ. Using an FRMR to map the data sink improves NFSRDMA security on transports that place the RDMA_READ data sink LKEY on the wire because the valid lifetime of the MR is only the duration of the RDMA_READ. The LKEY is invalidated when the last RDMA_READ WR completes. Mapping the data sink also allows for very large amounts to data to be fetched with a single WR, so if the client is also using FRMR, the entire RPC read-list can be fetched with a single WR. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
| | * | svcrdma: Add support to svc_rdma_send to handle chained WRTom Tucker2008-10-061-8/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | WR can be submitted as linked lists of WR. Update the svc_rdma_send routine to handle WR chains. This will be used to submit a WR that uses an FRMR with another WR that invalidates the FRMR. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
| | * | svcrdma: Modify post recv path to use local dma keyTom Tucker2008-10-061-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Update the svc_rdma_post_recv routine to use the adapter's global LKEY instead of sc_phys_mr which is only valid when using a DMA MR. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
| | * | svcrdma: Add a service to register a Fast Reg MR with the deviceTom Tucker2008-10-061-35/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fast Reg MR introduces a new WR type. Add a service to register the region with the adapter and update the completion handling to support completions with a NULL WR context. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
| | * | svcrdma: Query device for Fast Reg support during connection setupTom Tucker2008-10-061-6/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Query the device capabilities in the svc_rdma_accept function to determine what advanced memory management capabilities are supported by the device. Based on the query, select the most secure model available given the requirements of the transport and capabilities of the adapter. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
| | * | svcrdma: Add FRMR get/put servicesTom Tucker2008-10-061-5/+111
| | |/ | | | | | | | | | | | | | | | | | | | | | Add services for the allocating, freeing, and unmapping Fast Reg MR. These services will be used by the transport connection setup, send and receive routines. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
| * | NLM: Remove unused argument from svc_addsock() functionChuck Lever2008-10-041-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Clean up: The svc_addsock() function no longer uses its "proto" argument, so remove it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | nfsd: use nfs client rpc callback programBenny Halevy2008-09-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | since commit ff7d9756b501744540be65e172d27ee321d86103 "nfsd: use static memory for callback program and stats" do_probe_callback uses a static callback program (NFS4_CALLBACK) rather than the one set in clp->cl_callback.cb_prog as passed in by the client in setclientid (4.0) or create_session (4.1). This patches introduces rpc_create_args.prognumber that allows overriding program->number when creating rpc_clnt. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | SUNRPC: Clean up debug messages in rpcb_clnt.cChuck Lever2008-09-291-9/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The RPCB XDR functions are used for multiple procedures. For instance, rpcb_encode_getaddr() is used for RPCB_GETADDR, RPCB_SET, and RPCB_UNSET. Make the XDR debug messages more generic so they are less confusing. And, unlike in other RPC consumers in the kernel, a single debug flag enables all levels of debug messages in the RPC bind client, including XDR debug messages. Since the XDR decoders already report success or failure in this case, remove redundant debug messages in the mid-level rpcb_register_call() function. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | SUNRPC: Fix up svc_unregister()Chuck Lever2008-09-291-20/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the new rpcbind code, a PMAP_UNSET will not have any effect on services registered via rpcbind v3 or v4. Implement a version of svc_unregister() that uses an RPCB_UNSET with an empty netid string to make sure we have cleared *all* entries for a kernel RPC service when shutting down, or before starting a fresh instance of the service. Use the new version only when CONFIG_SUNRPC_REGISTER_V4 is enabled; otherwise, the legacy PMAP version is used to ensure complete backwards-compatibility with the Linux portmapper daemon. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | SUNRPC: Use short-hand IPv6 ANYADDR for RPCB_SETChuck Lever2008-09-291-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Clean up: When doing an RPCB_SET, make the kernel's rpcb client use the shorthand "::" for the universal form of the IPv6 ANY address. Without this patch, rpcbind will advertise: 0000:0000:0000:0000:0000:0000:0000:0000.x.y This is cosmetic only. It cleans up the display of information from /sbin/rpcinfo. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | SUNRPC: Register both netids for AF_INET6 serversChuck Lever2008-09-291-43/+100
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TI-RPC is a user-space library of RPC functions that replaces ONC RPC and allows RPC to operate in the new world of IPv6. TI-RPC combines the concept of a transport protocol (UDP and TCP) and a protocol family (PF_INET and PF_INET6) into a single identifier called a "netid." For example, "udp" means UDP over IPv4, and "udp6" means UDP over IPv6. For rpcbind, then, the RPC service tuple that is registered and advertised is: [RPC program, RPC version, service address and port, netid] instead of [RPC program, RPC version, port, protocol] Service address is typically ANYADDR, but can be a specific address of one of the interfaces on a multi-homed host. The third item in the new tuple is expressed as a universal address. The current Linux rpcbind implementation registers a netid for both protocol families when RPCB_SET is done for just the PF_INET6 version of the netid (ie udp6 or tcp6). So registering "udp6" causes a registration for "udp" to appear automatically as well. We've recently determined that this is incorrect behavior. In the TI-RPC world, "udp6" is not meant to imply that the registered RPC service handles requests from AF_INET as well, even if the listener socket does address mapping. "udp" and "udp6" are entirely separate capabilities, and must be registered separately. The Linux kernel, unlike TI-RPC, leverages address mapping to allow a single listener socket to handle requests for both AF_INET and AF_INET6. This is still OK, but the kernel currently assumes registering "udp6" will cover "udp" as well. It registers only "udp6" for it's AF_INET6 services, even though they handle both AF_INET and AF_INET6 on the same port. So svc_register() actually needs to register both "udp" and "udp6" explicitly (and likewise for TCP). Until rpcbind is fixed, the kernel can ignore the return code for the second RPCB_SET call. Please merge this with commit 15231312: SUNRPC: Support IPv6 when registering kernel RPC services Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Olaf Kirch <okir@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | SUNRPC: Support IPv6 when registering kernel RPC servicesChuck Lever2008-09-291-7/+88
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to advertise NFS-related services on IPv6 interfaces via rpcbind, the kernel RPC server implementation must use rpcb_v4_register() instead of rpcb_register(). A new kernel build option allows distributions to use the legacy v2 call until they integrate an appropriate user-space rpcbind daemon that can support IPv6 RPC services. I tried adding some automatic logic to fall back if registering with a v4 protocol request failed, but there are too many corner cases. So I just made it a compile-time switch that distributions can throw when they've replaced portmapper with rpcbind. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | SUNRPC: Split portmap unregister API into separate functionChuck Lever2008-09-291-12/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Create a separate server-level interface for unregistering RPC services. The mechanics of, and the API for, registering and unregistering RPC services will diverge further as support for IPv6 is added. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | SUNRPC: Simplify rpcb_register() APIChuck Lever2008-09-292-43/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Bruce suggested there's no need to expose the difference between an error sending the PMAP_SET request and an error reply from the portmapper to rpcb_register's callers. The user space equivalent of rpcb_register() is pmap_set(3), which returns a bool_t : either the PMAP set worked, or it didn't. Simple. So let's remove the "*okay" argument from rpcb_register() and rpcb_v4_register(), and simply return an error if any part of the call didn't work. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | SUNRPC: Set V6ONLY socket option for RPC listener socketsChuck Lever2008-09-291-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | My plan is to use an AF_INET listener on systems that support only IPv4, and an AF_INET6 listener on systems that can support IPv6. Incoming IPv4 packets will be posted to an AF_INET6 listener with a mapped IPv4 address. Max Matveev <makc@sgi.com> says: Creating a single listener can be dangerous - if net.ipv6.bindv6only is enabled then it's possible to create another listener in v4 namespace on the same port and steal the traffic from the "unifed" listener. You need to disable V6ONLY explicitly via a sockopt to stop that. Set appropriate socket option on RPC server listener sockets to prevent this. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | SUNRPC: Use proper INADDR_ANY when setting up RPC services on IPv6Chuck Lever2008-09-291-6/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Teach svc_create_xprt() to use the correct ANY address for AF_INET6 based RPC services. No caller uses AF_INET6 yet. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | SUNRPC: Add address family field to svc_serv data structureChuck Lever2008-09-291-5/+6
| |/ | | | | | | | | | | | | | | | | | | Introduce and initialize an address family field in the svc_serv structure. This field will determine what family to use for the service's listener sockets and what families are advertised via the local rpcbind daemon. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* | net: Rationalise email address: Network Specific PartsAlan Cox2008-10-131-2/+2
| | | | | | | | | | | | | | | | | | | | Clean up the various different email addresses of mine listed in the code to a single current and valid address. As Dave says his network merges for 2.6.28 are now done this seems a good point to send them in where they won't risk disrupting real changes. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: Use hton[sl]() instead of __constant_hton[sl]() where applicableArnaldo Carvalho de Melo2008-09-201-2/+2
|/ | | | | Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sunrpc: fix possible overrun on read of /proc/sys/sunrpc/transportsCyrill Gorcunov2008-09-011-14/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Vegard Nossum reported ---------------------- > I noticed that something weird is going on with /proc/sys/sunrpc/transports. > This file is generated in net/sunrpc/sysctl.c, function proc_do_xprt(). When > I "cat" this file, I get the expected output: > $ cat /proc/sys/sunrpc/transports > tcp 1048576 > udp 32768 > But I think that it does not check the length of the buffer supplied by > userspace to read(). With my original program, I found that the stack was > being overwritten by the characters above, even when the length given to > read() was just 1. David Wagner added (among other things) that copy_to_user could be probably used here. Ingo Oeser suggested to use simple_read_from_buffer() here. The conclusion is that proc_do_xprt doesn't check for userside buffer size indeed so fix this by using Ingo's suggestion. Reported-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> CC: Ingo Oeser <ioe-lkml@rameria.de> Cc: Neil Brown <neilb@suse.de> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Greg Banks <gnb@sgi.com> Cc: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* svcrdma: Fix race between svc_rdma_recvfrom thread and the dto_taskletTom Tucker2008-08-132-7/+6
| | | | | | | | | | | | | | | | | RDMA_READ completions are kept on a separate queue from the general I/O request queue. Since a separate lock is used to protect the RDMA_READ completion queue, a race exists between the dto_tasklet and the svc_rdma_recvfrom thread where the dto_tasklet sets the XPT_DATA bit and adds I/O to the read-completion queue. Concurrently, the recvfrom thread checks the generic queue, finds it empty and resets the XPT_DATA bit. A subsequent svc_xprt_enqueue will fail to enqueue the transport for I/O and cause the transport to "stall". The fix is to protect both lists with the same lock and set the XPT_DATA bit with this lock held. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>