From d42b386834ee1c22f6fac2f856bba8a6e4de38bb Mon Sep 17 00:00:00 2001
From: Al Viro
Date: Thu, 26 May 2016 00:04:18 -0400
Subject: update D/f/directory-locking

Signed-off-by: Al Viro
---
 Documentation/filesystems/directory-locking | 32 ++++++++++++++++++-----------
 1 file changed, 20 insertions(+), 12 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/filesystems/directory-locking b/Documentation/filesystems/directory-locking
index 09bbf9a54f808..c314badbcfc63 100644
--- a/Documentation/filesystems/directory-locking
+++ b/Documentation/filesystems/directory-locking
@@ -1,30 +1,37 @@
 	Locking scheme used for directory operations is based on two
-kinds of locks - per-inode (->i_mutex) and per-filesystem
+kinds of locks - per-inode (->i_rwsem) and per-filesystem
 (->s_vfs_rename_mutex).

-	When taking the i_mutex on multiple non-directory objects, we
+	When taking the i_rwsem on multiple non-directory objects, we
 always acquire the locks in order by increasing address. We'll call
 that "inode pointer" order in the following.

 	For our purposes all operations fall in 5 classes:

 1) read access. Locking rules: caller locks directory we are accessing.
+The lock is taken shared.

-2) object creation. Locking rules: same as above.
+2) object creation. Locking rules: same as above, but the lock is taken
+exclusive.

 3) object removal. Locking rules: caller locks parent, finds victim,
-locks victim and calls the method.
+locks victim and calls the method. Locks are exclusive.

 4) rename() that is _not_ cross-directory. Locking rules: caller locks
-the parent and finds source and target. If target already exists, lock
-it. If source is a non-directory, lock it. If that means we need to
-lock both, lock them in inode pointer order.
+the parent and finds source and target. In case of exchange (with
+RENAME_EXCHANGE in rename2() flags argument) lock both. In any case,
+if the target already exists, lock it. If the source is a non-directory,
+lock it. If we need to lock both, lock them in inode pointer order.
+Then call the method. All locks are exclusive.
+NB: we might get away with locking the the source (and target in exchange
+case) shared.

 5) link creation. Locking rules:
 	* lock parent
 	* check that source is not a directory
 	* lock source
 	* call the method.
+All locks are exclusive.

 6) cross-directory rename. The trickiest in the whole bunch. Locking
 rules:
@@ -35,11 +42,12 @@ rules:
 		fail with -ENOTEMPTY
 	* if new parent is equal to or is a descendent of source
 		fail with -ELOOP
-	* If target exists, lock it. If source is a non-directory, lock
-	  it. In case that means we need to lock both source and target,
-	  do so in inode pointer order.
+	* If it's an exchange, lock both the source and the target.
+	* If the target exists, lock it. If the source is a non-directory,
+	  lock it. If we need to lock both, do so in inode pointer order.
 	* call the method.
-
+All ->i_rwsem are taken exclusive. Again, we might get away with locking
+the the source (and target in exchange case) shared.

 The rules above obviously guarantee that all directories that are going to be
 read, modified or removed by method will be locked by caller.
@@ -73,7 +81,7 @@ objects - A < B iff A is an ancestor of B.
 attempt to acquire some lock and already holds at least one lock. Let's
 consider the set of contended locks. First of all, filesystem lock is
 not contended, since any process blocked on it is not holding any locks.
-Thus all processes are blocked on ->i_mutex.
+Thus all processes are blocked on ->i_rwsem.

 	By (3), any process holding a non-directory lock can only be
 waiting on another non-directory lock with a larger address. Therefore
-- 
cgit v1.2.3

From 3767e255b390d72f9a33c08d9e86c5f21f25860f Mon Sep 17 00:00:00 2001
From: Al Viro
Date: Fri, 27 May 2016 11:06:05 -0400
Subject: switch ->setxattr() to passing dentry and inode separately

smack ->d_instantiate() uses ->setxattr(), so to be able to call it before
we'd hashed the new dentry and attached it to inode, we need ->setxattr()
instances getting the inode as an explicit argument rather than obtaining
it from dentry.

Similar change for ->getxattr() had been done in commit ce23e64. Unlike
->getxattr() (which is used by both selinux and smack instances of
->d_instantiate()) ->setxattr() is used only by smack one and unfortunately
it got missed back then.

Reported-by: Seung-Woo Kim
Tested-by: Casey Schaufler
Signed-off-by: Al Viro
---
 Documentation/filesystems/porting                    |  7 +++++++
 drivers/staging/lustre/lustre/llite/llite_internal.h |  4 ++--
 drivers/staging/lustre/lustre/llite/xattr.c          |  6 ++----
 fs/bad_inode.c                                       |  4 ++--
 fs/ecryptfs/crypto.c                                 |  9 +++++----
 fs/ecryptfs/ecryptfs_kernel.h                        |  4 ++--
 fs/ecryptfs/inode.c                                  |  7 ++++---
 fs/ecryptfs/mmap.c                                   |  3 ++-
 fs/fuse/dir.c                                        |  6 +++---
 fs/hfs/attr.c                                        |  6 +++---
 fs/hfs/hfs_fs.h                                      |  2 +-
 fs/kernfs/inode.c                                    | 11 ++++++-----
 fs/kernfs/kernfs-internal.h                          |  3 ++-
 fs/libfs.c                                           |  5 +++--
 fs/overlayfs/inode.c                                 |  5 +++--
 fs/overlayfs/overlayfs.h                             |  5 +++--
 fs/xattr.c                                           |  8 ++++----
 include/linux/fs.h                                   |  3 ++-
 include/linux/xattr.h                                |  3 ++-
 security/smack/smack_lsm.c                           |  2 +-
 20 files changed, 59 insertions(+), 44 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting
index 46f3bb7a02f5f..a5fb89cac615c 100644
--- a/Documentation/filesystems/porting
+++ b/Documentation/filesystems/porting
@@ -578,3 +578,10 @@ in your dentry operations instead.
 --
 [mandatory]
 	->atomic_open() calls without O_CREAT may happen in parallel.
+--
+[mandatory]
+	->setxattr() and xattr_handler.set() get dentry and inode passed separately.
+	dentry might be yet to be attached to inode, so do _not_ use its ->d_inode
+	in the instances. Rationale: !@#!@# security_d_instantiate() needs to be
+	called before we attach dentry to inode and !@#!@##!@$!$#!@#$!@$!@$ smack
+	->d_instantiate() uses not just ->getxattr() but ->setxattr() as well.
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index ce1f949430f1a..3f2f30b6542c7 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -976,8 +976,8 @@ static inline __u64 ll_file_maxbytes(struct inode *inode)
 }

 /* llite/xattr.c */
-int ll_setxattr(struct dentry *dentry, const char *name,
-		const void *value, size_t size, int flags);
+int ll_setxattr(struct dentry *dentry, struct inode *inode,
+		const char *name, const void *value, size_t size, int flags);
 ssize_t ll_getxattr(struct dentry *dentry, struct inode *inode,
 		    const char *name, void *buffer, size_t size);
 ssize_t ll_listxattr(struct dentry *dentry, char *buffer, size_t size);
diff --git a/drivers/staging/lustre/lustre/llite/xattr.c b/drivers/staging/lustre/lustre/llite/xattr.c
index ed4de04381c3f..608014b0dbcd6 100644
--- a/drivers/staging/lustre/lustre/llite/xattr.c
+++ b/drivers/staging/lustre/lustre/llite/xattr.c
@@ -211,11 +211,9 @@ int ll_setxattr_common(struct inode *inode, const char *name,
 	return 0;
 }

-int ll_setxattr(struct dentry *dentry, const char *name,
-		const void *value, size_t size, int flags)
+int ll_setxattr(struct dentry *dentry, struct inode *inode,
+		const char *name, const void *value, size_t size, int flags)
 {
-	struct inode *inode = d_inode(dentry);
-
 	LASSERT(inode);
 	LASSERT(name);

diff --git a/fs/bad_inode.c b/fs/bad_inode.c
index 72e35b721608e..3ba385eaa26ee 100644
--- a/fs/bad_inode.c
+++ b/fs/bad_inode.c
@@ -100,8 +100,8 @@ static int bad_inode_setattr(struct dentry *direntry, struct iattr *attrs)
 	return -EIO;
 }

-static int bad_inode_setxattr(struct dentry *dentry, const char *name,
-		const void *value, size_t size, int flags)
+static int bad_inode_setxattr(struct dentry *dentry, struct inode *inode,
+		const char *name, const void *value, size_t size, int flags)
 {
 	return -EIO;
 }
diff --git a/fs/ecryptfs/crypto.c b/fs/ecryptfs/crypto.c
index ebd40f46ed4c4..0d8eb3455b34d 100644
--- a/fs/ecryptfs/crypto.c
+++ b/fs/ecryptfs/crypto.c
@@ -1141,12 +1141,13 @@ ecryptfs_write_metadata_to_contents(struct inode *ecryptfs_inode,

 static int
 ecryptfs_write_metadata_to_xattr(struct dentry *ecryptfs_dentry,
+				 struct inode *ecryptfs_inode,
 				 char *page_virt, size_t size)
 {
 	int rc;

-	rc = ecryptfs_setxattr(ecryptfs_dentry, ECRYPTFS_XATTR_NAME, page_virt,
-			       size, 0);
+	rc = ecryptfs_setxattr(ecryptfs_dentry, ecryptfs_inode,
+			       ECRYPTFS_XATTR_NAME, page_virt, size, 0);
 	return rc;
 }

@@ -1215,8 +1216,8 @@ int ecryptfs_write_metadata(struct dentry *ecryptfs_dentry,
 		goto out_free;
 	}
 	if (crypt_stat->flags & ECRYPTFS_METADATA_IN_XATTR)
-		rc = ecryptfs_write_metadata_to_xattr(ecryptfs_dentry, virt,
-						      size);
+		rc = ecryptfs_write_metadata_to_xattr(ecryptfs_dentry, ecryptfs_inode,
+						      virt, size);
 	else
 		rc = ecryptfs_write_metadata_to_contents(ecryptfs_inode, virt,
 							 virt_len);
diff --git a/fs/ecryptfs/ecryptfs_kernel.h b/fs/ecryptfs/ecryptfs_kernel.h
index 3ec495db7e827..4ba1547bb9adc 100644
--- a/fs/ecryptfs/ecryptfs_kernel.h
+++ b/fs/ecryptfs/ecryptfs_kernel.h
@@ -609,8 +609,8 @@ ssize_t
 ecryptfs_getxattr_lower(struct dentry *lower_dentry, struct inode *lower_inode,
 			const char *name, void *value, size_t size);
 int
-ecryptfs_setxattr(struct dentry *dentry, const char *name, const void *value,
-		  size_t size, int flags);
+ecryptfs_setxattr(struct dentry *dentry, struct inode *inode, const char *name,
+		  const void *value, size_t size, int flags);
 int ecryptfs_read_xattr_region(char *page_virt, struct inode *ecryptfs_inode);
 #ifdef CONFIG_ECRYPT_FS_MESSAGING
 int ecryptfs_process_response(struct ecryptfs_daemon *daemon,
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index 318b04689d769..9d153b6a1d723 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -1001,7 +1001,8 @@ static int ecryptfs_getattr(struct vfsmount *mnt, struct dentry *dentry,
 }

 int
-ecryptfs_setxattr(struct dentry *dentry, const char *name, const void *value,
+ecryptfs_setxattr(struct dentry *dentry, struct inode *inode,
+		  const char *name, const void *value,
 		  size_t size, int flags)
 {
 	int rc = 0;
@@ -1014,8 +1015,8 @@ ecryptfs_setxattr(struct dentry *dentry, const char *name, const void *value,
 	}

 	rc = vfs_setxattr(lower_dentry, name, value, size, flags);
-	if (!rc && d_really_is_positive(dentry))
-		fsstack_copy_attr_all(d_inode(dentry), d_inode(lower_dentry));
+	if (!rc && inode)
+		fsstack_copy_attr_all(inode, d_inode(lower_dentry));
 out:
 	return rc;
 }
diff --git a/fs/ecryptfs/mmap.c b/fs/ecryptfs/mmap.c
index 148d11b514fb4..9c3437c8a5b12 100644
--- a/fs/ecryptfs/mmap.c
+++ b/fs/ecryptfs/mmap.c
@@ -442,7 +442,8 @@ static int ecryptfs_write_inode_size_to_xattr(struct inode *ecryptfs_inode)
 	if (size < 0)
 		size = 8;
 	put_unaligned_be64(i_size_read(ecryptfs_inode), xattr_virt);
-	rc = lower_inode->i_op->setxattr(lower_dentry, ECRYPTFS_XATTR_NAME,
+	rc = lower_inode->i_op->setxattr(lower_dentry, lower_inode,
+					 ECRYPTFS_XATTR_NAME,
 					 xattr_virt, size, 0);
 	inode_unlock(lower_inode);
 	if (rc)
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index b9419058108fa..ccd4971cc6c1a 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -1719,10 +1719,10 @@ static int fuse_getattr(struct vfsmount *mnt, struct dentry *entry,
 	return fuse_update_attributes(inode, stat, NULL, NULL);
 }

-static int fuse_setxattr(struct dentry *entry, const char *name,
-			 const void *value, size_t size, int flags)
+static int fuse_setxattr(struct dentry *unused, struct inode *inode,
+			 const char *name, const void *value,
+			 size_t size, int flags)
 {
-	struct inode *inode = d_inode(entry);
 	struct fuse_conn *fc = get_fuse_conn(inode);
 	FUSE_ARGS(args);
 	struct fuse_setxattr_in inarg;
diff --git a/fs/hfs/attr.c b/fs/hfs/attr.c
index 064f92f17efc8..d9a86919fdf6e 100644
--- a/fs/hfs/attr.c
+++ b/fs/hfs/attr.c
@@ -13,10 +13,10 @@
 #include "hfs_fs.h"
 #include "btree.h"

-int hfs_setxattr(struct dentry *dentry, const char *name,
-		 const void *value, size_t size, int flags)
+int hfs_setxattr(struct dentry *unused, struct inode *inode,
+		 const char *name, const void *value,
+		 size_t size, int flags)
 {
-	struct inode *inode = d_inode(dentry);
 	struct hfs_find_data fd;
 	hfs_cat_rec rec;
 	struct hfs_cat_file *file;
diff --git a/fs/hfs/hfs_fs.h b/fs/hfs/hfs_fs.h
index fa3eed86837cf..ee2f385811c82 100644
--- a/fs/hfs/hfs_fs.h
+++ b/fs/hfs/hfs_fs.h
@@ -212,7 +212,7 @@ extern void hfs_evict_inode(struct inode *);
 extern void hfs_delete_inode(struct inode *);

 /* attr.c */
-extern int hfs_setxattr(struct dentry *dentry, const char *name,
+extern int hfs_setxattr(struct dentry *dentry, struct inode *inode, const char *name,
 			const void *value, size_t size, int flags);
 extern ssize_t hfs_getxattr(struct dentry *dentry, struct inode *inode,
 			    const char *name, void *value, size_t size);
diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
index 1719649d7ad78..63b925d5ba1e4 100644
--- a/fs/kernfs/inode.c
+++ b/fs/kernfs/inode.c
@@ -160,10 +160,11 @@ static int kernfs_node_setsecdata(struct kernfs_node *kn, void **secdata,
 	return 0;
 }

-int kernfs_iop_setxattr(struct dentry *dentry, const char *name,
-			const void *value, size_t size, int flags)
+int kernfs_iop_setxattr(struct dentry *unused, struct inode *inode,
+			const char *name, const void *value,
+			size_t size, int flags)
 {
-	struct kernfs_node *kn = dentry->d_fsdata;
+	struct kernfs_node *kn = inode->i_private;
 	struct kernfs_iattrs *attrs;
 	void *secdata;
 	int error;
@@ -175,11 +176,11 @@ int kernfs_iop_setxattr(struct dentry *dentry, const char *name,

 	if (!strncmp(name, XATTR_SECURITY_PREFIX, XATTR_SECURITY_PREFIX_LEN)) {
 		const char *suffix = name + XATTR_SECURITY_PREFIX_LEN;
-		error = security_inode_setsecurity(d_inode(dentry), suffix,
+		error = security_inode_setsecurity(inode, suffix,
 						value, size, flags);
 		if (error)
 			return error;
-		error = security_inode_getsecctx(d_inode(dentry),
+		error = security_inode_getsecctx(inode,
 						&secdata, &secdata_len);
 		if (error)
 			return error;
diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
index 45c9192c276e4..37159235ac109 100644
--- a/fs/kernfs/kernfs-internal.h
+++ b/fs/kernfs/kernfs-internal.h
@@ -81,7 +81,8 @@ int kernfs_iop_permission(struct inode *inode, int mask);
 int kernfs_iop_setattr(struct dentry *dentry, struct iattr *iattr);
 int kernfs_iop_getattr(struct vfsmount *mnt, struct dentry *dentry,
 		       struct kstat *stat);
-int kernfs_iop_setxattr(struct dentry *dentry, const char *name, const void *value,
+int kernfs_iop_setxattr(struct dentry *dentry, struct inode *inode,
+			const char *name, const void *value,
 			size_t size, int flags);
 int kernfs_iop_removexattr(struct dentry *dentry, const char *name);
 ssize_t kernfs_iop_getxattr(struct dentry *dentry, struct inode *inode,
diff --git a/fs/libfs.c b/fs/libfs.c
index 8765ff1adc07d..3db2721144c27 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -1118,8 +1118,9 @@ static int empty_dir_setattr(struct dentry *dentry, struct iattr *attr)
 	return -EPERM;
 }

-static int empty_dir_setxattr(struct dentry *dentry, const char *name,
-			      const void *value, size_t size, int flags)
+static int empty_dir_setxattr(struct dentry *dentry, struct inode *inode,
+			      const char *name, const void *value,
+			      size_t size, int flags)
 {
 	return -EOPNOTSUPP;
 }
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index c7b31a03dc9cf..0ed7c40124378 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -210,8 +210,9 @@ static bool ovl_is_private_xattr(const char *name)
 	return strncmp(name, OVL_XATTR_PRE_NAME, OVL_XATTR_PRE_LEN) == 0;
 }

-int ovl_setxattr(struct dentry *dentry, const char *name,
-		 const void *value, size_t size, int flags)
+int ovl_setxattr(struct dentry *dentry, struct inode *inode,
+		 const char *name, const void *value,
+		 size_t size, int flags)
 {
 	int err;
 	struct dentry *upperdentry;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 99ec4b0352371..d79577eb3937b 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -171,8 +171,9 @@ int ovl_check_d_type_supported(struct path *realpath);
 /* inode.c */
 int ovl_setattr(struct dentry *dentry, struct iattr *attr);
 int ovl_permission(struct inode *inode, int mask);
-int ovl_setxattr(struct dentry *dentry, const char *name,
-		 const void *value, size_t size, int flags);
+int ovl_setxattr(struct dentry *dentry, struct inode *inode,
+		 const char *name, const void *value,
+		 size_t size, int flags);
 ssize_t ovl_getxattr(struct dentry *dentry, struct inode *inode,
 		     const char *name, void *value, size_t size);
 ssize_t ovl_listxattr(struct dentry *dentry, char *list, size_t size);
diff --git a/fs/xattr.c b/fs/xattr.c
index b16d078897000..4beafc43daa58 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -100,7 +100,7 @@ int __vfs_setxattr_noperm(struct dentry *dentry, const char *name,
 	if (issec)
 		inode->i_flags &= ~S_NOSEC;
 	if (inode->i_op->setxattr) {
-		error = inode->i_op->setxattr(dentry, name, value, size, flags);
+		error = inode->i_op->setxattr(dentry, inode, name, value, size, flags);
 		if (!error) {
 			fsnotify_xattr(dentry);
 			security_inode_post_setxattr(dentry, name, value,
@@ -745,7 +745,8 @@ generic_listxattr(struct dentry *dentry, char *buffer, size_t buffer_size)
  * Find the handler for the prefix and dispatch its set() operation.
  */
 int
-generic_setxattr(struct dentry *dentry, const char *name, const void *value, size_t size, int flags)
+generic_setxattr(struct dentry *dentry, struct inode *inode, const char *name,
+		 const void *value, size_t size, int flags)
 {
 	const struct xattr_handler *handler;

@@ -754,8 +755,7 @@ generic_setxattr(struct dentry *dentry, const char *name, const void *value, siz
 	handler = xattr_resolve_name(dentry->d_sb->s_xattr, &name);
 	if (IS_ERR(handler))
 		return PTR_ERR(handler);
-	return handler->set(handler, dentry, d_inode(dentry), name, value,
-			    size, flags);
+	return handler->set(handler, dentry, inode, name, value, size, flags);
 }

 /*
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 5f61431d86739..62bdb0a6cf2d0 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1730,7 +1730,8 @@ struct inode_operations {
 			struct inode *, struct dentry *, unsigned int);
 	int (*setattr) (struct dentry *, struct iattr *);
 	int (*getattr) (struct vfsmount *mnt, struct dentry *, struct kstat *);
-	int (*setxattr) (struct dentry *, const char *,const void *,size_t,int);
+	int (*setxattr) (struct dentry *, struct inode *,
+			 const char *, const void *, size_t, int);
 	ssize_t (*getxattr) (struct dentry *, struct inode *,
 			     const char *, void *, size_t);
 	ssize_t (*listxattr) (struct dentry *, char *, size_t);
diff --git a/include/linux/xattr.h b/include/linux/xattr.h
index 76beb206741ab..94079bab92434 100644
--- a/include/linux/xattr.h
+++ b/include/linux/xattr.h
@@ -54,7 +54,8 @@ int vfs_removexattr(struct dentry *, const char *);

 ssize_t generic_getxattr(struct dentry *dentry, struct inode *inode, const char *name, void *buffer, size_t size);
 ssize_t generic_listxattr(struct dentry *dentry, char *buffer, size_t buffer_size);
-int generic_setxattr(struct dentry *dentry, const char *name, const void *value, size_t size, int flags);
+int generic_setxattr(struct dentry *dentry, struct inode *inode,
+		     const char *name, const void *value, size_t size, int flags);
 int generic_removexattr(struct dentry *dentry, const char *name);
 ssize_t vfs_getxattr_alloc(struct dentry *dentry, const char *name,
 			   char **xattr_value, size_t size, gfp_t flags);
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index ff2b8c3cf7a9c..6777295f4b2b7 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -3514,7 +3514,7 @@ static void smack_d_instantiate(struct dentry *opt_dentry, struct inode *inode)
 			 */
 			if (isp->smk_flags & SMK_INODE_CHANGED) {
 				isp->smk_flags &= ~SMK_INODE_CHANGED;
-				rc = inode->i_op->setxattr(dp,
+				rc = inode->i_op->setxattr(dp, inode,
 					XATTR_NAME_SMACKTRANSMUTE,
 					TRANS_TRUE, TRANS_TRUE_SIZE,
 					0);
-- 
cgit v1.2.3