summaryrefslogtreecommitdiffstats
path: root/fs/f2fs/dir.c
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'for-f2fs-3.11' of ↵Linus Torvalds2013-07-021-47/+62
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "This patch-set includes the following major enhancement patches: - remount_fs callback function - restore parent inode number to enhance the fsync performance - xattr security labels - reduce the number of redundant lock/unlock data pages - avoid frequent write_inode calls The other minor bug fixes are as follows. - endian conversion bugs - various bugs in the roll-forward recovery routine" * tag 'for-f2fs-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (56 commits) f2fs: fix to recover i_size from roll-forward f2fs: remove the unused argument "sbi" of func destroy_fsync_dnodes() f2fs: remove reusing any prefree segments f2fs: code cleanup and simplify in func {find/add}_gc_inode f2fs: optimize the init_dirty_segmap function f2fs: fix an endian conversion bug detected by sparse f2fs: fix crc endian conversion f2fs: add remount_fs callback support f2fs: recover wrong pino after checkpoint during fsync f2fs: optimize do_write_data_page() f2fs: make locate_dirty_segment() as static f2fs: remove unnecessary parameter "offset" from __add_sum_entry() f2fs: avoid freqeunt write_inode calls f2fs: optimise the truncate_data_blocks_range() range f2fs: use the F2FS specific flags in f2fs_ioctl() f2fs: sync dir->i_size with its block allocation f2fs: fix i_blocks translation on various types of files f2fs: set sb->s_fs_info before calling parse_options() f2fs: support xattr security labels f2fs: fix iget/iput of dir during recovery ...
| * f2fs: recover wrong pino after checkpoint during fsyncJaegeuk Kim2013-06-141-1/+1
| | | | | | | | | | | | | | | | | | | | If a file is linked, f2fs loose its parent inode number so that fsync calls for the linked file should do checkpoint all the time. But, if we can recover its parent inode number after the checkpoint, we can adjust roll-forward mechanism for the further fsync calls, which is able to improve the fsync performance significatly. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: sync dir->i_size with its block allocationJaegeuk Kim2013-06-121-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If new dentry block is allocated and its i_size is updated, we should update its inode block together in order to sync i_size and its block allocation. Otherwise, we can loose additional dentry block due to the unconsistent i_size. Errorneous Scenario ------------------- In the recovery routine, - recovery_dentry | - __f2fs_add_link | | - get_new_data_page | | | - i_size_write(new_i_size) | | | - mark_inode_dirty_sync(dir) | | - update_parent_metadata | | | - mark_inode_dirty(dir) | - write_checkpoint - sync_dirty_dir_inodes - filemap_flush(dentry_blocks) - f2fs_write_data_page - skip to write the last dentry block due to index < i_size In the above flow, new_i_size is not updated to its inode block so that the last dentry block will be lost accordingly. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: support xattr security labelsJaegeuk Kim2013-06-111-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the support of security labels for f2fs, which will be used by Linus Security Models (LSMs). Quote from http://en.wikipedia.org/wiki/Linux_Security_Modules: "Linux Security Modules (LSM) is a framework that allows the Linux kernel to support a variety of computer security models while avoiding favoritism toward any single security implementation. The framework is licensed under the terms of the GNU General Public License and is standard part of the Linux kernel since Linux 2.6. AppArmor, SELinux, Smack and TOMOYO Linux are the currently accepted modules in the official kernel.". Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: cover cp_file information with ilockJaegeuk Kim2013-05-281-1/+7
| | | | | | | | | | | | | | | | | | | | If a file is linked with other files, it should be checkpointed at every fsync calls. For this, we use set_cp_file() with FADVISE_CP_BIT, but previously we didn't cover the flag by the global lock. This patch fixes that the inode page stores this correctly. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: remove unneeded initializations in f2fs_parent_dirNamjae Jeon2013-05-281-3/+3
| | | | | | | | | | | | | | | | | | | | There is no need to initialize few pointers in f2fs_parent_dir as the values are not checked and instead directly initialized values are used. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: update inode page after creationJaegeuk Kim2013-05-281-40/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I found a bug when testing power-off-recovery as follows. [Bug Scenario] 1. create a file 2. fsync the file 3. reboot w/o any sync 4. try to recover the file - found its fsync mark - found its dentry mark : try to recover its dentry - get its file name - get its parent inode number : here we got zero value The reason why we get the wrong parent inode number is that we didn't synchronize the inode page with its newly created inode information perfectly. Especially, previous f2fs stores fi->i_pino and writes it to the cached node page in a wrong order, which incurs the zero-valued i_pino during the recovery. So, this patch modifies the creation flow to fix the synchronization order of inode page with its inode. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: change get_new_data_page to pass a locked node pageJaegeuk Kim2013-05-281-2/+2
| | | | | | | | | | | | This patch is for passing a locked node page to get_dnode_of_data. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* | [readdir] convert f2fsAl Viro2013-06-291-22/+14
|/ | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Merge tag 'f2fs-for-v3.10' of ↵Linus Torvalds2013-05-081-58/+52
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "This patch-set includes the following major enhancement patches. - introduce a new gloabl lock scheme - add tracepoints on several major functions - fix the overall cleaning process focused on victim selection - apply the block plugging to merge IOs as much as possible - enhance management of free nids and its list - enhance the readahead mode for node pages - address several cretical deadlock conditions - reduce lock_page calls The other minor bug fixes and enhancements are as follows. - calculation mistakes: overflow - bio types: READ, READA, and READ_SYNC - fix the recovery flow, data races, and null pointer errors" * tag 'f2fs-for-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (68 commits) f2fs: cover free_nid management with spin_lock f2fs: optimize scan_nat_page() f2fs: code cleanup for scan_nat_page() and build_free_nids() f2fs: bugfix for alloc_nid_failed() f2fs: recover when journal contains deleted files f2fs: continue to mount after failing recovery f2fs: avoid deadlock during evict after f2fs_gc f2fs: modify the number of issued pages to merge IOs f2fs: remove useless #include <linux/proc_fs.h> as we're now using sysfs as debug entry. f2fs: fix inconsistent using of NM_WOUT_THRESHOLD f2fs: check truncation of mapping after lock_page f2fs: enhance alloc_nid and build_free_nids flows f2fs: add a tracepoint on f2fs_new_inode f2fs: check nid == 0 in add_free_nid f2fs: add REQ_META about metadata requests for submit f2fs: give a chance to merge IOs by IO scheduler f2fs: avoid frequent background GC f2fs: add tracepoints to debug checkpoint request f2fs: add tracepoints for write page operations f2fs: add tracepoints to debug the block allocation ...
| * f2fs: give a chance to merge IOs by IO schedulerJaegeuk Kim2013-04-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, background GC submits many 4KB read requests to load victim blocks and/or its (i)node blocks. ... f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb61, blkaddr = 0x3b964ed f2fs_gc : block_rq_complete: 8,16 R () 499854968 + 8 [0] f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb6f, blkaddr = 0x3b964ee f2fs_gc : block_rq_complete: 8,16 R () 499854976 + 8 [0] f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb79, blkaddr = 0x3b964ef f2fs_gc : block_rq_complete: 8,16 R () 499854984 + 8 [0] ... However, by the fact that many IOs are sequential, we can give a chance to merge the IOs by IO scheduler. In order to do that, let's use blk_plug. ... f2fs_gc : f2fs_iget: ino = 143 f2fs_gc : f2fs_readpage: ino = 143, page_index = 0x1c6, blkaddr = 0x2e6ee f2fs_gc : f2fs_iget: ino = 143 f2fs_gc : f2fs_readpage: ino = 143, page_index = 0x1c7, blkaddr = 0x2e6ef <idle> : block_rq_complete: 8,16 R () 1519616 + 8 [0] <idle> : block_rq_complete: 8,16 R () 1519848 + 8 [0] <idle> : block_rq_complete: 8,16 R () 1520432 + 96 [0] <idle> : block_rq_complete: 8,16 R () 1520536 + 104 [0] <idle> : block_rq_complete: 8,16 R () 1521008 + 112 [0] <idle> : block_rq_complete: 8,16 R () 1521440 + 152 [0] <idle> : block_rq_complete: 8,16 R () 1521688 + 144 [0] <idle> : block_rq_complete: 8,16 R () 1522128 + 192 [0] <idle> : block_rq_complete: 8,16 R () 1523256 + 328 [0] ... Note that this issue should be addressed in checkpoint, and some readahead flows too. Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: introduce a new global lock schemeJaegeuk Kim2013-04-091-57/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the previous version, f2fs uses global locks according to the usage types, such as directory operations, block allocation, block write, and so on. Reference the following lock types in f2fs.h. enum lock_type { RENAME, /* for renaming operations */ DENTRY_OPS, /* for directory operations */ DATA_WRITE, /* for data write */ DATA_NEW, /* for data allocation */ DATA_TRUNC, /* for data truncate */ NODE_NEW, /* for node allocation */ NODE_TRUNC, /* for node truncate */ NODE_WRITE, /* for node write */ NR_LOCK_TYPE, }; In that case, we lose the performance under the multi-threading environment, since every types of operations must be conducted one at a time. In order to address the problem, let's share the locks globally with a mutex array regardless of any types. So, let users grab a mutex and perform their jobs in parallel as much as possbile. For this, I propose a new global lock scheme as follows. 0. Data structure - f2fs_sb_info -> mutex_lock[NR_GLOBAL_LOCKS] - f2fs_sb_info -> node_write 1. mutex_lock_op(sbi) - try to get an avaiable lock from the array. - returns the index of the gottern lock variable. 2. mutex_unlock_op(sbi, index of the lock) - unlock the given index of the lock. 3. mutex_lock_all(sbi) - grab all the locks in the array before the checkpoint. 4. mutex_unlock_all(sbi) - release all the locks in the array after checkpoint. 5. block_operations() - call mutex_lock_all() - sync_dirty_dir_inodes() - grab node_write - sync_node_pages() Note that, the pairs of mutex_lock_op()/mutex_unlock_op() and mutex_lock_all()/mutex_unlock_all() should be used together. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: align f2fs maximum name length to linux based filesystemJaegeuk Kim2013-03-181-0/+3
| | | | | | | | | | | | | | | | | | The maximum filename length supported in linux is 255 characters. So let's follow that. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* | mode_t, whack-a-mole at 11...Al Viro2013-04-091-1/+1
|/ | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Merge branch 'for-linus' of ↵Linus Torvalds2013-02-261-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile (part one) from Al Viro: "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent locking violations, etc. The most visible changes here are death of FS_REVAL_DOT (replaced with "has ->d_weak_revalidate()") and a new helper getting from struct file to inode. Some bits of preparation to xattr method interface changes. Misc patches by various people sent this cycle *and* ocfs2 fixes from several cycles ago that should've been upstream right then. PS: the next vfs pile will be xattr stuff." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits) saner proc_get_inode() calling conventions proc: avoid extra pde_put() in proc_fill_super() fs: change return values from -EACCES to -EPERM fs/exec.c: make bprm_mm_init() static ocfs2/dlm: use GFP_ATOMIC inside a spin_lock ocfs2: fix possible use-after-free with AIO ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero target: writev() on single-element vector is pointless export kernel_write(), convert open-coded instances fs: encode_fh: return FILEID_INVALID if invalid fid_type kill f_vfsmnt vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op nfsd: handle vfs_getattr errors in acl protocol switch vfs_getattr() to struct path default SET_PERSONALITY() in linux/elf.h ceph: prepopulate inodes only when request is aborted d_hash_and_lookup(): export, switch open-coded instances 9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate() 9p: split dropping the acls from v9fs_set_create_acl() ...
| * new helper: file_inode(file)Al Viro2013-02-221-1/+1
| | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | Merge branch 'f2fs' of ↵Jaegeuk Kim2013-02-121-16/+13
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs into dev Pull f2fs cleanup patches from Al Viro: f2fs: get rid of fake on-stack dentries f2fs: switch init_inode_metadata() to passing parent and name separately f2fs: switch new_inode_page() from dentry to qstr f2fs: init_dent_inode() should take qstr Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com> Conflicts: fs/f2fs/recovery.c
| * | f2fs: get rid of fake on-stack dentriesAl Viro2013-02-081-7/+5
| | | | | | | | | | | | | | | | | | those should never be used for a lot of reasons... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | f2fs: switch init_inode_metadata() to passing parent and name separatelyAl Viro2013-02-081-6/+5
| | | | | | | | | | | | | | | | | | | | | ... sure, it's tempting to just pass dentry. Except that we don't _have_ anything resembling a real dentry on one of the paths to it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | f2fs: switch new_inode_page() from dentry to qstrAl Viro2013-02-081-1/+1
| | | | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | f2fs: init_dent_inode() should take qstrAl Viro2013-02-081-5/+5
| |/ | | | | | | | | | | | | | | for one thing, it doesn't (and shouldn't) use anything else from dentry; for another, on some call chains the dentry is fake and should be eliminated completely. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* / f2fs: avoid redundant time update for parent directory in f2fs_delete_entryNamjae Jeon2013-01-141-1/+1
|/ | | | | | | | | | In call to f2fs_delete_entry, 'dir' time modification code is put at two places. So, remove the redundant code for timing update. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: unify string length declarations and usageLeon Romanovsky2012-12-281-5/+5
| | | | | | | | | | This patch is intended to unify string length declarations and usage. There are number of calls to strlen which return size_t object. The size of this object depends on compiler if it will be bigger, equal or even smaller than an unsigned int Signed-off-by: Leon Romanovsky <leon@leon.nu> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: fix handling errors got by f2fs_write_inodeJaegeuk Kim2012-12-261-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | Ruslan reported that f2fs hangs with an infinite loop in f2fs_sync_file(): while (sync_node_pages(sbi, inode->i_ino, &wbc) == 0) f2fs_write_inode(inode, NULL); The reason was revealed that the cold flag is not set even thought this inode is a normal file. Therefore, sync_node_pages() skips to write node blocks since it only writes cold node blocks. The cold flag is stored to the node_footer in node block, and whenever a new node page is allocated, it is set according to its file type, file or directory. But, after sudden-power-off, when recovering the inode page, f2fs doesn't recover its cold flag. So, let's assign the cold flag in more right places. One more thing: If f2fs_write_inode() returns an error due to whatever situations, there would be no dirty node pages so that sync_node_pages() returns zero. (i.e., zero means nothing was written.) Reported-by: Ruslan N. Marchenko <me@ruff.mobi> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: fix up f2fs_get_parent issue to retrieve correct parent inode numberNamjae Jeon2012-12-261-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | Test Case: [NFS Client] ls -lR . [NFS Server] while [ 1 ] do echo 3 > /proc/sys/vm/drop_caches done Error on NFS Client: "No such file or directory" When cache is dropped at the server, it results in lookup failure at the NFS client due to non-connection with the parent. The default path is it initiates a lookup by calculating the hash value for the name, even though the hash values stored on the disk for "." and ".." is maintained as zero, which results in failure from find_in_block due to not matching HASH values. Fix up, by using the correct hashing values for these entries. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: fix tracking parent inode numberJaegeuk Kim2012-12-111-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | Previously, f2fs didn't track the parent inode number correctly which is stored in each f2fs_inode. In the case of the following scenario, a bug can be occured. Let's suppose there are one directory, "/b", and two files, "/a" and "/b/a". - pino of "/a" is ROOT_INO. - pino of "/b/a" is DIR_B_INO. Then, # sync : The inode pages of "/a" and "/b/a" contain the parent inode numbers as ROOT_INO and DIR_B_INO respectively. # mv /a /b/a : The parent inode number of "/a" should be changed to DIR_B_INO, but f2fs didn't do that. Ref. f2fs_set_link(). In order to fix this clearly, I added i_pino in f2fs_inode_info, and whenever it needs to be changed like in f2fs_add_link() and f2fs_set_link(), it is updated temporarily in f2fs_inode_info. And later, f2fs_write_inode() stores the latest information to the inode pages. For power-off-recovery, f2fs_sync_file() triggers simply f2fs_write_inode(). Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: introduce accessor to retrieve number of dentry slotsNamjae Jeon2012-12-111-8/+5
| | | | | | | | Simplify code by providing the accessor macro to retrieve the number of dentry slots for a given filename length. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
* f2fs: remove redundant call to f2fs_put_page in delete entryNamjae Jeon2012-12-111-3/+2
| | | | | | | | | Since, we anyway need to put the page after deleting entry. So, there is no need to make same call under different conditions. Move out the f2fs_put_page from the two conditions and call at once. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
* f2fs: remove unused variableWei Yongjun2012-12-111-2/+0
| | | | | | | The variables node_page and page_offset are initialized but never used otherwise, so remove those unused variables. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
* f2fs: adjust kernel coding styleJaegeuk Kim2012-12-111-2/+2
| | | | | | | As pointed out by Randy Dunlap, this patch removes all usage of "/**" for comment blocks. Instead, just use "/*". Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: fix endian conversion bugs reported by sparseJaegeuk Kim2012-12-111-4/+4
| | | | | | | | | | | | | This patch should resolve the bugs reported by the sparse tool. Initial reports were written by "kbuild test robot" managed by fengguang.wu. In my local machines, I've tested also by running: > make C=2 CF="-D__CHECK_ENDIAN__" Accordingly, I've found lots of warnings and bugs related to the endian conversion. And I've fixed all at this moment. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: add core directory operationsJaegeuk Kim2012-12-111-0/+672
this adds core functions to find, add, delete, and link dentries. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>