summaryrefslogtreecommitdiffstats
path: root/drivers/scsi/sg.c
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'hch.procfs' of ↵Linus Torvalds2018-06-041-112/+12
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull procfs updates from Al Viro: "Christoph's proc_create_... cleanups series" * 'hch.procfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (44 commits) xfs, proc: hide unused xfs procfs helpers isdn/gigaset: add back gigaset_procinfo assignment proc: update SIZEOF_PDE_INLINE_NAME for the new pde fields tty: replace ->proc_fops with ->proc_show ide: replace ->proc_fops with ->proc_show ide: remove ide_driver_proc_write isdn: replace ->proc_fops with ->proc_show atm: switch to proc_create_seq_private atm: simplify procfs code bluetooth: switch to proc_create_seq_data netfilter/x_tables: switch to proc_create_seq_private netfilter/xt_hashlimit: switch to proc_create_{seq,single}_data neigh: switch to proc_create_seq_data hostap: switch to proc_create_{seq,single}_data bonding: switch to proc_create_seq_data rtc/proc: switch to proc_create_single_data drbd: switch to proc_create_single resource: switch to proc_create_seq_data staging/rtl8192u: simplify procfs code jfs: simplify procfs code ...
| * sg: simplify procfs codeChristoph Hellwig2018-05-161-112/+12
| | | | | | | | | | | | | | | | | | | | | | | | Use remove_proc_subtree to remove the whole subtree on cleanup, and unwind the registration loop into individual calls. Switch to use proc_create_seq where applicable. Also don't bother handling proc_create* failures - the driver works perfectly fine without the proc files, and the cleanup will handle missing files gracefully. Signed-off-by: Christoph Hellwig <hch@lst.de>
* | Merge tag 'for-4.18/block-20180603' of git://git.kernel.dk/linux-blockLinus Torvalds2018-06-041-1/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block updates from Jens Axboe: - clean up how we pass around gfp_t and blk_mq_req_flags_t (Christoph) - prepare us to defer scheduler attach (Christoph) - clean up drivers handling of bounce buffers (Christoph) - fix timeout handling corner cases (Christoph/Bart/Keith) - bcache fixes (Coly) - prep work for bcachefs and some block layer optimizations (Kent). - convert users of bio_sets to using embedded structs (Kent). - fixes for the BFQ io scheduler (Paolo/Davide/Filippo) - lightnvm fixes and improvements (Matias, with contributions from Hans and Javier) - adding discard throttling to blk-wbt (me) - sbitmap blk-mq-tag handling (me/Omar/Ming). - remove the sparc jsflash block driver, acked by DaveM. - Kyber scheduler improvement from Jianchao, making it more friendly wrt merging. - conversion of symbolic proc permissions to octal, from Joe Perches. Previously the block parts were a mix of both. - nbd fixes (Josef and Kevin Vigor) - unify how we handle the various kinds of timestamps that the block core and utility code uses (Omar) - three NVMe pull requests from Keith and Christoph, bringing AEN to feature completeness, file backed namespaces, cq/sq lock split, and various fixes - various little fixes and improvements all over the map * tag 'for-4.18/block-20180603' of git://git.kernel.dk/linux-block: (196 commits) blk-mq: update nr_requests when switching to 'none' scheduler block: don't use blocking queue entered for recursive bio submits dm-crypt: fix warning in shutdown path lightnvm: pblk: take bitmap alloc. out of critical section lightnvm: pblk: kick writer on new flush points lightnvm: pblk: only try to recover lines with written smeta lightnvm: pblk: remove unnecessary bio_get/put lightnvm: pblk: add possibility to set write buffer size manually lightnvm: fix partial read error path lightnvm: proper error handling for pblk_bio_add_pages lightnvm: pblk: fix smeta write error path lightnvm: pblk: garbage collect lines with failed writes lightnvm: pblk: rework write error recovery path lightnvm: pblk: remove dead function lightnvm: pass flag on graceful teardown to targets lightnvm: pblk: check for chunk size before allocating it lightnvm: pblk: remove unnecessary argument lightnvm: pblk: remove unnecessary indirection lightnvm: pblk: return NVM_ error on failed submission lightnvm: pblk: warn in case of corrupted write buffer ...
| * | block: sanitize blk_get_request calling conventionsChristoph Hellwig2018-05-141-1/+1
| |/ | | | | | | | | | | | | | | | | Switch everyone to blk_get_request_flags, and then rename blk_get_request_flags to blk_get_request. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* / scsi: sg: allocate with __GFP_ZERO in sg_build_indirect()Alexander Potapenko2018-05-181-1/+1
|/ | | | | | | | | | | | This shall help avoid copying uninitialized memory to the userspace when calling ioctl(fd, SG_IO) with an empty command. Reported-by: syzbot+7d26fc1eea198488deab@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* vfs: do bulk POLL* -> EPOLL* replacementLinus Torvalds2018-02-111-6/+6
| | | | | | | | | | | | | | | | | | | | | | | This is the mindless scripted replacement of kernel use of POLL* variables as described by Al, done by this script: for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'` for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done done with de-mangling cleanups yet to come. NOTE! On almost all architectures, the EPOLL* constants have the same values as the POLL* constants do. But they keyword here is "almost". For various bad reasons they aren't the same, and epoll() doesn't actually work quite correctly in some cases due to this on Sparc et al. The next patch from Al will sort out the final differences, and we should be all done. Scripted-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* debugging printk in sg_poll() uses %x to print POLL... bitmapAl Viro2017-11-281-1/+1
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* the rest of drivers/*: annotate ->poll() instancesAl Viro2017-11-281-2/+2
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Merge branch 'for-4.15/block' of git://git.kernel.dk/linux-blockLinus Torvalds2017-11-141-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull core block layer updates from Jens Axboe: "This is the main pull request for block storage for 4.15-rc1. Nothing out of the ordinary in here, and no API changes or anything like that. Just various new features for drivers, core changes, etc. In particular, this pull request contains: - A patch series from Bart, closing the whole on blk/scsi-mq queue quescing. - A series from Christoph, building towards hidden gendisks (for multipath) and ability to move bio chains around. - NVMe - Support for native multipath for NVMe (Christoph). - Userspace notifications for AENs (Keith). - Command side-effects support (Keith). - SGL support (Chaitanya Kulkarni) - FC fixes and improvements (James Smart) - Lots of fixes and tweaks (Various) - bcache - New maintainer (Michael Lyle) - Writeback control improvements (Michael) - Various fixes (Coly, Elena, Eric, Liang, et al) - lightnvm updates, mostly centered around the pblk interface (Javier, Hans, and Rakesh). - Removal of unused bio/bvec kmap atomic interfaces (me, Christoph) - Writeback series that fix the much discussed hundreds of millions of sync-all units. This goes all the way, as discussed previously (me). - Fix for missing wakeup on writeback timer adjustments (Yafang Shao). - Fix laptop mode on blk-mq (me). - {mq,name} tupple lookup for IO schedulers, allowing us to have alias names. This means you can use 'deadline' on both !mq and on mq (where it's called mq-deadline). (me). - blktrace race fix, oopsing on sg load (me). - blk-mq optimizations (me). - Obscure waitqueue race fix for kyber (Omar). - NBD fixes (Josef). - Disable writeback throttling by default on bfq, like we do on cfq (Luca Miccio). - Series from Ming that enable us to treat flush requests on blk-mq like any other request. This is a really nice cleanup. - Series from Ming that improves merging on blk-mq with schedulers, getting us closer to flipping the switch on scsi-mq again. - BFQ updates (Paolo). - blk-mq atomic flags memory ordering fixes (Peter Z). - Loop cgroup support (Shaohua). - Lots of minor fixes from lots of different folks, both for core and driver code" * 'for-4.15/block' of git://git.kernel.dk/linux-block: (294 commits) nvme: fix visibility of "uuid" ns attribute blk-mq: fixup some comment typos and lengths ide: ide-atapi: fix compile error with defining macro DEBUG blk-mq: improve tag waiting setup for non-shared tags brd: remove unused brd_mutex blk-mq: only run the hardware queue if IO is pending block: avoid null pointer dereference on null disk fs: guard_bio_eod() needs to consider partitions xtensa/simdisk: fix compile error nvme: expose subsys attribute to sysfs nvme: create 'slaves' and 'holders' entries for hidden controllers block: create 'slaves' and 'holders' entries for hidden gendisks nvme: also expose the namespace identification sysfs files for mpath nodes nvme: implement multipath access to nvme subsystems nvme: track shared namespaces nvme: introduce a nvme_ns_ids structure nvme: track subsystems block, nvme: Introduce blk_mq_req_flags_t block, scsi: Make SCSI quiesce and resume work reliably block: Add the QUEUE_FLAG_PREEMPT_ONLY request queue flag ...
| * block: pass full fmode_t to blk_verify_commandChristoph Hellwig2017-11-101-1/+1
| | | | | | | | | | | | | | Use the obvious calling convention. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | scsi: sg: Re-fix off by one in sg_fill_request_table()Ben Hutchings2017-10-171-1/+1
|/ | | | | | | | | | | | | | | | Commit 109bade9c625 ("scsi: sg: use standard lists for sg_requests") introduced an off-by-one error in sg_ioctl(), which was fixed by commit bd46fc406b30 ("scsi: sg: off by one in sg_ioctl()"). Unfortunately commit 4759df905a47 ("scsi: sg: factor out sg_fill_request_table()") moved that code, and reintroduced the bug (perhaps due to a botched rebase). Fix it again. Fixes: 4759df905a47 ("scsi: sg: factor out sg_fill_request_table()") Cc: stable@vger.kernel.org Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: sg: fixup infoleak when using SG_GET_REQUEST_TABLEHannes Reinecke2017-09-151-3/+2
| | | | | | | | | | | | | When calling SG_GET_REQUEST_TABLE ioctl only a half-filled table is returned; the remaining part will then contain stale kernel memory information. This patch zeroes out the entire table to avoid this issue. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: sg: factor out sg_fill_request_table()Hannes Reinecke2017-09-151-26/+35
| | | | | | | | | | | Factor out sg_fill_request_table() for better readability. [mkp: typos, applied by hand] Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* Merge branch 'fixes' into miscJames Bottomley2017-09-071-31/+2
|\
| * scsi: sg: off by one in sg_ioctl()Dan Carpenter2017-08-221-1/+1
| | | | | | | | | | | | | | | | | | | | If "val" is SG_MAX_QUEUE then we are one element beyond the end of the "rinfo" array so the > should be >=. Fixes: 109bade9c625 ("scsi: sg: use standard lists for sg_requests") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: sg: only check for dxfer_len greater than 256MJohannes Thumshirn2017-07-271-30/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Don't make any assumptions on the sg_io_hdr_t::dxfer_direction or the sg_io_hdr_t::dxferp in order to determine if it is a valid request. The only way we can check for bad requests is by checking if the length exceeds 256M. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Fixes: 28676d869bbb (scsi: sg: check for valid direction before starting the request) Reported-by: Jason L Tibbitts III <tibbs@math.uh.edu> Tested-by: Jason L Tibbitts III <tibbs@math.uh.edu> Suggested-by: Doug Gilbert <dgilbert@interlog.com> Cc: Doug Gilbert <dgilbert@interlog.com> Cc: <stable@vger.kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | scsi: sg: Fix type of last blk_trace_setup() argumentBart Van Assche2017-08-251-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Avoid that sparse reports the following: drivers/scsi/sg.c:1114:41: warning: incorrect type in argument 5 (different address spaces) drivers/scsi/sg.c:1114:41: expected char [noderef] <asn:1>*arg drivers/scsi/sg.c:1114:41: got char *<noident> This patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | scsi: sg: protect against races between mmap() and SG_SET_RESERVED_SIZETodd Poynor2017-08-241-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Take f_mutex around mmap() processing to protect against races with the SG_SET_RESERVED_SIZE ioctl. Ensure the reserve buffer length remains consistent during the mapping operation, and set the "mmap called" flag to prevent further changes to the reserved buffer size as an atomic operation with the mapping. [mkp: fixed whitespace] Signed-off-by: Todd Poynor <toddpoynor@google.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | scsi: sg: recheck MMAP_IO request length with lock heldTodd Poynor2017-08-241-2/+5
|/ | | | | | | | | | | | | | | | | | | Commit 1bc0eb044615 ("scsi: sg: protect accesses to 'reserved' page array") adds needed concurrency protection for the "reserve" buffer. Some checks that are initially made outside the lock are replicated once the lock is taken to ensure the checks and resulting decisions are made using consistent state. The check that a request with flag SG_FLAG_MMAP_IO set fits in the reserve buffer also needs to be performed again under the lock to ensure the reserve buffer length compared against matches the value in effect when the request is linked to the reserve buffer. An -ENOMEM should be returned in this case, instead of switching over to an indirect buffer as for non-MMAP_IO requests. Signed-off-by: Todd Poynor <toddpoynor@google.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: sg: fix static checker warning in sg_is_valid_dxferJohannes Thumshirn2017-07-171-2/+5
| | | | | | | | | | | | | | | | dxfer_len is an unsigned int and we always assign a value > 0 to it, so it doesn't make any sense to check if it is < 0. We can't really check dxferp as well as we have both NULL and not NULL cases in the possible call paths. So just return true for SG_DXFER_FROM_DEV transfer in sg_is_valid_dxfer(). Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reported-by: Colin Ian King <colin.king@canonical.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: sg: fix SG_DXFER_FROM_DEV transfersJohannes Thumshirn2017-07-121-1/+4
| | | | | | | | | | | | | | | | | | SG_DXFER_FROM_DEV transfers do not necessarily have a dxferp as we set it to NULL for the old sg_io read/write interface, but must have a length bigger than 0. This fixes a regression introduced by commit 28676d869bbb ("scsi: sg: check for valid direction before starting the request") Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Fixes: 28676d869bbb ("scsi: sg: check for valid direction before starting the request") Reported-by: Chris Clayton <chris2553@googlemail.com> Tested-by: Chris Clayton <chris2553@googlemail.com> Cc: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Chris Clayton <chris2553@googlemail.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* block: Make most scsi_req_init() calls implicitBart Van Assche2017-06-201-2/+0
| | | | | | | | | | | | | | | | | | Instead of explicitly calling scsi_req_init() after blk_get_request(), call that function from inside blk_get_request(). Add an .initialize_rq_fn() callback function to the block drivers that need it. Merge the IDE .init_rq_fn() function into .initialize_rq_fn() because it is too small to keep it as a separate function. Keep the scsi_req_init() call in ide_prep_sense() because it follows a blk_rq_init() call. References: commit 82ed4db499b8 ("block: split scsi_request out of struct request") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Omar Sandoval <osandov@fb.com> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* block: introduce new block status code typeChristoph Hellwig2017-06-091-3/+3
| | | | | | | | | | | | | | | | | | | | | | | Currently we use nornal Linux errno values in the block layer, and while we accept any error a few have overloaded magic meanings. This patch instead introduces a new blk_status_t value that holds block layer specific status codes and explicitly explains their meaning. Helpers to convert from and to the previous special meanings are provided for now, but I suspect we want to get rid of them in the long run - those drivers that have a errno input (e.g. networking) usually get errnos that don't know about the special block layer overloads, and similarly returning them to userspace will usually return somethings that strictly speaking isn't correct for file system operations, but that's left as an exercise for later. For now the set of errors is a very limited set that closely corresponds to the previous overloaded errno values, but there is some low hanging fruite to improve it. blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse typechecking, so that we can easily catch places passing the wrong values. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
* scsi: sg: don't return bogus Sg_requestsJohannes Thumshirn2017-05-111-2/+3
| | | | | | | | | | | | | | | | | | If the list search in sg_get_rq_mark() fails to find a valid request, we return a bogus element. This then can later lead to a GPF in sg_remove_scat(). So don't return bogus Sg_requests in sg_get_rq_mark() but NULL in case the list search doesn't find a valid request. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reported-by: Andrey Konovalov <andreyknvl@google.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Doug Gilbert <dgilbert@interlog.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Doug Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds2017-05-041-143/+141
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull SCSI updates from James Bottomley: "This update includes the usual round of major driver updates (hisi_sas, ufs, fnic, cxlflash, be2iscsi, ipr, stex). There's also the usual amount of cosmetic and spelling stuff" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (155 commits) scsi: qla4xxx: fix spelling mistake: "Tempalate" -> "Template" scsi: stex: make S6flag static scsi: mac_esp: fix to pass correct device identity to free_irq() scsi: aacraid: pci_alloc_consistent() failures on ARM64 scsi: ufs: make ufshcd_get_lists_status() register operation obvious scsi: ufs: use MASK_EE_STATUS scsi: mac_esp: Replace bogus memory barrier with spinlock scsi: fcoe: make fcoe_e_d_tov and fcoe_r_a_tov static scsi: sd_zbc: Do not write lock zones for reset scsi: sd_zbc: Remove superfluous assignments scsi: sd: sd_zbc: Rename sd_zbc_setup_write_cmnd scsi: Improve scsi_get_sense_info_fld scsi: sd: Cleanup sd_done sense data handling scsi: sd: Improve sd_completed_bytes scsi: sd: Fix function descriptions scsi: mpt3sas: remove redundant wmb scsi: mpt: Move scsi_remove_host() out of mptscsih_remove_host() scsi: sg: reset 'res_in_use' after unlinking reserved array scsi: mvumi: remove code handling zero scsi_sg_count(scmd) case scsi: fusion: fix spelling mistake: "Persistancy" -> "Persistency" ...
| * scsi: sg: reset 'res_in_use' after unlinking reserved arrayHannes Reinecke2017-04-241-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Once the reserved page array is unused we can reset the 'res_in_use' state; here we can do a lazy update without holding the mutex as we only need to check against concurrent access, not concurrent release. [mkp: checkpatch] Fixes: 1bc0eb044615 ("scsi: sg: protect accesses to 'reserved' page array") Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: sg: close race condition in sg_remove_sfp_usercontext()Hannes Reinecke2017-04-111-2/+10
| | | | | | | | | | | | | | | | | | | | | | | | sg_remove_sfp_usercontext() is clearing any sg requests, but needs to take 'rq_list_lock' when modifying the list. Reported-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Tested-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: sg: use standard lists for sg_requestsHannes Reinecke2017-04-111-86/+61
| | | | | | | | | | | | | | | | | | | | | | 'Sg_request' is using a private list implementation; convert it to standard lists. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Tested-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: sg: check for valid direction before starting the requestJohannes Thumshirn2017-04-111-12/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Check for a valid direction before starting the request, otherwise we risk running into an assertion in the scsi midlayer checking for valid requests. [mkp: fixed typo] Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Link: http://www.spinics.net/lists/linux-scsi/msg104400.html Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Tested-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: sg: protect accesses to 'reserved' page arrayHannes Reinecke2017-04-111-20/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | The 'reserved' page array is used as a short-cut for mapping data, saving us to allocate pages per request. However, the 'reserved' array is only capable of holding one request, so this patch introduces a mutex for protect 'sg_fd' against concurrent accesses. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Tested-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: sg: remove 'save_scat_len'Hannes Reinecke2017-04-111-2/+0
| | | | | | | | | | | | | | | | | | | | Unused. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Tested-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * scsi: sg: disable SET_FORCE_LOW_DMAHannes Reinecke2017-04-111-21/+9
| | | | | | | | | | | | | | | | | | | | | | | | The ioctl SET_FORCE_LOW_DMA has never worked since the initial git check-in, and the respective setting is nowadays handled correctly. So disable it entirely. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Tested-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | Merge branch 'work.uaccess' of ↵Linus Torvalds2017-05-011-1/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull uaccess unification updates from Al Viro: "This is the uaccess unification pile. It's _not_ the end of uaccess work, but the next batch of that will go into the next cycle. This one mostly takes copy_from_user() and friends out of arch/* and gets the zero-padding behaviour in sync for all architectures. Dealing with the nocache/writethrough mess is for the next cycle; fortunately, that's x86-only. Same for cleanups in iov_iter.c (I am sold on access_ok() in there, BTW; just not in this pile), same for reducing __copy_... callsites, strn*... stuff, etc. - there will be a pile about as large as this one in the next merge window. This one sat in -next for weeks. -3KLoC" * 'work.uaccess' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (96 commits) HAVE_ARCH_HARDENED_USERCOPY is unconditional now CONFIG_ARCH_HAS_RAW_COPY_USER is unconditional now m32r: switch to RAW_COPY_USER hexagon: switch to RAW_COPY_USER microblaze: switch to RAW_COPY_USER get rid of padding, switch to RAW_COPY_USER ia64: get rid of copy_in_user() ia64: sanitize __access_ok() ia64: get rid of 'segment' argument of __do_{get,put}_user() ia64: get rid of 'segment' argument of __{get,put}_user_check() ia64: add extable.h powerpc: get rid of zeroing, switch to RAW_COPY_USER esas2r: don't open-code memdup_user() alpha: fix stack smashing in old_adjtimex(2) don't open-code kernel_setsockopt() mips: switch to RAW_COPY_USER mips: get rid of tail-zeroing in primitives mips: make copy_from_user() zero tail explicitly mips: clean and reorder the forest of macros... mips: consolidate __invoke_... wrappers ...
| * | new helper: uaccess_kernel()Al Viro2017-03-281-1/+1
| |/ | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | Merge branch 'for-4.12/block' of git://git.kernel.dk/linux-blockLinus Torvalds2017-05-011-2/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block layer updates from Jens Axboe: - Add BFQ IO scheduler under the new blk-mq scheduling framework. BFQ was initially a fork of CFQ, but subsequently changed to implement fairness based on B-WF2Q+, a modified variant of WF2Q. BFQ is meant to be used on desktop type single drives, providing good fairness. From Paolo. - Add Kyber IO scheduler. This is a full multiqueue aware scheduler, using a scalable token based algorithm that throttles IO based on live completion IO stats, similary to blk-wbt. From Omar. - A series from Jan, moving users to separately allocated backing devices. This continues the work of separating backing device life times, solving various problems with hot removal. - A series of updates for lightnvm, mostly from Javier. Includes a 'pblk' target that exposes an open channel SSD as a physical block device. - A series of fixes and improvements for nbd from Josef. - A series from Omar, removing queue sharing between devices on mostly legacy drivers. This helps us clean up other bits, if we know that a queue only has a single device backing. This has been overdue for more than a decade. - Fixes for the blk-stats, and improvements to unify the stats and user windows. This both improves blk-wbt, and enables other users to register a need to receive IO stats for a device. From Omar. - blk-throttle improvements from Shaohua. This provides a scalable framework for implementing scalable priotization - particularly for blk-mq, but applicable to any type of block device. The interface is marked experimental for now. - Bucketized IO stats for IO polling from Stephen Bates. This improves efficiency of polled workloads in the presence of mixed block size IO. - A few fixes for opal, from Scott. - A few pulls for NVMe, including a lot of fixes for NVMe-over-fabrics. From a variety of folks, mostly Sagi and James Smart. - A series from Bart, improving our exposed info and capabilities from the blk-mq debugfs support. - A series from Christoph, cleaning up how handle WRITE_ZEROES. - A series from Christoph, cleaning up the block layer handling of how we track errors in a request. On top of being a nice cleanup, it also shrinks the size of struct request a bit. - Removal of mg_disk and hd (sorry Linus) by Christoph. The former was never used by platforms, and the latter has outlived it's usefulness. - Various little bug fixes and cleanups from a wide variety of folks. * 'for-4.12/block' of git://git.kernel.dk/linux-block: (329 commits) block: hide badblocks attribute by default blk-mq: unify hctx delay_work and run_work block: add kblock_mod_delayed_work_on() blk-mq: unify hctx delayed_run_work and run_work nbd: fix use after free on module unload MAINTAINERS: bfq: Add Paolo as maintainer for the BFQ I/O scheduler blk-mq-sched: alloate reserved tags out of normal pool mtip32xx: use runtime tag to initialize command header scsi: Implement blk_mq_ops.show_rq() blk-mq: Add blk_mq_ops.show_rq() blk-mq: Show operation, cmd_flags and rq_flags names blk-mq: Make blk_flags_show() callers append a newline character blk-mq: Move the "state" debugfs attribute one level down blk-mq: Unregister debugfs attributes earlier blk-mq: Only unregister hctxs for which registration succeeded blk-mq-debugfs: Rename functions for registering and unregistering the mq directory blk-mq: Let blk_mq_debugfs_register() look up the queue name blk-mq: Register <dev>/queue/mq after having registered <dev>/queue ide-pm: always pass 0 error to ide_complete_rq in ide_do_devset ide-pm: always pass 0 error to __blk_end_request_all ..
| * | scsi: introduce a result field in struct scsi_requestChristoph Hellwig2017-04-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This passes on the scsi_cmnd result field to users of passthrough requests. Currently we abuse req->errors for this purpose, but that field will go away in its current form. Note that the old IDE code abuses the errors field in very creative ways and stores all kinds of different values in it. I didn't dare to touch this magic, so the abuses are brought forward 1:1. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | block, scsi: move the retries field to struct scsi_requestChristoph Hellwig2017-04-051-1/+1
| |/ | | | | | | | | | | | | | | | | Instead of bloating the generic struct request with it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com>
* | Merge remote-tracking branch 'mkp-scsi/4.11/scsi-fixes' into fixesJames Bottomley2017-03-291-0/+2
|\ \ | |/ |/|
| * scsi: sg: check length passed to SG_NEXT_CMD_LENpeter chang2017-03-161-0/+2
| | | | | | | | | | | | | | | | | | | | | | The user can control the size of the next command passed along, but the value passed to the ioctl isn't checked against the usable max command size. Cc: <stable@vger.kernel.org> Signed-off-by: Peter Chang <dpf@google.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | mm, fs: reduce fault, page_mkwrite, and pfn_mkwrite to take only vmfDave Jiang2017-02-241-1/+2
|/ | | | | | | | | | | | | | | | | | | | | | | ->fault(), ->page_mkwrite(), and ->pfn_mkwrite() calls do not need to take a vma and vmf parameter when the vma already resides in vmf. Remove the vma parameter to simplify things. [arnd@arndb.de: fix ARM build] Link: http://lkml.kernel.org/r/20170125223558.1451224-1-arnd@arndb.de Link: http://lkml.kernel.org/r/148521301778.19116.10840599906674778980.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge tag 'for-4.11/linus-merge-signed' of git://git.kernel.dk/linux-blockLinus Torvalds2017-02-211-15/+18
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block layer updates from Jens Axboe: - blk-mq scheduling framework from me and Omar, with a port of the deadline scheduler for this framework. A port of BFQ from Paolo is in the works, and should be ready for 4.12. - Various fixups and improvements to the above scheduling framework from Omar, Paolo, Bart, me, others. - Cleanup of the exported sysfs blk-mq data into debugfs, from Omar. This allows us to export more information that helps debug hangs or performance issues, without cluttering or abusing the sysfs API. - Fixes for the sbitmap code, the scalable bitmap code that was migrated from blk-mq, from Omar. - Removal of the BLOCK_PC support in struct request, and refactoring of carrying SCSI payloads in the block layer. This cleans up the code nicely, and enables us to kill the SCSI specific parts of struct request, shrinking it down nicely. From Christoph mainly, with help from Hannes. - Support for ranged discard requests and discard merging, also from Christoph. - Support for OPAL in the block layer, and for NVMe as well. Mainly from Scott Bauer, with fixes/updates from various others folks. - Error code fixup for gdrom from Christophe. - cciss pci irq allocation cleanup from Christoph. - Making the cdrom device operations read only, from Kees Cook. - Fixes for duplicate bdi registrations and bdi/queue life time problems from Jan and Dan. - Set of fixes and updates for lightnvm, from Matias and Javier. - A few fixes for nbd from Josef, using idr to name devices and a workqueue deadlock fix on receive. Also marks Josef as the current maintainer of nbd. - Fix from Josef, overwriting queue settings when the number of hardware queues is updated for a blk-mq device. - NVMe fix from Keith, ensuring that we don't repeatedly mark and IO aborted, if we didn't end up aborting it. - SG gap merging fix from Ming Lei for block. - Loop fix also from Ming, fixing a race and crash between setting loop status and IO. - Two block race fixes from Tahsin, fixing request list iteration and fixing a race between device registration and udev device add notifiations. - Double free fix from cgroup writeback, from Tejun. - Another double free fix in blkcg, from Hou Tao. - Partition overflow fix for EFI from Alden Tondettar. * tag 'for-4.11/linus-merge-signed' of git://git.kernel.dk/linux-block: (156 commits) nvme: Check for Security send/recv support before issuing commands. block/sed-opal: allocate struct opal_dev dynamically block/sed-opal: tone down not supported warnings block: don't defer flushes on blk-mq + scheduling blk-mq-sched: ask scheduler for work, if we failed dispatching leftovers blk-mq: don't special case flush inserts for blk-mq-sched blk-mq-sched: don't add flushes to the head of requeue queue blk-mq: have blk_mq_dispatch_rq_list() return if we queued IO or not block: do not allow updates through sysfs until registration completes lightnvm: set default lun range when no luns are specified lightnvm: fix off-by-one error on target initialization Maintainers: Modify SED list from nvme to block Move stack parameters for sed_ioctl to prevent oversized stack with CONFIG_KASAN uapi: sed-opal fix IOW for activate lsp to use correct struct cdrom: Make device operations read-only elevator: fix loading wrong elevator type for blk-mq devices cciss: switch to pci_irq_alloc_vectors block/loop: fix race between I/O and set_status blk-mq-sched: don't hold queue_lock when calling exit_icq block: set make_request_fn manually in blk_mq_update_nr_hw_queues ...
| * block: fold cmd_type into the REQ_OP_ spaceChristoph Hellwig2017-01-311-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of keeping two levels of indirection for requests types, fold it all into the operations. The little caveat here is that previously cmd_type only applied to struct request, while the request and bio op fields were set to plain REQ_OP_READ/WRITE even for passthrough operations. Instead this patch adds new REQ_OP_* for SCSI passthrough and driver private requests, althought it has to add two for each so that we can communicate the data in/out nature of the request. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
| * block: split scsi_request out of struct requestChristoph Hellwig2017-01-271-14/+16
| | | | | | | | | | | | | | | | | | | | And require all drivers that want to support BLOCK_PC to allocate it as the first thing of their private data. To support this the legacy IDE and BSG code is switched to set cmd_size on their queues to let the block layer allocate the additional space. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
* | Fix missing sanity check in /dev/sgAl Viro2017-02-191-0/+4
|/ | | | | | | | | | | | | | | | | | | | What happens is that a write to /dev/sg is given a request with non-zero ->iovec_count combined with zero ->dxfer_len. Or with ->dxferp pointing to an array full of empty iovecs. Having write permission to /dev/sg shouldn't be equivalent to the ability to trigger BUG_ON() while holding spinlocks... Found by Dmitry Vyukov and syzkaller. [ The BUG_ON() got changed to a WARN_ON_ONCE(), but this fixes the underlying issue. - Linus ] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Reported-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* sg_write()/bsg_write() is not fit to be called under KERNEL_DSAl Viro2016-12-221-0/+3
| | | | | | | | | | Both damn things interpret userland pointers embedded into the payload; worse, they are actually traversing those. Leaving aside the bad API design, this is very much _not_ safe to call with KERNEL_DS. Bail out early if that happens. Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* scsi: sg: Use mult_frac, drop MULDIV macroPaul Burton2016-08-301-15/+4
| | | | | | | | | | The MULDIV macro is essentially a duplicate of the more standard mult_frac macro. Replace use of MULDIV with mult_frac & drop the duplication. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: sg: Avoid overflow when USER_HZ > HZPaul Burton2016-08-301-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Calculating the maximum timeout that a user can set via the SG_SET_TIMEOUT ioctl involves multiplying INT_MAX by USER_HZ/HZ. If USER_HZ is larger than HZ then this results in an overflow when performed as a 32 bit integer calculation, resulting in compiler warnings such as the following: drivers/scsi/sg.c: In function 'sg_ioctl': drivers/scsi/sg.c:91:67: warning: integer overflow in expression [-Woverflow] #define MULDIV(X,MUL,DIV) ((((X % DIV) * MUL) / DIV) + ((X / DIV) * MUL)) ^ drivers/scsi/sg.c:887:14: note: in expansion of macro 'MULDIV' if (val >= MULDIV (INT_MAX, USER_HZ, HZ)) ^ drivers/scsi/sg.c:91:67: warning: integer overflow in expression [-Woverflow] #define MULDIV(X,MUL,DIV) ((((X % DIV) * MUL) / DIV) + ((X / DIV) * MUL)) ^ drivers/scsi/sg.c:888:13: note: in expansion of macro 'MULDIV' val = MULDIV (INT_MAX, USER_HZ, HZ); ^ Avoid this overflow by performing the (constant) arithmetic on 64 bit integers, which ensures that overflow from multiplying the 32 bit values cannot occur. When converting the result back to a 32 bit integer use min_t to ensure that we don't simply truncate a value beyond INT_MAX to a 32 bit integer, but instead use INT_MAX where the result was larger than it. As the values are all compile time constant the 64 bit arithmetic should have no runtime cost. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* sg: fix dxferp in from_to caseDouglas Gilbert2016-03-091-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | One of the strange things that the original sg driver did was let the user provide both a data-out buffer (it followed the sg_header+cdb) _and_ specify a reply length greater than zero. What happened was that the user data-out buffer was copied into some kernel buffers and then the mid level was told a read type operation would take place with the data from the device overwriting the same kernel buffers. The user would then read those kernel buffers back into the user space. From what I can tell, the above action was broken by commit fad7f01e61bf ("sg: set dxferp to NULL for READ with the older SG interface") in 2008 and syzkaller found that out recently. Make sure that a user space pointer is passed through when data follows the sg_header structure and command. Fix the abnormal case when a non-zero reply_len is also given. Fixes: fad7f01e61bf737fe8a3740d803f000db57ecac6 Cc: <stable@vger.kernel.org> #v2.6.28+ Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Ewan Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* drivers/scsi/sg.c: mark VMA as VM_IO to prevent migrationKirill A. Shutemov2016-02-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reduced testcase: #include <fcntl.h> #include <unistd.h> #include <sys/mman.h> #include <numaif.h> #define SIZE 0x2000 int main() { int fd; void *p; fd = open("/dev/sg0", O_RDWR); p = mmap(NULL, SIZE, PROT_EXEC, MAP_PRIVATE | MAP_LOCKED, fd, 0); mbind(p, SIZE, 0, NULL, 0, MPOL_MF_MOVE); return 0; } We shouldn't try to migrate pages in sg VMA as we don't have a way to update Sg_scatter_hold::pages accordingly from mm core. Let's mark the VMA as VM_IO to indicate to mm core that the VMA is not migratable. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Doug Gilbert <dgilbert@interlog.com> Cc: David Rientjes <rientjes@google.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Shiraz Hashim <shashim@codeaurora.org> Cc: Hugh Dickins <hughd@google.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: syzkaller <syzkaller@googlegroups.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* sg: Fix double-free when drives detach during SG_IOCalvin Owens2015-11-021-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In sg_common_write(), we free the block request and return -ENODEV if the device is detached in the middle of the SG_IO ioctl(). Unfortunately, sg_finish_rem_req() also tries to free srp->rq, so we end up freeing rq->cmd in the already free rq object, and then free the object itself out from under the current user. This ends up corrupting random memory via the list_head on the rq object. The most common crash trace I saw is this: ------------[ cut here ]------------ kernel BUG at block/blk-core.c:1420! Call Trace: [<ffffffff81281eab>] blk_put_request+0x5b/0x80 [<ffffffffa0069e5b>] sg_finish_rem_req+0x6b/0x120 [sg] [<ffffffffa006bcb9>] sg_common_write.isra.14+0x459/0x5a0 [sg] [<ffffffff8125b328>] ? selinux_file_alloc_security+0x48/0x70 [<ffffffffa006bf95>] sg_new_write.isra.17+0x195/0x2d0 [sg] [<ffffffffa006cef4>] sg_ioctl+0x644/0xdb0 [sg] [<ffffffff81170f80>] do_vfs_ioctl+0x90/0x520 [<ffffffff81258967>] ? file_has_perm+0x97/0xb0 [<ffffffff811714a1>] SyS_ioctl+0x91/0xb0 [<ffffffff81602afb>] tracesys+0xdd/0xe2 RIP [<ffffffff81281e04>] __blk_put_request+0x154/0x1a0 The solution is straightforward: just set srp->rq to NULL in the failure branch so that sg_finish_rem_req() doesn't attempt to re-free it. Additionally, since sg_rq_end_io() will never be called on the object when this happens, we need to free memory backing ->cmd if it isn't embedded in the object itself. KASAN was extremely helpful in finding the root cause of this bug. Signed-off-by: Calvin Owens <calvinowens@fb.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>