Commit message, Author, Age, Files, Lines
* Add linux-next specific files for 20190507next-20190507Stephen Rothwell2019-05-075-0/+16283
| | | | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
* Merge branch 'akpm/master'Stephen Rothwell2019-05-0790-303/+403
|\
| * drivers/media/platform/sti/delta/delta-ipc.c: fix read buffer overflowAndi Kleen2019-05-071-2/+2
| | The single caller passes a string to delta_ipc_open, which copies with a fixed size larger than the string. So it copies some random data after the original string in the ro segment. If the string was at the end of a page it may fault. Just copy the string with a normal strcpy after clearing the field. Found by an LTO build (which errors out) because the compiler inlines the functions, can resolve the string sizes, and triggers the compile-time checks in memcpy. In function `memcpy', inlined from `delta_ipc_open.constprop' at linux/drivers/media/platform/sti/delta/delta-ipc.c:178:0, inlined from `delta_mjpeg_ipc_open' at linux/drivers/media/platform/sti/delta/delta-mjpeg-dec.c:227:0, inlined from `delta_mjpeg_decode' at linux/drivers/media/platform/sti/delta/delta-mjpeg-dec.c:403:0: /home/andi/lsrc/linux/include/linux/string.h:337:0: error: call to `__read_overflow2' declared with attribute error: detected read beyond size of object passed as 2nd parameter __read_overflow2(); Link: http://lkml.kernel.org/r/20171222001212.1850-1-andi@firstfloor.org Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Hugues FRUCHET <hugues.fruchet@st.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
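The read overflow described above can be illustrated in userspace C. This is only a sketch: `struct open_msg`, `NAME_LEN` and `set_name()` are hypothetical names, not taken from the driver, and the length check is an addition for the sketch rather than part of the patch.

```c
#include <string.h>

#define NAME_LEN 32 /* hypothetical size of the fixed-width name field */

struct open_msg {
	char name[NAME_LEN];
};

/*
 * The buggy pattern was memcpy(msg->name, name, sizeof(msg->name)), which
 * reads NAME_LEN bytes from a source string that may be much shorter.
 * Clearing the field and then copying only the string avoids the over-read.
 * Returns 0 on success, -1 if the name would not fit (check added for this
 * sketch only).
 */
static int set_name(struct open_msg *msg, const char *name)
{
	if (strlen(name) >= sizeof(msg->name))
		return -1;
	memset(msg->name, 0, sizeof(msg->name));
	strcpy(msg->name, name);
	return 0;
}
```

Clearing first keeps the trailing bytes of the field deterministic, which matters if the whole structure is later sent to firmware or another processor.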
| * mm: memcontrol: fix NUMA round-robin reclaim at intermediate levelJohannes Weiner2019-05-071-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a cgroup is reclaimed on behalf of a configured limit, reclaim needs to round-robin through all NUMA nodes that hold pages of the memcg in question. However, when assembling the mask of candidate NUMA nodes, the code only consults the *local* cgroup LRU counters, not the recursive counters for the entire subtree. Cgroup limits are frequently configured against intermediate cgroups that do not have memory on their own LRUs. In this case, the node mask will always come up empty and reclaim falls back to scanning only the current node. If a cgroup subtree has some memory on one node but the processes are bound to another node afterwards, the limit reclaim will never age or reclaim that memory anymore. To fix this, use the recursive LRU counts for a cgroup subtree to determine which nodes hold memory of that cgroup. The code has been broken like this forever, so it doesn't seem to be a problem in practice. I just noticed it while reviewing the way the LRU counters are used in general. Link: http://lkml.kernel.org/r/20190412151507.2769-5-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Roman Gushchin <guro@fb.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * mm: memcontrol: fix recursive statistics correctness & scalabiltyJohannes Weiner2019-05-072-109/+150
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Right now, when somebody needs to know the recursive memory statistics and events of a cgroup subtree, they need to walk the entire subtree and sum up the counters manually. There are two issues with this: 1. When a cgroup gets deleted, its stats are lost. The state counters should all be 0 at that point, of course, but the events are not. When this happens, the event counters, which are supposed to be monotonic, can go backwards in the parent cgroups. 2. During regular operation, we always have a certain number of lazily freed cgroups sitting around that have been deleted, have no tasks, but have a few cache pages remaining. These groups' statistics do not change until we eventually hit memory pressure, but somebody watching, say, memory.stat on an ancestor has to iterate those every time. This patch addresses both issues by introducing recursive counters at each level that are propagated from the write side when stats change. Upward propagation happens when the per-cpu caches spill over into the local atomic counter. This is the same thing we do during charge and uncharge, except that the latter uses atomic RMWs, which are more expensive; stat changes happen at around the same rate. In a sparse file test (page faults and reclaim at maximum CPU speed) with 5 cgroup nesting levels, perf shows __mod_memcg_page state at ~1%. Link: http://lkml.kernel.org/r/20190412151507.2769-4-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Roman Gushchin <guro@fb.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
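A minimal single-threaded sketch of the write-side propagation described above. The names (`struct group`, `mod_state()`, `BATCH`) are hypothetical, one `pending` field stands in for the per-cpu caches, and plain additions stand in for the kernel's atomic counters.

```c
#include <stdlib.h>

#define BATCH 32 /* hypothetical spill threshold */

struct group {
	struct group *parent;
	long pending; /* stands in for the per-cpu cache */
	long total;   /* recursive counter for this subtree */
};

/*
 * Accumulate the change locally; once the cached delta exceeds the batch
 * size, spill it into this group and every ancestor, then reset the cache.
 */
static void mod_state(struct group *g, long delta)
{
	long x = g->pending + delta;

	if (labs(x) > BATCH) {
		for (struct group *it = g; it; it = it->parent)
			it->total += x;
		x = 0;
	}
	g->pending = x;
}
```

With this scheme a reader of an ancestor's counter gets the subtree total in O(1), at the price of the value lagging by at most the unspilled cached deltas.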
| * mm: memcontrol: move stat/event counting functions out-of-lineJohannes Weiner2019-05-072-57/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These are getting too big to be inlined in every callsite. They were stolen from vmstat.c, which already out-of-lines them, and they have only been growing since. The callsites aren't that hot, either. Move __mod_memcg_state() __mod_lruvec_state() and __count_memcg_events() out of line and add kerneldoc comments. Link: http://lkml.kernel.org/r/20190412151507.2769-3-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Roman Gushchin <guro@fb.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * mm-memcontrol-make-cgroup-stats-and-events-query-api-explicitly-local-fixJohannes Weiner2019-05-071-1/+1
| | | | | | | | | | | | | | | | | | | | The lruvec_page_state() -> lruvec_page_state_local() rename should have been part of this patch, not the previous one. Link: http://lkml.kernel.org/r/20190417160347.GC23013@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * mm: memcontrol: make cgroup stats and events query API explicitly localJohannes Weiner2019-05-074-32/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Patch series "mm: memcontrol: memory.stat cost & correctness". The cgroup memory.stat file holds recursive statistics for the entire subtree. The current implementation does this tree walk on-demand whenever the file is read. This is giving us problems in production. 1. The cost of aggregating the statistics on-demand is high. A lot of system service cgroups are mostly idle and their stats don't change between reads, yet we always have to check them. There are also always some lazily-dying cgroups sitting around that are pinned by a handful of remaining page cache; the same applies to them. In an application that periodically monitors memory.stat in our fleet, we have seen the aggregation consume up to 5% CPU time. 2. When cgroups die and disappear from the cgroup tree, so do their accumulated vm events. The result is that the event counters at higher-level cgroups can go backwards and confuse some of our automation, let alone people looking at the graphs over time. To address both issues, this patch series changes the stat implementation to spill counts upwards when the counters change. The upward spilling is batched using the existing per-cpu cache. In a sparse file stress test with 5 level cgroup nesting, the additional cost of the flushing was negligible (a little under 1% of CPU at 100% CPU utilization, compared to the 5% of reading memory.stat during regular operation). This patch (of 4): memcg_page_state(), lruvec_page_state(), memcg_sum_events() are currently returning the state of the local memcg or lruvec, not the recursive state. In practice there is a demand for both versions, although the callers that want the recursive counts currently sum them up by hand. 
Per default, cgroups are considered recursive entities and generally we expect more users of the recursive counters, with the local counts being special cases. To reflect that in the name, add a _local suffix to the current implementations. The following patch will re-incarnate these functions with recursive semantics, but with an O(1) implementation. Link: http://lkml.kernel.org/r/20190412151507.2769-2-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Roman Gushchin <guro@fb.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * drivers/virt/fsl_hypervisor.c: prevent integer overflow in ioctlDan Carpenter2019-05-071-0/+3
| | The "param.count" value is a u64 that comes from the user. The code later in the function assumes that param.count is at least one and if it's not then it leads to an Oops when we dereference the ZERO_SIZE_PTR. Also the addition can have an integer overflow which would lead us to allocate a smaller "pages" array than required. I can't immediately tell what the possible run-time implications are, but it's safest to prevent the overflow. Link: http://lkml.kernel.org/r/20181218082129.GE32567@kadam Fixes: 6db7199407ca ("drivers/virt: introduce Freescale hypervisor management driver") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Timur Tabi <timur@freescale.com> Cc: Mihai Caraman <mihai.caraman@freescale.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
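The two checks described above (count must be non-zero, and the page-count addition must not wrap) might look like the following in plain C. This is a sketch under assumptions: `count_pages()`, `PG_SIZE` and the exact page-count formula are illustrative and not copied from the driver.

```c
#include <stdint.h>

#define PG_SIZE 4096ULL            /* assumed page size for the sketch */
#define PG_MASK (~(PG_SIZE - 1))

/*
 * Compute how many pages a user buffer of `count` bytes starting at `vaddr`
 * spans. Rejects count == 0 and any count for which the addition below
 * would overflow a 64-bit value. Returns 0 on success, -1 on bad input.
 */
static int count_pages(uint64_t vaddr, uint64_t count, uint64_t *num_pages)
{
	uint64_t offset = vaddr & ~PG_MASK;   /* offset into the first page */
	uint64_t pad = offset + PG_SIZE - 1;  /* cannot overflow: < 2 * PG_SIZE */

	if (count == 0 || count > UINT64_MAX - pad)
		return -1;

	*num_pages = (count + pad) / PG_SIZE;
	return 0;
}
```

Checking `count > UINT64_MAX - pad` before adding is the usual way to test for overflow without relying on the wrapped result.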
| * drivers/virt/fsl_hypervisor.c: dereferencing error pointers in ioctlDan Carpenter2019-05-071-13/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | strndup_user() returns error pointers on error, and then in the error handling we pass the error pointers to kfree(). It will cause an Oops. Link: http://lkml.kernel.org/r/20181218082003.GD32567@kadam Fixes: 6db7199407ca ("drivers/virt: introduce Freescale hypervisor management driver") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Timur Tabi <timur@freescale.com> Cc: Mihai Caraman <mihai.caraman@freescale.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
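The error-pointer pitfall can be reproduced outside the kernel. In the sketch below, `err_ptr()`, `is_err()` and `ptr_err()` are userspace imitations of the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() helpers, `dup_checked()` is a hypothetical stand-in for strndup_user(), and the control flow is illustrative rather than the driver's actual code.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the kernel's error-pointer helpers. */
static void *err_ptr(long err) { return (void *)(intptr_t)err; }
static int is_err(const void *p) { return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO; }
static long ptr_err(const void *p) { return (long)(intptr_t)p; }

/* Hypothetical stand-in for strndup_user(): returns a copy or an error pointer. */
static char *dup_checked(const char *s, size_t max)
{
	char *copy;

	if (!s || strlen(s) >= max)
		return err_ptr(-EINVAL);
	copy = malloc(strlen(s) + 1);
	if (!copy)
		return err_ptr(-ENOMEM);
	strcpy(copy, s);
	return copy;
}

/* Only pointers that passed the is_err() check ever reach free(). */
static long use_two(const char *a, const char *b)
{
	char *pa, *pb;
	long ret = 0;

	pa = dup_checked(a, 16);
	if (is_err(pa))
		return ptr_err(pa);      /* nothing allocated yet */

	pb = dup_checked(b, 16);
	if (is_err(pb)) {
		ret = ptr_err(pb);
		goto free_a;             /* skip freeing pb: it is not a real pointer */
	}

	free(pb);
free_a:
	free(pa);
	return ret;
}
```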
| * mm, memcg: consider subtrees in memory.eventsChris Down2019-05-074-4/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | memory.stat and other files already consider subtrees in their output, and we should too in order to not present an inconsistent interface. The current situation is fairly confusing, because people interacting with cgroups expect hierarchical behaviour in the vein of memory.stat, cgroup.events, and other files. For example, this causes confusion when debugging reclaim events under low, as currently these always read "0" at non-leaf memcg nodes, which frequently causes people to misdiagnose breach behaviour. The same confusion applies to other counters in this file when debugging issues. Aggregation is done at write time instead of at read-time since these counters aren't hot (unlike memory.stat which is per-page, so it does it at read time), and it makes sense to bundle this with the file notifications. After this patch, events are propagated up the hierarchy: [root@ktst ~]# cat /sys/fs/cgroup/system.slice/memory.events low 0 high 0 max 0 oom 0 oom_kill 0 [root@ktst ~]# systemd-run -p MemoryMax=1 true Running as unit: run-r251162a189fb4562b9dabfdc9b0422f5.service [root@ktst ~]# cat /sys/fs/cgroup/system.slice/memory.events low 0 high 0 max 7 oom 1 oom_kill 1 As this is a change in behaviour, this can be reverted to the old behaviour by mounting with the `memory_localevents' flag set. However, we use the new behaviour by default as there's a lack of evidence that there are any current users of memory.events that would find this change undesirable. 
Link: http://lkml.kernel.org/r/20190208224419.GA24772@chrisdown.name Signed-off-by: Chris Down <chris@chrisdown.name> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Roman Gushchin <guro@fb.com> Cc: Dennis Zhou <dennis@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * mm-rename-ambiguously-named-memorystat-counters-and-functions-fixAndrew Morton2019-05-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | fix it for preceding changes Cc: Chris Down <chris@chrisdown.name> Cc: Dennis Zhou <dennis@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Roman Gushchin <guro@fb.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * mm, memcg: rename ambiguously named memory.stat counters and functionsChris Down2019-05-072-82/+86
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I spent literally an hour trying to work out why an earlier version of my memory.events aggregation code doesn't work properly, only to find out I was calling memcg->events instead of memcg->memory_events, which is fairly confusing. This naming seems in need of reworking, so make it harder to do the wrong thing by using vmevents instead of events, which makes it more clear that these are vm counters rather than memcg-specific counters. There are also a few other inconsistent names in both the percpu and aggregated structs, so these are all cleaned up to be more coherent and easy to understand. This commit contains code cleanup only: there are no logic changes. Link: http://lkml.kernel.org/r/20190208224319.GA23801@chrisdown.name Signed-off-by: Chris Down <chris@chrisdown.name> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Roman Gushchin <guro@fb.com> Cc: Dennis Zhou <dennis@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * arch: remove <asm/sizes.h> and <asm-generic/sizes.h>Masahiro Yamada2019-05-078-9/+0
| | | | | | | | | | | | | | | | | | | | Now that all instances of #include <asm/sizes.h> have been replaced with #include <linux/sizes.h>, we can remove these. Link: http://lkml.kernel.org/r/1553267665-27228-2-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * treewide: replace #include <asm/sizes.h> with #include <linux/sizes.h>Masahiro Yamada2019-05-0768-68/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Since dccd2304cc90 ("ARM: 7430/1: sizes.h: move from asm-generic to <linux/sizes.h>"), <asm/sizes.h> and <asm-generic/sizes.h> are just wrappers of <linux/sizes.h>. This commit replaces all <asm/sizes.h> and <asm-generic/sizes.h> to prepare for the removal. Link: http://lkml.kernel.org/r/1553267665-27228-1-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * fs/block_dev.c: Remove duplicate headerSabyasachi Gupta2019-05-071-1/+0
| | | | | | | | | | | | | | | | | | | | linux/dax.h is included more than once. Link: http://lkml.kernel.org/r/5c867e95.1c69fb81.4f15a.e5e4@mx.google.com Signed-off-by: Sabyasachi Gupta <sabyasachi.linux@gmail.com> Acked-by: Souptick Joarder <jrdr.linux@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * fs/cachefiles/namei.c: remove duplicate headerSabyasachi Gupta2019-05-071-1/+0
| | | | | | | | | | | | | | | | | | | | linux/xattr.h is included more than once. Link: http://lkml.kernel.org/r/5c86803d.1c69fb81.1a7c6.2b78@mx.google.com Signed-off-by: Sabyasachi Gupta <sabyasachi.linux@gmail.com> Acked-by: Souptick Joarder <jrdr.linux@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * include/linux/sched/signal.h: replace `tsk' with `task'Andrei Vagin2019-05-071-25/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | This file uses "task" 85 times and "tsk" 25 times. It is better to be consistent. Link: http://lkml.kernel.org/r/20181129180547.15976-1-avagin@gmail.com Signed-off-by: Andrei Vagin <avagin@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * fs/coda/psdev.c: remove duplicate headerSabyasachi Gupta2019-05-071-1/+0
| | | | | | | | | | | | | | | | | | | | | | linux/poll.h is included more than once. Link: http://lkml.kernel.org/r/5c86820f.1c69fb81.149f0.0834@mx.google.com Signed-off-by: Sabyasachi Gupta <sabyasachi.linux@gmail.com> Acked-by: Souptick Joarder <jrdr.linux@gmail.com> Cc: Jan Harkes <jaharkes@cs.cmu.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * pinctrl: fix pxa2xx.c build warningsRandy Dunlap2019-05-071-0/+1
|/ | | | | | | | | | | | | | | | | Add #include of <linux/pinctrl/machine.h> to fix build warnings in pinctrl-pxa2xx.c. Fixes these warnings: In file included from ../drivers/pinctrl/pxa/pinctrl-pxa2xx.c:24:0: ../drivers/pinctrl/pxa/../pinctrl-utils.h:36:8: warning: `enum pinctrl_map_type' declared inside parameter list [enabled by default] enum pinctrl_map_type type); ^ ../drivers/pinctrl/pxa/../pinctrl-utils.h:36:8: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default] Link: http://lkml.kernel.org/r/0024542e-cba9-8f13-6c18-32d0050a6007@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Robert Jarzmik <robert.jarzmik@free.fr> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
* Merge branch 'akpm-current/current'Stephen Rothwell2019-05-07354-3858/+10010
|\
| * ipc-do-cyclic-id-allocation-for-the-ipc-object-fixAndrew Morton2019-04-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | fix max() warning ipc/util.c: In function 'ipc_idr_alloc': include/linux/kernel.h:828:29: warning: comparison of distinct pointer types lacks a cast (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1))) Reported-by: kbuild test robot <lkp@intel.com> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * ipc: do cyclic id allocation for the ipc object.Manfred Spraul2019-04-283-1/+11
| | For ipcmni_extend mode, the sequence number space is only 7 bits. So the chance of id reuse is relatively high compared with the non-extended mode. To alleviate this id reuse problem, this patch enables cyclic allocation for the index to the radix tree (idx). The disadvantage is that this can cause a slight slow-down of the fast path, as the radix tree could be higher than necessary.
To limit the radix tree height, I have chosen the following limits:
- 1) The cycling is done over in_use*1.5.
- 2) At least, the cycling is done over: "normal" ipcmni mode: RADIX_TREE_MAP_SIZE elements; "ipcmni_extended": 4096 elements.
Result:
- for normal mode: No change for <= 42 active ipc elements. With more than 42 active ipc elements, a 2nd level would be added to the radix tree. Without cyclic allocation, a 2nd level would be added only with more than 63 active elements.
- for extended mode: Cycling always creates at least a 2-level radix tree. With more than 2730 active objects, a 3rd level would be added; without cyclic allocation, the 3rd level is added only with more than 4095 active objects.
For a 2-level radix tree compared to a 1-level radix tree, I have observed < 1% performance impact.
Notes:
1) Normal "x=semget();y=semget();" is unaffected: Then the idx is e.g. a and a+1, regardless of whether idr_alloc() or idr_alloc_cyclic() is used.
2) The -1% happens in a microbenchmark after this situation: x=semget(); for(i=0;i<4000;i++) {t=semget();semctl(t,0,IPC_RMID);} y=semget(); Now perform semget calls on x and y that do not sleep.
3) The worst-case reuse cycle time is unfortunately unaffected: If you have 2^24-1 ipc objects allocated, and get/remove the last possible element in a loop, then the id is reused after 128 get/remove pairs.
Performance check: A microbenchmark that performs no-op semop() randomly on two IDs, with only these two IDs allocated. The IDs were set using /proc/sys/kernel/sem_next_id. The test was run 5 times, averages are shown.
1 & 2: Base (6.22 seconds for 10.000.000 semops)
1 & 40: -0.2%
1 & 3348: -0.8%
1 & 27348: -1.6%
1 & 15777204: -3.2%
Or: ~12.6 cpu cycles per additional radix tree level. The cpu is an Intel I3-5010U. ~1300 cpu cycles/syscall is slower than what I remember (spectre impact?).
V2 of the patch: - use "min" and "max" - use RADIX_TREE_MAP_SIZE * RADIX_TREE_MAP_SIZE instead of (2<<12).
Link: http://lkml.kernel.org/r/20190329204930.21620-3-longman@redhat.com Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Acked-by: Waiman Long <longman@redhat.com> Cc: "Luis R. Rodriguez" <mcgrof@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Matthew Wilcox <willy@infradead.org> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
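The cycling limits listed in this message can be checked with a little arithmetic. The helper below is a sketch: `cycle_limit()` is a hypothetical name, the value 64 for RADIX_TREE_MAP_SIZE is an assumption (consistent with the "more than 63 active elements" figure quoted above), and the clamp to the maximum number of identifiers is added for the sketch.

```c
/*
 * Upper bound for the cyclic index allocation:
 *   - at least min_cycle (64 in normal mode, 4096 in extended mode,
 *     both assumed here from the figures in the message),
 *   - otherwise 1.5 times the number of objects in use,
 *   - never more than mni, the maximum number of identifiers.
 */
static int cycle_limit(int in_use, int min_cycle, int mni)
{
	int limit = in_use * 3 / 2;

	if (limit < min_cycle)
		limit = min_cycle;
	if (limit > mni)
		limit = mni;
	return limit;
}
```

For example, 42 objects give 63, which stays within a 64-entry first level, and 2730 objects give 4095, which stays within 4096; these are the two thresholds the message quotes.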
| * ipc: conserve sequence numbers in ipcmni_extend modeManfred Spraul2019-04-283-9/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rewrite, based on the patch from Waiman Long: The mixing in of a sequence number into the IPC IDs is probably to avoid ID reuse in userspace as much as possible. With ipcmni_extend mode, the number of usable sequence numbers is greatly reduced leading to higher chance of ID reuse. To address this issue, we need to conserve the sequence number space as much as possible. Right now, the sequence number is incremented for every new ID created. In reality, we only need to increment the sequence number when new allocated ID is not greater than the last one allocated. It is in such case that the new ID may collide with an existing one. This is being done irrespective of the ipcmni mode. In order to avoid any races, the index is first allocated and then the pointer is replaced. Changes compared to the initial patch: - Handle failures from idr_alloc(). - Avoid that concurrent operations can see the wrong sequence number. (This is achieved by using idr_replace()). - IPCMNI_SEQ_SHIFT is not a constant, thus renamed to ipcmni_seq_shift(). - IPCMNI_SEQ_MAX is not a constant, thus renamed to ipcmni_seq_max(). Link: http://lkml.kernel.org/r/20190329204930.21620-2-longman@redhat.com Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Waiman Long <longman@redhat.com> Suggested-by: Matthew Wilcox <willy@infradead.org> Acked-by: Waiman Long <longman@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <keescook@chromium.org> Cc: "Luis R. Rodriguez" <mcgrof@kernel.org> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
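The rule "only bump the sequence number when the new index is not greater than the last one" can be sketched as below. `struct ids_state` and `next_seq()` are hypothetical names, and the wrap-to-zero at `seq_max` is an assumption made for the sketch.

```c
struct ids_state {
	int last_idx; /* index handed out by the previous allocation */
	int seq;      /* current sequence number */
	int seq_max;  /* number of distinct sequence values (assumed wrap point) */
};

/*
 * Record that `idx` was just allocated and return the sequence number to
 * combine with it. The sequence number advances only when the index did not
 * grow, since only then could the new ID collide with an earlier one.
 */
static int next_seq(struct ids_state *s, int idx)
{
	if (idx <= s->last_idx) {
		s->seq++;
		if (s->seq >= s->seq_max)
			s->seq = 0;
	}
	s->last_idx = idx;
	return s->seq;
}
```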
| * ipc: allow boot time extension of IPCMNI from 32k to 16MWaiman Long2019-04-284-15/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The maximum number of unique System V IPC identifiers was limited to 32k. That limit should be big enough for most use cases. However, there are some users out there requesting for more, especially those that are migrating from Solaris which uses 24 bits for unique identifiers. To satisfy the need of those users, a new boot time kernel option "ipcmni_extend" is added to extend the IPCMNI value to 16M. This is a 512X increase which should be big enough for users out there that need a large number of unique IPC identifier. The use of this new option will change the pattern of the IPC identifiers returned by functions like shmget(2). An application that depends on such pattern may not work properly. So it should only be used if the users really need more than 32k of unique IPC numbers. This new option does have the side effect of reducing the maximum number of unique sequence numbers from 64k down to 128. So it is a trade-off. The computation of a new IPC id is not done in the performance critical path. So a little bit of additional overhead shouldn't have any real performance impact. Link: http://lkml.kernel.org/r/20190329204930.21620-1-longman@redhat.com Signed-off-by: Waiman Long <longman@redhat.com> Acked-by: Manfred Spraul <manfred@colorfullife.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <keescook@chromium.org> Cc: "Luis R. Rodriguez" <mcgrof@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
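The trade-off between index bits and sequence bits follows from the figures above: 32k is 2^15 and 64k is 2^16, while 16M is 2^24 and 128 is 2^7, so both modes use 31 bits in total. The sketch below assumes the sequence number sits above the index in the identifier; that layout and all names are illustrative assumptions, not taken from the patch.

```c
#define IDX_BITS_DEFAULT  15 /* 32k indices, leaving 16 bits of sequence */
#define IDX_BITS_EXTENDED 24 /* 16M indices, leaving 7 bits of sequence */

/* Combine a sequence number and an index into one identifier. */
static int make_id(int seq, int idx, int idx_bits)
{
	return (seq << idx_bits) | idx;
}

/* Recover the index from an identifier. */
static int id_to_idx(int id, int idx_bits)
{
	return id & ((1 << idx_bits) - 1);
}

/* Recover the sequence number from an identifier. */
static int id_to_seq(int id, int idx_bits)
{
	return id >> idx_bits;
}
```

This also shows why the pattern of returned identifiers changes with the option: the first identifier with sequence number 1 moves from 32768 to 16777216.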
| * ipc/mqueue: optimize msg_get()Davidlohr Bueso2019-04-281-25/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our msg priorities became an rbtree as of d6629859b36d ("ipc/mqueue: improve performance of send/recv"). However, consuming a msg in msg_get() remains logarithmic (still being better than the case before of course). By applying well known techniques to cache pointers we can have the node with the highest priority in O(1), which is specially nice for the rt cases. Furthermore, some callers can call msg_get() in a loop. A new msg_tree_erase() helper is also added to encapsulate the tree removal and node_cache game. Passes ltp mq testcases. Link: http://lkml.kernel.org/r/20190321190216.1719-2-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * ipc/mqueue: remove redundant wq task assignmentDavidlohr Bueso2019-04-281-2/+0
| | We already store the current task for the new waiter before calling wq_sleep() in both send and recv paths. Trivially remove the redundant assignment. Link: http://lkml.kernel.org/r/20190321190216.1719-1-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * ipc: prevent lockup on alloc_msg and free_msgLi Rongqing2019-04-282-2/+14
| |
| | The msgctl10 test from LTP triggers the lockups shown below.
| |
| | When CONFIG_KASAN is enabled on large-memory SMP systems, page
| | initialization can take a long time. If msgctl10 requests a huge block
| | of memory, this blocks the RCU scheduler, so release the CPU actively.
| |
| | After adding schedule() to free_msg(), free_msg() can no longer be
| | called while holding a spinlock, so add the messages to a temporary
| | list and free them outside the spinlock.
| |
| | [79441.630467] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
| | [79441.637566] rcu: Tasks blocked on level-1 rcu_node (CPUs 16-31): P32505
| | [79441.645355] rcu: Tasks blocked on level-1 rcu_node (CPUs 48-63): P34978
| | [79441.653149] rcu: (detected by 11, t=35024 jiffies, g=44237529, q=16542267)
| | [79441.661247] msgctl10 R running task 21608 32505 2794 0x00000082
| | [79441.669455] Call Trace:
| | [79441.736659] preempt_schedule_irq+0x4c/0xb0
| | [79441.741578] retint_kernel+0x1b/0x2d
| | [79441.745796] RIP: 0010:__is_insn_slot_addr+0xfb/0x250
| | [79441.751595] Code: 82 1d 00 48 8b 9b 90 00 00 00 4c 89 f7 49 c1 ee 03 e8 59 83 1d 00 48 b8 00 00 00 00 00 fc ff df 4c 39 eb 48 89 9d 58 ff ff ff <41> c6 04 06 f8 74 66 4c 8d 75 98 4c 89 f1 48 c1 e9 03 48 01 c8 48
| | [79441.773232] RSP: 0018:ffff88bce041f758 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
| | [79441.782071] RAX: dffffc0000000000 RBX: ffffffff8471bc50 RCX: ffffffff828a2a57
| | [79441.790337] RDX: dffffc0000000000 RSI: dffffc0000000000 RDI: ffff88bce041f780
| | [79441.798612] RBP: ffff88bce041f828 R08: ffffed15f3f4c5b3 R09: ffffed15f3f4c5b3
| | [79441.806877] R10: 0000000000000001 R11: ffffed15f3f4c5b2 R12: 000000318aee9b73
| | [79441.815139] R13: ffffffff8471bc50 R14: 1ffff1179c083ef0 R15: 1ffff1179c083eec
| | [79441.848618] kernel_text_address+0xc1/0x100
| | [79441.853542] __kernel_text_address+0xe/0x30
| | [79441.858453] unwind_get_return_address+0x2f/0x50
| | [79441.863864] __save_stack_trace+0x92/0x100
| | [79441.868742] create_object+0x380/0x650
| | [79441.911831] __kmalloc+0x14c/0x2b0
| | [79441.915874] load_msg+0x38/0x1a0
| | [79441.919726] do_msgsnd+0x19e/0xcf0
| | [79442.006475] do_syscall_64+0x117/0x400
| | [79442.037964] entry_SYSCALL_64_after_hwframe+0x49/0xbe
| |
| | [79386.022357] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
| | [79386.029455] rcu: Tasks blocked on level-1 rcu_node (CPUs 0-15): P32170
| | [79386.037146] rcu: (detected by 14, t=35016 jiffies, g=44237525, q=12423063)
| | [79386.045242] msgctl10 R running task 21608 32170 32155 0x00000082
| | [79386.053447] Call Trace:
| | [79386.107584] preempt_schedule_irq+0x4c/0xb0
| | [79386.112495] retint_kernel+0x1b/0x2d
| | [79386.116712] RIP: 0010:lock_acquire+0x4d/0x340
| | [79386.121816] Code: 48 81 ec c0 00 00 00 45 89 c6 4d 89 cf 48 8d 6c 24 20 48 89 3c 24 48 8d bb e4 0c 00 00 89 74 24 0c 48 c7 44 24 20 b3 8a b5 41 <48> c1 ed 03 48 c7 44 24 28 b4 25 18 84 48 c7 44 24 30 d0 54 7a 82
| | [79386.143446] RSP: 0018:ffff88af83417738 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
| | [79386.152278] RAX: dffffc0000000000 RBX: ffff88bd335f3080 RCX: 0000000000000002
| | [79386.160543] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88bd335f3d64
| | [79386.168798] RBP: ffff88af83417758 R08: 0000000000000000 R09: 0000000000000000
| | [79386.177049] R10: 0000000000000001 R11: ffffed13f3f745b2 R12: 0000000000000000
| | [79386.185308] R13: 0000000000000002 R14: 0000000000000000 R15: 0000000000000000
| | [79386.213791] is_bpf_text_address+0x32/0xe0
| | [79386.223516] kernel_text_address+0xec/0x100
| | [79386.233532] __kernel_text_address+0xe/0x30
| | [79386.238448] unwind_get_return_address+0x2f/0x50
| | [79386.243858] __save_stack_trace+0x92/0x100
| | [79386.252648] save_stack+0x32/0xb0
| | [79386.357923] __kasan_slab_free+0x130/0x180
| | [79386.362745] kfree+0xfa/0x2d0
| | [79386.366291] free_msg+0x24/0x50
| | [79386.370020] do_msgrcv+0x508/0xe60
| | [79386.446596] do_syscall_64+0x117/0x400
| | [79386.478122] entry_SYSCALL_64_after_hwframe+0x49/0xbe
| |
| | Davidlohr said:
| |
| | : So after releasing the lock, the msg rbtree/list is empty and new calls
| | : will not see those in the newly populated tmp_msg list, and therefore they
| | : cannot access the delayed msg freeing pointers, which is good. Also the
| | : fact that the node_cache is now freed before the actual messages seems to
| | : be harmless as this is wanted for msg_insert() avoiding GFP_ATOMIC
| | : allocations, and after releasing the info->lock the thing is freed anyway
| | : so it should not change things.
| |
| | Link: http://lkml.kernel.org/r/1552029161-4957-1-git-send-email-lirongqing@baidu.com
| | Signed-off-by: Li RongQing <lirongqing@baidu.com>
| | Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
| | Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
| | Cc: Manfred Spraul <manfred@colorfullife.com>
| | Cc: Arnd Bergmann <arnd@arndb.de>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
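The "move to a temporary list, free outside the lock" pattern described in this commit can be sketched in userspace Python. This is only an illustration of the idea, not the kernel code: `MessageQueue`, `purge`, and `release` are hypothetical names, and a `threading.Lock` stands in for the spinlock.

```python
import threading


class MessageQueue:
    """Toy queue: messages are detached under the lock, released afterwards."""

    def __init__(self):
        self._lock = threading.Lock()   # stands in for the kernel spinlock
        self._messages = []

    def send(self, msg):
        with self._lock:
            self._messages.append(msg)

    def purge(self, release):
        # Detach everything while holding the lock; this part is cheap.
        with self._lock:
            tmp, self._messages = self._messages, []
        # The lock is dropped here, so `release` may be slow or may yield
        # without stalling anyone who is waiting for the lock.
        for msg in tmp:
            release(msg)
        return len(tmp)
```

Because the shared list is already empty when the lock is released, concurrent callers can no longer reach the messages that are waiting to be freed, which matches the reasoning in the quoted review comment.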
| * scripts/gdb: add $lx_clk_core_lookup function  Leonard Crestez  2019-04-28  1  -0/+23
| |
| | Finding an individual clk_core requires walking the tree which can be
| | quite complicated so add a helper for easy access.
| |
| |   (gdb) print *(struct clk_scu*)$lx_clk_core_lookup("uart0_clk")->hw
| |
| | Link: http://lkml.kernel.org/r/Message-ID:
| | Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
| | Cc: Jan Kiszka <jan.kiszka@siemens.com>
| | Cc: Jason Wessel <jason.wessel@windriver.com>
| | Cc: Kieran Bingham <kbingham@kernel.org>
| | Cc: Stephen Boyd <swboyd@chromium.org>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * scripts/gdb: initial clk support: lx-clk-summary  Leonard Crestez  2019-04-28  2  -0/+47
| |
| | Add an lx-clk-summary command which prints a subset of
| | /sys/kernel/debug/clk/clk_summary.
| |
| | This can be used to examine hangs caused by clk not being enabled.
| |
| | Link: http://lkml.kernel.org/r/Message-ID:
| | Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
| | Cc: Jan Kiszka <jan.kiszka@siemens.com>
| | Cc: Jason Wessel <jason.wessel@windriver.com>
| | Cc: Kieran Bingham <kbingham@kernel.org>
| | Cc: Stephen Boyd <swboyd@chromium.org>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * scripts/gdb: add hlist utilities  Leonard Crestez  2019-04-28  1  -0/+23
| |
| | This allows easily examining kernel hlists in python.
| |
| | Link: http://lkml.kernel.org/r/Message-ID:
| | Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
| | Reviewed-by: Stephen Boyd <sboyd@kernel.org>
| | Cc: Jason Wessel <jason.wessel@windriver.com>
| | Cc: Jan Kiszka <jan.kiszka@siemens.com>
| | Cc: Kieran Bingham <kbingham@kernel.org>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
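For readers unfamiliar with hlists, the traversal such a helper performs can be sketched in plain Python. This is a standalone model, not the scripts/gdb code: `HlistHead`, `HlistNode`, and `hlist_for_each` are hypothetical names, the `pprev` back-pointer of the kernel's `struct hlist_node` is omitted, and ordinary Python objects stand in for `gdb.Value` pointers.

```python
class HlistNode:
    """Stand-in for struct hlist_node (only the forward link is modelled)."""

    def __init__(self, payload):
        self.payload = payload
        self.next = None


class HlistHead:
    """Stand-in for struct hlist_head: a single pointer to the first node."""

    def __init__(self):
        self.first = None

    def add_head(self, node):
        # New entries are pushed at the front, as hlist_add_head() does.
        node.next = self.first
        self.first = node


def hlist_for_each(head):
    """Yield every node reachable from head.first, front to back."""
    node = head.first
    while node is not None:
        yield node
        node = node.next
```

In a real gdb helper the loop would stop on a null pointer value rather than `None`, but the shape of the walk is the same.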
| * scripts/gdb: silence pep8 checks  Stephen Boyd  2019-04-28  5  -5/+16
| |
| | These scripts have some pep8 style warnings. Fix them up so that this
| | directory is all pep8 clean.
| |
| | Link: http://lkml.kernel.org/r/20190329220844.38234-6-swboyd@chromium.org
| | Signed-off-by: Stephen Boyd <swboyd@chromium.org>
| | Cc: Douglas Anderson <dianders@chromium.org>
| | Cc: Nikolay Borisov <n.borisov.lkml@gmail.com>
| | Cc: Kieran Bingham <kbingham@kernel.org>
| | Cc: Jan Kiszka <jan.kiszka@siemens.com>
| | Cc: Jackie Liu <liuyun01@kylinos.cn>
| | Cc: Jason Wessel <jason.wessel@windriver.com>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * scripts-gdb-add-a-timer-list-command-v2  Stephen Boyd  2019-04-28  2  -56/+86
| |
| | * Fixed some TODOs in timerlist printing
| | * cpumask printing
| | * jiffies printing uses jiffies_64 now to avoid conversion problems
| |
| | Link: http://lkml.kernel.org/r/20190329220844.38234-5-swboyd@chromium.org
| | Signed-off-by: Stephen Boyd <swboyd@chromium.org>
| | Cc: Douglas Anderson <dianders@chromium.org>
| | Cc: Nikolay Borisov <n.borisov.lkml@gmail.com>
| | Cc: Kieran Bingham <kbingham@kernel.org>
| | Cc: Jan Kiszka <jan.kiszka@siemens.com>
| | Cc: Jackie Liu <liuyun01@kylinos.cn>
| | Cc: Jason Wessel <jason.wessel@windriver.com>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * scripts/gdb: add a timer list command  Stephen Boyd  2019-04-28  3  -0/+203
| |
| | Implement a command to print the timer list, much like how
| | /proc/timer_list is implemented. This can be used to look at the
| | pending timers on a crashed system.
| |
| | Link: http://lkml.kernel.org/r/20190325184522.260535-5-swboyd@chromium.org
| | Signed-off-by: Stephen Boyd <swboyd@chromium.org>
| | Cc: Douglas Anderson <dianders@chromium.org>
| | Cc: Nikolay Borisov <n.borisov.lkml@gmail.com>
| | Cc: Kieran Bingham <kbingham@kernel.org>
| | Cc: Jan Kiszka <jan.kiszka@siemens.com>
| | Cc: Jackie Liu <liuyun01@kylinos.cn>
| | Cc: Jason Wessel <jason.wessel@windriver.com>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * scripts-gdb-add-rb-tree-iterating-utilities-v2  Stephen Boyd  2019-04-28  1  -8/+16
| |
| | Link: http://lkml.kernel.org/r/20190329220844.38234-4-swboyd@chromium.org
| | Signed-off-by: Stephen Boyd <swboyd@chromium.org>
| | Cc: Douglas Anderson <dianders@chromium.org>
| | Cc: Nikolay Borisov <n.borisov.lkml@gmail.com>
| | Cc: Kieran Bingham <kbingham@kernel.org>
| | Cc: Jan Kiszka <jan.kiszka@siemens.com>
| | Cc: Jackie Liu <liuyun01@kylinos.cn>
| | Cc: Jason Wessel <jason.wessel@windriver.com>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * scripts/gdb: add rb tree iterating utilities  Stephen Boyd  2019-04-28  2  -0/+170
| |
| | Implement gdb functions for rb_first(), rb_last(), rb_next(), and
| | rb_prev(). These can be useful to iterate through the kernel's
| | red-black trees.
| |
| | Link: http://lkml.kernel.org/r/20190325184522.260535-4-swboyd@chromium.org
| | Signed-off-by: Stephen Boyd <swboyd@chromium.org>
| | Cc: Douglas Anderson <dianders@chromium.org>
| | Cc: Nikolay Borisov <n.borisov.lkml@gmail.com>
| | Cc: Kieran Bingham <kbingham@kernel.org>
| | Cc: Jan Kiszka <jan.kiszka@siemens.com>
| | Cc: Jackie Liu <liuyun01@kylinos.cn>
| | Cc: Jason Wessel <jason.wessel@windriver.com>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
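The in-order walk behind rb_first() and rb_next() can be sketched in plain Python. This is a standalone model and not the scripts/gdb implementation: `RbNode`, `bst_insert`, and `rb_inorder` are hypothetical names, the tree is built by unbalanced insertion (colours do not matter for traversal), and an explicit `parent` attribute replaces the kernel's packed `__rb_parent_color` field.

```python
class RbNode:
    def __init__(self, key):
        self.key = key
        self.parent = None
        self.left = None
        self.right = None


def bst_insert(root, key):
    """Plain BST insert (no rebalancing); returns the root of the tree."""
    new = RbNode(key)
    if root is None:
        return new
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = new
                break
            node = node.left
        else:
            if node.right is None:
                node.right = new
                break
            node = node.right
    new.parent = node
    return root


def rb_first(root):
    """Leftmost node of the tree, or None for an empty tree."""
    node = root
    while node is not None and node.left is not None:
        node = node.left
    return node


def rb_next(node):
    """In-order successor of node, or None if node is the last one."""
    if node.right is not None:
        # Successor is the leftmost node of the right subtree.
        node = node.right
        while node.left is not None:
            node = node.left
        return node
    # Otherwise climb until we arrive at a parent from its left child.
    parent = node.parent
    while parent is not None and node is parent.right:
        node = parent
        parent = node.parent
    return parent


def rb_inorder(root):
    """Yield nodes in ascending key order using rb_first()/rb_next()."""
    node = rb_first(root)
    while node is not None:
        yield node
        node = rb_next(node)
```

rb_last() and rb_prev() are the mirror images, with `left` and `right` swapped.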
| * scripts-gdb-add-kernel-config-dumping-command-v2  Stephen Boyd  2019-04-28  1  -13/+9
| |
| | * Fixed config dumping script off-by-one error on builtin config size
| | * Silenced pep8 style warnings and errors
| |
| | Link: http://lkml.kernel.org/r/20190329220844.38234-3-swboyd@chromium.org
| | Signed-off-by: Stephen Boyd <swboyd@chromium.org>
| | Cc: Douglas Anderson <dianders@chromium.org>
| | Cc: Nikolay Borisov <n.borisov.lkml@gmail.com>
| | Cc: Kieran Bingham <kbingham@kernel.org>
| | Cc: Jan Kiszka <jan.kiszka@siemens.com>
| | Cc: Jackie Liu <liuyun01@kylinos.cn>
| | Cc: Jason Wessel <jason.wessel@windriver.com>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * scripts/gdb: add kernel config dumping command  Stephen Boyd  2019-04-28  2  -0/+49
| |
| | lx-configdump <file> dumps the contents of the gzipped .config to a
| | text file when the config is included in the kernel with
| | CONFIG_IKCONFIG. By default, the file written is called config.txt,
| | but it can be any user supplied filename as well. If the kernel config
| | is in a module (configs.ko), then it can be loaded along with symbols
| | for the module loaded with 'lx-symbols' and then this command will
| | still work.
| |
| | Obviously if you have the whole vmlinux then this can also be achieved
| | with scripts/extract-ikconfig, but this gdb script can be useful to
| | confirm that the memory contents of the config in memory and the
| | vmlinux contents on disk match what is expected.
| |
| | Link: http://lkml.kernel.org/r/20190325184522.260535-3-swboyd@chromium.org
| | Signed-off-by: Stephen Boyd <swboyd@chromium.org>
| | Cc: Douglas Anderson <dianders@chromium.org>
| | Cc: Nikolay Borisov <n.borisov.lkml@gmail.com>
| | Cc: Kieran Bingham <kbingham@kernel.org>
| | Cc: Jan Kiszka <jan.kiszka@siemens.com>
| | Cc: Jackie Liu <liuyun01@kylinos.cn>
| | Cc: Jason Wessel <jason.wessel@windriver.com>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
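Outside gdb, the same idea (locate the embedded gzip blob and decompress it) can be sketched in a few lines of Python. This is not the lx-configdump implementation. It assumes the config data is bracketed by the `IKCFG_ST` and `IKCFG_ED` magic strings, which is my recollection of what scripts/extract-ikconfig searches for; `extract_ikconfig` is a hypothetical helper name.

```python
import gzip

# Assumed markers around the embedded config (verify against your tree).
IKCFG_START = b"IKCFG_ST"
IKCFG_END = b"IKCFG_ED"


def extract_ikconfig(blob):
    """Return the decompressed .config text found inside `blob` (bytes)."""
    start = blob.find(IKCFG_START)
    if start < 0:
        raise ValueError("start marker not found")
    start += len(IKCFG_START)
    # Search from the back so marker-like bytes inside the gzip stream
    # cannot cut the payload short.
    end = blob.rfind(IKCFG_END)
    if end < start:
        raise ValueError("end marker not found")
    return gzip.decompress(blob[start:end]).decode()
```

Comparing the output of such a helper run on the vmlinux file with the text dumped from target memory is one way to do the "memory matches disk" check the commit message mentions.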
| * scripts/gdb: find vmlinux where it was before  Stephen Boyd  2019-04-28  1  -1/+5
| |
| | Patch series "gdb script for kconfig and timer list".
| |
| | This is a handful of changes to the kernel's gdb scripts to do some
| | more debugging with kgdb. The first patch allows the vmlinux to be
| | reloaded from where it was specified on the command line so that this
| | set of scripts can be used from anywhere. The second patch adds a
| | script to dump the config.gz to a file on the host debugging machine.
| | The third patch adds some rb tree utilities and the last patch uses
| | those rb tree walking utilities to dump out the contents of
| | /proc/timer_list from a system under debug.
| |
| | This patch (of 5):
| |
| | If I run 'gdb <path/to/vmlinux>' and there's the vmlinux-gdb.py file
| | there I can properly see symbols and use the lx commands provided by
| | the GDB scripts. But once I run 'lx-symbols' at the command prompt,
| | gdb reloads the vmlinux symbols assuming that this script was run from
| | the directory that has vmlinux at the root. That isn't always true,
| | but we could just look and see what symbols were already loaded and
| | use that instead. Let's do that so this can work by being invoked
| | anywhere.
| |
| | Link: http://lkml.kernel.org/r/20190325184522.260535-2-swboyd@chromium.org
| | Signed-off-by: Stephen Boyd <swboyd@chromium.org>
| | Cc: Douglas Anderson <dianders@chromium.org>
| | Cc: Nikolay Borisov <n.borisov.lkml@gmail.com>
| | Cc: Kieran Bingham <kbingham@kernel.org>
| | Cc: Jan Kiszka <jan.kiszka@siemens.com>
| | Cc: Jackie Liu <liuyun01@kylinos.cn>
| | Cc: Jason Wessel <jason.wessel@windriver.com>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * pps: pps-gpio PPS ECHO implementation  Tom Burkart  2019-04-28  2  -3/+87
| |
| | This patch implements the PPS ECHO functionality for pps-gpio, which
| | sysfs claims is already available.
| |
| | Configuration is done via device tree bindings.
| |
| | No changes are made to userspace interfaces.
| |
| | This patch was originally written by Lukas Senger as part of a
| | master's thesis project and modified for inclusion into the Linux
| | kernel by Tom Burkart.
| |
| | Link: http://lkml.kernel.org/r/20190324043305.6627-4-tom@aussec.com
| | Signed-off-by: Tom Burkart <tom@aussec.com>
| | Acked-by: Rodolfo Giometti <giometti@enneenne.com>
| | Signed-off-by: Lukas Senger <lukas@fridolin.com>
| | Cc: Philipp Zabel <philipp.zabel@gmail.com>
| | Cc: Rob Herring <robh@kernel.org>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * dt-bindings: pps: pps-gpio PPS ECHO implementation  Tom Burkart  2019-04-28  1  -0/+7
| |
| | This patch implements the device tree binding changes required for the
| | PPS ECHO functionality for pps-gpio, which sysfs claims is already
| | available.
| |
| | It adds two DT properties for configuring the PPS ECHO functionality.
| |
| | This patch is kept separate from the rest of the series, per
| | Documentation/devicetree/bindings/submitting-patches.txt.
| |
| | This patch was originally written by Lukas Senger as part of a
| | master's thesis project and modified for inclusion into the Linux
| | kernel by Tom Burkart.
| |
| | Link: http://lkml.kernel.org/r/20190324043305.6627-3-tom@aussec.com
| | Signed-off-by: Tom Burkart <tom@aussec.com>
| | Signed-off-by: Lukas Senger <lukas@fridolin.com>
| | Acked-by: Rodolfo Giometti <giometti@enneenne.com>
| | Reviewed-by: Rob Herring <robh@kernel.org>
| | Cc: Philipp Zabel <philipp.zabel@gmail.com>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * pps: descriptor-based gpio  Tom Burkart  2019-04-28  2  -38/+32
| |
| | This patch changes the GPIO access for the pps-gpio driver from the
| | integer based API to the descriptor based API.
| |
| | The integer based API is considered deprecated and the descriptor
| | based API is the preferred way to access GPIOs as per
| | Documentation/driver-api/gpio/intro.rst.
| |
| | No changes are made to userspace interfaces.
| |
| | Link: http://lkml.kernel.org/r/20190324043305.6627-2-tom@aussec.com
| | Signed-off-by: Tom Burkart <tom@aussec.com>
| | Acked-by: Rodolfo Giometti <giometti@enneenne.com>
| | Reviewed-by: Philipp Zabel <philipp.zabel@gmail.com>
| | Cc: Lukas Senger <lukas@fridolin.com>
| | Cc: Rob Herring <robh@kernel.org>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * panic-add-an-option-to-replay-all-the-printk-message-in-buffer-v4  Feng Tang  2019-04-28  1  -2/+2
| |
| | Keep the original console_flush_on_panic() inside panic(), as
| | suggested by Petr Mladek.
| |
| | Link: http://lkml.kernel.org/r/1556199137-14163-1-git-send-email-feng.tang@intel.com
| | Signed-off-by: Feng Tang <feng.tang@intel.com>
| | Cc: Aaro Koskinen <aaro.koskinen@nokia.com>
| | Cc: Petr Mladek <pmladek@suse.com>
| | Cc: Steven Rostedt <rostedt@goodmis.org>
| | Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
| | Cc: Kees Cook <keescook@chromium.org>
| | Cc: Borislav Petkov <bp@suse.de>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * panic: add an option to replay all the printk message in buffer  Feng Tang  2019-04-28  5  -5/+21
| |
| | Currently on panic, the kernel lowers the loglevel and prints only the
| | pending printk messages with console_flush_on_panic().
| |
| | Add an option to "panic_print" that lets users replay all dmesg in the
| | buffer, some of which they may never have seen due to the loglevel
| | setting. This helps panic debugging.
| |
| | Link: http://lkml.kernel.org/r/1556095872-36838-1-git-send-email-feng.tang@intel.com
| | Signed-off-by: Feng Tang <feng.tang@intel.com>
| | Cc: Aaro Koskinen <aaro.koskinen@nokia.com>
| | Cc: Petr Mladek <pmladek@suse.com>
| | Cc: Steven Rostedt <rostedt@goodmis.org>
| | Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
| | Cc: Kees Cook <keescook@chromium.org>
| | Cc: Borislav Petkov <bp@suse.de>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
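"panic_print" is a bitmask, so the new option is one more bit in it. The sketch below decodes such a mask in Python. The bit values and descriptions are my recollection of kernel/panic.c and the kernel-parameters documentation, not something stated in this commit message, so treat them as assumptions and check them against the tree you are running; `describe_panic_print` is a hypothetical helper.

```python
# Assumed bit layout of the panic_print mask (verify against kernel/panic.c).
PANIC_PRINT_FLAGS = [
    (0x01, "all tasks info"),
    (0x02, "system memory info"),
    (0x04, "timer info"),
    (0x08, "locks info (needs CONFIG_LOCKDEP)"),
    (0x10, "ftrace buffer"),
    (0x20, "all printk messages in buffer"),  # the bit this commit adds
]


def describe_panic_print(mask):
    """Return the descriptions of every bit that is set in `mask`."""
    return [name for bit, name in PANIC_PRINT_FLAGS if mask & bit]
```

Under these assumptions, a value of 0x21 would request both the task dump and the full log replay.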
| * panic/reboot: allow specifying reboot_mode for panic only  Aaro Koskinen  2019-04-28  4  -6/+22
| |
| | Allow specifying reboot_mode for panic only. This is needed on systems
| | where ramoops is used to store panic logs, and user wants to use warm
| | reset to preserve those, while still having cold reset on normal
| | reboots.
| |
| | Link: http://lkml.kernel.org/r/20190322004735.27702-1-aaro.koskinen@iki.fi
| | Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com>
| | Reviewed-by: Kees Cook <keescook@chromium.org>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * panic: avoid the extra noise dmesg  Feng Tang  2019-04-28  4  -0/+21
| |
| | When a kernel panic happens, the kernel first prints the panic call
| | stack, then an ending message like:
| |
| |   [ 35.743249] ---[ end Kernel panic - not syncing: Fatal exception
| |   [ 35.749975] ------------[ cut here ]------------
| |
| | The above messages are very useful for debugging.
| |
| | But if the system is configured not to reboot on panic, say the
| | "panic_timeout" parameter equals 0, it will likely print many noisy
| | messages, such as a WARN() call stack for each and every CPU except
| | the panicking one, like below:
| |
| |   WARNING: CPU: 1 PID: 280 at kernel/sched/core.c:1198 set_task_cpu+0x183/0x190
| |   Call Trace:
| |   <IRQ>
| |   try_to_wake_up
| |   default_wake_function
| |   autoremove_wake_function
| |   __wake_up_common
| |   __wake_up_common_lock
| |   __wake_up
| |   wake_up_klogd_work_func
| |   irq_work_run_list
| |   irq_work_tick
| |   update_process_times
| |   tick_sched_timer
| |   __hrtimer_run_queues
| |   hrtimer_interrupt
| |   smp_apic_timer_interrupt
| |   apic_timer_interrupt
| |
| | For people working in console mode, the screen first shows the panic
| | call stack, but it is immediately overwritten by these noisy extra
| | messages, which makes debugging much more difficult, as the original
| | context gets lost on screen.
| |
| | These noisy messages also confuse some users: I have seen many bug
| | reporters post the noisy messages to bugzilla instead of the real
| | panic call stack and context.
| |
| | Add a flag "suppress_printk", which gets set in panic() to avoid those
| | noisy messages without changing the current kernel behavior: both
| | panic blinking and the sysrq magic key keep working as is. Suggested
| | by Petr Mladek.
| |
| | To verify this, make sure the kernel is not configured to reboot on
| | panic, then run the following in a console:
| |
| |   # echo c > /proc/sysrq-trigger
| |
| | and check that the console prints only the panic call stack.
| |
| | Link: http://lkml.kernel.org/r/1551430186-24169-1-git-send-email-feng.tang@intel.com
| | Signed-off-by: Feng Tang <feng.tang@intel.com>
| | Suggested-by: Petr Mladek <pmladek@suse.com>
| | Reviewed-by: Petr Mladek <pmladek@suse.com>
| | Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| | Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
| | Cc: Thomas Gleixner <tglx@linutronix.de>
| | Cc: Kees Cook <keescook@chromium.org>
| | Cc: Borislav Petkov <bp@suse.de>
| | Cc: Andi Kleen <ak@linux.intel.com>
| | Cc: Peter Zijlstra <peterz@infradead.org>
| | Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| | Cc: Jiri Slaby <jslaby@suse.com>
| | Cc: Sasha Levin <sashal@kernel.org>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * gcov-clang-support-checkpatch-fixes  Andrew Morton  2019-04-28  1  -7/+2
| |
| | WARNING: Non-standard signature: Co-authored-by:
| | #31:
| | Co-authored-by: Nick Desaulniers <ndesaulniers@google.com>
| |
| | WARNING: Non-standard signature: Co-authored-by:
| | #32:
| | Co-authored-by: Tri Vo <trong@android.com>
| |
| | WARNING: Possible unnecessary 'out of memory' message
| | #158: FILE: kernel/gcov/clang.c:90:
| | + if (!info) {
| | + pr_warn_ratelimited("failed to allocate gcov info\n");
| |
| | WARNING: Possible unnecessary 'out of memory' message
| | #193: FILE: kernel/gcov/clang.c:125:
| | + if (!info) {
| | + pr_warn_ratelimited("failed to allocate gcov function info for %s\n",
| |
| | WARNING: line over 80 characters
| | #546: FILE: kernel/gcov/clang.c:478:
| | + pos += store_gcov_u32(buffer, pos, fi_ptr->cfg_checksum);
| |
| | total: 0 errors, 5 warnings, 663 lines checked
| |
| | NOTE: For some of the reported defects, checkpatch may be able to
| |       mechanically convert to the typical style using --fix or --fix-inplace.
| |
| | ./patches/gcov-clang-support.patch has style problems, please review.
| |
| | NOTE: If any of the errors are false positives, please report
| |       them to the maintainer, see CHECKPATCH in MAINTAINERS.
| |
| | Please run checkpatch prior to sending patches
| |
| | Cc: Daniel Mentz <danielmentz@google.com>
| | Cc: Greg Hackmann <ghackmann@android.com>
| | Cc: Nick Desaulniers <ndesaulniers@google.com>
| | Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
| | Cc: Petri Gynther <pgynther@google.com>
| | Cc: Prasad Sodagudi <psodagud@quicinc.com>
| | Cc: Trilok Soni <tsoni@quicinc.com>
| | Cc: Tri Vo <trong@android.com>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * gcov: clang support  Greg Hackmann  2019-04-28  7  -2/+616
| |
| | LLVM uses profiling data that's deliberately similar to GCC, but has a
| | very different way of exporting that data. LLVM calls llvm_gcov_init()
| | once per module, and provides a couple of callbacks that we can use to
| | ask for more data.
| |
| | We care about the "writeout" callback, which in turn calls back into
| | compiler-rt/this module to dump all the gathered coverage data to
| | disk:
| |
| |   llvm_gcda_start_file()
| |     llvm_gcda_emit_function()
| |     llvm_gcda_emit_arcs()
| |     llvm_gcda_emit_function()
| |     llvm_gcda_emit_arcs()
| |     [... repeats for each function ...]
| |   llvm_gcda_summary_info()
| |   llvm_gcda_end_file()
| |
| | This design is much more stateless and unstructured than gcc's, and is
| | intended to run at process exit. This forces us to keep some local
| | state about which module we're dealing with at the moment. On the
| | other hand, it also means we don't depend as much on how LLVM
| | represents profiling data internally.
| |
| | See LLVM's lib/Transforms/Instrumentation/GCOVProfiling.cpp for more
| | details on how this works, particularly
| | GCOVProfiler::emitProfileArcs(), GCOVProfiler::insertCounterWriteout(),
| | and GCOVProfiler::insertFlush().
| |
| | Link: http://lkml.kernel.org/r/20190417225328.208129-1-trong@android.com
| | Signed-off-by: Greg Hackmann <ghackmann@android.com>
| | Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
| | Signed-off-by: Tri Vo <trong@android.com>
| | Co-developed-by: Nick Desaulniers <ndesaulniers@google.com>
| | Co-developed-by: Tri Vo <trong@android.com>
| | Tested-by: Trilok Soni <tsoni@quicinc.com>
| | Tested-by: Prasad Sodagudi <psodagud@quicinc.com>
| | Tested-by: Tri Vo <trong@android.com>
| | Tested-by: Daniel Mentz <danielmentz@google.com>
| | Tested-by: Petri Gynther <pgynther@google.com>
| | Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
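Why the callback sequence forces the receiver to keep "local state about which module we're dealing with" can be shown with a small Python model. This is a simplified sketch, not kernel/gcov/clang.c: the class and method names are hypothetical, the real llvm_gcda_* callbacks take different arguments, and the summary callback is left out.

```python
class GcovCollector:
    """Toy receiver for a writeout-style callback sequence."""

    def __init__(self):
        self.files = {}
        # Record being filled by the writeout currently in progress. The
        # per-function callbacks do not say which file they belong to, so
        # this is the state that has to be remembered between calls.
        self._current = None

    def start_file(self, name):
        self._current = {"functions": []}
        self.files[name] = self._current

    def emit_function(self, ident):
        self._current["functions"].append({"ident": ident, "counters": []})

    def emit_arcs(self, counters):
        # Arcs carry no function id; they attach to the latest function.
        self._current["functions"][-1]["counters"] = list(counters)

    def end_file(self):
        self._current = None
```

Each call is only meaningful relative to the calls before it, which is the "stateless and unstructured" property the commit message contrasts with GCC's approach.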
| * gcov: docs: add a note on GCC vs Clang differences  Tri Vo  2019-04-28  1  -4/+14
| |
| | Document some things of note to gcov users:
| | 1. GCC gcov and Clang llvm-cov tools are not compatible.
| | 2. The use of GCC vs Clang is transparent at build-time.
| |
| | Also adjust the documentation to account for the removal of config
| | symbol CONFIG_GCOV_FORMAT_AUTODETECT by commit 6a61b70b43c9 ("gcov:
| | remove CONFIG_GCOV_FORMAT_AUTODETECT").
| |
| | Link: http://lkml.kernel.org/r/20190318025411.98014-4-trong@android.com
| | Signed-off-by: Tri Vo <trong@android.com>
| | Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
| | Cc: Daniel Mentz <danielmentz@google.com>
| | Cc: Greg Hackmann <ghackmann@android.com>
| | Cc: Nick Desaulniers <ndesaulniers@google.com>
| | Cc: Petri Gynther <pgynther@google.com>
| | Cc: Prasad Sodagudi <psodagud@quicinc.com>
| | Cc: Trilok Soni <tsoni@quicinc.com>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * gcov: clang: move common GCC code into gcc_base.c  Greg Hackmann  2019-04-28  4  -84/+93
| |
| | Patch series "gcov: add Clang support", v4.
| |
| | This patch (of 3):
| |
| | base.c contains a few callbacks specific to GCC's gcov implementation.
| | Move these into their own module in preparation for Clang support.
| |
| | Link: http://lkml.kernel.org/r/20190318025411.98014-2-trong@android.com
| | Signed-off-by: Greg Hackmann <ghackmann@android.com>
| | Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
| | Signed-off-by: Tri Vo <trong@android.com>
| | Tested-by: Trilok Soni <tsoni@quicinc.com>
| | Tested-by: Prasad Sodagudi <psodagud@quicinc.com>
| | Tested-by: Tri Vo <trong@android.com>
| | Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
| | Cc: Daniel Mentz <danielmentz@google.com>
| | Cc: Petri Gynther <pgynther@google.com>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>