summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorJohannes Weiner <hannes@cmpxchg.org>2019-04-17 15:30:59 +1000
committerStephen Rothwell <sfr@canb.auug.org.au>2019-04-17 17:01:44 +1000
commitb360cf7974ea29880ed4a1da7ebbc64e259d750e (patch)
tree660f6e65ad2e43493b67c63ed6cf6df93a1c2df4
parentf599dc6dd07ebfeb9858b980fd284f16d18b63c9 (diff)
downloadlinux-b360cf7974ea29880ed4a1da7ebbc64e259d750e.tar.gz
linux-b360cf7974ea29880ed4a1da7ebbc64e259d750e.tar.xz
mm: memcontrol: fix NUMA round-robin reclaim at intermediate level
When a cgroup is reclaimed on behalf of a configured limit, reclaim needs to round-robin through all NUMA nodes that hold pages of the memcg in question. However, when assembling the mask of candidate NUMA nodes, the code only consults the *local* cgroup LRU counters, not the recursive counters for the entire subtree. Cgroup limits are frequently configured against intermediate cgroups that do not have memory on their own LRUs. In this case, the node mask will always come up empty and reclaim falls back to scanning only the current node. If a cgroup subtree has some memory on one node but the processes are bound to another node afterwards, the limit reclaim will never age or reclaim that memory anymore. To fix this, use the recursive LRU counts for a cgroup subtree to determine which nodes hold memory of that cgroup. The code has been broken like this forever, so it doesn't seem to be a problem in practice. I just noticed it while reviewing the way the LRU counters are used in general. Link: http://lkml.kernel.org/r/20190412151507.2769-5-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Roman Gushchin <guro@fb.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
-rw-r--r--mm/memcontrol.c8
1 files changed, 4 insertions, 4 deletions
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2eb2d4ef9b34..2535e54e7989 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1512,13 +1512,13 @@ static bool test_mem_cgroup_node_reclaimable(struct mem_cgroup *memcg,
{
struct lruvec *lruvec = mem_cgroup_lruvec(NODE_DATA(nid), memcg);
- if (lruvec_page_state_local(lruvec, NR_INACTIVE_FILE) ||
- lruvec_page_state_local(lruvec, NR_ACTIVE_FILE))
+ if (lruvec_page_state(lruvec, NR_INACTIVE_FILE) ||
+ lruvec_page_state(lruvec, NR_ACTIVE_FILE))
return true;
if (noswap || !total_swap_pages)
return false;
- if (lruvec_page_state_local(lruvec, NR_INACTIVE_ANON) ||
- lruvec_page_state_local(lruvec, NR_ACTIVE_ANON))
+ if (lruvec_page_state(lruvec, NR_INACTIVE_ANON) ||
+ lruvec_page_state(lruvec, NR_ACTIVE_ANON))
return true;
return false;