bpf: Make cgroup storages shared between programs on the same cgroup

This change comes in several parts:

First, the restriction that a CGROUP_STORAGE map can only be used by one program is removed. This results in the removal of the field 'aux' in struct bpf_cgroup_storage_map, the removal of the code associated with that field, and the removal of the now-noop functions bpf_free_cgroup_storage and bpf_cgroup_storage_release.

Second, we permit a key of type u64 as the key to the map. Providing such a key type indicates that the map should ignore attach type when comparing map keys. However, for simplicity, newly linked storage will still have the attach type at link time in its key struct. cgroup_storage_check_btf is adapted to accept u64 as the type of the key.

Third, because the storages are now shared, they can no longer be unconditionally freed on program detach. There are two ways to solve this issue:
 * A. Reference count the usage of the storages, and free when the last program is detached.
 * B. Free only when the storage can never be referred to again, i.e. when either the cgroup_bpf it is attached to, or the map itself, is freed.
Option A has the side effect that, when the user detaches and reattaches a program, whether the program gets a fresh storage depends on whether there is another program attached using that storage. This could trigger races if the user is multi-threaded, and since nondeterminism in data races is evil, go with option B.

Both the map and the cgroup_bpf now track their associated storages, and the storage unlink and free are removed from cgroup_bpf_detach and added to cgroup_bpf_release and cgroup_storage_map_free. The latter also now holds the cgroup_mutex to prevent any races with the former.

Fourth, on attach, we reuse the old storage if the key already exists in the map, via cgroup_storage_lookup. If the storage does not exist yet, we create a new one, and publish it at the last step in the attach process. This does not create a race condition because the cgroup_mutex is held for the whole attach. We keep track of an array of new storages that were allocated, and if the process fails only the new storages are freed.

Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/d5401c6106728a00890401190db40020a1f84ff1.1595565795.git.zhuyifei@google.com
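
As an illustration of the second part of the message above, the key-type rule can be modelled in user space. This is a minimal sketch and not the kernel code: `struct storage_key`, `key_is_isolated` and `storage_key_cmp` are hypothetical names, and the sketch assumes that a map declared with the two-field struct key compares both fields, while a map declared with a u64 key compares only the cgroup id.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model of a cgroup storage key. */
struct storage_key {
	uint64_t cgroup_id;   /* identifies the cgroup */
	uint32_t attach_type; /* attach type recorded at link time */
};

/*
 * Assumption for this sketch: a struct-sized key means "isolate by
 * attach type"; a u64-sized key means "share across attach types".
 */
static bool key_is_isolated(size_t key_size)
{
	return key_size == sizeof(struct storage_key);
}

/* Three-way compare that ignores attach_type for u64-keyed maps. */
static int storage_key_cmp(size_t key_size,
			   const struct storage_key *a,
			   const struct storage_key *b)
{
	assert(a != NULL && b != NULL);

	if (a->cgroup_id != b->cgroup_id)
		return a->cgroup_id < b->cgroup_id ? -1 : 1;
	if (!key_is_isolated(key_size))
		return 0; /* shared map: same cgroup is the same key */
	if (a->attach_type != b->attach_type)
		return a->attach_type < b->attach_type ? -1 : 1;
	return 0;
}
```

With a u64-sized key, two keys for the same cgroup compare equal even when their stored attach types differ, which is why the attach type can stay in the key struct without affecting lookups.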
author: YiFei Zhu <zhuyifei@google.com> 2020-07-23 23:47:43 -0500
committer: Alexei Starovoitov <ast@kernel.org> 2020-07-25 20:16:35 -0700
commit: 7d9c3427894fe70d1347b4820476bf37736d2ff0 (patch)
tree: b3bde4987056582ca21d13b75589cca8026e03b3 /kernel/bpf/core.c
parent: 9e5bd1f7633bc1c3c8b25496eedfeced6d2675ff (diff)
download: linux-7d9c3427894fe70d1347b4820476bf37736d2ff0.tar.gz
linux-7d9c3427894fe70d1347b4820476bf37736d2ff0.tar.xz
1 files changed, 0 insertions, 12 deletions
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 7be02e555ab9..bde93344164d 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2097,24 +2097,12 @@ int bpf_prog_array_copy_info(struct bpf_prog_array *array,
 								     : 0;
 }
 
-static void bpf_free_cgroup_storage(struct bpf_prog_aux *aux)
-{
-	enum bpf_cgroup_storage_type stype;
-
-	for_each_cgroup_storage_type(stype) {
-		if (!aux->cgroup_storage[stype])
-			continue;
-		bpf_cgroup_storage_release(aux, aux->cgroup_storage[stype]);
-	}
-}
-
 void __bpf_free_used_maps(struct bpf_prog_aux *aux,
 			  struct bpf_map **used_maps, u32 len)
 {
 	struct bpf_map *map;
 	u32 i;
 
-	bpf_free_cgroup_storage(aux);
 	for (i = 0; i < len; i++) {
 		map = used_maps[i];
 		if (map->ops->map_poke_untrack)
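
The hunk above drops the per-program release of cgroup storage. The lifetime rule described in the commit message (reuse on attach, publish as the last step, free only newly allocated storage on failure, keep storage across detach) can be modelled in user space roughly as follows. This is a sketch under stated assumptions, not the kernel implementation: `toy_map`, `toy_attach`, `toy_detach` and `toy_map_free` are hypothetical names, the `fail_late` flag stands in for any later attach step failing, and locking is omitted (the kernel holds cgroup_mutex for the whole attach).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_STORAGES 8 /* hypothetical capacity of the toy map */

struct storage {
	uint64_t cgroup_id;
	long value;
};

struct toy_map {
	struct storage *slots[MAX_STORAGES];
	size_t used;
};

static struct storage *toy_lookup(const struct toy_map *m, uint64_t cgroup_id)
{
	for (size_t i = 0; i < m->used; i++)
		if (m->slots[i]->cgroup_id == cgroup_id)
			return m->slots[i];
	return NULL;
}

/*
 * Attach: reuse an existing storage if the key is present, otherwise
 * allocate one and publish it only after every other step succeeded.
 * Returns NULL on failure, in which case nothing was published.
 */
static struct storage *toy_attach(struct toy_map *m, uint64_t cgroup_id,
				  bool fail_late)
{
	assert(m != NULL);

	struct storage *old = toy_lookup(m, cgroup_id);
	struct storage *fresh = NULL;

	if (!old) {
		if (m->used == MAX_STORAGES)
			return NULL;
		fresh = calloc(1, sizeof(*fresh));
		if (!fresh)
			return NULL;
		fresh->cgroup_id = cgroup_id;
	}

	if (fail_late) {
		/* Roll back: only the new allocation is freed, never 'old'. */
		free(fresh);
		return NULL;
	}

	if (fresh)
		m->slots[m->used++] = fresh; /* publish as the last step */
	return old ? old : fresh;
}

/* Detach deliberately leaves the storage in place (option B). */
static void toy_detach(struct toy_map *m, uint64_t cgroup_id)
{
	(void)m;
	(void)cgroup_id;
}

/* Storage is freed only when its owner (here, the map) goes away. */
static void toy_map_free(struct toy_map *m)
{
	for (size_t i = 0; i < m->used; i++)
		free(m->slots[i]);
	m->used = 0;
}
```

In this model a detach followed by a reattach for the same cgroup always returns the same storage, regardless of what other programs are attached, which is the determinism argument the commit message gives for option B.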