From 19d67694745c5c9ed085d0d8332fa02d835a07d0 Mon Sep 17 00:00:00 2001 From: Rodrigo Campos Date: Fri, 2 Jul 2021 17:19:27 +0200 Subject: Documentation: seccomp: Fix typo in user notification The close on exec flag is O_CLOEXEC, not O_EXEC. This patch just fixes the typo. Suggested-by: Christian Brauner Signed-off-by: Rodrigo Campos Acked-by: Christian Brauner Signed-off-by: Kees Cook Fixes: 0ae71c7720e3 ("seccomp: Support atomic "addfd + send reply"") Link: https://lore.kernel.org/r/20210702151927.263402-1-rodrigo@kinvolk.io --- Documentation/userspace-api/seccomp_filter.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/userspace-api/seccomp_filter.rst b/Documentation/userspace-api/seccomp_filter.rst index d61219889e49..539e9d4a4860 100644 --- a/Documentation/userspace-api/seccomp_filter.rst +++ b/Documentation/userspace-api/seccomp_filter.rst @@ -263,7 +263,7 @@ Userspace can also add file descriptors to the notifying process via ``ioctl(SECCOMP_IOCTL_NOTIF_ADDFD)``. The ``id`` member of ``struct seccomp_notif_addfd`` should be the same ``id`` as in ``struct seccomp_notif``. The ``newfd_flags`` flag may be used to set flags -like O_EXEC on the file descriptor in the notifying process. If the supervisor +like O_CLOEXEC on the file descriptor in the notifying process. If the supervisor wants to inject the file descriptor with a specific number, the ``SECCOMP_ADDFD_FLAG_SETFD`` flag can be used, and set the ``newfd`` member to the specific number to use. If that file descriptor is already open in the -- cgit v1.2.3 From 4592ee7f525c4683ec9e290381601fdee50ae110 Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Wed, 4 Aug 2021 15:02:15 +0200 Subject: netfilter: conntrack: remove offload_pickup sysctl again These two sysctls were added because the hardcoded defaults (2 minutes, tcp, 30 seconds, udp) turned out to be too low for some setups. They appeared in 5.14-rc1 so it should be fine to remove it again. Marcelo convinced me that there should be no difference between a flow that was offloaded vs. a flow that was not wrt. timeout handling. Thus the default is changed to those for TCP established and UDP stream, 5 days and 120 seconds, respectively. Marcelo also suggested to account for the timeout value used for the offloading, this avoids increase beyond the value in the conntrack-sysctl and will also instantly expire the conntrack entry with altered sysctls. Example: nf_conntrack_udp_timeout_stream=60 nf_flowtable_udp_timeout=60 This will remove offloaded udp flows after one minute, rather than two. An earlier version of this patch also cleared the ASSURED bit to allow nf_conntrack to evict the entry via early_drop (i.e., table full). However, it looks like we can safely assume that connection timed out via HW is still in established state, so this isn't needed. Quoting Oz: [..] the hardware sends all packets with a set FIN flags to sw. [..] Connections that are aged in hardware are expected to be in the established state. In case it turns out that back-to-sw-path transition can occur for 'dodgy' connections too (e.g., one side disappeared while software-path would have been in RETRANS timeout), we can adjust this later. Cc: Oz Shlomo Cc: Paul Blakey Suggested-by: Marcelo Ricardo Leitner Signed-off-by: Florian Westphal Reviewed-by: Marcelo Ricardo Leitner Reviewed-by: Oz Shlomo Signed-off-by: Pablo Neira Ayuso --- Documentation/networking/nf_conntrack-sysctl.rst | 10 ---------- include/net/netns/conntrack.h | 2 -- net/netfilter/nf_conntrack_proto_tcp.c | 1 - net/netfilter/nf_conntrack_proto_udp.c | 1 - net/netfilter/nf_conntrack_standalone.c | 16 ---------------- net/netfilter/nf_flow_table_core.c | 11 ++++++++--- 6 files changed, 8 insertions(+), 33 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/nf_conntrack-sysctl.rst b/Documentation/networking/nf_conntrack-sysctl.rst index d31ed6c1cb0d..024d784157c8 100644 --- a/Documentation/networking/nf_conntrack-sysctl.rst +++ b/Documentation/networking/nf_conntrack-sysctl.rst @@ -191,19 +191,9 @@ nf_flowtable_tcp_timeout - INTEGER (seconds) TCP connections may be offloaded from nf conntrack to nf flow table. Once aged, the connection is returned to nf conntrack with tcp pickup timeout. -nf_flowtable_tcp_pickup - INTEGER (seconds) - default 120 - - TCP connection timeout after being aged from nf flow table offload. - nf_flowtable_udp_timeout - INTEGER (seconds) default 30 Control offload timeout for udp connections. UDP connections may be offloaded from nf conntrack to nf flow table. Once aged, the connection is returned to nf conntrack with udp pickup timeout. - -nf_flowtable_udp_pickup - INTEGER (seconds) - default 30 - - UDP connection timeout after being aged from nf flow table offload. diff --git a/include/net/netns/conntrack.h b/include/net/netns/conntrack.h index 37e5300c7e5a..fefd38db95b3 100644 --- a/include/net/netns/conntrack.h +++ b/include/net/netns/conntrack.h @@ -30,7 +30,6 @@ struct nf_tcp_net { u8 tcp_ignore_invalid_rst; #if IS_ENABLED(CONFIG_NF_FLOW_TABLE) unsigned int offload_timeout; - unsigned int offload_pickup; #endif }; @@ -44,7 +43,6 @@ struct nf_udp_net { unsigned int timeouts[UDP_CT_MAX]; #if IS_ENABLED(CONFIG_NF_FLOW_TABLE) unsigned int offload_timeout; - unsigned int offload_pickup; #endif }; diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c index 3259416f2ea4..af5115e127cf 100644 --- a/net/netfilter/nf_conntrack_proto_tcp.c +++ b/net/netfilter/nf_conntrack_proto_tcp.c @@ -1478,7 +1478,6 @@ void nf_conntrack_tcp_init_net(struct net *net) #if IS_ENABLED(CONFIG_NF_FLOW_TABLE) tn->offload_timeout = 30 * HZ; - tn->offload_pickup = 120 * HZ; #endif } diff --git a/net/netfilter/nf_conntrack_proto_udp.c b/net/netfilter/nf_conntrack_proto_udp.c index 698fee49e732..f8e3c0d2602f 100644 --- a/net/netfilter/nf_conntrack_proto_udp.c +++ b/net/netfilter/nf_conntrack_proto_udp.c @@ -271,7 +271,6 @@ void nf_conntrack_udp_init_net(struct net *net) #if IS_ENABLED(CONFIG_NF_FLOW_TABLE) un->offload_timeout = 30 * HZ; - un->offload_pickup = 30 * HZ; #endif } diff --git a/net/netfilter/nf_conntrack_standalone.c b/net/netfilter/nf_conntrack_standalone.c index 214d9f9e499b..e84b499b7bfa 100644 --- a/net/netfilter/nf_conntrack_standalone.c +++ b/net/netfilter/nf_conntrack_standalone.c @@ -575,7 +575,6 @@ enum nf_ct_sysctl_index { NF_SYSCTL_CT_PROTO_TIMEOUT_TCP_UNACK, #if IS_ENABLED(CONFIG_NF_FLOW_TABLE) NF_SYSCTL_CT_PROTO_TIMEOUT_TCP_OFFLOAD, - NF_SYSCTL_CT_PROTO_TIMEOUT_TCP_OFFLOAD_PICKUP, #endif NF_SYSCTL_CT_PROTO_TCP_LOOSE, NF_SYSCTL_CT_PROTO_TCP_LIBERAL, @@ -585,7 +584,6 @@ enum nf_ct_sysctl_index { NF_SYSCTL_CT_PROTO_TIMEOUT_UDP_STREAM, #if IS_ENABLED(CONFIG_NF_FLOW_TABLE) NF_SYSCTL_CT_PROTO_TIMEOUT_UDP_OFFLOAD, - NF_SYSCTL_CT_PROTO_TIMEOUT_UDP_OFFLOAD_PICKUP, #endif NF_SYSCTL_CT_PROTO_TIMEOUT_ICMP, NF_SYSCTL_CT_PROTO_TIMEOUT_ICMPV6, @@ -776,12 +774,6 @@ static struct ctl_table nf_ct_sysctl_table[] = { .mode = 0644, .proc_handler = proc_dointvec_jiffies, }, - [NF_SYSCTL_CT_PROTO_TIMEOUT_TCP_OFFLOAD_PICKUP] = { - .procname = "nf_flowtable_tcp_pickup", - .maxlen = sizeof(unsigned int), - .mode = 0644, - .proc_handler = proc_dointvec_jiffies, - }, #endif [NF_SYSCTL_CT_PROTO_TCP_LOOSE] = { .procname = "nf_conntrack_tcp_loose", @@ -832,12 +824,6 @@ static struct ctl_table nf_ct_sysctl_table[] = { .mode = 0644, .proc_handler = proc_dointvec_jiffies, }, - [NF_SYSCTL_CT_PROTO_TIMEOUT_UDP_OFFLOAD_PICKUP] = { - .procname = "nf_flowtable_udp_pickup", - .maxlen = sizeof(unsigned int), - .mode = 0644, - .proc_handler = proc_dointvec_jiffies, - }, #endif [NF_SYSCTL_CT_PROTO_TIMEOUT_ICMP] = { .procname = "nf_conntrack_icmp_timeout", @@ -1018,7 +1004,6 @@ static void nf_conntrack_standalone_init_tcp_sysctl(struct net *net, #if IS_ENABLED(CONFIG_NF_FLOW_TABLE) table[NF_SYSCTL_CT_PROTO_TIMEOUT_TCP_OFFLOAD].data = &tn->offload_timeout; - table[NF_SYSCTL_CT_PROTO_TIMEOUT_TCP_OFFLOAD_PICKUP].data = &tn->offload_pickup; #endif } @@ -1111,7 +1096,6 @@ static int nf_conntrack_standalone_init_sysctl(struct net *net) table[NF_SYSCTL_CT_PROTO_TIMEOUT_UDP_STREAM].data = &un->timeouts[UDP_CT_REPLIED]; #if IS_ENABLED(CONFIG_NF_FLOW_TABLE) table[NF_SYSCTL_CT_PROTO_TIMEOUT_UDP_OFFLOAD].data = &un->offload_timeout; - table[NF_SYSCTL_CT_PROTO_TIMEOUT_UDP_OFFLOAD_PICKUP].data = &un->offload_pickup; #endif nf_conntrack_standalone_init_tcp_sysctl(net, table); diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c index 551976e4284c..8788b519255e 100644 --- a/net/netfilter/nf_flow_table_core.c +++ b/net/netfilter/nf_flow_table_core.c @@ -183,7 +183,7 @@ static void flow_offload_fixup_ct_timeout(struct nf_conn *ct) const struct nf_conntrack_l4proto *l4proto; struct net *net = nf_ct_net(ct); int l4num = nf_ct_protonum(ct); - unsigned int timeout; + s32 timeout; l4proto = nf_ct_l4proto_find(l4num); if (!l4proto) @@ -192,15 +192,20 @@ static void flow_offload_fixup_ct_timeout(struct nf_conn *ct) if (l4num == IPPROTO_TCP) { struct nf_tcp_net *tn = nf_tcp_pernet(net); - timeout = tn->offload_pickup; + timeout = tn->timeouts[TCP_CONNTRACK_ESTABLISHED]; + timeout -= tn->offload_timeout; } else if (l4num == IPPROTO_UDP) { struct nf_udp_net *tn = nf_udp_pernet(net); - timeout = tn->offload_pickup; + timeout = tn->timeouts[UDP_CT_REPLIED]; + timeout -= tn->offload_timeout; } else { return; } + if (timeout < 0) + timeout = 0; + if (nf_flow_timeout_delta(ct->timeout) > (__s32)timeout) ct->timeout = nfct_time_stamp + timeout; } -- cgit v1.2.3 From 7c4a22339e7ce7b6ed473a8e682da622c3a774ee Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Sun, 1 Aug 2021 18:50:37 -0700 Subject: libbpf, doc: Eliminate warnings in libbpf_naming_convention Use "code-block: none" instead of "c" for non-C-language code blocks. Removes these warnings: lnx-514-rc4/Documentation/bpf/libbpf/libbpf_naming_convention.rst:111: WARNING: Could not lex literal_block as "c". Highlighting skipped. lnx-514-rc4/Documentation/bpf/libbpf/libbpf_naming_convention.rst:124: WARNING: Could not lex literal_block as "c". Highlighting skipped. Fixes: f42cfb469f9b ("bpf: Add documentation for libbpf including API autogen") Signed-off-by: Randy Dunlap Signed-off-by: Daniel Borkmann Link: https://lore.kernel.org/bpf/20210802015037.787-1-rdunlap@infradead.org --- Documentation/bpf/libbpf/libbpf_naming_convention.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/bpf/libbpf/libbpf_naming_convention.rst b/Documentation/bpf/libbpf/libbpf_naming_convention.rst index 3de1d51e41da..6bf9c5ac7576 100644 --- a/Documentation/bpf/libbpf/libbpf_naming_convention.rst +++ b/Documentation/bpf/libbpf/libbpf_naming_convention.rst @@ -108,7 +108,7 @@ This bump in ABI version is at most once per kernel development cycle. For example, if current state of ``libbpf.map`` is: -.. code-block:: c +.. code-block:: none LIBBPF_0.0.1 { global: @@ -121,7 +121,7 @@ For example, if current state of ``libbpf.map`` is: , and a new symbol ``bpf_func_c`` is being introduced, then ``libbpf.map`` should be changed like this: -.. code-block:: c +.. code-block:: none LIBBPF_0.0.1 { global: -- cgit v1.2.3 From ffd5caa26f6afde0c1e3ed126806607748a83c6e Mon Sep 17 00:00:00 2001 From: Daniel Vetter Date: Tue, 10 Aug 2021 16:27:48 +0200 Subject: drm/doc/rfc: drop lmem uapi section We still have quite a bit more work to do with overall reworking of the ttm-based dg1 code, but the uapi stuff is now finalized with the latest pull. So remove that. This also fixes kerneldoc build warnings because we've included the same headers in two places, resulting in sphinx complaining about duplicated symbols. This regression has been created when we moved the uapi definitions to the real include/uapi/ folder in 727ecd99a4c9 ("drm/doc/rfc: drop the i915_gem_lmem.h header") v2: Fix a few references that I missed, the htmldocs build took forever. Acked-by: Jason Ekstrand Acked-by: Maarten Lankhorst Tested-by Stephen Rothwell (v1) References: https://lore.kernel.org/dri-devel/20210603193242.1ce99344@canb.auug.org.au/ Reported-by: Stephen Rothwell Cc: Stephen Rothwell Fixes: 727ecd99a4c9 ("drm/doc/rfc: drop the i915_gem_lmem.h header") Cc: Matthew Auld Signed-off-by: Daniel Vetter Link: https://patchwork.freedesktop.org/patch/msgid/20210810142748.1983271-1-daniel.vetter@ffwll.ch (cherry picked from commit dae2d28832968751f7731336b560a4a84a197b76) Signed-off-by: Rodrigo Vivi --- Documentation/gpu/rfc/i915_gem_lmem.rst | 109 -------------------------------- 1 file changed, 109 deletions(-) (limited to 'Documentation') diff --git a/Documentation/gpu/rfc/i915_gem_lmem.rst b/Documentation/gpu/rfc/i915_gem_lmem.rst index 675ba8620d66..b421a3c1806e 100644 --- a/Documentation/gpu/rfc/i915_gem_lmem.rst +++ b/Documentation/gpu/rfc/i915_gem_lmem.rst @@ -18,114 +18,5 @@ real, with all the uAPI bits is: * Route shmem backend over to TTM SYSTEM for discrete * TTM purgeable object support * Move i915 buddy allocator over to TTM - * MMAP ioctl mode(see `I915 MMAP`_) - * SET/GET ioctl caching(see `I915 SET/GET CACHING`_) * Send RFC(with mesa-dev on cc) for final sign off on the uAPI * Add pciid for DG1 and turn on uAPI for real - -New object placement and region query uAPI -========================================== -Starting from DG1 we need to give userspace the ability to allocate buffers from -device local-memory. Currently the driver supports gem_create, which can place -buffers in system memory via shmem, and the usual assortment of other -interfaces, like dumb buffers and userptr. - -To support this new capability, while also providing a uAPI which will work -beyond just DG1, we propose to offer three new bits of uAPI: - -DRM_I915_QUERY_MEMORY_REGIONS ------------------------------ -New query ID which allows userspace to discover the list of supported memory -regions(like system-memory and local-memory) for a given device. We identify -each region with a class and instance pair, which should be unique. The class -here would be DEVICE or SYSTEM, and the instance would be zero, on platforms -like DG1. - -Side note: The class/instance design is borrowed from our existing engine uAPI, -where we describe every physical engine in terms of its class, and the -particular instance, since we can have more than one per class. - -In the future we also want to expose more information which can further -describe the capabilities of a region. - -.. kernel-doc:: include/uapi/drm/i915_drm.h - :functions: drm_i915_gem_memory_class drm_i915_gem_memory_class_instance drm_i915_memory_region_info drm_i915_query_memory_regions - -GEM_CREATE_EXT --------------- -New ioctl which is basically just gem_create but now allows userspace to provide -a chain of possible extensions. Note that if we don't provide any extensions and -set flags=0 then we get the exact same behaviour as gem_create. - -Side note: We also need to support PXP[1] in the near future, which is also -applicable to integrated platforms, and adds its own gem_create_ext extension, -which basically lets userspace mark a buffer as "protected". - -.. kernel-doc:: include/uapi/drm/i915_drm.h - :functions: drm_i915_gem_create_ext - -I915_GEM_CREATE_EXT_MEMORY_REGIONS ----------------------------------- -Implemented as an extension for gem_create_ext, we would now allow userspace to -optionally provide an immutable list of preferred placements at creation time, -in priority order, for a given buffer object. For the placements we expect -them each to use the class/instance encoding, as per the output of the regions -query. Having the list in priority order will be useful in the future when -placing an object, say during eviction. - -.. kernel-doc:: include/uapi/drm/i915_drm.h - :functions: drm_i915_gem_create_ext_memory_regions - -One fair criticism here is that this seems a little over-engineered[2]. If we -just consider DG1 then yes, a simple gem_create.flags or something is totally -all that's needed to tell the kernel to allocate the buffer in local-memory or -whatever. However looking to the future we need uAPI which can also support -upcoming Xe HP multi-tile architecture in a sane way, where there can be -multiple local-memory instances for a given device, and so using both class and -instance in our uAPI to describe regions is desirable, although specifically -for DG1 it's uninteresting, since we only have a single local-memory instance. - -Existing uAPI issues -==================== -Some potential issues we still need to resolve. - -I915 MMAP ---------- -In i915 there are multiple ways to MMAP GEM object, including mapping the same -object using different mapping types(WC vs WB), i.e multiple active mmaps per -object. TTM expects one MMAP at most for the lifetime of the object. If it -turns out that we have to backpedal here, there might be some potential -userspace fallout. - -I915 SET/GET CACHING --------------------- -In i915 we have set/get_caching ioctl. TTM doesn't let us to change this, but -DG1 doesn't support non-snooped pcie transactions, so we can just always -allocate as WB for smem-only buffers. If/when our hw gains support for -non-snooped pcie transactions then we must fix this mode at allocation time as -a new GEM extension. - -This is related to the mmap problem, because in general (meaning, when we're -not running on intel cpus) the cpu mmap must not, ever, be inconsistent with -allocation mode. - -Possible idea is to let the kernel picks the mmap mode for userspace from the -following table: - -smem-only: WB. Userspace does not need to call clflush. - -smem+lmem: We only ever allow a single mode, so simply allocate this as uncached -memory, and always give userspace a WC mapping. GPU still does snooped access -here(assuming we can't turn it off like on DG1), which is a bit inefficient. - -lmem only: always WC - -This means on discrete you only get a single mmap mode, all others must be -rejected. That's probably going to be a new default mode or something like -that. - -Links -===== -[1] https://patchwork.freedesktop.org/series/86798/ - -[2] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599#note_553791 -- cgit v1.2.3