Age | Commit message | Author | Files | Lines
2020-11-14 | ipv6: remove unused function ipv6_skb_idev() | Lukas Bulwahn | 1 | -5/+0
Commit bdb7cc643fc9 ("ipv6: Count interface receive statistics on the ingress netdev") removed all callees for ipv6_skb_idev(). Hence, since then, ipv6_skb_idev() is unused and make CC=clang W=1 warns: net/ipv6/exthdrs.c:909:33: warning: unused function 'ipv6_skb_idev' [-Wunused-function] So, remove this unused function and a -Wunused-function warning. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Link: https://lore.kernel.org/r/20201113135012.32499-1-lukas.bulwahn@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next | Jakub Kicinski | 83 | -1271/+3907
Daniel Borkmann says:
====================
pull-request: bpf-next 2020-11-14

1) Add BTF generation for kernel modules and extend BTF infra in kernel e.g. support for split BTF loading and validation, from Andrii Nakryiko.
2) Support for pointers beyond pkt_end to recognize LLVM generated patterns on inlined branch conditions, from Alexei Starovoitov.
3) Implements bpf_local_storage for task_struct for BPF LSM, from KP Singh.
4) Enable FENTRY/FEXIT/RAW_TP tracing program to use the bpf_sk_storage infra, from Martin KaFai Lau.
5) Add XDP bulk APIs that introduce a defer/flush mechanism to optimize the XDP_REDIRECT path, from Lorenzo Bianconi.
6) Fix a potential (although rather theoretical) deadlock of hashtab in NMI context, from Song Liu.
7) Fixes for cross and out-of-tree build of bpftool and runqslower allowing build for different target archs on same source tree, from Jean-Philippe Brucker.
8) Fix error path in htab_map_alloc() triggered from syzbot, from Eric Dumazet.
9) Move functionality from test_tcpbpf_user into the test_progs framework so it can run in BPF CI, from Alexander Duyck.
10) Lift hashtab key_size limit to be larger than MAX_BPF_STACK, from Florian Lehner.

Note that for the fix from Song we have seen a sparse report on context imbalance which requires changes in sparse itself for proper annotation detection where this is currently being discussed on linux-sparse among developers [0]. Once we have more clarification/guidance after their fix, Song will follow-up.

[0] https://lore.kernel.org/linux-sparse/CAHk-=wh4bx8A8dHnX612MsDO13st6uzAz1mJ1PaHHVevJx_ZCw@mail.gmail.com/T/
    https://lore.kernel.org/linux-sparse/20201109221345.uklbp3lzgq6g42zb@ltop.local/T/

* git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (66 commits)
  net: mlx5: Add xdp tx return bulking support
  net: mvpp2: Add xdp tx return bulking support
  net: mvneta: Add xdp tx return bulking support
  net: page_pool: Add bulk support for ptr_ring
  net: xdp: Introduce bulking for xdp tx return path
  bpf: Expose bpf_d_path helper to sleepable LSM hooks
  bpf: Augment the set of sleepable LSM hooks
  bpf: selftest: Use bpf_sk_storage in FENTRY/FEXIT/RAW_TP
  bpf: Allow using bpf_sk_storage in FENTRY/FEXIT/RAW_TP
  bpf: Rename some functions in bpf_sk_storage
  bpf: Folding omem_charge() into sk_storage_charge()
  selftests/bpf: Add asm tests for pkt vs pkt_end comparison.
  selftests/bpf: Add skb_pkt_end test
  bpf: Support for pointers beyond pkt_end.
  tools/bpf: Always run the *-clean recipes
  tools/bpf: Add bootstrap/ to .gitignore
  bpf: Fix NULL dereference in bpf_task_storage
  tools/bpftool: Fix build slowdown
  tools/runqslower: Build bpftool using HOSTCC
  tools/runqslower: Enable out-of-tree build
  ...
====================
Link: https://lore.kernel.org/r/20201114020819.29584-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 | net: phy: mscc: Add PTP support for 2 more VSC PHYs | Steen Hegelund | 1 | -0/+2
Add VSC8572 and VSC8574 in the PTP configuration as they also support PTP. The relevant datasheets can be found here: - VSC8572: https://www.microchip.com/wwwproducts/en/VSC8572 - VSC8574: https://www.microchip.com/wwwproducts/en/VSC8574 Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Link: https://lore.kernel.org/r/20201112092250.914079-1-steen.hegelund@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 | Merge branch 'xdp-redirect-bulk' | Daniel Borkmann | 7 | -17/+192
Lorenzo Bianconi says:
====================
XDP bulk APIs introduce a defer/flush mechanism to return pages belonging to the same xdp_mem_allocator object (identified via the mem.id field) in bulk to optimize I-cache and D-cache since xdp_return_frame is usually run inside the driver NAPI tx completion loop.
Convert mvneta, mvpp2 and mlx5 drivers to xdp_return_frame_bulk APIs.
More details on benchmarks run on mlx5 can be found here: https://github.com/xdp-project/xdp-project/blob/master/areas/mem/xdp_bulk_return01.org

Changes since v5:
- do not keep looping over ptr_ring if the cache is full but release leftover pages running page_pool_return_page

Changes since v4:
- fix comments
- introduce xdp_frame_bulk_init utility routine
- compiler annotations for I-cache code layout
- move rcu_read_lock outside fast-path
- mlx5 xdp bulking code optimization

Changes since v3:
- align DEV_MAP_BULK_SIZE to XDP_BULK_QUEUE_SIZE
- refactor page_pool_put_page_bulk to avoid code duplication

Changes since v2:
- move mvneta changes in a dedicated patch

Changes since v1:
- improve comments
- rework xdp_return_frame_bulk routine logic
- move count and xa fields at the beginning of xdp_frame_bulk struct
- invert logic in page_pool_put_page_bulk for loop
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
2020-11-14 | net: mlx5: Add xdp tx return bulking support | Lorenzo Bianconi | 1 | -4/+18
Convert mlx5 driver to xdp_return_frame_bulk APIs. XDP_REDIRECT (upstream codepath): 8.9Mpps XDP_REDIRECT (upstream codepath + bulking APIs): 10.2Mpps Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/250460319fd868b7b5668fc1deca74dd42813a90.1605267335.git.lorenzo@kernel.org
2020-11-14 | net: mvpp2: Add xdp tx return bulking support | Lorenzo Bianconi | 1 | -1/+9
Convert mvpp2 driver to xdp_return_frame_bulk APIs. XDP_REDIRECT (upstream codepath): 1.79Mpps XDP_REDIRECT (upstream codepath + bulking APIs): 1.93Mpps Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Matteo Croce <mcroce@microsoft.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/0b38c295e58e8ce251ef6b4e2187a2f457f9f7a3.1605267335.git.lorenzo@kernel.org
2020-11-14 | net: mvneta: Add xdp tx return bulking support | Lorenzo Bianconi | 1 | -1/+9
Convert mvneta driver to xdp_return_frame_bulk APIs. XDP_REDIRECT (upstream codepath): 275Kpps XDP_REDIRECT (upstream codepath + bulking APIs): 284Kpps Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/9af8014006d022fc0fec78cdaa71beb56999750d.1605267335.git.lorenzo@kernel.org
2020-11-14 | net: page_pool: Add bulk support for ptr_ring | Lorenzo Bianconi | 3 | -17/+88
Introduce the capability to batch page_pool ptr_ring refill since it is usually run inside the driver NAPI tx completion loop. Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com> Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Link: https://lore.kernel.org/bpf/08dd249c9522c001313f520796faa777c4089e1c.1605267335.git.lorenzo@kernel.org
2020-11-14 | net: xdp: Introduce bulking for xdp tx return path | Lorenzo Bianconi | 2 | -1/+75
XDP bulk APIs introduce a defer/flush mechanism to return pages belonging to the same xdp_mem_allocator object (identified via the mem.id field) in bulk to optimize I-cache and D-cache since xdp_return_frame is usually run inside the driver NAPI tx completion loop. The bulk queue size is set to 16 to be aligned to how XDP_REDIRECT bulking works. The bulk is flushed when it is full or when mem.id changes. xdp_frame_bulk is usually stored/allocated on the function call-stack to avoid locking penalties. Current implementation considers only page_pool memory model. Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com> Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Link: https://lore.kernel.org/bpf/e190c03eac71b20c8407ae0fc2c399eda7835f49.1605267335.git.lorenzo@kernel.org
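The defer/flush rule described in this commit message (queue of 16, flush when full or when mem.id changes, queue kept on the caller's stack) can be illustrated with a small userspace sketch. This is not the kernel code: `struct frame`, `struct frame_bulk`, `bulk_init`, `bulk_flush` and `bulk_return` are hypothetical names, and the "return to allocator" step is reduced to counters so the behaviour is observable.

```c
#include <assert.h>
#include <stddef.h>

#define BULK_SIZE 16 /* queue size quoted in the commit message */

/* Hypothetical stand-in for a frame: only the allocator id matters here. */
struct frame {
	int mem_id;
};

/* Hypothetical bulk queue, intended to live on the caller's stack. */
struct frame_bulk {
	int mem_id;                  /* allocator id of the queued frames */
	size_t count;                /* frames currently queued */
	struct frame *q[BULK_SIZE];
	size_t flushes;              /* demo bookkeeping: number of batches */
	size_t returned;             /* demo bookkeeping: frames handed back */
};

static void bulk_init(struct frame_bulk *bq)
{
	bq->mem_id = -1;
	bq->count = 0;
	bq->flushes = 0;
	bq->returned = 0;
}

/* Hand every queued frame back in one batch (simulated by counters). */
static void bulk_flush(struct frame_bulk *bq)
{
	if (bq->count == 0)
		return;
	bq->returned += bq->count;
	bq->flushes++;
	bq->count = 0;
}

/*
 * Defer the return of one frame. The queue is flushed first when it is
 * full or when the new frame belongs to a different allocator.
 */
static void bulk_return(struct frame_bulk *bq, struct frame *f)
{
	if (bq->count == BULK_SIZE ||
	    (bq->count > 0 && bq->mem_id != f->mem_id))
		bulk_flush(bq);

	assert(bq->count < BULK_SIZE);
	bq->mem_id = f->mem_id;
	bq->q[bq->count++] = f;
}
```

A caller would run `bulk_init()` once, call `bulk_return()` for each completed frame inside its completion loop, and finish with one `bulk_flush()` so no frame stays queued.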
2020-11-13 | net: stmmac: platform: use optional clk/reset get APIs | Jisheng Zhang | 1 | -13/+9
Use the devm_reset_control_get_optional() and devm_clk_get_optional() rather than open coding them. Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Link: https://lore.kernel.org/r/20201112092606.5173aa6f@xhacker.debian Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 | r8169: improve rtl_tx | Heiner Kallweit | 1 | -4/+3
We can simplify the for() condition and eliminate variable tx_left. The change also considers that tp->cur_tx may be incremented by a racing rtl8169_start_xmit(). In addition replace the write to tp->dirty_tx and the following smp_mb() with an equivalent call to smp_store_mb(). This implicitly adds a WRITE_ONCE() to the write. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/c2e19e5e-3d3f-d663-af32-13c3374f5def@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 | r8169: use READ_ONCE in rtl_tx_slots_avail | Heiner Kallweit | 1 | -1/+2
tp->dirty_tx and tp->cur_tx may be changed by a racing rtl_tx() or rtl8169_start_xmit(). Use READ_ONCE() to annotate the races and ensure that the compiler doesn't use cached values. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/5676fee3-f6b4-84f2-eba5-c64949a371ad@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
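As a rough userspace illustration of why the two indices are read through READ_ONCE(), the sketch below computes the free slots of a transmit ring from a producer index and a consumer index. Everything here is an assumption made for the example: the `READ_ONCE` macro is a simplified volatile-cast stand-in (using the GCC/Clang `__typeof__` extension), and `struct ring`, `RING_SIZE` and `tx_slots_avail` are hypothetical names, not the driver's code.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 256u /* hypothetical number of descriptors */

/*
 * Simplified stand-in for the kernel macro: the volatile access forces
 * the compiler to load the value once, here, instead of reusing a
 * previously cached copy.
 */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical ring state; both indices only ever grow and may wrap. */
struct ring {
	uint32_t cur_tx;   /* next slot the transmit path will fill */
	uint32_t dirty_tx; /* next slot the completion path will clean */
};

/* Free slots = ring size minus the slots currently in flight. */
static uint32_t tx_slots_avail(struct ring *r)
{
	uint32_t avail = READ_ONCE(r->dirty_tx) + RING_SIZE -
			 READ_ONCE(r->cur_tx);

	/* Holds as long as at most RING_SIZE slots are in flight. */
	assert(avail <= RING_SIZE);
	return avail;
}
```

Because the arithmetic is unsigned, the result stays correct when the indices wrap around, provided the number of in-flight slots never exceeds the ring size.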
2020-11-13 | Merge branch 'net-ipa-two-fixes' | Jakub Kicinski | 2 | -3/+4
Alex Elder says: ==================== net: ipa: two fixes This small series makes two fixes to the IPA code: - While reviewing something else I found that one of the resource limits on the SDM845 used the wrong value. The first patch fixes this. The correct value allocates more resources of this type for IPA to use, and otherwise does not change behavior. - When the IPA-resident microcontroller starts up it generates an event, which triggers an AP interrupt. The event merely provides some information for logging, which we don't support. We already ignore the event, and that's harmless. So this patch explicitly ignores it rather than issuing a warning when it occurs. ==================== Link: https://lore.kernel.org/r/20201112121157.19784-1-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 | net: ipa: ignore the microcontroller log event | Alex Elder | 1 | -1/+2
The IPA-resident microcontroller has the ability to log various activity in an area of IPA shared memory. When the microcontroller starts it generates an event to the AP to provide information about the log. We don't support reading this log, and we can safely ignore the event. So do that rather than treating the log info event we receive as "unsupported." Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 | net: ipa: fix source packet contexts limit | Alex Elder | 1 | -2/+2
I have discovered that the maximum number of source packet contexts configured for SDM845 is incorrect. Fix this error. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 | Merge branch 'sfc-further-ef100-encap-tso-features' | Jakub Kicinski | 3 | -9/+37
Edward Cree says: ==================== sfc: further EF100 encap TSO features This series adds support for GRE and GRE_CSUM TSO on EF100 NICs, as well as improving the handling of UDP tunnel TSO. ==================== Link: https://lore.kernel.org/r/eda2de73-edf2-8b92-edb9-099ebda09ebc@solarflare.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 | sfc: support GRE TSO on EF100 | Edward Cree | 2 | -2/+11
We can treat SKB_GSO_GRE almost exactly the same as UDP tunnels, except that we don't want to edit the outer UDP len (as there isn't one). For SKB_GSO_GRE_CSUM, we have to use GSO_PARTIAL as the device doesn't support offload of non-UDP outer L4 checksums. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Martin Habets <mhabets@solarflare.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
2020-11-13 | sfc: correctly support non-partial GSO_UDP_TUNNEL_CSUM on EF100 | Edward Cree | 2 | -2/+5
By asking the HW for the correct edits, we can make UDP tunnel TSO work without needing GSO_PARTIAL. So don't specify it in our netdev->gso_partial_features. However, retain GSO_PARTIAL support, as this will be used for other protocols later. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Martin Habets <mhabets@solarflare.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
2020-11-13 | sfc: extend bitfield macros to 19 fields | Edward Cree | 1 | -5/+21
Our TSO descriptors got even more fussy. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Martin Habets <mhabets@solarflare.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
2020-11-13 | Merge branch 'net-ipa-gsi-register-consolidation' | Jakub Kicinski | 2 | -91/+93
Alex Elder says:
====================
net: ipa: GSI register consolidation

This series rearranges and consolidates some GSI register definitions. Its general aim is to make things more consistent, by:
- Using enumerated types to define the values held in GSI register fields
- Defining field values in "gsi_reg.h", together with the definition of the register (and field) that holds them
- Formatting enumerated type members consistently, with hexadecimal numeric values, and assignments aligned on the same column

There is one checkpatch "CHECK" warning requesting a blank line; I ignored that because my intention was to group certain definitions.
====================
Link: https://lore.kernel.org/r/20201110215922.23514-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 | net: ipa: use enumerated types for GSI field values | Alex Elder | 2 | -10/+18
Replace constants defined with an "_FVAL" suffix with values defined in enumerated types, to be consistent with other usage in the driver. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 | net: ipa: move GSI command opcode values into "gsi_reg.h" | Alex Elder | 2 | -22/+19
The gsi_ch_cmd_opcode, gsi_evt_cmd_opcode, and gsi_generic_cmd_opcode enumerated types are values that fields in the GSI command registers can take on. Move their definitions out of "gsi.c" and into "gsi_reg.h", alongside the definition of registers they are associated with. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 | net: ipa: move GSI error values into "gsi_reg.h" | Alex Elder | 2 | -20/+19
The gsi_err_code and gsi_err_type enumerated types are values that fields in the GSI ERROR_LOG register can take on. Move their definitions out of "gsi.c" and into "gsi_reg.h", alongside the definition of the ERROR_LOG register offset and field symbols. Drop the "_ERR" suffix in the names of the gsi_err_code members. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 | net: ipa: move channel type values into "gsi_reg.h" | Alex Elder | 2 | -8/+8
The gsi_channel_type enumerated type defines values used for the channel type/protocol for event rings and channels. Move its definition out of "gsi.c" and into "gsi_reg.h", alongside the definition of the CH_C_CNTXT_0 register offset and its fields. Add a comment near the definition of the EV_CH_E_CNTXT_0 register indicating this type is used for its EV_CHTYPE field. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 | net: ipa: use common value for channel type and protocol | Alex Elder | 1 | -17/+10
The numeric values that represent the event ring channel type are identical to the values that represent the matching protocol used for a channel. Use a new gsi_channel_type enumerated type to represent the values programmed for both cases, using "CHANNEL_TYPE" in member names in place of "EVT_CHTYPE" and "CHANNEL_PROTOCOL". Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 | net: ipa: define GSI interrupt types with enums | Alex Elder | 2 | -20/+25
Define the GSI global interrupt types with an enumerated type whose values are the bit positions representing the global interrupt types. Similarly, define the GSI general interrupt types with an enumerated type whose values are the bit positions of general interrupt types. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
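The idea of an enumerated type whose members are bit positions can be sketched as follows. The member names, the `BIT` helper and `irq_pending` are hypothetical and chosen only for illustration; they are not the identifiers or values used in "gsi_reg.h".

```c
#include <assert.h>
#include <stdint.h>

/* Turn a bit position into a single-bit mask. */
#define BIT(n) (UINT32_C(1) << (n))

/*
 * Hypothetical interrupt types. Each member is a bit position in a
 * status register, written in hexadecimal with aligned assignments.
 */
enum irq_type_id {
	IRQ_TYPE_CH_CTRL	= 0x0,
	IRQ_TYPE_EV_CTRL	= 0x1,
	IRQ_TYPE_GLOB_EE	= 0x2,
	IRQ_TYPE_COUNT,		/* number of defined types, not a type */
};

/* Return 1 if the given interrupt type is set in the status word. */
static int irq_pending(uint32_t status, enum irq_type_id id)
{
	assert(id < IRQ_TYPE_COUNT);
	return (status & BIT(id)) != 0;
}
```

Keeping positions rather than masks in the enum lets the same member serve both for building a mask with `BIT()` and for indexing per-type tables or handlers.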
2020-11-13 | tipc: fix -Wstringop-truncation warnings | Wenlin Kang | 1 | -4/+1
Replace strncpy() with strscpy() to fix the following warning:

In function 'bearer_name_validate',
    inlined from 'tipc_enable_bearer' at net/tipc/bearer.c:246:7:
net/tipc/bearer.c:141:2: warning: 'strncpy' specified bound 32 equals destination size [-Wstringop-truncation]
  strncpy(name_copy, name, TIPC_MAX_BEARER_NAME);
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Wenlin Kang <wenlin.kang@windriver.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Link: https://lore.kernel.org/r/20201112093442.8132-1-wenlin.kang@windriver.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
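The warning exists because strncpy() leaves the destination without a terminating NUL when the source fills the whole buffer. The userspace sketch below shows a copy helper in the spirit of strscpy(): it always terminates the destination and reports truncation. `bounded_copy` is a hypothetical name, and its return convention (-1 on truncation) is a simplification made for this example rather than the kernel's error code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy src into dst, which has room for size bytes. The destination is
 * always NUL-terminated when size > 0. Returns the number of characters
 * copied (excluding the NUL), or -1 if src did not fit.
 */
static long bounded_copy(char *dst, const char *src, size_t size)
{
	size_t i;

	assert(dst != NULL && src != NULL);
	if (size == 0)
		return -1;

	/* Copy at most size - 1 characters, leaving room for the NUL. */
	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';

	/* If the source has characters left over, it was truncated. */
	return src[i] == '\0' ? (long)i : -1;
}
```

With a 4-byte buffer, copying "abc" fits exactly, while copying "abcd" stores "abc" plus the terminator and signals truncation, so the caller never reads past the end of an unterminated buffer.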
2020-11-13 | Merge tag 'mac80211-next-for-net-next-2020-11-13' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next | Jakub Kicinski | 38 | -347/+198
Johannes Berg says:
====================
Some updates:
 * injection/radiotap updates for new test capabilities
 * remove WDS support - even years ago when we turned it off by default it was already basically unusable
 * support for HE (802.11ax) rates for beacons
 * support for some vendor-specific HE rates
 * many other small features/cleanups

* tag 'mac80211-next-for-net-next-2020-11-13' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next: (21 commits)
  nl80211: fix kernel-doc warning in the new SAE attribute
  cfg80211: remove WDS code
  mac80211: remove WDS-related code
  rt2x00: remove WDS code
  b43legacy: remove WDS code
  b43: remove WDS code
  carl9170: remove WDS code
  ath9k: remove WDS code
  wireless: remove CONFIG_WIRELESS_WDS
  mac80211: assure that certain drivers adhere to DONT_REORDER flag
  mac80211: don't overwrite QoS TID of injected frames
  mac80211: adhere to Tx control flag that prevents frame reordering
  mac80211: add radiotap flag to assure frames are not reordered
  mac80211: save HE oper info in BSS config for mesh
  cfg80211: add support to configure HE MCS for beacon rate
  nl80211: fix beacon tx rate mask validation
  nl80211/cfg80211: fix potential infinite loop
  cfg80211: Add support to calculate and report 4096-QAM HE rates
  cfg80211: Add support to configure SAE PWE value to drivers
  ieee80211: Add definition for WFA DPP
  ...
====================
Link: https://lore.kernel.org/r/20201113101148.25268-1-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 | bpf: Expose bpf_d_path helper to sleepable LSM hooks | KP Singh | 1 | -1/+7
Sleepable hooks are never called from an NMI/interrupt context, so it is safe to use the bpf_d_path helper in LSM programs attaching to these hooks. The helper is not restricted to sleepable programs and merely uses the list of sleepable hooks as the initial subset of LSM hooks where it can be used. Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20201113005930.541956-3-kpsingh@chromium.org
2020-11-13 | bpf: Augment the set of sleepable LSM hooks | KP Singh | 3 | -15/+89
Update the set of sleepable hooks with the ones that do not trigger a warning with might_fault() when exercised with the correct kernel config options enabled, i.e. DEBUG_ATOMIC_SLEEP=y, LOCKDEP=y and PROVE_LOCKING=y. This means that a sleepable LSM eBPF program can be attached to these LSM hooks. A new helper method bpf_lsm_is_sleepable_hook is added and the set is maintained locally in bpf_lsm.c. Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20201113005930.541956-2-kpsingh@chromium.org
2020-11-12 | Merge branch 'bpf: Enable bpf_sk_storage for FENTRY/FEXIT/RAW_TP' | Alexei Starovoitov | 6 | -32/+369
Martin KaFai says: ==================== This set is to allow the FENTRY/FEXIT/RAW_TP tracing program to use bpf_sk_storage. The first two patches are a cleanup. The last patch is tests. Patch 3 has the required kernel changes to enable bpf_sk_storage for FENTRY/FEXIT/RAW_TP. Please see individual patch for details. v2: - Rename some of the function prefix from sk_storage to bpf_sk_storage - Use prefix check instead of substr check ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-11-12 | bpf: selftest: Use bpf_sk_storage in FENTRY/FEXIT/RAW_TP | Martin KaFai Lau | 3 | -0/+259
This patch tests storing the task's related info into the bpf_sk_storage by fentry/fexit tracing at listen, accept, and connect. It also tests the raw_tp at inet_sock_set_state. A negative test is done by tracing the bpf_sk_storage_free() and using bpf_sk_storage_get() at the same time. It ensures this bpf program cannot load. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201112211320.2587537-1-kafai@fb.com
2020-11-12 | bpf: Allow using bpf_sk_storage in FENTRY/FEXIT/RAW_TP | Martin KaFai Lau | 3 | -0/+81
This patch enables the FENTRY/FEXIT/RAW_TP tracing program to use the bpf_sk_storage_(get|delete) helper, so those tracing programs can access the sk's bpf_local_storage and the later selftest will show some examples. The bpf_sk_storage is currently used in bpf-tcp-cc, tc, cg sockops...etc which is running either in softirq or task context. This patch adds bpf_sk_storage_get_tracing_proto and bpf_sk_storage_delete_tracing_proto. They will check in runtime that the helpers can only be called when serving softirq or running in a task context. That should enable most common tracing use cases on sk. During the load time, the new tracing_allowed() function will ensure the tracing prog using the bpf_sk_storage_(get|delete) helper is not tracing any bpf_sk_storage*() function itself. The sk is passed as "void *" when calling into bpf_local_storage. This patch only allows tracing a kernel function. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201112211313.2587383-1-kafai@fb.com
2020-11-12 | bpf: Rename some functions in bpf_sk_storage | Martin KaFai Lau | 1 | -19/+19
Rename some of the functions currently prefixed with sk_storage to bpf_sk_storage. That will make the next patch have fewer prefix checks and also bring bpf_sk_storage.c to a more consistent function naming. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: KP Singh <kpsingh@google.com> Link: https://lore.kernel.org/bpf/20201112211307.2587021-1-kafai@fb.com
2020-11-12 | bpf: Folding omem_charge() into sk_storage_charge() | Martin KaFai Lau | 1 | -13/+10
sk_storage_charge() is the only user of omem_charge(). This patch simplifies it by folding omem_charge() into sk_storage_charge(). Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: KP Singh <kpsingh@google.com> Link: https://lore.kernel.org/bpf/20201112211301.2586255-1-kafai@fb.com
2020-11-12 | Merge https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | Jakub Kicinski | 322 | -2053/+5566
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 | Merge branch 'bpf-ptrs-beyond-pkt-end' | Daniel Borkmann | 5 | -23/+245
Alexei Starovoitov says:
====================
v1->v2:
- removed set-but-unused variable.
- added Jiri's Tested-by.

In some cases LLVM uses the knowledge that branch is taken to optimize the code which causes the verifier to reject valid programs. Teach the verifier to recognize that
  r1 = skb->data;
  r1 += 10;
  r2 = skb->data_end;
  if (r1 > r2) {
      here r1 points beyond packet_end and
      subsequent
      if (r1 > r2) // always evaluates to "true".
  }
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2020-11-13 | selftests/bpf: Add asm tests for pkt vs pkt_end comparison. | Alexei Starovoitov | 1 | -0/+42
Add a few assembly tests for packet comparison. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Jiri Olsa <jolsa@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20201111031213.25109-4-alexei.starovoitov@gmail.com
2020-11-13 | selftests/bpf: Add skb_pkt_end test | Alexei Starovoitov | 2 | -0/+95
Add a test that currently makes LLVM generate assembly code:

$ llvm-objdump -S skb_pkt_end.o
0000000000000000 <main_prog>:
; if (skb_shorter(skb, ETH_IPV4_TCP_SIZE))
       0: 61 12 50 00 00 00 00 00 r2 = *(u32 *)(r1 + 80)
       1: 61 14 4c 00 00 00 00 00 r4 = *(u32 *)(r1 + 76)
       2: bf 43 00 00 00 00 00 00 r3 = r4
       3: 07 03 00 00 36 00 00 00 r3 += 54
       4: b7 01 00 00 00 00 00 00 r1 = 0
       5: 2d 23 02 00 00 00 00 00 if r3 > r2 goto +2 <LBB0_2>
       6: 07 04 00 00 0e 00 00 00 r4 += 14
; if (skb_shorter(skb, ETH_IPV4_TCP_SIZE))
       7: bf 41 00 00 00 00 00 00 r1 = r4
0000000000000040 <LBB0_2>:
       8: b4 00 00 00 ff ff ff ff w0 = -1
; if (!(ip = get_iphdr(skb)))
       9: 2d 23 05 00 00 00 00 00 if r3 > r2 goto +5 <LBB0_6>
; proto = ip->protocol;
      10: 71 12 09 00 00 00 00 00 r2 = *(u8 *)(r1 + 9)
; if (proto != IPPROTO_TCP)
      11: 56 02 03 00 06 00 00 00 if w2 != 6 goto +3 <LBB0_6>
; if (tcp->dest != 0)
      12: 69 12 16 00 00 00 00 00 r2 = *(u16 *)(r1 + 22)
      13: 56 02 01 00 00 00 00 00 if w2 != 0 goto +1 <LBB0_6>
; return tcp->urg_ptr;
      14: 69 10 26 00 00 00 00 00 r0 = *(u16 *)(r1 + 38)
0000000000000078 <LBB0_6>:
; }
      15: 95 00 00 00 00 00 00 00 exit

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20201111031213.25109-3-alexei.starovoitov@gmail.com
2020-11-13 | bpf: Support for pointers beyond pkt_end. | Alexei Starovoitov | 2 | -23/+108
This patch adds the verifier support to recognize inlined branch conditions. LLVM knows that the branch evaluates to the same value, but the verifier couldn't track it, causing valid programs to be rejected. The potential LLVM workaround: https://reviews.llvm.org/D87428 can have undesired side effects, since LLVM doesn't know that skb->data/data_end are being compared. LLVM would have to introduce an extra boolean variable and use an inline_asm trick to force assembly that is easier for the verifier. Instead teach the verifier to recognize that
  r1 = skb->data;
  r1 += 10;
  r2 = skb->data_end;
  if (r1 > r2) {
      here r1 points beyond packet_end and
      subsequent
      if (r1 > r2) // always evaluates to "true".
  }
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20201111031213.25109-2-alexei.starovoitov@gmail.com
2020-11-12 | selftests: set conf.all.rp_filter=0 in bareudp.sh | Guillaume Nault | 1 | -0/+2
When working on the rp_filter problem, I didn't realise that disabling it on the network devices didn't cover all cases: rp_filter could also be enabled globally in the namespace, in which case it would drop packets, even if the net device has rp_filter=0. Fixes: 1ccd58331f6f ("selftests: disable rp_filter when testing bareudp") Fixes: bbbc7aa45eef ("selftests: add test script for bareudp tunnels") Signed-off-by: Guillaume Nault <gnault@redhat.com> Link: https://lore.kernel.org/r/f2d459346471f163b239aa9d63ce3e2ba9c62895.1605107012.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 | Merge branch 'mlxsw-spectrum-prepare-for-xm-implementation-prefix-insertion-and-removal' | Jakub Kicinski | 5 | -234/+635
Ido Schimmel says:
====================
mlxsw: spectrum: Prepare for XM implementation - prefix insertion and removal

Jiri says:

This is a preparation patchset for follow-up support of boards with extended mezzanine (XM), which is going to allow extended (scale-wise) router offload.

XM requires a separate PRM register named XMDR to be used instead of RALUE to insert/update/remove FIB entries. Therefore, this patchset extends the previously introduced low-level ops to be able to have XM-specific FIB entry config implementation.

Currently the existing original RALUE implementation is moved to "basic" low-level ops.

Unlike legacy router, insertion/update/removal of FIB entries into XM could be done in bulks up to 4 items in a single PRM register write. That is why this patchset implements "an op context", that allows the future XM ops implementation to squash multiple FIB events to single register write. For that, the way in which the FIB events are processed by the work queue has to be changed.

The conversion from 1:1 FIB event - work callback call to event queue is implemented in patch #3.

Patch #4 introduces "an op context" that will allow in future to squash multiple FIB events into one XMDR register write. Patch #12 converts it from stack to be allocated per instance.

Existing RALUE manipulations are pushed to ops in patch #10.

Patch #13 is introducing a possibility for low-level implementation to have per FIB entry private memory.

The rest of the patches are either cosmetics or smaller preparations.
====================
Link: https://lore.kernel.org/r/20201110094900.1920158-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 | mlxsw: spectrum_router: Introduce FIB entry update op | Jiri Pirko | 2 | -5/+12
Follow-up patchset introducing XMDR implementation is going to need to distinguish write and update ops. Therefore introduce "update op" and call "write op" only when new FIB entry is inserted. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 | mlxsw: spectrum_router: Track FIB entry committed state and skip uncommitted on delete | Jiri Pirko | 2 | -0/+12
In case bulking is used, the entry that was previously added may not be yet committed to the HW as it waits in the queue for bulk send. For such entries, skip the deletion. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 | mlxsw: spectrum_router: Introduce fib_entry priv for low-level ops | Jiri Pirko | 4 | -44/+176
Prepare for the low-level ops that need to store some data alongside the fib_entry and introduce a per-fib_entry priv for ll ops. The priv is reference counted as in the follow-up patch it is going to be saved in pack() function and used later on in commit() even in case the related fib_entry gets freed in the middle. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
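The reason for reference counting the priv, namely that it may still be needed by a later commit step after the owning entry is gone, can be shown with a minimal userspace sketch. All names (`struct entry_priv`, `priv_create`, `priv_hold`, `priv_put`, `live_privs`) are hypothetical, and the plain counter is not thread-safe; kernel code would rely on its own refcount primitives instead.

```c
#include <assert.h>
#include <stdlib.h>

/* Demo bookkeeping only: number of privs that have not been freed. */
static int live_privs;

/* Hypothetical per-entry private data with a plain reference count. */
struct entry_priv {
	unsigned int refcnt;
};

/* Allocate a priv holding one reference, owned by the entry. */
static struct entry_priv *priv_create(void)
{
	struct entry_priv *p = calloc(1, sizeof(*p));

	if (!p)
		return NULL;
	p->refcnt = 1;
	live_privs++;
	return p;
}

/* Take an extra reference, e.g. when a pending operation stores it. */
static void priv_hold(struct entry_priv *p)
{
	assert(p->refcnt > 0);
	p->refcnt++;
}

/* Drop one reference; the priv is freed when the last one goes away. */
static void priv_put(struct entry_priv *p)
{
	assert(p->refcnt > 0);
	if (--p->refcnt == 0) {
		live_privs--;
		free(p);
	}
}
```

In this model the packing step would call `priv_hold()`, the entry teardown would call `priv_put()`, and the commit step would call `priv_put()` once it has finished using the data, so the memory outlives the entry when needed.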
2020-11-12 | mlxsw: spectrum_router: Have FIB entry op context allocated for the instance | Jiri Pirko | 2 | -30/+114
Get the max size needed for FIB entry op context and allocate it once for the instance. Use it repeatedly from the scheduled work. This allows the context to be extended to hold more data than would be wise to keep on the stack. Make sure to signal that the context needs to be initialized in case families of subsequent FIB entries differ. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 | mlxsw: spectrum_router: Prepare work context for possible bulking | Jiri Pirko | 2 | -2/+14
For the XMDR register it is possible to carry multiple FIB entry operations in a single write. Although the FW does not restrict mixing the types of operations, keep the code simple and indicate that bulking is OK only if the bulk contains FIB operations of the same family and event. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
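The bulking rule stated here (same family and same event), combined with the limit of 4 items per register write mentioned in the series cover letter, can be sketched as a function that counts how many writes a queue of events would need. The types and names (`struct fib_event`, `bulk_ok`, `count_writes`, `MAX_BULK`) are hypothetical and only model the rule; they are not the driver's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BULK 4 /* items per write, per the cover letter */

enum ev_family { FAM_IPV4, FAM_IPV6 };
enum ev_type { EV_ADD, EV_DEL };

/* Hypothetical queued FIB event. */
struct fib_event {
	enum ev_family family;
	enum ev_type type;
};

/* Two adjacent events may share a write only if both fields match. */
static bool bulk_ok(const struct fib_event *a, const struct fib_event *b)
{
	return a->family == b->family && a->type == b->type;
}

/* Count the register writes needed to process n queued events. */
static size_t count_writes(const struct fib_event *ev, size_t n)
{
	size_t writes = 0, in_bulk = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* Start a new write when the bulk is empty, full, or
		 * the event differs from the previous one. */
		if (in_bulk == 0 || in_bulk == MAX_BULK ||
		    !bulk_ok(&ev[i - 1], &ev[i])) {
			writes++;
			in_bulk = 0;
		}
		in_bulk++;
		assert(in_bulk <= MAX_BULK);
	}
	return writes;
}
```

For example, five IPv4 additions followed by one IPv6 addition and two IPv4 deletions need four writes under this rule: 4 + 1 additions, then the IPv6 addition, then the two deletions together.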
2020-11-12 | mlxsw: spectrum: Push RALUE packing and writing into low-level router ops | Jiri Pirko | 4 | -55/+107
With follow-up introduction of XM implementation, XMDR register is going to be optionally used instead of RALUE register. Push the RALUE packing helpers and write call into low-level router ops. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 | mlxsw: spectrum_router: Use RALUE pack helper from abort function | Jiri Pirko | 1 | -2/+2
Unify the RALUE register payload packing and use the __mlxsw_sp_fib_entry_ralue_pack() helper from __mlxsw_sp_router_set_abort_trap(). Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 | mlxsw: reg: Allow to pass NULL pointer to mlxsw_reg_ralue_pack4/6() | Jiri Pirko | 1 | -2/+4
In preparation for the change that is going to be done in the next patch, allow passing a NULL pointer to the mlxsw_reg_ralue_pack4() and mlxsw_reg_ralue_pack6() helpers. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>