From 0f4a9b7d4ecbac191052cb80b84a46471fd30d80 Mon Sep 17 00:00:00 2001 From: Magnus Karlsson Date: Thu, 21 Feb 2019 10:21:28 +0100 Subject: xsk: add FAQ to facilitate for first time users Added an FAQ section in Documentation/networking/af_xdp.rst to help first time users with common problems. As problems are getting identified, entries will be added to the FAQ. Signed-off-by: Magnus Karlsson Signed-off-by: Daniel Borkmann --- Documentation/networking/af_xdp.rst | 36 +++++++++++++++++++++++++++++++++++- 1 file changed, 35 insertions(+), 1 deletion(-) (limited to 'Documentation/networking') diff --git a/Documentation/networking/af_xdp.rst b/Documentation/networking/af_xdp.rst index 4ae4f9d8f8fe..e14d7d40fc75 100644 --- a/Documentation/networking/af_xdp.rst +++ b/Documentation/networking/af_xdp.rst @@ -295,6 +295,41 @@ using:: For XDP_SKB mode, use the switch "-S" instead of "-N" and all options can be displayed with "-h", as usual. +FAQ +======= + +Q: I am not seeing any traffic on the socket. What am I doing wrong? + +A: When a netdev of a physical NIC is initialized, Linux usually + allocates one Rx and Tx queue pair per core. So on a 8 core system, + queue ids 0 to 7 will be allocated, one per core. In the AF_XDP + bind call or the xsk_socket__create libbpf function call, you + specify a specific queue id to bind to and it is only the traffic + towards that queue you are going to get on you socket. So in the + example above, if you bind to queue 0, you are NOT going to get any + traffic that is distributed to queues 1 through 7. If you are + lucky, you will see the traffic, but usually it will end up on one + of the queues you have not bound to. + + There are a number of ways to solve the problem of getting the + traffic you want to the queue id you bound to. If you want to see + all the traffic, you can force the netdev to only have 1 queue, queue + id 0, and then bind to queue 0. You can use ethtool to do this:: + + sudo ethtool -L combined 1 + + If you want to only see part of the traffic, you can program the + NIC through ethtool to filter out your traffic to a single queue id + that you can bind your XDP socket to. Here is one example in which + UDP traffic to and from port 4242 are sent to queue 2:: + + sudo ethtool -N rx-flow-hash udp4 fn + sudo ethtool -N flow-type udp4 src-port 4242 dst-port \ + 4242 action 2 + + A number of other ways are possible all up to the capabilitites of + the NIC you have. + Credits ======= @@ -309,4 +344,3 @@ Credits - Michael S. Tsirkin - Qi Z Zhang - Willem de Bruijn - -- cgit v1.2.3 From 46604676c8c6c4c07649767d32ae66f4429ccd6f Mon Sep 17 00:00:00 2001 From: Andrii Nakryiko Date: Thu, 28 Feb 2019 17:12:21 -0800 Subject: docs/bpf: minor casing/punctuation fixes Fix few casing and punctuation glitches. Signed-off-by: Andrii Nakryiko Signed-off-by: Daniel Borkmann --- Documentation/bpf/bpf_design_QA.rst | 24 ++++++++++++------------ Documentation/networking/filter.txt | 2 +- 2 files changed, 13 insertions(+), 13 deletions(-) (limited to 'Documentation/networking') diff --git a/Documentation/bpf/bpf_design_QA.rst b/Documentation/bpf/bpf_design_QA.rst index 7cc9e368c1e9..10453c627135 100644 --- a/Documentation/bpf/bpf_design_QA.rst +++ b/Documentation/bpf/bpf_design_QA.rst @@ -36,27 +36,27 @@ consideration important quirks of other architectures) and defines calling convention that is compatible with C calling convention of the linux kernel on those architectures. -Q: can multiple return values be supported in the future? +Q: Can multiple return values be supported in the future? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ A: NO. BPF allows only register R0 to be used as return value. -Q: can more than 5 function arguments be supported in the future? +Q: Can more than 5 function arguments be supported in the future? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ A: NO. BPF calling convention only allows registers R1-R5 to be used as arguments. BPF is not a standalone instruction set. (unlike x64 ISA that allows msft, cdecl and other conventions) -Q: can BPF programs access instruction pointer or return address? +Q: Can BPF programs access instruction pointer or return address? ----------------------------------------------------------------- A: NO. -Q: can BPF programs access stack pointer ? +Q: Can BPF programs access stack pointer ? ------------------------------------------ A: NO. Only frame pointer (register R10) is accessible. From compiler point of view it's necessary to have stack pointer. -For example LLVM defines register R11 as stack pointer in its +For example, LLVM defines register R11 as stack pointer in its BPF backend, but it makes sure that generated code never uses it. Q: Does C-calling convention diminishes possible use cases? @@ -66,8 +66,8 @@ A: YES. BPF design forces addition of major functionality in the form of kernel helper functions and kernel objects like BPF maps with seamless interoperability between them. It lets kernel call into -BPF programs and programs call kernel helpers with zero overhead. -As all of them were native C code. That is particularly the case +BPF programs and programs call kernel helpers with zero overhead, +as all of them were native C code. That is particularly the case for JITed BPF programs that are indistinguishable from native kernel C code. @@ -75,9 +75,9 @@ Q: Does it mean that 'innovative' extensions to BPF code are disallowed? ------------------------------------------------------------------------ A: Soft yes. -At least for now until BPF core has support for +At least for now, until BPF core has support for bpf-to-bpf calls, indirect calls, loops, global variables, -jump tables, read only sections and all other normal constructs +jump tables, read-only sections, and all other normal constructs that C code can produce. Q: Can loops be supported in a safe way? @@ -109,16 +109,16 @@ For example why BPF_JNE and other compare and jumps are not cpu-like? A: This was necessary to avoid introducing flags into ISA which are impossible to make generic and efficient across CPU architectures. -Q: why BPF_DIV instruction doesn't map to x64 div? +Q: Why BPF_DIV instruction doesn't map to x64 div? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ A: Because if we picked one-to-one relationship to x64 it would have made it more complicated to support on arm64 and other archs. Also it needs div-by-zero runtime check. -Q: why there is no BPF_SDIV for signed divide operation? +Q: Why there is no BPF_SDIV for signed divide operation? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ A: Because it would be rarely used. llvm errors in such case and -prints a suggestion to use unsigned divide instead +prints a suggestion to use unsigned divide instead. Q: Why BPF has implicit prologue and epilogue? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/Documentation/networking/filter.txt b/Documentation/networking/filter.txt index b5e060edfc38..319e5e041f38 100644 --- a/Documentation/networking/filter.txt +++ b/Documentation/networking/filter.txt @@ -829,7 +829,7 @@ tracing filters may do to maintain counters of events, for example. Register R9 is not used by socket filters either, but more complex filters may be running out of registers and would have to resort to spill/fill to stack. -Internal BPF can used as generic assembler for last step performance +Internal BPF can be used as a generic assembler for last step performance optimizations, socket filters and seccomp are using it as assembler. Tracing filters may use it as assembler to generate code from kernel. In kernel usage may not be bounded by security considerations, since generated internal BPF code -- cgit v1.2.3