path: root/samples
Age  Commit message  Author  Files  Lines
2016-12-13Merge tag 'vfio-v4.10-rc1' of git://github.com/awilliam/linux-vfioLinus Torvalds2-0/+1516
Pull VFIO updates from Alex Williamson: - VFIO updates for v4.10 primarily include a new Mediated Device interface, which essentially allows software defined devices to be exposed to users through VFIO. The host vendor driver providing this virtual device polices, or mediates user access to the device. These devices often incorporate portions of real devices, for instance the primary initial users of this interface expose vGPUs which allow the user to map mediated devices, or mdevs, to a portion of a physical GPU. QEMU composes these mdevs into PCI representations using the existing VFIO user API. This enables both Intel KVM-GT support, which is also expected to arrive into Linux mainline during the v4.10 merge window, as well as NVIDIA vGPU, and also Channel I/O devices (aka CCW devices) for s390 virtualization support. (Kirti Wankhede, Neo Jia) - Drop unnecessary uses of pcibios_err_to_errno() (Cao Jin) - Fixes to VFIO capability chain handling (Eric Auger) - Error handling fixes for fallout from mdev (Christophe JAILLET) - Notifiers to expose struct kvm to mdev vendor drivers (Jike Song) - type1 IOMMU model search fixes (Kirti Wankhede, Neo Jia) * tag 'vfio-v4.10-rc1' of git://github.com/awilliam/linux-vfio: (30 commits) vfio iommu type1: Fix size argument to vfio_find_dma() in pin_pages/unpin_pages vfio iommu type1: Fix size argument to vfio_find_dma() during DMA UNMAP. vfio iommu type1: WARN_ON if notifier block is not unregistered kvm: set/clear kvm to/from vfio_group when group add/delete vfio: support notifier chain in vfio_group vfio: vfio_register_notifier: classify iommu notifier vfio: Fix handling of error returned by 'vfio_group_get_from_dev()' vfio: fix vfio_info_cap_add/shift vfio/pci: Drop unnecessary pcibios_err_to_errno() MAINTAINERS: Add entry VFIO based Mediated device drivers docs: Sample driver to demonstrate how to use Mediated device framework. 
docs: Sysfs ABI for mediated device framework docs: Add Documentation for Mediated devices vfio: Define device_api strings vfio_platform: Updated to use vfio_set_irqs_validate_and_prepare() vfio_pci: Updated to use vfio_set_irqs_validate_and_prepare() vfio: Introduce vfio_set_irqs_validate_and_prepare() vfio_pci: Update vfio_pci to use vfio_info_add_capability() vfio: Introduce common function to add capabilities vfio iommu: Add blocking notifier to notify DMA_UNMAP ...
2016-12-08bpf: xdp: Add XDP example for head adjustmentMartin KaFai Lau8-93/+630
The XDP prog checks whether the incoming packet matches any VIP:PORT combination in the BPF hashmap. If it does, the prog encapsulates the packet with an IPv4/v6 header as instructed by the value of the BPF hashmap and then XDP_TXes it out. The VIP:PORT -> IP-Encap-Info mapping can be specified by the cmd args of the user prog. Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03samples, bpf: Add automated test for cgroup filter attachmentsSargun Dhillon2-0/+134
This patch adds the sample program test_cgrp2_attach2. This program is similar to test_cgrp2_attach, but it performs automated testing of the cgroupv2 BPF attached filters. It runs the following checks:
* Simple filter attachment
* Application of filters to child cgroups
* Overriding filters on child cgroups
* Checking that this still works when the parent filter is removed
The filters that are used here are simply allow all / deny all filters, so it isn't checking the actual functionality of the filters, but rather the behaviour around detachment / attachment. If net_cls is enabled, this test will fail.
Signed-off-by: Sargun Dhillon <sargun@sargun.me> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03samples, bpf: Refactor test_current_task_under_cgroup - separate out helpersSargun Dhillon4-85/+218
This patch modifies test_current_task_under_cgroup_user. The test has several helpers around creating a temporary environment for cgroup testing, and moving the current task around cgroups. This set of helpers can then be used in other tests. Signed-off-by: Sargun Dhillon <sargun@sargun.me> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03samples/bpf: silence compiler warningsAlexei Starovoitov1-0/+2
silence some of the clang compiler warnings like: include/linux/fs.h:2693:9: warning: comparison of unsigned enum expression < 0 is always false arch/x86/include/asm/processor.h:491:30: warning: taking address of packed member 'sp0' of class or structure 'x86_hw_tss' may result in an unaligned pointer value include/linux/cgroup-defs.h:326:16: warning: field 'cgrp' with variable sized type 'struct cgroup' not at the end of a struct or class is a GNU extension since they add too much noise to samples/bpf/ build. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller3-3/+3
A couple of conflicts resolved here: 1) In the MACB driver, a bug fix to properly initialize the RX tail pointer overlapped with some changes to support variable sized rings. 2) In XGBE we had a "CONFIG_PM" --> "CONFIG_PM_SLEEP" fix overlapping with a reorganization of the driver to support ACPI, OF, as well as PCI variants of the chip. 3) In 'net' we had several probe error path bug fixes to the stmmac driver, meanwhile a lot of this code was cleaned up and reorganized in 'net-next'. 4) The cls_flower classifier obtained a helper function in 'net-next' called __fl_delete() and this overlapped with Daniel Borkmann's bug fix to use RCU for object destruction in 'net'. It also overlapped with Jiri's change to guard the rhashtable_remove_fast() call with a check against tc_skip_sw(). 5) In mlx4, a revert bug fix in 'net' overlapped with some unrelated changes in 'net-next'. 6) In geneve, a stale header pointer after pskb_expand_head() bug fix in 'net' overlapped with a large reorganization of the same code in 'net-next'. Since the 'net-next' code no longer had the bug in question, there was nothing to do other than to simply take the 'net-next' hunks. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-02samples/bpf: add userspace example for prohibiting socketsDavid Ahern4-0/+195
Add examples preventing a process in a cgroup from opening a socket based on family, protocol and type. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-02samples/bpf: Update bpf loader for cgroup section namesDavid Ahern2-3/+12
Add support for section names starting with cgroup/skb and cgroup/sock. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-02samples: bpf: add userspace example for modifying sk_bound_dev_ifDavid Ahern3-0/+132
Add a simple program to demonstrate the ability to attach a bpf program to a cgroup that sets sk_bound_dev_if for AF_INET{6} sockets when they are created. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-02bpf: Add tests and samples for LWT-BPFThomas Graf7-0/+855
Adds a series of tests to verify the functionality of attaching BPF programs at LWT hooks. Also adds a sample which collects a histogram of packet sizes which pass through an LWT hook. $ ./lwt_len_hist.sh Starting netserver with host 'IN(6)ADDR_ANY' port '12865' and family AF_UNSPEC MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.253.2 () port 0 AF_INET : demo Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 10.00 39857.69 1 -> 1 : 0 | | 2 -> 3 : 0 | | 4 -> 7 : 0 | | 8 -> 15 : 0 | | 16 -> 31 : 0 | | 32 -> 63 : 22 | | 64 -> 127 : 98 | | 128 -> 255 : 213 | | 256 -> 511 : 1444251 |******** | 512 -> 1023 : 660610 |*** | 1024 -> 2047 : 535241 |** | 2048 -> 4095 : 19 | | 4096 -> 8191 : 180 | | 8192 -> 16383 : 5578023 |************************************* | 16384 -> 32767 : 632099 |*** | 32768 -> 65535 : 6575 | | Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
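The histogram rows above use power-of-two buckets (1 -> 1, 2 -> 3, 4 -> 7, and so on). As a rough userspace illustration of that bucketing, and not the code of the sample itself, the helpers below map a packet length to its slot and recover the slot's range. The names len_to_slot and slot_range are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: return the histogram slot for a packet length.
 * Slot k covers the range [2^k, 2^(k+1) - 1], so 1 -> slot 0,
 * 2..3 -> slot 1, 4..7 -> slot 2. A length of 0 also lands in slot 0. */
unsigned int len_to_slot(uint32_t len)
{
	unsigned int slot = 0;

	while (len > 1) {
		len >>= 1;
		slot++;
	}
	return slot;
}

/* Hypothetical helper: compute the "low -> high" bounds printed for a slot. */
void slot_range(unsigned int slot, uint64_t *low, uint64_t *high)
{
	*low = 1ULL << slot;
	*high = (1ULL << (slot + 1)) - 1;
}
```

With this layout a 16384-byte send falls into the "16384 -> 32767" row, and the largest row shown, "32768 -> 65535", corresponds to slot 15.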
2016-11-30samples/bpf: fix include pathAlexei Starovoitov1-1/+1
Fix the following build error: HOSTCC samples/bpf/test_lru_dist.o ../samples/bpf/test_lru_dist.c:25:22: fatal error: bpf_util.h: No such file or directory This is due to objtree != srctree. Use srctree, since that's where bpf_util.h is located. Fixes: e00c7b216f34 ("bpf: fix multiple issues in selftest suite and samples") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30samples: bpf: Refactor test_cgrp2_attach -- use getopt, and add modeSargun Dhillon1-30/+50
This patch modifies test_cgrp2_attach to use getopt so we can use standard command line parsing. It also adds an option to run the program in detach only mode. This does not attach a new filter at the cgroup, but only runs the detach command. Lastly, it changes the attach code to not detach and then attach. It relies on the 'hotswap' behaviour of CGroup BPF programs to be able to change in-place. If detach-then-attach behaviour needs to be tested, the example can be run in detach only mode prior to attachment. Signed-off-by: Sargun Dhillon <sargun@sargun.me> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28bpf/samples: Fix PT_REGS_IP on s390x and use itMichael Holzheu3-3/+3
The files "sampleip_kern.c" and "trace_event_kern.c" directly access "ctx->regs.ip" which is not available on s390x. Fix this and use the PT_REGS_IP() macro instead. Also fix the macro for s390x and use "psw.addr" from "pt_regs". Reported-by: Zvonko Kosic <zvonko.kosic@de.ibm.com> Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-27bpf: fix multiple issues in selftest suite and samplesDaniel Borkmann5-5/+15
1) The test_lru_map and test_lru_dist fail to build on my machine since the sys/resource.h header is not included.
2) test_verifier fails in one test case where we try to call an invalid function, since the verifier log output changed wrt printing function names.
3) Current selftest suite code relies on sysconf(_SC_NPROCESSORS_CONF) for retrieving the number of possible CPUs. This is broken at least in our scenario and really just doesn't work. glibc tries a number of things for retrieving _SC_NPROCESSORS_CONF. First it tries equivalent of /sys/devices/system/cpu/cpu[0-9]* | wc -l, if that fails, depending on the config, it either tries to count CPUs in /proc/cpuinfo, or returns the _SC_NPROCESSORS_ONLN value instead. If /proc/cpuinfo has some issue, it returns just 1 worst case. This oddity is nothing new [1], but semantics/behaviour seems to be settled. _SC_NPROCESSORS_ONLN will parse /sys/devices/system/cpu/online, if that fails it looks into /proc/stat for cpuX entries, and if also that fails for some reason, /proc/cpuinfo is consulted (and returning 1 if unlikely all breaks down). While that might match num_possible_cpus() from the kernel in some cases, it's really not guaranteed with CPU hotplugging, and can result in a buffer overflow since the array in user space could have too few slots, and on percpu map lookup, the kernel will write beyond that memory of the value buffer. William Tu reported such mismatches: [...] The fact that sysconf(_SC_NPROCESSORS_CONF) != num_possible_cpu() happens when CPU hotadd is enabled. For example, in Fusion when setting vcpu.hotadd = "TRUE" or in KVM, setting ./qemu-system-x86_64 -smp 2, maxcpus=4 ... the num_possible_cpu() will be 4 and sysconf() will be 2 [2]. [...] Documentation/cputopology.txt says /sys/devices/system/cpu/possible outputs cpu_possible_mask. That is the same as in num_possible_cpus(), so first step would be to fix the _SC_NPROCESSORS_CONF calls with our own implementation.
Later, we could add support to bpf(2) for passing a mask via CPU_SET(3), for example, to just select a subset of CPUs. BPF samples code needs this fix as well (at least so that people stop copying this). Thus, define bpf_num_possible_cpus() once in selftests and import it from there for the sample code to avoid duplicating it. The remaining sysconf(_SC_NPROCESSORS_CONF) in samples are unrelated. After all three issues are fixed, the test suite runs fine again: # make run_tests | grep self selftests: test_verifier [PASS] selftests: test_maps [PASS] selftests: test_lru_map [PASS] selftests: test_kmod.sh [PASS] [1] https://www.sourceware.org/ml/libc-alpha/2011-06/msg00079.html [2] https://www.mail-archive.com/netdev@vger.kernel.org/msg121183.html Fixes: 3059303f59cf ("samples/bpf: update tracex[23] examples to use per-cpu maps") Fixes: 86af8b4191d2 ("Add sample for adding simple drop program to link") Fixes: df570f577231 ("samples/bpf: unit test for BPF_MAP_TYPE_PERCPU_ARRAY") Fixes: e15596717948 ("samples/bpf: unit test for BPF_MAP_TYPE_PERCPU_HASH") Fixes: ebb676daa1a3 ("bpf: Print function name in addition to function id") Fixes: 5db58faf989f ("bpf: Add tests for the LRU bpf_htab") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: William Tu <u9012063@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
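The idea of deriving the CPU count from /sys/devices/system/cpu/possible instead of sysconf() can be sketched roughly as below. This is an illustration under simplifying assumptions, not the bpf_num_possible_cpus() helper from the patch: the function names are hypothetical, and the parser only understands a single "first-last" range or a single CPU number.

```c
#include <stdio.h>

/* Hypothetical parser for the contents of /sys/devices/system/cpu/possible,
 * e.g. "0-3\n". Assumes one "first-last" range or one CPU number, which is a
 * simplification. Returns last + 1 so that a per-CPU array indexed by CPU id
 * has enough slots, or -1 on malformed input. */
int count_possible_cpus(const char *text)
{
	unsigned int first, last;
	int n = sscanf(text, "%u-%u", &first, &last);

	if (n == 1)
		return (int)first + 1;
	if (n == 2 && last >= first)
		return (int)last + 1;
	return -1;
}

/* Hypothetical wrapper: read the sysfs file and parse it. */
int read_possible_cpus(void)
{
	char buf[128];
	FILE *f = fopen("/sys/devices/system/cpu/possible", "r");
	int ret = -1;

	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		ret = count_possible_cpus(buf);
	fclose(f);
	return ret;
}
```

In the KVM case quoted above (-smp 2, maxcpus=4), the file would be expected to read "0-3", so the parser yields 4, matching num_possible_cpus() rather than the 2 reported by sysconf().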
2016-11-25samples: bpf: add userspace example for attaching eBPF programs to cgroupsDaniel Mack4-0/+173
Add a simple userspace program to demonstrate the new API to attach eBPF programs to cgroups. This is what it does:
* Create arraymap in kernel with 4 byte keys and 8 byte values
* Load eBPF program
  The eBPF program accesses the map passed in to store two pieces of information. The number of invocations of the program, which maps to the number of packets received, is stored to key 0. Key 1 is incremented on each iteration by the number of bytes stored in the skb.
* Detach any eBPF program previously attached to the cgroup
* Attach the new program to the cgroup using BPF_PROG_ATTACH
* Once a second, read map[0] and map[1] to see how many bytes and packets were seen on any socket of tasks in the given cgroup.
The program takes a cgroup path as 1st argument, and either "ingress" or "egress" as 2nd. Optionally, "drop" can be passed as 3rd argument, which will make the generated eBPF program return 0 instead of 1, so the kernel will drop the packet. libbpf gained two new wrappers for the new syscall commands.
Signed-off-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
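The accounting performed by the attached program can be modelled in plain C as below. This is only a userspace model of the behaviour described in the commit message, not eBPF code and not the sample's source; struct counter_map and account_packet are hypothetical names.

```c
#include <stdint.h>

/* Key layout taken from the description: key 0 counts packets,
 * key 1 counts bytes. */
enum { KEY_PACKETS = 0, KEY_BYTES = 1, NR_KEYS = 2 };

/* Hypothetical stand-in for the array map (4 byte keys, 8 byte values). */
struct counter_map {
	uint64_t value[NR_KEYS];
};

/* Model of one program invocation for a packet of skb_len bytes.
 * Verdict convention from the description: 1 lets the packet through,
 * 0 makes the kernel drop it. */
int account_packet(struct counter_map *map, uint32_t skb_len, int drop)
{
	map->value[KEY_PACKETS] += 1;
	map->value[KEY_BYTES] += skb_len;
	return drop ? 0 : 1;
}
```

The userspace side then only needs to read the two values once a second, which is why a two-slot array map is sufficient for this demonstration.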
2016-11-24samples/bpf: fix bpf loaderAlexei Starovoitov1-0/+4
llvm can emit relocations into sections other than program code (like debug info sections). Ignore them during parsing of elf file Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-24samples/bpf: fix sockex2 exampleAlexei Starovoitov2-2/+2
Since llvm commit "Do not expand UNDEF SDNode during insn selection lowering", llvm will generate code that uses uninitialized registers for cases where the C code actually uses uninitialized data. So this sockex2 example is technically broken. Fix it by fully initializing the on-stack variable. Also increase the verifier buffer limit, since verifier output may not fit in 64k for this sockex2 code depending on llvm version. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-17docs: Sample driver to demonstrate how to use Mediated device framework.Kirti Wankhede2-0/+1516
The Sample driver creates mdev device that simulates serial port over PCI card. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-15bpf: Add tests for the LRU bpf_htabMartin KaFai Lau4-0/+611
This patch has some unit tests and a test_lru_dist. The test_lru_dist reads in the numeric keys from a file. The files used here are generated by a modified fio-genzipf tool originating from the fio test suite. The sample data file can be found here: https://github.com/iamkafai/bpf-lru
The zipf.* data files have 100k numeric keys and the keys also range from 1 to 100k. The test_lru_dist outputs the number of unique keys (nr_unique). For example, the output below includes a run in which 61239 of the 100k keys are unique. nr_misses counts the keys that cannot be found in the LRU map, so nr_misses must be >= nr_unique. test_lru_dist also simulates a perfect LRU map as a comparison:
[root@arch-fb-vm1 ~]# ~/devshare/fb-kernel/linux/samples/bpf/test_lru_dist \
 /root/zipf.100k.a1_01.out 4000 1
...
test_parallel_lru_dist (map_type:9 map_flags:0x0):
 task:0 BPF LRU: nr_unique:23093(/100000) nr_misses:31603(/100000)
 task:0 Perfect LRU: nr_unique:23093(/100000 nr_misses:34328(/100000)
....
test_parallel_lru_dist (map_type:9 map_flags:0x2):
 task:0 BPF LRU: nr_unique:23093(/100000) nr_misses:31710(/100000)
 task:0 Perfect LRU: nr_unique:23093(/100000 nr_misses:34328(/100000)
[root@arch-fb-vm1 ~]# ~/devshare/fb-kernel/linux/samples/bpf/test_lru_dist \
 /root/zipf.100k.a0_01.out 40000 1
...
test_parallel_lru_dist (map_type:9 map_flags:0x0):
 task:0 BPF LRU: nr_unique:61239(/100000) nr_misses:67054(/100000)
 task:0 Perfect LRU: nr_unique:61239(/100000 nr_misses:66993(/100000)
...
test_parallel_lru_dist (map_type:9 map_flags:0x2):
 task:0 BPF LRU: nr_unique:61239(/100000) nr_misses:67068(/100000)
 task:0 Perfect LRU: nr_unique:61239(/100000 nr_misses:66993(/100000)
LRU map has also been added to map_perf_test:
/* Global LRU */
[root@kerneltest003.31.prn1 ~]# for i in 1 4 8; do echo -n "$i cpus: "; \
 ./map_perf_test 16 $i | awk '{r += $3}END{print r " updates"}'; done
 1 cpus: 2934082 updates
 4 cpus: 7391434 updates
 8 cpus: 6500576 updates
/* Percpu LRU */
[root@kerneltest003.31.prn1 ~]# for i in 1 4 8; do echo -n "$i cpus: "; \
 ./map_perf_test 32 $i | awk '{r += $3}END{print r " updates"}'; done
 1 cpus: 2896553 updates
 4 cpus: 9766395 updates
 8 cpus: 17460553 updates
Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
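To make the nr_unique and nr_misses counters concrete, here is a small sketch of an ideal ("perfect") LRU simulation over a key sequence. It is not the implementation used by test_lru_dist. The names, the fixed limits LRU_MAX and MAX_KEY, and the linear-search cache are all assumptions chosen to keep the example short.

```c
#include <stddef.h>
#include <string.h>

#define LRU_MAX 64	/* hypothetical upper bound on the cache capacity */
#define MAX_KEY 1024	/* hypothetical upper bound on key values */

struct lru_stats {
	unsigned nr_unique;	/* distinct keys seen in the input */
	unsigned nr_misses;	/* lookups that did not hit the cache */
};

/* Simulate an exact LRU cache with `cap` entries over keys[0..n).
 * cache[0] is the most recently used key; the last used slot holds the
 * eviction victim. Keys above MAX_KEY are skipped. */
struct lru_stats simulate_lru(const unsigned *keys, size_t n, size_t cap)
{
	static unsigned char seen[MAX_KEY + 1];
	unsigned cache[LRU_MAX] = {0};
	size_t used = 0;
	struct lru_stats st = {0, 0};

	memset(seen, 0, sizeof(seen));
	if (cap == 0 || cap > LRU_MAX)
		return st;

	for (size_t i = 0; i < n; i++) {
		unsigned key = keys[i];
		size_t pos;

		if (key > MAX_KEY)
			continue;
		if (!seen[key]) {
			seen[key] = 1;
			st.nr_unique++;
		}

		/* Look the key up in recency order. */
		for (pos = 0; pos < used; pos++)
			if (cache[pos] == key)
				break;

		if (pos == used) {
			/* Miss: grow if there is room, else evict the tail. */
			st.nr_misses++;
			if (used < cap)
				used++;
			pos = used - 1;
		}

		/* Shift the more recent entries down and put key in front. */
		memmove(&cache[1], &cache[0], pos * sizeof(cache[0]));
		cache[0] = key;
	}
	return st;
}
```

Because the first access to every key is necessarily a miss, the simulation always ends with nr_misses >= nr_unique, which is the invariant stated in the commit message.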
2016-11-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller4-0/+486
Several cases of bug fixes in 'net' overlapping other changes in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-12bpf: Add test for bpf_redirect to ipip/ip6tnlMartin KaFai Lau4-0/+486
The test creates two netns, ns1 and ns2. The host (the default netns) has an ipip or ip6tnl dev configured for tunneling traffic to the ns2. ping VIPS from ns1 <----> host <--tunnel--> ns2 (VIPs at loopback) The test is to have ns1 pinging VIPs configured at the loopback interface in ns2. The VIPs are 10.10.1.102 and 2401:face::66 (which are configured at lo@ns2). [Note: 0x66 => 102]. At ns1, the VIPs are routed _via_ the host. At the host, bpf programs are installed at the veth to redirect packets from a veth to the ipip/ip6tnl. The test is configured in a way so that both ingress and egress can be tested. At ns2, the ipip/ip6tnl dev is configured with the local and remote address specified. The return path is routed to the dev ipip/ip6tnl. During egress test, the host also locally tests pinging the VIPs to ensure that bpf_redirect at egress also works for the direct egress (i.e. not forwarding from dev ve1 to ve2). Acked-by: Alexei Starovoitov <ast@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller30-1/+3662
Mostly simple overlapping changes. For example, David Ahern's adjacency list revamp in 'net-next' conflicted with an adjacency list traversal bug fix in 'net'. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds6-0/+6
Pull networking fixes from David Miller: "Lots of fixes, mostly drivers as is usually the case. 1) Don't treat zero DMA address as invalid in vmxnet3, from Alexey Khoroshilov. 2) Fix element timeouts in netfilter's nft_dynset, from Anders K. Pedersen. 3) Don't put aead_req crypto struct on the stack in mac80211, from Ard Biesheuvel. 4) Several uninitialized variable warning fixes from Arnd Bergmann. 5) Fix memory leak in cxgb4, from Colin Ian King. 6) Fix bpf handling of VLAN header push/pop, from Daniel Borkmann. 7) Several VRF semantic fixes from David Ahern. 8) Set skb->protocol properly in ip6_tnl_xmit(), from Eli Cooper. 9) Socket needs to be locked in udp_disconnect(), from Eric Dumazet. 10) Div-by-zero on 32-bit fix in mlx4 driver, from Eugenia Emantayev. 11) Fix stale link state during failover in NCSCI driver, from Gavin Shan. 12) Fix netdev lower adjacency list traversal, from Ido Schimmel. 13) Provide proper handle when emitting notifications of filter deletes, from Jamal Hadi Salim. 14) Memory leaks and big-endian issues in rtl8xxxu, from Jes Sorensen. 15) Fix DESYNC_FACTOR handling in ipv6, from Jiri Bohac. 16) Several routing offload fixes in mlxsw driver, from Jiri Pirko. 17) Fix broadcast sync problem in TIPC, from Jon Paul Maloy. 18) Validate chunk len before using it in SCTP, from Marcelo Ricardo Leitner. 19) Revert a netns locking change that causes regressions, from Paul Moore. 20) Add recursion limit to GRO handling, from Sabrina Dubroca. 21) GFP_KERNEL in irq context fix in ibmvnic, from Thomas Falcon. 22) Avoid accessing stale vxlan/geneve socket in data path, from Pravin Shelar" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (189 commits) geneve: avoid using stale geneve socket. vxlan: avoid using stale vxlan socket. 
qede: Fix out-of-bound fastpath memory access net: phy: dp83848: add dp83822 PHY support enic: fix rq disable tipc: fix broadcast link synchronization problem ibmvnic: Fix missing brackets in init_sub_crq_irqs ibmvnic: Fix releasing of sub-CRQ IRQs in interrupt context Revert "ibmvnic: Fix releasing of sub-CRQ IRQs in interrupt context" arch/powerpc: Update parameters for csum_tcpudp_magic & csum_tcpudp_nofold net/mlx4_en: Save slave ethtool stats command net/mlx4_en: Fix potential deadlock in port statistics flow net/mlx4: Fix firmware command timeout during interrupt test net/mlx4_core: Do not access comm channel if it has not yet been initialized net/mlx4_en: Fix panic during reboot net/mlx4_en: Process all completions in RX rings after port goes up net/mlx4_en: Resolve dividing by zero in 32-bit system net/mlx4_core: Change the default value of enable_qos net/mlx4_core: Avoid setting ports to auto when only one port type is supported net/mlx4_core: Fix the resource-type enum in res tracker to conform to FW spec ...
2016-10-29bpf: fix samples to add fake KBUILD_MODNAMEDaniel Borkmann6-0/+6
Some of the sample files are causing issues when they are loaded with tc and cls_bpf, meaning tc bails out while trying to parse the resulting ELF file as program/map/etc sections are not present, which can be easily spotted with readelf(1). Currently, BPF samples are including some of the kernel headers and mid term we should change them to refrain from this, really. When dynamic debugging is enabled, we bail out due to undeclared KBUILD_MODNAME, which is easily overlooked in the build as clang spills this along with other noisy warnings from various header includes, and llc still generates an ELF file with mentioned characteristics. For just playing around with BPF examples, this can be a bit of a hurdle to take. Just add a fake KBUILD_MODNAME as a band-aid to fix the issue, same is done in xdp*_kern samples already. Fixes: 65d472fb007d ("samples/bpf: add 'pointer to packet' tests") Fixes: 6afb1e28b859 ("samples/bpf: Add tunnel set/get tests.") Fixes: a3f74617340b ("cgroup: bpf: Add an example to do cgroup checking in BPF") Reported-by: Chandrasekar Kannan <ckannan@console.to> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18bpf: add initial suite for selftestsDaniel Borkmann3-3151/+0
Add a start of a test suite for kernel selftests. This moves test_verifier and test_maps over to tools/testing/selftests/bpf/ along with various code improvements and also adds a script for invoking test_bpf module. The test suite can simply be run via selftest framework, f.e.: # cd tools/testing/selftests/bpf/ # make # make run_tests Both test_verifier and test_maps were kind of misplaced in samples/bpf/ directory and we were looking into adding them to selftests for a while now, so it can be picked up by kbuild bot et al and hopefully also get more exposure and thus new test case additions. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18bpf: add various tests around spill/fill of regsDaniel Borkmann1-0/+116
Add several spill/fill tests. Among others, one that performs xadd on the spilled register, and one ldx/stx test where different types are spilled from two branches and read out from a common path. The verifier handles all of them correctly. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-14Merge tag 'linux-kselftest-4.9-rc1-update' of ↵Linus Torvalds24-1/+3656
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest updates from Shuah Khan: "This update consists of: - Fixes and improvements to existing tests - Moving code from Documentation to selftests, samples, and tools: * Moves dnotify_test, prctl, ptp, vDSO, ia64, watchdog, and networking tests from Documentation to selftests. * Moves mic/mpssd, misc-devices/mei, timers, watchdog, auxdisplay, and blackfin examples from Documentation to samples. * Moves accounting, laptops/dslm, and pcmcia/crc32hash tools from Documentation to tools. * Deletes BUILD_DOCSRC and its dependencies" * tag 'linux-kselftest-4.9-rc1-update' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (21 commits) selftests/futex: Check ANSI terminal color support Doc: update 00-INDEX files to reflect the runnable code move samples: move blackfin gptimers-example from Documentation tools: move pcmcia crc32hash tool from Documentation tools: move laptops dslm tool from Documentation tools: move accounting tool from Documentation samples: move auxdisplay example code from Documentation samples: move watchdog example code from Documentation samples: move timers example code from Documentation samples: move misc-devices/mei example code from Documentation samples: move mic/mpssd example code from Documentation selftests: Move networking/timestamping from Documentation selftests: move watchdog tests from Documentation/watchdog selftests: move ia64 tests from Documentation/ia64 selftests: move vDSO tests from Documentation/vDSO selftests: move ptp tests from Documentation/ptp selftests: move prctl tests from Documentation/prctl selftests: move dnotify_test from Documentation/filesystems selftests/timers: Add missing error code assignment before test selftests/zram: replace ZRAM_LZ4_COMPRESS ...
2016-10-10samples: move blackfin gptimers-example from DocumentationShuah Khan4-1/+99
Move blackfin gptimers-example to samples and remove it from Documentation Makefile. Update samples Kconfig and Makefile to build gptimers-example. blackfin is the last CONFIG_BUILD_DOCSRC target in Documentation/Makefile. Hence this patch also includes changes to remove CONFIG_BUILD_DOCSRC from Makefile and lib/Kconfig.debug and updates VIDEO_PCI_SKELETON dependency on BUILD_DOCSRC. Documentation/Makefile is not deleted to avoid breaking make htmldocs and make distclean. Acked-by: Michal Marek <mmarek@suse.com> Acked-by: Jonathan Corbet <corbet@lwn.net> Reviewed-by: Kees Cook <keescook@chromium.org> Reported-by: Valentin Rothberg <valentinrothberg@gmail.com> Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2016-10-06Merge tag 'rpmsg-v4.9' of git://github.com/andersson/remoteprocLinus Torvalds1-9/+23
Pull rpmsg updates from Bjorn Andersson: "The bulk of these patches involve splitting the rpmsg implementation into a framework/API part and a virtio specific backend part. It then adds the Qualcomm Shared Memory Device (SMD) as an additional supported wire format. Also included is a set of code style cleanups that have been lingering for a while" * tag 'rpmsg-v4.9' of git://github.com/andersson/remoteproc: (26 commits) rpmsg: smd: fix dependency on QCOM_SMD=n rpmsg: Introduce Qualcomm SMD backend rpmsg: Allow callback to return errors rpmsg: Move virtio specifics from public header rpmsg: virtio: Hide vrp pointer from the public API rpmsg: Hide rpmsg indirection tables rpmsg: Split rpmsg core and virtio backend rpmsg: Split off generic tail of create_channel() rpmsg: Move helper for finding rpmsg devices to core rpmsg: Move endpoint related interface to rpmsg core rpmsg: Indirection table for rpmsg_endpoint operations rpmsg: Move rpmsg_device API to new file rpmsg: Introduce indirection table for rpmsg_device operations rpmsg: Clean up rpmsg device vs channel naming rpmsg: Make rpmsg_create_ept() take channel_info struct rpmsg: rpmsg_send() operations takes rpmsg_endpoint rpmsg: Name rpmsg devices based on channel id rpmsg: Enable matching devices with drivers based on DT rpmsg: Drop prototypes for non-existing functions samples/rpmsg: add support for multiple instances ...
2016-09-29bpf: allow access into map value arraysJosef Bacik2-4/+247
Suppose you have a map array value that is something like this struct foo { unsigned iter; int array[SOME_CONSTANT]; }; You can easily insert this into an array, but you cannot modify the contents of foo->array[] after the fact. This is because we have no way to verify we won't go off the end of the array at verification time. This patch provides a start for this work. We accomplish this by keeping track of a minimum and maximum value a register could be while we're checking the code. Then at the time we try to do an access into a MAP_VALUE we verify that the maximum offset into that region is a valid access into that memory region. So in practice, code such as this unsigned index = 0; if (foo->iter >= SOME_CONSTANT) foo->iter = index; else index = foo->iter++; foo->array[index] = bar; would be allowed, as we can verify that index will always be between 0 and SOME_CONSTANT-1. If you wish to use signed values you'll have to have an extra check to make sure the index isn't less than 0, or do something like index %= SOME_CONSTANT. Signed-off-by: Josef Bacik <jbacik@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
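Laid out as ordinary C, the struct and the bounded-index pattern from the message look like the sketch below. It is a userspace rendering of the snippet in the text, not BPF program code; the value chosen for SOME_CONSTANT and the wrapper function store_next are assumptions added so the fragment can be compiled and exercised.

```c
#define SOME_CONSTANT 8	/* hypothetical size; the text leaves it open */

struct foo {
	unsigned iter;
	int array[SOME_CONSTANT];
};

/* Store bar in the slot selected by foo->iter and return the index used.
 * Both branches leave index in [0, SOME_CONSTANT - 1], which is the range
 * a min/max tracking verifier needs to prove before allowing the store. */
unsigned store_next(struct foo *foo, int bar)
{
	unsigned index = 0;

	if (foo->iter >= SOME_CONSTANT)
		foo->iter = index;
	else
		index = foo->iter++;
	foo->array[index] = bar;
	return index;
}
```

Note that when iter reaches SOME_CONSTANT the code resets it and writes slot 0, so slot 0 is written on two consecutive calls at each wrap; the point of the example is the provable bound on index, not an even distribution of writes.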
2016-09-27bpf samples: update tracex5 sample to use __seccomp_filterNaveen N. Rao2-9/+10
seccomp_phase1() does not exist anymore. Instead, update sample to use __seccomp_filter(). While at it, set max locked memory to unlimited. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-27bpf samples: fix compiler errors with sockex2 and sockex3Naveen N. Rao3-11/+11
These samples fail to compile because 'struct flow_keys' conflicts with the definition in net/flow_dissector.h. Fix this by renaming the structure used in the samples. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-23samples: move auxdisplay example code from DocumentationShuah Khan3-0/+291
Move the auxdisplay examples to samples and remove them from the Documentation Makefile. Create a new Makefile to build auxdisplay. It can be built from the top-level directory or from the auxdisplay directory: run make -C samples/auxdisplay or cd samples/auxdisplay; make Acked-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2016-09-23samples: move watchdog example code from DocumentationShuah Khan3-0/+33
Move the watchdog examples to samples and remove them from the Documentation Makefile. Create a new Makefile to build watchdog. It can be built from the top-level directory or from the watchdog directory: run make -C samples/watchdog or cd samples/watchdog; make Acked-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2016-09-23samples: move timers example code from DocumentationShuah Khan3-0/+310
Move the timers examples to samples and remove them from the Documentation Makefile. Create a new Makefile to build timers. It can be built from the top-level directory or from the timers directory: run make -C samples/timers or cd samples/timers; make Acked-by: Jonathan Corbet <corbet@lwn.net> Acked-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2016-09-23samples: move misc-devices/mei example code from DocumentationShuah Khan4-0/+491
Move the misc-devices/mei examples to samples/mei and remove them from the Documentation Makefile. Delete misc-devices/Makefile. Create a new Makefile to build samples/mei. It can be built from the top-level directory or from the mei directory: run make -C samples/mei or cd samples/mei; make Acked-by: Jonathan Corbet <corbet@lwn.net> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2016-09-20bpf: add test cases for direct packet accessDaniel Borkmann1-3/+430
Add a couple of test cases for direct write and the negative size issue. Also adjust the direct packet access test4: it asserts that writes are not possible, but since we've just added support for writes, the verdict must be inverted to ACCEPT. Summary: 133 PASSED, 0 FAILED. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-20samples: move mic/mpssd example code from DocumentationShuah Khan7-0/+2432
Move the mic/mpssd examples to samples and remove them from the Documentation Makefile. Create a new Makefile to build mic/mpssd. It can be built from the top-level directory or from the mic/mpssd directory: run make -C samples/mic/mpssd or cd samples/mic/mpssd; make Acked-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2016-09-17samples/bpf: add comprehensive ipip, ipip6, ip6ip6 testAlexei Starovoitov2-0/+310
The test creates 3 namespaces with veth connected via a bridge. The first two namespaces simulate two different hosts with the same IPv4 and IPv6 addresses configured on the tunnel interface, and they communicate with the outside world via standard tunnels. The third namespace creates a collect_md tunnel that is driven by a BPF program which selects a different remote host (either the first or second namespace) based on the tcp dest port number while the tcp dst ip is the same. This scenario is a rough approximation of a load balancer use case. The tests check both traditional tunnel configuration and collect_md mode. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-17samples/bpf: extend test_tunnel_bpf.sh with IPIP testAlexei Starovoitov2-8/+106
Extend the existing tests for vxlan, geneve, gre to include the IPIP tunnel. It tests both traditional tunnel configuration and dynamic configuration via bpf helpers. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-08rpmsg: Allow callback to return errorsBjorn Andersson1-2/+4
Some rpmsg backends support holding on to and redelivering messages when handling them fails, so provide a way for the callback to report an error and allow the backends to handle this. Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
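The idea of a callback that reports failure so the backend can redeliver can be sketched as follows. This is a minimal userspace illustration and not the rpmsg API: the callback type msg_cb_t, the deliver() helper, the retry limit and flaky_cb() are all hypothetical names invented for this example.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type: returns 0 on success or a negative errno. */
typedef int (*msg_cb_t)(void *data, size_t len, void *priv);

/*
 * Hypothetical backend helper: hand the message to the callback and
 * redeliver it while the callback reports an error, up to max_attempts.
 * Returns the callback's last result, or -EINVAL if no attempt was made.
 */
static int deliver(msg_cb_t cb, void *data, size_t len, void *priv,
		   int max_attempts)
{
	int ret = -EINVAL;

	for (int i = 0; i < max_attempts; i++) {
		ret = cb(data, len, priv);
		if (ret == 0)
			break;	/* handled, nothing to redeliver */
	}
	return ret;
}

/* Example callback: fails with -EAGAIN until *priv reaches zero. */
static int flaky_cb(void *data, size_t len, void *priv)
{
	int *failures_left = priv;

	(void)data;
	(void)len;

	if (*failures_left > 0) {
		(*failures_left)--;
		return -EAGAIN;
	}
	return 0;
}
```

With a void callback the backend would have no way to distinguish the failing attempts from the successful one; the integer return value is what makes a redelivery policy possible.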
2016-09-08rpmsg: Clean up rpmsg device vs channel namingBjorn Andersson1-3/+3
The struct representing an rpmsg device is called rpmsg_channel and the variable name used throughout is rpdev. With the communication happening on endpoints, it's clearer to just call this a "device" in a public API. Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2016-09-08rpmsg: rpmsg_send() operations takes rpmsg_endpointBjorn Andersson1-2/+2
The rpmsg_send() operations have been taking a rpmsg_device, but this forces users of secondary rpmsg_endpoints to use the rpmsg_sendto() interface - by extracting source and destination from the given data structures. If we instead pass the rpmsg_endpoint to these functions a service can use rpmsg_sendto() to respond to messages, even on secondary endpoints. In addition this would allow us to support operations on multiple channels in future backends that do not support off-channel operations. Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2016-09-08bpf: fix range propagation on direct packet accessDaniel Borkmann1-0/+102
LLVM can generate code that tests for direct packet access via skb->data/data_end in a way that currently gets rejected by the verifier, example: [...] 7: (61) r3 = *(u32 *)(r6 +80) 8: (61) r9 = *(u32 *)(r6 +76) 9: (bf) r2 = r9 10: (07) r2 += 54 11: (3d) if r3 >= r2 goto pc+12 R1=inv R2=pkt(id=0,off=54,r=0) R3=pkt_end R4=inv R6=ctx R9=pkt(id=0,off=0,r=0) R10=fp 12: (18) r4 = 0xffffff7a 14: (05) goto pc+430 [...] from 11 to 24: R1=inv R2=pkt(id=0,off=54,r=0) R3=pkt_end R4=inv R6=ctx R9=pkt(id=0,off=0,r=0) R10=fp 24: (7b) *(u64 *)(r10 -40) = r1 25: (b7) r1 = 0 26: (63) *(u32 *)(r6 +56) = r1 27: (b7) r2 = 40 28: (71) r8 = *(u8 *)(r9 +20) invalid access to packet, off=20 size=1, R9(id=0,off=0,r=0) The reason why this gets rejected despite a proper test is that we currently call find_good_pkt_pointers() only in case where we detect tests like rX > pkt_end, where rX is of type pkt(id=Y,off=Z,r=0) and derived, for example, from a register of type pkt(id=Y,off=0,r=0) pointing to skb->data. find_good_pkt_pointers() then fills the range in the current branch to pkt(id=Y,off=0,r=Z) on success. For above case, we need to extend that to recognize pkt_end >= rX pattern and mark the other branch that is taken on success with the appropriate pkt(id=Y,off=0,r=Z) type via find_good_pkt_pointers(). Since eBPF operates on BPF_JGT (>) and BPF_JGE (>=), these are the only two practical options to test for from what LLVM could have generated, since there's no such thing as BPF_JLT (<) or BPF_JLE (<=) that we would need to take into account as well. After the fix: [...] 7: (61) r3 = *(u32 *)(r6 +80) 8: (61) r9 = *(u32 *)(r6 +76) 9: (bf) r2 = r9 10: (07) r2 += 54 11: (3d) if r3 >= r2 goto pc+12 R1=inv R2=pkt(id=0,off=54,r=0) R3=pkt_end R4=inv R6=ctx R9=pkt(id=0,off=0,r=0) R10=fp 12: (18) r4 = 0xffffff7a 14: (05) goto pc+430 [...] 
from 11 to 24: R1=inv R2=pkt(id=0,off=54,r=54) R3=pkt_end R4=inv R6=ctx R9=pkt(id=0,off=0,r=54) R10=fp 24: (7b) *(u64 *)(r10 -40) = r1 25: (b7) r1 = 0 26: (63) *(u32 *)(r6 +56) = r1 27: (b7) r2 = 40 28: (71) r8 = *(u8 *)(r9 +20) 29: (bf) r1 = r8 30: (25) if r8 > 0x3c goto pc+47 R1=inv56 R2=imm40 R3=pkt_end R4=inv R6=ctx R8=inv56 R9=pkt(id=0,off=0,r=54) R10=fp 31: (b7) r1 = 1 [...] Verifier test cases are also added in this work, one that demonstrates the mentioned example here and one that tries a bad packet access for the current/fall-through branch (the one with types pkt(id=X,off=Y,r=0), pkt(id=X,off=0,r=0)), then a case with good and bad accesses, and two with both test variants (>, >=). Fixes: 969bf05eb3ce ("bpf: direct packet access") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
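The two test shapes discussed above ("rX > pkt_end" and "pkt_end >= rX") can be written out in plain C to show that they guard the same access. This is a userspace sketch, not BPF program or verifier code: the function names are hypothetical, HDR_LEN mirrors the 54 added to r2 in the log, and offset 20 mirrors the load from r9 + 20.

```c
#include <stdint.h>

#define HDR_LEN 54 /* bytes that must be present, as in "r2 += 54" */

/*
 * Variant 1: bail out when data + HDR_LEN > data_end, so the access sits
 * on the fall-through path. Note for plain C: forming data + HDR_LEN is
 * only defined while it stays inside the underlying buffer.
 */
static int read_byte_gt(const uint8_t *data, const uint8_t *data_end,
			uint8_t *out)
{
	if (data + HDR_LEN > data_end)
		return -1;

	*out = data[20];
	return 0;
}

/*
 * Variant 2: the shape from the log, data_end >= data + HDR_LEN, where
 * the access sits on the branch that is taken on success.
 */
static int read_byte_ge(const uint8_t *data, const uint8_t *data_end,
			uint8_t *out)
{
	if (data_end >= data + HDR_LEN) {
		*out = data[20];
		return 0;
	}
	return -1;
}
```

Both functions read offset 20 only when at least 54 bytes are available; they differ only in which side of the comparison carries the verified range, which is the distinction the fix teaches the verifier to handle.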
2016-09-02samples/bpf: add sampleip exampleBrendan Gregg3-0/+238
Sample the instruction pointer and keep a frequency count in a BPF map. Signed-off-by: Brendan Gregg <bgregg@netflix.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-02samples/bpf: add perf_event+bpf exampleAlexei Starovoitov5-1/+290
The bpf program is called 50 times a second and does hashmap[kern&user_stackid]++. Its primary purpose is to check that key bpf helpers like map lookup, update, get_stackid, trace_printk and ctx access are all working. It checks: - PERF_COUNT_HW_CPU_CYCLES on all cpus - PERF_COUNT_HW_CPU_CYCLES for current process and inherited perf_events to children - PERF_COUNT_SW_CPU_CLOCK on all cpus - PERF_COUNT_SW_CPU_CLOCK for current process Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
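The hashmap[kern&user_stackid]++ step amounts to a counter keyed by a pair of stack ids. A minimal userspace sketch of that counting logic, assuming a small fixed-size open-addressing table, might look like the following. It does not use the BPF map helpers; the struct layouts, the SLOTS size, the hash function and the names bump() and lookup() are all assumptions for illustration.

```c
#include <stdint.h>

#define SLOTS 64 /* assumed table size for this sketch */

struct key {
	uint32_t kern_stackid;
	uint32_t user_stackid;
};

struct slot {
	struct key key;
	uint64_t count;
	int used;
};

struct table {
	struct slot slots[SLOTS];	/* must start zeroed */
};

static uint32_t hash_key(struct key k)
{
	return (k.kern_stackid * 31u + k.user_stackid) % SLOTS;
}

static int key_eq(struct key a, struct key b)
{
	return a.kern_stackid == b.kern_stackid &&
	       a.user_stackid == b.user_stackid;
}

/* Equivalent of table[k]++ with linear probing; -1 if the table is full. */
static int bump(struct table *t, struct key k)
{
	uint32_t h = hash_key(k);

	for (uint32_t i = 0; i < SLOTS; i++) {
		struct slot *s = &t->slots[(h + i) % SLOTS];

		if (!s->used) {
			s->used = 1;
			s->key = k;
			s->count = 1;
			return 0;
		}
		if (key_eq(s->key, k)) {
			s->count++;
			return 0;
		}
	}
	return -1;
}

/* Return the count recorded for k, or 0 if k was never seen. */
static uint64_t lookup(const struct table *t, struct key k)
{
	uint32_t h = hash_key(k);

	for (uint32_t i = 0; i < SLOTS; i++) {
		const struct slot *s = &t->slots[(h + i) % SLOTS];

		if (!s->used)
			return 0;
		if (key_eq(s->key, k))
			return s->count;
	}
	return 0;
}
```

Calling bump() once per sample and reading the table back with lookup() gives a per-stack-pair frequency count, which is the kind of result the sample inspects from user space.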
2016-08-19samples/bpf: Add tunnel set/get tests.William Tu4-0/+327
The patch creates sample code exercising bpf_skb_{set,get}_tunnel_key, and bpf_skb_{set,get}_tunnel_opt for GRE, VXLAN, and GENEVE. A native tunnel device is created in a namespace to interact with a lwtunnel device out of the namespace, with metadata enabled. The bpf_skb_set_* program is attached to tc egress and bpf_skb_get_* is attached to egress qdisc. A ping between two tunnels is used to verify correctness and the result of bpf_skb_get_* printed by bpf_trace_printk. Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller5-31/+24
Minor overlapping changes for both merge conflicts. Resolution work done by Stephen Rothwell was used as a reference. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds3-5/+16
Pull networking fixes from David Miller: 1) Buffers powersave frame test is reversed in cfg80211, fix from Felix Fietkau. 2) Remove bogus WARN_ON in openvswitch, from Jarno Rajahalme. 3) Fix some tg3 ethtool logic bugs, and one that would cause no interrupts to be generated when rx-coalescing is set to 0. From Satish Baddipadige and Siva Reddy Kallam. 4) QLCNIC mailbox corruption and napi budget handling fix from Manish Chopra. 5) Fix fib_trie logic when walking the trie during /proc/net/route output that can access a stale node pointer. From David Forster. 6) Several sctp_diag fixes from Phil Sutter. 7) PAUSE frame handling fixes in mlxsw driver from Ido Schimmel. 8) Checksum fixup fixes in bpf from Daniel Borkmann. 9) Memory leaks in nfnetlink, from Liping Zhang. 10) Use after free in rxrpc, from David Howells. 11) Use after free in new skb_array code of macvtap driver, from Jason Wang. 12) Calipso resource leak, from Colin Ian King. 13) mediatek bug fixes (missing stats sync init, etc.) from Sean Wang. 14) Fix bpf non-linear packet write helpers, from Daniel Borkmann. 15) Fix lockdep splats in macsec, from Sabrina Dubroca. 16) hv_netvsc bug fixes from Vitaly Kuznetsov, mostly to do with VF handling. 17) Various tc-action bug fixes, from CONG Wang. 
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits) net_sched: allow flushing tc police actions net_sched: unify the init logic for act_police net_sched: convert tcf_exts from list to pointer array net_sched: move tc offload macros to pkt_cls.h net_sched: fix a typo in tc_for_each_action() net_sched: remove an unnecessary list_del() net_sched: remove the leftover cleanup_a() mlxsw: spectrum: Allow packets to be trapped from any PG mlxsw: spectrum: Unmap 802.1Q FID before destroying it mlxsw: spectrum: Add missing rollbacks in error path mlxsw: reg: Fix missing op field fill-up mlxsw: spectrum: Trap loop-backed packets mlxsw: spectrum: Add missing packet traps mlxsw: spectrum: Mark port as active before registering it mlxsw: spectrum: Create PVID vPort before registering netdevice mlxsw: spectrum: Remove redundant errors from the code mlxsw: spectrum: Don't return upon error in removal path i40e: check for and deal with non-contiguous TCs ixgbe: Re-enable ability to toggle VLAN filtering ixgbe: Force VLNCTRL.VFE to be set in all VMDq paths ...
2016-08-12samples/bpf: add verifier tests for the helper access to the packetAaron Yue1-4/+110
Test various corner cases of helper function access to the packet via crafted XDP programs. Signed-off-by: Aaron Yue <haoxuany@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>