summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2017-11-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller14-152/+216
Daniel Borkmann says: ==================== pull-request: bpf 2017-11-23 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Several BPF offloading fixes, from Jakub. Among others: - Limit offload to cls_bpf and XDP program types only. - Move device validation into the driver and don't make any assumptions about the device in the classifier due to shared blocks semantics. - Don't pass offloaded XDP program into the driver when it should be run in native XDP instead. Offloaded ones are not JITed for the host in such cases. - Don't destroy device offload state when moved to another namespace. - Revert dumping offload info into user space for now, since ifindex alone is not sufficient. This will be redone properly for bpf-next tree. 2) Fix test_verifier to avoid using bpf_probe_write_user() helper in test cases, since it's dumping a warning into kernel log which may confuse users when only running tests. Switch to use bpf_trace_printk() instead, from Yonghong. 3) Several fixes for correcting ARG_CONST_SIZE_OR_ZERO semantics before it becomes uabi, from Gianluca. More specifically: - Add a type ARG_PTR_TO_MEM_OR_NULL that is used only by bpf_csum_diff(), where the argument is either a valid pointer or NULL. The subsequent ARG_CONST_SIZE_OR_ZERO then enforces a valid pointer in case of non-0 size or a valid pointer or NULL in case of size 0. Given that, the semantics for ARG_PTR_TO_MEM in combination with ARG_CONST_SIZE_OR_ZERO are now such that in case of size 0, the pointer must always be valid and cannot be NULL. This fix in semantics allows for bpf_probe_read() to drop the recently added size == 0 check in the helper that would become part of uabi otherwise once released. At the same time we can then fix bpf_probe_read_str() and bpf_perf_event_output() to use ARG_CONST_SIZE_OR_ZERO instead of ARG_CONST_SIZE in order to fix recently reported issues by Arnaldo et al, where LLVM optimizes two boundary checks into a single one for unknown variables where the verifier looses track of the variable bounds and thus rejects valid programs otherwise. 4) A fix for the verifier for the case when it detects comparison of two constants where the branch is guaranteed to not be taken at runtime. Verifier will rightfully prune the exploration of such paths, but we still pass the program to JITs, where they would complain about using reserved fields, etc. Track such dead instructions and sanitize them with mov r0,r0. Rejection is not possible since LLVM may generate them for valid C code and doesn't do as much data flow analysis as verifier. For bpf-next we might implement removal of such dead code and adjust branches instead. Fix from Alexei. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-24net: accept UFO datagrams from tuntap and packetWillem de Bruijn15-14/+209
Tuntap and similar devices can inject GSO packets. Accept type VIRTIO_NET_HDR_GSO_UDP, even though not generating UFO natively. Processes are expected to use feature negotiation such as TUNSETOFFLOAD to detect supported offload types and refrain from injecting other packets. This process breaks down with live migration: guest kernels do not renegotiate flags, so destination hosts need to expose all features that the source host does. Partially revert the UFO removal from 182e0b6b5846~1..d9d30adf5677. This patch introduces nearly(*) no new code to simplify verification. It brings back verbatim tuntap UFO negotiation, VIRTIO_NET_HDR_GSO_UDP insertion and software UFO segmentation. It does not reinstate protocol stack support, hardware offload (NETIF_F_UFO), SKB_GSO_UDP tunneling in SKB_GSO_SOFTWARE or reception of VIRTIO_NET_HDR_GSO_UDP packets in tuntap. To support SKB_GSO_UDP reappearing in the stack, also reinstate logic in act_csum and openvswitch. Achieve equivalence with v4.13 HEAD by squashing in commit 939912216fa8 ("net: skb_needs_check() removes CHECKSUM_UNNECESSARY check for tx.") and reverting commit 8d63bee643f1 ("net: avoid skb_warn_bad_offload false positives on UFO"). (*) To avoid having to bring back skb_shinfo(skb)->ip6_frag_id, ipv6_proxy_select_ident is changed to return a __be32 and this is assigned directly to the frag_hdr. Also, SKB_GSO_UDP is inserted at the end of the enum to minimize code churn. Tested Booted a v4.13 guest kernel with QEMU. On a host kernel before this patch `ethtool -k eth0` shows UFO disabled. After the patch, it is enabled, same as on a v4.13 host kernel. A UFO packet sent from the guest appears on the tap device: host: nc -l -p -u 8000 & tcpdump -n -i tap0 guest: dd if=/dev/zero of=payload.txt bs=1 count=2000 nc -u 192.16.1.1 8000 < payload.txt Direct tap to tap transmission of VIRTIO_NET_HDR_GSO_UDP succeeds, packets arriving fragmented: ./with_tap_pair.sh ./tap_send_ufo tap0 tap1 (from https://github.com/wdebruij/kerneltools/tree/master/tests) Changes v1 -> v2 - simplified set_offload change (review comment) - documented test procedure Link: http://lkml.kernel.org/r/<CAF=yD-LuUeDuL9YWPJD9ykOZ0QCjNeznPDr6whqZ9NGMNF12Mw@mail.gmail.com> Fixes: fb652fdfe837 ("macvlan/macvtap: Remove NETIF_F_UFO advertisement.") Reported-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-24net: realtek: r8169: implement set_link_ksettings()Tobias Jakobi1-16/+22
Commit 6fa1ba61520576cf1346c4ff09a056f2950cb3bf partially implemented the new ethtool API, by replacing get_settings() with get_link_ksettings(). This breaks ethtool, since the userspace tool (according to the new API specs) never tries the legacy set() call, when the new get() call succeeds. All attempts to chance some setting from userspace result in: > Cannot set new settings: Operation not supported Implement the missing set() call. Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de> Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-24net: ipv6: Fixup device for anycast routes during copyDavid Ahern1-1/+1
Florian reported a breakage with anycast routes due to commit 4832c30d5458 ("net: ipv6: put host and anycast routes on device with address"). Prior to this commit anycast routes were added against the loopback device causing repetitive route entries with no insight into why they existed. e.g.: $ ip -6 ro ls table local type anycast anycast 2001:db8:1:: dev lo proto kernel metric 0 pref medium anycast 2001:db8:2:: dev lo proto kernel metric 0 pref medium anycast fe80:: dev lo proto kernel metric 0 pref medium anycast fe80:: dev lo proto kernel metric 0 pref medium The point of commit 4832c30d5458 is to add the routes using the device with the address which is causing the route to be added. e.g.,: $ ip -6 ro ls table local type anycast anycast 2001:db8:1:: dev eth1 proto kernel metric 0 pref medium anycast 2001:db8:2:: dev eth2 proto kernel metric 0 pref medium anycast fe80:: dev eth2 proto kernel metric 0 pref medium anycast fe80:: dev eth1 proto kernel metric 0 pref medium For traffic to work as it did before, the dst device needs to be switched to the loopback when the copy is created similar to local routes. Fixes: 4832c30d5458 ("net: ipv6: put host and anycast routes on device with address") Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-24Merge branch 'smc-fixes-for-smc-buffer-handling'David S. Miller1-2/+2
Ursula Braun says: ==================== net/smc: fixes for smc buffer handling here are 2 cleanup patches for smc buffer handling. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-24net/smc: Fix preinitialization of buf_desc in __smc_buf_create()Geert Uytterhoeven1-1/+1
With gcc-4.1.2: net/smc/smc_core.c: In function ‘__smc_buf_create’: net/smc/smc_core.c:567: warning: ‘bufsize’ may be used uninitialized in this function Indeed, if the for-loop is never executed, bufsize is used uninitialized. In addition, buf_desc is stored for later use, while it is still a NULL pointer. Before, error handling was done by checking if buf_desc is non-NULL. The cleanup changed this to an error check, but forgot to update the preinitialization of buf_desc to an error pointer. Update the preinitializatin of buf_desc to fix this. Fixes: b33982c3a6838d13 ("net/smc: cleanup function __smc_buf_create()") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-24net/smc: use sk_rcvbuf as start for rmb creationUrsula Braun1-1/+1
Commit 3e034725c0d8 ("net/smc: common functions for RMBs and send buffers") merged handling of SMC receive and send buffers. It introduced sk_buf_size as merged start value for size determination. But since sk_buf_size is not used at all, sk_sndbuf is erroneously used as start for rmb creation. This patch makes sure, sk_buf_size is really used as intended, and sk_rcvbuf is used as start value for rmb creation. Fixes: 3e034725c0d8 ("net/smc: common functions for RMBs and send buffers") Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by: Hans Wippel <hwippel@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-24ipv6: Do not consider linkdown nexthops during multipathIdo Schimmel1-0/+5
When the 'ignore_routes_with_linkdown' sysctl is set, we should not consider linkdown nexthops during route lookup. While the code correctly verifies that the initially selected route ('match') has a carrier, it does not perform the same check in the subsequent multipath selection, resulting in a potential packet loss. In case the chosen route does not have a carrier and the sysctl is set, choose the initially selected route. Fixes: 35103d11173b ("net: ipv6 sysctl option to ignore routes when nexthop link is down") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-24net: sched: fix crash when deleting secondary chainsRoman Kapl1-3/+4
If you flush (delete) a filter chain other than chain 0 (such as when deleting the device), the kernel may run into a use-after-free. The chain refcount must not be decremented unless we are sure we are done with the chain. To reproduce the bug, run: ip link add dtest type dummy tc qdisc add dev dtest ingress tc filter add dev dtest chain 1 parent ffff: flower ip link del dtest Introduced in: commit f93e1cdcf42c ("net/sched: fix filter flushing"), but unless you have KAsan or luck, you won't notice it until commit 0dadc117ac8b ("cls_flower: use tcf_exts_get_net() before call_rcu()") Fixes: f93e1cdcf42c ("net/sched: fix filter flushing") Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Roman Kapl <code@rkapl.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-24net: phy: cortina: add missing MODULE_DESCRIPTION/AUTHOR/LICENSEJesse Chan1-0/+4
This change resolves a new compile-time warning when built as a loadable module: WARNING: modpost: missing MODULE_LICENSE() in drivers/net/phy/cortina.o see include/linux/module.h for more information This adds the license as "GPL", which matches the header of the file. MODULE_DESCRIPTION and MODULE_AUTHOR are also added. Signed-off-by: Jesse Chan <jc@linux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-23bpf: fix branch pruning logicAlexei Starovoitov2-1/+28
when the verifier detects that register contains a runtime constant and it's compared with another constant it will prune exploration of the branch that is guaranteed not to be taken at runtime. This is all correct, but malicious program may be constructed in such a way that it always has a constant comparison and the other branch is never taken under any conditions. In this case such path through the program will not be explored by the verifier. It won't be taken at run-time either, but since all instructions are JITed the malicious program may cause JITs to complain about using reserved fields, etc. To fix the issue we have to track the instructions explored by the verifier and sanitize instructions that are dead at run time with NOPs. We cannot reject such dead code, since llvm generates it for valid C code, since it doesn't do as much data flow analysis as the verifier does. Fixes: 17a5267067f3 ("bpf: verifier (add verifier core)") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-22Merge branch 'bpf-fix-null-arg-semantics'Daniel Borkmann5-18/+116
Gianluca Borello says: ==================== This set includes some fixes in semantics and usability issues that emerged recently, and would be good to have them in net before the next release. In particular, ARG_CONST_SIZE_OR_ZERO semantics was recently changed in commit 9fd29c08e520 ("bpf: improve verifier ARG_CONST_SIZE_OR_ZERO semantics") with the goal of letting the compiler generate simpler code that the verifier can more easily accept. To handle this change in semantics, a few checks in some helpers were added, like in commit 9c019e2bc4b2 ("bpf: change helper bpf_probe_read arg2 type to ARG_CONST_SIZE_OR_ZERO"), and those checks are less than ideal because once they make it into a released kernel bpf programs can start relying on them, preventing the possibility of being removed later on. This patch tries to fix the issue by introducing a new argument type ARG_PTR_TO_MEM_OR_NULL that can be used for helpers that can receive a <NULL, 0> tuple. By doing so, we can fix the semantics of the other helpers that don't need <NULL, 0> and can just handle <!NULL, 0>, allowing the code to get rid of those checks. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-22bpf: change bpf_perf_event_output arg5 type to ARG_CONST_SIZE_OR_ZEROGianluca Borello1-2/+2
Commit 9fd29c08e520 ("bpf: improve verifier ARG_CONST_SIZE_OR_ZERO semantics") relaxed the treatment of ARG_CONST_SIZE_OR_ZERO due to the way the compiler generates optimized BPF code when checking boundaries of an argument from C code. A typical example of this optimized code can be generated using the bpf_perf_event_output helper when operating on variable memory: /* len is a generic scalar */ if (len > 0 && len <= 0x7fff) bpf_perf_event_output(ctx, &perf_map, 0, buf, len); 110: (79) r5 = *(u64 *)(r10 -40) 111: (bf) r1 = r5 112: (07) r1 += -1 113: (25) if r1 > 0x7ffe goto pc+6 114: (bf) r1 = r6 115: (18) r2 = 0xffff94e5f166c200 117: (b7) r3 = 0 118: (bf) r4 = r7 119: (85) call bpf_perf_event_output#25 R5 min value is negative, either use unsigned or 'var &= const' With this code, the verifier loses track of the variable. Replacing arg5 with ARG_CONST_SIZE_OR_ZERO is thus desirable since it avoids this quite common case which leads to usability issues, and the compiler generates code that the verifier can more easily test: if (len <= 0x7fff) bpf_perf_event_output(ctx, &perf_map, 0, buf, len); or bpf_perf_event_output(ctx, &perf_map, 0, buf, len & 0x7fff); No changes to the bpf_perf_event_output helper are necessary since it can handle a case where size is 0, and an empty frame is pushed. Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Gianluca Borello <g.borello@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-22bpf: change bpf_probe_read_str arg2 type to ARG_CONST_SIZE_OR_ZEROGianluca Borello1-1/+1
Commit 9fd29c08e520 ("bpf: improve verifier ARG_CONST_SIZE_OR_ZERO semantics") relaxed the treatment of ARG_CONST_SIZE_OR_ZERO due to the way the compiler generates optimized BPF code when checking boundaries of an argument from C code. A typical example of this optimized code can be generated using the bpf_probe_read_str helper when operating on variable memory: /* len is a generic scalar */ if (len > 0 && len <= 0x7fff) bpf_probe_read_str(p, len, s); 251: (79) r1 = *(u64 *)(r10 -88) 252: (07) r1 += -1 253: (25) if r1 > 0x7ffe goto pc-42 254: (bf) r1 = r7 255: (79) r2 = *(u64 *)(r10 -88) 256: (bf) r8 = r4 257: (85) call bpf_probe_read_str#45 R2 min value is negative, either use unsigned or 'var &= const' With this code, the verifier loses track of the variable. Replacing arg2 with ARG_CONST_SIZE_OR_ZERO is thus desirable since it avoids this quite common case which leads to usability issues, and the compiler generates code that the verifier can more easily test: if (len <= 0x7fff) bpf_probe_read_str(p, len, s); or bpf_probe_read_str(p, len & 0x7fff, s); No changes to the bpf_probe_read_str helper are necessary since strncpy_from_unsafe itself immediately returns if the size passed is 0. Signed-off-by: Gianluca Borello <g.borello@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-22bpf: remove explicit handling of 0 for arg2 in bpf_probe_readGianluca Borello1-5/+1
Commit 9c019e2bc4b2 ("bpf: change helper bpf_probe_read arg2 type to ARG_CONST_SIZE_OR_ZERO") changed arg2 type to ARG_CONST_SIZE_OR_ZERO to simplify writing bpf programs by taking advantage of the new semantics introduced for ARG_CONST_SIZE_OR_ZERO which allows <!NULL, 0> arguments. In order to prevent the helper from actually passing a NULL pointer to probe_kernel_read, which can happen when <NULL, 0> is passed to the helper, the commit also introduced an explicit check against size == 0. After the recent introduction of the ARG_PTR_TO_MEM_OR_NULL type, bpf_probe_read can not receive a pair of <NULL, 0> arguments anymore, thus the check is not needed anymore and can be removed, since probe_kernel_read can correctly handle a <!NULL, 0> call. This also fixes the semantics of the helper before it gets officially released and bpf programs start relying on this check. Fixes: 9c019e2bc4b2 ("bpf: change helper bpf_probe_read arg2 type to ARG_CONST_SIZE_OR_ZERO") Signed-off-by: Gianluca Borello <g.borello@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-22bpf: introduce ARG_PTR_TO_MEM_OR_NULLGianluca Borello4-10/+112
With the current ARG_PTR_TO_MEM/ARG_PTR_TO_UNINIT_MEM semantics, an helper argument can be NULL when the next argument type is ARG_CONST_SIZE_OR_ZERO and the verifier can prove the value of this next argument is 0. However, most helpers are just interested in handling <!NULL, 0>, so forcing them to deal with <NULL, 0> makes the implementation of those helpers more complicated for no apparent benefits, requiring them to explicitly handle those corner cases with checks that bpf programs could start relying upon, preventing the possibility of removing them later. Solve this by making ARG_PTR_TO_MEM/ARG_PTR_TO_UNINIT_MEM never accept NULL even when ARG_CONST_SIZE_OR_ZERO is set, and introduce a new argument type ARG_PTR_TO_MEM_OR_NULL to explicitly deal with the NULL case. Currently, the only helper that needs this is bpf_csum_diff_proto(), so change arg1 and arg3 to this new type as well. Also add a new battery of tests that explicitly test the !ARG_PTR_TO_MEM_OR_NULL combination: all the current ones testing the various <NULL, 0> variations are focused on bpf_csum_diff, so cover also other helpers. Signed-off-by: Gianluca Borello <g.borello@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-21bpf: change bpf_probe_write_user to bpf_trace_printk in test_verifierYonghong Song1-23/+16
There are four tests in test_verifier using bpf_probe_write_user helper. These four tests will emit the following kernel messages [ 12.974753] test_verifier[220] is installing a program with bpf_probe_write_user helper that may corrupt user memory! [ 12.979285] test_verifier[220] is installing a program with bpf_probe_write_user helper that may corrupt user memory! ...... This may confuse certain users. This patch replaces bpf_probe_write_user with bpf_trace_printk. The test_verifier already uses bpf_trace_printk earlier in the test and a trace_printk warning message has been printed. So this patch does not emit any more kernel messages. Fixes: b6ff63911232 ("bpf: fix and add test cases for ARG_CONST_SIZE_OR_ZERO semantics change") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds32-176/+220
Pull networking fixes from David Miller: 1) Fix a reference to a module parameter which was lost during the GREv6 receive path rewrite, from Alexey Kodanev. 2) Fix deref before NULL check in ipheth, from Gustavo A. R. Silva. 3) RCU read lock imbalance in tun_build_skb(), from Xin Long. 4) Some stragglers from the mac80211 folks: a) Timer conversions from Kees Cook b) Fix some sequencing issue when cfg80211 is built statically, from Johannes Berg c) Memory leak in mac80211_hwsim, from Ben Hutchings. 5) Add new qmi_wwan device ID, from Sebastian Sjoholm. 6) Fix use after free in tipc, from Jon Maloy. 7) Missing kdoc in nfp driver, from Jakub Kicinski. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: nfp: flower: add missing kdoc tipc: fix access of released memory net: qmi_wwan: add Quectel BG96 2c7c:0296 mlxsw: spectrum: Do not try to create non-existing ports during unsplit mac80211: properly free requested-but-not-started TX agg sessions mac80211_hwsim: Fix memory leak in hwsim_new_radio_nl() cfg80211: initialize regulatory keys/database later mac80211: aggregation: Convert timers to use timer_setup() nl80211: don't expose wdev->ssid for most interfaces mac80211: Convert timers to use timer_setup() net: vxge: Fix some indentation issues net: ena: fix race condition between device reset and link up setup r8169: use same RTL8111EVL green settings as in vendor driver r8169: fix RTL8111EVL EEE and green settings tun: fix rcu_read_lock imbalance in tun_build_skb tcp: when scheduling TLP, time of RTO should account for current ACK usbnet: ipheth: fix potential null pointer dereference in ipheth_carrier_set gre6: use log_ecn_error module parameter in ip6_tnl_rcv()
2017-11-21Merge tag 'for-linus-4.15-ofs1' of ↵Linus Torvalds10-154/+72
git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux Pull orangefs updates from Mike Marshall: "Fix: - stop setting atime on inode dirty (Martin Brandenburg) Cleanups: - remove initialization of i_version (Jeff Layton) - use ARRAY_SIZE (Jérémy Lefaure) - call op_release sooner when creating inodes (Mike MarshallMartin Brandenburg)" * tag 'for-linus-4.15-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux: orangefs: call op_release sooner when creating inodes orangefs: stop setting atime on inode dirty orangefs: use ARRAY_SIZE orangefs: remove initialization of i_version
2017-11-21Merge tag 'ceph-for-4.15-rc1' of git://github.com/ceph/ceph-clientLinus Torvalds11-150/+237
Pull ceph updates from Ilya Dryomov: "We have a set of file locking improvements from Zheng, rbd rw/ro state handling code cleanup from myself and some assorted CephFS fixes from Jeff. rbd now defaults to single-major=Y, lifting the limit of ~240 rbd images per host for everyone" * tag 'ceph-for-4.15-rc1' of git://github.com/ceph/ceph-client: rbd: default to single-major device number scheme libceph: don't WARN() if user tries to add invalid key rbd: set discard_alignment to zero ceph: silence sparse endianness warning in encode_caps_cb ceph: remove the bump of i_version ceph: present consistent fsid, regardless of arch endianness ceph: clean up spinlocking and list handling around cleanup_cap_releases() rbd: get rid of rbd_mapping::read_only rbd: fix and simplify rbd_ioctl_set_ro() ceph: remove unused and redundant variable dropping ceph: mark expected switch fall-throughs ceph: -EINVAL on decoding failure in ceph_mdsc_handle_fsmap() ceph: disable cached readdir after dropping positive dentry ceph: fix bool initialization/comparison ceph: handle 'session get evicted while there are file locks' ceph: optimize flock encoding during reconnect ceph: make lock_to_ceph_filelock() static ceph: keep auth cap when inode has flocks or posix locks
2017-11-21Merge branch 'for-linus' of ↵Linus Torvalds3-6/+4
git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk Pull printk updates from Petr Mladek: - print the warning about dropped messages on consoles on a separate line. It makes it more legible. - one typo fix and small code clean up. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk: added new line symbol after warning about dropped messages printk: fix typo in printk_safe.c printk: simplify no_printk()
2017-11-21Merge tag 'mac80211-for-davem-2017-11-20' of ↵David S. Miller17-137/+155
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== A few things: * straggler timer conversions from Kees * memory leak fix in hwsim * fix some fallout from regdb changes if wireless is built-in * also free aggregation sessions in startup state when station goes away, to avoid crashing the timer ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-21nfp: flower: add missing kdocJakub Kicinski1-0/+1
Commit 0115552eac14 ("nfp: remove false positive offloads in flower vxlan") missed adding kdoc for a new parameter of nfp_flower_add_offload(). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-21tipc: fix access of released memoryJon Maloy1-1/+1
When the function tipc_group_filter_msg() finds that a member event indicates that the member is leaving the group, it first deletes the member instance, and then purges the message queue being handled by the call. But the message queue is an aggregated field in the just deleted item, leading the purge call to access freed memory. We fix this by swapping the order of the two actions. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-21net: qmi_wwan: add Quectel BG96 2c7c:0296Sebastian Sjoholm1-0/+1
Quectel BG96 is an Qualcomm MDM9206 based IoT modem, supporting both CAT-M and NB-IoT. Tested hardware is BG96 mounted on Quectel development board (EVB). The USB id is added to qmi_wwan.c to allow QMI communication with the BG96. Signed-off-by: Sebastian Sjoholm <ssjoholm@mac.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-21mlxsw: spectrum: Do not try to create non-existing ports during unsplitIdo Schimmel2-2/+8
On some systems, when we unsplit a port we need to re-create two ports instead. On other systems, only one needs to be re-created. Do not try to create a port if during driver initialization it was assigned a negative module number, which is invalid. This avoids the following error during unsplit: [ 941.012478] mlxsw_spectrum 0000:01:00.0: Port 43: Failed to map module The error is harmless and caused by the fact that a local port is already mapped to module 0. Fixes: be94535f9531 ("mlxsw: spectrum: Make split flow match firmware requirements") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-20Merge tag 'fbdev-v4.15' of git://github.com/bzolnier/linuxLinus Torvalds28-748/+164
Pull fbdev updates from Bartlomiej Zolnierkiewicz: "There is nothing really major here (though removal of the dead igafb driver stands out in diffstat). Summary: - convert timers to use timer_setup() (Kees Cook, Thierry Reding) - fix panels support on iMX boards in mxsfb driver (Stefan Agner) - fix timeout on EDID read in udlfb driver (Ladislav Michl) - add missing modes to fix out of bounds access in controlfb driver (Geert Uytterhoeven) - update initialisation paths in sa1100fb driver to be more robust (Russell King) - fix error handling path of ->probe method in au1200fb driver (Christophe JAILLET) - fix handling of cases when either panel or crt is defined in sm501fb driver (Sudip Mukherjee, Colin Ian King) - add ability to the Goldfish FB driver to be recognized by OS via DT (Aleksandar Markovic) - structures constifications (Bhumika Goyal) - misc fixes (Allen Pais, Gustavo A. R. Silva, Dan Carpenter) - misc cleanups (Colin Ian King, Himanshu Jha, Markus Elfring) - remove dead igafb driver" * tag 'fbdev-v4.15' of git://github.com/bzolnier/linux: (42 commits) OMAPFB: prevent buffer underflow in omapfb_parse_vram_param() video: fbdev: sm501fb: fix potential null pointer dereference on fbi fbcon: Initialize ops->info early video: fbdev: Convert timers to use timer_setup() video: fbdev: pxa3xx_gcu: Convert timers to use timer_setup() fbdev: controlfb: Add missing modes to fix out of bounds access video: fbdev: sis_main: mark expected switch fall-throughs video: fbdev: cirrusfb: mark expected switch fall-throughs video: fbdev: aty: radeon_pm: mark expected switch fall-throughs video: fbdev: sm501fb: mark expected switch fall-through in sm501fb_blank_crt video: fbdev: intelfb: remove redundant variables video/fbdev/dnfb: Use common error handling code in dnfb_probe() sm501fb: suspend and resume fb if it exists sm501fb: unregister framebuffer only if registered sm501fb: deallocate colormap only if allocated video: goldfishfb: Add support for device tree bindings Documentation: Add device tree binding for Goldfish FB driver video: udlfb: Fix read EDID timeout video: fbdev: remove dead igafb driver video: fbdev: mxsfb: fix pixelclock polarity ...
2017-11-20Merge tag 'devicetree-fixes-for-4.15' of ↵Linus Torvalds6-79/+29
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull DeviceTree fixes from Rob Herring: - Remove mc13892 as a trivial device - Improve of_find_node_by_name() documentation - Fix unit test dtc warnings - Clean-ups of USB binding documentation - Fix potential NULL deref in of_pci_map_rid * tag 'devicetree-fixes-for-4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: dt-bindings: trivial-devices: Remove fsl,mc13892 of: Document exactly what of_find_node_by_name() puts of: unittest: disable interrupts_property warning of: unittest: let dtc generate __local_fixups__ dt-bindings: usb: document hub and host-controller properties dt-bindings: usb: clean up compatible property dt-bindings: usb: fix reg-property port-number range dt-bindings: usb: fix example hub node name of/pci: Fix theoretical NULL dereference
2017-11-20Merge tag 'jfs-4.15-2' of git://github.com/kleikamp/linux-shaggyLinus Torvalds1-1/+1
Pull jfs fixlet from Dave Kleikamp: "Update jfs git tree in MAINTAINERS" * tag 'jfs-4.15-2' of git://github.com/kleikamp/linux-shaggy: MAINTAINERS: fix jfs tree location
2017-11-21Merge branch 'bpf-offload-fixes'Daniel Borkmann10-110/+56
Jakub Kicinski says: ==================== This series addresses some late comments and moves checking if program has been loaded for the correct device to the drivers. There are also some problems with net namespaces which I didn't take into consideration. On the kernel side we will now simply ignore namespace moves. Since the user space API is not reporting any namespace identification we have to remove the ifindex until a correct way of reporting is agreed upon. v2: - fix ext ack reporting for XDP (David A); - add Jiri's Ack. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-21bpf: make bpf_prog_offload_verifier_prep() static inlineJakub Kicinski1-1/+1
Header implementation of bpf_prog_offload_verifier_prep() which is used if CONFIG_NET=n should be a static inline. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-21bpf: revert report offload info to user spaceJakub Kicinski4-24/+0
This reverts commit bd601b6ada11 ("bpf: report offload info to user space"). The ifindex by itself is not sufficient, we should provide information on which network namespace this ifindex belongs to. After considering some options we concluded that it's best to just remove this API for now, and rework it in -next. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-21bpftool: revert printing program device bound infoJakub Kicinski2-37/+0
This reverts commit 928631e05495 ("bpftool: print program device bound info"). We will remove this API and redo it right in -next. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-21bpf: offload: ignore namespace movesJakub Kicinski1-0/+4
We are currently destroying the device offload state when device moves to another net namespace. This doesn't break with current NFP code, because offload state is not used on program removal, but it's not correct behaviour. Ignore the device unregister notifications on namespace move. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-21bpf: turn bpf_prog_get_type() into a wrapperJakub Kicinski2-17/+6
bpf_prog_get_type() is identical to bpf_prog_get_type_dev(), with false passed as attach_drv. Instead of keeping it as an exported symbol turn it into static inline wrapper. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-21net: xdp: don't allow device-bound programs in driver modeJakub Kicinski1-0/+7
Currently device-bound programs are not able to run on the host to save resources (host JIT is not invoked). Don't allow XDP programs to be attached without the HW_MODE flag. In theory if program is already translated for device offload the driver should choose to offload it instead of loading it in the driver. However, offloading translated program may still fail resulting in device-bound program being run on the host. Prevent this by refusing to attach device bound programs if XDP_FLAGS_HW_MODE is not set. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-21bpf: offload: move offload device validation out to the driversJakub Kicinski5-25/+27
With TC shared block changes we can't depend on correct netdev pointer being available in cls_bpf. Move the device validation to the driver. Core will only make sure that offloaded programs are always attached in the driver (or in HW by the driver). We trust that drivers which implement offload callbacks will perform necessary checks. Moving the checks to the driver is generally a useful thing, in practice the check should be against a switchdev instance, not a netdev, given that most ASICs will probably allow using the same program on many ports. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-21bpf: offload: rename the ifindex fieldJakub Kicinski4-5/+5
bpf_target_prog seems long and clunky, rename it to prog_ifindex. We don't want to call this field just ifindex, because maps may need a similar field in the future and bpf_attr members for programs and maps are unnamed. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-21bpf: offload: limit offload to cls_bpf and xdp programs onlyJakub Kicinski1-2/+3
We are currently only allowing attachment of device-bound cls_bpf and XDP programs. Make this restriction explicit in the BPF offload code. This way we can potentially reuse the ifindex field in the future. Since XDP and cls_bpf programs can only be loaded by admin, we can drop the explicit capability check from offload code. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-21bpf: offload: add comment warning developers about double destroyJakub Kicinski1-0/+4
Offload state may get destroyed either because the device for which it was constructed is going away, or because the refcount of bpf program itself has reached 0. In both of those cases we will call __bpf_prog_offload_destroy() to unlink the offload from the device. We may in fact call it twice, which works just fine, but we should make clear this is intended and caution others trying to extend the function. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-20dt-bindings: trivial-devices: Remove fsl,mc13892Jonathan Neuschäfer1-1/+0
This device's bindings are not trivial: Additional properties are documented in in Documentation/devicetree/bindings/mfd/mc13xxx.txt. Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: Rob Herring <robh@kernel.org>
2017-11-20of: Document exactly what of_find_node_by_name() putsStephen Boyd1-3/+3
It isn't clear if this function of_node_put()s the 'from' argument, or the node it searches. Clearly indicate which variable is touched. Fold in some more fixes from Randy too because we're in the area. Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Rob Herring <robh@kernel.org>
2017-11-20MAINTAINERS: fix jfs tree locationTom Saeger1-1/+1
JFS tree has been moved to github. Signed-off-by: Tom Saeger <tom.saeger@oracle.com> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
2017-11-20mac80211: properly free requested-but-not-started TX agg sessionsJohannes Berg1-0/+5
When deleting a station or otherwise tearing down all aggregation sessions, make sure to delete requested but not yet started ones, to avoid the following scenario: * session is requested, added to tid_start_tx[] * ieee80211_ba_session_work() runs, gets past BLOCK_BA check * ieee80211_sta_tear_down_BA_sessions() runs, locks &sta->ampdu_mlme.mtx, e.g. while deleting the station - deleting all active sessions * ieee80211_ba_session_work() continues since tear down flushes it, and calls ieee80211_tx_ba_session_handle_start() for the new session, arms the timer for it * station deletion continues to __cleanup_single_sta() and frees the session struct, while the timer is armed Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-11-20mac80211_hwsim: Fix memory leak in hwsim_new_radio_nl()Ben Hutchings1-1/+4
hwsim_new_radio_nl() now copies the name attribute in order to add a null-terminator. mac80211_hwsim_new_radio() (indirectly) copies it again into the net_device structure, so the first copy is not used or freed later. Free the first copy before returning. Fixes: ff4dd73dd2b4 ("mac80211_hwsim: check HWSIM_ATTR_RADIO_NAME length") Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-11-20cfg80211: initialize regulatory keys/database laterJohannes Berg1-15/+27
When cfg80211 is built as a module, everything is fine, and we can keep the code as is; in fact, we have to, because there can only be a single module_init(). When cfg80211 is built-in, however, it needs to initialize before drivers (device_initcall/module_init), and thus used to be at subsys_initcall(). I'd moved it to fs_initcall() earlier, where it can remain. However, this is still too early because at that point the key infrastructure hasn't been initialized yet, so X.509 certificates can't be parsed yet. To work around this problem, load the regdb keys only later in a late_initcall(), at which point the necessary infrastructure has been initialized. Fixes: 90a53e4432b1 ("cfg80211: implement regdb signature checking") Reported-by: Xiaolong Ye <xiaolong.ye@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-11-20mac80211: aggregation: Convert timers to use timer_setup()Kees Cook4-60/+45
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. This removes the tid mapping array and expands the tid structures to add a pointer back to the station, along with the tid index itself. Cc: Johannes Berg <johannes@sipsolutions.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: linux-wireless@vger.kernel.org Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> [switch tid variables to u8, the valid range is 0-15 at most, initialize tid_tx->sta/tid properly] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-11-20nl80211: don't expose wdev->ssid for most interfacesJohannes Berg1-2/+24
For mesh, this is simply wrong - there's no SSID, only the mesh ID, so don't expose it at all. For (P2P) client, it's wrong, because it exposes an internal value that's only used when certain APIs are used. For AP, it's actually the only correct case, so leave that. All other interface types shouldn't be setting this anyway, so there it won't change anything. Fixes: b84e7a05f619 ("nl80211: send the NL80211_ATTR_SSID in nl80211_send_iface()") Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-11-20mac80211: Convert timers to use timer_setup()Kees Cook11-59/+50
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Johannes Berg <johannes@sipsolutions.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: linux-wireless@vger.kernel.org Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-11-19Merge tag 'ntb-4.15' of git://github.com/jonmason/ntbLinus Torvalds15-367/+1715
Pull ntb updates from Jon Mason: "Support for the switchtec ntb and related changes. Also, a couple of bug fixes" [ The timing isn't great. I had asked people to send me pull requests before my family vacation, and this code has not even been in linux-next as far as I can tell. But Logan Gunthorpe pleaded for its inclusion because the Switchtec driver has apparently been around for a while, just never in linux-next - Linus ] * tag 'ntb-4.15' of git://github.com/jonmason/ntb: ntb: intel: remove b2b memory window workaround for Skylake NTB NTB: make idt_89hpes_cfg const NTB: switchtec_ntb: Update switchtec documentation with notes for NTB NTB: switchtec_ntb: Add memory window support NTB: switchtec_ntb: Implement scratchpad registers NTB: switchtec_ntb: Implement doorbell registers NTB: switchtec_ntb: Add link management NTB: switchtec_ntb: Add skeleton NTB driver NTB: switchtec_ntb: Initialize hardware for doorbells and messages NTB: switchtec_ntb: Initialize hardware for memory windows NTB: switchtec_ntb: Introduce initial NTB driver NTB: Add check and comment for link up to mw_count() and mw_get_align() NTB: Ensure ntb_mw_get_align() is only called when the link is up NTB: switchtec: Add link event notifier callback NTB: switchtec: Add NTB hardware register definitions NTB: switchtec: Export class symbol for use in upper layer driver NTB: switchtec: Move structure definitions into a common header ntb: update maintainer list for Intel NTB driver