Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- bpf:
- allow bpf programs calling kernel functions (initially to
reuse TCP congestion control implementations)
- enable task local storage for tracing programs - remove the
need to store per-task state in hash maps, and allow tracing
programs access to task local storage previously added for
BPF_LSM
- add bpf_for_each_map_elem() helper, allowing programs to walk
all map elements in a more robust and easier to verify fashion
- sockmap: support UDP and cross-protocol BPF_SK_SKB_VERDICT
redirection
- lpm: add support for batched ops in LPM trie
- add BTF_KIND_FLOAT support - mostly to allow use of BTF on
s390 which has floats in its headers files
- improve BPF syscall documentation and extend the use of kdoc
parsing scripts we already employ for bpf-helpers
- libbpf, bpftool: support static linking of BPF ELF files
- improve support for encapsulation of L2 packets
- xdp: restructure redirect actions to avoid a runtime lookup,
improving performance by 4-8% in microbenchmarks
- xsk: build skb by page (aka generic zerocopy xmit) - improve
performance of software AF_XDP path by 33% for devices which don't
need headers in the linear skb part (e.g. virtio)
- nexthop: resilient next-hop groups - improve path stability on
next-hops group changes (incl. offload for mlxsw)
- ipv6: segment routing: add support for IPv4 decapsulation
- icmp: add support for RFC 8335 extended PROBE messages
- inet: use bigger hash table for IP ID generation
- tcp: deal better with delayed TX completions - make sure we don't
give up on fast TCP retransmissions only because driver is slow in
reporting that it completed transmitting the original
- tcp: reorder tcp_congestion_ops for better cache locality
- mptcp:
- add sockopt support for common TCP options
- add support for common TCP msg flags
- include multiple address ids in RM_ADDR
- add reset option support for resetting one subflow
- udp: GRO L4 improvements - improve 'forward' / 'frag_list'
co-existence with UDP tunnel GRO, allowing the first to take place
correctly even for encapsulated UDP traffic
- micro-optimize dev_gro_receive() and flow dissection, avoid
retpoline overhead on VLAN and TEB GRO
- use less memory for sysctls, add a new sysctl type, to allow using
u8 instead of "int" and "long" and shrink networking sysctls
- veth: allow GRO without XDP - this allows aggregating UDP packets
before handing them off to routing, bridge, OvS, etc.
- allow specifing ifindex when device is moved to another namespace
- netfilter:
- nft_socket: add support for cgroupsv2
- nftables: add catch-all set element - special element used to
define a default action in case normal lookup missed
- use net_generic infra in many modules to avoid allocating
per-ns memory unnecessarily
- xps: improve the xps handling to avoid potential out-of-bound
accesses and use-after-free when XPS change race with other
re-configuration under traffic
- add a config knob to turn off per-cpu netdev refcnt to catch
underflows in testing
Device APIs:
- add WWAN subsystem to organize the WWAN interfaces better and
hopefully start driving towards more unified and vendor-
independent APIs
- ethtool:
- add interface for reading IEEE MIB stats (incl. mlx5 and bnxt
support)
- allow network drivers to dump arbitrary SFP EEPROM data,
current offset+length API was a poor fit for modern SFP which
define EEPROM in terms of pages (incl. mlx5 support)
- act_police, flow_offload: add support for packet-per-second
policing (incl. offload for nfp)
- psample: add additional metadata attributes like transit delay for
packets sampled from switch HW (and corresponding egress and
policy-based sampling in the mlxsw driver)
- dsa: improve support for sandwiched LAGs with bridge and DSA
- netfilter:
- flowtable: use direct xmit in topologies with IP forwarding,
bridging, vlans etc.
- nftables: counter hardware offload support
- Bluetooth:
- improvements for firmware download w/ Intel devices
- add support for reading AOSP vendor capabilities
- add support for virtio transport driver
- mac80211:
- allow concurrent monitor iface and ethernet rx decap
- set priority and queue mapping for injected frames
- phy: add support for Clause-45 PHY Loopback
- pci/iov: add sysfs MSI-X vector assignment interface to distribute
MSI-X resources to VFs (incl. mlx5 support)
New hardware/drivers:
- dsa: mv88e6xxx: add support for Marvell mv88e6393x - 11-port
Ethernet switch with 8x 1-Gigabit Ethernet and 3x 10-Gigabit
interfaces.
- dsa: support for legacy Broadcom tags used on BCM5325, BCM5365 and
BCM63xx switches
- Microchip KSZ8863 and KSZ8873; 3x 10/100Mbps Ethernet switches
- ath11k: support for QCN9074 a 802.11ax device
- Bluetooth: Broadcom BCM4330 and BMC4334
- phy: Marvell 88X2222 transceiver support
- mdio: add BCM6368 MDIO mux bus controller
- r8152: support RTL8153 and RTL8156 (USB Ethernet) chips
- mana: driver for Microsoft Azure Network Adapter (MANA)
- Actions Semi Owl Ethernet MAC
- can: driver for ETAS ES58X CAN/USB interfaces
Pure driver changes:
- add XDP support to: enetc, igc, stmmac
- add AF_XDP support to: stmmac
- virtio:
- page_to_skb() use build_skb when there's sufficient tailroom
(21% improvement for 1000B UDP frames)
- support XDP even without dedicated Tx queues - share the Tx
queues with the stack when necessary
- mlx5:
- flow rules: add support for mirroring with conntrack, matching
on ICMP, GTP, flex filters and more
- support packet sampling with flow offloads
- persist uplink representor netdev across eswitch mode changes
- allow coexistence of CQE compression and HW time-stamping
- add ethtool extended link error state reporting
- ice, iavf: support flow filters, UDP Segmentation Offload
- dpaa2-switch:
- move the driver out of staging
- add spanning tree (STP) support
- add rx copybreak support
- add tc flower hardware offload on ingress traffic
- ionic:
- implement Rx page reuse
- support HW PTP time-stamping
- octeon: support TC hardware offloads - flower matching on ingress
and egress ratelimitting.
- stmmac:
- add RX frame steering based on VLAN priority in tc flower
- support frame preemption (FPE)
- intel: add cross time-stamping freq difference adjustment
- ocelot:
- support forwarding of MRP frames in HW
- support multiple bridges
- support PTP Sync one-step timestamping
- dsa: mv88e6xxx, dpaa2-switch: offload bridge port flags like
learning, flooding etc.
- ipa: add IPA v4.5, v4.9 and v4.11 support (Qualcomm SDX55, SM8350,
SC7280 SoCs)
- mt7601u: enable TDLS support
- mt76:
- add support for 802.3 rx frames (mt7915/mt7615)
- mt7915 flash pre-calibration support
- mt7921/mt7663 runtime power management fixes"
* tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2451 commits)
net: selftest: fix build issue if INET is disabled
net: netrom: nr_in: Remove redundant assignment to ns
net: tun: Remove redundant assignment to ret
net: phy: marvell: add downshift support for M88E1240
net: dsa: ksz: Make reg_mib_cnt a u8 as it never exceeds 255
net/sched: act_ct: Remove redundant ct get and check
icmp: standardize naming of RFC 8335 PROBE constants
bpf, selftests: Update array map tests for per-cpu batched ops
bpf: Add batched ops support for percpu array
bpf: Implement formatted output helpers with bstr_printf
seq_file: Add a seq_bprintf function
sfc: adjust efx->xdp_tx_queue_count with the real number of initialized queues
net:nfc:digital: Fix a double free in digital_tg_recv_dep_req
net: fix a concurrency bug in l2tp_tunnel_register()
net/smc: Remove redundant assignment to rc
mpls: Remove redundant assignment to err
llc2: Remove redundant assignment to rc
net/tls: Remove redundant initialization of record
rds: Remove redundant assignment to nr_sig
dt-bindings: net: mdio-gpio: add compatible for microchip,mdio-smi0
...
|
|
In case ethernet driver is enabled and INET is disabled, selftest will
fail to build.
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Fixes: 3e1e58d64c3d ("net: add generic selftest support")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210428130947.29649-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Variable ns is set to 'skb->data[17]' but this value is never read as
it is overwritten or not used later on, hence it is a redundant
assignment and can be removed.
Cleans up the following clang-analyzer warning:
net/netrom/nr_in.c:156:2: warning: Value stored to 'ns' is never read
[clang-analyzer-deadcode.DeadStores].
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Link: https://lore.kernel.org/r/1619603885-115604-1-git-send-email-jiapeng.chong@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The assignment is not being used and redundant.
The check for null is redundant as nf_conntrack_put() also
checks this.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Link: https://lore.kernel.org/r/20210428060532.3330974-1-roid@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The current definitions of constants for PROBE, currently defined only
in the net-next kernel branch, are inconsistent, with
some beginning with ICMP and others with simply EXT. This patch
attempts to standardize the naming conventions of the constants for
PROBE before their release into a stable Kernel, and to update the
relevant definitions in net/ipv4/icmp.c.
Similarly, the definitions for the code field (previously
ICMP_EXT_MAL_QUERY, etc) use the same prefixes as the type field. This
patch adds _CODE_ to the prefix to clarify the distinction of these
constants.
Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com>
Acked-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20210427153635.2591-1-andreas.a.roeseler@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
- Clean up SCHED_DEBUG: move the decades old mess of sysctl, procfs and
debugfs interfaces to a unified debugfs interface.
- Signals: Allow caching one sigqueue object per task, to improve
performance & latencies.
- Improve newidle_balance() irq-off latencies on systems with a large
number of CPU cgroups.
- Improve energy-aware scheduling
- Improve the PELT metrics for certain workloads
- Reintroduce select_idle_smt() to improve load-balancing locality -
but without the previous regressions
- Add 'scheduler latency debugging': warn after long periods of pending
need_resched. This is an opt-in feature that requires the enabling of
the LATENCY_WARN scheduler feature, or the use of the
resched_latency_warn_ms=xx boot parameter.
- CPU hotplug fixes for HP-rollback, and for the 'fail' interface. Fix
remaining balance_push() vs. hotplug holes/races
- PSI fixes, plus allow /proc/pressure/ files to be written by
CAP_SYS_RESOURCE tasks as well
- Fix/improve various load-balancing corner cases vs. capacity margins
- Fix sched topology on systems with NUMA diameter of 3 or above
- Fix PF_KTHREAD vs to_kthread() race
- Minor rseq optimizations
- Misc cleanups, optimizations, fixes and smaller updates
* tag 'sched-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (61 commits)
cpumask/hotplug: Fix cpu_dying() state tracking
kthread: Fix PF_KTHREAD vs to_kthread() race
sched/debug: Fix cgroup_path[] serialization
sched,psi: Handle potential task count underflow bugs more gracefully
sched: Warn on long periods of pending need_resched
sched/fair: Move update_nohz_stats() to the CONFIG_NO_HZ_COMMON block to simplify the code & fix an unused function warning
sched/debug: Rename the sched_debug parameter to sched_verbose
sched,fair: Alternative sched_slice()
sched: Move /proc/sched_debug to debugfs
sched,debug: Convert sysctl sched_domains to debugfs
debugfs: Implement debugfs_create_str()
sched,preempt: Move preempt_dynamic to debug.c
sched: Move SCHED_DEBUG sysctl to debugfs
sched: Don't make LATENCYTOP select SCHED_DEBUG
sched: Remove sched_schedstats sysctl out from under SCHED_DEBUG
sched/numa: Allow runtime enabling/disabling of NUMA balance without SCHED_DEBUG
sched: Use cpu_dying() to fix balance_push vs hotplug-rollback
cpumask: Introduce DYING mask
cpumask: Make cpu_{online,possible,present,active}() inline
rseq: Optimise rseq_get_rseq_cs() and clear_rseq_cs()
...
|
|
It seems like Fedora 34 ends up enabling a few new gcc warnings, notably
"-Wstringop-overread" and "-Warray-parameter".
Both of them cause what seem to be valid warnings in the kernel, where
we have array size mismatches in function arguments (that are no longer
just silently converted to a pointer to element, but actually checked).
This fixes most of the trivial ones, by making the function declaration
match the function definition, and in the case of intel_pm.c, removing
the over-specified array size from the argument declaration.
At least one 'stringop-overread' warning remains in the i915 driver, but
that one doesn't have the same obvious trivial fix, and may or may not
actually be indicative of a bug.
[ It was a mistake to upgrade one of my machines to Fedora 34 while
being busy with the merge window, but if this is the extent of the
compiler upgrade problems, things are better than usual - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
In digital_tg_recv_dep_req, it calls nfc_tm_data_received(..,resp).
If nfc_tm_data_received() failed, the callee will free the resp via
kfree_skb() and return error. But in the exit branch, the resp
will be freed again.
My patch sets resp to NULL if nfc_tm_data_received() failed, to
avoid the double free.
Fixes: 1c7a4c24fbfd9 ("NFC Digital: Add target NFC-DEP support")
Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next:
1) Add support for the catch-all set element. This special element
can be used to define a default action to be applied in case that
the set lookup returns no matching element.
2) Fix incorrect #ifdef dependencies in the nftables cgroupsv2
support, from Arnd Bergmann.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
l2tp_tunnel_register() registers a tunnel without fully
initializing its attribute. This can allow another kernel thread
running l2tp_xmit_core() to access the uninitialized data and
then cause a kernel NULL pointer dereference error, as shown below.
Thread 1 Thread 2
//l2tp_tunnel_register()
list_add_rcu(&tunnel->list, &pn->l2tp_tunnel_list);
//pppol2tp_connect()
tunnel = l2tp_tunnel_get(sock_net(sk), info.tunnel_id);
// Fetch the new tunnel
...
//l2tp_xmit_core()
struct sock *sk = tunnel->sock;
...
bh_lock_sock(sk);
//Null pointer error happens
tunnel->sock = sk;
Fix this bug by initializing tunnel->sock before adding the
tunnel into l2tp_tunnel_list.
Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Sishuai Gong <sishuai@purdue.edu>
Reported-by: Sishuai Gong <sishuai@purdue.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Variable rc is set to zero but this value is never read as it is
overwritten with a new value later on, hence it is a redundant
assignment and can be removed.
Cleans up the following clang-analyzer warning:
net/smc/af_smc.c:1079:3: warning: Value stored to 'rc' is never read
[clang-analyzer-deadcode.DeadStores].
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Variable err is set to -ENOMEM but this value is never read as it is
overwritten with a new value later on, hence it is a redundant
assignment and can be removed.
Cleans up the following clang-analyzer warning:
net/mpls/af_mpls.c:1022:2: warning: Value stored to 'err' is never read
[clang-analyzer-deadcode.DeadStores].
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Variable rc is set to zero but this value is never read as it is
overwritten with a new value later on, hence it is a redundant
assignment and can be removed.
Cleans up the following clang-analyzer warning:
net/llc/llc_station.c:86:2: warning: Value stored to 'rc' is never read
[clang-analyzer-deadcode.DeadStores].
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
record is being initialized to ctx->open_record but this is never
read as record is overwritten later on. Remove the redundant
initialization.
Cleans up the following clang-analyzer warning:
net/tls/tls_device.c:421:26: warning: Value stored to 'record' during
its initialization is never read [clang-analyzer-deadcode.DeadStores].
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Variable nr_sig is being assigned a value however the assignment is
never read, so this redundant assignment can be removed.
Cleans up the following clang-analyzer warning:
net/rds/ib_send.c:297:2: warning: Value stored to 'nr_sig' is never read
[clang-analyzer-deadcode.DeadStores].
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Although HWTSTAMP_TX_ONESTEP_SYNC existed in ioctl for hardware timestamp
configuration, the PTP Sync one-step timestamping had never been supported.
This patch is to truely support it.
- ocelot_port_txtstamp_request()
This function handles tx timestamp request by storing
ptp_cmd(tx timestamp type) in OCELOT_SKB_CB(skb)->ptp_cmd,
and additionally for two-step timestamp storing ts_id in
OCELOT_SKB_CB(clone)->ptp_cmd.
- ocelot_ptp_rew_op()
During xmit, this function is called to get rew_op (rewriter option) by
checking skb->cb for tx timestamp request, and configure to transmitting.
Non-onestep-Sync packet with one-step timestamp request falls back to use
two-step timestamp.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Free skb->cb usage in core driver and let device drivers decide to
use or not. The reason having a DSA_SKB_CB(skb)->clone was because
dsa_skb_tx_timestamp() which may set the clone pointer was called
before p->xmit() which would use the clone if any, and the device
driver has no way to initialize the clone pointer.
This patch just put memset(skb->cb, 0, sizeof(skb->cb)) at beginning
of dsa_slave_xmit(). Some new features in the future, like one-step
timestamp may need more bytes of skb->cb to use in
dsa_skb_tx_timestamp(), and p->xmit().
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It was a waste to clone skb directly in dsa_skb_tx_timestamp().
For one-step timestamping, a clone was not needed. For any failure of
port_txtstamp (this may usually happen), the skb clone had to be freed.
So this patch moves skb cloning for tx timestamp out of dsa core, and
let drivers clone skb in port_txtstamp if they really need.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Tested-by: Kurt Kanzenbach <kurt@linutronix.de>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Move ptp_classify_raw out of dsa core driver for handling tx
timestamp request. Let device drivers do this if they want.
Not all drivers want to limit tx timestamping for only PTP
packet.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Tested-by: Kurt Kanzenbach <kurt@linutronix.de>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Check tx timestamp request in core driver at very beginning of
dsa_skb_tx_timestamp(), so that most skbs not requiring tx
timestamp just return. And drop such checking in device drivers.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Tested-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Variable offset is being assigned a value from a calculation
however the variable is never read, so this redundant variable
can be removed.
Cleans up the following clang-analyzer warning:
net/rxrpc/rxkad.c:579:2: warning: Value stored to 'offset' is never read
[clang-analyzer-deadcode.DeadStores].
net/rxrpc/rxkad.c:485:2: warning: Value stored to 'offset' is never read
[clang-analyzer-deadcode.DeadStores].
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The IPv6 Multicast Router Advertisements parsing has the following two
issues:
For one thing, ICMPv6 MRD Advertisements are smaller than ICMPv6 MLD
messages (ICMPv6 MRD Adv.: 8 bytes vs. ICMPv6 MLDv1/2: >= 24 bytes,
assuming MLDv2 Reports with at least one multicast address entry).
When ipv6_mc_check_mld_msg() tries to parse an Multicast Router
Advertisement its MLD length check will fail - and it will wrongly
return -EINVAL, even if we have a valid MRD Advertisement. With the
returned -EINVAL the bridge code will assume a broken packet and will
wrongly discard it, potentially leading to multicast packet loss towards
multicast routers.
The second issue is the MRD header parsing in
br_ip6_multicast_mrd_rcv(): It wrongly checks for an ICMPv6 header
immediately after the IPv6 header (IPv6 next header type). However
according to RFC4286, section 2 all MRD messages contain a Router Alert
option (just like MLD). So instead there is an IPv6 Hop-by-Hop option
for the Router Alert between the IPv6 and ICMPv6 header, again leading
to the bridge wrongly discarding Multicast Router Advertisements.
To fix these two issues, introduce a new return value -ENODATA to
ipv6_mc_check_mld() to indicate a valid ICMPv6 packet with a hop-by-hop
option which is not an MLD but potentially an MRD packet. This also
simplifies further parsing in the bridge code, as ipv6_mc_check_mld()
already fully checks the ICMPv6 header and hop-by-hop option.
These issues were found and fixed with the help of the mrdisc tool
(https://github.com/troglobit/mrdisc).
Fixes: 4b3087c7e37f ("bridge: Snoop Multicast Router Advertisements")
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux updates from Paul Moore:
- Add support for measuring the SELinux state and policy capabilities
using IMA.
- A handful of SELinux/NFS patches to compare the SELinux state of one
mount with a set of mount options. Olga goes into more detail in the
patch descriptions, but this is important as it allows more
flexibility when using NFS and SELinux context mounts.
- Properly differentiate between the subjective and objective LSM
credentials; including support for the SELinux and Smack. My clumsy
attempt at a proper fix for AppArmor didn't quite pass muster so John
is working on a proper AppArmor patch, in the meantime this set of
patches shouldn't change the behavior of AppArmor in any way. This
change explains the bulk of the diffstat beyond security/.
- Fix a problem where we were not properly terminating the permission
list for two SELinux object classes.
* tag 'selinux-pr-20210426' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: add proper NULL termination to the secclass_map permissions
smack: differentiate between subjective and objective task credentials
selinux: clarify task subjective and objective credentials
lsm: separate security_task_getsecid() into subjective and objective variants
nfs: account for selinux security context when deciding to share superblock
nfs: remove unneeded null check in nfs_fill_super()
lsm,selinux: add new hook to compare new mount to an existing mount
selinux: fix misspellings using codespell tool
selinux: fix misspellings using codespell tool
selinux: measure state and policy capabilities
selinux: Allow context mounts for unpriviliged overlayfs
|
|
In some configurations, the sock_cgroup_ptr() function is not available:
net/netfilter/nft_socket.c: In function 'nft_sock_get_eval_cgroupv2':
net/netfilter/nft_socket.c:47:16: error: implicit declaration of function 'sock_cgroup_ptr'; did you mean 'obj_cgroup_put'? [-Werror=implicit-function-declaration]
47 | cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
| ^~~~~~~~~~~~~~~
| obj_cgroup_put
net/netfilter/nft_socket.c:47:14: error: assignment to 'struct cgroup *' from 'int' makes pointer from integer without a cast [-Werror=int-conversion]
47 | cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
| ^
Change the caller to match the same #ifdef check, only calling it
when the function is defined.
Fixes: e0bb96db96f8 ("netfilter: nft_socket: add support for cgroupsv2")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The variable is only used in an #ifdef, causing a harmless warning:
net/netfilter/nft_socket.c: In function 'nft_socket_init':
net/netfilter/nft_socket.c:137:27: error: unused variable 'level' [-Werror=unused-variable]
137 | unsigned int len, level;
| ^~~~~
Move it into the same #ifdef block.
Fixes: e0bb96db96f8 ("netfilter: nft_socket: add support for cgroupsv2")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull AFS updates from David Howells:
"Use the new netfs lib.
Begin the process of overhauling the use of the fscache API by AFS and
the introduction of support for features such as Transparent Huge
Pages (THPs).
- Add some support for THPs, including using core VM helper functions
to find details of pages.
- Use the ITER_XARRAY I/O iterator to mediate access to the pagecache
as this handles THPs and doesn't require allocation of large bvec
arrays.
- Delegate address_space read/pre-write I/O methods for AFS to the
netfs helper library. A method is provided to the library that
allows it to issue a read against the server.
This includes a change in use for PG_fscache (it now indicates a
DIO write in progress from the marked page), so a number of waits
need to be deployed for it.
- Split the core AFS writeback function to make it easier to modify
in future patches to handle writing to the cache. [This might
feasibly make more sense moved out into my fscache-iter branch].
I've tested these with "xfstests -g quick" against an AFS volume
(xfstests needs patching to make it work). With this, AFS without a
cache passes all expected xfstests; with a cache, there's an extra
failure, but that's also there before these patches. Fixing that
probably requires a greater overhaul (as can be found on my
fscache-iter branch, but that's for a later time).
Thanks should go to Marc Dionne and Jeff Altman of AuriStor for
exercising the patches in their test farm also"
Link: https://lore.kernel.org/lkml/3785063.1619482429@warthog.procyon.org.uk/
* tag 'afs-netfs-lib-20210426' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
afs: Use the netfs_write_begin() helper
afs: Use new netfs lib read helper API
afs: Use the fs operation ops to handle FetchData completion
afs: Prepare for use of THPs
afs: Extract writeback extension into its own function
afs: Wait on PG_fscache before modifying/releasing a page
afs: Use ITER_XARRAY for writing
afs: Set up the iov_iter before calling afs_extract_data()
afs: Log remote unmarshalling errors
afs: Don't truncate iter during data fetch
afs: Move key to afs_read struct
afs: Print the operation debug_id when logging an unexpected data version
afs: Pass page into dirty region helpers to provide THP size
afs: Disable use of the fscache I/O routines
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull CFI on arm64 support from Kees Cook:
"This builds on last cycle's LTO work, and allows the arm64 kernels to
be built with Clang's Control Flow Integrity feature. This feature has
happily lived in Android kernels for almost 3 years[1], so I'm excited
to have it ready for upstream.
The wide diffstat is mainly due to the treewide fixing of mismatched
list_sort prototypes. Other things in core kernel are to address
various CFI corner cases. The largest code portion is the CFI runtime
implementation itself (which will be shared by all architectures
implementing support for CFI). The arm64 pieces are Acked by arm64
maintainers rather than coming through the arm64 tree since carrying
this tree over there was going to be awkward.
CFI support for x86 is still under development, but is pretty close.
There are a handful of corner cases on x86 that need some improvements
to Clang and objtool, but otherwise works well.
Summary:
- Clean up list_sort prototypes (Sami Tolvanen)
- Introduce CONFIG_CFI_CLANG for arm64 (Sami Tolvanen)"
* tag 'cfi-v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
arm64: allow CONFIG_CFI_CLANG to be selected
KVM: arm64: Disable CFI for nVHE
arm64: ftrace: use function_nocfi for ftrace_call
arm64: add __nocfi to __apply_alternatives
arm64: add __nocfi to functions that jump to a physical address
arm64: use function_nocfi with __pa_symbol
arm64: implement function_nocfi
psci: use function_nocfi for cpu_resume
lkdtm: use function_nocfi
treewide: Change list_sort to use const pointers
bpf: disable CFI in dispatcher functions
kallsyms: strip ThinLTO hashes from static functions
kthread: use WARN_ON_FUNCTION_MISMATCH
workqueue: use WARN_ON_FUNCTION_MISMATCH
module: ensure __cfi_check alignment
mm: add generic function_nocfi macro
cfi: add __cficanonical
add support for Clang CFI
|
|
This patch extends the set infrastructure to add a special catch-all set
element. If the lookup fails to find an element (or range) in the set,
then the catch-all element is selected. Users can specify a mapping,
expression(s) and timeout to be attached to the catch-all element.
This patch adds a catchall list to the set, this list might contain more
than one single catch-all element (e.g. in case that the catch-all
element is removed and a new one is added in the same transaction).
However, most of the time, there will be either one element or no
elements at all in this list.
The catch-all element is identified via NFT_SET_ELEM_CATCHALL flag and
such special element has no NFTA_SET_ELEM_KEY attribute. There is a new
nft_set_elem_catchall object that stores a reference to the dummy
catch-all element (catchall->elem) whose layout is the same of the set
element type to reuse the existing set element codebase.
The set size does not apply to the catch-all element, users can define a
catch-all element even if the set is full.
The check for valid set element flags hava been updates to report
EOPNOTSUPP in case userspace requests flags that are not supported when
using new userspace nftables and old kernel.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
When binding sets to rule, validate set element data according to
set definition. This patch adds a helper function to be reused by
the catch-all set element support.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This patch adds nft_set_flush() which prepares for the catch-all
element support.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This patch adds nft_check_loops() to reuse it in the new catch-all
element codebase.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Rename:
- nft_set_elem_activate() to nft_set_elem_data_activate().
- nft_set_elem_deactivate() to nft_set_elem_data_deactivate().
To prepare for updates in the set element infrastructure to add support
for the special catch-all element.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Harald Arnesen reported [1] a deadlock at reboot time, and after
he captured a stack trace a picture developed of what's going on:
The distribution he's using is using iwd (not wpa_supplicant) to
manage wireless. iwd will usually use the "socket owner" option
when it creates new interfaces, so that they're automatically
destroyed when it quits (unexpectedly or otherwise). This is also
done by wpa_supplicant, but it doesn't do it for the normal one,
only for additional ones, which is different with iwd.
Anyway, during shutdown, iwd quits while the netdev is still UP,
i.e. IFF_UP is set. This causes the stack trace that Linus so
nicely transcribed from the pictures:
cfg80211_destroy_iface_wk() takes wiphy_lock
-> cfg80211_destroy_ifaces()
->ieee80211_del_iface
->ieeee80211_if_remove
->cfg80211_unregister_wdev
->unregister_netdevice_queue
->dev_close_many
->__dev_close_many
->raw_notifier_call_chain
->cfg80211_netdev_notifier_call
and that last call tries to take wiphy_lock again.
In commit a05829a7222e ("cfg80211: avoid holding the RTNL when
calling the driver") I had taken into account the possibility of
recursing from cfg80211 into cfg80211_netdev_notifier_call() via
the network stack, but only for NETDEV_UNREGISTER, not for what
happens here, NETDEV_GOING_DOWN and NETDEV_DOWN notifications.
Additionally, while this worked still back in commit 78f22b6a3a92
("cfg80211: allow userspace to take ownership of interfaces"), it
missed another corner case: unregistering a netdev will cause
dev_close() to be called, and thus stop wireless operations (e.g.
disconnecting), but there are some types of virtual interfaces in
wifi that don't have a netdev - for that we need an additional
call to cfg80211_leave().
So, to fix this mess, change cfg80211_destroy_ifaces() to not
require the wiphy_lock(), but instead make it acquire it, but
only after it has actually closed all the netdevs on the list,
and then call cfg80211_leave() as well before removing them
from the driver, to fix the second issue. The locking change in
this requires modifying the nl80211 call to not get the wiphy
lock passed in, but acquire it by itself after flushing any
potentially pending destruction requests.
[1] https://lore.kernel.org/r/09464e67-f3de-ac09-28a3-e27b7914ee7d@skogtun.org
Cc: stable@vger.kernel.org # 5.12
Reported-by: Harald Arnesen <harald@skogtun.org>
Fixes: 776a39b8196d ("cfg80211: call cfg80211_destroy_ifaces() with wiphy lock held")
Fixes: 78f22b6a3a92 ("cfg80211: allow userspace to take ownership of interfaces")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Tested-by: Harald Arnesen <harald@skogtun.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Pull nfsd updates from Chuck Lever:
"Highlights:
- Update NFSv2 and NFSv3 XDR encoding functions
- Add batch Receive posting to the server's RPC/RDMA transport (take 2)
- Reduce page allocator traffic in svcrdma"
* tag 'nfsd-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (70 commits)
NFSD: Use DEFINE_SPINLOCK() for spinlock
sunrpc: Remove unused function ip_map_lookup
NFSv4.2: fix copy stateid copying for the async copy
UAPI: nfsfh.h: Replace one-element array with flexible-array member
svcrdma: Clean up dto_q critical section in svc_rdma_recvfrom()
svcrdma: Remove svc_rdma_recv_ctxt::rc_pages and ::rc_arg
svcrdma: Remove sc_read_complete_q
svcrdma: Single-stage RDMA Read
SUNRPC: Move svc_xprt_received() call sites
SUNRPC: Export svc_xprt_received()
svcrdma: Retain the page backing rq_res.head[0].iov_base
svcrdma: Remove unused sc_pages field
svcrdma: Normalize Send page handling
svcrdma: Add a "deferred close" helper
svcrdma: Maintain a Receive water mark
svcrdma: Use svc_rdma_refresh_recvs() in wc_receive
svcrdma: Add a batch Receive posting mechanism
svcrdma: Remove stale comment for svc_rdma_wc_receive()
svcrdma: Provide an explanatory comment in CMA event handler
svcrdma: RPCDBG_FACILITY is no longer used
...
|
|
while testing re-assembly/re-fragmentation using act_ct, it's possible to
observe a crash like the following one:
KASAN: maybe wild-memory-access in range [0x0001000000000448-0x000100000000044f]
CPU: 50 PID: 0 Comm: swapper/50 Tainted: G S 5.12.0-rc7+ #424
Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.4.3 01/17/2017
RIP: 0010:inet_frag_rbtree_purge+0x50/0xc0
Code: 00 fc ff df 48 89 c3 31 ed 48 89 df e8 a9 7a 38 ff 4c 89 fe 48 89 df 49 89 c6 e8 5b 3a 38 ff 48 8d 7b 40 48 89 f8 48 c1 e8 03 <42> 80 3c 20 00 75 59 48 8d bb d0 00 00 00 4c 8b 6b 40 48 89 f8 48
RSP: 0018:ffff888c31449db8 EFLAGS: 00010203
RAX: 0000200000000089 RBX: 000100000000040e RCX: ffffffff989eb960
RDX: 0000000000000140 RSI: ffffffff97cfb977 RDI: 000100000000044e
RBP: 0000000000000900 R08: 0000000000000000 R09: ffffed1186289350
R10: 0000000000000003 R11: ffffed1186289350 R12: dffffc0000000000
R13: 000100000000040e R14: 0000000000000000 R15: ffff888155e02160
FS: 0000000000000000(0000) GS:ffff888c31440000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005600cb70a5b8 CR3: 0000000a2c014005 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<IRQ>
inet_frag_destroy+0xa9/0x150
call_timer_fn+0x2d/0x180
run_timer_softirq+0x4fe/0xe70
__do_softirq+0x197/0x5a0
irq_exit_rcu+0x1de/0x200
sysvec_apic_timer_interrupt+0x6b/0x80
</IRQ>
when act_ct temporarily stores an IP fragment, restoring the skb qdisc cb
results in putting random data in FRAG_CB(), and this causes those "wild"
memory accesses later, when the rbtree is purged. Never overwrite the skb
cb in case tcf_ct_handle_fragments() returns -EINPROGRESS.
Fixes: ae372cb1750f ("net/sched: act_ct: fix restore the qdisc_skb_cb after defrag")
Fixes: 7baf2429a1a9 ("net/sched: cls_flower add CT_FLAGS_INVALID flag support")
Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next
Marc Kleine-Budde says:
====================
pull-request: can-next 2021-04-26
this is a pull request of 4 patches for net-next/master.
the first two patches are from Colin Ian King and target the
etas_es58x driver, they add a missing NULL pointer check and fix some
typos.
The next two patches are by Erik Flodin. The first one updates the CAN
documentation regarding filtering, the other one fixes the header
alignment in CAN related proc output on 64 bit systems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next:
1) The various ip(6)table_foo incarnations are updated to expect
that the table is passed as 'void *priv' argument that netfilter core
passes to the hook functions. This reduces the struct net size by 2
cachelines on x86_64. From Florian Westphal.
2) Add cgroupsv2 support for nftables.
3) Fix bridge log family merge into nf_log_syslog: Missing
unregistration from netns exit path, from Phil Sutter.
4) Add nft_pernet() helper to access nftables pernet area.
5) Add struct nfnl_info to reduce nfnetlink callback footprint and
to facilite future updates. Consolidate nfnetlink callbacks.
6) Add CONFIG_NETFILTER_XTABLES_COMPAT Kconfig knob, also from Florian.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty and serial driver updates from Greg KH:
"Here is the big set of tty and serial driver updates for 5.13-rc1.
Actually busy this release, with a number of cleanups happening:
- much needed core tty cleanups by Jiri Slaby
- removal of unused and orphaned old-style serial drivers. If anyone
shows up with this hardware, it is trivial to restore these but we
really do not think they are in use anymore.
- fixes and cleanups from Johan Hovold on a number of termios setting
corner cases that loads of drivers got wrong as well as removing
unneeded code due to tty core changes from long ago that were never
propagated out to the drivers
- loads of platform-specific serial port driver updates and fixes
- coding style cleanups and other small fixes and updates all over
the tty/serial tree.
All of these have been in linux-next for a while now with no reported
issues"
* tag 'tty-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (186 commits)
serial: extend compile-test coverage
serial: stm32: add FIFO threshold configuration
dt-bindings: serial: 8250: update TX FIFO trigger level
dt-bindings: serial: stm32: override FIFO threshold properties
dt-bindings: serial: add RX and TX FIFO properties
serial: xilinx_uartps: drop low-latency workaround
serial: vt8500: drop low-latency workaround
serial: timbuart: drop low-latency workaround
serial: sunsu: drop low-latency workaround
serial: sifive: drop low-latency workaround
serial: txx9: drop low-latency workaround
serial: sa1100: drop low-latency workaround
serial: rp2: drop low-latency workaround
serial: rda: drop low-latency workaround
serial: owl: drop low-latency workaround
serial: msm_serial: drop low-latency workaround
serial: mpc52xx_uart: drop low-latency workaround
serial: meson: drop low-latency workaround
serial: mcf: drop low-latency workaround
serial: lpc32xx_hs: drop low-latency workaround
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here is the big set of various smaller driver subsystem updates for
5.13-rc1.
Major bits in here are:
- habanalabs driver updates
- hwtracing driver updates
- interconnect driver updates
- mhi driver updates
- extcon driver updates
- fpga driver updates
- new binder features added
- nvmem driver updates
- phy driver updates
- soundwire driver updates
- smaller misc and char driver fixes and updates.
- bluetooth driver bugfix that maintainer wanted to go through this
tree.
All of these have been in linux-next with no reported issues"
* tag 'char-misc-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (330 commits)
bluetooth: eliminate the potential race condition when removing the HCI controller
coresight: etm-perf: Fix define build issue when built as module
phy: Revert "phy: ti: j721e-wiz: add missing of_node_put"
phy: ti: j721e-wiz: Add missing include linux/slab.h
phy: phy-twl4030-usb: Fix possible use-after-free in twl4030_usb_remove()
stm class: Use correct UUID APIs
intel_th: pci: Add Alder Lake-M support
intel_th: pci: Add Rocket Lake CPU support
intel_th: Consistency and off-by-one fix
intel_th: Constify attribute_group structs
intel_th: Constify all drvdata references
stm class: Remove an unused function
habanalabs/gaudi: Fix uninitialized return code rc when read size is zero
greybus: es2: fix kernel-doc warnings
mei: me: add Alder Lake P device id.
dw-xdata-pcie: Update outdated info and improve text format
dw-xdata-pcie: Fix documentation build warns
fbdev: zero-fill colormap in fbcmap.c
firmware: qcom-scm: Fix QCOM_SCM configuration
speakup: i18n: Switch to kmemdup_nul() in spk_msg_set()
...
|
|
The compat layer needs to parse untrusted input (the ruleset)
to translate it to a 64bit compatible format.
We had a number of bugs in this department in the past, so allow users
to turn this feature off.
Add CONFIG_NETFILTER_XTABLES_COMPAT kconfig knob and make it default to y
to keep existing behaviour.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Add enum nfnl_callback_type to identify the callback type to provide one
single callback.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Update batch callbacks to use the nfnl_info structure. Rename one
clashing info variable to expr_info.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Update rcu callbacks to use the nfnl_info structure.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"API:
- crypto_destroy_tfm now ignores errors as well as NULL pointers
Algorithms:
- Add explicit curve IDs in ECDH algorithm names
- Add NIST P384 curve parameters
- Add ECDSA
Drivers:
- Add support for Green Sardine in ccp
- Add ecdh/curve25519 to hisilicon/hpre
- Add support for AM64 in sa2ul"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (184 commits)
fsverity: relax build time dependency on CRYPTO_SHA256
fscrypt: relax Kconfig dependencies for crypto API algorithms
crypto: camellia - drop duplicate "depends on CRYPTO"
crypto: s5p-sss - consistently use local 'dev' variable in probe()
crypto: s5p-sss - remove unneeded local variable initialization
crypto: s5p-sss - simplify getting of_device_id match data
ccp: ccp - add support for Green Sardine
crypto: ccp - Make ccp_dev_suspend and ccp_dev_resume void functions
crypto: octeontx2 - add support for OcteonTX2 98xx CPT block.
crypto: chelsio/chcr - Remove useless MODULE_VERSION
crypto: ux500/cryp - Remove duplicate argument
crypto: chelsio - remove unused function
crypto: sa2ul - Add support for AM64
crypto: sa2ul - Support for per channel coherency
dt-bindings: crypto: ti,sa2ul: Add new compatible for AM64
crypto: hisilicon - enable new error types for QM
crypto: hisilicon - add new error type for SEC
crypto: hisilicon - support new error types for ZIP
crypto: hisilicon - dynamic configuration 'err_info'
crypto: doc - fix kernel-doc notation in chacha.c and af_alg.c
...
|
|
Add a new structure to reduce callback footprint and to facilite
extensions of the nfnetlink callback interface in the future.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Consolidate call to net_generic(net, nf_tables_net_id) in this
wrapper function.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2021-04-21
devlink external port attribute for SF (Sub-Function) port flavour
This adds the support to instantiate Sub-Functions on external hosts
E.g when Eswitch manager is enabled on the ARM SmarNic SoC CPU, users
are now able to spawn new Sub-Functions on the Host server CPU.
Parav Pandit Says:
==================
This series introduces and uses external attribute for the SF port to
indicate that a SF port belongs to an external controller.
This is needed to generate unique phys_port_name when PF and SF numbers
are overlapping between local and external controllers.
For example two controllers 0 and 1, both of these controller have a SF.
having PF number 0, SF number 77. Here, phys_port_name has duplicate
entry which doesn't have controller number in it.
Hence, add controller number optionally when a SF port is for an
external controller. This extension is similar to existing PF and VF
eswitch ports of the external controller.
When a SF is for external controller an example view of external SF
port and config sequence:
On eswitch system:
$ devlink dev eswitch set pci/0033:01:00.0 mode switchdev
$ devlink port show
pci/0033:01:00.0/196607: type eth netdev enP51p1s0f0np0 flavour physical port 0 splittable false
pci/0033:01:00.0/131072: type eth netdev eth0 flavour pcipf controller 1 pfnum 0 external true splittable false
function:
hw_addr 00:00:00:00:00:00
$ devlink port add pci/0033:01:00.0 flavour pcisf pfnum 0 sfnum 77 controller 1
pci/0033:01:00.0/163840: type eth netdev eth1 flavour pcisf controller 1 pfnum 0 sfnum 77 splittable false
function:
hw_addr 00:00:00:00:00:00 state inactive opstate detached
phys_port_name construction:
$ cat /sys/class/net/eth1/phys_port_name
c1pf0sf77
Patch summary:
First 3 patches prepares the eswitch to handle vports in more generic
way using xarray to lookup vport from its unique vport number.
Patch-1 returns maximum eswitch ports only when eswitch is enabled
Patch-2 prepares eswitch to return eswitch max ports from a struct
Patch-3 uses xarray for vport and representor lookup
Patch-4 considers SF for an additioanl range of SF vports
Patch-5 relies on SF hw table to check SF support
Patch-6 extends SF devlink port attribute for external flag
Patch-7 stores the per controller SF allocation attributes
Patch-8 uses SF function id for filtering events
Patch-9 uses helper for allocation and free
Patch-10 splits hw table into per controller table and generic one
Patch-11 extends sf table for additional range
==================
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Without this, a stale pointer remains in pernet loggers after module
unload causing a kernel oops during dereference. Easily reproduced by:
| # modprobe nf_log_syslog
| # rmmod nf_log_syslog
| # cat /proc/net/netfilter/nf_log
Fixes: 77ccee96a6742 ("netfilter: nf_log_bridge: merge with nf_log_syslog")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Same patch as the ip_tables one: removal of all accesses to ip6_tables
xt_table pointers. After this patch the struct net xt_table anchors
can be removed.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|