Age | Commit message (Collapse) | Author | Files | Lines |
|
Currently, other paths calling drv_sta_state() hold the mutex
and therefore drivers can assume that, and look at links with
that protection. Fix that for the reconfig path as well; to
do it more easily use ieee80211_reconfig_stations() for the
AP/AP_VLAN station reconfig as well.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Add an MLD address attribute to BSS entries that the interface
is currently associated with to help userspace figure out what's
going on.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Refactor the code to apply QoS settings to the driver so
we can call it on link switch.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
When we stop a not-yet-started scan, we erroneously call
into the driver, causing a sequence of sw_scan_start()
followed by sw_scan_complete() twice. This will cause a
warning in hwsim with next in line commit that validates
the address passed to wmediumd/virtio. Fix this by doing
the calls only if we were actually scanning.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Right now, we assign the link address only after we add
the link to the driver, which is quite obviously wrong.
It happens to work in many cases because it gets updated
immediately, and then link_conf updates may update it,
but it's clearly not really right.
Set the link address during ieee80211_mgd_setup_link()
so it's set before telling the driver about the link.
Fixes: 81151ce462e5 ("wifi: mac80211: support MLO authentication/association with one link")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
We probably should've done that originally, we already have
about 300 lines of code there, and will add more. Move all
the link code we wrote to a new file.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
We don't need the sdata argument, and it doesn't make any
sense for a direct conversion from one value to another,
so just remove the argument
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Return value from rdev_set_mcast_rate() directly instead of
taking this in another redundant variable.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Jinpeng Cui <cui.jinpeng2@zte.com.cn>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Adds support in mac80211 for NL80211_EXT_FEATURE_POWERED_ADDR_CHANGE.
The motivation behind this functionality is to fix limitations of
address randomization on frequencies which are disallowed in world
roaming.
The way things work now, if a client wants to randomize their address
per-connection it must power down the device, change the MAC, and
power back up. Here lies a problem since powering down the device
may result in frequencies being disabled (until the regdom is set).
If the desired BSS is on one such frequency the client is unable to
connect once the phy is powered again.
For mac80211 based devices changing the MAC while powered is possible
but currently disallowed (-EBUSY). This patch adds some logic to
allow a MAC change while powered by removing the interface, changing
the MAC, and adding it again. mac80211 will advertise support for
this feature so userspace can determine the best course of action e.g.
disallow address randomization on certain frequencies if not
supported.
There are certain limitations put on this which simplify the logic:
- No active connection
- No offchannel work, including scanning.
Signed-off-by: James Prestwood <prestwoj@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
We haven't tried this yet, and it's not very likely to
work well right now, so for now disable 4-addr use on
interfaces that are MLDs.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20220902161143.f2e4cc2efaa1.I5924e8fb44a2d098b676f5711b36bbc1b1bd68e2@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Do not allow VLANs to be added to AP interfaces that are
MLDs, this isn't going to work because the link structs
aren't propagated to the VLAN interfaces yet.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20220902161144.8c88531146e9.If2ef9a3b138d4f16ed2fda91c852da156bdf5e4d@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
We sometimes copy all the addresses from the 802.11 header
for the AAD, which may cause complaints from fortify checks.
Use struct_group() to avoid the compiler warnings/errors.
Change-Id: Ic3ea389105e7813b22095b295079eecdabde5045
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
If we hit an authentication or association timeout, we only
release the chanctx for the deflink, and the other link(s)
are released later by ieee80211_vif_set_links(), but we're
not locking this correctly.
Fix the locking here while releasing the channels and links.
Change-Id: I9e08c1a5434592bdc75253c1abfa6c788f9f39b1
Fixes: 81151ce462e5 ("wifi: mac80211: support MLO authentication/association with one link")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
In the prep_channel error case we didn't release the deflink
channel leaving it to be left around. Fix that.
Change-Id: If0dfd748125ec46a31fc6045a480dc28e03723d2
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The rx data link pointer isn't set from the RX aggregation timer,
resulting in a later warning. Fix that by setting it to the first
valid link for now, with a FIXME to worry about statistics later,
it's not very important since it's just the timeout case.
Reported-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/498d714c-76be-9d04-26db-a1206878de5e@redhat.com
Fixes: 56057da4569b ("wifi: mac80211: rx: track link in RX data")
Signed-off-by: Mukesh Sisodiya <mukesh.sisodiya@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
If htb_init() fails, qdisc_create() invokes htb_destroy() to clear
resources. Therefore, remove redundant resource cleanup in htb_init().
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If fq_codel_init() fails, qdisc_create() invokes fq_codel_destroy() to
clear resources. Therefore, remove redundant resource cleanup in
fq_codel_init().
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
tcf_qevent_parse_block_index() never returns a zero block_index.
Therefore, it is unnecessary to check the value of block_index in
tcf_qevent_init().
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Link: https://lore.kernel.org/r/20220901011617.14105-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch changes bpf_getsockopt(SOL_IPV6) to reuse
do_ipv6_getsockopt(). It removes the duplicated code from
bpf_getsockopt(SOL_IPV6).
This also makes bpf_getsockopt(SOL_IPV6) supporting the same
set of optnames as in bpf_setsockopt(SOL_IPV6). In particular,
this adds IPV6_AUTOFLOWLABEL support to bpf_getsockopt(SOL_IPV6).
ipv6 could be compiled as a module. Like how other code solved it
with stubs in ipv6_stubs.h, this patch adds the do_ipv6_getsockopt
to the ipv6_bpf_stub.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002931.2896218-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch changes bpf_getsockopt(SOL_IP) to reuse
do_ip_getsockopt() and remove the duplicated code.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002925.2895416-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch changes bpf_getsockopt(SOL_TCP) to reuse
do_tcp_getsockopt(). It removes the duplicated code from
bpf_getsockopt(SOL_TCP).
Before this patch, there were some optnames available to
bpf_setsockopt(SOL_TCP) but missing in bpf_getsockopt(SOL_TCP).
For example, TCP_NODELAY, TCP_MAXSEG, TCP_KEEPIDLE, TCP_KEEPINTVL,
and a few more. It surprises users from time to time. This patch
automatically closes this gap without duplicating more code.
bpf_getsockopt(TCP_SAVED_SYN) does not free the saved_syn,
so it stays in sol_tcp_sockopt().
For string name value like TCP_CONGESTION, bpf expects it
is always null terminated, so sol_tcp_sockopt() decrements
optlen by one before calling do_tcp_getsockopt() and
the 'if (optlen < saved_optlen) memset(..,0,..);'
in __bpf_getsockopt() will always do a null termination.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002918.2894511-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch changes bpf_getsockopt(SOL_SOCKET) to reuse
sk_getsockopt(). It removes all duplicated code from
bpf_getsockopt(SOL_SOCKET).
Before this patch, there were some optnames available to
bpf_setsockopt(SOL_SOCKET) but missing in bpf_getsockopt(SOL_SOCKET).
It surprises users from time to time. For example, SO_REUSEADDR,
SO_KEEPALIVE, SO_RCVLOWAT, and SO_MAX_PACING_RATE. This patch
automatically closes this gap without duplicating more code.
The only exception is SO_BINDTODEVICE because it needs to acquire a
blocking lock. Thus, SO_BINDTODEVICE is not supported.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002912.2894040-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch moves the "#ifdef CONFIG_XXX" check into the "if/else"
statement itself. The change is done for the bpf_getsockopt()
function only. It will make the latter patches easier to follow
without the surrounding ifdef macro.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002906.2893572-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Similar to the earlier patch that changes sk_getsockopt() to
use sockopt_{lock,release}_sock() such that it can avoid taking the
lock when called from bpf. This patch also changes do_ipv6_getsockopt()
to use sockopt_{lock,release}_sock() such that bpf_getsockopt(SOL_IPV6)
can reuse do_ipv6_getsockopt().
Although bpf_getsockopt(SOL_IPV6) currently does not support optname
that requires lock_sock(), using sockopt_{lock,release}_sock()
consistently across *_getsockopt() will make future optname addition
harder to miss the sockopt_{lock,release}_sock() usage. eg. when
adding new optname that requires a lock and the new optname is
needed in bpf_getsockopt() also.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002859.2893064-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Similar to the earlier patch that changes sk_getsockopt() to
take the sockptr_t argument . This patch also changes
do_ipv6_getsockopt() to take the sockptr_t argument such that
a latter patch can make bpf_getsockopt(SOL_IPV6) to reuse
do_ipv6_getsockopt().
Note on the change in ip6_mc_msfget(). This function is to
return an array of sockaddr_storage in optval. This function
is shared between ipv6_get_msfilter() and compat_ipv6_get_msfilter().
However, the sockaddr_storage is stored at different offset of the
optval because of the difference between group_filter and
compat_group_filter. Thus, a new 'ss_offset' argument is
added to ip6_mc_msfget().
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002853.2892532-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Pass the len to the compat_ipv6_get_msfilter() instead of
compat_ipv6_get_msfilter() getting it again from optlen.
Its counter part ipv6_get_msfilter() is also taking the
len from do_ipv6_getsockopt().
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002846.2892091-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The 'unsigned int flags' argument is always 0, so it can be removed.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002840.2891763-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Similar to the earlier commit that changed sk_setsockopt() to
use sockopt_{lock,release}_sock() such that it can avoid taking
lock when called from bpf. This patch also changes do_ip_getsockopt()
to use sockopt_{lock,release}_sock() such that a latter patch can
make bpf_getsockopt(SOL_IP) to reuse do_ip_getsockopt().
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002834.2891514-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Similar to the earlier patch that changes sk_getsockopt() to
take the sockptr_t argument. This patch also changes
do_ip_getsockopt() to take the sockptr_t argument such that
a latter patch can make bpf_getsockopt(SOL_IP) to reuse
do_ip_getsockopt().
Note on the change in ip_mc_gsfget(). This function is to
return an array of sockaddr_storage in optval. This function
is shared between ip_get_mcast_msfilter() and
compat_ip_get_mcast_msfilter(). However, the sockaddr_storage
is stored at different offset of the optval because of
the difference between group_filter and compat_group_filter.
Thus, a new 'ss_offset' argument is added to ip_mc_gsfget().
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002828.2890585-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Similar to the earlier commit that changed sk_setsockopt() to
use sockopt_{lock,release}_sock() such that it can avoid taking
lock when called from bpf. This patch also changes do_tcp_getsockopt()
to use sockopt_{lock,release}_sock() such that a latter patch can
make bpf_getsockopt(SOL_TCP) to reuse do_tcp_getsockopt().
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002821.2889765-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Similar to the earlier patch that changes sk_getsockopt() to
take the sockptr_t argument . This patch also changes
do_tcp_getsockopt() to take the sockptr_t argument such that
a latter patch can make bpf_getsockopt(SOL_TCP) to reuse
do_tcp_getsockopt().
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002815.2889332-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Similar to the earlier commit that changed sk_setsockopt() to
use sockopt_{lock,release}_sock() such that it can avoid taking
lock when called from bpf. This patch also changes sk_getsockopt()
to use sockopt_{lock,release}_sock() such that a latter patch can
make bpf_getsockopt(SOL_SOCKET) to reuse sk_getsockopt().
Only sk_get_filter() requires this change and it is used by
the optname SO_GET_FILTER.
The '.getname' implementations in sock->ops->getname() is not
changed also since bpf does not always have the sk->sk_socket
pointer and cannot support SO_PEERNAME.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002809.2888981-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch changes sk_getsockopt() to take the sockptr_t argument
such that it can be used by bpf_getsockopt(SOL_SOCKET) in a
latter patch.
security_socket_getpeersec_stream() is not changed. It stays
with the __user ptr (optval.user and optlen.user) to avoid changes
to other security hooks. bpf_getsockopt(SOL_SOCKET) also does not
support SO_PEERSEC.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002802.2888419-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
A latter patch refactors bpf_getsockopt(SOL_SOCKET) with the
sock_getsockopt() to avoid code duplication and code
drift between the two duplicates.
The current sock_getsockopt() takes sock ptr as the argument.
The very first thing of this function is to get back the sk ptr
by 'sk = sock->sk'.
bpf_getsockopt() could be called when the sk does not have
the sock ptr created. Meaning sk->sk_socket is NULL. For example,
when a passive tcp connection has just been established but has yet
been accept()-ed. Thus, it cannot use the sock_getsockopt(sk->sk_socket)
or else it will pass a NULL ptr.
This patch moves all sock_getsockopt implementation to the newly
added sk_getsockopt(). The new sk_getsockopt() takes a sk ptr
and immediately gets the sock ptr by 'sock = sk->sk_socket'
The existing sock_getsockopt(sock) is changed to call
sk_getsockopt(sock->sk). All existing callers have both sock->sk
and sk->sk_socket pointer.
The latter patch will make bpf_getsockopt(SOL_SOCKET) call
sk_getsockopt(sk) directly. The bpf_getsockopt(SOL_SOCKET) does
not use the optnames that require sk->sk_socket, so it will
be safe.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002756.2887884-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Florian Westphal says:
====================
netfilter: bug fixes for net
1. Fix IP address check in irc DCC conntrack helper, this should check
the opposite direction rather than the destination address of the
packets' direction, from David Leadbeater.
2. bridge netfilter needs to drop dst references, from Harsh Modi.
This was fine back in the day the code was originally written,
but nowadays various tunnels can pre-set metadata dsts on packets.
3. Remove nf_conntrack_helper sysctl and the modparam toggle, users
need to explicitily assign the helpers to use via nftables or
iptables. Conntrack helpers, by design, may be used to add dynamic
port redirections to internal machines, so its necessary to restrict
which hosts/peers are allowed to use them.
It was discovered that improper checking in the irc DCC helper makes
it possible to trigger the 'please do dynamic port forward'
from outside by embedding a 'DCC' in a PING request; if the client
echos that back a expectation/port forward gets added.
The auto-assign-for-everything mechanism has been in "please don't do this"
territory since 2012. From Pablo.
4. Fix a memory leak in the netdev hook error unwind path, also from Pablo.
* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_conntrack_irc: Fix forged IP logic
netfilter: nf_tables: clean up hook list when offload flags check fails
netfilter: br_netfilter: Drop dst references before setting.
netfilter: remove nf_conntrack_helper sysctl and modparam toggles
====================
Link: https://lore.kernel.org/r/20220901071238.3044-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
hci_read_buffer_size_sync shall not use HCI_OP_LE_READ_BUFFER_SIZE_V2
sinze that is LE specific, instead it is hci_le_read_buffer_size_sync
version that shall use it.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216382
Fixes: 26afbd826ee3 ("Bluetooth: Add initial implementation of CIS connections")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
Existing 'bpf_skb_get_tunnel_key' extracts various tunnel parameters
(id, ttl, tos, local and remote) but does not expose ip_tunnel_info's
tun_flags to the BPF program.
It makes sense to expose tun_flags to the BPF program.
Assume for example multiple GRE tunnels maintained on a single GRE
interface in collect_md mode. The program expects origins to initiate
over GRE, however different origins use different GRE characteristics
(e.g. some prefer to use GRE checksum, some do not; some pass a GRE key,
some do not, etc..).
A BPF program getting tun_flags can therefore remember the relevant
flags (e.g. TUNNEL_CSUM, TUNNEL_SEQ...) for each initiating remote. In
the reply path, the program can use 'bpf_skb_set_tunnel_key' in order
to correctly reply to the remote, using similar characteristics, based
on the stored tunnel flags.
Introduce BPF_F_TUNINFO_FLAGS flag for bpf_skb_get_tunnel_key. If
specified, 'bpf_tunnel_key->tunnel_flags' is set with the tun_flags.
Decided to use the existing unused 'tunnel_ext' as the storage for the
'tunnel_flags' in order to avoid changing bpf_tunnel_key's layout.
Also, the following has been considered during the design:
1. Convert the "interesting" internal TUNNEL_xxx flags back to BPF_F_yyy
and place into the new 'tunnel_flags' field. This has 2 drawbacks:
- The BPF_F_yyy flags are from *set_tunnel_key* enumeration space,
e.g. BPF_F_ZERO_CSUM_TX. It is awkward that it is "returned" into
tunnel_flags from a *get_tunnel_key* call.
- Not all "interesting" TUNNEL_xxx flags can be mapped to existing
BPF_F_yyy flags, and it doesn't make sense to create new BPF_F_yyy
flags just for purposes of the returned tunnel_flags.
2. Place key.tun_flags into 'tunnel_flags' but mask them, keeping only
"interesting" flags. That's ok, but the drawback is that what's
"interesting" for my usecase might be limiting for other usecases.
Therefore I decided to expose what's in key.tun_flags *as is*, which seems
most flexible. The BPF user can just choose to ignore bits he's not
interested in. The TUNNEL_xxx are also UAPI, so no harm exposing them
back in the get_tunnel_key call.
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220831144010.174110-1-shmulik.ladkani@gmail.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc fixes
Here are some fixes for AF_RXRPC:
(1) Fix the handling of ICMP/ICMP6 packets. This is a problem due to
rxrpc being switched to acting as a UDP tunnel, thereby allowing it to
steal the packets before they go through the UDP Rx queue. UDP
tunnels can't get ICMP/ICMP6 packets, however. This patch adds an
additional encap hook so that they can.
(2) Fix the encryption routines in rxkad to handle packets that have more
than three parts correctly. The problem is that ->nr_frags doesn't
count the initial fragment, so the sglist ends up too short.
(3) Fix a problem with destruction of the local endpoint potentially
getting repeated.
(4) Fix the calculation of the time at which to resend.
jiffies_to_usecs() gives microseconds, not nanoseconds.
(5) Fix AFS to work out when callback promises and locks expire based on
the time an op was issued rather than the time the first reply packet
arrives. We don't know how long the server took between calculating
the expiry interval and transmitting the reply.
(6) Given (5), rxrpc_get_reply_time() is no longer used, so remove it.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We got a recent syzbot report [1] showing a possible misuse
of pfmemalloc page status in TCP zerocopy paths.
Indeed, for pages coming from user space or other layers,
using page_is_pfmemalloc() is moot, and possibly could give
false positives.
There has been attempts to make page_is_pfmemalloc() more robust,
but not using it in the first place in this context is probably better,
removing cpu cycles.
Note to stable teams :
You need to backport 84ce071e38a6 ("net: introduce
__skb_fill_page_desc_noacc") as a prereq.
Race is more probable after commit c07aea3ef4d4
("mm: add a signature in struct page") because page_is_pfmemalloc()
is now using low order bit from page->lru.next, which can change
more often than page->index.
Low order bit should never be set for lru.next (when used as an anchor
in LRU list), so KCSAN report is mostly a false positive.
Backporting to older kernel versions seems not necessary.
[1]
BUG: KCSAN: data-race in lru_add_fn / tcp_build_frag
write to 0xffffea0004a1d2c8 of 8 bytes by task 18600 on cpu 0:
__list_add include/linux/list.h:73 [inline]
list_add include/linux/list.h:88 [inline]
lruvec_add_folio include/linux/mm_inline.h:105 [inline]
lru_add_fn+0x440/0x520 mm/swap.c:228
folio_batch_move_lru+0x1e1/0x2a0 mm/swap.c:246
folio_batch_add_and_move mm/swap.c:263 [inline]
folio_add_lru+0xf1/0x140 mm/swap.c:490
filemap_add_folio+0xf8/0x150 mm/filemap.c:948
__filemap_get_folio+0x510/0x6d0 mm/filemap.c:1981
pagecache_get_page+0x26/0x190 mm/folio-compat.c:104
grab_cache_page_write_begin+0x2a/0x30 mm/folio-compat.c:116
ext4_da_write_begin+0x2dd/0x5f0 fs/ext4/inode.c:2988
generic_perform_write+0x1d4/0x3f0 mm/filemap.c:3738
ext4_buffered_write_iter+0x235/0x3e0 fs/ext4/file.c:270
ext4_file_write_iter+0x2e3/0x1210
call_write_iter include/linux/fs.h:2187 [inline]
new_sync_write fs/read_write.c:491 [inline]
vfs_write+0x468/0x760 fs/read_write.c:578
ksys_write+0xe8/0x1a0 fs/read_write.c:631
__do_sys_write fs/read_write.c:643 [inline]
__se_sys_write fs/read_write.c:640 [inline]
__x64_sys_write+0x3e/0x50 fs/read_write.c:640
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
read to 0xffffea0004a1d2c8 of 8 bytes by task 18611 on cpu 1:
page_is_pfmemalloc include/linux/mm.h:1740 [inline]
__skb_fill_page_desc include/linux/skbuff.h:2422 [inline]
skb_fill_page_desc include/linux/skbuff.h:2443 [inline]
tcp_build_frag+0x613/0xb20 net/ipv4/tcp.c:1018
do_tcp_sendpages+0x3e8/0xaf0 net/ipv4/tcp.c:1075
tcp_sendpage_locked net/ipv4/tcp.c:1140 [inline]
tcp_sendpage+0x89/0xb0 net/ipv4/tcp.c:1150
inet_sendpage+0x7f/0xc0 net/ipv4/af_inet.c:833
kernel_sendpage+0x184/0x300 net/socket.c:3561
sock_sendpage+0x5a/0x70 net/socket.c:1054
pipe_to_sendpage+0x128/0x160 fs/splice.c:361
splice_from_pipe_feed fs/splice.c:415 [inline]
__splice_from_pipe+0x222/0x4d0 fs/splice.c:559
splice_from_pipe fs/splice.c:594 [inline]
generic_splice_sendpage+0x89/0xc0 fs/splice.c:743
do_splice_from fs/splice.c:764 [inline]
direct_splice_actor+0x80/0xa0 fs/splice.c:931
splice_direct_to_actor+0x305/0x620 fs/splice.c:886
do_splice_direct+0xfb/0x180 fs/splice.c:974
do_sendfile+0x3bf/0x910 fs/read_write.c:1249
__do_sys_sendfile64 fs/read_write.c:1317 [inline]
__se_sys_sendfile64 fs/read_write.c:1303 [inline]
__x64_sys_sendfile64+0x10c/0x150 fs/read_write.c:1303
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
value changed: 0x0000000000000000 -> 0xffffea0004a1d288
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 18611 Comm: syz-executor.4 Not tainted 6.0.0-rc2-syzkaller-00248-ge022620b5d05-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/22/2022
Fixes: c07aea3ef4d4 ("mm: add a signature in struct page")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is a shift wrapping bug in this code so anything thing above
31 will return false.
Fixes: 35c55c9877f8 ("tipc: add neighbor monitoring framework")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The sch_sfb enqueue() routine assumes the skb is still alive after it has
been enqueued into a child qdisc, using the data in the skb cb field in the
increment_qlen() routine after enqueue. However, the skb may in fact have
been freed, causing a use-after-free in this case. In particular, this
happens if sch_cake is used as a child of sfb, and the GSO splitting mode
of CAKE is enabled (in which case the skb will be split into segments and
the original skb freed).
Fix this by copying the sfb cb data to the stack before enqueueing the skb,
and using this stack copy in increment_qlen() instead of the skb pointer
itself.
Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-18231
Fixes: e13e02a3c68d ("net_sched: SFB flow scheduler")
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This is a followup of commit c67b85558ff2 ("ipv6: tcp: send consistent
autoflowlabel in TIME_WAIT state"), but for SYN_RECV state.
In some cases, TCP sends a challenge ACK on behalf of a SYN_RECV request.
WHen this happens, we want to use the flow label that was used when
the prior SYNACK packet was sent, instead of another one.
After his patch, following packetdrill passes:
0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0
+.2 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
+0 > (flowlabel 0x11) S. 0:0(0) ack 1 <...>
// Test if a challenge ack is properly sent (same flowlabel than prior SYNACK)
+.01 < . 4000000000:4000000000(0) ack 1 win 320
+0 > (flowlabel 0x11) . 1:1(0) ack 1
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220831203729.458000-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The open code is defined as a new helper function(netif_oper_up) on netdev.h,
the code is dev->operstate == IF_OPER_UP || dev->operstate == IF_OPER_UNKNOWN.
Thus, replace the open code to netif_oper_up. This patch doesn't change logic.
Signed-off-by: Juhee Kang <claudiajkang@gmail.com>
Link: https://lore.kernel.org/r/20220831125845.1333-1-claudiajkang@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
etf_enable_offload() is only called when q->offload is false in
etf_init(). So remove true check in etf_enable_offload().
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Link: https://lore.kernel.org/r/20220831092919.146149-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
tools/testing/selftests/net/.gitignore
sort the net-next version and use it
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We need to make sure that the req->rq_private_buf is completely up to
date before we make req->rq_reply_bytes_recvd visible to the
call_decode() routine in order to avoid triggering the WARN_ON().
Reported-by: Benjamin Coddington <bcodding@redhat.com>
Fixes: 72691a269f0b ("SUNRPC: Don't reuse bvec on retransmission of the request")
Tested-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
The kfree invoked by gred_destroy_vq checks whether the input parameter
is empty. Therefore, gred_destroy() doesn't need to check table->tab.
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Link: https://lore.kernel.org/r/20220831041452.33026-1-shaozhengchao@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Remove rxrpc_get_reply_time() as that is no longer used now that the call
issue time is used instead of the reply time.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Fix the calculation of the resend age to add a microsecond value as
microseconds, not nanoseconds.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
If the local processor work item for the rxrpc local endpoint gets requeued
by an event (such as an incoming packet) between it getting scheduled for
destruction and the UDP socket being closed, the rxrpc_local_destroyer()
function can get run twice. The second time it can hang because it can end
up waiting for cleanup events that will never happen.
Signed-off-by: David Howells <dhowells@redhat.com>
|