summaryrefslogtreecommitdiffstats
path: root/net/mptcp
AgeCommit message (Collapse)AuthorFilesLines
2022-05-05mptcp: add more offered MIBs counterPaolo Abeni3-1/+13
Track the exceptional handling of MPTCP-level offered window with a few more counters for observability. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-05mptcp: never shrink offered windowPaolo Abeni3-12/+50
As per RFC, the offered MPTCP-level window should never shrink. While we currently track the right edge, we don't enforce the above constraint on the wire. Additionally, concurrent xmit on different subflows can end-up in erroneous right edge update. Address the above explicitly updating the announced window and protecting the update with an additional atomic operation (sic) Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-05tcp: allow MPTCP to update the announced windowPaolo Abeni1-1/+1
The MPTCP RFC requires that the MPTCP-level receive window's right edge never moves backward. Currently the MPTCP code enforces such constraint while tracking the right edge, but it does not reflects it on the wire, as MPTCP lacks a suitable hook to update accordingly the TCP header. This change modifies the existing mptcp_write_options() hook, providing the current packet's TCP header to the MPTCP protocol, so that the next patch could implement the above mentioned constraint. No functional changes intended. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-05mptcp: add mib for xmit window sharingPaolo Abeni3-1/+5
Bump a counter for counter when snd_wnd is shared among subflow, for observability's sake. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-05mptcp: really share subflow snd_wndPaolo Abeni1-9/+15
As per RFC, mptcp subflows use a "shared" snd_wnd: the effective window is the maximum among the current values received on all subflows. Without such feature a data transfer using multiple subflows could block. Window sharing is currently implemented in the RX side: __tcp_select_window uses the mptcp-level receive buffer to compute the announced window. That is not enough: the TCP stack will stick to the window size received on the given subflow; we need to propagate the msk window value on each subflow at xmit time. Change the packet scheduler to ignore the subflow level window and use instead the msk level one Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-05net: switch to netif_napi_add_tx()Jakub Kicinski1-2/+2
Switch net callers to the new API not requiring the NAPI_POLL_WEIGHT argument. Acked-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org> Acked-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Acked-by: Alexandra Winter <wintera@linux.ibm.com> Link: https://lore.kernel.org/r/20220504163725.550782-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-04mptcp: netlink: allow userspace-driven subflow establishmentFlorian Westphal3-0/+211
This allows userspace to tell kernel to add a new subflow to an existing mptcp connection. Userspace provides the token to identify the mptcp-level connection that needs a change in active subflows and the local and remote addresses of the new or the to-be-removed subflow. MPTCP_PM_CMD_SUBFLOW_CREATE requires the following parameters: { token, { loc_id, family, loc_addr4 | loc_addr6 }, { family, rem_addr4 | rem_addr6, rem_port } MPTCP_PM_CMD_SUBFLOW_DESTROY requires the following parameters: { token, { family, loc_addr4 | loc_addr6, loc_port }, { family, rem_addr4 | rem_addr6, rem_port } Acked-by: Paolo Abeni <pabeni@redhat.com> Co-developed-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-04mptcp: netlink: Add MPTCP_PM_CMD_REMOVEKishen Maloor3-2/+74
This change adds a MPTCP netlink command for issuing a REMOVE_ADDR signal for an address over the chosen MPTCP connection from a userspace path manager. The command requires the following parameters: {token, loc_id}. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-04mptcp: netlink: Add MPTCP_PM_CMD_ANNOUNCEKishen Maloor3-5/+79
This change adds a MPTCP netlink interface for issuing ADD_ADDR advertisements over the chosen MPTCP connection from a userspace path manager. The command requires the following parameters: { token, { loc_id, family, daddr4 | daddr6 [, dport] } [, if_idx], flags[signal] }. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-04mptcp: netlink: split mptcp_pm_parse_addr into two functionsFlorian Westphal1-21/+39
Next patch will need to parse MPTCP_PM_ATTR_ADDR attributes and fill an mptcp_addr_info structure from a different genl command callback. To avoid copy-paste, split the existing function to a helper that does the common part and then call the helper from the (renamed)mptcp_pm_parse_entry function. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-04mptcp: read attributes of addr entries managed by userspace PMsKishen Maloor4-3/+62
This change introduces a parallel path in the kernel for retrieving the local id, flags, if_index for an addr entry in the context of an MPTCP connection that's being managed by a userspace PM. The userspace and in-kernel PM modes deviate in their procedures for obtaining this information. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-04mptcp: handle local addrs announced by userspace PMsKishen Maloor6-26/+113
This change adds an internal function to store/retrieve local addrs announced by userspace PM implementations to/from its kernel context. The function addresses the requirements of three scenarios: 1) ADD_ADDR announcements (which require that a local id be provided), 2) retrieving the local id associated with an address, and also where one may need to be assigned, and 3) reissuance of ADD_ADDRs when there's a successful match of addr/id. The list of all stored local addr entries is held under the MPTCP sock structure. Memory for these entries is allocated from the sock option buffer, so the list of addrs is bounded by optmem_max. The list if not released via REMOVE_ADDR signals is ultimately freed when the sock is destructed. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-03mptcp: allow ADD_ADDR reissuance by userspace PMsKishen Maloor1-2/+10
This change allows userspace PM implementations to reissue ADD_ADDR announcements (if necessary) based on their chosen policy. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03mptcp: expose server_side attribute in MPTCP netlink eventsKishen Maloor1-0/+3
This change records the 'server_side' attribute of MPTCP_EVENT_CREATED and MPTCP_EVENT_ESTABLISHED events to inform their recipient about the Client/Server role of the running MPTCP application. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/246 Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03mptcp: establish subflows from either end of connectionKishen Maloor3-7/+8
This change updates internal logic to permit subflows to be established from either the client or server ends of MPTCP connections. This symmetry and added flexibility may be harnessed by PM implementations running on either end in creating new subflows. The essence of this change lies in not relying on the "server_side" flag (which continues to be available if needed). Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03mptcp: reflect remote port (not 0) in ANNOUNCED eventsKishen Maloor4-8/+15
Per RFC 8684, if no port is specified in an ADD_ADDR message, MPTCP SHOULD attempt to connect to the specified address on the same port as the port that is already in use by the subflow on which the ADD_ADDR signal was sent. To facilitate that, this change reflects the specific remote port in use by that subflow in MPTCP_EVENT_ANNOUNCED events. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03mptcp: store remote id from MP_JOIN SYN/ACK in local ctxKishen Maloor1-0/+1
This change reads the addr id assigned to the remote endpoint of a subflow from the MP_JOIN SYN/ACK message and stores it in the related subflow context. The remote id was not being captured prior to this change, and will now provide a consistent view of remote endpoints and their ids as seen through netlink events. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03mptcp: bypass in-kernel PM restrictions for non-kernel PMsKishen Maloor4-3/+32
Current limits on the # of addresses/subflows must apply only to in-kernel PM managed sockets. Thus this change removes such restrictions on connections overseen by non-kernel (e.g. userspace) PMs. This change also ensures that the kernel does not record stats inside struct mptcp_pm_data updated along kernel code paths when exercised via non-kernel PMs. Additionally, address announcements are acknolwedged and subflow requests are honored only when it's deemed that a userspace path manager is active at the time. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29mptcp: Add a per-namespace sysctl to set the default path manager typeMat Martineau3-11/+45
The new net.mptcp.pm_type sysctl determines which path manager will be used by each newly-created MPTCP socket. v2: Handle builds without CONFIG_SYSCTL v3: Clarify logic for type-specific PM init (Geliang Tang and Paolo Abeni) Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29mptcp: Make kernel path manager check for userspace-managed socketsMat Martineau1-6/+12
Userspace-managed sockets should not have their subflows or advertisements changed by the kernel path manager. v3: Use helper function for PM mode (Paolo Abeni) Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29mptcp: Bypass kernel PM when userspace PM is enabledMat Martineau2-1/+6
When a MPTCP connection is managed by a userspace PM, bypass the kernel PM for incoming advertisements and subflow events. Netlink events are still sent to userspace. v2: Remove unneeded check in mptcp_pm_rm_addr_received() (Kishen Maloor) v3: Add and use helper function for PM mode (Paolo Abeni) Acked-by: Paolo Abeni <pabeni@redhat.com> Co-developed-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29mptcp: Add a member to mptcp_pm_data to track kernel vs userspace modeMat Martineau2-0/+13
When adding support for netlink path management commands, the kernel needs to know whether paths are being controlled by the in-kernel path manager or a userspace PM. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29mptcp: Remove redundant assignments in path manager initMat Martineau3-27/+18
A few members of the mptcp_pm_data struct were assigned to hard-coded values in mptcp_pm_data_reset(), and then immediately changed in mptcp_pm_nl_data_init(). Instead, flatten all the assignments in to mptcp_pm_data_reset(). v2: Resolve conflicts due to rename of mptcp_pm_data_reset() v4: Resolve conflict in mptcp_pm_data_reset() Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-27mptcp: reset subflow when MP_FAIL doesn't respondGeliang Tang4-0/+68
This patch adds a new msk->flags bit MPTCP_FAIL_NO_RESPONSE, then reuses sk_timer to trigger a check if we have not received a response from the peer after sending MP_FAIL. If the peer doesn't respond properly, reset the subflow. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-27mptcp: add MP_FAIL response supportGeliang Tang3-1/+12
This patch adds a new struct member mp_fail_response_expect in struct mptcp_subflow_context to support MP_FAIL response. In the single subflow with checksum error and contiguous data special case, a MP_FAIL is sent in response to another MP_FAIL. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-27mptcp: add data lock for sk timersGeliang Tang1-0/+12
mptcp_data_lock() needs to be held when manipulating the msk retransmit_timer or the sk sk_timer. This patch adds the data lock for the both timers. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-27mptcp: use mptcp_stop_timerGeliang Tang1-2/+2
Use the helper mptcp_stop_timer() instead of using sk_stop_timer() to stop icsk_retransmit_timer directly. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23mptcp: add mib for infinite map sendingGeliang Tang3-0/+3
This patch adds a new mib named MPTCP_MIB_INFINITEMAPTX, increase it when a infinite mapping has been sent out. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23mptcp: infinite mapping receivingGeliang Tang1-1/+3
This patch adds the infinite mapping receiving logic. When the infinite mapping is received, set the map_data_len of the subflow to 0. In subflow_check_data_avail(), only reset the subflow when the map_data_len of the subflow is non-zero. Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23mptcp: infinite mapping sendingGeliang Tang4-2/+41
This patch adds the infinite mapping sending logic. Add a new flag send_infinite_map in struct mptcp_subflow_context. Set it true when a single contiguous subflow is in use and the allow_infinite_fallback flag is true in mptcp_pm_mp_fail_received(). In mptcp_sendmsg_frag(), if this flag is true, call the new function mptcp_update_infinite_map() to set the infinite mapping. Add a new flag infinite_map in struct mptcp_ext, set it true in mptcp_update_infinite_map(), and check this flag in a new helper mptcp_check_infinite_map(). In mptcp_update_infinite_map(), set data_len to 0, and clear the send_infinite_map flag, then do fallback. In mptcp_established_options(), use the helper mptcp_check_infinite_map() to let the infinite mapping DSS can be sent out in the fallback mode. Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23mptcp: track and update contiguous data statusGeliang Tang3-1/+7
This patch adds a new member allow_infinite_fallback in mptcp_sock, which is initialized to 'true' when the connection begins and is set to 'false' on any retransmit or successful MP_JOIN. Only do infinite mapping fallback if there is a single subflow AND there have been no retransmissions AND there have never been any MP_JOINs. Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23mptcp: add the fallback checkGeliang Tang1-21/+24
This patch adds the fallback check in subflow_check_data_avail(). Only do the fallback when the msk hasn't fallen back yet. Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23mptcp: don't send RST for single subflowGeliang Tang1-5/+5
When a bad checksum is detected and a single subflow is in use, don't send RST + MP_FAIL, send data_ack + MP_FAIL instead. So invoke tcp_send_active_reset() only when mptcp_has_another_subflow() is true. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-12net: remove noblock parameter from recvmsg() entitiesOliver Hartkopp1-2/+2
The internal recvmsg() functions have two parameters 'flags' and 'noblock' that were merged inside skb_recv_datagram(). As a follow up patch to commit f4b41f062c42 ("net: remove noblock parameter from skb_recv_datagram()") this patch removes the separate 'noblock' parameter for recvmsg(). Analogue to the referenced patch for skb_recv_datagram() the 'flags' and 'noblock' parameters are unnecessarily split up with e.g. err = sk->sk_prot->recvmsg(sk, msg, size, flags & MSG_DONTWAIT, flags & ~MSG_DONTWAIT, &addr_len); or in err = INDIRECT_CALL_2(sk->sk_prot->recvmsg, tcp_recvmsg, udp_recvmsg, sk, msg, size, flags & MSG_DONTWAIT, flags & ~MSG_DONTWAIT, &addr_len); instead of simply using only flags all the time and check for MSG_DONTWAIT where needed (to preserve for the formerly separated no(n)block condition). Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://lore.kernel.org/r/20220411124955.154876-1-socketcan@hartkopp.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-11mptcp: listen diag dump supportFlorian Westphal1-0/+91
makes 'ss -Ml' show mptcp listen sockets. Iterate over the tcp listen sockets and pick those that have mptcp ulp info attached. mptcp_diag_get_info() is modified to prefer msk->first for mptcp sockets in listen state. This reports accurate number for recv and send queue (pending / max connection backlog counters). Sample output: ss -Mil State Recv-Q Send-Q Local Address:Port Peer Address:Port LISTEN 0 20 127.0.0.1:12000 0.0.0.0:* subflows_max:2 Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-11mptcp: remove locking in mptcp_diag_fill_infoFlorian Westphal1-6/+0
Problem is that listener iteration would call this from atomic context so this locking is not allowed. One way is to drop locks before calling the helper, but afaics the lock isn't really needed, all values are fetched via READ_ONCE(). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-11mptcp: diag: switch to context structureFlorian Westphal1-3/+11
Raw access to cb->arg[] is deprecated, use a context structure. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-11mptcp: add pm_nl_pernet helpersGeliang Tang1-17/+24
This patch adds two pm_nl_pernet related helpers, named pm_nl_get_pernet() and pm_nl_get_pernet_from_msk() to get pm_nl_pernet from 'net' or 'msk'. Use these helpers instead of using net_generic() directly. Suggested-by: Florian Westphal <fw@strlen.de> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-11mptcp: reset the packet scheduler on PRIO changePaolo Abeni1-0/+2
Similar to the previous patch, for priority changes requested by the local PM. Reported-and-suggested-by: Davide Caratti <dcaratti@redhat.com> Fixes: 067065422fcd ("mptcp: add the outgoing MP_PRIO support") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-11mptcp: reset the packet scheduler on incoming MP_PRIOPaolo Abeni3-4/+18
When an incoming MP_PRIO option changes the backup status of any subflow, we need to reset the packet scheduler status, or the next send could keep using the previously selected subflow, without taking in account the new priorities. Reported-by: Davide Caratti <dcaratti@redhat.com> Fixes: 40453a5c61f4 ("mptcp: add the incoming MP_PRIO support") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-11mptcp: optimize release_cb for the common casePaolo Abeni1-7/+9
The mptcp release callback checks several flags in atomic context, but only MPTCP_CLEAN_UNA can be up frequently. Reorganize the code to avoid multiple conditionals in the most common scenarios. Additional clarify a related comment. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-0/+1
Merge in overtime fixes, no conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18mptcp: send ADD_ADDR echo before create subflowsYonglong Li1-13/+5
In some corner cases, the peer handing an incoming ADD_ADDR option, can receive a retransmitted ADD_ADDR for the same address before the subflow creation completes. We can avoid the above issue by generating and sending the ADD_ADDR echo before starting the MPJ subflow connection. This slightly changes the behaviour of the packetdrill tests as the ADD_ADDR echo packet is sent earlier. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Yonglong Li <liyonglong@chinatelecom.cn> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Link: https://lore.kernel.org/r/20220317221444.426335-1-mathew.j.martineau@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18mptcp: Fix crash due to tcp_tsorted_anchor was initialized before release skbYonglong Li1-0/+1
Got crash when doing pressure test of mptcp: =========================================================================== dst_release: dst:ffffa06ce6e5c058 refcnt:-1 kernel tried to execute NX-protected page - exploit attempt? (uid: 0) BUG: unable to handle kernel paging request at ffffa06ce6e5c058 PGD 190a01067 P4D 190a01067 PUD 43fffb067 PMD 22e403063 PTE 8000000226e5c063 Oops: 0011 [#1] SMP PTI CPU: 7 PID: 7823 Comm: kworker/7:0 Kdump: loaded Tainted: G E Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.2.1 04/01/2014 Call Trace: ? skb_release_head_state+0x68/0x100 ? skb_release_all+0xe/0x30 ? kfree_skb+0x32/0xa0 ? mptcp_sendmsg_frag+0x57e/0x750 ? __mptcp_retrans+0x21b/0x3c0 ? __switch_to_asm+0x35/0x70 ? mptcp_worker+0x25e/0x320 ? process_one_work+0x1a7/0x360 ? worker_thread+0x30/0x390 ? create_worker+0x1a0/0x1a0 ? kthread+0x112/0x130 ? kthread_flush_work_fn+0x10/0x10 ? ret_from_fork+0x35/0x40 =========================================================================== In __mptcp_alloc_tx_skb skb was allocated and skb->tcp_tsorted_anchor will be initialized, in under memory pressure situation sk_wmem_schedule will return false and then kfree_skb. In this case skb->_skb_refdst is not null because_skb_refdst and tcp_tsorted_anchor are stored in the same mem, and kfree_skb will try to release dst and cause crash. Fixes: f70cad1085d1 ("mptcp: stop relying on tcp_tx_skb_cache") Reviewed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Yonglong Li <liyonglong@chinatelecom.cn> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Link: https://lore.kernel.org/r/20220317220953.426024-1-mathew.j.martineau@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-08mptcp: add fullmesh flag check for adding addressGeliang Tang1-0/+6
The fullmesh flag mustn't be used with the signal flag when adding an address. This patch added the necessary flags check for this case. Fixes: 73c762c1f07d ("mptcp: set fullmesh flag in pm_netlink") Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-08mptcp: strict local address ID selectionPaolo Abeni4-23/+63
The address ID selection for MPJ subflows created in response to incoming ADD_ADDR option is currently unreliable: it happens at MPJ socket creation time, when the local address could be unknown. Additionally, if the no local endpoint is available for the local address, a new dummy endpoint is created, confusing the user-land. This change refactor the code to move the address ID selection inside the rebuild_header() helper, when the local address eventually selected by the route lookup is finally known. If the address used is not mapped by any endpoint - and thus can't be advertised/removed pick the id 0 instead of allocate a new endpoint. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-08mptcp: introduce implicit endpointsPaolo Abeni1-17/+44
In some edge scenarios, an MPTCP subflows can use a local address mapped by a "implicit" endpoint created by the in-kernel path manager. Such endpoints presence can be confusing, as it's creation is hard to track and will prevent the later endpoint creation from the user-space using the same address. Define a new endpoint flag to mark implicit endpoints and allow the user-space to replace implicit them with user-provided data at endpoint creation time. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-08mptcp: more careful RM_ADDR generationPaolo Abeni1-6/+4
The in-kernel MPTCP path manager, when processing the MPTCP_PM_CMD_FLUSH_ADDR command, generates RM_ADDR events for each known local address. While that is allowed by the RFC, it makes unpredictable the exact number of RM_ADDR generated when both ends flush the PM addresses. This change restricts the RM_ADDR generation to previously explicitly announced addresses, and adjust the expected results in a bunch of related self-tests. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-08mptcp: use MPTCP_SUBFLOW_NODATAGeliang Tang1-4/+4
Set subflow->data_avail with the enum value MPTCP_SUBFLOW_NODATA, instead of using 0 directly. Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-08mptcp: add tracepoint in mptcp_sendmsg_fragGeliang Tang1-0/+1
The tracepoint in get_mapping_status() only dumped the incoming mpext fields. This patch added a new tracepoint in mptcp_sendmsg_frag() to dump the outgoing mpext too. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>