summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2020-12-10Merge branch 'add-ppp_generic-ioctls-to-bridge-channels'David S. Miller3-3/+167
Tom Parkin says: ==================== add ppp_generic ioctl(s) to bridge channels Following on from my previous RFC[1], this series adds two ioctl calls to the ppp code to implement "channel bridging". When two ppp channels are bridged, frames presented to ppp_input() on one channel are passed to the other channel's ->start_xmit function for transmission. The primary use-case for this functionality is in an L2TP Access Concentrator where PPP frames are typically presented in a PPPoE session (e.g. from a home broadband user) and are forwarded to the ISP network in a PPPoL2TP session. The two new ioctls, PPPIOCBRIDGECHAN and PPPIOCUNBRIDGECHAN form a symmetric pair. Userspace code testing and illustrating use of the ioctl calls is available in the go-l2tp[2] and l2tp-ktest[3] repositories. [1]. Previous RFC series: https://lore.kernel.org/netdev/20201106181647.16358-1-tparkin@katalix.com/ [2]. go-l2tp: a Go library for building L2TP applications on Linux systems. Support for the PPPIOCBRIDGECHAN ioctl is on a branch: https://github.com/katalix/go-l2tp/tree/tp_002_pppoe_2 [3]. l2tp-ktest: a test suite for the Linux Kernel L2TP subsystem. Support for the PPPIOCBRIDGECHAN ioctl is on a branch: https://github.com/katalix/l2tp-ktest/tree/tp_ac_pppoe_tests_2 Changelog: v4: * Fix NULL-pointer access in PPPIOCBRIDGECHAN in the case that the ID of the channel to be bridged wasn't found. * Add comment in ppp_unbridge_channels to better document the unbridge process. v3: * Use rcu_dereference_protected for accessing struct channel 'bridge' field during updates with lock 'upl' held. * Avoid race in ppp_unbridge_channels by ensuring that each channel in the bridge points to it's peer before decrementing refcounts. v2: * Add missing __rcu annotation to struct channel 'bridge' field in order to squash a sparse warning from a C=1 build * Integrate review comments from gnault@redhat.com * Have ppp_unbridge_channels return -EINVAL if the channel isn't part of a bridge: this better aligns with the return code from ppp_disconnect_channel. * Improve docs update by including information on ioctl arguments and error return codes. ==================== Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10docs: update ppp_generic.rst to document new ioctlsTom Parkin1-0/+16
Add documentation of the newly-added PPPIOCBRIDGECHAN and PPPIOCUNBRIDGECHAN ioctls. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10ppp: add PPPIOCBRIDGECHAN and PPPIOCUNBRIDGECHAN ioctlsTom Parkin2-3/+151
This new ioctl pair allows two ppp channels to be bridged together: frames arriving in one channel are transmitted in the other channel and vice versa. The practical use for this is primarily to support the L2TP Access Concentrator use-case. The end-user session is presented as a ppp channel (typically PPPoE, although it could be e.g. PPPoA, or even PPP over a serial link) and is switched into a PPPoL2TP session for transmission to the LNS. At the LNS the PPP session is terminated in the ISP's network. When a PPP channel is bridged to another it takes a reference on the other's struct ppp_file. This reference is dropped when the channels are unbridged, which can occur either explicitly on userspace calling the PPPIOCUNBRIDGECHAN ioctl, or implicitly when either channel in the bridge is unregistered. In order to implement the channel bridge, struct channel is extended with a new field, 'bridge', which points to the other struct channel making up the bridge. This pointer is RCU protected to avoid adding another lock to the data path. To guard against concurrent writes to the pointer, the existing struct channel lock 'upl' coverage is extended rather than adding a new lock. The 'upl' lock is used to protect the existing unit pointer. Since the bridge effectively replaces the unit (they're mutually exclusive for a channel) it makes coding easier to use the same lock to cover them both. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10rtnetlink: RCU-annotate both dimensions of rtnl_msg_handlersJakub Kicinski1-11/+13
We use rcu_assign_pointer to assign both the table and the entries, but the entries are not marked as __rcu. This generates sparse warnings. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10Revert "macb: support the two tx descriptors on at91rm9200"Willy Tarreau2-40/+8
This reverts commit 0a4e9ce17ba77847e5a9f87eed3c0ba46e3f82eb. The code was developed and tested on an MSC313E SoC, which seems to be half-way between the AT91RM9200 and the AT91SAM9260 in that it supports both the 2-descriptors mode and a Tx ring. It turns out that after the code was merged I could notice that the controller would sometimes lock up, and only when dealing with sustained bidirectional transfers, in which case it would report a Tx overrun condition right after having reported being ready, and will stop sending even after the status is cleared (a down/up cycle fixes it though). After adding lots of traces I couldn't spot a sequence pattern allowing to predict that this situation would happen. The chip comes with no documentation and other bits are often reported with no conclusive pattern either. It is possible that my change is wrong just like it is possible that the controller on the chip is bogus or at least unpredictable based on existing docs from other chips. I do not have an RM9200 at hand to test at the moment and a few tests run on a more recent 9G20 indicate that this code path cannot be used there to test the code on a 3rd platform. Since the MSC313E works fine in the single-descriptor mode, and that people using the old RM9200 very likely favor stability over performance, better revert this patch until we can test it on the original platform this part of the driver was written for. Note that the reverted patch was actually tested on MSC313E. Cc: Nicolas Ferre <nicolas.ferre@microchip.com> Cc: Claudiu Beznea <claudiu.beznea@microchip.com> Cc: Daniel Palmer <daniel@0x0f.com> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/netdev/20201206092041.GA10646@1wt.eu/ Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10net: qualcomm: rmnet: Update rmnet device MTU based on real deviceSubash Abhinov Kasiviswanathan4-3/+90
Packets sent by rmnet to the real device have variable MAP header lengths based on the data format configured. This patch adds checks to ensure that the real device MTU is sufficient to transmit the MAP packet comprising of the MAP header and the IP packet. This check is enforced when rmnet devices are created and updated and during MTU updates of both the rmnet and real device. Additionally, rmnet devices now have a default MTU configured which accounts for the real device MTU and the headroom based on the data format. Signed-off-by: Sean Tranchetti <stranche@codeaurora.org> Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Tested-by: Loic Poulain <loic.poulain@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10net: lapbether: Consider it successful if (dis)connecting when already ↵Xie He1-2/+11
(dis)connected When the upper layer instruct us to connect (or disconnect), but we have already connected (or disconnected), consider this operation successful rather than failed. This can help the upper layer to correct its record about whether we are connected or not here in layer 2. The upper layer may not have the correct information about whether we are connected or not. This can happen if this driver has already been running for some time when the "x25" module gets loaded. Another X.25 driver (hdlc_x25) is already doing this, so we make this driver do this, too. Cc: Martin Schiller <ms@dev.tdt.de> Signed-off-by: Xie He <xie.he.0141@gmail.com> Acked-by: Martin Schiller <ms@dev.tdt.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10igc: Add new device IDSasha Neftin3-0/+3
Add new device ID for the next step of the silicon and reflect the I226_K part. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10tcp: correctly handle increased zerocopy args struct sizeArjun Roy1-2/+2
A prior patch increased the size of struct tcp_zerocopy_receive but did not update do_tcp_getsockopt() handling to properly account for this. This patch simply reintroduces content erroneously cut from the referenced prior patch that handles the new struct size. Fixes: 18fb76ed5386 ("net-zerocopy: Copy straggler unaligned data for TCP Rx. zerocopy.") Signed-off-by: Arjun Roy <arjunroy@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10net: mediatek: simplify the return expression of mtk_gmac_sgmii_path_setup()Zheng Yongjun1-6/+2
Simplify the return expression. Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10net/mlx4: simplify the return expression of mlx4_init_srq_table()Zheng Yongjun1-7/+2
Simplify the return expression. Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10net: stmmac: simplify the return tc_delete_knode()Zheng Yongjun1-8/+2
Simplify the return expression. Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10Merge tag 'linux-can-next-for-5.11-20201210' of ↵David S. Miller10-48/+242
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== pull-request: can-next 2020-12-10 here's a pull request of 7 patches for net-next/master. The first patch is by Oliver Hartkopp for the CAN ISOTP, which adds support for functional addressing. A patch by Antonio Quartulli removes an unneeded unlikely() annotation from the rx-offload helper. The next three patches target the m_can driver. Sean Nyekjaers's patch removes a double clearing of clock stop request bit, Patrik Flykt's patch moves the runtime PM enable/disable to m_can_platform and Jarkko Nikula's patch adds a PCI glue code driver. Fabio Estevam's patch converts the flexcan driver to DT only. And Manivannan Sadhasivam's patchd for the mcp251xfd driver adds internal loopback mode support. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10vxlan: avoid double unlikely() notation when using IS_ERR()Antonio Quartulli1-1/+1
The definition of IS_ERR() already applies the unlikely() notation when checking the error status of the passed pointer. For this reason there is no need to have the same notation outside of IS_ERR() itself. Clean up code by removing redundant notation. Signed-off-by: Antonio Quartulli <a@unstable.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10can: mcp251xfd: Add support for internal loopback modeManivannan Sadhasivam1-4/+7
MCP251xFD supports internal loopback mode which can be used to verify CAN functionality in the absence of a real CAN device. Link: https://lore.kernel.org/r/20201201054019.11012-1-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> [mkl: mcp251xfd_get_normal_mode(): move CAN_CTRLMODE_LOOPBACK check to front] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-12-10can: flexcan: convert the driver to DT-onlyFabio Estevam1-17/+1
The flexcan driver runs only on DT platforms, so simplify the code by using of_device_get_match_data() to retrieve the driver data and also by removing the unused id_table. Signed-off-by: Fabio Estevam <festevam@gmail.com> Link: https://lore.kernel.org/r/20201128132855.7724-1-festevam@gmail.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-12-10can: m_can: add PCI glue driver for Intel Elkhart LakeJarkko Nikula3-0/+194
Add support for M_CAN controller on Intel Elkhart Lake attached to the PCI bus. It integrates the Bosch M_CAN controller with Message RAM and the wrapper IP block with additional registers which all of them are within the same MMIO range. Currently only interrupt control register from wrapper IP is used and the MRAM configuration is expected to come from the firmware via "bosch,mram-cfg" device property and parsed by m_can.c core. Initial implementation is done by Felipe Balbi while he was working at Intel with later changes from Raymond Tan and me. Co-developed-by: Felipe Balbi (Intel) <balbi@kernel.org> Co-developed-by: Raymond Tan <raymond.tan@intel.com> Signed-off-by: Felipe Balbi (Intel) <balbi@kernel.org> Signed-off-by: Raymond Tan <raymond.tan@intel.com> Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Link: https://lore.kernel.org/r/20201117160827.3636264-1-jarkko.nikula@linux.intel.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-12-10can: m_can: move runtime PM enable/disable to m_can_platformPatrik Flykt2-8/+9
This is a preparatory patch for upcoming PCI based M_CAN devices. The current PM implementation would cause PCI based drivers to enable PM twice, once when the PCI device is added and a second time in m_can_class_register(). This will cause 'Unbalanced pm_runtime_enable!' to be logged, and is a situation that should be avoided. Therefore, in anticipation of PCI devices, move PM enabling out from M_CAN class registration to its only user, the m_can_platform driver. Signed-off-by: Patrik Flykt <patrik.flykt@linux.intel.com> Link: https://lore.kernel.org/r/20201023115800.46538-2-patrik.flykt@linux.intel.com [mkl: m_can_plat_probe(): fix error handling m_can_class_register(): simplify error handling] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-12-10can: m_can: m_can_config_endisable(): remove double clearing of clock stop ↵Sean Nyekjaer1-4/+0
request bit The CSR bit is already cleared when arriving here so remove this section of duplicate code. The registers set in m_can_config_endisable() is set to same exact values as before this patch. Signed-off-by: Sean Nyekjaer <sean@geanix.com> Acked-by: Sriram Dash <sriram.dash@samsung.com> Acked-by: Dan Murphy <dmurphy@ti.com> Link: https://lore.kernel.org/r/20191211063227.84259-1-sean@geanix.com Fixes: f524f829b75a ("can: m_can: Create a m_can platform framework") Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-12-10can: rx-offload: can_rx_offload_offload_one(): avoid double unlikely() ↵Antonio Quartulli1-1/+1
notation when using IS_ERR() The definition of IS_ERR() already applies the unlikely() notation when checking the error status of the passed pointer. For this reason there is no need to have the same notation outside of IS_ERR() itself. Clean up code by removing redundant notation. Signed-off-by: Antonio Quartulli <a@unstable.cc> Link: https://lore.kernel.org/r/20201210085321.18693-1-a@unstable.cc Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-12-10can: isotp: add SF_BROADCAST support for functional addressingOliver Hartkopp2-14/+30
When CAN_ISOTP_SF_BROADCAST is set in the CAN_ISOTP_OPTS flags the CAN_ISOTP socket is switched into functional addressing mode, where only single frame (SF) protocol data units can be send on the specified CAN interface and the given tp.tx_id after bind(). In opposite to normal and extended addressing this socket does not register a CAN-ID for reception which would be needed for a 1-to-1 ISOTP connection with a segmented bi-directional data transfer. Sending SFs on this socket is therefore a TX-only 'broadcast' operation. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Thomas Wagner <thwa1@web.de> Link: https://lore.kernel.org/r/20201206144731.4609-1-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-12-09Merge branch 'hns3-next'David S. Miller12-151/+759
Huazhong Tan says: ==================== net: hns3: updates for -next This patchset adds support for tc mqprio offload, hw tc offload of tc flower, and adpation for max rss size changes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09net: hns3: adjust rss tc mode configure commandGuojia Liao2-1/+5
For the max rss size of PF may be up to 512, the max queue number of single tc may be up to 512 too. For the total queue numbers may be up to 1280, so the queue offset of each tc may be more than 1024. So adjust the rss tc mode configuration command, including extend tc size field from 10 bits to 11 bits, and extend tc size field from 3 bits to 4 bits. Signed-off-by: Guojia Liao <liaoguojia@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09net: hns3: adjust rss indirection table configure commandGuojia Liao2-8/+19
For the max rss size of PF may be up to 512, so adjust the command of configuring rss indirection table to support queue id larger than 255. The width of queue id is extended from 8 bits to 10 bits. The high 2 bits are stored in filed rss_qid_h when the queue id is larger than 255. Signed-off-by: Guojia Liao <liaoguojia@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09net: hns3: add support for max 512 rss sizeGuojia Liao5-16/+39
Currently, the driver gets the max rss size from configuration file when initialization. Both the PF and VF share the same max rss size, and no more than 128. For DEVICE_VERSION_V3, the max rss size for PF can be up to 512, so there is a new field in configuration file to store it, the old filed is used for VF. To be compatible with boards using old configure file, the PF will use the old filed if the one is zero. For the rss size may be larger than 256, so the type of rss_indirection_tbl of struct hclge_vport should be changed to u16 as well. Signed-off-by: Guojia Liao <liaoguojia@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09net: hns3: add support for hw tc offload of tc flowerJian Shen4-13/+397
Some new device supports forwarding packet to queues of specified TC when flow director rule hit. So add support to configure flow director rule by tc flower. To avoid rule conflict, add a new flow director mode HCLGE_FD_TC_FLOWER_ACTIVE, and only one mode can be active at the same time. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09net: hns3: add support for forwarding packet to queues of specified TC when ↵Jian Shen4-5/+27
flow director rule hit For some new device, it supports forwarding packet to queues of specified TC when flow director rule hit. So extend the command handle to support it. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09net: hns3: add support for tc mqprio offloadJian Shen5-53/+220
Currently, the HNS3 driver only supports offload for tc number and prio_tc. This patch adds support for other qopts, including queues count and offset for each tc. When enable tc mqprio offload, it's not allowed to change queue numbers by ethtool. For hardware limitation, the queue number of each tc should be power of 2. For the queues is not assigned to each tc by average, so it's should return vport->alloc_tqps for hclge_get_max_channels(). Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09net: hns3: refine the struct hane3_tc_infoJian Shen8-67/+64
Currently, there are multiple members related to tc information in struct hnae3_knic_private_info. Merge them into a new struct hnae3_tc_info. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09nfp: silence set but not used warning with IPV6=nJakub Kicinski1-1/+1
Test robot reports: drivers/net/ethernet/netronome/nfp/crypto/tls.c: In function 'nfp_net_tls_rx_resync_req': drivers/net/ethernet/netronome/nfp/crypto/tls.c:477:18: warning: variable 'ipv6h' set but not used [-Wunused-but-set-variable] 477 | struct ipv6hdr *ipv6h; | ^~~~~ In file included from include/linux/compiler_types.h:65, from <command-line>: drivers/net/ethernet/netronome/nfp/crypto/tls.c: In function 'nfp_net_tls_add': include/linux/compiler_attributes.h:208:41: warning: statement will never be executed [-Wswitch-unreachable] 208 | # define fallthrough __attribute__((__fallthrough__)) | ^~~~~~~~~~~~~ drivers/net/ethernet/netronome/nfp/crypto/tls.c:299:3: note: in expansion of macro 'fallthrough' 299 | fallthrough; | ^~~~~~~~~~~ Use the IPv6 header in the switch, it doesn't matter which header we use to read the version field. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09net: stmmac: allow stmmac to probe for C45 PHY devicesWong Vee Khee1-0/+3
Assign stmmac's mdio_bus probe capabilities to MDIOBUS_C22_C45. This extended the probing of C45 PHY devices on the MDIO bus. Signed-off-by: Wong Vee Khee <vee.khee.wong@intel.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09Merge branch 'Add-support-for-VSOL-V2801F-CarlitoxxPro-CPGOS03-GPON-moDavid S. Miller2-11/+63
dule' Russell King says: ==================== Add support for VSOL V2801F/CarlitoxxPro CPGOS03 GPON module This patch set adds support for the V2801F / CarlitoxxPro module. This requires two changes: 1) the module only supports single byte reads to the ID EEPROM, while we need to still permit sequential reads to the diagnostics EEPROM for atomicity reasons. 2) we need to relax the encoding check when we have no reported capabilities to allow 1000base-X based on the module bitrate. Thanks to Pali Rohár for responsive testing over the last two days. (Resending, dropping the utf-8 characters in Pali's name so the patches get through vger. Added Andrew's r-b tags.) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09net: sfp: relax bitrate-derived mode checkRussell King1-6/+5
Do not check the encoding when deriving 1000BASE-X from the bitrate when no other modes are discovered. Some GPON modules (VSOL V2801F and CarlitoxxPro CPGOS03-0490 v2.0) indicate NRZ encoding with a 1200Mbaud bitrate, but should be driven with 1000BASE-X on the host side. Tested-by: Pali Rohár <pali@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09net: sfp: VSOL V2801F / CarlitoxxPro CPGOS03-0490 v2.0 workaroundRussell King1-5/+58
Add a workaround for the detection of VSOL V2801F / CarlitoxxPro CPGOS03-0490 v2.0 GPON module which CarlitoxxPro states needs single byte I2C reads to the EEPROM. Pali Rohár reports that he also has a CarlitoxxPro-based V2801F module, which reports a manufacturer of "OEM". This manufacturer can't be matched as it appears in many different modules, so also match the part number too. Reported-by: Thomas Schreiber <tschreibe@gmail.com> Reported-by: Pali Rohár <pali@kernel.org> Tested-by: Pali Rohár <pali@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09net: x25: Fix handling of Restart Request and Restart ConfirmationXie He1-16/+9
1. When the x25 module gets loaded, layer 2 may already be running and connected. In this case, although we are in X25_LINK_STATE_0, we still need to handle the Restart Request received, rather than ignore it. 2. When we are in X25_LINK_STATE_2, we have already sent a Restart Request and is waiting for the Restart Confirmation with t20timer. t20timer will restart itself repeatedly forever so it will always be there, as long as we are in State 2. So we don't need to check x25_t20timer_pending again. Fixes: d023b2b9ccc2 ("net/x25: fix restart request/confirm handling") Cc: Martin Schiller <ms@dev.tdt.de> Signed-off-by: Xie He <xie.he.0141@gmail.com> Acked-by: Martin Schiller <ms@dev.tdt.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09Merge branch 'mptcp-fixes'David S. Miller5-10/+48
Paolo Abeni says: ==================== mptcp: a bunch of fixes This series includes a few fixes following-up the recent code refactor for the MPTCP RX and TX paths. Boundling them together, since the fixes are somewhat related. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09mptcp: be careful on subflows shutdownPaolo Abeni1-2/+10
When the workqueue disposes of the msk, the subflows can still receive some data from the peer after __mptcp_close_ssk() completes. The above could trigger a race between the msk receive path and the msk destruction. Acquiring the mptcp_data_lock() in __mptcp_destroy_sock() will not save the day: the rx path could be reached even after msk destruction completes. Instead use the subflow 'disposable' flag to prevent entering the msk receive path after __mptcp_close_ssk(). Fixes: e16163b6e2b7 ("mptcp: refactor shutdown and close") Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09mptcp: plug subflow context memory leakPaolo Abeni1-2/+3
When a MPTCP listener socket is closed with unaccepted children pending, the ULP release callback will be invoked, but nobody will call into __mptcp_close_ssk() on the corresponding subflow. As a consequence, at ULP release time, the 'disposable' flag will be cleared and the subflow context memory will be leaked. This change addresses the issue always freeing the context if the subflow is still in the accept queue at ULP release time. Additionally, this fixes an incorrect code reference in the related comment. Note: this fix leverages the changes introduced by the previous commit. Fixes: e16163b6e2b7 ("mptcp: refactor shutdown and close") Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09mptcp: link MPC subflow into msk only after acceptPaolo Abeni5-6/+35
Christoph reported the following splat: WARNING: CPU: 0 PID: 4615 at net/ipv4/inet_connection_sock.c:1031 inet_csk_listen_stop+0x8e8/0xad0 net/ipv4/inet_connection_sock.c:1031 Modules linked in: CPU: 0 PID: 4615 Comm: syz-executor.4 Not tainted 5.9.0 #37 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:inet_csk_listen_stop+0x8e8/0xad0 net/ipv4/inet_connection_sock.c:1031 Code: 03 00 00 00 e8 79 b2 3d ff e9 ad f9 ff ff e8 1f 76 ba fe be 02 00 00 00 4c 89 f7 e8 62 b2 3d ff e9 14 f9 ff ff e8 08 76 ba fe <0f> 0b e9 97 f8 ff ff e8 fc 75 ba fe be 03 00 00 00 4c 89 f7 e8 3f RSP: 0018:ffffc900037f7948 EFLAGS: 00010293 RAX: ffff88810a349c80 RBX: ffff888114ee1b00 RCX: ffffffff827b14cd RDX: 0000000000000000 RSI: ffffffff827b1c38 RDI: 0000000000000005 RBP: ffff88810a2a8000 R08: ffff88810a349c80 R09: fffff520006fef1f R10: 0000000000000003 R11: fffff520006fef1e R12: ffff888114ee2d00 R13: dffffc0000000000 R14: 0000000000000001 R15: ffff888114ee1d68 FS: 00007f2ac1945700(0000) GS:ffff88811b400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffd44798bc0 CR3: 0000000109810002 CR4: 0000000000170ef0 Call Trace: __tcp_close+0xd86/0x1110 net/ipv4/tcp.c:2433 __mptcp_close_ssk+0x256/0x430 net/mptcp/protocol.c:1761 __mptcp_destroy_sock+0x49b/0x770 net/mptcp/protocol.c:2127 mptcp_close+0x62d/0x910 net/mptcp/protocol.c:2184 inet_release+0xe9/0x1f0 net/ipv4/af_inet.c:434 __sock_release+0xd2/0x280 net/socket.c:596 sock_close+0x15/0x20 net/socket.c:1277 __fput+0x276/0x960 fs/file_table.c:281 task_work_run+0x109/0x1d0 kernel/task_work.c:151 get_signal+0xe8f/0x1d40 kernel/signal.c:2561 arch_do_signal+0x88/0x1b60 arch/x86/kernel/signal.c:811 exit_to_user_mode_loop kernel/entry/common.c:161 [inline] exit_to_user_mode_prepare+0x9b/0xf0 kernel/entry/common.c:191 syscall_exit_to_user_mode+0x22/0x150 kernel/entry/common.c:266 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f2ac1254469 Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ff 49 2b 00 f7 d8 64 89 01 48 RSP: 002b:00007f2ac1944dc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffbf RBX: 000000000069bf00 RCX: 00007f2ac1254469 RDX: 0000000000000000 RSI: 0000000000008982 RDI: 0000000000000003 RBP: 000000000069bf00 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 000000000069bf0c R13: 00007ffeb53f178f R14: 00000000004668b0 R15: 0000000000000003 After commit 0397c6d85f9c ("mptcp: keep unaccepted MPC subflow into join list"), the msk's workqueue and/or PM can touch the MPC subflow - and acquire its socket lock - even if it's still unaccepted. If the above event races with the relevant listener socket close, we can end-up with the above splat. This change addresses the issue delaying the MPC socket insertion in conn_list at accept time - that is, partially reverting the blamed commit. We must additionally ensure that mptcp_pm_fully_established() happens after accept() time, or the PM will not be able to handle properly such event - conn_list could be empty otherwise. In the receive path, we check the subflow list node to ensure it is out of the listener queue. Be sure client subflows do not match transiently such condition moving them into the join list earlier at creation time. Since we now have multiple mptcp_pm_fully_established() call sites from different code-paths, said helper can now race with itself. Use an additional PM status bit to avoid multiple notifications. Reported-by: Christoph Paasch <cpaasch@apple.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/103 Fixes: 0397c6d85f9c ("mptcp: keep unaccepted MPC subflow into join list"), Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09net: hdlc_x25: Remove unnecessary skb_reset_network_header callsXie He1-2/+0
1. In x25_xmit, skb_reset_network_header is not necessary before we call lapb_data_request. The lapb module doesn't need skb->network_header. So there is no need to set skb->network_header before calling lapb_data_request. 2. In x25_data_indication (called by the lapb module after data have been received), skb_reset_network_header is not necessary before we call netif_rx. After we call netif_rx, the code in net/core/dev.c will call skb_reset_network_header before handing the skb to upper layers (in __netif_receive_skb_core, called by __netif_receive_skb_one_core, called by __netif_receive_skb, called by process_backlog). So we don't need to call skb_reset_network_header by ourselves. Cc: Martin Schiller <ms@dev.tdt.de> Signed-off-by: Xie He <xie.he.0141@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09net/mlx4: simplify the return expression of mlx4_init_cq_table()Zheng Yongjun1-7/+2
Simplify the return expression. Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09ibmvnic: fix rx buffer tracking and index management in replenish_rx_pool ↵Dwip N. Banerjee1-0/+2
partial success We observed that in the error case for batched send_subcrq_indirect() the driver does not account for the partial success case. This caused Linux to crash when free_map and pool index are inconsistent. Driver needs to update the rx pools "available" count when some batched sends worked but an error was encountered as part of the whole operation. Also track replenish_add_buff_failure for statistic purposes. Fixes: 4f0b6812e9b9a ("ibmvnic: Introduce batched RX buffer descriptor transmission") Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09Merge branch 'mptcp-Add-port-parameter-to-ADD_ADDR-option'David S. Miller5-79/+146
Mat Martineau says: ==================== mptcp: Add port parameter to ADD_ADDR option The ADD_ADDR MPTCP option is used to announce available IP addresses that a peer may connect to when adding more TCP subflows to an existing MPTCP connection. There is an optional port number field in that ADD_ADDR header, and this patch set adds capability for that port number to be sent and received. Patches 1, 2, and 4 refactor existing ADD_ADDR code to simplify implementation of port number support. Patches 3 and 5 are the main functional changes, for sending and receiving the port number in the MPTCP ADD_ADDR option. Patch 6 sends the ADD_ADDR option with port number on a bare TCP ACK, since the extra length of the option may run in to cases where sufficient TCP option space is not available on a data packet. Patch 7 plumbs in port number support for the in-kernel MPTCP path manager. Patches 8-11 add some optional debug output and a little more cleanup refactoring. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09mptcp: use the variable sk instead of open-codingGeliang Tang1-2/+2
Since the local variable sk has been defined, use it instead of open-coding. Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09mptcp: rename add_addr_signal and mptcp_add_addr_statusGeliang Tang3-16/+16
Since the RM_ADDR signal had been reused with add_addr_signal, it's not suitable to call it add_addr_signal or mptcp_add_addr_status. So this patch renamed add_addr_signal to addr_signal, and renamed mptcp_add_addr_status to mptcp_addr_signal_status. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09mptcp: drop rm_addr_signal flagGeliang Tang2-5/+17
This patch reused add_addr_signal for the RM_ADDR announcing signal, by defining a new ADD_ADDR status named MPTCP_RM_ADDR_SIGNAL. Then the flag rm_addr_signal in PM could be dropped. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09mptcp: print out port and ahmac when receiving ADD_ADDRGeliang Tang1-3/+3
This patch printed out more debugging information for the ADD_ADDR suboption parsing on the incoming path. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09mptcp: add port parameter for mptcp_pm_announce_addrGeliang Tang3-6/+11
This patch added a new parameter 'port' for mptcp_pm_announce_addr. If this parameter is true, we set the MPTCP_ADD_ADDR_PORT bit of the add_addr_signal. That means the announced address is added with a port number. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09mptcp: send out dedicated packet for ADD_ADDR using portGeliang Tang3-5/+15
The process is similar to that of the ADD_ADDR IPv6, this patch also sent out a pure ack for the ADD_ADDR using port. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09mptcp: add the outgoing ADD_ADDR port supportGeliang Tang3-7/+20
This patch added a new add_addr_signal type named MPTCP_ADD_ADDR_PORT, to identify it is an address with port to be added. It also added a new parameter 'port' for both mptcp_add_addr_len and mptcp_pm_add_addr_signal. In mptcp_established_options_add_addr, we check whether the announced address is added with port. If it is, we put this port number to mptcp_out_options's port field. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>