summaryrefslogtreecommitdiffstats
path: root/drivers/net/hyperv/hyperv_net.h
AgeCommit message (Collapse)AuthorFilesLines
2022-10-07hv_netvsc: Fix race between VF offering and VF association message from hostGaurav Kohli1-1/+2
During vm boot, there might be possibility that vf registration call comes before the vf association from host to vm. And this might break netvsc vf path, To prevent the same block vf registration until vf bind message comes from host. Cc: stable@vger.kernel.org Fixes: 00d7ddba11436 ("hv_netvsc: pair VF based on serial number") Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Gaurav Kohli <gauravkohli@linux.microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-11hv_netvsc: Add support for XDP_REDIRECTHaiyang Zhang1-3/+66
Handle XDP_REDIRECT action in netvsc driver. Also, transparently pass ndo_xdp_xmit to VF when available. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Link: https://lore.kernel.org/r/1649362894-20077-1-git-send-email-haiyangz@microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-20net: netvsc: Add Isolation VM support for netvsc driverTianyu Lan1-0/+5
In Isolation VM, all shared memory with host needs to mark visible to host via hvcall. vmbus_establish_gpadl() has already done it for netvsc rx/tx ring buffer. The page buffer used by vmbus_sendpacket_ pagebuffer() stills need to be handled. Use DMA API to map/umap these memory during sending/receiving packet and Hyper-V swiotlb bounce buffer dma address will be returned. The swiotlb bounce buffer has been masked to be visible to host during boot up. rx/tx ring buffer is allocated via vzalloc() and they need to be mapped into unencrypted address space(above vTOM) before sharing with host and accessing. Add hv_map/unmap_memory() to map/umap rx /tx ring buffer. Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20211213071407.314309-6-ltykernel@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-10-28Drivers: hv: vmbus: Mark vmbus ring buffer visible to host in Isolation VMTianyu Lan1-2/+3
Mark vmbus ring buffer visible with set_memory_decrypted() when establish gpadl handle. Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com> Link: https://lore.kernel.org/r/20211025122116.264793-5-ltykernel@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-06-30Merge tag 'net-next-5.14' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core: - BPF: - add syscall program type and libbpf support for generating instructions and bindings for in-kernel BPF loaders (BPF loaders for BPF), this is a stepping stone for signed BPF programs - infrastructure to migrate TCP child sockets from one listener to another in the same reuseport group/map to improve flexibility of service hand-off/restart - add broadcast support to XDP redirect - allow bypass of the lockless qdisc to improving performance (for pktgen: +23% with one thread, +44% with 2 threads) - add a simpler version of "DO_ONCE()" which does not require jump labels, intended for slow-path usage - virtio/vsock: introduce SOCK_SEQPACKET support - add getsocketopt to retrieve netns cookie - ip: treat lowest address of a IPv4 subnet as ordinary unicast address allowing reclaiming of precious IPv4 addresses - ipv6: use prandom_u32() for ID generation - ip: add support for more flexible field selection for hashing across multi-path routes (w/ offload to mlxsw) - icmp: add support for extended RFC 8335 PROBE (ping) - seg6: add support for SRv6 End.DT46 behavior - mptcp: - DSS checksum support (RFC 8684) to detect middlebox meddling - support Connection-time 'C' flag - time stamping support - sctp: packetization Layer Path MTU Discovery (RFC 8899) - xfrm: speed up state addition with seq set - WiFi: - hidden AP discovery on 6 GHz and other HE 6 GHz improvements - aggregation handling improvements for some drivers - minstrel improvements for no-ack frames - deferred rate control for TXQs to improve reaction times - switch from round robin to virtual time-based airtime scheduler - add trace points: - tcp checksum errors - openvswitch - action execution, upcalls - socket errors via sk_error_report Device APIs: - devlink: add rate API for hierarchical control of max egress rate of virtual devices (VFs, SFs etc.) - don't require RCU read lock to be held around BPF hooks in NAPI context - page_pool: generic buffer recycling New hardware/drivers: - mobile: - iosm: PCIe Driver for Intel M.2 Modem - support for Qualcomm MSM8998 (ipa) - WiFi: Qualcomm QCN9074 and WCN6855 PCI devices - sparx5: Microchip SparX-5 family of Enterprise Ethernet switches - Mellanox BlueField Gigabit Ethernet (control NIC of the DPU) - NXP SJA1110 Automotive Ethernet 10-port switch - Qualcomm QCA8327 switch support (qca8k) - Mikrotik 10/25G NIC (atl1c) Driver changes: - ACPI support for some MDIO, MAC and PHY devices from Marvell and NXP (our first foray into MAC/PHY description via ACPI) - HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx - Mellanox/Nvidia NIC (mlx5) - NIC VF offload of L2 bridging - support IRQ distribution to Sub-functions - Marvell (prestera): - add flower and match all - devlink trap - link aggregation - Netronome (nfp): connection tracking offload - Intel 1GE (igc): add AF_XDP support - Marvell DPU (octeontx2): ingress ratelimit offload - Google vNIC (gve): new ring/descriptor format support - Qualcomm mobile (rmnet & ipa): inline checksum offload support - MediaTek WiFi (mt76) - mt7915 MSI support - mt7915 Tx status reporting - mt7915 thermal sensors support - mt7921 decapsulation offload - mt7921 enable runtime pm and deep sleep - Realtek WiFi (rtw88) - beacon filter support - Tx antenna path diversity support - firmware crash information via devcoredump - Qualcomm WiFi (wcn36xx) - Wake-on-WLAN support with magic packets and GTK rekeying - Micrel PHY (ksz886x/ksz8081): add cable test support" * tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2168 commits) tcp: change ICSK_CA_PRIV_SIZE definition tcp_yeah: check struct yeah size at compile time gve: DQO: Fix off by one in gve_rx_dqo() stmmac: intel: set PCI_D3hot in suspend stmmac: intel: Enable PHY WOL option in EHL net: stmmac: option to enable PHY WOL with PMT enabled net: say "local" instead of "static" addresses in ndo_dflt_fdb_{add,del} net: use netdev_info in ndo_dflt_fdb_{add,del} ptp: Set lookup cookie when creating a PTP PPS source. net: sock: add trace for socket errors net: sock: introduce sk_error_report net: dsa: replay the local bridge FDB entries pointing to the bridge dev too net: dsa: ensure during dsa_fdb_offload_notify that dev_hold and dev_put are on the same dev net: dsa: include fdb entries pointing to bridge in the host fdb list net: dsa: include bridge addresses which are local in the host fdb list net: dsa: sync static FDB entries on foreign interfaces to hardware net: dsa: install the host MDB and FDB entries in the master's RX filter net: dsa: reference count the FDB addresses at the cross-chip notifier level net: dsa: introduce a separate cross-chip notifier type for host FDBs net: dsa: reference count the MDB entries at the cross-chip notifier level ...
2021-06-22hv_netvsc: Avoid field-overflowing memcpy()Kees Cook1-0/+1
In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memcpy(), memmove(), and memset(), avoid intentionally writing across neighboring fields. Add flexible array to represent start of buf_info, improving readability and avoid future warning where memcpy() thinks it is writing past the end of the structure. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14Drivers: hv: vmbus: Copy packets sent by Hyper-V out of the ring bufferAndres Beltran1-0/+7
Pointers to ring-buffer packets sent by Hyper-V are used within the guest VM. Hyper-V can send packets with erroneous values or modify packet fields after they are processed by the guest. To defend against these scenarios, return a copy of the incoming VMBus packet after validating its length and offset fields in hv_pkt_iter_first(). In this way, the packet can no longer be modified by the host. Signed-off-by: Andres Beltran <lkmlabelt@gmail.com> Co-developed-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20210408161439.341988-1-parri.andrea@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-29hv_netvsc: Add error handling while switching data pathHaiyang Zhang1-1/+5
Add error handling in case of failure to send switching data path message to the host. Reported-by: Shachar Raindel <shacharr@microsoft.com> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-01hv_netvsc: Fix validation in netvsc_linkstatus_callback()Andrea Parri (Microsoft)1-1/+1
Contrary to the RNDIS protocol specification, certain (pre-Fe) implementations of Hyper-V's vSwitch did not account for the status buffer field in the length of an RNDIS packet; the bug was fixed in newer implementations. Validate the status buffer fields using the length of the 'vmtransfer_page' packet (all implementations), that is known/validated to be less than or equal to the receive section size and not smaller than the length of the RNDIS message. Reported-by: Dexuan Cui <decui@microsoft.com> Suggested-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Fixes: 505e3f00c3f36 ("hv_netvsc: Add (more) validation for untrusted Hyper-V values") Signed-off-by: David S. Miller <davem@davemloft.net>
2021-01-29hv_netvsc: Copy packets sent by Hyper-V out of the receive bufferAndrea Parri (Microsoft)1-45/+48
Pointers to receive-buffer packets sent by Hyper-V are used within the guest VM. Hyper-V can send packets with erroneous values or modify packet fields after they are processed by the guest. To defend against these scenarios, copy (sections of) the incoming packet after validating their length and offset fields in netvsc_filter_receive(). In this way, the packet can no longer be modified by the host. Reported-by: Juan Vazquez <juvazq@microsoft.com> Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Link: https://lore.kernel.org/r/20210126162907.21056-1-parri.andrea@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17hv_netvsc: Use vmbus_requestor to generate transaction IDs for VMBus hardeningAndres Beltran1-0/+13
Currently, pointers to guest memory are passed to Hyper-V as transaction IDs in netvsc. In the face of errors or malicious behavior in Hyper-V, netvsc should not expose or trust the transaction IDs returned by Hyper-V to be valid guest memory addresses. Instead, use small integers generated by vmbus_requestor as requests (transaction) IDs. Signed-off-by: Andres Beltran <lkmlabelt@gmail.com> Co-developed-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Wei Liu <wei.liu@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: netdev@vger.kernel.org Link: https://lore.kernel.org/r/20201109100402.8946-4-parri.andrea@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2020-09-17hv_netvsc: Add validation for untrusted Hyper-V valuesAndres Beltran1-0/+4
For additional robustness in the face of Hyper-V errors or malicious behavior, validate all values that originate from packets that Hyper-V has sent to the guest in the host-to-guest ring buffer. Ensure that invalid values cannot cause indexing off the end of an array, or subvert an existing validation via integer overflow. Ensure that outgoing packets do not have any leftover guest memory that has not been zeroed out. Signed-off-by: Andres Beltran <lkmlabelt@gmail.com> Co-developed-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: netdev@vger.kernel.org Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10hv_netvsc: Cache the current data path to avoid duplicate call and messageDexuan Cui1-0/+3
The previous change "hv_netvsc: Switch the data path at the right time during hibernation" adds the call of netvsc_vf_changed() upon NETDEV_CHANGE, so it's necessary to avoid the duplicate call and message when the VF is brought UP or DOWN. Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22hv_netvsc: add support for vlans in AF_PACKET modeSriram Krishnan1-0/+1
Vlan tagged packets are getting dropped when used with DPDK that uses the AF_PACKET interface on a hyperV guest. The packet layer uses the tpacket interface to communicate the vlans information to the upper layers. On Rx path, these drivers can read the vlan info from the tpacket header but on the Tx path, this information is still within the packet frame and requires the paravirtual drivers to push this back into the NDIS header which is then used by the host OS to form the packet. This transition from the packet frame to NDIS header is currently missing hence causing the host OS to drop the all vlan tagged packets sent by the drivers that use AF_PACKET (ETH_P_ALL) such as DPDK. Here is an overview of the changes in the vlan header in the packet path: The RX path (userspace handles everything): 1. RX VLAN packet is stripped by HOST OS and placed in NDIS header 2. Guest Kernel RX hv_netvsc packets and moves VLAN info from NDIS header into kernel SKB 3. Kernel shares packets with user space application with PACKET_MMAP. The SKB VLAN info is copied to tpacket layer and indication set TP_STATUS_VLAN_VALID. 4. The user space application will re-insert the VLAN info into the frame The TX path: 1. The user space application has the VLAN info in the frame. 2. Guest kernel gets packets from the application with PACKET_MMAP. 3. The kernel later sends the frame to the hv_netvsc driver. The only way to send VLANs is when the SKB is setup & the VLAN is stripped from the frame. 4. TX VLAN is re-inserted by HOST OS based on the NDIS header. If it sees a VLAN in the frame the packet is dropped. Cc: xe-linux-external@cisco.com Cc: Sriram Krishnan <srirakr2@cisco.com> Signed-off-by: Sriram Krishnan <srirakr2@cisco.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25hv_netvsc: Add XDP supportHaiyang Zhang1-1/+20
This patch adds support of XDP in native mode for hv_netvsc driver, and transparently sets the XDP program on the associated VF NIC as well. Setting / unsetting XDP program on synthetic NIC (netvsc) propagates to VF NIC automatically. Setting / unsetting XDP program on VF NIC directly is not recommended, also not propagated to synthetic NIC, and may be overwritten by setting of synthetic NIC. The Azure/Hyper-V synthetic NIC receive buffer doesn't provide headroom for XDP. We thought about re-use the RNDIS header space, but it's too small. So we decided to copy the packets to a page buffer for XDP. And, most of our VMs on Azure have Accelerated Network (SRIOV) enabled, so most of the packets run on VF NIC. The synthetic NIC is considered as a fallback data-path. So the data copy on netvsc won't impact performance significantly. XDP program cannot run with LRO (RSC) enabled, so you need to disable LRO before running XDP: ethtool -K eth0 lro off XDP actions not yet supported: XDP_REDIRECT Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-20hv_netvsc: Fix unwanted rx_table resetHaiyang Zhang1-1/+2
In existing code, the receive indirection table, rx_table, is in struct rndis_device, which will be reset when changing MTU, ringparam, etc. User configured receive indirection table values will be lost. To fix this, move rx_table to struct net_device_context, and check netif_is_rxfh_configured(), so rx_table will be set to default only if no user configured value. Fixes: ff4a44199012 ("netvsc: allow get/set of RSS indirection table") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-30Merge tag 'hyperv-next-signed' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull Hyper-V updates from Sasha Levin: - support for new VMBus protocols (Andrea Parri) - hibernation support (Dexuan Cui) - latency testing framework (Branden Bonaby) - decoupling Hyper-V page size from guest page size (Himadri Pandya) * tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (22 commits) Drivers: hv: vmbus: Fix crash handler reset of Hyper-V synic drivers/hv: Replace binary semaphore with mutex drivers: iommu: hyperv: Make HYPERV_IOMMU only available on x86 HID: hyperv: Add the support of hibernation hv_balloon: Add the support of hibernation x86/hyperv: Implement hv_is_hibernation_supported() Drivers: hv: balloon: Remove dependencies on guest page size Drivers: hv: vmbus: Remove dependencies on guest page size x86: hv: Add function to allocate zeroed page for Hyper-V Drivers: hv: util: Specify ring buffer size using Hyper-V page size Drivers: hv: Specify receive buffer size using Hyper-V page size tools: hv: add vmbus testing tool drivers: hv: vmbus: Introduce latency testing video: hyperv: hyperv_fb: Support deferred IO for Hyper-V frame buffer driver video: hyperv: hyperv_fb: Obtain screen resolution from Hyper-V host hv_netvsc: Add the support of hibernation hv_sock: Add the support of hibernation video: hyperv_fb: Add the support of hibernation scsi: storvsc: Add the support of hibernation Drivers: hv: vmbus: Add module parameter to cap the VMBus version ...
2019-11-23hv_netvsc: make recording RSS hash depend on feature flagStephen Hemminger1-1/+2
The recording of RSS hash should be controlled by NETIF_F_RXHASH. Fixes: 1fac7ca4e63b ("hv_netvsc: record hardware hash in skb") Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-11-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-1/+2
Minor conflict in drivers/s390/net/qeth_l2_main.c, kept the lock from commit c8183f548902 ("s390/qeth: fix potential deadlock on workqueue flush"), removed the code which was removed by commit 9897d583b015 ("s390/qeth: consolidate some duplicated HW cmd code"). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-11-21hv_netvsc: Fix offset usage in netvsc_send_table()Haiyang Zhang1-1/+2
To reach the data region, the existing code adds offset in struct nvsp_5_send_indirect_table on the beginning of this struct. But the offset should be based on the beginning of its container, struct nvsp_message. This bug causes the first table entry missing, and adds an extra zero from the zero pad after the data region. This can put extra burden on the channel 0. So, correct the offset usage. Also add a boundary check to ensure not reading beyond data region. Fixes: 5b54dac856cb ("hyperv: Add support for virtual Receive Side Scaling (vRSS)") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-21hv_netvsc: Add the support of hibernationDexuan Cui1-0/+3
The existing netvsc_detach() and netvsc_attach() APIs make it easy to implement the suspend/resume callbacks. Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-05hv_netvsc: record hardware hash in skbStephen Hemminger1-0/+1
Since RSS hash is available from the host, record it in the skb. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07hv_netvsc: Allow scatter-gather feature to be tunableHaiyang Zhang1-1/+1
In a previous patch, the NETIF_F_SG was missing after the code changes. That caused the SG feature to be "fixed". This patch includes it into hw_features, so it is tunable again. Fixes: 23312a3be999 ("netvsc: negotiate checksum and segmentation parameters") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-30treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 201Thomas Gleixner1-13/+1
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms and conditions of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 228 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Steve Winslow <swinslow@gmail.com> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190528171438.107155473@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-29hv_netvsc: Fix unwanted wakeup after tx_disableHaiyang Zhang1-0/+1
After queue stopped, the wakeup mechanism may wake it up again when ring buffer usage is lower than a threshold. This may cause send path panic on NULL pointer when we stopped all tx queues in netvsc_detach and start removing the netvsc device. This patch fix it by adding a tx_disable flag to prevent unwanted queue wakeup. Fixes: 7b2ee50c0cd5 ("hv_netvsc: common detach logic") Reported-by: Mohammed Gamal <mgamal@redhat.com> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-23hv_netvsc: fix typos in code commentsAdrian Vladu1-1/+1
Fix all typos from hyperv netvsc code comments. Signed-off-by: Adrian Vladu <avladu@cloudbasesolutions.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Sasha Levin <sashal@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: "Alessandro Pilotti" <apilotti@cloudbasesolutions.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-23hv_netvsc: Fix hash key value reset after other opsHaiyang Zhang1-3/+7
Changing mtu, channels, or buffer sizes ops call to netvsc_attach(), rndis_set_subchannel(), which always reset the hash key to default value. That will override hash key changed previously. This patch fixes the problem by save the hash key, then restore it when we re- add the netvsc device. Fixes: ff4a44199012 ("netvsc: allow get/set of RSS indirection table") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> [sl: fix up subject line] Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-09-22hv_netvsc: Add handler for LRO setting changeHaiyang Zhang1-0/+4
This patch adds the handler for LRO setting change, so that a user can use ethtool command to enable / disable LRO feature. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-22hv_netvsc: Add support for LRO/RSC in the vSwitchHaiyang Zhang1-7/+40
LRO/RSC in the vSwitch is a feature available in Windows Server 2019 hosts and later. It reduces the per packet processing overhead by coalescing multiple TCP segments when possible. This patch adds netvsc driver support for this feature. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30hv_netvsc: Add per-cpu ethtool stats for netvscYidong Ren1-0/+11
This patch implements following ethtool stats fields for netvsc: cpu<n>_tx/rx_packets/bytes cpu<n>_vf_tx/rx_packets/bytes Corresponding per-cpu counters already exist in current code. Exposing these counters will help troubleshooting performance issues. for_each_present_cpu() was used instead of for_each_possible_cpu(). for_each_possible_cpu() would create very long and useless output. It is still being used for internal buffer, but not for ethtool output. There could be an overflow if cpu was added between ethtool call netvsc_get_sset_count() and netvsc_get_ethtool_stats() and netvsc_get_strings(). (still safe if cpu was removed) ethtool makes these three function calls separately. As long as we use ethtool, I can't see any clean solution. Currently and in foreseeable short term, Hyper-V doesn't support cpu hot-plug. Plus, ethtool is for admin use. Unlikely the admin would perform such combo operations. Changes in v2: - Remove cpp style comment - Resubmit after freeze Changes in v3: - Reimplemented with kvmalloc instead of alloc_percpu Changes in v4: - Fixed inconsistent array size - Use kvmalloc_array instead of kvmalloc Signed-off-by: Yidong Ren <yidren@microsoft.com> Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-30hv_netvsc: split sub-channel setup into async and syncStephen Hemminger1-1/+1
When doing device hotplug the sub channel must be async to avoid deadlock issues because device is discovered in softirq context. When doing changes to MTU and number of channels, the setup must be synchronous to avoid races such as when MTU and device settings are done in a single ip command. Reported-by: Thomas Walker <Thomas.Walker@twosigma.com> Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug") Fixes: 732e49850c5e ("netvsc: fix race on sub channel creation") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-14hv_netvsc: Fix the variable sizes in ipsecv2 and rsc offloadHaiyang Zhang1-13/+13
These fields in struct ndis_ipsecv2_offload and struct ndis_rsc_offload are one byte according to the specs. This patch defines them with the right size. These structs are not in use right now, but will be used soon. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-12hv_netvsc: fix network namespace issues with VF supportStephen Hemminger1-0/+2
When finding the parent netvsc device, the search needs to be across all netvsc device instances (independent of network namespace). Find parent device of VF using upper_dev_get routine which searches only adjacent list. Fixes: e8ff40d4bff1 ("hv_netvsc: improve VF device matching") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> netns aware byref Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-12hv_netvsc: drop common code until callback model fixedStephen Hemminger1-2/+0
The callback model of handling network failover is not suitable in the current form. 1. It was merged without addressing all the review feedback. 2. It was merged without approval of any of the netvsc maintainers. 3. Design discussion on how to handle PV/VF fallback is still not complete. 4. IMHO the code model using callbacks is trying to make something common which isn't. Revert the netvsc specific changes for now. Does not impact ongoing development of failover model for virtio. Revisit this after a simpler library based failover kernel routines are extracted. This reverts commit 9c6ffbacdb57 ("hv_netvsc: fix error return code in netvsc_probe()") and commit 1ff78076d8dd ("netvsc: refactor notifier/event handling code to use the failover framework") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-10Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds1-1/+0
Pull SCSI updates from James Bottomley: "This is mostly updates to the usual drivers: ufs, qedf, mpt3sas, lpfc, xfcp, hisi_sas, cxlflash, qla2xxx. In the absence of Nic, we're also taking target updates which are mostly minor except for the tcmu refactor. The only real core change to worry about is the removal of high page bouncing (in sas, storvsc and iscsi). This has been well tested and no problems have shown up so far" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (268 commits) scsi: lpfc: update driver version to 12.0.0.4 scsi: lpfc: Fix port initialization failure. scsi: lpfc: Fix 16gb hbas failing cq create. scsi: lpfc: Fix crash in blk_mq layer when executing modprobe -r lpfc scsi: lpfc: correct oversubscription of nvme io requests for an adapter scsi: lpfc: Fix MDS diagnostics failure (Rx < Tx) scsi: hisi_sas: Mark PHY as in reset for nexus reset scsi: hisi_sas: Fix return value when get_free_slot() failed scsi: hisi_sas: Terminate STP reject quickly for v2 hw scsi: hisi_sas: Add v2 hw force PHY function for internal ATA command scsi: hisi_sas: Include TMF elements in struct hisi_sas_slot scsi: hisi_sas: Try wait commands before before controller reset scsi: hisi_sas: Init disks after controller reset scsi: hisi_sas: Create a scsi_host_template per HW module scsi: hisi_sas: Reset disks when discovered scsi: hisi_sas: Add LED feature for v3 hw scsi: hisi_sas: Change common allocation mode of device id scsi: hisi_sas: change slot index allocation mode scsi: hisi_sas: Introduce hisi_sas_phy_set_linkrate() scsi: hisi_sas: fix a typo in hisi_sas_task_prep() ...
2018-05-28netvsc: refactor notifier/event handling code to use the failover frameworkSridhar Samudrala1-0/+2
Use the registration/notification framework supported by the generic failover infrastructure. Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-10hv_netvsc: typo in NDIS RSS parameters structureStephen Hemminger1-1/+1
Fix simple misspelling kashkey_offset should be hashkey_offset. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18hv_netvsc: Add NetVSP v6 and v6.1 into version negotiationHaiyang Zhang1-0/+164
This patch adds the NetVSP v6 and 6.1 message structures, and includes these versions into NetVSC/NetVSP version negotiation process. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18scsi: netvsc: Use the vmbus function to calculate ring buffer percentageLong Li1-1/+0
In Vmbus, we have defined a function to calculate available ring buffer percentage to write. Use that function and remove netvsc's private version. [mkp: typo] Signed-off-by: Long Li <longli@microsoft.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-25hv_netvsc: Add range checking for rx packet offset and lengthHaiyang Zhang1-0/+1
This patch adds range checking for rx packet offset and length. It may only happen if there is a host side bug. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-22hv_netvsc: common detach logicStephen Hemminger1-1/+0
Make common function for detaching internals of device during changes to MTU and RSS. Make sure no more packets are transmitted and all packets have been received before doing device teardown. Change the wait logic to be common and use usleep_range(). Changes transmit enabling logic so that transmit queues are disabled during the period when lower device is being changed. And enabled only after sub channels are setup. This avoids issue where it could be that a packet was being sent while subchannel was not initialized. Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08hv_netvsc: avoid repeated updates of packet filterStephen Hemminger1-0/+1
The netvsc driver can get repeated calls to netvsc_rx_mode during network setup; each of these calls ends up scheduling the lower layers to update tha packet filter. This update requires an request/response to the host. So avoid doing this if we already know that the correct packet filter value is set. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13hv_netvsc: empty current transmit aggregation if flow blockedStephen Hemminger1-1/+1
If the transmit queue is known full, then don't keep aggregating data. And the cp_partial flag which indicates that the current aggregation buffer is full can be folded in to avoid more conditionals. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13hv_netvsc: remove open_cnt reference countStephen Hemminger1-2/+0
There is only ever a single instance of network device object referencing the internal rndis object. Therefore the open_cnt atomic is not necessary. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13hv_netvsc: pass netvsc_device to receive callbackStephen Hemminger1-0/+1
The netvsc_receive_callback function was using RCU to find the appropriate underlying netvsc_device. Since calling function already had that pointer, this was unnecessary. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13hv_netvsc: simplify function args in receive status pathStephen Hemminger1-2/+1
The caller (netvsc_receive) already has the net device pointer, and should just pass that to functions rather than the hyperv device. This eliminates several impossible error paths in the process. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13hv_netvsc: track memory allocation failures in ethtool statsStephen Hemminger1-0/+1
When skb can not be allocated, update ethtool statisitics rather than rx_dropped which is intended for netif_receive. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13hv_netvsc: Fix the TX/RX buffer default sizesHaiyang Zhang1-1/+12
The values were not computed correctly. There are no significant visible impact, though. The intended size of RX buffer is 16 MB, and the default slot size is 1728. So, NETVSC_DEFAULT_RX should be 16*1024*1024 / 1728 = 9709. The intended size of TX buffer is 1 MB, and the slot size is 6144. So, NETVSC_DEFAULT_TX should be 1024*1024 / 6144 = 170. The patch puts the formula directly into the macro, and moves them to hyperv_net.h, together with related macros. Fixes: 5023a6db73196 ("netvsc: increase default receive buffer size") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13hv_netvsc: Fix the receive buffer size limitHaiyang Zhang1-2/+4
The max should be 31 MB on host with NVSP version > 2. On legacy hosts (NVSP version <=2) only 15 MB receive buffer is allowed, otherwise the buffer request will be rejected by the host, resulting vNIC not coming up. The NVSP version is only available after negotiation. So, we add the limit checking for legacy hosts in netvsc_init_buf(). Fixes: 5023a6db73196 ("netvsc: increase default receive buffer size") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-03hv_netvsc: use reciprocal divide to speed up percent calculationStephen Hemminger1-3/+3
Every packet sent checks the available ring space. The calculation can be sped up by using reciprocal divide which is multiplication. Since ring_size can only be configured by module parameter, so it doesn't have to be passed around everywhere. Also it should be unsigned since it is number of pages. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>