author     Linus Torvalds <torvalds@linux-foundation.org>   2015-09-26 06:01:33 -0400
committer  Linus Torvalds <torvalds@linux-foundation.org>   2015-09-26 06:01:33 -0400
commit     518a7cb6980cd640c7f979d29021ad870f60d7d7 (patch)
tree       7ef65013cbf1b5b3f65c8295756446dafcd4f784
parent     d4a748a10e50d95992ae67677f1a1a13e2d6ed47 (diff)
parent     bdb06cbf77cb01911694cc9076ffa8196b7b9b61 (diff)
download   linux-518a7cb6980cd640c7f979d29021ad870f60d7d7.tar.bz2
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) When we run a tap on netlink sockets, we have to copy mmap'd SKBs
instead of cloning them. From Daniel Borkmann.
2) When converting classical BPF into eBPF, fix the setting of the
source reg to BPF_REG_X. From Tycho Andersen.
3) Fix igmpv3/mldv2 report parsing in the bridge multicast code, from
Linus Lüssing.
4) Fix dst refcounting for ipv6 tunnels, from Martin KaFai Lau.
5) Set NLM_F_REPLACE flag properly when replacing ipv6 routes, from
Roopa Prabhu.
6) Add some new cxgb4 PCI device IDs, from Hariprasad Shenai.
7) Fix headroom tests and SKB leaks in ipv6 fragmentation code, from
Florian Westphal.
8) Check DMA mapping errors in bna driver, from Ivan Vecera (the pattern is
sketched after this list).
9) Several 8139cp bug fixes (dev_kfree_skb_any in interrupt context,
misclearing of interrupt status in TX timeout handler, etc.) from
David Woodhouse.
10) In tipc, reset SKB header pointer after skb_linearize(), from Erik
Hugne (the pitfall is sketched after this list).
11) Fix autobind races et al. in netlink code, from Herbert Xu with
help from Tejun Heo and others.
12) Missing SET_NETDEV_DEV in sunvnet driver, from Sowmini Varadhan.
13) Fix various races in timewait timer and reqsk_queue_hash_req, from
Eric Dumazet.
14) Fix array overruns in mac80211, from Johannes Berg and Dan
Carpenter.
15) Fix data race in rhashtable_rehash_one(), from Dmitriy Vyukov.
16) Fix race between poll_one_napi and napi_disable, from Neil Horman.
17) Fix byte order in geneve tunnel port config, from John W Linville.
18) Fix handling of ARP replies over lightweight tunnels, from Jiri
Benc.
19) We can loop when fib rule dumps cross multiple SKBs, fix from Wilson
Kok and Roopa Prabhu.
20) Several reference count handling bug fixes in the PHY/MDIO layer
from Russell King.
21) Fix lockdep splat in ppp_dev_uninit(), from Guillaume Nault.
22) Fix crash in icmp_route_lookup(), from David Ahern.
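
Item 8 above adds dma_mapping_error() checks to the bna driver. As a rough
illustration of that pattern only (this is not the bna code; my_rx_map() and
its error path are invented for the sketch), a DMA address returned by
dma_map_single() has to be validated before it is programmed into hardware:

        #include <linux/dma-mapping.h>
        #include <linux/skbuff.h>

        static int my_rx_map(struct device *dev, struct sk_buff *skb,
                             unsigned int len, dma_addr_t *mapping)
        {
                dma_addr_t dma;

                dma = dma_map_single(dev, skb->data, len, DMA_FROM_DEVICE);

                /* The mapping can fail (e.g. IOMMU or SWIOTLB exhaustion);
                 * check it before handing the address to a descriptor.
                 */
                if (dma_mapping_error(dev, dma)) {
                        dev_kfree_skb_any(skb);  /* safe in any context */
                        return -ENOMEM;
                }

                *mapping = dma;
                return 0;
        }

dev_kfree_skb_any() is chosen here because a refill path may run in softirq
context, the same concern item 9 addresses in the 8139cp fixes.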
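
Item 10 is a reminder that skb_linearize() may reallocate skb->data, so any
header pointer computed before the call is stale afterwards. A generic sketch
of the pitfall, assuming an invented my_hdr/my_parse() rather than the actual
TIPC code:

        #include <linux/skbuff.h>

        struct my_hdr {
                __be32 word0;
        };

        static int my_parse(struct sk_buff *skb)
        {
                struct my_hdr *hdr = (struct my_hdr *)skb->data;

                if (skb_is_nonlinear(skb)) {
                        if (skb_linearize(skb))
                                return -ENOMEM;
                        /* skb->data may have been reallocated; refresh the
                         * cached header pointer instead of reusing the old one.
                         */
                        hdr = (struct my_hdr *)skb->data;
                }

                return be32_to_cpu(hdr->word0) ? 0 : -EINVAL;
        }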
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
net: Fix panic in icmp_route_lookup
net: update docbook comment for __mdiobus_register()
ppp: fix lockdep splat in ppp_dev_uninit()
net: via/Kconfig: GENERIC_PCI_IOMAP required if PCI not selected
phy: marvell: add link partner advertised modes
net: fix net_device refcounting
phy: add phy_device_remove()
phy: fixed-phy: properly validate phy in fixed_phy_update_state()
net: fix phy refcounting in a bunch of drivers
of_mdio: fix MDIO phy device refcounting
phy: add proper phy struct device refcounting
phy: fix mdiobus module safety
net: dsa: fix of_mdio_find_bus() device refcount leak
phy: fix of_mdio_find_bus() device refcount leak
ip6_tunnel: Reduce log level in ip6_tnl_err() to debug
ip6_gre: Reduce log level in ip6gre_err() to debug
fib_rules: fix fib rule dumps across multiple skbs
bnx2x: byte swap rss_key to comply to Toeplitz specs
net: revert "net_sched: move tp->root allocation into fw_init()"
lwtunnel: remove source and destination UDP port config option
...
117 files changed, 1697 insertions, 639 deletions
diff --git a/Documentation/networking/vrf.txt b/Documentation/networking/vrf.txt new file mode 100644 index 000000000000..031ef4a63485 --- /dev/null +++ b/Documentation/networking/vrf.txt @@ -0,0 +1,96 @@ +Virtual Routing and Forwarding (VRF) +==================================== +The VRF device combined with ip rules provides the ability to create virtual +routing and forwarding domains (aka VRFs, VRF-lite to be specific) in the +Linux network stack. One use case is the multi-tenancy problem where each +tenant has their own unique routing tables and in the very least need +different default gateways. + +Processes can be "VRF aware" by binding a socket to the VRF device. Packets +through the socket then use the routing table associated with the VRF +device. An important feature of the VRF device implementation is that it +impacts only Layer 3 and above so L2 tools (e.g., LLDP) are not affected +(ie., they do not need to be run in each VRF). The design also allows +the use of higher priority ip rules (Policy Based Routing, PBR) to take +precedence over the VRF device rules directing specific traffic as desired. + +In addition, VRF devices allow VRFs to be nested within namespaces. For +example network namespaces provide separation of network interfaces at L1 +(Layer 1 separation), VLANs on the interfaces within a namespace provide +L2 separation and then VRF devices provide L3 separation. + +Design +------ +A VRF device is created with an associated route table. Network interfaces +are then enslaved to a VRF device: + + +-----------------------------+ + | vrf-blue | ===> route table 10 + +-----------------------------+ + | | | + +------+ +------+ +-------------+ + | eth1 | | eth2 | ... | bond1 | + +------+ +------+ +-------------+ + | | + +------+ +------+ + | eth8 | | eth9 | + +------+ +------+ + +Packets received on an enslaved device and are switched to the VRF device +using an rx_handler which gives the impression that packets flow through +the VRF device. Similarly on egress routing rules are used to send packets +to the VRF device driver before getting sent out the actual interface. This +allows tcpdump on a VRF device to capture all packets into and out of the +VRF as a whole.[1] Similiarly, netfilter [2] and tc rules can be applied +using the VRF device to specify rules that apply to the VRF domain as a whole. + +[1] Packets in the forwarded state do not flow through the device, so those + packets are not seen by tcpdump. Will revisit this limitation in a + future release. + +[2] Iptables on ingress is limited to NF_INET_PRE_ROUTING only with skb->dev + set to real ingress device and egress is limited to NF_INET_POST_ROUTING. + Will revisit this limitation in a future release. + + +Setup +----- +1. VRF device is created with an association to a FIB table. + e.g, ip link add vrf-blue type vrf table 10 + ip link set dev vrf-blue up + +2. Rules are added that send lookups to the associated FIB table when the + iif or oif is the VRF device. e.g., + ip ru add oif vrf-blue table 10 + ip ru add iif vrf-blue table 10 + + Set the default route for the table (and hence default route for the VRF). + e.g, ip route add table 10 prohibit default + +3. Enslave L3 interfaces to a VRF device. + e.g, ip link set dev eth1 master vrf-blue + + Local and connected routes for enslaved devices are automatically moved to + the table associated with VRF device. Any additional routes depending on + the enslaved device will need to be reinserted following the enslavement. + +4. 
Additional VRF routes are added to associated table. + e.g., ip route add table 10 ... + + +Applications +------------ +Applications that are to work within a VRF need to bind their socket to the +VRF device: + + setsockopt(sd, SOL_SOCKET, SO_BINDTODEVICE, dev, strlen(dev)+1); + +or to specify the output device using cmsg and IP_PKTINFO. + + +Limitations +----------- +VRF device currently only works for IPv4. Support for IPv6 is under development. + +Index of original ingress interface is not available via cmsg. Will address +soon. diff --git a/Documentation/sysctl/net.txt b/Documentation/sysctl/net.txt index 6294b5186ae5..809ab6efcc74 100644 --- a/Documentation/sysctl/net.txt +++ b/Documentation/sysctl/net.txt @@ -54,13 +54,15 @@ default_qdisc -------------- The default queuing discipline to use for network devices. This allows -overriding the default queue discipline of pfifo_fast with an -alternative. Since the default queuing discipline is created with the -no additional parameters so is best suited to queuing disciplines that -work well without configuration like stochastic fair queue (sfq), -CoDel (codel) or fair queue CoDel (fq_codel). Don't use queuing disciplines -like Hierarchical Token Bucket or Deficit Round Robin which require setting -up classes and bandwidths. +overriding the default of pfifo_fast with an alternative. Since the default +queuing discipline is created without additional parameters so is best suited +to queuing disciplines that work well without configuration like stochastic +fair queue (sfq), CoDel (codel) or fair queue CoDel (fq_codel). Don't use +queuing disciplines like Hierarchical Token Bucket or Deficit Round Robin +which require setting up classes and bandwidths. Note that physical multiqueue +interfaces still use mq as root qdisc, which in turn uses this default for its +leaves. Virtual devices (like e.g. lo or veth) ignore this setting and instead +default to noqueue. 
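
The new Documentation/networking/vrf.txt quoted in the diff above shows the
SO_BINDTODEVICE call that makes a process "VRF aware". A small userspace
sketch of that call, reusing the document's own "vrf-blue" example (a UDP
socket is assumed here, and binding to a device typically requires
CAP_NET_RAW):

        #include <stdio.h>
        #include <string.h>
        #include <sys/socket.h>
        #include <unistd.h>

        int main(void)
        {
                const char dev[] = "vrf-blue"; /* VRF device from the setup steps */
                int sd = socket(AF_INET, SOCK_DGRAM, 0);

                if (sd < 0) {
                        perror("socket");
                        return 1;
                }

                /* Route lookups for this socket now use the table associated
                 * with vrf-blue (table 10 in the example above).
                 */
                if (setsockopt(sd, SOL_SOCKET, SO_BINDTODEVICE,
                               dev, strlen(dev) + 1) < 0) {
                        perror("setsockopt(SO_BINDTODEVICE)");
                        close(sd);
                        return 1;
                }

                /* ... connect()/sendto() as usual ... */
                close(sd);
                return 0;
        }

As the document notes, the output device can alternatively be supplied
per-packet via cmsg and IP_PKTINFO instead of binding the socket.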
Default: pfifo_fast busy_read diff --git a/MAINTAINERS b/MAINTAINERS index 45b06ab43ec0..5de7c7945022 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -808,6 +808,13 @@ S: Maintained F: drivers/video/fbdev/arcfb.c F: drivers/video/fbdev/core/fb_defio.c +ARCNET NETWORK LAYER +M: Michael Grzeschik <m.grzeschik@pengutronix.de> +L: netdev@vger.kernel.org +S: Maintained +F: drivers/net/arcnet/ +F: include/uapi/linux/if_arcnet.h + ARM MFM AND FLOPPY DRIVERS M: Ian Molton <spyro@f2s.com> S: Maintained @@ -8500,7 +8507,6 @@ F: Documentation/networking/LICENSE.qla3xxx F: drivers/net/ethernet/qlogic/qla3xxx.* QLOGIC QLCNIC (1/10)Gb ETHERNET DRIVER -M: Shahed Shaikh <shahed.shaikh@qlogic.com> M: Dept-GELinuxNICDev@qlogic.com L: netdev@vger.kernel.org S: Supported @@ -11262,6 +11268,7 @@ L: netdev@vger.kernel.org S: Maintained F: drivers/net/vrf.c F: include/net/vrf.h +F: Documentation/networking/vrf.txt VT1211 HARDWARE MONITOR DRIVER M: Juerg Haefliger <juergh@gmail.com> diff --git a/drivers/atm/he.c b/drivers/atm/he.c index a8da3a50e374..0f5cb37636bc 100644 --- a/drivers/atm/he.c +++ b/drivers/atm/he.c @@ -1578,9 +1578,7 @@ he_stop(struct he_dev *he_dev) kfree(he_dev->rbpl_virt); kfree(he_dev->rbpl_table); - - if (he_dev->rbpl_pool) - dma_pool_destroy(he_dev->rbpl_pool); + dma_pool_destroy(he_dev->rbpl_pool); if (he_dev->rbrq_base) dma_free_coherent(&he_dev->pci_dev->dev, CONFIG_RBRQ_SIZE * sizeof(struct he_rbrq), @@ -1594,8 +1592,7 @@ he_stop(struct he_dev *he_dev) dma_free_coherent(&he_dev->pci_dev->dev, CONFIG_TBRQ_SIZE * sizeof(struct he_tbrq), he_dev->tpdrq_base, he_dev->tpdrq_phys); - if (he_dev->tpd_pool) - dma_pool_destroy(he_dev->tpd_pool); + dma_pool_destroy(he_dev->tpd_pool); if (he_dev->pci_dev) { pci_read_config_word(he_dev->pci_dev, PCI_COMMAND, &command); diff --git a/drivers/atm/solos-pci.c b/drivers/atm/solos-pci.c index 74e18b0a6d89..3d7fb6516f74 100644 --- a/drivers/atm/solos-pci.c +++ b/drivers/atm/solos-pci.c @@ -805,7 +805,12 @@ static void solos_bh(unsigned long card_arg) continue; } - skb = alloc_skb(size + 1, GFP_ATOMIC); + /* Use netdev_alloc_skb() because it adds NET_SKB_PAD of + * headroom, and ensures we can route packets back out an + * Ethernet interface (for example) without having to + * reallocate. Adding NET_IP_ALIGN also ensures that both + * PPPoATM and PPPoEoBR2684 packets end up aligned. */ + skb = netdev_alloc_skb_ip_align(NULL, size + 1); if (!skb) { if (net_ratelimit()) dev_warn(&card->dev->dev, "Failed to allocate sk_buff for RX\n"); @@ -869,7 +874,10 @@ static void solos_bh(unsigned long card_arg) /* Allocate RX skbs for any ports which need them */ if (card->using_dma && card->atmdev[port] && !card->rx_skb[port]) { - struct sk_buff *skb = alloc_skb(RX_DMA_SIZE, GFP_ATOMIC); + /* Unlike the MMIO case (qv) we can't add NET_IP_ALIGN + * here; the FPGA can only DMA to addresses which are + * aligned to 4 bytes. 
*/ + struct sk_buff *skb = dev_alloc_skb(RX_DMA_SIZE); if (skb) { SKB_CB(skb)->dma_addr = dma_map_single(&card->dev->dev, skb->data, diff --git a/drivers/net/arcnet/arcnet.c b/drivers/net/arcnet/arcnet.c index 10f71c732b59..816d0e94961c 100644 --- a/drivers/net/arcnet/arcnet.c +++ b/drivers/net/arcnet/arcnet.c @@ -326,7 +326,7 @@ static void arcdev_setup(struct net_device *dev) dev->type = ARPHRD_ARCNET; dev->netdev_ops = &arcnet_netdev_ops; dev->header_ops = &arcnet_header_ops; - dev->hard_header_len = sizeof(struct archdr); + dev->hard_header_len = sizeof(struct arc_hardware); dev->mtu = choose_mtu(); dev->addr_len = ARCNET_ALEN; diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c index 6f13f7206762..f8baa897d1a0 100644 --- a/drivers/net/dsa/mv88e6xxx.c +++ b/drivers/net/dsa/mv88e6xxx.c @@ -2000,6 +2000,7 @@ static int mv88e6xxx_setup_port(struct dsa_switch *ds, int port) */ reg = _mv88e6xxx_reg_read(ds, REG_PORT(port), PORT_PCS_CTRL); if (dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port)) { + reg &= ~PORT_PCS_CTRL_UNFORCED; reg |= PORT_PCS_CTRL_FORCE_LINK | PORT_PCS_CTRL_LINK_UP | PORT_PCS_CTRL_DUPLEX_FULL | diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c index cfa37041ab71..c4bb8027b3fb 100644 --- a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c +++ b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c @@ -689,16 +689,24 @@ static int xgene_enet_phy_connect(struct net_device *ndev) netdev_dbg(ndev, "No phy-handle found in DT\n"); return -ENODEV; } - pdata->phy_dev = of_phy_find_device(phy_np); - } - phy_dev = pdata->phy_dev; + phy_dev = of_phy_connect(ndev, phy_np, &xgene_enet_adjust_link, + 0, pdata->phy_mode); + if (!phy_dev) { + netdev_err(ndev, "Could not connect to PHY\n"); + return -ENODEV; + } + + pdata->phy_dev = phy_dev; + } else { + phy_dev = pdata->phy_dev; - if (!phy_dev || - phy_connect_direct(ndev, phy_dev, &xgene_enet_adjust_link, - pdata->phy_mode)) { - netdev_err(ndev, "Could not connect to PHY\n"); - return -ENODEV; + if (!phy_dev || + phy_connect_direct(ndev, phy_dev, &xgene_enet_adjust_link, + pdata->phy_mode)) { + netdev_err(ndev, "Could not connect to PHY\n"); + return -ENODEV; + } } pdata->phy_speed = SPEED_UNKNOWN; diff --git a/drivers/net/ethernet/arc/emac_arc.c b/drivers/net/ethernet/arc/emac_arc.c index f9cb99bfb511..ffd180570920 100644 --- a/drivers/net/ethernet/arc/emac_arc.c +++ b/drivers/net/ethernet/arc/emac_arc.c @@ -78,6 +78,7 @@ static const struct of_device_id emac_arc_dt_ids[] = { { .compatible = "snps,arc-emac" }, { /* Sentinel */ } }; +MODULE_DEVICE_TABLE(of, emac_arc_dt_ids); static struct platform_driver emac_arc_driver = { .probe = emac_arc_probe, diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c index b9a5a97ed4dd..f1b5364f3521 100644 --- a/drivers/net/ethernet/broadcom/bcmsysport.c +++ b/drivers/net/ethernet/broadcom/bcmsysport.c @@ -2079,6 +2079,7 @@ static const struct of_device_id bcm_sysport_of_match[] = { { .compatible = "brcm,systemport" }, { /* sentinel */ } }; +MODULE_DEVICE_TABLE(of, bcm_sysport_of_match); static struct platform_driver bcm_sysport_driver = { .probe = bcm_sysport_probe, diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h index ba936635322a..b5e64b02200c 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h @@ -1946,6 +1946,7 @@ struct bnx2x { u16 vlan_cnt; u16 vlan_credit; u16 
vxlan_dst_port; + u8 vxlan_dst_port_count; bool accept_any_vlan; }; diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c index e3da2bddf143..f1d62d5dbaff 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c @@ -3705,16 +3705,14 @@ out: void bnx2x_update_mfw_dump(struct bnx2x *bp) { - struct timeval epoc; u32 drv_ver; u32 valid_dump; if (!SHMEM2_HAS(bp, drv_info)) return; - /* Update Driver load time */ - do_gettimeofday(&epoc); - SHMEM2_WR(bp, drv_info.epoc, epoc.tv_sec); + /* Update Driver load time, possibly broken in y2038 */ + SHMEM2_WR(bp, drv_info.epoc, (u32)ktime_get_real_seconds()); drv_ver = bnx2x_update_mng_version_utility(DRV_MODULE_VERSION, true); SHMEM2_WR(bp, drv_info.drv_ver, drv_ver); @@ -10110,12 +10108,18 @@ static void __bnx2x_add_vxlan_port(struct bnx2x *bp, u16 port) if (!netif_running(bp->dev)) return; - if (bp->vxlan_dst_port || !IS_PF(bp)) { + if (bp->vxlan_dst_port_count && bp->vxlan_dst_port == port) { + bp->vxlan_dst_port_count++; + return; + } + + if (bp->vxlan_dst_port_count || !IS_PF(bp)) { DP(BNX2X_MSG_SP, "Vxlan destination port limit reached\n"); return; } bp->vxlan_dst_port = port; + bp->vxlan_dst_port_count = 1; bnx2x_schedule_sp_rtnl(bp, BNX2X_SP_RTNL_ADD_VXLAN_PORT, 0); } @@ -10130,10 +10134,14 @@ static void bnx2x_add_vxlan_port(struct net_device *netdev, static void __bnx2x_del_vxlan_port(struct bnx2x *bp, u16 port) { - if (!bp->vxlan_dst_port || bp->vxlan_dst_port != port || !IS_PF(bp)) { + if (!bp->vxlan_dst_port_count || bp->vxlan_dst_port != port || + !IS_PF(bp)) { DP(BNX2X_MSG_SP, "Invalid vxlan port\n"); return; } + bp->vxlan_dst_port--; + if (bp->vxlan_dst_port) + return; if (netif_running(bp->dev)) { bnx2x_schedule_sp_rtnl(bp, BNX2X_SP_RTNL_DEL_VXLAN_PORT, 0); diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c index c9bd7f16018e..ff702a707a91 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c @@ -4319,8 +4319,16 @@ static int bnx2x_setup_rss(struct bnx2x *bp, /* RSS keys */ if (test_bit(BNX2X_RSS_SET_SRCH, &p->rss_flags)) { - memcpy(&data->rss_key[0], &p->rss_key[0], - sizeof(data->rss_key)); + u8 *dst = (u8 *)(data->rss_key) + sizeof(data->rss_key); + const u8 *src = (const u8 *)p->rss_key; + int i; + + /* Apparently, bnx2x reads this array in reverse order + * We need to byte swap rss_key to comply with Toeplitz specs. 
+ */ + for (i = 0; i < sizeof(data->rss_key); i++) + *--dst = *src++; + caps |= ETH_RSS_UPDATE_RAMROD_DATA_UPDATE_RSS_KEY; } diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c index fadbd0088d3e..3bc701e4c59e 100644 --- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c @@ -3155,6 +3155,7 @@ static const struct of_device_id bcmgenet_match[] = { { .compatible = "brcm,genet-v4", .data = (void *)GENET_V4 }, { }, }; +MODULE_DEVICE_TABLE(of, bcmgenet_match); static int bcmgenet_probe(struct platform_device *pdev) { diff --git a/drivers/net/ethernet/brocade/bna/bna_tx_rx.c b/drivers/net/ethernet/brocade/bna/bna_tx_rx.c index 5d0753cc7e73..04b0d16b210e 100644 --- a/drivers/net/ethernet/brocade/bna/bna_tx_rx.c +++ b/drivers/net/ethernet/brocade/bna/bna_tx_rx.c @@ -2400,6 +2400,7 @@ bna_rx_create(struct bna *bna, struct bnad *bnad, q0->rcb->id = 0; q0->rx_packets = q0->rx_bytes = 0; q0->rx_packets_with_error = q0->rxbuf_alloc_failed = 0; + q0->rxbuf_map_failed = 0; bna_rxq_qpt_setup(q0, rxp, dpage_count, PAGE_SIZE, &dqpt_mem[i], &dsqpt_mem[i], &dpage_mem[i]); @@ -2428,6 +2429,7 @@ bna_rx_create(struct bna *bna, struct bnad *bnad, : rx_cfg->q1_buf_size; q1->rx_packets = q1->rx_bytes = 0; q1->rx_packets_with_error = q1->rxbuf_alloc_failed = 0; + q1->rxbuf_map_failed = 0; bna_rxq_qpt_setup(q1, rxp, hpage_count, PAGE_SIZE, &hqpt_mem[i], &hsqpt_mem[i], diff --git a/drivers/net/ethernet/brocade/bna/bna_types.h b/drivers/net/ethernet/brocade/bna/bna_types.h index e0e797f2ea14..c438d032e8bf 100644 --- a/drivers/net/ethernet/brocade/bna/bna_types.h +++ b/drivers/net/ethernet/brocade/bna/bna_types.h @@ -587,6 +587,7 @@ struct bna_rxq { u64 rx_bytes; u64 rx_packets_with_error; u64 rxbuf_alloc_failed; + u64 rxbuf_map_failed; }; /* RxQ pair */ diff --git a/drivers/net/ethernet/brocade/bna/bnad.c b/drivers/net/ethernet/brocade/bna/bnad.c index 506047c38607..21a0cfc3e7ec 100644 --- a/drivers/net/ethernet/brocade/bna/bnad.c +++ b/drivers/net/ethernet/brocade/bna/bnad.c @@ -399,7 +399,13 @@ bnad_rxq_refill_page(struct bnad *bnad, struct bna_rcb *rcb, u32 nalloc) } dma_addr = dma_map_page(&bnad->pcidev->dev, page, page_offset, - unmap_q->map_size, DMA_FROM_DEVICE); + unmap_q->map_size, DMA_FROM_DEVICE); + if (dma_mapping_error(&bnad->pcidev->dev, dma_addr)) { + put_page(page); + BNAD_UPDATE_CTR(bnad, rxbuf_map_failed); + rcb->rxq->rxbuf_map_failed++; + goto finishing; + } unmap->page = page; unmap->page_offset = page_offset; @@ -454,8 +460,15 @@ bnad_rxq_refill_skb(struct bnad *bnad, struct bna_rcb *rcb, u32 nalloc) rcb->rxq->rxbuf_alloc_failed++; goto finishing; } + dma_addr = dma_map_single(&bnad->pcidev->dev, skb->data, buff_sz, DMA_FROM_DEVICE); + if (dma_mapping_error(&bnad->pcidev->dev, dma_addr)) { + dev_kfree_skb_any(skb); + BNAD_UPDATE_CTR(bnad, rxbuf_map_failed); + rcb->rxq->rxbuf_map_failed++; + goto finishing; + } unmap->skb = skb; dma_unmap_addr_set(&unmap->vector, dma_addr, dma_addr); @@ -3025,6 +3038,11 @@ bnad_start_xmit(struct sk_buff *skb, struct net_device *netdev) unmap = head_unmap; dma_addr = dma_map_single(&bnad->pcidev->dev, skb->data, len, DMA_TO_DEVICE); + if (dma_mapping_error(&bnad->pcidev->dev, dma_addr)) { + dev_kfree_skb_any(skb); + BNAD_UPDATE_CTR(bnad, tx_skb_map_failed); + return NETDEV_TX_OK; + } BNA_SET_DMA_ADDR(dma_addr, &txqent->vector[0].host_addr); txqent->vector[0].length = htons(len); dma_unmap_addr_set(&unmap->vectors[0], dma_addr, dma_addr); @@ -3056,6 +3074,15 @@ 
bnad_start_xmit(struct sk_buff *skb, struct net_device *netdev) dma_addr = skb_frag_dma_map(&bnad->pcidev->dev, frag, 0, size, DMA_TO_DEVICE); + if (dma_mapping_error(&bnad->pcidev->dev, dma_addr)) { + /* Undo the changes starting at tcb->producer_index */ + bnad_tx_buff_unmap(bnad, unmap_q, q_depth, + tcb->producer_index); + dev_kfree_skb_any(skb); + BNAD_UPDATE_CTR(bnad, tx_skb_map_failed); + return NETDEV_TX_OK; + } + dma_unmap_len_set(&unmap->vectors[vect_id], dma_len, size); BNA_SET_DMA_ADDR(dma_addr, &txqent->vector[vect_id].host_addr); txqent->vector[vect_id].length = htons(size); diff --git a/drivers/net/ethernet/brocade/bna/bnad.h b/drivers/net/ethernet/brocade/bna/bnad.h index faedbf24777e..f4ed816b93ee 100644 --- a/drivers/net/ethernet/brocade/bna/bnad.h +++ b/drivers/net/ethernet/brocade/bna/bnad.h @@ -175,6 +175,7 @@ struct bnad_drv_stats { u64 tx_skb_headlen_zero; u64 tx_skb_frag_zero; u64 tx_skb_len_mismatch; + u64 tx_skb_map_failed; u64 hw_stats_updates; u64 netif_rx_dropped; @@ -189,6 +190,7 @@ struct bnad_drv_stats { u64 rx_unmap_q_alloc_failed; u64 rxbuf_alloc_failed; + u64 rxbuf_map_failed; }; /* Complete driver stats */ diff --git a/drivers/net/ethernet/brocade/bna/bnad_ethtool.c b/drivers/net/ethernet/brocade/bna/bnad_ethtool.c index 2bdfc5dff4b1..0e4fdc3dd729 100644 --- a/drivers/net/ethernet/brocade/bna/bnad_ethtool.c +++ b/drivers/net/ethernet/brocade/bna/bnad_ethtool.c @@ -90,6 +90,7 @@ static const char *bnad_net_stats_strings[BNAD_ETHTOOL_STATS_NUM] = { "tx_skb_headlen_zero", "tx_skb_frag_zero", "tx_skb_len_mismatch", + "tx_skb_map_failed", "hw_stats_updates", "netif_rx_dropped", @@ -102,6 +103,7 @@ static const char *bnad_net_stats_strings[BNAD_ETHTOOL_STATS_NUM] = { "tx_unmap_q_alloc_failed", "rx_unmap_q_alloc_failed", "rxbuf_alloc_failed", + "rxbuf_map_failed", "mac_stats_clr_cnt", "mac_frame_64", @@ -807,6 +809,7 @@ bnad_per_q_stats_fill(struct bnad *bnad, u64 *buf, int bi) rx_packets_with_error; buf[bi++] = rcb->rxq-> rxbuf_alloc_failed; + buf[bi++] = rcb->rxq->rxbuf_map_failed; buf[bi++] = rcb->producer_index; buf[bi++] = rcb->consumer_index; } @@ -821,6 +824,7 @@ bnad_per_q_stats_fill(struct bnad *bnad, u64 *buf, int bi) rx_packets_with_error; buf[bi++] = rcb->rxq-> rxbuf_alloc_failed; + buf[bi++] = rcb->rxq->rxbuf_map_failed; buf[bi++] = rcb->producer_index; buf[bi++] = rcb->consumer_index; } diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h b/drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h index 8353a6cbfcc2..03ed00c49823 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h +++ b/drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h @@ -157,6 +157,11 @@ CH_PCI_DEVICE_ID_TABLE_DEFINE_BEGIN CH_PCI_ID_TABLE_FENTRY(0x5090), /* Custom T540-CR */ CH_PCI_ID_TABLE_FENTRY(0x5091), /* Custom T522-CR */ CH_PCI_ID_TABLE_FENTRY(0x5092), /* Custom T520-CR */ + CH_PCI_ID_TABLE_FENTRY(0x5093), /* Custom T580-LP-CR */ + CH_PCI_ID_TABLE_FENTRY(0x5094), /* Custom T540-CR */ + CH_PCI_ID_TABLE_FENTRY(0x5095), /* Custom T540-CR-SO */ + CH_PCI_ID_TABLE_FENTRY(0x5096), /* Custom T580-CR */ + CH_PCI_ID_TABLE_FENTRY(0x5097), /* Custom T520-KR */ /* T6 adapters: */ diff --git a/drivers/net/ethernet/emulex/benet/be.h b/drivers/net/ethernet/emulex/benet/be.h index 0a27805cbbbd..821540913343 100644 --- a/drivers/net/ethernet/emulex/benet/be.h +++ b/drivers/net/ethernet/emulex/benet/be.h @@ -582,6 +582,7 @@ struct be_adapter { u16 pvid; __be16 vxlan_port; int vxlan_port_count; + int vxlan_port_aliases; struct phy_info phy; u8 wol_cap; bool wol_en; diff --git 
a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c index 12687bf52b95..7bf51a1a0a77 100644 --- a/drivers/net/ethernet/emulex/benet/be_main.c +++ b/drivers/net/ethernet/emulex/benet/be_main.c @@ -5176,6 +5176,11 @@ static void be_add_vxlan_port(struct net_device *netdev, sa_family_t sa_family, if (lancer_chip(adapter) || BEx_chip(adapter) || be_is_mc(adapter)) return; + if (adapter->vxlan_port == port && adapter->vxlan_port_count) { + adapter->vxlan_port_aliases++; + return; + } + if (adapter->flags & BE_FLAGS_VXLAN_OFFLOADS) { dev_info(dev, "Only one UDP port supported for VxLAN offloads\n"); @@ -5226,6 +5231,11 @@ static void be_del_vxlan_port(struct net_device *netdev, sa_family_t sa_family, if (adapter->vxlan_port != port) goto done; + if (adapter->vxlan_port_aliases) { + adapter->vxlan_port_aliases--; + return; + } + be_disable_vxlan_offloads(adapter); dev_info(&adapter->pdev->dev, diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c index 4b69d061d90f..710715fcb23d 100644 --- a/drivers/net/ethernet/freescale/gianfar.c +++ b/drivers/net/ethernet/freescale/gianfar.c @@ -1710,8 +1710,10 @@ static void gfar_configure_serdes(struct net_device *dev) * everything for us? Resetting it takes the link down and requires * several seconds for it to come back. */ - if (phy_read(tbiphy, MII_BMSR) & BMSR_LSTATUS) + if (phy_read(tbiphy, MII_BMSR) & BMSR_LSTATUS) { + put_device(&tbiphy->dev); return; + } /* Single clk mode, mii mode off(for serdes communication) */ phy_write(tbiphy, MII_TBICON, TBICON_CLK_SELECT); @@ -1723,6 +1725,8 @@ static void gfar_configure_serdes(struct net_device *dev) phy_write(tbiphy, MII_BMCR, BMCR_ANENABLE | BMCR_ANRESTART | BMCR_FULLDPLX | BMCR_SPEED1000); + + put_device(&tbiphy->dev); } static int __gfar_is_rx_idle(struct gfar_private *priv) @@ -1970,8 +1974,7 @@ static int register_grp_irqs(struct gfar_priv_grp *grp) /* Install our interrupt handlers for Error, * Transmit, and Receive */ - err = request_irq(gfar_irq(grp, ER)->irq, gfar_error, - IRQF_NO_SUSPEND, + err = request_irq(gfar_irq(grp, ER)->irq, gfar_error, 0, gfar_irq(grp, ER)->name, grp); if (err < 0) { netif_err(priv, intr, dev, "Can't get IRQ %d\n", @@ -1979,6 +1982,8 @@ static int register_grp_irqs(struct gfar_priv_grp *grp) goto err_irq_fail; } + enable_irq_wake(gfar_irq(grp, ER)->irq); + err = request_irq(gfar_irq(grp, TX)->irq, gfar_transmit, 0, gfar_irq(grp, TX)->name, grp); if (err < 0) { @@ -1994,14 +1999,14 @@ static int register_grp_irqs(struct gfar_priv_grp *grp) goto rx_irq_fail; } } else { - err = request_irq(gfar_irq(grp, TX)->irq, gfar_interrupt, - IRQF_NO_SUSPEND, + err = request_irq(gfar_irq(grp, TX)->irq, gfar_interrupt, 0, gfar_irq(grp, TX)->name, grp); if (err < 0) { netif_err(priv, intr, dev, "Can't get IRQ %d\n", gfar_irq(grp, TX)->irq); goto err_irq_fail; } + enable_irq_wake(gfar_irq(grp, TX)->irq); } return 0; diff --git a/drivers/net/ethernet/freescale/gianfar_ptp.c b/drivers/net/ethernet/freescale/gianfar_ptp.c index 8e3cd77aa347..664d0c261269 100644 --- a/drivers/net/ethernet/freescale/gianfar_ptp.c +++ b/drivers/net/ethernet/freescale/gianfar_ptp.c @@ -557,6 +557,7 @@ static const struct of_device_id match_table[] = { { .compatible = "fsl,etsec-ptp" }, {}, }; +MODULE_DEVICE_TABLE(of, match_table); static struct platform_driver gianfar_ptp_driver = { .driver = { diff --git a/drivers/net/ethernet/freescale/ucc_geth.c b/drivers/net/ethernet/freescale/ucc_geth.c index 4dd40e057f40..650f7888e32b 100644 
--- a/drivers/net/ethernet/freescale/ucc_geth.c +++ b/drivers/net/ethernet/freescale/ucc_geth.c @@ -1384,6 +1384,8 @@ static int adjust_enet_interface(struct ucc_geth_private *ugeth) value = phy_read(tbiphy, ENET_TBI_MII_CR); value &= ~0x1000; /* Turn off autonegotiation */ phy_write(tbiphy, ENET_TBI_MII_CR, value); + + put_device(&tbiphy->dev); } init_check_frame_length_mode(ug_info->lengthCheckRx, &ug_regs->maccfg2); @@ -1702,8 +1704,10 @@ static void uec_configure_serdes(struct net_device *dev) * everything for us? Resetting it takes the link down and requires * several seconds for it to come back. */ - if (phy_read(tbiphy, ENET_TBI_MII_SR) & TBISR_LSTATUS) + if (phy_read(tbiphy, ENET_TBI_MII_SR) & TBISR_LSTATUS) { + put_device(&tbiphy->dev); return; + } /* Single clk mode, mii mode off(for serdes communication) */ phy_write(tbiphy, ENET_TBI_MII_ANA, TBIANA_SETTINGS); @@ -1711,6 +1715,8 @@ static void uec_configure_serdes(struct net_device *dev) phy_write(tbiphy, ENET_TBI_MII_TBICON, TBICON_CLK_SELECT); phy_write(tbiphy, ENET_TBI_MII_CR, TBICR_SETTINGS); + + put_device(&tbiphy->dev); } /* Configure the PHY for dev. diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c index fe2299ac4f5c..514df76fc70f 100644 --- a/drivers/net/ethernet/marvell/mvneta.c +++ b/drivers/net/ethernet/marvell/mvneta.c @@ -1479,6 +1479,7 @@ static int mvneta_rx(struct mvneta_port *pp, int rx_todo, struct mvneta_rx_desc *rx_desc = mvneta_rxq_next_desc_get(rxq); struct sk_buff *skb; unsigned char *data; + dma_addr_t phys_addr; u32 rx_status; int rx_bytes, err; @@ -1486,6 +1487,7 @@ static int mvneta_rx(struct mvneta_port *pp, int rx_todo, rx_status = rx_desc->status; rx_bytes = rx_desc->data_size - (ETH_FCS_LEN + MVNETA_MH_SIZE); data = (unsigned char *)rx_desc->buf_cookie; + phys_addr = rx_desc->buf_phys_addr; if (!mvneta_rxq_desc_is_first_last(rx_status) || (rx_status & MVNETA_RXD_ERR_SUMMARY)) { @@ -1534,7 +1536,7 @@ static int mvneta_rx(struct mvneta_port *pp, int rx_todo, if (!skb) goto err_drop_frame; - dma_unmap_single(dev->dev.parent, rx_desc->buf_phys_addr, + dma_unmap_single(dev->dev.parent, phys_addr, MVNETA_RX_BUF_SIZE(pp->pkt_size), DMA_FROM_DEVICE); rcvd_pkts++; @@ -3173,6 +3175,8 @@ static int mvneta_probe(struct platform_device *pdev) struct phy_device *phy = of_phy_find_device(dn); mvneta_fixed_link_update(pp, phy); + + put_device(&phy->dev); } return 0; diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c index 4c7de8c44659..e7a5000aa12c 100644 --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c @@ -1270,8 +1270,6 @@ int mlx4_en_config_rss_steer(struct mlx4_en_priv *priv) rss_context->hash_fn = MLX4_RSS_HASH_TOP; memcpy(rss_context->rss_key, priv->rss_key, MLX4_EN_RSS_KEY_SIZE); - netdev_rss_key_fill(rss_context->rss_key, - MLX4_EN_RSS_KEY_SIZE); } else { en_err(priv, "Unknown RSS hash function requested\n"); err = -EINVAL; diff --git a/drivers/net/ethernet/micrel/ks8851.c b/drivers/net/ethernet/micrel/ks8851.c index 66d4ab703f45..60f43ec22175 100644 --- a/drivers/net/ethernet/micrel/ks8851.c +++ b/drivers/net/ethernet/micrel/ks8851.c @@ -1601,6 +1601,7 @@ static const struct of_device_id ks8851_match_table[] = { { .compatible = "micrel,ks8851" }, { } }; +MODULE_DEVICE_TABLE(of, ks8851_match_table); static struct spi_driver ks8851_driver = { .driver = { diff --git a/drivers/net/ethernet/moxa/moxart_ether.c b/drivers/net/ethernet/moxa/moxart_ether.c index 
becbb5f1f5a7..a10c928bbd6b 100644 --- a/drivers/net/ethernet/moxa/moxart_ether.c +++ b/drivers/net/ethernet/moxa/moxart_ether.c @@ -552,6 +552,7 @@ static const struct of_device_id moxart_mac_match[] = { { .compatible = "moxa,moxart-mac" }, { } }; +MODULE_DEVICE_TABLE(of, moxart_mac_match); static struct platform_driver moxart_mac_driver = { .probe = moxart_mac_probe, diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h b/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h index 06bcc734fe8d..d6696cfa11d2 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h @@ -536,6 +536,7 @@ struct qlcnic_hardware_context { u8 extend_lb_time; u8 phys_port_id[ETH_ALEN]; u8 lb_mode; + u8 vxlan_port_count; u16 vxlan_port; struct device *hwmon_dev; u32 post_mode; diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c index 8b08b20e8b30..d4481454b5f8 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c @@ -483,11 +483,17 @@ static void qlcnic_add_vxlan_port(struct net_device *netdev, /* Adapter supports only one VXLAN port. Use very first port * for enabling offload */ - if (!qlcnic_encap_rx_offload(adapter) || ahw->vxlan_port) + if (!qlcnic_encap_rx_offload(adapter)) return; + if (!ahw->vxlan_port_count) { + ahw->vxlan_port_count = 1; + ahw->vxlan_port = ntohs(port); + adapter->flags |= QLCNIC_ADD_VXLAN_PORT; + return; + } + if (ahw->vxlan_port == ntohs(port)) + ahw->vxlan_port_count++; - ahw->vxlan_port = ntohs(port); - adapter->flags |= QLCNIC_ADD_VXLAN_PORT; } static void qlcnic_del_vxlan_port(struct net_device *netdev, @@ -496,11 +502,13 @@ static void qlcnic_del_vxlan_port(struct net_device *netdev, struct qlcnic_adapter *adapter = netdev_priv(netdev); struct qlcnic_hardware_context *ahw = adapter->ahw; - if (!qlcnic_encap_rx_offload(adapter) || !ahw->vxlan_port || + if (!qlcnic_encap_rx_offload(adapter) || !ahw->vxlan_port_count || (ahw->vxlan_port != ntohs(port))) return; - adapter->flags |= QLCNIC_DEL_VXLAN_PORT; + ahw->vxlan_port_count--; + if (!ahw->vxlan_port_count) + adapter->flags |= QLCNIC_DEL_VXLAN_PORT; } static netdev_features_t qlcnic_features_check(struct sk_buff *skb, diff --git a/drivers/net/ethernet/realtek/8139cp.c b/drivers/net/ethernet/realtek/8139cp.c index d79e33b3c191..686334f4588d 100644 --- a/drivers/net/ethernet/realtek/8139cp.c +++ b/drivers/net/ethernet/realtek/8139cp.c @@ -157,6 +157,7 @@ enum { NWayAdvert = 0x66, /* MII ADVERTISE */ NWayLPAR = 0x68, /* MII LPA */ NWayExpansion = 0x6A, /* MII Expansion */ + TxDmaOkLowDesc = 0x82, /* Low 16 bit address of a Tx descriptor. */ Config5 = 0xD8, /* Config5 */ TxPoll = 0xD9, /* Tell chip to check Tx descriptors for work */ RxMaxSize = 0xDA, /* Max size of an Rx packet (8169 only) */ @@ -341,6 +342,7 @@ struct cp_private { unsigned tx_tail; struct cp_desc *tx_ring; struct sk_buff *tx_skb[CP_TX_RING_SIZE]; + u32 tx_opts[CP_TX_RING_SIZE]; unsigned rx_buf_sz; unsigned wol_enabled : 1; /* Is Wake-on-LAN enabled? 
*/ @@ -665,7 +667,7 @@ static void cp_tx (struct cp_private *cp) BUG_ON(!skb); dma_unmap_single(&cp->pdev->dev, le64_to_cpu(txd->addr), - le32_to_cpu(txd->opts1) & 0xffff, + cp->tx_opts[tx_tail] & 0xffff, PCI_DMA_TODEVICE); if (status & LastFrag) { @@ -733,7 +735,7 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb, { struct cp_private *cp = netdev_priv(dev); unsigned entry; - u32 eor, flags; + u32 eor, opts1; unsigned long intr_flags; __le32 opts2; int mss = 0; @@ -753,6 +755,21 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb, mss = skb_shinfo(skb)->gso_size; opts2 = cpu_to_le32(cp_tx_vlan_tag(skb)); + opts1 = DescOwn; + if (mss) + opts1 |= LargeSend | ((mss & MSSMask) << MSSShift); + else if (skb->ip_summed == CHECKSUM_PARTIAL) { + const struct iphdr *ip = ip_hdr(skb); + if (ip->protocol == IPPROTO_TCP) + opts1 |= IPCS | TCPCS; + else if (ip->protocol == IPPROTO_UDP) + opts1 |= IPCS | UDPCS; + else { + WARN_ONCE(1, + "Net bug: asked to checksum invalid Legacy IP packet\n"); + goto out_dma_error; + } + } if (skb_shinfo(skb)->nr_frags == 0) { struct cp_desc *txd = &cp->tx_ring[entry]; @@ -768,31 +785,20 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb, txd->addr = cpu_to_le64(mapping); wmb(); - flags = eor | len | DescOwn | FirstFrag | LastFrag; - - if (mss) - flags |= LargeSend | ((mss & MSSMask) << MSSShift); - else if (skb->ip_summed == CHECKSUM_PARTIAL) { - const struct iphdr *ip = ip_hdr(skb); - if (ip->protocol == IPPROTO_TCP) - flags |= IPCS | TCPCS; - else if (ip->protocol == IPPROTO_UDP) - flags |= IPCS | UDPCS; - else - WARN_ON(1); /* we need a WARN() */ - } + opts1 |= eor | len | FirstFrag | LastFrag; - txd->opts1 = cpu_to_le32(flags); + txd->opts1 = cpu_to_le32(opts1); wmb(); cp->tx_skb[entry] = skb; - entry = NEXT_TX(entry); + cp->tx_opts[entry] = opts1; + netif_dbg(cp, tx_queued, cp->dev, "tx queued, slot %d, skblen %d\n", + entry, skb->len); } else { struct cp_desc *txd; - u32 first_len, first_eor; + u32 first_len, first_eor, ctrl; dma_addr_t first_mapping; int frag, first_entry = entry; - const struct iphdr *ip = ip_hdr(skb); /* We must give this initial chunk to the device last. * Otherwise we could race with the device. @@ -805,14 +811,14 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb, goto out_dma_error; cp->tx_skb[entry] = skb; - entry = NEXT_TX(entry); for (frag = 0; frag < skb_shinfo(skb)->nr_frags; frag++) { const skb_frag_t *this_frag = &skb_shinfo(skb)->frags[frag]; u32 len; - u32 ctrl; dma_addr_t mapping; + entry = NEXT_TX(entry); + len = skb_frag_size(this_frag); mapping = dma_map_single(&cp->pdev->dev, skb_frag_address(this_frag), @@ -824,19 +830,7 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb, eor = (entry == (CP_TX_RING_SIZE - 1)) ? 
RingEnd : 0; - ctrl = eor | len | DescOwn; - - if (mss) - ctrl |= LargeSend | - ((mss & MSSMask) << MSSShift); - else if (skb->ip_summed == CHECKSUM_PARTIAL) { - if (ip->protocol == IPPROTO_TCP) - ctrl |= IPCS | TCPCS; - else if (ip->protocol == IPPROTO_UDP) - ctrl |= IPCS | UDPCS; - else - BUG(); - } + ctrl = opts1 | eor | len; if (frag == skb_shinfo(skb)->nr_frags - 1) ctrl |= LastFrag; @@ -849,8 +843,8 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb, txd->opts1 = cpu_to_le32(ctrl); wmb(); + cp->tx_opts[entry] = ctrl; cp->tx_skb[entry] = skb; - entry = NEXT_TX(entry); } txd = &cp->tx_ring[first_entry]; @@ -858,27 +852,17 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb, txd->addr = cpu_to_le64(first_mapping); wmb(); - if (skb->ip_summed == CHECKSUM_PARTIAL) { - if (ip->protocol == IPPROTO_TCP) - txd->opts1 = cpu_to_le32(first_eor | first_len | - FirstFrag | DescOwn | - IPCS | TCPCS); - else if (ip->protocol == IPPROTO_UDP) - txd->opts1 = cpu_to_le32(first_eor | first_len | - FirstFrag | DescOwn | - IPCS | UDPCS); - else - BUG(); - } else - txd->opts1 = cpu_to_le32(first_eor | first_len | - FirstFrag | DescOwn); + ctrl = opts1 | first_eor | first_len | FirstFrag; + txd->opts1 = cpu_to_le32(ctrl); wmb(); + + cp->tx_opts[first_entry] = ctrl; + netif_dbg(cp, tx_queued, cp->dev, "tx queued, slots %d-%d, skblen %d\n", + first_entry, entry, skb->len); } - cp->tx_head = entry; + cp->tx_head = NEXT_TX(entry); netdev_sent_queue(dev, skb->len); - netif_dbg(cp, tx_queued, cp->dev, "tx queued, slot %d, skblen %d\n", - entry, skb->len); if (TX_BUFFS_AVAIL(cp) <= (MAX_SKB_FRAGS + 1)) netif_stop_queue(dev); @@ -1115,6 +1099,7 @@ static int cp_init_rings (struct cp_private *cp) { memset(cp->tx_ring, 0, sizeof(struct cp_desc) * CP_TX_RING_SIZE); cp->tx_ring[CP_TX_RING_SIZE - 1].opts1 = cpu_to_le32(RingEnd); + memset(cp->tx_opts, 0, sizeof(cp->tx_opts)); cp_init_rings_index(cp); @@ -1151,7 +1136,7 @@ static void cp_clean_rings (struct cp_private *cp) desc = cp->rx_ring + i; dma_unmap_single(&cp->pdev->dev,le64_to_cpu(desc->addr), cp->rx_buf_sz, PCI_DMA_FROMDEVICE); - dev_kfree_skb(cp->rx_skb[i]); + dev_kfree_skb_any(cp->rx_skb[i]); } } @@ -1164,7 +1149,7 @@ static void cp_clean_rings (struct cp_private *cp) le32_to_cpu(desc->opts1) & 0xffff, PCI_DMA_TODEVICE); if (le32_to_cpu(desc->opts1) & LastFrag) - dev_kfree_skb(skb); + dev_kfree_skb_any(skb); cp->dev->stats.tx_dropped++; } } @@ -1172,6 +1157,7 @@ static void cp_clean_rings (struct cp_private *cp) memset(cp->rx_ring, 0, sizeof(struct cp_desc) * CP_RX_RING_SIZE); memset(cp->tx_ring, 0, sizeof(struct cp_desc) * CP_TX_RING_SIZE); + memset(cp->tx_opts, 0, sizeof(cp->tx_opts)); memset(cp->rx_skb, 0, sizeof(struct sk_buff *) * CP_RX_RING_SIZE); memset(cp->tx_skb, 0, sizeof(struct sk_buff *) * CP_TX_RING_SIZE); @@ -1249,7 +1235,7 @@ static void cp_tx_timeout(struct net_device *dev) { struct cp_private *cp = netdev_priv(dev); unsigned long flags; - int rc; + int rc, i; netdev_warn(dev, "Transmit timeout, status %2x %4x %4x %4x\n", cpr8(Cmd), cpr16(CpCmd), @@ -1257,13 +1243,26 @@ static void cp_tx_timeout(struct net_device *dev) spin_lock_irqsave(&cp->lock, flags); + netif_dbg(cp, tx_err, cp->dev, "TX ring head %d tail %d desc %x\n", + cp->tx_head, cp->tx_tail, cpr16(TxDmaOkLowDesc)); + for (i = 0; i < CP_TX_RING_SIZE; i++) { + netif_dbg(cp, tx_err, cp->dev, + "TX slot %d @%p: %08x (%08x) %08x %llx %p\n", + i, &cp->tx_ring[i], le32_to_cpu(cp->tx_ring[i].opts1), + cp->tx_opts[i], le32_to_cpu(cp->tx_ring[i].opts2), + 
le64_to_cpu(cp->tx_ring[i].addr), + cp->tx_skb[i]); + } + cp_stop_hw(cp); cp_clean_rings(cp); rc = cp_init_rings(cp); cp_start_hw(cp); - cp_enable_irq(cp); + __cp_set_rx_mode(dev); + cpw16_f(IntrMask, cp_norx_intr_mask); netif_wake_queue(dev); + napi_schedule_irqoff(&cp->napi); spin_unlock_irqrestore(&cp->lock, flags); } diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c index b735fa22ac95..ebf6abc4853f 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c @@ -161,11 +161,16 @@ int stmmac_mdio_reset(struct mii_bus *bus) if (!gpio_request(reset_gpio, "mdio-reset")) { gpio_direction_output(reset_gpio, active_low ? 1 : 0); - udelay(data->delays[0]); + if (data->delays[0]) + msleep(DIV_ROUND_UP(data->delays[0], 1000)); + gpio_set_value(reset_gpio, active_low ? 0 : 1); - udelay(data->delays[1]); + if (data->delays[1]) + msleep(DIV_ROUND_UP(data->delays[1], 1000)); + gpio_set_value(reset_gpio, active_low ? 1 : 0); - udelay(data->delays[2]); + if (data->delays[2]) + msleep(DIV_ROUND_UP(data->delays[2], 1000)); } } #endif diff --git a/drivers/net/ethernet/sun/sunvnet.c b/drivers/net/ethernet/sun/sunvnet.c index 53fe200e0b79..cc106d892e29 100644 --- a/drivers/net/ethernet/sun/sunvnet.c +++ b/drivers/net/ethernet/sun/sunvnet.c @@ -1756,7 +1756,8 @@ static const struct net_device_ops vnet_ops = { #endif }; -static struct vnet *vnet_new(const u64 *local_mac) +static struct vnet *vnet_new(const u64 *local_mac, + struct vio_dev *vdev) { struct net_device *dev; struct vnet *vp; @@ -1790,6 +1791,8 @@ static struct vnet *vnet_new(const u64 *local_mac) NETIF_F_HW_CSUM | NETIF_F_SG; dev->features = dev->hw_features; + SET_NETDEV_DEV(dev, &vdev->dev); + err = register_netdev(dev); if (err) { pr_err("Cannot register net device, aborting\n"); @@ -1808,7 +1811,8 @@ err_out_free_dev: return ERR_PTR(err); } -static struct vnet *vnet_find_or_create(const u64 *local_mac) +static struct vnet *vnet_find_or_create(const u64 *local_mac, + struct vio_dev *vdev) { struct vnet *iter, *vp; @@ -1821,7 +1825,7 @@ static struct vnet *vnet_find_or_create(const u64 *local_mac) } } if (!vp) - vp = vnet_new(local_mac); + vp = vnet_new(local_mac, vdev); mutex_unlock(&vnet_list_mutex); return vp; @@ -1848,7 +1852,8 @@ static void vnet_cleanup(void) static const char *local_mac_prop = "local-mac-address"; static struct vnet *vnet_find_parent(struct mdesc_handle *hp, - u64 port_node) + u64 port_node, + struct vio_dev *vdev) { const u64 *local_mac = NULL; u64 a; @@ -1869,7 +1874,7 @@ static struct vnet *vnet_find_parent(struct mdesc_handle *hp, if (!local_mac) return ERR_PTR(-ENODEV); - return vnet_find_or_create(local_mac); + return vnet_find_or_create(local_mac, vdev); } static struct ldc_channel_config vnet_ldc_cfg = { @@ -1923,7 +1928,7 @@ static int vnet_port_probe(struct vio_dev *vdev, const struct vio_device_id *id) hp = mdesc_grab(); - vp = vnet_find_parent(hp, vdev->mp); + vp = vnet_find_parent(hp, vdev->mp, vdev); if (IS_ERR(vp)) { pr_err("Cannot find port parent vnet\n"); err = PTR_ERR(vp); diff --git a/drivers/net/ethernet/ti/netcp_core.c b/drivers/net/ethernet/ti/netcp_core.c index 1a5aca55ea9f..9f9832f0dea9 100644 --- a/drivers/net/ethernet/ti/netcp_core.c +++ b/drivers/net/ethernet/ti/netcp_core.c @@ -291,13 +291,6 @@ static int netcp_module_probe(struct netcp_device *netcp_device, interface_list) { struct netcp_intf_modpriv *intf_modpriv; - /* If interface not registered then register now */ - 
if (!netcp_intf->netdev_registered) - ret = netcp_register_interface(netcp_intf); - - if (ret) - return -ENODEV; - intf_modpriv = devm_kzalloc(dev, sizeof(*intf_modpriv), GFP_KERNEL); if (!intf_modpriv) @@ -306,6 +299,11 @@ static int netcp_module_probe(struct netcp_device *netcp_device, interface = of_parse_phandle(netcp_intf->node_interface, module->name, 0); + if (!interface) { + devm_kfree(dev, intf_modpriv); + continue; + } + intf_modpriv->netcp_priv = netcp_intf; intf_modpriv->netcp_module = module; list_add_tail(&intf_modpriv->intf_list, @@ -323,6 +321,18 @@ static int netcp_module_probe(struct netcp_device *netcp_device, continue; } } + + /* Now register the interface with netdev */ + list_for_each_entry(netcp_intf, + &netcp_device->interface_head, + interface_list) { + /* If interface not registered then register now */ + if (!netcp_intf->netdev_registered) { + ret = netcp_register_interface(netcp_intf); + if (ret) + return -ENODEV; + } + } return 0; } @@ -357,7 +367,6 @@ int netcp_register_module(struct netcp_module *module) if (ret < 0) goto fail; } - mutex_unlock(&netcp_modules_lock); return 0; @@ -796,7 +805,7 @@ static void netcp_rxpool_free(struct netcp_intf *netcp) netcp->rx_pool = NULL; } -static void netcp_allocate_rx_buf(struct netcp_intf *netcp, int fdq) +static int netcp_allocate_rx_buf(struct netcp_intf *netcp, int fdq) { struct knav_dma_desc *hwdesc; unsigned int buf_len, dma_sz; @@ -810,7 +819,7 @@ static void netcp_allocate_rx_buf(struct netcp_intf *netcp, int fdq) hwdesc = knav_pool_desc_get(netcp->rx_pool); if (IS_ERR_OR_NULL(hwdesc)) { dev_dbg(netcp->ndev_dev, "out of rx pool desc\n"); - return; + return -ENOMEM; } if (likely(fdq == 0)) { @@ -862,25 +871,26 @@ static void netcp_allocate_rx_buf(struct netcp_intf *netcp, int fdq) knav_pool_desc_map(netcp->rx_pool, hwdesc, sizeof(*hwdesc), &dma, &dma_sz); knav_queue_push(netcp->rx_fdq[fdq], dma, sizeof(*hwdesc), 0); - return; + return 0; fail: knav_pool_desc_put(netcp->rx_pool, hwdesc); + return -ENOMEM; } /* Refill Rx FDQ with descriptors & attached buffers */ static void netcp_rxpool_refill(struct netcp_intf *netcp) { u32 fdq_deficit[KNAV_DMA_FDQ_PER_CHAN] = {0}; - int i; + int i, ret = 0; /* Calculate the FDQ deficit and refill */ for (i = 0; i < KNAV_DMA_FDQ_PER_CHAN && netcp->rx_fdq[i]; i++) { fdq_deficit[i] = netcp->rx_queue_depths[i] - knav_queue_get_count(netcp->rx_fdq[i]); - while (fdq_deficit[i]--) - netcp_allocate_rx_buf(netcp, i); + while (fdq_deficit[i]-- && !ret) + ret = netcp_allocate_rx_buf(netcp, i); } /* end for fdqs */ } @@ -893,12 +903,12 @@ static int netcp_rx_poll(struct napi_struct *napi, int budget) packets = netcp_process_rx_packets(netcp, budget); + netcp_rxpool_refill(netcp); if (packets < budget) { napi_complete(&netcp->rx_napi); knav_queue_enable_notify(netcp->rx_queue); } - netcp_rxpool_refill(netcp); return packets; } @@ -1384,7 +1394,6 @@ static void netcp_addr_sweep_del(struct netcp_intf *netcp) continue; dev_dbg(netcp->ndev_dev, "deleting address %pM, type %x\n", naddr->addr, naddr->type); - mutex_lock(&netcp_modules_lock); for_each_module(netcp, priv) { module = priv->netcp_module; if (!module->del_addr) @@ -1393,7 +1402,6 @@ static void netcp_addr_sweep_del(struct netcp_intf *netcp) naddr); WARN_ON(error); } - mutex_unlock(&netcp_modules_lock); netcp_addr_del(netcp, naddr); } } @@ -1410,7 +1418,7 @@ static void netcp_addr_sweep_add(struct netcp_intf *netcp) continue; dev_dbg(netcp->ndev_dev, "adding address %pM, type %x\n", naddr->addr, naddr->type); - 
mutex_lock(&netcp_modules_lock); + for_each_module(netcp, priv) { module = priv->netcp_module; if (!module->add_addr) @@ -1418,7 +1426,6 @@ static void netcp_addr_sweep_add(struct netcp_intf *netcp) error = module->add_addr(priv->module_priv, naddr); WARN_ON(error); } - mutex_unlock(&netcp_modules_lock); } } @@ -1432,6 +1439,7 @@ static void netcp_set_rx_mode(struct net_device *ndev) ndev->flags & IFF_ALLMULTI || netdev_mc_count(ndev) > NETCP_MAX_MCAST_ADDR); + spin_lock(&netcp->lock); /* first clear all marks */ netcp_addr_clear_mark(netcp); @@ -1450,6 +1458,7 @@ static void netcp_set_rx_mode(struct net_device *ndev) /* finally sweep and callout into modules */ netcp_addr_sweep_del(netcp); netcp_addr_sweep_add(netcp); + spin_unlock(&netcp->lock); } static void netcp_free_navigator_resources(struct netcp_intf *netcp) @@ -1614,7 +1623,6 @@ static int netcp_ndo_open(struct net_device *ndev) goto fail; } - mutex_lock(&netcp_modules_lock); for_each_module(netcp, intf_modpriv) { module = intf_modpriv->netcp_module; if (module->open) { @@ -1625,7 +1633,6 @@ static int netcp_ndo_open(struct net_device *ndev) } } } - mutex_unlock(&netcp_modules_lock); napi_enable(&netcp->rx_napi); napi_enable(&netcp->tx_napi); @@ -1642,7 +1649,6 @@ fail_open: if (module->close) module->close(intf_modpriv->module_priv, ndev); } - mutex_unlock(&netcp_modules_lock); fail: netcp_free_navigator_resources(netcp); @@ -1666,7 +1672,6 @@ static int netcp_ndo_stop(struct net_device *ndev) napi_disable(&netcp->rx_napi); napi_disable(&netcp->tx_napi); - mutex_lock(&netcp_modules_lock); for_each_module(netcp, intf_modpriv) { module = intf_modpriv->netcp_module; if (module->close) { @@ -1675,7 +1680,6 @@ static int netcp_ndo_stop(struct net_device *ndev) dev_err(netcp->ndev_dev, "Close failed\n"); } } - mutex_unlock(&netcp_modules_lock); /* Recycle Rx descriptors from completion queue */ netcp_empty_rx_queue(netcp); @@ -1703,7 +1707,6 @@ static int netcp_ndo_ioctl(struct net_device *ndev, if (!netif_running(ndev)) return -EINVAL; - mutex_lock(&netcp_modules_lock); for_each_module(netcp, intf_modpriv) { module = intf_modpriv->netcp_module; if (!module->ioctl) @@ -1719,7 +1722,6 @@ static int netcp_ndo_ioctl(struct net_device *ndev, } out: - mutex_unlock(&netcp_modules_lock); return (ret == 0) ? 
0 : err; } @@ -1754,11 +1756,12 @@ static int netcp_rx_add_vid(struct net_device *ndev, __be16 proto, u16 vid) struct netcp_intf *netcp = netdev_priv(ndev); struct netcp_intf_modpriv *intf_modpriv; struct netcp_module *module; + unsigned long flags; int err = 0; dev_dbg(netcp->ndev_dev, "adding rx vlan id: %d\n", vid); - mutex_lock(&netcp_modules_lock); + spin_lock_irqsave(&netcp->lock, flags); for_each_module(netcp, intf_modpriv) { module = intf_modpriv->netcp_module; if ((module->add_vid) && (vid != 0)) { @@ -1770,7 +1773,8 @@ static int netcp_rx_add_vid(struct net_device *ndev, __be16 proto, u16 vid) } } } - mutex_unlock(&netcp_modules_lock); + spin_unlock_irqrestore(&netcp->lock, flags); + return err; } @@ -1779,11 +1783,12 @@ static int netcp_rx_kill_vid(struct net_device *ndev, __be16 proto, u16 vid) struct netcp_intf *netcp = netdev_priv(ndev); struct netcp_intf_modpriv *intf_modpriv; struct netcp_module *module; + unsigned long flags; int err = 0; dev_dbg(netcp->ndev_dev, "removing rx vlan id: %d\n", vid); - mutex_lock(&netcp_modules_lock); + spin_lock_irqsave(&netcp->lock, flags); for_each_module(netcp, intf_modpriv) { module = intf_modpriv->netcp_module; if (module->del_vid) { @@ -1795,7 +1800,7 @@ static int netcp_rx_kill_vid(struct net_device *ndev, __be16 proto, u16 vid) } } } - mutex_unlock(&netcp_modules_lock); + spin_unlock_irqrestore(&netcp->lock, flags); return err; } @@ -2040,7 +2045,6 @@ static int netcp_probe(struct platform_device *pdev) struct device_node *child, *interfaces; struct netcp_device *netcp_device; struct device *dev = &pdev->dev; - struct netcp_module *module; int ret; if (!node) { @@ -2087,14 +2091,6 @@ static int netcp_probe(struct platform_device *pdev) /* Add the device instance to the list */ list_add_tail(&netcp_device->device_list, &netcp_devices); - /* Probe & attach any modules already registered */ - mutex_lock(&netcp_modules_lock); - for_each_netcp_module(module) { - ret = netcp_module_probe(netcp_device, module); - if (ret < 0) - dev_err(dev, "module(%s) probe failed\n", module->name); - } - mutex_unlock(&netcp_modules_lock); return 0; probe_quit_interface: diff --git a/drivers/net/ethernet/ti/netcp_ethss.c b/drivers/net/ethernet/ti/netcp_ethss.c index 6f16d6aaf7b7..6bff8d82ceab 100644 --- a/drivers/net/ethernet/ti/netcp_ethss.c +++ b/drivers/net/ethernet/ti/netcp_ethss.c @@ -77,6 +77,7 @@ #define GBENU_ALE_OFFSET 0x1e000 #define GBENU_HOST_PORT_NUM 0 #define GBENU_NUM_ALE_ENTRIES 1024 +#define GBENU_SGMII_MODULE_SIZE 0x100 /* 10G Ethernet SS defines */ #define XGBE_MODULE_NAME "netcp-xgbe" @@ -149,8 +150,8 @@ #define XGBE_STATS2_MODULE 2 /* s: 0-based slave_port */ -#define SGMII_BASE(s) \ - (((s) < 2) ? gbe_dev->sgmii_port_regs : gbe_dev->sgmii_port34_regs) +#define SGMII_BASE(d, s) \ + (((s) < 2) ? 
(d)->sgmii_port_regs : (d)->sgmii_port34_regs) #define GBE_TX_QUEUE 648 #define GBE_TXHOOK_ORDER 0 @@ -1997,13 +1998,8 @@ static void netcp_ethss_update_link_state(struct gbe_priv *gbe_dev, return; if (!SLAVE_LINK_IS_XGMII(slave)) { - if (gbe_dev->ss_version == GBE_SS_VERSION_14) - sgmii_link_state = - netcp_sgmii_get_port_link(SGMII_BASE(sp), sp); - else - sgmii_link_state = - netcp_sgmii_get_port_link( - gbe_dev->sgmii_port_regs, sp); + sgmii_link_state = + netcp_sgmii_get_port_link(SGMII_BASE(gbe_dev, sp), sp); } phy_link_state = gbe_phy_link_status(slave); @@ -2100,17 +2096,11 @@ static void gbe_port_config(struct gbe_priv *gbe_dev, struct gbe_slave *slave, static void gbe_sgmii_rtreset(struct gbe_priv *priv, struct gbe_slave *slave, bool set) { - void __iomem *sgmii_port_regs; - if (SLAVE_LINK_IS_XGMII(slave)) return; - if ((priv->ss_version == GBE_SS_VERSION_14) && (slave->slave_num >= 2)) - sgmii_port_regs = priv->sgmii_port34_regs; - else - sgmii_port_regs = priv->sgmii_port_regs; - - netcp_sgmii_rtreset(sgmii_port_regs, slave->slave_num, set); + netcp_sgmii_rtreset(SGMII_BASE(priv, slave->slave_num), + slave->slave_num, set); } static void gbe_slave_stop(struct gbe_intf *intf) @@ -2136,17 +2126,12 @@ static void gbe_slave_stop(struct gbe_intf *intf) static void gbe_sgmii_config(struct gbe_priv *priv, struct gbe_slave *slave) { - void __iomem *sgmii_port_regs; - - sgmii_port_regs = priv->sgmii_port_regs; - if ((priv->ss_version == GBE_SS_VERSION_14) && (slave->slave_num >= 2)) - sgmii_port_regs = priv->sgmii_port34_regs; + if (SLAVE_LINK_IS_XGMII(slave)) + return; - if (!SLAVE_LINK_IS_XGMII(slave)) { - netcp_sgmii_reset(sgmii_port_regs, slave->slave_num); - netcp_sgmii_config(sgmii_port_regs, slave->slave_num, - slave->link_interface); - } + netcp_sgmii_reset(SGMII_BASE(priv, slave->slave_num), slave->slave_num); + netcp_sgmii_config(SGMII_BASE(priv, slave->slave_num), slave->slave_num, + slave->link_interface); } static int gbe_slave_open(struct gbe_intf *gbe_intf) @@ -2997,6 +2982,14 @@ static int set_gbenu_ethss_priv(struct gbe_priv *gbe_dev, gbe_dev->switch_regs = regs; gbe_dev->sgmii_port_regs = gbe_dev->ss_regs + GBENU_SGMII_MODULE_OFFSET; + + /* Although sgmii modules are mem mapped to one contiguous + * region on GBENU devices, setting sgmii_port34_regs allows + * consistent code when accessing sgmii api + */ + gbe_dev->sgmii_port34_regs = gbe_dev->sgmii_port_regs + + (2 * GBENU_SGMII_MODULE_SIZE); + gbe_dev->host_port_regs = gbe_dev->switch_regs + GBENU_HOST_PORT_OFFSET; for (i = 0; i < (gbe_dev->max_num_ports); i++) diff --git a/drivers/net/ethernet/via/Kconfig b/drivers/net/ethernet/via/Kconfig index 2f1264b882b9..d3d094742a7e 100644 --- a/drivers/net/ethernet/via/Kconfig +++ b/drivers/net/ethernet/via/Kconfig @@ -17,7 +17,7 @@ if NET_VENDOR_VIA config VIA_RHINE tristate "VIA Rhine support" - depends on (PCI || OF_IRQ) + depends on PCI || (OF_IRQ && GENERIC_PCI_IOMAP) depends on HAS_DMA select CRC32 select MII diff --git a/drivers/net/ethernet/xilinx/xilinx_emaclite.c b/drivers/net/ethernet/xilinx/xilinx_emaclite.c index 6008eee01a33..cf468c87ce57 100644 --- a/drivers/net/ethernet/xilinx/xilinx_emaclite.c +++ b/drivers/net/ethernet/xilinx/xilinx_emaclite.c @@ -828,6 +828,8 @@ static int xemaclite_mdio_setup(struct net_local *lp, struct device *dev) if (!phydev) dev_info(dev, "MDIO of the phy is not registered yet\n"); + else + put_device(&phydev->dev); return 0; } diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c index b5f4a78da828..2d3848c9dc35 100644 
--- a/drivers/net/fjes/fjes_hw.c +++ b/drivers/net/fjes/fjes_hw.c @@ -1011,11 +1011,11 @@ static void fjes_hw_update_zone_task(struct work_struct *work) set_bit(epidx, &irq_bit); break; } - } - - hw->ep_shm_info[epidx].es_status = info[epidx].es_status; - hw->ep_shm_info[epidx].zone = info[epidx].zone; + hw->ep_shm_info[epidx].es_status = + info[epidx].es_status; + hw->ep_shm_info[epidx].zone = info[epidx].zone; + } break; } diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c index da3259ce7c8d..8f5c02eed47d 100644 --- a/drivers/net/geneve.c +++ b/drivers/net/geneve.c @@ -126,6 +126,8 @@ static void geneve_rx(struct geneve_sock *gs, struct sk_buff *skb) __be32 addr; int err; + iph = ip_hdr(skb); /* outer IP header... */ + if (gs->collect_md) { static u8 zero_vni[3]; @@ -133,7 +135,6 @@ static void geneve_rx(struct geneve_sock *gs, struct sk_buff *skb) addr = 0; } else { vni = gnvh->vni; - iph = ip_hdr(skb); /* Still outer IP header... */ addr = iph->saddr; } @@ -178,7 +179,6 @@ static void geneve_rx(struct geneve_sock *gs, struct sk_buff *skb) skb_reset_network_header(skb); - iph = ip_hdr(skb); /* Now inner IP header... */ err = IP_ECN_decapsulate(iph, skb); if (unlikely(err)) { @@ -626,6 +626,7 @@ static netdev_tx_t geneve_xmit(struct sk_buff *skb, struct net_device *dev) struct geneve_sock *gs = geneve->sock; struct ip_tunnel_info *info = NULL; struct rtable *rt = NULL; + const struct iphdr *iip; /* interior IP header */ struct flowi4 fl4; __u8 tos, ttl; __be16 sport; @@ -653,6 +654,8 @@ static netdev_tx_t geneve_xmit(struct sk_buff *skb, struct net_device *dev) sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true); skb_reset_mac_header(skb); + iip = ip_hdr(skb); + if (info) { const struct ip_tunnel_key *key = &info->key; u8 *opts = NULL; @@ -668,19 +671,16 @@ static netdev_tx_t geneve_xmit(struct sk_buff *skb, struct net_device *dev) if (unlikely(err)) goto err; - tos = key->tos; + tos = ip_tunnel_ecn_encap(key->tos, iip, skb); ttl = key->ttl; df = key->tun_flags & TUNNEL_DONT_FRAGMENT ? 
htons(IP_DF) : 0; } else { - const struct iphdr *iip; /* interior IP header */ - udp_csum = false; err = geneve_build_skb(rt, skb, 0, geneve->vni, 0, NULL, udp_csum); if (unlikely(err)) goto err; - iip = ip_hdr(skb); tos = ip_tunnel_ecn_encap(fl4.flowi4_tos, iip, skb); ttl = geneve->ttl; if (!ttl && IN_MULTICAST(ntohl(fl4.daddr))) @@ -748,12 +748,8 @@ static void geneve_setup(struct net_device *dev) dev->features |= NETIF_F_RXCSUM; dev->features |= NETIF_F_GSO_SOFTWARE; - dev->vlan_features = dev->features; - dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX; - dev->hw_features |= NETIF_F_SG | NETIF_F_HW_CSUM | NETIF_F_RXCSUM; dev->hw_features |= NETIF_F_GSO_SOFTWARE; - dev->hw_features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX; netif_keep_dst(dev); dev->priv_flags |= IFF_LIVE_ADDR_CHANGE | IFF_NO_QUEUE; @@ -819,7 +815,7 @@ static struct geneve_dev *geneve_find_dev(struct geneve_net *gn, static int geneve_configure(struct net *net, struct net_device *dev, __be32 rem_addr, __u32 vni, __u8 ttl, __u8 tos, - __u16 dst_port, bool metadata) + __be16 dst_port, bool metadata) { struct geneve_net *gn = net_generic(net, geneve_net_id); struct geneve_dev *t, *geneve = netdev_priv(dev); @@ -844,10 +840,10 @@ static int geneve_configure(struct net *net, struct net_device *dev, geneve->ttl = ttl; geneve->tos = tos; - geneve->dst_port = htons(dst_port); + geneve->dst_port = dst_port; geneve->collect_md = metadata; - t = geneve_find_dev(gn, htons(dst_port), rem_addr, geneve->vni, + t = geneve_find_dev(gn, dst_port, rem_addr, geneve->vni, &tun_on_same_port, &tun_collect_md); if (t) return -EBUSY; @@ -871,7 +867,7 @@ static int geneve_configure(struct net *net, struct net_device *dev, static int geneve_newlink(struct net *net, struct net_device *dev, struct nlattr *tb[], struct nlattr *data[]) { - __u16 dst_port = GENEVE_UDP_PORT; + __be16 dst_port = htons(GENEVE_UDP_PORT); __u8 ttl = 0, tos = 0; bool metadata = false; __be32 rem_addr; @@ -890,7 +886,7 @@ static int geneve_newlink(struct net *net, struct net_device *dev, tos = nla_get_u8(data[IFLA_GENEVE_TOS]); if (data[IFLA_GENEVE_PORT]) - dst_port = nla_get_u16(data[IFLA_GENEVE_PORT]); + dst_port = nla_get_be16(data[IFLA_GENEVE_PORT]); if (data[IFLA_GENEVE_COLLECT_METADATA]) metadata = true; @@ -913,7 +909,7 @@ static size_t geneve_get_size(const struct net_device *dev) nla_total_size(sizeof(struct in_addr)) + /* IFLA_GENEVE_REMOTE */ nla_total_size(sizeof(__u8)) + /* IFLA_GENEVE_TTL */ nla_total_size(sizeof(__u8)) + /* IFLA_GENEVE_TOS */ - nla_total_size(sizeof(__u16)) + /* IFLA_GENEVE_PORT */ + nla_total_size(sizeof(__be16)) + /* IFLA_GENEVE_PORT */ nla_total_size(0) + /* IFLA_GENEVE_COLLECT_METADATA */ 0; } @@ -935,7 +931,7 @@ static int geneve_fill_info(struct sk_buff *skb, const struct net_device *dev) nla_put_u8(skb, IFLA_GENEVE_TOS, geneve->tos)) goto nla_put_failure; - if (nla_put_u16(skb, IFLA_GENEVE_PORT, ntohs(geneve->dst_port))) + if (nla_put_be16(skb, IFLA_GENEVE_PORT, geneve->dst_port)) goto nla_put_failure; if (geneve->collect_md) { @@ -975,7 +971,7 @@ struct net_device *geneve_dev_create_fb(struct net *net, const char *name, if (IS_ERR(dev)) return dev; - err = geneve_configure(net, dev, 0, 0, 0, 0, dst_port, true); + err = geneve_configure(net, dev, 0, 0, 0, 0, htons(dst_port), true); if (err) { free_netdev(dev); return ERR_PTR(err); diff --git a/drivers/net/irda/ali-ircc.c b/drivers/net/irda/ali-ircc.c index 58ae11a14bb6..64bb44d5d867 100644 --- a/drivers/net/irda/ali-ircc.c +++ b/drivers/net/irda/ali-ircc.c 
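The geneve hunks above move the UDP destination port to __be16 end to end, so the netlink attribute is read and written without byte swapping. A minimal sketch of that attribute handling, assuming a hypothetical MYTUN_ATTR_PORT attribute and struct mytun (nla_get_be16() and nla_put_be16() are the real helpers, 6081 is the Geneve default port):

#include <net/netlink.h>

struct mytun {
        __be16 dst_port;        /* kept in network byte order */
};

static void mytun_parse_port(struct nlattr *data[], struct mytun *t)
{
        t->dst_port = htons(6081);      /* default, stored big-endian */

        if (data[MYTUN_ATTR_PORT])
                t->dst_port = nla_get_be16(data[MYTUN_ATTR_PORT]);
}

static int mytun_fill_port(struct sk_buff *skb, const struct mytun *t)
{
        /* no ntohs()/htons() needed on either side */
        return nla_put_be16(skb, MYTUN_ATTR_PORT, t->dst_port);
}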
@@ -1031,7 +1031,6 @@ static void ali_ircc_fir_change_speed(struct ali_ircc_cb *priv, __u32 baud) static void ali_ircc_sir_change_speed(struct ali_ircc_cb *priv, __u32 speed) { struct ali_ircc_cb *self = priv; - unsigned long flags; int iobase; int fcr; /* FIFO control reg */ int lcr; /* Line control reg */ @@ -1061,8 +1060,6 @@ static void ali_ircc_sir_change_speed(struct ali_ircc_cb *priv, __u32 speed) /* Update accounting for new speed */ self->io.speed = speed; - spin_lock_irqsave(&self->lock, flags); - divisor = 115200/speed; fcr = UART_FCR_ENABLE_FIFO; @@ -1089,9 +1086,6 @@ static void ali_ircc_sir_change_speed(struct ali_ircc_cb *priv, __u32 speed) /* without this, the connection will be broken after come back from FIR speed, but with this, the SIR connection is harder to established */ outb((UART_MCR_DTR | UART_MCR_RTS | UART_MCR_OUT2), iobase+UART_MCR); - - spin_unlock_irqrestore(&self->lock, flags); - } static void ali_ircc_change_dongle_speed(struct ali_ircc_cb *priv, int speed) diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c index edd77342773a..248478c6f6e4 100644 --- a/drivers/net/macvtap.c +++ b/drivers/net/macvtap.c @@ -1111,10 +1111,10 @@ static long macvtap_ioctl(struct file *file, unsigned int cmd, return 0; case TUNSETSNDBUF: - if (get_user(u, up)) + if (get_user(s, sp)) return -EFAULT; - q->sk.sk_sndbuf = u; + q->sk.sk_sndbuf = s; return 0; case TUNGETVNETHDRSZ: diff --git a/drivers/net/phy/fixed_phy.c b/drivers/net/phy/fixed_phy.c index fb1299c6326e..e23bf5b90e17 100644 --- a/drivers/net/phy/fixed_phy.c +++ b/drivers/net/phy/fixed_phy.c @@ -220,7 +220,7 @@ int fixed_phy_update_state(struct phy_device *phydev, struct fixed_mdio_bus *fmb = &platform_fmb; struct fixed_phy *fp; - if (!phydev || !phydev->bus) + if (!phydev || phydev->bus != fmb->mii_bus) return -EINVAL; list_for_each_entry(fp, &fmb->phys, node) { diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c index e6897b6a8a53..5de8d5827536 100644 --- a/drivers/net/phy/marvell.c +++ b/drivers/net/phy/marvell.c @@ -785,6 +785,7 @@ static int marvell_read_status(struct phy_device *phydev) int adv; int err; int lpa; + int lpagb; int status = 0; /* Update the link, but return if there @@ -802,10 +803,17 @@ static int marvell_read_status(struct phy_device *phydev) if (lpa < 0) return lpa; + lpagb = phy_read(phydev, MII_STAT1000); + if (lpagb < 0) + return lpagb; + adv = phy_read(phydev, MII_ADVERTISE); if (adv < 0) return adv; + phydev->lp_advertising = mii_stat1000_to_ethtool_lpa_t(lpagb) | + mii_lpa_to_ethtool_lpa_t(lpa); + lpa &= adv; if (status & MII_M1011_PHY_STATUS_FULLDUPLEX) @@ -853,6 +861,7 @@ static int marvell_read_status(struct phy_device *phydev) phydev->speed = SPEED_10; phydev->pause = phydev->asym_pause = 0; + phydev->lp_advertising = 0; } return 0; diff --git a/drivers/net/phy/mdio-bcm-unimac.c b/drivers/net/phy/mdio-bcm-unimac.c index 6a52a7f0fa0d..4bde5e728fe0 100644 --- a/drivers/net/phy/mdio-bcm-unimac.c +++ b/drivers/net/phy/mdio-bcm-unimac.c @@ -244,6 +244,7 @@ static const struct of_device_id unimac_mdio_ids[] = { { .compatible = "brcm,unimac-mdio", }, { /* sentinel */ }, }; +MODULE_DEVICE_TABLE(of, unimac_mdio_ids); static struct platform_driver unimac_mdio_driver = { .driver = { diff --git a/drivers/net/phy/mdio-gpio.c b/drivers/net/phy/mdio-gpio.c index 7dc21e56a7aa..3bc9f03349f3 100644 --- a/drivers/net/phy/mdio-gpio.c +++ b/drivers/net/phy/mdio-gpio.c @@ -261,6 +261,7 @@ static const struct of_device_id mdio_gpio_of_match[] = { { .compatible = "virtual,mdio-gpio", }, { 
/* sentinel */ } }; +MODULE_DEVICE_TABLE(of, mdio_gpio_of_match); static struct platform_driver mdio_gpio_driver = { .probe = mdio_gpio_probe, diff --git a/drivers/net/phy/mdio-mux.c b/drivers/net/phy/mdio-mux.c index 4d4d25efc1e1..280c7c311f72 100644 --- a/drivers/net/phy/mdio-mux.c +++ b/drivers/net/phy/mdio-mux.c @@ -113,18 +113,18 @@ int mdio_mux_init(struct device *dev, if (!parent_bus_node) return -ENODEV; - parent_bus = of_mdio_find_bus(parent_bus_node); - if (parent_bus == NULL) { - ret_val = -EPROBE_DEFER; - goto err_parent_bus; - } - pb = devm_kzalloc(dev, sizeof(*pb), GFP_KERNEL); if (pb == NULL) { ret_val = -ENOMEM; goto err_parent_bus; } + parent_bus = of_mdio_find_bus(parent_bus_node); + if (parent_bus == NULL) { + ret_val = -EPROBE_DEFER; + goto err_parent_bus; + } + pb->switch_data = data; pb->switch_fn = switch_fn; pb->current_child = -1; @@ -173,6 +173,10 @@ int mdio_mux_init(struct device *dev, dev_info(dev, "Version " DRV_VERSION "\n"); return 0; } + + /* balance the reference of_mdio_find_bus() took */ + put_device(&pb->mii_bus->dev); + err_parent_bus: of_node_put(parent_bus_node); return ret_val; @@ -189,6 +193,9 @@ void mdio_mux_uninit(void *mux_handle) mdiobus_free(cb->mii_bus); cb = cb->next; } + + /* balance the reference of_mdio_find_bus() in mdio_mux_init() took */ + put_device(&pb->mii_bus->dev); } EXPORT_SYMBOL_GPL(mdio_mux_uninit); diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c index 02a4615b65f8..12f44c53cc8e 100644 --- a/drivers/net/phy/mdio_bus.c +++ b/drivers/net/phy/mdio_bus.c @@ -167,7 +167,9 @@ static int of_mdio_bus_match(struct device *dev, const void *mdio_bus_np) * of_mdio_find_bus - Given an mii_bus node, find the mii_bus. * @mdio_bus_np: Pointer to the mii_bus. * - * Returns a pointer to the mii_bus, or NULL if none found. + * Returns a reference to the mii_bus, or NULL if none found. The + * embedded struct device will have its reference count incremented, + * and this must be put once the bus is finished with. * * Because the association of a device_node and mii_bus is made via * of_mdiobus_register(), the mii_bus cannot be found before it is @@ -234,15 +236,18 @@ static inline void of_mdiobus_link_phydev(struct mii_bus *mdio, #endif /** - * mdiobus_register - bring up all the PHYs on a given bus and attach them to bus + * __mdiobus_register - bring up all the PHYs on a given bus and attach them to bus * @bus: target mii_bus + * @owner: module containing bus accessor functions * * Description: Called by a bus driver to bring up all the PHYs - * on a given bus, and attach them to the bus. + * on a given bus, and attach them to the bus. Drivers should use + * mdiobus_register() rather than __mdiobus_register() unless they + * need to pass a specific owner module. * * Returns 0 on success or < 0 on error. 
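The mdio-mux hunks above follow the rule now documented for of_mdio_find_bus(): the bus comes back with its embedded struct device refcount raised, and the caller must put it when done. A minimal sketch of a consumer under that rule (my_use_parent_bus() is illustrative):

#include <linux/of_mdio.h>
#include <linux/phy.h>

static int my_use_parent_bus(struct device_node *parent_np)
{
        struct mii_bus *bus = of_mdio_find_bus(parent_np);

        if (!bus)
                return -EPROBE_DEFER;   /* parent bus not registered yet */

        /* ... address devices behind the bus ... */

        put_device(&bus->dev);          /* balance of_mdio_find_bus() */
        return 0;
}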
*/ -int mdiobus_register(struct mii_bus *bus) +int __mdiobus_register(struct mii_bus *bus, struct module *owner) { int i, err; @@ -253,6 +258,7 @@ int mdiobus_register(struct mii_bus *bus) BUG_ON(bus->state != MDIOBUS_ALLOCATED && bus->state != MDIOBUS_UNREGISTERED); + bus->owner = owner; bus->dev.parent = bus->parent; bus->dev.class = &mdio_bus_class; bus->dev.groups = NULL; @@ -288,13 +294,16 @@ int mdiobus_register(struct mii_bus *bus) error: while (--i >= 0) { - if (bus->phy_map[i]) - device_unregister(&bus->phy_map[i]->dev); + struct phy_device *phydev = bus->phy_map[i]; + if (phydev) { + phy_device_remove(phydev); + phy_device_free(phydev); + } } device_del(&bus->dev); return err; } -EXPORT_SYMBOL(mdiobus_register); +EXPORT_SYMBOL(__mdiobus_register); void mdiobus_unregister(struct mii_bus *bus) { @@ -304,9 +313,11 @@ void mdiobus_unregister(struct mii_bus *bus) bus->state = MDIOBUS_UNREGISTERED; for (i = 0; i < PHY_MAX_ADDR; i++) { - if (bus->phy_map[i]) - device_unregister(&bus->phy_map[i]->dev); - bus->phy_map[i] = NULL; + struct phy_device *phydev = bus->phy_map[i]; + if (phydev) { + phy_device_remove(phydev); + phy_device_free(phydev); + } } device_del(&bus->dev); } diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c index c0f211127274..f761288abe66 100644 --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -384,6 +384,24 @@ int phy_device_register(struct phy_device *phydev) EXPORT_SYMBOL(phy_device_register); /** + * phy_device_remove - Remove a previously registered phy device from the MDIO bus + * @phydev: phy_device structure to remove + * + * This doesn't free the phy_device itself, it merely reverses the effects + * of phy_device_register(). Use phy_device_free() to free the device + * after calling this function. + */ +void phy_device_remove(struct phy_device *phydev) +{ + struct mii_bus *bus = phydev->bus; + int addr = phydev->addr; + + device_del(&phydev->dev); + bus->phy_map[addr] = NULL; +} +EXPORT_SYMBOL(phy_device_remove); + +/** * phy_find_first - finds the first PHY device on the bus * @bus: the target MII bus */ @@ -578,14 +596,22 @@ EXPORT_SYMBOL(phy_init_hw); * generic driver is used. The phy_device is given a ptr to * the attaching device, and given a callback for link status * change. The phy_device is returned to the attaching driver. + * This function takes a reference on the phy device. */ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev, u32 flags, phy_interface_t interface) { + struct mii_bus *bus = phydev->bus; struct device *d = &phydev->dev; - struct module *bus_module; int err; + if (!try_module_get(bus->owner)) { + dev_err(&dev->dev, "failed to get the bus module\n"); + return -EIO; + } + + get_device(d); + /* Assume that if there is no driver, that it doesn't * exist, and we should use the genphy driver. */ @@ -600,20 +626,13 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev, err = device_bind_driver(d); if (err) - return err; + goto error; } if (phydev->attached_dev) { dev_err(&dev->dev, "PHY already attached\n"); - return -EBUSY; - } - - /* Increment the bus module reference count */ - bus_module = phydev->bus->dev.driver ? 
- phydev->bus->dev.driver->owner : NULL; - if (!try_module_get(bus_module)) { - dev_err(&dev->dev, "failed to get the bus module\n"); - return -EIO; + err = -EBUSY; + goto error; } phydev->attached_dev = dev; @@ -636,6 +655,11 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev, phy_resume(phydev); return err; + +error: + put_device(d); + module_put(bus->owner); + return err; } EXPORT_SYMBOL(phy_attach_direct); @@ -677,14 +701,15 @@ EXPORT_SYMBOL(phy_attach); /** * phy_detach - detach a PHY device from its network device * @phydev: target phy_device struct + * + * This detaches the phy device from its network device and the phy + * driver, and drops the reference count taken in phy_attach_direct(). */ void phy_detach(struct phy_device *phydev) { + struct mii_bus *bus; int i; - if (phydev->bus->dev.driver) - module_put(phydev->bus->dev.driver->owner); - phydev->attached_dev->phydev = NULL; phydev->attached_dev = NULL; phy_suspend(phydev); @@ -700,6 +725,15 @@ void phy_detach(struct phy_device *phydev) break; } } + + /* + * The phydev might go away on the put_device() below, so avoid + * a use-after-free bug by reading the underlying bus first. + */ + bus = phydev->bus; + + put_device(&phydev->dev); + module_put(bus->owner); } EXPORT_SYMBOL(phy_detach); diff --git a/drivers/net/phy/vitesse.c b/drivers/net/phy/vitesse.c index 17cad185169d..76cad712ddb2 100644 --- a/drivers/net/phy/vitesse.c +++ b/drivers/net/phy/vitesse.c @@ -66,7 +66,6 @@ #define PHY_ID_VSC8244 0x000fc6c0 #define PHY_ID_VSC8514 0x00070670 #define PHY_ID_VSC8574 0x000704a0 -#define PHY_ID_VSC8641 0x00070431 #define PHY_ID_VSC8662 0x00070660 #define PHY_ID_VSC8221 0x000fc550 #define PHY_ID_VSC8211 0x000fc4b0 @@ -273,18 +272,6 @@ static struct phy_driver vsc82xx_driver[] = { .config_intr = &vsc82xx_config_intr, .driver = { .owner = THIS_MODULE,}, }, { - .phy_id = PHY_ID_VSC8641, - .name = "Vitesse VSC8641", - .phy_id_mask = 0x000ffff0, - .features = PHY_GBIT_FEATURES, - .flags = PHY_HAS_INTERRUPT, - .config_init = &vsc824x_config_init, - .config_aneg = &vsc82x4_config_aneg, - .read_status = &genphy_read_status, - .ack_interrupt = &vsc824x_ack_interrupt, - .config_intr = &vsc82xx_config_intr, - .driver = { .owner = THIS_MODULE,}, -}, { .phy_id = PHY_ID_VSC8662, .name = "Vitesse VSC8662", .phy_id_mask = 0x000ffff0, @@ -331,7 +318,6 @@ static struct mdio_device_id __maybe_unused vitesse_tbl[] = { { PHY_ID_VSC8244, 0x000fffc0 }, { PHY_ID_VSC8514, 0x000ffff0 }, { PHY_ID_VSC8574, 0x000ffff0 }, - { PHY_ID_VSC8641, 0x000ffff0 }, { PHY_ID_VSC8662, 0x000ffff0 }, { PHY_ID_VSC8221, 0x000ffff0 }, { PHY_ID_VSC8211, 0x000ffff0 }, diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c index 0481daf9201a..ed00446759b2 100644 --- a/drivers/net/ppp/ppp_generic.c +++ b/drivers/net/ppp/ppp_generic.c @@ -2755,6 +2755,7 @@ static struct ppp *ppp_create_interface(struct net *net, int unit, */ dev_net_set(dev, net); + rtnl_lock(); mutex_lock(&pn->all_ppp_mutex); if (unit < 0) { @@ -2785,7 +2786,7 @@ static struct ppp *ppp_create_interface(struct net *net, int unit, ppp->file.index = unit; sprintf(dev->name, "ppp%d", unit); - ret = register_netdev(dev); + ret = register_netdevice(dev); if (ret != 0) { unit_put(&pn->units_idr, unit); netdev_err(ppp->dev, "PPP: couldn't register device %s (%d)\n", @@ -2797,6 +2798,7 @@ static struct ppp *ppp_create_interface(struct net *net, int unit, atomic_inc(&ppp_unit_count); mutex_unlock(&pn->all_ppp_mutex); + rtnl_unlock(); *retp = 0; return ppp; diff --git 
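The ppp_generic change above rests on the usual RTNL rule: register_netdev() is register_netdevice() wrapped in rtnl_lock()/rtnl_unlock(), so a path that needs its own mutex nested inside the RTNL has to take the RTNL itself and call the unlocked variant. A minimal sketch of that shape, assuming a hypothetical my_create_interface() and my_mutex:

#include <linux/mutex.h>
#include <linux/netdevice.h>
#include <linux/rtnetlink.h>

static DEFINE_MUTEX(my_mutex);

static int my_create_interface(struct net_device *dev)
{
        int err;

        rtnl_lock();
        mutex_lock(&my_mutex);          /* always nested inside the RTNL */

        err = register_netdevice(dev);  /* RTNL already held */

        mutex_unlock(&my_mutex);
        rtnl_unlock();
        return err;
}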
a/drivers/net/usb/Kconfig b/drivers/net/usb/Kconfig index 1610b79ae386..fbb9325d1f6e 100644 --- a/drivers/net/usb/Kconfig +++ b/drivers/net/usb/Kconfig @@ -583,4 +583,15 @@ config USB_VL600 http://ubuntuforums.org/showpost.php?p=10589647&postcount=17 +config USB_NET_CH9200 + tristate "QingHeng CH9200 USB ethernet support" + depends on USB_USBNET + select MII + help + Choose this option if you have a USB ethernet adapter with a QinHeng + CH9200 chipset. + + To compile this driver as a module, choose M here: the + module will be called ch9200. + endif # USB_NET_DRIVERS diff --git a/drivers/net/usb/Makefile b/drivers/net/usb/Makefile index cf6a0e610a7f..b5f04068dbe4 100644 --- a/drivers/net/usb/Makefile +++ b/drivers/net/usb/Makefile @@ -38,4 +38,4 @@ obj-$(CONFIG_USB_NET_HUAWEI_CDC_NCM) += huawei_cdc_ncm.o obj-$(CONFIG_USB_VL600) += lg-vl600.o obj-$(CONFIG_USB_NET_QMI_WWAN) += qmi_wwan.o obj-$(CONFIG_USB_NET_CDC_MBIM) += cdc_mbim.o - +obj-$(CONFIG_USB_NET_CH9200) += ch9200.o diff --git a/drivers/net/usb/ch9200.c b/drivers/net/usb/ch9200.c new file mode 100644 index 000000000000..5e151e6a3e09 --- /dev/null +++ b/drivers/net/usb/ch9200.c @@ -0,0 +1,432 @@ +/* + * USB 10M/100M ethernet adapter + * + * This file is licensed under the terms of the GNU General Public License + * version 2. This program is licensed "as is" without any warranty of any + * kind, whether express or implied + * + */ + +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/sched.h> +#include <linux/stddef.h> +#include <linux/init.h> +#include <linux/netdevice.h> +#include <linux/etherdevice.h> +#include <linux/ethtool.h> +#include <linux/mii.h> +#include <linux/usb.h> +#include <linux/crc32.h> +#include <linux/usb/usbnet.h> +#include <linux/slab.h> + +#define CH9200_VID 0x1A86 +#define CH9200_PID_E092 0xE092 + +#define CTRL_TIMEOUT_MS 1000 + +#define CONTROL_TIMEOUT_MS 1000 + +#define REQUEST_READ 0x0E +#define REQUEST_WRITE 0x0F + +/* Address space: + * 00-63 : MII + * 64-128: MAC + * + * Note: all accesses must be 16-bit + */ + +#define MAC_REG_CTRL 64 +#define MAC_REG_STATUS 66 +#define MAC_REG_INTERRUPT_MASK 68 +#define MAC_REG_PHY_COMMAND 70 +#define MAC_REG_PHY_DATA 72 +#define MAC_REG_STATION_L 74 +#define MAC_REG_STATION_M 76 +#define MAC_REG_STATION_H 78 +#define MAC_REG_HASH_L 80 +#define MAC_REG_HASH_M1 82 +#define MAC_REG_HASH_M2 84 +#define MAC_REG_HASH_H 86 +#define MAC_REG_THRESHOLD 88 +#define MAC_REG_FIFO_DEPTH 90 +#define MAC_REG_PAUSE 92 +#define MAC_REG_FLOW_CONTROL 94 + +/* Control register bits + * + * Note: bits 13 and 15 are reserved + */ +#define LOOPBACK (0x01 << 14) +#define BASE100X (0x01 << 12) +#define MBPS_10 (0x01 << 11) +#define DUPLEX_MODE (0x01 << 10) +#define PAUSE_FRAME (0x01 << 9) +#define PROMISCUOUS (0x01 << 8) +#define MULTICAST (0x01 << 7) +#define BROADCAST (0x01 << 6) +#define HASH (0x01 << 5) +#define APPEND_PAD (0x01 << 4) +#define APPEND_CRC (0x01 << 3) +#define TRANSMITTER_ACTION (0x01 << 2) +#define RECEIVER_ACTION (0x01 << 1) +#define DMA_ACTION (0x01 << 0) + +/* Status register bits + * + * Note: bits 7-15 are reserved + */ +#define ALIGNMENT (0x01 << 6) +#define FIFO_OVER_RUN (0x01 << 5) +#define FIFO_UNDER_RUN (0x01 << 4) +#define RX_ERROR (0x01 << 3) +#define RX_COMPLETE (0x01 << 2) +#define TX_ERROR (0x01 << 1) +#define TX_COMPLETE (0x01 << 0) + +/* FIFO depth register bits + * + * Note: bits 6 and 14 are reserved + */ + +#define ETH_TXBD (0x01 << 15) +#define ETN_TX_FIFO_DEPTH (0x01 << 8) +#define ETH_RXBD (0x01 << 7) +#define ETH_RX_FIFO_DEPTH 
(0x01 << 0) + +static int control_read(struct usbnet *dev, + unsigned char request, unsigned short value, + unsigned short index, void *data, unsigned short size, + int timeout) +{ + unsigned char *buf = NULL; + unsigned char request_type; + int err = 0; + + if (request == REQUEST_READ) + request_type = (USB_DIR_IN | USB_TYPE_VENDOR | USB_RECIP_OTHER); + else + request_type = (USB_DIR_IN | USB_TYPE_VENDOR | + USB_RECIP_DEVICE); + + netdev_dbg(dev->net, "Control_read() index=0x%02x size=%d\n", + index, size); + + buf = kmalloc(size, GFP_KERNEL); + if (!buf) { + err = -ENOMEM; + goto err_out; + } + + err = usb_control_msg(dev->udev, + usb_rcvctrlpipe(dev->udev, 0), + request, request_type, value, index, buf, size, + timeout); + if (err == size) + memcpy(data, buf, size); + else if (err >= 0) + err = -EINVAL; + kfree(buf); + + return err; + +err_out: + return err; +} + +static int control_write(struct usbnet *dev, unsigned char request, + unsigned short value, unsigned short index, + void *data, unsigned short size, int timeout) +{ + unsigned char *buf = NULL; + unsigned char request_type; + int err = 0; + + if (request == REQUEST_WRITE) + request_type = (USB_DIR_OUT | USB_TYPE_VENDOR | + USB_RECIP_OTHER); + else + request_type = (USB_DIR_OUT | USB_TYPE_VENDOR | + USB_RECIP_DEVICE); + + netdev_dbg(dev->net, "Control_write() index=0x%02x size=%d\n", + index, size); + + if (data) { + buf = kmalloc(size, GFP_KERNEL); + if (!buf) { + err = -ENOMEM; + goto err_out; + } + memcpy(buf, data, size); + } + + err = usb_control_msg(dev->udev, + usb_sndctrlpipe(dev->udev, 0), + request, request_type, value, index, buf, size, + timeout); + if (err >= 0 && err < size) + err = -EINVAL; + kfree(buf); + + return 0; + +err_out: + return err; +} + +static int ch9200_mdio_read(struct net_device *netdev, int phy_id, int loc) +{ + struct usbnet *dev = netdev_priv(netdev); + unsigned char buff[2]; + + netdev_dbg(netdev, "ch9200_mdio_read phy_id:%02x loc:%02x\n", + phy_id, loc); + + if (phy_id != 0) + return -ENODEV; + + control_read(dev, REQUEST_READ, 0, loc * 2, buff, 0x02, + CONTROL_TIMEOUT_MS); + + return (buff[0] | buff[1] << 8); +} + +static void ch9200_mdio_write(struct net_device *netdev, + int phy_id, int loc, int val) +{ + struct usbnet *dev = netdev_priv(netdev); + unsigned char buff[2]; + + netdev_dbg(netdev, "ch9200_mdio_write() phy_id=%02x loc:%02x\n", + phy_id, loc); + + if (phy_id != 0) + return; + + buff[0] = (unsigned char)val; + buff[1] = (unsigned char)(val >> 8); + + control_write(dev, REQUEST_WRITE, 0, loc * 2, buff, 0x02, + CONTROL_TIMEOUT_MS); +} + +static int ch9200_link_reset(struct usbnet *dev) +{ + struct ethtool_cmd ecmd; + + mii_check_media(&dev->mii, 1, 1); + mii_ethtool_gset(&dev->mii, &ecmd); + + netdev_dbg(dev->net, "link_reset() speed:%d duplex:%d\n", + ecmd.speed, ecmd.duplex); + + return 0; +} + +static void ch9200_status(struct usbnet *dev, struct urb *urb) +{ + int link; + unsigned char *buf; + + if (urb->actual_length < 16) + return; + + buf = urb->transfer_buffer; + link = !!(buf[0] & 0x01); + + if (link) { + netif_carrier_on(dev->net); + usbnet_defer_kevent(dev, EVENT_LINK_RESET); + } else { + netif_carrier_off(dev->net); + } +} + +static struct sk_buff *ch9200_tx_fixup(struct usbnet *dev, struct sk_buff *skb, + gfp_t flags) +{ + int i = 0; + int len = 0; + int tx_overhead = 0; + + tx_overhead = 0x40; + + len = skb->len; + if (skb_headroom(skb) < tx_overhead) { + struct sk_buff *skb2; + + skb2 = skb_copy_expand(skb, tx_overhead, 0, flags); + dev_kfree_skb_any(skb); + skb = 
skb2; + if (!skb) + return NULL; + } + + __skb_push(skb, tx_overhead); + /* usbnet adds padding if length is a multiple of packet size + * if so, adjust length value in header + */ + if ((skb->len % dev->maxpacket) == 0) + len++; + + skb->data[0] = len; + skb->data[1] = len >> 8; + skb->data[2] = 0x00; + skb->data[3] = 0x80; + + for (i = 4; i < 48; i++) + skb->data[i] = 0x00; + + skb->data[48] = len; + skb->data[49] = len >> 8; + skb->data[50] = 0x00; + skb->data[51] = 0x80; + + for (i = 52; i < 64; i++) + skb->data[i] = 0x00; + + return skb; +} + +static int ch9200_rx_fixup(struct usbnet *dev, struct sk_buff *skb) +{ + int len = 0; + int rx_overhead = 0; + + rx_overhead = 64; + + if (unlikely(skb->len < rx_overhead)) { + dev_err(&dev->udev->dev, "unexpected tiny rx frame\n"); + return 0; + } + + len = (skb->data[skb->len - 16] | skb->data[skb->len - 15] << 8); + skb_trim(skb, len); + + return 1; +} + +static int get_mac_address(struct usbnet *dev, unsigned char *data) +{ + int err = 0; + unsigned char mac_addr[0x06]; + int rd_mac_len = 0; + + netdev_dbg(dev->net, "get_mac_address:\n\tusbnet VID:%0x PID:%0x\n", + dev->udev->descriptor.idVendor, + dev->udev->descriptor.idProduct); + + memset(mac_addr, 0, sizeof(mac_addr)); + rd_mac_len = control_read(dev, REQUEST_READ, 0, + MAC_REG_STATION_L, mac_addr, 0x02, + CONTROL_TIMEOUT_MS); + rd_mac_len += control_read(dev, REQUEST_READ, 0, MAC_REG_STATION_M, + mac_addr + 2, 0x02, CONTROL_TIMEOUT_MS); + rd_mac_len += control_read(dev, REQUEST_READ, 0, MAC_REG_STATION_H, + mac_addr + 4, 0x02, CONTROL_TIMEOUT_MS); + if (rd_mac_len != ETH_ALEN) + err = -EINVAL; + + data[0] = mac_addr[5]; + data[1] = mac_addr[4]; + data[2] = mac_addr[3]; + data[3] = mac_addr[2]; + data[4] = mac_addr[1]; + data[5] = mac_addr[0]; + + return err; +} + +static int ch9200_bind(struct usbnet *dev, struct usb_interface *intf) +{ + int retval = 0; + unsigned char data[2]; + + retval = usbnet_get_endpoints(dev, intf); + if (retval) + return retval; + + dev->mii.dev = dev->net; + dev->mii.mdio_read = ch9200_mdio_read; + dev->mii.mdio_write = ch9200_mdio_write; + dev->mii.reg_num_mask = 0x1f; + + dev->mii.phy_id_mask = 0x1f; + + dev->hard_mtu = dev->net->mtu + dev->net->hard_header_len; + dev->rx_urb_size = 24 * 64 + 16; + mii_nway_restart(&dev->mii); + + data[0] = 0x01; + data[1] = 0x0F; + retval = control_write(dev, REQUEST_WRITE, 0, MAC_REG_THRESHOLD, data, + 0x02, CONTROL_TIMEOUT_MS); + + data[0] = 0xA0; + data[1] = 0x90; + retval = control_write(dev, REQUEST_WRITE, 0, MAC_REG_FIFO_DEPTH, data, + 0x02, CONTROL_TIMEOUT_MS); + + data[0] = 0x30; + data[1] = 0x00; + retval = control_write(dev, REQUEST_WRITE, 0, MAC_REG_PAUSE, data, + 0x02, CONTROL_TIMEOUT_MS); + + data[0] = 0x17; + data[1] = 0xD8; + retval = control_write(dev, REQUEST_WRITE, 0, MAC_REG_FLOW_CONTROL, + data, 0x02, CONTROL_TIMEOUT_MS); + + /* Undocumented register */ + data[0] = 0x01; + data[1] = 0x00; + retval = control_write(dev, REQUEST_WRITE, 0, 254, data, 0x02, + CONTROL_TIMEOUT_MS); + + data[0] = 0x5F; + data[1] = 0x0D; + retval = control_write(dev, REQUEST_WRITE, 0, MAC_REG_CTRL, data, 0x02, + CONTROL_TIMEOUT_MS); + + retval = get_mac_address(dev, dev->net->dev_addr); + + return retval; +} + +static const struct driver_info ch9200_info = { + .description = "CH9200 USB to Network Adaptor", + .flags = FLAG_ETHER, + .bind = ch9200_bind, + .rx_fixup = ch9200_rx_fixup, + .tx_fixup = ch9200_tx_fixup, + .status = ch9200_status, + .link_reset = ch9200_link_reset, + .reset = ch9200_link_reset, +}; + +static const 
struct usb_device_id ch9200_products[] = { + { + USB_DEVICE(0x1A86, 0xE092), + .driver_info = (unsigned long)&ch9200_info, + }, + {}, +}; + +MODULE_DEVICE_TABLE(usb, ch9200_products); + +static struct usb_driver ch9200_driver = { + .name = "ch9200", + .id_table = ch9200_products, + .probe = usbnet_probe, + .disconnect = usbnet_disconnect, + .suspend = usbnet_suspend, + .resume = usbnet_resume, +}; + +module_usb_driver(ch9200_driver); + +MODULE_DESCRIPTION("QinHeng CH9200 USB Network device"); +MODULE_LICENSE("GPL"); diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c index e7094fbd7568..488c6f50df73 100644 --- a/drivers/net/vrf.c +++ b/drivers/net/vrf.c @@ -193,7 +193,8 @@ static netdev_tx_t vrf_process_v4_outbound(struct sk_buff *skb, .flowi4_oif = vrf_dev->ifindex, .flowi4_iif = LOOPBACK_IFINDEX, .flowi4_tos = RT_TOS(ip4h->tos), - .flowi4_flags = FLOWI_FLAG_ANYSRC | FLOWI_FLAG_VRFSRC, + .flowi4_flags = FLOWI_FLAG_ANYSRC | FLOWI_FLAG_VRFSRC | + FLOWI_FLAG_SKIP_NH_OIF, .daddr = ip4h->daddr, }; diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c index cf8b7f0473b3..bbac1d35ed4e 100644 --- a/drivers/net/vxlan.c +++ b/drivers/net/vxlan.c @@ -2392,10 +2392,6 @@ static void vxlan_setup(struct net_device *dev) eth_hw_addr_random(dev); ether_setup(dev); - if (vxlan->default_dst.remote_ip.sa.sa_family == AF_INET6) - dev->needed_headroom = ETH_HLEN + VXLAN6_HEADROOM; - else - dev->needed_headroom = ETH_HLEN + VXLAN_HEADROOM; dev->netdev_ops = &vxlan_netdev_ops; dev->destructor = free_netdev; @@ -2640,8 +2636,11 @@ static int vxlan_dev_configure(struct net *src_net, struct net_device *dev, dst->remote_ip.sa.sa_family = AF_INET; if (dst->remote_ip.sa.sa_family == AF_INET6 || - vxlan->cfg.saddr.sa.sa_family == AF_INET6) + vxlan->cfg.saddr.sa.sa_family == AF_INET6) { + if (!IS_ENABLED(CONFIG_IPV6)) + return -EPFNOSUPPORT; use_ipv6 = true; + } if (conf->remote_ifindex) { struct net_device *lowerdev @@ -2670,8 +2669,12 @@ static int vxlan_dev_configure(struct net *src_net, struct net_device *dev, dev->needed_headroom = lowerdev->hard_header_len + (use_ipv6 ? VXLAN6_HEADROOM : VXLAN_HEADROOM); - } else if (use_ipv6) + } else if (use_ipv6) { vxlan->flags |= VXLAN_F_IPV6; + dev->needed_headroom = ETH_HLEN + VXLAN6_HEADROOM; + } else { + dev->needed_headroom = ETH_HLEN + VXLAN_HEADROOM; + } memcpy(&vxlan->cfg, conf, sizeof(*conf)); if (!vxlan->cfg.dst_port) diff --git a/drivers/of/of_mdio.c b/drivers/of/of_mdio.c index 1350fa25cdb0..a87a868fed64 100644 --- a/drivers/of/of_mdio.c +++ b/drivers/of/of_mdio.c @@ -197,7 +197,8 @@ static int of_phy_match(struct device *dev, void *phy_np) * of_phy_find_device - Give a PHY node, find the phy_device * @phy_np: Pointer to the phy's device tree node * - * Returns a pointer to the phy_device. + * If successful, returns a pointer to the phy_device with the embedded + * struct device refcount incremented by one, or NULL on failure. */ struct phy_device *of_phy_find_device(struct device_node *phy_np) { @@ -217,7 +218,9 @@ EXPORT_SYMBOL(of_phy_find_device); * @hndlr: Link state callback for the network device * @iface: PHY data interface type * - * Returns a pointer to the phy_device if successful. NULL otherwise + * If successful, returns a pointer to the phy_device with the embedded + * struct device refcount incremented by one, or NULL on failure. The + * refcount must be dropped by calling phy_disconnect() or phy_detach(). 
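The of_mdio.c documentation above spells out the split the code below it implements: the lookup takes one reference, the connect path takes its own, and the lookup reference is dropped before returning. A minimal sketch of that shape (my_connect() is illustrative; of_phy_find_device(), phy_connect_direct(), and put_device() are the real calls):

static struct phy_device *my_connect(struct net_device *ndev,
                                     struct device_node *phy_np,
                                     void (*hndlr)(struct net_device *))
{
        struct phy_device *phy = of_phy_find_device(phy_np);
        int ret;

        if (!phy)
                return NULL;

        ret = phy_connect_direct(ndev, phy, hndlr, PHY_INTERFACE_MODE_RGMII);

        /* the connect path holds its own reference on success */
        put_device(&phy->dev);

        return ret ? NULL : phy;
}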
*/ struct phy_device *of_phy_connect(struct net_device *dev, struct device_node *phy_np, @@ -225,13 +228,19 @@ struct phy_device *of_phy_connect(struct net_device *dev, phy_interface_t iface) { struct phy_device *phy = of_phy_find_device(phy_np); + int ret; if (!phy) return NULL; phy->dev_flags = flags; - return phy_connect_direct(dev, phy, hndlr, iface) ? NULL : phy; + ret = phy_connect_direct(dev, phy, hndlr, iface); + + /* refcount is held by phy_connect_direct() on success */ + put_device(&phy->dev); + + return ret ? NULL : phy; } EXPORT_SYMBOL(of_phy_connect); @@ -241,17 +250,27 @@ EXPORT_SYMBOL(of_phy_connect); * @phy_np: Node pointer for the PHY * @flags: flags to pass to the PHY * @iface: PHY data interface type + * + * If successful, returns a pointer to the phy_device with the embedded + * struct device refcount incremented by one, or NULL on failure. The + * refcount must be dropped by calling phy_disconnect() or phy_detach(). */ struct phy_device *of_phy_attach(struct net_device *dev, struct device_node *phy_np, u32 flags, phy_interface_t iface) { struct phy_device *phy = of_phy_find_device(phy_np); + int ret; if (!phy) return NULL; - return phy_attach_direct(dev, phy, flags, iface) ? NULL : phy; + ret = phy_attach_direct(dev, phy, flags, iface); + + /* refcount is held by phy_attach_direct() on success */ + put_device(&phy->dev); + + return ret ? NULL : phy; } EXPORT_SYMBOL(of_phy_attach); diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 88a00694eda5..2d15e3831440 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -507,6 +507,7 @@ static inline void napi_enable(struct napi_struct *n) BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state)); smp_mb__before_atomic(); clear_bit(NAPI_STATE_SCHED, &n->state); + clear_bit(NAPI_STATE_NPSVC, &n->state); } #ifdef CONFIG_SMP diff --git a/include/linux/phy.h b/include/linux/phy.h index 962387a192f1..4a4e3a092337 100644 --- a/include/linux/phy.h +++ b/include/linux/phy.h @@ -19,6 +19,7 @@ #include <linux/spinlock.h> #include <linux/ethtool.h> #include <linux/mii.h> +#include <linux/module.h> #include <linux/timer.h> #include <linux/workqueue.h> #include <linux/mod_devicetable.h> @@ -153,6 +154,7 @@ struct sk_buff; * PHYs should register using this structure */ struct mii_bus { + struct module *owner; const char *name; char id[MII_BUS_ID_SIZE]; void *priv; @@ -198,7 +200,8 @@ static inline struct mii_bus *mdiobus_alloc(void) return mdiobus_alloc_size(0); } -int mdiobus_register(struct mii_bus *bus); +int __mdiobus_register(struct mii_bus *bus, struct module *owner); +#define mdiobus_register(bus) __mdiobus_register(bus, THIS_MODULE) void mdiobus_unregister(struct mii_bus *bus); void mdiobus_free(struct mii_bus *bus); struct mii_bus *devm_mdiobus_alloc_size(struct device *dev, int sizeof_priv); @@ -742,6 +745,7 @@ struct phy_device *phy_device_create(struct mii_bus *bus, int addr, int phy_id, struct phy_c45_device_ids *c45_ids); struct phy_device *get_phy_device(struct mii_bus *bus, int addr, bool is_c45); int phy_device_register(struct phy_device *phy); +void phy_device_remove(struct phy_device *phydev); int phy_init_hw(struct phy_device *phydev); int phy_suspend(struct phy_device *phydev); int phy_resume(struct phy_device *phydev); diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 2738d355cdf9..2b0a30a6e31c 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -179,6 +179,9 @@ struct nf_bridge_info { u8 bridged_dnat:1; __u16 frag_max_size; struct net_device *physindev; + 
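The phy.h hunk above turns mdiobus_register() into a macro so that THIS_MODULE is evaluated in the registering driver, giving phy_attach_direct() a module to pin with try_module_get(). A minimal sketch of a bus driver under the new interface (foo_mdio_probe() is illustrative; bus->id, read/write ops, and error unwinding are omitted):

#include <linux/phy.h>
#include <linux/platform_device.h>

static int foo_mdio_probe(struct platform_device *pdev)
{
        struct mii_bus *bus = devm_mdiobus_alloc(&pdev->dev);

        if (!bus)
                return -ENOMEM;

        bus->name = "foo-mdio";
        bus->parent = &pdev->dev;

        /* expands to __mdiobus_register(bus, THIS_MODULE) */
        return mdiobus_register(bus);
}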
+ /* always valid & non-NULL from FORWARD on, for physdev match */ + struct net_device *physoutdev; union { /* prerouting: detect dnat in orig/reply direction */ __be32 ipv4_daddr; @@ -189,9 +192,6 @@ struct nf_bridge_info { * skb is out in neigh layer. */ char neigh_header[8]; - - /* always valid & non-NULL from FORWARD on, for physdev match */ - struct net_device *physoutdev; }; }; #endif @@ -2707,6 +2707,9 @@ static inline void skb_postpull_rcsum(struct sk_buff *skb, { if (skb->ip_summed == CHECKSUM_COMPLETE) skb->csum = csum_sub(skb->csum, csum_partial(start, len, 0)); + else if (skb->ip_summed == CHECKSUM_PARTIAL && + skb_checksum_start_offset(skb) <= len) + skb->ip_summed = CHECKSUM_NONE; } unsigned char *skb_pull_rcsum(struct sk_buff *skb, unsigned int len); diff --git a/include/net/flow.h b/include/net/flow.h index acd6a096250e..9b85db85f13c 100644 --- a/include/net/flow.h +++ b/include/net/flow.h @@ -35,6 +35,7 @@ struct flowi_common { #define FLOWI_FLAG_ANYSRC 0x01 #define FLOWI_FLAG_KNOWN_NH 0x02 #define FLOWI_FLAG_VRFSRC 0x04 +#define FLOWI_FLAG_SKIP_NH_OIF 0x08 __u32 flowic_secid; struct flowi_tunnel flowic_tun_key; }; diff --git a/include/net/inet_timewait_sock.h b/include/net/inet_timewait_sock.h index 879d6e5a973b..186f3a1e1b1f 100644 --- a/include/net/inet_timewait_sock.h +++ b/include/net/inet_timewait_sock.h @@ -110,7 +110,19 @@ struct inet_timewait_sock *inet_twsk_alloc(const struct sock *sk, void __inet_twsk_hashdance(struct inet_timewait_sock *tw, struct sock *sk, struct inet_hashinfo *hashinfo); -void inet_twsk_schedule(struct inet_timewait_sock *tw, const int timeo); +void __inet_twsk_schedule(struct inet_timewait_sock *tw, int timeo, + bool rearm); + +static void inline inet_twsk_schedule(struct inet_timewait_sock *tw, int timeo) +{ + __inet_twsk_schedule(tw, timeo, false); +} + +static void inline inet_twsk_reschedule(struct inet_timewait_sock *tw, int timeo) +{ + __inet_twsk_schedule(tw, timeo, true); +} + void inet_twsk_deschedule_put(struct inet_timewait_sock *tw); void inet_twsk_purge(struct inet_hashinfo *hashinfo, diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h index 063d30474cf6..aaf9700fc9e5 100644 --- a/include/net/ip6_fib.h +++ b/include/net/ip6_fib.h @@ -275,7 +275,8 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, struct nl_info *info, struct mx6_config *mxc); int fib6_del(struct rt6_info *rt, struct nl_info *info); -void inet6_rt_notify(int event, struct rt6_info *rt, struct nl_info *info); +void inet6_rt_notify(int event, struct rt6_info *rt, struct nl_info *info, + unsigned int flags); void fib6_run_gc(unsigned long expires, struct net *net, bool force); diff --git a/include/net/ip6_tunnel.h b/include/net/ip6_tunnel.h index b8529aa1dae7..fa915fa0f703 100644 --- a/include/net/ip6_tunnel.h +++ b/include/net/ip6_tunnel.h @@ -32,6 +32,12 @@ struct __ip6_tnl_parm { __be32 o_key; }; +struct ip6_tnl_dst { + seqlock_t lock; + struct dst_entry __rcu *dst; + u32 cookie; +}; + /* IPv6 tunnel */ struct ip6_tnl { struct ip6_tnl __rcu *next; /* next tunnel in list */ @@ -39,8 +45,7 @@ struct ip6_tnl { struct net *net; /* netns for packet i/o */ struct __ip6_tnl_parm parms; /* tunnel configuration parameters */ struct flowi fl; /* flowi template for xmit */ - struct dst_entry *dst_cache; /* cached dst */ - u32 dst_cookie; + struct ip6_tnl_dst __percpu *dst_cache; /* cached dst */ int err_count; unsigned long err_time; @@ -60,9 +65,11 @@ struct ipv6_tlv_tnl_enc_lim { __u8 encap_limit; /* tunnel encapsulation limit */ } __packed; -struct 
dst_entry *ip6_tnl_dst_check(struct ip6_tnl *t); +struct dst_entry *ip6_tnl_dst_get(struct ip6_tnl *t); +int ip6_tnl_dst_init(struct ip6_tnl *t); +void ip6_tnl_dst_destroy(struct ip6_tnl *t); void ip6_tnl_dst_reset(struct ip6_tnl *t); -void ip6_tnl_dst_store(struct ip6_tnl *t, struct dst_entry *dst); +void ip6_tnl_dst_set(struct ip6_tnl *t, struct dst_entry *dst); int ip6_tnl_rcv_ctl(struct ip6_tnl *t, const struct in6_addr *laddr, const struct in6_addr *raddr); int ip6_tnl_xmit_ctl(struct ip6_tnl *t, const struct in6_addr *laddr, @@ -79,7 +86,7 @@ static inline void ip6tunnel_xmit(struct sock *sk, struct sk_buff *skb, struct net_device_stats *stats = &dev->stats; int pkt_len, err; - pkt_len = skb->len; + pkt_len = skb->len - skb_inner_network_offset(skb); err = ip6_local_out_sk(sk, skb); if (net_xmit_eval(err) == 0) { diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h index a37d0432bebd..727d6e9a9685 100644 --- a/include/net/ip_fib.h +++ b/include/net/ip_fib.h @@ -236,8 +236,11 @@ static inline int fib_lookup(struct net *net, const struct flowi4 *flp, rcu_read_lock(); tb = fib_get_table(net, RT_TABLE_MAIN); - if (tb && !fib_table_lookup(tb, flp, res, flags | FIB_LOOKUP_NOREF)) - err = 0; + if (tb) + err = fib_table_lookup(tb, flp, res, flags | FIB_LOOKUP_NOREF); + + if (err == -EAGAIN) + err = -ENETUNREACH; rcu_read_unlock(); @@ -258,7 +261,7 @@ static inline int fib_lookup(struct net *net, struct flowi4 *flp, struct fib_result *res, unsigned int flags) { struct fib_table *tb; - int err; + int err = -ENETUNREACH; flags |= FIB_LOOKUP_NOREF; if (net->ipv4.fib_has_custom_rules) @@ -268,15 +271,20 @@ static inline int fib_lookup(struct net *net, struct flowi4 *flp, res->tclassid = 0; - for (err = 0; !err; err = -ENETUNREACH) { - tb = rcu_dereference_rtnl(net->ipv4.fib_main); - if (tb && !fib_table_lookup(tb, flp, res, flags)) - break; + tb = rcu_dereference_rtnl(net->ipv4.fib_main); + if (tb) + err = fib_table_lookup(tb, flp, res, flags); + + if (!err) + goto out; + + tb = rcu_dereference_rtnl(net->ipv4.fib_default); + if (tb) + err = fib_table_lookup(tb, flp, res, flags); - tb = rcu_dereference_rtnl(net->ipv4.fib_default); - if (tb && !fib_table_lookup(tb, flp, res, flags)) - break; - } +out: + if (err == -EAGAIN) + err = -ENETUNREACH; rcu_read_unlock(); diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h index 9a6a3ba888e8..f6dafec9102c 100644 --- a/include/net/ip_tunnels.h +++ b/include/net/ip_tunnels.h @@ -276,6 +276,8 @@ int iptunnel_pull_header(struct sk_buff *skb, int hdr_len, __be16 inner_proto); int iptunnel_xmit(struct sock *sk, struct rtable *rt, struct sk_buff *skb, __be32 src, __be32 dst, u8 proto, u8 tos, u8 ttl, __be16 df, bool xnet); +struct metadata_dst *iptunnel_metadata_reply(struct metadata_dst *md, + gfp_t flags); struct sk_buff *iptunnel_handle_offloads(struct sk_buff *skb, bool gre_csum, int gso_type_mask); diff --git a/include/net/route.h b/include/net/route.h index cc61cb95f059..f46af256880c 100644 --- a/include/net/route.h +++ b/include/net/route.h @@ -255,7 +255,7 @@ static inline void ip_route_connect_init(struct flowi4 *fl4, __be32 dst, __be32 flow_flags |= FLOWI_FLAG_ANYSRC; if (netif_index_is_vrf(sock_net(sk), oif)) - flow_flags |= FLOWI_FLAG_VRFSRC; + flow_flags |= FLOWI_FLAG_VRFSRC | FLOWI_FLAG_SKIP_NH_OIF; flowi4_init_output(fl4, oif, sk->sk_mark, tos, RT_SCOPE_UNIVERSE, protocol, flow_flags, dst, src, dport, sport); diff --git a/include/uapi/linux/lwtunnel.h b/include/uapi/linux/lwtunnel.h index 34141a5dfe74..f8b01887a495 100644 --- 
a/include/uapi/linux/lwtunnel.h +++ b/include/uapi/linux/lwtunnel.h @@ -21,8 +21,6 @@ enum lwtunnel_ip_t { LWTUNNEL_IP_SRC, LWTUNNEL_IP_TTL, LWTUNNEL_IP_TOS, - LWTUNNEL_IP_SPORT, - LWTUNNEL_IP_DPORT, LWTUNNEL_IP_FLAGS, __LWTUNNEL_IP_MAX, }; @@ -36,8 +34,6 @@ enum lwtunnel_ip6_t { LWTUNNEL_IP6_SRC, LWTUNNEL_IP6_HOPLIMIT, LWTUNNEL_IP6_TC, - LWTUNNEL_IP6_SPORT, - LWTUNNEL_IP6_DPORT, LWTUNNEL_IP6_FLAGS, __LWTUNNEL_IP6_MAX, }; diff --git a/lib/rhashtable.c b/lib/rhashtable.c index cc0c69710dcf..a54ff8949f91 100644 --- a/lib/rhashtable.c +++ b/lib/rhashtable.c @@ -187,10 +187,7 @@ static int rhashtable_rehash_one(struct rhashtable *ht, unsigned int old_hash) head = rht_dereference_bucket(new_tbl->buckets[new_hash], new_tbl, new_hash); - if (rht_is_a_nulls(head)) - INIT_RHT_NULLS_HEAD(entry->next, ht, new_hash); - else - RCU_INIT_POINTER(entry->next, head); + RCU_INIT_POINTER(entry->next, head); rcu_assign_pointer(new_tbl->buckets[new_hash], entry); spin_unlock(new_bucket_lock); diff --git a/net/atm/clip.c b/net/atm/clip.c index 17e55dfecbe2..e07f551a863c 100644 --- a/net/atm/clip.c +++ b/net/atm/clip.c @@ -317,6 +317,9 @@ static int clip_constructor(struct neighbour *neigh) static int clip_encap(struct atm_vcc *vcc, int mode) { + if (!CLIP_VCC(vcc)) + return -EBADFD; + CLIP_VCC(vcc)->encap = mode; return 0; } diff --git a/net/bluetooth/smp.c b/net/bluetooth/smp.c index ad82324f710f..0510a577a7b5 100644 --- a/net/bluetooth/smp.c +++ b/net/bluetooth/smp.c @@ -2311,12 +2311,6 @@ int smp_conn_security(struct hci_conn *hcon, __u8 sec_level) if (!conn) return 1; - chan = conn->smp; - if (!chan) { - BT_ERR("SMP security requested but not available"); - return 1; - } - if (!hci_dev_test_flag(hcon->hdev, HCI_LE_ENABLED)) return 1; @@ -2330,6 +2324,12 @@ int smp_conn_security(struct hci_conn *hcon, __u8 sec_level) if (smp_ltk_encrypt(conn, hcon->pending_sec_level)) return 0; + chan = conn->smp; + if (!chan) { + BT_ERR("SMP security requested but not available"); + return 1; + } + l2cap_chan_lock(chan); /* If SMP is already in progress ignore this request */ diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c index 66efdc21f548..480b3de1a0e3 100644 --- a/net/bridge/br_multicast.c +++ b/net/bridge/br_multicast.c @@ -1006,7 +1006,7 @@ static int br_ip4_multicast_igmp3_report(struct net_bridge *br, ih = igmpv3_report_hdr(skb); num = ntohs(ih->ngrec); - len = sizeof(*ih); + len = skb_transport_offset(skb) + sizeof(*ih); for (i = 0; i < num; i++) { len += sizeof(*grec); @@ -1067,7 +1067,7 @@ static int br_ip6_multicast_mld2_report(struct net_bridge *br, icmp6h = icmp6_hdr(skb); num = ntohs(icmp6h->icmp6_dataun.un_data16[1]); - len = sizeof(*icmp6h); + len = skb_transport_offset(skb) + sizeof(*icmp6h); for (i = 0; i < num; i++) { __be16 *nsrcs, _nsrcs; diff --git a/net/core/dev.c b/net/core/dev.c index 877c84834d81..6bb6470f5b7b 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -4713,6 +4713,8 @@ void napi_disable(struct napi_struct *n) while (test_and_set_bit(NAPI_STATE_SCHED, &n->state)) msleep(1); + while (test_and_set_bit(NAPI_STATE_NPSVC, &n->state)) + msleep(1); hrtimer_cancel(&n->timer); diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c index bf77e3639ce0..365de66436ac 100644 --- a/net/core/fib_rules.c +++ b/net/core/fib_rules.c @@ -631,15 +631,17 @@ static int dump_rules(struct sk_buff *skb, struct netlink_callback *cb, { int idx = 0; struct fib_rule *rule; + int err = 0; rcu_read_lock(); list_for_each_entry_rcu(rule, &ops->rules_list, list) { if (idx < cb->args[1]) goto skip; - 
if (fib_nl_fill_rule(skb, rule, NETLINK_CB(cb->skb).portid, - cb->nlh->nlmsg_seq, RTM_NEWRULE, - NLM_F_MULTI, ops) < 0) + err = fib_nl_fill_rule(skb, rule, NETLINK_CB(cb->skb).portid, + cb->nlh->nlmsg_seq, RTM_NEWRULE, + NLM_F_MULTI, ops); + if (err) break; skip: idx++; @@ -648,7 +650,7 @@ skip: cb->args[1] = idx; rules_ops_put(ops); - return skb->len; + return err; } static int fib_nl_dumprule(struct sk_buff *skb, struct netlink_callback *cb) @@ -664,7 +666,9 @@ static int fib_nl_dumprule(struct sk_buff *skb, struct netlink_callback *cb) if (ops == NULL) return -EAFNOSUPPORT; - return dump_rules(skb, cb, ops); + dump_rules(skb, cb, ops); + + return skb->len; } rcu_read_lock(); diff --git a/net/core/filter.c b/net/core/filter.c index 13079f03902e..05a04ea87172 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -478,9 +478,9 @@ do_pass: bpf_src = BPF_X; } else { insn->dst_reg = BPF_REG_A; - insn->src_reg = BPF_REG_X; insn->imm = fp->k; bpf_src = BPF_SRC(fp->code); + insn->src_reg = bpf_src == BPF_X ? BPF_REG_X : 0; } /* Common case where 'jump_false' is next insn. */ diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c index b279077c3089..805a95a48107 100644 --- a/net/core/net-sysfs.c +++ b/net/core/net-sysfs.c @@ -1481,6 +1481,15 @@ static int of_dev_node_match(struct device *dev, const void *data) return ret == 0 ? dev->of_node == data : ret; } +/* + * of_find_net_device_by_node - lookup the net device for the device node + * @np: OF device node + * + * Looks up the net_device structure corresponding with the device node. + * If successful, returns a pointer to the net_device with the embedded + * struct device refcount incremented by one, or NULL on failure. The + * refcount must be dropped when done with the net_device. + */ struct net_device *of_find_net_device_by_node(struct device_node *np) { struct device *dev; diff --git a/net/core/netpoll.c b/net/core/netpoll.c index 6aa3db8dfc3b..8bdada242a7d 100644 --- a/net/core/netpoll.c +++ b/net/core/netpoll.c @@ -142,7 +142,7 @@ static void queue_process(struct work_struct *work) */ static int poll_one_napi(struct napi_struct *napi, int budget) { - int work; + int work = 0; /* net_rx_action's ->poll() invocations and our's are * synchronized by this test which is only made while @@ -151,7 +151,12 @@ static int poll_one_napi(struct napi_struct *napi, int budget) if (!test_bit(NAPI_STATE_SCHED, &napi->state)) return budget; - set_bit(NAPI_STATE_NPSVC, &napi->state); + /* If we set this bit but see that it has already been set, + * that indicates that napi has been disabled and we need + * to abort this operation + */ + if (test_and_set_bit(NAPI_STATE_NPSVC, &napi->state)) + goto out; work = napi->poll(napi, budget); WARN_ONCE(work > budget, "%pF exceeded budget in poll\n", napi->poll); @@ -159,6 +164,7 @@ static int poll_one_napi(struct napi_struct *napi, int budget) clear_bit(NAPI_STATE_NPSVC, &napi->state); +out: return budget - work; } diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index a466821d1441..0ec48403ed68 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -3047,6 +3047,7 @@ static int rtnl_bridge_getlink(struct sk_buff *skb, struct netlink_callback *cb) u32 portid = NETLINK_CB(cb->skb).portid; u32 seq = cb->nlh->nlmsg_seq; u32 filter_mask = 0; + int err; if (nlmsg_len(cb->nlh) > sizeof(struct ifinfomsg)) { struct nlattr *extfilt; @@ -3067,20 +3068,25 @@ static int rtnl_bridge_getlink(struct sk_buff *skb, struct netlink_callback *cb) struct net_device *br_dev = netdev_master_upper_dev_get(dev); if 
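The fib_rules fix above leans on the standard netlink dump contract: the fill loop stops when the skb is full and records where to resume, and the dump callback's non-zero return (skb->len) tells the core to call it again. A minimal sketch of that contract, assuming hypothetical my_count() and my_fill_one() helpers:

static int my_dump(struct sk_buff *skb, struct netlink_callback *cb)
{
        int idx;

        for (idx = cb->args[0]; idx < my_count(); idx++) {
                if (my_fill_one(skb, idx, NETLINK_CB(cb->skb).portid,
                                cb->nlh->nlmsg_seq, NLM_F_MULTI) < 0)
                        break;          /* skb full: resume from idx next time */
        }
        cb->args[0] = idx;

        return skb->len;        /* non-zero: netlink core will call again */
}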
(br_dev && br_dev->netdev_ops->ndo_bridge_getlink) { - if (idx >= cb->args[0] && - br_dev->netdev_ops->ndo_bridge_getlink( - skb, portid, seq, dev, filter_mask, - NLM_F_MULTI) < 0) - break; + if (idx >= cb->args[0]) { + err = br_dev->netdev_ops->ndo_bridge_getlink( + skb, portid, seq, dev, + filter_mask, NLM_F_MULTI); + if (err < 0 && err != -EOPNOTSUPP) + break; + } idx++; } if (ops->ndo_bridge_getlink) { - if (idx >= cb->args[0] && - ops->ndo_bridge_getlink(skb, portid, seq, dev, - filter_mask, - NLM_F_MULTI) < 0) - break; + if (idx >= cb->args[0]) { + err = ops->ndo_bridge_getlink(skb, portid, + seq, dev, + filter_mask, + NLM_F_MULTI); + if (err < 0 && err != -EOPNOTSUPP) + break; + } idx++; } } diff --git a/net/core/sock.c b/net/core/sock.c index ca2984afe16e..3307c02244d3 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -2740,10 +2740,8 @@ static void req_prot_cleanup(struct request_sock_ops *rsk_prot) return; kfree(rsk_prot->slab_name); rsk_prot->slab_name = NULL; - if (rsk_prot->slab) { - kmem_cache_destroy(rsk_prot->slab); - rsk_prot->slab = NULL; - } + kmem_cache_destroy(rsk_prot->slab); + rsk_prot->slab = NULL; } static int req_prot_init(const struct proto *prot) @@ -2828,10 +2826,8 @@ void proto_unregister(struct proto *prot) list_del(&prot->node); mutex_unlock(&proto_list_mutex); - if (prot->slab != NULL) { - kmem_cache_destroy(prot->slab); - prot->slab = NULL; - } + kmem_cache_destroy(prot->slab); + prot->slab = NULL; req_prot_cleanup(prot->rsk_prot); diff --git a/net/dccp/ackvec.c b/net/dccp/ackvec.c index bd9e718c2a20..3de0d0362d7f 100644 --- a/net/dccp/ackvec.c +++ b/net/dccp/ackvec.c @@ -398,12 +398,8 @@ out_err: void dccp_ackvec_exit(void) { - if (dccp_ackvec_slab != NULL) { - kmem_cache_destroy(dccp_ackvec_slab); - dccp_ackvec_slab = NULL; - } - if (dccp_ackvec_record_slab != NULL) { - kmem_cache_destroy(dccp_ackvec_record_slab); - dccp_ackvec_record_slab = NULL; - } + kmem_cache_destroy(dccp_ackvec_slab); + dccp_ackvec_slab = NULL; + kmem_cache_destroy(dccp_ackvec_record_slab); + dccp_ackvec_record_slab = NULL; } diff --git a/net/dccp/ccid.c b/net/dccp/ccid.c index 83498975165f..90f77d08cc37 100644 --- a/net/dccp/ccid.c +++ b/net/dccp/ccid.c @@ -95,8 +95,7 @@ static struct kmem_cache *ccid_kmem_cache_create(int obj_size, char *slab_name_f static void ccid_kmem_cache_destroy(struct kmem_cache *slab) { - if (slab != NULL) - kmem_cache_destroy(slab); + kmem_cache_destroy(slab); } static int __init ccid_activate(struct ccid_operations *ccid_ops) diff --git a/net/dccp/minisocks.c b/net/dccp/minisocks.c index 30addee2dd03..838f524cf11a 100644 --- a/net/dccp/minisocks.c +++ b/net/dccp/minisocks.c @@ -48,8 +48,6 @@ void dccp_time_wait(struct sock *sk, int state, int timeo) tw->tw_ipv6only = sk->sk_ipv6only; } #endif - /* Linkage updates. */ - __inet_twsk_hashdance(tw, sk, &dccp_hashinfo); /* Get the TIME_WAIT timeout firing. */ if (timeo < rto) @@ -60,6 +58,8 @@ void dccp_time_wait(struct sock *sk, int state, int timeo) timeo = DCCP_TIMEWAIT_LEN; inet_twsk_schedule(tw, timeo); + /* Linkage updates. 
*/ + __inet_twsk_hashdance(tw, sk, &dccp_hashinfo); inet_twsk_put(tw); } else { /* Sorry, if we're out of memory, just CLOSE this diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c index 76e3800765f8..c59fa5d9c22c 100644 --- a/net/dsa/dsa.c +++ b/net/dsa/dsa.c @@ -634,6 +634,10 @@ static void dsa_of_free_platform_data(struct dsa_platform_data *pd) port_index++; } kfree(pd->chip[i].rtable); + + /* Drop our reference to the MDIO bus device */ + if (pd->chip[i].host_dev) + put_device(pd->chip[i].host_dev); } kfree(pd->chip); } @@ -661,16 +665,22 @@ static int dsa_of_probe(struct device *dev) return -EPROBE_DEFER; ethernet = of_parse_phandle(np, "dsa,ethernet", 0); - if (!ethernet) - return -EINVAL; + if (!ethernet) { + ret = -EINVAL; + goto out_put_mdio; + } ethernet_dev = of_find_net_device_by_node(ethernet); - if (!ethernet_dev) - return -EPROBE_DEFER; + if (!ethernet_dev) { + ret = -EPROBE_DEFER; + goto out_put_mdio; + } pd = kzalloc(sizeof(*pd), GFP_KERNEL); - if (!pd) - return -ENOMEM; + if (!pd) { + ret = -ENOMEM; + goto out_put_ethernet; + } dev->platform_data = pd; pd->of_netdev = ethernet_dev; @@ -691,7 +701,9 @@ static int dsa_of_probe(struct device *dev) cd = &pd->chip[chip_index]; cd->of_node = child; - cd->host_dev = &mdio_bus->dev; + + /* When assigning the host device, increment its refcount */ + cd->host_dev = get_device(&mdio_bus->dev); sw_addr = of_get_property(child, "reg", NULL); if (!sw_addr) @@ -711,6 +723,12 @@ static int dsa_of_probe(struct device *dev) ret = -EPROBE_DEFER; goto out_free_chip; } + + /* Drop the mdio_bus device ref, replacing the host + * device with the mdio_bus_switch device, keeping + * the refcount from of_mdio_find_bus() above. + */ + put_device(cd->host_dev); cd->host_dev = &mdio_bus_switch->dev; } @@ -744,6 +762,10 @@ static int dsa_of_probe(struct device *dev) } } + /* The individual chips hold their own refcount on the mdio bus, + * so drop ours */ + put_device(&mdio_bus->dev); + return 0; out_free_chip: @@ -751,6 +773,10 @@ out_free_chip: out_free: kfree(pd); dev->platform_data = NULL; +out_put_ethernet: + put_device(ðernet_dev->dev); +out_put_mdio: + put_device(&mdio_bus->dev); return ret; } @@ -762,6 +788,7 @@ static void dsa_of_remove(struct device *dev) return; dsa_of_free_platform_data(pd); + put_device(&pd->of_netdev->dev); kfree(pd); } #else diff --git a/net/dsa/tag_trailer.c b/net/dsa/tag_trailer.c index d25efc93d8f1..b6ca0890d018 100644 --- a/net/dsa/tag_trailer.c +++ b/net/dsa/tag_trailer.c @@ -78,7 +78,7 @@ static int trailer_rcv(struct sk_buff *skb, struct net_device *dev, trailer = skb_tail_pointer(skb) - 4; if (trailer[0] != 0x80 || (trailer[1] & 0xf8) != 0x00 || - (trailer[3] & 0xef) != 0x00 || trailer[3] != 0x00) + (trailer[2] & 0xef) != 0x00 || trailer[3] != 0x00) goto out_drop; source_port = trailer[1] & 7; diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c index 30409b75e925..f03db8b7abee 100644 --- a/net/ipv4/arp.c +++ b/net/ipv4/arp.c @@ -113,6 +113,8 @@ #include <net/arp.h> #include <net/ax25.h> #include <net/netrom.h> +#include <net/dst_metadata.h> +#include <net/ip_tunnels.h> #include <linux/uaccess.h> @@ -296,7 +298,8 @@ static void arp_send_dst(int type, int ptype, __be32 dest_ip, struct net_device *dev, __be32 src_ip, const unsigned char *dest_hw, const unsigned char *src_hw, - const unsigned char *target_hw, struct sk_buff *oskb) + const unsigned char *target_hw, + struct dst_entry *dst) { struct sk_buff *skb; @@ -309,9 +312,7 @@ static void arp_send_dst(int type, int ptype, __be32 dest_ip, if (!skb) return; - if (oskb) - 
skb_dst_copy(skb, oskb); - + skb_dst_set(skb, dst); arp_xmit(skb); } @@ -333,6 +334,7 @@ static void arp_solicit(struct neighbour *neigh, struct sk_buff *skb) __be32 target = *(__be32 *)neigh->primary_key; int probes = atomic_read(&neigh->probes); struct in_device *in_dev; + struct dst_entry *dst = NULL; rcu_read_lock(); in_dev = __in_dev_get_rcu(dev); @@ -381,9 +383,10 @@ static void arp_solicit(struct neighbour *neigh, struct sk_buff *skb) } } + if (skb && !(dev->priv_flags & IFF_XMIT_DST_RELEASE)) + dst = dst_clone(skb_dst(skb)); arp_send_dst(ARPOP_REQUEST, ETH_P_ARP, target, dev, saddr, - dst_hw, dev->dev_addr, NULL, - dev->priv_flags & IFF_XMIT_DST_RELEASE ? NULL : skb); + dst_hw, dev->dev_addr, NULL, dst); } static int arp_ignore(struct in_device *in_dev, __be32 sip, __be32 tip) @@ -649,6 +652,7 @@ static int arp_process(struct sock *sk, struct sk_buff *skb) int addr_type; struct neighbour *n; struct net *net = dev_net(dev); + struct dst_entry *reply_dst = NULL; bool is_garp = false; /* arp_rcv below verifies the ARP header and verifies the device @@ -749,13 +753,18 @@ static int arp_process(struct sock *sk, struct sk_buff *skb) * cache. */ + if (arp->ar_op == htons(ARPOP_REQUEST) && skb_metadata_dst(skb)) + reply_dst = (struct dst_entry *) + iptunnel_metadata_reply(skb_metadata_dst(skb), + GFP_ATOMIC); + /* Special case: IPv4 duplicate address detection packet (RFC2131) */ if (sip == 0) { if (arp->ar_op == htons(ARPOP_REQUEST) && inet_addr_type_dev_table(net, dev, tip) == RTN_LOCAL && !arp_ignore(in_dev, sip, tip)) - arp_send(ARPOP_REPLY, ETH_P_ARP, sip, dev, tip, sha, - dev->dev_addr, sha); + arp_send_dst(ARPOP_REPLY, ETH_P_ARP, sip, dev, tip, + sha, dev->dev_addr, sha, reply_dst); goto out; } @@ -774,9 +783,10 @@ static int arp_process(struct sock *sk, struct sk_buff *skb) if (!dont_send) { n = neigh_event_ns(&arp_tbl, sha, &sip, dev); if (n) { - arp_send(ARPOP_REPLY, ETH_P_ARP, sip, - dev, tip, sha, dev->dev_addr, - sha); + arp_send_dst(ARPOP_REPLY, ETH_P_ARP, + sip, dev, tip, sha, + dev->dev_addr, sha, + reply_dst); neigh_release(n); } } @@ -794,9 +804,10 @@ static int arp_process(struct sock *sk, struct sk_buff *skb) if (NEIGH_CB(skb)->flags & LOCALLY_ENQUEUED || skb->pkt_type == PACKET_HOST || NEIGH_VAR(in_dev->arp_parms, PROXY_DELAY) == 0) { - arp_send(ARPOP_REPLY, ETH_P_ARP, sip, - dev, tip, sha, dev->dev_addr, - sha); + arp_send_dst(ARPOP_REPLY, ETH_P_ARP, + sip, dev, tip, sha, + dev->dev_addr, sha, + reply_dst); } else { pneigh_enqueue(&arp_tbl, in_dev->arp_parms, skb); diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c index 26d6ffb6d23c..6c2af797f2f9 100644 --- a/net/ipv4/fib_trie.c +++ b/net/ipv4/fib_trie.c @@ -1426,7 +1426,7 @@ found: nh->nh_flags & RTNH_F_LINKDOWN && !(fib_flags & FIB_LOOKUP_IGNORE_LINKSTATE)) continue; - if (!(flp->flowi4_flags & FLOWI_FLAG_VRFSRC)) { + if (!(flp->flowi4_flags & FLOWI_FLAG_SKIP_NH_OIF)) { if (flp->flowi4_oif && flp->flowi4_oif != nh->nh_oif) continue; diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c index 79fe05befcae..e5eb8ac4089d 100644 --- a/net/ipv4/icmp.c +++ b/net/ipv4/icmp.c @@ -427,7 +427,7 @@ static void icmp_reply(struct icmp_bxm *icmp_param, struct sk_buff *skb) fl4.flowi4_mark = mark; fl4.flowi4_tos = RT_TOS(ip_hdr(skb)->tos); fl4.flowi4_proto = IPPROTO_ICMP; - fl4.flowi4_oif = vrf_master_ifindex(skb->dev) ? 
: skb->dev->ifindex; + fl4.flowi4_oif = vrf_master_ifindex(skb->dev); security_skb_classify_flow(skb, flowi4_to_flowi(&fl4)); rt = ip_route_output_key(net, &fl4); if (IS_ERR(rt)) @@ -461,7 +461,7 @@ static struct rtable *icmp_route_lookup(struct net *net, fl4->flowi4_proto = IPPROTO_ICMP; fl4->fl4_icmp_type = type; fl4->fl4_icmp_code = code; - fl4->flowi4_oif = vrf_master_ifindex(skb_in->dev) ? : skb_in->dev->ifindex; + fl4->flowi4_oif = vrf_master_ifindex(skb_in->dev); security_skb_classify_flow(skb_in, flowi4_to_flowi(fl4)); rt = __ip_route_output_key(net, fl4); diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c index 134957159c27..7bb9c39e0a4d 100644 --- a/net/ipv4/inet_connection_sock.c +++ b/net/ipv4/inet_connection_sock.c @@ -685,20 +685,20 @@ void reqsk_queue_hash_req(struct request_sock_queue *queue, req->num_timeout = 0; req->sk = NULL; + setup_timer(&req->rsk_timer, reqsk_timer_handler, (unsigned long)req); + mod_timer_pinned(&req->rsk_timer, jiffies + timeout); + req->rsk_hash = hash; + /* before letting lookups find us, make sure all req fields * are committed to memory and refcnt initialized. */ smp_wmb(); atomic_set(&req->rsk_refcnt, 2); - setup_timer(&req->rsk_timer, reqsk_timer_handler, (unsigned long)req); - req->rsk_hash = hash; spin_lock(&queue->syn_wait_lock); req->dl_next = lopt->syn_table[hash]; lopt->syn_table[hash] = req; spin_unlock(&queue->syn_wait_lock); - - mod_timer_pinned(&req->rsk_timer, jiffies + timeout); } EXPORT_SYMBOL(reqsk_queue_hash_req); diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c index ae22cc24fbe8..c67f9bd7699c 100644 --- a/net/ipv4/inet_timewait_sock.c +++ b/net/ipv4/inet_timewait_sock.c @@ -123,13 +123,15 @@ void __inet_twsk_hashdance(struct inet_timewait_sock *tw, struct sock *sk, /* * Step 2: Hash TW into tcp ehash chain. * Notes : - * - tw_refcnt is set to 3 because : + * - tw_refcnt is set to 4 because : * - We have one reference from bhash chain. * - We have one reference from ehash chain. + * - We have one reference from timer. + * - One reference for ourself (our caller will release it). * We can use atomic_set() because prior spin_lock()/spin_unlock() * committed into memory all tw fields. 
*/ - atomic_set(&tw->tw_refcnt, 1 + 1 + 1); + atomic_set(&tw->tw_refcnt, 4); inet_twsk_add_node_rcu(tw, &ehead->chain); /* Step 3: Remove SK from hash chain */ @@ -217,7 +219,7 @@ void inet_twsk_deschedule_put(struct inet_timewait_sock *tw) } EXPORT_SYMBOL(inet_twsk_deschedule_put); -void inet_twsk_schedule(struct inet_timewait_sock *tw, const int timeo) +void __inet_twsk_schedule(struct inet_timewait_sock *tw, int timeo, bool rearm) { /* timeout := RTO * 3.5 * @@ -245,12 +247,14 @@ void inet_twsk_schedule(struct inet_timewait_sock *tw, const int timeo) */ tw->tw_kill = timeo <= 4*HZ; - if (!mod_timer_pinned(&tw->tw_timer, jiffies + timeo)) { - atomic_inc(&tw->tw_refcnt); + if (!rearm) { + BUG_ON(mod_timer_pinned(&tw->tw_timer, jiffies + timeo)); atomic_inc(&tw->tw_dr->tw_count); + } else { + mod_timer_pending(&tw->tw_timer, jiffies + timeo); } } -EXPORT_SYMBOL_GPL(inet_twsk_schedule); +EXPORT_SYMBOL_GPL(__inet_twsk_schedule); void inet_twsk_purge(struct inet_hashinfo *hashinfo, struct inet_timewait_death_row *twdr, int family) diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c index 29ed6c5a5185..84dce6a92f93 100644 --- a/net/ipv4/ip_tunnel_core.c +++ b/net/ipv4/ip_tunnel_core.c @@ -46,12 +46,13 @@ #include <net/net_namespace.h> #include <net/netns/generic.h> #include <net/rtnetlink.h> +#include <net/dst_metadata.h> int iptunnel_xmit(struct sock *sk, struct rtable *rt, struct sk_buff *skb, __be32 src, __be32 dst, __u8 proto, __u8 tos, __u8 ttl, __be16 df, bool xnet) { - int pkt_len = skb->len; + int pkt_len = skb->len - skb_inner_network_offset(skb); struct iphdr *iph; int err; @@ -119,6 +120,33 @@ int iptunnel_pull_header(struct sk_buff *skb, int hdr_len, __be16 inner_proto) } EXPORT_SYMBOL_GPL(iptunnel_pull_header); +struct metadata_dst *iptunnel_metadata_reply(struct metadata_dst *md, + gfp_t flags) +{ + struct metadata_dst *res; + struct ip_tunnel_info *dst, *src; + + if (!md || md->u.tun_info.mode & IP_TUNNEL_INFO_TX) + return NULL; + + res = metadata_dst_alloc(0, flags); + if (!res) + return NULL; + + dst = &res->u.tun_info; + src = &md->u.tun_info; + dst->key.tun_id = src->key.tun_id; + if (src->mode & IP_TUNNEL_INFO_IPV6) + memcpy(&dst->key.u.ipv6.dst, &src->key.u.ipv6.src, + sizeof(struct in6_addr)); + else + dst->key.u.ipv4.dst = src->key.u.ipv4.src; + dst->mode = src->mode | IP_TUNNEL_INFO_TX; + + return res; +} +EXPORT_SYMBOL_GPL(iptunnel_metadata_reply); + struct sk_buff *iptunnel_handle_offloads(struct sk_buff *skb, bool csum_help, int gso_type_mask) @@ -198,8 +226,6 @@ static const struct nla_policy ip_tun_policy[LWTUNNEL_IP_MAX + 1] = { [LWTUNNEL_IP_SRC] = { .type = NLA_U32 }, [LWTUNNEL_IP_TTL] = { .type = NLA_U8 }, [LWTUNNEL_IP_TOS] = { .type = NLA_U8 }, - [LWTUNNEL_IP_SPORT] = { .type = NLA_U16 }, - [LWTUNNEL_IP_DPORT] = { .type = NLA_U16 }, [LWTUNNEL_IP_FLAGS] = { .type = NLA_U16 }, }; @@ -239,12 +265,6 @@ static int ip_tun_build_state(struct net_device *dev, struct nlattr *attr, if (tb[LWTUNNEL_IP_TOS]) tun_info->key.tos = nla_get_u8(tb[LWTUNNEL_IP_TOS]); - if (tb[LWTUNNEL_IP_SPORT]) - tun_info->key.tp_src = nla_get_be16(tb[LWTUNNEL_IP_SPORT]); - - if (tb[LWTUNNEL_IP_DPORT]) - tun_info->key.tp_dst = nla_get_be16(tb[LWTUNNEL_IP_DPORT]); - if (tb[LWTUNNEL_IP_FLAGS]) tun_info->key.tun_flags = nla_get_u16(tb[LWTUNNEL_IP_FLAGS]); @@ -266,8 +286,6 @@ static int ip_tun_fill_encap_info(struct sk_buff *skb, nla_put_be32(skb, LWTUNNEL_IP_SRC, tun_info->key.u.ipv4.src) || nla_put_u8(skb, LWTUNNEL_IP_TOS, tun_info->key.tos) || nla_put_u8(skb, LWTUNNEL_IP_TTL, 
tun_info->key.ttl) || - nla_put_u16(skb, LWTUNNEL_IP_SPORT, tun_info->key.tp_src) || - nla_put_u16(skb, LWTUNNEL_IP_DPORT, tun_info->key.tp_dst) || nla_put_u16(skb, LWTUNNEL_IP_FLAGS, tun_info->key.tun_flags)) return -ENOMEM; @@ -281,8 +299,6 @@ static int ip_tun_encap_nlsize(struct lwtunnel_state *lwtstate) + nla_total_size(4) /* LWTUNNEL_IP_SRC */ + nla_total_size(1) /* LWTUNNEL_IP_TOS */ + nla_total_size(1) /* LWTUNNEL_IP_TTL */ - + nla_total_size(2) /* LWTUNNEL_IP_SPORT */ - + nla_total_size(2) /* LWTUNNEL_IP_DPORT */ + nla_total_size(2); /* LWTUNNEL_IP_FLAGS */ } @@ -305,8 +321,6 @@ static const struct nla_policy ip6_tun_policy[LWTUNNEL_IP6_MAX + 1] = { [LWTUNNEL_IP6_SRC] = { .len = sizeof(struct in6_addr) }, [LWTUNNEL_IP6_HOPLIMIT] = { .type = NLA_U8 }, [LWTUNNEL_IP6_TC] = { .type = NLA_U8 }, - [LWTUNNEL_IP6_SPORT] = { .type = NLA_U16 }, - [LWTUNNEL_IP6_DPORT] = { .type = NLA_U16 }, [LWTUNNEL_IP6_FLAGS] = { .type = NLA_U16 }, }; @@ -346,12 +360,6 @@ static int ip6_tun_build_state(struct net_device *dev, struct nlattr *attr, if (tb[LWTUNNEL_IP6_TC]) tun_info->key.tos = nla_get_u8(tb[LWTUNNEL_IP6_TC]); - if (tb[LWTUNNEL_IP6_SPORT]) - tun_info->key.tp_src = nla_get_be16(tb[LWTUNNEL_IP6_SPORT]); - - if (tb[LWTUNNEL_IP6_DPORT]) - tun_info->key.tp_dst = nla_get_be16(tb[LWTUNNEL_IP6_DPORT]); - if (tb[LWTUNNEL_IP6_FLAGS]) tun_info->key.tun_flags = nla_get_u16(tb[LWTUNNEL_IP6_FLAGS]); @@ -373,8 +381,6 @@ static int ip6_tun_fill_encap_info(struct sk_buff *skb, nla_put_in6_addr(skb, LWTUNNEL_IP6_SRC, &tun_info->key.u.ipv6.src) || nla_put_u8(skb, LWTUNNEL_IP6_HOPLIMIT, tun_info->key.tos) || nla_put_u8(skb, LWTUNNEL_IP6_TC, tun_info->key.ttl) || - nla_put_u16(skb, LWTUNNEL_IP6_SPORT, tun_info->key.tp_src) || - nla_put_u16(skb, LWTUNNEL_IP6_DPORT, tun_info->key.tp_dst) || nla_put_u16(skb, LWTUNNEL_IP6_FLAGS, tun_info->key.tun_flags)) return -ENOMEM; @@ -388,8 +394,6 @@ static int ip6_tun_encap_nlsize(struct lwtunnel_state *lwtstate) + nla_total_size(16) /* LWTUNNEL_IP6_SRC */ + nla_total_size(1) /* LWTUNNEL_IP6_HOPLIMIT */ + nla_total_size(1) /* LWTUNNEL_IP6_TC */ - + nla_total_size(2) /* LWTUNNEL_IP6_SPORT */ - + nla_total_size(2) /* LWTUNNEL_IP6_DPORT */ + nla_total_size(2); /* LWTUNNEL_IP6_FLAGS */ } diff --git a/net/ipv4/route.c b/net/ipv4/route.c index 5f4a5565ad8b..c6ad99ad0ffb 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -2045,6 +2045,7 @@ struct rtable *__ip_route_output_key(struct net *net, struct flowi4 *fl4) struct fib_result res; struct rtable *rth; int orig_oif; + int err = -ENETUNREACH; res.tclassid = 0; res.fi = NULL; @@ -2153,7 +2154,8 @@ struct rtable *__ip_route_output_key(struct net *net, struct flowi4 *fl4) goto make_route; } - if (fib_lookup(net, fl4, &res, 0)) { + err = fib_lookup(net, fl4, &res, 0); + if (err) { res.fi = NULL; res.table = NULL; if (fl4->flowi4_oif) { @@ -2181,7 +2183,7 @@ struct rtable *__ip_route_output_key(struct net *net, struct flowi4 *fl4) res.type = RTN_UNICAST; goto make_route; } - rth = ERR_PTR(-ENETUNREACH); + rth = ERR_PTR(err); goto out; } diff --git a/net/ipv4/tcp_cubic.c b/net/ipv4/tcp_cubic.c index c6ded6b2a79f..448c2615fece 100644 --- a/net/ipv4/tcp_cubic.c +++ b/net/ipv4/tcp_cubic.c @@ -154,14 +154,20 @@ static void bictcp_init(struct sock *sk) static void bictcp_cwnd_event(struct sock *sk, enum tcp_ca_event event) { if (event == CA_EVENT_TX_START) { - s32 delta = tcp_time_stamp - tcp_sk(sk)->lsndtime; struct bictcp *ca = inet_csk_ca(sk); + u32 now = tcp_time_stamp; + s32 delta; + + delta = now - tcp_sk(sk)->lsndtime; /* We were 
application limited (idle) for a while. * Shift epoch_start to keep cwnd growth to cubic curve. */ - if (ca->epoch_start && delta > 0) + if (ca->epoch_start && delta > 0) { ca->epoch_start += delta; + if (after(ca->epoch_start, now)) + ca->epoch_start = now; + } return; } } diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index 6d8795b066ac..def765911ff8 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -162,9 +162,9 @@ kill_with_rst: if (tcp_death_row.sysctl_tw_recycle && tcptw->tw_ts_recent_stamp && tcp_tw_remember_stamp(tw)) - inet_twsk_schedule(tw, tw->tw_timeout); + inet_twsk_reschedule(tw, tw->tw_timeout); else - inet_twsk_schedule(tw, TCP_TIMEWAIT_LEN); + inet_twsk_reschedule(tw, TCP_TIMEWAIT_LEN); return TCP_TW_ACK; } @@ -201,7 +201,7 @@ kill: return TCP_TW_SUCCESS; } } - inet_twsk_schedule(tw, TCP_TIMEWAIT_LEN); + inet_twsk_reschedule(tw, TCP_TIMEWAIT_LEN); if (tmp_opt.saw_tstamp) { tcptw->tw_ts_recent = tmp_opt.rcv_tsval; @@ -251,7 +251,7 @@ kill: * Do not reschedule in the last case. */ if (paws_reject || th->ack) - inet_twsk_schedule(tw, TCP_TIMEWAIT_LEN); + inet_twsk_reschedule(tw, TCP_TIMEWAIT_LEN); return tcp_timewait_check_oow_rate_limit( tw, skb, LINUX_MIB_TCPACKSKIPPEDTIMEWAIT); @@ -322,9 +322,6 @@ void tcp_time_wait(struct sock *sk, int state, int timeo) } while (0); #endif - /* Linkage updates. */ - __inet_twsk_hashdance(tw, sk, &tcp_hashinfo); - /* Get the TIME_WAIT timeout firing. */ if (timeo < rto) timeo = rto; @@ -338,6 +335,8 @@ void tcp_time_wait(struct sock *sk, int state, int timeo) } inet_twsk_schedule(tw, timeo); + /* Linkage updates. */ + __inet_twsk_hashdance(tw, sk, &tcp_hashinfo); inet_twsk_put(tw); } else { /* Sorry, if we're out of memory, just CLOSE this diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index f9a8a12b62ee..1100ffe4a722 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -2897,6 +2897,7 @@ void tcp_send_active_reset(struct sock *sk, gfp_t priority) skb_reserve(skb, MAX_TCP_HEADER); tcp_init_nondata_skb(skb, tcp_acceptable_seq(sk), TCPHDR_ACK | TCPHDR_RST); + skb_mstamp_get(&skb->skb_mstamp); /* Send it off. 
*/ if (tcp_transmit_skb(sk, skb, 0, priority)) NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTFAILED); diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index c0a15e7f359f..f7d1d5e19e95 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1024,7 +1024,8 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) if (netif_index_is_vrf(net, ipc.oif)) { flowi4_init_output(fl4, ipc.oif, sk->sk_mark, tos, RT_SCOPE_UNIVERSE, sk->sk_protocol, - (flow_flags | FLOWI_FLAG_VRFSRC), + (flow_flags | FLOWI_FLAG_VRFSRC | + FLOWI_FLAG_SKIP_NH_OIF), faddr, saddr, dport, inet->inet_sport); diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c index bb919b28619f..c10a9ee68433 100644 --- a/net/ipv4/xfrm4_policy.c +++ b/net/ipv4/xfrm4_policy.c @@ -33,6 +33,8 @@ static struct dst_entry *__xfrm4_dst_lookup(struct net *net, struct flowi4 *fl4, if (saddr) fl4->saddr = saddr->a4; + fl4->flowi4_flags = FLOWI_FLAG_SKIP_NH_OIF; + rt = __ip_route_output_key(net, fl4); if (!IS_ERR(rt)) return &rt->dst; diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 030fefdc9aed..900113376d4e 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -5127,13 +5127,12 @@ static void __ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp) rt = addrconf_get_prefix_route(&ifp->peer_addr, 128, ifp->idev->dev, 0, 0); - if (rt && ip6_del_rt(rt)) - dst_free(&rt->dst); + if (rt) + ip6_del_rt(rt); } dst_hold(&ifp->rt->dst); - if (ip6_del_rt(ifp->rt)) - dst_free(&ifp->rt->dst); + ip6_del_rt(ifp->rt); rt_genid_bump_ipv6(net); break; diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index 418d9823692b..7d2e0023c72d 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -155,6 +155,11 @@ static void node_free(struct fib6_node *fn) kmem_cache_free(fib6_node_kmem, fn); } +static void rt6_rcu_free(struct rt6_info *rt) +{ + call_rcu(&rt->dst.rcu_head, dst_rcu_free); +} + static void rt6_free_pcpu(struct rt6_info *non_pcpu_rt) { int cpu; @@ -169,7 +174,7 @@ static void rt6_free_pcpu(struct rt6_info *non_pcpu_rt) ppcpu_rt = per_cpu_ptr(non_pcpu_rt->rt6i_pcpu, cpu); pcpu_rt = *ppcpu_rt; if (pcpu_rt) { - dst_free(&pcpu_rt->dst); + rt6_rcu_free(pcpu_rt); *ppcpu_rt = NULL; } } @@ -181,7 +186,7 @@ static void rt6_release(struct rt6_info *rt) { if (atomic_dec_and_test(&rt->rt6i_ref)) { rt6_free_pcpu(rt); - dst_free(&rt->dst); + rt6_rcu_free(rt); } } @@ -846,7 +851,7 @@ add: *ins = rt; rt->rt6i_node = fn; atomic_inc(&rt->rt6i_ref); - inet6_rt_notify(RTM_NEWROUTE, rt, info); + inet6_rt_notify(RTM_NEWROUTE, rt, info, 0); info->nl_net->ipv6.rt6_stats->fib_rt_entries++; if (!(fn->fn_flags & RTN_RTINFO)) { @@ -872,7 +877,7 @@ add: rt->rt6i_node = fn; rt->dst.rt6_next = iter->dst.rt6_next; atomic_inc(&rt->rt6i_ref); - inet6_rt_notify(RTM_NEWROUTE, rt, info); + inet6_rt_notify(RTM_NEWROUTE, rt, info, NLM_F_REPLACE); if (!(fn->fn_flags & RTN_RTINFO)) { info->nl_net->ipv6.rt6_stats->fib_route_nodes++; fn->fn_flags |= RTN_RTINFO; @@ -933,6 +938,10 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, int replace_required = 0; int sernum = fib6_new_sernum(info->nl_net); + if (WARN_ON_ONCE((rt->dst.flags & DST_NOCACHE) && + !atomic_read(&rt->dst.__refcnt))) + return -EINVAL; + if (info->nlh) { if (!(info->nlh->nlmsg_flags & NLM_F_CREATE)) allow_create = 0; @@ -1025,6 +1034,7 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, fib6_start_gc(info->nl_net, rt); if (!(rt->rt6i_flags & RTF_CACHE)) fib6_prune_clones(info->nl_net, pn); + rt->dst.flags &= ~DST_NOCACHE; } out: @@ -1049,7 +1059,8 @@ out: 
atomic_inc(&pn->leaf->rt6i_ref); } #endif - dst_free(&rt->dst); + if (!(rt->dst.flags & DST_NOCACHE)) + dst_free(&rt->dst); } return err; @@ -1060,7 +1071,8 @@ out: st_failure: if (fn && !(fn->fn_flags & (RTN_RTINFO|RTN_ROOT))) fib6_repair_tree(info->nl_net, fn); - dst_free(&rt->dst); + if (!(rt->dst.flags & DST_NOCACHE)) + dst_free(&rt->dst); return err; #endif } @@ -1410,7 +1422,7 @@ static void fib6_del_route(struct fib6_node *fn, struct rt6_info **rtp, fib6_purge_rt(rt, fn, net); - inet6_rt_notify(RTM_DELROUTE, rt, info); + inet6_rt_notify(RTM_DELROUTE, rt, info, 0); rt6_release(rt); } diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c index 4038c694ec03..3c7b9310b33f 100644 --- a/net/ipv6/ip6_gre.c +++ b/net/ipv6/ip6_gre.c @@ -404,13 +404,13 @@ static void ip6gre_err(struct sk_buff *skb, struct inet6_skb_parm *opt, struct ipv6_tlv_tnl_enc_lim *tel; __u32 mtu; case ICMPV6_DEST_UNREACH: - net_warn_ratelimited("%s: Path to destination invalid or inactive!\n", - t->parms.name); + net_dbg_ratelimited("%s: Path to destination invalid or inactive!\n", + t->parms.name); break; case ICMPV6_TIME_EXCEED: if (code == ICMPV6_EXC_HOPLIMIT) { - net_warn_ratelimited("%s: Too small hop limit or routing loop in tunnel!\n", - t->parms.name); + net_dbg_ratelimited("%s: Too small hop limit or routing loop in tunnel!\n", + t->parms.name); } break; case ICMPV6_PARAMPROB: @@ -421,12 +421,12 @@ static void ip6gre_err(struct sk_buff *skb, struct inet6_skb_parm *opt, if (teli && teli == be32_to_cpu(info) - 2) { tel = (struct ipv6_tlv_tnl_enc_lim *) &skb->data[teli]; if (tel->encap_limit == 0) { - net_warn_ratelimited("%s: Too small encapsulation limit or routing loop in tunnel!\n", - t->parms.name); + net_dbg_ratelimited("%s: Too small encapsulation limit or routing loop in tunnel!\n", + t->parms.name); } } else { - net_warn_ratelimited("%s: Recipient unable to parse tunneled packet!\n", - t->parms.name); + net_dbg_ratelimited("%s: Recipient unable to parse tunneled packet!\n", + t->parms.name); } break; case ICMPV6_PKT_TOOBIG: @@ -634,20 +634,20 @@ static netdev_tx_t ip6gre_xmit2(struct sk_buff *skb, } if (!fl6->flowi6_mark) - dst = ip6_tnl_dst_check(tunnel); + dst = ip6_tnl_dst_get(tunnel); if (!dst) { - ndst = ip6_route_output(net, NULL, fl6); + dst = ip6_route_output(net, NULL, fl6); - if (ndst->error) + if (dst->error) goto tx_err_link_failure; - ndst = xfrm_lookup(net, ndst, flowi6_to_flowi(fl6), NULL, 0); - if (IS_ERR(ndst)) { - err = PTR_ERR(ndst); - ndst = NULL; + dst = xfrm_lookup(net, dst, flowi6_to_flowi(fl6), NULL, 0); + if (IS_ERR(dst)) { + err = PTR_ERR(dst); + dst = NULL; goto tx_err_link_failure; } - dst = ndst; + ndst = dst; } tdev = dst->dev; @@ -702,12 +702,9 @@ static netdev_tx_t ip6gre_xmit2(struct sk_buff *skb, skb = new_skb; } - if (fl6->flowi6_mark) { - skb_dst_set(skb, dst); - ndst = NULL; - } else { - skb_dst_set_noref(skb, dst); - } + if (!fl6->flowi6_mark && ndst) + ip6_tnl_dst_set(tunnel, ndst); + skb_dst_set(skb, dst); proto = NEXTHDR_GRE; if (encap_limit >= 0) { @@ -762,14 +759,12 @@ static netdev_tx_t ip6gre_xmit2(struct sk_buff *skb, skb_set_inner_protocol(skb, protocol); ip6tunnel_xmit(NULL, skb, dev); - if (ndst) - ip6_tnl_dst_store(tunnel, ndst); return 0; tx_err_link_failure: stats->tx_carrier_errors++; dst_link_failure(skb); tx_err_dst_release: - dst_release(ndst); + dst_release(dst); return err; } @@ -1223,6 +1218,9 @@ static const struct net_device_ops ip6gre_netdev_ops = { static void ip6gre_dev_free(struct net_device *dev) { + struct ip6_tnl *t = netdev_priv(dev); + 
+ ip6_tnl_dst_destroy(t); free_percpu(dev->tstats); free_netdev(dev); } @@ -1245,9 +1243,10 @@ static void ip6gre_tunnel_setup(struct net_device *dev) netif_keep_dst(dev); } -static int ip6gre_tunnel_init(struct net_device *dev) +static int ip6gre_tunnel_init_common(struct net_device *dev) { struct ip6_tnl *tunnel; + int ret; tunnel = netdev_priv(dev); @@ -1255,16 +1254,37 @@ static int ip6gre_tunnel_init(struct net_device *dev) tunnel->net = dev_net(dev); strcpy(tunnel->parms.name, dev->name); + dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats); + if (!dev->tstats) + return -ENOMEM; + + ret = ip6_tnl_dst_init(tunnel); + if (ret) { + free_percpu(dev->tstats); + dev->tstats = NULL; + return ret; + } + + return 0; +} + +static int ip6gre_tunnel_init(struct net_device *dev) +{ + struct ip6_tnl *tunnel; + int ret; + + ret = ip6gre_tunnel_init_common(dev); + if (ret) + return ret; + + tunnel = netdev_priv(dev); + memcpy(dev->dev_addr, &tunnel->parms.laddr, sizeof(struct in6_addr)); memcpy(dev->broadcast, &tunnel->parms.raddr, sizeof(struct in6_addr)); if (ipv6_addr_any(&tunnel->parms.raddr)) dev->header_ops = &ip6gre_header_ops; - dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats); - if (!dev->tstats) - return -ENOMEM; - return 0; } @@ -1460,19 +1480,16 @@ static void ip6gre_netlink_parms(struct nlattr *data[], static int ip6gre_tap_init(struct net_device *dev) { struct ip6_tnl *tunnel; + int ret; - tunnel = netdev_priv(dev); + ret = ip6gre_tunnel_init_common(dev); + if (ret) + return ret; - tunnel->dev = dev; - tunnel->net = dev_net(dev); - strcpy(tunnel->parms.name, dev->name); + tunnel = netdev_priv(dev); ip6gre_tnl_link_config(tunnel, 1); - dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats); - if (!dev->tstats) - return -ENOMEM; - return 0; } diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index 26ea47930740..92b1aa38f121 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -586,20 +586,22 @@ int ip6_fragment(struct sock *sk, struct sk_buff *skb, frag_id = ipv6_select_ident(net, &ipv6_hdr(skb)->daddr, &ipv6_hdr(skb)->saddr); + hroom = LL_RESERVED_SPACE(rt->dst.dev); if (skb_has_frag_list(skb)) { int first_len = skb_pagelen(skb); struct sk_buff *frag2; if (first_len - hlen > mtu || ((first_len - hlen) & 7) || - skb_cloned(skb)) + skb_cloned(skb) || + skb_headroom(skb) < (hroom + sizeof(struct frag_hdr))) goto slow_path; skb_walk_frags(skb, frag) { /* Correct geometry. */ if (frag->len > mtu || ((frag->len & 7) && frag->next) || - skb_headroom(frag) < hlen) + skb_headroom(frag) < (hlen + hroom + sizeof(struct frag_hdr))) goto slow_path_clean; /* Partially cloned skb? 
*/ @@ -616,8 +618,6 @@ int ip6_fragment(struct sock *sk, struct sk_buff *skb, err = 0; offset = 0; - frag = skb_shinfo(skb)->frag_list; - skb_frag_list_init(skb); /* BUILD HEADER */ *prevhdr = NEXTHDR_FRAGMENT; @@ -625,8 +625,11 @@ int ip6_fragment(struct sock *sk, struct sk_buff *skb, if (!tmp_hdr) { IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)), IPSTATS_MIB_FRAGFAILS); - return -ENOMEM; + err = -ENOMEM; + goto fail; } + frag = skb_shinfo(skb)->frag_list; + skb_frag_list_init(skb); __skb_pull(skb, hlen); fh = (struct frag_hdr *)__skb_push(skb, sizeof(struct frag_hdr)); @@ -723,7 +726,6 @@ slow_path: */ *prevhdr = NEXTHDR_FRAGMENT; - hroom = LL_RESERVED_SPACE(rt->dst.dev); troom = rt->dst.dev->needed_tailroom; /* diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index b0ab420612bc..eabffbb89795 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -126,36 +126,92 @@ static struct net_device_stats *ip6_get_stats(struct net_device *dev) * Locking : hash tables are protected by RCU and RTNL */ -struct dst_entry *ip6_tnl_dst_check(struct ip6_tnl *t) +static void ip6_tnl_per_cpu_dst_set(struct ip6_tnl_dst *idst, + struct dst_entry *dst) { - struct dst_entry *dst = t->dst_cache; + write_seqlock_bh(&idst->lock); + dst_release(rcu_dereference_protected( + idst->dst, + lockdep_is_held(&idst->lock.lock))); + if (dst) { + dst_hold(dst); + idst->cookie = rt6_get_cookie((struct rt6_info *)dst); + } else { + idst->cookie = 0; + } + rcu_assign_pointer(idst->dst, dst); + write_sequnlock_bh(&idst->lock); +} + +struct dst_entry *ip6_tnl_dst_get(struct ip6_tnl *t) +{ + struct ip6_tnl_dst *idst; + struct dst_entry *dst; + unsigned int seq; + u32 cookie; - if (dst && dst->obsolete && - !dst->ops->check(dst, t->dst_cookie)) { - t->dst_cache = NULL; + idst = raw_cpu_ptr(t->dst_cache); + + rcu_read_lock(); + do { + seq = read_seqbegin(&idst->lock); + dst = rcu_dereference(idst->dst); + cookie = idst->cookie; + } while (read_seqretry(&idst->lock, seq)); + + if (dst && !atomic_inc_not_zero(&dst->__refcnt)) + dst = NULL; + rcu_read_unlock(); + + if (dst && dst->obsolete && !dst->ops->check(dst, cookie)) { + ip6_tnl_per_cpu_dst_set(idst, NULL); dst_release(dst); - return NULL; + dst = NULL; } - return dst; } -EXPORT_SYMBOL_GPL(ip6_tnl_dst_check); +EXPORT_SYMBOL_GPL(ip6_tnl_dst_get); void ip6_tnl_dst_reset(struct ip6_tnl *t) { - dst_release(t->dst_cache); - t->dst_cache = NULL; + int i; + + for_each_possible_cpu(i) + ip6_tnl_per_cpu_dst_set(raw_cpu_ptr(t->dst_cache), NULL); } EXPORT_SYMBOL_GPL(ip6_tnl_dst_reset); -void ip6_tnl_dst_store(struct ip6_tnl *t, struct dst_entry *dst) +void ip6_tnl_dst_set(struct ip6_tnl *t, struct dst_entry *dst) +{ + ip6_tnl_per_cpu_dst_set(raw_cpu_ptr(t->dst_cache), dst); + +} +EXPORT_SYMBOL_GPL(ip6_tnl_dst_set); + +void ip6_tnl_dst_destroy(struct ip6_tnl *t) { - struct rt6_info *rt = (struct rt6_info *) dst; - t->dst_cookie = rt6_get_cookie(rt); - dst_release(t->dst_cache); - t->dst_cache = dst; + if (!t->dst_cache) + return; + + ip6_tnl_dst_reset(t); + free_percpu(t->dst_cache); } -EXPORT_SYMBOL_GPL(ip6_tnl_dst_store); +EXPORT_SYMBOL_GPL(ip6_tnl_dst_destroy); + +int ip6_tnl_dst_init(struct ip6_tnl *t) +{ + int i; + + t->dst_cache = alloc_percpu(struct ip6_tnl_dst); + if (!t->dst_cache) + return -ENOMEM; + + for_each_possible_cpu(i) + seqlock_init(&per_cpu_ptr(t->dst_cache, i)->lock); + + return 0; +} +EXPORT_SYMBOL_GPL(ip6_tnl_dst_init); /** * ip6_tnl_lookup - fetch tunnel matching the end-point addresses @@ -271,6 +327,9 @@ ip6_tnl_unlink(struct ip6_tnl_net *ip6n, 
struct ip6_tnl *t) static void ip6_dev_free(struct net_device *dev) { + struct ip6_tnl *t = netdev_priv(dev); + + ip6_tnl_dst_destroy(t); free_percpu(dev->tstats); free_netdev(dev); } @@ -510,14 +569,14 @@ ip6_tnl_err(struct sk_buff *skb, __u8 ipproto, struct inet6_skb_parm *opt, struct ipv6_tlv_tnl_enc_lim *tel; __u32 mtu; case ICMPV6_DEST_UNREACH: - net_warn_ratelimited("%s: Path to destination invalid or inactive!\n", - t->parms.name); + net_dbg_ratelimited("%s: Path to destination invalid or inactive!\n", + t->parms.name); rel_msg = 1; break; case ICMPV6_TIME_EXCEED: if ((*code) == ICMPV6_EXC_HOPLIMIT) { - net_warn_ratelimited("%s: Too small hop limit or routing loop in tunnel!\n", - t->parms.name); + net_dbg_ratelimited("%s: Too small hop limit or routing loop in tunnel!\n", + t->parms.name); rel_msg = 1; } break; @@ -529,13 +588,13 @@ ip6_tnl_err(struct sk_buff *skb, __u8 ipproto, struct inet6_skb_parm *opt, if (teli && teli == *info - 2) { tel = (struct ipv6_tlv_tnl_enc_lim *) &skb->data[teli]; if (tel->encap_limit == 0) { - net_warn_ratelimited("%s: Too small encapsulation limit or routing loop in tunnel!\n", - t->parms.name); + net_dbg_ratelimited("%s: Too small encapsulation limit or routing loop in tunnel!\n", + t->parms.name); rel_msg = 1; } } else { - net_warn_ratelimited("%s: Recipient unable to parse tunneled packet!\n", - t->parms.name); + net_dbg_ratelimited("%s: Recipient unable to parse tunneled packet!\n", + t->parms.name); } break; case ICMPV6_PKT_TOOBIG: @@ -1010,23 +1069,23 @@ static int ip6_tnl_xmit2(struct sk_buff *skb, memcpy(&fl6->daddr, addr6, sizeof(fl6->daddr)); neigh_release(neigh); } else if (!fl6->flowi6_mark) - dst = ip6_tnl_dst_check(t); + dst = ip6_tnl_dst_get(t); if (!ip6_tnl_xmit_ctl(t, &fl6->saddr, &fl6->daddr)) goto tx_err_link_failure; if (!dst) { - ndst = ip6_route_output(net, NULL, fl6); + dst = ip6_route_output(net, NULL, fl6); - if (ndst->error) + if (dst->error) goto tx_err_link_failure; - ndst = xfrm_lookup(net, ndst, flowi6_to_flowi(fl6), NULL, 0); - if (IS_ERR(ndst)) { - err = PTR_ERR(ndst); - ndst = NULL; + dst = xfrm_lookup(net, dst, flowi6_to_flowi(fl6), NULL, 0); + if (IS_ERR(dst)) { + err = PTR_ERR(dst); + dst = NULL; goto tx_err_link_failure; } - dst = ndst; + ndst = dst; } tdev = dst->dev; @@ -1072,12 +1131,11 @@ static int ip6_tnl_xmit2(struct sk_buff *skb, consume_skb(skb); skb = new_skb; } - if (fl6->flowi6_mark) { - skb_dst_set(skb, dst); - ndst = NULL; - } else { - skb_dst_set_noref(skb, dst); - } + + if (!fl6->flowi6_mark && ndst) + ip6_tnl_dst_set(t, ndst); + skb_dst_set(skb, dst); + skb->transport_header = skb->network_header; proto = fl6->flowi6_proto; @@ -1101,14 +1159,12 @@ static int ip6_tnl_xmit2(struct sk_buff *skb, ipv6h->saddr = fl6->saddr; ipv6h->daddr = fl6->daddr; ip6tunnel_xmit(NULL, skb, dev); - if (ndst) - ip6_tnl_dst_store(t, ndst); return 0; tx_err_link_failure: stats->tx_carrier_errors++; dst_link_failure(skb); tx_err_dst_release: - dst_release(ndst); + dst_release(dst); return err; } @@ -1573,12 +1629,21 @@ static inline int ip6_tnl_dev_init_gen(struct net_device *dev) { struct ip6_tnl *t = netdev_priv(dev); + int ret; t->dev = dev; t->net = dev_net(dev); dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats); if (!dev->tstats) return -ENOMEM; + + ret = ip6_tnl_dst_init(t); + if (ret) { + free_percpu(dev->tstats); + dev->tstats = NULL; + return ret; + } + return 0; } diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 53617d715188..f204089e854c 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c 
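The ip6_tunnel.c hunks above replace the single per-tunnel dst cache with a per-CPU cache whose entries are published under a seqlock and looked up under RCU. The sketch below is a condensed reading of the new ip6_tnl_dst_get() path, not a standalone build: the percpu allocation, the write side and the tunnel plumbing are elided, and only the lockless read pattern introduced by the patch is kept.

static struct dst_entry *ip6_tnl_dst_get(struct ip6_tnl *t)
{
	struct ip6_tnl_dst *idst = raw_cpu_ptr(t->dst_cache);
	struct dst_entry *dst;
	unsigned int seq;
	u32 cookie;

	rcu_read_lock();
	do {
		/* retry if a writer republished the entry meanwhile */
		seq = read_seqbegin(&idst->lock);
		dst = rcu_dereference(idst->dst);
		cookie = idst->cookie;
	} while (read_seqretry(&idst->lock, seq));

	/* only keep the entry if its refcount has not already hit zero */
	if (dst && !atomic_inc_not_zero(&dst->__refcnt))
		dst = NULL;
	rcu_read_unlock();

	/* a stale route is dropped from the cache and not returned */
	if (dst && dst->obsolete && !dst->ops->check(dst, cookie)) {
		ip6_tnl_per_cpu_dst_set(idst, NULL);
		dst_release(dst);
		dst = NULL;
	}
	return dst;
}

The write side, ip6_tnl_per_cpu_dst_set() above, takes the seqlock, releases the old entry, grabs a reference and the rt6 cookie for the new one, and only then publishes it with rcu_assign_pointer(), so a reader sees either the old entry or a fully initialized replacement.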
@@ -1322,8 +1322,7 @@ static void ip6_link_failure(struct sk_buff *skb) if (rt) { if (rt->rt6i_flags & RTF_CACHE) { dst_hold(&rt->dst); - if (ip6_del_rt(rt)) - dst_free(&rt->dst); + ip6_del_rt(rt); } else if (rt->rt6i_node && (rt->rt6i_flags & RTF_DEFAULT)) { rt->rt6i_node->fn_sernum = -1; } @@ -1886,9 +1885,11 @@ int ip6_route_info_create(struct fib6_config *cfg, struct rt6_info **rt_ret) rt->dst.input = ip6_pkt_prohibit; break; case RTN_THROW: + case RTN_UNREACHABLE: default: rt->dst.error = (cfg->fc_type == RTN_THROW) ? -EAGAIN - : -ENETUNREACH; + : (cfg->fc_type == RTN_UNREACHABLE) + ? -EHOSTUNREACH : -ENETUNREACH; rt->dst.output = ip6_pkt_discard_out; rt->dst.input = ip6_pkt_discard; break; @@ -2028,7 +2029,8 @@ static int __ip6_del_rt(struct rt6_info *rt, struct nl_info *info) struct fib6_table *table; struct net *net = dev_net(rt->dst.dev); - if (rt == net->ipv6.ip6_null_entry) { + if (rt == net->ipv6.ip6_null_entry || + rt->dst.flags & DST_NOCACHE) { err = -ENOENT; goto out; } @@ -2515,6 +2517,7 @@ struct rt6_info *addrconf_dst_alloc(struct inet6_dev *idev, rt->rt6i_dst.addr = *addr; rt->rt6i_dst.plen = 128; rt->rt6i_table = fib6_get_table(net, RT6_TABLE_LOCAL); + rt->dst.flags |= DST_NOCACHE; atomic_set(&rt->dst.__refcnt, 1); @@ -3303,7 +3306,8 @@ errout: return err; } -void inet6_rt_notify(int event, struct rt6_info *rt, struct nl_info *info) +void inet6_rt_notify(int event, struct rt6_info *rt, struct nl_info *info, + unsigned int nlm_flags) { struct sk_buff *skb; struct net *net = info->nl_net; @@ -3318,7 +3322,7 @@ void inet6_rt_notify(int event, struct rt6_info *rt, struct nl_info *info) goto errout; err = rt6_fill_node(net, skb, rt, NULL, NULL, 0, - event, info->portid, seq, 0, 0, 0); + event, info->portid, seq, 0, 0, nlm_flags); if (err < 0) { /* -EMSGSIZE implies BUG in rt6_nlmsg_size() */ WARN_ON(err == -EMSGSIZE); diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c index 17b1fe961c5d..7a77a1470f25 100644 --- a/net/mac80211/cfg.c +++ b/net/mac80211/cfg.c @@ -2474,6 +2474,7 @@ static int ieee80211_set_cqm_rssi_config(struct wiphy *wiphy, bss_conf->cqm_rssi_thold = rssi_thold; bss_conf->cqm_rssi_hyst = rssi_hyst; + sdata->u.mgd.last_cqm_event_signal = 0; /* tell the driver upon association, unless already associated */ if (sdata->u.mgd.associated && @@ -2518,15 +2519,17 @@ static int ieee80211_set_bitrate_mask(struct wiphy *wiphy, continue; for (j = 0; j < IEEE80211_HT_MCS_MASK_LEN; j++) { - if (~sdata->rc_rateidx_mcs_mask[i][j]) + if (~sdata->rc_rateidx_mcs_mask[i][j]) { sdata->rc_has_mcs_mask[i] = true; + break; + } + } - if (~sdata->rc_rateidx_vht_mcs_mask[i][j]) + for (j = 0; j < NL80211_VHT_NSS_MAX; j++) { + if (~sdata->rc_rateidx_vht_mcs_mask[i][j]) { sdata->rc_has_vht_mcs_mask[i] = true; - - if (sdata->rc_has_mcs_mask[i] && - sdata->rc_has_vht_mcs_mask[i]) break; + } } } diff --git a/net/netfilter/nf_log.c b/net/netfilter/nf_log.c index 675d12c69e32..a5d41dfa9f05 100644 --- a/net/netfilter/nf_log.c +++ b/net/netfilter/nf_log.c @@ -107,12 +107,17 @@ EXPORT_SYMBOL(nf_log_register); void nf_log_unregister(struct nf_logger *logger) { + const struct nf_logger *log; int i; mutex_lock(&nf_log_mutex); - for (i = 0; i < NFPROTO_NUMPROTO; i++) - RCU_INIT_POINTER(loggers[i][logger->type], NULL); + for (i = 0; i < NFPROTO_NUMPROTO; i++) { + log = nft_log_dereference(loggers[i][logger->type]); + if (log == logger) + RCU_INIT_POINTER(loggers[i][logger->type], NULL); + } mutex_unlock(&nf_log_mutex); + synchronize_rcu(); } EXPORT_SYMBOL(nf_log_unregister); diff --git 
a/net/netfilter/nft_compat.c b/net/netfilter/nft_compat.c index 66def315eb56..9c8fab00164b 100644 --- a/net/netfilter/nft_compat.c +++ b/net/netfilter/nft_compat.c @@ -619,6 +619,13 @@ struct nft_xt { static struct nft_expr_type nft_match_type; +static bool nft_match_cmp(const struct xt_match *match, + const char *name, u32 rev, u32 family) +{ + return strcmp(match->name, name) == 0 && match->revision == rev && + (match->family == NFPROTO_UNSPEC || match->family == family); +} + static const struct nft_expr_ops * nft_match_select_ops(const struct nft_ctx *ctx, const struct nlattr * const tb[]) @@ -626,7 +633,7 @@ nft_match_select_ops(const struct nft_ctx *ctx, struct nft_xt *nft_match; struct xt_match *match; char *mt_name; - __u32 rev, family; + u32 rev, family; if (tb[NFTA_MATCH_NAME] == NULL || tb[NFTA_MATCH_REV] == NULL || @@ -641,8 +648,7 @@ nft_match_select_ops(const struct nft_ctx *ctx, list_for_each_entry(nft_match, &nft_match_list, head) { struct xt_match *match = nft_match->ops.data; - if (strcmp(match->name, mt_name) == 0 && - match->revision == rev && match->family == family) { + if (nft_match_cmp(match, mt_name, rev, family)) { if (!try_module_get(match->me)) return ERR_PTR(-ENOENT); @@ -693,6 +699,13 @@ static LIST_HEAD(nft_target_list); static struct nft_expr_type nft_target_type; +static bool nft_target_cmp(const struct xt_target *tg, + const char *name, u32 rev, u32 family) +{ + return strcmp(tg->name, name) == 0 && tg->revision == rev && + (tg->family == NFPROTO_UNSPEC || tg->family == family); +} + static const struct nft_expr_ops * nft_target_select_ops(const struct nft_ctx *ctx, const struct nlattr * const tb[]) @@ -700,7 +713,7 @@ nft_target_select_ops(const struct nft_ctx *ctx, struct nft_xt *nft_target; struct xt_target *target; char *tg_name; - __u32 rev, family; + u32 rev, family; if (tb[NFTA_TARGET_NAME] == NULL || tb[NFTA_TARGET_REV] == NULL || @@ -715,8 +728,7 @@ nft_target_select_ops(const struct nft_ctx *ctx, list_for_each_entry(nft_target, &nft_target_list, head) { struct xt_target *target = nft_target->ops.data; - if (strcmp(target->name, tg_name) == 0 && - target->revision == rev && target->family == family) { + if (nft_target_cmp(target, tg_name, rev, family)) { if (!try_module_get(target->me)) return ERR_PTR(-ENOENT); diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c index 7f86d3b55060..8f060d7f9a0e 100644 --- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -125,6 +125,24 @@ static inline u32 netlink_group_mask(u32 group) return group ? 
1 << (group - 1) : 0; } +static struct sk_buff *netlink_to_full_skb(const struct sk_buff *skb, + gfp_t gfp_mask) +{ + unsigned int len = skb_end_offset(skb); + struct sk_buff *new; + + new = alloc_skb(len, gfp_mask); + if (new == NULL) + return NULL; + + NETLINK_CB(new).portid = NETLINK_CB(skb).portid; + NETLINK_CB(new).dst_group = NETLINK_CB(skb).dst_group; + NETLINK_CB(new).creds = NETLINK_CB(skb).creds; + + memcpy(skb_put(new, len), skb->data, len); + return new; +} + int netlink_add_tap(struct netlink_tap *nt) { if (unlikely(nt->dev->type != ARPHRD_NETLINK)) @@ -206,7 +224,11 @@ static int __netlink_deliver_tap_skb(struct sk_buff *skb, int ret = -ENOMEM; dev_hold(dev); - nskb = skb_clone(skb, GFP_ATOMIC); + + if (netlink_skb_is_mmaped(skb) || is_vmalloc_addr(skb->head)) + nskb = netlink_to_full_skb(skb, GFP_ATOMIC); + else + nskb = skb_clone(skb, GFP_ATOMIC); if (nskb) { nskb->dev = dev; nskb->protocol = htons((u16) sk->sk_protocol); @@ -279,11 +301,6 @@ static void netlink_rcv_wake(struct sock *sk) } #ifdef CONFIG_NETLINK_MMAP -static bool netlink_skb_is_mmaped(const struct sk_buff *skb) -{ - return NETLINK_CB(skb).flags & NETLINK_SKB_MMAPED; -} - static bool netlink_rx_is_mmaped(struct sock *sk) { return nlk_sk(sk)->rx_ring.pg_vec != NULL; @@ -846,7 +863,6 @@ static void netlink_ring_set_copied(struct sock *sk, struct sk_buff *skb) } #else /* CONFIG_NETLINK_MMAP */ -#define netlink_skb_is_mmaped(skb) false #define netlink_rx_is_mmaped(sk) false #define netlink_tx_is_mmaped(sk) false #define netlink_mmap sock_no_mmap @@ -1094,8 +1110,8 @@ static int netlink_insert(struct sock *sk, u32 portid) lock_sock(sk); - err = -EBUSY; - if (nlk_sk(sk)->portid) + err = nlk_sk(sk)->portid == portid ? 0 : -EBUSY; + if (nlk_sk(sk)->bound) goto err; err = -ENOMEM; @@ -1115,10 +1131,14 @@ static int netlink_insert(struct sock *sk, u32 portid) err = -EOVERFLOW; if (err == -EEXIST) err = -EADDRINUSE; - nlk_sk(sk)->portid = 0; sock_put(sk); + goto err; } + /* We need to ensure that the socket is hashed and visible. */ + smp_wmb(); + nlk_sk(sk)->bound = portid; + err: release_sock(sk); return err; @@ -1503,6 +1523,7 @@ static int netlink_bind(struct socket *sock, struct sockaddr *addr, struct sockaddr_nl *nladdr = (struct sockaddr_nl *)addr; int err; long unsigned int groups = nladdr->nl_groups; + bool bound; if (addr_len < sizeof(struct sockaddr_nl)) return -EINVAL; @@ -1519,9 +1540,14 @@ static int netlink_bind(struct socket *sock, struct sockaddr *addr, return err; } - if (nlk->portid) + bound = nlk->bound; + if (bound) { + /* Ensure nlk->portid is up-to-date. */ + smp_rmb(); + if (nladdr->nl_pid != nlk->portid) return -EINVAL; + } if (nlk->netlink_bind && groups) { int group; @@ -1537,7 +1563,10 @@ static int netlink_bind(struct socket *sock, struct sockaddr *addr, } } - if (!nlk->portid) { + /* No need for barriers here as we return to user-space without + * using any of the bound attributes. + */ + if (!bound) { err = nladdr->nl_pid ? netlink_insert(sk, nladdr->nl_pid) : netlink_autobind(sock); @@ -1585,7 +1614,10 @@ static int netlink_connect(struct socket *sock, struct sockaddr *addr, !netlink_allowed(sock, NL_CFG_F_NONROOT_SEND)) return -EPERM; - if (!nlk->portid) + /* No need for barriers here as we return to user-space without + * using any of the bound attributes. 
+ */ + if (!nlk->bound) err = netlink_autobind(sock); if (err == 0) { @@ -2426,10 +2458,13 @@ static int netlink_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) dst_group = nlk->dst_group; } - if (!nlk->portid) { + if (!nlk->bound) { err = netlink_autobind(sock); if (err) goto out; + } else { + /* Ensure nlk is hashed and visible. */ + smp_rmb(); } /* It's a really convoluted way for userland to ask for mmaped diff --git a/net/netlink/af_netlink.h b/net/netlink/af_netlink.h index 89008405d6b4..14437d9b1965 100644 --- a/net/netlink/af_netlink.h +++ b/net/netlink/af_netlink.h @@ -35,6 +35,7 @@ struct netlink_sock { unsigned long state; size_t max_recvmsg_len; wait_queue_head_t wait; + bool bound; bool cb_running; struct netlink_callback cb; struct mutex *cb_mutex; @@ -59,6 +60,15 @@ static inline struct netlink_sock *nlk_sk(struct sock *sk) return container_of(sk, struct netlink_sock, sk); } +static inline bool netlink_skb_is_mmaped(const struct sk_buff *skb) +{ +#ifdef CONFIG_NETLINK_MMAP + return NETLINK_CB(skb).flags & NETLINK_SKB_MMAPED; +#else + return false; +#endif /* CONFIG_NETLINK_MMAP */ +} + struct netlink_table { struct rhashtable hash; struct hlist_head mc_list; diff --git a/net/openvswitch/Kconfig b/net/openvswitch/Kconfig index 2a071f470d57..d143aa9f6654 100644 --- a/net/openvswitch/Kconfig +++ b/net/openvswitch/Kconfig @@ -5,7 +5,8 @@ config OPENVSWITCH tristate "Open vSwitch" depends on INET - depends on (!NF_CONNTRACK || NF_CONNTRACK) + depends on !NF_CONNTRACK || \ + (NF_CONNTRACK && (!NF_DEFRAG_IPV6 || NF_DEFRAG_IPV6)) select LIBCRC32C select MPLS select NET_MPLS_GSO diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c index e8e524ad8a01..002a755fa07e 100644 --- a/net/openvswitch/conntrack.c +++ b/net/openvswitch/conntrack.c @@ -275,13 +275,15 @@ static int ovs_ct_helper(struct sk_buff *skb, u16 proto) case NFPROTO_IPV6: { u8 nexthdr = ipv6_hdr(skb)->nexthdr; __be16 frag_off; + int ofs; - protoff = ipv6_skip_exthdr(skb, sizeof(struct ipv6hdr), - &nexthdr, &frag_off); - if (protoff < 0 || (frag_off & htons(~0x7)) != 0) { + ofs = ipv6_skip_exthdr(skb, sizeof(struct ipv6hdr), &nexthdr, + &frag_off); + if (ofs < 0 || (frag_off & htons(~0x7)) != 0) { pr_debug("proto header not found\n"); return NF_ACCEPT; } + protoff = ofs; break; } default: diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c index 6fbd2decb19e..b816ff871528 100644 --- a/net/openvswitch/datapath.c +++ b/net/openvswitch/datapath.c @@ -952,7 +952,7 @@ static int ovs_flow_cmd_new(struct sk_buff *skb, struct genl_info *info) if (error) goto err_kfree_flow; - ovs_flow_mask_key(&new_flow->key, &key, &mask); + ovs_flow_mask_key(&new_flow->key, &key, true, &mask); /* Extract flow identifier. 
*/ error = ovs_nla_get_identifier(&new_flow->id, a[OVS_FLOW_ATTR_UFID], @@ -1080,7 +1080,7 @@ static struct sw_flow_actions *get_flow_actions(struct net *net, struct sw_flow_key masked_key; int error; - ovs_flow_mask_key(&masked_key, key, mask); + ovs_flow_mask_key(&masked_key, key, true, mask); error = ovs_nla_copy_actions(net, a, &masked_key, &acts, log); if (error) { OVS_NLERR(log, diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c index c92d6a262bc5..5c030a4d7338 100644 --- a/net/openvswitch/flow_netlink.c +++ b/net/openvswitch/flow_netlink.c @@ -57,6 +57,7 @@ struct ovs_len_tbl { }; #define OVS_ATTR_NESTED -1 +#define OVS_ATTR_VARIABLE -2 static void update_range(struct sw_flow_match *match, size_t offset, size_t size, bool is_mask) @@ -304,6 +305,10 @@ size_t ovs_key_attr_size(void) + nla_total_size(28); /* OVS_KEY_ATTR_ND */ } +static const struct ovs_len_tbl ovs_vxlan_ext_key_lens[OVS_VXLAN_EXT_MAX + 1] = { + [OVS_VXLAN_EXT_GBP] = { .len = sizeof(u32) }, +}; + static const struct ovs_len_tbl ovs_tunnel_key_lens[OVS_TUNNEL_KEY_ATTR_MAX + 1] = { [OVS_TUNNEL_KEY_ATTR_ID] = { .len = sizeof(u64) }, [OVS_TUNNEL_KEY_ATTR_IPV4_SRC] = { .len = sizeof(u32) }, @@ -315,8 +320,9 @@ static const struct ovs_len_tbl ovs_tunnel_key_lens[OVS_TUNNEL_KEY_ATTR_MAX + 1] [OVS_TUNNEL_KEY_ATTR_TP_SRC] = { .len = sizeof(u16) }, [OVS_TUNNEL_KEY_ATTR_TP_DST] = { .len = sizeof(u16) }, [OVS_TUNNEL_KEY_ATTR_OAM] = { .len = 0 }, - [OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS] = { .len = OVS_ATTR_NESTED }, - [OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS] = { .len = OVS_ATTR_NESTED }, + [OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS] = { .len = OVS_ATTR_VARIABLE }, + [OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS] = { .len = OVS_ATTR_NESTED, + .next = ovs_vxlan_ext_key_lens }, }; /* The size of the argument for each %OVS_KEY_ATTR_* Netlink attribute. 
*/ @@ -349,6 +355,13 @@ static const struct ovs_len_tbl ovs_key_lens[OVS_KEY_ATTR_MAX + 1] = { [OVS_KEY_ATTR_CT_LABEL] = { .len = sizeof(struct ovs_key_ct_label) }, }; +static bool check_attr_len(unsigned int attr_len, unsigned int expected_len) +{ + return expected_len == attr_len || + expected_len == OVS_ATTR_NESTED || + expected_len == OVS_ATTR_VARIABLE; +} + static bool is_all_zero(const u8 *fp, size_t size) { int i; @@ -388,7 +401,7 @@ static int __parse_flow_nlattrs(const struct nlattr *attr, } expected_len = ovs_key_lens[type].len; - if (nla_len(nla) != expected_len && expected_len != OVS_ATTR_NESTED) { + if (!check_attr_len(nla_len(nla), expected_len)) { OVS_NLERR(log, "Key %d has unexpected len %d expected %d", type, nla_len(nla), expected_len); return -EINVAL; @@ -473,29 +486,50 @@ static int genev_tun_opt_from_nlattr(const struct nlattr *a, return 0; } -static const struct nla_policy vxlan_opt_policy[OVS_VXLAN_EXT_MAX + 1] = { - [OVS_VXLAN_EXT_GBP] = { .type = NLA_U32 }, -}; - -static int vxlan_tun_opt_from_nlattr(const struct nlattr *a, +static int vxlan_tun_opt_from_nlattr(const struct nlattr *attr, struct sw_flow_match *match, bool is_mask, bool log) { - struct nlattr *tb[OVS_VXLAN_EXT_MAX+1]; + struct nlattr *a; + int rem; unsigned long opt_key_offset; struct vxlan_metadata opts; - int err; BUILD_BUG_ON(sizeof(opts) > sizeof(match->key->tun_opts)); - err = nla_parse_nested(tb, OVS_VXLAN_EXT_MAX, a, vxlan_opt_policy); - if (err < 0) - return err; - memset(&opts, 0, sizeof(opts)); + nla_for_each_nested(a, attr, rem) { + int type = nla_type(a); - if (tb[OVS_VXLAN_EXT_GBP]) - opts.gbp = nla_get_u32(tb[OVS_VXLAN_EXT_GBP]); + if (type > OVS_VXLAN_EXT_MAX) { + OVS_NLERR(log, "VXLAN extension %d out of range max %d", + type, OVS_VXLAN_EXT_MAX); + return -EINVAL; + } + + if (!check_attr_len(nla_len(a), + ovs_vxlan_ext_key_lens[type].len)) { + OVS_NLERR(log, "VXLAN extension %d has unexpected len %d expected %d", + type, nla_len(a), + ovs_vxlan_ext_key_lens[type].len); + return -EINVAL; + } + + switch (type) { + case OVS_VXLAN_EXT_GBP: + opts.gbp = nla_get_u32(a); + break; + default: + OVS_NLERR(log, "Unknown VXLAN extension attribute %d", + type); + return -EINVAL; + } + } + if (rem) { + OVS_NLERR(log, "VXLAN extension message has %d unknown bytes.", + rem); + return -EINVAL; + } if (!is_mask) SW_FLOW_KEY_PUT(match, tun_opts_len, sizeof(opts), false); @@ -528,8 +562,8 @@ static int ipv4_tun_from_nlattr(const struct nlattr *attr, return -EINVAL; } - if (ovs_tunnel_key_lens[type].len != nla_len(a) && - ovs_tunnel_key_lens[type].len != OVS_ATTR_NESTED) { + if (!check_attr_len(nla_len(a), + ovs_tunnel_key_lens[type].len)) { OVS_NLERR(log, "Tunnel attr %d has unexpected len %d expected %d", type, nla_len(a), ovs_tunnel_key_lens[type].len); return -EINVAL; @@ -1052,10 +1086,13 @@ static void nlattr_set(struct nlattr *attr, u8 val, /* The nlattr stream should already have been validated */ nla_for_each_nested(nla, attr, rem) { - if (tbl && tbl[nla_type(nla)].len == OVS_ATTR_NESTED) - nlattr_set(nla, val, tbl[nla_type(nla)].next); - else + if (tbl[nla_type(nla)].len == OVS_ATTR_NESTED) { + if (tbl[nla_type(nla)].next) + tbl = tbl[nla_type(nla)].next; + nlattr_set(nla, val, tbl); + } else { memset(nla_data(nla), val, nla_len(nla)); + } } } @@ -1922,8 +1959,7 @@ static int validate_set(const struct nlattr *a, key_len /= 2; if (key_type > OVS_KEY_ATTR_MAX || - (ovs_key_lens[key_type].len != key_len && - ovs_key_lens[key_type].len != OVS_ATTR_NESTED)) + !check_attr_len(key_len, 
ovs_key_lens[key_type].len)) return -EINVAL; if (masked && !validate_masked(nla_data(ovs_key), key_len)) diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c index d22d8e948d0f..f2ea83ba4763 100644 --- a/net/openvswitch/flow_table.c +++ b/net/openvswitch/flow_table.c @@ -57,20 +57,21 @@ static u16 range_n_bytes(const struct sw_flow_key_range *range) } void ovs_flow_mask_key(struct sw_flow_key *dst, const struct sw_flow_key *src, - const struct sw_flow_mask *mask) + bool full, const struct sw_flow_mask *mask) { - const long *m = (const long *)((const u8 *)&mask->key + - mask->range.start); - const long *s = (const long *)((const u8 *)src + - mask->range.start); - long *d = (long *)((u8 *)dst + mask->range.start); + int start = full ? 0 : mask->range.start; + int len = full ? sizeof *dst : range_n_bytes(&mask->range); + const long *m = (const long *)((const u8 *)&mask->key + start); + const long *s = (const long *)((const u8 *)src + start); + long *d = (long *)((u8 *)dst + start); int i; - /* The memory outside of the 'mask->range' are not set since - * further operations on 'dst' only uses contents within - * 'mask->range'. + /* If 'full' is true then all of 'dst' is fully initialized. Otherwise, + * if 'full' is false the memory outside of the 'mask->range' is left + * uninitialized. This can be used as an optimization when further + * operations on 'dst' only use contents within 'mask->range'. */ - for (i = 0; i < range_n_bytes(&mask->range); i += sizeof(long)) + for (i = 0; i < len; i += sizeof(long)) *d++ = *s++ & *m++; } @@ -475,7 +476,7 @@ static struct sw_flow *masked_flow_lookup(struct table_instance *ti, u32 hash; struct sw_flow_key masked_key; - ovs_flow_mask_key(&masked_key, unmasked, mask); + ovs_flow_mask_key(&masked_key, unmasked, false, mask); hash = flow_hash(&masked_key, &mask->range); head = find_bucket(ti, hash); hlist_for_each_entry_rcu(flow, head, flow_table.node[ti->node_ver]) { diff --git a/net/openvswitch/flow_table.h b/net/openvswitch/flow_table.h index 616eda10d955..2dd9900f533d 100644 --- a/net/openvswitch/flow_table.h +++ b/net/openvswitch/flow_table.h @@ -86,5 +86,5 @@ struct sw_flow *ovs_flow_tbl_lookup_ufid(struct flow_table *, bool ovs_flow_cmp(const struct sw_flow *, const struct sw_flow_match *); void ovs_flow_mask_key(struct sw_flow_key *dst, const struct sw_flow_key *src, - const struct sw_flow_mask *mask); + bool full, const struct sw_flow_mask *mask); #endif /* flow_table.h */ diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index 7b8e39a22387..aa4b15c35884 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -230,6 +230,8 @@ struct packet_skb_cb { } sa; }; +#define vio_le() virtio_legacy_is_little_endian() + #define PACKET_SKB_CB(__skb) ((struct packet_skb_cb *)((__skb)->cb)) #define GET_PBDQC_FROM_RB(x) ((struct tpacket_kbdq_core *)(&(x)->prb_bdqc)) @@ -2680,15 +2682,15 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len) goto out_unlock; if ((vnet_hdr.flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) && - (__virtio16_to_cpu(false, vnet_hdr.csum_start) + - __virtio16_to_cpu(false, vnet_hdr.csum_offset) + 2 > - __virtio16_to_cpu(false, vnet_hdr.hdr_len))) - vnet_hdr.hdr_len = __cpu_to_virtio16(false, - __virtio16_to_cpu(false, vnet_hdr.csum_start) + - __virtio16_to_cpu(false, vnet_hdr.csum_offset) + 2); + (__virtio16_to_cpu(vio_le(), vnet_hdr.csum_start) + + __virtio16_to_cpu(vio_le(), vnet_hdr.csum_offset) + 2 > + __virtio16_to_cpu(vio_le(), vnet_hdr.hdr_len))) + vnet_hdr.hdr_len = 
__cpu_to_virtio16(vio_le(), + __virtio16_to_cpu(vio_le(), vnet_hdr.csum_start) + + __virtio16_to_cpu(vio_le(), vnet_hdr.csum_offset) + 2); err = -EINVAL; - if (__virtio16_to_cpu(false, vnet_hdr.hdr_len) > len) + if (__virtio16_to_cpu(vio_le(), vnet_hdr.hdr_len) > len) goto out_unlock; if (vnet_hdr.gso_type != VIRTIO_NET_HDR_GSO_NONE) { @@ -2731,7 +2733,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len) hlen = LL_RESERVED_SPACE(dev); tlen = dev->needed_tailroom; skb = packet_alloc_skb(sk, hlen + tlen, hlen, len, - __virtio16_to_cpu(false, vnet_hdr.hdr_len), + __virtio16_to_cpu(vio_le(), vnet_hdr.hdr_len), msg->msg_flags & MSG_DONTWAIT, &err); if (skb == NULL) goto out_unlock; @@ -2778,8 +2780,8 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len) if (po->has_vnet_hdr) { if (vnet_hdr.flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) { - u16 s = __virtio16_to_cpu(false, vnet_hdr.csum_start); - u16 o = __virtio16_to_cpu(false, vnet_hdr.csum_offset); + u16 s = __virtio16_to_cpu(vio_le(), vnet_hdr.csum_start); + u16 o = __virtio16_to_cpu(vio_le(), vnet_hdr.csum_offset); if (!skb_partial_csum_set(skb, s, o)) { err = -EINVAL; goto out_free; @@ -2787,7 +2789,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len) } skb_shinfo(skb)->gso_size = - __virtio16_to_cpu(false, vnet_hdr.gso_size); + __virtio16_to_cpu(vio_le(), vnet_hdr.gso_size); skb_shinfo(skb)->gso_type = gso_type; /* Header must be checked, and gso_segs computed. */ @@ -3161,9 +3163,9 @@ static int packet_recvmsg(struct socket *sock, struct msghdr *msg, size_t len, /* This is a hint as to how much should be linear. */ vnet_hdr.hdr_len = - __cpu_to_virtio16(false, skb_headlen(skb)); + __cpu_to_virtio16(vio_le(), skb_headlen(skb)); vnet_hdr.gso_size = - __cpu_to_virtio16(false, sinfo->gso_size); + __cpu_to_virtio16(vio_le(), sinfo->gso_size); if (sinfo->gso_type & SKB_GSO_TCPV4) vnet_hdr.gso_type = VIRTIO_NET_HDR_GSO_TCPV4; else if (sinfo->gso_type & SKB_GSO_TCPV6) @@ -3181,9 +3183,9 @@ static int packet_recvmsg(struct socket *sock, struct msghdr *msg, size_t len, if (skb->ip_summed == CHECKSUM_PARTIAL) { vnet_hdr.flags = VIRTIO_NET_HDR_F_NEEDS_CSUM; - vnet_hdr.csum_start = __cpu_to_virtio16(false, + vnet_hdr.csum_start = __cpu_to_virtio16(vio_le(), skb_checksum_start_offset(skb)); - vnet_hdr.csum_offset = __cpu_to_virtio16(false, + vnet_hdr.csum_offset = __cpu_to_virtio16(vio_le(), skb->csum_offset); } else if (skb->ip_summed == CHECKSUM_UNNECESSARY) { vnet_hdr.flags = VIRTIO_NET_HDR_F_DATA_VALID; diff --git a/net/sched/cls_fw.c b/net/sched/cls_fw.c index 715e01e5910a..f23a3b68bba6 100644 --- a/net/sched/cls_fw.c +++ b/net/sched/cls_fw.c @@ -33,7 +33,6 @@ struct fw_head { u32 mask; - bool mask_set; struct fw_filter __rcu *ht[HTSIZE]; struct rcu_head rcu; }; @@ -84,7 +83,7 @@ static int fw_classify(struct sk_buff *skb, const struct tcf_proto *tp, } } } else { - /* old method */ + /* Old method: classify the packet using its skb mark. */ if (id && (TC_H_MAJ(id) == 0 || !(TC_H_MAJ(id ^ tp->q->handle)))) { res->classid = id; @@ -114,14 +113,9 @@ static unsigned long fw_get(struct tcf_proto *tp, u32 handle) static int fw_init(struct tcf_proto *tp) { - struct fw_head *head; - - head = kzalloc(sizeof(struct fw_head), GFP_KERNEL); - if (head == NULL) - return -ENOBUFS; - - head->mask_set = false; - rcu_assign_pointer(tp->root, head); + /* We don't allocate fw_head here, because in the old method + * we don't need it at all. 
+ */ return 0; } @@ -252,7 +246,7 @@ static int fw_change(struct net *net, struct sk_buff *in_skb, int err; if (!opt) - return handle ? -EINVAL : 0; + return handle ? -EINVAL : 0; /* Succeed if it is old method. */ err = nla_parse_nested(tb, TCA_FW_MAX, opt, fw_policy); if (err < 0) @@ -302,11 +296,17 @@ static int fw_change(struct net *net, struct sk_buff *in_skb, if (!handle) return -EINVAL; - if (!head->mask_set) { - head->mask = 0xFFFFFFFF; + if (!head) { + u32 mask = 0xFFFFFFFF; if (tb[TCA_FW_MASK]) - head->mask = nla_get_u32(tb[TCA_FW_MASK]); - head->mask_set = true; + mask = nla_get_u32(tb[TCA_FW_MASK]); + + head = kzalloc(sizeof(*head), GFP_KERNEL); + if (!head) + return -ENOBUFS; + head->mask = mask; + + rcu_assign_pointer(tp->root, head); } f = kzalloc(sizeof(struct fw_filter), GFP_KERNEL); diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c index b7143337e4fa..3d9ea9a48289 100644 --- a/net/sctp/protocol.c +++ b/net/sctp/protocol.c @@ -1186,7 +1186,7 @@ static void sctp_v4_del_protocol(void) unregister_inetaddr_notifier(&sctp_inetaddr_notifier); } -static int __net_init sctp_net_init(struct net *net) +static int __net_init sctp_defaults_init(struct net *net) { int status; @@ -1279,12 +1279,6 @@ static int __net_init sctp_net_init(struct net *net) sctp_dbg_objcnt_init(net); - /* Initialize the control inode/socket for handling OOTB packets. */ - if ((status = sctp_ctl_sock_init(net))) { - pr_err("Failed to initialize the SCTP control sock\n"); - goto err_ctl_sock_init; - } - /* Initialize the local address list. */ INIT_LIST_HEAD(&net->sctp.local_addr_list); spin_lock_init(&net->sctp.local_addr_lock); @@ -1300,9 +1294,6 @@ static int __net_init sctp_net_init(struct net *net) return 0; -err_ctl_sock_init: - sctp_dbg_objcnt_exit(net); - sctp_proc_exit(net); err_init_proc: cleanup_sctp_mibs(net); err_init_mibs: @@ -1311,15 +1302,12 @@ err_sysctl_register: return status; } -static void __net_exit sctp_net_exit(struct net *net) +static void __net_exit sctp_defaults_exit(struct net *net) { /* Free the local address list */ sctp_free_addr_wq(net); sctp_free_local_addr_list(net); - /* Free the control endpoint. */ - inet_ctl_sock_destroy(net->sctp.ctl_sock); - sctp_dbg_objcnt_exit(net); sctp_proc_exit(net); @@ -1327,9 +1315,32 @@ static void __net_exit sctp_net_exit(struct net *net) sctp_sysctl_net_unregister(net); } -static struct pernet_operations sctp_net_ops = { - .init = sctp_net_init, - .exit = sctp_net_exit, +static struct pernet_operations sctp_defaults_ops = { + .init = sctp_defaults_init, + .exit = sctp_defaults_exit, +}; + +static int __net_init sctp_ctrlsock_init(struct net *net) +{ + int status; + + /* Initialize the control inode/socket for handling OOTB packets. */ + status = sctp_ctl_sock_init(net); + if (status) + pr_err("Failed to initialize the SCTP control sock\n"); + + return status; +} + +static void __net_init sctp_ctrlsock_exit(struct net *net) +{ + /* Free the control endpoint. */ + inet_ctl_sock_destroy(net->sctp.ctl_sock); +} + +static struct pernet_operations sctp_ctrlsock_ops = { + .init = sctp_ctrlsock_init, + .exit = sctp_ctrlsock_exit, }; /* Initialize the universe into something sensible. 
*/ @@ -1462,8 +1473,11 @@ static __init int sctp_init(void) sctp_v4_pf_init(); sctp_v6_pf_init(); - status = sctp_v4_protosw_init(); + status = register_pernet_subsys(&sctp_defaults_ops); + if (status) + goto err_register_defaults; + status = sctp_v4_protosw_init(); if (status) goto err_protosw_init; @@ -1471,9 +1485,9 @@ static __init int sctp_init(void) if (status) goto err_v6_protosw_init; - status = register_pernet_subsys(&sctp_net_ops); + status = register_pernet_subsys(&sctp_ctrlsock_ops); if (status) - goto err_register_pernet_subsys; + goto err_register_ctrlsock; status = sctp_v4_add_protocol(); if (status) @@ -1489,12 +1503,14 @@ out: err_v6_add_protocol: sctp_v4_del_protocol(); err_add_protocol: - unregister_pernet_subsys(&sctp_net_ops); -err_register_pernet_subsys: + unregister_pernet_subsys(&sctp_ctrlsock_ops); +err_register_ctrlsock: sctp_v6_protosw_exit(); err_v6_protosw_init: sctp_v4_protosw_exit(); err_protosw_init: + unregister_pernet_subsys(&sctp_defaults_ops); +err_register_defaults: sctp_v4_pf_exit(); sctp_v6_pf_exit(); sctp_sysctl_unregister(); @@ -1527,12 +1543,14 @@ static __exit void sctp_exit(void) sctp_v6_del_protocol(); sctp_v4_del_protocol(); - unregister_pernet_subsys(&sctp_net_ops); + unregister_pernet_subsys(&sctp_ctrlsock_ops); /* Free protosw registrations */ sctp_v6_protosw_exit(); sctp_v4_protosw_exit(); + unregister_pernet_subsys(&sctp_defaults_ops); + /* Unregister with socket layer. */ sctp_v6_pf_exit(); sctp_v4_pf_exit(); diff --git a/net/tipc/msg.c b/net/tipc/msg.c index 562c926a51cc..c5ac436235e0 100644 --- a/net/tipc/msg.c +++ b/net/tipc/msg.c @@ -539,6 +539,7 @@ bool tipc_msg_lookup_dest(struct net *net, struct sk_buff *skb, int *err) *err = -TIPC_ERR_NO_NAME; if (skb_linearize(skb)) return false; + msg = buf_msg(skb); if (msg_reroute_cnt(msg)) return false; dnode = addr_domain(net, msg_lookup_scope(msg)); |
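The af_netlink.c changes above close the autobind races by publishing a socket's identity through a separate 'bound' flag with explicit barriers rather than testing portid directly. The fragment below is a schematic pairing condensed from the netlink_insert() and netlink_bind() hunks; the socket locking, the hash insertion and the error paths are elided, and it is meant as a reading aid rather than compilable code.

/* Writer, netlink_insert(): commit portid and the hash entry first,
 * then make the socket observable as bound.
 */
smp_wmb();			/* order the stores above the flag */
nlk_sk(sk)->bound = portid;

/* Reader, netlink_bind(): sample the flag once, and only trust
 * portid after the paired read barrier.
 */
bound = nlk->bound;
if (bound) {
	smp_rmb();		/* pairs with the smp_wmb() above */
	if (nladdr->nl_pid != nlk->portid)
		return -EINVAL;
}

Because netlink_connect() and netlink_sendmsg() now test nlk->bound in the same way, a thread can no longer observe a nonzero portid on a socket that has not actually been hashed yet, and rebinding to the same portid is accepted instead of returning -EBUSY.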