Age | Commit message | Author | Files | Lines
2017-01-09 | net: change init_inodecache() return void | yuan linyu | 1 | -4/+2
sock_init() calls it but does not check its return value, so change it to return void and add an internal BUG_ON() check. Signed-off-by: yuan linyu <Linyu.Yuan@alcatel-sbell.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-08 | Merge branch 'tc-skb-diet' | David S. Miller | 13 | -118/+66
Willem de Bruijn says: ==================== convert tc_verd to integer bitfields The skb tc_verd field takes up two bytes but uses far fewer bits. Convert the remaining use cases to bitfields that fit in existing holes (depending on config options) and potentially save the two bytes in struct sk_buff. This patchset is based on an earlier set by Florian Westphal and its discussion (http://www.spinics.net/lists/netdev/msg329181.html). Patches 1 and 2 are low hanging fruit: removing the last traces of data that are no longer stored in tc_verd. Patches 3 and 4 convert tc_verd to individual bitfields (5 bits). Patch 5 reduces TC_AT to a single bitfield, as AT_STACK is not valid here (unlike in the case of TC_FROM). Patch 6 changes TC_FROM to two bitfields with clearly defined purpose. It may be possible to reduce storage further after this initial round. If tc_skip_classify is set only by IFB, testing skb_iif may suffice. The L2 header pushing/popping logic can perhaps be shared with AF_PACKET, which currently not pkt_type for the same purpose. 
Changes:
  RFC -> v1
    - (patch 3): remove no longer needed label in tcf_action_exec
    - (patch 5): set tc_at_ingress at the same points as existing SET_TC_AT calls

Tested ingress mirred + netem + ifb:

  ip link set dev ifb0 up
  tc qdisc add dev eth0 ingress
  tc filter add dev eth0 parent ffff: \
    u32 match ip dport 8000 0xffff \
    action mirred egress redirect dev ifb0
  tc qdisc add dev ifb0 root netem delay 1000ms
  nc -u -l 8000 &
  ssh $otherhost nc -u $host 8000

Tested egress mirred:

  ip link add veth1 type veth peer name veth2
  ip link set dev veth1 up
  ip link set dev veth2 up
  tcpdump -n -i veth2 udp and dst port 8000 &
  tc qdisc add dev eth0 root handle 1: prio
  tc filter add dev eth0 parent 1:0 \
    u32 match ip dport 8000 0xffff \
    action mirred egress redirect dev veth1
  tc qdisc add dev veth1 root netem delay 1000ms
  nc -u $otherhost 8000

Tested ingress mirred:

  ip link add veth1 type veth peer name veth2
  ip link add veth3 type veth peer name veth4
  ip netns add ns0
  ip netns add ns1
  for i in 1 2 3 4; do \
    NS=ns$((${i}%2)); \
    ip link set dev veth${i} netns ${NS}; \
    ip netns exec ${NS} \
      ip addr add dev veth${i} 192.168.1.${i}/24; \
    ip netns exec ${NS} \
      ip link set dev veth${i} up; \
  done
  ip netns exec ns0 tc qdisc add dev veth2 ingress
  ip netns exec ns0 \
    tc filter add dev veth2 parent ffff: \
      u32 match ip dport 8000 0xffff \
      action mirred ingress redirect dev veth4
  ip netns exec ns0 \
    tcpdump -n -i veth4 udp and dst port 8000 &
  ip netns exec ns1 \
    nc -u 192.168.1.2 8000

====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-08 | net-tc: convert tc_from to tc_from_ingress and tc_redirected | Willem de Bruijn | 6 | -19/+15
The tc_from field fulfills two roles. It encodes whether a packet was redirected by an act_mirred device and, if so, whether act_mirred was called on ingress or egress. Split it into separate fields. The information is needed by the special IFB loop, where packets are taken out of the normal path by act_mirred, forwarded to IFB, then reinjected at their original location (ingress or egress) by IFB. The IFB device cannot use skb->tc_at_ingress, because that may have been overwritten as the packet travels from act_mirred to ifb_xmit, when it passes through tc_classify on the IFB egress path. Cache this value in skb->tc_from_ingress. That field is valid only if a packet arriving at ifb_xmit came from act_mirred. Other packets can be crafted to reach ifb_xmit. These must be dropped. Set tc_redirected on redirection and drop all packets that do not have this bit set. Both fields are set only on cloned skbs in tc actions, so original packet sources do not have to clear the bit when reusing packets (notably, pktgen and octeon). Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
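The redirect-and-reinject rule described in this commit can be sketched in plain C. This is only an illustration under stated assumptions: `struct pkt`, `mirred_redirect()`, `ifb_xmit_verdict()` and the verdict enum are hypothetical stand-ins, not the kernel's `sk_buff`, act_mirred or IFB code. Only the three bit names come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the few sk_buff bits discussed above. */
struct pkt {
    unsigned int tc_at_ingress:1;   /* where tc is processing the packet now */
    unsigned int tc_redirected:1;   /* set only when a clone was redirected */
    unsigned int tc_from_ingress:1; /* cached origin, valid if tc_redirected */
};

enum ifb_verdict { IFB_DROP, IFB_REINJECT_INGRESS, IFB_REINJECT_EGRESS };

/* Redirect step: operate on a copy so the original never carries the bits. */
static inline struct pkt mirred_redirect(const struct pkt *orig)
{
    struct pkt clone;

    assert(orig != NULL);
    clone = *orig;
    clone.tc_redirected = 1;
    clone.tc_from_ingress = orig->tc_at_ingress; /* cache before it changes */
    return clone;
}

/* Receive step: drop anything that was not redirected, else use the cache. */
static inline enum ifb_verdict ifb_xmit_verdict(const struct pkt *p)
{
    assert(p != NULL);
    if (!p->tc_redirected)
        return IFB_DROP;
    return p->tc_from_ingress ? IFB_REINJECT_INGRESS : IFB_REINJECT_EGRESS;
}
```

Even if `tc_at_ingress` is overwritten on the clone while it travels to the receiving device, the verdict still follows the cached `tc_from_ingress`, which is the reason the commit gives for keeping a separate field.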
2017-01-08 | net-tc: convert tc_at to tc_at_ingress | Willem de Bruijn | 4 | -14/+12
Field tc_at is used only within tc actions to distinguish ingress from egress processing. A single bit is sufficient for this purpose. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-08 | net-tc: convert tc_verd to integer bitfields | Willem de Bruijn | 11 | -65/+29
Extract the remaining two fields from tc_verd and remove the __u16 completely. TC_AT and TC_FROM are converted to equivalent two-bit integer fields tc_at and tc_from. Where possible, use existing helper skb_at_tc_ingress when reading tc_at. Introduce helper skb_reset_tc to clear fields. Not documenting tc_from and tc_at, because they will be replaced with single bit fields in follow-on patches. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
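To make the conversion concrete, here is a small sketch contrasting a hand-masked 16-bit word with equivalent integer bitfields. The bit positions, `struct pkt_tc` and `pkt_reset_tc()` are assumptions made for illustration. They are not the kernel's real tc_verd encoding or its skb_reset_tc helper.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions only; NOT the kernel's real tc_verd layout. */
#define OLD_AT_SHIFT 0
#define OLD_AT_MASK  0x3u

/* Before: fields live in one 16-bit word and are read with shift and mask. */
static inline unsigned int old_get_at(uint16_t verd)
{
    return (verd >> OLD_AT_SHIFT) & OLD_AT_MASK;
}

static inline uint16_t old_set_at(uint16_t verd, unsigned int at)
{
    assert(at <= OLD_AT_MASK);
    verd = (uint16_t)(verd & ~(OLD_AT_MASK << OLD_AT_SHIFT));
    return (uint16_t)(verd | (at << OLD_AT_SHIFT));
}

/* After: the compiler does the masking; the fields can sit in existing holes. */
struct pkt_tc {
    unsigned int tc_skip_classify:1;
    unsigned int tc_at:2;
    unsigned int tc_from:2;
};

/* Hypothetical reset helper in the spirit of the one the commit introduces. */
static inline void pkt_reset_tc(struct pkt_tc *p)
{
    assert(p != 0);
    p->tc_at = 0;
    p->tc_from = 0;
}
```

With bitfields, a read such as `p->tc_at` replaces the shift-and-mask accessor, and a single helper can clear the fields at every call site.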
2017-01-08 | net-tc: extract skip classify bit from tc_verd | Willem de Bruijn | 6 | -22/+23
Packets sent by the IFB device skip subsequent tc classification. A single bit governs this state. Move it out of tc_verd in anticipation of removing that __u16 completely. The new bitfield tc_skip_classify temporarily uses one bit of a hole, until tc_verd is removed completely in a follow-up patch. Remove the bit hole comment. It could be 2, 3, 4 or 5 bits long. With that many options, little value in documenting it. Introduce a helper function to deduplicate the logic in the two sites that check this bit. The field tc_skip_classify is set only in IFB on skbs cloned in act_mirred, so original packet sources do not have to clear the bit when reusing packets (notably, pktgen and octeon). Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-08 | net-tc: make MAX_RECLASSIFY_LOOP local | Willem de Bruijn | 2 | -6/+2
This field is no longer kept in tc_verd. Remove it from the global definition of that struct. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-08 | net-tc: remove unused tc_verd fields | Willem de Bruijn | 1 | -7/+0
Remove the last reference to tc_verd's munge and redirect ttl bits. These fields are no longer used. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-08 | mdio: Demote print from info to debug in mdio_device_register | Florian Fainelli | 1 | -1/+1
While it is useful to know which MDIO device is being registered, demote the dev_info() to a dev_dbg(). Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-08 | net: remove useless memset's in drivers get_stats64 | stephen hemminger | 3 | -4/+0
In dev_get_stats() the statistic structure storage has already been zeroed. Therefore network drivers do not need to call memset() again. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-08 | net: make ndo_get_stats64 a void function | stephen hemminger | 82 | -309/+166
The network device operation for reading statistics is only called in one place, and it ignores the return value. Having a structure return value is potentially confusing because some future driver could incorrectly assume that the return value was used. Fix all drivers with ndo_get_stats64 to have a void function. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
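The shape of the change can be sketched with a generic callback table. All names here (`struct netdev`, `struct netdev_ops`, `struct stats64`, `read_stats()`, `demo_get_stats64()`) are hypothetical and simplified. They are not the kernel's `net_device_ops` or `rtnl_link_stats64` definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct stats64 { uint64_t rx_packets; uint64_t tx_packets; };

struct netdev;

/* The callback returns void: it can only fill the caller's buffer. */
struct netdev_ops {
    void (*get_stats64)(struct netdev *dev, struct stats64 *s);
};

struct netdev {
    const struct netdev_ops *ops;
    uint64_t rx;
    uint64_t tx;
};

/* Example driver callback: no memset needed, the caller already zeroed *s. */
static inline void demo_get_stats64(struct netdev *dev, struct stats64 *s)
{
    s->rx_packets = dev->rx;
    s->tx_packets = dev->tx;
}

/* Single call site: zero the storage once, then let the driver fill it. */
static inline void read_stats(struct netdev *dev, struct stats64 *s)
{
    assert(dev && dev->ops && dev->ops->get_stats64 && s);
    memset(s, 0, sizeof(*s));
    dev->ops->get_stats64(dev, s);
}
```

Because the callback has no return value, a driver author cannot mistakenly rely on the caller consuming one.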
2017-01-08 | Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue | David S. Miller | 7 | -30/+33
Jeff Kirsher says: ==================== 100GbE Intel Wired LAN Driver Updates 2017-01-08 This series contains updates to fm10k only. Ngai-Mint changes the driver to use the MAC pointer in the fm10k_mac_info structure for fm10k_get_host_state_generic(). Fixed a race condition where the mailbox interrupt request bits can be cleared before being handled, leaving certain mailbox messages from the PF untreated and causing the PF to enter an inactive state. Jake removes the typecast of u8 to char, and the extra variable that was created for the typecast. Bumps the driver version. Added back the receive descriptor timestamp value so that applications built on top of the IES API can function properly. Cleaned up the debug statistics flag, since debug statistics were removed and the flag was missed in the removal. Scott limits the DMA sync for CPU to the actual length of the packet, instead of the entire buffer, since the DMA sync occurs every time a packet is received. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-08 | net: ipv4: Remove flow arg from ip_mkroute_input | David Ahern | 1 | -2/+1
fl4 arg is not used; remove it. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-08 | net: ipmr: Remove nowait arg to ipmr_get_route | David Ahern | 3 | -8/+3
ipmr_get_route has 1 caller and the nowait arg is 0. Remove the arg and simplify ipmr_get_route accordingly. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-08 | liquidio: simplify octeon_flush_iq() | Derek Chickles | 4 | -28/+27
Because every call to octeon_flush_iq() has a hardcoded 1 for the pending_thresh argument, simplify that function by removing that argument. This avoids one atomic read as well. Signed-off-by: Derek Chickles <derek.chickles@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: Satanand Burla <satananda.burla@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-08 | fm10k: remove FM10K_FLAG_DEBUG_STATS | Jacob Keller | 1 | -1/+0
The debug statistics were removed due to complications with the ethtool statistics API which are not possible to resolve without a new statistics interface. The flag was left behind, but we no longer need it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-01-08 | fm10k: report the receive timestamp in FM10K_CB(skb)->tstamp | Jacob Keller | 2 | -3/+4
This was accidentally removed when we defeatured the full 1588 Clock support. We need to report the Rx descriptor timestamp value so that applications built on top of the IES API can function properly. Additionally, remove the FM10K_FLAG_RX_TS_ENABLED, as it is not used now that 1588 functionality has been removed. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-01-08 | fm10k: Limit dma sync of RX buffers to actual packet size | Scott Peterson | 1 | -3/+5
On packet RX, we perform a dma sync for cpu before passing the packet up. Here we limit that sync to the actual length of the incoming packet, rather than always syncing the entire buffer. Signed-off-by: Scott Peterson <scott.d.peterson@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-01-08 | fm10k: bump version number | Jacob Keller | 1 | -1/+1
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-01-08 | fm10k: do not clear global mailbox interrupt bits | Ngai-Mint Kwan | 1 | -4/+0
Partially revert commit 5e93cbadd3e9 ("fm10k: Reset mailbox global interrupts", 2016-06-07) The register bits related to this commit are now solely being handled by the IES API. Recent changes in the IES API will allow an automatic recovery from improper handling of these bits. Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-01-08 | fm10k: request reset when mbx->state changes | Ngai-Mint Kwan | 2 | -4/+12
Multiple IES API resets can cause a race condition where the mailbox interrupt request bits can be cleared before being handled. This can leave certain mailbox messages from the PF untreated, and the PF will enter an inactive state. If this situation occurs, the IES API will initiate a mailbox version reset which then triggers a mailbox state change. Once this mailbox transition occurs (from OPEN to CONNECT state), a request for reset will be returned. This ensures that the PF will undergo a reset whenever the IES API encounters an unknown global mailbox interrupt event or whenever the IES API terminates. Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-01-08 | fm10k: remove extraneous variable definition in fm10k_ethtool.c | Jacob Keller | 1 | -12/+9
We don't need to typecast a u8 * into a char *, so just remove the extra variable. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-01-08 | fm10k-shared: use mac-> instead of hw->mac. | Ngai-Mint Kwan | 1 | -3/+3
Since a pointer "mac" to fm10k_mac_info structure exists, use it to access the contents of its members. Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-01-07 | net: dsa: move HWMON support to its own file | Vivien Didelot | 4 | -129/+159
Isolate the HWMON support in DSA in its own file. Currently only the legacy DSA code is concerned. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-07 | Merge branch 'netcp-next' | David S. Miller | 6 | -53/+292
Murali Karicheri says: ==================== netcp: enhancements and minor fixes This series is for net-next. This propagates enhancements and minor bug fixes from internal version of the driver to keep the upstream in sync. Please review and apply if this looks good. Tested on all of K2HK/E/L boards with nfs rootfs. Test logs below K2HK-EVM: http://pastebin.ubuntu.com/23754106/ k2L-EVM: http://pastebin.ubuntu.com/23754143/ K2E-EVM: http://pastebin.ubuntu.com/23754159/ History: v1 - dropped 1/10 and 2/10 of v0 based on comments from Rob as it needs more work before submission v0 - Initial version ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-07 | net: netcp: ale: add proper ale entry mask bits for netcp switch ALE | Karicheri, Muralidharan | 2 | -19/+84
For NetCP NU Switch ALE, some of the mask bits are different from the defaults used in the driver. Add a new macro DEFINE_ALE_FIELD1 that uses configurable mask bits and use it in the driver. These bits are set to correct values by using the new variables added to cpsw_ale structure and re-used in the macros. The parameter nu_switch_ale is configured by the caller driver to indicate the ALE is for that switch and is used in the ALE driver to do customization as needed. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-07 | net: netcp: ale: use ale_status to size the ale table | Karicheri, Muralidharan | 2 | -4/+31
ALE h/w on newer versions of NetCP (K2E/L/G) provides an ALE_STATUS register for the size of the ALE Table implemented in h/w. Currently, for example, we set the ALE Table size to 1024 for NetCP ALE on K2E even though the ALE Status/Documentation shows it has 8192 entries. So take advantage of this register to read the size of the ALE table supported and use that value in the driver for the newer version of NetCP ALE. For NetCP lite, the ALE Table size is much smaller (64) and is indicated by a size of zero in ALE_STATUS. So use that as a default for now. While at it, also fix the ALE table size on the 10G switch to 2048 per the User guide http://www.ti.com/lit/ug/spruhj5/spruhj5.pdf Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-07 | net: netcp: ale: update to support unknown vlan controls for NU switch | Karicheri, Muralidharan | 3 | -7/+61
In the NU Ethernet switch used on some of the Keystone SoCs, there is a separate UNKNOWNVLAN register for membership, unreg mcast flood, reg mcast flood and force untag egress bits in ALE. So control for these fields requires a different address offset, shift and size of field. As this ALE has the same version number as the ALE in CPSW found on other SoCs, customization based on version number is not possible. So use a configuration parameter, nu_switch_ale, to identify the ALE found in NU Switch. Different treatment is needed for NU Switch ALE due to differences in the ale table bits, separate unknown vlan registers etc. The register information available in ale_controls needs to be updated to support the netcp NU switch h/w. So it is not a constant array any more, since it needs to be updated based on ALE type. The header of the file is also updated to indicate it supports N port switch ALE, not just 3 port. The version mask is 3 bits in NU Switch ALE vs 8 bits on other ALE types. While at it, change the debug print to info print so that ALE version gets displayed in boot log. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-07 | net: netcp: use hw capability to remove FCS word from rx packets | Karicheri, Muralidharan | 3 | -4/+16
Some of the newer Ethernet switch hw (such as that on k2e/l/g) can strip the Ethernet FCS from the packet at the port 0 egress of the switch. So use this capability instead of doing it in software. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-07 | net: netcp: ethss: get phy-handle only if link interface is MAC-to-PHY | Karicheri, Muralidharan | 1 | -1/+3
Currently, when parsing phy-handle, the driver doesn't check whether the interface is MAC-to-PHY. This patch adds this check for all MAC-to-PHY interface types supported by the driver. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-07 | net: netcp: store network statistics in 64 bits | Michael Scherban | 2 | -12/+74
Previously the network statistics were stored in 32-bit variables, which can cause some stats to roll over after several minutes of high traffic. This implements 64-bit storage so larger numbers can be stored. Signed-off-by: Michael Scherban <m-scherban@ti.com> Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
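One common way to keep a 64-bit total on top of a 32-bit hardware counter is to accumulate the unsigned difference between successive reads. The sketch below shows that generic technique only. It is not taken from the netcp driver, and `struct stat64` and `stat64_update()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software shadow of one 32-bit hardware counter. */
struct stat64 {
    uint32_t last_hw; /* raw value seen at the previous read */
    uint64_t total;   /* running 64-bit total exposed to users */
};

/*
 * Fold a fresh 32-bit reading into the 64-bit total. Unsigned subtraction
 * wraps modulo 2^32, so the delta stays correct across a single rollover
 * as long as the counter is polled more often than it wraps.
 */
static inline void stat64_update(struct stat64 *s, uint32_t hw_now)
{
    uint32_t delta;

    assert(s != 0);
    delta = hw_now - s->last_hw;
    s->total += delta;
    s->last_hw = hw_now;
}
```

Polling frequency is the main design constraint: if the hardware counter can wrap twice between two reads, the difference is ambiguous and the total undercounts.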
2017-01-07 | net: netcp: remove the redundant memmov() | Karicheri, Muralidharan | 1 | -3/+3
The psdata is populated with command data by netcp modules at the tail of the buffer, and set_words() copies the same to the front of the psdata. So remove the redundant memmove() call. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-07 | net: netcp: extract eflag from desc for rx_hook handling | Karicheri, Muralidharan | 3 | -3/+20
Extract the eflag bits from the received desc and pass them down the rx_hook chain so they are available to netcp modules. The psdata and epib data also have to be inspected by the netcp modules, so the desc can be freed only after returning from the rx_hook. Therefore move knav_pool_desc_put() after the rx_hook processing. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-07 | Merge branch 'cpsw-cpdma-DDR' | David S. Miller | 8 | -72/+199
Grygorii Strashko says: ==================== net: ethernet: ti: cpsw: support placing CPDMA descriptors into DDR This series is intended to add support for placing CPDMA descriptors into DDR by introducing a new module parameter "descs_pool_size" to specify the size of the descriptor pool. The "descs_pool_size" defines the total number of CPDMA CPPI descriptors to be used for both ingress/egress packet processing. If not specified, the default value 256 will be used, which allows the descriptor pool to be placed into the internal CPPI RAM. In addition, it adds the ability to re-split the CPDMA pool of descriptors between the RX and TX paths via the ethtool '-G' command, which allows configuring and fixing the number of descriptors used by the RX and TX paths; these are then split between RX/TX channels proportionally depending on the number of RX/TX channels and their weight. This significantly reduces the UDP packet drop rate for bandwidth >301 Mbits/sec (am57x). Before enabling this feature, the am437x SoC has to be fixed, as it has been shown not to work when CPDMA descriptors are placed in DDR. So patch 1 fixes this issue. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-07 | Documentation: DT: net: cpsw: remove no_bd_ram property | Grygorii Strashko | 5 | -7/+0
Even if no_bd_ram property is described in TI CPSW bindings the support for it has never been introduced in CPSW driver, so there are no real users of it. Hence, remove no_bd_ram property from documentation and DT files. Cc: 'Rob Herring <robh+dt@kernel.org>' Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-07 | net: ethernet: ti: cpsw: add support for ringparam configuration | Grygorii Strashko | 3 | -8/+122
The CPDMA uses one pool of descriptors for both RX and TX, which by default is split between all channels proportionally depending on the total number of CPDMA channels and the number of TX and RX channels. As a result, more descriptors will be consumed by the TX path if there are more TX channels, and there is currently no way to dedicate more descriptors to the RX path.

So, add the ability to re-split the CPDMA pool of descriptors between the RX and TX paths via the ethtool '-G' command, which allows configuring and fixing the number of descriptors used by the RX and TX paths; these are then split between RX/TX channels proportionally depending on the number of RX/TX channels and their weight. The ethtool '-G' command accepts only the number of RX entries; the rest of the descriptors are arranged for TX automatically.

Command:
  ethtool -G <devname> rx <number of descriptors>

Defaults and limitations:
- minimum number of rx descriptors is 10% of total number of descriptors in CPDMA pool
- maximum number of rx descriptors is 90% of total number of descriptors in CPDMA pool
- by default, descriptors will be split equally between RX/TX path
- any values passed in "tx" parameter will be ignored

Usage:

  # ethtool -g eth0
  Pre-set maximums:
  RX:             7372
  RX Mini:        0
  RX Jumbo:       0
  TX:             0
  Current hardware settings:
  RX:             4096
  RX Mini:        0
  RX Jumbo:       0
  TX:             4096

  # ethtool -G eth0 rx 7372
  # ethtool -g eth0
  Ring parameters for eth0:
  Pre-set maximums:
  RX:             7372
  RX Mini:        0
  RX Jumbo:       0
  TX:             0
  Current hardware settings:
  RX:             7372
  RX Mini:        0
  RX Jumbo:       0
  TX:             820

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
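The limits above can be checked with a few lines of arithmetic. This is a sketch under assumptions: `split_descs()` and `struct ring_split` are hypothetical names, and rounding the 10% and 90% bounds down is inferred from the usage output (8192 descriptors in total, RX maximum 7372), not taken from the driver.

```c
#include <assert.h>

struct ring_split {
    unsigned int rx;
    unsigned int tx;
};

/*
 * Validate a requested RX descriptor count against the 10%..90% window and
 * give every remaining descriptor to TX. Returns 0 on success, -1 if the
 * request is out of range (in which case *out is left untouched).
 */
static inline int split_descs(unsigned int total, unsigned int rx_req,
                              struct ring_split *out)
{
    unsigned int rx_min, rx_max;

    assert(out != 0);
    rx_min = total / 10;
    rx_max = (unsigned int)((unsigned long long)total * 9 / 10);
    if (rx_req < rx_min || rx_req > rx_max)
        return -1;
    out->rx = rx_req;
    out->tx = total - rx_req;
    return 0;
}
```

With a pool of 8192 descriptors, requesting 7372 for RX leaves 8192 - 7372 = 820 for TX, which matches the "TX: 820" line in the usage output.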
2017-01-07 | net: ethernet: ti: cpsw: add support for descs pool size configuration | Grygorii Strashko | 3 | -3/+22
The CPSW CPDMA can process buffer descriptors placed either in internal CPPI RAM or in DDR. This patch adds support in CPSW and CPDMA for the descs_pool_size module parameter, which defines the total number of CPDMA CPPI descriptors to be used for both ingress/egress packet processing:
- the memory size required for the CPDMA descriptor pool is calculated from the number of descriptors specified by the user in descs_pool_size and the CPDMA descriptor size, and is allocated from coherent memory (CMA area);
- the CPDMA descriptor pool will be allocated in DDR if the pool memory size > internal CPPI RAM, or use internal CPPI RAM otherwise;
- if descs_pool_size is not specified in DT, the default value 256 will be used, which allows the CPDMA descriptor pool to be placed into the internal CPPI RAM (current default behaviour);
- CPDMA will ignore descs_pool_size if descs_pool_size = 0, for backward compatibility with davinci_emac.

descs_pool_size is a boot-time setting and can't be changed once CPSW/CPDMA is initialized.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
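The placement rule in the list above reduces to one size comparison. The sketch below is illustrative only: `plan_pool()`, `struct pool_plan` and the sizes used in the usage note are made up, and treating descs_pool_size = 0 as "fill the internal RAM" is an assumption about what ignoring the parameter means.

```c
#include <assert.h>
#include <stddef.h>

enum pool_location { POOL_INTERNAL_RAM, POOL_DDR };

struct pool_plan {
    size_t num_desc;          /* descriptors the pool will hold */
    size_t bytes;             /* memory needed for the pool */
    enum pool_location where; /* where the pool ends up */
};

/* Decide pool size and placement from the requested descriptor count. */
static inline struct pool_plan plan_pool(size_t descs_pool_size,
                                         size_t desc_size, size_t ram_size)
{
    struct pool_plan plan;

    assert(desc_size > 0);
    if (descs_pool_size == 0)
        plan.num_desc = ram_size / desc_size; /* assumption: size from RAM */
    else
        plan.num_desc = descs_pool_size;
    plan.bytes = plan.num_desc * desc_size;
    plan.where = plan.bytes > ram_size ? POOL_DDR : POOL_INTERNAL_RAM;
    return plan;
}
```

For example, with a hypothetical 32-byte descriptor and 8192 bytes of internal RAM, 256 descriptors need exactly 8192 bytes and stay in internal RAM, while 4096 descriptors need 131072 bytes and go to DDR.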
2017-01-07 | net: ethernet: ti: cpdma: use devm_ioremap | Grygorii Strashko | 1 | -3/+2
Use devm_ioremap() and simplify the code. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-07 | net: ethernet: ti: cpdma: minimize number of parameters in cpdma_desc_pool_create/destroy() | Grygorii Strashko | 1 | -32/+30
Update cpdma_desc_pool_create/destroy() to accept only one parameter struct cpdma_ctlr*, as this structure contains all required information for pool creation/destruction. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-07 | net: ethernet: ti: cpdma: fix desc re-queuing | Grygorii Strashko | 1 | -1/+1
The currently processing cpdma descriptor with the EOQ flag set may contain one of two values in the Next Descriptor Pointer field:
- valid pointer: means CPDMA missed addition of new desc in queue;
- null: no more descriptors in queue.
In the latter case, it's not required to write to the HDP register, but now CPDMA does it. Hence, add an additional check for Next Descriptor Pointer != null in the cpdma_chan_process() function before writing to the HDP register.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
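The added condition can be written as a small predicate. This is a sketch with hypothetical types: `struct desc` and `needs_hdp_write()` do not mirror the driver's descriptor layout or its cpdma_chan_process() code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a completed descriptor. */
struct desc {
    struct desc *hw_next; /* Next Descriptor Pointer */
    bool eoq;             /* hardware reported end of queue on this desc */
};

/*
 * The head pointer register only needs rewriting when the hardware stopped
 * at this descriptor (EOQ) although a successor was chained behind it.
 */
static inline bool needs_hdp_write(const struct desc *done)
{
    assert(done != NULL);
    return done->eoq && done->hw_next != NULL;
}
```

The two cases from the commit message map directly onto the predicate: EOQ with a valid next pointer returns true, EOQ with a null pointer returns false.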
2017-01-07 | net: ethernet: ti: cpdma: am437x: allow descs to be plased in ddr | Grygorii Strashko | 1 | -18/+22
It's observed that cpsw/cpdma is not working properly when CPPI descriptors are placed in DDR instead of internal CPPI RAM on am437x SoC:
- rx/tx silently stops processing packets;
- or, after boot, it works for some time but gets stuck once network load is increased (ping is working, but iperf is not).
(The same issue has not been reproduced on am335x and am57xx.)

It seems that a write to the HDP register is processed faster by the interconnect than the write of the descriptor memory buffer in DDR, which is probably caused by store buffer / write buffer differences, as these functions are implemented differently across devices. So, to fix this I came up with two minimal, required changes:

1) all accesses to the channel HDP/CP/RXFREE registers should be done using sync IO accessors readl()/writel(), because all previous memory writes have to be completed before starting a channel (write to HDP) or completing desc processing.

2) change 1 alone doesn't work on am437x: an additional read of a desc field is required right after the new descriptor has been filled with data and before the pointer to it is stored in the prev_desc->hw_next field or the HDP register.

In addition to the above changes, this patch eliminates all relaxed ordering I/O accessors in this driver, as suggested by David Miller, to avoid this kind of issue in the future, with one exception: relaxed IO accessors will still be used to fill the desc in cpdma_chan_submit(), which is safe as there is a read barrier at the end of the write sequence, and because using sync IO accessors here would hurt net performance.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-06 | Merge branch 'l2tp-cleanup-socket-lookup-code' | David S. Miller | 2 | -22/+37
Guillaume Nault says: ==================== l2tp: cleanup socket lookup code in l2tp_ip and l2tp_ip6 First three patches remove redundant tests and add missing "const" qualifiers. Fourth patch splits the conditionals found in __l2tp_ip*_bind_lookup(), to make these functions easier to review. In the process, I found that some corner cases were still not handled properly. So I've added the missing tests in this patch too, because they're pretty simple and the whole "if" statements are modified anyway. I expect it to be easier to review this way. If not, I can split up patch #4, post the missing tests separately to -net, and later repost this series as pure cleanup. Just let me know. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-06 | l2tp: rework socket comparison in __l2tp_ip*_bind_lookup() | Guillaume Nault | 2 | -14/+35
Split conditions, so that each test becomes clearer. Also, for l2tp_ip, check if "laddr" is 0. This prevents a socket from binding to the unspecified address when other sockets are already bound using the same device (if any), connection ID and namespace. Same thing for l2tp_ip6: add ipv6_addr_any(laddr) and ipv6_addr_any(raddr) tests to ensure that an IPv6 unspecified address passed as parameter is properly treated as a wildcard. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
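The split-condition style and the wildcard handling can be illustrated for the IPv4 case as follows. This is a simplified sketch: `struct bound_sock` and `bind_conflicts()` are hypothetical, the namespace comparison is omitted, and the code is not the actual __l2tp_ip_bind_lookup() logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record of an already bound socket. */
struct bound_sock {
    uint32_t addr;    /* local address, 0 means unspecified (wildcard) */
    int bound_dev_if; /* bound device index, 0 means any device */
    uint32_t conn_id; /* connection ID */
};

/* Would binding (laddr, dif, conn_id) collide with the existing socket? */
static inline bool bind_conflicts(const struct bound_sock *sk, uint32_t laddr,
                                  int dif, uint32_t conn_id)
{
    assert(sk != 0);

    /* Different connection IDs never collide. */
    if (sk->conn_id != conn_id)
        return false;

    /* Devices only separate sockets when both sides name one. */
    if (sk->bound_dev_if && dif && sk->bound_dev_if != dif)
        return false;

    /* Addresses only separate sockets when both are specified. */
    if (sk->addr && laddr && sk->addr != laddr)
        return false;

    return true;
}
```

Because an unspecified `laddr` (0) never satisfies the "both specified and different" test, a wildcard bind collides with any socket that shares the connection ID and device, which is the behaviour the commit describes.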
2017-01-06 | l2tp: remove useless NULL check in __l2tp_ip*_bind_lookup() | Guillaume Nault | 2 | -6/+0
If "l2tp" was NULL, that'd mean "sk" is NULL too. This can't happen since "sk" is returned by sk_for_each_bound(). Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-06 | l2tp: make __l2tp_ip*_bind_lookup() parameters 'const' | Guillaume Nault | 2 | -5/+5
Add const qualifier wherever possible for __l2tp_ip_bind_lookup() and __l2tp_ip6_bind_lookup(). Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-06 | l2tp: remove redundant addr_len check in l2tp_ip_bind() | Guillaume Nault | 1 | -1/+1
addr_len's value has already been verified at this point. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-06 | RDS: validate the requested traces user input against max supported | santosh.shilimkar@oracle.com | 1 | -0/+3
Larger than supported value can lead to array read/write overflow. Reported-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
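The kind of check this commit adds can be sketched generically: reject a user-supplied count before it is used to index a fixed-size array. The names `TRACE_MAX`, `struct trace_cfg` and `set_traces()` and the choice of `-EINVAL` are assumptions for illustration. They are not the RDS socket option code.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define TRACE_MAX 3 /* hypothetical capacity of the trace array */

struct trace_cfg {
    unsigned char count;
    unsigned char pos[TRACE_MAX];
};

/* Copy n requested trace positions, refusing counts the array cannot hold. */
static inline int set_traces(struct trace_cfg *cfg, const unsigned char *req,
                             unsigned int n)
{
    assert(cfg && req);
    if (n > TRACE_MAX)
        return -EINVAL; /* validate before any read or write of pos[] */
    memcpy(cfg->pos, req, n);
    cfg->count = (unsigned char)n;
    return 0;
}
```

The check has to come before the copy; validating afterwards would already have overrun `pos[]`.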
2017-01-06 | Merge tag 'rxrpc-rewrite-20170106' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs | David S. Miller | 6 | -57/+275
David Howells says: ==================== afs: Implement bulk read This pair of patches implements bulk data reading from an AFS server. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-06 | sctp: prepare asoc stream for stream reconf | Xin Long | 10 | -206/+147
sctp stream reconf, described in RFC 6525, needs a structure to save per stream information in assoc, like stream state. In the future, sctp stream scheduler also needs it to save some stream scheduler params and queues. This patchset is to prepare the stream array in assoc for stream reconf. It defines sctp_stream that includes stream arrays inside to replace ssnmap. Note that we use different structures for IN and OUT streams, as the members in per OUT stream will get more and more different from per IN stream. v1->v2: - put these patches into a smaller group. v2->v3: - define sctp_stream to contain stream arrays, and create stream.c to put stream-related functions. - merge 3 patches into 1, as new sctp_stream has the same name with before. Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-06 | udp: inuse checks can quit early for reuseport | Eric Garver | 1 | -10/+19
UDP lib inuse checks will walk the entire hash bucket to check if the portaddr is in use. In the case of reuseport we can stop searching when we find a matching reuseport. On a 16-core VM a test program that spawns 16 threads that each bind to 1024 sockets (one per 10ms) takes 1m45s. With this change it takes 11s. Also add a cond_resched() when the port is not specified. Signed-off-by: Eric Garver <e@erig.me> Signed-off-by: David S. Miller <davem@davemloft.net>
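The early exit can be pictured with a simplified bucket walk. This sketch makes strong assumptions: `struct bucket_ent` and `port_in_use()` are hypothetical, the bucket is an array rather than a hash list, and the real check compares more than the port and the reuseport flag (addresses, device, owner and so on).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical entry in one hash bucket. */
struct bucket_ent {
    unsigned short port;
    bool reuseport;
};

/*
 * Return true if binding `port` would conflict. With reuseport, the first
 * compatible entry settles the answer, so the walk stops there instead of
 * scanning the rest of the bucket. *visited reports how many entries were
 * examined.
 */
static inline bool port_in_use(const struct bucket_ent *b, size_t n,
                               unsigned short port, bool reuseport,
                               size_t *visited)
{
    size_t i;

    assert(b != NULL && visited != NULL);
    *visited = 0;
    for (i = 0; i < n; i++) {
        ++*visited;
        if (b[i].port != port)
            continue;
        if (reuseport && b[i].reuseport)
            return false; /* matching reuseport entry: quit early */
        return true;      /* same port without reuseport: conflict */
    }
    return false;
}
```

When many sockets share one port through reuseport, this turns a scan over every member of the group into a stop at the first one, which is consistent with the speedup reported in the commit message.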