summaryrefslogtreecommitdiffstats
path: root/drivers/net/ethernet/intel/ice
AgeCommit message (Collapse)AuthorFilesLines
2022-09-27ice: xsk: change batched Tx descriptor cleaningMaciej Fijalkowski3-88/+64
AF_XDP Tx descriptor cleaning in ice driver currently works in a "lazy" way - descriptors are not cleaned immediately after send. We rather hold on with cleaning until we see that free space in ring drops below particular threshold. This was supposed to reduce the amount of unnecessary work related to cleaning and instead of keeping the ring empty, ring was rather saturated. In AF_XDP realm cleaning Tx descriptors implies producing them to CQ. This is a way of letting know user space that particular descriptor has been sent, as John points out in [0]. We tried to implement serial descriptor cleaning which would be used in conjunction with batched cleaning but it made code base more convoluted and probably harder to maintain in future. Therefore we step away from batched cleaning in a current form in favor of an approach where we set RS bit on every last descriptor from a batch and clean always at the beginning of ice_xmit_zc(). This means that we give up a bit of Tx performance, but this doesn't hurt l2fwd scenario which is way more meaningful than txonly as this can be treaten as AF_XDP based packet generator. l2fwd is not hurt due to the fact that Tx side is much faster than Rx and Rx is the one that has to catch Tx up. FWIW Tx descriptors are still produced in a batched way. [0]: https://lore.kernel.org/bpf/62b0a20232920_3573208ab@john.notmuch/ Fixes: 126cdfe1007a ("ice: xsk: Improve AF_XDP ZC Tx and use batching API") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-27ice: reorder PF/representor devlink port register/unregister flowsJiri Pirko3-10/+10
Make sure that netdevice is registered/unregistered while devlink port is registered. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski3-28/+44
drivers/net/ethernet/freescale/fec.h 7b15515fc1ca ("Revert "fec: Restart PPS after link state change"") 40c79ce13b03 ("net: fec: add stop mode support for imx8 platform") https://lore.kernel.org/all/20220921105337.62b41047@canb.auug.org.au/ drivers/pinctrl/pinctrl-ocelot.c c297561bc98a ("pinctrl: ocelot: Fix interrupt controller") 181f604b33cd ("pinctrl: ocelot: add ability to be used in a non-mmio configuration") https://lore.kernel.org/all/20220921110032.7cd28114@canb.auug.org.au/ tools/testing/selftests/drivers/net/bonding/Makefile bbb774d921e2 ("net: Add tests for bonding and team address list management") 152e8ec77640 ("selftests/bonding: add a test for bonding lladdr target") https://lore.kernel.org/all/20220921110437.5b7dbd82@canb.auug.org.au/ drivers/net/can/usb/gs_usb.c 5440428b3da6 ("can: gs_usb: gs_can_open(): fix race dev->can.state condition") 45dfa45f52e6 ("can: gs_usb: add RX and TX hardware timestamp support") https://lore.kernel.org/all/84f45a7d-92b6-4dc5-d7a1-072152fab6ff@tessares.net/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-21Merge branch '100GbE' of ↵Jakub Kicinski1-9/+14
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-09-20 (ice) Michal re-sets TC configuration when changing number of queues. Mateusz moves the check and call for link-down-on-close to the specific path for downing/closing the interface. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: Fix interface being down after reset with link-down-on-close flag on ice: config netdev tc before setting queues number ==================== Link: https://lore.kernel.org/r/20220920205344.1860934-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-21ice: Fix ice_xdp_xmit() when XDP TX queue number is not sufficientLarysa Zaremba1-1/+4
The original patch added the static branch to handle the situation, when assigning an XDP TX queue to every CPU is not possible, so they have to be shared. However, in the XDP transmit handler ice_xdp_xmit(), an error was returned in such cases even before static condition was checked, thus making queue sharing still impossible. Fixes: 22bf877e528f ("ice: introduce XDP_TX fallback path") Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Link: https://lore.kernel.org/r/20220919134346.25030-1-larysa.zaremba@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20ice: Add low latency Tx timestamp readKarol Kolacinski7-45/+154
E810 products can support low latency Tx timestamp register read. This requires usage of threaded IRQ instead of kthread to reduce the kthread start latency (spikes up to 20 ms). Add a check for the device capability and use the new method if supported. Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20220916201728.241510-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20ice: Fix interface being down after reset with link-down-on-close flag onMateusz Palczewski1-9/+12
When performing a reset on ice driver with link-down-on-close flag on interface would always stay down. Fix this by moving a check of this flag to ice_stop() that is called only when user wants to bring interface down. Fixes: ab4ab73fc1ec ("ice: Add ethtool private flag to make forcing link down optional") Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Petr Oros <poros@redhat.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-20ice: config netdev tc before setting queues numberMichal Swiatkowski1-0/+2
After lowering number of tx queues the warning appears: "Number of in use tx queues changed invalidating tc mappings. Priority traffic classification disabled!" Example command to reproduce: ethtool -L enp24s0f0 tx 36 rx 36 Fix this by setting correct tc mapping before setting real number of queues on netdev. Fixes: 0754d65bd4be5 ("ice: Add infrastructure for mqprio support via ndo_setup_tc") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-20ice: Add L2TPv3 hardware offload supportMarcin Szycik4-2/+109
Add support for offloading packets based on L2TPv3 session id in switchdev mode. Example filter: tc filter add dev $PF1 ingress prio 1 protocol ip flower ip_proto l2tp \ l2tpv3_sid 1234 skip_sw action mirred egress redirect dev $VF1_PR Changes in iproute2 are required to be able to specify l2tpv3_sid. ICE COMMS DDP package is required to create a filter as it contains L2TPv3 profiles. Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-08ice: Fix crash by keep old cfg when update TCs more than queuesDing Hui1-16/+26
There are problems if allocated queues less than Traffic Classes. Commit a632b2a4c920 ("ice: ethtool: Prohibit improper channel config for DCB") already disallow setting less queues than TCs. Another case is if we first set less queues, and later update more TCs config due to LLDP, ice_vsi_cfg_tc() will failed but left dirty num_txq/rxq and tc_cfg in vsi, that will cause invalid pointer access. [ 95.968089] ice 0000:3b:00.1: More TCs defined than queues/rings allocated. [ 95.968092] ice 0000:3b:00.1: Trying to use more Rx queues (8), than were allocated (1)! [ 95.968093] ice 0000:3b:00.1: Failed to config TC for VSI index: 0 [ 95.969621] general protection fault: 0000 [#1] SMP NOPTI [ 95.969705] CPU: 1 PID: 58405 Comm: lldpad Kdump: loaded Tainted: G U W O --------- -t - 4.18.0 #1 [ 95.969867] Hardware name: O.E.M/BC11SPSCB10, BIOS 8.23 12/30/2021 [ 95.969992] RIP: 0010:devm_kmalloc+0xa/0x60 [ 95.970052] Code: 5c ff ff ff 31 c0 5b 5d 41 5c c3 b8 f4 ff ff ff eb f4 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 89 d1 <8b> 97 60 02 00 00 48 8d 7e 18 48 39 f7 72 3f 55 89 ce 53 48 8b 4c [ 95.970344] RSP: 0018:ffffc9003f553888 EFLAGS: 00010206 [ 95.970425] RAX: dead000000000200 RBX: ffffea003c425b00 RCX: 00000000006080c0 [ 95.970536] RDX: 00000000006080c0 RSI: 0000000000000200 RDI: dead000000000200 [ 95.970648] RBP: dead000000000200 R08: 00000000000463c0 R09: ffff888ffa900000 [ 95.970760] R10: 0000000000000000 R11: 0000000000000002 R12: ffff888ff6b40100 [ 95.970870] R13: ffff888ff6a55018 R14: 0000000000000000 R15: ffff888ff6a55460 [ 95.970981] FS: 00007f51b7d24700(0000) GS:ffff88903ee80000(0000) knlGS:0000000000000000 [ 95.971108] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 95.971197] CR2: 00007fac5410d710 CR3: 0000000f2c1de002 CR4: 00000000007606e0 [ 95.971309] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 95.971419] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 95.971530] PKRU: 55555554 [ 95.971573] Call Trace: [ 95.971622] ice_setup_rx_ring+0x39/0x110 [ice] [ 95.971695] ice_vsi_setup_rx_rings+0x54/0x90 [ice] [ 95.971774] ice_vsi_open+0x25/0x120 [ice] [ 95.971843] ice_open_internal+0xb8/0x1f0 [ice] [ 95.971919] ice_ena_vsi+0x4f/0xd0 [ice] [ 95.971987] ice_dcb_ena_dis_vsi.constprop.5+0x29/0x90 [ice] [ 95.972082] ice_pf_dcb_cfg+0x29a/0x380 [ice] [ 95.972154] ice_dcbnl_setets+0x174/0x1b0 [ice] [ 95.972220] dcbnl_ieee_set+0x89/0x230 [ 95.972279] ? dcbnl_ieee_del+0x150/0x150 [ 95.972341] dcb_doit+0x124/0x1b0 [ 95.972392] rtnetlink_rcv_msg+0x243/0x2f0 [ 95.972457] ? dcb_doit+0x14d/0x1b0 [ 95.972510] ? __kmalloc_node_track_caller+0x1d3/0x280 [ 95.972591] ? rtnl_calcit.isra.31+0x100/0x100 [ 95.972661] netlink_rcv_skb+0xcf/0xf0 [ 95.972720] netlink_unicast+0x16d/0x220 [ 95.972781] netlink_sendmsg+0x2ba/0x3a0 [ 95.975891] sock_sendmsg+0x4c/0x50 [ 95.979032] ___sys_sendmsg+0x2e4/0x300 [ 95.982147] ? kmem_cache_alloc+0x13e/0x190 [ 95.985242] ? __wake_up_common_lock+0x79/0x90 [ 95.988338] ? __check_object_size+0xac/0x1b0 [ 95.991440] ? _copy_to_user+0x22/0x30 [ 95.994539] ? move_addr_to_user+0xbb/0xd0 [ 95.997619] ? __sys_sendmsg+0x53/0x80 [ 96.000664] __sys_sendmsg+0x53/0x80 [ 96.003747] do_syscall_64+0x5b/0x1d0 [ 96.006862] entry_SYSCALL_64_after_hwframe+0x65/0xca Only update num_txq/rxq when passed check, and restore tc_cfg if setup queue map failed. Fixes: a632b2a4c920 ("ice: ethtool: Prohibit improper channel config for DCB") Signed-off-by: Ding Hui <dinghui@sangfor.com.cn> Reviewed-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com> Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-08ice: Don't double unplug aux on peer initiated resetDave Ertman1-2/+0
In the IDC callback that is accessed when the aux drivers request a reset, the function to unplug the aux devices is called. This function is also called in the ice_prepare_for_reset function. This double call is causing a "scheduling while atomic" BUG. [ 662.676430] ice 0000:4c:00.0 rocep76s0: cqp opcode = 0x1 maj_err_code = 0xffff min_err_code = 0x8003 [ 662.676609] ice 0000:4c:00.0 rocep76s0: [Modify QP Cmd Error][op_code=8] status=-29 waiting=1 completion_err=1 maj=0xffff min=0x8003 [ 662.815006] ice 0000:4c:00.0 rocep76s0: ICE OICR event notification: oicr = 0x10000003 [ 662.815014] ice 0000:4c:00.0 rocep76s0: critical PE Error, GLPE_CRITERR=0x00011424 [ 662.815017] ice 0000:4c:00.0 rocep76s0: Requesting a reset [ 662.815475] BUG: scheduling while atomic: swapper/37/0/0x00010002 [ 662.815475] BUG: scheduling while atomic: swapper/37/0/0x00010002 [ 662.815477] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill 8021q garp mrp stp llc vfat fat rpcrdma intel_rapl_msr intel_rapl_common sunrpc i10nm_edac rdma_ucm nfit ib_srpt libnvdimm ib_isert iscsi_target_mod x86_pkg_temp_thermal intel_powerclamp coretemp target_core_mod snd_hda_intel ib_iser snd_intel_dspcfg libiscsi snd_intel_sdw_acpi scsi_transport_iscsi kvm_intel iTCO_wdt rdma_cm snd_hda_codec kvm iw_cm ipmi_ssif iTCO_vendor_support snd_hda_core irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hwdep snd_seq snd_seq_device rapl snd_pcm snd_timer isst_if_mbox_pci pcspkr isst_if_mmio irdma intel_uncore idxd acpi_ipmi joydev isst_if_common snd mei_me idxd_bus ipmi_si soundcore i2c_i801 mei ipmi_devintf i2c_smbus i2c_ismt ipmi_msghandler acpi_power_meter acpi_pad rv(OE) ib_uverbs ib_cm ib_core xfs libcrc32c ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm_ttm_helpe r ttm [ 662.815546] nvme nvme_core ice drm crc32c_intel i40e t10_pi wmi pinctrl_emmitsburg dm_mirror dm_region_hash dm_log dm_mod fuse [ 662.815557] Preemption disabled at: [ 662.815558] [<0000000000000000>] 0x0 [ 662.815563] CPU: 37 PID: 0 Comm: swapper/37 Kdump: loaded Tainted: G S OE 5.17.1 #2 [ 662.815566] Hardware name: Intel Corporation D50DNP/D50DNP, BIOS SE5C6301.86B.6624.D18.2111021741 11/02/2021 [ 662.815568] Call Trace: [ 662.815572] <IRQ> [ 662.815574] dump_stack_lvl+0x33/0x42 [ 662.815581] __schedule_bug.cold.147+0x7d/0x8a [ 662.815588] __schedule+0x798/0x990 [ 662.815595] schedule+0x44/0xc0 [ 662.815597] schedule_preempt_disabled+0x14/0x20 [ 662.815600] __mutex_lock.isra.11+0x46c/0x490 [ 662.815603] ? __ibdev_printk+0x76/0xc0 [ib_core] [ 662.815633] device_del+0x37/0x3d0 [ 662.815639] ice_unplug_aux_dev+0x1a/0x40 [ice] [ 662.815674] ice_schedule_reset+0x3c/0xd0 [ice] [ 662.815693] irdma_iidc_event_handler.cold.7+0xb6/0xd3 [irdma] [ 662.815712] ? bitmap_find_next_zero_area_off+0x45/0xa0 [ 662.815719] ice_send_event_to_aux+0x54/0x70 [ice] [ 662.815741] ice_misc_intr+0x21d/0x2d0 [ice] [ 662.815756] __handle_irq_event_percpu+0x4c/0x180 [ 662.815762] handle_irq_event_percpu+0xf/0x40 [ 662.815764] handle_irq_event+0x34/0x60 [ 662.815766] handle_edge_irq+0x9a/0x1c0 [ 662.815770] __common_interrupt+0x62/0x100 [ 662.815774] common_interrupt+0xb4/0xd0 [ 662.815779] </IRQ> [ 662.815780] <TASK> [ 662.815780] asm_common_interrupt+0x1e/0x40 [ 662.815785] RIP: 0010:cpuidle_enter_state+0xd6/0x380 [ 662.815789] Code: 49 89 c4 0f 1f 44 00 00 31 ff e8 65 d7 95 ff 45 84 ff 74 12 9c 58 f6 c4 02 0f 85 64 02 00 00 31 ff e8 ae c5 9c ff fb 45 85 f6 <0f> 88 12 01 00 00 49 63 d6 4c 2b 24 24 48 8d 04 52 48 8d 04 82 49 [ 662.815791] RSP: 0018:ff2c2c4f18edbe80 EFLAGS: 00000202 [ 662.815793] RAX: ff280805df140000 RBX: 0000000000000002 RCX: 000000000000001f [ 662.815795] RDX: 0000009a52da2d08 RSI: ffffffff93f8240b RDI: ffffffff93f53ee7 [ 662.815796] RBP: ff5e2bd11ff41928 R08: 0000000000000000 R09: 000000000002f8c0 [ 662.815797] R10: 0000010c3f18e2cf R11: 000000000000000f R12: 0000009a52da2d08 [ 662.815798] R13: ffffffff94ad7e20 R14: 0000000000000002 R15: 0000000000000000 [ 662.815801] cpuidle_enter+0x29/0x40 [ 662.815803] do_idle+0x261/0x2b0 [ 662.815807] cpu_startup_entry+0x19/0x20 [ 662.815809] start_secondary+0x114/0x150 [ 662.815813] secondary_startup_64_no_verify+0xd5/0xdb [ 662.815818] </TASK> [ 662.815846] bad: scheduling from the idle thread! [ 662.815849] CPU: 37 PID: 0 Comm: swapper/37 Kdump: loaded Tainted: G S W OE 5.17.1 #2 [ 662.815852] Hardware name: Intel Corporation D50DNP/D50DNP, BIOS SE5C6301.86B.6624.D18.2111021741 11/02/2021 [ 662.815853] Call Trace: [ 662.815855] <IRQ> [ 662.815856] dump_stack_lvl+0x33/0x42 [ 662.815860] dequeue_task_idle+0x20/0x30 [ 662.815863] __schedule+0x1c3/0x990 [ 662.815868] schedule+0x44/0xc0 [ 662.815871] schedule_preempt_disabled+0x14/0x20 [ 662.815873] __mutex_lock.isra.11+0x3a8/0x490 [ 662.815876] ? __ibdev_printk+0x76/0xc0 [ib_core] [ 662.815904] device_del+0x37/0x3d0 [ 662.815909] ice_unplug_aux_dev+0x1a/0x40 [ice] [ 662.815937] ice_schedule_reset+0x3c/0xd0 [ice] [ 662.815961] irdma_iidc_event_handler.cold.7+0xb6/0xd3 [irdma] [ 662.815979] ? bitmap_find_next_zero_area_off+0x45/0xa0 [ 662.815985] ice_send_event_to_aux+0x54/0x70 [ice] [ 662.816011] ice_misc_intr+0x21d/0x2d0 [ice] [ 662.816033] __handle_irq_event_percpu+0x4c/0x180 [ 662.816037] handle_irq_event_percpu+0xf/0x40 [ 662.816039] handle_irq_event+0x34/0x60 [ 662.816042] handle_edge_irq+0x9a/0x1c0 [ 662.816045] __common_interrupt+0x62/0x100 [ 662.816048] common_interrupt+0xb4/0xd0 [ 662.816052] </IRQ> [ 662.816053] <TASK> [ 662.816054] asm_common_interrupt+0x1e/0x40 [ 662.816057] RIP: 0010:cpuidle_enter_state+0xd6/0x380 [ 662.816060] Code: 49 89 c4 0f 1f 44 00 00 31 ff e8 65 d7 95 ff 45 84 ff 74 12 9c 58 f6 c4 02 0f 85 64 02 00 00 31 ff e8 ae c5 9c ff fb 45 85 f6 <0f> 88 12 01 00 00 49 63 d6 4c 2b 24 24 48 8d 04 52 48 8d 04 82 49 [ 662.816063] RSP: 0018:ff2c2c4f18edbe80 EFLAGS: 00000202 [ 662.816065] RAX: ff280805df140000 RBX: 0000000000000002 RCX: 000000000000001f [ 662.816067] RDX: 0000009a52da2d08 RSI: ffffffff93f8240b RDI: ffffffff93f53ee7 [ 662.816068] RBP: ff5e2bd11ff41928 R08: 0000000000000000 R09: 000000000002f8c0 [ 662.816070] R10: 0000010c3f18e2cf R11: 000000000000000f R12: 0000009a52da2d08 [ 662.816071] R13: ffffffff94ad7e20 R14: 0000000000000002 R15: 0000000000000000 [ 662.816075] cpuidle_enter+0x29/0x40 [ 662.816077] do_idle+0x261/0x2b0 [ 662.816080] cpu_startup_entry+0x19/0x20 [ 662.816083] start_secondary+0x114/0x150 [ 662.816087] secondary_startup_64_no_verify+0xd5/0xdb [ 662.816091] </TASK> [ 662.816169] bad: scheduling from the idle thread! The correct place to unplug the aux devices for a reset is in the prepare_for_reset function, as this is a common place for all reset flows. It also has built in protection from being called twice in a single reset instance before the aux devices are replugged. Fixes: f9f5301e7e2d4 ("ice: Register auxiliary device to provide RDMA") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Helena Anna Dubel <helena.anna.dubel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netPaolo Abeni4-18/+80
drivers/net/ethernet/freescale/fec.h 7d650df99d52 ("net: fec: add pm_qos support on imx6q platform") 40c79ce13b03 ("net: fec: add stop mode support for imx8 platform") Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-06ice: Simplify memory allocation in ice_sched_init_port()Christophe JAILLET1-2/+2
'buf' is locale to the ice_sched_init_port() function. There is no point in using devm_kzalloc()/devm_kfree(). use kzalloc()/kfree() instead. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-06ice: switch: Simplify memory allocationChristophe JAILLET1-4/+2
'rbuf' is locale to the ice_get_initial_sw_cfg() function. There is no point in using devm_kzalloc()/devm_kfree(). use kzalloc()/kfree() instead. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-06ice: add helper function to check FW API versionPaul Greenwalt1-28/+29
Several functions in ice_common.c check the firmware API version to see if the current API version meets some minimum requirement. Improve the readability of these checks by introducing ice_is_fw_api_min_ver, a helper function to perform that check. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-06ice: Check if reset in progress while waiting for offsetsMichal Michalik1-0/+3
Occasionally while waiting to valid offsets from hardware we get reset. Add check for reset before proceeding to execute scheduled work. Co-developed-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Michal Michalik <michal.michalik@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-06ice: Allow operation with reduced device MSI-XTony Nguyen1-85/+100
The driver currently takes an all or nothing approach for device MSI-X vectors. Meaning if it does not get its full allocation, it will fail and not load. There is no reason it can't work with a reduced number of MSI-X vectors. Take a similar approach as commit 741106f7bd8d ("ice: Improve MSI-X fallback logic") and, instead, adjust the MSI-X request to make use of what is available. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Petr Oros <poros@redhat.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
2022-09-02ice: use bitmap_free instead of devm_kfreeMichal Swiatkowski1-1/+1
pf->avail_txqs was allocated using bitmap_zalloc, bitmap_free should be used to free this memory. Fixes: 78b5713ac1241 ("ice: Alloc queue management bitmaps and arrays dynamically") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-02ice: Fix DMA mappings leakPrzemyslaw Patynowski4-17/+79
Fix leak, when user changes ring parameters. During reallocation of RX buffers, new DMA mappings are created for those buffers. New buffers with different RX ring count should substitute older ones, but those buffers were freed in ice_vsi_cfg_rxq and reallocated again with ice_alloc_rx_buf. kfree on rx_buf caused leak of already mapped DMA. Reallocate ZC with xdp_buf struct, when BPF program loads. Reallocate back to rx_buf, when BPF program unloads. If BPF program is loaded/unloaded and XSK pools are created, reallocate RX queues accordingly in XDP_SETUP_XSK_POOL handler. Steps for reproduction: while : do for ((i=0; i<=8160; i=i+32)) do ethtool -G enp130s0f0 rx $i tx $i sleep 0.5 ethtool -g enp130s0f0 done done Fixes: 617f3e1b588c ("ice: xsk: allocate separate memory for XDP SW ring") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Chandan <chandanx.rout@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-26Merge branch '100GbE' of ↵David S. Miller9-27/+768
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-08-24 (ice) This series contains updates to ice driver only. Marcin adds support for TC parsing on TTL and ToS fields. Anatolli adds support for devlink port split command to allow configuration of various port configurations. Jake allows for passing and writing an additional NVM write activate field by expanding current cmd_flag. Ani makes PHY debug output more readable. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski4-29/+54
drivers/net/ethernet/mellanox/mlx5/core/en_fs.c 21234e3a84c7 ("net/mlx5e: Fix use after free in mlx5e_fs_init()") c7eafc5ed068 ("net/mlx5e: Convert ethtool_steering member of flow_steering struct to pointer") https://lore.kernel.org/all/20220825104410.67d4709c@canb.auug.org.au/ https://lore.kernel.org/all/20220823055533.334471-1-saeed@kernel.org/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-24ice: Print human-friendly PHY typesAnirudh Venkataramanan1-18/+140
Provide human readable description of PHY capabilities and report_mode. Sample output: Old: [ 286.130405] ice 0000:16:00.0: get phy caps - report_mode = 0x2 [ 286.130409] ice 0000:16:00.0: phy_type_low = 0x108021020502000 [ 286.130412] ice 0000:16:00.0: phy_type_high = 0x0 [ 286.130415] ice 0000:16:00.0: caps = 0xc8 [ 286.130419] ice 0000:16:00.0: low_power_ctrl_an = 0x4 [ 286.130421] ice 0000:16:00.0: eee_cap = 0x0 [ 286.130424] ice 0000:16:00.0: eeer_value = 0x0 [ 286.130427] ice 0000:16:00.0: link_fec_options = 0xdf [ 286.130430] ice 0000:16:00.0: module_compliance_enforcement = 0x0 [ 286.130433] ice 0000:16:00.0: extended_compliance_code = 0xb [ 286.130435] ice 0000:16:00.0: module_type[0] = 0x11 [ 286.130438] ice 0000:16:00.0: module_type[1] = 0x1 [ 286.130441] ice 0000:16:00.0: module_type[2] = 0x0 New: [ 1128.297347] ice 0000:16:00.0: get phy caps dump [ 1128.297351] ice 0000:16:00.0: phy_caps_active: phy_type_low: 0x0108021020502000 [ 1128.297355] ice 0000:16:00.0: phy_caps_active: bit(13): 10G_SFI_DA [ 1128.297359] ice 0000:16:00.0: phy_caps_active: bit(20): 25GBASE_CR [ 1128.297362] ice 0000:16:00.0: phy_caps_active: bit(22): 25GBASE_CR1 [ 1128.297365] ice 0000:16:00.0: phy_caps_active: bit(29): 25G_AUI_C2C [ 1128.297368] ice 0000:16:00.0: phy_caps_active: bit(36): 50GBASE_CR2 [ 1128.297371] ice 0000:16:00.0: phy_caps_active: bit(41): 50G_LAUI2 [ 1128.297374] ice 0000:16:00.0: phy_caps_active: bit(51): 100GBASE_CR4 [ 1128.297377] ice 0000:16:00.0: phy_caps_active: bit(56): 100G_CAUI4 [ 1128.297380] ice 0000:16:00.0: phy_caps_active: phy_type_high: 0x0000000000000000 [ 1128.297383] ice 0000:16:00.0: phy_caps_active: report_mode = 0x4 [ 1128.297386] ice 0000:16:00.0: phy_caps_active: caps = 0xc8 [ 1128.297389] ice 0000:16:00.0: phy_caps_active: low_power_ctrl_an = 0x4 [ 1128.297392] ice 0000:16:00.0: phy_caps_active: eee_cap = 0x0 [ 1128.297394] ice 0000:16:00.0: phy_caps_active: eeer_value = 0x0 [ 1128.297397] ice 0000:16:00.0: phy_caps_active: link_fec_options = 0xdf [ 1128.297400] ice 0000:16:00.0: phy_caps_active: module_compliance_enforcement = 0x0 [ 1128.297402] ice 0000:16:00.0: phy_caps_active: extended_compliance_code = 0xb [ 1128.297405] ice 0000:16:00.0: phy_caps_active: module_type[0] = 0x11 [ 1128.297408] ice 0000:16:00.0: phy_caps_active: module_type[1] = 0x1 [ 1128.297411] ice 0000:16:00.0: phy_caps_active: module_type[2] = 0x0 Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Co-developed-by: Lukasz Plachno <lukasz.plachno@intel.com> Signed-off-by: Lukasz Plachno <lukasz.plachno@intel.com> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-24ice: Implement devlink port split operationsAnatolii Gerasymenko2-0/+290
Allow to configure port split options using the devlink port split interface. Support port splitting only for port 0, as the FW has a predefined set of available port split options for the whole device. Add ice_devlink_port_options_print() function to print the table with all available FW port split options. It will be printed after each port split and unsplit command. Add documentation for devlink port split interface usage for the ice driver. Co-developed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-24ice: Add additional flags to ice_nvm_write_activateJacob Keller3-5/+16
The ice_nvm_write_activate function is used to issue AdminQ command 0x0707 which sends a request to firmware to activate a flash bank. For basic operations, this command takes an 8bit flag value which defines the flags to control the activation process. There are some additional flags that are stored in a second 8bit flag field. We can simplify the interface by using a u16 cmd_flags variable. Split this over the two bytes of flag storage in the structure. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-24ice: Add port option admin queue commandsAnatolii Gerasymenko3-0/+178
Implement support for Get/Set Port Options admin queue commands (0x06EA/0x06EB). These firmware commands allow the driver to change port specific options and will be used in the next patch. Co-developed-by: Lev Faerman <lev.faerman@intel.com> Signed-off-by: Lev Faerman <lev.faerman@intel.com> Co-developed-by: Damian Milosek <damian.milosek@intel.com> Signed-off-by: Damian Milosek <damian.milosek@intel.com> Co-developed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-24ice: Add support for ip TTL & ToS offloadMarcin Szycik2-4/+144
Add support for parsing TTL and ToS (Hop Limit and Traffic Class) tc fields and matching on those fields in filters. Incomplete part of implementation was already in place (getting enc_ip and enc_tos from flow_match_ip and writing them to filter header). Note: matching on ipv6 ip_ttl, enc_ttl and enc_tos is currently not supported by the DDP package. Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-22Merge branch '100GbE' of ↵Jakub Kicinski13-184/+136
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-08-18 (ice) This series contains updates to ice driver only. Jesse and Anatolii add support for controlling FCS/CRC stripping via ethtool. Anirudh allows for 100M speeds on devices which support it. Sylwester removes ucast_shared field and the associated dead code related to it. Mikael removes non-inclusive language from the driver. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: remove non-inclusive language ice: Remove ucast_shared ice: Allow 100M speeds for some devices ice: Implement FCS/CRC and VLAN stripping co-existence policy ice: Implement control of FCS/CRC stripping ==================== Link: https://lore.kernel.org/r/20220818155207.996297-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-22ice: xsk: use Rx ring's XDP ring when picking NAPI contextMaciej Fijalkowski4-29/+48
Ice driver allocates per cpu XDP queues so that redirect path can safely use smp_processor_id() as an index to the array. At the same time though, XDP rings are used to pick NAPI context to call napi_schedule() or set NAPIF_STATE_MISSED. When user reduces queue count, say to 8, and num_possible_cpus() of underlying platform is 44, then this means queue vectors with correlated NAPI contexts will carry several XDP queues. This in turn can result in a broken behavior where NAPI context of interest will never be scheduled and AF_XDP socket will not process any traffic. To fix this, let us change the way how XDP rings are assigned to Rx rings and use this information later on when setting ice_tx_ring::xsk_pool pointer. For each Rx ring, grab the associated queue vector and walk through Tx ring's linked list. Once we stumble upon XDP ring in it, assign this ring to ice_rx_ring::xdp_ring. Previous [0] approach of fixing this issue was for txonly scenario because of the described grouping of XDP rings across queue vectors. So, relying on Rx ring meant that NAPI context could be scheduled with a queue vector without XDP ring with associated XSK pool. [0]: https://lore.kernel.org/netdev/20220707161128.54215-1-maciej.fijalkowski@intel.com/ Fixes: 2d4238f55697 ("ice: Add support for AF_XDP") Fixes: 22bf877e528f ("ice: introduce XDP_TX fallback path") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-22ice: xsk: prohibit usage of non-balanced queue idMaciej Fijalkowski1-0/+6
Fix the following scenario: 1. ethtool -L $IFACE rx 8 tx 96 2. xdpsock -q 10 -t -z Above refers to a case where user would like to attach XSK socket in txonly mode at a queue id that does not have a corresponding Rx queue. At this moment ice's XSK logic is tightly bound to act on a "queue pair", e.g. both Tx and Rx queues at a given queue id are disabled/enabled and both of them will get XSK pool assigned, which is broken for the presented queue configuration. This results in the splat included at the bottom, which is basically an OOB access to Rx ring array. To fix this, allow using the ids only in scope of "combined" queues reported by ethtool. However, logic should be rewritten to allow such configurations later on, which would end up as a complete rewrite of the control path, so let us go with this temporary fix. [420160.558008] BUG: kernel NULL pointer dereference, address: 0000000000000082 [420160.566359] #PF: supervisor read access in kernel mode [420160.572657] #PF: error_code(0x0000) - not-present page [420160.579002] PGD 0 P4D 0 [420160.582756] Oops: 0000 [#1] PREEMPT SMP NOPTI [420160.588396] CPU: 10 PID: 21232 Comm: xdpsock Tainted: G OE 5.19.0-rc7+ #10 [420160.597893] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019 [420160.609894] RIP: 0010:ice_xsk_pool_setup+0x44/0x7d0 [ice] [420160.616968] Code: f3 48 83 ec 40 48 8b 4f 20 48 8b 3f 65 48 8b 04 25 28 00 00 00 48 89 44 24 38 31 c0 48 8d 04 ed 00 00 00 00 48 01 c1 48 8b 11 <0f> b7 92 82 00 00 00 48 85 d2 0f 84 2d 75 00 00 48 8d 72 ff 48 85 [420160.639421] RSP: 0018:ffffc9002d2afd48 EFLAGS: 00010282 [420160.646650] RAX: 0000000000000050 RBX: ffff88811d8bdd00 RCX: ffff888112c14ff8 [420160.655893] RDX: 0000000000000000 RSI: ffff88811d8bdd00 RDI: ffff888109861000 [420160.665166] RBP: 000000000000000a R08: 000000000000000a R09: 0000000000000000 [420160.674493] R10: 000000000000889f R11: 0000000000000000 R12: 000000000000000a [420160.683833] R13: 000000000000000a R14: 0000000000000000 R15: ffff888117611828 [420160.693211] FS: 00007fa869fc1f80(0000) GS:ffff8897e0880000(0000) knlGS:0000000000000000 [420160.703645] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [420160.711783] CR2: 0000000000000082 CR3: 00000001d076c001 CR4: 00000000007706e0 [420160.721399] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [420160.731045] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [420160.740707] PKRU: 55555554 [420160.745960] Call Trace: [420160.750962] <TASK> [420160.755597] ? kmalloc_large_node+0x79/0x90 [420160.762703] ? __kmalloc_node+0x3f5/0x4b0 [420160.769341] xp_assign_dev+0xfd/0x210 [420160.775661] ? shmem_file_read_iter+0x29a/0x420 [420160.782896] xsk_bind+0x152/0x490 [420160.788943] __sys_bind+0xd0/0x100 [420160.795097] ? exit_to_user_mode_prepare+0x20/0x120 [420160.802801] __x64_sys_bind+0x16/0x20 [420160.809298] do_syscall_64+0x38/0x90 [420160.815741] entry_SYSCALL_64_after_hwframe+0x63/0xcd [420160.823731] RIP: 0033:0x7fa86a0dd2fb [420160.830264] Code: c3 66 0f 1f 44 00 00 48 8b 15 69 8b 0c 00 f7 d8 64 89 02 b8 ff ff ff ff eb bc 0f 1f 44 00 00 f3 0f 1e fa b8 31 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 3d 8b 0c 00 f7 d8 64 89 01 48 [420160.855410] RSP: 002b:00007ffc1146f618 EFLAGS: 00000246 ORIG_RAX: 0000000000000031 [420160.866366] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fa86a0dd2fb [420160.876957] RDX: 0000000000000010 RSI: 00007ffc1146f680 RDI: 0000000000000003 [420160.887604] RBP: 000055d7113a0520 R08: 00007fa868fb8000 R09: 0000000080000000 [420160.898293] R10: 0000000000008001 R11: 0000000000000246 R12: 000055d7113a04e0 [420160.909038] R13: 000055d7113a0320 R14: 000000000000000a R15: 0000000000000000 [420160.919817] </TASK> [420160.925659] Modules linked in: ice(OE) af_packet binfmt_misc nls_iso8859_1 ipmi_ssif intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp mei_me coretemp ioatdma mei ipmi_si wmi ipmi_msghandler acpi_pad acpi_power_meter ip_tables x_tables autofs4 ixgbe i40e crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd ahci mdio dca libahci lpc_ich [last unloaded: ice] [420160.977576] CR2: 0000000000000082 [420160.985037] ---[ end trace 0000000000000000 ]--- [420161.097724] RIP: 0010:ice_xsk_pool_setup+0x44/0x7d0 [ice] [420161.107341] Code: f3 48 83 ec 40 48 8b 4f 20 48 8b 3f 65 48 8b 04 25 28 00 00 00 48 89 44 24 38 31 c0 48 8d 04 ed 00 00 00 00 48 01 c1 48 8b 11 <0f> b7 92 82 00 00 00 48 85 d2 0f 84 2d 75 00 00 48 8d 72 ff 48 85 [420161.134741] RSP: 0018:ffffc9002d2afd48 EFLAGS: 00010282 [420161.144274] RAX: 0000000000000050 RBX: ffff88811d8bdd00 RCX: ffff888112c14ff8 [420161.155690] RDX: 0000000000000000 RSI: ffff88811d8bdd00 RDI: ffff888109861000 [420161.168088] RBP: 000000000000000a R08: 000000000000000a R09: 0000000000000000 [420161.179295] R10: 000000000000889f R11: 0000000000000000 R12: 000000000000000a [420161.190420] R13: 000000000000000a R14: 0000000000000000 R15: ffff888117611828 [420161.201505] FS: 00007fa869fc1f80(0000) GS:ffff8897e0880000(0000) knlGS:0000000000000000 [420161.213628] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [420161.223413] CR2: 0000000000000082 CR3: 00000001d076c001 CR4: 00000000007706e0 [420161.234653] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [420161.245893] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [420161.257052] PKRU: 55555554 Fixes: 2d4238f55697 ("ice: Add support for AF_XDP") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski6-20/+89
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-18ice: remove non-inclusive languageMikael Barsehyan2-9/+9
Remove non-inclusive language from the driver where possible; replace "master" with "primary"; replace "slave" with "secondary". Signed-off-by: Mikael Barsehyan <mikael.barsehyan@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-18ice: Remove ucast_sharedSylwester Dziedziuch3-165/+5
Remove ucast_shared as it was always true. Remove the code depending on ucast_shared from ice_add_mac and ice_remove_mac. Remove ice_find_ucast_rule_entry function as it was only used when ucast_shared was set to false. Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-18ice: Allow 100M speeds for some devicesAnirudh Venkataramanan3-4/+28
For certain devices, 100M speeds are supported. Do not mask off 100M speed for these devices. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Co-developed-by: Chinh T Cao <chinh.t.cao@intel.com> Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Signed-off-by: Mikael Barsehyan <mikael.barsehyan@intel.com> Tested-by: Kavya AV <kavyax.av@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-18ice: Implement FCS/CRC and VLAN stripping co-existence policyAnatolii Gerasymenko1-0/+25
Make sure that only the valid combinations of FCS/CRC stripping and VLAN stripping offloads are allowed. You cannot have FCS/CRC stripping disabled while VLAN stripping is enabled - this breaks the correctness of the FCS/CRC. If administrator tries to enable VLAN stripping when FCS/CRC stripping is disabled, the request should be rejected. If administrator tries to disable FCS/CRC stripping when VLAN stripping is enabled, the request should be rejected if VLANs are configured. If there is no VLAN configured, then both FCS/CRC and VLAN stripping should be disabled. Testing Hints: The default settings after driver load are: - VLAN C-Tag offloads are enabled - VLAN S-Tag offloads are disabled - FCS/CRC stripping is enabled Restore the default settings before each test with the command: ethtool -K eth0 rx-fcs off rxvlan on txvlan on rx-vlan-stag-hw-parse off tx-vlan-stag-hw-insert off Test 1: Disable FCS/CRC and VLAN stripping: ethtool -K eth0 rx-fcs on rxvlan off Try to enable VLAN stripping: ethtool -K eth0 rxvlan on Expected: VLAN stripping request is rejected Test 2: Try to disable FCS/CRC stripping: ethtool -K eth0 rx-fcs on Expected: VLAN stripping is also disabled, as there are no VLAN configured Test 3: Add a VLAN: ip link add link eth0 eth0.42 type vlan id 42 ip link set eth0 up Try to disable FCS/CRC stripping: ethtool -K eth0 rx-fcs on Expected: FCS/CRC stripping request is rejected Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-18ice: Implement control of FCS/CRC strippingJesse Brandeburg7-6/+69
The driver can allow the user to configure whether the CRC aka the FCS (Frame Check Sequence) is DMA'd to the host as part of the receive buffer. The driver usually wants this feature disabled so that the hardware checks the FCS and strips it in order to save PCI bandwidth. Control the reception of FCS to the host using the command: ethtool -K eth0 rx-fcs <on|off> The default shown in ethtool -k eth0 | grep fcs; should be "off", as the hardware will drop any frame with a bad checksum, and DMA of the checksum is useless overhead especially for small packets. Testing Hints: test the FCS/CRC arrives with received packets using tcpdump -nnpi eth0 -xxxx and it should show crc data as the last 4 bytes of the packet. Can also use wireshark to turn on CRC checking and check the data is correct. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Co-developed-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Co-developed-by: Benjamin Mikailenko <benjamin.mikailenko@intel.com> Signed-off-by: Benjamin Mikailenko <benjamin.mikailenko@intel.com> Co-developed-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com> Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-17ice: Fix VF not able to send tagged traffic with no VLAN filtersSylwester Dziedziuch2-11/+57
VF was not able to send tagged traffic when it didn't have any VLAN interfaces and VLAN anti-spoofing was enabled. Fix this by allowing VFs with no VLAN filters to send tagged traffic. After VF adds a VLAN interface it will be able to send tagged traffic matching VLAN filters only. Testing hints: 1. Spawn VF 2. Send tagged packet from a VF 3. The packet should be sent out and not dropped 4. Add a VLAN interface on VF 5. Send tagged packet on that VLAN interface 6. Packet should be sent out and not dropped 7. Send tagged packet with id different than VLAN interface 8. Packet should be dropped Fixes: daf4dd16438b ("ice: Refactor spoofcheck configuration functions") Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-17ice: Ignore error message when setting same promiscuous modeBenjamin Mikailenko1-4/+4
Commit 1273f89578f2 ("ice: Fix broken IFF_ALLMULTI handling") introduced new checks when setting/clearing promiscuous mode. But if the requested promiscuous mode setting already exists, an -EEXIST error message would be printed. This is incorrect because promiscuous mode is either on/off and shouldn't print an error when the requested configuration is already set. This can happen when removing a bridge with two bonded interfaces and promiscuous most isn't fully cleared from VLAN VSI in hardware. Fix this by ignoring cases where requested promiscuous mode exists. Fixes: 1273f89578f2 ("ice: Fix broken IFF_ALLMULTI handling") Signed-off-by: Benjamin Mikailenko <benjamin.mikailenko@intel.com> Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com> Link: https://lore.kernel.org/all/CAK8fFZ7m-KR57M_rYX6xZN39K89O=LGooYkKsu6HKt0Bs+x6xQ@mail.gmail.com/ Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-17ice: Fix clearing of promisc mode with bridge over bondGrzegorz Siwik2-2/+16
When at least two interfaces are bonded and a bridge is enabled on the bond, an error can occur when the bridge is removed and re-added. The reason for the error is because promiscuous mode was not fully cleared from the VLAN VSI in the hardware. With this change, promiscuous mode is properly removed when the bridge disconnects from bonding. [ 1033.676359] bond1: link status definitely down for interface enp95s0f0, disabling it [ 1033.676366] bond1: making interface enp175s0f0 the new active one [ 1033.676369] device enp95s0f0 left promiscuous mode [ 1033.676522] device enp175s0f0 entered promiscuous mode [ 1033.676901] ice 0000:af:00.0 enp175s0f0: Error setting Multicast promiscuous mode on VSI 6 [ 1041.795662] ice 0000:af:00.0 enp175s0f0: Error setting Multicast promiscuous mode on VSI 6 [ 1041.944826] bond1: link status definitely down for interface enp175s0f0, disabling it [ 1041.944874] device enp175s0f0 left promiscuous mode [ 1041.944918] bond1: now running without any active interface! Fixes: c31af68a1b94 ("ice: Add outer_vlan_ops and VSI specific VLAN ops implementations") Co-developed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com> Link: https://lore.kernel.org/all/CAK8fFZ7m-KR57M_rYX6xZN39K89O=LGooYkKsu6HKt0Bs+x6xQ@mail.gmail.com/ Tested-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Tested-by: Igor Raits <igor@gooddata.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-17ice: Ignore EEXIST when setting promisc modeGrzegorz Siwik1-1/+1
Ignore EEXIST error when setting promiscuous mode. This fix is needed because the driver could set promiscuous mode when it still has not cleared properly. Promiscuous mode could be set only once, so setting it second time will be rejected. Fixes: 5eda8afd6bcc ("ice: Add support for PF/VF promiscuous mode") Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com> Link: https://lore.kernel.org/all/CAK8fFZ7m-KR57M_rYX6xZN39K89O=LGooYkKsu6HKt0Bs+x6xQ@mail.gmail.com/ Tested-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Tested-by: Igor Raits <igor@gooddata.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-17ice: Fix double VLAN error when entering promisc modeGrzegorz Siwik1-0/+7
Avoid enabling or disabling VLAN 0 when trying to set promiscuous VLAN mode if double VLAN mode is enabled. This fix is needed because the driver tries to add the VLAN 0 filter twice (once for inner and once for outer) when double VLAN mode is enabled. The filter program is rejected by the firmware when double VLAN is enabled, because the promiscuous filter only needs to be set once. This issue was missed in the initial implementation of double VLAN mode. Fixes: 5eda8afd6bcc ("ice: Add support for PF/VF promiscuous mode") Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com> Link: https://lore.kernel.org/all/CAK8fFZ7m-KR57M_rYX6xZN39K89O=LGooYkKsu6HKt0Bs+x6xQ@mail.gmail.com/ Tested-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Tested-by: Igor Raits <igor@gooddata.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-16ice: introduce ice_ptp_reset_cached_phctime functionJacob Keller1-23/+76
If the PTP hardware clock is adjusted, the ice driver must update the cached PHC timestamp. This is required in order to perform timestamp extension on the shorter timestamps captured by the PHY. Currently, we simply call ice_ptp_update_cached_phctime in the settime and adjtime callbacks. This has a few issues: 1) if ICE_CFG_BUSY is set because another thread is updating the Rx rings, we will exit with an error. This is not checked, and the functions do not re-schedule the update. This could leave the cached timestamp incorrect until the next scheduled work item execution. 2) even if we did handle an update, any currently outstanding Tx timestamp would be extended using the wrong cached PHC time. This would produce incorrect results. To fix these issues, introduce a new ice_ptp_reset_cached_phctime function. This function calls the ice_ptp_update_cached_phctime, and discards outstanding Tx timestamps. If the ice_ptp_update_cached_phctime function fails because ICE_CFG_BUSY is set, we log a warning and schedule the thread to execute soon. The update function is modified so that it always updates the cached copy in the PF regardless. This ensures we have the most up to date values possible and minimizes the risk of a packet timestamp being extended with the wrong value. It would be nice if we could skip reporting Rx timestamps until the cached values are up to date. However, we can't access the Rx rings while ICE_CFG_BUSY is set because they are actively being updated by another thread. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-16ice: re-arrange some static functions in ice_ptp.cJacob Keller1-336/+336
A following change is going to want to make use of ice_ptp_flush_tx_tracker earlier in the ice_ptp.c file. To make this work, move the Tx timestamp tracking functions higher up in the file, and pull the ice_ptp_update_cached_timestamp function below them. This should have no functional change. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-16ice: track and warn when PHC update is lateJacob Keller3-3/+34
The ice driver requires a cached copy of the PHC time in order to perform timestamp extension on Tx and Rx hardware timestamp values. This cached PHC time must always be updated at least once every 2 seconds. Otherwise, the math used to perform the extension would produce invalid results. The updates are supposed to occur periodically in the PTP kthread work item, which is scheduled to run every half second. Thus, we do not expect an update to be delayed for so long. However, there are error conditions which can cause the update to be delayed. Track this situation by using jiffies to determine approximately how long ago the last update occurred. Add a new statistic and a dev_warn when we have failed to update the cached PHC time. This makes the error case more obvious. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-16ice: track Tx timestamp stats similar to other Intel driversJacob Keller4-4/+20
Several Intel networking drivers which support PTP track when Tx timestamps are skipped or when they timeout without a timestamp from hardware. The conditions which could cause these events are rare, but it can be useful to know when and how often they occur. Implement similar statistics for the ice driver, tx_hwtstamp_skipped, tx_hwtstamp_timeouts, and tx_hwtstamp_flushed. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-16ice: initialize cached_phctime when creating Rx ringsJacob Keller2-0/+2
When we create new Rx rings, the cached_phctime field is zero initialized. This could result in incorrect timestamp reporting due to the cached value not yet being updated. Although a background task will periodically update the cached value, ensure it matches the existing cached value in the PF structure at ring initialization. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-16ice: set tx_tstamps when creating new Tx rings via ethtoolJacob Keller1-0/+1
When the user changes the number of queues via ethtool, the driver allocates new rings. This allocation did not initialize tx_tstamps. This results in the tx_tstamps field being zero (due to kcalloc allocation), and would result in a NULL pointer dereference when attempting a transmit timestamp on the new ring. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-15Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/netDavid S. Miller2-2/+4
-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-08-11 (ice) This series contains updates to ice driver only. Benjamin corrects a misplaced parenthesis for a WARN_ON check. Michal removes WARN_ON from a check as its recoverable and not warranting of a call trace. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-11ice: Fix call trace with null VSI during VF resetMichal Jaron1-1/+3
During stress test with attaching and detaching VF from KVM and simultaneously changing VFs spoofcheck and trust there was a call trace in ice_reset_vf that VF's VSI is null. [145237.352797] WARNING: CPU: 46 PID: 840629 at drivers/net/ethernet/intel/ice/ice_vf_lib.c:508 ice_reset_vf+0x3d6/0x410 [ice] [145237.352851] Modules linked in: ice(E) vfio_pci vfio_pci_core vfio_virqfd vfio_iommu_type1 vfio iavf dm_mod xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink tun bridge stp llc sunrpc intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm iTCO_wdt iTC O_vendor_support irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl ipmi_si intel_cstate ipmi_devintf joydev intel_uncore m ei_me ipmi_msghandler i2c_i801 pcspkr mei lpc_ich ioatdma i2c_smbus acpi_pad acpi_power_meter ip_tables xfs libcrc32c i2c_algo_bit drm_sh mem_helper drm_kms_helper sd_mod t10_pi crc64_rocksoft syscopyarea crc64 sysfillrect sg sysimgblt fb_sys_fops drm i40e ixgbe ahci libahci libata crc32c_intel mdio dca wmi fuse [last unloaded: ice] [145237.352917] CPU: 46 PID: 840629 Comm: kworker/46:2 Tainted: G S W I E 5.19.0-rc6+ #24 [145237.352921] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015 [145237.352923] Workqueue: ice ice_service_task [ice] [145237.352948] RIP: 0010:ice_reset_vf+0x3d6/0x410 [ice] [145237.352984] Code: 30 ec f3 cc e9 28 fd ff ff 0f b7 4b 50 48 c7 c2 48 19 9c c0 4c 89 ee 48 c7 c7 30 fe 9e c0 e8 d1 21 9d cc 31 c0 e9 a 9 fe ff ff <0f> 0b b8 ea ff ff ff e9 c1 fc ff ff 0f 0b b8 fb ff ff ff e9 91 fe [145237.352987] RSP: 0018:ffffb453e257fdb8 EFLAGS: 00010246 [145237.352990] RAX: ffff8bd0040181c0 RBX: ffff8be68db8f800 RCX: 0000000000000000 [145237.352991] RDX: 000000000000ffff RSI: 0000000000000000 RDI: ffff8be68db8f800 [145237.352993] RBP: ffff8bd0040181c0 R08: 0000000000001000 R09: ffff8bcfd520e000 [145237.352995] R10: 0000000000000000 R11: 00008417b5ab0bc0 R12: 0000000000000005 [145237.352996] R13: ffff8bcee061c0d0 R14: ffff8bd004019640 R15: 0000000000000000 [145237.352998] FS: 0000000000000000(0000) GS:ffff8be5dfb00000(0000) knlGS:0000000000000000 [145237.353000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [145237.353002] CR2: 00007fd81f651d68 CR3: 0000001a0fe10001 CR4: 00000000001726e0 [145237.353003] Call Trace: [145237.353008] <TASK> [145237.353011] ice_process_vflr_event+0x8d/0xb0 [ice] [145237.353049] ice_service_task+0x79f/0xef0 [ice] [145237.353074] process_one_work+0x1c8/0x390 [145237.353081] ? process_one_work+0x390/0x390 [145237.353084] worker_thread+0x30/0x360 [145237.353087] ? process_one_work+0x390/0x390 [145237.353090] kthread+0xe8/0x110 [145237.353094] ? kthread_complete_and_exit+0x20/0x20 [145237.353097] ret_from_fork+0x22/0x30 [145237.353103] </TASK> Remove WARN_ON() from check if VSI is null in ice_reset_vf. Add "VF is already removed\n" in dev_dbg(). This WARN_ON() is unnecessary and causes call trace, despite that call trace, driver still works. There is no need for this warn because this piece of code is responsible for disabling VF's Tx/Rx queues when VF is disabled, but when VF is already removed there is no need to do reset or disable queues. Fixes: efe41860008e ("ice: Fix memory corruption in VF driver") Signed-off-by: Michal Jaron <michalx.jaron@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-11ice: Fix VSI rebuild WARN_ON check for VFBenjamin Mikailenko1-1/+1
In commit b03d519d3460 ("ice: store VF pointer instead of VF ID") WARN_ON checks were added to validate the vsi->vf pointer and catch programming errors. However, one check to vsi->vf was missed. This caused a call trace when resetting VFs. Fix ice_vsi_rebuild by encompassing VF pointer in WARN_ON check. Fixes: b03d519d3460 ("ice: store VF pointer instead of VF ID") Signed-off-by: Benjamin Mikailenko <benjamin.mikailenko@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-07Merge tag 'bitmap-6.0-rc1' of https://github.com/norov/linuxLinus Torvalds1-1/+1
Pull bitmap updates from Yury Norov: - fix the duplicated comments on bitmap_to_arr64() (Qu Wenruo) - optimize out non-atomic bitops on compile-time constants (Alexander Lobakin) - cleanup bitmap-related headers (Yury Norov) - x86/olpc: fix 'logical not is only applied to the left hand side' (Alexander Lobakin) - lib/nodemask: inline wrappers around bitmap (Yury Norov) * tag 'bitmap-6.0-rc1' of https://github.com/norov/linux: (26 commits) lib/nodemask: inline next_node_in() and node_random() powerpc: drop dependency on <asm/machdep.h> in archrandom.h x86/olpc: fix 'logical not is only applied to the left hand side' lib/cpumask: move some one-line wrappers to header file headers/deps: mm: align MANITAINERS and Docs with new gfp.h structure headers/deps: mm: Split <linux/gfp_types.h> out of <linux/gfp.h> headers/deps: mm: Optimize <linux/gfp.h> header dependencies lib/cpumask: move trivial wrappers around find_bit to the header lib/cpumask: change return types to unsigned where appropriate cpumask: change return types to bool where appropriate lib/bitmap: change type of bitmap_weight to unsigned long lib/bitmap: change return types to bool where appropriate arm: align find_bit declarations with generic kernel iommu/vt-d: avoid invalid memory access via node_online(NUMA_NO_NODE) lib/test_bitmap: test the tail after bitmap_to_arr64() lib/bitmap: fix off-by-one in bitmap_to_arr64() lib: test_bitmap: add compile-time optimization/evaluations assertions bitmap: don't assume compiler evaluates small mem*() builtins calls net/ice: fix initializing the bitmap in the switch code bitops: let optimize out non-atomic bitops on compile-time constants ...