Age | Commit message (Collapse) | Author | Files | Lines |
|
use tc_qdisc_stats_dump() in qdisc.
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Tested-by: Victor Nogueira <victor@mojatatu.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently, the change function can be called by two ways. The one way is
that qdisc_change() will call it. Before calling change function,
qdisc_change() ensures tca[TCA_OPTIONS] is not empty. The other way is
that .init() will call it. The opt parameter is also checked before
calling change function in .init(). Therefore, it's no need to check the
input parameter opt in change function.
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Link: https://lore.kernel.org/r/20220829071219.208646-1-shaozhengchao@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
qdisc_reset() is clearing qdisc->q.qlen and qdisc->qstats.backlog
_after_ calling qdisc->ops->reset. There is no need to clear them
again in the specific reset function.
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Link: https://lore.kernel.org/r/20220824005231.345727-1-shaozhengchao@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Shuang reported that the following script:
1) tc qdisc add dev ddd0 handle 10: parent 1: ets bands 8 strict 4 priomap 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7
2) mausezahn ddd0 -A 10.10.10.1 -B 10.10.10.2 -c 0 -a own -b 00:c1:a0:c1:a0:00 -t udp &
3) tc qdisc change dev ddd0 handle 10: ets bands 4 strict 2 quanta 2500 2500 priomap 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
crashes systematically when line 2) is commented:
list_del corruption, ffff8e028404bd30->next is LIST_POISON1 (dead000000000100)
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:47!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 954 Comm: tc Not tainted 5.16.0-rc4+ #478
Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
RIP: 0010:__list_del_entry_valid.cold.1+0x12/0x47
Code: fe ff 0f 0b 48 89 c1 4c 89 c6 48 c7 c7 08 42 1b 87 e8 1d c5 fe ff 0f 0b 48 89 fe 48 89 c2 48 c7 c7 98 42 1b 87 e8 09 c5 fe ff <0f> 0b 48 c7 c7 48 43 1b 87 e8 fb c4 fe ff 0f 0b 48 89 f2 48 89 fe
RSP: 0018:ffffae46807a3888 EFLAGS: 00010246
RAX: 000000000000004e RBX: 0000000000000007 RCX: 0000000000000202
RDX: 0000000000000000 RSI: ffffffff871ac536 RDI: 00000000ffffffff
RBP: ffffae46807a3a10 R08: 0000000000000000 R09: c0000000ffff7fff
R10: 0000000000000001 R11: ffffae46807a36a8 R12: ffff8e028404b800
R13: ffff8e028404bd30 R14: dead000000000100 R15: ffff8e02fafa2400
FS: 00007efdc92e4480(0000) GS:ffff8e02fb600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000682f48 CR3: 00000001058be000 CR4: 0000000000350ef0
Call Trace:
<TASK>
ets_qdisc_change+0x58b/0xa70 [sch_ets]
tc_modify_qdisc+0x323/0x880
rtnetlink_rcv_msg+0x169/0x4a0
netlink_rcv_skb+0x50/0x100
netlink_unicast+0x1a5/0x280
netlink_sendmsg+0x257/0x4d0
sock_sendmsg+0x5b/0x60
____sys_sendmsg+0x1f2/0x260
___sys_sendmsg+0x7c/0xc0
__sys_sendmsg+0x57/0xa0
do_syscall_64+0x3a/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7efdc8031338
Code: 89 02 48 c7 c0 ff ff ff ff eb b5 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 25 43 2c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 41 89 d4 55
RSP: 002b:00007ffdf1ce9828 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000061b37a97 RCX: 00007efdc8031338
RDX: 0000000000000000 RSI: 00007ffdf1ce9890 RDI: 0000000000000003
RBP: 0000000000000000 R08: 0000000000000001 R09: 000000000078a940
R10: 000000000000000c R11: 0000000000000246 R12: 0000000000000001
R13: 0000000000688880 R14: 0000000000000000 R15: 0000000000000000
</TASK>
Modules linked in: sch_ets sch_tbf dummy rfkill iTCO_wdt iTCO_vendor_support intel_rapl_msr intel_rapl_common joydev pcspkr i2c_i801 virtio_balloon i2c_smbus lpc_ich ip_tables xfs libcrc32c crct10dif_pclmul crc32_pclmul crc32c_intel serio_raw ghash_clmulni_intel ahci libahci libata virtio_blk virtio_console virtio_net net_failover failover sunrpc dm_mirror dm_region_hash dm_log dm_mod [last unloaded: sch_ets]
---[ end trace f35878d1912655c2 ]---
RIP: 0010:__list_del_entry_valid.cold.1+0x12/0x47
Code: fe ff 0f 0b 48 89 c1 4c 89 c6 48 c7 c7 08 42 1b 87 e8 1d c5 fe ff 0f 0b 48 89 fe 48 89 c2 48 c7 c7 98 42 1b 87 e8 09 c5 fe ff <0f> 0b 48 c7 c7 48 43 1b 87 e8 fb c4 fe ff 0f 0b 48 89 f2 48 89 fe
RSP: 0018:ffffae46807a3888 EFLAGS: 00010246
RAX: 000000000000004e RBX: 0000000000000007 RCX: 0000000000000202
RDX: 0000000000000000 RSI: ffffffff871ac536 RDI: 00000000ffffffff
RBP: ffffae46807a3a10 R08: 0000000000000000 R09: c0000000ffff7fff
R10: 0000000000000001 R11: ffffae46807a36a8 R12: ffff8e028404b800
R13: ffff8e028404bd30 R14: dead000000000100 R15: ffff8e02fafa2400
FS: 00007efdc92e4480(0000) GS:ffff8e02fb600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000682f48 CR3: 00000001058be000 CR4: 0000000000350ef0
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: 0x4e00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
we can remove 'q->classes[i].alist' only if DRR class 'i' was part of the
active list. In the ETS scheduler DRR classes belong to that list only if
the queue length is greater than zero: we need to test for non-zero value
of 'q->classes[i].qdisc->q.qlen' before removing from the list, similarly
to what has been done elsewhere in the ETS code.
Fixes: de6d25924c2a ("net/sched: sch_ets: don't peek at classes beyond 'nbands'")
Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
when the number of DRR classes decreases, the round-robin active list can
contain elements that have already been freed in ets_qdisc_change(). As a
consequence, it's possible to see a NULL dereference crash, caused by the
attempt to call cl->qdisc->ops->peek(cl->qdisc) when cl->qdisc is NULL:
BUG: kernel NULL pointer dereference, address: 0000000000000018
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 1 PID: 910 Comm: mausezahn Not tainted 5.16.0-rc1+ #475
Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
RIP: 0010:ets_qdisc_dequeue+0x129/0x2c0 [sch_ets]
Code: c5 01 41 39 ad e4 02 00 00 0f 87 18 ff ff ff 49 8b 85 c0 02 00 00 49 39 c4 0f 84 ba 00 00 00 49 8b ad c0 02 00 00 48 8b 7d 10 <48> 8b 47 18 48 8b 40 38 0f ae e8 ff d0 48 89 c3 48 85 c0 0f 84 9d
RSP: 0000:ffffbb36c0b5fdd8 EFLAGS: 00010287
RAX: ffff956678efed30 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000002 RSI: ffffffff9b938dc9 RDI: 0000000000000000
RBP: ffff956678efed30 R08: e2f3207fe360129c R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffff956678efeac0
R13: ffff956678efe800 R14: ffff956611545000 R15: ffff95667ac8f100
FS: 00007f2aa9120740(0000) GS:ffff95667b800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000018 CR3: 000000011070c000 CR4: 0000000000350ee0
Call Trace:
<TASK>
qdisc_peek_dequeued+0x29/0x70 [sch_ets]
tbf_dequeue+0x22/0x260 [sch_tbf]
__qdisc_run+0x7f/0x630
net_tx_action+0x290/0x4c0
__do_softirq+0xee/0x4f8
irq_exit_rcu+0xf4/0x130
sysvec_apic_timer_interrupt+0x52/0xc0
asm_sysvec_apic_timer_interrupt+0x12/0x20
RIP: 0033:0x7f2aa7fc9ad4
Code: b9 ff ff 48 8b 54 24 18 48 83 c4 08 48 89 ee 48 89 df 5b 5d e9 ed fc ff ff 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa <53> 48 83 ec 10 48 8b 05 10 64 33 00 48 8b 00 48 85 c0 0f 85 84 00
RSP: 002b:00007ffe5d33fab8 EFLAGS: 00000202
RAX: 0000000000000002 RBX: 0000561f72c31460 RCX: 0000561f72c31720
RDX: 0000000000000002 RSI: 0000561f72c31722 RDI: 0000561f72c31720
RBP: 000000000000002a R08: 00007ffe5d33fa40 R09: 0000000000000014
R10: 0000000000000000 R11: 0000000000000246 R12: 0000561f7187e380
R13: 0000000000000000 R14: 0000000000000000 R15: 0000561f72c31460
</TASK>
Modules linked in: sch_ets sch_tbf dummy rfkill iTCO_wdt intel_rapl_msr iTCO_vendor_support intel_rapl_common joydev virtio_balloon lpc_ich i2c_i801 i2c_smbus pcspkr ip_tables xfs libcrc32c crct10dif_pclmul crc32_pclmul crc32c_intel ahci libahci ghash_clmulni_intel serio_raw libata virtio_blk virtio_console virtio_net net_failover failover sunrpc dm_mirror dm_region_hash dm_log dm_mod
CR2: 0000000000000018
Ensuring that 'alist' was never zeroed [1] was not sufficient, we need to
remove from the active list those elements that are no more SP nor DRR.
[1] https://lore.kernel.org/netdev/60d274838bf09777f0371253416e8af71360bc08.1633609148.git.dcaratti@redhat.com/
v3: fix race between ets_qdisc_change() and ets_qdisc_dequeue() delisting
DRR classes beyond 'nbands' in ets_qdisc_change() with the qdisc lock
acquired, thanks to Cong Wang.
v2: when a NULL qdisc is found in the DRR active list, try to dequeue skb
from the next list item.
Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Fixes: dcc68b4d8084 ("net: sch_ets: Add a new Qdisc")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/r/7a5c496eed2d62241620bdbb83eb03fb9d571c99.1637762721.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The Qdisc::running sequence counter has two uses:
1. Reliably reading qdisc's tc statistics while the qdisc is running
(a seqcount read/retry loop at gnet_stats_add_basic()).
2. As a flag, indicating whether the qdisc in question is running
(without any retry loops).
For the first usage, the Qdisc::running sequence counter write section,
qdisc_run_begin() => qdisc_run_end(), covers a much wider area than what
is actually needed: the raw qdisc's bstats update. A u64_stats sync
point was thus introduced (in previous commits) inside the bstats
structure itself. A local u64_stats write section is then started and
stopped for the bstats updates.
Use that u64_stats sync point mechanism for the bstats read/retry loop
at gnet_stats_add_basic().
For the second qdisc->running usage, a __QDISC_STATE_RUNNING bit flag,
accessed with atomic bitops, is sufficient. Using a bit flag instead of
a sequence counter at qdisc_run_begin/end() and qdisc_is_running() leads
to the SMP barriers implicitly added through raw_read_seqcount() and
write_seqcount_begin/end() getting removed. All call sites have been
surveyed though, and no required ordering was identified.
Now that the qdisc->running sequence counter is no longer used, remove
it.
Note, using u64_stats implies no sequence counter protection for 64-bit
architectures. This can lead to the qdisc tc statistics "packets" vs.
"bytes" values getting out of sync on rare occasions. The individual
values will still be valid.
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The only factor differentiating per-CPU bstats data type (struct
gnet_stats_basic_cpu) from the packed non-per-CPU one (struct
gnet_stats_basic_packed) was a u64_stats sync point inside the former.
The two data types are now equivalent: earlier commits added a u64_stats
sync point to the latter.
Combine both data types into "struct gnet_stats_basic_sync". This
eliminates redundancy and simplifies the bstats read/write APIs.
Use u64_stats_t for bstats "packets" and "bytes" data types. On 64-bit
architectures, u64_stats sync points do not use sequence counter
protection.
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The not-per-CPU variant of qdisc tc (traffic control) statistics,
Qdisc::gnet_stats_basic_packed bstats, is protected with Qdisc::running
sequence counter.
This sequence counter is used for reliably protecting bstats reads from
parallel writes. Meanwhile, the seqcount's write section covers a much
wider area than bstats update: qdisc_run_begin() => qdisc_run_end().
That read/write section asymmetry can lead to needless retries of the
read section. To prepare for removing the Qdisc::running sequence
counter altogether, introduce a u64_stats sync point inside bstats
instead.
Modify _bstats_update() to start/end the bstats u64_stats write
section.
For bisectability, and finer commits granularity, the bstats read
section is still protected with a Qdisc::running read/retry loop and
qdisc_run_begin/end() still starts/ends that seqcount write section.
Once all call sites are modified to use _bstats_update(), the
Qdisc::running seqcount will be removed and bstats read/retry loop will
be modified to utilize the internal u64_stats sync point.
Note, using u64_stats implies no sequence counter protection for 64-bit
architectures. This can lead to the statistics "packets" vs. "bytes"
values getting out of sync on rare occasions. The individual values will
still be valid.
[bigeasy: Minor commit message edits, init all gnet_stats_basic_packed.]
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
leaf classes of ETS qdiscs are served in strict priority or deficit round
robin (DRR), depending on the value of 'nstrict'. Since this value can be
changed while traffic is running, we need to be sure that the active list
of DRR classes can be updated at any time, so:
1) call INIT_LIST_HEAD(&alist) on all leaf classes in .init(), before the
first packet hits any of them.
2) ensure that 'alist' is not overwritten with zeros when a leaf class is
no more strict priority nor DRR (i.e. array elements beyond 'nbands').
Link: https://lore.kernel.org/netdev/YS%2FoZ+f0Nr8eQkzH@dcaratti.users.ipa.redhat.com
Suggested-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
drivers/net/wwan/mhi_wwan_mbim.c - drop the extra arg.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
While running kselftests, Hangbin observed that sch_ets.sh often crashes,
and splats like the following one are seen in the output of 'dmesg':
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 159f12067 P4D 159f12067 PUD 159f13067 PMD 0
Oops: 0000 [#1] SMP NOPTI
CPU: 2 PID: 921 Comm: tc Not tainted 5.14.0-rc6+ #458
Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
RIP: 0010:__list_del_entry_valid+0x2d/0x50
Code: 48 8b 57 08 48 b9 00 01 00 00 00 00 ad de 48 39 c8 0f 84 ac 6e 5b 00 48 b9 22 01 00 00 00 00 ad de 48 39 ca 0f 84 cf 6e 5b 00 <48> 8b 32 48 39 fe 0f 85 af 6e 5b 00 48 8b 50 08 48 39 f2 0f 85 94
RSP: 0018:ffffb2da005c3890 EFLAGS: 00010217
RAX: 0000000000000000 RBX: ffff9073ba23f800 RCX: dead000000000122
RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff9073ba23fbc8
RBP: ffff9073ba23f890 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000001 R11: 0000000000000001 R12: dead000000000100
R13: ffff9073ba23fb00 R14: 0000000000000002 R15: 0000000000000002
FS: 00007f93e5564e40(0000) GS:ffff9073bba00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000014ad34000 CR4: 0000000000350ee0
Call Trace:
ets_qdisc_reset+0x6e/0x100 [sch_ets]
qdisc_reset+0x49/0x1d0
tbf_reset+0x15/0x60 [sch_tbf]
qdisc_reset+0x49/0x1d0
dev_reset_queue.constprop.42+0x2f/0x90
dev_deactivate_many+0x1d3/0x3d0
dev_deactivate+0x56/0x90
qdisc_graft+0x47e/0x5a0
tc_get_qdisc+0x1db/0x3e0
rtnetlink_rcv_msg+0x164/0x4c0
netlink_rcv_skb+0x50/0x100
netlink_unicast+0x1a5/0x280
netlink_sendmsg+0x242/0x480
sock_sendmsg+0x5b/0x60
____sys_sendmsg+0x1f2/0x260
___sys_sendmsg+0x7c/0xc0
__sys_sendmsg+0x57/0xa0
do_syscall_64+0x3a/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f93e44b8338
Code: 89 02 48 c7 c0 ff ff ff ff eb b5 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 25 43 2c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 41 89 d4 55
RSP: 002b:00007ffc0db737a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000061255c06 RCX: 00007f93e44b8338
RDX: 0000000000000000 RSI: 00007ffc0db73810 RDI: 0000000000000003
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
R10: 000000000000000b R11: 0000000000000246 R12: 0000000000000001
R13: 0000000000687880 R14: 0000000000000000 R15: 0000000000000000
Modules linked in: sch_ets sch_tbf dummy rfkill iTCO_wdt iTCO_vendor_support intel_rapl_msr intel_rapl_common joydev i2c_i801 pcspkr i2c_smbus lpc_ich virtio_balloon ip_tables xfs libcrc32c crct10dif_pclmul crc32_pclmul crc32c_intel ahci libahci ghash_clmulni_intel libata serio_raw virtio_blk virtio_console virtio_net net_failover failover sunrpc dm_mirror dm_region_hash dm_log dm_mod
CR2: 0000000000000000
When the change() function decreases the value of 'nstrict', we must take
into account that packets might be already enqueued on a class that flips
from 'strict' to 'quantum': otherwise that class will not be added to the
bandwidth-sharing list. Then, a call to ets_qdisc_reset() will attempt to
do list_del(&alist) with 'alist' filled with zero, hence the NULL pointer
dereference.
For classes flipping from 'strict' to 'quantum', initialize an empty list
and eventually add it to the bandwidth-sharing list, if there are packets
already enqueued. In this way, the kernel will:
a) prevent crashing as described above.
b) avoid retaining the backlog packets (for an arbitrarily long time) in
case no packet is enqueued after a change from 'strict' to 'quantum'.
Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Fixes: dcc68b4d8084 ("net: sch_ets: Add a new Qdisc")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
currently, only 'ingress' and 'clsact ingress' qdiscs store the tc 'chain
id' in the skb extension. However, userspace programs (like ovs) are able
to setup egress rules, and datapath gets confused in case it doesn't find
the 'chain id' for a packet that's "recirculated" by tc.
Change tcf_classify() to have the same semantic as tcf_classify_ingress()
so that a single function can be called in ingress / egress, using the tc
ingress / egress block respectively.
Suggested-by: Alaa Hleilel <alaa@nvidia.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit aebe4426ccaa4838f36ea805cdf7d76503e65117.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.
[1] https://www.kernel.org/doc/html/latest/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A following patch introduces qevents, points in qdisc algorithm where
packet can be processed by user-defined filters. Should this processing
lead to a situation where a new packet is to be enqueued on the same port,
holding the root lock would lead to deadlocks. To solve the issue, qevent
handler needs to unlock and relock the root lock when necessary.
To that end, add the root lock argument to the qdisc op enqueue, and
propagate throughout.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add hooks at appropriate points to make it possible to offload the ETS
Qdisc.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Introduces a new Qdisc, which is based on 802.1Q-2014 wording. It is
PRIO-like in how it is configured, meaning one needs to specify how many
bands there are, how many are strict and how many are dwrr, quanta for the
latter, and priomap.
The new Qdisc operates like the PRIO / DRR combo would when configured as
per the standard. The strict classes, if any, are tried for traffic first.
When there's no traffic in any of the strict queues, the ETS ones (if any)
are treated in the same way as in DRR.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|