diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2021-11-02 06:20:58 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2021-11-02 06:20:58 -0700 |
commit | fc02cb2b37fe2cbf1d3334b9f0f0eab9431766c4 (patch) | |
tree | 93b16bc48fdc3be4a1adccbf4c7de92a5e8440e1 /kernel/bpf/verifier.c | |
parent | bfc484fe6abba4b89ec9330e0e68778e2a9856b2 (diff) | |
parent | 84882cf72cd774cf16fd338bdbf00f69ac9f9194 (diff) | |
download | linux-fc02cb2b37fe2cbf1d3334b9f0f0eab9431766c4.tar.bz2 |
Merge tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- Remove socket skb caches
- Add a SO_RESERVE_MEM socket op to forward allocate buffer space and
avoid memory accounting overhead on each message sent
- Introduce managed neighbor entries - added by control plane and
resolved by the kernel for use in acceleration paths (BPF / XDP
right now, HW offload users will benefit as well)
- Make neighbor eviction on link down controllable by userspace to
work around WiFi networks with bad roaming implementations
- vrf: Rework interaction with netfilter/conntrack
- fq_codel: implement L4S style ce_threshold_ect1 marking
- sch: Eliminate unnecessary RCU waits in mini_qdisc_pair_swap()
BPF:
- Add support for new btf kind BTF_KIND_TAG, arbitrary type tagging
as implemented in LLVM14
- Introduce bpf_get_branch_snapshot() to capture Last Branch Records
- Implement variadic trace_printk helper
- Add a new Bloomfilter map type
- Track <8-byte scalar spill and refill
- Access hw timestamp through BPF's __sk_buff
- Disallow unprivileged BPF by default
- Document BPF licensing
Netfilter:
- Introduce egress hook for looking at raw outgoing packets
- Allow matching on and modifying inner headers / payload data
- Add NFT_META_IFTYPE to match on the interface type either from
ingress or egress
Protocols:
- Multi-Path TCP:
- increase default max additional subflows to 2
- rework forward memory allocation
- add getsockopts: MPTCP_INFO, MPTCP_TCPINFO, MPTCP_SUBFLOW_ADDRS
- MCTP flow support allowing lower layer drivers to configure msg
muxing as needed
- Automatic Multicast Tunneling (AMT) driver based on RFC7450
- HSR support the redbox supervision frames (IEC-62439-3:2018)
- Support for the ip6ip6 encapsulation of IOAM
- Netlink interface for CAN-FD's Transmitter Delay Compensation
- Support SMC-Rv2 eliminating the current same-subnet restriction, by
exploiting the UDP encapsulation feature of RoCE adapters
- TLS: add SM4 GCM/CCM crypto support
- Bluetooth: initial support for link quality and audio/codec offload
Driver APIs:
- Add a batched interface for RX buffer allocation in AF_XDP buffer
pool
- ethtool: Add ability to control transceiver modules' power mode
- phy: Introduce supported interfaces bitmap to express MAC
capabilities and simplify PHY code
- Drop rtnl_lock from DSA .port_fdb_{add,del} callbacks
New drivers:
- WiFi driver for Realtek 8852AE 802.11ax devices (rtw89)
- Ethernet driver for ASIX AX88796C SPI device (x88796c)
Drivers:
- Broadcom PHYs
- support 72165, 7712 16nm PHYs
- support IDDQ-SR for additional power savings
- PHY support for QCA8081, QCA9561 PHYs
- NXP DPAA2: support for IRQ coalescing
- NXP Ethernet (enetc): support for software TCP segmentation
- Renesas Ethernet (ravb) - support DMAC and EMAC blocks of
Gigabit-capable IP found on RZ/G2L SoC
- Intel 100G Ethernet
- support for eswitch offload of TC/OvS flow API, including
offload of GRE, VxLAN, Geneve tunneling
- support application device queues - ability to assign Rx and Tx
queues to application threads
- PTP and PPS (pulse-per-second) extensions
- Broadcom Ethernet (bnxt)
- devlink health reporting and device reload extensions
- Mellanox Ethernet (mlx5)
- offload macvlan interfaces
- support HW offload of TC rules involving OVS internal ports
- support HW-GRO and header/data split
- support application device queues
- Marvell OcteonTx2:
- add XDP support for PF
- add PTP support for VF
- Qualcomm Ethernet switch (qca8k): support for QCA8328
- Realtek Ethernet DSA switch (rtl8366rb)
- support bridge offload
- support STP, fast aging, disabling address learning
- support for Realtek RTL8365MB-VC, a 4+1 port 10M/100M/1GE switch
- Mellanox Ethernet/IB switch (mlxsw)
- multi-level qdisc hierarchy offload (e.g. RED, prio and shaping)
- offload root TBF qdisc as port shaper
- support multiple routing interface MAC address prefixes
- support for IP-in-IP with IPv6 underlay
- MediaTek WiFi (mt76)
- mt7921 - ASPM, 6GHz, SDIO and testmode support
- mt7915 - LED and TWT support
- Qualcomm WiFi (ath11k)
- include channel rx and tx time in survey dump statistics
- support for 80P80 and 160 MHz bandwidths
- support channel 2 in 6 GHz band
- spectral scan support for QCN9074
- support for rx decapsulation offload (data frames in 802.3
format)
- Qualcomm phone SoC WiFi (wcn36xx)
- enable Idle Mode Power Save (IMPS) to reduce power consumption
during idle
- Bluetooth driver support for MediaTek MT7922 and MT7921
- Enable support for AOSP Bluetooth extension in Qualcomm WCN399x and
Realtek 8822C/8852A
- Microsoft vNIC driver (mana)
- support hibernation and kexec
- Google vNIC driver (gve)
- support for jumbo frames
- implement Rx page reuse
Refactor:
- Make all writes to netdev->dev_addr go thru helpers, so that we can
add this address to the address rbtree and handle the updates
- Various TCP cleanups and optimizations including improvements to
CPU cache use
- Simplify the gnet_stats, Qdisc stats' handling and remove
qdisc->running sequence counter
- Driver changes and API updates to address devlink locking
deficiencies"
* tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2122 commits)
Revert "net: avoid double accounting for pure zerocopy skbs"
selftests: net: add arp_ndisc_evict_nocarrier
net: ndisc: introduce ndisc_evict_nocarrier sysctl parameter
net: arp: introduce arp_evict_nocarrier sysctl parameter
libbpf: Deprecate AF_XDP support
kbuild: Unify options for BTF generation for vmlinux and modules
selftests/bpf: Add a testcase for 64-bit bounds propagation issue.
bpf: Fix propagation of signed bounds from 64-bit min/max into 32-bit.
bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off.
net: vmxnet3: remove multiple false checks in vmxnet3_ethtool.c
net: avoid double accounting for pure zerocopy skbs
tcp: rename sk_wmem_free_skb
netdevsim: fix uninit value in nsim_drv_configure_vfs()
selftests/bpf: Fix also no-alu32 strobemeta selftest
bpf: Add missing map_delete_elem method to bloom filter map
selftests/bpf: Add bloom map success test for userspace calls
bpf: Add alignment padding for "map_extra" + consolidate holes
bpf: Bloom filter map naming fixups
selftests/bpf: Add test cases for struct_ops prog
bpf: Add dummy BPF STRUCT_OPS for test purpose
...
Diffstat (limited to 'kernel/bpf/verifier.c')
-rw-r--r-- | kernel/bpf/verifier.c | 373 |
1 files changed, 297 insertions, 76 deletions
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index de006552be8a..f0dca726ebfd 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -612,6 +612,20 @@ static const char *kernel_type_name(const struct btf* btf, u32 id) return btf_name_by_offset(btf, btf_type_by_id(btf, id)->name_off); } +/* The reg state of a pointer or a bounded scalar was saved when + * it was spilled to the stack. + */ +static bool is_spilled_reg(const struct bpf_stack_state *stack) +{ + return stack->slot_type[BPF_REG_SIZE - 1] == STACK_SPILL; +} + +static void scrub_spilled_slot(u8 *stype) +{ + if (*stype != STACK_INVALID) + *stype = STACK_MISC; +} + static void print_verifier_state(struct bpf_verifier_env *env, const struct bpf_func_state *state) { @@ -717,7 +731,7 @@ static void print_verifier_state(struct bpf_verifier_env *env, continue; verbose(env, " fp%d", (-i - 1) * BPF_REG_SIZE); print_liveness(env, state->stack[i].spilled_ptr.live); - if (state->stack[i].slot_type[0] == STACK_SPILL) { + if (is_spilled_reg(&state->stack[i])) { reg = &state->stack[i].spilled_ptr; t = reg->type; verbose(env, "=%s", reg_type_str[t]); @@ -1406,12 +1420,12 @@ static void __reg_combine_32_into_64(struct bpf_reg_state *reg) static bool __reg64_bound_s32(s64 a) { - return a > S32_MIN && a < S32_MAX; + return a >= S32_MIN && a <= S32_MAX; } static bool __reg64_bound_u32(u64 a) { - return a > U32_MIN && a < U32_MAX; + return a >= U32_MIN && a <= U32_MAX; } static void __reg_combine_64_into_32(struct bpf_reg_state *reg) @@ -1626,52 +1640,168 @@ static int add_subprog(struct bpf_verifier_env *env, int off) return env->subprog_cnt - 1; } +#define MAX_KFUNC_DESCS 256 +#define MAX_KFUNC_BTFS 256 + struct bpf_kfunc_desc { struct btf_func_model func_model; u32 func_id; s32 imm; + u16 offset; +}; + +struct bpf_kfunc_btf { + struct btf *btf; + struct module *module; + u16 offset; }; -#define MAX_KFUNC_DESCS 256 struct bpf_kfunc_desc_tab { struct bpf_kfunc_desc descs[MAX_KFUNC_DESCS]; u32 nr_descs; }; -static int kfunc_desc_cmp_by_id(const void *a, const void *b) +struct bpf_kfunc_btf_tab { + struct bpf_kfunc_btf descs[MAX_KFUNC_BTFS]; + u32 nr_descs; +}; + +static int kfunc_desc_cmp_by_id_off(const void *a, const void *b) { const struct bpf_kfunc_desc *d0 = a; const struct bpf_kfunc_desc *d1 = b; /* func_id is not greater than BTF_MAX_TYPE */ - return d0->func_id - d1->func_id; + return d0->func_id - d1->func_id ?: d0->offset - d1->offset; +} + +static int kfunc_btf_cmp_by_off(const void *a, const void *b) +{ + const struct bpf_kfunc_btf *d0 = a; + const struct bpf_kfunc_btf *d1 = b; + + return d0->offset - d1->offset; } static const struct bpf_kfunc_desc * -find_kfunc_desc(const struct bpf_prog *prog, u32 func_id) +find_kfunc_desc(const struct bpf_prog *prog, u32 func_id, u16 offset) { struct bpf_kfunc_desc desc = { .func_id = func_id, + .offset = offset, }; struct bpf_kfunc_desc_tab *tab; tab = prog->aux->kfunc_tab; return bsearch(&desc, tab->descs, tab->nr_descs, - sizeof(tab->descs[0]), kfunc_desc_cmp_by_id); + sizeof(tab->descs[0]), kfunc_desc_cmp_by_id_off); +} + +static struct btf *__find_kfunc_desc_btf(struct bpf_verifier_env *env, + s16 offset, struct module **btf_modp) +{ + struct bpf_kfunc_btf kf_btf = { .offset = offset }; + struct bpf_kfunc_btf_tab *tab; + struct bpf_kfunc_btf *b; + struct module *mod; + struct btf *btf; + int btf_fd; + + tab = env->prog->aux->kfunc_btf_tab; + b = bsearch(&kf_btf, tab->descs, tab->nr_descs, + sizeof(tab->descs[0]), kfunc_btf_cmp_by_off); + if (!b) { + if (tab->nr_descs == MAX_KFUNC_BTFS) { + verbose(env, "too many different module BTFs\n"); + return ERR_PTR(-E2BIG); + } + + if (bpfptr_is_null(env->fd_array)) { + verbose(env, "kfunc offset > 0 without fd_array is invalid\n"); + return ERR_PTR(-EPROTO); + } + + if (copy_from_bpfptr_offset(&btf_fd, env->fd_array, + offset * sizeof(btf_fd), + sizeof(btf_fd))) + return ERR_PTR(-EFAULT); + + btf = btf_get_by_fd(btf_fd); + if (IS_ERR(btf)) { + verbose(env, "invalid module BTF fd specified\n"); + return btf; + } + + if (!btf_is_module(btf)) { + verbose(env, "BTF fd for kfunc is not a module BTF\n"); + btf_put(btf); + return ERR_PTR(-EINVAL); + } + + mod = btf_try_get_module(btf); + if (!mod) { + btf_put(btf); + return ERR_PTR(-ENXIO); + } + + b = &tab->descs[tab->nr_descs++]; + b->btf = btf; + b->module = mod; + b->offset = offset; + + sort(tab->descs, tab->nr_descs, sizeof(tab->descs[0]), + kfunc_btf_cmp_by_off, NULL); + } + if (btf_modp) + *btf_modp = b->module; + return b->btf; +} + +void bpf_free_kfunc_btf_tab(struct bpf_kfunc_btf_tab *tab) +{ + if (!tab) + return; + + while (tab->nr_descs--) { + module_put(tab->descs[tab->nr_descs].module); + btf_put(tab->descs[tab->nr_descs].btf); + } + kfree(tab); +} + +static struct btf *find_kfunc_desc_btf(struct bpf_verifier_env *env, + u32 func_id, s16 offset, + struct module **btf_modp) +{ + if (offset) { + if (offset < 0) { + /* In the future, this can be allowed to increase limit + * of fd index into fd_array, interpreted as u16. + */ + verbose(env, "negative offset disallowed for kernel module function call\n"); + return ERR_PTR(-EINVAL); + } + + return __find_kfunc_desc_btf(env, offset, btf_modp); + } + return btf_vmlinux ?: ERR_PTR(-ENOENT); } -static int add_kfunc_call(struct bpf_verifier_env *env, u32 func_id) +static int add_kfunc_call(struct bpf_verifier_env *env, u32 func_id, s16 offset) { const struct btf_type *func, *func_proto; + struct bpf_kfunc_btf_tab *btf_tab; struct bpf_kfunc_desc_tab *tab; struct bpf_prog_aux *prog_aux; struct bpf_kfunc_desc *desc; const char *func_name; + struct btf *desc_btf; unsigned long addr; int err; prog_aux = env->prog->aux; tab = prog_aux->kfunc_tab; + btf_tab = prog_aux->kfunc_btf_tab; if (!tab) { if (!btf_vmlinux) { verbose(env, "calling kernel function is not supported without CONFIG_DEBUG_INFO_BTF\n"); @@ -1699,7 +1829,29 @@ static int add_kfunc_call(struct bpf_verifier_env *env, u32 func_id) prog_aux->kfunc_tab = tab; } - if (find_kfunc_desc(env->prog, func_id)) + /* func_id == 0 is always invalid, but instead of returning an error, be + * conservative and wait until the code elimination pass before returning + * error, so that invalid calls that get pruned out can be in BPF programs + * loaded from userspace. It is also required that offset be untouched + * for such calls. + */ + if (!func_id && !offset) + return 0; + + if (!btf_tab && offset) { + btf_tab = kzalloc(sizeof(*btf_tab), GFP_KERNEL); + if (!btf_tab) + return -ENOMEM; + prog_aux->kfunc_btf_tab = btf_tab; + } + + desc_btf = find_kfunc_desc_btf(env, func_id, offset, NULL); + if (IS_ERR(desc_btf)) { + verbose(env, "failed to find BTF for kernel function\n"); + return PTR_ERR(desc_btf); + } + + if (find_kfunc_desc(env->prog, func_id, offset)) return 0; if (tab->nr_descs == MAX_KFUNC_DESCS) { @@ -1707,20 +1859,20 @@ static int add_kfunc_call(struct bpf_verifier_env *env, u32 func_id) return -E2BIG; } - func = btf_type_by_id(btf_vmlinux, func_id); + func = btf_type_by_id(desc_btf, func_id); if (!func || !btf_type_is_func(func)) { verbose(env, "kernel btf_id %u is not a function\n", func_id); return -EINVAL; } - func_proto = btf_type_by_id(btf_vmlinux, func->type); + func_proto = btf_type_by_id(desc_btf, func->type); if (!func_proto || !btf_type_is_func_proto(func_proto)) { verbose(env, "kernel function btf_id %u does not have a valid func_proto\n", func_id); return -EINVAL; } - func_name = btf_name_by_offset(btf_vmlinux, func->name_off); + func_name = btf_name_by_offset(desc_btf, func->name_off); addr = kallsyms_lookup_name(func_name); if (!addr) { verbose(env, "cannot find address for kernel function %s\n", @@ -1730,13 +1882,14 @@ static int add_kfunc_call(struct bpf_verifier_env *env, u32 func_id) desc = &tab->descs[tab->nr_descs++]; desc->func_id = func_id; - desc->imm = BPF_CAST_CALL(addr) - __bpf_call_base; - err = btf_distill_func_proto(&env->log, btf_vmlinux, + desc->imm = BPF_CALL_IMM(addr); + desc->offset = offset; + err = btf_distill_func_proto(&env->log, desc_btf, func_proto, func_name, &desc->func_model); if (!err) sort(tab->descs, tab->nr_descs, sizeof(tab->descs[0]), - kfunc_desc_cmp_by_id, NULL); + kfunc_desc_cmp_by_id_off, NULL); return err; } @@ -1815,7 +1968,7 @@ static int add_subprog_and_kfunc(struct bpf_verifier_env *env) } else if (bpf_pseudo_call(insn)) { ret = add_subprog(env, i + insn->imm + 1); } else { - ret = add_kfunc_call(env, insn->imm); + ret = add_kfunc_call(env, insn->imm, insn->off); } if (ret < 0) @@ -2152,12 +2305,17 @@ static int get_prev_insn_idx(struct bpf_verifier_state *st, int i, static const char *disasm_kfunc_name(void *data, const struct bpf_insn *insn) { const struct btf_type *func; + struct btf *desc_btf; if (insn->src_reg != BPF_PSEUDO_KFUNC_CALL) return NULL; - func = btf_type_by_id(btf_vmlinux, insn->imm); - return btf_name_by_offset(btf_vmlinux, func->name_off); + desc_btf = find_kfunc_desc_btf(data, insn->imm, insn->off, NULL); + if (IS_ERR(desc_btf)) + return "<error>"; + + func = btf_type_by_id(desc_btf, insn->imm); + return btf_name_by_offset(desc_btf, func->name_off); } /* For given verifier state backtrack_insn() is called from the last insn to @@ -2373,7 +2531,7 @@ static void mark_all_scalars_precise(struct bpf_verifier_env *env, reg->precise = true; } for (j = 0; j < func->allocated_stack / BPF_REG_SIZE; j++) { - if (func->stack[j].slot_type[0] != STACK_SPILL) + if (!is_spilled_reg(&func->stack[j])) continue; reg = &func->stack[j].spilled_ptr; if (reg->type != SCALAR_VALUE) @@ -2415,7 +2573,7 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int regno, } while (spi >= 0) { - if (func->stack[spi].slot_type[0] != STACK_SPILL) { + if (!is_spilled_reg(&func->stack[spi])) { stack_mask = 0; break; } @@ -2514,7 +2672,7 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int regno, return 0; } - if (func->stack[i].slot_type[0] != STACK_SPILL) { + if (!is_spilled_reg(&func->stack[i])) { stack_mask &= ~(1ull << i); continue; } @@ -2626,15 +2784,21 @@ static bool __is_pointer_value(bool allow_ptr_leaks, } static void save_register_state(struct bpf_func_state *state, - int spi, struct bpf_reg_state *reg) + int spi, struct bpf_reg_state *reg, + int size) { int i; state->stack[spi].spilled_ptr = *reg; - state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN; + if (size == BPF_REG_SIZE) + state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN; + + for (i = BPF_REG_SIZE; i > BPF_REG_SIZE - size; i--) + state->stack[spi].slot_type[i - 1] = STACK_SPILL; - for (i = 0; i < BPF_REG_SIZE; i++) - state->stack[spi].slot_type[i] = STACK_SPILL; + /* size < 8 bytes spill */ + for (; i; i--) + scrub_spilled_slot(&state->stack[spi].slot_type[i - 1]); } /* check_stack_{read,write}_fixed_off functions track spill/fill of registers, @@ -2681,7 +2845,7 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env, env->insn_aux_data[insn_idx].sanitize_stack_spill = true; } - if (reg && size == BPF_REG_SIZE && register_is_bounded(reg) && + if (reg && !(off % BPF_REG_SIZE) && register_is_bounded(reg) && !register_is_null(reg) && env->bpf_capable) { if (dst_reg != BPF_REG_FP) { /* The backtracking logic can only recognize explicit @@ -2694,7 +2858,7 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env, if (err) return err; } - save_register_state(state, spi, reg); + save_register_state(state, spi, reg, size); } else if (reg && is_spillable_regtype(reg->type)) { /* register containing pointer is being spilled into stack */ if (size != BPF_REG_SIZE) { @@ -2706,16 +2870,16 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env, verbose(env, "cannot spill pointers to stack into stack frame of the caller\n"); return -EINVAL; } - save_register_state(state, spi, reg); + save_register_state(state, spi, reg, size); } else { u8 type = STACK_MISC; /* regular write of data into stack destroys any spilled ptr */ state->stack[spi].spilled_ptr.type = NOT_INIT; /* Mark slots as STACK_MISC if they belonged to spilled ptr. */ - if (state->stack[spi].slot_type[0] == STACK_SPILL) + if (is_spilled_reg(&state->stack[spi])) for (i = 0; i < BPF_REG_SIZE; i++) - state->stack[spi].slot_type[i] = STACK_MISC; + scrub_spilled_slot(&state->stack[spi].slot_type[i]); /* only mark the slot as written if all 8 bytes were written * otherwise read propagation may incorrectly stop too soon @@ -2918,23 +3082,50 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env, struct bpf_func_state *state = vstate->frame[vstate->curframe]; int i, slot = -off - 1, spi = slot / BPF_REG_SIZE; struct bpf_reg_state *reg; - u8 *stype; + u8 *stype, type; stype = reg_state->stack[spi].slot_type; reg = ®_state->stack[spi].spilled_ptr; - if (stype[0] == STACK_SPILL) { + if (is_spilled_reg(®_state->stack[spi])) { if (size != BPF_REG_SIZE) { + u8 scalar_size = 0; + if (reg->type != SCALAR_VALUE) { verbose_linfo(env, env->insn_idx, "; "); verbose(env, "invalid size of register fill\n"); return -EACCES; } - if (dst_regno >= 0) { + + mark_reg_read(env, reg, reg->parent, REG_LIVE_READ64); + if (dst_regno < 0) + return 0; + + for (i = BPF_REG_SIZE; i > 0 && stype[i - 1] == STACK_SPILL; i--) + scalar_size++; + + if (!(off % BPF_REG_SIZE) && size == scalar_size) { + /* The earlier check_reg_arg() has decided the + * subreg_def for this insn. Save it first. + */ + s32 subreg_def = state->regs[dst_regno].subreg_def; + + state->regs[dst_regno] = *reg; + state->regs[dst_regno].subreg_def = subreg_def; + } else { + for (i = 0; i < size; i++) { + type = stype[(slot - i) % BPF_REG_SIZE]; + if (type == STACK_SPILL) + continue; + if (type == STACK_MISC) + continue; + verbose(env, "invalid read from stack off %d+%d size %d\n", + off, i, size); + return -EACCES; + } mark_reg_unknown(env, state->regs, dst_regno); - state->regs[dst_regno].live |= REG_LIVE_WRITTEN; } - mark_reg_read(env, reg, reg->parent, REG_LIVE_READ64); + state->regs[dst_regno].live |= REG_LIVE_WRITTEN; return 0; } for (i = 1; i < BPF_REG_SIZE; i++) { @@ -2965,8 +3156,6 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env, } mark_reg_read(env, reg, reg->parent, REG_LIVE_READ64); } else { - u8 type; - for (i = 0; i < size; i++) { type = stype[(slot - i) % BPF_REG_SIZE]; if (type == STACK_MISC) @@ -4514,17 +4703,17 @@ static int check_stack_range_initialized( goto mark; } - if (state->stack[spi].slot_type[0] == STACK_SPILL && + if (is_spilled_reg(&state->stack[spi]) && state->stack[spi].spilled_ptr.type == PTR_TO_BTF_ID) goto mark; - if (state->stack[spi].slot_type[0] == STACK_SPILL && + if (is_spilled_reg(&state->stack[spi]) && (state->stack[spi].spilled_ptr.type == SCALAR_VALUE || env->allow_ptr_leaks)) { if (clobber) { __mark_reg_unknown(env, &state->stack[spi].spilled_ptr); for (j = 0; j < BPF_REG_SIZE; j++) - state->stack[spi].slot_type[j] = STACK_MISC; + scrub_spilled_slot(&state->stack[spi].slot_type[j]); } goto mark; } @@ -4813,7 +5002,10 @@ static int resolve_map_arg_type(struct bpf_verifier_env *env, return -EINVAL; } break; - + case BPF_MAP_TYPE_BLOOM_FILTER: + if (meta->func_id == BPF_FUNC_map_peek_elem) + *arg_type = ARG_PTR_TO_MAP_VALUE; + break; default: break; } @@ -5388,6 +5580,11 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env, func_id != BPF_FUNC_task_storage_delete) goto error; break; + case BPF_MAP_TYPE_BLOOM_FILTER: + if (func_id != BPF_FUNC_map_peek_elem && + func_id != BPF_FUNC_map_push_elem) + goto error; + break; default: break; } @@ -5455,13 +5652,18 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env, map->map_type != BPF_MAP_TYPE_SOCKHASH) goto error; break; - case BPF_FUNC_map_peek_elem: case BPF_FUNC_map_pop_elem: - case BPF_FUNC_map_push_elem: if (map->map_type != BPF_MAP_TYPE_QUEUE && map->map_type != BPF_MAP_TYPE_STACK) goto error; break; + case BPF_FUNC_map_peek_elem: + case BPF_FUNC_map_push_elem: + if (map->map_type != BPF_MAP_TYPE_QUEUE && + map->map_type != BPF_MAP_TYPE_STACK && + map->map_type != BPF_MAP_TYPE_BLOOM_FILTER) + goto error; + break; case BPF_FUNC_sk_storage_get: case BPF_FUNC_sk_storage_delete: if (map->map_type != BPF_MAP_TYPE_SK_STORAGE) @@ -6485,23 +6687,33 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn) struct bpf_reg_state *regs = cur_regs(env); const char *func_name, *ptr_type_name; u32 i, nargs, func_id, ptr_type_id; + struct module *btf_mod = NULL; const struct btf_param *args; + struct btf *desc_btf; int err; + /* skip for now, but return error when we find this in fixup_kfunc_call */ + if (!insn->imm) + return 0; + + desc_btf = find_kfunc_desc_btf(env, insn->imm, insn->off, &btf_mod); + if (IS_ERR(desc_btf)) + return PTR_ERR(desc_btf); + func_id = insn->imm; - func = btf_type_by_id(btf_vmlinux, func_id); - func_name = btf_name_by_offset(btf_vmlinux, func->name_off); - func_proto = btf_type_by_id(btf_vmlinux, func->type); + func = btf_type_by_id(desc_btf, func_id); + func_name = btf_name_by_offset(desc_btf, func->name_off); + func_proto = btf_type_by_id(desc_btf, func->type); if (!env->ops->check_kfunc_call || - !env->ops->check_kfunc_call(func_id)) { + !env->ops->check_kfunc_call(func_id, btf_mod)) { verbose(env, "calling kernel function %s is not allowed\n", func_name); return -EACCES; } /* Check the arguments */ - err = btf_check_kfunc_arg_match(env, btf_vmlinux, func_id, regs); + err = btf_check_kfunc_arg_match(env, desc_btf, func_id, regs); if (err) return err; @@ -6509,15 +6721,15 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn) mark_reg_not_init(env, regs, caller_saved[i]); /* Check return type */ - t = btf_type_skip_modifiers(btf_vmlinux, func_proto->type, NULL); + t = btf_type_skip_modifiers(desc_btf, func_proto->type, NULL); if (btf_type_is_scalar(t)) { mark_reg_unknown(env, regs, BPF_REG_0); mark_btf_func_reg_size(env, BPF_REG_0, t->size); } else if (btf_type_is_ptr(t)) { - ptr_type = btf_type_skip_modifiers(btf_vmlinux, t->type, + ptr_type = btf_type_skip_modifiers(desc_btf, t->type, &ptr_type_id); if (!btf_type_is_struct(ptr_type)) { - ptr_type_name = btf_name_by_offset(btf_vmlinux, + ptr_type_name = btf_name_by_offset(desc_btf, ptr_type->name_off); verbose(env, "kernel function %s returns pointer type %s %s is not supported\n", func_name, btf_type_str(ptr_type), @@ -6525,7 +6737,7 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn) return -EINVAL; } mark_reg_known_zero(env, regs, BPF_REG_0); - regs[BPF_REG_0].btf = btf_vmlinux; + regs[BPF_REG_0].btf = desc_btf; regs[BPF_REG_0].type = PTR_TO_BTF_ID; regs[BPF_REG_0].btf_id = ptr_type_id; mark_btf_func_reg_size(env, BPF_REG_0, sizeof(void *)); @@ -6536,7 +6748,7 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn) for (i = 0; i < nargs; i++) { u32 regno = i + 1; - t = btf_type_skip_modifiers(btf_vmlinux, args[i].type, NULL); + t = btf_type_skip_modifiers(desc_btf, args[i].type, NULL); if (btf_type_is_ptr(t)) mark_btf_func_reg_size(env, regno, sizeof(void *)); else @@ -10356,9 +10568,9 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, * return false to continue verification of this path */ return false; - if (i % BPF_REG_SIZE) + if (i % BPF_REG_SIZE != BPF_REG_SIZE - 1) continue; - if (old->stack[spi].slot_type[0] != STACK_SPILL) + if (!is_spilled_reg(&old->stack[spi])) continue; if (!regsafe(env, &old->stack[spi].spilled_ptr, &cur->stack[spi].spilled_ptr, idmap)) @@ -10565,7 +10777,7 @@ static int propagate_precision(struct bpf_verifier_env *env, } for (i = 0; i < state->allocated_stack / BPF_REG_SIZE; i++) { - if (state->stack[i].slot_type[0] != STACK_SPILL) + if (!is_spilled_reg(&state->stack[i])) continue; state_reg = &state->stack[i].spilled_ptr; if (state_reg->type != SCALAR_VALUE || @@ -11076,7 +11288,8 @@ static int do_check(struct bpf_verifier_env *env) env->jmps_processed++; if (opcode == BPF_CALL) { if (BPF_SRC(insn->code) != BPF_K || - insn->off != 0 || + (insn->src_reg != BPF_PSEUDO_KFUNC_CALL + && insn->off != 0) || (insn->src_reg != BPF_REG_0 && insn->src_reg != BPF_PSEUDO_CALL && insn->src_reg != BPF_PSEUDO_KFUNC_CALL) || @@ -12432,6 +12645,7 @@ static int jit_subprogs(struct bpf_verifier_env *env) func[i]->aux->stack_depth = env->subprog_info[i].stack_depth; func[i]->jit_requested = 1; func[i]->aux->kfunc_tab = prog->aux->kfunc_tab; + func[i]->aux->kfunc_btf_tab = prog->aux->kfunc_btf_tab; func[i]->aux->linfo = prog->aux->linfo; func[i]->aux->nr_linfo = prog->aux->nr_linfo; func[i]->aux->jited_linfo = prog->aux->jited_linfo; @@ -12469,8 +12683,7 @@ static int jit_subprogs(struct bpf_verifier_env *env) if (!bpf_pseudo_call(insn)) continue; subprog = insn->off; - insn->imm = BPF_CAST_CALL(func[subprog]->bpf_func) - - __bpf_call_base; + insn->imm = BPF_CALL_IMM(func[subprog]->bpf_func); } /* we use the aux data to keep a list of the start addresses @@ -12618,10 +12831,15 @@ static int fixup_kfunc_call(struct bpf_verifier_env *env, { const struct bpf_kfunc_desc *desc; + if (!insn->imm) { + verbose(env, "invalid kernel function call not eliminated in verifier pass\n"); + return -EINVAL; + } + /* insn->imm has the btf func_id. Replace it with * an address (relative to __bpf_base_call). */ - desc = find_kfunc_desc(env->prog, insn->imm); + desc = find_kfunc_desc(env->prog, insn->imm, insn->off); if (!desc) { verbose(env, "verifier internal error: kernel function descriptor not found for func_id %u\n", insn->imm); @@ -12902,7 +13120,8 @@ static int do_misc_fixups(struct bpf_verifier_env *env) insn->imm == BPF_FUNC_map_push_elem || insn->imm == BPF_FUNC_map_pop_elem || insn->imm == BPF_FUNC_map_peek_elem || - insn->imm == BPF_FUNC_redirect_map)) { + insn->imm == BPF_FUNC_redirect_map || + insn->imm == BPF_FUNC_for_each_map_elem)) { aux = &env->insn_aux_data[i + delta]; if (bpf_map_ptr_poisoned(aux)) goto patch_call_imm; @@ -12946,36 +13165,37 @@ static int do_misc_fixups(struct bpf_verifier_env *env) (int (*)(struct bpf_map *map, void *value))NULL)); BUILD_BUG_ON(!__same_type(ops->map_redirect, (int (*)(struct bpf_map *map, u32 ifindex, u64 flags))NULL)); + BUILD_BUG_ON(!__same_type(ops->map_for_each_callback, + (int (*)(struct bpf_map *map, + bpf_callback_t callback_fn, + void *callback_ctx, + u64 flags))NULL)); patch_map_ops_generic: switch (insn->imm) { case BPF_FUNC_map_lookup_elem: - insn->imm = BPF_CAST_CALL(ops->map_lookup_elem) - - __bpf_call_base; + insn->imm = BPF_CALL_IMM(ops->map_lookup_elem); continue; case BPF_FUNC_map_update_elem: - insn->imm = BPF_CAST_CALL(ops->map_update_elem) - - __bpf_call_base; + insn->imm = BPF_CALL_IMM(ops->map_update_elem); continue; case BPF_FUNC_map_delete_elem: - insn->imm = BPF_CAST_CALL(ops->map_delete_elem) - - __bpf_call_base; + insn->imm = BPF_CALL_IMM(ops->map_delete_elem); continue; case BPF_FUNC_map_push_elem: - insn->imm = BPF_CAST_CALL(ops->map_push_elem) - - __bpf_call_base; + insn->imm = BPF_CALL_IMM(ops->map_push_elem); continue; case BPF_FUNC_map_pop_elem: - insn->imm = BPF_CAST_CALL(ops->map_pop_elem) - - __bpf_call_base; + insn->imm = BPF_CALL_IMM(ops->map_pop_elem); continue; case BPF_FUNC_map_peek_elem: - insn->imm = BPF_CAST_CALL(ops->map_peek_elem) - - __bpf_call_base; + insn->imm = BPF_CALL_IMM(ops->map_peek_elem); continue; case BPF_FUNC_redirect_map: - insn->imm = BPF_CAST_CALL(ops->map_redirect) - - __bpf_call_base; + insn->imm = BPF_CALL_IMM(ops->map_redirect); + continue; + case BPF_FUNC_for_each_map_elem: + insn->imm = BPF_CALL_IMM(ops->map_for_each_callback); continue; } @@ -13826,6 +14046,7 @@ skip_full_check: env->verification_time = ktime_get_ns() - start_time; print_verification_stats(env); + env->prog->aux->verified_insns = env->insn_processed; if (log->level && bpf_verifier_log_full(log)) ret = -ENOSPC; |