Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Borislav Petkov:
- Pass only an initialized perf event attribute to the LSM hook
- Fix a use-after-free on the perf syscall's error path
- A potential integer overflow fix in amd_core_pmu_init()
- Fix the cgroup events tracking after the context handling rewrite
- Return the proper value from the inherit_event() function on error
* tag 'perf_urgent_for_v6.2_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Call LSM hook after copying perf_event_attr
perf: Fix use-after-free in error path
perf/x86/amd: fix potential integer overflow on shift of a int
perf/core: Fix cgroup events tracking
perf core: Return error pointer if inherit_event() fails to find pmu_ctx
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Two fixes to correct how kprobes handles INT3 now that they're added
by other functionality like the rethunks and not only kgdb
- Remove __init section markings of two functions which are referenced
by a function in the .text section
* tag 'x86_urgent_for_v6.2_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/kprobes: Fix optprobe optimization check with CONFIG_RETHUNK
x86/kprobes: Fix kprobes instruction boudary check with CONFIG_RETHUNK
x86/calldepth: Fix incorrect init section references
|
|
Use the correct indention.
Fixes: d50650500064 ("arm64: dts: imx8mp-evk: Add PCIe support")
Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
dt_binding_check detects an issue with the pgc_hsiomix power
domain:
pgc: 'power-domains@17' does not match any of the regexes
This is because 'power-domains' should be 'power-domain'
Fixes: 2ae42e0c0b67 ("arm64: dts: imx8mp: add HSIO power-domains")
Signed-off-by: Adam Ford <aford173@gmail.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
The GPC node references an interrupt parent, but it doesn't
state the interrupt itself. According to the TRM, this IRQ
is 87. This also eliminate an error detected from dt_binding_check
Fixes: fc0f05124621 ("arm64: dts: imx8mp: add GPC node with GPU power domains")
Signed-off-by: Adam Ford <aford173@gmail.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
The clk_xtal32k have clock-cells = <0>, drop the bogus specifier.
Fixes: 9509593f327a ("arm64: dts: imx8mm: Model PMIC to SNVS RTC clock path on Data Modul i.MX8M Mini eDM SBC")
Signed-off-by: Marek Vasut <marex@denx.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
Calling of_find_compatible_node() returns a node pointer with refcount
incremented. Use of_node_put() on it when done.
The patch fixes the same problem on different i.MX platforms.
Fixes: 8b88f7ef31dde ("ARM: mx25: Retrieve IIM base from dt")
Fixes: 94b2bec1b0e05 ("ARM: imx27: Retrieve the SYSCTRL base address from devicetree")
Fixes: 3172225d45bd9 ("ARM: imx31: Retrieve the IIM base address from devicetree")
Fixes: f68ea682d1da7 ("ARM: imx35: Retrieve the IIM base address from devicetree")
Fixes: ee18a7154ee08 ("ARM: imx5: retrieve iim base from device tree")
Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Reviewed-by: Martin Kaiser <martin@kaiser.cx>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
For clock and strobe pad of usdhc, need to config as pull down.
Current pad config set these pad as both pull up and pull down,
this is wrong, so fix it here.
Find this issue when enable HS400ES mode on one Micron eMMC chip,
CMD8 always meet CRC error in HS400ES/HS400 mode.
Fixes: e37907bd8294 ("arm64: dts: freescale: add i.MX93 11x11 EVK basic support")
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
Set optional `simple-audio-card,mclk-fs` parameter to ensure a proper
clock to the nau8822 audio codec. Without this change with an audio
stream rate of 44.1 kHz the playback is faster.
Set the MCLK at the right frequency, codec can properly use it to
generate 44.1 kHz I2S-FS.
Fixes: 6a57f224f734 ("arm64: dts: freescale: add initial support for verdin imx8m mini")
Signed-off-by: Emanuele Ghidoli <emanuele.ghidoli@toradex.com>
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
There is no "no-emmc" property, so intention for SD/SDIO only nodes was
to use "no-mmc".
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
Early hardware did not support hardware handshaking on the UART, but
final production hardware did. When the hardware was updated the chip
select was changed to facilitate hardware handshaking on UART3. Fix the
ecspi2 pin mux to eliminate a pin conflict with UART3 and allow the
EEPROM to operate again.
Fixes: 4ce01ce36d77 ("arm64: dts: imx8mm-beacon: Enable RTS-CTS on UART3")
Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
"make dtbs_check":
arch/arm64/boot/dts/freescale/fsl-ls1012a-qds.dtb: pca9547@77: $nodename:0: 'pca9547@77' does not match '^(i2c-?)?mux'
From schema: Documentation/devicetree/bindings/i2c/i2c-mux-pca954x.yaml
arch/arm64/boot/dts/freescale/fsl-ls1012a-qds.dtb: pca9547@77: Unevaluated properties are not allowed ('#address-cells', '#size-cells', 'i2c@4' were unexpected)
From schema: Documentation/devicetree/bindings/i2c/i2c-mux-pca954x.yaml
...
Fix this by renaming PCA954x nodes to "i2c-mux", to match the I2C bus
multiplexer/switch DT bindings and the Generic Names Recommendation in
the Devicetree Specification.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
"make dtbs_check":
arch/arm/boot/dts/vf610-zii-dev-rev-b.dtb: tca9548@70: $nodename:0: 'tca9548@70' does not match '^(i2c-?)?mux'
From schema: Documentation/devicetree/bindings/i2c/i2c-mux-pca954x.yaml
arch/arm/boot/dts/vf610-zii-dev-rev-b.dtb: tca9548@70: Unevaluated properties are not allowed ('#address-cells', '#size-cells', 'i2c@0', 'i2c@1', 'i2c@2', 'i2c@3', 'i2c@4' were unexpected)
From schema: /scratch/geert/linux/linux-renesas/Documentation/devicetree/bindings/i2c/i2c-mux-pca954x.yaml
...
Fix this by renaming PCA9548 nodes to "i2c-mux", to match the I2C bus
multiplexer/switch DT bindings and the Generic Names Recommendation in
the Devicetree Specification.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
"make dtbs_check":
arch/arm/boot/dts/imx53-ppd.dtb: i2c-switch@70: $nodename:0: 'i2c-switch@70' does not match '^(i2c-?)?mux'
From schema: Documentation/devicetree/bindings/i2c/i2c-mux-pca954x.yaml
arch/arm/boot/dts/imx53-ppd.dtb: i2c-switch@70: Unevaluated properties are not allowed ('#address-cells', '#size-cells', 'i2c@0', 'i2c@1', 'i2c@2', 'i2c@3', 'i2c@4', 'i2c@5', 'i2c@6', 'i2c@7' were unexpected)
From schema: Documentation/devicetree/bindings/i2c/i2c-mux-pca954x.yaml
Fix this by renaming the PCA9547 node to "i2c-mux", to match the I2C bus
multiplexer/switch DT bindings and the Generic Names Recommendation in
the Devicetree Specification.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
Set optional `simple-audio-card,mclk-fs` parameter to ensure a proper
clock to the wm8904 audio codec. Without this change with an audio
stream rate of 44.1 kHz the playback is completely distorted.
Fixes: 6a57f224f734 ("arm64: dts: freescale: add initial support for verdin imx8m mini")
Signed-off-by: Emanuele Ghidoli <emanuele.ghidoli@toradex.com>
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
The following build warning is seen when running:
make dtbs_check DT_SCHEMA_FILES=fsl-imx-uart.yaml
arch/arm/boot/dts/imx6dl-gw560x.dtb: serial@2020000: rts-gpios: False schema does not allow [[20, 1, 0]]
From schema: Documentation/devicetree/bindings/serial/fsl-imx-uart.yaml
The imx6qdl-gw560x board does not expose the UART RTS and CTS
as native UART pins, so 'uart-has-rtscts' should not be used.
Using 'uart-has-rtscts' with 'rts-gpios' is an invalid combination
detected by serial.yaml.
Fix the problem by removing the incorrect 'uart-has-rtscts' property.
Fixes: b8a559feffb2 ("ARM: dts: imx: add Gateworks Ventana GW5600 support")
Signed-off-by: Fabio Estevam <festevam@denx.de>
Acked-by: Tim Harvey <tharvey@gateworks.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
'clock_frequency' is not a valid property.
Use the correct 'clock-frequency' instead.
Fixes: 8b646cfb84c3 ("ARM: dts: imx7d-pico: Add support for the dwarf baseboard")
Fixes: 6418fd92417f ("ARM: dts: imx7d-pico: Add support for the nymph baseboard")
Signed-off-by: Fabio Estevam <festevam@denx.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
'clock_frequency' is not a valid property.
Use the correct 'clock-frequency' instead.
Fixes: 47246fafef84 ("ARM: dts: imx6ul-pico: Add support for the dwarf baseboard")
Signed-off-by: Fabio Estevam <festevam@denx.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
'regulator-compatible' is not a valid property according to
nxp,pca9450-regulator.yaml and causes the following warning:
DTC_CHK arch/arm64/boot/dts/freescale/imx8mp-dhcom-pdk2.dtb
...
pmic@25: regulators:LDO1: Unevaluated properties are not allowed ('regulator-compatible' was unexpected)
Remove the invalid 'regulator-compatible' property.
Cc: Teresa Remmet <t.remmet@phytec.de>
Fixes: 88f7f6bcca37 ("arm64: dts: freescale: Add support for phyBOARD-Pollux-i.MX8MP")
Signed-off-by: Fabio Estevam <festevam@denx.de>
Reviewed-by: Teresa Remmet <t.remmet@phytec.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
After commit cf7f3f4fa9e5 ("clk: imx8mp: fix usb_root_clk parent"),
usb_root_clk is no longer for suspend clock so update dts accordingly
to use right bus clock and suspend clock.
Fixes: fb8587a2c165 ("arm64: dtsi: imx8mp: add usb nodes")
Cc: stable@vger.kernel.org # ed1f4ccfe947: clk: imx: imx8mp: add shared clk gate for usb suspend clk
Cc: stable@vger.kernel.org # v5.19+
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Signed-off-by: Li Jun <jun.li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
The SM4 CCM/GCM assembly functions for encryption and decryption is
called via indirect function calls. Therefore they need to use
SYM_TYPED_FUNC_START instead of SYM_FUNC_START to cause its type hash
to be emitted when the kernel is built with CONFIG_CFI_CLANG=y.
Otherwise, the code crashes with a CFI failure (if the compiler didn't
happen to optimize out the indirect call).
Fixes: 67fa3a7fdf80 ("crypto: arm64/sm4 - add CE implementation for CCM mode")
Fixes: ae1b83c7d572 ("crypto: arm64/sm4 - add CE implementation for GCM mode")
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The interconnect providers accept only one argument (cells == 1), so fix
a copy&paste from SM8450:
sm8350-hdk.dtb: mmc@8804000: interconnects: [[74, 9, 0], [75, 1, 0], [76, 2, 0], [77, 36, 0]] is too long
Fixes: 60477435e4de ("arm64: dts: qcom: sm8350: Add SDHCI2")
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20221224214351.18215-1-krzysztof.kozlowski@linaro.org
|
|
state"
Due to recent improvements of the cluster idle state support for Qcom based
platforms, we are now able to support the deepest cluster idle state. Let's
therefore revert the earlier workaround.
This reverts commit cadaa773bcf1 ("arm64: dts: qcom: sm8250: Disable the
not yet supported cluster idle state"), which is available from v6.1-rc6.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20221122123713.65631-1-ulf.hansson@linaro.org
|
|
While changing node names of APQ8084 SDHCI, the ones in IFC6540 board
were not updated leading to disabled and misconfigured SDHCI.
Cc: <stable@vger.kernel.org>
Fixes: 2477d81901a2 ("ARM: dts: qcom: Fix sdhci node names - use 'mmc@'")
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20221204084614.12193-1-krzysztof.kozlowski@linaro.org
|
|
x86:
* several fixes to nested VMX execution controls
* fixes and clarification to the documentation for Xen emulation
* do not unnecessarily release a pmu event with zero period
* MMU fixes
* fix Coverity warning in kvm_hv_flush_tlb()
selftests:
* fixes for the ucall mechanism in selftests
* other fixes mostly related to compilation with clang
|
|
While KVM_XEN_EVTCHN_RESET is usually called with no vCPUs running,
if that happened it could cause a deadlock. This is due to
kvm_xen_eventfd_reset() doing a synchronize_srcu() inside
a kvm->lock critical section.
To avoid this, first collect all the evtchnfd objects in an
array and free all of them once the kvm->lock critical section
is over and th SRCU grace period has expired.
Reported-by: Michal Luczaj <mhal@rbox.co>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add region for memory hole present on bullhead in order to
fix a reboot issue on recent kernels
Reported-by: Petr Vorel <petr.vorel@gmail.com>
Signed-off-by: Dominik Kobinski <dominikkobinski314@gmail.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Tested-by: Petr Vorel <petr.vorel@gmail.com>
Reviewed-by: Petr Vorel <petr.vorel@gmail.com>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20221211100501.82323-1-dominikkobinski314@gmail.com
|
|
Update the USB4-USB3-DP QMP PHY nodes to match the new binding which
specifically includes the missing register regions (e.g. DP_PHY) and
allows for supporting DisplayPort Alternate Mode.
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20221121085058.31213-16-johan+linaro@kernel.org
|
|
The vendor kernel is using the GCC_USB4_DP_PHY_PRIM_BCR and
GCC_USB4_1_DP_PHY_PRIM_BCR resets for the USB4-USB3-DP QMP PHYs.
Update the primary USB-DP PHY node to match.
Fixes: 152d1faf1e2f ("arm64: dts: qcom: add SC8280XP platform")
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20221121085058.31213-15-johan+linaro@kernel.org
|
|
Since the CONFIG_RETHUNK and CONFIG_SLS will use INT3 for stopping
speculative execution after function return, kprobe jump optimization
always fails on the functions with such INT3 inside the function body.
(It already checks the INT3 padding between functions, but not inside
the function)
To avoid this issue, as same as kprobes, check whether the INT3 comes
from kgdb or not, and if so, stop decoding and make it fail. The other
INT3 will come from CONFIG_RETHUNK/CONFIG_SLS and those can be
treated as a one-byte instruction.
Fixes: e463a09af2f0 ("x86: Add straight-line-speculation mitigation")
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/167146051929.1374301.7419382929328081706.stgit@devnote3
|
|
Since the CONFIG_RETHUNK and CONFIG_SLS will use INT3 for stopping
speculative execution after RET instruction, kprobes always failes to
check the probed instruction boundary by decoding the function body if
the probed address is after such sequence. (Note that some conditional
code blocks will be placed after function return, if compiler decides
it is not on the hot path.)
This is because kprobes expects kgdb puts the INT3 as a software
breakpoint and it will replace the original instruction.
But these INT3 are not such purpose, it doesn't need to recover the
original instruction.
To avoid this issue, kprobes checks whether the INT3 is owned by
kgdb or not, and if so, stop decoding and make it fail. The other
INT3 will come from CONFIG_RETHUNK/CONFIG_SLS and those can be
treated as a one-byte instruction.
Fixes: e463a09af2f0 ("x86: Add straight-line-speculation mitigation")
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/167146051026.1374301.392728975473572291.stgit@devnote3
|
|
The addition of callthunks_translate_call_dest means that
skip_addr() and patch_dest() can no longer be discarded
as part of the __init section freeing:
WARNING: modpost: vmlinux.o: section mismatch in reference: callthunks_translate_call_dest.cold (section: .text.unlikely) -> skip_addr (section: .init.text)
WARNING: modpost: vmlinux.o: section mismatch in reference: callthunks_translate_call_dest.cold (section: .text.unlikely) -> patch_dest (section: .init.text)
WARNING: modpost: vmlinux.o: section mismatch in reference: is_callthunk.cold (section: .text.unlikely) -> skip_addr (section: .init.text)
ERROR: modpost: Section mismatches detected.
Set CONFIG_SECTION_MISMATCH_WARN_ONLY=y to allow them.
Fixes: b2e9dfe54be4 ("x86/bpf: Emit call depth accounting if required")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20221215164334.968863-1-arnd@kernel.org
|
|
The left shift of int 32 bit integer constant 1 is evaluated using 32 bit
arithmetic and then passed as a 64 bit function argument. In the case where
i is 32 or more this can lead to an overflow. Avoid this by shifting
using the BIT_ULL macro instead.
Fixes: 471af006a747 ("perf/x86/amd: Constrain Large Increment per Cycle events")
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Kim Phillips <kim.phillips@amd.com>
Link: https://lore.kernel.org/r/20221202135149.1797974-1-colin.i.king@gmail.com
|
|
These are (uint64_t)-1 magic values are a userspace ABI, allowing the
shared info pages and other enlightenments to be disabled. This isn't
a Xen ABI because Xen doesn't let the guest turn these off except with
the full SHUTDOWN_soft_reset mechanism. Under KVM, the userspace VMM is
expected to handle soft reset, and tear down the kernel parts of the
enlightenments accordingly.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20221226120320.1125390-5-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Port number is validated in kvm_xen_setattr_evtchn().
Remove superfluous checks in kvm_xen_eventfd_assign() and
kvm_xen_eventfd_update().
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Message-Id: <20221222203021.1944101-3-mhal@rbox.co>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20221226120320.1125390-4-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The evtchnfd structure itself must be protected by either kvm->lock or
SRCU. Use the former in kvm_xen_eventfd_update(), since the lock is
being taken anyway; kvm_xen_hcall_evtchn_send() instead is a reader and
does not need kvm->lock, and is called in SRCU critical section from the
kvm_x86_handle_exit function.
It is also important to use rcu_read_{lock,unlock}() in
kvm_xen_hcall_evtchn_send(), because idr_remove() will *not*
use synchronize_srcu() to wait for readers to complete.
Remove a superfluous if (kvm) check before calling synchronize_srcu()
in kvm_xen_eventfd_deassign() where kvm has been dereferenced already.
Co-developed-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20221226120320.1125390-3-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
In particular, we shouldn't assume that being contiguous in guest virtual
address space means being contiguous in guest *physical* address space.
In dropping the manual calls to kvm_mmu_gva_to_gpa_system(), also drop
the srcu_read_lock() that was around them. All call sites are reached
from kvm_xen_hypercall() which is called from the handle_exit function
with the read lock already held.
536395260 ("KVM: x86/xen: handle PV timers oneshot mode")
1a65105a5 ("KVM: x86/xen: handle PV spinlocks slowpath")
Fixes: 2fd6df2f2 ("KVM: x86/xen: intercept EVTCHNOP_send from guests")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20221226120320.1125390-2-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Release page irrespectively of kvm_vcpu_write_guest() return value.
Suggested-by: Paul Durrant <paul@xen.org>
Fixes: 23200b7a30de ("KVM: x86/xen: intercept xen hypercalls if enabled")
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Message-Id: <20221220151454.712165-1-mhal@rbox.co>
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20221226120320.1125390-1-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
"be split be split" -> "be split"
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Message-Id: <20221207120505.9175-1-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Due to several bugs caused by timers being re-armed after they are
shutdown and just before they are freed, a new state of timers was added
called "shutdown". After a timer is set to this state, then it can no
longer be re-armed.
The following script was run to find all the trivial locations where
del_timer() or del_timer_sync() is called in the same function that the
object holding the timer is freed. It also ignores any locations where
the timer->function is modified between the del_timer*() and the free(),
as that is not considered a "trivial" case.
This was created by using a coccinelle script and the following
commands:
$ cat timer.cocci
@@
expression ptr, slab;
identifier timer, rfield;
@@
(
- del_timer(&ptr->timer);
+ timer_shutdown(&ptr->timer);
|
- del_timer_sync(&ptr->timer);
+ timer_shutdown_sync(&ptr->timer);
)
... when strict
when != ptr->timer
(
kfree_rcu(ptr, rfield);
|
kmem_cache_free(slab, ptr);
|
kfree(ptr);
)
$ spatch timer.cocci . > /tmp/t.patch
$ patch -p1 < /tmp/t.patch
Link: https://lore.kernel.org/lkml/20221123201306.823305113@linutronix.de/
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Pavel Machek <pavel@ucw.cz> [ LED ]
Acked-by: Kalle Valo <kvalo@kernel.org> [ wireless ]
Acked-by: Paolo Abeni <pabeni@redhat.com> [ networking ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Thomas Bogendoerfer:
"Fixes due to DT changes"
* tag 'mips_6.2_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: dts: bcm63268: Add missing properties to the TWD node
MIPS: ralink: mt7621: avoid to init common ralink reset controller
|
|
Don't install a leaf TDP MMU SPTE if the parent page's level doesn't
match the target level of the fault, and instead have the vCPU retry the
faulting instruction after warning. Continuing on is completely
unnecessary as the absolute worst case scenario of retrying is DoSing
the vCPU, whereas continuing on all but guarantees bigger explosions, e.g.
------------[ cut here ]------------
kernel BUG at arch/x86/kvm/mmu/tdp_mmu.c:559!
invalid opcode: 0000 [#1] SMP
CPU: 1 PID: 1025 Comm: nx_huge_pages_t Tainted: G W 6.1.0-rc4+ #64
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:__handle_changed_spte.cold+0x95/0x9c
RSP: 0018:ffffc9000072faf8 EFLAGS: 00010246
RAX: 00000000000000c1 RBX: ffffc90000731000 RCX: 0000000000000027
RDX: 0000000000000000 RSI: 00000000ffffdfff RDI: ffff888277c5b4c8
RBP: 0600000112400bf3 R08: ffff888277c5b4c0 R09: ffffc9000072f9a0
R10: 0000000000000001 R11: 0000000000000001 R12: 06000001126009f3
R13: 0000000000000002 R14: 0000000012600901 R15: 0000000012400b01
FS: 00007fba9f853740(0000) GS:ffff888277c40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000010aa7a003 CR4: 0000000000172ea0
Call Trace:
<TASK>
kvm_tdp_mmu_map+0x3b0/0x510
kvm_tdp_page_fault+0x10c/0x130
kvm_mmu_page_fault+0x103/0x680
vmx_handle_exit+0x132/0x5a0 [kvm_intel]
vcpu_enter_guest+0x60c/0x16f0
kvm_arch_vcpu_ioctl_run+0x1e2/0x9d0
kvm_vcpu_ioctl+0x271/0x660
__x64_sys_ioctl+0x80/0xb0
do_syscall_64+0x2b/0x50
entry_SYSCALL_64_after_hwframe+0x46/0xb0
</TASK>
Modules linked in: kvm_intel
---[ end trace 0000000000000000 ]---
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221213033030.83345-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Re-check sp->nx_huge_page_disallowed under the tdp_mmu_pages_lock spinlock
when adding a new shadow page in the TDP MMU. To ensure the NX reclaim
kthread can't see a not-yet-linked shadow page, the page fault path links
the new page table prior to adding the page to possible_nx_huge_pages.
If the page is zapped by different task, e.g. because dirty logging is
disabled, between linking the page and adding it to the list, KVM can end
up triggering use-after-free by adding the zapped SP to the aforementioned
list, as the zapped SP's memory is scheduled for removal via RCU callback.
The bug is detected by the sanity checks guarded by CONFIG_DEBUG_LIST=y,
i.e. the below splat is just one possible signature.
------------[ cut here ]------------
list_add corruption. prev->next should be next (ffffc9000071fa70), but was ffff88811125ee38. (prev=ffff88811125ee38).
WARNING: CPU: 1 PID: 953 at lib/list_debug.c:30 __list_add_valid+0x79/0xa0
Modules linked in: kvm_intel
CPU: 1 PID: 953 Comm: nx_huge_pages_t Tainted: G W 6.1.0-rc4+ #71
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:__list_add_valid+0x79/0xa0
RSP: 0018:ffffc900006efb68 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff888116cae8a0 RCX: 0000000000000027
RDX: 0000000000000027 RSI: 0000000100001872 RDI: ffff888277c5b4c8
RBP: ffffc90000717000 R08: ffff888277c5b4c0 R09: ffffc900006efa08
R10: 0000000000199998 R11: 0000000000199a20 R12: ffff888116cae930
R13: ffff88811125ee38 R14: ffffc9000071fa70 R15: ffff88810b794f90
FS: 00007fc0415d2740(0000) GS:ffff888277c40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000115201006 CR4: 0000000000172ea0
Call Trace:
<TASK>
track_possible_nx_huge_page+0x53/0x80
kvm_tdp_mmu_map+0x242/0x2c0
kvm_tdp_page_fault+0x10c/0x130
kvm_mmu_page_fault+0x103/0x680
vmx_handle_exit+0x132/0x5a0 [kvm_intel]
vcpu_enter_guest+0x60c/0x16f0
kvm_arch_vcpu_ioctl_run+0x1e2/0x9d0
kvm_vcpu_ioctl+0x271/0x660
__x64_sys_ioctl+0x80/0xb0
do_syscall_64+0x2b/0x50
entry_SYSCALL_64_after_hwframe+0x46/0xb0
</TASK>
---[ end trace 0000000000000000 ]---
Fixes: 61f94478547b ("KVM: x86/mmu: Set disallowed_nx_huge_page in TDP MMU before setting SPTE")
Reported-by: Greg Thelen <gthelen@google.com>
Analyzed-by: David Matlack <dmatlack@google.com>
Cc: David Matlack <dmatlack@google.com>
Cc: Ben Gardon <bgardon@google.com>
Cc: Mingwei Zhang <mizhang@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221213033030.83345-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Map the leaf SPTE when handling a TDP MMU page fault if and only if the
target level is reached. A recent commit reworked the retry logic and
incorrectly assumed that walking SPTEs would never "fail", as the loop
either bails (retries) or installs parent SPs. However, the iterator
itself will bail early if it detects a frozen (REMOVED) SPTE when
stepping down. The TDP iterator also rereads the current SPTE before
stepping down specifically to avoid walking into a part of the tree that
is being removed, which means it's possible to terminate the loop without
the guts of the loop observing the frozen SPTE, e.g. if a different task
zaps a parent SPTE between the initial read and try_step_down()'s refresh.
Mapping a leaf SPTE at the wrong level results in all kinds of badness as
page table walkers interpret the SPTE as a page table, not a leaf, and
walk into the weeds.
------------[ cut here ]------------
WARNING: CPU: 1 PID: 1025 at arch/x86/kvm/mmu/tdp_mmu.c:1070 kvm_tdp_mmu_map+0x481/0x510
Modules linked in: kvm_intel
CPU: 1 PID: 1025 Comm: nx_huge_pages_t Tainted: G W 6.1.0-rc4+ #64
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:kvm_tdp_mmu_map+0x481/0x510
RSP: 0018:ffffc9000072fba8 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffffc9000072fcc0 RCX: 0000000000000027
RDX: 0000000000000027 RSI: 00000000ffffdfff RDI: ffff888277c5b4c8
RBP: ffff888107d45a10 R08: ffff888277c5b4c0 R09: ffffc9000072fa48
R10: 0000000000000001 R11: 0000000000000001 R12: ffffc9000073a0e0
R13: ffff88810fc54800 R14: ffff888107d1ae60 R15: ffff88810fc54f90
FS: 00007fba9f853740(0000) GS:ffff888277c40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000010aa7a003 CR4: 0000000000172ea0
Call Trace:
<TASK>
kvm_tdp_page_fault+0x10c/0x130
kvm_mmu_page_fault+0x103/0x680
vmx_handle_exit+0x132/0x5a0 [kvm_intel]
vcpu_enter_guest+0x60c/0x16f0
kvm_arch_vcpu_ioctl_run+0x1e2/0x9d0
kvm_vcpu_ioctl+0x271/0x660
__x64_sys_ioctl+0x80/0xb0
do_syscall_64+0x2b/0x50
entry_SYSCALL_64_after_hwframe+0x46/0xb0
</TASK>
---[ end trace 0000000000000000 ]---
Invalid SPTE change: cannot replace a present leaf
SPTE with another present leaf SPTE mapping a
different PFN!
as_id: 0 gfn: 100200 old_spte: 600000112400bf3 new_spte: 6000001126009f3 level: 2
------------[ cut here ]------------
kernel BUG at arch/x86/kvm/mmu/tdp_mmu.c:559!
invalid opcode: 0000 [#1] SMP
CPU: 1 PID: 1025 Comm: nx_huge_pages_t Tainted: G W 6.1.0-rc4+ #64
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:__handle_changed_spte.cold+0x95/0x9c
RSP: 0018:ffffc9000072faf8 EFLAGS: 00010246
RAX: 00000000000000c1 RBX: ffffc90000731000 RCX: 0000000000000027
RDX: 0000000000000000 RSI: 00000000ffffdfff RDI: ffff888277c5b4c8
RBP: 0600000112400bf3 R08: ffff888277c5b4c0 R09: ffffc9000072f9a0
R10: 0000000000000001 R11: 0000000000000001 R12: 06000001126009f3
R13: 0000000000000002 R14: 0000000012600901 R15: 0000000012400b01
FS: 00007fba9f853740(0000) GS:ffff888277c40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000010aa7a003 CR4: 0000000000172ea0
Call Trace:
<TASK>
kvm_tdp_mmu_map+0x3b0/0x510
kvm_tdp_page_fault+0x10c/0x130
kvm_mmu_page_fault+0x103/0x680
vmx_handle_exit+0x132/0x5a0 [kvm_intel]
vcpu_enter_guest+0x60c/0x16f0
kvm_arch_vcpu_ioctl_run+0x1e2/0x9d0
kvm_vcpu_ioctl+0x271/0x660
__x64_sys_ioctl+0x80/0xb0
do_syscall_64+0x2b/0x50
entry_SYSCALL_64_after_hwframe+0x46/0xb0
</TASK>
Modules linked in: kvm_intel
---[ end trace 0000000000000000 ]---
Fixes: 63d28a25e04c ("KVM: x86/mmu: simplify kvm_tdp_mmu_map flow when guest has to retry")
Cc: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221213033030.83345-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Hoist the is_removed_spte() check above the "level == goal_level" check
when walking SPTEs during a TDP MMU page fault to avoid attempting to map
a leaf entry if said entry is frozen by a different task/vCPU.
------------[ cut here ]------------
WARNING: CPU: 3 PID: 939 at arch/x86/kvm/mmu/tdp_mmu.c:653 kvm_tdp_mmu_map+0x269/0x4b0
Modules linked in: kvm_intel
CPU: 3 PID: 939 Comm: nx_huge_pages_t Not tainted 6.1.0-rc4+ #67
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:kvm_tdp_mmu_map+0x269/0x4b0
RSP: 0018:ffffc9000068fba8 EFLAGS: 00010246
RAX: 00000000000005a0 RBX: ffffc9000068fcc0 RCX: 0000000000000005
RDX: ffff88810741f000 RSI: ffff888107f04600 RDI: ffffc900006a3000
RBP: 060000010b000bf3 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 000ffffffffff000 R12: 0000000000000005
R13: ffff888113670000 R14: ffff888107464958 R15: 0000000000000000
FS: 00007f01c942c740(0000) GS:ffff888277cc0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000117013006 CR4: 0000000000172ea0
Call Trace:
<TASK>
kvm_tdp_page_fault+0x10c/0x130
kvm_mmu_page_fault+0x103/0x680
vmx_handle_exit+0x132/0x5a0 [kvm_intel]
vcpu_enter_guest+0x60c/0x16f0
kvm_arch_vcpu_ioctl_run+0x1e2/0x9d0
kvm_vcpu_ioctl+0x271/0x660
__x64_sys_ioctl+0x80/0xb0
do_syscall_64+0x2b/0x50
entry_SYSCALL_64_after_hwframe+0x46/0xb0
</TASK>
---[ end trace 0000000000000000 ]---
Fixes: 63d28a25e04c ("KVM: x86/mmu: simplify kvm_tdp_mmu_map flow when guest has to retry")
Cc: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <20221213033030.83345-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
When stuffing the allowed secondary execution controls for nested VMX in
response to CPUID updates, don't set the allowed-1 bit for a feature that
isn't supported by KVM, i.e. isn't allowed by the canonical vmcs_config.
WARN if KVM attempts to manipulate a feature that isn't supported. All
features that are currently stuffed are always advertised to L1 for
nested VMX if they are supported in KVM's base configuration, and no
additional features should ever be added to the CPUID-induced stuffing
(updating VMX MSRs in response to CPUID updates is a long-standing KVM
flaw that is slowly being fixed).
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221213062306.667649-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Set ENABLE_USR_WAIT_PAUSE in KVM's supported VMX MSR configuration if the
feature is supported in hardware and enabled in KVM's base, non-nested
configuration, i.e. expose ENABLE_USR_WAIT_PAUSE to L1 if it's supported.
This fixes a bug where saving/restoring, i.e. migrating, a vCPU will fail
if WAITPKG (the associated CPUID feature) is enabled for the vCPU, and
obviously allows L1 to enable the feature for L2.
KVM already effectively exposes ENABLE_USR_WAIT_PAUSE to L1 by stuffing
the allowed-1 control ina vCPU's virtual MSR_IA32_VMX_PROCBASED_CTLS2 when
updating secondary controls in response to KVM_SET_CPUID(2), but (a) that
depends on flawed code (KVM shouldn't touch VMX MSRs in response to CPUID
updates) and (b) runs afoul of vmx_restore_control_msr()'s restriction
that the guest value must be a strict subset of the supported host value.
Although no past commit explicitly enabled nested support for WAITPKG,
doing so is safe and functionally correct from an architectural
perspective as no additional KVM support is needed to virtualize TPAUSE,
UMONITOR, and UMWAIT for L2 relative to L1, and KVM already forwards
VM-Exits to L1 as necessary (commit bf653b78f960, "KVM: vmx: Introduce
handle_unexpected_vmexit and handle WAITPKG vmexit").
Note, KVM always keeps the hosts MSR_IA32_UMWAIT_CONTROL resident in
hardware, i.e. always runs both L1 and L2 with the host's power management
settings for TPAUSE and UMWAIT. See commit bf09fb6cba4f ("KVM: VMX: Stop
context switching MSR_IA32_UMWAIT_CONTROL") for more details.
Fixes: e69e72faa3a0 ("KVM: x86: Add support for user wait instructions")
Cc: stable@vger.kernel.org
Reported-by: Aaron Lewis <aaronlewis@google.com>
Reported-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <20221213062306.667649-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Explicitly drop the result of kvm_vcpu_write_guest() when writing the
"launch state" as part of VMCLEAR emulation, and add a comment to call
out that KVM's behavior is architecturally valid. Intel's pseudocode
effectively says that VMCLEAR is a nop if the target VMCS address isn't
in memory, e.g. if the address points at MMIO.
Add a FIXME to call out that suppressing failures on __copy_to_user() is
wrong, as memory (a memslot) does exist in that case. Punt the issue to
the future as open coding kvm_vcpu_write_guest() just to make sure the
guest dies with -EFAULT isn't worth the extra complexity. The flaw will
need to be addressed if KVM ever does something intelligent on uaccess
failures, e.g. to support post-copy demand paging, but in that case KVM
will need a more thorough overhaul, i.e. VMCLEAR shouldn't need to open
code a core KVM helper.
No functional change intended.
Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Addresses-Coverity-ID: 1527765 ("Error handling issues")
Fixes: 587d7e72aedc ("kvm: nVMX: VMCLEAR should not cause the vCPU to shut down")
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221220154224.526568-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add a sanity check in kvm_handle_memory_failure() to assert that a valid
x86_exception structure is provided if the memory "failure" wants to
propagate a fault into the guest. If a memory failure happens during a
direct guest physical memory access, e.g. for nested VMX, KVM hardcodes
the failure to X86EMUL_IO_NEEDED and doesn't provide an exception pointer
(because the exception struct would just be filled with garbage).
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221220153427.514032-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
kvm_apic_hw_enabled() only needs to return bool, there is no place
to use the return value of MSR_IA32_APICBASE_ENABLE.
Signed-off-by: Peng Hao <flyingpeng@tencent.com>
Message-Id: <CAPm50aJ=BLXNWT11+j36Dd6d7nz2JmOBk4u7o_NPQ0N61ODu1g@mail.gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|