diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2022-10-04 10:12:08 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2022-10-04 10:12:08 -0700 |
commit | b5f0b11353a6a33a1accd0b742c80ed6b2f35ac0 (patch) | |
tree | ffd293394865cc2beb0ef4023b99ac3c6e7e0db1 | |
parent | 9bf445b65dd7d4c7d0e4efaecf38525abbbe74e3 (diff) | |
parent | 712f210a457d9c32414df246a72781550bc23ef6 (diff) | |
download | linux-b5f0b11353a6a33a1accd0b742c80ed6b2f35ac0.tar.bz2 |
Merge tag 'x86_microcode_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x75 microcode loader updates from Borislav Petkov:
- Get rid of a single ksize() usage
- By popular demand, print the previous microcode revision an update
was done over
- Remove more code related to the now gone MICROCODE_OLD_INTERFACE
- Document the problems stemming from microcode late loading
* tag 'x86_microcode_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode/AMD: Track patch allocation size explicitly
x86/microcode: Print previous version of microcode after reload
x86/microcode: Remove ->request_microcode_user()
x86/microcode: Document the whole late loading problem
-rw-r--r-- | Documentation/admin-guide/tainted-kernels.rst | 6 | ||||
-rw-r--r-- | Documentation/x86/microcode.rst | 116 | ||||
-rw-r--r-- | arch/x86/include/asm/microcode.h | 4 | ||||
-rw-r--r-- | arch/x86/kernel/cpu/microcode/amd.c | 10 | ||||
-rw-r--r-- | arch/x86/kernel/cpu/microcode/core.c | 5 | ||||
-rw-r--r-- | arch/x86/kernel/cpu/microcode/intel.c | 17 |
6 files changed, 119 insertions, 39 deletions
diff --git a/Documentation/admin-guide/tainted-kernels.rst b/Documentation/admin-guide/tainted-kernels.rst index 7d80e8c307d1..92a8a07f5c43 100644 --- a/Documentation/admin-guide/tainted-kernels.rst +++ b/Documentation/admin-guide/tainted-kernels.rst @@ -134,6 +134,12 @@ More detailed explanation for tainting scsi/snic on something else than x86_64, scsi/ips on non x86/x86_64/itanium, have broken firmware settings for the irqchip/irq-gic on arm64 ...). + - x86/x86_64: Microcode late loading is dangerous and will result in + tainting the kernel. It requires that all CPUs rendezvous to make sure + the update happens when the system is as quiescent as possible. However, + a higher priority MCE/SMI/NMI can move control flow away from that + rendezvous and interrupt the update, which can be detrimental to the + machine. 3) ``R`` if a module was force unloaded by ``rmmod -f``, ``' '`` if all modules were unloaded normally. diff --git a/Documentation/x86/microcode.rst b/Documentation/x86/microcode.rst index a320d37982ed..b627c6f36bcf 100644 --- a/Documentation/x86/microcode.rst +++ b/Documentation/x86/microcode.rst @@ -6,6 +6,7 @@ The Linux Microcode Loader :Authors: - Fenghua Yu <fenghua.yu@intel.com> - Borislav Petkov <bp@suse.de> + - Ashok Raj <ashok.raj@intel.com> The kernel has a x86 microcode loading facility which is supposed to provide microcode loading methods in the OS. Potential use cases are @@ -92,15 +93,8 @@ vendor's site. Late loading ============ -There are two legacy user space interfaces to load microcode, either through -/dev/cpu/microcode or through /sys/devices/system/cpu/microcode/reload file -in sysfs. - -The /dev/cpu/microcode method is deprecated because it needs a special -userspace tool for that. - -The easier method is simply installing the microcode packages your distro -supplies and running:: +You simply install the microcode packages your distro supplies and +run:: # echo 1 > /sys/devices/system/cpu/microcode/reload @@ -110,6 +104,110 @@ The loading mechanism looks for microcode blobs in /lib/firmware/{intel-ucode,amd-ucode}. The default distro installation packages already put them there. +Since kernel 5.19, late loading is not enabled by default. + +The /dev/cpu/microcode method has been removed in 5.19. + +Why is late loading dangerous? +============================== + +Synchronizing all CPUs +---------------------- + +The microcode engine which receives the microcode update is shared +between the two logical threads in a SMT system. Therefore, when +the update is executed on one SMT thread of the core, the sibling +"automatically" gets the update. + +Since the microcode can "simulate" MSRs too, while the microcode update +is in progress, those simulated MSRs transiently cease to exist. This +can result in unpredictable results if the SMT sibling thread happens to +be in the middle of an access to such an MSR. The usual observation is +that such MSR accesses cause #GPs to be raised to signal that former are +not present. + +The disappearing MSRs are just one common issue which is being observed. +Any other instruction that's being patched and gets concurrently +executed by the other SMT sibling, can also result in similar, +unpredictable behavior. + +To eliminate this case, a stop_machine()-based CPU synchronization was +introduced as a way to guarantee that all logical CPUs will not execute +any code but just wait in a spin loop, polling an atomic variable. + +While this took care of device or external interrupts, IPIs including +LVT ones, such as CMCI etc, it cannot address other special interrupts +that can't be shut off. Those are Machine Check (#MC), System Management +(#SMI) and Non-Maskable interrupts (#NMI). + +Machine Checks +-------------- + +Machine Checks (#MC) are non-maskable. There are two kinds of MCEs. +Fatal un-recoverable MCEs and recoverable MCEs. While un-recoverable +errors are fatal, recoverable errors can also happen in kernel context +are also treated as fatal by the kernel. + +On certain Intel machines, MCEs are also broadcast to all threads in a +system. If one thread is in the middle of executing WRMSR, a MCE will be +taken at the end of the flow. Either way, they will wait for the thread +performing the wrmsr(0x79) to rendezvous in the MCE handler and shutdown +eventually if any of the threads in the system fail to check in to the +MCE rendezvous. + +To be paranoid and get predictable behavior, the OS can choose to set +MCG_STATUS.MCIP. Since MCEs can be at most one in a system, if an +MCE was signaled, the above condition will promote to a system reset +automatically. OS can turn off MCIP at the end of the update for that +core. + +System Management Interrupt +--------------------------- + +SMIs are also broadcast to all CPUs in the platform. Microcode update +requests exclusive access to the core before writing to MSR 0x79. So if +it does happen such that, one thread is in WRMSR flow, and the 2nd got +an SMI, that thread will be stopped in the first instruction in the SMI +handler. + +Since the secondary thread is stopped in the first instruction in SMI, +there is very little chance that it would be in the middle of executing +an instruction being patched. Plus OS has no way to stop SMIs from +happening. + +Non-Maskable Interrupts +----------------------- + +When thread0 of a core is doing the microcode update, if thread1 is +pulled into NMI, that can cause unpredictable behavior due to the +reasons above. + +OS can choose a variety of methods to avoid running into this situation. + + +Is the microcode suitable for late loading? +------------------------------------------- + +Late loading is done when the system is fully operational and running +real workloads. Late loading behavior depends on what the base patch on +the CPU is before upgrading to the new patch. + +This is true for Intel CPUs. + +Consider, for example, a CPU has patch level 1 and the update is to +patch level 3. + +Between patch1 and patch3, patch2 might have deprecated a software-visible +feature. + +This is unacceptable if software is even potentially using that feature. +For instance, say MSR_X is no longer available after an update, +accessing that MSR will cause a #GP fault. + +Basically there is no way to declare a new microcode update suitable +for late-loading. This is another one of the problems that caused late +loading to be not enabled by default. + Builtin microcode ================= diff --git a/arch/x86/include/asm/microcode.h b/arch/x86/include/asm/microcode.h index 0c3d3440fe27..74ecc2bd6cd0 100644 --- a/arch/x86/include/asm/microcode.h +++ b/arch/x86/include/asm/microcode.h @@ -9,6 +9,7 @@ struct ucode_patch { struct list_head plist; void *data; /* Intel uses only this one */ + unsigned int size; u32 patch_id; u16 equiv_cpu; }; @@ -32,9 +33,6 @@ enum ucode_state { }; struct microcode_ops { - enum ucode_state (*request_microcode_user) (int cpu, - const void __user *buf, size_t size); - enum ucode_state (*request_microcode_fw) (int cpu, struct device *, bool refresh_fw); diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c index 8b2fcdfa6d31..e7410e98fc1f 100644 --- a/arch/x86/kernel/cpu/microcode/amd.c +++ b/arch/x86/kernel/cpu/microcode/amd.c @@ -788,6 +788,7 @@ static int verify_and_add_patch(u8 family, u8 *fw, unsigned int leftover, kfree(patch); return -EINVAL; } + patch->size = *patch_size; mc_hdr = (struct microcode_header_amd *)(fw + SECTION_HDR_SIZE); proc_id = mc_hdr->processor_rev_id; @@ -869,7 +870,7 @@ load_microcode_amd(bool save, u8 family, const u8 *data, size_t size) return ret; memset(amd_ucode_patch, 0, PATCH_MAX_SIZE); - memcpy(amd_ucode_patch, p->data, min_t(u32, ksize(p->data), PATCH_MAX_SIZE)); + memcpy(amd_ucode_patch, p->data, min_t(u32, p->size, PATCH_MAX_SIZE)); return ret; } @@ -924,12 +925,6 @@ static enum ucode_state request_microcode_amd(int cpu, struct device *device, return ret; } -static enum ucode_state -request_microcode_user(int cpu, const void __user *buf, size_t size) -{ - return UCODE_ERROR; -} - static void microcode_fini_cpu_amd(int cpu) { struct ucode_cpu_info *uci = ucode_cpu_info + cpu; @@ -938,7 +933,6 @@ static void microcode_fini_cpu_amd(int cpu) } static struct microcode_ops microcode_amd_ops = { - .request_microcode_user = request_microcode_user, .request_microcode_fw = request_microcode_amd, .collect_cpu_info = collect_cpu_info_amd, .apply_microcode = apply_microcode_amd, diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c index ad57e0e4d674..6a41cee242f6 100644 --- a/arch/x86/kernel/cpu/microcode/core.c +++ b/arch/x86/kernel/cpu/microcode/core.c @@ -491,7 +491,7 @@ wait_for_siblings: */ static int microcode_reload_late(void) { - int ret; + int old = boot_cpu_data.microcode, ret; pr_err("Attempting late microcode loading - it is dangerous and taints the kernel.\n"); pr_err("You should switch to early loading, if possible.\n"); @@ -503,7 +503,8 @@ static int microcode_reload_late(void) if (ret == 0) microcode_check(); - pr_info("Reload completed, microcode revision: 0x%x\n", boot_cpu_data.microcode); + pr_info("Reload completed, microcode revision: 0x%x -> 0x%x\n", + old, boot_cpu_data.microcode); return ret; } diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c index 025c8f0cd948..1fcbd671f1df 100644 --- a/arch/x86/kernel/cpu/microcode/intel.c +++ b/arch/x86/kernel/cpu/microcode/intel.c @@ -916,24 +916,7 @@ static enum ucode_state request_microcode_fw(int cpu, struct device *device, return ret; } -static enum ucode_state -request_microcode_user(int cpu, const void __user *buf, size_t size) -{ - struct iov_iter iter; - struct iovec iov; - - if (is_blacklisted(cpu)) - return UCODE_NFOUND; - - iov.iov_base = (void __user *)buf; - iov.iov_len = size; - iov_iter_init(&iter, WRITE, &iov, 1, size); - - return generic_load_microcode(cpu, &iter); -} - static struct microcode_ops microcode_intel_ops = { - .request_microcode_user = request_microcode_user, .request_microcode_fw = request_microcode_fw, .collect_cpu_info = collect_cpu_info, .apply_microcode = apply_microcode_intel, |