summaryrefslogtreecommitdiffstats
path: root/arch/x86/boot
AgeCommit message (Collapse)AuthorFilesLines
2017-02-20Merge branch 'x86-boot-for-linus' of ↵Linus Torvalds3-3/+151
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 boot updates from Ingo Molnar: "Misc updates: - fix e820 error handling - convert page table setup code from assembly to C - fix kexec environment bug - ... plus small cleanups" * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/kconfig: Remove misleading note regarding hibernation and KASLR x86/boot: Fix KASLR and memmap= collision x86/e820/32: Fix e820_search_gap() error handling on x86-32 x86/boot/32: Convert the 32-bit pgtable setup code from assembly to C x86/e820: Make e820_search_gap() static and remove unused variables
2017-02-07efi: Get and store the secure boot statusDavid Howells1-0/+7
Get the firmware's secure-boot status in the kernel boot wrapper and stash it somewhere that the main kernel image can find. The efi_get_secureboot() function is extracted from the ARM stub and (a) generalised so that it can be called from x86 and (b) made to use efi_call_runtime() so that it can be run in mixed-mode. For x86, it is stored in boot_params and can be overridden by the boot loader or kexec. This allows secure-boot mode to be passed on to a new kernel. Suggested-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1486380166-31868-5-git-send-email-ard.biesheuvel@linaro.org [ Small readability edits. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-07x86/efi: Allow invocation of arbitrary runtime servicesDavid Howells3-7/+8
Provide the ability to perform mixed-mode runtime service calls for x86 in the same way the following commit provided the ability to invoke for boot services: 0a637ee61247bd ("x86/efi: Allow invocation of arbitrary boot services") Suggested-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1486380166-31868-2-git-send-email-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-01x86/efi: Deduplicate efi_char16_printk()Lukas Wunner1-24/+2
Eliminate the separate 32-bit and 64x- bit code paths by way of the shiny new efi_call_proto() macro. No functional change intended. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1485868902-20401-3-git-send-email-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-01efi: Deduplicate efi_file_size() / _read() / _close()Lukas Wunner1-148/+0
There's one ARM, one x86_32 and one x86_64 version which can be folded into a single shared version by masking their differences with the shiny new efi_call_proto() macro. No functional change intended. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1485868902-20401-2-git-send-email-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28Merge branch 'linus' into x86/boot, to pick up fixesIngo Molnar2-0/+10
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-25x86/boot: Fix KASLR and memmap= collisionDave Jiang3-3/+151
CONFIG_RANDOMIZE_BASE=y relocates the kernel to a random base address. However it does not take into account the memmap= parameter passed in from the kernel command line. This results in the kernel sometimes being put in the middle of memmap. Teach KASLR to not insert the kernel in memmap defined regions. We support up to 4 memmap regions: any additional regions will cause KASLR to disable. The mem_avoid set has been augmented to add up to 4 unusable regions of memmaps provided by the user to exclude those regions from the set of valid address range to insert the uncompressed kernel image. The nn@ss ranges will be skipped by the mem_avoid set since it indicates that memory is useable. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Baoquan He <bhe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: dan.j.williams@intel.com Cc: david@fromorbit.com Cc: linux-nvdimm@lists.01.org Link: http://lkml.kernel.org/r/148417664156.131935.2248592164852799738.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-09x86/boot: Add missing declaration of string functionsNicholas Mc Guire2-0/+10
Add the missing declarations of basic string functions to string.h to allow a clean build. Fixes: 5be865661516 ("String-handling functions for the new x86 setup code.") Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Link: http://lkml.kernel.org/r/1483781911-21399-1-git-send-email-hofrat@osadl.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-19Revert "x86/boot: Fail the boot if !M486 and CPUID is missing"Andy Lutomirski1-6/+0
This reverts commit ed68d7e9b9cfb64f3045ffbcb108df03c09a0f98. The patch wasn't quite correct -- there are non-Intel (and hence non-486) CPUs that we support that don't have CPUID. Since we no longer require CPUID for sync_core(), just revert the patch. I think the relevant CPUs are Geode and Elan, but I'm not sure. In principle, we should try to do better at identifying CPUID-less CPUs in early boot, but that's more complicated. Reported-by: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Matthew Whitehead <tedheadster@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: xen-devel <Xen-devel@lists.xen.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/82acde18a108b8e353180dd6febcc2876df33f24.1481307769.git.luto@kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-14Merge branch 'for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial updates from Jiri Kosina. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: NTB: correct ntb_spad_count comment typo misc: ibmasm: fix typo in error message Remove references to dead make variable LINUX_INCLUDE Remove last traces of ikconfig.h treewide: Fix printk() message errors Documentation/device-mapper: s/getsize/getsz/
2016-12-14Remove references to dead make variable LINUX_INCLUDEPaul Bolle1-1/+1
Commit 4fd06960f120 ("Use the new x86 setup code for i386") introduced a reference to the make variable LINUX_INCLUDE. That reference got moved around a bit and copied twice and now there are three references to it. There has never been a definition of that variable. (Presumably that is because it started out as a mistyped reference to LINUXINCLUDE.) So this reference has always been an empty string. Let's remove it before it spreads any further. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2016-12-12Merge branch 'x86-boot-for-linus' of ↵Linus Torvalds2-3/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 boot updates from Ingo Molnar: "Misc cleanups/simplifications by Borislav Petkov, Paul Bolle and Wei Yang" * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot/64: Optimize fixmap page fixup x86/boot: Simplify the GDTR calculation assembly code a bit x86/boot/build: Remove always empty $(USERINCLUDE)
2016-12-12Merge branch 'efi-core-for-linus' of ↵Linus Torvalds1-0/+65
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI updates from Ingo Molnar: "The main changes in this development cycle were: - Implement EFI dev path parser and other changes to fully support thunderbolt devices on Apple Macbooks (Lukas Wunner) - Add RNG seeding via the EFI stub, on ARM/arm64 (Ard Biesheuvel) - Expose EFI framebuffer configuration to user-space, to improve tooling (Peter Jones) - Misc fixes and cleanups (Ivan Hu, Wei Yongjun, Yisheng Xie, Dan Carpenter, Roy Franz)" * 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi/libstub: Make efi_random_alloc() allocate below 4 GB on 32-bit thunderbolt: Compile on x86 only thunderbolt, efi: Fix Kconfig dependencies harder thunderbolt, efi: Fix Kconfig dependencies thunderbolt: Use Device ROM retrieved from EFI x86/efi: Retrieve and assign Apple device properties efi: Allow bitness-agnostic protocol calls efi: Add device path parser efi/arm*/libstub: Invoke EFI_RNG_PROTOCOL to seed the UEFI RNG table efi/libstub: Add random.c to ARM build efi: Add support for seeding the RNG from a UEFI config table MAINTAINERS: Add ARM and arm64 EFI specific files to EFI subsystem efi/libstub: Fix allocation size calculations efi/efivar_ssdt_load: Don't return success on allocation failure efifb: Show framebuffer layout as device attributes efi/efi_test: Use memdup_user() as a cleanup efi/efi_test: Fix uninitialized variable 'rv' efi/efi_test: Fix uninitialized variable 'datasize' efi/arm*: Fix efi_init() error handling efi: Remove unused include of <linux/version.h>
2016-11-21x86/build: Build compressed x86 kernels as PIE when !CONFIG_RELOCATABLE as wellH.J. Lu1-3/+2
Since the bootloader may load the compressed x86 kernel at any address, it should always be built as PIE, not just when CONFIG_RELOCATABLE=y. Otherwise, linker in binutils 2.27 will optimize GOT load into the absolute address when building the compressed x86 kernel as a non-PIE executable. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org [ Small wording changes. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-21x86/boot: Fail the boot if !M486 and CPUID is missingAndy Lutomirski1-0/+6
Linux will have all kinds of sporadic problems on systems that don't have the CPUID instruction unless CONFIG_M486=y. In particular, sync_core() will explode. I believe that these kernels had a better chance of working before commit 05fb3c199bb0 ("x86/boot: Initialize FPU and X86_FEATURE_ALWAYS even if we don't have CPUID"). That commit inadvertently fixed a serious bug: we used to fail to detect the FPU if CPUID wasn't present. Because we also used to forget to set X86_FEATURE_ALWAYS, we end up with no cpu feature bits set at all. This meant that alternative patching didn't do anything and, if paravirt was disabled, we could plausibly finish the entire boot process without calling sync_core(). Rather than trying to work around these issues, just have the kernel fail loudly if it's running on a CPUID-less 486, doesn't have CPUID, and doesn't have CONFIG_M486 set. Reported-by: Matthew Whitehead <tedheadster@gmail.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/70eac6639f23df8be5fe03fa1984aedd5d40077a.1479598603.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-13x86/efi: Retrieve and assign Apple device propertiesLukas Wunner1-0/+65
Apple's EFI drivers supply device properties which are needed to support Macs optimally. They contain vital information which cannot be obtained any other way (e.g. Thunderbolt Device ROM). They're also used to convey the current device state so that OS drivers can pick up where EFI drivers left (e.g. GPU mode setting). There's an EFI driver dubbed "AAPL,PathProperties" which implements a per-device key/value store. Other EFI drivers populate it using a custom protocol. The macOS bootloader /System/Library/CoreServices/boot.efi retrieves the properties with the same protocol. The kernel extension AppleACPIPlatform.kext subsequently merges them into the I/O Kit registry (see ioreg(8)) where they can be queried by other kernel extensions and user space. This commit extends the efistub to retrieve the device properties before ExitBootServices is called. It assigns them to devices in an fs_initcall so that they can be queried with the API in <linux/property.h>. Note that the device properties will only be available if the kernel is booted with the efistub. Distros should adjust their installers to always use the efistub on Macs. grub with the "linux" directive will not work unless the functionality of this commit is duplicated in grub. (The "linuxefi" directive should work but is not included upstream as of this writing.) The custom protocol has GUID 91BD12FE-F6C3-44FB-A5B7-5122AB303AE0 and looks like this: typedef struct { unsigned long version; /* 0x10000 */ efi_status_t (*get) ( IN struct apple_properties_protocol *this, IN struct efi_dev_path *device, IN efi_char16_t *property_name, OUT void *buffer, IN OUT u32 *buffer_len); /* EFI_SUCCESS, EFI_NOT_FOUND, EFI_BUFFER_TOO_SMALL */ efi_status_t (*set) ( IN struct apple_properties_protocol *this, IN struct efi_dev_path *device, IN efi_char16_t *property_name, IN void *property_value, IN u32 property_value_len); /* allocates copies of property name and value */ /* EFI_SUCCESS, EFI_OUT_OF_RESOURCES */ efi_status_t (*del) ( IN struct apple_properties_protocol *this, IN struct efi_dev_path *device, IN efi_char16_t *property_name); /* EFI_SUCCESS, EFI_NOT_FOUND */ efi_status_t (*get_all) ( IN struct apple_properties_protocol *this, OUT void *buffer, IN OUT u32 *buffer_len); /* EFI_SUCCESS, EFI_BUFFER_TOO_SMALL */ } apple_properties_protocol; Thanks to Pedro Vilaça for this blog post which was helpful in reverse engineering Apple's EFI drivers and bootloader: https://reverse.put.as/2016/06/25/apple-efi-firmware-passwords-and-the-scbo-myth/ If someone at Apple is reading this, please note there's a memory leak in your implementation of the del() function as the property struct is freed but the name and value allocations are not. Neither the macOS bootloader nor Apple's EFI drivers check the protocol version, but we do to avoid breakage if it's ever changed. It's been the same since at least OS X 10.6 (2009). The get_all() function conveniently fills a buffer with all properties in marshalled form which can be passed to the kernel as a setup_data payload. The number of device properties is dynamic and can change between a first invocation of get_all() (to determine the buffer size) and a second invocation (to retrieve the actual buffer), hence the peculiar loop which does not finish until the buffer size settles. The macOS bootloader does the same. The setup_data payload is later on unmarshalled in an fs_initcall. The idea is that most buses instantiate devices in "subsys" initcall level and drivers are usually bound to these devices in "device" initcall level, so we assign the properties in-between, i.e. in "fs" initcall level. This assumes that devices to which properties pertain are instantiated from a "subsys" initcall or earlier. That should always be the case since on macOS, AppleACPIPlatformExpert::matchEFIDevicePath() only supports ACPI and PCI nodes and we've fully scanned those buses during "subsys" initcall level. The second assumption is that properties are only needed from a "device" initcall or later. Seems reasonable to me, but should this ever not work out, an alternative approach would be to store the property sets e.g. in a btree early during boot. Then whenever device_add() is called, an EFI Device Path would have to be constructed for the newly added device, and looked up in the btree. That way, the property set could be assigned to the device immediately on instantiation. And this would also work for devices instantiated in a deferred fashion. It seems like this approach would be more complicated and require more code. That doesn't seem justified without a specific use case. For comparison, the strategy on macOS is to assign properties to objects in the ACPI namespace (AppleACPIPlatformExpert::mergeEFIProperties()). That approach is definitely wrong as it fails for devices not present in the namespace: The NHI EFI driver supplies properties for attached Thunderbolt devices, yet on Macs with Thunderbolt 1 only one device level behind the host controller is described in the namespace. Consequently macOS cannot assign properties for chained devices. With Thunderbolt 2 they started to describe three device levels behind host controllers in the namespace but this grossly inflates the SSDT and still fails if the user daisy-chained more than three devices. We copy the property names and values from the setup_data payload to swappable virtual memory and afterwards make the payload available to the page allocator. This is just for the sake of good housekeeping, it wouldn't occupy a meaningful amount of physical memory (4444 bytes on my machine). Only the payload is freed, not the setup_data header since otherwise we'd break the list linkage and we cannot safely update the predecessor's ->next link because there's no locking for the list. The payload is currently not passed on to kexec'ed kernels, same for PCI ROMs retrieved by setup_efi_pci(). This can be added later if there is demand by amending setup_efi_state(). The payload can then no longer be made available to the page allocator of course. Tested-by: Lukas Wunner <lukas@wunner.de> [MacBookPro9,1] Tested-by: Pierre Moreau <pierre.morrow@free.fr> [MacBookPro11,3] Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Andreas Noever <andreas.noever@gmail.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pedro Vilaça <reverser@put.as> Cc: Peter Jones <pjones@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: grub-devel@gnu.org Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20161112213237.8804-9-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-07x86/boot: Simplify the GDTR calculation assembly code a bitWei Yang1-2/+1
This patch calculates the GDTR's base address via a single instruction. ( EBP contains the address where it is loaded and GDTR's base address is already set to "gdt" in compilation. It is fine to get the correct base address by adding the delta to GDTR's base address. ) Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1478015364-5547-1-git-send-email-richard.weiyang@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-07x86/boot/build: Remove always empty $(USERINCLUDE)Paul Bolle1-1/+1
Commmit b6eea87fc685: ("x86, boot: Explicitly include autoconf.h for hostprogs") correctly noted: [...] that because $(USERINCLUDE) isn't exported by the top-level Makefile it's actually empty in arch/x86/boot/Makefile. So let's do the sane thing and remove the reference to that make variable. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: David Howells <dhowells@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1478166468-9760-1-git-send-email-pebolle@tiscali.nl Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-13Merge tag 'efi-next' of ↵Ingo Molnar3-44/+10
git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into efi/core Pull EFI updates from Matt Fleming: "* Refactor the EFI memory map code into architecture neutral files and allow drivers to permanently reserve EFI boot services regions on x86, as well as ARM/arm64 - Matt Fleming * Add ARM support for the EFI esrt driver - Ard Biesheuvel * Make the EFI runtime services and efivar API interruptible by swapping spinlocks for semaphores - Sylvain Chouleur * Provide the EFI identity mapping for kexec which allows kexec to work on SGI/UV platforms with requiring the "noefi" kernel command line parameter - Alex Thorlton * Add debugfs node to dump EFI page tables on arm64 - Ard Biesheuvel * Merge the EFI test driver being carried out of tree until now in the FWTS project - Ivan Hu * Expand the list of flags for classifying EFI regions as "RAM" on arm64 so we align with the UEFI spec - Ard Biesheuvel * Optimise out the EFI mixed mode if it's unsupported (CONFIG_X86_32) or disabled (CONFIG_EFI_MIXED=n) and switch the early EFI boot services function table for direct calls, alleviating us from having to maintain the custom function table - Lukas Wunner * Miscellaneous cleanups and fixes" Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-09x86/efi: Allow invocation of arbitrary boot servicesLukas Wunner3-19/+8
We currently allow invocation of 8 boot services with efi_call_early(). Not included are LocateHandleBuffer and LocateProtocol in particular. For graphics output or to retrieve PCI ROMs and Apple device properties, we're thus forced to use the LocateHandle + AllocatePool + LocateHandle combo, which is cumbersome and needs more code. The ARM folks allow invocation of the full set of boot services but are restricted to our 8 boot services in functions shared across arches. Thus, rather than adding just LocateHandleBuffer and LocateProtocol to struct efi_config, let's rework efi_call_early() to allow invocation of arbitrary boot services by selecting the 64 bit vs 32 bit code path in the macro itself. When compiling for 32 bit or for 64 bit without mixed mode, the unused code path is optimized away and the binary code is the same as before. But on 64 bit with mixed mode enabled, this commit adds one compare instruction to each invocation of a boot service and, depending on the code path selected, two jump instructions. (Most of the time gcc arranges the jumps in the 32 bit code path.) The result is a minuscule performance penalty and the binary code becomes slightly larger and more difficult to read when disassembled. This isn't a hot path, so these drawbacks are arguably outweighed by the attainable simplification of the C code. We have some overhead anyway for thunking or conversion between calling conventions. The 8 boot services can consequently be removed from struct efi_config. No functional change intended (for now). Example -- invocation of free_pool before (64 bit code path): 0x2d4 movq %ds:efi_early, %rdx ; efi_early 0x2db movq %ss:arg_0-0x20(%rsp), %rsi 0x2e0 xorl %eax, %eax 0x2e2 movq %ds:0x28(%rdx), %rdi ; efi_early->free_pool 0x2e6 callq *%ds:0x58(%rdx) ; efi_early->call() Example -- invocation of free_pool after (64 / 32 bit mixed code path): 0x0dc movq %ds:efi_early, %rax ; efi_early 0x0e3 cmpb $0, %ds:0x28(%rax) ; !efi_early->is64 ? 0x0e7 movq %ds:0x20(%rax), %rdx ; efi_early->call() 0x0eb movq %ds:0x10(%rax), %rax ; efi_early->boot_services 0x0ef je $0x150 0x0f1 movq %ds:0x48(%rax), %rdi ; free_pool (64 bit) 0x0f5 xorl %eax, %eax 0x0f7 callq *%rdx ... 0x150 movl %ds:0x30(%rax), %edi ; free_pool (32 bit) 0x153 jmp $0x0f5 Size of eboot.o text section: CONFIG_X86_32: 6464 before, 6318 after CONFIG_X86_64 && !CONFIG_EFI_MIXED: 7670 before, 7573 after CONFIG_X86_64 && CONFIG_EFI_MIXED: 7670 before, 8319 after Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
2016-09-09x86/efi: Remove unused find_bits() functionLukas Wunner1-23/+0
Left behind by commit fc37206427ce ("efi/libstub: Move Graphics Output Protocol handling to generic code"). Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
2016-09-09x86/efi: Initialize status to ensure garbage is not returned on small sizeColin Ian King1-2/+2
Although very unlikey, if size is too small or zero, then we end up with status not being set and returning garbage. Instead, initializing status to EFI_INVALID_PARAMETER to indicate that size is invalid in the calls to setup_uga32 and setup_uga64. Signed-off-by: Colin Ian King <colin.king@canonical.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
2016-09-05x86/efi: Use efi_exit_boot_services()Jeffrey Hugo1-69/+67
The eboot code directly calls ExitBootServices. This is inadvisable as the UEFI spec details a complex set of errors, race conditions, and API interactions that the caller of ExitBootServices must get correct. The eboot code attempts allocations after calling ExitBootSerives which is not permitted per the spec. Call the efi_exit_boot_services() helper intead, which handles the allocation scenario properly. Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Leif Lindholm <leif.lindholm@linaro.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
2016-09-05efi/libstub: Allocate headspace in efi_get_memory_map()Jeffrey Hugo1-7/+13
efi_get_memory_map() allocates a buffer to store the memory map that it retrieves. This buffer may need to be reused by the client after ExitBootServices() is called, at which point allocations are not longer permitted. To support this usecase, provide the allocated buffer size back to the client, and allocate some additional headroom to account for any reasonable growth in the map that is likely to happen between the call to efi_get_memory_map() and the client reusing the buffer. Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Leif Lindholm <leif.lindholm@linaro.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
2016-08-02Merge branch 'kbuild' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull kbuild updates from Michal Marek: - GCC plugin support by Emese Revfy from grsecurity, with a fixup from Kees Cook. The plugins are meant to be used for static analysis of the kernel code. Two plugins are provided already. - reduction of the gcc commandline by Arnd Bergmann. - IS_ENABLED / IS_REACHABLE macro enhancements by Masahiro Yamada - bin2c fix by Michael Tautschnig - setlocalversion fix by Wolfram Sang * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: gcc-plugins: disable under COMPILE_TEST kbuild: Abort build on bad stack protector flag scripts: Fix size mismatch of kexec_purgatory_size kbuild: make samples depend on headers_install Kbuild: don't add obj tree in additional includes Kbuild: arch: look for generated headers in obtree Kbuild: always prefix objtree in LINUXINCLUDE Kbuild: avoid duplicate include path Kbuild: don't add ../../ to include path vmlinux.lds.h: replace config_enabled() with IS_ENABLED() kconfig.h: allow to use IS_{ENABLE,REACHABLE} in macro expansion kconfig.h: use already defined macros for IS_REACHABLE() define export.h: use __is_defined() to check if __KSYM_* is defined kconfig.h: use __is_defined() to check if MODULE is defined kbuild: setlocalversion: print error to STDERR Add sancov plugin Add Cyclomatic complexity GCC plugin GCC plugin infrastructure Shared library support
2016-07-25Merge branch 'x86-boot-for-linus' of ↵Linus Torvalds5-182/+190
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 boot updates from Ingo Molnar: "The main changes: - add initial commits to randomize kernel memory section virtual addresses, enabled via a new kernel option: RANDOMIZE_MEMORY (Thomas Garnier, Kees Cook, Baoquan He, Yinghai Lu) - enhance KASLR (RANDOMIZE_BASE) physical memory randomization (Kees Cook) - EBDA/BIOS region boot quirk cleanups (Andy Lutomirski, Ingo Molnar) - misc cleanups/fixes" * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Simplify EBDA-vs-BIOS reservation logic x86/boot: Clarify what x86_legacy_features.reserve_bios_regions does x86/boot: Reorganize and clean up the BIOS area reservation code x86/mm: Do not reference phys addr beyond kernel x86/mm: Add memory hotplug support for KASLR memory randomization x86/mm: Enable KASLR for vmalloc memory regions x86/mm: Enable KASLR for physical mapping memory regions x86/mm: Implement ASLR for kernel memory regions x86/mm: Separate variable for trampoline PGD x86/mm: Add PUD VA support for physical mapping x86/mm: Update physical mapping variable names x86/mm: Refactor KASLR entropy functions x86/KASLR: Fix boot crash with certain memory configurations x86/boot/64: Add forgotten end of function marker x86/KASLR: Allow randomization below the load address x86/KASLR: Extend kernel image physical address randomization to addresses larger than 4G x86/KASLR: Randomize virtual address separately x86/KASLR: Clarify identity map interface x86/boot: Refuse to build with data relocations x86/KASLR, x86/power: Remove x86 hibernation restrictions
2016-07-25Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds7-12/+53
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm updates from Ingo Molnar: "Various x86 low level modifications: - preparatory work to support virtually mapped kernel stacks (Andy Lutomirski) - support for 64-bit __get_user() on 32-bit kernels (Benjamin LaHaise) - (involved) workaround for Knights Landing CPU erratum (Dave Hansen) - MPX enhancements (Dave Hansen) - mremap() extension to allow remapping of the special VDSO vma, for purposes of user level context save/restore (Dmitry Safonov) - hweight and entry code cleanups (Borislav Petkov) - bitops code generation optimizations and cleanups with modern GCC (H. Peter Anvin) - syscall entry code optimizations (Paolo Bonzini)" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (43 commits) x86/mm/cpa: Add missing comment in populate_pdg() x86/mm/cpa: Fix populate_pgd(): Stop trying to deallocate failed PUDs x86/syscalls: Add compat_sys_preadv64v2/compat_sys_pwritev64v2 x86/smp: Remove unnecessary initialization of thread_info::cpu x86/smp: Remove stack_smp_processor_id() x86/uaccess: Move thread_info::addr_limit to thread_struct x86/dumpstack: Rename thread_struct::sig_on_uaccess_error to sig_on_uaccess_err x86/uaccess: Move thread_info::uaccess_err and thread_info::sig_on_uaccess_err to thread_struct x86/dumpstack: When OOPSing, rewind the stack before do_exit() x86/mm/64: In vmalloc_fault(), use CR3 instead of current->active_mm x86/dumpstack/64: Handle faults when printing the "Stack: " part of an OOPS x86/dumpstack: Try harder to get a call trace on stack overflow x86/mm: Remove kernel_unmap_pages_in_pgd() and efi_cleanup_page_tables() x86/mm/cpa: In populate_pgd(), don't set the PGD entry until it's populated x86/mm/hotplug: Don't remove PGD entries in remove_pagetable() x86/mm: Use pte_none() to test for empty PTE x86/mm: Disallow running with 32-bit PTEs to work around erratum x86/mm: Ignore A/D bits in pte/pmd/pud_none() x86/mm: Move swap offset/type up in PTE to work around erratum x86/entry: Inline enter_from_user_mode() ...
2016-07-18Kbuild: arch: look for generated headers in obtreeArnd Bergmann1-1/+1
There are very few files that need add an -I$(obj) gcc for the preprocessor or the assembler. For C files, we add always these for both the objtree and srctree, but for the other ones we require the Makefile to add them, and Kbuild then adds it for both trees. As a preparation for changing the meaning of the -I$(obj) directive to only refer to the srctree, this changes the two instances in arch/x86 to use an explictit $(objtree) prefix where needed, otherwise we won't find the headers any more, as reported by the kbuild 0day builder. arch/x86/realmode/rm/realmode.lds.S:75:20: fatal error: pasyms.h: No such file or directory Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michal Marek <mmarek@suse.com>
2016-07-15Merge branch 'x86/asm' into x86/mm, to resolve conflictsIngo Molnar3-12/+15
Conflicts: tools/testing/selftests/x86/Makefile Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-13x86/mm: Disallow running with 32-bit PTEs to work around erratumDave Hansen5-0/+38
The Intel(R) Xeon Phi(TM) Processor x200 Family (codename: Knights Landing) has an erratum where a processor thread setting the Accessed or Dirty bits may not do so atomically against its checks for the Present bit. This may cause a thread (which is about to page fault) to set A and/or D, even though the Present bit had already been atomically cleared. These bits are truly "stray". In the case of the Dirty bit, the thread associated with the stray set was *not* allowed to write to the page. This means that we do not have to launder the bit(s); we can simply ignore them. If the PTE is used for storing a swap index or a NUMA migration index, the A bit could be misinterpreted as part of the swap type. The stray bits being set cause a software-cleared PTE to be interpreted as a swap entry. In some cases (like when the swap index ends up being for a non-existent swapfile), the kernel detects the stray value and WARN()s about it, but there is no guarantee that the kernel can always detect it. When we have 64-bit PTEs (64-bit mode or 32-bit PAE), we were able to move the swap PTE format around to avoid these troublesome bits. But, 32-bit non-PAE is tight on bits. So, disallow it from running on this hardware. I can't imagine anyone wanting to run 32-bit non-highmem kernels on this hardware, but disallowing them from running entirely is surely the safe thing to do. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: dave.hansen@intel.com Cc: linux-mm@kvack.org Cc: mhocko@suse.com Link: http://lkml.kernel.org/r/20160708001914.D0B50110@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-08x86/mm: Enable KASLR for physical mapping memory regionsThomas Garnier1-0/+3
Add the physical mapping in the list of randomized memory regions. The physical memory mapping holds most allocations from boot and heap allocators. Knowing the base address and physical memory size, an attacker can deduce the PDE virtual address for the vDSO memory page. This attack was demonstrated at CanSecWest 2016, in the following presentation: "Getting Physical: Extreme Abuse of Intel Based Paged Systems": https://github.com/n3k/CansecWest2016_Getting_Physical_Extreme_Abuse_of_Intel_Based_Paging_Systems/blob/master/Presentation/CanSec2016_Presentation.pdf (See second part of the presentation). The exploits used against Linux worked successfully against 4.6+ but fail with KASLR memory enabled: https://github.com/n3k/CansecWest2016_Getting_Physical_Extreme_Abuse_of_Intel_Based_Paging_Systems/tree/master/Demos/Linux/exploits Similar research was done at Google leading to this patch proposal. Variants exists to overwrite /proc or /sys objects ACLs leading to elevation of privileges. These variants were tested against 4.6+. The page offset used by the compressed kernel retains the static value since it is not yet randomized during this boot stage. Signed-off-by: Thomas Garnier <thgarnie@google.com> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Alexander Kuleshov <kuleshovmail@gmail.com> Cc: Alexander Popov <alpopov@ptsecurity.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Baoquan He <bhe@redhat.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Young <dyoung@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Lv Zheng <lv.zheng@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: kernel-hardening@lists.openwall.com Cc: linux-doc@vger.kernel.org Link: http://lkml.kernel.org/r/1466556426-32664-7-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-08x86/mm: Refactor KASLR entropy functionsThomas Garnier1-71/+5
Move the KASLR entropy functions into arch/x86/lib to be used in early kernel boot for KASLR memory randomization. Signed-off-by: Thomas Garnier <thgarnie@google.com> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Alexander Kuleshov <kuleshovmail@gmail.com> Cc: Alexander Popov <alpopov@ptsecurity.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Baoquan He <bhe@redhat.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Young <dyoung@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Lv Zheng <lv.zheng@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: kernel-hardening@lists.openwall.com Cc: linux-doc@vger.kernel.org Link: http://lkml.kernel.org/r/1466556426-32664-2-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-08x86/KASLR: Fix boot crash with certain memory configurationsBaoquan He1-0/+2
Ye Xiaolong reported this boot crash: | | XZ-compressed data is corrupt | | -- System halted | Fix the bug in mem_avoid_overlap() of finding the earliest overlap. Reported-and-tested-by: Ye Xiaolong <xiaolong.ye@intel.com> Signed-off-by: Baoquan He <bhe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-27x86/efi: Remove unused variable 'efi'Colin Ian King1-2/+0
Remove unused variable 'efi', it is never used. This fixes the following clang build warning: arch/x86/boot/compressed/eboot.c:803:2: warning: Value stored to 'efi' is never read Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1466839230-12781-4-git-send-email-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-26x86/KASLR: Allow randomization below the load addressYinghai Lu1-2/+9
Currently the kernel image physical address randomization's lower boundary is the original kernel load address. For bootloaders that load kernels into very high memory (e.g. kexec), this means randomization takes place in a very small window at the top of memory, ignoring the large region of physical memory below the load address. Since mem_avoid[] is already correctly tracking the regions that must be avoided, this patch changes the minimum address to whatever is less: 512M (to conservatively avoid unknown things in lower memory) or the load address. Now, for example, if the kernel is loaded at 8G, [512M, 8G) will be added to the list of possible physical memory positions. Signed-off-by: Yinghai Lu <yinghai@kernel.org> [ Rewrote the changelog, refactored the code to use min(). ] Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: H.J. Lu <hjl.tools@gmail.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1464216334-17200-6-git-send-email-keescook@chromium.org [ Edited the changelog some more, plus the code comment as well. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-26x86/KASLR: Extend kernel image physical address randomization to addresses ↵Kees Cook1-46/+69
larger than 4G We want the physical address to be randomized anywhere between 16MB and the top of physical memory (up to 64TB). This patch exchanges the prior slots[] array for the new slot_areas[] array, and lifts the limitation of KERNEL_IMAGE_SIZE on the physical address offset for 64-bit. As before, process_e820_entry() walks memory and populates slot_areas[], splitting on any detected mem_avoid collisions. Finally, since the slots[] array and its associated functions are not needed any more, so they are removed. Based on earlier patches by Baoquan He. Originally-from: Baoquan He <bhe@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: H.J. Lu <hjl.tools@gmail.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1464216334-17200-5-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-26x86/KASLR: Randomize virtual address separatelyBaoquan He3-48/+64
The current KASLR implementation randomizes the physical and virtual addresses of the kernel together (both are offset by the same amount). It calculates the delta of the physical address where vmlinux was linked to load and where it is finally loaded. If the delta is not equal to 0 (i.e. the kernel was relocated), relocation handling needs be done. On 64-bit, this patch randomizes both the physical address where kernel is decompressed and the virtual address where kernel text is mapped and will execute from. We now have two values being chosen, so the function arguments are reorganized to pass by pointer so they can be directly updated. Since relocation handling only depends on the virtual address, we must check the virtual delta, not the physical delta for processing kernel relocations. This also populates the page table for the new virtual address range. 32-bit does not support a separate virtual address, so it continues to use the physical offset for its virtual offset. Additionally updates the sanity checks done on the resulting kernel addresses since they are potentially separate now. [kees: rewrote changelog, limited virtual split to 64-bit only, update checks] [kees: fix CONFIG_RANDOMIZE_BASE=n boot failure] Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: H.J. Lu <hjl.tools@gmail.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1464216334-17200-4-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-26x86/KASLR: Clarify identity map interfaceKees Cook3-10/+22
This extracts the call to prepare_level4() into a top-level function that the user of the pagetable.c interface must call to initialize the new page tables. For clarity and to match the "finalize" function, it has been renamed to initialize_identity_maps(). This function also gains the initialization of mapping_info so we don't have to do it each time in add_identity_map(). Additionally add copyright notice to the top, to make it clear that the bulk of the pagetable.c code was written by Yinghai, and that I just added bugs later. :) Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: H.J. Lu <hjl.tools@gmail.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1464216334-17200-3-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-26x86/boot: Refuse to build with data relocationsKees Cook1-0/+18
The compressed kernel is built with -fPIC/-fPIE so that it can run in any location a bootloader happens to put it. However, since ELF relocation processing is not happening (and all the relocation information has already been stripped at link time), none of the code can use data relocations (e.g. static assignments of pointers). This is already noted in a warning comment at the top of misc.c, but this adds an explicit check for the condition during the linking stage to block any such bugs from appearing. If this was in place with the earlier bug in pagetable.c, the build would fail like this: ... CC arch/x86/boot/compressed/pagetable.o DATAREL arch/x86/boot/compressed/vmlinux error: arch/x86/boot/compressed/pagetable.o has data relocations! make[2]: *** [arch/x86/boot/compressed/vmlinux] Error 1 ... A clean build shows: ... CC arch/x86/boot/compressed/pagetable.o DATAREL arch/x86/boot/compressed/vmlinux LD arch/x86/boot/compressed/vmlinux ... Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: H.J. Lu <hjl.tools@gmail.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1464216334-17200-2-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-26x86/KASLR, x86/power: Remove x86 hibernation restrictionsKees Cook1-7/+0
With the following fix: 70595b479ce1 ("x86/power/64: Fix crash whan the hibernation code passes control to the image kernel") ... there is no longer a problem with hibernation resuming a KASLR-booted kernel image, so remove the restriction. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Len Brown <len.brown@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Linux PM list <linux-pm@vger.kernel.org> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-doc@vger.kernel.org Link: http://lkml.kernel.org/r/20160613221002.GA29719@www.outflux.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-11Merge branch 'linus' into x86/asm, to pick up fixesIngo Molnar1-0/+3
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-08x86, asm, boot: Use CC_SET()/CC_OUT() in arch/x86/boot/boot.hH. Peter Anvin1-4/+5
Remove open-coded uses of set instructions to use CC_SET()/CC_OUT() in arch/x86/boot/boot.h. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/1465414726-197858-10-git-send-email-hpa@linux.intel.com Reviewed-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2016-06-08x86, asm: use bool for bitops and other assembly outputsH. Peter Anvin3-8/+10
The gcc people have confirmed that using "bool" when combined with inline assembly always is treated as a byte-sized operand that can be assumed to be 0 or 1, which is exactly what the SET instruction emits. Change the output types and intermediate variables of as many operations as practical to "bool". Signed-off-by: H. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/r/1465414726-197858-3-git-send-email-hpa@linux.intel.com Reviewed-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2016-06-07x86, build: copy ldlinux.c32 to image.isoH. Peter Anvin1-0/+3
For newer versions of Syslinux, we need ldlinux.c32 in addition to isolinux.bin to reside on the boot disk, so if the latter is found, copy it, too, to the isoimage tree. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Linux Stable Tree <stable@vger.kernel.org>
2016-05-26Merge branch 'kbuild' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull kbuild updates from Michal Marek: - new option CONFIG_TRIM_UNUSED_KSYMS which does a two-pass build and unexports symbols which are not used in the current config [Nicolas Pitre] - several kbuild rule cleanups [Masahiro Yamada] - warning option adjustments for gcov etc [Arnd Bergmann] - a few more small fixes * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: (31 commits) kbuild: move -Wunused-const-variable to W=1 warning level kbuild: fix if_change and friends to consider argument order kbuild: fix adjust_autoksyms.sh for modules that need only one symbol kbuild: fix ksym_dep_filter when multiple EXPORT_SYMBOL() on the same line gcov: disable -Wmaybe-uninitialized warning gcov: disable tree-loop-im to reduce stack usage gcov: disable for COMPILE_TEST Kbuild: disable 'maybe-uninitialized' warning for CONFIG_PROFILE_ALL_BRANCHES Kbuild: change CC_OPTIMIZE_FOR_SIZE definition kbuild: forbid kernel directory to contain spaces and colons kbuild: adjust ksym_dep_filter for some cmd_* renames kbuild: Fix dependencies for final vmlinux link kbuild: better abstract vmlinux sequential prerequisites kbuild: fix call to adjust_autoksyms.sh when output directory specified kbuild: Get rid of KBUILD_STR kbuild: rename cmd_as_s_S to cmd_cpp_s_S kbuild: rename cmd_cc_i_c to cmd_cpp_i_c kbuild: drop redundant "PHONY += FORCE" kbuild: delete unnecessary "@:" kbuild: mark help target as PHONY ...
2016-05-16Merge branch 'x86-boot-for-linus' of ↵Linus Torvalds17-541/+947
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 boot updates from Ingo Molnar: "The biggest changes in this cycle were: - prepare for more KASLR related changes, by restructuring, cleaning up and fixing the existing boot code. (Kees Cook, Baoquan He, Yinghai Lu) - simplifly/concentrate subarch handling code, eliminate paravirt_enabled() usage. (Luis R Rodriguez)" * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits) x86/KASLR: Clarify purpose of each get_random_long() x86/KASLR: Add virtual address choosing function x86/KASLR: Return earliest overlap when avoiding regions x86/KASLR: Add 'struct slot_area' to manage random_addr slots x86/boot: Add missing file header comments x86/KASLR: Initialize mapping_info every time x86/boot: Comment what finalize_identity_maps() does x86/KASLR: Build identity mappings on demand x86/boot: Split out kernel_ident_mapping_init() x86/boot: Clean up indenting for asm/boot.h x86/KASLR: Improve comments around the mem_avoid[] logic x86/boot: Simplify pointer casting in choose_random_location() x86/KASLR: Consolidate mem_avoid[] entries x86/boot: Clean up pointer casting x86/boot: Warn on future overlapping memcpy() use x86/boot: Extract error reporting functions x86/boot: Correctly bounds-check relocations x86/KASLR: Clean up unused code from old 'run_size' and rename it to 'kernel_total_size' x86/boot: Fix "run_size" calculation x86/boot: Calculate decompression size during boot not build ...
2016-05-10x86/KASLR: Clarify purpose of each get_random_long()Kees Cook1-4/+5
KASLR will be calling get_random_long() twice, but the debug output won't distinguishing between them. This patch adds a report on when it is fetching the physical vs virtual address. With this, once the virtual offset is separate, the report changes from: KASLR using RDTSC... KASLR using RDTSC... into: Physical KASLR using RDTSC... Virtual KASLR using RDTSC... Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Young <dyoung@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: kernel-hardening@lists.openwall.com Cc: lasse.collin@tukaani.org Link: http://lkml.kernel.org/r/1462825332-10505-7-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-10x86/KASLR: Add virtual address choosing functionBaoquan He1-4/+28
To support randomizing the kernel virtual address separately from the physical address, this patch adds find_random_virt_addr() to choose a slot anywhere between LOAD_PHYSICAL_ADDR and KERNEL_IMAGE_SIZE. Since this address is virtual, not physical, we can place the kernel anywhere in this region, as long as it is aligned and (in the case of kernel being larger than the slot size) placed with enough room to load the entire kernel image. For clarity and readability, find_random_addr() is renamed to find_random_phys_addr() and has "size" renamed to "image_size" to match find_random_virt_addr(). Signed-off-by: Baoquan He <bhe@redhat.com> [ Rewrote changelog, refactored slot calculation for readability. ] [ Renamed find_random_phys_addr() and size argument. ] Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Young <dyoung@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: kernel-hardening@lists.openwall.com Cc: lasse.collin@tukaani.org Link: http://lkml.kernel.org/r/1462825332-10505-6-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-10x86/KASLR: Return earliest overlap when avoiding regionsKees Cook1-9/+20
In preparation for being able to detect where to split up contiguous memory regions that overlap with memory regions to avoid, we need to pass back what the earliest overlapping region was. This modifies the overlap checker to return that information. Based on a separate mem_min_overlap() implementation by Baoquan He. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Young <dyoung@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: kernel-hardening@lists.openwall.com Cc: lasse.collin@tukaani.org Link: http://lkml.kernel.org/r/1462825332-10505-5-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-10x86/KASLR: Add 'struct slot_area' to manage random_addr slotsBaoquan He1-0/+29
In order to support KASLR moving the kernel anywhere in physical memory (which could be up to 64TB), we need to handle counting the potential randomization locations in a more efficient manner. In the worst case with 64TB, there could be roughly 32 * 1024 * 1024 randomization slots if CONFIG_PHYSICAL_ALIGN is 0x1000000. Currently the starting address of candidate positions is stored into the slots[] array, one at a time. This method would cost too much memory and it's also very inefficient to get and save the slot information into the slot array one by one. This patch introduces 'struct slot_area' to manage each contiguous region of randomization slots. Each slot_area will contain the starting address and how many available slots are in this area. As with the original code, the slot_areas[] will avoid the mem_avoid[] regions. Since setup_data is a linked list, it could contain an unknown number of memory regions to be avoided, which could cause us to fragment the contiguous memory that the slot_area array is tracking. In normal operation this level of fragmentation will be extremely rare, but we choose a suitably large value (100) for the array. If setup_data forces the slot_area array to become highly fragmented and there are more slots available beyond the first 100 found, the rest will be ignored for KASLR selection. The function store_slot_info() is used to calculate the number of slots available in the passed-in memory region and stores it into slot_areas[] after adjusting for alignment and size requirements. Signed-off-by: Baoquan He <bhe@redhat.com> [ Rewrote changelog, squashed with new functions. ] Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Young <dyoung@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: kernel-hardening@lists.openwall.com Cc: lasse.collin@tukaani.org Link: http://lkml.kernel.org/r/1462825332-10505-4-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>