<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/arch/x86/lib, branch master</title>
<subtitle>Linux Kernel (branches are rebased on master from time to time)</subtitle>
<id>https://sre.ring0.de/linux/atom?h=master</id>
<link rel='self' href='https://sre.ring0.de/linux/atom?h=master'/>
<link rel='alternate' type='text/html' href='https://sre.ring0.de/linux/'/>
<updated>2023-01-03T17:46:06Z</updated>
<entry>
<title>x86/insn: Avoid namespace clash by separating instruction decoder MMIO type from MMIO trace type</title>
<updated>2023-01-03T17:46:06Z</updated>
<author>
<name>Jason A. Donenfeld</name>
<email>Jason@zx2c4.com</email>
</author>
<published>2023-01-01T16:29:04Z</published>
<link rel='alternate' type='text/html' href='https://sre.ring0.de/linux/commit/?id=72bb8f8cc088730c4d84117a6906f458c2fc64bb'/>
<id>urn:sha1:72bb8f8cc088730c4d84117a6906f458c2fc64bb</id>
<content type='text'>
Both &lt;linux/mmiotrace.h&gt; and &lt;asm/insn-eval.h&gt; define various MMIO_ enum
constants whose namespaces overlap.

Rename the &lt;asm/insn-eval.h&gt; ones to have an INSN_ prefix, so that the headers can be
used from the same source file.
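
For illustration, the rename has roughly this shape (a minimal sketch; the
actual constant list in &lt;asm/insn-eval.h&gt; may differ):

  /* Before: clashes with the MMIO_* constants in &lt;linux/mmiotrace.h&gt; */
  enum mmio_type {
  	MMIO_DECODE_FAILED,
  	MMIO_WRITE,
  	MMIO_READ,
  };

  /* After: prefixed, so both headers can be included together */
  enum insn_mmio_type {
  	INSN_MMIO_DECODE_FAILED,
  	INSN_MMIO_WRITE,
  	INSN_MMIO_READ,
  };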

Signed-off-by: Jason A. Donenfeld &lt;Jason@zx2c4.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lore.kernel.org/r/20230101162910.710293-2-Jason@zx2c4.com
</content>
</entry>
<entry>
<title>x86/asm: Fix an assembler warning with current binutils</title>
<updated>2023-01-03T16:55:11Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2023-01-03T15:24:11Z</published>
<link rel='alternate' type='text/html' href='https://sre.ring0.de/linux/commit/?id=55d235361fccef573990dfa5724ab453866e7816'/>
<id>urn:sha1:55d235361fccef573990dfa5724ab453866e7816</id>
<content type='text'>
Fix a warning: "found `movsd'; assuming `movsl' was meant"
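
For context: with no explicit operands, gas can read `movsd' as the SSE2
scalar-double move rather than the 32-bit string move, hence the warning;
spelling it `movsl' is unambiguous. A minimal sketch of the unambiguous
form as inline assembly (illustrative, not the patched kernel code):

  /* Copy n 32-bit words; %edi, %esi and %ecx are the implicit
   * rep-movs operands. */
  static void copy_longs(void *dst, const void *src, unsigned long n)
  {
  	asm volatile("rep movsl"
  		     : "+D" (dst), "+S" (src), "+c" (n)
  		     : : "memory");
  }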

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: linux-kernel@vger.kernel.org
</content>
</entry>
<entry>
<title>Merge tag 'x86_core_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2022-12-14T23:03:00Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-12-14T23:03:00Z</published>
<link rel='alternate' type='text/html' href='https://sre.ring0.de/linux/commit/?id=94a855111ed9106971ca2617c5d075269e6aefde'/>
<id>urn:sha1:94a855111ed9106971ca2617c5d075269e6aefde</id>
<content type='text'>
Pull x86 core updates from Borislav Petkov:

 - Add the call depth tracking mitigation for Retbleed, which has been
   long in the making. It is a lightweight, software-only fix for
   Skylake-based cores where enabling IBRS is a big hammer and causes a
   significant performance impact.

   What it basically does is align all kernel functions to a 16-byte
   boundary and add 16 bytes of padding before each function. objtool
   collects all function locations and, when the mitigation gets
   applied, patches in a call to an accounting thunk which is used to
   track the call depth of the stack at any time.

   When that call depth reaches a magical, microarchitecture-specific
   value for the Return Stack Buffer, the code stuffs that RSB and
   avoids its underflow, which could otherwise lead to the Intel variant
   of Retbleed.

   This software-only solution brings a lot of the lost performance
   back, as benchmarks suggest:

       https://lore.kernel.org/all/20220915111039.092790446@infradead.org/

   That page also contains a much more detailed explanation of the
   whole mechanism.

 - Implement a new control flow integrity scheme called FineIBT, which
   is based on the software kCFI implementation and uses hardware IBT
   support, where present, to annotate and track indirect branches,
   validating them with a hash (a toy model is sketched after this list)

 - Other misc fixes and cleanups
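
For the FineIBT item above, a toy model of the idea in plain C
(illustrative only; the actual scheme patches the hash check into each
function's padding and keeps the hash in a register):

  #include &lt;stdint.h&gt;

  #define PROTO_HASH 0x5f0c1u		/* hypothetical per-prototype hash */

  static uint32_t announced_hash;	/* stands in for the hash register */

  static void callee(void)
  {
  	if (announced_hash != PROTO_HASH)	/* callee-side check */
  		__builtin_trap();		/* mismatch: trap (ud2) */
  	/* ... function body ... */
  }

  static void indirect_call(void (*fp)(void))
  {
  	announced_hash = PROTO_HASH;	/* caller announces the hash */
  	fp();				/* hardware IBT checks the landing pad */
  }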

* tag 'x86_core_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (80 commits)
  x86/paravirt: Use common macro for creating simple asm paravirt functions
  x86/paravirt: Remove clobber bitmask from .parainstructions
  x86/debug: Include percpu.h in debugreg.h to get DECLARE_PER_CPU() et al
  x86/cpufeatures: Move X86_FEATURE_CALL_DEPTH from bit 18 to bit 19 of word 11, to leave space for WIP X86_FEATURE_SGX_EDECCSSA bit
  x86/Kconfig: Enable kernel IBT by default
  x86,pm: Force out-of-line memcpy()
  objtool: Fix weak hole vs prefix symbol
  objtool: Optimize elf_dirty_reloc_sym()
  x86/cfi: Add boot time hash randomization
  x86/cfi: Boot time selection of CFI scheme
  x86/ibt: Implement FineIBT
  objtool: Add --cfi to generate the .cfi_sites section
  x86: Add prefix symbols for function padding
  objtool: Add option to generate prefix symbols
  objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf
  objtool: Slice up elf_create_section_symbol()
  kallsyms: Revert "Take callthunks into account"
  x86: Unconfuse CONFIG_ and X86_FEATURE_ namespaces
  x86/retpoline: Fix crash printing warning
  x86/paravirt: Fix a !PARAVIRT build warning
  ...
</content>
</entry>
<entry>
<title>Merge tag 'x86_asm_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2022-12-13T22:40:54Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-12-13T22:40:54Z</published>
<link rel='alternate' type='text/html' href='https://sre.ring0.de/linux/commit/?id=8b9ed79c2d587bec5f603d66801478a5af9af842'/>
<id>urn:sha1:8b9ed79c2d587bec5f603d66801478a5af9af842</id>
<content type='text'>
Pull x86 asm updates from Borislav Petkov:

 - Move the 32-bit memmove() asm implementation out-of-line in order to
   fix a 32-bit full LTO build failure with clang where it would fail at
   register allocation.

   Move it to an asm file and clean it up while at it, similar to what
   has already been done on 64-bit

* tag 'x86_asm_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mem: Move memmove to out of line assembler
</content>
</entry>
<entry>
<title>Merge tag 'v6.1-rc6' into x86/core, to resolve conflicts</title>
<updated>2022-11-21T22:01:51Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2022-11-21T21:54:36Z</published>
<link rel='alternate' type='text/html' href='https://sre.ring0.de/linux/commit/?id=0ce096db719ebaf46d4faf93e1ed1341c1853919'/>
<id>urn:sha1:0ce096db719ebaf46d4faf93e1ed1341c1853919</id>
<content type='text'>
Resolve conflicts between these commits in arch/x86/kernel/asm-offsets.c:

 # upstream:
 debc5a1ec0d1 ("KVM: x86: use a separate asm-offsets.c file")

 # retbleed work in x86/core:
 5d8213864ade ("x86/retbleed: Add SKL return thunk")

... and these commits in include/linux/bpf.h:

  # upstream:
  18acb7fac22f ("bpf: Revert ("Fix dispatcher patchable function entry to 5 bytes nop")")

  # x86/core commits:
  931ab63664f0 ("x86/ibt: Implement FineIBT")
  bea75b33895f ("x86/Kconfig: Introduce function padding")

The latter two modify BPF_DISPATCHER_ATTRIBUTES(), which was removed upstream.

 Conflicts:
	arch/x86/kernel/asm-offsets.c
	include/linux/bpf.h

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>x86/uaccess: instrument copy_from_user_nmi()</title>
<updated>2022-11-08T23:57:24Z</updated>
<author>
<name>Alexander Potapenko</name>
<email>glider@google.com</email>
</author>
<published>2022-11-02T11:06:08Z</published>
<link rel='alternate' type='text/html' href='https://sre.ring0.de/linux/commit/?id=11385b2612004298ac2fbc9877e73f1410cfd3c0'/>
<id>urn:sha1:11385b2612004298ac2fbc9877e73f1410cfd3c0</id>
<content type='text'>
Make sure usercopy hooks from linux/instrumented.h are invoked for
copy_from_user_nmi().  This fixes KMSAN false positives reported when
dumping opcodes for a stack trace.
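
The shape of the fix, assuming the instrument_copy_from_user_before()/
instrument_copy_from_user_after() hook pair from &lt;linux/instrumented.h&gt;
(a sketch, not the verbatim patch):

  unsigned long
  copy_from_user_nmi(void *to, const void __user *from, unsigned long n)
  {
  	unsigned long ret;

  	/* ... access_ok() / nmi_uaccess_okay() checks elided ... */
  	pagefault_disable();
  	instrument_copy_from_user_before(to, from, n);
  	ret = __copy_from_user_inatomic(to, from, n);
  	instrument_copy_from_user_after(to, from, n, ret);
  	pagefault_enable();

  	return ret;
  }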

Link: https://lkml.kernel.org/r/20221102110611.1085175-2-glider@google.com
Signed-off-by: Alexander Potapenko &lt;glider@google.com&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Marco Elver &lt;elver@google.com&gt;
Cc: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
Cc: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>x86/mem: Move memmove to out of line assembler</title>
<updated>2022-11-01T22:44:07Z</updated>
<author>
<name>Nick Desaulniers</name>
<email>ndesaulniers@google.com</email>
</author>
<published>2022-10-18T17:21:55Z</published>
<link rel='alternate' type='text/html' href='https://sre.ring0.de/linux/commit/?id=bce5a1e8a34006a5e80213ede5e5c465d53f1dce'/>
<id>urn:sha1:bce5a1e8a34006a5e80213ede5e5c465d53f1dce</id>
<content type='text'>
When building ARCH=i386 with CONFIG_LTO_CLANG_FULL=y, it's possible
(depending on additional configs which I have not been able to isolate)
to observe a failure during register allocation:

  error: inline assembly requires more registers than available

when memmove is inlined into tcp_v4_fill_cb() or tcp_v6_fill_cb().

memmove is quite large and probably shouldn't be inlined due to size
alone. A noinline function attribute would be the simplest fix, but
there are a few things that stand out with the current definition:

In addition to having complex constraints that can't always be
resolved, the clobber list seems to be missing %bx. And because it uses
numbered rather than symbolic operands, the constraints are quite
obnoxious to refactor.

Having a large function be 99% inline asm is a code smell that this
function should simply be written in stand-alone out-of-line assembler.

Moving this to out-of-line assembler guarantees that the compiler
cannot inline calls to memmove.

This has been done previously for 64b:
commit 9599ec0471de ("x86-64, mem: Convert memmove() to assembly file
and fix return value bug")

That gives the opportunity for other cleanups like fixing the
inconsistent use of tabs vs spaces and instruction suffixes, and the
label 3 appearing twice.  Symbolic operands, local labels, and
additional comments would provide this code with a fresh coat of paint.

Finally, add a test that exercises the `rep movsl` implementation for
correctness, since it has implicit operands.
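
In that spirit, a userspace sketch of the kind of overlap check such a
test performs (illustrative only, not the kernel's actual test code):

  #include &lt;assert.h&gt;
  #include &lt;string.h&gt;

  static void test_overlapping_memmove(void)
  {
  	char buf[16];
  	int i;

  	for (i = 0; i &lt; 16; i++)
  		buf[i] = (char)i;
  	memmove(buf + 4, buf, 8);	/* dst &gt; src: needs a backward copy */
  	assert(buf[4] == 0 &amp;&amp; buf[11] == 7);

  	for (i = 0; i &lt; 16; i++)
  		buf[i] = (char)i;
  	memmove(buf, buf + 4, 8);	/* dst &lt; src: a forward copy is safe */
  	assert(buf[0] == 4 &amp;&amp; buf[7] == 11);
  }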

Suggested-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Suggested-by: David Laight &lt;David.Laight@aculab.com&gt;
Signed-off-by: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Signed-off-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Tested-by: Kees Cook &lt;keescook@chromium.org&gt;
Tested-by: Nathan Chancellor &lt;nathan@kernel.org&gt;
Link: https://lore.kernel.org/all/20221018172155.287409-1-ndesaulniers%40google.com
</content>
</entry>
<entry>
<title>x86/calldepth: Add ret/call counting for debug</title>
<updated>2022-10-17T14:41:16Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2022-09-15T11:11:30Z</published>
<link rel='alternate' type='text/html' href='https://sre.ring0.de/linux/commit/?id=f5c1bb2afe93396d41c5cbdcb909b08a75b8dde4'/>
<id>urn:sha1:f5c1bb2afe93396d41c5cbdcb909b08a75b8dde4</id>
<content type='text'>
Add a debugfs mechanism to validate the accounting, e.g. the call/ret
balance, and to gather statistics about the stuffing-to-call ratio.
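
A hypothetical sketch of such a debugfs interface (the names and file
layout are invented for illustration, not the commit's actual code):

  #include &lt;linux/debugfs.h&gt;
  #include &lt;linux/init.h&gt;

  static u64 ctr_calls, ctr_rets, ctr_stuffs;

  static int __init callthunk_debugfs_init(void)
  {
  	struct dentry *dir = debugfs_create_dir("callthunks", NULL);

  	debugfs_create_u64("calls",  0444, dir, &amp;ctr_calls);
  	debugfs_create_u64("rets",   0444, dir, &amp;ctr_rets);
  	debugfs_create_u64("stuffs", 0444, dir, &amp;ctr_stuffs);
  	return 0;
  }
  late_initcall(callthunk_debugfs_init);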

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20220915111148.204285506@infradead.org
</content>
</entry>
<entry>
<title>x86/retpoline: Add SKL retthunk retpolines</title>
<updated>2022-10-17T14:41:15Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2022-09-15T11:11:28Z</published>
<link rel='alternate' type='text/html' href='https://sre.ring0.de/linux/commit/?id=3b6c1747da48ff40ab746b0e860cffe83619f5c5'/>
<id>urn:sha1:3b6c1747da48ff40ab746b0e860cffe83619f5c5</id>
<content type='text'>
Ensure that retpolines do the proper call accounting so that the return
accounting works correctly.

Specifically: retpolines are used to replace both 'jmp *%reg' and
'call *%reg', however these two cases do not have the same accounting
requirements. Therefore split things up and provide two different
retpoline arrays for SKL.

The 'jmp *%reg' case needs no accounting; the
__x86_indirect_jump_thunk_array[] covers this. The retpoline is
changed to not use the return thunk; it's a simple call;ret construct.

[ strictly speaking it should do:
	andq $(~0x1f), PER_CPU_VAR(__x86_call_depth)
  but we can argue this can be covered by the fuzz we already have
  in the accounting depth (12) vs the RSB depth (16) ]

The 'call *%reg' case does need accounting; the
__x86_indirect_call_thunk_array[] covers this. Again, this retpoline
avoids the use of the return thunk, in this case to avoid double
accounting.
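
For illustration, the call;ret shape of the jump flavour as file-scope
asm (a sketch; the kernel's thunks live in arch/x86/lib/retpoline.S and
include details omitted here):

  asm(
  ".globl jump_thunk_rax\n"
  "jump_thunk_rax:\n"
  "	call 1f\n"		/* the call fills one RSB entry */
  "1:	mov %rax, (%rsp)\n"	/* overwrite return address with target */
  "	ret\n"			/* the ret consumes it: net accounting zero */
  );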

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lore.kernel.org/r/20220915111147.996634749@infradead.org
</content>
</entry>
<entry>
<title>x86/retbleed: Add SKL return thunk</title>
<updated>2022-10-17T14:41:15Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2022-09-15T11:11:27Z</published>
<link rel='alternate' type='text/html' href='https://sre.ring0.de/linux/commit/?id=5d8213864ade86b48fc492584ea86d65a62f892e'/>
<id>urn:sha1:5d8213864ade86b48fc492584ea86d65a62f892e</id>
<content type='text'>
To address the Intel SKL RSB underflow issue in software it's required to
do call depth tracking.

Provide a return thunk for call depth tracking on Intel SKL CPUs.

The tracking does not use a counter. It uses arithmetic shift
right on call entry and logical shift left on return.

The depth tracking variable is initialized to 0x8000.... when the call
depth is zero. The arithmetic shift right sign-extends the MSB and
saturates after the 12th call. The shift count is 5, so the tracking
covers 12 nested calls. On return, the variable is shifted left
logically so it becomes zero again.

 CALL	 	   	RET
 0: 0x8000000000000000	0x0000000000000000
 1: 0xfc00000000000000	0xf000000000000000
...
11: 0xfffffffffffffff8	0xfffffffffffffc00
12: 0xffffffffffffffff	0xffffffffffffffe0

After a return buffer fill the depth is credited 12 calls before the next
stuffing has to take place.
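
A userspace simulation of the arithmetic (illustrative only; the kernel
applies these shifts to a per-CPU variable from the call and return
thunks, and the exact stuffing trigger is an assumption here):

  #include &lt;stdint.h&gt;
  #include &lt;stdio.h&gt;

  static uint64_t depth = 0x8000000000000000ull;	/* call depth 0 */

  static void on_call(void)
  {
  	/* SAR: the sign bit extends; saturates to all ones after ~12 calls */
  	depth = (uint64_t)((int64_t)depth &gt;&gt; 5);
  }

  static int on_ret(void)
  {
  	depth &lt;&lt;= 5;		/* SHL: consumes the credit of one call */
  	return depth == 0;	/* no credit left: stuff the RSB */
  }

  int main(void)
  {
  	int i;

  	for (i = 1; i &lt;= 13; i++) {
  		on_call();
  		printf("call %2d: 0x%016llx\n", i, (unsigned long long)depth);
  	}
  	while (!on_ret())
  		printf("ret:      0x%016llx\n", (unsigned long long)depth);
  	return 0;
  }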

There is an inaccuracy for situations like this:

   10 calls
    5 returns
    3 calls
    4 returns
    3 calls
    ....

The shift count might cause this to be off by one in either direction, but
there is still a cushion vs. the RSB depth. The algorithm does not claim to
be perfect, but it should obfuscate the problem enough to make exploitation
extremely difficult.

The theory behind this is:

RSB is a stack with depth 16 which is filled on every call. On the return
path speculation "pops" entries to speculate down the call chain. Once the
speculative RSB is empty it switches to other predictors, e.g. the Branch
History Buffer, which can be mistrained by user space and misguide the
speculation path to a gadget.

Call depth tracking is designed to break this speculation path by stuffing
speculation trap calls into the RSB which never get a corresponding
return executed. This stalls the prediction path until it gets resteered.

The assumption is that stuffing at the 12th return is sufficient to break
the speculation before it hits the underflow and the fallback to the other
predictors. Testing confirms that it works. Johannes, one of the retbleed
researchers, tried to attack this approach but failed.

There is obviously no scientific proof that this will withstand future
research progress, but all we can do right now is to speculate about it.

The SAR/SHL usage was suggested by Andi Kleen.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20220915111147.890071690@infradead.org
</content>
</entry>
</feed>
