From 9e3d6223d2093a8903c8f570a06284453ee59944 Mon Sep 17 00:00:00 2001
From: Peter Zijlstra
Date: Fri, 9 Dec 2016 09:30:11 +0100
Subject: math64, timers: Fix 32bit mul_u64_u32_shr() and friends

It turns out that while GCC-4.4 manages to generate 32x32->64 mult
instructions for the 32bit mul_u64_u32_shr() code, any GCC after that
fails horribly.

Fix this by providing an explicit mul_u32_u32() function which can be
architecture provided.

Reported-by: Chris Metcalf
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Chris Metcalf [for tile]
Cc: Christopher S. Hall
Cc: David Gibson
Cc: John Stultz
Cc: Laurent Vivier
Cc: Liav Rehana
Cc: Linus Torvalds
Cc: Parit Bhargava
Cc: Peter Zijlstra
Cc: Richard Cochran
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20161209083011.GD15765@worktop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar
---
 arch/x86/include/asm/div64.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

(limited to 'arch/x86')

diff --git a/arch/x86/include/asm/div64.h b/arch/x86/include/asm/div64.h
index ced283ac79df..af95c47d5c9e 100644
--- a/arch/x86/include/asm/div64.h
+++ b/arch/x86/include/asm/div64.h
@@ -59,6 +59,17 @@ static inline u64 div_u64_rem(u64 dividend, u32 divisor, u32 *remainder)
 }
 #define div_u64_rem	div_u64_rem
 
+static inline u64 mul_u32_u32(u32 a, u32 b)
+{
+	u32 high, low;
+
+	asm ("mull %[b]" : "=a" (low), "=d" (high)
+			 : [a] "a" (a), [b] "rm" (b) );
+
+	return low | ((u64)high) << 32;
+}
+#define mul_u32_u32 mul_u32_u32
+
 #else
 # include
 #endif /* CONFIG_X86_32 */
--
cgit v1.2.3

From 4c45c5167c9563b1a2eee3e2fe954621355e4ca8 Mon Sep 17 00:00:00 2001
From: Jiri Slaby
Date: Thu, 19 Jan 2017 12:47:30 +0100
Subject: x86/timer: Make delay() work during early bootup

When a panic happens during bootup, "Rebooting in X seconds.." is
shown, but the reboot happens immediately. This is because panic() uses
mdelay() and mdelay() calls __const_udelay() immediately, which does not
work while booting.

The per_cpu cpu_info.loops_per_jiffy value is not initialized yet, so
__const_udelay() actually multiplies the number of loops by zero. As a
result, __const_udelay() delays execution by only a nanosecond or so.

So check whether cpu_info.loops_per_jiffy is zero and use
loops_per_jiffy in that case. mdelay() will not be so precise without
proper calibration, but it works relatively well.

Before:
[    0.170039] delaying 100ms
[    0.170828] done

After:
[    0.214042] delaying 100ms
[    0.313974] done

I do not think the added check matters given we are about to spin the
processor in the next few hundred cycles.

Signed-off-by: Jiri Slaby
Reviewed-by: Andy Shevchenko
Acked-by: Thomas Gleixner
Cc: Linus Torvalds
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20170119114730.2670-1-jslaby@suse.cz
[ Minor edits. ]
Signed-off-by: Ingo Molnar
---
 arch/x86/lib/delay.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

(limited to 'arch/x86')

diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c
index 073d1f1a620b..a8e91ae89fb3 100644
--- a/arch/x86/lib/delay.c
+++ b/arch/x86/lib/delay.c
@@ -156,13 +156,13 @@ EXPORT_SYMBOL(__delay);
 
 inline void __const_udelay(unsigned long xloops)
 {
+	unsigned long lpj = this_cpu_read(cpu_info.loops_per_jiffy) ? : loops_per_jiffy;
 	int d0;
 
 	xloops *= 4;
 	asm("mull %%edx"
 		:"=d" (xloops), "=&a" (d0)
-		:"1" (xloops), "0"
-		(this_cpu_read(cpu_info.loops_per_jiffy) * (HZ/4)));
+		:"1" (xloops), "0" (lpj * (HZ / 4)));
 
 	__delay(++xloops);
 }
--
cgit v1.2.3
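
Reviewer note: the arithmetic in both patches can be checked in user space with a rough sketch like the one below. It is not kernel code. The helper names pick_lpj() and delay_loops() are hypothetical, HZ is assumed to be 1000, and the boot-time loops_per_jiffy default of 4096 and the 0x10C7 microsecond scale factor (roughly 2^32 / 10^6) are assumptions used for illustration. The portable mul_u32_u32() stands in for the "mull" asm: the cast asks the compiler for the same widening 32x32->64 multiply.

```c
#include <assert.h>
#include <stdint.h>

#define HZ 1000u /* assumed tick rate for this sketch only */

/* Portable stand-in for the asm version: widen one operand, then multiply. */
static inline uint64_t mul_u32_u32(uint32_t a, uint32_t b)
{
	return (uint64_t)a * b;
}

/*
 * Hypothetical helper mirroring the "?:" fallback in the second patch:
 * use the per-CPU value unless it is still zero, else the global one.
 */
static inline uint32_t pick_lpj(uint32_t cpu_lpj, uint32_t global_lpj)
{
	assert(global_lpj != 0); /* the fallback only helps if it is non-zero */
	return cpu_lpj ? cpu_lpj : global_lpj;
}

/*
 * Hypothetical helper mirroring the __const_udelay() arithmetic: keep the
 * high 32 bits of (xloops * 4) * (lpj * (HZ / 4)) and add one, which is
 * the loop count the function would hand to __delay().
 */
static inline uint32_t delay_loops(uint32_t xloops, uint32_t lpj)
{
	uint64_t product = mul_u32_u32(xloops * 4, lpj * (HZ / 4));

	return (uint32_t)(product >> 32) + 1;
}
```

With these assumptions, a 1000 us request and an uninitialized per-CPU value of zero yield a single loop, which matches the "Before" log. With the fallback value of 4096 the same request yields 4097 loops, i.e. about one jiffy of spinning.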