summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2018-10-14powerpc/book3s64: Avoid multiple endian conversion in pte helpersChristophe Leroy1-39/+32
In the same spirit as already done in pte query helpers, this patch changes pte setting helpers to perform endian conversions on the constants rather than on the pte value. In the meantime, it changes pte_access_permitted() to use pte helpers for the same reason. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-14powerpc/8xx: change name of a few page flags to avoid confusionChristophe Leroy4-19/+19
_PAGE_PRIVILEGED corresponds to the SH bit which doesn't protect against user access but only disables ASID verification on kernel accesses. User access is controlled with _PMD_USER flag. Name it _PAGE_SH instead of _PAGE_PRIVILEGED _PAGE_HUGE corresponds to the SPS bit which doesn't really tells that's it is a huge page but only that it is not a 4k page. Name it _PAGE_SPS instead of _PAGE_HUGE Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-14powerpc/mm: Get rid of pte-common.hChristophe Leroy2-27/+24
Do not include pte-common.h in nohash/32/pgtable.h As that was the last includer, get rid of pte-common.h Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-14powerpc/mm: Define platform default caches related flagsChristophe Leroy3-11/+7
Cache related flags like _PAGE_COHERENT and _PAGE_WRITETHRU are defined on most platforms. The platforms not defining them don't define any alternative. So we can give them a NUL value directly for those platforms directly. Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-14powerpc/mm: Allow platforms to redefine some helpersChristophe Leroy6-44/+91
The 40xx defines _PAGE_HWWRITE while others don't. The 8xx defines _PAGE_RO instead of _PAGE_RW. The 8xx defines _PAGE_PRIVILEGED instead of _PAGE_USER. The 8xx defines _PAGE_HUGE and _PAGE_NA while others don't. Lets those platforms redefine pte_write(), pte_wrprotect() and pte_mkwrite() and get _PAGE_RO and _PAGE_HWWRITE off the common helpers. Lets the 8xx redefine pte_user(), pte_mkprivileged() and pte_mkuser() and get rid of _PAGE_PRIVILEGED and _PAGE_USER default values. Lets the 8xx redefine pte_mkhuge() and get rid of _PAGE_HUGE default value. Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-14powerpc/nohash/64: do not include pte-common.hChristophe Leroy3-36/+44
nohash/64 only uses book3e PTE flags, so it doesn't need pte-common.h This also allows to drop PAGE_SAO and H_PAGE_4K_PFN from pte_common.h as they are only used by PPC64 Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-14powerpc/mm: Distribute platform specific PAGE and PMD flags and definitionsChristophe Leroy6-66/+159
The base kernel PAGE_XXXX definition sets are more or less platform specific. Lets distribute them close to platform _PAGE_XXX flags definition, and customise them to their exact platform flags. Also defines _PAGE_PSIZE and _PTE_NONE_MASK for each platform allthough they are defined as 0. Do the same with _PMD flags like _PMD_USER and _PMD_PRESENT_MASK Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-14powerpc/mm: Move pte_user() into nohash/pgtable.hChristophe Leroy2-13/+10
Now the pte-common.h is only for nohash platforms, lets move pte_user() helper out of pte-common.h to put it together with other helpers. Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-14powerpc/book3s/32: do not include pte-common.hChristophe Leroy2-17/+101
As done for book3s/64, add necessary flags/defines in book3s/32/pgtable.h and do not include pte-common.h It allows in the meantime to remove all related hash definitions from pte-common.h and to also remove _PAGE_EXEC default as _PAGE_EXEC is defined on all platforms except book3s/32. Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-14powerpc/mm: move __P and __S tables in the common pgtable.hChristophe Leroy3-40/+19
__P and __S flags are the same for all platform and should remain as is in the future, so avoid duplication. Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-14powerpc/mm: drop unused page flagsChristophe Leroy2-25/+2
The following page flags in pte-common.h can be dropped: _PAGE_ENDIAN is only used in mm/fsl_booke_mmu.c and is defined in asm/nohash/32/pte-fsl-booke.h _PAGE_4K_PFN is nowhere defined nor used _PAGE_READ, _PAGE_WRITE and _PAGE_PTE are only defined and used in book3s/64 The following page flags in book3s/64/pgtable.h can be dropped as they are not used on this platform nor by common code. _PAGE_NA, _PAGE_RO, _PAGE_USER and _PAGE_PSIZE Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-14powerpc/mm: Split dump_pagelinuxtables flag_array tableChristophe Leroy6-153/+307
To reduce the complexity of flag_array, and allow the removal of default 0 value of non existing flags, lets have one flag_array table for each platform family with only the really existing flags. Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-14powerpc/mm: use pte helpers in generic codeChristophe Leroy7-45/+41
Get rid of platform specific _PAGE_XXXX in powerpc common code and use helpers instead. mm/dump_linuxpagetables.c will be handled separately Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-14powerpc/mm: don't use _PAGE_EXEC for calling hash_preload()Christophe Leroy5-8/+10
The 'access' parameter of hash_preload() is either 0 or _PAGE_EXEC. Among the two versions of hash_preload(), only the PPC64 one is doing something with this 'access' parameter. In order to remove the use of _PAGE_EXEC outside platform code, 'access' parameter is replaced by 'is_exec' which will be either true of false, and the PPC64 version of hash_preload() creates the access flag based on 'is_exec'. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-14powerpc/mm: add pte helpers to query and change pte flagsChristophe Leroy5-0/+114
In order to avoid using generic _PAGE_XXX flags in powerpc core functions, define helpers for all needed flags: - pte_mkuser() and pte_mkprivileged() to set/unset and/or unset/set _PAGE_USER and/or _PAGE_PRIVILEGED - pte_hashpte() to check if _PAGE_HASHPTE is set. - pte_ci() check if cache is inhibited (already existing on book3s/64) - pte_exprotect() to protect against execution - pte_exec() and pte_mkexec() to query and set page execution - pte_mkpte() to set _PAGE_PTE flag. - pte_hw_valid() to check _PAGE_PRESENT since pte_present does something different on book3s/64. On book3s/32 there is no exec protection, so pte_mkexec() and pte_exprotect() are nops and pte_exec() returns always true. Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-14powerpc/mm: move some nohash pte helpers in nohash/[32:64]/pgtable.hChristophe Leroy3-28/+48
In order to allow their use in nohash/32/pgtable.h, we have to move the following helpers in nohash/[32:64]/pgtable.h: - pte_mkwrite() - pte_mkdirty() - pte_mkyoung() - pte_wrprotect() Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-14powerpc/mm: don't use _PAGE_EXEC in book3s/32Christophe Leroy2-4/+4
book3s/32 doesn't define _PAGE_EXEC, so no need to use it. All other platforms define _PAGE_EXEC so no need to check it is not NUL when not book3s/32. Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-14powerpc: handover page flags with a pgprot_t parameterChristophe Leroy20-77/+64
In order to avoid multiple conversions, handover directly a pgprot_t to map_kernel_page() as already done for radix. Do the same for __ioremap_caller() and __ioremap_at(). Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-14powerpc/mm: properly set PAGE_KERNEL flags in ioremap()Christophe Leroy7-33/+24
Set PAGE_KERNEL directly in the caller and do not rely on a hack adding PAGE_KERNEL flags when _PAGE_PRESENT is not set. As already done for PPC64, use pgprot_cache() helpers instead of _PAGE_XXX flags in PPC32 ioremap() derived functions. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-14powerpc: don't use ioremap_prot() nor __ioremap() unless really needed.Christophe Leroy6-11/+10
In many places, ioremap_prot() and __ioremap() can be replaced with higher level functions like ioremap(), ioremap_coherent(), ioremap_cache(), ioremap_wc() ... Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-14soc/fsl/qbman: use ioremap_cache() instead of ioremap_prot(0)Christophe Leroy1-1/+1
ioremap_prot() with flag set to 0 relies on a hack in __ioremap_caller() which adds PAGE_KERNEL flags when the handed flags don't look like a valid set of flags (ie don't include _PAGE_PRESENT) The intention being to map cached memory, use ioremap_cache() instead. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13drivers/block/z2ram: use ioremap_wt() instead of __ioremap(_PAGE_WRITETHRU)Christophe Leroy1-2/+1
_PAGE_WRITETHRU is a target specific flag. Prefer generic functions. Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13drivers/video/fbdev: use ioremap_wc/wt() instead of __ioremap()Christophe Leroy4-16/+9
_PAGE_NO_CACHE is a platform specific flag. In addition, this flag is misleading because one would think it requests a noncached page whereas a noncached page is _PAGE_NO_CACHE | _PAGE_GUARDED _PAGE_NO_CACHE alone means write combined noncached page, so lets use ioremap_wc() instead. _PAGE_WRITETHRU is also platform specific flag. Use ioremap_wt() instead. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc/32: Add ioremap_wt() and ioremap_coherent()Christophe Leroy3-0/+35
Other arches have ioremap_wt() to map IO areas write-through. Implement it on PPC as well in order to avoid drivers using __ioremap(_PAGE_WRITETHRU) Also implement ioremap_coherent() to avoid drivers using __ioremap(_PAGE_COHERENT) Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc/rtas: Fix a potential race between CPU-Offline & MigrationGautham R. Shenoy1-0/+9
Live Partition Migrations require all the present CPUs to execute the H_JOIN call, and hence rtas_ibm_suspend_me() onlines any offline CPUs before initiating the migration for this purpose. The commit 85a88cabad57 ("powerpc/pseries: Disable CPU hotplug across migrations") disables any CPU-hotplug operations once all the offline CPUs are brought online to prevent any further state change. Once the CPU-Hotplug operation is disabled, the code assumes that all the CPUs are online. However, there is a minor window in rtas_ibm_suspend_me() between onlining the offline CPUs and disabling CPU-Hotplug when a concurrent CPU-offline operations initiated by the userspace can succeed thereby nullifying the the aformentioned assumption. In this unlikely case these offlined CPUs will not call H_JOIN, resulting in a system hang. Fix this by verifying that all the present CPUs are actually online after CPU-Hotplug has been disabled, failing which we restore the state of the offline CPUs in rtas_ibm_suspend_me() and return an -EBUSY. Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc/cacheinfo: Report the correct shared_cpu_map on big-coresGautham R. Shenoy1-2/+35
Currently on POWER9 SMT8 cores systems, in sysfs, we report the shared_cache_map for L1 caches (both data and instruction) to be the cpu-ids of the threads in SMT8 cores. This is incorrect since on POWER9 SMT8 cores there are two groups of threads, each of which shares its own L1 cache. This patch addresses this by reporting the shared_cpu_map correctly in sysfs for L1 caches. Before the patch /sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_map : 000000ff /sys/devices/system/cpu/cpu0/cache/index1/shared_cpu_map : 000000ff /sys/devices/system/cpu/cpu1/cache/index0/shared_cpu_map : 000000ff /sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map : 000000ff After the patch /sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_map : 00000055 /sys/devices/system/cpu/cpu0/cache/index1/shared_cpu_map : 00000055 /sys/devices/system/cpu/cpu1/cache/index0/shared_cpu_map : 000000aa /sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map : 000000aa Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc: Use cpu_smallcore_sibling_mask at SMT level on bigcoresGautham R. Shenoy1-1/+18
POWER9 SMT8 cores consist of two groups of threads, where threads in each group shares L1-cache. The scheduler is not aware of this distinction as the current sched-domain hierarchy has all the threads of the core defined at the SMT domain. SMT [Thread siblings of the SMT8 core] DIE [CPUs in the same die] NUMA [All the CPUs in the system] Due to this, we can observe run-to-run variance when we run a multi-threaded benchmark bound to a single core based on how the scheduler spreads the software threads across the two groups in the core. We fix this in this patch by defining each group of threads which share L1-cache to be the SMT level. The group of threads in the SMT8 core is defined to be the CACHE level. The sched-domain hierarchy after this patch will be : SMT [Thread siblings in the core that share L1 cache] CACHE [Thread siblings that are in the SMT8 core] DIE [CPUs in the same die] NUMA [All the CPUs in the system] Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc: Detect the presence of big-cores via "ibm, thread-groups"Gautham R. Shenoy3-0/+235
On IBM POWER9, the device tree exposes a property array identifed by "ibm,thread-groups" which will indicate which groups of threads share a particular set of resources. As of today we only have one form of grouping identifying the group of threads in the core that share the L1 cache, translation cache and instruction data flow. This patch adds helper functions to parse the contents of "ibm,thread-groups" and populate a per-cpu variable to cache information about siblings of each CPU that share the L1, traslation cache and instruction data-flow. It also defines a new global variable named "has_big_cores" which indicates if the cores on this configuration have multiple groups of threads that share L1 cache. For each online CPU, it maintains a cpu_smallcore_mask, which indicates the online siblings which share the L1-cache with it. Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc: Fix stackprotector detection for non-glibc toolchainsMichael Ellerman1-1/+2
If GCC is not built with glibc support then we must explicitly tell it which register to use for TLS mode stack protector, otherwise it will error out and the cc-option check will fail. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc/xmon: Show the stack protector canary in xmonMichael Ellerman1-0/+3
This is helpful for debugging stack protector crashes. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc: Use SWITCH_FRAME_SIZE for prom and rtas entryJoel Stanley2-14/+5
Commit 6c1719942e19 ("powerpc/of: Remove useless register save/restore when calling OF back") removed the saving of srr0 and srr1 when calling into OpenFirmware. Commit e31aa453bbc4 ("powerpc: Use LOAD_REG_IMMEDIATE only for constants on 64-bit") did the same for rtas. This means we don't need to save the extra stack space and can use the common SWITCH_FRAME_SIZE. There were already no users of _SRR0 and _SRR1 so we can remove them too. Link: https://github.com/linuxppc/linux/issues/83 Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc/pseries/mobility: Extend start/stop topology update scopeMichael Bringmann3-2/+11
The powerpc mobility code may receive RTAS requests to perform PRRN (Platform Resource Reassignment Notification) topology changes at any time, including during LPAR migration operations. In some configurations where the affinity of CPUs or memory is being changed on that platform, the PRRN requests may apply or refer to outdated information prior to the complete update of the device-tree. This patch changes the duration for which topology updates are suppressed during LPAR migrations from just the rtas_ibm_suspend_me() / 'ibm,suspend-me' call(s) to cover the entire migration_store() operation to allow all changes to the device-tree to be applied prior to accepting and applying any PRRN requests. For tracking purposes, pr_info notices are added to the functions start_topology_update() and stop_topology_update() of 'numa.c'. Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc/Makefile: Fix PPC_BOOK3S_64 ASFLAGSJoel Stanley1-1/+5
Ever since commit 15a3204d24a3 ("powerpc/64s: Set assembler machine type to POWER4") we force -mpower4 to be passed to the assembler irrespective of the CFLAGS used (for Book3s 64). When building a powerpc64 kernel with clang, clang will not add -many to the assembler flags, so any instructions that the compiler has generated that are not available on power4 will cause an error: /usr/bin/as -a64 -mppc64 -mlittle-endian -mpower8 \ -I ./arch/powerpc/include -I ./arch/powerpc/include/generated \ -I ./include -I ./arch/powerpc/include/uapi \ -I ./arch/powerpc/include/generated/uapi -I ./include/uapi \ -I ./include/generated/uapi -I arch/powerpc -I arch/powerpc \ -maltivec -mpower4 -o init/do_mounts.o /tmp/do_mounts-3b0a3d.s /tmp/do_mounts-51ce54.s:748: Error: unrecognized opcode: `isel' GCC does include -many, so the GCC driven gas call will succeed: as -v -I ./arch/powerpc/include -I ./arch/powerpc/include/generated -I ./include -I ./arch/powerpc/include/uapi -I ./arch/powerpc/include/generated/uapi -I ./include/uapi -I ./include/generated/uapi -I arch/powerpc -I arch/powerpc -a64 -mpower8 -many -mlittle -maltivec -mpower4 -o init/do_mounts.o Note that isel is power7 and above for IBM CPUs. GCC only generates it for Power9 and above, but the above test was run against the clang generated assembly. Peter Bergner explains: When using -many -mpower4, gas will first try and find a matching power4 mnemonic and failing that, it will then allow any valid mnemonic that gas knows about. GCC's use of -many predates me though. IIRC, Alan looked at trying to remove it, but I forget why he didn't. Could be either a gcc or gas issue at the time. I'm not sure whether issue still exists or not. He and I have modified how gas works internally a fair amount since he tried removing gcc use of -many. I will also note that when using -many, gas will choose the first mnemonic that matches in the mnemonic table and we have (mostly) sorted the table so that server mnemonics show up earlier in the table than other mnemonics, so they'll be seen/chosen first. By explicitly setting -many we can build with Clang and GCC while retaining the -mpower4 option. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc/pseries/memory-hotplug: Fix return value type of find_aa_indexYueHaibing1-33/+28
The variable 'aa_index' is defined as an unsigned value in update_lmb_associativity_index(), but find_aa_index() may return -1 when dlpar_clone_property() fails. So change find_aa_index() to return a bool, which indicates whether 'aa_index' was found or not. Fixes: c05a5a40969e ("powerpc/pseries: Dynamic add entires to associativity lookup array") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Nathan Fontenot nfont@linux.vnet.ibm.com> [mpe: Tweak changelog, rename is_found to just found] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc/eeh: Cleanup control flow in eeh_handle_normal_event()Sam Bobroff1-102/+94
Rather than mixing "if (state)" blocks and gotos, convert entirely to "if (state)" blocks to make the state machine behaviour clearer. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc/eeh: Cleanup eeh_ops.wait_state()Sam Bobroff6-108/+62
The wait_state member of eeh_ops does not need to be platform dependent; it's just logic around eeh_ops.get_state(). Therefore, merge the two (slightly different!) platform versions into a new function, eeh_wait_state() and remove the eeh_ops member. While doing this, also correct: * The wait logic, so that it never waits longer than max_wait. * The wait logic, so that it never waits less than EEH_STATE_MIN_WAIT_TIME. * One call site where the result is treated like a bit field before it's checked for negative error values. * In pseries_eeh_get_state(), rename the "state" parameter to "delay" because that's what it is. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc/eeh: Cleanup eeh_pe_state_mark()Sam Bobroff6-53/+46
Currently, eeh_pe_state_mark() marks a PE (and it's children) with a state and then performs additional processing if that state included EEH_PE_ISOLATED. The state parameter is always a constant at the call site, so rearrange eeh_pe_state_mark() into two functions and just call the appropriate one at each site. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc/eeh: Cleanup unnecessary eeh_pe_state_mark_with_cfg()Sam Bobroff2-24/+4
The function eeh_pe_state_mark_with_cfg() just performs the work of eeh_pe_state_mark() and then, conditionally, the work of eeh_pe_state_clear(). However it is only ever called with a constant state such that the condition is always true, so replace it by direct calls. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc/eeh: Cleanup eeh_enabled()Sam Bobroff1-5/+1
Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc/eeh: Cleanup logic in eeh_rmv_from_parent_pe()Sam Bobroff1-2/+2
Move the call to eeh_dev_to_pe() up, so that later it's clear that "pe" isn't NULL. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc/eeh: Cleanup field names in eeh_rmv_dataSam Bobroff1-10/+11
Change the name of the fields in eeh_rmv_data to clarify their usage. Change "edev_list" to "removed_vf_list" because it does not contain generic edevs, but rather only edevs that contain virtual functions (which need to be removed during recovery). Similarly, change "removed" to "removed_dev_count" because it is a count of any removed devices, not just those in the above list. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc/eeh: Cleanup list_head field namesSam Bobroff6-21/+19
Instances of struct eeh_pe are placed in a tree structure using the fields "child_list" and "child", so place these next to each other in the definition. The field "child" is a list entry, so remove the unnecessary and misleading use of the list initializer, LIST_HEAD(), on it. The eeh_dev struct contains two list entry fields, called "list" and "rmv_list". Rename them to "entry" and "rmv_entry" and, as above, stop initializing them with LIST_HEAD(). Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc/eeh: Cleanup eeh_add_virt_device()Sam Bobroff1-4/+3
Remove the unnecessary cast through void * on the first parameter and remove the unused second parameter (always NULL). Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc/eeh: Cleanup unused field in eeh_devSam Bobroff2-2/+0
The 'bus' member of struct eeh_dev is assigned to once but never used, so remove it. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc/eeh: Cleanup EEH_POSTPONED_PROBESam Bobroff4-27/+7
Currently a flag, EEH_POSTPONED_PROBE, is used to prevent an incorrect message "EEH: No capable adapters found" from being displayed during the boot of powernv systems. It is necessary because, on powernv, the call to eeh_probe_devices() made from eeh_init() is too early and EEH can't yet be enabled. A second call is made later from eeh_pnv_post_init(), which succeeds. (On pseries, the first call succeeds because PCI devices are set up early enough and no second call is made.) This can be simplified by moving the early call to eeh_probe_devices() from eeh_init() (where it's seen by both platforms) to pSeries_final_fixup(), so that each platform only calls eeh_probe_devices() once, at a point where it can succeed. This is slightly later in the boot sequence, but but still early enough and it is now in the same place in the sequence for both platforms (the pcibios_fixup hook). The display of the message can be cleaned up as well, by moving it into eeh_probe_devices(). Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc/eeh: Fix use of EEH_PE_KEEP on wrong fieldSam Bobroff1-1/+1
eeh_add_to_parent_pe() sometimes removes the EEH_PE_KEEP flag, but it incorrectly removes it from pe->type, instead of pe->state. However, rather than clearing it from the correct field, remove it. Inspection of the code shows that it can't ever have had any effect (even if it had been cleared from the correct field), because the field is never tested after it is cleared by the statement in question. The clear statement was added by commit 807a827d4e74 ("powerpc/eeh: Keep PE during hotplug"), but it didn't explain why it was necessary. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc/eeh: Fix null deref for devices removed during EEHSam Bobroff1-0/+4
If a device is removed during EEH processing (either by a driver's handler or as part of recovery), it can lead to a null dereference in eeh_pe_report_edev(). To handle this, skip devices that have been removed. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc/eeh: Fix possible null deref in eeh_dump_dev_log()Sam Bobroff1-0/+5
If an error occurs during an unplug operation, it's possible for eeh_dump_dev_log() to be called when edev->pdn is null, which currently leads to dereferencing a null pointer. Handle this by skipping the error log for those devices. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc/boot: Build boot wrapper with optimisationsJoel Stanley1-1/+1
The boot wrapper is currently built with -Os. By building with O2 we can meaningfully reduce the time decompressing the kernel. I tested by comparing 10 runs of each option in Qemu and on hardware. The kernel is compressed with KERNEL_XZ built with GCC 8.2.0-7ubuntu1. The values are counts of the timebase. Qemu TCG powernv Power8: Os O2 O3 median 10221123889 6201518438 6568186825 stddev 1361267211 429090641 657930076 improvement 39.33% 35.74% Palmetto Power8: Os O2 O3 median 50279 50599 35790 stddev 992144533 627130655 623721078 improvement 36.79% 37.13% Romulus Power9: Os O2 O3 median 670312391 454733720 448881398 stddev 157569 107276 108760 improvement 32.16% 33.03% TCG was quite noisy, with every few runs producing an outlier. Even so, O2 is faster than O3. On hardware the numbers were less noisy and O3 is slightly faster than O2. The wrapper size increases when moving from Os. Comparing zImage.epapr to the existing Os build using bloat-o-meter: Before=43401, After=56837 (13KB), chg +30.96% Before=43401, After=64305 (20KB), chg +48.16% I chose O2 for a balance between Qemu and hardware speed up. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13powerpc/boot: Disable vector instructionsJoel Stanley1-2/+2
This will avoid auto-vectorisation when building with higher optimisation levels. We don't know if the machine can support VSX and even if it's present it's probably not going to be enabled at this point in boot. These flag were both added prior to GCC 4.6 which is the minimum compiler version supported by upstream, thanks to Segher for the details. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>