summaryrefslogtreecommitdiffstats
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2014-01-21sched: add tracepoints related to NUMA task migrationMel Gorman1-0/+87
This patch adds three tracepoints o trace_sched_move_numa when a task is moved to a node o trace_sched_swap_numa when a task is swapped with another task o trace_sched_stick_numa when a numa-related migration fails The tracepoints allow the NUMA scheduler activity to be monitored and the following high-level metrics can be calculated o NUMA migrated stuck nr trace_sched_stick_numa o NUMA migrated idle nr trace_sched_move_numa o NUMA migrated swapped nr trace_sched_swap_numa o NUMA local swapped trace_sched_swap_numa src_nid == dst_nid (should never happen) o NUMA remote swapped trace_sched_swap_numa src_nid != dst_nid (should == NUMA migrated swapped) o NUMA group swapped trace_sched_swap_numa src_ngid == dst_ngid Maybe a small number of these are acceptable but a high number would be a major surprise. It would be even worse if bounces are frequent. o NUMA avg task migs. Average number of migrations for tasks o NUMA stddev task mig Self-explanatory o NUMA max task migs. Maximum number of migrations for a single task In general the intent of the tracepoints is to help diagnose problems where automatic NUMA balancing appears to be doing an excessive amount of useless work. [akpm@linux-foundation.org: remove semicolon-after-if, repair coding-style] Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Alex Thorlton <athorlton@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21mm: numa: trace tasks that fail migration due to rate limitingMel Gorman1-0/+26
A low local/remote numa hinting fault ratio is potentially explained by failed migrations. This patch adds a tracepoint that fires when migration fails due to migration rate limitation. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Alex Thorlton <athorlton@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21mm: numa: limit scope of lock for NUMA migrate rate limitingMel Gorman1-4/+1
NUMA migrate rate limiting protects a migration counter and window using a lock but in some cases this can be a contended lock. It is not critical that the number of pages be perfect, lost updates are acceptable. Reduce the importance of this lock. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Alex Thorlton <athorlton@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21mm/memblock: add memblock memory allocation apisSantosh Shilimkar1-0/+152
Introduce memblock memory allocation APIs which allow to support PAE or LPAE extension on 32 bits archs where the physical memory start address can be beyond 4GB. In such cases, existing bootmem APIs which operate on 32 bit addresses won't work and needs memblock layer which operates on 64 bit addresses. So we add equivalent APIs so that we can replace usage of bootmem with memblock interfaces. Architectures already converted to NO_BOOTMEM use these new memblock interfaces. The architectures which are still not converted to NO_BOOTMEM continue to function as is because we still maintain the fal lback option of bootmem back-end supporting these new interfaces. So no functional change as such. In long run, once all the architectures moves to NO_BOOTMEM, we can get rid of bootmem layer completely. This is one step to remove the core code dependency with bootmem and also gives path for architectures to move away from bootmem. The proposed interface will became active if both CONFIG_HAVE_MEMBLOCK and CONFIG_NO_BOOTMEM are specified by arch. In case !CONFIG_NO_BOOTMEM, the memblock() wrappers will fallback to the existing bootmem apis so that arch's not converted to NO_BOOTMEM continue to work as is. The meaning of MEMBLOCK_ALLOC_ACCESSIBLE and MEMBLOCK_ALLOC_ANYWHERE is kept same. [akpm@linux-foundation.org: s/depricated/deprecated/] Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Paul Walmsley <paul@pwsan.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Russell King <linux@arm.linux.org.uk> Cc: Tony Lindgren <tony@atomide.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21mm/memblock: switch to use NUMA_NO_NODE instead of MAX_NUMNODESGrygorii Strashko1-2/+2
It's recommended to use NUMA_NO_NODE everywhere to select "process any node" behavior or to indicate that "no node id specified". Hence, update __next_free_mem_range*() API's to accept both NUMA_NO_NODE and MAX_NUMNODES, but emit warning once on MAX_NUMNODES, and correct corresponding API's documentation to describe new behavior. Also, update other memblock/nobootmem APIs where MAX_NUMNODES is used dirrectly. The change was suggested by Tejun Heo. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Paul Walmsley <paul@pwsan.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Russell King <linux@arm.linux.org.uk> Cc: Tony Lindgren <tony@atomide.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21mm/memblock: reorder parameters of memblock_find_in_range_nodeGrygorii Strashko1-2/+3
Reorder parameters of memblock_find_in_range_node to be consistent with other memblock APIs. The change was suggested by Tejun Heo <tj@kernel.org>. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Paul Walmsley <paul@pwsan.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Russell King <linux@arm.linux.org.uk> Cc: Tony Lindgren <tony@atomide.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21mm/bootmem: remove duplicated declaration of __free_pages_bootmem()Grygorii Strashko1-1/+0
The __free_pages_bootmem is used internally by MM core and already defined in internal.h. So, remove duplicated declaration. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Reviewed-by: Tejun Heo <tj@kernel.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Paul Walmsley <paul@pwsan.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Russell King <linux@arm.linux.org.uk> Cc: Tony Lindgren <tony@atomide.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21introduce for_each_thread() to replace the buggy while_each_thread()Oleg Nesterov2-0/+14
while_each_thread() and next_thread() should die, almost every lockless usage is wrong. 1. Unless g == current, the lockless while_each_thread() is not safe. while_each_thread(g, t) can loop forever if g exits, next_thread() can't reach the unhashed thread in this case. Note that this can happen even if g is the group leader, it can exec. 2. Even if while_each_thread() itself was correct, people often use it wrongly. It was never safe to just take rcu_read_lock() and loop unless you verify that pid_alive(g) == T, even the first next_thread() can point to the already freed/reused memory. This patch adds signal_struct->thread_head and task->thread_node to create the normal rcu-safe list with the stable head. The new for_each_thread(g, t) helper is always safe under rcu_read_lock() as long as this task_struct can't go away. Note: of course it is ugly to have both task_struct->thread_node and the old task_struct->thread_group, we will kill it later, after we change the users of while_each_thread() to use for_each_thread(). Perhaps we can kill it even before we convert all users, we can reimplement next_thread(t) using the new thread_head/thread_node. But we can't do this right now because this will lead to subtle behavioural changes. For example, do/while_each_thread() always sees at least one task, while for_each_thread() can do nothing if the whole thread group has died. Or thread_group_empty(), currently its semantics is not clear unless thread_group_leader(p) and we need to audit the callers before we can change it. So this patch adds the new interface which has to coexist with the old one for some time, hopefully the next changes will be more or less straightforward and the old one will go away soon. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Sergey Dyasly <dserrg@gmail.com> Tested-by: Sergey Dyasly <dserrg@gmail.com> Reviewed-by: Sameer Nanda <snanda@chromium.org> Acked-by: David Rientjes <rientjes@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mandeep Singh Baines <msb@chromium.org> Cc: "Ma, Xindong" <xindong.ma@intel.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: "Tu, Xiaobing" <xiaobing.tu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21mm/rmap: use rmap_walk() in page_referenced()Joonsoo Kim2-3/+1
Now, we have an infrastructure in rmap_walk() to handle difference from variants of rmap traversing functions. So, just use it in page_referenced(). In this patch, I change following things. 1. remove some variants of rmap traversing functions. cf> page_referenced_ksm, page_referenced_anon, page_referenced_file 2. introduce new struct page_referenced_arg and pass it to page_referenced_one(), main function of rmap_walk, in order to count reference, to store vm_flags and to check finish condition. 3. mechanical change to use rmap_walk() in page_referenced(). [liwanp@linux.vnet.ibm.com: fix BUG at rmap_walk] Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Hillf Danton <dhillf@gmail.com> Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21mm/rmap: use rmap_walk() in try_to_munlock()Joonsoo Kim1-6/+0
Now, we have an infrastructure in rmap_walk() to handle difference from variants of rmap traversing functions. So, just use it in try_to_munlock(). In this patch, I change following things. 1. remove some variants of rmap traversing functions. cf> try_to_unmap_ksm, try_to_unmap_anon, try_to_unmap_file 2. mechanical change to use rmap_walk() in try_to_munlock(). 3. copy and paste comments. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Hillf Danton <dhillf@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21mm/rmap: use rmap_walk() in try_to_unmap()Joonsoo Kim1-4/+1
Now, we have an infrastructure in rmap_walk() to handle difference from variants of rmap traversing functions. So, just use it in try_to_unmap(). In this patch, I change following things. 1. enable rmap_walk() if !CONFIG_MIGRATION. 2. mechanical change to use rmap_walk() in try_to_unmap(). Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Hillf Danton <dhillf@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21mm/rmap: extend rmap_walk_xxx() to cope with different casesJoonsoo Kim1-0/+15
There are a lot of common parts in traversing functions, but there are also a little of uncommon parts in it. By assigning proper function pointer on each rmap_walker_control, we can handle these difference correctly. Following are differences we should handle. 1. difference of lock function in anon mapping case 2. nonlinear handling in file mapping case 3. prechecked condition: checking memcg in page_referenced(), checking VM_SHARE in page_mkclean() checking temporary vma in try_to_unmap() 4. exit condition: checking page_mapped() in try_to_unmap() So, in this patch, I introduce 4 function pointers to handle above differences. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Hillf Danton <dhillf@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21mm/rmap: make rmap_walk to get the rmap_walk_control argumentJoonsoo Kim2-6/+10
In each rmap traverse case, there is some difference so that we need function pointers and arguments to them in order to handle these For this purpose, struct rmap_walk_control is introduced in this patch, and will be extended in following patch. Introducing and extending are separate, because it clarify changes. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Hillf Danton <dhillf@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21memblock, mem_hotplug: make memblock skip hotpluggable regions if neededTang Chen1-0/+24
Linux kernel cannot migrate pages used by the kernel. As a result, hotpluggable memory used by the kernel won't be able to be hot-removed. To solve this problem, the basic idea is to prevent memblock from allocating hotpluggable memory for the kernel at early time, and arrange all hotpluggable memory in ACPI SRAT(System Resource Affinity Table) as ZONE_MOVABLE when initializing zones. In the previous patches, we have marked hotpluggable memory regions with MEMBLOCK_HOTPLUG flag in memblock.memory. In this patch, we make memblock skip these hotpluggable memory regions in the default top-down allocation function if movable_node boot option is specified. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Rafael J . Wysocki" <rjw@sisk.pl> Cc: Chen Tang <imtangchen@gmail.com> Cc: Gong Chen <gong.chen@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Len Brown <lenb@kernel.org> Cc: Liu Jiang <jiang.liu@huawei.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Taku Izumi <izumi.taku@jp.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Renninger <trenn@suse.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21memblock: make memblock_set_node() support different memblock_typeTang Chen1-1/+2
[sfr@canb.auug.org.au: fix powerpc build] Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Rafael J . Wysocki" <rjw@sisk.pl> Cc: Chen Tang <imtangchen@gmail.com> Cc: Gong Chen <gong.chen@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Len Brown <lenb@kernel.org> Cc: Liu Jiang <jiang.liu@huawei.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Taku Izumi <izumi.taku@jp.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Renninger <trenn@suse.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21memblock, mem_hotplug: introduce MEMBLOCK_HOTPLUG flag to mark hotpluggable ↵Tang Chen1-0/+17
regions In find_hotpluggable_memory, once we find out a memory region which is hotpluggable, we want to mark them in memblock.memory. So that we could control memblock allocator not to allocte hotpluggable memory for the kernel later. To achieve this goal, we introduce MEMBLOCK_HOTPLUG flag to indicate the hotpluggable memory regions in memblock and a function memblock_mark_hotplug() to mark hotpluggable memory if we find one. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Rafael J . Wysocki" <rjw@sisk.pl> Cc: Chen Tang <imtangchen@gmail.com> Cc: Gong Chen <gong.chen@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Len Brown <lenb@kernel.org> Cc: Liu Jiang <jiang.liu@huawei.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Taku Izumi <izumi.taku@jp.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Renninger <trenn@suse.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21memblock, numa: introduce flags field into memblockTang Chen1-0/+1
There is no flag in memblock to describe what type the memory is. Sometimes, we may use memblock to reserve some memory for special usage. And we want to know what kind of memory it is. So we need a way to In hotplug environment, we want to reserve hotpluggable memory so the kernel won't be able to use it. And when the system is up, we have to free these hotpluggable memory to buddy. So we need to mark these memory first. In order to do so, we need to mark out these special memory in memblock. In this patch, we introduce a new "flags" member into memblock_region: struct memblock_region { phys_addr_t base; phys_addr_t size; unsigned long flags; /* This is new. */ #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP int nid; #endif }; This patch does the following things: 1) Add "flags" member to memblock_region. 2) Modify the following APIs' prototype: memblock_add_region() memblock_insert_region() 3) Add memblock_reserve_region() to support reserve memory with flags, and keep memblock_reserve()'s prototype unmodified. 4) Modify other APIs to support flags, but keep their prototype unmodified. The idea is from Wen Congyang <wency@cn.fujitsu.com> and Liu Jiang <jiang.liu@huawei.com>. Suggested-by: Wen Congyang <wency@cn.fujitsu.com> Suggested-by: Liu Jiang <jiang.liu@huawei.com> Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Rafael J . Wysocki" <rjw@sisk.pl> Cc: Chen Tang <imtangchen@gmail.com> Cc: Gong Chen <gong.chen@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Len Brown <lenb@kernel.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Taku Izumi <izumi.taku@jp.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Renninger <trenn@suse.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21mm: add overcommit_kbytes sysctl variableJerome Marchand2-0/+10
Some applications that run on HPC clusters are designed around the availability of RAM and the overcommit ratio is fine tuned to get the maximum usage of memory without swapping. With growing memory, the 1%-of-all-RAM grain provided by overcommit_ratio has become too coarse for these workload (on a 2TB machine it represents no less than 20GB). This patch adds the new overcommit_kbytes sysctl variable that allow a much finer grain. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix nommu build] Signed-off-by: Jerome Marchand <jmarchan@redhat.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21mm, show_mem: remove SHOW_MEM_FILTER_PAGE_COUNTMel Gorman1-1/+0
Commit 4b59e6c47309 ("mm, show_mem: suppress page counts in non-blockable contexts") introduced SHOW_MEM_FILTER_PAGE_COUNT to suppress PFN walks on large memory machines. Commit c78e93630d15 ("mm: do not walk all of system memory during show_mem") avoided a PFN walk in the generic show_mem helper which removes the requirement for SHOW_MEM_FILTER_PAGE_COUNT in that case. This patch removes PFN walkers from the arch-specific implementations that report on a per-node or per-zone granularity. ARM and unicore32 still do a PFN walk as they report memory usage on each bank which is a much finer granularity where the debugging information may still be of use. As the remaining arches doing PFN walks have relatively small amounts of memory, this patch simply removes SHOW_MEM_FILTER_PAGE_COUNT. [akpm@linux-foundation.org: fix parisc] Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: David Rientjes <rientjes@google.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: James Bottomley <jejb@parisc-linux.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21mm, mempolicy: remove unneeded functions for UMA configsDavid Rientjes1-32/+0
Mempolicies only exist for CONFIG_NUMA configurations. Therefore, a certain class of functions are unneeded in configurations where CONFIG_NUMA is disabled such as functions that duplicate existing mempolicies, lookup existing policies, set certain mempolicy traits, or test mempolicies for certain attributes. Remove the unneeded functions so that any future callers get a compile- time error and protect their code with CONFIG_NUMA as required. Signed-off-by: David Rientjes <rientjes@google.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21mm: create a separate slab for page->ptl allocationKirill A. Shutemov1-0/+12
If DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC are enabled spinlock_t on x86_64 is 72 bytes. For page->ptl they will be allocated from kmalloc-96 slab, so we loose 24 on each. An average system can easily allocate few tens thousands of page->ptl and overhead is significant. Let's create a separate slab for page->ptl allocation to solve this. To make sure that it really works this time, some numbers from my test machine (just booted, no load): Before: # grep '^\(kmalloc-96\|page->ptl\)' /proc/slabinfo kmalloc-96 31987 32190 128 30 1 : tunables 120 60 8 : slabdata 1073 1073 92 After: # grep '^\(kmalloc-96\|page->ptl\)' /proc/slabinfo page->ptl 27516 28143 72 53 1 : tunables 120 60 8 : slabdata 531 531 9 kmalloc-96 3853 5280 128 30 1 : tunables 120 60 8 : slabdata 176 176 0 Note that the patch is useful not only for debug case, but also for PREEMPT_RT, where spinlock_t is always bloated. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21mm: get rid of unnecessary pageblock scanning in setup_zone_migrate_reserveYasuaki Ishimatsu1-0/+6
Yasuaki Ishimatsu reported memory hot-add spent more than 5 _hours_ on 9TB memory machine since onlining memory sections is too slow. And we found out setup_zone_migrate_reserve spent >90% of the time. The problem is, setup_zone_migrate_reserve scans all pageblocks unconditionally, but it is only necessary if the number of reserved block was reduced (i.e. memory hot remove). Moreover, maximum MIGRATE_RESERVE per zone is currently 2. It means that the number of reserved pageblocks is almost always unchanged. This patch adds zone->nr_migrate_reserve_block to maintain the number of MIGRATE_RESERVE pageblocks and it reduces the overhead of setup_zone_migrate_reserve dramatically. The following table shows time of onlining a memory section. Amount of memory | 128GB | 192GB | 256GB| --------------------------------------------- linux-3.12 | 23.9 | 31.4 | 44.5 | This patch | 8.3 | 8.3 | 8.6 | Mel's proposal patch | 10.9 | 19.2 | 31.3 | --------------------------------------------- (millisecond) 128GB : 4 nodes and each node has 32GB of memory 192GB : 6 nodes and each node has 32GB of memory 256GB : 8 nodes and each node has 32GB of memory (*1) Mel proposed his idea by the following threads. https://lkml.org/lkml/2013/10/30/272 [akpm@linux-foundation.org: tweak comment] Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Reported-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Tested-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Mel Gorman <mgorman@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21mm: thp: turn compound_head() into BUG_ON(!PageTail) in get_huge_page_tail()Oleg Nesterov1-4/+3
get_huge_page_tail()->compound_head() looks confusing. Every caller must check PageTail(page), otherwise atomic_inc(&page->_mapcount) is simply wrong if this page is compound-trans-head. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Dave Jones <davej@redhat.com> Cc: Darren Hart <dvhart@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Acked-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21mm: tail page refcounting optimization for slab and hugetlbfsAndrea Arcangeli2-7/+31
This skips the _mapcount mangling for slab and hugetlbfs pages. The main trouble in doing this is to guarantee that PageSlab and PageHeadHuge remains constant for all get_page/put_page run on the tail of slab or hugetlbfs compound pages. Otherwise if they're set during get_page but not set during put_page, the _mapcount of the tail page would underflow. PageHeadHuge will remain true until the compound page is released and enters the buddy allocator so it won't risk to change even if the tail page is the last reference left on the page. PG_slab instead is cleared before the slab frees the head page with put_page, so if the tail pin is released after the slab freed the page, we would have a problem. But in the slab case the tail pin cannot be the last reference left on the page. This is because the slab code is free to reuse the compound page after a kfree/kmem_cache_free without having to check if there's any tail pin left. In turn all tail pins must be always released while the head is still pinned by the slab code and so we know PG_slab will be still set too. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com> Cc: Pravin Shelar <pshelar@nicira.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Christoph Lameter <cl@linux.com> Cc: Johannes Weiner <jweiner@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21mm: thp: optimize compound_trans_hugeAndrea Arcangeli1-0/+23
Currently we don't clobber page_tail->first_page during split_huge_page, so compound_trans_head can be set to compound_head without adverse effects, and this mostly optimizes away a smp_rmb. It looks worthwhile to keep around the implementation that doesn't relay on page_tail->first_page not to be clobbered, because it would be necessary if we'll decide to enforce page->private to zero at all times whenever PG_private is not set, also for anonymous pages. For anonymous pages enforcing such an invariant doesn't matter as anonymous pages don't use page->private so we can get away with this microoptimization. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Khalid Aziz <khalid.aziz@oracle.com> Cc: Pravin Shelar <pshelar@nicira.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Christoph Lameter <cl@linux.com> Cc: Johannes Weiner <jweiner@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21mm: hugetlbfs: Add some VM_BUG_ON()s to catch non-hugetlbfs pagesDave Hansen1-0/+1
Dave Jiang reported that he was seeing oopses when running NUMA systems and default_hugepagesz=1G. I traced the issue down to migrate_page_copy() trying to use the same code for hugetlb pages and transparent hugepages. It should not have been trying to pass thp pages in there. So, add some VM_BUG_ON()s for the next hapless VM developer that tries the same thing. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Tested-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21mm: Make {,set}page_address() static inline if WANT_PAGE_VIRTUALGeert Uytterhoeven1-5/+8
{,set}page_address() are macros if WANT_PAGE_VIRTUAL. If !WANT_PAGE_VIRTUAL, they're plain C functions. If someone calls them with a void *, this pointer is auto-converted to struct page * if !WANT_PAGE_VIRTUAL, but causes a build failure on architectures using WANT_PAGE_VIRTUAL (arc, m68k and sparc64): drivers/md/bcache/bset.c: In function `__btree_sort': drivers/md/bcache/bset.c:1190: warning: dereferencing `void *' pointer drivers/md/bcache/bset.c:1190: error: request for member `virtual' in something not a structure or union Convert them to static inline functions to fix this. There are already plenty of users of struct page members inside <linux/mm.h>, so there's no reason to keep them as macros. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Tested-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21posix_acl: uninliningAndrew Morton1-72/+6
Uninline vast tracts of nested inline functions in include/linux/posix_acl.h. This reduces the text+data+bss size of x86_64 allyesconfig vmlinux by 8026 bytes. The patch also regularises the positioning of the EXPORT_SYMBOLs in posix_acl.c. Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: J. Bruce Fields <bfields@fieldses.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by: Benny Halevy <bhalevy@primarydata.com> Cc: Benny Halevy <bhalevy@panasas.com> Cc: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21fsnotify: remove .should_send_event callbackJan Kara1-4/+0
After removing event structure creation from the generic layer there is no reason for separate .should_send_event and .handle_event callbacks. So just remove the first one. Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Eric Paris <eparis@parisplace.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21fsnotify: do not share events between notification groupsJan Kara1-87/+27
Currently fsnotify framework creates one event structure for each notification event and links this event into all interested notification groups. This is done so that we save memory when several notification groups are interested in the event. However the need for event structure shared between inotify & fanotify bloats the event structure so the result is often higher memory consumption. Another problem is that fsnotify framework keeps path references with outstanding events so that fanotify can return open file descriptors with its events. This has the undesirable effect that filesystem cannot be unmounted while there are outstanding events - a regression for inotify compared to a situation before it was converted to fsnotify framework. For fanotify this problem is hard to avoid and users of fanotify should kind of expect this behavior when they ask for file descriptors from notified files. This patch changes fsnotify and its users to create separate event structure for each group. This allows for much simpler code (~400 lines removed by this patch) and also smaller event structures. For example on 64-bit system original struct fsnotify_event consumes 120 bytes, plus additional space for file name, additional 24 bytes for second and each subsequent group linking the event, and additional 32 bytes for each inotify group for private data. After the conversion inotify event consumes 48 bytes plus space for file name which is considerably less memory unless file names are long and there are several groups interested in the events (both of which are uncommon). Fanotify event fits in 56 bytes after the conversion (fanotify doesn't care about file names so its events don't have to have it allocated). A win unless there are four or more fanotify groups interested in the event. The conversion also solves the problem with unmount when only inotify is used as we don't have to grab path references for inotify events. [hughd@google.com: fanotify: fix corruption preventing startup] Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Eric Paris <eparis@parisplace.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21dma-debug: introduce debug_dma_assert_idle()Dan Williams1-0/+6
Record actively mapped pages and provide an api for asserting a given page is dma inactive before execution proceeds. Placing debug_dma_assert_idle() in cow_user_page() flagged the violation of the dma-api in the NET_DMA implementation (see commit 77873803363c "net_dma: mark broken"). The implementation includes the capability to count, in a limited way, repeat mappings of the same page that occur without an intervening unmap. This 'overlap' counter is limited to the few bits of tag space in a radix tree. This mechanism is added to mitigate false negative cases where, for example, a page is dma mapped twice and debug_dma_assert_idle() is called after the page is un-mapped once. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Vinod Koul <vinod.koul@intel.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21Merge tag 'for-v3.14' of git://git.infradead.org/battery-2.6Linus Torvalds4-56/+43
Pull battery updates from Dmitry Eremin-Solenikov: "I'm picking up power supply maintainership from Anton Vorontov. Could you please pull battery-2.6 git tree changes prepared for the v3.14 release. Highlights: - Power supply notifier - Several drivers gained DT support - Added Maxim 14577 driver - Change of maintainer" * tag 'for-v3.14' of git://git.infradead.org/battery-2.6: MAINTAINERS: Pick up power supply maintainership max17042_battery: Add IRQF_ONESHOT flag to use default irq handler gpio-charger: Support wakeup events power_supply: Add charger support for Maxim 14577 dt: Binding documentation for isp1704 charger isp1704_charger: Add DT support charger-manager: of_cm_parse_desc() should be static bq2415x_charger: Add DT support power_supply: Add power_supply_get_by_phandle bq2415x_charger: Use power_supply notifier for automode power: reset: Add as3722 power-off driver mfd: AS3722: Add dt node properties for system power controller charger-manager: Support deivce tree in charger manager driver charger-manager: Modify the way of checking battery's temperature power_supply: Add power_supply notifier
2014-01-21Merge tag 'mfd-3.14-1' of git://git.linaro.org/people/ljones/mfdLinus Torvalds13-63/+533
Pull MFD changes from Lee Jones: "New drivers - Samsung Maxim 14577; Micro USB, Regulator, IRQ Controller and Battery Charger - TI/National Semiconductor LP3943 I2C GPIO Expander and PWM Generator Existing driver adaptions - Expansion of Wolfson Arizona DSP and High-Pass filter controls - TI TWL6040 default Regmap support and Regcache addition/bypass - Some nice Smatch catch fixes - Conversion of TI OMAP-USB and TI TWL6030 to endian neutralness - ChromeOS EC timing (delay) adaptions and added dependency on OF - Many constifications of 'struct {mfd_cell,regmap_irq,et.al}' - Watchdog support added for NVIDIA AS3722 - Convert functions to static in TI AM335x - Realigned previously defeated functionality in TI AM335x - IIO ADC-TSC concurrency dead-lock/timeout resolution - Addition of Power Management and Clock support for Samsung core - DEFINE_PCI_DEVICE_TABLE macro removal from MFD Subsystem - Greater use of irqdomain functionality in ST-E AB8500 - Removal of 'include/linux/mfd/abx500/ab8500-gpio.h' - Wolfson WM831x PMIC Power Management changes s/poweroff/shutdown/ - Device Tree documentation added for TI/Nat Semi LP3943 - Version detection and voltage tables for TI TPS6586x PMIC devices - Simplification of Freescale MC13XXX (de-)initialisation routines - Clean-up and simplification of the Realtek parent driver - Added support for RTL8402 Realtek PCI-Express card reader - Resource leak fix for Maxim 77686 - Possible suspend BUG() fix in OMAP USB TLL - Support for new Wolfson WM5110 Revision (D) - Testing of automatic assignment of of_node in mfd_add_device() - Reversion of the above when it started to cause issues - Remove legacy Platform Data from; TI TWL Core, Qualcomm SSBI and ST-E ABx500 Pinctrl - Clean-ups; tabbing issues, function name changes, 'drvdata = NULL' removal, unused uninitialised warning mitigation, error message clarity, removal of redundant/duplicate checks, licensing (GPL -> GPL2), coding consistency, duplicate function declaration, ret checks, commit corrections, redundant of_match_ptr() helper removal, spelling, #if-deffery removal and header guards name changes" * tag 'mfd-3.14-1' of git://git.linaro.org/people/ljones/mfd: (78 commits) mfd: wm5110: Add register patch for rev D chip mfd: omap-usb-tll: Don't hold lock during pm_runtime_get/put_sync() gpio: lp3943: Remove redundant of_match_ptr helper mfd: sta2x11-mfd: Use named constants for pci_power_t values Documentation: mfd: Fix LDO index in s2mps11.txt mfd: Cleanup mfd-mcp-sa11x0.h header mfd: max8997: Use "IS_ENABLED(CONFIG_OF)" for DT code. mfd: twl6030: Fix endianness problem in IRQ handler mfd: sec-core: Add cells for S5M8767-clocks mfd: max14577: Remove redundant of_match_ptr helper mfd: twl6040: Fix sparse non static symbol warning mfd: Revert "mfd: Always assign of_node in mfd_add_device()" mfd: rtsx: Fix sparse non static symbol warning mfd: max77693: Set proper maximum register for MUIC regmap mfd: max77686: Fix regmap resource leak on driver remove mfd: Represent correct filenames in file headers mfd: rtsx: Add support for card reader rtl8402 mfd: rtsx: Add set pull control macro and simplify rtl8411 mfd: max8997: Enforce mfd_add_devices() return value check mfd: mc13xxx: Simplify probe() & remove() ...
2014-01-21Merge tag 'sound-3.14-rc1' of ↵Linus Torvalds19-41/+872
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound updates from Takashi Iwai: "It was holiday season, so no wonder that there are little changes in framework level, although diffstat shows quite many changes spreaded over sound/* directories. Most of changes are cleanups, code refactoring and fixes. Some highlights: - Removal of OSS sleep_on usages by Arnd - Simplified memalloc helper codes, drop obsoleted features; now it's built into PCM driver instead of an individual module - Warn if PCM buffer preallocation fails, which will show page allocation issues more clearly - Compress offload API updates for sample rates by Vinod - PCM glitch workaround on ctxfi emu20k1 by Sarah - Drop cs46xx DSP blobs, using firmware loader now - USB-audio quitks for Plantronics Gamecom 780, Creative VF0420, and Focusrite Saffire 6 HD-audio specifics: - Standardize Kconfigs of HD-audio codec drivers; now "make localmodconfig" recognizes configs properly (finally!) - Parallel PM implementation by Mengdong - BayleyBay/ValleyView2 board fixups - Broadwell audio support - Runtime PM improvement (PantherPoint, etc) - Quirks: Dell subwooer, Gigabyte mobo jack detection oddity, Dell AiO click noise fixes, Dell headset mic fixes, etc - Automatic bind with HDMI codec parser without generic parser - More AD codec fixes (since 3.12 regression) including the automatic stereo mix support - Common Thinkpad ACPI helper for Realtek and Conexant codecs ASoC specifics: - Update to the generic DMA code to support deferred probe and managed resources - New drivers for BCM2835 (used in Raspberry Pi), Tegra with MAX98090 and Analog Devices AXI I2S and S/PDIF controller IPs - Device tree support for the simple card, max98090 and cs42l52 - Conversion of the Samsung drivers to native dmaengine, making them multiplatform compatible and hopefully helping keep them more modern and up to date. - More regmap conversions, including a very welcome one for twl6040 from Peter Ujfalusi - A big overhaul of the DaVinci drivers also from Peter Ujfalusi - Lots of DMA updates from Lars-Peter - Improvements to the constraints handling code from Lars-Peter - A very helpful conversion of the TWL4030 driver to regmap from Peter - A new driver for the Freescale ESAI controller from Nicolin Chen - Conversion of some of the drivers to use params_width() - Extensions to DPCM for use with compressed audio from Liam" * tag 'sound-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (396 commits) ASoC: dapm: Fix double prefix addition ASoC: compress: Add suport for DPCM into compressed audio ASoC: DPCM: make some DPCM API calls non static for compressed usage ASoC: core: Fix possible NULL pointer dereference of pcm->config ALSA: hda - add headset mic detect quirks for some Dell machines ASoC: tlv320aic32x4: Fix regmap range_min ASoC: core: Return -ENOTSUPP from set_sysclk() if no operation provided ASoC: dapm: Change prototype of soc_widget_read ASoC: samsung: Remove SND_DMAENGINE_PCM_FLAG_NO_RESIDUE flag ASoC: axi-{spdif,i2s}: Remove SND_DMAENGINE_PCM_FLAG_NO_RESIDUE flag ASoC: generic-dmaengine-pcm: Check DMA residue granularity ASoC: generic-dmaengine-pcm: Check NO_RESIDUE flag at runtime dma: pl330: Set residue_granularity dma: Indicate residue granularity in dma_slave_caps ASoC: simple-card: fix one bug to writing to the platform data ASoC: pcm: Use snd_pcm_rate_mask_intersect() helper ALSA: Add helper function for intersecting two rate masks ASoC: s6000: Don't mix SNDRV_PCM_RATE_CONTINUOUS with specific rates ASoC: fsl: Don't mix SNDRV_PCM_RATE_CONTINUOUS with specific rates ASoC: pcm: Properly initialize hw->rate_max ...
2014-01-21Merge tag 'pinctrl-v3.14-1' of ↵Linus Torvalds3-2/+11
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull bulk pin control changes from Linus Walleij: "This has been queued and tested for a while. Lots of action here, like in the GPIO tree, embedded stuff like this is really hot now it seems. Details in the signed tag. I'm especially happy about the Qualcomm driver as it is used in such a huge subset of mobile handsets out there, and these platforms in general need better upstream support - New driver for the Qualcomm TLMM pin controller and its msm8x74 subdriver. - New driver for the Broadcom Capri BCM281xx SoC. - New subdriver for the imx25 pin controller. - New subdriver for the Tegra124 pin controller. - Lock GPIO lines as IRQs for select combined pin control and GPIO drivers for baytrail and sirf. - Some semi-big refactorings and extenstions to the sirf driver. - Lots of patching, cleanup and fixing in the Renesas "PFC" driver and associated subdrivers as usual. It is settling down a little bit now it seems. - Minor fixes and incremental updates here and there as usual" * tag 'pinctrl-v3.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (72 commits) pinctrl: sunxi: Honor GPIO output initial vaules pinctrl: capri: add dependency on OF ARM: bcm11351: Enable pinctrl for Broadcom Capri SoCs ARM: pinctrl: Add Broadcom Capri pinctrl driver pinctrl: Add pinctrl binding for Broadcom Capri SoCs pinctrl: Add void * to pinctrl_pin_desc pinctrl: st: Fix a typo in probe pinctrl: Fix some typos and grammar issues in the documentation pinctrl: sirf: lock IRQs when starting them pinctrl: sirf: put gpio interrupt pin into input status automatically pinctrl: sirf: use only one irq_domain for the whole device node pinctrl: single: fix infinite loop caused by bad mask pinctrl: single: fix pcs_disable with bits_per_mux pinctrl: single: fix DT bindings documentation pinctrl: as3722: Set pin to output mode for some function pinctrl: sirf: add pin group for USP0 with only RX or TX frame sync pinctrl: sirf: fix the pins of sdmmc5 connected with TriG pinctrl: sirf: add lost usp1_uart_nostreamctrl group for atlas6 pinctrl: sunxi: Add Allwinner A20 clock output pin functions pinctrl/lantiq: fix typo ...
2014-01-21Merge tag 'gpio-v3.14-1' of ↵Linus Torvalds5-93/+74
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO tree bulk changes from Linus Walleij: "A big set this merge window, as we have much going on in this subsystem. The changes to other subsystems (notably a slew of ARM machines as I am doing away with their custom APIs) have all been ACKed to the extent possible. Major changes this time: - Some core improvements and cleanups to the new GPIO descriptor API. This seems to be working now so we can start the exodus to this API, moving gradually away from the global GPIO numberspace. - Incremental improvements to the ACPI GPIO core, and move the few GPIO ACPI clients we have to the GPIO descriptor API right *now* before we go any further. We actually managed to contain this *before* we started to litter the kernel with yet another hackish global numberspace for the ACPI GPIOs, which is a big win. - The RFkill GPIO driver and all platforms using it have been migrated to use the GPIO descriptors rather than fixed number assignments. Tegra machine has been migrated as part of this. - New drivers for MOXA ART, Xtensa GPIO32 and SMSC SCH311x. Those should be really good examples of how I expect a nice GPIO driver to look these days. - Do away with custom GPIO implementations on a major part of the ARM machines: ks8695, lpc32xx, mv78xx0. Make a first step towards the same in the horribly convoluted Samsung S3C include forest. We expect to continue to clean this up as we move forward. - Flag GPIO lines used for IRQ on adnp, bcm-kona, em, intel-mid and lynxpoint. This makes the GPIOlib core aware that a certain GPIO line is used for IRQs and can then enforce some semantics such as disallowing a GPIO line marked as in use for IRQ to be switched to output mode. - Drop all use of irq_set_chip_and_handler_name(). The name provided in these cases were just unhelpful tags like "mux" or "demux". - Extend the MCP23s08 driver to handle interrupts. - Minor incremental improvements for rcar, lynxpoint, em 74x164 and msm drivers. - Some non-urgent bug fixes here and there, duplicate #includes and that usual kind of cleanups" Fix up broken Kconfig file manually to make this all compile. * tag 'gpio-v3.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (71 commits) gpio: mcp23s08: fix casting caused build warning gpio: mcp23s08: depend on OF_GPIO gpio: mcp23s08: Add irq functionality for i2c chips ARM: S5P[v210|c100|64x0]: Fix build error gpio: pxa: clamp gpio get value to [0,1] ARM: s3c24xx: explicit dependency on <plat/gpio-cfg.h> ARM: S3C[24|64]xx: move includes back under <mach/> scope Documentation / ACPI: update to GPIO descriptor API gpio / ACPI: get rid of acpi_gpio.h gpio / ACPI: register to ACPI events automatically mmc: sdhci-acpi: convert to use GPIO descriptor API ARM: s3c24xx: fix build error gpio: f7188x: set can_sleep attribute gpio: samsung: Update documentation gpio: samsung: Remove hardware.h inclusion gpio: xtensa: depend on HAVE_XTENSA_GPIO32 gpio: clps711x: Enable driver compilation with COMPILE_TEST gpio: clps711x: Use of_match_ptr() net: rfkill: gpio: convert to descriptor-based GPIO interface leds: s3c24xx: Fix build failure ...
2014-01-21Merge branch 'i2c/for-next' of ↵Linus Torvalds2-40/+0
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c updates from Wolfram Sang: "For 3.14, the I2C subsystem has the following to offer: - new drivers for Renesas RIIC and RobotFuzz OSIF - driver cleanups & improvements & bugfixes Pretty standard stuff this time, I'd say. There is more complex stuff coming up, but I didn't have the bandwidth between the years to pull it in for this release. Sadly" * 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (26 commits) i2c: s3c2410: fix quirk usage for 64-bit i2c: pnx: Use devm_*() functions i2c: at91: add a new compatibility string for the at91sam9261 i2c-ismt: support I2C_SMBUS_I2C_BLOCK_DATA transaction type i2c: Add bus driver for for OSIF USB i2c device. i2c: i2c-tiny-usb: Remove RobotFuzz USB vendor:product ID i2c: designware: remove HAVE_CLK build dependecy Documentation: i2c: Remove obsolete example i2c: nomadik: remove platform data header i2c: nomadik: auto-calculate slave setup time i2c: viperboard: remove superfluous assignment i2c: xilinx: Use devm_* functions i2c: xilinx: Do not enable irq before irq handler i2c: xilinx: Fix i2c checkpatch warnings i2c: at91: document clock properties i2c: isch: Use devm_request_region() i2c: viperboard: Use devm_kzalloc() functions i2c: imx: propagate irq error code in probe i2c: s3c2410: dont need CPU_FREQ transitions for exynos series i2c: s3c2410: Add polling mode support ...
2014-01-21Merge branch 'for-linus' of ↵Linus Torvalds1-0/+12
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security layer updates from James Morris: "Changes for this kernel include maintenance updates for Smack, SELinux (and several networking fixes), IMA and TPM" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (39 commits) SELinux: Fix memory leak upon loading policy tpm/tpm-sysfs: active_show() can be static tpm: tpm_tis: Fix compile problems with CONFIG_PM_SLEEP/CONFIG_PNP tpm: Make tpm-dev allocate a per-file structure tpm: Use the ops structure instead of a copy in tpm_vendor_specific tpm: Create a tpm_class_ops structure and use it in the drivers tpm: Pull all driver sysfs code into tpm-sysfs.c tpm: Move sysfs functions from tpm-interface to tpm-sysfs tpm: Pull everything related to /dev/tpmX into tpm-dev.c char: tpm: nuvoton: remove unused variable tpm: MAINTAINERS: Cleanup TPM Maintainers file tpm/tpm_i2c_atmel: fix coccinelle warnings tpm/tpm_ibmvtpm: fix unreachable code warning (smatch warning) tpm/tpm_i2c_stm_st33: Check return code of get_burstcount tpm/tpm_ppi: Check return value of acpi_get_name tpm/tpm_ppi: Do not compare strcmp(a,b) == -1 ima: remove unneeded size_limit argument from ima_eventdigest_init_common() ima: update IMA-templates.txt documentation ima: pass HASH_ALGO__LAST as hash algo in ima_eventdigest_init() ima: change the default hash algorithm to SHA1 in ima_eventdigest_ng_init() ...
2014-01-21mfd: Cleanup mfd-mcp-sa11x0.h headerSachin Kamat1-4/+2
Commit a1fd844c6e62 ("ARM: sa1100: move platform_data definitions") moved the file to the current location but forgot to remove the pointer to its previous location. Clean it up. While at it also change the header file protection macros appropriately. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2014-01-21mfd: Represent correct filenames in file headersLaszlo Papp3-3/+3
The original author(s) probably copy/pasted these headers from the existing public header files. Signed-off-by: Laszlo Papp <lpapp@kde.org> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2014-01-21mfd: mc13xxx: Remove duplicate mc13xxx_get_flags() declarationAlexander Shiyan1-2/+0
mc13xxx_get_flags() declaration given twice. This patch removes this duplicate. Signed-off-by: Alexander Shiyan <shc_work@mail.ru> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2014-01-21mfd: ssbi: Constify buffer in ssbi_writeStephen Boyd1-1/+1
In preparation for passing a const pointer directly to ssbi_write() from the regmap APIs. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2014-01-21mfd: ssbi: Remove platform data structs and hide ssbi type enumStephen Boyd1-16/+1
The ssbi driver assumes that the device is DT based. Remove the platform data structs that will never be used and hide the enum in the only C file that uses it. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2014-01-21mfd: tps6586x: Add version detectionStefan Agner1-0/+7
Use the VERSIONCRC to determine the exact device version. According to the datasheet this register can be used as device identifier. The identification is needed since some tps6586x regulators use a different voltage table. Signed-off-by: Stefan Agner <stefan@agner.ch> Reviewed-by: Thierry Reding <treding@nvidia.com> Acked-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2014-01-21mfd: Add LP3943 MFD driverMilo Kim1-0/+114
LP3943 has 16 output pins which can be used as GPIO expander and PWM generator. * Regmap I2C interface for R/W LP3943 registers * Atomic operations for output pin assignment The driver should check whether requested pin is available or not. If the pin is already used, pin request returns as a failure. A driver data, 'pin_used' is checked when gpio_request() and pwm_request() are called. If the pin is available, then pin_used is set. And it is cleared when gpio_free() and pwm_free(). * Device tree support Compatible strings for GPIO and PWM driver. LP3943 platform data is PWM related, so parsing the device tree is implemented in the PWM driver. Signed-off-by: Milo Kim <milo.kim@ti.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2014-01-21mfd/pinctrl: Delete platform data headerLinus Walleij1-23/+0
This deletes the special AB8500 GPIO platform data passing header and merges the few remaining contents down into the abx500 pinctrl driver which handles the abx500 GPIO device. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2014-01-21mfd: ab8500: Delete all GPIO platform data instancesLinus Walleij2-12/+0
This deletes all instances where the AB8500 GPIO platform data is passed around. It is completely unused in the kernel now, so it does not hurt anyone. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2014-01-21Merge tag 'ib-iio-input-3.13-1' into for-mfd-nextLee Jones1-2/+6
Immutable branch for IIO and Input
2014-01-21Merge tag 'ib-asoc-3.14.2' into for-mfd-nextLee Jones1-0/+3
Immutable branch between MFD and ASoC due for the v3.14 merge window
2014-01-21Merge tag 'tags/ib-asoc-1' into for-mfd-nextLee Jones1-0/+37
Immutable branch for ASoC, as requested by Mark Brown