summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
AgeCommit message (Collapse)AuthorFilesLines
2018-08-27drm/ttm: remove dead codesHuang Rui1-7/+1
These codes are not used. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-01drm/ttm: fix missed conversion of set_pages_array_ucHuang Rui1-1/+1
This patch fixed the error when do not configure CONFIG_X86, otherwise, below error will be encountered. All errors (new ones prefixed by >>): drivers/gpu/drm/ttm/ttm_page_alloc_dma.c: In function 'ttm_set_pages_caching': >> drivers/gpu/drm/ttm/ttm_page_alloc_dma.c:272:7: error: implicit declaration of function 'set_pages_array_uc'; did you mean +'ttm_set_pages_array_uc'? [-Werror=implicit-function-declaration] r = set_pages_array_uc(pages, cpages); ^~~~~~~~~~~~~~~~~~ ttm_set_pages_array_uc cc1: some warnings being treated as errors Reported-by: kbuild test robot <lkp@intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-27drm/ttm: Merge hugepage attr changes in ttm_dma_page_put. (v2)Bas Nieuwenhuizen1-7/+4
Every set_pages_array_wb call resulted in cross-core interrupts and TLB flushes. Merge more of them for less overhead. This reduces the time needed to free a 1.6 GiB GTT WC buffer as part of Vulkan CTS from ~2 sec to < 0.25 sec. (Allocation still takes more than 2 sec though) (v2): use set_pages_wb instead of set_memory_wb. Signed-off-by: Bas Nieuwenhuizen <basni@chromium.org> Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-27drm/ttm: clean up non-x86 definitions on ttm_page_alloc_dmaHuang Rui1-44/+4
All non-x86 definitions are moved to ttm_set_memory header, so remove it from ttm_page_alloc_dma.c. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Bas Nieuwenhuizen <basni@chromium.org> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-06-12treewide: kmalloc() -> kmalloc_array()Kees Cook1-3/+5
The kmalloc() function has a 2-factor argument form, kmalloc_array(). This patch replaces cases of: kmalloc(a * b, gfp) with: kmalloc_array(a * b, gfp) as well as handling cases of: kmalloc(a * b * c, gfp) with: kmalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kmalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kmalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The tools/ directory was manually excluded, since it has its own implementation of kmalloc(). The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kmalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) | kmalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kmalloc( - sizeof(u8) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(char) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(u8) * COUNT + COUNT , ...) | kmalloc( - sizeof(__u8) * COUNT + COUNT , ...) | kmalloc( - sizeof(char) * COUNT + COUNT , ...) | kmalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kmalloc + kmalloc_array ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kmalloc + kmalloc_array ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kmalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kmalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kmalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kmalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kmalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kmalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kmalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kmalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) | kmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kmalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kmalloc(C1 * C2 * C3, ...) | kmalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) | kmalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) | kmalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) | kmalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kmalloc(sizeof(THING) * C2, ...) | kmalloc(sizeof(TYPE) * C2, ...) | kmalloc(C1 * C2 * C3, ...) | kmalloc(C1 * C2, ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - (E1) * E2 + E1, E2 , ...) | - kmalloc + kmalloc_array ( - (E1) * (E2) + E1, E2 , ...) | - kmalloc + kmalloc_array ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>
2018-05-09drm/ttm: Use GFP_TRANSHUGE_LIGHT for allocating huge pagesMichel Dänzer1-1/+2
GFP_TRANSHUGE tries very hard to allocate huge pages, which can result in long delays with high memory pressure. I have observed firefox freezing for up to around a minute due to this while restic was taking a full system backup. Since we don't really need huge pages, use GFP_TRANSHUGE_LIGHT | __GFP_NORETRY instead, in order to fail quickly when there are no huge pages available. Set __GFP_KSWAPD_RECLAIM as well, in order for huge pages to be freed up in the background if necessary. With these changes, I'm no longer seeing freezes during a restic backup. Cc: stable@vger.kernel.org Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-26drm/ttm: check if free mem space is under the lower limitRoger He1-0/+3
the free mem space and the lower limit both include two parts: system memory and swap space. For the OOM triggered by TTM, that is the case as below: first swap space is full of swapped out pages and soon system memory also is filled up with ttm pages. and then any memory allocation request will run into OOM. to cover two cases: a. if no swap disk at all or free swap space is under swap mem limit but available system mem is bigger than sys mem limit, allow TTM allocation; b. if the available system mem is less than sys mem limit but free swap space is bigger than swap mem limit, allow TTM allocation. v2: merge two memory limit(swap and system) into one v3: keep original behavior except ttm_opt_ctx->flags with TTM_OPT_FLAG_FORCE_ALLOC v4: always set force_alloc as tx->flags & TTM_OPT_FLAG_FORCE_ALLOC v5: add an attribute for lower_mem_limit v6: set lower_mem_limit as 0 to keep original behavior Signed-off-by: Roger He <Hongbo.He@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-26drm/ttm: drop ttm->globChristian König1-5/+6
The pointer is available as ttm->bdev->glob as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19drm/ttm: Simplify ttm_dma_page_put()Tom St Denis1-3/+1
Remove redundant store of return code. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19drm/ttm: Fix coding style in ttm_dma_pool_alloc_new_pages()Tom St Denis1-2/+1
Add missing {} braces. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19drm/ttm: Simplify ttm_dma_find_pool() (v2)Tom St Denis1-9/+6
Flip the logic of the comparison and remove the redudant variable for the pool address. (v2): Remove {} bracing. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19drm/ttm: Fix coding style in ttm_pool_store()Tom St Denis1-3/+4
Correct missing {} style. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19drm/ttm: Allow page allocations w/o triggering OOM..Andrey Grodzovsky1-0/+3
This to allow drivers to choose to avoid OOM invocation and handle page allocation failures instead. v2: Remove extra new lines. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Roger He <Hongbo.He@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-01-18drm/ttm: add VADDR_FLAG_UPDATED_COUNT to correctly update dma_page global countRoger He1-26/+29
add this for correctly updating global mem count in ttm_mem_zone. before that when ttm_mem_global_alloc_page fails, we would update all dma_page's global mem count in ttm_dma->pages_list. but actually here we should not update for the last dma_page. v2: only the update of last dma_page is not right v3: use lower bits of dma_page vaddr Signed-off-by: Roger He <Hongbo.He@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-01-16drm/ttm: check the return value of register_shrinkerRoger He1-10/+14
This fixes the build warning: "ignoring return value of 'register_shrinker', declared with attribute warn_unused_result [-Wunused-result]" Signed-off-by: Roger He <Hongbo.He@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-01-10drm/ttm: specify DMA_ATTR_NO_WARN for huge page poolsChristian König1-2/+6
Suppress warning messages when allocating huge pages fails since we can always fall back to normal pages. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-28drm/ttm: use an operation ctx for ttm_tt_populate in ttm_bo_driver (v2)Roger He1-7/+4
forward the operation context to ttm_tt_populate as well, and the ultimate goal is swapout enablement for reserved BOs. v2: squash in fix for vboxvideo Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Roger He <Hongbo.He@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-27drm/ttm: use an operation ctx for ttm_mem_global_alloc_pageRoger He1-2/+6
forward the operation context to ttm_mem_global_alloc_page as well, and the ultimate goal is swapout enablement for reserved BOs. Here reserved BOs refer to all the BOs which share same reservation object Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Roger He <Hongbo.He@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06drm/ttm: Use a static string instead of an array of char *Joe Perches1-4/+1
Make the object a bit smaller by using a simple string instead of a format string and array of char *. $ size drivers/gpu/drm/ttm/ttm_page_alloc_dma.o* text data bss dec hex filename 8820 216 4136 13172 3374 drivers/gpu/drm/ttm/ttm_page_alloc_dma.o.defconfig.new 8910 216 4136 13262 33ce drivers/gpu/drm/ttm/ttm_page_alloc_dma.o.defconfig.old 25383 5044 4384 34811 87fb drivers/gpu/drm/ttm/ttm_page_alloc_dma.o.allyesconfig.new 25797 5428 4384 35609 8b19 drivers/gpu/drm/ttm/ttm_page_alloc_dma.o.allyesconfig.old Miscellanea: o The h array had more entries than were emitted, all are now removed Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-11-04drm/ttm: Downgrade pr_err to pr_debug for memory allocation failuresMichel Dänzer1-6/+6
Memory allocation failure should generally be handled gracefully by callers. In particular, with transparent hugepage support, attempts to allocate huge pages can fail under memory pressure, but the callers fall back to allocating individual pages instead. In that case, there would be spurious [TTM] Unable to get page %u error messages in dmesg. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-19drm/ttm: don't use compound pages for nowChristian König1-0/+1
We need to figure out first how to correctly map them into the CPU page tables. bug: https://bugs.freedesktop.org/show_bug.cgi?id=103138 Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-06drm/ttm: add transparent huge page support for DMA allocations v2Christian König1-48/+169
Try to allocate huge pages when it makes sense. v2: fix comment and use ifdef Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-06drm/ttm: add support for different pool sizesChristian König1-3/+4
Correctly handle different page sizes in the memory accounting. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-06drm/ttm: remove unsued options from ttm_mem_global_alloc_pageChristian König1-2/+1
Nobody is actually using that, remove it. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26drm/ttm: cleanup ttm_page_alloc_dma.cChristian König1-26/+16
Remove unused defines and variables. Also stop computing the gfp_flags when they aren't used. No intended functional change. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-08drm: use set_memory.h headerLaura Abbott1-0/+3
set_memory_* functions have moved to set_memory.h. Switch to this explicitly. [akpm@linux-foundation.org: track drivers/gpu/drm/i915/i915_gem_gtt.c linux-next changes] Link: http://lkml.kernel.org/r/1488920133-27229-8-git-send-email-labbott@redhat.com Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-16drm/ttm: remove cpu_address member from ttm_ttAlexandre Courbot1-2/+0
Patch 3d50d4dcb0 exposed the CPU address of DMA-allocated pages as returned by dma_alloc_coherent because Nouveau on Tegra needed it. This is not required anymore - as there were no other users for it, remove it and save some memory for everyone. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-03-30drm/ttm: Remove TTM_HAS_AGPDaniel Vetter1-4/+4
It tries to do fancy things with excluding agp support if ttm is built-in, but agp isn't. Instead just express this depency like drm does and use CONFIG_AGP everywhere. Also use the neat Makefile magic to make the entire ttm_agp_backend file optional. v2: Use IS_ENABLED(CONFIG_AGP) as suggested by Ville v3: Review from Emil. v4: Actually get it right as spotted by 0-day. Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1459337046-25882-1-git-send-email-daniel.vetter@ffwll.ch
2015-07-17drm/ttm: improve uncached page deallocation.Jérôme Glisse1-6/+6
Calls to set_memory_wb() incure heavy TLB flush and IPI cost. To minimize those wait until pool grow beyond batch size before draining the pool. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-and-Tested-by: Michel Dänzer <michel@daenzer.net> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-07-17drm/ttm: fix uncached page deallocation to properly fill page pool v3.Jérôme Glisse1-1/+0
Current code never allowed the page pool to actualy fill in anyway. This fix it, so that we only start freeing page from the pool when we go over the pool size. Changed since v1: - Move the page batching optimization to its separate patch. Changed since v2: - Do not remove code part of the batching optimization with this patch. - Better commit message. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-06-02drm/ttm: dma: Don't crash on memory in the vmalloc rangeAlexandre Courbot1-3/+6
dma_alloc_coherent() can return memory in the vmalloc range. virt_to_page() cannot handle such addresses and crashes. This patch detects such cases and obtains the struct page * using vmalloc_to_page() instead. Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-11-20drm/ttm: Avoid memory allocation from shrinker functions.Tetsuo Handa1-10/+15
Andrew Morton wrote: > On Wed, 12 Nov 2014 13:08:55 +0900 Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> wrote: > > > Andrew Morton wrote: > > > Poor ttm guys - this is a bit of a trap we set for them. > > > > Commit a91576d7916f6cce ("drm/ttm: Pass GFP flags in order to avoid deadlock.") > > changed to use sc->gfp_mask rather than GFP_KERNEL. > > > > - pages_to_free = kmalloc(npages_to_free * sizeof(struct page *), > > - GFP_KERNEL); > > + pages_to_free = kmalloc(npages_to_free * sizeof(struct page *), gfp); > > > > But this bug is caused by sc->gfp_mask containing some flags which are not > > in GFP_KERNEL, right? Then, I think > > > > - pages_to_free = kmalloc(npages_to_free * sizeof(struct page *), gfp); > > + pages_to_free = kmalloc(npages_to_free * sizeof(struct page *), gfp & GFP_KERNEL); > > > > would hide this bug. > > > > But I think we should use GFP_ATOMIC (or drop __GFP_WAIT flag) > > Well no - ttm_page_pool_free() should stop calling kmalloc altogether. > Just do > > struct page *pages_to_free[16]; > > and rework the code to free 16 pages at a time. Easy. Well, ttm code wants to process 512 pages at a time for performance. Memory footprint increased by 512 * sizeof(struct page *) buffer is only 4096 bytes. What about using static buffer like below? ---------- >From d3cb5393c9c8099d6b37e769f78c31af1541fe8c Mon Sep 17 00:00:00 2001 From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Date: Thu, 13 Nov 2014 22:21:54 +0900 Subject: [PATCH] drm/ttm: Avoid memory allocation from shrinker functions. Commit a91576d7916f6cce ("drm/ttm: Pass GFP flags in order to avoid deadlock.") caused BUG_ON() due to sc->gfp_mask containing flags which are not in GFP_KERNEL. https://bugzilla.kernel.org/show_bug.cgi?id=87891 Changing from sc->gfp_mask to (sc->gfp_mask & GFP_KERNEL) would avoid the BUG_ON(), but avoiding memory allocation from shrinker function is better and reliable fix. Shrinker function is already serialized by global lock, and clean up function is called after shrinker function is unregistered. Thus, we can use static buffer when called from shrinker function and clean up function. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: stable <stable@kernel.org> [2.6.35+] Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-08-10drm/ttm: expose CPU address of DMA-allocated pagesAlexandre Courbot1-0/+2
Pages allocated using the DMA API have a coherent memory mapping. Make this mapping visible to drivers so they can decide to use it instead of creating their own redundant one. Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Acked-by: David Airlie <airlied@linux.ie> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2014-08-05drm/ttm: Pass GFP flags in order to avoid deadlock.Tetsuo Handa1-10/+9
Commit 7dc19d5a "drivers: convert shrinkers to new count/scan API" added deadlock warnings that ttm_page_pool_free() and ttm_dma_page_pool_free() are currently doing GFP_KERNEL allocation. But these functions did not get updated to receive gfp_t argument. This patch explicitly passes sc->gfp_mask or GFP_KERNEL to these functions, and removes the deadlock warning. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: stable <stable@kernel.org> [2.6.35+] Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-08-05drm/ttm: Use mutex_trylock() to avoid deadlock inside shrinker functions.Tetsuo Handa1-2/+4
I can observe that RHEL7 environment stalls with 100% CPU usage when a certain type of memory pressure is given. While the shrinker functions are called by shrink_slab() before the OOM killer is triggered, the stall lasts for many minutes. One of reasons of this stall is that ttm_dma_pool_shrink_count()/ttm_dma_pool_shrink_scan() are called and are blocked at mutex_lock(&_manager->lock). GFP_KERNEL allocation with _manager->lock held causes someone (including kswapd) to deadlock when these functions are called due to memory pressure. This patch changes "mutex_lock();" to "if (!mutex_trylock()) return ...;" in order to avoid deadlock. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: stable <stable@kernel.org> [3.3+] Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-08-05drm/ttm: Choose a pool to shrink correctly in ttm_dma_pool_shrink_scan().Tetsuo Handa1-3/+3
We can use "unsigned int" instead of "atomic_t" by updating start_pool variable under _manager->lock. This patch will make it possible to avoid skipping when choosing a pool to shrink in round-robin style, after next patch changes mutex_lock(_manager->lock) to !mutex_trylock(_manager->lork). Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: stable <stable@kernel.org> [3.3+] Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-08-05drm/ttm: Fix possible division by 0 in ttm_dma_pool_shrink_scan().Tetsuo Handa1-0/+3
list_empty(&_manager->pools) being false before taking _manager->lock does not guarantee that _manager->npools != 0 after taking _manager->lock because _manager->npools is updated under _manager->lock. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: stable <stable@kernel.org> [3.3+] Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-11-06drm/ttm: Enable the dma page pool also for intel IOMMUsThomas Hellstrom1-0/+3
Used by the vmwgfx driver Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-09-10drivers: convert shrinkers to new count/scan APIDave Chinner1-19/+32
Convert the driver shrinkers to the new API. Most changes are compile tested only because I either don't have the hardware or it's staging stuff. FWIW, the md and android code is pretty good, but the rest of it makes me want to claw my eyes out. The amount of broken code I just encountered is mind boggling. I've added comments explaining what is broken, but I fear that some of the code would be best dealt with by being dragged behind the bike shed, burying in mud up to it's neck and then run over repeatedly with a blunt lawn mower. Special mention goes to the zcache/zcache2 drivers. They can't co-exist in the build at the same time, they are under different menu options in menuconfig, they only show up when you've got the right set of mm subsystem options configured and so even compile testing is an exercise in pulling teeth. And that doesn't even take into account the horrible, broken code... [glommer@openvz.org: fixes for i915, android lowmem, zcache, bcache] Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Glauber Costa <glommer@openvz.org> Acked-by: Mel Gorman <mgorman@suse.de> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Kent Overstreet <koverstreet@google.com> Cc: John Stultz <john.stultz@linaro.org> Cc: David Rientjes <rientjes@google.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Thomas Hellstrom <thellstrom@vmware.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Cc: Arve Hjønnevåg <arve@android.com> Cc: Carlos Maiolino <cmaiolino@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: David Rientjes <rientjes@google.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: J. Bruce Fields <bfields@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Stultz <john.stultz@linaro.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Kent Overstreet <koverstreet@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Thomas Hellstrom <thellstrom@vmware.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-03Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds1-4/+1
Pull drm merge (part 1) from Dave Airlie: "So first of all my tree and uapi stuff has a conflict mess, its my fault as the nouveau stuff didn't hit -next as were trying to rebase regressions out of it before we merged. Highlights: - SH mobile modesetting driver and associated helpers - some DRM core documentation - i915 modesetting rework, haswell hdmi, haswell and vlv fixes, write combined pte writing, ilk rc6 support, - nouveau: major driver rework into a hw core driver, makes features like SLI a lot saner to implement, - psb: add eDP/DP support for Cedarview - radeon: 2 layer page tables, async VM pte updates, better PLL selection for > 2 screens, better ACPI interactions The rest is general grab bag of fixes. So why part 1? well I have the exynos pull req which came in a bit late but was waiting for me to do something they shouldn't have and it looks fairly safe, and David Howells has some more header cleanups he'd like me to pull, that seem like a good idea, but I'd like to get this merge out of the way so -next dosen't get blocked." Tons of conflicts mostly due to silly include line changes, but mostly mindless. A few other small semantic conflicts too, noted from Dave's pre-merged branch. * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (447 commits) drm/nv98/crypt: fix fuc build with latest envyas drm/nouveau/devinit: fixup various issues with subdev ctor/init ordering drm/nv41/vm: fix and enable use of "real" pciegart drm/nv44/vm: fix and enable use of "real" pciegart drm/nv04/dmaobj: fixup vm target handling in preparation for nv4x pcie drm/nouveau: store supported dma mask in vmmgr drm/nvc0/ibus: initial implementation of subdev drm/nouveau/therm: add support for fan-control modes drm/nouveau/hwmon: rename pwm0* to pmw1* to follow hwmon's rules drm/nouveau/therm: calculate the pwm divisor on nv50+ drm/nouveau/fan: rewrite the fan tachometer driver to get more precision, faster drm/nouveau/therm: move thermal-related functions to the therm subdev drm/nouveau/bios: parse the pwm divisor from the perf table drm/nouveau/therm: use the EXTDEV table to detect i2c monitoring devices drm/nouveau/therm: rework thermal table parsing drm/nouveau/gpio: expose the PWM/TOGGLE parameter found in the gpio vbios table drm/nouveau: fix pm initialization order drm/nouveau/bios: check that fixed tvdac gpio data is valid before using it drm/nouveau: log channel debug/error messages from client object rather than drm client drm/nouveau: have drm debugging macros build on top of core macros ...
2012-10-02UAPI: (Scripted) Convert #include "..." to #include <path/...> in drivers/gpu/David Howells1-2/+2
Convert #include "..." to #include <path/...> in drivers/gpu/. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Jones <davej@redhat.com>
2012-10-02drivers/gpu/drm/ttm/ttm_page_alloc_dma.c: Remove useless kfreePeter Senna Tschudin1-4/+1
Remove useless kfree() and clean up code related to the removal. The semantic patch that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r exists@ position p1,p2; expression x; @@ if (x@p1 == NULL) { ... kfree@p2(x); ... return ...; } @unchanged exists@ position r.p1,r.p2; expression e <= r.x,x,e1; iterator I; statement S; @@ if (x@p1 == NULL) { ... when != I(x,...) S when != e = e1 when != e += e1 when != e -= e1 when != ++e when != --e when != e++ when != e-- when != &e kfree@p2(x); ... return ...; } @ok depends on unchanged exists@ position any r.p1; position r.p2; expression x; @@ ... when != true x@p1 == NULL kfree@p2(x); @depends on !ok && unchanged@ position r.p2; expression x; @@ *kfree@p2(x); // </smpl> Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-03-20drm/ttm: Use pr_fmt and pr_<level>Joe Perches1-34/+26
Use the more current logging style. Add pr_fmt and remove the TTM_PFX uses. Coalesce formats and align arguments. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-01-13ttm/dma: Remove the WARN() which is not useful.Konrad Rzeszutek Wilk1-3/+2
. It was useful during development, but now on a production system we can get this (if the user forgot to upload the firmware): [drm] radeon: irq initialized. [drm] GART: num cpu pages 131072, num gpu pages 131072 [drm] radeon: ib pool ready. [drm] Loading SUMO Microcode r600_cp: Failed to load firmware "radeon/SUMO_pfp.bin" atl1c 0000:03:00.0: version 1.0.1.0-NAPI.213057] [drm:evergreen_startup] *ERROR* Failed to load firmware! radeon 0000:00:01.0: disabling GPU acceleration 88] radeon 0000:00:01.0: ffff8801bb782400 unpin not necessary ------------[ cut here ]------------ WARNING: at /home/konrad/linux-linus/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c:956 ttm_dma_unpopulate+0x79/0x300 [ttm]() Hardware name: System Product Name Modules linked in: e1000e atl1c radeon(+) ahci libahci libata scsi_mod fbcon tileblit font ttm bitblit softcursor drm_kms_helper wmi xen_blkfront xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea xenfs xen_privcmd Pid: 1600, comm: modprobe Not tainted 3.2.0-06100-ge343a89 #1 Call Trace: [<ffffffff8108973a>] warn_slowpath_common+0x7a/0xb0 [<ffffffff81089785>] warn_slowpath_null+0x15/0x20 [<ffffffffa0060309>] ttm_dma_unpopulate+0x79/0x300 [ttm] [<ffffffffa01341c0>] radeon_ttm_tt_unpopulate+0x120/0x130 [radeon] [<ffffffffa0056e0c>] ttm_tt_destroy+0x2c/0x70 [ttm] [<ffffffffa0057a4e>] ttm_bo_cleanup_memtype_use+0x3e/0x80 [ttm] [<ffffffffa00595a1>] ttm_bo_release+0x251/0x280 [ttm] [<ffffffffa0059610>] ttm_bo_unref+0x40/0x60 [ttm] [<ffffffffa0134d02>] radeon_bo_unref+0x42/0x80 [radeon] [<ffffffffa0186dfb>] radeon_sa_bo_manager_fini+0x6b/0x80 [radeon] [<ffffffffa0146b8f>] radeon_ib_pool_fini+0x6f/0x90 [radeon] [<ffffffffa014be49>] r100_ib_fini+0x19/0x20 [radeon] [<ffffffffa017b47e>] evergreen_init+0x1ee/0x2d0 [radeon] The big WARN() has nothing to do with the culprit - which is that the firmware was not loaded. So lets remove the WARN() from the TTM DMA code. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Jerome Glisse <jglisse@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-01-10drm/ttm: fix condition (and vs or)Dan Carpenter1-5/+3
The "if (!p && !p->dev)" condition isn't right because || was intended instead of &&. But actually, "p" is the list cursor and so it's always non-NULL and we can just remove that bit. We can remove the another similar check as well. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Jerome Glisse <jglisse@redhat.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-01-06drm/ttm/dma: Fix accounting error when calling ttm_mem_global_free_page and ↵Konrad Rzeszutek Wilk1-5/+10
don't try to free freed pages. The code to figure out how many pages to shrink the pool ends up capping the 'count' at _manager->options.max_size - which is OK. Except that the 'count' is also used when accounting for how many pages are recycled - which we end up with the invalid values. This fixes it by using a different value for the amount of pages to shrink. On top of that we would free the cached page pool - which is nonsense as they are deleted from the pool - so there are no free pages in that pool.. Also we also missed the opportunity to batch the amount of pages to free (similar to how ttm_page_alloc.c does it). This reintroduces the code that was lost during rebasing. Reviewed-by: Jerome Glisse <jglisse@redhat.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-01-06drm/ttm/dma: Only call set_pages_array_wb when the page is not in WB pool.Konrad Rzeszutek Wilk1-2/+5
Otherwise we are doing redundant work. Especially since the 'unbind' and 'unpopulate' have been merged and nouveau driver ends up calling it quite excessivly. On a GeForce 8600 GT with Gnome Shell (GNOME 3) we end up spending about 54% CPU time in __change_page_attr_set_clr checking the page flags. The callgraph (annotated) looks as so before this patch: 53.29% gnome-shell [kernel.kallsyms] [k] static_protections | --- static_protections | |--91.80%-- __change_page_attr_set_clr | change_page_attr_set_clr | set_pages_array_wb | | | |--96.55%-- ttm_dma_unpopulate | | nouveau_ttm_tt_unpopulate | | ttm_tt_destroy | | ttm_bo_cleanup_memtype_use | | ttm_bo_release | | kref_put | | ttm_bo_unref | | nouveau_gem_object_del | | drm_gem_object_free | | kref_put | | drm_gem_object_unreference_unlocked | | drm_gem_object_handle_unreference_unlocked.part.1 | | drm_gem_handle_delete | | drm_gem_close_ioctl | | drm_ioctl | | do_vfs_ioctl | | sys_ioctl | | system_call_fastpath | | __GI___ioctl | | | --3.45%-- ttm_dma_pages_put | ttm_dma_page_pool_free | ttm_dma_unpopulate | nouveau_ttm_tt_unpopulate | ttm_tt_destroy | ttm_bo_cleanup_memtype_use | ttm_bo_release | kref_put | ttm_bo_unref | nouveau_gem_object_del | drm_gem_object_free | kref_put | drm_gem_object_unreference_unlocked | drm_gem_object_handle_unreference_unlocked.part.1 | drm_gem_handle_delete | drm_gem_close_ioctl | drm_ioctl | do_vfs_ioctl | sys_ioctl | system_call_fastpath | __GI___ioctl | --8.20%-- change_page_attr_set_clr set_pages_array_wb | |--93.76%-- ttm_dma_unpopulate | nouveau_ttm_tt_unpopulate | ttm_tt_destroy | ttm_bo_cleanup_memtype_use | ttm_bo_release | kref_put | ttm_bo_unref | nouveau_gem_object_del | drm_gem_object_free | kref_put | drm_gem_object_unreference_unlocked | drm_gem_object_handle_unreference_unlocked.part.1 | drm_gem_handle_delete | drm_gem_close_ioctl | drm_ioctl | do_vfs_ioctl | sys_ioctl | system_call_fastpath | __GI___ioctl | --6.24%-- ttm_dma_pages_put ttm_dma_page_pool_free ttm_dma_unpopulate nouveau_ttm_tt_unpopulate ttm_tt_destroy ttm_bo_cleanup_memtype_use ttm_bo_release kref_put ttm_bo_unref nouveau_gem_object_del drm_gem_object_free kref_put drm_gem_object_unreference_unlocked drm_gem_object_handle_unreference_unlocked.part.1 drm_gem_handle_delete drm_gem_close_ioctl drm_ioctl do_vfs_ioctl sys_ioctl system_call_fastpath __GI___ioctl and after this patch all of that disappears. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-12-06drm/ttm: isolate dma data from ttm_tt V4Jerome Glisse1-16/+19
Move dma data to a superset ttm_dma_tt structure which herit from ttm_tt. This allow driver that don't use dma functionalities to not have to waste memory for it. V2 Rebase on top of no memory account changes (where/when is my delorean when i need it ?) V3 Make sure page list is initialized empty V4 typo/syntax fixes Signed-off-by: Jerome Glisse <jglisse@redhat.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
2011-12-06drm/ttm: provide dma aware ttm page pool code V9Konrad Rzeszutek Wilk1-0/+1134
In TTM world the pages for the graphic drivers are kept in three different pools: write combined, uncached, and cached (write-back). When the pages are used by the graphic driver the graphic adapter via its built in MMU (or AGP) programs these pages in. The programming requires the virtual address (from the graphic adapter perspective) and the physical address (either System RAM or the memory on the card) which is obtained using the pci_map_* calls (which does the virtual to physical - or bus address translation). During the graphic application's "life" those pages can be shuffled around, swapped out to disk, moved from the VRAM to System RAM or vice-versa. This all works with the existing TTM pool code - except when we want to use the software IOTLB (SWIOTLB) code to "map" the physical addresses to the graphic adapter MMU. We end up programming the bounce buffer's physical address instead of the TTM pool memory's and get a non-worky driver. There are two solutions: 1) using the DMA API to allocate pages that are screened by the DMA API, or 2) using the pci_sync_* calls to copy the pages from the bounce-buffer and back. This patch fixes the issue by allocating pages using the DMA API. The second is a viable option - but it has performance drawbacks and potential correctness issues - think of the write cache page being bounced (SWIOTLB->TTM), the WC is set on the TTM page and the copy from SWIOTLB not making it to the TTM page until the page has been recycled in the pool (and used by another application). The bounce buffer does not get activated often - only in cases where we have a 32-bit capable card and we want to use a page that is allocated above the 4GB limit. The bounce buffer offers the solution of copying the contents of that 4GB page to an location below 4GB and then back when the operation has been completed (or vice-versa). This is done by using the 'pci_sync_*' calls. Note: If you look carefully enough in the existing TTM page pool code you will notice the GFP_DMA32 flag is used - which should guarantee that the provided page is under 4GB. It certainly is the case, except this gets ignored in two cases: - If user specifies 'swiotlb=force' which bounces _every_ page. - If user is using a Xen's PV Linux guest (which uses the SWIOTLB and the underlaying PFN's aren't necessarily under 4GB). To not have this extra copying done the other option is to allocate the pages using the DMA API so that there is not need to map the page and perform the expensive 'pci_sync_*' calls. This DMA API capable TTM pool requires for this the 'struct device' to properly call the DMA API. It also has to track the virtual and bus address of the page being handed out in case it ends up being swapped out or de-allocated - to make sure it is de-allocated using the proper's 'struct device'. Implementation wise the code keeps two lists: one that is attached to the 'struct device' (via the dev->dma_pools list) and a global one to be used when the 'struct device' is unavailable (think shrinker code). The global list can iterate over all of the 'struct device' and its associated dma_pool. The list in dev->dma_pools can only iterate the device's dma_pool. /[struct device_pool]\ /---------------------------------------------------| dev | / +-------| dma_pool | /-----+------\ / \--------------------/ |struct device| /-->[struct dma_pool for WC]</ /[struct device_pool]\ | dma_pools +----+ /-| dev | | ... | \--->[struct dma_pool for uncached]<-/--| dma_pool | \-----+------/ / \--------------------/ \----------------------------------------------/ [Two pools associated with the device (WC and UC), and the parallel list containing the 'struct dev' and 'struct dma_pool' entries] The maximum amount of dma pools a device can have is six: write-combined, uncached, and cached; then there are the DMA32 variants which are: write-combined dma32, uncached dma32, and cached dma32. Currently this code only gets activated when any variant of the SWIOTLB IOMMU code is running (Intel without VT-d, AMD without GART, IBM Calgary and Xen PV with PCI devices). Tested-by: Michel Dänzer <michel@daenzer.net> [v1: Using swiotlb_nr_tbl instead of swiotlb_enabled] [v2: Major overhaul - added 'inuse_list' to seperate used from inuse and reorder the order of lists to get better performance.] [v3: Added comments/and some logic based on review, Added Jerome tag] [v4: rebase on top of ttm_tt & ttm_backend merge] [v5: rebase on top of ttm memory accounting overhaul] [v6: New rebase on top of more memory accouting changes] [v7: well rebase on top of no memory accounting changes] [v8: make sure pages list is initialized empty] [v9: calll ttm_mem_global_free_page in unpopulate for accurate accountg] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Jerome Glisse <jglisse@redhat.com> Acked-by: Thomas Hellstrom <thellstrom@vmware.com>