Message ID | 1630286290-43714-2-git-send-email-linyunsheng@huawei.com |
---|---|
State | New |
Series | some optimization for page pool |
On 2021/8/30 23:05, Alexander Duyck wrote:
> On Sun, Aug 29, 2021 at 6:19 PM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>>
>> Currently when PP_FLAG_PAGE_FRAG is set, the caller is not
>> expected to call page_pool_alloc_pages() directly because of
>> the PP_FLAG_PAGE_FRAG checking in __page_pool_put_page().
>>
>> The patch removes the above checking to enable non-split page
>> support when PP_FLAG_PAGE_FRAG is set.
>>
>> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
>> ---
>>  include/net/page_pool.h |  6 ++++++
>>  net/core/page_pool.c    | 12 +++++++-----
>>  2 files changed, 13 insertions(+), 5 deletions(-)
>>
>> diff --git a/include/net/page_pool.h b/include/net/page_pool.h
>> index a408240..2ad0706 100644
>> --- a/include/net/page_pool.h
>> +++ b/include/net/page_pool.h
>> @@ -238,6 +238,9 @@ static inline void page_pool_set_dma_addr(struct page *page, dma_addr_t addr)
>>
>>  static inline void page_pool_set_frag_count(struct page *page, long nr)
>>  {
>> +	if (PAGE_POOL_DMA_USE_PP_FRAG_COUNT)
>> +		return;
>> +
>>  	atomic_long_set(&page->pp_frag_count, nr);
>>  }
>>
>> @@ -246,6 +249,9 @@ static inline long page_pool_atomic_sub_frag_count_return(struct page *page,
>>  {
>>  	long ret;
>>
>> +	if (PAGE_POOL_DMA_USE_PP_FRAG_COUNT)
>> +		return 0;
>> +
>>  	/* As suggested by Alexander, atomic_long_read() may cover up the
>>  	 * reference count errors, so avoid calling atomic_long_read() in
>>  	 * the cases of freeing or draining the page_frags, where we would
>> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
>> index 1a69784..ba9f14d 100644
>> --- a/net/core/page_pool.c
>> +++ b/net/core/page_pool.c
>> @@ -313,11 +313,14 @@ struct page *page_pool_alloc_pages(struct page_pool *pool, gfp_t gfp)
>>
>>  	/* Fast-path: Get a page from cache */
>>  	page = __page_pool_get_cached(pool);
>> -	if (page)
>> -		return page;
>>
>>  	/* Slow-path: cache empty, do real allocation */
>> -	page = __page_pool_alloc_pages_slow(pool, gfp);
>> +	if (!page)
>> +		page = __page_pool_alloc_pages_slow(pool, gfp);
>> +
>> +	if (likely(page))
>> +		page_pool_set_frag_count(page, 1);
>> +
>>  	return page;
>>  }
>>  EXPORT_SYMBOL(page_pool_alloc_pages);
>> @@ -426,8 +429,7 @@ __page_pool_put_page(struct page_pool *pool, struct page *page,
>>  		     unsigned int dma_sync_size, bool allow_direct)
>>  {
>>  	/* It is not the last user for the page frag case */
>> -	if (pool->p.flags & PP_FLAG_PAGE_FRAG &&
>> -	    page_pool_atomic_sub_frag_count_return(page, 1))
>> +	if (page_pool_atomic_sub_frag_count_return(page, 1))
>>  		return NULL;
>
> Isn't this going to have a negative performance impact on page pool
> pages in general? Essentially you are adding an extra atomic operation
> for all the non-frag pages.
>
> It would work better if this was doing a check against 1 to determine
> if it is okay for this page to be freed here and only if the check
> fails then you perform the atomic sub_return.

The page_pool_atomic_sub_frag_count_return() has added the optimization
to not do the atomic sub_return when the caller is the last user of the
page, see page_pool_atomic_sub_frag_count_return():

	/* As suggested by Alexander, atomic_long_read() may cover up the
	 * reference count errors, so avoid calling atomic_long_read() in
	 * the cases of freeing or draining the page_frags, where we would
	 * not expect it to match or that are slowpath anyway.
	 */
	if (__builtin_constant_p(nr) &&
	    atomic_long_read(&page->pp_frag_count) == nr)
		return 0;

So the check against 1 is not needed here?
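
Editorial sketch: the shortcut quoted in the reply can be modelled in userspace to see what it returns. This is only an illustration, assuming C11 <stdatomic.h> in place of the kernel's atomic_long_t helpers and a GCC or Clang compiler for __builtin_constant_p(). The names struct model_page and model_sub_frag_count_return() are made up for this sketch and are not kernel symbols.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the pp_frag_count field of struct page. */
struct model_page {
	atomic_long pp_frag_count;
};

/*
 * Userspace model of page_pool_atomic_sub_frag_count_return().
 * Returns the number of users left after dropping nr references.
 */
static inline long model_sub_frag_count_return(struct model_page *page, long nr)
{
	long ret;

	/*
	 * Shortcut: when nr is a compile-time constant and equals the
	 * current count, the caller holds the last references, so a plain
	 * read is enough and the read-modify-write is skipped.
	 */
	if (__builtin_constant_p(nr) &&
	    atomic_load(&page->pp_frag_count) == nr)
		return 0;

	/* atomic_fetch_sub() returns the old value; subtract nr for the new one. */
	ret = atomic_fetch_sub(&page->pp_frag_count, nr) - nr;
	assert(ret >= 0);	/* a negative count would be a refcount bug */
	return ret;
}
```

Whether __builtin_constant_p(nr) evaluates to true here depends on inlining and the optimization level. The return value is the same on both paths, but the stored count is not: the shortcut leaves it at nr, while the subtraction leaves it at 0.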
On Mon, Aug 30, 2021 at 11:14 PM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>
> On 2021/8/30 23:05, Alexander Duyck wrote:
> > On Sun, Aug 29, 2021 at 6:19 PM Yunsheng Lin <linyunsheng@huawei.com> wrote:
> >>
> >> Currently when PP_FLAG_PAGE_FRAG is set, the caller is not
> >> expected to call page_pool_alloc_pages() directly because of
> >> the PP_FLAG_PAGE_FRAG checking in __page_pool_put_page().
> >>
> >> The patch removes the above checking to enable non-split page
> >> support when PP_FLAG_PAGE_FRAG is set.
> >>
> >> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
> >> ---
> >>  include/net/page_pool.h |  6 ++++++
> >>  net/core/page_pool.c    | 12 +++++++-----
> >>  2 files changed, 13 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/include/net/page_pool.h b/include/net/page_pool.h
> >> index a408240..2ad0706 100644
> >> --- a/include/net/page_pool.h
> >> +++ b/include/net/page_pool.h
> >> @@ -238,6 +238,9 @@ static inline void page_pool_set_dma_addr(struct page *page, dma_addr_t addr)
> >>
> >>  static inline void page_pool_set_frag_count(struct page *page, long nr)
> >>  {
> >> +	if (PAGE_POOL_DMA_USE_PP_FRAG_COUNT)
> >> +		return;
> >> +
> >>  	atomic_long_set(&page->pp_frag_count, nr);
> >>  }
> >>
> >> @@ -246,6 +249,9 @@ static inline long page_pool_atomic_sub_frag_count_return(struct page *page,
> >>  {
> >>  	long ret;
> >>
> >> +	if (PAGE_POOL_DMA_USE_PP_FRAG_COUNT)
> >> +		return 0;
> >> +
> >>  	/* As suggested by Alexander, atomic_long_read() may cover up the
> >>  	 * reference count errors, so avoid calling atomic_long_read() in
> >>  	 * the cases of freeing or draining the page_frags, where we would
> >> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> >> index 1a69784..ba9f14d 100644
> >> --- a/net/core/page_pool.c
> >> +++ b/net/core/page_pool.c
> >> @@ -313,11 +313,14 @@ struct page *page_pool_alloc_pages(struct page_pool *pool, gfp_t gfp)
> >>
> >>  	/* Fast-path: Get a page from cache */
> >>  	page = __page_pool_get_cached(pool);
> >> -	if (page)
> >> -		return page;
> >>
> >>  	/* Slow-path: cache empty, do real allocation */
> >> -	page = __page_pool_alloc_pages_slow(pool, gfp);
> >> +	if (!page)
> >> +		page = __page_pool_alloc_pages_slow(pool, gfp);
> >> +
> >> +	if (likely(page))
> >> +		page_pool_set_frag_count(page, 1);
> >> +
> >>  	return page;
> >>  }
> >>  EXPORT_SYMBOL(page_pool_alloc_pages);
> >> @@ -426,8 +429,7 @@ __page_pool_put_page(struct page_pool *pool, struct page *page,
> >>  		     unsigned int dma_sync_size, bool allow_direct)
> >>  {
> >>  	/* It is not the last user for the page frag case */
> >> -	if (pool->p.flags & PP_FLAG_PAGE_FRAG &&
> >> -	    page_pool_atomic_sub_frag_count_return(page, 1))
> >> +	if (page_pool_atomic_sub_frag_count_return(page, 1))
> >>  		return NULL;
> >
> > Isn't this going to have a negative performance impact on page pool
> > pages in general? Essentially you are adding an extra atomic operation
> > for all the non-frag pages.
> >
> > It would work better if this was doing a check against 1 to determine
> > if it is okay for this page to be freed here and only if the check
> > fails then you perform the atomic sub_return.
>
> The page_pool_atomic_sub_frag_count_return() has added the optimization
> to not do the atomic sub_return when the caller is the last user of the
> page, see page_pool_atomic_sub_frag_count_return():
>
> 	/* As suggested by Alexander, atomic_long_read() may cover up the
> 	 * reference count errors, so avoid calling atomic_long_read() in
> 	 * the cases of freeing or draining the page_frags, where we would
> 	 * not expect it to match or that are slowpath anyway.
> 	 */
> 	if (__builtin_constant_p(nr) &&
> 	    atomic_long_read(&page->pp_frag_count) == nr)
> 		return 0;
>
> So the check against 1 is not needed here?

Ah, okay. I hadn't seen that part. So yeah, then this should be mostly
harmless since 1 falls into the category of a builtin constant and
would result in the standard case being the frag count being set to 1
and then being read which should be minimal overhead.

Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
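
Editorial sketch: the "set to 1, then read" path described in this reply can be made visible by counting read-modify-write operations in a small userspace model. It assumes C11 <stdatomic.h> instead of the kernel atomics, and it hard-codes the comparison against 1 rather than relying on __builtin_constant_p(), so the behaviour does not depend on compiler optimization. All model_* names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct model_page {
	atomic_long pp_frag_count;
};

/* Counts read-modify-write operations so their cost can be observed. */
static unsigned long model_rmw_ops;

/* Model of the patched page_pool_alloc_pages(): a whole page has one user. */
static void model_alloc_whole_page(struct model_page *page)
{
	atomic_store(&page->pp_frag_count, 1);
}

/* Model of a frag allocation that hands the page to nr_users users. */
static void model_alloc_frag_page(struct model_page *page, long nr_users)
{
	assert(nr_users > 0);
	atomic_store(&page->pp_frag_count, nr_users);
}

/*
 * Model of the check at the top of the patched __page_pool_put_page().
 * Returns true when the caller was the last user, meaning the page may
 * go on to be recycled or freed.
 */
static bool model_put_is_last_user(struct model_page *page)
{
	/* Last user: a plain read is enough, no atomic subtraction. */
	if (atomic_load(&page->pp_frag_count) == 1)
		return true;

	/* Other users remain (or raced with us): pay for the subtraction. */
	model_rmw_ops++;
	return atomic_fetch_sub(&page->pp_frag_count, 1) - 1 == 0;
}
```

In this model a whole page goes through alloc and put without any read-modify-write, while a page shared by two users costs exactly one.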
diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index a408240..2ad0706 100644
--- a/include/net/page_pool.h
+++ b/include/net/page_pool.h
@@ -238,6 +238,9 @@ static inline void page_pool_set_dma_addr(struct page *page, dma_addr_t addr)
 
 static inline void page_pool_set_frag_count(struct page *page, long nr)
 {
+	if (PAGE_POOL_DMA_USE_PP_FRAG_COUNT)
+		return;
+
 	atomic_long_set(&page->pp_frag_count, nr);
 }
 
@@ -246,6 +249,9 @@ static inline long page_pool_atomic_sub_frag_count_return(struct page *page,
 {
 	long ret;
 
+	if (PAGE_POOL_DMA_USE_PP_FRAG_COUNT)
+		return 0;
+
 	/* As suggested by Alexander, atomic_long_read() may cover up the
 	 * reference count errors, so avoid calling atomic_long_read() in
 	 * the cases of freeing or draining the page_frags, where we would
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 1a69784..ba9f14d 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -313,11 +313,14 @@ struct page *page_pool_alloc_pages(struct page_pool *pool, gfp_t gfp)
 
 	/* Fast-path: Get a page from cache */
 	page = __page_pool_get_cached(pool);
-	if (page)
-		return page;
 
 	/* Slow-path: cache empty, do real allocation */
-	page = __page_pool_alloc_pages_slow(pool, gfp);
+	if (!page)
+		page = __page_pool_alloc_pages_slow(pool, gfp);
+
+	if (likely(page))
+		page_pool_set_frag_count(page, 1);
+
 	return page;
 }
 EXPORT_SYMBOL(page_pool_alloc_pages);
@@ -426,8 +429,7 @@ __page_pool_put_page(struct page_pool *pool, struct page *page,
 		     unsigned int dma_sync_size, bool allow_direct)
 {
 	/* It is not the last user for the page frag case */
-	if (pool->p.flags & PP_FLAG_PAGE_FRAG &&
-	    page_pool_atomic_sub_frag_count_return(page, 1))
+	if (page_pool_atomic_sub_frag_count_return(page, 1))
 		return NULL;
 
 	/* This allocator is optimized for the XDP mode that uses
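
Editorial sketch: the two PAGE_POOL_DMA_USE_PP_FRAG_COUNT guards added in include/net/page_pool.h can be modelled as below. The thread does not show how that macro is defined, so the sketch replaces it with a hypothetical MODEL_DMA_USE_PP_FRAG_COUNT switch, forced to 1 here on the assumption that it marks a platform where the pp_frag_count storage is needed for the DMA address. C11 <stdatomic.h> stands in for the kernel atomics and all model_* names are made up.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Assumption for illustration only: 1 models a platform where the
 * counter field cannot be used as a counter; 0 models the usual case.
 */
#ifndef MODEL_DMA_USE_PP_FRAG_COUNT
#define MODEL_DMA_USE_PP_FRAG_COUNT 1
#endif

struct model_page {
	atomic_long pp_frag_count;
};

/* Model of the guarded page_pool_set_frag_count(). */
static inline void model_set_frag_count(struct model_page *page, long nr)
{
	if (MODEL_DMA_USE_PP_FRAG_COUNT)
		return;	/* leave the shared storage untouched */

	atomic_store(&page->pp_frag_count, nr);
}

/* Model of the guarded page_pool_atomic_sub_frag_count_return(). */
static inline long model_sub_frag_count_return(struct model_page *page, long nr)
{
	long ret;

	if (MODEL_DMA_USE_PP_FRAG_COUNT)
		return 0;	/* every put is treated as the last user */

	ret = atomic_fetch_sub(&page->pp_frag_count, nr) - nr;
	assert(ret >= 0);
	return ret;
}
```

With the switch set to 1, neither helper reads or writes the field, and every put reports that no users remain. That is what lets page_pool_alloc_pages() and __page_pool_put_page() call the helpers unconditionally in the patch.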
Currently when PP_FLAG_PAGE_FRAG is set, the caller is not
expected to call page_pool_alloc_pages() directly because of
the PP_FLAG_PAGE_FRAG checking in __page_pool_put_page().

The patch removes the above checking to enable non-split page
support when PP_FLAG_PAGE_FRAG is set.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
---
 include/net/page_pool.h |  6 ++++++
 net/core/page_pool.c    | 12 +++++++-----
 2 files changed, 13 insertions(+), 5 deletions(-)
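
Editorial sketch: one way to read the commit message is that, before the patch, a pool with PP_FLAG_PAGE_FRAG set would decrement a frag count that page_pool_alloc_pages() had never initialized. The userspace model below contrasts the old and new put checks under that reading. The stale count of 0, the flag value, and all model_* names are assumptions made for illustration and are not taken from the kernel sources.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_PP_FLAG_PAGE_FRAG 0x4u	/* hypothetical bit value */

struct model_page {
	atomic_long pp_frag_count;
};

/* Simplified sub-and-return with the "last user" read shortcut. */
static long model_sub_return(struct model_page *page, long nr)
{
	if (atomic_load(&page->pp_frag_count) == nr)
		return 0;
	return atomic_fetch_sub(&page->pp_frag_count, nr) - nr;
}

/* Patched allocation: a whole page starts with exactly one user. */
static void model_new_alloc_pages(struct model_page *page)
{
	atomic_store(&page->pp_frag_count, 1);
}

/* Old put check: only consulted the count when the pool flag was set. */
static bool model_old_put(unsigned int pool_flags, struct model_page *page)
{
	if ((pool_flags & MODEL_PP_FLAG_PAGE_FRAG) && model_sub_return(page, 1))
		return false;	/* treated as "not the last user" */
	return true;		/* page proceeds to recycling */
}

/* New put check: always consults the count. */
static bool model_new_put(struct model_page *page)
{
	return model_sub_return(page, 1) == 0;
}

/* Walks through both cases; the count of 0 is an assumed stale value. */
static void model_demo(void)
{
	struct model_page page;

	/* Old code: count never initialized, so the put goes wrong. */
	atomic_init(&page.pp_frag_count, 0);
	assert(!model_old_put(MODEL_PP_FLAG_PAGE_FRAG, &page));
	assert(atomic_load(&page.pp_frag_count) == -1);

	/* New code: alloc sets the count, so the put sees the last user. */
	model_new_alloc_pages(&page);
	assert(model_new_put(&page));
}
```

In the model, the old check with the flag set drives the count negative and never releases the page, while the patched pair of alloc and put behaves the same whether or not the pool also hands out frags.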