From patchwork Tue Nov 24 15:38:40 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ricardo Ribalda
X-Patchwork-Id: 332608
From: Ricardo Ribalda
To: Christoph Hellwig, Mauro Carvalho Chehab, Marek Szyprowski,
 IOMMU DRIVERS, Joerg Roedel, Robin Murphy, Linux Doc Mailing List,
 Linux Kernel Mailing List, Linux Media Mailing List, Tomasz Figa,
 Sergey Senozhatsky
Subject: [PATCH 1/6] dma-mapping: remove the {alloc, free}_noncoherent methods
Date: Tue, 24 Nov 2020 16:38:40 +0100
Message-Id: <20201124153845.132207-1-ribalda@chromium.org>
X-Mailer: git-send-email 2.29.2.454.gaff20da3a2-goog
X-Mailing-List: linux-media@vger.kernel.org

From: Christoph Hellwig

It turns out allowing non-contiguous allocations here was a rather bad
idea, as we'll now need to define ways to
get the pages for mmapping or dma_buf sharing.  Revert this change and
stick to the original concept.  A different API for the use case of
non-contiguous allocations will be added back later.

Signed-off-by: Christoph Hellwig
---
 drivers/iommu/dma-iommu.c   | 30 ------------------------------
 include/linux/dma-map-ops.h |  5 -----
 kernel/dma/mapping.c        | 33 ++++++---------------------------
 3 files changed, 6 insertions(+), 62 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 0cbcd3fc3e7e..73249732afd3 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1054,34 +1054,6 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 	return cpu_addr;
 }
 
-#ifdef CONFIG_DMA_REMAP
-static void *iommu_dma_alloc_noncoherent(struct device *dev, size_t size,
-		dma_addr_t *handle, enum dma_data_direction dir, gfp_t gfp)
-{
-	if (!gfpflags_allow_blocking(gfp)) {
-		struct page *page;
-
-		page = dma_common_alloc_pages(dev, size, handle, dir, gfp);
-		if (!page)
-			return NULL;
-		return page_address(page);
-	}
-
-	return iommu_dma_alloc_remap(dev, size, handle, gfp | __GFP_ZERO,
-				     PAGE_KERNEL, 0);
-}
-
-static void iommu_dma_free_noncoherent(struct device *dev, size_t size,
-		void *cpu_addr, dma_addr_t handle, enum dma_data_direction dir)
-{
-	__iommu_dma_unmap(dev, handle, size);
-	__iommu_dma_free(dev, size, cpu_addr);
-}
-#else
-#define iommu_dma_alloc_noncoherent		NULL
-#define iommu_dma_free_noncoherent		NULL
-#endif /* CONFIG_DMA_REMAP */
-
 static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 		void *cpu_addr, dma_addr_t dma_addr, size_t size,
 		unsigned long attrs)
@@ -1152,8 +1124,6 @@ static const struct dma_map_ops iommu_dma_ops = {
 	.free			= iommu_dma_free,
 	.alloc_pages		= dma_common_alloc_pages,
 	.free_pages		= dma_common_free_pages,
-	.alloc_noncoherent	= iommu_dma_alloc_noncoherent,
-	.free_noncoherent	= iommu_dma_free_noncoherent,
 	.mmap			= iommu_dma_mmap,
 	.get_sgtable		= iommu_dma_get_sgtable,
 	.map_page		= iommu_dma_map_page,
diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
index a5f89fc4d6df..3d1f91464bcf 100644
--- a/include/linux/dma-map-ops.h
+++ b/include/linux/dma-map-ops.h
@@ -22,11 +22,6 @@ struct dma_map_ops {
 			gfp_t gfp);
 	void (*free_pages)(struct device *dev, size_t size, struct page *vaddr,
 			dma_addr_t dma_handle, enum dma_data_direction dir);
-	void *(*alloc_noncoherent)(struct device *dev, size_t size,
-			dma_addr_t *dma_handle, enum dma_data_direction dir,
-			gfp_t gfp);
-	void (*free_noncoherent)(struct device *dev, size_t size, void *vaddr,
-			dma_addr_t dma_handle, enum dma_data_direction dir);
 	int (*mmap)(struct device *, struct vm_area_struct *,
 			void *, dma_addr_t, size_t, unsigned long attrs);
 
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 51bb8fa8eb89..d3032513c54b 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -514,40 +514,19 @@ EXPORT_SYMBOL_GPL(dma_free_pages);
 void *dma_alloc_noncoherent(struct device *dev, size_t size,
 		dma_addr_t *dma_handle, enum dma_data_direction dir, gfp_t gfp)
 {
-	const struct dma_map_ops *ops = get_dma_ops(dev);
-	void *vaddr;
-
-	if (!ops || !ops->alloc_noncoherent) {
-		struct page *page;
-
-		page = dma_alloc_pages(dev, size, dma_handle, dir, gfp);
-		if (!page)
-			return NULL;
-		return page_address(page);
-	}
+	struct page *page;
 
-	size = PAGE_ALIGN(size);
-	vaddr = ops->alloc_noncoherent(dev, size, dma_handle, dir, gfp);
-	if (vaddr)
-		debug_dma_map_page(dev, virt_to_page(vaddr), 0, size, dir,
-				   *dma_handle);
-	return vaddr;
+	page = dma_alloc_pages(dev, size, dma_handle, dir, gfp);
+	if (!page)
+		return NULL;
+	return page_address(page);
 }
 EXPORT_SYMBOL_GPL(dma_alloc_noncoherent);
 
 void dma_free_noncoherent(struct device *dev, size_t size, void *vaddr,
 		dma_addr_t dma_handle, enum dma_data_direction dir)
 {
-	const struct dma_map_ops *ops = get_dma_ops(dev);
-
-	if (!ops || !ops->free_noncoherent) {
-		dma_free_pages(dev, size, virt_to_page(vaddr), dma_handle, dir);
-		return;
-	}
-
-	size = PAGE_ALIGN(size);
-	debug_dma_unmap_page(dev, dma_handle, size, dir);
-	ops->free_noncoherent(dev, size, vaddr, dma_handle, dir);
+	dma_free_pages(dev, size, virt_to_page(vaddr), dma_handle, dir);
 }
 EXPORT_SYMBOL_GPL(dma_free_noncoherent);
 

From patchwork Tue Nov 24 15:38:41 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ricardo Ribalda
X-Patchwork-Id: 331685
From: Ricardo Ribalda
To: Christoph Hellwig, Mauro Carvalho Chehab, Marek Szyprowski,
 IOMMU DRIVERS, Joerg Roedel, Robin Murphy, Linux Doc Mailing List,
 Linux Kernel Mailing List, Linux Media Mailing List, Tomasz Figa,
 Sergey Senozhatsky
Subject: [PATCH 2/6] dma-direct: use __GFP_ZERO in dma_direct_alloc_pages
Date: Tue, 24 Nov 2020 16:38:41 +0100
Message-Id: <20201124153845.132207-2-ribalda@chromium.org>
X-Mailer: git-send-email 2.29.2.454.gaff20da3a2-goog
In-Reply-To: <20201124153845.132207-1-ribalda@chromium.org>
References: <20201124153845.132207-1-ribalda@chromium.org>
X-Mailing-List: linux-media@vger.kernel.org

From: Christoph Hellwig

Prepare for supporting the DMA_ATTR_NO_KERNEL_MAPPING flag in
dma_alloc_pages.

Signed-off-by: Christoph Hellwig
---
 kernel/dma/direct.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 06c111544f61..76c741e610fc 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -280,13 +280,12 @@ struct page *dma_direct_alloc_pages(struct device *dev, size_t size,
 		dma_addr_t *dma_handle, enum dma_data_direction dir, gfp_t gfp)
 {
 	struct page *page;
-	void *ret;
 
 	if (IS_ENABLED(CONFIG_DMA_COHERENT_POOL) &&
 	    force_dma_unencrypted(dev) && !gfpflags_allow_blocking(gfp))
 		return dma_direct_alloc_from_pool(dev, size, dma_handle, gfp);
 
-	page = __dma_direct_alloc_pages(dev, size, gfp);
+	page = __dma_direct_alloc_pages(dev, size, gfp | __GFP_ZERO);
 	if (!page)
 		return NULL;
 	if (PageHighMem(page)) {
@@ -300,13 +299,11 @@ struct page *dma_direct_alloc_pages(struct device *dev, size_t size,
 		goto out_free_pages;
 	}
 
-	ret = page_address(page);
 	if (force_dma_unencrypted(dev)) {
-		if (set_memory_decrypted((unsigned long)ret,
+		if (set_memory_decrypted((unsigned long) page_address(page),
 				1 << get_order(size)))
 			goto out_free_pages;
 	}
-	memset(ret, 0, size);
 	*dma_handle = phys_to_dma_direct(dev, page_to_phys(page));
 	return page;
 out_free_pages:

From patchwork Tue Nov 24 15:38:42 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ricardo Ribalda
X-Patchwork-Id: 331686
From: Ricardo Ribalda
To: Christoph Hellwig, Mauro Carvalho Chehab, Marek Szyprowski,
 IOMMU DRIVERS, Joerg Roedel, Robin Murphy, Linux Doc Mailing List,
 Linux Kernel Mailing List, Linux Media Mailing List, Tomasz Figa,
 Sergey Senozhatsky
Subject: [PATCH 3/6] dma-iommu: remove __iommu_dma_mmap
Date: Tue, 24 Nov 2020 16:38:42 +0100
Message-Id: <20201124153845.132207-3-ribalda@chromium.org>
X-Mailer: git-send-email 2.29.2.454.gaff20da3a2-goog
In-Reply-To: <20201124153845.132207-1-ribalda@chromium.org>
References: <20201124153845.132207-1-ribalda@chromium.org>
X-Mailing-List: linux-media@vger.kernel.org

From: Christoph Hellwig

The function has a single caller, so open code it there and take
advantage of the precalculated page count variable.
Signed-off-by: Christoph Hellwig
---
 drivers/iommu/dma-iommu.c | 17 +----------------
 1 file changed, 1 insertion(+), 16 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 73249732afd3..a2fb92de7e3d 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -655,21 +655,6 @@ static void *iommu_dma_alloc_remap(struct device *dev, size_t size,
 	return NULL;
 }
 
-/**
- * __iommu_dma_mmap - Map a buffer into provided user VMA
- * @pages: Array representing buffer from __iommu_dma_alloc()
- * @size: Size of buffer in bytes
- * @vma: VMA describing requested userspace mapping
- *
- * Maps the pages of the buffer in @pages into @vma. The caller is responsible
- * for verifying the correct size and protection of @vma beforehand.
- */
-static int __iommu_dma_mmap(struct page **pages, size_t size,
-		struct vm_area_struct *vma)
-{
-	return vm_map_pages(vma, pages, PAGE_ALIGN(size) >> PAGE_SHIFT);
-}
-
 static void iommu_dma_sync_single_for_cpu(struct device *dev,
 		dma_addr_t dma_handle, size_t size, enum dma_data_direction dir)
 {
@@ -1074,7 +1059,7 @@ static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 		struct page **pages = dma_common_find_pages(cpu_addr);
 
 		if (pages)
-			return __iommu_dma_mmap(pages, size, vma);
+			return vm_map_pages(vma, pages, nr_pages);
 		pfn = vmalloc_to_pfn(cpu_addr);
 	} else {
 		pfn = page_to_pfn(virt_to_page(cpu_addr));

From patchwork Tue Nov 24 15:38:43 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ricardo Ribalda
X-Patchwork-Id: 331687
From: Ricardo Ribalda
To: Christoph Hellwig, Mauro Carvalho Chehab, Marek Szyprowski,
 IOMMU DRIVERS, Joerg Roedel, Robin Murphy, Linux Doc Mailing List,
 Linux Kernel Mailing List, Linux Media Mailing List, Tomasz Figa,
 Sergey Senozhatsky
Subject: [PATCH 4/6] WIP: add a dma_alloc_noncontiguous API
Date: Tue, 24 Nov 2020 16:38:43 +0100
Message-Id: <20201124153845.132207-4-ribalda@chromium.org>
X-Mailer: git-send-email 2.29.2.454.gaff20da3a2-goog
In-Reply-To: <20201124153845.132207-1-ribalda@chromium.org>
References: <20201124153845.132207-1-ribalda@chromium.org>
X-Mailing-List: linux-media@vger.kernel.org

From: Christoph Hellwig

Add a new API that returns a virtually non-contiguous array of pages
and a DMA address.  This API is only implemented for dma-iommu and will
not be implemented for non-iommu DMA API instances that have to allocate
contiguous memory.  It is up to the caller to check if the API is
available.

The intent is that media drivers can use this API if either:

 - no kernel mapping or only temporary kernel mappings are required.
   That is as a better replacement for DMA_ATTR_NO_KERNEL_MAPPING
 - a kernel mapping is required for cached and DMA mapped pages, but
   the driver also needs the pages to e.g. map them to userspace.
   In that sense it is a replacement for some aspects of the recently
   removed and never fully implemented DMA_ATTR_NON_CONSISTENT

Signed-off-by: Christoph Hellwig
---
 drivers/iommu/dma-iommu.c   | 73 +++++++++++++++++++++++++------------
 include/linux/dma-map-ops.h |  4 ++
 include/linux/dma-mapping.h |  5 +++
 kernel/dma/mapping.c        | 35 ++++++++++++++++++
 4 files changed, 93 insertions(+), 24 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index a2fb92de7e3d..2e72fe1b9c3b 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -564,23 +564,12 @@ static struct page **__iommu_dma_alloc_pages(struct device *dev,
 	return pages;
 }
 
-/**
- * iommu_dma_alloc_remap - Allocate and map a buffer contiguous in IOVA space
- * @dev: Device to allocate memory for. Must be a real device
- *	 attached to an iommu_dma_domain
- * @size: Size of buffer in bytes
- * @dma_handle: Out argument for allocated DMA handle
- * @gfp: Allocation flags
- * @prot: pgprot_t to use for the remapped mapping
- * @attrs: DMA attributes for this allocation
- *
- * If @size is less than PAGE_SIZE, then a full CPU page will be allocated,
+/*
+ * If size is less than PAGE_SIZE, then a full CPU page will be allocated,
  * but an IOMMU which supports smaller pages might not map the whole thing.
- *
- * Return: Mapped virtual address, or NULL on failure.
  */
-static void *iommu_dma_alloc_remap(struct device *dev, size_t size,
-		dma_addr_t *dma_handle, gfp_t gfp, pgprot_t prot,
+static struct page **__iommu_dma_alloc_noncontiguous(struct device *dev,
+		size_t size, dma_addr_t *dma_handle, gfp_t gfp, pgprot_t prot,
 		unsigned long attrs)
 {
 	struct iommu_domain *domain = iommu_get_dma_domain(dev);
@@ -592,7 +581,6 @@ static void *iommu_dma_alloc_remap(struct device *dev, size_t size,
 	struct page **pages;
 	struct sg_table sgt;
 	dma_addr_t iova;
-	void *vaddr;
 
 	*dma_handle = DMA_MAPPING_ERROR;
 
@@ -635,17 +623,10 @@ static void *iommu_dma_alloc_remap(struct device *dev, size_t size,
 			< size)
 		goto out_free_sg;
 
-	vaddr = dma_common_pages_remap(pages, size, prot,
-			__builtin_return_address(0));
-	if (!vaddr)
-		goto out_unmap;
-
 	*dma_handle = iova;
 	sg_free_table(&sgt);
-	return vaddr;
+	return pages;
 
-out_unmap:
-	__iommu_dma_unmap(dev, iova, size);
 out_free_sg:
 	sg_free_table(&sgt);
 out_free_iova:
@@ -655,6 +636,46 @@ static void *iommu_dma_alloc_remap(struct device *dev, size_t size,
 	return NULL;
 }
 
+static void *iommu_dma_alloc_remap(struct device *dev, size_t size,
+		dma_addr_t *dma_handle, gfp_t gfp, pgprot_t prot,
+		unsigned long attrs)
+{
+	struct page **pages;
+	void *vaddr;
+
+	pages = __iommu_dma_alloc_noncontiguous(dev, size, dma_handle, gfp,
+			prot, attrs);
+	if (!pages)
+		return NULL;
+	vaddr = dma_common_pages_remap(pages, size, prot,
+			__builtin_return_address(0));
+	if (!vaddr)
+		goto out_unmap;
+	return vaddr;
+
+out_unmap:
+	__iommu_dma_unmap(dev, *dma_handle, size);
+	__iommu_dma_free_pages(pages, PAGE_ALIGN(size) >> PAGE_SHIFT);
+	return NULL;
+}
+
+#ifdef CONFIG_DMA_REMAP
+static struct page **iommu_dma_alloc_noncontiguous(struct device *dev,
+		size_t size, dma_addr_t *dma_handle, gfp_t gfp,
+		unsigned long attrs)
+{
+	return __iommu_dma_alloc_noncontiguous(dev, size, dma_handle, gfp,
+			PAGE_KERNEL, attrs);
+}
+
+static void iommu_dma_free_noncontiguous(struct device *dev, size_t size,
+		struct page **pages, dma_addr_t dma_handle)
+{
+	__iommu_dma_unmap(dev, dma_handle, size);
+	__iommu_dma_free_pages(pages, PAGE_ALIGN(size) >> PAGE_SHIFT);
+}
+#endif
+
 static void iommu_dma_sync_single_for_cpu(struct device *dev,
 		dma_addr_t dma_handle, size_t size, enum dma_data_direction dir)
 {
@@ -1109,6 +1130,10 @@ static const struct dma_map_ops iommu_dma_ops = {
 	.free			= iommu_dma_free,
 	.alloc_pages		= dma_common_alloc_pages,
 	.free_pages		= dma_common_free_pages,
+#ifdef CONFIG_DMA_REMAP
+	.alloc_noncontiguous	= iommu_dma_alloc_noncontiguous,
+	.free_noncontiguous	= iommu_dma_free_noncontiguous,
+#endif
 	.mmap			= iommu_dma_mmap,
 	.get_sgtable		= iommu_dma_get_sgtable,
 	.map_page		= iommu_dma_map_page,
diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
index 3d1f91464bcf..3cc313678d42 100644
--- a/include/linux/dma-map-ops.h
+++ b/include/linux/dma-map-ops.h
@@ -22,6 +22,10 @@ struct dma_map_ops {
 			gfp_t gfp);
 	void (*free_pages)(struct device *dev, size_t size, struct page *vaddr,
 			dma_addr_t dma_handle, enum dma_data_direction dir);
+	struct page **(*alloc_noncontiguous)(struct device *dev, size_t size,
+			dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs);
+	void (*free_noncontiguous)(struct device *dev, size_t size,
+			struct page **pages, dma_addr_t dma_handle);
 	int (*mmap)(struct device *, struct vm_area_struct *,
 			void *, dma_addr_t, size_t, unsigned long attrs);
 
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 956151052d45..e1b4cb1d2e55 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -267,6 +267,11 @@ void *dma_alloc_noncoherent(struct device *dev, size_t size,
 		dma_addr_t *dma_handle, enum dma_data_direction dir, gfp_t gfp);
 void dma_free_noncoherent(struct device *dev, size_t size, void *vaddr,
 		dma_addr_t dma_handle, enum dma_data_direction dir);
+bool dma_can_alloc_noncontiguous(struct device *dev);
+struct page **dma_alloc_noncontiguous(struct device *dev, size_t size,
+		dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs);
+void dma_free_noncontiguous(struct device *dev, size_t size,
+		struct page **pages, dma_addr_t dma_handle);
 
 static inline dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr,
 		size_t size, enum dma_data_direction dir, unsigned long attrs)
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index d3032513c54b..770c2f66512d 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -530,6 +530,41 @@ void dma_free_noncoherent(struct device *dev, size_t size, void *vaddr,
 }
 EXPORT_SYMBOL_GPL(dma_free_noncoherent);
 
+bool dma_can_alloc_noncontiguous(struct device *dev)
+{
+	const struct dma_map_ops *ops = get_dma_ops(dev);
+
+	return ops && ops->free_noncontiguous;
+}
+EXPORT_SYMBOL_GPL(dma_can_alloc_noncontiguous);
+
+struct page **dma_alloc_noncontiguous(struct device *dev, size_t size,
+		dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
+{
+	const struct dma_map_ops *ops = get_dma_ops(dev);
+
+	if (WARN_ON_ONCE(!dma_can_alloc_noncontiguous(dev)))
+		return NULL;
+	if (attrs & ~DMA_ATTR_ALLOC_SINGLE_PAGES) {
+		dev_warn(dev, "invalid flags (0x%lx) for %s\n",
+			 attrs, __func__);
+		return NULL;
+	}
+	return ops->alloc_noncontiguous(dev, size, dma_handle, gfp, attrs);
+}
+EXPORT_SYMBOL_GPL(dma_alloc_noncontiguous);
+
+void dma_free_noncontiguous(struct device *dev, size_t size,
+		struct page **pages, dma_addr_t dma_handle)
+{
+	const struct dma_map_ops *ops = get_dma_ops(dev);
+
+	if (WARN_ON_ONCE(!dma_can_alloc_noncontiguous(dev)))
+		return;
+	ops->free_noncontiguous(dev, size, pages, dma_handle);
+}
+EXPORT_SYMBOL_GPL(dma_free_noncontiguous);
+
 int dma_supported(struct device *dev, u64 mask)
 {
 	const struct dma_map_ops *ops = get_dma_ops(dev);

From patchwork Tue Nov 24 15:38:44 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ricardo Ribalda
X-Patchwork-Id: 332607
From: Ricardo Ribalda
To: Christoph Hellwig, Mauro Carvalho Chehab, Marek Szyprowski,
 IOMMU DRIVERS, Joerg Roedel, Robin Murphy, Linux Doc Mailing List,
 Linux Kernel Mailing List, Linux Media Mailing List, Tomasz Figa,
 Sergey Senozhatsky
Cc: Ricardo Ribalda
Subject: [PATCH 5/6] media: uvcvideo: Use dma_alloc_noncontiguous API
Date: Tue, 24 Nov 2020 16:38:44 +0100
Message-Id: <20201124153845.132207-5-ribalda@chromium.org>
X-Mailer: git-send-email 2.29.2.454.gaff20da3a2-goog
In-Reply-To: <20201124153845.132207-1-ribalda@chromium.org>
References: <20201124153845.132207-1-ribalda@chromium.org>
X-Mailing-List: linux-media@vger.kernel.org

On architectures where there is no coherent caching, such as ARM, use
the dma_alloc_noncontiguous API and handle the cache flushing manually
using dma_sync_single().
With this patch on the affected architectures we can measure up to 20x performance improvement in uvc_video_copy_data_work(). Signed-off-by: Ricardo Ribalda --- drivers/media/usb/uvc/uvc_video.c | 74 ++++++++++++++++++++++++++----- drivers/media/usb/uvc/uvcvideo.h | 1 + 2 files changed, 63 insertions(+), 12 deletions(-) diff --git a/drivers/media/usb/uvc/uvc_video.c b/drivers/media/usb/uvc/uvc_video.c index a6a441d92b94..9e90b261428a 100644 --- a/drivers/media/usb/uvc/uvc_video.c +++ b/drivers/media/usb/uvc/uvc_video.c @@ -1490,6 +1490,11 @@ static void uvc_video_encode_bulk(struct uvc_urb *uvc_urb, urb->transfer_buffer_length = stream->urb_size - len; } +static inline struct device *stream_to_dmadev(struct uvc_streaming *stream) +{ + return stream->dev->udev->bus->controller->parent; +} + static void uvc_video_complete(struct urb *urb) { struct uvc_urb *uvc_urb = urb->context; @@ -1539,6 +1544,11 @@ static void uvc_video_complete(struct urb *urb) * Process the URB headers, and optionally queue expensive memcpy tasks * to be deferred to a work queue. */ + if (uvc_urb->pages) + dma_sync_single_for_cpu(stream_to_dmadev(stream), + urb->transfer_dma, + urb->transfer_buffer_length, + DMA_FROM_DEVICE); stream->decode(uvc_urb, buf, buf_meta); /* If no async work is needed, resubmit the URB immediately. 
*/ @@ -1566,8 +1576,15 @@ static void uvc_free_urb_buffers(struct uvc_streaming *stream) continue; #ifndef CONFIG_DMA_NONCOHERENT - usb_free_coherent(stream->dev->udev, stream->urb_size, - uvc_urb->buffer, uvc_urb->dma); + if (uvc_urb->pages) { + vunmap(uvc_urb->buffer); + dma_free_noncontiguous(stream_to_dmadev(stream), + stream->urb_size, + uvc_urb->pages, uvc_urb->dma); + } else { + usb_free_coherent(stream->dev->udev, stream->urb_size, + uvc_urb->buffer, uvc_urb->dma); + } #else kfree(uvc_urb->buffer); #endif @@ -1577,6 +1594,47 @@ static void uvc_free_urb_buffers(struct uvc_streaming *stream) stream->urb_size = 0; } +#ifndef CONFIG_DMA_NONCOHERENT +static bool uvc_alloc_urb_buffer(struct uvc_streaming *stream, + struct uvc_urb *uvc_urb, gfp_t gfp_flags) +{ + struct device *dma_dev = stream_to_dmadev(stream); + + if (!dma_can_alloc_noncontiguous(dma_dev)) { + uvc_urb->buffer = usb_alloc_coherent(stream->dev->udev, + stream->urb_size, + gfp_flags | __GFP_NOWARN, + &uvc_urb->dma); + return uvc_urb->buffer != NULL; + } + + uvc_urb->pages = dma_alloc_noncontiguous(dma_dev, stream->urb_size, + &uvc_urb->dma, + gfp_flags | __GFP_NOWARN, 0); + if (!uvc_urb->pages) + return false; + + uvc_urb->buffer = vmap(uvc_urb->pages, + PAGE_ALIGN(stream->urb_size) >> PAGE_SHIFT, + VM_DMA_COHERENT, PAGE_KERNEL); + if (!uvc_urb->buffer) { + dma_free_noncontiguous(dma_dev, stream->urb_size, + uvc_urb->pages, uvc_urb->dma); + return false; + } + + return true; +} +#else +static bool uvc_alloc_urb_buffer(struct uvc_streaming *stream, + struct uvc_urb *uvc_urb, gfp_t gfp_flags) +{ + uvc_urb->buffer = kmalloc(stream->urb_size, gfp_flags | __GFP_NOWARN); + + return uvc_urb->buffer != NULL; +} +#endif + /* * Allocate transfer buffers. This function can be called with buffers * already allocated when resuming from suspend, in which case it will @@ -1607,19 +1665,11 @@ static int uvc_alloc_urb_buffers(struct uvc_streaming *stream, /* Retry allocations until one succeed.
*/ for (; npackets > 1; npackets /= 2) { + stream->urb_size = psize * npackets; for (i = 0; i < UVC_URBS; ++i) { struct uvc_urb *uvc_urb = &stream->uvc_urb[i]; - stream->urb_size = psize * npackets; -#ifndef CONFIG_DMA_NONCOHERENT - uvc_urb->buffer = usb_alloc_coherent( - stream->dev->udev, stream->urb_size, - gfp_flags | __GFP_NOWARN, &uvc_urb->dma); -#else - uvc_urb->buffer = - kmalloc(stream->urb_size, gfp_flags | __GFP_NOWARN); -#endif - if (!uvc_urb->buffer) { + if (!uvc_alloc_urb_buffer(stream, uvc_urb, gfp_flags)) { uvc_free_urb_buffers(stream); break; } diff --git a/drivers/media/usb/uvc/uvcvideo.h b/drivers/media/usb/uvc/uvcvideo.h index a3dfacf069c4..3e3ef1f1daa5 100644 --- a/drivers/media/usb/uvc/uvcvideo.h +++ b/drivers/media/usb/uvc/uvcvideo.h @@ -532,6 +532,7 @@ struct uvc_urb { char *buffer; dma_addr_t dma; + struct page **pages; unsigned int async_operations; struct uvc_copy_op copy_operations[UVC_MAX_PACKETS]; From patchwork Tue Nov 24 15:38:45 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ricardo Ribalda X-Patchwork-Id: 332606 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-23.8 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, MENTIONS_GIT_HOSTING, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 37ED1C64E90 for ; Tue, 24 Nov 2020 15:39:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CFA2320715 for ; Tue, 24 Nov 2020 15:39:37 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) 
header.d=chromium.org header.i=@chromium.org header.b="dQ9L9ZXf"
From: Ricardo Ribalda
To: Christoph Hellwig , Mauro Carvalho Chehab , Marek Szyprowski , IOMMU DRIVERS , Joerg Roedel , Robin Murphy , Linux Doc Mailing List , Linux Kernel Mailing List , Linux Media Mailing List , Tomasz Figa , Sergey Senozhatsky
Cc: Shik Chen
Subject: [PATCH 6/6] TEST-ONLY: media: uvcvideo: Add statistics for measuring performance
Date: Tue, 24 Nov 2020 16:38:45 +0100
Message-Id: <20201124153845.132207-6-ribalda@chromium.org>
X-Mailer: git-send-email 2.29.2.454.gaff20da3a2-goog
In-Reply-To: <20201124153845.132207-1-ribalda@chromium.org>
References: <20201124153845.132207-1-ribalda@chromium.org>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-media@vger.kernel.org

From: Shik Chen

Largely based on [1], with the following tweaks:

* Use div_u64 for u64 divisions
* Calculate standard deviation
* Fix an uninitialized |min| field for header
* Apply clang-format

[1] https://git.kernel.org/pub/scm/linux/kernel/git/kbingham/rcar.git/commit/?h=uvc/async-ml&id=cebbd1b629bbe5f856ec5dc7591478c003f5a944

Signed-off-by: Shik Chen
--- drivers/media/usb/uvc/uvc_video.c | 163 +++++++++++++++++++++++++++++- drivers/media/usb/uvc/uvcvideo.h | 21 ++++ 2 files changed, 181 insertions(+), 3 deletions(-) diff --git a/drivers/media/usb/uvc/uvc_video.c b/drivers/media/usb/uvc/uvc_video.c index 9e90b261428a..d3a515015003 100644 --- a/drivers/media/usb/uvc/uvc_video.c +++ b/drivers/media/usb/uvc/uvc_video.c @@ -906,12 +906,61 @@ static void uvc_video_stats_update(struct uvc_streaming *stream) memset(&stream->stats.frame, 0, sizeof(stream->stats.frame)); } +size_t uvc_video_dump_time_stats(char *buf, size_t size, + struct uvc_stats_time *stat, const char *pfx) +{ +
unsigned int avg = 0; + unsigned int std = 0; + + if (stat->qty) { + avg = div_u64(stat->duration, stat->qty); + std = int_sqrt64(div_u64(stat->duration2, stat->qty) - + (u64)avg * avg); + } + + /* Stat durations are in nanoseconds, we present in micro-seconds */ + return scnprintf( + buf, size, + "%s: %llu/%u uS/qty: %u.%03u avg %u.%03u std %u.%03u min %u.%03u max (uS)\n", + pfx, div_u64(stat->duration, 1000), stat->qty, avg / 1000, + avg % 1000, std / 1000, std % 1000, stat->min / 1000, + stat->min % 1000, stat->max / 1000, stat->max % 1000); +} + +size_t uvc_video_dump_speed(char *buf, size_t size, const char *pfx, u64 bytes, + u64 milliseconds) +{ + unsigned int rate = 0; + bool gbit = false; + + if (milliseconds) + rate = div_u64(bytes * 8, milliseconds); + + if (rate >= 1000000) { + gbit = true; + rate /= 1000; + } + + /* + * bits/milliseconds == kilobits/seconds, + * presented here as Mbits/s (or Gbit/s) with 3 decimal places + */ + return scnprintf(buf, size, "%s: %u.%03u %sbits/s\n", pfx, rate / 1000, + rate % 1000, gbit ? "G" : "M"); +} + size_t uvc_video_stats_dump(struct uvc_streaming *stream, char *buf, size_t size) { + u64 bytes = stream->stats.stream.bytes; /* Single sample */ + unsigned int empty_ratio = 0; unsigned int scr_sof_freq; unsigned int duration; + unsigned int fps = 0; size_t count = 0; + u64 cpu = 0; + u64 cpu_q = 0; + u32 cpu_r = 0; /* Compute the SCR.SOF frequency estimate. At the nominal 1kHz SOF * frequency this will not overflow before more than 1h.
@@ -924,12 +973,19 @@ size_t uvc_video_stats_dump(struct uvc_streaming *stream, char *buf, else scr_sof_freq = 0; + if (stream->stats.stream.nb_packets) + empty_ratio = stream->stats.stream.nb_empty * 100 / + stream->stats.stream.nb_packets; + count += scnprintf(buf + count, size - count, - "frames: %u\npackets: %u\nempty: %u\n" - "errors: %u\ninvalid: %u\n", + "frames: %u\n" + "packets: %u\n" + "empty: %u (%u %%)\n" + "errors: %u\n" + "invalid: %u\n", stream->stats.stream.nb_frames, stream->stats.stream.nb_packets, - stream->stats.stream.nb_empty, + stream->stats.stream.nb_empty, empty_ratio, stream->stats.stream.nb_errors, stream->stats.stream.nb_invalid); count += scnprintf(buf + count, size - count, @@ -946,6 +1002,55 @@ size_t uvc_video_stats_dump(struct uvc_streaming *stream, char *buf, stream->stats.stream.min_sof, stream->stats.stream.max_sof, scr_sof_freq / 1000, scr_sof_freq % 1000); + count += scnprintf(buf + count, size - count, + "bytes %lld : duration %d\n", bytes, duration); + + if (duration != 0) { + /* Duration is in milliseconds, * 100 to gain 2 dp precision */ + fps = stream->stats.stream.nb_frames * 1000 * 100 / duration; + /* CPU usage as a % with 6 decimal places */ + cpu = div_u64(stream->stats.urbstat.decode.duration, duration) * + 100; + } + + count += scnprintf(buf + count, size - count, "FPS: %u.%02u\n", + fps / 100, fps % 100); + + /* Processing Times */ + + count += uvc_video_dump_time_stats(buf + count, size - count, + &stream->stats.urbstat.urb, "URB"); + count += uvc_video_dump_time_stats(buf + count, size - count, + &stream->stats.urbstat.header, + "header"); + count += uvc_video_dump_time_stats(buf + count, size - count, + &stream->stats.urbstat.latency, + "latency"); + count += uvc_video_dump_time_stats(buf + count, size - count, + &stream->stats.urbstat.decode, + "decode"); + + /* Processing Speeds */ + + /* This should be representative of the memory bus / cpu speed */ + count += uvc_video_dump_speed( + buf + count, size - 
count, "raw decode speed", bytes, + div_u64(stream->stats.urbstat.decode.duration, 1000000)); + + /* Raw bus speed - scheduling latencies */ + count += uvc_video_dump_speed( + buf + count, size - count, "raw URB handling speed", bytes, + div_u64(stream->stats.urbstat.urb.duration, 1000000)); + + /* Throughput against wall clock time, stream duration is in millis*/ + count += uvc_video_dump_speed(buf + count, size - count, "throughput", + bytes, duration); + + cpu_q = div_u64_rem(cpu, 1000000, &cpu_r); + + /* Determine the 'CPU Usage' of our URB processing */ + count += scnprintf(buf + count, size - count, + "URB decode CPU usage %llu.%06u %%\n", cpu_q, cpu_r); return count; } @@ -954,6 +1059,11 @@ static void uvc_video_stats_start(struct uvc_streaming *stream) { memset(&stream->stats, 0, sizeof(stream->stats)); stream->stats.stream.min_sof = 2048; + + stream->stats.urbstat.header.min = -1; + stream->stats.urbstat.latency.min = -1; + stream->stats.urbstat.decode.min = -1; + stream->stats.urbstat.urb.min = -1; } static void uvc_video_stats_stop(struct uvc_streaming *stream) @@ -961,6 +1071,28 @@ static void uvc_video_stats_stop(struct uvc_streaming *stream) stream->stats.stream.stop_ts = ktime_get(); } +static s64 uvc_stats_add(struct uvc_stats_time *s, const ktime_t a, + const ktime_t b) +{ + ktime_t delta; + u64 duration; + + delta = ktime_sub(b, a); + duration = ktime_to_ns(delta); + + s->qty++; + s->duration += duration; + s->duration2 += duration * duration; + + if (duration < s->min) + s->min = duration; + + if (duration > s->max) + s->max = duration; + + return duration; +} + /* ------------------------------------------------------------------------ * Video codecs */ @@ -1024,6 +1156,9 @@ static int uvc_video_decode_start(struct uvc_streaming *stream, stream->sequence++; if (stream->sequence) uvc_video_stats_update(stream); + + /* Update the stream timer each frame */ + stream->stats.stream.stop_ts = ktime_get(); } uvc_video_clock_decode(stream, buf, data, 
len); @@ -1106,18 +1241,34 @@ static int uvc_video_decode_start(struct uvc_streaming *stream, static void uvc_video_copy_data_work(struct work_struct *work) { struct uvc_urb *uvc_urb = container_of(work, struct uvc_urb, work); + ktime_t now; unsigned int i; int ret; + /* Measure decode performance */ + uvc_urb->decode_start = ktime_get(); + /* Measure scheduling latency */ + uvc_stats_add(&uvc_urb->stream->stats.urbstat.latency, + uvc_urb->received, uvc_urb->decode_start); + for (i = 0; i < uvc_urb->async_operations; i++) { struct uvc_copy_op *op = &uvc_urb->copy_operations[i]; memcpy(op->dst, op->src, op->len); + uvc_urb->stream->stats.stream.bytes += op->len; /* Release reference taken on this buffer. */ uvc_queue_buffer_release(op->buf); } + now = ktime_get(); + /* measure 'memcpy time' */ + uvc_stats_add(&uvc_urb->stream->stats.urbstat.decode, + uvc_urb->decode_start, now); + /* measure 'full urb processing time' */ + uvc_stats_add(&uvc_urb->stream->stats.urbstat.urb, uvc_urb->received, + now); + ret = usb_submit_urb(uvc_urb->urb, GFP_KERNEL); if (ret < 0) uvc_printk(KERN_ERR, "Failed to resubmit video URB (%d).\n", @@ -1507,6 +1658,9 @@ static void uvc_video_complete(struct urb *urb) unsigned long flags; int ret; + /* Track URB processing performance */ + uvc_urb->received = ktime_get(); + switch (urb->status) { case 0: break; @@ -1562,6 +1716,9 @@ static void uvc_video_complete(struct urb *urb) } queue_work(stream->async_wq, &uvc_urb->work); + + uvc_stats_add(&uvc_urb->stream->stats.urbstat.header, uvc_urb->received, + ktime_get()); } /* diff --git a/drivers/media/usb/uvc/uvcvideo.h b/drivers/media/usb/uvc/uvcvideo.h index 3e3ef1f1daa5..80eeeaf3cd06 100644 --- a/drivers/media/usb/uvc/uvcvideo.h +++ b/drivers/media/usb/uvc/uvcvideo.h @@ -475,6 +475,14 @@ struct uvc_stats_frame { u32 scr_stc; /* SCR.STC of the last packet */ }; +struct uvc_stats_time { + u64 duration; /* Cumulative total duration between two events */ + u64 duration2; /* Cumulative total 
duration^2 between two events */ + unsigned int qty; /* Number of events represented in the total */ + unsigned int min; /* Shortest duration */ + unsigned int max; /* Longest duration */ +}; + struct uvc_stats_stream { ktime_t start_ts; /* Stream start timestamp */ ktime_t stop_ts; /* Stream stop timestamp */ @@ -496,6 +504,8 @@ struct uvc_stats_stream { unsigned int scr_sof; /* STC.SOF of the last packet */ unsigned int min_sof; /* Minimum STC.SOF value */ unsigned int max_sof; /* Maximum STC.SOF value */ + + unsigned long bytes; /* Successfully transferred bytes */ }; #define UVC_METADATA_BUF_SIZE 1024 @@ -525,6 +535,8 @@ struct uvc_copy_op { * @async_operations: counter to indicate the number of copy operations * @copy_operations: work descriptors for asynchronous copy operations * @work: work queue entry for asynchronous decode + * @received: URB interrupt time stamp + * @decode_start: URB processing start time stamp */ struct uvc_urb { struct urb *urb; @@ -537,6 +549,9 @@ struct uvc_urb { unsigned int async_operations; struct uvc_copy_op copy_operations[UVC_MAX_PACKETS]; struct work_struct work; + + ktime_t received; + ktime_t decode_start; }; struct uvc_streaming { @@ -599,6 +614,12 @@ struct uvc_streaming { struct { struct uvc_stats_frame frame; struct uvc_stats_stream stream; + struct uvc_stats_urb { + struct uvc_stats_time header; + struct uvc_stats_time latency; + struct uvc_stats_time decode; + struct uvc_stats_time urb; + } urbstat; } stats; /* Timestamps support. */