From patchwork Mon Nov 12 07:58:02 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kenneth Lee
X-Patchwork-Id: 150787
From: Kenneth Lee
To: Alexander Shishkin, Tim Sell, Sanyog Kale, Randy Dunlap, Uwe Kleine-König, Vinod Koul, David Kershner, Sagar Dharia, Gavin Schenk, Jens Axboe, Philippe Ombredanne, Cyrille Pitchen, Johan Hovold, Zhou Wang, Hao Fang, Jonathan Cameron, Zaibo Xu, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org, linux-accelerators@lists.ozlabs.org
Cc: linuxarm@huawei.com, guodong.xu@linaro.org, zhangfei.gao@foxmail.com, haojian.zhuang@linaro.org, Kenneth Lee
Subject: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
Date: Mon, 12 Nov 2018 15:58:02 +0800
Message-Id: <20181112075807.9291-2-nek.in.cn@gmail.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181112075807.9291-1-nek.in.cn@gmail.com>
References: <20181112075807.9291-1-nek.in.cn@gmail.com>
Sender: linux-crypto-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-crypto@vger.kernel.org

From: Kenneth Lee

WarpDrive is a general accelerator framework that lets user applications
access the hardware without going through the kernel in the data path.

The kernel component that gives drivers the facilities to expose this user
interface is called uacce.
It is a short name for "Unified/User-space-access-intended Accelerator
Framework".

This patch adds a document explaining how it works.

Signed-off-by: Kenneth Lee
---
 Documentation/warpdrive/warpdrive.rst       | 260 +++++++
 Documentation/warpdrive/wd-arch.svg         | 764 ++++++++++++++++++++
 Documentation/warpdrive/wd.svg              | 526 ++++++++++++++
 Documentation/warpdrive/wd_q_addr_space.svg | 359 +++++++++
 4 files changed, 1909 insertions(+)
 create mode 100644 Documentation/warpdrive/warpdrive.rst
 create mode 100644 Documentation/warpdrive/wd-arch.svg
 create mode 100644 Documentation/warpdrive/wd.svg
 create mode 100644 Documentation/warpdrive/wd_q_addr_space.svg

--
2.17.1

diff --git a/Documentation/warpdrive/warpdrive.rst b/Documentation/warpdrive/warpdrive.rst
new file mode 100644
index 000000000000..ef84d3a2d462
--- /dev/null
+++ b/Documentation/warpdrive/warpdrive.rst
@@ -0,0 +1,260 @@
+Introduction of WarpDrive
+=========================
+
+*WarpDrive* is a general accelerator framework that lets user applications
+access the hardware without going through the kernel in the data path.
+
+It can be used as a fast channel for accelerators, network adaptors or
+other hardware for applications in user space.
+
+This may make some implementations simpler. E.g. you can reuse most of the
+*netdev* driver in the kernel and just share some ring buffers with the user
+space driver for *DPDK* [4] or *ODP* [5]. Or you can combine the RSA
+accelerator with the *netdev* in user space as an HTTPS reverse proxy, etc.
+
+*WarpDrive* treats the hardware accelerator as a heterogeneous processor which
+can take over particular loads from the CPU:
+
+.. image:: wd.svg
+        :alt: WarpDrive Concept
+
+A virtual concept, the queue, is used to manage the requests sent to the
+accelerator. The application sends requests to the queue by writing to a
+particular address, while the hardware takes the requests directly from that
+address and sends feedback accordingly.
+
+The format of the queue may differ from hardware to hardware. But the
+application need not make any system call for the communication.
+
+*WarpDrive* tries to create a shared virtual address space for all involved
+accelerators. Within this space, the requests sent to the queue can refer to
+any virtual address, which will be valid to the application and all involved
+accelerators.
+
+The name *WarpDrive* is simply a cool and general name meaning the framework
+makes the application faster. It includes a general user library, a kernel
+management module and drivers for the hardware. In the kernel, the management
+module is called *uacce*, meaning "Unified/User-space-access-intended
+Accelerator Framework".
+
+
+How does it work
+================
+
+*WarpDrive* uses *mmap* and *IOMMU* to play the trick.
+
+*Uacce* creates a chrdev for the device registered to it. A "queue" will be
+created when the chrdev is opened. The application accesses the queue by
+mmapping different address regions of the queue file.
+
+The following figure demonstrates the queue file address space:
+
+.. image:: wd_q_addr_space.svg
+        :alt: WarpDrive Queue Address Space
+
+The first region of the space, the device region, is used by the application
+to write requests to or read answers from the hardware.
+
+Normally, there can be two kinds of device regions: mmio and memory regions.
+It is recommended to use common memory for request/answer descriptors and use
+the mmio space for device notification, such as doorbell. But of course, this
+is all up to the interface designer.
+
+There can be two types of device memory regions, kernel-only and user-shared.
+This will be explained in "The kernel API" section.
+
+The Static Share Virtual Memory region is necessary only when the device IOMMU
+does not support "Share Virtual Memory". This will be explained after the
+*IOMMU* idea.
+
+
+Architecture
+------------
+
+The full *WarpDrive* architecture is represented in the following class
+diagram:
+
+.. image:: wd-arch.svg
+        :alt: WarpDrive Architecture
+
+
+The user API
+------------
+
+We adopt a polling style interface in the user space: ::
+
+        int wd_request_queue(struct wd_queue *q);
+        void wd_release_queue(struct wd_queue *q);
+
+        int wd_send(struct wd_queue *q, void *req);
+        int wd_recv(struct wd_queue *q, void **req);
+        int wd_recv_sync(struct wd_queue *q, void **req);
+        void wd_flush(struct wd_queue *q);
+
+wd_recv_sync() is a wrapper around its non-sync version. It traps into the
+kernel and waits until the queue becomes available.
+
+If the queue does not support SVA/SVM, the following helper function
+can be used to create Static Share Virtual Memory: ::
+
+        void *wd_preserve_share_memory(struct wd_queue *q, size_t size);
+
+The user API is not mandatory. It is simply a suggestion and a hint of what
+the kernel interface is supposed to support.
+
+
+The user driver
+---------------
+
+The queue file mmap space will need a user driver to wrap the communication
+protocol. *UACCE* provides some attributes in sysfs for the user driver to
+match the right accelerator accordingly.
+
+The *UACCE* device attributes are under the following directory:
+
+/sys/class/uacce//params
+
+The following attributes are supported:
+
+nr_queue_remained (ro)
+        number of queues remaining
+
+api_version (ro)
+        a string to identify the queue mmap space format and its version
+
+device_attr (ro)
+        attributes of the device, see UACCE_DEV_xxx flag defined in uacce.h
+
+numa_node (ro)
+        id of numa node
+
+priority (rw)
+        Priority of the device, bigger is higher
+
+(This is not yet implemented in RFC version)
+
+
+The kernel API
+--------------
+
+The *uacce* kernel API is defined in uacce.h. If the hardware supports SVM/SVA,
+the driver needs only the following API functions: ::
+
+        int uacce_register(uacce);
+        void uacce_unregister(uacce);
+        void uacce_wake_up(q);
+
+*uacce_wake_up* is used to notify the process that epoll()s on the queue file.
+
+According to the IOMMU capability, *uacce* categorizes the devices as follows:
+
+UACCE_DEV_NOIOMMU
+        The device has no IOMMU. The user process cannot use VA on the hardware.
+        This mode is not recommended.
+
+UACCE_DEV_SVA (UACCE_DEV_PASID | UACCE_DEV_FAULT_FROM_DEV)
+        The device has an IOMMU which can share the same page table with the
+        user process
+
+UACCE_DEV_SHARE_DOMAIN
+        The device has an IOMMU which supports neither multiple page tables
+        nor device page faults
+
+If the device works in a mode other than UACCE_DEV_NOIOMMU, *uacce* will set
+its IOMMU to IOMMU_DOMAIN_UNMANAGED. So the driver must not use any kernel
+DMA API but the following ones from *uacce* instead: ::
+
+        uacce_dma_map(q, va, size, prot);
+        uacce_dma_unmap(q, va, size, prot);
+
+*uacce_dma_map/unmap* is valid only for UACCE_DEV_SVA devices. It creates a
+particular PASID and page table for the kernel in the IOMMU (not yet
+implemented in the RFC).
+
+For UACCE_DEV_SHARE_DOMAIN devices, uacce_dma_map/unmap is not valid.
+*Uacce* calls back start_queue only when the DUS and DKO regions are mmapped.
+The accelerator driver must use those DMA buffers, via uacce_queue->qfrs[], in
+the start_queue callback. The size of the queue file region is defined by
+uacce->ops->qf_pg_start[].
+
+We have to do it this way because most current IOMMUs cannot support kernel
+and user virtual addresses at the same time. So we have to let them both
+share the same user virtual address space.
+
+If the device has to support kernel and user at the same time, both the kernel
+and the user should use these DMA APIs. This is not convenient. A better
+solution is to change the future DMA/IOMMU design to let them separate the
+address space between the user and kernel space. But that is not going to
+happen soon.
+
+
+Multiple processes support
+==========================
+
+In the latest mainline kernel (4.19) at the time this document is written, the
+IOMMU subsystem does not support multiple process page tables yet.
+
+Most IOMMU hardware implementations support multi-process with the concept
+of PASID. But they may use a different name, e.g. it is called sub-stream-id
+in the SMMU of ARM. With PASID or a similar design, multiple page tables can
+be added to the IOMMU and referred to by their PASIDs.
+
+*JPB* has a patchset to enable this[1]_. We have tested it with our hardware
+(which is known as *D06*). It works well. *WarpDrive* relies on it to support
+UACCE_DEV_SVA. If it is not enabled, *WarpDrive* can still work. But it
+supports only one process: the device will be set to UACCE_DEV_SHARE_DOMAIN
+even if it is set to UACCE_DEV_SVA initially.
+
+Static Share Virtual Memory is mainly used by UACCE_DEV_SHARE_DOMAIN devices.
+
+
+Legacy Mode Support
+===================
+For hardware without an IOMMU, WarpDrive can still work; the only problem is
+that VA cannot be used in the device. The driver should adopt another strategy
+for the shared memory. It is only for testing, and not recommended.
+
+
+The Fork Scenario
+=================
+For a process with allocated queues and shared memory, what happens if it forks
+a child?
+
+The fd of the queue will be duplicated on fork, so the child can send requests
+to the same queue as its parent. But requests sent from any process other than
+the one that opened the queue will be blocked.
+
+It is recommended to add O_CLOEXEC to the queue file.
+
+The queue mmap space has VM_DONTCOPY set in its VMA. So the child will lose all
+those VMAs.
+
+This is why *WarpDrive* does not adopt the mode used in *VFIO* and *InfiniBand*.
+Both solutions can set any user pointer for hardware sharing. But they cannot
+support fork while DMA is in progress, or the "Copy-On-Write" procedure will
+make the parent process lose its physical pages.
+
+
+The Sample Code
+===============
+There is a sample user land implementation with a simple driver for Hisilicon
+Hi1620 ZIP Accelerator.
+
+To test, do the following in samples/warpdrive (for the case of PC host): ::
+        ./autogen.sh
+        ./conf.sh       # or simply ./configure if you build on target system
+        make
+
+Then you can get test_hisi_zip in the test subdirectory. Copy it to the target
+system and make sure the hisi_zip driver is enabled (the major and minor of
+the uacce chrdev can be obtained from dmesg or sysfs), and run: ::
+        mknod /dev/ua1 c
+        test/test_hisi_zip -z < data > data.zip
+        test/test_hisi_zip -g < data > data.gzip
+
+
+References
+==========
+.. [1] https://patchwork.kernel.org/patch/10394851/
+
+.. vim: tw=78
diff --git a/Documentation/warpdrive/wd-arch.svg b/Documentation/warpdrive/wd-arch.svg
new file mode 100644
index 000000000000..e59934188443
--- /dev/null
+++ b/Documentation/warpdrive/wd-arch.svg
@@ -0,0 +1,764 @@
[SVG markup lost in extraction. Text labels of the WarpDrive architecture diagram: WarpDrive; user_driver; uacce; Device Driver; <<anom_file>>Queue FD (1 to *); other standard framework(crypto/nic/others); <<lkm>>; uacce register api; register to other subsystem; <<user_lib>>; mmapped memory r/w interface; wd user api; Device(Hardware); IOMMU; manage the driver iommu state]
diff --git a/Documentation/warpdrive/wd.svg b/Documentation/warpdrive/wd.svg
new file mode 100644
index 000000000000..87ab92ebfbc6
--- /dev/null
+++ b/Documentation/warpdrive/wd.svg
@@ -0,0 +1,526 @@
[SVG markup lost in extraction. Text labels of the WarpDrive concept diagram: user application (running by the CPU; MMU; Memory; IOMMU; Hardware Accelerator]
diff --git a/Documentation/warpdrive/wd_q_addr_space.svg b/Documentation/warpdrive/wd_q_addr_space.svg
new file mode 100644
index 000000000000..5e6cf8e89908
--- /dev/null
+++ b/Documentation/warpdrive/wd_q_addr_space.svg
@@ -0,0 +1,359 @@
[SVG markup lost in extraction. Text labels of the queue address space diagram: queue file address space; offset 0; device region (mapped to device mmio or shared kernel driver memory); static share virtual memory region (for device without share virtual memory); device mmio region; device kernel only region; device user share region]

From patchwork Mon Nov 12 07:58:05 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kenneth Lee
X-Patchwork-Id: 150790
From: Kenneth Lee
To: Alexander Shishkin, Tim Sell, Sanyog Kale, Randy Dunlap, Uwe Kleine-König, Vinod Koul, David Kershner, Sagar Dharia, Gavin Schenk, Jens Axboe, Philippe Ombredanne, Cyrille Pitchen, Johan Hovold, Zhou Wang, Hao Fang, Jonathan Cameron, Zaibo Xu, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org, linux-accelerators@lists.ozlabs.org
Cc: linuxarm@huawei.com, guodong.xu@linaro.org, zhangfei.gao@foxmail.com, haojian.zhuang@linaro.org, Kenneth Lee
Subject: [RFCv3 PATCH 4/6] crypto/hisilicon: add Hisilicon zip driver
Date: Mon, 12 Nov 2018 15:58:05 +0800
Message-Id: <20181112075807.9291-5-nek.in.cn@gmail.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181112075807.9291-1-nek.in.cn@gmail.com>
References: <20181112075807.9291-1-nek.in.cn@gmail.com>
Sender: linux-crypto-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-crypto@vger.kernel.org

From: Kenneth Lee

The Hisilicon ZIP accelerator implements the zlib and gzip algorithms. It
uses Hisilicon QM as the interface to the CPU.

This patch provides a PCIe driver for the accelerator and registers it with
the crypto subsystem.

Signed-off-by: Kenneth Lee
Signed-off-by: Zhou Wang
Signed-off-by: Hao Fang
---
 drivers/crypto/hisilicon/Kconfig          |   7 +
 drivers/crypto/hisilicon/Makefile         |   1 +
 drivers/crypto/hisilicon/zip/Makefile     |   2 +
 drivers/crypto/hisilicon/zip/zip.h        |  57 ++++
 drivers/crypto/hisilicon/zip/zip_crypto.c | 362 ++++++++++++++++++++++
 drivers/crypto/hisilicon/zip/zip_crypto.h |  18 ++
 drivers/crypto/hisilicon/zip/zip_main.c   | 174 +++++++++++
 7 files changed, 621 insertions(+)
 create mode 100644 drivers/crypto/hisilicon/zip/Makefile
 create mode 100644 drivers/crypto/hisilicon/zip/zip.h
 create mode 100644 drivers/crypto/hisilicon/zip/zip_crypto.c
 create mode 100644 drivers/crypto/hisilicon/zip/zip_crypto.h
 create mode 100644 drivers/crypto/hisilicon/zip/zip_main.c

--
2.17.1

diff --git a/drivers/crypto/hisilicon/Kconfig b/drivers/crypto/hisilicon/Kconfig
index 0e40f4a6666b..ce9deefbf037 100644
--- a/drivers/crypto/hisilicon/Kconfig
+++ b/drivers/crypto/hisilicon/Kconfig
@@ -15,3 +15,10 @@ config CRYPTO_DEV_HISI_SEC
 config CRYPTO_DEV_HISI_QM
 	tristate
 	depends on ARM64 && PCI
+
+config CRYPTO_DEV_HISI_ZIP
+	tristate "Support for HISI ZIP Driver"
+	depends on ARM64
+	select CRYPTO_DEV_HISI_QM
+	help
+	  Support for HiSilicon HIP08 ZIP Driver.
diff --git a/drivers/crypto/hisilicon/Makefile b/drivers/crypto/hisilicon/Makefile
index 05e9052e0f52..c97c5b27c3cb 100644
--- a/drivers/crypto/hisilicon/Makefile
+++ b/drivers/crypto/hisilicon/Makefile
@@ -1,3 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_CRYPTO_DEV_HISI_SEC) += sec/
 obj-$(CONFIG_CRYPTO_DEV_HISI_QM) += qm.o
+obj-$(CONFIG_CRYPTO_DEV_HISI_ZIP) += zip/
diff --git a/drivers/crypto/hisilicon/zip/Makefile b/drivers/crypto/hisilicon/zip/Makefile
new file mode 100644
index 000000000000..a936f099ee22
--- /dev/null
+++ b/drivers/crypto/hisilicon/zip/Makefile
@@ -0,0 +1,2 @@
+obj-$(CONFIG_CRYPTO_DEV_HISI_ZIP) += hisi_zip.o
+hisi_zip-objs = zip_main.o zip_crypto.o
diff --git a/drivers/crypto/hisilicon/zip/zip.h b/drivers/crypto/hisilicon/zip/zip.h
new file mode 100644
index 000000000000..26d72f7153b0
--- /dev/null
+++ b/drivers/crypto/hisilicon/zip/zip.h
@@ -0,0 +1,57 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+#ifndef HISI_ZIP_H
+#define HISI_ZIP_H
+
+#include
+#include "../qm.h"
+
+#define HZIP_SQE_SIZE			128
+#define HZIP_SQ_SIZE			(HZIP_SQE_SIZE * QM_Q_DEPTH)
+#define QM_CQ_SIZE			(QM_CQE_SIZE * QM_Q_DEPTH)
+#define HZIP_PF_DEF_Q_NUM		64
+#define HZIP_PF_DEF_Q_BASE		0
+
+struct hisi_zip {
+	struct qm_info qm;
+	struct list_head list;
+
+#ifdef CONFIG_UACCE
+	struct uacce *uacce;
+#endif
+};
+
+struct hisi_zip_sqe {
+	__u32 consumed;
+	__u32 produced;
+	__u32 comp_data_length;
+	__u32 dw3;
+	__u32 input_data_length;
+	__u32 lba_l;
+	__u32 lba_h;
+	__u32 dw7;
+	__u32 dw8;
+	__u32 dw9;
+	__u32 dw10;
+	__u32 priv_info;
+	__u32 dw12;
+	__u32 tag;
+	__u32 dest_avail_out;
+	__u32 rsvd0;
+	__u32 comp_head_addr_l;
+	__u32 comp_head_addr_h;
+	__u32 source_addr_l;
+	__u32 source_addr_h;
+	__u32 dest_addr_l;
+	__u32 dest_addr_h;
+	__u32 stream_ctx_addr_l;
+	__u32 stream_ctx_addr_h;
+	__u32 cipher_key1_addr_l;
+	__u32 cipher_key1_addr_h;
+	__u32 cipher_key2_addr_l;
+	__u32 cipher_key2_addr_h;
+	__u32 rsvd1[4];
+};
+
+extern struct list_head hisi_zip_list;
+
+#endif
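Two properties of the zip.h layout above can be checked outside the kernel: struct hisi_zip_sqe is 32 words of 32 bits, which should equal HZIP_SQE_SIZE (128), and every buffer address is stored as a low/high pair of 32-bit fields. The following stand-alone user-space sketch mirrors the layout to illustrate both points. It is only an illustration under stated assumptions: `struct zip_sqe_mirror`, `lo32()`, `hi32()`, `sqe_addr()` and `sqe_set_buffers()` are hypothetical names that do not appear in the patch, and `uint32_t` stands in for `__u32`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HZIP_SQE_SIZE 128	/* value taken from zip.h in the patch */

/* Hypothetical user-space mirror of struct hisi_zip_sqe, same field order. */
struct zip_sqe_mirror {
	uint32_t consumed;
	uint32_t produced;
	uint32_t comp_data_length;
	uint32_t dw3;
	uint32_t input_data_length;
	uint32_t lba_l;
	uint32_t lba_h;
	uint32_t dw7;
	uint32_t dw8;
	uint32_t dw9;
	uint32_t dw10;
	uint32_t priv_info;
	uint32_t dw12;
	uint32_t tag;
	uint32_t dest_avail_out;
	uint32_t rsvd0;
	uint32_t comp_head_addr_l;
	uint32_t comp_head_addr_h;
	uint32_t source_addr_l;
	uint32_t source_addr_h;
	uint32_t dest_addr_l;
	uint32_t dest_addr_h;
	uint32_t stream_ctx_addr_l;
	uint32_t stream_ctx_addr_h;
	uint32_t cipher_key1_addr_l;
	uint32_t cipher_key1_addr_h;
	uint32_t cipher_key2_addr_l;
	uint32_t cipher_key2_addr_h;
	uint32_t rsvd1[4];
};

/* 32 words of 4 bytes each must add up to the declared SQE size. */
static_assert(sizeof(struct zip_sqe_mirror) == HZIP_SQE_SIZE,
	      "SQE mirror does not match HZIP_SQE_SIZE");

/* Split a 64-bit bus address into the two 32-bit halves the SQE stores. */
static uint32_t lo32(uint64_t v) { return (uint32_t)(v & 0xffffffffu); }
static uint32_t hi32(uint64_t v) { return (uint32_t)(v >> 32); }

/* Rebuild the 64-bit address from a low/high pair. */
static uint64_t sqe_addr(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical helper: clear the SQE, then set the lengths and addresses. */
static void sqe_set_buffers(struct zip_sqe_mirror *sqe, uint64_t src,
			    uint64_t dst, uint32_t in_len, uint32_t out_avail)
{
	memset(sqe, 0, sizeof(*sqe));
	sqe->input_data_length = in_len;
	sqe->dest_avail_out = out_avail;
	sqe->source_addr_l = lo32(src);
	sqe->source_addr_h = hi32(src);
	sqe->dest_addr_l = lo32(dst);
	sqe->dest_addr_h = hi32(dst);
}
```

The compile-time assertion is the useful part of the sketch: if a field were added or removed without adjusting the reserved words, the mismatch with HZIP_SQE_SIZE would fail the build rather than surface as a misaligned descriptor at run time.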
diff --git a/drivers/crypto/hisilicon/zip/zip_crypto.c b/drivers/crypto/hisilicon/zip/zip_crypto.c
new file mode 100644
index 000000000000..104e5140bf4b
--- /dev/null
+++ b/drivers/crypto/hisilicon/zip/zip_crypto.c
@@ -0,0 +1,362 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+/*
+ * Copyright 2018 (c) HiSilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include
+#include
+#include
+#include
+#include "../qm.h"
+#include "zip.h"
+
+#define INPUT_BUFFER_SIZE	(64 * 1024)
+#define OUTPUT_BUFFER_SIZE	(64 * 1024)
+
+#define COMP_NAME_TO_TYPE(alg_name)					\
+	(!strcmp((alg_name), "zlib-deflate") ? 0x02 :			\
+	 !strcmp((alg_name), "gzip") ? 0x03 : 0)			\
+
+struct hisi_zip_buffer {
+	u8 *input;
+	dma_addr_t input_dma;
+	u8 *output;
+	dma_addr_t output_dma;
+};
+
+struct hisi_zip_qp_ctx {
+	struct hisi_zip_buffer buffer;
+	struct hisi_qp *qp;
+	struct hisi_zip_sqe zip_sqe;
+};
+
+struct hisi_zip_ctx {
+#define QPC_COMP	0
+#define QPC_DECOMP	1
+	struct hisi_zip_qp_ctx qp_ctx[2];
+};
+
+static struct hisi_zip *find_zip_device(int node)
+{
+	struct hisi_zip *hisi_zip, *ret = NULL;
+	struct device *dev;
+	int min_distance = 100;
+
+	list_for_each_entry(hisi_zip, &hisi_zip_list, list) {
+		dev = &hisi_zip->qm.pdev->dev;
+		if (node_distance(dev->numa_node, node) < min_distance) {
+			ret = hisi_zip;
+			min_distance = node_distance(dev->numa_node, node);
+		}
+	}
+
+	return ret;
+}
+
+static void hisi_zip_qp_event_notifier(struct hisi_qp *qp)
+{
+	complete(&qp->completion);
+}
+
+static int hisi_zip_fill_sqe_v1(void *sqe, void *q_parm, u32 len)
+{
+	struct hisi_zip_sqe *zip_sqe = (struct hisi_zip_sqe *)sqe;
+	struct hisi_zip_qp_ctx *qp_ctx = (struct hisi_zip_qp_ctx *)q_parm;
+	struct hisi_zip_buffer *buffer = &qp_ctx->buffer;
+
+	memset(zip_sqe, 0, sizeof(struct hisi_zip_sqe));
+
+	zip_sqe->input_data_length = len;
+	zip_sqe->dw9 = qp_ctx->qp->req_type;
+	zip_sqe->dest_avail_out = OUTPUT_BUFFER_SIZE;
+	zip_sqe->source_addr_l = lower_32_bits(buffer->input_dma);
+	zip_sqe->source_addr_h = upper_32_bits(buffer->input_dma);
+	zip_sqe->dest_addr_l = lower_32_bits(buffer->output_dma);
+	zip_sqe->dest_addr_h = upper_32_bits(buffer->output_dma);
+
+	return 0;
+}
+
+/* let's allocate one buffer now, may have problem in async case */
+static int hisi_zip_alloc_qp_buffer(struct hisi_zip_qp_ctx *hisi_zip_qp_ctx)
+{
+	struct hisi_zip_buffer *buffer = &hisi_zip_qp_ctx->buffer;
+	struct hisi_qp *qp = hisi_zip_qp_ctx->qp;
+	struct device *dev = &qp->qm->pdev->dev;
+	int ret;
+
+	/* todo: we are using dma api here. it should be updated to uacce api
+	 * for user and kernel mode working at the same time
+	 */
+	buffer->input = dma_alloc_coherent(dev, INPUT_BUFFER_SIZE,
+					   &buffer->input_dma, GFP_KERNEL);
+	if (!buffer->input)
+		return -ENOMEM;
+
+	buffer->output = dma_alloc_coherent(dev, OUTPUT_BUFFER_SIZE,
+					    &buffer->output_dma, GFP_KERNEL);
+	if (!buffer->output) {
+		ret = -ENOMEM;
+		goto err_alloc_output_buffer;
+	}
+
+	return 0;
+
+err_alloc_output_buffer:
+	dma_free_coherent(dev, INPUT_BUFFER_SIZE, buffer->input,
+			  buffer->input_dma);
+	return ret;
+}
+
+static void hisi_zip_free_qp_buffer(struct hisi_zip_qp_ctx *hisi_zip_qp_ctx)
+{
+	struct hisi_zip_buffer *buffer = &hisi_zip_qp_ctx->buffer;
+	struct hisi_qp *qp = hisi_zip_qp_ctx->qp;
+	struct device *dev = &qp->qm->pdev->dev;
+
+	dma_free_coherent(dev, INPUT_BUFFER_SIZE, buffer->input,
+			  buffer->input_dma);
+	dma_free_coherent(dev, OUTPUT_BUFFER_SIZE, buffer->output,
+			  buffer->output_dma);
+}
+
+static int hisi_zip_create_qp(struct qm_info *qm, struct hisi_zip_qp_ctx *ctx,
+			      int alg_type, int req_type)
+{
+	struct hisi_qp *qp;
+	int ret;
+
+	qp = hisi_qm_create_qp(qm, alg_type);
+
+	if (IS_ERR(qp))
+		return PTR_ERR(qp);
+
+	qp->event_cb = hisi_zip_qp_event_notifier;
+	qp->req_type = req_type;
+
+	qp->qp_ctx = ctx;
+	ctx->qp = qp;
+
+	ret = hisi_zip_alloc_qp_buffer(ctx);
+	if (ret)
+		goto err_with_qp;
+
+	ret = hisi_qm_start_qp(qp, 0);
+	if (ret)
+		goto err_with_qp_buffer;
+
+	return 0;
+err_with_qp_buffer:
+	hisi_zip_free_qp_buffer(ctx);
+err_with_qp:
+	hisi_qm_release_qp(qp);
+	return ret;
+}
+
+static void hisi_zip_release_qp(struct hisi_zip_qp_ctx *ctx)
+{
+	hisi_qm_release_qp(ctx->qp);
+	hisi_zip_free_qp_buffer(ctx);
+}
+
+static int hisi_zip_alloc_comp_ctx(struct crypto_tfm *tfm)
+{
+	struct hisi_zip_ctx *hisi_zip_ctx = crypto_tfm_ctx(tfm);
+	const char *alg_name = crypto_tfm_alg_name(tfm);
+	struct hisi_zip *hisi_zip;
+	struct qm_info *qm;
+	int ret, i, j;
+	u8 req_type = COMP_NAME_TO_TYPE(alg_name);
+
+	pr_debug("hisi_zip init %s \n", alg_name);
+
+	/* find the proper zip device */
+	hisi_zip = find_zip_device(cpu_to_node(smp_processor_id()));
+	if (!hisi_zip) {
+		pr_err("Can not find proper ZIP device!\n");
+		return -1;
+	}
+	qm = &hisi_zip->qm;
+
+	for (i = 0; i < 2; i++) {
+		/* it is just happen that 0 is compress, 1 is decompress on alg_type */
+		ret = hisi_zip_create_qp(qm, &hisi_zip_ctx->qp_ctx[i], i,
+					 req_type);
+		if (ret)
+			goto err;
+	}
+
+	return 0;
+err:
+	for (j = i-1; j >= 0; j--)
+		hisi_zip_release_qp(&hisi_zip_ctx->qp_ctx[j]);
+
+	return ret;
+}
+
+static void hisi_zip_free_comp_ctx(struct crypto_tfm *tfm)
+{
+	struct hisi_zip_ctx *hisi_zip_ctx = crypto_tfm_ctx(tfm);
+	int i;
+
+	/* release the qp */
+	for (i = 1; i >= 0; i--)
+		hisi_zip_release_qp(&hisi_zip_ctx->qp_ctx[i]);
+}
+
+static int hisi_zip_copy_data_to_buffer(struct hisi_zip_qp_ctx *qp_ctx,
+					const u8 *src, unsigned int slen)
+{
+	struct hisi_zip_buffer *buffer = &qp_ctx->buffer;
+
+	if (slen > INPUT_BUFFER_SIZE)
+		return -EINVAL;
+
+	memcpy(buffer->input, src, slen);
+
+	return 0;
+}
+
+static struct hisi_zip_sqe *hisi_zip_get_writeback_sqe(struct hisi_qp *qp)
+{
+	struct hisi_acc_qp_status *qp_status = &qp->qp_status;
+	struct hisi_zip_sqe *sq_base = qp->sqe;
+	u16 sq_head = qp_status->sq_head;
+
+	return sq_base + sq_head;
+}
+
+static int hisi_zip_copy_data_from_buffer(struct hisi_zip_qp_ctx *qp_ctx,
+					  u8 *dst, unsigned int *dlen)
+{
+	struct hisi_zip_buffer *buffer = &qp_ctx->buffer;
+	struct hisi_qp *qp = qp_ctx->qp;
+	struct hisi_zip_sqe *zip_sqe = hisi_zip_get_writeback_sqe(qp);
+	u32 status = zip_sqe->dw3 & 0xff;
+
+	if (status != 0) {
+		pr_err("hisi zip: %s fail!\n", (qp->alg_type == 0) ?
+		       "compression" : "decompression");
+		return status;
+	}
+
+	if (zip_sqe->produced > OUTPUT_BUFFER_SIZE)
+		return -ENOMEM;
+
+	memcpy(dst, buffer->output, zip_sqe->produced);
+	*dlen = zip_sqe->produced;
+	qp->qp_status.sq_head++;
+
+	return 0;
+}
+
+static int hisi_zip_compress(struct crypto_tfm *tfm, const u8 *src,
+			     unsigned int slen, u8 *dst, unsigned int *dlen)
+{
+	struct hisi_zip_ctx *hisi_zip_ctx = crypto_tfm_ctx(tfm);
+	struct hisi_zip_qp_ctx *qp_ctx = &hisi_zip_ctx->qp_ctx[QPC_COMP];
+	struct hisi_qp *qp = qp_ctx->qp;
+	struct hisi_zip_sqe *zip_sqe = &qp_ctx->zip_sqe;
+	int ret;
+
+	ret = hisi_zip_copy_data_to_buffer(qp_ctx, src, slen);
+	if (ret < 0)
+		return ret;
+
+	hisi_zip_fill_sqe_v1(zip_sqe, qp_ctx, slen);
+
+	/* send command to start the compress job */
+	hisi_qp_send(qp, zip_sqe);
+
+	return hisi_zip_copy_data_from_buffer(qp_ctx, dst, dlen);
+}
+
+static int hisi_zip_decompress(struct crypto_tfm *tfm, const u8 *src,
+			       unsigned int slen, u8 *dst, unsigned int *dlen)
+{
+	struct hisi_zip_ctx *hisi_zip_ctx = crypto_tfm_ctx(tfm);
+	struct hisi_zip_qp_ctx *qp_ctx = &hisi_zip_ctx->qp_ctx[QPC_DECOMP];
+	struct hisi_qp *qp = qp_ctx->qp;
+	struct hisi_zip_sqe *zip_sqe = &qp_ctx->zip_sqe;
+	int ret;
+
+	ret = hisi_zip_copy_data_to_buffer(qp_ctx, src, slen);
+	if (ret < 0)
+		return ret;
+
+	hisi_zip_fill_sqe_v1(zip_sqe, qp_ctx, slen);
+
+	/* send command to start the compress job */
+	hisi_qp_send(qp, zip_sqe);
+
+	return hisi_zip_copy_data_from_buffer(qp_ctx, dst, dlen);
+}
+
+static struct crypto_alg hisi_zip_zlib = {
+	.cra_name		= "zlib-deflate",
+	.cra_flags		= CRYPTO_ALG_TYPE_COMPRESS,
+	.cra_ctxsize		= sizeof(struct hisi_zip_ctx),
+	.cra_priority		= 300,
+	.cra_module		= THIS_MODULE,
+	.cra_init		= hisi_zip_alloc_comp_ctx,
+	.cra_exit		= hisi_zip_free_comp_ctx,
+	.cra_u			= {
+		.compress = {
+			.coa_compress	= hisi_zip_compress,
+			.coa_decompress	= hisi_zip_decompress
+		}
+	}
+};
+
+static struct crypto_alg hisi_zip_gzip = {
+	.cra_name		= "gzip",
+	.cra_flags		= CRYPTO_ALG_TYPE_COMPRESS,
+	.cra_ctxsize		= sizeof(struct hisi_zip_ctx),
+	.cra_priority		= 300,
+	.cra_module		= THIS_MODULE,
+	.cra_init		= hisi_zip_alloc_comp_ctx,
+	.cra_exit		= hisi_zip_free_comp_ctx,
+	.cra_u			= {
+		.compress = {
+			.coa_compress	= hisi_zip_compress,
+			.coa_decompress	= hisi_zip_decompress
+		}
+	}
+};
+
+int hisi_zip_register_to_crypto(void)
+{
+	int ret;
+
+	ret = crypto_register_alg(&hisi_zip_zlib);
+	if (ret) {
+		pr_err("Zlib algorithm registration failed\n");
+		return ret;
+	} else
+		pr_debug("hisi_zip: registered algorithm zlib\n");
+
+	ret = crypto_register_alg(&hisi_zip_gzip);
+	if (ret) {
+		pr_err("Gzip algorithm registration failed\n");
+		goto err_unregister_zlib;
+	} else
+		pr_debug("hisi_zip: registered algorithm gzip\n");
+
+	return 0;
+
+err_unregister_zlib:
+	crypto_unregister_alg(&hisi_zip_zlib);
+
+	return ret;
+}
+
+void hisi_zip_unregister_from_crypto(void)
+{
+	crypto_unregister_alg(&hisi_zip_zlib);
+	crypto_unregister_alg(&hisi_zip_gzip);
+}
diff --git a/drivers/crypto/hisilicon/zip/zip_crypto.h b/drivers/crypto/hisilicon/zip/zip_crypto.h
new file mode 100644
index 000000000000..6fae34b4df3a
--- /dev/null
+++ b/drivers/crypto/hisilicon/zip/zip_crypto.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+/*
+ * Copyright (c) 2018 HiSilicon Limited.
+ * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + */ + +#ifndef HISI_ZIP_CRYPTO_H +#define HISI_ZIP_CRYPTO_H + +int hisi_zip_register_to_crypto(void); +void hisi_zip_unregister_from_crypto(void); + +#endif diff --git a/drivers/crypto/hisilicon/zip/zip_main.c b/drivers/crypto/hisilicon/zip/zip_main.c new file mode 100644 index 000000000000..f4f3b6d89340 --- /dev/null +++ b/drivers/crypto/hisilicon/zip/zip_main.c @@ -0,0 +1,174 @@ +// SPDX-License-Identifier: GPL-2.0+ +#include +#include +#include +#include +#include +#include +#include "zip.h" +#include "zip_crypto.h" + +#define HZIP_VF_NUM 63 +#define HZIP_QUEUE_NUM_V1 4096 +#define HZIP_QUEUE_NUM_V2 1024 + +#define HZIP_FSM_MAX_CNT 0x301008 + +#define HZIP_PORT_ARCA_CHE_0 0x301040 +#define HZIP_PORT_ARCA_CHE_1 0x301044 +#define HZIP_PORT_AWCA_CHE_0 0x301060 +#define HZIP_PORT_AWCA_CHE_1 0x301064 + +#define HZIP_BD_RUSER_32_63 0x301110 +#define HZIP_SGL_RUSER_32_63 0x30111c +#define HZIP_DATA_RUSER_32_63 0x301128 +#define HZIP_DATA_WUSER_32_63 0x301134 +#define HZIP_BD_WUSER_32_63 0x301140 + +LIST_HEAD(hisi_zip_list); +DEFINE_MUTEX(hisi_zip_list_lock); + +static const char hisi_zip_name[] = "hisi_zip"; + +static const struct pci_device_id hisi_zip_dev_ids[] = { + { PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, 0xa250) }, + { 0, } +}; + +static inline void hisi_zip_add_to_list(struct hisi_zip *hisi_zip) +{ + mutex_lock(&hisi_zip_list_lock); + list_add_tail(&hisi_zip->list, &hisi_zip_list); + mutex_unlock(&hisi_zip_list_lock); +} + +static void hisi_zip_set_user_domain_and_cache(struct hisi_zip *hisi_zip) +{ + u32 val; + + /* qm user domain */ + writel(0x40001070, hisi_zip->qm.io_base + QM_ARUSER_M_CFG_1); + writel(0xfffffffe, hisi_zip->qm.io_base + QM_ARUSER_M_CFG_ENABLE); + writel(0x40001070, hisi_zip->qm.io_base + 
QM_AWUSER_M_CFG_1); + writel(0xfffffffe, hisi_zip->qm.io_base + QM_AWUSER_M_CFG_ENABLE); + writel(0xffffffff, hisi_zip->qm.io_base + QM_WUSER_M_CFG_ENABLE); + writel(0x4893, hisi_zip->qm.io_base + QM_CACHE_CTL); + + val = readl(hisi_zip->qm.io_base + QM_PEH_AXUSER_CFG); + val |= (1 << 11); + writel(val, hisi_zip->qm.io_base + QM_PEH_AXUSER_CFG); + + /* qm cache */ + writel(0xffff, hisi_zip->qm.io_base + QM_AXI_M_CFG); + writel(0xffffffff, hisi_zip->qm.io_base + QM_AXI_M_CFG_ENABLE); + writel(0xffffffff, hisi_zip->qm.io_base + QM_PEH_AXUSER_CFG_ENABLE); + + /* cache */ + writel(0xffffffff, hisi_zip->qm.io_base + HZIP_PORT_ARCA_CHE_0); + writel(0xffffffff, hisi_zip->qm.io_base + HZIP_PORT_ARCA_CHE_1); + writel(0xffffffff, hisi_zip->qm.io_base + HZIP_PORT_AWCA_CHE_0); + writel(0xffffffff, hisi_zip->qm.io_base + HZIP_PORT_AWCA_CHE_1); + /* user domain configurations */ + writel(0x40001070, hisi_zip->qm.io_base + HZIP_BD_RUSER_32_63); + writel(0x40001070, hisi_zip->qm.io_base + HZIP_SGL_RUSER_32_63); + writel(0x40001071, hisi_zip->qm.io_base + HZIP_DATA_RUSER_32_63); + writel(0x40001071, hisi_zip->qm.io_base + HZIP_DATA_WUSER_32_63); + writel(0x40001070, hisi_zip->qm.io_base + HZIP_BD_WUSER_32_63); + + /* fsm count */ + writel(0xfffffff, hisi_zip->qm.io_base + HZIP_FSM_MAX_CNT); + + /* clock gating, core, decompress verify enable */ + writel(0x10005, hisi_zip->qm.io_base + 0x301004); +} + +static int hisi_zip_probe(struct pci_dev *pdev, const struct pci_device_id *id) +{ + struct hisi_zip *hisi_zip; + struct qm_info *qm; + int ret; + u8 rev_id; + + hisi_zip = devm_kzalloc(&pdev->dev, sizeof(*hisi_zip), GFP_KERNEL); + if (!hisi_zip) + return -ENOMEM; + hisi_zip_add_to_list(hisi_zip); //todo: this is needed only by crypto + + qm = &hisi_zip->qm; + qm->pdev = pdev; + qm->qp_base = HZIP_PF_DEF_Q_BASE; + qm->qp_num = HZIP_PF_DEF_Q_NUM; + qm->sqe_size = HZIP_SQE_SIZE; + + pci_set_drvdata(pdev, hisi_zip); + + pci_read_config_byte(pdev, PCI_REVISION_ID, &rev_id); + if (rev_id 
== 0x20) + qm->ver = 1; + + ret = hisi_qm_init(qm); + if (ret) + return ret; + + if (pdev->is_physfn) + hisi_zip_set_user_domain_and_cache(hisi_zip); + + ret = hisi_qm_start(qm); + if (ret) + goto err_with_qm_init; + + return 0; + +err_with_qm_init: + hisi_qm_uninit(qm); + return ret; +} + +static void hisi_zip_remove(struct pci_dev *pdev) +{ + struct hisi_zip *hisi_zip = pci_get_drvdata(pdev); + struct qm_info *qm = &hisi_zip->qm; + + hisi_qm_stop(qm); + hisi_qm_uninit(qm); +} + +static struct pci_driver hisi_zip_pci_driver = { + .name = "hisi_zip", + .id_table = hisi_zip_dev_ids, + .probe = hisi_zip_probe, + .remove = hisi_zip_remove, +}; + +static int __init hisi_zip_init(void) +{ + int ret; + + ret = pci_register_driver(&hisi_zip_pci_driver); + if (ret < 0) { + pr_err("zip: can't register hisi zip driver.\n"); + return ret; + } + + ret = hisi_zip_register_to_crypto(); + if (ret < 0) { + pci_unregister_driver(&hisi_zip_pci_driver); + return ret; + } + + return 0; +} + +static void __exit hisi_zip_exit(void) +{ + hisi_zip_unregister_from_crypto(); + pci_unregister_driver(&hisi_zip_pci_driver); +} + +module_init(hisi_zip_init); +module_exit(hisi_zip_exit); + +MODULE_LICENSE("GPL v2"); +MODULE_AUTHOR("Zhou Wang "); +MODULE_DESCRIPTION("Driver for HiSilicon ZIP accelerator"); +MODULE_DEVICE_TABLE(pci, hisi_zip_dev_ids); From patchwork Mon Nov 12 07:58:07 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kenneth Lee X-Patchwork-Id: 150792 Delivered-To: patch@linaro.org Received: by 2002:a2e:299d:0:0:0:0:0 with SMTP id p29-v6csp2858138ljp; Mon, 12 Nov 2018 00:01:27 -0800 (PST) X-Google-Smtp-Source: AJdET5eWtLwWlzErOzaOV/+IAi5IdM2DffSG+CSmPHRiEeae0icwIeYZX0P6CwVlZSbMj26LfgKe X-Received: by 2002:a62:f24f:: with SMTP id y15-v6mr19070813pfl.25.1542009687502; Mon, 12 Nov 2018 00:01:27 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1542009687; cv=none; d=google.com; s=arc-20160816; 
From: Kenneth Lee
To: Alexander Shishkin, Tim Sell, Sanyog Kale, Randy Dunlap, Uwe Kleine-König, Vinod Koul, David Kershner, Sagar Dharia, Gavin Schenk, Jens Axboe, Philippe Ombredanne, Cyrille Pitchen, Johan Hovold, Zhou Wang, Hao Fang, Jonathan Cameron, Zaibo Xu, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org, linux-accelerators@lists.ozlabs.org
Cc: linuxarm@huawei.com, guodong.xu@linaro.org, zhangfei.gao@foxmail.com, haojian.zhuang@linaro.org, Kenneth Lee
Subject: [RFCv3 PATCH 6/6] uacce: add user sample for uacce/warpdrive
Date: Mon, 12 Nov 2018 15:58:07 +0800
Message-Id: <20181112075807.9291-7-nek.in.cn@gmail.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181112075807.9291-1-nek.in.cn@gmail.com>
References: <20181112075807.9291-1-nek.in.cn@gmail.com>
Sender: linux-crypto-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-crypto@vger.kernel.org

From: Kenneth Lee

This is sample code to demonstrate how a WarpDrive user application should
be built. (Uacce is the kernel component of WarpDrive.)

It contains:

1. wd.[ch]: the common library that provides the WarpDrive interface.
2. wd_adapter.[ch]: the adapter through which wd calls the different user drivers
3.
drv/*: the user driver to access the hardware space
4. test/*: the test application

The Hisilicon HIP08 ZIP accelerator is used in this sample.

Signed-off-by: Zaibo Xu
Signed-off-by: Kenneth Lee
Signed-off-by: Hao Fang
Signed-off-by: Zhou Wang
---
 samples/warpdrive/AUTHORS              |   3 +
 samples/warpdrive/ChangeLog            |   1 +
 samples/warpdrive/Makefile.am          |   9 +
 samples/warpdrive/NEWS                 |   1 +
 samples/warpdrive/README               |  32 ++++
 samples/warpdrive/autogen.sh           |   3 +
 samples/warpdrive/cleanup.sh           |  13 ++
 samples/warpdrive/conf.sh              |   4 +
 samples/warpdrive/configure.ac         |  52 ++++++
 samples/warpdrive/drv/hisi_qm_udrv.c   | 228 +++++++++++++++++++++++++
 samples/warpdrive/drv/hisi_qm_udrv.h   |  57 +++++++
 samples/warpdrive/drv/wd_drv.h         |  19 +++
 samples/warpdrive/test/Makefile.am     |   7 +
 samples/warpdrive/test/test_hisi_zip.c | 150 ++++++++++++++++
 samples/warpdrive/wd.c                 |  96 +++++++++++
 samples/warpdrive/wd.h                 |  97 +++++++++++
 samples/warpdrive/wd_adapter.c         |  71 ++++++++
 samples/warpdrive/wd_adapter.h         |  36 ++++
 18 files changed, 879 insertions(+)
 create mode 100644 samples/warpdrive/AUTHORS
 create mode 100644 samples/warpdrive/ChangeLog
 create mode 100644 samples/warpdrive/Makefile.am
 create mode 100644 samples/warpdrive/NEWS
 create mode 100644 samples/warpdrive/README
 create mode 100755 samples/warpdrive/autogen.sh
 create mode 100755 samples/warpdrive/cleanup.sh
 create mode 100755 samples/warpdrive/conf.sh
 create mode 100644 samples/warpdrive/configure.ac
 create mode 100644 samples/warpdrive/drv/hisi_qm_udrv.c
 create mode 100644 samples/warpdrive/drv/hisi_qm_udrv.h
 create mode 100644 samples/warpdrive/drv/wd_drv.h
 create mode 100644 samples/warpdrive/test/Makefile.am
 create mode 100644 samples/warpdrive/test/test_hisi_zip.c
 create mode 100644 samples/warpdrive/wd.c
 create mode 100644 samples/warpdrive/wd.h
 create mode 100644 samples/warpdrive/wd_adapter.c
 create mode 100644 samples/warpdrive/wd_adapter.h
--
2.17.1

diff --git a/samples/warpdrive/AUTHORS b/samples/warpdrive/AUTHORS
new file mode 100644
index 000000000000..bb55d2769147
--- /dev/null
+++ b/samples/warpdrive/AUTHORS
@@ -0,0 +1,3 @@
+Kenneth Lee
+Zaibo Xu
+Zhou Wang
diff --git a/samples/warpdrive/ChangeLog b/samples/warpdrive/ChangeLog
new file mode 100644
index 000000000000..b1b716105590
--- /dev/null
+++ b/samples/warpdrive/ChangeLog
@@ -0,0 +1 @@
+init
diff --git a/samples/warpdrive/Makefile.am b/samples/warpdrive/Makefile.am
new file mode 100644
index 000000000000..41154a880a97
--- /dev/null
+++ b/samples/warpdrive/Makefile.am
@@ -0,0 +1,9 @@
+ACLOCAL_AMFLAGS = -I m4
+AUTOMAKE_OPTIONS = foreign subdir-objects
+AM_CFLAGS=-Wall -O0 -fno-strict-aliasing
+
+lib_LTLIBRARIES=libwd.la
+libwd_la_SOURCES=wd.c wd_adapter.c wd.h wd_adapter.h \
+	drv/hisi_qm_udrv.c drv/hisi_qm_udrv.h
+
+SUBDIRS=. test
diff --git a/samples/warpdrive/NEWS b/samples/warpdrive/NEWS
new file mode 100644
index 000000000000..b1b716105590
--- /dev/null
+++ b/samples/warpdrive/NEWS
@@ -0,0 +1 @@
+init
diff --git a/samples/warpdrive/README b/samples/warpdrive/README
new file mode 100644
index 000000000000..3adf66b112fc
--- /dev/null
+++ b/samples/warpdrive/README
@@ -0,0 +1,32 @@
+WD User Land Demonstration
+==========================
+
+This directory contains some applications and libraries to demonstrate how a
+
+WrapDrive application can be constructed.
+
+
+As a demo, we try to make it simple and clear for understanding. It is not
+
+supposed to be used in business scenarios.
+
+
+The directory contains the following elements:
+
+wd.[ch]
+	A demonstration WrapDrive fundamental library which wraps the basic
+	operations to the WrapDrive-ed device.
+
+wd_adapter.[ch]
+	User driver adaptor for wd.[ch]
+
+wd_utils.[ch]
+	Some utility functions used by WD and its drivers
+
+drv/*
+	User drivers.
They help to fulfill the semantics of wd.[ch] for
+	particular hardware
+
+test/*
+	Test applications to use the wrapdrive library
+
diff --git a/samples/warpdrive/autogen.sh b/samples/warpdrive/autogen.sh
new file mode 100755
index 000000000000..58deaf49de2a
--- /dev/null
+++ b/samples/warpdrive/autogen.sh
@@ -0,0 +1,3 @@
+#!/bin/sh -x
+
+autoreconf -i -f -v
diff --git a/samples/warpdrive/cleanup.sh b/samples/warpdrive/cleanup.sh
new file mode 100755
index 000000000000..c5f3d21e5dc1
--- /dev/null
+++ b/samples/warpdrive/cleanup.sh
@@ -0,0 +1,13 @@
+#!/bin/sh
+
+if [ -r Makefile ]; then
+	make distclean
+fi
+
+FILES="aclocal.m4 autom4te.cache compile config.guess config.h.in config.log \
+	config.status config.sub configure cscope.out depcomp install-sh \
+	libsrc/Makefile libsrc/Makefile.in libtool ltmain.sh Makefile \
+	ar-lib m4 \
+	Makefile.in missing src/Makefile src/Makefile.in test/Makefile.in"
+
+rm -vRf $FILES
diff --git a/samples/warpdrive/conf.sh b/samples/warpdrive/conf.sh
new file mode 100755
index 000000000000..2af8a54c5126
--- /dev/null
+++ b/samples/warpdrive/conf.sh
@@ -0,0 +1,4 @@
+ac_cv_func_malloc_0_nonnull=yes ac_cv_func_realloc_0_nonnull=yes ./configure \
+	--host aarch64-linux-gnu \
+	--target aarch64-linux-gnu \
+	--program-prefix aarch64-linux-gnu-
diff --git a/samples/warpdrive/configure.ac b/samples/warpdrive/configure.ac
new file mode 100644
index 000000000000..53262f3197c2
--- /dev/null
+++ b/samples/warpdrive/configure.ac
@@ -0,0 +1,52 @@
+AC_PREREQ([2.69])
+AC_INIT([wrapdrive], [0.1], [liguozhu@hisilicon.com])
+AC_CONFIG_SRCDIR([wd.c])
+AM_INIT_AUTOMAKE([1.10 no-define])
+
+AC_CONFIG_MACRO_DIR([m4])
+AC_CONFIG_HEADERS([config.h])
+
+# Checks for programs.
+AC_PROG_CXX +AC_PROG_AWK +AC_PROG_CC +AC_PROG_CPP +AC_PROG_INSTALL +AC_PROG_LN_S +AC_PROG_MAKE_SET +AC_PROG_RANLIB + +AM_PROG_AR +AC_PROG_LIBTOOL +AM_PROG_LIBTOOL +LT_INIT +AM_PROG_CC_C_O + +AC_DEFINE([HAVE_SVA], [0], [enable SVA support]) +AC_ARG_ENABLE([sva], + [ --enable-sva enable to support sva feature], + AC_DEFINE([HAVE_SVA], [1])) + +# Checks for libraries. + +# Checks for header files. +AC_CHECK_HEADERS([fcntl.h stdint.h stdlib.h string.h sys/ioctl.h sys/time.h unistd.h]) + +# Checks for typedefs, structures, and compiler characteristics. +AC_CHECK_HEADER_STDBOOL +AC_C_INLINE +AC_TYPE_OFF_T +AC_TYPE_SIZE_T +AC_TYPE_UINT16_T +AC_TYPE_UINT32_T +AC_TYPE_UINT64_T +AC_TYPE_UINT8_T + +# Checks for library functions. +AC_FUNC_MALLOC +AC_FUNC_MMAP +AC_CHECK_FUNCS([memset munmap]) + +AC_CONFIG_FILES([Makefile + test/Makefile]) +AC_OUTPUT diff --git a/samples/warpdrive/drv/hisi_qm_udrv.c b/samples/warpdrive/drv/hisi_qm_udrv.c new file mode 100644 index 000000000000..5e623f31e2cb --- /dev/null +++ b/samples/warpdrive/drv/hisi_qm_udrv.c @@ -0,0 +1,228 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "wd_drv.h" +#include "hisi_qm_udrv.h" + +#define QM_SQE_SIZE 128 /* todo: get it from sysfs */ +#define QM_CQE_SIZE 16 + +#define DOORBELL_CMD_SQ 0 +#define DOORBELL_CMD_CQ 1 + +/* cqe shift */ +#define CQE_PHASE(cq) (((*((__u32 *)(cq) + 3)) >> 16) & 0x1) +#define CQE_SQ_NUM(cq) ((*((__u32 *)(cq) + 2)) >> 16) +#define CQE_SQ_HEAD_INDEX(cq) ((*((__u32 *)(cq) + 2)) & 0xffff) + +struct hisi_acc_qm_sqc { + __u16 sqn; +}; + +struct hisi_qm_queue_info { + void *sq_base; + void *cq_base; + void *doorbell_base; + void *dko_base; + __u16 sq_tail_index; + __u16 sq_head_index; + __u16 cq_head_index; + __u16 sqn; + bool cqc_phase; + void *req_cache[QM_Q_DEPTH]; + int is_sq_full; +}; + +int hacc_db(struct hisi_qm_queue_info *q, __u8 cmd, __u16 index, __u8 
priority) +{ + void *base = q->doorbell_base; + __u16 sqn = q->sqn; + __u64 doorbell = 0; + + doorbell = (__u64)sqn | ((__u64)cmd << 16); + doorbell |= ((__u64)index | ((__u64)priority << 16)) << 32; + + *((__u64 *)base) = doorbell; + + return 0; +} + +static int hisi_qm_fill_sqe(void *msg, struct hisi_qm_queue_info *info, __u16 i) +{ + struct hisi_qm_msg *sqe = (struct hisi_qm_msg *)info->sq_base + i; + + memcpy((void *)sqe, msg, sizeof(struct hisi_qm_msg)); + assert(!info->req_cache[i]); + info->req_cache[i] = msg; + + return 0; +} + +static int hisi_qm_recv_sqe(struct hisi_qm_msg *sqe, + struct hisi_qm_queue_info *info, __u16 i) +{ + __u32 status = sqe->dw3 & 0xff; + __u32 type = sqe->dw9 & 0xff; + + if (status != 0 && status != 0x0d) { + fprintf(stderr, "bad status (s=%d, t=%d)\n", status, type); + return -EIO; + } + + assert(info->req_cache[i]); + memcpy((void *)info->req_cache[i], sqe, sizeof(struct hisi_qm_msg)); + return 0; +} + +int hisi_qm_set_queue_dio(struct wd_queue *q) +{ + struct hisi_qm_queue_info *info; + void *vaddr; + int ret; + + alloc_obj(info); + if (!info) + return -1; + + q->priv = info; + + vaddr = wd_drv_mmap(q, QM_DUS_SIZE, QM_DUS_START); + if (vaddr <= 0) { + ret = (intptr_t)vaddr; + goto err_with_info; + } + info->sq_base = vaddr; + info->cq_base = vaddr + QM_SQE_SIZE * QM_Q_DEPTH; + + vaddr = wd_drv_mmap(q, QM_DOORBELL_SIZE, QM_DOORBELL_START); + if (vaddr <= 0) { + ret = (intptr_t)vaddr; + goto err_with_dus; + } + info->doorbell_base = vaddr + QM_DOORBELL_OFFSET; + info->sq_tail_index = 0; + info->sq_head_index = 0; + info->cq_head_index = 0; + info->cqc_phase = 1; + info->is_sq_full = 0; + + vaddr = wd_drv_mmap(q, QM_DKO_SIZE, QM_DKO_START); + if (vaddr <= 0) { + ret = (intptr_t)vaddr; + goto err_with_db; + } + info->dko_base = vaddr; + + return 0; + +err_with_db: + munmap(info->doorbell_base - QM_DOORBELL_OFFSET, QM_DOORBELL_SIZE); +err_with_dus: + munmap(info->sq_base, QM_DUS_SIZE); +err_with_info: + free(info); + return ret; +} + 
+void hisi_qm_unset_queue_dio(struct wd_queue *q) +{ + struct hisi_qm_queue_info *info = (struct hisi_qm_queue_info *)q->priv; + + munmap(info->dko_base, QM_DKO_SIZE); + munmap(info->doorbell_base - QM_DOORBELL_OFFSET, QM_DOORBELL_SIZE); + munmap(info->sq_base, QM_DUS_SIZE); + free(info); + q->priv = NULL; +} + +int hisi_qm_add_to_dio_q(struct wd_queue *q, void *req) +{ + struct hisi_qm_queue_info *info = (struct hisi_qm_queue_info *)q->priv; + __u16 i; + + if (info->is_sq_full) + return -EBUSY; + + i = info->sq_tail_index; + + hisi_qm_fill_sqe(req, q->priv, i); + + mb(); /* make sure the request is all in memory before doorbell*/ + fprintf(stderr, "fill sqe\n"); + + if (i == (QM_Q_DEPTH - 1)) + i = 0; + else + i++; + + hacc_db(info, DOORBELL_CMD_SQ, i, 0); + fprintf(stderr, "db\n"); + + info->sq_tail_index = i; + + if (i == info->sq_head_index) + info->is_sq_full = 1; + + return 0; +} + +int hisi_qm_get_from_dio_q(struct wd_queue *q, void **resp) +{ + struct hisi_qm_queue_info *info = (struct hisi_qm_queue_info *)q->priv; + __u16 i = info->cq_head_index; + struct cqe *cq_base = info->cq_base; + struct hisi_qm_msg *sq_base = info->sq_base; + struct cqe *cqe = cq_base + i; + struct hisi_qm_msg *sqe; + int ret; + + if (info->cqc_phase == CQE_PHASE(cqe)) { + sqe = sq_base + CQE_SQ_HEAD_INDEX(cqe); + ret = hisi_qm_recv_sqe(sqe, info, i); + if (ret < 0) + return -EIO; + + if (info->is_sq_full) + info->is_sq_full = 0; + } else { + return -EAGAIN; + } + + *resp = info->req_cache[i]; + info->req_cache[i] = NULL; + + if (i == (QM_Q_DEPTH - 1)) { + info->cqc_phase = !(info->cqc_phase); + i = 0; + } else + i++; + + hacc_db(info, DOORBELL_CMD_CQ, i, 0); + + info->cq_head_index = i; + info->sq_head_index = i; + + + return ret; +} + +void *hisi_qm_preserve_mem(struct wd_queue *q, size_t size) +{ + void *mem = wd_drv_mmap(q, size, QM_SS_START); + + if (mem == MAP_FAILED) + return NULL; + else + return mem; +} diff --git a/samples/warpdrive/drv/hisi_qm_udrv.h 
b/samples/warpdrive/drv/hisi_qm_udrv.h new file mode 100644 index 000000000000..694eb4dd65de --- /dev/null +++ b/samples/warpdrive/drv/hisi_qm_udrv.h @@ -0,0 +1,57 @@ +// SPDX-License-Identifier: GPL-2.0 +#ifndef __HZIP_DRV_H__ +#define __HZIP_DRV_H__ + +#include +#include "../wd.h" +#include "../../drivers/crypto/hisilicon/qm_usr_if.h" + +/* this is unnecessary big, the hardware should optimize it */ +struct hisi_qm_msg { + __u32 consumed; + __u32 produced; + __u32 comp_date_length; + __u32 dw3; + __u32 input_date_length; + __u32 lba_l; + __u32 lba_h; + __u32 dw7; + __u32 dw8; + __u32 dw9; + __u32 dw10; + __u32 priv_info; + __u32 dw12; + __u32 tag; + __u32 dest_avail_out; + __u32 rsvd0; + __u32 comp_head_addr_l; + __u32 comp_head_addr_h; + __u32 source_addr_l; + __u32 source_addr_h; + __u32 dest_addr_l; + __u32 dest_addr_h; + __u32 stream_ctx_addr_l; + __u32 stream_ctx_addr_h; + __u32 cipher_key1_addr_l; + __u32 cipher_key1_addr_h; + __u32 cipher_key2_addr_l; + __u32 cipher_key2_addr_h; + __u32 rsvd1[4]; +}; + +int hisi_qm_set_queue_dio(struct wd_queue *q); +void hisi_qm_unset_queue_dio(struct wd_queue *q); +int hisi_qm_add_to_dio_q(struct wd_queue *q, void *req); +int hisi_qm_get_from_dio_q(struct wd_queue *q, void **resp); +void *hisi_qm_preserve_mem(struct wd_queue *q, size_t size); + +#define QM_DOORBELL_SIZE (QM_DOORBELL_PAGE_NR * PAGE_SIZE) +#define QM_DKO_SIZE (QM_DKO_PAGE_NR * PAGE_SIZE) +#define QM_DUS_SIZE (QM_DUS_PAGE_NR * PAGE_SIZE) + +#define QM_DOORBELL_START 0 +#define QM_DKO_START (QM_DOORBELL_START + QM_DOORBELL_SIZE) +#define QM_DUS_START (QM_DKO_START + QM_DKO_SIZE) +#define QM_SS_START (QM_DUS_START + QM_DUS_SIZE) + +#endif diff --git a/samples/warpdrive/drv/wd_drv.h b/samples/warpdrive/drv/wd_drv.h new file mode 100644 index 000000000000..66fa8d889e70 --- /dev/null +++ b/samples/warpdrive/drv/wd_drv.h @@ -0,0 +1,19 @@ +#ifndef __WD_DRV_H +#define __WD_DRV_H + +#include "wd.h" + +#ifndef PAGE_SHIFT +#define PAGE_SHIFT 12 +#endif + +#ifndef 
PAGE_SIZE
+#define PAGE_SIZE (1 << PAGE_SHIFT)
+#endif
+
+static inline void *wd_drv_mmap(struct wd_queue *q, size_t size, size_t off)
+{
+	return mmap(0, size, PROT_READ | PROT_WRITE, MAP_SHARED, q->fd, off);
+}
+
+#endif
diff --git a/samples/warpdrive/test/Makefile.am b/samples/warpdrive/test/Makefile.am
new file mode 100644
index 000000000000..ad80e80a47d7
--- /dev/null
+++ b/samples/warpdrive/test/Makefile.am
@@ -0,0 +1,7 @@
+AM_CFLAGS=-Wall -O0 -fno-strict-aliasing
+
+bin_PROGRAMS=test_hisi_zip
+
+test_hisi_zip_SOURCES=test_hisi_zip.c
+
+test_hisi_zip_LDADD=../.libs/libwd.a
diff --git a/samples/warpdrive/test/test_hisi_zip.c b/samples/warpdrive/test/test_hisi_zip.c
new file mode 100644
index 000000000000..5e636482d318
--- /dev/null
+++ b/samples/warpdrive/test/test_hisi_zip.c
@@ -0,0 +1,150 @@
+// SPDX-License-Identifier: GPL-2.0+
+#include
+#include
+#include
+#include
+#include
+#include "../wd.h"
+#include "../drv/hisi_qm_udrv.h"
+
+#if defined(MSDOS) || defined(OS2) || defined(WIN32) || defined(__CYGWIN__)
+# include
+# include
+# define SET_BINARY_MODE(file) setmode(fileno(file), O_BINARY)
+#else
+# define SET_BINARY_MODE(file)
+#endif
+
+#define SYS_ERR_COND(cond, msg) \
+do { \
+	if (cond) { \
+		perror(msg); \
+		exit(EXIT_FAILURE); \
+	} \
+} while (0)
+
+#define ZLIB 0
+#define GZIP 1
+
+int hizip_deflate(FILE *source, FILE *dest, int type)
+{
+	struct hisi_qm_msg *msg, *recv_msg;
+	struct wd_queue q;
+	__u64 in, out;
+	void *a;
+	char *src, *dst;
+	int ret, total_len, output_num, fd;
+	size_t sz;
+
+	q.dev_path = "/dev/ua1";
+	strncpy(q.hw_type, "hisi_qm_v1", PATH_STR_SIZE);
+	ret = wd_request_queue(&q);
+	SYS_ERR_COND(ret, "wd_request_queue");
+
+	fd = fileno(source);
+	struct stat s;
+
+	if (fstat(fd, &s) < 0)
+		SYS_ERR_COND(-1, "fstat");
+	total_len = s.st_size;
+
+	SYS_ERR_COND(!total_len, "input file length zero");
+
+	SYS_ERR_COND(total_len > 16 * 1024 * 1024,
+		     "total_len > 16MB");
+
+	a = wd_reserve_memory(&q, total_len * 2);
+	SYS_ERR_COND(!a, "wd_reserve_memory");
+
+	fprintf(stderr, "a=%lx\n", (unsigned long)a);
+	memset(a, 0, total_len * 2);
+
+	src = (char *)a;
+	dst = (char *)a + total_len;
+
+	sz = fread(src, 1, total_len, source);
+	SYS_ERR_COND(sz != total_len, "read fail");
+
+	msg = malloc(sizeof(*msg));
+	SYS_ERR_COND(!msg, "alloc msg");
+	memset((void *)msg, 0, sizeof(*msg));
+	msg->input_date_length = total_len;
+	if (type == ZLIB)
+		msg->dw9 = 2;
+	else
+		msg->dw9 = 3;
+	msg->dest_avail_out = 0x800000;
+
+	in = (__u64)src;
+	out = (__u64)dst;
+
+	msg->source_addr_l = in & 0xffffffff;
+	msg->source_addr_h = in >> 32;
+	msg->dest_addr_l = out & 0xffffffff;
+	msg->dest_addr_h = out >> 32;
+
+	do {
+		ret = wd_send(&q, msg);
+		if (ret == -EBUSY)
+			usleep(1);
+	} while (ret == -EBUSY); /* queue full: retry the send, do not skip it */
+	SYS_ERR_COND(ret, "send");
+
+recv_again:
+	ret = wd_recv(&q, (void **)&recv_msg);
+	SYS_ERR_COND(ret == -EIO, "wd_recv");
+
+	if (ret == -EAGAIN)
+		goto recv_again;
+
+	output_num = recv_msg->produced;
+	/* add the zlib header, then write header + compressed data to a file */
+	char zip_head[2] = {0x78, 0x9c};
+
+	fwrite(zip_head, 1, 2, dest);
+	fwrite((char *)out, 1, output_num, dest);
+	fclose(dest);
+	free(msg);
+	wd_release_queue(&q);
+	return 0;
+}
+
+int main(int argc, char *argv[])
+{
+	int alg_type = 0;
+
+	/* avoid end-of-line conversions */
+	SET_BINARY_MODE(stdin);
+	SET_BINARY_MODE(stdout);
+
+	if (!argv[1]) {
+		fputs("<>\n", stderr);
+		goto EXIT;
+	}
+
+	if (!strcmp(argv[1], "-z"))
+		alg_type = ZLIB;
+	else if (!strcmp(argv[1], "-g")) {
+		alg_type = GZIP;
+	} else if (!strcmp(argv[1], "-h")) {
+		fputs("[version]:1.0.2\n", stderr);
+		fputs("[usage]: ./test_hisi_zip [type] dest_file\n",
+		      stderr);
+		fputs(" [type]:\n", stderr);
+		fputs(" -z = zlib\n", stderr);
+		fputs(" -g = gzip\n", stderr);
+		fputs(" -h = usage\n", stderr);
+		fputs("Example:\n", stderr);
+		fputs("./test_hisi_zip -z < test.data > out.data\n", stderr);
+		goto EXIT;
+	} else {
+		fputs("Unknown option\n", stderr);
+		fputs("<>\n",
+		      stderr);
+		goto EXIT;
+	}
+
+	hizip_deflate(stdin, stdout, alg_type);
+EXIT:
+	return EXIT_SUCCESS;
+}
diff --git a/samples/warpdrive/wd.c b/samples/warpdrive/wd.c
new file mode 100644
index 000000000000..559314a13e38
--- /dev/null
+++ b/samples/warpdrive/wd.c
@@ -0,0 +1,96 @@
+// SPDX-License-Identifier: GPL-2.0
+#include "config.h"
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include "wd.h"
+#include "wd_adapter.h"
+
+int wd_request_queue(struct wd_queue *q)
+{
+	int ret;
+
+	q->fd = open(q->dev_path, O_RDWR | O_CLOEXEC);
+	if (q->fd == -1)
+		return -ENODEV;
+
+	ret = drv_open(q);
+	if (ret)
+		goto err_with_fd;
+
+	return 0;
+
+err_with_fd:
+	close(q->fd);
+	return ret;
+}
+
+void wd_release_queue(struct wd_queue *q)
+{
+	drv_close(q);
+	close(q->fd);
+}
+
+int wd_send(struct wd_queue *q, void *req)
+{
+	return drv_send(q, req);
+}
+
+int wd_recv(struct wd_queue *q, void **resp)
+{
+	return drv_recv(q, resp);
+}
+
+static int wd_wait(struct wd_queue *q, __u16 ms)
+{
+	struct pollfd fds[1];
+	int ret;
+
+	fds[0].fd = q->fd;
+	fds[0].events = POLLIN;
+	ret = poll(fds, 1, ms);
+	if (ret == -1)
+		return -errno;
+
+	return 0;
+}
+
+int wd_recv_sync(struct wd_queue *q, void **resp, __u16 ms)
+{
+	int ret;
+
+	while (1) {
+		ret = wd_recv(q, resp);
+		if (ret == -EBUSY) {
+			ret = wd_wait(q, ms);
+			if (ret)
+				return ret;
+		} else
+			return ret;
+	}
+}
+
+void wd_flush(struct wd_queue *q)
+{
+	drv_flush(q);
+}
+
+void *wd_reserve_memory(struct wd_queue *q, size_t size)
+{
+	return drv_reserve_mem(q, size);
+}
+
+int wd_share_reserved_memory(struct wd_queue *q, struct wd_queue *target_q)
+{
+	return ioctl(q->fd, UACCE_CMD_SHARE_SVAS, target_q->fd);
+}
diff --git a/samples/warpdrive/wd.h b/samples/warpdrive/wd.h
new file mode 100644
index 000000000000..4c0ecfebdf14
--- /dev/null
+++ b/samples/warpdrive/wd.h
@@ -0,0 +1,97 @@
+// SPDX-License-Identifier: GPL-2.0
+#ifndef __WD_H
+#define
__WD_H
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include "../../include/uapi/linux/uacce.h"
+
+#define SYS_VAL_SIZE 16
+#define PATH_STR_SIZE 256
+#define WD_NAME_SIZE 64
+
+typedef int bool;
+
+#ifndef true
+#define true 1
+#endif
+
+#ifndef false
+#define false 0
+#endif
+
+#ifndef WD_ERR
+#define WD_ERR(format, args...) fprintf(stderr, format, ##args)
+#endif
+
+#if defined(__AARCH64_CMODEL_SMALL__) && __AARCH64_CMODEL_SMALL__
+
+#define dsb(opt) asm volatile("dsb " #opt : : : "memory")
+#define rmb() dsb(ld)
+#define wmb() dsb(st)
+#define mb() dsb(sy)
+
+#else
+
+#define rmb()
+#define wmb()
+#define mb()
+#error "no platform mb, define one before compiling"
+
+#endif
+
+static inline void wd_reg_write(void *reg_addr, uint32_t value)
+{
+	*((volatile uint32_t *)reg_addr) = value;
+	wmb();
+}
+
+static inline uint32_t wd_reg_read(void *reg_addr)
+{
+	uint32_t temp;
+
+	temp = *((volatile uint32_t *)reg_addr);
+	rmb();
+
+	return temp;
+}
+
+#define WD_CAPA_PRIV_DATA_SIZE 64
+
+#define alloc_obj(objp) do { \
+	objp = malloc(sizeof(*objp)); \
+	memset(objp, 0, sizeof(*objp)); \
+} while (0)
+
+#define free_obj(objp) do { \
+	if (objp) \
+		free(objp); \
+} while (0)
+
+struct wd_queue {
+	char hw_type[PATH_STR_SIZE];
+	int hw_type_id;
+	void *priv; /* private data used by the drv layer */
+	int fd;
+	int iommu_type;
+	char *dev_path;
+};
+
+extern int wd_request_queue(struct wd_queue *q);
+extern void wd_release_queue(struct wd_queue *q);
+extern int wd_send(struct wd_queue *q, void *req);
+extern int wd_recv(struct wd_queue *q, void **resp);
+extern void wd_flush(struct wd_queue *q);
+extern int wd_recv_sync(struct wd_queue *q, void **resp, __u16 ms);
+extern void *wd_reserve_memory(struct wd_queue *q, size_t size);
+extern int wd_share_reserved_memory(struct wd_queue *q,
+				    struct wd_queue *target_q);
+
+#endif
diff --git a/samples/warpdrive/wd_adapter.c b/samples/warpdrive/wd_adapter.c
new file mode
100644
index 000000000000..5af7254c37a4
--- /dev/null
+++ b/samples/warpdrive/wd_adapter.c
@@ -0,0 +1,71 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#include
+#include
+#include
+
+
+#include "wd_adapter.h"
+#include "./drv/hisi_qm_udrv.h"
+#include "./drv/wd_drv.h"
+
+static struct wd_drv_dio_if hw_dio_tbl[] = { {
+		.hw_type = "hisi_qm_v1",
+		.ss_offset = QM_SS_START,
+		.open = hisi_qm_set_queue_dio,
+		.close = hisi_qm_unset_queue_dio,
+		.send = hisi_qm_add_to_dio_q,
+		.recv = hisi_qm_get_from_dio_q,
+	},
+	/* Add other drivers direct IO operations here */
+};
+
+/* todo: there should be some stable way to match the device and the driver */
+#define MAX_HW_TYPE (sizeof(hw_dio_tbl) / sizeof(hw_dio_tbl[0]))
+
+int drv_open(struct wd_queue *q)
+{
+	int i;
+
+	//todo: try to find another dev if the user driver is not available
+	for (i = 0; i < MAX_HW_TYPE; i++) {
+		if (!strcmp(q->hw_type,
+			    hw_dio_tbl[i].hw_type)) {
+			q->hw_type_id = i;
+			return hw_dio_tbl[q->hw_type_id].open(q);
+		}
+	}
+	WD_ERR("No matching driver to use!\n");
+	errno = ENODEV;
+	return -ENODEV;
+}
+
+void drv_close(struct wd_queue *q)
+{
+	hw_dio_tbl[q->hw_type_id].close(q);
+}
+
+int drv_send(struct wd_queue *q, void *req)
+{
+	return hw_dio_tbl[q->hw_type_id].send(q, req);
+}
+
+int drv_recv(struct wd_queue *q, void **req)
+{
+	return hw_dio_tbl[q->hw_type_id].recv(q, req);
+}
+
+void drv_flush(struct wd_queue *q)
+{
+	if (hw_dio_tbl[q->hw_type_id].flush)
+		hw_dio_tbl[q->hw_type_id].flush(q);
+}
+
+void *drv_reserve_mem(struct wd_queue *q, size_t size)
+{
+	void *mem = wd_drv_mmap(q, size, hw_dio_tbl[q->hw_type_id].ss_offset);
+
+	if (mem == MAP_FAILED)
+		return NULL;
+
+	return mem;
+}
diff --git a/samples/warpdrive/wd_adapter.h b/samples/warpdrive/wd_adapter.h
new file mode 100644
index 000000000000..914cba86198c
--- /dev/null
+++ b/samples/warpdrive/wd_adapter.h
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* The common driver header defines the unified interface for wd */
+#ifndef
__WD_ADAPTER_H__
+#define __WD_ADAPTER_H__
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+
+#include "wd.h"
+
+struct wd_drv_dio_if {
+	char *hw_type;
+	size_t ss_offset;
+	int (*open)(struct wd_queue *q);
+	void (*close)(struct wd_queue *q);
+	int (*send)(struct wd_queue *q, void *req);
+	int (*recv)(struct wd_queue *q, void **req);
+	void (*flush)(struct wd_queue *q);
+};
+
+extern int drv_open(struct wd_queue *q);
+extern void drv_close(struct wd_queue *q);
+extern int drv_send(struct wd_queue *q, void *req);
+extern int drv_recv(struct wd_queue *q, void **req);
+extern void drv_flush(struct wd_queue *q);
+extern void *drv_reserve_mem(struct wd_queue *q, size_t size);
+
+#endif
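
Note for readers of the adapter layer: the table-driven dispatch used by wd_adapter.c can be looked at in isolation. The sketch below is not part of the patch. struct sketch_queue, struct sketch_drv_if, the "mock_v1" driver and every sketch_* and mock_* name are hypothetical stand-ins, assumed only for illustration. It follows the same idea as drv_open() and drv_send(): match the queue's hw_type string against a table once, remember the index, and route later calls through that index.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical, trimmed-down stand-ins for struct wd_queue and
 * struct wd_drv_dio_if. These are NOT the definitions from wd.h.
 */
struct sketch_queue {
	const char *hw_type;	/* name used to pick a driver */
	int hw_type_id;		/* table index, set by sketch_open() */
	int opened;		/* set by the mock driver's open hook */
};

struct sketch_drv_if {
	const char *hw_type;
	int (*open)(struct sketch_queue *q);
	int (*send)(struct sketch_queue *q, void *req);
};

/* Mock driver hooks, assumed for illustration only. */
static int mock_open(struct sketch_queue *q)
{
	q->opened = 1;
	return 0;
}

static int mock_send(struct sketch_queue *q, void *req)
{
	return (q->opened && req) ? 0 : -EINVAL;
}

static const struct sketch_drv_if sketch_tbl[] = {
	{ .hw_type = "mock_v1", .open = mock_open, .send = mock_send },
};

#define SKETCH_NR_DRV (sizeof(sketch_tbl) / sizeof(sketch_tbl[0]))

/* Match q->hw_type against the table, remember the index, call open. */
int sketch_open(struct sketch_queue *q)
{
	size_t i;

	assert(q && q->hw_type);
	for (i = 0; i < SKETCH_NR_DRV; i++) {
		if (!strcmp(q->hw_type, sketch_tbl[i].hw_type)) {
			q->hw_type_id = (int)i;
			return sketch_tbl[i].open(q);
		}
	}
	return -ENODEV;
}

/* Later calls go straight through the remembered index. */
int sketch_send(struct sketch_queue *q, void *req)
{
	return sketch_tbl[q->hw_type_id].send(q, req);
}
```

A caller would fill in hw_type, call sketch_open() once, and then use sketch_send(). An hw_type with no table entry yields -ENODEV, mirroring what drv_open() returns when no driver matches.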