From patchwork Fri Jul 1 11:08:10 2022
X-Patchwork-Submitter: Abhishek Sahu
X-Patchwork-Id: 586447
From: Abhishek Sahu
To: Alex Williamson, Cornelia Huck, Yishai Hadas, Jason Gunthorpe,
 Shameer Kolothum, Kevin Tian, "Rafael J. Wysocki"
CC: Max Gurtovoy, Bjorn Helgaas, Abhishek Sahu
Subject: [PATCH v4 2/6] vfio: Add a new device feature for the power management
Date: Fri, 1 Jul 2022 16:38:10 +0530
Message-ID: <20220701110814.7310-3-abhsahu@nvidia.com>
In-Reply-To: <20220701110814.7310-1-abhsahu@nvidia.com>
References: <20220701110814.7310-1-abhsahu@nvidia.com>
X-Mailing-List: linux-pm@vger.kernel.org

This patch adds the new device feature
VFIO_DEVICE_FEATURE_POWER_MANAGEMENT for power management to the
header file. The implementation will be added in subsequent patches.

Not all power states can be reached through the standard registers
alone. Platform-based power management has to be involved to reach the
lowest power state. This device feature can be used for all
platform-based power management.

The feature uses flags to specify the different operations. If more
power management functionality is needed in the future, a new flag can
be added to it. It supports both GET and SET operations.
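
The flag rules that the header comment in this patch spells out (ENTER
and EXIT are mutually exclusive, REENTERY_DISABLE is only valid together
with ENTER) can be sketched as a small check. This is only an
illustration: pm_set_flags_valid() is a hypothetical helper that is not
part of the series, and rejecting unknown bits is an assumption rather
than something the patch states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as defined by this patch in include/uapi/linux/vfio.h. */
#define VFIO_PM_LOW_POWER_ENTER			(1u << 0)
#define VFIO_PM_LOW_POWER_EXIT			(1u << 1)
#define VFIO_PM_LOW_POWER_REENTERY_DISABLE	(1u << 2)

/*
 * Hypothetical helper (not in the patch): check the flags of a SET
 * request against the rules from the header comment.
 */
static bool pm_set_flags_valid(uint32_t flags)
{
	const uint32_t known = VFIO_PM_LOW_POWER_ENTER |
			       VFIO_PM_LOW_POWER_EXIT |
			       VFIO_PM_LOW_POWER_REENTERY_DISABLE;

	/* Assumption: bits that are not defined yet are rejected. */
	if (flags & ~known)
		return false;

	/* ENTER and EXIT are mutually exclusive. */
	if ((flags & VFIO_PM_LOW_POWER_ENTER) &&
	    (flags & VFIO_PM_LOW_POWER_EXIT))
		return false;

	/* REENTERY_DISABLE is only valid together with ENTER. */
	if ((flags & VFIO_PM_LOW_POWER_REENTERY_DISABLE) &&
	    !(flags & VFIO_PM_LOW_POWER_ENTER))
		return false;

	return true;
}
```

With this sketch, ENTER alone, EXIT alone and ENTER combined with
REENTERY_DISABLE pass, while ENTER combined with EXIT, or
REENTERY_DISABLE without ENTER, are rejected.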

Signed-off-by: Abhishek Sahu
---
 include/uapi/linux/vfio.h | 55 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)

diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 733a1cddde30..7e00de5c21ea 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -986,6 +986,61 @@ enum vfio_device_mig_state {
 	VFIO_DEVICE_STATE_RUNNING_P2P = 5,
 };
 
+/*
+ * Perform power management-related operations for the VFIO device.
+ *
+ * The low power feature uses platform-based power management to move the
+ * device into the low power state. This low power state is device-specific.
+ *
+ * This device feature uses flags to specify the different operations.
+ * It supports both the GET and SET operations.
+ *
+ * - VFIO_PM_LOW_POWER_ENTER flag moves the VFIO device into the low power
+ *   state with platform-based power management. This low power state will be
+ *   internal to the VFIO driver and the user will not come to know which power
+ *   state is chosen. Once the user has moved the VFIO device into the low
+ *   power state, then the user should not do any device access without moving
+ *   the device out of the low power state.
+ *
+ * - VFIO_PM_LOW_POWER_EXIT flag moves the VFIO device out of the low power
+ *   state. This flag should only be set if the user has previously put the
+ *   device into low power state with the VFIO_PM_LOW_POWER_ENTER flag.
+ *
+ * - VFIO_PM_LOW_POWER_ENTER and VFIO_PM_LOW_POWER_EXIT are mutually exclusive.
+ *
+ * - VFIO_PM_LOW_POWER_REENTERY_DISABLE flag is only valid with
+ *   VFIO_PM_LOW_POWER_ENTER. If there is any access for the VFIO device on
+ *   the host side, then the device will be moved out of the low power state
+ *   without the user's guest driver involvement. Some devices require the
+ *   user's guest driver involvement for each low-power entry. If this flag is
+ *   set, then the re-entry to the low power state will be disabled, and the
+ *   host kernel will not move the device again into the low power state.
+ *   The VFIO driver internally maintains a list of devices for which low
+ *   power re-entry is disabled by default and for those devices, the
+ *   re-entry will be disabled even if the user has not set this flag
+ *   explicitly.
+ *
+ * For the IOCTL call with VFIO_DEVICE_FEATURE_GET:
+ *
+ * - VFIO_PM_LOW_POWER_ENTER will be set if the user has put the device into
+ *   the low power state, otherwise, VFIO_PM_LOW_POWER_EXIT will be set.
+ *
+ * - If the device is in a normal power state currently, then
+ *   VFIO_PM_LOW_POWER_REENTERY_DISABLE will be set for the devices where low
+ *   power re-entry is disabled by default. If the device is in the low power
+ *   state currently, then VFIO_PM_LOW_POWER_REENTERY_DISABLE will be set
+ *   according to the current transition.
+ */
+struct vfio_device_feature_power_management {
+	__u32	flags;
+#define VFIO_PM_LOW_POWER_ENTER			(1 << 0)
+#define VFIO_PM_LOW_POWER_EXIT			(1 << 1)
+#define VFIO_PM_LOW_POWER_REENTERY_DISABLE	(1 << 2)
+	__u32	reserved;
+};
+
+#define VFIO_DEVICE_FEATURE_POWER_MANAGEMENT	3
+
 /* -------- API for Type1 VFIO IOMMU -------- */
 
 /**

From patchwork Fri Jul 1 11:08:11 2022
X-Patchwork-Submitter: Abhishek Sahu
X-Patchwork-Id: 586448
From: Abhishek Sahu
To: Alex Williamson, Cornelia Huck, Yishai Hadas, Jason Gunthorpe,
 Shameer Kolothum, Kevin Tian, "Rafael J. Wysocki"
CC: Max Gurtovoy, Bjorn Helgaas, Abhishek Sahu
Subject: [PATCH v4 3/6] vfio: Increment the runtime PM usage count during IOCTL call
Date: Fri, 1 Jul 2022 16:38:11 +0530
Message-ID: <20220701110814.7310-4-abhsahu@nvidia.com>
In-Reply-To: <20220701110814.7310-1-abhsahu@nvidia.com>
References: <20220701110814.7310-1-abhsahu@nvidia.com>
X-Mailing-List: linux-pm@vger.kernel.org

The vfio-pci based driver will have runtime power management support
where the user can put the device into the low power state, after which
PCI devices can go into the D3cold state. If the device is in the low
power state and the user issues any IOCTL, then the device should be
moved out of the low power state first. Once the IOCTL is serviced, the
device can go into the low power state again. The runtime PM framework
manages this with the help of a usage count.

One option was to add the runtime PM related APIs inside the vfio-pci
driver, but some IOCTLs (like VFIO_DEVICE_FEATURE) can follow a
different path and more IOCTLs can be added in the future.
Also, runtime PM will currently be added only for vfio-pci based variant
drivers, but the other VFIO based drivers can use the same mechanism in
the future. So, this patch adds the runtime PM related API calls in the
top-level IOCTL function itself.

For the VFIO drivers which do not have runtime power management support
currently, the runtime PM APIs won't be invoked. Only for vfio-pci based
drivers currently, the runtime PM APIs will be invoked to increment and
decrement the usage count.

Keeping the usage count incremented while servicing an IOCTL makes sure
that the user cannot put the device into the low power state while any
other IOCTL is being serviced in parallel. Consider the following
scenario:

 1. Some other IOCTL is called.
 2. The user has opened another device instance and called the power
    management IOCTL for the low power entry.
 3. The power management IOCTL moves the device into the low power
    state.
 4. The other IOCTL finishes.

If we don't keep the usage count incremented, then the device access
will happen between steps 3 and 4 while the device has already gone
into the low power state.

The runtime PM APIs should not be invoked for
VFIO_DEVICE_FEATURE_POWER_MANAGEMENT since this IOCTL itself performs
the runtime power management entry and exit for the VFIO device.

pm_runtime_resume_and_get() will be the first call, so its error should
not be propagated to user space directly. For example,
pm_runtime_resume_and_get() can return -EINVAL even in cases where the
user has passed the correct argument. So the
pm_runtime_resume_and_get() errors are masked behind -EIO.
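
The scenario above can be modelled in user space with a toy usage
counter. This is a sketch under stated assumptions, not the kernel
implementation: struct toy_dev and the toy_*() functions are
hypothetical stand-ins for the runtime PM core, and only the two ideas
from this commit message are modelled (a held usage count blocks low
power entry, and any resume failure is reported as -EIO).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a device under runtime PM; not a kernel type. */
struct toy_dev {
	int usage_count;	/* in-flight users of the device */
	bool suspended;		/* true while in the low power state */
	int resume_error;	/* injected resume failure, 0 = success */
};

/* Stand-in for pm_runtime_resume_and_get(): resume, then take a count. */
static int toy_resume_and_get(struct toy_dev *d)
{
	if (d->suspended) {
		if (d->resume_error)
			return d->resume_error;	/* count stays unchanged */
		d->suspended = false;
	}
	d->usage_count++;
	return 0;
}

/* Mirrors the wrapper in this patch: every failure is masked as -EIO. */
static int toy_ioctl_pm_get(struct toy_dev *d)
{
	return toy_resume_and_get(d) < 0 ? -EIO : 0;
}

/* Drop the count taken by toy_ioctl_pm_get(). */
static void toy_ioctl_pm_put(struct toy_dev *d)
{
	if (d->usage_count > 0)
		d->usage_count--;
}

/* Low power entry only succeeds when no IOCTL holds a usage count. */
static bool toy_try_suspend(struct toy_dev *d)
{
	if (d->usage_count > 0)
		return false;
	d->suspended = true;
	return true;
}
```

In this model, a toy_try_suspend() issued between toy_ioctl_pm_get()
and toy_ioctl_pm_put() fails, which corresponds to step 3 being held
off until step 4 has completed.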

Signed-off-by: Abhishek Sahu
---
 drivers/vfio/vfio.c | 82 ++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 74 insertions(+), 8 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 61e71c1154be..61a8d9f7629a 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -32,6 +32,7 @@
 #include
 #include
 #include
+#include
 #include "vfio.h"
 
 #define DRIVER_VERSION	"0.3"
@@ -1333,6 +1334,39 @@ static const struct file_operations vfio_group_fops = {
 	.release	= vfio_group_fops_release,
 };
 
+/*
+ * Wrapper around pm_runtime_resume_and_get().
+ * Return error code on failure or 0 on success.
+ */
+static inline int vfio_device_pm_runtime_get(struct vfio_device *device)
+{
+	struct device *dev = device->dev;
+
+	if (dev->driver && dev->driver->pm) {
+		int ret;
+
+		ret = pm_runtime_resume_and_get(dev);
+		if (ret < 0) {
+			dev_info_ratelimited(dev,
+				"vfio: runtime resume failed %d\n", ret);
+			return -EIO;
+		}
+	}
+
+	return 0;
+}
+
+/*
+ * Wrapper around pm_runtime_put().
+ */
+static inline void vfio_device_pm_runtime_put(struct vfio_device *device)
+{
+	struct device *dev = device->dev;
+
+	if (dev->driver && dev->driver->pm)
+		pm_runtime_put(dev);
+}
+
 /*
  * VFIO Device fd
  */
@@ -1607,6 +1641,8 @@ static int vfio_ioctl_device_feature(struct vfio_device *device,
 {
 	size_t minsz = offsetofend(struct vfio_device_feature, flags);
 	struct vfio_device_feature feature;
+	int ret = 0;
+	u16 feature_cmd;
 
 	if (copy_from_user(&feature, arg, minsz))
 		return -EFAULT;
@@ -1626,28 +1662,51 @@ static int vfio_ioctl_device_feature(struct vfio_device *device,
 	    (feature.flags & VFIO_DEVICE_FEATURE_GET))
 		return -EINVAL;
 
-	switch (feature.flags & VFIO_DEVICE_FEATURE_MASK) {
+	feature_cmd = feature.flags & VFIO_DEVICE_FEATURE_MASK;
+
+	/*
+	 * The VFIO_DEVICE_FEATURE_POWER_MANAGEMENT itself performs the runtime
+	 * power management entry and exit for the VFIO device, so the runtime
+	 * PM API's should not be called for this feature.
+ */ + if (feature_cmd != VFIO_DEVICE_FEATURE_POWER_MANAGEMENT) { + ret = vfio_device_pm_runtime_get(device); + if (ret) + return ret; + } + + switch (feature_cmd) { case VFIO_DEVICE_FEATURE_MIGRATION: - return vfio_ioctl_device_feature_migration( + ret = vfio_ioctl_device_feature_migration( device, feature.flags, arg->data, feature.argsz - minsz); + break; case VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE: - return vfio_ioctl_device_feature_mig_device_state( + ret = vfio_ioctl_device_feature_mig_device_state( device, feature.flags, arg->data, feature.argsz - minsz); + break; default: if (unlikely(!device->ops->device_feature)) - return -EINVAL; - return device->ops->device_feature(device, feature.flags, - arg->data, - feature.argsz - minsz); + ret = -EINVAL; + else + ret = device->ops->device_feature( + device, feature.flags, arg->data, + feature.argsz - minsz); + break; } + + if (feature_cmd != VFIO_DEVICE_FEATURE_POWER_MANAGEMENT) + vfio_device_pm_runtime_put(device); + + return ret; } static long vfio_device_fops_unl_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) { struct vfio_device *device = filep->private_data; + int ret; switch (cmd) { case VFIO_DEVICE_FEATURE: @@ -1655,7 +1714,14 @@ static long vfio_device_fops_unl_ioctl(struct file *filep, default: if (unlikely(!device->ops->ioctl)) return -EINVAL; - return device->ops->ioctl(device, cmd, arg); + + ret = vfio_device_pm_runtime_get(device); + if (ret) + return ret; + + ret = device->ops->ioctl(device, cmd, arg); + vfio_device_pm_runtime_put(device); + return ret; } } From patchwork Fri Jul 1 11:08:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Abhishek Sahu X-Patchwork-Id: 586446 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 98E97C433EF for ; 
Fri, 1 Jul 2022 11:09:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237218AbiGALJN (ORCPT ); Fri, 1 Jul 2022 07:09:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39196 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237041AbiGALJD (ORCPT ); Fri, 1 Jul 2022 07:09:03 -0400 Received: from NAM10-DM6-obe.outbound.protection.outlook.com (mail-dm6nam10on2080.outbound.protection.outlook.com [40.107.93.80]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9667013F18; Fri, 1 Jul 2022 04:09:00 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=UVZsEcQrD0VbQVO2AAF8Sall6Uss1qzNgArQL/X9+OkAMo8WiQv4DCutklIwwHQeCEan1phLuGcVg8OMTGSrJkPWflZ26ssqGdsidP8SLq9Qf+lbFp3ddKEKNQVYnheoGKe2CAUVva3protgpAwcLveWuPGQMiKlLWoCTH1t5ZULpfJQTHWbjluS7OGuhlPOjXEujWbbBV+fOX6gQJe9I5xx9NRrUaAkezciEMg/ItfCCTszD3IUvltWSEVy8D7QJfr/I/mK0/LiNVGzIfsOEnqgv3WMQ5B7OqTGBO/AEq2LyO4X2ENzr8zkuBFATwja4MBDJ8EhINYbCP++HjBqzA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=iCE5cFy902njAAaC+/8i+2QQAsppvJHr5YO5yZN4ihg=; b=AbH0H8NSXuOwweZPBAzjqxNYUTLLmAiMIJ9TY9UvVU24k2ywbSirdT1NlGXj9ITKIJGOP6mNuKnIqK1GkBVh6OIfrQd2ItBaXcIqgEup1EL2hZlu3gePA4Cd9LE6hwHJqd/4q4Yd7gTsiZ5lMEubOh1NybxPU45rR4MSOZBySWcilcKrtsRrpmmxdFqdjGZ/HKCggiJhKEaK6/HIiTt7aoyGwOViRZqaQk3KVx8J7mNiDbyY02NoLyADZ0TorfUGn6j7CXINjXNztZn3k0OaRdv087XFZO7SZ6oHFaTV2ElG+PvHNnyo9QF8L2l5g3a5cYYW4+diFie1qJpCdUKdrQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 12.22.5.238) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=nvidia.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=iCE5cFy902njAAaC+/8i+2QQAsppvJHr5YO5yZN4ihg=; b=HWGzhNNDQ0Uag4wQIuxb1yU7y1o1ZopUPIMc5mWQhxRjWBFPxaWt2vXqNaxvA0RZHBJq11lWuMbgVRVKsNSD5wfJTEMs5Fyxo/q8xiBj8Tkqd8RVmqdc5+UtUzUM94W0KEYXslNjdRHKMd+yt+D1U0zjbMPYLTdK3oFnelSYz6t4Cd/nAJr/7EJlFPcfLAgWOjszt4D5Pj13vncNGC1j0Sxfjedqlq7DbJA8pptgCmbqa1CxmNPJQZGZQtxvKZ8f1KmIAqRWz1NmmFTn/1+x9gdycGuRQOwxnQoxAFrp2YnVn8CqhH+YOnL3ztLdq+MRUc6URxADN/QHL6F8QcQK2g== Received: from MWHPR12CA0046.namprd12.prod.outlook.com (2603:10b6:301:2::32) by MN2PR12MB3440.namprd12.prod.outlook.com (2603:10b6:208:d0::26) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5395.14; Fri, 1 Jul 2022 11:08:58 +0000 Received: from CO1NAM11FT051.eop-nam11.prod.protection.outlook.com (2603:10b6:301:2:cafe::53) by MWHPR12CA0046.outlook.office365.com (2603:10b6:301:2::32) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5395.14 via Frontend Transport; Fri, 1 Jul 2022 11:08:58 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 12.22.5.238) smtp.mailfrom=nvidia.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 12.22.5.238 as permitted sender) receiver=protection.outlook.com; client-ip=12.22.5.238; helo=mail.nvidia.com; pr=C Received: from mail.nvidia.com (12.22.5.238) by CO1NAM11FT051.mail.protection.outlook.com (10.13.174.114) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.5395.17 via Frontend Transport; Fri, 1 Jul 2022 11:08:58 +0000 Received: from drhqmail202.nvidia.com (10.126.190.181) by DRHQMAIL105.nvidia.com (10.27.9.14) with Microsoft SMTP Server (TLS) id 15.0.1497.32; Fri, 1 Jul 2022 11:08:57 +0000 Received: from 
drhqmail202.nvidia.com (10.126.190.181) by drhqmail202.nvidia.com (10.126.190.181) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.986.26; Fri, 1 Jul 2022 04:08:56 -0700 Received: from nvidia-abhsahu-1.nvidia.com (10.127.8.12) by mail.nvidia.com (10.126.190.181) with Microsoft SMTP Server id 15.2.986.26 via Frontend Transport; Fri, 1 Jul 2022 04:08:50 -0700 From: Abhishek Sahu To: Alex Williamson , Cornelia Huck , Yishai Hadas , Jason Gunthorpe , Shameer Kolothum , Kevin Tian , "Rafael J . Wysocki" CC: Max Gurtovoy , Bjorn Helgaas , , , , , Abhishek Sahu Subject: [PATCH v4 6/6] vfio/pci: Add support for virtual PME Date: Fri, 1 Jul 2022 16:38:14 +0530 Message-ID: <20220701110814.7310-7-abhsahu@nvidia.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220701110814.7310-1-abhsahu@nvidia.com> References: <20220701110814.7310-1-abhsahu@nvidia.com> X-NVConfidentiality: public MIME-Version: 1.0 X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 88f1e34e-f7d5-4f8a-6422-08da5b521888 X-MS-TrafficTypeDiagnostic: MN2PR12MB3440:EE_ X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
75x93W5fMLZVOYwL8zqtTOsnk0ZFlsrTPdxIqoyPxDF+x/WEMohLtWf2phireV2fjWKTmuSe2lS+473EEZR/JvJD5kkHjBEaQPFATMAc3f/2NWPFdc6dn+EQ2xLZkX8NjtPxXxQNXJE6cIsmmT+5XnjqcGkR5u/Jthe2f11nF17CWPFDOFA6o0X/Pi/E74FfazCLiFSbPBgJKl/S/gBrAstUuYnWcMOnBcpqa88/O4NAjW3ucqqu5h/dHlct8v8sul1fCtP8JkKhtSLUduXOrW0nsRP3UEQpue4xtgwDMMJjp3ghwrJEMTgGLgnFhmabX2P/RPs6G2Qg5CsbPLTPlJB1K7VVCoqwRHPC51Faj8YrG2sZvUsdO86OZG1PL/hVDJPt34hEH12I48e0JwSpcLiAVpfqv+Olk7VuMQTW4bHEGhRVzaGxRw3mKZp87xx7okKU3/gFrXGCEYQ6fBpgqxuU/XaEPFxm1zyhOGvIQM9Kxwnn4NWDVwGLKNRYghIPifsatZ3RUxtP06J9+07IljDhFFMotCfYGX4b26OFuxhN4TT3UpW1Dyl51S/EISSlUeakYZemi7aWTGMIoYG1+zic4WXccUMqCJ+PkgGLczFehyoBMf3exyPcTp6QbxkJwt4dwuQHCk4lp1TYKdpdG549/PaCP+2OITq1j6avZmLatSwOi4zY2yye+lGp+CgGT5UPLUhHRfnCcIe9b+Gcs7rvFp88XEbLrWatXQeB4IFv3iDgM6kDFPT+g54XuWiYM42zC8l/xLN3ENEHWlTcDenlU96RSudld4GauHKpZoVIIGKkTkNZvj1/IaYr9q4OO8wkwmEevoiexLVTp2LDm0RqQ2P0aAj09SgGbr69Zy8= X-Forefront-Antispam-Report: CIP:12.22.5.238; CTRY:US; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:mail.nvidia.com; PTR:InfoNoRecords; CAT:NONE; SFS:(13230016)(4636009)(136003)(39860400002)(376002)(396003)(346002)(36840700001)(40470700004)(46966006)(40480700001)(7416002)(7696005)(30864003)(5660300002)(40460700003)(36860700001)(426003)(41300700001)(83380400001)(8936002)(478600001)(6666004)(2906002)(186003)(47076005)(336012)(82310400005)(316002)(82740400003)(2616005)(86362001)(54906003)(107886003)(356005)(4326008)(36756003)(110136005)(81166007)(26005)(1076003)(70206006)(8676002)(70586007)(36900700001); DIR:OUT; SFP:1101; X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 01 Jul 2022 11:08:58.1977 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 88f1e34e-f7d5-4f8a-6422-08da5b521888 X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a; Ip=[12.22.5.238]; Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: 
CO1NAM11FT051.eop-nam11.prod.protection.outlook.com
X-MS-Exchange-CrossTenant-AuthAs: Anonymous
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: MN2PR12MB3440
Precedence: bulk
List-ID:
X-Mailing-List: linux-pm@vger.kernel.org

If a PCI device is in a low power state and requires wake-up, it can
generate a PME (Power Management Event). In most cases the PME is
propagated to the root port, which then generates a system interrupt.
The OS then identifies the device that generated the PME and resumes
it.

We can implement a similar virtual PME framework: if the device has
already entered the runtime suspended state and a wake-up then occurs
on the host side, the host sends a virtual PME notification to the
guest. This virtual PME is helpful for the cases where the device will
not be suspended again after a wake-up triggered by the host.

The overall approach for the virtual PME is as follows:

1. Add one more event, similar to VFIO_PCI_ERR_IRQ_INDEX, named
   VFIO_PCI_PME_IRQ_INDEX and make the code changes required to
   get/set this new IRQ.

2. The guest needs to enable an eventfd for the virtual PME
   notification.

3. In the vfio-pci driver, the PME support bits are currently
   virtualized and set to 0. We can set PME capability support for all
   the power states. This PME capability support is independent of the
   physical PME support.

4. The PME enable (PME_En bit in the Power Management Control/Status
   Register) and PME status (PME_Status bit in the Power Management
   Control/Status Register) are also currently virtualized. Write
   support for the PME_En bit can be enabled.

5. The PME_Status bit is a write-1-clear bit: a write of 0 has no
   effect and a write of 1 clears the bit. Writes to this bit are
   trapped inside vfio_pm_config_write(), similar to the PCI_PM_CTRL
   write for PM_STATES.

6. When the host gets a request to resume the device from anywhere
   other than the low power exit feature IOCTL, the PME_Status bit is
   set. According to [PCIe v5 7.5.2.2],
   "PME_Status - This bit is Set when the Function would normally
    generate a PME signal. The value of this bit is not affected by
    the value of the PME_En bit."

   So even if the PME_En bit is not set, we can set the PME_Status bit.

7. If the guest has enabled PME_En and registered for PME events
   through eventfd, the usage count is incremented to prevent the
   device from going into the suspended state, and the guest is
   notified through the eventfd trigger.

The virtual PME can also help in handling a physical PME. When a
physical PME arrives, the runtime resume is called as well. If the
guest has registered for the virtual PME, it is sent in this case too.

* Implementation for handling the virtual PME on the hypervisor:

Taking the Linux implementation as an example, the kernel calls
__pci_enable_wake() during runtime suspend. This internally enables
PME through pci_pme_active() and also enables the ACPI-side wake-up
through platform_pci_set_wakeup(). To handle the PME, the hypervisor
has the following two options:

1. Create a virtual root port for the VFIO device and trigger an
   interrupt when the PME arrives. This calls pcie_pme_irq(), which
   resumes the device.

2. Create a virtual ACPI _PRW resource and associate it with the device
   itself. In _PRW, any GPE (General Purpose Event) can be assigned
   for the wake-up. When the PME arrives, the hypervisor can trigger
   the GPE. The GPE interrupt internally calls pci_acpi_wake_dev(),
   which resumes the device.
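
As a rough illustration of steps 5 to 7, the following standalone C
sketch models the virtual PME_Status/PME_En behaviour in userspace. It
is not the driver code: struct vpme_model and the model_* functions are
hypothetical names made up for this example, and the two MODEL_* bit
values are assumed to mirror PCI_PM_CTRL_PME_ENABLE and
PCI_PM_CTRL_PME_STATUS. The re-entry check done by the real resume path
is left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror PCI_PM_CTRL_PME_ENABLE and PCI_PM_CTRL_PME_STATUS. */
#define MODEL_PME_ENABLE 0x0100u
#define MODEL_PME_STATUS 0x8000u

/* Hypothetical stand-in for the virtualized register and eventfd state. */
struct vpme_model {
	uint16_t ctrl;			/* virtual PM control/status register */
	bool eventfd_registered;	/* guest registered a PME eventfd */
	unsigned int signals;		/* notifications "sent" to the guest */
};

/* Guest write of the full 16-bit register. */
static void model_guest_write_ctrl(struct vpme_model *m, uint16_t val)
{
	assert(m != NULL);

	/* PME_Status is write-1-clear: 1 clears it, 0 leaves it alone. */
	if (val & MODEL_PME_STATUS)
		m->ctrl &= (uint16_t)~MODEL_PME_STATUS;

	/* PME_En is an ordinary writable bit. */
	m->ctrl = (uint16_t)((m->ctrl & ~MODEL_PME_ENABLE) |
			     (val & MODEL_PME_ENABLE));
}

/*
 * Host-side wake-up. PME_Status is always set; the guest is notified only
 * when PME_En is set and an eventfd is registered. Returns true when a
 * notification was sent, i.e. when the device should be kept resumed.
 */
static bool model_host_wakeup(struct vpme_model *m)
{
	assert(m != NULL);

	m->ctrl |= MODEL_PME_STATUS;
	if ((m->ctrl & MODEL_PME_ENABLE) && m->eventfd_registered) {
		m->signals++;
		return true;
	}
	return false;
}
```

In this model, a wake-up with PME_En clear still latches PME_Status,
which matches the spec text quoted in step 6, and the guest has to write
1 to the bit to clear it again.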
Signed-off-by: Abhishek Sahu
---
 drivers/vfio/pci/vfio_pci_config.c | 39 +++++++++++++++++++++------
 drivers/vfio/pci/vfio_pci_core.c   | 43 ++++++++++++++++++++++++------
 drivers/vfio/pci/vfio_pci_intrs.c  | 18 +++++++++++++
 include/linux/vfio_pci_core.h      |  2 ++
 include/uapi/linux/vfio.h          |  1 +
 5 files changed, 87 insertions(+), 16 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
index 21a4743d011f..a06375a03758 100644
--- a/drivers/vfio/pci/vfio_pci_config.c
+++ b/drivers/vfio/pci/vfio_pci_config.c
@@ -719,6 +719,20 @@ static int vfio_pm_config_write(struct vfio_pci_core_device *vdev, int pos,
 	if (count < 0)
 		return count;
 
+	/*
+	 * PME_STATUS is a write-1-clear bit. If PME_STATUS is written as 1,
+	 * then clear the bit in vconfig. PME_STATUS is in the upper byte of
+	 * the control register and the user can also do a single byte write.
+	 */
+	if (offset <= PCI_PM_CTRL + 1 && offset + count > PCI_PM_CTRL + 1) {
+		if (le32_to_cpu(val) &
+		    (PCI_PM_CTRL_PME_STATUS >> (offset - PCI_PM_CTRL) * 8)) {
+			__le16 *ctrl = (__le16 *)&vdev->vconfig
+					[vdev->pm_cap_offset + PCI_PM_CTRL];
+			*ctrl &= ~cpu_to_le16(PCI_PM_CTRL_PME_STATUS);
+		}
+	}
+
 	if (offset == PCI_PM_CTRL) {
 		pci_power_t state;
 
@@ -771,14 +785,16 @@ static int __init init_pci_cap_pm_perm(struct perm_bits *perm)
 	 * the user change power state, but we trap and initiate the
 	 * change ourselves, so the state bits are read-only.
 	 *
-	 * The guest can't process PME from D3cold so virtualize PME_Status
-	 * and PME_En bits. The vconfig bits will be cleared during device
-	 * capability initialization.
+	 * The guest can't process physical PME from D3cold so virtualize
+	 * PME_Status and PME_En bits. These bits will be used for the
+	 * virtual PME between host and guest. The vconfig bits will be
+	 * updated during device capability initialization. PME_Status is
+	 * a write-1-clear bit, so it is read-only. We trap and update the
+	 * vconfig bit manually during write.
 	 */
 	p_setd(perm, PCI_PM_CTRL,
 	       PCI_PM_CTRL_PME_ENABLE | PCI_PM_CTRL_PME_STATUS,
-	       ~(PCI_PM_CTRL_PME_ENABLE | PCI_PM_CTRL_PME_STATUS |
-		 PCI_PM_CTRL_STATE_MASK));
+	       ~(PCI_PM_CTRL_STATE_MASK | PCI_PM_CTRL_PME_STATUS));
 
 	return 0;
 }
@@ -1454,8 +1470,13 @@ static void vfio_update_pm_vconfig_bytes(struct vfio_pci_core_device *vdev,
 	__le16 *pmc = (__le16 *)&vdev->vconfig[offset + PCI_PM_PMC];
 	__le16 *ctrl = (__le16 *)&vdev->vconfig[offset + PCI_PM_CTRL];
 
-	/* Clear vconfig PME_Support, PME_Status, and PME_En bits */
-	*pmc &= ~cpu_to_le16(PCI_PM_CAP_PME_MASK);
+	/*
+	 * Set the vconfig PME_Support bits. The PME_Status is being used for
+	 * virtual PME support and is not dependent upon the physical
+	 * PME support.
+	 */
+	*pmc |= cpu_to_le16(PCI_PM_CAP_PME_MASK);
+	/* Clear vconfig PME_Status and PME_En bits */
 	*ctrl &= ~cpu_to_le16(PCI_PM_CTRL_PME_ENABLE | PCI_PM_CTRL_PME_STATUS);
 }
 
@@ -1582,8 +1603,10 @@ static int vfio_cap_init(struct vfio_pci_core_device *vdev)
 		if (ret)
 			return ret;
 
-		if (cap == PCI_CAP_ID_PM)
+		if (cap == PCI_CAP_ID_PM) {
+			vdev->pm_cap_offset = pos;
 			vfio_update_pm_vconfig_bytes(vdev, pos);
+		}
 
 		prev = &vdev->vconfig[pos + PCI_CAP_LIST_NEXT];
 		pos = next;
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 1ddaaa6ccef5..6c1225bc2aeb 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -319,14 +319,35 @@ static int vfio_pci_core_runtime_resume(struct device *dev)
 	 *   the low power state or closed the device.
 	 * - If there is device access on the host side.
 	 *
-	 * For the second case, check if re-entry to the low power state is
-	 * allowed. If not, then increment the usage count so that runtime PM
-	 * framework won't suspend the device and set the 'pm_runtime_resumed'
-	 * flag.
+	 * For the second case:
+	 * - The virtual PME_STATUS bit will be set. If PME_ENABLE bit is set
+	 *   and user has registered for virtual PME events, then send the
+	 *   virtual PME event.
+	 * - Check if re-entry to the low power state is not allowed.
+	 *
+	 * For the above conditions, increment the usage count so that
+	 * runtime PM framework won't suspend the device and set the
+	 * 'pm_runtime_resumed' flag.
 	 */
-	if (vdev->pm_runtime_engaged && !vdev->pm_runtime_reentry_allowed) {
-		pm_runtime_get_noresume(dev);
-		vdev->pm_runtime_resumed = true;
+	if (vdev->pm_runtime_engaged) {
+		bool pme_triggered = false;
+		__le16 *ctrl = (__le16 *)&vdev->vconfig
+				[vdev->pm_cap_offset + PCI_PM_CTRL];
+
+		*ctrl |= cpu_to_le16(PCI_PM_CTRL_PME_STATUS);
+		if (le16_to_cpu(*ctrl) & PCI_PM_CTRL_PME_ENABLE) {
+			mutex_lock(&vdev->igate);
+			if (vdev->pme_trigger) {
+				pme_triggered = true;
+				eventfd_signal(vdev->pme_trigger, 1);
+			}
+			mutex_unlock(&vdev->igate);
+		}
+
+		if (!vdev->pm_runtime_reentry_allowed || pme_triggered) {
+			pm_runtime_get_noresume(dev);
+			vdev->pm_runtime_resumed = true;
+		}
 	}
 	up_write(&vdev->memory_lock);
 
@@ -586,6 +607,10 @@ void vfio_pci_core_close_device(struct vfio_device *core_vdev)
 		eventfd_ctx_put(vdev->req_trigger);
 		vdev->req_trigger = NULL;
 	}
+	if (vdev->pme_trigger) {
+		eventfd_ctx_put(vdev->pme_trigger);
+		vdev->pme_trigger = NULL;
+	}
 	mutex_unlock(&vdev->igate);
 }
 EXPORT_SYMBOL_GPL(vfio_pci_core_close_device);
@@ -639,7 +664,8 @@ static int vfio_pci_get_irq_count(struct vfio_pci_core_device *vdev, int irq_typ
 	} else if (irq_type == VFIO_PCI_ERR_IRQ_INDEX) {
 		if (pci_is_pcie(vdev->pdev))
 			return 1;
-	} else if (irq_type == VFIO_PCI_REQ_IRQ_INDEX) {
+	} else if (irq_type == VFIO_PCI_REQ_IRQ_INDEX ||
+		   irq_type == VFIO_PCI_PME_IRQ_INDEX) {
 		return 1;
 	}
 
@@ -985,6 +1011,7 @@ long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
 		switch (info.index) {
 		case VFIO_PCI_INTX_IRQ_INDEX ... VFIO_PCI_MSIX_IRQ_INDEX:
 		case VFIO_PCI_REQ_IRQ_INDEX:
+		case VFIO_PCI_PME_IRQ_INDEX:
 			break;
 		case VFIO_PCI_ERR_IRQ_INDEX:
 			if (pci_is_pcie(vdev->pdev))
diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 1a37db99df48..db4180687a74 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -639,6 +639,17 @@ static int vfio_pci_set_req_trigger(struct vfio_pci_core_device *vdev,
 					       count, flags, data);
 }
 
+static int vfio_pci_set_pme_trigger(struct vfio_pci_core_device *vdev,
+				    unsigned index, unsigned start,
+				    unsigned count, uint32_t flags, void *data)
+{
+	if (index != VFIO_PCI_PME_IRQ_INDEX || start != 0 || count > 1)
+		return -EINVAL;
+
+	return vfio_pci_set_ctx_trigger_single(&vdev->pme_trigger,
+					       count, flags, data);
+}
+
 int vfio_pci_set_irqs_ioctl(struct vfio_pci_core_device *vdev, uint32_t flags,
 			    unsigned index, unsigned start, unsigned count,
 			    void *data)
@@ -688,6 +699,13 @@ int vfio_pci_set_irqs_ioctl(struct vfio_pci_core_device *vdev, uint32_t flags,
 			break;
 		}
 		break;
+	case VFIO_PCI_PME_IRQ_INDEX:
+		switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) {
+		case VFIO_IRQ_SET_ACTION_TRIGGER:
+			func = vfio_pci_set_pme_trigger;
+			break;
+		}
+		break;
 	}
 
 	if (!func)
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 18cc83b767b8..ee2646d820c2 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -102,6 +102,7 @@ struct vfio_pci_core_device {
 	bool			bar_mmap_supported[PCI_STD_NUM_BARS];
 	u8			*pci_config_map;
 	u8			*vconfig;
+	u8			pm_cap_offset;
 	struct perm_bits	*msi_perm;
 	spinlock_t		irqlock;
 	struct mutex		igate;
@@ -133,6 +134,7 @@ struct vfio_pci_core_device {
 	int			ioeventfds_nr;
 	struct eventfd_ctx	*err_trigger;
 	struct eventfd_ctx	*req_trigger;
+	struct eventfd_ctx	*pme_trigger;
 	struct list_head	dummy_resources_list;
 	struct mutex		ioeventfds_lock;
 	struct list_head	ioeventfds_list;
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 7e00de5c21ea..08170950d655 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -621,6 +621,7 @@ enum {
 	VFIO_PCI_MSIX_IRQ_INDEX,
 	VFIO_PCI_ERR_IRQ_INDEX,
 	VFIO_PCI_REQ_IRQ_INDEX,
+	VFIO_PCI_PME_IRQ_INDEX,
 	VFIO_PCI_NUM_IRQS
 };
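
For reference, a userspace VMM could register for the new index roughly
as sketched below. This is a minimal sketch, not tested against a
kernel carrying this series. The sketch_* function names and
SKETCH_PME_IRQ_INDEX are hypothetical; the value 5 is an assumption
derived from the position of VFIO_PCI_PME_IRQ_INDEX in the enum above
and is only meaningful with this patch applied. The ioctl,
struct vfio_irq_set and the VFIO_IRQ_SET_* flags come from the existing
<linux/vfio.h> UAPI header.

```c
#include <linux/vfio.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>

/*
 * Assumption: value VFIO_PCI_PME_IRQ_INDEX would get from the enum in this
 * patch (INTX=0, MSI=1, MSIX=2, ERR=3, REQ=4, PME=5).
 */
#define SKETCH_PME_IRQ_INDEX 5u

/* Build a VFIO_DEVICE_SET_IRQS payload carrying one eventfd (hypothetical helper). */
static struct vfio_irq_set *sketch_build_pme_irq_set(int32_t efd)
{
	size_t argsz = sizeof(struct vfio_irq_set) + sizeof(efd);
	struct vfio_irq_set *set = calloc(1, argsz);

	if (!set)
		return NULL;

	set->argsz = (uint32_t)argsz;
	set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
	set->index = SKETCH_PME_IRQ_INDEX;
	set->start = 0;		/* the patch rejects start != 0 */
	set->count = 1;		/* the patch rejects count > 1 */
	memcpy(set->data, &efd, sizeof(efd));
	return set;
}

/* Register efd as the virtual PME trigger on an open VFIO device fd. */
static int sketch_register_pme_eventfd(int device_fd, int32_t efd)
{
	struct vfio_irq_set *set = sketch_build_pme_irq_set(efd);
	int ret;

	if (!set)
		return -1;

	ret = ioctl(device_fd, VFIO_DEVICE_SET_IRQS, set);
	free(set);
	return ret;
}
```

A caller would typically create efd with eventfd(0, EFD_CLOEXEC), pass
it to sketch_register_pme_eventfd() together with the device fd, and
then poll efd for the notification sent from the runtime resume path.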