From patchwork Tue Oct 1 19:33:15 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paolo Valente
X-Patchwork-Id: 174943
From: Paolo Valente
To: Jens Axboe
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 ulf.hansson@linaro.org, linus.walleij@linaro.org,
 bfq-iosched@googlegroups.com, oleksandr@natalenko.name, Tejun Heo,
 cgroups@vger.kernel.org, Paolo Valente
Subject: [PATCH 1/2] blkcg: Make bfq disable iocost when enabled
Date: Tue, 1 Oct 2019 20:33:15 +0100
Message-Id: <20191001193316.3330-2-paolo.valente@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20191001193316.3330-1-paolo.valente@linaro.org>
References: <20191001193316.3330-1-paolo.valente@linaro.org>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Tejun Heo

Both iocost and bfq implement weight based IO control. Currently, bfq
is using io.bfq prefix but wants to drop the bfq part. To avoid
interface conflict, make bfq disable iocost when it's selected as the
IO scheduler for any block device on the system. iocost is only
re-enabled when bfq is built as a module and unloaded.

Signed-off-by: Tejun Heo
Cc: Paolo Valente
---
 Documentation/admin-guide/cgroup-v2.rst |  8 ++++---
 block/bfq-cgroup.c                      |  2 ++
 block/bfq-iosched.c                     | 32 +++++++++++++++++++++++++
 block/blk-iocost.c                      |  5 ++--
 include/linux/blk-cgroup.h              |  5 ++++
 kernel/cgroup/cgroup.c                  |  2 ++
 6 files changed, 48 insertions(+), 6 deletions(-)

-- 
2.20.1

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 0fa8c0e615c2..8374957213f1 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1440,9 +1440,11 @@ IO
 
 The "io" controller regulates the distribution of IO resources. This
 controller implements both weight based and absolute bandwidth or IOPS
-limit distribution; however, weight based distribution is available
-only if cfq-iosched is in use and neither scheme is available for
-blk-mq devices.
+limit distribution. Weight based distribution is implemented by
+either iocost controller or bfq IO scheduler. When bfq is selected as
+the IO scheduler for any block device, iocost is disabled and bfq's
+implementation overrides for all devices. If bfq is built as a kernel
+module, unloading it re-enables iocost.
 
 
 IO Interface Files
diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index 86a607cf19a1..decda96770f4 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -1194,7 +1194,9 @@ struct bfq_group *bfq_create_group_hierarchy(struct bfq_data *bfqd, int node)
 }
 
 struct blkcg_policy blkcg_policy_bfq = {
+#ifndef CONFIG_BLK_CGROUP_IOCOST
 	.dfl_cftypes = bfq_blkg_files,
+#endif
 	.legacy_cftypes = bfq_blkcg_legacy_files,
 
 	.cpd_alloc_fn = bfq_cpd_alloc,
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 0319d6339822..21d1b08610b1 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -6382,6 +6382,36 @@ static void bfq_init_root_group(struct bfq_group *root_group,
 	root_group->sched_data.bfq_class_idle_last_service = jiffies;
 }
 
+#if defined(CONFIG_BFQ_GROUP_IOSCHED) && defined(CONFIG_BLK_CGROUP_IOCOST)
+static bool bfq_enabled = false;
+
+static void bfq_enable(void)
+{
+	static DEFINE_MUTEX(bfq_enable_mutex);
+
+	mutex_lock(&bfq_enable_mutex);
+	if (!bfq_enabled) {
+		pr_info("bfq-iosched: Overriding iocost\n");
+		blkcg_policy_unregister(&blkcg_policy_iocost);
+		cgroup_add_dfl_cftypes(&io_cgrp_subsys, bfq_blkg_files);
+		bfq_enabled = true;
+	}
+	mutex_unlock(&bfq_enable_mutex);
+}
+
+static void __exit bfq_disable(void)
+{
+	if (bfq_enabled) {
+		pr_info("bfq-iosched: Restoring iocost\n");
+		cgroup_rm_cftypes(bfq_blkg_files);
+		blkcg_policy_register(&blkcg_policy_iocost);
+	}
+}
+#else
+static void bfq_enable(void) {}
+static void __exit bfq_disable(void) {}
+#endif
+
 static int bfq_init_queue(struct request_queue *q, struct elevator_type *e)
 {
 	struct bfq_data *bfqd;
@@ -6506,6 +6536,7 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_type *e)
 	bfq_init_entity(&bfqd->oom_bfqq.entity, bfqd->root_group);
 
 	wbt_disable_default(q);
+	bfq_enable();
 	return 0;
 
 out_free:
@@ -6823,6 +6854,7 @@ static void __exit bfq_exit(void)
 	blkcg_policy_unregister(&blkcg_policy_bfq);
 #endif
 	bfq_slab_kill();
+	bfq_disable();
 }
 
 module_init(bfq_init);
diff --git a/block/blk-iocost.c b/block/blk-iocost.c
index 2a3db80c1dce..511bf80b6db3 100644
--- a/block/blk-iocost.c
+++ b/block/blk-iocost.c
@@ -605,8 +605,6 @@ static u32 vrate_adj_pct[] =
 	  2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
 	  4, 4, 4, 4, 4, 4, 4, 4, 8, 8, 8, 8, 8, 8, 8, 8, 16 };
 
-static struct blkcg_policy blkcg_policy_iocost;
-
 /* accessors and helpers */
 static struct ioc *rqos_to_ioc(struct rq_qos *rqos)
 {
@@ -2442,7 +2440,7 @@ static struct cftype ioc_files[] = {
 	{}
 };
 
-static struct blkcg_policy blkcg_policy_iocost = {
+struct blkcg_policy blkcg_policy_iocost = {
 	.dfl_cftypes = ioc_files,
 	.cpd_alloc_fn = ioc_cpd_alloc,
 	.cpd_free_fn = ioc_cpd_free,
@@ -2450,6 +2448,7 @@ static struct blkcg_policy blkcg_policy_iocost = {
 	.pd_init_fn = ioc_pd_init,
 	.pd_free_fn = ioc_pd_free,
 };
+EXPORT_SYMBOL_GPL(blkcg_policy_iocost);
 
 static int __init ioc_init(void)
 {
diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index bed9e43f9426..5669e3cfa1bc 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -815,6 +815,11 @@ static inline void blkcg_clear_delay(struct blkcg_gq *blkg)
 void blkcg_add_delay(struct blkcg_gq *blkg, u64 now, u64 delta);
 void blkcg_schedule_throttle(struct request_queue *q, bool use_memdelay);
 void blkcg_maybe_throttle_current(void);
+
+#ifdef CONFIG_BLK_CGROUP_IOCOST
+extern struct blkcg_policy blkcg_policy_iocost;
+#endif
+
 #else /* CONFIG_BLK_CGROUP */
 
 struct blkcg {
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 080561bb8a4b..9c9a674c12bd 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -4059,6 +4059,7 @@ int cgroup_rm_cftypes(struct cftype *cfts)
 	mutex_unlock(&cgroup_mutex);
 	return ret;
 }
+EXPORT_SYMBOL_GPL(cgroup_rm_cftypes);
 
 /**
  * cgroup_add_cftypes - add an array of cftypes to a subsystem
@@ -4115,6 +4116,7 @@ int cgroup_add_dfl_cftypes(struct cgroup_subsys *ss, struct cftype *cfts)
 		cft->flags |= __CFTYPE_ONLY_ON_DFL;
 	return cgroup_add_cftypes(ss, cfts);
 }
+EXPORT_SYMBOL_GPL(cgroup_add_dfl_cftypes);
 
 /**
  * cgroup_add_legacy_cftypes - add an array of cftypes for legacy hierarchies

From patchwork Tue Oct 1 19:33:16 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paolo Valente
X-Patchwork-Id: 174944
From: Paolo Valente
To: Jens Axboe
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 ulf.hansson@linaro.org, linus.walleij@linaro.org,
 bfq-iosched@googlegroups.com, oleksandr@natalenko.name, Tejun Heo,
 cgroups@vger.kernel.org, Paolo Valente
Subject: [PATCH 2/2] block, bfq: present a double cgroups interface
Date: Tue, 1 Oct 2019 20:33:16 +0100
Message-Id: <20191001193316.3330-3-paolo.valente@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20191001193316.3330-1-paolo.valente@linaro.org>
References: <20191001193316.3330-1-paolo.valente@linaro.org>
X-Mailing-List: linux-kernel@vger.kernel.org

When bfq was merged into mainline, there were two I/O schedulers that
implemented the proportional-share policy: bfq for blk-mq and cfq for
legacy blk. bfq's interface files in the blkio/io controller have the
same names as cfq.
But the cgroups interface doesn't allow two entities to use
the same name for their files, so for bfq we had to prepend the "bfq"
prefix to each of its files. However no legacy code uses these
modified file names. This naming also causes confusion, as, e.g., in
[1].

Now cfq has gone with legacy blk, so there is no need any longer for
these prefixes in (the never used) bfq names. Yet some people may have
started to use the current bfq interface. So, as suggested by Tejun
Heo [2], make bfq present a double interface, one with the file names
prepended with the "bfq" prefix, and the other one with no prefix.

[1] https://github.com/systemd/systemd/issues/7057
[2] https://lkml.org/lkml/2019/9/18/736

Suggested-by: Tejun Heo
Signed-off-by: Paolo Valente
---
 Documentation/block/bfq-iosched.rst |  40 +++--
 block/bfq-cgroup.c                  | 258 ++++++++++++++--------------
 2 files changed, 153 insertions(+), 145 deletions(-)

-- 
2.20.1

diff --git a/Documentation/block/bfq-iosched.rst b/Documentation/block/bfq-iosched.rst
index 0d237d402860..8ecd37903391 100644
--- a/Documentation/block/bfq-iosched.rst
+++ b/Documentation/block/bfq-iosched.rst
@@ -536,12 +536,14 @@ process.
 To get proportional sharing of bandwidth with BFQ for a given device,
 BFQ must of course be the active scheduler for that device.
 
-Within each group directory, the names of the files associated with
-BFQ-specific cgroup parameters and stats begin with the "bfq."
-prefix. So, with cgroups-v1 or cgroups-v2, the full prefix for
-BFQ-specific files is "blkio.bfq." or "io.bfq." For example, the group
-parameter to set the weight of a group with BFQ is blkio.bfq.weight
-or io.bfq.weight.
+The interface of the proportional-share policy implemented by BFQ
+consists of a series of cgroup parameters. For legacy issues, each
+parameter can be read or written, equivalently, through one of two
+files: the first file has the same name as the parameter to
+read/write, while the second file has that same name prepended by the
+prefix "bfq.". For example, the two files by which to set/show the
+weight of a group are blkio.weight and blkio.bfq.weight with
+cgroups-v1, or io.weight and io.bfq.weight with cgroups-v2.
 
 As for cgroups-v1 (blkio controller), the exact set of stat files
 created, and kept up-to-date by bfq, depends on whether
@@ -550,14 +552,15 @@ the stat files documented in
 Documentation/admin-guide/cgroup-v1/blkio-controller.rst. If, instead,
 CONFIG_BFQ_CGROUP_DEBUG is not set, then bfq creates only the files::
 
-  blkio.bfq.io_service_bytes
-  blkio.bfq.io_service_bytes_recursive
-  blkio.bfq.io_serviced
-  blkio.bfq.io_serviced_recursive
+  blkio.io_service_bytes
+  blkio.io_service_bytes_recursive
+  blkio.io_serviced
+  blkio.io_serviced_recursive
 
-The value of CONFIG_BFQ_CGROUP_DEBUG greatly influences the maximum
-throughput sustainable with bfq, because updating the blkio.bfq.*
-stats is rather costly, especially for some of the stats enabled by
+(plus their counterparts with also the bfq prefix). The value of
+CONFIG_BFQ_CGROUP_DEBUG greatly influences the maximum throughput
+sustainable with BFQ, because updating the blkio.* stats is rather
+costly, especially for some of the stats enabled by
 CONFIG_BFQ_CGROUP_DEBUG.
 
 Parameters to set
@@ -565,11 +568,12 @@ Parameters to set
 
 For each group, there is only the following parameter to set.
 
-weight (namely blkio.bfq.weight or io.bfq-weight): the weight of the
-group inside its parent. Available values: 1..10000 (default 100). The
-linear mapping between ioprio and weights, described at the beginning
-of the tunable section, is still valid, but all weights higher than
-IOPRIO_BE_NR*10 are mapped to ioprio 0.
+weight (namely blkio.weight/blkio.bfq.weight or
+io.weight/io.bfq.weight): the weight of the group inside its
+parent. Available values: 1..10000 (default 100). The linear mapping
+between ioprio and weights, described at the beginning of the tunable
+section, is still valid, but all weights higher than IOPRIO_BE_NR*10
+are mapped to ioprio 0.
 
 Recall that, if low-latency is set, then BFQ automatically raises the
 weight of the queues associated with interactive and soft real-time
diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index decda96770f4..d3b59b731992 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -1211,139 +1211,143 @@ struct blkcg_policy blkcg_policy_bfq = {
 	.pd_reset_stats_fn = bfq_pd_reset_stats,
 };
 
-struct cftype bfq_blkcg_legacy_files[] = {
-	{
-		.name = "bfq.weight",
-		.flags = CFTYPE_NOT_ON_ROOT,
-		.seq_show = bfq_io_show_weight_legacy,
-		.write_u64 = bfq_io_set_weight_legacy,
-	},
-	{
-		.name = "bfq.weight_device",
-		.flags = CFTYPE_NOT_ON_ROOT,
-		.seq_show = bfq_io_show_weight,
-		.write = bfq_io_set_weight,
-	},
-
-	/* statistics, covers only the tasks in the bfqg */
-	{
-		.name = "bfq.io_service_bytes",
-		.private = (unsigned long)&blkcg_policy_bfq,
-		.seq_show = blkg_print_stat_bytes,
-	},
-	{
-		.name = "bfq.io_serviced",
-		.private = (unsigned long)&blkcg_policy_bfq,
-		.seq_show = blkg_print_stat_ios,
-	},
-#ifdef CONFIG_BFQ_CGROUP_DEBUG
-	{
-		.name = "bfq.time",
-		.private = offsetof(struct bfq_group, stats.time),
-		.seq_show = bfqg_print_stat,
-	},
-	{
-		.name = "bfq.sectors",
-		.seq_show = bfqg_print_stat_sectors,
-	},
-	{
-		.name = "bfq.io_service_time",
-		.private = offsetof(struct bfq_group, stats.service_time),
-		.seq_show = bfqg_print_rwstat,
-	},
-	{
-		.name = "bfq.io_wait_time",
-		.private = offsetof(struct bfq_group, stats.wait_time),
-		.seq_show = bfqg_print_rwstat,
-	},
-	{
-		.name = "bfq.io_merged",
-		.private = offsetof(struct bfq_group, stats.merged),
-		.seq_show = bfqg_print_rwstat,
-	},
-	{
-		.name = "bfq.io_queued",
-		.private = offsetof(struct bfq_group, stats.queued),
-		.seq_show = bfqg_print_rwstat,
-	},
-#endif /* CONFIG_BFQ_CGROUP_DEBUG */
+#define bfq_make_blkcg_legacy_files(prefix) \
+	{ \
+		.name = #prefix "weight", \
+		.flags = CFTYPE_NOT_ON_ROOT, \
+		.seq_show = bfq_io_show_weight, \
+		.write_u64 = bfq_io_set_weight_legacy, \
+	}, \
+	\
+	/* statistics, covers only the tasks in the bfqg */ \
+	{ \
+		.name = #prefix "io_service_bytes", \
+		.private = (unsigned long)&blkcg_policy_bfq, \
+		.seq_show = blkg_print_stat_bytes, \
+	}, \
+	{ \
+		.name = #prefix "io_serviced", \
+		.private = (unsigned long)&blkcg_policy_bfq, \
+		.seq_show = blkg_print_stat_ios, \
+	}, \
+	\
+	/* the same statistics which cover the bfqg and its descendants */ \
+	{ \
+		.name = #prefix "io_service_bytes_recursive", \
+		.private = (unsigned long)&blkcg_policy_bfq, \
+		.seq_show = blkg_print_stat_bytes_recursive, \
+	}, \
+	{ \
+		.name = #prefix "io_serviced_recursive", \
+		.private = (unsigned long)&blkcg_policy_bfq, \
+		.seq_show = blkg_print_stat_ios_recursive, \
+	}
+
+#define bfq_make_blkcg_legacy_debug_files(prefix) \
+	{ \
+		.name = #prefix "time", \
+		.private = offsetof(struct bfq_group, stats.time), \
+		.seq_show = bfqg_print_stat, \
+	}, \
+	{ \
+		.name = #prefix "sectors", \
+		.seq_show = bfqg_print_stat_sectors, \
+	}, \
+	{ \
+		.name = #prefix "io_service_time", \
+		.private = offsetof(struct bfq_group, stats.service_time), \
+		.seq_show = bfqg_print_rwstat, \
+	}, \
+	{ \
+		.name = #prefix "io_wait_time", \
+		.private = offsetof(struct bfq_group, stats.wait_time), \
+		.seq_show = bfqg_print_rwstat, \
+	}, \
+	{ \
+		.name = #prefix "io_merged", \
+		.private = offsetof(struct bfq_group, stats.merged), \
+		.seq_show = bfqg_print_rwstat, \
+	}, \
+	{ \
+		.name = #prefix "io_queued", \
+		.private = offsetof(struct bfq_group, stats.queued), \
+		.seq_show = bfqg_print_rwstat, \
+	}, \
+	{ \
+		.name = #prefix "time_recursive", \
+		.private = offsetof(struct bfq_group, stats.time), \
+		.seq_show = bfqg_print_stat_recursive, \
+	}, \
+	{ \
+		.name = #prefix "sectors_recursive", \
+		.seq_show = bfqg_print_stat_sectors_recursive, \
+	}, \
+	{ \
+		.name = #prefix "io_service_time_recursive", \
+		.private = offsetof(struct bfq_group, stats.service_time), \
+		.seq_show = bfqg_print_rwstat_recursive, \
+	}, \
+	{ \
+		.name = #prefix "io_wait_time_recursive", \
+		.private = offsetof(struct bfq_group, stats.wait_time), \
+		.seq_show = bfqg_print_rwstat_recursive, \
+	}, \
+	{ \
+		.name = #prefix "io_merged_recursive", \
+		.private = offsetof(struct bfq_group, stats.merged), \
+		.seq_show = bfqg_print_rwstat_recursive, \
+	}, \
+	{ \
+		.name = #prefix "io_queued_recursive", \
+		.private = offsetof(struct bfq_group, stats.queued), \
+		.seq_show = bfqg_print_rwstat_recursive, \
+	}, \
+	{ \
+		.name = #prefix "avg_queue_size", \
+		.seq_show = bfqg_print_avg_queue_size, \
+	}, \
+	{ \
+		.name = #prefix "group_wait_time", \
+		.private = offsetof(struct bfq_group, stats.group_wait_time), \
+		.seq_show = bfqg_print_stat, \
+	}, \
+	{ \
+		.name = #prefix "idle_time", \
+		.private = offsetof(struct bfq_group, stats.idle_time), \
+		.seq_show = bfqg_print_stat, \
+	}, \
+	{ \
+		.name = #prefix "empty_time", \
+		.private = offsetof(struct bfq_group, stats.empty_time), \
+		.seq_show = bfqg_print_stat, \
+	}, \
+	{ \
+		.name = #prefix "dequeue", \
+		.private = offsetof(struct bfq_group, stats.dequeue), \
+		.seq_show = bfqg_print_stat, \
+	}
 
-	/* the same statistics which cover the bfqg and its descendants */
-	{
-		.name = "bfq.io_service_bytes_recursive",
-		.private = (unsigned long)&blkcg_policy_bfq,
-		.seq_show = blkg_print_stat_bytes_recursive,
-	},
-	{
-		.name = "bfq.io_serviced_recursive",
-		.private = (unsigned long)&blkcg_policy_bfq,
-		.seq_show = blkg_print_stat_ios_recursive,
-	},
+struct cftype bfq_blkcg_legacy_files[] = {
+	bfq_make_blkcg_legacy_files(bfq.),
+	bfq_make_blkcg_legacy_files(),
 #ifdef CONFIG_BFQ_CGROUP_DEBUG
-	{
-		.name = "bfq.time_recursive",
-		.private = offsetof(struct bfq_group, stats.time),
-		.seq_show = bfqg_print_stat_recursive,
-	},
-	{
-		.name = "bfq.sectors_recursive",
-		.seq_show = bfqg_print_stat_sectors_recursive,
-	},
-	{
-		.name = "bfq.io_service_time_recursive",
-		.private = offsetof(struct bfq_group, stats.service_time),
-		.seq_show = bfqg_print_rwstat_recursive,
-	},
-	{
-		.name = "bfq.io_wait_time_recursive",
-		.private = offsetof(struct bfq_group, stats.wait_time),
-		.seq_show = bfqg_print_rwstat_recursive,
-	},
-	{
-		.name = "bfq.io_merged_recursive",
-		.private = offsetof(struct bfq_group, stats.merged),
-		.seq_show = bfqg_print_rwstat_recursive,
-	},
-	{
-		.name = "bfq.io_queued_recursive",
-		.private = offsetof(struct bfq_group, stats.queued),
-		.seq_show = bfqg_print_rwstat_recursive,
-	},
-	{
-		.name = "bfq.avg_queue_size",
-		.seq_show = bfqg_print_avg_queue_size,
-	},
-	{
-		.name = "bfq.group_wait_time",
-		.private = offsetof(struct bfq_group, stats.group_wait_time),
-		.seq_show = bfqg_print_stat,
-	},
-	{
-		.name = "bfq.idle_time",
-		.private = offsetof(struct bfq_group, stats.idle_time),
-		.seq_show = bfqg_print_stat,
-	},
-	{
-		.name = "bfq.empty_time",
-		.private = offsetof(struct bfq_group, stats.empty_time),
-		.seq_show = bfqg_print_stat,
-	},
-	{
-		.name = "bfq.dequeue",
-		.private = offsetof(struct bfq_group, stats.dequeue),
-		.seq_show = bfqg_print_stat,
-	},
-#endif /* CONFIG_BFQ_CGROUP_DEBUG */
+	bfq_make_blkcg_legacy_debug_files(bfq.),
+	bfq_make_blkcg_legacy_debug_files(),
+#endif
 	{ } /* terminate */
 };
 
+#define bfq_make_blkg_files(prefix) \
+	{ \
+		.name = #prefix "weight", \
+		.flags = CFTYPE_NOT_ON_ROOT, \
+		.seq_show = bfq_io_show_weight, \
+		.write = bfq_io_set_weight, \
+	}
+
 struct cftype bfq_blkg_files[] = {
-	{
-		.name = "bfq.weight",
-		.flags = CFTYPE_NOT_ON_ROOT,
-		.seq_show = bfq_io_show_weight,
-		.write = bfq_io_set_weight,
-	},
+	bfq_make_blkg_files(bfq.),
+	bfq_make_blkg_files(),
 	{} /* terminate */
 };
 