From patchwork Mon Nov 13 06:34:09 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Valente X-Patchwork-Id: 118714 Delivered-To: patch@linaro.org Received: by 10.140.22.164 with SMTP id 33csp1444914qgn; Sun, 12 Nov 2017 22:34:43 -0800 (PST) X-Google-Smtp-Source: AGs4zMbZpS0VvKDbdqco7HW/tiwrFzRfc9zv6TbRbvMFD77d2txlQ5RwuJaD84jcF+x0foFSyS7b X-Received: by 10.101.69.8 with SMTP id n8mr7744527pgq.79.1510554883398; Sun, 12 Nov 2017 22:34:43 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1510554883; cv=none; d=google.com; s=arc-20160816; b=BYNykmB7fugOjUpQOnmrz6mKk6bQNu7hREDV2G2xxHCLJ1RQxS0nYukVe3O7pgdm0y uE1wH7GOrYyzPzA6JsxHuNcD3Rxw4KkdC496CLOIOkIIWvAt6B7Il8EUeuBmnIZbU0Dn d+Nv8ER67OiyZSyCemQtgBlAk1lKcMWvWhKh+6kL/qX54G5fsSEx2HupL6lSe/M+CoIT wUGhwtEogS7fNZOpwGrsmBTO14RsYYXdd6q7wCqaMVi+Fd5nqWJ3HFo+XxTw0GycEXLU XfaeDNNeT5KeZ9c/enWlwROiyaGngdnxdrvJIPxDC8V2YRhuPG2tvLH7qpf0jPOUsMSY gwbw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:references:in-reply-to:message-id:date :subject:cc:to:from:dkim-signature:arc-authentication-results; bh=cjHAdRZqcgI6QTmLCfaOoDBzbp6bb2VrC9Wdjh7Zbiw=; b=WuaQxw3TCS4IXiGvGtjhUYOyurp6YkKedLTlak4v0gyjkBmHltXqHBnihO37v599ly qc0bdx0T/Ac4066VdX5beP7+wAU+DYs1I9ZH2xjzkj2u8o6W+UcNLWWWZ/H3Qgv5SgOa kg6Z9V6k8EQSwh2/focV/HoblyMKoDhCgfpCO+pa90KpCuymPB6B9hSghmsowJsoZ8vw Z2VWAw10WhYPVFKalwU3yshFeRLR5nlnJe33el1uX5NyLawlbe7+zhMj4LTTKGtke7/h AvdpjGOsTeojUsSKQAkQaifT3//yr694AIHPnVTIajK62f1t/tqgNjAN4+/yJ51CnYv6 fFLA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=IY1HXTD2; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id h86si14813467pfj.24.2017.11.12.22.34.43; Sun, 12 Nov 2017 22:34:43 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=IY1HXTD2; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751979AbdKMGek (ORCPT + 27 others); Mon, 13 Nov 2017 01:34:40 -0500 Received: from mail-wr0-f194.google.com ([209.85.128.194]:56185 "EHLO mail-wr0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751924AbdKMGeg (ORCPT ); Mon, 13 Nov 2017 01:34:36 -0500 Received: by mail-wr0-f194.google.com with SMTP id l8so13414903wre.12 for ; Sun, 12 Nov 2017 22:34:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=cjHAdRZqcgI6QTmLCfaOoDBzbp6bb2VrC9Wdjh7Zbiw=; b=IY1HXTD27Yqqtm3i8B4epgD91AEW/gA04sb+qr+caQQ18pDndpHab9B5KitVHCxhUc 0Na4P13IdxKF/dJ4eo7sN9oTDTW/vwWtmSm78UBit0/YFjAqwkf4fSm/NUg1gAnBafF0 xC7jMyGyGTEBhjBAbGu9bG4nEGpgUu/j+TTdQ= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=cjHAdRZqcgI6QTmLCfaOoDBzbp6bb2VrC9Wdjh7Zbiw=; b=CG+d1egmTOW6cNulTZUApqyN7/olxRcmxMf9wctmPXNfr2+iX1HjeM3y1xVVIMZSck k75GiAV+UMtXofBnIEO+rbgUoy6PUhflsyjEZxoUympAiPukL3Cg94WukW2bwbM81eg/ FUJQ/Xk0I1VQCNA+kn63MTKhPMjGCMkedDX2ozt49eWHxeuRPgV5u/eJgut3kCss2c06 MnRSWQilUb+JSMUFIdEhO6lAWzAqD132UqGk+mJgZYqQbBWb6fXUV2481Q278+gNiFlI 1Y1u+wELQaZ6YS7IE/tudqLl/JpxS04n9JCRrTzupu+s5HFco+bgi68gMvsezIvcMpWF jD5A== X-Gm-Message-State: AJaThX4VgQpNkVTfOCAHRzSQzBzxjYGKWgulrjutyXoiH9e9yde7jpfb VKZL37+QMs632CAADCZNoI20xA== X-Received: by 10.223.145.166 with SMTP id 35mr5870498wri.51.1510554874743; Sun, 12 Nov 2017 22:34:34 -0800 (PST) Received: from localhost.localdomain ([5.169.167.83]) by smtp.gmail.com with ESMTPSA id 10sm11158429wml.27.2017.11.12.22.34.32 (version=TLS1 cipher=AES128-SHA bits=128/128); Sun, 12 Nov 2017 22:34:34 -0800 (PST) From: Paolo Valente To: Jens Axboe Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, ulf.hansson@linaro.org, broonie@kernel.org, linus.walleij@linaro.org, lee.tibbert@gmail.com, oleksandr@natalenko.name, lucmiccio@gmail.com, bfq-iosched@googlegroups.com, Paolo Valente Subject: [PATCH BUGFIX/IMPROVEMENT 3/4] block, bfq: update blkio stats outside the scheduler lock Date: Mon, 13 Nov 2017 07:34:09 +0100 Message-Id: <20171113063410.3029-4-paolo.valente@linaro.org> X-Mailer: git-send-email 2.10.0 In-Reply-To: <20171113063410.3029-1-paolo.valente@linaro.org> References: <20171113063410.3029-1-paolo.valente@linaro.org> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org bfq invokes various blkg_*stats_* functions to update the statistics contained in the special files blkio.bfq.* in the blkio controller groups, i.e., the I/O accounting related to the proportional-share policy provided by bfq. The execution of these functions takes a considerable percentage, about 40%, of the total per-request execution time of bfq (i.e., of the sum of the execution time of all the bfq functions that have to be executed to process an I/O request from its creation to its destruction). This reduces the request-processing rate sustainable by bfq noticeably, even on a multicore CPU. In fact, the bfq functions that invoke blkg_*stats_* functions cannot be executed in parallel with the rest of the code of bfq, because both are executed under the same same per-device scheduler lock. To reduce this slowdown, this commit moves, wherever possible, the invocation of these functions (more precisely, of the bfq functions that invoke blkg_*stats_* functions) outside the critical sections protected by the scheduler lock. With this change, and with all blkio.bfq.* statistics enabled, the throughput grows, e.g., from 250 to 310 KIOPS (+25%) on an Intel i7-4850HQ, in case of 8 threads doing random I/O in parallel on null_blk, with the latter configured with 0 latency. We obtained the same or higher throughput boosts, up to +30%, with other processors (some figures are reported in the documentation). For our tests, we used the script [1], with which our results can be easily reproduced. NOTE. This commit still protects the invocation of blkg_*stats_* functions with the request_queue lock, because the group these functions are invoked on may otherwise disappear before or while these functions are executed. Fortunately, tests without even this lock show, by difference, that the serialization caused by this lock has a little impact (at most ~5% of throughput reduction). [1] https://github.com/Algodev-github/IOSpeed Tested-by: Lee Tibbert Tested-by: Oleksandr Natalenko Signed-off-by: Paolo Valente Signed-off-by: Luca Miccio --- Documentation/block/bfq-iosched.txt | 6 +- block/bfq-iosched.c | 110 ++++++++++++++++++++++++++++++++---- block/bfq-wf2q.c | 1 - 3 files changed, 102 insertions(+), 15 deletions(-) -- 2.10.0 diff --git a/Documentation/block/bfq-iosched.txt b/Documentation/block/bfq-iosched.txt index 7a93615..7fad6c0 100644 --- a/Documentation/block/bfq-iosched.txt +++ b/Documentation/block/bfq-iosched.txt @@ -26,9 +26,9 @@ the limits on slow or average CPUs, here are BFQ limits for three different CPUs, on, respectively, an average laptop, an old desktop, and a cheap embedded system, in case full hierarchical support is enabled (i.e., CONFIG_BFQ_GROUP_IOSCHED is set): -- Intel i7-4850HQ: 250 KIOPS -- AMD A8-3850: 170 KIOPS -- ARM CortexTM-A53 Octa-core: 45 KIOPS +- Intel i7-4850HQ: 310 KIOPS +- AMD A8-3850: 200 KIOPS +- ARM CortexTM-A53 Octa-core: 56 KIOPS BFQ works for multi-queue devices too. diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index 91703eb..69e05f8 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -2228,7 +2228,6 @@ static void __bfq_set_in_service_queue(struct bfq_data *bfqd, struct bfq_queue *bfqq) { if (bfqq) { - bfqg_stats_update_avg_queue_size(bfqq_group(bfqq)); bfq_clear_bfqq_fifo_expire(bfqq); bfqd->budgets_assigned = (bfqd->budgets_assigned * 7 + 256) / 8; @@ -3469,7 +3468,6 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd) */ bfq_clear_bfqq_wait_request(bfqq); hrtimer_try_to_cancel(&bfqd->idle_slice_timer); - bfqg_stats_update_idle_time(bfqq_group(bfqq)); } goto keep_queue; } @@ -3695,15 +3693,67 @@ static struct request *bfq_dispatch_request(struct blk_mq_hw_ctx *hctx) { struct bfq_data *bfqd = hctx->queue->elevator->elevator_data; struct request *rq; +#ifdef CONFIG_BFQ_GROUP_IOSCHED + struct bfq_queue *in_serv_queue, *bfqq; + bool waiting_rq, idle_timer_disabled; +#endif spin_lock_irq(&bfqd->lock); +#ifdef CONFIG_BFQ_GROUP_IOSCHED + in_serv_queue = bfqd->in_service_queue; + waiting_rq = in_serv_queue && bfq_bfqq_wait_request(in_serv_queue); + + rq = __bfq_dispatch_request(hctx); + + idle_timer_disabled = + waiting_rq && !bfq_bfqq_wait_request(in_serv_queue); + +#else rq = __bfq_dispatch_request(hctx); - if (rq && RQ_BFQQ(rq)) - bfqg_stats_update_io_remove(bfqq_group(RQ_BFQQ(rq)), - rq->cmd_flags); +#endif spin_unlock_irq(&bfqd->lock); +#ifdef CONFIG_BFQ_GROUP_IOSCHED + bfqq = rq ? RQ_BFQQ(rq) : NULL; + if (!idle_timer_disabled && !bfqq) + return rq; + + /* + * rq and bfqq are guaranteed to exist until this function + * ends, for the following reasons. First, rq can be + * dispatched to the device, and then can be completed and + * freed, only after this function ends. Second, rq cannot be + * merged (and thus freed because of a merge) any longer, + * because it has already started. Thus rq cannot be freed + * before this function ends, and, since rq has a reference to + * bfqq, the same guarantee holds for bfqq too. + * + * In addition, the following queue lock guarantees that + * bfqq_group(bfqq) exists as well. + */ + spin_lock_irq(hctx->queue->queue_lock); + if (idle_timer_disabled) + /* + * Since the idle timer has been disabled, + * in_serv_queue contained some request when + * __bfq_dispatch_request was invoked above, which + * implies that rq was picked exactly from + * in_serv_queue. Thus in_serv_queue == bfqq, and is + * therefore guaranteed to exist because of the above + * arguments. + */ + bfqg_stats_update_idle_time(bfqq_group(in_serv_queue)); + if (bfqq) { + struct bfq_group *bfqg = bfqq_group(bfqq); + + bfqg_stats_update_avg_queue_size(bfqg); + bfqg_stats_set_start_empty_time(bfqg); + bfqg_stats_update_io_remove(bfqg, rq->cmd_flags); + } + spin_unlock_irq(hctx->queue->queue_lock); +#endif + return rq; } @@ -4161,7 +4211,6 @@ static void bfq_rq_enqueued(struct bfq_data *bfqd, struct bfq_queue *bfqq, */ bfq_clear_bfqq_wait_request(bfqq); hrtimer_try_to_cancel(&bfqd->idle_slice_timer); - bfqg_stats_update_idle_time(bfqq_group(bfqq)); /* * The queue is not empty, because a new request just @@ -4176,10 +4225,12 @@ static void bfq_rq_enqueued(struct bfq_data *bfqd, struct bfq_queue *bfqq, } } -static void __bfq_insert_request(struct bfq_data *bfqd, struct request *rq) +/* returns true if it causes the idle timer to be disabled */ +static bool __bfq_insert_request(struct bfq_data *bfqd, struct request *rq) { struct bfq_queue *bfqq = RQ_BFQQ(rq), *new_bfqq = bfq_setup_cooperator(bfqd, bfqq, rq, true); + bool waiting, idle_timer_disabled = false; if (new_bfqq) { if (bic_to_bfqq(RQ_BIC(rq), 1) != bfqq) @@ -4213,12 +4264,16 @@ static void __bfq_insert_request(struct bfq_data *bfqd, struct request *rq) bfqq = new_bfqq; } + waiting = bfqq && bfq_bfqq_wait_request(bfqq); bfq_add_request(rq); + idle_timer_disabled = waiting && !bfq_bfqq_wait_request(bfqq); rq->fifo_time = ktime_get_ns() + bfqd->bfq_fifo_expire[rq_is_sync(rq)]; list_add_tail(&rq->queuelist, &bfqq->fifo); bfq_rq_enqueued(bfqd, bfqq, rq); + + return idle_timer_disabled; } static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq, @@ -4226,7 +4281,11 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq, { struct request_queue *q = hctx->queue; struct bfq_data *bfqd = q->elevator->elevator_data; +#ifdef CONFIG_BFQ_GROUP_IOSCHED struct bfq_queue *bfqq = RQ_BFQQ(rq); + bool idle_timer_disabled = false; + unsigned int cmd_flags; +#endif spin_lock_irq(&bfqd->lock); if (blk_mq_sched_try_insert_merge(q, rq)) { @@ -4245,13 +4304,17 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq, else list_add_tail(&rq->queuelist, &bfqd->dispatch); } else { - __bfq_insert_request(bfqd, rq); +#ifdef CONFIG_BFQ_GROUP_IOSCHED + idle_timer_disabled = __bfq_insert_request(bfqd, rq); /* * Update bfqq, because, if a queue merge has occurred * in __bfq_insert_request, then rq has been * redirected into a new queue. */ bfqq = RQ_BFQQ(rq); +#else + __bfq_insert_request(bfqd, rq); +#endif if (rq_mergeable(rq)) { elv_rqhash_add(q, rq); @@ -4260,10 +4323,35 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq, } } - if (bfqq) - bfqg_stats_update_io_add(bfqq_group(bfqq), bfqq, rq->cmd_flags); - +#ifdef CONFIG_BFQ_GROUP_IOSCHED + /* + * Cache cmd_flags before releasing scheduler lock, because rq + * may disappear afterwards (for example, because of a request + * merge). + */ + cmd_flags = rq->cmd_flags; +#endif spin_unlock_irq(&bfqd->lock); + +#ifdef CONFIG_BFQ_GROUP_IOSCHED + if (!bfqq) + return; + /* + * bfqq still exists, because it can disappear only after + * either it is merged with another queue, or the process it + * is associated with exits. But both actions must be taken by + * the same process currently executing this flow of + * instruction. + * + * In addition, the following queue lock guarantees that + * bfqq_group(bfqq) exists as well. + */ + spin_lock_irq(q->queue_lock); + bfqg_stats_update_io_add(bfqq_group(bfqq), bfqq, cmd_flags); + if (idle_timer_disabled) + bfqg_stats_update_idle_time(bfqq_group(bfqq)); + spin_unlock_irq(q->queue_lock); +#endif } static void bfq_insert_requests(struct blk_mq_hw_ctx *hctx, diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c index 414ba68..e495d3f 100644 --- a/block/bfq-wf2q.c +++ b/block/bfq-wf2q.c @@ -843,7 +843,6 @@ void bfq_bfqq_served(struct bfq_queue *bfqq, int served) st->vtime += bfq_delta(served, st->wsum); bfq_forget_idle(st); } - bfqg_stats_set_start_empty_time(bfqq_group(bfqq)); bfq_log_bfqq(bfqq->bfqd, bfqq, "bfqq_served %d secs", served); }