From patchwork Wed Aug 19 15:20:28 2020
X-Patchwork-Submitter: John Garry
X-Patchwork-Id: 247984
From: John Garry
Subject: [PATCH v8 10/18] blk-mq, elevator: Count requests per hctx to improve performance
Date: Wed, 19 Aug 2020 23:20:28 +0800
Message-ID: <1597850436-116171-11-git-send-email-john.garry@huawei.com>
In-Reply-To: <1597850436-116171-1-git-send-email-john.garry@huawei.com>
References: <1597850436-116171-1-git-send-email-john.garry@huawei.com>
X-Mailing-List: linux-scsi@vger.kernel.org

From: Kashyap Desai

High CPU utilization on "native_queued_spin_lock_slowpath" due to lock
contention is possible for the mq-deadline and bfq IO schedulers when
nr_hw_queues is more than one.
This is because the kblockd workqueue can submit IO from all online CPUs
(via blk_mq_run_hw_queues()) even though only one hctx has pending
commands.

The .has_work elevator callback for the mq-deadline and bfq schedulers
reports pending work if there are any IOs on the request queue, but it
does not account for the hctx context.

Add a per-hctx 'elevator_queued' count to avoid triggering the elevator
when no requests are queued on that hctx.

Signed-off-by: Kashyap Desai
Signed-off-by: Hannes Reinecke
[jpg: Relocated atomic_dec() in dd_dispatch_request(), update commit message per Kashyap]
Signed-off-by: John Garry
---
 block/bfq-iosched.c    | 5 +++++
 block/blk-mq.c         | 1 +
 block/mq-deadline.c    | 6 ++++++
 include/linux/blk-mq.h | 4 ++++
 4 files changed, 16 insertions(+)

-- 
2.26.2

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 88f0dfa545d7..4650012f1e55 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -4640,6 +4640,9 @@ static bool bfq_has_work(struct blk_mq_hw_ctx *hctx)
 {
 	struct bfq_data *bfqd = hctx->queue->elevator->elevator_data;
 
+	if (!atomic_read(&hctx->elevator_queued))
+		return false;
+
 	/*
 	 * Avoiding lock: a race on bfqd->busy_queues should cause at
 	 * most a call to dispatch for nothing
@@ -5554,6 +5557,7 @@ static void bfq_insert_requests(struct blk_mq_hw_ctx *hctx,
 		rq = list_first_entry(list, struct request, queuelist);
 		list_del_init(&rq->queuelist);
 		bfq_insert_request(hctx, rq, at_head);
+		atomic_inc(&hctx->elevator_queued);
 	}
 }
 
@@ -5933,6 +5937,7 @@ static void bfq_finish_requeue_request(struct request *rq)
 
 		bfq_completed_request(bfqq, bfqd);
 		bfq_finish_requeue_request_body(bfqq);
+		atomic_dec(&rq->mq_hctx->elevator_queued);
 
 		spin_unlock_irqrestore(&bfqd->lock, flags);
 	} else {
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 457b43829a4f..361fb9fe1dc5 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2649,6 +2649,7 @@ blk_mq_alloc_hctx(struct request_queue *q, struct blk_mq_tag_set *set,
 		goto free_hctx;
 
 	atomic_set(&hctx->nr_active, 0);
+	atomic_set(&hctx->elevator_queued, 0);
 	if (node == NUMA_NO_NODE)
 		node = set->numa_node;
 	hctx->numa_node = node;
diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index b57470e154c8..800ac902809b 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -386,6 +386,8 @@ static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 	spin_lock(&dd->lock);
 	rq = __dd_dispatch_request(dd);
 	spin_unlock(&dd->lock);
+	if (rq)
+		atomic_dec(&rq->mq_hctx->elevator_queued);
 
 	return rq;
 }
@@ -533,6 +535,7 @@ static void dd_insert_requests(struct blk_mq_hw_ctx *hctx,
 		rq = list_first_entry(list, struct request, queuelist);
 		list_del_init(&rq->queuelist);
 		dd_insert_request(hctx, rq, at_head);
+		atomic_inc(&hctx->elevator_queued);
 	}
 	spin_unlock(&dd->lock);
 }
@@ -579,6 +582,9 @@ static bool dd_has_work(struct blk_mq_hw_ctx *hctx)
 {
 	struct deadline_data *dd = hctx->queue->elevator->elevator_data;
 
+	if (!atomic_read(&hctx->elevator_queued))
+		return false;
+
 	return !list_empty_careful(&dd->dispatch) ||
 		!list_empty_careful(&dd->fifo_list[0]) ||
 		!list_empty_careful(&dd->fifo_list[1]);
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index a4b35ec60faf..2f3ba31a1658 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -139,6 +139,10 @@ struct blk_mq_hw_ctx {
 	 * shared across request queues.
 	 */
 	atomic_t nr_active;
+	/**
+	 * @elevator_queued: Number of queued requests on hctx.
+	 */
+	atomic_t elevator_queued;
 
 	/** @cpuhp_online: List to store request if CPU is going to die */
 	struct hlist_node cpuhp_online;