From patchwork Sun Feb 7 09:20:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 378893 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C3F4EC433E6 for ; Sun, 7 Feb 2021 09:22:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9806064E4E for ; Sun, 7 Feb 2021 09:22:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229613AbhBGJWQ (ORCPT ); Sun, 7 Feb 2021 04:22:16 -0500 Received: from us-smtp-delivery-124.mimecast.com ([63.128.21.124]:36120 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229548AbhBGJWP (ORCPT ); Sun, 7 Feb 2021 04:22:15 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1612689649; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=TOQ9im4Mu4G/f1OPv+9xXL1RYiP/DU11WYQmJwqKzpA=; b=IYOSs/5UJKvowJq1Myufed5AgwhLYDjibGM0innjZPED5vwqLsMqIPjMC8YnECbVCTnuBO jwB/7QrpsMLbWjRKrPBj5ILlHipx0qBGBgTkAuzOMXgchL64N1kN+EqEhvqqb8B2uqihxJ r3mdoOdoOnn0E0FNA/R3mSk74o6yE/A= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-162-35aSzzMuN4WN74GEIyzRDA-1; Sun, 07 Feb 2021 04:20:45 -0500 X-MC-Unique: 35aSzzMuN4WN74GEIyzRDA-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 42982107ACC7; Sun, 7 Feb 2021 09:20:44 +0000 (UTC) Received: from localhost (ovpn-13-9.pek2.redhat.com [10.72.13.9]) by smtp.corp.redhat.com (Postfix) with ESMTP id 9233E1412A; Sun, 7 Feb 2021 09:20:40 +0000 (UTC) From: Ming Lei To: Jens Axboe , linux-block@vger.kernel.org, "Martin K . Petersen" , linux-scsi@vger.kernel.org Cc: Ming Lei , Omar Sandoval , Kashyap Desai , Sumanesh Samanta , "Ewan D . Milne" , Hannes Reinecke , Johannes Thumshirn Subject: [PATCH V8 01/13] sbitmap: remove sbitmap_clear_bit_unlock Date: Sun, 7 Feb 2021 17:20:17 +0800 Message-Id: <20210207092029.1558550-2-ming.lei@redhat.com> In-Reply-To: <20210207092029.1558550-1-ming.lei@redhat.com> References: <20210207092029.1558550-1-ming.lei@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org No one uses this helper any more, so kill it. Cc: Omar Sandoval Cc: Kashyap Desai Cc: Sumanesh Samanta Cc: Ewan D. Milne Cc: Hannes Reinecke Reviewed-by: Hannes Reinecke Reviewed-by: Johannes Thumshirn Tested-by: Sumanesh Samanta Signed-off-by: Ming Lei --- include/linux/sbitmap.h | 6 ------ 1 file changed, 6 deletions(-) diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h index 74cc6384715e..16353fbee765 100644 --- a/include/linux/sbitmap.h +++ b/include/linux/sbitmap.h @@ -315,12 +315,6 @@ static inline void sbitmap_deferred_clear_bit(struct sbitmap *sb, unsigned int b set_bit(SB_NR_TO_BIT(sb, bitnr), addr); } -static inline void sbitmap_clear_bit_unlock(struct sbitmap *sb, - unsigned int bitnr) -{ - clear_bit_unlock(SB_NR_TO_BIT(sb, bitnr), __sbitmap_word(sb, bitnr)); -} - static inline int sbitmap_test_bit(struct sbitmap *sb, unsigned int bitnr) { return test_bit(SB_NR_TO_BIT(sb, bitnr), __sbitmap_word(sb, bitnr)); From patchwork Sun Feb 7 09:20:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 378279 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 33C9BC433E0 for ; Sun, 7 Feb 2021 09:22:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 049A964E44 for ; Sun, 7 Feb 2021 09:22:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229706AbhBGJWa (ORCPT ); Sun, 7 Feb 2021 04:22:30 -0500 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:51622 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229638AbhBGJWX (ORCPT ); Sun, 7 Feb 2021 04:22:23 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1612689653; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5X4JyV3yRgX7edQzIwwiwSPfY5XTt1C4PAfVn+w5EcA=; b=iKZOknPLnNCIKh4ftievvM5IY7+uuo0jJ5nlnbP4KhYxfBHDJpMFKYgyt0IZjPFI6udR4J HI5iFqW2bIJunMpbag/d6YYlDMYekCZgEERTcZg7i7+gb3pJHD9Sl37sTZdnClUt64dMoj vLH/iInB8tASYaaVt/knhCffsu18zvQ= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-163-JbxaLiVwMEen7EG2YBCkQQ-1; Sun, 07 Feb 2021 04:20:49 -0500 X-MC-Unique: JbxaLiVwMEen7EG2YBCkQQ-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id A2C56801962; Sun, 7 Feb 2021 09:20:47 +0000 (UTC) Received: from localhost (ovpn-13-9.pek2.redhat.com [10.72.13.9]) by smtp.corp.redhat.com (Postfix) with ESMTP id E133F19D6C; Sun, 7 Feb 2021 09:20:46 +0000 (UTC) From: Ming Lei To: Jens Axboe , linux-block@vger.kernel.org, "Martin K . Petersen" , linux-scsi@vger.kernel.org Cc: Ming Lei , Omar Sandoval , Kashyap Desai , Sumanesh Samanta , "Ewan D . Milne" , Hannes Reinecke , virtualization@lists.linux-foundation.org Subject: [PATCH V8 02/13] sbitmap: maintain allocation round_robin in sbitmap Date: Sun, 7 Feb 2021 17:20:18 +0800 Message-Id: <20210207092029.1558550-3-ming.lei@redhat.com> In-Reply-To: <20210207092029.1558550-1-ming.lei@redhat.com> References: <20210207092029.1558550-1-ming.lei@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org Now allocation round_robin info is maintained by sbitmap_queue. Actually, bit allocation belongs to sbitmap. Also the following patch will move alloc_hint to sbitmap for users with high depth. So move round_robin to sbitmap. Cc: Omar Sandoval Cc: Kashyap Desai Cc: Sumanesh Samanta Cc: Ewan D. Milne Cc: Hannes Reinecke Cc: virtualization@lists.linux-foundation.org Reviewed-by: Hannes Reinecke Tested-by: Sumanesh Samanta Signed-off-by: Ming Lei --- block/blk-mq.c | 2 +- block/kyber-iosched.c | 3 ++- drivers/vhost/scsi.c | 4 ++-- include/linux/sbitmap.h | 20 ++++++++++---------- lib/sbitmap.c | 28 ++++++++++++++-------------- 5 files changed, 29 insertions(+), 28 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index f21d922ecfaf..a2bac31afaff 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2729,7 +2729,7 @@ blk_mq_alloc_hctx(struct request_queue *q, struct blk_mq_tag_set *set, goto free_cpumask; if (sbitmap_init_node(&hctx->ctx_map, nr_cpu_ids, ilog2(8), - gfp, node)) + gfp, node, false)) goto free_ctxs; hctx->nr_ctx = 0; diff --git a/block/kyber-iosched.c b/block/kyber-iosched.c index c25c41d0d061..d017c43c0041 100644 --- a/block/kyber-iosched.c +++ b/block/kyber-iosched.c @@ -479,7 +479,8 @@ static int kyber_init_hctx(struct blk_mq_hw_ctx *hctx, unsigned int hctx_idx) for (i = 0; i < KYBER_NUM_DOMAINS; i++) { if (sbitmap_init_node(&khd->kcq_map[i], hctx->nr_ctx, - ilog2(8), GFP_KERNEL, hctx->numa_node)) { + ilog2(8), GFP_KERNEL, hctx->numa_node, + false)) { while (--i >= 0) sbitmap_free(&khd->kcq_map[i]); goto err_kcqs; diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c index 4ce9f00ae10e..ab230f6f79e8 100644 --- a/drivers/vhost/scsi.c +++ b/drivers/vhost/scsi.c @@ -614,7 +614,7 @@ vhost_scsi_get_cmd(struct vhost_virtqueue *vq, struct vhost_scsi_tpg *tpg, return ERR_PTR(-EIO); } - tag = sbitmap_get(&svq->scsi_tags, 0, false); + tag = sbitmap_get(&svq->scsi_tags, 0); if (tag < 0) { pr_err("Unable to obtain tag for vhost_scsi_cmd\n"); return ERR_PTR(-ENOMEM); @@ -1512,7 +1512,7 @@ static int vhost_scsi_setup_vq_cmds(struct vhost_virtqueue *vq, int max_cmds) return 0; if (sbitmap_init_node(&svq->scsi_tags, max_cmds, -1, GFP_KERNEL, - NUMA_NO_NODE)) + NUMA_NO_NODE, false)) return -ENOMEM; svq->max_cmds = max_cmds; diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h index 16353fbee765..734ee6214cd6 100644 --- a/include/linux/sbitmap.h +++ b/include/linux/sbitmap.h @@ -56,6 +56,11 @@ struct sbitmap { */ unsigned int map_nr; + /** + * @round_robin: Allocate bits in strict round-robin order. + */ + bool round_robin; + /** * @map: Allocated bitmap. */ @@ -124,11 +129,6 @@ struct sbitmap_queue { */ atomic_t ws_active; - /** - * @round_robin: Allocate bits in strict round-robin order. - */ - bool round_robin; - /** * @min_shallow_depth: The minimum shallow depth which may be passed to * sbitmap_queue_get_shallow() or __sbitmap_queue_get_shallow(). @@ -144,11 +144,14 @@ struct sbitmap_queue { * given, a good default is chosen. * @flags: Allocation flags. * @node: Memory node to allocate on. + * @round_robin: If true, be stricter about allocation order; always allocate + * starting from the last allocated bit. This is less efficient + * than the default behavior (false). * * Return: Zero on success or negative errno on failure. */ int sbitmap_init_node(struct sbitmap *sb, unsigned int depth, int shift, - gfp_t flags, int node); + gfp_t flags, int node, bool round_robin); /** * sbitmap_free() - Free memory used by a &struct sbitmap. @@ -174,15 +177,12 @@ void sbitmap_resize(struct sbitmap *sb, unsigned int depth); * sbitmap_get() - Try to allocate a free bit from a &struct sbitmap. * @sb: Bitmap to allocate from. * @alloc_hint: Hint for where to start searching for a free bit. - * @round_robin: If true, be stricter about allocation order; always allocate - * starting from the last allocated bit. This is less efficient - * than the default behavior (false). * * This operation provides acquire barrier semantics if it succeeds. * * Return: Non-negative allocated bit number if successful, -1 otherwise. */ -int sbitmap_get(struct sbitmap *sb, unsigned int alloc_hint, bool round_robin); +int sbitmap_get(struct sbitmap *sb, unsigned int alloc_hint); /** * sbitmap_get_shallow() - Try to allocate a free bit from a &struct sbitmap, diff --git a/lib/sbitmap.c b/lib/sbitmap.c index d693d9213ceb..7000636933b3 100644 --- a/lib/sbitmap.c +++ b/lib/sbitmap.c @@ -33,7 +33,7 @@ static inline bool sbitmap_deferred_clear(struct sbitmap_word *map) } int sbitmap_init_node(struct sbitmap *sb, unsigned int depth, int shift, - gfp_t flags, int node) + gfp_t flags, int node, bool round_robin) { unsigned int bits_per_word; unsigned int i; @@ -58,6 +58,7 @@ int sbitmap_init_node(struct sbitmap *sb, unsigned int depth, int shift, sb->shift = shift; sb->depth = depth; sb->map_nr = DIV_ROUND_UP(sb->depth, bits_per_word); + sb->round_robin = round_robin; if (depth == 0) { sb->map = NULL; @@ -129,14 +130,14 @@ static int __sbitmap_get_word(unsigned long *word, unsigned long depth, } static int sbitmap_find_bit_in_index(struct sbitmap *sb, int index, - unsigned int alloc_hint, bool round_robin) + unsigned int alloc_hint) { struct sbitmap_word *map = &sb->map[index]; int nr; do { nr = __sbitmap_get_word(&map->word, map->depth, alloc_hint, - !round_robin); + !sb->round_robin); if (nr != -1) break; if (!sbitmap_deferred_clear(map)) @@ -146,7 +147,7 @@ static int sbitmap_find_bit_in_index(struct sbitmap *sb, int index, return nr; } -int sbitmap_get(struct sbitmap *sb, unsigned int alloc_hint, bool round_robin) +int sbitmap_get(struct sbitmap *sb, unsigned int alloc_hint) { unsigned int i, index; int nr = -1; @@ -158,14 +159,13 @@ int sbitmap_get(struct sbitmap *sb, unsigned int alloc_hint, bool round_robin) * alloc_hint to find the right word index. No point in looping * twice in find_next_zero_bit() for that case. */ - if (round_robin) + if (sb->round_robin) alloc_hint = SB_NR_TO_BIT(sb, alloc_hint); else alloc_hint = 0; for (i = 0; i < sb->map_nr; i++) { - nr = sbitmap_find_bit_in_index(sb, index, alloc_hint, - round_robin); + nr = sbitmap_find_bit_in_index(sb, index, alloc_hint); if (nr != -1) { nr += index << sb->shift; break; @@ -350,7 +350,8 @@ int sbitmap_queue_init_node(struct sbitmap_queue *sbq, unsigned int depth, int ret; int i; - ret = sbitmap_init_node(&sbq->sb, depth, shift, flags, node); + ret = sbitmap_init_node(&sbq->sb, depth, shift, flags, node, + round_robin); if (ret) return ret; @@ -382,7 +383,6 @@ int sbitmap_queue_init_node(struct sbitmap_queue *sbq, unsigned int depth, atomic_set(&sbq->ws[i].wait_cnt, sbq->wake_batch); } - sbq->round_robin = round_robin; return 0; } EXPORT_SYMBOL_GPL(sbitmap_queue_init_node); @@ -424,12 +424,12 @@ int __sbitmap_queue_get(struct sbitmap_queue *sbq) hint = depth ? prandom_u32() % depth : 0; this_cpu_write(*sbq->alloc_hint, hint); } - nr = sbitmap_get(&sbq->sb, hint, sbq->round_robin); + nr = sbitmap_get(&sbq->sb, hint); if (nr == -1) { /* If the map is full, a hint won't do us much good. */ this_cpu_write(*sbq->alloc_hint, 0); - } else if (nr == hint || unlikely(sbq->round_robin)) { + } else if (nr == hint || unlikely(sbq->sb.round_robin)) { /* Only update the hint if we used it. */ hint = nr + 1; if (hint >= depth - 1) @@ -460,7 +460,7 @@ int __sbitmap_queue_get_shallow(struct sbitmap_queue *sbq, if (nr == -1) { /* If the map is full, a hint won't do us much good. */ this_cpu_write(*sbq->alloc_hint, 0); - } else if (nr == hint || unlikely(sbq->round_robin)) { + } else if (nr == hint || unlikely(sbq->sb.round_robin)) { /* Only update the hint if we used it. */ hint = nr + 1; if (hint >= depth - 1) @@ -576,7 +576,7 @@ void sbitmap_queue_clear(struct sbitmap_queue *sbq, unsigned int nr, smp_mb__after_atomic(); sbitmap_queue_wake_up(sbq); - if (likely(!sbq->round_robin && nr < sbq->sb.depth)) + if (likely(!sbq->sb.round_robin && nr < sbq->sb.depth)) *per_cpu_ptr(sbq->alloc_hint, cpu) = nr; } EXPORT_SYMBOL_GPL(sbitmap_queue_clear); @@ -633,7 +633,7 @@ void sbitmap_queue_show(struct sbitmap_queue *sbq, struct seq_file *m) } seq_puts(m, "}\n"); - seq_printf(m, "round_robin=%d\n", sbq->round_robin); + seq_printf(m, "round_robin=%d\n", sbq->sb.round_robin); seq_printf(m, "min_shallow_depth=%u\n", sbq->min_shallow_depth); } EXPORT_SYMBOL_GPL(sbitmap_queue_show); From patchwork Sun Feb 7 09:20:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 378892 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2877EC433DB for ; Sun, 7 Feb 2021 09:22:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id F2D1664E44 for ; Sun, 7 Feb 2021 09:22:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229646AbhBGJWb (ORCPT ); Sun, 7 Feb 2021 04:22:31 -0500 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:46670 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229548AbhBGJWZ (ORCPT ); Sun, 7 Feb 2021 04:22:25 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1612689659; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=KnG5gZBPUTLeVx2S/SnAZtG2WSuge4Xyl5Npf9sycBs=; b=EePu3SAZjlPdfKF6iVUBWywPStKAspdtkGpjvySYI/u1NFueVA97aQFzYyJ1f+Wu78SFOa Dd0KICswL+nhTwzIZCdGYOa+RxryT/Nb6mFIZQEuAma9QnlWyCUpnYpH0Wj+xEEhk+GLQm VNdHaeOK97nE4jUhSLr5wige2XuosyU= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-311-cvp7uKv4OHSS-51X10hGfw-1; Sun, 07 Feb 2021 04:20:55 -0500 X-MC-Unique: cvp7uKv4OHSS-51X10hGfw-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 5F0611846098; Sun, 7 Feb 2021 09:20:53 +0000 (UTC) Received: from localhost (ovpn-13-9.pek2.redhat.com [10.72.13.9]) by smtp.corp.redhat.com (Postfix) with ESMTP id BDAFA5C5AE; Sun, 7 Feb 2021 09:20:49 +0000 (UTC) From: Ming Lei To: Jens Axboe , linux-block@vger.kernel.org, "Martin K . Petersen" , linux-scsi@vger.kernel.org Cc: Ming Lei , Omar Sandoval , Kashyap Desai , Sumanesh Samanta , "Ewan D . Milne" , Hannes Reinecke Subject: [PATCH V8 03/13] sbitmap: add helpers for updating allocation hint Date: Sun, 7 Feb 2021 17:20:19 +0800 Message-Id: <20210207092029.1558550-4-ming.lei@redhat.com> In-Reply-To: <20210207092029.1558550-1-ming.lei@redhat.com> References: <20210207092029.1558550-1-ming.lei@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org Add helpers for updating allocation hint, so that we can avoid to duplicate code. Prepare for moving allocation hint into sbitmap. Cc: Omar Sandoval Cc: Kashyap Desai Cc: Sumanesh Samanta Cc: Ewan D. Milne Cc: Hannes Reinecke Reviewed-by: Hannes Reinecke Tested-by: Sumanesh Samanta Signed-off-by: Ming Lei --- lib/sbitmap.c | 93 ++++++++++++++++++++++++++++++--------------------- 1 file changed, 54 insertions(+), 39 deletions(-) diff --git a/lib/sbitmap.c b/lib/sbitmap.c index 7000636933b3..2b43a6aefec3 100644 --- a/lib/sbitmap.c +++ b/lib/sbitmap.c @@ -9,6 +9,55 @@ #include #include +static int init_alloc_hint(struct sbitmap_queue *sbq, gfp_t flags) +{ + unsigned depth = sbq->sb.depth; + + sbq->alloc_hint = alloc_percpu_gfp(unsigned int, flags); + if (!sbq->alloc_hint) + return -ENOMEM; + + if (depth && !sbq->sb.round_robin) { + int i; + + for_each_possible_cpu(i) + *per_cpu_ptr(sbq->alloc_hint, i) = prandom_u32() % depth; + } + + return 0; +} + +static inline unsigned update_alloc_hint_before_get(struct sbitmap_queue *sbq, + unsigned int depth) +{ + unsigned hint; + + hint = this_cpu_read(*sbq->alloc_hint); + if (unlikely(hint >= depth)) { + hint = depth ? prandom_u32() % depth : 0; + this_cpu_write(*sbq->alloc_hint, hint); + } + + return hint; +} + +static inline void update_alloc_hint_after_get(struct sbitmap_queue *sbq, + unsigned int depth, + unsigned int hint, + unsigned int nr) +{ + if (nr == -1) { + /* If the map is full, a hint won't do us much good. */ + this_cpu_write(*sbq->alloc_hint, 0); + } else if (nr == hint || unlikely(sbq->sb.round_robin)) { + /* Only update the hint if we used it. */ + hint = nr + 1; + if (hint >= depth - 1) + hint = 0; + this_cpu_write(*sbq->alloc_hint, hint); + } +} + /* * See if we have deferred clears that we can batch move */ @@ -355,17 +404,11 @@ int sbitmap_queue_init_node(struct sbitmap_queue *sbq, unsigned int depth, if (ret) return ret; - sbq->alloc_hint = alloc_percpu_gfp(unsigned int, flags); - if (!sbq->alloc_hint) { + if (init_alloc_hint(sbq, flags) != 0) { sbitmap_free(&sbq->sb); return -ENOMEM; } - if (depth && !round_robin) { - for_each_possible_cpu(i) - *per_cpu_ptr(sbq->alloc_hint, i) = prandom_u32() % depth; - } - sbq->min_shallow_depth = UINT_MAX; sbq->wake_batch = sbq_calc_wake_batch(sbq, depth); atomic_set(&sbq->wake_index, 0); @@ -418,24 +461,10 @@ int __sbitmap_queue_get(struct sbitmap_queue *sbq) unsigned int hint, depth; int nr; - hint = this_cpu_read(*sbq->alloc_hint); depth = READ_ONCE(sbq->sb.depth); - if (unlikely(hint >= depth)) { - hint = depth ? prandom_u32() % depth : 0; - this_cpu_write(*sbq->alloc_hint, hint); - } + hint = update_alloc_hint_before_get(sbq, depth); nr = sbitmap_get(&sbq->sb, hint); - - if (nr == -1) { - /* If the map is full, a hint won't do us much good. */ - this_cpu_write(*sbq->alloc_hint, 0); - } else if (nr == hint || unlikely(sbq->sb.round_robin)) { - /* Only update the hint if we used it. */ - hint = nr + 1; - if (hint >= depth - 1) - hint = 0; - this_cpu_write(*sbq->alloc_hint, hint); - } + update_alloc_hint_after_get(sbq, depth, hint, nr); return nr; } @@ -449,24 +478,10 @@ int __sbitmap_queue_get_shallow(struct sbitmap_queue *sbq, WARN_ON_ONCE(shallow_depth < sbq->min_shallow_depth); - hint = this_cpu_read(*sbq->alloc_hint); depth = READ_ONCE(sbq->sb.depth); - if (unlikely(hint >= depth)) { - hint = depth ? prandom_u32() % depth : 0; - this_cpu_write(*sbq->alloc_hint, hint); - } + hint = update_alloc_hint_before_get(sbq, depth); nr = sbitmap_get_shallow(&sbq->sb, hint, shallow_depth); - - if (nr == -1) { - /* If the map is full, a hint won't do us much good. */ - this_cpu_write(*sbq->alloc_hint, 0); - } else if (nr == hint || unlikely(sbq->sb.round_robin)) { - /* Only update the hint if we used it. */ - hint = nr + 1; - if (hint >= depth - 1) - hint = 0; - this_cpu_write(*sbq->alloc_hint, hint); - } + update_alloc_hint_after_get(sbq, depth, hint, nr); return nr; } From patchwork Sun Feb 7 09:20:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 378277 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BF31DC433E9 for ; Sun, 7 Feb 2021 09:22:57 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8C88364E43 for ; Sun, 7 Feb 2021 09:22:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229752AbhBGJWz (ORCPT ); Sun, 7 Feb 2021 04:22:55 -0500 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:21144 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229715AbhBGJWe (ORCPT ); Sun, 7 Feb 2021 04:22:34 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1612689666; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=r1WyMD3oXNUwkQQ5/3rgex241U1lMXhHQb0nNsfJ0j0=; b=XdUdNwfvBTJp0h5mh8s98WC4dCoT0Fl0PKdQKta9nBQimZlm+Hph9sE4D71ouGS5Zuya47 AakBFeej62d2PkRHNvcPDwgoPzI57+Lb+o2IGcNgVx/PQsobTMqB/AmzQQ+/idx8Cwf6P7 /yRKNdsbowwZHDrhWBwRmKNOxDYPsEU= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-255-cSkuzTHqMwq89wI443BVOA-1; Sun, 07 Feb 2021 04:21:02 -0500 X-MC-Unique: cSkuzTHqMwq89wI443BVOA-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 3FEF9107ACE6; Sun, 7 Feb 2021 09:21:00 +0000 (UTC) Received: from localhost (ovpn-13-9.pek2.redhat.com [10.72.13.9]) by smtp.corp.redhat.com (Postfix) with ESMTP id E18F619D6C; Sun, 7 Feb 2021 09:20:55 +0000 (UTC) From: Ming Lei To: Jens Axboe , linux-block@vger.kernel.org, "Martin K . Petersen" , linux-scsi@vger.kernel.org Cc: Ming Lei , Omar Sandoval , Kashyap Desai , Sumanesh Samanta , "Ewan D . Milne" , Mike Christie , virtualization@lists.linux-foundation.org, Hannes Reinecke Subject: [PATCH V8 04/13] sbitmap: move allocation hint into sbitmap Date: Sun, 7 Feb 2021 17:20:20 +0800 Message-Id: <20210207092029.1558550-5-ming.lei@redhat.com> In-Reply-To: <20210207092029.1558550-1-ming.lei@redhat.com> References: <20210207092029.1558550-1-ming.lei@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org Allocation hint should have belonged to sbitmap, also when sbitmap's depth is high and no need to use mulitple wakeup queues, user can benefit from percpu allocation hint too. So move allocation hint into sbitmap, then scsi device queue can benefit from allocation hint when converting to plain sbitmap in the following patches. Meantime convert vhost/scsi.c to use sbitmap allocation with percpu alloc hint, which way is more efficient than previous way. Cc: Omar Sandoval Cc: Kashyap Desai Cc: Sumanesh Samanta Cc: Ewan D. Milne Cc: Mike Christie Cc: virtualization@lists.linux-foundation.org Reviewed-by: Hannes Reinecke Tested-by: Sumanesh Samanta Signed-off-by: Ming Lei --- block/blk-mq.c | 2 +- block/kyber-iosched.c | 2 +- drivers/vhost/scsi.c | 4 +- include/linux/sbitmap.h | 41 +++++++++------ lib/sbitmap.c | 112 +++++++++++++++++++++++----------------- 5 files changed, 96 insertions(+), 65 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index a2bac31afaff..bcca3e9c04e8 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2729,7 +2729,7 @@ blk_mq_alloc_hctx(struct request_queue *q, struct blk_mq_tag_set *set, goto free_cpumask; if (sbitmap_init_node(&hctx->ctx_map, nr_cpu_ids, ilog2(8), - gfp, node, false)) + gfp, node, false, false)) goto free_ctxs; hctx->nr_ctx = 0; diff --git a/block/kyber-iosched.c b/block/kyber-iosched.c index d017c43c0041..d95a2b3393cc 100644 --- a/block/kyber-iosched.c +++ b/block/kyber-iosched.c @@ -480,7 +480,7 @@ static int kyber_init_hctx(struct blk_mq_hw_ctx *hctx, unsigned int hctx_idx) for (i = 0; i < KYBER_NUM_DOMAINS; i++) { if (sbitmap_init_node(&khd->kcq_map[i], hctx->nr_ctx, ilog2(8), GFP_KERNEL, hctx->numa_node, - false)) { + false, false)) { while (--i >= 0) sbitmap_free(&khd->kcq_map[i]); goto err_kcqs; diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c index ab230f6f79e8..8531751b28dd 100644 --- a/drivers/vhost/scsi.c +++ b/drivers/vhost/scsi.c @@ -614,7 +614,7 @@ vhost_scsi_get_cmd(struct vhost_virtqueue *vq, struct vhost_scsi_tpg *tpg, return ERR_PTR(-EIO); } - tag = sbitmap_get(&svq->scsi_tags, 0); + tag = sbitmap_get(&svq->scsi_tags); if (tag < 0) { pr_err("Unable to obtain tag for vhost_scsi_cmd\n"); return ERR_PTR(-ENOMEM); @@ -1512,7 +1512,7 @@ static int vhost_scsi_setup_vq_cmds(struct vhost_virtqueue *vq, int max_cmds) return 0; if (sbitmap_init_node(&svq->scsi_tags, max_cmds, -1, GFP_KERNEL, - NUMA_NO_NODE, false)) + NUMA_NO_NODE, false, true)) return -ENOMEM; svq->max_cmds = max_cmds; diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h index 734ee6214cd6..247776fcc02c 100644 --- a/include/linux/sbitmap.h +++ b/include/linux/sbitmap.h @@ -65,6 +65,14 @@ struct sbitmap { * @map: Allocated bitmap. */ struct sbitmap_word *map; + + /* + * @alloc_hint: Cache of last successfully allocated or freed bit. + * + * This is per-cpu, which allows multiple users to stick to different + * cachelines until the map is exhausted. + */ + unsigned int __percpu *alloc_hint; }; #define SBQ_WAIT_QUEUES 8 @@ -100,14 +108,6 @@ struct sbitmap_queue { */ struct sbitmap sb; - /* - * @alloc_hint: Cache of last successfully allocated or freed bit. - * - * This is per-cpu, which allows multiple users to stick to different - * cachelines until the map is exhausted. - */ - unsigned int __percpu *alloc_hint; - /** * @wake_batch: Number of bits which must be freed before we wake up any * waiters. @@ -147,11 +147,13 @@ struct sbitmap_queue { * @round_robin: If true, be stricter about allocation order; always allocate * starting from the last allocated bit. This is less efficient * than the default behavior (false). + * @alloc_hint: If true, apply percpu hint for where to start searching for + * a free bit. * * Return: Zero on success or negative errno on failure. */ int sbitmap_init_node(struct sbitmap *sb, unsigned int depth, int shift, - gfp_t flags, int node, bool round_robin); + gfp_t flags, int node, bool round_robin, bool alloc_hint); /** * sbitmap_free() - Free memory used by a &struct sbitmap. @@ -159,6 +161,7 @@ int sbitmap_init_node(struct sbitmap *sb, unsigned int depth, int shift, */ static inline void sbitmap_free(struct sbitmap *sb) { + free_percpu(sb->alloc_hint); kfree(sb->map); sb->map = NULL; } @@ -176,19 +179,17 @@ void sbitmap_resize(struct sbitmap *sb, unsigned int depth); /** * sbitmap_get() - Try to allocate a free bit from a &struct sbitmap. * @sb: Bitmap to allocate from. - * @alloc_hint: Hint for where to start searching for a free bit. * * This operation provides acquire barrier semantics if it succeeds. * * Return: Non-negative allocated bit number if successful, -1 otherwise. */ -int sbitmap_get(struct sbitmap *sb, unsigned int alloc_hint); +int sbitmap_get(struct sbitmap *sb); /** * sbitmap_get_shallow() - Try to allocate a free bit from a &struct sbitmap, * limiting the depth used from each word. * @sb: Bitmap to allocate from. - * @alloc_hint: Hint for where to start searching for a free bit. * @shallow_depth: The maximum number of bits to allocate from a single word. * * This rather specific operation allows for having multiple users with @@ -200,8 +201,7 @@ int sbitmap_get(struct sbitmap *sb, unsigned int alloc_hint); * * Return: Non-negative allocated bit number if successful, -1 otherwise. */ -int sbitmap_get_shallow(struct sbitmap *sb, unsigned int alloc_hint, - unsigned long shallow_depth); +int sbitmap_get_shallow(struct sbitmap *sb, unsigned long shallow_depth); /** * sbitmap_any_bit_set() - Check for a set bit in a &struct sbitmap. @@ -315,6 +315,18 @@ static inline void sbitmap_deferred_clear_bit(struct sbitmap *sb, unsigned int b set_bit(SB_NR_TO_BIT(sb, bitnr), addr); } +/* + * Pair of sbitmap_get, and this one applies both cleared bit and + * allocation hint. + */ +static inline void sbitmap_put(struct sbitmap *sb, unsigned int bitnr) +{ + sbitmap_deferred_clear_bit(sb, bitnr); + + if (likely(sb->alloc_hint && !sb->round_robin && bitnr < sb->depth)) + *this_cpu_ptr(sb->alloc_hint) = bitnr; +} + static inline int sbitmap_test_bit(struct sbitmap *sb, unsigned int bitnr) { return test_bit(SB_NR_TO_BIT(sb, bitnr), __sbitmap_word(sb, bitnr)); @@ -363,7 +375,6 @@ int sbitmap_queue_init_node(struct sbitmap_queue *sbq, unsigned int depth, static inline void sbitmap_queue_free(struct sbitmap_queue *sbq) { kfree(sbq->ws); - free_percpu(sbq->alloc_hint); sbitmap_free(&sbq->sb); } diff --git a/lib/sbitmap.c b/lib/sbitmap.c index 2b43a6aefec3..e395435654aa 100644 --- a/lib/sbitmap.c +++ b/lib/sbitmap.c @@ -9,52 +9,51 @@ #include #include -static int init_alloc_hint(struct sbitmap_queue *sbq, gfp_t flags) +static int init_alloc_hint(struct sbitmap *sb, gfp_t flags) { - unsigned depth = sbq->sb.depth; + unsigned depth = sb->depth; - sbq->alloc_hint = alloc_percpu_gfp(unsigned int, flags); - if (!sbq->alloc_hint) + sb->alloc_hint = alloc_percpu_gfp(unsigned int, flags); + if (!sb->alloc_hint) return -ENOMEM; - if (depth && !sbq->sb.round_robin) { + if (depth && !sb->round_robin) { int i; for_each_possible_cpu(i) - *per_cpu_ptr(sbq->alloc_hint, i) = prandom_u32() % depth; + *per_cpu_ptr(sb->alloc_hint, i) = prandom_u32() % depth; } - return 0; } -static inline unsigned update_alloc_hint_before_get(struct sbitmap_queue *sbq, +static inline unsigned update_alloc_hint_before_get(struct sbitmap *sb, unsigned int depth) { unsigned hint; - hint = this_cpu_read(*sbq->alloc_hint); + hint = this_cpu_read(*sb->alloc_hint); if (unlikely(hint >= depth)) { hint = depth ? prandom_u32() % depth : 0; - this_cpu_write(*sbq->alloc_hint, hint); + this_cpu_write(*sb->alloc_hint, hint); } return hint; } -static inline void update_alloc_hint_after_get(struct sbitmap_queue *sbq, +static inline void update_alloc_hint_after_get(struct sbitmap *sb, unsigned int depth, unsigned int hint, unsigned int nr) { if (nr == -1) { /* If the map is full, a hint won't do us much good. */ - this_cpu_write(*sbq->alloc_hint, 0); - } else if (nr == hint || unlikely(sbq->sb.round_robin)) { + this_cpu_write(*sb->alloc_hint, 0); + } else if (nr == hint || unlikely(sb->round_robin)) { /* Only update the hint if we used it. */ hint = nr + 1; if (hint >= depth - 1) hint = 0; - this_cpu_write(*sbq->alloc_hint, hint); + this_cpu_write(*sb->alloc_hint, hint); } } @@ -82,7 +81,8 @@ static inline bool sbitmap_deferred_clear(struct sbitmap_word *map) } int sbitmap_init_node(struct sbitmap *sb, unsigned int depth, int shift, - gfp_t flags, int node, bool round_robin) + gfp_t flags, int node, bool round_robin, + bool alloc_hint) { unsigned int bits_per_word; unsigned int i; @@ -114,9 +114,18 @@ int sbitmap_init_node(struct sbitmap *sb, unsigned int depth, int shift, return 0; } + if (alloc_hint) { + if (init_alloc_hint(sb, flags)) + return -ENOMEM; + } else { + sb->alloc_hint = NULL; + } + sb->map = kcalloc_node(sb->map_nr, sizeof(*sb->map), flags, node); - if (!sb->map) + if (!sb->map) { + free_percpu(sb->alloc_hint); return -ENOMEM; + } for (i = 0; i < sb->map_nr; i++) { sb->map[i].depth = min(depth, bits_per_word); @@ -196,7 +205,7 @@ static int sbitmap_find_bit_in_index(struct sbitmap *sb, int index, return nr; } -int sbitmap_get(struct sbitmap *sb, unsigned int alloc_hint) +static int __sbitmap_get(struct sbitmap *sb, unsigned int alloc_hint) { unsigned int i, index; int nr = -1; @@ -228,10 +237,27 @@ int sbitmap_get(struct sbitmap *sb, unsigned int alloc_hint) return nr; } + +int sbitmap_get(struct sbitmap *sb) +{ + int nr; + unsigned int hint, depth; + + if (WARN_ON_ONCE(unlikely(!sb->alloc_hint))) + return -1; + + depth = READ_ONCE(sb->depth); + hint = update_alloc_hint_before_get(sb, depth); + nr = __sbitmap_get(sb, hint); + update_alloc_hint_after_get(sb, depth, hint, nr); + + return nr; +} EXPORT_SYMBOL_GPL(sbitmap_get); -int sbitmap_get_shallow(struct sbitmap *sb, unsigned int alloc_hint, - unsigned long shallow_depth) +static int __sbitmap_get_shallow(struct sbitmap *sb, + unsigned int alloc_hint, + unsigned long shallow_depth) { unsigned int i, index; int nr = -1; @@ -263,6 +289,22 @@ int sbitmap_get_shallow(struct sbitmap *sb, unsigned int alloc_hint, return nr; } + +int sbitmap_get_shallow(struct sbitmap *sb, unsigned long shallow_depth) +{ + int nr; + unsigned int hint, depth; + + if (WARN_ON_ONCE(unlikely(!sb->alloc_hint))) + return -1; + + depth = READ_ONCE(sb->depth); + hint = update_alloc_hint_before_get(sb, depth); + nr = __sbitmap_get_shallow(sb, hint, shallow_depth); + update_alloc_hint_after_get(sb, depth, hint, nr); + + return nr; +} EXPORT_SYMBOL_GPL(sbitmap_get_shallow); bool sbitmap_any_bit_set(const struct sbitmap *sb) @@ -400,15 +442,10 @@ int sbitmap_queue_init_node(struct sbitmap_queue *sbq, unsigned int depth, int i; ret = sbitmap_init_node(&sbq->sb, depth, shift, flags, node, - round_robin); + round_robin, true); if (ret) return ret; - if (init_alloc_hint(sbq, flags) != 0) { - sbitmap_free(&sbq->sb); - return -ENOMEM; - } - sbq->min_shallow_depth = UINT_MAX; sbq->wake_batch = sbq_calc_wake_batch(sbq, depth); atomic_set(&sbq->wake_index, 0); @@ -416,7 +453,6 @@ int sbitmap_queue_init_node(struct sbitmap_queue *sbq, unsigned int depth, sbq->ws = kzalloc_node(SBQ_WAIT_QUEUES * sizeof(*sbq->ws), flags, node); if (!sbq->ws) { - free_percpu(sbq->alloc_hint); sbitmap_free(&sbq->sb); return -ENOMEM; } @@ -458,32 +494,16 @@ EXPORT_SYMBOL_GPL(sbitmap_queue_resize); int __sbitmap_queue_get(struct sbitmap_queue *sbq) { - unsigned int hint, depth; - int nr; - - depth = READ_ONCE(sbq->sb.depth); - hint = update_alloc_hint_before_get(sbq, depth); - nr = sbitmap_get(&sbq->sb, hint); - update_alloc_hint_after_get(sbq, depth, hint, nr); - - return nr; + return sbitmap_get(&sbq->sb); } EXPORT_SYMBOL_GPL(__sbitmap_queue_get); int __sbitmap_queue_get_shallow(struct sbitmap_queue *sbq, unsigned int shallow_depth) { - unsigned int hint, depth; - int nr; - WARN_ON_ONCE(shallow_depth < sbq->min_shallow_depth); - depth = READ_ONCE(sbq->sb.depth); - hint = update_alloc_hint_before_get(sbq, depth); - nr = sbitmap_get_shallow(&sbq->sb, hint, shallow_depth); - update_alloc_hint_after_get(sbq, depth, hint, nr); - - return nr; + return sbitmap_get_shallow(&sbq->sb, shallow_depth); } EXPORT_SYMBOL_GPL(__sbitmap_queue_get_shallow); @@ -592,7 +612,7 @@ void sbitmap_queue_clear(struct sbitmap_queue *sbq, unsigned int nr, sbitmap_queue_wake_up(sbq); if (likely(!sbq->sb.round_robin && nr < sbq->sb.depth)) - *per_cpu_ptr(sbq->alloc_hint, cpu) = nr; + *per_cpu_ptr(sbq->sb.alloc_hint, cpu) = nr; } EXPORT_SYMBOL_GPL(sbitmap_queue_clear); @@ -630,7 +650,7 @@ void sbitmap_queue_show(struct sbitmap_queue *sbq, struct seq_file *m) if (!first) seq_puts(m, ", "); first = false; - seq_printf(m, "%u", *per_cpu_ptr(sbq->alloc_hint, i)); + seq_printf(m, "%u", *per_cpu_ptr(sbq->sb.alloc_hint, i)); } seq_puts(m, "}\n"); From patchwork Sun Feb 7 09:20:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 378278 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5FE18C433DB for ; Sun, 7 Feb 2021 09:22:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3181E64E43 for ; Sun, 7 Feb 2021 09:22:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229742AbhBGJWw (ORCPT ); Sun, 7 Feb 2021 04:22:52 -0500 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:33440 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229717AbhBGJWd (ORCPT ); Sun, 7 Feb 2021 04:22:33 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1612689666; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=3NQbHSeYUBG1OhOepiNAE6URM0kU5Vn4WnKABOQl7lo=; b=ZcBBEG0C3+yTUV9HNTaNlwxIto2yQlz9jqullNsQ/etNnQayd4Btw3ZidJ4V8LRNAxsk3z 1S2wvrljIBYQq3nHv8yCqUu8Z2CVIjYtz19CXx84e/sznUB44a53M+hxPwmMTJCtymQgGg ly6PpciJxCCkoAIoSd3KJAAbHIurjYg= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-507-ap8Y5AizOLuhA0eR-DzMag-1; Sun, 07 Feb 2021 04:21:04 -0500 X-MC-Unique: ap8Y5AizOLuhA0eR-DzMag-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 475161846092; Sun, 7 Feb 2021 09:21:03 +0000 (UTC) Received: from localhost (ovpn-13-9.pek2.redhat.com [10.72.13.9]) by smtp.corp.redhat.com (Postfix) with ESMTP id 97AB35C8A7; Sun, 7 Feb 2021 09:21:02 +0000 (UTC) From: Ming Lei To: Jens Axboe , linux-block@vger.kernel.org, "Martin K . Petersen" , linux-scsi@vger.kernel.org Cc: Ming Lei , Omar Sandoval , Kashyap Desai , Sumanesh Samanta , "Ewan D . Milne" , Hannes Reinecke Subject: [PATCH V8 05/13] sbitmap: export sbitmap_weight Date: Sun, 7 Feb 2021 17:20:21 +0800 Message-Id: <20210207092029.1558550-6-ming.lei@redhat.com> In-Reply-To: <20210207092029.1558550-1-ming.lei@redhat.com> References: <20210207092029.1558550-1-ming.lei@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org SCSI's .device_busy will be converted to sbitmap, and sbitmap_weight is needed, so export the helper. Meantime the only user of sbitmap_weight() is just for getting how many set and not cleared bits, so align sbitmap_weight() meaning with this way because the external user only cares how many busy bits there are in the sbitmap. Cc: Omar Sandoval Cc: Kashyap Desai Cc: Sumanesh Samanta Cc: Ewan D. Milne Reviewed-by: Hannes Reinecke Tested-by: Sumanesh Samanta Signed-off-by: Ming Lei --- include/linux/sbitmap.h | 10 ++++++++++ lib/sbitmap.c | 11 ++++++----- 2 files changed, 16 insertions(+), 5 deletions(-) diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h index 247776fcc02c..c65ba887dcc3 100644 --- a/include/linux/sbitmap.h +++ b/include/linux/sbitmap.h @@ -341,6 +341,16 @@ static inline int sbitmap_test_bit(struct sbitmap *sb, unsigned int bitnr) */ void sbitmap_show(struct sbitmap *sb, struct seq_file *m); + +/** + * sbitmap_weight() - Return how many set and not cleared bits in a &struct + * sbitmap. + * @sb: Bitmap to check. + * + * Return: How many set and not cleared bits set + */ +unsigned int sbitmap_weight(const struct sbitmap *sb); + /** * sbitmap_bitmap_show() - Write a hex dump of a &struct sbitmap to a &struct * seq_file. diff --git a/lib/sbitmap.c b/lib/sbitmap.c index e395435654aa..73da26ad021e 100644 --- a/lib/sbitmap.c +++ b/lib/sbitmap.c @@ -334,20 +334,21 @@ static unsigned int __sbitmap_weight(const struct sbitmap *sb, bool set) return weight; } -static unsigned int sbitmap_weight(const struct sbitmap *sb) +static unsigned int sbitmap_cleared(const struct sbitmap *sb) { - return __sbitmap_weight(sb, true); + return __sbitmap_weight(sb, false); } -static unsigned int sbitmap_cleared(const struct sbitmap *sb) +unsigned int sbitmap_weight(const struct sbitmap *sb) { - return __sbitmap_weight(sb, false); + return __sbitmap_weight(sb, true) - sbitmap_cleared(sb); } +EXPORT_SYMBOL_GPL(sbitmap_weight); void sbitmap_show(struct sbitmap *sb, struct seq_file *m) { seq_printf(m, "depth=%u\n", sb->depth); - seq_printf(m, "busy=%u\n", sbitmap_weight(sb) - sbitmap_cleared(sb)); + seq_printf(m, "busy=%u\n", sbitmap_weight(sb)); seq_printf(m, "cleared=%u\n", sbitmap_cleared(sb)); seq_printf(m, "bits_per_word=%u\n", 1U << sb->shift); seq_printf(m, "map_nr=%u\n", sb->map_nr); From patchwork Sun Feb 7 09:20:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 378891 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A9670C43381 for ; Sun, 7 Feb 2021 09:22:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8765264E46 for ; Sun, 7 Feb 2021 09:22:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229720AbhBGJW5 (ORCPT ); Sun, 7 Feb 2021 04:22:57 -0500 Received: from us-smtp-delivery-124.mimecast.com ([63.128.21.124]:32176 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229721AbhBGJWm (ORCPT ); Sun, 7 Feb 2021 04:22:42 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1612689675; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Pik+78pc3JFABkcHE05nCxb/TogRrH96E2yRbjio/BE=; b=F7VJC5bIphG0ozNCrxzjt4ZccCSLiD3pm+O4UpHB/IpD3e88YlKrTkUS5IXCVQgrO0gu67 AfFJ7y2qCdt0X5hGDXKQm8ywB4pigVe76YC9/JYsRXhCFvfWy0VkM07CrdZaFbDsk1k+yv kDcPZVPqnX2YDlwa1JhpiiD9ip6TP1k= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-310-h2HYOvrWPjW5fU0ouVf49w-1; Sun, 07 Feb 2021 04:21:12 -0500 X-MC-Unique: h2HYOvrWPjW5fU0ouVf49w-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 8EB4B8030CC; Sun, 7 Feb 2021 09:21:10 +0000 (UTC) Received: from localhost (ovpn-13-9.pek2.redhat.com [10.72.13.9]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6D684722C6; Sun, 7 Feb 2021 09:21:05 +0000 (UTC) From: Ming Lei To: Jens Axboe , linux-block@vger.kernel.org, "Martin K . Petersen" , linux-scsi@vger.kernel.org Cc: Ming Lei , Omar Sandoval , Kashyap Desai , Sumanesh Samanta , "Ewan D . Milne" , Hannes Reinecke Subject: [PATCH V8 06/13] sbitmap: add helper of sbitmap_calculate_shift Date: Sun, 7 Feb 2021 17:20:22 +0800 Message-Id: <20210207092029.1558550-7-ming.lei@redhat.com> In-Reply-To: <20210207092029.1558550-1-ming.lei@redhat.com> References: <20210207092029.1558550-1-ming.lei@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org Move code for calculating default shift into one public helper, which can be used for SCSI to calculate shift. Cc: Omar Sandoval Cc: Kashyap Desai Cc: Sumanesh Samanta Cc: Ewan D. Milne Reviewed-by: Hannes Reinecke Tested-by: Sumanesh Samanta Signed-off-by: Ming Lei --- include/linux/sbitmap.h | 18 ++++++++++++++++++ lib/sbitmap.c | 16 +++------------- 2 files changed, 21 insertions(+), 13 deletions(-) diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h index c65ba887dcc3..3087e1f15fdd 100644 --- a/include/linux/sbitmap.h +++ b/include/linux/sbitmap.h @@ -332,6 +332,24 @@ static inline int sbitmap_test_bit(struct sbitmap *sb, unsigned int bitnr) return test_bit(SB_NR_TO_BIT(sb, bitnr), __sbitmap_word(sb, bitnr)); } +static inline int sbitmap_calculate_shift(unsigned int depth) +{ + int shift = ilog2(BITS_PER_LONG); + + /* + * If the bitmap is small, shrink the number of bits per word so + * we spread over a few cachelines, at least. If less than 4 + * bits, just forget about it, it's not going to work optimally + * anyway. + */ + if (depth >= 4) { + while ((4U << shift) > depth) + shift--; + } + + return shift; +} + /** * sbitmap_show() - Dump &struct sbitmap information to a &struct seq_file. * @sb: Bitmap to show. diff --git a/lib/sbitmap.c b/lib/sbitmap.c index 73da26ad021e..47b3691058eb 100644 --- a/lib/sbitmap.c +++ b/lib/sbitmap.c @@ -87,19 +87,9 @@ int sbitmap_init_node(struct sbitmap *sb, unsigned int depth, int shift, unsigned int bits_per_word; unsigned int i; - if (shift < 0) { - shift = ilog2(BITS_PER_LONG); - /* - * If the bitmap is small, shrink the number of bits per word so - * we spread over a few cachelines, at least. If less than 4 - * bits, just forget about it, it's not going to work optimally - * anyway. - */ - if (depth >= 4) { - while ((4U << shift) > depth) - shift--; - } - } + if (shift < 0) + shift = sbitmap_calculate_shift(depth); + bits_per_word = 1U << shift; if (bits_per_word > BITS_PER_LONG) return -EINVAL; From patchwork Sun Feb 7 09:20:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 378890 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C25DAC433E0 for ; Sun, 7 Feb 2021 09:23:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8F38564E2A for ; Sun, 7 Feb 2021 09:23:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229729AbhBGJXB (ORCPT ); Sun, 7 Feb 2021 04:23:01 -0500 Received: from us-smtp-delivery-124.mimecast.com ([63.128.21.124]:28335 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229750AbhBGJWz (ORCPT ); Sun, 7 Feb 2021 04:22:55 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1612689689; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ZJco1hPECt0Pm4e65HxzfTKSNzRXUw8o3m5gAaIMOhI=; b=KvgHzCEe8Es9YbYsbG0f5ZC0/XCVTgWZvU10GtqnY9pi5lvW2ECS3mZIQa5xIxtPjPOGpd s+eWo+PlHu/3Q/vPwyIE+GbTfYiAZKGO78bjDL/7Pz2zuRWcXlAnyAATNUbJIFuZk5ew0J oJirdwpdbyHw0Yxap2/Z5DgQr3+5pJU= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-344-awSeoF7LNtOTnHuzbOJ5Bg-1; Sun, 07 Feb 2021 04:21:27 -0500 X-MC-Unique: awSeoF7LNtOTnHuzbOJ5Bg-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id EBE46107ACC7; Sun, 7 Feb 2021 09:21:25 +0000 (UTC) Received: from localhost (ovpn-13-9.pek2.redhat.com [10.72.13.9]) by smtp.corp.redhat.com (Postfix) with ESMTP id C76231412A; Sun, 7 Feb 2021 09:21:19 +0000 (UTC) From: Ming Lei To: Jens Axboe , linux-block@vger.kernel.org, "Martin K . Petersen" , linux-scsi@vger.kernel.org Cc: Ming Lei , Omar Sandoval , Kashyap Desai , Sumanesh Samanta , "Ewan D . Milne" , Hannes Reinecke Subject: [PATCH V8 07/13] blk-mq: add callbacks for storing & retrieving budget token Date: Sun, 7 Feb 2021 17:20:23 +0800 Message-Id: <20210207092029.1558550-8-ming.lei@redhat.com> In-Reply-To: <20210207092029.1558550-1-ming.lei@redhat.com> References: <20210207092029.1558550-1-ming.lei@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org SCSI is the only driver which requires dispatch budget, and it isn't fair to add one field into 'struct request' for storing budget token which will be used in the following patches for improving scsi's device busy scalability. Cc: Omar Sandoval Cc: Kashyap Desai Cc: Sumanesh Samanta Cc: Ewan D. Milne Cc: Hannes Reinecke Reviewed-by: Hannes Reinecke Tested-by: Sumanesh Samanta Signed-off-by: Ming Lei --- drivers/scsi/scsi_lib.c | 18 ++++++++++++++++++ include/linux/blk-mq.h | 9 +++++++++ include/scsi/scsi_cmnd.h | 2 ++ 3 files changed, 29 insertions(+) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 4d2280658559..de3a9a04e958 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1641,6 +1641,20 @@ static bool scsi_mq_get_budget(struct request_queue *q) return false; } +static void scsi_mq_set_rq_budget_token(struct request *req, int token) +{ + struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(req); + + cmd->budget_token = token; +} + +static int scsi_mq_get_rq_budget_token(struct request *req) +{ + struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(req); + + return cmd->budget_token; +} + static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx, const struct blk_mq_queue_data *bd) { @@ -1855,6 +1869,8 @@ static const struct blk_mq_ops scsi_mq_ops_no_commit = { .cleanup_rq = scsi_cleanup_rq, .busy = scsi_mq_lld_busy, .map_queues = scsi_map_queues, + .set_rq_budget_token = scsi_mq_set_rq_budget_token, + .get_rq_budget_token = scsi_mq_get_rq_budget_token, }; @@ -1883,6 +1899,8 @@ static const struct blk_mq_ops scsi_mq_ops = { .cleanup_rq = scsi_cleanup_rq, .busy = scsi_mq_lld_busy, .map_queues = scsi_map_queues, + .set_rq_budget_token = scsi_mq_set_rq_budget_token, + .get_rq_budget_token = scsi_mq_get_rq_budget_token, }; struct request_queue *scsi_mq_alloc_queue(struct scsi_device *sdev) diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index aabbf6830ffc..84c02f2257fe 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -313,6 +313,15 @@ struct blk_mq_ops { */ void (*put_budget)(struct request_queue *); + /* + * @set_rq_budget_toekn: store rq's budget token + */ + void (*set_rq_budget_token)(struct request *, int); + /* + * @get_rq_budget_toekn: retrieve rq's budget token + */ + int (*get_rq_budget_token)(struct request *); + /** * @timeout: Called on request timeout. */ diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h index 69ade4fb71aa..4884f300c896 100644 --- a/include/scsi/scsi_cmnd.h +++ b/include/scsi/scsi_cmnd.h @@ -75,6 +75,8 @@ struct scsi_cmnd { int eh_eflags; /* Used by error handlr */ + int budget_token; + /* * This is set to jiffies as it was when the command was first * allocated. It is used to time how long the command has From patchwork Sun Feb 7 09:20:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 378276 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 121A1C433E0 for ; Sun, 7 Feb 2021 09:23:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B80FE64E46 for ; Sun, 7 Feb 2021 09:23:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229756AbhBGJXc (ORCPT ); Sun, 7 Feb 2021 04:23:32 -0500 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:59248 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229750AbhBGJXL (ORCPT ); Sun, 7 Feb 2021 04:23:11 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1612689705; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nfNal7yolsGv+ClUk9qWMKGLMZPBNEOufaSERj0nHdQ=; b=fEdxMcmOZt4w1Ml9cyC/OcDRv88X6QWDYDm4M0cBkz0idA94YzfoI08GfNJQ6AYHFCKCFB ulGhTM5b11PbHsaB60HuPMvnOFw0RKf7OoF5/d29UOi6yE5CfB6y+VxRf+L7V2VeIW+Bqn oU5D/H0gLNBx4ASX7TY7B6b8k3gN+DA= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-189-iscZifFhO2yy-I2YDyQDRw-1; Sun, 07 Feb 2021 04:21:41 -0500 X-MC-Unique: iscZifFhO2yy-I2YDyQDRw-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 750D810066EF; Sun, 7 Feb 2021 09:21:39 +0000 (UTC) Received: from localhost (ovpn-13-9.pek2.redhat.com [10.72.13.9]) by smtp.corp.redhat.com (Postfix) with ESMTP id E54E65C5AE; Sun, 7 Feb 2021 09:21:32 +0000 (UTC) From: Ming Lei To: Jens Axboe , linux-block@vger.kernel.org, "Martin K . Petersen" , linux-scsi@vger.kernel.org Cc: Ming Lei , Omar Sandoval , Kashyap Desai , Sumanesh Samanta , "Ewan D . Milne" , Hannes Reinecke Subject: [PATCH V8 08/13] blk-mq: return budget token from .get_budget callback Date: Sun, 7 Feb 2021 17:20:24 +0800 Message-Id: <20210207092029.1558550-9-ming.lei@redhat.com> In-Reply-To: <20210207092029.1558550-1-ming.lei@redhat.com> References: <20210207092029.1558550-1-ming.lei@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org SCSI uses one global atomic variable to track queue depth for each LUN/request queue. This way doesn't scale well when there is lots of CPU cores and the disk is very fast. It has been observed that IOPS is affected a lot by tracking queue depth via sdev->device_busy in IO path. Return budget token from .get_budget callback, and the budget token can be passed to driver, so that we can replace the atomic variable with sbitmap_queue, then the scale issue can be fixed. Cc: Omar Sandoval Cc: Kashyap Desai Cc: Sumanesh Samanta Cc: Ewan D. Milne Reviewed-by: Hannes Reinecke Tested-by: Sumanesh Samanta Signed-off-by: Ming Lei --- block/blk-mq-sched.c | 17 +++++++++++++---- block/blk-mq.c | 36 +++++++++++++++++++++++++----------- block/blk-mq.h | 25 +++++++++++++++++++++---- drivers/scsi/scsi_lib.c | 16 +++++++++++----- include/linux/blk-mq.h | 4 ++-- 5 files changed, 72 insertions(+), 26 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index deff4e826e23..962b801b8fb2 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -131,6 +131,7 @@ static int __blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx) do { struct request *rq; + int budget_token; if (e->type->ops.has_work && !e->type->ops.has_work(hctx)) break; @@ -140,12 +141,13 @@ static int __blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx) break; } - if (!blk_mq_get_dispatch_budget(q)) + budget_token = blk_mq_get_dispatch_budget(q); + if (budget_token < 0) break; rq = e->type->ops.dispatch_request(hctx); if (!rq) { - blk_mq_put_dispatch_budget(q); + blk_mq_put_dispatch_budget(q, budget_token); /* * We're releasing without dispatching. Holding the * budget could have blocked any "hctx"s with the @@ -157,6 +159,8 @@ static int __blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx) break; } + blk_mq_set_rq_budget_token(rq, budget_token); + /* * Now this rq owns the budget which has to be released * if this rq won't be queued to driver via .queue_rq() @@ -230,6 +234,8 @@ static int blk_mq_do_dispatch_ctx(struct blk_mq_hw_ctx *hctx) struct request *rq; do { + int budget_token; + if (!list_empty_careful(&hctx->dispatch)) { ret = -EAGAIN; break; @@ -238,12 +244,13 @@ static int blk_mq_do_dispatch_ctx(struct blk_mq_hw_ctx *hctx) if (!sbitmap_any_bit_set(&hctx->ctx_map)) break; - if (!blk_mq_get_dispatch_budget(q)) + budget_token = blk_mq_get_dispatch_budget(q); + if (budget_token < 0) break; rq = blk_mq_dequeue_from_ctx(hctx, ctx); if (!rq) { - blk_mq_put_dispatch_budget(q); + blk_mq_put_dispatch_budget(q, budget_token); /* * We're releasing without dispatching. Holding the * budget could have blocked any "hctx"s with the @@ -255,6 +262,8 @@ static int blk_mq_do_dispatch_ctx(struct blk_mq_hw_ctx *hctx) break; } + blk_mq_set_rq_budget_token(rq, budget_token); + /* * Now this rq owns the budget which has to be released * if this rq won't be queued to driver via .queue_rq() diff --git a/block/blk-mq.c b/block/blk-mq.c index bcca3e9c04e8..92e2a3d1848f 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1304,10 +1304,15 @@ static enum prep_dispatch blk_mq_prep_dispatch_rq(struct request *rq, bool need_budget) { struct blk_mq_hw_ctx *hctx = rq->mq_hctx; + int budget_token = -1; - if (need_budget && !blk_mq_get_dispatch_budget(rq->q)) { - blk_mq_put_driver_tag(rq); - return PREP_DISPATCH_NO_BUDGET; + if (need_budget) { + budget_token = blk_mq_get_dispatch_budget(rq->q); + if (budget_token < 0) { + blk_mq_put_driver_tag(rq); + return PREP_DISPATCH_NO_BUDGET; + } + blk_mq_set_rq_budget_token(rq, budget_token); } if (!blk_mq_get_driver_tag(rq)) { @@ -1324,7 +1329,7 @@ static enum prep_dispatch blk_mq_prep_dispatch_rq(struct request *rq, * together during handling partial dispatch */ if (need_budget) - blk_mq_put_dispatch_budget(rq->q); + blk_mq_put_dispatch_budget(rq->q, budget_token); return PREP_DISPATCH_NO_TAG; } } @@ -1334,12 +1339,16 @@ static enum prep_dispatch blk_mq_prep_dispatch_rq(struct request *rq, /* release all allocated budgets before calling to blk_mq_dispatch_rq_list */ static void blk_mq_release_budgets(struct request_queue *q, - unsigned int nr_budgets) + struct list_head *list) { - int i; + struct request *rq; - for (i = 0; i < nr_budgets; i++) - blk_mq_put_dispatch_budget(q); + list_for_each_entry(rq, list, queuelist) { + int budget_token = blk_mq_get_rq_budget_token(rq); + + if (budget_token >= 0) + blk_mq_put_dispatch_budget(q, budget_token); + } } /* @@ -1437,7 +1446,8 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list, (hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED); bool no_budget_avail = prep == PREP_DISPATCH_NO_BUDGET; - blk_mq_release_budgets(q, nr_budgets); + if (nr_budgets) + blk_mq_release_budgets(q, list); spin_lock(&hctx->lock); list_splice_tail_init(list, &hctx->dispatch); @@ -2036,6 +2046,7 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, { struct request_queue *q = rq->q; bool run_queue = true; + int budget_token; /* * RCU or SRCU read lock is needed before checking quiesced flag. @@ -2053,11 +2064,14 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, if (q->elevator && !bypass_insert) goto insert; - if (!blk_mq_get_dispatch_budget(q)) + budget_token = blk_mq_get_dispatch_budget(q); + if (budget_token < 0) goto insert; + blk_mq_set_rq_budget_token(rq, budget_token); + if (!blk_mq_get_driver_tag(rq)) { - blk_mq_put_dispatch_budget(q); + blk_mq_put_dispatch_budget(q, budget_token); goto insert; } diff --git a/block/blk-mq.h b/block/blk-mq.h index c1458d9502f1..61a5e884a6f5 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -187,17 +187,34 @@ unsigned int blk_mq_in_flight(struct request_queue *q, void blk_mq_in_flight_rw(struct request_queue *q, struct block_device *part, unsigned int inflight[2]); -static inline void blk_mq_put_dispatch_budget(struct request_queue *q) +static inline void blk_mq_put_dispatch_budget(struct request_queue *q, + int budget_token) { if (q->mq_ops->put_budget) - q->mq_ops->put_budget(q); + q->mq_ops->put_budget(q, budget_token); } -static inline bool blk_mq_get_dispatch_budget(struct request_queue *q) +static inline int blk_mq_get_dispatch_budget(struct request_queue *q) { if (q->mq_ops->get_budget) return q->mq_ops->get_budget(q); - return true; + return 0; +} + +static inline void blk_mq_set_rq_budget_token(struct request *rq, int token) +{ + if (token < 0) + return; + + if (rq->q->mq_ops->set_rq_budget_token) + rq->q->mq_ops->set_rq_budget_token(rq, token); +} + +static inline int blk_mq_get_rq_budget_token(struct request *rq) +{ + if (rq->q->mq_ops->get_rq_budget_token) + return rq->q->mq_ops->get_rq_budget_token(rq); + return -1; } static inline void __blk_mq_inc_active_requests(struct blk_mq_hw_ctx *hctx) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index de3a9a04e958..00c36a4bebd9 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -329,6 +329,7 @@ void scsi_device_unbusy(struct scsi_device *sdev, struct scsi_cmnd *cmd) atomic_dec(&starget->target_busy); atomic_dec(&sdev->device_busy); + cmd->budget_token = -1; } static void scsi_kick_queue(struct request_queue *q) @@ -1142,6 +1143,7 @@ void scsi_init_command(struct scsi_device *dev, struct scsi_cmnd *cmd) unsigned long jiffies_at_alloc; int retries, to_clear; bool in_flight; + int budget_token = cmd->budget_token; if (!blk_rq_is_scsi(rq) && !(flags & SCMD_INITIALIZED)) { flags |= SCMD_INITIALIZED; @@ -1170,6 +1172,7 @@ void scsi_init_command(struct scsi_device *dev, struct scsi_cmnd *cmd) cmd->retries = retries; if (in_flight) __set_bit(SCMD_STATE_INFLIGHT, &cmd->state); + cmd->budget_token = budget_token; } @@ -1604,19 +1607,19 @@ static void scsi_mq_done(struct scsi_cmnd *cmd) blk_mq_complete_request(cmd->request); } -static void scsi_mq_put_budget(struct request_queue *q) +static void scsi_mq_put_budget(struct request_queue *q, int budget_token) { struct scsi_device *sdev = q->queuedata; atomic_dec(&sdev->device_busy); } -static bool scsi_mq_get_budget(struct request_queue *q) +static int scsi_mq_get_budget(struct request_queue *q) { struct scsi_device *sdev = q->queuedata; if (scsi_dev_queue_ready(q, sdev)) - return true; + return 0; atomic_inc(&sdev->restarts); @@ -1638,7 +1641,7 @@ static bool scsi_mq_get_budget(struct request_queue *q) if (unlikely(atomic_read(&sdev->device_busy) == 0 && !scsi_device_blocked(sdev))) blk_mq_delay_run_hw_queues(sdev->request_queue, SCSI_QUEUE_DELAY); - return false; + return -1; } static void scsi_mq_set_rq_budget_token(struct request *req, int token) @@ -1666,6 +1669,8 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx, blk_status_t ret; int reason; + WARN_ON_ONCE(cmd->budget_token < 0); + /* * If the device is not in running state we will reject some or all * commands. @@ -1717,7 +1722,8 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx, if (scsi_target(sdev)->can_queue > 0) atomic_dec(&scsi_target(sdev)->target_busy); out_put_budget: - scsi_mq_put_budget(q); + scsi_mq_put_budget(q, cmd->budget_token); + cmd->budget_token = -1; switch (ret) { case BLK_STS_OK: break; diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index 84c02f2257fe..d53472a59353 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -306,12 +306,12 @@ struct blk_mq_ops { * reserved budget. Also we have to handle failure case * of .get_budget for avoiding I/O deadlock. */ - bool (*get_budget)(struct request_queue *); + int (*get_budget)(struct request_queue *); /** * @put_budget: Release the reserved budget. */ - void (*put_budget)(struct request_queue *); + void (*put_budget)(struct request_queue *, int); /* * @set_rq_budget_toekn: store rq's budget token From patchwork Sun Feb 7 09:20:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 378889 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C4742C433E0 for ; Sun, 7 Feb 2021 09:23:39 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 954C364E46 for ; Sun, 7 Feb 2021 09:23:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229770AbhBGJXg (ORCPT ); Sun, 7 Feb 2021 04:23:36 -0500 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:49200 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229760AbhBGJX2 (ORCPT ); Sun, 7 Feb 2021 04:23:28 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1612689720; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fkykNUZDBqt2TEzUDvm2cjbZJhUYeVi4lKGgArR4Hnk=; b=bIIuEwxWUfbOHHmEO5nlQAreGHWbZ4e+TU09GXLH3Tq80JvjB9oZshEKfxsCufQOUU1s7n LcmO9clLLPqyfiCa5Smcl3h1o3GVbdil2dxaTNUi9vScJnjd9WW706dwZX06/5DGxLmaSf UBOS6WnuMdHa5gzFngMUsZKs7Mylh68= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-481-d-Yd7XxmMQO0a399uR_SoQ-1; Sun, 07 Feb 2021 04:21:58 -0500 X-MC-Unique: d-Yd7XxmMQO0a399uR_SoQ-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 8A8E91846092; Sun, 7 Feb 2021 09:21:56 +0000 (UTC) Received: from localhost (ovpn-13-9.pek2.redhat.com [10.72.13.9]) by smtp.corp.redhat.com (Postfix) with ESMTP id A52445D9CA; Sun, 7 Feb 2021 09:21:48 +0000 (UTC) From: Ming Lei To: Jens Axboe , linux-block@vger.kernel.org, "Martin K . Petersen" , linux-scsi@vger.kernel.org Cc: Ming Lei , Omar Sandoval , Kashyap Desai , Sumanesh Samanta , "Ewan D . Milne" , Hannes Reinecke , Christoph Hellwig Subject: [PATCH V8 09/13] scsi: put hot fields of scsi_host_template into one cacheline Date: Sun, 7 Feb 2021 17:20:25 +0800 Message-Id: <20210207092029.1558550-10-ming.lei@redhat.com> In-Reply-To: <20210207092029.1558550-1-ming.lei@redhat.com> References: <20210207092029.1558550-1-ming.lei@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org The following three fields of scsi_host_template are referenced in scsi IO submission path, so put them together into one cacheline: - cmd_size - queuecommand - commit_rqs Cc: Omar Sandoval Cc: Kashyap Desai Cc: Sumanesh Samanta Cc: Ewan D. Milne Cc: Hannes Reinecke Reviewed-by: Hannes Reinecke Reviewed-by: Christoph Hellwig Tested-by: Sumanesh Samanta Signed-off-by: Ming Lei --- include/scsi/scsi_host.h | 72 ++++++++++++++++++++++------------------ 1 file changed, 39 insertions(+), 33 deletions(-) diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h index 701f178b20ae..2d6e3a1f5f0b 100644 --- a/include/scsi/scsi_host.h +++ b/include/scsi/scsi_host.h @@ -30,40 +30,15 @@ struct scsi_transport_template; #define MODE_TARGET 0x02 struct scsi_host_template { - struct module *module; - const char *name; - /* - * The info function will return whatever useful information the - * developer sees fit. If not provided, then the name field will - * be used instead. - * - * Status: OPTIONAL + * Put fields referenced in IO submission path together in + * same cacheline */ - const char *(* info)(struct Scsi_Host *); /* - * Ioctl interface - * - * Status: OPTIONAL - */ - int (*ioctl)(struct scsi_device *dev, unsigned int cmd, - void __user *arg); - - -#ifdef CONFIG_COMPAT - /* - * Compat handler. Handle 32bit ABI. - * When unknown ioctl is passed return -ENOIOCTLCMD. - * - * Status: OPTIONAL + * Additional per-command data allocated for the driver. */ - int (*compat_ioctl)(struct scsi_device *dev, unsigned int cmd, - void __user *arg); -#endif - - int (*init_cmd_priv)(struct Scsi_Host *shost, struct scsi_cmnd *cmd); - int (*exit_cmd_priv)(struct Scsi_Host *shost, struct scsi_cmnd *cmd); + unsigned int cmd_size; /* * The queuecommand function is used to queue up a scsi @@ -111,6 +86,41 @@ struct scsi_host_template { */ void (*commit_rqs)(struct Scsi_Host *, u16); + struct module *module; + const char *name; + + /* + * The info function will return whatever useful information the + * developer sees fit. If not provided, then the name field will + * be used instead. + * + * Status: OPTIONAL + */ + const char *(*info)(struct Scsi_Host *); + + /* + * Ioctl interface + * + * Status: OPTIONAL + */ + int (*ioctl)(struct scsi_device *dev, unsigned int cmd, + void __user *arg); + + +#ifdef CONFIG_COMPAT + /* + * Compat handler. Handle 32bit ABI. + * When unknown ioctl is passed return -ENOIOCTLCMD. + * + * Status: OPTIONAL + */ + int (*compat_ioctl)(struct scsi_device *dev, unsigned int cmd, + void __user *arg); +#endif + + int (*init_cmd_priv)(struct Scsi_Host *shost, struct scsi_cmnd *cmd); + int (*exit_cmd_priv)(struct Scsi_Host *shost, struct scsi_cmnd *cmd); + /* * This is an error handling strategy routine. You don't need to * define one of these if you don't want to - there is a default @@ -478,10 +488,6 @@ struct scsi_host_template { */ u64 vendor_id; - /* - * Additional per-command data allocated for the driver. - */ - unsigned int cmd_size; struct scsi_host_cmd_pool *cmd_pool; /* Delay for runtime autosuspend */ From patchwork Sun Feb 7 09:20:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 378275 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 30DC2C433DB for ; Sun, 7 Feb 2021 09:24:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E24CE64E44 for ; Sun, 7 Feb 2021 09:24:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229725AbhBGJYO (ORCPT ); Sun, 7 Feb 2021 04:24:14 -0500 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:33692 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229760AbhBGJX4 (ORCPT ); Sun, 7 Feb 2021 04:23:56 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1612689748; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5cbAKZsab5lSbZHsOVXdud8NTo655iceg85Z09+ozzg=; b=gu9+km5CKbj+DoOSS8pbYsllBEj1dyI8f3F2dICqXrxLEtY0lqkL8rPSg823Wy/Ycm2xDz 7f0fp5QWN9tO8DLBoY9g+VkDasngW7q8GQ1j46wxZqDv75wJxGehkL2PdHows6AQOqoIe1 bK3seXrCJ/L1PV4dlqbNWusous103NU= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-105-nNp_7GC_PDyAmXhdemacoQ-1; Sun, 07 Feb 2021 04:22:23 -0500 X-MC-Unique: nNp_7GC_PDyAmXhdemacoQ-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 4C370801962; Sun, 7 Feb 2021 09:22:22 +0000 (UTC) Received: from localhost (ovpn-13-9.pek2.redhat.com [10.72.13.9]) by smtp.corp.redhat.com (Postfix) with ESMTP id 098551412A; Sun, 7 Feb 2021 09:22:15 +0000 (UTC) From: Ming Lei To: Jens Axboe , linux-block@vger.kernel.org, "Martin K . Petersen" , linux-scsi@vger.kernel.org Cc: Kashyap Desai , Omar Sandoval , Sumanesh Samanta , "Ewan D . Milne" , Hannes Reinecke , Ming Lei Subject: [PATCH V8 10/13] megaraid_sas: v2 replace sdev_busy with local counter Date: Sun, 7 Feb 2021 17:20:26 +0800 Message-Id: <20210207092029.1558550-11-ming.lei@redhat.com> In-Reply-To: <20210207092029.1558550-1-ming.lei@redhat.com> References: <20210207092029.1558550-1-ming.lei@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org From: Kashyap Desai use local tracking of per sdev outstanding command since sdev_busy in SML is improved for performance reason using sbitmap (earlier it was atomic variable). Cc: Omar Sandoval Cc: Kashyap Desai Cc: Sumanesh Samanta Cc: Ewan D. Milne Reviewed-by: Hannes Reinecke Signed-off-by: Kashyap Desai Fix checkpatch ERROR and WARNING. Signed-off-by: Ming Lei --- drivers/scsi/megaraid/megaraid_sas.h | 2 + drivers/scsi/megaraid/megaraid_sas_fusion.c | 47 +++++++++++++++++---- 2 files changed, 41 insertions(+), 8 deletions(-) diff --git a/drivers/scsi/megaraid/megaraid_sas.h b/drivers/scsi/megaraid/megaraid_sas.h index 0f808d63580e..0c6a56b24c6e 100644 --- a/drivers/scsi/megaraid/megaraid_sas.h +++ b/drivers/scsi/megaraid/megaraid_sas.h @@ -2019,10 +2019,12 @@ union megasas_frame { * struct MR_PRIV_DEVICE - sdev private hostdata * @is_tm_capable: firmware managed tm_capable flag * @tm_busy: TM request is in progress + * @sdev_priv_busy: pending command per sdev */ struct MR_PRIV_DEVICE { bool is_tm_capable; bool tm_busy; + atomic_t sdev_priv_busy; atomic_t r1_ldio_hint; u8 interface_type; u8 task_abort_tmo; diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c index fd607287608e..e9ec9b7b1c36 100644 --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c @@ -220,6 +220,40 @@ megasas_clear_intr_fusion(struct megasas_instance *instance) return 1; } +static inline void +megasas_sdev_busy_inc(struct megasas_instance *instance, + struct scsi_cmnd *scmd) +{ + if (instance->perf_mode == MR_BALANCED_PERF_MODE) { + struct MR_PRIV_DEVICE *mr_device_priv_data = + scmd->device->hostdata; + atomic_inc(&mr_device_priv_data->sdev_priv_busy); + } +} + +static inline void +megasas_sdev_busy_dec(struct megasas_instance *instance, + struct scsi_cmnd *scmd) +{ + if (instance->perf_mode == MR_BALANCED_PERF_MODE) { + struct MR_PRIV_DEVICE *mr_device_priv_data = + scmd->device->hostdata; + atomic_dec(&mr_device_priv_data->sdev_priv_busy); + } +} + +static inline int +megasas_sdev_busy_read(struct megasas_instance *instance, + struct scsi_cmnd *scmd) +{ + if (instance->perf_mode == MR_BALANCED_PERF_MODE) { + struct MR_PRIV_DEVICE *mr_device_priv_data = + scmd->device->hostdata; + return atomic_read(&mr_device_priv_data->sdev_priv_busy); + } + return 0; +} + /** * megasas_get_cmd_fusion - Get a command from the free pool * @instance: Adapter soft state @@ -357,15 +391,9 @@ megasas_get_msix_index(struct megasas_instance *instance, struct megasas_cmd_fusion *cmd, u8 data_arms) { - int sdev_busy; - - /* TBD - if sml remove device_busy in future, driver - * should track counter in internal structure. - */ - sdev_busy = atomic_read(&scmd->device->device_busy); - if (instance->perf_mode == MR_BALANCED_PERF_MODE && - sdev_busy > (data_arms * MR_DEVICE_HIGH_IOPS_DEPTH)) { + (megasas_sdev_busy_read(instance, scmd) > + (data_arms * MR_DEVICE_HIGH_IOPS_DEPTH))) { cmd->request_desc->SCSIIO.MSIxIndex = mega_mod64((atomic64_add_return(1, &instance->high_iops_outstanding) / MR_HIGH_IOPS_BATCH_COUNT), instance->low_latency_index_start); @@ -3390,6 +3418,7 @@ megasas_build_and_issue_cmd_fusion(struct megasas_instance *instance, * Issue the command to the FW */ + megasas_sdev_busy_inc(instance, scmd); megasas_fire_cmd_fusion(instance, req_desc); if (r1_cmd) @@ -3450,6 +3479,7 @@ megasas_complete_r1_command(struct megasas_instance *instance, scmd_local->SCp.ptr = NULL; megasas_return_cmd_fusion(instance, cmd); scsi_dma_unmap(scmd_local); + megasas_sdev_busy_dec(instance, scmd_local); scmd_local->scsi_done(scmd_local); } } @@ -3550,6 +3580,7 @@ complete_cmd_fusion(struct megasas_instance *instance, u32 MSIxIndex, scmd_local->SCp.ptr = NULL; megasas_return_cmd_fusion(instance, cmd_fusion); scsi_dma_unmap(scmd_local); + megasas_sdev_busy_dec(instance, scmd_local); scmd_local->scsi_done(scmd_local); } else /* Optimal VD - R1 FP command completion. */ megasas_complete_r1_command(instance, cmd_fusion); From patchwork Sun Feb 7 09:20:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 378888 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 35866C433DB for ; Sun, 7 Feb 2021 09:24:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DD71B64E2A for ; Sun, 7 Feb 2021 09:24:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229785AbhBGJYW (ORCPT ); Sun, 7 Feb 2021 04:24:22 -0500 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:36239 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229760AbhBGJYR (ORCPT ); Sun, 7 Feb 2021 04:24:17 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1612689770; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kFDQ+NjVkRJSQqXEU2prlgTj89AU1b/gSz5ilEHk3mc=; b=B0w7Mxmk2caBOOlqafpYAKOt7YiwKeHHxMWHW7FcC2TjcYx+gJB44zjGn/o9hMlYL0XiB7 3n5sPICOogdhj1atRA0xLBUVAdtHoPv09aBZXybpv+lqp6Sq/hmXjTK8frPNWgiPDtDmTY GlC5jzKe5vSKnxoy6ScZwWjHCcwjQ1o= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-145-LLwOtc2nPeWnODhiDznr5g-1; Sun, 07 Feb 2021 04:22:48 -0500 X-MC-Unique: LLwOtc2nPeWnODhiDznr5g-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 1A127189DF4F; Sun, 7 Feb 2021 09:22:47 +0000 (UTC) Received: from localhost (ovpn-13-9.pek2.redhat.com [10.72.13.9]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8508B60BF3; Sun, 7 Feb 2021 09:22:37 +0000 (UTC) From: Ming Lei To: Jens Axboe , linux-block@vger.kernel.org, "Martin K . Petersen" , linux-scsi@vger.kernel.org Cc: Ming Lei , Omar Sandoval , Kashyap Desai , Sumanesh Samanta , "Ewan D . Milne" , Hannes Reinecke Subject: [PATCH V8 11/13] scsi: add scsi_device_busy() to read sdev->device_busy Date: Sun, 7 Feb 2021 17:20:27 +0800 Message-Id: <20210207092029.1558550-12-ming.lei@redhat.com> In-Reply-To: <20210207092029.1558550-1-ming.lei@redhat.com> References: <20210207092029.1558550-1-ming.lei@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org Add scsi_device_busy() for drivers, so that we can prepare for tracking device queue depth via sbitmap_queue. Cc: Omar Sandoval Cc: Kashyap Desai Cc: Sumanesh Samanta Cc: Ewan D. Milne Reviewed-by: Hannes Reinecke Tested-by: Sumanesh Samanta Signed-off-by: Ming Lei --- drivers/message/fusion/mptsas.c | 2 +- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 2 +- drivers/scsi/scsi_lib.c | 4 ++-- drivers/scsi/scsi_sysfs.c | 2 +- drivers/scsi/sg.c | 2 +- include/scsi/scsi_device.h | 5 +++++ 6 files changed, 11 insertions(+), 6 deletions(-) diff --git a/drivers/message/fusion/mptsas.c b/drivers/message/fusion/mptsas.c index 5eb0b3361e4e..85aa5788826b 100644 --- a/drivers/message/fusion/mptsas.c +++ b/drivers/message/fusion/mptsas.c @@ -3781,7 +3781,7 @@ mptsas_send_link_status_event(struct fw_event_work *fw_event) printk(MYIOC_s_DEBUG_FMT "SDEV OUTSTANDING CMDS" "%d\n", ioc->name, - atomic_read(&sdev->device_busy))); + scsi_device_busy(sdev))); } } diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index c8b09a81834d..784ad7105d1b 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -3415,7 +3415,7 @@ scsih_dev_reset(struct scsi_cmnd *scmd) MPI2_SCSITASKMGMT_TASKTYPE_LOGICAL_UNIT_RESET, 0, 0, tr_timeout, tr_method); /* Check for busy commands after reset */ - if (r == SUCCESS && atomic_read(&scmd->device->device_busy)) + if (r == SUCCESS && scsi_device_busy(scmd->device)) r = FAILED; out: sdev_printk(KERN_INFO, scmd->device, "device reset: %s scmd(0x%p)\n", diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 00c36a4bebd9..cb56bf456e55 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -385,7 +385,7 @@ static void scsi_single_lun_run(struct scsi_device *current_sdev) static inline bool scsi_device_is_busy(struct scsi_device *sdev) { - if (atomic_read(&sdev->device_busy) >= sdev->queue_depth) + if (scsi_device_busy(sdev) >= sdev->queue_depth) return true; if (atomic_read(&sdev->device_blocked) > 0) return true; @@ -1638,7 +1638,7 @@ static int scsi_mq_get_budget(struct request_queue *q) * the .restarts flag, and the request queue will be run for handling * this request, see scsi_end_request(). */ - if (unlikely(atomic_read(&sdev->device_busy) == 0 && + if (unlikely(scsi_device_busy(sdev) == 0 && !scsi_device_blocked(sdev))) blk_mq_delay_run_hw_queues(sdev->request_queue, SCSI_QUEUE_DELAY); return -1; diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c index b6378c8ca783..0840e44140de 100644 --- a/drivers/scsi/scsi_sysfs.c +++ b/drivers/scsi/scsi_sysfs.c @@ -670,7 +670,7 @@ sdev_show_device_busy(struct device *dev, struct device_attribute *attr, char *buf) { struct scsi_device *sdev = to_scsi_device(dev); - return snprintf(buf, 20, "%d\n", atomic_read(&sdev->device_busy)); + return snprintf(buf, 20, "%d\n", scsi_device_busy(sdev)); } static DEVICE_ATTR(device_busy, S_IRUGO, sdev_show_device_busy, NULL); diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 4383d93110f8..737cea9d908e 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -2503,7 +2503,7 @@ static int sg_proc_seq_show_dev(struct seq_file *s, void *v) scsidp->id, scsidp->lun, (int) scsidp->type, 1, (int) scsidp->queue_depth, - (int) atomic_read(&scsidp->device_busy), + (int) scsi_device_busy(scsidp), (int) scsi_device_online(scsidp)); } read_unlock_irqrestore(&sg_index_lock, iflags); diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h index 1a5c9a3df6d6..dd0b9f690a26 100644 --- a/include/scsi/scsi_device.h +++ b/include/scsi/scsi_device.h @@ -590,6 +590,11 @@ static inline int scsi_device_supports_vpd(struct scsi_device *sdev) return 0; } +static inline int scsi_device_busy(struct scsi_device *sdev) +{ + return atomic_read(&sdev->device_busy); +} + #define MODULE_ALIAS_SCSI_DEVICE(type) \ MODULE_ALIAS("scsi:t-" __stringify(type) "*") #define SCSI_DEVICE_MODALIAS_FMT "scsi:t-0x%02x" From patchwork Sun Feb 7 09:20:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 378274 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 865E1C433DB for ; Sun, 7 Feb 2021 09:25:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5A11864E43 for ; Sun, 7 Feb 2021 09:25:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229683AbhBGJY4 (ORCPT ); Sun, 7 Feb 2021 04:24:56 -0500 Received: from us-smtp-delivery-124.mimecast.com ([63.128.21.124]:25101 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229793AbhBGJYo (ORCPT ); Sun, 7 Feb 2021 04:24:44 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1612689797; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=vYtWTnAXgtFS9ppvSzY14Z8fWYhtiqYqAeHMrlM/v8A=; b=HC08yQXW1G0Kk+HeVDO6WwZlRpedKD2+/05VGWbu0xb0997qWNwPM4HM4StZf8nN2h5CnE rZVGJMrv0AAPnuPRM/yCBpKBnK/FWRW9i/kNwJNAN3H3ejCQXCaqdKeRQ1v+BAXGbMGS01 k5y5fSg8inx9N+YXrN8sm74oqVX1UsM= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-42-ynmM4MOJMG-ep7gMi6xxtg-1; Sun, 07 Feb 2021 04:23:13 -0500 X-MC-Unique: ynmM4MOJMG-ep7gMi6xxtg-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 1CEDE801962; Sun, 7 Feb 2021 09:23:12 +0000 (UTC) Received: from localhost (ovpn-13-9.pek2.redhat.com [10.72.13.9]) by smtp.corp.redhat.com (Postfix) with ESMTP id B2A6F10016FF; Sun, 7 Feb 2021 09:23:01 +0000 (UTC) From: Ming Lei To: Jens Axboe , linux-block@vger.kernel.org, "Martin K . Petersen" , linux-scsi@vger.kernel.org Cc: Ming Lei , Omar Sandoval , Kashyap Desai , Sumanesh Samanta , "Ewan D . Milne" , Hannes Reinecke Subject: [PATCH V8 12/13] scsi: make sure sdev->queue_depth is <= max(shost->can_queue, 1024) Date: Sun, 7 Feb 2021 17:20:28 +0800 Message-Id: <20210207092029.1558550-13-ming.lei@redhat.com> In-Reply-To: <20210207092029.1558550-1-ming.lei@redhat.com> References: <20210207092029.1558550-1-ming.lei@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org Limit scsi device's queue depth is less than max(host->can_queue, 1024) in scsi_change_queue_depth(), and 1024 is big enough for saturating current fast SCSI LUN(SSD, or raid volume on multiple SSDs). Also single hw queue depth is usually enough for saturating single LUN because per-core performance is often considered in storage design. We need this patch for replacing sdev->device_busy with sbitmap which has to be pre-allocated with reasonable max depth. Cc: Omar Sandoval Cc: Kashyap Desai Cc: Sumanesh Samanta Cc: Ewan D. Milne Tested-by: Sumanesh Samanta Reviewed-by: Hannes Reinecke Signed-off-by: Ming Lei --- drivers/scsi/scsi.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index 24619c3bebd5..a28d48c850cf 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -214,6 +214,15 @@ void scsi_finish_command(struct scsi_cmnd *cmd) scsi_io_completion(cmd, good_bytes); } + +/* + * 1024 is big enough for saturating the fast scsi LUN now + */ +static int scsi_device_max_queue_depth(struct scsi_device *sdev) +{ + return max_t(int, sdev->host->can_queue, 1024); +} + /** * scsi_change_queue_depth - change a device's queue depth * @sdev: SCSI Device in question @@ -223,6 +232,8 @@ void scsi_finish_command(struct scsi_cmnd *cmd) */ int scsi_change_queue_depth(struct scsi_device *sdev, int depth) { + depth = min_t(int, depth, scsi_device_max_queue_depth(sdev)); + if (depth > 0) { sdev->queue_depth = depth; wmb(); From patchwork Sun Feb 7 09:20:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 378887 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A68D8C433E0 for ; Sun, 7 Feb 2021 09:25:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7B56F64E44 for ; Sun, 7 Feb 2021 09:25:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229756AbhBGJZ2 (ORCPT ); Sun, 7 Feb 2021 04:25:28 -0500 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:26140 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229720AbhBGJZQ (ORCPT ); Sun, 7 Feb 2021 04:25:16 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1612689829; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=2pDH+CfvxMVlIbe49p7gST/wSckHi6i73Qo48dNi0Jk=; b=AjkAy32zOafe8lewDGMRFXRZbTVwmgtOycGKOh0JnSk3EtzOU4+abNx946MJRAGPzSTFdo PvZFAjUEq2JVW1P8LDzWZUq/hFkpO0o7PazQeWEAMl+Rckpd9CAVu+ar8Er3sW/tTtkbuG i06ce6UekTScI9c8deY81sPnEqwvPB8= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-417-dV7Swrt4NMKljbQY1Tn59g-1; Sun, 07 Feb 2021 04:23:47 -0500 X-MC-Unique: dV7Swrt4NMKljbQY1Tn59g-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 47C001005501; Sun, 7 Feb 2021 09:23:46 +0000 (UTC) Received: from localhost (ovpn-13-9.pek2.redhat.com [10.72.13.9]) by smtp.corp.redhat.com (Postfix) with ESMTP id 98B9C5D9CA; Sun, 7 Feb 2021 09:23:40 +0000 (UTC) From: Ming Lei To: Jens Axboe , linux-block@vger.kernel.org, "Martin K . Petersen" , linux-scsi@vger.kernel.org Cc: Ming Lei , Omar Sandoval , Kashyap Desai , Sumanesh Samanta , "Ewan D . Milne" , Hannes Reinecke Subject: [PATCH V8 13/13] scsi: replace sdev->device_busy with sbitmap Date: Sun, 7 Feb 2021 17:20:29 +0800 Message-Id: <20210207092029.1558550-14-ming.lei@redhat.com> In-Reply-To: <20210207092029.1558550-1-ming.lei@redhat.com> References: <20210207092029.1558550-1-ming.lei@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org scsi requires one global atomic variable to track queue depth for each LUN/ request queue, meantime blk-mq tracks queue depth for each hctx. This SCSI's requirement can't be implemented in blk-mq easily, cause it is a bigger & harder problem to spread the device or request queue's depth among all hw queues. The current approach by using atomic variable can't scale well when there is lots of CPU cores and the disk is very fast and IO are submitted to this device concurrently. It has been observed that IOPS is affected a lot by tracking queue depth via sdev->device_busy in IO path. So replace the atomic variable sdev->device_busy with sbitmap for tracking scsi device queue depth. It is observed that IOPS is improved ~30% by this patchset in the following test: 1) test machine(32 logical CPU cores) Thread(s) per core: 2 Core(s) per socket: 8 Socket(s): 2 NUMA node(s): 2 Model name: Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz 2) setup scsi_debug: modprobe scsi_debug virtual_gb=128 max_luns=1 submit_queues=32 delay=0 max_queue=256 3) fio script: fio --rw=randread --size=128G --direct=1 --ioengine=libaio --iodepth=2048 \ --numjobs=32 --bs=4k --group_reporting=1 --group_reporting=1 --runtime=60 \ --loops=10000 --name=job1 --filename=/dev/sdN [1] https://lore.kernel.org/linux-block/20200119071432.18558-6-ming.lei@redhat.com/ Cc: Omar Sandoval Cc: Kashyap Desai Cc: Sumanesh Samanta Cc: Ewan D. Milne Cc: Hannes Reinecke Tested-by: Sumanesh Samanta Reviewed-by: Hannes Reinecke Signed-off-by: Ming Lei --- drivers/scsi/scsi.c | 4 +++- drivers/scsi/scsi_lib.c | 35 ++++++++++++++++++----------------- drivers/scsi/scsi_priv.h | 3 +++ drivers/scsi/scsi_scan.c | 23 +++++++++++++++++++++-- drivers/scsi/scsi_sysfs.c | 2 ++ include/scsi/scsi_device.h | 5 +++-- 6 files changed, 50 insertions(+), 22 deletions(-) diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index a28d48c850cf..e9e2f0e15ac8 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -218,7 +218,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd) /* * 1024 is big enough for saturating the fast scsi LUN now */ -static int scsi_device_max_queue_depth(struct scsi_device *sdev) +int scsi_device_max_queue_depth(struct scsi_device *sdev) { return max_t(int, sdev->host->can_queue, 1024); } @@ -242,6 +242,8 @@ int scsi_change_queue_depth(struct scsi_device *sdev, int depth) if (sdev->request_queue) blk_set_queue_depth(sdev->request_queue, depth); + sbitmap_resize(&sdev->budget_map, sdev->queue_depth); + return sdev->queue_depth; } EXPORT_SYMBOL(scsi_change_queue_depth); diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index cb56bf456e55..65807cac6228 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -328,7 +328,7 @@ void scsi_device_unbusy(struct scsi_device *sdev, struct scsi_cmnd *cmd) if (starget->can_queue > 0) atomic_dec(&starget->target_busy); - atomic_dec(&sdev->device_busy); + sbitmap_put(&sdev->budget_map, cmd->budget_token); cmd->budget_token = -1; } @@ -1256,19 +1256,20 @@ scsi_device_state_check(struct scsi_device *sdev, struct request *req) } /* - * scsi_dev_queue_ready: if we can send requests to sdev, return 1 else - * return 0. - * - * Called with the queue_lock held. + * scsi_dev_queue_ready: if we can send requests to sdev, assign one token + * and return the token else return -1. */ static inline int scsi_dev_queue_ready(struct request_queue *q, struct scsi_device *sdev) { - unsigned int busy; + int token; - busy = atomic_inc_return(&sdev->device_busy) - 1; + token = sbitmap_get(&sdev->budget_map); if (atomic_read(&sdev->device_blocked)) { - if (busy) + if (token < 0) + goto out; + + if (scsi_device_busy(sdev) > 1) goto out_dec; /* @@ -1280,13 +1281,12 @@ static inline int scsi_dev_queue_ready(struct request_queue *q, "unblocking device at zero depth\n")); } - if (busy >= sdev->queue_depth) - goto out_dec; - - return 1; + return token; out_dec: - atomic_dec(&sdev->device_busy); - return 0; + if (token >= 0) + sbitmap_put(&sdev->budget_map, token); +out: + return -1; } /* @@ -1611,15 +1611,16 @@ static void scsi_mq_put_budget(struct request_queue *q, int budget_token) { struct scsi_device *sdev = q->queuedata; - atomic_dec(&sdev->device_busy); + sbitmap_put(&sdev->budget_map, budget_token); } static int scsi_mq_get_budget(struct request_queue *q) { struct scsi_device *sdev = q->queuedata; + int token = scsi_dev_queue_ready(q, sdev); - if (scsi_dev_queue_ready(q, sdev)) - return 0; + if (token >= 0) + return token; atomic_inc(&sdev->restarts); diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h index 180636d54982..30b35002d2f8 100644 --- a/drivers/scsi/scsi_priv.h +++ b/drivers/scsi/scsi_priv.h @@ -5,6 +5,7 @@ #include #include #include +#include struct request_queue; struct request; @@ -182,6 +183,8 @@ static inline void scsi_dh_add_device(struct scsi_device *sdev) { } static inline void scsi_dh_release_device(struct scsi_device *sdev) { } #endif +extern int scsi_device_max_queue_depth(struct scsi_device *sdev); + /* * internal scsi timeout functions: for use by mid-layer and transport * classes. diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c index 9af50e6f94c4..9f1b7f3c650a 100644 --- a/drivers/scsi/scsi_scan.c +++ b/drivers/scsi/scsi_scan.c @@ -215,6 +215,7 @@ static void scsi_unlock_floptical(struct scsi_device *sdev, static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget, u64 lun, void *hostdata) { + unsigned int depth; struct scsi_device *sdev; int display_failure_msg = 1, ret; struct Scsi_Host *shost = dev_to_shost(starget->dev.parent); @@ -276,8 +277,25 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget, WARN_ON_ONCE(!blk_get_queue(sdev->request_queue)); sdev->request_queue->queuedata = sdev; - scsi_change_queue_depth(sdev, sdev->host->cmd_per_lun ? - sdev->host->cmd_per_lun : 1); + depth = sdev->host->cmd_per_lun ?: 1; + + /* + * Use .can_queue as budget map's depth because we have to + * support adjusting queue depth from sysfs. Meantime use + * default device queue depth to figure out sbitmap shift + * since we use this queue depth most of times. + */ + if (sbitmap_init_node(&sdev->budget_map, + scsi_device_max_queue_depth(sdev), + sbitmap_calculate_shift(depth), + GFP_KERNEL, sdev->request_queue->node, + false, true)) { + put_device(&starget->dev); + kfree(sdev); + goto out; + } + + scsi_change_queue_depth(sdev, depth); scsi_sysfs_device_initialize(sdev); @@ -979,6 +997,7 @@ static int scsi_add_lun(struct scsi_device *sdev, unsigned char *inq_result, scsi_attach_vpd(sdev); sdev->max_queue_depth = sdev->queue_depth; + WARN_ON_ONCE(sdev->max_queue_depth > sdev->budget_map.depth); sdev->sdev_bflags = *bflags; /* diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c index 0840e44140de..7fb2f70e97c8 100644 --- a/drivers/scsi/scsi_sysfs.c +++ b/drivers/scsi/scsi_sysfs.c @@ -477,6 +477,8 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work) /* NULL queue means the device can't be used */ sdev->request_queue = NULL; + sbitmap_free(&sdev->budget_map); + mutex_lock(&sdev->inquiry_mutex); vpd_pg0 = rcu_replace_pointer(sdev->vpd_pg0, vpd_pg0, lockdep_is_held(&sdev->inquiry_mutex)); diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h index dd0b9f690a26..05c7c320ef32 100644 --- a/include/scsi/scsi_device.h +++ b/include/scsi/scsi_device.h @@ -8,6 +8,7 @@ #include #include #include +#include struct device; struct request_queue; @@ -106,7 +107,7 @@ struct scsi_device { struct list_head siblings; /* list of all devices on this host */ struct list_head same_target_siblings; /* just the devices sharing same target id */ - atomic_t device_busy; /* commands actually active on LLDD */ + struct sbitmap budget_map; atomic_t device_blocked; /* Device returned QUEUE_FULL. */ atomic_t restarts; @@ -592,7 +593,7 @@ static inline int scsi_device_supports_vpd(struct scsi_device *sdev) static inline int scsi_device_busy(struct scsi_device *sdev) { - return atomic_read(&sdev->device_busy); + return sbitmap_weight(&sdev->budget_map); } #define MODULE_ALIAS_SCSI_DEVICE(type) \