From patchwork Sun Apr 25 08:57:46 2021
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 427396
From: Ming Lei
To: linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org,
    Jens Axboe, linux-block@vger.kernel.org, "Martin K. Petersen",
    Christoph Hellwig
Cc: Bart Van Assche, Khazhy Kumykov, Shin'ichiro Kawasaki,
    Hannes Reinecke, John Garry, David Jeffery, Ming Lei
Subject: [PATCH 1/8] Revert "blk-mq: Fix races between
 blk_mq_update_nr_hw_queues() and iterating over tags"
Date: Sun, 25 Apr 2021 16:57:46 +0800
Message-Id: <20210425085753.2617424-2-ming.lei@redhat.com>
In-Reply-To: <20210425085753.2617424-1-ming.lei@redhat.com>
References: <20210425085753.2617424-1-ming.lei@redhat.com>
X-Mailing-List: linux-scsi@vger.kernel.org

This reverts commit ac81d1ffd022b432d24fe79adf2d31f81a4acdc3.
Signed-off-by: Ming Lei
---
 block/blk-mq-tag.c | 39 ---------------------------------------
 1 file changed, 39 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index b0e0f074a864..39d5c9190a6b 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -376,31 +376,6 @@ void blk_mq_all_tag_iter_atomic(struct blk_mq_tags *tags, busy_tag_iter_fn *fn,
 	__blk_mq_all_tag_iter(tags, fn, priv, BT_TAG_ITER_STATIC_RQS);
 }
 
-/*
- * Iterate over all request queues in a tag set, find the first queue with a
- * non-zero usage count, increment its usage count and return the pointer to
- * that request queue. This prevents that blk_mq_update_nr_hw_queues() will
- * modify @set->nr_hw_queues while iterating over tags since
- * blk_mq_update_nr_hw_queues() only modifies @set->nr_hw_queues while the
- * usage count of all queues associated with a tag set is zero.
- */
-static struct request_queue *
-blk_mq_get_any_tagset_queue(struct blk_mq_tag_set *set)
-{
-	struct request_queue *q;
-
-	rcu_read_lock();
-	list_for_each_entry_rcu(q, &set->tag_list, tag_set_list) {
-		if (percpu_ref_tryget(&q->q_usage_counter)) {
-			rcu_read_unlock();
-			return q;
-		}
-	}
-	rcu_read_unlock();
-
-	return NULL;
-}
-
 /**
  * blk_mq_tagset_busy_iter - iterate over all started requests in a tag set
  * @tagset:	Tag set to iterate over.
@@ -416,22 +391,15 @@ blk_mq_get_any_tagset_queue(struct blk_mq_tag_set *set)
 void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
 		busy_tag_iter_fn *fn, void *priv)
 {
-	struct request_queue *q;
 	int i;
 
 	might_sleep();
 
-	q = blk_mq_get_any_tagset_queue(tagset);
-	if (!q)
-		return;
-
 	for (i = 0; i < tagset->nr_hw_queues; i++) {
 		if (tagset->tags && tagset->tags[i])
 			__blk_mq_all_tag_iter(tagset->tags[i], fn, priv,
 				BT_TAG_ITER_STARTED | BT_TAG_ITER_MAY_SLEEP);
 	}
-
-	blk_queue_exit(q);
 }
 EXPORT_SYMBOL(blk_mq_tagset_busy_iter);
 
@@ -450,20 +418,13 @@ EXPORT_SYMBOL(blk_mq_tagset_busy_iter);
 void blk_mq_tagset_busy_iter_atomic(struct blk_mq_tag_set *tagset,
 		busy_tag_iter_fn *fn, void *priv)
 {
-	struct request_queue *q;
 	int i;
 
-	q = blk_mq_get_any_tagset_queue(tagset);
-	if (!q)
-		return;
-
 	for (i = 0; i < tagset->nr_hw_queues; i++) {
 		if (tagset->tags && tagset->tags[i])
 			__blk_mq_all_tag_iter(tagset->tags[i], fn, priv,
 					      BT_TAG_ITER_STARTED);
 	}
-
-	blk_queue_exit(q);
 }
 EXPORT_SYMBOL(blk_mq_tagset_busy_iter_atomic);
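For readers new to these iterators: blk_mq_tagset_busy_iter() is what drivers use
to walk the started requests of a tag set, and the races this series deals with
(and here reverts the earlier fix for) happen between such walks and requests
being freed or hardware queues being updated. A minimal, hypothetical caller
might look like the sketch below; the example_* names are illustrative and not
part of this patch:

	#include <linux/blk-mq.h>

	/* Hypothetical callback: count started requests in a tag set. */
	static bool example_count_inflight(struct request *rq, void *data,
					   bool reserved)
	{
		unsigned int *count = data;

		(*count)++;
		return true;	/* keep iterating */
	}

	/* Hypothetical helper in a driver that owns a struct blk_mq_tag_set. */
	static unsigned int example_inflight_requests(struct blk_mq_tag_set *set)
	{
		unsigned int count = 0;

		blk_mq_tagset_busy_iter(set, example_count_inflight, &count);
		return count;
	}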
From patchwork Sun Apr 25 08:57:48 2021
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 427395
From: Ming Lei
To: linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org,
    Jens Axboe, linux-block@vger.kernel.org, "Martin K. Petersen",
    Christoph Hellwig
Cc: Bart Van Assche, Khazhy Kumykov, Shin'ichiro Kawasaki,
    Hannes Reinecke, John Garry, David Jeffery, Ming Lei
Subject: [PATCH 3/8] Revert "blk-mq: Fix races between iterating over
 requests and freeing requests"
Date: Sun, 25 Apr 2021 16:57:48 +0800
Message-Id: <20210425085753.2617424-4-ming.lei@redhat.com>
In-Reply-To: <20210425085753.2617424-1-ming.lei@redhat.com>
References: <20210425085753.2617424-1-ming.lei@redhat.com>
X-Mailing-List: linux-scsi@vger.kernel.org

This reverts commit 5ba3f5a6ca7ee2dffcae7fab25a1a1053e3264cb.

Signed-off-by: Ming Lei
---
 block/blk-core.c   | 34 +-------------------------------
 block/blk-mq-tag.c | 51 ++++++----------------------------------------
 block/blk-mq-tag.h |  4 +---
 block/blk-mq.c     | 21 ++++---------------
 block/blk-mq.h     |  1 -
 block/blk.h        |  2 --
 block/elevator.c   |  1 -
 7 files changed, 12 insertions(+), 102 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index ca7f833e25a8..9bcdae93f6d4 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -279,36 +279,6 @@ void blk_dump_rq_flags(struct request *rq, char *msg)
 }
 EXPORT_SYMBOL(blk_dump_rq_flags);
 
-/**
- * blk_mq_wait_for_tag_iter - wait for preexisting tag iteration functions to finish
- * @set: Tag set to wait on.
- *
- * Waits for preexisting calls of blk_mq_all_tag_iter(),
- * blk_mq_tagset_busy_iter() and also for their atomic variants to finish
- * accessing hctx->tags->rqs[]. New readers may start while this function is
- * in progress or after this function has returned. Use this function to make
- * sure that hctx->tags->rqs[] changes have become globally visible.
- *
- * Waits for preexisting blk_mq_queue_tag_busy_iter(q, fn, priv) calls to
- * finish accessing requests associated with other request queues than 'q'.
- */
-void blk_mq_wait_for_tag_iter(struct blk_mq_tag_set *set)
-{
-	struct blk_mq_tags *tags;
-	int i;
-
-	if (set->tags) {
-		for (i = 0; i < set->nr_hw_queues; i++) {
-			tags = set->tags[i];
-			if (!tags)
-				continue;
-			down_write(&tags->iter_rwsem);
-			up_write(&tags->iter_rwsem);
-		}
-	}
-	synchronize_rcu();
-}
-
 /**
  * blk_sync_queue - cancel any pending callbacks on a queue
  * @q: the queue
@@ -442,10 +412,8 @@ void blk_cleanup_queue(struct request_queue *q)
 	 * it is safe to free requests now.
 	 */
 	mutex_lock(&q->sysfs_lock);
-	if (q->elevator) {
-		blk_mq_wait_for_tag_iter(q->tag_set);
+	if (q->elevator)
 		blk_mq_sched_free_requests(q);
-	}
 	mutex_unlock(&q->sysfs_lock);
 
 	percpu_ref_exit(&q->q_usage_counter);
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 39d5c9190a6b..d8eaa38a1bd1 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -209,24 +209,14 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
 	if (!reserved)
 		bitnr += tags->nr_reserved_tags;
-	rcu_read_lock();
-	/*
-	 * The request 'rq' points at is protected by an RCU read lock until
-	 * its queue pointer has been verified and by q_usage_count while the
-	 * callback function is being invoked. See also the
-	 * percpu_ref_tryget() and blk_queue_exit() calls in
-	 * blk_mq_queue_tag_busy_iter().
-	 */
-	rq = rcu_dereference(tags->rqs[bitnr]);
+	rq = tags->rqs[bitnr];
+
 	/*
 	 * We can hit rq == NULL here, because the tagging functions
 	 * test and set the bit before assigning ->rqs[].
 	 */
-	if (rq && rq->q == hctx->queue && rq->mq_hctx == hctx) {
-		rcu_read_unlock();
+	if (rq && rq->q == hctx->queue && rq->mq_hctx == hctx)
 		return iter_data->fn(hctx, rq, iter_data->data, reserved);
-	}
-	rcu_read_unlock();
 	return true;
 }
 
@@ -264,17 +254,11 @@ struct bt_tags_iter_data {
 	unsigned int flags;
 };
 
-/* Include reserved tags. */
 #define BT_TAG_ITER_RESERVED		(1 << 0)
-/* Only include started requests. */
 #define BT_TAG_ITER_STARTED		(1 << 1)
-/* Iterate over tags->static_rqs[] instead of tags->rqs[]. */
 #define BT_TAG_ITER_STATIC_RQS		(1 << 2)
-/* The callback function may sleep. */
-#define BT_TAG_ITER_MAY_SLEEP		(1 << 3)
 
-static bool __bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr,
-			   void *data)
+static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
 {
 	struct bt_tags_iter_data *iter_data = data;
 	struct blk_mq_tags *tags = iter_data->tags;
@@ -291,8 +275,7 @@ static bool __bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr,
 	if (iter_data->flags & BT_TAG_ITER_STATIC_RQS)
 		rq = tags->static_rqs[bitnr];
 	else
-		rq = rcu_dereference_check(tags->rqs[bitnr],
-					   lockdep_is_held(&tags->iter_rwsem));
+		rq = tags->rqs[bitnr];
 	if (!rq)
 		return true;
 	if ((iter_data->flags & BT_TAG_ITER_STARTED) &&
@@ -301,25 +284,6 @@ static bool __bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr,
 	return iter_data->fn(rq, iter_data->data, reserved);
 }
 
-static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
-{
-	struct bt_tags_iter_data *iter_data = data;
-	struct blk_mq_tags *tags = iter_data->tags;
-	bool res;
-
-	if (iter_data->flags & BT_TAG_ITER_MAY_SLEEP) {
-		down_read(&tags->iter_rwsem);
-		res = __bt_tags_iter(bitmap, bitnr, data);
-		up_read(&tags->iter_rwsem);
-	} else {
-		rcu_read_lock();
-		res = __bt_tags_iter(bitmap, bitnr, data);
-		rcu_read_unlock();
-	}
-
-	return res;
-}
-
 /**
  * bt_tags_for_each - iterate over the requests in a tag map
  * @tags:	Tag map to iterate over.
@@ -393,12 +357,10 @@ void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
 {
 	int i;
 
-	might_sleep();
-
 	for (i = 0; i < tagset->nr_hw_queues; i++) {
 		if (tagset->tags && tagset->tags[i])
 			__blk_mq_all_tag_iter(tagset->tags[i], fn, priv,
-				BT_TAG_ITER_STARTED | BT_TAG_ITER_MAY_SLEEP);
+					      BT_TAG_ITER_STARTED);
 	}
 }
 EXPORT_SYMBOL(blk_mq_tagset_busy_iter);
@@ -582,7 +544,6 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
 	tags->nr_tags = total_tags;
 	tags->nr_reserved_tags = reserved_tags;
-	init_rwsem(&tags->iter_rwsem);
 
 	if (blk_mq_is_sbitmap_shared(flags))
 		return tags;
diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
index d1d73d7cc7df..0290c308ece9 100644
--- a/block/blk-mq-tag.h
+++ b/block/blk-mq-tag.h
@@ -17,11 +17,9 @@ struct blk_mq_tags {
 	struct sbitmap_queue __bitmap_tags;
 	struct sbitmap_queue __breserved_tags;
 
-	struct request __rcu **rqs;
+	struct request **rqs;
 	struct request **static_rqs;
 	struct list_head page_list;
-
-	struct rw_semaphore iter_rwsem;
 };
 
 extern struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags,
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8b59f6b4ec8e..79c01b1f885c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -496,10 +496,8 @@ static void __blk_mq_free_request(struct request *rq)
 	blk_crypto_free_request(rq);
 	blk_pm_mark_last_busy(rq);
 	rq->mq_hctx = NULL;
-	if (rq->tag != BLK_MQ_NO_TAG) {
+	if (rq->tag != BLK_MQ_NO_TAG)
 		blk_mq_put_tag(hctx->tags, ctx, rq->tag);
-		rcu_assign_pointer(hctx->tags->rqs[rq->tag], NULL);
-	}
 	if (sched_tag != BLK_MQ_NO_TAG)
 		blk_mq_put_tag(hctx->sched_tags, ctx, sched_tag);
 	blk_mq_sched_restart(hctx);
@@ -841,20 +839,9 @@ EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
 
 struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag)
 {
-	struct request *rq;
-
 	if (tag < tags->nr_tags) {
-		/*
-		 * Freeing tags happens with the request queue frozen so the
-		 * rcu dereference below is protected by the request queue
-		 * usage count. We can only verify that usage count after
-		 * having read the request pointer.
-		 */
-		rq = rcu_dereference_check(tags->rqs[tag], true);
-		WARN_ON_ONCE(IS_ENABLED(CONFIG_PROVE_RCU) && rq &&
-			     percpu_ref_is_zero(&rq->q->q_usage_counter));
-		prefetch(rq);
-		return rq;
+		prefetch(tags->rqs[tag]);
+		return tags->rqs[tag];
 	}
 
 	return NULL;
@@ -1125,7 +1112,7 @@ static bool blk_mq_get_driver_tag(struct request *rq)
 		rq->rq_flags |= RQF_MQ_INFLIGHT;
 		__blk_mq_inc_active_requests(hctx);
 	}
-	rcu_assign_pointer(hctx->tags->rqs[rq->tag], rq);
+	hctx->tags->rqs[rq->tag] = rq;
 	return true;
 }
 
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 9ccb1818303b..3616453ca28c 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -226,7 +226,6 @@ static inline void __blk_mq_put_driver_tag(struct blk_mq_hw_ctx *hctx,
 					   struct request *rq)
 {
 	blk_mq_put_tag(hctx->tags, rq->mq_ctx, rq->tag);
-	rcu_assign_pointer(hctx->tags->rqs[rq->tag], NULL);
 	rq->tag = BLK_MQ_NO_TAG;
 
 	if (rq->rq_flags & RQF_MQ_INFLIGHT) {
diff --git a/block/blk.h b/block/blk.h
index d88b0823738c..529233957207 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -185,8 +185,6 @@ bool blk_bio_list_merge(struct request_queue *q, struct list_head *list,
 void blk_account_io_start(struct request *req);
 void blk_account_io_done(struct request *req, u64 now);
 
-void blk_mq_wait_for_tag_iter(struct blk_mq_tag_set *set);
-
 /*
  * Internal elevator interface
  */
diff --git a/block/elevator.c b/block/elevator.c
index aae9cff6d5ae..7c486ce858e0 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -201,7 +201,6 @@ static void elevator_exit(struct request_queue *q, struct elevator_queue *e)
 {
 	lockdep_assert_held(&q->sysfs_lock);
 
-	blk_mq_wait_for_tag_iter(q->tag_set);
 	blk_mq_sched_free_requests(q);
 	__elevator_exit(q, e);
 }
From patchwork Sun Apr 25 08:57:50 2021
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 427394
From: Ming Lei
To: linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org,
    Jens Axboe, linux-block@vger.kernel.org, "Martin K. Petersen",
    Christoph Hellwig
Cc: Bart Van Assche, Khazhy Kumykov, Shin'ichiro Kawasaki,
    Hannes Reinecke, John Garry, David Jeffery, Ming Lei
Subject: [PATCH 5/8] blk-mq: blk_mq_complete_request_locally
Date: Sun, 25 Apr 2021 16:57:50 +0800
Message-Id: <20210425085753.2617424-6-ming.lei@redhat.com>
In-Reply-To: <20210425085753.2617424-1-ming.lei@redhat.com>
References: <20210425085753.2617424-1-ming.lei@redhat.com>
X-Mailing-List: linux-scsi@vger.kernel.org

Add blk_mq_complete_request_locally() for completing requests from
blk_mq_tagset_busy_iter(), so that we can avoid request use-after-free
(UAF) caused by queue releasing or request freeing.

Signed-off-by: Ming Lei
---
 block/blk-mq.c         | 16 ++++++++++++++++
 include/linux/blk-mq.h |  1 +
 2 files changed, 17 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 927189a55575..e3d1067b10c3 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -681,6 +681,22 @@ void blk_mq_complete_request(struct request *rq)
 }
 EXPORT_SYMBOL(blk_mq_complete_request);
 
+/**
+ * blk_mq_complete_request_locally - end I/O on a request locally
+ * @rq:		the request being processed
+ *
+ * Description:
+ *	Complete a request by calling its ->complete() callback directly
+ *	on the current CPU. This is usually used in error handling via
+ *	blk_mq_tagset_busy_iter().
+ **/
+void blk_mq_complete_request_locally(struct request *rq)
+{
+	WRITE_ONCE(rq->state, MQ_RQ_COMPLETE);
+	rq->q->mq_ops->complete(rq);
+}
+EXPORT_SYMBOL(blk_mq_complete_request_locally);
+
 static void hctx_unlock(struct blk_mq_hw_ctx *hctx, int srcu_idx)
 	__releases(hctx->srcu)
 {
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 2c473c9b8990..f630bf9e497e 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -511,6 +511,7 @@ void blk_mq_kick_requeue_list(struct request_queue *q);
 void blk_mq_delay_kick_requeue_list(struct request_queue *q,
 				    unsigned long msecs);
 void blk_mq_complete_request(struct request *rq);
 bool blk_mq_complete_request_remote(struct request *rq);
+void blk_mq_complete_request_locally(struct request *rq);
 bool blk_mq_queue_stopped(struct request_queue *q);
 void blk_mq_stop_hw_queue(struct blk_mq_hw_ctx *hctx);
 void blk_mq_start_hw_queue(struct blk_mq_hw_ctx *hctx);
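To make the intended use concrete, below is a minimal, hypothetical driver
error-handling path built on the new helper. The example_* names and the
per-command status field are assumptions for illustration, not part of this
patch:

	#include <linux/blk-mq.h>

	/* Hypothetical per-request driver data. */
	struct example_cmd {
		blk_status_t status;
	};

	/* Run by blk_mq_tagset_busy_iter() for every started request. */
	static bool example_cancel_request(struct request *rq, void *data,
					   bool reserved)
	{
		struct example_cmd *cmd = blk_mq_rq_to_pdu(rq);

		cmd->status = BLK_STS_IOERR;
		/* Complete on this CPU instead of the remote completion path. */
		blk_mq_complete_request_locally(rq);
		return true;
	}

	/* Hypothetical teardown path: fail every in-flight request. */
	static void example_cancel_all(struct blk_mq_tag_set *set)
	{
		blk_mq_tagset_busy_iter(set, example_cancel_request, NULL);
	}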
From patchwork Sun Apr 25 08:57:52 2021
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 427393
From: Ming Lei
To: linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org,
    Jens Axboe, linux-block@vger.kernel.org, "Martin K. Petersen",
    Christoph Hellwig
Cc: Bart Van Assche, Khazhy Kumykov, Shin'ichiro Kawasaki,
    Hannes Reinecke, John Garry, David Jeffery, Ming Lei
Subject: [PATCH 7/8] blk-mq: grab rq->refcount before calling ->fn in
 blk_mq_tagset_busy_iter
Date: Sun, 25 Apr 2021 16:57:52 +0800
Message-Id: <20210425085753.2617424-8-ming.lei@redhat.com>
In-Reply-To: <20210425085753.2617424-1-ming.lei@redhat.com>
References: <20210425085753.2617424-1-ming.lei@redhat.com>
X-Mailing-List: linux-scsi@vger.kernel.org

Grab rq->ref before calling ->fn in blk_mq_tagset_busy_iter(), which
prevents the request from being reused while ->fn is running. The
approach is the same as the one used in timeout handling.

This fixes request UAFs caused by completion races or queue releasing:

- If a request is referenced before rq->q is frozen, the queue won't be
  frozen before the reference is released during iteration, so the
  request cannot be freed under ->fn.

- If a request is referenced after rq->q has been frozen,
  refcount_inc_not_zero() returns false and the request is skipped.

One request UAF is still not covered: refcount_inc_not_zero() may read
an already-freed request. That case is handled in the next patch.

Signed-off-by: Ming Lei
---
 block/blk-mq-tag.c | 14 +++++++++++---
 block/blk-mq.c     | 14 +++++++++-----
 block/blk-mq.h     |  1 +
 3 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 2a37731e8244..489d2db89856 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -264,6 +264,7 @@ static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
 	struct blk_mq_tags *tags = iter_data->tags;
 	bool reserved = iter_data->flags & BT_TAG_ITER_RESERVED;
 	struct request *rq;
+	bool ret;
 
 	if (!reserved)
 		bitnr += tags->nr_reserved_tags;
@@ -276,12 +277,15 @@ static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
 		rq = tags->static_rqs[bitnr];
 	else
 		rq = tags->rqs[bitnr];
-	if (!rq)
+	if (!rq || !refcount_inc_not_zero(&rq->ref))
 		return true;
 	if ((iter_data->flags & BT_TAG_ITER_STARTED) &&
 	    !blk_mq_request_started(rq))
-		return true;
-	return iter_data->fn(rq, iter_data->data, reserved);
+		ret = true;
+	else
+		ret = iter_data->fn(rq, iter_data->data, reserved);
+	blk_mq_put_rq_ref(rq);
+	return ret;
 }
 
 /**
@@ -348,6 +352,10 @@ void blk_mq_all_tag_iter(struct blk_mq_tags *tags, busy_tag_iter_fn *fn,
  *		indicates whether or not @rq is a reserved request. Return
  *		true to continue iterating tags, false to stop.
  * @priv:	Will be passed as second argument to @fn.
+ *
+ * We grab one request reference before calling @fn and release it after
+ * @fn returns. Passing the request reference on to a new context from
+ * within @fn is not supported.
 */
 void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
 		busy_tag_iter_fn *fn, void *priv)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index e3d1067b10c3..9a4d520740a1 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -925,6 +925,14 @@ static bool blk_mq_req_expired(struct request *rq, unsigned long *next)
 	return false;
 }
 
+void blk_mq_put_rq_ref(struct request *rq)
+{
+	if (is_flush_rq(rq, rq->mq_hctx))
+		rq->end_io(rq, 0);
+	else if (refcount_dec_and_test(&rq->ref))
+		__blk_mq_free_request(rq);
+}
+
 static bool blk_mq_check_expired(struct blk_mq_hw_ctx *hctx,
 		struct request *rq, void *priv, bool reserved)
 {
@@ -958,11 +966,7 @@ static bool blk_mq_check_expired(struct blk_mq_hw_ctx *hctx,
 	if (blk_mq_req_expired(rq, next))
 		blk_mq_rq_timed_out(rq, reserved);
 
-	if (is_flush_rq(rq, hctx))
-		rq->end_io(rq, 0);
-	else if (refcount_dec_and_test(&rq->ref))
-		__blk_mq_free_request(rq);
-
+	blk_mq_put_rq_ref(rq);
 	return true;
 }
 
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 3616453ca28c..143afe42c63a 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -47,6 +47,7 @@ void blk_mq_add_to_requeue_list(struct request *rq, bool at_head,
 void blk_mq_flush_busy_ctxs(struct blk_mq_hw_ctx *hctx, struct list_head *list);
 struct request *blk_mq_dequeue_from_ctx(struct blk_mq_hw_ctx *hctx,
 					struct blk_mq_ctx *start);
+void blk_mq_put_rq_ref(struct request *rq);
 
 /*
  * Internal helpers for allocating/freeing the request map
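One practical consequence of this patch, matching the kernel-doc it adds: the
iterator holds a request reference only while ->fn runs, so a callback must
finish everything it does with @rq before returning. A hedged sketch of that
contract (hypothetical callback name, reusing the helper introduced in patch
5/8 of this series):

	#include <linux/blk-mq.h>

	static bool example_busy_cb(struct request *rq, void *data, bool reserved)
	{
		/*
		 * Fine: use @rq only inside the callback. The reference taken
		 * by bt_tags_iter() is held until this function returns.
		 */
		blk_mq_complete_request_locally(rq);

		/*
		 * Not supported: saving @rq (e.g. into *data or a work item)
		 * and touching it after returning, because the iterator drops
		 * its reference as soon as the callback returns.
		 */
		return true;
	}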