From patchwork Wed Apr 27 19:12:11 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566771 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9C2C8C433FE for ; Wed, 27 Apr 2022 19:25:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233686AbiD0T2S (ORCPT ); Wed, 27 Apr 2022 15:28:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53452 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232545AbiD0TTL (ORCPT ); Wed, 27 Apr 2022 15:19:11 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 94E8614096 for ; Wed, 27 Apr 2022 12:13:20 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id C4C93B8291D for ; Wed, 27 Apr 2022 19:13:18 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 180F5C385A7; Wed, 27 Apr 2022 19:13:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086797; bh=KPIMaUrv3ahVDlTZhvKlbu0p7ZyDL3VrkfH0rCRjpXk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=NVAJyy0Z88ptUfS0adUubqFg6aKiLKkNcTcAa/2DgX51BdUlghR9v7jpbbLG22MTB oHJQEfPVrldResTjo2SAU0pSFSISeWHb/z5e4Lx9XngZqfUy9IiynTwGU3yCQ7PS+n pq0HFYqUxjdOdATqLXNk+S5XqR7hUzCZki1oiOi6EKkeGE95TFpscw2auVqRgIffYf TnFn05TP1MX2nxECNR4UKdQEDVIIdn1D6pVDvVvauI9n9k4Gtcb5iqMPHm3dgKep7J E9q2obQzdbFPYc9eWCfRo3gSR1opPb1AnFqbvtPeKKpyIbnHXwZkctCtVw4bH5wBgB e/mMFn8QyayLw== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 01/64] libceph: add spinlock around osd->o_requests Date: Wed, 27 Apr 2022 15:12:11 -0400 Message-Id: <20220427191314.222867-2-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org In a later patch, we're going to need to search for a request in the rbtree, but taking the o_mutex is inconvenient as we already hold the con mutex at the point where we need it. Add a new spinlock that we take when inserting and erasing entries from the o_requests tree. Search of the rbtree can be done with either the mutex or the spinlock, but insertion and removal requires both. Reviewed-by: Xiubo Li Signed-off-by: Jeff Layton --- include/linux/ceph/osd_client.h | 8 +++++++- net/ceph/osd_client.c | 5 +++++ 2 files changed, 12 insertions(+), 1 deletion(-) diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h index 3431011f364d..3122c1a3205f 100644 --- a/include/linux/ceph/osd_client.h +++ b/include/linux/ceph/osd_client.h @@ -29,7 +29,12 @@ typedef void (*ceph_osdc_callback_t)(struct ceph_osd_request *); #define CEPH_HOMELESS_OSD -1 -/* a given osd we're communicating with */ +/* + * A given osd we're communicating with. + * + * Note that the o_requests tree can be searched while holding the "lock" mutex + * or the "o_requests_lock" spinlock. Insertion or removal requires both! + */ struct ceph_osd { refcount_t o_ref; struct ceph_osd_client *o_osdc; @@ -37,6 +42,7 @@ struct ceph_osd { int o_incarnation; struct rb_node o_node; struct ceph_connection o_con; + spinlock_t o_requests_lock; struct rb_root o_requests; struct rb_root o_linger_requests; struct rb_root o_backoff_mappings; diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c index 83eb97c94e83..17c792b32343 100644 --- a/net/ceph/osd_client.c +++ b/net/ceph/osd_client.c @@ -1198,6 +1198,7 @@ static void osd_init(struct ceph_osd *osd) { refcount_set(&osd->o_ref, 1); RB_CLEAR_NODE(&osd->o_node); + spin_lock_init(&osd->o_requests_lock); osd->o_requests = RB_ROOT; osd->o_linger_requests = RB_ROOT; osd->o_backoff_mappings = RB_ROOT; @@ -1427,7 +1428,9 @@ static void link_request(struct ceph_osd *osd, struct ceph_osd_request *req) atomic_inc(&osd->o_osdc->num_homeless); get_osd(osd); + spin_lock(&osd->o_requests_lock); insert_request(&osd->o_requests, req); + spin_unlock(&osd->o_requests_lock); req->r_osd = osd; } @@ -1439,7 +1442,9 @@ static void unlink_request(struct ceph_osd *osd, struct ceph_osd_request *req) req, req->r_tid); req->r_osd = NULL; + spin_lock(&osd->o_requests_lock); erase_request(&osd->o_requests, req); + spin_unlock(&osd->o_requests_lock); put_osd(osd); if (!osd_homeless(osd)) From patchwork Wed Apr 27 19:12:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566770 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4A692C433EF for ; Wed, 27 Apr 2022 19:25:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232563AbiD0T22 (ORCPT ); Wed, 27 Apr 2022 15:28:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43858 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232644AbiD0TTL (ORCPT ); Wed, 27 Apr 2022 15:19:11 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9386F12ADC for ; Wed, 27 Apr 2022 12:13:19 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 7D2036199D for ; Wed, 27 Apr 2022 19:13:19 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 79511C385AF; Wed, 27 Apr 2022 19:13:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086798; bh=nG8CJQgJdJ6uZY6N3B9ES84bNWIgTJ4luknQQSBEL7g=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Ea2SHtaGOjDR4w2Qc0nSitDDs2vD04Dmi0DPKWJwsUWEM/dMlP72mbKScE1x5WfnY 2h4dIp1sckZ2sYVx+SJuZC/Aiifj/GkT5uwhZYvOjc4bTAnTXVviaYyh6eBpje9yw8 nzyR6n7hJO7XtVoegVOc9X3l1fPfiAOCKqOA44g4BEW3FI3hOv99PGipOWoq0VvgBq 3x49BTvXnGUXKFUzLHv4E4uUSCgD2kNyS3nu19VuTzwxnSF/V2QZIK3jQKHaaPTDX5 wnv7EKdZn5XbedH7HVf27sLbLwAJ7b22mew0KnzPds6H6U55agR9w1ehI5EDekD1mb e3K++d82NFOzQ== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 03/64] libceph: add sparse read support to msgr2 crc state machine Date: Wed, 27 Apr 2022 15:12:13 -0400 Message-Id: <20220427191314.222867-4-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Add support for a new sparse_read ceph_connection operation. The idea is that the client driver can define this operation use it to do special handling for incoming reads. The alloc_msg routine will look at the request and determine whether the reply is expected to be sparse. If it is, then we'll dispatch to a different set of state machine states that will repeatedly call the driver's sparse_read op to get length and placement info for reading the extent map, and the extents themselves. This necessitates adding some new field to some other structs: - The msg gets a new bool to track whether it's a sparse_read request. - A new field is added to the cursor to track the amount remaining in the current extent. This is used to cap the read from the socket into the msg_data - Handing a revoke with all of this is particularly difficult, so I've added a new data_len_remain field to the v2 connection info, and then use that to skip that much on a revoke. We may want to expand the use of that to the normal read path as well, just for consistency's sake. Reviewed-by: Xiubo Li Signed-off-by: Jeff Layton --- include/linux/ceph/messenger.h | 28 ++++++ net/ceph/messenger.c | 1 + net/ceph/messenger_v2.c | 168 +++++++++++++++++++++++++++++++-- 3 files changed, 188 insertions(+), 9 deletions(-) diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h index e7f2fb2fc207..7f09a4213834 100644 --- a/include/linux/ceph/messenger.h +++ b/include/linux/ceph/messenger.h @@ -17,6 +17,7 @@ struct ceph_msg; struct ceph_connection; +struct ceph_msg_data_cursor; /* * Ceph defines these callbacks for handling connection events. @@ -70,6 +71,30 @@ struct ceph_connection_operations { int used_proto, int result, const int *allowed_protos, int proto_cnt, const int *allowed_modes, int mode_cnt); + + /** + * sparse_read: read sparse data + * @con: connection we're reading from + * @cursor: data cursor for reading extents + * @buf: optional buffer to read into + * + * This should be called more than once, each time setting up to + * receive an extent into the current cursor position, and zeroing + * the holes between them. + * + * Returns amount of data to be read (in bytes), 0 if reading is + * complete, or -errno if there was an error. + * + * If @buf is set on a >0 return, then the data should be read into + * the provided buffer. Otherwise, it should be read into the cursor. + * + * The sparse read operation is expected to initialize the cursor + * with a length covering up to the end of the last extent. + */ + int (*sparse_read)(struct ceph_connection *con, + struct ceph_msg_data_cursor *cursor, + char **buf); + }; /* use format string %s%lld */ @@ -207,6 +232,7 @@ struct ceph_msg_data_cursor { struct ceph_msg_data *data; /* current data item */ size_t resid; /* bytes not yet consumed */ + int sr_resid; /* residual sparse_read len */ bool last_piece; /* current is last piece */ bool need_crc; /* crc update needed */ union { @@ -252,6 +278,7 @@ struct ceph_msg { struct kref kref; bool more_to_follow; bool needs_out_seq; + bool sparse_read; int front_alloc_len; struct ceph_msgpool *pool; @@ -396,6 +423,7 @@ struct ceph_connection_v2_info { void *conn_bufs[16]; int conn_buf_cnt; + int data_len_remain; struct kvec in_sign_kvecs[8]; struct kvec out_sign_kvecs[8]; diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c index d3bb656308b4..bf4e7f5751ee 100644 --- a/net/ceph/messenger.c +++ b/net/ceph/messenger.c @@ -1034,6 +1034,7 @@ void ceph_msg_data_cursor_init(struct ceph_msg_data_cursor *cursor, cursor->total_resid = length; cursor->data = msg->data; + cursor->sr_resid = 0; __ceph_msg_data_cursor_init(cursor); } diff --git a/net/ceph/messenger_v2.c b/net/ceph/messenger_v2.c index c6e5bfc717d5..d527777af584 100644 --- a/net/ceph/messenger_v2.c +++ b/net/ceph/messenger_v2.c @@ -52,14 +52,16 @@ #define FRAME_LATE_STATUS_COMPLETE 0xe #define FRAME_LATE_STATUS_ABORTED_MASK 0xf -#define IN_S_HANDLE_PREAMBLE 1 -#define IN_S_HANDLE_CONTROL 2 -#define IN_S_HANDLE_CONTROL_REMAINDER 3 -#define IN_S_PREPARE_READ_DATA 4 -#define IN_S_PREPARE_READ_DATA_CONT 5 -#define IN_S_PREPARE_READ_ENC_PAGE 6 -#define IN_S_HANDLE_EPILOGUE 7 -#define IN_S_FINISH_SKIP 8 +#define IN_S_HANDLE_PREAMBLE 1 +#define IN_S_HANDLE_CONTROL 2 +#define IN_S_HANDLE_CONTROL_REMAINDER 3 +#define IN_S_PREPARE_READ_DATA 4 +#define IN_S_PREPARE_READ_DATA_CONT 5 +#define IN_S_PREPARE_READ_ENC_PAGE 6 +#define IN_S_PREPARE_SPARSE_DATA 7 +#define IN_S_PREPARE_SPARSE_DATA_CONT 8 +#define IN_S_HANDLE_EPILOGUE 9 +#define IN_S_FINISH_SKIP 10 #define OUT_S_QUEUE_DATA 1 #define OUT_S_QUEUE_DATA_CONT 2 @@ -1819,6 +1821,124 @@ static void prepare_read_data_cont(struct ceph_connection *con) con->v2.in_state = IN_S_HANDLE_EPILOGUE; } +static int prepare_sparse_read_cont(struct ceph_connection *con) +{ + int ret; + struct bio_vec bv; + char *buf = NULL; + struct ceph_msg_data_cursor *cursor = &con->v2.in_cursor; + + WARN_ON(con->v2.in_state != IN_S_PREPARE_SPARSE_DATA_CONT); + + if (iov_iter_is_bvec(&con->v2.in_iter)) { + if (ceph_test_opt(from_msgr(con->msgr), RXBOUNCE)) { + con->in_data_crc = crc32c(con->in_data_crc, + page_address(con->bounce_page), + con->v2.in_bvec.bv_len); + get_bvec_at(cursor, &bv); + memcpy_to_page(bv.bv_page, bv.bv_offset, + page_address(con->bounce_page), + con->v2.in_bvec.bv_len); + } else { + con->in_data_crc = ceph_crc32c_page(con->in_data_crc, + con->v2.in_bvec.bv_page, + con->v2.in_bvec.bv_offset, + con->v2.in_bvec.bv_len); + } + + ceph_msg_data_advance(cursor, con->v2.in_bvec.bv_len); + cursor->sr_resid -= con->v2.in_bvec.bv_len; + dout("%s: advance by 0x%x sr_resid 0x%x\n", __func__, + con->v2.in_bvec.bv_len, cursor->sr_resid); + WARN_ON_ONCE(cursor->sr_resid > cursor->total_resid); + if (cursor->sr_resid) { + get_bvec_at(cursor, &bv); + if (bv.bv_len > cursor->sr_resid) + bv.bv_len = cursor->sr_resid; + if (ceph_test_opt(from_msgr(con->msgr), RXBOUNCE)) { + bv.bv_page = con->bounce_page; + bv.bv_offset = 0; + } + set_in_bvec(con, &bv); + con->v2.data_len_remain -= bv.bv_len; + return 0; + } + } else if (iov_iter_is_kvec(&con->v2.in_iter)) { + /* On first call, we have no kvec so don't compute crc */ + if (con->v2.in_kvec_cnt) { + WARN_ON_ONCE(con->v2.in_kvec_cnt > 1); + con->in_data_crc = crc32c(con->in_data_crc, + con->v2.in_kvecs[0].iov_base, + con->v2.in_kvecs[0].iov_len); + } + } else { + return -EIO; + } + + /* get next extent */ + ret = con->ops->sparse_read(con, cursor, &buf); + if (ret <= 0) { + if (ret < 0) + return ret; + + reset_in_kvecs(con); + add_in_kvec(con, con->v2.in_buf, CEPH_EPILOGUE_PLAIN_LEN); + con->v2.in_state = IN_S_HANDLE_EPILOGUE; + return 0; + } + + if (buf) { + /* receive into buffer */ + reset_in_kvecs(con); + add_in_kvec(con, buf, ret); + con->v2.data_len_remain -= ret; + return 0; + } + + if (ret > cursor->total_resid) { + pr_warn("%s: ret 0x%x total_resid 0x%zx resid 0x%zx last %d\n", + __func__, ret, cursor->total_resid, cursor->resid, + cursor->last_piece); + return -EIO; + } + get_bvec_at(cursor, &bv); + if (bv.bv_len > cursor->sr_resid) + bv.bv_len = cursor->sr_resid; + if (ceph_test_opt(from_msgr(con->msgr), RXBOUNCE)) { + if (unlikely(!con->bounce_page)) { + con->bounce_page = alloc_page(GFP_NOIO); + if (!con->bounce_page) { + pr_err("failed to allocate bounce page\n"); + return -ENOMEM; + } + } + + bv.bv_page = con->bounce_page; + bv.bv_offset = 0; + } + set_in_bvec(con, &bv); + con->v2.data_len_remain -= ret; + return ret; +} + +static int prepare_sparse_read_data(struct ceph_connection *con) +{ + struct ceph_msg *msg = con->in_msg; + + dout("%s: starting sparse read\n", __func__); + + if (WARN_ON_ONCE(!con->ops->sparse_read)) + return -EOPNOTSUPP; + + if (!con_secure(con)) + con->in_data_crc = -1; + + reset_in_kvecs(con); + con->v2.in_state = IN_S_PREPARE_SPARSE_DATA_CONT; + con->v2.data_len_remain = data_len(msg); + return prepare_sparse_read_cont(con); +} + static int prepare_read_tail_plain(struct ceph_connection *con) { struct ceph_msg *msg = con->in_msg; @@ -1839,7 +1959,10 @@ static int prepare_read_tail_plain(struct ceph_connection *con) } if (data_len(msg)) { - con->v2.in_state = IN_S_PREPARE_READ_DATA; + if (msg->sparse_read) + con->v2.in_state = IN_S_PREPARE_SPARSE_DATA; + else + con->v2.in_state = IN_S_PREPARE_READ_DATA; } else { add_in_kvec(con, con->v2.in_buf, CEPH_EPILOGUE_PLAIN_LEN); con->v2.in_state = IN_S_HANDLE_EPILOGUE; @@ -2893,6 +3016,12 @@ static int populate_in_iter(struct ceph_connection *con) prepare_read_enc_page(con); ret = 0; break; + case IN_S_PREPARE_SPARSE_DATA: + ret = prepare_sparse_read_data(con); + break; + case IN_S_PREPARE_SPARSE_DATA_CONT: + ret = prepare_sparse_read_cont(con); + break; case IN_S_HANDLE_EPILOGUE: ret = handle_epilogue(con); break; @@ -3485,6 +3614,23 @@ static void revoke_at_prepare_read_enc_page(struct ceph_connection *con) con->v2.in_state = IN_S_FINISH_SKIP; } +static void revoke_at_prepare_sparse_data(struct ceph_connection *con) +{ + int resid; /* current piece of data */ + int remaining; + + WARN_ON(con_secure(con)); + WARN_ON(!data_len(con->in_msg)); + WARN_ON(!iov_iter_is_bvec(&con->v2.in_iter)); + resid = iov_iter_count(&con->v2.in_iter); + dout("%s con %p resid %d\n", __func__, con, resid); + + remaining = CEPH_EPILOGUE_PLAIN_LEN + con->v2.data_len_remain; + con->v2.in_iter.count -= resid; + set_in_skip(con, resid + remaining); + con->v2.in_state = IN_S_FINISH_SKIP; +} + static void revoke_at_handle_epilogue(struct ceph_connection *con) { int resid; @@ -3501,6 +3647,7 @@ static void revoke_at_handle_epilogue(struct ceph_connection *con) void ceph_con_v2_revoke_incoming(struct ceph_connection *con) { switch (con->v2.in_state) { + case IN_S_PREPARE_SPARSE_DATA: case IN_S_PREPARE_READ_DATA: revoke_at_prepare_read_data(con); break; @@ -3510,6 +3657,9 @@ void ceph_con_v2_revoke_incoming(struct ceph_connection *con) case IN_S_PREPARE_READ_ENC_PAGE: revoke_at_prepare_read_enc_page(con); break; + case IN_S_PREPARE_SPARSE_DATA_CONT: + revoke_at_prepare_sparse_data(con); + break; case IN_S_HANDLE_EPILOGUE: revoke_at_handle_epilogue(con); break; From patchwork Wed Apr 27 19:12:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566772 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BFF2FC433EF for ; Wed, 27 Apr 2022 19:25:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233670AbiD0T2N (ORCPT ); Wed, 27 Apr 2022 15:28:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38466 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232649AbiD0TTL (ORCPT ); Wed, 27 Apr 2022 15:19:11 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 94BC813F9A for ; Wed, 27 Apr 2022 12:13:20 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id C9177619D0 for ; Wed, 27 Apr 2022 19:13:19 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2FD74C385A9; Wed, 27 Apr 2022 19:13:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086799; bh=wH0z00USvCwH2GBLz1DogypY6kWWSEH0EiOI5h83q+E=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ioyZkXVC6ioFVCGnEmCdwPei7UWJPLz0rJeMMG1pvv0SENLNwc43ucRq1Jq8WDhJW ytUuyx7LrkEgkLYEeZRGciMvN4g1tfSQG5k0L1Lpia1sNOGXzUhek9W0EKU1+btcSJ VO4Sfp5PP2rlGQDTZVwRawRF03/Z7hk4Hg8WXx8M7s7l2seodAEAXHmxJQwq0RKY82 j9SDM3NkWL60IFB7kIBVbUM6Uq/BV8X78AYcqgcbp58gKRPLjtV9MvKpjzKYwkcy2H fP/G0hBtp1Gm7ebYNk+vwTBRo4CG1gl/sgnXKv12PyUHN+3PYZCLIbVP6LnfE/qHrS DO+c+b7CvVvBw== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 04/64] libceph: add sparse read support to OSD client Date: Wed, 27 Apr 2022 15:12:14 -0400 Message-Id: <20220427191314.222867-5-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Have get_reply check for the presence of sparse read ops in the request and set the sparse_read boolean in the msg. That will queue the messenger layer to use the sparse read codepath instead of the normal data receive. Add a new sparse_read operation for the OSD client, driven by its own state machine. The messenger will repeatedly call the sparse_read operation, and it will pass back the necessary info to set up to read the next extent of data, while zero-filling the sparse regions. The state machine will stop at the end of the last extent, and will attach the extent map buffer to the ceph_osd_req_op so that the caller can use it. Reviewed-by: Xiubo Li Signed-off-by: Jeff Layton --- include/linux/ceph/osd_client.h | 32 ++++ net/ceph/osd_client.c | 256 +++++++++++++++++++++++++++++++- 2 files changed, 284 insertions(+), 4 deletions(-) diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h index 1dd02240d00d..4088601beacc 100644 --- a/include/linux/ceph/osd_client.h +++ b/include/linux/ceph/osd_client.h @@ -40,6 +40,36 @@ struct ceph_sparse_extent { u64 len; } __packed; +/* Sparse read state machine state values */ +enum ceph_sparse_read_state { + CEPH_SPARSE_READ_HDR = 0, + CEPH_SPARSE_READ_EXTENTS, + CEPH_SPARSE_READ_DATA_LEN, + CEPH_SPARSE_READ_DATA, +}; + +/* + * A SPARSE_READ reply is a 32-bit count of extents, followed by an array of + * 64-bit offset/length pairs, and then all of the actual file data + * concatenated after it (sans holes). + * + * Unfortunately, we don't know how long the extent array is until we've + * started reading the data section of the reply. The caller should send down + * a destination buffer for the array, but we'll alloc one if it's too small + * or if the caller doesn't. + */ +struct ceph_sparse_read { + enum ceph_sparse_read_state sr_state; /* state machine state */ + u64 sr_req_off; /* orig request offset */ + u64 sr_req_len; /* orig request length */ + u64 sr_pos; /* current pos in buffer */ + int sr_index; /* current extent index */ + __le32 sr_datalen; /* length of actual data */ + u32 sr_count; /* extent count in reply */ + int sr_ext_len; /* length of extent array */ + struct ceph_sparse_extent *sr_extent; /* extent array */ +}; + /* * A given osd we're communicating with. * @@ -48,6 +78,7 @@ struct ceph_sparse_extent { */ struct ceph_osd { refcount_t o_ref; + int o_sparse_op_idx; struct ceph_osd_client *o_osdc; int o_osd; int o_incarnation; @@ -63,6 +94,7 @@ struct ceph_osd { unsigned long lru_ttl; struct list_head o_keepalive_item; struct mutex lock; + struct ceph_sparse_read o_sparse_read; }; #define CEPH_OSD_SLAB_OPS 2 diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c index c150683f2a2f..acf6a19b6677 100644 --- a/net/ceph/osd_client.c +++ b/net/ceph/osd_client.c @@ -376,6 +376,7 @@ static void osd_req_op_data_release(struct ceph_osd_request *osd_req, switch (op->op) { case CEPH_OSD_OP_READ: + case CEPH_OSD_OP_SPARSE_READ: case CEPH_OSD_OP_WRITE: case CEPH_OSD_OP_WRITEFULL: kfree(op->extent.sparse_ext); @@ -707,6 +708,7 @@ static void get_num_data_items(struct ceph_osd_request *req, /* reply */ case CEPH_OSD_OP_STAT: case CEPH_OSD_OP_READ: + case CEPH_OSD_OP_SPARSE_READ: case CEPH_OSD_OP_LIST_WATCHERS: *num_reply_data_items += 1; break; @@ -776,7 +778,7 @@ void osd_req_op_extent_init(struct ceph_osd_request *osd_req, BUG_ON(opcode != CEPH_OSD_OP_READ && opcode != CEPH_OSD_OP_WRITE && opcode != CEPH_OSD_OP_WRITEFULL && opcode != CEPH_OSD_OP_ZERO && - opcode != CEPH_OSD_OP_TRUNCATE); + opcode != CEPH_OSD_OP_TRUNCATE && opcode != CEPH_OSD_OP_SPARSE_READ); op->extent.offset = offset; op->extent.length = length; @@ -985,6 +987,7 @@ static u32 osd_req_encode_op(struct ceph_osd_op *dst, case CEPH_OSD_OP_STAT: break; case CEPH_OSD_OP_READ: + case CEPH_OSD_OP_SPARSE_READ: case CEPH_OSD_OP_WRITE: case CEPH_OSD_OP_WRITEFULL: case CEPH_OSD_OP_ZERO: @@ -1081,7 +1084,8 @@ struct ceph_osd_request *ceph_osdc_new_request(struct ceph_osd_client *osdc, BUG_ON(opcode != CEPH_OSD_OP_READ && opcode != CEPH_OSD_OP_WRITE && opcode != CEPH_OSD_OP_ZERO && opcode != CEPH_OSD_OP_TRUNCATE && - opcode != CEPH_OSD_OP_CREATE && opcode != CEPH_OSD_OP_DELETE); + opcode != CEPH_OSD_OP_CREATE && opcode != CEPH_OSD_OP_DELETE && + opcode != CEPH_OSD_OP_SPARSE_READ); req = ceph_osdc_alloc_request(osdc, snapc, num_ops, use_mempool, GFP_NOFS); @@ -1222,6 +1226,13 @@ static void osd_init(struct ceph_osd *osd) mutex_init(&osd->lock); } +static void ceph_init_sparse_read(struct ceph_sparse_read *sr) +{ + kfree(sr->sr_extent); + memset(sr, '\0', sizeof(*sr)); + sr->sr_state = CEPH_SPARSE_READ_HDR; +} + static void osd_cleanup(struct ceph_osd *osd) { WARN_ON(!RB_EMPTY_NODE(&osd->o_node)); @@ -1232,6 +1243,8 @@ static void osd_cleanup(struct ceph_osd *osd) WARN_ON(!list_empty(&osd->o_osd_lru)); WARN_ON(!list_empty(&osd->o_keepalive_item)); + ceph_init_sparse_read(&osd->o_sparse_read); + if (osd->o_auth.authorizer) { WARN_ON(osd_homeless(osd)); ceph_auth_destroy_authorizer(osd->o_auth.authorizer); @@ -1251,6 +1264,9 @@ static struct ceph_osd *create_osd(struct ceph_osd_client *osdc, int onum) osd_init(osd); osd->o_osdc = osdc; osd->o_osd = onum; + osd->o_sparse_op_idx = -1; + + ceph_init_sparse_read(&osd->o_sparse_read); ceph_con_init(&osd->o_con, osd, &osd_con_ops, &osdc->client->msgr); @@ -2055,6 +2071,7 @@ static void setup_request_data(struct ceph_osd_request *req) &op->raw_data_in); break; case CEPH_OSD_OP_READ: + case CEPH_OSD_OP_SPARSE_READ: ceph_osdc_msg_data_add(reply_msg, &op->extent.osd_data); break; @@ -2474,8 +2491,10 @@ static void finish_request(struct ceph_osd_request *req) req->r_end_latency = ktime_get(); - if (req->r_osd) + if (req->r_osd) { + ceph_init_sparse_read(&req->r_osd->o_sparse_read); unlink_request(req->r_osd, req); + } atomic_dec(&osdc->num_requests); /* @@ -5420,6 +5439,24 @@ static void osd_dispatch(struct ceph_connection *con, struct ceph_msg *msg) ceph_msg_put(msg); } +/* How much sparse data was requested? */ +static u64 sparse_data_requested(struct ceph_osd_request *req) +{ + u64 len = 0; + + if (req->r_flags & CEPH_OSD_FLAG_READ) { + int i; + + for (i = 0; i < req->r_num_ops; ++i) { + struct ceph_osd_req_op *op = &req->r_ops[i]; + + if (op->op == CEPH_OSD_OP_SPARSE_READ) + len += op->extent.length; + } + } + return len; +} + /* * Lookup and return message for incoming reply. Don't try to do * anything about a larger than preallocated data portion of the @@ -5436,6 +5473,7 @@ static struct ceph_msg *get_reply(struct ceph_connection *con, int front_len = le32_to_cpu(hdr->front_len); int data_len = le32_to_cpu(hdr->data_len); u64 tid = le64_to_cpu(hdr->tid); + u64 srlen; down_read(&osdc->lock); if (!osd_registered(osd)) { @@ -5468,7 +5506,8 @@ static struct ceph_msg *get_reply(struct ceph_connection *con, req->r_reply = m; } - if (data_len > req->r_reply->data_length) { + srlen = sparse_data_requested(req); + if (!srlen && data_len > req->r_reply->data_length) { pr_warn("%s osd%d tid %llu data %d > preallocated %zu, skipping\n", __func__, osd->o_osd, req->r_tid, data_len, req->r_reply->data_length); @@ -5478,6 +5517,8 @@ static struct ceph_msg *get_reply(struct ceph_connection *con, } m = ceph_msg_get(req->r_reply); + m->sparse_read = (bool)srlen; + dout("get_reply tid %lld %p\n", tid, m); out_unlock_session: @@ -5710,9 +5751,216 @@ static int osd_check_message_signature(struct ceph_msg *msg) return ceph_auth_check_message_signature(auth, msg); } +static void advance_cursor(struct ceph_msg_data_cursor *cursor, size_t len, bool zero) +{ + while (len) { + struct page *page; + size_t poff, plen; + bool last = false; + + page = ceph_msg_data_next(cursor, &poff, &plen, &last); + if (plen > len) + plen = len; + if (zero) + zero_user_segment(page, poff, poff + plen); + len -= plen; + ceph_msg_data_advance(cursor, plen); + } +} + +static int prep_next_sparse_read(struct ceph_connection *con, + struct ceph_msg_data_cursor *cursor) +{ + struct ceph_osd *o = con->private; + struct ceph_sparse_read *sr = &o->o_sparse_read; + struct ceph_osd_request *req; + struct ceph_osd_req_op *op; + + spin_lock(&o->o_requests_lock); + req = lookup_request(&o->o_requests, le64_to_cpu(con->in_msg->hdr.tid)); + if (!req) { + spin_unlock(&o->o_requests_lock); + return -EBADR; + } + + if (o->o_sparse_op_idx < 0) { + u64 srlen = sparse_data_requested(req); + + dout("%s: [%d] starting new sparse read req. srlen=0x%llx\n", + __func__, o->o_osd, srlen); + ceph_msg_data_cursor_init(cursor, con->in_msg, srlen); + } else { + u64 end; + + op = &req->r_ops[o->o_sparse_op_idx]; + + WARN_ON_ONCE(op->extent.sparse_ext); + + /* hand back buffer we took earlier */ + op->extent.sparse_ext = sr->sr_extent; + sr->sr_extent = NULL; + op->extent.sparse_ext_cnt = sr->sr_count; + sr->sr_ext_len = 0; + dout("%s: [%d] completed extent array len %d cursor->resid %zd\n", + __func__, o->o_osd, op->extent.sparse_ext_cnt, cursor->resid); + /* Advance to end of data for this operation */ + end = ceph_sparse_ext_map_end(op); + if (end < sr->sr_req_len) + advance_cursor(cursor, sr->sr_req_len - end, false); + } + + ceph_init_sparse_read(sr); + + /* find next op in this request (if any) */ + while (++o->o_sparse_op_idx < req->r_num_ops) { + op = &req->r_ops[o->o_sparse_op_idx]; + if (op->op == CEPH_OSD_OP_SPARSE_READ) + goto found; + } + + /* reset for next sparse read request */ + spin_unlock(&o->o_requests_lock); + o->o_sparse_op_idx = -1; + return 0; +found: + sr->sr_req_off = op->extent.offset; + sr->sr_req_len = op->extent.length; + sr->sr_pos = sr->sr_req_off; + dout("%s: [%d] new sparse read op at idx %d 0x%llx~0x%llx\n", __func__, + o->o_osd, o->o_sparse_op_idx, sr->sr_req_off, sr->sr_req_len); + + /* hand off request's sparse extent map buffer */ + sr->sr_ext_len = op->extent.sparse_ext_cnt; + op->extent.sparse_ext_cnt = 0; + sr->sr_extent = op->extent.sparse_ext; + op->extent.sparse_ext = NULL; + + spin_unlock(&o->o_requests_lock); + return 1; +} + +#ifdef __BIG_ENDIAN +static inline void convert_extent_map(struct ceph_sparse_read *sr) +{ + int i; + + for (i = 0; i < sr->sr_count; i++) { + struct ceph_sparse_extent *ext = &sr->sr_extent[i]; + + ext->off = le64_to_cpu((__force __le64)ext->off); + ext->len = le64_to_cpu((__force __le64)ext->len); + } +} +#else +static inline void convert_extent_map(struct ceph_sparse_read *sr) +{ +} +#endif + +#define MAX_EXTENTS 4096 + +static int osd_sparse_read(struct ceph_connection *con, + struct ceph_msg_data_cursor *cursor, + char **pbuf) +{ + struct ceph_osd *o = con->private; + struct ceph_sparse_read *sr = &o->o_sparse_read; + u32 count = sr->sr_count; + u64 eoff, elen; + int ret; + + switch (sr->sr_state) { + case CEPH_SPARSE_READ_HDR: +next_op: + ret = prep_next_sparse_read(con, cursor); + if (ret <= 0) + return ret; + + /* number of extents */ + ret = sizeof(sr->sr_count); + *pbuf = (char *)&sr->sr_count; + sr->sr_state = CEPH_SPARSE_READ_EXTENTS; + break; + case CEPH_SPARSE_READ_EXTENTS: + /* Convert sr_count to host-endian */ + count = le32_to_cpu((__force __le32)sr->sr_count); + sr->sr_count = count; + dout("[%d] got %u extents\n", o->o_osd, count); + + if (count > 0) { + if (!sr->sr_extent || count > sr->sr_ext_len) { + /* + * Apply a hard cap to the number of extents. + * If we have more, assume something is wrong. + */ + if (count > MAX_EXTENTS) { + dout("%s: OSD returned 0x%x extents in a single reply!\n", + __func__, count); + return -EREMOTEIO; + } + + /* no extent array provided, or too short */ + kfree(sr->sr_extent); + sr->sr_extent = kmalloc_array(count, + sizeof(*sr->sr_extent), + GFP_NOIO); + if (!sr->sr_extent) + return -ENOMEM; + sr->sr_ext_len = count; + } + ret = count * sizeof(*sr->sr_extent); + *pbuf = (char *)sr->sr_extent; + sr->sr_state = CEPH_SPARSE_READ_DATA_LEN; + break; + } + /* No extents? Read data len */ + fallthrough; + case CEPH_SPARSE_READ_DATA_LEN: + convert_extent_map(sr); + ret = sizeof(sr->sr_datalen); + *pbuf = (char *)&sr->sr_datalen; + sr->sr_state = CEPH_SPARSE_READ_DATA; + break; + case CEPH_SPARSE_READ_DATA: + if (sr->sr_index >= count) { + sr->sr_state = CEPH_SPARSE_READ_HDR; + goto next_op; + } + + eoff = sr->sr_extent[sr->sr_index].off; + elen = sr->sr_extent[sr->sr_index].len; + + dout("[%d] ext %d off 0x%llx len 0x%llx\n", + o->o_osd, sr->sr_index, eoff, elen); + + if (elen > INT_MAX) { + dout("Sparse read extent length too long (0x%llx)\n", elen); + return -EREMOTEIO; + } + + /* zero out anything from sr_pos to start of extent */ + if (sr->sr_pos < eoff) + advance_cursor(cursor, eoff - sr->sr_pos, true); + + /* Set position to end of extent */ + sr->sr_pos = eoff + elen; + + /* send back the new length and nullify the ptr */ + cursor->sr_resid = elen; + ret = elen; + *pbuf = NULL; + + /* Bump the array index */ + ++sr->sr_index; + break; + } + return ret; +} + static const struct ceph_connection_operations osd_con_ops = { .get = osd_get_con, .put = osd_put_con, + .sparse_read = osd_sparse_read, .alloc_msg = osd_alloc_msg, .dispatch = osd_dispatch, .fault = osd_fault, From patchwork Wed Apr 27 19:12:15 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566801 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1FB87C433EF for ; Wed, 27 Apr 2022 19:16:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231820AbiD0TT0 (ORCPT ); Wed, 27 Apr 2022 15:19:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49662 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232787AbiD0TTN (ORCPT ); Wed, 27 Apr 2022 15:19:13 -0400 Received: from sin.source.kernel.org (sin.source.kernel.org [IPv6:2604:1380:40e1:4800::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 90B676B0B0 for ; Wed, 27 Apr 2022 12:13:23 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 0B5A7CE275A for ; Wed, 27 Apr 2022 19:13:22 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D9FEEC385AA; Wed, 27 Apr 2022 19:13:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086800; bh=y1CHM8STK8OTbioMqhyja91DwJu2qLFOOxUQqODp/bA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=KGNE8mBZe3K2GFObo09a7nKiFy9hmELDPDW7fUj5+aOa1B5COsFSdxY4l6lyf8rkz Qn6xxw7cCWXO1UdTpH2TKHftjoYuNaT06o3lJFapS9bx5cndzHVvI9d9bj6N5fEBBq X/EyjaNzkB2Fr0icsBMoGa5KwvidZTLYjDn63PMWnhWlZ+gdeoibAEfHdJcrLZu+Kz iMxgDH4iMMdVzdwRuphchbXq0z4+hQFT62rKlunU8bw2rjfDPvQj6Aw5zKysd+li27 xP7Nvf82LJZxP+hER4ODXfZ5rDYF1abhpTb7W4RHLeg1uR4tX8SayANP6MwxPb+9SV 39cJBEsTrXvbg== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 05/64] libceph: support sparse reads on msgr2 secure codepath Date: Wed, 27 Apr 2022 15:12:15 -0400 Message-Id: <20220427191314.222867-6-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Add a new init_sgs_pages helper that populates the scatterlist from an arbitrary point in an array of pages. Change setup_message_sgs to take an optional pointer to an array of pages. If that's set, then the scatterlist will be set using that array instead of the cursor. When given a sparse read on a secure connection, decrypt the data in-place rather than into the final destination, by passing it the in_enc_pages array. After decrypting, run the sparse_read state machine in a loop, copying data from the decrypted pages until it's complete. Signed-off-by: Jeff Layton --- net/ceph/messenger_v2.c | 119 ++++++++++++++++++++++++++++++++++++---- 1 file changed, 109 insertions(+), 10 deletions(-) diff --git a/net/ceph/messenger_v2.c b/net/ceph/messenger_v2.c index d527777af584..3dcaee6f8903 100644 --- a/net/ceph/messenger_v2.c +++ b/net/ceph/messenger_v2.c @@ -963,12 +963,48 @@ static void init_sgs_cursor(struct scatterlist **sg, } } +/** + * init_sgs_pages: set up scatterlist on an array of page pointers + * @sg: scatterlist to populate + * @pages: pointer to page array + * @dpos: position in the array to start (bytes) + * @dlen: len to add to sg (bytes) + * @pad: pointer to pad destination (if any) + * + * Populate the scatterlist from the page array, starting at an arbitrary + * byte in the array and running for a specified length. + */ +static void init_sgs_pages(struct scatterlist **sg, struct page **pages, + int dpos, int dlen, u8 *pad) +{ + int idx = dpos >> PAGE_SHIFT; + int off = offset_in_page(dpos); + int resid = dlen; + + do { + int len = min(resid, (int)PAGE_SIZE - off); + + sg_set_page(*sg, pages[idx], len, off); + *sg = sg_next(*sg); + off = 0; + ++idx; + resid -= len; + } while (resid); + + if (need_padding(dlen)) { + sg_set_buf(*sg, pad, padding_len(dlen)); + *sg = sg_next(*sg); + } +} + static int setup_message_sgs(struct sg_table *sgt, struct ceph_msg *msg, u8 *front_pad, u8 *middle_pad, u8 *data_pad, - void *epilogue, bool add_tag) + void *epilogue, struct page **pages, int dpos, + bool add_tag) { struct ceph_msg_data_cursor cursor; struct scatterlist *cur_sg; + int dlen = data_len(msg); int sg_cnt; int ret; @@ -982,9 +1018,15 @@ static int setup_message_sgs(struct sg_table *sgt, struct ceph_msg *msg, if (middle_len(msg)) sg_cnt += calc_sg_cnt(msg->middle->vec.iov_base, middle_len(msg)); - if (data_len(msg)) { - ceph_msg_data_cursor_init(&cursor, msg, data_len(msg)); - sg_cnt += calc_sg_cnt_cursor(&cursor); + if (dlen) { + if (pages) { + sg_cnt += calc_pages_for(dpos, dlen); + if (need_padding(dlen)) + sg_cnt++; + } else { + ceph_msg_data_cursor_init(&cursor, msg, dlen); + sg_cnt += calc_sg_cnt_cursor(&cursor); + } } ret = sg_alloc_table(sgt, sg_cnt, GFP_NOIO); @@ -998,9 +1040,13 @@ static int setup_message_sgs(struct sg_table *sgt, struct ceph_msg *msg, if (middle_len(msg)) init_sgs(&cur_sg, msg->middle->vec.iov_base, middle_len(msg), middle_pad); - if (data_len(msg)) { - ceph_msg_data_cursor_init(&cursor, msg, data_len(msg)); - init_sgs_cursor(&cur_sg, &cursor, data_pad); + if (dlen) { + if (pages) { + init_sgs_pages(&cur_sg, pages, dpos, dlen, data_pad); + } else { + ceph_msg_data_cursor_init(&cursor, msg, dlen); + init_sgs_cursor(&cur_sg, &cursor, data_pad); + } } WARN_ON(!sg_is_last(cur_sg)); @@ -1035,10 +1081,52 @@ static int decrypt_control_remainder(struct ceph_connection *con) padded_len(rem_len) + CEPH_GCM_TAG_LEN); } +/* Process sparse read data that lives in a buffer */ +static int process_v2_sparse_read(struct ceph_connection *con, struct page **pages, int spos) +{ + struct ceph_msg_data_cursor *cursor = &con->v2.in_cursor; + int ret; + + for (;;) { + char *buf = NULL; + + ret = con->ops->sparse_read(con, cursor, &buf); + if (ret <= 0) + return ret; + + dout("%s: sparse_read return %x buf %p\n", __func__, ret, buf); + + do { + int idx = spos >> PAGE_SHIFT; + int soff = offset_in_page(spos); + struct page *spage = con->v2.in_enc_pages[idx]; + int len = min_t(int, ret, PAGE_SIZE - soff); + + if (buf) { + memcpy_from_page(buf, spage, soff, len); + buf += len; + } else { + struct bio_vec bv; + + get_bvec_at(cursor, &bv); + len = min_t(int, len, bv.bv_len); + memcpy_page(bv.bv_page, bv.bv_offset, + spage, soff, len); + ceph_msg_data_advance(cursor, len); + } + spos += len; + ret -= len; + } while (ret); + } +} + static int decrypt_tail(struct ceph_connection *con) { struct sg_table enc_sgt = {}; struct sg_table sgt = {}; + struct page **pages = NULL; + bool sparse = con->in_msg->sparse_read; + int dpos = 0; int tail_len; int ret; @@ -1049,9 +1137,14 @@ static int decrypt_tail(struct ceph_connection *con) if (ret) goto out; + if (sparse) { + dpos = padded_len(front_len(con->in_msg) + padded_len(middle_len(con->in_msg))); + pages = con->v2.in_enc_pages; + } + ret = setup_message_sgs(&sgt, con->in_msg, FRONT_PAD(con->v2.in_buf), - MIDDLE_PAD(con->v2.in_buf), DATA_PAD(con->v2.in_buf), - con->v2.in_buf, true); + MIDDLE_PAD(con->v2.in_buf), DATA_PAD(con->v2.in_buf), + con->v2.in_buf, pages, dpos, true); if (ret) goto out; @@ -1061,6 +1154,12 @@ static int decrypt_tail(struct ceph_connection *con) if (ret) goto out; + if (sparse && data_len(con->in_msg)) { + ret = process_v2_sparse_read(con, con->v2.in_enc_pages, dpos); + if (ret) + goto out; + } + WARN_ON(!con->v2.in_enc_page_cnt); ceph_release_page_vector(con->v2.in_enc_pages, con->v2.in_enc_page_cnt); @@ -1584,7 +1683,7 @@ static int prepare_message_secure(struct ceph_connection *con) encode_epilogue_secure(con, false); ret = setup_message_sgs(&sgt, con->out_msg, zerop, zerop, zerop, - &con->v2.out_epil, false); + &con->v2.out_epil, NULL, 0, false); if (ret) goto out; From patchwork Wed Apr 27 19:12:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566773 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3EB46C433FE for ; Wed, 27 Apr 2022 19:25:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232710AbiD0T2K (ORCPT ); Wed, 27 Apr 2022 15:28:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51460 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232958AbiD0TTN (ORCPT ); Wed, 27 Apr 2022 15:19:13 -0400 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 13C0B6B0BB for ; Wed, 27 Apr 2022 12:13:25 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 72BA2CE26AE for ; Wed, 27 Apr 2022 19:13:23 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 45ED5C385AE; Wed, 27 Apr 2022 19:13:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086801; bh=trOGwfW0dpuka6lVn4FyjgWAwHGV8whpehGIyBn/Kcw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=fdSfBkOOHecT7y7RQfW8wqcQ+rMgWJ4sbZtxChGShOqGW4xytI+R1JMw3osgj+Yok rC6omT7xGxLVq8hZTZAybsJkIEwg6AFwcjF6NTyvrKUKeC4Ys+fB4FXtlcwc3aNYaP oKFUbmTdS5zGjkoWqlEbvZdAwSvjTxEyeC8WwgfGOSh3Ubz4Hi7r+6bAHxkqfD13Eb h7iEyFAaysVGwpn9RJ2gxGMTOUIMutqfQcUVNDFflU0z57GFww4VqClNq+5EaCDUdq DlQfaIhs8pkZFG/hjrAD3Cv9K13+5v9w+S2UwJ4kriv6szUypnupkhGQqbs8P+NGbl zE9DgwB2esBVQ== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 07/64] ceph: add new mount option to enable sparse reads Date: Wed, 27 Apr 2022 15:12:17 -0400 Message-Id: <20220427191314.222867-8-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Add a new mount option that has the client issue sparse reads instead of normal ones. The callers now preallocate an sparse extent buffer that the libceph receive code can populate and hand back after the operation completes. After a successful sparse read, we can't use the req->r_result value to determine the amount of data "read", so instead we set the received length to be from the end of the last extent in the buffer. Any interstitial holes will have been filled by the receive code. Reviewed-by: Xiubo Li Signed-off-by: Jeff Layton --- fs/ceph/addr.c | 17 +++++++++++++++-- fs/ceph/file.c | 51 +++++++++++++++++++++++++++++++++++++++++-------- fs/ceph/super.c | 16 +++++++++++++++- fs/ceph/super.h | 1 + 4 files changed, 74 insertions(+), 11 deletions(-) diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c index 04a6c3f02f0c..84d0b52010d9 100644 --- a/fs/ceph/addr.c +++ b/fs/ceph/addr.c @@ -219,8 +219,10 @@ static void finish_netfs_read(struct ceph_osd_request *req) struct ceph_fs_client *fsc = ceph_inode_to_client(req->r_inode); struct ceph_osd_data *osd_data = osd_req_op_extent_osd_data(req, 0); struct netfs_io_subrequest *subreq = req->r_priv; + struct ceph_osd_req_op *op = &req->r_ops[0]; int num_pages; int err = req->r_result; + bool sparse = (op->op == CEPH_OSD_OP_SPARSE_READ); ceph_update_read_metrics(&fsc->mdsc->metric, req->r_start_latency, req->r_end_latency, osd_data->length, err); @@ -229,7 +231,9 @@ static void finish_netfs_read(struct ceph_osd_request *req) subreq->len, i_size_read(req->r_inode)); /* no object means success but no data */ - if (err == -ENOENT) + if (sparse && err >= 0) + err = ceph_sparse_ext_map_end(op); + else if (err == -ENOENT) err = 0; else if (err == -EBLOCKLISTED) fsc->blocklisted = true; @@ -312,13 +316,14 @@ static void ceph_netfs_issue_read(struct netfs_io_subrequest *subreq) size_t page_off; int err = 0; u64 len = subreq->len; + bool sparse = ceph_test_mount_opt(fsc, SPARSEREAD); if (ci->i_inline_version != CEPH_INLINE_NONE && ceph_netfs_issue_op_inline(subreq)) return; req = ceph_osdc_new_request(&fsc->client->osdc, &ci->i_layout, vino, subreq->start, &len, - 0, 1, CEPH_OSD_OP_READ, + 0, 1, sparse ? CEPH_OSD_OP_SPARSE_READ : CEPH_OSD_OP_READ, CEPH_OSD_FLAG_READ | fsc->client->osdc.client->options->read_from_replica, NULL, ci->i_truncate_seq, ci->i_truncate_size, false); if (IS_ERR(req)) { @@ -327,6 +332,14 @@ static void ceph_netfs_issue_read(struct netfs_io_subrequest *subreq) goto out; } + if (sparse) { + err = ceph_alloc_sparse_ext_map(&req->r_ops[0]); + if (err) { + ceph_osdc_put_request(req); + goto out; + } + } + dout("%s: pos=%llu orig_len=%zu len=%llu\n", __func__, subreq->start, subreq->len, len); iov_iter_xarray(&iter, READ, &rreq->mapping->i_pages, subreq->start, len); err = iov_iter_get_pages_alloc(&iter, &pages, len, &page_off); diff --git a/fs/ceph/file.c b/fs/ceph/file.c index 8c8226c0feac..c805a5f5a6d1 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -915,6 +915,7 @@ static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to, u64 off = iocb->ki_pos; u64 len = iov_iter_count(to); u64 i_size = i_size_read(inode); + bool sparse = ceph_test_mount_opt(fsc, SPARSEREAD); dout("sync_read on file %p %llu~%u %s\n", file, off, (unsigned)len, (file->f_flags & O_DIRECT) ? "O_DIRECT" : ""); @@ -941,10 +942,12 @@ static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to, bool more; int idx; size_t left; + struct ceph_osd_req_op *op; req = ceph_osdc_new_request(osdc, &ci->i_layout, ci->i_vino, off, &len, 0, 1, - CEPH_OSD_OP_READ, CEPH_OSD_FLAG_READ, + sparse ? CEPH_OSD_OP_SPARSE_READ : CEPH_OSD_OP_READ, + CEPH_OSD_FLAG_READ, NULL, ci->i_truncate_seq, ci->i_truncate_size, false); if (IS_ERR(req)) { @@ -965,6 +968,16 @@ static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to, osd_req_op_extent_osd_data_pages(req, 0, pages, len, page_off, false, false); + + op = &req->r_ops[0]; + if (sparse) { + ret = ceph_alloc_sparse_ext_map(op); + if (ret) { + ceph_osdc_put_request(req); + break; + } + } + ret = ceph_osdc_start_request(osdc, req, false); if (!ret) ret = ceph_osdc_wait_request(osdc, req); @@ -974,19 +987,24 @@ static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to, req->r_end_latency, len, ret); - ceph_osdc_put_request(req); - i_size = i_size_read(inode); dout("sync_read %llu~%llu got %zd i_size %llu%s\n", off, len, ret, i_size, (more ? " MORE" : "")); - if (ret == -ENOENT) + /* Fix it to go to end of extent map */ + if (sparse && ret >= 0) + ret = ceph_sparse_ext_map_end(op); + else if (ret == -ENOENT) ret = 0; + + ceph_osdc_put_request(req); + if (ret >= 0 && ret < len && (off + ret < i_size)) { int zlen = min(len - ret, i_size - off - ret); int zoff = page_off + ret; + dout("sync_read zero gap %llu~%llu\n", - off + ret, off + ret + zlen); + off + ret, off + ret + zlen); ceph_zero_page_vector_range(zoff, zlen, pages); ret += zlen; } @@ -1105,8 +1123,10 @@ static void ceph_aio_complete_req(struct ceph_osd_request *req) struct inode *inode = req->r_inode; struct ceph_aio_request *aio_req = req->r_priv; struct ceph_osd_data *osd_data = osd_req_op_extent_osd_data(req, 0); + struct ceph_osd_req_op *op = &req->r_ops[0]; struct ceph_client_metric *metric = &ceph_sb_to_mdsc(inode->i_sb)->metric; unsigned int len = osd_data->bvec_pos.iter.bi_size; + bool sparse = (op->op == CEPH_OSD_OP_SPARSE_READ); BUG_ON(osd_data->type != CEPH_OSD_DATA_TYPE_BVECS); BUG_ON(!osd_data->num_bvecs); @@ -1127,6 +1147,8 @@ static void ceph_aio_complete_req(struct ceph_osd_request *req) } rc = -ENOMEM; } else if (!aio_req->write) { + if (sparse && rc >= 0) + rc = ceph_sparse_ext_map_end(op); if (rc == -ENOENT) rc = 0; if (rc >= 0 && len > rc) { @@ -1263,6 +1285,7 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter, loff_t pos = iocb->ki_pos; bool write = iov_iter_rw(iter) == WRITE; bool should_dirty = !write && iter_is_iovec(iter); + bool sparse = ceph_test_mount_opt(fsc, SPARSEREAD); if (write && ceph_snap(file_inode(file)) != CEPH_NOSNAP) return -EROFS; @@ -1290,6 +1313,8 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter, while (iov_iter_count(iter) > 0) { u64 size = iov_iter_count(iter); ssize_t len; + struct ceph_osd_req_op *op; + int readop = sparse ? CEPH_OSD_OP_SPARSE_READ : CEPH_OSD_OP_READ; if (write) size = min_t(u64, size, fsc->mount_options->wsize); @@ -1300,8 +1325,7 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter, req = ceph_osdc_new_request(&fsc->client->osdc, &ci->i_layout, vino, pos, &size, 0, 1, - write ? CEPH_OSD_OP_WRITE : - CEPH_OSD_OP_READ, + write ? CEPH_OSD_OP_WRITE : readop, flags, snapc, ci->i_truncate_seq, ci->i_truncate_size, @@ -1352,6 +1376,14 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter, } osd_req_op_extent_osd_data_bvecs(req, 0, bvecs, num_pages, len); + op = &req->r_ops[0]; + if (sparse) { + ret = ceph_alloc_sparse_ext_map(op); + if (ret) { + ceph_osdc_put_request(req); + break; + } + } if (aio_req) { aio_req->total_len += len; @@ -1380,8 +1412,11 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter, size = i_size_read(inode); if (!write) { - if (ret == -ENOENT) + if (sparse && ret >= 0) + ret = ceph_sparse_ext_map_end(op); + else if (ret == -ENOENT) ret = 0; + if (ret >= 0 && ret < len && pos + ret < size) { struct iov_iter i; int zlen = min_t(size_t, len - ret, diff --git a/fs/ceph/super.c b/fs/ceph/super.c index b73b4f75462c..b03cd54006db 100644 --- a/fs/ceph/super.c +++ b/fs/ceph/super.c @@ -163,6 +163,7 @@ enum { Opt_copyfrom, Opt_wsync, Opt_pagecache, + Opt_sparseread, }; enum ceph_recover_session_mode { @@ -205,6 +206,7 @@ static const struct fs_parameter_spec ceph_mount_parameters[] = { fsparam_u32 ("wsize", Opt_wsize), fsparam_flag_no ("wsync", Opt_wsync), fsparam_flag_no ("pagecache", Opt_pagecache), + fsparam_flag_no ("sparseread", Opt_sparseread), {} }; @@ -574,6 +576,12 @@ static int ceph_parse_mount_param(struct fs_context *fc, else fsopt->flags &= ~CEPH_MOUNT_OPT_NOPAGECACHE; break; + case Opt_sparseread: + if (result.negated) + fsopt->flags &= ~CEPH_MOUNT_OPT_SPARSEREAD; + else + fsopt->flags |= CEPH_MOUNT_OPT_SPARSEREAD; + break; default: BUG(); } @@ -708,9 +716,10 @@ static int ceph_show_options(struct seq_file *m, struct dentry *root) if (!(fsopt->flags & CEPH_MOUNT_OPT_ASYNC_DIROPS)) seq_puts(m, ",wsync"); - if (fsopt->flags & CEPH_MOUNT_OPT_NOPAGECACHE) seq_puts(m, ",nopagecache"); + if (fsopt->flags & CEPH_MOUNT_OPT_SPARSEREAD) + seq_puts(m, ",sparseread"); if (fsopt->wsize != CEPH_MAX_WRITE_SIZE) seq_printf(m, ",wsize=%u", fsopt->wsize); @@ -1291,6 +1300,11 @@ static int ceph_reconfigure_fc(struct fs_context *fc) else ceph_clear_mount_opt(fsc, ASYNC_DIROPS); + if (fsopt->flags & CEPH_MOUNT_OPT_SPARSEREAD) + ceph_set_mount_opt(fsc, SPARSEREAD); + else + ceph_clear_mount_opt(fsc, SPARSEREAD); + if (strcmp_null(fsc->mount_options->mon_addr, fsopt->mon_addr)) { kfree(fsc->mount_options->mon_addr); fsc->mount_options->mon_addr = fsopt->mon_addr; diff --git a/fs/ceph/super.h b/fs/ceph/super.h index 669036ebef1e..0c7a6fedc58a 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -41,6 +41,7 @@ #define CEPH_MOUNT_OPT_NOCOPYFROM (1<<14) /* don't use RADOS 'copy-from' op */ #define CEPH_MOUNT_OPT_ASYNC_DIROPS (1<<15) /* allow async directory ops */ #define CEPH_MOUNT_OPT_NOPAGECACHE (1<<16) /* bypass pagecache altogether */ +#define CEPH_MOUNT_OPT_SPARSEREAD (1<<17) /* always do sparse reads */ #define CEPH_MOUNT_OPT_DEFAULT \ (CEPH_MOUNT_OPT_DCACHE | \ From patchwork Wed Apr 27 19:12:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566802 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DA7B9C433EF for ; Wed, 27 Apr 2022 19:16:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233719AbiD0TTW (ORCPT ); Wed, 27 Apr 2022 15:19:22 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49660 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232853AbiD0TTN (ORCPT ); Wed, 27 Apr 2022 15:19:13 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 27BF028E10 for ; Wed, 27 Apr 2022 12:13:25 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id B4EB061A00 for ; Wed, 27 Apr 2022 19:13:24 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 972B1C385A7; Wed, 27 Apr 2022 19:13:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086804; bh=AL8GiMWUOKq04yGI7r7aMJqU4jt3Jz8zt+t2KqAdU5U=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=UV6Qs3W/IwOVQRUzzgky5s8pQYYLT/oHqmhuPyMmq3t4DijC1u+Sa4+dJA0/MZ67h Re/i9puf9f254Hn5OgzOPWs0LRvoixao5bl0yIuSPtXZqCSS2ZGpxqrngP96asm/X7 pJ4cvgup2z2S1pvubzXpxt8rdw4J5zhVyt7XAf7zMe5o+Bf/B2lYSTdswJnMedIZ3b eS4OPsJU0RvaEb81jSyQH0GtFlZPnT1KBqYTWsHTqOA0lsONaoAoSWIlMQx4HtW1Ga E+9HcyYi3fspLzRWEYYfrQCWslufKuJaXogNxMJIJ2OfXh+lTpK8ZXeWZXuQZKAsKp LHNzFx15mS6zg== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com, Eric Biggers Subject: [PATCH v14 10/64] fscrypt: add fscrypt_context_for_new_inode Date: Wed, 27 Apr 2022 15:12:20 -0400 Message-Id: <20220427191314.222867-11-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Most filesystems just call fscrypt_set_context on new inodes, which usually causes a setxattr. That's a bit late for ceph, which can send along a full set of attributes with the create request. Doing so allows it to avoid race windows that where the new inode could be seen by other clients without the crypto context attached. It also avoids the separate round trip to the server. Refactor the fscrypt code a bit to allow us to create a new crypto context, attach it to the inode, and write it to the buffer, but without calling set_context on it. ceph can later use this to marshal the context into the attributes we send along with the create request. Acked-by: Eric Biggers Signed-off-by: Jeff Layton --- fs/crypto/policy.c | 35 +++++++++++++++++++++++++++++------ include/linux/fscrypt.h | 1 + 2 files changed, 30 insertions(+), 6 deletions(-) diff --git a/fs/crypto/policy.c b/fs/crypto/policy.c index ed3d623724cd..ec861af96252 100644 --- a/fs/crypto/policy.c +++ b/fs/crypto/policy.c @@ -664,6 +664,32 @@ const union fscrypt_policy *fscrypt_policy_to_inherit(struct inode *dir) return fscrypt_get_dummy_policy(dir->i_sb); } +/** + * fscrypt_context_for_new_inode() - create an encryption context for a new inode + * @ctx: where context should be written + * @inode: inode from which to fetch policy and nonce + * + * Given an in-core "prepared" (via fscrypt_prepare_new_inode) inode, + * generate a new context and write it to ctx. ctx _must_ be at least + * FSCRYPT_SET_CONTEXT_MAX_SIZE bytes. + * + * Return: size of the resulting context or a negative error code. + */ +int fscrypt_context_for_new_inode(void *ctx, struct inode *inode) +{ + struct fscrypt_info *ci = inode->i_crypt_info; + + BUILD_BUG_ON(sizeof(union fscrypt_context) != + FSCRYPT_SET_CONTEXT_MAX_SIZE); + + /* fscrypt_prepare_new_inode() should have set up the key already. */ + if (WARN_ON_ONCE(!ci)) + return -ENOKEY; + + return fscrypt_new_context(ctx, &ci->ci_policy, ci->ci_nonce); +} +EXPORT_SYMBOL_GPL(fscrypt_context_for_new_inode); + /** * fscrypt_set_context() - Set the fscrypt context of a new inode * @inode: a new inode @@ -680,12 +706,9 @@ int fscrypt_set_context(struct inode *inode, void *fs_data) union fscrypt_context ctx; int ctxsize; - /* fscrypt_prepare_new_inode() should have set up the key already. */ - if (WARN_ON_ONCE(!ci)) - return -ENOKEY; - - BUILD_BUG_ON(sizeof(ctx) != FSCRYPT_SET_CONTEXT_MAX_SIZE); - ctxsize = fscrypt_new_context(&ctx, &ci->ci_policy, ci->ci_nonce); + ctxsize = fscrypt_context_for_new_inode(&ctx, inode); + if (ctxsize < 0) + return ctxsize; /* * This may be the first time the inode number is available, so do any diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h index c79c4333f769..2b892376d325 100644 --- a/include/linux/fscrypt.h +++ b/include/linux/fscrypt.h @@ -273,6 +273,7 @@ int fscrypt_ioctl_get_policy(struct file *filp, void __user *arg); int fscrypt_ioctl_get_policy_ex(struct file *filp, void __user *arg); int fscrypt_ioctl_get_nonce(struct file *filp, void __user *arg); int fscrypt_has_permitted_context(struct inode *parent, struct inode *child); +int fscrypt_context_for_new_inode(void *ctx, struct inode *inode); int fscrypt_set_context(struct inode *inode, void *fs_data); struct fscrypt_dummy_policy { From patchwork Wed Apr 27 19:12:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566799 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7A11BC433F5 for ; Wed, 27 Apr 2022 19:16:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233205AbiD0TTm (ORCPT ); Wed, 27 Apr 2022 15:19:42 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51848 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232969AbiD0TTN (ORCPT ); Wed, 27 Apr 2022 15:19:13 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0622F2C13F for ; Wed, 27 Apr 2022 12:13:27 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id DB9F1619FC for ; Wed, 27 Apr 2022 19:13:26 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D51CCC385AE; Wed, 27 Apr 2022 19:13:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086806; bh=5zFRUKmjXvvstcTiF+tAA1/QKJtXVH1q8ZRNbJAoGs4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=RZViOCwUZB7yY9jnLX0DtZB5gRL7ONYTpLCS8miODH18eAdWB5DjsEtNr8C8GJV9s asaNLA1Ora4XFxhRwNRP2ZL7lRL5ZAuDmgPgWoqHWoLADbwp9p34vUcISCwNwz6yXV z+c18mFqb7FLQChllfRvg4jdVt5n/SkZlVKVJXDA6YN86roPgImlniHWhovEswk8vc gEAm+BhBtGEZEe6gN37Lr/CcHARiJkj1MEroFlOWn3RWi6Xrnm/ylDT+T4leg7QcCd LKgLc27Q7fLMYuWhBqNkdwNyIEttr0LbtzPN3MgX2+M8fyK5VAvu4y5RIxMyiXlR9O 6Eb7Z8gilpJ2Q== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 13/64] ceph: ensure that we accept a new context from MDS for new inodes Date: Wed, 27 Apr 2022 15:12:23 -0400 Message-Id: <20220427191314.222867-14-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Signed-off-by: Jeff Layton --- fs/ceph/inode.c | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index 74e428b22683..2fb7481060cd 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -944,6 +944,17 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page, __ceph_update_quota(ci, iinfo->max_bytes, iinfo->max_files); +#ifdef CONFIG_FS_ENCRYPTION + if (iinfo->fscrypt_auth_len && (inode->i_state & I_NEW)) { + kfree(ci->fscrypt_auth); + ci->fscrypt_auth_len = iinfo->fscrypt_auth_len; + ci->fscrypt_auth = iinfo->fscrypt_auth; + iinfo->fscrypt_auth = NULL; + iinfo->fscrypt_auth_len = 0; + inode_set_flags(inode, S_ENCRYPTED, S_ENCRYPTED); + } +#endif + if ((new_version || (new_issued & CEPH_CAP_AUTH_SHARED)) && (issued & CEPH_CAP_AUTH_EXCL) == 0) { inode->i_mode = mode; @@ -1033,16 +1044,6 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page, xattr_blob = NULL; } -#ifdef CONFIG_FS_ENCRYPTION - if (iinfo->fscrypt_auth_len && !ci->fscrypt_auth) { - ci->fscrypt_auth_len = iinfo->fscrypt_auth_len; - ci->fscrypt_auth = iinfo->fscrypt_auth; - iinfo->fscrypt_auth = NULL; - iinfo->fscrypt_auth_len = 0; - inode_set_flags(inode, S_ENCRYPTED, S_ENCRYPTED); - } -#endif - /* finally update i_version */ if (le64_to_cpu(info->version) > ci->i_version) ci->i_version = le64_to_cpu(info->version); From patchwork Wed Apr 27 19:12:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566774 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D9C54C433EF for ; Wed, 27 Apr 2022 19:24:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233143AbiD0T2G (ORCPT ); Wed, 27 Apr 2022 15:28:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38478 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232983AbiD0TTN (ORCPT ); Wed, 27 Apr 2022 15:19:13 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 88EB888790 for ; Wed, 27 Apr 2022 12:13:29 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 40CA8B8294C for ; Wed, 27 Apr 2022 19:13:28 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8AADCC385A9; Wed, 27 Apr 2022 19:13:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086807; bh=W8QAxDrD8lDcr0emZ2VvimDJk4N/JbyJ2Y1e+Rk7ANg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=XtmMIM8U8Ahr0gyXBTXC3sje1XKhIUkFbJ/jZbtpz63ZX601OWQ5e0UAYrdi/UiUh C0t+IKVBr/Mw/a22uyY5cKgTVO+REH7D0u8QSAFU1Gd0nbh60gQFkiLvFsyCm5H7oR GAamHbKWy8gwA1vqm7VlggPTwbCkqVze3CRrvrspDPBHUjXoSSFrxZaeeg/f0br9q4 T3Ow+LfME7pxBDNiKpQJgIdzxM7hSOsQIdBwRyaIyE8OetIQdp5cS9LgRXkaLJ5fh7 y+ovS/LMyCiwm8/UbBlagA1xxSdAyBsqrJgWr5TP8JPfNUR+2FrlshZ5qXf3Jo4hb/ bvItJeX0BI1VQ== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 14/64] ceph: add support for fscrypt_auth/fscrypt_file to cap messages Date: Wed, 27 Apr 2022 15:12:24 -0400 Message-Id: <20220427191314.222867-15-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Add support for new version 12 cap messages that carry the new fscrypt_auth and fscrypt_file fields from the inode. Signed-off-by: Jeff Layton --- fs/ceph/caps.c | 76 +++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 63 insertions(+), 13 deletions(-) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index 7903276f7967..c49053e36112 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -13,6 +13,7 @@ #include "super.h" #include "mds_client.h" #include "cache.h" +#include "crypto.h" #include #include @@ -1214,15 +1215,12 @@ struct cap_msg_args { umode_t mode; bool inline_data; bool wake; + u32 fscrypt_auth_len; + u32 fscrypt_file_len; + u8 fscrypt_auth[sizeof(struct ceph_fscrypt_auth)]; // for context + u8 fscrypt_file[sizeof(u64)]; // for size }; -/* - * cap struct size + flock buffer size + inline version + inline data size + - * osd_epoch_barrier + oldest_flush_tid - */ -#define CAP_MSG_SIZE (sizeof(struct ceph_mds_caps) + \ - 4 + 8 + 4 + 4 + 8 + 4 + 4 + 4 + 8 + 8 + 4) - /* Marshal up the cap msg to the MDS */ static void encode_cap_msg(struct ceph_msg *msg, struct cap_msg_args *arg) { @@ -1238,7 +1236,7 @@ static void encode_cap_msg(struct ceph_msg *msg, struct cap_msg_args *arg) arg->size, arg->max_size, arg->xattr_version, arg->xattr_buf ? (int)arg->xattr_buf->vec.iov_len : 0); - msg->hdr.version = cpu_to_le16(10); + msg->hdr.version = cpu_to_le16(12); msg->hdr.tid = cpu_to_le64(arg->flush_tid); fc = msg->front.iov_base; @@ -1309,6 +1307,21 @@ static void encode_cap_msg(struct ceph_msg *msg, struct cap_msg_args *arg) /* Advisory flags (version 10) */ ceph_encode_32(&p, arg->flags); + + /* dirstats (version 11) - these are r/o on the client */ + ceph_encode_64(&p, 0); + ceph_encode_64(&p, 0); + +#if IS_ENABLED(CONFIG_FS_ENCRYPTION) + /* fscrypt_auth and fscrypt_file (version 12) */ + ceph_encode_32(&p, arg->fscrypt_auth_len); + ceph_encode_copy(&p, arg->fscrypt_auth, arg->fscrypt_auth_len); + ceph_encode_32(&p, arg->fscrypt_file_len); + ceph_encode_copy(&p, arg->fscrypt_file, arg->fscrypt_file_len); +#else /* CONFIG_FS_ENCRYPTION */ + ceph_encode_32(&p, 0); + ceph_encode_32(&p, 0); +#endif /* CONFIG_FS_ENCRYPTION */ } /* @@ -1430,8 +1443,37 @@ static void __prep_cap(struct cap_msg_args *arg, struct ceph_cap *cap, } } arg->flags = flags; +#if IS_ENABLED(CONFIG_FS_ENCRYPTION) + if (ci->fscrypt_auth_len && + WARN_ON_ONCE(ci->fscrypt_auth_len > sizeof(struct ceph_fscrypt_auth))) { + /* Don't set this if it's too big */ + arg->fscrypt_auth_len = 0; + } else { + arg->fscrypt_auth_len = ci->fscrypt_auth_len; + memcpy(arg->fscrypt_auth, ci->fscrypt_auth, + min_t(size_t, ci->fscrypt_auth_len, sizeof(arg->fscrypt_auth))); + } + /* FIXME: use this to track "real" size */ + arg->fscrypt_file_len = 0; +#endif /* CONFIG_FS_ENCRYPTION */ } +#define CAP_MSG_FIXED_FIELDS (sizeof(struct ceph_mds_caps) + \ + 4 + 8 + 4 + 4 + 8 + 4 + 4 + 4 + 8 + 8 + 4 + 8 + 8 + 4 + 4) + +#if IS_ENABLED(CONFIG_FS_ENCRYPTION) +static inline int cap_msg_size(struct cap_msg_args *arg) +{ + return CAP_MSG_FIXED_FIELDS + arg->fscrypt_auth_len + + arg->fscrypt_file_len; +} +#else +static inline int cap_msg_size(struct cap_msg_args *arg) +{ + return CAP_MSG_FIXED_FIELDS; +} +#endif /* CONFIG_FS_ENCRYPTION */ + /* * Send a cap msg on the given inode. * @@ -1442,7 +1484,7 @@ static void __send_cap(struct cap_msg_args *arg, struct ceph_inode_info *ci) struct ceph_msg *msg; struct inode *inode = &ci->vfs_inode; - msg = ceph_msg_new(CEPH_MSG_CLIENT_CAPS, CAP_MSG_SIZE, GFP_NOFS, false); + msg = ceph_msg_new(CEPH_MSG_CLIENT_CAPS, cap_msg_size(arg), GFP_NOFS, false); if (!msg) { pr_err("error allocating cap msg: ino (%llx.%llx) flushing %s tid %llu, requeuing cap.\n", ceph_vinop(inode), ceph_cap_string(arg->dirty), @@ -1468,10 +1510,6 @@ static inline int __send_flush_snap(struct inode *inode, struct cap_msg_args arg; struct ceph_msg *msg; - msg = ceph_msg_new(CEPH_MSG_CLIENT_CAPS, CAP_MSG_SIZE, GFP_NOFS, false); - if (!msg) - return -ENOMEM; - arg.session = session; arg.ino = ceph_vino(inode).ino; arg.cid = 0; @@ -1509,6 +1547,18 @@ static inline int __send_flush_snap(struct inode *inode, arg.flags = 0; arg.wake = false; + /* + * No fscrypt_auth changes from a capsnap. It will need + * to update fscrypt_file on size changes (TODO). + */ + arg.fscrypt_auth_len = 0; + arg.fscrypt_file_len = 0; + + msg = ceph_msg_new(CEPH_MSG_CLIENT_CAPS, cap_msg_size(&arg), + GFP_NOFS, false); + if (!msg) + return -ENOMEM; + encode_cap_msg(msg, &arg); ceph_con_send(&arg.session->s_con, msg); return 0; From patchwork Wed Apr 27 19:12:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566800 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1C151C433F5 for ; Wed, 27 Apr 2022 19:16:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233305AbiD0TTh (ORCPT ); Wed, 27 Apr 2022 15:19:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38458 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232992AbiD0TTO (ORCPT ); Wed, 27 Apr 2022 15:19:14 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F0918887AF for ; Wed, 27 Apr 2022 12:13:29 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id AFB1CB8294F for ; Wed, 27 Apr 2022 19:13:28 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id EA923C385AE; Wed, 27 Apr 2022 19:13:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086808; bh=vz1Ry5ReZlY5z2VcDwbDw4IuIMhPeeWZ/YVqbVdimQU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ppsLt54385xjtEtMtEvB78rpuso55NtZsaI4476lWHvSeUqw31Jj+hqy+bfr2FfMq xCurN+C6yIzCbBzj0IOm1Vj+6u+e3dOpkdmyEfEIPmqm12jLgzkAz2v1zYl7FA0s85 lJ3MpO5JfPSivL12Em3QsENTycY65A3fvQkcaqHpk+UbR1KaTpuimS2Tkcm2aYZt5t nIqVARq413TMPTZrCtAcW36oCXXxq4JQZViijApMFv+M6z5PIBSEDjhOdyYhxXNQAp ds532g0sCkqJEC5yKZuR6RTyvu+nEcSjgWICCortfRBI+iabnxlfOiGMG+fnARzER7 zH1Gs+se9hVsw== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 16/64] ceph: decode alternate_name in lease info Date: Wed, 27 Apr 2022 15:12:26 -0400 Message-Id: <20220427191314.222867-17-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Ceph is a bit different from local filesystems, in that we don't want to store filenames as raw binary data, since we may also be dealing with clients that don't support fscrypt. We could just base64-encode the encrypted filenames, but that could leave us with filenames longer than NAME_MAX. It turns out that the MDS doesn't care much about filename length, but the clients do. To manage this, we've added a new "alternate name" field that can be optionally added to any dentry that we'll use to store the binary crypttext of the filename if its base64-encoded value will be longer than NAME_MAX. When a dentry has one of these names attached, the MDS will send it along in the lease info, which we can then store for later usage. Signed-off-by: Jeff Layton Signed-off-by: Xiubo Li --- fs/ceph/mds_client.c | 43 +++++++++++++++++++++++++++++++++---------- fs/ceph/mds_client.h | 11 +++++++---- 2 files changed, 40 insertions(+), 14 deletions(-) diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 38de82c2cb5a..68f25292a26a 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -308,27 +308,47 @@ static int parse_reply_info_dir(void **p, void *end, static int parse_reply_info_lease(void **p, void *end, struct ceph_mds_reply_lease **lease, - u64 features) + u64 features, u32 *altname_len, u8 **altname) { + u8 struct_v; + u32 struct_len; + void *lend; + if (features == (u64)-1) { - u8 struct_v, struct_compat; - u32 struct_len; + u8 struct_compat; + ceph_decode_8_safe(p, end, struct_v, bad); ceph_decode_8_safe(p, end, struct_compat, bad); + /* struct_v is expected to be >= 1. we only understand * encoding whose struct_compat == 1. */ if (!struct_v || struct_compat != 1) goto bad; + ceph_decode_32_safe(p, end, struct_len, bad); - ceph_decode_need(p, end, struct_len, bad); - end = *p + struct_len; + } else { + struct_len = sizeof(**lease); + *altname_len = 0; + *altname = NULL; } - ceph_decode_need(p, end, sizeof(**lease), bad); + lend = *p + struct_len; + ceph_decode_need(p, end, struct_len, bad); *lease = *p; *p += sizeof(**lease); - if (features == (u64)-1) - *p = end; + + if (features == (u64)-1) { + if (struct_v >= 2) { + ceph_decode_32_safe(p, end, *altname_len, bad); + ceph_decode_need(p, end, *altname_len, bad); + *altname = *p; + *p += *altname_len; + } else { + *altname = NULL; + *altname_len = 0; + } + } + *p = lend; return 0; bad: return -EIO; @@ -358,7 +378,8 @@ static int parse_reply_info_trace(void **p, void *end, info->dname = *p; *p += info->dname_len; - err = parse_reply_info_lease(p, end, &info->dlease, features); + err = parse_reply_info_lease(p, end, &info->dlease, features, + &info->altname_len, &info->altname); if (err < 0) goto out_bad; } @@ -425,9 +446,11 @@ static int parse_reply_info_readdir(void **p, void *end, dout("parsed dir dname '%.*s'\n", rde->name_len, rde->name); /* dentry lease */ - err = parse_reply_info_lease(p, end, &rde->lease, features); + err = parse_reply_info_lease(p, end, &rde->lease, features, + &rde->altname_len, &rde->altname); if (err) goto out_bad; + /* inode */ err = parse_reply_info_in(p, end, &rde->inode, features); if (err < 0) diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h index aab3ab284fce..2cc75f9ae7c7 100644 --- a/fs/ceph/mds_client.h +++ b/fs/ceph/mds_client.h @@ -29,8 +29,8 @@ enum ceph_feature_type { CEPHFS_FEATURE_MULTI_RECONNECT, CEPHFS_FEATURE_DELEG_INO, CEPHFS_FEATURE_METRIC_COLLECT, - - CEPHFS_FEATURE_MAX = CEPHFS_FEATURE_METRIC_COLLECT, + CEPHFS_FEATURE_ALTERNATE_NAME, + CEPHFS_FEATURE_MAX = CEPHFS_FEATURE_ALTERNATE_NAME, }; /* @@ -45,8 +45,7 @@ enum ceph_feature_type { CEPHFS_FEATURE_MULTI_RECONNECT, \ CEPHFS_FEATURE_DELEG_INO, \ CEPHFS_FEATURE_METRIC_COLLECT, \ - \ - CEPHFS_FEATURE_MAX, \ + CEPHFS_FEATURE_ALTERNATE_NAME, \ } #define CEPHFS_FEATURES_CLIENT_REQUIRED {} @@ -98,7 +97,9 @@ struct ceph_mds_reply_info_in { struct ceph_mds_reply_dir_entry { char *name; + u8 *altname; u32 name_len; + u32 altname_len; struct ceph_mds_reply_lease *lease; struct ceph_mds_reply_info_in inode; loff_t offset; @@ -122,7 +123,9 @@ struct ceph_mds_reply_info_parsed { struct ceph_mds_reply_info_in diri, targeti; struct ceph_mds_reply_dirfrag *dirfrag; char *dname; + u8 *altname; u32 dname_len; + u32 altname_len; struct ceph_mds_reply_lease *dlease; struct ceph_mds_reply_xattr xattr_info; From patchwork Wed Apr 27 19:12:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566777 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D6E50C433F5 for ; Wed, 27 Apr 2022 19:24:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232878AbiD0T1w (ORCPT ); Wed, 27 Apr 2022 15:27:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49974 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233055AbiD0TTO (ORCPT ); Wed, 27 Apr 2022 15:19:14 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 54D2C88B21 for ; Wed, 27 Apr 2022 12:13:32 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id C1844619FC for ; Wed, 27 Apr 2022 19:13:31 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id B6D06C385A9; Wed, 27 Apr 2022 19:13:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086811; bh=qOEGTJqEi5OzEQxCLc9HPDIXtciPfSXsM3lwZ4YNNbM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=IcZfvWHNSfSiNPx82aZRh5+4QTAbfHOHngyAEz/X6I53Qt8wvYtlsP21m8bxjUFoQ 6M7XKvjqDVAoCd3KUwRqGLJEqNVrjrErYWluMcaOKYEAJUiMAdEf+sNeIACzeZmFja Tt18gX72bUOmdy3URBW0YHg8k0eGCw+VawzDSP8pKTHf/s9QE5zIssrxByLYyY5JBB 9bQXnRdTwvBMyXXicMtr5lM4IJGg8Ml2QeGlDWXCu7jpGVLKRDkg39GlIP++1elYdE jiS6cW95ReIpf4niQRaiyZto58ejdPhCFaCNwWvd1jxMIr9H/VZuU8ajkhjTl1PcmJ 5hxC3BtnuHMNw== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 20/64] ceph: add base64 endcoding routines for encrypted names Date: Wed, 27 Apr 2022 15:12:30 -0400 Message-Id: <20220427191314.222867-21-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Luís Henriques The base64url encoding used by fscrypt includes the '_' character, which may cause problems in snapshot names (if the name starts with '_'). Thus, use the base64 encoding defined for IMAP mailbox names (RFC 3501), which uses '+' and ',' instead of '-' and '_'. Reviewed-by: Xiubo Li Signed-off-by: Luís Henriques Signed-off-by: Jeff Layton --- fs/ceph/crypto.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++ fs/ceph/crypto.h | 32 ++++++++++++++++++++++++++ 2 files changed, 92 insertions(+) diff --git a/fs/ceph/crypto.c b/fs/ceph/crypto.c index 1c34b8ed1266..fffbd47d9e43 100644 --- a/fs/ceph/crypto.c +++ b/fs/ceph/crypto.c @@ -1,4 +1,11 @@ // SPDX-License-Identifier: GPL-2.0 +/* + * The base64 encode/decode code was copied from fscrypt: + * Copyright (C) 2015, Google, Inc. + * Copyright (C) 2015, Motorola Mobility + * Written by Uday Savagaonkar, 2014. + * Modified by Jaegeuk Kim, 2015. + */ #include #include #include @@ -7,6 +14,59 @@ #include "mds_client.h" #include "crypto.h" +/* + * The base64url encoding used by fscrypt includes the '_' character, which may + * cause problems in snapshot names (which can not starts with '_'). Thus, we + * used the base64 encoding defined for IMAP mailbox names (RFC 3501) instead, + * which replaces '-' and '_' by '+' and ','. + */ +static const char base64_table[65] = + "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+,"; + +int ceph_base64_encode(const u8 *src, int srclen, char *dst) +{ + u32 ac = 0; + int bits = 0; + int i; + char *cp = dst; + + for (i = 0; i < srclen; i++) { + ac = (ac << 8) | src[i]; + bits += 8; + do { + bits -= 6; + *cp++ = base64_table[(ac >> bits) & 0x3f]; + } while (bits >= 6); + } + if (bits) + *cp++ = base64_table[(ac << (6 - bits)) & 0x3f]; + return cp - dst; +} + +int ceph_base64_decode(const char *src, int srclen, u8 *dst) +{ + u32 ac = 0; + int bits = 0; + int i; + u8 *bp = dst; + + for (i = 0; i < srclen; i++) { + const char *p = strchr(base64_table, src[i]); + + if (p == NULL || src[i] == 0) + return -1; + ac = (ac << 6) | (p - base64_table); + bits += 6; + if (bits >= 8) { + bits -= 8; + *bp++ = (u8)(ac >> bits); + } + } + if (ac & ((1 << bits) - 1)) + return -1; + return bp - dst; +} + static int ceph_crypt_get_context(struct inode *inode, void *ctx, size_t len) { struct ceph_inode_info *ci = ceph_inode(inode); diff --git a/fs/ceph/crypto.h b/fs/ceph/crypto.h index cb00fe42d5b7..f5d38d8a1995 100644 --- a/fs/ceph/crypto.h +++ b/fs/ceph/crypto.h @@ -27,6 +27,38 @@ static inline u32 ceph_fscrypt_auth_len(struct ceph_fscrypt_auth *fa) } #ifdef CONFIG_FS_ENCRYPTION +/* + * We want to encrypt filenames when creating them, but the encrypted + * versions of those names may have illegal characters in them. To mitigate + * that, we base64 encode them, but that gives us a result that can exceed + * NAME_MAX. + * + * Follow a similar scheme to fscrypt itself, and cap the filename to a + * smaller size. If the ciphertext name is longer than the value below, then + * sha256 hash the remaining bytes. + * + * For the fscrypt_nokey_name struct the dirhash[2] member is useless in ceph + * so the corresponding struct will be: + * + * struct fscrypt_ceph_nokey_name { + * u8 bytes[157]; + * u8 sha256[SHA256_DIGEST_SIZE]; + * }; // 180 bytes => 240 bytes base64-encoded, which is <= NAME_MAX (255) + * + * (240 bytes is the maximum size allowed for snapshot names to take into + * account the format: '__'.) + * + * Note that for long names that end up having their tail portion hashed, we + * must also store the full encrypted name (in the dentry's alternate_name + * field). + */ +#define CEPH_NOHASH_NAME_MAX (180 - SHA256_DIGEST_SIZE) + +#define CEPH_BASE64_CHARS(nbytes) DIV_ROUND_UP((nbytes) * 4, 3) + +int ceph_base64_encode(const u8 *src, int srclen, char *dst); +int ceph_base64_decode(const char *src, int srclen, u8 *dst); + void ceph_fscrypt_set_ops(struct super_block *sb); void ceph_fscrypt_free_dummy_policy(struct ceph_fs_client *fsc); From patchwork Wed Apr 27 19:12:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566776 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E7337C433EF for ; Wed, 27 Apr 2022 19:24:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232732AbiD0T1z (ORCPT ); Wed, 27 Apr 2022 15:27:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38482 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233113AbiD0TTO (ORCPT ); Wed, 27 Apr 2022 15:19:14 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7D45989322 for ; Wed, 27 Apr 2022 12:13:34 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 30280B8291B for ; Wed, 27 Apr 2022 19:13:33 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6C19AC385A7; Wed, 27 Apr 2022 19:13:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086811; bh=aZCBWdHBzzGojB76XlrwgBk/bZ2c4RnpxObaB8Tu1TY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=TjyW6o7zU+NQ+VeOBOvNQm+UmWVgwk4ZNEXhbUz+uZ1RCT/I18qKCgYwPeCmBh8/i tW4t6+RZ/9RbnjusremhRLzmUmahNKMpcheHQNf5F6Px+c6m3k1eZZn2wIQXNqu12G tuzxw1184Xhhe7Wjs648BWTdEytQI0NieIMl5BTGXt/3GtQBXf+iSvl9Zm/qhx+0Ge vfh226sM/OYm2nC3kSG1Po8G71zHowL157JDcma/G4fcVutuUOMYUA1Ei1x6FvTlEN cm3IjJiU9m7E7S6EYTt83lIMozjmCKEKGksXnoJOQVerg/yLbmYyDX+k7mcQrcCOs7 xSw4oxf9EGIHQ== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 21/64] ceph: add encrypted fname handling to ceph_mdsc_build_path Date: Wed, 27 Apr 2022 15:12:31 -0400 Message-Id: <20220427191314.222867-22-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Allow ceph_mdsc_build_path to encrypt and base64 encode the filename when the parent is encrypted and we're sending the path to the MDS. In most cases, we just encrypt the filenames and base64 encode them, but when the name is longer than CEPH_NOHASH_NAME_MAX, we use a similar scheme to fscrypt proper, and hash the remaning bits with sha256. When doing this, we then send along the full crypttext of the name in the new alternate_name field of the MClientRequest. The MDS can then send that along in readdir responses and traces. Signed-off-by: Jeff Layton --- fs/ceph/crypto.c | 47 ++++++++++++++++++++++++++ fs/ceph/crypto.h | 8 +++++ fs/ceph/mds_client.c | 80 ++++++++++++++++++++++++++++++++++---------- 3 files changed, 117 insertions(+), 18 deletions(-) diff --git a/fs/ceph/crypto.c b/fs/ceph/crypto.c index fffbd47d9e43..921e62f860c4 100644 --- a/fs/ceph/crypto.c +++ b/fs/ceph/crypto.c @@ -187,3 +187,50 @@ void ceph_fscrypt_as_ctx_to_req(struct ceph_mds_request *req, struct ceph_acl_se { swap(req->r_fscrypt_auth, as->fscrypt_auth); } + +int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentry, char *buf) +{ + u32 len; + int elen; + int ret; + u8 *cryptbuf; + + WARN_ON_ONCE(!fscrypt_has_encryption_key(parent)); + + /* + * Convert cleartext d_name to ciphertext. If result is longer than + * CEPH_NOHASH_NAME_MAX, sha256 the remaining bytes + * + * See: fscrypt_setup_filename + */ + if (!fscrypt_fname_encrypted_size(parent, dentry->d_name.len, NAME_MAX, &len)) + return -ENAMETOOLONG; + + /* Allocate a buffer appropriate to hold the result */ + cryptbuf = kmalloc(len > CEPH_NOHASH_NAME_MAX ? NAME_MAX : len, GFP_KERNEL); + if (!cryptbuf) + return -ENOMEM; + + ret = fscrypt_fname_encrypt(parent, &dentry->d_name, cryptbuf, len); + if (ret) { + kfree(cryptbuf); + return ret; + } + + /* hash the end if the name is long enough */ + if (len > CEPH_NOHASH_NAME_MAX) { + u8 hash[SHA256_DIGEST_SIZE]; + u8 *extra = cryptbuf + CEPH_NOHASH_NAME_MAX; + + /* hash the extra bytes and overwrite crypttext beyond that point with it */ + sha256(extra, len - CEPH_NOHASH_NAME_MAX, hash); + memcpy(extra, hash, SHA256_DIGEST_SIZE); + len = CEPH_NOHASH_NAME_MAX + SHA256_DIGEST_SIZE; + } + + /* base64 encode the encrypted name */ + elen = ceph_base64_encode(cryptbuf, len, buf); + kfree(cryptbuf); + dout("base64-encoded ciphertext name = %.*s\n", elen, buf); + return elen; +} diff --git a/fs/ceph/crypto.h b/fs/ceph/crypto.h index f5d38d8a1995..8abf10604722 100644 --- a/fs/ceph/crypto.h +++ b/fs/ceph/crypto.h @@ -6,6 +6,7 @@ #ifndef _CEPH_CRYPTO_H #define _CEPH_CRYPTO_H +#include #include struct ceph_fs_client; @@ -66,6 +67,7 @@ void ceph_fscrypt_free_dummy_policy(struct ceph_fs_client *fsc); int ceph_fscrypt_prepare_context(struct inode *dir, struct inode *inode, struct ceph_acl_sec_ctx *as); void ceph_fscrypt_as_ctx_to_req(struct ceph_mds_request *req, struct ceph_acl_sec_ctx *as); +int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentry, char *buf); #else /* CONFIG_FS_ENCRYPTION */ @@ -89,6 +91,12 @@ static inline void ceph_fscrypt_as_ctx_to_req(struct ceph_mds_request *req, struct ceph_acl_sec_ctx *as_ctx) { } + +static inline int ceph_encode_encrypted_fname(const struct inode *parent, + struct dentry *dentry, char *buf) +{ + return -EOPNOTSUPP; +} #endif /* CONFIG_FS_ENCRYPTION */ #endif diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 5c7c89a1343b..e31a5bec9afc 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -14,6 +14,7 @@ #include #include "super.h" +#include "crypto.h" #include "mds_client.h" #include "crypto.h" @@ -2385,18 +2386,27 @@ static inline u64 __get_oldest_tid(struct ceph_mds_client *mdsc) return mdsc->oldest_tid; } -/* - * Build a dentry's path. Allocate on heap; caller must kfree. Based - * on build_path_from_dentry in fs/cifs/dir.c. +/** + * ceph_mdsc_build_path - build a path string to a given dentry + * @dentry: dentry to which path should be built + * @plen: returned length of string + * @pbase: returned base inode number + * @for_wire: is this path going to be sent to the MDS? + * + * Build a string that represents the path to the dentry. This is mostly called + * for two different purposes: * - * If @stop_on_nosnap, generate path relative to the first non-snapped - * inode. + * 1) we need to build a path string to send to the MDS (for_wire == true) + * 2) we need a path string for local presentation (e.g. debugfs) (for_wire == false) + * + * The path is built in reverse, starting with the dentry. Walk back up toward + * the root, building the path until the first non-snapped inode is reached (for_wire) + * or the root inode is reached (!for_wire). * * Encode hidden .snap dirs as a double /, i.e. * foo/.snap/bar -> foo//bar */ -char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *pbase, - int stop_on_nosnap) +char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *pbase, int for_wire) { struct dentry *cur; struct inode *inode; @@ -2418,30 +2428,65 @@ char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *pbase, seq = read_seqbegin(&rename_lock); cur = dget(dentry); for (;;) { - struct dentry *temp; + struct dentry *parent; spin_lock(&cur->d_lock); inode = d_inode(cur); if (inode && ceph_snap(inode) == CEPH_SNAPDIR) { dout("build_path path+%d: %p SNAPDIR\n", pos, cur); - } else if (stop_on_nosnap && inode && dentry != cur && - ceph_snap(inode) == CEPH_NOSNAP) { + spin_unlock(&cur->d_lock); + parent = dget_parent(cur); + } else if (for_wire && inode && dentry != cur && ceph_snap(inode) == CEPH_NOSNAP) { spin_unlock(&cur->d_lock); pos++; /* get rid of any prepended '/' */ break; - } else { + } else if (!for_wire || !IS_ENCRYPTED(d_inode(cur->d_parent))) { pos -= cur->d_name.len; if (pos < 0) { spin_unlock(&cur->d_lock); break; } memcpy(path + pos, cur->d_name.name, cur->d_name.len); + spin_unlock(&cur->d_lock); + parent = dget_parent(cur); + } else { + int len, ret; + char buf[NAME_MAX]; + + /* + * Proactively copy name into buf, in case we need to present + * it as-is. + */ + memcpy(buf, cur->d_name.name, cur->d_name.len); + len = cur->d_name.len; + spin_unlock(&cur->d_lock); + parent = dget_parent(cur); + + ret = __fscrypt_prepare_readdir(d_inode(parent)); + if (ret < 0) { + dput(parent); + dput(cur); + return ERR_PTR(ret); + } + + if (fscrypt_has_encryption_key(d_inode(parent))) { + len = ceph_encode_encrypted_fname(d_inode(parent), cur, buf); + if (len < 0) { + dput(parent); + dput(cur); + return ERR_PTR(len); + } + } + pos -= len; + if (pos < 0) { + dput(parent); + break; + } + memcpy(path + pos, buf, len); } - temp = cur; - spin_unlock(&temp->d_lock); - cur = dget_parent(temp); - dput(temp); + dput(cur); + cur = parent; /* Are we at the root? */ if (IS_ROOT(cur)) @@ -2465,8 +2510,7 @@ char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *pbase, * A rename didn't occur, but somehow we didn't end up where * we thought we would. Throw a warning and try again. */ - pr_warn("build_path did not end path lookup where " - "expected, pos is %d\n", pos); + pr_warn("build_path did not end path lookup where expected (pos = %d)\n", pos); goto retry; } @@ -2486,7 +2530,7 @@ static int build_dentry_path(struct dentry *dentry, struct inode *dir, rcu_read_lock(); if (!dir) dir = d_inode_rcu(dentry->d_parent); - if (dir && parent_locked && ceph_snap(dir) == CEPH_NOSNAP) { + if (dir && parent_locked && ceph_snap(dir) == CEPH_NOSNAP && !IS_ENCRYPTED(dir)) { *pino = ceph_ino(dir); rcu_read_unlock(); *ppath = dentry->d_name.name; From patchwork Wed Apr 27 19:12:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566798 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 49257C433FE for ; Wed, 27 Apr 2022 19:16:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232990AbiD0TTx (ORCPT ); Wed, 27 Apr 2022 15:19:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49688 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230212AbiD0TTP (ORCPT ); Wed, 27 Apr 2022 15:19:15 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0AC5689333 for ; Wed, 27 Apr 2022 12:13:35 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 98AFFB8291D for ; Wed, 27 Apr 2022 19:13:34 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id CBF77C385AF; Wed, 27 Apr 2022 19:13:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086813; bh=4tZ05g/FJhXyVnjkVAE3L9bGcTJl59SqdTzNrr3EHzg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=mpcO2lPmgStkMnDADQZbW94zvD3A1hgXf1+HOQiWLxEyqJSCiRi1lTZ8irk/yujub AA0ndrGPF1wyjtEi3N9Ak7+vDOGS18G5OxcfXgjmPihye2imM0cCZAqd1MkbFiHr33 Jq8LSP7YotCTtvz04hwEa2+c6HVQ4ZVBqU7CqlMOd9XBWRs7O4eRKOZdlmXbOcXD4d KdF9KKJAB5PmuvjSpNMGM6qTjzo7h1LrBSLWqDHFBuOciJXil0VS0cBLAO73MTbBlk ikB/Axcp8zTxXsBLYHlkG+uXlm9r5hw8Hwb/Ioc4FNfg9KDjB7AADw0W7zSLpdGzUa CVIS2h39ldANw== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 23/64] ceph: encode encrypted name in dentry release Date: Wed, 27 Apr 2022 15:12:33 -0400 Message-Id: <20220427191314.222867-24-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Encode encrypted dentry names when sending a dentry release request. Also add a more helpful comment over ceph_encode_dentry_release. Signed-off-by: Jeff Layton --- fs/ceph/caps.c | 32 ++++++++++++++++++++++++++++---- fs/ceph/mds_client.c | 20 ++++++++++++++++---- 2 files changed, 44 insertions(+), 8 deletions(-) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index c49053e36112..98230fd58b07 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -4622,6 +4622,18 @@ int ceph_encode_inode_release(void **p, struct inode *inode, return ret; } +/** + * ceph_encode_dentry_release - encode a dentry release into an outgoing request + * @p: outgoing request buffer + * @dentry: dentry to release + * @dir: dir to release it from + * @mds: mds that we're speaking to + * @drop: caps being dropped + * @unless: unless we have these caps + * + * Encode a dentry release into an outgoing request buffer. Returns 1 if the + * thing was released, or a negative error code otherwise. + */ int ceph_encode_dentry_release(void **p, struct dentry *dentry, struct inode *dir, int mds, int drop, int unless) @@ -4654,13 +4666,25 @@ int ceph_encode_dentry_release(void **p, struct dentry *dentry, if (ret && di->lease_session && di->lease_session->s_mds == mds) { dout("encode_dentry_release %p mds%d seq %d\n", dentry, mds, (int)di->lease_seq); - rel->dname_len = cpu_to_le32(dentry->d_name.len); - memcpy(*p, dentry->d_name.name, dentry->d_name.len); - *p += dentry->d_name.len; rel->dname_seq = cpu_to_le32(di->lease_seq); __ceph_mdsc_drop_dentry_lease(dentry); + spin_unlock(&dentry->d_lock); + if (IS_ENCRYPTED(dir) && fscrypt_has_encryption_key(dir)) { + int ret2 = ceph_encode_encrypted_fname(dir, dentry, *p); + + if (ret2 < 0) + return ret2; + + rel->dname_len = cpu_to_le32(ret2); + *p += ret2; + } else { + rel->dname_len = cpu_to_le32(dentry->d_name.len); + memcpy(*p, dentry->d_name.name, dentry->d_name.len); + *p += dentry->d_name.len; + } + } else { + spin_unlock(&dentry->d_lock); } - spin_unlock(&dentry->d_lock); return ret; } diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index cc3c507c03eb..af54a6164bba 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -2819,15 +2819,23 @@ static struct ceph_msg *create_request_message(struct ceph_mds_session *session, req->r_inode ? req->r_inode : d_inode(req->r_dentry), mds, req->r_inode_drop, req->r_inode_unless, req->r_op == CEPH_MDS_OP_READDIR); - if (req->r_dentry_drop) - releases += ceph_encode_dentry_release(&p, req->r_dentry, + if (req->r_dentry_drop) { + ret = ceph_encode_dentry_release(&p, req->r_dentry, req->r_parent, mds, req->r_dentry_drop, req->r_dentry_unless); - if (req->r_old_dentry_drop) - releases += ceph_encode_dentry_release(&p, req->r_old_dentry, + if (ret < 0) + goto out_err; + releases += ret; + } + if (req->r_old_dentry_drop) { + ret = ceph_encode_dentry_release(&p, req->r_old_dentry, req->r_old_dentry_dir, mds, req->r_old_dentry_drop, req->r_old_dentry_unless); + if (ret < 0) + goto out_err; + releases += ret; + } if (req->r_old_inode_drop) releases += ceph_encode_inode_release(&p, d_inode(req->r_old_dentry), @@ -2869,6 +2877,10 @@ static struct ceph_msg *create_request_message(struct ceph_mds_session *session, ceph_mdsc_free_path((char *)path1, pathlen1); out: return msg; +out_err: + ceph_msg_put(msg); + msg = ERR_PTR(ret); + goto out_free2; } /* From patchwork Wed Apr 27 19:12:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566775 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CEBCFC433EF for ; Wed, 27 Apr 2022 19:24:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233056AbiD0T17 (ORCPT ); Wed, 27 Apr 2022 15:27:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47462 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233127AbiD0TTO (ORCPT ); Wed, 27 Apr 2022 15:19:14 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EA90E2F392 for ; Wed, 27 Apr 2022 12:13:34 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 8723D619FC for ; Wed, 27 Apr 2022 19:13:34 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 81530C385A9; Wed, 27 Apr 2022 19:13:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086814; bh=40bz7fgcSkfmSJmWJDKxd9JZxineBbUvX30XOIb7qbk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=dOWgDtv0+nQoiWl8y/5F2vwnlkkxU6B/29IonJe4l1yS6C+u6rokdRdI8avgo+BRO fJ7xrcOuheBR9SZ0ZuBgrRPajyHmng3TU3hLmojFe14m2m8ILeIfmN4W4QAzMd35gi h+jWhsMCRQfcFQkoOikX+CaU8487SYXiXj9Iu0mIIJBboatQkJ9wKc4IHHilHvNf7y hmrhcXYuLtCjDiMhdTl4I6GF6r+XZao1p8y6I8iSWBKjSmsGcn7NeUQugLqzrskslm +aMg5+hTR1bVFbReZxrKK8hb77qPnh15y7q7V+xgFLR2IsCZpCFJ70boMPRsjTIUJT 2tcgLHhQIp4aQ== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 24/64] ceph: properly set DCACHE_NOKEY_NAME flag in lookup Date: Wed, 27 Apr 2022 15:12:34 -0400 Message-Id: <20220427191314.222867-25-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org This is required so that we know to invalidate these dentries when the directory is unlocked. Signed-off-by: Jeff Layton --- fs/ceph/dir.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c index 8cc7a49ee508..897f8618151b 100644 --- a/fs/ceph/dir.c +++ b/fs/ceph/dir.c @@ -760,6 +760,17 @@ static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry, if (dentry->d_name.len > NAME_MAX) return ERR_PTR(-ENAMETOOLONG); + if (IS_ENCRYPTED(dir)) { + err = __fscrypt_prepare_readdir(dir); + if (err) + return ERR_PTR(err); + if (!fscrypt_has_encryption_key(dir)) { + spin_lock(&dentry->d_lock); + dentry->d_flags |= DCACHE_NOKEY_NAME; + spin_unlock(&dentry->d_lock); + } + } + /* can we conclude ENOENT locally? */ if (d_really_is_negative(dentry)) { struct ceph_inode_info *ci = ceph_inode(dir); From patchwork Wed Apr 27 19:12:35 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566791 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C24A2C433F5 for ; Wed, 27 Apr 2022 19:17:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230193AbiD0TU1 (ORCPT ); Wed, 27 Apr 2022 15:20:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51664 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233141AbiD0TTP (ORCPT ); Wed, 27 Apr 2022 15:19:15 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 444AA89CC9 for ; Wed, 27 Apr 2022 12:13:35 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id D077D619FE for ; Wed, 27 Apr 2022 19:13:34 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 372AAC385AE; Wed, 27 Apr 2022 19:13:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086814; bh=U24PxxvQ6v89QmCZ/efk298RUKYN48wbTknFX6iZrs0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=OK4AiQp58XCNd7IZouSyoTZXpOhHCJWEC3jIGZWk0e3zUSei9T8BBKK1w8MpkqeVi rCSkBIIkrmWwgoSbTK0a64u3RApE0QOF+JGS/8jVr2JM0JpE7tVkbcDgfOS/z/zJXm ajGmUdke2nTBqGs1tfHwCkH6JHhWAITB+I6vjdBSwsbSZPXCR6LLYW7YIfX+iy9Ysy YPspRk/tVk0okBaogXixj9CakNbGsCZovOxpAyVfZNAJtmYzt+3r9cGCDNdNILQ+pI s+Ajzc/5cMMcrV/vV+k003V4CIhSXaXulsCnChyaUPCqqv1JYadazoT+frfFLOrSpl 0vBALL+0t9I+A== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 25/64] ceph: set DCACHE_NOKEY_NAME in atomic open Date: Wed, 27 Apr 2022 15:12:35 -0400 Message-Id: <20220427191314.222867-26-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Atomic open can act as a lookup if handed a dentry that is negative on the MDS. Ensure that we set DCACHE_NOKEY_NAME on the dentry in atomic_open, if we don't have the key for the parent. Otherwise, we can end up validating the dentry inappropriately if someone later adds a key. Reviewed-by: Xiubo Li Reviewed-by: Luís Henriques Signed-off-by: Jeff Layton --- fs/ceph/file.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/fs/ceph/file.c b/fs/ceph/file.c index 391de38fa489..a519218ef942 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -768,6 +768,13 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry, req->r_args.open.mask = cpu_to_le32(mask); req->r_parent = dir; ihold(dir); + if (IS_ENCRYPTED(dir)) { + if (!fscrypt_has_encryption_key(dir)) { + spin_lock(&dentry->d_lock); + dentry->d_flags |= DCACHE_NOKEY_NAME; + spin_unlock(&dentry->d_lock); + } + } if (flags & O_CREAT) { struct ceph_file_layout lo; From patchwork Wed Apr 27 19:12:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566797 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F3BC3C433F5 for ; Wed, 27 Apr 2022 19:16:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232871AbiD0TTy (ORCPT ); Wed, 27 Apr 2022 15:19:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41584 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233259AbiD0TTQ (ORCPT ); Wed, 27 Apr 2022 15:19:16 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B7ADE369CA for ; Wed, 27 Apr 2022 12:13:38 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 6CDE7B82929 for ; Wed, 27 Apr 2022 19:13:37 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9707AC385A9; Wed, 27 Apr 2022 19:13:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086816; bh=t8pc+tEK8OUvwug35XJRJGMtLe5QP5Z9BQBlzDDEy+g=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Du6e5IV0RBd76OTea7n4RwAJkEw+BFtJyqwRfXMIuXonjzVXh5FZfpUDrtCH83LRw q8C10BITAhGXrn0mrGjj179dp+HHpc+AfqpU3/+dsY1UTZmNRlB04YtUijxEe/Imp3 vMD2BXEFyi1k0l8bItxnW3e+QKQIfwcihJj8FBe6NaT49lkHlNTsfgrNTo7tMgOgpQ EH4+b4D4FRT6iyZQdK0FIkJ2k6WdxWC0AjhD4IsWy/3riWqXfDGwCf/Avd1HV79Aqd sdOTK9RFjtJYuuITYSjeSJbnFzYtxixheprTyoPQossZYxGb9Lxdjz+Od1kGu+1h0T C6wAVknz602uw== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 27/64] ceph: add helpers for converting names for userland presentation Date: Wed, 27 Apr 2022 15:12:37 -0400 Message-Id: <20220427191314.222867-28-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Define a new ceph_fname struct that we can use to carry information about encrypted dentry names. Add helpers for working with these objects, including ceph_fname_to_usr which formats an encrypted filename for userland presentation. Signed-off-by: Jeff Layton --- fs/ceph/crypto.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++ fs/ceph/crypto.h | 41 ++++++++++++++++++++++++++ 2 files changed, 117 insertions(+) diff --git a/fs/ceph/crypto.c b/fs/ceph/crypto.c index 921e62f860c4..7e1f66b35095 100644 --- a/fs/ceph/crypto.c +++ b/fs/ceph/crypto.c @@ -234,3 +234,79 @@ int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentr dout("base64-encoded ciphertext name = %.*s\n", elen, buf); return elen; } + +/** + * ceph_fname_to_usr - convert a filename for userland presentation + * @fname: ceph_fname to be converted + * @tname: temporary name buffer to use for conversion (may be NULL) + * @oname: where converted name should be placed + * @is_nokey: set to true if key wasn't available during conversion (may be NULL) + * + * Given a filename (usually from the MDS), format it for presentation to + * userland. If @parent is not encrypted, just pass it back as-is. + * + * Otherwise, base64 decode the string, and then ask fscrypt to format it + * for userland presentation. + * + * Returns 0 on success or negative error code on error. + */ +int ceph_fname_to_usr(const struct ceph_fname *fname, struct fscrypt_str *tname, + struct fscrypt_str *oname, bool *is_nokey) +{ + int ret; + struct fscrypt_str _tname = FSTR_INIT(NULL, 0); + struct fscrypt_str iname; + + if (!IS_ENCRYPTED(fname->dir)) { + oname->name = fname->name; + oname->len = fname->name_len; + return 0; + } + + /* Sanity check that the resulting name will fit in the buffer */ + if (fname->name_len > FSCRYPT_BASE64URL_CHARS(NAME_MAX)) + return -EIO; + + ret = __fscrypt_prepare_readdir(fname->dir); + if (ret) + return ret; + + /* + * Use the raw dentry name as sent by the MDS instead of + * generating a nokey name via fscrypt. + */ + if (!fscrypt_has_encryption_key(fname->dir)) { + memcpy(oname->name, fname->name, fname->name_len); + oname->len = fname->name_len; + if (is_nokey) + *is_nokey = true; + return 0; + } + + if (fname->ctext_len == 0) { + int declen; + + if (!tname) { + ret = fscrypt_fname_alloc_buffer(NAME_MAX, &_tname); + if (ret) + return ret; + tname = &_tname; + } + + declen = fscrypt_base64url_decode(fname->name, fname->name_len, tname->name); + if (declen <= 0) { + ret = -EIO; + goto out; + } + iname.name = tname->name; + iname.len = declen; + } else { + iname.name = fname->ctext; + iname.len = fname->ctext_len; + } + + ret = fscrypt_fname_disk_to_usr(fname->dir, 0, 0, &iname, oname); +out: + fscrypt_fname_free_buffer(&_tname); + return ret; +} diff --git a/fs/ceph/crypto.h b/fs/ceph/crypto.h index 8abf10604722..f231d6e3ba0f 100644 --- a/fs/ceph/crypto.h +++ b/fs/ceph/crypto.h @@ -13,6 +13,14 @@ struct ceph_fs_client; struct ceph_acl_sec_ctx; struct ceph_mds_request; +struct ceph_fname { + struct inode *dir; + char *name; // b64 encoded, possibly hashed + unsigned char *ctext; // binary crypttext (if any) + u32 name_len; // length of name buffer + u32 ctext_len; // length of crypttext +}; + struct ceph_fscrypt_auth { __le32 cfa_version; __le32 cfa_blob_len; @@ -69,6 +77,22 @@ int ceph_fscrypt_prepare_context(struct inode *dir, struct inode *inode, void ceph_fscrypt_as_ctx_to_req(struct ceph_mds_request *req, struct ceph_acl_sec_ctx *as); int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentry, char *buf); +static inline int ceph_fname_alloc_buffer(struct inode *parent, struct fscrypt_str *fname) +{ + if (!IS_ENCRYPTED(parent)) + return 0; + return fscrypt_fname_alloc_buffer(NAME_MAX, fname); +} + +static inline void ceph_fname_free_buffer(struct inode *parent, struct fscrypt_str *fname) +{ + if (IS_ENCRYPTED(parent)) + fscrypt_fname_free_buffer(fname); +} + +int ceph_fname_to_usr(const struct ceph_fname *fname, struct fscrypt_str *tname, + struct fscrypt_str *oname, bool *is_nokey); + #else /* CONFIG_FS_ENCRYPTION */ static inline void ceph_fscrypt_set_ops(struct super_block *sb) @@ -97,6 +121,23 @@ static inline int ceph_encode_encrypted_fname(const struct inode *parent, { return -EOPNOTSUPP; } + +static inline int ceph_fname_alloc_buffer(struct inode *parent, struct fscrypt_str *fname) +{ + return 0; +} + +static inline void ceph_fname_free_buffer(struct inode *parent, struct fscrypt_str *fname) +{ +} + +static inline int ceph_fname_to_usr(const struct ceph_fname *fname, struct fscrypt_str *tname, + struct fscrypt_str *oname, bool *is_nokey) +{ + oname->name = fname->name; + oname->len = fname->name_len; + return 0; +} #endif /* CONFIG_FS_ENCRYPTION */ #endif From patchwork Wed Apr 27 19:12:38 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566792 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4BBC5C433EF for ; Wed, 27 Apr 2022 19:17:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233903AbiD0TUX (ORCPT ); Wed, 27 Apr 2022 15:20:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38472 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233157AbiD0TTP (ORCPT ); Wed, 27 Apr 2022 15:19:15 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DD3BC35275 for ; Wed, 27 Apr 2022 12:13:37 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 4F986619E1 for ; Wed, 27 Apr 2022 19:13:37 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4CBF8C385AE; Wed, 27 Apr 2022 19:13:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086816; bh=FZyYLQc6AH9ugDOKqB0hiuOEGVhGs2E/B8b2nLDlupk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ruyj7nWVskA1eizEX70ZoXCZBQq+4Y5ltdfsm4ll1Ehr/gwAJvFLw/Sxw+cC5Kxrw IowCqDhqvuEd0AMzRU80NMT1uakDDQ74qQGoH4u0XSZYNmSOCgV+PWNr9QMte1KYH7 QtFwVsxy1xWyhja8supUrGmLL2x7wjbDqiF1OMs/MSboxsR+I/kHu3SAPnCKGU7/1z 1MPWfewpR+4IV1iePUQ15mCltvfdw50C8Hcx7J9sz1GH9xDnn4k+CK8hM7uLHkF/Gu z/ABxOxkhhn6VPAfiUPvquHKdQLED5JGvQNxecxsnb2GZDOK8DaKVudKwTSKdTFrpN qnH2C7l0+CSow== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 28/64] ceph: fix base64 encoded name's length check in ceph_fname_to_usr() Date: Wed, 27 Apr 2022 15:12:38 -0400 Message-Id: <20220427191314.222867-29-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li The fname->name is based64_encoded names and the max long shouldn't exceed the NAME_MAX. The FSCRYPT_BASE64URL_CHARS(NAME_MAX) will be 255 * 4 / 3. Signed-off-by: Xiubo Li Signed-off-by: Jeff Layton --- fs/ceph/crypto.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/ceph/crypto.c b/fs/ceph/crypto.c index 7e1f66b35095..6fb2fd1a8af0 100644 --- a/fs/ceph/crypto.c +++ b/fs/ceph/crypto.c @@ -264,7 +264,7 @@ int ceph_fname_to_usr(const struct ceph_fname *fname, struct fscrypt_str *tname, } /* Sanity check that the resulting name will fit in the buffer */ - if (fname->name_len > FSCRYPT_BASE64URL_CHARS(NAME_MAX)) + if (fname->name_len > NAME_MAX || fname->ctext_len > NAME_MAX) return -EIO; ret = __fscrypt_prepare_readdir(fname->dir); From patchwork Wed Apr 27 19:12:42 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566794 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B3409C4332F for ; Wed, 27 Apr 2022 19:17:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229516AbiD0TUN (ORCPT ); Wed, 27 Apr 2022 15:20:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53286 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233399AbiD0TTR (ORCPT ); Wed, 27 Apr 2022 15:19:17 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 962283878C for ; Wed, 27 Apr 2022 12:13:40 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 24850619E1 for ; Wed, 27 Apr 2022 19:13:40 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 196FCC385AE; Wed, 27 Apr 2022 19:13:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086819; bh=LuGiWEBS6dOrw5jEKYcRGI7O8GPpWfvTC83a1pvBsSo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Ej1+c+NY5C3/DJi7Px3Ak/N59957+9yUZ5hrRK5nsGrIkUKfbW0afVksIxCWqng70 oVQDVuR6DjF2GoEGhBTWyG4wj3iex27pb6o0PvP/pPQTYu6s0HOCGP5cpd/Yu7vF8z y1QiJvrxxL9GQf5AZW1Ei1KGMU97YdsGaQt3Q23vhb9KjtL58xCh+5Og87VLrPn41J 1ofT8i1hB0Beo+JAZtYIH6Y/FpZblBa+EPUJrj+Y+GtBheWwJNikqXhFTANUNwGuO8 UEzfiVjSy2SHGOxpHxLVoNQRQwyivDgM68VUV6VD6YlXF7tLxntb+KMxcMBhzeWsC2 NF76mN8qfuWWw== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 32/64] ceph: add support to readdir for encrypted filenames Date: Wed, 27 Apr 2022 15:12:42 -0400 Message-Id: <20220427191314.222867-33-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li Once we've decrypted the names in a readdir reply, we no longer need the crypttext, so overwrite them in ceph_mds_reply_dir_entry with the unencrypted names. Then in both ceph_readdir_prepopulate() and ceph_readdir() we will use the dencrypted name directly. [ jlayton: convert some BUG_ONs into error returns ] Signed-off-by: Xiubo Li Signed-off-by: Jeff Layton --- fs/ceph/crypto.c | 12 +++++-- fs/ceph/crypto.h | 1 + fs/ceph/dir.c | 35 +++++++++++++++---- fs/ceph/inode.c | 12 ++++--- fs/ceph/mds_client.c | 81 ++++++++++++++++++++++++++++++++++++++++---- fs/ceph/mds_client.h | 4 +-- 6 files changed, 124 insertions(+), 21 deletions(-) diff --git a/fs/ceph/crypto.c b/fs/ceph/crypto.c index 05498859eef5..af453c57866d 100644 --- a/fs/ceph/crypto.c +++ b/fs/ceph/crypto.c @@ -195,7 +195,10 @@ int ceph_encode_encrypted_dname(const struct inode *parent, struct qstr *d_name, int ret; u8 *cryptbuf; - WARN_ON_ONCE(!fscrypt_has_encryption_key(parent)); + if (!fscrypt_has_encryption_key(parent)) { + memcpy(buf, d_name->name, d_name->len); + return d_name->len; + } /* * Convert cleartext d_name to ciphertext. If result is longer than @@ -237,6 +240,8 @@ int ceph_encode_encrypted_dname(const struct inode *parent, struct qstr *d_name, int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentry, char *buf) { + WARN_ON_ONCE(!fscrypt_has_encryption_key(parent)); + return ceph_encode_encrypted_dname(parent, &dentry->d_name, buf); } @@ -281,7 +286,10 @@ int ceph_fname_to_usr(const struct ceph_fname *fname, struct fscrypt_str *tname, * generating a nokey name via fscrypt. */ if (!fscrypt_has_encryption_key(fname->dir)) { - memcpy(oname->name, fname->name, fname->name_len); + if (fname->no_copy) + oname->name = fname->name; + else + memcpy(oname->name, fname->name, fname->name_len); oname->len = fname->name_len; if (is_nokey) *is_nokey = true; diff --git a/fs/ceph/crypto.h b/fs/ceph/crypto.h index 704e5a0e96f7..05db33f1a421 100644 --- a/fs/ceph/crypto.h +++ b/fs/ceph/crypto.h @@ -19,6 +19,7 @@ struct ceph_fname { unsigned char *ctext; // binary crypttext (if any) u32 name_len; // length of name buffer u32 ctext_len; // length of crypttext + bool no_copy; }; struct ceph_fscrypt_auth { diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c index caf2547c3fe1..5ce2a6384e55 100644 --- a/fs/ceph/dir.c +++ b/fs/ceph/dir.c @@ -9,6 +9,7 @@ #include "super.h" #include "mds_client.h" +#include "crypto.h" /* * Directory operations: readdir, lookup, create, link, unlink, @@ -241,7 +242,9 @@ static int __dcache_readdir(struct file *file, struct dir_context *ctx, di = ceph_dentry(dentry); if (d_unhashed(dentry) || d_really_is_negative(dentry) || - di->lease_shared_gen != shared_gen) { + di->lease_shared_gen != shared_gen || + ((dentry->d_flags & DCACHE_NOKEY_NAME) && + fscrypt_has_encryption_key(dir))) { spin_unlock(&dentry->d_lock); dput(dentry); err = -EAGAIN; @@ -340,6 +343,10 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) ctx->pos = 2; } + err = fscrypt_prepare_readdir(inode); + if (err) + return err; + spin_lock(&ci->i_ceph_lock); /* request Fx cap. if have Fx, we don't need to release Fs cap * for later create/unlink. */ @@ -389,6 +396,7 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS); if (IS_ERR(req)) return PTR_ERR(req); + err = ceph_alloc_readdir_reply_buffer(req, inode); if (err) { ceph_mdsc_put_request(req); @@ -402,11 +410,20 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) req->r_inode_drop = CEPH_CAP_FILE_EXCL; } if (dfi->last_name) { - req->r_path2 = kstrdup(dfi->last_name, GFP_KERNEL); + struct qstr d_name = { .name = dfi->last_name, + .len = strlen(dfi->last_name) }; + + req->r_path2 = kzalloc(NAME_MAX + 1, GFP_KERNEL); if (!req->r_path2) { ceph_mdsc_put_request(req); return -ENOMEM; } + + err = ceph_encode_encrypted_dname(inode, &d_name, req->r_path2); + if (err < 0) { + ceph_mdsc_put_request(req); + return err; + } } else if (is_hash_order(ctx->pos)) { req->r_args.readdir.offset_hash = cpu_to_le32(fpos_hash(ctx->pos)); @@ -511,15 +528,20 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) for (; i < rinfo->dir_nr; i++) { struct ceph_mds_reply_dir_entry *rde = rinfo->dir_entries + i; - BUG_ON(rde->offset < ctx->pos); + if (rde->offset < ctx->pos) { + pr_warn("%s: rde->offset 0x%llx ctx->pos 0x%llx\n", + __func__, rde->offset, ctx->pos); + return -EIO; + } + + if (WARN_ON_ONCE(!rde->inode.in)) + return -EIO; ctx->pos = rde->offset; dout("readdir (%d/%d) -> %llx '%.*s' %p\n", i, rinfo->dir_nr, ctx->pos, rde->name_len, rde->name, &rde->inode.in); - BUG_ON(!rde->inode.in); - if (!dir_emit(ctx, rde->name, rde->name_len, ceph_present_ino(inode->i_sb, le64_to_cpu(rde->inode.in->ino)), le32_to_cpu(rde->inode.in->mode) >> 12)) { @@ -532,6 +554,8 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) dout("filldir stopping us...\n"); return 0; } + + /* Reset the lengths to their original allocated vals */ ctx->pos++; } @@ -586,7 +610,6 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) dfi->dir_ordered_count); spin_unlock(&ci->i_ceph_lock); } - dout("readdir %p file %p done.\n", inode, file); return 0; } diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index cb82e74c80dd..557f1c90c400 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -1751,7 +1751,8 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req, struct ceph_mds_session *session) { struct dentry *parent = req->r_dentry; - struct ceph_inode_info *ci = ceph_inode(d_inode(parent)); + struct inode *inode = d_inode(parent); + struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_mds_reply_info_parsed *rinfo = &req->r_reply_info; struct qstr dname; struct dentry *dn; @@ -1825,9 +1826,7 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req, tvino.snap = le64_to_cpu(rde->inode.in->snapid); if (rinfo->hash_order) { - u32 hash = ceph_str_hash(ci->i_dir_layout.dl_dir_hash, - rde->name, rde->name_len); - hash = ceph_frag_value(hash); + u32 hash = ceph_frag_value(rde->raw_hash); if (hash != last_hash) fpos_offset = 2; last_hash = hash; @@ -1850,6 +1849,11 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req, err = -ENOMEM; goto out; } + if (rde->is_nokey) { + spin_lock(&dn->d_lock); + dn->d_flags |= DCACHE_NOKEY_NAME; + spin_unlock(&dn->d_lock); + } } else if (d_really_is_positive(dn) && (ceph_ino(d_inode(dn)) != tvino.ino || ceph_snap(d_inode(dn)) != tvino.snap)) { diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 609bee0e576f..40b85d41f950 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -439,20 +439,87 @@ static int parse_reply_info_readdir(void **p, void *end, info->dir_nr = num; while (num) { + struct inode *inode = d_inode(req->r_dentry); + struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_mds_reply_dir_entry *rde = info->dir_entries + i; + struct fscrypt_str tname = FSTR_INIT(NULL, 0); + struct fscrypt_str oname = FSTR_INIT(NULL, 0); + struct ceph_fname fname; + u32 altname_len, _name_len; + u8 *altname, *_name; + /* dentry */ - ceph_decode_32_safe(p, end, rde->name_len, bad); - ceph_decode_need(p, end, rde->name_len, bad); - rde->name = *p; - *p += rde->name_len; - dout("parsed dir dname '%.*s'\n", rde->name_len, rde->name); + ceph_decode_32_safe(p, end, _name_len, bad); + ceph_decode_need(p, end, _name_len, bad); + _name = *p; + *p += _name_len; + dout("parsed dir dname '%.*s'\n", _name_len, _name); + + if (info->hash_order) + rde->raw_hash = ceph_str_hash(ci->i_dir_layout.dl_dir_hash, + _name, _name_len); /* dentry lease */ err = parse_reply_info_lease(p, end, &rde->lease, features, - &rde->altname_len, &rde->altname); + &altname_len, &altname); if (err) goto out_bad; + /* + * Try to dencrypt the dentry names and update them + * in the ceph_mds_reply_dir_entry struct. + */ + fname.dir = inode; + fname.name = _name; + fname.name_len = _name_len; + fname.ctext = altname; + fname.ctext_len = altname_len; + /* + * The _name_len maybe larger than altname_len, such as + * when the human readable name length is in range of + * (CEPH_NOHASH_NAME_MAX, CEPH_NOHASH_NAME_MAX + SHA256_DIGEST_SIZE), + * then the copy in ceph_fname_to_usr will corrupt the + * data if there has no encryption key. + * + * Just set the no_copy flag and then if there has no + * encryption key the oname.name will be assigned to + * _name always. + */ + fname.no_copy = true; + if (altname_len == 0) { + /* + * Set tname to _name, and this will be used + * to do the base64_decode in-place. It's + * safe because the decoded string should + * always be shorter, which is 3/4 of origin + * string. + */ + tname.name = _name; + + /* + * Set oname to _name too, and this will be + * used to do the dencryption in-place. + */ + oname.name = _name; + oname.len = _name_len; + } else { + /* + * This will do the decryption only in-place + * from altname cryptext directly. + */ + oname.name = altname; + oname.len = altname_len; + } + rde->is_nokey = false; + err = ceph_fname_to_usr(&fname, &tname, &oname, &rde->is_nokey); + if (err) { + pr_err("%s unable to decode %.*s, got %d\n", __func__, + _name_len, _name, err); + goto out_bad; + } + rde->name = oname.name; + rde->name_len = oname.len; + /* inode */ err = parse_reply_info_in(p, end, &rde->inode, features); if (err < 0) @@ -3501,7 +3568,7 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg) if (err == 0) { if (result == 0 && (req->r_op == CEPH_MDS_OP_READDIR || req->r_op == CEPH_MDS_OP_LSSNAP)) - ceph_readdir_prepopulate(req, req->r_session); + err = ceph_readdir_prepopulate(req, req->r_session); } current->journal_info = NULL; mutex_unlock(&req->r_fill_mutex); diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h index cd719691a86d..046a9368c4a9 100644 --- a/fs/ceph/mds_client.h +++ b/fs/ceph/mds_client.h @@ -96,10 +96,10 @@ struct ceph_mds_reply_info_in { }; struct ceph_mds_reply_dir_entry { + bool is_nokey; char *name; - u8 *altname; u32 name_len; - u32 altname_len; + u32 raw_hash; struct ceph_mds_reply_lease *lease; struct ceph_mds_reply_info_in inode; loff_t offset; From patchwork Wed Apr 27 19:12:44 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566796 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4EA76C433EF for ; Wed, 27 Apr 2022 19:16:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233111AbiD0TUB (ORCPT ); Wed, 27 Apr 2022 15:20:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53468 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233543AbiD0TTT (ORCPT ); Wed, 27 Apr 2022 15:19:19 -0400 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3C26C3F8B7 for ; Wed, 27 Apr 2022 12:13:44 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 9EAA7CE2758 for ; Wed, 27 Apr 2022 19:13:42 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 79750C385AA; Wed, 27 Apr 2022 19:13:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086820; bh=cDPy0xI578WfkDIktMt8wJvIbPGQCoBzzMq2Fpkpaf4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=T4eN1w4g/6N7hWmhb24jm2FJmDGfWf2TnM9EgpMGdSvA4fTwJemKlJ9CuawaiiqfV ZsxVxRiJskU0O4O0dRL4xtyF38O9VjwE/q/xgmZsVMFf2XWlxckJCvERwcauKMjGyV CjeeVwrrGLCQDmsgObQ7mNL/c+E9UKoz1ihKx0/K+X0UFMWgPwhhZTGOFJJUtQ4knd D6UW2Ub0xiZqtPUzPxgJ5Vsw9xaMIF7xi+uwSHW4dZOJwsMYOBjVgEnr0ZnbsjQo/n 7hinBm92W9tXzhBq1TXJHWur4GJa7kBd/Vha7vm+BfClyfC3X6HZGHPqu8f6l2NKrM KlHYpNGgrfAJg== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 34/64] ceph: make ceph_get_name decrypt filenames Date: Wed, 27 Apr 2022 15:12:44 -0400 Message-Id: <20220427191314.222867-35-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org When we do a lookupino to the MDS, we get a filename in the trace. ceph_get_name uses that name directly, so we must properly decrypt it before copying it to the name buffer. Signed-off-by: Jeff Layton --- fs/ceph/export.c | 44 ++++++++++++++++++++++++++++++++------------ 1 file changed, 32 insertions(+), 12 deletions(-) diff --git a/fs/ceph/export.c b/fs/ceph/export.c index e0fa66ac8b9f..0ebf2bd93055 100644 --- a/fs/ceph/export.c +++ b/fs/ceph/export.c @@ -7,6 +7,7 @@ #include "super.h" #include "mds_client.h" +#include "crypto.h" /* * Basic fh @@ -534,7 +535,9 @@ static int ceph_get_name(struct dentry *parent, char *name, { struct ceph_mds_client *mdsc; struct ceph_mds_request *req; + struct inode *dir = d_inode(parent); struct inode *inode = d_inode(child); + struct ceph_mds_reply_info_parsed *rinfo; int err; if (ceph_snap(inode) != CEPH_NOSNAP) @@ -546,30 +549,47 @@ static int ceph_get_name(struct dentry *parent, char *name, if (IS_ERR(req)) return PTR_ERR(req); - inode_lock(d_inode(parent)); - + inode_lock(dir); req->r_inode = inode; ihold(inode); req->r_ino2 = ceph_vino(d_inode(parent)); - req->r_parent = d_inode(parent); - ihold(req->r_parent); + req->r_parent = dir; + ihold(dir); set_bit(CEPH_MDS_R_PARENT_LOCKED, &req->r_req_flags); req->r_num_caps = 2; err = ceph_mdsc_do_request(mdsc, NULL, req); + inode_unlock(dir); - inode_unlock(d_inode(parent)); + if (err) + goto out; - if (!err) { - struct ceph_mds_reply_info_parsed *rinfo = &req->r_reply_info; + rinfo = &req->r_reply_info; + if (!IS_ENCRYPTED(dir)) { memcpy(name, rinfo->dname, rinfo->dname_len); name[rinfo->dname_len] = 0; - dout("get_name %p ino %llx.%llx name %s\n", - child, ceph_vinop(inode), name); } else { - dout("get_name %p ino %llx.%llx err %d\n", - child, ceph_vinop(inode), err); - } + struct fscrypt_str oname = FSTR_INIT(NULL, 0); + struct ceph_fname fname = { .dir = dir, + .name = rinfo->dname, + .ctext = rinfo->altname, + .name_len = rinfo->dname_len, + .ctext_len = rinfo->altname_len }; + + err = ceph_fname_alloc_buffer(dir, &oname); + if (err < 0) + goto out; + err = ceph_fname_to_usr(&fname, NULL, &oname, NULL); + if (!err) { + memcpy(name, oname.name, oname.len); + name[oname.len] = 0; + } + ceph_fname_free_buffer(dir, &oname); + } +out: + dout("get_name %p ino %llx.%llx err %d %s%s\n", + child, ceph_vinop(inode), err, + err ? "" : "name ", err ? "" : name); ceph_mdsc_put_request(req); return err; } From patchwork Wed Apr 27 19:12:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566795 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 39358C433F5 for ; Wed, 27 Apr 2022 19:16:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234139AbiD0TUF (ORCPT ); Wed, 27 Apr 2022 15:20:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41600 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233519AbiD0TTS (ORCPT ); Wed, 27 Apr 2022 15:19:18 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 344953E5F8 for ; Wed, 27 Apr 2022 12:13:44 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 9A3CE619D0 for ; Wed, 27 Apr 2022 19:13:43 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8F85DC385A9; Wed, 27 Apr 2022 19:13:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086823; bh=055bRGvY8Rd9//uY4MNk52HNCIW2YMD+cbIrSeHBb+4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=UIkuPKZqIQ3M7JV2O39PNLsFcE1uRZmyNf2t+q6yz40y8/Oo+ZVYkkmVdy9PiMxIO q/YlWMgpLNvm2PN3Pj0rYQ+k+yqsl69VmMBh1B7mMdn8J6sRF9oVj8NHb1CpK9Wq0Z m4BAJ7twv9eLPQUphs7S0KGsxGx6yDKy6utjlaGJOCr9dDy0ZWpoqDfloAAQiJ2gTs OTEruhSFe0/nDEY1a1ox5B0CnaH63xfVDXukpS3KGhB3TCpt7ieBKwNztdOxcN/CuU nEggBpKcanumMfArofuAKnUhAtBFoo0yrbMGJExWCmVxec2aN6Rq9CoDgVXHOGISBE Jk1K5g1KR3Y5g== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 37/64] ceph: don't allow changing layout on encrypted files/directories Date: Wed, 27 Apr 2022 15:12:47 -0400 Message-Id: <20220427191314.222867-38-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Luís Henriques Encryption is currently only supported on files/directories with layouts where stripe_count=1. Forbid changing layouts when encryption is involved. Signed-off-by: Luis Henriques Signed-off-by: Jeff Layton --- fs/ceph/ioctl.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/fs/ceph/ioctl.c b/fs/ceph/ioctl.c index b9f0f4e460ab..9675ef3a6c47 100644 --- a/fs/ceph/ioctl.c +++ b/fs/ceph/ioctl.c @@ -294,6 +294,10 @@ static long ceph_set_encryption_policy(struct file *file, unsigned long arg) struct inode *inode = file_inode(file); struct ceph_inode_info *ci = ceph_inode(inode); + /* encrypted directories can't have striped layout */ + if (ci->i_layout.stripe_count > 1) + return -EINVAL; + ret = vet_mds_for_fscrypt(file); if (ret) return ret; From patchwork Wed Apr 27 19:12:49 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566790 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D1CFBC433EF for ; Wed, 27 Apr 2022 19:17:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233158AbiD0TUd (ORCPT ); Wed, 27 Apr 2022 15:20:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53124 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233584AbiD0TTU (ORCPT ); Wed, 27 Apr 2022 15:19:20 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1375449F0B for ; Wed, 27 Apr 2022 12:13:47 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id B1002B8294E for ; Wed, 27 Apr 2022 19:13:45 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id EFA61C385AA; Wed, 27 Apr 2022 19:13:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086824; bh=PaEsfkI2axXbFqdz+NTbvv+Rq8vLu7oxQGOIhXT9a3U=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=FUdvVZgF8OlSR6rpv09RuD3rzkn/c3TC/u0SHI4CBwCHLUErKrhDd4h6gSV4IsW9g d1ghtW9nvnHl/d0RTQldfipG1gWG0LkkngMFoT4fWPmXCDaOueYR0XMuwkWU18AaCo PLJ9UoXQepp5KIthijVHO75tEBRR8JxcD25xMCm0/qKzYDn07RWtUcin+Iiel0AlGA Yf+bH0qe8MhODmUNGupjlVrGTd9xNgkr2chxmkJvYX7/9dwptTrBKYL4ltG5l/RGZY j61ySn8HDjyKElk1ETShCeEPtIG4WNDkwlZqI+Za4CTs5pa/X/h6HhllZvp2UHSXZQ u6BDXpVLEfpnA== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 39/64] ceph: size handling for encrypted inodes in cap updates Date: Wed, 27 Apr 2022 15:12:49 -0400 Message-Id: <20220427191314.222867-40-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Transmit the rounded-up size as the normal size, and fill out the fscrypt_file field with the real file size. Signed-off-by: Jeff Layton --- fs/ceph/caps.c | 43 +++++++++++++++++++++++++------------------ fs/ceph/crypto.h | 4 ++++ 2 files changed, 29 insertions(+), 18 deletions(-) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index 98230fd58b07..1d717433427c 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -1215,10 +1215,9 @@ struct cap_msg_args { umode_t mode; bool inline_data; bool wake; + bool encrypted; u32 fscrypt_auth_len; - u32 fscrypt_file_len; u8 fscrypt_auth[sizeof(struct ceph_fscrypt_auth)]; // for context - u8 fscrypt_file[sizeof(u64)]; // for size }; /* Marshal up the cap msg to the MDS */ @@ -1253,7 +1252,12 @@ static void encode_cap_msg(struct ceph_msg *msg, struct cap_msg_args *arg) fc->ino = cpu_to_le64(arg->ino); fc->snap_follows = cpu_to_le64(arg->follows); - fc->size = cpu_to_le64(arg->size); +#if IS_ENABLED(CONFIG_FS_ENCRYPTION) + if (arg->encrypted) + fc->size = cpu_to_le64(round_up(arg->size, CEPH_FSCRYPT_BLOCK_SIZE)); + else +#endif + fc->size = cpu_to_le64(arg->size); fc->max_size = cpu_to_le64(arg->max_size); ceph_encode_timespec64(&fc->mtime, &arg->mtime); ceph_encode_timespec64(&fc->atime, &arg->atime); @@ -1313,11 +1317,17 @@ static void encode_cap_msg(struct ceph_msg *msg, struct cap_msg_args *arg) ceph_encode_64(&p, 0); #if IS_ENABLED(CONFIG_FS_ENCRYPTION) - /* fscrypt_auth and fscrypt_file (version 12) */ + /* + * fscrypt_auth and fscrypt_file (version 12) + * + * fscrypt_auth holds the crypto context (if any). fscrypt_file + * tracks the real i_size as an __le64 field (and we use a rounded-up + * i_size in * the traditional size field). + */ ceph_encode_32(&p, arg->fscrypt_auth_len); ceph_encode_copy(&p, arg->fscrypt_auth, arg->fscrypt_auth_len); - ceph_encode_32(&p, arg->fscrypt_file_len); - ceph_encode_copy(&p, arg->fscrypt_file, arg->fscrypt_file_len); + ceph_encode_32(&p, sizeof(__le64)); + ceph_encode_64(&p, arg->size); #else /* CONFIG_FS_ENCRYPTION */ ceph_encode_32(&p, 0); ceph_encode_32(&p, 0); @@ -1389,7 +1399,6 @@ static void __prep_cap(struct cap_msg_args *arg, struct ceph_cap *cap, arg->follows = flushing ? ci->i_head_snapc->seq : 0; arg->flush_tid = flush_tid; arg->oldest_flush_tid = oldest_flush_tid; - arg->size = i_size_read(inode); ci->i_reported_size = arg->size; arg->max_size = ci->i_wanted_max_size; @@ -1443,6 +1452,7 @@ static void __prep_cap(struct cap_msg_args *arg, struct ceph_cap *cap, } } arg->flags = flags; + arg->encrypted = IS_ENCRYPTED(inode); #if IS_ENABLED(CONFIG_FS_ENCRYPTION) if (ci->fscrypt_auth_len && WARN_ON_ONCE(ci->fscrypt_auth_len > sizeof(struct ceph_fscrypt_auth))) { @@ -1453,21 +1463,21 @@ static void __prep_cap(struct cap_msg_args *arg, struct ceph_cap *cap, memcpy(arg->fscrypt_auth, ci->fscrypt_auth, min_t(size_t, ci->fscrypt_auth_len, sizeof(arg->fscrypt_auth))); } - /* FIXME: use this to track "real" size */ - arg->fscrypt_file_len = 0; #endif /* CONFIG_FS_ENCRYPTION */ } +#if IS_ENABLED(CONFIG_FS_ENCRYPTION) #define CAP_MSG_FIXED_FIELDS (sizeof(struct ceph_mds_caps) + \ - 4 + 8 + 4 + 4 + 8 + 4 + 4 + 4 + 8 + 8 + 4 + 8 + 8 + 4 + 4) + 4 + 8 + 4 + 4 + 8 + 4 + 4 + 4 + 8 + 8 + 4 + 8 + 8 + 4 + 4 + 8) -#if IS_ENABLED(CONFIG_FS_ENCRYPTION) static inline int cap_msg_size(struct cap_msg_args *arg) { - return CAP_MSG_FIXED_FIELDS + arg->fscrypt_auth_len + - arg->fscrypt_file_len; + return CAP_MSG_FIXED_FIELDS + arg->fscrypt_auth_len; } #else +#define CAP_MSG_FIXED_FIELDS (sizeof(struct ceph_mds_caps) + \ + 4 + 8 + 4 + 4 + 8 + 4 + 4 + 4 + 8 + 8 + 4 + 8 + 8 + 4 + 4) + static inline int cap_msg_size(struct cap_msg_args *arg) { return CAP_MSG_FIXED_FIELDS; @@ -1546,13 +1556,10 @@ static inline int __send_flush_snap(struct inode *inode, arg.inline_data = capsnap->inline_data; arg.flags = 0; arg.wake = false; + arg.encrypted = IS_ENCRYPTED(inode); - /* - * No fscrypt_auth changes from a capsnap. It will need - * to update fscrypt_file on size changes (TODO). - */ + /* No fscrypt_auth changes from a capsnap.*/ arg.fscrypt_auth_len = 0; - arg.fscrypt_file_len = 0; msg = ceph_msg_new(CEPH_MSG_CLIENT_CAPS, cap_msg_size(&arg), GFP_NOFS, false); diff --git a/fs/ceph/crypto.h b/fs/ceph/crypto.h index 05db33f1a421..918da2be4165 100644 --- a/fs/ceph/crypto.h +++ b/fs/ceph/crypto.h @@ -9,6 +9,10 @@ #include #include +#define CEPH_FSCRYPT_BLOCK_SHIFT 12 +#define CEPH_FSCRYPT_BLOCK_SIZE (_AC(1, UL) << CEPH_FSCRYPT_BLOCK_SHIFT) +#define CEPH_FSCRYPT_BLOCK_MASK (~(CEPH_FSCRYPT_BLOCK_SIZE-1)) + struct ceph_fs_client; struct ceph_acl_sec_ctx; struct ceph_mds_request; From patchwork Wed Apr 27 19:12:50 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566778 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BB630C4332F for ; Wed, 27 Apr 2022 19:24:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233502AbiD0T1t (ORCPT ); Wed, 27 Apr 2022 15:27:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38482 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233581AbiD0TTU (ORCPT ); Wed, 27 Apr 2022 15:19:20 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 136053A5C4 for ; Wed, 27 Apr 2022 12:13:47 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 93706B82929 for ; Wed, 27 Apr 2022 19:13:46 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A5F80C385A7; Wed, 27 Apr 2022 19:13:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086825; bh=9O4xylyOaVpQN4Q5DhwiD7Ip/SEnFJaingH96ZH9ECo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=YUKQkSkWmAMJllwtiTs0DKKrKul6FklqLmhUURNgFP6ebA/eu6iKEFivWDjkx8UlG 3CAz/lTacpbPEmfMdqE72FAnkzor7WqL/aXwUasHwOkpBdQYfb+SUhZQeg0pAPht+v hn90W0/rSSCmHD6/6CHRv4gP/5jj0LEdI3IF4qfpPVm/RB0EcCYtCENS/wf+fv0LDc RKqKCjFNycRsdXwABAjygQCMggWIObtZLW/ngUwxCBLPd3q5eb+q4ZM3L+wQ2P433G wfdmbNs+OAl0qT5VC8dqP8dkIuStvl2uF8g7Vs/DD8S4tNlHIWdMplCpzL9Oa6PqjV U4swaivim/njA== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 40/64] ceph: fscrypt_file field handling in MClientRequest messages Date: Wed, 27 Apr 2022 15:12:50 -0400 Message-Id: <20220427191314.222867-41-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org For encrypted inodes, transmit a rounded-up size to the MDS as the normal file size and send the real inode size in fscrypt_file field. Also, fix up creates and truncates to also transmit fscrypt_file. Signed-off-by: Jeff Layton --- fs/ceph/dir.c | 3 +++ fs/ceph/file.c | 1 + fs/ceph/inode.c | 18 ++++++++++++++++-- fs/ceph/mds_client.c | 9 ++++++++- fs/ceph/mds_client.h | 2 ++ 5 files changed, 30 insertions(+), 3 deletions(-) diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c index 80ad094790c5..544aa5e78a31 100644 --- a/fs/ceph/dir.c +++ b/fs/ceph/dir.c @@ -910,6 +910,9 @@ static int ceph_mknod(struct user_namespace *mnt_userns, struct inode *dir, goto out_req; } + if (S_ISREG(mode) && IS_ENCRYPTED(dir)) + set_bit(CEPH_MDS_R_FSCRYPT_FILE, &req->r_req_flags); + req->r_dentry = dget(dentry); req->r_num_caps = 2; req->r_parent = dir; diff --git a/fs/ceph/file.c b/fs/ceph/file.c index 8918aece8b5a..0f4a18457259 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -774,6 +774,7 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry, req->r_parent = dir; ihold(dir); if (IS_ENCRYPTED(dir)) { + set_bit(CEPH_MDS_R_FSCRYPT_FILE, &req->r_req_flags); if (!fscrypt_has_encryption_key(dir)) { spin_lock(&dentry->d_lock); dentry->d_flags |= DCACHE_NOKEY_NAME; diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index f9d07bd28c7f..4fa4e206e8f4 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -2377,11 +2377,25 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *c } } else if ((issued & CEPH_CAP_FILE_SHARED) == 0 || attr->ia_size != isize) { - req->r_args.setattr.size = cpu_to_le64(attr->ia_size); - req->r_args.setattr.old_size = cpu_to_le64(isize); mask |= CEPH_SETATTR_SIZE; release |= CEPH_CAP_FILE_SHARED | CEPH_CAP_FILE_EXCL | CEPH_CAP_FILE_RD | CEPH_CAP_FILE_WR; + if (IS_ENCRYPTED(inode) && attr->ia_size) { + set_bit(CEPH_MDS_R_FSCRYPT_FILE, &req->r_req_flags); + mask |= CEPH_SETATTR_FSCRYPT_FILE; + req->r_args.setattr.size = + cpu_to_le64(round_up(attr->ia_size, + CEPH_FSCRYPT_BLOCK_SIZE)); + req->r_args.setattr.old_size = + cpu_to_le64(round_up(isize, + CEPH_FSCRYPT_BLOCK_SIZE)); + req->r_fscrypt_file = attr->ia_size; + /* FIXME: client must zero out any partial blocks! */ + } else { + req->r_args.setattr.size = cpu_to_le64(attr->ia_size); + req->r_args.setattr.old_size = cpu_to_le64(isize); + req->r_fscrypt_file = 0; + } } } if (ia_valid & ATTR_MTIME) { diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 40b85d41f950..5f4da3aa8786 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -2752,7 +2752,12 @@ static void encode_mclientrequest_tail(void **p, const struct ceph_mds_request * } else { ceph_encode_32(p, 0); } - ceph_encode_32(p, 0); // fscrypt_file for now + if (test_bit(CEPH_MDS_R_FSCRYPT_FILE, &req->r_req_flags)) { + ceph_encode_32(p, sizeof(__le64)); + ceph_encode_64(p, req->r_fscrypt_file); + } else { + ceph_encode_32(p, 0); + } } /* @@ -2838,6 +2843,8 @@ static struct ceph_msg *create_request_message(struct ceph_mds_session *session, /* fscrypt_file */ len += sizeof(u32); + if (test_bit(CEPH_MDS_R_FSCRYPT_FILE, &req->r_req_flags)) + len += sizeof(__le64); msg = ceph_msg_new2(CEPH_MSG_CLIENT_REQUEST, len, 1, GFP_NOFS, false); if (!msg) { diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h index 046a9368c4a9..e297bf98c39f 100644 --- a/fs/ceph/mds_client.h +++ b/fs/ceph/mds_client.h @@ -282,6 +282,7 @@ struct ceph_mds_request { #define CEPH_MDS_R_DID_PREPOPULATE (6) /* prepopulated readdir */ #define CEPH_MDS_R_PARENT_LOCKED (7) /* is r_parent->i_rwsem wlocked? */ #define CEPH_MDS_R_ASYNC (8) /* async request */ +#define CEPH_MDS_R_FSCRYPT_FILE (9) /* must marshal fscrypt_file field */ unsigned long r_req_flags; struct mutex r_fill_mutex; @@ -289,6 +290,7 @@ struct ceph_mds_request { union ceph_mds_request_args r_args; struct ceph_fscrypt_auth *r_fscrypt_auth; + u64 r_fscrypt_file; u8 *r_altname; /* fscrypt binary crypttext for long filenames */ u32 r_altname_len; /* length of r_altname */ From patchwork Wed Apr 27 19:12:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566793 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 446E8C433F5 for ; Wed, 27 Apr 2022 19:17:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229461AbiD0TUR (ORCPT ); Wed, 27 Apr 2022 15:20:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53546 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233264AbiD0TTT (ORCPT ); Wed, 27 Apr 2022 15:19:19 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CCB7549277 for ; Wed, 27 Apr 2022 12:13:46 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 6347B619E1 for ; Wed, 27 Apr 2022 19:13:46 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5B921C385AF; Wed, 27 Apr 2022 19:13:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086825; bh=vJLDHbtT13mhryEotWQYEflQwUADw6c3JX3gik+rhUQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=P/UGaeXk3ZHT18BRDBmb9Apws2XP5YgW+ViaD9Wr8qwKdMvPXPJKiCv9vygwMLcC5 AJzZfcRVBnEAmVqWQihxlvxgtD8ror4w7Twzb9HK+qkbFBWCHAw/+N2jbH9B0ulwKo o7iO1UpJYXkI6wOQ0ZSM3H65wX/lCstJEDXJ/nd9ZDCfkg/PFrPBYn42U3T2fwhxCh JZ08t3uujW9fOJ9ENgSYvZR/ybS4vTdahUlt+uQtzzrZirwGUMsSijP5BQOuny/tRQ nv0/RNonQWjTgKpF8tCkd8H+mUORpFKZCf1hzv+SV5ike3IqLH8FrTO35NRhouALiK CSf1x5ma7QwXg== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 41/64] ceph: get file size from fscrypt_file when present in inode traces Date: Wed, 27 Apr 2022 15:12:51 -0400 Message-Id: <20220427191314.222867-42-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org When we get an inode trace from the MDS, grab the fscrypt_file field if the inode is encrypted, and use it to populate the i_size field instead of the regular inode size field. Signed-off-by: Jeff Layton --- fs/ceph/inode.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index 4fa4e206e8f4..d606dd7c093b 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -1024,6 +1024,7 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page, if (new_version || (new_issued & (CEPH_CAP_ANY_FILE_RD | CEPH_CAP_ANY_FILE_WR))) { + u64 size = le64_to_cpu(info->size); s64 old_pool = ci->i_layout.pool_id; struct ceph_string *old_ns; @@ -1037,10 +1038,21 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page, pool_ns = old_ns; + if (IS_ENCRYPTED(inode) && size && (iinfo->fscrypt_file_len == sizeof(__le64))) { + u64 fsize = __le64_to_cpu(*(__le64 *)iinfo->fscrypt_file); + + if (size == round_up(fsize, CEPH_FSCRYPT_BLOCK_SIZE)) { + size = fsize; + } else { + pr_warn("fscrypt size mismatch: size=%llu fscrypt_file=%llu, discarding fscrypt_file size.\n", + info->size, size); + } + } + queue_trunc = ceph_fill_file_size(inode, issued, - le32_to_cpu(info->truncate_seq), - le64_to_cpu(info->truncate_size), - le64_to_cpu(info->size)); + le32_to_cpu(info->truncate_seq), + le64_to_cpu(info->truncate_size), + size); /* only update max_size on auth cap */ if ((info->cap.flags & CEPH_CAP_FLAG_AUTH) && ci->i_max_size != le64_to_cpu(info->max_size)) { From patchwork Wed Apr 27 19:12:52 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566779 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 17EEBC433F5 for ; Wed, 27 Apr 2022 19:24:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232853AbiD0T1n (ORCPT ); Wed, 27 Apr 2022 15:27:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38476 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233598AbiD0TTU (ORCPT ); Wed, 27 Apr 2022 15:19:20 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 21E651113 for ; Wed, 27 Apr 2022 12:13:47 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id AAD826199D for ; Wed, 27 Apr 2022 19:13:46 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 11128C385A9; Wed, 27 Apr 2022 19:13:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086826; bh=2PyeAVlbbAqfhc0GWyi5UBr2yxqc8D2pBq0UYro16yM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=fFd1liKOu9lFqREYwGRP9XtwsvayfpVIQzq96jBRdcYVkcJB5l/UGC1tZRlW1w36l b3KwSMsZIZdDafKT9F80OYFVLLWx0CJ8XwB2p73/9BZFN8CrSXxSy6TmSLrg+FfYLf lAwSaIVBVRnHgS3LHouKQEIduXU0Fi2XZWwaarfJ5FMBtGgOnzYErg4MWVY4VYLphG e9OggKs9AcMZJaLIuBhoLpwRWjE3EILViHlxs9W4KORTPJp1Ul4Xd3yTAJhp7sdUEA 9LjHdySJY8VNZts5SEAGlEsZfP0y2IOAIRBBI7A6hjRXKr39VF0oYDpkjIyJbx4IS/ 8QY6h8kMLvxSA== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 42/64] ceph: handle fscrypt fields in cap messages from MDS Date: Wed, 27 Apr 2022 15:12:52 -0400 Message-Id: <20220427191314.222867-43-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Handle the new fscrypt_file and fscrypt_auth fields in cap messages. Use them to populate new fields in cap_extra_info and update the inode with those values. Signed-off-by: Jeff Layton --- fs/ceph/caps.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 76 insertions(+), 2 deletions(-) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index 1d717433427c..c931290e5f1a 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -3368,6 +3368,9 @@ struct cap_extra_info { /* currently issued */ int issued; struct timespec64 btime; + u8 *fscrypt_auth; + u32 fscrypt_auth_len; + u64 fscrypt_file_size; }; /* @@ -3400,6 +3403,14 @@ static void handle_cap_grant(struct inode *inode, bool deleted_inode = false; bool fill_inline = false; + /* + * If there is at least one crypto block then we'll trust fscrypt_file_size. + * If the real length of the file is 0, then ignore it (it has probably been + * truncated down to 0 by the MDS). + */ + if (IS_ENCRYPTED(inode) && size) + size = extra_info->fscrypt_file_size; + dout("handle_cap_grant inode %p cap %p mds%d seq %d %s\n", inode, cap, session->s_mds, seq, ceph_cap_string(newcaps)); dout(" size %llu max_size %llu, i_size %llu\n", size, max_size, @@ -3466,6 +3477,10 @@ static void handle_cap_grant(struct inode *inode, dout("%p mode 0%o uid.gid %d.%d\n", inode, inode->i_mode, from_kuid(&init_user_ns, inode->i_uid), from_kgid(&init_user_ns, inode->i_gid)); + + WARN_ON_ONCE(ci->fscrypt_auth_len != extra_info->fscrypt_auth_len || + memcmp(ci->fscrypt_auth, extra_info->fscrypt_auth, + ci->fscrypt_auth_len)); } if ((newcaps & CEPH_CAP_LINK_SHARED) && @@ -3876,7 +3891,8 @@ static void handle_cap_flushsnap_ack(struct inode *inode, u64 flush_tid, */ static bool handle_cap_trunc(struct inode *inode, struct ceph_mds_caps *trunc, - struct ceph_mds_session *session) + struct ceph_mds_session *session, + struct cap_extra_info *extra_info) { struct ceph_inode_info *ci = ceph_inode(inode); int mds = session->s_mds; @@ -3893,6 +3909,14 @@ static bool handle_cap_trunc(struct inode *inode, issued |= implemented | dirty; + /* + * If there is at least one crypto block then we'll trust fscrypt_file_size. + * If the real length of the file is 0, then ignore it (it has probably been + * truncated down to 0 by the MDS). + */ + if (IS_ENCRYPTED(inode) && size) + size = extra_info->fscrypt_file_size; + dout("handle_cap_trunc inode %p mds%d seq %d to %lld seq %d\n", inode, mds, seq, truncate_size, truncate_seq); queue_trunc = ceph_fill_file_size(inode, issued, @@ -4114,6 +4138,49 @@ static void handle_cap_import(struct ceph_mds_client *mdsc, *target_cap = cap; } +#ifdef CONFIG_FS_ENCRYPTION +static int parse_fscrypt_fields(void **p, void *end, struct cap_extra_info *extra) +{ + u32 len; + + ceph_decode_32_safe(p, end, extra->fscrypt_auth_len, bad); + if (extra->fscrypt_auth_len) { + ceph_decode_need(p, end, extra->fscrypt_auth_len, bad); + extra->fscrypt_auth = kmalloc(extra->fscrypt_auth_len, GFP_KERNEL); + if (!extra->fscrypt_auth) + return -ENOMEM; + ceph_decode_copy_safe(p, end, extra->fscrypt_auth, + extra->fscrypt_auth_len, bad); + } + + ceph_decode_32_safe(p, end, len, bad); + if (len >= sizeof(u64)) { + ceph_decode_64_safe(p, end, extra->fscrypt_file_size, bad); + len -= sizeof(u64); + } + ceph_decode_skip_n(p, end, len, bad); + return 0; +bad: + return -EIO; +} +#else +static int parse_fscrypt_fields(void **p, void *end, struct cap_extra_info *extra) +{ + u32 len; + + /* Don't care about these fields unless we're encryption-capable */ + ceph_decode_32_safe(p, end, len, bad); + if (len) + ceph_decode_skip_n(p, end, len, bad); + ceph_decode_32_safe(p, end, len, bad); + if (len) + ceph_decode_skip_n(p, end, len, bad); + return 0; +bad: + return -EIO; +} +#endif + /* * Handle a caps message from the MDS. * @@ -4232,6 +4299,11 @@ void ceph_handle_caps(struct ceph_mds_session *session, ceph_decode_64_safe(&p, end, extra_info.nsubdirs, bad); } + if (msg_version >= 12) { + if (parse_fscrypt_fields(&p, end, &extra_info)) + goto bad; + } + /* lookup ino */ inode = ceph_find_inode(mdsc->fsc->sb, vino); dout(" op %s ino %llx.%llx inode %p\n", ceph_cap_op_name(op), vino.ino, @@ -4328,7 +4400,8 @@ void ceph_handle_caps(struct ceph_mds_session *session, break; case CEPH_CAP_OP_TRUNC: - queue_trunc = handle_cap_trunc(inode, h, session); + queue_trunc = handle_cap_trunc(inode, h, session, + &extra_info); spin_unlock(&ci->i_ceph_lock); if (queue_trunc) ceph_queue_vmtruncate(inode); @@ -4346,6 +4419,7 @@ void ceph_handle_caps(struct ceph_mds_session *session, iput(inode); out: ceph_put_string(extra_info.pool_ns); + kfree(extra_info.fscrypt_auth); return; flush_cap_releases: From patchwork Wed Apr 27 19:12:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566789 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7F247C433FE for ; Wed, 27 Apr 2022 19:17:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234023AbiD0TUh (ORCPT ); Wed, 27 Apr 2022 15:20:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43210 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233715AbiD0TTV (ORCPT ); Wed, 27 Apr 2022 15:19:21 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 38CBC53E17 for ; Wed, 27 Apr 2022 12:13:51 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id D6F27B8291B for ; Wed, 27 Apr 2022 19:13:49 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 276BEC385A9; Wed, 27 Apr 2022 19:13:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086828; bh=hufDPpYqCeqZaiv/AsWCxiKTSUC5k110DLYG84D+Pzw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=a5AXTUjZXx0YBAx7/WJbQe/aDTP1gFfveC6O4gZ7qk7BuiDAA8qy7Cx+SIUyPAOyU 1QHxhh3LQRG2VzLU/cWxfv8BtyctHIyZmOsCDx9hpS4drNFe7p3qFyJnIW1eedceNp z5Aqs7fBNXO+8D3szcWYkRrsDG8Y2XNyrK+olRf5RJwJ6LUDddoVAabaEFLN51ToAH A0ldCinquezk6EblquU10oETQ8a3t+zKJmiENBFAIasnHXJWVtDH4KrPLdv+EdeudX gj2VeSyipQIww+cqHrdvzTz92GCTnM1V/TBGxhe5YheJT2zC/gAQt7+L77JqwpvMYA YpEgN1FujIA/A== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 45/64] ceph: add __ceph_sync_read helper support Date: Wed, 27 Apr 2022 15:12:55 -0400 Message-Id: <20220427191314.222867-46-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li Turn the guts of ceph_sync_read into a new helper that takes an inode and an offset instead of a kiocb struct, and make ceph_sync_read call the helper as a wrapper. Signed-off-by: Xiubo Li Signed-off-by: Jeff Layton --- fs/ceph/file.c | 33 +++++++++++++++++++++------------ fs/ceph/super.h | 2 ++ 2 files changed, 23 insertions(+), 12 deletions(-) diff --git a/fs/ceph/file.c b/fs/ceph/file.c index 0f4a18457259..297fa668bb06 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -937,22 +937,19 @@ enum { * If we get a short result from the OSD, check against i_size; we need to * only return a short read to the caller if we hit EOF. */ -static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to, - int *retry_op) +ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, + struct iov_iter *to, int *retry_op) { - struct file *file = iocb->ki_filp; - struct inode *inode = file_inode(file); struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_fs_client *fsc = ceph_inode_to_client(inode); struct ceph_osd_client *osdc = &fsc->client->osdc; ssize_t ret; - u64 off = iocb->ki_pos; + u64 off = *ki_pos; u64 len = iov_iter_count(to); u64 i_size = i_size_read(inode); bool sparse = ceph_test_mount_opt(fsc, SPARSEREAD); - dout("sync_read on file %p %llu~%u %s\n", file, off, (unsigned)len, - (file->f_flags & O_DIRECT) ? "O_DIRECT" : ""); + dout("sync_read on inode %p %llx~%llx\n", inode, *ki_pos, len); if (!len) return 0; @@ -1071,14 +1068,14 @@ static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to, break; } - if (off > iocb->ki_pos) { + if (off > *ki_pos) { if (off >= i_size) { *retry_op = CHECK_EOF; - ret = i_size - iocb->ki_pos; - iocb->ki_pos = i_size; + ret = i_size - *ki_pos; + *ki_pos = i_size; } else { - ret = off - iocb->ki_pos; - iocb->ki_pos = off; + ret = off - *ki_pos; + *ki_pos = off; } } @@ -1086,6 +1083,18 @@ static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to, return ret; } +static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to, + int *retry_op) +{ + struct file *file = iocb->ki_filp; + struct inode *inode = file_inode(file); + + dout("sync_read on file %p %llx~%zx %s\n", file, iocb->ki_pos, + iov_iter_count(to), (file->f_flags & O_DIRECT) ? "O_DIRECT" : ""); + + return __ceph_sync_read(inode, &iocb->ki_pos, to, retry_op); +} + struct ceph_aio_request { struct kiocb *iocb; size_t total_len; diff --git a/fs/ceph/super.h b/fs/ceph/super.h index d56f39c136ea..168ae951703d 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -1259,6 +1259,8 @@ extern int ceph_renew_caps(struct inode *inode, int fmode); extern int ceph_open(struct inode *inode, struct file *file); extern int ceph_atomic_open(struct inode *dir, struct dentry *dentry, struct file *file, unsigned flags, umode_t mode); +extern ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, + struct iov_iter *to, int *retry_op); extern int ceph_release(struct inode *inode, struct file *filp); extern void ceph_fill_inline_data(struct inode *inode, struct page *locked_page, char *data, size_t len); From patchwork Wed Apr 27 19:13:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566788 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id ABA0EC433EF for ; Wed, 27 Apr 2022 19:17:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233837AbiD0TUm (ORCPT ); Wed, 27 Apr 2022 15:20:42 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53286 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233762AbiD0TTW (ORCPT ); Wed, 27 Apr 2022 15:19:22 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2347F53A6B for ; Wed, 27 Apr 2022 12:13:53 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id B4498619FC for ; Wed, 27 Apr 2022 19:13:52 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9D963C385A7; Wed, 27 Apr 2022 19:13:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086832; bh=bTGL2eTBz8WfSmM2NY3CC8x3l6rD2G+I0t/E0HV3Qts=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=tIVhNS+0NNBaraJ0Gx85JwJJ+pNYlfEMS53OpjfA7cffkAVpecMoUM0KEjX2OHbga PZHj+V+86uEetwN/S7s9uvS3nQe1X+NFe17PHb12aasvvYaQts2Mubzzp6TS7cFSOv layKyZXBfQ5Xkl7ncd32eiKGScq8NGl5Hk+IfNSPs4cyjZ7vT9Su9aJW5Pe6fBQrTQ i+9GN5I1AS8NEC0s/U0x6AnVsKTCA+F4jgWfj+S1gGHqjnFqgxosvN+f1k0XgdKtcc yXcZDFRnaxpAPJKe4gXjK/KzjGGnTqM1K9+ivC7dzQro5IqMT9DfLXP/bOu6Ec5g4m fEDQUCHEW5FoQ== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 50/64] ceph: disable fallocate for encrypted inodes Date: Wed, 27 Apr 2022 15:13:00 -0400 Message-Id: <20220427191314.222867-51-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org ...hopefully, just for now. Signed-off-by: Jeff Layton --- fs/ceph/file.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/fs/ceph/file.c b/fs/ceph/file.c index 7168cf97924b..1024dc57898d 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -2217,6 +2217,9 @@ static long ceph_fallocate(struct file *file, int mode, if (!S_ISREG(inode->i_mode)) return -EOPNOTSUPP; + if (IS_ENCRYPTED(inode)) + return -EOPNOTSUPP; + prealloc_cf = ceph_alloc_cap_flush(); if (!prealloc_cf) return -ENOMEM; From patchwork Wed Apr 27 19:13:01 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566781 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CBA48C433F5 for ; Wed, 27 Apr 2022 19:19:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233952AbiD0TWS (ORCPT ); Wed, 27 Apr 2022 15:22:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43858 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233826AbiD0TTX (ORCPT ); Wed, 27 Apr 2022 15:19:23 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6C080562C8 for ; Wed, 27 Apr 2022 12:13:55 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 27AF8B8291D for ; Wed, 27 Apr 2022 19:13:54 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5EE22C385AA; Wed, 27 Apr 2022 19:13:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086832; bh=qhce0OEg+JNVepHokYTlytSJKJrB4yhKGHHYs3Z054Q=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=hqvh/Up3ruXLX9tNu8fxLsmb+CR4iNIkeETD9Zpro1iK1Ni0r1TpsStgkN3tScSy6 iStrErmYYX649G7B9QzC+6KsvyUJY0vZt55aWy8Unuvh9niCSmidCjOiEPhXi0+2Nu Y6Q6uapt1/QZKlKCGk0cmGL3iSCV4P03CSYIkXhTaddd6ozVwM5uYBb2zsn6Ct/Fhq DeLsDEIkfq4CNqK0iHEkkumN0ySGWzkCXtW75ZU5Pv4g3Muw1N7xzSYGefXV8NcdTl JQibwCT7L6I7EjDM/5Ea7QyGUFG0CGZ2us2lfn9God6jVKvvFiwZgHTf62i7TKskBT S5GNfh6WMMrhg== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 51/64] ceph: disable copy offload on encrypted inodes Date: Wed, 27 Apr 2022 15:13:01 -0400 Message-Id: <20220427191314.222867-52-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org If we have an encrypted inode, then the client will need to re-encrypt the contents of the new object. Disable copy offload to or from encrypted inodes. Signed-off-by: Jeff Layton --- fs/ceph/file.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/fs/ceph/file.c b/fs/ceph/file.c index 1024dc57898d..483d7d016ad6 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -2536,6 +2536,10 @@ static ssize_t __ceph_copy_file_range(struct file *src_file, loff_t src_off, return -EOPNOTSUPP; } + /* Every encrypted inode gets its own key, so we can't offload them */ + if (IS_ENCRYPTED(src_inode) || IS_ENCRYPTED(dst_inode)) + return -EOPNOTSUPP; + if (len < src_ci->i_layout.object_size) return -EOPNOTSUPP; /* no remote copy will be done */ From patchwork Wed Apr 27 19:13:02 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566780 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 88FD0C433EF for ; Wed, 27 Apr 2022 19:24:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231532AbiD0T1k (ORCPT ); Wed, 27 Apr 2022 15:27:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43846 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233807AbiD0TTX (ORCPT ); Wed, 27 Apr 2022 15:19:23 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 86CE26246 for ; Wed, 27 Apr 2022 12:13:54 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 1AB14619E1 for ; Wed, 27 Apr 2022 19:13:54 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 149ECC385A9; Wed, 27 Apr 2022 19:13:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086833; bh=axX5BkqLZDK6roGl5PQHu8P5am5/8hnJSh4Bqgb3oGo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=SmWpwveIV6FX4AeX7CGCZzrOL5iwqtRWn9/5d8AaBryYfNQxxg0BKIVYCODRqmNrg Nuv70WUUFL3bJbVITLR25Ejz7Tgp5/Pl0gricvoURwy4A1kHFt5bcLaZCE7d68m/GD F9ZN6HwqzvyWvDWVdR7eBMFbXCyi9/Dg7Yzd9w1uXgjGvtfuIShgPNanCaNavclULA qjXPHvVh3Puq9aRwvDi4ZLSG0s+YgDKJf/BX0WPv3GkEAAhGazHnRHpHf4UgaGxAW6 aANeu1OTolTXM9RwthuAz4sDQoCvH3cdQ9LCQiQ3z7FAKL79seNqCxB2VW6kbjRDbw PP92bxDOvgt5w== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 52/64] ceph: don't use special DIO path for encrypted inodes Date: Wed, 27 Apr 2022 15:13:02 -0400 Message-Id: <20220427191314.222867-53-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Eventually I want to merge the synchronous and direct read codepaths, possibly via new netfs infrastructure. For now, the direct path is not crypto-enabled, so use the sync read/write paths instead. Signed-off-by: Jeff Layton --- fs/ceph/file.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/fs/ceph/file.c b/fs/ceph/file.c index 483d7d016ad6..edeb74f57b73 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -1719,7 +1719,9 @@ static ssize_t ceph_read_iter(struct kiocb *iocb, struct iov_iter *to) ceph_cap_string(got)); if (ci->i_inline_version == CEPH_INLINE_NONE) { - if (!retry_op && (iocb->ki_flags & IOCB_DIRECT)) { + if (!retry_op && + (iocb->ki_flags & IOCB_DIRECT) && + !IS_ENCRYPTED(inode)) { ret = ceph_direct_read_write(iocb, to, NULL, NULL); if (ret >= 0 && ret < len) @@ -1945,7 +1947,7 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from) /* we might need to revert back to that point */ data = *from; - if (iocb->ki_flags & IOCB_DIRECT) + if ((iocb->ki_flags & IOCB_DIRECT) && !IS_ENCRYPTED(inode)) written = ceph_direct_read_write(iocb, &data, snapc, &prealloc_cf); else From patchwork Wed Apr 27 19:13:05 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566786 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DF9DBC433FE for ; Wed, 27 Apr 2022 19:17:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233310AbiD0TUt (ORCPT ); Wed, 27 Apr 2022 15:20:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55468 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233385AbiD0TTi (ORCPT ); Wed, 27 Apr 2022 15:19:38 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 926EA562DF for ; Wed, 27 Apr 2022 12:13:57 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id EA0C3B8291D for ; Wed, 27 Apr 2022 19:13:56 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2A943C385AA; Wed, 27 Apr 2022 19:13:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086835; bh=1UkJDKwrv9caN2oVCkP6Btr2kn/RYmehHgLLxsRqD7A=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=lTl93nU9B0xkpv2a+BUDD9a1XwPcrUE0tgr+NEATBM/y7FWdzqOSX0VXoXVNPH+MB 9A79tGP2en/R756tObFGrATy8kpq+jk8SbuD4GX1CVpP4JvDi3Y8sdy/M7UxuWjvAs Ft6gJ4o17ba8DDw1yP1j4h4/JB282ZG2AlKbHyWmNPrDtuY/42mayQjc/LmbP2JBPi 8Lt8FQd/LssKvmOhwOgGqBB/ZIDAraVjVCQMlPEQmF0foDYegX/hA3SpPHHHQPn39w 2UfggaNMogVVr3Y1HA4R7h36BTOzw6V+JE9yy6h/RIlDq+EKQGAkTOH1rdsviOLqBa 0OKz0lWNA9jNQ== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 55/64] ceph: plumb in decryption during sync reads Date: Wed, 27 Apr 2022 15:13:05 -0400 Message-Id: <20220427191314.222867-56-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Switch to using sparse reads when the inode is encrypted. Note that the crypto block may be smaller than a page, but the reverse cannot be true. Signed-off-by: Jeff Layton --- fs/ceph/file.c | 89 ++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 65 insertions(+), 24 deletions(-) diff --git a/fs/ceph/file.c b/fs/ceph/file.c index 472c525f732d..3d8f0839b718 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -948,7 +948,7 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, u64 off = *ki_pos; u64 len = iov_iter_count(to); u64 i_size = i_size_read(inode); - bool sparse = ceph_test_mount_opt(fsc, SPARSEREAD); + bool sparse = IS_ENCRYPTED(inode) || ceph_test_mount_opt(fsc, SPARSEREAD); u64 objver = 0; dout("sync_read on inode %p %llx~%llx\n", inode, *ki_pos, len); @@ -976,10 +976,19 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, int idx; size_t left; struct ceph_osd_req_op *op; + u64 read_off = off; + u64 read_len = len; + + /* determine new offset/length if encrypted */ + ceph_fscrypt_adjust_off_and_len(inode, &read_off, &read_len); + + dout("sync_read orig %llu~%llu reading %llu~%llu", + off, len, read_off, read_len); req = ceph_osdc_new_request(osdc, &ci->i_layout, - ci->i_vino, off, &len, 0, 1, - sparse ? CEPH_OSD_OP_SPARSE_READ : CEPH_OSD_OP_READ, + ci->i_vino, read_off, &read_len, 0, 1, + sparse ? CEPH_OSD_OP_SPARSE_READ : + CEPH_OSD_OP_READ, CEPH_OSD_FLAG_READ, NULL, ci->i_truncate_seq, ci->i_truncate_size, false); @@ -988,10 +997,13 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, break; } + /* adjust len downward if the request truncated the len */ + if (off + len > read_off + read_len) + len = read_off + read_len - off; more = len < iov_iter_count(to); - num_pages = calc_pages_for(off, len); - page_off = off & ~PAGE_MASK; + num_pages = calc_pages_for(read_off, read_len); + page_off = offset_in_page(off); pages = ceph_alloc_page_vector(num_pages, GFP_KERNEL); if (IS_ERR(pages)) { ceph_osdc_put_request(req); @@ -999,7 +1011,8 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, break; } - osd_req_op_extent_osd_data_pages(req, 0, pages, len, page_off, + osd_req_op_extent_osd_data_pages(req, 0, pages, read_len, + offset_in_page(read_off), false, false); op = &req->r_ops[0]; @@ -1018,7 +1031,7 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, ceph_update_read_metrics(&fsc->mdsc->metric, req->r_start_latency, req->r_end_latency, - len, ret); + read_len, ret); if (ret > 0) objver = req->r_version; @@ -1033,8 +1046,34 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, else if (ret == -ENOENT) ret = 0; + if (ret > 0 && IS_ENCRYPTED(inode)) { + int fret; + + fret = ceph_fscrypt_decrypt_extents(inode, pages, read_off, + op->extent.sparse_ext, op->extent.sparse_ext_cnt); + if (fret < 0) { + ret = fret; + ceph_osdc_put_request(req); + break; + } + + /* account for any partial block at the beginning */ + fret -= (off - read_off); + + /* + * Short read after big offset adjustment? + * Nothing is usable, just call it a zero + * len read. + */ + fret = max(fret, 0); + + /* account for partial block at the end */ + ret = min_t(ssize_t, fret, len); + } + ceph_osdc_put_request(req); + /* Short read but not EOF? Zero out the remainder. */ if (ret >= 0 && ret < len && (off + ret < i_size)) { int zlen = min(len - ret, i_size - off - ret); int zoff = page_off + ret; @@ -1048,15 +1087,16 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, idx = 0; left = ret > 0 ? ret : 0; while (left > 0) { - size_t len, copied; - page_off = off & ~PAGE_MASK; - len = min_t(size_t, left, PAGE_SIZE - page_off); + size_t plen, copied; + + plen = min_t(size_t, left, PAGE_SIZE - page_off); SetPageUptodate(pages[idx]); copied = copy_page_to_iter(pages[idx++], - page_off, len, to); + page_off, plen, to); off += copied; left -= copied; - if (copied < len) { + page_off = 0; + if (copied < plen) { ret = -EFAULT; break; } @@ -1073,20 +1113,21 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, break; } - if (off > *ki_pos) { - if (off >= i_size) { - *retry_op = CHECK_EOF; - ret = i_size - *ki_pos; - *ki_pos = i_size; - } else { - ret = off - *ki_pos; - *ki_pos = off; + if (ret > 0) { + if (off > *ki_pos) { + if (off >= i_size) { + *retry_op = CHECK_EOF; + ret = i_size - *ki_pos; + *ki_pos = i_size; + } else { + ret = off - *ki_pos; + *ki_pos = off; + } } - } - - if (last_objver && ret > 0) - *last_objver = objver; + if (last_objver) + *last_objver = objver; + } dout("sync_read result %zd retry_op %d\n", ret, *retry_op); return ret; } From patchwork Wed Apr 27 19:13:06 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566785 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 04A10C433F5 for ; Wed, 27 Apr 2022 19:17:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233762AbiD0TU6 (ORCPT ); Wed, 27 Apr 2022 15:20:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53654 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233534AbiD0TTi (ORCPT ); Wed, 27 Apr 2022 15:19:38 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 11CD754BCB for ; Wed, 27 Apr 2022 12:13:58 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 9B017B8294F for ; Wed, 27 Apr 2022 19:13:56 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D4839C385A9; Wed, 27 Apr 2022 19:13:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086836; bh=fxw7rBOvGLOYMtMJ+hDc9AJ5VYh5vxtrf7kGlvoN384=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=TUyfeABg8SsxiqYvk4cR0FfHUrFgUEN/VRuaTfa7VGjas4lmDn6fi9KeFt3+XmFsT lbf7/MNrUYD1QUQnR71tYzEYNIMGE1S1axOswEepoEdEfkII0EQgOxT4sCMr5d8OMm 1PSRP7MXBzuEefB/itYHLnlzK1pMkAJd8UOsTRhu/h34klA2/vpFTbsx7z9fSI4Lpk q5Q0DD4Yiz8xF38h+oeK3U/oDOOUWlONkE1KxJtEI8jOjyyk4GxbQBLjdBw//Brfhl YAOi//rnxwe6aSDOAtETx2RiK2XizC1EDJHhT6TEYWDJ0xOP3BV5qES+N1k9vlpKBX 07PR/I5yl2Lrw== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 56/64] ceph: add fscrypt decryption support to ceph_netfs_issue_op Date: Wed, 27 Apr 2022 15:13:06 -0400 Message-Id: <20220427191314.222867-57-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Force the use of sparse reads when the inode is encrypted, and add the appropriate code to decrypt the extent map after receiving. Signed-off-by: Jeff Layton --- fs/ceph/addr.c | 32 +++++++++++++++++++++++--------- 1 file changed, 23 insertions(+), 9 deletions(-) diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c index 84d0b52010d9..d65d431ec933 100644 --- a/fs/ceph/addr.c +++ b/fs/ceph/addr.c @@ -18,6 +18,7 @@ #include "mds_client.h" #include "cache.h" #include "metric.h" +#include "crypto.h" #include #include @@ -216,7 +217,8 @@ static bool ceph_netfs_clamp_length(struct netfs_io_subrequest *subreq) static void finish_netfs_read(struct ceph_osd_request *req) { - struct ceph_fs_client *fsc = ceph_inode_to_client(req->r_inode); + struct inode *inode = req->r_inode; + struct ceph_fs_client *fsc = ceph_inode_to_client(inode); struct ceph_osd_data *osd_data = osd_req_op_extent_osd_data(req, 0); struct netfs_io_subrequest *subreq = req->r_priv; struct ceph_osd_req_op *op = &req->r_ops[0]; @@ -231,15 +233,24 @@ static void finish_netfs_read(struct ceph_osd_request *req) subreq->len, i_size_read(req->r_inode)); /* no object means success but no data */ - if (sparse && err >= 0) - err = ceph_sparse_ext_map_end(op); - else if (err == -ENOENT) + if (err == -ENOENT) err = 0; else if (err == -EBLOCKLISTED) fsc->blocklisted = true; - if (err >= 0 && err < subreq->len) - __set_bit(NETFS_SREQ_CLEAR_TAIL, &subreq->flags); + if (err >= 0) { + if (sparse && err > 0) + err = ceph_sparse_ext_map_end(op); + if (err < subreq->len) + __set_bit(NETFS_SREQ_CLEAR_TAIL, &subreq->flags); + if (IS_ENCRYPTED(inode) && err > 0) { + err = ceph_fscrypt_decrypt_extents(inode, osd_data->pages, + subreq->start, op->extent.sparse_ext, + op->extent.sparse_ext_cnt); + if (err > subreq->len) + err = subreq->len; + } + } netfs_subreq_terminated(subreq, err, true); @@ -316,13 +327,16 @@ static void ceph_netfs_issue_read(struct netfs_io_subrequest *subreq) size_t page_off; int err = 0; u64 len = subreq->len; - bool sparse = ceph_test_mount_opt(fsc, SPARSEREAD); + bool sparse = IS_ENCRYPTED(inode) || ceph_test_mount_opt(fsc, SPARSEREAD); + u64 off = subreq->start; if (ci->i_inline_version != CEPH_INLINE_NONE && ceph_netfs_issue_op_inline(subreq)) return; - req = ceph_osdc_new_request(&fsc->client->osdc, &ci->i_layout, vino, subreq->start, &len, + ceph_fscrypt_adjust_off_and_len(inode, &off, &len); + + req = ceph_osdc_new_request(&fsc->client->osdc, &ci->i_layout, vino, off, &len, 0, 1, sparse ? CEPH_OSD_OP_SPARSE_READ : CEPH_OSD_OP_READ, CEPH_OSD_FLAG_READ | fsc->client->osdc.client->options->read_from_replica, NULL, ci->i_truncate_seq, ci->i_truncate_size, false); @@ -341,7 +355,7 @@ static void ceph_netfs_issue_read(struct netfs_io_subrequest *subreq) } dout("%s: pos=%llu orig_len=%zu len=%llu\n", __func__, subreq->start, subreq->len, len); - iov_iter_xarray(&iter, READ, &rreq->mapping->i_pages, subreq->start, len); + iov_iter_xarray(&iter, READ, &rreq->mapping->i_pages, off, len); err = iov_iter_get_pages_alloc(&iter, &pages, len, &page_off); if (err < 0) { dout("%s: iov_ter_get_pages_alloc returned %d\n", __func__, err); From patchwork Wed Apr 27 19:13:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566787 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 55049C433FE for ; Wed, 27 Apr 2022 19:17:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234368AbiD0TUs (ORCPT ); Wed, 27 Apr 2022 15:20:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47460 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232913AbiD0TTi (ORCPT ); Wed, 27 Apr 2022 15:19:38 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CF4EA562E8 for ; Wed, 27 Apr 2022 12:13:57 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 4F9AFB8291B for ; Wed, 27 Apr 2022 19:13:57 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 89A9EC385AE; Wed, 27 Apr 2022 19:13:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086837; bh=3VdNwxJLoqYh6Iuifg1aP3SUY75NSApb6Lub/uJdb8M=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=N6ujfrLUcLWDmYUAg+C6Ge094H9RN9Co2GeIDj1VgHPMnmu6o8OXqC21WBRQ97bVD A06fq0amc//L+rdTY4Wh7DQeGqA1UpznwlAYJXdatwRnAJxMjBnXV7x/PTZxZ6aKoM 54vq5yPy+kqy6gX4WpbHiDKnRHA091+k4uxRFlodABr75a8maKS/64vrEHyC0/VtKE 7i5yCByT/knF5ab7dkXfwQHx6eKPN/dqrzOcUM8Ghj7HYUoYHwnIp8JC39Pv5/jQ84 B7DQO0QvkiSSKQVwcLryMCCjr+UVAOnCfi85KWFnu65joo8Iy1QtiA6Ve8Yx9rOzD2 lNWJ3qzPRif2Q== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 57/64] ceph: set i_blkbits to crypto block size for encrypted inodes Date: Wed, 27 Apr 2022 15:13:07 -0400 Message-Id: <20220427191314.222867-58-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Some of the underlying infrastructure for fscrypt relies on i_blkbits being aligned to the crypto blocksize. Signed-off-by: Jeff Layton --- fs/ceph/inode.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index 9936d82193b5..021638fb85f9 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -976,13 +976,6 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page, issued |= __ceph_caps_dirty(ci); new_issued = ~issued & info_caps; - /* directories have fl_stripe_unit set to zero */ - if (le32_to_cpu(info->layout.fl_stripe_unit)) - inode->i_blkbits = - fls(le32_to_cpu(info->layout.fl_stripe_unit)) - 1; - else - inode->i_blkbits = CEPH_BLOCK_SHIFT; - __ceph_update_quota(ci, iinfo->max_bytes, iinfo->max_files); #ifdef CONFIG_FS_ENCRYPTION @@ -1008,6 +1001,15 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page, ceph_decode_timespec64(&ci->i_snap_btime, &iinfo->snap_btime); } + /* directories have fl_stripe_unit set to zero */ + if (IS_ENCRYPTED(inode)) + inode->i_blkbits = CEPH_FSCRYPT_BLOCK_SHIFT; + else if (le32_to_cpu(info->layout.fl_stripe_unit)) + inode->i_blkbits = + fls(le32_to_cpu(info->layout.fl_stripe_unit)) - 1; + else + inode->i_blkbits = CEPH_BLOCK_SHIFT; + if ((new_version || (new_issued & CEPH_CAP_LINK_SHARED)) && (issued & CEPH_CAP_LINK_EXCL) == 0) set_nlink(inode, le32_to_cpu(info->nlink)); From patchwork Wed Apr 27 19:13:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566782 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5C4D0C433F5 for ; Wed, 27 Apr 2022 19:18:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232267AbiD0TV1 (ORCPT ); Wed, 27 Apr 2022 15:21:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55488 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233540AbiD0TTi (ORCPT ); Wed, 27 Apr 2022 15:19:38 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AEA2C562F0 for ; Wed, 27 Apr 2022 12:13:58 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 49336619FC for ; Wed, 27 Apr 2022 19:13:58 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3F01FC385AA; Wed, 27 Apr 2022 19:13:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086837; bh=kZh1Ni0AOS2h4sFtDN+KvlvqUtKHcdnUYyHdBQL9+V4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=bchcqs1CWsLPFqfN5D+sxBVoX5g8vL34srXPrxWXOLgIKTgrf/Gbevr8bU5e2wMQW B0NWtSWZRHpiqn/YKMQdrZvXKoaW8omfmtzrgrbTHltOwypBfSCIcClIo/N+3YkPOf V+DmCgLpY22ZUPrFlvXptZlSwPsTdOKCAf6kMm0Agcqa9Vivzd1Vw8Wl0cEZikdNrk 59IFGSLkkg24FnzIYgTeG7b7C4vc1oytF3U2rprgyoIS8ogVzmaJsn0rulOSaVO6uT rgpS2r7hLMl2TnG/ZsyKwdaxafmEVT9qIBR0aWTBGfouGcdNIAYuk9ARjZDmSWbrWc r+FagQbx9a0cg== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 58/64] ceph: add encryption support to writepage Date: Wed, 27 Apr 2022 15:13:08 -0400 Message-Id: <20220427191314.222867-59-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Allow writepage to issue encrypted writes. Extend out the requested size and offset to cover complete blocks, and then encrypt and write them to the OSDs. Signed-off-by: Jeff Layton --- fs/ceph/addr.c | 34 +++++++++++++++++++++++++++------- 1 file changed, 27 insertions(+), 7 deletions(-) diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c index d65d431ec933..f54940fc96ee 100644 --- a/fs/ceph/addr.c +++ b/fs/ceph/addr.c @@ -586,10 +586,12 @@ static int writepage_nounlock(struct page *page, struct writeback_control *wbc) loff_t page_off = page_offset(page); int err; loff_t len = thp_size(page); + loff_t wlen; struct ceph_writeback_ctl ceph_wbc; struct ceph_osd_client *osdc = &fsc->client->osdc; struct ceph_osd_request *req; bool caching = ceph_is_cache_enabled(inode); + struct page *bounce_page = NULL; dout("writepage %p idx %lu\n", page, page->index); @@ -621,6 +623,8 @@ static int writepage_nounlock(struct page *page, struct writeback_control *wbc) if (ceph_wbc.i_size < page_off + len) len = ceph_wbc.i_size - page_off; + if (IS_ENCRYPTED(inode)) + wlen = round_up(len, CEPH_FSCRYPT_BLOCK_SIZE); dout("writepage %p page %p index %lu on %llu~%llu snapc %p seq %lld\n", inode, page, page->index, page_off, len, snapc, snapc->seq); @@ -629,24 +633,39 @@ static int writepage_nounlock(struct page *page, struct writeback_control *wbc) CONGESTION_ON_THRESH(fsc->mount_options->congestion_kb)) fsc->write_congested = true; - req = ceph_osdc_new_request(osdc, &ci->i_layout, ceph_vino(inode), page_off, &len, 0, 1, - CEPH_OSD_OP_WRITE, CEPH_OSD_FLAG_WRITE, snapc, - ceph_wbc.truncate_seq, ceph_wbc.truncate_size, - true); + req = ceph_osdc_new_request(osdc, &ci->i_layout, ceph_vino(inode), + page_off, &wlen, 0, 1, CEPH_OSD_OP_WRITE, + CEPH_OSD_FLAG_WRITE, snapc, + ceph_wbc.truncate_seq, + ceph_wbc.truncate_size, true); if (IS_ERR(req)) { redirty_page_for_writepage(wbc, page); return PTR_ERR(req); } + if (wlen < len) + len = wlen; + set_page_writeback(page); if (caching) ceph_set_page_fscache(page); ceph_fscache_write_to_cache(inode, page_off, len, caching); + if (IS_ENCRYPTED(inode)) { + bounce_page = fscrypt_encrypt_pagecache_blocks(page, CEPH_FSCRYPT_BLOCK_SIZE, + 0, GFP_NOFS); + if (IS_ERR(bounce_page)) { + err = PTR_ERR(bounce_page); + goto out; + } + } /* it may be a short write due to an object boundary */ WARN_ON_ONCE(len > thp_size(page)); - osd_req_op_extent_osd_data_pages(req, 0, &page, len, 0, false, false); - dout("writepage %llu~%llu (%llu bytes)\n", page_off, len, len); + osd_req_op_extent_osd_data_pages(req, 0, + bounce_page ? &bounce_page : &page, wlen, 0, + false, false); + dout("writepage %llu~%llu (%llu bytes, %sencrypted)\n", + page_off, len, wlen, IS_ENCRYPTED(inode) ? "" : "not "); req->r_mtime = inode->i_mtime; err = ceph_osdc_start_request(osdc, req, true); @@ -655,7 +674,8 @@ static int writepage_nounlock(struct page *page, struct writeback_control *wbc) ceph_update_write_metrics(&fsc->mdsc->metric, req->r_start_latency, req->r_end_latency, len, err); - + fscrypt_free_bounce_page(bounce_page); +out: ceph_osdc_put_request(req); if (err == 0) err = len; From patchwork Wed Apr 27 19:13:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566783 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C537AC433F5 for ; Wed, 27 Apr 2022 19:18:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231940AbiD0TVQ (ORCPT ); Wed, 27 Apr 2022 15:21:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53288 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233247AbiD0TTm (ORCPT ); Wed, 27 Apr 2022 15:19:42 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2EFB35BD2E for ; Wed, 27 Apr 2022 12:14:03 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id DEA66B8291B for ; Wed, 27 Apr 2022 19:14:01 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0ABC4C385A9; Wed, 27 Apr 2022 19:13:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086840; bh=3Hl9B2WkuP2u2FPbhyP19oOi1ueJw+ePMB/OrdQ0yqc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=M3DxSYdWASTDoqGtaxumJR9jky8BXGo30wfQ0iL+pg2+uXVyLtIRWTNKb9gY3DYjK V6b8uBCpOrqX5qg/D8dOIY7ltVdbg9vd+GHmBdcFA98ErVeZnNMyAvedlgCfsckbj1 ufRbqHOWvs7edz6qkW5QZ+S0jYr9dTcBMcy+7VbtiO0TaHRe7YRT+9BWtEprCouLKb iDs6Vfo4oFZFGIfP0lwhpL66q+zG6cVu7HRksbffQYMZ/kPupspP1UKO957kS3d+zC R2hAcQSmZbSLn4vdU7rN+Dw9Rbu2cU28K1uCa9SATJOaYfA7bva1oY3V0F/Fu2ea1j Xlb29xZyihO8w== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 62/64] ceph: add support for handling encrypted snapshot names Date: Wed, 27 Apr 2022 15:13:12 -0400 Message-Id: <20220427191314.222867-63-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Luís Henriques When creating a snapshot, the .snap directories for every subdirectory will show the snapshot name in the "long format": # mkdir .snap/my-snap # ls my-dir/.snap/ _my-snap_1099511627782 Encrypted snapshots will need to be able to handle these snapshot names by encrypting/decrypting only the snapshot part of the string ('my-snap'). Also, since the MDS prevents snapshot names to be bigger than 240 characters it is necessary to adapt CEPH_NOHASH_NAME_MAX to accommodate this extra limitation. Reviewed-by: Xiubo Li Signed-off-by: Luís Henriques Signed-off-by: Jeff Layton --- fs/ceph/crypto.c | 190 ++++++++++++++++++++++++++++++++++++++++------- fs/ceph/crypto.h | 4 +- 2 files changed, 165 insertions(+), 29 deletions(-) diff --git a/fs/ceph/crypto.c b/fs/ceph/crypto.c index 590477d6e3ef..d23b3d1c59c5 100644 --- a/fs/ceph/crypto.c +++ b/fs/ceph/crypto.c @@ -189,16 +189,100 @@ void ceph_fscrypt_as_ctx_to_req(struct ceph_mds_request *req, struct ceph_acl_se swap(req->r_fscrypt_auth, as->fscrypt_auth); } -int ceph_encode_encrypted_dname(const struct inode *parent, struct qstr *d_name, char *buf) +/* + * User-created snapshots can't start with '_'. Snapshots that start with this + * character are special (hint: there aren't real snapshots) and use the + * following format: + * + * __ + * + * where: + * - - the real snapshot name that may need to be decrypted, + * - - the inode number (in decimal) for the actual snapshot + * + * This function parses these snapshot names and returns the inode + * . 'name_len' will also bet set with the + * length. + */ +static struct inode *parse_longname(const struct inode *parent, const char *name, + int *name_len) +{ + struct inode *dir = NULL; + struct ceph_vino vino = { .snap = CEPH_NOSNAP }; + char *inode_number; + char *name_end; + int orig_len = *name_len; + int ret = -EIO; + + /* Skip initial '_' */ + name++; + name_end = strrchr(name, '_'); + if (!name_end) { + dout("Failed to parse long snapshot name: %s\n", name); + return ERR_PTR(-EIO); + } + *name_len = (name_end - name); + if (*name_len <= 0) { + pr_err("Failed to parse long snapshot name\n"); + return ERR_PTR(-EIO); + } + + /* Get the inode number */ + inode_number = kmemdup_nul(name_end + 1, + orig_len - *name_len - 2, + GFP_KERNEL); + if (!inode_number) + return ERR_PTR(-ENOMEM); + ret = kstrtou64(inode_number, 10, &vino.ino); + if (ret) { + dout("Failed to parse inode number: %s\n", name); + dir = ERR_PTR(ret); + goto out; + } + + /* And finally the inode */ + dir = ceph_find_inode(parent->i_sb, vino); + if (!dir) { + /* This can happen if we're not mounting cephfs on the root */ + dir = ceph_get_inode(parent->i_sb, vino, NULL); + if (!dir) + dir = ERR_PTR(-ENOENT); + } + if (IS_ERR(dir)) + dout("Can't find inode %s (%s)\n", inode_number, name); + +out: + kfree(inode_number); + return dir; +} + +int ceph_encode_encrypted_dname(struct inode *parent, struct qstr *d_name, char *buf) { + struct inode *dir = parent; + struct qstr iname; u32 len; + int name_len; int elen; int ret; - u8 *cryptbuf; + u8 *cryptbuf = NULL; + + iname.name = d_name->name; + name_len = d_name->len; + + /* Handle the special case of snapshot names that start with '_' */ + if ((ceph_snap(dir) == CEPH_SNAPDIR) && (name_len > 0) && + (iname.name[0] == '_')) { + dir = parse_longname(parent, iname.name, &name_len); + if (IS_ERR(dir)) + return PTR_ERR(dir); + iname.name++; /* skip initial '_' */ + } + iname.len = name_len; - if (!fscrypt_has_encryption_key(parent)) { + if (!fscrypt_has_encryption_key(dir)) { memcpy(buf, d_name->name, d_name->len); - return d_name->len; + elen = d_name->len; + goto out; } /* @@ -207,18 +291,22 @@ int ceph_encode_encrypted_dname(const struct inode *parent, struct qstr *d_name, * * See: fscrypt_setup_filename */ - if (!fscrypt_fname_encrypted_size(parent, d_name->len, NAME_MAX, &len)) - return -ENAMETOOLONG; + if (!fscrypt_fname_encrypted_size(dir, iname.len, NAME_MAX, &len)) { + elen = -ENAMETOOLONG; + goto out; + } /* Allocate a buffer appropriate to hold the result */ cryptbuf = kmalloc(len > CEPH_NOHASH_NAME_MAX ? NAME_MAX : len, GFP_KERNEL); - if (!cryptbuf) - return -ENOMEM; + if (!cryptbuf) { + elen = -ENOMEM; + goto out; + } - ret = fscrypt_fname_encrypt(parent, d_name, cryptbuf, len); + ret = fscrypt_fname_encrypt(dir, &iname, cryptbuf, len); if (ret) { - kfree(cryptbuf); - return ret; + elen = ret; + goto out; } /* hash the end if the name is long enough */ @@ -234,12 +322,30 @@ int ceph_encode_encrypted_dname(const struct inode *parent, struct qstr *d_name, /* base64 encode the encrypted name */ elen = ceph_base64_encode(cryptbuf, len, buf); - kfree(cryptbuf); dout("base64-encoded ciphertext name = %.*s\n", elen, buf); + + /* To understand the 240 limit, see CEPH_NOHASH_NAME_MAX comments */ + WARN_ON(elen > 240); + if ((elen > 0) && (dir != parent)) { + char tmp_buf[NAME_MAX]; + + elen = snprintf(tmp_buf, sizeof(tmp_buf), "_%.*s_%ld", + elen, buf, dir->i_ino); + memcpy(buf, tmp_buf, elen); + } + +out: + kfree(cryptbuf); + if (dir != parent) { + if ((dir->i_state & I_NEW)) + discard_new_inode(dir); + else + iput(dir); + } return elen; } -int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentry, char *buf) +int ceph_encode_encrypted_fname(struct inode *parent, struct dentry *dentry, char *buf) { WARN_ON_ONCE(!fscrypt_has_encryption_key(parent)); @@ -264,29 +370,42 @@ int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentr int ceph_fname_to_usr(const struct ceph_fname *fname, struct fscrypt_str *tname, struct fscrypt_str *oname, bool *is_nokey) { - int ret; + struct inode *dir = fname->dir; struct fscrypt_str _tname = FSTR_INIT(NULL, 0); struct fscrypt_str iname; - - if (!IS_ENCRYPTED(fname->dir)) { - oname->name = fname->name; - oname->len = fname->name_len; - return 0; - } + char *name = fname->name; + int name_len = fname->name_len; + int ret; /* Sanity check that the resulting name will fit in the buffer */ if (fname->name_len > NAME_MAX || fname->ctext_len > NAME_MAX) return -EIO; - ret = __fscrypt_prepare_readdir(fname->dir); + /* Handle the special case of snapshot names that start with '_' */ + if ((ceph_snap(dir) == CEPH_SNAPDIR) && (name_len > 0) && + (name[0] == '_')) { + dir = parse_longname(dir, name, &name_len); + if (IS_ERR(dir)) + return PTR_ERR(dir); + name++; /* skip initial '_' */ + } + + if (!IS_ENCRYPTED(dir)) { + oname->name = fname->name; + oname->len = fname->name_len; + ret = 0; + goto out_inode; + } + + ret = __fscrypt_prepare_readdir(dir); if (ret) - return ret; + goto out_inode; /* * Use the raw dentry name as sent by the MDS instead of * generating a nokey name via fscrypt. */ - if (!fscrypt_has_encryption_key(fname->dir)) { + if (!fscrypt_has_encryption_key(dir)) { if (fname->no_copy) oname->name = fname->name; else @@ -294,7 +413,8 @@ int ceph_fname_to_usr(const struct ceph_fname *fname, struct fscrypt_str *tname, oname->len = fname->name_len; if (is_nokey) *is_nokey = true; - return 0; + ret = 0; + goto out_inode; } if (fname->ctext_len == 0) { @@ -303,11 +423,11 @@ int ceph_fname_to_usr(const struct ceph_fname *fname, struct fscrypt_str *tname, if (!tname) { ret = fscrypt_fname_alloc_buffer(NAME_MAX, &_tname); if (ret) - return ret; + goto out_inode; tname = &_tname; } - declen = ceph_base64_decode(fname->name, fname->name_len, tname->name); + declen = ceph_base64_decode(name, name_len, tname->name); if (declen <= 0) { ret = -EIO; goto out; @@ -319,9 +439,25 @@ int ceph_fname_to_usr(const struct ceph_fname *fname, struct fscrypt_str *tname, iname.len = fname->ctext_len; } - ret = fscrypt_fname_disk_to_usr(fname->dir, 0, 0, &iname, oname); + ret = fscrypt_fname_disk_to_usr(dir, 0, 0, &iname, oname); + if (!ret && (dir != fname->dir)) { + char tmp_buf[CEPH_BASE64_CHARS(NAME_MAX)]; + + name_len = snprintf(tmp_buf, sizeof(tmp_buf), "_%.*s_%ld", + oname->len, oname->name, dir->i_ino); + memcpy(oname->name, tmp_buf, name_len); + oname->len = name_len; + } + out: fscrypt_fname_free_buffer(&_tname); +out_inode: + if ((dir != fname->dir) && !IS_ERR(dir)) { + if ((dir->i_state & I_NEW)) + discard_new_inode(dir); + else + iput(dir); + } return ret; } diff --git a/fs/ceph/crypto.h b/fs/ceph/crypto.h index d1726307bdb8..c6ee993f4ec8 100644 --- a/fs/ceph/crypto.h +++ b/fs/ceph/crypto.h @@ -101,8 +101,8 @@ void ceph_fscrypt_free_dummy_policy(struct ceph_fs_client *fsc); int ceph_fscrypt_prepare_context(struct inode *dir, struct inode *inode, struct ceph_acl_sec_ctx *as); void ceph_fscrypt_as_ctx_to_req(struct ceph_mds_request *req, struct ceph_acl_sec_ctx *as); -int ceph_encode_encrypted_dname(const struct inode *parent, struct qstr *d_name, char *buf); -int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentry, char *buf); +int ceph_encode_encrypted_dname(struct inode *parent, struct qstr *d_name, char *buf); +int ceph_encode_encrypted_fname(struct inode *parent, struct dentry *dentry, char *buf); static inline int ceph_fname_alloc_buffer(struct inode *parent, struct fscrypt_str *fname) { From patchwork Wed Apr 27 19:13:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 566784 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1D003C433EF for ; Wed, 27 Apr 2022 19:18:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234027AbiD0TVH (ORCPT ); Wed, 27 Apr 2022 15:21:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53294 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233370AbiD0TTm (ORCPT ); Wed, 27 Apr 2022 15:19:42 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 57EA98A315 for ; Wed, 27 Apr 2022 12:14:04 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 19124B8294D for ; Wed, 27 Apr 2022 19:14:03 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 68ED2C385AA; Wed, 27 Apr 2022 19:14:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651086841; bh=SaANu6zlmzyvh0EYfMkYTWrWskPuFira/Ah/I4AD80I=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=WIv8BJj32J7T9EBuynDwST9jcs3/8ZZQweGEGVhSzW/b1jLmdMoWKc2hyHgbnqiT7 MnLn0tRt1KyCbzLwCG8qiFT2jX7bQD1gwgNL6gbpVwu2e34RwPQl13O/zYB3InhnMC ThpvmIc65keVmcl/A7pk9zkGduGYs6D5hFLS2FvOcAftcGCi/aqPZI6nhAKwvrxOGu nRY4ErF+5DnyjD4gmaxstPaUHBMCA82gRwIjmDGJNaawCzjo54wRalt+jORhIx6/HW mmWGB7dLz/cWiB3JFr1f8RFbtthPKnxdA9rHqb4MoBdzuJoSWGm0BgU9ZB4VJIYV+a AevPj5KsujdgQ== From: Jeff Layton To: ceph-devel@vger.kernel.org Cc: xiubli@redhat.com, lhenriques@suse.de, idryomov@gmail.com Subject: [PATCH v14 64/64] ceph: prevent snapshots to be created in encrypted locked directories Date: Wed, 27 Apr 2022 15:13:14 -0400 Message-Id: <20220427191314.222867-65-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220427191314.222867-1-jlayton@kernel.org> References: <20220427191314.222867-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Luís Henriques With snapshot names encryption we can not allow snapshots to be created in locked directories because the names wouldn't be encrypted. This patch forces the directory to be unlocked to allow a snapshot to be created. Signed-off-by: Luís Henriques Signed-off-by: Jeff Layton --- fs/ceph/dir.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c index 544aa5e78a31..7045f0aa53cc 100644 --- a/fs/ceph/dir.c +++ b/fs/ceph/dir.c @@ -1071,6 +1071,11 @@ static int ceph_mkdir(struct user_namespace *mnt_userns, struct inode *dir, err = -EDQUOT; goto out; } + if ((op == CEPH_MDS_OP_MKSNAP) && IS_ENCRYPTED(dir) && + !fscrypt_has_encryption_key(dir)) { + err = -ENOKEY; + goto out; + } req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS);