ceph: skip reconnecting if MDS is not ready

Message ID 20230824095551.134118-1-xiubli@redhat.com
State New

Commit Message

Xiubo Li Aug. 24, 2023, 9:55 a.m. UTC
From: Xiubo Li <xiubli@redhat.com>

When the MDS closes the session, the kclient immediately tries to
reconnect to it. But if the MDS has just restarted and is not ready
yet, for example it is still in the up:replay state and the sessionmap
journal logs haven't been replayed, the MDS will close the session.

The kclient could then remove the session, and later, when the
mdsmap is in the RECONNECT phase, it will skip reconnecting. But the
MDS will wait until timeout and then evict the kclient.

Just skip sending the reconnection request until the MDS is ready.

URL: https://tracker.ceph.com/issues/62489
Signed-off-by: Xiubo Li <xiubli@redhat.com>
---
 fs/ceph/mds_client.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Venky Shankar Feb. 6, 2024, 9:22 a.m. UTC | #1
On Thu, Aug 24, 2023 at 3:28 PM <xiubli@redhat.com> wrote:
>
> From: Xiubo Li <xiubli@redhat.com>
>
> When the MDS closes the session, the kclient immediately tries to
> reconnect to it. But if the MDS has just restarted and is not ready
> yet, for example it is still in the up:replay state and the sessionmap
> journal logs haven't been replayed, the MDS will close the session.
>
> The kclient could then remove the session, and later, when the
> mdsmap is in the RECONNECT phase, it will skip reconnecting. But the
> MDS will wait until timeout and then evict the kclient.
>
> Just skip sending the reconnection request until the MDS is ready.
>
> URL: https://tracker.ceph.com/issues/62489
> Signed-off-by: Xiubo Li <xiubli@redhat.com>
> ---
>  fs/ceph/mds_client.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index 9aae39289b43..a9ef93411679 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -5809,7 +5809,8 @@ static void mds_peer_reset(struct ceph_connection *con)
>
>         pr_warn_client(mdsc->fsc->client, "mds%d closed our session\n",
>                        s->s_mds);
> -       if (READ_ONCE(mdsc->fsc->mount_state) != CEPH_MOUNT_FENCE_IO)
> +       if (READ_ONCE(mdsc->fsc->mount_state) != CEPH_MOUNT_FENCE_IO &&
> +           ceph_mdsmap_get_state(mdsc->mdsmap, s->s_mds) >= CEPH_MDS_STATE_RECONNECT)
>                 send_mds_reconnect(mdsc, s);
>  }
>
> --
> 2.39.1
>

Tested-by: Venky Shankar <vshankar@redhat.com>

Patch

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 9aae39289b43..a9ef93411679 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -5809,7 +5809,8 @@  static void mds_peer_reset(struct ceph_connection *con)
 
 	pr_warn_client(mdsc->fsc->client, "mds%d closed our session\n",
 		       s->s_mds);
-	if (READ_ONCE(mdsc->fsc->mount_state) != CEPH_MOUNT_FENCE_IO)
+	if (READ_ONCE(mdsc->fsc->mount_state) != CEPH_MOUNT_FENCE_IO &&
+	    ceph_mdsmap_get_state(mdsc->mdsmap, s->s_mds) >= CEPH_MDS_STATE_RECONNECT)
 		send_mds_reconnect(mdsc, s);
 }
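
The condition added in the hunk above can be exercised outside the kernel. The following is only a user-space sketch, not the kernel code: the `sketch_` names are hypothetical, the boolean `fenced` stands in for the `mount_state == CEPH_MOUNT_FENCE_IO` test, and the numeric state values are assumed to mirror the `CEPH_MDS_STATE_*` definitions in `include/linux/ceph/ceph_fs.h`. Only their relative order matters here: up:replay sorts below up:reconnect, so the `>=` comparison suppresses the reconnect while the MDS is still replaying its journal.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Stand-ins for the kernel's CEPH_MDS_STATE_* constants. The values are
 * an assumption (believed to match include/linux/ceph/ceph_fs.h); the
 * sketch relies only on REPLAY < RESOLVE < RECONNECT < ... < ACTIVE.
 */
enum sketch_mds_state {
	SKETCH_MDS_STATE_STANDBY   = -5, /* up, idle, no rank assigned */
	SKETCH_MDS_STATE_DNE       = 0,  /* down, does not exist */
	SKETCH_MDS_STATE_REPLAY    = 8,  /* up, replaying journal */
	SKETCH_MDS_STATE_RESOLVE   = 9,
	SKETCH_MDS_STATE_RECONNECT = 10, /* up, accepting client reconnects */
	SKETCH_MDS_STATE_REJOIN    = 11,
	SKETCH_MDS_STATE_ACTIVE    = 13,
};

/*
 * Hypothetical helper mirroring the patched condition in
 * mds_peer_reset(): reconnect only if I/O is not fenced and the MDS has
 * reached at least the RECONNECT state.
 */
bool sketch_should_reconnect(bool fenced, int mds_state)
{
	return !fenced && mds_state >= SKETCH_MDS_STATE_RECONNECT;
}

/* Small self-check covering the case described in the commit message. */
void sketch_self_check(void)
{
	/* MDS just restarted and is still replaying: do not reconnect. */
	assert(!sketch_should_reconnect(false, SKETCH_MDS_STATE_REPLAY));
	/* MDS is ready for reconnects: send the request. */
	assert(sketch_should_reconnect(false, SKETCH_MDS_STATE_RECONNECT));
	/* Fenced mounts never reconnect, as before the patch. */
	assert(!sketch_should_reconnect(true, SKETCH_MDS_STATE_ACTIVE));
}
```

Note that the negative and zero states (standby, nonexistent) also fail the comparison, so under these assumed values the check skips the reconnect for any rank that is not yet serving clients.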