[v3,0/3] libceph: fix sparse-read failure bug

Message ID 20231215002034.205780-1-xiubli@redhat.com

Xiubo Li Dec. 15, 2023, 12:20 a.m. UTC
From: Xiubo Li <xiubli@redhat.com>

The debug logs below show the last 7 bytes of a message arriving in a second socket read:

2725523 <7>[ 8771.114348] libceph:  [0] got 1 extents
2725524 <7>[ 8771.114353] libceph:  [0] ext 0 off 0x66000 len 0x4000
2725525 <7>[ 8771.114370] libceph:  prep_next_sparse_read: [0] completed extent array len 1 cursor->resid 0
2725526 <7>[ 8771.114374] libceph:  read_partial have 0, left 21, con->v1.in_base_pos 53
2725527 <7>[ 8771.114379] libceph:  read_partial have 14, left 7, con->v1.in_base_pos 67         ====> there were still 7 bytes not received
2725528 <7>[ 8771.114382] libceph:  read_partial return 0
2725529 <7>[ 8771.114384] libceph:  try_read done on 0000000094d53202 ret 0
2725530 <7>[ 8771.114387] libceph:  try_write start 0000000094d53202 state 12
2725531 <7>[ 8771.114389] libceph:  try_write out_kvec_bytes 0
2725532 <7>[ 8771.114391] libceph:  try_write nothing else to write.
2725533 <7>[ 8771.114393] libceph:  try_write done on 0000000094d53202 ret 0
2725534 <7>[ 8771.114396] libceph:  put_osd 000000009b11f20c 5 -> 4
2725535 <7>[ 8771.114450] libceph:  ceph_sock_data_ready 0000000094d53202 state = 12, queueing work
2725536 <7>[ 8771.114454] libceph:  get_osd 000000009b11f20c 4 -> 5
2725537 <7>[ 8771.114457] libceph:  queue_con_delay 0000000094d53202 0
2725538 <7>[ 8771.114651] libceph:  try_read start 0000000094d53202 state 12
2725539 <7>[ 8771.114655] libceph:  try_read tag 7 in_base_pos 67
2725540 <7>[ 8771.114657] libceph:  read_partial_message con 0000000094d53202 msg 0000000060b8a473
2725541 <7>[ 8771.114660] libceph:  read_partial return 1
2725542 <7>[ 8771.114663] libceph:  read_partial have 14, left 7, con->v1.in_base_pos 67         ====> the remaining 7 bytes arrived
2725543 <7>[ 8771.114669] libceph:  read_partial return 1
2725544 <7>[ 8771.114671] libceph:  read_partial_message got msg 0000000060b8a473 164 (4271800174) + 0 (0) + 16408 (2739232014)
2725545 <7>[ 8771.114677] libceph:  ===== 0000000060b8a473 73 from osd0 43=osd_opreply len 164+0+16408 (4271800174 0 2739232014) =====
2725546 <7>[ 8771.114683] libceph:  handle_reply msg 0000000060b8a473 tid 99
2725547 <7>[ 8771.114687] libceph:  handle_reply req 000000006ba179f6 tid 99 flags 0x400015 pgid 3.2b epoch 53 attempt 0 v 0'0 uv 5984
2725548 <7>[ 8771.114693] libceph:   req 000000006ba179f6 tid 99 op 0 rval 0 len 16408
2725549 <7>[ 8771.114697] libceph:  handle_reply req 000000006ba179f6 tid 99 result 0 data_len 16408
2725550 <7>[ 8771.114701] libceph:  finish_request req 000000006ba179f6 tid 99                   ====> the request was successfully finished
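
The "have / left / in_base_pos" bookkeeping in the logs above can be illustrated with a small userspace sketch. This is not the libceph code: fake_sock, fake_recv, struct conn, read_partial_sketch and replay_logged_read are hypothetical names made up for this example, and the socket is simulated. Only in_base_pos and the values 53, 67, 14, 7 and 21 are taken from the logs; the end offset 74 is derived from them (53 + 21). The sketch assumes a resumable read that returns 0 when it has to wait for more data and 1 once the object is complete.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a non-blocking socket that delivers data in chunks. */
struct fake_sock {
	const unsigned char *data;  /* bytes that will eventually arrive */
	size_t pos;                 /* bytes already consumed */
	size_t avail;               /* bytes readable right now */
};

/* Copy up to 'want' readable bytes; 0 means nothing is available yet. */
static int fake_recv(struct fake_sock *s, void *buf, size_t want)
{
	size_t n = want < s->avail ? want : s->avail;

	memcpy(buf, s->data + s->pos, n);
	s->pos += n;
	s->avail -= n;
	return (int)n;
}

/* Hypothetical connection state; in_base_pos mirrors the name in the logs. */
struct conn {
	struct fake_sock *sock;
	int in_base_pos;            /* stream offset consumed so far */
};

/*
 * Resumable read of a 'size'-byte object that ends at stream offset 'end'.
 * Returns 1 when the object is complete, 0 when more data must arrive.
 */
static int read_partial_sketch(struct conn *con, int end, int size, void *object)
{
	while (con->in_base_pos < end) {
		int left = end - con->in_base_pos;  /* bytes still missing */
		int have = size - left;             /* bytes already stored */
		int ret = fake_recv(con->sock, (unsigned char *)object + have,
				    (size_t)left);

		if (ret <= 0)
			return ret;                 /* caller retries later */
		con->in_base_pos += ret;
	}
	return 1;
}

/* Replay the logged sequence: 14 of 21 bytes first, the last 7 later. */
static int replay_logged_read(void)
{
	unsigned char wire[21], obj[21] = { 0 };
	struct fake_sock s = { wire, 0, 14 };
	struct conn con = { &s, 53 };
	int i;

	for (i = 0; i < 21; i++)
		wire[i] = (unsigned char)(i + 1);

	/* First pass: have 0/left 21, then have 14/left 7, then wait. */
	assert(read_partial_sketch(&con, 74, 21, obj) == 0);
	assert(con.in_base_pos == 67);

	s.avail = 7;  /* the remaining 7 bytes show up on the socket */
	assert(read_partial_sketch(&con, 74, 21, obj) == 1);
	return memcmp(obj, wire, sizeof(wire)) == 0;
}
```

Calling replay_logged_read() walks through the same two steps as the logs: the first call stops at in_base_pos 67 and returns 0, and the second call resumes from that offset instead of starting over.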



V3:
- rename read_sparse_msg_XX to read_partial_sparse_msg_XX
- fix the sparse-read bug in the messenger v1 code.

V2:
- fix the sparse-read bug in the sparse-read code instead


Xiubo Li (3):
  libceph: fail the sparse-read if there still has data in socket
  libceph: rename read_sparse_msg_XX to read_partial_sparse_msg_XX
  libceph: just wait for more data to be available on the socket

 include/linux/ceph/messenger.h |  2 ++
 net/ceph/messenger.c           |  1 +
 net/ceph/messenger_v1.c        | 29 ++++++++++++++++++++---------
 net/ceph/osd_client.c          |  5 ++++-
 4 files changed, 27 insertions(+), 10 deletions(-)