
ceph: fix blindly expanding the readahead windows

Message ID 20230323070105.201578-1-xiubli@redhat.com
State Superseded
Series ceph: fix blindly expanding the readahead windows

Commit Message

Xiubo Li March 23, 2023, 7:01 a.m. UTC
From: Xiubo Li <xiubli@redhat.com>

Blindly expanding the readahead windows will cause unnecessary
pagecache thrashing and also introduce extra network load.
We should disable expanding the windows if readahead is disabled
and also shouldn't expand the windows too much.

Expand forward first instead of expanding backward, to favor
possible sequential reads.

URL: https://www.spinics.net/lists/ceph-users/msg76183.html
Signed-off-by: Xiubo Li <xiubli@redhat.com>
---
 fs/ceph/addr.c | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

Comments

Ilya Dryomov April 30, 2023, 11 a.m. UTC | #1
On Thu, Mar 23, 2023 at 8:01 AM <xiubli@redhat.com> wrote:
>
> From: Xiubo Li <xiubli@redhat.com>
>
> Blindly expanding the readahead windows will cause unnecessary
> pagecache thrashing and also introduce extra network load.
> We should disable expanding the windows if readahead is disabled
> and also shouldn't expand the windows too much.
>
> Expand forward first instead of expanding backward, to favor
> possible sequential reads.
>
> URL: https://www.spinics.net/lists/ceph-users/msg76183.html
> Signed-off-by: Xiubo Li <xiubli@redhat.com>
> ---
>  fs/ceph/addr.c | 23 +++++++++++++++++------
>  1 file changed, 17 insertions(+), 6 deletions(-)
>
> diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
> index ca4dc6450887..01d997f6c66c 100644
> --- a/fs/ceph/addr.c
> +++ b/fs/ceph/addr.c
> @@ -188,16 +188,27 @@ static void ceph_netfs_expand_readahead(struct netfs_io_request *rreq)
>         struct inode *inode = rreq->inode;
>         struct ceph_inode_info *ci = ceph_inode(inode);
>         struct ceph_file_layout *lo = &ci->i_layout;
> +       unsigned long max_pages = inode->i_sb->s_bdi->ra_pages;
> +       unsigned long max_len = max_pages << PAGE_SHIFT;
> +       unsigned long len;
>         u32 blockoff;
>         u64 blockno;
>
> -       /* Expand the start downward */
> -       blockno = div_u64_rem(rreq->start, lo->stripe_unit, &blockoff);
> -       rreq->start = blockno * lo->stripe_unit;
> -       rreq->len += blockoff;
> +       /* Readahead is disabled */
> +       if (!max_pages)
> +               return;
> +
> +       /* Expand the length forward by rounding up it to the next block */
> +       len = roundup(rreq->len, lo->stripe_unit);
> +       if (len <= max_len)
> +               rreq->len = len;

Hi Xiubo,

This change makes it possible for the request to be expanded into the
next block (i.e. it's not rounded _up_ to the next block as the comment
says) because rreq->len is no longer guaranteed to be based off of the
start of the block here.  Previously that was ensured by the preceding
downward expansion.

Thanks,

                Ilya
Xiubo Li May 4, 2023, 4:26 a.m. UTC | #2
On 4/30/23 19:00, Ilya Dryomov wrote:
> On Thu, Mar 23, 2023 at 8:01 AM <xiubli@redhat.com> wrote:
>> From: Xiubo Li <xiubli@redhat.com>
>>
>> Blindly expanding the readahead windows will cause unnecessary
>> pagecache thrashing and also introduce extra network load.
>> We should disable expanding the windows if readahead is disabled
>> and also shouldn't expand the windows too much.
>>
>> Expand forward first instead of expanding backward, to favor
>> possible sequential reads.
>>
>> URL: https://www.spinics.net/lists/ceph-users/msg76183.html
>> Signed-off-by: Xiubo Li <xiubli@redhat.com>
>> ---
>>   fs/ceph/addr.c | 23 +++++++++++++++++------
>>   1 file changed, 17 insertions(+), 6 deletions(-)
>>
>> diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
>> index ca4dc6450887..01d997f6c66c 100644
>> --- a/fs/ceph/addr.c
>> +++ b/fs/ceph/addr.c
>> @@ -188,16 +188,27 @@ static void ceph_netfs_expand_readahead(struct netfs_io_request *rreq)
>>          struct inode *inode = rreq->inode;
>>          struct ceph_inode_info *ci = ceph_inode(inode);
>>          struct ceph_file_layout *lo = &ci->i_layout;
>> +       unsigned long max_pages = inode->i_sb->s_bdi->ra_pages;
>> +       unsigned long max_len = max_pages << PAGE_SHIFT;
>> +       unsigned long len;
>>          u32 blockoff;
>>          u64 blockno;
>>
>> -       /* Expand the start downward */
>> -       blockno = div_u64_rem(rreq->start, lo->stripe_unit, &blockoff);
>> -       rreq->start = blockno * lo->stripe_unit;
>> -       rreq->len += blockoff;
>> +       /* Readahead is disabled */
>> +       if (!max_pages)
>> +               return;
>> +
>> +       /* Expand the length forward by rounding up it to the next block */
>> +       len = roundup(rreq->len, lo->stripe_unit);
>> +       if (len <= max_len)
>> +               rreq->len = len;
> Hi Xiubo,
>
> This change makes it possible for the request to be expanded into the
> next block (i.e. it's not rounded _up_ to the next block as the comment
> says) because rreq->len is no longer guaranteed to be based off of the
> start of the block here.  Previously that was ensured by the preceding
> downward expansion.

Correct. I will fix it.

Thanks Ilya.

- Xiubo


> Thanks,
>
>                  Ilya
>

Patch

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index ca4dc6450887..01d997f6c66c 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -188,16 +188,27 @@  static void ceph_netfs_expand_readahead(struct netfs_io_request *rreq)
 	struct inode *inode = rreq->inode;
 	struct ceph_inode_info *ci = ceph_inode(inode);
 	struct ceph_file_layout *lo = &ci->i_layout;
+	unsigned long max_pages = inode->i_sb->s_bdi->ra_pages;
+	unsigned long max_len = max_pages << PAGE_SHIFT;
+	unsigned long len;
 	u32 blockoff;
 	u64 blockno;
 
-	/* Expand the start downward */
-	blockno = div_u64_rem(rreq->start, lo->stripe_unit, &blockoff);
-	rreq->start = blockno * lo->stripe_unit;
-	rreq->len += blockoff;
+	/* Readahead is disabled */
+	if (!max_pages)
+		return;
+
+	/* Expand the length forward by rounding up it to the next block */
+	len = roundup(rreq->len, lo->stripe_unit);
+	if (len <= max_len)
+		rreq->len = len;
 
-	/* Now, round up the length to the next block */
-	rreq->len = roundup(rreq->len, lo->stripe_unit);
+	/* Now try to expand the start downward */
+	blockno = div_u64_rem(rreq->start, lo->stripe_unit, &blockoff);
+	if (rreq->len + blockoff <= max_len) {
+		rreq->start = blockno * lo->stripe_unit;
+		rreq->len += blockoff;
+	}
 }
 
 static bool ceph_netfs_clamp_length(struct netfs_io_subrequest *subreq)