[v3] ceph: fix blindly expanding the readahead windows

Message ID	20230504111100.417305-1-xiubli@redhat.com
State	Superseded
Headers	show Return-Path: <ceph-devel-owner@vger.kernel.org> From: xiubli@redhat.com To: idryomov@gmail.com, ceph-devel@vger.kernel.org Cc: jlayton@kernel.org, vshankar@redhat.com, lhenriques@suse.de, mchangir@redhat.com, Xiubo Li <xiubli@redhat.com>, stable@vger.kernel.org, Hu Weiwen <sehuww@mail.scut.edu.cn> Subject: [PATCH v3] ceph: fix blindly expanding the readahead windows Date: Thu, 4 May 2023 19:11:00 +0800 Message-Id: <20230504111100.417305-1-xiubli@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	[v3] ceph: fix blindly expanding the readahead windows \| expand [v3] ceph: fix blindly expanding the readahead windows

Message ID

20230504111100.417305-1-xiubli@redhat.com

State

Superseded

Headers

From: xiubli@redhat.com
To: idryomov@gmail.com, ceph-devel@vger.kernel.org
Cc: jlayton@kernel.org, vshankar@redhat.com, lhenriques@suse.de,
        mchangir@redhat.com, Xiubo Li <xiubli@redhat.com>,
        stable@vger.kernel.org, Hu Weiwen <sehuww@mail.scut.edu.cn>
Subject: [PATCH v3] ceph: fix blindly expanding the readahead windows
Date: Thu,  4 May 2023 19:11:00 +0800
Message-Id: <20230504111100.417305-1-xiubli@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk

Series

[v3] ceph: fix blindly expanding the readahead windows | expand

Commit Message

Xiubo Li May 4, 2023, 11:11 a.m. UTC

From: Xiubo Li <xiubli@redhat.com>

Blindly expanding the readahead windows will cause unneccessary
pagecache thrashing and also will introdue the network workload.
We should disable expanding the windows if the readahead is disabled
and also shouldn't expand the windows too much.

Expanding forward firstly instead of expanding backward for possible
sequential reads.

Bound `rreq->len` to the actual file size to restore the previous page
cache usage.

Cc: stable@vger.kernel.org
Fixes: 49870056005c ("ceph: convert ceph_readpages to ceph_readahead")
URL: https://lore.kernel.org/ceph-devel/20230504082510.247-1-sehuww@mail.scut.edu.cn
URL: https://www.spinics.net/lists/ceph-users/msg76183.html
Cc: Hu Weiwen <sehuww@mail.scut.edu.cn>
Signed-off-by: Xiubo Li <xiubli@redhat.com>
---

V3:
- Folded Hu Weiwen's fix and bound `rreq->len` to the actual file size.



 fs/ceph/addr.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

Comments

Ilya Dryomov May 8, 2023, 12:46 p.m. UTC | #1

On Thu, May 4, 2023 at 1:11 PM <xiubli@redhat.com> wrote:
>
> From: Xiubo Li <xiubli@redhat.com>
>
> Blindly expanding the readahead windows will cause unneccessary
> pagecache thrashing and also will introdue the network workload.
> We should disable expanding the windows if the readahead is disabled
> and also shouldn't expand the windows too much.
>
> Expanding forward firstly instead of expanding backward for possible
> sequential reads.
>
> Bound `rreq->len` to the actual file size to restore the previous page
> cache usage.
>
> Cc: stable@vger.kernel.org
> Fixes: 49870056005c ("ceph: convert ceph_readpages to ceph_readahead")
> URL: https://lore.kernel.org/ceph-devel/20230504082510.247-1-sehuww@mail.scut.edu.cn
> URL: https://www.spinics.net/lists/ceph-users/msg76183.html
> Cc: Hu Weiwen <sehuww@mail.scut.edu.cn>
> Signed-off-by: Xiubo Li <xiubli@redhat.com>
> ---
>
> V3:
> - Folded Hu Weiwen's fix and bound `rreq->len` to the actual file size.

Hi Xiubo,

This looks much better!  Just a couple of nits:

>
>
>
>  fs/ceph/addr.c | 28 ++++++++++++++++++++++------
>  1 file changed, 22 insertions(+), 6 deletions(-)
>
> diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
> index ca4dc6450887..357d9d28f202 100644
> --- a/fs/ceph/addr.c
> +++ b/fs/ceph/addr.c
> @@ -188,16 +188,32 @@ static void ceph_netfs_expand_readahead(struct netfs_io_request *rreq)
>         struct inode *inode = rreq->inode;
>         struct ceph_inode_info *ci = ceph_inode(inode);
>         struct ceph_file_layout *lo = &ci->i_layout;
> +       unsigned long max_pages = inode->i_sb->s_bdi->ra_pages;
> +       unsigned long max_len = max_pages << PAGE_SHIFT;
> +       loff_t end = rreq->start + rreq->len, new_end;
>         u32 blockoff;
>         u64 blockno;
>
> -       /* Expand the start downward */
> -       blockno = div_u64_rem(rreq->start, lo->stripe_unit, &blockoff);
> -       rreq->start = blockno * lo->stripe_unit;
> -       rreq->len += blockoff;
> +       /* Readahead is disabled */
> +       if (!max_pages)
> +               return;
>
> -       /* Now, round up the length to the next block */
> -       rreq->len = roundup(rreq->len, lo->stripe_unit);
> +       /*
> +        * Try to expand the length forward by rounding  up it to the next
> +        * block, but do not exceed the file size, unless the original
> +        * request already exceeds it.
> +        */
> +       new_end = round_up(end, lo->stripe_unit);
> +       new_end = min(new_end, rreq->i_size);

This can be done on single line:

        new_end = min(round_up(end, lo->stripe_unit), rreq->i_size);

> +       if (new_end > end && new_end <= rreq->start + max_len)
> +               rreq->len = new_end - rreq->start;
> +
> +       /* Try to expand the start downward */
> +       blockno = div_u64_rem(rreq->start, lo->stripe_unit, &blockoff);
> +       if (rreq->len + blockoff <= max_len) {
> +               rreq->start = blockno * lo->stripe_unit;

Can this be written as:

                rreq->start -= blockoff;

It seems like it would be easier to read and probably cheaper to
compute too.

Thanks,

                Ilya

Xiubo Li May 9, 2023, 12:46 a.m. UTC | #2

On 5/8/23 20:46, Ilya Dryomov wrote:
> On Thu, May 4, 2023 at 1:11 PM <xiubli@redhat.com> wrote:
>> From: Xiubo Li <xiubli@redhat.com>
>>
>> Blindly expanding the readahead windows will cause unneccessary
>> pagecache thrashing and also will introdue the network workload.
>> We should disable expanding the windows if the readahead is disabled
>> and also shouldn't expand the windows too much.
>>
>> Expanding forward firstly instead of expanding backward for possible
>> sequential reads.
>>
>> Bound `rreq->len` to the actual file size to restore the previous page
>> cache usage.
>>
>> Cc: stable@vger.kernel.org
>> Fixes: 49870056005c ("ceph: convert ceph_readpages to ceph_readahead")
>> URL: https://lore.kernel.org/ceph-devel/20230504082510.247-1-sehuww@mail.scut.edu.cn
>> URL: https://www.spinics.net/lists/ceph-users/msg76183.html
>> Cc: Hu Weiwen <sehuww@mail.scut.edu.cn>
>> Signed-off-by: Xiubo Li <xiubli@redhat.com>
>> ---
>>
>> V3:
>> - Folded Hu Weiwen's fix and bound `rreq->len` to the actual file size.
> Hi Xiubo,
>
> This looks much better!  Just a couple of nits:
>
>>
>>
>>   fs/ceph/addr.c | 28 ++++++++++++++++++++++------
>>   1 file changed, 22 insertions(+), 6 deletions(-)
>>
>> diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
>> index ca4dc6450887..357d9d28f202 100644
>> --- a/fs/ceph/addr.c
>> +++ b/fs/ceph/addr.c
>> @@ -188,16 +188,32 @@ static void ceph_netfs_expand_readahead(struct netfs_io_request *rreq)
>>          struct inode *inode = rreq->inode;
>>          struct ceph_inode_info *ci = ceph_inode(inode);
>>          struct ceph_file_layout *lo = &ci->i_layout;
>> +       unsigned long max_pages = inode->i_sb->s_bdi->ra_pages;
>> +       unsigned long max_len = max_pages << PAGE_SHIFT;
>> +       loff_t end = rreq->start + rreq->len, new_end;
>>          u32 blockoff;
>>          u64 blockno;
>>
>> -       /* Expand the start downward */
>> -       blockno = div_u64_rem(rreq->start, lo->stripe_unit, &blockoff);
>> -       rreq->start = blockno * lo->stripe_unit;
>> -       rreq->len += blockoff;
>> +       /* Readahead is disabled */
>> +       if (!max_pages)
>> +               return;
>>
>> -       /* Now, round up the length to the next block */
>> -       rreq->len = roundup(rreq->len, lo->stripe_unit);
>> +       /*
>> +        * Try to expand the length forward by rounding  up it to the next
>> +        * block, but do not exceed the file size, unless the original
>> +        * request already exceeds it.
>> +        */
>> +       new_end = round_up(end, lo->stripe_unit);
>> +       new_end = min(new_end, rreq->i_size);
> This can be done on single line:
>
>          new_end = min(round_up(end, lo->stripe_unit), rreq->i_size);
Sure, will fix it.
>> +       if (new_end > end && new_end <= rreq->start + max_len)
>> +               rreq->len = new_end - rreq->start;
>> +
>> +       /* Try to expand the start downward */
>> +       blockno = div_u64_rem(rreq->start, lo->stripe_unit, &blockoff);
>> +       if (rreq->len + blockoff <= max_len) {
>> +               rreq->start = blockno * lo->stripe_unit;
> Can this be written as:
>
>                  rreq->start -= blockoff;
>
> It seems like it would be easier to read and probably cheaper to
> compute too.

Yeah, this looks much easier to understand.

Thanks

- Xiubo

> Thanks,
>
>                  Ilya
>

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index ca4dc6450887..357d9d28f202 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -188,16 +188,32 @@  static void ceph_netfs_expand_readahead(struct netfs_io_request *rreq)
 	struct inode *inode = rreq->inode;
 	struct ceph_inode_info *ci = ceph_inode(inode);
 	struct ceph_file_layout *lo = &ci->i_layout;
+	unsigned long max_pages = inode->i_sb->s_bdi->ra_pages;
+	unsigned long max_len = max_pages << PAGE_SHIFT;
+	loff_t end = rreq->start + rreq->len, new_end;
 	u32 blockoff;
 	u64 blockno;
 
-	/* Expand the start downward */
-	blockno = div_u64_rem(rreq->start, lo->stripe_unit, &blockoff);
-	rreq->start = blockno * lo->stripe_unit;
-	rreq->len += blockoff;
+	/* Readahead is disabled */
+	if (!max_pages)
+		return;
 
-	/* Now, round up the length to the next block */
-	rreq->len = roundup(rreq->len, lo->stripe_unit);
+	/*
+	 * Try to expand the length forward by rounding  up it to the next
+	 * block, but do not exceed the file size, unless the original
+	 * request already exceeds it.
+	 */
+	new_end = round_up(end, lo->stripe_unit);
+	new_end = min(new_end, rreq->i_size);
+	if (new_end > end && new_end <= rreq->start + max_len)
+		rreq->len = new_end - rreq->start;
+
+	/* Try to expand the start downward */
+	blockno = div_u64_rem(rreq->start, lo->stripe_unit, &blockoff);
+	if (rreq->len + blockoff <= max_len) {
+		rreq->start = blockno * lo->stripe_unit;
+		rreq->len += blockoff;
+	}
 }
 
 static bool ceph_netfs_clamp_length(struct netfs_io_subrequest *subreq)

[v3] ceph: fix blindly expanding the readahead windows

Commit Message

Comments

Patch