diff mbox series

[4.19,19/71] btrfs: extent_io: add proper error handling to lock_extent_buffer_for_io()

Message ID 20201109125020.811120362@linuxfoundation.org
State New
Headers show
Series None | expand

Commit Message

Greg Kroah-Hartman Nov. 9, 2020, 12:55 p.m. UTC
From: Qu Wenruo <wqu@suse.com>

commit 2e3c25136adfb293d517e17f761d3b8a43a8fc22 upstream.

This function needs some extra checks on locked pages and eb.  For error
handling we need to unlock locked pages and the eb.

There is a rare >0 return value branch, where all pages get locked
while write bio is not flushed.

Thankfully it's handled by the only caller, btree_write_cache_pages(),
as later write_one_eb() call will trigger submit_one_bio().  So there
shouldn't be any problem.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/btrfs/extent_io.c |   27 ++++++++++++++++++++++-----
 1 file changed, 22 insertions(+), 5 deletions(-)

Comments

Pavel Machek Nov. 11, 2020, 12:44 p.m. UTC | #1
Hi!

> Thankfully it's handled by the only caller, btree_write_cache_pages(),

> as later write_one_eb() call will trigger submit_one_bio().  So there

> shouldn't be any problem.


This explains there should not be any problem in _the
mainline_. AFAICT this talks about this code. Mainline version is:

 prev_eb = eb;
 ret = lock_extent_buffer_for_io(eb, &epd);
 if (!ret) {
 	free_extent_buffer(eb);
 	continue;
 } else if (ret < 0) {
 	done = 1;
 	free_extent_buffer(eb);
 	break;
 }

But 4.19 has:

 ret = lock_extent_buffer_for_io(eb, fs_info, &epd);
 if (!ret) {
  	free_extent_buffer(eb);
 	continue;
 }

IOW missing the code mentioned in the changelog. Is 0607eb1d452d4
prerequisite for this patch?

Best regards,
								Pavel

> +/*

> + * Lock eb pages and flush the bio if we can't the locks

> + *

> + * Return  0 if nothing went wrong

> + * Return >0 is same as 0, except bio is not submitted

> + * Return <0 if something went wrong, no page is locked

> + */


-- 
http://www.livejournal.com/~pavelmachek
Ben Hutchings Nov. 11, 2020, 2:39 p.m. UTC | #2
On Wed, 2020-11-11 at 13:44 +0100, Pavel Machek wrote:
> Hi!

> 

> > Thankfully it's handled by the only caller, btree_write_cache_pages(),

> > as later write_one_eb() call will trigger submit_one_bio().  So there

> > shouldn't be any problem.

> 

> This explains there should not be any problem in _the

> mainline_. AFAICT this talks about this code. Mainline version is:

> 

>  prev_eb = eb;

>  ret = lock_extent_buffer_for_io(eb, &epd);

>  if (!ret) {

>  	free_extent_buffer(eb);

>  	continue;

>  } else if (ret < 0) {

>  	done = 1;

>  	free_extent_buffer(eb);

>  	break;

>  }

> 

> But 4.19 has:

> 

>  ret = lock_extent_buffer_for_io(eb, fs_info, &epd);

>  if (!ret) {

>   	free_extent_buffer(eb);

>  	continue;

>  }


That was changed in mainline two releases after this commit, though.

> IOW missing the code mentioned in the changelog. Is 0607eb1d452d4

> prerequisite for this patch?


I think it's a separate fix, but probably worth picking too.

Ben.

> Best regards,

> 								Pavel

> 

> > +/*

> > + * Lock eb pages and flush the bio if we can't the locks

> > + *

> > + * Return  0 if nothing went wrong

> > + * Return >0 is same as 0, except bio is not submitted

> > + * Return <0 if something went wrong, no page is locked

> > + */

-- 
Ben Hutchings, Software Developer                         Codethink Ltd
https://www.codethink.co.uk/                 Dale House, 35 Dale Street
                                     Manchester, M1 2HF, United Kingdom
Sasha Levin Nov. 12, 2020, 4:06 p.m. UTC | #3
On Wed, Nov 11, 2020 at 02:39:34PM +0000, Ben Hutchings wrote:
>On Wed, 2020-11-11 at 13:44 +0100, Pavel Machek wrote:

>> Hi!

>>

>> > Thankfully it's handled by the only caller, btree_write_cache_pages(),

>> > as later write_one_eb() call will trigger submit_one_bio().  So there

>> > shouldn't be any problem.

>>

>> This explains there should not be any problem in _the

>> mainline_. AFAICT this talks about this code. Mainline version is:

>>

>>  prev_eb = eb;

>>  ret = lock_extent_buffer_for_io(eb, &epd);

>>  if (!ret) {

>>  	free_extent_buffer(eb);

>>  	continue;

>>  } else if (ret < 0) {

>>  	done = 1;

>>  	free_extent_buffer(eb);

>>  	break;

>>  }

>>

>> But 4.19 has:

>>

>>  ret = lock_extent_buffer_for_io(eb, fs_info, &epd);

>>  if (!ret) {

>>   	free_extent_buffer(eb);

>>  	continue;

>>  }

>

>That was changed in mainline two releases after this commit, though.

>

>> IOW missing the code mentioned in the changelog. Is 0607eb1d452d4

>> prerequisite for this patch?

>

>I think it's a separate fix, but probably worth picking too.


I'll take it in too, thanks!

-- 
Thanks,
Sasha
diff mbox series

Patch

--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3554,19 +3554,27 @@  void wait_on_extent_buffer_writeback(str
 		       TASK_UNINTERRUPTIBLE);
 }
 
+/*
+ * Lock eb pages and flush the bio if we can't the locks
+ *
+ * Return  0 if nothing went wrong
+ * Return >0 is same as 0, except bio is not submitted
+ * Return <0 if something went wrong, no page is locked
+ */
 static noinline_for_stack int
 lock_extent_buffer_for_io(struct extent_buffer *eb,
 			  struct btrfs_fs_info *fs_info,
 			  struct extent_page_data *epd)
 {
-	int i, num_pages;
+	int i, num_pages, failed_page_nr;
 	int flush = 0;
 	int ret = 0;
 
 	if (!btrfs_try_tree_write_lock(eb)) {
-		flush = 1;
 		ret = flush_write_bio(epd);
-		BUG_ON(ret < 0);
+		if (ret < 0)
+			return ret;
+		flush = 1;
 		btrfs_tree_lock(eb);
 	}
 
@@ -3576,7 +3584,8 @@  lock_extent_buffer_for_io(struct extent_
 			return 0;
 		if (!flush) {
 			ret = flush_write_bio(epd);
-			BUG_ON(ret < 0);
+			if (ret < 0)
+				return ret;
 			flush = 1;
 		}
 		while (1) {
@@ -3618,7 +3627,10 @@  lock_extent_buffer_for_io(struct extent_
 		if (!trylock_page(p)) {
 			if (!flush) {
 				ret = flush_write_bio(epd);
-				BUG_ON(ret < 0);
+				if (ret < 0) {
+					failed_page_nr = i;
+					goto err_unlock;
+				}
 				flush = 1;
 			}
 			lock_page(p);
@@ -3626,6 +3638,11 @@  lock_extent_buffer_for_io(struct extent_
 	}
 
 	return ret;
+err_unlock:
+	/* Unlock already locked pages */
+	for (i = 0; i < failed_page_nr; i++)
+		unlock_page(eb->pages[i]);
+	return ret;
 }
 
 static void end_extent_buffer_writeback(struct extent_buffer *eb)