diff mbox series

[v6,2/2] x86/MCE/AMD: Fix memory leak when `threshold_create_bank()` fails

Message ID 20220329104705.65256-3-ammarfaizi2@gnuweeb.org
State New
Headers show
Series Two x86 fixes | expand

Commit Message

Ammar Faizi March 29, 2022, 10:47 a.m. UTC
In mce_threshold_create_device(), if threshold_create_bank() fails, the
@bp will be leaked, because the call to mce_threshold_remove_device()
will not free the @bp. mce_threshold_remove_device() frees
@threshold_banks. At that point, the @bp has not been written to
@threshold_banks, @threshold_banks is NULL, so the call is just a nop.

Fix this by extracting the cleanup part into a new static function
__threshold_remove_device(), then call it from create/remove device
functions.

Also, eliminate the "goto out_err", just early return inside the loop
if the creation fails.

Cc: stable@vger.kernel.org # v5.8+
Fixes: 6458de97fc15 ("x86/mce/amd: Straighten CPU hotplug path")
Co-developed-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org>
Signed-off-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org>
Co-developed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
---

   v6:
     - Change the helper function name to __threshold_remove_device().

---
 arch/x86/kernel/cpu/mce/amd.c | 32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

Comments

Thomas Gleixner April 3, 2022, 5:03 p.m. UTC | #1
On Tue, Mar 29 2022 at 17:47, Ammar Faizi wrote:

> In mce_threshold_create_device(), if threshold_create_bank() fails, the
> @bp will be leaked, because the call to mce_threshold_remove_device()
> will not free the @bp. mce_threshold_remove_device() frees
> @threshold_banks. At that point, the @bp has not been written to
> @threshold_banks, @threshold_banks is NULL, so the call is just a nop.
>
> Fix this by extracting the cleanup part into a new static function
> __threshold_remove_device(), then call it from create/remove device
> functions.

The way simpler fix is to move 

>  	}
>  	this_cpu_write(threshold_banks, bp);

before the loop. That's safe because the banks cannot yet be reached via
an MCE as the vector is not yet enabled:
  
>  	if (thresholding_irq_en)
>  		mce_threshold_vector = amd_threshold_interrupt;

Thanks,

        tglx
Ammar Faizi April 3, 2022, 5:43 p.m. UTC | #2
On 4/4/22 12:03 AM, Thomas Gleixner wrote:
> On Tue, Mar 29 2022 at 17:47, Ammar Faizi wrote:
> 
>> In mce_threshold_create_device(), if threshold_create_bank() fails, the
>> @bp will be leaked, because the call to mce_threshold_remove_device()
>> will not free the @bp. mce_threshold_remove_device() frees
>> @threshold_banks. At that point, the @bp has not been written to
>> @threshold_banks, @threshold_banks is NULL, so the call is just a nop.
>>
>> Fix this by extracting the cleanup part into a new static function
>> __threshold_remove_device(), then call it from create/remove device
>> functions.
> 
> The way simpler fix is to move
> 
>>   	}
>>   	this_cpu_write(threshold_banks, bp);
> 
> before the loop. That's safe because the banks cannot yet be reached via
> an MCE as the vector is not yet enabled:
>    
>>   	if (thresholding_irq_en)
>>   		mce_threshold_vector = amd_threshold_interrupt;
Thomas,

I did like what you said (in the patch v4), but after Yazen and Borislav
reviewed it, we got a conclusion that it's not safe.

See [1] and [2] for the full message.

[1]: https://lore.kernel.org/lkml/YkFsQhpGGXIFTMyp@zn.tnic/
[2]: https://lore.kernel.org/lkml/Yh+oyD%2F5M3TW5ZMM@yaz-ubuntu/

Yazen, Borislav, please take a deeper look on this again. I will send
a v7 revision to really make it simpler by moving that "per-CPU var write"
before the loop.

Thanks!
Ammar Faizi April 3, 2022, 5:45 p.m. UTC | #3
On 4/4/22 12:43 AM, Ammar Faizi wrote:
> On 4/4/22 12:03 AM, Thomas Gleixner wrote:
>> On Tue, Mar 29 2022 at 17:47, Ammar Faizi wrote:
>>
>>> In mce_threshold_create_device(), if threshold_create_bank() fails, the
>>> @bp will be leaked, because the call to mce_threshold_remove_device()
>>> will not free the @bp. mce_threshold_remove_device() frees
>>> @threshold_banks. At that point, the @bp has not been written to
>>> @threshold_banks, @threshold_banks is NULL, so the call is just a nop.
>>>
>>> Fix this by extracting the cleanup part into a new static function
>>> __threshold_remove_device(), then call it from create/remove device
>>> functions.
>>
>> The way simpler fix is to move
>>
>>>       }
>>>       this_cpu_write(threshold_banks, bp);
>>
>> before the loop. That's safe because the banks cannot yet be reached via
>> an MCE as the vector is not yet enabled:
>>>       if (thresholding_irq_en)
>>>           mce_threshold_vector = amd_threshold_interrupt;
> Thomas,
> 
> I did like what you said (in the patch v4), but after Yazen and Borislav
> reviewed it, we got a conclusion that it's not safe.
> 
> See [1] and [2] for the full message.
> 
> [1]: https://lore.kernel.org/lkml/YkFsQhpGGXIFTMyp@zn.tnic/
> [2]: https://lore.kernel.org/lkml/Yh+oyD%2F5M3TW5ZMM@yaz-ubuntu/
> 
> Yazen, Borislav, please take a deeper look on this again. I will send
> a v7 revision to really make it simpler by moving that "per-CPU var write"
> before the loop.

(only if it's really safe)
Borislav Petkov April 3, 2022, 6:46 p.m. UTC | #4
On Mon, Apr 04, 2022 at 12:45:04AM +0700, Ammar Faizi wrote:
> > Yazen, Borislav, please take a deeper look on this again. I will send
> > a v7 revision to really make it simpler by moving that "per-CPU var write"
> > before the loop.
> 
> (only if it's really safe)

Don't bother - I've discussed it with tglx offlist. The current solution
remains.
diff mbox series

Patch

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 1940d305db1c..d293ae088d6b 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -1294,10 +1294,23 @@  static void threshold_remove_bank(struct threshold_bank *bank)
 	kfree(bank);
 }
 
+static void __threshold_remove_device(struct threshold_bank **bp,
+				       unsigned int numbanks)
+{
+	unsigned int bank;
+
+	for (bank = 0; bank < numbanks; bank++) {
+		if (bp[bank]) {
+			threshold_remove_bank(bp[bank]);
+			bp[bank] = NULL;
+		}
+	}
+	kfree(bp);
+}
+
 int mce_threshold_remove_device(unsigned int cpu)
 {
 	struct threshold_bank **bp = this_cpu_read(threshold_banks);
-	unsigned int bank, numbanks = this_cpu_read(mce_num_banks);
 
 	if (!bp)
 		return 0;
@@ -1308,13 +1321,7 @@  int mce_threshold_remove_device(unsigned int cpu)
 	 */
 	this_cpu_write(threshold_banks, NULL);
 
-	for (bank = 0; bank < numbanks; bank++) {
-		if (bp[bank]) {
-			threshold_remove_bank(bp[bank]);
-			bp[bank] = NULL;
-		}
-	}
-	kfree(bp);
+	__threshold_remove_device(bp, this_cpu_read(mce_num_banks));
 	return 0;
 }
 
@@ -1351,15 +1358,14 @@  int mce_threshold_create_device(unsigned int cpu)
 		if (!(this_cpu_read(bank_map) & (1 << bank)))
 			continue;
 		err = threshold_create_bank(bp, cpu, bank);
-		if (err)
-			goto out_err;
+		if (err) {
+			__threshold_remove_device(bp, numbanks);
+			return err;
+		}
 	}
 	this_cpu_write(threshold_banks, bp);
 
 	if (thresholding_irq_en)
 		mce_threshold_vector = amd_threshold_interrupt;
 	return 0;
-out_err:
-	mce_threshold_remove_device(cpu);
-	return err;
 }