Message ID | 26a2690e77671cfe687c5614613fbb6f079f7365.1739959820.git.repk@triplefau.lt |
---|---|
State | Superseded |
Headers | show |
Series | leds: Fix LED_OFF brightness race | expand |
Hi, On 20-Feb-25 9:46 AM, Remi Pommarel wrote: > On Wed, Feb 19, 2025 at 12:52:36PM +0100, Hans de Goede wrote: >> Hi, >> >> On 19-Feb-25 11:41 AM, Remi Pommarel wrote: >>> While commit fa15d8c69238 ("leds: Fix set_brightness_delayed() race") >>> successfully forces led_set_brightness() to be called with LED_OFF at >>> least once when switching from blinking to LED on state so that >>> hw-blinking can be disabled, another race remains. Indeed in >>> led_set_brightness(LED_OFF) followed by led_set_brightness(any) >>> scenario the following CPU scheduling can happen: >>> >>> CPU0 CPU1 >>> ---- ---- >>> set_brightness_delayed() { >>> test_and_clear_bit(BRIGHTNESS_OFF) >>> led_set_brightness(LED_OFF) { >>> set_bit(BRIGHTNESS_OFF) >>> queue_work() >>> } >>> led_set_brightness(any) { >>> set_bit(BRIGHTNESS) >>> queue_work() //already queued >>> } >>> test_and_clear_bit(BRIGHTNESS) >>> /* LED set with brightness any */ >>> } >>> >>> /* From previous CPU1 queue_work() */ >>> set_brightness_delayed() { >>> test_and_clear_bit(BRIGHTNESS_OFF) >>> /* LED turned off */ >>> test_and_clear_bit(BRIGHTNESS) >>> /* Clear from previous run, LED remains off */ >>> >>> In that case the led_set_brightness(LED_OFF)/led_set_brightness(any) >>> sequence will be effectively executed in reverse order and LED will >>> remain off. >>> >>> With the introduction of commit 32360bf6a5d4 ("leds: Introduce ordered >>> workqueue for LEDs events instead of system_wq") the race is easier to >>> trigger as sysfs brightness configuration does not wait for >>> set_brightness_delayed() work to finish (flush_work() removal). >>> >>> Use delayed_set_value to optionnally re-configure brightness after a >>> LED_OFF. That way a LED state could be configured more that once but >>> final state will always be as expected. >>> >>> Fixes: fa15d8c69238 ("leds: Fix set_brightness_delayed() race") >>> Signed-off-by: Remi Pommarel <repk@triplefau.lt> >> >> Thanks, patch looks good to me: >> > > Actually two additionnal remarks here. The first one is that now more > than before, delayed_set_value store should be seen before work_flags > modification on other CPUs. That means that a smp_mb__before_atomic() > is needed before the two set_bit(). > > The second one is that delayed_set_value can be bigger than a single > byte, so theoretically store tearing can happen and > set_brightness_delayed_set_brightness() could be called with an invalid > value. WRITE_ONCE/READ_ONCE could prevent that but because the > smp_mb__before_atomic() ensures that the "last" delayed_set_value is > valid I don't mind having very seldom intermediate invalid values. > > So I think a v2 with smp_mb__before_atomic() is needed here, what do you > think ? Doing a v2 with smp_mb__before_atomic() sounds good to me. Regards, Hans
diff --git a/drivers/leds/led-core.c b/drivers/leds/led-core.c index f6c46d2e5276..528f53bf13a9 100644 --- a/drivers/leds/led-core.c +++ b/drivers/leds/led-core.c @@ -159,8 +159,19 @@ static void set_brightness_delayed(struct work_struct *ws) * before this work item runs once. To make sure this works properly * handle LED_SET_BRIGHTNESS_OFF first. */ - if (test_and_clear_bit(LED_SET_BRIGHTNESS_OFF, &led_cdev->work_flags)) + if (test_and_clear_bit(LED_SET_BRIGHTNESS_OFF, &led_cdev->work_flags)) { set_brightness_delayed_set_brightness(led_cdev, LED_OFF); + /* + * The consecutives led_set_brightness(LED_OFF), + * led_set_brightness(LED_FULL) could have been executed out of + * order (LED_FULL first), if the work_flags has been set + * between LED_SET_BRIGHTNESS_OFF and LED_SET_BRIGHTNESS of this + * work. To avoid ending with the LED turned off, turn the LED + * on again. + */ + if (led_cdev->delayed_set_value != LED_OFF) + set_bit(LED_SET_BRIGHTNESS, &led_cdev->work_flags); + } if (test_and_clear_bit(LED_SET_BRIGHTNESS, &led_cdev->work_flags)) set_brightness_delayed_set_brightness(led_cdev, led_cdev->delayed_set_value); @@ -331,10 +342,11 @@ void led_set_brightness_nopm(struct led_classdev *led_cdev, unsigned int value) * change is done immediately afterwards (before the work runs), * it uses a separate work_flag. */ - if (value) { - led_cdev->delayed_set_value = value; + led_cdev->delayed_set_value = value; + + if (value) set_bit(LED_SET_BRIGHTNESS, &led_cdev->work_flags); - } else { + else { clear_bit(LED_SET_BRIGHTNESS, &led_cdev->work_flags); clear_bit(LED_SET_BLINK, &led_cdev->work_flags); set_bit(LED_SET_BRIGHTNESS_OFF, &led_cdev->work_flags);
While commit fa15d8c69238 ("leds: Fix set_brightness_delayed() race") successfully forces led_set_brightness() to be called with LED_OFF at least once when switching from blinking to LED on state so that hw-blinking can be disabled, another race remains. Indeed in led_set_brightness(LED_OFF) followed by led_set_brightness(any) scenario the following CPU scheduling can happen: CPU0 CPU1 ---- ---- set_brightness_delayed() { test_and_clear_bit(BRIGHTNESS_OFF) led_set_brightness(LED_OFF) { set_bit(BRIGHTNESS_OFF) queue_work() } led_set_brightness(any) { set_bit(BRIGHTNESS) queue_work() //already queued } test_and_clear_bit(BRIGHTNESS) /* LED set with brightness any */ } /* From previous CPU1 queue_work() */ set_brightness_delayed() { test_and_clear_bit(BRIGHTNESS_OFF) /* LED turned off */ test_and_clear_bit(BRIGHTNESS) /* Clear from previous run, LED remains off */ In that case the led_set_brightness(LED_OFF)/led_set_brightness(any) sequence will be effectively executed in reverse order and LED will remain off. With the introduction of commit 32360bf6a5d4 ("leds: Introduce ordered workqueue for LEDs events instead of system_wq") the race is easier to trigger as sysfs brightness configuration does not wait for set_brightness_delayed() work to finish (flush_work() removal). Use delayed_set_value to optionnally re-configure brightness after a LED_OFF. That way a LED state could be configured more that once but final state will always be as expected. Fixes: fa15d8c69238 ("leds: Fix set_brightness_delayed() race") Signed-off-by: Remi Pommarel <repk@triplefau.lt> --- While the race does seem to be very thin, it is very easy to trigger it on my setup with the following command: $ echo "pattern" > /sys/class/leds/<device>/trigger $ echo "3 200 40 200 3 200 3 200" > /sys/class/leds/<device>/pattern $ sleep 1 $ echo 0 > /sys/class/leds/<device>/brightness $ echo 30 > /sys/class/leds/<device>/brightness The issue happens 8 out of 10 times, not sure why this is the case, maybe two consecutive set_bit() on one CPU are likely to appear as one just after a previous test_and_clear_bit() on another due to the full memory barrier implied by this atomic operation ? drivers/leds/led-core.c | 20 ++++++++++++++++---- 1 file changed, 16 insertions(+), 4 deletions(-)