[v2,0/4] ACPI: SBS: Fix various issues

Message ID 20230225115144.31212-1-W_Armin@gmx.de

Message

Armin Wolf Feb. 25, 2023, 11:51 a.m. UTC
On my Acer Travelmate 4002WLMi, the system locks up upon
suspend/shutdown. After a lot of research, it turned out
that the sbs module was the culprit. The driver would not
correctly mask out the value used to select a battery using
the "Smart Battery Selector" (subset of the "Smart Battery Manager").
This accidentally caused an invalid power source to be selected,
which was automatically corrected by the selector. Upon
notifying the host about the corrected power source, some batteries
would be selected for re-reading, causing an endless loop.
This would lead to some workqueues filling up, which caused the
lockup upon suspend/shutdown.
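
To illustrate the failure mode, here is a minimal user-space sketch. It
is not code from the patches. The SelectorState layout (four 4-bit
fields, one bit per battery) is an assumption based on a reading of the
Smart Battery Selector specification, and all names are hypothetical.
It only shows why a value built without regard for the other fields
can change the power source; how those fields have to be filled on a
write is left to the specification and to the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed layout of the 16-bit SelectorState word: four 4-bit fields,
 * one bit per battery in each field. The enum value is the bit offset
 * of the field. All names here are hypothetical.
 */
enum selector_field {
	SEL_PRESENT   = 0,	/* bits 0-3: batteries present */
	SEL_CHARGE_BY = 4,	/* bits 4-7: battery being charged */
	SEL_POWER_BY  = 8,	/* bits 8-11: battery powering the system */
	SEL_SMB       = 12,	/* bits 12-15: battery connected to the SMBus */
};

/* Extract one 4-bit field from a SelectorState word. */
static unsigned int selector_get(uint16_t state, enum selector_field field)
{
	return (state >> field) & 0xfu;
}

/*
 * Hypothetical naive write value: only the SMB bit of the wanted
 * battery is set, every other field is left at zero.
 */
static uint16_t naive_select(unsigned int battery_id)
{
	assert(battery_id < 4);
	return (uint16_t)(1u << (battery_id + SEL_SMB));
}

/* True if new_state carries a different POWER_BY field than old_state. */
static bool disturbs_power_source(uint16_t old_state, uint16_t new_state)
{
	return selector_get(old_state, SEL_POWER_BY) !=
	       selector_get(new_state, SEL_POWER_BY);
}
```

With a current state of 0x1113 (batteries 0 and 1 present, battery 0
charging, powering the system and on the SMBus), naive_select(1) yields
0x2000, and disturbs_power_source(0x1113, 0x2000) is true because the
POWER_BY field went from 0x1 to 0x0.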

The first three patches fix a stacktrace on module removal caused
by some locking issues. The last patch finally fixes the
suspend/shutdown issues.

As a side note: This was the first machine on which I installed Linux,
and finally fixing this took ~5 years of tinkering.
---
Changes in v2:
- make acpi_ec_add_query_handler() static to fix warning

Armin Wolf (4):
  ACPI: EC: Add query notifier support
  ACPI: sbshc: Use ec query notifier call chain
  ACPI: EC: Make query handlers private
  ACPI: SBS: Fix handling of Smart Battery Selectors

 drivers/acpi/ec.c       | 44 ++++++++++++++++++++--------------------
 drivers/acpi/internal.h | 10 ++++-----
 drivers/acpi/sbs.c      | 27 ++++++++++++++++---------
 drivers/acpi/sbshc.c    | 45 ++++++++++++++++++++++++++---------------
 4 files changed, 74 insertions(+), 52 deletions(-)

--
2.30.2

Comments

Armin Wolf March 12, 2023, 5:15 p.m. UTC | #1
On 25.02.23 at 12:51, Armin Wolf wrote:

> On my Acer Travelmate 4002WLMi, the system locks up upon
> suspend/shutdown. After a lot of research, it turned out
> that the sbs module was the culprit. The driver would not
> correctly mask out the value used to select a battery using
> the "Smart Battery Selector" (subset of the "Smart Battery Manager").
> This accidentally caused an invalid power source to be selected,
> which was automatically corrected by the selector. Upon
> notifying the host about the corrected power source, some batteries
> would be selected for re-reading, causing an endless loop.
> This would lead to some workqueues filling up, which caused the
> lockup upon suspend/shutdown.
>
> The first three patches fix a stacktrace on module removal caused
> by some locking issues. The last patch finally fixes the
> suspend/shutdown issues.
>
> As a side note: This was the first machine on which I installed Linux,
> and finally fixing this took ~5 years of tinkering.

What is the status of this patchset? Should I use an SRCU notifier chain
for the query notifiers? I would really like to see this getting fixed,
as it prevents me from using Linux on this machine.

Armin Wolf

Rafael J. Wysocki March 14, 2023, 7:49 p.m. UTC | #2
On Sun, Mar 12, 2023 at 6:15 PM Armin Wolf <W_Armin@gmx.de> wrote:
>
> What is the status of this patchset? Should I use an SRCU notifier chain
> for the query notifiers? I would really like to see this getting fixed,
> as it prevents me from using Linux on this machine.

I'm not entirely convinced about the query notifiers idea TBH.
Armin Wolf March 14, 2023, 11:04 p.m. UTC | #3
On 14.03.23 at 20:49, Rafael J. Wysocki wrote:

> On Sun, Mar 12, 2023 at 6:15 PM Armin Wolf <W_Armin@gmx.de> wrote:
>> What is the status of this patchset? Should I use an SRCU notifier chain
>> for the query notifiers? I would really like to see this getting fixed,
>> as it prevents me from using Linux on this machine.
> I'm not entirely convinced about the query notifiers idea TBH.

I already thought about just flushing the query workqueue, but
acpi_ec_remove_query_handler() would then still remove the query handlers
installed to handle _Qxx methods as well. This might cause errors: on the
Acer Travelmate 4002WLMi, for example, the SMBus query handler (_Q20)
resets the SMBus alert bit in case no EC SMBus driver is overriding it.
Is there any specific reason for your dislike of the query notifiers?
I could turn them into SRCU call chains to avoid performance issues.
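
To make the idea more concrete, here is a minimal user-space sketch of
a query notifier call chain. It is not the code from the patches and
not the kernel notifier API: it uses no locking and no SRCU, and every
name in it is hypothetical. It only shows that removing one subscriber
from such a chain leaves all other subscribers in place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One subscriber on the chain (hypothetical, user-space only). */
struct query_notifier {
	/* Return nonzero to stop the chain after this callback. */
	int (*call)(struct query_notifier *self, uint8_t query);
	struct query_notifier *next;
};

struct query_chain {
	struct query_notifier *head;
};

/* Add a subscriber at the front of the chain. */
static void chain_register(struct query_chain *chain,
			   struct query_notifier *nb)
{
	nb->next = chain->head;
	chain->head = nb;
}

/* Unlink exactly this subscriber; all others stay registered. */
static void chain_unregister(struct query_chain *chain,
			     struct query_notifier *nb)
{
	struct query_notifier **pos = &chain->head;

	while (*pos) {
		if (*pos == nb) {
			*pos = nb->next;
			nb->next = NULL;
			return;
		}
		pos = &(*pos)->next;
	}
}

/* Pass a query number down the chain; returns the number of callbacks run. */
static int chain_call(struct query_chain *chain, uint8_t query)
{
	int called = 0;

	for (struct query_notifier *nb = chain->head; nb; nb = nb->next) {
		called++;
		if (nb->call(nb, query))
			break;
	}
	return called;
}

/* Example subscriber: counts the queries matching the number it wants. */
struct counting_notifier {
	struct query_notifier nb;	/* must stay the first member */
	uint8_t wanted;
	int hits;
};

static int counting_call(struct query_notifier *self, uint8_t query)
{
	struct counting_notifier *c = (struct counting_notifier *)self;

	if (query == c->wanted)
		c->hits++;
	return 0;	/* never stop the chain */
}
```

A driver-like user would embed a struct query_notifier, register it on
load and unregister only its own entry on removal, so that a subscriber
for query 0x20 would keep working when some other subscriber goes away.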

Armin Wolf