Message ID | 4501957.LvFx2qVVIh@kreacher |
---|---|
State | New |
Headers | show |
Series | [v3,1/8] thermal: core: Add mechanism for connecting trips with driver data | expand |
On 25/07/2023 14:04, Rafael J. Wysocki wrote: > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > Some drivers need to update trip point data (temperature and/or > hysteresis) upon notifications from the platform firmware or they > may need to reprogram hardware when trip point parameters are changed > via sysfs. For those purposes, they need to connect struct thermal_trip > to a private data set associated with the trip or the other way around > and using a trip point index for that may not always work, because the > core may need to reorder the trips during thermal zone registration (in > particular, they may need to be sorted). > > To allow that to be done without using a trip point index, introduce > a new field in struct thermal_trip that can be pointed by the driver > to its own data structure containing a trip pointer to be initialized > by the core during thermal zone registration. That pointer will then > have to be updated by the core every time the location of the given > trip point object in memory changes. > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > --- > > v2 -> v3: No changes. > > v1 -> v2: No changes. > > --- > drivers/thermal/thermal_core.c | 20 +++++++++++++++++--- > include/linux/thermal.h | 13 +++++++++++++ > 2 files changed, 30 insertions(+), 3 deletions(-) > > Index: linux-pm/include/linux/thermal.h > =================================================================== > --- linux-pm.orig/include/linux/thermal.h > +++ linux-pm/include/linux/thermal.h > @@ -76,16 +76,29 @@ struct thermal_zone_device_ops { > void (*critical)(struct thermal_zone_device *); > }; > > +struct thermal_trip_ref { > + struct thermal_trip *trip; > +}; That introduces a circular dependency. That should be avoided. > /** > * struct thermal_trip - representation of a point in temperature domain > * @temperature: temperature value in miliCelsius > * @hysteresis: relative hysteresis in miliCelsius > * @type: trip point type > + * @driver_ref: driver's reference to this trip point > + * > + * If @driver_ref is not NULL, the trip pointer in the object pointed to by it > + * will be initialized by the core during thermal zone registration and updated > + * whenever the location of the given trip object changes. This allows the > + * driver to access the trip point data without knowing the relative ordering > + * of trips within the trip table used by the core and, given a trip pointer, > + * to get back to its private data associated with the given trip. > */ > struct thermal_trip { > int temperature; > int hysteresis; > enum thermal_trip_type type; > + struct thermal_trip_ref *driver_ref; > }; Why not use void *priv ? AFAICT, the ACPI driver is the only one where when we reorder the trip points, the trip id is no longer matching the definition provided by the ACPI description. It is possible to have the driver *specific* code to define its own structure with the id and use it instead of the trip_id. So we end up with the ACPI driver registering the trip points with a data structure containing a private trip id. The thermal framework is not supposed to have to deal with this kind of driver issues and from a higher perspective, any driver specific thing must stay in the driver. eg. struct acpi_thermal_trip_data { int id; ... other info }; struct acpi_thermal_trip_data attd[NRTRIPS] = { .id = 0 }, { .id = 1 }, ... struct thermal_trip trips[NRTRIPS]; trips[i].priv = &attd[i]; The drivers with another kind of specific trip data can use this field. > struct thermal_cooling_device_ops { > Index: linux-pm/drivers/thermal/thermal_core.c > =================================================================== > --- linux-pm.orig/drivers/thermal/thermal_core.c > +++ linux-pm/drivers/thermal/thermal_core.c > @@ -1306,14 +1306,28 @@ thermal_zone_device_register_with_trips( > if (result) > goto release_device; > > + mutex_lock(&tz->lock); > + > for (count = 0; count < num_trips; count++) { > - struct thermal_trip trip; > + int temperature = 0; > + > + if (trips) { > + temperature = trips[count].temperature; > + if (trips[count].driver_ref) > + trips[count].driver_ref->trip = &trips[count]; > + } else { > + struct thermal_trip trip; As mentioned above, that should not appear in the thermal core code. > - result = thermal_zone_get_trip(tz, count, &trip); > - if (result || !trip.temperature) > + result = __thermal_zone_get_trip(tz, count, &trip); > + if (!result) > + temperature = trip.temperature; > + } > + if (!temperature) > set_bit(count, &tz->trips_disabled); > } > > + mutex_unlock(&tz->lock); > + > /* Update 'this' zone's governor information */ > mutex_lock(&thermal_governor_lock); > > > >
On Tue, Aug 1, 2023 at 8:29 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote: > > On 25/07/2023 14:04, Rafael J. Wysocki wrote: > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > Some drivers need to update trip point data (temperature and/or > > hysteresis) upon notifications from the platform firmware or they > > may need to reprogram hardware when trip point parameters are changed > > via sysfs. For those purposes, they need to connect struct thermal_trip > > to a private data set associated with the trip or the other way around > > and using a trip point index for that may not always work, because the > > core may need to reorder the trips during thermal zone registration (in > > particular, they may need to be sorted). > > > > To allow that to be done without using a trip point index, introduce > > a new field in struct thermal_trip that can be pointed by the driver > > to its own data structure containing a trip pointer to be initialized > > by the core during thermal zone registration. That pointer will then > > have to be updated by the core every time the location of the given > > trip point object in memory changes. > > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > --- > > > > v2 -> v3: No changes. > > > > v1 -> v2: No changes. > > > > --- > > drivers/thermal/thermal_core.c | 20 +++++++++++++++++--- > > include/linux/thermal.h | 13 +++++++++++++ > > 2 files changed, 30 insertions(+), 3 deletions(-) > > > > Index: linux-pm/include/linux/thermal.h > > =================================================================== > > --- linux-pm.orig/include/linux/thermal.h > > +++ linux-pm/include/linux/thermal.h > > @@ -76,16 +76,29 @@ struct thermal_zone_device_ops { > > void (*critical)(struct thermal_zone_device *); > > }; > > > > +struct thermal_trip_ref { > > + struct thermal_trip *trip; > > +}; > > That introduces a circular dependency. That should be avoided. Sorry, but this is an empty statement without any substance. > > /** > > * struct thermal_trip - representation of a point in temperature domain > > * @temperature: temperature value in miliCelsius > > * @hysteresis: relative hysteresis in miliCelsius > > * @type: trip point type > > + * @driver_ref: driver's reference to this trip point > > + * > > + * If @driver_ref is not NULL, the trip pointer in the object pointed to by it > > + * will be initialized by the core during thermal zone registration and updated > > + * whenever the location of the given trip object changes. This allows the > > + * driver to access the trip point data without knowing the relative ordering > > + * of trips within the trip table used by the core and, given a trip pointer, > > + * to get back to its private data associated with the given trip. > > */ > > struct thermal_trip { > > int temperature; > > int hysteresis; > > enum thermal_trip_type type; > > + struct thermal_trip_ref *driver_ref; > > }; > > Why not use void *priv ? Because it wouldn't work. > AFAICT, the ACPI driver is the only one where when we reorder the trip > points, the trip id is no longer matching the definition provided by the > ACPI description. No, it is not the only one. Every driver that needs to handle trip point update notifications from the platform firmware will have this problem. > It is possible to have the driver *specific* code to define its own > structure with the id and use it instead of the trip_id. Then it would need to walk the trips[] table in the thermal zone, if I understand the suggestion correctly, which goes kind of against your previous changes. > So we end up with the ACPI driver registering the trip points with a > data structure containing a private trip id. > > The thermal framework is not supposed to have to deal with this kind of > driver issues and from a higher perspective, any driver specific thing > must stay in the driver. > > eg. > > struct acpi_thermal_trip_data { > int id; > ... other info > }; > > struct acpi_thermal_trip_data attd[NRTRIPS] = { .id = 0 }, { .id = 1 }, ... > > struct thermal_trip trips[NRTRIPS]; > > trips[i].priv = &attd[i]; But the driver needs to get from priv to trips[i], not the other way around. > The drivers with another kind of specific trip data can use this field. They could if the trips did not get reordered. Otherwise they would need to walk trips[] every time and have a way to match each trip against its private counterpart. I guess they could use the address of the private part as a tag in this, but is walking trips[] by drivers something that you really want? > > > > struct thermal_cooling_device_ops { > > Index: linux-pm/drivers/thermal/thermal_core.c > > =================================================================== > > --- linux-pm.orig/drivers/thermal/thermal_core.c > > +++ linux-pm/drivers/thermal/thermal_core.c > > @@ -1306,14 +1306,28 @@ thermal_zone_device_register_with_trips( > > if (result) > > goto release_device; > > > > + mutex_lock(&tz->lock); > > + > > for (count = 0; count < num_trips; count++) { > > - struct thermal_trip trip; > > + int temperature = 0; > > + > > + if (trips) { > > + temperature = trips[count].temperature; > > + if (trips[count].driver_ref) > > + trips[count].driver_ref->trip = &trips[count]; > > + } else { > > + struct thermal_trip trip; > > As mentioned above, that should not appear in the thermal core code. Well, this is a matter of opinion to me. Clearly, I disagree with it. Anyway, I want to be productive, so here's the thing: either something like this is done, or drivers need to be allowed to walk the trips table. Which one is better? > > > - result = thermal_zone_get_trip(tz, count, &trip); > > - if (result || !trip.temperature) > > + result = __thermal_zone_get_trip(tz, count, &trip); > > + if (!result) > > + temperature = trip.temperature; > > + } > > + if (!temperature) > > set_bit(count, &tz->trips_disabled); > > } > > > > + mutex_unlock(&tz->lock); > > + > > /* Update 'this' zone's governor information */ > > mutex_lock(&thermal_governor_lock);
Hi Rafael, On 01/08/2023 21:02, Rafael J. Wysocki wrote: > On Tue, Aug 1, 2023 at 8:29 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote: >> >> On 25/07/2023 14:04, Rafael J. Wysocki wrote: >>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >>> >>> Some drivers need to update trip point data (temperature and/or >>> hysteresis) upon notifications from the platform firmware or they >>> may need to reprogram hardware when trip point parameters are changed >>> via sysfs. For those purposes, they need to connect struct thermal_trip >>> to a private data set associated with the trip or the other way around >>> and using a trip point index for that may not always work, because the >>> core may need to reorder the trips during thermal zone registration (in >>> particular, they may need to be sorted). >>> >>> To allow that to be done without using a trip point index, introduce >>> a new field in struct thermal_trip that can be pointed by the driver >>> to its own data structure containing a trip pointer to be initialized >>> by the core during thermal zone registration. That pointer will then >>> have to be updated by the core every time the location of the given >>> trip point object in memory changes. >>> >>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >>> --- >>> >>> v2 -> v3: No changes. >>> >>> v1 -> v2: No changes. >>> >>> --- >>> drivers/thermal/thermal_core.c | 20 +++++++++++++++++--- >>> include/linux/thermal.h | 13 +++++++++++++ >>> 2 files changed, 30 insertions(+), 3 deletions(-) >>> >>> Index: linux-pm/include/linux/thermal.h >>> =================================================================== >>> --- linux-pm.orig/include/linux/thermal.h >>> +++ linux-pm/include/linux/thermal.h >>> @@ -76,16 +76,29 @@ struct thermal_zone_device_ops { >>> void (*critical)(struct thermal_zone_device *); >>> }; >>> >>> +struct thermal_trip_ref { >>> + struct thermal_trip *trip; >>> +}; >> >> That introduces a circular dependency. That should be avoided. > > Sorry, but this is an empty statement without any substance. I'm just pointing that we have a struct A pointing to struct B and struct B pointing to struct A. [ ... ] >>> struct thermal_cooling_device_ops { >>> Index: linux-pm/drivers/thermal/thermal_core.c >>> =================================================================== >>> --- linux-pm.orig/drivers/thermal/thermal_core.c >>> +++ linux-pm/drivers/thermal/thermal_core.c >>> @@ -1306,14 +1306,28 @@ thermal_zone_device_register_with_trips( >>> if (result) >>> goto release_device; >>> >>> + mutex_lock(&tz->lock); >>> + >>> for (count = 0; count < num_trips; count++) { >>> - struct thermal_trip trip; >>> + int temperature = 0; >>> + >>> + if (trips) { >>> + temperature = trips[count].temperature; >>> + if (trips[count].driver_ref) >>> + trips[count].driver_ref->trip = &trips[count]; >>> + } else { >>> + struct thermal_trip trip; >> >> As mentioned above, that should not appear in the thermal core code. > > Well, this is a matter of opinion to me. Clearly, I disagree with it. Why? It is not an opinion. The thermal core code has been very very tied with the ACPI implementation (which is logical given the history of the changes). All the efforts have been made to cut these frictions and make the thermal core code driver agnostic. The changes put in place a mechanism for the ACPI driver. The thermal zone lock wrapper is put in place for the ACPI driver. > Anyway, I want to be productive, so here's the thing: either something > like this is done, or drivers need to be allowed to walk the trips > table. > > Which one is better? None of them. I think we can find a third solution where the changes are self contained in the ACPI driver. What do you think?
Hi Daniel, On Wed, Aug 2, 2023 at 2:34 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote: > > > Hi Rafael, > > On 01/08/2023 21:02, Rafael J. Wysocki wrote: > > On Tue, Aug 1, 2023 at 8:29 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote: > >> > >> On 25/07/2023 14:04, Rafael J. Wysocki wrote: > >>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > >>> > >>> Some drivers need to update trip point data (temperature and/or > >>> hysteresis) upon notifications from the platform firmware or they > >>> may need to reprogram hardware when trip point parameters are changed > >>> via sysfs. For those purposes, they need to connect struct thermal_trip > >>> to a private data set associated with the trip or the other way around > >>> and using a trip point index for that may not always work, because the > >>> core may need to reorder the trips during thermal zone registration (in > >>> particular, they may need to be sorted). > >>> > >>> To allow that to be done without using a trip point index, introduce > >>> a new field in struct thermal_trip that can be pointed by the driver > >>> to its own data structure containing a trip pointer to be initialized > >>> by the core during thermal zone registration. That pointer will then > >>> have to be updated by the core every time the location of the given > >>> trip point object in memory changes. > >>> > >>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > >>> --- > >>> > >>> v2 -> v3: No changes. > >>> > >>> v1 -> v2: No changes. > >>> > >>> --- > >>> drivers/thermal/thermal_core.c | 20 +++++++++++++++++--- > >>> include/linux/thermal.h | 13 +++++++++++++ > >>> 2 files changed, 30 insertions(+), 3 deletions(-) > >>> > >>> Index: linux-pm/include/linux/thermal.h > >>> =================================================================== > >>> --- linux-pm.orig/include/linux/thermal.h > >>> +++ linux-pm/include/linux/thermal.h > >>> @@ -76,16 +76,29 @@ struct thermal_zone_device_ops { > >>> void (*critical)(struct thermal_zone_device *); > >>> }; > >>> > >>> +struct thermal_trip_ref { > >>> + struct thermal_trip *trip; > >>> +}; > >> > >> That introduces a circular dependency. That should be avoided. > > > > Sorry, but this is an empty statement without any substance. > > I'm just pointing that we have a struct A pointing to struct B and > struct B pointing to struct A. Why is this a problem in general? There are cases in which struct A needs to be found given struct B (like in the ACPI thermal case, when the driver needs to get to trips[i] from its local data) and there are cases in which struct B needs to be found given struct A (like when a driver's callback is invoked and passed a trip pointer, so the driver needs to get to its local data from it - arguably this is not the case right now, but I suppose it will be the case in the future). > [ ... ] > > >>> struct thermal_cooling_device_ops { > >>> Index: linux-pm/drivers/thermal/thermal_core.c > >>> =================================================================== > >>> --- linux-pm.orig/drivers/thermal/thermal_core.c > >>> +++ linux-pm/drivers/thermal/thermal_core.c > >>> @@ -1306,14 +1306,28 @@ thermal_zone_device_register_with_trips( > >>> if (result) > >>> goto release_device; > >>> > >>> + mutex_lock(&tz->lock); > >>> + > >>> for (count = 0; count < num_trips; count++) { > >>> - struct thermal_trip trip; > >>> + int temperature = 0; > >>> + > >>> + if (trips) { > >>> + temperature = trips[count].temperature; > >>> + if (trips[count].driver_ref) > >>> + trips[count].driver_ref->trip = &trips[count]; > >>> + } else { > >>> + struct thermal_trip trip; > >> > >> As mentioned above, that should not appear in the thermal core code. > > > > Well, this is a matter of opinion to me. Clearly, I disagree with it. > > Why? It is not an opinion. So what's wrong with it, technically? What's broken by it? Why does it make the code more difficult to maintain? > The thermal core code has been very very tied > with the ACPI implementation (which is logical given the history of the > changes). All the efforts have been made to cut these frictions and make > the thermal core code driver agnostic. > > The changes put in place a mechanism for the ACPI driver. Not really, for all drivers that have local trip data and need to get to trips[i] from there and/or the other way around. > The thermal zone lock wrapper is put in place for the ACPI driver. Yes, it is, because that's the most straightforward way to address the use case at hand IMV. > > Anyway, I want to be productive, so here's the thing: either something > > like this is done, or drivers need to be allowed to walk the trips > > table. > > > > Which one is better? > > None of them. I think we can find a third solution where the changes are > self contained in the ACPI driver. What do you think? The ACPI thermal driver needs to update trip point temperatures at times. For this purpose, it needs to get from its local trip data to trip[i] somehow. Creating a new trips[] array and handing it over to the core is not an option, because it potentially breaks the thermal device binding to the zone (in which trip indices are used, mind you). So how exactly do you want the driver to do the above? It could save a pointer to each trips[i] in its local data structures before registering the zone, but then if the core reordered the trips, those pointers would become stale. So how?
On 02/08/2023 15:03, Rafael J. Wysocki wrote: [ ... ] >>>>> +struct thermal_trip_ref { >>>>> + struct thermal_trip *trip; >>>>> +}; >>>> >>>> That introduces a circular dependency. That should be avoided. >>> >>> Sorry, but this is an empty statement without any substance. >> >> I'm just pointing that we have a struct A pointing to struct B and >> struct B pointing to struct A. > > Why is this a problem in general? Cyclic dependencies are often a sign of a design problem. > There are cases in which struct A needs to be found given struct B > (like in the ACPI thermal case, when the driver needs to get to > trips[i] from its local data) and there are cases in which struct B > needs to be found given struct A (like when a driver's callback is > invoked and passed a trip pointer, so the driver needs to get to its > local data from it - arguably this is not the case right now, but I > suppose it will be the case in the future). > >> [ ... ] >> >>>>> struct thermal_cooling_device_ops { >>>>> Index: linux-pm/drivers/thermal/thermal_core.c >>>>> =================================================================== >>>>> --- linux-pm.orig/drivers/thermal/thermal_core.c >>>>> +++ linux-pm/drivers/thermal/thermal_core.c >>>>> @@ -1306,14 +1306,28 @@ thermal_zone_device_register_with_trips( >>>>> if (result) >>>>> goto release_device; >>>>> >>>>> + mutex_lock(&tz->lock); >>>>> + >>>>> for (count = 0; count < num_trips; count++) { >>>>> - struct thermal_trip trip; >>>>> + int temperature = 0; >>>>> + >>>>> + if (trips) { >>>>> + temperature = trips[count].temperature; >>>>> + if (trips[count].driver_ref) >>>>> + trips[count].driver_ref->trip = &trips[count]; >>>>> + } else { >>>>> + struct thermal_trip trip; >>>> >>>> As mentioned above, that should not appear in the thermal core code. >>> >>> Well, this is a matter of opinion to me. Clearly, I disagree with it. >> >> Why? It is not an opinion. > > So what's wrong with it, technically? What's broken by it? Why does > it make the code more difficult to maintain? >> The thermal core code has been very very tied >> with the ACPI implementation (which is logical given the history of the >> changes). All the efforts have been made to cut these frictions and make >> the thermal core code driver agnostic. >> >> The changes put in place a mechanism for the ACPI driver. > > Not really, for all drivers that have local trip data and need to get > to trips[i] from there and/or the other way around. > >> The thermal zone lock wrapper is put in place for the ACPI driver. > > Yes, it is, because that's the most straightforward way to address the > use case at hand IMV. > >>> Anyway, I want to be productive, so here's the thing: either something >>> like this is done, or drivers need to be allowed to walk the trips >>> table. >>> >>> Which one is better? >> >> None of them. I think we can find a third solution where the changes are >> self contained in the ACPI driver. What do you think? > > The ACPI thermal driver needs to update trip point temperatures at > times. For this purpose, it needs to get from its local trip data to > trip[i] somehow. > > Creating a new trips[] array and handing it over to the core is not an > option, because it potentially breaks the thermal device binding to > the zone (in which trip indices are used, mind you). > > So how exactly do you want the driver to do the above? > > It could save a pointer to each trips[i] in its local data structures > before registering the zone, but then if the core reordered the trips, > those pointers would become stale. > > So how? Let me check if I can do something on top of your series to move it in the ACPI driver.
On Wed, Aug 2, 2023 at 5:50 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote: > > On 02/08/2023 15:03, Rafael J. Wysocki wrote: > > [ ... ] > > >>>>> +struct thermal_trip_ref { > >>>>> + struct thermal_trip *trip; > >>>>> +}; > >>>> > >>>> That introduces a circular dependency. That should be avoided. > >>> > >>> Sorry, but this is an empty statement without any substance. > >> > >> I'm just pointing that we have a struct A pointing to struct B and > >> struct B pointing to struct A. > > > > Why is this a problem in general? > > Cyclic dependencies are often a sign of a design problem. > > > There are cases in which struct A needs to be found given struct B > > (like in the ACPI thermal case, when the driver needs to get to > > trips[i] from its local data) and there are cases in which struct B > > needs to be found given struct A (like when a driver's callback is > > invoked and passed a trip pointer, so the driver needs to get to its > > local data from it - arguably this is not the case right now, but I > > suppose it will be the case in the future). > > > >> [ ... ] > >> > >>>>> struct thermal_cooling_device_ops { > >>>>> Index: linux-pm/drivers/thermal/thermal_core.c > >>>>> =================================================================== > >>>>> --- linux-pm.orig/drivers/thermal/thermal_core.c > >>>>> +++ linux-pm/drivers/thermal/thermal_core.c > >>>>> @@ -1306,14 +1306,28 @@ thermal_zone_device_register_with_trips( > >>>>> if (result) > >>>>> goto release_device; > >>>>> > >>>>> + mutex_lock(&tz->lock); > >>>>> + > >>>>> for (count = 0; count < num_trips; count++) { > >>>>> - struct thermal_trip trip; > >>>>> + int temperature = 0; > >>>>> + > >>>>> + if (trips) { > >>>>> + temperature = trips[count].temperature; > >>>>> + if (trips[count].driver_ref) > >>>>> + trips[count].driver_ref->trip = &trips[count]; > >>>>> + } else { > >>>>> + struct thermal_trip trip; > >>>> > >>>> As mentioned above, that should not appear in the thermal core code. > >>> > >>> Well, this is a matter of opinion to me. Clearly, I disagree with it. > >> > >> Why? It is not an opinion. > > > > So what's wrong with it, technically? What's broken by it? Why does > > it make the code more difficult to maintain? > > > > >> The thermal core code has been very very tied > >> with the ACPI implementation (which is logical given the history of the > >> changes). All the efforts have been made to cut these frictions and make > >> the thermal core code driver agnostic. > >> > >> The changes put in place a mechanism for the ACPI driver. > > > > Not really, for all drivers that have local trip data and need to get > > to trips[i] from there and/or the other way around. > > > >> The thermal zone lock wrapper is put in place for the ACPI driver. > > > > Yes, it is, because that's the most straightforward way to address the > > use case at hand IMV. > > > >>> Anyway, I want to be productive, so here's the thing: either something > >>> like this is done, or drivers need to be allowed to walk the trips > >>> table. > >>> > >>> Which one is better? > >> > >> None of them. I think we can find a third solution where the changes are > >> self contained in the ACPI driver. What do you think? > > > > The ACPI thermal driver needs to update trip point temperatures at > > times. For this purpose, it needs to get from its local trip data to > > trip[i] somehow. > > > > Creating a new trips[] array and handing it over to the core is not an > > option, because it potentially breaks the thermal device binding to > > the zone (in which trip indices are used, mind you). > > > > So how exactly do you want the driver to do the above? > > > > It could save a pointer to each trips[i] in its local data structures > > before registering the zone, but then if the core reordered the trips, > > those pointers would become stale. > > > > So how? > > Let me check if I can do something on top of your series to move it in > the ACPI driver. It doesn't need to be on top of my series, so if you have an idea, please just let me know what it is. It can't be entirely in the ACPI driver AFAICS, though, because trips[i] need to be modified on updates and they belong to the core. Hence, the driver needs some help from the core to get to them. It can be something like "this is my trip tag and please give me the address of the trip matching it" or similar, but it is needed, because the driver has to assume that the trip indices used by it initially may change.
On 02/08/2023 18:48, Rafael J. Wysocki wrote: [ ... ] >> Let me check if I can do something on top of your series to move it in >> the ACPI driver. > > It doesn't need to be on top of my series, so if you have an idea, > please just let me know what it is. > > It can't be entirely in the ACPI driver AFAICS, though, because > trips[i] need to be modified on updates and they belong to the core. > Hence, the driver needs some help from the core to get to them. It > can be something like "this is my trip tag and please give me the > address of the trip matching it" or similar, but it is needed, because > the driver has to assume that the trip indices used by it initially > may change. May be I'm missing something but driver_ref does not seems to be used except when assigning it, no?
On Thu, Aug 3, 2023 at 3:06 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote: > > On 02/08/2023 18:48, Rafael J. Wysocki wrote: > > [ ... ] > > >> Let me check if I can do something on top of your series to move it in > >> the ACPI driver. > > > > It doesn't need to be on top of my series, so if you have an idea, > > please just let me know what it is. > > > > It can't be entirely in the ACPI driver AFAICS, though, because > > trips[i] need to be modified on updates and they belong to the core. > > Hence, the driver needs some help from the core to get to them. It > > can be something like "this is my trip tag and please give me the > > address of the trip matching it" or similar, but it is needed, because > > the driver has to assume that the trip indices used by it initially > > may change. > > May be I'm missing something but driver_ref does not seems to be used > except when assigning it, no? It is used on the other side. That is, the value assigned to the trip field in it is accessed via trip_ref in the driver. The idea is that the driver puts a pointer to its local struct thermal_trip_ref into a struct thermal_trip and the core stores the address of that struct thermal_trip in there, which allows the driver to access the struct thermal_trip via its local struct thermal_trip_ref going forward. Admittedly, this is somewhat convoluted. I have an alternative approach in the works, just for illustration purposes if nothing else, but I have encountered a problem that I would like to ask you about. Namely, zone disabling is not particularly useful for preventing the zone from being used while the trips are updated, because it has side effects. First, it triggers __thermal_zone_device_update() and a netlink message every time the mode changes, which can be kind of overcome. But second, if the mode is "disabled", it does not actually prevent things like __thermal_zone_get_trip() from running and the zone lock is the only thing that can be used for that AFAICS. So by "disabling" a thermal zone, did you mean changing its mode to "disabled" or something else?
On 03/08/2023 16:15, Rafael J. Wysocki wrote: > On Thu, Aug 3, 2023 at 3:06 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote: >> >> On 02/08/2023 18:48, Rafael J. Wysocki wrote: >> >> [ ... ] >> >>>> Let me check if I can do something on top of your series to move it in >>>> the ACPI driver. >>> >>> It doesn't need to be on top of my series, so if you have an idea, >>> please just let me know what it is. >>> >>> It can't be entirely in the ACPI driver AFAICS, though, because >>> trips[i] need to be modified on updates and they belong to the core. >>> Hence, the driver needs some help from the core to get to them. It >>> can be something like "this is my trip tag and please give me the >>> address of the trip matching it" or similar, but it is needed, because >>> the driver has to assume that the trip indices used by it initially >>> may change. >> >> May be I'm missing something but driver_ref does not seems to be used >> except when assigning it, no? > > It is used on the other side. That is, the value assigned to the trip > field in it is accessed via trip_ref in the driver. > > The idea is that the driver puts a pointer to its local struct > thermal_trip_ref into a struct thermal_trip and the core stores the > address of that struct thermal_trip in there, which allows the driver > to access the struct thermal_trip via its local struct > thermal_trip_ref going forward. > > Admittedly, this is somewhat convoluted. > > I have an alternative approach in the works, just for illustration > purposes if nothing else, but I have encountered a problem that I > would like to ask you about. > > Namely, zone disabling is not particularly useful for preventing the > zone from being used while the trips are updated, because it has side > effects. First, it triggers __thermal_zone_device_update() and a > netlink message every time the mode changes, which can be kind of > overcome. Right > But second, if the mode is "disabled", it does not actually > prevent things like __thermal_zone_get_trip() from running and the > zone lock is the only thing that can be used for that AFAICS. > > So by "disabling" a thermal zone, did you mean changing its mode to > "disabled" or something else? Yes, that is what I meant. May be the initial proposal by updating the thermal trips pointer can solve that [1] IMO we can assume the trip point changes are very rare (if any), so rebuilding a new trip array and update the thermal zone with the pointer may solve the situation. The routine does a copy of the trips array, so it can reorder it without impacting the array passed as a parameter. And it can take the lock. We just have to constraint the update function to invalidate arrays with a number of trip points different from the one initially passed when creating the thermal zone. Alternatively, we can be smarter in the ACPI driver and update the corresponding temperature+hysteresis trip point by using the thermal_zone_set_trip() function. [1] https://lore.kernel.org/all/20230525140135.3589917-5-daniel.lezcano@linaro.org/
On Thu, Aug 3, 2023 at 6:20 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote: > > On 03/08/2023 16:15, Rafael J. Wysocki wrote: > > On Thu, Aug 3, 2023 at 3:06 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote: > >> > >> On 02/08/2023 18:48, Rafael J. Wysocki wrote: > >> > >> [ ... ] > >> > >>>> Let me check if I can do something on top of your series to move it in > >>>> the ACPI driver. > >>> > >>> It doesn't need to be on top of my series, so if you have an idea, > >>> please just let me know what it is. > >>> > >>> It can't be entirely in the ACPI driver AFAICS, though, because > >>> trips[i] need to be modified on updates and they belong to the core. > >>> Hence, the driver needs some help from the core to get to them. It > >>> can be something like "this is my trip tag and please give me the > >>> address of the trip matching it" or similar, but it is needed, because > >>> the driver has to assume that the trip indices used by it initially > >>> may change. > >> > >> May be I'm missing something but driver_ref does not seems to be used > >> except when assigning it, no? > > > > It is used on the other side. That is, the value assigned to the trip > > field in it is accessed via trip_ref in the driver. > > > > The idea is that the driver puts a pointer to its local struct > > thermal_trip_ref into a struct thermal_trip and the core stores the > > address of that struct thermal_trip in there, which allows the driver > > to access the struct thermal_trip via its local struct > > thermal_trip_ref going forward. > > > > Admittedly, this is somewhat convoluted. > > > > I have an alternative approach in the works, just for illustration > > purposes if nothing else, but I have encountered a problem that I > > would like to ask you about. > > > > Namely, zone disabling is not particularly useful for preventing the > > zone from being used while the trips are updated, because it has side > > effects. First, it triggers __thermal_zone_device_update() and a > > netlink message every time the mode changes, which can be kind of > > overcome. > > Right > > > But second, if the mode is "disabled", it does not actually > > prevent things like __thermal_zone_get_trip() from running and the > > zone lock is the only thing that can be used for that AFAICS. > > > > So by "disabling" a thermal zone, did you mean changing its mode to > > "disabled" or something else? > > Yes, that is what I meant. > > May be the initial proposal by updating the thermal trips pointer can > solve that [1] No, it can't. An existing trips[] table cannot be replaced with a new one with different trip indices, because those indices are already in use. And if the indices are the same, there's no reason to replace trips. > IMO we can assume the trip point changes are very rare (if any), so > rebuilding a new trip array and update the thermal zone with the pointer > may solve the situation. > > The routine does a copy of the trips array, so it can reorder it without > impacting the array passed as a parameter. And it can take the lock. The driver can take a lock as well. Forbidding drivers to use the zone lock is an artificial limitation without technical merit IMV. > We just have to constraint the update function to invalidate arrays with > a number of trip points different from the one initially passed when > creating the thermal zone. > > Alternatively, we can be smarter in the ACPI driver and update the > corresponding temperature+hysteresis trip point by using the > thermal_zone_set_trip() function. I don't see why this would make any difference. > [1] > https://lore.kernel.org/all/20230525140135.3589917-5-daniel.lezcano@linaro.org/
On 03/08/2023 21:58, Rafael J. Wysocki wrote: > On Thu, Aug 3, 2023 at 6:20 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote: >> >> On 03/08/2023 16:15, Rafael J. Wysocki wrote: >>> On Thu, Aug 3, 2023 at 3:06 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote: >>>> >>>> On 02/08/2023 18:48, Rafael J. Wysocki wrote: >>>> >>>> [ ... ] >>>> >>>>>> Let me check if I can do something on top of your series to move it in >>>>>> the ACPI driver. >>>>> >>>>> It doesn't need to be on top of my series, so if you have an idea, >>>>> please just let me know what it is. >>>>> >>>>> It can't be entirely in the ACPI driver AFAICS, though, because >>>>> trips[i] need to be modified on updates and they belong to the core. >>>>> Hence, the driver needs some help from the core to get to them. It >>>>> can be something like "this is my trip tag and please give me the >>>>> address of the trip matching it" or similar, but it is needed, because >>>>> the driver has to assume that the trip indices used by it initially >>>>> may change. >>>> >>>> May be I'm missing something but driver_ref does not seems to be used >>>> except when assigning it, no? >>> >>> It is used on the other side. That is, the value assigned to the trip >>> field in it is accessed via trip_ref in the driver. >>> >>> The idea is that the driver puts a pointer to its local struct >>> thermal_trip_ref into a struct thermal_trip and the core stores the >>> address of that struct thermal_trip in there, which allows the driver >>> to access the struct thermal_trip via its local struct >>> thermal_trip_ref going forward. >>> >>> Admittedly, this is somewhat convoluted. >>> >>> I have an alternative approach in the works, just for illustration >>> purposes if nothing else, but I have encountered a problem that I >>> would like to ask you about. >>> >>> Namely, zone disabling is not particularly useful for preventing the >>> zone from being used while the trips are updated, because it has side >>> effects. First, it triggers __thermal_zone_device_update() and a >>> netlink message every time the mode changes, which can be kind of >>> overcome. >> >> Right >> >>> But second, if the mode is "disabled", it does not actually >>> prevent things like __thermal_zone_get_trip() from running and the >>> zone lock is the only thing that can be used for that AFAICS. >> > >>> So by "disabling" a thermal zone, did you mean changing its mode to >>> "disabled" or something else? >> >> Yes, that is what I meant. >> >> May be the initial proposal by updating the thermal trips pointer can >> solve that [1] > > No, it can't. An existing trips[] table cannot be replaced with a new > one with different trip indices, because those indices are already in > use. And if the indices are the same, there's no reason to replace > trips. > >> IMO we can assume the trip point changes are very rare (if any), so >> rebuilding a new trip array and update the thermal zone with the pointer >> may solve the situation. >> >> The routine does a copy of the trips array, so it can reorder it without >> impacting the array passed as a parameter. And it can take the lock. > > The driver can take a lock as well. Forbidding drivers to use the > zone lock is an artificial limitation without technical merit IMV. Yes, it is technically possible to take a lock from a driver. However, from a higher perspective, we have a core framework which is self-contained and we have a back-end which forces us to export this lock. Even if it is possible, it is not desirable because we break the self-containment and thus that will make future changes in the core framework complicated because of the interactions with back-end drivers. I'm not putting in question your changes in general but just want to keep the direction of having the core framework and the drivers interacting with the ops and a few high level functions where the core framework handle the logic. The clocksource/clockevent drivers are an example on how the time framework and the drivers are clearly separated. >> We just have to constraint the update function to invalidate arrays with >> a number of trip points different from the one initially passed when >> creating the thermal zone. >> >> Alternatively, we can be smarter in the ACPI driver and update the >> corresponding temperature+hysteresis trip point by using the >> thermal_zone_set_trip() function. > > I don't see why this would make any difference. The function thermal_zone_set_trip() takes the lock.
On Fri, Aug 4, 2023 at 10:17 AM Daniel Lezcano <daniel.lezcano@linaro.org> wrote: > > On 03/08/2023 21:58, Rafael J. Wysocki wrote: > > On Thu, Aug 3, 2023 at 6:20 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote: > >> > >> On 03/08/2023 16:15, Rafael J. Wysocki wrote: > >>> On Thu, Aug 3, 2023 at 3:06 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote: > >>>> > >>>> On 02/08/2023 18:48, Rafael J. Wysocki wrote: > >>>> > >>>> [ ... ] > >>>> > >>>>>> Let me check if I can do something on top of your series to move it in > >>>>>> the ACPI driver. > >>>>> > >>>>> It doesn't need to be on top of my series, so if you have an idea, > >>>>> please just let me know what it is. > >>>>> > >>>>> It can't be entirely in the ACPI driver AFAICS, though, because > >>>>> trips[i] need to be modified on updates and they belong to the core. > >>>>> Hence, the driver needs some help from the core to get to them. It > >>>>> can be something like "this is my trip tag and please give me the > >>>>> address of the trip matching it" or similar, but it is needed, because > >>>>> the driver has to assume that the trip indices used by it initially > >>>>> may change. > >>>> > >>>> May be I'm missing something but driver_ref does not seems to be used > >>>> except when assigning it, no? > >>> > >>> It is used on the other side. That is, the value assigned to the trip > >>> field in it is accessed via trip_ref in the driver. > >>> > >>> The idea is that the driver puts a pointer to its local struct > >>> thermal_trip_ref into a struct thermal_trip and the core stores the > >>> address of that struct thermal_trip in there, which allows the driver > >>> to access the struct thermal_trip via its local struct > >>> thermal_trip_ref going forward. > >>> > >>> Admittedly, this is somewhat convoluted. > >>> > >>> I have an alternative approach in the works, just for illustration > >>> purposes if nothing else, but I have encountered a problem that I > >>> would like to ask you about. > >>> > >>> Namely, zone disabling is not particularly useful for preventing the > >>> zone from being used while the trips are updated, because it has side > >>> effects. First, it triggers __thermal_zone_device_update() and a > >>> netlink message every time the mode changes, which can be kind of > >>> overcome. > >> > >> Right > >> > >>> But second, if the mode is "disabled", it does not actually > >>> prevent things like __thermal_zone_get_trip() from running and the > >>> zone lock is the only thing that can be used for that AFAICS. > >> > > >>> So by "disabling" a thermal zone, did you mean changing its mode to > >>> "disabled" or something else? > >> > >> Yes, that is what I meant. > >> > >> May be the initial proposal by updating the thermal trips pointer can > >> solve that [1] > > > > No, it can't. An existing trips[] table cannot be replaced with a new > > one with different trip indices, because those indices are already in > > use. And if the indices are the same, there's no reason to replace > > trips. > > > >> IMO we can assume the trip point changes are very rare (if any), so > >> rebuilding a new trip array and update the thermal zone with the pointer > >> may solve the situation. > >> > >> The routine does a copy of the trips array, so it can reorder it without > >> impacting the array passed as a parameter. And it can take the lock. > > > > The driver can take a lock as well. Forbidding drivers to use the > > zone lock is an artificial limitation without technical merit IMV. > > Yes, it is technically possible to take a lock from a driver. However, > from a higher perspective, we have a core framework which is > self-contained and we have a back-end which forces us to export this lock. > > Even if it is possible, it is not desirable because we break the > self-containment and thus that will make future changes in the core > framework complicated because of the interactions with back-end drivers. So the counter argument here is that using the zone lock directly in the driver is the most straightforward way of addressing the use case at hand, which is the need to update trip points in a non-racy way. Everything else is more complex and the reasons for adding the extra complexity can be questioned. Self-containment is nice, but in some cases it is just not worth enforcing it all the way through at the cost of increased code complexity. Anyway, I'm going to post a new version of this patch series later today which uses a somewhat different approach. It is a bit more complex, but maybe this is fine.
Index: linux-pm/include/linux/thermal.h =================================================================== --- linux-pm.orig/include/linux/thermal.h +++ linux-pm/include/linux/thermal.h @@ -76,16 +76,29 @@ struct thermal_zone_device_ops { void (*critical)(struct thermal_zone_device *); }; +struct thermal_trip_ref { + struct thermal_trip *trip; +}; + /** * struct thermal_trip - representation of a point in temperature domain * @temperature: temperature value in miliCelsius * @hysteresis: relative hysteresis in miliCelsius * @type: trip point type + * @driver_ref: driver's reference to this trip point + * + * If @driver_ref is not NULL, the trip pointer in the object pointed to by it + * will be initialized by the core during thermal zone registration and updated + * whenever the location of the given trip object changes. This allows the + * driver to access the trip point data without knowing the relative ordering + * of trips within the trip table used by the core and, given a trip pointer, + * to get back to its private data associated with the given trip. */ struct thermal_trip { int temperature; int hysteresis; enum thermal_trip_type type; + struct thermal_trip_ref *driver_ref; }; struct thermal_cooling_device_ops { Index: linux-pm/drivers/thermal/thermal_core.c =================================================================== --- linux-pm.orig/drivers/thermal/thermal_core.c +++ linux-pm/drivers/thermal/thermal_core.c @@ -1306,14 +1306,28 @@ thermal_zone_device_register_with_trips( if (result) goto release_device; + mutex_lock(&tz->lock); + for (count = 0; count < num_trips; count++) { - struct thermal_trip trip; + int temperature = 0; + + if (trips) { + temperature = trips[count].temperature; + if (trips[count].driver_ref) + trips[count].driver_ref->trip = &trips[count]; + } else { + struct thermal_trip trip; - result = thermal_zone_get_trip(tz, count, &trip); - if (result || !trip.temperature) + result = __thermal_zone_get_trip(tz, count, &trip); + if (!result) + temperature = trip.temperature; + } + if (!temperature) set_bit(count, &tz->trips_disabled); } + mutex_unlock(&tz->lock); + /* Update 'this' zone's governor information */ mutex_lock(&thermal_governor_lock);