Message ID | 1456319674.2867.15.camel@linaro.org |
---|---|
State | New |
Headers | show |
On Tue, 2016-03-08 at 09:42 +0530, Vinod Koul wrote: > On Wed, Feb 24, 2016 at 01:14:34PM +0000, Jon Medhurst (Tixy) wrote: > > The residue calculation in pl330_tx_status doesn't handle transitional > > states that occur at the time one descriptor (A) is completed and the > > next (B) is started. Specifically, both A and B can simultaneously be in > > the BUSY state and at this time the thread's 'req_running' may (or may > > not) be -1. > > you are under lock so descriptor state wont be update while we are it. > > Also the query for residue is for "a descriptor" not whatever is the current > running descriptor... > > > > > To cope with this situation we change the code to ensure A is treated as > > complete and B as having not yet started. Prior to the change, the code > > would calculate a transferred byte count as if both A and B had > > completed. > > You query for either A or B not both! I've probably been using wrong/ambiguous terminology... In my description I'm using 'descriptor' to refer to a 'struct dma_pl330_desc', I guess other people assume 'struct dma_async_tx_descriptor'? The situation I was debugging was audio playback, where ASoC ends up calling pl330_prep_dma_cyclic() with a period one quarter the length of the buffer it is using, so that results in four dma_pl330_desc 'descriptors' being created to cover that buffer. These later get submitted to a DMA channel (struct dma_pl330_chan) which has a list of these that it is processing (the 'work_list'). The residual calculation that currently exists in pl08x_dma_tx_status() is iterating this work_list and summing the length of currently transferring 'descriptor' with those later pending ones. I believe that is correct behaviour because these 'descriptors' (dma_pl330_desc) are all internal implementation details of the driver, and the dmaengine API's are dealing in units of 'dma_async_tx_descriptor' ? If the current code is OK in this regard, it is definitely buggy because it doesn't cope with the situation when two dma_pl330_desc's are in the state 'BUSY' a, which I have seen occur when debugging this issue, had worked out can happen by analysing the code, and is acknowledged by the in-source comments for enum desc_status... /* * Sitting on the work_list and already submitted * to the PL330 core. Not more than two descriptors * of a channel can be BUSY at any time. */ BUSY, In my problematic usecase I have userside code calling ALSA ioctls to poll the current audio playback position which results in pl08x_dma_tx_status() being called multiple times a second. After only a second or two the buggy situation gets hit, resulting in a miscalculation that ASoC interprets as a buffer underflow and so it stops the stream. I spent several days debugging this, with enough ad hoc tests and printk's littered everywhere to be very confident as to how things are going wrong - what I'm not not totally confident of is how things should be properly fixed. This patch appears to fix the situation that I was hitting, but it really looks like there isn't any locking that prevent this polling use of pl08x_dma_tx_status() from happening concurrently with the irq handler reprogramming the hardware for the next dma_pl330_desc. I didn't attempt any fix for that for fear of introducing bugs in what looks like complex code, and because it's not a problem I saw happen in practice. -- Tixy > > > > > Fixes: aee4d1fac887 ("dmaengine: pl330: improve pl330_tx_status() function") > > > > Signed-off-by: Jon Medhurst <tixy@linaro.org> > > --- > > > > I discovered this issue when trying to work out why audio stopped > > working on ARM's Juno platform and bisected it to commit aee4d1fac887. > > Whilst this patch seems to fix the problems I was seeing, I can't help > > but think there are more race conditions with this code. E.g. if the > > running descriptor changes under us, pl330_get_current_xferred_count > > can end up reading values from hardware that relate to a different > > descriptor. And if we're really unlucky, the reading of the 'val' and > > 'addr' values in pl330_get_current_xferred_count can come from different > > descriptors. I don't know if there is any locks we can use to prevent > > such races or if we need to try and detect when things have changed and > > redo/abort the residue calculation... > > > > drivers/dma/pl330.c | 24 ++++++++++++++++++++---- > > 1 file changed, 20 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c > > index 17ee758..55e3c5f 100644 > > --- a/drivers/dma/pl330.c > > +++ b/drivers/dma/pl330.c > > @@ -2240,6 +2240,7 @@ pl330_tx_status(struct dma_chan *chan, dma_cookie_t cookie, > > struct dma_pl330_desc *desc, *running = NULL; > > struct dma_pl330_chan *pch = to_pchan(chan); > > unsigned int transferred, residual = 0; > > + bool first_busy; > > > > ret = dma_cookie_status(chan, cookie, txstate); > > > > @@ -2253,16 +2254,31 @@ pl330_tx_status(struct dma_chan *chan, dma_cookie_t cookie, > > > > if (pch->thread->req_running != -1) > > running = pch->thread->req[pch->thread->req_running].desc; > > + first_busy = true; > > > > /* Check in pending list */ > > list_for_each_entry(desc, &pch->work_list, node) { > > if (desc->status == DONE) > > transferred = desc->bytes_requested; > > - else if (running && desc == running) > > - transferred = > > - pl330_get_current_xferred_count(pch, desc); > > - else > > + else if (desc->status == BUSY && first_busy) { > > + first_busy = false; > > + if (running && desc == running) { > > + transferred = > > + pl330_get_current_xferred_count(pch, desc); > > + } else { > > + /* BUSY but not running means it's just completed */ > > + transferred = desc->bytes_requested; > > + } > > + } else { > > + /* > > + * Descriptor is either in PREP state queued for future > > + * transfer or it is the second BUSY descriptor we have > > + * seen. The latter case means it has just, or is about > > + * to be, started, so treat it as having not yet > > + * transferred any bytes, the same as PREP. > > + */ > > transferred = 0; > > + } > > residual += desc->bytes_requested - transferred; > > if (desc->txd.cookie == cookie) { > > switch (desc->status) { > > -- > > 2.1.4 > > > > >
On Tue, 2016-03-08 at 19:45 +0530, Vinod Koul wrote: > On Tue, Mar 08, 2016 at 10:40:11AM +0000, Jon Medhurst (Tixy) wrote: > > On Tue, 2016-03-08 at 09:42 +0530, Vinod Koul wrote: > > > On Wed, Feb 24, 2016 at 01:14:34PM +0000, Jon Medhurst (Tixy) wrote: > > > > The residue calculation in pl330_tx_status doesn't handle transitional > > > > states that occur at the time one descriptor (A) is completed and the > > > > next (B) is started. Specifically, both A and B can simultaneously be in > > > > the BUSY state and at this time the thread's 'req_running' may (or may > > > > not) be -1. > > > > > > you are under lock so descriptor state wont be update while we are it. > > > > > > Also the query for residue is for "a descriptor" not whatever is the current > > > running descriptor... > > > > > > > > > > > To cope with this situation we change the code to ensure A is treated as > > > > complete and B as having not yet started. Prior to the change, the code > > > > would calculate a transferred byte count as if both A and B had > > > > completed. > > > > > > You query for either A or B not both! > > > > I've probably been using wrong/ambiguous terminology... > > > > In my description I'm using 'descriptor' to refer to a 'struct > > dma_pl330_desc', I guess other people assume 'struct > > dma_async_tx_descriptor'? > > Nope > > > The situation I was debugging was audio playback, where ASoC ends up > > calling pl330_prep_dma_cyclic() with a period one quarter the length of > > the buffer it is using, so that results in four dma_pl330_desc > > 'descriptors' being created to cover that buffer. These later get > > submitted to a DMA channel (struct dma_pl330_chan) which has a list of > > these that it is processing (the 'work_list'). > > > > The residual calculation that currently exists in pl08x_dma_tx_status() > > is iterating this work_list and summing the length of currently > > transferring 'descriptor' with those later pending ones. I believe that > > is correct behaviour because these 'descriptors' (dma_pl330_desc) are > > all internal implementation details of the driver, and the dmaengine > > API's are dealing in units of 'dma_async_tx_descriptor' ? > > Not really. If you look closely dma_pl330_desc contains > dma_async_tx_descriptor. A descriptor represents a tranasaction and is > certainly not internal detail of your driver I think that for this driver it's kinda both, which is why it's confusing... pl330_prep_dma_cyclic(), after creating a number of dma_pl330_desc objects in a circular list, returns the address of the dma_async_tx_descriptor contained in the last one. I'm guessing that as far as users of the dmaengine API is concerned, this is treated as just a single descriptor. pl330_prep_slave_sg() creates a similar circular list. When these descriptors are later submitted to the driver for transfer, it is done by a call to pl330_tx_submit, which I has this comment... /* * We returned the last one of the circular list of descriptor(s) * from prep_xxx, so the argument to submit corresponds to the last * descriptor of the list. */ This goes on to add all the linked descriptors to the pending queue for sending to the DMA hardware, and then returns the cookie of the last descriptor, maintaining the fiction to the dmaengine API's that there is just a single descriptor. > > The reside is requested for "a descriptor". For example if you have prepared > two descriptors A and B and submitted them, then you can request status and > reside for A and you need to calculate that for A only and not take into > account status of B But, in the case of the pl330 driver, A and B may each consist of multiple internal/hidden descriptors. So the residue calculation has to sum up the residue of all these internal/hidden descriptors as well. This is what the current pl330_tx_status() function does, but has bugs. I've only just managed to clearly understand all the above details whilst writing this email, and this confusion obviously means the code and any commit messages need to explain things better. -- Tixy > > > > > If the current code is OK in this regard, it is definitely buggy because > > it doesn't cope with the situation when two dma_pl330_desc's are in the > > state 'BUSY' a, which I have seen occur when debugging this issue, had > > worked out can happen by analysing the code, and is acknowledged by the > > in-source comments for enum desc_status... > > > > /* > > * Sitting on the work_list and already submitted > > * to the PL330 core. Not more than two descriptors > > * of a channel can be BUSY at any time. > > */ > > BUSY, > > > > In my problematic usecase I have userside code calling ALSA ioctls to > > poll the current audio playback position which results in > > pl08x_dma_tx_status() being called multiple times a second. After only a > > second or two the buggy situation gets hit, resulting in a > > miscalculation that ASoC interprets as a buffer underflow and so it > > stops the stream. > > > > I spent several days debugging this, with enough ad hoc tests and > > printk's littered everywhere to be very confident as to how things are > > going wrong - what I'm not not totally confident of is how things should > > be properly fixed. > > There maybe a problem but this patch and the fix dont explain that > > > > > This patch appears to fix the situation that I was hitting, but it > > really looks like there isn't any locking that prevent this polling use > > of pl08x_dma_tx_status() from happening concurrently with the irq > > handler reprogramming the hardware for the next dma_pl330_desc. I didn't > > attempt any fix for that for fear of introducing bugs in what looks like > > complex code, and because it's not a problem I saw happen in practice. > > > > -- Tixy > > > > > > > > > > > > > Fixes: aee4d1fac887 ("dmaengine: pl330: improve pl330_tx_status() function") > > > > > > > > Signed-off-by: Jon Medhurst <tixy@linaro.org> > > > > --- > > > > > > > > I discovered this issue when trying to work out why audio stopped > > > > working on ARM's Juno platform and bisected it to commit aee4d1fac887. > > > > Whilst this patch seems to fix the problems I was seeing, I can't help > > > > but think there are more race conditions with this code. E.g. if the > > > > running descriptor changes under us, pl330_get_current_xferred_count > > > > can end up reading values from hardware that relate to a different > > > > descriptor. And if we're really unlucky, the reading of the 'val' and > > > > 'addr' values in pl330_get_current_xferred_count can come from different > > > > descriptors. I don't know if there is any locks we can use to prevent > > > > such races or if we need to try and detect when things have changed and > > > > redo/abort the residue calculation... > > > > > > > > drivers/dma/pl330.c | 24 ++++++++++++++++++++---- > > > > 1 file changed, 20 insertions(+), 4 deletions(-) > > > > > > > > diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c > > > > index 17ee758..55e3c5f 100644 > > > > --- a/drivers/dma/pl330.c > > > > +++ b/drivers/dma/pl330.c > > > > @@ -2240,6 +2240,7 @@ pl330_tx_status(struct dma_chan *chan, dma_cookie_t cookie, > > > > struct dma_pl330_desc *desc, *running = NULL; > > > > struct dma_pl330_chan *pch = to_pchan(chan); > > > > unsigned int transferred, residual = 0; > > > > + bool first_busy; > > > > > > > > ret = dma_cookie_status(chan, cookie, txstate); > > > > > > > > @@ -2253,16 +2254,31 @@ pl330_tx_status(struct dma_chan *chan, dma_cookie_t cookie, > > > > > > > > if (pch->thread->req_running != -1) > > > > running = pch->thread->req[pch->thread->req_running].desc; > > > > + first_busy = true; > > > > > > > > /* Check in pending list */ > > > > list_for_each_entry(desc, &pch->work_list, node) { > > > > if (desc->status == DONE) > > > > transferred = desc->bytes_requested; > > > > - else if (running && desc == running) > > > > - transferred = > > > > - pl330_get_current_xferred_count(pch, desc); > > > > - else > > > > + else if (desc->status == BUSY && first_busy) { > > > > + first_busy = false; > > > > + if (running && desc == running) { > > > > + transferred = > > > > + pl330_get_current_xferred_count(pch, desc); > > > > + } else { > > > > + /* BUSY but not running means it's just completed */ > > > > + transferred = desc->bytes_requested; > > > > + } > > > > + } else { > > > > + /* > > > > + * Descriptor is either in PREP state queued for future > > > > + * transfer or it is the second BUSY descriptor we have > > > > + * seen. The latter case means it has just, or is about > > > > + * to be, started, so treat it as having not yet > > > > + * transferred any bytes, the same as PREP. > > > > + */ > > > > transferred = 0; > > > > + } > > > > residual += desc->bytes_requested - transferred; > > > > if (desc->txd.cookie == cookie) { > > > > switch (desc->status) { > > > > -- > > > > 2.1.4 > > > > > > > > > > > > > > > >
On Fri, 2016-03-11 at 13:13 +0530, Vinod Koul wrote: > On Tue, Mar 08, 2016 at 03:50:41PM +0000, Jon Medhurst (Tixy) wrote: > > > > The reside is requested for "a descriptor". For example if you have prepared > > > two descriptors A and B and submitted them, then you can request status and > > > reside for A and you need to calculate that for A only and not take into > > > account status of B > > > > But, in the case of the pl330 driver, A and B may each consist of > > multiple internal/hidden descriptors. So the residue calculation has to > > sum up the residue of all these internal/hidden descriptors as well. > > This is what the current pl330_tx_status() function does, but has bugs. > > > > I've only just managed to clearly understand all the above details > > whilst writing this email, and this confusion obviously means the code > > and any commit messages need to explain things better. > > Okay I relooked at the code, the driver is creating descriptor per period, > so yes while calculating that should be taken into account but only for > cyclic case and not for rest, as it will break reside values for for them. So not for scatter gather lists passed to pl330_prep_slave_sg? It seems to me, that as for the cyclic case, the descriptors used for each element in the list are equally internal to the pl330 driver and not exposed to users. pl330_tx_submit() only returns the cookie for the last descriptor, how do users get the cookies for the other elements so they can pass them to pl330_tx_status()? To me, the implementation of the pl330 driver looks consistent with the APIs I can see in header files and the Documentation directory, namely: - dmaengine_prep_* functions prepare a DMA transactions and return a descriptor (struct dma_async_tx_descriptor) - The transaction is queued for processing by passing that descriptor to dmaengine_submit() which returns 'cookie' (dma_cookie_t) - The current status of a transaction can be queried by passing it's cookie to dmaengine_tx_status Now, I've never looked at DMA API's or their working before I started investigating the bug I'm trying to fix, so I may very well have missed something. In fact, the more I look, the more uncertain I am. From Documentation/dmaengine/provider.txt it describes: * device_tx_status - Should report the bytes left to go over on the given channel - Should only care about the transaction descriptor passed as argument, not the currently active one on a given channel - The tx_state argument might be NULL - Should use dma_set_residue to report it - In the case of a cyclic transfer, it should only take into account the current period. So that's saying residue for cyclic transfers residue is only for the current period? I tried looking at other DMA drivers for inspiration, and it seems that some at least (e.g. drivers/dma/mmp_pdma.c) do things the same way as pl330, but that could just be a case of people copying broken code. -- Tixy
diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c index 17ee758..55e3c5f 100644 --- a/drivers/dma/pl330.c +++ b/drivers/dma/pl330.c @@ -2240,6 +2240,7 @@ pl330_tx_status(struct dma_chan *chan, dma_cookie_t cookie, struct dma_pl330_desc *desc, *running = NULL; struct dma_pl330_chan *pch = to_pchan(chan); unsigned int transferred, residual = 0; + bool first_busy; ret = dma_cookie_status(chan, cookie, txstate); @@ -2253,16 +2254,31 @@ pl330_tx_status(struct dma_chan *chan, dma_cookie_t cookie, if (pch->thread->req_running != -1) running = pch->thread->req[pch->thread->req_running].desc; + first_busy = true; /* Check in pending list */ list_for_each_entry(desc, &pch->work_list, node) { if (desc->status == DONE) transferred = desc->bytes_requested; - else if (running && desc == running) - transferred = - pl330_get_current_xferred_count(pch, desc); - else + else if (desc->status == BUSY && first_busy) { + first_busy = false; + if (running && desc == running) { + transferred = + pl330_get_current_xferred_count(pch, desc); + } else { + /* BUSY but not running means it's just completed */ + transferred = desc->bytes_requested; + } + } else { + /* + * Descriptor is either in PREP state queued for future + * transfer or it is the second BUSY descriptor we have + * seen. The latter case means it has just, or is about + * to be, started, so treat it as having not yet + * transferred any bytes, the same as PREP. + */ transferred = 0; + } residual += desc->bytes_requested - transferred; if (desc->txd.cookie == cookie) { switch (desc->status) {
The residue calculation in pl330_tx_status doesn't handle transitional states that occur at the time one descriptor (A) is completed and the next (B) is started. Specifically, both A and B can simultaneously be in the BUSY state and at this time the thread's 'req_running' may (or may not) be -1. To cope with this situation we change the code to ensure A is treated as complete and B as having not yet started. Prior to the change, the code would calculate a transferred byte count as if both A and B had completed. Fixes: aee4d1fac887 ("dmaengine: pl330: improve pl330_tx_status() function") Signed-off-by: Jon Medhurst <tixy@linaro.org> --- I discovered this issue when trying to work out why audio stopped working on ARM's Juno platform and bisected it to commit aee4d1fac887. Whilst this patch seems to fix the problems I was seeing, I can't help but think there are more race conditions with this code. E.g. if the running descriptor changes under us, pl330_get_current_xferred_count can end up reading values from hardware that relate to a different descriptor. And if we're really unlucky, the reading of the 'val' and 'addr' values in pl330_get_current_xferred_count can come from different descriptors. I don't know if there is any locks we can use to prevent such races or if we need to try and detect when things have changed and redo/abort the residue calculation... drivers/dma/pl330.c | 24 ++++++++++++++++++++---- 1 file changed, 20 insertions(+), 4 deletions(-) -- 2.1.4