[Linaro-uefi,v6,1/5] Drivers/Net/MarvellYukon: Don't re-use DMA buffers

Message ID CAKv+Gu8H+NMtS2JYNQ=+msR2dDEzAXFU6gM5xkr+BsGUCU=6ng@mail.gmail.com
State New
Headers show

Commit Message

Ard Biesheuvel Aug. 30, 2016, 6:49 a.m.
On 30 August 2016 at 03:59, Alan Ott <alan@softiron.co.uk> wrote:
> On 08/29/2016 03:34 PM, Ard Biesheuvel wrote:
>>
>> On 29 August 2016 at 20:02, Alan Ott <alan@softiron.co.uk> wrote:
>>>
>>> Change the receive buffers to be single-use only. This involves a few
>>> changes that work together:
>>> 1. Change msk_newbuf() to not attempt to re-use or free a buffer,
>>> 2. Change msk_rxeof() free the buffer when it is done with it,
>>> 3. Store a temporary copy of the received data for passing to
>>>     the Receive Queue,
>>> 4. Call the required Flush() and Unmap() on the DMA buffer.
>>>
>>> In addition this means failure to allocate a new buffer is a failure
>>> of msk_rxeof().
>>>
>>> Note that this change makes the driver work the way the FreeBSD driver
>>> (from which this driver was derived) works, and this simply removes an
>>> optimization (the code in msk_newbuf() which re-uses the buffers. This
>>> removal of the optimization is done for two reasons:
>>> 1. The optimization failed to work for 64-bit DMA transfers;
>>> 2. The UEFI specification, version 2.6, section 13.4 requires calls to
>>>     Flush() and Unmap() before reading a DMA write buffer from the CPU,
>>>     which doesn't fit with the optimization as it existed.
>>>
>>> Reverting back to the behavior as it was in the FreeBSD driver solves
>>> number 1 and 2 above, and makes this driver more consistent with
>>> something we know to be working. There is slightly more overhead, but
>>> it is more consistent with the UEFI standard.
>>>
>>> Contributed-under: TianoCore Contribution Agreement 1.0
>>> Signed-off-by: Alan Ott <alan@softiron.co.uk>
>>> ---
>>>   Drivers/Net/MarvellYukonDxe/if_msk.c | 32
>>> +++++++++++++++++++++-----------
>>>   1 file changed, 21 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/Drivers/Net/MarvellYukonDxe/if_msk.c
>>> b/Drivers/Net/MarvellYukonDxe/if_msk.c
>>> index d2102dc..ad7a1c2 100644
>>> --- a/Drivers/Net/MarvellYukonDxe/if_msk.c
>>> +++ b/Drivers/Net/MarvellYukonDxe/if_msk.c
>>> @@ -581,13 +581,6 @@ msk_newbuf (
>>>
>>>     rxd = &sc_if->msk_cdata.msk_rxdesc[idx];
>>>
>>> -  if ((rxd->rx_m.Buf != NULL) && (rxd->rx_m.Length >= Length)) {
>>> -    return EFI_ALREADY_STARTED;
>>> -  } else if (rxd->rx_m.Buf != NULL) {
>>> -    mPciIo->Unmap (mPciIo, rxd->rx_m.DmaMapping);
>>> -    mPciIo->FreeBuffer (mPciIo, EFI_SIZE_TO_PAGES (rxd->rx_m.Length),
>>> rxd->rx_m.Buf);
>>> -  }
>>> -
>>>     Status = mPciIo->AllocateBuffer (mPciIo, AllocateAnyPages,
>>> EfiBootServicesData, EFI_SIZE_TO_PAGES (Length), &Buffer, 0);
>>>     if (EFI_ERROR (Status)) {
>>>       return Status;
>>> @@ -1848,6 +1841,7 @@ msk_rxeof (
>>>     struct msk_rxdesc   *rxd;
>>>     INTN                cons;
>>>     INTN                rxlen;
>>> +  MSK_DMA_BUF         m;
>>>
>>>     DEBUG ((EFI_D_NET, "Marvell Yukon: rxeof\n"));
>>>
>>> @@ -1871,26 +1865,40 @@ msk_rxeof (
>>>         break;
>>>       }
>>>
>>> +    rxd = &sc_if->msk_cdata.msk_rxdesc[cons];
>>> +
>>> +    m = rxd->rx_m;
>>
>> Struct assignment is not allowed in EDK2 (but I haven't checked the
>> existing code).
>
>
> I had a look in the EDKII C Coding Standards 2.1 Draft PDF here:
> https://github.com/tianocore/tianocore.github.io/wiki/EDK-II-Specifications
> and didn't see it, but maybe I'm looking at the wrong thing.
>
>> I will change this to
>>
>> gBS->CopyMem (&m, &rxd->rx_m, sizeof(MSK_DMA_BUF));
>>
>> when merging the patch, unless there is a pressing reason not to.
>
>
> Feel free, but:
>     gBS->CopyMem (&m, &rxd->rx_m, sizeof(m));
> is a little safer (sizeof(m) instead of sizeof(MSK_DMA_BUF)), but of course
> the struct assignment method is the safest (with compile-time checks for
> equivalent types and automatic determination of size (which already helped
> me once today)).
>
> Whatever you feel needs to be done is fine.
>
>

OK

>>>       Status = msk_newbuf (sc_if, cons);
>>>       if (EFI_ERROR (Status)) {
>>> +      // This is a dropped packet, but we aren't counting drops
>>>         // Reuse old buffer
>>>         msk_discard_rxbuf (sc_if, cons);
>>> +      break;
>>> +    }
>>> +
>>> +    Status = mPciIo->Flush (mPciIo);
>>> +    if (EFI_ERROR (Status)) {
>>> +      DEBUG ((EFI_D_NET, "Marvell Yukon: failed to Flush DMA\n"));
>>>       }
>>> +
>>> +    Status = mPciIo->Unmap (mPciIo, m.DmaMapping);
>>> +    if (EFI_ERROR (Status)) {
>>> +      DEBUG ((EFI_D_NET, "Marvell Yukon: failed to Unmap DMA\n"));
>>> +    }
>>> +
>>>       Status = gBS->AllocatePool (EfiBootServicesData,
>>>                                   sizeof (MSK_LINKED_DMA_BUF),
>>>                                   (VOID**) &m_link);
>>>       if (!EFI_ERROR (Status)) {
>>>         gBS->SetMem (m_link, sizeof (MSK_LINKED_DMA_BUF), 0);
>>> -      rxd = &sc_if->msk_cdata.msk_rxdesc[cons];
>>> -
>>>         m_link->Signature = RX_MBUF_SIGNATURE;
>>>         Status = gBS->AllocatePool (EfiBootServicesData,
>>>                                     len,
>>>                                     (VOID**) &m_link->DmaBuf.Buf);
>>>         if(!EFI_ERROR (Status)) {
>>> -        gBS->CopyMem (m_link->DmaBuf.Buf, rxd->rx_m.Buf, len);
>>> +        gBS->CopyMem (m_link->DmaBuf.Buf, m.Buf, len);
>>>           m_link->DmaBuf.Length = len;
>>> -        m_link->DmaBuf.DmaMapping = rxd->rx_m.DmaMapping;
>>> +        m_link->DmaBuf.DmaMapping = m.DmaMapping;
>>>
>> Do we still need DmaMapping after we unmapped the buffer?
>
>
> Certainly not, but the consumer of that queue doesn't do anything with the
> DMA mapping, which is why it didn't cause an issue, even before my patches.
>
> The Receive Queue should use a MSK_LINKED_SYSTEM_BUF instead of a
> MSK_LINKED_DMA_BUF. I made a patch for this, which I will send tomorrow
> (Tuesday) as part of v7.
>

OK, but before you do that, could we get rid of the
PciIo->AllocateBuffer() and the redundant copy as well? Something like
below, but in a way that it doesn't leak memory?

Linaro-uefi mailing list
Linaro-uefi@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/linaro-uefi

Comments

Ard Biesheuvel Aug. 30, 2016, 2:15 p.m. | #1
On 30 August 2016 at 14:14, Alan Ott <alan@softiron.co.uk> wrote:
> On 08/30/2016 02:49 AM, Ard Biesheuvel wrote:
>>
>> On 30 August 2016 at 03:59, Alan Ott <alan@softiron.co.uk> wrote:
>>>
>>> On 08/29/2016 03:34 PM, Ard Biesheuvel wrote:
>>>>
>>>> On 29 August 2016 at 20:02, Alan Ott <alan@softiron.co.uk> wrote:
>>>>>
>>>>> Change the receive buffers to be single-use only. This involves a few
>>>>> changes that work together:
>>>>> 1. Change msk_newbuf() to not attempt to re-use or free a buffer,
>>>>> 2. Change msk_rxeof() free the buffer when it is done with it,
>>>>> 3. Store a temporary copy of the received data for passing to
>>>>>      the Receive Queue,
>>>>> 4. Call the required Flush() and Unmap() on the DMA buffer.
>>>>>
>>>>> In addition this means failure to allocate a new buffer is a failure
>>>>> of msk_rxeof().
>>>>>
>>>>> Note that this change makes the driver work the way the FreeBSD driver
>>>>> (from which this driver was derived) works, and this simply removes an
>>>>> optimization (the code in msk_newbuf() which re-uses the buffers. This
>>>>> removal of the optimization is done for two reasons:
>>>>> 1. The optimization failed to work for 64-bit DMA transfers;
>>>>> 2. The UEFI specification, version 2.6, section 13.4 requires calls to
>>>>>      Flush() and Unmap() before reading a DMA write buffer from the
>>>>> CPU,
>>>>>      which doesn't fit with the optimization as it existed.
>>>>>
>>>>> Reverting back to the behavior as it was in the FreeBSD driver solves
>>>>> number 1 and 2 above, and makes this driver more consistent with
>>>>> something we know to be working. There is slightly more overhead, but
>>>>> it is more consistent with the UEFI standard.
>>>>>
>>>>> Contributed-under: TianoCore Contribution Agreement 1.0
>>>>> Signed-off-by: Alan Ott <alan@softiron.co.uk>
>>>>> ---
>>>>>    Drivers/Net/MarvellYukonDxe/if_msk.c | 32
>>>>> +++++++++++++++++++++-----------
>>>>>    1 file changed, 21 insertions(+), 11 deletions(-)
>>>>>
>>>>> diff --git a/Drivers/Net/MarvellYukonDxe/if_msk.c
>>>>> b/Drivers/Net/MarvellYukonDxe/if_msk.c
>>>>> index d2102dc..ad7a1c2 100644
>>>>> --- a/Drivers/Net/MarvellYukonDxe/if_msk.c
>>>>> +++ b/Drivers/Net/MarvellYukonDxe/if_msk.c
>>>>> @@ -581,13 +581,6 @@ msk_newbuf (
>>>>>
>>>>>      rxd = &sc_if->msk_cdata.msk_rxdesc[idx];
>>>>>
>>>>> -  if ((rxd->rx_m.Buf != NULL) && (rxd->rx_m.Length >= Length)) {
>>>>> -    return EFI_ALREADY_STARTED;
>>>>> -  } else if (rxd->rx_m.Buf != NULL) {
>>>>> -    mPciIo->Unmap (mPciIo, rxd->rx_m.DmaMapping);
>>>>> -    mPciIo->FreeBuffer (mPciIo, EFI_SIZE_TO_PAGES (rxd->rx_m.Length),
>>>>> rxd->rx_m.Buf);
>>>>> -  }
>>>>> -
>>>>>      Status = mPciIo->AllocateBuffer (mPciIo, AllocateAnyPages,
>>>>> EfiBootServicesData, EFI_SIZE_TO_PAGES (Length), &Buffer, 0);
>>>>>      if (EFI_ERROR (Status)) {
>>>>>        return Status;
>>>>> @@ -1848,6 +1841,7 @@ msk_rxeof (
>>>>>      struct msk_rxdesc   *rxd;
>>>>>      INTN                cons;
>>>>>      INTN                rxlen;
>>>>> +  MSK_DMA_BUF         m;
>>>>>
>>>>>      DEBUG ((EFI_D_NET, "Marvell Yukon: rxeof\n"));
>>>>>
>>>>> @@ -1871,26 +1865,40 @@ msk_rxeof (
>>>>>          break;
>>>>>        }
>>>>>
>>>>> +    rxd = &sc_if->msk_cdata.msk_rxdesc[cons];
>>>>> +
>>>>> +    m = rxd->rx_m;
>>>>
>>>> Struct assignment is not allowed in EDK2 (but I haven't checked the
>>>> existing code).
>>>
>>>
>>> I had a look in the EDKII C Coding Standards 2.1 Draft PDF here:
>>>
>>> https://github.com/tianocore/tianocore.github.io/wiki/EDK-II-Specifications
>>> and didn't see it, but maybe I'm looking at the wrong thing.
>>>
>>>> I will change this to
>>>>
>>>> gBS->CopyMem (&m, &rxd->rx_m, sizeof(MSK_DMA_BUF));
>>>>
>>>> when merging the patch, unless there is a pressing reason not to.
>>>
>>>
>>> Feel free, but:
>>>      gBS->CopyMem (&m, &rxd->rx_m, sizeof(m));
>>> is a little safer (sizeof(m) instead of sizeof(MSK_DMA_BUF)), but of
>>> course
>>> the struct assignment method is the safest (with compile-time checks for
>>> equivalent types and automatic determination of size (which already
>>> helped
>>> me once today)).
>>>
>>> Whatever you feel needs to be done is fine.
>>>
>>>
>> OK
>>
>>>>>        Status = msk_newbuf (sc_if, cons);
>>>>>        if (EFI_ERROR (Status)) {
>>>>> +      // This is a dropped packet, but we aren't counting drops
>>>>>          // Reuse old buffer
>>>>>          msk_discard_rxbuf (sc_if, cons);
>>>>> +      break;
>>>>> +    }
>>>>> +
>>>>> +    Status = mPciIo->Flush (mPciIo);
>>>>> +    if (EFI_ERROR (Status)) {
>>>>> +      DEBUG ((EFI_D_NET, "Marvell Yukon: failed to Flush DMA\n"));
>>>>>        }
>>>>> +
>>>>> +    Status = mPciIo->Unmap (mPciIo, m.DmaMapping);
>>>>> +    if (EFI_ERROR (Status)) {
>>>>> +      DEBUG ((EFI_D_NET, "Marvell Yukon: failed to Unmap DMA\n"));
>>>>> +    }
>>>>> +
>>>>>        Status = gBS->AllocatePool (EfiBootServicesData,
>>>>>                                    sizeof (MSK_LINKED_DMA_BUF),
>>>>>                                    (VOID**) &m_link);
>>>>>        if (!EFI_ERROR (Status)) {
>>>>>          gBS->SetMem (m_link, sizeof (MSK_LINKED_DMA_BUF), 0);
>>>>> -      rxd = &sc_if->msk_cdata.msk_rxdesc[cons];
>>>>> -
>>>>>          m_link->Signature = RX_MBUF_SIGNATURE;
>>>>>          Status = gBS->AllocatePool (EfiBootServicesData,
>>>>>                                      len,
>>>>>                                      (VOID**) &m_link->DmaBuf.Buf);
>>>>>          if(!EFI_ERROR (Status)) {
>>>>> -        gBS->CopyMem (m_link->DmaBuf.Buf, rxd->rx_m.Buf, len);
>>>>> +        gBS->CopyMem (m_link->DmaBuf.Buf, m.Buf, len);
>>>>>            m_link->DmaBuf.Length = len;
>>>>> -        m_link->DmaBuf.DmaMapping = rxd->rx_m.DmaMapping;
>>>>> +        m_link->DmaBuf.DmaMapping = m.DmaMapping;
>>>>>
>>>> Do we still need DmaMapping after we unmapped the buffer?
>>>
>>>
>>> Certainly not, but the consumer of that queue doesn't do anything with
>>> the
>>> DMA mapping, which is why it didn't cause an issue, even before my
>>> patches.
>>>
>>> The Receive Queue should use a MSK_LINKED_SYSTEM_BUF instead of a
>>> MSK_LINKED_DMA_BUF. I made a patch for this, which I will send tomorrow
>>> (Tuesday) as part of v7.
>>>
>> OK, but before you do that, could we get rid of the
>> PciIo->AllocateBuffer()
>
>
>
> I've acutally been testing with something similar already (although using
> gBS->AllocatePages() as you originally suggested), but I'd like to do one
> thing at a time.
>

OK, fair enough.

> Further, from my reading of the documents, I'm not even sure it's right to
> try to DMA to memory that's not allocated by the PciIo. I had partially
> typed up an email to this effect, but I was hoping for one thing at a time.
> But since you brought it up...
>
> What do you think about section 5.1.1.2 of the Driver Writer's Guide where
> it says:
>>
>> A UEFI driver must never directly allocate a memory buffer for DMA access.
>> The UEFI
>> driver cannot know enough about the system architecture to predict what
>> system
>> memory areas are available for DMA or if CPU caches are coherent with DMA
>> operations. Instead, a UEFI driver must use the services provided by the
>> I/O protocol
>> for the bus to allocate and free buffers required for DMA operations.
>> There should also
>> be services to initiate and complete DMA transactions. For example, the
>> PCI Root
>> Bridge I/O Protocol and PCI I/O Protocol both provide services for PCI DMA
>> operations.
>> As additional I/O bus types with DMA capabilities are introduced, new
>> protocols that
>> abstract the DMA services must be provided.
>
>
> Similar language is in section 5.3.11. It seems like it's stating you can't
> make the assumption that the bus is able to perform DMA operations to all of
> RAM, but that the bus driver knows these (platform- and bus-specific)
> limitations, so you should use the bus (in this case PciIo), to allocate the
> memory.
>
> However, the examples of streaming PCI DMA in the DWG do _not_ do an
> allocation this way, and rely on a pointer passed in from outside the
> example code (DWG 18.5.7).
>
> I've looked hard in the UEFI specification, but I can't find anything else
> which corroborates this. The UEFI spec only seems to mandate
> PciIo->AllocateBuffer() for common buffers
> (EfiPciOperationBusMasterCommonBuffer). This leads me to suspect the above
> quoted paragraph (DWG 5.1.1.2) may only be talking about common buffers and
> not for streaming buffers, but it's not clear.
>

No, it is not clear, but AllocateBuffer() is simply not required for
streaming DMA. Note that the transmit path in this driver already
performs streaming DMA on arbitrary buffers, so there is no reason for
the RX path to be different.

In general, the directionality of streaming DMA combined with the
restrictions on when the data is valid and accessible (before Map,
after Unmap, etc) means that the implementation can reallocate the
buffer if it is not located in suitable memory, which is exactly what
the non-coherent DmaLib does.

There are two reasons I'd prefer not to use AllocateBuffer() here:
- it gives you uncached memory on non-coherent platforms, which means
that every single read performed by the CopyMem() after flush/unmap
goes all the way to main memory
- AllocateBuffer() changes the memory map (and so does
AllocatePages(), which is why I suggested AllocatePool() instead in
the second instance), which is costly

> In my testing, your suggestion (AllocatePages()) works, but I'm not
> convinced it's correct.
>
> Either way, I'm going to push for this being a separate patch than the
> current patch series. I'm happy to do it (if we decide ultimately that it is
> correct), but it's really a separate problem (as you've indicated before).
>
> Let me know if this is ok with you.
>

Sure.

>>   and the redundant copy as well? Something like
>> below, but in a way that it doesn't leak memory?
>
>
> Can this also be a separate from the series? Again, it's a separate problem.
> Like the above, I'm happy to do it, but rebasing this same set continuously
> is getting tiresome. v7 is nearly ready, and I'd like it to be the last one.
> I'm happy to submit a second series after this series is in, to optimize
> these issues you point out.
>
> Let me know, and again, thanks for your review and interest in this series.
>

No, that is perfectly fine. I am very happy you are willing to spend
more time on this, since I don't have access to the hardware (and am
not that familiar with this version or the BSD version of the driver)

It just seemed that changing from MSK_DMA_BUF to MSK_SYSTEM_BUF etc
would combine naturally with the changes above, but perhaps not. It's
up to you.

Thanks,
Ard.
Ard Biesheuvel Aug. 30, 2016, 2:47 p.m. | #2
On 30 August 2016 at 15:46, Alan Ott <alan@softiron.co.uk> wrote:
> On 08/30/2016 10:15 AM, Ard Biesheuvel wrote:
>>
>> On 30 August 2016 at 14:14, Alan Ott <alan@softiron.co.uk> wrote:
>>>
>>> On 08/30/2016 02:49 AM, Ard Biesheuvel wrote:
>>>>
>>>> On 30 August 2016 at 03:59, Alan Ott <alan@softiron.co.uk> wrote:
>>>>>
>>>>> On 08/29/2016 03:34 PM, Ard Biesheuvel wrote:
>>>>>>
>>>>>> On 29 August 2016 at 20:02, Alan Ott <alan@softiron.co.uk> wrote:
>>>>>>>
>>>>>>> Change the receive buffers to be single-use only. This involves a few
>>>>>>> changes that work together:
>>>>>>> 1. Change msk_newbuf() to not attempt to re-use or free a buffer,
>>>>>>> 2. Change msk_rxeof() free the buffer when it is done with it,
>>>>>>> 3. Store a temporary copy of the received data for passing to
>>>>>>>       the Receive Queue,
>>>>>>> 4. Call the required Flush() and Unmap() on the DMA buffer.
>>>>>>>
>>>>>>> In addition this means failure to allocate a new buffer is a failure
>>>>>>> of msk_rxeof().
>>>>>>>
>>>>>>> Note that this change makes the driver work the way the FreeBSD
>>>>>>> driver
>>>>>>> (from which this driver was derived) works, and this simply removes
>>>>>>> an
>>>>>>> optimization (the code in msk_newbuf() which re-uses the buffers.
>>>>>>> This
>>>>>>> removal of the optimization is done for two reasons:
>>>>>>> 1. The optimization failed to work for 64-bit DMA transfers;
>>>>>>> 2. The UEFI specification, version 2.6, section 13.4 requires calls
>>>>>>> to
>>>>>>>       Flush() and Unmap() before reading a DMA write buffer from the
>>>>>>> CPU,
>>>>>>>       which doesn't fit with the optimization as it existed.
>>>>>>>
>>>>>>> Reverting back to the behavior as it was in the FreeBSD driver solves
>>>>>>> number 1 and 2 above, and makes this driver more consistent with
>>>>>>> something we know to be working. There is slightly more overhead, but
>>>>>>> it is more consistent with the UEFI standard.
>>>>>>>
>>>>>>> Contributed-under: TianoCore Contribution Agreement 1.0
>>>>>>> Signed-off-by: Alan Ott <alan@softiron.co.uk>
>>>>>>> ---
>>>>>>>     Drivers/Net/MarvellYukonDxe/if_msk.c | 32
>>>>>>> +++++++++++++++++++++-----------
>>>>>>>     1 file changed, 21 insertions(+), 11 deletions(-)
>>>>>>>
>>>>>>> diff --git a/Drivers/Net/MarvellYukonDxe/if_msk.c
>>>>>>> b/Drivers/Net/MarvellYukonDxe/if_msk.c
>>>>>>> index d2102dc..ad7a1c2 100644
>>>>>>> --- a/Drivers/Net/MarvellYukonDxe/if_msk.c
>>>>>>> +++ b/Drivers/Net/MarvellYukonDxe/if_msk.c
>>>>>>> @@ -581,13 +581,6 @@ msk_newbuf (
>>>>>>>
>>>>>>>       rxd = &sc_if->msk_cdata.msk_rxdesc[idx];
>>>>>>>
>>>>>>> -  if ((rxd->rx_m.Buf != NULL) && (rxd->rx_m.Length >= Length)) {
>>>>>>> -    return EFI_ALREADY_STARTED;
>>>>>>> -  } else if (rxd->rx_m.Buf != NULL) {
>>>>>>> -    mPciIo->Unmap (mPciIo, rxd->rx_m.DmaMapping);
>>>>>>> -    mPciIo->FreeBuffer (mPciIo, EFI_SIZE_TO_PAGES
>>>>>>> (rxd->rx_m.Length),
>>>>>>> rxd->rx_m.Buf);
>>>>>>> -  }
>>>>>>> -
>>>>>>>       Status = mPciIo->AllocateBuffer (mPciIo, AllocateAnyPages,
>>>>>>> EfiBootServicesData, EFI_SIZE_TO_PAGES (Length), &Buffer, 0);
>>>>>>>       if (EFI_ERROR (Status)) {
>>>>>>>         return Status;
>>>>>>> @@ -1848,6 +1841,7 @@ msk_rxeof (
>>>>>>>       struct msk_rxdesc   *rxd;
>>>>>>>       INTN                cons;
>>>>>>>       INTN                rxlen;
>>>>>>> +  MSK_DMA_BUF         m;
>>>>>>>
>>>>>>>       DEBUG ((EFI_D_NET, "Marvell Yukon: rxeof\n"));
>>>>>>>
>>>>>>> @@ -1871,26 +1865,40 @@ msk_rxeof (
>>>>>>>           break;
>>>>>>>         }
>>>>>>>
>>>>>>> +    rxd = &sc_if->msk_cdata.msk_rxdesc[cons];
>>>>>>> +
>>>>>>> +    m = rxd->rx_m;
>>>>>>
>>>>>> Struct assignment is not allowed in EDK2 (but I haven't checked the
>>>>>> existing code).
>>>>>
>>>>>
>>>>> I had a look in the EDKII C Coding Standards 2.1 Draft PDF here:
>>>>>
>>>>>
>>>>> https://github.com/tianocore/tianocore.github.io/wiki/EDK-II-Specifications
>>>>> and didn't see it, but maybe I'm looking at the wrong thing.
>>>>>
>>>>>> I will change this to
>>>>>>
>>>>>> gBS->CopyMem (&m, &rxd->rx_m, sizeof(MSK_DMA_BUF));
>>>>>>
>>>>>> when merging the patch, unless there is a pressing reason not to.
>>>>>
>>>>>
>>>>> Feel free, but:
>>>>>       gBS->CopyMem (&m, &rxd->rx_m, sizeof(m));
>>>>> is a little safer (sizeof(m) instead of sizeof(MSK_DMA_BUF)), but of
>>>>> course
>>>>> the struct assignment method is the safest (with compile-time checks
>>>>> for
>>>>> equivalent types and automatic determination of size (which already
>>>>> helped
>>>>> me once today)).
>>>>>
>>>>> Whatever you feel needs to be done is fine.
>>>>>
>>>>>
>>>> OK
>>>>
>>>>>>>         Status = msk_newbuf (sc_if, cons);
>>>>>>>         if (EFI_ERROR (Status)) {
>>>>>>> +      // This is a dropped packet, but we aren't counting drops
>>>>>>>           // Reuse old buffer
>>>>>>>           msk_discard_rxbuf (sc_if, cons);
>>>>>>> +      break;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    Status = mPciIo->Flush (mPciIo);
>>>>>>> +    if (EFI_ERROR (Status)) {
>>>>>>> +      DEBUG ((EFI_D_NET, "Marvell Yukon: failed to Flush DMA\n"));
>>>>>>>         }
>>>>>>> +
>>>>>>> +    Status = mPciIo->Unmap (mPciIo, m.DmaMapping);
>>>>>>> +    if (EFI_ERROR (Status)) {
>>>>>>> +      DEBUG ((EFI_D_NET, "Marvell Yukon: failed to Unmap DMA\n"));
>>>>>>> +    }
>>>>>>> +
>>>>>>>         Status = gBS->AllocatePool (EfiBootServicesData,
>>>>>>>                                     sizeof (MSK_LINKED_DMA_BUF),
>>>>>>>                                     (VOID**) &m_link);
>>>>>>>         if (!EFI_ERROR (Status)) {
>>>>>>>           gBS->SetMem (m_link, sizeof (MSK_LINKED_DMA_BUF), 0);
>>>>>>> -      rxd = &sc_if->msk_cdata.msk_rxdesc[cons];
>>>>>>> -
>>>>>>>           m_link->Signature = RX_MBUF_SIGNATURE;
>>>>>>>           Status = gBS->AllocatePool (EfiBootServicesData,
>>>>>>>                                       len,
>>>>>>>                                       (VOID**) &m_link->DmaBuf.Buf);
>>>>>>>           if(!EFI_ERROR (Status)) {
>>>>>>> -        gBS->CopyMem (m_link->DmaBuf.Buf, rxd->rx_m.Buf, len);
>>>>>>> +        gBS->CopyMem (m_link->DmaBuf.Buf, m.Buf, len);
>>>>>>>             m_link->DmaBuf.Length = len;
>>>>>>> -        m_link->DmaBuf.DmaMapping = rxd->rx_m.DmaMapping;
>>>>>>> +        m_link->DmaBuf.DmaMapping = m.DmaMapping;
>>>>>>>
>>>>>> Do we still need DmaMapping after we unmapped the buffer?
>>>>>
>>>>>
>>>>> Certainly not, but the consumer of that queue doesn't do anything with
>>>>> the
>>>>> DMA mapping, which is why it didn't cause an issue, even before my
>>>>> patches.
>>>>>
>>>>> The Receive Queue should use a MSK_LINKED_SYSTEM_BUF instead of a
>>>>> MSK_LINKED_DMA_BUF. I made a patch for this, which I will send tomorrow
>>>>> (Tuesday) as part of v7.
>>>>>
>>>> OK, but before you do that, could we get rid of the
>>>> PciIo->AllocateBuffer()
>>>
>>>
>>>
>>> I've acutally been testing with something similar already (although using
>>> gBS->AllocatePages() as you originally suggested), but I'd like to do one
>>> thing at a time.
>>>
>> OK, fair enough.
>>
>>> Further, from my reading of the documents, I'm not even sure it's right
>>> to
>>> try to DMA to memory that's not allocated by the PciIo. I had partially
>>> typed up an email to this effect, but I was hoping for one thing at a
>>> time.
>>> But since you brought it up...
>>>
>>> What do you think about section 5.1.1.2 of the Driver Writer's Guide
>>> where
>>> it says:
>>>>
>>>> A UEFI driver must never directly allocate a memory buffer for DMA
>>>> access.
>>>> The UEFI
>>>> driver cannot know enough about the system architecture to predict what
>>>> system
>>>> memory areas are available for DMA or if CPU caches are coherent with
>>>> DMA
>>>> operations. Instead, a UEFI driver must use the services provided by the
>>>> I/O protocol
>>>> for the bus to allocate and free buffers required for DMA operations.
>>>> There should also
>>>> be services to initiate and complete DMA transactions. For example, the
>>>> PCI Root
>>>> Bridge I/O Protocol and PCI I/O Protocol both provide services for PCI
>>>> DMA
>>>> operations.
>>>> As additional I/O bus types with DMA capabilities are introduced, new
>>>> protocols that
>>>> abstract the DMA services must be provided.
>>>
>>>
>>> Similar language is in section 5.3.11. It seems like it's stating you
>>> can't
>>> make the assumption that the bus is able to perform DMA operations to all
>>> of
>>> RAM, but that the bus driver knows these (platform- and bus-specific)
>>> limitations, so you should use the bus (in this case PciIo), to allocate
>>> the
>>> memory.
>>>
>>> However, the examples of streaming PCI DMA in the DWG do _not_ do an
>>> allocation this way, and rely on a pointer passed in from outside the
>>> example code (DWG 18.5.7).
>>>
>>> I've looked hard in the UEFI specification, but I can't find anything
>>> else
>>> which corroborates this. The UEFI spec only seems to mandate
>>> PciIo->AllocateBuffer() for common buffers
>>> (EfiPciOperationBusMasterCommonBuffer). This leads me to suspect the
>>> above
>>> quoted paragraph (DWG 5.1.1.2) may only be talking about common buffers
>>> and
>>> not for streaming buffers, but it's not clear.
>>>
>> No, it is not clear, but AllocateBuffer() is simply not required for
>> streaming DMA. Note that the transmit path in this driver already
>> performs streaming DMA on arbitrary buffers, so there is no reason for
>> the RX path to be different.
>>
>> In general, the directionality of streaming DMA combined with the
>> restrictions on when the data is valid and accessible (before Map,
>> after Unmap, etc) means that the implementation can reallocate the
>> buffer if it is not located in suitable memory, which is exactly what
>> the non-coherent DmaLib does.
>>
>> There are two reasons I'd prefer not to use AllocateBuffer() here:
>> - it gives you uncached memory on non-coherent platforms, which means
>> that every single read performed by the CopyMem() after flush/unmap
>> goes all the way to main memory
>> - AllocateBuffer() changes the memory map (and so does
>> AllocatePages(), which is why I suggested AllocatePool() instead in
>> the second instance), which is costly
>
>
> Ok, that's fine with me. I'll patch this later today after sending v7.
>
>>> In my testing, your suggestion (AllocatePages()) works, but I'm not
>>> convinced it's correct.
>>>
>>> Either way, I'm going to push for this being a separate patch than the
>>> current patch series. I'm happy to do it (if we decide ultimately that it
>>> is
>>> correct), but it's really a separate problem (as you've indicated
>>> before).
>>>
>>> Let me know if this is ok with you.
>>>
>> Sure.
>>
>>>>    and the redundant copy as well? Something like
>>>> below, but in a way that it doesn't leak memory?
>>>
>>>
>>> Can this also be a separate from the series? Again, it's a separate
>>> problem.
>>> Like the above, I'm happy to do it, but rebasing this same set
>>> continuously
>>> is getting tiresome. v7 is nearly ready, and I'd like it to be the last
>>> one.
>>> I'm happy to submit a second series after this series is in, to optimize
>>> these issues you point out.
>>>
>>> Let me know, and again, thanks for your review and interest in this
>>> series.
>>>
>> No, that is perfectly fine. I am very happy you are willing to spend
>> more time on this, since I don't have access to the hardware (and am
>> not that familiar with this version or the BSD version of the driver)
>>
>> It just seemed that changing from MSK_DMA_BUF to MSK_SYSTEM_BUF etc
>> would combine naturally with the changes above, but perhaps not. It's
>> up to you.
>
>
> The MSK_DMA_BUF/MSK_SYSTEM_BUF change wasn't too bad as the structures are
> similar and the structure of the code flow is the same.
>
> I'll take out the extra CopyMem() too. It really just moves the Free*() to a
> different place. Expect some patches later today.
>

Awesome, thanks!

Patch

diff --git a/Drivers/Net/MarvellYukonDxe/if_msk.c
b/Drivers/Net/MarvellYukonDxe/if_msk.c
index 9095ccd1744d..bf0f6d091ccb 100644
--- a/Drivers/Net/MarvellYukonDxe/if_msk.c
+++ b/Drivers/Net/MarvellYukonDxe/if_msk.c
@@ -581,7 +581,8 @@  msk_newbuf (

   rxd = &sc_if->msk_cdata.msk_rxdesc[idx];

-  Status = mPciIo->AllocateBuffer (mPciIo, AllocateAnyPages,
EfiBootServicesData, EFI_SIZE_TO_PAGES (Length), &Buffer, 0);
+  Status = gBS->AllocatePool (EfiBootServicesData, EFI_SIZE_TO_PAGES (Length),
+                  &Buffer);
   if (EFI_ERROR (Status)) {
     return Status;
   }
@@ -590,7 +591,7 @@  msk_newbuf (
   Status = mPciIo->Map (mPciIo, EfiPciIoOperationBusMasterWrite,
Buffer, &Length, &PhysAddr, &Mapping);
   if (EFI_ERROR (Status)) {
     Length = MAX_SUPPORTED_PACKET_SIZE;
-    mPciIo->FreeBuffer (mPciIo, EFI_SIZE_TO_PAGES (Length), Buffer);
+    gBS->FreePool (Buffer);
     return Status;
   }

@@ -1608,7 +1609,7 @@  msk_txrx_dma_free (
       mPciIo->Unmap (mPciIo, rxd->rx_m.DmaMapping);
       // Free Rx buffers as we own these
       if(rxd->rx_m.Buf != NULL) {
-        mPciIo->FreeBuffer (mPciIo, EFI_SIZE_TO_PAGES
(rxd->rx_m.Length), rxd->rx_m.Buf);
+        gBS->FreePool (rxd->rx_m.Buf);
         rxd->rx_m.Buf = NULL;
       }
       gBS->SetMem (&(rxd->rx_m), sizeof (MSK_DMA_BUF), 0);
@@ -1891,19 +1892,11 @@  msk_rxeof (
     if (!EFI_ERROR (Status)) {
       gBS->SetMem (m_link, sizeof (MSK_LINKED_DMA_BUF), 0);
       m_link->Signature = RX_MBUF_SIGNATURE;
-      Status = gBS->AllocatePool (EfiBootServicesData,
-                                  len,
-                                  (VOID**) &m_link->DmaBuf.Buf);
-      if(!EFI_ERROR (Status)) {
-        gBS->CopyMem (m_link->DmaBuf.Buf, m.Buf, len);
-        m_link->DmaBuf.Length = len;
-        m_link->DmaBuf.DmaMapping = m.DmaMapping;
-
-        mPciIo->FreeBuffer (mPciIo, EFI_SIZE_TO_PAGES (m.Length), m.Buf);
+      m_link->DmaBuf.Buf = m.Buf;
+      m_link->DmaBuf.Length = len;
+      m_link->DmaBuf.DmaMapping = m.DmaMapping;

-        InsertTailList (&mSoftc->ReceiveQueueHead, &m_link->Link);
-      } else {
-        DEBUG ((EFI_D_NET, "Marvell Yukon: failed to allocate DMA
buffer. Dropping Frame\n"));
+      InsertTailList (&mSoftc->ReceiveQueueHead, &m_link->Link);
       }
     } else {
       DEBUG ((EFI_D_NET, "Marvell Yukon: failed to allocate receive
buffer link. Dropping Frame\n"));
diff --git a/Drivers/Net/MarvellYukonDxe/if_msk.h
b/Drivers/Net/MarvellYukonDxe/if_msk.h
index a5025a39ecf3..66513d3c54c1 100644
--- a/Drivers/Net/MarvellYukonDxe/if_msk.h
+++ b/Drivers/Net/MarvellYukonDxe/if_msk.h
@@ -15,7 +15,7 @@ 

 #include <Uefi.h>

-#define MAX_SUPPORTED_PACKET_SIZE   EFI_PAGE_SIZE
+#define MAX_SUPPORTED_PACKET_SIZE   (1500)

 EFI_STATUS mskc_probe (EFI_PCI_IO_PROTOCOL *PciIo);
_______________________________________________