[edk2,v2,0/6] ArmVirtQemu: move to generic PciHostBridgeDxe

Message ID CAKv+Gu-Ec=E+3bB95E1xW3m4rQWNWB6wwL2-a0vxxVFuO5E-oA@mail.gmail.com
State New

Commit Message

Ard Biesheuvel Sept. 2, 2016, 4:26 p.m.
On 2 September 2016 at 17:13, Laszlo Ersek <lersek@redhat.com> wrote:
> On 09/02/16 17:27, Laszlo Ersek wrote:
>> On 09/02/16 16:58, Ard Biesheuvel wrote:
>>> (on the road atm, will reply in full later)
>>>
>>>> On 2 sep. 2016, at 14:09, Laszlo Ersek <lersek@redhat.com> wrote:
>>
>>>> (2) aarch64 KVM, using virtio-gpu-pci and USB 2 keyboard and
>>>> tablet. I actually booted a Fedora 24 guest with this, and in the
>>>> guest, everything works just fine (display, keyboard,
>>>> mouse/tablet). Most of the firmware log looks good too.
>>>>
>>>> (2a) However, the USB 2 keyboard is broken while in the firmware
>>>> (in spite of it working well in the guest OS).
>>>>
>>>>  -device ich9-usb-ehci1,multifunction=on,id=ehci,addr=05.0 \
>>>>  -device ich9-usb-uhci1,multifunction=on,masterbus=ehci.0,firstport=0,addr=05.1 \
>>>>  -device ich9-usb-uhci2,multifunction=on,masterbus=ehci.0,firstport=2,addr=05.2 \
>>>>  -device ich9-usb-uhci3,multifunction=on,masterbus=ehci.0,firstport=4,addr=05.3 \
>>>>  -device usb-kbd,bus=ehci.0 \
>>>>  -device usb-tablet,bus=ehci.0 \
>>>>
>>>> My QEMU has your commit 5d636e21c44e ("hw/arm/virt: mark the PCIe
>>>> host controller as DMA coherent in the DT"), but I guess the EHCI
>>>> driver in edk2 doesn't comply with the "guest drivers should use
>>>> cacheable accesses as well when running under KVM" part. :(
>>>>
>>>> The following snippet repeats in the log:
>>>>
>>>>  EhcClearLegacySupport: called to clear legacy support
>>>>  processing error - resetting ehci HC
>>>>  EhcInitHC: failed to enable period schedule
>>>>  EhcDriverBindingStart: failed to init host controller
>>>>  EhcCreateUsb2Hc: capability length 32
>>>>
>>>> Interestingly, if I back out your series, then USB2 works in the
>>>> firmware. I don't understand this, given that my build includes
>>>> commit 3ef3209d3028 ("ArmVirtPkg: remove
>>>> PcdKludgeMapPciMmioAsCached") from the master branch!
>>>>
>>>
>>> Does it work when you limit DMA to < 4 GB?
>>
>> You are one wicked genius, man; the following change
>>
>>> diff --git a/ArmVirtPkg/Library/FdtPciHostBridgeLib/FdtPciHostBridgeLib.c b/ArmVirtPkg/Library/FdtPciHostBridgeLib/FdtPciHostBridgeLib.c
>>> index efccedcca14f..1f0f87cac8a9 100644
>>> --- a/ArmVirtPkg/Library/FdtPciHostBridgeLib/FdtPciHostBridgeLib.c
>>> +++ b/ArmVirtPkg/Library/FdtPciHostBridgeLib/FdtPciHostBridgeLib.c
>>> @@ -317,7 +317,7 @@ PciHostBridgeGetRootBridges (
>>>                                        EFI_PCI_ATTRIBUTE_VGA_PALETTE_IO_16;
>>>    mRootBridge.Attributes            = mRootBridge.Supports;
>>>
>>> -  mRootBridge.DmaAbove4G            = TRUE;
>>> +  mRootBridge.DmaAbove4G            = FALSE;
>>>    mRootBridge.NoExtendedConfigSpace = FALSE;
>>>    mRootBridge.ResourceAssigned      = FALSE;
>>>
>>
>> does make it work! Excellent!
>>
>> Explain please. :) (Although, I'll look into PciHostBridgeDxe in a moment too. :))
>


Thanks. You seem to have a good handle on things already, though :-)

> Well okay, I reviewed the RootBridgeIoMap() and RootBridgeIoUnmap()
> functions in "MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciRootBridgeIo.c".
> They implement bounce buffering when DmaAbove4G is set to FALSE, and
> when the original RAM buffer, to be DMA'd from or to by the PCI device,
> would end outside of 32-bit space.
>
> For common buffer operations (when device and CPU collaborate on memory
> repeatedly, without intervening Map() and Unmap() calls), Map() and
> Unmap() cannot implement bounce buffering, so the initial buffer must be
> allocated low enough. This is what RootBridgeIoAllocateBuffer() does,
> and yes it considers DmaAbove4G as well.
>
> EhciDxe uses these functions quite a bit. And, my test VM has 4G of
> memory, with a base at 0x4000_0000 (1GB); the base is fixed of course,
> from "-M virt". So, I guess, some buffers that EhciDxe allocated itself,
> for DMA'ing from/to the device, and some buffers that it allocated with
> AllocateBuffer(), for common operations with the device, ended up in the
> 4GB..5GB range. Due to DmaAbove4G = TRUE, those host addresses got
> passed to the PCI device (the USB 2 host controller) verbatim, but that
> device can only access host RAM in the 32-bit address range?....
>
> Hm, let me check the QEMU code (hw/usb/hcd-ehci.c)...
>
> Alright, I've found it. According to the EHCI specification
> ("ehci-specification-for-usb.pdf", link found under
> <https://en.wikipedia.org/wiki/Extensible_Host_Controller_Interface#References>),
> revision 1.0, section "2.2.4 HCCPARAMS -- Capability Parameters", bit #0
> (value 1) in the HCCPARAMS capability register stands for:
>
>     64-bit Addressing Capability. This field documents the addressing
>     range capability of this implementation. The value of this field
>     determines whether software should use the data structures defined
>     in Section 3 (32-bit) or those defined in Appendix B (64-bit).
>     Values for this field have the following interpretation:
>
>     0b  data structures using 32-bit address memory pointers
>     1b  data structures using 64-bit address memory pointers
>
> Furthermore, the HCCPARAMS register lives at address "Base + (08h)".
>
> Now, looking at the QEMU code, we have usb_ehci_init()
> [hw/usb/hcd-ehci.c] performing the following assignment:
>
>   s->caps[0x08] = 0x80;        /* We can cache whole frame, no 64-bit */
>
> (And, the "cache whole frame" reference, for bit #7, is consistent with
> the documentation of that bit in the spec: "When bit [7] is a
> one, then host software assumes the host controller may cache an
> isochronous data structure for an entire frame.")
>
> So, bingo. Please flip DmaAbove4G to FALSE in patch #3, and please drop
> the "DMA above 4 GB" paragraph from the commit message of patch #4.
>


Actually, I suspect this is a bug in PciHostBridgeDxe. It ignores the
absence of the EFI_PCI_ATTRIBUTE_DUAL_ADDRESS_CYCLE attribute, which
should be set by the driver if it knows the device is capable of
64-bit DMA.

Could you please try the below?



Thanks,
Ard.
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel

Comments

Laszlo Ersek Sept. 2, 2016, 5:21 p.m. | #1
On 09/02/16 18:26, Ard Biesheuvel wrote:
> Actually, I suspect this is a bug in PciHostBridgeDxe. It ignores the
> absence of the EFI_PCI_ATTRIBUTE_DUAL_ADDRESS_CYCLE attribute, which
> should be set by the driver if it knows the device is capable of
> 64-bit DMA.
>
> Could you please try the below?
>
> diff --git a/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciRootBridgeIo.c b/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciRootBridgeIo.c
> index b2d76d67afa2..b53b9a834816 100644
> --- a/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciRootBridgeIo.c
> +++ b/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciRootBridgeIo.c
> @@ -1308,7 +1308,8 @@ RootBridgeIoAllocateBuffer (
>    RootBridge = ROOT_BRIDGE_FROM_THIS (This);
>
>    AllocateType = AllocateAnyPages;
> -  if (!RootBridge->DmaAbove4G) {
> +  if (!RootBridge->DmaAbove4G ||
> +      (Attributes & EFI_PCI_ATTRIBUTE_DUAL_ADDRESS_CYCLE) == 0) {
>      //
>      // Limit allocations to memory below 4GB
>      //
>
> Thanks,
> Ard.
>


Before trying it, I'll say that I don't like it, for two reasons :)

(1) This will affect AllocateBuffer(), yes, but it doesn't affect Map()
and Unmap(). In fact I don't understand how the spec allows those
functions to communicate this kind of information between PciIo and
PciRootBridgeIo: while for AllocateBuffer(), the PciIo implementation
can check the device itself, and pass
EFI_PCI_ATTRIBUTE_DUAL_ADDRESS_CYCLE to PciRootBridgeIo, I don't see the
same possibility, in the spec, for Map(). There is no Attributes
parameter there. So how will PciRootBridgeIo know?

In more direct terms, you can't extend the DmaAbove4G check in
RootBridgeIoMap() in a similar fashion. (Is this a spec bug actually?)
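
To make point (1) concrete, here is a minimal standalone sketch of the bounce-buffer decision as described earlier in the thread. It is not the edk2 source: the function name map_needs_bounce() and the SIZE_4GB macro are hypothetical, chosen for illustration only. The point is that the only inputs available are the bridge-wide DmaAbove4G flag and the buffer range; there is no per-device attribute to consult.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the 4 GB boundary; not taken from edk2. */
#define SIZE_4GB 0x100000000ULL

/*
 * Returns true when mapping the buffer [host_addr, host_addr + len) would
 * need a bounce buffer below 4 GB. Inputs are limited to the bridge-wide
 * flag and the buffer range, mirroring the fact that Map() has no
 * Attributes parameter.
 */
static bool map_needs_bounce(bool dma_above_4g, uint64_t host_addr, uint64_t len)
{
    if (dma_above_4g) {
        /* The host address is handed to the device verbatim. */
        return false;
    }
    /* Overflow-safe form of: host_addr + len > SIZE_4GB */
    return len > SIZE_4GB || host_addr > SIZE_4GB - len;
}
```

With DmaAbove4G = TRUE, a buffer at 0x1_0000_0000 is never bounced in this model, regardless of what the device behind the bridge can actually address.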

(2) I tried to track down where EFI_PCI_ATTRIBUTE_DUAL_ADDRESS_CYCLE
would come from, in edk2. The only location that passes it is
PciIoAllocateBuffer() in "MdeModulePkg/Bus/Pci/PciBusDxe/PciIo.c" (i.e.,
the implementation of the similarly named PciIo protocol member).

The condition for passing this attribute to
PciRootBridgeIo.AllocateBuffer() is that
EFI_PCI_IO_ATTRIBUTE_DUAL_ADDRESS_CYCLE (note: a different constant!) be
set in PciIo.Attributes -- i.e., on the PciIo device itself. Makes
sense, right?

So, what sets EFI_PCI_IO_ATTRIBUTE_DUAL_ADDRESS_CYCLE on the device?

- PciSetDeviceAttribute() in
"MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumeratorSupport.c" sets the bit in
PCI_IO_DEVICE.Supports (not .Attributes!) unconditionally,

- in DetermineDeviceAttribute()
[MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumeratorSupport.c], the attribute
is again set unconditionally (only in PCI_IO_DEVICE.Supports),
accompanied by the comment "Assume the PCI Root Bridge supports DAC",

- ModifyRootBridgeAttributes() in
"MdeModulePkg/Bus/Pci/PciBusDxe/PciIo.c" seems to exclude this bit from
the set of bits that can be toggled.

So, I think unless a UEFI_DRIVER that consumes PciIo actively calls
PciIo.Attributes() with OperationSet / OperationEnable for
EFI_PCI_IO_ATTRIBUTE_DUAL_ADDRESS_CYCLE, this code will never make a
difference.
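
The chain above can be modeled in a few lines. This is only a sketch under stated assumptions: the struct, the function names, and the two constant values are placeholders standing in for EFI_PCI_IO_ATTRIBUTE_DUAL_ADDRESS_CYCLE and EFI_PCI_ATTRIBUTE_DUAL_ADDRESS_CYCLE, not copies of the edk2 definitions. The second function applies the condition from the proposed PciRootBridgeIo.c change.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values, for illustration only. */
#define IO_ATTR_DAC 0x8000ULL  /* stands in for EFI_PCI_IO_ATTRIBUTE_DUAL_ADDRESS_CYCLE */
#define RB_ATTR_DAC 0x8000ULL  /* stands in for EFI_PCI_ATTRIBUTE_DUAL_ADDRESS_CYCLE */

/* Hypothetical reduction of a PciIo device to its two attribute masks. */
struct pci_io_device {
    uint64_t supports;    /* what the device could enable */
    uint64_t attributes;  /* what a driver has actually enabled */
};

/* PciIo side: forward DAC to the root bridge only if enabled on the device. */
static uint64_t pci_io_alloc_attrs(const struct pci_io_device *dev)
{
    return (dev->attributes & IO_ATTR_DAC) ? RB_ATTR_DAC : 0;
}

/* Root bridge side, using the condition from the proposed patch. */
static bool alloc_limited_below_4g(bool dma_above_4g, uint64_t rb_attrs)
{
    return !dma_above_4g || (rb_attrs & RB_ATTR_DAC) == 0;
}
```

In this model, a device that has the bit only in supports never causes the attribute to be forwarded; something has to move it into attributes first.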

UEFI_DRIVERs are actually expected to massage the PciIo attributes as
they see fit, for example EFI_PCI_IO_ATTRIBUTE_IO is frequently set for
IO BAR decoding. However, I couldn't find any driver in the tree that
would set EFI_PCI_IO_ATTRIBUTE_DUAL_ADDRESS_CYCLE.

*Maybe*, I guess, EhciDxe could look at the HCCPARAMS register discussed
above, and then set EFI_PCI_IO_ATTRIBUTE_DUAL_ADDRESS_CYCLE? I've got no
clue.
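
Purely as an illustration of what such a check might look like, here is a standalone sketch that decodes the two HCCPARAMS bits quoted from the spec above. The macro and function names are made up for this example; this is not EhciDxe code, and whether a driver should act on the bit this way is exactly the open question.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as quoted from the EHCI spec, section 2.2.4. */
#define HCCPARAMS_64BIT_ADDR  (1u << 0)  /* 64-bit Addressing Capability */
#define HCCPARAMS_ISO_CACHE   (1u << 7)  /* may cache isochronous data for a whole frame */

/* True if the controller advertises 64-bit data structure pointers. */
static bool ehci_has_64bit_addressing(uint32_t hccparams)
{
    return (hccparams & HCCPARAMS_64BIT_ADDR) != 0;
}

/* True if the controller advertises whole-frame isochronous caching. */
static bool ehci_caches_whole_frame(uint32_t hccparams)
{
    return (hccparams & HCCPARAMS_ISO_CACHE) != 0;
}
```

For the value 0x80 that QEMU programs, the first function returns false and the second returns true, matching the "We can cache whole frame, no 64-bit" comment.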


Anyway, after this wall of text, I should reenable >4GB DMA, and
actually test your patch... Yep, while it might be justified per se, it
definitely does not suffice for making things work. The USB 2 keyboard
remains broken with it.

Thanks
Laszlo

Patch

diff --git a/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciRootBridgeIo.c b/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciRootBridgeIo.c
index b2d76d67afa2..b53b9a834816 100644
--- a/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciRootBridgeIo.c
+++ b/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciRootBridgeIo.c
@@ -1308,7 +1308,8 @@  RootBridgeIoAllocateBuffer (
   RootBridge = ROOT_BRIDGE_FROM_THIS (This);

   AllocateType = AllocateAnyPages;
-  if (!RootBridge->DmaAbove4G) {
+  if (!RootBridge->DmaAbove4G ||
+      (Attributes & EFI_PCI_ATTRIBUTE_DUAL_ADDRESS_CYCLE) == 0) {
     //
     // Limit allocations to memory below 4GB
     //