[v3,4/9] PCI/VGA: Improve the default VGA device selection

Message ID 20230711164310.791756-5-sui.jingfeng@linux.dev
State New

Commit Message

Sui Jingfeng July 11, 2023, 4:43 p.m. UTC
From: Sui Jingfeng <suijingfeng@loongson.cn>

Currently, the strategy for selecting the default boot device on a system
with multiple video cards is not perfect. Potential problems are:

1) This function is a no-op on non-x86 architectures.
2) It does not take into consideration that the PCI BARs may get relocated.
3) It is not effective for PCI devices without a dedicated VRAM BAR.
4) It is device-agnostic, so it has to iterate over all of the PCI BARs
   to find the VRAM aperture.
5) It has invented lots of methods to determine which one is the default
   boot device, but this is still a policy because it doesn't give the
   user a choice to override it.

Device drivers may have better knowledge about which PCI BAR contains the
firmware FB. This patch tries to solve the above problems by adding a
function callback to the vga_client_register() interface. DRM drivers for
PCI devices can provide a xx_vga_is_primary_gpu() callback at driver load
time. Once the driver binds the device successfully, VGAARB will call back
into the driver. This gives device drivers a chance to provide accurate
boot device identification, which in turn unlocks the arbitration service
for non-x86 architectures. A device driver can also just pass a NULL pointer
to keep the original behavior.

This patch is intended to introduce the mechanism only; the specific
implementation is left to the authors of the various device drivers. Also
honor the comment: "Clients have *TWO* callback mechanisms they can use".

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Konig <christian.koenig@amd.com>
Cc: Pan Xinhui <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Lijo Lazar <lijo.lazar@amd.com>
Cc: YiPeng Chai <YiPeng.Chai@amd.com>
Cc: Bokun Zhang <Bokun.Zhang@amd.com>
Cc: Likun Gao <Likun.Gao@amd.com>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
CC: Kevin Tian <kevin.tian@intel.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Yishai Hadas <yishaih@nvidia.com>
Cc: Abhishek Sahu <abhsahu@nvidia.com>
Cc: Yi Liu <yi.l.liu@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com> # i915
Reviewed-by: Lyude Paul <lyude@redhat.com> # nouveau
Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  2 +-
 drivers/gpu/drm/i915/display/intel_vga.c   |  3 +-
 drivers/gpu/drm/loongson/lsdc_drv.c        |  2 +-
 drivers/gpu/drm/nouveau/nouveau_vga.c      |  2 +-
 drivers/gpu/drm/radeon/radeon_device.c     |  2 +-
 drivers/pci/vgaarb.c                       | 55 ++++++++++++++++++++--
 drivers/vfio/pci/vfio_pci_core.c           |  2 +-
 include/linux/vgaarb.h                     |  8 ++--
 8 files changed, 61 insertions(+), 15 deletions(-)

Comments

suijingfeng July 17, 2023, 2:07 p.m. UTC | #1
Hi,


Fixes: f6b1772b2555 ('vgaarb: remove the unused irq_set_state argument to vga_client_register')


After that patch was applied, there is only one callback mechanism we can
use, not two anymore.


On 2023/7/12 00:43, Sui Jingfeng wrote:
> From: Sui Jingfeng <suijingfeng@loongson.cn>
>
> Currently, the strategy of selecting the default boot on a multiple video
> card coexistence system is not perfect. Potential problems are:
>
> 1) This function is a no-op on non-x86 architectures.
> 2) It does not take the PCI Bar may get relocated into consideration.
> 3) It is not effective for the PCI device without a dedicated VRAM Bar.
> 4) It is device-agnostic, thus it has to waste the effort to iterate all
>     of the PCI Bar to find the VRAM aperture.
> 5) It has invented lots of methods to determine which one is the default
>     boot device, but this is still a policy because it doesn't give the
>     user a choice to override.
>
> With the observation that device drivers may have better knowledge about
> which PCI bar contains the firmware FB. This patch tries to solve the above
> problems by introducing a function callback to the vga_client_register()
> function interface. DRM device drivers for the PCI device could provide
> a xx_vga_is_primary_gpu() function callback during the driver loading time.
> Once the driver binds the device successfully, VRAARB will call back to
> the driver. This gives the device drivers a chance to provide accurate
> boot device identification. Which in turn unlock the abitration service
> to non-x86 architectures. A device driver can also just pass a NULL pointer
> to keep the original behavior.
>
> This patch is intended to introducing the mechanism only, the specific
> implementation is left to the authors of various device driver. Also honor
> the comment: "Clients have *TWO* callback mechanisms they can use"
>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Christian Konig <christian.koenig@amd.com>
> Cc: Pan Xinhui <Xinhui.Pan@amd.com>
> Cc: David Airlie <airlied@gmail.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Jani Nikula <jani.nikula@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Cc: Ben Skeggs <bskeggs@redhat.com>
> Cc: Karol Herbst <kherbst@redhat.com>
> Cc: Lyude Paul <lyude@redhat.com>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Maxime Ripard <mripard@kernel.org>
> Cc: Thomas Zimmermann <tzimmermann@suse.de>
> Cc: Hawking Zhang <Hawking.Zhang@amd.com>
> Cc: Mario Limonciello <mario.limonciello@amd.com>
> Cc: Lijo Lazar <lijo.lazar@amd.com>
> Cc: YiPeng Chai <YiPeng.Chai@amd.com>
> Cc: Bokun Zhang <Bokun.Zhang@amd.com>
> Cc: Likun Gao <Likun.Gao@amd.com>
> Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> CC: Kevin Tian <kevin.tian@intel.com>
> Cc: Cornelia Huck <cohuck@redhat.com>
> Cc: Yishai Hadas <yishaih@nvidia.com>
> Cc: Abhishek Sahu <abhsahu@nvidia.com>
> Cc: Yi Liu <yi.l.liu@intel.com>
> Acked-by: Jani Nikula <jani.nikula@intel.com> # i915
> Reviewed-by: Lyude Paul <lyude@redhat.com> # nouveau
> Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  2 +-
>   drivers/gpu/drm/i915/display/intel_vga.c   |  3 +-
>   drivers/gpu/drm/loongson/lsdc_drv.c        |  2 +-
>   drivers/gpu/drm/nouveau/nouveau_vga.c      |  2 +-
>   drivers/gpu/drm/radeon/radeon_device.c     |  2 +-
>   drivers/pci/vgaarb.c                       | 55 ++++++++++++++++++++--
>   drivers/vfio/pci/vfio_pci_core.c           |  2 +-
>   include/linux/vgaarb.h                     |  8 ++--
>   8 files changed, 61 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index a92c6189b4b6..d98f0801ac77 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4103,7 +4103,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>   	/* this will fail for cards that aren't VGA class devices, just
>   	 * ignore it */
>   	if ((adev->pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA)
> -		vga_client_register(adev->pdev, amdgpu_device_vga_set_decode);
> +		vga_client_register(adev->pdev, amdgpu_device_vga_set_decode, NULL);
>   
>   	px = amdgpu_device_supports_px(ddev);
>   
> diff --git a/drivers/gpu/drm/i915/display/intel_vga.c b/drivers/gpu/drm/i915/display/intel_vga.c
> index 286a0bdd28c6..98d7d4dffe9f 100644
> --- a/drivers/gpu/drm/i915/display/intel_vga.c
> +++ b/drivers/gpu/drm/i915/display/intel_vga.c
> @@ -115,7 +115,6 @@ intel_vga_set_decode(struct pci_dev *pdev, bool enable_decode)
>   
>   int intel_vga_register(struct drm_i915_private *i915)
>   {
> -
>   	struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
>   	int ret;
>   
> @@ -127,7 +126,7 @@ int intel_vga_register(struct drm_i915_private *i915)
>   	 * then we do not take part in VGA arbitration and the
>   	 * vga_client_register() fails with -ENODEV.
>   	 */
> -	ret = vga_client_register(pdev, intel_vga_set_decode);
> +	ret = vga_client_register(pdev, intel_vga_set_decode, NULL);
>   	if (ret && ret != -ENODEV)
>   		return ret;
>   
> diff --git a/drivers/gpu/drm/loongson/lsdc_drv.c b/drivers/gpu/drm/loongson/lsdc_drv.c
> index 188ec82afcfb..d10a28c2c494 100644
> --- a/drivers/gpu/drm/loongson/lsdc_drv.c
> +++ b/drivers/gpu/drm/loongson/lsdc_drv.c
> @@ -289,7 +289,7 @@ static int lsdc_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>   
>   	pci_set_drvdata(pdev, ddev);
>   
> -	vga_client_register(pdev, lsdc_vga_set_decode);
> +	vga_client_register(pdev, lsdc_vga_set_decode, NULL);
>   
>   	drm_kms_helper_poll_init(ddev);
>   
> diff --git a/drivers/gpu/drm/nouveau/nouveau_vga.c b/drivers/gpu/drm/nouveau/nouveau_vga.c
> index f8bf0ec26844..162b4f4676c7 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_vga.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_vga.c
> @@ -92,7 +92,7 @@ nouveau_vga_init(struct nouveau_drm *drm)
>   		return;
>   	pdev = to_pci_dev(dev->dev);
>   
> -	vga_client_register(pdev, nouveau_vga_set_decode);
> +	vga_client_register(pdev, nouveau_vga_set_decode, NULL);
>   
>   	/* don't register Thunderbolt eGPU with vga_switcheroo */
>   	if (pci_is_thunderbolt_attached(pdev))
> diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
> index afbb3a80c0c6..71f2ff39d6a1 100644
> --- a/drivers/gpu/drm/radeon/radeon_device.c
> +++ b/drivers/gpu/drm/radeon/radeon_device.c
> @@ -1425,7 +1425,7 @@ int radeon_device_init(struct radeon_device *rdev,
>   	/* if we have > 1 VGA cards, then disable the radeon VGA resources */
>   	/* this will fail for cards that aren't VGA class devices, just
>   	 * ignore it */
> -	vga_client_register(rdev->pdev, radeon_vga_set_decode);
> +	vga_client_register(rdev->pdev, radeon_vga_set_decode, NULL);
>   
>   	if (rdev->flags & RADEON_IS_PX)
>   		runtime = true;
> diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
> index 953daf731b2c..610ddcccef24 100644
> --- a/drivers/pci/vgaarb.c
> +++ b/drivers/pci/vgaarb.c
> @@ -53,6 +53,7 @@ struct vga_device {
>   	bool bridge_has_one_vga;
>   	bool is_firmware_default;	/* device selected by firmware */
>   	unsigned int (*set_decode)(struct pci_dev *pdev, bool decode);
> +	bool (*is_primary_gpu)(struct pci_dev *pdev);
>   };
>   
>   static LIST_HEAD(vga_list);
> @@ -958,6 +959,13 @@ EXPORT_SYMBOL(vga_set_legacy_decoding);
>    * @set_decode callback: If a client can disable its GPU VGA resource, it
>    * will get a callback from this to set the encode/decode state.
>    *
> + * @is_primary_gpu callback: call back to the device driver, query if a PCI
> + * GPU client is the primary display device, as device drivers (drm-based
> + * or fbdev-based) may have better knowledge if a specific device is the
> + * default boot device or should be the default boot device. But this
> + * callback is optional. A device driver can simply pass a NULL pointer to
> + * adhere to the original rules of arbitration.
> + *
>    * Rationale: we cannot disable VGA decode resources unconditionally, some
>    * single GPU laptops seem to require ACPI or BIOS access to the VGA registers
>    * to control things like backlights etc. Hopefully newer multi-GPU laptops do
> @@ -973,7 +981,8 @@ EXPORT_SYMBOL(vga_set_legacy_decoding);
>    * Returns: 0 on success, -1 on failure
>    */
>   int vga_client_register(struct pci_dev *pdev,
> -		unsigned int (*set_decode)(struct pci_dev *pdev, bool decode))
> +		unsigned int (*set_decode)(struct pci_dev *pdev, bool decode),
> +		bool (*is_primary_gpu)(struct pci_dev *pdev))
>   {
>   	int ret = -ENODEV;
>   	struct vga_device *vgadev;
> @@ -985,6 +994,7 @@ int vga_client_register(struct pci_dev *pdev,
>   		goto bail;
>   
>   	vgadev->set_decode = set_decode;
> +	vgadev->is_primary_gpu = is_primary_gpu;
>   	ret = 0;
>   
>   bail:
> @@ -1490,6 +1500,30 @@ static void vga_arbiter_notify_clients(void)
>   	spin_unlock_irqrestore(&vga_lock, flags);
>   }
>   
> +static void vga_arbiter_do_arbitration(struct pci_dev *pdev)
> +{
> +	struct vga_device *vgadev;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&vga_lock, flags);
> +	list_for_each_entry(vgadev, &vga_list, list) {
> +		if (vgadev->pdev != pdev)
> +			continue;
> +
> +		/* This device already the boot device, do nothing */
> +		if (pdev == vga_default_device())
> +			break;
> +
> +		if (vgadev->is_primary_gpu) {
> +			if (vgadev->is_primary_gpu(pdev)) {
> +				vgaarb_info(&pdev->dev, "Overriding as primary GPU\n");
> +				vga_set_default_device(pdev);
> +			}
> +		}
> +	}
> +	spin_unlock_irqrestore(&vga_lock, flags);
> +}
> +
>   static int pci_notify(struct notifier_block *nb, unsigned long action,
>   		      void *data)
>   {
> @@ -1509,13 +1543,24 @@ static int pci_notify(struct notifier_block *nb, unsigned long action,
>   	 * cases of hotplugable vga cards.
>   	 */
>   
> -	if (action == BUS_NOTIFY_ADD_DEVICE)
> +	switch (action) {
> +	case BUS_NOTIFY_ADD_DEVICE:
>   		notify = vga_arbiter_add_pci_device(pdev);
> -	else if (action == BUS_NOTIFY_DEL_DEVICE)
> +		if (notify)
> +			vga_arbiter_notify_clients();
> +		break;
> +	case BUS_NOTIFY_DEL_DEVICE:
>   		notify = vga_arbiter_del_pci_device(pdev);
> +		if (notify)
> +			vga_arbiter_notify_clients();
> +		break;
> +	case BUS_NOTIFY_BOUND_DRIVER:
> +		vga_arbiter_do_arbitration(pdev);
> +		break;
> +	default:
> +		break;
> +	}
>   
> -	if (notify)
> -		vga_arbiter_notify_clients();
>   	return 0;
>   }
>   
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> index 20d7b69ea6ff..531c4d8ef26e 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -2108,7 +2108,7 @@ static int vfio_pci_vga_init(struct vfio_pci_core_device *vdev)
>   	if (ret)
>   		return ret;
>   
> -	ret = vga_client_register(pdev, vfio_pci_set_decode);
> +	ret = vga_client_register(pdev, vfio_pci_set_decode, NULL);
>   	if (ret)
>   		return ret;
>   	vga_set_legacy_decoding(pdev, vfio_pci_set_decode(pdev, false));
> diff --git a/include/linux/vgaarb.h b/include/linux/vgaarb.h
> index 97129a1bbb7d..e4102be21f47 100644
> --- a/include/linux/vgaarb.h
> +++ b/include/linux/vgaarb.h
> @@ -33,7 +33,8 @@ struct pci_dev *vga_default_device(void);
>   void vga_set_default_device(struct pci_dev *pdev);
>   int vga_remove_vgacon(struct pci_dev *pdev);
>   int vga_client_register(struct pci_dev *pdev,
> -		unsigned int (*set_decode)(struct pci_dev *pdev, bool state));
> +		unsigned int (*set_decode)(struct pci_dev *pdev, bool state),
> +		bool (*is_primary_gpu)(struct pci_dev *pdev));
>   #else /* CONFIG_VGA_ARB */
>   static inline void vga_set_legacy_decoding(struct pci_dev *pdev,
>   		unsigned int decodes)
> @@ -59,7 +60,8 @@ static inline int vga_remove_vgacon(struct pci_dev *pdev)
>   	return 0;
>   }
>   static inline int vga_client_register(struct pci_dev *pdev,
> -		unsigned int (*set_decode)(struct pci_dev *pdev, bool state))
> +		unsigned int (*set_decode)(struct pci_dev *pdev, bool state),
> +		bool (*is_primary_gpu)(struct pci_dev *pdev))
>   {
>   	return 0;
>   }
> @@ -97,7 +99,7 @@ static inline int vga_get_uninterruptible(struct pci_dev *pdev,
>   
>   static inline void vga_client_unregister(struct pci_dev *pdev)
>   {
> -	vga_client_register(pdev, NULL);
> +	vga_client_register(pdev, NULL, NULL);
>   }
>   
>   #endif /* LINUX_VGA_H */
Bjorn Helgaas July 19, 2023, 7:32 p.m. UTC | #2
[+cc linux-pci (please cc in the future since the bulk of this patch
is in drivers/pci/)]

On Wed, Jul 12, 2023 at 12:43:05AM +0800, Sui Jingfeng wrote:
> From: Sui Jingfeng <suijingfeng@loongson.cn>
> 
> Currently, the strategy of selecting the default boot on a multiple video
> card coexistence system is not perfect. Potential problems are:
> 
> 1) This function is a no-op on non-x86 architectures.

Which function in particular is a no-op for non-x86?

> 2) It does not take the PCI Bar may get relocated into consideration.
> 3) It is not effective for the PCI device without a dedicated VRAM Bar.
> 4) It is device-agnostic, thus it has to waste the effort to iterate all
>    of the PCI Bar to find the VRAM aperture.
> 5) It has invented lots of methods to determine which one is the default
>    boot device, but this is still a policy because it doesn't give the
>    user a choice to override.

I don't think we need a list of *potential* problems.  We need an
example of the specific problem this will solve, i.e., what currently
does not work?

The drm/ast and maybe drm/loongson patches are the only ones that use
the new callback, so I assume there are real problems with those
drivers.

CONFIG_DRM_AST is a tristate.  We're talking about identifying the
boot-time console device.  So if CONFIG_DRM_AST=m, I guess we don't
get the benefit of the new callback unless the module gets loaded?

> Also honor the comment: "Clients have *TWO* callback mechanisms they
> can use"

This refers to the existing vga_client_register() function comment:

   * vga_client_register - register or unregister a VGA arbitration client
   * @pdev: pci device of the VGA client
   * @set_decode: vga decode change callback
   *
   * Clients have two callback mechanisms they can use.
   *
   * @set_decode callback: If a client can disable its GPU VGA resource, it
   * will get a callback from this to set the encode/decode state.

and the fact that struct vga_device currently only contains *one*
callback function pointer:

  unsigned int (*set_decode)(struct pci_dev *pdev, bool decode);

Adding the .is_primary_gpu() callback does mean there will now be two
callbacks, as the comment says, but I think it's just confusing to
mention this in the commit log, so I would just remove it.

> @@ -1509,13 +1543,24 @@ static int pci_notify(struct notifier_block *nb, unsigned long action,
>  	 * cases of hotplugable vga cards.
>  	 */
>  
> -	if (action == BUS_NOTIFY_ADD_DEVICE)
> +	switch (action) {
> +	case BUS_NOTIFY_ADD_DEVICE:
>  		notify = vga_arbiter_add_pci_device(pdev);
> -	else if (action == BUS_NOTIFY_DEL_DEVICE)
> +		if (notify)
> +			vga_arbiter_notify_clients();
> +		break;
> +	case BUS_NOTIFY_DEL_DEVICE:
>  		notify = vga_arbiter_del_pci_device(pdev);
> +		if (notify)
> +			vga_arbiter_notify_clients();
> +		break;
> +	case BUS_NOTIFY_BOUND_DRIVER:
> +		vga_arbiter_do_arbitration(pdev);
> +		break;
> +	default:
> +		break;
> +	}

Changing from if/else to switch makes the patch bigger than necessary
for no real benefit and obscures what is really changing.
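
One way the smaller change could look is sketched below as a userspace
model. Everything prefixed with mock_ or MOCK_ is a hypothetical stand-in
for the corresponding vgaarb function or BUS_NOTIFY_* action, so this is an
illustration under those assumptions and not a tested kernel patch. It keeps
the if/else chain, adds a single branch for the bound-driver event, and
leaves the shared "if (notify)" tail untouched. Since notify starts out
false, the new branch never triggers a client notification.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the BUS_NOTIFY_* action codes. */
enum mock_action {
	MOCK_ADD_DEVICE,
	MOCK_DEL_DEVICE,
	MOCK_BOUND_DRIVER,
	MOCK_OTHER,
};

static int notify_calls;	/* models calls to vga_arbiter_notify_clients() */
static int arbitration_calls;	/* models calls to vga_arbiter_do_arbitration() */

/* Stubs: assume the add/del helpers always ask for a notification. */
static bool mock_add_pci_device(void) { return true; }
static bool mock_del_pci_device(void) { return true; }
static void mock_notify_clients(void) { notify_calls++; }
static void mock_do_arbitration(void) { arbitration_calls++; }

static int mock_pci_notify(enum mock_action action)
{
	bool notify = false;

	if (action == MOCK_ADD_DEVICE)
		notify = mock_add_pci_device();
	else if (action == MOCK_DEL_DEVICE)
		notify = mock_del_pci_device();
	else if (action == MOCK_BOUND_DRIVER)
		mock_do_arbitration();	/* the only new branch */

	if (notify)	/* still false for MOCK_BOUND_DRIVER and MOCK_OTHER */
		mock_notify_clients();
	return 0;
}
```

In this model the bound-driver event runs the arbitration step but never
reaches mock_notify_clients(), which is the same behavior the switch
version has.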

Bjorn
suijingfeng July 19, 2023, 10:32 p.m. UTC | #3
Hi,

On 2023/7/20 03:32, Bjorn Helgaas wrote:
> [+cc linux-pci (please cc in the future since the bulk of this patch
> is in drivers/pci/)]
>
> On Wed, Jul 12, 2023 at 12:43:05AM +0800, Sui Jingfeng wrote:
>> From: Sui Jingfeng <suijingfeng@loongson.cn>
>>
>> Currently, the strategy of selecting the default boot on a multiple video
>> card coexistence system is not perfect. Potential problems are:
>>
>> 1) This function is a no-op on non-x86 architectures.
> Which function in particular is a no-op for non-x86?


I am referring to the vga_is_firmware_default() function.

I will improve the commit message in the next version (to make it more
human readable).

Thanks for pointing it out.


>> 2) It does not take the PCI Bar may get relocated into consideration.
>> 3) It is not effective for the PCI device without a dedicated VRAM Bar.
>> 4) It is device-agnostic, thus it has to waste the effort to iterate all
>>     of the PCI Bar to find the VRAM aperture.
>> 5) It has invented lots of methods to determine which one is the default
>>     boot device, but this is still a policy because it doesn't give the
>>     user a choice to override.
> I don't think we need a list of *potential* problems.  We need an
> example of the specific problem this will solve, i.e., what currently
> does not work?

1) The selection of the primary GPU on non-x86 platforms (arm64, RISC-V,
PowerPC, etc.).

Most server platforms are equipped with an ASPEED BMC, and such hardware
platforms have a lot of PCIe slots.

So I think an ASPEED BMC competing with a radeon (or amdgpu) card is very
common.


2) The ability to pass control back to the end user.

Convert the "device driven" selection to a "driver driven" or "human driven" one.

Currently, it is the machine making the decision.

I will probably be able to give some examples in the next version.


> The drm/ast and maybe drm/loongson patches are the only ones that use
> the new callback, so I assume there are real problems with those
> drivers.
>
> CONFIG_DRM_AST is a tristate.  We're talking about identifying the
> boot-time console device.  So if CONFIG_DRM_AST=m, I guess we don't
> get the benefit of the new callback unless the module gets loaded?
>
This patch set is mostly for users of the X server.

It is actually okay if CONFIG_DRM_AST=m. (It will work no matter whether CONFIG_DRM_AST=m or CONFIG_DRM_AST=y.)


The device and the driver are bound at a later time.

So we are lucky: we need this behavior to implement the override.
Sui Jingfeng July 19, 2023, 10:44 p.m. UTC | #4
On 2023/7/20 06:32, suijingfeng wrote:
> it will be works no matter CONFIG_DRM_AST=m or CONFIG_DRM_AST=y


It will work regardless of whether CONFIG_DRM_AST=m or CONFIG_DRM_AST=y.

When vgaarb calls into the device driver, the device driver has already
been loaded successfully,

and the PCI(e) device enumeration has already finished.


So the last chance that vgaarb gives us to override actually happens
very late.

But it will happen as long as the device driver gets loaded successfully.
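
The ordering described above can be sketched as a small userspace model.
All mock_ names below are hypothetical stand-ins (for struct pci_dev,
struct vga_device, vga_default_device() and the add-device and bound-driver
steps), and the "first device wins" rule is a deliberate simplification of
the real selection rules. It only shows that a default exists before any
driver is bound, and that the callback can replace it later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev. */
struct mock_pdev {
	const char *name;
};

/* Hypothetical stand-in for struct vga_device. */
struct mock_vgadev {
	struct mock_pdev *pdev;
	bool (*is_primary_gpu)(struct mock_pdev *pdev);	/* optional, may be NULL */
};

static struct mock_pdev *default_dev;	/* models vga_default_device() */

/* Step 1, device add: pick a default without asking any driver.
 * Simplification: the first device seen wins. */
static void mock_add_device(struct mock_vgadev *vgadev)
{
	assert(vgadev && vgadev->pdev);
	if (!default_dev)
		default_dev = vgadev->pdev;
}

/* Step 2, driver bound: give the driver a chance to override. */
static void mock_driver_bound(struct mock_vgadev *vgadev)
{
	assert(vgadev && vgadev->pdev);
	if (vgadev->pdev == default_dev)
		return;		/* already the default, nothing to do */
	if (vgadev->is_primary_gpu && vgadev->is_primary_gpu(vgadev->pdev))
		default_dev = vgadev->pdev;
}

/* Hypothetical driver callback that always claims to be primary. */
static bool mock_always_primary(struct mock_pdev *pdev)
{
	(void)pdev;
	return true;
}
```

A device registered with a NULL callback never changes the default in this
model, which corresponds to keeping the original behavior.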
suijingfeng July 19, 2023, 10:51 p.m. UTC | #5
On 2023/7/20 03:32, Bjorn Helgaas wrote:
> but I think it's just confusing to
> mention this in the commit log, so I would just remove it.


OK, this will be done in the next version.
suijingfeng July 24, 2023, 11:56 a.m. UTC | #6
Hi,


I replied to you in too much of a hurry and may have missed the point of
part of your review. Sorry.


On 2023/7/20 03:32, Bjorn Helgaas wrote:
> CONFIG_DRM_AST is a tristate.  We're talking about identifying the
> boot-time console device.

Yes, my patch will only work *after* the module gets loaded successfully.

But generally, vgaarb will select a default boot device before my patch takes effect.

I mean that vgaarb will select a default boot device when the vga_arbiter_add_pci_device() function is called.


In practice, I have not noticed any obvious problems so far.

I lack knowledge about the boot-time console.

What are the potential problems with such a condition?


>   So if CONFIG_DRM_AST=m, I guess we don't
> get the benefit of the new callback unless the module gets loaded?

Yes, my approach will not work until the device driver kernel module
gets loaded successfully.

So what is the problem with such a situation? Do you see something weird?
suijingfeng July 24, 2023, 12:16 p.m. UTC | #7
Hi,

On 2023/7/20 03:32, Bjorn Helgaas wrote:
>> 2) It does not take the PCI Bar may get relocated into consideration.
>> 3) It is not effective for the PCI device without a dedicated VRAM Bar.
>> 4) It is device-agnostic, thus it has to waste the effort to iterate all
>>     of the PCI Bar to find the VRAM aperture.
>> 5) It has invented lots of methods to determine which one is the default
>>     boot device, but this is still a policy because it doesn't give the
>>     user a choice to override.
> I don't think we need a list of *potential* problems.  We need an
> example of the specific problem this will solve, i.e., what currently
> does not work?


This version does allow the arbitration service to work on non-x86
architectures,

which also allows me to remove an arch-specific workaround.

I will give more details in the next version.


But I want to point out one more drawback of vgaarb here:


(6) It does not work for non-VGA-compatible PCI(e) display controllers.


Because, currently, vgaarb deals with PCI VGA-compatible devices only.

See another patch set of mine [1] for a more elaborate discussion.

It also ignores PCI_CLASS_NOT_DEFINED_VGA, as Maciej puts it [2].

My approach, in contrast, does not require the display controller to be
VGA-compatible to enjoy the arbitration service.

What do you think?


[1] https://patchwork.freedesktop.org/patch/546690/?series=120548&rev=1

[2] https://lkml.org/lkml/2023/6/18/315
suijingfeng July 24, 2023, 12:28 p.m. UTC | #8
Hi,


Thanks for noticing my change.


On 2023/7/20 03:32, Bjorn Helgaas wrote:
>> @@ -1509,13 +1543,24 @@ static int pci_notify(struct notifier_block *nb, unsigned long action,
>>   	 * cases of hotplugable vga cards.
>>   	 */
>>   
>> -	if (action == BUS_NOTIFY_ADD_DEVICE)
>> +	switch (action) {
>> +	case BUS_NOTIFY_ADD_DEVICE:
>>   		notify = vga_arbiter_add_pci_device(pdev);
>> -	else if (action == BUS_NOTIFY_DEL_DEVICE)
>> +		if (notify)
>> +			vga_arbiter_notify_clients();
>> +		break;
>> +	case BUS_NOTIFY_DEL_DEVICE:
>>   		notify = vga_arbiter_del_pci_device(pdev);
>> +		if (notify)
>> +			vga_arbiter_notify_clients();
>> +		break;
>> +	case BUS_NOTIFY_BOUND_DRIVER:
>> +		vga_arbiter_do_arbitration(pdev);
>> +		break;
>> +	default:
>> +		break;
>> +	}
> Changing from if/else to switch makes the patch bigger than necessary
> for no real benefit and obscures what is really changing.

Actually, the logic becomes clearer after this patch is applied.

```

     switch (action) {
     case BUS_NOTIFY_ADD_DEVICE:
         notify = vga_arbiter_add_pci_device(pdev);
         if (notify)
             vga_arbiter_notify_clients();
         break;
     case BUS_NOTIFY_DEL_DEVICE:
         notify = vga_arbiter_del_pci_device(pdev);
         if (notify)
             vga_arbiter_notify_clients();
         break;
     case BUS_NOTIFY_BOUND_DRIVER:
         vga_arbiter_do_arbitration(pdev);
         break;
     default:
         break;
     }

```


Because we only need to call vga_arbiter_notify_clients() when action ==
BUS_NOTIFY_ADD_DEVICE or action == BUS_NOTIFY_DEL_DEVICE,

but *NOT* when the action equals BUS_NOTIFY_BOUND_DRIVER.
Bjorn Helgaas July 25, 2023, 9:30 p.m. UTC | #9
On Mon, Jul 24, 2023 at 08:16:18PM +0800, suijingfeng wrote:
> On 2023/7/20 03:32, Bjorn Helgaas wrote:
> > > 2) It does not take the PCI Bar may get relocated into consideration.
> > > 3) It is not effective for the PCI device without a dedicated VRAM Bar.
> > > 4) It is device-agnostic, thus it has to waste the effort to iterate all
> > >     of the PCI Bar to find the VRAM aperture.
> > > 5) It has invented lots of methods to determine which one is the default
> > >     boot device, but this is still a policy because it doesn't give the
> > >     user a choice to override.
> > I don't think we need a list of *potential* problems.  We need an
> > example of the specific problem this will solve, i.e., what currently
> > does not work?
> 
> 
> This version do allow the arbitration service works on non-x86 arch,
> which also allow me remove a arch-specific workaround.
> I will give more detail at the next version.

Yes.  This part I think we want.

> But I want to provide one more drawback of vgaarb here:
> 
> (6) It does not works for non VGA-compatible PCI(e) display controllers.
> 
> Because, currently, vgaarb deal with PCI VGA compatible devices only.
> 
> See another my patch set [1] for more elaborate discussion.
> 
> It also ignore PCI_CLASS_NOT_DEFINED_VGA as Maciej puts it[2].
> 
> While my approach do not required the display controller to be
> VGA-compatible to enjoy the arbitration service.

I think vgaarb is really only for dealing with the problem of the
legacy VGA address space routing.  For example, there may be VGA
devices that require the [pci 0xa0000-0xbffff] range but they don't
describe that via a BAR.  There may also be VGA option ROMs that
depend on that range so they can initialize the device.

The [pci 0xa0000-0xbffff] range can only be routed to one device at a
time, and vgaarb is what takes care of that by manipulating the VGA
Enable bits in bridges.
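
As a rough illustration of the "one device at a time" constraint, here is
a userspace sketch. PCI_BRIDGE_CTL_VGA (0x08) is the VGA Enable bit of the
Bridge Control register; struct mock_bridge and the mock_ helpers are
hypothetical, and the model assumes a flat set of sibling bridges with one
VGA device behind each, which ignores real bridge hierarchies and locking.

```c
#include <stddef.h>
#include <stdint.h>

#define PCI_BRIDGE_CTL_VGA	0x08	/* VGA Enable bit in Bridge Control */

/* Hypothetical stand-in for a bridge's Bridge Control register. */
struct mock_bridge {
	uint16_t bridge_ctl;
};

/* Forward the legacy VGA range through bridges[target] only and stop
 * forwarding it through every other bridge. */
static void mock_route_legacy_vga(struct mock_bridge *bridges, size_t n,
				  size_t target)
{
	for (size_t i = 0; i < n; i++) {
		if (i == target)
			bridges[i].bridge_ctl |= PCI_BRIDGE_CTL_VGA;
		else
			bridges[i].bridge_ctl &= (uint16_t)~PCI_BRIDGE_CTL_VGA;
	}
}

/* Count how many bridges currently forward the legacy VGA range. */
static size_t mock_count_vga_enabled(const struct mock_bridge *bridges,
				     size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (bridges[i].bridge_ctl & PCI_BRIDGE_CTL_VGA)
			count++;
	return count;
}
```

After any call to mock_route_legacy_vga() exactly one bridge in the model
has the bit set, which is the invariant the arbiter has to maintain.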

I don't think we should extend vgaarb to deal with non-VGA GPUs in
general, i.e., I don't think it should be concerned with devices and
option ROMs that do not require the [pci 0xa0000-0xbffff] range.

I think a strict reading of the PCI Class Code spec would be that only
devices with Programming Interface 0000 0000b can depend on that
legacy range.

If that's what vgaarb currently enforces, great.  If it currently
deals with more than just 0000 0000b devices, and there's some value
in restricting it to only 0000 0000b, we could try that, but I would
suggest doing that in a tiny patch all by itself.  Then if we trip
over a problem, it's easy to bisect and revert it.
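
For reference, a sketch of what such a strict check might look like,
written as standalone C. PCI_CLASS_DISPLAY_VGA (0x0300) is the existing
kernel constant; the helper name is_strict_vga() is hypothetical, and
whether vgaarb should apply this test is exactly the open question above.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_CLASS_DISPLAY_VGA	0x0300	/* base class 0x03, sub-class 0x00 */

/* The 24-bit class code is laid out as:
 * base class [23:16], sub-class [15:8], programming interface [7:0]. */
static bool is_strict_vga(uint32_t class_code)
{
	uint32_t base_sub = (class_code >> 8) & 0xffff;
	uint32_t prog_if = class_code & 0xff;

	/* Only VGA class devices with Programming Interface 0000 0000b. */
	return base_sub == PCI_CLASS_DISPLAY_VGA && prog_if == 0x00;
}
```

With this check, class code 0x030000 passes, while 0x030001 (same class,
non-zero programming interface), 0x038000 (other display controller) and
0x000100 (PCI_CLASS_NOT_DEFINED_VGA) do not.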

> [1] https://patchwork.freedesktop.org/patch/546690/?series=120548&rev=1
> 
> [2] https://lkml.org/lkml/2023/6/18/315
>
Patch

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index a92c6189b4b6..d98f0801ac77 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4103,7 +4103,7 @@  int amdgpu_device_init(struct amdgpu_device *adev,
 	/* this will fail for cards that aren't VGA class devices, just
 	 * ignore it */
 	if ((adev->pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA)
-		vga_client_register(adev->pdev, amdgpu_device_vga_set_decode);
+		vga_client_register(adev->pdev, amdgpu_device_vga_set_decode, NULL);
 
 	px = amdgpu_device_supports_px(ddev);
 
diff --git a/drivers/gpu/drm/i915/display/intel_vga.c b/drivers/gpu/drm/i915/display/intel_vga.c
index 286a0bdd28c6..98d7d4dffe9f 100644
--- a/drivers/gpu/drm/i915/display/intel_vga.c
+++ b/drivers/gpu/drm/i915/display/intel_vga.c
@@ -115,7 +115,6 @@  intel_vga_set_decode(struct pci_dev *pdev, bool enable_decode)
 
 int intel_vga_register(struct drm_i915_private *i915)
 {
-
 	struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
 	int ret;
 
@@ -127,7 +126,7 @@  int intel_vga_register(struct drm_i915_private *i915)
 	 * then we do not take part in VGA arbitration and the
 	 * vga_client_register() fails with -ENODEV.
 	 */
-	ret = vga_client_register(pdev, intel_vga_set_decode);
+	ret = vga_client_register(pdev, intel_vga_set_decode, NULL);
 	if (ret && ret != -ENODEV)
 		return ret;
 
diff --git a/drivers/gpu/drm/loongson/lsdc_drv.c b/drivers/gpu/drm/loongson/lsdc_drv.c
index 188ec82afcfb..d10a28c2c494 100644
--- a/drivers/gpu/drm/loongson/lsdc_drv.c
+++ b/drivers/gpu/drm/loongson/lsdc_drv.c
@@ -289,7 +289,7 @@  static int lsdc_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	pci_set_drvdata(pdev, ddev);
 
-	vga_client_register(pdev, lsdc_vga_set_decode);
+	vga_client_register(pdev, lsdc_vga_set_decode, NULL);
 
 	drm_kms_helper_poll_init(ddev);
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_vga.c b/drivers/gpu/drm/nouveau/nouveau_vga.c
index f8bf0ec26844..162b4f4676c7 100644
--- a/drivers/gpu/drm/nouveau/nouveau_vga.c
+++ b/drivers/gpu/drm/nouveau/nouveau_vga.c
@@ -92,7 +92,7 @@  nouveau_vga_init(struct nouveau_drm *drm)
 		return;
 	pdev = to_pci_dev(dev->dev);
 
-	vga_client_register(pdev, nouveau_vga_set_decode);
+	vga_client_register(pdev, nouveau_vga_set_decode, NULL);
 
 	/* don't register Thunderbolt eGPU with vga_switcheroo */
 	if (pci_is_thunderbolt_attached(pdev))
diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index afbb3a80c0c6..71f2ff39d6a1 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -1425,7 +1425,7 @@  int radeon_device_init(struct radeon_device *rdev,
 	/* if we have > 1 VGA cards, then disable the radeon VGA resources */
 	/* this will fail for cards that aren't VGA class devices, just
 	 * ignore it */
-	vga_client_register(rdev->pdev, radeon_vga_set_decode);
+	vga_client_register(rdev->pdev, radeon_vga_set_decode, NULL);
 
 	if (rdev->flags & RADEON_IS_PX)
 		runtime = true;
diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
index 953daf731b2c..610ddcccef24 100644
--- a/drivers/pci/vgaarb.c
+++ b/drivers/pci/vgaarb.c
@@ -53,6 +53,7 @@  struct vga_device {
 	bool bridge_has_one_vga;
 	bool is_firmware_default;	/* device selected by firmware */
 	unsigned int (*set_decode)(struct pci_dev *pdev, bool decode);
+	bool (*is_primary_gpu)(struct pci_dev *pdev);
 };
 
 static LIST_HEAD(vga_list);
@@ -958,6 +959,13 @@  EXPORT_SYMBOL(vga_set_legacy_decoding);
  * @set_decode callback: If a client can disable its GPU VGA resource, it
  * will get a callback from this to set the encode/decode state.
  *
+ * @is_primary_gpu callback: Called once a driver binds to the device, to
+ * ask whether this PCI GPU is the primary display device. Device drivers
+ * (DRM-based or fbdev-based) may know better than the arbiter whether a
+ * specific device is, or should be, the default boot device. This
+ * callback is optional: a driver can pass a NULL pointer to keep the
+ * original arbitration rules.
+ *
  * Rationale: we cannot disable VGA decode resources unconditionally, some
  * single GPU laptops seem to require ACPI or BIOS access to the VGA registers
  * to control things like backlights etc. Hopefully newer multi-GPU laptops do
@@ -973,7 +981,8 @@  EXPORT_SYMBOL(vga_set_legacy_decoding);
  * Returns: 0 on success, -1 on failure
  */
 int vga_client_register(struct pci_dev *pdev,
-		unsigned int (*set_decode)(struct pci_dev *pdev, bool decode))
+		unsigned int (*set_decode)(struct pci_dev *pdev, bool decode),
+		bool (*is_primary_gpu)(struct pci_dev *pdev))
 {
 	int ret = -ENODEV;
 	struct vga_device *vgadev;
@@ -985,6 +994,7 @@  int vga_client_register(struct pci_dev *pdev,
 		goto bail;
 
 	vgadev->set_decode = set_decode;
+	vgadev->is_primary_gpu = is_primary_gpu;
 	ret = 0;
 
 bail:
@@ -1490,6 +1500,30 @@  static void vga_arbiter_notify_clients(void)
 	spin_unlock_irqrestore(&vga_lock, flags);
 }
 
+static void vga_arbiter_do_arbitration(struct pci_dev *pdev)
+{
+	struct vga_device *vgadev;
+	unsigned long flags;
+
+	spin_lock_irqsave(&vga_lock, flags);
+	list_for_each_entry(vgadev, &vga_list, list) {
+		if (vgadev->pdev != pdev)
+			continue;
+
+		/* This device is already the boot device, do nothing */
+		if (pdev == vga_default_device())
+			break;
+
+		if (vgadev->is_primary_gpu) {
+			if (vgadev->is_primary_gpu(pdev)) {
+				vgaarb_info(&pdev->dev, "Overriding as primary GPU\n");
+				vga_set_default_device(pdev);
+			}
+		}
+	}
+	spin_unlock_irqrestore(&vga_lock, flags);
+}
+
 static int pci_notify(struct notifier_block *nb, unsigned long action,
 		      void *data)
 {
@@ -1509,13 +1543,24 @@  static int pci_notify(struct notifier_block *nb, unsigned long action,
 	 * cases of hotplugable vga cards.
 	 */
 
-	if (action == BUS_NOTIFY_ADD_DEVICE)
+	switch (action) {
+	case BUS_NOTIFY_ADD_DEVICE:
 		notify = vga_arbiter_add_pci_device(pdev);
-	else if (action == BUS_NOTIFY_DEL_DEVICE)
+		if (notify)
+			vga_arbiter_notify_clients();
+		break;
+	case BUS_NOTIFY_DEL_DEVICE:
 		notify = vga_arbiter_del_pci_device(pdev);
+		if (notify)
+			vga_arbiter_notify_clients();
+		break;
+	case BUS_NOTIFY_BOUND_DRIVER:
+		vga_arbiter_do_arbitration(pdev);
+		break;
+	default:
+		break;
+	}
 
-	if (notify)
-		vga_arbiter_notify_clients();
 	return 0;
 }
 
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 20d7b69ea6ff..531c4d8ef26e 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -2108,7 +2108,7 @@  static int vfio_pci_vga_init(struct vfio_pci_core_device *vdev)
 	if (ret)
 		return ret;
 
-	ret = vga_client_register(pdev, vfio_pci_set_decode);
+	ret = vga_client_register(pdev, vfio_pci_set_decode, NULL);
 	if (ret)
 		return ret;
 	vga_set_legacy_decoding(pdev, vfio_pci_set_decode(pdev, false));
diff --git a/include/linux/vgaarb.h b/include/linux/vgaarb.h
index 97129a1bbb7d..e4102be21f47 100644
--- a/include/linux/vgaarb.h
+++ b/include/linux/vgaarb.h
@@ -33,7 +33,8 @@  struct pci_dev *vga_default_device(void);
 void vga_set_default_device(struct pci_dev *pdev);
 int vga_remove_vgacon(struct pci_dev *pdev);
 int vga_client_register(struct pci_dev *pdev,
-		unsigned int (*set_decode)(struct pci_dev *pdev, bool state));
+		unsigned int (*set_decode)(struct pci_dev *pdev, bool state),
+		bool (*is_primary_gpu)(struct pci_dev *pdev));
 #else /* CONFIG_VGA_ARB */
 static inline void vga_set_legacy_decoding(struct pci_dev *pdev,
 		unsigned int decodes)
@@ -59,7 +60,8 @@  static inline int vga_remove_vgacon(struct pci_dev *pdev)
 	return 0;
 }
 static inline int vga_client_register(struct pci_dev *pdev,
-		unsigned int (*set_decode)(struct pci_dev *pdev, bool state))
+		unsigned int (*set_decode)(struct pci_dev *pdev, bool state),
+		bool (*is_primary_gpu)(struct pci_dev *pdev))
 {
 	return 0;
 }
@@ -97,7 +99,7 @@  static inline int vga_get_uninterruptible(struct pci_dev *pdev,
 
 static inline void vga_client_unregister(struct pci_dev *pdev)
 {
-	vga_client_register(pdev, NULL);
+	vga_client_register(pdev, NULL, NULL);
 }
 
 #endif /* LINUX_VGA_H */
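
For illustration only, here is a minimal userspace sketch of how a driver-supplied callback could interact with the arbitration step added above. Nothing in it is kernel code: the mock_* helpers, the fw_scanout_active field, and the "firmware left scanout running" policy are all assumptions made up for the example, and xx_vga_is_primary_gpu() is just the placeholder name from the commit message. The mock registration takes the vga_device directly instead of looking it up from the pci_dev, and locking is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace mocks; none of these are the real kernel definitions. */
struct pci_dev {
	bool fw_scanout_active;	/* hypothetical: firmware left scanout running */
};

struct vga_device {
	struct pci_dev *pdev;
	unsigned int (*set_decode)(struct pci_dev *pdev, bool decode);
	bool (*is_primary_gpu)(struct pci_dev *pdev);
};

static struct pci_dev *mock_default;	/* stands in for vga_default_device() */

/* Mock of the three-argument registration; stores both callbacks. */
static int mock_vga_client_register(struct vga_device *vgadev,
		unsigned int (*set_decode)(struct pci_dev *pdev, bool decode),
		bool (*is_primary_gpu)(struct pci_dev *pdev))
{
	assert(vgadev != NULL);
	vgadev->set_decode = set_decode;
	vgadev->is_primary_gpu = is_primary_gpu;
	return 0;
}

/* Mirrors the decision in vga_arbiter_do_arbitration(), minus locking. */
static void mock_do_arbitration(struct vga_device *vgadev)
{
	if (vgadev->pdev == mock_default)
		return;		/* already the boot device, do nothing */

	/* A NULL callback keeps the original selection untouched. */
	if (vgadev->is_primary_gpu && vgadev->is_primary_gpu(vgadev->pdev))
		mock_default = vgadev->pdev;
}

/* Hypothetical driver callback: claim primary if firmware scanout is live. */
static bool xx_vga_is_primary_gpu(struct pci_dev *pdev)
{
	return pdev->fw_scanout_active;
}

/*
 * Two devices: "a" starts as the default and registers with a NULL callback,
 * "b" supplies the callback. Returns true if "b" ends up as the default.
 */
bool demo_override(bool b_scanout_active)
{
	struct pci_dev a = { .fw_scanout_active = false };
	struct pci_dev b = { .fw_scanout_active = b_scanout_active };
	struct vga_device va = { .pdev = &a };
	struct vga_device vb = { .pdev = &b };
	bool b_is_default;

	mock_default = &a;
	mock_vga_client_register(&va, NULL, NULL);
	mock_vga_client_register(&vb, NULL, xx_vga_is_primary_gpu);

	/* Simulate the "driver bound" notification for each device. */
	mock_do_arbitration(&va);
	mock_do_arbitration(&vb);

	b_is_default = (mock_default == &b);
	mock_default = NULL;	/* do not keep a pointer to a stack object */
	return b_is_default;
}
```

In this sketch, demo_override(true) reports that device "b" took over as the default, while demo_override(false) leaves device "a" in place, which is the same outcome a NULL callback would give.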