mbox series

[v5,00/10] Support GEM object mappings from I/O memory

Message ID 20201020122046.31167-1-tzimmermann@suse.de
Headers show
Series Support GEM object mappings from I/O memory | expand

Message

Thomas Zimmermann Oct. 20, 2020, 12:20 p.m. UTC
DRM's fbdev console uses regular load and store operations to update
framebuffer memory. The bochs driver on sparc64 requires the use of
I/O-specific load and store operations. We have a workaround, but need
a long-term solution to the problem.

This patchset changes GEM's vmap/vunmap interfaces to forward pointers
of type struct dma_buf_map and updates the generic fbdev emulation to
use them correctly. This enables I/O-memory operations on all framebuffers
that require and support them.

Patches #1 to #4 prepare VRAM helpers and drivers.

Next is the update of the GEM vmap functions. Patch #5 adds vmap and vunmap
that is usable with TTM-based GEM drivers, and patch #6 updates GEM's
vmap/vunmap callback to forward instances of type struct dma_buf_map. While
the patch touches many files throughout the DRM modules, the applied changes
are mostly trivial interface fixes. Several TTM-based GEM drivers now use
the new vmap code. Patch #7 updates GEM's internal vmap/vunmap functions to
forward struct dma_buf_map.

With struct dma_buf_map propagated through the layers, patches #8 to #10
convert DRM clients and generic fbdev emulation to use it. Updating the
fbdev framebuffer will select the correct functions, either for system or
I/O memory.

v5:
	* rebase onto latest TTM changes (Chrsitian)
	* support TTM premapped memory correctly (Christian)
	* implement fb_read/fb_write internally (Sam, Daniel)
	* cleanups
v4:
	* provide TTM vmap/vunmap plus GEM helpers and convert drivers
	  over (Christian, Daniel)
	* remove several empty functions
	* more TODOs and documentation (Daniel)
v3:
	* recreate the whole patchset on top of struct dma_buf_map
v2:
	* RFC patchset


Thomas Zimmermann (10):
  drm/vram-helper: Remove invariant parameters from internal kmap
    function
  drm/cma-helper: Remove empty drm_gem_cma_prime_vunmap()
  drm/etnaviv: Remove empty etnaviv_gem_prime_vunmap()
  drm/exynos: Remove empty exynos_drm_gem_prime_{vmap,vunmap}()
  drm/ttm: Add vmap/vunmap to TTM and TTM GEM helpers
  drm/gem: Use struct dma_buf_map in GEM vmap ops and convert GEM
    backends
  drm/gem: Update internal GEM vmap/vunmap interfaces to use struct
    dma_buf_map
  drm/gem: Store client buffer mappings as struct dma_buf_map
  dma-buf-map: Add memcpy and pointer-increment interfaces
  drm/fb_helper: Support framebuffers in I/O memory

 Documentation/gpu/todo.rst                  |  37 ++-
 drivers/gpu/drm/Kconfig                     |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c |  36 ---
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.h |   2 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c     |   5 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h  |   1 -
 drivers/gpu/drm/ast/ast_cursor.c            |  27 +--
 drivers/gpu/drm/ast/ast_drv.h               |   7 +-
 drivers/gpu/drm/bochs/bochs_kms.c           |   1 -
 drivers/gpu/drm/drm_client.c                |  38 +--
 drivers/gpu/drm/drm_fb_helper.c             | 248 ++++++++++++++++++--
 drivers/gpu/drm/drm_gem.c                   |  29 ++-
 drivers/gpu/drm/drm_gem_cma_helper.c        |  27 +--
 drivers/gpu/drm/drm_gem_shmem_helper.c      |  48 ++--
 drivers/gpu/drm/drm_gem_ttm_helper.c        |  38 +++
 drivers/gpu/drm/drm_gem_vram_helper.c       | 117 +++++----
 drivers/gpu/drm/drm_internal.h              |   5 +-
 drivers/gpu/drm/drm_prime.c                 |  14 +-
 drivers/gpu/drm/etnaviv/etnaviv_drv.h       |   3 +-
 drivers/gpu/drm/etnaviv/etnaviv_gem.c       |   1 -
 drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c |  12 +-
 drivers/gpu/drm/exynos/exynos_drm_gem.c     |  12 -
 drivers/gpu/drm/exynos/exynos_drm_gem.h     |   2 -
 drivers/gpu/drm/lima/lima_gem.c             |   6 +-
 drivers/gpu/drm/lima/lima_sched.c           |  11 +-
 drivers/gpu/drm/mgag200/mgag200_mode.c      |  10 +-
 drivers/gpu/drm/nouveau/Kconfig             |   1 +
 drivers/gpu/drm/nouveau/nouveau_bo.h        |   2 -
 drivers/gpu/drm/nouveau/nouveau_gem.c       |   6 +-
 drivers/gpu/drm/nouveau/nouveau_gem.h       |   2 -
 drivers/gpu/drm/nouveau/nouveau_prime.c     |  20 --
 drivers/gpu/drm/panfrost/panfrost_perfcnt.c |  14 +-
 drivers/gpu/drm/qxl/qxl_display.c           |  11 +-
 drivers/gpu/drm/qxl/qxl_draw.c              |  14 +-
 drivers/gpu/drm/qxl/qxl_drv.h               |  11 +-
 drivers/gpu/drm/qxl/qxl_object.c            |  31 ++-
 drivers/gpu/drm/qxl/qxl_object.h            |   2 +-
 drivers/gpu/drm/qxl/qxl_prime.c             |  12 +-
 drivers/gpu/drm/radeon/radeon.h             |   1 -
 drivers/gpu/drm/radeon/radeon_gem.c         |   7 +-
 drivers/gpu/drm/radeon/radeon_prime.c       |  20 --
 drivers/gpu/drm/rockchip/rockchip_drm_gem.c |  22 +-
 drivers/gpu/drm/rockchip/rockchip_drm_gem.h |   4 +-
 drivers/gpu/drm/tiny/cirrus.c               |  10 +-
 drivers/gpu/drm/tiny/gm12u320.c             |  10 +-
 drivers/gpu/drm/ttm/ttm_bo_util.c           |  72 ++++++
 drivers/gpu/drm/udl/udl_modeset.c           |   8 +-
 drivers/gpu/drm/vboxvideo/vbox_mode.c       |  11 +-
 drivers/gpu/drm/vc4/vc4_bo.c                |   7 +-
 drivers/gpu/drm/vc4/vc4_drv.h               |   2 +-
 drivers/gpu/drm/vgem/vgem_drv.c             |  16 +-
 drivers/gpu/drm/vkms/vkms_plane.c           |  15 +-
 drivers/gpu/drm/vkms/vkms_writeback.c       |  22 +-
 drivers/gpu/drm/xen/xen_drm_front_gem.c     |  18 +-
 drivers/gpu/drm/xen/xen_drm_front_gem.h     |   6 +-
 include/drm/drm_client.h                    |   7 +-
 include/drm/drm_gem.h                       |   5 +-
 include/drm/drm_gem_cma_helper.h            |   3 +-
 include/drm/drm_gem_shmem_helper.h          |   4 +-
 include/drm/drm_gem_ttm_helper.h            |   6 +
 include/drm/drm_gem_vram_helper.h           |  14 +-
 include/drm/drm_mode_config.h               |  12 -
 include/drm/ttm/ttm_bo_api.h                |  28 +++
 include/linux/dma-buf-map.h                 |  93 +++++++-
 64 files changed, 852 insertions(+), 436 deletions(-)

Comments

Thomas Zimmermann Oct. 22, 2020, 8:37 a.m. UTC | #1
Hi

On 22.10.20 10:05, Daniel Vetter wrote:
> On Tue, Oct 20, 2020 at 02:20:46PM +0200, Thomas Zimmermann wrote:
>> At least sparc64 requires I/O-specific access to framebuffers. This
>> patch updates the fbdev console accordingly.
>>
>> For drivers with direct access to the framebuffer memory, the callback
>> functions in struct fb_ops test for the type of memory and call the rsp
>> fb_sys_ of fb_cfb_ functions. Read and write operations are implemented
>> internally by DRM's fbdev helper.
>>
>> For drivers that employ a shadow buffer, fbdev's blit function retrieves
>> the framebuffer address as struct dma_buf_map, and uses dma_buf_map
>> interfaces to access the buffer.
>>
>> The bochs driver on sparc64 uses a workaround to flag the framebuffer as
>> I/O memory and avoid a HW exception. With the introduction of struct
>> dma_buf_map, this is not required any longer. The patch removes the rsp
>> code from both, bochs and fbdev.
>>
>> v5:
>> 	* implement fb_read/fb_write internally (Daniel, Sam)
>> v4:
>> 	* move dma_buf_map changes into separate patch (Daniel)
>> 	* TODO list: comment on fbdev updates (Daniel)
>>
>> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
>> Tested-by: Sam Ravnborg <sam@ravnborg.org>
>> ---
>>  Documentation/gpu/todo.rst        |  19 ++-
>>  drivers/gpu/drm/bochs/bochs_kms.c |   1 -
>>  drivers/gpu/drm/drm_fb_helper.c   | 227 ++++++++++++++++++++++++++++--
>>  include/drm/drm_mode_config.h     |  12 --
>>  4 files changed, 230 insertions(+), 29 deletions(-)
>>
>> diff --git a/Documentation/gpu/todo.rst b/Documentation/gpu/todo.rst
>> index 7e6fc3c04add..638b7f704339 100644
>> --- a/Documentation/gpu/todo.rst
>> +++ b/Documentation/gpu/todo.rst
>> @@ -197,13 +197,28 @@ Convert drivers to use drm_fbdev_generic_setup()
>>  ------------------------------------------------
>>  
>>  Most drivers can use drm_fbdev_generic_setup(). Driver have to implement
>> -atomic modesetting and GEM vmap support. Current generic fbdev emulation
>> -expects the framebuffer in system memory (or system-like memory).
>> +atomic modesetting and GEM vmap support. Historically, generic fbdev emulation
>> +expected the framebuffer in system memory or system-like memory. By employing
>> +struct dma_buf_map, drivers with frambuffers in I/O memory can be supported
>> +as well.
>>  
>>  Contact: Maintainer of the driver you plan to convert
>>  
>>  Level: Intermediate
>>  
>> +Reimplement functions in drm_fbdev_fb_ops without fbdev
>> +-------------------------------------------------------
>> +
>> +A number of callback functions in drm_fbdev_fb_ops could benefit from
>> +being rewritten without dependencies on the fbdev module. Some of the
>> +helpers could further benefit from using struct dma_buf_map instead of
>> +raw pointers.
>> +
>> +Contact: Thomas Zimmermann <tzimmermann@suse.de>, Daniel Vetter
>> +
>> +Level: Advanced
>> +
>> +
>>  drm_framebuffer_funcs and drm_mode_config_funcs.fb_create cleanup
>>  -----------------------------------------------------------------
>>  
>> diff --git a/drivers/gpu/drm/bochs/bochs_kms.c b/drivers/gpu/drm/bochs/bochs_kms.c
>> index 13d0d04c4457..853081d186d5 100644
>> --- a/drivers/gpu/drm/bochs/bochs_kms.c
>> +++ b/drivers/gpu/drm/bochs/bochs_kms.c
>> @@ -151,7 +151,6 @@ int bochs_kms_init(struct bochs_device *bochs)
>>  	bochs->dev->mode_config.preferred_depth = 24;
>>  	bochs->dev->mode_config.prefer_shadow = 0;
>>  	bochs->dev->mode_config.prefer_shadow_fbdev = 1;
>> -	bochs->dev->mode_config.fbdev_use_iomem = true;
>>  	bochs->dev->mode_config.quirk_addfb_prefer_host_byte_order = true;
>>  
>>  	bochs->dev->mode_config.funcs = &bochs_mode_funcs;
>> diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
>> index 6212cd7cde1d..1d3180841778 100644
>> --- a/drivers/gpu/drm/drm_fb_helper.c
>> +++ b/drivers/gpu/drm/drm_fb_helper.c
>> @@ -372,24 +372,22 @@ static void drm_fb_helper_resume_worker(struct work_struct *work)
>>  }
>>  
>>  static void drm_fb_helper_dirty_blit_real(struct drm_fb_helper *fb_helper,
>> -					  struct drm_clip_rect *clip)
>> +					  struct drm_clip_rect *clip,
>> +					  struct dma_buf_map *dst)
>>  {
>>  	struct drm_framebuffer *fb = fb_helper->fb;
>>  	unsigned int cpp = fb->format->cpp[0];
>>  	size_t offset = clip->y1 * fb->pitches[0] + clip->x1 * cpp;
>>  	void *src = fb_helper->fbdev->screen_buffer + offset;
>> -	void *dst = fb_helper->buffer->map.vaddr + offset;
>>  	size_t len = (clip->x2 - clip->x1) * cpp;
>>  	unsigned int y;
>>  
>> -	for (y = clip->y1; y < clip->y2; y++) {
>> -		if (!fb_helper->dev->mode_config.fbdev_use_iomem)
>> -			memcpy(dst, src, len);
>> -		else
>> -			memcpy_toio((void __iomem *)dst, src, len);
>> +	dma_buf_map_incr(dst, offset); /* go to first pixel within clip rect */
>>  
>> +	for (y = clip->y1; y < clip->y2; y++) {
>> +		dma_buf_map_memcpy_to(dst, src, len);
>> +		dma_buf_map_incr(dst, fb->pitches[0]);
>>  		src += fb->pitches[0];
>> -		dst += fb->pitches[0];
>>  	}
>>  }
>>  
>> @@ -417,8 +415,9 @@ static void drm_fb_helper_dirty_work(struct work_struct *work)
>>  			ret = drm_client_buffer_vmap(helper->buffer, &map);
>>  			if (ret)
>>  				return;
>> -			drm_fb_helper_dirty_blit_real(helper, &clip_copy);
>> +			drm_fb_helper_dirty_blit_real(helper, &clip_copy, &map);
>>  		}
>> +
>>  		if (helper->fb->funcs->dirty)
>>  			helper->fb->funcs->dirty(helper->fb, NULL, 0, 0,
>>  						 &clip_copy, 1);
>> @@ -2027,6 +2026,206 @@ static int drm_fbdev_fb_mmap(struct fb_info *info, struct vm_area_struct *vma)
>>  		return -ENODEV;
>>  }
>>  
>> +static bool drm_fbdev_use_iomem(struct fb_info *info)
>> +{
>> +	struct drm_fb_helper *fb_helper = info->par;
>> +	struct drm_client_buffer *buffer = fb_helper->buffer;
>> +
>> +	return !drm_fbdev_use_shadow_fb(fb_helper) && buffer->map.is_iomem;
>> +}
>> +
>> +static ssize_t fb_read_screen_base(struct fb_info *info, char __user *buf, size_t count, 
>> +				   loff_t pos)
>> +{
>> +	const char __iomem *src = info->screen_base + pos;
> 
> Maybe a bit much a bikeshed, but I'd write this in terms of drm objects,
> like the dirty_blit function, using the dma_buf_map (instead of the
> fb_info parameter). And then instead of
> screen_base and screen_buffer suffixes give them _mem and _iomem suffixes.

Screen_buffer can be a shadow buffer. Until the blit worker (see
drm_fb_helper_dirty_work() ) completes, it might be more up to date than
the real buffer that's stored in the client.

The orignal fbdev code supported an fb_sync callback to synchronize with
outstanding screen updates (e.g., HW blit ops), but fb_sync is just
overhead here. Copying from screen_buffer or screen_base always returns
the most up-to-date image.

> 
> Same for write below. Or I'm not quite understanding why we do it like
> this here - I don't think this code will be used outside of the generic
> fbdev code, so we can always assume that drm_fb_helper->buffer is set up.

It's similar as in the read case. If we write to the client's buffer, an
outstanding blit worker could write the now-outdated shadow buffer over
the user's newly written framebuffer data.

Thinking about it, we might want to schedule the blit worker at the end
of each fb_write, so that the data makes it into the HW buffer in time.

> 
> The other thing I think we need is some minimal testcases to make sure.
> The fbtest tool used way back seems to have disappeared, I couldn't find
> a copy of the source anywhere anymore.

As discussed on IRC, I'll add some testcase to the igt test. I'll share
the link here when done.

Best regards
Thomas

> 
> With all that: Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> 
> Cheers, Daniel
> 
>> +	size_t alloc_size = min(count, PAGE_SIZE);
>> +	ssize_t ret = 0;
>> +	char *tmp;
>> +
>> +	tmp = kmalloc(alloc_size, GFP_KERNEL);
>> +	if (!tmp)
>> +		return -ENOMEM;
>> +
>> +	while (count) {
>> +		size_t c = min(count, alloc_size);
>> +
>> +		memcpy_fromio(tmp, src, c);
>> +		if (copy_to_user(buf, tmp, c)) {
>> +			ret = -EFAULT;
>> +			break;
>> +		}
>> +
>> +		src += c;
>> +		buf += c;
>> +		ret += c;
>> +		count -= c;
>> +	}
>> +
>> +	kfree(tmp);
>> +
>> +	return ret;
>> +}
>> +
>> +static ssize_t fb_read_screen_buffer(struct fb_info *info, char __user *buf, size_t count,
>> +				     loff_t pos)
>> +{
>> +	const char *src = info->screen_buffer + pos;
>> +
>> +	if (copy_to_user(buf, src, count))
>> +		return -EFAULT;
>> +
>> +	return count;
>> +}
>> +
>> +static ssize_t drm_fbdev_fb_read(struct fb_info *info, char __user *buf,
>> +				 size_t count, loff_t *ppos)
>> +{
>> +	loff_t pos = *ppos;
>> +	size_t total_size;
>> +	ssize_t ret;
>> +
>> +	if (info->state != FBINFO_STATE_RUNNING)
>> +		return -EPERM;
>> +
>> +	if (info->screen_size)
>> +		total_size = info->screen_size;
>> +	else
>> +		total_size = info->fix.smem_len;
>> +
>> +	if (pos >= total_size)
>> +		return 0;
>> +	if (count >= total_size)
>> +		count = total_size;
>> +	if (total_size - count < pos)
>> +		count = total_size - pos;
>> +
>> +	if (drm_fbdev_use_iomem(info))
>> +		ret = fb_read_screen_base(info, buf, count, pos);
>> +	else
>> +		ret = fb_read_screen_buffer(info, buf, count, pos);
>> +
>> +	if (ret > 0)
>> +		*ppos = ret;
>> +
>> +	return ret;
>> +}
>> +
>> +static ssize_t fb_write_screen_base(struct fb_info *info, const char __user *buf, size_t count,
>> +				    loff_t pos)
>> +{
>> +	char __iomem *dst = info->screen_base + pos;
>> +	size_t alloc_size = min(count, PAGE_SIZE);
>> +	ssize_t ret = 0;
>> +	u8 *tmp;
>> +
>> +	tmp = kmalloc(alloc_size, GFP_KERNEL);
>> +	if (!tmp)
>> +		return -ENOMEM;
>> +
>> +	while (count) {
>> +		size_t c = min(count, alloc_size);
>> +
>> +		if (copy_from_user(tmp, buf, c)) {
>> +			ret = -EFAULT;
>> +			break;
>> +		}
>> +		memcpy_toio(dst, tmp, c);
>> +
>> +		dst += c;
>> +		buf += c;
>> +		ret += c;
>> +		count -= c;
>> +	}
>> +
>> +	kfree(tmp);
>> +
>> +	return ret;
>> +}
>> +
>> +static ssize_t fb_write_screen_buffer(struct fb_info *info, const char __user *buf, size_t count,
>> +				      loff_t pos)
>> +{
>> +	char *dst = info->screen_buffer + pos;
>> +
>> +	if (copy_from_user(dst, buf, count))
>> +		return -EFAULT;
>> +
>> +	return count;
>> +}
>> +
>> +static ssize_t drm_fbdev_fb_write(struct fb_info *info, const char __user *buf,
>> +				  size_t count, loff_t *ppos)
>> +{
>> +	loff_t pos = *ppos;
>> +	size_t total_size;
>> +	ssize_t ret;
>> +	int err;
>> +
>> +	if (info->state != FBINFO_STATE_RUNNING)
>> +		return -EPERM;
>> +
>> +	if (info->screen_size)
>> +		total_size = info->screen_size;
>> +	else
>> +		total_size = info->fix.smem_len;
>> +
>> +	if (pos > total_size)
>> +		return -EFBIG;
>> +	if (count > total_size) {
>> +		err = -EFBIG;
>> +		count = total_size;
>> +	}
>> +	if (total_size - count < pos) {
>> +		if (!err)
>> +			err = -ENOSPC;
>> +		count = total_size - pos;
>> +	}
>> +
>> +	/*
>> +	 * Copy to framebuffer even if we already logged an error. Emulates
>> +	 * the behavior of the original fbdev implementation.
>> +	 */
>> +	if (drm_fbdev_use_iomem(info))
>> +		ret = fb_write_screen_base(info, buf, count, pos);
>> +	else
>> +		ret = fb_write_screen_buffer(info, buf, count, pos);
>> +
>> +	if (ret > 0)
>> +		*ppos = ret;
>> +
>> +	if (err)
>> +		return err;
>> +
>> +	return ret;
>> +}
>> +
>> +static void drm_fbdev_fb_fillrect(struct fb_info *info,
>> +				  const struct fb_fillrect *rect)
>> +{
>> +	if (drm_fbdev_use_iomem(info))
>> +		drm_fb_helper_cfb_fillrect(info, rect);
>> +	else
>> +		drm_fb_helper_sys_fillrect(info, rect);
>> +}
>> +
>> +static void drm_fbdev_fb_copyarea(struct fb_info *info,
>> +				  const struct fb_copyarea *area)
>> +{
>> +	if (drm_fbdev_use_iomem(info))
>> +		drm_fb_helper_cfb_copyarea(info, area);
>> +	else
>> +		drm_fb_helper_sys_copyarea(info, area);
>> +}
>> +
>> +static void drm_fbdev_fb_imageblit(struct fb_info *info,
>> +				   const struct fb_image *image)
>> +{
>> +	if (drm_fbdev_use_iomem(info))
>> +		drm_fb_helper_cfb_imageblit(info, image);
>> +	else
>> +		drm_fb_helper_sys_imageblit(info, image);
>> +}
>> +
>>  static const struct fb_ops drm_fbdev_fb_ops = {
>>  	.owner		= THIS_MODULE,
>>  	DRM_FB_HELPER_DEFAULT_OPS,
>> @@ -2034,11 +2233,11 @@ static const struct fb_ops drm_fbdev_fb_ops = {
>>  	.fb_release	= drm_fbdev_fb_release,
>>  	.fb_destroy	= drm_fbdev_fb_destroy,
>>  	.fb_mmap	= drm_fbdev_fb_mmap,
>> -	.fb_read	= drm_fb_helper_sys_read,
>> -	.fb_write	= drm_fb_helper_sys_write,
>> -	.fb_fillrect	= drm_fb_helper_sys_fillrect,
>> -	.fb_copyarea	= drm_fb_helper_sys_copyarea,
>> -	.fb_imageblit	= drm_fb_helper_sys_imageblit,
>> +	.fb_read	= drm_fbdev_fb_read,
>> +	.fb_write	= drm_fbdev_fb_write,
>> +	.fb_fillrect	= drm_fbdev_fb_fillrect,
>> +	.fb_copyarea	= drm_fbdev_fb_copyarea,
>> +	.fb_imageblit	= drm_fbdev_fb_imageblit,
>>  };
>>  
>>  static struct fb_deferred_io drm_fbdev_defio = {
>> diff --git a/include/drm/drm_mode_config.h b/include/drm/drm_mode_config.h
>> index 5ffbb4ed5b35..ab424ddd7665 100644
>> --- a/include/drm/drm_mode_config.h
>> +++ b/include/drm/drm_mode_config.h
>> @@ -877,18 +877,6 @@ struct drm_mode_config {
>>  	 */
>>  	bool prefer_shadow_fbdev;
>>  
>> -	/**
>> -	 * @fbdev_use_iomem:
>> -	 *
>> -	 * Set to true if framebuffer reside in iomem.
>> -	 * When set to true memcpy_toio() is used when copying the framebuffer in
>> -	 * drm_fb_helper.drm_fb_helper_dirty_blit_real().
>> -	 *
>> -	 * FIXME: This should be replaced with a per-mapping is_iomem
>> -	 * flag (like ttm does), and then used everywhere in fbdev code.
>> -	 */
>> -	bool fbdev_use_iomem;
>> -
>>  	/**
>>  	 * @quirk_addfb_prefer_xbgr_30bpp:
>>  	 *
>> -- 
>> 2.28.0
>>
>
Daniel Vetter Oct. 22, 2020, 8:51 a.m. UTC | #2
On Thu, Oct 22, 2020 at 10:37:56AM +0200, Thomas Zimmermann wrote:
> Hi
> 
> On 22.10.20 10:05, Daniel Vetter wrote:
> > On Tue, Oct 20, 2020 at 02:20:46PM +0200, Thomas Zimmermann wrote:
> >> At least sparc64 requires I/O-specific access to framebuffers. This
> >> patch updates the fbdev console accordingly.
> >>
> >> For drivers with direct access to the framebuffer memory, the callback
> >> functions in struct fb_ops test for the type of memory and call the rsp
> >> fb_sys_ of fb_cfb_ functions. Read and write operations are implemented
> >> internally by DRM's fbdev helper.
> >>
> >> For drivers that employ a shadow buffer, fbdev's blit function retrieves
> >> the framebuffer address as struct dma_buf_map, and uses dma_buf_map
> >> interfaces to access the buffer.
> >>
> >> The bochs driver on sparc64 uses a workaround to flag the framebuffer as
> >> I/O memory and avoid a HW exception. With the introduction of struct
> >> dma_buf_map, this is not required any longer. The patch removes the rsp
> >> code from both, bochs and fbdev.
> >>
> >> v5:
> >> 	* implement fb_read/fb_write internally (Daniel, Sam)
> >> v4:
> >> 	* move dma_buf_map changes into separate patch (Daniel)
> >> 	* TODO list: comment on fbdev updates (Daniel)
> >>
> >> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
> >> Tested-by: Sam Ravnborg <sam@ravnborg.org>
> >> ---
> >>  Documentation/gpu/todo.rst        |  19 ++-
> >>  drivers/gpu/drm/bochs/bochs_kms.c |   1 -
> >>  drivers/gpu/drm/drm_fb_helper.c   | 227 ++++++++++++++++++++++++++++--
> >>  include/drm/drm_mode_config.h     |  12 --
> >>  4 files changed, 230 insertions(+), 29 deletions(-)
> >>
> >> diff --git a/Documentation/gpu/todo.rst b/Documentation/gpu/todo.rst
> >> index 7e6fc3c04add..638b7f704339 100644
> >> --- a/Documentation/gpu/todo.rst
> >> +++ b/Documentation/gpu/todo.rst
> >> @@ -197,13 +197,28 @@ Convert drivers to use drm_fbdev_generic_setup()
> >>  ------------------------------------------------
> >>  
> >>  Most drivers can use drm_fbdev_generic_setup(). Driver have to implement
> >> -atomic modesetting and GEM vmap support. Current generic fbdev emulation
> >> -expects the framebuffer in system memory (or system-like memory).
> >> +atomic modesetting and GEM vmap support. Historically, generic fbdev emulation
> >> +expected the framebuffer in system memory or system-like memory. By employing
> >> +struct dma_buf_map, drivers with frambuffers in I/O memory can be supported
> >> +as well.
> >>  
> >>  Contact: Maintainer of the driver you plan to convert
> >>  
> >>  Level: Intermediate
> >>  
> >> +Reimplement functions in drm_fbdev_fb_ops without fbdev
> >> +-------------------------------------------------------
> >> +
> >> +A number of callback functions in drm_fbdev_fb_ops could benefit from
> >> +being rewritten without dependencies on the fbdev module. Some of the
> >> +helpers could further benefit from using struct dma_buf_map instead of
> >> +raw pointers.
> >> +
> >> +Contact: Thomas Zimmermann <tzimmermann@suse.de>, Daniel Vetter
> >> +
> >> +Level: Advanced
> >> +
> >> +
> >>  drm_framebuffer_funcs and drm_mode_config_funcs.fb_create cleanup
> >>  -----------------------------------------------------------------
> >>  
> >> diff --git a/drivers/gpu/drm/bochs/bochs_kms.c b/drivers/gpu/drm/bochs/bochs_kms.c
> >> index 13d0d04c4457..853081d186d5 100644
> >> --- a/drivers/gpu/drm/bochs/bochs_kms.c
> >> +++ b/drivers/gpu/drm/bochs/bochs_kms.c
> >> @@ -151,7 +151,6 @@ int bochs_kms_init(struct bochs_device *bochs)
> >>  	bochs->dev->mode_config.preferred_depth = 24;
> >>  	bochs->dev->mode_config.prefer_shadow = 0;
> >>  	bochs->dev->mode_config.prefer_shadow_fbdev = 1;
> >> -	bochs->dev->mode_config.fbdev_use_iomem = true;
> >>  	bochs->dev->mode_config.quirk_addfb_prefer_host_byte_order = true;
> >>  
> >>  	bochs->dev->mode_config.funcs = &bochs_mode_funcs;
> >> diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
> >> index 6212cd7cde1d..1d3180841778 100644
> >> --- a/drivers/gpu/drm/drm_fb_helper.c
> >> +++ b/drivers/gpu/drm/drm_fb_helper.c
> >> @@ -372,24 +372,22 @@ static void drm_fb_helper_resume_worker(struct work_struct *work)
> >>  }
> >>  
> >>  static void drm_fb_helper_dirty_blit_real(struct drm_fb_helper *fb_helper,
> >> -					  struct drm_clip_rect *clip)
> >> +					  struct drm_clip_rect *clip,
> >> +					  struct dma_buf_map *dst)
> >>  {
> >>  	struct drm_framebuffer *fb = fb_helper->fb;
> >>  	unsigned int cpp = fb->format->cpp[0];
> >>  	size_t offset = clip->y1 * fb->pitches[0] + clip->x1 * cpp;
> >>  	void *src = fb_helper->fbdev->screen_buffer + offset;
> >> -	void *dst = fb_helper->buffer->map.vaddr + offset;
> >>  	size_t len = (clip->x2 - clip->x1) * cpp;
> >>  	unsigned int y;
> >>  
> >> -	for (y = clip->y1; y < clip->y2; y++) {
> >> -		if (!fb_helper->dev->mode_config.fbdev_use_iomem)
> >> -			memcpy(dst, src, len);
> >> -		else
> >> -			memcpy_toio((void __iomem *)dst, src, len);
> >> +	dma_buf_map_incr(dst, offset); /* go to first pixel within clip rect */
> >>  
> >> +	for (y = clip->y1; y < clip->y2; y++) {
> >> +		dma_buf_map_memcpy_to(dst, src, len);
> >> +		dma_buf_map_incr(dst, fb->pitches[0]);
> >>  		src += fb->pitches[0];
> >> -		dst += fb->pitches[0];
> >>  	}
> >>  }
> >>  
> >> @@ -417,8 +415,9 @@ static void drm_fb_helper_dirty_work(struct work_struct *work)
> >>  			ret = drm_client_buffer_vmap(helper->buffer, &map);
> >>  			if (ret)
> >>  				return;
> >> -			drm_fb_helper_dirty_blit_real(helper, &clip_copy);
> >> +			drm_fb_helper_dirty_blit_real(helper, &clip_copy, &map);
> >>  		}
> >> +
> >>  		if (helper->fb->funcs->dirty)
> >>  			helper->fb->funcs->dirty(helper->fb, NULL, 0, 0,
> >>  						 &clip_copy, 1);
> >> @@ -2027,6 +2026,206 @@ static int drm_fbdev_fb_mmap(struct fb_info *info, struct vm_area_struct *vma)
> >>  		return -ENODEV;
> >>  }
> >>  
> >> +static bool drm_fbdev_use_iomem(struct fb_info *info)
> >> +{
> >> +	struct drm_fb_helper *fb_helper = info->par;
> >> +	struct drm_client_buffer *buffer = fb_helper->buffer;
> >> +
> >> +	return !drm_fbdev_use_shadow_fb(fb_helper) && buffer->map.is_iomem;
> >> +}
> >> +
> >> +static ssize_t fb_read_screen_base(struct fb_info *info, char __user *buf, size_t count, 
> >> +				   loff_t pos)
> >> +{
> >> +	const char __iomem *src = info->screen_base + pos;
> > 
> > Maybe a bit much a bikeshed, but I'd write this in terms of drm objects,
> > like the dirty_blit function, using the dma_buf_map (instead of the
> > fb_info parameter). And then instead of
> > screen_base and screen_buffer suffixes give them _mem and _iomem suffixes.
> 
> Screen_buffer can be a shadow buffer. Until the blit worker (see
> drm_fb_helper_dirty_work() ) completes, it might be more up to date than
> the real buffer that's stored in the client.
> 
> The orignal fbdev code supported an fb_sync callback to synchronize with
> outstanding screen updates (e.g., HW blit ops), but fb_sync is just
> overhead here. Copying from screen_buffer or screen_base always returns
> the most up-to-date image.
> 
> > 
> > Same for write below. Or I'm not quite understanding why we do it like
> > this here - I don't think this code will be used outside of the generic
> > fbdev code, so we can always assume that drm_fb_helper->buffer is set up.
> 
> It's similar as in the read case. If we write to the client's buffer, an
> outstanding blit worker could write the now-outdated shadow buffer over
> the user's newly written framebuffer data.
> 
> Thinking about it, we might want to schedule the blit worker at the end
> of each fb_write, so that the data makes it into the HW buffer in time.

Hm ok, makes some sense. I think there's some potential for cleanup if we
add a dma_buf_map drm_fb_helper->uapi_map which points at the right thing
always. That could then also the drm_fbdev_use_iomem() helper and make
this all look really neat.

But maybe a follow up clean up patch, if you're bored. As-is:

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

While looking at this I also noticed a potential small issue in an earlier
patch.

> > The other thing I think we need is some minimal testcases to make sure.
> > The fbtest tool used way back seems to have disappeared, I couldn't find
> > a copy of the source anywhere anymore.
> 
> As discussed on IRC, I'll add some testcase to the igt test. I'll share
> the link here when done.
> 
> Best regards
> Thomas
> 
> > 
> > With all that: Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > 
> > Cheers, Daniel
> > 
> >> +	size_t alloc_size = min(count, PAGE_SIZE);
> >> +	ssize_t ret = 0;
> >> +	char *tmp;
> >> +
> >> +	tmp = kmalloc(alloc_size, GFP_KERNEL);
> >> +	if (!tmp)
> >> +		return -ENOMEM;
> >> +
> >> +	while (count) {
> >> +		size_t c = min(count, alloc_size);
> >> +
> >> +		memcpy_fromio(tmp, src, c);
> >> +		if (copy_to_user(buf, tmp, c)) {
> >> +			ret = -EFAULT;
> >> +			break;
> >> +		}
> >> +
> >> +		src += c;
> >> +		buf += c;
> >> +		ret += c;
> >> +		count -= c;
> >> +	}
> >> +
> >> +	kfree(tmp);
> >> +
> >> +	return ret;
> >> +}
> >> +
> >> +static ssize_t fb_read_screen_buffer(struct fb_info *info, char __user *buf, size_t count,
> >> +				     loff_t pos)
> >> +{
> >> +	const char *src = info->screen_buffer + pos;
> >> +
> >> +	if (copy_to_user(buf, src, count))
> >> +		return -EFAULT;
> >> +
> >> +	return count;
> >> +}
> >> +
> >> +static ssize_t drm_fbdev_fb_read(struct fb_info *info, char __user *buf,
> >> +				 size_t count, loff_t *ppos)
> >> +{
> >> +	loff_t pos = *ppos;
> >> +	size_t total_size;
> >> +	ssize_t ret;
> >> +
> >> +	if (info->state != FBINFO_STATE_RUNNING)
> >> +		return -EPERM;
> >> +
> >> +	if (info->screen_size)
> >> +		total_size = info->screen_size;
> >> +	else
> >> +		total_size = info->fix.smem_len;
> >> +
> >> +	if (pos >= total_size)
> >> +		return 0;
> >> +	if (count >= total_size)
> >> +		count = total_size;
> >> +	if (total_size - count < pos)
> >> +		count = total_size - pos;
> >> +
> >> +	if (drm_fbdev_use_iomem(info))
> >> +		ret = fb_read_screen_base(info, buf, count, pos);
> >> +	else
> >> +		ret = fb_read_screen_buffer(info, buf, count, pos);
> >> +
> >> +	if (ret > 0)
> >> +		*ppos = ret;
> >> +
> >> +	return ret;
> >> +}
> >> +
> >> +static ssize_t fb_write_screen_base(struct fb_info *info, const char __user *buf, size_t count,
> >> +				    loff_t pos)
> >> +{
> >> +	char __iomem *dst = info->screen_base + pos;
> >> +	size_t alloc_size = min(count, PAGE_SIZE);
> >> +	ssize_t ret = 0;
> >> +	u8 *tmp;
> >> +
> >> +	tmp = kmalloc(alloc_size, GFP_KERNEL);
> >> +	if (!tmp)
> >> +		return -ENOMEM;
> >> +
> >> +	while (count) {
> >> +		size_t c = min(count, alloc_size);
> >> +
> >> +		if (copy_from_user(tmp, buf, c)) {
> >> +			ret = -EFAULT;
> >> +			break;
> >> +		}
> >> +		memcpy_toio(dst, tmp, c);
> >> +
> >> +		dst += c;
> >> +		buf += c;
> >> +		ret += c;
> >> +		count -= c;
> >> +	}
> >> +
> >> +	kfree(tmp);
> >> +
> >> +	return ret;
> >> +}
> >> +
> >> +static ssize_t fb_write_screen_buffer(struct fb_info *info, const char __user *buf, size_t count,
> >> +				      loff_t pos)
> >> +{
> >> +	char *dst = info->screen_buffer + pos;
> >> +
> >> +	if (copy_from_user(dst, buf, count))
> >> +		return -EFAULT;
> >> +
> >> +	return count;
> >> +}
> >> +
> >> +static ssize_t drm_fbdev_fb_write(struct fb_info *info, const char __user *buf,
> >> +				  size_t count, loff_t *ppos)
> >> +{
> >> +	loff_t pos = *ppos;
> >> +	size_t total_size;
> >> +	ssize_t ret;
> >> +	int err;
> >> +
> >> +	if (info->state != FBINFO_STATE_RUNNING)
> >> +		return -EPERM;
> >> +
> >> +	if (info->screen_size)
> >> +		total_size = info->screen_size;
> >> +	else
> >> +		total_size = info->fix.smem_len;
> >> +
> >> +	if (pos > total_size)
> >> +		return -EFBIG;
> >> +	if (count > total_size) {
> >> +		err = -EFBIG;
> >> +		count = total_size;
> >> +	}
> >> +	if (total_size - count < pos) {
> >> +		if (!err)
> >> +			err = -ENOSPC;
> >> +		count = total_size - pos;
> >> +	}
> >> +
> >> +	/*
> >> +	 * Copy to framebuffer even if we already logged an error. Emulates
> >> +	 * the behavior of the original fbdev implementation.
> >> +	 */
> >> +	if (drm_fbdev_use_iomem(info))
> >> +		ret = fb_write_screen_base(info, buf, count, pos);
> >> +	else
> >> +		ret = fb_write_screen_buffer(info, buf, count, pos);
> >> +
> >> +	if (ret > 0)
> >> +		*ppos = ret;
> >> +
> >> +	if (err)
> >> +		return err;
> >> +
> >> +	return ret;
> >> +}
> >> +
> >> +static void drm_fbdev_fb_fillrect(struct fb_info *info,
> >> +				  const struct fb_fillrect *rect)
> >> +{
> >> +	if (drm_fbdev_use_iomem(info))
> >> +		drm_fb_helper_cfb_fillrect(info, rect);
> >> +	else
> >> +		drm_fb_helper_sys_fillrect(info, rect);
> >> +}
> >> +
> >> +static void drm_fbdev_fb_copyarea(struct fb_info *info,
> >> +				  const struct fb_copyarea *area)
> >> +{
> >> +	if (drm_fbdev_use_iomem(info))
> >> +		drm_fb_helper_cfb_copyarea(info, area);
> >> +	else
> >> +		drm_fb_helper_sys_copyarea(info, area);
> >> +}
> >> +
> >> +static void drm_fbdev_fb_imageblit(struct fb_info *info,
> >> +				   const struct fb_image *image)
> >> +{
> >> +	if (drm_fbdev_use_iomem(info))
> >> +		drm_fb_helper_cfb_imageblit(info, image);
> >> +	else
> >> +		drm_fb_helper_sys_imageblit(info, image);
> >> +}
> >> +
> >>  static const struct fb_ops drm_fbdev_fb_ops = {
> >>  	.owner		= THIS_MODULE,
> >>  	DRM_FB_HELPER_DEFAULT_OPS,
> >> @@ -2034,11 +2233,11 @@ static const struct fb_ops drm_fbdev_fb_ops = {
> >>  	.fb_release	= drm_fbdev_fb_release,
> >>  	.fb_destroy	= drm_fbdev_fb_destroy,
> >>  	.fb_mmap	= drm_fbdev_fb_mmap,
> >> -	.fb_read	= drm_fb_helper_sys_read,
> >> -	.fb_write	= drm_fb_helper_sys_write,
> >> -	.fb_fillrect	= drm_fb_helper_sys_fillrect,
> >> -	.fb_copyarea	= drm_fb_helper_sys_copyarea,
> >> -	.fb_imageblit	= drm_fb_helper_sys_imageblit,
> >> +	.fb_read	= drm_fbdev_fb_read,
> >> +	.fb_write	= drm_fbdev_fb_write,
> >> +	.fb_fillrect	= drm_fbdev_fb_fillrect,
> >> +	.fb_copyarea	= drm_fbdev_fb_copyarea,
> >> +	.fb_imageblit	= drm_fbdev_fb_imageblit,
> >>  };
> >>  
> >>  static struct fb_deferred_io drm_fbdev_defio = {
> >> diff --git a/include/drm/drm_mode_config.h b/include/drm/drm_mode_config.h
> >> index 5ffbb4ed5b35..ab424ddd7665 100644
> >> --- a/include/drm/drm_mode_config.h
> >> +++ b/include/drm/drm_mode_config.h
> >> @@ -877,18 +877,6 @@ struct drm_mode_config {
> >>  	 */
> >>  	bool prefer_shadow_fbdev;
> >>  
> >> -	/**
> >> -	 * @fbdev_use_iomem:
> >> -	 *
> >> -	 * Set to true if framebuffer reside in iomem.
> >> -	 * When set to true memcpy_toio() is used when copying the framebuffer in
> >> -	 * drm_fb_helper.drm_fb_helper_dirty_blit_real().
> >> -	 *
> >> -	 * FIXME: This should be replaced with a per-mapping is_iomem
> >> -	 * flag (like ttm does), and then used everywhere in fbdev code.
> >> -	 */
> >> -	bool fbdev_use_iomem;
> >> -
> >>  	/**
> >>  	 * @quirk_addfb_prefer_xbgr_30bpp:
> >>  	 *
> >> -- 
> >> 2.28.0
> >>
> > 
> 
> -- 
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer
Sam Ravnborg Oct. 24, 2020, 8:38 p.m. UTC | #3
Hi Thomas.

On Tue, Oct 20, 2020 at 02:20:46PM +0200, Thomas Zimmermann wrote:
> At least sparc64 requires I/O-specific access to framebuffers. This

> patch updates the fbdev console accordingly.

> 

> For drivers with direct access to the framebuffer memory, the callback

> functions in struct fb_ops test for the type of memory and call the rsp

> fb_sys_ of fb_cfb_ functions. Read and write operations are implemented

> internally by DRM's fbdev helper.

> 

> For drivers that employ a shadow buffer, fbdev's blit function retrieves

> the framebuffer address as struct dma_buf_map, and uses dma_buf_map

> interfaces to access the buffer.

> 

> The bochs driver on sparc64 uses a workaround to flag the framebuffer as

> I/O memory and avoid a HW exception. With the introduction of struct

> dma_buf_map, this is not required any longer. The patch removes the rsp

> code from both, bochs and fbdev.

> 

> v5:

> 	* implement fb_read/fb_write internally (Daniel, Sam)

> v4:

> 	* move dma_buf_map changes into separate patch (Daniel)

> 	* TODO list: comment on fbdev updates (Daniel)

> 

> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>

> Tested-by: Sam Ravnborg <sam@ravnborg.org>

Reviewed-by: Sam Ravnborg <sam@ravnborg.org>


But see a few comments below on naming for you to consider.

	Sam

> ---

>  Documentation/gpu/todo.rst        |  19 ++-

>  drivers/gpu/drm/bochs/bochs_kms.c |   1 -

>  drivers/gpu/drm/drm_fb_helper.c   | 227 ++++++++++++++++++++++++++++--

>  include/drm/drm_mode_config.h     |  12 --

>  4 files changed, 230 insertions(+), 29 deletions(-)

> 

> diff --git a/Documentation/gpu/todo.rst b/Documentation/gpu/todo.rst

> index 7e6fc3c04add..638b7f704339 100644

> --- a/Documentation/gpu/todo.rst

> +++ b/Documentation/gpu/todo.rst

> @@ -197,13 +197,28 @@ Convert drivers to use drm_fbdev_generic_setup()

>  ------------------------------------------------

>  

>  Most drivers can use drm_fbdev_generic_setup(). Driver have to implement

> -atomic modesetting and GEM vmap support. Current generic fbdev emulation

> -expects the framebuffer in system memory (or system-like memory).

> +atomic modesetting and GEM vmap support. Historically, generic fbdev emulation

> +expected the framebuffer in system memory or system-like memory. By employing

> +struct dma_buf_map, drivers with frambuffers in I/O memory can be supported

> +as well.

>  

>  Contact: Maintainer of the driver you plan to convert

>  

>  Level: Intermediate

>  

> +Reimplement functions in drm_fbdev_fb_ops without fbdev

> +-------------------------------------------------------

> +

> +A number of callback functions in drm_fbdev_fb_ops could benefit from

> +being rewritten without dependencies on the fbdev module. Some of the

> +helpers could further benefit from using struct dma_buf_map instead of

> +raw pointers.

> +

> +Contact: Thomas Zimmermann <tzimmermann@suse.de>, Daniel Vetter

> +

> +Level: Advanced

> +

> +

>  drm_framebuffer_funcs and drm_mode_config_funcs.fb_create cleanup

>  -----------------------------------------------------------------

>  

> diff --git a/drivers/gpu/drm/bochs/bochs_kms.c b/drivers/gpu/drm/bochs/bochs_kms.c

> index 13d0d04c4457..853081d186d5 100644

> --- a/drivers/gpu/drm/bochs/bochs_kms.c

> +++ b/drivers/gpu/drm/bochs/bochs_kms.c

> @@ -151,7 +151,6 @@ int bochs_kms_init(struct bochs_device *bochs)

>  	bochs->dev->mode_config.preferred_depth = 24;

>  	bochs->dev->mode_config.prefer_shadow = 0;

>  	bochs->dev->mode_config.prefer_shadow_fbdev = 1;

> -	bochs->dev->mode_config.fbdev_use_iomem = true;

>  	bochs->dev->mode_config.quirk_addfb_prefer_host_byte_order = true;

>  

>  	bochs->dev->mode_config.funcs = &bochs_mode_funcs;

> diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c

> index 6212cd7cde1d..1d3180841778 100644

> --- a/drivers/gpu/drm/drm_fb_helper.c

> +++ b/drivers/gpu/drm/drm_fb_helper.c

> @@ -372,24 +372,22 @@ static void drm_fb_helper_resume_worker(struct work_struct *work)

>  }

>  

>  static void drm_fb_helper_dirty_blit_real(struct drm_fb_helper *fb_helper,

> -					  struct drm_clip_rect *clip)

> +					  struct drm_clip_rect *clip,

> +					  struct dma_buf_map *dst)

>  {

>  	struct drm_framebuffer *fb = fb_helper->fb;

>  	unsigned int cpp = fb->format->cpp[0];

>  	size_t offset = clip->y1 * fb->pitches[0] + clip->x1 * cpp;

>  	void *src = fb_helper->fbdev->screen_buffer + offset;

> -	void *dst = fb_helper->buffer->map.vaddr + offset;

>  	size_t len = (clip->x2 - clip->x1) * cpp;

>  	unsigned int y;

>  

> -	for (y = clip->y1; y < clip->y2; y++) {

> -		if (!fb_helper->dev->mode_config.fbdev_use_iomem)

> -			memcpy(dst, src, len);

> -		else

> -			memcpy_toio((void __iomem *)dst, src, len);

> +	dma_buf_map_incr(dst, offset); /* go to first pixel within clip rect */

>  

> +	for (y = clip->y1; y < clip->y2; y++) {

> +		dma_buf_map_memcpy_to(dst, src, len);

> +		dma_buf_map_incr(dst, fb->pitches[0]);

>  		src += fb->pitches[0];

> -		dst += fb->pitches[0];

>  	}

>  }

>  

> @@ -417,8 +415,9 @@ static void drm_fb_helper_dirty_work(struct work_struct *work)

>  			ret = drm_client_buffer_vmap(helper->buffer, &map);

>  			if (ret)

>  				return;

> -			drm_fb_helper_dirty_blit_real(helper, &clip_copy);

> +			drm_fb_helper_dirty_blit_real(helper, &clip_copy, &map);

>  		}

> +

>  		if (helper->fb->funcs->dirty)

>  			helper->fb->funcs->dirty(helper->fb, NULL, 0, 0,

>  						 &clip_copy, 1);

> @@ -2027,6 +2026,206 @@ static int drm_fbdev_fb_mmap(struct fb_info *info, struct vm_area_struct *vma)

>  		return -ENODEV;

>  }

>  

> +static bool drm_fbdev_use_iomem(struct fb_info *info)

> +{

> +	struct drm_fb_helper *fb_helper = info->par;

> +	struct drm_client_buffer *buffer = fb_helper->buffer;

> +

> +	return !drm_fbdev_use_shadow_fb(fb_helper) && buffer->map.is_iomem;

> +}

> +

> +static ssize_t fb_read_screen_base(struct fb_info *info, char __user *buf, size_t count, 

> +				   loff_t pos)

The naming here confused me - a name like:
fb_read_iomem() would have helped me more.
With the current naming I shall remember that the screen_base member is
the iomem pointer.

> +{

> +	const char __iomem *src = info->screen_base + pos;

> +	size_t alloc_size = min(count, PAGE_SIZE);

> +	ssize_t ret = 0;

> +	char *tmp;

> +

> +	tmp = kmalloc(alloc_size, GFP_KERNEL);

> +	if (!tmp)

> +		return -ENOMEM;

> +


I looked around and could not find other places where
we copy from iomem to mem to usermem in chunks of PAGE_SIZE.

> +	while (count) {

> +		size_t c = min(count, alloc_size);

> +

> +		memcpy_fromio(tmp, src, c);

> +		if (copy_to_user(buf, tmp, c)) {

> +			ret = -EFAULT;

> +			break;

> +		}

> +

> +		src += c;

> +		buf += c;

> +		ret += c;

> +		count -= c;

> +	}

> +

> +	kfree(tmp);

> +

> +	return ret;

> +}

> +

> +static ssize_t fb_read_screen_buffer(struct fb_info *info, char __user *buf, size_t count,

> +				     loff_t pos)

And fb_read_sysmem() here.

> +{

> +	const char *src = info->screen_buffer + pos;

> +

> +	if (copy_to_user(buf, src, count))

> +		return -EFAULT;

> +

> +	return count;

> +}

> +

> +static ssize_t drm_fbdev_fb_read(struct fb_info *info, char __user *buf,

> +				 size_t count, loff_t *ppos)

> +{

> +	loff_t pos = *ppos;

> +	size_t total_size;

> +	ssize_t ret;

> +

> +	if (info->state != FBINFO_STATE_RUNNING)

> +		return -EPERM;

> +

> +	if (info->screen_size)

> +		total_size = info->screen_size;

> +	else

> +		total_size = info->fix.smem_len;

> +

> +	if (pos >= total_size)

> +		return 0;

> +	if (count >= total_size)

> +		count = total_size;

> +	if (total_size - count < pos)

> +		count = total_size - pos;

> +

> +	if (drm_fbdev_use_iomem(info))

> +		ret = fb_read_screen_base(info, buf, count, pos);

> +	else

> +		ret = fb_read_screen_buffer(info, buf, count, pos);

> +

> +	if (ret > 0)

> +		*ppos = ret;

> +

> +	return ret;

> +}

> +

> +static ssize_t fb_write_screen_base(struct fb_info *info, const char __user *buf, size_t count,

> +				    loff_t pos)


fb_write_iomem()

> +{

> +	char __iomem *dst = info->screen_base + pos;

> +	size_t alloc_size = min(count, PAGE_SIZE);

> +	ssize_t ret = 0;

> +	u8 *tmp;

> +

> +	tmp = kmalloc(alloc_size, GFP_KERNEL);

> +	if (!tmp)

> +		return -ENOMEM;

> +

> +	while (count) {

> +		size_t c = min(count, alloc_size);

> +

> +		if (copy_from_user(tmp, buf, c)) {

> +			ret = -EFAULT;

> +			break;

> +		}

> +		memcpy_toio(dst, tmp, c);

> +

> +		dst += c;

> +		buf += c;

> +		ret += c;

> +		count -= c;

> +	}

> +

> +	kfree(tmp);

> +

> +	return ret;

> +}

> +

> +static ssize_t fb_write_screen_buffer(struct fb_info *info, const char __user *buf, size_t count,

> +				      loff_t pos)

fb_write_sysmem()

> +{

> +	char *dst = info->screen_buffer + pos;

> +

> +	if (copy_from_user(dst, buf, count))

> +		return -EFAULT;

> +

> +	return count;

> +}

> +

> +static ssize_t drm_fbdev_fb_write(struct fb_info *info, const char __user *buf,

> +				  size_t count, loff_t *ppos)

> +{

> +	loff_t pos = *ppos;

> +	size_t total_size;

> +	ssize_t ret;

> +	int err;

> +

> +	if (info->state != FBINFO_STATE_RUNNING)

> +		return -EPERM;

> +

> +	if (info->screen_size)

> +		total_size = info->screen_size;

> +	else

> +		total_size = info->fix.smem_len;

> +

> +	if (pos > total_size)

> +		return -EFBIG;

> +	if (count > total_size) {

> +		err = -EFBIG;

> +		count = total_size;

> +	}

> +	if (total_size - count < pos) {

> +		if (!err)

> +			err = -ENOSPC;

> +		count = total_size - pos;

> +	}

> +

> +	/*

> +	 * Copy to framebuffer even if we already logged an error. Emulates

> +	 * the behavior of the original fbdev implementation.

> +	 */

> +	if (drm_fbdev_use_iomem(info))

> +		ret = fb_write_screen_base(info, buf, count, pos);

> +	else

> +		ret = fb_write_screen_buffer(info, buf, count, pos);

> +

> +	if (ret > 0)

> +		*ppos = ret;

> +

> +	if (err)

> +		return err;

> +

> +	return ret;

> +}

> +

> +static void drm_fbdev_fb_fillrect(struct fb_info *info,

> +				  const struct fb_fillrect *rect)

> +{

> +	if (drm_fbdev_use_iomem(info))

> +		drm_fb_helper_cfb_fillrect(info, rect);

> +	else

> +		drm_fb_helper_sys_fillrect(info, rect);

> +}

> +

> +static void drm_fbdev_fb_copyarea(struct fb_info *info,

> +				  const struct fb_copyarea *area)

> +{

> +	if (drm_fbdev_use_iomem(info))

> +		drm_fb_helper_cfb_copyarea(info, area);

> +	else

> +		drm_fb_helper_sys_copyarea(info, area);

> +}

> +

> +static void drm_fbdev_fb_imageblit(struct fb_info *info,

> +				   const struct fb_image *image)

> +{

> +	if (drm_fbdev_use_iomem(info))

> +		drm_fb_helper_cfb_imageblit(info, image);

> +	else

> +		drm_fb_helper_sys_imageblit(info, image);

> +}

> +

>  static const struct fb_ops drm_fbdev_fb_ops = {

>  	.owner		= THIS_MODULE,

>  	DRM_FB_HELPER_DEFAULT_OPS,

> @@ -2034,11 +2233,11 @@ static const struct fb_ops drm_fbdev_fb_ops = {

>  	.fb_release	= drm_fbdev_fb_release,

>  	.fb_destroy	= drm_fbdev_fb_destroy,

>  	.fb_mmap	= drm_fbdev_fb_mmap,

> -	.fb_read	= drm_fb_helper_sys_read,

> -	.fb_write	= drm_fb_helper_sys_write,

> -	.fb_fillrect	= drm_fb_helper_sys_fillrect,

> -	.fb_copyarea	= drm_fb_helper_sys_copyarea,

> -	.fb_imageblit	= drm_fb_helper_sys_imageblit,

> +	.fb_read	= drm_fbdev_fb_read,

> +	.fb_write	= drm_fbdev_fb_write,

> +	.fb_fillrect	= drm_fbdev_fb_fillrect,

> +	.fb_copyarea	= drm_fbdev_fb_copyarea,

> +	.fb_imageblit	= drm_fbdev_fb_imageblit,

>  };

>  

>  static struct fb_deferred_io drm_fbdev_defio = {

> diff --git a/include/drm/drm_mode_config.h b/include/drm/drm_mode_config.h

> index 5ffbb4ed5b35..ab424ddd7665 100644

> --- a/include/drm/drm_mode_config.h

> +++ b/include/drm/drm_mode_config.h

> @@ -877,18 +877,6 @@ struct drm_mode_config {

>  	 */

>  	bool prefer_shadow_fbdev;

>  

> -	/**

> -	 * @fbdev_use_iomem:

> -	 *

> -	 * Set to true if framebuffer reside in iomem.

> -	 * When set to true memcpy_toio() is used when copying the framebuffer in

> -	 * drm_fb_helper.drm_fb_helper_dirty_blit_real().

> -	 *

> -	 * FIXME: This should be replaced with a per-mapping is_iomem

> -	 * flag (like ttm does), and then used everywhere in fbdev code.

> -	 */

> -	bool fbdev_use_iomem;

> -

>  	/**

>  	 * @quirk_addfb_prefer_xbgr_30bpp:

>  	 *

> -- 

> 2.28.0
Thomas Zimmermann Oct. 26, 2020, 7:50 a.m. UTC | #4
Hi

Am 24.10.20 um 22:38 schrieb Sam Ravnborg:
> Hi Thomas.
> 
> On Tue, Oct 20, 2020 at 02:20:46PM +0200, Thomas Zimmermann wrote:
>> At least sparc64 requires I/O-specific access to framebuffers. This
>> patch updates the fbdev console accordingly.
>>
>> For drivers with direct access to the framebuffer memory, the callback
>> functions in struct fb_ops test for the type of memory and call the rsp
>> fb_sys_ of fb_cfb_ functions. Read and write operations are implemented
>> internally by DRM's fbdev helper.
>>
>> For drivers that employ a shadow buffer, fbdev's blit function retrieves
>> the framebuffer address as struct dma_buf_map, and uses dma_buf_map
>> interfaces to access the buffer.
>>
>> The bochs driver on sparc64 uses a workaround to flag the framebuffer as
>> I/O memory and avoid a HW exception. With the introduction of struct
>> dma_buf_map, this is not required any longer. The patch removes the rsp
>> code from both, bochs and fbdev.
>>
>> v5:
>> 	* implement fb_read/fb_write internally (Daniel, Sam)
>> v4:
>> 	* move dma_buf_map changes into separate patch (Daniel)
>> 	* TODO list: comment on fbdev updates (Daniel)
>>
>> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
>> Tested-by: Sam Ravnborg <sam@ravnborg.org>
> Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
> 
> But see a few comments below on naming for you to consider.
> 
> 	Sam
> 
>> ---
>>  Documentation/gpu/todo.rst        |  19 ++-
>>  drivers/gpu/drm/bochs/bochs_kms.c |   1 -
>>  drivers/gpu/drm/drm_fb_helper.c   | 227 ++++++++++++++++++++++++++++--
>>  include/drm/drm_mode_config.h     |  12 --
>>  4 files changed, 230 insertions(+), 29 deletions(-)
>>
>> diff --git a/Documentation/gpu/todo.rst b/Documentation/gpu/todo.rst
>> index 7e6fc3c04add..638b7f704339 100644
>> --- a/Documentation/gpu/todo.rst
>> +++ b/Documentation/gpu/todo.rst
>> @@ -197,13 +197,28 @@ Convert drivers to use drm_fbdev_generic_setup()
>>  ------------------------------------------------
>>  
>>  Most drivers can use drm_fbdev_generic_setup(). Driver have to implement
>> -atomic modesetting and GEM vmap support. Current generic fbdev emulation
>> -expects the framebuffer in system memory (or system-like memory).
>> +atomic modesetting and GEM vmap support. Historically, generic fbdev emulation
>> +expected the framebuffer in system memory or system-like memory. By employing
>> +struct dma_buf_map, drivers with frambuffers in I/O memory can be supported
>> +as well.
>>  
>>  Contact: Maintainer of the driver you plan to convert
>>  
>>  Level: Intermediate
>>  
>> +Reimplement functions in drm_fbdev_fb_ops without fbdev
>> +-------------------------------------------------------
>> +
>> +A number of callback functions in drm_fbdev_fb_ops could benefit from
>> +being rewritten without dependencies on the fbdev module. Some of the
>> +helpers could further benefit from using struct dma_buf_map instead of
>> +raw pointers.
>> +
>> +Contact: Thomas Zimmermann <tzimmermann@suse.de>, Daniel Vetter
>> +
>> +Level: Advanced
>> +
>> +
>>  drm_framebuffer_funcs and drm_mode_config_funcs.fb_create cleanup
>>  -----------------------------------------------------------------
>>  
>> diff --git a/drivers/gpu/drm/bochs/bochs_kms.c b/drivers/gpu/drm/bochs/bochs_kms.c
>> index 13d0d04c4457..853081d186d5 100644
>> --- a/drivers/gpu/drm/bochs/bochs_kms.c
>> +++ b/drivers/gpu/drm/bochs/bochs_kms.c
>> @@ -151,7 +151,6 @@ int bochs_kms_init(struct bochs_device *bochs)
>>  	bochs->dev->mode_config.preferred_depth = 24;
>>  	bochs->dev->mode_config.prefer_shadow = 0;
>>  	bochs->dev->mode_config.prefer_shadow_fbdev = 1;
>> -	bochs->dev->mode_config.fbdev_use_iomem = true;
>>  	bochs->dev->mode_config.quirk_addfb_prefer_host_byte_order = true;
>>  
>>  	bochs->dev->mode_config.funcs = &bochs_mode_funcs;
>> diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
>> index 6212cd7cde1d..1d3180841778 100644
>> --- a/drivers/gpu/drm/drm_fb_helper.c
>> +++ b/drivers/gpu/drm/drm_fb_helper.c
>> @@ -372,24 +372,22 @@ static void drm_fb_helper_resume_worker(struct work_struct *work)
>>  }
>>  
>>  static void drm_fb_helper_dirty_blit_real(struct drm_fb_helper *fb_helper,
>> -					  struct drm_clip_rect *clip)
>> +					  struct drm_clip_rect *clip,
>> +					  struct dma_buf_map *dst)
>>  {
>>  	struct drm_framebuffer *fb = fb_helper->fb;
>>  	unsigned int cpp = fb->format->cpp[0];
>>  	size_t offset = clip->y1 * fb->pitches[0] + clip->x1 * cpp;
>>  	void *src = fb_helper->fbdev->screen_buffer + offset;
>> -	void *dst = fb_helper->buffer->map.vaddr + offset;
>>  	size_t len = (clip->x2 - clip->x1) * cpp;
>>  	unsigned int y;
>>  
>> -	for (y = clip->y1; y < clip->y2; y++) {
>> -		if (!fb_helper->dev->mode_config.fbdev_use_iomem)
>> -			memcpy(dst, src, len);
>> -		else
>> -			memcpy_toio((void __iomem *)dst, src, len);
>> +	dma_buf_map_incr(dst, offset); /* go to first pixel within clip rect */
>>  
>> +	for (y = clip->y1; y < clip->y2; y++) {
>> +		dma_buf_map_memcpy_to(dst, src, len);
>> +		dma_buf_map_incr(dst, fb->pitches[0]);
>>  		src += fb->pitches[0];
>> -		dst += fb->pitches[0];
>>  	}
>>  }
>>  
>> @@ -417,8 +415,9 @@ static void drm_fb_helper_dirty_work(struct work_struct *work)
>>  			ret = drm_client_buffer_vmap(helper->buffer, &map);
>>  			if (ret)
>>  				return;
>> -			drm_fb_helper_dirty_blit_real(helper, &clip_copy);
>> +			drm_fb_helper_dirty_blit_real(helper, &clip_copy, &map);
>>  		}
>> +
>>  		if (helper->fb->funcs->dirty)
>>  			helper->fb->funcs->dirty(helper->fb, NULL, 0, 0,
>>  						 &clip_copy, 1);
>> @@ -2027,6 +2026,206 @@ static int drm_fbdev_fb_mmap(struct fb_info *info, struct vm_area_struct *vma)
>>  		return -ENODEV;
>>  }
>>  
>> +static bool drm_fbdev_use_iomem(struct fb_info *info)
>> +{
>> +	struct drm_fb_helper *fb_helper = info->par;
>> +	struct drm_client_buffer *buffer = fb_helper->buffer;
>> +
>> +	return !drm_fbdev_use_shadow_fb(fb_helper) && buffer->map.is_iomem;
>> +}
>> +
>> +static ssize_t fb_read_screen_base(struct fb_info *info, char __user *buf, size_t count, 
>> +				   loff_t pos)
> The naming here confused me - a name like:
> fb_read_iomem() would have helped me more.
> With the current naming I shall remember that the screen_base member is
> the iomem pointer.

Yeah, true. In terms of naming, I was undecided. I was thinking about
adopting a naming similar to what you describe, but OTOH we don't use
sysmem anywhere in the code. I thought about adopting fbdev's conention
of using _sys_ and _cfb_. But that would make sensein the local context.

> 
>> +{
>> +	const char __iomem *src = info->screen_base + pos;
>> +	size_t alloc_size = min(count, PAGE_SIZE);
>> +	ssize_t ret = 0;
>> +	char *tmp;
>> +
>> +	tmp = kmalloc(alloc_size, GFP_KERNEL);
>> +	if (!tmp)
>> +		return -ENOMEM;
>> +
> 
> I looked around and could not find other places where
> we copy from iomem to mem to usermem in chunks of PAGE_SIZE.

I took this pattern from fbdev's original implementation. I think it's
done to work nicely with kmalloc.

Best regards
Thomas

> 
>> +	while (count) {
>> +		size_t c = min(count, alloc_size);
>> +
>> +		memcpy_fromio(tmp, src, c);
>> +		if (copy_to_user(buf, tmp, c)) {
>> +			ret = -EFAULT;
>> +			break;
>> +		}
>> +
>> +		src += c;
>> +		buf += c;
>> +		ret += c;
>> +		count -= c;
>> +	}
>> +
>> +	kfree(tmp);
>> +
>> +	return ret;
>> +}
>> +
>> +static ssize_t fb_read_screen_buffer(struct fb_info *info, char __user *buf, size_t count,
>> +				     loff_t pos)
> And fb_read_sysmem() here.
> 
>> +{
>> +	const char *src = info->screen_buffer + pos;
>> +
>> +	if (copy_to_user(buf, src, count))
>> +		return -EFAULT;
>> +
>> +	return count;
>> +}
>> +
>> +static ssize_t drm_fbdev_fb_read(struct fb_info *info, char __user *buf,
>> +				 size_t count, loff_t *ppos)
>> +{
>> +	loff_t pos = *ppos;
>> +	size_t total_size;
>> +	ssize_t ret;
>> +
>> +	if (info->state != FBINFO_STATE_RUNNING)
>> +		return -EPERM;
>> +
>> +	if (info->screen_size)
>> +		total_size = info->screen_size;
>> +	else
>> +		total_size = info->fix.smem_len;
>> +
>> +	if (pos >= total_size)
>> +		return 0;
>> +	if (count >= total_size)
>> +		count = total_size;
>> +	if (total_size - count < pos)
>> +		count = total_size - pos;
>> +
>> +	if (drm_fbdev_use_iomem(info))
>> +		ret = fb_read_screen_base(info, buf, count, pos);
>> +	else
>> +		ret = fb_read_screen_buffer(info, buf, count, pos);
>> +
>> +	if (ret > 0)
>> +		*ppos = ret;
>> +
>> +	return ret;
>> +}
>> +
>> +static ssize_t fb_write_screen_base(struct fb_info *info, const char __user *buf, size_t count,
>> +				    loff_t pos)
> 
> fb_write_iomem()
> 
>> +{
>> +	char __iomem *dst = info->screen_base + pos;
>> +	size_t alloc_size = min(count, PAGE_SIZE);
>> +	ssize_t ret = 0;
>> +	u8 *tmp;
>> +
>> +	tmp = kmalloc(alloc_size, GFP_KERNEL);
>> +	if (!tmp)
>> +		return -ENOMEM;
>> +
>> +	while (count) {
>> +		size_t c = min(count, alloc_size);
>> +
>> +		if (copy_from_user(tmp, buf, c)) {
>> +			ret = -EFAULT;
>> +			break;
>> +		}
>> +		memcpy_toio(dst, tmp, c);
>> +
>> +		dst += c;
>> +		buf += c;
>> +		ret += c;
>> +		count -= c;
>> +	}
>> +
>> +	kfree(tmp);
>> +
>> +	return ret;
>> +}
>> +
>> +static ssize_t fb_write_screen_buffer(struct fb_info *info, const char __user *buf, size_t count,
>> +				      loff_t pos)
> fb_write_sysmem()
> 
>> +{
>> +	char *dst = info->screen_buffer + pos;
>> +
>> +	if (copy_from_user(dst, buf, count))
>> +		return -EFAULT;
>> +
>> +	return count;
>> +}
>> +
>> +static ssize_t drm_fbdev_fb_write(struct fb_info *info, const char __user *buf,
>> +				  size_t count, loff_t *ppos)
>> +{
>> +	loff_t pos = *ppos;
>> +	size_t total_size;
>> +	ssize_t ret;
>> +	int err;
>> +
>> +	if (info->state != FBINFO_STATE_RUNNING)
>> +		return -EPERM;
>> +
>> +	if (info->screen_size)
>> +		total_size = info->screen_size;
>> +	else
>> +		total_size = info->fix.smem_len;
>> +
>> +	if (pos > total_size)
>> +		return -EFBIG;
>> +	if (count > total_size) {
>> +		err = -EFBIG;
>> +		count = total_size;
>> +	}
>> +	if (total_size - count < pos) {
>> +		if (!err)
>> +			err = -ENOSPC;
>> +		count = total_size - pos;
>> +	}
>> +
>> +	/*
>> +	 * Copy to framebuffer even if we already logged an error. Emulates
>> +	 * the behavior of the original fbdev implementation.
>> +	 */
>> +	if (drm_fbdev_use_iomem(info))
>> +		ret = fb_write_screen_base(info, buf, count, pos);
>> +	else
>> +		ret = fb_write_screen_buffer(info, buf, count, pos);
>> +
>> +	if (ret > 0)
>> +		*ppos = ret;
>> +
>> +	if (err)
>> +		return err;
>> +
>> +	return ret;
>> +}
>> +
>> +static void drm_fbdev_fb_fillrect(struct fb_info *info,
>> +				  const struct fb_fillrect *rect)
>> +{
>> +	if (drm_fbdev_use_iomem(info))
>> +		drm_fb_helper_cfb_fillrect(info, rect);
>> +	else
>> +		drm_fb_helper_sys_fillrect(info, rect);
>> +}
>> +
>> +static void drm_fbdev_fb_copyarea(struct fb_info *info,
>> +				  const struct fb_copyarea *area)
>> +{
>> +	if (drm_fbdev_use_iomem(info))
>> +		drm_fb_helper_cfb_copyarea(info, area);
>> +	else
>> +		drm_fb_helper_sys_copyarea(info, area);
>> +}
>> +
>> +static void drm_fbdev_fb_imageblit(struct fb_info *info,
>> +				   const struct fb_image *image)
>> +{
>> +	if (drm_fbdev_use_iomem(info))
>> +		drm_fb_helper_cfb_imageblit(info, image);
>> +	else
>> +		drm_fb_helper_sys_imageblit(info, image);
>> +}
>> +
>>  static const struct fb_ops drm_fbdev_fb_ops = {
>>  	.owner		= THIS_MODULE,
>>  	DRM_FB_HELPER_DEFAULT_OPS,
>> @@ -2034,11 +2233,11 @@ static const struct fb_ops drm_fbdev_fb_ops = {
>>  	.fb_release	= drm_fbdev_fb_release,
>>  	.fb_destroy	= drm_fbdev_fb_destroy,
>>  	.fb_mmap	= drm_fbdev_fb_mmap,
>> -	.fb_read	= drm_fb_helper_sys_read,
>> -	.fb_write	= drm_fb_helper_sys_write,
>> -	.fb_fillrect	= drm_fb_helper_sys_fillrect,
>> -	.fb_copyarea	= drm_fb_helper_sys_copyarea,
>> -	.fb_imageblit	= drm_fb_helper_sys_imageblit,
>> +	.fb_read	= drm_fbdev_fb_read,
>> +	.fb_write	= drm_fbdev_fb_write,
>> +	.fb_fillrect	= drm_fbdev_fb_fillrect,
>> +	.fb_copyarea	= drm_fbdev_fb_copyarea,
>> +	.fb_imageblit	= drm_fbdev_fb_imageblit,
>>  };
>>  
>>  static struct fb_deferred_io drm_fbdev_defio = {
>> diff --git a/include/drm/drm_mode_config.h b/include/drm/drm_mode_config.h
>> index 5ffbb4ed5b35..ab424ddd7665 100644
>> --- a/include/drm/drm_mode_config.h
>> +++ b/include/drm/drm_mode_config.h
>> @@ -877,18 +877,6 @@ struct drm_mode_config {
>>  	 */
>>  	bool prefer_shadow_fbdev;
>>  
>> -	/**
>> -	 * @fbdev_use_iomem:
>> -	 *
>> -	 * Set to true if framebuffer reside in iomem.
>> -	 * When set to true memcpy_toio() is used when copying the framebuffer in
>> -	 * drm_fb_helper.drm_fb_helper_dirty_blit_real().
>> -	 *
>> -	 * FIXME: This should be replaced with a per-mapping is_iomem
>> -	 * flag (like ttm does), and then used everywhere in fbdev code.
>> -	 */
>> -	bool fbdev_use_iomem;
>> -
>>  	/**
>>  	 * @quirk_addfb_prefer_xbgr_30bpp:
>>  	 *
>> -- 
>> 2.28.0
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>
Linus Walleij Nov. 5, 2020, 10:07 a.m. UTC | #5
Overall I like this, just an inline question:

On Tue, Oct 20, 2020 at 2:20 PM Thomas Zimmermann <tzimmermann@suse.de> wrote:

> To do framebuffer updates, one needs memcpy from system memory and a
> pointer-increment function. Add both interfaces with documentation.

(...)
> +/**
> + * dma_buf_map_memcpy_to - Memcpy into dma-buf mapping
> + * @dst:       The dma-buf mapping structure
> + * @src:       The source buffer
> + * @len:       The number of byte in src
> + *
> + * Copies data into a dma-buf mapping. The source buffer is in system
> + * memory. Depending on the buffer's location, the helper picks the correct
> + * method of accessing the memory.
> + */
> +static inline void dma_buf_map_memcpy_to(struct dma_buf_map *dst, const void *src, size_t len)
> +{
> +       if (dst->is_iomem)
> +               memcpy_toio(dst->vaddr_iomem, src, len);
> +       else
> +               memcpy(dst->vaddr, src, len);
> +}

Are these going to be really big memcpy() operations?

Some platforms have DMA offload engines that can perform memcpy(),
drivers/dma, include/linux/dmaengine.h
especially if the CPU doesn't really need to touch the contents
and flush caches etc.
An example exist in some MTD drivers that move large quantities of
data off flash memory like this:
drivers/mtd/nand/raw/cadence-nand-controller.c

Notice that DMAengine and DMAbuf does not have much in common,
the names can be deceiving.

The value of this varies with the system architecture. It is not just
a question about performance but also about power and the CPU
being able to do other stuff in parallel for large transfers. So *when*
to use this facility to accelerate memcpy() is a delicate question.

What I'm after here is if these can be really big, do we want
(in the long run, not now) open up to the idea to slot in
hardware-accelerated memcpy() here?

Yours,
Linus Walleij
Thomas Zimmermann Nov. 5, 2020, 10:37 a.m. UTC | #6
Hi

Am 05.11.20 um 11:07 schrieb Linus Walleij:
> Overall I like this, just an inline question:
> 
> On Tue, Oct 20, 2020 at 2:20 PM Thomas Zimmermann <tzimmermann@suse.de> wrote:
> 
>> To do framebuffer updates, one needs memcpy from system memory and a
>> pointer-increment function. Add both interfaces with documentation.
> 
> (...)
>> +/**
>> + * dma_buf_map_memcpy_to - Memcpy into dma-buf mapping
>> + * @dst:       The dma-buf mapping structure
>> + * @src:       The source buffer
>> + * @len:       The number of byte in src
>> + *
>> + * Copies data into a dma-buf mapping. The source buffer is in system
>> + * memory. Depending on the buffer's location, the helper picks the correct
>> + * method of accessing the memory.
>> + */
>> +static inline void dma_buf_map_memcpy_to(struct dma_buf_map *dst, const void *src, size_t len)
>> +{
>> +       if (dst->is_iomem)
>> +               memcpy_toio(dst->vaddr_iomem, src, len);
>> +       else
>> +               memcpy(dst->vaddr, src, len);
>> +}
> 
> Are these going to be really big memcpy() operations?

Individually, each could be a scanline, so a few KiB. (4 bytes *
horizontal resolution). Updating a full framebuffer can sum up to
several MiB.

> 
> Some platforms have DMA offload engines that can perform memcpy(),They could be
> drivers/dma, include/linux/dmaengine.h
> especially if the CPU doesn't really need to touch the contents
> and flush caches etc.
> An example exist in some MTD drivers that move large quantities of
> data off flash memory like this:
> drivers/mtd/nand/raw/cadence-nand-controller.c
> 
> Notice that DMAengine and DMAbuf does not have much in common,
> the names can be deceiving.
> 
> The value of this varies with the system architecture. It is not just
> a question about performance but also about power and the CPU
> being able to do other stuff in parallel for large transfers. So *when*
> to use this facility to accelerate memcpy() is a delicate question.
> 
> What I'm after here is if these can be really big, do we want
> (in the long run, not now) open up to the idea to slot in
> hardware-accelerated memcpy() here?

We currently use this functionality for the graphical framebuffer
console that most DRM drivers provide. It's non-accelerated and slow,
but this has not been much of a problem so far.

Within DRM, we're more interested in removing console code from drivers
and going for the generic implementation.

Most of the graphics HW allocates framebuffers from video RAM, system
memory or CMA pools and does not really need these memcpys. Only a few
systems with small video RAM require a shadow buffer, which we flush
into VRAM as needed. Those might benefit.

OTOH, off-loading memcpys to hardware sounds reasonable if we can hide
it from the DRM code. I think it all depends on how invasive that change
would be.

Best regards
Thomas

> 
> Yours,
> Linus Walleij
>
Daniel Vetter Nov. 5, 2020, 12:54 p.m. UTC | #7
On Thu, Nov 05, 2020 at 11:37:08AM +0100, Thomas Zimmermann wrote:
> Hi
> 
> Am 05.11.20 um 11:07 schrieb Linus Walleij:
> > Overall I like this, just an inline question:
> > 
> > On Tue, Oct 20, 2020 at 2:20 PM Thomas Zimmermann <tzimmermann@suse.de> wrote:
> > 
> >> To do framebuffer updates, one needs memcpy from system memory and a
> >> pointer-increment function. Add both interfaces with documentation.
> > 
> > (...)
> >> +/**
> >> + * dma_buf_map_memcpy_to - Memcpy into dma-buf mapping
> >> + * @dst:       The dma-buf mapping structure
> >> + * @src:       The source buffer
> >> + * @len:       The number of byte in src
> >> + *
> >> + * Copies data into a dma-buf mapping. The source buffer is in system
> >> + * memory. Depending on the buffer's location, the helper picks the correct
> >> + * method of accessing the memory.
> >> + */
> >> +static inline void dma_buf_map_memcpy_to(struct dma_buf_map *dst, const void *src, size_t len)
> >> +{
> >> +       if (dst->is_iomem)
> >> +               memcpy_toio(dst->vaddr_iomem, src, len);
> >> +       else
> >> +               memcpy(dst->vaddr, src, len);
> >> +}
> > 
> > Are these going to be really big memcpy() operations?
> 
> Individually, each could be a scanline, so a few KiB. (4 bytes *
> horizontal resolution). Updating a full framebuffer can sum up to
> several MiB.
> 
> > 
> > Some platforms have DMA offload engines that can perform memcpy(),They could be
> > drivers/dma, include/linux/dmaengine.h
> > especially if the CPU doesn't really need to touch the contents
> > and flush caches etc.
> > An example exist in some MTD drivers that move large quantities of
> > data off flash memory like this:
> > drivers/mtd/nand/raw/cadence-nand-controller.c
> > 
> > Notice that DMAengine and DMAbuf does not have much in common,
> > the names can be deceiving.
> > 
> > The value of this varies with the system architecture. It is not just
> > a question about performance but also about power and the CPU
> > being able to do other stuff in parallel for large transfers. So *when*
> > to use this facility to accelerate memcpy() is a delicate question.
> > 
> > What I'm after here is if these can be really big, do we want
> > (in the long run, not now) open up to the idea to slot in
> > hardware-accelerated memcpy() here?
> 
> We currently use this functionality for the graphical framebuffer
> console that most DRM drivers provide. It's non-accelerated and slow,
> but this has not been much of a problem so far.
> 
> Within DRM, we're more interested in removing console code from drivers
> and going for the generic implementation.
> 
> Most of the graphics HW allocates framebuffers from video RAM, system
> memory or CMA pools and does not really need these memcpys. Only a few
> systems with small video RAM require a shadow buffer, which we flush
> into VRAM as needed. Those might benefit.
> 
> OTOH, off-loading memcpys to hardware sounds reasonable if we can hide
> it from the DRM code. I think it all depends on how invasive that change
> would be.

I wouldn't, all the additional locks this would pull in sound like
nightmare. And when an oops happens, this might be the only thing that
manages to get the oops to the user.

Unless someone really starts caring about fbcon acceleration I really
wouldn't bother. Ok maybe it also matters for fbdev, but the problem is
that the page fault intercepting alone is already expensive, so the only
real solution if you care about performance in that case is to use kms
natively, and use a dirty rectangle flip (or the DIRTY syscall).

And in there drivers should (and do) use any dma engines they have to
upload the frames already.
-Daniel