mbox series

[RFC,PKS/PMEM,00/58] PMEM: Introduce stray write protection for PMEM

Message ID 20201009195033.3208459-1-ira.weiny@intel.com
Headers show
Series PMEM: Introduce stray write protection for PMEM | expand

Message

Ira Weiny Oct. 9, 2020, 7:49 p.m. UTC
From: Ira Weiny <ira.weiny@intel.com>

Should a stray write in the kernel occur persistent memory is affected more
than regular memory.  A write to the wrong area of memory could result in
latent data corruption which will will persist after a reboot.  PKS provides a
nice way to restrict access to persistent memory kernel mappings, while
providing fast access when needed.

Since the last RFC[1] this patch set has grown quite a bit.  It now depends on
the core patches submitted separately.

	https://lore.kernel.org/lkml/20201009194258.3207172-1-ira.weiny@intel.com/

And contained in the git tree here:

	https://github.com/weiny2/linux-kernel/tree/pks-rfc-v3

However, functionally there is only 1 major change from the last RFC.
Specifically, kmap() is most often used within a single thread in a 'map/do
something/unmap' pattern.  In fact this is the pattern used in ~90% of the
callers of kmap().  This pattern works very well for the pmem use case and the
testing which was done.  However, there were another ~20-30 kmap users which do
not follow this pattern.  Some of them seem to expect the mapping to be
'global' while others require a detailed audit to be sure.[2][3]

While we don't anticipate global mappings to pmem there is a danger in
changing the semantics of kmap().  Effectively, this would cause an unresolved
page fault with little to no information about why.

There were a number of options considered.

1) Attempt to change all the thread local kmap() calls to kmap_atomic()
2) Introduce a flags parameter to kmap() to indicate if the mapping should be
   global or not
3) Change ~20-30 call sites to 'kmap_global()' to indicate that they require a
   global mapping of the pages
4) Change ~209 call sites to 'kmap_thread()' to indicate that the mapping is to
   be used within that thread of execution only

Option 1 is simply not feasible kmap_atomic() is not the same semantic as
kmap() within a single tread.  Option 2 would require all of the call sites of
kmap() to change.  Option 3 seems like a good minimal change but there is a
danger that new code may miss the semantic change of kmap() and not get the
behavior intended for future users.  Therefore, option #4 was chosen.

To handle the global PKRS state in the most efficient manner possible.  We
lazily override the thread specific PKRS key value only when needed because we
anticipate PKS to not be needed will not be needed most of the time.  And even
when it is used 90% of the time it is a thread local call.


[1] https://lore.kernel.org/lkml/20200717072056.73134-1-ira.weiny@intel.com/

[2] The following list of callers continue calling kmap() (utilizing the global
PKRS).  It would be nice if more of them could be converted to kmap_thread()

	drivers/firewire/net.c:         ptr = kmap(dev->broadcast_rcv_buffer.pages[u]);
	drivers/gpu/drm/i915/gem/i915_gem_pages.c:              return kmap(sg_page(sgt->sgl));
	drivers/gpu/drm/ttm/ttm_bo_util.c:              map->virtual = kmap(map->page);
	drivers/infiniband/hw/qib/qib_user_sdma.c:      mpage = kmap(page);
	drivers/misc/vmw_vmci/vmci_host.c:      context->notify = kmap(context->notify_page) + (uva & (PAGE_SIZE - 1));
	drivers/misc/xilinx_sdfec.c:            addr = kmap(pages[i]);
	drivers/mmc/host/usdhi6rol0.c:  host->pg.mapped         = kmap(host->pg.page);
	drivers/mmc/host/usdhi6rol0.c:  host->pg.mapped = kmap(host->pg.page);
	drivers/mmc/host/usdhi6rol0.c:  host->pg.mapped = kmap(host->pg.page);
	drivers/nvme/target/tcp.c:              iov->iov_base = kmap(sg_page(sg)) + sg->offset + sg_offset;
	drivers/scsi/libiscsi_tcp.c:            segment->sg_mapped = kmap(sg_page(sg));
	drivers/target/iscsi/iscsi_target.c:            iov[i].iov_base = kmap(sg_page(sg)) + sg->offset + page_off;
	drivers/target/target_core_transport.c:         return kmap(sg_page(sg)) + sg->offset;
	fs/btrfs/check-integrity.c:             block_ctx->datav[i] = kmap(block_ctx->pagev[i]);
	fs/ceph/dir.c:          cache_ctl->dentries = kmap(cache_ctl->page);
	fs/ceph/inode.c:                ctl->dentries = kmap(ctl->page);
	fs/erofs/zpvec.h:               kmap_atomic(ctor->curr) : kmap(ctor->curr);
	lib/scatterlist.c:              miter->addr = kmap(miter->page) + miter->__offset;
	net/ceph/pagelist.c:    pl->mapped_tail = kmap(page);
	net/ceph/pagelist.c:            pl->mapped_tail = kmap(page);
	virt/kvm/kvm_main.c:                    hva = kmap(page);

[3] The following appear to follow the same pattern as ext2 which was converted
after some code audit.  So I _think_ they too could be converted to
k[un]map_thread().

	fs/freevxfs/vxfs_subr.c|75| kmap(pp);
	fs/jfs/jfs_metapage.c|102| kmap(page);
	fs/jfs/jfs_metapage.c|156| kmap(page);
	fs/minix/dir.c|72| kmap(page);
	fs/nilfs2/dir.c|195| kmap(page);
	fs/nilfs2/ifile.h|24| void *kaddr = kmap(ibh->b_page);
	fs/ntfs/aops.h|78| kmap(page);
	fs/ntfs/compress.c|574| kmap(page);
	fs/qnx6/dir.c|32| kmap(page);
	fs/qnx6/dir.c|58| kmap(*p = page);
	fs/qnx6/inode.c|190| kmap(page);
	fs/qnx6/inode.c|557| kmap(page);
	fs/reiserfs/inode.c|2397| kmap(bh_result->b_page);
	fs/reiserfs/xattr.c|444| kmap(page);
	fs/sysv/dir.c|60| kmap(page);
	fs/sysv/dir.c|262| kmap(page);
	fs/ufs/dir.c|194| kmap(page);
	fs/ufs/dir.c|562| kmap(page);


Ira Weiny (58):
  x86/pks: Add a global pkrs option
  x86/pks/test: Add testing for global option
  memremap: Add zone device access protection
  kmap: Add stray access protection for device pages
  kmap: Introduce k[un]map_thread
  kmap: Introduce k[un]map_thread debugging
  drivers/drbd: Utilize new kmap_thread()
  drivers/firmware_loader: Utilize new kmap_thread()
  drivers/gpu: Utilize new kmap_thread()
  drivers/rdma: Utilize new kmap_thread()
  drivers/net: Utilize new kmap_thread()
  fs/afs: Utilize new kmap_thread()
  fs/btrfs: Utilize new kmap_thread()
  fs/cifs: Utilize new kmap_thread()
  fs/ecryptfs: Utilize new kmap_thread()
  fs/gfs2: Utilize new kmap_thread()
  fs/nilfs2: Utilize new kmap_thread()
  fs/hfs: Utilize new kmap_thread()
  fs/hfsplus: Utilize new kmap_thread()
  fs/jffs2: Utilize new kmap_thread()
  fs/nfs: Utilize new kmap_thread()
  fs/f2fs: Utilize new kmap_thread()
  fs/fuse: Utilize new kmap_thread()
  fs/freevxfs: Utilize new kmap_thread()
  fs/reiserfs: Utilize new kmap_thread()
  fs/zonefs: Utilize new kmap_thread()
  fs/ubifs: Utilize new kmap_thread()
  fs/cachefiles: Utilize new kmap_thread()
  fs/ntfs: Utilize new kmap_thread()
  fs/romfs: Utilize new kmap_thread()
  fs/vboxsf: Utilize new kmap_thread()
  fs/hostfs: Utilize new kmap_thread()
  fs/cramfs: Utilize new kmap_thread()
  fs/erofs: Utilize new kmap_thread()
  fs: Utilize new kmap_thread()
  fs/ext2: Use ext2_put_page
  fs/ext2: Utilize new kmap_thread()
  fs/isofs: Utilize new kmap_thread()
  fs/jffs2: Utilize new kmap_thread()
  net: Utilize new kmap_thread()
  drivers/target: Utilize new kmap_thread()
  drivers/scsi: Utilize new kmap_thread()
  drivers/mmc: Utilize new kmap_thread()
  drivers/xen: Utilize new kmap_thread()
  drivers/firmware: Utilize new kmap_thread()
  drives/staging: Utilize new kmap_thread()
  drivers/mtd: Utilize new kmap_thread()
  drivers/md: Utilize new kmap_thread()
  drivers/misc: Utilize new kmap_thread()
  drivers/android: Utilize new kmap_thread()
  kernel: Utilize new kmap_thread()
  mm: Utilize new kmap_thread()
  lib: Utilize new kmap_thread()
  powerpc: Utilize new kmap_thread()
  samples: Utilize new kmap_thread()
  dax: Stray access protection for dax_direct_access()
  nvdimm/pmem: Stray access protection for pmem->virt_addr
  [dax|pmem]: Enable stray access protection

 Documentation/core-api/protection-keys.rst    |  11 +-
 arch/powerpc/mm/mem.c                         |   4 +-
 arch/x86/entry/common.c                       |  28 +++
 arch/x86/include/asm/pkeys.h                  |   6 +-
 arch/x86/include/asm/pkeys_common.h           |   8 +-
 arch/x86/kernel/process.c                     |  74 ++++++-
 arch/x86/mm/fault.c                           | 193 ++++++++++++++----
 arch/x86/mm/pkeys.c                           |  88 ++++++--
 drivers/android/binder_alloc.c                |   4 +-
 drivers/base/firmware_loader/fallback.c       |   4 +-
 drivers/base/firmware_loader/main.c           |   4 +-
 drivers/block/drbd/drbd_main.c                |   4 +-
 drivers/block/drbd/drbd_receiver.c            |  12 +-
 drivers/dax/device.c                          |   2 +
 drivers/dax/super.c                           |   2 +
 drivers/firmware/efi/capsule-loader.c         |   6 +-
 drivers/firmware/efi/capsule.c                |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c       |  12 +-
 drivers/gpu/drm/gma500/gma_display.c          |   4 +-
 drivers/gpu/drm/gma500/mmu.c                  |  10 +-
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |   4 +-
 .../drm/i915/gem/selftests/i915_gem_context.c |   4 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c    |   8 +-
 drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c  |   4 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c           |   4 +-
 drivers/gpu/drm/i915/gt/shmem_utils.c         |   4 +-
 drivers/gpu/drm/i915/i915_gem.c               |   8 +-
 drivers/gpu/drm/i915/i915_gpu_error.c         |   4 +-
 drivers/gpu/drm/i915/selftests/i915_perf.c    |   4 +-
 drivers/gpu/drm/radeon/radeon_ttm.c           |   4 +-
 drivers/infiniband/hw/hfi1/sdma.c             |   4 +-
 drivers/infiniband/hw/i40iw/i40iw_cm.c        |  10 +-
 drivers/infiniband/sw/siw/siw_qp_tx.c         |  14 +-
 drivers/md/bcache/request.c                   |   4 +-
 drivers/misc/vmw_vmci/vmci_queue_pair.c       |  12 +-
 drivers/mmc/host/mmc_spi.c                    |   4 +-
 drivers/mmc/host/sdricoh_cs.c                 |   4 +-
 drivers/mtd/mtd_blkdevs.c                     |  12 +-
 drivers/net/ethernet/intel/igb/igb_ethtool.c  |   4 +-
 .../net/ethernet/intel/ixgbe/ixgbe_ethtool.c  |   4 +-
 drivers/nvdimm/pmem.c                         |   6 +
 drivers/scsi/ipr.c                            |   8 +-
 drivers/scsi/pmcraid.c                        |   8 +-
 drivers/staging/rts5208/rtsx_transport.c      |   4 +-
 drivers/target/target_core_iblock.c           |   4 +-
 drivers/target/target_core_rd.c               |   4 +-
 drivers/target/target_core_transport.c        |   4 +-
 drivers/xen/gntalloc.c                        |   4 +-
 fs/afs/dir.c                                  |  16 +-
 fs/afs/dir_edit.c                             |  16 +-
 fs/afs/mntpt.c                                |   4 +-
 fs/afs/write.c                                |   4 +-
 fs/aio.c                                      |   4 +-
 fs/binfmt_elf.c                               |   4 +-
 fs/binfmt_elf_fdpic.c                         |   4 +-
 fs/btrfs/check-integrity.c                    |   4 +-
 fs/btrfs/compression.c                        |   4 +-
 fs/btrfs/inode.c                              |  16 +-
 fs/btrfs/lzo.c                                |  24 +--
 fs/btrfs/raid56.c                             |  34 +--
 fs/btrfs/reflink.c                            |   8 +-
 fs/btrfs/send.c                               |   4 +-
 fs/btrfs/zlib.c                               |  32 +--
 fs/btrfs/zstd.c                               |  20 +-
 fs/cachefiles/rdwr.c                          |   4 +-
 fs/cifs/cifsencrypt.c                         |   6 +-
 fs/cifs/file.c                                |  16 +-
 fs/cifs/smb2ops.c                             |   8 +-
 fs/cramfs/inode.c                             |  10 +-
 fs/ecryptfs/crypto.c                          |   8 +-
 fs/ecryptfs/read_write.c                      |   8 +-
 fs/erofs/super.c                              |   4 +-
 fs/erofs/xattr.c                              |   4 +-
 fs/exec.c                                     |  10 +-
 fs/ext2/dir.c                                 |   8 +-
 fs/ext2/ext2.h                                |   8 +
 fs/ext2/namei.c                               |  15 +-
 fs/f2fs/f2fs.h                                |   8 +-
 fs/freevxfs/vxfs_immed.c                      |   4 +-
 fs/fuse/readdir.c                             |   4 +-
 fs/gfs2/bmap.c                                |   4 +-
 fs/gfs2/ops_fstype.c                          |   4 +-
 fs/hfs/bnode.c                                |  14 +-
 fs/hfs/btree.c                                |  20 +-
 fs/hfsplus/bitmap.c                           |  20 +-
 fs/hfsplus/bnode.c                            | 102 ++++-----
 fs/hfsplus/btree.c                            |  18 +-
 fs/hostfs/hostfs_kern.c                       |  12 +-
 fs/io_uring.c                                 |   4 +-
 fs/isofs/compress.c                           |   4 +-
 fs/jffs2/file.c                               |   8 +-
 fs/jffs2/gc.c                                 |   4 +-
 fs/nfs/dir.c                                  |  20 +-
 fs/nilfs2/alloc.c                             |  34 +--
 fs/nilfs2/cpfile.c                            |   4 +-
 fs/ntfs/aops.c                                |   4 +-
 fs/reiserfs/journal.c                         |   4 +-
 fs/romfs/super.c                              |   4 +-
 fs/splice.c                                   |   4 +-
 fs/ubifs/file.c                               |  16 +-
 fs/vboxsf/file.c                              |  12 +-
 fs/zonefs/super.c                             |   4 +-
 include/linux/entry-common.h                  |   3 +
 include/linux/highmem.h                       |  63 +++++-
 include/linux/memremap.h                      |   1 +
 include/linux/mm.h                            |  43 ++++
 include/linux/pkeys.h                         |   6 +-
 include/linux/sched.h                         |   8 +
 include/trace/events/kmap_thread.h            |  56 +++++
 init/init_task.c                              |   6 +
 kernel/fork.c                                 |  18 ++
 kernel/kexec_core.c                           |   8 +-
 lib/Kconfig.debug                             |   8 +
 lib/iov_iter.c                                |  12 +-
 lib/pks/pks_test.c                            | 138 +++++++++++--
 lib/test_bpf.c                                |   4 +-
 lib/test_hmm.c                                |   8 +-
 mm/Kconfig                                    |  13 ++
 mm/debug.c                                    |  23 +++
 mm/memory.c                                   |   8 +-
 mm/memremap.c                                 |  90 ++++++++
 mm/swapfile.c                                 |   4 +-
 mm/userfaultfd.c                              |   4 +-
 net/ceph/messenger.c                          |   4 +-
 net/core/datagram.c                           |   4 +-
 net/core/sock.c                               |   8 +-
 net/ipv4/ip_output.c                          |   4 +-
 net/sunrpc/cache.c                            |   4 +-
 net/sunrpc/xdr.c                              |   8 +-
 net/tls/tls_device.c                          |   4 +-
 samples/vfio-mdev/mbochs.c                    |   4 +-
 131 files changed, 1284 insertions(+), 565 deletions(-)
 create mode 100644 include/trace/events/kmap_thread.h

Comments

Daniel Vetter Oct. 9, 2020, 10:03 p.m. UTC | #1
On Fri, Oct 09, 2020 at 12:49:44PM -0700, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> These kmap() calls in the gpu stack are localized to a single thread.
> To avoid the over head of global PKRS updates use the new kmap_thread()
> call.
> 
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

I'm guessing the entire pile goes in through some other tree. If so:

Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>

If you want this to land through maintainer trees, then we need a
per-driver split (since aside from amdgpu and radeon they're all different
subtrees).

btw the two kmap calls in drm you highlight in the cover letter should
also be convertible to kmap_thread. We only hold vmalloc mappings for a
longer time (or it'd be quite a driver bug). So if you want maybe throw
those two as two additional patches on top, and we can do some careful
review & testing for them.
-Daniel

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c              | 12 ++++++------
>  drivers/gpu/drm/gma500/gma_display.c                 |  4 ++--
>  drivers/gpu/drm/gma500/mmu.c                         | 10 +++++-----
>  drivers/gpu/drm/i915/gem/i915_gem_shmem.c            |  4 ++--
>  .../gpu/drm/i915/gem/selftests/i915_gem_context.c    |  4 ++--
>  drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c   |  8 ++++----
>  drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c         |  4 ++--
>  drivers/gpu/drm/i915/gt/intel_gtt.c                  |  4 ++--
>  drivers/gpu/drm/i915/gt/shmem_utils.c                |  4 ++--
>  drivers/gpu/drm/i915/i915_gem.c                      |  8 ++++----
>  drivers/gpu/drm/i915/i915_gpu_error.c                |  4 ++--
>  drivers/gpu/drm/i915/selftests/i915_perf.c           |  4 ++--
>  drivers/gpu/drm/radeon/radeon_ttm.c                  |  4 ++--
>  13 files changed, 37 insertions(+), 37 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index 978bae731398..bd564bccb7a3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -2437,11 +2437,11 @@ static ssize_t amdgpu_ttm_gtt_read(struct file *f, char __user *buf,
>  
>  		page = adev->gart.pages[p];
>  		if (page) {
> -			ptr = kmap(page);
> +			ptr = kmap_thread(page);
>  			ptr += off;
>  
>  			r = copy_to_user(buf, ptr, cur_size);
> -			kunmap(adev->gart.pages[p]);
> +			kunmap_thread(adev->gart.pages[p]);
>  		} else
>  			r = clear_user(buf, cur_size);
>  
> @@ -2507,9 +2507,9 @@ static ssize_t amdgpu_iomem_read(struct file *f, char __user *buf,
>  		if (p->mapping != adev->mman.bdev.dev_mapping)
>  			return -EPERM;
>  
> -		ptr = kmap(p);
> +		ptr = kmap_thread(p);
>  		r = copy_to_user(buf, ptr + off, bytes);
> -		kunmap(p);
> +		kunmap_thread(p);
>  		if (r)
>  			return -EFAULT;
>  
> @@ -2558,9 +2558,9 @@ static ssize_t amdgpu_iomem_write(struct file *f, const char __user *buf,
>  		if (p->mapping != adev->mman.bdev.dev_mapping)
>  			return -EPERM;
>  
> -		ptr = kmap(p);
> +		ptr = kmap_thread(p);
>  		r = copy_from_user(ptr + off, buf, bytes);
> -		kunmap(p);
> +		kunmap_thread(p);
>  		if (r)
>  			return -EFAULT;
>  
> diff --git a/drivers/gpu/drm/gma500/gma_display.c b/drivers/gpu/drm/gma500/gma_display.c
> index 3df6d6e850f5..35f4e55c941f 100644
> --- a/drivers/gpu/drm/gma500/gma_display.c
> +++ b/drivers/gpu/drm/gma500/gma_display.c
> @@ -400,9 +400,9 @@ int gma_crtc_cursor_set(struct drm_crtc *crtc,
>  		/* Copy the cursor to cursor mem */
>  		tmp_dst = dev_priv->vram_addr + cursor_gt->offset;
>  		for (i = 0; i < cursor_pages; i++) {
> -			tmp_src = kmap(gt->pages[i]);
> +			tmp_src = kmap_thread(gt->pages[i]);
>  			memcpy(tmp_dst, tmp_src, PAGE_SIZE);
> -			kunmap(gt->pages[i]);
> +			kunmap_thread(gt->pages[i]);
>  			tmp_dst += PAGE_SIZE;
>  		}
>  
> diff --git a/drivers/gpu/drm/gma500/mmu.c b/drivers/gpu/drm/gma500/mmu.c
> index 505044c9a673..fba7a3a461fd 100644
> --- a/drivers/gpu/drm/gma500/mmu.c
> +++ b/drivers/gpu/drm/gma500/mmu.c
> @@ -192,20 +192,20 @@ struct psb_mmu_pd *psb_mmu_alloc_pd(struct psb_mmu_driver *driver,
>  		pd->invalid_pte = 0;
>  	}
>  
> -	v = kmap(pd->dummy_pt);
> +	v = kmap_thread(pd->dummy_pt);
>  	for (i = 0; i < (PAGE_SIZE / sizeof(uint32_t)); ++i)
>  		v[i] = pd->invalid_pte;
>  
> -	kunmap(pd->dummy_pt);
> +	kunmap_thread(pd->dummy_pt);
>  
> -	v = kmap(pd->p);
> +	v = kmap_thread(pd->p);
>  	for (i = 0; i < (PAGE_SIZE / sizeof(uint32_t)); ++i)
>  		v[i] = pd->invalid_pde;
>  
> -	kunmap(pd->p);
> +	kunmap_thread(pd->p);
>  
>  	clear_page(kmap(pd->dummy_page));
> -	kunmap(pd->dummy_page);
> +	kunmap_thread(pd->dummy_page);
>  
>  	pd->tables = vmalloc_user(sizeof(struct psb_mmu_pt *) * 1024);
>  	if (!pd->tables)
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> index 38113d3c0138..274424795fb7 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> @@ -566,9 +566,9 @@ i915_gem_object_create_shmem_from_data(struct drm_i915_private *dev_priv,
>  		if (err < 0)
>  			goto fail;
>  
> -		vaddr = kmap(page);
> +		vaddr = kmap_thread(page);
>  		memcpy(vaddr, data, len);
> -		kunmap(page);
> +		kunmap_thread(page);
>  
>  		err = pagecache_write_end(file, file->f_mapping,
>  					  offset, len, len,
> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
> index 7ffc3c751432..b466c677d007 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
> @@ -1754,7 +1754,7 @@ static int check_scratch_page(struct i915_gem_context *ctx, u32 *out)
>  		return -EINVAL;
>  	}
>  
> -	vaddr = kmap(page);
> +	vaddr = kmap_thread(page);
>  	if (!vaddr) {
>  		pr_err("No (mappable) scratch page!\n");
>  		return -EINVAL;
> @@ -1765,7 +1765,7 @@ static int check_scratch_page(struct i915_gem_context *ctx, u32 *out)
>  		pr_err("Inconsistent initial state of scratch page!\n");
>  		err = -EINVAL;
>  	}
> -	kunmap(page);
> +	kunmap_thread(page);
>  
>  	return err;
>  }
> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> index 9c7402ce5bf9..447df22e2e06 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> @@ -143,7 +143,7 @@ static int check_partial_mapping(struct drm_i915_gem_object *obj,
>  	intel_gt_flush_ggtt_writes(&to_i915(obj->base.dev)->gt);
>  
>  	p = i915_gem_object_get_page(obj, offset >> PAGE_SHIFT);
> -	cpu = kmap(p) + offset_in_page(offset);
> +	cpu = kmap_thread(p) + offset_in_page(offset);
>  	drm_clflush_virt_range(cpu, sizeof(*cpu));
>  	if (*cpu != (u32)page) {
>  		pr_err("Partial view for %lu [%u] (offset=%llu, size=%u [%llu, row size %u], fence=%d, tiling=%d, stride=%d) misalignment, expected write to page (%llu + %u [0x%llx]) of 0x%x, found 0x%x\n",
> @@ -161,7 +161,7 @@ static int check_partial_mapping(struct drm_i915_gem_object *obj,
>  	}
>  	*cpu = 0;
>  	drm_clflush_virt_range(cpu, sizeof(*cpu));
> -	kunmap(p);
> +	kunmap_thread(p);
>  
>  out:
>  	__i915_vma_put(vma);
> @@ -236,7 +236,7 @@ static int check_partial_mappings(struct drm_i915_gem_object *obj,
>  		intel_gt_flush_ggtt_writes(&to_i915(obj->base.dev)->gt);
>  
>  		p = i915_gem_object_get_page(obj, offset >> PAGE_SHIFT);
> -		cpu = kmap(p) + offset_in_page(offset);
> +		cpu = kmap_thread(p) + offset_in_page(offset);
>  		drm_clflush_virt_range(cpu, sizeof(*cpu));
>  		if (*cpu != (u32)page) {
>  			pr_err("Partial view for %lu [%u] (offset=%llu, size=%u [%llu, row size %u], fence=%d, tiling=%d, stride=%d) misalignment, expected write to page (%llu + %u [0x%llx]) of 0x%x, found 0x%x\n",
> @@ -254,7 +254,7 @@ static int check_partial_mappings(struct drm_i915_gem_object *obj,
>  		}
>  		*cpu = 0;
>  		drm_clflush_virt_range(cpu, sizeof(*cpu));
> -		kunmap(p);
> +		kunmap_thread(p);
>  		if (err)
>  			return err;
>  
> diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
> index 7fb36b12fe7a..38da348282f1 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
> @@ -731,7 +731,7 @@ static void swizzle_page(struct page *page)
>  	char *vaddr;
>  	int i;
>  
> -	vaddr = kmap(page);
> +	vaddr = kmap_thread(page);
>  
>  	for (i = 0; i < PAGE_SIZE; i += 128) {
>  		memcpy(temp, &vaddr[i], 64);
> @@ -739,7 +739,7 @@ static void swizzle_page(struct page *page)
>  		memcpy(&vaddr[i + 64], temp, 64);
>  	}
>  
> -	kunmap(page);
> +	kunmap_thread(page);
>  }
>  
>  /**
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
> index 2a72cce63fd9..4cfb24e9ed62 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> @@ -312,9 +312,9 @@ static void poison_scratch_page(struct page *page, unsigned long size)
>  	do {
>  		void *vaddr;
>  
> -		vaddr = kmap(page);
> +		vaddr = kmap_thread(page);
>  		memset(vaddr, POISON_FREE, PAGE_SIZE);
> -		kunmap(page);
> +		kunmap_thread(page);
>  
>  		page = pfn_to_page(page_to_pfn(page) + 1);
>  		size -= PAGE_SIZE;
> diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.c b/drivers/gpu/drm/i915/gt/shmem_utils.c
> index 43c7acbdc79d..a40d3130cebf 100644
> --- a/drivers/gpu/drm/i915/gt/shmem_utils.c
> +++ b/drivers/gpu/drm/i915/gt/shmem_utils.c
> @@ -142,12 +142,12 @@ static int __shmem_rw(struct file *file, loff_t off,
>  		if (IS_ERR(page))
>  			return PTR_ERR(page);
>  
> -		vaddr = kmap(page);
> +		vaddr = kmap_thread(page);
>  		if (write)
>  			memcpy(vaddr + offset_in_page(off), ptr, this);
>  		else
>  			memcpy(ptr, vaddr + offset_in_page(off), this);
> -		kunmap(page);
> +		kunmap_thread(page);
>  		put_page(page);
>  
>  		len -= this;
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 9aa3066cb75d..cae8300fd224 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -312,14 +312,14 @@ shmem_pread(struct page *page, int offset, int len, char __user *user_data,
>  	char *vaddr;
>  	int ret;
>  
> -	vaddr = kmap(page);
> +	vaddr = kmap_thread(page);
>  
>  	if (needs_clflush)
>  		drm_clflush_virt_range(vaddr + offset, len);
>  
>  	ret = __copy_to_user(user_data, vaddr + offset, len);
>  
> -	kunmap(page);
> +	kunmap_thread(page);
>  
>  	return ret ? -EFAULT : 0;
>  }
> @@ -708,7 +708,7 @@ shmem_pwrite(struct page *page, int offset, int len, char __user *user_data,
>  	char *vaddr;
>  	int ret;
>  
> -	vaddr = kmap(page);
> +	vaddr = kmap_thread(page);
>  
>  	if (needs_clflush_before)
>  		drm_clflush_virt_range(vaddr + offset, len);
> @@ -717,7 +717,7 @@ shmem_pwrite(struct page *page, int offset, int len, char __user *user_data,
>  	if (!ret && needs_clflush_after)
>  		drm_clflush_virt_range(vaddr + offset, len);
>  
> -	kunmap(page);
> +	kunmap_thread(page);
>  
>  	return ret ? -EFAULT : 0;
>  }
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 3e6cbb0d1150..aecd469b6b6e 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -1058,9 +1058,9 @@ i915_vma_coredump_create(const struct intel_gt *gt,
>  
>  			drm_clflush_pages(&page, 1);
>  
> -			s = kmap(page);
> +			s = kmap_thread(page);
>  			ret = compress_page(compress, s, dst, false);
> -			kunmap(page);
> +			kunmap_thread(page);
>  
>  			drm_clflush_pages(&page, 1);
>  
> diff --git a/drivers/gpu/drm/i915/selftests/i915_perf.c b/drivers/gpu/drm/i915/selftests/i915_perf.c
> index c2d001d9c0ec..7f7ef2d056f4 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_perf.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_perf.c
> @@ -307,7 +307,7 @@ static int live_noa_gpr(void *arg)
>  	}
>  
>  	/* Poison the ce->vm so we detect writes not to the GGTT gt->scratch */
> -	scratch = kmap(ce->vm->scratch[0].base.page);
> +	scratch = kmap_thread(ce->vm->scratch[0].base.page);
>  	memset(scratch, POISON_FREE, PAGE_SIZE);
>  
>  	rq = intel_context_create_request(ce);
> @@ -405,7 +405,7 @@ static int live_noa_gpr(void *arg)
>  out_rq:
>  	i915_request_put(rq);
>  out_ce:
> -	kunmap(ce->vm->scratch[0].base.page);
> +	kunmap_thread(ce->vm->scratch[0].base.page);
>  	intel_context_put(ce);
>  out:
>  	stream_destroy(stream);
> diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
> index 004344dce140..0aba0cac51e1 100644
> --- a/drivers/gpu/drm/radeon/radeon_ttm.c
> +++ b/drivers/gpu/drm/radeon/radeon_ttm.c
> @@ -1013,11 +1013,11 @@ static ssize_t radeon_ttm_gtt_read(struct file *f, char __user *buf,
>  
>  		page = rdev->gart.pages[p];
>  		if (page) {
> -			ptr = kmap(page);
> +			ptr = kmap_thread(page);
>  			ptr += off;
>  
>  			r = copy_to_user(buf, ptr, cur_size);
> -			kunmap(rdev->gart.pages[p]);
> +			kunmap_thread(rdev->gart.pages[p]);
>  		} else
>  			r = clear_user(buf, cur_size);
>  
> -- 
> 2.28.0.rc0.12.gb6a658bd00c9
>
Eric W. Biederman Oct. 10, 2020, 3:43 a.m. UTC | #2
ira.weiny@intel.com writes:

> From: Ira Weiny <ira.weiny@intel.com>
>
> This kmap() call is localized to a single thread.  To avoid the over
> head of global PKRS updates use the new kmap_thread() call.

Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>

>
> Cc: Eric Biederman <ebiederm@xmission.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> ---
>  kernel/kexec_core.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index c19c0dad1ebe..272a9920c0d6 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -815,7 +815,7 @@ static int kimage_load_normal_segment(struct kimage *image,
>  		if (result < 0)
>  			goto out;
>  
> -		ptr = kmap(page);
> +		ptr = kmap_thread(page);
>  		/* Start with a clear page */
>  		clear_page(ptr);
>  		ptr += maddr & ~PAGE_MASK;
> @@ -828,7 +828,7 @@ static int kimage_load_normal_segment(struct kimage *image,
>  			memcpy(ptr, kbuf, uchunk);
>  		else
>  			result = copy_from_user(ptr, buf, uchunk);
> -		kunmap(page);
> +		kunmap_thread(page);
>  		if (result) {
>  			result = -EFAULT;
>  			goto out;
> @@ -879,7 +879,7 @@ static int kimage_load_crash_segment(struct kimage *image,
>  			goto out;
>  		}
>  		arch_kexec_post_alloc_pages(page_address(page), 1, 0);
> -		ptr = kmap(page);
> +		ptr = kmap_thread(page);
>  		ptr += maddr & ~PAGE_MASK;
>  		mchunk = min_t(size_t, mbytes,
>  				PAGE_SIZE - (maddr & ~PAGE_MASK));
> @@ -895,7 +895,7 @@ static int kimage_load_crash_segment(struct kimage *image,
>  		else
>  			result = copy_from_user(ptr, buf, uchunk);
>  		kexec_flush_icache_page(page);
> -		kunmap(page);
> +		kunmap_thread(page);
>  		arch_kexec_pre_free_pages(page_address(page), 1);
>  		if (result) {
>  			result = -EFAULT;
Ira Weiny Oct. 10, 2020, 11:01 p.m. UTC | #3
On Sat, Oct 10, 2020 at 12:03:49AM +0200, Daniel Vetter wrote:
> On Fri, Oct 09, 2020 at 12:49:44PM -0700, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > These kmap() calls in the gpu stack are localized to a single thread.
> > To avoid the over head of global PKRS updates use the new kmap_thread()
> > call.
> > 
> > Cc: David Airlie <airlied@linux.ie>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> I'm guessing the entire pile goes in through some other tree.
>

Apologies for not realizing there were multiple maintainers here.

But, I was thinking it would land together through the mm tree once the core
support lands.  I've tried to split these out in a way they can be easily
reviewed/acked by the correct developers.

> If so:
> 
> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> 
> If you want this to land through maintainer trees, then we need a
> per-driver split (since aside from amdgpu and radeon they're all different
> subtrees).

It is just RFC for the moment.  I need to get the core support accepted first
then this can land.

> 
> btw the two kmap calls in drm you highlight in the cover letter should
> also be convertible to kmap_thread. We only hold vmalloc mappings for a
> longer time (or it'd be quite a driver bug). So if you want maybe throw
> those two as two additional patches on top, and we can do some careful
> review & testing for them.

Cool.  I'll add them in.

Ira

> -Daniel
> 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c              | 12 ++++++------
> >  drivers/gpu/drm/gma500/gma_display.c                 |  4 ++--
> >  drivers/gpu/drm/gma500/mmu.c                         | 10 +++++-----
> >  drivers/gpu/drm/i915/gem/i915_gem_shmem.c            |  4 ++--
> >  .../gpu/drm/i915/gem/selftests/i915_gem_context.c    |  4 ++--
> >  drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c   |  8 ++++----
> >  drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c         |  4 ++--
> >  drivers/gpu/drm/i915/gt/intel_gtt.c                  |  4 ++--
> >  drivers/gpu/drm/i915/gt/shmem_utils.c                |  4 ++--
> >  drivers/gpu/drm/i915/i915_gem.c                      |  8 ++++----
> >  drivers/gpu/drm/i915/i915_gpu_error.c                |  4 ++--
> >  drivers/gpu/drm/i915/selftests/i915_perf.c           |  4 ++--
> >  drivers/gpu/drm/radeon/radeon_ttm.c                  |  4 ++--
> >  13 files changed, 37 insertions(+), 37 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > index 978bae731398..bd564bccb7a3 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > @@ -2437,11 +2437,11 @@ static ssize_t amdgpu_ttm_gtt_read(struct file *f, char __user *buf,
> >  
> >  		page = adev->gart.pages[p];
> >  		if (page) {
> > -			ptr = kmap(page);
> > +			ptr = kmap_thread(page);
> >  			ptr += off;
> >  
> >  			r = copy_to_user(buf, ptr, cur_size);
> > -			kunmap(adev->gart.pages[p]);
> > +			kunmap_thread(adev->gart.pages[p]);
> >  		} else
> >  			r = clear_user(buf, cur_size);
> >  
> > @@ -2507,9 +2507,9 @@ static ssize_t amdgpu_iomem_read(struct file *f, char __user *buf,
> >  		if (p->mapping != adev->mman.bdev.dev_mapping)
> >  			return -EPERM;
> >  
> > -		ptr = kmap(p);
> > +		ptr = kmap_thread(p);
> >  		r = copy_to_user(buf, ptr + off, bytes);
> > -		kunmap(p);
> > +		kunmap_thread(p);
> >  		if (r)
> >  			return -EFAULT;
> >  
> > @@ -2558,9 +2558,9 @@ static ssize_t amdgpu_iomem_write(struct file *f, const char __user *buf,
> >  		if (p->mapping != adev->mman.bdev.dev_mapping)
> >  			return -EPERM;
> >  
> > -		ptr = kmap(p);
> > +		ptr = kmap_thread(p);
> >  		r = copy_from_user(ptr + off, buf, bytes);
> > -		kunmap(p);
> > +		kunmap_thread(p);
> >  		if (r)
> >  			return -EFAULT;
> >  
> > diff --git a/drivers/gpu/drm/gma500/gma_display.c b/drivers/gpu/drm/gma500/gma_display.c
> > index 3df6d6e850f5..35f4e55c941f 100644
> > --- a/drivers/gpu/drm/gma500/gma_display.c
> > +++ b/drivers/gpu/drm/gma500/gma_display.c
> > @@ -400,9 +400,9 @@ int gma_crtc_cursor_set(struct drm_crtc *crtc,
> >  		/* Copy the cursor to cursor mem */
> >  		tmp_dst = dev_priv->vram_addr + cursor_gt->offset;
> >  		for (i = 0; i < cursor_pages; i++) {
> > -			tmp_src = kmap(gt->pages[i]);
> > +			tmp_src = kmap_thread(gt->pages[i]);
> >  			memcpy(tmp_dst, tmp_src, PAGE_SIZE);
> > -			kunmap(gt->pages[i]);
> > +			kunmap_thread(gt->pages[i]);
> >  			tmp_dst += PAGE_SIZE;
> >  		}
> >  
> > diff --git a/drivers/gpu/drm/gma500/mmu.c b/drivers/gpu/drm/gma500/mmu.c
> > index 505044c9a673..fba7a3a461fd 100644
> > --- a/drivers/gpu/drm/gma500/mmu.c
> > +++ b/drivers/gpu/drm/gma500/mmu.c
> > @@ -192,20 +192,20 @@ struct psb_mmu_pd *psb_mmu_alloc_pd(struct psb_mmu_driver *driver,
> >  		pd->invalid_pte = 0;
> >  	}
> >  
> > -	v = kmap(pd->dummy_pt);
> > +	v = kmap_thread(pd->dummy_pt);
> >  	for (i = 0; i < (PAGE_SIZE / sizeof(uint32_t)); ++i)
> >  		v[i] = pd->invalid_pte;
> >  
> > -	kunmap(pd->dummy_pt);
> > +	kunmap_thread(pd->dummy_pt);
> >  
> > -	v = kmap(pd->p);
> > +	v = kmap_thread(pd->p);
> >  	for (i = 0; i < (PAGE_SIZE / sizeof(uint32_t)); ++i)
> >  		v[i] = pd->invalid_pde;
> >  
> > -	kunmap(pd->p);
> > +	kunmap_thread(pd->p);
> >  
> >  	clear_page(kmap(pd->dummy_page));
> > -	kunmap(pd->dummy_page);
> > +	kunmap_thread(pd->dummy_page);
> >  
> >  	pd->tables = vmalloc_user(sizeof(struct psb_mmu_pt *) * 1024);
> >  	if (!pd->tables)
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > index 38113d3c0138..274424795fb7 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > @@ -566,9 +566,9 @@ i915_gem_object_create_shmem_from_data(struct drm_i915_private *dev_priv,
> >  		if (err < 0)
> >  			goto fail;
> >  
> > -		vaddr = kmap(page);
> > +		vaddr = kmap_thread(page);
> >  		memcpy(vaddr, data, len);
> > -		kunmap(page);
> > +		kunmap_thread(page);
> >  
> >  		err = pagecache_write_end(file, file->f_mapping,
> >  					  offset, len, len,
> > diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
> > index 7ffc3c751432..b466c677d007 100644
> > --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
> > @@ -1754,7 +1754,7 @@ static int check_scratch_page(struct i915_gem_context *ctx, u32 *out)
> >  		return -EINVAL;
> >  	}
> >  
> > -	vaddr = kmap(page);
> > +	vaddr = kmap_thread(page);
> >  	if (!vaddr) {
> >  		pr_err("No (mappable) scratch page!\n");
> >  		return -EINVAL;
> > @@ -1765,7 +1765,7 @@ static int check_scratch_page(struct i915_gem_context *ctx, u32 *out)
> >  		pr_err("Inconsistent initial state of scratch page!\n");
> >  		err = -EINVAL;
> >  	}
> > -	kunmap(page);
> > +	kunmap_thread(page);
> >  
> >  	return err;
> >  }
> > diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> > index 9c7402ce5bf9..447df22e2e06 100644
> > --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> > +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> > @@ -143,7 +143,7 @@ static int check_partial_mapping(struct drm_i915_gem_object *obj,
> >  	intel_gt_flush_ggtt_writes(&to_i915(obj->base.dev)->gt);
> >  
> >  	p = i915_gem_object_get_page(obj, offset >> PAGE_SHIFT);
> > -	cpu = kmap(p) + offset_in_page(offset);
> > +	cpu = kmap_thread(p) + offset_in_page(offset);
> >  	drm_clflush_virt_range(cpu, sizeof(*cpu));
> >  	if (*cpu != (u32)page) {
> >  		pr_err("Partial view for %lu [%u] (offset=%llu, size=%u [%llu, row size %u], fence=%d, tiling=%d, stride=%d) misalignment, expected write to page (%llu + %u [0x%llx]) of 0x%x, found 0x%x\n",
> > @@ -161,7 +161,7 @@ static int check_partial_mapping(struct drm_i915_gem_object *obj,
> >  	}
> >  	*cpu = 0;
> >  	drm_clflush_virt_range(cpu, sizeof(*cpu));
> > -	kunmap(p);
> > +	kunmap_thread(p);
> >  
> >  out:
> >  	__i915_vma_put(vma);
> > @@ -236,7 +236,7 @@ static int check_partial_mappings(struct drm_i915_gem_object *obj,
> >  		intel_gt_flush_ggtt_writes(&to_i915(obj->base.dev)->gt);
> >  
> >  		p = i915_gem_object_get_page(obj, offset >> PAGE_SHIFT);
> > -		cpu = kmap(p) + offset_in_page(offset);
> > +		cpu = kmap_thread(p) + offset_in_page(offset);
> >  		drm_clflush_virt_range(cpu, sizeof(*cpu));
> >  		if (*cpu != (u32)page) {
> >  			pr_err("Partial view for %lu [%u] (offset=%llu, size=%u [%llu, row size %u], fence=%d, tiling=%d, stride=%d) misalignment, expected write to page (%llu + %u [0x%llx]) of 0x%x, found 0x%x\n",
> > @@ -254,7 +254,7 @@ static int check_partial_mappings(struct drm_i915_gem_object *obj,
> >  		}
> >  		*cpu = 0;
> >  		drm_clflush_virt_range(cpu, sizeof(*cpu));
> > -		kunmap(p);
> > +		kunmap_thread(p);
> >  		if (err)
> >  			return err;
> >  
> > diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
> > index 7fb36b12fe7a..38da348282f1 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
> > @@ -731,7 +731,7 @@ static void swizzle_page(struct page *page)
> >  	char *vaddr;
> >  	int i;
> >  
> > -	vaddr = kmap(page);
> > +	vaddr = kmap_thread(page);
> >  
> >  	for (i = 0; i < PAGE_SIZE; i += 128) {
> >  		memcpy(temp, &vaddr[i], 64);
> > @@ -739,7 +739,7 @@ static void swizzle_page(struct page *page)
> >  		memcpy(&vaddr[i + 64], temp, 64);
> >  	}
> >  
> > -	kunmap(page);
> > +	kunmap_thread(page);
> >  }
> >  
> >  /**
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
> > index 2a72cce63fd9..4cfb24e9ed62 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> > @@ -312,9 +312,9 @@ static void poison_scratch_page(struct page *page, unsigned long size)
> >  	do {
> >  		void *vaddr;
> >  
> > -		vaddr = kmap(page);
> > +		vaddr = kmap_thread(page);
> >  		memset(vaddr, POISON_FREE, PAGE_SIZE);
> > -		kunmap(page);
> > +		kunmap_thread(page);
> >  
> >  		page = pfn_to_page(page_to_pfn(page) + 1);
> >  		size -= PAGE_SIZE;
> > diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.c b/drivers/gpu/drm/i915/gt/shmem_utils.c
> > index 43c7acbdc79d..a40d3130cebf 100644
> > --- a/drivers/gpu/drm/i915/gt/shmem_utils.c
> > +++ b/drivers/gpu/drm/i915/gt/shmem_utils.c
> > @@ -142,12 +142,12 @@ static int __shmem_rw(struct file *file, loff_t off,
> >  		if (IS_ERR(page))
> >  			return PTR_ERR(page);
> >  
> > -		vaddr = kmap(page);
> > +		vaddr = kmap_thread(page);
> >  		if (write)
> >  			memcpy(vaddr + offset_in_page(off), ptr, this);
> >  		else
> >  			memcpy(ptr, vaddr + offset_in_page(off), this);
> > -		kunmap(page);
> > +		kunmap_thread(page);
> >  		put_page(page);
> >  
> >  		len -= this;
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 9aa3066cb75d..cae8300fd224 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -312,14 +312,14 @@ shmem_pread(struct page *page, int offset, int len, char __user *user_data,
> >  	char *vaddr;
> >  	int ret;
> >  
> > -	vaddr = kmap(page);
> > +	vaddr = kmap_thread(page);
> >  
> >  	if (needs_clflush)
> >  		drm_clflush_virt_range(vaddr + offset, len);
> >  
> >  	ret = __copy_to_user(user_data, vaddr + offset, len);
> >  
> > -	kunmap(page);
> > +	kunmap_thread(page);
> >  
> >  	return ret ? -EFAULT : 0;
> >  }
> > @@ -708,7 +708,7 @@ shmem_pwrite(struct page *page, int offset, int len, char __user *user_data,
> >  	char *vaddr;
> >  	int ret;
> >  
> > -	vaddr = kmap(page);
> > +	vaddr = kmap_thread(page);
> >  
> >  	if (needs_clflush_before)
> >  		drm_clflush_virt_range(vaddr + offset, len);
> > @@ -717,7 +717,7 @@ shmem_pwrite(struct page *page, int offset, int len, char __user *user_data,
> >  	if (!ret && needs_clflush_after)
> >  		drm_clflush_virt_range(vaddr + offset, len);
> >  
> > -	kunmap(page);
> > +	kunmap_thread(page);
> >  
> >  	return ret ? -EFAULT : 0;
> >  }
> > diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> > index 3e6cbb0d1150..aecd469b6b6e 100644
> > --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> > +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> > @@ -1058,9 +1058,9 @@ i915_vma_coredump_create(const struct intel_gt *gt,
> >  
> >  			drm_clflush_pages(&page, 1);
> >  
> > -			s = kmap(page);
> > +			s = kmap_thread(page);
> >  			ret = compress_page(compress, s, dst, false);
> > -			kunmap(page);
> > +			kunmap_thread(page);
> >  
> >  			drm_clflush_pages(&page, 1);
> >  
> > diff --git a/drivers/gpu/drm/i915/selftests/i915_perf.c b/drivers/gpu/drm/i915/selftests/i915_perf.c
> > index c2d001d9c0ec..7f7ef2d056f4 100644
> > --- a/drivers/gpu/drm/i915/selftests/i915_perf.c
> > +++ b/drivers/gpu/drm/i915/selftests/i915_perf.c
> > @@ -307,7 +307,7 @@ static int live_noa_gpr(void *arg)
> >  	}
> >  
> >  	/* Poison the ce->vm so we detect writes not to the GGTT gt->scratch */
> > -	scratch = kmap(ce->vm->scratch[0].base.page);
> > +	scratch = kmap_thread(ce->vm->scratch[0].base.page);
> >  	memset(scratch, POISON_FREE, PAGE_SIZE);
> >  
> >  	rq = intel_context_create_request(ce);
> > @@ -405,7 +405,7 @@ static int live_noa_gpr(void *arg)
> >  out_rq:
> >  	i915_request_put(rq);
> >  out_ce:
> > -	kunmap(ce->vm->scratch[0].base.page);
> > +	kunmap_thread(ce->vm->scratch[0].base.page);
> >  	intel_context_put(ce);
> >  out:
> >  	stream_destroy(stream);
> > diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
> > index 004344dce140..0aba0cac51e1 100644
> > --- a/drivers/gpu/drm/radeon/radeon_ttm.c
> > +++ b/drivers/gpu/drm/radeon/radeon_ttm.c
> > @@ -1013,11 +1013,11 @@ static ssize_t radeon_ttm_gtt_read(struct file *f, char __user *buf,
> >  
> >  		page = rdev->gart.pages[p];
> >  		if (page) {
> > -			ptr = kmap(page);
> > +			ptr = kmap_thread(page);
> >  			ptr += off;
> >  
> >  			r = copy_to_user(buf, ptr, cur_size);
> > -			kunmap(rdev->gart.pages[p]);
> > +			kunmap_thread(rdev->gart.pages[p]);
> >  		} else
> >  			r = clear_user(buf, cur_size);
> >  
> > -- 
> > 2.28.0.rc0.12.gb6a658bd00c9
> > 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
Damien Le Moal Oct. 12, 2020, 2:30 a.m. UTC | #4
On 2020/10/10 4:52, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> The kmap() calls in this FS are localized to a single thread.  To avoid
> the over head of global PKRS updates use the new kmap_thread() call.
> 
> Cc: Damien Le Moal <damien.lemoal@wdc.com>
> Cc: Naohiro Aota <naohiro.aota@wdc.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> ---
>  fs/zonefs/super.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
> index 8ec7c8f109d7..2fd6c86beee1 100644
> --- a/fs/zonefs/super.c
> +++ b/fs/zonefs/super.c
> @@ -1297,7 +1297,7 @@ static int zonefs_read_super(struct super_block *sb)
>  	if (ret)
>  		goto free_page;
>  
> -	super = kmap(page);
> +	super = kmap_thread(page);
>  
>  	ret = -EINVAL;
>  	if (le32_to_cpu(super->s_magic) != ZONEFS_MAGIC)
> @@ -1349,7 +1349,7 @@ static int zonefs_read_super(struct super_block *sb)
>  	ret = 0;
>  
>  unmap:
> -	kunmap(page);
> +	kunmap_thread(page);
>  free_page:
>  	__free_page(page);
>  
> 

acked-by: Damien Le Moal <damien.lemoal@wdc.com>
Dan Williams Oct. 13, 2020, 6:44 p.m. UTC | #5
On Fri, Oct 9, 2020 at 12:52 PM <ira.weiny@intel.com> wrote:
>
> From: Ira Weiny <ira.weiny@intel.com>
>
> The kmap() calls in this FS are localized to a single thread.  To avoid
> the over head of global PKRS updates use the new kmap_thread() call.
>
> Cc: Nicolas Pitre <nico@fluxnic.net>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> ---
>  fs/cramfs/inode.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
> index 912308600d39..003c014a42ed 100644
> --- a/fs/cramfs/inode.c
> +++ b/fs/cramfs/inode.c
> @@ -247,8 +247,8 @@ static void *cramfs_blkdev_read(struct super_block *sb, unsigned int offset,
>                 struct page *page = pages[i];
>
>                 if (page) {
> -                       memcpy(data, kmap(page), PAGE_SIZE);
> -                       kunmap(page);
> +                       memcpy(data, kmap_thread(page), PAGE_SIZE);
> +                       kunmap_thread(page);

Why does this need a sleepable kmap? This looks like a textbook
kmap_atomic() use case.
Dan Williams Oct. 13, 2020, 7:41 p.m. UTC | #6
On Tue, Oct 13, 2020 at 12:37 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, Oct 13, 2020 at 11:44:29AM -0700, Dan Williams wrote:
> > On Fri, Oct 9, 2020 at 12:52 PM <ira.weiny@intel.com> wrote:
> > >
> > > From: Ira Weiny <ira.weiny@intel.com>
> > >
> > > The kmap() calls in this FS are localized to a single thread.  To avoid
> > > the over head of global PKRS updates use the new kmap_thread() call.
> > >
> > > Cc: Nicolas Pitre <nico@fluxnic.net>
> > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > ---
> > >  fs/cramfs/inode.c | 10 +++++-----
> > >  1 file changed, 5 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
> > > index 912308600d39..003c014a42ed 100644
> > > --- a/fs/cramfs/inode.c
> > > +++ b/fs/cramfs/inode.c
> > > @@ -247,8 +247,8 @@ static void *cramfs_blkdev_read(struct super_block *sb, unsigned int offset,
> > >                 struct page *page = pages[i];
> > >
> > >                 if (page) {
> > > -                       memcpy(data, kmap(page), PAGE_SIZE);
> > > -                       kunmap(page);
> > > +                       memcpy(data, kmap_thread(page), PAGE_SIZE);
> > > +                       kunmap_thread(page);
> >
> > Why does this need a sleepable kmap? This looks like a textbook
> > kmap_atomic() use case.
>
> There's a lot of code of this form.  Could we perhaps have:
>
> static inline void copy_to_highpage(struct page *to, void *vfrom, unsigned int size)
> {
>         char *vto = kmap_atomic(to);
>
>         memcpy(vto, vfrom, size);
>         kunmap_atomic(vto);
> }
>
> in linux/highmem.h ?

Nice, yes, that could also replace the local ones in lib/iov_iter.c
(memcpy_{to,from}_page())
Ira Weiny Oct. 13, 2020, 8:45 p.m. UTC | #7
On Tue, Oct 13, 2020 at 08:36:43PM +0100, Matthew Wilcox wrote:
> On Tue, Oct 13, 2020 at 11:44:29AM -0700, Dan Williams wrote:
> > On Fri, Oct 9, 2020 at 12:52 PM <ira.weiny@intel.com> wrote:
> > >
> > > From: Ira Weiny <ira.weiny@intel.com>
> > >
> > > The kmap() calls in this FS are localized to a single thread.  To avoid
> > > the over head of global PKRS updates use the new kmap_thread() call.
> > >
> > > Cc: Nicolas Pitre <nico@fluxnic.net>
> > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > ---
> > >  fs/cramfs/inode.c | 10 +++++-----
> > >  1 file changed, 5 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
> > > index 912308600d39..003c014a42ed 100644
> > > --- a/fs/cramfs/inode.c
> > > +++ b/fs/cramfs/inode.c
> > > @@ -247,8 +247,8 @@ static void *cramfs_blkdev_read(struct super_block *sb, unsigned int offset,
> > >                 struct page *page = pages[i];
> > >
> > >                 if (page) {
> > > -                       memcpy(data, kmap(page), PAGE_SIZE);
> > > -                       kunmap(page);
> > > +                       memcpy(data, kmap_thread(page), PAGE_SIZE);
> > > +                       kunmap_thread(page);
> > 
> > Why does this need a sleepable kmap? This looks like a textbook
> > kmap_atomic() use case.
> 
> There's a lot of code of this form.  Could we perhaps have:
> 
> static inline void copy_to_highpage(struct page *to, void *vfrom, unsigned int size)
> {
> 	char *vto = kmap_atomic(to);
> 
> 	memcpy(vto, vfrom, size);
> 	kunmap_atomic(vto);
> }
> 
> in linux/highmem.h ?

Christoph had the same idea.  I'll work on it.

Ira
Ira Weiny Oct. 13, 2020, 8:52 p.m. UTC | #8
On Tue, Oct 13, 2020 at 12:25:44PM +0100, Christoph Hellwig wrote:
> > -	kaddr = kmap(pp);
> > +	kaddr = kmap_thread(pp);
> >  	memcpy(kaddr, vip->vii_immed.vi_immed + offset, PAGE_SIZE);
> > -	kunmap(pp);
> > +	kunmap_thread(pp);
> 
> You only Cced me on this particular patch, which means I have absolutely
> no idea what kmap_thread and kunmap_thread actually do, and thus can't
> provide an informed review.

Sorry the list was so big I struggled with who to CC and on which patches.

> 
> That being said I think your life would be a lot easier if you add
> helpers for the above code sequence and its counterpart that copies
> to a potential hughmem page first, as that hides the implementation
> details from most users.

Matthew Wilcox and Al Viro have suggested similar ideas.

https://lore.kernel.org/lkml/20201013205012.GI2046448@iweiny-DESK2.sc.intel.com/

Ira