From patchwork Fri Oct 6 02:45:30 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nicolas Pitre
X-Patchwork-Id: 115022
From: Nicolas Pitre
To: Alexander Viro, Christoph Hellwig
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
 linux-embedded@vger.kernel.org, linux-kernel@vger.kernel.org,
 Chris Brandt
Subject: [PATCH v5 4/5] cramfs: add mmap support
Date: Thu, 5 Oct 2017 22:45:30 -0400
Message-Id: <20171006024531.8885-5-nicolas.pitre@linaro.org>
X-Mailer: git-send-email 2.9.5
In-Reply-To: <20171006024531.8885-1-nicolas.pitre@linaro.org>
References: <20171006024531.8885-1-nicolas.pitre@linaro.org>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

When cramfs_physmem is used, we have the opportunity to map files
directly from ROM into user space, saving on RAM usage. This gives us
Execute-In-Place (XIP) support.

For a file to be mmap()-able, the map area has to correspond to a range
of uncompressed and contiguous blocks, and in the MMU case it also has
to be page aligned. A version of mkcramfs with appropriate support is
necessary to create such a filesystem image.

In the MMU case, a vma structure may extend beyond the actual file
size. This is notably the case in binfmt_elf.c:elf_map(). The file's
last block may also be shared with other files, in which case it cannot
be mapped as is. Rather than refusing to mmap it, we do a "mixed" map
and let the regular fault handler populate the unmapped area with
RAM-backed pages. In practice the unmapped area is seldom accessed, so
page faults might never occur before this area is discarded.

In the non-MMU case, it is the get_unmapped_area method that is
responsible for providing the address where the actual data can be
found. No mapping is necessary, of course.
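The "uncompressed and contiguous blocks" requirement can be modelled outside the kernel. What follows is only a user-space sketch, not the patch's code: the flag values, the shift and the 4096-byte page size are assumptions made for illustration, and block_range() is a hypothetical stand-in for cramfs_get_block_range() that takes the block pointer array directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration; the real ones come from the cramfs headers. */
#define BLK_FLAG_UNCOMPRESSED (1u << 31)
#define BLK_FLAG_DIRECT_PTR   (1u << 30)
#define BLK_FLAGS             (BLK_FLAG_UNCOMPRESSED | BLK_FLAG_DIRECT_PTR)
#define BLK_DIRECT_PTR_SHIFT  2
#define SKETCH_PAGE_SIZE      4096u

/*
 * Count how many of the first *pages block pointers describe uncompressed
 * blocks stored back to back. Returns the byte offset of the first block
 * and shrinks *pages to the usable count, or returns 0 (leaving *pages
 * untouched) when even the first block is unsuitable.
 */
static uint32_t block_range(const uint32_t *blockptrs, uint32_t *pages)
{
	uint32_t first, i;

	assert(blockptrs != NULL && pages != NULL && *pages > 0);

	first = blockptrs[0] & ~BLK_FLAGS;
	for (i = 0; i < *pages; i++) {
		/* Direct pointers are stored shifted, so one page is this far apart. */
		uint32_t step = i * (SKETCH_PAGE_SIZE >> BLK_DIRECT_PTR_SHIFT);
		uint32_t expect = (first + step) | BLK_FLAGS;

		if (blockptrs[i] != expect)
			break;
	}
	if (i == 0)
		return 0;

	*pages = i;
	return first << BLK_DIRECT_PTR_SHIFT;
}
```

A caller passes the number of pages it would like to map and gets back the number that can actually be mapped directly. A shorter count is what leads to the "mixed" map described above.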
Signed-off-by: Nicolas Pitre
Tested-by: Chris Brandt
---
 fs/cramfs/inode.c | 194 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 194 insertions(+)

--
2.9.5

diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
index 6aa1d94ed8..071ce1eb58 100644
--- a/fs/cramfs/inode.c
+++ b/fs/cramfs/inode.c
@@ -15,7 +15,10 @@
 
 #include
 #include
+#include
 #include
+#include
+#include
 #include
 #include
 #include
@@ -49,6 +52,7 @@ static inline struct cramfs_sb_info *CRAMFS_SB(struct super_block *sb)
 static const struct super_operations cramfs_ops;
 static const struct inode_operations cramfs_dir_inode_operations;
 static const struct file_operations cramfs_directory_operations;
+static const struct file_operations cramfs_physmem_fops;
 static const struct address_space_operations cramfs_aops;
 
 static DEFINE_MUTEX(read_mutex);
@@ -96,6 +100,10 @@ static struct inode *get_cramfs_inode(struct super_block *sb,
 	case S_IFREG:
 		inode->i_fop = &generic_ro_fops;
 		inode->i_data.a_ops = &cramfs_aops;
+		if (IS_ENABLED(CONFIG_CRAMFS_PHYSMEM) &&
+		    CRAMFS_SB(sb)->flags & CRAMFS_FLAG_EXT_BLOCK_POINTERS &&
+		    CRAMFS_SB(sb)->linear_phys_addr)
+			inode->i_fop = &cramfs_physmem_fops;
 		break;
 	case S_IFDIR:
 		inode->i_op = &cramfs_dir_inode_operations;
@@ -277,6 +285,192 @@ static void *cramfs_read(struct super_block *sb, unsigned int offset,
 		return NULL;
 }
 
+/*
+ * For a mapping to be possible, we need a range of uncompressed and
+ * contiguous blocks. Return the offset for the first block and number of
+ * valid blocks for which that is true, or zero otherwise.
+ */
+static u32 cramfs_get_block_range(struct inode *inode, u32 pgoff, u32 *pages)
+{
+	struct super_block *sb = inode->i_sb;
+	struct cramfs_sb_info *sbi = CRAMFS_SB(sb);
+	int i;
+	u32 *blockptrs, first_block_addr;
+
+	/*
+	 * We can dereference memory directly here as this code may be
+	 * reached only when there is a direct filesystem image mapping
+	 * available in memory.
+	 */
+	blockptrs = (u32 *)(sbi->linear_virt_addr + OFFSET(inode) + pgoff * 4);
+	first_block_addr = blockptrs[0] & ~CRAMFS_BLK_FLAGS;
+	i = 0;
+	do {
+		u32 block_off = i * (PAGE_SIZE >> CRAMFS_BLK_DIRECT_PTR_SHIFT);
+		u32 expect = (first_block_addr + block_off) |
+			     CRAMFS_BLK_FLAG_DIRECT_PTR |
+			     CRAMFS_BLK_FLAG_UNCOMPRESSED;
+		if (blockptrs[i] != expect) {
+			pr_debug("range: block %d/%d got %#x expects %#x\n",
+				 pgoff+i, pgoff + *pages - 1,
+				 blockptrs[i], expect);
+			if (i == 0)
+				return 0;
+			break;
+		}
+	} while (++i < *pages);
+
+	*pages = i;
+	return first_block_addr << CRAMFS_BLK_DIRECT_PTR_SHIFT;
+}
+
+#ifdef CONFIG_MMU
+
+static int cramfs_physmem_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	struct inode *inode = file_inode(file);
+	struct super_block *sb = inode->i_sb;
+	struct cramfs_sb_info *sbi = CRAMFS_SB(sb);
+	unsigned int pages, max_pages, offset;
+	unsigned long address, pgoff = vma->vm_pgoff;
+	char *bailout_reason;
+	int ret;
+
+	if ((vma->vm_flags & VM_SHARED) && (vma->vm_flags & VM_MAYWRITE))
+		return -EINVAL;
+
+	/* Could COW work here? */
+	bailout_reason = "vma is writable";
+	if (vma->vm_flags & VM_WRITE)
+		goto bailout;
+
+	max_pages = (inode->i_size + PAGE_SIZE - 1) >> PAGE_SHIFT;
+	bailout_reason = "beyond file limit";
+	if (pgoff >= max_pages)
+		goto bailout;
+	pages = min(vma_pages(vma), max_pages - pgoff);
+
+	offset = cramfs_get_block_range(inode, pgoff, &pages);
+	bailout_reason = "unsuitable block layout";
+	if (!offset)
+		goto bailout;
+	address = sbi->linear_phys_addr + offset;
+	bailout_reason = "data is not page aligned";
+	if (!PAGE_ALIGNED(address))
+		goto bailout;
+
+	/* Don't map the last page if it contains some other data */
+	if (unlikely(pgoff + pages == max_pages)) {
+		unsigned int partial = offset_in_page(inode->i_size);
+		if (partial) {
+			char *data = sbi->linear_virt_addr + offset;
+			data += (max_pages - 1) * PAGE_SIZE + partial;
+			if (memchr_inv(data, 0, PAGE_SIZE - partial) != NULL) {
+				pr_debug("mmap: %s: last page is shared\n",
+					 file_dentry(file)->d_name.name);
+				pages--;
+			}
+		}
+	}
+
+	if (!pages) {
+		bailout_reason = "no suitable block remaining";
+		goto bailout;
+	}
+
+	if (pages != vma_pages(vma)) {
+		/* Let's create a mixed map if we can't map it all. */
+		int i;
+		vma->vm_flags |= VM_MIXEDMAP;
+		for (i = 0; i < pages; i++) {
+			unsigned long off = i * PAGE_SIZE;
+			pfn_t pfn = phys_to_pfn_t(address + off, PFN_DEV);
+			ret = vm_insert_mixed(vma, vma->vm_start + off, pfn);
+			if (ret)
+				return ret;
+		}
+		/*
+		 * The normal paging machinery will take care of the
+		 * unpopulated ptes via cramfs_readpage().
+		 */
+		vma->vm_ops = &generic_file_vm_ops;
+	} else {
+		/*
+		 * The entire vma is mappable. remap_pfn_range() will
+		 * make it distinguishable from a non-direct mapping
+		 * in /proc//maps by substituting the file offset
+		 * with the actual physical address.
+		 */
+		ret = remap_pfn_range(vma, vma->vm_start, address >> PAGE_SHIFT,
+				      pages * PAGE_SIZE, vma->vm_page_prot);
+		if (ret)
+			return ret;
+	}
+
+	pr_debug("mapped %s[%lu] at 0x%08lx (%u/%lu pages) to vma 0x%08lx, "
+		 "page_prot 0x%llx\n", file_dentry(file)->d_name.name, pgoff,
+		 address, pages, vma_pages(vma), vma->vm_start,
+		 (unsigned long long)pgprot_val(vma->vm_page_prot));
+	return 0;
+
+bailout:
+	pr_debug("%s[%lu]: direct mmap impossible: %s\n",
+		 file_dentry(file)->d_name.name, pgoff, bailout_reason);
+
+	/* We didn't manage a direct map, but normal paging is still possible */
+	vma->vm_ops = &generic_file_vm_ops;
+	return 0;
+}
+
+#else /* CONFIG_MMU */
+
+static int cramfs_physmem_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	return vma->vm_flags & (VM_SHARED | VM_MAYSHARE) ? 0 : -ENOSYS;
+}
+
+static unsigned long cramfs_physmem_get_unmapped_area(struct file *file,
+			unsigned long addr, unsigned long len,
+			unsigned long pgoff, unsigned long flags)
+{
+	struct inode *inode = file_inode(file);
+	struct super_block *sb = inode->i_sb;
+	struct cramfs_sb_info *sbi = CRAMFS_SB(sb);
+	unsigned int pages, block_pages, max_pages, offset;
+
+	pages = (len + PAGE_SIZE - 1) >> PAGE_SHIFT;
+	max_pages = (inode->i_size + PAGE_SIZE - 1) >> PAGE_SHIFT;
+	if (pgoff >= max_pages || pages > max_pages - pgoff)
+		return -EINVAL;
+	block_pages = pages;
+	offset = cramfs_get_block_range(inode, pgoff, &block_pages);
+	if (!offset || block_pages != pages)
+		return -ENOSYS;
+	addr = sbi->linear_phys_addr + offset;
+	pr_debug("get_unmapped for %s ofs %#lx siz %lu at 0x%08lx\n",
+		 file_dentry(file)->d_name.name, pgoff*PAGE_SIZE, len, addr);
+	return addr;
+}
+
+static unsigned int cramfs_physmem_mmap_capabilities(struct file *file)
+{
+	return NOMMU_MAP_COPY | NOMMU_MAP_DIRECT |
+	       NOMMU_MAP_READ | NOMMU_MAP_EXEC;
+}
+
+#endif /* CONFIG_MMU */
+
+static const struct file_operations cramfs_physmem_fops = {
+	.llseek			= generic_file_llseek,
+	.read_iter		= generic_file_read_iter,
+	.splice_read		= generic_file_splice_read,
+	.mmap			= cramfs_physmem_mmap,
+#ifndef CONFIG_MMU
+	.get_unmapped_area	= cramfs_physmem_get_unmapped_area,
+	.mmap_capabilities	= cramfs_physmem_mmap_capabilities,
+#endif
+};
+
 static void cramfs_blkdev_kill_sb(struct super_block *sb)
 {
 	struct cramfs_sb_info *sbi = CRAMFS_SB(sb);
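
The last-page handling in the MMU cramfs_physmem_mmap() can be illustrated with a small user-space model. This is a sketch under stated assumptions, not kernel code: mappable_pages() is a hypothetical helper, the page size is fixed at 4096 bytes, a plain loop stands in for memchr_inv(), and the range is assumed to reach the end of the file. The tail is indexed relative to the first page of the range, i.e. with (pages - 1), which matches the patch's (max_pages - 1) whenever pgoff is 0.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u

/*
 * `data` points at `pages` directly mappable pages that end at the end of
 * the file. `partial` is the number of bytes the file owns in its last page
 * (0 means the last page is completely full). Returns how many pages can be
 * exposed to user space without leaking another file's data.
 */
static unsigned int mappable_pages(const unsigned char *data,
				   unsigned int pages, unsigned int partial)
{
	const unsigned char *tail;
	size_t i;

	assert(data != NULL && partial < SKETCH_PAGE_SIZE);

	if (pages == 0 || partial == 0)
		return pages;

	/* First byte past the end of the file, inside its last page. */
	tail = data + (size_t)(pages - 1) * SKETCH_PAGE_SIZE + partial;
	for (i = 0; i < SKETCH_PAGE_SIZE - partial; i++) {
		if (tail[i] != 0)
			return pages - 1;	/* the page is shared, skip it */
	}
	return pages;
}
```

When the result is smaller than the number of pages in the vma, the remaining pages are left for the regular fault handler to populate, which is the "mixed" map case.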