Message ID | 20181025012745.20884-1-rafael.tinoco@linaro.org |
---|---|
State | New |
Headers | show |
Series | [1/2] mm/zsmalloc.c: check encoded object value overflow for PAE | expand |
On (10/24/18 22:27), Rafael David Tinoco wrote: > static unsigned long location_to_obj(struct page *page, unsigned int obj_idx) > { > - unsigned long obj; > + unsigned long obj, pfn; > + > + pfn = page_to_pfn(page); > + > + if (unlikely(OBJ_OVERFLOW(pfn))) > + BUG(); The trend these days is to have less BUG/BUG_ON-s in the kernel. -ss
On Thu, Oct 25, 2018 at 2:29 AM, Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote: > On (10/24/18 22:27), Rafael David Tinoco wrote: >> static unsigned long location_to_obj(struct page *page, unsigned int obj_idx) >> { >> - unsigned long obj; >> + unsigned long obj, pfn; >> + >> + pfn = page_to_pfn(page); >> + >> + if (unlikely(OBJ_OVERFLOW(pfn))) >> + BUG(); > > The trend these days is to have less BUG/BUG_ON-s in the kernel. > > -ss For this case, IMHO, it is worth. It will avoid a investigation like: https://bugs.linaro.org/show_bug.cgi?id=3765#c7 and and #c8, where I had to poison slab allocation - to force both zs_handle and zspage slabs not to be merged - and to make sure the zspage slab had a good magic number AND to identify why the bad paging request happened. If this happens again, for any other arch supporting PAE that does not declare MAX_POSSIBLE_PHYSMEM_BITS or MAX_PHYSMEM_BITS appropriately, the kernel will panic, no matter what, by the time it reaches obj_to_location(). Things can be more complicated about declarations for PAE if we consider ARM can declare MAX_PHYSMEM_BITS differently in arch/arm/mach-XXX and/or, for this case, when having, or not SPARSEMEM set (if I had SPARSEMEM set I would not face this, for example). If this occurs, the kernel will panic, no matter what, by the time it reaches obj_to_location()... so why not to BUG() here and let user to know exactly where it panic-ed and why ? Other option would be to WARN() here and let it panic naturally because of bad paging request in a very near future... please advise. Thanks, Best Rgds -Rafael
On Wed, Oct 24, 2018 at 10:27:44PM -0300, Rafael David Tinoco wrote: > On 32-bit systems, zsmalloc uses HIGHMEM and, when PAE is enabled, the > physical frame number might be so big that zsmalloc obj encoding (to > location) will break IF architecture does not re-define > MAX_PHYSMEM_BITS, causing: I think there's a deeper problem here - a misunderstanding of what MAX_PHYSMEM_BITS is. MAX_PHYSMEM_BITS is a definition for sparsemem, and is only visible when sparsemem is enabled. When sparsemem is disabled, asm/sparsemem.h is not included (and should not be included) which means there is no MAX_PHYSMEM_BITS definition. I don't think zsmalloc.c should be (ab)using MAX_PHYSMEM_BITS, and your description above makes it sound like you expect it to always be defined. If we want to have a definition for this, we shouldn't be playing fragile games like: #ifndef MAX_POSSIBLE_PHYSMEM_BITS #ifdef MAX_PHYSMEM_BITS #define MAX_POSSIBLE_PHYSMEM_BITS MAX_PHYSMEM_BITS #else /* * If this definition of MAX_PHYSMEM_BITS is used, OBJ_INDEX_BITS will just * be PAGE_SHIFT */ #define MAX_POSSIBLE_PHYSMEM_BITS BITS_PER_LONG #endif #endif but instead insist that MAX_PHYSMEM_BITS is defined _everywhere_. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up
> MAX_PHYSMEM_BITS is a definition for sparsemem, and is only visible > when sparsemem is enabled. When sparsemem is disabled, asm/sparsemem.h > is not included (and should not be included) which means there is no > MAX_PHYSMEM_BITS definition. Missed that part :\, tks. > I don't think zsmalloc.c should be (ab)using MAX_PHYSMEM_BITS, and > your description above makes it sound like you expect it to always be > defined. > > If we want to have a definition for this, we shouldn't be playing > fragile games like: > > #ifndef MAX_POSSIBLE_PHYSMEM_BITS > #ifdef MAX_PHYSMEM_BITS > #define MAX_POSSIBLE_PHYSMEM_BITS MAX_PHYSMEM_BITS > #else > /* > * If this definition of MAX_PHYSMEM_BITS is used, OBJ_INDEX_BITS will just > * be PAGE_SHIFT > */ > #define MAX_POSSIBLE_PHYSMEM_BITS BITS_PER_LONG > #endif > #endif > > but instead insist that MAX_PHYSMEM_BITS is defined _everywhere_. Is it okay to propose using only MAX_PHYSMEM_BITS for zsmalloc (like it was before commit 02390b87) instead, and make sure *at least* ARM 32/64 and x86/x64, for now, have it defined outside sparsemem headers as well ? This way I can WARN_ONCE(), instead of BUG(), when specific arch does not define it - enforcing behavior - showing BITS_PER_LONG is being used instead of MAX_PHYSMEM_BITS (warning, at least once, for the possibility of an overflow, like the issue showed in here).
Hi Rafael, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on linux-sof-driver/master] [also build test WARNING on v4.19 next-20181019] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Rafael-David-Tinoco/mm-zsmalloc-c-check-encoded-object-value-overflow-for-PAE/20181025-110258 base: https://github.com/thesofproject/linux master config: um-allyesconfig (attached as .config) compiler: gcc-7 (Debian 7.3.0-1) 7.3.0 reproduce: # save the attached .config to linux build tree make ARCH=um All warnings (new ones prefixed by >>): In file included from include/linux/kernel.h:10:0, from include/linux/list.h:9, from include/linux/module.h:9, from mm/zsmalloc.c:33: mm/zsmalloc.c: In function 'location_to_obj': >> mm/zsmalloc.c:129:17: warning: left shift count >= width of type [-Wshift-count-overflow] ((_AC(1, ULL)) << MAX_POSSIBLE_PHYSMEM_BITS) ? 1 : 0) ^ include/linux/compiler.h:77:42: note: in definition of macro 'unlikely' # define unlikely(x) __builtin_expect(!!(x), 0) ^ >> mm/zsmalloc.c:886:15: note: in expansion of macro 'OBJ_OVERFLOW' if (unlikely(OBJ_OVERFLOW(pfn))) ^~~~~~~~~~~~ Cyclomatic Complexity 5 include/linux/compiler.h:__read_once_size Cyclomatic Complexity 5 include/linux/compiler.h:__write_once_size Cyclomatic Complexity 1 include/linux/kasan-checks.h:kasan_check_read Cyclomatic Complexity 1 include/linux/kasan-checks.h:kasan_check_write Cyclomatic Complexity 2 arch/x86/include/asm/bitops.h:set_bit Cyclomatic Complexity 2 arch/x86/include/asm/bitops.h:clear_bit Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:clear_bit_unlock Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:test_and_set_bit Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:test_and_set_bit_lock Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:constant_test_bit Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:variable_test_bit Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:fls64 Cyclomatic Complexity 1 include/linux/log2.h:__ilog2_u64 Cyclomatic Complexity 1 include/linux/kernel.h:___might_sleep Cyclomatic Complexity 1 include/linux/list.h:INIT_LIST_HEAD Cyclomatic Complexity 2 include/linux/list.h:__list_add Cyclomatic Complexity 1 include/linux/list.h:list_add Cyclomatic Complexity 1 include/linux/list.h:__list_del Cyclomatic Complexity 2 include/linux/list.h:__list_del_entry Cyclomatic Complexity 1 include/linux/list.h:list_del Cyclomatic Complexity 1 include/linux/list.h:list_del_init Cyclomatic Complexity 1 include/linux/list.h:list_empty Cyclomatic Complexity 1 include/linux/list.h:__list_splice Cyclomatic Complexity 2 include/linux/list.h:list_splice_init Cyclomatic Complexity 1 arch/um/include/shared/mem.h:to_virt Cyclomatic Complexity 1 include/asm-generic/getorder.h:__get_order Cyclomatic Complexity 1 arch/um/include/asm/thread_info.h:current_thread_info Cyclomatic Complexity 1 include/asm-generic/preempt.h:preempt_count Cyclomatic Complexity 1 arch/x86/include/asm/atomic.h:arch_atomic_read Cyclomatic Complexity 1 arch/x86/include/asm/atomic.h:arch_atomic_set Cyclomatic Complexity 1 arch/x86/include/asm/atomic.h:arch_atomic_inc Cyclomatic Complexity 1 arch/x86/include/asm/atomic.h:arch_atomic_dec_and_test Cyclomatic Complexity 1 arch/x86/include/asm/atomic64_64.h:arch_atomic64_read Cyclomatic Complexity 1 arch/x86/include/asm/atomic64_64.h:arch_atomic64_add Cyclomatic Complexity 1 arch/x86/include/asm/atomic64_64.h:arch_atomic64_sub Cyclomatic Complexity 1 arch/x86/include/asm/atomic64_64.h:arch_atomic64_inc Cyclomatic Complexity 1 arch/x86/include/asm/atomic64_64.h:arch_atomic64_dec Cyclomatic Complexity 1 include/asm-generic/atomic-instrumented.h:atomic_read Cyclomatic Complexity 1 include/asm-generic/atomic-instrumented.h:atomic64_read Cyclomatic Complexity 1 include/asm-generic/atomic-instrumented.h:atomic_set Cyclomatic Complexity 1 include/asm-generic/atomic-instrumented.h:atomic_inc Cyclomatic Complexity 1 include/asm-generic/atomic-instrumented.h:atomic64_inc Cyclomatic Complexity 1 include/asm-generic/atomic-instrumented.h:atomic64_dec Cyclomatic Complexity 1 include/asm-generic/atomic-instrumented.h:atomic64_add Cyclomatic Complexity 1 include/asm-generic/atomic-instrumented.h:atomic64_sub Cyclomatic Complexity 1 include/asm-generic/atomic-instrumented.h:atomic_dec_and_test Cyclomatic Complexity 1 include/asm-generic/atomic-long.h:atomic_long_read Cyclomatic Complexity 1 include/asm-generic/atomic-long.h:atomic_long_inc Cyclomatic Complexity 1 include/asm-generic/atomic-long.h:atomic_long_dec Cyclomatic Complexity 1 include/asm-generic/atomic-long.h:atomic_long_add Cyclomatic Complexity 1 include/asm-generic/atomic-long.h:atomic_long_sub Cyclomatic Complexity 1 arch/x86/um/asm/processor.h:rep_nop Cyclomatic Complexity 1 include/linux/spinlock.h:spinlock_check Cyclomatic Complexity 1 include/linux/spinlock.h:spin_lock Cyclomatic Complexity 1 include/linux/spinlock.h:spin_unlock Cyclomatic Complexity 1 include/linux/jump_label.h:static_key_count Cyclomatic Complexity 2 include/linux/jump_label.h:static_key_false Cyclomatic Complexity 1 include/linux/nodemask.h:node_state Cyclomatic Complexity 1 include/linux/err.h:PTR_ERR Cyclomatic Complexity 1 include/linux/err.h:IS_ERR Cyclomatic Complexity 1 include/linux/workqueue.h:queue_work Cyclomatic Complexity 1 include/linux/workqueue.h:schedule_work Cyclomatic Complexity 1 include/linux/topology.h:numa_node_id Cyclomatic Complexity 1 include/linux/topology.h:numa_mem_id Cyclomatic Complexity 1 include/linux/gfp.h:__alloc_pages Cyclomatic Complexity 4 include/linux/gfp.h:__alloc_pages_node Cyclomatic Complexity 2 include/linux/gfp.h:alloc_pages_node Cyclomatic Complexity 4 include/linux/bit_spinlock.h:bit_spin_lock Cyclomatic Complexity 2 include/linux/bit_spinlock.h:bit_spin_trylock Cyclomatic Complexity 2 include/linux/bit_spinlock.h:bit_spin_unlock Cyclomatic Complexity 2 include/linux/bit_spinlock.h:bit_spin_is_locked Cyclomatic Complexity 1 include/linux/fs.h:mount_pseudo Cyclomatic Complexity 2 include/linux/page-flags.h:compound_head Cyclomatic Complexity 1 include/linux/page-flags.h:PagePoisoned Cyclomatic Complexity 1 include/linux/page-flags.h:PageLocked Cyclomatic Complexity 1 include/linux/page-flags.h:PagePrivate Cyclomatic Complexity 1 include/linux/page-flags.h:SetPagePrivate Cyclomatic Complexity 1 include/linux/page-flags.h:ClearPagePrivate Cyclomatic Complexity 1 include/linux/page-flags.h:PageOwnerPriv1 Cyclomatic Complexity 1 include/linux/page-flags.h:SetPageOwnerPriv1 Cyclomatic Complexity 1 include/linux/page-flags.h:ClearPageOwnerPriv1 Cyclomatic Complexity 1 include/linux/page-flags.h:PageIsolated Cyclomatic Complexity 1 include/linux/page_ref.h:page_ref_count Cyclomatic Complexity 2 include/linux/page_ref.h:page_ref_inc Cyclomatic Complexity 2 include/linux/page_ref.h:page_ref_dec_and_test Cyclomatic Complexity 1 include/linux/mm.h:put_page_testzero Cyclomatic Complexity 1 include/linux/mm.h:page_mapcount_reset Cyclomatic Complexity 1 include/linux/mm.h:page_zonenum Cyclomatic Complexity 1 include/linux/mm.h:get_page Cyclomatic Complexity 2 include/linux/mm.h:put_page Cyclomatic Complexity 1 include/linux/mm.h:page_zone Cyclomatic Complexity 1 include/linux/vmstat.h:__inc_zone_state Cyclomatic Complexity 1 include/linux/vmstat.h:__dec_zone_state Cyclomatic Complexity 1 include/linux/vmstat.h:__inc_zone_page_state Cyclomatic Complexity 1 include/linux/vmstat.h:__dec_zone_page_state Cyclomatic Complexity 1 include/linux/mm.h:lowmem_page_address Cyclomatic Complexity 1 include/linux/uaccess.h:pagefault_disabled_inc Cyclomatic Complexity 1 include/linux/uaccess.h:pagefault_disabled_dec vim +129 mm/zsmalloc.c 32 > 33 #include <linux/module.h> 34 #include <linux/kernel.h> 35 #include <linux/sched.h> 36 #include <linux/magic.h> 37 #include <linux/bitops.h> 38 #include <linux/errno.h> 39 #include <linux/highmem.h> 40 #include <linux/string.h> 41 #include <linux/slab.h> 42 #include <asm/tlbflush.h> 43 #include <asm/pgtable.h> 44 #include <linux/cpumask.h> 45 #include <linux/cpu.h> 46 #include <linux/vmalloc.h> 47 #include <linux/preempt.h> 48 #include <linux/spinlock.h> 49 #include <linux/shrinker.h> 50 #include <linux/types.h> 51 #include <linux/debugfs.h> 52 #include <linux/zsmalloc.h> 53 #include <linux/zpool.h> 54 #include <linux/mount.h> 55 #include <linux/migrate.h> 56 #include <linux/pagemap.h> 57 #include <linux/fs.h> 58 59 #define ZSPAGE_MAGIC 0x58 60 61 /* 62 * This must be power of 2 and greater than of equal to sizeof(link_free). 63 * These two conditions ensure that any 'struct link_free' itself doesn't 64 * span more than 1 page which avoids complex case of mapping 2 pages simply 65 * to restore link_free pointer values. 66 */ 67 #define ZS_ALIGN 8 68 69 /* 70 * A single 'zspage' is composed of up to 2^N discontiguous 0-order (single) 71 * pages. ZS_MAX_ZSPAGE_ORDER defines upper limit on N. 72 */ 73 #define ZS_MAX_ZSPAGE_ORDER 2 74 #define ZS_MAX_PAGES_PER_ZSPAGE (_AC(1, UL) << ZS_MAX_ZSPAGE_ORDER) 75 76 #define ZS_HANDLE_SIZE (sizeof(unsigned long)) 77 78 /* 79 * Object location (<PFN>, <obj_idx>) is encoded as 80 * as single (unsigned long) handle value. 81 * 82 * Note that object index <obj_idx> starts from 0. 83 * 84 * This is made more complicated by various memory models and PAE. 85 */ 86 87 #ifndef MAX_POSSIBLE_PHYSMEM_BITS 88 #ifdef MAX_PHYSMEM_BITS 89 #define MAX_POSSIBLE_PHYSMEM_BITS MAX_PHYSMEM_BITS 90 #else 91 /* 92 * If this definition of MAX_PHYSMEM_BITS is used, OBJ_INDEX_BITS will just 93 * be PAGE_SHIFT 94 */ 95 #define MAX_POSSIBLE_PHYSMEM_BITS BITS_PER_LONG 96 #endif 97 #endif 98 99 #define _PFN_BITS (MAX_POSSIBLE_PHYSMEM_BITS - PAGE_SHIFT) 100 101 /* 102 * Memory for allocating for handle keeps object position by 103 * encoding <page, obj_idx> and the encoded value has a room 104 * in least bit(ie, look at obj_to_location). 105 * We use the bit to synchronize between object access by 106 * user and migration. 107 */ 108 #define HANDLE_PIN_BIT 0 109 110 /* 111 * Head in allocated object should have OBJ_ALLOCATED_TAG 112 * to identify the object was allocated or not. 113 * It's okay to add the status bit in the least bit because 114 * header keeps handle which is 4byte-aligned address so we 115 * have room for two bit at least. 116 */ 117 #define OBJ_ALLOCATED_TAG 1 118 #define OBJ_TAG_BITS 1 119 #define OBJ_INDEX_BITS (BITS_PER_LONG - _PFN_BITS - OBJ_TAG_BITS) 120 #define OBJ_INDEX_MASK ((_AC(1, UL) << OBJ_INDEX_BITS) - 1) 121 122 /* 123 * When using PAE, the obj encoding might overflow if arch does 124 * not re-define MAX_PHYSMEM_BITS, since zsmalloc uses HIGHMEM. 125 * This checks for a future bad page access, when de-coding obj. 126 */ 127 #define OBJ_OVERFLOW(_pfn) \ 128 (((unsigned long long) _pfn << (OBJ_INDEX_BITS + OBJ_TAG_BITS)) >= \ > 129 ((_AC(1, ULL)) << MAX_POSSIBLE_PHYSMEM_BITS) ? 1 : 0) 130 --- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation
On Thu, Oct 25, 2018 at 09:37:59AM -0300, Rafael David Tinoco wrote: > Is it okay to propose using only MAX_PHYSMEM_BITS for zsmalloc (like > it was before commit 02390b87) instead, and make sure *at least* ARM > 32/64 and x86/x64, for now, have it defined outside sparsemem headers > as well ? It looks to me like this has been broken on ARM for quite some time, predating that commit. The original was: #ifndef MAX_PHYSMEM_BITS #ifdef CONFIG_HIGHMEM64G #define MAX_PHYSMEM_BITS 36 #else /* !CONFIG_HIGHMEM64G */ #define MAX_PHYSMEM_BITS BITS_PER_LONG #endif #endif #define _PFN_BITS (MAX_PHYSMEM_BITS - PAGE_SHIFT) On ARM, CONFIG_HIGHMEM64G is never defined (it's an x86 private symbol) which means that the above sets MAX_PHYSMEM_BITS to 32 on non-sparsemem ARM LPAE platforms. So commit 02390b87 hasn't really changed anything as far as ARM LPAE is concerned - and this looks to be a bug that goes all the way back to when zsmalloc.c was moved out of staging in 2014. Digging further back, it seems this brokenness was introduced with: commit 6e00ec00b1a76a199b8c0acae401757b795daf57 Author: Seth Jennings <sjenning@linux.vnet.ibm.com> Date: Mon Mar 5 11:33:22 2012 -0600 staging: zsmalloc: calculate MAX_PHYSMEM_BITS if not defined This patch provides a way to determine or "set a reasonable value for" MAX_PHYSMEM_BITS in the case that it is not defined (i.e. !SPARSEMEM) Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com> Acked-by: Nitin Gupta <ngupta@vflare.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> which, at the time, realised the problem with SPARSEMEM, but decided that in the absense of SPARSEMEM, that MAX_PHYSMEM_BITS shall be BITS_PER_LONG which seems absurd (see below.) > This way I can WARN_ONCE(), instead of BUG(), when specific > arch does not define it - enforcing behavior - showing BITS_PER_LONG > is being used instead of MAX_PHYSMEM_BITS (warning, at least once, for > the possibility of an overflow, like the issue showed in here). Assuming that the maximum number of physical memory bits are BITS_PER_LONG in the absense of MAX_POSSIBLE_PHYSMEM_BITS is a nonsense - we have had the potential for PAE systems for a long time, and to introduce new code that makes this assumption was plainly wrong. We know when there's the potential for PAE, and thus more than BITS_PER_LONG bits of physical memory address, through CONFIG_PHYS_ADDR_T_64BIT. So if we have the situation where MAX_POSSIBLE_PHYSMEM_BITS (or the older case of MAX_PHYSMEM_BITS) not being defined, but CONFIG_PHYS_ADDR_T_64BIT set, we should've been erroring or something based on not knowing how many physical memory bits are possible - it would be more than BITS_PER_LONG but less than some unknown number of bits. This is why I think any fallback here to BITS_PER_LONG is wrong. What I suggested is to not fall back to BITS_PER_LONG in any case, but always define MAX_PHYSMEM_BITS. However, I now see that won't work for x86 because MAX_PHYSMEM_BITS is not a constant anymore. So I suggest everything that uses zsmalloc.c should instead define MAX_POSSIBLE_PHYSMEM_BITS. Note that there should _also_ be some protection in zsmalloc.c against MAX_POSSIBLE_PHYSMEM_BITS being too large: #define OBJ_INDEX_BITS (BITS_PER_LONG - _PFN_BITS - OBJ_TAG_BITS) #define OBJ_TAG_BITS 1 #define _PFN_BITS (MAX_POSSIBLE_PHYSMEM_BITS - PAGE_SHIFT) which means there's an implicit limitation on _PFN_BITS being less than BITS_PER_LONG - OBJ_TAG_BITS (where, if it's equal to this, and hence OBJ_INDEX_BITS will be zero.) This imples that MAX_POSSIBLE_PHYSMEM_BITS must be smaller than BITS_PER_LONG + PAGE_SHIFT - OBJ_TAG_BITS, or 43 bits on a 32 bit system. If you want to guarantee a minimum number of objects, then that limitation needs to be reduced further. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c index 9da65552e7ca..9c3ff8c2ccbc 100644 --- a/mm/zsmalloc.c +++ b/mm/zsmalloc.c @@ -119,6 +119,15 @@ #define OBJ_INDEX_BITS (BITS_PER_LONG - _PFN_BITS - OBJ_TAG_BITS) #define OBJ_INDEX_MASK ((_AC(1, UL) << OBJ_INDEX_BITS) - 1) +/* + * When using PAE, the obj encoding might overflow if arch does + * not re-define MAX_PHYSMEM_BITS, since zsmalloc uses HIGHMEM. + * This checks for a future bad page access, when de-coding obj. + */ +#define OBJ_OVERFLOW(_pfn) \ + (((unsigned long long) _pfn << (OBJ_INDEX_BITS + OBJ_TAG_BITS)) >= \ + ((_AC(1, ULL)) << MAX_POSSIBLE_PHYSMEM_BITS) ? 1 : 0) + #define FULLNESS_BITS 2 #define CLASS_BITS 8 #define ISOLATED_BITS 3 @@ -871,9 +880,14 @@ static void obj_to_location(unsigned long obj, struct page **page, */ static unsigned long location_to_obj(struct page *page, unsigned int obj_idx) { - unsigned long obj; + unsigned long obj, pfn; + + pfn = page_to_pfn(page); + + if (unlikely(OBJ_OVERFLOW(pfn))) + BUG(); - obj = page_to_pfn(page) << OBJ_INDEX_BITS; + obj = pfn << OBJ_INDEX_BITS; obj |= obj_idx & OBJ_INDEX_MASK; obj <<= OBJ_TAG_BITS;
On 32-bit systems, zsmalloc uses HIGHMEM and, when PAE is enabled, the physical frame number might be so big that zsmalloc obj encoding (to location) will break IF architecture does not re-define MAX_PHYSMEM_BITS, causing: [ 497.097843] ================================================================== [ 497.102365] BUG: KASAN: null-ptr-deref in zs_map_object+0xa4/0x2bc [ 497.105933] Read of size 4 at addr 00000000 by task mkfs.ext4/623 [ 497.109684] [ 497.110722] CPU: 2 PID: 623 Comm: mkfs.ext4 Not tainted 4.19.0-rc8-00017-g8239bc6d3307-dirty #15 [ 497.116098] Hardware name: Generic DT based system [ 497.119094] [<c0418f7c>] (unwind_backtrace) from [<c0410ca4>] (show_stack+0x20/0x24) [ 497.123819] [<c0410ca4>] (show_stack) from [<c16bd540>] (dump_stack+0xbc/0xe8) [ 497.128299] [<c16bd540>] (dump_stack) from [<c06cab74>] (kasan_report+0x248/0x390) [ 497.132928] [<c06cab74>] (kasan_report) from [<c06c94e8>] (__asan_load4+0x78/0xb4) [ 497.137530] [<c06c94e8>] (__asan_load4) from [<c06ddc24>] (zs_map_object+0xa4/0x2bc) [ 497.142335] [<c06ddc24>] (zs_map_object) from [<bf0bbbd8>] (zram_bvec_rw.constprop.2+0x324/0x8e4 [zram]) [ 497.148294] [<bf0bbbd8>] (zram_bvec_rw.constprop.2 [zram]) from [<bf0bc3cc>] (zram_make_request+0x234/0x46c [zram]) [ 497.154653] [<bf0bc3cc>] (zram_make_request [zram]) from [<c09aff9c>] (generic_make_request+0x304/0x63c) [ 497.160413] [<c09aff9c>] (generic_make_request) from [<c09b0320>] (submit_bio+0x4c/0x1c8) [ 497.165379] [<c09b0320>] (submit_bio) from [<c0743570>] (submit_bh_wbc.constprop.15+0x238/0x26c) [ 497.170775] [<c0743570>] (submit_bh_wbc.constprop.15) from [<c0746cf8>] (__block_write_full_page+0x524/0x76c) [ 497.176776] [<c0746cf8>] (__block_write_full_page) from [<c07472c4>] (block_write_full_page+0x1bc/0x1d4) [ 497.182549] [<c07472c4>] (block_write_full_page) from [<c0748eb4>] (blkdev_writepage+0x24/0x28) [ 497.187849] [<c0748eb4>] (blkdev_writepage) from [<c064a780>] (__writepage+0x44/0x78) [ 497.192633] [<c064a780>] (__writepage) from [<c064b50c>] (write_cache_pages+0x3b8/0x800) [ 497.197616] [<c064b50c>] (write_cache_pages) from [<c064dd78>] (generic_writepages+0x74/0xa0) [ 497.202807] [<c064dd78>] (generic_writepages) from [<c0748e64>] (blkdev_writepages+0x18/0x1c) [ 497.208022] [<c0748e64>] (blkdev_writepages) from [<c064e890>] (do_writepages+0x68/0x134) [ 497.213002] [<c064e890>] (do_writepages) from [<c06368a4>] (__filemap_fdatawrite_range+0xb0/0xf4) [ 497.218447] [<c06368a4>] (__filemap_fdatawrite_range) from [<c0636b68>] (file_write_and_wait_range+0x64/0xd0) [ 497.224416] [<c0636b68>] (file_write_and_wait_range) from [<c0747af8>] (blkdev_fsync+0x54/0x84) [ 497.229749] [<c0747af8>] (blkdev_fsync) from [<c0739dac>] (vfs_fsync_range+0x70/0xd4) [ 497.234549] [<c0739dac>] (vfs_fsync_range) from [<c0739e98>] (do_fsync+0x4c/0x80) [ 497.239159] [<c0739e98>] (do_fsync) from [<c073a26c>] (sys_fsync+0x1c/0x20) [ 497.243407] [<c073a26c>] (sys_fsync) from [<c0401000>] (ret_fast_syscall+0x0/0x2c) like in this ARM 32-bit (LPAE enabled), example. That occurs because, if not set, MAX_POSSIBLE_PHYSMEM_BITS will default to BITS_PER_LONG (32) in most cases, and, for PAE, _PFN_BITS will be wrong: which may cause obj variable to overflow if PFN is HIGHMEM and referencing any page above the 4GB watermark. This commit exposes the BUG IF the architecture supports PAE AND does not explicitly set MAX_POSSIBLE_PHYSMEM_BITS to supported value. Link: https://bugs.linaro.org/show_bug.cgi?id=3765#c17 Signed-off-by: Rafael David Tinoco <rafael.tinoco@linaro.org> --- mm/zsmalloc.c | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-) -- 2.19.1