From patchwork Wed Mar 9 16:03:53 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Catalin Marinas X-Patchwork-Id: 63719 Delivered-To: patch@linaro.org Received: by 10.112.199.169 with SMTP id jl9csp2728927lbc; Wed, 9 Mar 2016 08:04:11 -0800 (PST) X-Received: by 10.66.121.97 with SMTP id lj1mr51064650pab.51.1457539451361; Wed, 09 Mar 2016 08:04:11 -0800 (PST) Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id ck1si9950448pac.68.2016.03.09.08.04.11; Wed, 09 Mar 2016 08:04:11 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933194AbcCIQEK (ORCPT + 30 others); Wed, 9 Mar 2016 11:04:10 -0500 Received: from foss.arm.com ([217.140.101.70]:40706 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933050AbcCIQD5 (ORCPT ); Wed, 9 Mar 2016 11:03:57 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 3F7F949; Wed, 9 Mar 2016 08:02:58 -0800 (PST) Received: from e104818-lin.cambridge.arm.com (e104818-lin.cambridge.arm.com [10.1.203.148]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id ADA353F211; Wed, 9 Mar 2016 08:03:55 -0800 (PST) Date: Wed, 9 Mar 2016 16:03:53 +0000 From: Catalin Marinas To: Ganapatrao Kulkarni Cc: dann.frazier@canonical.com, Ganapatrao Kulkarni , Will Deacon , "linux-kernel@vger.kernel.org" , "linux-arm-kernel@lists.infradead.org" Subject: Re: [PATCH] arm64: Fix the ptep_set_wrprotect() to set PTE_DIRTY if (PTE_DBM && !PTE_RDONLY) Message-ID: <20160309160352.GM6192@e104818-lin.cambridge.arm.com> References: <1457499768-31176-1-git-send-email-gkulkarni@caviumnetworks.com> <20160309100605.GJ6192@e104818-lin.cambridge.arm.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Mar 09, 2016 at 05:17:39PM +0530, Ganapatrao Kulkarni wrote: > On Wed, Mar 9, 2016 at 3:36 PM, Catalin Marinas wrote: > > On Wed, Mar 09, 2016 at 10:32:48AM +0530, Ganapatrao Kulkarni wrote: > >> Commit 2f4b829c625e ("arm64: Add support for hardware updates of the > >> access and dirty pte bits") introduced support for handling hardware > >> updates of the access flag and dirty status. > >> > >> ptep_set_wrprotect is setting PTR_DIRTY if !PTE_RDONLY, > >> however by design it suppose to set PTE_DIRTY > >> only if (PTE_DBM && !PTE_RDONLY). This patch addes code to > >> test and set accordingly. > > > > The reasoning behind the original code is that if !PTE_RDONLY, you have > > no way to tell whether the page was written or not since it is already > > writable, independent of the DBM. So by clearing the DBM bit (making the > > page read-only), we need to ensure that a potential dirty state is > > transferred to the software PTE_DIRTY bit. > > > > By checking PTE_DBM && !PTE_RDONLY, you kind of imply that you can have > > a page with !PTE_DBM && !PTE_RDONLY. Given that PTE_DBM is actually > > PTE_WRITE, PTE_RDONLY must always be set when !PTE_DBM. The bug may be > > elsewhere not setting these bits correctly. > > but i do see this macro, > #define pte_hw_dirty(pte) (pte_write(pte) && !(pte_val(pte) & PTE_RDONLY)) This was added in commit b847415ce96e ("arm64: Fix the pte_hw_dirty() check when AF/DBM is enabled") for the pte_modify() case which is not called on the actual PTE but a local variable. A pte passed to this function as !PTE_DBM && !PTE_RDONLY should not be assumed dirty since PTE_RDONLY will be set later by set_pte_at() when the actual page table write occurs. ptep_set_wrprotect() is run directly on the actual PTE, so here a !PTE_RDONLY only means potentially dirty, independent of the PTE_DBM bit. I consider the additional PTE_DBM check superfluous in this case but we need to understand when we would actually get a pte with both PTE_DBM and PTE_RDONLY cleared. The only way I see this happening is if the pte doesn't have PTE_VALID set, IOW it probably has PTE_PROT_NONE set which is used by the NUMA balancing. So calling set_pte_at() on a !PTE_VALID && !PTE_DBM pte does not currently set PTE_RDONLY and ptep_set_wrprotect() wrongly assumes it is dirty. > i dont see this issue, if i comment out arm64 implementation of > ptep_set_wrprotect() Because the default implementation discards any existing hw dirty information by clearing the PTE_DBM bit and setting PTE_RDONLY via the set_pte_at (of course, apart from the atomicity issues). > >> This patch fixes BUG, > >> kernel BUG at /build/linux-StrpB2/linux-4.4.0/fs/ext4/inode.c:2394! > >> Internal error: Oops - BUG: 0 [#1] SMP > > > > Which bug is this? It's a PageWriteback() check in the for-next/core > > branch. What kernel version are you using? > > i am using 4.4.0 I guess with additional NUMA patches since it only fails when you enable the NUMA_BALANCING configuration. > > BTW, in 4.5-rc2 we pushed commit ac15bd63bbb2 ("arm64: Honour !PTE_WRITE > > in set_pte_at() for kernel mappings"), though not sure that's what you > > are hitting. > > i have tried this patch, but issue still exist. crash log below > > root@ubuntu:/home/ganapat/test# [ 733.853009] kernel BUG at > fs/ext4/inode.c:2394! Is this the BUG_ON in page_buffers(!PagePrivate(page))? I can see in the code above this that wrongly marking a page as dirty could have some side effects. Can you give this patch a try, on top of commit ac15bd63bbb2? -------------8<---------------------- diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 7c73b365fcfa..b409a983f870 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -201,7 +201,7 @@ extern void __sync_icache_dcache(pte_t pteval, unsigned long addr); static inline void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte) { - if (pte_valid(pte)) { + if (pte_present(pte)) { if (pte_sw_dirty(pte) && pte_write(pte)) pte_val(pte) &= ~PTE_RDONLY; else