From patchwork Mon Nov 16 01:56:51 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: AKASHI Takahiro X-Patchwork-Id: 56563 Delivered-To: patch@linaro.org Received: by 10.112.155.196 with SMTP id vy4csp1062777lbb; Sun, 15 Nov 2015 17:59:16 -0800 (PST) X-Received: by 10.68.69.71 with SMTP id c7mr51266278pbu.84.1447639156175; Sun, 15 Nov 2015 17:59:16 -0800 (PST) Return-Path: Received: from bombadil.infradead.org (bombadil.infradead.org. [2001:1868:205::9]) by mx.google.com with ESMTPS id ez3si46640387pab.130.2015.11.15.17.59.15 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sun, 15 Nov 2015 17:59:16 -0800 (PST) Received-SPF: pass (google.com: domain of linux-arm-kernel-bounces+patch=linaro.org@lists.infradead.org designates 2001:1868:205::9 as permitted sender) client-ip=2001:1868:205::9; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-arm-kernel-bounces+patch=linaro.org@lists.infradead.org designates 2001:1868:205::9 as permitted sender) smtp.mailfrom=linux-arm-kernel-bounces+patch=linaro.org@lists.infradead.org; dkim=neutral (body hash did not verify) header.i=@linaro_org.20150623.gappssmtp.com Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux)) id 1Zy92p-0007Tq-KY; Mon, 16 Nov 2015 01:57:23 +0000 Received: from mail-pa0-x22e.google.com ([2607:f8b0:400e:c03::22e]) by bombadil.infradead.org with esmtps (Exim 4.80.1 #2 (Red Hat Linux)) id 1Zy92l-0007ST-BV for linux-arm-kernel@lists.infradead.org; Mon, 16 Nov 2015 01:57:21 +0000 Received: by pacdm15 with SMTP id dm15so157747628pac.3 for ; Sun, 15 Nov 2015 17:56:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro_org.20150623.gappssmtp.com; s=20150623; h=subject:to:references:cc:from:message-id:date:user-agent :mime-version:in-reply-to:content-type:content-transfer-encoding; bh=jFesQ9imJlMKqG6PVj8NEGZsfGOmh843mnP0OsFMoJE=; b=FZN+gQn7g4m/FC+TrXZEza4H2nY41VgP4TEP2w0jvKMFKC4pOH5LkG3yqDW2b7L38r wt7MSKTkHznpRxOU6y0cGvMx97AAmnWXrLFWSTJn8ioFoPVM7fe8b2tWF38Yw0HqoOtZ ImhhQPuC9DxxxWOtOB80tv7sfg6aj9ZpurxxjLJsesGgy3H2xyjHQDUeRwKFtarIS2ls wPssDczA/GQfIflK5IyEp5y/wMdg03E0eU7emy03E3osrjHoQhFmImYff72Tk7ypLNbT v6u69N5bJAkoZZcjGC2GYHY9wgy3lN9gmwSXxcEhYjJSHOhh16Puyo1HYx5k3w8pHFKy 23lQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:subject:to:references:cc:from:message-id:date :user-agent:mime-version:in-reply-to:content-type :content-transfer-encoding; bh=jFesQ9imJlMKqG6PVj8NEGZsfGOmh843mnP0OsFMoJE=; b=dVs4QNmx21ujTwabvTice94qjdDg4QL/OX19U/nu8MDPzHlowsN6AxIE7SMrrdkMEt k/OPZD+xlrkIgBwDI2dZIFMQSzgO97T7YGhJngK/Arh1t/vjZTzzZ9Z5dEmwCiXV2PWj i6Nnrw5BODybnL9ntA6bqfXPJR8telMQePwvQ+hnwWb2vOv/7F2oRXshoKPonahN7MHD CXg/f+n6j/JSbLkvesRRu/6ZoMQeyppllFIjfp5tFxQTh8lzo0NkR9Rg9ilNaBcmrg7+ 32qSj37DQAazTh9lE2RC0v8ey14YEbGLMRYR1Cfd/cz6Pk7qVgIbFW7C0+rVD0MtuDw3 Tz1g== X-Gm-Message-State: ALoCoQlgqfdH119YhfpCcRsQG1+t+WHns6wq6L8k84so/dhCjbg7eLsIFhNhua6Z2MgIrrieDiKb X-Received: by 10.66.253.135 with SMTP id aa7mr50332102pad.14.1447639017661; Sun, 15 Nov 2015 17:56:57 -0800 (PST) Received: from [192.168.1.225] (61-205-4-81m5.grp1.mineo.jp. [61.205.4.81]) by smtp.googlemail.com with ESMTPSA id xr8sm26390285pab.26.2015.11.15.17.56.54 (version=TLSv1/SSLv3 cipher=OTHER); Sun, 15 Nov 2015 17:56:56 -0800 (PST) Subject: Re: arm64 function_graph tracer panic with CONFIG_DYNAMIC_FTRACE To: Catalin Marinas References: <20151112104243.GF5627@e104818-lin.cambridge.arm.com> <56457A4B.601@linaro.org> From: AKASHI Takahiro Message-ID: <564937E3.3090501@linaro.org> Date: Mon, 16 Nov 2015 10:56:51 +0900 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0 MIME-Version: 1.0 In-Reply-To: <56457A4B.601@linaro.org> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20151115_175719_468682_9D3F81A7 X-CRM114-Status: GOOD ( 31.99 ) X-Spam-Score: -2.6 (--) X-Spam-Report: SpamAssassin version 3.4.0 on bombadil.infradead.org summary: Content analysis details: (-2.6 points) pts rule name description ---- ---------------------- -------------------------------------------------- -0.7 RCVD_IN_DNSWL_LOW RBL: Sender listed at http://www.dnswl.org/, low trust [2607:f8b0:400e:c03:0:0:0:22e listed in] [list.dnswl.org] -0.0 SPF_PASS SPF: sender matches SPF record -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.0000] 0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jungseok Lee , linux-arm-kernel@lists.infradead.org, Steven Rostedt Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patch=linaro.org@lists.infradead.org Catalin, On 11/13/2015 02:51 PM, AKASHI Takahiro wrote: > Catalin, > (add lakml back to Cc) > > On 11/13/2015 01:01 AM, Catalin Marinas wrote: > > On Fri, Nov 13, 2015 at 12:00:42AM +0900, AKASHI Takahiro wrote: > >> If possible, please try again with "select HAVE_FUNCTION_GRAPH_FP_TEST" added > >> to CONFIG_ARM64. > >> It will insert an additional check. I hope this will help. > > > > So it looks the the frame is off by some amount: > > I was able to reproduce the problem on hikey, too. > > > # echo function_graph > /sys/kernel/debug/tracing/current_tracer > > Bad frame pointer: expected ffffffc976b3bd50, received ffffffc976b3be60 > > The difference of fp's was 0x110(sizeof(pt_reg_t)=0x120). > It seems that an interrupt happened here (and we might have unwound one extra frame?). > > > from func psci_cpu_suspend return to ffffffc00008d058 > > Always in the same place. > > > ------------[ cut here ]------------ > > WARNING: at /work/Linux/linux-2.6-aarch64/kernel/trace/trace_functions_graph.c:223 > > Modules linked in: > > > > CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.3.0+ #584 > > Hardware name: Juno (DT) > > task: ffffffc976b22f00 ti: ffffffc976b38000 task.ti: ffffffc976b38000 > > PC is at ftrace_return_to_handler+0x11c/0x150 > > LR is at ftrace_return_to_handler+0x11c/0x150 > > pc : [] lr : [] pstate: 800001c5 > > sp : ffffffc976b3bdd0 > > x29: ffffffc976b3bdd0 x28: ffffffc0006c9000 > > x27: ffffffc000a29f20 x26: ffffffc976b3bf70 > > x25: 000000cc2c13e054 x24: 0000000000000004 > > x23: ffffffc000aeb790 x22: 0000000000000002 > > x21: ffffffc975503200 x20: ffffffc000aeb850 > > x19: 0000000000000002 x18: ffffffc000b3a920 > > x17: 0000000000000000 x16: 0000000000000000 > > x15: ffffffc000b3a5bf x14: 3739636666666666 > > x13: 6620646576696563 x12: 6572202c30356462 > > x11: 3362363739636666 x10: 00000000000000fc > > x9 : ffffffc9768070c0 x8 : 0000000000001dd0 > > x7 : ffffffc9760f4c80 x6 : ffffffc9760f4c00 > > x5 : 00000000000084c0 x4 : 0000000000000001 > > x3 : ffffffc976b3bc50 x2 : 0000000000000001 > > x1 : 000000000000007f x0 : 000000000000007f > > > > ---[ end trace ac97c44ba7dabcce ]--- > > Call trace: > > [] ftrace_return_to_handler+0x11c/0x150 > > [] return_to_handler+0x1c/0x40 > > [] return_to_handler+0x0/0x40 > > [] return_to_handler+0x0/0x40 > > [] return_to_handler+0x0/0x40 > > [] return_to_handler+0x0/0x40 > > [] return_to_handler+0x0/0x40 > > [] return_to_handler+0x0/0x40 > > These call stacks were: > (psci_cpu_suspend) > psci_suspend_finisher > (cpu_suspend_enter?) > cpu_suspend > cpu_psci_cpu_suspend > arm_cpuidle_suspend > arm_enter_idle_state > cpuidle_enter_state > cpuidle_enter > call_cpuidle > (cpu_startup_entry?) > > and if cpuidle is turned off(cpuidle.off=1), the panic never happens. > That is all what I know at this moment. I think I fixed the problem. As you can see stack dump traces above, psci_cpu_suspend() and psci_suspend_finisher() are called in cpu suspend path, but they never return in cpu resume path and cpu_suspend() will resume directly via cpu_resume(). So those two functions should not be ftrace'd. Please try this patch: ----8<---- From c3aaaab52cf51879ae8e112f80790075425ba9be Mon Sep 17 00:00:00 2001 From: AKASHI Takahiro Date: Mon, 16 Nov 2015 10:29:21 +0900 Subject: [PATCH] arm64: disable ftrace in suspend path Signed-off-by: AKASHI Takahiro --- arch/arm64/kernel/psci.c | 2 +- arch/arm64/kernel/suspend.c | 2 +- drivers/firmware/psci.c | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) -- 1.7.9.5 ---->8---- There are some other functions which are called by cpu_suspend(), e.g. psci_system_suspend(). Should we apply a similar fix to them? Thanks, -Takahiro AKASHI > -Takahiro AKASHI > > > > [] secondary_start_kernel+0x130/0x158 > > Unable to handle kernel NULL pointer dereference at virtual address 00000000 > > pgd = ffffffc000b68000 > > [00000000] *pgd=00000009f69ea003, *pud=00000009f69ea003, *pmd=00000009f69eb003, *pte=006000002c010707 > > Internal error: Oops: 96000006 [#1] PREEMPT SMP > [snip] > > > On 11/12/2015 07:42 PM, Catalin Marinas wrote: >> Hi Takahiro, >> >> I was trying to assess what's working/not working in ftrace on arm64 and >> whether your recent patches are meant for 4.4 or can wait until 4.5. >> However, without any patches applied on a 4.3 kernel, enabling the >> function_graph tracer when CONFIG_DYNAMIC_FTRACE is enabled panics the >> kernel. Strangely, it does not happen if only a single CPU is on. >> However, with function_graph tracer on, onlining a second CPU leads to a >> similar panic. >> >> Below is the kernel panic with 2 CPUs online on a Juno board. There >> isn't much stack trace information for CPU 5 where the fault happened, >> the frame pointer (x29) seems corrupted as well. Any idea? Thanks. >> >> # echo function_graph > /sys/kernel/debug/tracing/current_tracer >> swapper/5[0]: undefined instruction: pc=ffffffc000a46698 >> Code: 0009ccf8 ffffffc0 00000000 00000000 (00000030) >> Internal error: Oops - undefined instruction: 0 [#1] PREEMPT SMP >> Modules linked in: >> CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.3.0 #574 >> Hardware name: Juno (DT) >> task: ffffffc976b23ac0 ti: ffffffc976b3c000 task.ti: ffffffc976b3c000 >> PC is at cpu_online_bits+0x0/0x8 >> LR is at cpu_online_bits+0x0/0x8 >> pc : [] lr : [] pstate: 600001c5 >> sp : ffffffc976b3ff40 >> x29: 0000000000000002 x28: ffffffc000100c0c >> x27: ffffffc000aeb790 x26: ffffffc9758d5800 >> x25: 0000000000000002 x24: ffffffc0000924a0 >> x23: ffffffc976b3ff40 x22: ffffffc00056841c >> x21: ffffffc000a25b80 x20: ffffffc000aeb790 >> x19: ffffffc9758d5800 x18: 0000000000000000 >> x17: 0000000000000000 x16: 0000000000000000 >> x15: 0000000000000000 x14: 0000000000000000 >> x13: 0000000000000000 x12: 0000000034d5d91d >> x11: 0000000000000000 x10: 0000000000001000 >> x9 : ffffffc9768070c0 x8 : 00000000000007d8 >> x7 : 0000001b280eb7c4 x6 : 0000000000000015 >> x5 : 000000000000000a x4 : ffffffc0792a4990 >> x3 : ffffffffffffffa1 x2 : ffffffc0792a4800 >> x1 : ffffffc0000949e0 x0 : ffffffffffffffa1 >> >> Process swapper/5 (pid: 0, stack limit = 0xffffffc976b3c020) >> Stack: (0xffffffc976b3ff40 to 0xffffffc976b40000) >> ff40: ffffffc976b3ff70 ffffffc0000924a0 ffffffc976b3c000 ffffffc000a46000 >> ff60: ffffffc000b20000 0000001b27d716c0 ffffffc976b3ffd0 ffffffc000090208 >> ff80: 0000000000000005 0000000000000e12 ffffffc000b306a0 0000000000000000 >> ffa0: 0000000000000000 0000000000000000 0000000080b65000 0000000080b68000 >> ffc0: ffffffc0000827f0 0000000000000000 0000000000000000 00000000800827dc >> ffe0: 0000000000000000 0000000000000000 068824085611a721 280b8a0b05808446 >> Call trace: >> Code: 0009ccf8 ffffffc0 00000000 00000000 (00000030) >> ---[ end trace 702bb2ed35464601 ]--- >> Kernel panic - not syncing: Attempted to kill the idle task! >> CPU4: stopping >> CPU: 4 PID: 0 Comm: swapper/4 Tainted: G D 4.3.0 #574 >> Hardware name: Juno (DT) >> Call trace: >> [] dump_backtrace+0x0/0x150 >> [] show_stack+0x24/0x30 >> [] dump_stack+0x90/0xd0 >> [] handle_IPI+0x280/0x290 >> [] gic_handle_irq+0x94/0xb0 >> Exception stack(0xffffffc976b3bd70 to 0xffffffc976b3be90) >> bd60: 0000001b2ed1ee8c ffffffc000aeb850 >> bd80: ffffffc976b3bec0 ffffffc0005681dc 0000000080000145 ffffffc00009244c >> bda0: 0000000000000000 0000000000000001 ffffffc976b38000 0000000000000003 >> bdc0: ffffffc976b38000 ffffffc000a282a8 ffffffc976b38000 ffffffc0000ecb28 >> bde0: 00000000000006ed ffffffc9768070c0 00000000000007e0 ffffffc0006d04a8 >> be00: 0000000000000001 0000000000005a0e 0000000000000001 ffffffc0006d04a8 >> be20: 0000000000000000 0000000000000000 0000000000000001 0000001b2ed1ee8c >> be40: ffffffc000aeb850 ffffffc9758d5600 0000000000000000 ffffffc000aeb790 >> be60: 0000000000000004 0000001b2e40a7ec ffffffc976b3bf70 ffffffc000a29f20 >> be80: ffffffc0006c9000 ffffffc976b3bec0 >> [] el1_irq+0x6c/0xe0 >> [] cpuidle_enter+0x34/0x48 >> [] call_cpuidle+0x48/0x80 >> [] cpu_startup_entry+0x228/0x308 >> [] secondary_start_kernel+0x130/0x158 >> ---[ end Kernel panic - not syncing: Attempted to kill the idle task! >> _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel diff --git a/arch/arm64/kernel/psci.c b/arch/arm64/kernel/psci.c index f67f35b..78414d7 100644 --- a/arch/arm64/kernel/psci.c +++ b/arch/arm64/kernel/psci.c @@ -178,7 +178,7 @@ static int cpu_psci_cpu_kill(unsigned int cpu) } #endif -static int psci_suspend_finisher(unsigned long index) +static int notrace psci_suspend_finisher(unsigned long index) { u32 *state = __this_cpu_read(psci_power_state); diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c index 44ca414..12280ef 100644 --- a/arch/arm64/kernel/suspend.c +++ b/arch/arm64/kernel/suspend.c @@ -19,7 +19,7 @@ extern int __cpu_suspend_enter(unsigned long arg, int (*fn)(unsigned long)); * save_ptr: address of the location where the context physical address * must be saved */ -void notrace __cpu_suspend_save(struct cpu_suspend_ctx *ptr, +void __cpu_suspend_save(struct cpu_suspend_ctx *ptr, phys_addr_t *save_ptr) { *save_ptr = virt_to_phys(ptr); diff --git a/drivers/firmware/psci.c b/drivers/firmware/psci.c index 838298f..8fce2c5 100644 --- a/drivers/firmware/psci.c +++ b/drivers/firmware/psci.c @@ -129,7 +129,7 @@ static u32 psci_get_version(void) return invoke_psci_fn(PSCI_0_2_FN_PSCI_VERSION, 0, 0, 0); } -static int psci_cpu_suspend(u32 state, unsigned long entry_point) +static int notrace psci_cpu_suspend(u32 state, unsigned long entry_point) { int err; u32 fn;