From patchwork Wed Jul 2 14:12:59 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Thompson
X-Patchwork-Id: 32990
From: Daniel Thompson
To: Jason Wessel
Cc: Daniel Thompson, linux-kernel@vger.kernel.org, patches@linaro.org,
 linaro-kernel@lists.linaro.org, Mike Travis, Randy Dunlap,
 Dimitri Sivanich, Andrew Morton, Borislav Petkov,
 kgdb-bugreport@lists.sourceforge.net
Subject: [PATCH v2] kgdb: Timeout if secondary CPUs ignore the roundup
Date: Wed, 2 Jul 2014 15:12:59 +0100
Message-Id: <1404310379-30228-1-git-send-email-daniel.thompson@linaro.org>
X-Mailer: git-send-email 1.9.3
In-Reply-To: <1404224174-25024-1-git-send-email-daniel.thompson@linaro.org>
References: <1404224174-25024-1-git-send-email-daniel.thompson@linaro.org>

Currently if an active CPU fails to respond to a roundup request the
CPU that requested the roundup will become stuck. This needlessly
reduces the robustness of the debugger.

This patch introduces a timeout allowing the system state to be examined
even when the system contains unresponsive processors. It also modifies
kdb's cpu command to make it censor attempts to switch to unresponsive
processors and to report their state as (D)ead.

Signed-off-by: Daniel Thompson
Cc: Jason Wessel
Cc: Mike Travis
Cc: Randy Dunlap
Cc: Dimitri Sivanich
Cc: Andrew Morton
Cc: Borislav Petkov
Cc: kgdb-bugreport@lists.sourceforge.net
---

Notes:
    Changes since v1:

    - Set CATASTROPHIC if the system contains unresponsive processors
      (Jason Wessel)

 kernel/debug/debug_core.c       | 9 +++++++--
 kernel/debug/kdb/kdb_debugger.c | 4 ++++
 kernel/debug/kdb/kdb_main.c     | 4 +++-
 3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index 1adf62b..acd7497 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -471,6 +471,7 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs,
 	int cpu;
 	int trace_on = 0;
 	int online_cpus = num_online_cpus();
+	u64 time_left;
 
 	kgdb_info[ks->cpu].enter_kgdb++;
 	kgdb_info[ks->cpu].exception_state |= exception_state;
@@ -595,9 +596,13 @@ return_normal:
 	/*
 	 * Wait for the other CPUs to be notified and be waiting for us:
 	 */
-	while (kgdb_do_roundup && (atomic_read(&masters_in_kgdb) +
-				atomic_read(&slaves_in_kgdb)) != online_cpus)
+	time_left = loops_per_jiffy * HZ;
+	while (kgdb_do_roundup && --time_left &&
+	       (atomic_read(&masters_in_kgdb) + atomic_read(&slaves_in_kgdb)) !=
+		   online_cpus)
 		cpu_relax();
+	if (!time_left)
+		pr_crit("KGDB: Timed out waiting for secondary CPUs.\n");
 
 	/*
 	 * At this point the primary processor is completely
diff --git a/kernel/debug/kdb/kdb_debugger.c b/kernel/debug/kdb/kdb_debugger.c
index 8859ca3..15e1a7a 100644
--- a/kernel/debug/kdb/kdb_debugger.c
+++ b/kernel/debug/kdb/kdb_debugger.c
@@ -129,6 +129,10 @@ int kdb_stub(struct kgdb_state *ks)
 		ks->pass_exception = 1;
 		KDB_FLAG_SET(CATASTROPHIC);
 	}
+	/* set CATASTROPHIC if the system contains unresponsive processors */
+	for_each_online_cpu(i)
+		if (!kgdb_info[i].enter_kgdb)
+			KDB_FLAG_SET(CATASTROPHIC);
 	if (KDB_STATE(SSBPT) && reason == KDB_REASON_SSTEP) {
 		KDB_STATE_CLEAR(SSBPT);
 		KDB_STATE_CLEAR(DOING_SS);
diff --git a/kernel/debug/kdb/kdb_main.c b/kernel/debug/kdb/kdb_main.c
index 2f7c760..49f2425 100644
--- a/kernel/debug/kdb/kdb_main.c
+++ b/kernel/debug/kdb/kdb_main.c
@@ -2157,6 +2157,8 @@ static void kdb_cpu_status(void)
 	for (start_cpu = -1, i = 0; i < NR_CPUS; i++) {
 		if (!cpu_online(i)) {
 			state = 'F';	/* cpu is offline */
+		} else if (!kgdb_info[i].enter_kgdb) {
+			state = 'D';	/* cpu is online but unresponsive */
 		} else {
 			state = ' ';	/* cpu is responding to kdb */
 			if (kdb_task_state_char(KDB_TSK(i)) == 'I')
@@ -2210,7 +2212,7 @@ static int kdb_cpu(int argc, const char **argv)
 	/*
 	 * Validate cpunum
 	 */
-	if ((cpunum > NR_CPUS) || !cpu_online(cpunum))
+	if ((cpunum > NR_CPUS) || !kgdb_info[cpunum].enter_kgdb)
 		return KDB_BADCPUNUM;
 
 	dbg_switch_cpu = cpunum;
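
For readers who want to experiment with the bounded wait outside the kernel, the idea behind the debug_core.c hunk can be modelled in user space roughly as follows. This is only a sketch under stated assumptions, not the kernel code: cpus_in_debugger and wait_for_roundup() are hypothetical names standing in for the masters_in_kgdb/slaves_in_kgdb counters and the open-coded loop, C11 atomics replace atomic_read(), and the result is taken from the counter, whereas the patch tests time_left.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for masters_in_kgdb + slaves_in_kgdb. */
static atomic_int cpus_in_debugger;

/*
 * Spin until `expected` CPUs have checked in or the iteration budget
 * is exhausted. Returns true if every CPU checked in, false otherwise.
 * The budget must be non-zero, because the pre-decrement below would
 * wrap around on zero.
 */
static bool wait_for_roundup(int expected, uint64_t budget)
{
	uint64_t time_left = budget;

	assert(budget > 0);

	while (--time_left && atomic_load(&cpus_in_debugger) != expected)
		; /* the kernel loop calls cpu_relax() here */

	return atomic_load(&cpus_in_debugger) == expected;
}
```

A caller would treat a false return as the timed-out case and carry on with the CPUs that did respond, which is where the patch emits its pr_crit() message.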