From patchwork Tue Feb 17 11:34:39 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiri Slaby
X-Patchwork-Id: 44741
From: Jiri Slaby
To: stable@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Alex Elder, Jiri Slaby
Subject: [PATCH 3.12 092/122] rbd: drop an unsafe assertion
Date: Tue, 17 Feb 2015 12:34:39 +0100
Message-Id: <15c029a998d463691fc36b55826d136b2b6b67f5.1424099974.git.jslaby@suse.cz>
X-Mailer: git-send-email 2.2.2
In-Reply-To: <09e6fe32192a77f6e2e60cc0f4103e630c7ecf20.1424099973.git.jslaby@suse.cz>
References: <09e6fe32192a77f6e2e60cc0f4103e630c7ecf20.1424099973.git.jslaby@suse.cz>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: list
X-Mailing-List: linux-kernel@vger.kernel.org

From: Alex Elder

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 638c323c4d1f8eaf25224946e21ce8818f1bcee1 upstream.

Olivier Bonvalet reported having repeated crashes due to a failed
assertion he was hitting in rbd_img_obj_callback():

    Assertion failure in rbd_img_obj_callback() at line 2165:
        rbd_assert(which >= img_request->next_completion);

With a lot of help from Olivier with reproducing the problem
we were able to determine the object and image requests had
already been completed (and often freed) at the point the
assertion failed.

There was a great deal of discussion on the ceph-devel mailing list
about this.  The problem only arose when there were two (or more)
object requests in an image request, and the problem was always
seen when the second request was being completed.

The problem is due to a race in the window between setting
the "done" flag on an object request and checking the image
request's next completion value.
When the first object request completes, it checks to see if its
successor request is marked "done", and if so, that request is also
completed.  In the process, the image request's next_completion
value is updated to reflect that both the first and second requests
are completed.  By the time the second request is able to check the
next_completion value, it has been set to a value *greater* than its
own "which" value, which caused an assertion to fail.

Fix this problem by skipping over any completion processing
unless the completing object request is the next one expected.
Test only for inequality (not >=), and eliminate the bad
assertion.

Tested-by: Olivier Bonvalet
Signed-off-by: Alex Elder
Reviewed-by: Sage Weil
Reviewed-by: Ilya Dryomov
Signed-off-by: Jiri Slaby
---
 drivers/block/rbd.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 5544f254175d..2eb4458f4ba8 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -2137,7 +2137,6 @@ static void rbd_img_obj_callback(struct rbd_obj_request *obj_request)
 	rbd_assert(img_request->obj_request_count > 0);
 	rbd_assert(which != BAD_WHICH);
 	rbd_assert(which < img_request->obj_request_count);
-	rbd_assert(which >= img_request->next_completion);
 
 	spin_lock_irq(&img_request->completion_lock);
 	if (which != img_request->next_completion)