From patchwork Fri Jul 31 20:00:41 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 51795 Return-Path: X-Original-To: linaro@patches.linaro.org Delivered-To: linaro@patches.linaro.org Received: from mail-la0-f72.google.com (mail-la0-f72.google.com [209.85.215.72]) by patches.linaro.org (Postfix) with ESMTPS id CE72122D9C for ; Fri, 31 Jul 2015 20:01:03 +0000 (UTC) Received: by lafd3 with SMTP id d3sf27677386laf.1 for ; Fri, 31 Jul 2015 13:01:02 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:delivered-to:mailing-list:precedence:list-id :list-unsubscribe:list-subscribe:list-archive:list-post:list-help :sender:delivered-to:message-id:date:from:user-agent:mime-version:to :cc:subject:references:in-reply-to:content-type :content-transfer-encoding:x-original-sender :x-original-authentication-results; bh=4d2TEiP0BVVyJ1acnmL0bwzkqrfcs+d/J/9Xa6ABGUg=; b=e+psrj/zOawBFxIuxFT/mOkq22SCEkc/L4D7qJyyeSYNLSANzrRkTU1iV+ye4m4i/o 9vhgUu4g3kHSz7KYmo2IEKT1wAUOyAX17Pv6TOchK2RZyZbr3JLNTLA5JkxiBJ174VSs ZkYSj9HfLyLiIxX8mAva9IWE6cyWM6SPJZQjascEvw9KPf2lNFkP0GktFzWvb8xNSDVj X/AVDDdNcL1QlBp0c0CVznf/jeQPFsvyWiz6BxTQ2AiZX1KVdhulXDQPmveugumzCUph 4E3q+FktfG1mZyKECWsdx91ZwOzxy+zdmlkIPQ7hxiJyW28bl0D+CBo5fP52DcU5ZxcI I7qQ== X-Gm-Message-State: ALoCoQmqWMwYxztKOS8Ut2s/KOv0wwAp0gCMIANZsQq2YxZiAzLnUs5lh7einpPudQF5WB2WPs3o X-Received: by 10.112.13.200 with SMTP id j8mr1428296lbc.14.1438372862815; Fri, 31 Jul 2015 13:01:02 -0700 (PDT) X-BeenThere: patchwork-forward@linaro.org Received: by 10.152.19.39 with SMTP id b7ls333960lae.85.gmail; Fri, 31 Jul 2015 13:01:02 -0700 (PDT) X-Received: by 10.112.42.102 with SMTP id n6mr1838957lbl.96.1438372862662; Fri, 31 Jul 2015 13:01:02 -0700 (PDT) Received: from mail-la0-x22e.google.com (mail-la0-x22e.google.com. [2a00:1450:4010:c03::22e]) by mx.google.com with ESMTPS id zd10si4757788lbb.169.2015.07.31.13.01.02 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 31 Jul 2015 13:01:02 -0700 (PDT) Received-SPF: pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 2a00:1450:4010:c03::22e as permitted sender) client-ip=2a00:1450:4010:c03::22e; Received: by lacct8 with SMTP id ct8so17510224lac.2 for ; Fri, 31 Jul 2015 13:01:02 -0700 (PDT) X-Received: by 10.152.207.76 with SMTP id lu12mr4966931lac.29.1438372862549; Fri, 31 Jul 2015 13:01:02 -0700 (PDT) X-Forwarded-To: patchwork-forward@linaro.org X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org Delivered-To: patch@linaro.org Received: by 10.112.7.198 with SMTP id l6csp578527lba; Fri, 31 Jul 2015 13:01:00 -0700 (PDT) X-Received: by 10.202.231.210 with SMTP id e201mr5156125oih.88.1438372859915; Fri, 31 Jul 2015 13:00:59 -0700 (PDT) Received: from sourceware.org (server1.sourceware.org. [209.132.180.131]) by mx.google.com with ESMTPS id lj11si12547500pab.25.2015.07.31.13.00.58 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 31 Jul 2015 13:00:59 -0700 (PDT) Received-SPF: pass (google.com: domain of libc-alpha-return-61546-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Received: (qmail 53405 invoked by alias); 31 Jul 2015 20:00:50 -0000 Mailing-List: list patchwork-forward@linaro.org; contact patchwork-forward+owners@linaro.org Precedence: list List-Id: List-Unsubscribe: , List-Subscribe: List-Archive: List-Post: , List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 53387 invoked by uid 89); 31 Jul 2015 20:00:50 -0000 X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.2 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2 X-HELO: mail-yk0-f170.google.com X-Received: by 10.170.49.197 with SMTP id 188mr5250242ykr.87.1438372845722; Fri, 31 Jul 2015 13:00:45 -0700 (PDT) Message-ID: <55BBD3E9.8040005@linaro.org> Date: Fri, 31 Jul 2015 17:00:41 -0300 From: Adhemerval Zanella User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.8.0 MIME-Version: 1.0 To: Tulio Magno Quites Machado Filho CC: libc-alpha@sourceware.org Subject: Re: [PATCH][BZ 18743] PowerPC: Fix a race condition when eliding a lock References: <1438274936-26493-1-git-send-email-tuliom@linux.vnet.ibm.com> <55BA703D.7010303@linaro.org> <874mkl3wtq.fsf@totoro.lan> <55BB76FA.5040703@linaro.org> <87zj2c1ij1.fsf@totoro.lan> In-Reply-To: <87zj2c1ij1.fsf@totoro.lan> X-Original-Sender: adhemerval.zanella@linaro.org X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 2a00:1450:4010:c03::22e as permitted sender) smtp.mail=patch+caf_=patchwork-forward=linaro.org@linaro.org; dkim=pass header.i=@sourceware.org X-Google-Group-Id: 836684582541 On 31-07-2015 11:57, Tulio Magno Quites Machado Filho wrote: > Adhemerval Zanella writes: > >> On 30-07-2015 23:06, Tulio Magno Quites Machado Filho wrote: >>> Using pthread_rwlock_rdlock as an example, is_lock_free would be a preprocessor >>> token with the following contents: >>> >>> 128 if (ELIDE_LOCK (rwlock->__data.__rwelision, >>> 129 rwlock->__data.__lock == 0 >>> 130 && rwlock->__data.__writer == 0 >>> 131 && rwlock->__data.__nr_readers == 0)) >>> >>> And this is where the important memory access happens. >>> If one of these fields are changed by another thread after this point, but >>> before __builtin_tbegin, we're able to reproduce the first problem I mentioned >>> previously. >> >> My understanding is this fields will be changed only in the lock taken path. Let's >> say a thread t1 grab the lock (either by a transaction or by using a default lock) >> and modifies any 'rwlock->__data' fields after the ELIDE_LOCK check and before >> the transaction begin on a second thread t2. My understanding of the issue you >> are seeing is the transaction will initiate and since is_lock_free is 1 it won't >> abort and continue as the a lock taken in both threads. > > I agree with you. But this won't reproduce the bug. You have to change the > order of events. > > Let's say t1 is the first one to read rwlock->__data fields and they're all 0. > t1 will set is_lock_free to 0. I believe you mean is_lock_free will be set to '1' if all fields are '0' (since the lock will not be held). > Then t2 modifies rwlock->__data before t1 starts the transaction (as in: get a > lock, instead of a transaction). > After that, t1 starts the transaction. Here comes the problem: the memory > barrier is created with rwlock->__data saying that someone took the lock, > but is_lock_free is set to 0. In this scenario, t1 will proceed with the > transaction. > >> However my understanding is the transaction won't be able to commit since it >> will have a conflict in the 'rwlock->__data.lock' or in any memory being access >> in the critical region. > > Yes, but with the order of events I mentioned, when t1 calls > pthread_rwlock_unlock, it will run the following code: > > 38 if (ELIDE_UNLOCK (rwlock->__data.__writer == 0 > 39 && rwlock->__data.__nr_readers == 0)) > 40 return 0; > > Where ELIDE_UNLOCK is: > > 89 static inline bool > 90 __elide_unlock (int is_lock_free) > 91 { > 92 if (is_lock_free) > 93 { > 94 __builtin_tend (0); > 95 return true; > 96 } > 97 return false; > 98 } > 99 > 100 # define ELIDE_UNLOCK(is_lock_free) \ > 101 __elide_unlock (is_lock_free) > > Remember that t1 started the transaction even if rwlock->__data said someone > was holding the lock. > So now, in __elide_unlock, is_lock_free will be 1. __elide_unlock will > return false and t1 will unlock a lock acquired by t2. I see now a window where the pthread_rwlock_t internal structures might have concurrent access and the transaction is never finished. However I do not think this is a powerpc specific and thus I think it should be fixed in generic algorithm instead. Something like this: --- -- I did not fix the x86_64 code, which would require some adjustments. I checked on ppc64le and make nptl/tests passes. It would be good if you could create a testcase to stress this issue (and I do see this could be quite hard since it is very timing dependent). The problem with this approach is it couple the elide macros with rdlock fields, and I think the idea would make this header more generic for multiple elide algorithms. Suggestions? > >> Are you saying that the transaction is not being aborted >> in the critical region? > > No. I'm saying the memory access to rwlock->__data is critical and should > happen inside the transaction. Without this, we can't say the whole process > is atomic. > >> If compiler is doing memory reordering around HTM builtin it is *clearly* a bug >> since transaction begin/end acts a full memory barrier (sync). You should also >> add a compiler barrier in pthread elision code as well. > > Agreed. But I guess that's out of scope for this particular bug. > I could prepare a separate patch with these changes if you agree with that. > diff --git a/nptl/pthread_rwlock_rdlock.c b/nptl/pthread_rwlock_rdlock.c index eb7ac8d..3756bb6 100644 --- a/nptl/pthread_rwlock_rdlock.c +++ b/nptl/pthread_rwlock_rdlock.c @@ -126,9 +126,9 @@ __pthread_rwlock_rdlock (pthread_rwlock_t *rwlock) LIBC_PROBE (rdlock_entry, 1, rwlock); if (ELIDE_LOCK (rwlock->__data.__rwelision, - rwlock->__data.__lock == 0 - && rwlock->__data.__writer == 0 - && rwlock->__data.__nr_readers == 0)) + rwlock->__data.__lock, + rwlock->__data.__writer, + rwlock->__data.__nr_readers)) return 0; /* Make sure we are alone. */ diff --git a/nptl/pthread_rwlock_tryrdlock.c b/nptl/pthread_rwlock_tryrdlock.c index 256188a..bb199e4 100644 --- a/nptl/pthread_rwlock_tryrdlock.c +++ b/nptl/pthread_rwlock_tryrdlock.c @@ -33,9 +33,9 @@ __pthread_rwlock_tryrdlock (pthread_rwlock_t *rwlock) rwlock->__data.__shared == LLL_PRIVATE ? FUTEX_PRIVATE : FUTEX_SHARED; if (ELIDE_TRYLOCK (rwlock->__data.__rwelision, - rwlock->__data.__lock == 0 - && rwlock->__data.__nr_readers == 0 - && rwlock->__data.__writer, 0)) + rwlock->__data.__lock, + rwlock->__data.__writer, + rwlock->__data.__nr_readers, 0)) return 0; lll_lock (rwlock->__data.__lock, rwlock->__data.__shared); diff --git a/nptl/pthread_rwlock_trywrlock.c b/nptl/pthread_rwlock_trywrlock.c index 918a8f2..49c6757 100644 --- a/nptl/pthread_rwlock_trywrlock.c +++ b/nptl/pthread_rwlock_trywrlock.c @@ -28,9 +28,9 @@ __pthread_rwlock_trywrlock (pthread_rwlock_t *rwlock) int result = EBUSY; if (ELIDE_TRYLOCK (rwlock->__data.__rwelision, - rwlock->__data.__lock == 0 - && rwlock->__data.__nr_readers == 0 - && rwlock->__data.__writer, 1)) + rwlock->__data.__lock, + rwlock->__data.__writer, + rwlock->__data.__nr_readers, 1)) return 0; lll_lock (rwlock->__data.__lock, rwlock->__data.__shared); diff --git a/nptl/pthread_rwlock_unlock.c b/nptl/pthread_rwlock_unlock.c index bdd115d..0946d0c 100644 --- a/nptl/pthread_rwlock_unlock.c +++ b/nptl/pthread_rwlock_unlock.c @@ -35,8 +35,7 @@ __pthread_rwlock_unlock (pthread_rwlock_t *rwlock) LIBC_PROBE (rwlock_unlock, 1, rwlock); - if (ELIDE_UNLOCK (rwlock->__data.__writer == 0 - && rwlock->__data.__nr_readers == 0)) + if (ELIDE_UNLOCK (rwlock->__data.__writer, rwlock->__data.__nr_readers)) return 0; lll_lock (rwlock->__data.__lock, rwlock->__data.__shared); diff --git a/nptl/pthread_rwlock_wrlock.c b/nptl/pthread_rwlock_wrlock.c index 60fa909..0161876 100644 --- a/nptl/pthread_rwlock_wrlock.c +++ b/nptl/pthread_rwlock_wrlock.c @@ -98,9 +98,9 @@ __pthread_rwlock_wrlock (pthread_rwlock_t *rwlock) LIBC_PROBE (wrlock_entry, 1, rwlock); if (ELIDE_LOCK (rwlock->__data.__rwelision, - rwlock->__data.__lock == 0 - && rwlock->__data.__writer == 0 - && rwlock->__data.__nr_readers == 0)) + rwlock->__data.__lock, + rwlock->__data.__writer, + rwlock->__data.__nr_readers)) return 0; /* Make sure we are alone. */ diff --git a/sysdeps/generic/elide.h b/sysdeps/generic/elide.h index 80a3a03..64d9cce 100644 --- a/sysdeps/generic/elide.h +++ b/sysdeps/generic/elide.h @@ -15,11 +15,12 @@ You should have received a copy of the GNU Lesser General Public License along with the GNU C Library; if not, see . */ + #ifndef ELIDE_H #define ELIDE_H 1 -#define ELIDE_LOCK(adapt_count, is_lock_free) 0 -#define ELIDE_TRYLOCK(adapt_count, is_lock_free, write) 0 -#define ELIDE_UNLOCK(is_lock_free) 0 +#define ELIDE_LOCK(adapt_count, lock, writer, readers) 0 +#define ELIDE_TRYLOCK(adapt_count, lock, writer, readers, write) 0 +#define ELIDE_UNLOCK(writer, readers) 0 #endif diff --git a/sysdeps/powerpc/nptl/elide.h b/sysdeps/powerpc/nptl/elide.h index 389f5a5..b3cc50a 100644 --- a/sysdeps/powerpc/nptl/elide.h +++ b/sysdeps/powerpc/nptl/elide.h @@ -27,7 +27,7 @@ ADAPT_COUNT is a pointer to per-lock state variable. */ static inline bool -__elide_lock (uint8_t *adapt_count, int is_lock_free) +__elide_lock (uint8_t *adapt_count, int *lock, int *writer, unsigned int *readers) { if (*adapt_count > 0) { @@ -39,7 +39,10 @@ __elide_lock (uint8_t *adapt_count, int is_lock_free) { if (__builtin_tbegin (0)) { - if (is_lock_free) + /* The compiler barrier is required because some GCC version might + reorder the lock read before the transaction init builtin. */ + asm volatile("" ::: "memory"); + if ((*lock == 0) && (*writer == 0) && (*readers == 0)) return true; /* Lock was busy. */ __builtin_tabort (_ABORT_LOCK_BUSY); @@ -66,30 +69,31 @@ __elide_lock (uint8_t *adapt_count, int is_lock_free) return false; } -# define ELIDE_LOCK(adapt_count, is_lock_free) \ - __elide_lock (&(adapt_count), is_lock_free) +# define ELIDE_LOCK(adapt_count, lock, writer, readers) \ + __elide_lock (&(adapt_count), &(lock), &(writer), &(readers)) static inline bool -__elide_trylock (uint8_t *adapt_count, int is_lock_free, int write) +__elide_trylock (uint8_t *adapt_count, int *lock, int *writer, + unsigned int *readers, int write) { if (__elision_aconf.try_tbegin > 0) { if (write) __builtin_tabort (_ABORT_NESTED_TRYLOCK); - return __elide_lock (adapt_count, is_lock_free); + return __elide_lock (adapt_count, lock, writer, readers); } return false; } -# define ELIDE_TRYLOCK(adapt_count, is_lock_free, write) \ - __elide_trylock (&(adapt_count), is_lock_free, write) +# define ELIDE_TRYLOCK(adapt_count, lock, writer, readers, write) \ + __elide_trylock (&(adapt_count), &(lock), &(writer), &(readers), write) static inline bool -__elide_unlock (int is_lock_free) +__elide_unlock (int *writer, unsigned int *readers) { - if (is_lock_free) + if ((*writer == 0) && (*readers == 0)) { __builtin_tend (0); return true; @@ -97,14 +101,14 @@ __elide_unlock (int is_lock_free) return false; } -# define ELIDE_UNLOCK(is_lock_free) \ - __elide_unlock (is_lock_free) +# define ELIDE_UNLOCK(writer, readers) \ + __elide_unlock (&(writer), &(readers)) # else -# define ELIDE_LOCK(adapt_count, is_lock_free) 0 -# define ELIDE_TRYLOCK(adapt_count, is_lock_free, write) 0 -# define ELIDE_UNLOCK(is_lock_free) 0 +# define ELIDE_LOCK(adapt_count, lock, writer, readers) 0 +# define ELIDE_TRYLOCK(adapt_count, lock, writer, readers, write) 0 +# define ELIDE_UNLOCK(writer, readers) 0 #endif /* ENABLE_LOCK_ELISION */