From patchwork Thu Jan 16 16:54:59 2025
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 857927
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: Lorenzo Stoakes, Cristian Rodríguez
Subject: [PATCH v2] nptl: Add support for setup guard pages with MADV_GUARD_INSTALL
Date: Thu, 16 Jan 2025 13:54:59 -0300
Message-ID: <20250116165557.2289386-1-adhemerval.zanella@linaro.org>
X-Mailer: git-send-email 2.43.0

Linux 6.13 (662df3e5c3766) added a lightweight way to define guard areas
through the madvise syscall.  Instead of mprotecting the guard region
with PROT_NONE, userland can madvise the same area with a special flag,
and the kernel ensures that accessing the area will trigger a SIGSEGV
(as with a PROT_NONE mapping).

The madvise way has the advantage of lower kernel memory consumption for
the process page table (one less VMA per guard area), and slightly less
kernel contention (also due to fewer VMAs being tracked).

pthread_create allocates a new thread stack in one of two ways: if a
guard area is set (the default), it allocates the required memory range
with PROT_NONE and then mprotects the usable stack area.  Otherwise, if
a guard page is not set, it allocates the region with the required
flags.

With MADV_GUARD_INSTALL support, the stack region is allocated with the
required flags and the guard region is then installed with madvise.  If
the kernel does not support it, the usual way is used instead (and
MADV_GUARD_INSTALL is disabled for future stack creations).

The stack allocation strategy is recorded in the pthread struct, and it
is used in case the guard region needs to be resized.  To avoid an extra
field, 'user_stack' is repurposed and renamed to 'stack_mode'.

This patch also adds a proper test for the pthread guard.

I checked on x86_64, aarch64, powerpc64le, and hppa with kernel
6.13.0-rc7.

Changes from v1:
* Fixed MADV_GUARD_INSTALL on _STACK_GROWS_UP ABIs.
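For reference, the strategy can be illustrated with a small standalone
program (a sketch only, not part of the patch; the fallback define
mirrors the bits/mman-linux.h hunk below, and the guard placement
follows the _STACK_GROWS_DOWN layout):

/* Standalone illustration: allocate a stack-like mapping with its final
   protection flags, try to install a guard page with MADV_GUARD_INSTALL
   (Linux 6.13+), and fall back to mprotect/PROT_NONE on older kernels,
   which is the same strategy the patch applies in allocatestack.c.  */
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_GUARD_INSTALL
# define MADV_GUARD_INSTALL 102   /* Value added by the patch below.  */
#endif

int
main (void)
{
  long pagesz = sysconf (_SC_PAGESIZE);
  size_t size = 8 * pagesz;

  /* With madvise guards no separate PROT_NONE VMA is needed, so the
     whole range is mapped with the final protection flags up front.  */
  char *mem = mmap (NULL, size, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (mem == MAP_FAILED)
    return 1;

  /* Guard at the lowest address, as for a _STACK_GROWS_DOWN stack.  */
  if (madvise (mem, pagesz, MADV_GUARD_INSTALL) == 0)
    puts ("guard installed with MADV_GUARD_INSTALL");
  /* A failure (EINVAL) means the kernel predates the guard advice; use
     the traditional PROT_NONE mapping instead.  */
  else if (mprotect (mem, pagesz, PROT_NONE) == 0)
    puts ("guard installed with PROT_NONE fallback");
  else
    return 1;

  /* In either mode, touching mem[0] now raises SIGSEGV.  */
  munmap (mem, size);
  return 0;
}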
---
 nptl/Makefile                             |   1 +
 nptl/TODO-testing                         |   4 -
 nptl/allocatestack.c                      | 263 ++++++++++-----
 nptl/descr.h                              |   8 +-
 nptl/nptl-stack.c                         |   2 +-
 nptl/pthread_create.c                     |   2 +-
 nptl/tst-guard1.c                         | 369 ++++++++++++++++++++++
 sysdeps/nptl/dl-tls_init_tp.c             |   2 +-
 sysdeps/nptl/fork.h                       |   2 +-
 sysdeps/unix/sysv/linux/bits/mman-linux.h |   2 +
 10 files changed, 560 insertions(+), 95 deletions(-)
 create mode 100644 nptl/tst-guard1.c

diff --git a/nptl/Makefile b/nptl/Makefile
index b7c63999a3..b04e25cd0d 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -289,6 +289,7 @@ tests = \
   tst-dlsym1 \
   tst-exec4 \
   tst-exec5 \
+  tst-guard1 \
   tst-initializers1 \
   tst-initializers1-c11 \
   tst-initializers1-c89 \
diff --git a/nptl/TODO-testing b/nptl/TODO-testing
index f50d2ceb51..46ebf3bc5c 100644
--- a/nptl/TODO-testing
+++ b/nptl/TODO-testing
@@ -1,7 +1,3 @@
-pthread_attr_setguardsize
-
-  test effectiveness
-
 pthread_attr_[sg]etschedparam
 
   what to test?
diff --git a/nptl/allocatestack.c b/nptl/allocatestack.c
index 9c1a72bcf0..e2c9ac8143 100644
--- a/nptl/allocatestack.c
+++ b/nptl/allocatestack.c
@@ -146,10 +146,37 @@ get_cached_stack (size_t *sizep, void **memp)
   return result;
 }
 
+/* Assume support for MADV_GUARD_INSTALL; setup_stack_prot will disable it
+   and fall back to ALLOCATE_GUARD_PROT_NONE if the madvise call fails.  */
+static int allocate_stack_mode = ALLOCATE_GUARD_MADV_GUARD;
+
+static inline int stack_prot (void)
+{
+  return (PROT_READ | PROT_WRITE
+          | ((GL(dl_stack_flags) & PF_X) ? PROT_EXEC : 0));
+}
+
+static void *
+allocate_thread_stack (size_t size, size_t guardsize)
+{
+  /* MADV_GUARD_INSTALL does not require an additional PROT_NONE mapping.  */
+  int prot = stack_prot ();
+
+  if (atomic_load_relaxed (&allocate_stack_mode) == ALLOCATE_GUARD_PROT_NONE)
+    /* If a guard page is required, avoid committing memory by first
+       allocating with PROT_NONE and then reserving with the required
+       permissions, excluding the guard page.  */
+    prot = guardsize == 0 ? prot : PROT_NONE;
+
+  return __mmap (NULL, size, prot, MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK,
+                 -1, 0);
+}
+
+
 /* Return the guard page position on allocated stack.  */
 static inline char *
 __attribute ((always_inline))
-guard_position (void *mem, size_t size, size_t guardsize, struct pthread *pd,
+guard_position (void *mem, size_t size, size_t guardsize, const struct pthread *pd,
                 size_t pagesize_m1)
 {
 #if _STACK_GROWS_DOWN
@@ -159,27 +186,131 @@ guard_position (void *mem, size_t size, size_t guardsize, struct pthread *pd,
 #endif
 }
 
-/* Based on stack allocated with PROT_NONE, setup the required portions with
-   'prot' flags based on the guard page position.  */
-static inline int
-setup_stack_prot (char *mem, size_t size, char *guard, size_t guardsize,
-                  const int prot)
+/* Set up the MEM thread stack of SIZE bytes with the required protection
+   flags, along with a guard area of GUARDSIZE bytes.  It first tries
+   MADV_GUARD_INSTALL, and then falls back to setting up the guard area
+   using the extra PROT_NONE mapping.  Update PD with the type of guard
+   area setup.  */
+static inline bool
+setup_stack_prot (char *mem, size_t size, struct pthread *pd,
+                  size_t guardsize, size_t pagesize_m1)
 {
-  char *guardend = guard + guardsize;
+  if (__glibc_unlikely (guardsize == 0))
+    return true;
+
+  char *guard = guard_position (mem, size, guardsize, pd, pagesize_m1);
+  if (atomic_load_relaxed (&allocate_stack_mode) == ALLOCATE_GUARD_MADV_GUARD)
+    {
+      if (__madvise (guard, guardsize, MADV_GUARD_INSTALL) == 0)
+        {
+          pd->stack_mode = ALLOCATE_GUARD_MADV_GUARD;
+          return true;
+        }
+
+      /* If madvise fails it means the kernel does not support the guard
+         advice (we assume that the syscall is available, the guard is
+         page-aligned, and the length is non-negative).  The stack already
+         has the expected protection flags, so it just needs to PROT_NONE
+         the guard area.  */
+      atomic_store_relaxed (&allocate_stack_mode, ALLOCATE_GUARD_PROT_NONE);
+      if (__mprotect (guard, guardsize, PROT_NONE) != 0)
+        return false;
+    }
+  else
+    {
+      const int prot = stack_prot ();
+      char *guardend = guard + guardsize;
 #if _STACK_GROWS_DOWN
-  /* As defined at guard_position, for architectures with downward stack
-     the guard page is always at start of the allocated area.  */
-  if (__mprotect (guardend, size - guardsize, prot) != 0)
-    return errno;
+      /* As defined at guard_position, for architectures with downward stack
+         the guard page is always at start of the allocated area.  */
+      if (__mprotect (guardend, size - guardsize, prot) != 0)
+        return false;
 #else
-  size_t mprots1 = (uintptr_t) guard - (uintptr_t) mem;
-  if (__mprotect (mem, mprots1, prot) != 0)
-    return errno;
-  size_t mprots2 = ((uintptr_t) mem + size) - (uintptr_t) guardend;
-  if (__mprotect (guardend, mprots2, prot) != 0)
-    return errno;
+      size_t mprots1 = (uintptr_t) guard - (uintptr_t) mem;
+      if (__mprotect (mem, mprots1, prot) != 0)
+        return false;
+      size_t mprots2 = ((uintptr_t) mem + size) - (uintptr_t) guardend;
+      if (__mprotect (guardend, mprots2, prot) != 0)
+        return false;
 #endif
-  return 0;
+    }
+
+  pd->stack_mode = ALLOCATE_GUARD_PROT_NONE;
+  return true;
+}
+
+/* Update the guard area of the thread stack MEM of size SIZE with the new
+   GUARDSIZE.  It uses the method defined by the PD stack_mode.  */
+static inline bool
+adjust_stack_prot (char *mem, size_t size, const struct pthread *pd,
+                   size_t guardsize, size_t pagesize_m1)
+{
+  /* The required guard area is larger than the current one.  For
+     _STACK_GROWS_DOWN it means the guard should increase as:
+
+     |guard|stack---------------------------------|
+     |new guard--|stack---------------------------|
+
+     while for _STACK_GROWS_UP:
+
+     |stack---------------------------|guard|-----|
+     |stack--------------------|new guard---|-----|
+
+     Both madvise and mprotect allow overlapping the required region,
+     so use the new guard placement with the new size.  */
+  if (guardsize > pd->guardsize)
+    {
+      char *guard = guard_position (mem, size, guardsize, pd, pagesize_m1);
+      if (pd->stack_mode == ALLOCATE_GUARD_MADV_GUARD)
+        return __madvise (guard, guardsize, MADV_GUARD_INSTALL) == 0;
+      else if (pd->stack_mode == ALLOCATE_GUARD_PROT_NONE)
+        return __mprotect (guard, guardsize, PROT_NONE) == 0;
+    }
+  /* The current guard area is larger than the required one.  For
+     _STACK_GROWS_DOWN it means changing the guard as:
+
+     |guard-------|stack-------------------------|
+     |new guard|stack----------------------------|
+
+     And for _STACK_GROWS_UP:
+
+     |stack---------------------|guard-------|---|
+     |stack------------------------|new guard|---|
+
+     For ALLOCATE_GUARD_MADV_GUARD it means removing the slack area
+     (the disjoint region between guard and new guard), while for
+     ALLOCATE_GUARD_PROT_NONE it requires mprotecting it with the stack
+     protection flags.  */
+  else if (pd->guardsize > guardsize)
+    {
+      size_t slacksize = pd->guardsize - guardsize;
+      if (pd->stack_mode == ALLOCATE_GUARD_MADV_GUARD)
+        {
+          void *slack =
+#if _STACK_GROWS_DOWN
+            mem + guardsize;
+#else
+            guard_position (mem, size, pd->guardsize, pd, pagesize_m1);
+#endif
+          return __madvise (slack, slacksize, MADV_GUARD_REMOVE) == 0;
+        }
+      else if (pd->stack_mode == ALLOCATE_GUARD_PROT_NONE)
+        {
+          const int prot = stack_prot ();
+#if _STACK_GROWS_DOWN
+          return __mprotect (mem + guardsize, slacksize, prot) == 0;
+#else
+          char *new_guard = (char *)(((uintptr_t) pd - guardsize)
+                                     & ~pagesize_m1);
+          char *old_guard = (char *)(((uintptr_t) pd - pd->guardsize)
+                                     & ~pagesize_m1);
+          /* The guard size difference might be > 0, but once rounded
+             to the nearest page the size difference might be zero.  */
+          if (new_guard > old_guard
+              && __mprotect (old_guard, new_guard - old_guard, prot) != 0)
+            return false;
+#endif
+        }
+    }
+  return true;
 }
 
 /* Mark the memory of the stack as usable to the kernel.  It frees everything
@@ -291,7 +422,7 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp,
 
       /* This is a user-provided stack.  It will not be queued in the
          stack cache nor will the memory (except the TLS memory) be freed.  */
-      pd->user_stack = true;
+      pd->stack_mode = ALLOCATE_GUARD_USER;
 
       /* This is at least the second thread.  */
       pd->header.multiple_threads = 1;
@@ -325,10 +456,7 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp,
       /* Allocate some anonymous memory.  If possible use the cache.  */
       size_t guardsize;
       size_t reported_guardsize;
-      size_t reqsize;
       void *mem;
-      const int prot = (PROT_READ | PROT_WRITE
-                        | ((GL(dl_stack_flags) & PF_X) ? PROT_EXEC : 0));
 
       /* Adjust the stack size for alignment.  */
       size &= ~tls_static_align_m1;
@@ -358,16 +486,10 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp,
         return EINVAL;
 
       /* Try to get a stack from the cache.  */
-      reqsize = size;
       pd = get_cached_stack (&size, &mem);
       if (pd == NULL)
         {
-          /* If a guard page is required, avoid committing memory by first
-             allocate with PROT_NONE and then reserve with required permission
-             excluding the guard page.  */
-          mem = __mmap (NULL, size, (guardsize == 0) ? prot : PROT_NONE,
-                        MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
-
+          mem = allocate_thread_stack (size, guardsize);
           if (__glibc_unlikely (mem == MAP_FAILED))
             return errno;
 
@@ -394,15 +516,10 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp,
 #endif
 
           /* Now mprotect the required region excluding the guard area.  */
-          if (__glibc_likely (guardsize > 0))
+          if (!setup_stack_prot (mem, size, pd, guardsize, pagesize_m1))
             {
-              char *guard = guard_position (mem, size, guardsize, pd,
-                                            pagesize_m1);
-              if (setup_stack_prot (mem, size, guard, guardsize, prot) != 0)
-                {
-                  __munmap (mem, size);
-                  return errno;
-                }
+              __munmap (mem, size);
+              return errno;
             }
 
           /* Remember the stack-related values.  */
@@ -456,59 +573,31 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp,
             which will be read next.  */
        }
 
-      /* Create or resize the guard area if necessary.  */
-      if (__glibc_unlikely (guardsize > pd->guardsize))
+      /* Create or resize the guard area if necessary on an already
+         allocated stack.  */
+      if (!adjust_stack_prot (mem, size, pd, guardsize, pagesize_m1))
        {
-          char *guard = guard_position (mem, size, guardsize, pd,
-                                        pagesize_m1);
-          if (__mprotect (guard, guardsize, PROT_NONE) != 0)
-            {
-            mprot_error:
-              lll_lock (GL (dl_stack_cache_lock), LLL_PRIVATE);
-
-              /* Remove the thread from the list.  */
-              __nptl_stack_list_del (&pd->list);
+          lll_lock (GL (dl_stack_cache_lock), LLL_PRIVATE);
 
-              lll_unlock (GL (dl_stack_cache_lock), LLL_PRIVATE);
+          /* Remove the thread from the list.  */
+          __nptl_stack_list_del (&pd->list);
 
-              /* Get rid of the TLS block we allocated.  */
-              _dl_deallocate_tls (TLS_TPADJ (pd), false);
+          lll_unlock (GL (dl_stack_cache_lock), LLL_PRIVATE);
 
-              /* Free the stack memory regardless of whether the size
-                 of the cache is over the limit or not.  If this piece
-                 of memory caused problems we better do not use it
-                 anymore.  Uh, and we ignore possible errors.  There
-                 is nothing we could do.  */
-              (void) __munmap (mem, size);
+          /* Get rid of the TLS block we allocated.  */
+          _dl_deallocate_tls (TLS_TPADJ (pd), false);
 
-              return errno;
-            }
+          /* Free the stack memory regardless of whether the size
+             of the cache is over the limit or not.  If this piece
+             of memory caused problems we better do not use it
+             anymore.  Uh, and we ignore possible errors.  There
+             is nothing we could do.  */
+          (void) __munmap (mem, size);
 
-          pd->guardsize = guardsize;
+          return errno;
        }
-      else if (__builtin_expect (pd->guardsize - guardsize > size - reqsize,
-                                 0))
-        {
-          /* The old guard area is too large.  */
-
-#if _STACK_GROWS_DOWN
-          if (__mprotect ((char *) mem + guardsize, pd->guardsize - guardsize,
-                          prot) != 0)
-            goto mprot_error;
-#elif _STACK_GROWS_UP
-          char *new_guard = (char *)(((uintptr_t) pd - guardsize)
-                                     & ~pagesize_m1);
-          char *old_guard = (char *)(((uintptr_t) pd - pd->guardsize)
-                                     & ~pagesize_m1);
-          /* The guard size difference might be > 0, but once rounded
-             to the nearest page the size difference might be zero.  */
-          if (new_guard > old_guard
-              && __mprotect (old_guard, new_guard - old_guard, prot) != 0)
-            goto mprot_error;
-#endif
-          pd->guardsize = guardsize;
-        }
+      pd->guardsize = guardsize;
 
       /* The pthread_getattr_np() calls need to get passed the size
          requested in the attribute, regardless of how large the
         actually used guardsize is.  */
@@ -568,19 +657,21 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp,
 static void
 name_stack_maps (struct pthread *pd, bool set)
 {
+  size_t adjust = pd->stack_mode == ALLOCATE_GUARD_PROT_NONE ?
+                  pd->guardsize : 0;
 #if _STACK_GROWS_DOWN
-  void *stack = pd->stackblock + pd->guardsize;
+  void *stack = pd->stackblock + adjust;
 #else
   void *stack = pd->stackblock;
 #endif
-  size_t stacksize = pd->stackblock_size - pd->guardsize;
+  size_t stacksize = pd->stackblock_size - adjust;
 
   if (!set)
-    __set_vma_name (stack, stacksize, NULL);
+    __set_vma_name (stack, stacksize, " glibc: unused stack");
   else
     {
       unsigned int tid = pd->tid;
-      if (pd->user_stack)
+      if (pd->stack_mode == ALLOCATE_GUARD_USER)
        SET_STACK_NAME (" glibc: pthread user stack: ", stack, stacksize, tid);
       else
        SET_STACK_NAME (" glibc: pthread stack: ", stack, stacksize, tid);
diff --git a/nptl/descr.h b/nptl/descr.h
index d0d30929e2..9c1ed54c56 100644
--- a/nptl/descr.h
+++ b/nptl/descr.h
@@ -125,6 +125,12 @@ struct priority_protection_data
   unsigned int priomap[];
 };
 
+enum allocate_stack_mode_t
+{
+  ALLOCATE_GUARD_MADV_GUARD = 0,
+  ALLOCATE_GUARD_PROT_NONE = 1,
+  ALLOCATE_GUARD_USER = 2,
+};
 
 /* Thread descriptor data structure.  */
 struct pthread
@@ -324,7 +330,7 @@ struct pthread
   bool report_events;
 
   /* True if the user provided the stack.  */
-  bool user_stack;
+  enum allocate_stack_mode_t stack_mode;
 
   /* True if thread must stop at startup time.  */
   bool stopped_start;
diff --git a/nptl/nptl-stack.c b/nptl/nptl-stack.c
index 503357f25d..c049c5133c 100644
--- a/nptl/nptl-stack.c
+++ b/nptl/nptl-stack.c
@@ -120,7 +120,7 @@ __nptl_deallocate_stack (struct pthread *pd)
      not reset the 'used' flag in the 'tid' field.  This is done by
      the kernel.  If no thread has been created yet this field is
      still zero.  */
-  if (__glibc_likely (! pd->user_stack))
+  if (__glibc_likely (pd->stack_mode != ALLOCATE_GUARD_USER))
     (void) queue_stack (pd);
   else
     /* Free the memory associated with the ELF TLS.  */
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index 01e8a86980..0808f2e628 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -554,7 +554,7 @@ start_thread (void *arg)
      to avoid creating a new free-state block during thread release.  */
   __getrandom_vdso_release (pd);
 
-  if (!pd->user_stack)
+  if (pd->stack_mode != ALLOCATE_GUARD_USER)
     advise_stack_range (pd->stackblock, pd->stackblock_size, (uintptr_t) pd,
                         pd->guardsize);
 
diff --git a/nptl/tst-guard1.c b/nptl/tst-guard1.c
new file mode 100644
index 0000000000..18df7ff301
--- /dev/null
+++ b/nptl/tst-guard1.c
@@ -0,0 +1,369 @@
+/* Basic tests for pthread guard area.
+   Copyright (C) 2025 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <array_length.h>
+#include <pthread.h>
+#include <pthreaddef.h>
+#include <setjmp.h>
+#include <signal.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <support/check.h>
+#include <support/support.h>
+#include <support/test-driver.h>
+#include <support/xsignal.h>
+#include <support/xthread.h>
+#include <support/xunistd.h>
+#include <sys/mman.h>
+#include <unistd.h>
+
+static long int pagesz;
+
+/* To check if the guard region is inaccessible, the thread tries
+   reads/writes on it and checks if a SIGSEGV is generated.  */
+
+static volatile sig_atomic_t signal_jump_set;
+static sigjmp_buf signal_jmp_buf;
+
+static void
+sigsegv_handler (int sig)
+{
+  if (signal_jump_set == 0)
+    return;
+
+  siglongjmp (signal_jmp_buf, sig);
+}
+
+static bool
+try_access_buf (char *ptr, bool write)
+{
+  signal_jump_set = true;
+
+  bool failed = sigsetjmp (signal_jmp_buf, 0) != 0;
+  if (!failed)
+    {
+      if (write)
+        *(volatile char *)(ptr) = 'x';
+      else
+        *(volatile char *)(ptr);
+    }
+
+  signal_jump_set = false;
+  return !failed;
+}
+
+static bool
+try_read_buf (char *ptr)
+{
+  return try_access_buf (ptr, false);
+}
+
+static bool
+try_write_buf (char *ptr)
+{
+  return try_access_buf (ptr, true);
+}
+
+static bool
+try_read_write_buf (char *ptr)
+{
+  return try_read_buf (ptr) && try_write_buf (ptr);
+}
+
+
+/* Return the guard region of the current thread (it only makes sense on
+   a thread created by pthread_create).  */
+
+struct stack_t
+{
+  char *stack;
+  size_t stacksize;
+  char *guard;
+  size_t guardsize;
+};
+
+static inline size_t
+adjust_stacksize (size_t stacksize)
+{
+  /* For some ABIs, the guard page position depends on the thread
+     descriptor, which in turn relies on the required static TLS.  The
+     only supported _STACK_GROWS_UP ABI, hppa, defines TLS_DTV_AT_TP,
+     and it is not straightforward to calculate the guard region with
+     the current pthread APIs.  So to get a correct stack size, assume
+     an extra page after the guard area.  */
+#if _STACK_GROWS_DOWN
+  return stacksize;
+#elif _STACK_GROWS_UP
+  return stacksize - pagesz;
+#endif
+}
+
+struct stack_t
+get_current_stack_info (void)
+{
+  pthread_attr_t attr;
+  TEST_VERIFY_EXIT (pthread_getattr_np (pthread_self (), &attr) == 0);
+  void *stack;
+  size_t stacksize;
+  TEST_VERIFY_EXIT (pthread_attr_getstack (&attr, &stack, &stacksize) == 0);
+  size_t guardsize;
+  TEST_VERIFY_EXIT (pthread_attr_getguardsize (&attr, &guardsize) == 0);
+  /* The guardsize is reported as the current page size, although it might
+     be adjusted to a larger value (aarch64 for instance).  */
+  if (guardsize != 0 && guardsize < ARCH_MIN_GUARD_SIZE)
+    guardsize = ARCH_MIN_GUARD_SIZE;
+
+#if _STACK_GROWS_DOWN
+  void *guard = guardsize ? stack - guardsize : 0;
+#elif _STACK_GROWS_UP
+  stacksize = adjust_stacksize (stacksize);
+  void *guard = guardsize ? stack + stacksize : 0;
+#endif
+
+  pthread_attr_destroy (&attr);
+
+  return (struct stack_t) { stack, stacksize, guard, guardsize };
+}
+
+struct thread_args_t
+{
+  size_t stacksize;
+  size_t guardsize;
+};
+
+struct thread_args_t
+get_thread_args (const pthread_attr_t *attr)
+{
+  size_t stacksize;
+  size_t guardsize;
+
+  TEST_COMPARE (pthread_attr_getstacksize (attr, &stacksize), 0);
+  TEST_COMPARE (pthread_attr_getguardsize (attr, &guardsize), 0);
+  if (guardsize < ARCH_MIN_GUARD_SIZE)
+    guardsize = ARCH_MIN_GUARD_SIZE;
+
+  return (struct thread_args_t) { stacksize, guardsize };
+}
+
+static void
+set_thread_args (pthread_attr_t *attr, const struct thread_args_t *args)
+{
+  xpthread_attr_setstacksize (attr, args->stacksize);
+  xpthread_attr_setguardsize (attr, args->guardsize);
+}
+
+static void *
+tf (void *closure)
+{
+  struct thread_args_t *args = closure;
+
+  struct stack_t s = get_current_stack_info ();
+  if (test_verbose)
+    printf ("debug: [tid=%jd] stack = { .stack=%p, stacksize=%#zx, guard=%p, "
+            "guardsize=%#zx }\n",
+            (intmax_t) gettid (),
+            s.stack,
+            s.stacksize,
+            s.guard,
+            s.guardsize);
+
+  if (args != NULL)
+    {
+      TEST_COMPARE (adjust_stacksize (args->stacksize), s.stacksize);
+      TEST_COMPARE (args->guardsize, s.guardsize);
+    }
+
+  /* Ensure we can access the stack area.  */
+  TEST_COMPARE (try_read_write_buf (s.stack), true);
+  TEST_COMPARE (try_read_write_buf (&s.stack[s.stacksize / 2]), true);
+  TEST_COMPARE (try_read_write_buf (&s.stack[s.stacksize - 1]), true);
+
+  /* Check if accessing the guard area results in SIGSEGV.  */
+  if (s.guardsize > 0)
+    {
+      TEST_COMPARE (try_read_write_buf (s.guard), false);
+      TEST_COMPARE (try_read_write_buf (&s.guard[s.guardsize / 2]), false);
+      TEST_COMPARE (try_read_write_buf (&s.guard[s.guardsize] - 1), false);
+    }
+
+  return NULL;
+}
+
+/* Test 1: caller-provided stack without guard.  */
+static void
+do_test1 (void)
+{
+  pthread_attr_t attr;
+  xpthread_attr_init (&attr);
+
+  size_t stacksize = support_small_thread_stack_size ();
+  void *stack = xmmap (0,
+                       stacksize,
+                       PROT_READ | PROT_WRITE,
+                       MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK,
+                       -1);
+  xpthread_attr_setstack (&attr, stack, stacksize);
+  xpthread_attr_setguardsize (&attr, 0);
+
+  struct thread_args_t args = { stacksize, 0 };
+  pthread_t t = xpthread_create (&attr, tf, &args);
+  void *status = xpthread_join (t);
+  TEST_VERIFY (status == 0);
+
+  xpthread_attr_destroy (&attr);
+  xmunmap (stack, stacksize);
+}
+
+/* Test 2: same as test 1, but with a guard area.  */
+static void
+do_test2 (void)
+{
+  pthread_attr_t attr;
+  xpthread_attr_init (&attr);
+
+  size_t stacksize = support_small_thread_stack_size ();
+  void *stack = xmmap (0,
+                       stacksize,
+                       PROT_READ | PROT_WRITE,
+                       MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK,
+                       -1);
+  xpthread_attr_setstack (&attr, stack, stacksize);
+  xpthread_attr_setguardsize (&attr, pagesz);
+
+  struct thread_args_t args = { stacksize, 0 };
+  pthread_t t = xpthread_create (&attr, tf, &args);
+  void *status = xpthread_join (t);
+  TEST_VERIFY (status == 0);
+
+  xpthread_attr_destroy (&attr);
+  xmunmap (stack, stacksize);
+}
+
+/* Test 3: pthread_create with default values.  */
+static void
+do_test3 (void)
+{
+  pthread_t t = xpthread_create (NULL, tf, NULL);
+  void *status = xpthread_join (t);
+  TEST_VERIFY (status == 0);
+}
+
+/* Test 4: pthread_create without a guard area.  */
+static void
+do_test4 (void)
+{
+  pthread_attr_t attr;
+  xpthread_attr_init (&attr);
+  struct thread_args_t args = get_thread_args (&attr);
+  args.stacksize += args.guardsize;
+  args.guardsize = 0;
+  set_thread_args (&attr, &args);
+
+  pthread_t t = xpthread_create (&attr, tf, &args);
+  void *status = xpthread_join (t);
+  TEST_VERIFY (status == 0);
+
+  xpthread_attr_destroy (&attr);
+}
+
+/* Test 5: pthread_create with non-default stack and guard size values.  */
+static void
+do_test5 (void)
+{
+  pthread_attr_t attr;
+  xpthread_attr_init (&attr);
+  struct thread_args_t args = get_thread_args (&attr);
+  args.guardsize += pagesz;
+  args.stacksize += pagesz;
+  set_thread_args (&attr, &args);
+
+  pthread_t t = xpthread_create (&attr, tf, &args);
+  void *status = xpthread_join (t);
+  TEST_VERIFY (status == 0);
+
+  xpthread_attr_destroy (&attr);
+}
+
+/* Test 6: thread with the required size (stack + guard) that matches
+   test 3, but with a larger guard area.  pthread_create will need to
+   increase the guard area.  */
+static void
+do_test6 (void)
+{
+  pthread_attr_t attr;
+  xpthread_attr_init (&attr);
+  struct thread_args_t args = get_thread_args (&attr);
+  args.guardsize += pagesz;
+  args.stacksize -= pagesz;
+  set_thread_args (&attr, &args);
+
+  pthread_t t = xpthread_create (&attr, tf, &args);
+  void *status = xpthread_join (t);
+  TEST_VERIFY (status == 0);
+
+  xpthread_attr_destroy (&attr);
+}
+
+/* Test 7: pthread_create with default values; the required size matches
+   the one from tests 3 and 6 (but with a reduced guard area).
+   pthread_create should use the cached stack from the previous tests,
+   but it will require reducing the guard area.  */
+static void
+do_test7 (void)
+{
+  pthread_t t = xpthread_create (NULL, tf, NULL);
+  void *status = xpthread_join (t);
+  TEST_VERIFY (status == 0);
+}
+
+static int
+do_test (void)
+{
+  pagesz = sysconf (_SC_PAGESIZE);
+
+  {
+    struct sigaction sa = {
+      .sa_handler = sigsegv_handler,
+      .sa_flags = SA_NODEFER,
+    };
+    sigemptyset (&sa.sa_mask);
+    xsigaction (SIGSEGV, &sa, NULL);
+    /* Some systems generate SIGBUS when accessing the guard area if it
+       is set up with madvise.  */
+    xsigaction (SIGBUS, &sa, NULL);
+  }
+
+  static const struct {
+    const char *descr;
+    void (*test)(void);
+  } tests[] = {
+    { "user provided stack without guard", do_test1 },
+    { "user provided stack with guard", do_test2 },
+    { "default attribute", do_test3 },
+    { "default attribute without guard", do_test4 },
+    { "non default stack and guard sizes", do_test5 },
+    { "reused stack with larger guard", do_test6 },
+    { "reused stack with smaller guard", do_test7 },
+  };
+
+  for (int i = 0; i < array_length (tests); i++)
+    {
+      printf ("debug: test%01d: %s\n", i, tests[i].descr);
+      tests[i].test ();
+    }
+
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/sysdeps/nptl/dl-tls_init_tp.c b/sysdeps/nptl/dl-tls_init_tp.c
index c57738e9f3..20cc9202ec 100644
--- a/sysdeps/nptl/dl-tls_init_tp.c
+++ b/sysdeps/nptl/dl-tls_init_tp.c
@@ -72,7 +72,7 @@ __tls_init_tp (void)
   /* Early initialization of the TCB.  */
   pd->tid = INTERNAL_SYSCALL_CALL (set_tid_address, &pd->tid);
   THREAD_SETMEM (pd, specific[0], &pd->specific_1stblock[0]);
-  THREAD_SETMEM (pd, user_stack, true);
+  THREAD_SETMEM (pd, stack_mode, ALLOCATE_GUARD_USER);
 
   /* Before initializing GL (dl_stack_user), the debugger could not
     find us and had to set __nptl_initial_report_events.  Propagate
diff --git a/sysdeps/nptl/fork.h b/sysdeps/nptl/fork.h
index 6156af79e1..3c79179437 100644
--- a/sysdeps/nptl/fork.h
+++ b/sysdeps/nptl/fork.h
@@ -155,7 +155,7 @@ reclaim_stacks (void)
   INIT_LIST_HEAD (&GL (dl_stack_used));
   INIT_LIST_HEAD (&GL (dl_stack_user));
 
-  if (__glibc_unlikely (THREAD_GETMEM (self, user_stack)))
+  if (__glibc_unlikely (self->stack_mode == ALLOCATE_GUARD_USER))
     list_add (&self->list, &GL (dl_stack_user));
   else
     list_add (&self->list, &GL (dl_stack_used));
diff --git a/sysdeps/unix/sysv/linux/bits/mman-linux.h b/sysdeps/unix/sysv/linux/bits/mman-linux.h
index 8e072eb4cd..fe0496d802 100644
--- a/sysdeps/unix/sysv/linux/bits/mman-linux.h
+++ b/sysdeps/unix/sysv/linux/bits/mman-linux.h
@@ -113,6 +113,8 @@
                                                    locked pages too.  */
 # define MADV_COLLAPSE    25   /* Synchronous hugepage collapse.  */
 # define MADV_HWPOISON    100  /* Poison a page for testing.  */
+# define MADV_GUARD_INSTALL 102        /* Fatal signal on access to range.  */
+# define MADV_GUARD_REMOVE  103        /* Unguard range.  */
 #endif
 
 /* The POSIX people had to invent similar names for the same things.  */
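
The guard resize logic in adjust_stack_prot can likewise be illustrated
with a small sketch (again not part of the patch, and it requires a
6.13+ kernel; the constants are the bits/mman-linux.h values above).
As the in-code comment notes, the guard advice may overlap an existing
guard, so growing installs the larger guard in place, and shrinking
only removes the slack:

/* Grow and then shrink a madvise-based guard area, mirroring the
   ALLOCATE_GUARD_MADV_GUARD branches of adjust_stack_prot for the
   _STACK_GROWS_DOWN layout (guard at the lowest address).  */
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_GUARD_INSTALL
# define MADV_GUARD_INSTALL 102
#endif
#ifndef MADV_GUARD_REMOVE
# define MADV_GUARD_REMOVE 103
#endif

int
main (void)
{
  long pagesz = sysconf (_SC_PAGESIZE);
  size_t size = 16 * pagesz;
  char *mem = mmap (NULL, size, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (mem == MAP_FAILED)
    return 1;

  /* Initial one-page guard.  */
  if (madvise (mem, pagesz, MADV_GUARD_INSTALL) != 0)
    return 1;

  /* Growing: the new, larger guard is installed over the old one,
     since the advice may overlap an already guarded region.  */
  if (madvise (mem, 2 * pagesz, MADV_GUARD_INSTALL) != 0)
    return 1;

  /* Shrinking back to one page: only the slack (the page no longer
     part of the guard) is unguarded.  */
  if (madvise (mem + pagesz, pagesz, MADV_GUARD_REMOVE) != 0)
    return 1;

  munmap (mem, size);
  return 0;
}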