diff mbox

linux-gen: _ishmphy: fix possible race with malloc

Message ID 1485185917-3677-1-git-send-email-christophe.milard@linaro.org
State Accepted
Commit 0d6c0804640151f8aa9970cc74f1fef28a11fac1
Headers show

Commit Message

Christophe Milard Jan. 23, 2017, 3:38 p.m. UTC
_ishm prereserves address space for _ODP_ISHM_SINGLE_VA allocated memory
by reserving unreachable "memory", using PROT_NONE, hence really allocating
virtual space (as the memory cannot be hit).
The problem came when trying to use some of this preallocated space:
strangely, if a new mapping on the preallocated virtual space failed (e.g.
due to lack of huge pages), linux would returned MAP_FAILED (OK) but
also cancel the previous PROT_NONE mapping, hence making the virtual
memory space available for further mmaps. (code Point A)
Before this patch, the code simply re-reserved (PROT_NONE) the area
(point B):
This was NOT OK: yes, _ishm_reserve calls are mutexed, so no other
odpthread 2 could do a reserve between point A and B of opdthread1, but if
thread2 did its own mmap(), malloc(),...,  there was a chance for thread 2
to get virtual space from the preserved area (which should be blocked).
This patch does not allow any A-B window by first doing an mmap (non fixed)
and then performing a second mapping at the correct address (in the
pre-reserved area), which is guaranteed to succeed, and finally removing
the non-fixed mapping.
The non-fix mapping just acts as a failure guard when the proper maping
is done.
Fixes https://bugs.linaro.org/show_bug.cgi?id=2834
(but very hard to trigger i.e. to prove)

Signed-off-by: Christophe Milard <christophe.milard@linaro.org>

---
 platform/linux-generic/_ishmphy.c | 42 +++++++++++++++++++++++++++++----------
 1 file changed, 32 insertions(+), 10 deletions(-)

-- 
2.7.4

Comments

Bill Fischofer Jan. 24, 2017, 12:55 a.m. UTC | #1
On Mon, Jan 23, 2017 at 9:38 AM, Christophe Milard
<christophe.milard@linaro.org> wrote:
> _ishm prereserves address space for _ODP_ISHM_SINGLE_VA allocated memory

> by reserving unreachable "memory", using PROT_NONE, hence really allocating

> virtual space (as the memory cannot be hit).

> The problem came when trying to use some of this preallocated space:

> strangely, if a new mapping on the preallocated virtual space failed (e.g.

> due to lack of huge pages), linux would returned MAP_FAILED (OK) but

> also cancel the previous PROT_NONE mapping, hence making the virtual

> memory space available for further mmaps. (code Point A)

> Before this patch, the code simply re-reserved (PROT_NONE) the area

> (point B):

> This was NOT OK: yes, _ishm_reserve calls are mutexed, so no other

> odpthread 2 could do a reserve between point A and B of opdthread1, but if

> thread2 did its own mmap(), malloc(),...,  there was a chance for thread 2

> to get virtual space from the preserved area (which should be blocked).

> This patch does not allow any A-B window by first doing an mmap (non fixed)

> and then performing a second mapping at the correct address (in the

> pre-reserved area), which is guaranteed to succeed, and finally removing

> the non-fixed mapping.

> The non-fix mapping just acts as a failure guard when the proper maping

> is done.

> Fixes https://bugs.linaro.org/show_bug.cgi?id=2834

> (but very hard to trigger i.e. to prove)

>

> Signed-off-by: Christophe Milard <christophe.milard@linaro.org>


Reviewed-and-tested-by: Bill Fischofer <bill.fischofer@linaro.org>

> ---

>  platform/linux-generic/_ishmphy.c | 42 +++++++++++++++++++++++++++++----------

>  1 file changed, 32 insertions(+), 10 deletions(-)

>

> diff --git a/platform/linux-generic/_ishmphy.c b/platform/linux-generic/_ishmphy.c

> index 2b2d100..d519af6 100644

> --- a/platform/linux-generic/_ishmphy.c

> +++ b/platform/linux-generic/_ishmphy.c

> @@ -94,7 +94,7 @@ int _odp_ishmphy_unbook_va(void)

>  void *_odp_ishmphy_map(int fd, void *start, uint64_t size,

>                        int flags)

>  {

> -       void *mapped_addr;

> +       void *mapped_addr_tmp, *mapped_addr;

>         int mmap_flags = 0;

>

>         if (flags & _ODP_ISHM_SINGLE_VA) {

> @@ -103,15 +103,37 @@ void *_odp_ishmphy_map(int fd, void *start, uint64_t size,

>                         return NULL;

>                 }

>                 /* maps over fragment of reserved VA: */

> -               mapped_addr = mmap(start, size, PROT_READ | PROT_WRITE,

> -                                  MAP_SHARED | MAP_FIXED | mmap_flags, fd, 0);

> -               /* if mapping fails, re-block the space we tried to take

> -                * as it seems a mapping failure still affect what was there??*/

> -               if (mapped_addr == MAP_FAILED) {

> -                       mmap_flags = MAP_SHARED | MAP_FIXED |

> -                                    MAP_ANONYMOUS | MAP_NORESERVE;

> -                       mmap(start, size, PROT_NONE, mmap_flags, -1, 0);

> -                       mprotect(start, size, PROT_NONE);

> +               /* first, try a normal map. If that works, remap it where it

> +                * should (on the prereverved space), and remove the initial

> +                * normal mapping:

> +                * This is because it turned out that if a mapping fails

> +                * on a the prereserved virtual address space, then

> +                * the prereserved address space which was tried to be mapped

> +                * on becomes available to the kernel again! This was not

> +                * according to expectations: the assumption was that if a

> +                * mapping fails, the system should remain unchanged, but this

> +                * is obvioulsy not true (at least for huge pages when

> +                * exhausted).

> +                * So the strategy is to first map at a non reserved place

> +                * (which can then be freed and returned to the kernel on

> +                * failure) and peform a new map to the prereserved space on

> +                * success (which is then guaranteed to work).

> +                * The initial free maping can then be removed.

> +                */

> +               mapped_addr = MAP_FAILED;

> +               mapped_addr_tmp = mmap(NULL, size, PROT_READ | PROT_WRITE,

> +                                      MAP_SHARED | mmap_flags, fd, 0);

> +               if (mapped_addr_tmp != MAP_FAILED) {

> +                       /* If OK, do new map at right fixed location... */

> +                       mapped_addr = mmap(start,

> +                                          size, PROT_READ | PROT_WRITE,

> +                                          MAP_SHARED | MAP_FIXED | mmap_flags,

> +                                          fd, 0);

> +                       if (mapped_addr != start)

> +                               ODP_ERR("new map failed:%s\n", strerror(errno));

> +                       /* ... and remove initial mapping: */

> +                       if (munmap(mapped_addr_tmp, size))

> +                               ODP_ERR("munmap failed:%s\n", strerror(errno));

>                 }

>         } else {

>                 /* just do a new mapping in the VA space: */

> --

> 2.7.4

>
Maxim Uvarov Jan. 25, 2017, 4:38 p.m. UTC | #2
merged,
Maxim.

On 01/24/17 03:55, Bill Fischofer wrote:
> On Mon, Jan 23, 2017 at 9:38 AM, Christophe Milard

> <christophe.milard@linaro.org> wrote:

>> _ishm prereserves address space for _ODP_ISHM_SINGLE_VA allocated memory

>> by reserving unreachable "memory", using PROT_NONE, hence really allocating

>> virtual space (as the memory cannot be hit).

>> The problem came when trying to use some of this preallocated space:

>> strangely, if a new mapping on the preallocated virtual space failed (e.g.

>> due to lack of huge pages), linux would returned MAP_FAILED (OK) but

>> also cancel the previous PROT_NONE mapping, hence making the virtual

>> memory space available for further mmaps. (code Point A)

>> Before this patch, the code simply re-reserved (PROT_NONE) the area

>> (point B):

>> This was NOT OK: yes, _ishm_reserve calls are mutexed, so no other

>> odpthread 2 could do a reserve between point A and B of opdthread1, but if

>> thread2 did its own mmap(), malloc(),...,  there was a chance for thread 2

>> to get virtual space from the preserved area (which should be blocked).

>> This patch does not allow any A-B window by first doing an mmap (non fixed)

>> and then performing a second mapping at the correct address (in the

>> pre-reserved area), which is guaranteed to succeed, and finally removing

>> the non-fixed mapping.

>> The non-fix mapping just acts as a failure guard when the proper maping

>> is done.

>> Fixes https://bugs.linaro.org/show_bug.cgi?id=2834

>> (but very hard to trigger i.e. to prove)

>>

>> Signed-off-by: Christophe Milard <christophe.milard@linaro.org>

> 

> Reviewed-and-tested-by: Bill Fischofer <bill.fischofer@linaro.org>

> 

>> ---

>>  platform/linux-generic/_ishmphy.c | 42 +++++++++++++++++++++++++++++----------

>>  1 file changed, 32 insertions(+), 10 deletions(-)

>>

>> diff --git a/platform/linux-generic/_ishmphy.c b/platform/linux-generic/_ishmphy.c

>> index 2b2d100..d519af6 100644

>> --- a/platform/linux-generic/_ishmphy.c

>> +++ b/platform/linux-generic/_ishmphy.c

>> @@ -94,7 +94,7 @@ int _odp_ishmphy_unbook_va(void)

>>  void *_odp_ishmphy_map(int fd, void *start, uint64_t size,

>>                        int flags)

>>  {

>> -       void *mapped_addr;

>> +       void *mapped_addr_tmp, *mapped_addr;

>>         int mmap_flags = 0;

>>

>>         if (flags & _ODP_ISHM_SINGLE_VA) {

>> @@ -103,15 +103,37 @@ void *_odp_ishmphy_map(int fd, void *start, uint64_t size,

>>                         return NULL;

>>                 }

>>                 /* maps over fragment of reserved VA: */

>> -               mapped_addr = mmap(start, size, PROT_READ | PROT_WRITE,

>> -                                  MAP_SHARED | MAP_FIXED | mmap_flags, fd, 0);

>> -               /* if mapping fails, re-block the space we tried to take

>> -                * as it seems a mapping failure still affect what was there??*/

>> -               if (mapped_addr == MAP_FAILED) {

>> -                       mmap_flags = MAP_SHARED | MAP_FIXED |

>> -                                    MAP_ANONYMOUS | MAP_NORESERVE;

>> -                       mmap(start, size, PROT_NONE, mmap_flags, -1, 0);

>> -                       mprotect(start, size, PROT_NONE);

>> +               /* first, try a normal map. If that works, remap it where it

>> +                * should (on the prereverved space), and remove the initial

>> +                * normal mapping:

>> +                * This is because it turned out that if a mapping fails

>> +                * on a the prereserved virtual address space, then

>> +                * the prereserved address space which was tried to be mapped

>> +                * on becomes available to the kernel again! This was not

>> +                * according to expectations: the assumption was that if a

>> +                * mapping fails, the system should remain unchanged, but this

>> +                * is obvioulsy not true (at least for huge pages when

>> +                * exhausted).

>> +                * So the strategy is to first map at a non reserved place

>> +                * (which can then be freed and returned to the kernel on

>> +                * failure) and peform a new map to the prereserved space on

>> +                * success (which is then guaranteed to work).

>> +                * The initial free maping can then be removed.

>> +                */

>> +               mapped_addr = MAP_FAILED;

>> +               mapped_addr_tmp = mmap(NULL, size, PROT_READ | PROT_WRITE,

>> +                                      MAP_SHARED | mmap_flags, fd, 0);

>> +               if (mapped_addr_tmp != MAP_FAILED) {

>> +                       /* If OK, do new map at right fixed location... */

>> +                       mapped_addr = mmap(start,

>> +                                          size, PROT_READ | PROT_WRITE,

>> +                                          MAP_SHARED | MAP_FIXED | mmap_flags,

>> +                                          fd, 0);

>> +                       if (mapped_addr != start)

>> +                               ODP_ERR("new map failed:%s\n", strerror(errno));

>> +                       /* ... and remove initial mapping: */

>> +                       if (munmap(mapped_addr_tmp, size))

>> +                               ODP_ERR("munmap failed:%s\n", strerror(errno));

>>                 }

>>         } else {

>>                 /* just do a new mapping in the VA space: */

>> --

>> 2.7.4

>>
diff mbox

Patch

diff --git a/platform/linux-generic/_ishmphy.c b/platform/linux-generic/_ishmphy.c
index 2b2d100..d519af6 100644
--- a/platform/linux-generic/_ishmphy.c
+++ b/platform/linux-generic/_ishmphy.c
@@ -94,7 +94,7 @@  int _odp_ishmphy_unbook_va(void)
 void *_odp_ishmphy_map(int fd, void *start, uint64_t size,
 		       int flags)
 {
-	void *mapped_addr;
+	void *mapped_addr_tmp, *mapped_addr;
 	int mmap_flags = 0;
 
 	if (flags & _ODP_ISHM_SINGLE_VA) {
@@ -103,15 +103,37 @@  void *_odp_ishmphy_map(int fd, void *start, uint64_t size,
 			return NULL;
 		}
 		/* maps over fragment of reserved VA: */
-		mapped_addr = mmap(start, size, PROT_READ | PROT_WRITE,
-				   MAP_SHARED | MAP_FIXED | mmap_flags, fd, 0);
-		/* if mapping fails, re-block the space we tried to take
-		 * as it seems a mapping failure still affect what was there??*/
-		if (mapped_addr == MAP_FAILED) {
-			mmap_flags = MAP_SHARED | MAP_FIXED |
-				     MAP_ANONYMOUS | MAP_NORESERVE;
-			mmap(start, size, PROT_NONE, mmap_flags, -1, 0);
-			mprotect(start, size, PROT_NONE);
+		/* first, try a normal map. If that works, remap it where it
+		 * should (on the prereverved space), and remove the initial
+		 * normal mapping:
+		 * This is because it turned out that if a mapping fails
+		 * on a the prereserved virtual address space, then
+		 * the prereserved address space which was tried to be mapped
+		 * on becomes available to the kernel again! This was not
+		 * according to expectations: the assumption was that if a
+		 * mapping fails, the system should remain unchanged, but this
+		 * is obvioulsy not true (at least for huge pages when
+		 * exhausted).
+		 * So the strategy is to first map at a non reserved place
+		 * (which can then be freed and returned to the kernel on
+		 * failure) and peform a new map to the prereserved space on
+		 * success (which is then guaranteed to work).
+		 * The initial free maping can then be removed.
+		 */
+		mapped_addr = MAP_FAILED;
+		mapped_addr_tmp = mmap(NULL, size, PROT_READ | PROT_WRITE,
+				       MAP_SHARED | mmap_flags, fd, 0);
+		if (mapped_addr_tmp != MAP_FAILED) {
+			/* If OK, do new map at right fixed location... */
+			mapped_addr = mmap(start,
+					   size, PROT_READ | PROT_WRITE,
+					   MAP_SHARED | MAP_FIXED | mmap_flags,
+					   fd, 0);
+			if (mapped_addr != start)
+				ODP_ERR("new map failed:%s\n", strerror(errno));
+			/* ... and remove initial mapping: */
+			if (munmap(mapped_addr_tmp, size))
+				ODP_ERR("munmap failed:%s\n", strerror(errno));
 		}
 	} else {
 		/* just do a new mapping in the VA space: */