diff mbox series

[v2,10/11] efi_loader: Pass address to fs_read()

Message ID 20180614182232.78201-11-agraf@suse.de
State Superseded
Headers show
Series sandbox: efi_loader support | expand

Commit Message

Alexander Graf June 14, 2018, 6:22 p.m. UTC
The fs_read() function wants to get an address rather than the
pointer to a buffer.

So let's convert the passed buffer from pointer back a the address
to make efi_loader on sandbox happier.

Signed-off-by: Alexander Graf <agraf@suse.de>

---

v1 -> v2:

  - Clarify address vs pointer
  - include mapmem.h
---
 lib/efi_loader/efi_file.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Comments

Simon Glass June 14, 2018, 7:01 p.m. UTC | #1
On 14 June 2018 at 12:22, Alexander Graf <agraf@suse.de> wrote:
> The fs_read() function wants to get an address rather than the
> pointer to a buffer.
>
> So let's convert the passed buffer from pointer back a the address
> to make efi_loader on sandbox happier.
>
> Signed-off-by: Alexander Graf <agraf@suse.de>
>
> ---
>
> v1 -> v2:
>
>   - Clarify address vs pointer
>   - include mapmem.h
> ---
>  lib/efi_loader/efi_file.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)

Reviewed-by: Simon Glass <sjg@chromium.org>
Alexander Graf June 14, 2018, 7:51 p.m. UTC | #2
On 14.06.18 21:01, Simon Glass wrote:
> On 14 June 2018 at 12:22, Alexander Graf <agraf@suse.de> wrote:
>> The fs_read() function wants to get an address rather than the
>> pointer to a buffer.
>>
>> So let's convert the passed buffer from pointer back a the address
>> to make efi_loader on sandbox happier.
>>
>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>
>> ---
>>
>> v1 -> v2:
>>
>>   - Clarify address vs pointer
>>   - include mapmem.h
>> ---
>>  lib/efi_loader/efi_file.c | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> Reviewed-by: Simon Glass <sjg@chromium.org>
> 

I actually think that this patch tackles the problem the wrong way
around. I've cooked up another one that converts fs_read() and
fs_write() to instead take a pointer - which really is what most users
of the API want in the first place:


https://github.com/agraf/u-boot/commit/eb89f036a42cea8d7aaa6d83b8ecd9d202814b0f


Alex
Simon Glass June 14, 2018, 9:35 p.m. UTC | #3
Hi Alex,

On 14 June 2018 at 13:51, Alexander Graf <agraf@suse.de> wrote:
>
>
> On 14.06.18 21:01, Simon Glass wrote:
>> On 14 June 2018 at 12:22, Alexander Graf <agraf@suse.de> wrote:
>>> The fs_read() function wants to get an address rather than the
>>> pointer to a buffer.
>>>
>>> So let's convert the passed buffer from pointer back a the address
>>> to make efi_loader on sandbox happier.
>>>
>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>>
>>> ---
>>>
>>> v1 -> v2:
>>>
>>>   - Clarify address vs pointer
>>>   - include mapmem.h
>>> ---
>>>  lib/efi_loader/efi_file.c | 5 ++++-
>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> Reviewed-by: Simon Glass <sjg@chromium.org>
>>
>
> I actually think that this patch tackles the problem the wrong way
> around. I've cooked up another one that converts fs_read() and
> fs_write() to instead take a pointer - which really is what most users
> of the API want in the first place:
>
>
> https://github.com/agraf/u-boot/commit/eb89f036a42cea8d7aaa6d83b8ecd9d202814b0f

Actually this is a pretty good exposition of why we don't want to use
pointers everywhere. U-Boot uses addresses all over the place. Even a
search for something as specific as "ulong addr" gets over 200
matches. If we go down this path we will end up changing a pretty
fundamental part of U-Boot, and I don't think it is a good idea to do
that. Addresses are easy to understand, can be added/subtracted
without pointer arithmetic, can be easily converted to pointers as
needed, can be 32-bits long on sandbox, etc.

So I think we should step back from the abyss here and stick with the
way sandbox currently deals with addresses. It works well, is pretty
easy to understand, allows debugging which is just as easy on sandbox
as other archs (since they both uses addresses until basically the
final access), the addresses are typically small values that start at
0 which much is easier to read than (e.g.) 00007f1b58c8b008.

Here for example is the output from starting U-Boot with debugging in
board_f.c (something I have turned on a lot when refactoring and
debugging the init sequence):

U-Boot 2018.07-rc1-00142-g134ea86c7f-dirty (Jun 14 2018 - 15:04:04 -0600)

DRAM:  Monitor len: 00395AB0
Ram size: 08000000
Ram top: 08000000
Reserving 3670k for U-Boot at: 07c6a000
Reserving 32776k for malloc() at: 05c68000
Reserving 120 Bytes for Board Info at: 05c67f88
Reserving 472 Bytes for Global Data at: 05c67db0
Reserving 4352 Bytes for FDT at: 05c66cb0
Reserving 0x3c8 Bytes for bootstage at: 05c668e8

RAM Configuration:
Bank #0: 0 128 MiB

DRAM:  128 MiB
New Stack Pointer is: 05c668d0
Copying bootstage from 00007fdb0056e038 to 00007fdb061c48f0, size 3c8
Relocation Offset is: 07c6a000
Relocating to 07c6a000, new gd at 05c67db0, sp at 05c668d0


If we use pointers we get things like this:

Reserving 3670k for U-Boot at: 00007f1b58c8b008

I just don't want to deal with 64-bit addresses when debugging things,
and there really is no point. The map_sysmem() function provides a
nice and easy way to cast an address to a pointer, and it works on all
architectures.

Regards,
Simon
Alexander Graf June 14, 2018, 9:55 p.m. UTC | #4
On 14.06.18 23:35, Simon Glass wrote:
> Hi Alex,
> 
> On 14 June 2018 at 13:51, Alexander Graf <agraf@suse.de> wrote:
>>
>>
>> On 14.06.18 21:01, Simon Glass wrote:
>>> On 14 June 2018 at 12:22, Alexander Graf <agraf@suse.de> wrote:
>>>> The fs_read() function wants to get an address rather than the
>>>> pointer to a buffer.
>>>>
>>>> So let's convert the passed buffer from pointer back a the address
>>>> to make efi_loader on sandbox happier.
>>>>
>>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>>>
>>>> ---
>>>>
>>>> v1 -> v2:
>>>>
>>>>   - Clarify address vs pointer
>>>>   - include mapmem.h
>>>> ---
>>>>  lib/efi_loader/efi_file.c | 5 ++++-
>>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> Reviewed-by: Simon Glass <sjg@chromium.org>
>>>
>>
>> I actually think that this patch tackles the problem the wrong way
>> around. I've cooked up another one that converts fs_read() and
>> fs_write() to instead take a pointer - which really is what most users
>> of the API want in the first place:
>>
>>
>> https://github.com/agraf/u-boot/commit/eb89f036a42cea8d7aaa6d83b8ecd9d202814b0f
> 
> Actually this is a pretty good exposition of why we don't want to use
> pointers everywhere. U-Boot uses addresses all over the place. Even a
> search for something as specific as "ulong addr" gets over 200
> matches. If we go down this path we will end up changing a pretty
> fundamental part of U-Boot, and I don't think it is a good idea to do
> that. Addresses are easy to understand, can be added/subtracted
> without pointer arithmetic, can be easily converted to pointers as
> needed, can be 32-bits long on sandbox, etc.

I tend to disagree with this. Most code in U-Boot actually consumes
pointers rather than addresses.

> So I think we should step back from the abyss here and stick with the
> way sandbox currently deals with addresses. It works well, is pretty
> easy to understand, allows debugging which is just as easy on sandbox
> as other archs (since they both uses addresses until basically the
> final access), the addresses are typically small values that start at
> 0 which much is easier to read than (e.g.) 00007f1b58c8b008.
> 
> Here for example is the output from starting U-Boot with debugging in
> board_f.c (something I have turned on a lot when refactoring and
> debugging the init sequence):
> 
> U-Boot 2018.07-rc1-00142-g134ea86c7f-dirty (Jun 14 2018 - 15:04:04 -0600)
> 
> DRAM:  Monitor len: 00395AB0
> Ram size: 08000000
> Ram top: 08000000
> Reserving 3670k for U-Boot at: 07c6a000
> Reserving 32776k for malloc() at: 05c68000
> Reserving 120 Bytes for Board Info at: 05c67f88
> Reserving 472 Bytes for Global Data at: 05c67db0
> Reserving 4352 Bytes for FDT at: 05c66cb0
> Reserving 0x3c8 Bytes for bootstage at: 05c668e8
> 
> RAM Configuration:
> Bank #0: 0 128 MiB
> 
> DRAM:  128 MiB
> New Stack Pointer is: 05c668d0
> Copying bootstage from 00007fdb0056e038 to 00007fdb061c48f0, size 3c8
> Relocation Offset is: 07c6a000
> Relocating to 07c6a000, new gd at 05c67db0, sp at 05c668d0
> 
> 
> If we use pointers we get things like this:
> 
> Reserving 3670k for U-Boot at: 00007f1b58c8b008
> 
> I just don't want to deal with 64-bit addresses when debugging things,
> and there really is no point. The map_sysmem() function provides a
> nice and easy way to cast an address to a pointer, and it works on all
> architectures.

Ok, circling back to square 1. The main issue we're facing here is that
most code in U-Boot does not really understand the difference between
virtual and physical addresses - it just simply assumes a 1:1 map.

With sandbox, we do not have a 1:1 map anymore - any address is
somewhere else than its pointer.

We have a few options:

  1) Deal with pointers at random addresses throughout the code

I can see why you don't want this. I find ASLR generated addresses quite
cumbersome to read as well.

  2) Explicitly map RAM at VA 0x10000000, then do 1:1 map

This would be the best of all worlds still IMHO. That way we will have
easily readable addresses (that are identical in 32bit and 64bit), but
can still do a 1:1 map. The only thing we need to do is to introduce a
linker section at 0x10000000.

  3) Keep converting addresses to pointers

I'm afraid if we follow this path, we will always have arguments on
where the correct boundaries are. If you start to enable debugging in
core dm code and print out pointers to your dm objects, you will get
pointer values today as well. Sooner or later we will always end up with
pointers.


Alex
Simon Glass June 19, 2018, 10:03 p.m. UTC | #5
Hi Alex,

On 14 June 2018 at 15:55, Alexander Graf <agraf@suse.de> wrote:
>
>
> On 14.06.18 23:35, Simon Glass wrote:
>> Hi Alex,
>>
>> On 14 June 2018 at 13:51, Alexander Graf <agraf@suse.de> wrote:
>>>
>>>
>>> On 14.06.18 21:01, Simon Glass wrote:
>>>> On 14 June 2018 at 12:22, Alexander Graf <agraf@suse.de> wrote:
>>>>> The fs_read() function wants to get an address rather than the
>>>>> pointer to a buffer.
>>>>>
>>>>> So let's convert the passed buffer from pointer back a the address
>>>>> to make efi_loader on sandbox happier.
>>>>>
>>>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>>>>
>>>>> ---
>>>>>
>>>>> v1 -> v2:
>>>>>
>>>>>   - Clarify address vs pointer
>>>>>   - include mapmem.h
>>>>> ---
>>>>>  lib/efi_loader/efi_file.c | 5 ++++-
>>>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> Reviewed-by: Simon Glass <sjg@chromium.org>
>>>>
>>>
>>> I actually think that this patch tackles the problem the wrong way
>>> around. I've cooked up another one that converts fs_read() and
>>> fs_write() to instead take a pointer - which really is what most users
>>> of the API want in the first place:
>>>
>>>
>>> https://github.com/agraf/u-boot/commit/eb89f036a42cea8d7aaa6d83b8ecd9d202814b0f
>>
>> Actually this is a pretty good exposition of why we don't want to use
>> pointers everywhere. U-Boot uses addresses all over the place. Even a
>> search for something as specific as "ulong addr" gets over 200
>> matches. If we go down this path we will end up changing a pretty
>> fundamental part of U-Boot, and I don't think it is a good idea to do
>> that. Addresses are easy to understand, can be added/subtracted
>> without pointer arithmetic, can be easily converted to pointers as
>> needed, can be 32-bits long on sandbox, etc.
>
> I tend to disagree with this. Most code in U-Boot actually consumes
> pointers rather than addresses.

That might be true within individual components, but addresses are the
common currency in U-Boot. See for example bootm, all the
image-handling stuff including FIT, commands, device tree address
properties. Pointers are more often used at the lowest level of code.

>
>> So I think we should step back from the abyss here and stick with the
>> way sandbox currently deals with addresses. It works well, is pretty
>> easy to understand, allows debugging which is just as easy on sandbox
>> as other archs (since they both uses addresses until basically the
>> final access), the addresses are typically small values that start at
>> 0 which much is easier to read than (e.g.) 00007f1b58c8b008.
>>
>> Here for example is the output from starting U-Boot with debugging in
>> board_f.c (something I have turned on a lot when refactoring and
>> debugging the init sequence):
>>
>> U-Boot 2018.07-rc1-00142-g134ea86c7f-dirty (Jun 14 2018 - 15:04:04 -0600)
>>
>> DRAM:  Monitor len: 00395AB0
>> Ram size: 08000000
>> Ram top: 08000000
>> Reserving 3670k for U-Boot at: 07c6a000
>> Reserving 32776k for malloc() at: 05c68000
>> Reserving 120 Bytes for Board Info at: 05c67f88
>> Reserving 472 Bytes for Global Data at: 05c67db0
>> Reserving 4352 Bytes for FDT at: 05c66cb0
>> Reserving 0x3c8 Bytes for bootstage at: 05c668e8
>>
>> RAM Configuration:
>> Bank #0: 0 128 MiB
>>
>> DRAM:  128 MiB
>> New Stack Pointer is: 05c668d0
>> Copying bootstage from 00007fdb0056e038 to 00007fdb061c48f0, size 3c8
>> Relocation Offset is: 07c6a000
>> Relocating to 07c6a000, new gd at 05c67db0, sp at 05c668d0
>>
>>
>> If we use pointers we get things like this:
>>
>> Reserving 3670k for U-Boot at: 00007f1b58c8b008
>>
>> I just don't want to deal with 64-bit addresses when debugging things,
>> and there really is no point. The map_sysmem() function provides a
>> nice and easy way to cast an address to a pointer, and it works on all
>> architectures.
>
> Ok, circling back to square 1. The main issue we're facing here is that
> most code in U-Boot does not really understand the difference between
> virtual and physical addresses - it just simply assumes a 1:1 map.

Traditionally that was true, but over the past 5 years quite a bit of
code has been converted to work correctly with sandbox. Also, almost
all *tested* code supports sandbox.

>
> With sandbox, we do not have a 1:1 map anymore - any address is
> somewhere else than its pointer.
>
> We have a few options:
>
>   1) Deal with pointers at random addresses throughout the code
>
> I can see why you don't want this. I find ASLR generated addresses quite
> cumbersome to read as well.

I'm not sure what this means.

>
>   2) Explicitly map RAM at VA 0x10000000, then do 1:1 map
>
> This would be the best of all worlds still IMHO. That way we will have
> easily readable addresses (that are identical in 32bit and 64bit), but
> can still do a 1:1 map. The only thing we need to do is to introduce a
> linker section at 0x10000000.

Linker section or mmap() is essentially the same thing. The former
will presumably just fail to run (bus error?) if the address is not
available. The latter will select a similar, available address.

>
>   3) Keep converting addresses to pointers
>
> I'm afraid if we follow this path, we will always have arguments on
> where the correct boundaries are. If you start to enable debugging in
> core dm code and print out pointers to your dm objects, you will get
> pointer values today as well. Sooner or later we will always end up with
> pointers.

The addresses should generally be converted just before use. Most APIs
should use addresses rather than pointers, particularly those related
to images, loaded data in memory, hardware peripheral addresses and
booting.

Regards,
Simon
Alexander Graf June 20, 2018, 9:31 a.m. UTC | #6
On 06/20/2018 12:03 AM, Simon Glass wrote:
> Hi Alex,
>
> On 14 June 2018 at 15:55, Alexander Graf <agraf@suse.de> wrote:
>>
>> On 14.06.18 23:35, Simon Glass wrote:
>>> Hi Alex,
>>>
>>> On 14 June 2018 at 13:51, Alexander Graf <agraf@suse.de> wrote:
>>>>
>>>> On 14.06.18 21:01, Simon Glass wrote:
>>>>> On 14 June 2018 at 12:22, Alexander Graf <agraf@suse.de> wrote:
>>>>>> The fs_read() function wants to get an address rather than the
>>>>>> pointer to a buffer.
>>>>>>
>>>>>> So let's convert the passed buffer from pointer back a the address
>>>>>> to make efi_loader on sandbox happier.
>>>>>>
>>>>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>>>>>
>>>>>> ---
>>>>>>
>>>>>> v1 -> v2:
>>>>>>
>>>>>>    - Clarify address vs pointer
>>>>>>    - include mapmem.h
>>>>>> ---
>>>>>>   lib/efi_loader/efi_file.c | 5 ++++-
>>>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>>> Reviewed-by: Simon Glass <sjg@chromium.org>
>>>>>
>>>> I actually think that this patch tackles the problem the wrong way
>>>> around. I've cooked up another one that converts fs_read() and
>>>> fs_write() to instead take a pointer - which really is what most users
>>>> of the API want in the first place:
>>>>
>>>>
>>>> https://github.com/agraf/u-boot/commit/eb89f036a42cea8d7aaa6d83b8ecd9d202814b0f
>>> Actually this is a pretty good exposition of why we don't want to use
>>> pointers everywhere. U-Boot uses addresses all over the place. Even a
>>> search for something as specific as "ulong addr" gets over 200
>>> matches. If we go down this path we will end up changing a pretty
>>> fundamental part of U-Boot, and I don't think it is a good idea to do
>>> that. Addresses are easy to understand, can be added/subtracted
>>> without pointer arithmetic, can be easily converted to pointers as
>>> needed, can be 32-bits long on sandbox, etc.
>> I tend to disagree with this. Most code in U-Boot actually consumes
>> pointers rather than addresses.
> That might be true within individual components, but addresses are the
> common currency in U-Boot. See for example bootm, all the
> image-handling stuff including FIT, commands, device tree address
> properties. Pointers are more often used at the lowest level of code.

Yes, but all counterparts of fs_read and APIs on the same level use 
pointers. And please take a few minutes to look at its users. 90% of 
them are using pointers too.

As I mentioned elsewhere already, going from address to pointer is 
something reasonable. Going from address to pointer usually a sign of 
bad design ;). If most users of fs_read() are already using pointers, it 
means that either all of its users are wrong or fs_read() is wrong. I 
think the latter is the case.

>
>>> So I think we should step back from the abyss here and stick with the
>>> way sandbox currently deals with addresses. It works well, is pretty
>>> easy to understand, allows debugging which is just as easy on sandbox
>>> as other archs (since they both uses addresses until basically the
>>> final access), the addresses are typically small values that start at
>>> 0 which much is easier to read than (e.g.) 00007f1b58c8b008.
>>>
>>> Here for example is the output from starting U-Boot with debugging in
>>> board_f.c (something I have turned on a lot when refactoring and
>>> debugging the init sequence):
>>>
>>> U-Boot 2018.07-rc1-00142-g134ea86c7f-dirty (Jun 14 2018 - 15:04:04 -0600)
>>>
>>> DRAM:  Monitor len: 00395AB0
>>> Ram size: 08000000
>>> Ram top: 08000000
>>> Reserving 3670k for U-Boot at: 07c6a000
>>> Reserving 32776k for malloc() at: 05c68000
>>> Reserving 120 Bytes for Board Info at: 05c67f88
>>> Reserving 472 Bytes for Global Data at: 05c67db0
>>> Reserving 4352 Bytes for FDT at: 05c66cb0
>>> Reserving 0x3c8 Bytes for bootstage at: 05c668e8
>>>
>>> RAM Configuration:
>>> Bank #0: 0 128 MiB
>>>
>>> DRAM:  128 MiB
>>> New Stack Pointer is: 05c668d0
>>> Copying bootstage from 00007fdb0056e038 to 00007fdb061c48f0, size 3c8
>>> Relocation Offset is: 07c6a000
>>> Relocating to 07c6a000, new gd at 05c67db0, sp at 05c668d0
>>>
>>>
>>> If we use pointers we get things like this:
>>>
>>> Reserving 3670k for U-Boot at: 00007f1b58c8b008
>>>
>>> I just don't want to deal with 64-bit addresses when debugging things,
>>> and there really is no point. The map_sysmem() function provides a
>>> nice and easy way to cast an address to a pointer, and it works on all
>>> architectures.
>> Ok, circling back to square 1. The main issue we're facing here is that
>> most code in U-Boot does not really understand the difference between
>> virtual and physical addresses - it just simply assumes a 1:1 map.
> Traditionally that was true, but over the past 5 years quite a bit of
> code has been converted to work correctly with sandbox. Also, almost
> all *tested* code supports sandbox.

I think the difference is between infrastructure and device code. Most 
device code is still assuming 1:1 maps, but just doesn't get executed in 
sandbox.

>
>> With sandbox, we do not have a 1:1 map anymore - any address is
>> somewhere else than its pointer.
>>
>> We have a few options:
>>
>>    1) Deal with pointers at random addresses throughout the code
>>
>> I can see why you don't want this. I find ASLR generated addresses quite
>> cumbersome to read as well.
> I'm not sure what this means.

Address Space Layout Randomization. It's what causes these weird looking 
addresses ;).

>
>>    2) Explicitly map RAM at VA 0x10000000, then do 1:1 map
>>
>> This would be the best of all worlds still IMHO. That way we will have
>> easily readable addresses (that are identical in 32bit and 64bit), but
>> can still do a 1:1 map. The only thing we need to do is to introduce a
>> linker section at 0x10000000.
> Linker section or mmap() is essentially the same thing. The former
> will presumably just fail to run (bus error?) if the address is not
> available. The latter will select a similar, available address.

No, they're inherently different things. When a process starts, its 
virtual address space is basically empty (apart from the vdso and a tiny 
chunk of ld). The linker is then responsible to ensure that all 
application defined memory regions are mapped. Only after that happened, 
it will map linked libraries (such as glibc, SDL, whatever) at random 
places in address space.

So with a linker section, we're basically guaranteed to always have 
memory live at the same spot. With mmap() there is a good chance we're 
overlapping with something that happened to get mapped there in between.

>
>>    3) Keep converting addresses to pointers
>>
>> I'm afraid if we follow this path, we will always have arguments on
>> where the correct boundaries are. If you start to enable debugging in
>> core dm code and print out pointers to your dm objects, you will get
>> pointer values today as well. Sooner or later we will always end up with
>> pointers.
> The addresses should generally be converted just before use. Most APIs
> should use addresses rather than pointers, particularly those related
> to images, loaded data in memory, hardware peripheral addresses and
> booting.

I don't think that's a reasonable definition for the API requirements to 
be honest. I think addresses make a lot of sense as long as you don't 
want to touch any data inside. You basically use them like a token. Once 
you've gone past that point - so you've run map_sysmem() on that address 
- we should stay in pointer land IMHO.


Alex
Simon Glass June 21, 2018, 2:02 a.m. UTC | #7
Hi Alex,

On 20 June 2018 at 03:31, Alexander Graf <agraf@suse.de> wrote:
> On 06/20/2018 12:03 AM, Simon Glass wrote:
>>
>> Hi Alex,
>>
>> On 14 June 2018 at 15:55, Alexander Graf <agraf@suse.de> wrote:
>>>
>>>
>>> On 14.06.18 23:35, Simon Glass wrote:
>>>>
>>>> Hi Alex,
>>>>
>>>> On 14 June 2018 at 13:51, Alexander Graf <agraf@suse.de> wrote:
>>>>>
>>>>>
>>>>> On 14.06.18 21:01, Simon Glass wrote:
>>>>>>
>>>>>> On 14 June 2018 at 12:22, Alexander Graf <agraf@suse.de> wrote:
>>>>>>>
>>>>>>> The fs_read() function wants to get an address rather than the
>>>>>>> pointer to a buffer.
>>>>>>>
>>>>>>> So let's convert the passed buffer from pointer back a the address
>>>>>>> to make efi_loader on sandbox happier.
>>>>>>>
>>>>>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>>>>>>
>>>>>>> ---
>>>>>>>
>>>>>>> v1 -> v2:
>>>>>>>
>>>>>>>    - Clarify address vs pointer
>>>>>>>    - include mapmem.h
>>>>>>> ---
>>>>>>>   lib/efi_loader/efi_file.c | 5 ++++-
>>>>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> Reviewed-by: Simon Glass <sjg@chromium.org>
>>>>>>
>>>>> I actually think that this patch tackles the problem the wrong way
>>>>> around. I've cooked up another one that converts fs_read() and
>>>>> fs_write() to instead take a pointer - which really is what most users
>>>>> of the API want in the first place:
>>>>>
>>>>>
>>>>>
>>>>> https://github.com/agraf/u-boot/commit/eb89f036a42cea8d7aaa6d83b8ecd9d202814b0f
>>>>
>>>> Actually this is a pretty good exposition of why we don't want to use
>>>> pointers everywhere. U-Boot uses addresses all over the place. Even a
>>>> search for something as specific as "ulong addr" gets over 200
>>>> matches. If we go down this path we will end up changing a pretty
>>>> fundamental part of U-Boot, and I don't think it is a good idea to do
>>>> that. Addresses are easy to understand, can be added/subtracted
>>>> without pointer arithmetic, can be easily converted to pointers as
>>>> needed, can be 32-bits long on sandbox, etc.
>>>
>>> I tend to disagree with this. Most code in U-Boot actually consumes
>>> pointers rather than addresses.
>>
>> That might be true within individual components, but addresses are the
>> common currency in U-Boot. See for example bootm, all the
>> image-handling stuff including FIT, commands, device tree address
>> properties. Pointers are more often used at the lowest level of code.
>
>
> Yes, but all counterparts of fs_read and APIs on the same level use
> pointers. And please take a few minutes to look at its users. 90% of them
> are using pointers too.

Do you mean the callers of fs_read()? That's not the point. Of course
there will be conversion from addresses to pointers, but I am trying
to keep those at the 'edge', away from the core U-Boot code. Your
patch is basically designed to avoid a pointer conversion (which you
can't support) in the filesystem layer. Next you will be doing this in
networking, display, etc. etc. There is just no need for this change.

>
> As I mentioned elsewhere already, going from address to pointer is something
> reasonable. Going from address to pointer usually a sign of bad design ;).
> If most users of fs_read() are already using pointers, it means that either
> all of its users are wrong or fs_read() is wrong. I think the latter is the
> case.

By that logic you would have U-Boot not use addresses at all, and have
pointers everywhere. It just isn't the way the code base is today.

>
>
>>
>>>> So I think we should step back from the abyss here and stick with the
>>>> way sandbox currently deals with addresses. It works well, is pretty
>>>> easy to understand, allows debugging which is just as easy on sandbox
>>>> as other archs (since they both uses addresses until basically the
>>>> final access), the addresses are typically small values that start at
>>>> 0 which much is easier to read than (e.g.) 00007f1b58c8b008.
>>>>
>>>> Here for example is the output from starting U-Boot with debugging in
>>>> board_f.c (something I have turned on a lot when refactoring and
>>>> debugging the init sequence):
>>>>
>>>> U-Boot 2018.07-rc1-00142-g134ea86c7f-dirty (Jun 14 2018 - 15:04:04
>>>> -0600)
>>>>
>>>> DRAM:  Monitor len: 00395AB0
>>>> Ram size: 08000000
>>>> Ram top: 08000000
>>>> Reserving 3670k for U-Boot at: 07c6a000
>>>> Reserving 32776k for malloc() at: 05c68000
>>>> Reserving 120 Bytes for Board Info at: 05c67f88
>>>> Reserving 472 Bytes for Global Data at: 05c67db0
>>>> Reserving 4352 Bytes for FDT at: 05c66cb0
>>>> Reserving 0x3c8 Bytes for bootstage at: 05c668e8
>>>>
>>>> RAM Configuration:
>>>> Bank #0: 0 128 MiB
>>>>
>>>> DRAM:  128 MiB
>>>> New Stack Pointer is: 05c668d0
>>>> Copying bootstage from 00007fdb0056e038 to 00007fdb061c48f0, size 3c8
>>>> Relocation Offset is: 07c6a000
>>>> Relocating to 07c6a000, new gd at 05c67db0, sp at 05c668d0
>>>>
>>>>
>>>> If we use pointers we get things like this:
>>>>
>>>> Reserving 3670k for U-Boot at: 00007f1b58c8b008
>>>>
>>>> I just don't want to deal with 64-bit addresses when debugging things,
>>>> and there really is no point. The map_sysmem() function provides a
>>>> nice and easy way to cast an address to a pointer, and it works on all
>>>> architectures.
>>>
>>> Ok, circling back to square 1. The main issue we're facing here is that
>>> most code in U-Boot does not really understand the difference between
>>> virtual and physical addresses - it just simply assumes a 1:1 map.
>>
>> Traditionally that was true, but over the past 5 years quite a bit of
>> code has been converted to work correctly with sandbox. Also, almost
>> all *tested* code supports sandbox.
>
>
> I think the difference is between infrastructure and device code. Most
> device code is still assuming 1:1 maps, but just doesn't get executed in
> sandbox.

If you mean the drivers, then yes I agree. The drivers typically get
an address and map it to a pointer which they use from then on.

I do have some ideas about executing driver code in sandbox. We
already build a lot of drivers in sandbox, but execute only a very
few. But sandbox could absolutely provide a way to test driver code,
with suitable plumbing. Of course, we need address mapping...:-)

>
>>
>>> With sandbox, we do not have a 1:1 map anymore - any address is
>>> somewhere else than its pointer.
>>>
>>> We have a few options:
>>>
>>>    1) Deal with pointers at random addresses throughout the code
>>>
>>> I can see why you don't want this. I find ASLR generated addresses quite
>>> cumbersome to read as well.
>>
>> I'm not sure what this means.
>
>
> Address Space Layout Randomization. It's what causes these weird looking
> addresses ;).

No, I mean, I am not sure what your comment means. I know what ASLR
is, but I just don't understand your comment as a whole. What problem
are you trying to solve?

>
>>
>>>    2) Explicitly map RAM at VA 0x10000000, then do 1:1 map
>>>
>>> This would be the best of all worlds still IMHO. That way we will have
>>> easily readable addresses (that are identical in 32bit and 64bit), but
>>> can still do a 1:1 map. The only thing we need to do is to introduce a
>>> linker section at 0x10000000.
>>
>> Linker section or mmap() is essentially the same thing. The former
>> will presumably just fail to run (bus error?) if the address is not
>> available. The latter will select a similar, available address.
>
>
> No, they're inherently different things. When a process starts, its virtual
> address space is basically empty (apart from the vdso and a tiny chunk of
> ld). The linker is then responsible to ensure that all application defined
> memory regions are mapped. Only after that happened, it will map linked
> libraries (such as glibc, SDL, whatever) at random places in address space.
>
> So with a linker section, we're basically guaranteed to always have memory
> live at the same spot. With mmap() there is a good chance we're overlapping
> with something that happened to get mapped there in between.

I don't think so. The only calls to mmap() in U-Boot are in os.c so we
can control use of this <4GB address. From my limited testing with
64-bit machines I always get a huge address for things where I don't
specify a hint address.

>
>>
>>>    3) Keep converting addresses to pointers
>>>
>>> I'm afraid if we follow this path, we will always have arguments on
>>> where the correct boundaries are. If you start to enable debugging in
>>> core dm code and print out pointers to your dm objects, you will get
>>> pointer values today as well. Sooner or later we will always end up with
>>> pointers.
>>
>> The addresses should generally be converted just before use. Most APIs
>> should use addresses rather than pointers, particularly those related
>> to images, loaded data in memory, hardware peripheral addresses and
>> booting.
>
>
> I don't think that's a reasonable definition for the API requirements to be
> honest. I think addresses make a lot of sense as long as you don't want to
> touch any data inside. You basically use them like a token. Once you've gone
> past that point - so you've run map_sysmem() on that address - we should
> stay in pointer land IMHO.

Again, by that logic we would never uses addresses in U-Boot.
Everything would be a pointer. Addresses have a number of advantages
over pointers:

- easy to do arithmetic
- don't need casts
- easy to printf() without getting massive 16-digit hex strings all the time
- can be 32-bits wide in sandbox even on a 64-bit machine

Probably some others...you yourself said you cannot do pointer
arithmetic without casting to a uintptr_t first. I don't think that is
true, but it is indicative of the messiness of dealing with pointers.

Regards,
Simon
diff mbox series

Patch

diff --git a/lib/efi_loader/efi_file.c b/lib/efi_loader/efi_file.c
index e6a15bcb52..2107730ba5 100644
--- a/lib/efi_loader/efi_file.c
+++ b/lib/efi_loader/efi_file.c
@@ -9,6 +9,7 @@ 
 #include <charset.h>
 #include <efi_loader.h>
 #include <malloc.h>
+#include <mapmem.h>
 #include <fs.h>
 
 /* GUID for file system information */
@@ -232,8 +233,10 @@  static efi_status_t file_read(struct file_handle *fh, u64 *buffer_size,
 		void *buffer)
 {
 	loff_t actread;
+	/* fs_read expects buffer address, not pointer */
+	uintptr_t buffer_addr = (uintptr_t)map_to_sysmem(buffer);
 
-	if (fs_read(fh->path, (ulong)buffer, fh->offset,
+	if (fs_read(fh->path, buffer_addr, fh->offset,
 		    *buffer_size, &actread))
 		return EFI_DEVICE_ERROR;