Message ID | cover.1678785672.git.baskov@ispras.ru |
---|---|
Headers | show |
Series | x86_64: Improvements at compressed kernel stage | expand |
On Tue, Mar 14, 2023, at 2:23 PM, Andy Lutomirski wrote: > On Tue, Mar 14, 2023, at 3:13 AM, Evgeniy Baskov wrote: >> >> Kernel is made to be more compatible with PE image specification [3], >> allowing it to be successfully loaded by stricter PE loader >> implementations like the one from [2]. There is at least one >> known implementation that uses that loader in production [4]. >> There are also ongoing efforts to upstream these changes. > > Can you clarify Sorry, lost part of a sentence. Can you clarify in what respect the loader is stricter? Anyway, I did some research. I found: https://github.com/rhboot/shim/pull/459/commits/99a8d19326f69665e0b86bcfa6a59d554f662fba which gives a somewhat incoherent-sounding description in which setting EFI_IMAGE_DLLCHARACTERISTICS_NX_COMPAT apparently enables allocating memory that isn't RWX. But this seems odd EFI_IMAGE_DLLCHARACTERISTICS_NX_COMPAT is a property of the EFI *program*, not the boot services implementation. And I'd be surprised if a flag on the application changes the behavior of boot services, but, OTOH, this is Microsoft. And the PE 89 spec does say that EFI_IMAGE_DLLCHARACTERISTICS_NX_COMPAT means "Image is NX compatible" and that is the sole mention of NX in the document. And *this* seems to be the actual issue: https://github.com/rhboot/shim/pull/459/commits/825d99361b4aaa16144392dc6cea43e24c8472ae I assume that MS required this change as a condition for signing, but what do I know? Anyway, the rules appear to be that the PE sections must not be both W and X at the same size. (For those who are familiar with the abomination known as ELF but not with the abomination known as PE, a "section" is a range in the file that gets mapped into memory. Like a PT_LOAD segment in ELF.) Now I don't know whether anything prevents us from doing something awful like mapping the EFI stuf RX and then immediately *re*mapping everything RWX. (Not that I'm seriously suggesting that.) And it's not immediately clear to me how the rest of this series fits in, what this has to do with the identity map, etc. Anyway, I think the series needs to document what's going on, in the changelog and relevant comments. And if the demand-population of the identity map is a problem, then there should be a comment like (made up -- don't say this unless it's correct): A sufficiently paranoid EFI implementation may enforce W^X when mapping memory through the boot services protocols. And creating identity mappings in the page fault handler needs to use the boot services protocols to do so because [fill this in] [or it would be a bit of an abomination to do an end run around them by modifying the page tables ourselves] [or whatever is actually happening]. While we *could* look at the actual fault type and create an R or RW or RX mapping as appropriate, it's better to figure out what needs to be mapped for real and to map it with the correct permissions before faulting. But I still think we should keep the demand-faulting code as a fallback, even if it's hardcoded as RW, and just log the fault mode and address. We certainly shouldn't be *executing* code that wasn't identity mapped. Unless that code is boot services and we're creating the boot services mappings! For that matter, how confident are we that there aren't crappy boot services implementations out there that require that we fix up page faults? After all, it's not like EFI implementations, especially early ones, are any good. --Andy
Hi, > And *this* seems to be the actual issue: > > https://github.com/rhboot/shim/pull/459/commits/825d99361b4aaa16144392dc6cea43e24c8472ae > > I assume that MS required this change as a condition for signing, but what do I know? Your guess is correct. UEFI world is moving to being stricter, for example set page permissions according to the allocation type (RW for data, RX for code). Microsoft raised the bar for PE binaries when it comes to secure boot signing as part of that effort. Being a valid PE binary according to the PE spec is not good enough, some additional constrains like sections not overlapping and sections with different load flags not sharing pages (so setting strict page permissions is actually possible) are required now. Stuff which is standard since years elsewhere. > Anyway, the rules appear to be that the PE sections must not be both W and X at the same size. That too. > But I still think we should keep the demand-faulting code as a > fallback, even if it's hardcoded as RW, and just log the fault mode > and address. We certainly shouldn't be *executing* code that wasn't > identity mapped. Unless that code is boot services and we're creating > the boot services mappings! Agree. > For that matter, how confident are we that there aren't crappy boot > services implementations out there that require that we fix up page > faults? After all, it's not like EFI implementations, especially early > ones, are any good. I don't expect much problems here. Early EFI implementations don't bother setting page permissions and just identity-map everything using rwx huge pages, or run with paging turned off (hello ia32). But playing safe (and keep demand-faulting just in case) is a good idea nevertheless. take care, Gerd
On 2023-03-15 00:23, Andy Lutomirski wrote: > On Tue, Mar 14, 2023, at 3:13 AM, Evgeniy Baskov wrote: >> This patchset is aimed >> * to improve UEFI compatibility of compressed kernel code for x86_64 >> * to setup proper memory access attributes for code and rodata >> sections >> * to implement W^X protection policy throughout the whole execution >> of compressed kernel for EFISTUB code path. > > The overall code quality seems okay, but I have some questions as to > what this is for. The early boot environment is not exposed to most > sorts of attacks -- there's no userspace, there's no network, and > there is not a whole lot of input that isn't implicitly completely > trusted. > > What parts of this series are actually needed to get these fancy new > bootloaders to boot Linux? And why? Well, most of the series is needed, except for may be adding W^X for the non-UEFI boot path (patches 3-9), but those add changes, required for booting via UEFI, like memory protection call-backs. And since the important callbacks are already in-place W^X for non-UEFI won't be too undesired property. The most part of this series (3-16,26,27) implements W^X, and the remaining patches improves the compatibility of PE, which includes: * Removing W+X sections (which is now required as Gerd have already mentioned or at least very desired) * Aligning sections to the page size in memory and to minimal file alignment in file. * Aligning data structures on their natural alignment (e.g. [2] requires it) * Filling more PE header fields to their actual values. * Removing alignment flags on sections, which according to the spec, is only for object files. * Filling in relocation data directory and its rounding the size to 4 bytes. Most of this work is done in the patch 24 "x86/build: Make generated PE more spec compliant", but it also requires working W^X due to the removal of W+X sections and some clean-up work from patches 17-23. > >> >> Kernel is made to be more compatible with PE image specification [3], >> allowing it to be successfully loaded by stricter PE loader >> implementations like the one from [2]. There is at least one >> known implementation that uses that loader in production [4]. >> There are also ongoing efforts to upstream these changes. > > Can you clarify > >> >> Also the patchset adds EFI_MEMORY_ATTTRIBUTE_PROTOCOL, included into >> EFI specification since version 2.10, as a better alternative to >> using DXE services for memory protection attributes manipulation, >> since it is defined by the UEFI specification itself and not UEFI PI >> specification. This protocol is not widely available so the code >> using DXE services is kept in place as a fallback in case specific >> implementation does not support the new protocol. >> One of EFI implementations that already support >> EFI_MEMORY_ATTTRIBUTE_PROTOCOL is Microsoft Project Mu [5]. > > Maybe make this a separate series? This now is just one fairly straight forward patch, since the protocol definitions are already got accepted and the protocol is used elsewhere in the EFISTUB. This patch would also have to be replaced, rather than removed if it's made a separate series, since it adds a warning about W+X mappings. > >> >> Kernel image generation tool (tools/build.c) is refactored as a part >> of changes that makes PE image more compatible. >> >> The patchset implements memory protection for compressed kernel >> code while executing both inside EFI boot services and outside of >> them. For EFISTUB code path W^X protection policy is maintained >> throughout the whole execution of compressed kernel. The latter >> is achieved by extracting the kernel directly from EFI environment >> and jumping to it's head immediately after exiting EFI boot services. >> As a side effect of this change one page table rebuild and a copy of >> the kernel image is removed. > > I have no problem with this, but what's it needed for? The one hard part that made the series more complicated is that non-UEFI (or rather the only) boot path relocates the kernel, which messes up the memory protection for sections set by the UEFI. I did not want to remove the support of in-place extraction and relocation, when loaded in inappropriate place, for the non-UEFI boot path, which is why extraction from boot services was implemented. A proper W^X in EFISTUB is a side effect, but the desired one. The alternative would be to make the whole image RWX after the EFISTUB execution. But the current approach is a lot nicer solution. > >> >> Memory protection inside EFI environment is controlled by the >> CONFIG_DXE_MEM_ATTRIBUTES option, although with these patches this >> option also control the use EFI_MEMORY_ATTTRIBUTE_PROTOCOL and memory >> protection attributes of PE sections and not only DXE services as the >> name might suggest. >> > >> [1] >> https://lore.kernel.org/lkml/893da11995f93a7ea8f7485d17bf356a@ispras.ru/ >> [2] https://github.com/acidanthera/audk/tree/secure_pe > > Link is broken Ah, sorry, the branch was merged into master since I've first posted the series, so the working link is: https://github.com/acidanthera/audk The loader itself is here: https://github.com/acidanthera/audk/tree/master/MdePkg/Library/BasePeCoffLib2 > >> [3] >> https://download.microsoft.com/download/9/c/5/9c5b2167-8017-4bae-9fde-d599bac8184a/pecoff_v83.docx > > I skimmed this very briefly, and I have no idea what I'm supposed to > look at. This is the entire PE spec! I gave some explanations above, which are mostly the duplicates of the patch 24 "x86/build: Make generated PE more spec compliant" commit message. Thanks, Evgeniy Baskov
On Tue, Mar 14, 2023 at 04:20:43PM -0700, Andy Lutomirski wrote: > > > On Tue, Mar 14, 2023, at 2:23 PM, Andy Lutomirski wrote: > > On Tue, Mar 14, 2023, at 3:13 AM, Evgeniy Baskov wrote: > >> > >> Kernel is made to be more compatible with PE image specification [3], > >> allowing it to be successfully loaded by stricter PE loader > >> implementations like the one from [2]. There is at least one > >> known implementation that uses that loader in production [4]. > >> There are also ongoing efforts to upstream these changes. > > > > Can you clarify > > Sorry, lost part of a sentence. Can you clarify in what respect the loader is stricter? > > > Anyway, I did some research. I found: > > https://github.com/rhboot/shim/pull/459/commits/99a8d19326f69665e0b86bcfa6a59d554f662fba > > which gives a somewhat incoherent-sounding description in which > setting EFI_IMAGE_DLLCHARACTERISTICS_NX_COMPAT apparently enables > allocating memory that isn't RWX. But this seems odd > EFI_IMAGE_DLLCHARACTERISTICS_NX_COMPAT is a property of the EFI > *program*, not the boot services implementation. Well, "is this binary compatible" is a property of the program, yes. It's up to the loader to decide if it *cares*, and the compatibility flag allows it to do that. > And I'd be surprised if a flag on the application changes the behavior > of boot services, but, OTOH, this is Microsoft. There has been discussion of implementing a compatibility mode that allows you to enable NX support by default, but only breaks the old assumptions that the stack and memory allocations will be executable if the flag is set, so that newer OSes get the mitigations we need, but older OSes still work. I don't think anyone has actually implemented this *yet*, but some hardware vendors have made noises that sound like they may intend to. (I realize that sounds cagey as hell. Sorry.) Currently I think the only shipping systems that implement NX-requirements are from Microsoft - the Surface product line and Windows Dev Kit - and they don't allow you to disable it at all. Other vendors have produced firmware that isn't shipping yet (I *think*) that has it as a setting in the firmware menu, and they're looking to move to enabling it by default on some product lines. We'd like to not be left behind. > And the PE 89 spec does say that > EFI_IMAGE_DLLCHARACTERISTICS_NX_COMPAT means "Image is NX compatible" > and that is the sole mention of NX in the document. Yeah, the PE spec is not very good in a lot of ways unrelated to how abominable the thing it's describing is. > And *this* seems to be the actual issue: > > https://github.com/rhboot/shim/pull/459/commits/825d99361b4aaa16144392dc6cea43e24c8472ae > > I assume that MS required this change as a condition for signing, but > what do I know? Yes, they have, but it's not as if they did it in a vacuum. I think the idea was originally Kees Cook's actually, and there's been a significant effort on the firmware and bootloader side to enable it. And there's good reason to do this, too - more and more of this surface is being attacked, and recently we've seen the first "bootkit" that actually includes a Secure Boot breakout in the wild: https://www.welivesecurity.com/2023/03/01/blacklotus-uefi-bootkit-myth-confirmed/ While that particular malware (somewhat ironically) only uses code developed for linux systems *after* the exploit, it could easily have gone the other way, and we're definitely a target here. We need NX in our boot path, and soon. > Anyway, the rules appear to be that the PE sections > must not be both W and X at the same size. (For those who are > familiar with the abomination known as ELF but not with the > abomination known as PE, a "section" is a range in the file that gets > mapped into memory. Like a PT_LOAD segment in ELF.) > > Now I don't know whether anything prevents us from doing something > awful like mapping the EFI stuf RX and then immediately *re*mapping > everything RWX. (Not that I'm seriously suggesting that.) Once we've taken over paging, nothing stops us at all. Until then, SetMemoryAttributes() (which is more or less mprotect()) might prevent it. > And it's not immediately clear to me how the rest of this series fits > in, what this has to do with the identity map, etc. I'll let Evgeniy address that and the rest of this.
On Wed, Mar 15, 2023 at 01:57:33PM -0400, Peter Jones wrote: > Currently I think the only shipping systems that implement > NX-requirements are from Microsoft - the Surface product line and > Windows Dev Kit - and they don't allow you to disable it at all. Other > vendors have produced firmware that isn't shipping yet (I *think*) that > has it as a setting in the firmware menu, and they're looking to move to > enabling it by default on some product lines. I hope they realize that they must leave the off switch in the BIOS for older kernels...