apianti Posted January 5, 2018

"Maybe you now know why I was... mildly unhappy with the comment. v_v Regarding the "global variable problem": It was a conclusion from the test results drawn from false premises... now I'll shut up till there is a working solution."

Which comment? My first comment? That was clearly just me saying that I had pointed out a problem with RT_Code relocation two months ago. I really think you took it as an insult when I was just being like, what's up. It's just a misunderstanding, so no, I don't understand why you were unhappy with the comment. Have I ever once come to anyone and been like "GIVE ME CREDIT! I NEED CREDIT!!"? If anything, it's the opposite: I could be bragging all the time asking for props, but I couldn't care less. I'm much happier with people having working systems and that making them happy. Credit is worthless, and I've said that many times; a thanks is like a million times more wonderful to me. Like crack for the soul.

"have more respect another one died.... is that number 7 now?.... hmm i wonder if it's copyrighted."

Yeah, and now its whole family doesn't work at Microsoft, Google, Apple, Intel, Facebook, Amazon, IBM, Dell, HP, etc.... So it can't get the inside "sitch".
vit9696 Posted January 6, 2018

Ok gentlemen, after some reasonably extensive research we were able to mostly understand several problems preventing users of AMI APTIO from having working NVRAM in macOS. As a result we have a reasonably reliable yet dirty solution. As a side issue, we suppose we may have discovered a terrible memory corruption issue caused by an interaction of AMI APTIO and boot.efi, which we suppose we also fixed.

Some of the information was mentioned in this topic and/or on applelife, but we consider it important to give a summary that covers all the details, especially because the information obtained during the research was very confusing, which led to certain misunderstandings.

1. ASUS APTIO IV Z97 Motherboards

Described here: http://www.insanelymac.com/forum/topic/317802-efi-variable-store-on-aptio-v-haswell-e-and-up/page-6?do=findComment&comment=2535040

After disassembling, it was discovered that several APTIO IV drivers, including the presented one, implement a variable whitelist and disallow writing anything but the variables from the list. It is unclear whether this was intentional or just a logical mistake, but the most reasonable solution is to replace the NvramSmi driver with the working one from a previous firmware and reflash.

EFI_GLOBAL_VARIABLE_GUID: Lang, Timeout, PlatformLang, ConIn, ConOut, ErrOut, BootOrder, BootNext, DriverOrder, HwErrRecSupport, OsIndications, PK, KEK, FTMActiveFlag
EFI_IMAGE_SECURITY_DATABASE_GUID: db, dbx

2. APTIO V firmwares with working NVRAM

Some APTIO V firmwares (pre-KabyLake at least) do not use a new NVRAM driver implementation, but rely on a driver that can be found on older Z87 or Z77 APTIO IV motherboards. For this reason NVRAM works fine on them. Notably, the extensive changelog of the NVRAM driver covers a lot of issues that arose during the development.
Yet it should be noted that once the firmware has been upgraded it is no longer possible to use the older driver, as can be done on Z97 motherboards, due to a completely new stack.

3. APTIO V firmwares with not working NVRAM

a) Before explaining the details of the new bugs we have to go back to describing NVRAM issues on APTIO IV.

As everyone knows, during the loading process:
- boot.efi discards all the memory that is not EFI_MEMORY_RUNTIME
- boot.efi physically moves EfiRuntimeServicesCode and EfiRuntimeServicesData regions so that they go one by one, and zeroes the original area
- boot.efi assigns virtual addresses to EFI_MEMORY_RUNTIME regions
- XNU maps EfiRuntimeServicesCode as RX memory and everything else supplied as RW memory

However, AMI SMM drivers preserve the original physical address of EfiRuntimeServicesData and use this memory for communication. Since SMM drivers cannot be easily changed, the AptioFix driver prohibits EfiRuntimeServicesData from being moved by marking it as an EfiMemoryMappedIO region. To our surprise, the AptioFix driver does not revert the "temporarily" changed types back to EfiRuntimeServicesData before starting XNU, which undesirably leads to VM_MEM_NOT_CACHEABLE | VM_MEM_GUARDED flags being used in the XNU mapping, yet this is not known to cause practical issues.

b) What did the original solution not care about?

There exists a certain FlashDriver that exposes AMI_FLASH_PROTOCOL (755B6596-6896-4ba3-B3DD-1C629FD1EA88). On APTIO IV this protocol is not used after ExitBootServices (if not earlier) by any EFI_RUNTIME_SERVICES, which was proven by mapping it RW without the X bit, yet the driver is part of EfiRuntimeServicesCode. It was discovered that on APTIO IV AMI SMM drivers hold the physical address of one of the static variables of this driver, and they use this variable for write access during all three calls to the NVRAM-related EFI_RUNTIME_SERVICES (GetVariable/SetVariable/GetNextVariableName).
Therefore, once boot.efi has moved the driver, invoking any NVRAM procedure effectively leads to arbitrary memory corruption (the affected memory happens to be used for RW access by XNU, so the writes do not trigger a kernel panic with a page fault).

c) What did APTIO V do?

On APTIO V they appear to have additionally changed the implementation to work via a shared SMM/DXE buffer. However, unlike APTIO IV, which used EfiRuntimeServicesData, this shared buffer became a static variable in some driver too. Once again, since SMM uses physical addresses for the buffer, after boot.efi moves the problematic driver it is no longer able to communicate with SMM.

For quite some time (probably due to the lack of hardware) we thought that the problematic driver was the same FlashDriver as on APTIO IV. However, this is not the case, and on APTIO V it is some other unknown driver. For this reason we prevented the whole EfiRuntimeServicesCode area from being moved, just like with EfiRuntimeServicesData. After we mapped all the EfiRuntimeServicesCode memory as RWX in XNU, NVRAM started to work fine. As a result we discovered the memory corruption mentioned in (b) due to WP page faults. Summary follows.

APTIO IV:
FlashDriver  RW   EfiRuntimeServicesData & force same address → nvram works
FlashDriver  R X  EfiRuntimeServicesCode & force same address → write page fault
FlashDriver  RWX  EfiRuntimeServicesCode & force same address → nvram works
FlashDriver  RWX  EfiRuntimeServicesData & force same address → nvram works
RTCode       RWX  EfiRuntimeServicesCode & force same address → nvram works

APTIO V:
FlashDriver  RW   EfiRuntimeServicesData & force same address → nvram does not work
FlashDriver  R X  EfiRuntimeServicesCode & force same address → write page fault
FlashDriver  RWX  EfiRuntimeServicesCode & force same address → nvram does not work
FlashDriver  RWX  EfiRuntimeServicesData & force same address → nvram does not work
RTCode       RWX  EfiRuntimeServicesCode & force same address → nvram works

d) What did we do?
Firstly, we prohibited all the EfiRuntimeServicesCode from being moved by boot.efi. It was a tough decision whether to try to identify the exact driver and keep only it from being moved, or to keep the entire code area in place. We chose the latter, because -1- the NVRAM bugs clearly show that AMI APTIO does not support relocation and -2- if there exist any more write accesses from SMM to DXE, we will discover them by triggering a kernel panic (with a WP page fault) instead of just silently corrupting an arbitrary memory area.

Secondly, we created dedicated shims for GetVariable/SetVariable/GetNextVariableName which unset the WP bit during the execution of the UEFI code. This is safe to do in an SMP environment, because the AppleEFIRuntime kext performs a lock call during any UEFI code calls, and thus effectively disables CPU preemption. In contrast, patching the kernel to map all the UEFI memory as RWX is an unnecessary risk, which is furthermore prone to errors due to instruction changes. It should be noted that it is not possible to upgrade the memory region protection from a kext either (vm_protect fails because the maximum protection is set to too low a value).

The relevant changes for the OsxAptioFix2 driver and a prebuilt OsxAptioFix2 driver version which contains the fix: OsxAptioFix2Drv-WTF.zip

Signed off: Download-Fritz, vit9696, and everyone who shared their wisdom and helped to test
06.01.17 4:30 MSK
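The shim idea from the post above (drop write protection only while the firmware call runs, then put it back) can be pictured with a small user-space model. This is only a sketch under stated assumptions: it does not touch the real CR0 register (that needs ring 0), the register is modelled as a plain variable, and `fake_cr0`, `shim_call` and `demo_service` are hypothetical names, not anything from OsxAptioFix2. The one hard fact used is that WP is bit 16 of CR0.

```c
#include <stdint.h>

#define CR0_WP (UINT64_C(1) << 16)  /* Write Protect is bit 16 of CR0 */

/* Stand-in for the real control register; an assumption for illustration. */
static uint64_t fake_cr0 = CR0_WP;

typedef int (*service_fn)(void *ctx);

/* Clear WP, run the wrapped service, then restore the caller's value. */
static int shim_call(service_fn fn, void *ctx)
{
    uint64_t saved = fake_cr0;
    fake_cr0 = saved & ~CR0_WP;   /* writes to read-only pages now allowed */
    int status = fn(ctx);
    fake_cr0 = saved;             /* protection is back before returning */
    return status;
}

/* Hypothetical service that records whether WP was set while it ran. */
static int demo_service(void *ctx)
{
    *(int *)ctx = (fake_cr0 & CR0_WP) != 0;
    return 0;
}
```

Calling `shim_call(demo_service, &flag)` leaves `flag` at 0 (WP was clear during the call) while `fake_cr0` has WP set again afterwards. The real shims additionally rely on preemption being disabled for the duration of the call, as the post explains.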
apianti Posted January 6, 2018

So only the system table copy is relocated now then? This seems like a good fix, but I haven't looked at the code yet. Can I get some confirmations that it works for users of all firmwares?

"Firstly, we prohibited all the EfiRuntimeServicesCode from being moved by boot.efi. It was a tough decision whether to try to identify the exact driver and keep only it from being moved, or to keep the entire code area in place. We chose the latter, because -1- the NVRAM bugs clearly show that AMI APTIO does not support relocation and -2- if there exist any more write accesses from SMM to DXE, we will discover them by triggering a kernel panic (with a WP page fault) instead of just silently corrupting an arbitrary memory area."

This was a point I was trying to make to DF about scoping the solution too narrowly.

"Secondly, we created dedicated shims for GetVariable/SetVariable/GetNextVariableName which unset the WP bit during the execution of the UEFI code. This is safe to do in an SMP environment, because the AppleEFIRuntime kext performs a lock call during any UEFI code calls, and thus effectively disables CPU preemption. In contrast, patching the kernel to map all the UEFI memory as RWX is an unnecessary risk, which is furthermore prone to errors due to instruction changes. It should be noted that it is not possible to upgrade the memory region protection from a kext either (vm_protect fails because the maximum protection is set to too low a value)."

This is a really good idea instead of patching the kernel. As you know, UEFI can only be accessed from the BSP and may thunk to another mode, like SMM, so it must disable preemption and it will always run on that same processor. I agree that making those pages entirely writeable is a huge security risk. Good job guys. Going to look at the code and commit if it's cool, which it probably is.

EDIT: Code looks good but I somehow broke my BaseTools... lol. So I'm having to rebuild everything....
And it is slooooowww.... Give me a second and I will commit. If this fix works for everyone, which it should since it looks good, then there is no point in maintaining the two separate versions of AptioFix. We should move to AptioFix2 only and rename it to something else.

EDIT2: There's still one problem to solve now (it's actually two though): the contiguous memory problem with X99+ firmwares. I know how to solve the one part, but it's not going to work for v2 because it would require actually garbage collecting and forcing memory regions to be allocated as high as possible. I've already done this in v3; I tried to do it for v2, but god, it has no hope on that front. The other part is that the firmware itself allocates memory regions all over the place before it even passes off to a boot loader, so those may require a firmware modification to force the allocation of these regions as high as possible too. A small driver to override the memory methods for boot services could easily be inserted to be loaded directly after DxeCore, so the memory is allocated high throughout the rest of boot.
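To make "allocated as high as possible" concrete, here is a minimal sketch of a top-down allocator over a list of free regions. It is an illustration only, not the v3 code mentioned above; the `free_region` type and `alloc_high` function are hypothetical, and real firmware would work on the UEFI memory map rather than on a plain array.

```c
#include <stdint.h>
#include <stddef.h>

#define PAGE_SIZE 4096u

typedef struct {
    uint64_t start;   /* physical start, page aligned */
    uint64_t pages;   /* number of free 4 KiB pages */
} free_region;

/* Carve `pages` pages from the top of the highest free region that can
 * hold them. Returns the allocation start, or 0 if nothing fits. */
static uint64_t alloc_high(free_region *regions, size_t count, uint64_t pages)
{
    free_region *best = NULL;
    for (size_t i = 0; i < count; i++) {
        if (regions[i].pages >= pages &&
            (best == NULL || regions[i].start > best->start))
            best = &regions[i];
    }
    if (best == NULL)
        return 0;
    best->pages -= pages;                        /* shrink from the top */
    return best->start + best->pages * PAGE_SIZE;
}
```

Taking memory from the top of the highest region keeps the low addresses, where the kernel wants one large contiguous block, as unfragmented as possible.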
apianti Posted January 6, 2018

DF, after I apply your patch I get build errors..

OsxAptioFixDrv.lib(BootFixes.obj) : error LNK2001: unresolved external symbol RTShims
OsxAptioFixDrv.lib(BootFixes.obj) : error LNK2001: unresolved external symbol gGetVariable
OsxAptioFixDrv.lib(BootFixes.obj) : error LNK2001: unresolved external symbol gRTShimsDataStart
OsxAptioFixDrv.lib(BootFixes.obj) : error LNK2001: unresolved external symbol gSetVariable
OsxAptioFixDrv.lib(BootFixes.obj) : error LNK2001: unresolved external symbol gGetNextVariableName

So this is used in both AptioFix versions, but only AptioFix2 has these defined?

EDIT: I fixed it by moving some of the code to shared sources and adding X64/RTShim.nasm to AptioFix.

The fix is now committed as r4368. Hopefully everyone will have fully working runtime now with AptioFix2. AptioFix should be obsoleted, except that some users must use it because of terrible firmware. Let's say deprecated for now.

EDIT: I don't understand why adding another post sometimes creates a new post and other times appends to your previous post. I wanted this to be a new unrelated post but, eh, whatever......

Go get the goods and report back or I swear to god I will delete the entire repository. Haha. I'm not doing that.

EDIT2: AptioFix will not fix your runtime problem, only AptioFix2 will. It possibly could, but it doesn't for now. And truthfully, I'm not sure it matters; we should try to escape AptioFix altogether.
vit9696 Posted January 6, 2018

There were some issues reported with the original solution, so we decided to publish another not so pretty update.

Changelog:
1. Fixed AptioFix V1 issues & implemented working nvram on APTIO V. We kind of expected the clover team to backport stuff to V1 instead of just making it compile, but since it was us who fundamentally broke it, here is a fix.
2. Fixed Hibernation issues with AptioFix v2. Pointers in shims for runtime services were not updated and caused an immediate reboot. There exist other problems with memory mapping, but they existed prior to our changes and are irrelevant.

Precompiled binaries for both V1 and V2 are included in the attachment.

OsxAptioFix-WTF-R2.zip

Thanks to Vandroiy and lvs1974 for help and testing.
314TeR Posted January 6, 2018

None of the above versions of AptioFix work for me on a Z97 ASUS Maximus Impact VII (BIOS 3003). I thoroughly tested clover r4368, OsxAptioFix2Drv-WTF.efi, OsxAptioFix2Drv-WTH.efi and OsxAptioFixDrv-WTB.efi.
mhaeuser (Author) Posted January 6, 2018

"None of the above versions of AptioFix work for me on a Z97 ASUS Maximus Impact VII (BIOS 3003). I thoroughly tested clover r4368, OsxAptioFix2Drv-WTF.efi, OsxAptioFix2Drv-WTH.efi and OsxAptioFixDrv-WTB.efi."

As said in vit's comprehensive post, ASUS implemented a whitelist. This is not some macOS issue, this is intended.
314TeR Posted January 6, 2018

"As said in vit's comprehensive post, ASUS implemented a whitelist. This is not some macOS issue, this is intended."

Will it be possible to fix this error in the future?
mhaeuser (Author) Posted January 6, 2018

"Will it be possible to fix this error in the future?"

Not without a firmware flash.

EDIT: Remembered incorrectly... the driver doing the check is a DXE driver, so an on-the-fly patch is possible, but... idk, just flash imo.
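For anyone wondering what such a whitelist amounts to, here is a rough sketch in C. It is not the ASUS code: the names are taken from the EFI_GLOBAL_VARIABLE_GUID list in vit9696's post, while `is_whitelisted` and `filtered_set_variable` are hypothetical helpers, the GUID comparison is left out, and plain ASCII strings stand in for the UTF-16 names real firmware uses.

```c
#include <stddef.h>
#include <string.h>

/* Names listed under EFI_GLOBAL_VARIABLE_GUID in vit9696's post. */
static const char *const kGlobalWhitelist[] = {
    "Lang", "Timeout", "PlatformLang", "ConIn", "ConOut", "ErrOut",
    "BootOrder", "BootNext", "DriverOrder", "HwErrRecSupport",
    "OsIndications", "PK", "KEK", "FTMActiveFlag"
};

/* Returns 1 if the variable name appears in the list, 0 otherwise. */
static int is_whitelisted(const char *name)
{
    size_t n = sizeof kGlobalWhitelist / sizeof kGlobalWhitelist[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(name, kGlobalWhitelist[i]) == 0)
            return 1;
    return 0;
}

/* Hypothetical write filter: 0 means the write would be accepted,
 * -1 means it would be refused before reaching the variable store. */
static int filtered_set_variable(const char *name)
{
    return is_whitelisted(name) ? 0 : -1;
}
```

With a check like this in front of the store, any variable macOS wants to write that is not on the list is refused, which matches the symptom reported above regardless of which AptioFix version is loaded.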
apianti Posted January 6, 2018

"There were some issues reported with the original solution, so we decided to publish another not so pretty update. Changelog: 1. Fixed AptioFix V1 issues & implemented working nvram on APTIO V. We kind of expected the clover team to backport stuff to V1 instead of just making it compile, but since it was us who fundamentally broke it, here is a fix. 2. Fixed Hibernation issues with AptioFix v2. Pointers in shims for runtime services were not updated and caused an immediate reboot. There exist other problems with memory mapping, but they existed prior to our changes and are irrelevant. Precompiled binaries for both V1 and V2 are included in the attachment. OsxAptioFix-WTF-R2.zip Thanks to Vandroiy and lvs1974 for help and testing."

I want to eliminate AptioFix altogether, that's why; there was no need to fix it. What reason is there to use AptioFix? It breaks pretty much everything.... If AptioFix can find an open region large enough to create the relocation block, then you can just use slide=N to move the kernel into that place and use AptioFix2.

As for the memory attributes assigned to MMIO regions: non-cacheable means that the region can never be cached and must always hit memory. That makes it a little slower, but that's fine, since it's not being used often enough for it to matter. And mem guard means it can't be moved at a later time, because it's supposed to be mapped to some device. Not sure how either of those is going to affect anything too much. Since the code checks for nothing but EFI_MEMORY_RUNTIME and the MMIO type when assigning those attributes, we could probably just change those regions to BT_Data, which should give the correct attributes, as RT_Data would.

There are some comments that don't make sense. The most concerning one is where the old memory map is not handed off. The one currently in EFI is then used; that's why the handoff is removed, and you should still be restoring the types.
Does this really not result in non-working NVRAM or sleep, or is that just luck? There's also a comment that says "Somehow the virtual address change event did not fire...", but this is taking place before SetVirtualMemoryMap, is it not? Otherwise it would be in virtual mode already. Looks good though, committed as r4639.

"None of the above versions of AptioFix work for me on a Z97 ASUS Maximus Impact VII (BIOS 3003). I thoroughly tested clover r4368, OsxAptioFix2Drv-WTF.efi, OsxAptioFix2Drv-WTH.efi and OsxAptioFixDrv-WTB.efi."

Because the problem is just with them being idiots. They only allowed a whitelist, as vit said, so it can only write those variables at runtime.

"Will it be possible to fix this error in the future?"

Just replace the new driver with the old one in the new firmware rom and flash it.

"Not without a firmware flash. EDIT: Remembered incorrectly... the driver doing the check is a DXE driver, so an on-the-fly patch is possible, but... idk, just flash imo."

Yeah, just flash the old driver. There is no point in making a patcher for this, especially since it couldn't be guaranteed to work for every firmware with this problem: the code is most likely not identical across firmwares and is modified for each individual board.
mhaeuser (Author) Posted January 6, 2018

"I want to eliminate AptioFix altogether, that's why; there was no need to fix it. What reason is there to use AptioFix? It breaks pretty much everything.... If AptioFix can find an open region large enough to create the relocation block, then you can just use slide=N to move the kernel into that place and use AptioFix2."

Slide was introduced with Lion or ML, and it works in 2MB steps iirc... if you check CupertinoSupportPkg, I played with relocating each block that failed individually rather than having one huge block. With that, you can avoid relocation entirely with the right slide values, and even if you don't, KASLR will work. I tested it once like over a year ago and it didn't work... kinda forgot about it since then, don't think the concept is flawed though.

Quoting bugged again... sigh...

"we could probably just change those regions to BT_Data"

Better not risk it...

"The one currently in EFI is then used, that's why the handoff is removed"

The handoff contains the *new* memory map, i.e. the one from the current boot cycle, so removing it is what doesn't make sense. I haven't tested hibernation in ages though and hence just added a comment about it.

""Somehow the virtual address change event did not fire...", but this is taking place before SetVirtualMemoryMap is it not?"

No, the location where that comment is, is run on kernel entry. I had registered a VirtualChange event in the location where RTShims is alloc'd and wasted easily 1h reviewing the code over and over again till I added some stall code to the event handler, which was ignored. Idk why the event didn't fire; I expect AMI has some kind of caller check (AF is BS after all), so kernel entry was the only alternative.

"Just replace the old driver in the new firmware rom and flash it."

+1
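The "right slide values" idea can be sketched as a search over the free ranges of the memory map. This is a sketch with assumptions spelled out: the 2 MB step is taken from the post above ("iirc"), while the unslid base of 0x100000 and the helper names `mem_range` and `find_slide` are assumptions made here for illustration and are not taken from boot.efi or CupertinoSupportPkg.

```c
#include <stdint.h>
#include <stddef.h>

#define SLIDE_STEP  UINT64_C(0x200000)  /* 2 MB granularity, per the post */
#define KERNEL_BASE UINT64_C(0x100000)  /* assumed unslid load address */

typedef struct { uint64_t start, size; } mem_range;

/* Return the lowest slide in [1, max_slide] for which
 * [base, base + needed) lies inside a single free range, or -1. */
static int find_slide(const mem_range *free_ranges, size_t count,
                      uint64_t needed, int max_slide)
{
    for (int slide = 1; slide <= max_slide; slide++) {
        uint64_t base = KERNEL_BASE + (uint64_t)slide * SLIDE_STEP;
        for (size_t i = 0; i < count; i++) {
            uint64_t end = free_ranges[i].start + free_ranges[i].size;
            if (base >= free_ranges[i].start && base + needed <= end)
                return slide;
        }
    }
    return -1;
}
```

If the search returns a value, booting with that slide would place the kernel in memory that is already free, so no relocation block is needed; if it returns -1, no single free range is large enough at any slide.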
apianti Posted January 6, 2018

"Slide was introduced with Lion or ML, and it works in 2MB steps iirc... if you check CupertinoSupportPkg, I played with relocating each block that failed individually rather than having one huge block. With that, you can avoid relocation entirely with the right slide values, and even if you don't, KASLR will work. I tested it once like over a year ago and it didn't work... kinda forgot about it since then, don't think the concept is flawed though."

I'm pretty sure it works fine. At least it used to, because on my previous board I used it all the time; it was only a Z68 though. Haven't tried with this one cause it works.

"Quoting bugged again... sigh..."

Hey, you got one, at least. lol. Whenever this happens to me I just click that little light switch icon and copy. Refresh the page and usually it works again. Sometimes you gotta do it a lot though.....

""we could probably just change those regions to BT_Data" Better not risk it..."

There's no need. I was just pointing out that the attributes probably don't matter, but there was also a time when it was set to use AcpiNVS and it was changed to MMIO, so maybe there is a point to it.

""The one currently in EFI is then used, that's why the handoff is removed" The handoff contains the *new* memory map, i.e. the one from the current boot cycle, so removing it is what doesn't make sense. I haven't tested hibernation in ages though and hence just added a comment about it."

Nah, if you read dmazar's comment, it is the previously stored memory map. If you don't remove it then the new RT regions won't be added to the memory map: instant reboot, because it tries to use the old regions that aren't there. If there was no memory map at all then that would definitely reboot. Where else would it be getting the memory map from?

EDIT: It instantly rebooted because the RT_Code pages were all relocated. But they are not now. Soooooooooo..... Maybe we can just hand the map back off again.
EDIT2: But is it guaranteed that they are all in the exact same correct places?

"""Somehow the virtual address change event did not fire...", but this is taking place before SetVirtualMemoryMap is it not?" No, the location where that comment is, is run on kernel entry. I had registered a VirtualChange event in the location where RTShims is alloc'd and wasted easily 1h reviewing the code over and over again till I added some stall code to the event handler, which was ignored. Idk why the event didn't fire; I expect AMI has some kind of caller check (AF is BS after all), so kernel entry was the only alternative."

But SetVirtualMemoryMap takes place in the kernel.... The kernel is jumped to in physical mode; that's how a boot driver is able to intercept it.. The virtual change doesn't take place until after that, and I can almost guarantee that the event doesn't get called because it is not a runtime driver, like you said.
mhaeuser (Author) Posted January 6, 2018

"Nah, if you read dmazar's comment, it is the previously stored memory map."

"if mem map handoff is not present, then kernel will not map those new rt pages"

macosxbootloader code indicates the very same. The MemoryMap handoff has the new memory map; if we drop it, which is what we are doing, the old one is used. That might break.

"But SetVirtualMemoryMap takes place in the kernel...."

No, it is called by boot.efi. SetVirtualAddressMap() does not map anything; it does nothing other than provide UEFI with the new addresses.
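Since SetVirtualAddressMap() only hands over the new addresses, anything that keeps pointers into runtime memory has to translate them itself. A minimal sketch of that translation follows, assuming a simplified descriptor; `rt_descriptor` and `to_virtual` are hypothetical names and this is not the AptioFix or firmware code.

```c
#include <stdint.h>
#include <stddef.h>

typedef struct {
    uint64_t physical_start;
    uint64_t virtual_start;
    uint64_t number_of_pages;   /* 4 KiB pages */
} rt_descriptor;

/* Translate a physical address into its virtual counterpart using the
 * runtime descriptors. Returns 0 if the address is in none of them. */
static uint64_t to_virtual(const rt_descriptor *map, size_t count,
                           uint64_t phys)
{
    for (size_t i = 0; i < count; i++) {
        uint64_t size = map[i].number_of_pages << 12;
        if (phys >= map[i].physical_start &&
            phys - map[i].physical_start < size)
            return map[i].virtual_start + (phys - map[i].physical_start);
    }
    return 0;
}
```

This is the kind of fix-up that, per the changelog earlier in the thread, was missing for the pointers in the runtime services shims and caused the immediate reboot on hibernation wake.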
vit9696 Posted January 6, 2018 Share Posted January 6, 2018 We discussed the Z97 issue, and decided that it is indeed not worth the time and effort for us to write any software workaround for it. Yet, since we already have the driver reversed, and some people may not want to reflash their BIOS (for whatever political reasons they have), we decided to published the complete decompiled code of the working driver, and the two changed functions of the not working one. This should be enough for anyone who works to implement a dynamic workaround.NvramSmiSrc.zipThe idea should be very simple — you need to overwrite gRT->SetVariable function with a function from the working driver (that does not have a whitelist check), and take care of the events.—————apianti,a. Regarding the memory mapping code, to be honest, we have no clear understanding what happens there either. But the evidence Fritz and I insist upon (that XNU must be using at least something from the original mapping) is obvious out of the following:- we mark EfiRuntimeServicesCode as MMIO prior to boot.efi relocation code (that much is clear from the successful ProtectRtDataFromRelocation invocation, confirmed by VirtualizeRTShimPointers call, which is nearby and the only fix needed for hibernation);- we do not recover EfiRuntimeServicesCode back in case of NVRAM (the only place that does this for AptioFixV2 is FixBootingWithoutRelocBlock and hibernate wake goes via FixHibernateWakeWithoutRelocBlock)- XNU maps EfiRuntimeServicesCode and RX and everything else as RW (clear from https://opensource.apple.com/source/xnu/xnu-4570.1.46/osfmk/i386/AT386/model_dep.c.auto.html efi_init code)- XNU invokes RuntimeServices and our new shims (since not fixing pointers caused an instant reboot).- XNU hibernate_newruntime_map, which is the one that remaps UEFI RuntimeServices is called from hibernate_machine_init with a parameter named kIOHibernateHandoffTypeMemoryMap. 
The code in FixHibernateWakeWithoutRelocBlock effectively disables this, so hibernate_newruntime_map will never be called (see: https://opensource.apple.com/source/xnu/xnu-4570.1.46/iokit/Kernel/IOHibernateIO.cpp.auto.html).So yes, it is mere luck that the address match and it does not burn.b. Regarding the gRT->SetVirtualAddress call, it comes from boot.efi not XNU. The code is similar to macosxbootloader. Due to various issues that arose during the development we had to reverse-engineer the relevant part of boot.efi, so you could compare yourself.https://ghostbin.com/paste/55hgb boot.efi (10.13.1)https://github.com/Piker-Alpha/macosxbootloader/blob/El-Capitan/src/boot/MemoryMap.cpp#L183 Corresponding function in macosxbootloaderThere definitely may exist some serious misunderstanding, but I would rather not touch NVRAM problem for the next few months (and I suppose so does Fritz). The outline presented is all our knowledge, and there is hardly anything to add to it. If you would like to continue research, I may gladly read but rather not participate ^^ MACRO_EFI __fastcall SetVirtualAddressUp(EFI_MEMORY_DESCRIPTOR *MemoryDescriptorBegin, EFI_MEMORY_DESCRIPTOR *MemoryDescriptorEnd, unsigned __int64 *a3, __int64 DecriptorSize, __int64 DescriptorVersion, unsigned __int64 MinimalVirtualPointer, unsigned __int64 MaximumVirtualPointer, __int64 NextFreeVirtualPtr) { __int64 NumberOfRuntimePages; // rbx unsigned __int64 Attr; // rax __int64 TmpAddr; // rax __int64 v16; // rdx EFI_SYSTEM_TABLE *SystemTable; // rdx MAPDST __int64 v18; // rbx EFI_MEMORY_DESCRIPTOR *CurrentMemoryDescriptor; // rsi MAPDST unsigned int Size; // eax __int64 Size64; // rbx MAPDST char *Dst; // rdx char *Src; // rcx EFI_PHYSICAL_ADDRESS PhysicalStart; // rdx MAPDST EFI_CONFIGURATION_TABLE *ConfigurationTable; // rdx bool ConfigurationTableBelowMemoryDescriptors; // cf EFI_PHYSICAL_ADDRESS v33; // rdx EFI_PHYSICAL_ADDRESS v39; // rax _QWORD *v40; // rdx UINTN Index; // rdi UINT32 v43; // esi 
  EFI_VIRTUAL_ADDRESS v44; // r13
  signed __int64 a7; // rax
  unsigned __int64 NumberOfPages; // rcx
  __int64 v47; // rdx
  EFI_MEMORY_DESCRIPTOR *v48; // rax
  __int64 v50; // [rsp+28h] [rbp-78h]
  EFI_MEMORY_DESCRIPTOR *SavedMemoryDescriptorBegin; // [rsp+50h] [rbp-50h]
  void **a3a; // [rsp+58h] [rbp-48h]
  EFI_MEMORY_DESCRIPTOR *a4; // [rsp+60h] [rbp-40h]

  a3a = (void **)a3;
  NumberOfRuntimePages = 0i64;
  if ( MemoryDescriptorBegin >= MemoryDescriptorEnd )
  {
LABEL_13:
    gST->RuntimeServices = 0i64;
    SystemTable = gST;
    gST->Hdr.CRC32 = 0;
    gST->Hdr.CRC32 = CalculateCRC32(0i64, SystemTable, SystemTable->Hdr.HeaderSize);
  }
  else
  {
    SavedMemoryDescriptorBegin = MemoryDescriptorBegin;
    do
    {
      Attr = MemoryDescriptorBegin->Attribute;
      if ( MemoryDescriptorBegin->Type ) // Type != EfiReservedMemoryType
      {
        if ( (Attr & EFI_MEMORY_RUNTIME) != 0i64 )
        {
          Size64 = BitShiftLeft(MemoryDescriptorBegin->NumberOfPages, 12);
          if ( ((unsigned int)NextFreeVirtualPtr & 0x3FFFFFFF) < MinimalVirtualPointer )
            return EFI_NO_MAPPING;
          TmpAddr = NextFreeVirtualPtr + Size64;
          if ( ((unsigned int)TmpAddr & 0x3FFFFFFF) > MaximumVirtualPointer )
            return EFI_NO_MAPPING;
          NumberOfRuntimePages += LODWORD(MemoryDescriptorBegin->NumberOfPages);
          MemoryDescriptorBegin->VirtualStart = NextFreeVirtualPtr;
          NextFreeVirtualPtr = TmpAddr;
        }
      }
      else
      {
        MemoryDescriptorBegin->Attribute = Attr & ~EFI_MEMORY_RUNTIME;
      }
      MemoryDescriptorBegin = (EFI_MEMORY_DESCRIPTOR *)((char *)MemoryDescriptorBegin + DecriptorSize);
    }
    while ( MemoryDescriptorBegin < MemoryDescriptorEnd );
    MemoryDescriptorBegin = SavedMemoryDescriptorBegin;
    if ( !NumberOfRuntimePages )
      goto LABEL_13;
    if ( (gRT->SetVirtualAddressMap(
            (char *)MemoryDescriptorEnd - (char *)SavedMemoryDescriptorBegin,
            DecriptorSize,
            DescriptorVersion,
            SavedMemoryDescriptorBegin) & 0x8000000000000000ui64) != 0i64 )
      InternalDbgError("Error in SetVirtualAddressMap\n", v16);
  }
  v18 = 0i64;
  if ( MemoryDescriptorBegin < MemoryDescriptorEnd )
  {
    CurrentMemoryDescriptor = MemoryDescriptorBegin;
    do
    {
      if ( CurrentMemoryDescriptor->Type - 5 <= 1 ) // Type == EfiRuntimeServicesCode || Type == EfiRuntimeServicesData
      {
        Size = BitShiftLeft(CurrentMemoryDescriptor->NumberOfPages, 12);
        Size64 = Size;
        Dst = (char *)(CurrentMemoryDescriptor->VirtualStart & 0x3FFFFFFF);
        Src = (char *)CurrentMemoryDescriptor->PhysicalStart;
        if ( Dst != Src )
        {
          MemCopy(Src, Dst, Size);
          ZeroMemory((void *)CurrentMemoryDescriptor->PhysicalStart, Size64);
        }
      }
      CurrentMemoryDescriptor = (EFI_MEMORY_DESCRIPTOR *)((char *)CurrentMemoryDescriptor + DecriptorSize);
    }
    while ( CurrentMemoryDescriptor < MemoryDescriptorEnd );
    v18 = 0i64;
    if ( MemoryDescriptorBegin < MemoryDescriptorEnd )
    {
      CurrentMemoryDescriptor = MemoryDescriptorBegin;
      do
      {
        if ( CurrentMemoryDescriptor->Type - 5 <= 1 )
        {
          Size64 = BitShiftLeft(CurrentMemoryDescriptor->NumberOfPages, 12);
          PhysicalStart = CurrentMemoryDescriptor->PhysicalStart;
          if ( (unsigned __int64)gST >= PhysicalStart && (unsigned __int64)gST < PhysicalStart + Size64 )
            gST = (EFI_SYSTEM_TABLE *)((LODWORD(CurrentMemoryDescriptor->VirtualStart) + (_DWORD)gST - (_DWORD)PhysicalStart) & 0x3FFFFFFF);
        }
        CurrentMemoryDescriptor = (EFI_MEMORY_DESCRIPTOR *)((char *)CurrentMemoryDescriptor + DecriptorSize);
      }
      while ( CurrentMemoryDescriptor < MemoryDescriptorEnd );
      v18 = 0i64;
      if ( MemoryDescriptorBegin < MemoryDescriptorEnd )
      {
        CurrentMemoryDescriptor = MemoryDescriptorBegin;
        do
        {
          if ( CurrentMemoryDescriptor->Type - 5 <= 1 ) // Type == EfiRuntimeServicesCode || Type == EfiRuntimeServicesData
          {
            Size64 = BitShiftLeft(CurrentMemoryDescriptor->NumberOfPages, 12);
            PhysicalStart = CurrentMemoryDescriptor->PhysicalStart;
            ConfigurationTable = gST->ConfigurationTable;
            ConfigurationTableBelowMemoryDescriptors = (unsigned __int64)ConfigurationTable < PhysicalStart;
            v33 = (EFI_PHYSICAL_ADDRESS)ConfigurationTable - PhysicalStart;
            if ( !ConfigurationTableBelowMemoryDescriptors
              && gST->ConfigurationTable < (EFI_CONFIGURATION_TABLE *)(Size64 + PhysicalStart) )
            {
              gST->ConfigurationTable = (EFI_CONFIGURATION_TABLE *)(CurrentMemoryDescriptor->VirtualStart + v33);
            }
          }
          CurrentMemoryDescriptor = (EFI_MEMORY_DESCRIPTOR *)((char *)CurrentMemoryDescriptor + DecriptorSize);
        }
        while ( CurrentMemoryDescriptor < MemoryDescriptorEnd );
        v18 = 0i64;
        if ( MemoryDescriptorBegin < MemoryDescriptorEnd )
        {
          CurrentMemoryDescriptor = MemoryDescriptorBegin;
          do
          {
            if ( CurrentMemoryDescriptor->Type - 5 <= 1 )
            {
              Size64 = BitShiftLeft(CurrentMemoryDescriptor->NumberOfPages, 12);
              SystemTable = gST;
              if ( gST->NumberOfTableEntries )
              {
                PhysicalStart = CurrentMemoryDescriptor->PhysicalStart;
                v39 = PhysicalStart + Size64;
                v40 = (_QWORD *)(((_QWORD)gST->ConfigurationTable & 0x3FFFFFFFi64) + 16);
                Index = 0i64;
                do
                {
                  if ( *v40 >= PhysicalStart && *v40 < v39 )
                  {
                    *v40 = CurrentMemoryDescriptor->VirtualStart + *v40 - PhysicalStart;
                    SystemTable = gST;
                  }
                  ++Index;
                  v40 += 3;
                }
                while ( Index < SystemTable->NumberOfTableEntries );
              }
            }
            CurrentMemoryDescriptor = (EFI_MEMORY_DESCRIPTOR *)((char *)CurrentMemoryDescriptor + DecriptorSize);
          }
          while ( CurrentMemoryDescriptor < MemoryDescriptorEnd );
          v18 = 0i64;
          CurrentMemoryDescriptor = MemoryDescriptorBegin;
          if ( MemoryDescriptorBegin < MemoryDescriptorEnd )
          {
            v18 = 0i64;
            do
            {
              v43 = CurrentMemoryDescriptor->Type;
              if ( CurrentMemoryDescriptor->Type - 5 <= 1 ) // Type == EfiRuntimeServicesCode || Type == EfiRuntimeServicesData
              {
                v44 = CurrentMemoryDescriptor->VirtualStart;
                a7 = CurrentMemoryDescriptor->VirtualStart & 0x3FFFFFFF;
                if ( a7 != CurrentMemoryDescriptor->PhysicalStart )
                {
                  if ( a3a )
                  {
                    NumberOfPages = CurrentMemoryDescriptor->NumberOfPages;
                    a4 = 0i64;
                    CurrentMemoryDescriptor->Attribute &= 0x7FFFFFFFFFFFFFFFui64;
                    CurrentMemoryDescriptor->Type = EfiConventionalMemory;
                    CurrentMemoryDescriptor->VirtualStart = 0i64;
                    if ( sub_13DE6(
                           MemoryDescriptorBegin,
                           MemoryDescriptorEnd,
                           a3a,
                           &a4,
                           DecriptorSize,
                           v50,
                           a7,
                           NumberOfPages) < 0 )
                      InternalDbgError("Could not create subregion", v47);
                    v48 = a4;
                    a4->VirtualStart = v44;
                    v48->Attribute |= EFI_MEMORY_RUNTIME;
                    v48->Type = v43;
                    MemoryDescriptorEnd = (EFI_MEMORY_DESCRIPTOR *)*a3a;
                  }
                  else
                  {
                    CurrentMemoryDescriptor->PhysicalStart = a7;
                  }
                }
              }
              CurrentMemoryDescriptor = (EFI_MEMORY_DESCRIPTOR *)((char *)CurrentMemoryDescriptor + DecriptorSize);
            }
            while ( CurrentMemoryDescriptor < MemoryDescriptorEnd );
          }
        }
      }
    }
  }
  return v18;
}
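For readers following the decompiled routine: the step that moves gST (and the configuration table pointers) into a relocated runtime region boils down to one address translation. Here is a minimal sketch in plain C, assuming a simplified descriptor struct. The names mem_desc, relocate_pointer, RT_CODE, RT_DATA and LOW_MASK are illustrative only; they are not boot.efi symbols and this does not use the real UEFI headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for EFI_MEMORY_DESCRIPTOR (hypothetical layout). */
typedef struct {
    uint32_t type;
    uint64_t physical_start;
    uint64_t virtual_start;
    uint64_t number_of_pages;
} mem_desc;

/* Numeric values of EfiRuntimeServicesCode / EfiRuntimeServicesData. */
enum { RT_CODE = 5, RT_DATA = 6 };

#define PAGE_SHIFT 12
#define LOW_MASK   0x3FFFFFFFull /* same mask the listing applies to virtual addresses */

/*
 * If ptr lies inside a runtime code/data region, return the address it will
 * have after the region is moved to (virtual_start & LOW_MASK).
 * Otherwise return ptr unchanged.
 */
uint64_t relocate_pointer(const mem_desc *map, size_t count, uint64_t ptr)
{
    assert(map != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        const mem_desc *d = &map[i];
        if (d->type != RT_CODE && d->type != RT_DATA)
            continue;

        uint64_t size = d->number_of_pages << PAGE_SHIFT;
        if (ptr >= d->physical_start && ptr < d->physical_start + size)
            return (d->virtual_start + (ptr - d->physical_start)) & LOW_MASK;
    }
    return ptr;
}
```

With a two-page runtime data region at physical 0x1000 whose VirtualStart is 0xFFFFFF8000200000, a pointer at 0x1800 comes back as 0x00200800, while a pointer at 0x3000 (just past the region) is returned untouched.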
apianti Posted January 6, 2018

"'Nah, if you read dmazar's comment it is the previous stored memory.' 'if mem map handoff is not present, then kernel will not map those new rt pages' macosxbootloader code indicates the very same. MemoryMap handoff has the new memory map; if we drop it, which is what we are doing, the old one is used. That might break."

Well you selectively cut out the fact that it says it's equivalent to RemoveRTFlagMappings. Or the fact that it is retrieving that information from the hibernation image......... Though it is misleading because he named the variable passed in bootArgs....

"'But SetVirtualAddressMap takes place in the kernel....' No, it is called by boot.efi. SetVirtualAddressMap() does not map anything; it does nothing other than provide UEFI with the new addresses."

SetVirtualAddressMap() has to be called in physical mode, but switches to virtual mode, at least that's what it's supposed to do according to the spec. So boot.efi is switching to virtual mode then back to physical to jump to the kernel to switch back to virtual? That's dumb. If that's the case then maybe the override for SetVirtualAddressMap disregards it as a non EFI_MEMORY_RUNTIME area, so it can't get triggered.... Doesn't really matter if it's the correct addresses in the end though. Or depending on where you registered the event, maybe you did it when it's in the jump intercept and the addresses are already virtualized if boot.efi does indeed virtualize the memory. Did you do it in the driver entry point? Did you use the Ex method or the regular?
Funky frank Posted January 6, 2018

Hi, I have some noob questions regarding this new aptiofix2 by vit9696. I have an Asus B85M-G, and so far used the old version of aptiofix2:
- Will the new aptiofix2 by vit9696 by vlad improve system performance?
- Is it possible that vst audio plugins now validate better with the new aptiofix2?
- Is it possible that the old aptiofix2 caused graphics stuttering in combination with recent hs nvidia web drivers?

Thanks!
apianti Posted January 6, 2018

"a. Regarding the memory mapping code, to be honest, we have no clear understanding of what happens there either. But the evidence Fritz and I insist upon (that XNU must be using at least something from the original mapping) is obvious from the following:
- we mark EfiRuntimeServicesCode as MMIO prior to boot.efi relocation code (that much is clear from the successful ProtectRtDataFromRelocation invocation, confirmed by the VirtualizeRTShimPointers call, which is nearby and the only fix needed for hibernation);
- we do not recover EfiRuntimeServicesCode back in case of NVRAM (the only place that does this for AptioFixV2 is FixBootingWithoutRelocBlock, and hibernate wake goes via FixHibernateWakeWithoutRelocBlock);
- XNU maps EfiRuntimeServicesCode as RX and everything else as RW (clear from the efi_init code: https://opensource.apple.com/source/xnu/xnu-4570.1.46/osfmk/i386/AT386/model_dep.c.auto.html);
- XNU invokes RuntimeServices and our new shims (since not fixing pointers caused an instant reboot);
- XNU hibernate_newruntime_map, which is the one that remaps UEFI RuntimeServices, is called from hibernate_machine_init with a parameter named kIOHibernateHandoffTypeMemoryMap. The code in FixHibernateWakeWithoutRelocBlock effectively disables this, so hibernate_newruntime_map will never be called (see: https://opensource.apple.com/source/xnu/xnu-4570.1.46/iokit/Kernel/IOHibernateIO.cpp.auto.html).
So yes, it is mere luck that the addresses match and it does not burn."

I'm not disagreeing with a lot of what you or DF is saying or I wouldn't have committed it. I'm just trying to understand. If hibernate_newruntime_map never runs where does it get the mapping from? How does it have the memory mappings of the virtual addresses if they were all removed from the handoff?

EDIT: So shouldn't the regions be fixed in this case as well, or actually pass the memory handoff?

"b. Regarding the gRT->SetVirtualAddressMap call, it comes from boot.efi, not XNU. The code is similar to macosxbootloader. Due to various issues that arose during the development we had to reverse-engineer the relevant part of boot.efi, so you could compare yourself.
boot.efi (10.13.1): https://ghostbin.com/paste/55hgb
Corresponding function in macosxbootloader: https://github.com/Piker-Alpha/macosxbootloader/blob/El-Capitan/src/boot/MemoryMap.cpp#L183
There definitely may exist some serious misunderstanding, but I would rather not touch the NVRAM problem for the next few months (and I suppose neither would Fritz). The outline presented is all our knowledge, and there is hardly anything to add to it. If you would like to continue the research, I will gladly read but would rather not participate ^^"

I just had a brain fart I think, IDK, I have the flu so I'm kinda feeling terrible.

"Hi, I have some noob questions regarding this new aptiofix2 by vit9696. I have an Asus B85M-G, and so far used the old version of aptiofix2:
- Will the new aptiofix2 by vit9696 by vlad improve system performance?
- Is it possible that vst audio plugins now validate better with the new aptiofix2?
- Is it possible that the old aptiofix2 caused graphics stuttering in combination with recent hs nvidia web drivers?
Thanks!"

Try it. May help stuff work better. Might have been causing issues. No idea. There were so many memory issues that it's hard to say. It's even hard to say if we even fixed them all....
vit9696 Posted January 6, 2018

"I'm not disagreeing with a lot of what you or DF is saying or I wouldn't have committed it. I'm just trying to understand. If hibernate_newruntime_map never runs where does it get the mapping from? How does it have the memory mappings of the virtual addresses if they were all removed from the handoff?"

Here is the relevant quote from XNU:
- After the platform CPU init code is called, hibernate_machine_init() is called to restore the rest of memory, using the polled mode driver, before other threads can run or any devices are turned on. This reduces the memory usage for BootX and allows decompression in parallel with disk reads, for the remaining non wired pages.

Basically XNU restores all the memory mappings that were mapped during the initial boot (i.e. before going to hibernation), effectively including the original UEFI mapping. hibernate_machine_init, or rather hibernate_newruntime_map (which it calls if and only if it receives kIOHibernateHandoffTypeMemoryMap), is responsible for unmapping the old UEFI memory and then mapping the new one. See the pmap_remove calls over the original pointers and pmap_map for the new pointers.

"I just had a brain fart I think, IDK, I have the flu so I'm kinda feeling terrible."

Get better soon. Fritz, looks like I answered a bit earlier.

Edited January 6, 2018 by vit9696
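To make the behaviour described in this post easier to picture, here is a toy model in plain C. It is not XNU code: rt_region, rt_mapping and wake_apply_handoff are hypothetical names, and the struct copy merely stands in for the pmap_remove / pmap_map pair mentioned above. The only point it illustrates is that, without a memory map handoff, the mapping restored from the hibernation image stays active.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_RT_REGIONS 8

/* One runtime region as the kernel would see it (hypothetical layout). */
typedef struct {
    uint64_t phys;
    uint64_t virt;
    uint64_t pages;
} rt_region;

typedef struct {
    rt_region regions[MAX_RT_REGIONS];
    size_t    count;
} rt_mapping;

/*
 * active  : mapping restored from the hibernation image (pre-sleep boot).
 * handoff : runtime map of the current boot, or NULL when the booter did
 *           not pass a memory map handoff.
 * Returns 1 if the active mapping was replaced, 0 if the old one was kept.
 */
int wake_apply_handoff(rt_mapping *active, const rt_mapping *handoff)
{
    if (handoff == NULL)
        return 0;      /* nothing to remap: the pre-hibernation mapping remains */

    *active = *handoff; /* stands in for "unmap old regions, map new regions" */
    return 1;
}
```

In this model, dropping the handoff is only harmless when the firmware happens to place the runtime regions at the same physical addresses as before hibernation, which is the "mere luck" case discussed in the thread.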
mhaeuser Posted January 6, 2018

"Well you selectively cut out the fact that it says it's equivalent to RemoveRTFlagMappings."

Yes, I did, because I consider that to be incorrect. RemoveRTFlagMappings() is not even called anymore, it's just there being unused. I suppose dmazar simply did not come back to fix that comment once replacing the method of RT protection.

"SetVirtualAddressMap() has to be called in physical mode, but switches to virtual mode, at least that's what it's supposed to do according to the spec."

It switches the addressing to virtual mode, i.e. any future address usages will be virtual, but it does not map or switch modes. That's what the kernel does later.

"Did you use the Ex method or the regular?"

I didn't try Ex as both vit and I were running out of patience. If you can get some sort of VirtualAddressChange event to trigger, you're safe to replace the code.

"I'm just trying to understand. If hibernate_newruntime_map never runs where does it get the mapping from? How does it have the memory mappings of the virtual addresses if they were all removed from the handoff?"

I did some research a year ago, so it might be lacking. I have no real doubt that it's first mapping the old region and, once the MemoryMap handoff is hit, remaps everything (I remember code for that in XNU), but feel free to share if you find something that hints otherwise.
Funky frank Posted January 6, 2018

Hm, I just had a sudden restart on the desktop, and after the reboot the BIOS does not boot a drive anymore; I have to power off/on first. Is that related to the new aptiofix2?

EDIT: And what is the diff between the WTF and WTH versions?
apianti Posted January 6, 2018

"Basically XNU restores all the memory mappings that were mapped during the initial boot (i.e. before going to hibernation), effectively including the original UEFI mapping. hibernate_machine_init, or rather hibernate_newruntime_map (which it calls if and only if it receives kIOHibernateHandoffTypeMemoryMap), is responsible for unmapping the old UEFI memory and then mapping the new one. See the pmap_remove calls over the original pointers and pmap_map for the new pointers."

Yes, that was my understanding of what is happening as well. But my point is, if it's just using the original memory map because the handoff to remap them was disabled, don't those regions still need to be fixed because they are the original regions?

EDIT: That's not clear. Is it remapping the regions from before hibernation but also the original from the firmware are there as well?? I don't get how it's reconciling the regions since they could be mapped differently every boot. Wouldn't that create multiple regions of the same code?

"Get better soon"

Stupid holiday parties, everyone was sick, now I am. Thanks though.

"'Well you selectively cut out the fact that it says it's equivalent to RemoveRTFlagMappings.' Yes, I did, because I consider that to be incorrect. RemoveRTFlagMappings() is not even called anymore, it's just there being unused. I suppose dmazar simply did not come back to fix that comment once replacing the method of RT protection."

I seem to not be able to articulate what I'm talking about. Yeah that code is not called anymore, so why are we doing something that he said was equivalent?

"'SetVirtualAddressMap() has to be called in physical mode, but switches to virtual mode, at least that's what it's supposed to do according to the spec.' It switches the addressing to virtual mode, i.e. any future address usages will be virtual, but it does not map or switch modes. That's what the kernel does later."
You forget that it's actually already in the virtual mode of the CPU since PEI, we already had this conversation. Is it actually virtualizing all addresses and then replacing it back with the flat descriptor again?

"'Did you use the Ex method or the regular?' I didn't try Ex as both vit and I were running out of patience. If you can get some sort of VirtualAddressChange event to trigger, you're safe to replace the code."

Nah I think it's perfectly acceptable, once again just trying to understand.

"'I'm just trying to understand. If hibernate_newruntime_map never runs where does it get the mapping from? How does it have the memory mappings of the virtual addresses if they were all removed from the handoff?' I did some research a year ago, so it might be lacking. I have no real doubt that it's first mapping the old region and, once the MemoryMap handoff is hit, remaps everything (I remember code for that in XNU), but feel free to share if you find something that hints otherwise."

Yes.... But how are the original regions ok, if they are not replaced by the handoff or fixed by the new protection...?

"Hm, I just had a sudden restart on the desktop, and after the reboot the BIOS does not boot a drive anymore; I have to power off/on first. Is that related to the new aptiofix2? EDIT: And what is the diff between the WTF and WTH versions?"

I think he was making jokes, urban dictionary. Just refining the solution. Go into the space bar menu and select the don't reboot on kp and keep symbol info. That should give you an idea of what happened with the restart. But probably yes... Do you have a Z97??
mhaeuser Posted January 6, 2018

"I seem to not be able to articulate what I'm talking about. Yeah that code is not called anymore, so why are we doing something that he said was equivalent?"

That's basically what I meant... test what is happening when that code is ditched.

"You forget that it's actually already in the virtual mode of the CPU since PEI, we already had this conversation. Is it actually virtualizing all addresses and then replacing it back with the flat descriptor again?"

No, I am not saying that it is not in virtual mode, I said that the function does not perform a switch. If we are in flat protected pre-call, we are in flat protected post-call. It does not alter the mapping, it is merely a "Hey, RT drivers, adapt your addresses!" function.

"Nah I think it's perfectly acceptable, once again just trying to understand."

Imo it's not acceptable when there is a prettier solution, but... AMI.

"Yes.... But how are the original regions ok, if they are not replaced by the handoff or fixed by the new protection...?"

Amazing question, I don't know and hence I vote for trying to get rid of that code. Remember hibernation is quite easily failing, so that could be a sign of just that.
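The "Hey, RT drivers, adapt your addresses!" reading of SetVirtualAddressMap() can be sketched as a toy model in plain C. Everything below is hypothetical (toy_firmware, toy_register, toy_set_virtual_address_map and the single delta argument are made up for illustration; real drivers convert pointers against the new memory map instead). The model only shows two properties: the call notifies registered listeners without touching any CPU mode or page tables, and it can be applied once, after which it reports unsupported, as the UEFI specification describes.

```c
#include <stddef.h>
#include <stdint.h>

enum { TOY_SUCCESS = 0, TOY_UNSUPPORTED = 1, TOY_OUT_OF_RESOURCES = 2 };

#define TOY_MAX_LISTENERS 4

/* Callback a runtime driver registers to be told about the address change. */
typedef void (*address_change_cb)(uint64_t delta, void *ctx);

typedef struct {
    address_change_cb cb[TOY_MAX_LISTENERS];
    void             *ctx[TOY_MAX_LISTENERS];
    size_t            count;
    int               virtual_mode; /* set once the map has been applied */
} toy_firmware;

int toy_register(toy_firmware *fw, address_change_cb cb, void *ctx)
{
    if (fw->count == TOY_MAX_LISTENERS)
        return TOY_OUT_OF_RESOURCES;
    fw->cb[fw->count]  = cb;
    fw->ctx[fw->count] = ctx;
    fw->count++;
    return TOY_SUCCESS;
}

/* Notify every listener exactly once; no mapping is created here. */
int toy_set_virtual_address_map(toy_firmware *fw, uint64_t delta)
{
    if (fw->virtual_mode)
        return TOY_UNSUPPORTED;

    for (size_t i = 0; i < fw->count; i++)
        fw->cb[i](delta, fw->ctx[i]);

    fw->virtual_mode = 1;
    return TOY_SUCCESS;
}

/* Example listener: a driver shifting one stored pointer value. */
void toy_driver_on_change(uint64_t delta, void *ctx)
{
    *(uint64_t *)ctx += delta;
}
```

A driver that registered a stored address of 0x1000 and is notified with a delta of 0x100000 ends up holding 0x101000; a second call changes nothing and returns TOY_UNSUPPORTED.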
Funky frank Posted January 6, 2018

"I think he was making jokes, urban dictionary. Just refining the solution. Go into the space bar menu and select the don't reboot on kp and keep symbol info. That should give you an idea of what happened with the restart. But probably yes... Do you have a Z97??"

No, I have this: https://www.asus.com/de/Motherboards/B85MG/

Basically I have had quite a bunch of problems since upgrading to HS 10.13.2, starting with nvidia web driver lags... I think I will go back to 10.12.6 again now, which worked quite perfectly. I was only very curious about Metal 2... OpenGL game performance heavily improved here in HS 10.13.2 though. Hard to say what the reason is for all these issues.
vit9696 Posted January 6, 2018

"Yes, that was my understanding of what is happening as well. But my point is, if it's just using the original memory map because the handoff to remap them was disabled, don't those regions still need to be fixed because they are the original regions?"

They do need to be fixed, because otherwise XNU may execute invalid code if the mapping changes across boots. This sometimes happens for users, and that's why they sometimes cannot wake from hibernation, but generally it works (screw it). I think the magic that HibernationFixup does may be relevant in avoiding such code paths, but it is just a guess.

The correct solution is either to let XNU properly handle the handoff, or to ensure that the mapping is the same across hibernation restoration by preserving it somewhere and then reusing it. The second is mad, but if we cannot let XNU handle the handoff it might be the only solution.
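As a rough illustration of the second option (keeping the runtime layout identical across a hibernation cycle), one could at least detect when the layout has changed. The sketch below is plain C under stated assumptions: rt_desc is a cut-down stand-in for EFI_MEMORY_DESCRIPTOR, and runtime_layout_matches and next_runtime are hypothetical helpers, not part of AptioFix or XNU. It compares only the descriptors carrying the runtime attribute, in order, by type, physical start and page count.

```c
#include <stddef.h>
#include <stdint.h>

#define ATTR_RUNTIME 0x8000000000000000ull /* value of EFI_MEMORY_RUNTIME */

/* Cut-down memory descriptor (hypothetical layout). */
typedef struct {
    uint32_t type;
    uint64_t physical_start;
    uint64_t number_of_pages;
    uint64_t attribute;
} rt_desc;

/* Index of the first runtime descriptor at or after i, or count if none. */
static size_t next_runtime(const rt_desc *map, size_t count, size_t i)
{
    while (i < count && !(map[i].attribute & ATTR_RUNTIME))
        i++;
    return i;
}

/*
 * Returns 1 when both maps contain the same runtime regions in the same
 * order, 0 otherwise. Non-runtime descriptors are ignored.
 */
int runtime_layout_matches(const rt_desc *saved, size_t saved_count,
                           const rt_desc *current, size_t current_count)
{
    size_t i = next_runtime(saved, saved_count, 0);
    size_t j = next_runtime(current, current_count, 0);

    while (i < saved_count && j < current_count) {
        if (saved[i].type != current[j].type ||
            saved[i].physical_start != current[j].physical_start ||
            saved[i].number_of_pages != current[j].number_of_pages)
            return 0;
        i = next_runtime(saved, saved_count, i + 1);
        j = next_runtime(current, current_count, j + 1);
    }
    /* Both maps must run out of runtime descriptors at the same time. */
    return i >= saved_count && j >= current_count;
}
```

A check like this would only tell a wake path whether reusing the pre-hibernation mapping is safe; actually forcing the firmware to reproduce the saved layout is the hard part referred to as "mad" above.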
apianti Posted January 6, 2018

"'I seem to not be able to articulate what I'm talking about. Yeah that code is not called anymore, so why are we doing something that he said was equivalent?' That's basically what I meant... test what is happening when that code is ditched."

Yeah, that might be a wise test.

"'You forget that it's actually already in the virtual mode of the CPU since PEI, we already had this conversation. Is it actually virtualizing all addresses and then replacing it back with the flat descriptor again?' No, I am not saying that it is not in virtual mode, I said that the function does not perform a switch. If we are in flat protected pre-call, we are in flat protected post-call. It does not alter the mapping, it is merely a 'Hey, RT drivers, adapt your addresses!' function."

From the spec: "Once all events have been notified, the EFI firmware reapplies image “fix-up” information to virtually relocate all runtime images to their new addresses. A virtual address map may only be applied one time. Once the runtime system is in virtual mode, calls to this function return EFI_UNSUPPORTED."

Uh.... I'm very confused by what that means then...

"'Nah I think it's perfectly acceptable, once again just trying to understand.' Imo it's not acceptable when there is a prettier solution, but... AMI."

I don't think much in AptioFix is pretty, lol.

"'Yes.... But how are the original regions ok, if they are not replaced by the handoff or fixed by the new protection...?' Amazing question, I don't know and hence I vote for trying to get rid of that code. Remember hibernation is quite easily failing, so that could be a sign of just that."

I want to do this but I think I'm about to just lay down and watch a movie or something.... Maybe later, or someone else.

"No, I have this: https://www.asus.com/de/Motherboards/B85MG/ Basically I have had quite a bunch of problems since upgrading to HS 10.13.2, starting with nvidia web driver lags... I think I will go back to 10.12.6 again now, which worked quite perfectly. I was only very curious about Metal 2... OpenGL game performance heavily improved here in HS 10.13.2 though. Hard to say what the reason is for all these issues."

It's entirely possible that you are maybe being affected by the intel speculation vulnerability?

"They do need to be fixed, because otherwise XNU may execute invalid code if the mapping changes across boots. This sometimes happens for users, and that's why they sometimes cannot wake from hibernation, but generally it works (screw it). I think the magic that HibernationFixup does may be relevant in avoiding such code paths, but it is just a guess."

Ok so that is an issue then and we'll need to see how to best fix them.

"The correct solution is either to let XNU properly handle the handoff, or to ensure that the mapping is the same across hibernation restoration by preserving it somewhere and then reusing it. The second is mad, but if we cannot let XNU handle the handoff it might be the only solution."

Yes, I say let's try to see if we can get the regular handoff to work because that is waaaaaay easier than the second. That is mad, but I like it. HAHA. It's like I already know that's probably what we'll have to do.............