Jief_Machak Posted March 16, 2021 Share Posted March 16, 2021 Can launch a kernel. That's why I ask for your img. In there I'll have the same config as you, maximizing the chances to reproduce. Link to comment Share on other sites More sharing options...
Slice Posted March 16, 2021 Share Posted March 16, 2021 Place here CLOVERX64.efi from the post above QEMU-test2.img-0.lzma command to start test4.sh Adujst for your configuration. Remember to use bios.bin but not OVMF.fd Link to comment Share on other sites More sharing options...
Jief_Machak Posted March 17, 2021 Share Posted March 17, 2021 Got this in console : Loading 'mach_kernel'... Could not open file 'mach_kernel' Error loading kernel 'mach_kernel' (0xe) and this in screen : The kernel is not there so all that seems normal. No crash... If you can reproduce, try launching from Qemu folder in Clover with the qemu in qemu-portable folder. maybe we have qemu bug problem... Also, the clover you gave me doesn't boot with timeout. I tried compiling one, but no crash either. Link to comment Share on other sites More sharing options...
Slice Posted March 17, 2021 Share Posted March 17, 2021 It is not a qemu problem as other users reported the problem on real hardware. 1. This is legacy hardware having no UEFI. There is only bios boot. No problem with UEFI boot. 2. Boot by timeout is required for reproduce. If we type "enter" then no crash. I reproduced the behaviour with my qemu and then made git bisection to find a commit when the problem started. Link to comment Share on other sites More sharing options...
Slice Posted March 17, 2021 Share Posted March 17, 2021 One more condition: 3. There is no OpenRuntime.efi. If it presents then no crash. First observation https://applelife.ru/threads/clover.42089/page-1339#post-926032 Second confirmation https://www.applelife.ru/threads/clover.42089/page-1351#post-928012 My confirmation https://www.applelife.ru/threads/clover.42089/page-1351#post-928188 1 Link to comment Share on other sites More sharing options...
Jief_Machak Posted March 17, 2021 Share Posted March 17, 2021 You sent me a CloverX64.efi that suppose to reproduce but this efi doesn't boot with timeout. 5 hours ago, Slice said: 1. This is legacy hardware having no UEFI. There is only bios boot. No problem with UEFI boot. 2. Boot by timeout is required for reproduce. If we type "enter" then no crash. I've well understood that, but can't reproduce yey on my Qemu. 5 hours ago, Slice said: I reproduced the behaviour with my qemu Good, I'd like the same. 4 hours ago, Slice said: First observation https://applelife.ru/threads/clover.42089/page-1339#post-926032 Second confirmation https://www.applelife.ru/threads/clover.42089/page-1351#post-928012 My confirmation https://www.applelife.ru/threads/clover.42089/page-1351#post-928188 I never said it's not true. No need to copy/paste link as "proof". What I'm saying is using an intermediary variable doesn't "solve" the problem. It probably just hides it by coincidence. The problem cannot come for using .wc_str() in args. There is plenty of that and if we need to extract an intermediary variable for each, we won't see the end. It's basic compiler task, I doubt there is a bug in that. Also, the fact that pressing enter avoids the problem is a sign of a deeper problem. Creating an intermediary change the sized used from the stack and probably hides the fact that a pointer is wrongly rewritten by duplicating it. Or something else. For me, it's crucial to find that. And I need help to reproduce and investigate. If you think it doesn't matter, just let me know and I won't bother you anymore... Link to comment Share on other sites More sharing options...
Slice Posted March 17, 2021 Share Posted March 17, 2021 I agree there is the task. I just can't understand why you can't reproduce this while I did this many times. It is very strange bug, not logical. And very strange that there is no crash if OpenRuntime presents. Why it influences on this codes? Link to comment Share on other sites More sharing options...
Jief_Machak Posted March 17, 2021 Share Posted March 17, 2021 Memory allocation and or usage. Loading OpenRuntime.efi may change, reallocate, move some memory. I'm pretty sure some memory is overwritten. If it's written on unused memory, it works... What even more strange is no bug if you press enter! To be sure it's not depending on qemu version, compiler version etc., the only solution is to prepare a self contained img that has the bug when you launch it with qemu-portable (the version I put inside QEMU folder) and send it to me. That way, I can execute the exact same code in the exact same emulator. Once I'm sure that my current computer config can create this bug, I can start to do different tests. It's important to try with the same qemu as me because we may not have the same QEMU configuration (bios, vga bios, etc.). That can have an influence. So I propose : - make an img that creates the problem in your usual qemu, launched with your usual command. - adapt the launch command from the launch script and try it with qemu portable from QEMU folder. If the crash is still there : send it to me, including the command you use. NOTE to try Qemu portable : be in your Clover folder and type ./Qemu/launch [path to CloverX64.efi to copy into the img] path to CloverX64.efi is usually something like ./Build/Clover/DEBUG_GCC53/X64/CloverX64.efi Link to comment Share on other sites More sharing options...
Slice Posted March 17, 2021 Share Posted March 17, 2021 OK, I reproduce this one more time. git checkout 0dc0a93045a9ff1490277cfc2738f4a1d0c637db No matter, some commit between bad and repair. Insert one line into sources: nvram.c INTN FindStartupDiskVolume ( REFIT_MENU_SCREEN *MainMenu ) { INTN Index; // LEGACY_ENTRY *LegacyEntry; // LOADER_ENTRY *LoaderEntry; // REFIT_VOLUME *Volume; REFIT_VOLUME *DiskVolume; BOOLEAN IsPartitionVolume; XStringW LoaderPath; XStringW EfiBootVolumeStr; //test crash by timeout return 0; // This return 0 makes me boot by timeout for first entry. ./hebuild.sh -fr cp /src/CloverBootloader/CloverPackage/CloverV2/EFI/CLOVER/CLOVERX64.efi Qemu/ ./Qemu/launch Qemu/CLOVERX64.efi And voila! CLOVERX64.efi Link to comment Share on other sites More sharing options...
Jief_Machak Posted March 18, 2021 Share Posted March 18, 2021 I 10 hours ago, Slice said: cp /src/CloverBootloader/CloverPackage/CloverV2/EFI/CLOVER/CLOVERX64.efi Qemu/ ./Qemu/launch Qemu/CLOVERX64.efi And voila! Ok, making progress. I did that and I got the crash ! If I used the img you've sent me few post ago, I got stuck at : I also have ehci: PERIODIC list base register set while periodic schedule is enabled and HC is enabled ehci: ASYNC list address register set while async schedule is enabled and HC is enabled in the mac console. Knowing which img you use is easy. The img committed in Qemu folder has this theme : Could you have a quick check by copying the img you've sent me in Qemu folder? Rename it to "disk_image_gpt.img" or modify the launch command. Link to comment Share on other sites More sharing options...
Slice Posted March 18, 2021 Share Posted March 18, 2021 In the evening only when I will be at home. But if you have a result with your image then it is enough. I send you shorten image I don't remember exactly what inside. Link to comment Share on other sites More sharing options...
Jief_Machak Posted March 18, 2021 Share Posted March 18, 2021 Sure, sure. I just wonder why this difference. No hurry especially now that I can work on the crash by using the committed img. I think I found why I couldn't recreate this crash at first : I was compiling debug version. Seems to only happen in release version ! Link to comment Share on other sites More sharing options...
Slice Posted March 18, 2021 Share Posted March 18, 2021 Once we already found other DEBUG|RELEASE difference. Link to comment Share on other sites More sharing options...
Jief_Machak Posted March 18, 2021 Share Posted March 18, 2021 I've added this in main.cpp, line 847 UINT32 a = GetDebugPrintErrorLevel (); DBG("%d", a); and the crash is also avoided. So even more clue that it's a stack or memory problem. I've just added "--define=DEBUG_ON_SERIAL_PORT" to the release compilation and the crash is also avoided. So that's the main difference between DEBUG and RELEASE. I guess I have to debug without debug messages ! 2 Link to comment Share on other sites More sharing options...
Slice Posted March 18, 2021 Share Posted March 18, 2021 When I added DBG("test 1\n"); the crash was at place. This way I narrowed down the problem to one place. It is not looking as floating memory problem. Link to comment Share on other sites More sharing options...
Jief_Machak Posted March 18, 2021 Share Posted March 18, 2021 I have to leave for now but I saw that some lines below, after setting up open core structure, I try to start openruntime.efi image even if it’s not loaded. i only had time to test once but the crash seems gone. Will test more and finish that tomorrow. Link to comment Share on other sites More sharing options...
Slice Posted March 18, 2021 Share Posted March 18, 2021 One more issue. debug.log is not created at this crash. It is very-very bad. Old method for debug.log assumed that it must be created at any crash. Not now. In the old method file opened or created and closed at every line. In the new method it will be open once and so not saved at incidence. 1 Link to comment Share on other sites More sharing options...
Jief_Machak Posted March 19, 2021 Share Posted March 19, 2021 For decades, I told programmer : "check return status of everything you call, even in experimental code. If you are sure of the return status, still check it and panic (or exception) if it's not what you think". The obvious reason is to quickly know where is the problem when things change. If you don't panic, crash happens later on. You waste a lot of time understand where and why. Why also in experimental code ? Because pieces of experimental code always crawl into production code ! I killed a lot of programmers over few decades to not follow this ! Look what I did when I was trying to integrate OC EFI_HANDLE DriverHandle; Status = gBS->LoadImage(false, gImageHandle, FileDevicePath(self.getSelfLoadedImage().DeviceHandle, FileName), NULL, 0, &DriverHandle); DBG("Load OpenRuntime.efi : Status %s\n", efiStrError(Status)); Status = gBS->StartImage(DriverHandle, 0, 0); DBG("Start OpenRuntime.efi : Status %s\n", efiStrError(Status)); Do you see it ? I call StartImage even if LoadImage fails !!! So bad. At that time, I thought OpenRuntime.efi was mandatory. But that's no excuse. By not testing the result code (Status) of LoadImage, I just wasted 2 programmers a lot of hours. Arg, shame on me. If I had put something like this : EFI_HANDLE DriverHandle; Status = gBS->LoadImage(false, gImageHandle, FileDevicePath(self.getSelfLoadedImage().DeviceHandle, FileName), NULL, 0, &DriverHandle); DBG("Load OpenRuntime.efi : Status %s\n", efiStrError(Status)); if ( EFI_ERROR(Status) ) panic("Cannot load OpenRuntime.efi"); Status = gBS->StartImage(DriverHandle, 0, 0); DBG("Start OpenRuntime.efi : Status %s\n", efiStrError(Status)); See the "if ( EFI_ERROR(Status) ) panic("Cannot load OpenRuntime.efi");" ? The message would have been very clear when people try to not put OpenRuntime.efi. Bug would have been fixed in 5 minutes top. Good example that my advice was good and I must have followed it . Sorry everyone. @Slice : and please say sorry for me in Russian on Applelife.ru 3 Link to comment Share on other sites More sharing options...
Jief_Machak Posted March 19, 2021 Share Posted March 19, 2021 Ah and for the log : in the config.plist in the Qemu disk image, the key was Log instead of Debug. I fixed it and it's in the zip. Delete "disk_image_gpt.img.zip" and the launch script will re-extract the zip. I don't understand why you said that I'm not closing the log, and why you re-import old method. At the end of the new "SaveMessageToDebugLogFile", there is "closeDebugLog();" If there is a log creation problem, tell me how to reproduce and I'll fix it. Link to comment Share on other sites More sharing options...
Slice Posted March 19, 2021 Share Posted March 19, 2021 4 hours ago, Jief_Machak said: See the "if ( EFI_ERROR(Status) ) panic("Cannot load OpenRuntime.efi");" ? Hmmm... Not agree. Legacy bios boot may live without OpenRuntime.efi so in this case the driver is not mandatory. May be better if ( !EFI_ERROR(Status) ) Status = gBS->StartImage(DriverHandle, 0, 0); Link to comment Share on other sites More sharing options...
Jief_Machak Posted March 19, 2021 Share Posted March 19, 2021 Just now, Slice said: Hmmm... Not agree. Legacy bios boot may live without OpenRuntime.efi so in this case the driver is not mandatory. May be better if ( !EFI_ERROR(Status) ) Status = gBS->StartImage(DriverHandle, 0, 0); Yes, it's what I did now I know it's not mandatory. I was explaining that, when I thought it was mandatory and always there, I still should have checked the return code. And because it was experimental at that time, if I had put a panic, we would have catch that bug very easy the first time someone tried to boot without OpenRuntime.efi, therefore avoiding painful process to reproduce and fix. In short, when creating new code and focusing on one case (boot with OpenRuntime.efi), it's better to put some panic for non-handled cases instead of nothing. Link to comment Share on other sites More sharing options...
Slice Posted March 19, 2021 Share Posted March 19, 2021 As I remember Pene can boot Z390 without OpenRuntime in UEFI mode. 1 minute ago, Jief_Machak said: Yes, it's what I did now I know it's not mandatory. I was explaining that, when I thought it was mandatory and always there, I still should have checked the return code. And because it was experimental at that time, if I had put a panic, we would have catch that bug very easy the first time someone tried to boot without OpenRuntime.efi, therefore avoiding painful process to reproduce and fix. In short, when creating new code and focusing on one case (boot with OpenRuntime.efi), it's better to put some panic for non-handled cases instead of nothing. As a method yes Link to comment Share on other sites More sharing options...
Slice Posted March 19, 2021 Share Posted March 19, 2021 22 minutes ago, MifJpn said: Hello, Slice, Jief. Thank you for always doing your best for everyone. I'm worried about Slice's rig malfunction. For now, the latest commits can be compiled by me as well. Also, there is no problem with its movement. From the place where * Dirty.efi was written in the middle of the compilation log, it seemed to me that the environment variables were adjusted again and the compilation was restarted. I expect it to be an adjustment to avoid any problems. Or is it due to enhancements? I hope you can answer when you have time. I hope the problem of qemu issue will be cured. Thank you. I didn't understand you. 1. Where is my rig malfunction? 2. "I expect it to be an adjustment to avoid any problems." - it is about what? What is Dirty.efi? Link to comment Share on other sites More sharing options...
Jief_Machak Posted March 19, 2021 Share Posted March 19, 2021 Did some fix in GenFw. If you got a seg fault 11 in compilation, regenerate your BaseTools. 2 Link to comment Share on other sites More sharing options...
Jief_Machak Posted March 19, 2021 Share Posted March 19, 2021 I continue to refactor SETTINGS_DATA to separate functional layers. For most of the settings it is just a rename, which cannot really introduce new bugs. But there is of course some more complicated ones. So If you use the SSDT or CPU settings EnableC2, EnableC4, EnableC6, C3Latency, HWPEnable or HWPValue, I'll very much appreciate a test with the latest commit, or the efi supplied in this post. Thanks.CLOVERX64.efi Link to comment Share on other sites More sharing options...
Recommended Posts