Clover General discussion

Jief_Machak · March 16, 2021

Can launch a kernel.

That's why I ask for your img. In there I'll have the same config as you, maximizing the chances to reproduce.

Slice · March 16, 2021

Place here CLOVERX64.efi from the post above

QEMU-test2.img-0.lzma

command to start

test4.sh

Adujst for your configuration.

Remember to use bios.bin but not OVMF.fd

Jief_Machak · March 17, 2021

Got this in console :

Loading 'mach_kernel'...
Could not open file 'mach_kernel'
Error loading kernel 'mach_kernel' (0xe)

and this in screen :

The kernel is not there so all that seems normal. No crash...

If you can reproduce, try launching from Qemu folder in Clover with the qemu in qemu-portable folder. maybe we have qemu bug problem...

Also, the clover you gave me doesn't boot with timeout. I tried compiling one, but no crash either.

Slice · March 17, 2021

It is not a qemu problem as other users reported the problem on real hardware.

1. This is legacy hardware having no UEFI. There is only bios boot. No problem with UEFI boot.

2. Boot by timeout is required for reproduce. If we type "enter" then no crash.

I reproduced the behaviour with my qemu and then made git bisection to find a commit when the problem started.

Slice · March 17, 2021

One more condition:

3. There is no OpenRuntime.efi. If it presents then no crash.

First observation

https://applelife.ru/threads/clover.42089/page-1339#post-926032

Second confirmation

https://www.applelife.ru/threads/clover.42089/page-1351#post-928012

My confirmation

https://www.applelife.ru/threads/clover.42089/page-1351#post-928188

Jief_Machak · March 17, 2021

You sent me a CloverX64.efi that suppose to reproduce but this efi doesn't boot with timeout.

5 hours ago, Slice said:

1. This is legacy hardware having no UEFI. There is only bios boot. No problem with UEFI boot.

2. Boot by timeout is required for reproduce. If we type "enter" then no crash.

I've well understood that, but can't reproduce yey on my Qemu.

5 hours ago, Slice said:

I reproduced the behaviour with my qemu

Good, I'd like the same.

4 hours ago, Slice said:

First observation

https://applelife.ru/threads/clover.42089/page-1339#post-926032

Second confirmation

https://www.applelife.ru/threads/clover.42089/page-1351#post-928012

My confirmation

https://www.applelife.ru/threads/clover.42089/page-1351#post-928188

I never said it's not true. No need to copy/paste link as "proof".

What I'm saying is using an intermediary variable doesn't "solve" the problem. It probably just hides it by coincidence. The problem cannot come for using .wc_str() in args. There is plenty of that and if we need to extract an intermediary variable for each, we won't see the end. It's basic compiler task, I doubt there is a bug in that.

Also, the fact that pressing enter avoids the problem is a sign of a deeper problem.

Creating an intermediary change the sized used from the stack and probably hides the fact that a pointer is wrongly rewritten by duplicating it. Or something else.

For me, it's crucial to find that. And I need help to reproduce and investigate.

If you think it doesn't matter, just let me know and I won't bother you anymore...

Slice · March 17, 2021

I agree there is the task. I just can't understand why you can't reproduce this while I did this many times.

It is very strange bug, not logical. And very strange that there is no crash if OpenRuntime presents. Why it influences on this codes?

Jief_Machak · March 17, 2021

Memory allocation and or usage. Loading OpenRuntime.efi may change, reallocate, move some memory. I'm pretty sure some memory is overwritten. If it's written on unused memory, it works...

What even more strange is no bug if you press enter!

To be sure it's not depending on qemu version, compiler version etc., the only solution is to prepare a self contained img that has the bug when you launch it with qemu-portable (the version I put inside QEMU folder) and send it to me. That way, I can execute the exact same code in the exact same emulator. Once I'm sure that my current computer config can create this bug, I can start to do different tests.

It's important to try with the same qemu as me because we may not have the same QEMU configuration (bios, vga bios, etc.). That can have an influence.

So I propose :

- make an img that creates the problem in your usual qemu, launched with your usual command.

- adapt the launch command from the launch script and try it with qemu portable from QEMU folder. If the crash is still there : send it to me, including the command you use.

NOTE to try Qemu portable : be in your Clover folder and type ./Qemu/launch [path to CloverX64.efi to copy into the img]

path to CloverX64.efi is usually something like ./Build/Clover/DEBUG_GCC53/X64/CloverX64.efi

Slice · March 17, 2021

OK, I reproduce this one more time.

git checkout 0dc0a93045a9ff1490277cfc2738f4a1d0c637db

No matter, some commit between bad and repair.

Insert one line into sources: nvram.c

INTN
FindStartupDiskVolume (
  REFIT_MENU_SCREEN *MainMenu
  )
{
  INTN         Index;
//  LEGACY_ENTRY *LegacyEntry;
//  LOADER_ENTRY *LoaderEntry;
//  REFIT_VOLUME *Volume;
  REFIT_VOLUME *DiskVolume;
  BOOLEAN      IsPartitionVolume;
  XStringW     LoaderPath;
  XStringW     EfiBootVolumeStr;
//test crash by timeout
  return 0;
  //

This return 0 makes me boot by timeout for first entry.

./hebuild.sh -fr

cp /src/CloverBootloader/CloverPackage/CloverV2/EFI/CLOVER/CLOVERX64.efi Qemu/

./Qemu/launch Qemu/CLOVERX64.efi

And voila!

Снимок экрана 2021-03-17 в 23.30.54.png

CLOVERX64.efi

Jief_Machak · March 18, 2021

I

10 hours ago, Slice said:

cp /src/CloverBootloader/CloverPackage/CloverV2/EFI/CLOVER/CLOVERX64.efi Qemu/

./Qemu/launch Qemu/CLOVERX64.efi

And voila!

Ok, making progress. I did that and I got the crash !

If I used the img you've sent me few post ago, I got stuck at :

I also have

ehci: PERIODIC list base register set while periodic schedule
      is enabled and HC is enabled
ehci: ASYNC list address register set while async schedule
      is enabled and HC is enabled

in the mac console.

Knowing which img you use is easy. The img committed in Qemu folder has this theme :

Could you have a quick check by copying the img you've sent me in Qemu folder? Rename it to "disk_image_gpt.img" or modify the launch command.

Slice · March 18, 2021

In the evening only when I will be at home.

But if you have a result with your image then it is enough. I send you shorten image I don't remember exactly what inside.

Jief_Machak · March 18, 2021

Sure, sure. I just wonder why this difference. No hurry especially now that I can work on the crash by using the committed img.

I think I found why I couldn't recreate this crash at first : I was compiling debug version. Seems to only happen in release version !

Slice · March 18, 2021

Once we already found other DEBUG|RELEASE difference.

Jief_Machak · March 18, 2021

I've added this in main.cpp, line 847

UINT32 a = GetDebugPrintErrorLevel ();
DBG("%d", a);

and the crash is also avoided. So even more clue that it's a stack or memory problem.

I've just added "--define=DEBUG_ON_SERIAL_PORT" to the release compilation and the crash is also avoided. So that's the main difference between DEBUG and RELEASE. I guess I have to debug without debug messages !

Slice · March 18, 2021

When I added DBG("test 1\n"); the crash was at place. This way I narrowed down the problem to one place. It is not looking as floating memory problem.

Jief_Machak · March 18, 2021

I have to leave for now but I saw that some lines below, after setting up open core structure, I try to start openruntime.efi image even if it’s not loaded.

i only had time to test once but the crash seems gone.

Will test more and finish that tomorrow.

Slice · March 18, 2021

One more issue. debug.log is not created at this crash.

It is very-very bad. Old method for debug.log assumed that it must be created at any crash. Not now.

In the old method file opened or created and closed at every line. In the new method it will be open once and so not saved at incidence.

Jief_Machak · March 19, 2021

For decades, I told programmer : "check return status of everything you call, even in experimental code. If you are sure of the return status, still check it and panic (or exception) if it's not what you think".

The obvious reason is to quickly know where is the problem when things change. If you don't panic, crash happens later on. You waste a lot of time understand where and why.

Why also in experimental code ? Because pieces of experimental code always crawl into production code !

I killed a lot of programmers over few decades to not follow this !

Look what I did when I was trying to integrate OC

    EFI_HANDLE DriverHandle;
    Status = gBS->LoadImage(false, gImageHandle, FileDevicePath(self.getSelfLoadedImage().DeviceHandle, FileName), NULL, 0, &DriverHandle);
    DBG("Load OpenRuntime.efi : Status %s\n", efiStrError(Status));
    Status = gBS->StartImage(DriverHandle, 0, 0);
    DBG("Start OpenRuntime.efi : Status %s\n", efiStrError(Status));

Do you see it ?

I call StartImage even if LoadImage fails !!! So bad. At that time, I thought OpenRuntime.efi was mandatory. But that's no excuse. By not testing the result code (Status) of LoadImage, I just wasted 2 programmers a lot of hours. Arg, shame on me.

If I had put something like this :

    EFI_HANDLE DriverHandle;
    Status = gBS->LoadImage(false, gImageHandle, FileDevicePath(self.getSelfLoadedImage().DeviceHandle, FileName), NULL, 0, &DriverHandle);
    DBG("Load OpenRuntime.efi : Status %s\n", efiStrError(Status));
    if ( EFI_ERROR(Status) ) panic("Cannot load OpenRuntime.efi");
    Status = gBS->StartImage(DriverHandle, 0, 0);
    DBG("Start OpenRuntime.efi : Status %s\n", efiStrError(Status));

See the "if ( EFI_ERROR(Status) ) panic("Cannot load OpenRuntime.efi");" ?

The message would have been very clear when people try to not put OpenRuntime.efi. Bug would have been fixed in 5 minutes top.

Good example that my advice was good and I must have followed it .

Sorry everyone.

@Slice : and please say sorry for me in Russian on Applelife.ru

Jief_Machak · March 19, 2021

Ah and for the log : in the config.plist in the Qemu disk image, the key was Log instead of Debug. I fixed it and it's in the zip. Delete "disk_image_gpt.img.zip" and the launch script will re-extract the zip.

I don't understand why you said that I'm not closing the log, and why you re-import old method. At the end of the new "SaveMessageToDebugLogFile", there is "closeDebugLog();"

If there is a log creation problem, tell me how to reproduce and I'll fix it.

Slice · March 19, 2021

4 hours ago, Jief_Machak said:

See the "if ( EFI_ERROR(Status) ) panic("Cannot load OpenRuntime.efi");" ?

Hmmm... Not agree. Legacy bios boot may live without OpenRuntime.efi so in this case the driver is not mandatory.

May be better

if ( !EFI_ERROR(Status) ) Status = gBS->StartImage(DriverHandle, 0, 0);

Jief_Machak · March 19, 2021

Just now, Slice said:
Hmmm... Not agree. Legacy bios boot may live without OpenRuntime.efi so in this case the driver is not mandatory.

May be better
if ( !EFI_ERROR(Status) ) Status = gBS->StartImage(DriverHandle, 0, 0);

Yes, it's what I did now I know it's not mandatory.

I was explaining that, when I thought it was mandatory and always there, I still should have checked the return code. And because it was experimental at that time, if I had put a panic, we would have catch that bug very easy the first time someone tried to boot without OpenRuntime.efi, therefore avoiding painful process to reproduce and fix.

In short, when creating new code and focusing on one case (boot with OpenRuntime.efi), it's better to put some panic for non-handled cases instead of nothing.

Slice · March 19, 2021

As I remember Pene can boot Z390 without OpenRuntime in UEFI mode.

1 minute ago, Jief_Machak said:

Yes, it's what I did now I know it's not mandatory.

I was explaining that, when I thought it was mandatory and always there, I still should have checked the return code. And because it was experimental at that time, if I had put a panic, we would have catch that bug very easy the first time someone tried to boot without OpenRuntime.efi, therefore avoiding painful process to reproduce and fix.

In short, when creating new code and focusing on one case (boot with OpenRuntime.efi), it's better to put some panic for non-handled cases instead of nothing.

As a method yes

Slice · March 19, 2021

22 minutes ago, MifJpn said:

Hello, Slice, Jief.
Thank you for always doing your best for everyone.

I'm worried about Slice's rig malfunction.

For now, the latest commits can be compiled by me as well. Also, there is no problem with its movement.

From the place where * Dirty.efi was written in the middle of the compilation log, it seemed to me that the environment variables were adjusted again and the compilation was restarted.
I expect it to be an adjustment to avoid any problems.
Or is it due to enhancements?

I hope you can answer when you have time.

I hope the problem of qemu issue will be cured.

Thank you.

I didn't understand you.

1. Where is my rig malfunction?

2. "I expect it to be an adjustment to avoid any problems." - it is about what? What is Dirty.efi?

Jief_Machak · March 19, 2021

Did some fix in GenFw.

If you got a seg fault 11 in compilation, regenerate your BaseTools.

Jief_Machak · March 19, 2021

I continue to refactor SETTINGS_DATA to separate functional layers.

For most of the settings it is just a rename, which cannot really introduce new bugs. But there is of course some more complicated ones.

So If you use the SSDT or CPU settings EnableC2, EnableC4, EnableC6, C3Latency, HWPEnable or HWPValue, I'll very much appreciate a test with the latest commit, or the efi supplied in this post.

Thanks.CLOVERX64.efi

Clover General discussion

Recommended Posts

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites