Slice Posted November 27, 2018 Share Posted November 27, 2018 Unclosed comment is a bug but not the reason for the issue. Thanks for the explanation about different GOP. Now it is more clear the mistery. My screen is 1366x768 which is not exactly 16:9. I think real buffer may differ from that while UGAWidth = 1366 and UGAHeight=768. Link to comment Share on other sites More sharing options...
Slice Posted November 27, 2018 Share Posted November 27, 2018 Clover hangs at rendering the icon so before GOP is working so I still don't understand what is the difference between two cases. (nvidia and intel) 1 Link to comment Share on other sites More sharing options...
apianti Posted November 27, 2018 Share Posted November 27, 2018 I'm not really sure how either is working because that the one number (the y of the destination point) is negative (and maybe that both x and y are insanely huge, basically the upper and lower bounds of a signed 32bit integer) so when the drawing happens there is going to be negative offsets in the vertical direction and would be drawing out of bounds of the memory allocated for the icon. Is the icon being rendered directly into the video memory or rasterised into an allocated memory buffer first and then rendered? C (x1 y1 x2 y2 x y)+ Draws a cubic Bézier curve from the current point to (x,y) using (x1,y1) as the control point at the beginning of the curve and (x2,y2) as the control point at the end of the curve. Because the first, directly rendering into video memory, would explain the different behaviors and also that the artifacts are corruptions from drawing outside, intel integrated uses regular memory so drawing outside of the area probably causes a segmentation fault for using unallocated memory. The debug build may give you the exception screen to help. The second, rasterising to a secondary buffer should cause all of them to fail since it would most likely be drawing outside the bounds of any allocated buffer. I am only making some assumptions since I haven't looked and I am just typing this early in the morning before I am about to leave. I don't know if the SVG rasterising checks the bounds properly or not. I would guess that the numbers are not supposed to be those and there is some problem reading them in or somewhere else in the process of creating the SVG info. Link to comment Share on other sites More sharing options...
Slice Posted November 27, 2018 Share Posted November 27, 2018 No, icon rendered into buffer which later will be outputted by GOP->Blt() but later. I see crash before this. And I agree some points may occur out of bounds and this is hard to debug. May be moving the whole nanosvg project from INT32 to INT64 will improve the stability. Link to comment Share on other sites More sharing options...
apianti Posted November 27, 2018 Share Posted November 27, 2018 There is also possibility that different machines are giving a slightly different integer value when converted from float to INT32 because of different algorithms used in the instructions in hardware, where in those big numbers might overflow and cause a zero answer which causes a divide by zero. The problem I would say is obviously somewhere in the reading or storing of the values since the SVG clearly contains no such curve. Also there really shouldn't be any case where you're hitting the ~2B maximum of an INT32 normally. Even drawing a 5k monitor at full resolution 32bit color is only around ~59MB which is well under an INT32 max. So I would check where the values are being loaded for a curve and see if there's a mistake somewhere there or anywhere that may change those values before being rasterised to the icon buffer. Link to comment Share on other sites More sharing options...
Slice Posted November 27, 2018 Share Posted November 27, 2018 Theme Clovy works since revision 4777. But the question is still here, why? Because vol_recovery became simpler? Link to comment Share on other sites More sharing options...
apianti Posted November 27, 2018 Share Posted November 27, 2018 (edited) Oh wait, I think I see the issue, if you comment out this path with class="st34" does it no longer crash? If it doesn't crash, I think the issue is nanosvg can't properly differentiate the points for some reason for the absolute cubic bezier curve (C). Try adding a space in between the points instead of commas. If it still crashes then does commenting out the other path prevent the crash? Edited November 27, 2018 by apianti Link to comment Share on other sites More sharing options...
Slice Posted November 27, 2018 Share Posted November 27, 2018 Link to comment Share on other sites More sharing options...
apianti Posted November 28, 2018 Share Posted November 28, 2018 (edited) Sorry, I was in a hurry, I was asking which path element was causing the problem of the two in the sample SVG you posted for the recovery volume, since they both have a different style of defining the points. The fact is that the problem lies within the loading, storing, or transformation of an absolute cubic bezier curve, the 'C' form in a path since that is what is giving an insane number. It may be that it's actually a bug for everything but is just only being produced by chance because of the way you (or probably illustrator) happened to code the curve path. The problem could be in the translation of one of the group containers. Another thing is why aren't you using like a bounding box of say "0, 0, 100, 100" and then scaling everything to be like percentages? That would make the drawing so much easier as you would know that everything is based on a percentage and you should never get any numbers that aren't within the resolution size of your display when scaled. Like say you want want something to be 10% by 8% of the screen, it's super easy to make that happen with a bounding box of that size on any display. It's much more difficult if you use a seemingly random bounding box, since there will be much more complicated math than basically percentage scaling and you would be able to easier debug the points because they would be more uniform. The points that are being used are hard to tell where they are at because they are scaled according to a weird sized bounding box, using like 0.25,-0.5 as relative points bounding the cubic bezier curve would be much easier to read and know whats going on instead of like 3.6786,-7 or whatever. Not to mention that displays have different resolutions/aspect ratios and you may end up with floating point rounding errors the more numbers are involved in a float, i.e. the more significant digits. Which is definitely happening when the display has a different resolution than the bounding box, which will inevitably be the case since very few displays have resolutions of what you have set as the bounding box. SVG is super complicated and I am almost positive that nanosvg is probably not equipped for the amount of weird ways you can properly describe a path form, especially since both the Clovy theme.svg and the sample svg you provided render properly in everything I've tried to render them in that supports SVG images. I was asking if you modified the way that the 'C' form is being described to a different format does that give you different results? It's also confusing that you would even be using an absolute anything since I'm assuming you want to move these around and that should change the angle of the curve when you translate the element to a different location, i.e. another volume is detected? Does the Clovy theme show top left corners with different angles than the rest of the shapes when not in the original placement? It may be hard to see unless you have a high resolution display. Just moving the recovery volume shape around inside the bounding box and zooming in, I can see that the top left corner gets steeper or shallower depending on where it is moved while the others stay the same shape. EDIT: Sorry, it's not the moving it's the zooming that is causing the corner to change shape which is what would be happening when this is scaled to a different size, especially a much larger display such as 4k/5k. EDIT2: Another note, if you just used only relative points (lowercase letters) then you would only need to have one group for each actual group that behaved as a unit. Edited November 28, 2018 by apianti Link to comment Share on other sites More sharing options...
Slice Posted November 28, 2018 Share Posted November 28, 2018 I have pathes just that it defined by Illustrator not coorecting them manually. One of the potencial problem is that when all points of cubic bezier lay in the view box some part may occurs outside of it because of curve Link to comment Share on other sites More sharing options...
apianti Posted November 28, 2018 Share Posted November 28, 2018 (edited) I just basically removed all translation and everything except that one absolute cubic bezier curve, the rest is all relative. Because I didn't fix the curve points, I should have ended up with a rectangle with inverted curves (swooping inward instead of outward) , instead I got that the top line was slanted, and as you said the curve is going outside the view box. EDIT: Also this does show that translation does affect the curves depth when using absolute points, it just worsens when zoomed/scaled because it's being multiplied and you can really tell. EDIT2: I should have did what you did with the bounding box, so you could see where it was but I guess its pretty obvious.... EDIT3: Only the third point specified has to lay within the bounding/view box, the first two are the points that control the curve, they do not actually lie on the curve but outside it and bind the curve to its shape, you should be able to choose some option for illustrator to show you the bounding points. There is no specified first point in the curve it's taken from the previous point location, which when you start a shape you specify with the 'M' (or 'm') form, which is move to point. Edited November 28, 2018 by apianti Link to comment Share on other sites More sharing options...
Slice Posted November 28, 2018 Share Posted November 28, 2018 Recently I closed the issue https://github.com/memononen/nanosvg/issues/127 while I still have doubts why this may happen. May be the difference in realloc used by Mikko and ReallocatePool used by me? Link to comment Share on other sites More sharing options...
apianti Posted November 28, 2018 Share Posted November 28, 2018 (edited) This line is the issue https://sourceforge.net/p/cloverefiboot/code/HEAD/tree/rEFIt_UEFI/libeg/nanosvg.c#l632, his code is https://github.com/memononen/nanosvg/blob/master/src/nanosvg.h#L703, which the condition p->npts < p->cpts must always hold, then it should be p->npts+1 > p->cpts or p->npts >= p->cpts EDIT: Sorry, I need to sleep and should look more before I say stuff. He wrote that so weird, where he has to subtract one every single time he uses n->npts. He should have used a zero based count and just assumed that there was always points if there is a buffer, also saves on instructions. Not to mention it would have been easier to use a pointer and increment it, but whatever. EDIT2: I'm not sure that his move to is correct, I believe that it should always create a new curve as per: Quote The first command is the "Move To" or M, which was described above. It takes two parameters, a coordinate ' x ' and coordinate ' y ' to move to. If your cursor already was somewhere on the page, no line is drawn to connect the two places. The "Move To" command appears at the beginning of paths to specify where the drawing should start. Edited November 28, 2018 by apianti Link to comment Share on other sites More sharing options...
Slice Posted November 28, 2018 Share Posted November 28, 2018 As he said p->npts is always 3n+1 = 1,4,7,10... not 8. Link to comment Share on other sites More sharing options...
apianti Posted November 28, 2018 Share Posted November 28, 2018 (edited) That's not possible as he is not adding all three points of a bezier curve segment at once, which he could do though much easier and simpler than what he did. 8 is absolutely a valid value for p->npts, building from his example, when you add another line to form then the very first point that will be added will be the eighth point [A B0 B1 B C0 C1 C D0], sure it does not stay that way and you end up with [A B0 B1 B C0 C1 C D0 D1 D]. If he was going to convert everything to cubic bezier curves then he should have just created a structure for it and added that instead. Also he is incorrect because he does not correctly implement the behavior of move, it's actually 3n + m, where m is the number of moves. So really he should have had a path structure that looks like this: typedef SVG_POINT SVG_POINT; struct SVG_POINT { float X; float Y; }; typedef SVG_POINT SVG_SEGMENT[3]; typedef struct SVG_CURVE SVG_CURVE; struct SVG_CURVE { UINTN Count; UINTN Size; SVG_POINT Start; SVG_SEGMENT *Segments; }; typedef struct SVG_PATH SVG_PATH; struct SVG_PATH { SVG_PATH *Next; UINTN Count; SVG_CURVE *Curves; }; EDIT: Oh I see where you're talking about thats r->npoints++ not p->npts++. It is correct, there are 3n + 1 points in that array because thats how he made it. He also should not add the first point back to the path to close it, there is nowhere it says that shapes have to be completely closed, you can straight up just draw a line or curve and that's it but his algorithm will always close the curve or redraw back over the line. EDIT2: Made the path structure more clear. EDIT3: I guess that you could even make this different with a linked list for the segments as well: typedef SVG_POINT SVG_POINT; struct SVG_POINT { float X; float Y; }; typedef struct SVG_SEGMENT SVG_SEGMENT; struct SVG_SEGMENT { SVG_SEGMENT *Next; SVG_POINT Points[3]; }; typedef struct SVG_CURVE SVG_CURVE; struct SVG_CURVE { SVG_CURVE *Next; SVG_POINT Start; SVG_SEGMENT *Segments; }; typedef struct SVG_PATH SVG_PATH; struct SVG_PATH { SVG_PATH *Next; SVG_CURVE *Curves; }; Edited November 29, 2018 by apianti Link to comment Share on other sites More sharing options...
Slice Posted November 29, 2018 Share Posted November 29, 2018 I think unclosed pathes are also possible. Fill will have no sense but stroke will produce image if width>0. Link to comment Share on other sites More sharing options...
apianti Posted November 29, 2018 Share Posted November 29, 2018 (edited) As I recall, although I don't have time to test this currently, even an unclosed path is filled as if it was closed but for sure as you say a stroke with width>0 should draw the unclosed path. Like, say you wanted to draw a closed lip smile on a persons face, his algorithm will just create an open mouth smile because it always closes the path to the original starting point. Which is also incorrect in any sense of how the behavior is expected. I mean this is what 'Z' or 'z' are for (returning back to the most recent moved to point specified by 'M' or 'm'), why would they exist if every shape was supposed to be closed? EDIT: Actually I don't even need to test, this contains an example about three-quarters down that shows an unclosed filled shape: https://developer.mozilla.org/en-US/docs/Web/SVG/Tutorial/Paths EDIT2: This makes it pretty obvious that there should not be automatic closure and indeed does go back to the previous point specified by a move to: https://www.w3.org/TR/SVG/paths.html#PathDataClosePathCommand Edited November 29, 2018 by apianti Link to comment Share on other sites More sharing options...
Slice Posted December 8, 2018 Share Posted December 8, 2018 We have new problem noticed in discussions and tickets. CloverEFI (boot6 and boot7) can't be correctly compiled by Xcode10,1. Tested both XCODE8 and XCODE5 rules, --std-ebda or --low-ebda. CloverEFI works normal compiled by Xcode7.3.1 even with XCODE8 rule. Official release 4798 contains old boot6 and boot7 compiled by Xcode8.3.3 1 Link to comment Share on other sites More sharing options...
apianti Posted December 8, 2018 Share Posted December 8, 2018 Is there a noticeable difference in the size of the boot files from Xcode10.1 compared to the others? Does it fail in QEMU or other VMs? Can you not use the serial debugging with a VM to output to a file to see if the boot file is even gaining control or the problem is with loading/pass off? Link to comment Share on other sites More sharing options...
Slice Posted December 9, 2018 Share Posted December 9, 2018 Yes, it fails even in QEMU so serial debugging is possible if I remember how to do it. Size by Xcode10,1 less then older xcode EfiLdr20Pure = 429 338 by Xcode10 = 434 864 by Xcode7 Same sources. I found for example different compilation of file X86Thunk.c. But I see no why new compilation is worse. Spoiler xcode10 _AsmPrepareThunk16: 0000000000000015 pushq %rbp 0000000000000016 movq %rsp, %rbp 0000000000000019 pushq %rsi 000000000000001a subq $0x28, %rsp 000000000000001e movq %rcx, %rsi 0000000000000021 movq 0x8(%rsi), %rcx 0000000000000025 movzwl (%rip), %r8d 000000000000002d leaq (%rip), %rdx 0000000000000034 callq 0x39 0000000000000039 movq 0x8(%rsi), %rcx 000000000000003d movzwl (%rip), %r9d 0000000000000045 movl %ecx, %eax 0000000000000047 andl $0xfff0, %eax 000000000000004c movw %ax, 0xa(%rcx,%r9) 0000000000000052 movb 0xa(%rsi), %al 0000000000000055 movb %al, 0xc(%rcx,%r9) 000000000000005a movq 0x8(%rsi), %r8 000000000000005e movl %r8d, %eax 0000000000000061 andl $0xf, %eax 0000000000000064 movzwl (%rip), %edx 000000000000006b addl %eax, (%r8,%rdx) 000000000000006f leaq (%rcx,%r9), %rcx 0000000000000073 movl 0x14(%rsi), %edx 0000000000000076 testb $0x1, %dl 0000000000000079 jne 0x8a 000000000000007b movb $0x70, %al 000000000000007d andb %al, 0xe(%rcx) 0000000000000080 andb %al, 0x16(%rcx) 0000000000000083 movq 0x8(%rsi), %r8 0000000000000087 movl 0x14(%rsi), %edx 000000000000008a movzwl (%rip), %eax 0000000000000091 movq %rcx, (%r8,%rax) 0000000000000095 movq 0x8(%rsi), %rax 0000000000000099 movzwl (%rip), %ecx 00000000000000a0 movl %edx, (%rax,%rcx) 00000000000000a3 addq $0x28, %rsp 00000000000000a7 popq %rsi 00000000000000a8 popq %rbp 00000000000000a9 retq _AsmThunk16: xcode7 Spoiler _AsmPrepareThunk16: 0000000000000015 pushq %rbp 0000000000000016 movq %rsp, %rbp 0000000000000019 pushq %rsi 000000000000001a pushq %rdi 000000000000001b pushq %rbx 000000000000001c subq $0x28, %rsp 0000000000000020 movq %rcx, %rbx 0000000000000023 movq 0x8(%rbx), %rcx 0000000000000027 movzwl (%rip), %r8d 000000000000002f leaq (%rip), %rdx 0000000000000036 callq 0x3b 000000000000003b movq 0x8(%rbx), %rcx 000000000000003f movzwl (%rip), %edx 0000000000000046 leaq (%rdx,%rcx), %r8 000000000000004a movq 0x8(%rdx,%rcx), %rax 000000000000004f movl %ecx, %esi 0000000000000051 andl $-0x10, %esi 0000000000000054 shll $0x10, %esi 0000000000000057 movzwl %ax, %edi 000000000000005a orl %esi, %edi 000000000000005c movl %edi, 0x8(%rdx,%rcx) 0000000000000060 movzbl 0xa(%rbx), %edi 0000000000000064 shrq $0x20, %rax 0000000000000068 andl $0xffffff00, %eax 000000000000006d orl %edi, %eax 000000000000006f movl %eax, 0xc(%rdx,%rcx) 0000000000000073 movq 0x8(%rbx), %rcx 0000000000000077 movl %ecx, %eax 0000000000000079 andl $0xf, %eax 000000000000007c movzwl (%rip), %edx 0000000000000083 addl %eax, (%rdx,%rcx) 0000000000000086 movl 0x14(%rbx), %edx 0000000000000089 testb $0x1, %dl 000000000000008c jne 0x9f 000000000000008e movb $0x70, %al 0000000000000090 andb %al, 0xe(%r8) 0000000000000094 andb %al, 0x16(%r8) 0000000000000098 movq 0x8(%rbx), %rcx 000000000000009c movl 0x14(%rbx), %edx 000000000000009f movzwl (%rip), %eax 00000000000000a6 movq %r8, (%rax,%rcx) 00000000000000aa movq 0x8(%rbx), %rax 00000000000000ae movzwl (%rip), %ecx 00000000000000b5 movl %edx, (%rcx,%rax) 00000000000000b8 addq $0x28, %rsp 00000000000000bc popq %rbx 00000000000000bd popq %rdi 00000000000000be popq %rsi 00000000000000bf popq %rbp 00000000000000c0 retq _AsmThunk16: Link to comment Share on other sites More sharing options...
Slice Posted December 9, 2018 Share Posted December 9, 2018 I found the reason. Xcode10 made optimization with intrinsic functions (again!). The code // // Patch IVT 0x68 ~ 0x6f // IdtArray = (UINT32 *) 0; for (Index = 0; Index < 8; Index++) { IdtArray[ProtectedModeBaseVector + Index] = ((EFI_SEGMENT (LegacyRegionBase + Index * 4)) << 16) | (EFI_OFFSET (LegacyRegionBase + Index * 4)); } Compiled as Xcode7 Spoiler 00000000000000ce movq -0x8(%rbp), %r8 00000000000000d2 movq %r8, %rcx 00000000000000d5 shlq $0xc, %rcx 00000000000000d9 movzbl -0x9(%rbp), %edx 00000000000000dd xorl %esi, %esi 00000000000000df movl %ecx, %edi 00000000000000e1 andl $0xf0000000, %edi 00000000000000e7 leal (%r8,%rsi), %eax 00000000000000eb movzwl %ax, %eax 00000000000000ee orl %edi, %eax 00000000000000f0 movl %eax, (%rsi,%rdx,4) 00000000000000f3 addq $0x4, %rsi 00000000000000f7 addl $0x4000, %ecx 00000000000000fd cmpq $0x20, %rsi 0000000000000101 jne 0xdf and Xcode10 Spoiler 00000000000000ab movq -0x10(%rbp), %rcx 00000000000000af leaq (%rip), %rdx 00000000000000b6 movl $0x20, %r8d 00000000000000bc callq 0xc1 00000000000000c1 leaq -0x1(%rbp), %r8 00000000000000c5 xorl %edx, %edx 00000000000000c7 movq %rdi, %rcx 00000000000000ca callq *0x20(%rdi) 00000000000000cd testq %rax, %rax 00000000000000d0 js 0xde 00000000000000d2 movq $-0x8, %rax 00000000000000d9 incq %rax 00000000000000dc jne 0xd9 LegacyBiosThunk.c should be rewritten to avoid such optimization. Link to comment Share on other sites More sharing options...
vector sigma Posted December 9, 2018 Share Posted December 9, 2018 To me says: warning: boot file bigger than low-ebda permits, switching to --std-ebda Link to comment Share on other sites More sharing options...
Slice Posted December 9, 2018 Share Posted December 9, 2018 Looks like resolved by commit 4802. Now boot6 compiled by xcode10 works in my QEMU. 1 hour ago, vector sigma said: To me says: warning: boot file bigger than low-ebda permits, switching to --std-ebda It is normal. It switched low-ebda <-> std-ebda automatically. 4 Link to comment Share on other sites More sharing options...
apianti Posted December 9, 2018 Share Posted December 9, 2018 I figured it was an optimization issue which is why I had suggested to disable optimization to test. Link to comment Share on other sites More sharing options...
apianti Posted December 11, 2018 Share Posted December 11, 2018 (edited) On 12/9/2018 at 2:15 AM, Slice said: LegacyBiosThunk.c should be rewritten to avoid such optimization. I think you should be able to rewrite like: UINT32 *IdtArrayEnd; UINT32 RegionBase; IdtArray = (UINT32 *)(UINTN)(((UINT32)ProtectedModeBaseVector) * sizeof(UINT32)); IdtArrayEnd = IdtArray + 8; for (RegionBase = (UINT32)(UINTN)LegacyRegionBase; IdtArray < IdtArrayEnd; RegionBase += sizeof(UINT32)) { *IdtArray++ = (((RegionBase & 0xF0000) << 12) | (RegionBase & 0xFFFF)); } That should, I think, prevent that specific optimization. Then you should be able to reenable optimization so the file is smaller again. EDIT: Oops, made a mistake. Forgot the * to derefence the pointer to set the memory at the location not change the memory location, lol. EDIT2: This should also produces less instructions anyway. EDIT3: Just in case you want to know how it produces less instructions, I removed a right shift, two multiplications and two additions from the loop and only added one multiplication and one addition/multiplication outside the loop. Edited December 12, 2018 by apianti 1 Link to comment Share on other sites More sharing options...
Recommended Posts