Time2Retire (Author), posted May 29, 2011:

I found something! I did another round of slowly removing components of SSDT_PR. The stripped-down version of ACST with only one "Package (0x40)" never goes beyond 35x, where the old version happily steps to 39x. Need to do one more test, then I will post the differences and a new minimal SSDT_PR.

In the latest boot I included HDEF and _SUN. I haven't checked PDC0 yet, but everything working suggests that it's not changing it.

Right, so basically the difference between the mini SSDT_PR you posted and this one is the full ACST. This version gives me the full stepping with turbo modes, where the stripped-down ACST only stepped between 16 and 35...

What do you mean by "full stepping"? Are all defined P-States now showing up / being used?

Good news. I also solved something in the meantime. Look here:

Got boot device = IOService:/AppleACPIPlatformExpert/PCI0@0/AppleACPIPCI/SATA@1F,2/AppleIntelPchSeriesAHCI/
flAked, posted May 29, 2011:

> What do you mean by "full stepping"? Are all defined P-States now showing up / being used?

These are the P-States I get after using the OS for 1 hour: 16, 22, 26, 30, 33, 35, 36, 38, 39

Those are only 9 of my overall 19 states. Monitored in 500 tick intervals. The last four are the turbo states (2356).

> Got boot device = IOService:/AppleACPIPlatformExpert/PCI0@0/AppleACPIPCI/SATA@1F,2/AppleIntelPchSeriesAHCI/

Nice! Please also see my last post at page 30...
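A small worked example of how the turbo setting "2356" lines up with the last four multipliers in that list. This is only a sketch: the base ratio of 33 is an assumption inferred from the list (33 is the highest multiplier before the turbo states), and the function name is made up for illustration.

```python
def turbo_multipliers(base_ratio, turbo_bins):
    """Return the turbo multipliers implied by a base ratio and a bin string.

    base_ratio: highest non-turbo multiplier (33 is an assumption here).
    turbo_bins: one digit per turbo step, e.g. "2356".
    """
    return [base_ratio + int(digit) for digit in turbo_bins]


# Multipliers reported in the post above.
observed = [16, 22, 26, 30, 33, 35, 36, 38, 39]

turbo = turbo_multipliers(33, "2356")
print(turbo)                   # the four turbo multipliers
print(observed[-4:] == turbo)  # compare against the last four observed states
```

Under that assumption, each digit of "2356" is the number of bins added on top of the base ratio.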
Time2Retire (Author), posted May 29, 2011:

> These are the P-States I get after using the OS for 1 hour: 16, 22, 26, 30, 33, 35, 36, 38, 39
> Those are only 9 of my overall 19 states. Monitored in 500 tick intervals. The last four are the turbo states (2356).
> Please also see my last post at page 30...

Yup. Got it. Let me first change my turbo ratio to 2356. Right after I finish my reply.

I also switched to booting in 64-bit mode. Unfortunately still no bit 5 in PDC0. One off the list, so to speak. More testing to do...

Going to call it a day. My Geekbench score dropped to an all-time low. Will look at it when I am fresh again. Get some sleep!

I just remembered something. This is something I forgot. Look at your FACP.dsl and check:

    P-State Control

What do you have there? This is 0x00 on a MacBookPro8,[3, 2 and 1]
flAked, posted May 29, 2011:

> What do you have there? This is 0x00 on a MacBookPro8,[3, 2 and 1]

P-State Control : C1
Time2Retire (Author), posted May 29, 2011:

> P-State Control : C1

Yup. Same here.

BTW: I am now using v713 but no difference. Still a heck of a lot slower.

Oh, and with debug=127 as boot option, you get a few lines (they won't show up in kernel.log) like:

    AppleIntelCPUPowerManagement: Turbo Mode enabled
    AppleIntelCPUPowerManagement: SMT enabled

And there's more interesting stuff if you know how to enable it:

    DBG: NormalToRealTime2: waiting to get to Px (%u) at %u
    DBG: NormalToRealTime2: we're at %u (target was %u)
    DBG: NormalToRealTime2: switched to real-time context
    DBG: NormalToRealTime2: ran for a while
    DBG: NormalToRealTime2: switched back to normal context
    DBG: NormalToRealTIme2: target = %u, CPU P-State = %u, package P-State = %u

Probably not even possible without having access to the source code, but this kind of output would be a great help for us.
flAked, posted May 29, 2011:

> And there's more interesting stuff if you know how to enable it:

Is that in AICPUPM? I don't have these in my version (10K521).

What did you do to get IntelPCH showing up? Using the extra keys from the linked kext?

> Unfortunately still no bit 5 in PDC0.

Let's make sure it is even a problem. We find the important part in Cpu0Ist:

    Method (_PSD, 0, NotSerialized)
    {
        ....
        If (And (PDC0, 0x0800))
        {
            Return (HPSD) // Hardware PSD
        }
        Return (SPSD) // Software PSD
    }

Depending on PDC0, a PSD object with either 0xFE (HPSD) or 0xFC (SPSD) is returned.

Wait a minute, PDC0 is checked for bit 11, "OSPM is capable of hardware coordination of P States". That's a bit odd? This indicates that HPSD is preferred over SPSD, if set.

The question is, why put it in there if it's not getting used by AICPUPM?
1) be spec compliant?
2) support Windows via Boot Camp? (Windows might have real trouble without those ACPI definitions)

Loaded up Windows, isn't that interesting? Full prime95 torture test going on 3 threads, but core 0 is still running at 35x. The only odd thing is that cores 2 and 3 kept shifting between 35-36x.

Anyway, this should answer your question of whether all cores are stepping up at the same time.
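To make the bit arithmetic in that _PSD method easier to follow, here is a minimal sketch of the same test outside of ASL. The function and constant names are invented for illustration; the only facts taken from the thread are the mask 0x0800 (bit 11), the bit 5 check (0x20), and the two coordination values 0xFE and 0xFC, which the ACPI specification names HW_ALL and SW_ALL.

```python
HW_COORD_BIT = 11   # mask 0x0800, the bit tested in _PSD
PDC_BIT_5 = 5       # mask 0x20, the bit that has been missing so far

HPSD_COORD = 0xFE   # hardware coordination (HW_ALL)
SPSD_COORD = 0xFC   # software coordination (SW_ALL)


def bit_is_set(value, bit):
    """Return True when the given bit number is set in value."""
    return (value >> bit) & 1 == 1


def select_psd_coord(pdc0):
    """Mirror the _PSD logic: hardware PSD if bit 11 is set, else software PSD."""
    return HPSD_COORD if bit_is_set(pdc0, HW_COORD_BIT) else SPSD_COORD


# Hypothetical PDC0 values, chosen only to exercise both branches.
for pdc0 in (0x0000, 0x0020, 0x0800, 0x0820):
    print(hex(pdc0), bit_is_set(pdc0, PDC_BIT_5), hex(select_psd_coord(pdc0)))
```

Note that bit 5 and bit 11 are independent: a PDC0 value can have either, both, or neither set.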
elitee, posted May 29, 2011:

The latest versions of Chimera/Chameleon/updated branches seem to cause issues and it can't read the CST data.

    ACPI_SMC_PlatformPlugin::pushCPU_CSTData - _CST evaluation failed
    ACPI_SMC_PlatformPlugin::pushCPU_CSTData - _CST evaluation failed
    ACPI_SMC_PlatformPlugin::registerLPCDriver - WARNING - LPC device initialization failed: C-state power management not initialized

Haven't changed anything else, so I must be missing something recently changed in your PR aml?
Time2Retire (Author), posted May 29, 2011:

> Is that in AICPUPM? I don't have these in my version (10k521).

Yes. But don't look in kernel.log because the debug output will only be dumped to your screen (and the serial port). I also changed AppleSMBIOS a little. Like so:

    bool AppleSMBIOS::start( IOService * provider )
    {
        OSSerializer * serializer;

        if (super::start(provider) != true ||
            IOService::getResourceService() == 0 ||
            IOService::getResourceService()->getProperty("SMBIOS"))
        {
            return false;
        }

        waitQuiet(500);    // added line (highlighted in red in the original post)

        SMBIOSTable = NULL;
        SMBIOSTableLength = 0;

> What did you do to get IntelPCH showing up? Using the extra keys from the linked kext?

Nothing like that. Like I said, this is an IRQ conflict in my DSDT, and probably in other DSDTs as well. The solution (changed ARnn) will be part of my next DSDT update. In short: the kext needs two IRQs, one of which should be in the 10-15 range, and not just 2 or whatever you might have there now.

> Let's make sure it is even a problem. We find the important part in Cpu0Ist:... Depending on PDC0, a PSD object with either 0xFE (HPSD) or 0xFC (SPSD) is returned.

Correct. This was one of the reasons for me to get the PDC0 value. This is how I work. Had to rule out things.

> Wait a minute, PDC0 is checked for bit 11, "OSPM is capable of hardware coordination of P States". That's a bit odd? This indicates that HPSD is preferred over SPSD, if set. The question is, why put it in there if it's not getting used by AICPUPM? 1) be spec compliant? 2) support windows via bootcamp? (windows might have real troubles without those acpi definitions) Loaded up windows, isn't that interesting? Full prime95 torture test going on 3 threads, but core 0 is still running at 35x. The only odd thing is that core 2 and 3 kept shifting between 35-36x.

We have 0xFE in our DSDT. Nothing else. Most likely because, like I said earlier, the ACPI has been developed with Windows compatibility (only) in mind.

This was also why I asked you to change the value in _PSD, because previous versions of AICPUPM do in fact use that. Without this it doesn't even work (see the monstrously long Vanilla Intel SpeedStep thread).

> Anyways this should answer your question if all cores are stepping up at the same time.

It sure does show me something! That MSR 0x198 is in fact a per-core MSR. Otherwise the values would all have been the same. AICPUPM is just not doing the right thing(s) here. Let's have a closer look at the binary once again... and maybe it is time for us (someone) to give the Lion kext a run or two.

I also see (from your screen capture) that the cores are running at 1.27 ~ 1.28 Volts. Which is probably why (for Windows) the 0x40 (64 bit) was used in Method _PCT.

My brother's morning must have been boring as BEEP because he wrote me a function to get the P-States. Sweet.

> The latest versions of Chimera/Chameleon/updated branches seem to cause issues and it can't read the CST data.... Haven't changed anything else, so I must be missing something recently changed in your PR aml?

And you are not letting it generate P/C-States for you?
flAked, posted May 29, 2011:

> Yes. But don't look in kernel.log because the debug output will only be dumped to your screen (and the serial port).

Note: the _actual_ strings are not in the binary for my version. If the strings are in your binary it should be no problem to patch it.

> This was also why I asked you to change the value in _PSD because previous versions of AICPUPM do in fact use that. Without this it doesn't even work (see the monstrous long Vanilla Intel Speedstep thread).

Yes, but my latest iterations of tests have shown that some sort of stepping is going on without _PSD.

> It sure does show me something! That MSR 0x198 is in fact a per-core MSR. Otherwise the values would all have been the same. AICPUPM is just not doing the right thing(s) here. Let's have a closer look at the binary once again... and maybe it is time for us (someone) to give the Lion kext a run or two.

It seems the Intel documentation is either faulty on this point or is not telling us everything.

Have you tried the patcher on the Lion kext? It will probably fail, but it would be interesting to see if some offsets are the same.

I was trying to find a tool to log 0x198 continuously in Windows to see if more P-states are being used; no luck so far. Of course it would use the factory table for it, but still, it would show how much granularity is applied by Windows.
elitee, posted May 29, 2011:

> And you are not letting it generate P/C-States for you?

I thought by adding "GeneratePStates"="Yes" and "GenerateCStates"="Yes" it would overwrite what was being added by the _PR and generate them itself? Am I mistaken?
Time2Retire (Author), posted May 29, 2011:

> Note: the _actual_ strings are not in the binary for my version. If the strings are in your binary it should be no problem to patch it.

Now that is strange, because I do see them in the binary of 10K521 and 10K524 (I checked both of them). Which is what you are using, right? I also checked the md5 and yours is not the same as what I get from the one in the COMBO / DELTA update packages. Should be: d6158b61e62dc4d44c7f7a5f84b44354 (see post #575). So are you using 10K524 or maybe 10K531 now?

> Yes but my latest iterations of tests have shown that some sort of stepping is going on without _PSD.

Hardware stepping doesn't need anything. The BIOS takes care of it. We checked this on an old Asus board.

> It seems the Intel documentation is either faulty on this point or is not telling us everything.

I wouldn't be surprised. Intel recently updated it, twice, in a short period of time. And documentation is usually late everywhere.

> Have you tried the patcher on the Lion kext? It will probably fail, but it would be interesting to see if some offsets are the same.

No. Not yet. I'd first like to know what it is that makes things work at your end, but I fail to reproduce it. Very strange.

> I was trying to find a tool to log 0x198 continuously in windows to see if more P-states are being used, no luck so far. Anyways, it would use the factory table for it, but still, it would show how much granularity is applied by windows.

You might in fact have more luck using Linux.

> I thought by adding "GeneratePStates"="Yes" and "GenerateCStates"="Yes" it would overwrite what was being added by the _PR and generate them itself? Am I mistaken?

The idea itself isn't wrong, no, but it first has to be developed for Sandy Bridge processors. That is clearly still not the case. Not for Chameleon. Not for Chimera. Not for any other boot loader (I know how, but RevoBoot won't do this).
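For anyone who wants to try the Linux route for logging MSR 0x198 per core, a rough sketch follows. It assumes the Linux `msr` kernel module is loaded (so that `/dev/cpu/N/msr` exists), that it is run as root, and that the current multiplier sits in bits 15:8 of the register, which is how it is commonly reported for Sandy Bridge but is not confirmed anywhere in this thread. The function names are made up.

```python
import os
import struct

IA32_PERF_STATUS = 0x198  # the MSR discussed in the thread


def ratio_from_perf_status(raw):
    """Extract the multiplier, assuming it lives in bits 15:8 of the raw value."""
    return (raw >> 8) & 0xFF


def read_msr(cpu, msr):
    """Read one 64-bit MSR from a specific logical CPU via the Linux msr driver."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        data = os.pread(fd, 8, msr)  # the file offset selects the MSR number
    finally:
        os.close(fd)
    return struct.unpack("<Q", data)[0]


def sample_ratios(cpus):
    """Return {cpu: multiplier} for each logical CPU in cpus."""
    return {cpu: ratio_from_perf_status(read_msr(cpu, IA32_PERF_STATUS))
            for cpu in cpus}
```

Calling `sample_ratios(range(os.cpu_count()))` in a loop with a short sleep would give a per-core log. Because each read goes through that CPU's own device node, differing values between cores would show up directly.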
elitee, posted May 29, 2011:

> The idea itself isn't wrong no, but it has first to be developed for Sandy Bridge processors. That is clearly still not the case. Not for Chameleon. Not for Chimera. Not for any other boot loader (I know how, but RevoBoot won't do this).

So that would mean it's not loading the C-states from the pr.aml (for whatever reason), correct?
Time2Retire (Author), posted May 29, 2011:

> So that would mean it's not loading the C-states from the pr.aml (for whatever reason), correct?

Correct. Let's assume that you are using RevoBoot, for which you need to have:

    /Extra/ACPI/dsdt.aml
    /Extra/ACPI/ssdt_pr.aml
    /Extra/ACPI/ssdt_usb.aml (if you aren't using my static version already)

Plus the following compiler switches in settings.h:

    #define PATCH_ACPI_TABLE_DATA 1
    #define STATIC_SSDT_USB_TABLE_INJECTION 1
    #define LOAD_DSDT_TABLE_FROM_EXTRA_ACPI 1
    #define LOAD_SSDT_TABLE_FROM_EXTRA_ACPI 1
    #define DROP_SSDT_TABLES 1
    #define APPLE_STYLE_ACPI 1

Chameleon users, however, should use:

    /Extra/dsdt.aml
    /Extra/ssdt-1.aml
    /Extra/ssdt-2.aml
flAked, posted May 29, 2011:

> Now that is strange because I do see them in the binary of 10K521 and 10K524 (I checked both of them). Which is what you are using, right? I also checked the md5 and yours is not the same as what I get from the one in the COMBO / DELTA update packages. Should be: d6158b61e62dc4d44c7f7a5f84b44354 (See post #575). So are you using 10K524 or maybe 10K531 now?

I'm still on 10K521. These are the md5's from the combo update:

    MD5 (AppleIntelCPUPowerManagement_org) = d6158b61e62dc4d44c7f7a5f84b44354
    MD5 (AppleIntelCPUPowerManagement_patch) = 2f4f8fbd071c3e59020484e07732ced1

I just rechecked with a fresh extract; it's the same md5 as the kext I'm currently using. I do see the DBG strings in the Lion DP3 binary, but they are not present in 10K521.

What happens if you set OSBundleEnableKextLogging = true in the Info.plist?
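Since the comparison above hinges on matching md5 sums, here is a small sketch of doing the same check from a script instead of by eye. The expected value is the one quoted in the thread for the unpatched binary; the function names and the idea of passing a path to the binary inside the kext bundle are illustrative assumptions.

```python
import hashlib

# md5 quoted in the thread for the unpatched AppleIntelCPUPowerManagement binary.
EXPECTED_MD5 = "d6158b61e62dc4d44c7f7a5f84b44354"


def md5_of_bytes(data):
    """Return the hex md5 digest of a bytes object."""
    return hashlib.md5(data).hexdigest()


def md5_of_file(path, chunk_size=65536):
    """Return the hex md5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True when the file at path has the expected md5 (case-insensitive)."""
    return md5_of_file(path) == expected.lower()
```

The result of `md5_of_file` has the same form as the value printed after the `=` sign in the `MD5 (...)` lines above.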
Time2Retire (Author), posted May 29, 2011:

> I'm still on 10K521, these are the md5's from the combo update:
> MD5 (AppleIntelCPUPowerManagement_org) = d6158b61e62dc4d44c7f7a5f84b44354
> MD5 (AppleIntelCPUPowerManagement_patch) = 2f4f8fbd071c3e59020484e07732ced1
> I just rechecked with a fresh extract, it's the same md5 of the kext I'm currently using. I do see the DBG strings in the Lion DP3 binary, but they are not present in 10K521. What happens if you set OSBundleEnableKextLogging = true in the info.plist?

I will try that before going back to an older kext. Let's see if bit 5 is set, for me here, when I use that instead.

I disabled Hyper-Threading in UEFI (since your CPU doesn't support it) but no change. Well, except for losing four threads of course. That and this debug output:

    AppleIntelCPUPowerManagement: SMT enabled

It is pretty obvious why that one is gone.
flAked, posted May 29, 2011:

You know what it is. It will only patch x86_64.

speed_stepper_lion_dp3.zip

There is new PM code in the Lion DP3 kernel; sadly it won't load on 10.6.8.

I need to get NVEnabler for Lion or I get eye-cancer...
Time2Retire (Author), posted May 29, 2011:

> You know what it is It will only patch x86_64. speed_stepper_lion_dp3.zip

Thanks.

Don't know how to add a boolean in Info.plist. Kext won't even load anymore...
flAked, posted May 29, 2011:

> Thanks. Don't know how to add a boolean in Info.plist. Kext won't even load anymore...

If you edit the plist in Xcode you can change the type there.

What will change in using waitQuiet(500)? It will load later and then?

Maybe we need to use "mp_rendezvous" (with interrupts) to get the per-cpu msr?
Time2Retire (Author), posted May 29, 2011:

> If you edit the plist in Xcode you can change the type there.

Perfect tip. Now I know what to use when I edit the plist with nano.

> What will change in using waitQuiet(500)? It will load later and then?

It will wait and prevent lines from scrolling off your screen. Which is the case for me with 8 threads.

> Maybe we need to use "mp_rendezvous" (with interrupts) to get the per-cpu msr?

Is that what CPU-i is using (haven't checked)?

I am now using AICPUPM from 10K521 with Hyper-Threading switched off, and I still don't get bit 5 set. Are you absolutely sure that bit 5 is set? You're not using some freaking FakeSMC plug-in that is changing something?

Got to go for a while now. Back later...
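On the boolean question: in an XML property list a boolean is written as a `<true/>` or `<false/>` element after the key, not as a string. The sketch below shows that using Python's standard `plistlib` on an in-memory sample. The sample dictionary and its bundle identifier are placeholders, not the real kext Info.plist, and the helper name is made up.

```python
import plistlib


def enable_kext_logging(plist_bytes):
    """Return a copy of the plist with OSBundleEnableKextLogging set to true."""
    info = plistlib.loads(plist_bytes)
    info["OSBundleEnableKextLogging"] = True  # a real boolean, not the string "true"
    return plistlib.dumps(info)


# Placeholder plist standing in for a kext's Info.plist.
sample = plistlib.dumps({"CFBundleIdentifier": "com.example.kext"})

patched = enable_kext_logging(sample)
print(patched.decode("utf-8"))
```

The printed XML shows the key followed by `<true/>`, which is the form to type when editing the file by hand.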
flAked, posted May 29, 2011:

Using this inside ACST gives me the "Stepper CPU":

    If (LEqual (And (PDC0, 0x20), 0x20)) // Check bit 5 of PDC0 - set!
    {
        ....
    }

> Is that what CPU-i is using (haven't checked)?

Nope, mp_rendezvous_no_intrs().

Shooting in the dark here... I mean, we are reading the same MSR as in Windows. Have you seen any case in your log where one core multiplier was different from the others?

Now, let's apply some logic. For the sake of argument, let's say AICPUPM is doing nothing and this is in fact hardware-controlled:

OUTPUT_1:

    May 29 22:01:52 slave kernel[0]: MSRDumper CoreMulti(26)
    May 29 22:01:57 slave kernel[0]: MSRDumper CoreMulti(36)
    May 29 22:02:02 slave kernel[0]: MSRDumper CoreMulti(38)
    May 29 22:02:07 slave kernel[0]: MSRDumper CoreMulti(22)
    May 29 22:02:12 slave kernel[0]: MSRDumper CoreMulti(26)
    May 29 22:02:17 slave kernel[0]: MSRDumper CoreMulti(33)
    May 29 22:02:22 slave kernel[0]: MSRDumper CoreMulti(36)
    May 29 22:02:27 slave kernel[0]: MSRDumper CoreMulti(26)
    May 29 22:02:32 slave kernel[0]: MSRDumper CoreMulti(16)
    May 29 22:02:37 slave kernel[0]: MSRDumper CoreMulti(22)
    May 29 22:02:42 slave kernel[0]: MSRDumper CoreMulti(16)
    May 29 22:02:47 slave kernel[0]: MSRDumper CoreMulti(26)
    May 29 22:02:52 slave kernel[0]: MSRDumper CoreMulti(16)
    May 29 22:02:57 slave kernel[0]: MSRDumper CoreMulti(22)

OK, so let's load up NullCPUPM and see what we get:

OUTPUT_2:

    May 29 22:13:53 slave kernel[0]: MSRDumper CoreMulti(16)
    May 29 22:14:18 slave kernel[0]: MSRDumper CoreMulti(33)
    May 29 22:14:33 slave kernel[0]: MSRDumper CoreMulti(16)
    May 29 22:15:03 slave kernel[0]: MSRDumper CoreMulti(16)
    May 29 22:15:08 slave kernel[0]: MSRDumper CoreMulti(33)
    May 29 22:15:13 slave kernel[0]: MSRDumper CoreMulti(16)
    May 29 22:15:23 slave kernel[0]: MSRDumper CoreMulti(33)
    May 29 22:15:33 slave kernel[0]: MSRDumper CoreMulti(16)
    May 29 22:15:38 slave kernel[0]: MSRDumper CoreMulti(33)
    May 29 22:15:43 slave kernel[0]: MSRDumper CoreMulti(16)
    May 29 22:16:13 slave kernel[0]: MSRDumper CoreMulti(16)

With our starting assumption we acknowledged that there are two scenarios: we first presumed that we in fact have hardware coordination, while we are striving for software coordination. We must also acknowledge that there is no in-between; it is either hardware or software. So either AICPUPM is doing the coordination or the BIOS is.

The SSDT_PR used in this experiment is stripped down to APSS, ACST and APSN. EIST, Turbo and C-States are enabled in the BIOS.

We established in earlier tests that the first output (with no NullCPUPM loaded) can be influenced by changing APSS. We see different Turbo states being used, depending on the APSS object.

We assumed that AICPUPM is doing nothing, and OUTPUT_1 is hardware-controlled. If that is in fact true, we can load NullCPUPM and have the same stepping. We can see in OUTPUT_2 that this is not the case.

Therefore we must conclude that our assumption was incorrect and AICPUPM is in fact controlling the state transitions.
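The comparison between the two outputs can also be done mechanically. The following is only a sketch: it uses a handful of lines copied from OUTPUT_1 and OUTPUT_2 above (not the full logs), and the regular expression simply matches the `MSRDumper CoreMulti(N)` format seen there.

```python
import re

# Matches the multiplier in lines such as "... MSRDumper CoreMulti(26)".
MULTI_RE = re.compile(r"MSRDumper CoreMulti\((\d+)\)")


def multipliers(log_text):
    """Return the multipliers found in a kernel.log excerpt, in order."""
    return [int(value) for value in MULTI_RE.findall(log_text)]


# Excerpt of OUTPUT_1 (AICPUPM loaded).
OUTPUT_1 = """\
May 29 22:01:52 slave kernel[0]: MSRDumper CoreMulti(26)
May 29 22:01:57 slave kernel[0]: MSRDumper CoreMulti(36)
May 29 22:02:02 slave kernel[0]: MSRDumper CoreMulti(38)
May 29 22:02:07 slave kernel[0]: MSRDumper CoreMulti(22)
May 29 22:02:17 slave kernel[0]: MSRDumper CoreMulti(33)
May 29 22:02:32 slave kernel[0]: MSRDumper CoreMulti(16)
"""

# Excerpt of OUTPUT_2 (NullCPUPM loaded).
OUTPUT_2 = """\
May 29 22:13:53 slave kernel[0]: MSRDumper CoreMulti(16)
May 29 22:14:18 slave kernel[0]: MSRDumper CoreMulti(33)
May 29 22:14:33 slave kernel[0]: MSRDumper CoreMulti(16)
"""

with_aicpupm = set(multipliers(OUTPUT_1))
with_nullcpupm = set(multipliers(OUTPUT_2))

print(sorted(with_aicpupm))
print(sorted(with_nullcpupm))
print(sorted(with_aicpupm - with_nullcpupm))  # states seen only with AICPUPM
```

A non-empty difference between the two sets is the observation the argument rests on; it does not by itself say which component causes the extra states.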
Time2Retire (Author), posted May 29, 2011:

> Using this inside ACST gives me the "Stepper CPU...

I will verify my findings after I have solved a problem with my current DSDT (a constant kernel_task load of ~60%). Update: Solved by putting back the original ARnn declarations.

> I mean, we are reading the same MSR as in windows. Have you seen any case in your log where one core multiplier was different to the others?

No. Never.

> Now, let's apply some logic. For the sake of argument, let's say AICPUPM is doing nothing and this is in fact hardware-controlled:... OK, so let's load up NullCPUPM and see what we get:... With our starting assumption we acknowledged that we have two scenarios, we firstly presumed that we have in fact a hardware-coordination and we strive for software-coordination. We must also acknowledge, that there is no in-between, either it is hardware or software. So either AICPUPM is doing the coordination or the BIOS. The SSDT_PR used in this experiment is stripped down to APSS, ACST and APSN. EIST, Turbo and C-States are enabled in the BIOS. We established in earlier tests that the first output (with no NullCPUPM loaded) can be influenced by changing APSS. We see different Turbo states being used, depending on the APSS object. We assumed that AICPUPM is doing nothing, and OUTPUT_1 is hardware-controlled. If that is in fact true we can load NullCPUPM and have the same stepping. We can see in OUTPUT_2 that this is not the case. Therefore we must conclude that our assumption was incorrect and AICPUPM is in fact controlling the state transitions.

I have to disagree. The problem I have with this is that you assume that neither AICPUPM nor the NullCPUPM kext is interfering with and/or overruling something in the BIOS, or vice versa. We just don't know. Not to mention that your system is doing something differently and I still fail to reproduce what you get.

I'd say that it is time for someone else here to step in and try to reproduce things, because I will soon be gone!
mrmojorisin17, posted May 29, 2011:

> I'd say that it is time for someone else here to step in and try to reproduce things because I will soon be gone!

I'm here. You and flAked just have to tell me (us) what I (we) have to do. I'll be happy to help you with your tests.
Time2Retire (Author), posted May 29, 2011:

OMG I think that your testing went bad. Please check for bit 5 again, and now check your kernel.log for these:

    ACPI_SMC_PlatformPlugin::pushCPU_CSTData - ACST did not return a package
    ACPI_SMC_PlatformPlugin::registerLPCDriver - WARNING - LPC device initialization failed: C-state power management not initialized

Checking for "CPU Stepper" is not right. I have that with the above errors!!!

LOL Now I see it step. Bummer...
flAked, posted May 29, 2011:

... (deleted, I felt it wasn't very constructive)

Well, all this writeup and I still feel I haven't said what I really wanted. I think my biggest concern is that we haven't condensed the tests and findings into a list of what we know and what is uncertain. And I feel we are not on the same page right now: I don't know what you are testing and where my tests fail on your system.
Time2Retire (Author), posted May 30, 2011:

We all want something that works, and we did find out a lot of new things. However, let's try to focus, people, and look forward. So here's an idea: what if MSRDumper.kext isn't really reading the MSR on a per-core basis, but reads the one from the master core only? What can we do to show us that what we do is either right or wrong?

I, for example, tried reading other MSRs and they all came back with the same values. That can't be right. I mean, the load factor is different, but not the 0x198 / 0x199 combo, so what are we missing (on the assumption that we did miss something)?

So if there's some dev hero watching over our shoulders, then please, please chime in and let us get this thing going. We clearly are at our end, so we need help. Big time!

I now see the following multipliers: 16, 21, 26 and 35. And that with the aforementioned errors.

B I N G O

I have now seen all my P-States coming by. That's the first time for me. Hard to believe but... exciting news. APSN apparently is not the number of Turbo States, but the number of P-States. What about that? Can that be it? Please verify this!