rvxtm Posted March 21, 2013
Attachment: netstat.txt
Here is a quick look at my netstat with the new driver. I'm back on the lnx2mac one now. I tried the debug version and the upload speed is still very slow, while the download is normal. I took a look in my system.log and found nothing regarding Realtek, only that a link has been established at 100 Mbps and that I have EEE support on the controller. No errors, nothing.
Mieze (Author) Posted March 21, 2013
Hello RVXTM, the netstat output shows evidence of significant packet loss, as the number of retransmissions compared to the total number of packets is extremely high:
tcp:
    80345 packets sent
        3764 data packets (6492888 bytes)
        2589 data packets (1618611 bytes) retransmitted
This would also explain the low upload speed. The log messages are OK. Unfortunately you are the first user with an RTL8111C (RTL8168C/8111C, Chipset 5) to test the driver, so I have no experience with that chip, but the issue reminds me of a report from a user with an MSI Z77MA-G45. Maybe you should take a look at this: http://www.tonymacx8...html#post556468
I would suggest disabling checksum offload as he did to see if it helps. If it doesn't, check the network statistics of the machine on the other side and look out for bad packets in particular. Using Wireshark to create a packet dump might also be helpful.
By the way, are you using the driver on the GA EP-45-EXTREME as your signature suggests? Are both ports connected to the switch? What is their configuration (DHCP, static IP or ...)?
Mieze
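To put a number on "extremely high", the two counters quoted from netstat can be turned into a ratio. The snippet below is only a sketch: the function name `retransmitRatio` is made up for illustration, and it simply divides the retransmitted count by the data-packet count without making any claim about how netstat itself relates the two counters.

```cpp
// Hypothetical helper (not part of the driver or of netstat):
// retransmitted data packets per data packet sent.
// Returns 0.0 when no data packets were sent, to avoid dividing by zero.
constexpr double retransmitRatio(unsigned long dataPackets,
                                 unsigned long retransmitted) {
    return dataPackets == 0
               ? 0.0
               : static_cast<double>(retransmitted) /
                     static_cast<double>(dataPackets);
}
```

Called as `retransmitRatio(3764, 2589)` with the packet counts from the post, it yields a value between 0.68 and 0.69, i.e. roughly two retransmissions for every three data packets. A healthy wired link would normally be expected to stay far below that.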
rvxtm Posted March 21, 2013
I am using that motherboard, with only port 0 connected to a router that assigns IP addresses via DHCP.
Mieze (Author) Posted March 21, 2013
OK, the interesting question is where all the lost packets went and why they got lost. Did you follow the installation instructions?
Mieze
dmazar Posted March 22, 2013
I have an issue: when I boot into Windows and then restart into OSX (just a restart, warm reboot, no shutdown), the network does not work any more. It requires a shutdown and a fresh start to get it working again.
Windows -> restart into OSX - not working
Windows -> restart into Ubuntu -> restart into OSX - working
OSX (when net is working) -> restart into OSX - still working
Logs: DebugLogs.zip
When not working:
ip:
    543 total packets received
    327 bad header checksums
Mieze (Author) Posted March 22, 2013
Hello dmazar, thanks for your feedback. The extremely high number of packets with bad header checksums probably indicates that checksum offload isn't working properly after you have used Windows. As all drivers (Windows, Linux and OS X) load firmware into the NIC, I assume that the Windows driver leaves the NIC with settings that are incompatible with the OS X driver and that don't get replaced completely. Unfortunately the firmware has been provided by Realtek without any documentation, so there is little I can do to resolve the issue. I guess the problem doesn't show up when you shut down the system after using Windows and do a cold boot into OS X?
Mieze
dmazar Posted March 22, 2013
Yes, a cold boot solves the problem. Or ... booting into Linux and then doing a warm reboot into OSX also does the trick. It looks like the driver in my Ubuntu is able to "fix" things after Windows.
Mieze (Author) Posted March 22, 2013
At least we have a workaround that is really easy to apply. I will add this information to the troubleshooting section.
Mieze
dmazar Posted March 24, 2013
Did some more tests ... If I am in Windows and then shut down, wait a few seconds, turn the computer on and boot into OSX, it does not work. I have to shut it down from OSX again and then start into OSX, and then it works.
Tested sequences from Windows:
1. Windows -> soft restart into Ubuntu -> soft restart into OSX, net works
2. Windows -> soft restart into OSX (net does not work) -> soft restart into OSX, net does not work
3. Windows -> soft restart into OSX (net does not work) -> shutdown and start into OSX, net works
4. Windows -> shutdown and start into OSX (net does not work) -> shutdown and start into OSX, net works
From sequences 3 and 4: a simple shutdown from Windows is not enough. Your driver needs to be started on my controller twice. At the first boot (warm or cold) it does something, but not enough. A second cold restart (shutdown/start) is needed, and then it works fine. Is there anything that can be learned from this and done about it?
Plus, it looks to me like the same thing happens when removing some other driver from OSX and installing this one: it required several restarts/shutdowns to get it working after the install.
dmazar Posted March 24, 2013
Tested Slice's version: http://www.insanelymac.com/forum/topic/286937-realtekr1000-v3/page__st__20#entry1900418
Same issue, but slightly worse regarding Windows.
polkaholga Posted March 24, 2013
Quote: "Finally I can use WOL !! My system have RTL8111E (GA X58A-UD3R motherboard) chip, it works perfectly now."
That sounds really promising since I have the same chip onboard!
@Mieze Thank you for this driver, I can't wait to test it when I'm back at my hack next weekend...
I got a bunch of warnings about "unused variable flags" when I compiled for 10.7 with Xcode 4.5.2. Should I use 4.4.1, or is there nothing to worry about?
Mieze (Author) Posted March 24, 2013
Quote: "I got a bunch of warnings about "unused variable flags" when I compiled for 10.7 with Xcode 4.5.2. Should I use 4.4.1, or is there nothing to worry about?"
No need to worry. I will comment out those lines in the next release to get rid of the warnings.
Mieze
Mieze (Author) Posted March 25, 2013
Quote: "Did some more tests ... If I am in Windows and then shut down, wait a few seconds, turn the computer on and boot into OSX, it does not work. I have to shut it down from OSX again and then start into OSX, and then it works. Tested sequences from Windows: 1. Windows -> soft restart into Ubuntu -> soft restart into OSX, net works 2. Windows -> soft restart into OSX (net does not work) -> soft restart into OSX, net does not work 3. Windows -> soft restart into OSX (net does not work) -> shutdown and start into OSX, net works 4. Windows -> shutdown and start into OSX (net does not work) -> shutdown and start into OSX, net works From sequences 3 and 4: a simple shutdown from Windows is not enough. Your driver needs to be started on my controller twice. At the first boot (warm or cold) it does something, but not enough. A second cold restart (shutdown/start) is needed, and then it works fine. Is there anything that can be learned from this and done about it?"
As the chip supports WoL, it uses standby power, so it won't be off completely until you pull the plug from the wall or flick the PSU's switch. Maybe it's a firmware-related problem?
Quote: "Plus, it looks to me like the same thing happens when removing some other driver from OSX and installing this one: it required several restarts/shutdowns to get it working after the install."
I suspected that the lnx2mac driver also causes problems when you switch over to my driver. This might be a firmware issue, but it could just as well be that the driver left something in the system preferences that causes the strange behavior. As my driver is the only Realtek driver for OS X that makes use of the chip's advanced features (checksum offload and TCP segmentation offload) in order to improve performance, its interaction with the network stack is far more complex.
Here is another funny thing I discovered during my tests.
I removed the lnx2mac driver from my test system and installed my driver. Although it was working, I noticed that https connections to Apple websites, e.g. iCloud, App Store, iTunes Store and developer.apple.com, stopped working. The strange thing was that replies to connection requests from those servers were considered by the NIC to have bad IP header checksums, but everything else, including https connections to other servers, was working flawlessly. Ironically the problem disappeared after I wiped the disk and reinstalled OS X and my driver. This time everything was working fine.
After all this I came to the conclusion that rx checksum offload is responsible for many of the known problems. In the next release I will address this issue and change the way received packets are handled when checksum verification in hardware has failed. Instead of marking these packets as bad, they will be considered unchecked, letting the network stack repeat the verification in software. So far this strategy seems to work without any impact on speed and might even be a practical solution for boards with broken NICs like the MSI Z77MA-G45.
Mieze
Quote: "Tested Slice's version: http://www.insanelym...20#entry1900418 Same issue, but slightly worse regarding Windows."
That's funny! I've been in a personal conversation with Slice over the last few days. He was trying to convince me that my driver has a power management issue causing this kind of trouble and that he had already found a solution for his driver. Obviously the issue isn't related to power management at all, but as we both started with Realtek's linux driver 8.035.0 it's no wonder that both drivers are affected. Anyway, thanks for the information. At least we know now that PM is not the place to look for a solution.
Mieze
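Letting the network stack repeat the verification in software comes down to recomputing the IPv4 header checksum, the ones'-complement sum described in RFC 1071. The following is a minimal sketch of that check, assuming a plain byte buffer holding the header. It is not code from the driver or from the OS X network stack, and the name `ipv4HeaderChecksumOk` is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative only: verifies an IPv4 header checksum in software.
// `header` points at the first byte of the IPv4 header and `len` is the
// header length in bytes (20 to 60, always a multiple of 4).
bool ipv4HeaderChecksumOk(const std::uint8_t* header, std::size_t len) {
    assert(len >= 20 && len % 4 == 0);

    // Add up all 16-bit big-endian words, checksum field included.
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < len; i += 2) {
        sum += (static_cast<std::uint32_t>(header[i]) << 8) | header[i + 1];
    }

    // Fold the carries back into the low 16 bits (ones'-complement addition).
    while (sum >> 16) {
        sum = (sum & 0xffffu) + (sum >> 16);
    }

    // An intact header sums to 0xffff.
    return sum == 0xffffu;
}
```

The check costs a handful of additions per packet, which is consistent with the observation in the post that falling back to software verification did not have a noticeable speed impact.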
undeadlegion1 Posted March 25, 2013
Glad you were able to figure out this bug! Let me know if you need any help testing.
dmazar Posted March 25, 2013
Quote: "As the chip supports WoL, it uses standby power, so it won't be off completely until you pull the plug from the wall or flick the PSU's switch. Maybe it's a firmware-related problem?"
Tested this: booted into Windows, then shut down, then unplugged the computer from power for 10-15 secs, then started OSX. Same as before: the net is connected, but not usable. It required another shutdown and start into OSX to get it working.
I'm willing to try to compare the initialization with the r8169 Linux driver (https://github.com/t...realtek/r8169.c). What should I start with, or what should I compare? Should I carefully go through the whole init process or go straight to rtl8168_hw_phy_config()? r8169 identifies my card as RTL_GIGA_MAC_VER_33, uses rtl8168e_1_hw_phy_config() and uses FIRMWARE_8168E_2 (rtl_nic/rtl8168e-2.fw, I have it from Ubuntu). What is the chance of bricking my controller by experimenting with this?
EDIT: Just an update: shutting down after Windows, unplugging the power and the ethernet cable and waiting for 30 secs did the trick. The next boot into OSX resulted in a working net.
Mieze (Author) Posted March 25, 2013
Hello dmazar, you'll have a hard time trying to track down the error, in particular because Realtek's 8.035.00 driver doesn't separate the firmware from the code at all. Everything is packed into that giant function rtl8168_hw_phy_config. The chance of bricking the NIC can't be ruled out, and we don't know what the firmware does.
But I have a better idea. You could get a copy of Realtek's current Linux driver (version 8.035.00) and test it under Linux: http://218.210.127.1...3&GetDown=false
If it shows the same issue with regard to Windows as under OS X, then you could contact their technical support and hopefully they will fix it for us.
Can you provide me with two debug logs? One when the network is working fine, and one after you have rebooted from Windows and the network is dead. In the latter case please also open Network Utility and watch the number of packets transferred, as they are updated by a hardware statistics dump of the NIC. This allows us to take a look at the NIC's internal state.
Mieze
dmazar Posted March 25, 2013
Quote: "you'll have a hard time trying to track down the error, in particular because Realtek's 8.035.00 driver doesn't separate the firmware from the code at all. Everything is packed into that giant function rtl8168_hw_phy_config. The chance of bricking the NIC can't be ruled out, and we don't know what the firmware does."
Well, that was kind of naive of me. As if I could jump in, change a few registers and get it working.
Quote: "But I have a better idea. You could get a copy of Realtek's current Linux driver (version 8.035.00) and test it under Linux: http://218.210.127.1...3&GetDown=false If it shows the same issue with regard to Windows as under OS X, then you could contact their technical support and hopefully they will fix it for us."
Got it:
[ 1.154796] r8168 Gigabit Ethernet driver 8.035.00-NAPI loaded
[ 1.154894] r8168 0000:08:00.0: irq 53 for MSI/MSI-X
[ 1.298849] r8168: This product is covered by one or more of the following patents: US5,307,459, US5,434,872, US5,732,094, US6,570,884, US6,115,776, and US6,327,625.
[ 1.298852] r8168 Copyright © 2012 Realtek NIC software team <nicfae@realtek.com>
[ 17.289937] r8168: eth0: link down
[ 18.858229] r8168: eth0: link up
[ 19.284308] r8168: eth0: link up
The network still works fine in Ubuntu after a restart from Windows. So the firmware theory is not valid any more, right? This thing is the same in your driver and the Linux driver, right?
What has changed now is that Windows -> restart into Linux -> restart into OSX results in a non-working net in OSX, while a restart from Linux previously fixed it for OSX as well. A shutdown and new start fixes it.
About the additional logs: do you need something different from the previous logs?
Mieze (Author) Posted March 25, 2013
Quote: "The network still works fine in Ubuntu after a restart from Windows. So the firmware theory is not valid any more, right? This thing is the same in your driver and the Linux driver, right?"
Correct! Maybe I should check the PCI config space setup, as this had to be rewritten from scratch because the Linux code was not portable, and as far as I know it could be preserved across a reboot or while standby power is still present.
Quote: "About the additional logs: do you need something different from the previous logs?"
The log messages of the driver (debug build) when the network is not working would be really helpful.
Mieze
dmazar Posted March 25, 2013
Is the one from here not OK? http://www.insanelymac.com/forum/topic/287161-new-driver-for-realtek-rtl8111/page__st__20#entry1899870
Network Utility/Info: there is not really a difference here between the working and the "non-working" net. Send errors, receive errors and collisions are 0. It's just that the sent and received packet counts are much smaller with the "non-working" net, but they still rise over time. By the way, ping, lookup and traceroute are working fine.
Safari: it does not report any connection errors, it just waits to receive data that never arrives.
Mieze (Author) Posted March 25, 2013
According to the logs the NIC is working, but rx checksum offload seems to be unreliable after a reboot from Windows. This brings the firmware theory back into the game, because my driver and the Linux driver use different strategies. When checksum verification in hardware has failed, the Linux driver treats the packet as unchecked and lets the network stack perform the check, while my driver considers it to be a bad packet. Please try the attached version, in which I adopted the strategy of the Linux driver.
Good luck!
Mieze
PS: Do you need a binary or can you compile from source?
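The difference between the two strategies can be written down as a small decision function. This is only a sketch of the idea as described in the post: the enum and function names are hypothetical and do not come from the RTL8111 driver, the Linux driver, or any OS X kernel API.

```cpp
// What the driver tells the network stack about a received packet.
enum class RxChecksumResult {
    ReportValid,     // hardware verified the checksum, stack can skip it
    ReportBad,       // packet is flagged as having a bad checksum
    LeaveUnchecked   // no verdict, stack verifies the checksum in software
};

// How a hardware verification failure is interpreted.
enum class HwFailurePolicy {
    TreatAsBad,        // behavior described for the original driver
    TreatAsUnchecked   // behavior described for the Linux driver
};

// Hypothetical helper: maps the hardware verdict and the chosen policy
// to the result reported to the stack.
constexpr RxChecksumResult classifyRxPacket(bool hwChecksumOk,
                                            HwFailurePolicy policy) {
    return hwChecksumOk
               ? RxChecksumResult::ReportValid
               : (policy == HwFailurePolicy::TreatAsUnchecked
                      ? RxChecksumResult::LeaveUnchecked
                      : RxChecksumResult::ReportBad);
}
```

With `TreatAsUnchecked`, a packet that the hardware wrongly rejects still gets a second chance in software, while a genuinely corrupted packet is rejected by the stack anyway, so correctness does not depend on the offload engine being reliable.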
dmazar Posted March 26, 2013
Source is fine. Thanks!
Tested: it still does not work after Windows.
Logs: NetDebug.zip
The error moved from bad checksums to a data size error:
ip:
    356 total packets received
    0 bad header checksums
    0 with size smaller than minimum
    159 with data size < data length
I hope this will trigger some more ideas.
Mieze (Author) Posted March 26, 2013
Hello dmazar, according to the documentation the NIC transfers the packet including the ethernet CRC into memory, but as the CRC isn't needed by the protocol stack, the driver removes the last 4 bytes of a received packet. Maybe your NIC (Chipset 14) is different? Locate the following line in the source code (it's in RTL8111::rxInterrupt()):
    pktSize = (descStatus1 & 0x1fff) - 4;
Change it into:
    pktSize = (descStatus1 & 0x1fff);
Good luck!
Mieze
Edit: In case this doesn't help, you might also try to increase the packet size a little bit. The buffers are all 2000 bytes in size, so there is enough headroom.
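The two variants of that line differ only in whether the 4-byte ethernet CRC is subtracted from the length field. A small sketch makes the arithmetic explicit. The mask `0x1fff` and the 4-byte CRC come from the post; the function name `rxPacketSize` and the guard against lengths shorter than the CRC are assumptions added for illustration and are not taken from the driver source.

```cpp
#include <cstdint>

// Low 13 bits of descStatus1 carry the received length (mask from the post).
constexpr std::uint32_t kRxLengthMask = 0x1fff;
// Length of the ethernet CRC that the NIC appends to the packet data.
constexpr std::uint32_t kEthernetCrcLen = 4;

// Hypothetical helper: extracts the packet size from descStatus1.
// With stripCrc == true the trailing CRC is removed, as in the original line;
// with stripCrc == false the raw length is returned, as in the suggested change.
// Lengths shorter than the CRC are returned unchanged (guard added here).
constexpr std::uint32_t rxPacketSize(std::uint32_t descStatus1, bool stripCrc) {
    return (stripCrc && (descStatus1 & kRxLengthMask) >= kEthernetCrcLen)
               ? (descStatus1 & kRxLengthMask) - kEthernetCrcLen
               : (descStatus1 & kRxLengthMask);
}
```

For example, a made-up status word of `0x5f2` carries a raw length of 1522 bytes, so `rxPacketSize(0x5f2, true)` gives 1518 and `rxPacketSize(0x5f2, false)` gives 1522.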
dmazar Posted March 26, 2013
Tried it, but it does not help. I've dumped descStatus1 and descStatus2 from the working and the non-working net and from Linux (after Windows). Maybe it will help.
Attachment: NetDebug2.zip
Mainly, the non-working system contains packets with descStatus1 bits 20-23 set to 6, while the working system and Linux do not have that.
Not working:
rxInterrupt(): descStatus1=0x3462c500, descStatus2=0x40000000, pktSize=1536
rxInterrupt(): descStatus1=0x3462c500, descStatus2=0x40000000, pktSize=1536
(the packet sizes are invalid because I increased them by 256)
Mieze (Author) Posted March 26, 2013
Quote: "Mainly, the non-working system contains packets with descStatus1 bits 20-23 set to 6, while the working system and Linux do not have that. Not working: rxInterrupt(): descStatus1=0x3462c500, descStatus2=0x40000000, pktSize=1536 rxInterrupt(): descStatus1=0x3462c500, descStatus2=0x40000000, pktSize=1536 (the packet sizes are invalid because I increased them by 256)"
The datasheet says:
Bit 22: Receive Watchdog Timer Expired: This bit is set whenever the received packet length exceeds 8192 bytes.
Bit 21: Receive Error summary: When set, indicates that at least one of the following errors has occurred: CRC, RUNT, RWT, FAE. This bit is valid only when LS (Last segment bit) is set.
OK, we now know that these packets really are bad because of a reception error, but I have no idea how to avoid this.
Mieze
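The value 6 in bits 20-23 corresponds exactly to the two datasheet bits quoted above, which a few lines of bit arithmetic can confirm. The bit positions 21 and 22 are taken from the post; the constant and function names below are made up for illustration, and the sketch deliberately does not model the LS bit that the datasheet says must be set for bit 21 to be valid.

```cpp
#include <cstdint>

// Bit positions as quoted from the datasheet in the post.
constexpr std::uint32_t kRxWatchdogExpired = 1u << 22;  // RWT
constexpr std::uint32_t kRxErrorSummary    = 1u << 21;  // RES

// Hypothetical helpers for inspecting a dumped descStatus1 value.
constexpr bool rxWatchdogExpired(std::uint32_t descStatus1) {
    return (descStatus1 & kRxWatchdogExpired) != 0;
}

constexpr bool rxErrorSummarySet(std::uint32_t descStatus1) {
    return (descStatus1 & kRxErrorSummary) != 0;
}

// Extracts bits 20-23 as a number, the way they were read off the dump.
constexpr std::uint32_t rxStatusBits20To23(std::uint32_t descStatus1) {
    return (descStatus1 >> 20) & 0xfu;
}
```

For the dumped value `0x3462c500`, `rxStatusBits20To23` returns 6 (binary 0110), and both `rxWatchdogExpired` and `rxErrorSummarySet` return true, which matches the reading that these packets were flagged with a reception error.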
Mieze (Author) Posted March 27, 2013
Hello dmazar, two more questions to narrow down the issue:
1) Which Windows driver do you use? The driver from the board's manufacturer, the one from Realtek, or the one included in Windows?
2) When you use Ubuntu's native driver without the firmware, is it still able to cure the problem caused by the Windows driver?
Mieze