
Intel (Errors) Inside


Swad

25 posts in this topic

Recommended Posts

Welcome to the world of processor politics, or "How I learned to stop worrying and love the errors."

 

It appears that Intel's new Core Duo - which was released less than a month ago - has errors out the wazoo. Its errata documentation reveals 34 known issues with the processor. That averages out to almost an error and a half discovered per day. Now, it's nothing new for processors to have errors; the Pentium 4 has 65 known errors. But those 65 have accumulated over the life of the chip. Should we be concerned? Some think so.

 

It makes one wonder if the Core Duo was rushed out of the gates at Intel. Should they be shipping such an error-prone chip? Worse yet, should Apple be using it?


I'm definitely no guru in the way processors work, but the fact that most of these issues seem to be unfixable through software, plus Intel's decision not to do anything about them at this point, should be some cause for alarm...

 

I particularly like the idea of loading the wrong section of memory when returning from hibernation... :poster_oops:


Doesn't matter. OSx86 gets tested and optimized extensively, over time, on consumer hardware, and since the Core Duo is part of that hardware, everything should be fine.

 

What I'm saying is, all the chips in the new Macs are the same. An OSx86 developer doesn't have his own uber-perfect chip that hides the problems. Essentially, if it's working now, it has no reason to suddenly stop working when it's in your house.

 

Besides, most of the time you have to TRY to get a processor show-stopper.


Hahaha, this is so funny. Too bad for Apple... they should have gone with AMD. That'll teach them a lesson. AMD all the way; that's why I haven't purchased one yet.. lol. They'll eventually figure out AMD was the better choice, but meh.. lol

 

AMD all the way :angry:


Bah, you're all just jealous 'cause I got a MacBook on order. :angry: Just kidding. Seriously though, AMD processors have errors too. The question is about the number of errors... Steve ran the keynote on one without issues. Unix, and the Mach kernel in particular, are pretty darn fault tolerant. I'm not too worried. You know why they called it Pentium instead of 586, don't you? They added 100 to 486 and got 585.9998134.


The point isn't that Apple is the pinnacle of life and existence.

 

It's that the normal Joe user isn't going to notice anything. Even the professionals won't. The only people I can see this affecting are the heavy math and science buffs, along with heavy-duty programmers.

 

Since I'm neither of those and have yet to hear from a reliable source about an iMac exploding because the Intel chip couldn't comprehend a math expression, I find it absurd that people are all pitchforks and fire over low-level processor commands.

 

I mean really, how many of you know how to create an IFU/BSU deadlock on a processor?

 

I mean, damn. It's working. I didn't even know about errata sheets before this, and my Pentium 4, with its however many errors, has yet to catastrophically destroy my system.

 

It works like this: Internet Explorer is horribly broken. People hate designing for it, because it hates the people designing for it. But websites still work in it. The hard part is the designing; once it works, it works.

 

I'm applying this same form of thinking to the Core Duo. I don't know about you, but I'm probably not going to notice squat.


I found this on Slashdot.

 

"The MPC7410 family of chips (aka G4) from Freescale (formerly part of Motorola) has 21 errata currently listed: MPC7410CE.pdf [freescale.com]

 

The MPC7447 family of chips (aka G4) from Freescale has 36 errata currently listed: MPC7457CE.pdf [freescale.com]

 

The PPC 970FX (aka G5) from IBM has 24 errata currently listed: 970fx_errata_dd3.x_v1.6.pdf [ibm.com]"

 

The Opteron has 64 errata.

"The errata for the AMD Opteron is 85 pages long [amd.com]. I once spoke with a chipset designer and he told me that the Opteron errata was especially long with some convoluted workarounds, compared to other CPUs he's worked with."

 

"At least 136 in the Athlon.

 

Google html of the pdf:

http://64.233.179.104/search?q=cache:HFDm3zBojEcJ:www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25759.pdf+amd+athlon+errata&hl=en&client=firefox-a [64.233.179.104]

 

AMD's original (PDF!)

http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25759.pdf [amd.com]"

 

 

I couldn't find anything about Pentiums (and I didn't spend the time Googling), but I would assume that the numbers are similar. In other words, this really isn't anything abnormal, and this is why chip makers create microcode updates to be rolled into BIOS updates.


Pfft - it's not like Apple chose Intel because they were technically superior - everyone knows they aren't. It was a purely business decision, and I wouldn't worry about it. I'm sure it'll work fine.


This scares me, it truly does. Steve, I'm disappointed in you for the first run of iPods with bad batteries.

 

With such zealotry out of the way, I think that this is a very bad thing for the first run of Intel Macs. I don't blame this directly on Apple; rather, I blame it on the corporate culture at work at Intel. Intel is not a company which understands the long-term viability of the Mac platform, nor the needs of its users. PC users can encounter error after error, while Apple, with its more closed-box philosophy, has software with a long-standing tradition of 'just working.' I think such issues will further manifest themselves as Apple moves to EFI (a modification or elimination of the 20-year-old BIOS technology).

 

We must truly ask ourselves: can Intel actually drag itself out of its own corporate culture to produce products to the Mac standard of quality? As someone who has experience with both companies, I don't believe they can.

 

To put it bluntly, Intel needs to find its way out of its own ass and make a product which lives up to Macintosh standards. So far, I'm not seeing it.

 

Issues like this are what kill great platforms. Apple, of course, has no choice but to work with Intel, or create their own x86-based architecture, something I think they should have done in the first place given the open nature of the x86 standard. It's too late to turn back now.

 

In the computer industry, each company sets its own level of 'negligible error.' Could we now be seeing that the partnership between the two companies isn't a match made in heaven, but a match made in incompatible quality standards?

 

I temper my response, however, with my longstanding belief that we won't know anything about the consumer reaction to, or understanding of, these flaws until Q3 of '06, at least.


Pfft - it's not like Apple chose Intel because they were technically superior - everyone knows they aren't. It was a purely business decision, and I wouldn't worry about it. I'm sure it'll work fine.

 

 

I suspect they chose Intel instead of AMD because of IBM's partnership with AMD... they wanted to separate from IBM, or at least to win Intel's favor in business...


Blah blah blah. We're forgetting every reason Apple ACTUALLY decided to partner with Intel. If you look at the numbers that Intel struts around with, you'd understand. Apple needed a high-quality notebook chip that wouldn't cook an egg or require a built-in nuclear power plant. AMD's chips, (a) as has been pointed out, are possibly the worst chips when it comes to huge numbers of low-level errors, and (b) do indeed serve well as cooking surfaces.

 

The last Centrino release, however, ran cool enough and had very good power consumption, and that says nothing of the Core Duo. The Core Duo consumes only half the power of the last Centrino chipset and has twice the performance (with that ridiculous 4x-data-rate FSB and two cores).

 

If you want to talk processor errata, maybe you should quit harping on how many Intel is finding that will probably never affect users (unless, of course, you happen to use a debugger a lot that freezes operation frames; anyone who's used MacsBug knows that until you slow things down to a forced crawl, processor errata will almost never cause any real problems). Instead, you should read some technical documents explaining how the fact that x86 has so few registers and spills values to the stack causes certain errors that are no problem on PPC (a divide by zero, for example, will not halt the flow of instructions on a PPC) to become fatal and cause crashes. As a result, some code which would have been fine without extra precautions (and therefore extra instructions) on PPC OS X will now have to be scoured for any possible condition which will cause an x86 chip to defecate itself. In the end, though, with some time for Apple to adjust and iron out the bugs, the OS will be no different than on PPC.


I am not that deep into PC architecture, despite having owned various models of the evolving architecture, from when you could only use an 8088 with 640K of RAM to current-day hyper-threaded 3.0+ GHz systems sporting 4GB.

 

I remember the time when the Pentium had a demonstrable calculation fault. Initially, Intel said this arithmetic error would affect such a tiny fraction of operations that virtually no customers would be hit by it. This of course led to demonstration spreadsheets that produced results which were definitely way off, and slogans spread like wildfire: "What's another name for the Intel Inside sticker? A warning label!"

 

Intel may or may not have learned from this experience. The fact that they are publishing known problems with the Core Duo shows me that some of the denial aspect of their previous modus operandi has changed. Whether this carries forward to them actually admitting a product is so faulty they will replace it is a totally different ball of wax.

 

This CPU is already inside at least one Acer notebook - do we have any early warning signals that this system is a turkey?


  • 3 weeks later...
I remember the time when the Pentium had a demonstrable calculation fault. Initially, Intel said this arithmetic error would affect such a tiny fraction of operations that virtually no customers would be hit by it. This of course led to demonstration spreadsheets that produced results which were definitely way off...

 

Intel may or may not have learned from this experience.

 

This CPU is already inside at least one Acer notebook - do we have any early warning signals that this system is a turkey?

 

What Intel learned from that experience is to call them "errata" instead of "errors" and to be open about it. The fact is that even that Pentium bug was fixable in software, but the press was so damaging to their rep, they offered to replace the processors anyway. Tons of people ordered a replacement, but tons did not. The CPU driver was updated and the floating point bug was gone forever. All processors have a number of errata, and it's all fixed in OS software at a level way lower than anybody but compiler writers has to worry about. These processors have hundreds of millions of transistors, they _ALL_ have errata, and we as end users never have to worry about it. GEEK.com should be ashamed of themselves for trying to blow this WAY out of proportion just to drive hits to their website.


Thanks for the info on how errata are usually corrected via a CPU driver. Naturally, until computers can totally design their own CPU chips from start to finish, we'll have human error factors creeping in.

 

I am a bit more curious why nobody is ranting about the TPM chip, when we know for sure Apple is already using it to validate and lock out OS functions (beyond just installation time). We know their OS has a "Don't Steal Mac OS" kernel extension, and we know that code is executed when the system environment THINKS it is "pirated."

 

The OS (imo) is coded with a PRESUMPTION of user guilt. Microsoft Windows will let you use it for 30 days without activation. It seems Mac OS locks you out the instant it gets a no-go from the TPM encryption check.

 

I also do not like to trust the operation of my computer to a function that is under the control of the computer's manufacturer. What would prevent Apple from sending out an update that makes the Mac OS expire after a year of use, then demanding paid upgrades? What means does a consumer have for retrieving their data after the computer has decided they are not authorized?

 

I think this is WAY too much Big Brother to build into ANY computer. If Microsoft were up to this chicanery - or should I say draconian measures? Machiavellian measures? - I think the press would raise a huge stink about it.

 

I think Apple is trying to slip in the hardware equivalent of the DMCA. And just like the DMCA, it is open to some serious abuse.

 

I simply cannot countenance an OS design of this type. I definitely will never buy an Apple computer with a TPM inside.


I've been using both a 17" and a 20" iMac Core Duo for almost 2 weeks now, with no problems at all (except the video glitch that was fixed in 10.4.5), in a working production setup.

 

These machines are quite snappy, and Rosetta works just fine with Creative Suite 2 and Office. Each machine has 1GB of RAM, which seems to be a major factor in Rosetta performance. I ordered them with a single DIMM so I can add more later.

 

If the processors have errors I haven't seen or felt the effects yet. I guess time will tell.


What Intel learned from that experience is to call them "errata" instead of "errors" and to be open about it. The fact is that even that Pentium bug was fixable in software, but the press was so damaging to their rep, they offered to replace the processors anyway. Tons of people ordered a replacement, but tons did not. The CPU driver was updated and the floating point bug was gone forever. All processors have a number of errata, and it's all fixed in OS software at a level way lower than anybody but compiler writers has to worry about. These processors have hundreds of millions of transistors, they _ALL_ have errata, and we as end users never have to worry about it. GEEK.com should be ashamed of themselves for trying to blow this WAY out of proportion just to drive hits to their website.

That's not entirely correct. There WAS a software fix for the FDIV bug, but it was not a good solution: it disabled the floating point unit and turned on software emulation. Shutting off the coprocessor was NOT a fix for it. The P60 chips that had the FDIV flaw did NOT have the software update features the newer chips have, and could not be fixed. When I worked at Intel, everyone from the help desk people to the office staff to the managers understood that the whole issue was a huge mistake. Grove even said so publicly, and Intel internalized that lesson into their corporate culture. The problem with the FDIV bug was that Intel tried to deny that it existed, then said it wasn't a problem, and even when it was clear it was a problem, initially refused to replace the chips. They finally gave in, but not before the damage was done. I've been told that "all of the faulty chips were replaced." While this certainly is not the case, I have never found one with the flaw. I've searched for one, because I think it would be cool to have one, but the only one I've ever seen is the one embedded in a block of plastic hanging off my key chain. (At least I was told that the P60 die in the keychain was one of the returns.)

 

I do agree with you: all processors have bugs. It's just something we have to accept with something as complex as a CPU. The FDIV bug was serious because the floating point unit on the CPU just didn't work right and there was no fix. Most bugs have either software workarounds or microcode updates. Modern motherboards have microcode updates built into their firmware; the updates are applied when the system boots. Windows and Linux (and, I assume, other operating systems) also have a driver that allows them to update microcode after boot. This can be useful if your BIOS does not support the particular stepping of your chip; if you can boot and get far enough to load the update, you're OK. Errors like the aforementioned loading of the wrong memory during a restore from hibernation can be worked around in software. So long as the software fix does not appreciably affect performance, it's no big deal. PPC had errors, Intel has errors, and AMD has errors.

 

To sum it up, the FDIV bug was a HUGE flaw that made the chip almost unusable for any serious work. The bugs most chips have, including the Core Duo, are nothing like that. They are of interest only to programmers who write operating systems and compilers, and will never be seen by most people.


  • 1 month later...
Pfft - it's not like Apple chose Intel because they were technically superior - everyone knows they aren't. It was a purely business decision, and I wouldn't worry about it. I'm sure it'll work fine.

 

What does that mean? As far as I know, the decision to choose Intel was based solely on business. Apple chose Intel because they offered a whole platform: motherboard, chipsets, graphics card, right up to the processor. If they went with AMD, they would have to cut another deal with another party to get things done, and that could cost a lot more than a wholesale buy.

