The Truth About Processor Performance (a.k.a AMD GHz vs. Intel GHz)

Status
Not open for further replies.
THE LONG STORY (short story at bottom)

For the longest time people have had trouble judging how powerful a processor is. In the past it was mostly assumed that the higher a processor's frequency (its number of MHz or GHz), the more powerful it was. That is not the case. Frequency does matter, but it is not the only factor that determines performance.

In the past 5 years we have seen AMD and Intel go two separate routes in the way they have developed their processors. For most of those 5 years Intel has chosen to manufacture high frequency processors while AMD has manufactured lower frequency processors (the highest clocked being 3.8GHz and 2.8GHz respectively). Many people wondered why AMD made “slower” processors. The truth is that they did not. There are a few things that influence how quickly a processor works, and I’ll try to stick to the more important and fundamental ones:

Execution Units: The number of execution units basically determines how many things a processor can do at once:

Execution unit
From Wikipedia, the free encyclopedia

In computer engineering, an execution unit is a part of a CPU that performs the operations and calculations called for by the program. It may have its own internal control sequence unit (not to be confused with the CPUs main control unit), some registers, and other internal units such as a sub-ALU or FPU, or some smaller, more specific components.
It is commonplace for modern CPUs to have multiple parallel execution units, referred to as scalar or superscalar design. The simplest arrangement is to use one, the bus manager, to manage the memory interface, and the others to perform calculations. Additionally, modern CPUs execution units are usually pipelined.
Wikipedia

Pipeline: The pipeline is the series of stages an instruction passes through from start to finish. Its length determines how many clock cycles it takes before an instruction is processed completely and the processor's "decision" or output is finalised.

Frequency: This is the number of “clock cycles” per second. The number of clock cycles per second sets the upper limit on how many instructions the processor can work through each second (given the way its pipeline and execution units work).

Memory Access/Cache: How quickly a processor can access its memory, or the cache built into it, affects how fast it can perform certain tasks. Think of the memory as being a book. If the processor has enough memory and fast enough access to it, it is like having every page of the book torn out and laid side by side, so every page can be seen in an instant. If the processor doesn’t have enough memory and/or can’t access it quickly enough, it’s like having to flick through every page.

Now that is all well and good but it still doesn’t explain how a 1.8GHz Athlon 64 can perform just as well as a 3.0GHz Pentium 4, so let’s take a look at the Pentium 4 Prescott vs. the Athlon 64 Venice.

Prescott:
90nm die process size
3 execution units
31 stage pipeline
3.0GHz
Memory controller on the motherboard’s “Northbridge”


Venice:
90nm die process size
3 execution units
12 stage pipeline
1.8GHz
Memory controller integrated into processor.
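
To put a rough number on the gap between those two clock speeds, here is a back-of-the-envelope sketch in Python. It assumes performance is simply "work per clock cycle" times clock speed, which ignores memory and everything else, and the function name is made up purely for illustration.

```python
def required_per_clock_advantage(fast_clock_ghz: float, slow_clock_ghz: float) -> float:
    """How much more work per clock cycle the lower-clocked chip must do
    to match the higher-clocked one.

    Assumption: performance = work per clock x clock speed, nothing else.
    """
    return fast_clock_ghz / slow_clock_ghz


# Clock speeds taken from the two lists above.
prescott_ghz = 3.0
venice_ghz = 1.8

ratio = required_per_clock_advantage(prescott_ghz, venice_ghz)
print(f"Venice needs about {ratio:.2f}x the work per clock of Prescott to keep up")
```

In other words, under that simple assumption the Venice has to get roughly two thirds more done in every clock cycle for the two chips to come out even.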


Once again though, I’m sure that still doesn’t explain things so...

The number of pipeline stages is probably the largest difference here. As you can see the Prescott and the Venice have a 31 stage and a 12 stage pipeline respectively. What that means is that an instruction takes 31 clock cycles to travel through the Pentium 4 from our example, against 12 for the Athlon. The pipeline does hold many instructions in flight at once, so the processor doesn’t sit idle for 31 cycles per instruction, but each stage does less work per clock cycle, and every time the processor guesses wrong about which way a branch will go, the whole pipeline has to be emptied and refilled. On the Pentium 4 that costs well over twice as many cycles as on the Athlon. As you can see, while the Athlon has a much lower frequency it makes more efficient use of it.

This is where the naming system used by AMD comes in. The names such as 3000+ and 4000+ are rough indicators of performance. They are often called PR ratings and were originally derived by comparing the performance of the Athlon XPs to the original Athlon Thunderbird: the rating is roughly the frequency in MHz of an Athlon Thunderbird that would equal it. It just so happens that the Athlon Thunderbird was about as efficient as the Pentium 4 Prescott, which makes for an easy comparison. Essentially it means that an Athlon 64 3700+ is equivalent to or better than a 3700MHz (3.7GHz) Pentium 4 even though it only runs at 2.2GHz.
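
One concrete place where pipeline length costs clock cycles is a mispredicted branch, because the pipeline has to be flushed and refilled. The Python sketch below is a toy model of just that one effect. The workload figures are placeholders I picked for illustration, not measurements, and it assumes each bad guess costs about one full pipeline's worth of cycles (real penalties are usually somewhat smaller).

```python
def cycles_lost_to_mispredictions(instructions, branch_fraction,
                                  mispredict_rate, pipeline_depth):
    """Rough count of clock cycles wasted refilling the pipeline.

    Assumption: every mispredicted branch costs about one full
    pipeline's worth of cycles.
    """
    mispredictions = instructions * branch_fraction * mispredict_rate
    return mispredictions * pipeline_depth


# Placeholder workload, not a measurement: 1 in 5 instructions is a
# branch and 5% of those branches are guessed wrong.
WORKLOAD = dict(instructions=1000, branch_fraction=0.2, mispredict_rate=0.05)

prescott_lost = cycles_lost_to_mispredictions(pipeline_depth=31, **WORKLOAD)
venice_lost = cycles_lost_to_mispredictions(pipeline_depth=12, **WORKLOAD)

print(f"31 stage pipeline: ~{prescott_lost:.0f} cycles lost per 1000 instructions")
print(f"12 stage pipeline: ~{venice_lost:.0f} cycles lost per 1000 instructions")
```

Whatever placeholder workload is plugged in, the ratio between the two stays at 31/12, so the longer pipeline always pays a bit over two and a half times as much per bad guess. This is only one piece of the puzzle and says nothing about how good each chip's branch predictor is.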

The other factor here is the Athlon 64’s in-built memory controller. In the past, and with Intel’s current processors, the processor’s “FSB” (Front Side Bus) is used to “talk” to the motherboard’s chipset Northbridge, which contains the memory controller. The memory controller then “talks” to the memory, and the reply goes back the same way in reverse to reach the processor. After this game of Chinese whispers quite a lot of time has been wasted just to fetch some information, and only THEN can the processor get to work. Having the memory controller on the processor cuts out the trip through the Northbridge, so memory access is much quicker and the processor can get to work sooner.
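
To get a feel for why that extra trip matters, here is a small Python sketch of average memory access time. Every number in it is a made-up placeholder chosen to show the shape of the calculation, not a benchmark of any real chip, and the function name is my own.

```python
def average_access_ns(cache_hit_ns, miss_rate, memory_latency_ns):
    """Average time per memory access: every access pays the cache
    lookup, and misses pay the full trip to main memory on top."""
    return cache_hit_ns + miss_rate * memory_latency_ns


# All figures below are illustrative placeholders, not benchmarks.
CACHE_HIT_NS = 1.0
MISS_RATE = 0.05             # 5% of accesses go out to main memory
VIA_NORTHBRIDGE_NS = 100.0   # hypothetical latency through FSB + Northbridge
INTEGRATED_NS = 60.0         # hypothetical latency with an on-die controller

slow = average_access_ns(CACHE_HIT_NS, MISS_RATE, VIA_NORTHBRIDGE_NS)
fast = average_access_ns(CACHE_HIT_NS, MISS_RATE, INTEGRATED_NS)

print(f"via Northbridge: {slow:.1f} ns average, integrated: {fast:.1f} ns average")
```

The point of the sketch is that even when only a small share of accesses miss the cache, the latency of main memory dominates the average, so shaving the round trip pays off on every miss.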

There are other good reasons why it is better to have a processor that does more work at a lower frequency. The higher the frequency, the higher the power consumption and the amount of heat that the processor puts out. The other problem is that the higher the frequency, the less time each pipeline stage has to finish its work within a clock cycle, and the greater the likelihood of errors if it can’t. That is how the Pentium 4 ended up with a 31 stage pipeline: splitting the work into more, smaller stages allowed it to run at a higher frequency, but at the cost of efficiency.
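
The link between frequency and power can be sketched with the usual rule of thumb that dynamic power scales with voltage squared times frequency. The Python below assumes that rule, ignores leakage entirely, and compares one and the same chip design at two clock speeds. The 10% voltage bump is a hypothetical figure for illustration.

```python
def relative_dynamic_power(freq_ratio, voltage_ratio=1.0):
    """Dynamic power scales roughly with voltage squared times frequency
    (P ~ C * V^2 * f). Leakage power is ignored in this sketch."""
    return voltage_ratio ** 2 * freq_ratio


# Same design pushed from 1.8GHz to 3.0GHz at unchanged voltage.
same_voltage = relative_dynamic_power(3.0 / 1.8)

# Hypothetical: the higher clock also needs 10% more voltage to stay stable.
more_voltage = relative_dynamic_power(3.0 / 1.8, voltage_ratio=1.10)

print(f"same voltage: {same_voltage:.2f}x the power")
print(f"+10% voltage: {more_voltage:.2f}x the power")
```

Because voltage is squared, even a small bump on top of the frequency increase makes the power (and heat) climb quickly.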

So the difference isn’t performance, it’s just a different way of getting things done. That brings us to the latest generation of processors: AMD’s socket AM2 and Intel’s Core 2.

So what’s different?
AM2: Not a whole heck of a lot. Socket AM2 processors are, for all intents and purposes, exactly the same as the Athlon 64 we have already talked about with just one difference. The memory controller has been modified to use DDR2 memory instead of the DDR the original Athlon 64s used. The DDR2 memory only provides a marginal performance increase.

Core 2: Intel has repented. This is a complete departure from the design of the Pentium 4. The Core 2 processors are based on the Core processors, which in turn are based loosely on the Pentium M processors... which are based on the Pentium 3. Of course there are a lot of differences between the Pentium 3 and the Core 2 but they are relatives. The Core 2 is more efficient than the Athlon 64: its pipeline is just 14 stages (almost the same as the Athlon 64, and less than half that of the Prescott) and an extra execution unit has been added. Aside from this, the Core 2 processors have shown amazing potential for running at high frequencies despite their efficiency. Many have been seen to reach 3.6GHz, which has been almost impossible for the Athlon 64.

So now let us compare the Core 2 Conroe and the Athlon 64 Toledo/Windsor cores.

Core 2 Conroe e6600:
65nm die process size
4 execution units
14 stage pipeline
2.4GHz
Memory controller on motherboard's Northbridge

Windsor 5000+:
90nm die process size
3 execution units
12 stage pipeline
2.6GHz
Memory controller integrated into processor


The Conroe core is Intel's new flagship. The addition of a 4th execution unit, improved L2 cache function and a shortened pipeline have combined to form a monstrously powerful processor. As you can see I am now comparing an Athlon 64 X2 5000+ with a clock speed of 2.6GHz to the e6600 at 2.4GHz. The newfound efficiency of the Conroe means that despite being 200MHz lower in frequency it is far more powerful. In fact at stock speed the e6600 can outperform the FX-60, which is a 2.6GHz Toledo core (think of this as being around the power an X2 5200+ would be if it existed), and challenges the 2.8GHz Windsor core FX-62 (~5600+).

MYTHS

A 3.8GHz P4 is better than a 2.8GHz Athlon 64: False, a 2.8GHz Athlon 64 is roughly equivalent to a 4.8-5GHz P4.

65nm will increase frequencies: False. While shrinking the parts in a processor does reduce the voltage required and consequently the heat output, this does not mean it has a greater potential for overclocking. The smaller parts are more sensitive to heat, so once that is factored in there is no gain other than lower manufacturing cost and lower power consumption.

Reverse Hyper Threading: This is complete fantasy as near as I can tell. Certain websites fabricated and perpetuated this myth. It suggests that the AM2 processors have something called Reverse Hyper Threading (RHT) that enables the separate cores of a dual core processor to operate as one. It was also suggested that RHT would be enabled with a BIOS update on a date that has since been and gone... still waiting :p

DDR2 memory is better than DDR: False. It’s just a different type of memory really. Similar to the difference between the Athlon and P4, DDR obtains its performance from efficiency (lower latency) and DDR2 from high frequency. Here are some comparisons between DDR and DDR2, though keep in mind that the results are mainly due to the Athlon's memory controller: Gaara’s Bandwidth Comparison. Once we have some AM2 processors with DDR2 memory it will be a better comparison. Overall, while the bandwidth provided may be different, it does not make a particularly noticeable difference.
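
The efficiency-versus-frequency trade-off between the two memory types can be put in rough numbers. The Python sketch below compares one channel of DDR-400 with one channel of DDR2-800. The clock speeds and the 64-bit bus width are standard for those modules, but the CAS latency values are just example timings I picked, since real modules vary.

```python
def peak_bandwidth_mb_s(transfers_mt_s, bus_width_bytes=8):
    """Peak bandwidth of one 64-bit (8 byte) memory channel."""
    return transfers_mt_s * bus_width_bytes


def cas_latency_ns(cas_cycles, io_clock_mhz):
    """Time spent waiting out the CAS latency, in nanoseconds."""
    return cas_cycles / io_clock_mhz * 1000


# DDR-400 runs a 200MHz clock and DDR2-800 a 400MHz I/O clock; both move
# data twice per clock. The CAS values are example timings only.
modules = {
    "DDR-400 CL2": dict(transfers_mt_s=400, io_clock_mhz=200, cas_cycles=2),
    "DDR2-800 CL5": dict(transfers_mt_s=800, io_clock_mhz=400, cas_cycles=5),
}

for name, m in modules.items():
    bandwidth = peak_bandwidth_mb_s(m["transfers_mt_s"])
    latency = cas_latency_ns(m["cas_cycles"], m["io_clock_mhz"])
    print(f"{name}: {bandwidth} MB/s peak, {latency:.1f} ns CAS latency")
```

With these example timings DDR2 has twice the peak bandwidth on paper, but it takes slightly longer to start delivering data, which is one reason the real-world difference ends up smaller than the headline numbers suggest.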

AMD will have a quick response to Core 2: I hate to say it but this is false too. Realistically AMD won’t have anything to compete with the Core 2 Conroe core until at least midway through 2007.

Intel’s FSB is higher: This is more a misconception than a myth. Since the Pentium 4, Intel’s advertised FSB has been the “effective FSB” rather than the true FSB clock: the bus moves data four times per clock cycle, so an advertised 1066MHz FSB really runs at about 266MHz. The later Athlon 64s don’t use a traditional FSB at all. Their HyperTransport link runs at 1000MHz and transfers data twice per clock cycle in each direction, which gives 2,000 Million Transfers/Second, or “2000MT/s”.
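
The "effective" figures are just the real clock multiplied by how many transfers happen per clock cycle. Here is that arithmetic as a short Python sketch. The function name is made up, and the Intel line assumes the quad-pumped FSB used from the Pentium 4 onwards, where a "1066MHz" bus is really a clock of about 266.67MHz.

```python
def effective_mt_s(clock_mhz, transfers_per_clock):
    """Millions of transfers per second on a bus or link."""
    return clock_mhz * transfers_per_clock


# Link on the later Athlon 64s: 1000MHz clock, 2 transfers per clock.
athlon64_link = effective_mt_s(1000, 2)

# Intel FSB: data moves 4 times per clock, so a "1066MHz" FSB is
# really a clock of roughly 266.67MHz.
intel_fsb = effective_mt_s(266.67, 4)

print(f"Athlon 64 link: {athlon64_link} MT/s")
print(f"Intel '1066' FSB: {intel_fsb:.0f} MT/s")
```

So the big number on the box tells you transfers per second, not clock speed, and the two companies get to their figures with different multipliers.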

Adding more cache automatically makes the processor faster: Not exactly. Additional cache does help if designed correctly, but just adding extra cache does not necessarily help. A good example of this is the original P4 Extreme Edition (code name "Gallatin"). This was a P4 pretty much merged with a Xeon, and it was given extra cache in the form of an L3 cache. Despite being clocked higher than the Northwood Pentium 4s of the time it was still beaten in many applications, though it took a significant lead in encoding and in some games.
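
A quick way to see why extra cache only helps some workloads is to simulate one. The Python sketch below runs a made-up access pattern (a loop over 64 blocks of data, repeated 100 times) through a simple least-recently-used cache of various sizes. Real CPU caches are set-associative and far more complicated, so treat this as a toy model only.

```python
from collections import OrderedDict


def lru_hit_rate(trace, cache_size):
    """Fraction of accesses that hit in a fully associative LRU cache
    holding at most `cache_size` blocks."""
    cache = OrderedDict()
    hits = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict least recently used
    return hits / len(trace)


# Hypothetical workload: loop over the same 64 blocks, 100 times.
WORKING_SET = 64
trace = list(range(WORKING_SET)) * 100

for size in (32, 64, 128, 256):
    print(f"cache of {size:3d} blocks: hit rate {lru_hit_rate(trace, size):.2f}")
```

Once the cache is big enough to hold everything the program keeps touching, making it bigger changes nothing. Workloads like encoding, which chew through more data, are the ones where a larger cache has room to help.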



THE SHORT STORY
The frequency of a processor on its own tells you very little. What determines its performance is how efficient it is with its clock cycles, multiplied by how many of them it gets. The Athlon 64 is more efficient than the Pentium 4 so it can outperform it at a lower clock speed, and the Core 2 Duo is more efficient than the Athlon 64 and can outperform it at a lower clock speed.

Core 2>Athlon 64=Core>Pentium 4

Any corrections, additions or requests are welcome as I'm certain I've probably messed something up :p. This shall remain a work in progress.

Happy Reading,
Nitestick

© Nitestick 2006
______________________________________________
Download the *.PDF Guide Mirror #1
 
Very good info :D

One point I will comment on:

DDR2 memory is better than DDR: False. It’s just a different type of memory really. Similar to the difference between the Athlon and P4, DDR obtains its performance from efficiency and DDR2 from high frequency. In fact it’s arguable that DDR is superior, see this thread: Gaara’s Bandwidth Comparison

I am not going to argue that DDR2 is "better," but I think the argument here is somewhat misleading.

Within the thread you linked to, all comparisons are DDR on A64 vs DDR2 on Intel. In that comparison, the bigger factor is IMC vs chipset MC. When comparing DDR on 939 vs DDR2 on AM2, DDR2 does add a significant amount of bandwidth. Now, this does not translate into significant performance increases for AM2 systems, primarily because 939 systems weren't bandwidth starved to begin with.

Great thread though, should answer a lot of people's questions.
;)
 
idiotec said:
Very good info :D

One point I will comment on:



I am not going to argue that DDR2 is "better," but I think the argument here is somewhat misleading.

Within the thread you linked to, all comparisons are DDR on A64 vs DDR2 on Intel. In that comparison, the bigger factor is IMC vs chipset MC. When comparing DDR on 939 vs DDR2 on AM2, DDR2 does add a significant amount of bandwidth. Now, this does not translate into significant performance increases for AM2 systems, primarily because 939 systems weren't bandwidth starved to begin with.

Great thread though, should answer a lot of people's questions.
;)

:eek: very true, I think I have journalist blood in me as I kind of thought that in the back of my head yet still wrote it to prove a point. I'll rectify that.

edit: have changed that information about DDR2, added some more myths and cleaned up the format a little. Still plenty of work to do though I guess.
 
Very nice, I am going to print out this thread and show all of my dumb classmates in my programming class how stupid they are. :D

I mean seriously, I'm like the only one who knows anything in that class
(and it's an elective)
 
This helped out a lot. I always wondered why AMD's GHz was so much lower than Intel's. I knew that 4200+ meant that is how it compared to Intel.
Great Thread.
 
This helped me out a lot. I always knew that Core 2 Duos were better than AMD X2s and those were better than Pentiums, but never knew why until I read this.
 
Someone needed to clear this up.. wow, never knew so many computer geeks could be so stupid..

I'm just pi***d about my classmates thinking that I'm wrong all the time, when I absolutely know I'm right
 