Why So Fast?

Status
Not open for further replies.

Jumping_Bean514

Daemon Poster
Messages
752
Location
Seattle
I'm not completely sure why the Core 2 Duo is so fast. I look at the Intel Pentium Extreme Edition 965 Presler with its specs being:

Dual Core
3.73GHz
2 x 2MB L2 Cache
65 nm
64 Bit

Which, at first glance, seems better than Conroe. So why is Conroe faster? Could a really technical person give me the rundown on this?
 
I read that the Conroes are built with a different architecture. Something about shorter pipelines and something called NetBurst technology. Maybe it's faster because Conroe is kind of like an AMD processor, ya know, it can get more done even at a lower speed.
 
Maybe they have two mini-cores inside of one core? I'm not sure what he means, but I believe the Conroes are faster because they use clock timings like AMD does now. So a 2.7 Conroe would be equivalent to a 3.7 EE.
 
Conroes are faster because they are a completely redone architecture. They can do A LOT per clock cycle. Meaning better performance at lower speeds, and amazingly low power consumption.

massacreinfallx was saying that instead of being a dual core with 2 separate dies, Core 2 Duo has both cores on the same die.

NetBurst was the old Pentium 4/D style, where more MHz = faster speeds. That is not the case anymore.
 
"NetBurst" is the name of the microarchitecture behind Pentium 4/D-era processors such as the Presler 965 you've listed above. It has nothing to do with C2D processors.

I'm pretty sure that massacreinfallx was talking about the unified cache found on Conroe chips. This just means that the extremely fast on-die memory is accessible to both cores. In other words, say you're running a single-threaded app and one core is idle: the other core is free to use the entire 4MB cache. Previously, multicore processors gave each core its own dedicated cache (2MB each on the Presler), and the cores could not "share" each other's cache.
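If it helps, here is the cache point as a tiny Python sketch. The function name and the "busy cores split the pool evenly" rule are my own simplification for illustration, not how the hardware actually arbitrates the cache.

```python
def cache_available_mb(total_l2_mb, cores, shared, busy_cores=1):
    """L2 that one busy core can use, under a deliberately simplified model."""
    if shared:
        # Unified cache: busy cores split the whole pool, idle cores take nothing.
        # (Assumption: an even split. Real hardware shares it dynamically.)
        return total_l2_mb / busy_cores
    # Dedicated caches: each core is stuck with its own slice, busy or not.
    return total_l2_mb / cores


# Single-threaded app, one core idle (sizes from the thread: 2 x 2MB vs. 4MB unified).
print("Presler, dedicated:", cache_available_mb(4, 2, shared=False), "MB")
print("Conroe, unified:   ", cache_available_mb(4, 2, shared=True), "MB")
```

With one core idle, the dedicated layout leaves the busy core 2MB while the unified layout gives it all 4MB.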

Now, a processor executes an instruction by sending it down a "pipeline" whose stages fetch, decode, and execute it. The number of stages defines the length of the pipeline. Presler was based on the Prescott revision and probably had 31 pipeline stages, whereas Conroe/C2D has 14 pipeline stages. The longer a pipeline is, the less work each stage has to do, so the easier it is for the core to scale in terms of clock frequency, which is why the Presler has a higher operating frequency.

However, a clock cycle is just one tick of the core's clock, and an instruction moves forward one pipeline stage per tick. So an instruction needs fewer cycles to get through a 14-stage pipeline than a 31-stage one because it has to go through fewer stages. More importantly, every time the pipeline has to be emptied and refilled (after a mispredicted branch, for example), the shorter pipeline throws away far fewer cycles. Therefore the 14-stage core gets more useful work done per cycle, therefore it requires fewer cycles to perform a similar job when compared to a 31-stage core, therefore it can operate at a lower frequency and still perform at the same, or even greater, speeds.
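A rough way to see the effect of pipeline depth is a toy model in Python. It assumes one stage per clock tick, one instruction finishing per tick once the pipeline is full, and a full refill after every mispredicted branch. The workload numbers are invented, and it ignores everything else that differs between the two designs.

```python
def cycles_needed(instructions, pipeline_depth, mispredicted_branches):
    """Toy model: one instruction completes per cycle once the pipeline is full,
    and every mispredicted branch forces a full refill."""
    refill_cost = pipeline_depth - 1  # cycles before the first result comes out
    # One initial fill plus one refill per mispredicted branch.
    return instructions + refill_cost * (1 + mispredicted_branches)


# Stage counts are the ones quoted in this thread; the workload is hypothetical.
instructions = 1000
mispredicts = 20

for name, depth in [("31-stage (Presler)", 31), ("14-stage (Conroe)", 14)]:
    cycles = cycles_needed(instructions, depth, mispredicts)
    print(f"{name}: {cycles} cycles, IPC = {instructions / cycles:.2f}")
```

With these made-up numbers the 31-stage pipeline needs 1630 cycles and the 14-stage one needs 1273 for the same 1000 instructions, so the shorter pipeline gets more done per cycle.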

The megahertz wars, as they have been labelled, are irrelevant now. You have to factor in the IPC (instructions per cycle) rate of a core in order to determine its performance.
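To put rough numbers on that: performance is roughly IPC times clock frequency. Here is a minimal Python sketch of the arithmetic. The clock speeds are the ones mentioned in this thread, but the IPC figures are made-up placeholders, not measured values for either chip.

```python
def throughput(ipc, ghz):
    """Rough billions of instructions per second: IPC x clock frequency in GHz."""
    return ipc * ghz


def ipc_ratio_to_match(slow_ghz, fast_ghz):
    """How many times more work per cycle the lower-clocked chip needs to break even."""
    return fast_ghz / slow_ghz


# Clock speeds from the thread: the 3.73GHz Presler and the 2.7 Conroe guess.
presler_ghz = 3.73
conroe_ghz = 2.7

# Hypothetical IPC values, chosen only to illustrate the arithmetic.
presler_ipc = 1.0
conroe_ipc = 1.5

print(f"IPC ratio needed to break even: {ipc_ratio_to_match(conroe_ghz, presler_ghz):.2f}x")
print(f"Presler (assumed IPC {presler_ipc}): {throughput(presler_ipc, presler_ghz):.2f} billion instr/s")
print(f"Conroe  (assumed IPC {conroe_ipc}): {throughput(conroe_ipc, conroe_ghz):.2f} billion instr/s")
```

So a 2.7GHz chip only needs about 1.38 times the work per cycle to keep up with a 3.73GHz one, and anything beyond that puts it ahead despite the lower clock.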
 