Largest Performance Gain

kobe24 · May 21, 2007

yeah true sora, just imagine how fast they'll be in a couple of years

Apokalipse · May 21, 2007

Trotter said:
While C2D beats the current AMD architecture, it doesn't beat it by miles, but by inches. Bleeding edge is where companies want to be, but that razor is a hard one to walk.

I look forward to the new generation of AMD's chips, just to see the further evolution of the species. I seriously doubt I'll be able to buy into it, but I definitely want to see it.

Agreed.

Both companies have done things well, and other things not as well.

Intel made a mistake with their Netburst architecture. They finally abandoned Netburst in favour of a more efficient architecture descended from the Pentium 3 by way of the Pentium M (which is what the Core 2 Duo is really based on), and they have a very good platform now.

If you read into the internal architecture much, you'll find that AMD has a really good architecture for floating point instructions.
HyperTransport is one very solid technology, too.

kobe24 · May 21, 2007

they both have very good ideas and know how to use them

Sora · May 21, 2007

Yep, the P4s were just a joke really. Intel actually used to be pretty decent, which most people forget. They pretty much created the standard chips we all use today, so no one can diss them completely. AMD has made very few mistakes though, and they were not expecting Intel to release the C2D or for it to be that powerful.

kobe24 · May 21, 2007

C2D definitely changed the way we look at things

HAVOC · May 21, 2007

Apokalipse said:
I don't think it will. The architecture changes for K10 look very solid. Penryn hasn't got a lot of change, apart from a die shrink.

Penryn hasn't got a lot of change, apart from a die shrink? uh yes it does.

First notable change: 45nm. That's not the size of the die, it's the feature size of the transistors.

2.) shared 6 MB L2 cache. (Dual) shared 12 MB L2 cache (Quad)

3.) New High-K Metal Gate Silicon Transistors.

4.) New SSE4 optimized video encoders.

5.) New Fast Radix-16 Divider

6.) Increased to 1666 MHz FSB

7.) 820 Million Transistors

8.) Starting at 3.0Ghz

9.) Enhanced Dynamic Acceleration Technology, which uses the power headroom of an idle core to boost performance of the non-idle core.

10.) Enhancements to Intel Virtualization Technology speed up virtual machine transition (entry/exit) times by an average of 25 to 75 percent.

There are plenty more...

I'd say it's a bit more than a "die shrink"... lol

maroon1 · May 21, 2007

The fact that most other benchmarks were in the range of 10-25% improved,

As you know, games really depend more on the GPU than the CPU. HL2 Lost Coast was running 37.3% faster on Yorkfield. That's not a small difference.

Again, it depends on what's being encoded, and in what format.

Just show me one benchmark that shows that E6320 is 50% faster than E4300

I'm here saying 50%, not 111%

Just that the actual architectural improvements are much easier to compare with them both at the same clock speeds.

That's not fair.

First: Intel's Penryn has the 45nm die shrink. It is designed to run at high clock speeds without consuming a lot of power. Agena processors will find it difficult to run at clock speeds as high as Penryn's. The best processor is the one with the best performance/power consumption/price ratio. High clock speed or low clock speed, it doesn't matter.

Second: Who told you that Barcelona or Agena or whatever performs as well as Core 2 when they are running at the same clock speed?! Have you seen any benchmarks?

While C2D beats the current AMD architecture, it doesn't beat it by miles, but by inches.

By inches !!!

From the benchmarks that I saw, the E6600 performs as well as, if not better than, the 6000+.

The 6000+ has a 600MHz higher clock speed and consumes much more power (125W compared to 65W for the E6600).
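
To put rough numbers on that comparison, here is a minimal sketch. It assumes both chips post the same benchmark score (a simplification of "roughly equal"), uses the 125W and 65W ratings quoted above as a stand-in for real power draw, and takes 3.0GHz and 2.4GHz as the clock speeds, which the post only implies through the 600MHz gap. `SCORE` and `efficiency` are made-up names for illustration.

```python
# Rough per-clock and per-watt comparison under the assumptions stated above.
chips = {
    # name: (clock in GHz, rated power in watts)
    "E6600": (2.4, 65),
    "6000+": (3.0, 125),
}

SCORE = 100.0  # hypothetical: the same benchmark score for both chips


def efficiency(clock_ghz, watts, score=SCORE):
    """Return (score per GHz, score per watt)."""
    return score / clock_ghz, score / watts


results = {name: efficiency(*spec) for name, spec in chips.items()}
for name, (per_ghz, per_watt) in results.items():
    print(f"{name}: {per_ghz:.1f} points/GHz, {per_watt:.2f} points/W")

# How far ahead the E6600 is on each metric if the scores really are equal.
per_clock_ratio = results["E6600"][0] / results["6000+"][0]
per_watt_ratio = results["E6600"][1] / results["6000+"][1]
print(f"per clock: {per_clock_ratio:.2f}x, per watt: {per_watt_ratio:.2f}x")
```

Rated power is a design ceiling rather than a measurement, so the per-watt figure is only a ballpark.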

I'd say it's a bit more than a "die shrink"... lol

Thank you

Edit

6.) Increased to 1666 MHz FSB

And the one that was tested had only 1333MHz FSB, and it performed much better than QX6800.

I think the ones that have 1600MHz FSB are going to be a bomb !!!

2.) shared 6 MB L2 cache. (Dual) shared 12 MB L2 cache (Quad)

Note: The cache memory in the Quad version is not shared.

Apokalipse · May 21, 2007

HAVOC said:
Penryn hasn't got a lot of change, apart from a die shrink? uh yes it does.

First notable change: 45nm. That's not the size of the die, it's the feature size of the transistors.

2.) shared 6 MB L2 cache. (Dual) shared 12 MB L2 cache (Quad)

3.) New High-K Metal Gate Silicon Transistors.

4.) New SSE4 optimized video encoders.

5.) New Fast Radix-16 Divider

6.) Increased to 1666 MHz FSB

7.) 820 Million Transistors

8.) Starting at 3.0Ghz

9.) Enhanced Dynamic Acceleration Technology, which uses the power headroom of an idle core to boost performance of the non-idle core.

10.) Enhancements to Intel Virtualization Technology speed up virtual machine transition (entry/exit) times by an average of 25 to 75 percent.

There are plenty more...

I'd say it's a bit more than a "die shrink"... lol

The only things that require a somewhat significant architecture design change are 2, 4, and 5.
7. is mainly because of 2.
3. is not really an architecture change. But it is part of how they're moving to 45nm.
6. and 8. are possible largely because of the improved efficiency of the transistors, and die shrink.
9. is really not very complex, and hardly requires much change.
10. is a bit misleading. It's not a 25-75% improvement on actual performance.
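
A quick way to see why point 10 doesn't translate into 25-75% more performance is Amdahl's law. The sketch below reads the claim as "each VM entry/exit takes 25% to 75% less time" and assumes, purely for illustration, that a workload spends 5% of its run time in those transitions. `TRANSITION_SHARE` is a made-up figure, not a measurement.

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: whole-workload speedup when only `fraction` of the
    run time gets faster by a factor of `local_speedup`."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


# Hypothetical share of run time spent in VM entry/exit transitions.
TRANSITION_SHARE = 0.05

for cut in (0.25, 0.75):        # transition time reduced by 25% / 75%
    local = 1.0 / (1.0 - cut)   # e.g. a 75% shorter transition is 4x faster
    gain = overall_speedup(TRANSITION_SHARE, local) - 1.0
    print(f"{cut:.0%} shorter transitions -> {gain:.1%} faster overall")
```

With that assumed 5% share the overall gain stays in the low single digits. It only gets large for workloads that spend most of their time switching in and out of the VM.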

If you have studied CPU architecture a lot, you'll know why this isn't really anything revolutionary. CPUs get a lot more complex on the inside, but in this revision not much of the actual core architecture has changed.

If you've read much about K10, you can find mountains of information about how their architecture is changing (in a lot, LOT of very small ways you generally don't hear about, and also in a lot of very big ways).

A lot of what makes the C2D so successful is its ability to get instructions to the core quickly once they reach the CPU, along with the fact that each core can process 4 IPC and has a 14-stage pipeline. Bandwidth to the CPU may be somewhat limited compared to AMD's HTT, but the prefetching does minimise the effect of the slower FSB.

The Pentium 4 could process up to 3 IPC (same as K8), but its pipeline was 20 stages long, and it wasn't nearly as efficient at getting instructions to the core. Plus, its FSB was quite restricting, especially without prefetching like the C2D has.

AMD's K8 was built largely on K7, with a core IPC of 3 and a 12-stage pipeline. Having far fewer pipeline stages made the K8 much more efficient than the Pentium 4. With an on-die memory controller added, which reduced latencies and increased bandwidth, the Pentium 4 simply could not compete.
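
As a rough illustration of how width and pipeline depth interact, here is a toy model. It takes the widths and stage counts quoted above, and assumes that every mispredicted branch costs about one full pipeline refill. The branch share and mispredict rate are made-up round numbers, and the model ignores caches, prefetching and memory latency, so only the ordering means anything, not the absolute values.

```python
def effective_ipc(width, stages, branch_share=0.2, mispredict_rate=0.05):
    """Toy estimate of sustained instructions per clock.

    The ideal cost per instruction is 1/width cycles; each mispredicted
    branch adds roughly `stages` cycles to refill the pipeline.
    """
    cycles_per_instruction = 1.0 / width + branch_share * mispredict_rate * stages
    return 1.0 / cycles_per_instruction


designs = {
    # name: (peak instructions per clock, pipeline stages) as quoted above
    "Core 2": (4, 14),
    "Pentium 4": (3, 20),
    "K8": (3, 12),
}

for name, (width, stages) in designs.items():
    print(f"{name}: ~{effective_ipc(width, stages):.2f} instructions/clock")
```

Even this crude model puts the Pentium 4 last despite it matching K8 on width, which is the point about the long pipeline hurting efficiency.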

K8 is not without its problems. For one, its cache system is not as efficient as Intel's (including the Pentium 4's). But it is still a good architecture.
K8 is still ahead of Core 2 on bandwidth and raw memory latency (if you take out prefetching), and it has fewer pipeline stages. Plus, AMD's instruction decoders can each decode both simple and complex instructions; Core 2 has separate simple and complex decoders.

Some of the enhancements of K10 over K8 include increasing the IPC to 4, increasing cache efficiency, and greatly improving the way instructions are decoded (once they reach the CPU) and sent to the processing streams. And yes, HTT is receiving an upgrade as well.

AMD has put a huge effort into power conservation as well. They have developed technologies to slow down or completely disable particular cores, particular sections of a core, or even the memory controller. They will also allow all four cores to run at separate clock speeds.
That is coupled with an improved transistor design that reduces power leakage (which consequently also improves stability and clockability).

Yes, I do think K10 will be very competitive, even as Intel drops to 45nm and adds some more instruction sets, and improves clockability and power usage.

maroon1 said:
As you know, games really depend more on the GPU than the CPU.

Depends on the game. Some games have a lot more physics than others, for example, Half Life 2.
In some circumstances, cache makes a significant difference also.

HL2 Lost Coast was running 37.3% faster on Yorkfield. That's not a small difference.

see above.

Just show me one benchmark that shows that E6320 is 50% faster than E4300

I'm here saying 50%, not 111%

You know what?
snarf it. I'm not going to go googling around for benchmarks just to provide more evidence that a single instance of a high benchmark performance was caused by a certain factor.
The fact is, encoding does benefit from extra cache. Don't believe me? fine.

That's not fair.

You're right, it's not fair to judge which architecture is more efficient when the newer one is clocked higher.

Never mind the extra cache giving it an advantage in applications that are cache sensitive.

You can't simply say one architecture is better when there are other factors like that.
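
One way to separate those factors is to divide the clock difference out of a measured speedup. The sketch below uses the 37.3% Lost Coast figure quoted earlier in the thread; the two clock speeds are placeholders assumed for illustration, not figures from that benchmark, and `per_clock_gain` is a made-up helper name.

```python
def per_clock_gain(total_gain, old_ghz, new_ghz):
    """Remove the clock-speed ratio from a measured speedup.

    What is left is the combined effect of everything else
    (architecture, cache size, FSB speed and so on).
    """
    clock_ratio = new_ghz / old_ghz
    return (1.0 + total_gain) / clock_ratio - 1.0


# 0.373 is the quoted result; 2.93 and 3.33 GHz are assumed clocks.
gain = per_clock_gain(0.373, old_ghz=2.93, new_ghz=3.33)
print(f"gain at equal clocks: {gain:.1%}")
```

Whatever remains still lumps the bigger cache and faster FSB in with the core changes, so it is an upper bound on the architectural part rather than a clean measure of it.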

First: Intel's Penryn has the 45nm die shrink. It is designed to run at high clock speeds without consuming a lot of power. Agena processors will find it difficult to run at clock speeds as high as Penryn's. The best processor is the one with the best performance/power consumption/price ratio. High clock speed or low clock speed, it doesn't matter.

But it will have numerous architectural improvements to improve efficiency. For one, the increase to 4 IPC, while retaining their 12 stage pipelines, and much improved caching, decoding phases, and latencies.

Second: Who told you that Barcelona or Agena or whatever performs as well as Core 2 when they are running at the same clock speed?! Have you seen any benchmarks?

I've read a lot about its architecture.

By inches !!!

In most cases, yes. Maybe 20%

From the benchmarks that I saw, the E6600 performs as well as, if not better than, the 6000+.

And?

The 6000+ has a 600MHz higher clock speed and consumes much more power (125W compared to 65W for the E6600).

It's also a 90nm die process, with outdated transistor design, has a core IPC of 3, and has a less efficient caching and decoding phase.

And the one that was tested had only 1333MHz FSB, and it performed much better than QX6800.

"much" is subjective.

I think the ones that have 1666 MHz FSB are going to be a bomb !!!

Depends. In memory-bandwidth-sensitive applications, maybe it will see more improvement.
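
For a sense of scale, theoretical peak FSB bandwidth is just the transfer rate times the bus width. This sketch assumes the 64-bit (8-byte) data bus used by Core 2 and treats the "MHz" ratings from the thread as effective transfers per second. Real-world throughput will be lower.

```python
FSB_WIDTH_BYTES = 8  # 64-bit data bus


def fsb_bandwidth_gbs(rating_mhz):
    """Theoretical peak bandwidth in GB/s for an FSB rated at `rating_mhz`
    million transfers per second."""
    return rating_mhz * 1e6 * FSB_WIDTH_BYTES / 1e9


for rating in (1333, 1666):  # the two ratings mentioned in the thread
    print(f"{rating} MHz FSB: {fsb_bandwidth_gbs(rating):.1f} GB/s peak")
```

The jump between the two ratings is about 25% more peak bandwidth, which only shows up in applications that were actually waiting on the bus.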

Oreo · May 21, 2007

Sod my PC, this debate is interesting

Basically AMD will win.
Because Intel will get hit by a bomb made of abandoned Dells with Pentium 4s that no one wants anymore.

And after that AMD will release an octo-core processor, each core at 6GHz with 64MB of L4 cache, and it will run Crysis at 600FPS without a gfx card.

End Of.

And no, I am not an AMD fanboy, I have a C2D. I'm a mere fortune cookie that tells the future.

[/SARCASM]

Intel and AMD will always be even. For as long as averages exist, half should always like AMD, half Intel.

veg1992 · May 21, 2007

K M A N said:
Sod my PC, this debate is interesting

Basically AMD will win.
Because Intel will get hit by a bomb made of abandoned Dells with Pentium 4s that no one wants anymore.

And after that AMD will release an octo-core processor, each core at 6GHz with 64MB of L4 cache, and it will run Crysis at 600FPS without a gfx card.

End Of.

And no, I am not an AMD fanboy, I have a C2D. I'm a mere fortune cookie that tells the future.

[/SARCASM]

Intel and AMD will always be even. For as long as averages exist, half should always like AMD, half Intel.

Finally, an unbiased opinion... I applaud you, K M A N. I can't wait for those systems, and I want my fortune told, fortune cookie.
