An incredible amount of work has gone into BD; architecturally, it looks like it will be at least as fast as Nehalem in single-threaded IPC (probably better).
So yes, I think it will be in the ballpark of SB, though it may well be lower.
I think AMD's BD modules are better than Hyper-Threading - a module is a lot closer to two full cores, but without the die area of two full cores.
In particular, I like the FlexFP design: a 256-bit FPU that can process two 128-bit FP instructions simultaneously, one from each thread. Almost all FP instructions are still 128-bit or narrower, and 256-bit AVX instructions are only just being introduced, so building 8 full 256-bit FPUs wouldn't really make much sense. It would be a massive increase in die size for a tiny increase in performance. You want to do the opposite.
Sharing a 256-bit FPU like BD's FlexFP means those execution resources are actually used a lot more, and it can still process the (small number of) 256-bit AVX instructions in a single pass instead of decoding each one into 2x128-bit micro-ops and processing them one after the other.
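To put rough numbers on the utilization argument, here's a toy model in Python. It's my own simplification, not a description of how the real scheduler works: it assumes perfect packing of 128-bit ops into the shared unit's two lanes, that a dedicated 256-bit FPU issues one op per cycle whatever its width, and a made-up workload of 90 128-bit ops and 10 256-bit ops per thread. The function names are hypothetical.

```python
import math

def shared_fpu(ops_128, ops_256):
    """One 256-bit unit (two 128-bit lanes) shared by both threads.

    Assumes ideal packing: two 128-bit ops pair up in one cycle,
    and a 256-bit op takes both lanes for one cycle.
    """
    cycles = math.ceil(ops_128 / 2) + ops_256
    lanes_used = ops_128 + 2 * ops_256
    return cycles, lanes_used / (cycles * 2)

def dedicated_fpus(ops_128_per_thread, ops_256_per_thread, threads=2):
    """One full 256-bit FPU per thread, one op issued per cycle.

    Assumes a 128-bit op leaves the other 128-bit lane idle.
    """
    cycles = ops_128_per_thread + ops_256_per_thread
    lanes_used = threads * (ops_128_per_thread + 2 * ops_256_per_thread)
    return cycles, lanes_used / (cycles * 2 * threads)

# Made-up workload: per thread, 90 128-bit ops and 10 256-bit ops.
results = {
    "shared (FlexFP-style)": shared_fpu(ops_128=180, ops_256=20),
    "dedicated per core": dedicated_fpus(90, 10),
}
for name, (cycles, util) in results.items():
    print(f"{name}: {cycles} cycles, {util:.0%} lane utilization")
```

Under those assumptions the shared unit needs 110 cycles with its lanes fully busy, while two dedicated FPUs finish in 100 cycles at 55% lane utilization. That's twice the FPU hardware for about a 10% gain on this made-up mix, and the gap only opens up as the share of 256-bit ops grows.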
Now, if IPC isn't higher than SB's, how will it compete single-threaded?
Actually, it looks like AMD is designing it to clock very high, with a streamlined pipeline.
But wait a minute, didn't Intel try that with NetBurst? How is this different?
With NetBurst, Intel pretty much ignored IPC altogether and relied on improvements in manufacturing processes that never materialized.
AMD is streamlining BD's pipeline for higher frequency, but they're not making it their sole focus. It's just part of the picture.
IPC isn't the whole picture. Frequency isn't the whole picture. Core/thread count isn't the whole picture. You need a good balance of these things.
That's basically what AMD is trying to do with BD. Not focus on one specific thing, but improve as many areas as they can without over-engineering them.
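The balance between IPC, frequency and core count can be written as a back-of-the-envelope formula: rough throughput is IPC times clock, times the number of threads if the workload scales. Here's a small Python sketch of that. The two chips and all their numbers are invented for illustration (they are not BD or SB figures), and the linear scaling with threads is an idealization.

```python
def rough_throughput(ipc, clock_ghz, threads=1):
    """Billions of instructions per second, assuming perfect scaling.

    ipc       : average instructions per cycle per thread (hypothetical)
    clock_ghz : clock frequency in GHz (hypothetical)
    threads   : threads the workload can actually keep busy
    """
    return ipc * clock_ghz * threads

# Two invented designs: one leans on IPC, the other on clock speed.
high_ipc_chip = {"ipc": 1.2, "clock_ghz": 3.4}
high_clock_chip = {"ipc": 1.0, "clock_ghz": 4.2}

for label, chip in (("high IPC", high_ipc_chip), ("high clock", high_clock_chip)):
    single = rough_throughput(**chip)
    multi = rough_throughput(**chip, threads=8)
    print(f"{label}: {single:.2f} single-threaded, {multi:.2f} with 8 threads")
```

With these invented numbers the lower-IPC chip comes out slightly ahead (4.20 vs 4.08 single-threaded) purely on clock speed, which is the point: no single term decides the result.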
I'd say they're definitely ahead of Intel in some ways. Maybe not every way, but we'll have to see some results to know exactly how successful they've been.