By Dave Altavilla
February 1, 2004
We began our testing with a few synthetic tests courtesy of
SiSoftware's SANDRA 2004. SANDRA, the System ANalyzer,
Diagnostic and Reporting Assistant, is a set of informational
and diagnostic utilities designed to test various PC
subsystems and report their relative performance. We ran four
of the built-in subsystem tests from the SANDRA 2004 suite
(CPU, Multimedia, Memory and Cache). We ran all tests at
stock speeds for the Pentium 4 3.2GHz Prescott CPU and the
P4 3.4GHz Extreme Edition. Reference numbers provided
on the graphs below are from SANDRA's internal database.
SiSoftware SANDRA
Synthetic CPU and Memory Benchmarks
Screenshots: SANDRA CPU, Multimedia, Cache and Memory tests
for the P4 Prescott 3.2GHz and the P4 Extreme Ed 3.4GHz
Surprisingly, Prescott's performance falls a bit short of a
standard P4 Northwood's. We were fairly certain Prescott's
deeper pipeline was the root cause, so we asked some of our
contacts at SiSoftware for their thoughts. Here's what they
told us.
"Several micro-architectural changes
were made in Prescott's core in order to enable headroom for
higher performance, scaling and frequency. For many
applications, the increased L2 cache size maximizes the
benefit of the uarch changes. However, in Sandra's tests,
which fit very well in the L1 cache, the increased L2 cache
does not make a significant contribution and thus amplifies
the impact of the uarch changes resulting in clock to clock
difference."
So in short, clock for clock, Prescott's deeper pipeline is
hurting its performance, and its enhanced branch prediction
unit (BPU) and extra cache aren't making up for it. We would
suggest, however, that it's difficult to draw direct
conclusions about real-world performance from simple
synthetic tests such as these. Synthetic tests such as
SANDRA help fill in the picture, but they are only one
component of a complete performance evaluation.
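To make the "clock for clock" comparison concrete, here is a
minimal Python sketch that normalizes a benchmark score by
clock speed before comparing two CPUs. The function names are
our own, and the scores in the usage lines are hypothetical
placeholders, not results from our test systems.

```python
def score_per_ghz(score: float, clock_ghz: float) -> float:
    """Return a benchmark score normalized to 1GHz of clock speed."""
    if clock_ghz <= 0:
        raise ValueError("clock_ghz must be positive")
    return score / clock_ghz


def clock_for_clock_delta(score_a, clock_a, score_b, clock_b):
    """Percent difference of CPU A versus CPU B after normalizing by clock.

    A negative result means A does less work per clock cycle than B.
    """
    a = score_per_ghz(score_a, clock_a)
    b = score_per_ghz(score_b, clock_b)
    return (a - b) / b * 100.0


# Placeholder scores (NOT measured results), both CPUs at 3.2GHz.
delta = clock_for_clock_delta(9500, 3.2, 10000, 3.2)
print(f"{delta:+.1f}% per clock")
```

When both chips run at the same frequency, as in the
placeholder call above, the normalization cancels out and the
delta is simply the raw score difference. It only matters
when comparing parts at different clock speeds, such as a
3.2GHz chip against a 3.4GHz one.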
Futuremark PCMark04
Synthetic CPU and Memory Benchmarks
We also ran the
CPU and Memory performance modules available with
Futuremark's new PCMark04 suite. We'll quote Futuremark for
an explanation of how these tests work.
"The CPU test suite is a collection of tests that are run
to isolate the performance of the CPU. There are nine tests
in all. Two pairs of tests are run multithreaded - each test
in the pair is run in its own thread. The remaining five
tests are run single threaded. These tests include such
functions as file encryption, decryption, compression and
decompression, grammar check, audio conversion, WMV and DivX
video compression."
Once again, Prescott shows it's slightly slower clock for
clock than a Northwood core. However, the delta between the
Athlon 64 and P4 scores is indicative of this test's emphasis
on compression and decompression functions, like those found
in video conversion, which is the P4's strong suit.
Editor's Note 2/9/04: Since the release of this article, we
found an error in the above graph for our Pentium 4EE 3.4GHz
score. At the time, we accidentally transposed the score,
which appeared as "5346". The graph now shows the correct
score for the P4EE 3.4GHz system.
Here are Futuremark's comments on what PCMark04's Memory
Performance Test is doing.
"The Memory test suite is a collection of tests that isolate
the performance of the memory subsystem. The memory
subsystem consists of various devices on the PC. This
includes the main memory, the CPU internal cache (known as
the L1 cache) and the external cache (known as the L2
cache). As it is difficult to find applications that only
stress the memory, we explicitly developed a set of tests
geared for this purpose. The tests are written in C++ and
assembly. They include: Reading data blocks from memory,
Writing data blocks to memory, performing copy operations on
data blocks, random access to data items and latency
testing."
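Futuremark's tests are written in C++ and assembly and we
have not seen their source, so the following is only a rough
Python sketch of the five operation types the quote lists
(block read, block write, block copy, random access and a
dependent-load latency walk). Every name and size here is our
own assumption, and an interpreted language measures
interpreter overhead as much as memory, so treat the timings
as illustrative only.

```python
import random
import time


def timed(fn):
    """Run fn once and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def run_memory_tests(block_size=4 * 1024 * 1024, accesses=100_000, seed=0):
    """Time five simple memory operations on a block of block_size bytes."""
    rng = random.Random(seed)
    src = bytearray(block_size)
    dst = bytearray(block_size)
    indices = [rng.randrange(block_size) for _ in range(accesses)]

    # Pointer-chasing chain: each slot holds the index of the next slot,
    # so every load depends on the previous one (a latency-style test).
    order = list(range(accesses))
    rng.shuffle(order)
    chain = [0] * accesses
    for here, nxt in zip(order, order[1:] + order[:1]):
        chain[here] = nxt

    def chase():
        i = 0
        for _ in range(accesses):
            i = chain[i]

    def random_access():
        total = 0
        for i in indices:
            total += src[i]

    def write_block():
        src[:] = b"\x01" * block_size

    def copy_block():
        dst[:] = src

    return {
        "read": timed(lambda: sum(src)),
        "write": timed(write_block),
        "copy": timed(copy_block),
        "random_access": timed(random_access),
        "latency": timed(chase),
    }


if __name__ == "__main__":
    for name, seconds in run_memory_tests().items():
        print(f"{name:>14}: {seconds * 1000:.2f} ms")
```

Varying block_size so the block fits inside or spills out of
a given cache level is the usual way a test like this exposes
differences in cache size, although a native-code
implementation is needed to see them clearly.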
Prescott's
additional L2 cache propels it ahead of the standard Pentium
4 Northwood processor and within striking distance of the
Athlon 64 FX-51 with its integrated memory controller and 1MB
of L2 cache. The P4 Extreme Edition CPUs take the test with
ease, most likely due to their full 1MB advantage in L3
cache.