Originally posted 8/15/02 by rob-ART morgan, mad scientist
Updated 8/17/02 with 'streamlined' graphs, tweaked verbiage, and addition of a very interesting Memory Speed Graph
Updated 8/20/02 -- Page TWO added with additional tests using subsystems.
Updated 8/27/02 -- Page TWO Quake3 results using three different graphics cards on both DDR and SDR systems
Mahalo to my remote mad scientist, Greg, for
results from his new DDR Dual 1GHz.
(Note
to skeptics that doubted Greg's DDR results: On
8/26/02 we re-ran every test in the Bare Feats
local lab using a different DDR 1GHz/MP Power Mac
with even more memory (1.25GB) and got the same
results as he got. This report is not a mistake,
not a fake, and not a fluke as some suggested.)
Do the new
2.7GB/s DDR memory bus and 167MHz system bus make a
difference? Have we taken a leap forward in the Power
Mac's overall system throughput? I figured the best
way to tell was to run the NEW DDR Dual 1GHz Power
Mac against the "obsolete" SDR (short for SDRAM)
Dual 1GHz Power Mac (and the DDR based Xserve Dual
1GHz).
CONCLUSION
To my surprise
and chagrin, the new Power Mac with DDR RAM has
no performance advantage over the old SDR
Power Mac running at the same clock speed. The 25%
faster system bus seems of no help, either.
Depressing. Scandalous!
FOR MORE TEST
RESULTS THAT STRESS THE DISK AND GRAPHICS
SUBSYSTEMS, TURN
TO PAGE TWO.
10/20/02
-- MacAddict
published their own shootout
between a DDR 1GHz G4 and an SDR 1GHz
G4.
They showed a significant difference in speed,
but I believe their tests are faulty. All three
tests involve the hard drive, and their two test
machines had different models of hard drive, so
differing results are to be expected. In
the BARE
FEATS tests,
I used the same 120GXP hard drive in both
machines. The results were virtually
identical.
Where's the
bottleneck? Here are some theories from readers
on why the new DDR machine isn't any
faster:
1. The
two processors share a 1.3GB/s pipe to Apple's
custom AGP/Memory controller. So the 2.7GB/s
memory gets "starved."
2. The main
problem hobbling the PowerMac is the CPU to
System Controller Bus, i.e. the System Bus. What
is "starved" is not the memory. What is starved
are the CPUs of the PowerMac. They are capable
of processing much faster than the system can
feed them data. This has been a known, nagging
problem with the G4 processor in its current
implementation. By design, the PowerPC 7450 used
in current PowerMacs CANNOT receive data any
faster than 1.3GB/s.
And if you use
DUAL or QUAD CPUS, they are FORCED to SHARE the
bus, slowing down each processor when doing data
intensive work. This is why Apple cannot come
out with a QUAD processor machine. The four CPUs
would have to share the same thin 1.3GB/s bus!!!!
Talk about starving CPUs even more!!!
Ideally, the
two CPUs should get separate busses to get data.
But they cannot. It's forced by Motorola's
current design. Note that Pentium 4 also has the
same problem - having to share the same system
bus among dual CPUs (however, the bus is
fatter). Motorola's PowerPC G4 cannot use a
faster bus. Only the Athlon can take advantage
of using a separate bus per CPU.
3. The DDR
Power Mac has a smaller (faster) L3 cache (1MB
per cpu versus the SDR's 2MB per cpu). (I am
not convinced this is a big factor. Why? Because
the DDR based Xserve has 2MB per CPU just like
the SDR Power Mac and it's no faster. See memory
graph below for more on this. --
rob-ART)
4. The DDR
Power Mac is running Jaguar while the SDR Power
Mac and Xserve are running Puma (10.1.5). (As
you can see above, we re-tested using 10.2 on
the SDR Power Mac to shoot holes in that theory
-- rob-ART)
5. The
advantages of the DDR Power Mac will show only
if you load up all the buses (CPU, Memory, PCI,
FireWire, AGP, etc.). (Though all the test
apps we used exercise the PCI and
AGP buses to some degree, they are primarily CPU
and Memory intensive. See PAGE
TWO
tests for more on graphics and disk stressing.
-- rob-ART)
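Theories 1 and 2 above both come down to simple arithmetic on the shared pipe. The sketch below is only a back-of-envelope model, not a measurement: it takes the peak bus figures from the spec chart in this article and assumes (my simplification) that data reaches the CPUs no faster than the slower of the system bus and the memory bus, and that busy CPUs split the system bus evenly. The function names are made up for illustration.

```python
# Peak figures in GB/s, copied from the spec chart in this article.
SYSTEMS = {
    "DDR Power Mac": {"system_bus": 1.3, "memory_bus": 2.7},
    "SDR Power Mac": {"system_bus": 1.0, "memory_bus": 1.0},
    "Xserve MP": {"system_bus": 1.1, "memory_bus": 2.1},
}


def cpu_ceiling(system_bus, memory_bus):
    # Assumption: data reaches the CPUs no faster than the slower link.
    return min(system_bus, memory_bus)


def per_cpu_share(system_bus, memory_bus, cpus=2):
    # Assumption: busy CPUs split the shared system bus evenly.
    return cpu_ceiling(system_bus, memory_bus) / cpus


def summarize(systems=SYSTEMS, cpus=2):
    # Map each system name to (ceiling, per-CPU share), both in GB/s.
    rows = {}
    for name, spec in systems.items():
        rows[name] = (cpu_ceiling(**spec), per_cpu_share(cpus=cpus, **spec))
    return rows


if __name__ == "__main__":
    for name, (ceiling, share) in summarize().items():
        print(f"{name}: ceiling {ceiling:.2f} GB/s, per-CPU share {share:.2f} GB/s")
```

Under these assumptions the DDR Power Mac's ceiling is set by its 1.3GB/s system bus, not by its 2.7GB/s memory bus, and passing cpus=4 shows how thin each slice would get in a hypothetical quad.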
Here's
a very interesting graph created with data
generated by an app named MemPerf written by Basil
Achermann. It reveals what's going on with L1
cache, L2 cache, L3 cache, and regular
RAM:
The graph
above shows that the SDR Mac is as much as 81%
faster on repeated sequential accesses over the same
0.5-2MB of data, or whenever its larger L3 cache
is used a lot. The DDR Mac can load 21% faster from
main memory, but only because of the 25% faster
bus speed, not because of the doubled data rate
(evidence of the bottleneck). Real world applications
use both L3 cache and RAM, and that's why they all
score about the same on both
machines.
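The reasoning in that paragraph can be checked with a few lines of arithmetic. This is a rough sketch, not part of the MemPerf tool: it compares the measured 21% main-memory gain against the two candidate explanations, the bus clock increase and the memory bus spec increase. The 133MHz figure for the SDR system bus is my assumption, inferred from the PC133 memory and the "25% faster" claim; the other numbers come from this article.

```python
SDR_BUS_MHZ = 133      # assumption: SDR Power Mac system bus clock
DDR_BUS_MHZ = 167      # from the article
SDR_MEMORY_GBS = 1.0   # memory bus spec, from the chart
DDR_MEMORY_GBS = 2.7   # memory bus spec, from the chart
MEASURED_GAIN = 0.21   # DDR Mac main-memory load advantage, from the graph


def gain(new, old):
    # Fractional improvement of new over old (0.25 means 25% faster).
    return new / old - 1.0


CANDIDATES = {
    "system bus clock": gain(DDR_BUS_MHZ, SDR_BUS_MHZ),
    "memory bus spec": gain(DDR_MEMORY_GBS, SDR_MEMORY_GBS),
}


def closest_explanation(measured, candidates=CANDIDATES):
    # Pick the candidate whose predicted gain is nearest the measured gain.
    return min(candidates, key=lambda name: abs(candidates[name] - measured))


if __name__ == "__main__":
    for name, value in CANDIDATES.items():
        print(f"{name}: predicted gain {value:.0%}")
    print("closest to measured:", closest_explanation(MEASURED_GAIN))
```

The bus clock predicts a gain of roughly a quarter while the memory spec predicts 170%, so the measured 21% sits far closer to the bus clock explanation.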
What
will it take to fix the bottleneck?
1. A CPU that can handle the full speed of DDR
memory (like the fabled PPC 7470).
2. A redesigned motherboard with separate bus
for each CPU.
Will the Dual
1.25GHz Power Mac be 25% faster as the specs imply?
Will it smoke the Pentium 4 2.53GHz system as Apple
claims? We plan to answer those and other questions
in about 6 weeks... when the top model ships.
Meanwhile, we have already started testing on the
Pentium 4 2.53GHz system.
Here's a chart
comparing the features and specs of the three Dual
1GHz systems:
| Feature | DDR Power Mac | SDR Power Mac | Xserve MP |
|---|---|---|---|
| CPU Clock Speed | 1GHz x 2 | 1GHz x 2 | 1GHz x 2 |
| System Bus Speed | 1.3GB/s | 1GB/s | 1.1GB/s |
| Memory Bus Speed | 2.7GB/s (DDR) | 1GB/s | 2.1GB/s (DDR) |
| Maximum Memory | 2.0GB | 1.5GB | 2.0GB |
| L3 Cache | 1MB per CPU ** | 2MB per CPU | 2MB per CPU |
| L3 Cache Throughput | 4.6GB/s | 4GB/s | 4GB/s |
| Standard Graphics Card | Radeon 9000 AGP | GeForce4 MX AGP (64MB DDR) | ATI PCI (32MB DDR) * |
| Optional Graphics | GeForce4 Titanium AGP | GeForce4 Titanium AGP | Radeon 8500 AGP (Apple) or any SHORT AGP card you can obtain |
| ATA drive bus speed | 66MB/s and 100MB/s (two controllers) | 66MB/s | 100MB/s |
| ATA drive connections | 4 | 2 | 4 w/ SMART support |
| Full length PCI slots | 4 | 4 | 2 |
| PCI bus speed | 33MHz | 33MHz | 66MHz |
| FireWire Ports | 2 | 2 | 3 |
| Gigabit Ethernet Ports | 1 | 1 | 2 * |
| Price for Dual 1GHz Base Model | $2495 | (out of production) | $3999 |
* Combo PCI/AGP
short slot can be used for a short AGP graphics card
or a Gigabit Ethernet card but not for both. A full length
GeForce4 Titanium won't fit. And you'll need a
special AGP riser for the short card if you decide
to upgrade. The default factory config is a generic
ATI PCI graphics card which seems to be on the
level of a Radeon 7000.
** The Power Mac
G4/1.25GHz MP will ship with 2MB L3 per
CPU
RELATED
LINKS
Apple has
published some test results of their own using
Photoshop, Final Cut Pro, and DVD encoding
(iDVD?). You can download the Technical
Overview PDF
with these results. They haven't responded yet to
my request for details as to how these tests were
run so we can't duplicate or verify the
results.
PowerLogix
published an in-depth
white paper discussion (PDF)
comparing single data rate static RAM ("SDR")
architecture versus double data rate ("DDR") when
designing the level 3 cache circuitry for use with
the latest Motorola G4/745x processors.
Don't forget
to TURN
TO PAGE TWO
and THREE
for more DDR vs SDR test results.
If you buy a new
DDR Power Mac, make sure you specify DDR PC2700
333MHz (Non ECC) 64x64 CL2.5 memory. I found
the 512MB modules at Data
Memory Systems
for $135 (part number DM50 609). Check also with
TransIntl.com
but be sure to specify "PC2700" since they don't
list the speed on their website.
TEST
NOTES
The "SDR" Power Mac 1GHz MP had 1GB of PC133 CL2 SDRAM.
The "DDR" Power Mac 1GHz MP had 1.25GB of PC2700 CL2.5 DDR RAM.
The "DDR" Xserve 1GHz MP had 1.5GB of PC2100 CL2.5 DDR RAM.
All three were running from an IBM 120GXP drive.
For details on
each real world test, read "HOW
I TEST."