Closer look: Apple Mac Pro AMD FirePro D-Series dual GPUs
AMD FirePro D-Series power the Mac Pro to new performance highs
AMD D-series GPUs for Mac Pro compared with AMD's PC variants
We've now had our hands on the Mac Pro 4-core model with dual AMD FirePro D300 GPUs, the 6-core model with dual AMD FirePro D500 GPUs and, most recently, the 8-core model with dual AMD FirePro D700 GPUs. The D300 GPUs come standard on the entry-level 4-core model, which retails for $2,999, while Apple charges $400 to upgrade to the D500s and $1,000 to upgrade to the D700s. AMD customized the design of these cards to Apple's requirements, and although they are not available on the aftermarket, AMD sells a rough equivalent to each card, which makes for an interesting point of comparison from a pricing perspective. At a glance, the upgrade charges seem expensive, but in reality they are extremely competitive.
Architosh has done some detective work to find the AMD aftermarket cards that best match the custom cards in Apple's Mac Pros. It found that the AMD FirePro D300 is very similar to the AMD FirePro W7000, although with half the VRAM, at 2GB per card. The W7000 is typically available at a street price of around $650 each, and the $2,999 Mac Pro includes two D300 cards. The AMD FirePro D500 is something of a hybrid of the AMD FirePro W7000 and the W8000, with a mix of components from the two cards and enhancements including a faster memory bus, more stream processors and additional VRAM at 3GB per card. Architosh estimates that if an equivalent card were on sale, it would retail for around $2,000. Yet Apple is only charging $400 for the upgrade. The D700 aligns directly with the specs of the AMD FirePro W9000, although it is likely running at a slightly lower clock speed. Each W9000 is worth around $3,150, making Apple's upgrade charge of $1,000 an absolute bargain.
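The pricing comparison above comes down to simple arithmetic. The Python sketch below tallies it using only the figures quoted in this article. The per-card prices are Architosh's estimates for rough aftermarket equivalents, not prices for the D-Series cards themselves, and the dictionary and function names are illustrative only.

```python
# Estimated street price (USD) per card of the closest aftermarket equivalent,
# as quoted above from Architosh. These are NOT prices for the D-Series cards.
EQUIVALENT_PRICE_PER_CARD = {"D300": 650, "D500": 2000, "D700": 3150}

# Apple's upgrade charge (USD) over the base dual-D300 configuration.
APPLE_UPGRADE_CHARGE = {"D300": 0, "D500": 400, "D700": 1000}

CARDS_PER_MAC_PRO = 2


def equivalent_pair_value(model: str) -> int:
    """Estimated street value of two rough-equivalent aftermarket cards."""
    return EQUIVALENT_PRICE_PER_CARD[model] * CARDS_PER_MAC_PRO


def value_step_up(model: str) -> int:
    """Extra estimated card value compared with the dual-D300 baseline."""
    return equivalent_pair_value(model) - equivalent_pair_value("D300")


for model in ("D300", "D500", "D700"):
    print(
        f"{model}: pair ~${equivalent_pair_value(model):,}, "
        f"step-up ~${value_step_up(model):,}, "
        f"Apple charges ${APPLE_UPGRADE_CHARGE[model]:,}"
    )
```

On these estimates, the D500 upgrade adds roughly $2,700 of card value for $400, and the D700 upgrade roughly $5,000 for $1,000. The D300 figure is generous, since the W7000 carries twice the VRAM.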
AMD FirePro D300 specs
Although the AMD FirePro D300 (codenamed 'Pitcairn') is the 'entry-level' card in the Mac Pro line, its rough equivalent is marketed by AMD as a 'High End' workstation card. The entry-level Mac Pro is fitted with two GPUs, meaning that even this model is most definitely befitting of the 'Pro' banner. Exemplifying this is the performance of its dual-GPU setup: each GPU is rated at 2 teraflops, for a combined 4 teraflops. By comparison, a fully specced previous-generation Mac Pro peaked at 2.7 teraflops of combined computing performance. Each D300 achieves its performance courtesy of 1280 stream processors utilizing AMD's GCN architecture and a 256-bit wide memory bus with 160GB/s of memory bandwidth, coupled with 2GB of GDDR5 RAM. AMD tells us that it, like its stablemates, is fabricated by TSMC on a 28nm process. (TSMC, you may recall, is expected to fabricate Apple's next-generation A8 mobile processor.)
AMD FirePro D500 specs
AMD markets the rough equivalent to the D500 (codenamed 'Tahiti') in the same 'High End' workstation GPU category as the D300, although it sits further up the range. The $400 upgrade will get you an additional 246 stream processors per card, taking the count to 1526 each. Although rated performance increases only incrementally, from 2 teraflops to 2.2 teraflops per card, the memory bus importantly widens to 384 bits, while memory bandwidth is upgraded to 240GB/s. VRAM also gets boosted by 1GB per card to 3GB, all helping to give a solid boost to overall performance. Its processor architecture is also based on AMD's GCN design.
AMD FirePro D700 specs
The closest equivalent to the FirePro D700 in AMD's range is marketed as an 'Ultra High End' workstation GPU, which should appeal to those with either the extra cash, or the need, for optimal GPU graphics and GPU compute performance. The $1,000 upgrade ups the ante by adding 522 stream processors per card over the D500 and a whopping 768 stream processors per card over the D300 (for a total of 2048 stream processors per card). Like the D500 and D300, the D700 GPU architecture is based on AMD's latest GCN microarchitecture. The memory bus is 384 bits wide, as on the D500, while peak memory bandwidth reaches 264GB/s. Performance reaches 3.5 teraflops per card for a total of 7 teraflops, nearly doubling the performance achieved by the D300 and highlighting how much horsepower it contains.
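For readers who want to check the step-ups between the three cards, here is a minimal Python sketch that collects the per-card figures quoted in the three spec sections above and derives the differences. The `GpuSpec` class and helper names are illustrative, and the teraflops values are the rated figures from this article, not measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuSpec:
    stream_processors: int
    teraflops: float       # rated per-card figure quoted in the article
    bus_width_bits: int
    bandwidth_gb_s: int


# Per-card figures as quoted in the spec sections above.
SPECS = {
    "D300": GpuSpec(1280, 2.0, 256, 160),
    "D500": GpuSpec(1526, 2.2, 384, 240),
    "D700": GpuSpec(2048, 3.5, 384, 264),
}


def stream_processor_gain(upgrade: str, base: str) -> int:
    """Additional stream processors per card when moving from base to upgrade."""
    return SPECS[upgrade].stream_processors - SPECS[base].stream_processors


def teraflops_ratio(upgrade: str, base: str) -> float:
    """Ratio of rated per-card teraflops, upgrade over base."""
    return SPECS[upgrade].teraflops / SPECS[base].teraflops


def dual_gpu_teraflops(model: str) -> float:
    """Combined rated teraflops of the two cards fitted to each Mac Pro."""
    return 2 * SPECS[model].teraflops


print(stream_processor_gain("D500", "D300"))       # D300 -> D500
print(stream_processor_gain("D700", "D500"))       # D500 -> D700
print(stream_processor_gain("D700", "D300"))       # D300 -> D700
print(round(teraflops_ratio("D700", "D300"), 2))   # D700 relative to D300
print(dual_gpu_teraflops("D700"))                  # combined dual-D700 rating
```

The rated figures put the D700 at 1.75 times the D300 per card, which is the basis for the "nearly doubling" description.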
Combined AMD D-Series GPU Performance Benchmarks
Analysis and commentary
It may be somewhat surprising to see that the Cinebench R15 OpenGL benchmarks don't appear to show dramatic performance differences between the AMD FirePro D300, D500, and D700 cards. AMD advises us that the reason for this is that the Cinebench OpenGL test is relatively CPU-bound (this is supported by benchmarks of the D500 showing dropped frames when running on a 6-core Mac Pro versus the 8-core Mac Pro). This is also evident when running Cinebench R15 in Windows 8, which we did with the 6-core/D500 Mac Pro. With CrossFireX enabled, the Mac Pro produced a Cinebench OpenGL score of 106.13fps. This compares with a score of 80.13fps under Mac OS X, a substantial boost in performance. None of the Mac Pro benchmarks above actually tap into the second GPU at this stage.
However, while there was a notable uplift in the Cinebench OpenGL score under Windows 8 with CrossFireX enabled, AMD says that the improvement in performance should actually have been greater. AMD's assessment is that this demonstrates that the Cinebench OpenGL test is substantially CPU-bound (by as much as 70 percent), and does not take full advantage of the extra GPU resources. Maxon has also explained to us that its test does not take advantage of both GPUs in Mac OS X either. As to why CrossFireX is not enabled in Mac OS X, AMD explains that this gives developers the opportunity to tune their own load-balancing algorithms more finely, allowing for maximum flexibility from the Mac Pro architecture. Enabling CrossFireX in Mac OS X could actually make it harder for developers to fine-tune application workloads.
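One way to see why a second GPU helps a CPU-bound test so little is Amdahl's law. The Python sketch below is an illustration only. It assumes AMD's "70 percent" figure can be read as the fraction of frame time that does not scale with GPU resources, which AMD did not state in those terms, and it treats a second GPU as exactly doubling GPU throughput. The function name is hypothetical.

```python
def amdahl_speedup(unscaled_fraction: float, resource_factor: float) -> float:
    """Overall speedup (Amdahl's law) when only the remaining
    (1 - unscaled_fraction) of the work runs resource_factor times faster."""
    scaled_fraction = 1.0 - unscaled_fraction
    return 1.0 / (unscaled_fraction + scaled_fraction / resource_factor)


# Cinebench R15 OpenGL scores reported above for the 6-core/D500 Mac Pro.
OSX_SINGLE_GPU_FPS = 80.13
WINDOWS_CROSSFIREX_FPS = 106.13

observed = WINDOWS_CROSSFIREX_FPS / OSX_SINGLE_GPU_FPS
print(f"observed uplift: {observed:.2f}x")

# Assumption: the CPU-bound share is the part that gains nothing from a 2nd GPU.
for cpu_bound in (0.3, 0.5, 0.7):
    ceiling = amdahl_speedup(cpu_bound, resource_factor=2.0)
    print(f"{cpu_bound:.0%} CPU-bound -> at most {ceiling:.2f}x from two GPUs")
```

Under that reading, a test that is 70 percent CPU-bound could gain at most about 1.18x from a second GPU, and even a 50 percent CPU-bound test tops out near 1.33x. The observed uplift of about 1.32x compares different operating systems and drivers, so it cannot be attributed to CrossFireX alone.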
As Anandtech's review of the Mac Pro reveals, Apple's implementation of OpenCL in Final Cut Pro 10.1 makes substantial use of AMD's Graphics Core Next architecture. It has been written by Apple to tap into both GPUs for processing power, delivering substantial performance gains over previous single GPU Mac Pro configurations, particularly when applying renders. Each of the D-Series GPUs in the Mac Pros contains 4.3 billion transistors. The amount of raw compute performance in the AMD FirePro GPUs in the Mac Pro actually far outstrips the raw compute performance of the Intel Xeon CPUs in the Mac Pro (as powerful as they are in their own right).
The CPU performance in the Mac Pros is still measured in gigaflops, whereas the performance of each of its GPU configurations is measured in teraflops. The Luxmark 2.1 benchmark above is a measure of GPU OpenCL performance and clearly shows that the FirePro D700 more than justifies its price premium in this measure, but shows that the D300 and D500 are no slouches either. This is where the Mac Pro system architecture starts to open up a performance gap over Apple's semi-pro high-end iMac range, more than justifying the "Pro" moniker. With the biggest gains in compute performance coming from GPUs, particularly when it comes to the processing requirements of pro applications, you can clearly see why Apple has chosen to design the Mac Pro the way it has, leveraging a dual-GPU architecture underpinned by a wide range of CPU choices, with 4, 6, 8 and 12-core options available.
At the same time, however, Mac OS X developers can opt to utilize system resources as they see fit, and it is possible for them to harness the power of both GPUs if it will be of benefit to the application. It has been shown that applications suited to GPU acceleration can see massive performance gains, sometimes more than 100x compared with using CPUs alone. However, for Mac Pro users who want to tap into the full gaming potential of the AMD FirePro GPUs in the Mac Pro, the Windows environment is going to offer more performance at this time. That said, AMD has confirmed to us that it is not planning to optimize the FirePro D-Series GPUs fitted to the Mac Pro in the Windows environment any further.
The good news for Mac Pro users is that AMD advised Electronista/MacNN that it is working with Apple on a daily basis to optimize the drivers for the AMD D-Series FirePro GPUs in the Mac Pro. The company also confirmed that it is continually working with professional application software developers, including game makers, to help them remove bottlenecks and maximize performance. Over time, it is likely that we will start to see the full performance potential of the AMD FirePro GPUs revealed in all contexts as developers get to grips with how best to leverage them. This is not surprising as the hardware breakthroughs invariably come first, and the software development follows.
The development work by Apple, in conjunction with its partners including Intel and AMD, has resulted in the most powerful Mac Pro ever. As is often the case, Apple has focused on the future of computing with the new Mac Pro. GPU performance is on a much higher trajectory than CPU performance, and the Mac Pro design is well placed to take full advantage of this. This also means that it will take a short while for the software, including system drivers, to be fully optimized. Apple has served up a machine that will excite developers (gaming and Pro app makers alike) as it offers them a platform where their apps will ultimately be able to do more, faster than ever. If that is not reason enough for developers to start tapping into both AMD FirePro GPUs in the Mac Pros as soon as they can, nothing will be.
By Sanjiv Sathiah
Electronista/MacNN would like to extend its thanks to AMD's John Swinimer and Chris Bentley for their assistance in putting this story together. A full transcript of a Q and A that Electronista/MacNN has enjoyed with AMD in its original context is included below for your reference. It yields some additional information that will be of interest to readers:
Q: Is ECC memory enabled in the D300, D500 and/or D700 AMD FirePro GPUs in the Mac Pro?
A: The AMD FirePro D-series products seen in the Mac Pro do not use ECC memory.
Q: Is AMD able to explain why the D500 scores 80fps in the Cinebench R15 OpenGL benchmark and the D300 scores quite closely at 77fps, even though there is a substantial price differential between the two cards?
A: Cinebench R15 is still somewhat CPU-bound, rather than being a pure test of GPU performance, so it tends to be a good indicator of the CPU performance of the drivers. Since the AMD FirePro D300, D500 and D700 all use the same driver code, you could expect Cinebench R15 to perform very similarly on all three graphics processors.
Q: Is it simply that the D500 is better at OpenCL compute tasks and the D300 close in performance in general GPU tasks?
A: Not necessarily. The AMD FirePro D500 has more resources than the AMD FirePro D300 for both OpenGL and OpenCL tasks. Certain workloads show this, but Cinebench R15 happens to measure the driver performance more. It's important to note that AMD FirePro D500 has 50% more memory bandwidth (384b vs 256b), and 20% more compute units (24 vs 20) than AMD FirePro D300. For memory-bound or compute heavy tasks, including both OpenCL and OpenGL shader-heavy tasks, the largest gains can be seen with AMD FirePro D500 compared to AMD FirePro D300.
Q: Of the three GPU cards, which is the best for gaming and why? (Although, we do realize that these are workstation cards).
A: Yes, these are workstation cards, but this involves no compromise on the gaming front. The AMD FirePro D700 will be the fastest at higher resolutions and detail settings, because of its extra compute and shader horsepower. If you check out the gaming results by searching the web, the AMD FirePro D700 competes neck and neck with the fastest gaming cards available on the platform, and we've seen that if run under Windows, games that support Crossfire will show an even greater boost.
Q: Does AMD have any plans to offer specialized Catalyst FirePro drivers for the new D300, D500 and D700 cards in the Windows environment for Mac Pro users who also run Windows in Boot Camp?
A: AMD provides driver support to Apple specifically for AMD FirePro D-series graphics found in the Mac Pro. There are no other driver plans for the current Mac Pro.
Q: Are there any AMD benchmarks that you can share that highlight the performance differences between the D300, D500 and D700?
A: AMD runs many benchmarks on our graphics cards. Many of them are proprietary, but included in the list are many public performance test cases, including: Batman: Arkham City, Borderlands, Civilization V, Dirt 2, F1 2012, Portal 2, Starcraft 2, Tomb Raider, X-Plane 10, Cinebench R15, Geeks3D GpuTest, Heaven demo, LuxMark, OpenGL Extensions Viewer, Adobe After Effects, Adobe Photoshop, Mari, Maya, Modo, Mudbox, Valley demo. Results for these benchmarks running on the 2013 Mac Pros are already available on the web. Some of these results show differences between AMD FirePro D300, D500 and D700 GPUs.
Q: Can AMD please specify the manufacturing process that the D300, D500 and D700 are fabricated on?
A: The AMD FirePro D300, D500, and D700 graphics are manufactured on 28nm at TSMC.
Q: Can AMD clarify the specific micro architecture used in each of the three GPUs?
A: All three of the AMD FirePro D-Series GPUs use the Graphics Core Next (GCN) micro-architecture (http://www.amd.com/us/products/technologies/gcn/Pages/gcn-architecture.aspx)
Q: Can AMD clarify why the FirePro GPUs in the Mac Pro (as powerful as they are) are better for running workstation 3D applications than for running 3D games? That is, what is different about the way the workstation GPUs are optimized compared to its high-end consumer Radeon GPU line?
A: The AMD FirePro GPUs are powerful in both cases. Games and workstation apps sometimes depend on different parts of the OpenGL and OpenCL APIs, and sometimes they depend on the same paths through the driver and in hardware. Specifically speaking to the drivers for the new Mac Pro, we are optimizing both paths. This kind of work is always ongoing: we work with Apple and with Mac game developers and Pro app developers to remove bottlenecks and resolve issues every day.
Q: I have run the new AMD Catalyst 14.1 Beta Driver for Windows on the D500, but noticed no improvement using the Mantle API when running Battlefield 4 versus DirectX 11, even with CrossFire activated - what might be the reasons for this? (Again, we do realize these are workstation GPUs.)
A: With the understanding that these are workstation GPUs, it's possible the graphics settings or maybe the levels need to be reviewed for performance. AMD has seen Battlefield 4 performance differences between DirectX and Mantle being most pronounced in multiplayer modes. You may want to look carefully at the Battlefield 4 performance monitoring tools that should be used to analyze performance. In theory, Mantle should work fine in this particular instance but AMD has not tested this configuration. While you can expect in general the performance uplift relative to DirectX to be smaller on systems with fast CPUs like the Mac Pro (i.e. under 10%), scaling expectations could be better with multi-GPU enabled.
Q: What applications, if any, are the D300, D500 and D700 validated for, specifically?
A: All applications. AMD tests almost every game and pro app that is appropriate.
Q: Would you please give us your thoughts on the results? It seems surprising that the NVIDIA mobile GPU is as competitive as it is on the whole.
A: Those numbers look very much in line with our results. We would expect that Batman: Arkham City would show more separation between the D700 and D300 at 2560x1440. Also, we tend to enable 4xMSAA to stress the GPUs more. What we see is that the 2013 Mac Pros are competitive when running a single GPU, but they really shine when apps leverage both GPUs. Currently, this is enabled by some of Apple's apps (Final Cut Pro, Motion, etc.), some Pro apps (Premiere Pro, Mari, Pixelmator, etc.) and OpenCL apps. We are expecting that over time more apps will join the ranks of multi-GPU enabled, since there's clearly lots of GPU power in the 2013 Mac Pros ready to be tapped.