Inside Woodcrest

If you don't mind using vendor-supplied benchmark data, take a quick spin through www.intelstartyourengines.com. (Otherwise, do some searching for independent Web reviews. You'll find the results are about the same.) The upshot is that Dempsey draws even, give or take, with AMD's Opteron line. With the move to Woodcrest and its underlying Core architecture, though, Intel stretches far into the lead regardless of what combination of performance, power consumption, or pricing you assess.
Like Dempsey, Woodcrest is based on a 65 nm process. However, whereas Dempsey is essentially two separate cores, each with its own 2MB of cache, tied together by a data bus, Woodcrest integrates a single 4MB L2 cache shared between both processor units. This is a more elegant, efficient design that alleviates many of the Paxville/Dempsey latencies caused by the two discrete cores constantly needing to tell each other what each has in its cache, almost like two neurotic lovers endlessly repeating "Are you OK with this? Are you OK with this?"

Woodcrest ups the maximum FSB speed to 1,333 MHz, with top-end parts boasting only an 80W TDP. The flagship Xeon 5160 sprints at 3.0 GHz, while the lowest speed-bin part, the 5110, runs at only 1.6 GHz on a 1,066 MHz bus. But keep an eye out for some Woodcrest curveballs, such as the Xeon 5148, which is a 2.33 GHz processor with only a 40W TDP. For those wanting a killer mix of performance and low power, this may be the best play around. The majority of Woodcrest CPUs are 65W parts.

Broad rollout on Woodcrest should be happening right about when you read this. Again, in terms of platform features and compatibility, Woodcrest and Dempsey are interchangeable. Intel expects Woodcrest to achieve the lion's share of Xeon sales by the September/October time frame, but even then Dempsey and Nocona/Irwindale will still hold over 30% of Intel's DP share. Things remain fairly predictable until sometime in the fourth quarter, when Intel estimates it will bow Clovertown, the quad-core implementation of Woodcrest. Clovertown is expected to be compatible with Bensley chipsets, but whether it will be compatible with current Bensley motherboards (stepping could be an issue) remains to be seen.

The 5000-Series Chipsets

"Blackford" is the code-name for the Bensley server chipsets (5000P and 5000V); "Greencreek" refers to the workstation chipset variant (5000X). By now, some of the 5000-series features will come as no surprise.
The front-side bus fattens up to 1,333 MHz, ECC memory and Hyper-Threading are supported, you get the platform enhancements detailed earlier, and there are two independent 64-bit/133 MHz PCI-X segments. But not even a 1,333 MHz FSB would likely accommodate the coming quad-core tide. The writing on the wall has been visible for quite a while. A single FSB link between the processor and memory controller hub (northbridge) could not possibly continue Intel's vision of a "balanced platform."
Once again, bandwidth to the CPU has become a primary bottleneck. With Lindenhurst, the prior generation Xeon chipset, both Paxville/Irwindale CPUs feed into a shared FSB linked to the chipset. With Blackford, a separate 1,333 MHz FSB links to each Bensley CPU. This spikes the overall bus bandwidth from Lindenhurst's 6.4 GB/s to 21.3 GB/s with Woodcrest. Like Lindenhurst, the 5000V (perhaps for "Value") chipset supports up to 16GB of memory. The 5000X maxes at 21GB, but the 5000P reaches up to 64GB with four memory channels each containing up to four 533 MHz modules. (Lindenhurst only supported up to 400 MHz.) Blackford remedies some nagging problems in the server space, such as the chipset's support for up to three x8 PCI Express links, but it leaves some questions unanswered. Chief among these is the long-term viability of the FSB architecture. Intel remains confident about the design, noting that with over 10 GB/s of net bandwidth available per bus, the Blackford FSB trounces HyperTransport, even with a supposed update from 2,000 MHz to 2,800 MHz. Moreover, the front-side bus offers this bandwidth in each direction; HyperTransport only provides half of its stated bandwidth in each direction. Additionally, posits Intel, keeping the memory controller separate from the CPU allows for greater platform mobility. One might look at the relative ease with which Intel hopped from DDR to DDR2 without losing CPU compatibility as an example. The Athlon line couldn't make the DDR2 transition until this year and faces a similar predicament if the market moves en masse to a faster format because its CPU architecture is tied like a Siamese twin to the memory controller. On the downside, there's no arguing that latency increases if one needs to communicate across a longer bus for memory accesses. With the move to multi-core going into 2007 and far beyond, we may yet see the FSB grow congested. 
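For readers who want to check the math, the bus bandwidth figures quoted above fall out of simple arithmetic. The short Python sketch below assumes a 64-bit (8-byte) front-side bus data path and an 800 MHz shared bus for Lindenhurst; neither figure is stated in this article, and the function name is made up for illustration.

```python
def fsb_bandwidth_gbs(bus_mhz, bus_bytes=8, links=1):
    """Peak bandwidth in GB/s for one or more front-side bus links.

    bus_mhz   -- effective bus speed in MHz (e.g. 1333)
    bus_bytes -- assumed data path width in bytes (8 = 64-bit; an assumption)
    links     -- number of independent FSB links to the chipset
    """
    return bus_mhz * 1e6 * bus_bytes * links / 1e9


# Lindenhurst: both CPUs share one bus (800 MHz is an assumption).
lindenhurst = fsb_bandwidth_gbs(800)

# Blackford: one 1,333 MHz bus per CPU, two CPUs.
blackford_per_bus = fsb_bandwidth_gbs(1333)
blackford_total = fsb_bandwidth_gbs(1333, links=2)

print(f"Lindenhurst shared bus: {lindenhurst:.1f} GB/s")
print(f"Blackford per bus:      {blackford_per_bus:.1f} GB/s")
print(f"Blackford total:        {blackford_total:.1f} GB/s")
```

Under those assumptions the totals land on the 6.4 GB/s and 21.3 GB/s quoted for the two chipsets, and each Blackford bus comes in at a little over 10 GB/s, in line with Intel's per-bus claim.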
However, Intel's consistent ability to scale into smaller fab processes significantly ahead of its competition allows it to increase integrated L2 cache sizes, which in turn lightens the load on CPU-to-memory traffic. In all likelihood, Xeon's FSBs will continue to scale beyond 1,333 MHz as needed, and while two FSB links obviously allow Bensley to be the fastest DP server architecture in the world, nothing is preventing Intel from adding more links as core counts dictate. Certainly, the rising complexity of such an approach has its limits, but these are years away, and we're fairly sure that new Intel server platforms will arise to address multi-core bus needs with far greater alacrity in the future than was seen during the 90 nm era.

The FB-DIMM Migration

Serial technology wins again. Parallel ports gave way to USB, parallel IDE stepped aside for SATA, and now DDR memory is shifting up to Fully Buffered DIMM (FB-DIMM) technology, which is still based on DDR2 architecture. (Because of this, as DDR2 technology improves in the near term, FB-DIMM technology will rise along with it.) By cutting way back on the number of parallel memory traces on the motherboard, Intel is able to use that PCB real estate for better purposes, like adding more memory channels and slots.
FB-DIMM technology remedies these shortcomings by adopting a sort of modified PCI Express serial bus. As computer architecture has proved many times over the years, it's more effective to move fast signals over a few traces than slow signals over many traces. This is why an FB-DIMM channel needs only 69 signal pins as opposed to 240 pins with DDR2. Additionally, because FB-DIMM uses a point-to-point architecture between the chipset's memory controller and the intelligent controller module on each DIMM (called the Advanced Memory Buffer, or AMB), each memory stick has a mind of its own. One module can perform reads while another does writes, a feat that is impossible with older, parallel memory technologies.

Going serial isn't an entirely perfect solution. FB-DIMM architecture allows for up to six memory channels with up to eight modules per channel. That's a maximum of 48 modules, or 192GB of total memory with 4GB sticks. However, since instructions pass from module buffer to buffer down each channel until the destination module is reached, noticeable latencies can set in when accessing modules deep in a channel chain. The faster the modules, the less this latency is felt, plus the ability of each module to work independently of its peers increases overall efficiency. So while latency is more of an issue than with DDR2, FB-DIMM technology as a whole proves undeniably superior in performance and scalability.

With the imminent move to quad-core, Intel knew that more bandwidth was needed than dual-channel DDR2 could provide. Moreover, as DP servers grew able to tackle larger processing jobs, they needed to be able to handle correspondingly larger datasets, hence the roof raising to 64GB. Early on, there was some grumbling, even within Intel's ranks, that the move to FB-DIMMs might be setting up for another transitional misstep akin to the Rambus RDRAM brouhaha of the late '90s.
(Continuing an earlier point, one could argue that RDRAM was an example of what can happen when an industry leader tries to outsource a major element of a platform advance. Keeping such matters in-house obviously pays dividends.) However, further research showed that the industry would have no choice but to move beyond DDR2, and FB-DIMM consistently emerged as the simplest, most robust way to do it. This is why you now see FB-DIMM showing up on roadmaps throughout the industry, including from Sun and AMD, and JEDEC is adopting FB-DIMM as the industry's next-gen memory standard. We're back to Intel providing technology leadership rather than competitive speculation. One note for system builders about FB-DIMMs: Because of the nature of the memory architecture, the platform does best when running synchronously with the CPU. Thus if you're using a 1,066 MHz FSB Dempsey processor, you'd want FBD/533 memory. Stepping up into FBD/667 modules won't likely net you enough performance gain to justify the cost increase. However, if you're sprinting ahead with a 1,333 MHz Woodcrest, then you want FBD/667 modules as 533 MHz memory will deliver an unjust performance penalty. Also note that FB-DIMM architecture is more flexible in how capacity points can be reached. In last year's Xeon designs, a 16GB capacity could only be obtained with four 4GB modules or eight 2GB modules. Because of the higher slot count with most Bensley motherboards, you now have the option of implementing 16 1GB modules. Since lower capacity modules tend to cost less per megabyte, you may realize an overall memory cost decrease of 20% to 30% by opting for more modules of lower size in order to hit a given capacity point. Now, it seems that with every shift in memory technologies comes an inevitable outcry against higher pricing per megabyte, which in turn can hamper adoption of the new platform. Even before Bensley's arrival, there was considerable industry buzz on this point. 
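The FB-DIMM sizing rules discussed above (capacity ceilings, pairing memory speed with the FSB, and the different ways to reach a capacity point) can be sketched in a few lines of Python. This is only an illustration: the function and table names are hypothetical, and the slot counts of 8 for last year's boards and 16 for Bensley boards are assumptions inferred from the module counts mentioned in the text.

```python
# FSB speed (MHz) -> FB-DIMM speed grade that runs synchronously with it,
# per the pairing advice in the text.
SYNCHRONOUS_FBD = {1066: 533, 1333: 667}


def max_capacity_gb(channels, modules_per_channel, module_gb):
    """Memory ceiling in GB for a given channel/module layout."""
    return channels * modules_per_channel * module_gb


def module_options(target_gb, slots, sizes_gb=(1, 2, 4)):
    """List (module_count, module_size_gb) pairs of identical modules
    that reach target_gb exactly without exceeding the slot count."""
    options = []
    for size in sizes_gb:
        count, remainder = divmod(target_gb, size)
        if remainder == 0 and count <= slots:
            options.append((count, size))
    return options


# FB-DIMM architectural limit: 6 channels x 8 modules x 4GB sticks.
print(max_capacity_gb(6, 8, 4))
# 5000P: 4 channels x 4 modules x 4GB sticks.
print(max_capacity_gb(4, 4, 4))

# Ways to reach 16GB on an assumed 8-slot board vs. an assumed 16-slot board.
print(module_options(16, slots=8))
print(module_options(16, slots=16))

# Memory speed grade to pair with a 1,333 MHz Woodcrest.
print(SYNCHRONOUS_FBD[1333])
```

With 16 slots, the 16 x 1GB option appears alongside the 8 x 2GB and 4 x 4GB options, which is the extra flexibility that opens the door to cheaper low-capacity modules.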
Intel may have learned its lesson during the DDR-to-DDR2 migration, which seemed to take longer than necessary because of price premiums with the newer format. We couldn't get specifics from Intel, but word on the street is that part of the company's budget for Bensley's platform push is going toward bringing the price delta for FB-DIMM over DDR2 down from roughly 40% to about 20%. This will likely take the form of a rebate program offered through distribution on Bensley platform purchases. As volumes increase, the delta will shrink, and Intel estimates that pricing parity with DDR2 should be reached by mid-2007.

Succeeding With Bensley

If you gather your information strictly from sensationalist Web stories, you might think that Intel was engaged in a game of server market catch-up. In reality, even as of the end of 2005, more than four out of every five server processors purchased were made by Intel. The fact is that for all the hullabaloo made over market share trends, Intel has seen relatively little share erosion in most server and workstation segments. "The situation for Intel isn't as bad as you might think," says Alex Hererra, senior analyst with Jon Peddie Research. "Intel's only dropped a few percentage points on the Xeon side. And if you look at the overall workstation market, it's interesting because Intel hasn't really lost anything, mainly because things like mobile workstation have had a lot of growth, and mobile workstation is virtually owned by Intel. They had like 93%, and now it's still at roughly 93%."
Bensley has been over two years in the making and was running in trials for several months prior to the May launch. But Intel didn't want a paper launch, which is why it insisted that motherboards, memory, cases ranging from 1U rack boxes to 5U pedestals, and everything else Bensley needed be in place in the channel by May 23rd. It's still too early to know if the Bensley rollout has lived up to Intel's or the market's expectations, but so far the indicators show the platform being more or less right on track. By the end of June, Dempsey comprised roughly 30% of the Xeon line. In August, Woodcrest passed Dempsey in volume, and by September Bensley will rule nearly 80% of the Intel 1P/2P world, with 20% of that volume still being supplied by Dempsey in the 5030 and 5050 SKUs.

Bensley as a whole offers a tremendous amount of value and growth potential for the server and workstation communities, but in the short term there are a few key points of exceptional opportunity:

Xeon 5030. This is Intel's single-core and 1P killer. Running at 2.66 GHz and 95W, this sub-$170 CPU is your shining star for potential buyers whose primary concern is pricing. The CPU is sold in boxed format only.

Xeon 5050. For $20 more, the 5050 gives value buyers a respectable performance bump up to 3.0 GHz, still on a 667 MHz front-side bus. There's no power draw penalty versus the 5030, leaving us thinking that the 5050 is a no-brainer upsell save perhaps for large customers potentially magnifying the small price delta across hundreds of new servers. The 5050 is available in both boxed and tray packaging.

Xeon 5130. Crossing into the Woodcrest collection, the 5110 is tempting as an entry point, but the more interesting performance point is the 5130, which hops from the 1,066 MHz FSB up to 1,333 MHz and runs at 2.0 GHz. The 5130 is probably your play for those wanting the "best bang for the buck."

Xeon 5148. In a product family already renowned for its power savings, the 5148 is Intel's first LV variant. At only 40W, this 2.33 GHz Xeon is the industry's new high water mark for creating high-performance, ultra-dense server arrays.

Xeon 5160. At 3 GHz on a 1,333 MHz bus, the 5160 is the flagship of the Core architecture for the third quarter of 2006. Server buyers wanting the absolute fastest 2P configuration available that won't be thermally challenged need to look no further.

As testimonials on power and performance continue to roll in for the Bensley line, it's easy to lose track of the fact that Intel is also making sure that the platform is priced below its competition at each performance-per-watt point. In essence, Intel is doing whatever it takes to remove all objections against buying a Xeon solution. While the company hasn't surrendered tremendous ground in the server space, it naturally covets each share percentage point and is willing to fight for them tooth and nail. If it takes lower pricing to win back some of these buyers, then lower pricing is what they'll get.

"There are plenty of folks who just care about the bottom line," says analyst Alex Hererra. "They may be doing volume deals through the channel and just want machines that work and don't care what the label says on the front of the box. They care about the P&L sheet and keeping the engineers busy. Those folks bought Intel in the past and would be happy to switch back. Most of them have been waiting for things to fall into place so they could go back to the regular way they do business."

By now, it should be plain that Bensley is exactly what the server market has been craving. Yes, competing server CPUs can deliver speedy processing at moderate pricing levels, but Bensley uses that limited value proposition as a starting point and keeps on building.
With faster performance, lower pricing, superior power conservation, better vendor support, and swaths of value-add technology advances inherent to the Bensley platform (not the least of which is the immensely robust and flexible virtualization technology), it's hard to imagine how resellers won't be able to leverage Bensley into more system sales, higher margins, and recurring revenue streams. Once again, Intel has demonstrated that the platform approach to technology can facilitate far more than chip sales; it can create and build entire channel business segments. If 2P servers haven't been a significant part of your reseller offerings in the past, Bensley may just be the door through which you can step into a whole new world of sales.
Copyright © 2006 RAM Magazine. All rights reserved.
Do not duplicate or redistribute in any form.