By Chris Angelini
I could tell you which processors are the fastest in which tests and why. Then I could do the same with video cards, memory modules, and motherboards. If you really wanted, I could even show you the subtle performance differences between onboard audio and hardware-accelerated processing. But there's a good chance that none of my numbers would matter to you, and for more than one reason.

First, benchmark results are very configuration-dependent. I might be comparing an Athlon 64 FX-57 to an Intel Pentium Extreme Edition 955. However, the other components used to test will very likely differ from the hardware or systems you're selling to customers. To get a really solid idea of how your offerings stack up to each other, you'll need to run your own numbers.

Then there's the issue of application. If a vendor drops by to present a list of 20 gaming benchmarks and explain why its products are the best, you might be inclined to blindly believe those results. But what if you specialize in rendering workstations, which are not optimized the same way as gaming boxes? There is no direct translation. You need tests that reflect the work your customers do.

And that leads to a final consideration: the subtle differences between synthetic metrics and real-world benchmarks. Though the two classes look similar on paper, lab rats like me spend a lot of time fine-tuning benchmark suites, balancing the two against each other.

A reseller with the technical know-how to run meaningful tests and present results is at a great advantage. For starters, it puts you in a much better position to answer questions should a customer ask them. Of course, you don't want to overwhelm anyone with technical mumbo-jumbo. At the same time, walking into a proposal meeting prepared with factual data on the equipment you're selling is always a good thing. There's also the benefit of leveraging benchmark data to make your own configuration decisions.
If you're wondering whether to go ATI or NVIDIA for a top gaming machine, allow test results to illuminate your path.

Sane Benchmarking in an Insane World

Running tests on computer hardware might not be brain surgery, but it's still a science that, done improperly, yields absolutely meaningless results. Benchmarking best practices help establish an environment conducive to accuracy and precision. Don't nod off here; this will only take a second. Precision reflects the repeatability of a measurement, even if it doesn't represent an actual customer experience. Accuracy establishes the quality of a result. In this case, that's its nearness to actual video encoding performance, gaming frame rates, and so on.

The first real imperative is standardization across hardware and software. Systematically reducing variables helps improve comparability. On the hardware side, that means the test platforms you set up for a processor comparison sport similar graphics cards, memory capacities, and storage subsystems. Wherever possible, use the same motherboards and memory timings, too. The rationale here should be obvious. If you're only dealing with one variable, all performance variation is attributable to the component in question.
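The one-variable rule for hardware can be checked mechanically before any test is run. The following is a minimal Python sketch, not a tool the article describes: the `TestPlatform` fields, the helper names, and the example component strings are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TestPlatform:
    """Hypothetical record of the components in one test system."""
    cpu: str
    motherboard: str
    graphics_card: str
    memory: str
    storage: str


def differing_components(a, b):
    """Return the names of the fields whose values differ between two platforms."""
    first, second = asdict(a), asdict(b)
    return [name for name in first if first[name] != second[name]]


def is_controlled_comparison(a, b, component):
    """True when the component under test is the only difference between platforms."""
    return differing_components(a, b) == [component]


# Illustrative values only; substitute the parts actually on the bench.
system_a = TestPlatform("Pentium 4 3.4 GHz", "Intel 925X", "GeForce 7800 GT",
                        "1 GB DDR2", "single SATA drive")
system_b = TestPlatform("Pentium 4 3.4 GHz", "Intel 925X", "Radeon X850 XT PE",
                        "1 GB DDR2", "single SATA drive")

print(differing_components(system_a, system_b))
print(is_controlled_comparison(system_a, system_b, "graphics_card"))
```

If `differing_components` reports more than the part being compared, any performance gap between the two systems cannot be pinned on that part alone.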
Software is a little trickier because it involves several layers of optimization. There's the BIOS, with all of its performance-impacting knobs and dials. Then you have drivers to keep consistent. Windows XP has its own bevy of settings to change, such as disabling System Restore and the Automatic Updates service. (You don't want either service to start in the middle of a test and throw off your results.) Lastly, it's important to prep each individual application for testing. Some games control quality settings, while others look to your graphics driver. Video programs have their own idiosyncrasies to deal with, and so do office apps.

Of course, it's not always possible to kill extraneous variables, and in some cases that's perfectly normal. Pitting Athlon 64 and Pentium 4 systems against each other naturally necessitates a DDR versus DDR2 discussion. And demonstrating the benefits of PCI Express by comparing to AGP involves some obvious interface complications. Fortunately, there are ways around such inconvenient hurdles.

Take that last example. It's actually fairly easy to showcase the virtues of PCIe. Build one 875P platform equipped with a 3.4 GHz Socket 478 Pentium 4 and Radeon X850 XT PE (AGP) graphics card. Put it up against an Intel 925X motherboard, 3.4 GHz LGA775 Pentium 4, and Radeon X850 XT PE (PCIe). By carefully selecting the programs you use to test, the influences of DDR and DDR2 memory can be filtered out. It's not magic or any sort of deliberate trickery. Just use a bit of homebrewed logic.

A graphics interface is designed to improve data flow between your video card and chipset. Games just so happen to be the most taxing graphics applications, even more so when you crank up display resolution and the other visual details. So there's a fair chance that testing an intensive first-person shooter at 1600x1200 with plenty of antialiasing will emphasize any performance difference between disparate technologies. Fairly straightforward, right?
Now grab a four-pack of Red Bull and get ready for an all-night testing marathon.

The Impartial Scientist

One thing I've learned from my years testing hardware is that it's extremely easy to introduce bias and very difficult to generate meaningful benchmark results. Certain developers spend a lot of time optimizing their code for Intel's NetBurst architecture. Others are motivated by ATI or NVIDIA to spend extra time improving performance on their respective products. Be aware of the underhanded politicking so that you build benchmark suites with a good distribution of different titles.

Believe me, it'd be a 10-minute exercise to compile a list of metrics that would make your Athlon 64 boxes dwarf competing Pentium 4 systems and vice versa. Intel does it for press briefings. AMD does, too. In fact, you can bet that every vendor with a marketing department runs tests stacked in its favor. This is one of the best reasons to run your own benchmarks.

The Intel Message

Vendors are always interested in what the mainstream press has to say about their products, and performance-oriented briefings are actually quite common, especially if your published test results are inconsistent with what others are seeing. Recently, Intel has adopted a keener interest in measuring performance. Its move coincides with the shift toward dual-core processing, which admittedly does turn the whole benchmarking world a little topsy-turvy.

You see, multi-core (dual, for now) appears to be the way of the future. And yet, there's a glut of solid tests capable of exploring the potential of existing single-core hardware. "But wait," you say. "Doesn't the benchmark landscape reflect applications that customers are currently running?" Indeed it does. However, moving forward, your customers are going to see a focused drive toward optimizations for multi-core processors.
In other words, existing benchmarks deliver precise insight into performance but fail to accurately represent the full scope of a system's potential. In VAR-speak, your suite of tests might be reporting lower benchmark scores than the hardware deserves. If you want to sell dual-core, you'll have to start by figuring out the best way to quantify the technology's benefits.
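One common way to put a number on a dual-core benefit is to time the same multi-threaded workload on a single-core and a dual-core system and report the speedup. The Python sketch below shows only the arithmetic; the function names are assumptions, the timings are hypothetical rather than measured, and the comparison is cleanest when the two processors run at the same clock speed.

```python
def speedup(single_core_seconds, dual_core_seconds):
    """How many times faster the dual-core run finished than the single-core run."""
    if single_core_seconds <= 0 or dual_core_seconds <= 0:
        raise ValueError("run times must be positive")
    return single_core_seconds / dual_core_seconds


def parallel_efficiency(single_core_seconds, dual_core_seconds, cores=2):
    """Fraction of the ideal speedup achieved; 1.0 means perfect scaling."""
    return speedup(single_core_seconds, dual_core_seconds) / cores


# Hypothetical encoding times in seconds, for illustration only.
single = 120.0
dual = 75.0
print(f"speedup: {speedup(single, dual):.2f}x")
print(f"efficiency: {parallel_efficiency(single, dual):.0%}")
```

A speedup near 1.0 suggests the workload is not threaded at all, which is exactly the case where an older benchmark understates what the second core can do.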
That's not all, either. As if it weren't enough to have precision problems, Intel is calling the accuracy of game tests into question as well. You see, there's a lot going on behind the scenes when someone plays a game. You have graphics calculations, physics processing, artificial intelligence, sound, networking, and so on. Benchmark modes do away with most of that in an attempt to peg precision. The physics and AI are seen as variables capable of changing a frame rate between one run and the next. Killing them off successfully makes scores more repeatable, but it reduces applicability. Consequently, the company is developing its own vendor-agnostic tool to measure game performance and assign experiential scores to the most popular games. This way, you'll be able to rate entire computer systems using one playability score for several different titles.

Synthetic and Real-World Testing

Most tests are broken up between real-world and synthetic. You can think of the synthetic tests as those written specifically with an exacting workload in mind. They're best for getting under the hood and hashing out hard drive performance, for example, or memory bandwidth. Futuremark's PCMark05 (www.futuremark.com) is a synthetic benchmark. So is SiSoftware's Sandra (www.sisoftware.co.uk). Neither is a usable app with any purpose outside of measurement.

Application testing, on the other hand, is characterized by the use of real-world programs to measure performance. The tests generally aren't as focused since applications aren't written with benchmarking in mind. Nevertheless, numbers from a set of real-world programs give your customers some idea of what they could expect if they were to buy similar hardware. Game tests are technically considered real-world measurements. So are timed runs of a Windows Media Encoder project or WinRAR file compression sequence.
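A timed run of a real-world program can be scripted so that it repeats several times and reports how much the results vary, which is the precision discussed earlier. The Python sketch below is one possible harness under stated assumptions: `time_command` and `summarize` are hypothetical helper names, and the command it times is a placeholder that starts the Python interpreter, not an actual encoder or WinRAR invocation.

```python
import statistics
import subprocess
import sys
import time


def time_command(command, runs=5):
    """Run a command several times; return each wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the program exits with an error, so a failed
        # run is never silently recorded as a fast one.
        subprocess.run(command, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Mean, median, and run-to-run spread for a list of durations."""
    mean = statistics.mean(durations)
    spread = statistics.stdev(durations) if len(durations) > 1 else 0.0
    return {
        "mean": mean,
        "median": statistics.median(durations),
        "stdev": spread,
        # Coefficient of variation: spread as a percentage of the mean.
        "cv_percent": 100.0 * spread / mean,
    }


if __name__ == "__main__":
    # Placeholder workload; replace with the command line of the real test.
    placeholder = [sys.executable, "-c", "sum(range(1_000_000))"]
    print(summarize(time_command(placeholder, runs=3)))
```

A low coefficient of variation means the test is repeatable. It says nothing about whether the workload resembles what a customer actually does, which is the separate question of accuracy.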
Somewhere between synthetic and real-world you'll find hybrid metrics that employ usable programs but can't be run outside of a benchmark environment. BAPCo's SysMark 2004 SE (www.bapco.com) is an example, albeit a pricey one at $499. SysMark tests Internet content creation and office productivity performance using Adobe, Macromedia, Microsoft apps, and more, distilling scores into more specific sub-categories. The great thing about SysMark is that it does exploit multi-core processors, and it also accommodates the latest 64-bit version of Windows, facilitating additional comparison against older 32-bit software. Another popular hybrid is SPECviewperf (www.spec.org), a free conglomeration of graphics tests based on the most prolific OpenGL rendering programs out there, including 3ds max 3.1, SolidWorks 2004, Lightscape, Pro/ENGINEER, and a few others. Neither suite has anything to do with gaming, which works out well for your customers who really mean business.

So when would you use one type of test and not the other? Why even bother with synthetics when real-world measures tell the whole story? Knowing what to use when is just one of those traits that comes from experience. Overall, a synthetic works well for ultra-simplification and super in-depth stories. Because synthetic tests are benchmarks for the sake of benchmarking, they really require comparative numbers in order to derive meaning.

Futuremark's 3DMark05 (www.futuremark.com) is the perfect example. It can either set all of your options by default and spit back one easy-to-decipher score or break down individual capabilities of your graphics card. Either way, the test's results, reported as 3DMarks, are only significant if you have another platform to test against. However, if a customer has a question about Shader Model 3.0 support, running 3DMark05 through a SM 3.0 path and then again with SM 2.0 will yield the answers. That's the kind of flexibility a synthetic metric offers.
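Because a synthetic score only means something next to another score, it often helps to express every result relative to a baseline system. A minimal Python sketch of that normalization follows; the function name, the system labels, and the scores are hypothetical and are not real 3DMark05 results.

```python
def relative_scores(scores, baseline):
    """Express each system's score as a multiple of the baseline system's score."""
    if baseline not in scores:
        raise KeyError(f"unknown baseline system: {baseline}")
    base = scores[baseline]
    if base <= 0:
        raise ValueError("baseline score must be positive")
    return {name: score / base for name, score in scores.items()}


# Hypothetical synthetic scores, for illustration only.
results = {"System A": 5000, "System B": 6500}

for name, ratio in relative_scores(results, baseline="System A").items():
    print(f"{name}: {ratio:.2f}x baseline")
```

Presenting "1.30x the baseline system" to a customer tends to be easier to digest than a raw score with no context.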
Your real-world tests can stand on their own, though. Should someone ask, "Is a GeForce 7800 GT fast enough to play Call of Duty 2 at 1600x1200?" you could answer that. Indeed, at that resolution, they'd average around 75 frames per second, more than enough for fluid game play. "How about media encoding? Will a dual-core 2.8 GHz Pentium D run faster than a single-core Pentium 4 at 3.2 GHz?" Why, yes, it will. And here are the Dr. DivX numbers to prove it. I think you get the idea.

Hard Facts Matter

Busy VARs already have a lot on their plates without worrying about running performance numbers. Is there not value in knowing how your wares compare to other offerings, though? Increasingly, yes. Enthusiasts will do their own research, but you can't expect a small business customer to know how one processor compares against another and so on. Admittedly, many might not even care. It's the ones who do that make familiarizing yourself with performance worthwhile.

And again, this isn't all for your customer's benefit. Running tests for your own edification is a great way to establish recommendations. High-impact benchmarks might also help expose any stability weaknesses in your designs by applying uncommonly exacting loads you'd otherwise miss. Considering that many of the tools out there are free, there's really nothing for you to lose by diving deeper into the world of benchmarking.
Copyright © 2007 RAM Magazine. All rights reserved.
Do not duplicate or redistribute in any form.