Nvidia’s new GeForce3 graphics processing unit (GPU) has been garnering attention from gamers and 3D pros since the company announced the product at Macworld Expo Tokyo. With 57 million transistors, the GPU is more complex in many ways than the CPUs that drive Macs and PCs. But instead of relying purely on brute strength, Nvidia has improved performance by adding functions that remove common bottlenecks or process data more efficiently. The GPU also includes a feature called the nFiniteFX Engine that allows game developers to add custom real-time effects to their titles.
To take advantage of the nFiniteFX Engine, developers must rewrite their game titles — or more precisely, their game development engines — to access its features. However, most of the GeForce3’s enhancements promise substantial performance gains in any game or application that’s based on OpenGL, the 3D graphics standard that Apple has adopted for Mac OS 9 and Mac OS X.
The basics of OpenGL
OpenGL is an application programming interface (API) that provides system-level functions permitting real-time, hardware-accelerated rendering of 2D and 3D graphics. Originally developed by Silicon Graphics for use in its Unix workstations, it is now a cross-platform standard governed by the OpenGL Architecture Review Board. Versions of OpenGL are bundled with the Mac OS, Windows, Unix and Linux systems. Although it includes 2D graphics functions, it is primarily used in 3D environments.
For software developers, OpenGL offers two key benefits. First, as an API, it provides a set of basic graphics routines that developers can access through languages such as C++. This makes it easier for programmers to add lighting, rendering and other 3D features to their software because they can call on routines in OpenGL instead of writing their own. Second, and more important, OpenGL’s functions can be accelerated by OpenGL-savvy GPUs, such as ATI’s Radeon and Nvidia’s GeForce series. These cards are optimized for the kinds of calculations needed to render and display high-resolution graphics, and by offloading the calculations from the CPU, they provide a vast improvement in graphics performance.
Mac applications
Although OpenGL has many applications, Mac users are most likely to see the API in games and 3D authoring software. In the latter, OpenGL is most often used for interactive, real-time rendering during modeling and scene composition. Instead of working with wireframe previews of models, you can view them with lights, shading and, in some cases, textures. However, 3D authoring programs typically do not use OpenGL for final rendering. Instead, they employ their own rendering engines, which often include such features as raytracing, a highly realistic but computationally intensive form of rendering, and radiosity, which accounts for the effects of light bouncing off objects in the scene. OpenGL does not support raytracing, radiosity or other sophisticated rendering effects, and thus its interactive rendering features can only approximate what a final scene will look like.
OpenGL plays a bigger role in computer games that require fast real-time rendering of 3D graphics. When an alien monster with green scales pops up in a first-person shooter, OpenGL is likely driving the action. (You can find a list of OpenGL-based Mac applications here.)
Apple’s adoption of OpenGL, shortly after Steve Jobs’ return, prompted many game developers to port their titles to the Mac. Apple had previously used QuickDraw 3D, a proprietary 3D graphics API that many developers avoided due to the small size of the Mac market. (QuickDraw 3D lives on in the form of Quesa, an open-source implementation that runs on top of OpenGL.) Now-defunct 3dfx, developer of the Voodoo graphics cards, used a 3D API called Glide.
Graphics cards do not necessarily accelerate all OpenGL functions. For example, early GPUs targeted at gamers focused almost exclusively on setup and rendering, the process of converting 3D geometry into pixels and then creating the final image. However, later cards, including the Radeon and GeForce series, also accelerate transform and lighting functions, features commonly associated with graphics cards aimed at 3D pros (“transform” refers to the calculations that move, rotate and scale geometry in a scene and project it onto the screen).
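To make “transform and lighting” concrete, here is a minimal Python sketch of the kind of per-vertex arithmetic involved. It is an illustration only, not OpenGL code and not how any particular card implements the work; the function names (rotate_y, translate, diffuse_intensity) are hypothetical, and the lighting model assumed is simple diffuse (Lambert) shading with unit-length vectors.

```python
import math

def rotate_y(vertex, angle_deg):
    """Transform step: rotate a 3D point about the Y axis."""
    x, y, z = vertex
    a = math.radians(angle_deg)
    return (x * math.cos(a) + z * math.sin(a),
            y,
            -x * math.sin(a) + z * math.cos(a))

def translate(vertex, offset):
    """Transform step: move a 3D point by an offset."""
    return tuple(v + o for v, o in zip(vertex, offset))

def diffuse_intensity(normal, light_dir):
    """Lighting step: brightness from the angle between the surface
    normal and the direction to the light (both assumed unit length).
    Surfaces facing away from the light get 0."""
    dot = sum(n * l for n, l in zip(normal, light_dir))
    return max(0.0, dot)

# Rotate one corner of a model a quarter turn, then push it 5 units away.
corner = (1.0, 0.0, 0.0)
moved = translate(rotate_y(corner, 90), (0.0, 0.0, -5.0))
# moved is approximately (0.0, 0.0, -6.0)

# A surface facing the light directly is fully lit.
brightness = diffuse_intensity((0.0, 0.0, 1.0), (0.0, 0.0, 1.0))  # 1.0
```

A game repeats calculations like these for every vertex in every frame, which is why moving them from the CPU to the graphics card matters.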
No matter which OpenGL functions a card might accelerate, the technology renders geometry, lights and surfaces according to its own rules. As a result, the graphics tend to have a similar look, and game developers must use workarounds to get certain kinds of real-time effects. For example, OpenGL cannot generate the realistic shadows made possible by raytracing, but developers can compensate by explicitly adding the appropriate shadow effects.
Enter the GeForce3
The GeForce3 includes several features designed to improve performance and image quality in any OpenGL-based application. One, high-resolution antialiasing, employs a technique known as “multisampling” to speed antialiasing (antialiasing is a common filtering technique used to soften jagged edges in images). Most graphics cards implement antialiasing through “supersampling,” in which the GPU performs filtering operations on a version of the image that’s two or four times bigger than the final picture shown on screen. Supersampling at 4x offers the best quality, but at a bigger cost in rendering speed. Nvidia says that its multisampling technique, which performs antialiasing within the GPU as pixels are being processed, offers the quality of 4x supersampling at the speed of 2x supersampling.
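The supersampling approach described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the scene function is a made-up diagonal edge, the filter is a plain average, and the factor is applied per axis (so a factor of 2 means four samples per final pixel). It does not represent Nvidia’s multisampling method.

```python
def scene(u, v):
    """Hypothetical scene: white (1.0) on one side of a diagonal edge,
    black (0.0) on the other. u and v run from 0 to 1."""
    return 1.0 if u > v else 0.0

def render(width, height):
    """Sample the scene once per pixel, at each pixel's center."""
    return [[scene((x + 0.5) / width, (y + 0.5) / height)
             for x in range(width)]
            for y in range(height)]

def downsample(image, factor):
    """Average each factor-by-factor block into one output pixel."""
    out = []
    for y in range(len(image) // factor):
        row = []
        for x in range(len(image[0]) // factor):
            block = [image[y * factor + dy][x * factor + dx]
                     for dy in range(factor)
                     for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

def supersample(width, height, factor):
    """Render at a larger size, then filter down to the final size."""
    return downsample(render(width * factor, height * factor), factor)

jagged = render(4, 4)          # every pixel is either 0.0 or 1.0
smooth = supersample(4, 4, 2)  # pixels on the edge become 0.25
```

The cost is visible in the code: the supersampled version evaluates the scene four times as often as the plain render to produce the same number of final pixels.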
The GPU also employs a memory architecture that uses four independent memory controllers to manage the flow of data between the core GPU and graphics memory. This is designed to ensure that the system uses memory, a major bottleneck in any graphics card, as efficiently as possible. Other enhancements include use of “higher order surfaces,” a highly efficient method of describing 3D geometry, to speed data transfer from the CPU, and a hardware-based compression scheme that speeds data transfer to and from the z-buffer, an area of memory that stores the depth of each pixel. A visibility subsystem ensures that only the pixels seen in the final image will be rendered. This effectively doubles bandwidth, because in a typical scene, only half the pixels that are processed will actually appear in the image.
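The role of the z-buffer, and the reason hidden pixels waste effort, can be shown with a generic depth test. The Python sketch below is not a model of the GeForce3’s visibility subsystem, whose internals are not described here; the rasterize function and the sample fragments are hypothetical. It simply keeps, for each pixel, the nearest surface submitted so far.

```python
def rasterize(fragments, width, height):
    """Apply a depth test to a list of (x, y, depth, color) fragments.
    Smaller depth means closer to the viewer."""
    depth = [[float("inf")] * width for _ in range(height)]
    color = [[None] * width for _ in range(height)]
    writes = 0
    for x, y, z, c in fragments:
        if z < depth[y][x]:      # closer than what is stored: keep it
            depth[y][x] = z
            color[y][x] = c
            writes += 1
    return color, depth, writes

# Four fragments compete for a 2 x 1 image.
fragments = [
    (0, 0, 5.0, "wall"),
    (0, 0, 2.0, "monster"),  # in front of the wall
    (0, 0, 9.0, "sky"),      # behind everything, rejected
    (1, 0, 3.0, "floor"),
]
color, depth, writes = rasterize(fragments, 2, 1)
# color is [["monster", "floor"]]
```

Here four fragments were processed but only two appear in the final image, which mirrors the article’s point: work spent on the wall and sky pixels was thrown away, so rejecting such pixels early frees memory bandwidth for the ones that matter.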
At Macworld Expo Tokyo, Apple offered a comparison of the four graphics cards now available for the Mac, reporting that ATI’s Rage 128 Pro delivered 10 frames per second, the Radeon 20 frames per second, the GeForce2 MX 33 frames per second, and the GeForce3 64 frames per second, when running Quake 3 Arena at 1280 x 1024 pixels at 32-bit pixel depth with sound turned on.
nFiniteFX
Beyond these performance enhancements, the GeForce3’s new nFiniteFX Engine allows developers to extend the capabilities of OpenGL, permitting real-time gaming effects that would otherwise be impossible or require workarounds. It has two components: Vertex Shaders and Pixel Shaders.
Vertex Shaders allow developers to extend OpenGL’s transform and lighting functions, offering enhanced control over 3D geometry. For example, with Vertex Shaders, developers can create models with skin and clothing that stretch and crease more realistically. In a first-person shooter, Vertex Shaders would allow developers to create scenes where bullets leave dents in objects. The technology also enables custom lighting, lens, morphing and fog effects.
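Conceptually, a vertex shader is a small developer-written program that runs once for every vertex in a model. The Python sketch below imitates that idea for the bullet-dent example; it is an analogy only, since actual shaders are written for the GPU, and the names dent_shader and run_vertex_program, along with the falloff formula, are hypothetical.

```python
import math

def dent_shader(position, normal, impact, radius, depth):
    """Per-vertex program: push vertices near an impact point inward
    along their surface normal, with the effect fading at the rim."""
    dist = math.dist(position, impact)
    if dist >= radius:
        return position                  # outside the dent: unchanged
    falloff = 1.0 - dist / radius        # 1 at the center, 0 at the rim
    return tuple(p - n * depth * falloff
                 for p, n in zip(position, normal))

def run_vertex_program(vertices, normals, program, **params):
    """Apply the same program to every vertex, as a GPU would."""
    return [program(p, n, **params) for p, n in zip(vertices, normals)]

# Three vertices on a flat panel facing +Z; a bullet strikes the origin.
vertices = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (2.0, 0.0, 0.0)]
normals = [(0.0, 0.0, 1.0)] * 3
dented = run_vertex_program(vertices, normals, dent_shader,
                            impact=(0.0, 0.0, 0.0), radius=1.0, depth=0.2)
# The first vertex sinks by 0.2, the second by 0.1, the third is untouched.
```

Because the program is supplied by the developer rather than fixed in the hardware, swapping in a different function yields a different effect, such as morphing or custom fog.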
Pixel Shaders allow developers to create games with more-realistic textures than those permitted by OpenGL. For example, an object can have multiple animated textures, such as bumps combined with reflective material, that OpenGL otherwise cannot generate in real time. (Most high-end 3D authoring programs can produce these effects in their final-rendering engines.)
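A pixel shader applies the same idea at the level of individual pixels. The Python sketch below is an analogy with hypothetical names (pixel_program, shade_image) and a deliberately simple formula: it darkens a base color by a bump value, then blends in a reflection color. Real pixel shaders run on the GPU and can be far more elaborate.

```python
def pixel_program(base, bump, reflection, reflectivity=0.5):
    """Per-pixel program: combine a bump-shaded base color with a
    reflection color. Colors are (red, green, blue) from 0.0 to 1.0;
    bump is a brightness factor from 0.0 to 1.0."""
    lit = tuple(channel * bump for channel in base)
    return tuple((1.0 - reflectivity) * l + reflectivity * r
                 for l, r in zip(lit, reflection))

def shade_image(base_tex, bump_tex, reflection_tex, reflectivity=0.5):
    """Run the pixel program over every pixel of three same-size textures."""
    return [[pixel_program(b, h, r, reflectivity)
             for b, h, r in zip(base_row, bump_row, refl_row)]
            for base_row, bump_row, refl_row
            in zip(base_tex, bump_tex, reflection_tex)]

# A one-pixel example: red paint, half-shadowed by a bump, reflecting blue sky.
result = shade_image([[(1.0, 0.0, 0.0)]], [[0.5]], [[(0.0, 0.0, 1.0)]])
# result is [[(0.25, 0.0, 0.5)]]
```

Animating the effect would mean feeding the program different bump or reflection textures on each frame.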
To employ the nFiniteFX Engine, developers must explicitly write code for it. Some developers, such as John Carmack of id Software, who participated in a demonstration of the GeForce3 during Steve Jobs’ Macworld Expo Tokyo keynote, have expressed enthusiasm for the technology, while others seem to be taking a “wait and see” approach. However, it appears that the GPU will have sufficient critical mass to draw significant developer support: the GeForce3 will be available for the Mac and Windows, and Microsoft will use the underlying technology in its forthcoming Xbox game console.
Apple now offers the GeForce3 as a $350 build-to-order option for the Power Mac G4. You’ll also be able to buy a $600 stand-alone version beginning in April.