NVIDIA® OptiX™ Ray Tracing Engine Unveiled at SIGGRAPH 2009 – and a Different Future For Games Ray Tracing

In Computer Hardware, Electronics, Gaming on August 5, 2009 at 4:31 pm

NVIDIA® OptiX™ at SIGGRAPH 2009

NVIDIA announced its OptiX real-time ray tracing engine at SIGGRAPH 2009 yesterday. You may recall my previous article on how GPU-based ray tracing may become the future of 3D graphics in games with the release of the GT300 family of GPUs. The fact remains that rasterised graphics ARE currently much faster than ray traced rendering, so much more powerful GPUs will be required for real-time ray tracing in games.

As David Kirk of NVIDIA put it, “Ray tracing is the technology of the future and it always will be!”

NVIDIA OptiX real-time ray tracing engine

The new APIs/engines that NVIDIA revealed at SIGGRAPH add further support for application programmers. The OptiX interactive engine is a programmable ray tracing pipeline that lets developers use the C programming language to add ray tracing to their applications.

NVIDIA® OptiX™ engine for real-time ray tracing

NVIDIA® SceniX™ engine for managing 3D data and scenes

NVIDIA® CompleX™ engine for scaling performance across multiple GPUs

NVIDIA® PhysX® 64-bit engine for real-time, hyper-realistic physical and environmental effects

It seems in addition to the three new engines, PhysX is getting a 64-bit facelift. More details on the engines can be found here.

For 3D applications where speed isn’t quite as crucial, ray tracing on the GPU is a far more realistic goal. How efficiently the GT300 architecture will handle real-time ray tracing remains to be seen, but I suspect it just may be possible in a number of cases. At the time I wrote the article on the GT300 and possible future ray tracing applications in games, the Quake III/IV ray tracing algorithm I talked about was designed to run on multi-core/hyperthreaded CPUs rather than GPUs. Ray tracing is one application where performance scales extremely well with threading, because the rays for each pixel can be traced independently of every other pixel, so you can expect a quad-core/8-thread Core i7 processor to offer much better performance than, say, a Core 2 Duo E8400 (2 cores/2 threads). Intel intends to release LGA 1366 hexacore processors when its scaled-down ‘Lynnfield’ processors hit the market. These hexacore Nehalems should allow 12 simultaneous threads of execution.
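To make the scaling point concrete, here is a minimal sketch in Python (my choice for illustration; it is not the Quake ray tracer, and not OptiX). It traces one primary ray per pixel against a single hard-coded sphere and hands each image row to a pool of worker threads. The scene, the image size and all the function names are assumptions made up for this example. Note that in standard CPython the interpreter lock stops pure-Python threads from running truly in parallel, so this shows the structure of the work split rather than a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor
import math

# Hypothetical toy scene: camera at the origin looking down +z at one sphere.
WIDTH, HEIGHT = 32, 16
SPHERE_CENTER = (0.0, 0.0, 3.0)
SPHERE_RADIUS = 1.0


def hits_sphere(direction):
    """Return True if a ray from the origin along `direction` hits the sphere."""
    cx, cy, cz = SPHERE_CENTER
    dx, dy, dz = direction
    # Solve |t*d - c|^2 = r^2, a quadratic a*t^2 + b*t + c = 0 in t.
    a = dx * dx + dy * dy + dz * dz
    b = -2.0 * (dx * cx + dy * cy + dz * cz)
    c = cx * cx + cy * cy + cz * cz - SPHERE_RADIUS ** 2
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return False  # the ray's line misses the sphere entirely
    # Count only intersections in front of the camera (t > 0).
    t_far = (-b + math.sqrt(disc)) / (2.0 * a)
    return t_far > 0.0


def render_row(y):
    """Trace every pixel in row y; needs no data from any other row."""
    row = []
    for x in range(WIDTH):
        # Map the pixel centre to [-1, 1] on an image plane at z = 1.
        u = (x + 0.5) / WIDTH * 2.0 - 1.0
        v = 1.0 - (y + 0.5) / HEIGHT * 2.0
        row.append("#" if hits_sphere((u, v, 1.0)) else ".")
    return "".join(row)


def render(workers):
    """Render the image, distributing rows across `workers` threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(render_row, range(HEIGHT)))


if __name__ == "__main__":
    print("\n".join(render(workers=4)))
```

Because no row depends on any other, the image comes out identical whether one worker or eight do the job. That independence is what lets a real ray tracer keep every hardware thread busy.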

Ray Tracing, Hyperthreading, and GPUs

Here’s a quick explanation of these concepts for the not-so-tech-savvy. ‘Threads’ refers to ‘threads of execution’, i.e. it roughly translates to how many programs or processes can run on the CPU at the same time. This explanation is simplistic and ignores complexities such as multithreaded Operating Systems where one process has more than a single thread of execution. A process is a ‘unit of execution’ or ‘unit of executable code’ that shares processing time and exists in memory in its own clearly defined address space. In ‘advanced microprocessors’ like the type used in modern computers (i.e. x86-compatible CPUs from the 80286 onwards) processes are ‘protected by the hardware’ from interfering with the address space (or ‘memory allocated to that program or process’) of another process. However, a process may have more than one thread that can execute simultaneously. So while a process is a ‘single unit of execution that HAS ITS OWN address space in memory,’ a thread is a ‘single unit of execution that may SHARE its address space WITH OTHER THREADS.’

The corollary of these definitions: a process may or may not have more than one ‘thread of execution’. You can think of this simply as follows: when using Excel you may ask it to sum up a huge column of numbers, and while it is doing that you may continue editing the spreadsheet without having Excel ‘lock up’. In a single-threaded program, unless it is very cleverly designed, you can expect the program to lock up while it performs a time-consuming task such as a large number of calculations. In a multithreaded program, the software will be designed to fork off a thread that does the calculations while the main thread continues to respond to user requests.
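The spreadsheet scenario can be sketched in a few lines of Python. This is only an illustration of the idea, not how Excel is actually built; the names `sum_column`, `result` and `edits` are hypothetical. A worker thread adds up a large column while the main thread carries on recording edits. The worker writes its answer into a dictionary created by the main thread, which works because threads in the same process share an address space.

```python
import threading


def sum_column(values, result):
    """Worker: add up the column and store the total in the shared dict."""
    result["total"] = sum(values)


# A large 'column' of numbers standing in for the spreadsheet data.
column = list(range(1_000_000))
result = {}  # created by the main thread, written to by the worker

worker = threading.Thread(target=sum_column, args=(column, result))
worker.start()  # the calculation now runs in its own thread

# Meanwhile the main thread stays free to handle 'user requests'.
edits = []
for cell in ("A1", "B2", "C3"):
    edits.append(cell)

worker.join()  # wait for the worker before reading its result
print(result["total"], edits)
```

A single-threaded version would have to finish `sum(values)` before it could record the first edit, which is the ‘lock up’ described above.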

Finally, it is threads, rather than processes themselves, that compete for a time-slice on the CPU. The Operating System uses a scheduling algorithm to give each competing thread a share of processing time. If you have more cores and a hyperthreaded CPU (i.e. one where a single core can execute more than one thread simultaneously), then the CPU can service more threads simultaneously. This is multitasking.

NVIDIA Demonstration

Now, NVIDIA has revealed its OptiX engine to be the first true GPU-accelerated ray tracing engine. NVIDIA used an automotive ray tracing demo to show off some of the features of the new engine. The OptiX real-time demonstration ran on NVIDIA’s Quadro Plex 2100 D4 multi-GPU Visual Computing Systems and managed to produce 30 fps at 1080p resolution. These are professional workstation-class systems used for graphics rendering and image processing, so 30fps at 1080p on this hardware is still a far cry from the type of performance needed for gaming, the coming GT300 architecture notwithstanding.
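To put the demo figure in perspective, here is a rough back-of-the-envelope calculation in Python. It assumes 1080p means 1920×1080 pixels and, as a simplification of my own, exactly one primary ray per pixel. Real scenes with reflections, refraction and shadows fire several more rays per pixel, so treat these numbers as a lower bound rather than a measurement of the demo.

```python
WIDTH, HEIGHT = 1920, 1080  # 1080p frame


def primary_rays_per_second(fps, rays_per_pixel=1):
    """Rays needed per second; rays_per_pixel=1 is a simplifying assumption."""
    return WIDTH * HEIGHT * rays_per_pixel * fps


def frame_budget_ms(fps):
    """Time available to render a single frame, in milliseconds."""
    return 1000.0 / fps


for fps in (30, 60):
    print(f"{fps} fps: {primary_rays_per_second(fps):,} rays/s, "
          f"{frame_budget_ms(fps):.1f} ms per frame")
```

At 30 fps that is about 62 million primary rays every second, with roughly 33 ms to finish each frame. Doubling the frame rate to 60 fps doubles the ray count and halves the time budget.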

NVIDIA OptiX real-time ray tracing demo

Previously, programmers could already use CUDA and C/C++ to harness the parallel processing architecture of the GPU and run ray tracing algorithms on it. As far as I’m aware, there are no current third-party engines for ray tracing using CUDA. However, NVIDIA demonstrated at NVISION 2008 how hybrid CUDA and OpenGL APIs could be used to implement ray tracing algorithms that run on the GPU. There is an excellent slide show here which talks about CUDA-powered ray tracing and dispels some myths in the ray tracing vs rasterisation debate.

The Future for Gaming

As multi-threaded processors capable of executing 8 or more simultaneous threads become mainstream, it may be possible to offload more processing to the CPU from the GPU (the Wheel of Reincarnation that I spoke about in my previous article on the topic). Current games don’t make much use of powerful processors at all. For this reason, the Core 2 Duo E8400 and E8500 remain some of the most sought-after processors for gaming rigs. The costs are kept low, and the dual cores and higher clocks of these chips allow for gaming performance that matches Intel’s fastest Nehalem CPUs. In the future games will slowly start to use increasing amounts of CPU power; however, it seems CPU development will always be well ahead of what mainstream game developers end up using.

Perhaps the future lies in running some of the ray tracing algorithms on the CPU, then. The CPU and GPU combination may work well together as GPU power increases: while the GPU handles most of the work involving textures, models, rendering, and outputting of display data, the CPU may use its extra thread capability to carry out ray tracing calculations for the GPU. I can see such a model working really well. Even with the coming of ultra-powerful GT300 and RV870 GPUs, with their claimed substantial increase in power to a baseline of 3 TFLOPS, it still seems unlikely that ray tracing for high-end games will become a mainstream possibility. I’m making this judgement based on the 30fps @ 1080p achieved by the Visual Computing System (VCS) used by NVIDIA at the demo. 30fps is the bare minimum that the majority of gamers will accept, and the graphics output doesn’t look good enough to justify this substantial reduction in performance for gaming.

A hybrid solution using extra CPU cores/hyperthreading seems to be the future.

If you have different opinions on this, please write them in the comments below. I’ll conclude this article with the NVIDIA press release on their OptiX and related engines.

NVIDIA Press Release

As the world’s first interactive ray tracing engine to leverage the GPU, the NVIDIA OptiX engine is a programmable ray tracing pipeline enabling software developers to easily bring new levels of realism to their applications using traditional C programming. By tapping into the massively parallel computing power of NVIDIA® Quadro® processors, the OptiX engine greatly accelerates the ray tracing used across a spectrum of disciplines, including: photorealistic rendering, automotive styling, acoustical design, optics simulation, volume calculations and radiation research. Application developers are utilizing the OptiX engine to redefine what’s possible for designers, engineers and researchers.

“In one year, NVIDIA has gone from proving interactive GPU ray tracing is possible, to making it available to all,” said Jon Peddie, founder and president of Jon Peddie Research. “Intricate design tasks, such as examining the play of reflection and refraction across surfaces and within glass, can now be examined in real-time by utilizing the OptiX acceleration engine running on Quadro processors. This is a phenomenal milestone for developers and designers alike.”

“Thousands of applications are being created today that harness the phenomenal power of GPUs, a clear sign that GPU computing has reached a tipping point. The world of computing is shifting from host-bound processing on CPUs to balanced co-processing on GPUs and CPUs,” said Jeff Brown, general manager, Professional Solutions, NVIDIA. “NVIDIA application acceleration engines arm developers with the tools they need to further revolutionize both real-time graphics and advanced data analysis.”

The NVIDIA SceniX scene management engine provides the interactive core for demanding real-time, professional 3D graphics applications. Whether used in leading products such as RTT DeltaGen, Autodesk Showcase and Anark Media Studio, or in scores of in-house tools used for advanced visualization, simulation, broadcast graphics, medical imagery, and energy exploration, developers look to the SceniX engine for the interactive framework to manage 3D data and convey results in real-time at high fidelity.

The NVIDIA CompleX scene scaling engine enables applications to maintain interactivity when working with extremely large and complex models. By automatically utilizing the combined memory and processing power of multiple GPUs within Quadro Plex visual computing systems, applications that utilize the CompleX engine enable users to explore and visualize all their data in full context, instead of piecemeal.

The NVIDIA PhysX 64-bit physics engine brings hyper-realistic, real-time physics to professional applications. Already a proven and popular solution within the computer games industry, the 64-bit version of PhysX will permit more accurate calculations on far larger data sets for engineers, designers and animators wanting to interrogate their data, model physical properties and breathe life into their work.

“The SceniX acceleration engine has been a critical part of our success in the automotive styling industry,” said Christian Matzen, COO, ICIDO, a global leader in virtual engineering solutions. “Based on the ease of integrating OptiX within SceniX, and its stunning visual results, we plan on delivering interactive ray tracing to our design customers later this year.”

“The CompleX engine is essential for our application to accommodate the massive data sets of customers like StatoilHydro,” said Thorolf Horn Tonjum, Director of R&D Stormfjord, a Norwegian development company serving the visualization needs of the energy industry. “By using the SceniX engine to power our scene graph, we easily incorporated the CompleX engine to keep navigation smooth for 10 GB scenes, and the PhysX 64-bit engine to study the challenges off shore oil rigs must face. These engines from NVIDIA accelerate not only our product, but also our time to market.”

NVIDIA will be showcasing the new suite of application acceleration engines this week at the SIGGRAPH 2009 conference and exhibition in New Orleans; booth #2101. For more information on NVIDIA at SIGGRAPH, visit: http://www.nvidia.com/engines.
