Chapter 5: Cg and NVIDIA

Mark J. Kilgard
NVIDIA Corporation
Austin, Texas

This chapter covers both Cg and NVIDIA's mainstream GPU shading and rendering hardware.

First, the chapter explains NVIDIA's Cg Programming Language for programmable graphics hardware. Cg provides broad shader portability across a range of graphics hardware functionality (supporting programmable GPUs spanning the DirectX 8 and DirectX 9 feature sets). Shaders written in Cg can be used with OpenGL or Direct3D; Cg is API-neutral and does not tie your shader to a particular 3D API or platform. For example, Direct3D programmers can re-compile Cg programs with Microsoft's HLSL language implementation. Cg supports all versions of Windows (including legacy NT 4.0 and Windows 95 versions), Linux, Apple's OS X for the Macintosh, Sun's Solaris, and Sony's PlayStation 3.

Collected in this chapter are the following Cg-related articles:

Cg in Two Pages: As the title indicates, this article summarizes Cg in just two pages, including one vertex and one fragment program example.

Cg: A system for programming graphics hardware in a C-like language: This longer SIGGRAPH 2002 paper explains the design rationale for Cg.

A Follow-up Cg Runtime Tutorial for Readers of The Cg Tutorial: This article presents a complete but simple ANSI C program that uses OpenGL, GLUT, and the Cg runtime to render a bump-mapped torus using Cg vertex and fragment shaders from Chapter 8 of The Cg Tutorial. It's easier than you think to integrate Cg into your application; this article explains how!

Re-implementing the Follow-up Cg Runtime Tutorial with CgFX: This follow-up to the previous article re-implements the bump-mapped torus using the CgFX shading system. Learn how to decouple your shading content from application code.
NEW

Comparison Tables for HLSL, OpenGL Shading Language, and Cg: Are you looking for a side-by-side comparison of the features of the several hardware-accelerated shading languages available to you today?
UPDATED

Second, this chapter provides details about NVIDIA's GPU hardware architecture and API support. NVIDIA's latest GPUs are designed to fully support the rendering and shading features of both DirectX 9.0c and OpenGL 2.0. NVIDIA provides 3D game and application developers a choice of high-level shading languages (Cg, OpenGL Shading Language, or DirectX 9 HLSL) as well as full support for low-level assembly interfaces to shading.

Collected in this chapter are the following NVIDIA GPU-related articles:

GeForce 6 Architecture: This paper, reprinted from GPU Gems 2, is the most detailed publicly available description of the NVIDIA GeForce 6 Series of GPUs.

NVIDIA GPU Historical Data: This two-page table collects performance data over a 7-year period on NVIDIA GPUs. This table presents the historical basis for expecting continuing graphics hardware performance improvements. What do the financial types always say? Past performance is not a guarantee of future return.
UPDATED

NVIDIA OpenGL 2.0 Support: The GeForce 6 Series has the broadest hardware support for OpenGL 2.0 available at the time these notes were prepared. Key OpenGL 2.0 hardware-accelerated features include fully general non-power-of-two textures, multiple draw buffers (also known as multiple render targets or MRT), two-sided stencil testing, OpenGL Shading Language (GLSL), GLSL support for vertex textures, GLSL support for both per-vertex and per-fragment dynamic branching, separate blend equations, and point sprites.

SIGGRAPH Course 3, GPU Shading and Rendering

Cg in Two Pages

Mark J. Kilgard
NVIDIA Corporation
Austin, Texas
January 16

1. Cg by Example

Cg is a language for programming GPUs. Cg programs look a lot like C programs.
Here is a Cg vertex program:

    void simpletransform(float4 objectposition : POSITION,
                         float4 color          : COLOR,
                         float4 decalcoord     : TEXCOORD0,
                         float4 lightmapcoord  : TEXCOORD1,
                     out float4 clipposition   : POSITION,
                     out float4 ocolor         : COLOR,
                     out float4 odecalcoord    : TEXCOORD0,
                     out float4 olightmapcoord : TEXCOORD1,
                 uniform float brightness,
                 uniform float4x4 modelviewprojection)
    {
      clipposition = mul(modelviewprojection, objectposition);
      ocolor = brightness * color;
      odecalcoord = decalcoord;
      olightmapcoord = lightmapcoord;
    }

1.1 Vertex Program Explanation

The program transforms an object-space position for a vertex by a 4x4 matrix containing the concatenation of the modeling, viewing, and projection transforms. The resulting vector is output as the clip-space position of the vertex. The per-vertex color is scaled by a floating-point parameter prior to output. Also, two texture coordinate sets are passed through unperturbed.

Cg supports scalar data types such as float but also has first-class support for vector data types. float4 represents a vector of four floats. float4x4 represents a matrix. mul is a standard library routine that performs matrix-by-vector multiplication. Cg provides function overloading like C++; mul is an overloaded function, so it can be used to multiply all combinations of vectors and matrices.

Cg provides the same operators as C. Unlike C, however, Cg operators accept and return vectors as well as scalars. For example, the scalar, brightness, scales the vector, color, as you would expect.

In Cg, declaring a parameter with the uniform modifier indicates that its value is initialized by an external source that will not vary over a given batch of vertices. In this respect, the uniform modifier in Cg is different from the uniform modifier in RenderMan but used in similar contexts. In practice, the external source is some OpenGL or Direct3D state that your application takes care to load appropriately.
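As a rough CPU-side illustration (plain Python, not Cg, and not part of the original article), the arithmetic that simpletransform performs can be mimicked as follows. The helper names mul_matrix_vector and simple_transform, the identity matrix, and the sample batch are all hypothetical; the sketch assumes a 4x4 matrix stored as a list of rows and a position treated as a column vector.

```python
def mul_matrix_vector(matrix, vector):
    """Multiply a 4x4 matrix (list of rows) by a 4-component column vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def simple_transform(object_position, color, decal_coord, lightmap_coord,
                     brightness, model_view_projection):
    """Mimic the four outputs of simpletransform for a single vertex.

    brightness and model_view_projection stand in for the uniform
    parameters: the caller passes the same values for every vertex
    in a batch, while the other arguments vary per vertex.
    """
    clip_position = mul_matrix_vector(model_view_projection, object_position)
    out_color = [brightness * c for c in color]  # scalar scales each component
    # The two texture coordinate sets pass through unchanged.
    return clip_position, out_color, decal_coord, lightmap_coord


# Hypothetical uniform values shared by the whole batch.
IDENTITY = [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]
BRIGHTNESS = 0.5

# Hypothetical per-vertex inputs: position, color, decal and light map coords.
batch = [
    ([1.0, 2.0, 3.0, 1.0], [1.0, 0.5, 0.25, 1.0],
     [0.0, 0.0, 0.0, 1.0], [0.5, 0.5, 0.0, 1.0]),
    ([-1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 0.0, 1.0],
     [1.0, 0.0, 0.0, 1.0], [0.5, 1.0, 0.0, 1.0]),
]

transformed = [simple_transform(p, c, d, l, BRIGHTNESS, IDENTITY)
               for p, c, d, l in batch]
```

The split between the per-vertex tuple in batch and the two module-level constants mirrors the distinction between varying inputs and uniform parameters: only the former change from one call to the next.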
For example, your application must supply the modelviewprojection matrix and the brightness scalar. The Cg runtime library provides an API for loading your application state into the appropriate API state required by the compiled program.

The POSITION, COLOR, TEXCOORD0, and TEXCOORD1 identifiers following the objectposition, color, decalcoord, and lightmapcoord parameters are input semantics. They indicate how their parameters are initialized by per-vertex varying data. In OpenGL, glVertex commands feed the POSITION input semantic; glColor commands feed the COLOR semantic; glMultiTexCoord commands feed the TEXCOORDn semantics.

The out modifier indicates that the clipposition, ocolor, odecalcoord, and olightmapcoord parameters are output by the program. The semantics that follow these parameters are therefore output semantics. The respective semantics indicate that the program outputs a transformed clip-space position and a scaled color. Also, two sets of texture coordinates are passed through. The resulting vertex is fed to primitive assembly to eventually generate a primitive for rasterization.

Compiling the program requires the program source code, the name of the entry function to compile (simpletransform), and a profile name (vs_1_1). The Cg compiler can then compile the above Cg program into the following DirectX 8 vertex shader:

    vs.1.1
    mov oT0, v7
    mov oT1, v8
    dp4 oPos.x, c1, v0
    dp4 oPos.y, c2, v0
    dp4 oPos.z, c3, v0
    dp4 oPos.w, c4, v0
    mul oD0, c0.x, v5

The profile indicates for what API execution environment the program should be compiled. This same program can be compiled for the DirectX 9 vertex shader profile (vs_2_0), the multi-vendor OpenGL vertex program extension (arbvp1), or NVIDIA-proprietary OpenGL extensions (vp20 & vp30). The process of compiling Cg programs can take place during the initialization of your application using Cg.
The Cg runtime contains the Cg compiler as well as API-dependent routines that greatly simplify the process of configuring your compiled program for use with either OpenGL or Direct3D.

1.2 Fragment Program Explanation

In addition to writing programs to process vertices, you can write programs to process fragments. Here is a Cg fragment program:

    float4 brightlightmapdecal(float4 color         : COLOR,
                               float4 decalcoord    : TEXCOORD0,
                               float4 lightmapcoord : TEXCOORD1,
                       uniform sampler2D decal,
                       uniform sampler2D lightmap) : COLOR
    {
      float4 d = tex2Dproj(decal, decalcoord);
      float4 lm = tex2Dproj(lightmap, lightmapcoord);
      return 2.0 * color * d * lm;
    }

The input parameters correspond to the interpolated color and two texture coordinate sets as designated by their input semantics. The sampler2D type corresponds to a 2D texture unit. The Cg standard library routine tex2Dproj performs a projective 2D texture lookup. The two tex2Dproj calls sample a decal and a light map texture and assign the results to the local variables d and lm, respectively. The program multiplies the two texture results, the interpolated color, and the constant 2.0 together and returns this RGBA color.

The program returns a float4, and the semantic for the return value is COLOR, the final color of the fragment. The Cg compiler generates the following code for brightlightmapdecal when compiled with the arbfp1 multi-vendor OpenGL fragment profile:

    !!ARBfp1.0
    PARAM c0 = {2, 2, 2, 2};
    TEMP R0;
    TEMP R1;
    TEMP R2;
    TXP R0, fragment.texcoord[0], texture[0], 2D;
    TXP R1, fragment.texcoord[1], texture[1], 2D;
    MUL R2, c0.x, fragment.color.primary;
    MUL R0, R2, R0;
    MUL result.color, R0, R1;
    END

This same program also compiles for the DirectX 8 and 9 profiles (ps_1_3 & ps_2_x) and NVIDIA-proprietary OpenGL extensions (fp20 & fp30).

2. Other Cg Functionality

2.1 Features from C

Cg provides structures and arrays, including multi-dimensional arrays.
Cg provides all of C's arithmetic operators (+, *, /, etc.).
Cg provides a boolean type and boolean and relational operators (||, &&, !, etc.).
Cg provides increment/decrement (++/--) operators, the conditional expression operator (?:), assignment expressions (+=, etc.), and even the C comma operator.
Cg provides user-defined functions (in addition to pre-defined standard library functions), but recursive functions are not allowed.
Cg provides a subset of C's control flow constructs (do, while, for, if, break, continue); other constructs such as goto and switch are not supported in the current Cg implementation, but the necessary keywords are reserved.
Like C, Cg does not mandate the precision and range of its data types. In practice, the profile chosen for compilation determines the concrete representation for each data type. float, half, and double are meant to represent continuous values, ideally in floating-point, but this can depend on the profile. half is intended for a 16-bit half-precision floating-point data type. (NVIDIA's CineFX architecture provides such a data type.) int is an integer data type, usually used for looping and indexing. fixed is an additional data type intended to represent a fixed-point continuous data type that may not be floating-point.
Cg provides #include, #define, #ifdef, etc. matching the C preprocessor.
Cg supports C and C++ comments.

2.2 Additional Features Not in C

Cg provides built-in constructors (similar to C++ but not user-defined) for vector data types:

    float4 vec1 = float4(4.0, -2.0, 5.0, 3.0);

Swizzling is a way of rearranging components of vector values and constructing shorter or longer vectors. Example:

    float2 vec2 = vec1.yx;      // vec2 = (-2.0, 4.0)
    float scalar = vec1.w;      // scalar = 3.0
    float3 vec3 = scalar.xxx;   // vec3 = (3.0, 3.0, 3.0)

More complicated swizzling syntax is available for matrices. Vector and matrix elements can also be accessed with standard array indexing syntax.
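The swizzle examples above can be mimicked on the CPU to make the selection rule concrete. This is a minimal Python sketch rather than Cg: the swizzle helper and the COMPONENT_INDEX table are hypothetical names, only the xyzw suffixes are handled, and a scalar is assumed to behave like a one-component vector.

```python
# Map each swizzle letter onto the index of the component it selects.
COMPONENT_INDEX = {"x": 0, "y": 1, "z": 2, "w": 3}


def swizzle(value, pattern):
    """Return the components of value selected by pattern, e.g. "yx" or "xxx".

    A scalar is treated as a one-component vector so that a pattern such
    as "xxx" replicates it. A one-letter pattern yields a scalar; longer
    patterns yield a tuple. Selecting a component the value does not have
    raises IndexError.
    """
    components = value if isinstance(value, (tuple, list)) else (value,)
    picked = tuple(components[COMPONENT_INDEX[name]] for name in pattern)
    return picked[0] if len(picked) == 1 else picked


vec1 = (4.0, -2.0, 5.0, 3.0)
vec2 = swizzle(vec1, "yx")      # reorders two components
scalar = swizzle(vec1, "w")     # extracts one component as a scalar
vec3 = swizzle(scalar, "xxx")   # replicates the scalar into three components
```

Because the pattern is just a sequence of indices, the same letter may appear more than once and the result may be shorter or longer than the source, which is what lets a swizzle both rearrange and resize a vector.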
Write masking restricts vector assignments to indicated components. Example:

    vec1.xw = vec3;   // vec1 = (3.0, -2.0, 5.0, 3.0)

Use either .xyzw or .rgba suffixes for swizzling and write masking.

The Cg standard library includes a large set of built-in functions for mathematics (abs, dot, log2, reflect, rsqrt, etc.) and texture access (texCUBE, tex3Dproj, etc.). The standard library makes extensive use of function overloading (similar to C++) to support different vector lengths and data types. There is no need to use #include to obtain prototypes for standard library routines as in C; Cg standard library routines are automatically prototyped.

In addition to the out modifier for call-by-result parameter passing, the inout modifier treats a parameter as both a call-by-value input parameter and a call-by-result output parameter.

The discard keyword is similar to return but aborts the processing without returning a transformed fragment.

2.3 Features Not Supported

Cg currently has no support for pointers or bitwise operations (however, the necessary C operators and keywords are reserved for this purpose). Cg does not (currently) support unions and function variables.

Cg lacks C++ features for programming in the large such as classes, templates, operator overloading, exception handling, and namespaces.

The Cg standard library lacks routines for functionality such as string processing, file input/output, and memory allocation, which is beyond the specialized scope of Cg.

However, Cg reserves all C and C++ keywords so that features from these languages could be incorporated into future implementations of Cg as warranted.

3. Profile Dependencies

When you compile a C or C++ program, you expect it to compile without regard to how big (within reason) the program is or what the program does. With Cg, a syntactically and semantically correct program may still not compile due to limitations of the profile for which you are compiling the program.
For example, it is currently an error to access a texture when compiling with a vertex profile. Future vertex profiles may well allow texture accesses, but existing vertex profiles do not. Other errors are more inherent. For example, a fragment profile should not output a parameter with a TEXCOORD0 semantic. Other errors may be due to exceeding a capacity limit of current GPUs such as the maximum number of instructions or the number of texture units available.

Understand that these profile-dependent errors do not reflect limitations of the Cg language, but rather limitations of the current implementation of Cg or of the underlying hardware of your target GPU.

4. Compatibility and Portability

NVIDIA's Cg implementation and Microsoft's High Level Shader Language (HLSL) are very similar as they were co-developed. HLSL is integrated with DirectX 9 and the Windows operating system. Cg provides support for multiple APIs (OpenGL, DirectX 8, and DirectX 9) and multiple operating systems (Windows, Linux, and Mac OS X). Because Cg interfaces to multi-vendor APIs, Cg runs on GPUs from multiple vendors.

5. More Information

Read The Cg Tutorial: The Definitive Guide to Programmable Real-Time Graphics (ISBN ) published by Addison-Wesley.

Cg: A system for programming graphics hardware in a C-like language

William R. Mark, R. Steven Glanville, Kurt Akeley, Mark J. Kilgard
The University of Texas at Austin; NVIDIA Corporation

Abstract

The latest real-time graphics architectures include programmable floating-point vertex and fragment processors, with support for data-dependent control flow in the vertex processor. We present a programming language and a supporting system that are designed for programming these stream processors. The language follows the philosophy of C, in that it is a hardware-oriented, general-purpose language, rather than an application-specific shading language.
The language includes a variety of facilities designed to support the key architectural features of programmable graphics processors, and is designed to support multiple generations of graphics architectures with different levels of functionality. The system supports both of the major 3D graphics APIs: OpenGL and Direct3D. This paper identifies many of the choices that we faced as we designed the system, and explains why we made the decisions that we did.

CR Categories: I.3.7 [Computer Graphics]: Three-Dimensional Graphics and Realism - Color, shading, shadowing, and texture; D.3.4 [Programming Languages]: Processors - Compilers and code generation; I.3.1 [Computer Graphics]: Hardware Architecture - Graphics processors; I.3.6 [Computer Graphics]: Methodology and Techniques - Languages

(Author footnote: Formerly at NVIDIA, where this work was performed.)

1 Introduction

Graphics architectures are now highly programmable, and support application-specified assembly language programs for both vertex processing and fragment processing. But it is already clear that the most effective tool for programming these architectures is a high-level language. Such languages provide the usual benefits of program portability and improved programmer productivity, and they also make it easier to develop programs incrementally and interactively, a benefit that is particularly valuable for shader programs.

In this paper we describe a system for programming graphics hardware that supports programs written in a new C-like language named Cg. The Cg language is based on both the syntax and the philosophy of C [Kernighan and Ritchie 1988]. In particular, Cg is intended to be general-purpose (as much as is possible on graphics hardware), rather than application-specific, and is a hardware-oriented language. As in C, most data types and operators have an obvious mapping to hardware operations, so that it is easy to write high-performance code. Cg includes a
variety of new features designed to efficiently support the unique architectural characteristics of programmable graphics processors. Cg also adopts a few features from C++ [Stroustrup 2000] and Java [Joy et al. 2000], but unlike these languages Cg is intended to be a language for programming in the small, rather than programming in the large.

Cg is most commonly used for implementing shading algorithms (Figure 1), but Cg is not an application-specific shading language in the sense that the RenderMan shading language [Hanrahan and Lawson 1990] or the Stanford real-time shading language (RTSL) [Proudfoot et al. 2001] are. For example, Cg omits high-level shading-specific facilities such as built-in support for separate surface and light shaders. It also omits specialized data types for colors and points, but supports general-purpose user-defined compound data types such as structs and arrays.

As is the case for almost all system designs, most features of the Cg language and system are not novel when considered individually. However, when considered as a whole, we believe that the system and its design goals are substantially different from any previously-implemented system for programming graphics hardware.

The design, implementation, and public release of the Cg system has occurred concurrently with the design and development of similar systems by 3Dlabs [2002], the OpenGL ARB [Kessenich et al. 2003], and Microsoft [2002b]. There has been significant cross-pollination of ideas between the different efforts, via both public and private channels, and all four systems have improved as a result of this exchange. We will discuss some of the remaining similarities and differences between these systems throughout this paper.

This paper discusses the Cg programmer interfaces (i.e. Cg language and APIs) and the high-level Cg system architecture.
We focus on describing the key design choices that we faced and on explaining why we made the decisions we did, rather than providing a language tutorial or describing the system's detailed implementation and internal architecture. More information about the Cg language is available in the language specification [NVIDIA Corp. 2003a] and tutorial [Fernando and Kilgard 2003].

Figure 1: Screen captures from a real-time Cg demo running on an NVIDIA GeForce™ FX. The procedural paint shader makes the car's surface rustier as time progresses.

2 Background

Off-line rendering systems have supported user-programmable components for many years. Early efforts included Perlin's pixel-stream editor [1985] and Cook's shade-tree system [1984].