After on and off work for a year, and many thousand words later, my final year BSc dissertation project and report was completed. Can a ray tracer ever be truly ‘complete’? Considering endless potential optimisations and features that can be implemented, if one so desires, probably not! My report can be found here and below are a few renderings from my prototypes.
The project from a personal point of view was an important one. It was a period where I gained heightened interest in graphics programming, gaining an understanding of the principles of computer graphics, the mathematics involved and also the creative satisfaction that comes from it. When creating realistic virtual graphics from essentially nothing but code, maths and a display, on the face of it, it’s very easy to gloss over the ‘magic’ of it all, especially when you understand the complexity of how we actually perceive the Universe and the shortcuts that must be taken for computers to accurately mimic the natural phenomena of our brain’s visual perception.
A Bit of Biology and Philosophy:
The modern computer when you think of it, is really just a primitive extension of our own bodies, simple enough that we can manipulate, manage and understand it, with much greater control and predictability then our biology. They allow us to achieve things we could not otherwise do and many of the components inside a computer carry out very similar roles to organs found within us. Of course we can think of the CPU as a brain, but what else? Going into more detail, the GPU could be seen as a specialised part of the brain engineered to handle visual computation, just as our brain has it’s own visual cortex. A virtual camera in a rendering program replicates the capabilities of part of our eye, defining an aperture or lens through which to calculate rays of light, and like-wise, an ‘image plane’ positioned in front of the camera, carries out essentially the same functionality as our retina, but using pixels to make up the visual image of what we see.
When you understand the detailed steps required to render something in 3D, you realise that we are essential trying to recreate our own little simplified universe, it’s a pretty profound concept that when taken much further, manifests itself in popular science fiction such as the Matrix. After all, is mathematics not simply the ‘code’ of our Universe? It’s perhaps not as silly as it may sound, when you get down to the fundamentals of game developers creating virtual worlds, graphics programming being an essential component, and looking just how real and immersive these worlds are starting to become.
So What Is Ray Tracing?
Of all popular rendering techniques, it’s ray tracing that perhaps stands out the most in respect to my previous comments above. We all know roughly how and why we see, where light rays shine from a light source such as our Sun, they travel millions of miles to get to us and out of all the infinite number of rays, the tiniest percentage may find it’s way directly into our eye. This could be from directly looking at the Sun (not recommended!), and also from scattered or reflected light that has hit a surface, finding it’s way on a collision course with our eye.
This is fundamentally close to how ray tracing works, but with important differences. If a computer had to calculate the trajectory of all possible rays been fired out from a light source, this would be impossible with modern hardware, there are just too many potential rays, of which, only an infinitesimally small amount would ever find there way into the camera (eye) of the scene, and it’s only these rays we are interested in anyway. Instead, and referred to as ‘Backwards Ray Tracing’, light is fired from the camera (eye) into the scene and then traced backwards as it is reflected, refracted or simply absorbed by whatever material it hits. We then only have to fire a ray from the camera for each pixel in the image, which is still potentially a considerable number of rays (1920×1080 = 2073600 primary rays) and that’s without counting all the secondary rays as light scatters throughout the scene, but at least this reduced number is quite feasible.
Still, it is ray tracing’s close semblance to how light interacts with us in the real world that makes it a very elegant and simple algorithm for rendering images, allowing for what is known as ‘physically based rendering’, where light is simulated to create realistic looking scenes with mathematically accurate shadows, caustics and more advanced features such as ‘global illumination’, something that other faster and more common rendering techniques like rasterization (pipeline-based) cannot do.
Illumination and Shading:
The ultimate main job of firing the rays into a scene in the first place is to determine what colour the pixel in our image should be. This is found by looking at what a ray hits when fired into a 3D scene. Put simply, if it hits a red sphere, the pixel is set to red. We can define the material information for every object in the scene in similar fashion to how we know in the real world that a matt yellow box reflects light. Technically, the box is yellow because it reflects yellow light, and is matt (not shiny) because it has a microscopically uneven surface (diffuse) that scatters the light more evenly away from the surface. Compare this to light hitting a smooth (specular) surface, most of the light would bounce off the surface in the same direction and appear shiny to our eyes. Clearly, for computer graphics, we are not likely to program a surface material literally in such microscopic detail as to define if it is rough or smooth, but we can cheat using a popular and effective local illumination model such as Phong, essentially using the ‘normal’ of a surface, the directions of our light source and camera and some vector math to put it all together and calculate the colour of the surface based on it’s material and angle, creating a smooth shaded object rather than a ‘flat’ colour.
Intersections, Distance Functions and Ray Marching:
So we know why we need to fire the rays, but how do know a ray has hit a surface? There’s a few different ways this can be done, all down to the complexity of the geometry you’re trying to render. Ray intersections with simple shape such as planes or spheres can be calculated precisely using linear and quadratic equations respectively. Additionally, for complex explicit 3D models made from triangle mesh, linear algebra and vector math can also be used to compute the intersections.
Another technique, has been gaining popularity in recent years, despite been around quite some time in academic circles. Rendering complex implicit geometry using ‘distance functions’ with nothing but a pixel shader on your GPU as shown on websites like Shadertoy have popularised a subset of ray tracing called ‘ray marching’, requiring no 3D mesh models, vertices or even textures to produce startlingly realistic real-time 3D renderings. It is in fact, the very freedom from mesh constraints that is apparent when you observe the complex, organic and smooth ray marched geometry possible using the technique. Ray marching allows you to do things you simply cannot do using explicit mesh, such as blending surfaces seamlessly together, akin to sticking two lumps of clay together to form a more complicated object. Endless repetition of objects throughout a scene at little extra cost using simple modulus maths is another nifty trick allowing for infinite scenes. By manipulating the surface positions along cast rays, you can effectively transform your objects, twist, contort and even animate; it’s all good stuff.
The Dissertation Project:
My development project was comprised of two parts, a prototype phase to create a ray tracer using GPGPU techniques and a hefty report detailing the theory, implementation and outcomes. For those unfamiliar, General-purpose computing on graphics processing units (GPGPU) is a area of programming aimed at using the specialised hardware found in GPU’s to perform arithmetic tasks normally carried out by the CPU, and is widely used in supercomputing. Though the CPU hardware is singularly much more powerful than the processors in a GPU; GPU’s make up for it in sheer numbers, meaning they excel and outperform CPU’s when computing simple highly parallel tasks. Ray tracing, is one such highly parallel candidate that is well suited to GPGPU techniques and for my dissertation I was tasked to use NVIDIA’s GPGPU framework called CUDA to create an offline ray tracer, done from scratch using no existing graphics API. Offline rendering means not real-time, and is clearly unsuitable for games, yet is commonly used in 3D graphics industry for big budget animations like those by Pixar and DreamWorks, with each frame individually rendered to ultra high quality, sometimes over a period in excess of 24 hours for a single frame.
In the end I produced four different ray tracing prototypes for comparison, incorporating previously mentioned techniques. Prototype 1, running purely on a CPU single thread using simple implicit intersections of spheres and planes. Prototype 2, the same but implemented using a single CUDA kernel and running purely on the GPU across millions of threads. Prototype 3, a CPU ray marcher using distance functions to render more complex implicit geometry. Prototype 4, the same as 3, but implemented using CUDA. My aim for the project was to assess GPGPU performance and the rendering qualities of the ray marching technique, the findings of which can be found in the report.
I knew when I picked this project that I was not taking an easy topic by any stretch, and a great thing I can take away from this is the extensive research experience and planning needed to simultaneously implement many different difficult concepts I had no prior knowledge about, yet still managed to produce a cohesive project, and fully working prototypes, achieving an 88% mark for my efforts, which I am very pleased with. As expected, with heinsight there are things that I would do differently if repeated, but nothing too major, and really, it’s all part of the learning process.
Ray tracing, ray marching, GPGPU, CUDA, distance functions and implicit geometry were all concepts I had to pickup and learn. I bought some books, but in the end, research on the internet in the form of tutorials, blogs, academic papers and lectures proved more beneficial. Sometimes, it takes a certain kind of way to present the information for your brain to ‘click’ with certain principles, and all of us are different. The Internet is a treasure trove in this regard, if you spend the time, you can usually eventually find some explanation that will suit your grey matter, failing that, re-reading it a million times can sometimes help!
On the back of this, I will be continuing this subject into my masters degree and will likely be pursuing this further during my masters dissertation. I am already busy at work on a real-time implicit render with UI functionality running in DirectX 11 (A couple of early screenshots above). Additionally, I’d love to get a chance to contribute to a research paper on the subject, but we’ll see.
I plan to make some easy to follow tutorials on implementing ray tracing and ray marching at some point for this website, when I get the chance. Hopefully, they could help out other students or anyone else wanting to learn the aforementioned topics. I know first hand and from friends, that at times it can be frustrating since although there is theory out there, there is comparatively very little information on actual implementation details for the subject, when compared to say pipeline-based rendering.