CUDA Ray Tracer – Dissertation Project

After on and off work for a year, and many thousand words later, my final year BSc dissertation project and report was completed. Can a ray tracer ever be truly ‘complete’? Considering endless potential optimisations and features that can be implemented, if one so desires, probably not! My report can be found here and below are a few renderings from my prototypes.

The project from a personal point of view was an important one. It was a period where I gained heightened interest in graphics programming, gaining an understanding of the principles of computer graphics, the mathematics involved and also the creative satisfaction that comes from it. When creating realistic virtual graphics from essentially nothing but code, maths and a display, on the face of it, it’s very easy to gloss over the ‘magic’ of it all, especially when you understand the complexity of how we actually perceive the Universe and the shortcuts that must be taken for computers to accurately mimic the natural phenomena of our brain’s visual perception.

A Bit of Biology and Philosophy:

The modern computer when you think of it, is really just a primitive extension of our own bodies, simple enough that we can manipulate, manage and understand it, with much greater control and predictability then our biology. They allow us to achieve things we could not otherwise do and many of the components inside a computer carry out very similar roles to organs found within us. Of course we can think of the CPU as a brain, but what else? Going into more detail, the GPU could be seen as a specialised part of the brain engineered to handle visual computation, just as our brain has it’s own visual cortex. A virtual camera in a rendering program replicates the capabilities of part of our eye, defining an aperture or lens through which to calculate rays of light, and like-wise, an ‘image plane’ positioned in front of the camera, carries out essentially the same functionality as our retina, but using pixels to make up the visual image of what we see.

When you understand the detailed steps required to render something in 3D, you realise that we are essential trying to recreate our own little simplified universe, it’s a pretty profound concept that when taken much further, manifests itself in popular science fiction such as the Matrix. After all, is mathematics not simply the ‘code’ of our Universe? It’s perhaps not as silly as it may sound, when you get down to the fundamentals of game developers creating virtual worlds, graphics programming being an essential component, and looking just how real and immersive these worlds are starting to become.

So What Is Ray Tracing?

Ray Tracing

Of all popular rendering techniques, it’s ray tracing that perhaps stands out the most in respect to my previous comments above. We all know roughly how and why we see, where light rays shine from a light source such as our Sun, they travel millions of miles to get to us and out of all the infinite number of rays, the tiniest percentage may find it’s way directly into our eye. This could be from directly looking at the Sun (not recommended!), and also from scattered or reflected light that has hit a surface, finding it’s way on a collision course with our eye.
This is fundamentally close to how ray tracing works, but with important differences. If a computer had to calculate the trajectory of all possible rays been fired out from a light source, this would be impossible with modern hardware, there are just too many potential rays, of which, only an infinitesimally small amount would ever find there way into the camera (eye) of the scene, and it’s only these rays we are interested in anyway. Instead, and referred to as ‘Backwards Ray Tracing’, light is fired from the camera (eye) into the scene and then traced backwards as it is reflected, refracted or simply absorbed by whatever material it hits. We then only have to fire a ray from the camera for each pixel in the image, which is still potentially a considerable number of rays (1920×1080 = 2073600 primary rays) and that’s without counting all the secondary rays as light scatters throughout the scene, but at least this reduced number is quite feasible.

Still, it is ray tracing’s close semblance to how light interacts with us in the real world that makes it a very elegant and simple algorithm for rendering images, allowing for what is known as ‘physically based rendering’, where light is simulated to create realistic looking scenes with mathematically accurate shadows, caustics and more advanced features such as ‘global illumination’, something that other faster and more common rendering techniques like rasterization (pipeline-based) cannot do.

Illumination and Shading:

Phong Shading

Phong shading

The ultimate main job of firing the rays into a scene in the first place is to determine what colour the pixel in our image should be. This is found by looking at what a ray hits when fired into a 3D scene. Put simply, if it hits a red sphere, the pixel is set to red. We can define the material information for every object in the scene in similar fashion to how we know in the real world that a matt yellow box reflects light. Technically, the box is yellow because it reflects yellow light, and is matt (not shiny) because it has a microscopically uneven surface (diffuse) that scatters the light more evenly away from the surface. Compare this to light hitting a smooth (specular) surface, most of the light would bounce off the surface in the same direction and appear shiny to our eyes. Clearly, for computer graphics, we are not likely to program a surface material literally in such microscopic detail as to define if it is rough or smooth, but we can cheat using a popular and effective local illumination model such as Phong, essentially using the ‘normal’ of a surface, the directions of our light source and camera and some vector math to put it all together and calculate the colour of the surface based on it’s material and angle, creating a smooth shaded object rather than a ‘flat’ colour.

Intersections, Distance Functions and Ray Marching:

Implicit Functions

So we know why we need to fire the rays, but how do know a ray has hit a surface? There’s a few different ways this can be done, all down to the complexity of the geometry you’re trying to render. Ray intersections with simple shape such as planes or spheres can be calculated precisely using linear and quadratic equations respectively. Additionally, for complex explicit 3D models made from triangle mesh, linear algebra and vector math can also be used to compute the intersections.

Another technique, has been gaining popularity in recent years, despite been around quite some time in academic circles. Rendering complex implicit geometry using ‘distance functions’ with nothing but a pixel shader on your GPU as shown on websites like Shadertoy have popularised a subset of ray tracing called ‘ray marching’, requiring no 3D mesh models, vertices or even textures to produce startlingly realistic real-time 3D renderings. It is in fact, the very freedom from mesh constraints that is apparent when you observe the complex, organic and smooth ray marched geometry possible using the technique. Ray marching allows you to do things you simply cannot do using explicit mesh, such as blending surfaces seamlessly together, akin to sticking two lumps of clay together to form a more complicated object. Endless repetition of objects throughout a scene at little extra cost using simple modulus maths is another nifty trick allowing for infinite scenes. By manipulating the surface positions along cast rays, you can effectively transform your objects, twist, contort and even animate; it’s all good stuff.

The Dissertation Project:

My development project was comprised of two parts, a prototype phase to create a ray tracer using GPGPU techniques and a hefty report detailing the theory, implementation and outcomes. For those unfamiliar, General-purpose computing on graphics processing units (GPGPU) is a area of programming aimed at using the specialised hardware found in GPU’s to perform arithmetic tasks normally carried out by the CPU, and is widely used in supercomputing. Though the CPU hardware is singularly much more powerful than the processors in a GPU; GPU’s make up for it in sheer numbers, meaning they excel and outperform CPU’s when computing simple highly parallel tasks. Ray tracing, is one such highly parallel candidate that is well suited to GPGPU techniques and for my dissertation I was tasked to use NVIDIA’s GPGPU framework called CUDA to create an offline ray tracer, done from scratch using no existing graphics API. Offline rendering means not real-time, and is clearly unsuitable for games, yet is commonly used in 3D graphics industry for big budget animations like those by Pixar and DreamWorks, with each frame individually rendered to ultra high quality, sometimes over a period in excess of 24 hours for a single frame.

In the end I produced four different ray tracing prototypes for comparison, incorporating previously mentioned techniques. Prototype 1, running purely on a CPU single thread using simple implicit intersections of spheres and planes. Prototype 2, the same but implemented using a single CUDA kernel and running purely on the GPU across millions of threads. Prototype 3, a CPU ray marcher using distance functions to render more complex implicit geometry. Prototype 4, the same as 3, but implemented using CUDA. My aim for the project was to assess GPGPU performance and the rendering qualities of the ray marching technique, the findings of which can be found in the report.

I knew when I picked this project that I was not taking an easy topic by any stretch, and a great thing I can take away from this is the extensive research experience and planning needed to simultaneously implement many different difficult concepts I had no prior knowledge about, yet still managed to produce a cohesive project, and fully working prototypes, achieving an 88% mark for my efforts, which I am very pleased with. As expected, with heinsight there are things that I would do differently if repeated, but nothing too major, and really, it’s all part of the learning process.

Ray tracing, ray marching, GPGPU, CUDA, distance functions and implicit geometry were all concepts I had to pickup and learn. I bought some books, but in the end, research on the internet in the form of tutorials, blogs, academic papers and lectures proved more beneficial. Sometimes, it takes a certain kind of way to present the information for your brain to ‘click’ with certain principles, and all of us are different. The Internet is a treasure trove in this regard, if you spend the time, you can usually eventually find some explanation that will suit your grey matter, failing that, re-reading it a million times can sometimes help!

Future Plans:

On the back of this, I will be continuing this subject into my masters degree and will likely be pursuing this further during my masters dissertation. I am already busy at work on a real-time implicit render with UI functionality running in DirectX 11 (A couple of early screenshots above). Additionally, I’d love to get a chance to contribute to a research paper on the subject, but we’ll see.

I plan to make some easy to follow tutorials on implementing ray tracing and ray marching at some point for this website, when I get the chance. Hopefully, they could  help out other students or anyone else wanting to learn the aforementioned topics. I know first hand and from friends, that at times it can be frustrating since although there is theory out there, there is comparatively very little information on actual implementation details for the subject, when compared to say pipeline-based rendering.

OpenGL Cross-platform PC/PSP Game Coursework

Last semester as part of the Advanced Graphics module of my CS degree at Hull University, we were tasked with a group project to produce a cross-platform OpenGL mini-game for the PC and Sony PSP based on a specification. The game premise was to move around a 3D ‘maze’ consisting of four rooms and connecting corridors, avoiding a patrolling AI that would shoot you if within its line of sight. The objective was to collect 3 keys to activate a portal to escape and beat the game.

The groups were selected at complete random with 4 members. As per usual, group coursework assignments are particularly difficult due to the extra concerns of motivating members and assigning work and by year 3 of University, you get a good idea on the best way of operating within them to secure good grades. I went in with the mindset of doing as much work as possible after we assigned tasks. Hopefully each would carry out their allocated work, if not, I’d just go ahead and do it, no fuss. Luckily one chap in my group was a friend and he did an excellent job coding the AI, mini-map and sound while I worked on coding the geometry, camera, lighting and player functionality etc.


Mini-maze model

Static environment lighting

Static environment lighting

Cross-Platform Limitations:

Having worked with OpenGL and shaders last year for my 3D ‘The Column‘ project, it was some-what limiting when I realised that the PSP didn’t support them and that fragment-based lighting was a no go. With one requirement of the game being a torchlight effect that illuminated the geometry, this would therefore mean that for PSP compatibility, vertex-based lighting would need to be implemented and that meant tessellation of primitives to prevent the lighting looking very blocky and…well very 90’s. Luckily the PSP did atleast have support for VBO (Vertex Buffer Objects) which meant effectively each tessellated model could loaded onto the graphics card only once to improve performance.

Unified Code

An interesting aspect of this project was the required consideration for a consolidated code-base that where possible allowed shared functionality for both the PC and PSP platforms i.e limiting how much platform specific code was used. This was essential since the game would be a single C++ Solution for both platforms.

I designed the code structure based around principles Darren McKie (the course lecturer) described, and produced the following class diagram that reflects the final structure:

Unified Cross-platform Class Diagram

Unified Cross-platform Class Diagram

The majority of game code resides in ‘Common Code’ classes that are instantiated by each particular platform ‘Game’ object. Certain code such as API rendering calls were kept platform specific but made use of the common classes where necessary. A particular nice way of ensuring the correct platform specific object was instantiated was carried out using ‘#Ifdef’, ‘#ifndef’ preprocessor statements and handled by a ‘ResourceManager’ class.

As mentioned earlier, per-vertex lighting had to be implemented due to PSP compatibility. A primitive with a low number of vertices would thus result in very blocky lighting. To prevent this I created a tessellation function that subdivided each primitives vertices into many more triangles. I played around with the tessellation depth to find how many iterations of subdivision could be achieved before inducing lag and was very happy with the lighting result considering there is no fragment shader; a given for today’s modern pipelined-based rendering.

Active Portal

Active Portal

The PSP implementation proved more tricky due to getting to grips with the PSP SDK and having access to very little documentation, however the game was successfully implemented onto a PSP device and ran with decent performance after compressing the textures down and removing geometry tessellation to allow for the PSP’s limited memory capacity.

The game was written in C++ and  the following libraries and software were used:

  • GXBase OpenGL API
  • Sony PSP SDK
  • OpenAL
  • Visual Studio 2012
  • Paint .Net

Exchange Reports Project Overview

During this summer, in-between semesters I was fortunate enough to get a software development job for a local company just 10 minutes walk from my door. The project was to produce an ‘Exchange Reports’ system that would provide email messaging statistics exactly to the customers specification. The system would be automated so that after reports were designed, they would be generated programmatically by a service and emailed to any recipients that had been setup to receive each report. The solution was to be comprised of 3 distinct programs that would need to be developed along with configuration tools to setup the non-GUI processes in the solution (namely the services).

I have produced the following diagram to demonstrate the solutions processes, ( Click to enlarge):

The design was in place when I started and an existing code-base was also present, but still required the vast majority of the functionality to be added. It was the first time having worked professionally as a software engineer and therefore also the first time getting to grips with existing code made by developers no longer around. More so, understanding the solutions technical proposal well enough to execute exactly what the customer and my employer wanted. I think working in IT professionally for a lot of years certainly helped me get into a comfortable stride after an initial information overload when taking on solely what was a surprisingly large but beneficial technical project compared to what I had envisioned. Being thrown into the deep end is probably the fastest way you can improve and I feel that above all, I have taken a lot from this experience which will prove valuable in the future. I’m very pleased with the outcome and successfully got all the core functionality in and finished in the time frame that was assigned. I whole heartily would encourage students thinking of getting professional experience to go for it, ideally with an established company from which you can learn a great deal. Having experienced developers around to run things by is great way to improve.

Now onto the technical details. The project was coded in C# and used WinForms for initial testing of processes and later for the configuration programs. I used a set of third-party .NET development tools from ‘DevExpress’ that proved to be fantastic and a massive boon to those wanting to create quick and great looking UI’s with reporting functionality. SQL Server provided the relational database functionality, an experience I found very positive and very much enjoyed the power of Query Language when it came to manipulating data via .NET data tables, data adapters, table joins or just simple direct commands.

Using the diagram as a reference, I’ll briefly go through each process in the solution for A) those interested in such things and B) future reference for myself while it’s still fresh in my mind because i’ll likely forget much of how the system works after a few months of 3D graphics programming and Uni coursework :P.

Exchange Message Logs: 

In Exchange 2010 Message Tracking logs can be enabled quite simply and provide a wealth of information that can be used for analysis and reporting if so desired. They come in the form of comma delimited log files that can be opened simply with a text editor. They have been around a lot of years and in the past during IT support work I have found myself looking at them from time to time to diagnose various issues. This time I’d be using them as the source of data for a whole reporting system. The customer was a large international company and to give an example from just one Exchange system they were producing 40 MB-worth of these messaging logs each day. With these being effectively just text files that’s an awful lot of email data to deal with.

Processing Service: 

The first of 3 core components of the solution, the Processing Service as the name suggests is an install-able Windows Service that resides on a server with access to the Exchange Messaging log files. The service is coded to run daily at a specified time and it’s purpose is comprised of 5 stages:

1. Connect to the Exchange server and retrieve a list of users from the Global Address List (GAL). This is done using a third-party Outlook library called ‘Redemption’ that enables this information to be extracted and then check it for any changes to existing users and/or any new users. The users are placed in a table on the SQL database server and will be used later to provide full name and department information for each email message we store.

2. Next, each Exchange Message log is individually parsed and useful messaging information is extracted and stored into various tables on the database server. Parsed log file names are kept track of in the database  to prevent reading logs more than once.

3. Any message forwards or replies are identified and tallied up.

4. A separate Summary table on the database is populated with data processed from the prior mentioned message tables. This table is what the reports will look at to generate data. Various calculations are made such as time difference between an email being received and then forwarded or replied to gauge estimates of response times being just one example; a whole plethora of fields are populated in this table, much more than could comfortably fit on a single report. Due to this large amount of potentially desirable data we later allow the user to select which fields they want from the Summary table in the ‘Report Manager’ if they wish to create a custom report or alternatively and more typically, they use predefined database ‘Views’ that have been created for them based on the customers specification which allows them to access only the data they need. Database Views are a really neat feature.

5. The databases Messaging tables are scoured for old records beyond a threshold period and deleted. This maintenance is essential to prevent table sizes growing too large. Their associated Summary data that has been generated is still kept however but I added functionality to archive this by serializing this data off and deleting it from the database if required.

Report Manager:

Initially we had thought to utilise DevExpress’s ‘Data Grid’ object controls in a custom Form application but we decided that the appearance of the reports that were generated from this were not satisfactory. This turned out to be a good design decision since we later discovered DevExpress has remarkable reporting controls that allow very powerful design and presentation features that completely overshadowed that of the Data Grids. After some migrating of code from the old ‘Report Manager’ program and having to spend a day or two researching and familiarising myself with the DevExpress API I had a great looking new application that the customer will be using to design and manage the reports.

Report Manager program

Report Manager program

The Report Manager allows you to design every aspect of a report through an intuitive drag and drop interface. Images and various graphics can also be added to beautify the design, though that wasn’t something I did nor had the time to attempt! The data objects can be arranged as desired and the ‘data source’ information for the report is saved along with it’s design layout via a neat serialization function inherent to the ‘XtraReport’ object in the DevExpress library which is then stored in a reports table on the database server for later loading or building. You can also generate the report on-the-fly and export it into various formats such as PDF or simply print it. Another neat built-in feature is the ability to issue SQL query commands using a user-friendly filter for non-developers in the report designer which is then stored along with the layout, thus the user designing the report has absolute control over the data i.e a quick filter based on Department being “Customer Services” would return only that related message data without me needing to code in some method to do this manually like was the case when using the Data Grids.

In the top left you’ll see specific icons that provide the necessary plumbing for the database server. ‘Save’, ‘Save As’ and ‘Load’ respectively writes the serialized report layout to the database, creates a new record with said layout or loads an existing saved report from the database into the designer. Loading is achieved by retrieving the list of report records stored in the reports table and placing it into a Data Grid control on a form where you can select a report to load or delete. The ‘Recipients’ button brings up the interface to manage adding users who want to receive the report by email, this retrieves the user data imported by the Processing Service and populates a control that allows you to search through and select a user or manually type a name and email address to add a custom recipient. Additionally, upon adding a recipient to the report you must select whether they wish to receive the report on a daily, weekly or monthly basis. This information is then stored in the aptly named recipient table and then relates to the reports via a reportID field.

Report Service:

Nearly there (if you’ve made it this far well done), the last piece in the solution is another Windows Service called the ‘Report Service’. This program sits and waits to run as per a schedule that can be determined by a configuration app that i’ll mention shortly. Like the Processing Service, as part of it’s logic, it needs to check if it’s the right time of the day to execute the program, of course the service continuously polls itself every few minutes to see if this is the case. Upon running it looks to see if it’s the right day for daily reports, day of week for weekly reports, or day of month for the (you guessed it) monthly reports. If it is, it then it runs and grabs the ‘joined’ data from the reports and recipient tables and proceeds to build each report and fire them out as PDF email attachments to the associated recipients. It makes a final note of the last time it ran to prevent it repeatedly running on each valid day.

Configuration Tools:

Two configuration apps were made, one for the Processing Service and one for the Report Service. These two services have no interfaces since they run silently in the background, so I provided a method via an XML settings file and the two apps to store a variety of important data such as SQL connection strings, server authentication details (encrypted) and additionally also through the need to provide certain manual debugging options that may need to be executed as well as providing an interface to set both services run times and the report delivery schedule.

Screens below (click to enlarge):

So that’s the solution start to finish, depending on time I’m told it’s possible it could be turned into a product at some point which would be great since other customers could potentially benefit from it too.

The great thing about a creative industry like programming, whether business or games, is that you’re ultimately creating a product for someone to use. It’s nice to know people somewhere will be getting use and function out of something you have made and just one reason why I’ve thoroughly enjoyed working on the project. I’ve learned a lot from my colleagues while working on it and hope to work with them again. You also get a taste for real life professional development and how it differs in various ways to academic teachings, which although are very logical and sensible are also idealistic (and rightly so) but in the real-world when time is money and you need to turn around projects to sustain the ebb and flow of business, you have to do things in a realistic fashion that might mean cutting some corners when it comes to programming or software design disciplines. I always try my best to write as clean code as possible and this was no exception but ultimately you need to the get the project done first and foremost and it’s interesting how that can alter the way software development pans out with regards perhaps to niceties like extensive documentation, ‘Use Case’ diagrams and robust unit testing potentially falling to to the wayside in favor of a more speedy short-term turn around. Certainly I imagine, larger businesses can afford to manage these extra processes to great effect, but for small teams of developers it’s not always realistic, which I can now understand.