by the Codermind team.
As we saw in the first part, a perfectly diffuse material sends incident light back uniformly in all directions. At the other end of the spectrum, a perfect mirror sends light only in the direction obtained by reflecting the incident ray about the surface normal.
Reality is not as black and white as that: real surfaces are often complex and difficult to model simply. I will describe here two empirical models of "shiny" surfaces that are not pure mirrors.
This is the second part of our series of articles about ray tracing in C++. It follows the part called "First rays".
Phong
Bui Tuong Phong was a student at the University of Utah when he invented the process that took his name (Phong lighting). It is a quick and dirty way to render objects that reflect light in a privileged direction without a full-blown reflection. In most cases that privileged direction is that of the light vector, reflected about the normal vector of the surface.

The common usage is to describe that as the specular term, as opposed to the diffuse term from the Lambert formula. The specular term varies with the position of the observer, as you can see in the code below, which takes viewRay.dir into account:
// Reflect the light direction about the surface normal
float reflet = 2.0f * (lightRay.dir * n);
vecteur phongDir = lightRay.dir - reflet * n;
// The term is largest when the view ray lines up with the reflected light
float phongTerm = _MAX(phongDir * viewRay.dir, 0.0f);
phongTerm = currentMat.specvalue * powf(phongTerm, currentMat.specpower) * coef;
red += phongTerm * current.red;
green += phongTerm * current.green;
blue += phongTerm * current.blue;
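To experiment with the specular term outside the raytracer, here is a minimal self-contained sketch of the same computation. The Vec3 type, the dot helper and the phongSpecular function are hypothetical stand-ins for the article's vecteur type and inline code; the light color and the coef factor are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the article's "vecteur" type.
struct Vec3 { float x, y, z; };

inline Vec3 operator-(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator*(float s, const Vec3& v) { return {s * v.x, s * v.y, s * v.z}; }
inline float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// lightDir: unit vector from the hit point toward the light.
// viewDir:  unit direction of the camera ray (pointing toward the surface).
// n:        unit surface normal.
float phongSpecular(const Vec3& lightDir, const Vec3& viewDir, const Vec3& n,
                    float specValue, float specPower)
{
    assert(specPower > 0.0f);
    // Mirror the light direction about the normal.
    const float reflet = 2.0f * dot(lightDir, n);
    const Vec3 phongDir = lightDir - reflet * n;
    // Clamp so that surfaces facing away from the highlight contribute nothing.
    const float alignment = std::max(dot(phongDir, viewDir), 0.0f);
    return specValue * std::pow(alignment, specPower);
}
```

With the light straight above the surface and the camera looking straight down, the view ray lines up exactly with the reflected light and the function returns specValue; a grazing view ray gives 0.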
Here is the result of the previous scene with some added Phong specular term:
Digression: an amusing story about Phong lighting is that, in the papers he wrote, Bui Tuong Phong's name appears in this order, which is the traditional order in Vietnam (and in other countries where the given name is written after the family name). Most English-speaking readers assume the reverse order, so people took his last name to be Phong. Otherwise his work would have been known as Bui Tuong lighting.
Blinn-Phong
A possible variation on the specular term is the version derived from the work of Jim Blinn, who, like Bui Tuong Phong, did his graduate work at the University of Utah in Salt Lake City and later worked for Microsoft Research.
Jim Blinn added some physical considerations to the initial empirical result. The idea is to compute an intermediate vector halfway between the light direction and the direction toward the viewer (Blinn's vector, often called the half vector), then take the dot product of this vector with the surface normal. Comparing with the previous formula, we can see that the Blinn term is likewise maximal when the view ray is the reflection of the light ray: it is a specular term.
// viewRay.dir points toward the surface, so subtracting it adds the direction back to the viewer
vecteur blinnDir = lightRay.dir - viewRay.dir;
float temp = sqrtf(blinnDir * blinnDir);
if (temp != 0.0f)
{
    // Normalize the Blinn vector before comparing it with the normal
    blinnDir = (1.0f / temp) * blinnDir;
    float blinnTerm = _MAX(blinnDir * n, 0.0f);
    blinnTerm = currentMat.specvalue * powf(blinnTerm, currentMat.specpower) * coef;
    red += blinnTerm * current.red;
    green += blinnTerm * current.green;
    blue += blinnTerm * current.blue;
}
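The claim that the Blinn term peaks in the mirror configuration can be checked with a small standalone sketch. As before, Vec3, dot and blinnSpecular are hypothetical names introduced for illustration, and the light color and coef factor are omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the article's "vecteur" type.
struct Vec3 { float x, y, z; };

inline Vec3 operator-(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator*(float s, const Vec3& v) { return {s * v.x, s * v.y, s * v.z}; }
inline float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// lightDir: unit vector from the hit point toward the light.
// viewDir:  unit direction of the camera ray (pointing toward the surface).
// n:        unit surface normal.
float blinnSpecular(const Vec3& lightDir, const Vec3& viewDir, const Vec3& n,
                    float specValue, float specPower)
{
    assert(specPower > 0.0f);
    // Unnormalized half vector between the light direction and the
    // direction back to the viewer (which is -viewDir).
    Vec3 blinnDir = lightDir - viewDir;
    const float length = std::sqrt(dot(blinnDir, blinnDir));
    if (length == 0.0f)
        return 0.0f;  // Half vector undefined: light and view ray are parallel.
    blinnDir = (1.0f / length) * blinnDir;
    const float alignment = std::max(dot(blinnDir, n), 0.0f);
    return specValue * std::pow(alignment, specPower);
}
```

For a light at 45 degrees and a view ray along the mirrored direction, the half vector coincides with the normal and the function returns specValue. Moving the camera to look straight down the normal makes the term drop to under one percent of that with a specular power of 60.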
Here is the result of the Blinn-Phong version. As you can see, the output is pretty similar. This is the version we will use for the remainder of the series.
Antialiasing
There are numerous methods aimed at reducing aliasing. This article is not meant to be a complete description of what anti-aliasing does, or a complete directory of all the methods. For now we'll just describe our current solution, which is based on basic supersampling.
Supersampling is the idea that you can take more individual color samples per pixel in order to reduce aliasing problems (mostly the staircase effect and, to some extent, moiré). In our case, for a final image of resolution X by Y, we effectively render at the higher resolution of 2*X by 2*Y, then average four samples to get the color of one pixel. This is 4x supersampling because we compute four times as many rays per pixel. Those four samples do not have to lie within the boundary of the original pixel, nor does the averaging have to be a regular arithmetic average. But more on that later.
The following code shows how it goes. For each pixel that we have to compute we launch four rays, each with its contribution reduced to a quarter. The relative position of each sample is free, but for simplicity we'll just place them on a regular grid for now (as if we had computed an image four times larger).
for (int y = 0; y < myScene.sizey; ++y)
    for (int x = 0; x < myScene.sizex; ++x)
    {
        float red = 0, green = 0, blue = 0;
        for (float fragmentx = x; fragmentx < x + 1.0f; fragmentx += 0.5f)
            for (float fragmenty = y; fragmenty < y + 1.0f; fragmenty += 0.5f)
            {
                // Each ray contributes a quarter of a full pixel contribution.
                float coef = 0.25f;
                // Then just launch rays as we did before
            }
        // Then the contribution of each ray is added and the result is put into the image file
    }
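The averaging step can be isolated in a small sketch, independent of the raytracer. Here supersamplePixel, SampleFn and verticalEdge are hypothetical names: the sample callback stands in for "launch a ray through this image position", and the sub-samples are indexed with integers (an assumption of this sketch, to avoid accumulating a float loop counter) while landing on the same regular grid as above.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for "launch a ray through image position (fx, fy)".
using SampleFn = std::function<float(float fx, float fy)>;

// Average a regular samplesPerAxis x samplesPerAxis grid of sub-samples
// covering pixel (x, y). With samplesPerAxis = 2 this is 4x supersampling.
float supersamplePixel(int x, int y, int samplesPerAxis, const SampleFn& sample)
{
    assert(samplesPerAxis > 0);
    const float step = 1.0f / static_cast<float>(samplesPerAxis);
    const float coef = step * step;  // 0.25 for a 2x2 grid
    float value = 0.0f;
    for (int sy = 0; sy < samplesPerAxis; ++sy)
        for (int sx = 0; sx < samplesPerAxis; ++sx)
            value += coef * sample(x + sx * step, y + sy * step);
    return value;
}

// Toy scene: white to the right of the vertical line fx = 10.25, black to the left.
float verticalEdge(float fx, float /*fy*/)
{
    return fx >= 10.25f ? 1.0f : 0.0f;
}
```

With this toy scene, the pixel at x = 10 straddles the edge and averages to 0.5, while its neighbors at x = 9 and x = 11 come out as 0 and 1. That intermediate gray is what softens the staircase.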
Here is the result of the applied 4x supersampling, greatly zoomed in:
Gamma function
Originally, CRTs did not have a linear response curve. That is, the relation between a color value stored in the frame buffer and its perceived intensity on the screen was not linear. Usually you would instead have something like Iscreen = Pow(I, gamma), with gamma greater than 1.
In order to correctly reproduce linear gradients, or anything that relies on a linear relation (for example dithering and anti-aliasing), you then have to take this function into account when doing all your computations. In the case above, for example, you would have to apply the inverse function Pow(I, 1/gamma). Nowadays that correction can be done at any point in the visualization pipeline; modern PCs even have dedicated hardware in their DAC (digital to analog converter) that can apply this inverse function to the color stored in the frame buffer just before sending it to the monitor. But that's not the whole story.
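The simplified power-law model above can be written down directly. This is only a sketch of that model: displayResponse and gammaCorrect are hypothetical helper names, and real displays do not follow a pure power law exactly.

```cpp
#include <cassert>
#include <cmath>

// Simplified display model from the text: Iscreen = pow(I, gamma).
float displayResponse(float stored, float gamma)
{
    assert(gamma > 0.0f);
    return std::pow(stored, gamma);
}

// Inverse function applied before display: pow(I, 1 / gamma).
float gammaCorrect(float linear, float gamma)
{
    assert(gamma > 0.0f);
    return std::pow(linear, 1.0f / gamma);
}
```

With gamma = 2.2, a stored mid-gray of 0.5 would be displayed at roughly 0.22 of full intensity if left uncorrected, whereas displayResponse(gammaCorrect(0.5f, 2.2f), 2.2f) gives back 0.5 up to rounding.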
Providing a file format calibrated in advance for every configuration (monitors, applications, graphics cards) that could exist out there is like squaring the circle: it's impossible. Instead we rely on conventions and explicit tables (the jpeg format has profiles that describe what kind of response the monitor should provide to the encoded colors; the application layer can then match that to the profile provided for the monitor). For the world wide web and Windows platforms a standard has been defined, called sRGB. Any file or buffer that is sRGB encoded has had its color values raised to a power of roughly 1/2.2 before being written out. Our raytracer works with a 32-bit float per color component, which gives us a lot of extra precision, so we can apply the sRGB transform just before converting down to an 8-bit integer per component.
Here is an image that is sRGB encoded. Your monitor should be calibrated to minimize the perceived difference between the solid gray parts and the dithered black and white pattern. (An LCD will be harder to calibrate than a CRT because, on typical LCDs, the viewing angle affects the perceived relation between intensities, but you should hopefully be able to find a spot that is good enough.)

Why encode images in sRGB rather than store them in a linear color space and let the application or graphics card apply the necessary corrections at the end? Beyond the fact that it is the standard, and that having your png or tga file in sRGB space ensures that viewers with a correctly calibrated display will see it as you expect, sRGB can also be seen as a form of compression. The human eye is more sensitive to differences at lower intensities, and the pow(1/2.2) transform gives more precision to those lower intensities, so when reducing a 32-bit float to an 8-bit value you run less risk of visible banding. (Of course it is a poor form of compression; formats such as jpeg or png do much better than TGA, but since we output a tga file, the losses are already "factored in".)
Here is the corresponding code :
float srgbEncode(float c)
{
    if (c <= 0.0031308f)
    {
        return 12.92f * c;
    }
    else
    {
        return 1.055f * powf(c, 0.4166667f) - 0.055f; // Inverse gamma 2.4
    }
}
//..
// gamma correction
output.blue = srgbEncode(output.blue);
output.red = srgbEncode(output.red);
output.green = srgbEncode(output.green);
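To see the "more precision for the lower intensities" argument in numbers, the following sketch quantizes a linear value to 8 bits with and without the sRGB encoding. srgbEncode is repeated from the listing above so the block stands alone; toByte is a hypothetical helper, and its clamp-then-round conversion is an assumption of this sketch rather than the conversion used in the article's source.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Same transfer function as in the article's listing.
float srgbEncode(float c)
{
    if (c <= 0.0031308f)
        return 12.92f * c;
    return 1.055f * std::pow(c, 0.4166667f) - 0.055f;  // Inverse gamma 2.4
}

// Hypothetical helper: quantize a linear component to an 8-bit code,
// optionally applying the sRGB encoding first.
int toByte(float linear, bool useSrgb)
{
    assert(!std::isnan(linear));
    const float clamped = std::min(std::max(linear, 0.0f), 1.0f);
    const float encoded = useSrgb ? srgbEncode(clamped) : clamped;
    return static_cast<int>(std::lround(encoded * 255.0f));
}
```

The darkest tenth of the linear range, from 0 to 0.1, gets about 26 of the 256 codes when stored linearly, but about 89 codes once sRGB encoded, so dark gradients are quantized in much finer steps.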
Photo exposure
On the previous page, I mentioned that using the min function in our float-to-int conversion was a "naive" approach. Let me detail that. We call that operator a saturation operator. What is saturation? The term has many different uses: you can encounter it in electronics, music, photography, etc. What it means here is that a signal that varies over a large interval is converted to a smaller interval, with everything above the maximum of that smaller interval set equal to the maximum itself. With this approach, as in the following diagram, any value between 1 and 2 will be seen as 1.

This works well only if all the interesting values are below 1. But in reality, and in image synthesis, this is not the case. Intensity in an image can vary wildly. The closer an object gets to a light source the brighter it becomes, and this brightness is not artificially limited. Ideally, we would also like to keep detail in the high intensities as well as in the low intensities. Doing so is called tone mapping (mapping a higher range of values to a lower range without losing too much detail).
Photo exposure is one possible tone mapping operator. In real-life film photography, we had a chemical surface where a component would migrate from one form to another as long as light fell on it. The speed of migration depended on the intensity of the light (the flow of photons). Once migrated, the component became inert, so the rate of migration slowed down over time. Areas with a lot of migration would appear brighter (with or without the intermediate use of a negative film) and those without migration would appear darker. With an infinite exposure time, all of the component would have migrated. Of course this is a gross simplification of the whole process, but it gives us an idea of how to deal with possibly unbounded intensity. Our exposure function computes the final intensity as an exponential function of exposure time and linear light intensity: 1 - exp(lambda * I * duration), where lambda is negative so that the result stays between 0 and 1.

Here is the corresponding code :
float exposure = -1.00f;
blue = 1.0f - expf(blue * exposure);
red = 1.0f - expf(red * exposure);
green = 1.0f - expf(green * exposure);
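The difference between the two operators is easy to see on a few values. In this sketch, saturate and expose are hypothetical function names wrapping the min-based clamp and the exposure formula above; the default exposure of -1 matches the hard-coded value.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// "Naive" saturation operator: everything above 1 collapses to 1.
float saturate(float intensity)
{
    return std::min(intensity, 1.0f);
}

// Exposure operator: 1 - exp(exposure * I), with a negative exposure constant.
float expose(float intensity, float exposure = -1.0f)
{
    assert(exposure < 0.0f);
    return 1.0f - std::exp(intensity * exposure);
}
```

Intensities of 1, 2 and 4 all saturate to 1 with the first operator, while the exposure operator maps them to roughly 0.63, 0.86 and 0.98: bright values are compressed but remain distinguishable and never reach 1.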
"Exposure" is a simplified term, no need to carry all the real life implications (at least in the scope of this limited tutorial).
The first image below uses the original "saturation" operator and, when submitted to a bright light, suffers from serious saturation artefacts. The second image uses our exposure operator under the same light.
Digression: the exposure computation takes place before the gamma correction. We would also have to determine the exposure value automatically instead of using the hard-coded value above. Ideally an artist would adjust the exposure until he gets what he wants from the image. Other effects such as blooming and halos can simulate the effect of brighter lights on a camera or on our eyes. Some bloom happens naturally (due to foggy conditions, "color leaking" in the photographic process, or small defects in our eyes), but most of the time these effects are added for dramatic effect. We won't use them in our program, but you could add them as an exercise.
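The ordering mentioned at the start of this digression can be sketched as a single per-component function: tone map the linear intensity first, then apply the sRGB encoding, then reduce to 8 bits. toOutputByte is a hypothetical name, srgbEncode is repeated from the gamma section so the block stands alone, and the truncating conversion at the end is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Same transfer function as in the gamma section.
float srgbEncode(float c)
{
    if (c <= 0.0031308f)
        return 12.92f * c;
    return 1.055f * std::pow(c, 0.4166667f) - 0.055f;
}

// Hypothetical per-component output step: exposure, then sRGB, then 8 bits.
unsigned char toOutputByte(float linear, float exposure)
{
    assert(linear >= 0.0f && exposure < 0.0f);
    // 1. Tone mapping works on linear intensities and lands in [0, 1].
    const float exposed = 1.0f - std::exp(linear * exposure);
    // 2. The sRGB encoding expects a value already in [0, 1].
    const float encoded = srgbEncode(exposed);
    // 3. Reduce to 8 bits last, when no more arithmetic is needed.
    return static_cast<unsigned char>(std::min(encoded * 255.0f, 255.0f));
}
```

Doing the steps in this order keeps all the arithmetic on linear light values; swapping the first two would feed unbounded intensities to an encoding that is only defined for values up to 1.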
Here is the output of the program with the notions brought up in this page :

In order to compile the source code for this second page, you don't need any additional library. You just need a C++ compiler that comes with the standard C++ library. Tested with GCC 3.4.4 and Visual Studio .NET/2005/2008. Decompress the .rar file with WinRAR.
Get the source code of the raytracer in C++. Source code for the second page only.
To page 3 : "Procedural textures, bump mapping, cubic environment map".









