by the Codermind team. |
|
After dealing with surface properties in our previous articles, we'll go a bit more in depth of what a raytracer allows us to do easily.
One of the interesting property of ray tracing is to be independent of the model of camera, and that is what makes it possible to simulate advanced camera model with only slight code modification. The model we'll explore in this page is the one of the pin hole camera with a conic projection and a depth of field. Another interesting property is that contrary to a lot of other rendering methods such as rasterization, adding implicit objects is the matter of defining that object equation and intersection, in some cases (spheres) it is easier to compute intersection with the implicit object than it is with the equivalent triangle based one. Spheres are one possible implicitly defined primitive. We'll add a new one which we'll call blobs. Finally ray tracing is good for reflections as we've seen in our previous articles, but it can as easily simulate another physical effect called refraction. Refraction is a property of transparent objects, allowing us to draw glass-like or water-like surfaces.
This is the fourth part of our ray tracing in C++ article series. This article follows the one titled "Procedural textures, bump mapping, cube environment map".
Camera model and depth of field
What follows is the extreme simplification of what exists in our eye or in a camera.
A simplified camera is made of a optical center, that means the point where all the light rays converge, and a projection plane. This projection plane can be placed anywhere, even inside the scene since it happens to be a simulation on a computer and not a true camera (in the case of the true camera, the projection plane would be on the film or CCD). It also contained an optical system that is there to converge the rays that pass through the diaphragm. One thing important to understand for the physical model of cameras is that they have to let pass a full beam of light, big enough so that the the film or the retina are inscribed. The optical system must be configured so that the point of interest of the scene is sharp on the projection plane. It is often mutually exclusive, that means there can't be two points, one far and one near that can be both sharp at the same time. In photography we call that the depth of field. Our simplified model of camera allows us to simulate the same effect for an improved photo-realism. But the main advantage we have compared to real life photographer is that we can do easily without the constraint of the physical system.
The blur caused by the depth of field is a bit expensive to simulate because of the added computation. The reason is that in our case we'll try to simulate that effect with a stochastic method. That means we'll send a lot of pseudo-random rays to try to accommodate the complexity of the problem. The idea behind this technique is to imagine we shoot a photon ray through our optical system and that ray will be slightly deviated from the optical center because of the aperture of the diaphragm. The larger the hole at the diaphragm, the bigger the deviation can be. But for simplicity's sake we'll consider the optical system like a black box. Some people successfully managed to render images with accurate representations of lenses, etc.. But that is not our goal here in this limited tutorial. The only thing we care about from now on is at what distance from our optical center are the points in the scene that will be seen as sharp and from which direction our photon was hitting our projection plane so that we can follow its reverse path.
We need to use the geometrical properties of the optical systems. Those are called the invariants. First any ray passing through the optical center itself is not deviated from its course. Second, two rays crossing the "sharp" plane at the same point will converge to the same point on the projection plane, this is by definition of the "sharp" plane. Those two properties will allow us to determine the reverse path of 100% of the rays hitting our projection plane.

By starting from the projection point, if we deviate the ray to pass not from the optical center, but from a slight deviated point, we can then compute the trajectories on the other side of the optical system thanks to the definition of this ray's "point net" (sharp point). So that gives us a system where we can easily set the distance of the sharp plane and the strength of the blur (by allowing more or less divergence). This is a start even as we said, we may wish to simulate more complex optical systems later.
vecteur dir = {(fragmentx - 0.5f * myScene.sizex) / projectionDistance,
(fragmenty - 0.5f * myScene.sizey) / projectionDistance,
1.0f};
float norm = dir * dir;
if (norm == 0.0f)
break;
dir = invsqrtf(norm) * dir;
point start = {0.5f * myScene.sizex, 0.5f * myScene.sizey, 0.0f};
point ptAimed = start + myScene.persp.clearPoint * dir;
for (int i = 0; i < myScene.complexity; ++i)
{
ray viewRay = { {start.x, start.y, start.z}, {dir.x, dir.y, dir.z} };
if (myScene.persp.dispersion != 0.0f)
{
vecteur vDisturbance;
vDisturbance.x = (myScene.persp.dispersion / RAND_MAX)
* (1.0f * rand());
vDisturbance.y = (myScene.persp.dispersion / RAND_MAX)
* (1.0f * rand());
vDisturbance.z = 0.0f;
viewRay.start = viewRay.start + vDisturbance;
viewRay.dir = ptAimed - viewRay.start;
norm = viewRay.dir * viewRay.dir;
if (norm == 0.0f)
break;
viewRay.dir = invsqrtf(norm) * viewRay.dir;
}
color rayResult = addRay (viewRay, myScene, context::getDefaultAir());
fTotalWeight += 1.0f;
temp += rayResult;
}
temp = (1.0f / fTotalWeight) * temp;
The scene file now contains additional perspective information. The one that interests us here is the dispersion shown above. If the dispersion is not zero, then we have to look for the ray exiting from the optical center and the position of the "ptAimed" that is the point that lays at the distance clearPoint on that existing undisturbed ray. Then we add some perturbation and we recompute the direction from the perturbed origin to the ptAimed who is one of our invariant. The more we do that per pixel (with the number defined as "complexity"), the more we approach reality with an infinitely large number of rays. Of course we need to find the right compromise, because each additional ray makes the render longer. It's possible that we could find a heuristic to eliminate some of the additional rays on pixels that do not benefit from the extra complexity but let's not worry about that right now.
The "virtual" distance to the projection plane is determined by the FOV (field of view) and by the size of the viewport.
// We do some extra maths here, because our parameters are implicit.
// The scene file contained the horizontal FOV (field of view)
// and we have to deduct from it at what virtual distance
// we can find our projection plane from the optical center.
float projectionDistance = 0.0f;
if ((myScene.persp.type == perspective::conic) && myScene.persp.FOV > 0.0f && myScene.persp.FOV < 189.0f)
{
// This is a simple triangulation, from the FOV/2
// and the viewport horizontal size/2
projectionDistance = 0.5f * myScene.sizex / tanf (float(PIOVER180) * 0.5f * myScene.persp.FOV);
}
Ideally the coordinate of the scene should be independent of the resolution at which we render the scene but it is not here. In the future we should consider having a "normalized" coordinate system.
Here is the result of the depth of field effect with the sharp plane put at the same distance as the middle sphere :

And here is the result when the sharp plane is put at a farther distance. If the distance of the sharp plane is very large compared to the size of the aperture then only the landscape (at infinity) is seen as sharp here. For our eye and for a camera, "infinity" can be only a few feet away. It's only a matter of perception.

To get this level of quality without too much noise in the blur we need to send a large number of ray, more than one hundred per pixel in the image above. The time needed to render will almost scale linearly with the number of rays : if it takes one minute with one ray per pixel, then it will take one hundred minutes with one hundred rays per pixel. Of course this image took less (around several minutes at the time). Your mileage may vary, it all depends on the complexity of the scene, the amount of optimization that you could put into the renderer and of course of the performance of the CPU you're running it on.
The contribution of all the rays for a pixel are added together before the exposure and gamma correction took place. This is unlike the antialiasing computation. The main reason would be that the blur is the result of the addition of all photons before they hit the eye or the photosensitive area. In practice a very bright but blurry point one would leave a bigger trace on the image than a darker one. This is not a recent discovery as already painters like Vermeer exploited that small details to improve the rendering of their paintings. Vermeer like other people at his time had access to the dark chamber (camera oscura) which profoundly changed the perception of lighting itself.
Isosurfaces (blobs)
A iso-surface is a non planar surface that is defined mathematically like the set of points where a field of varying potential have the exact same value. The field has to be varying continuously else the definition does not constitute a surface. We suppose this field of scalar has the following properties :
- the fields that we consider are additive. That means that we have the field created by a source A, and the field created by the source B, if we put those two sources together, then the resulting field is simply the addition in each point of the space of the field A and the field B. In practice a lot of fields in real life acts this way. This is true for the magnetic field, the electric field and the gravitational field, so the domain is well known. So if we wanted to represent an iso-gravitational surface we could use this method or a similar one.
- the potential field value when the source is a single point is well known. The potential is simply the inverse to the square distance to the source point. Thanks to this and the property above we can easily define the potential field created by N source points.
- this isn't a property but a mere discovered truth. We don't need a perfectly accurate representation of the iso-surface. We allow approximations in the rendering.
Below is a representation of 2D iso-potential curves (in red). Actually this image represents the field of three identical attractors in a gravitational field. Each single red line represents the iso-potential for a different value, in blue, we have the direction of the gradient of the scalar field. The gradient in that case is also the direction of the gravitational force.

What are the justifications for the above properties ? There is none. This is just one possible definitions for a blob you will probably be able to find other ones. Also you should note that for a single given source point, the iso-potential surfaces form a spherical shape. Only if there is a second (third, etc.) source in the area is the sphere deformed. That's why we'll sometimes see our blobs being called meta-balls. They are like spherical balls until they are close from each other, in which case they tend to fusion, as their potential fields add up.
Now let's examine our code implementation. Adding each potential and trying to resolve directly the potential = constant equation will be a bit difficult. Actually we'll end up with the resolution of polynomial equations of the nth degree but those will exceed what we want to do for our series of article. What we'll do instead is go the approximate way, but hopefully close enough for our limited scenes.
If we take the potential function f(r) = 1/r^2, we can approximate this rational function as a set of linear functions instead each of them operating on a different subdivision of space. If we divide the space around each source point with concentric spheres, between two concentric sphere we have the value 1/r^2 being close to a linear function f(r^2) = a*r^2 + b. Plus as we go further along the radius, at some point the potential for that source point becomes so close to zero that it has no chance to contribute to the iso-surface. Outside of that sphere, it's as if only the N-1 other source points contributed. This is an easy optimization that we can apply to quickly cull source points. The image below illustrate how our 1/r^2 function gets approximated by a series of linear function in r^2.

The image below illustrates how our view ray has to go through our field of potential. The potential function is null at the start, so the coefficients a, b, c are zero, zero, zero. Then it hits the first outer sphere. At this point with our approximation from above, the 1/r^2 potential becomes a polynomial of the first degree in r^2, r being defined as the distance that is between a point on our ray and the point that is the first source. This squared distance can be expressed in term of t, the arithmetic distance along our ray. This becomes then a polynomial function of the second degree in terms of t a1*t^2+b1*t+c1. We can then solves the equation a1*t^2+b1*t+c1 = constant to see if the ray has hit the iso-surface in that zone. Luckily for us it's a second degree equation with t as an unknown so very easy to solve (same complexity as our ray/sphere intersection described in the first page). We have to do that same intersection test for each new zone that the ray will encounter. Each zone is delimited by a sphere, that sphere will modify the equation by adding or subtracting a few terms but never by changing the shape of the equation. In fact we have very good features designed in that guarantee that the solution to our equation will be a continuous iso-potential surface.

So here follows the source code. Before doing the intersection test with the iso-surface we have to do the intersection tests with the concentric spheres. We'll have to determine how each intersection is sorted along the view ray, and how much each one contributes to the final equation. We'll have precomputed the contribution of each zone in a static array to accelerate the computation (those values stay constant for the whole scene).
for (unsigned int i= 0; i < b.centerList.size(); i++)
{
point currentPoint = b.centerList[i];
vecteur vDist = currentPoint - r.start;
const float A = 1.0f;
const float B = - 2.0f * r.dir * vDist;
const float C = vDist * vDist;
for (int j=0; j < zoneNumber - 1; j++)
{
const float fDelta = BSquareOverFourMinusC
+ zoneTab[j].fCoef * rSquare;
if (fDelta < 0.0f)
{
break;
}
const float t0 = MinusBOverTwo - sqrtDelta;
const float t1 = MinusBOverTwo + sqrtDelta;
poly poly0 = {zoneArray[j].fGamma * ATimeInvRSquare ,
zoneArray[j].fGamma * BTimeInvRSquare ,
zoneArray[j].fGamma * CTimeInvRSquare + zoneArray[j].fBeta,
t0};
poly poly1 = {- poly0.a, - poly0.b, - poly0.c,
t1};
polynomMap.push_back(poly0);
polynomMap.push_back(poly1);
};
}
std::sort(polynomMap.begin(),polynomMap.end(), IsLessPredicate());
As we can see, we go through the list of sources then the list of concentric spheres to those sources. We could go either way through the list but it makes more sense to go from the outer spheres to the inner one, as if the outer sphere is not intersected by the ray then there is no chance that the inner ones are. The contribution of the spherical zone when entering is always the negation of the contribution of the zone when exiting as seen above because of the symmetrical problem.
We then go through the list of tagged intersections and their associated contributions. It is sorted by their arithmetic distance on the view ray so we can easily determine that the first suitable equation solution is the one we're looking for without having to go through all of them.
float A = 0.0f, B = 0.0f, C = 0.0f;
for (; itNext != polynomMap.end(); it = itNext, ++itNext)
{
A += it->a;
B += it->b;
C += it->c;
if (t > fZoneStart && 0.01f < fZoneEnd )
{
float fDelta = B * B - 4.0f * A * (C - 1.0f) ;
if (fDelta < 0.0f)
continue;
const float t0 = fInvA * (- B - fSqrtDelta);
const float t1 = fInvA * (- B + fSqrtDelta);
if ((t0 > 0.01f ) && (t0 >= fZoneStart )
&& (t0 < fZoneEnd) && (t0 <= t ))
{
t = t0;
bResult = true;
}
if ((t1 > 0.01f ) && (t1 >= fZoneStart )
&& (t1 < fZoneEnd) && (t1 <= t ))
{
t = t1;
bResult = true;
}
if (bResult)
return true;
}
}
return false;
Once the intersection point found, we have to compare it to the intersection point of other objects in the scene of course. If that's the one that is closer to the starting point we'll then have to do our lighting computation on that point. We have the point coordinates but we still need the normal to the surface in order to do the lighting, reflection and refraction. I've seen some people suggest to do the weighted average of all normals of the spheres around the source. But we can do way better than that, because of a nice property of the iso-potential surface. In the case of a surface defined at the points where potential=constant, then the normal to that surface is in the same direction as the gradient of the potential field. Below express as a mathematical expression :

The gradient of a potential field in 1/r^2 being a well known entity, we have the following code to compute the gradient of the sum of a couple of 1/r^2 fields :
vecteur gradient = {0.0f,0.0f,0.0f};
float fRSquare = b.size * b.size;
for (unsigned int i= 0; i < b.centerList.size(); i++)
{
vecteur normal = pos - b.centerList[i];
float fDistSquare = normal * normal;
if (fDistSquare <= 0.001f)
continue;
float fDistFour = fDistSquare * fDistSquare;
normal = (fRSquare/fDistFour) * normal;
gradient = gradient + normal;
}
After all those computations, here what we obtain when we combine three source points :

Fresnel and transparent volumes
Some materials have the property that they transmit the light really well (glass, water etc.) but affect its progression, which leads to two phenomenon. One phenomenon is the refraction, the ray of light is propagated but is deviated from its original trajectory. The deviation is controlled by a simple law that was first formulated by Descartes and Snell.
In short, if ThetaI is the incident angle that the view ray (or the light ray) does with the normal to the surface, then we have ThetaT, the angle that the transmitted ray does with the normal. ThetaT and ThetaI are part of the following equation : n2 * sin(ThetaT) = n1 * sin(ThetaI). This is the famous Snell-Descartes law.
n is called the refraction index of the medium. It is a variable with no unit that is computed for each medium by its relation to the void itself. The void is given a refraction index of 1. n1 is on the incident side and n2 is on the transmission side. Of course since the problem is totally symmetrical, you can easily swap the value of ThetaI and ThetaT, n1 and n2. In classical optics there is no difference between incoming ray and a ray that goes away. That's why we can consider light rays and view rays as similar objects.
The above relation cannot work for all values of ThetaI and ThetaT though. If n1 > n2, then there are some possible ThetaI values for which there is no corresponding ThetaT (because sin(ThetaT) cannot be bigger than one). What it means in practice is that in the real world, when the ThetaI exceeds a certain value then there is simply no transmission possible.
The second effect that those transparent surface cause is to reflect a part of their incident light. This reflection is similar to the mirror reflection we had described in the previous pages, that is the reflected ray is the symmetrical ray to the incident ray with the normal as the axis of symmetry. But what's different is that the intensity of the reflected ray will vary according to the amount of light that was transmitted by refraction.

What is the amount of light that is transmitted and the amount that is reflected ? That's where Fresnel comes into play. The Fresnel effect is a complex mechanism that needs to take into account the ondulatory nature of light (light is a wave). Until now we've treated all surfaces as perfect mirror. A surface of a perfect mirror has loaded particles that will be excited by the light and generate a perfect counter wave that will seemingly cause the ray to go towards the reflected direction.
But most surfaces are not perfect mirrors. Most of the time they will reflect light a little bit at some angles, and be more reflective at some other angles. That's the case of materials such as water and glass. But it's also the case for a lot of materials that are not even transparent, a paper sheet for example will reflect light that comes at a grazing angle and will absorb and re-diffuse the rest (according to the Lambert's formula). Wood, or ceramic would behave similarly too. The amount of light that is reflected in one direction depending on the angle of incidence is called the reflectance.
For a surface that reacts to an incident electromagnetic field there are two types of reflectance depending on the polarity of light. Polarity is the orientation along whom the electric and the magnetic fields are oscillating. For a given incident ray we can basically separate it into two separate rays, one where the photons are oriented along the axis parallel to the surface and one where the photons are oriented along a perpendicular axis. For a light that was not previously polarized, the intensity of each kind is divided in half and half. Of course for simplification reaons we forget about the polarity of light and we simply take the average of the reflectance computed for parallel and perpendicular photons.
float fViewProjection = viewRay.dir * vNormal;
fCosThetaI = fabsf(fViewProjection);
fSinThetaI = sqrtf(1 - fCosThetaI * fCosThetaI);
fSinThetaT = (fDensity1 / fDensity2) * fSinThetaI;
if (fSinThetaT * fSinThetaT > 0.9999f)
{
fReflectance = 1.0f ; // pure reflectance at grazing angle
fCosThetaT = 0.0f;
}
else
{
fCosThetaT = sqrtf(1 - fSinThetaT * fSinThetaT);
float fReflectanceOrtho =
(fDensity2 * fCosThetaT - fDensity1 * fCosThetaI )
/ (fDensity2 * fCosThetaT + fDensity1 * fCosThetaI);
fReflectanceOrtho = fReflectanceOrtho * fReflectanceOrtho;
float fReflectanceParal =
(fDensity1 * fCosThetaT - fDensity2 * fCosThetaI )
/ (fDensity1 * fCosThetaT + fDensity2 * fCosThetaI);
fReflectanceParal = fReflectanceParal * fReflectanceParal;
fReflectance = 0.5f * (fReflectanceOrtho + fReflectanceParal);
}
Then we have to modify slightly the code that computes the lighting and the rest. First if the reflection and refraction coefficients are both equal to one then we forget about the any other lighting because in that case that means that all the light intensity is either reflected or transmitted. Then we have to follow the view ray after it hit the surface. Before that point we had very limited choice we were either reflected or stopped. Here we have a third choice appearing. The C++ language and the computer execution being sequential in nature we cannot follow all choices at the same time. We can use recursion and store temporary choices on the C++ stack or do something else. In our code we decided to do something else but it's pretty arbitrary so you shouldn't think that's the only solution. So instead of following the full tree of solutions, I only follow one ray at a time even if the choices made me diverge from another possible path. How can we account for BOTH reflection and refraction in this case ? We'll use the fact that we've already shot a lot of rays for the depth of field effect and that we can evaluate a lot of different paths by shooting rays and varying randomly (or not) the path taken in each ray. I call that the "Russian roulette", because at each divergence I pick randomly based on some "importance" heuristic which road my ray will take.
We need a lot of samples per pixels but we have to acknowledge that some samples will contribute more to the final image because they contributed more light. This leads to our implementation of importance sampling. The reflectance and transmittance we've computed above have then to be taken as "probabilities" that a given ray will take one path vs the other.
Here's what we have when we've rewritten those ideas in C++ :
float fRoulette = (1.0f / RAND_MAX) * rand();
if (fRoulette <= fReflectance)
{
// rays are either reflected, ..
coef *= currentMat.reflection;
viewRay.start = ptHitPoint;
viewRay.dir += fReflection * vNormal;
}
else if(fRoulette <= fReflectance + fTransmittance)
{
// ..transmitted..
coef *= currentMat.refraction;
float fOldRefractionCoef = myContext.fRefractionCoef;
if (bInside)
myContext.fRefractionCoef = context::getDefaultAir().fRefractionCoef;
else
myContext.fRefractionCoef = currentMat.density;
viewRay.start = ptHitPoint;
viewRay.dir = viewRay.dir + fCosThetaI * vNormal;
viewRay.dir = (fOldRefractionCoef / myContext.fRefractionCoef)
* viewRay.dir;
viewRay.dir += (-fCosThetaT) * vNormal;
}
else
{
// .. or diffused
bDiffuse = true;
}
So here's what you would get with a refractive index of 2.0 combined with the blob rendering from above :

One of the disadvantage of shooting rays with russian roulette is that you risk ending up with a "noisy" look, this is often the case for renderers that rely on randomness in their algorithms. This is not too visible on the above image because we did shoot enough rays to hide that effect. But it can prove expensive to hide it.
Here is the output of the program with the notions that we explained in this page and the previous ones :

In order to compile the source code for this fourth page, you don't need any particular additional library. You just need a C++ compiler coming with the standard C++ library. Tested with the GCC 3.4.4 and Visual Studio .net/2005/2008. Decompress the .rar file with Winrar.
Get the source code of the raytracer in C++. Source code for the fourth page explanations only.
For this page and the following ones you will also need the extra files in the archive called textures.rar (3 MB). It contains the cube map files used to render those images.
To page 5 : "High dynamic range, beer's law and chromatic aberration".




