Areas

Tutorials

j3d.org

State Sorting Tutorial

In terms of gaining maximum performance from your rendering code, there are two fundamental techniques - culling and state sorting. Culling removes non-visible geometry, while state sorting reorganises the ordering that calls are made to the rendering API in order to reduce how much of the state changes for each new primitive.

State sorting is quite a bit of a black art. Doing a web search leads to many questions about it, but very few concrete suggestions on going about implementing in it your application. This tutorial aims to give as good an introduction as possible, without getting into the specifics of a particular higher-level scene graph structure. Just in case you are wondering, the code examples are taken from the Aviatrix3D project.

 

What is state sorting?

State sorting is about minimising the number of state changes that go down the OpenGL pipeline by grouping collections of similar objects together. Some state changes are more expensive than others. For example, turning on and off depth buffer writing is far less expensive than turning on or off a light. What you need to do then, is to work out what State is more expensive than another and then sort your objects accordingly.

Of course, that leaves the question of "well how do I sort?". Sorting requires that you have something to sort. Since this is an object-oriented language, sorting collections of objects is a good idea. (well, duh...). How you decide to sort them is a long and involved process. Every one that I've seen done in Java ends up using a wrapper object that implements the Comparable interface. Why? Well that allows you to make use of the java.util.Arrays method called sort(). That simple method makes use of the most efficient sorting algorithm for this task - an insertion sort[1].

The Comparable interface has a method compareTo() which has to return negative, zero or positive. This is the key to the state sorting. As far as the insertion sort algorithm is concerned, when two objects are compared, it just wants to know the relative difference between them. It doesn't care for absolutes, just whether something is identical, less than or greater than another. That makes our state sorting life much easier, because now all we need to do is evaluate our state in the current object versus the state in the object that is handed to us. So long as the rules about that state comparison are consistent, you'll end up with a nicely sorted array at the end, that minimises state changes according to whatever rules you want.

The next thing to look at is - what does that object contain that it needs to compare so that it can give this answer of negative, zero or positive? The simple answer is: whatever you need to compare to get the order correct. Heh, good answer that one (and one of the most frustrating to find a decent answer on the internet or in books). That object must hold all of the state that is of importance to your sorting order, so that means things like textures, lights, material, buffer state handling etc etc.

At about this point, you've moved into the world of a scene graph. It may not look like it at first, but step back and you'll see that you need collections of objects that hold info about the current texture (and all it's parameters), material state, light state etc. These are all classic objects of a scene graph architecture. There's not much of a jump going from this up to the next level of creating grouping nodes, switches, viewpoints etc. So then, the question arises - do you really want to implement your own scene graph, or would one of the pre-rolled ones like Xith3D or Aviatrix3D save you a lot of time and effort?

Assuming you've decided to plow ahead with your own solution, in each of these objects you will now need to implement a comparison between the two states. The good thing about the requirements is that it is very liberal so long as we keep to those basic ternary output talked about above. The goal here is to sort things to minimise state change, and that's all. It's not about priority sorting (though that could be done too), so your decisions about what is "less than" or "greater than" another object's state can be entirely arbitrary - so long as you are always consistent.

To illustrate how this work, let's take an example of a Light that we want to state sort. There are 3 different different light types provided by OpenGL - point, spot and directional light. Let's say our two objects that we want to sort represent a light. Our first decision point is based on whether we even have the same type of light. Changing from a spotlight to a spotlight is a small state change compared to a directional light, for example. So, at the start of our compareTo() method, we have something like this:
public int compareTo(Object o)  {
  if(o == this)
    return 0;

  Light other = (Light)o;
  if(type != l.type)
    return type < l.type ? -1 : 1;

If the objects are the same reference then obviously they're the same. Next, cast the incoming object to be the same as us. It's ok if you toss a ClassCastException here as the sorter is expecting that. Of course, there's a lot more infrastructure to go over the top if you want to make sure you are sorting more than just light state (remember what I said about scene graphs above?). Finally you get to the important bit - are the lights of the same type? If they are, then we need to find some other way of differentiating the two objects, so move on to the next comparison. If they are not the same, then great, we can bug out here and return a value safely. Remember, we really don't care what value is returned, so long as it is either negative or positive, nor does it matter which order. So in this case, we've chosen to go with the natural ordering of the light type. Of course, you could have the other way around if you like, it makes absolutely no difference in the end.

From this point on, it's just more of the same - for every state that your object keeps, you need to do a comparison. Keep doing comparisons until you find something that doesn't match, or until there is nothing left to compare, and thus return zero. For example, here's the rest of that method:
public int compareTo(Light l) {
  if(l == null)
    return 1;

  if(l == this)
    return 0;

  // are we the same type of light?
  if(l.lightType != lightType)
    return lightType < l.lightType ? -1 : 1;

  if(enabled != l.enabled)
    return enabled ? 1 : -1;

  int res = compareColor3(diffuseColor, l.diffuseColor);
  if(res != 0)
    return res;

  res = compareColor3(specularColor, l.specularColor);
  if(res != 0)
    return res;

  return compareColor3(ambientColor, l.ambientColor);
}

protected int compareColor3(float[] a, float[] b) {
  if(a[0] < b[0])
    return -1;
  else if (a[0] > b[0])
    return 1;

  if(a[1] < b[1])
    return -1;
  else if (a[1] > b[1])
    return 1;

  if(a[2] < b[2])
    return -1;
  else if (a[2] > b[2])
    return 1;

  return 0;
}

See how at each comparison point all we do is look for inequality and then assign some random value if they aren't.

One thing that you do want to do is order your equality checks in a way that is the most efficient. Do the big checks first, such as the type of light, as this will get the biggest results fastest. Then, order the checks as you go down for things that are more and more likely to be the same. Determining that is going to take a lot of just basic knowledge about 3D graphics and your particular application. Remember that this is an insertion sort algorithm that uses a tree-style structure where multiple comparisons are made time and again with O(n log n) time cost and thus the faster you can make a decision about the difference between two objects, the quicker you can sort.

Of course, that's the simple part, now you have to deal with the more complex issue of multiple-states within an object. Think of your terrain - you not only have a texture object, but material information, lights and the geometry itself (though geometry is not typically state sorted[2]). If all you did was toss the individual lights, materials and textures into a single array and then tell Arrays to sort it, you'd end up with a huge mess, and nothing useful out the end. What you really need to sort on is the collection of states that work with a single large object. That way a collection of state is going have the minimal change - it's amazing how often you end up with objects that have everything identical except one component of one object way down the bottom of the state set. That's why you see scene graphs have objects like Appearance that bundle everything together. Appearance is the complete set of states that an object can have, and thus makes a really nice object to sort with as all the comparisons that matter to your sorting algorithm will always be sorting exactly the same object type - even though the internals may be different, at least you'll have something meaningful out of the sorter. Then, the comparisons can start doing checks like whether two appearances even have the same material set.

For example, here's the comparison function out of the TextureUnit object in Aviatrix3D:

public int compareTo(TextureUnit tu) {
  if(tu == null)
    return 1;

  if(tu == this)
    return 0;

  if(texture != tu.texture) {
    if(texture == null)
      return -1;
    else if(tu.texture == null)
      return 1;

    int res = texture.compareTo(tu.texture);
    if(res != 0)
      return res;
  }

  // Same type of texture. Hmmmm.... let's now work on
  // the texture attributes.
  if(tatts != tu.tatts) {
    if(tatts == null)
      return -1;
    else if(tu.tatts == null)
      return 1;

    int res = tatts.compareTo(tu.tatts);
    if(res != 0)
      return res;
  }

  // Ah, well ok, Now let's try coordinate generation
  if(coordGen != tu.coordGen) {
    if(coordGen == null)
      return -1;
    else if(tu.coordGen == null)
      return 1;

    int res = coordGen.compareTo(tu.coordGen);
    if(res != 0)
      return res;
  }

  // Gah! Matrix time now as that's the only thing left
  if(validMatrix != tu.validMatrix)
      return validMatrix ? 1 : -1;

  for(int i = 0; i < 16; i++) {
    if(texTransform[i] != tu.texTransform[i])
      return texTransform[i] < tu.texTransform[i] ? -1 : 1;
  }

  return 0;
}

See how at this high level all you care about is whether the two objects being compared even have the same objects set, and if they don't, to do a comparison on them. This can be repeated right down the chain of object types that you're working with.

Now you've got the basics of a state sorting system going. There's plenty more to add, but this should be enough to get you up and running. Anything after this starts getting fairly implementation specific.

Footnotes:
[1] It appears that the only way of getting a good insertion algorithm is through the use of the comparison-based algorithms. I've been searching for a long time to see if I could come up with a good way of using the one of the linear sorting algorithms such as a radix sort, but that involves generating some sort of meaningful descriptor code with many, many units of comparison - basically a really big hash code. That's a very far from trivial setup for a generalised scene graph. However, if you have very specific objects and limited choices in what you're going to offer to the GL pipeline then it may be doable.

[2] Sometimes geometry is also sorted. An alternative to state sorting is depth sorting. By doing this, objects are sorted from front to back according to their distance from the camera. The idea of this technique is to eliminate overdraw as much as possible, rather than to minimise state changes. Thus, any objects that were definitely completely obscured by those in front would be pulled out of the rendering pipeline and not drawn (ie, a primitive form of occulusion culling). Overdraw was a problem back in the days of the old rasterisation-only video cards that had no 3D capabilities, but these days state changes are the real performance killer.