Thursday, October 27, 2011

Playing Dice with Mesh Simplification

Mesh simplification is as old as polygon meshes. Even then, it is still a hot topic. There are plenty of methods out there, picking the right one is the tricky part. It is not only about the quality of the resulting mesh, but also how fast can you do it.

I had decided not to use progressive meshes at all and chose clipmaps instead. That meant I had to deliver each cell of the clipmap in a single shot. My plan was to use a very simplified mesh and keep most of the detail in a series of maps: normal map, material map, etc. In this post I will describe the method I used for mesh simplification.

Here you can see some results:

The voxel countouring phase was churning out huge meshes. A single cubic cell of 6 meters side would easily take a few hundred thousand polygons. I needed to fit that into a budget of one or two thousand polygons.

Hitting this budget was the first problem. Removing so many faces from the mesh could severely alter its appearance. It was necessary that the simplification introduced the least amount of error possible.

There are two main classic approaches to mesh simplification: Vertex Clustering and Greedy Serial simplification. I had used some vertex clustering methods before, but disliked their boxy artifacts.

One of the best methods for minimizing error is one described by Hoppes in 1993, which is a greedy serial approach. It uses Quadratic Error metrics to minimize errors. The algorithm can be described like this: For all edges in the mesh, find the one edge that when collapsed introduces the least error. Then repeat until the desired polygon count is reached.

This method produces beautiful results, but it is very slow. It sequentially visits each edge in the mesh to discover which one to collapse. If your mesh has to be compressed from 100,000 triangles to 1,000 that's a lot of iterations for every collapse. It can be optimized so errors are pre-computed in one initial phase and kept in a sorted list, having the smallest error edge on top. As long as the list is properly sorted, you can get the edge at the top and collapse it. Still, every time an edge collapses, several other edges need their collapse error to be recomputed and their place updated in the sorted list. This comes at some cost.

Then I found this paper which opened my eyes to a technique called Multiple Choice Algorithms. These guys apply it to mesh simplification, but it is a very general and powerful way to solve a big family of problems. Load balancing and network routing are some typical applications.

This is how the method is usually explained: Imagine you have 100 large odd-shaped containers and a tube that continuously ejects a ping-pong ball. Your task is to fill the containers evenly with balls. The catch is that someone else will remove balls behind your back, so the only way to know how many balls you have in any given container is to count them. You cannot just keep track of how many balls you have put since some may have been removed already.

The naive approach would be to count the number of balls in each container and then place the ball in the one having the least amount. That would work, but it is an awful lot of work and it would take a lot of time.

Now, instead of that, imagine you just pick 5 random containers, count the number of balls in them and place the new ball in the one having the least amount of balls. Would it work in the long run? If it did, it would be great, you have reduced the number of containers to count from 100 to just 5.

The question of whether this really works depends on the total number of balls. For small numbers it won't be good, but when the balls start to pile up the randomness evens out and so do the containers.

Applied to mesh simplification, the method means you pick only a few edges randomly, let's say 10, and you collapse the one introducing the smallest error.

Does this create noticeable differences with the sequential greedy method? Again it is a matter of how many times you perform this operation. These are huge meshes and most edges in the original mesh will be gone. This is the equivalent of having a large number of balls.

You can actually estimate the chances of it making an error. A compression from 100,000 triangles to 1,000 means only 1% of the edges will remain. An error happens when the best edge collapse in the set of 10 random edges belongs to the 1% that should never collapse. This means that the other nine candidates are also in the 1%, otherwise one of them would have been picked for the collapse. So the probability of picking the wrong edge is 0.01 at the tenth power: 0.00000000000000000001. This is what in engineering we like to call never.

I implemented this method and the speedup was considerable. It also made the whole algorithm a lot simpler. In the serial greedy method you need to do a lot of housekeeping to avoid recomputing edge collapse errors. Also when a collapse actually happens, you need to recompute the error metric for the adjacent edges. In the multiple choice method it is OK to recompute the error metrics for the small set of edges every time, the optimization structures of the serial greedy approach are not needed anymore.

What is truly remarkable about the multiple choice optimization is that it lends very well to parallelization. You can have thousands of simultaneous threads, each one looking at one different bucket of random candidates. Each thread would output the result of the collapse operation for its bucket and the resulting mesh would be the input for the next iteration. A single iteration could collapse thousands of edges in one shot.

Remember, this can be used in many other things than just meshes. Like the scan algorithm, this is one tool you may want to keep in your set for the future. As for myself I'll keep walking with this hammer in my hand, I'll let you know if I find another nail.

Friday, October 7, 2011

Popping Detail

I have added two new videos where you can see the popping of new detail.

It is very noticeable when the camera is close to a boundary and the new mesh comes right in front of it.

In these videos the clipmap loader is always trying to catch up with the moving view. In most cases it drags behind, as it really needs to load and process nearly 50 Megs worth of new meshes for each new scene. These meshes won't be used by the clients. Clients will receive very simplified meshes with all the detail projected. They will load a lot faster, but some of them will have to come from the network, so it is essentially the same problem.

Adding some degree of prediction should help. A prediction logic would load the clipmap scene for where the camera will be, at least a second ahead.

These videos are from some kind of weird terrain I was testing. I still need to go back and improve the terrain definition and materials. Right now it is a bit crazy.

If you look carefully you will notice both videos end in the same spot of terrain, but they start in very different places. Enjoy!


After playing with some ideas on how to send the world to viewers, I ended up choosing Clipmaps.

Clipmaps have been around for a while, so I won't spend much time explaining what they are, how they work. It is about breaking the world around the viewer into concentric rings. Each ring has roughly the same amount of information, but the further they are from the viewer, the larger they become. The perspective projection makes distant objects appear smaller on screen, so you end up having a fairly constant distribution of information for the entire view.

In most implementations clipmap rings are square. This is mostly because Euclidean spaces have affinity with square things, but rings could be spherical, pyramidal, etc. Spherical rings provide the best ratio between information and the size at which it appers on screen. With square clipmaps three will be some areas where you have more information than what you actually need. This error is the difference between the Manhattan and the Euclidian distance, but it is not very noticeable. It is very hard for the viewer to perceive the axes of the clipmap system.

Clipmaps were quite easy to implement in my case. I already had the world information stored in an octree. The leaf cells were 6m by 6m by 6m. It is not hard to see that octree cells can be arranged around any point in space so cells are increasingly subdivided as they get closer to this point. The following image shows this in a quadtree:

Something nice about this approach is that a given cell is always the same regardless of the point from where its is being observed. Once a cell is downloaded from the server it can be cached and reused any time later, no matter if the viewer has now moved to a different position.

Clipmaps are very simple to manage. The cells can be stored as files and served over HTTP. No server-side programming is necessary. Cloud services like Amazon's S3 make it dirt cheap to store and serve arrays of files like these clipmap cells. They also provide caching at multiple points if your files don't change often. As a solution it is fast, reliable and scales very well.

In principle clipmaps are a way to manage information for a scene from a point of view. Some of the shortcomings we usually attribute to clipmaps do not really come from the method, rather from the way it is used.

A landmark limitation of clipmaps is the noticeable popping of higher detail cells into the view. This is not the clipmap's fault. It is caused by the way information is stored. For instance, if the higher detail cells contained progressive mesh operations to be performed over the lower resolution, it would be possible to smoothly transition into the higher detail geometry and topology.

One real hairy issue with clipmaps is managing the seams between rings at different resolution. Some engines do a lot of stitching. Some other engines create "skirts" on the seams so any holes would be covered by the skirts.

How am I dealing with popping and seams? For now I'm just waiting to see how bad of a problem this is in my case. Right now seams are hidden by some overlap between cells. Since the texturing matches, they are hard to see. I like this solution since it is very parallel, one cell does not depend on its neighbors. The popping will depend on the mesh simplification and projection for the final meshes. In the high resolution mesh explorer program I did the popping is very noticeable, but this may be misleading at this point.

Here are a couple of videos. Again, they are taken from the mesh explorer, so please disregard texturing, illumination. The clipmap shows nicely in the wireframe at some point in the first video. The second video shows high frequency meshes, where the popping is most noticeable.