Wednesday, September 28, 2011

Streaming Meshes

At this point I was considering different methods to stream this huge world to users. I was looking for an approach that featured great compression, but could also be selective about what information to send. There is no point in sending highly compressed information if it is not needed anyway.

Polygons, or rather triangles, are a very efficient way to compress 3D models. You can think of them as a form of lossy compression: a very irregular surface can be approximated by a series of triangles, and the more triangles you use, the closer the resulting surface gets to the original.

The beauty of this is that you can do it progressively: you start with a coarse approximation and then, if needed, increase the number of triangles until the approximation is satisfactory.

Progressive meshes are quite old. The classic approach is to simplify a mesh and record all the simplification operations. If you reverse this process, that is, start from the simplified mesh and play back the recording, it is as if you were adding detail.
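
To make the idea concrete, here is a minimal sketch of the classic edge-collapse scheme, in the spirit of Hoppe's progressive meshes. The type and field names are mine, for illustration only:

    #include <cstddef>
    #include <vector>

    // Each simplification step collapses an edge into a single vertex.
    // The recording stores everything needed to undo that collapse, so
    // playing it back re-adds the detail one step at a time.
    struct VertexSplit
    {
        int parent;                // vertex the collapse merged into
        int child;                 // vertex removed by the collapse
        float childPosition[3];    // where the removed vertex goes back
        std::vector<int> newFaces; // faces re-introduced by this split
    };

    struct ProgressiveMesh
    {
        // Coarse base mesh plus the recording, played back in reverse:
        // applying the next VertexSplit adds detail to the mesh.
        std::vector<VertexSplit> splits;
        std::size_t nextSplit = 0;

        void refineOnce()
        {
            if (nextSplit < splits.size())
                applySplit(splits[nextSplit++]); // undoes one collapse
        }

        void applySplit(const VertexSplit& vs); // mesh surgery, omitted
    };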

This principle has been extended so playback only happens for the polygons that are visible, making it view-dependent. Detail outside the view is not refined. This property suits streaming from a server very well: you can be very selective about which details are sent to clients. For instance, if the user cannot see the back of a mountain or a house, there is no need to stream any detail there.
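
A sketch of what that visibility test could look like, assuming a screen-space error metric; the types and the one-pixel threshold are my assumptions:

    #include <cmath>

    // A refinement operation is played back only while the detail it adds
    // is inside the view and still larger than about a pixel on screen.
    struct RefineOp
    {
        float center[3];        // region of the mesh this op affects
        float radius;           // bounding-sphere radius of that region
        float objectSpaceError; // geometric error this op removes
    };

    struct Camera
    {
        float position[3];
        float focalLength;                            // in pixels
        bool canSee(const float c[3], float r) const; // frustum test, omitted
    };

    bool shouldRefine(const RefineOp& op, const Camera& cam)
    {
        if (!cam.canSee(op.center, op.radius))
            return false; // e.g. the back of a mountain is never refined

        float dx = op.center[0] - cam.position[0];
        float dy = op.center[1] - cam.position[1];
        float dz = op.center[2] - cam.position[2];
        float distance = std::sqrt(dx * dx + dy * dy + dz * dz);

        // Project the object-space error to screen space.
        return op.objectSpaceError * cam.focalLength / distance > 1.0f;
    }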



Now, if you use a simplification method that is able to change mesh topology, the playback can increase the "genus" of your mesh. For instance, a house in the distance would not have a hole in the chimney, but if you got close enough, this hole would appear. This is a game changer.

An octree-based mesh simplification method can do this for you.
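
For reference, here is a rough sketch of the clustering idea behind such a method, assuming positive coordinates and a uniform grid per octree level; the names and the key packing are mine:

    #include <cstdint>
    #include <unordered_map>
    #include <vector>

    // Octree (vertex-clustering) simplification: all vertices that fall in
    // the same octree cell at a given level are merged into one. Merging
    // can close holes or fuse parts, which is exactly the topology change
    // mentioned above; playing it back in reverse re-opens them.
    struct Vec3 { float x, y, z; };

    uint64_t cellKey(const Vec3& v, float cellSize)
    {
        // Quantize the position to the cell grid of this octree level.
        // Assumes positive coordinates that fit in 21 bits per axis.
        uint64_t ix = uint64_t(v.x / cellSize);
        uint64_t iy = uint64_t(v.y / cellSize);
        uint64_t iz = uint64_t(v.z / cellSize);
        return (ix << 42) | (iy << 21) | iz;
    }

    std::vector<int> clusterVertices(const std::vector<Vec3>& verts, float cellSize)
    {
        std::unordered_map<uint64_t, int> cellToCluster;
        std::vector<int> clusterOf(verts.size());
        for (std::size_t i = 0; i < verts.size(); ++i)
        {
            uint64_t key = cellKey(verts[i], cellSize);
            auto it = cellToCluster.find(key);
            if (it == cellToCluster.end())
                it = cellToCluster.emplace(key, int(cellToCluster.size())).first;
            clusterOf[i] = it->second; // all verts in a cell share one cluster
        }
        return clusterOf; // degenerate faces are then removed, detail recorded
    }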


This method works very well over a network. TCP/IP is a streaming protocol: not only does it make sure every bit of information arrives, it also makes it arrive in the same order it was sent. This is actually overkill. Traditional progressive meshes required playback to occur in the exact same order as the simplification occurred, so you needed something like TCP/IP for them. With an octree-based progressive mesh this is not always necessary. For instance, cells at the same level can be refined in any order. In this case it makes sense to use UDP, since often it won't matter if packets arrive out of order.
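
A sketch of why UDP fits here: every packet can be self-describing, carrying the octree cell it refines, so siblings can be applied in any order and a lost packet can simply be re-requested. The wire format and function names below are hypothetical, not an actual protocol:

    #include <cstdint>
    #include <vector>

    struct CellPacket
    {
        uint64_t cellId;              // which octree cell this payload refines
        uint8_t level;                // octree depth of that cell
        std::vector<uint8_t> payload; // encoded detail for the cell
    };

    void applyCellDetail(uint64_t cellId, const std::vector<uint8_t>& payload);

    void onPacketReceived(const CellPacket& p, uint8_t deepestLevelNeeded)
    {
        // Parents must exist before children, but cells at the same level
        // can arrive in any order. Packets deeper than currently needed are
        // dropped; they can be requested again if the viewer gets closer.
        if (p.level > deepestLevelNeeded)
            return;
        applyCellDetail(p.cellId, p.payload); // mesh update, omitted here
    }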

A big problem with traditional polygonal approaches is texturing. Each vertex requires a UV parametrization that links the 3D model with the 2D texture space. This is a problem because a progressive mesh evolves in a way that may break its texture mapping. Often the solution was to create charts shared by both the model and the 2D space, and then constrain simplification to stay within the charts. This was too limiting, and it did not cope well with massive topology changes.

What if I dumped 2D parametrization altogether? I could use the actual vertices to store all the information I needed: material, illumination, etc. The mesh itself would act as a compression structure. With enough vertices the results could be very detailed, much as if I were using conventional 2D mapping to store materials and lightmaps. Then the concept of a material kicks in. A material is more than a texture; it can include displacement maps for tessellation, Wang tiles of very detailed geometry for instancing, instructions for procedural generators, etc.
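
As a sketch, a vertex in such a scheme could carry something like the following; the exact fields are my assumptions, not a fixed format:

    #include <cstdint>

    // A vertex that carries everything a texture would normally provide,
    // so no UV parametrization is needed anywhere.
    struct SurfaceVertex
    {
        float position[3];
        float normal[3];
        uint16_t materialId; // indexes a material table: displacement maps,
                             // instanced geometry tiles, procedural rules...
        uint8_t color[3];    // low-frequency albedo baked per vertex
        uint8_t light;       // baked illumination term
    };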

I did some quick tests and saw this method was within what the current hardware generation can do. Still, I did not like a couple of things about it. First, it required rendering a lot of polygons. Newer systems could take it, but older ones did not do so well. And second, I would need to run some streaming logic on the server side, making sure only those packets needed by the client were sent. I don't have the budget to run UDP-streaming servers right now.

So I did not choose this direction, but I think it has a lot of potential. I may come back to it later. As I see it, this method is an alternative to sparse voxel octrees. Voxels make the most sense when polygons become too small. They don't require a 2D parametrization of your world anymore, and they can deal with topology simplification in a beautiful way. Progressive octree meshes do pretty much the same, as long as you don't allow your polygons to become very small on screen. The advantage is that you can still use all the polygon hardware.

Friday, September 23, 2011

How much data?

While we are on the subject of big data, I wanted to show how much detail these algorithms are able to generate. It may give you an idea of the amount of information that needs to be delivered to clients, one way or another.

Before anything else, a disclaimer: the following images (and video) are taken from a mesh preview explorer program I have built. The quality of the texturing is very poor; it is actually textured in realtime using shaders, with a lot of shortcuts, especially in the noise functions I use. Also, the lighting is flat: no shadows, no radiosity. But it is good enough to illustrate my point about these datasets.

Take a look at the following screenshot:


Nothing fancy here; it is just some rock outcrops. In the next two screenshots you can see how many polygons are generated for this scene:



This is how meshes come out of the dual-contouring stage. If you look carefully you will see there is some mesh optimization. The dual contouring is adaptive, which means it can produce larger polygons where the detail is not there. The problem is, with this type of procedural function there is always detail. This is actually good: you want to keep all the little cracks and waves in the surface. It adds character to the models.
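
A sketch of what that adaptivity test can look like, assuming the usual quadric error function (QEF) formulation of dual contouring; the threshold and names are mine. With noisy procedural surfaces the residual rarely gets small enough, which is why these meshes stay dense:

    // An octree node keeps its children unless one vertex can represent
    // them all, i.e. the QEF residual of the merged vertex stays under a
    // threshold. Collapsing emits one larger polygon instead of many.
    struct OctreeNode
    {
        OctreeNode* children[8]; // null for leaves
        float qefResidual;       // error of collapsing this node to one vertex
    };

    bool canCollapse(const OctreeNode& node, float maxError)
    {
        for (const OctreeNode* child : node.children)
            if (child && !canCollapse(*child, maxError))
                return false; // a child already needs its own vertex
        return node.qefResidual <= maxError;
    }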

Here is another example of the same:



These scenes are about 60x60x60 meters. Now extrapolate this amount of detail to a 12 km by 12 km island. Are you sweating yet?
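
A back-of-the-envelope count, just tiling the surface and ignoring caves and overhangs, assuming this density holds across the island:

    12 km / 60 m = 200 tiles per side
    200 x 200 = 40,000 scenes like the ones above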

I leave you with a screen capture of the mesh preview program flying over a small portion of this island. It is the same 60m x 60m x 60m view moving around; detail outside this view is clipped. The meshes loading here are a bit more optimized, but they still retain a lot of detail. I apologize for any motion sickness this may cause. I did not spend any time smoothing camera movement over the terrain.

Monday, September 12, 2011

Big Data

The term "Big Data" is frequently thrown around these days. I don't know about you, but this is what comes to mind:




Big Data is when you have so much information that traditional storage and management techniques begin to lose their edge.

One of the promises of Procedural Generation is to cut the amount of data required to render virtual worlds. In theory it sounds perfect: just describe the world with a few classes, pick a few seeds and let everything grow from there. It is storing metadata instead of data. The actual generation is done by the player's system; data is created on demand.
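
A minimal sketch of that "metadata instead of data" idea; the noise function and parameters are placeholders, not an actual world format:

    #include <cstdint>

    float noise2D(uint64_t seed, float x, float y); // any seeded noise, e.g. Perlin

    // The world description is just a seed and a few parameters; the
    // terrain itself is recomputed on demand.
    struct WorldDescription
    {
        uint64_t seed; // everything grows from this
        float mountainScale;
        float seaLevel;
    };

    // Deterministic: the same (seed, x, y) always yields the same height,
    // so clients can regenerate any region without ever storing it.
    float terrainHeight(const WorldDescription& w, float x, float y)
    {
        return w.seaLevel + w.mountainScale * noise2D(w.seed, x, y);
    }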

The problem is how to generate compelling content. The player's machine may not be powerful enough for this. Odds are you will end up targeting the lowest common denominator and your options will become seriously limited. For instance, Minecraft generates the entire world locally. This limits its complexity: blocks are huge and there is little diversity. For Minecraft it works, but not many games could get away with it. If you want some kind of realistic graphics, it is probably out of the question.

As it seems, data must be generated away from the user's system. One possible approach is to use procedural techniques to create a static world. Whatever data constraints we may have can be built in as parameters to the procedural generation. For instance, we could make sure the entire world definition fits on a DVD.

On the other hand, procedural techniques can produce huge datasets of very unique features. If you don't really need to fit on a DVD, Blu-ray or any type of media you can keep around your house, you could create very large worlds. This is not new; Google Earth already does it. You don't need to download the entire dataset to start browsing, it all comes on demand. A game world would not be as large as our real planet, but it would be a lot more complex and detailed. The world's topology, for sure, will be more than just a sphere.

Then it becomes about Big Data. While generation and storage may be simple and affordable, moving that much data becomes a real challenge. What is the point of becoming so big when you cannot fit through the door anymore?

One option is not to move any data at all and render the user's point of view on the server. You would stream these views down to the player's machine pretty much like you would stream a movie. This is what the OnLive service does for conventional games. It is appealing because you can deal with bandwidth using generic techniques that are not really tied to the type of content you produce, like the magic codecs from OnLive and its competition. But there are a few issues with this. Every user creates a significant load on the servers. Bandwidth costs may also be high: every frame experienced by the users must go down your wires.

The other option is to do it like Google Earth: send just enough data to the client so the virtual world can be rendered there. This approach requires you to walk a thin line. You must decide which portions of data to send over the wire and which ones can still be created on the client at runtime. Then, whatever you choose to send, you must devise ways to compress it as much as possible.

It is not only about network bandwidth. In some cases the data will be so big that traditional polygon rendering may not make sense anymore. When large amounts of triangles become smaller than actual pixels after they are projected onto the screen, it is best to consider other forms of rendering, like ray-tracing. For instance, this is what these guys found:


This is taken from their realtime terrain engine. The dataset covers a 56 km × 85 km area of the Alps at a resolution of 1 m for the DEM and 12.5 cm for the photo texture. It amounts to over 860 GB of data.

I am a hobbyist; having dedicated servers render the world for users is not even an option. I'd rather pay my mortgage. If I want anyone to experience the virtual world I'm creating, I will need to stream data to end users and render there. I have not figured out a definitive solution, but I have something I think will work. My next post will be about how I'm dealing with my big data.