Voxels and polygons are alternative ways of storing and visualizing 3D information. They are pretty much equivalent in that you could represent the same information with either one. The key difference is that each method comes with its own penalties.
For instance, if you want to change the world in real time, like making holes, cutting pieces or merging different shapes, voxels are likely to outperform polygons. The same applies if you want to merge layers of procedural content in real time. This is fast because voxels are a much simpler representation of the content. If you were doing this with polygons, you would have to use more complex and slower methods.
On the other hand, polygons can represent and reproduce some surfaces more economically. This is the reason why the graphics industry adopted polygons so early.
One aspect where we can do an apples-to-apples comparison is data size. The experiment would be this: get a fairly large scene, store it both as voxels and as polygons, and see which dataset is larger. We would be measuring the final size of the package, that is, how much data you need to download to have a complete scene.
This is what we did. We used Ben's work-in-progress scene, which features a massive citadel. The following video shows a character running around this place. You do not have to watch all this to realize it is a pretty big place:
Everything you see there is voxel content. There are no props or instances. This is all unique geometry, forming a watertight mesh:
Here are the core stats about the scene:
54,080,225 triangles
2,203,456,000 voxels
This is the first takeaway. It takes 2 billion voxels to represent the same content as 54 million polygons. You need roughly 40 times as many voxels as polygons.
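As a quick check, the ratio can be reproduced from the two counts listed above. The variable names here are only illustrative.

```python
# Counts copied from the scene stats above.
triangles = 54_080_225
voxels = 2_203_456_000

# How many voxels it takes, on average, per triangle of the same content.
voxels_per_triangle = voxels / triangles

print(f"{voxels_per_triangle:.1f} voxels per triangle")
```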
Is the voxel dataset 40 times the size of the polygon dataset?
As you may have guessed, that depends on how much smaller a voxel is than a polygon, and on the overhead of storing each. Let's talk about that.
We store meshes as:
- a list of vertex coordinates (3 x 32bit float)
- a list of faces, where each face is three indices into the vertex list (3 x 32bit int)
- a list of UV pairs, one per each vertex in a face (2 x 32bit float)
- a list of material identifiers, one per each face (16 bit)
For the entire scene, the final compressed version of this data is 527 MB.
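To get a feel for what that layout costs per triangle, here is a rough back-of-the-envelope sketch. It assumes the fields are packed without padding and that 1 MB means 10^6 bytes. Neither is stated in this post, and the vertex count is not given, so only the per-face data is totaled.

```python
# Element sizes in bytes, following the mesh layout listed above.
FLOAT32 = 4
INT32 = 4
INT16 = 2

BYTES_PER_VERTEX = 3 * FLOAT32  # x, y, z coordinates
BYTES_PER_FACE = (
    3 * INT32          # three indices into the vertex list
    + 3 * 2 * FLOAT32  # one UV pair per face corner
    + INT16            # material identifier
)

triangles = 54_080_225
compressed_bytes = 527 * 10**6  # assumption: 1 MB = 10^6 bytes

# Uncompressed face data only; vertices are left out because their count is unknown.
raw_face_bytes = triangles * BYTES_PER_FACE
compressed_per_triangle = compressed_bytes / triangles

print(f"{BYTES_PER_FACE} bytes per face before compression")
print(f"{raw_face_bytes / 10**6:.0f} MB of raw face data")
print(f"{compressed_per_triangle:.1f} compressed bytes per triangle")
```

Under these assumptions, the compressed package works out to roughly a quarter of the raw per-face size.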
Voxels, on the other hand, store:
- attributes (empty, has material, has UV, etc. 8bit int)
- one 3D point (3 x 8bit float)
- up to 12 UV entries with surface properties (each 64bit)
- inner material (16bit int)
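The field list above puts bounds on the size of a single voxel record. The sketch below is only an estimate: it assumes fields are packed without padding and that every stored voxel carries all the fixed fields, neither of which is spelled out here. The function name is hypothetical.

```python
# Field sizes in bytes, following the voxel layout listed above.
ATTRIBUTES = 1       # 8-bit attribute flags
POINT = 3 * 1        # one 3D point, 8 bits per component
UV_ENTRY = 8         # each UV entry is 64 bits
MAX_UV_ENTRIES = 12
INNER_MATERIAL = 2   # 16-bit material id


def voxel_record_bytes(uv_entries: int) -> int:
    """Unpadded size of one voxel record holding `uv_entries` UV entries."""
    if not 0 <= uv_entries <= MAX_UV_ENTRIES:
        raise ValueError(f"uv_entries must be between 0 and {MAX_UV_ENTRIES}")
    return ATTRIBUTES + POINT + INNER_MATERIAL + uv_entries * UV_ENTRY


print(voxel_record_bytes(0))   # a voxel with no UV entries
print(voxel_record_bytes(12))  # a voxel with the maximum number of UV entries
```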
It seems the voxel data takes twice the space. This somehow feels right. Considering everything we have heard about voxels versus polygons, it is no surprise that voxels take twice as much space as polygons for the same content.
But there is a little problem with this test. It is not really apples-to-apples. Here is why:
The polygon version of the content captures only the visible surfaces, that is, where the solid materials meet air. These are the portions of the model you can actually see.
The voxel version of the content also captures hidden surfaces. While you cannot see these initially, they may become exposed later due to changes made by the viewer to the scene, for instance, while destroying or building things.
This image shows why these two sets of surfaces are different:
The red arrows point to surfaces that appear in the voxel set but are not included in the polygon set.
Luckily for us, we can change the contour rules and also produce these surfaces in the polygon dataset. After collecting a new set of stats for this new configuration, the new polygon count is 122,470,300 triangles. Once this is compressed, the final storage is 1,105 MB.
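The effect of including the hidden surfaces can be put into numbers using only the figures quoted in this post. The dictionary names are illustrative, and 1 MB is assumed to mean 10^6 bytes.

```python
# Polygon dataset figures quoted in the post.
visible_only = {"triangles": 54_080_225, "megabytes": 527}
with_hidden = {"triangles": 122_470_300, "megabytes": 1_105}

# Growth in triangle count and in compressed size once hidden surfaces are included.
triangle_growth = with_hidden["triangles"] / visible_only["triangles"]
size_growth = with_hidden["megabytes"] / visible_only["megabytes"]

# Compressed bytes per triangle in the new configuration (assumes 1 MB = 10^6 bytes).
bytes_per_triangle = with_hidden["megabytes"] * 10**6 / with_hidden["triangles"]

print(f"{triangle_growth:.2f}x the triangles")
print(f"{size_growth:.2f}x the compressed size")
print(f"{bytes_per_triangle:.1f} compressed bytes per triangle")
```

Both the triangle count and the compressed size roughly double.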
Now, this has come very close to the voxel database size. Does this make any sense?
What is maybe most surprising is that we expected the sizes to be different. In both cases, we are capturing surfaces. Even if they are fully volumetric, voxels only really get "busy" around surfaces. This is not much different than polygons.
Of course, there are nuances in how the information is compressed. In each case, we could be using tailored compression schemes. But at this point, that would produce diminishing returns, and the ratio between voxel data and polygon data is not likely to change much.
If you have questions or opinions about these measurements, I'd love to discuss them. Just post a comment below.
Are you comparing the voxels to the polygons created from the voxels, as seen in the screenshot above?
How many polygons would be used if one created the scene in polygons from the beginning, like in a regular game today? In that screenshot you can see flat surfaces that only need a fraction of the triangles that are shown. Perhaps try some decimation/simplifying on the polygon dataset?
Anyway, been following you a long time and it's great seeing all the progress made! Do you have any news on more games using the tech?
Yes this is comparing voxels to polygons generated by the same voxels.
The triangle mesh is already highly optimized. The optimization is a greedy edge collapse, and it runs before the mesh is saved.
There are additional constraints, which do not come from the use of voxels, but rather from the scene management, which is a clipmap octree and does not depend on the content representation (same scene voxel or poly).
One constraint is that since the content is split into chunks, the mesh simplification needs to match along the chunk edges. There are additional constraints for vertices to keep blending between materials at the artist-configured distances. This is producing a mesh that is similar in triangle density to what you would have in a traditional polygonal game.
The key difference with a polygon game is that this is all unique geometry. While polygon games could be made of unique geometry as well, in practice a lot of the surfaces are instanced props.
This opens an interesting topic for me, which is how far you can get with instanced geometry as a scene creator.
In this scene, most of the content is the result of applying a small number of textured meshes. The artist used these as a brush, where each "stroke" is imprinted in the voxel canvas and is not remembered after that.
The total number of instances used in this project is too high for traditional game engines. Baking everything into a single mesh, which you can LOD as a whole, allows you to render this in a much faster way.
Alright!
I was just thinking that the ground appears quite flat, and that in a poly-game the triangles might be a lot larger, for example. So a slightly lossy optimization outside the game engine could perhaps reduce the triangle count further?
How does the mesh of a voxelized poly-prop compare to the original? If the detail is fully preserved by the voxels, does it have a similar triangle count?
Anyway, any such increases in triangle count are well worth it for the advantages of voxels.
All unique geometry is indeed a huge selling point, as it is very easy to spot duplicate instances.
How far can you go on instances? Well that question brought this old thing up from my memories: https://youtu.be/Q-ATtrImCx4?t=3m25s
So, you can make some really terrible scenes using instancing. Of course that is just a tech demo, but it shows there is a limit to instances.
When the ground is flat and all the same material, it reduces to a few flat large polygons.
The greedy mesh simplification is lossy, the error metrics are chosen by the artist in this case. This runs outside the engine.
A voxelized prop will have the same or fewer triangles as the original. This does not account for the additional constraints like chunk borders. If you place the prop entirely inside a chunk, then it will be same or smaller.
In general you get extra triangles where you want them, based on how you configure simplification errors, blending rules between materials, and other factors that affect geometry. In terrain for instance, there are much fewer triangles coming from voxel terrain than the equivalent using Unreal terrain.
"
When the ground is flat and all the same material, it reduces to a few flat large polygons.
The greedy mesh simplification is lossy, the error metrics are chosen by the artist in this case. This runs outside the engine.
"
Ah I see. I was thinking that in order to compare to a poly-engine that it might be a bit unfair to include triangles that are the result of the voxel-meshing, like chunk borders and material blending. As they would not be present had the scene been created traditionally.
By outside the engine I meant something like using 3ds max to optimize the data before comparison. But as you said, the scene itself might not be possible in a poly-based engine to begin with.
Anyway thanks for explaining and answering my questions!
Yes, but again, chunk borders and material blending are not a result of voxel meshing. We have chunks because the scene space is partitioned in an octree, and each leaf could be a mesh. In that case you would also need mesh borders to align. Blending across vertices is the same. This is using the mesh attributes to encode surface properties. You could have a mesh system that would do the same. In fact, many of these systems in games do, storing things like alpha blending and colors as vertex attributes.
This instanced vs. unique data conversation reminds me of Virtual Textures. The selling point of virtual textures was that every texel could be unique, but what happened in practice is that the generated content was still made of tiled texture maps, simply because it was impractical to author the world that way. The only advantage, then, was that they got to use a lot of decals (which individually would also repeat in many places across the world). The same thing is happening to you guys: you can potentially have a lot of geometry variety, but in practice, the authored content is still made out of multiple copies of repeated meshes anyway.
Well, back to virtual textures: the team behind Trials Evolution implemented their own virtual texture solution that was a hybrid of both worlds. It used a big atlas of virtual textures, but the texture tiling and decal merging was done at run-time. This means they could still have a lot more decals and texture layers than most other engines of the time at a small cost, since all layers were merged in texture space once, and only recalculated as the camera moved around the scene. Yet, since the results weren't baked offline, they still had the low storage costs of traditional texturing systems.
If the game level just had a big area with the same tiled texture across the ground with nothing else going on, the engine didn't have to load a huge texture with the same pattern repeated across the ground. It just loaded the pattern and generated the repeated virtual texture when such a surface appeared on screen, and that would get used for multiple frames. It would still be re-generated if the camera moved closer (at higher res), if it was modified, or if it disappeared, was flushed out of the virtual atlas, and then appeared again later.
Maybe the money is on implementing something like this for your system. Take advantage of the fact that content is authored using thousands of instances of a few hundred or fewer different props and store the content as instanced geometry, but convert that to a virtual voxel cache at runtime.
> as 44 million polygons
Did you mean "as 54 million", or is the precise number quoted before incorrect?
The precise number is correct. It is 54 million (fixed now in the blog post.) Thanks for catching this!
People are used to triangle meshes and don't expect voxels. This video is plain and boring if it's not about voxels. How about creating some kind of a visual effect that can only be done with voxels and using it in the beginning and in the end of every video, just to remind the viewer what he is looking at? For example, would it be possible to send some kind of a wave through the world, but make it on the level of voxels, not vertices (so it's obvious it's not just a shader)? Or maybe the world could be quickly put together and taken apart in an interesting way?
Those tiled floors look repetitive and unnatural. Would it be very difficult to make a procedural floor tiler? Would it be practical?
I feel like the walkthrough would look better if the character had a jetpack and used it a few times to get a better perspective on the world.
Yes we are still figuring out ways to showcase this scene in a video. The general idea is to have some sort of wizard destroying the place.
The video in this post is not meant to be entertaining, it is mostly about conveying the size of the place. Like you said, it takes some experience in the field to know you are looking at something new.
Even if you forget about voxels, making a scene this size work just with polygons is a huge challenge. All houses and temples have interiors here.
For inspiration, try doing what atomontage does/did. A flying perspective that starts out far, zoom in to show off all the detail. Then zoom out as a reminder of how big the scene is.
Maybe use a cross section? It would not require creating new data (for voxels, that is; for triangles it would), but it would show the inner structure of the scene.
Nice breakdown!
I'm a little confused by your voxel representation.
You list one material per voxel, but don't all "interesting" (ie surface) voxels contain multiple materials by definition?
Also I don't see connectivity information, which seems necessary to do the proper polygon reconstruction.
Hmm, now that I think about it, these concerns would go away if the "material" associated with each voxel was actually the material of, say, the top-left-back corner of that voxel. Is that what you are doing?
Oops, meant to reply to the original blog post. Sorry about that, wish I could edit it.
The voxel material is associated with the bottom, left and front corner of the voxel. This is the voxel's origin. Neighboring voxels could have a different material, creating a transition in the resulting surface.
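To illustrate the idea, here is a minimal sketch of finding where those transitions occur. It assumes a small dense grid of material ids indexed as grid[x][y][z], with each id belonging to that voxel's origin corner. The function name, the grid layout and the AIR id are all hypothetical and are not the engine's actual data structures.

```python
from itertools import product

AIR = 0    # hypothetical id for empty space
STONE = 1  # hypothetical id for a solid material


def material_transitions(grid):
    """Yield pairs of axis-adjacent voxel coordinates whose origin materials differ."""
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    for x, y, z in product(range(nx), range(ny), range(nz)):
        # Only look in the positive direction so each pair is reported once.
        for dx, dy, dz in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):
            u, v, w = x + dx, y + dy, z + dz
            if u < nx and v < ny and w < nz and grid[x][y][z] != grid[u][v][w]:
                yield (x, y, z), (u, v, w)


# A 2x2x2 grid: stone at z = 0, air at z = 1.
grid = [[[STONE, AIR] for _ in range(2)] for _ in range(2)]
for a, b in material_transitions(grid):
    print(a, "->", b)
```

Each reported pair marks a grid edge that the surface has to cross, which is where a contouring step would place geometry.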
Thanks! I can see how you can polygonalize now.
It seems like if one is not careful one could end up wasting a lot of storage on "empty" (ie uniform) voxels. For your in-memory representation do you have something like a full 40x40x40 array of 8 bit attributes and then hash tables for the sparse contents?
Don't know how useful or relevant to you guys this is, but this is a paper that applies neural networks and machine learning to make an easily editable terrain map that generates realistic-looking terrain on the fly. Figured you guys might at least be interested in having a look =).
https://hal.archives-ouvertes.fr/hal-01583706/file/tog.pdf