Procedural World: Procedural Information

Saturday, March 8, 2014

Procedural Information

Information is the measure of what you do not know.

You just looked at your watch and realized it is 3:00 PM, then someone comes into your office and tells you it is 3:00 PM. The amount of information the person gave you amounts to zero. You already knew that. That person did give you data, but data is not necessarily information.

Information is measured in bits, bytes, etc. If you ask someone, "Is it raining out there?", the answer will be one bit worth of information, no matter what the weather looks like.
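The one-bit figure can be made a little more precise with Shannon's measure. Strictly, a yes/no answer is worth one full bit when both answers look equally likely to you beforehand. An answer you were already expecting is worth less, and one you were certain of is worth nothing, like the 3:00 PM announcement. The sketch below is only an illustration; the function names are made up for this example.

```python
import math

def surprisal_bits(p: float) -> float:
    """Information, in bits, of learning an outcome you gave probability p."""
    return -math.log2(p)

def entropy_bits(p_yes: float) -> float:
    """Average information of a yes/no answer, given your prior belief p_yes."""
    if p_yes in (0.0, 1.0):
        return 0.0  # you already knew the answer
    p_no = 1.0 - p_yes
    return p_yes * surprisal_bits(p_yes) + p_no * surprisal_bits(p_no)

print(entropy_bits(0.5))  # no idea about the weather: one full bit
print(entropy_bits(1.0))  # you just looked out the window: nothing new
print(entropy_bits(0.9))  # fairly sure already: less than one bit
```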

You are now looking at a photo of a real lake on your computer screen:

Let's imagine it is the first time you see this photo. This is information to you, but how many bits of it? You could check the file size; it is already in bytes. It turns out it is a BMP file and it is 300 KBytes. Did you just receive 300 KBytes through your eyes?

Somehow this seems suspicious to you. You know that if the file were compressed as a PNG, the file size would be a lot less, probably around 90 KBytes, with no visual degradation. So what is going on: was it 300 or 90 KBytes that you just saw?

Nobody can tell you the right amount. Your eyes, brain and psyche are still mysterious objects to modern science. But whatever it is, it will be closer to 90 than 300. The PNG compression took out a lot of bits that were not really information. Compression algorithms reshuffle data in ways that make redundancy evident. Then they take it out. It is like having someone else stop that person before entering your office to announce it was 3:00 PM.

How is this related to procedural generation?

Now imagine I have sent you this little EXE file. It is only 300 KBytes. When you run it, it turns out to be a game. You see terrain, trees, buildings. There are some creatures that want you dead. You learn to hate them back, and you fight them everywhere you go. You find it amusing that even if you keep walking, this world appears to never end. You play for days, weeks. Eventually you realize the game's world is infinite; it has no limit. All this was contained in 300K, yet the information coming out of it appears to be infinite.

How is this possible? You are being tricked. You are not getting infinite information; it is all redundant. The information was the initial 300 KBytes. You have been listening to echoes, believing someone was talking to you. This is a hallmark of procedural generation: a trick of mirrors that produces interesting effects, like a kaleidoscope. A successful procedural generator deceives you into thinking you are getting information when you are not.
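Both halves of this argument can be seen in a few lines of code. A general-purpose compressor finds the redundancy in a repeated message easily, but it cannot see through the output of even a trivial seeded generator, although that output is completely determined by a handful of bytes. This is a rough sketch under loud assumptions: `world_chunk` and `world` are made-up stand-ins built on SHA-256, not how any real game or engine generates content.

```python
import hashlib
import zlib

def world_chunk(seed: bytes, index: int) -> bytes:
    """Hypothetical generator: 32 bytes of 'terrain' for chunk number index."""
    return hashlib.sha256(seed + index.to_bytes(8, "big")).digest()

def world(seed: bytes, chunks: int) -> bytes:
    """Concatenate the first `chunks` chunks of the world grown from seed."""
    return b"".join(world_chunk(seed, i) for i in range(chunks))

seed = b"lake"
generated = world(seed, 10_000)          # 320,000 bytes grown from a 4-byte seed
repeated = b"it is 3:00 PM. " * 20_000   # 300,000 bytes of obvious redundancy

print(len(zlib.compress(repeated)))      # tiny: the redundancy is evident
print(len(zlib.compress(generated)))     # about as large as the input
print(world(seed, 10_000) == generated)  # True: same seed, same world
```

The compressor is fooled into treating the generated bytes as if they were all information, yet everything in them came from the seed and the short program around it. Fooling zlib is easy; fooling a human player is another matter.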
That kind of deception is hard to achieve. In the same way we love information, we dislike redundancy. It wastes space and time; it does nothing for us. Our brains are very good at discovering it, and we adapt quickly to see through any new trick.

Now, does this mean software cannot create information? There is energy going into this system; can it be used for more than powering infinite echoes?

This is one of the big questions out there. It is beyond software. Can anyone create information at all?

If you look at the lake picture again, you may ask yourself how it came to be. Not the picture, the actual lake. Is it there partly by chance, or because there was no other choice? Its exact shape, location and size, could they be the inevitable result of a chain of cause-effect events that started when the Universe began? If that is the case, the real lake is not information; it is an echo of a much smaller but powerful universal seed.

The real answer probably does not matter. Even if the lake were an echo of the Big Bang, 42 or some sort of universal seed, the emergent complexity is so high that we cannot recognize it as one. Our brains and senses cannot go that far.

If you are ready to accept that, then, yes, software can create information. The key is simulation. Simulations are special because they acknowledge the existence of time, cause and effect. You pick a set of rules and a starting state, and you let things unfold from there. If humans are allowed to participate by changing the current state at any point in time, the end results could be very surprising.

The problem with simulation is that it is very expensive. If you keep the rules too simple or simulate for too little time, results may not be realistic enough. If you choose the right amount of rules and simulate for the right amount of time, you may find it takes too long to be practical.

When it comes to procedural generation you will find these two big families of techniques. One family is based on deception, produces no information, but it is fast and cheap. The other family has great potential, but it is expensive and difficult. As a world builder you should play to their strengths and avoid their pitfalls. And what is more interesting: learn how to mix them.
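As a toy illustration of how different the two families look in code, here is a minimal sketch. Both functions are invented for this example and are not taken from any engine: `height` stands in for the fast family (a pure function of seed and position), and `step` stands in for the simulation family (here one tick of the elementary cellular automaton known as Rule 110, chosen only because it is short).

```python
import hashlib

def height(seed: int, x: int) -> int:
    """Family one: a pure function of seed and position. No time, no history."""
    digest = hashlib.sha256(f"{seed}:{x}".encode()).digest()
    return digest[0] % 16

def step(cells: list[int]) -> list[int]:
    """Family two: one tick of a toy rule (Rule 110, wrapping at the edges).
    Each new state is caused by the state before it."""
    n = len(cells)
    rule = 110
    return [
        (rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

# Family one: jump straight to any place in the world, in any order.
print([height(42, x) for x in (0, 1, 1_000_000)])

# Family two: you only reach tick 20 by computing every tick before it.
cells = [0] * 31 + [1]
for tick in range(20):
    if tick == 10:
        cells[5] ^= 1  # a "player" changes the current state mid-simulation
    cells = step(cells)
print("".join("#" if c else "." for c in cells))
```

The first function costs the same for position one million as for position zero, which is why that family is cheap. The second has to pay for every tick along the way, and anything a participant changes mid-run carries forward into every later state.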

26 comments:

Piotr March 8, 2014 at 2:34 PM
Happy to see a new post; it's been a long time :)
OtspIII March 8, 2014 at 3:10 PM
I'm not sure I'm completely comfortable framing the digital creation of information as 'simulation', especially if you're taking the biggest way for a simulation to fail to be 'not being realistic enough'. In fact, I have some issues with the way you divide 'data' and 'information' up in general... at least from a human perception perspective; from a computational perspective it seems pretty solid.

When I look at that picture of the lake I feel like the amount of information I register from it is actually relatively small. There's a lake. There's a little island. There are trees and mountains. They have loose spatial relationships. I don't register the image as a series of pixels, but as a series of concepts, which lets me compress the amount of information I need to record into my brain by a huge amount.

The thing is, the amount of information I actually take out of the image is going to vary a huge amount based on both A) who I am and B) why I'm looking at the picture. What's meaningful about the picture to me is based on my relationship with it...how much do I know about the types of trees surrounding the lake? Do I register this as 'that lake I grew up near' or 'just some generic lake'? Am I trying to plan a specific location to build a house on its shore or am I just looking at it for the emotional tone it produces? To take this to procedural generation, I'm not interacting with a huge grid of blocks in Minecraft so much as I'm interacting with a general zone of a biome and various landmarks that point me towards locations where the resources I'm looking to dig up might be more easily accessed.

I think that one of the biggest things that procedural generation is good at is emphasizing the player's relationship with the content it produces. In a procedural platformer like Spelunky the player isn't so much experiencing 'a tile of dirt' as they are 'a safe place to land between the two spike-traps' or 'the place I need to stand to keep the spider from landing on me'. There's a huge amount of emergent information that comes out of relatively simple modular procedure-systems interlocking with each other in semi-predictable patterns that the player needs to explore if they want to hope to be able to interact with the game-world in a meaningful manner.

I don't think that this emergence needs to actually be all that expensive from a computational perspective. It's like a simple board game like chess or go--a few simple rules create a huge range of very different situations that the player can build relationships with.

I don't think that any of this is too different from what you're saying, and in fact part of me does think that maybe your biggest strength is that you're taking procedural generation on from a pretty hardline 'realism' perspective, but there's a whole huge side of procedural information-production that completely bypasses the 'information creation is expensive' issue.
John Rambo March 8, 2014 at 3:23 PM
Dude, that's like so deep.
Rune Skovbo Johansen March 9, 2014 at 8:02 AM
I'm with you for most of the argumentation. Most things in a procedural world (and in the real world too if we assume for a moment that it's deterministic) can be considered just echoes of a formula plus initial conditions. But the emergent complexity is so high that we can't comprehend its consequences, and thus the derived "echoes" often function like information after all.

However, I don't get your distinction that simulation is somehow different in this regard from other forms of procedural generation. If you try to create mountains in a game by simulating tectonic plate collisions, volcanoes, and various types of erosion, will a player really see it as more information dense than if the mountains were created from sophisticated fractal noise functions with heuristics that make them look eroded too but without doing the actual simulation?

In fact, whatever simulation you use, it's still a heuristic anyway. Does the mere fact that simulations do calculations iteratively somehow make the end result qualitatively different and its resulting information level higher in a perceivable way? I'm not convinced of that.
Unknown March 9, 2014 at 8:39 AM
I always considered data to be raw bits and bytes and information to be some useful subset of data (like the result of a question) that can be used to learn or interpret something. Therefore my thinking is that the information (from a human point of view) for that image is very different to what the computational needs might be for it as the image itself could be data but combined with this article it becomes information. It's like saying, words are data but a sentence is information.

From a computational point of view, however, everything is data, and only the result pushed to a computer screen is information for a human, as the computer does not behave in the same way as a brain. It takes data into a function and pushes data out of that function, that's it; there's no "what does this tell me?" or "what have I learnt from this?" type response going on between functions.

I did some interesting reading up on this when learning about AI, and smarter people than me suggest that unless we have a sentient computational system this is always likely to be the case so in short ....

Information is data that means something in a way that can be interpreted by the consumer in some meaningful way.

My thinking is the closest thing we have at the moment is large sets of polymorphic code that give the appearance of having learnt something when the reality is more basic: "the data was stored so I can now search this later".

That brings us to the whole "what does it mean to be alive?" discussion, as my thinking here is that a computer program does not ever use "information" only data whereas a living entity can consume both in some meaningful way.

As a side note ...
Why does this page show me an advert to "date arab girls" ... I rest my case: information about this site tells me it's not related. A human would know this by "experience" and "past relationships of things"; a computer knows only bits and bytes.
Ajm March 9, 2014 at 11:20 AM
Why not do both? For your base iteration use the fractal noise algorithms. Then iterate over it. Initial loading of the world will iterate x amount of times in a given radius of the player. Then, as the player explores the world, new terrain is generated with algorithms and then iterated until the user gets close enough that the world is locked into voxel data. The slower a character moves, the more realistic their world will generate. Which is great because they will be exploring that world to a far greater degree.

The main issue with this is that your world no longer generates around a single initial seed and you need some way to save. For this, you could save the world as a heightmap, and then procedurally generate elements off of that heightmap. If the heightmap doesn't change and the heightmap is the seed, then any explored world can be re-explored.
Nevermind March 9, 2014 at 12:50 PM
I'm not much of a computer scientist, but as far as I understand, producing information requires a source of entropy. This is basically how evolution works: random mutations are the source of entropy that creates new information in actual genomes.
So, as long as your procedural algorithm uses an actual RNG (as opposed to PRNG) - and does not destroy this randomness in the process - it is legitimately creating new information "from scratch".
Unknown March 10, 2014 at 5:38 AM
My head hurts now!

Oh and Miguel ... Answer your emails !!!!
Unknown March 10, 2014 at 10:47 AM
Our eyes and minds are used to seeing the natural world, and (as Miguel has indicated) we can spot something that looks 'fake' without too much effort.

But even the natural world can throw up scenes that appear to contradict our own understanding of how the world should look. The more unusual they are, the more appealing they are (to my eye anyway). Off the top of my head, there's the Zhangjiajie National Forest Park in China. (For more, google 'earth porn'!) Oh and the baobab and dragon blood trees are great examples of trees that look like they've been created with some very weird settings on a procedural generator.

The challenge for environment design (whether totally procedural or artist-influenced) is creating something that has that indefinable appeal - something that draws you in and entices you to explore it. The software trickery that's been used to create it ultimately is immaterial.
Unknown March 12, 2014 at 7:13 AM
This might be a bit Off-Topic:
I had not considered this question, "how many bits of information are gathered by looking at a picture", before, and I think it is very interesting.

I assume (if you were able to quantify it) the amount of data is different from person to person. It depends on how we see the world. An artist gathers different information, by looking at other parts of the picture, than a mathematician or programmer does, once we have understood what we actually see. This also depends on what our brain is doing with this data. Is something weird about some element in the picture? Then we take a closer look, trying to understand it, etc.

What I want to say: our brain leads the eye through the picture to gather the data we need to understand what we see. This amount of information can vary between different persons, and it can also vary with the different ways we want or need to see an image.
Unknown March 12, 2014 at 8:17 AM
so you are working on a planet generator,
starting from a big hot glowing gas ball,
and ending up in a green blue paradise?
Unknown March 12, 2014 at 11:11 AM
Does your engine have anything to do with No Man's Sky? http://www.rockpapershotgun.com/2013/12/09/first-look-no-mans-sky/
It will go crazy with procedural generation, to the chemical composition of liquids, gasses, terrain. Dynamic weather, lifeforms and so on. If you can't or don't want to talk about it, just leave no response I guess.
Unknown March 12, 2014 at 4:36 PM
One of the truly great things about procedural generation is that a small amount of information in initial rules can be used to create a much larger set of information. You could equate this to the laws of physics versus the current state of the universe.

There's a video that I think you'd find very interesting, specifically the part about cellular automata: https://www.youtube.com/watch?v=_eC14GonZnU
One of the points that relates well to procedural worlds is the occurrence of some extremely rare 'landmark' patterns.

Procedural worlds are generally considered 'samey' once you've explored long enough, but I don't think that needs to be the case, or at least no more than the real world.
Unknown March 13, 2014 at 11:06 PM
When I saw these videos it made me think of you, and procedural generation.
If we could translate that vibrational essence which moves all of nature, even partially,
I bet you could produce some amazing sights.
http://www.youtube.com/watch?v=P-xK3G71jDo
check out part 1 as well.
tho my friend disagrees with this film, and says it doesn't hold up, i still think there is something there