Last time I posted a simple model for 3d diffusion limited aggregation, which demonstrated using Haskell for a simulation based on randomly walking particles. The followup post is coming related to performance tuning, but in the meantime, I have another simple model that I figured I would post that came out of an article I worked on a few months ago.

Earlier this year I wrote an article for the Encyclopedia of Parallel Computing (coming out in 2011) on Cellular Automata. It was quite a bit of fun to write, and during the process I wrote a few little examples so I could experiment with them along the way. One of the examples I wrote about was the forest fire model. Writing this up was particularly interesting because it touched on a couple of topics that I had been wanting to learn more about: how to write array based programs in Haskell using the Data.Vector package, and how to integrate OpenGL visualization into the program. In my previous example of 3D DLA, I deferred visualization to external tools, but this time I wanted to watch the system evolve as it ran.

The reason I was curious about implementing it in Haskell is that I’ve been looking at Haskell as an alternative to Matlab/Octave for some prototyping tasks. At the end of the article, I’ll show the equivalent code in Matlab and give some of my thoughts on the two. I’m still a big fan of Matlab, so my intent isn’t to argue for Haskell as a complete replacement for it – not at all. I just happened to write this algorithm in both and had thoughts on how it felt looking at the problem from the two different perspectives.

Visualization in Haskell is easy now that we have the Haskell Platform that includes OpenGL and GLUT bindings available on all platforms where the Haskell Platform works. This is nice in comparison to the alternatives, like those based on wx or GTK, both of which do work on multiple platforms, but require a separate installation beyond that which the Haskell Platform comes with. In some cases, there is also a somewhat involved process of installing a third party library that can be awkward (I’ve never found getting Cairo and GTK going on OSX to be smooth for example.) In my opinion, working with GL is the simplest route to doing graphics if all you want to install to start is the platform.

Like any cellular automaton, the forest fire model is based on the application of a simple rule to each cell in the world that takes the state of the cell and a small number of neighbors to determine the next state of the system. Unlike CAs such as Conway’s game of life, the forest fire model is stochastic – the automaton rule depends on sampling a random number generator to determine its output. The rule is easy to state:

- Cells can be in one of three states: empty, burning, or tree.
- If a cell is burning in one iteration, it becomes empty (“burned out”) in the next.
- A cell that is empty has some probability of regrowing and turning into a non-burning tree.
- Finally, a non-burning tree will start burning if one of its neighbors is burning, or it may spontaneously ignite (“lightning strike”) with some probability.

Note that the two probabilities (regrowth or lightning) are independently tunable parameters, and their choice has an impact on the dynamics of the system as it runs. Below is a YouTube video of the program I describe in this post running. (Sorry about the mouse cursor in view for the first few seconds…)

I’ve split the code into two files: ForestFire.hs that contains the logic for the automaton, and ff.hs that contains the driver and the OpenGL visualization code. We’ll start with ForestFire.hs. First, there are the obligatory imports.

import Control.Monad (replicateM) import Data.IORef import qualified Data.Vector as U import System.Random.Mersenne.Pure64

The world can be easily represented in Haskell as follows:

data Size = Size Int Int data CellState = Empty | Tree | Burning deriving Eq data World = World { wData :: U.Vector CellState, wSize :: Size }

We have a World data type that contains a vector of CellStates, and a CellState data type representing the three possible states that a cell can be in during the execution of the model. The use of a 1D vector means that if we want to treat the world as a 2D torus, we must have routines to move back and forth from 1D indices to 2D coordinates.

-- index size (w,h) from 0..w-1, 0..h-1 idx :: Int -> Int -> Size -> Int idx x y (Size w h) = ((y `mod` h)*w)+(x `mod` w) -- turn 1d offset into 2d coordinate unidx :: Int -> Size -> (Int,Int) unidx i (Size w h) = (i `mod` w, i `div` w)

The world starts empty, so we have a simple function for initializing a world given a size.

initialize :: Size -> World initialize s@(Size w h) = World { wData = U.generate (w*h) (\i -> Empty), wSize = s }

To encode the local rule that occurs at each element of the world, a simple function is created that takes the state of the cell, the number of burning neighbors, and the result of sampling the two probability parameters (p and f – the names were borrowed from the Drossel and Schwabl paper). The function then yields the state that the cell will take on in the next time step. Note that the probability sampling takes place in a function that calls this one, and we only pass in the boolean of whether or not the sampled value was above the threshold for either p or f.

localRule :: CellState -> Int -> Bool -> Bool -> CellState localRule s ncount ptest ftest = case s of Burning -> Empty Empty -> if ptest then Tree else Empty Tree -> if ftest then Burning else if (ncount > 0) then Burning else Tree

To apply this function, we need to count the neighbors. To achieve this, we use a simple function (“ruleContext”) that is called from the larger time stepping function. This ruleContext function counts the neighbors and calls localRule. The separation of the actual forest fire transition rule from this additional function was to decouple the step of looking at the neighborhood (which is the “context” for the cell, hence my choice of name) from the cell-centric transition function.

-- given a coordinate, find linearized indices of neighbors neighbors :: Int -> Size -> [Int] neighbors i s = [idx (x+1) y s, idx (x+1) (y-1) s, idx (x+1) (y+1) s, idx (x-1) y s, idx (x-1) (y-1) s, idx (x-1) (y+1) s, idx x (y-1) s, idx x (y+1) s] where (x,y) = unidx i s ruleContext w p f i (v,pv,fv) = localRule v ncount (pv > p) (fv > f) where ns = neighbors i (wSize w) ncount = foldl' (+) 0 (map (\j -> if ((wData w) U.! j == Burning) then 1 else 0) ns)

This requires a bit of explanation. First off, notice that we represent the state of the world in terms of 1D coordinates, but define the automaton in terms of a 2d view on this 1d vector. So, when we compute the indices of the neighbors of a point (using an 8-point Moore Neighborhood), we need to turn the 1D index into the 2D coordinate of the cell the neighborhood is centered on, and then find the 1D indices of the nine neighbors that are defined in 2D relative to the center. The neighbors function takes care of this, and requires us to pass in the size of the world as well.

The ruleContext function invokes neighbors to generate the list of indices for the neighboring cells, and then it counts the neighbors that are burning. This is achieved by using foldl’ to compute the reduction of (+) over the list generated by mapping a test to see if a cell is burning over the list of neighbors computed in the line above. If the test succeeds, we place a 1 in the list, otherwise 0 is generated. The foldl just adds them up. We also test if the sampled random number is over the parameterized threshold for each stochastic parameter (p and f). Once we have this, we then pass the values on to the localRule to do the real work applying the forest fire rule. (*Edit*: Changed foldl to foldl’ based on comments. See here for related discussion.)

Armed with all of this, we can finally look at the function that evolves the state of the entire world from one time step to the next. This is written in the “oneStep” function.

oneStep :: World -> (IORef PureMT) -> Double -> Double -> IO World oneStep w st p f = do let x = wData w let s@(Size sw sh) = wSize w pval <- replicateM (sw*sh) (nextF st) fval <- replicateM (sw*sh) (nextF st) let xpf = U.zip3 x (U.fromList pval) (U.fromList fval) let w' = World { wData = U.imap (ruleContext w p f) xpf, wSize = s } return w'

This is a fairly straightforward function. First, we give the data and size names by accessing the wData and wSize fields of the world that was passed in. We then sample the random number generator twice for each cell, once for p and once for f. This is a bit wasteful, since it means we may sample the random number generator for cells where we don’t need it (such as those that are burning that have one choice to go to empty). I decided not to optimize that away since I was only concerned with getting the model working, and the implementation here performs well enough as written for my original purpose (testing the algorithm for an article). Plus, this is how the Matlab code works as well.

The replicateM function is used to call the nextF function for each cell to generate the list of p-values and f-values. These vectors are then zipped together with the state of each cell so that we get a vector of triplets – the state of a cell, the value of sampling the random number generator for the p probability, and the value for the f probability. We then use the imap function from the vector package to map the ruleContext function over all cells, where imap passes the triplet for each cell in along with the index. The result of this is used to define the wData part of the world to return for the next iteration.

For those who are curious, the nextF function is defined as:

nextF :: (IORef PureMT) -> IO Double nextF st = do rst <- readIORef st let (v,rst') = randomDouble rst writeIORef st rst' return v

Obviously there are other ways to string random number state through the code (such as the monadic approach used in the DLA example), but this is how I wrote this example about 6 months ago and I didn’t want to take much time to change this part. You can probably tweak the code with little effort to use other methods to thread the PRNG through the code if you prefer.

That’s everything for the model itself. Now we just need to look at the driver that invokes it and the little bit of code that is used to draw the visualization. This is in the file ff.hs in my public github repository, along with the file above. Let’s start with the code for drawing the 2D array, and then work back to the driver. I will admit up front that I am not a proficient OpenGL programmer, so it is highly likely to not be the best way to write this code. If anyone *is* interested in Haskell OpenGL and wants to help me with a cleaner 2D visualizer, definitely contact me. I’m very interested.

First, let’s define some colors for the different states that the cells can be in.

tree = Color4 0.7 0.7 0.7 1.0 empty = Color4 0.2 0.3 0.6 1.0 burn = Color4 0.9 0.2 0.2 1.0

Now, we’ll write a function that plots the contents of our world. Note that I renamed the ForestFire module to FF when I imported it.

drawSquares :: [(Int,Int)] -> FF.World -> IO () drawSquares pts world = do renderPrimitive Quads $ mapM_ drawQuad pts where wdat = FF.wData world wsiz = FF.wSize world drawQuad (x, y) = do currentColor $= c vertex $ Vertex2 x0 y0 vertex $ Vertex2 x1 y0 vertex $ Vertex2 x1 y1 vertex $ Vertex2 x0 y1 where x0 :: GLint x0 = fromIntegral x x1 = fromIntegral x + 1 y0 = fromIntegral y y1 = fromIntegral y + 1 c = case (wdat U.! (FF.idx x y wsiz)) of FF.Burning -> burn FF.Empty -> empty FF.Tree -> tree

Now, we live in IO since this code is obviously effectful – it draws on the screen. The renderPrimitive Quads function takes a list of OpenGL quadrilaterals, and renders them. The function drawQuad handles creating individual quads, and the mapM function runs it over the set of 2D coordinates representing the world. The data and size of the world are extracted from the world variable, and then we create a colored quad. The vertices are computed from the 2D coordinates of the cell, and the color is generated by indexing into the 1D world vector and pattern matching on the contents to select one of the three colors we defined earlier. The OpenGL quad is then defined as a color followed by four Vertex2 elements.

The display function handles actually drawing this. Now, OpenGL/GLUT (as I understand the Haskell binding) defines a main thread (“mainLoop”) that periodically refreshes the screen and checks for events like mouse clicks, window resizes, etc… We want to have our code execute on its own to evolve the CA and draw as soon as the data is ready after each time step. So, we create a second thread that handles calling oneStep to evolve the model. Given that we now are faced with two threads (one the OpenGL main loop, and the other being our time stepper), we need to communicate the data between them containing the state of the world. This is accomplished with an IORef that they both share. Let’s now look at the code executed by each of these threads.

display world = do clear [ColorBuffer] wval <- readIORef world let (FF.Size w h) = FF.wSize wval let points = foldl (++) [] $ map (\i -> map (\j -> (i,j)) [0..(h-1)]) [0..(w-1)] drawSquares points wval swapBuffers

The display function is invoked with one parameter – the IORef containing the world. Each time the display function is called, it reads the world from the IORef and extracts the size. It then generates a set of points for the 2D coordinates to plot, and then invokes the draw squares function. The calls to clear and swapBuffers are OpenGL calls related to double buffering to prevent flicker and seeing incomplete updates. This allow drawSquares to update the buffer to display, and the swap the completed buffer to the display when it is done. This prevents seeing incomplete updates.

The thread body for the code that does our computation is:

threadBody world rngstate = do wval <- readIORef world wval' <- FF.oneStep wval rngstate 0.96 0.999 writeIORef world wval' postRedisplay Nothing

I’ve committed one of the obvious sins of writing a good reusable model, and hardcoded the parameters for p and f into the body of the function. If this was more than a toy, I would probably have used a cleaner way to parameterize the code for these values.

The final code to write is main. Much of this is boilerplate that I took from various example OpenGL programs on the web. First, I have a few more sinfully hardcoded values related to the geometry of the window and the size of the world.

-- window dimensions initialWindowSizeX :: Int initialWindowSizeX = 300 initialWindowSizeY :: Int initialWindowSizeY = 300 -- we want to split our window into initialWindowSizeX/pixelsPerUnit grid -- squares wide, and initialWindowSizeY/pixelsPerUnit grid squares high pixelsPerUnit = 4 -- PRNG seed seed = 2

And now, the big ugly main:

main = do getArgsAndInitialize let w = (fromIntegral initialWindowSizeX :: GLsizei) let h = (fromIntegral initialWindowSizeY :: GLsizei) initialDisplayMode $= [RGBMode, DoubleBuffered] initialWindowSize $= Size w h (Size screenSizeX screenSizeY) <- get screenSize let initialPos = Position x y where x = (screenSizeX - w) `div` 2 y = (screenSizeY - h) `div` 2 initialWindowPosition $= initialPos createWindow "Forest Fire!" let worldSize = FF.Size (initialWindowSizeX `div` pixelsPerUnit) (initialWindowSizeY `div` pixelsPerUnit) world <- newIORef (FF.initialize worldSize) rngstate <- newIORef (pureMT seed) displayCallback $= display world reshapeCallback $= Nothing idleCallback $= Just (threadBody world rngstate) matrixMode $= Projection loadIdentity ortho2D 0 ((fromIntegral w) / (fromIntegral pixelsPerUnit)) 0 ((fromIntegral h) / (fromIntegral pixelsPerUnit)) mainLoop

Ugly as it may be, the code makes sense if you walk from top to bottom. We start off initializing the window and doing some computations to position it and size it and give it a title before making it appear. The variables w and h exist to have type GLsizei in order to be used with the GL calls, while the original size variables have type Int. We create a world that corresponds to the number of squares in the image. If we have a 300 by 300 image with each cell taking a 4×4 space, then we can only fit 75 cells in each dimension.

Next, we create our two IO refs. The first one holds the world, and as we saw earlier, this is to allow the two threads (one for computing, one for displaying) to share the state. The second one is to hold the PRNG state. These are then handed off to the appropriate functions — the world is handed to the display function, the thread body gets the world and the PRNG state. Finally, we do a bit more OpenGL setup to make sure our view is configured and scaled appropriately, and then hand control off to the mainLoop (which comes with GLUT).

*Note on OpenGL*: Windows users may need glut32.dll from here. I needed to download that to run this on my 32-bit Windows 7 machine since glut32.dll didn’t seem to be available already.

Personally, I found this to be useful as a method for me to learn how to use OpenGL from my Haskell programs, where GL was visualizing the output of an algorithm running and updating the contents of an array. As promised, here is the Matlab code implementing the same algorithm, visualization and all.

p = 0.96; % grow f = 0.999; % lightning sz = [150 150]; EMPTY = 0; TREE = 1; BURNING = 2; world = zeros(sz(1),sz(2)); for i=1:10000 % empties may come to life newgrowth = double(world==EMPTY).*double(rand(sz(1),sz(2))>p); tileworld = world([end 1:end 1], [end 1:end 1]); % count neighbors. tileworld gives us periodic boundaries for the % 2d convolution nconv = conv2(double(tileworld==BURNING),[1 1 1; 1 0 1; 1 1 1],'same'); neighbors = nconv(2:end-1,2:end-1); % trees with burning neighbors will burn toburn = double(world==TREE).*double(neighbors > 0); % some trees get hit by lightning lightning = double(world==TREE).*double(rand(sz(1),sz(2))>f); % trees stay trees, and those that are burning get added on to % get value 2. trees = double(world==TREE)+double(toburn + lightning > 0); % the world is composed of new growth + trees (burning and not) world = (newgrowth + trees); % view the world imagesc(world); % keep colors ranging from 0 to 2 caxis([0 2]); % plot drawnow; end

First off, I find this code to be pleasantly short. This is why I use Matlab – I can quickly crank out an algorithm to mess around with. On the other hand, it lacks clarity. Consider the line to grow new trees.

newgrowth = double(world==EMPTY) .* double(rand(sz(1),sz(2))>p);

First, I create a bit mask (world==EMPTY) such that only empty cells are set to 1, and then sample the random number generator at every point, creating a new mask for all points where the probability p is exceeded. Both need to be turned into double-precision arrays where I can do element wise multiplication, a result of the fact that == and > yield logical arrays that don’t support multiplication. Element-wise multiplication is used to bring the two together and grow trees on all the points where a cell was EMPTY and the probability was exceeded.

That same pattern is used in other places to cells to selectively apply a computation over a whole array. In most languages, we’d implement this as a loop over coordinates where the inner loop body is a conditional on the array contents at each element. This isn’t wise in Matlab since Matlab performs horribly for loop-based code. You need to vectorize Matlab code, which means you need to turn everything into whole-matrix operations. (Interestingly, the same process of writing in whole-array style would apply to people trying to write code in vectorized form using Data Parallel Haskell!).

The neighbor counting is even more obfuscated. First, I want a torus, so I replicate around the world a single layer of cells that are those on the other side of the world after wrapping around. Then, I apply a 2D convolution with a kernel that counts the neighbors of each cell. The conv2 function doesn’t (to my knowledge) use periodic boundaries, so I had to manually implement them. So, conv2 is applied to my sort-of-periodic world, and then I extract out the middle (2 to end-1 in each dimension) to get the result – each element containing the number of burning neighbors. (*Edit*: Thanks to Michael Turmon for pointing out a much nicer way to do the set up for the neighborhood convolution. )

The code can be a bit dense at times due to writing in a vectorized style, but it is short. It takes a while to get used to thinking in vectorized form, seeing loops as nothing more than convolutions and masked whole-array operations.

My favorite feature though, and the one that keeps me in Matlab instead of Haskell for many tasks, is at the very end. I just drop in three lines and I have a view of my matrix, adjusted colors, and control over when it redraws. Three lines! I can do images, plots, 3d plots, etc… All with essentially zero effort. My OpenGL code in this post is my first explorations into writing a similarly simple library for Haskell based purely on the Haskell platform – no GTK, no Cairo, no GNUPlot – just what comes with the platform. I’ve tried the chart package for Haskell and similar plotting libraries, but I’m still waiting for something as easy as the built in simple visualization capabilities of Matlab. I think it will be wonderful if some day a Haskell equivalent of plot, image, etc… all are provided as standard parts of the platform. Interestingly, a year or so ago I spent a few months enamored with Processing for this very reason – graphics were easy to integrate into my algorithms. I had a similar affair with Ocaml due to the built in graphics capabilities. I think a Haskell equivalent for the Ocaml graphics module would be a pleasant addition to the Haskell platform using the OpenGL library as the back end.

I wish more languages would make simple graphics simple to do as part of the standard library.

I’ve put the code in my public github repository, so you can find fully working versions of the code there for both the Haskell and Matlab.

You should avoid using foldl wherever possible, as it needs MUCH more memory than foldr, especially for large lists. In this case, you can easily switch to foldr, without changing anything.

True, although 8 is the fixed upper bound on the length of lists being folded here.

The choice of foldl vs. foldr depends on the operation you are folding with. It’s not just a static “foldl is always larger than foldr.” It is pretty safe to say, however, that if you are using a left fold you probably want foldl’ rather than foldl.

Regarding the MATLAB/Haskell comparison: it is important to note that this kind of code is what MATLAB was designed for (matrix manipulation and computation).

Another interesting thing to think about is: how easily could you build an OpenGL library to make plotting similarly easy to MATLAB? Something akin to:

figure sizeX sizeY

plot myData [Options]

show

Eventually making the backend pluggable to allow for Cairo etc…

I am also looking to use Haskell as a semi-replacement for MATLAB, although I am coming from the Python+Numpy side of things.

> Now, OpenGL (as I understand the Haskell binding) defines a main thread (“mainLoop”) that periodically refreshes the screen and checks for events like mouse clicks, window resizes, etc…

That would be GLUT, not the OpenGL bindings. That’s just how GLUT works.

Easy graphics is the thing that keeps me moving back to Matlab after experimenting with python/numpy for tasks.

Because Matlab has & for logical arrays, you can use:

newgrowth = (world==EMPTY) & (rand(sz)>p);

Your padded array could be done as:

tileworld = world([end 1:end 1],[end 1:end 1]);

Thanks for the [end 1:end 1] line — somehow I managed to forget that I could do that when I hacked together the matlab code! I tweaked the code and credited you with pointing out the far-cleaner way to do it…

As for the logical and versus double multiply – I’ve gotten out of the habit of writing code with logical operations when I’m going to eventually treat the same data as a double later. At some point I need to make sure to say double( .. ) to prevent a type mismatch error, and I have found it to be less error prone to do the conversion at the time of creation of the array instead of remembering that the array is a logical and putting the double( … ) at a later point. I’m sure it’s less memory efficient and has other implications on performance, but I’ve had one too many fatal errors two hours into a long run that result from a forgotten double (…) around a logical array that was ultimately used in an arithmetic computation.

Interestingly, I used to teach Python and Numpy (along with the rest of the SciPy tool chain). I’ve been moving away from them for a number of years now – I think I stopped seriously looking at SciPy/NumPy in 2007 or so. According to the analysis of this blogger, the state of the SciPy/NumPy world hasn’t changed much since then.

I am curious, what made Scipy/Numpy lose favor with you?

Let me start by saying I do think SciPy/Numpy is a useful project – I think the group working on it is doing a good thing. There was some turbulence during the numeric/numpy/numarray transition – some packages used one array library, others used a different one, and yet others used the third. It was annoying enough that I moved to alternative languages since I had to get work done, and that work was not wrangling libraries or development environments. I probably should give it a look again to see if that ecosystem has stabilized (which I assume it has – a lot can happen in 4 years!).

I used Scipy/numpy for about ~2 years and they have stabilized around numpy. I never had to fight any numpy/numarray/numeric variations.

Good to know. I think I was just unlucky to have been trying to use it during the transition.

Scipy has stabilized around Numpy and will soon be ported to Python 3 now that Numpy has been ported. For Matlab style plotting in Python check out matplotlib. For Mathematica notebook style interfaces check out SageMath.

Nice, I guess, but by the gold standard of functional programming this is still pretty verbose. The gold standard being the woefully overlooked Mathematica, whose fantastic built-in higher-order functions are only just being replicated by modern LISPs like Clojure. Hopefully I’ll have time to rewrite this forest fire model in M- for my blog, I’ll drop the link here when I do…

I’m a big fan of Mathematica actually. I was really happy when the Home Edition came out – I’ve been using it for little experiments in my spare time more and more lately. I wrote up this example in Mathematica already. See ForestFire.nb in the github repository that I refer to in the article.

On OSX, I get a compile error:

ff.hs:40:28:

Couldn’t match expected type `GLint’

against inferred type `GLsizei’

In the first argument of `Position’, namely `x’

In the expression: Position x y

In the definition of `initialPos’: initialPos = Position x y

Hmm. On my mac (OSX 10.6.4) I see:

There are examples of Forest fire written in around fourteen languages here on Rosetta Code. A major difference is that most entries use a simple textual display.

I’ve seen the Rosetta Code site. I’m not sure how useful it is though – the Fortran version of Forest fire is particularly bad (it almost looks like it was intentionally written to be ugly), and Matlab is missing from the list of implementation languages for Forest Fire.

How about rewriting that matlab code in R? The syntax is almost identical, and it’s free, unlike matlab. The graphics are great. The tools for measuring the statistics of spatiotemporal data are state of the art.

The main limiting factor is time – I would be happy to write it up in R, but I’m lacking free cycles at the moment.

Hi, could it be possible to adapt the fire modelling code in Matlab to simulate on real maps. Obviously, representing data in arrays and interacting them with code using Rothermals theory? Thank you

I haven’t thought much about this, but I would assume that it is possible. The big difference is that the forest fire model that I described is a highly idealized system – Rothermals theory that you mention is less idealized, so I would have to think about how to bring both real maps and the less idealized model together. Sounds like an interesting problem though – is this something you were interested in modeling yourself?