For a second exercise with Go, I wanted to explore some of its concurrency features. Support for concurrent programming is a key factor in the design of Go, and its main tools for simplifying concurrent programming are goroutines and channels. These are not the only means of writing concurrent programs, since Go also has lower-level constructs like mutexes, but for many applications channels and goroutines provide a simpler and safer alternative.
To explore this, I modified the program from day 1 to load the names database from disk and to extract names from the Pinterest front page concurrently. I will just include the main() function below as it was the only one that changed.
It turns out doing this was very simple. The modification isn't conceptually hard: it is really just launching two async tasks and waiting for them to complete before computing the stats. Still, it is good to see that simple things are indeed simple, and it was a quick introduction to the syntax.
The go keyword takes a function call and executes it in a goroutine. A goroutine is a lightweight thread of execution, similar in spirit to Erlang's lightweight processes, and the runtime can schedule goroutines to run on multiple cores if available. The main program (also running in a goroutine, by the way) is then free to continue executing concurrently. In the example above, anonymous functions are used to load the names database from disk and to fetch and extract names from the Pinterest front page respectively.
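As a minimal sketch of the go keyword with an anonymous function (this is not the program from this post): square and squareAll are hypothetical names, and sync.WaitGroup from the standard library is used here only to keep main from exiting before the goroutines finish.

```go
package main

import (
	"fmt"
	"sync"
)

// square is a hypothetical stand-in for some real unit of work.
func square(n int) int { return n * n }

// squareAll starts one goroutine per input and waits for all of them.
func squareAll(inputs []int) []int {
	results := make([]int, len(inputs))
	var wg sync.WaitGroup
	for i, n := range inputs {
		wg.Add(1)
		// go takes a function call; here it is a call to an anonymous function.
		go func(i, n int) {
			defer wg.Done()
			results[i] = square(n) // each goroutine writes only its own slot
		}(i, n)
	}
	wg.Wait() // block until every goroutine has called Done
	return results
}

func main() {
	fmt.Println(squareAll([]int{1, 2, 3})) // prints [1 4 9]
}
```

Passing i and n as arguments gives each goroutine its own copies, so the result does not depend on how the loop variables are captured.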
We use channels to communicate between the main goroutine and the ones it creates. In this case the channel is used to communicate the result of these operations back to the main goroutine. A channel is exactly what it sounds like: a (typed) pipe for values. It can be bidirectional (supporting both send and receive) or unidirectional (send-only or receive-only). The unidirectional forms are typically used as function parameter types that restrict what a function may do with an otherwise bidirectional channel, so a receive-only view still has values put into it by whoever holds the sending side. A channel can also be buffered or unbuffered. A send on an unbuffered channel blocks until a receiver is ready, a send on a buffered channel blocks only when the buffer is full, and a receive blocks until a value is available. Given this, channels can be used to communicate data between goroutines as well as to synchronise them. The <- operator is used both to put values on a channel and to receive from it, depending on which side of the channel it's on.
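One common use of directional channel types is in function signatures, where they limit a function to one side of an ordinary bidirectional channel. The sketch below shows that, plus a buffered channel; produce, sum and sumTo are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// produce may only send on out; a receive from out would not compile here.
func produce(out chan<- int, n int) {
	for i := 1; i <= n; i++ {
		out <- i // blocks until the receiver is ready (unbuffered channel)
	}
	close(out) // lets the receiver's range loop terminate
}

// sum may only receive from in.
func sum(in <-chan int) int {
	total := 0
	for v := range in {
		total += v
	}
	return total
}

// sumTo wires the two together over one bidirectional channel.
func sumTo(n int) int {
	ch := make(chan int) // bidirectional and unbuffered
	go produce(ch, n)    // implicitly converted to chan<- int
	return sum(ch)       // implicitly converted to <-chan int
}

func main() {
	fmt.Println(sumTo(4)) // prints 10

	buf := make(chan string, 2) // buffered, capacity 2
	buf <- "a"                  // does not block: the buffer has room
	buf <- "b"
	first := <-buf
	second := <-buf
	fmt.Println(first, second) // prints a b
}
```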
The changes between this and the non-concurrent version are simple: I take the code for each task, put it in an anonymous function and call go with it. I then use a channel to wait for the data to be sent back from the tasks that were kicked off. Once they have all sent back their data, the program continues as before.
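The shape of that transformation can be sketched roughly as follows. This is not the actual main() from the program: loadNames and fetchFrontPageNames are hypothetical stubs returning made-up data instead of touching the disk or the network, and the sketch assumes one channel per task so the two results can be told apart.

```go
package main

import (
	"fmt"
	"strings"
)

// loadNames is a hypothetical stand-in for reading the names database from disk.
func loadNames() []string { return []string{"alice", "bob", "carol"} }

// fetchFrontPageNames is a hypothetical stand-in for fetching and parsing a page.
func fetchFrontPageNames() []string { return []string{"Bob", "dave", "Alice"} }

// countKnown runs both tasks concurrently, waits for both results, and then
// counts how many fetched names appear in the database (ignoring case).
func countKnown() int {
	dbCh := make(chan []string)
	pageCh := make(chan []string)

	// Each task is wrapped in an anonymous function and started with go.
	go func() { dbCh <- loadNames() }()
	go func() { pageCh <- fetchFrontPageNames() }()

	// Receiving blocks, so these two lines also act as the "wait" step.
	db := <-dbCh
	page := <-pageCh

	known := make(map[string]bool, len(db))
	for _, name := range db {
		known[strings.ToLower(name)] = true
	}
	count := 0
	for _, name := range page {
		if known[strings.ToLower(name)] {
			count++
		}
	}
	return count
}

func main() {
	fmt.Println(countKnown()) // prints 2
}
```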
Overall this was an extremely pleasant transformation to make. I did notice one thing though: while Go's philosophy around shared mutable state is
"Do not communicate by sharing memory; instead, share memory by communicating."
Go doesn't actually prevent you from doing the former. The anonymous functions we created for the goroutines are closures that share a containing scope, and thus have access to shared mutable state. It is up to the programmer to be careful with this and, where possible, not do it (Go does provide lower-level constructs like mutexes if this is really what you need). Overall I find Go to be less strict than Clojure or Erlang in its suggested approach to communication in concurrent situations, which probably fits in well with its design as a systems language, and does make it quite flexible. Though at first blush I would still say the chamber of its shoot-footing-gun holds fewer bullets than those of C/C++ :-).
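To make the closure point concrete, here is a small sketch with two hypothetical functions. countWithMutex has goroutines update a variable captured from the enclosing scope, guarded by a sync.Mutex; countWithChannel keeps each goroutine's state local and sends only the result back. Without the lock, the first version would be a data race.

```go
package main

import (
	"fmt"
	"sync"
)

// countWithMutex shares total between goroutines through closure capture.
func countWithMutex(workers, perWorker int) int {
	var (
		mu    sync.Mutex
		wg    sync.WaitGroup
		total int // shared mutable state, visible to every closure below
	)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				mu.Lock() // removing the lock makes this a data race
				total++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return total
}

// countWithChannel shares nothing: each goroutine reports its own count.
func countWithChannel(workers, perWorker int) int {
	results := make(chan int)
	for w := 0; w < workers; w++ {
		go func() {
			local := 0
			for i := 0; i < perWorker; i++ {
				local++
			}
			results <- local
		}()
	}
	total := 0
	for w := 0; w < workers; w++ {
		total += <-results // one receive per worker
	}
	return total
}

func main() {
	fmt.Println(countWithMutex(4, 1000))   // prints 4000
	fmt.Println(countWithChannel(4, 1000)) // prints 4000
}
```

Running a program with go run -race is one way to have the toolchain flag unsynchronised access of the first kind.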