Go concurrency overview

One of the things that drew me to Go is concurrency, a feature the language is famous for. Go's concurrency model is based on Communicating Sequential Processes (CSP), which Tony Hoare first described in 1978. Yes, he is the same person who invented the Quicksort algorithm.

This article will go through four fundamental components of concurrency in Go: goroutines, channels, the select keyword, and WaitGroup.

Words of caution

Concurrency is about dealing with lots of things at once. Parallelism is about doing lots of things at once. Not the same, but related.

One is about structure, one is about execution.

Concurrency provides a way to structure a solution to solve a problem that may (but not necessarily) be parallelizable.

-- Rob Pike, Co-inventor of the Go language

More concurrency does not automatically make things faster, and it can make code harder to understand. Concurrency is not parallelism. When you write concurrent code, you hope that it will run in parallel, but there is no guarantee. Parallelism is a property of the runtime of our program, not the code.

You should use concurrency in two cases. First, when you want to combine data from multiple operations that can run independently. Second, when the work that runs concurrently takes a long time, for instance, I/O such as reading from or writing to a disk or network. Otherwise, the overhead of passing values between goroutines overwhelms any potential time savings you would gain.

Goroutines

Goroutines are one of the basic units of organization in a Go program. Every Go program has at least one goroutine: the main goroutine, which is automatically created and started when the process begins.

A goroutine is a function that runs concurrently alongside other code. To use it, place the go keyword before a function.

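package main

import "fmt"

func sayHello() {
    fmt.Println("hello")
}

func main() {
    // Start sayHello in a new goroutine; main does not wait for it.
    go sayHello()
    // continue doing other things

    // You can use an anonymous function as well
    go func() {
        fmt.Println("Hello")
    }()

    // Note: when main returns, the program exits, even if the goroutines
    // above have not run yet, so this program may print nothing.
}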

Goroutines are not OS threads, and they are not exactly green threads. They are closer to coroutines that the Go runtime schedules onto OS threads. Because goroutines are not OS threads, they have several advantages. They are quick to create, and their initial stacks are much smaller than an OS thread's stack and grow as needed. Switching between goroutines is also fast. As a result, you can spawn a huge number of goroutines in one program, whereas a program that creates the same number of native threads will slow to a crawl.
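To give a feel for how cheap goroutines are, here is a minimal sketch that starts a large number of them. The helper name spawn and the figure of 100,000 are only illustrative assumptions, and the sketch borrows sync.WaitGroup (covered in the WaitGroup section of this article) just to keep the program alive until every goroutine has finished.

```go
package main

import (
    "fmt"
    "sync"
    "sync/atomic"
)

// spawn (a hypothetical helper) launches n goroutines, each of which
// increments a shared counter once, and returns the final count after
// all of them have finished.
func spawn(n int) int64 {
    var counter int64
    var wg sync.WaitGroup
    wg.Add(n)
    for i := 0; i < n; i++ {
        go func() {
            defer wg.Done()
            atomic.AddInt64(&counter, 1) // safe concurrent increment
        }()
    }
    wg.Wait() // block until every goroutine has called Done
    return atomic.LoadInt64(&counter)
}

func main() {
    fmt.Println(spawn(100000)) // 100000
}
```

On a typical machine this should finish quickly, although the exact timing depends on the hardware and the Go version.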

Channels

Channels are how goroutines communicate. Channels are a built-in type through which you can send and receive values. Like maps and slices, channels must be created before use. This is how you create a channel:

ch := make(chan int)

Channels are reference types, like maps. When you pass a channel to a function, you are passing a reference to the same channel, not a copy of it.

To interact with a channel, you can use the <- operator. If you want to read from a channel, place the <- operator to the left of the channel. If you want to write to a channel, place the <- operator to the right of the channel.

x := <-ch // reads a value from channel ch and assigns it to x
ch <- y   // writes the value of y to channel ch

By default, a read or a write blocks until the other side is ready. If you write to an open channel, the writing goroutine will pause until another goroutine reads from the same channel. Likewise, a read from an open channel causes the reading goroutine to pause until another goroutine writes to the same channel.

Each value written to a channel can only be read once. If multiple goroutines are reading from the same channel, a value written to the channel will only be read by one of them. This allows goroutines to synchronize without explicit locks or condition variables.
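The following sketch illustrates both points: sends on an unbuffered channel wait for a receiver, and each value is delivered to exactly one of several competing readers. The names fanOut and tally are hypothetical, and the sketch relies on close and range, which the Range and Close section of this article explains.

```go
package main

import "fmt"

// tally records what a worker received.
type tally struct{ sum, count int }

// fanOut (a hypothetical helper) sends 1..n on an unbuffered channel while
// `workers` goroutines compete to receive the values. It returns the
// combined sum and count of everything the workers received.
func fanOut(n, workers int) tally {
    jobs := make(chan int)      // unbuffered: each send waits for a receiver
    results := make(chan tally) // each worker reports its own tally once

    for w := 0; w < workers; w++ {
        go func() {
            var t tally
            for v := range jobs { // ends when jobs is closed
                t.sum += v
                t.count++
            }
            results <- t
        }()
    }

    for i := 1; i <= n; i++ {
        jobs <- i // blocks until some worker receives i
    }
    close(jobs)

    var total tally
    for w := 0; w < workers; w++ {
        t := <-results
        total.sum += t.sum
        total.count += t.count
    }
    return total
}

func main() {
    total := fanOut(10, 3)
    fmt.Println(total.sum, total.count) // 55 10
}
```

Which worker receives which value varies from run to run, but the total count always equals the number of values sent, because no value is delivered twice.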

Buffered channels

By default, channels are unbuffered. But channels can be buffered. A buffered channel is created by specifying the capacity of the buffer when creating the channel:

ch := make(chan int, 10)

If the buffer fills up before any value is read from the channel, a subsequent write pauses the writing goroutine until a value is read. Likewise, a read from a channel with an empty buffer blocks until a value is written.

The built-in functions len and cap return information about a buffered channel. Use len to find out how many values are currently in the buffer and use cap to find out the maximum buffer size. Once the channel is created, the capacity of the buffer cannot be changed.
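As a small sketch of how len and cap behave, the example below fills part of a buffer without any receiver running. The helper bufferState is a made-up name for illustration; it caps the number of sends at the capacity so that it can never block.

```go
package main

import "fmt"

// bufferState (a hypothetical helper) creates a buffered channel, sends up
// to `sent` values into it, and reports the resulting length and capacity.
func bufferState(capacity, sent int) (length, maxSize int) {
    ch := make(chan int, capacity)
    if sent > capacity {
        sent = capacity // one more send would block forever: nobody is reading
    }
    for i := 0; i < sent; i++ {
        ch <- i // does not block while the buffer has room
    }
    return len(ch), cap(ch)
}

func main() {
    ch := make(chan int, 3)
    ch <- 10
    ch <- 20
    fmt.Println(len(ch), cap(ch)) // 2 3

    <-ch                          // a read frees one slot in the buffer
    fmt.Println(len(ch), cap(ch)) // 1 3

    fmt.Println(bufferState(3, 5)) // 3 3
}
```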

Range and Close

You can close a channel when there are no more values to send. You can test whether a channel has been closed by assigning a second value from the receive expression.

close(ch)

// Check whether the channel is closed.
// exists is false if there are no more values to read and the channel is closed.

v, exists := <-ch

You can read from a channel using a for-loop construct. It will repeatedly read from the channel until the channel is closed.

for v := range ch {
    fmt.Println(v)
}

Once a channel is closed, any attempt to write to it or close it again will panic. Attempting to read from a closed channel, however, always succeeds: you get any values still in the buffer, and after that the zero value of the channel's type.

// Writing to a closed channel will panic
var ch = make(chan string, 1)
ch <- "Hello, world!"
close(ch)
ch <- "This will panic"

// Reading from a closed channel returns the zero value once the buffer is empty
var ch = make(chan int, 2)
ch <- 1
ch <- 2
close(ch)
for i := 0; i < 3; i++ {
    fmt.Println(<-ch)
}
// This will output 1, 2 and 0, one per line
// No panic

Closing a channel is only required if a goroutine is waiting for the channel to close, such as one that uses a for-range loop to read from it.
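A common way to put this together, sketched below, is to let the sending goroutine close the channel so that a for-range loop on the receiving side knows when to stop. The function names produce and collect are hypothetical, and the receive-only type <-chan int is used only to make the direction of the data explicit.

```go
package main

import "fmt"

// produce (a hypothetical helper) sends 0..n-1 on a new channel from its
// own goroutine and closes the channel when it has nothing more to send.
func produce(n int) <-chan int {
    ch := make(chan int)
    go func() {
        defer close(ch) // the sender, not the receiver, closes the channel
        for i := 0; i < n; i++ {
            ch <- i
        }
    }()
    return ch
}

// collect drains ch into a slice; the loop ends when ch is closed.
func collect(ch <-chan int) []int {
    var out []int
    for v := range ch {
        out = append(out, v)
    }
    return out
}

func main() {
    fmt.Println(collect(produce(5))) // [0 1 2 3 4]
}
```

Without the close, the range loop in collect would wait forever for another value.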

Nil channel

You can declare a nil channel, but you should not. If you send a message to a nil channel, it blocks forever. If you read from a nil channel, it also blocks forever.

// A send to a nil channel blocks forever
var ch chan string    // declare nil channel
ch <- "Hello, World"

// A receive from a nil channel blocks forever
var ch chan string    // nil channel
y := <-ch
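Running either snippet above on its own makes the program hang or crash with a deadlock error, so here is a hedged sketch that observes the same behavior safely. It uses the select statement (described in the next section) together with time.After from the standard library as a timeout; the helper name blocksFor and the 50-millisecond wait are assumptions chosen for illustration.

```go
package main

import (
    "fmt"
    "time"
)

// blocksFor (a hypothetical helper) reports whether a receive on ch is
// still blocked after waiting ms milliseconds.
func blocksFor(ch chan string, ms int) bool {
    select {
    case <-ch:
        return false // a value arrived in time
    case <-time.After(time.Duration(ms) * time.Millisecond):
        return true // nothing arrived: the receive was still blocked
    }
}

func main() {
    var nilCh chan string             // nil: declared but never created with make
    fmt.Println(blocksFor(nilCh, 50)) // true

    ready := make(chan string, 1)
    ready <- "hi"
    fmt.Println(blocksFor(ready, 50)) // false
}
```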

Select

The select keyword allows a goroutine to read from or write to one of a set of multiple channels. Each case in a select statement either reads from or writes to a channel, and select executes a case whose read or write can proceed. If no case can run, select blocks. If you add a default case, select does not block: when no other case is ready, the default case runs immediately.

select {
case v := <-ch:
    fmt.Println(v)
case v := <-ch2:
    fmt.Println(v)
case ch3 <- x:
    fmt.Println("wrote", x)
case <-ch4:
    fmt.Println("ignore value from channel")
default:
    fmt.Println("no channel can be read or written")
}

The interesting thing about select is how it behaves when multiple cases are ready. It picks one of the ready cases at random; the order of the cases does not matter. This is very different from a switch statement, which chooses the first case that evaluates to true.

select helps with two problems that make concurrency hard: starvation and deadlock. It addresses starvation by picking a ready case at random, so no case is favored over another. It helps avoid deadlock because it considers all of its cases together and proceeds with any case that can run, rather than blocking on one particular channel.
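The sketch below shows two of these behaviors under simple assumptions: a default case that turns a send into a non-blocking attempt, and the random choice when two cases are ready at once. The helper names trySend and countPicks are hypothetical.

```go
package main

import "fmt"

// trySend (a hypothetical helper) attempts a non-blocking send: if the
// send cannot proceed right now, the default case runs instead.
func trySend(ch chan<- int, v int) bool {
    select {
    case ch <- v:
        return true
    default:
        return false
    }
}

// countPicks (a hypothetical helper) makes two channels ready at the same
// time and counts how often select chooses each one over n rounds.
func countPicks(n int) (a, b int) {
    chA := make(chan struct{}, 1)
    chB := make(chan struct{}, 1)
    for i := 0; i < n; i++ {
        chA <- struct{}{}
        chB <- struct{}{}
        select {
        case <-chA:
            a++
            <-chB // drain the other channel so both are empty next round
        case <-chB:
            b++
            <-chA
        }
    }
    return a, b
}

func main() {
    full := make(chan int, 1)
    fmt.Println(trySend(full, 1)) // true: the buffer has room
    fmt.Println(trySend(full, 2)) // false: the buffer is full, default runs

    a, b := countPicks(1000)
    fmt.Println(a+b == 1000) // true; the split between a and b varies per run
}
```

Over many rounds both counts should be well above zero, which is the practical meaning of "no case is favored".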

WaitGroup

Sometimes one goroutine needs to wait for multiple goroutines to complete their work. When you are waiting on several goroutines, you can use a WaitGroup, which is found in the sync package in the standard library.

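package main

import (
    "fmt"
    "sync"
    "time"
)

// w simulates one second of work and signals the WaitGroup when it finishes.
func w(id int, wg *sync.WaitGroup) {
    defer wg.Done() // runs even if the function panics
    time.Sleep(time.Second)
    fmt.Println("worker", id, "done")
}

func main() {
    var wg sync.WaitGroup
    routineCount := 5
    wg.Add(routineCount) // register all workers before starting them
    for i := 1; i <= routineCount; i++ {
        go w(i, &wg) // pass a pointer so every worker shares the same WaitGroup
    }
    wg.Wait() // block until all five workers have called Done
}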

A sync.WaitGroup doesn’t need to be initialized, just declared, as its zero value is useful. When you pass a sync.WaitGroup to a function, pass a pointer to it, as the code above does. This is because you must ensure that every place that uses a sync.WaitGroup is using the same instance. If you pass the sync.WaitGroup to the goroutine function by value, then the function has a copy, and the call to Done won’t decrement the original sync.WaitGroup.

There are three methods of sync.WaitGroup:

  1. Add, which increments the counter of goroutines to wait for. Add is usually called once, with the number of goroutines that will be launched.
  2. Done, which decrements the counter and is called by a goroutine when it is finished. Done is called within the goroutine. To ensure that it is called, even if the goroutine panics, we use a defer.
  3. Wait, which pauses its goroutine until the counter hits zero.

While WaitGroups are handy, they shouldn’t be your first choice when coordinating goroutines. Use them only when you have something to clean up (like closing a channel they all write to) after all of your worker goroutines exit.
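That cleanup case can be sketched as follows: several workers write to one shared channel, and a WaitGroup tells a separate goroutine when it is safe to close it. The function name sumOfSquares and the squaring workload are assumptions made up for this example.

```go
package main

import (
    "fmt"
    "sync"
)

// sumOfSquares (a hypothetical helper) starts one goroutine per input,
// has each write its result to a shared channel, and closes that channel
// once every worker has finished.
func sumOfSquares(inputs []int) int {
    results := make(chan int)
    var wg sync.WaitGroup

    wg.Add(len(inputs))
    for _, n := range inputs {
        go func(n int) {
            defer wg.Done()
            results <- n * n
        }(n)
    }

    // Close results only after all workers are done; closing earlier
    // would make a late worker panic on its send.
    go func() {
        wg.Wait()
        close(results)
    }()

    sum := 0
    for v := range results { // ends when results is closed
        sum += v
    }
    return sum
}

func main() {
    fmt.Println(sumOfSquares([]int{1, 2, 3, 4})) // 30
}
```

The Wait call runs in its own goroutine so that the main goroutine is free to read from results; waiting first and reading afterwards would deadlock on an unbuffered channel.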

Wrap up

This has been a short overview of concurrency in Go. In the following article, we will take a deeper look by solving some concurrency problems.