Working with The Graph Data Structure in Go
Representing Graphs in Code
Graphs are just a data structure where a bunch of nodes are connected to each other by imaginary lines called edges. They're really similar to linked lists and trees except pretty much any node can be connected to any other node. And also like linked lists, those connections can have directions, (for example A points to B), or can be undirected, where A would point to B and B would also point to A. It's basically like one way and two way streets. As a matter of fact, graphs can be used to model roads for objectives like finding the shortest path between two points. Or even analyzing connections between users on a social network. There's a lot more that they can be used for but these are some common examples. So how do we represent a graph in code so we can use one?
To represent graphs in code, we have basically two options, Adjacency Matricies and Adjacency Lists. Actually, your data can start off in a variety of different ways but putting it into these formats makes it easier to work with. For example, just say you had a database table that stored the information of who follows who. Your table data might look something like this:
A follows B A follows C B follows A B follows C C follows D
Which, if we write as a 2D array, would look like this:
['A', 'B'] ['A', 'C'] ['B', 'A'] ['B', 'C'] ['C', 'D']
To make our graph easier to work with, it would be helpful if we could represent it like this, so that we can easly see which nodes are connected to which:
A follows B, C B follows A, C C follows D
Or as an array:
[
['A', 'B', 'C'],
['B', 'A', 'C'],
['C', 'D']
]
If we make this into a hash map, with the 'from' nodes being the keys and the 'to' nodes being the elements of arrays paired to those keys, we get what we want. This is called an adjacency list.
{
'A' : ['B', 'C'],
'B' : ['A', 'C'],
'C' : ['D']
}
Graphs can be also be represented as adjacency matricies (which can work a lot better when you have a lot of edges), But in this example we're going to use adjacency lists because they're better for sparse maps (fewer edges). To convert an array of connections to an adjacency list, we're going to use Go's map datatype, which is essentiall a hash table. This gives us O(1) lookup, insert, and delete. Let's create a function that does the conversion.
In our example, the adjacency list will be a map where the key is a rune and the value is an slice of runes, so we'll have the function return that type. To create maps in Go, you have to use the make() function. It's used for creating slices, channels, and maps. Next we loop through the first level of the slice and grab the first element in each inner slice to be the 'from' node of this iteration. If no other elements come after that 'from' value in the slice, we set the adjacency list's map to be an empty slice at that index. This signifies that the node isn't pointing to any other nodes. Then, starting at the second element in the sub-slice, we loop through each remaining one and append them to the slice for this node. Then finally, we do another loop through those same elements to see if they also have keys set in the map and if so, set their values to empty slices, which again just signifies that they don't point to anything. This just let's us make sure we have a key for every node even if it's just being pointed to but doesn't point to anything.
func ArraytoAdjacencyList(array [][]rune) map[rune][]rune {
graph := make(map[rune][]rune)
for i := range array {
if len(array[i]) == 0 {
continue // skip empty rows
}
from := array[i][0]
if len(array[i]) == 1 {
graph[from] = []rune{} // node with no neighbors
continue
}
for j := 1; j < len(array[i]); j++ {
graph[from] = append(graph[from], array[i][j])
}
for j := 1; j < len(array[i]); j++ {
if _, exists := graph[array[i][j]]; !exists {
graph[array[i][j]] = []rune{}
}
}
}
return graph
}
Depth First Print using Iterative Method
To do a DFS print, we have to use stacks so we defer taking paths down neighbor nodes until the first direction we chose hits a wall (returns). So first, we'll create a stack. Next, to make sure we don't go down paths we've already tried, we'll make map of visited nodes. Maps are perfect for this because they enforce uniqueness of keys. In a programming language other than Go, you could use a Set to accomplish the same thing. We push our starting node onto the stack and start a loop that will continue as long as there are values in the stack. Inside that loop, we pop off the top element from the stack and check if it's already been visited. If so, we skip this iteration of the loop, if not, we insert the node into the visited map (or set) and print it's value. Then, we get all the neighbors (the nodes it points to) and loop through them, pushing them onto our stack as long as they haven't been visited yet.
func DepthFirstPrint(graph map[rune][]rune, start rune) {
var graph_stack = CreateStack(100)
visited := make(map[rune]bool)
graph_stack.Push(start)
for graph_stack.Top >= 0 {
current := graph_stack.Pop()
c, _ := current.(rune)
if visited[c] {
continue
}
visited[c] = true
fmt.Printf("current: %c\n", c)
neighbors := graph[c]
for i := len(neighbors) - 1; i >= 0; i-- {
if !visited[neighbors[i]] {
graph_stack.Push(neighbors[i])
}
}
}
}
This is depth first traversal because adding neighbor nodes we plan to visit to the stack means 'visiting' those directions get 'stuck under' the stack and can't be visited until the most recently selected path has been visted. Remember stacks give us LIFO (last in first out) behavior, which means you're traveling through whichever path's node has been last added on the stack.
Depth First Print using Recursive Method
Of course, this method still uses a stack but will leverage the call stack instead of having to create and manage our own. Less code but does the same thing.
func DepthFirstPrintRecursive(graph map[rune][]rune, start rune, visited map[rune]bool) {
if visited[start] == true {
return
}
fmt.Printf("current: %c\n", start)
visited[start] = true
neighbors := graph[start]
for i := len(neighbors) - 1; i >= 0; i-- {
DepthFirstPrintRecursive(graph, neighbors[i], visited)
}
}
Breadth First Print
To use BFS to walk through the graph, we basically do the same thing as the iterative stack approach, except we replace the stack with a queue. Queuing up neighbor nodes means all of a node's neighbor nodes will be visited before moving on to the next node in the adjacency list. Again, this is because queues are FIFO (first in first out).
func BreadthFirstPrint(graph map[rune][]rune, start rune) {
graph_queue := CreateQueue(100)
visited := make(map[rune]bool)
graph_queue.Enque(start)
for graph_queue.Size > 0 {
c, _ := graph_queue.Dequeue().(rune)
if visited[c] {
continue
}
visited[c] = true
fmt.Printf("current: %c\n", c)
neighbors := graph[c]
for i := len(neighbors) - 1; i >= 0; i-- {
if !visited[neighbors[i]] {
graph_queue.Enque(neighbors[i])
}
}
}
}
Has Path Using DFS
Let's look at a DFS solution for checking to see if a path exists between a source and a destination node. We'll use recursion so we don't have to create and manage our own stack. Every time we call this function, we'll check to see if the source equals the destination. This will be true if we ever call the method on a neighbor node that happens to be the one we're looking for. Again, we keep a map of all the visited nodes so we don't get caught in any cycles in our graph.
func HasPathDFS(graph map[rune][]rune, source rune, destination rune, visited map[rune]bool) bool {
if source == destination {
return true
}
if visited[source] {
return false
}
visited[source] = true
neighbors := graph[source]
for i := 0; i < len(neighbors); i++ {
if HasPathDFS(graph, neighbors[i], destination, visited) {
return true
}
}
return false
}
Has Path Using BFS
And we can do the same thing with a Breadth First Search approach by using a queue to determine which order we visit neighbor nodes.
func HasPathBFS(graph map[rune][]rune, source rune, destination rune) bool {
queue := CreateQueue(100)
queue.Enque(source)
visited := make(map[rune]bool)
for queue.Size > 0 {
current := queue.Dequeue().(rune)
if current == source {
return true
}
if visited[current] {
continue
}
visited[current] = true
neighbors := graph[source]
for i := 0; i < len(neighbors); i++ {
if !visited[neighbors[i]] {
queue.Enque(neighbors[i])
}
}
}
return false
}
Finding the Shortest Path
Ok, we've seen how to create a graph, traverse it, and check if a path exists between two nodes in it. Now let's look out how to find the shortest path between two nodes.
The main idea is that we want to use BFS because we're more likely to hit the destination by searching in all directions evenly. If we were to use DFS, we'd have to get lucky and hope we're going in the right direction when we start. Since every time we visit all the neighbor nodes of a particular node we're exploring one direction, we can increment the distance we've traveled by 1 and store that distance in our queue along with the value of the node. Aside from that, this algorithm is just like the one we used to check to see if a path exists. The only difference is, we also keep track of how many edges we had to go through to get from the source to the destination.
type Jump struct {
node rune
distance int
}
func ShortestDistanceBFS(graph map[rune][]rune, source rune, dest rune) int {
queue := CreateQueue(100)
visited := make(map[rune]bool)
queue.Enque(Jump{
node: source,
distance: 0,
})
for queue.Size > 0 {
current, _ := queue.Dequeue().(Jump)
if current.node == dest {
return current.distance
}
if visited[current.node] {
continue
}
visited[current.node] = true
neighbors := graph[current.node]
for _, neighbor := range neighbors {
if !visited[neighbor] {
queue.Enque(Jump{
node: neighbor,
distance: current.distance + 1,
})
}
}
}
return -1
}
Now that we see how to find how long the shortest path is, let's create a function that not only gives us that distance, but tells exactly which path was taken to get from the path to the destination. Something like this would be really useful in a social network where you wanted to analyze how certain people might be connected. The main difference from just finding the shortest path is that, instead of adding up the count of "jumps" from one node to the next, we'll keep a mapping of each node's parent node as we visit them (the one we came from to get there). Then, after we've found the destination, we can back-track through that mapping of parent nodes to find out how we got there.
func ShortestPathBFS(graph map[rune][]rune, source rune, dest rune) []rune {
queue := CreateQueue(100)
visited := make(map[rune]bool)
parents := make(map[rune]rune)
path_exists := false
queue.Enque(source)
for queue.Size > 0 {
current, _ := queue.Dequeue().(rune)
if current == dest {
path_exists = true
break
}
if visited[current] {
continue
}
visited[current] = true
neighbors := graph[current]
for _, neighbor := range neighbors {
if !visited[neighbor] {
queue.Enque(neighbor)
parents[neighbor] = current
}
}
}
if !path_exists {
return []rune{}
}
paths := make([]rune, 0)
d, ok := parents[dest]
for ok {
paths = append(paths, d)
d, ok = parents[d]
}
for i, j := 0, len(paths)-1; i < j; i, j = i+1, j-1 {
paths[i], paths[j] = paths[j], paths[i]
}
paths = append(paths, dest)
return paths
}
If we don't find a path, we just return an empty slice, but if we do, we loop through the 'parents' map to see which node led to which. Because we're back-tracking, we get the list in reverse order, so the last step is to reverse the 'paths' array so we get it in chronological order. I'm doing it manually with the 2 pointers technique but you could juse use Go's built in .Reverse() method to save some lines of code. Oh, then one more thing, we need to append the destination node since we exited the search loop before adding it to the parent's map.
There's a lot more you can model with graphs, but this is pretty much it for the basics.