Back

Simulating MapReduce with Golang

Dec 19, 2025 • By gravity.blog

Simulating MapReduce with Golang

MapReduce is a programming model for processing large data sets with a parallel, distributed algorithm on a cluster. In this post, we'll simulate a basic MapReduce flow using Go's goroutines and channels.

The Core Concept

The process consists of three main stages:

  1. Map: Processing input into key-value pairs.
  2. Shuffle: Grouping values by key (simplified here).
  3. Reduce: Aggregating values for each key.

Implementation in Go

Here is a simplified simulation where we count words in a slice of strings.

package main import ( "fmt" "strings" "sync" ) // Map function: splits string into words and emits (word, 1) func mapper(input string, output chan<- map[string]int) { counts := make(map[string]int) words := strings.Fields(input) for _, word := range words { counts[strings.ToLower(word)]++ } output <- counts } // Reduce function: sums up word counts func reducer(maps []map[string]int) map[string]int { result := make(map[string]int) for _, m := range maps { for word, count := range m { result[word] += count } } return result } func main() { inputs := []string{ "Hello world", "MapReduce in Go is fun", "Hello Go world", } outputChan := make(chan map[string]int, len(inputs)) var wg sync.WaitGroup // Map Stage for _, input := range inputs { wg.Add(1) go func(s string) { defer wg.Done() mapper(s, outputChan) }(input) } wg.Wait() close(outputChan) // Collect Results var maps []map[string]int for m := range outputChan { maps = append(maps, m) } // Reduce Stage finalResults := reducer(maps) fmt.Println("Word Counts:") for word, count := range finalResults { fmt.Printf("%s: %d\n", word, count) } }

Why Go?

Go is uniquely suited for MapReduce-like tasks because of its built-in concurrency model. Goroutines are lightweight, and channels provide a safe way to communicate between them, making parallel processing intuitive and efficient.

Happy scaling!