Golang Set Types Matter (A Little)
This is just to settle a dispute that mostly exists in my own head. It likely matters very, very little. However, it matters enough. I am currently interviewing for new jobs and during one interview, I was asked to design an algorithm for which I needed a set data type. Golang provides no built-in set data type, but it does provide a map, so I defined my set as follows:
type Set map[string]struct{}
var exists = struct{}{}
func (s *Set) Add(item string) {
s[item] = exists
}
func (s *Set) Contains(item string) bool {
_, hasItem := s[item]
return hasItem
}
I actually did not define anything that formally. I mostly just used a map that
way and said I was using it as a set as part of the interview. I suggested I
could make it into a formal type if the interview wanted to see me do it. The
dispute was that he suggested that instead of using struct{}
as the value type
for the map, it would be faster to use interface{}
instead (a.k.a. any
in
recent versions of Golang). However, I was pretty sure struct{}
should be
faster.
I believed this for two reasons.
-
I remember reading it somewhere, probably on StackOverflow or similar.
-
It makes logical sense that a language as mature as Golang with as much emphasis on compiler optimization as it has, would completely abstract away the
struct{}
part ofmap[string]struct{}
where the only possibly value isstruct{}{}
. Whereasmap[string]interface{}
still must still have a pointer value becauseinterface{}
is the opposite ofstruct{}
as the language’s most generic type: values of that type may hold a pointer to literally any type in Golang.
Anyway, this thought has been bugging me, so I did a quick benchmark to verify
this empirically. This is not at all exhaustive, but the results are clear and
tell the story I expected to be told. Note that I use a map[int]...
rather
than map[string]...
, but that’s purely for the purpose of laziness. It’s
easier to create unique integer keys than string keys. (Sadly, Golang lacks the
“AA"++ operation that made generating string keys relatively easy in Perl. I
couldn’t be bothered to find a library to do that for this. Writing this post
took longer than writing the benchmark.)
Here’s the benchmark program:
package main
import "testing"
type InterfaceMap map[int]any
var IExists any = nil
func BenchmarkInterfaceSet(b *testing.B) {
m := make(InterfaceMap, 10)
for i := 0; i < b.N; i++ {
m[i] = IExists
}
}
func nop() {}
var presetImap = make(InterfaceMap, 10_000)
func init() {
m := presetImap
for i := 0; i < 10_000; i++ {
m[i] = IExists
}
}
func BenchmarkInterfaceGet(b *testing.B) {
m := presetImap
for i := 0; i < b.N; i++ {
if _, isSet := m[i]; isSet {
nop()
}
}
}
type StructMap map[int]struct{}
var SExists = struct{}{} // H-Huh, H-Huh... He said... sexists. ~~ Butthead
func BenchmarkStructSet(b *testing.B) {
m := make(StructMap, 10)
for i := 0; i < b.N; i++ {
m[i] = SExists
}
}
var presetSmap = make(StructMap, 10_000)
func init() {
m := presetSmap
for i := 0; i < 10_000; i++ {
m[i] = SExists
}
}
func BenchmarkStructGet(b *testing.B) {
m := presetSmap
for i := 0; i < b.N; i++ {
if _, isSet := m[i]; isSet {
nop()
}
}
}
And the results of go test -bench=. -count 5
on my laptop:
goos: linux
goarch: amd64
pkg: github.com/zostay/scratch
cpu: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
BenchmarkInterfaceSet-16 7397239 185.9 ns/op
BenchmarkInterfaceSet-16 9243465 200.5 ns/op
BenchmarkInterfaceSet-16 9258003 173.6 ns/op
BenchmarkInterfaceSet-16 9453066 172.6 ns/op
BenchmarkInterfaceSet-16 9301214 175.2 ns/op
BenchmarkInterfaceGet-16 99072818 12.15 ns/op
BenchmarkInterfaceGet-16 100000000 12.14 ns/op
BenchmarkInterfaceGet-16 97749759 11.83 ns/op
BenchmarkInterfaceGet-16 100000000 11.76 ns/op
BenchmarkInterfaceGet-16 100000000 11.64 ns/op
BenchmarkStructSet-16 13275610 121.8 ns/op
BenchmarkStructSet-16 14263959 141.3 ns/op
BenchmarkStructSet-16 14711151 146.4 ns/op
BenchmarkStructSet-16 15161526 151.7 ns/op
BenchmarkStructSet-16 15210423 148.7 ns/op
BenchmarkStructGet-16 100000000 10.02 ns/op
BenchmarkStructGet-16 100000000 10.11 ns/op
BenchmarkStructGet-16 100000000 10.13 ns/op
BenchmarkStructGet-16 100000000 10.03 ns/op
BenchmarkStructGet-16 127146160 9.405 ns/op
PASS
ok github.com/zostay/scratch 31.801s
I figured the Set operations would perform differently, but I did not really
expect the Get operations to be as different as they are. The median Set
operation is 20% faster using struct{}
and the Get is 18% faster in my little
test with no benchmark from struct{}
being slower than any equivalent in
interface{}
. That’s conclusive as far as I’m concerned.
Now, in actuality, this performance difference will matter very little in most
applications. However, if performance matters to your application at all, it is
clear that using map[int]struct{}
will have superior CPU performance to
map[int]interface{}
. I would expect that improvement to be seen generally
anywhere a struct{}
will suffice in place of an interface{}
. I would expect
memory performance to be improved as well, though I leave that hypothesis as an
exercise for the reader to test.
Cheers.