Send me an email!
What's happening in Go tip (2013-08-23)
Last week I published the first article in an ongoing series on changes in Go tip. It was met with a lot of positive feedback, so here’s the second article. Thank you for your support and I hope you enjoy this article just as much as the previous one.
What’s happening
This time we’ll be looking at:
- A new slicing syntax
- Performance improvements
- Fast, constant-time P-256 elliptic curves
- Where has godoc gone?
A new slicing syntax
Relevant CLs: CL 10743046, CL 12931044
The first change we will be looking at is probably the most controversial addition to Go 1.2: A new slicing syntax that allows setting the capacity of a slice. But before going into why it’s controversial let’s see what it’s about first.
Everyone should be familiar with the slicing operation mySlice[i:j]
,
creating a new slice, ranging from i
to j
. mySlice
and the new
slice will share the same backing array and manipulating one slice
will affect the other. That’s all pretty well known and expected
behaviour.
What some people might not realize is that the two slices will share
the capacity, too. A slice has a length and a capacity; the number of
elements it can see right now and the number of elements in can
potentially access. The capacity is mostly relevant when using
append()
¹: When appending to a slice, it first checks if there’s
enough room left in the underlying array (the capacity). If there is,
it creates a new slice with bigger length, pointing to the same
backing array, the same memory.
¹: One can also manually reslice past the length, if the capacity
allows it. This is basically what append()
is doing.
Consider the following piece of code:
orig := make([]int, 10, 20)
subSlice := orig[5:10]
fmt.Printf("len(orig) = %d, cap(orig) = %d, len(subSlice) = %d, cap(subSlice) = %d\n",
len(orig), cap(orig), len(subSlice), cap(subSlice))
// Output: len(orig) = 10, cap(orig) = 20, len(subSlice) = 5, cap(subSlice) = 15
orig = append(orig, 1)
fmt.Println(orig)
// Output: [0 0 0 0 0 0 0 0 0 0 1]
subSlice = append(subSlice, 2)
fmt.Println(orig)
// Output: [0 0 0 0 0 0 0 0 0 0 2]
It demonstrates that subSlice
, albeit being sliced to a specific
window of orig
, still has access to the same kind of memory – minus
the first 5 elements, since slices cannot grow backwards. It also
demonstrates that using append()
leads to “weird” results, where
we’d overwrite elements in the original slice.
Now, why is this an issue in real code? Imagine you’re writing your
own memory allocator based on []byte
– something that’s common when
dealing with a lot of tiny allocations that shouldn’t slow down the
GC. When the user asks for N
bytes of memory, you return a slice
that is a slice somewhere from that []byte
, with length N
. Now the
user does something stupid: He appends to it. As we’ve seen before,
this append will “leak” into memory beyond the length of the slice,
and beyond what the memory allocator intended. You just overwrote
someone else’s memory!
A safe alternative that still allows implementing a custom memory
allocator but preventing arbitrary memory corruption would be to limit
the capacity of the slice we return, so that an append wouldn’t modify
memory it shouldn’t. And that safe alternative is the new slicing
syntax: mySlice[i:j:k]
– The first two elements behave as usual,
slicing from i
to j
. The third element affects the capacity, where
the capacity will be calculated as k - i
.
k - i
as the capacity simply means that k - 1
is the absolute
index into the underlying array that will be reachable by the new
slice, same as j - 1
being the index for the length. By making k
equal to j
you create a slice that cannot look past its length. And
now we can write an allocator that returns N
bytes of memory and
doesn’t allow you to look any further.
In the introduction I wrote that this addition is controversial. Maybe
you can already guess why: If you’ve kept thinking to yourself “why
would I need this”, you guessed why. This feature is used very rarely.
Only some very specific use-cases require it, and looking at the
standard library won’t return many uses of it, either. That doesn’t
mean the feature shouldn’t exist at all, since it does solve a valid
problem, but it has been argued that the slicing syntax shouldn’t be
extended. Functions like setcap()
have been suggested to avoid
adding new syntax to Go, but in the end the Go team went with the new
slicing syntax.
Here’s the full discussion
if you’re interested in the back and forth. And yes, this will be in
Go 1.2.
There’s also a design document, describing the original motivations and concerns.
Performance improvements & Garbage be gone
Relevant CLs: CL 11573043, CL 9129044, CL 12894043, CL 8179043, CL 9492044, CL 9432046, CL 8819049, CL 9459044, CL 12603049, CL 12708046, CL 10079043, CL 8670044, CL 12680046, CL 12694048, CL 11874043, CL 12072045, CL 9915043, CL 12662043, CL 9462044
Go 1.2 will see a lot of performance improvements. These improvements fall into two categories: Raw higher speed thanks to better code and better compilers, and less work for the memory allocator and garbage collector thanks to less garbage.
Brad Fitzpatrick, in his quest for high performance HTTP, has been
doing a lot of work on the net/http
package, and related packages
you often use with it. Most of his changes fall into the “less
garbage” category. Since most changes are rather simple, I’ll only
list their CL numbers so you can check them out. They all include
benchmarks in their description: CL 9129044 (faster JSON encoding), CL 12894043 (faster ZIP compression), CL 8179043, CL 9492044 and CL 9432046 (several HTTP improvements)
There are, however, some CLs that are particularly interesting. One
set of CLs are CL 8819049, CL 9459044 and CL 12603049 – The first two
add a buffer pool and buffer reuse to bufio
and the last one removes
said buffer pool and buffer reuse, replacing it with a simple Reset
method. The story behind this is that these buffers would be handed to
users, and these could, by ignoring the API, potentially corrupt data
of other users, leading to hard to debug issues that are alike to “use
after free” errors, which Go explicitly avoids by not providing a
manual way of freeing memory. Additionally, it would set a precedent
of unnecessarily complex code for the sake of some additional
performance.
The newly added Reset
method manages to reach similar performance
improvements, at the cost of forcing the user to actively reuse
buffers instead of relying on the package for it. net/http
would be
one of these users, making use of Reset
in CL 12708046. Do note that
the included benchmark is relative to the bufio
pool, not Go 1.1.
For the original discussion around this, see issue 6086 by Russ Cox
One last change in Brad’s name is CL 10079043, which adds coalescing of duplicate in-flight DNS lookups. Or put differently: If doing the same DNS lookup a hundred times before one of them finishes, only one actual request will be made to servers.
But not only Brad did some work, other people contributed great improvements as well.
The most notable one would be CL 8670044, which implements an integrated network poller for Windows, leading to 30% performance improvements on that platform for networking operations – A luxury that Go 1.1 already brought for Linux.
Another notable change would be CL 11573043, which considerably speeds
up sync.Cond
– And completely gets rid of memory allocations in the
process.
CL 12680046 and CL 12694048 add fast routes to encoding/binary
,
handling slices of integer types, making these common types a lot
faster and cheaper.
DES encryption has become five times faster (which, sadly, is still very slow) thanks to CL 11874043 and CL 12072045.
bzip2 decompresses 30% faster (CL 9915043) and the DNS client in the
net
package produces ~350 bytes less of garbage (CL 12662043).
And last but not least, io.Copy
now prioritizes WriterTo over ReaderFrom, which can
lead to tremendous noticeable performance improvements (and
virtually no-ops in the case of ioutil.Discard
) when types implement
both functions.
Fast, constant-time P-256 elliptic curves
Relevant CLs: CL 10551044
One of my readers pointed out this very interesting change which adds a constant-time P-256 implementation, which at the same time happens to be a lot faster than the previous implementation.
And while performance is nice, the really important bit here is “constant-time”: Since the function takes the same amount of time for all possible inputs, it avoids being vulnerable to timing attacks, a form of side channel attacks in which timing information is used to break a cipher.
Where has godoc gone?
Using Go tip always requires you to be more aware of the development process and to expect breakage, weird behaviour and sudden changes.
Nevertheless, not everyone who uses Go tip has the time to check its
entire development history, especially if you’ve been told to “try Go
tip and see if your problem has been fixed”. As such, a fair number of
people have recently been surprised and confused when, after compiling
Go tip, godoc
wouldn’t work anymore, either because there was no
godoc
binary, or because it would print an error like this one:
readTemplate: open /Users/rsc/g/go/lib/godoc/codewalk.html: no such file or directory
In either case, the fix is to run go get
code.google.com/p/go.tools/cmd/godoc
– godoc
has been moved to the
go.tools
subrepository.
The future
Phew. We’re still catching up with all the development that happened
since Go 1.1, but we’re slowly getting there. Luckily current Go tip
development has slowed down, not yielding a lot of interesting
changes. And where important things change, like the godoc
change,
we mix them into the articles.
Attentive readers might have noticed that the previous article promised a look at shared linking support. Unfortunately it didn’t make its way into this article, to make room for more important changes and because the change to slicing syntax already took a bit to digest.
What’s planned for next week? More changes!