Use xxhash for composite checker keys instead of strings #2288
+240
−237
Often when I see people's profiles, the top allocator is `WriteByte`, all from the checker building runtime map keys.
We stringify data to use as map keys for union types, intersection types, etc. In JS, this is cheap thanks to the "rope" optimization where the strings are never actually produced unless required (which does not include `Map` lookups!).
But in Go, this doesn't work so well because we do actually have to back strings with real fully-written memory, and it's not easy to do a lookup otherwise. There's no magic string type that can expand itself when read. A Go stdlib map that allows custom hashing/equality is seemingly not coming soon.
An alternative is to just hash the data and use that hash as a key. A while ago I had tried using `sha256` to do this (like gopls does), but did not see much success. And introducing `crypto` into the binary is a can of worms I don't want to open. So, I gave up on that idea.
We have since then added xxhash to our dependencies for incremental mode, the LSP, etc.
A 128-bit xxhash could be argued to have enough collision resistance; UUIDs are of course also 128-bit. What are the chances of a UUID-level collision happening within a single compile? 😅
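To put a rough number on that question, here is a small sketch of the usual birthday-bound approximation. It assumes the hash output behaves like a uniformly random 128-bit value, which is an assumption for a non-cryptographic hash, and the function name and key count are made up for illustration.

```go
package main

import "math"

// collisionProbability returns the birthday-bound approximation
//   p ≈ n(n-1)/2 / 2^bits
// for n hash values drawn uniformly from a space of 2^bits values.
// The approximation is only meaningful while p is small.
// Assumption: the hash output is uniformly distributed.
func collisionProbability(n float64, bits int) float64 {
	pairs := n * (n - 1) / 2
	return pairs / math.Pow(2, float64(bits))
}
```

With a hypothetical ten million distinct keys in one compile, `collisionProbability(1e7, 128)` comes out on the order of 1e-25.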
This PR tries that out, swapping our key builder out for one that hashes the data as it comes in instead.
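A minimal sketch of the idea, not the code in this PR: `KeyBuilder`, `Key128`, and `unionKey` are hypothetical names, and the standard library's 128-bit FNV-1a stands in for xxhash so the example runs without extra dependencies. The point is only that data is fed into a running hash as it arrives, and the resulting fixed-size array is used directly as a map key.

```go
package main

import (
	"encoding/binary"
	"hash"
	"hash/fnv"
)

// Key128 is a fixed-size array, so it is comparable and usable as a map key
// without building a string first.
type Key128 [16]byte

// KeyBuilder is a hypothetical key builder that hashes data as it comes in.
// Assumption: FNV-1a (128-bit) is used here as a stand-in for xxhash.
type KeyBuilder struct {
	h hash.Hash
}

func NewKeyBuilder() *KeyBuilder {
	return &KeyBuilder{h: fnv.New128a()}
}

// WriteTag mixes a single discriminator byte into the hash.
func (b *KeyBuilder) WriteTag(tag byte) {
	b.h.Write([]byte{tag})
}

// WriteID mixes a fixed-width ID into the hash. Fixed width means no
// separator is needed to keep adjacent IDs unambiguous.
func (b *KeyBuilder) WriteID(id uint64) {
	var buf [8]byte
	binary.LittleEndian.PutUint64(buf[:], id)
	b.h.Write(buf[:])
}

// Key returns the 128-bit digest of everything written so far.
func (b *KeyBuilder) Key() Key128 {
	var k Key128
	copy(k[:], b.h.Sum(nil))
	return k
}

// unionKey builds a key for a list of hypothetical type IDs.
func unionKey(typeIDs []uint64) Key128 {
	b := NewKeyBuilder()
	b.WriteTag('|')
	for _, id := range typeIDs {
		b.WriteID(id)
	}
	return b.Key()
}
```

A cache can then be declared as `map[Key128]T` and looked up with `cache[unionKey(ids)]`. This sketch still allocates (the hasher itself and `Sum(nil)`), so it illustrates the shape of the approach rather than the allocation savings.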
The effect seems pretty good. VS Code before:
And after:
The time in general seems unchanged, but with 35% fewer allocations.
Downside is of course that the map keys become unreadable and irreversible. Right now, they're only "unreadable" thanks to us making numbers into shorter strings. We could have a mode which preserves the data behind a build tag. Not sure about that.
Thankfully, these keys are never actually exposed through any API, so should be opaque enough.