6 Commits

Author SHA1 Message Date
a97aafca75 Merge remote-tracking branch 'origin' into refactor/name-bucket-slot-table
All checks were successful
CI / Check PR Title (pull_request) Successful in 30s
CI / Go Lint (pull_request) Successful in 48s
CI / Markdown Lint (pull_request) Successful in 34s
CI / Makefile Lint (pull_request) Successful in 49s
CI / Unit Tests (pull_request) Successful in 48s
CI / Fuzz Tests (pull_request) Successful in 1m17s
CI / Mutation Tests (pull_request) Successful in 1m23s
2026-04-15 23:13:41 -04:00
5c39182958 refactor: HashTable -> Table, table -> subtable
All checks were successful
CI / Check PR Title (pull_request) Successful in 29s
CI / Go Lint (pull_request) Successful in 39s
CI / Makefile Lint (pull_request) Successful in 48s
CI / Markdown Lint (pull_request) Successful in 31s
CI / Unit Tests (pull_request) Successful in 37s
CI / Fuzz Tests (pull_request) Successful in 1m38s
CI / Mutation Tests (pull_request) Successful in 1m19s
2026-04-13 21:11:37 -04:00
2eeff25efd Merge remote-tracking branch 'origin' into refactor/name-bucket-slot-table 2026-04-13 20:54:09 -04:00
6a5b40c097 docs: replaced instances of "bucket" with "table"
- Removed instances of `growthFactor`, as it is unexported.
- Typo in `HashTable.String()`.
2026-04-13 20:49:33 -04:00
395a3560c7 refactor: constructors, update docs
- NewCustomTable -> NewCustom
- NewTableBy -> NewBy
- NewTable -> New
2026-04-04 12:27:53 +02:00
2fd9da973b refactor: bucket -> table, Table -> HashTable 2026-04-04 12:22:42 +02:00
8 changed files with 79 additions and 666 deletions

View File

@@ -114,9 +114,6 @@ linters:
# Reports uses of functions with replacement inside the testing package. # Reports uses of functions with replacement inside the testing package.
- usetesting - usetesting
# Reports mixed receiver types in structs/interfaces.
- recvcheck
settings: settings:
revive: revive:
rules: rules:
@@ -201,7 +198,7 @@ linters:
# warns when initialism, variable or package naming conventions are not followed. # warns when initialism, variable or package naming conventions are not followed.
- name: var-naming - name: var-naming
misspell: misspell:
# Correct spellings using locale preferences for US or UK. # Correct spellings using locale preferences for US or UK.
# Setting locale to US will correct the British spelling of 'colour' to 'color'. # Setting locale to US will correct the British spelling of 'colour' to 'color'.

View File

@@ -1,542 +0,0 @@
# Designing an Idiomatic API Interface
We (the maintainers) built `go-cuckoo`'s API interface without design intent.
Up until now, we paid more attention to implementing the underlying functionality of cuckoo hashing.
With the fundamentals of the algorithm built, we should revisit the interface.
It should align closer to the following principles:
- **Congruency**
A `go-cuckoo` table should have the same core functionality as Go's built-in map.
- **Familiarity**
A `go-cuckoo` table should behave similarly to Go's standard map, so users will intuitively know how to use it.
In effect, its users will carry less cognitive load.
## Current State
### Interface of the built-in Map
Listed below is every interface provided by Go to the built-in map object.
Also included are the functions from the package `maps` in the standard library.
<details>
<summary>Interfaces</summary>
| # | built-in Interface | Description |
| --- | ---------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- |
| 1 | `m := make(map[K]V)` | Returns an empty map using the built-in `make()` function. |
| 2 | `m := make(map[K]V, hint)` | Returns an empty map using `make()`, with a capacity 'hint'. This hint is how many items the map expects to hold, _not_ a measure of how large it is. |
| 3 | `m := map[K]V{...}` | Returns a map, which may be filled with entries in the ellipsis (optional). |
| 4 | `var m map[K]V` | Defines an empty _variable_ that holds a map. This differs from #1 because `m` is uninitialized (nil) here. |
| 5 | `m[k] = v` | Assigns the value `v` to key `k`. |
| 6 | `v := m[k]` | Returns the value of `k` if it exists. Otherwise, `v` is the zero value of `V`. |
| 7 | `v, ok := m[k]` | Similar to #6, except `ok` reports whether `k` exists in `m`. This is comma-ok notation. |
| 8 | `for k, v := range m` | Iterates over every key-value pair in `m`. The order is random. |
| 9 | `delete(m, k)` | Removes the entry for key `k`. Returns no value. |
| 10 | `clear(m)` | Removes all entries in `m`. Returns no value. |
| 11 | `n := len(m)` | Returns the number of entries in `m`. If `m` is nil, returns 0. |
| 12 | `m2 := maps.Clone(m)` | Returns a copy of `m`. |
| 13 | `maps.Copy(dst, src)` | Assigns every entry of `src` in `dst`. |
| 14 | `ok := maps.Equal(m1, m2)` | Returns true iff `m1` and `m2` contain the same entries. |
| 15 | `ok := maps.EqualFunc(m1, m2, fn)` | Like #14, but with a custom comparator for non-comparable values. |
| 16 | `maps.DeleteFunc(m, fn)` | Removes every entry in `m` which satisfies `fn`. Returns no value. |
| 17 | `it2 := maps.All(m)` | Returns a 2D iterator over every key-value pair. |
| 18 | `it := maps.Keys(m)` | Returns an iterator over every key. |
| 19 | `it := maps.Values(m)` | Returns an iterator over every value. There can be duplicates. |
| 20 | `m := maps.Collect(seq)` | Returns a map, with every entry defined in a 2D iterator over key-value pairs. |
| 21 | `maps.Insert(m, seq)` | Assigns to `m` all key-value pairs in 2D iterator `seq`. Returns no value. |
</details>
### Interface of `go-cuckoo`
On the other hand, here is the current contract for `go-cuckoo`.
<details>
<summary>Interfaces</summary>
| # | `go-cuckoo` Interface | Description |
| --- | -------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------- |
| 1 | `m := New(opts...)` | Creates a table using the default hash and equal function. The options configure its behavior. Confined to comparable keys. |
| 2 | `m := NewBy(keyFunc, opts...)` | Like #1, but allows any key type. A `keyFunc` is used to derive a comparable key. |
| 3 | `m := NewCustom(hashA, hashB, equalFunc, opts...)` | Like #1, but allows control over the hashes used to allow any key type. An `equalFunc` determines key equality. |
| 4 | `seq := m.Entries()` | Returns an unordered 2D iterator of all key-value pairs in the table. |
| 5 | `v := m.Find(k)` | Returns the value for `k` in the table, matching plain indexing of a built-in map. |
| 6 | `v, ok := m.Get(k)` | Returns the value for `k` in the table. Also returns true if `k` exists, otherwise false. When false, `v` is undefined. |
| 7 | `ok := m.Has(k)` | Returns true if `k` is in the table. |
| 8 | `err := m.Put(k, v)` | Sets value `v` for key `k`. Returns an error if the value cannot be set. |
| 9 | `n := m.Size()` | Returns the number of items in `m`. |
| 10 | `str := m.String()` | Returns `m` as a string in the format "table[k1:v1 k2:v2 ...]". |
| 11 | `cap := m.TotalCapacity()` | Returns how many slots `m` has allocated. |
| 12 | `ok := m.Drop(k)` | Removes `k` from the table. Returns whether the key had existed. |
</details>
### Determining Congruency
So, how does the core functionality compare?
Listed below is an analysis of every interface in Go's standard map.
Each is compared against what `go-cuckoo` offers, and categorized into the following groups:
- ✅ Covered: an analog exists.
- ⚠️ Partial: workaround available.
- ❌ Gap: no analog yet; addressed in [Target State](#solving-congruency).
Specifically, here we are checking for functionality.
Is there functionality that this offers which `go-cuckoo` does not?
We are checking accessibility, but not discoverability.
The latter will be considered later.
<details>
<summary>✅ <code>m := make(map[K]V)</code></summary>
The analog is `m := New()`.
</details>
<details>
<summary>⚠️ <code>m := make(map[K]V, hint)</code></summary>
This has no simple analog.
It is close to `m := New(Capacity(hint))`, but it assigns starting capacity, not expected size.
For the built-in map, these are two separate things.
- Capacity is an internal measure, used to optimize space/speed.
It is hidden from the user because it depends on the underlying implementation, which may change.
- Expected size is the number of items the map must be able to hold before resizing.
This is tangible and agnostic to implementation, which is why it is exposed to the user.
In short, this interface defines expected size, but `Capacity()` defines capacity.
</details>
<details>
<summary>❌ <code>m := map[K]V{...}</code></summary>
This has no simple analog, the closest being:
```go
m := New[K, V]()
for k, v := range startingEntries {
	m.Put(k, v)
}
```
It is idiomatic, but far less ergonomic.
</details>
<details>
<summary>✅ <code>var m map[K]V</code></summary>
The analog is `var m Table[K, V]`.
</details>
<details>
<summary>✅ <code>m[k] = v</code></summary>
The analog is `err := m.Put(k, v)`.
</details>
<details>
<summary>✅ <code>v := m[k]</code></summary>
The analog is `v := m.Find(k)`.
</details>
<details>
<summary>✅ <code>v, ok := m[k]</code></summary>
The analog is `v, ok := m.Get(k)`.
</details>
<details>
<summary>✅ <code>for k, v := range m</code></summary>
The analog is `for k, v := range m.Entries()`.
</details>
<details>
<summary>✅ <code>delete(m, k)</code></summary>
The analog is `ok := m.Drop(k)`.
</details>
<details>
<summary>❌ <code>clear(m)</code></summary>
There is no analog.
The easiest way to do this is to delete all items individually:
```go
for k := range m.Entries() {
	m.Drop(k)
}
```
</details>
<details>
<summary>✅ <code>n := len(m)</code></summary>
The analog is `n := m.Size()`.
</details>
<details>
<summary>❌ <code>m2 := maps.Clone(m)</code></summary>
There is no analog.
The easiest way to do this currently is to make a new table, and manually add the items.
```go
m2 := cuckoo.New[K, V]()
for k, v := range m.Entries() {
	m2.Put(k, v)
}
```
This gets complicated by the various options available to the user.
Furthermore, any custom `EqualFunc`, `keyFunc` or `Hash` is not transferred.
</details>
<details>
<summary>❌ <code>maps.Copy(dst, src)</code></summary>
There is no analog.
The simplest way to do this is with a for-loop.
```go
for k, v := range src.Entries() {
	dst.Put(k, v)
}
```
</details>
<details>
<summary>❌ <code>ok := maps.Equal(m1, m2)</code></summary>
There is no analog.
Users have to manually check the key-value pairs to determine equality.
</details>
<details>
<summary>❌ <code>ok := maps.EqualFunc(m1, m2, fn)</code></summary>
There is no analog.
Users have to manually check the key-value pairs to determine equality.
</details>
<details>
<summary>❌ <code>maps.DeleteFunc(m, fn)</code></summary>
There is no analog.
Users have to manually delete keys.
</details>
<details>
<summary>✅ <code>it2 := maps.All(m)</code></summary>
The analog is `it2 := m.Entries()`.
</details>
<details>
<summary>⚠️ <code>it := maps.Keys(m)</code></summary>
There is no simple analog.
A close neighbor is `it2 := m.Entries()`.
Users can use this in a for-loop, and pick out just the keys:
```go
for k := range m.Entries() {
	// ...
}
```
</details>
<details>
<summary>⚠️ <code>it := maps.Values(m)</code></summary>
There is no simple analog.
A close neighbor is `it2 := m.Entries()`.
Users can use this in a for-loop, and pick out just the values:
```go
for _, v := range m.Entries() {
	// ...
}
```
</details>
<details>
<summary>❌ <code>m := maps.Collect(seq)</code></summary>
There is no analog.
</details>
<details>
<summary>❌ <code>maps.Insert(m, seq)</code></summary>
There is no analog.
</details>
## Target State
### Solving Congruency
We should make the following changes to accommodate congruency:
<details>
<summary><code>ok := maps.EqualFunc(m1, m2, fn)</code></summary>
We should implement a new function:
```go
func EqualFunc[K, V1, V2 any](t1 *Table[K, V1], t2 *Table[K, V2], eq func(V1, V2) bool) bool
```
This function is free, and not bound as a receiver function.
(It is called as `cuckoo.EqualFunc(t1, t2, eq)`, not `t1.EqualFunc(t2, eq)`.)
The latter implies `t1` has authority, when in fact neither table does.
We define equality as:
1. Neither table has a key the other doesn't.
2. Each key has the same value in each table.
Parameter `eq` determines this equality.
Custom `EqualFunc`s complicate this, as they modulate key identity in tables.
If two tables may differ on whether two keys are different, this function might break.
So, we must assume that:
- Both tables have `EqualFunc`s which 'agree' on the identity of the keys present in the tables.
Agreement is defined as: if two keys are distinct in one table, they are distinct in the other.
The name `EqualFunc` is already taken by the type `EqualFunc[K]`: an alias for `func(a, b K) bool`.
Inlining `EqualFunc[K]` would solve this problem.
We will move the documentation attached to it to `DefaultEqualFunc`.
</details>
<details>
<summary><code>ok := maps.Equal(m1, m2)</code></summary>
We should implement a new function, to conform with the standard library:
```go
func Equal[K any, V comparable](t1, t2 *Table[K, V]) bool
```
It uses the same equality check as in `EqualFunc`.
Once again, the function is free because it is symmetric.
</details>
<details>
<summary><code>maps.Insert(m, seq)</code></summary>
We should implement a new receiver for the table:
```go
func (t *Table[K, V]) Insert(seq iter.Seq2[K, V]) error
```
A receiver fits better even though `maps.Insert` is a free function, because insertion is asymmetric.
Table `t` receives entries from `seq`.
It's only free because Go's standard map is built into the language, and so cannot have receivers.
In terms of naming, `t.Extend` is more accurate, and has precedent in [Python](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists) and [Rust](https://doc.rust-lang.org/std/iter/trait.Extend.html).
When [adding iterator functions](https://github.com/golang/go/issues/61900) to the `maps` package, the Go team chose to frame it as 'sources' and 'sinks'.
With this model, `maps.Insert` made more sense than `maps.Extend`.
Ultimately, `t.Insert()` is a better choice to be consistent with `maps`.
</details>
<details>
<summary><code>maps.Copy(dst, src)</code></summary>
We should implement a new receiver for the table:
```go
func (t *Table[K, V]) Copy(src *Table[K, V]) error
```
Its functionality should match that of `t.Insert()`.
A receiver fits better even though `maps.Copy` is a free function, because copying is asymmetric: `dst` is written into by `src`.
It is only free because Go's standard map is built into the language, and so cannot have receivers.
The name `t.Merge()` might be more accurate, but it does not work because:
- `t.Copy()` matches Go's built-in `copy()`, and `io.Copy()`. The Go team used [the same logic](https://github.com/golang/go/discussions/47330#discussioncomment-1167799) to name `maps.Copy()`.
In this case, `t.Merge()` would be an outlier.
- `t.Merge()` implies some sort of conflict-resolution, when there is not.
It simply overwrites the values.
</details>
<details>
<summary><code>maps.DeleteFunc(m, fn)</code></summary>
We should implement a new receiver for the table:
```go
func (t *Table[K, V]) DeleteFunc(del func(K, V) bool)
```
It would have the same functionality as `maps.DeleteFunc`.
A free function could work here, but `t` has clear authority over `del`.
Other than being consistent with the `maps` package, `t.DeleteFunc` follows the Go convention of appending `Func` to higher-order equivalents of functions.
This trumps names like `t.DeleteIf`, which lean more toward [Java](https://docs.oracle.com/javase/8/docs/api/java/util/ArrayList.html#removeIf-java.util.function.Predicate-) or [C++](https://en.cppreference.com/cpp/algorithm/remove).
The word `Delete` is also convention, tying back to the built-in `delete()`.
</details>
<details>
<summary><code>m := maps.Collect(seq)</code></summary>
We should implement a new constructor.
```go
func Collect[K comparable, V any](seq iter.Seq2[K, V]) (*Table[K, V], error)
```
It would create a `New()` table, and insert all entries in `seq`.
This constructor only supports the standard table, with comparable keys.
It is tempting to add `CollectBy` or `CollectCustom` to support all table types, but doing so would pollute the public interface.
It would be just one more line to initialize the table and then call `t.Insert` directly:
```go
t := // ...
err := t.Insert(seq)
```
</details>
<details>
<summary><code>m := map[K]V{...}</code></summary>
We should make a new constructor, because entries are generic.
So, creating an option with initialized entries doesn't work.
With the previous additions, users have a few options.
If they want to use a `New()` table, `cuckoo.Collect` matches well:
```go
t, err := cuckoo.Collect(func(yield func(K, V) bool) {
	yield(key1, val1)
	yield(key2, val2)
})
```
For `NewCustom()` or `NewBy()` tables, users can call `t.Insert` after initialization:
```go
t := // ...
err := t.Insert(func(yield func(K, V) bool) {
	yield(key1, val1)
	yield(key2, val2)
})
```
It is one more line.
But, the alternative is polluting the public interface with corresponding `*WithEntries` constructors.
</details>
<details>
<summary><code>m := make(map[K]V, hint)</code></summary>
We should add a new option:
```go
func ExpectedSize(n int) Option
```
When fed to a table, it will allocate enough space to hold `n` entries without a resize.
</details>
<details>
<summary><code>clear(m)</code></summary>
We should implement a new receiver:
```go
func (t *Table[K, V]) Clear()
```
It will remove all entries from the table.
</details>
<details>
<summary><code>m2 := maps.Clone(m)</code></summary>
We should implement a matching function:
```go
func (t *Table[K, V]) Clone() *Table[K, V]
```
Also, it will copy the hash, equality function, and options used in the table.
</details>
<details>
<summary><code>it := maps.Keys(m)</code></summary>
We should implement a matching function:
```go
func (t *Table[K, V]) Keys() iter.Seq[K]
```
It is tempting to just have `All()`, but it returns a `Seq2`, not a `Seq`.
There is no iterator adaptor between `Seq` and `Seq2`, and there will not be one for the foreseeable future.
This function, while it feels superfluous, is required.
</details>
<details>
<summary><code>it := maps.Values(m)</code></summary>
We should implement a matching function:
```go
func (t *Table[K, V]) Values() iter.Seq[V]
```
For the same reason we need `Keys()`, we also need `Values()`.
</details>

View File

@@ -68,13 +68,12 @@ func FuzzInsertLookup(f *testing.F) {
for _, step := range scenario.steps { for _, step := range scenario.steps {
if step.drop { if step.drop {
ok := actual.Drop(step.key) err := actual.Drop(step.key)
_, has := expected[step.key] assert.NoError(err)
assert.Equal(ok, has)
delete(expected, step.key) delete(expected, step.key)
_, ok = actual.Get(step.key) _, ok := actual.Get(step.key)
assert.False(ok) assert.False(ok)
} else { } else {
err := actual.Put(step.key, step.value) err := actual.Put(step.key, step.value)

View File

@@ -124,9 +124,9 @@ func TestDropExistingItem(t *testing.T) {
table := cuckoo.New[int, bool]() table := cuckoo.New[int, bool]()
(table.Put(key, value)) (table.Put(key, value))
had := table.Drop(key) err := table.Drop(key)
assert.True(had) assert.NoError(err)
assert.Equal(0, table.Size()) assert.Equal(0, table.Size())
assert.False(table.Has(key)) assert.False(table.Has(key))
} }
@@ -136,9 +136,9 @@ func TestDropNoItem(t *testing.T) {
key := 0 key := 0
table := cuckoo.New[int, bool]() table := cuckoo.New[int, bool]()
had := table.Drop(key) err := table.Drop(key)
assert.False(had) assert.NoError(err)
assert.Equal(0, table.Size()) assert.Equal(0, table.Size())
assert.False(table.Has(key)) assert.False(table.Has(key))
} }
@@ -152,9 +152,10 @@ func TestDropItemCapacity(t *testing.T) {
) )
startingCapacity := table.TotalCapacity() startingCapacity := table.TotalCapacity()
table.Drop(key) err := table.Drop(key)
endingCapacity := table.TotalCapacity() endingCapacity := table.TotalCapacity()
assert.NoError(err)
assert.Equal(0, table.Size()) assert.Equal(0, table.Size())
assert.Equal(uint64(128), startingCapacity) assert.Equal(uint64(128), startingCapacity)
assert.Equal(uint64(64), endingCapacity) assert.Equal(uint64(64), endingCapacity)
@@ -202,9 +203,9 @@ func TestDropResizeCapacity(t *testing.T) {
err1 := table.Put(0, true) err1 := table.Put(0, true)
err2 := table.Put(1, true) err2 := table.Put(1, true)
table.Drop(1) err3 := table.Drop(1)
assert.NoError(errors.Join(err1, err2)) assert.NoError(errors.Join(err1, err2, err3))
assert.Equal(uint64(20), table.TotalCapacity()) assert.Equal(uint64(20), table.TotalCapacity())
} }

3
doc.go
View File

@@ -5,8 +5,5 @@
// a table with any key type using [NewCustom]. Custom [Hash] functions and // a table with any key type using [NewCustom]. Custom [Hash] functions and
// key comparison are also supported. // key comparison are also supported.
// //
// NOTE: The [Table] is a look-up structure, and not a source of truth. If
// [ErrBadHash] occurs, the data cannot be restored.
//
// See more: https://en.wikipedia.org/wiki/Cuckoo_hashing // See more: https://en.wikipedia.org/wiki/Cuckoo_hashing
package cuckoo package cuckoo

View File

@@ -19,11 +19,6 @@ const DefaultGrowthFactor uint64 = 2
// [libcuckoo]: https://github.com/efficient/libcuckoo/blob/656714705a055df2b7a605eb3c71586d9da1e119/libcuckoo/cuckoohash_config.hh#L21 // [libcuckoo]: https://github.com/efficient/libcuckoo/blob/656714705a055df2b7a605eb3c71586d9da1e119/libcuckoo/cuckoohash_config.hh#L21
const defaultMinimumLoad float64 = 0.05 const defaultMinimumLoad float64 = 0.05
// defaultGrowthLimit is the maximum number of times a [Table] can grow in a
// single [Table.Put], before the library infers it will lead to a stack
// overflow. The value of '64' was chosen arbirarily.
const defaultGrowthLimit uint64 = 64
type settings struct { type settings struct {
growthFactor uint64 growthFactor uint64
minLoadFactor float64 minLoadFactor float64

View File

@@ -1,6 +1,5 @@
package cuckoo package cuckoo
// An entry is a key-value pair.
type entry[K, V any] struct { type entry[K, V any] struct {
key K key K
value V value V
@@ -20,11 +19,11 @@ type subtable[K, V any] struct {
// location determines where in the subtable a certain key would be placed. If // location determines where in the subtable a certain key would be placed. If
// the capacity is 0, this will panic. // the capacity is 0, this will panic.
func (t *subtable[K, V]) location(key K) uint64 { func (t subtable[K, V]) location(key K) uint64 {
return t.hash(key) % t.capacity return t.hash(key) % t.capacity
} }
func (t *subtable[K, V]) get(key K) (value V, found bool) { func (t subtable[K, V]) get(key K) (value V, found bool) {
if t.capacity == 0 { if t.capacity == 0 {
return return
} }
@@ -49,16 +48,13 @@ func (t *subtable[K, V]) drop(key K) (occupied bool) {
return false return false
} }
func (t *subtable[K, V]) resized(capacity uint64) *subtable[K, V] { func (t *subtable[K, V]) resize(capacity uint64) {
return &subtable[K, V]{ t.slots = make([]slot[K, V], capacity)
slots: make([]slot[K, V], capacity), t.capacity = capacity
capacity: capacity, t.size = 0
hash: t.hash,
compare: t.compare,
}
} }
func (t *subtable[K, V]) update(key K, value V) (updated bool) { func (t subtable[K, V]) update(key K, value V) (updated bool) {
if t.capacity == 0 { if t.capacity == 0 {
return return
} }
@@ -73,7 +69,7 @@ func (t *subtable[K, V]) update(key K, value V) (updated bool) {
return false return false
} }
func (t *subtable[K, V]) insert(insertion entry[K, V]) (evicted entry[K, V], eviction bool) { func (t *subtable[K, V]) evict(insertion entry[K, V]) (evicted entry[K, V], eviction bool) {
if t.capacity == 0 { if t.capacity == 0 {
return insertion, true return insertion, true
} }
@@ -96,8 +92,8 @@ func (t *subtable[K, V]) insert(insertion entry[K, V]) (evicted entry[K, V], evi
return insertion, true return insertion, true
} }
func newSubtable[K, V any](capacity uint64, hash Hash[K], compare EqualFunc[K]) *subtable[K, V] { func newSubtable[K, V any](capacity uint64, hash Hash[K], compare EqualFunc[K]) subtable[K, V] {
return &subtable[K, V]{ return subtable[K, V]{
hash: hash, hash: hash,
capacity: capacity, capacity: capacity,
compare: compare, compare: compare,

144
table.go
View File

@@ -9,7 +9,7 @@ import (
) )
// ErrBadHash occurs when the hashes given to a [Table] cause too many key // ErrBadHash occurs when the hashes given to a [Table] cause too many key
// collisions. Discard the old table, rebuild it from your source data, and try: // collisions. Try rebuilding the table using:
// //
// 1. Different hash seeds. Equal seeds produce equal hash functions, which // 1. Different hash seeds. Equal seeds produce equal hash functions, which
// always cycle. // always cycle.
@@ -20,7 +20,7 @@ var ErrBadHash = errors.New("bad hash")
// one with [New]. Or if you want more granularity, use [NewBy] or // one with [New]. Or if you want more granularity, use [NewBy] or
// [NewCustom]. // [NewCustom].
type Table[K, V any] struct { type Table[K, V any] struct {
tableA, tableB *subtable[K, V] tableA, tableB subtable[K, V]
growthFactor uint64 growthFactor uint64
minLoadFactor float64 minLoadFactor float64
} }
@@ -54,61 +54,30 @@ func (t *Table[K, V]) load() float64 {
return float64(t.Size()) / float64(t.TotalCapacity()) return float64(t.Size()) / float64(t.TotalCapacity())
} }
// insert attempts to put/update an entry in the table, without modifying the // resize clears all tables, changes the sizes of them to a specific capacity,
// size of the table. Returns a displaced entry and 'homeless = true' if an // and fills them back up again. It is a helper function for [Table.grow] and
// entry could not be placed after exhausting evictions. // [Table.shrink]; use them instead.
func (t *Table[K, V]) insert(entry entry[K, V]) (displaced entry[K, V], homeless bool) { func (t *Table[K, V]) resize(capacity uint64) error {
if t.tableA.update(entry.key, entry.value) { entries := make([]entry[K, V], 0, t.Size())
return
}
if t.tableB.update(entry.key, entry.value) {
return
}
for range t.maxEvictions() {
if entry, homeless = t.tableA.insert(entry); !homeless {
return
}
if entry, homeless = t.tableB.insert(entry); !homeless {
return
}
}
return entry, true
}
// resized creates an empty copy of the table, with a new capacity for each
// bucket.
func (t *Table[K, V]) resized(capacity uint64) *Table[K, V] {
return &Table[K, V]{
growthFactor: t.growthFactor,
minLoadFactor: t.minLoadFactor,
tableA: t.tableA.resized(capacity),
tableB: t.tableB.resized(capacity),
}
}
// resize creates a new [Table.resized] with 'capacity', inserts all items into
// the array, and replaces the current table. It is a helper function for
// [Table.grow] and [Table.shrink]; use them instead.
func (t *Table[K, V]) resize(capacity uint64) bool {
updated := t.resized(capacity)
for k, v := range t.Entries() { for k, v := range t.Entries() {
if _, failed := updated.insert(entry[K, V]{k, v}); failed { entries = append(entries, entry[K, V]{k, v})
return false }
t.tableA.resize(capacity)
t.tableB.resize(capacity)
for _, entry := range entries {
if err := t.Put(entry.key, entry.value); err != nil {
return err
} }
} }
*t = *updated return nil
return true
} }
// grow increases the table's capacity by the growth factor. If the // grow increases the table's capacity by the growth factor. If the
// capacity is 0, it increases it to 1. // capacity is 0, it increases it to 1.
func (t *Table[K, V]) grow() bool { func (t *Table[K, V]) grow() error {
var newCapacity uint64 var newCapacity uint64
if t.TotalCapacity() == 0 { if t.TotalCapacity() == 0 {
@@ -122,13 +91,13 @@ func (t *Table[K, V]) grow() bool {
// shrink reduces the table's capacity by the growth factor. It may // shrink reduces the table's capacity by the growth factor. It may
// reduce it down to 0. // reduce it down to 0.
func (t *Table[K, V]) shrink() bool { func (t *Table[K, V]) shrink() error {
return t.resize(t.tableA.capacity / t.growthFactor) return t.resize(t.tableA.capacity / t.growthFactor)
} }
// Get fetches the value for a key in the [Table]. Matches the comma-ok pattern // Get fetches the value for a key in the [Table]. Matches the comma-ok pattern
// of a builtin map; see [Table.Find] for plain indexing. // of a builtin map; see [Table.Find] for plain indexing.
func (t *Table[K, V]) Get(key K) (value V, ok bool) { func (t Table[K, V]) Get(key K) (value V, ok bool) {
if item, ok := t.tableA.get(key); ok { if item, ok := t.tableA.get(key); ok {
return item, true return item, true
} }
@@ -142,59 +111,60 @@ func (t *Table[K, V]) Get(key K) (value V, ok bool) {
// Find fetches the value of a key. Matches direct indexing of a builtin map; // Find fetches the value of a key. Matches direct indexing of a builtin map;
// see [Table.Get] for a comma-ok pattern. // see [Table.Get] for a comma-ok pattern.
func (t *Table[K, V]) Find(key K) (value V) { func (t Table[K, V]) Find(key K) (value V) {
value, _ = t.Get(key) value, _ = t.Get(key)
return return
} }
// Has returns true if a key has a value in the table. // Has returns true if a key has a value in the table.
func (t *Table[K, V]) Has(key K) (exists bool) { func (t Table[K, V]) Has(key K) (exists bool) {
_, exists = t.Get(key) _, exists = t.Get(key)
return return
} }
// Put sets the value for a key. If it cannot be set, an error is returned. // Put sets the value for a key. Returns error if its value cannot be set.
func (t *Table[K, V]) Put(key K, value V) (err error) { func (t *Table[K, V]) Put(key K, value V) (err error) {
var ( if t.tableA.update(key, value) {
entry = entry[K, V]{key, value} return nil
homeless bool
)
for range defaultGrowthLimit {
if entry, homeless = t.insert(entry); !homeless {
return
}
// Both this and the growth limit are necessary: this catches bad hashes
// early when the table is sparse, while the latter catches cases where
// growing never helps.
if t.load() < t.minLoadFactor {
return fmt.Errorf("hash functions produced a cycle at load %d/%d: %w", t.Size(), t.TotalCapacity(), ErrBadHash)
}
// It is theoretically possible to have a table with a larger capacity
// that is valid. But this chance is astronomically small, so we ignore
// it in this implementation.
if grew := t.grow(); !grew {
return fmt.Errorf("could not redistribute entries into larger table: %w", ErrBadHash)
}
} }
return fmt.Errorf("could not place entry after %d resizes: %w", defaultGrowthLimit, ErrBadHash) if t.tableB.update(key, value) {
} return nil
}
// Drop removes a value for a key in the table. Returns whether the key had entry, eviction := entry[K, V]{key, value}, false
// existed. for range t.maxEvictions() {
func (t *Table[K, V]) Drop(key K) bool { if entry, eviction = t.tableA.evict(entry); !eviction {
occupied := t.tableA.drop(key) || t.tableB.drop(key) return nil
}
if entry, eviction = t.tableB.evict(entry); !eviction {
return nil
}
}
if t.load() < t.minLoadFactor { if t.load() < t.minLoadFactor {
// The error is not handled here, because table-shrinking is an internal return fmt.Errorf("hash functions produced a cycle at load %d/%d: %w", t.Size(), t.TotalCapacity(), ErrBadHash)
// optimization.
t.shrink()
} }
return occupied if err := t.grow(); err != nil {
return err
}
return t.Put(entry.key, entry.value)
}
// Drop removes a value for a key in the table. Returns an error if its value
// cannot be removed.
func (t *Table[K, V]) Drop(key K) (err error) {
t.tableA.drop(key)
t.tableB.drop(key)
if t.load() < t.minLoadFactor {
return t.shrink()
}
return nil
} }
// Entries returns an unordered sequence of all key-value pairs in the table. // Entries returns an unordered sequence of all key-value pairs in the table.