11 Commits

Author SHA1 Message Date
bd25bb69bd docs: congruency target 2026-05-09 16:57:58 -04:00
5c84ed7794 docs: DeleteFunc, Collect 2026-05-04 19:16:42 -04:00
cddc205fe8 docs: insert, copy 2026-05-02 11:43:08 -04:00
a72146ca9c docs: finished congruency, started target state 2026-05-01 17:38:52 -04:00
f18d48a3c2 fix: wording 2026-04-29 20:53:07 -04:00
4464af781a feat: current contract list, started similarity 2026-04-29 20:52:15 -04:00
39548b4332 feat!: Drop returns bool, Put doesn't stack-overflow (#21)
All checks were successful
CI / Check PR Title (push) Has been skipped
CI / Go Lint (push) Successful in 58s
CI / Makefile Lint (push) Successful in 55s
CI / Markdown Lint (push) Successful in 34s
CI / Unit Tests (push) Successful in 54s
CI / Fuzz Tests (push) Successful in 1m26s
CI / Mutation Tests (push) Successful in 1m11s
## Description

Closes #11.

## Changes

### Design Decisions

## Checklist

- [ ] Tests pass
- [ ] Docs updated

Reviewed-on: #21
Co-authored-by: M.V. Hutz <git@maximhutz.me>
Co-committed-by: M.V. Hutz <git@maximhutz.me>
2026-04-17 01:31:01 +00:00
29ba6bfd4d fix!: no mixed receiver types (#23)
All checks were successful
CI / Check PR Title (push) Has been skipped
CI / Makefile Lint (push) Successful in 50s
CI / Go Lint (push) Successful in 54s
CI / Markdown Lint (push) Successful in 50s
CI / Unit Tests (push) Successful in 47s
CI / Mutation Tests (push) Successful in 1m31s
CI / Fuzz Tests (push) Successful in 1m21s
## Description

Currently, `bucket` and `Table` have mixed receiver types: some are pointer receivers, and others are value receivers.

As per the Go Wiki, [you can have value and pointer receivers, just don't mix them](https://go.dev/doc/faq#methods_on_values_or_pointers).

## Changes

- Replace all value receivers in `bucket` and `Table` with pointer receivers.

### Design Decisions

This decision was made due to the advice on the Go wiki.

## Checklist

- [x] Tests pass
- [x] Docs updated

Reviewed-on: #23
Co-authored-by: M.V. Hutz <git@maximhutz.me>
Co-committed-by: M.V. Hutz <git@maximhutz.me>
2026-04-16 03:27:48 +00:00
7cc1657403 refactor!: shorter constructors, bucketsubtable (#22)
All checks were successful
CI / Check PR Title (push) Has been skipped
CI / Makefile Lint (push) Successful in 47s
CI / Go Lint (push) Successful in 51s
CI / Markdown Lint (push) Successful in 46s
CI / Unit Tests (push) Successful in 47s
CI / Fuzz Tests (push) Successful in 1m19s
CI / Mutation Tests (push) Successful in 1m36s
## Description

Currently, the name of `bucket` is a bit confusing, because it is considered a 'table' in literature (as well as the whole hash table). A `bucket` is better described as a 'subtable', which is used by the total hash table to perform cuckoo hashing.

In addition, the constructors `NewTable`, `NewTableBy`, and `NewCustomTable` were given shorter names, because the package name `cuckoo` already implies that `New*` would create a hash table with cuckoo hashing. This package has one use-case, and so it is unambiguous what the constructors produce.

## Changes

- `NewTable` -> `New`
- `NewTableBy` -> `NewBy`
- `NewCustomTable` -> `NewCustom`
- `bucket` -> `subtable`

### Design Decisions

- I would have renamed `Table` and `subtable` to map equivalents, but 'submap' implies that a certain subsection of the map is contained within it, which isn't quite right.
- I chose not to go with `Map` and `table`, because of the split naming convention.

## Checklist

- [x] Tests pass
- [x] Docs updated

Reviewed-on: #22
Co-authored-by: M.V. Hutz <git@maximhutz.me>
Co-committed-by: M.V. Hutz <git@maximhutz.me>
2026-04-16 03:15:39 +00:00
42c5b5f8f4 feat!: update get from (V, error) to (V, bool) (#20)
All checks were successful
CI / Check PR Title (push) Has been skipped
CI / Go Lint (push) Successful in 43s
CI / Makefile Lint (push) Successful in 41s
CI / Markdown Lint (push) Successful in 32s
CI / Unit Tests (push) Successful in 39s
CI / Fuzz Tests (push) Successful in 1m44s
CI / Mutation Tests (push) Successful in 1m28s
## Description

Currently, the signature for `Table.Get` is `func (K) (V, error)`. This is not idiomatic Go, which prefers returning a boolean instead of an error for lookups. For instance, a built-in Go map is used like so:

```go
if value, ok := users[id]; !ok {
  // ...
}
```

Updating our table to look like that is best practice. In that same vein, to support direct lookup (i.e. `v := users[id]`), this PR also adds `Table.Find`.

## Changes

- BREAKING CHANGE: Update contract of `Table.Get` to `func (K) (V, bool)`. Returns 'false' if the item cannot be found, and 'true' if it is found.
- Add `Table.Find`.
- Updated tests and documentation to match the change.

### Design Decisions

- Chose to make this decision because throwing an error implies that there is something 'wrong' with the table. There is nothing wrong with the table; it is just that the item does not exist.

## Checklist

- [x] Tests pass
- [x] Docs updated

Reviewed-on: #20
Co-authored-by: M.V. Hutz <git@maximhutz.me>
Co-committed-by: M.V. Hutz <git@maximhutz.me>
2026-04-14 01:58:15 +00:00
867a1d49df feat: sentinel error ErrBadHash (#19)
All checks were successful
CI / Check PR Title (push) Has been skipped
CI / Makefile Lint (push) Successful in 1m4s
CI / Markdown Lint (push) Successful in 32s
CI / Go Lint (push) Successful in 1m15s
CI / Unit Tests (push) Successful in 38s
CI / Fuzz Tests (push) Successful in 1m34s
CI / Mutation Tests (push) Successful in 1m31s
## Description

Currently, the errors are not sentinel, and so are hard to test for. This PR makes sure hash collision errors are accounted for.

## Changes

- Add `ErrBadHash`. Happens when there are too many collisions for an item to be added.

### Design Decisions

- Chose to name `ErrBadHash` over `ErrCycle` because the feedback that the user should be given is to evaluate their hash functions. Cycle collision is a bit esoteric.
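
A sentinel error is testable because callers can match it with `errors.Is`, even when it is wrapped. The sketch below illustrates that pattern only: `errBadHash`, `put`, and `isBadHash` are hypothetical stand-ins, and whether the real `Table.Put` wraps `ErrBadHash` is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// errBadHash is a hypothetical stand-in for cuckoo.ErrBadHash.
var errBadHash = errors.New("bad hash")

// put is a hypothetical stand-in for Table.Put. It always fails and
// wraps the sentinel with extra context (an assumption, not the real code).
func put(key int) error {
	return fmt.Errorf("put key %d: %w", key, errBadHash)
}

// isBadHash reports whether err is, or wraps, the sentinel.
func isBadHash(err error) bool {
	return errors.Is(err, errBadHash)
}

func main() {
	if err := put(7); isBadHash(err) {
		fmt.Println("too many collisions; evaluate the hash functions")
	}
}
```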

## Checklist

- [x] Tests pass
- [x] Docs updated

Reviewed-on: #19
Co-authored-by: M.V. Hutz <git@maximhutz.me>
Co-committed-by: M.V. Hutz <git@maximhutz.me>
2026-04-14 00:38:11 +00:00
13 changed files with 827 additions and 225 deletions


@@ -114,6 +114,9 @@ linters:
 # Reports uses of functions with replacement inside the testing package.
 - usetesting
+# Reports mixed receiver types in structs/interfaces.
+- recvcheck
 settings:
 revive:
 rules:
@@ -198,7 +201,7 @@ linters:
 # warns when initialism, variable or package naming conventions are not followed.
 - name: var-naming
 misspell:
 # Correct spellings using locale preferences for US or UK.
 # Setting locale to US will correct the British spelling of 'colour' to 'color'.

542
adr/001_interface_design.md Normal file

@@ -0,0 +1,542 @@
# Designing an Idiomatic API Interface
We (the maintainers) built `go-cuckoo`'s API interface without design intent.
Up until now, we paid more attention to implementing the underlying functionality of cuckoo hashing.
With the fundamentals of the algorithm built, we should revisit the interface.
It should align closer to the following principles:
- **Congruency**
A `go-cuckoo` table should have the same core functionality as Go's built-in map.
- **Familiarity**
A `go-cuckoo` table should behave similarly to Go's standard map, so users will intuitively know how to use it.
In effect, its users will carry less cognitive load.
## Current State
### Interface of the built-in Map
Listed below is every interface provided by Go to the built-in map object.
Also included are the functions from the package `maps` in the standard library.
<details>
<summary>Interfaces</summary>
| # | built-in Interface | Description |
| --- | ---------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- |
| 1 | `m := make(map[K]V)` | Returns an empty map using the built-in `make()` function. |
| 2 | `m := make(map[K]V, hint)` | Returns an empty map using `make()`, with a capacity 'hint'. This hint is how many items the map expects to hold, _not_ a measure of how large it is. |
| 3 | `m := map[K]V{...}` | Returns a map, which may be filled with entries in the ellipsis (optional). |
| 4 | `var m map[K]V` | Defines an empty _variable_ that holds a map. This differs from #1 because `m` is uninitialized (nil) here. |
| 5 | `m[k] = v` | Assigns value `v` to key `k`. |
| 6 | `v := m[k]` | Returns the value of `k` if it exists. Otherwise, `v` is the zero value of `V`. |
| 7 | `v, ok := m[k]` | Like #6, except `ok` reports whether `k` is present. This is comma-ok notation. |
| 8 | `for k, v := range m` | Iterates over every key-value pair in `m`. The order is random. |
| 9 | `delete(m, k)` | Unassigns the value `k`. Returns no value. |
| 10 | `clear(m)` | Unassigns all keys in `m`. Returns no value. |
| 11 | `n := len(m)` | Returns the number of entries in `m`. If `m` is nil, returns 0. |
| 12 | `m2 := maps.Clone(m)` | Returns a copy of `m`. |
| 13 | `maps.Copy(dst, src)` | Assigns every entry of `src` in `dst`. |
| 14 | `ok := maps.Equal(m1, m2)` | Returns true iff `m1` and `m2` contain the same entries. |
| 15 | `ok := maps.EqualFunc(m1, m2, fn)` | Like #14, but with a custom comparator for non-comparable values. |
| 16 | `maps.DeleteFunc(m, fn)` | Removes every entry in `m` which satisfies `fn`. Returns no value. |
| 17 | `it2 := maps.All(m)` | Returns a 2D iterator over every key-value pair. |
| 18 | `it := maps.Keys(m)` | Returns an iterator over every key. |
| 19 | `it := maps.Values(m)` | Returns an iterator over every value. There can be duplicates. |
| 20 | `m := maps.Collect(seq)` | Returns a map, with every entry defined in a 2D iterator over key-value pairs. |
| 21 | `maps.Insert(m, seq)` | Assigns to `m` all key-value pairs in 2D iterator `seq`. Returns no value. |
</details>
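
For reference during the comparison, here is a minimal sketch of the built-in behaviors from rows 4, 6, 7, 9, and 11. It uses only the language and `fmt`; the helper name `lookup` is ours and not part of any API.

```go
package main

import "fmt"

// lookup returns the value stored under k and whether k was present,
// using the comma-ok form (row 7).
func lookup(m map[string]int, k string) (int, bool) {
	v, ok := m[k]
	return v, ok
}

func main() {
	var unset map[string]int // row 4: a nil map; reads are safe, writes panic.
	fmt.Println(len(unset))  // 0 (row 11)
	fmt.Println(unset["a"])  // 0, the zero value of int (row 6)

	m := map[string]int{"a": 1}
	v, ok := lookup(m, "missing")
	fmt.Println(v, ok) // 0 false

	delete(m, "missing") // row 9: deleting an absent key is a no-op.
	fmt.Println(len(m))  // 1
}
```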
### Interface of `go-cuckoo`
On the other hand, here is the current contract for `go-cuckoo`.
<details>
<summary>Interfaces</summary>
| # | `go-cuckoo` Interface | Description |
| --- | -------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------- |
| 1 | `m := New(opts...)` | Creates a table using the default hash and equal function. The options configure its behavior. Confined to comparable keys. |
| 2 | `m := NewBy(keyFunc, opts...)` | Like #1, but allows any key type. A `keyFunc` is used to derive a comparable key. |
| 3 | `m := NewCustom(hashA, hashB, equalFunc, opts...)` | Like #1, but allows control over the hashes used to allow any key type. An `equalFunc` determines key equality. |
| 4 | `seq := m.Entries()` | Returns an unordered 2D iterator of all key-value pairs in the table. |
| 5 | `v := m.Find(k)` | Returns the value for `k`, like `v := m[k]` on a built-in map. |
| 6 | `v, ok := m.Get(k)` | Returns the value for `k` in the table. Also returns true if `k` exists, otherwise false. When false, `v` is undefined. |
| 7 | `ok := m.Has(k)` | Returns true if `k` is in the table. |
| 8 | `err := m.Put(k, v)` | Sets value `v` for key `k`. Returns an error if the entry cannot be added. |
| 9 | `n := m.Size()` | Returns the number of items in `m`. |
| 10 | `str := m.String()` | Returns `m` as a string in the format "table[k1:v1 k2:v2 ...]". |
| 11 | `cap := m.TotalCapacity()` | Returns how many slots `m` has allocated. |
| 12 | `ok := m.Drop(k)` | Removes `k` from the table. Returns whether the key had existed. |
</details>
### Determining Congruency
So, how does the core functionality compare?
Listed below is an analysis of every interface in Go's standard map.
Each is compared against what `go-cuckoo` offers, and categorized into the following groups:
- ✅ Covered: an analog exists.
- ⚠️ Partial: workaround available.
- ❌ Gap: no analog yet; addressed in [Target State](#solving-congruency).
Specifically, here we are checking for functionality.
Is there functionality that this offers which `go-cuckoo` does not?
We are checking accessibility, but not discoverability.
The latter will be considered later.
<details>
<summary>✅ <code>m := make(map[K]V)</code></summary>
The analog is `m := New()`.
</details>
<details>
<summary>⚠️ <code>m := make(map[K]V, hint)</code></summary>
This has no simple analog.
It is close to `m := New(Capacity(hint))`, but it assigns starting capacity, not expected size.
For the built-in map, these are two separate things.
- Capacity is an internal measure, used to optimize space/speed.
It is hidden from the user because it depends on the underlying implementation, which may change.
- Expected size requires that the map hold a given number of items before resizing.
This is tangible and agnostic to implementation, which is why it is exposed to the user.
In short, this interface defines expected size, but `Capacity()` defines capacity.
</details>
<details>
<summary>❌ <code>m := map[K]V{...}</code></summary>
This has no simple analog, the closest being:
```go
m := New[K, V]()
for k, v := range startingEntries {
	m.Put(k, v)
}
```
It is idiomatic, but far less ergonomic.
</details>
<details>
<summary>✅ <code>var m map[K]V</code></summary>
The analog is `var m Table[K, V]`.
</details>
<details>
<summary>✅ <code>m[k] = v</code></summary>
The analog is `err := m.Put(k, v)`.
</details>
<details>
<summary>✅ <code>v := m[k]</code></summary>
The analog is `v := m.Find(k)`.
</details>
<details>
<summary>✅ <code>v, ok := m[k]</code></summary>
The analog is `v, ok := m.Get(k)`.
</details>
<details>
<summary>✅ <code>for k, v := range m</code></summary>
The analog is `for k, v := range m.Entries()`.
</details>
<details>
<summary>✅ <code>delete(m, k)</code></summary>
The analog is `ok := m.Drop(k)`.
</details>
<details>
<summary>❌ <code>clear(m)</code></summary>
There is no analog.
The easiest way to do this is to delete all items individually:
```go
for k := range m.Entries() {
	m.Drop(k)
}
```
</details>
<details>
<summary>✅ <code>n := len(m)</code></summary>
The analog is `n := m.Size()`.
</details>
<details>
<summary>❌ <code>m2 := maps.Clone(m)</code></summary>
There is no analog.
The easiest way to do this currently is to make a new map, and manually add the items.
```go
m2 := cuckoo.New[K, V]()
for k, v := range m.Entries() {
	m2.Put(k, v)
}
```
This gets complicated by the various options available to the user.
Furthermore, any custom `EqualFunc`, `keyFunc` or `Hash` is not transferred.
</details>
<details>
<summary>❌ <code>maps.Copy(dst, src)</code></summary>
There is no analog.
The simplest way to do this is with a for-loop.
```go
for k, v := range src.Entries() {
	dst.Put(k, v)
}
```
</details>
<details>
<summary>❌ <code>ok := maps.Equal(m1, m2)</code></summary>
There is no analog.
Users have to manually check the key-value pairs to determine equality.
</details>
<details>
<summary>❌ <code>ok := maps.EqualFunc(m1, m2, fn)</code></summary>
There is no analog.
Users have to manually check the key-value pairs to determine equality.
</details>
<details>
<summary>❌ <code>maps.DeleteFunc(m, fn)</code></summary>
There is no analog.
Users have to manually delete keys.
</details>
<details>
<summary>✅ <code>it2 := maps.All(m)</code></summary>
The analog is `it2 := m.Entries()`.
</details>
<details>
<summary>⚠️ <code>it := maps.Keys(m)</code></summary>
There is no simple analog.
A close neighbor is `it2 := m.Entries()`.
Users can use this in a for-loop, and pick out just the keys:
```go
for k := range m.Entries() {
// ...
}
```
</details>
<details>
<summary>⚠️ <code>it := maps.Values(m)</code></summary>
There is no simple analog.
A close neighbor is `it2 := m.Entries()`.
Users can use this in a for-loop, and pick out just the values:
```go
for _, v := range m.Entries() {
// ...
}
```
</details>
<details>
<summary>❌ <code>m := maps.Collect(seq)</code></summary>
There is no analog.
</details>
<details>
<summary>❌ <code>maps.Insert(m, seq)</code></summary>
There is no analog.
</details>
## Target State
### Solving Congruency
We should make the following changes to accommodate congruency:
<details>
<summary><code>ok := maps.EqualFunc(m1, m2, fn)</code></summary>
We should implement a new function:
```go
func EqualFunc[K, V1, V2 any](t1 *Table[K, V1], t2 *Table[K, V2], eq func(V1, V2) bool) bool
```
This function is free, and not bound as a receiver function.
(It is called as `cuckoo.EqualFunc(t1, t2, eq)`, not `t1.EqualFunc(t2, eq)`.)
The latter implies `t1` has authority, when in fact neither table does.
We define equality as:
1. Neither table has a key the other doesn't.
2. Each key has the same value in each table.
Parameter `eq` determines this equality.
Custom `EqualFunc`'s complicate this, as they modulate key identity in tables.
If the two tables disagree on whether two keys are distinct, this function may give wrong results.
So, we must assume that:
- Both tables have `EqualFunc`'s which 'agree' on the identity of the keys present in the tables.
Agreement is defined as: if two keys are distinct in one table, they are distinct in the other.
The name `EqualFunc` is already taken by `EqualFunc[K]`: an alias for `func(a, b K) bool`.
Inlining the alias (writing `func(a, b K) bool` at each use) would free the name and solve this problem.
We will move the documentation attached to it to `DefaultEqualFunc`.
</details>
<details>
<summary><code>ok := maps.Equal(m1, m2)</code></summary>
We should implement a new function, to conform with the standard library:
```go
func Equal[K any, V comparable](t1, t2 *Table[K, V]) bool
```
It uses the same equality check as in `EqualFunc`.
Once again, the function is free because it is symmetric.
</details>
<details>
<summary><code>maps.Insert(m, seq)</code></summary>
We should implement a new receiver for the table:
```go
func (t *Table[K, V]) Insert(seq iter.Seq2[K, V]) error
```
A receiver fits better even though `maps.Insert` is a free function, because inserting is asymmetric.
The table receives entries from `seq`.
It's only free because Go's standard map is built into the language, and so cannot have receivers.
In terms of naming, `t.Extend` is more accurate, and has precedent in [Python](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists) and [Rust](https://doc.rust-lang.org/std/iter/trait.Extend.html).
When [adding iterator functions](https://github.com/golang/go/issues/61900) to the `maps` package, the Go team chose to frame it as 'sources' and 'sinks'.
With this model, `maps.Insert` made more sense than `maps.Extend`.
Ultimately, `t.Insert()` is a better choice to be consistent with `maps`.
</details>
<details>
<summary><code>maps.Copy(dst, src)</code></summary>
We should implement a new receiver for the table:
```go
func (t *Table[K, V]) Copy(src *Table[K, V]) error
```
Its functionality should match that of `t.Insert()`.
A receiver fits better even though `maps.Copy` is a free function, because copying is asymmetric: entries from `src` are written into `dst`.
It is only free because Go's standard map is built into the language, and so cannot have receivers.
The name `t.Merge()` might be more accurate, but it does not work because:
- `t.Copy()` matches Go's built-in `copy()`, and `io.Copy()`. The Go team used [the same logic](https://github.com/golang/go/discussions/47330#discussioncomment-1167799) to name `maps.Copy()`.
In this case, `t.Merge()` would be an outlier.
- `t.Merge()` implies some sort of conflict-resolution, when there is not.
It simply overwrites the values.
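
The overwrite behavior that `t.Copy()` would mirror can be seen in the standard library's `maps.Copy`. The sketch below uses only `maps` and `fmt`; the helper name `merged` is ours, chosen for illustration.

```go
package main

import (
	"fmt"
	"maps"
)

// merged returns a new map holding dst's entries with src copied on top.
// Neither input is modified. Keys present in both take src's value.
func merged(dst, src map[string]int) map[string]int {
	out := maps.Clone(dst)
	if out == nil {
		// maps.Clone preserves nil, and writing into a nil map panics.
		out = map[string]int{}
	}
	maps.Copy(out, src)
	return out
}

func main() {
	dst := map[string]int{"a": 1, "b": 2}
	src := map[string]int{"b": 20, "c": 30}
	fmt.Println(merged(dst, src)) // map[a:1 b:20 c:30]
}
```

No conflict resolution happens for `"b"`: the value from `src` simply replaces the one in `dst`.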
</details>
<details>
<summary><code>maps.DeleteFunc(m, fn)</code></summary>
We should implement a new receiver for the table:
```go
func (t *Table[K, V]) DeleteFunc(del func(K, V) bool)
```
It would have the same functionality as `maps.DeleteFunc`.
A free function could work here, but `t` has clear authority over `del`.
Other than being consistent with the `maps` package, `t.DeleteFunc` follows the Go convention of appending `Func` to higher-order equivalents of functions.
This trumps names like `t.DeleteIf`, which lean more toward [Java](https://docs.oracle.com/javase/8/docs/api/java/util/ArrayList.html#removeIf-java.util.function.Predicate-) or [C++](https://en.cppreference.com/cpp/algorithm/remove).
The word `Delete` is also convention, tying back to the built-in `delete()`.
</details>
<details>
<summary><code>m := maps.Collect(seq)</code></summary>
We should implement a new constructor.
```go
func Collect[K comparable, V any](seq iter.Seq2[K, V]) (*Table[K, V], error)
```
It would create a `New()` table, and insert all entries in `seq`.
This constructor only supports the standard table constructor, with comparable keys.
It is tempting to add `CollectBy` or `CollectCustom` to support all table types, but doing so would pollute the public interface.
It would be just one more line to initialize the table and then call `t.Insert` directly:
```go
t := // ...
err := t.Insert(seq)
```
</details>
<details>
<summary><code>m := map[K]V{...}</code></summary>
This needs a constructor rather than an option, because entries are generic.
So, creating an option with initialized entries doesn't work.
With the previous additions, users have a few options.
If they want to use a `New()` table, `cuckoo.Collect` matches well:
```go
t, err := cuckoo.Collect(func(yield func(K, V) bool) {
	yield(key1, val1)
	yield(key2, val2)
})
```
For `NewCustom()` or `NewBy()` tables, users can call `t.Insert` after initialization:
```go
t := // ...
err := t.Insert(func(yield func(K, V) bool) {
	yield(key1, val1)
	yield(key2, val2)
})
```
It is one more line.
But, the alternative is polluting the public interface with corresponding `*WithEntries` constructors.
</details>
<details>
<summary><code>m := make(map[K]V, hint)</code></summary>
We should add a new option:
```go
func ExpectedSize(n int) Option
```
When fed to a table, it will allocate enough space to hold `n` entries without a resize.
</details>
<details>
<summary><code>clear(m)</code></summary>
We should implement a new receiver:
```go
func (t *Table[K, V]) Clear()
```
It will remove all entries from the table.
</details>
<details>
<summary><code>m2 := maps.Clone(m)</code></summary>
We should implement a matching function:
```go
func (t *Table[K, V]) Clone() *Table[K, V]
```
Also, it will copy the hash, equality function, and options used in the table.
</details>
<details>
<summary><code>it := maps.Keys(m)</code></summary>
We should implement a matching function:
```go
func (t *Table[K, V]) Keys() iter.Seq[K]
```
It is tempting to just have `All()`, but it returns a `Seq2`, not a `Seq`.
There is no iterator adaptor between `Seq` and `Seq2` in the standard library, and there will not be one for the foreseeable future.
This function, while it feels superfluous, is required.
</details>
<details>
<summary><code>it := maps.Values(m)</code></summary>
We should implement a matching function:
```go
func (t *Table[K, V]) Values() iter.Seq[V]
```
For the same reason we need `Keys()`, we also need `Values()`.
</details>

103
bucket.go

@@ -1,103 +0,0 @@
package cuckoo
type entry[K, V any] struct {
key K
value V
}
type slot[K, V any] struct {
entry[K, V]
occupied bool
}
type bucket[K, V any] struct {
hash Hash[K]
slots []slot[K, V]
capacity, size uint64
compare EqualFunc[K]
}
// location determines where in the bucket a certain key would be placed. If the
// capacity is 0, this will panic.
func (b bucket[K, V]) location(key K) uint64 {
return b.hash(key) % b.capacity
}
func (b bucket[K, V]) get(key K) (value V, found bool) {
if b.capacity == 0 {
return
}
slot := b.slots[b.location(key)]
return slot.value, slot.occupied && b.compare(slot.key, key)
}
func (b *bucket[K, V]) drop(key K) (occupied bool) {
if b.capacity == 0 {
return
}
slot := &b.slots[b.location(key)]
if slot.occupied && b.compare(slot.key, key) {
slot.occupied = false
b.size--
return true
}
return false
}
func (b *bucket[K, V]) resize(capacity uint64) {
b.slots = make([]slot[K, V], capacity)
b.capacity = capacity
b.size = 0
}
func (b bucket[K, V]) update(key K, value V) (updated bool) {
if b.capacity == 0 {
return
}
slot := &b.slots[b.location(key)]
if slot.occupied && b.compare(slot.key, key) {
slot.value = value
return true
}
return false
}
func (b *bucket[K, V]) evict(insertion entry[K, V]) (evicted entry[K, V], eviction bool) {
if b.capacity == 0 {
return insertion, true
}
slot := &b.slots[b.location(insertion.key)]
if !slot.occupied {
slot.entry = insertion
slot.occupied = true
b.size++
return
}
if b.compare(slot.key, insertion.key) {
slot.value = insertion.value
return
}
insertion, slot.entry = slot.entry, insertion
return insertion, true
}
func newBucket[K, V any](capacity uint64, hash Hash[K], compare EqualFunc[K]) bucket[K, V] {
return bucket[K, V]{
hash: hash,
capacity: capacity,
compare: compare,
size: 0,
slots: make([]slot[K, V], capacity),
}
}


@@ -2,7 +2,7 @@ package cuckoo
 // An EqualFunc determines whethers two keys are 'equal'. Keys that are 'equal'
 // are teated as the same by the [Table]. A good EqualFunc is pure,
-// deterministic, and fast. By default, [NewTable] uses [DefaultEqualFunc].
+// deterministic, and fast. By default, [New] uses [DefaultEqualFunc].
 //
 // This function MUST NOT return true if the [Hash] digest of two keys
 // are different: the [Table] will not work.


@@ -28,7 +28,7 @@ func ExampleEqualFunc_badEqualFunc() {
 // Two users with the same ID are equal.
 isEqual := func(a, b User) bool { return a.ID == b.ID }
-userbase := cuckoo.NewCustomTable[User, bool](makeHash(1), makeHash(2), isEqual)
+userbase := cuckoo.NewCustom[User, bool](makeHash(1), makeHash(2), isEqual)
 (userbase.Put(User{"1", "Robert Doe"}, true))


@@ -56,7 +56,7 @@ func FuzzInsertLookup(f *testing.F) {
 fmt.Fprintf(os.Stderr, "seedA=%d seedB=%d capacity=%d growthFactor=%d\n",
 seedA, seedB, capacity, growthFactor)
-actual := cuckoo.NewCustomTable[uint32, uint32](
+actual := cuckoo.NewCustom[uint32, uint32](
 offsetHash(seedA),
 offsetHash(seedB),
 func(a, b uint32) bool { return a == b },
@@ -68,12 +68,13 @@ func FuzzInsertLookup(f *testing.F) {
 for _, step := range scenario.steps {
 if step.drop {
-err := actual.Drop(step.key)
-assert.NoError(err)
+ok := actual.Drop(step.key)
+_, has := expected[step.key]
+assert.Equal(ok, has)
 delete(expected, step.key)
-_, ok := actual.Get(step.key)
+_, ok = actual.Get(step.key)
 assert.False(ok)
 } else {
 err := actual.Put(step.key, step.value)


@@ -11,7 +11,7 @@ func TestMaxEvictions(t *testing.T) {
 assert := assert.New(t)
 for i := 16; i < 116; i++ {
-table := NewTable[int, bool](Capacity(i / 2))
+table := New[int, bool](Capacity(i / 2))
 expectedEvictions := 3 * math.Floor(math.Log2(float64(i)))
 assert.Equal(table.maxEvictions(), int(expectedEvictions))
@@ -20,7 +20,7 @@ func TestMaxEvictions(t *testing.T) {
 func TestLoad(t *testing.T) {
 assert := assert.New(t)
-table := NewTable[int, bool](Capacity(8))
+table := New[int, bool](Capacity(8))
 for i := range 16 {
 err := table.Put(i, true)


@@ -14,7 +14,7 @@ import (
 func TestNewTable(t *testing.T) {
 assert := assert.New(t)
-table := cuckoo.NewTable[int, bool]()
+table := cuckoo.New[int, bool]()
 assert.NotNil(table)
 assert.Zero(table.Size())
@@ -23,7 +23,7 @@ func TestNewTable(t *testing.T) {
 func TestAddItem(t *testing.T) {
 assert := assert.New(t)
 key, value := 0, true
-table := cuckoo.NewTable[int, bool]()
+table := cuckoo.New[int, bool]()
 err := table.Put(key, value)
@@ -35,7 +35,7 @@ func TestAddItem(t *testing.T) {
 func TestPutOverwrite(t *testing.T) {
 assert := assert.New(t)
 key, value, newValue := 0, 1, 2
-table := cuckoo.NewTable[int, int]()
+table := cuckoo.New[int, int]()
 (table.Put(key, value))
 err := table.Put(key, newValue)
@@ -50,7 +50,7 @@ func TestPutOverwrite(t *testing.T) {
 func TestSameHash(t *testing.T) {
 assert := assert.New(t)
 hash := func(int) uint64 { return 0 }
-table := cuckoo.NewCustomTable[int, bool](hash, hash, cuckoo.DefaultEqualFunc[int])
+table := cuckoo.NewCustom[int, bool](hash, hash, cuckoo.DefaultEqualFunc[int])
 errA := table.Put(0, true)
 errB := table.Put(1, true)
@@ -63,14 +63,14 @@ func TestSameHash(t *testing.T) {
 func TestStartingCapacity(t *testing.T) {
 assert := assert.New(t)
-table := cuckoo.NewTable[int, bool](cuckoo.Capacity(64))
+table := cuckoo.New[int, bool](cuckoo.Capacity(64))
 assert.Equal(uint64(128), table.TotalCapacity())
 }
 func TestResizeCapacity(t *testing.T) {
 assert := assert.New(t)
-table := cuckoo.NewTable[int, bool](
+table := cuckoo.New[int, bool](
 cuckoo.Capacity(8),
 cuckoo.GrowthFactor(2),
 )
@@ -85,7 +85,7 @@ func TestResizeCapacity(t *testing.T) {
 func TestPutMany(t *testing.T) {
 assert := assert.New(t)
-expected, actual := map[int]bool{}, cuckoo.NewTable[int, bool]()
+expected, actual := map[int]bool{}, cuckoo.New[int, bool]()
 for i := range 1_000 {
 expected[i] = true
@@ -100,7 +100,7 @@ func TestPutMany(t *testing.T) {
 func TestGetMany(t *testing.T) {
 assert := assert.New(t)
-table := cuckoo.NewTable[int, bool]()
+table := cuckoo.New[int, bool]()
 for i := range 1_000 {
 err := table.Put(i, true)
@@ -121,12 +121,12 @@ func TestGetMany(t *testing.T) {
 func TestDropExistingItem(t *testing.T) {
 assert := assert.New(t)
 key, value := 0, true
-table := cuckoo.NewTable[int, bool]()
+table := cuckoo.New[int, bool]()
 (table.Put(key, value))
-err := table.Drop(key)
-assert.NoError(err)
+had := table.Drop(key)
+assert.True(had)
 assert.Equal(0, table.Size())
 assert.False(table.Has(key))
 }
@@ -134,11 +134,11 @@ func TestDropExistingItem(t *testing.T) {
 func TestDropNoItem(t *testing.T) {
 assert := assert.New(t)
 key := 0
-table := cuckoo.NewTable[int, bool]()
+table := cuckoo.New[int, bool]()
-err := table.Drop(key)
-assert.NoError(err)
+had := table.Drop(key)
+assert.False(had)
 assert.Equal(0, table.Size())
 assert.False(table.Has(key))
 }
@@ -146,16 +146,15 @@ func TestDropNoItem(t *testing.T) {
 func TestDropItemCapacity(t *testing.T) {
 assert := assert.New(t)
 key := 0
-table := cuckoo.NewTable[int, bool](
+table := cuckoo.New[int, bool](
 cuckoo.Capacity(64),
 cuckoo.GrowthFactor(2),
 )
 startingCapacity := table.TotalCapacity()
-err := table.Drop(key)
+table.Drop(key)
 endingCapacity := table.TotalCapacity()
-assert.NoError(err)
 assert.Equal(0, table.Size())
 assert.Equal(uint64(128), startingCapacity)
 assert.Equal(uint64(64), endingCapacity)
@@ -164,7 +163,7 @@ func TestDropItemCapacity(t *testing.T) {
 func TestPutNoCapacity(t *testing.T) {
 assert := assert.New(t)
 key, value := 0, true
-table := cuckoo.NewTable[int, bool](
+table := cuckoo.New[int, bool](
 cuckoo.Capacity(0),
 )
@@ -177,7 +176,7 @@ func TestPutNoCapacity(t *testing.T) {
 func TestBadHashCapacity(t *testing.T) {
 assert := assert.New(t)
-table := cuckoo.NewCustomTable[int, bool](
+table := cuckoo.NewCustom[int, bool](
 func(int) uint64 { return 0 },
 func(int) uint64 { return 0 },
 func(a, b int) bool { return a == b },
@@ -197,15 +196,15 @@ func TestBadHashCapacity(t *testing.T) {
 func TestDropResizeCapacity(t *testing.T) {
 assert := assert.New(t)
table := cuckoo.NewTable[int, bool]( table := cuckoo.New[int, bool](
cuckoo.Capacity(10), cuckoo.Capacity(10),
) )
err1 := table.Put(0, true) err1 := table.Put(0, true)
err2 := table.Put(1, true) err2 := table.Put(1, true)
err3 := table.Drop(1) table.Drop(1)
assert.NoError(errors.Join(err1, err2, err3)) assert.NoError(errors.Join(err1, err2))
assert.Equal(uint64(20), table.TotalCapacity()) assert.Equal(uint64(20), table.TotalCapacity())
} }
@@ -217,9 +216,7 @@ func TestNewTableBy(t *testing.T) {
} }
assert := assert.New(t) assert := assert.New(t)
table := cuckoo.NewTableBy[User, bool]( table := cuckoo.NewBy[User, bool](func(u User) string { return u.id })
func(u User) string { return u.id },
)
err := table.Put(User{nil, "1", "Robert"}, true) err := table.Put(User{nil, "1", "Robert"}, true)

7
doc.go

@@ -1,9 +1,12 @@
// Package cuckoo provides a hash table that uses cuckoo hashing to achieve // Package cuckoo provides a hash table that uses cuckoo hashing to achieve
// a worst-case O(1) lookup time. // a worst-case O(1) lookup time.
// //
// While a [NewTable] only supports comparable keys by default, you can create // While a [New] only supports comparable keys by default, you can create
// a table with any key type using [NewCustomTable]. Custom [Hash] functions and // a table with any key type using [NewCustom]. Custom [Hash] functions and
// key comparison are also supported. // key comparison are also supported.
// //
// NOTE: The [Table] is a look-up structure, and not a source of truth. If
// [ErrBadHash] occurs, the data cannot be restored.
//
// See more: https://en.wikipedia.org/wiki/Cuckoo_hashing // See more: https://en.wikipedia.org/wiki/Cuckoo_hashing
package cuckoo package cuckoo


@@ -8,7 +8,7 @@ import (
) )
func Example_basic() { func Example_basic() {
table := cuckoo.NewTable[int, string]() table := cuckoo.New[int, string]()
if err := table.Put(1, "Hello, World!"); err != nil { if err := table.Put(1, "Hello, World!"); err != nil {
fmt.Println("Put error:", err) fmt.Println("Put error:", err)


@@ -9,7 +9,7 @@ import "fmt"
const DefaultCapacity uint64 = 16 const DefaultCapacity uint64 = 16
// DefaultGrowthFactor is the standard resize multiplier for a [Table]. Most // DefaultGrowthFactor is the standard resize multiplier for a [Table]. Most
// hash table implementations use 2. // implementations use 2.
const DefaultGrowthFactor uint64 = 2 const DefaultGrowthFactor uint64 = 2
// defaultMinimumLoad is the default lowest acceptable occupancy of a [Table]. // defaultMinimumLoad is the default lowest acceptable occupancy of a [Table].
@@ -19,6 +19,11 @@ const DefaultGrowthFactor uint64 = 2
// [libcuckoo]: https://github.com/efficient/libcuckoo/blob/656714705a055df2b7a605eb3c71586d9da1e119/libcuckoo/cuckoohash_config.hh#L21 // [libcuckoo]: https://github.com/efficient/libcuckoo/blob/656714705a055df2b7a605eb3c71586d9da1e119/libcuckoo/cuckoohash_config.hh#L21
const defaultMinimumLoad float64 = 0.05 const defaultMinimumLoad float64 = 0.05
// defaultGrowthLimit is the maximum number of times a [Table] can grow in a
// single [Table.Put], before the library infers it will lead to a stack
// overflow. The value of '64' was chosen arbitrarily.
const defaultGrowthLimit uint64 = 64
type settings struct { type settings struct {
growthFactor uint64 growthFactor uint64
minLoadFactor float64 minLoadFactor float64
@@ -26,10 +31,10 @@ type settings struct {
} }
// An Option modifies the settings of a [Table]. It is used in its constructors // An Option modifies the settings of a [Table]. It is used in its constructors
// like [NewTable], for example. // like [New], for example.
type Option func(*settings) type Option func(*settings)
// Capacity modifies the starting capacity of each bucket of the [Table]. The // Capacity modifies the starting capacity of each subtable of the [Table]. The
// value must be non-negative. // value must be non-negative.
func Capacity(value int) Option { func Capacity(value int) Option {
if value < 0 { if value < 0 {

107
subtable.go Normal file

@@ -0,0 +1,107 @@
package cuckoo
// An entry is a key-value pair.
type entry[K, V any] struct {
key K
value V
}
type slot[K, V any] struct {
entry[K, V]
occupied bool
}
type subtable[K, V any] struct {
hash Hash[K]
slots []slot[K, V]
capacity, size uint64
compare EqualFunc[K]
}
// location determines where in the subtable a certain key would be placed. If
// the capacity is 0, this will panic.
func (t *subtable[K, V]) location(key K) uint64 {
return t.hash(key) % t.capacity
}
func (t *subtable[K, V]) get(key K) (value V, found bool) {
if t.capacity == 0 {
return
}
slot := t.slots[t.location(key)]
return slot.value, slot.occupied && t.compare(slot.key, key)
}
func (t *subtable[K, V]) drop(key K) (occupied bool) {
if t.capacity == 0 {
return
}
slot := &t.slots[t.location(key)]
if slot.occupied && t.compare(slot.key, key) {
slot.occupied = false
t.size--
return true
}
return false
}
func (t *subtable[K, V]) resized(capacity uint64) *subtable[K, V] {
return &subtable[K, V]{
slots: make([]slot[K, V], capacity),
capacity: capacity,
hash: t.hash,
compare: t.compare,
}
}
func (t *subtable[K, V]) update(key K, value V) (updated bool) {
if t.capacity == 0 {
return
}
slot := &t.slots[t.location(key)]
if slot.occupied && t.compare(slot.key, key) {
slot.value = value
return true
}
return false
}
func (t *subtable[K, V]) insert(insertion entry[K, V]) (evicted entry[K, V], eviction bool) {
if t.capacity == 0 {
return insertion, true
}
slot := &t.slots[t.location(insertion.key)]
if !slot.occupied {
slot.entry = insertion
slot.occupied = true
t.size++
return
}
if t.compare(slot.key, insertion.key) {
slot.value = insertion.value
return
}
insertion, slot.entry = slot.entry, insertion
return insertion, true
}
func newSubtable[K, V any](capacity uint64, hash Hash[K], compare EqualFunc[K]) *subtable[K, V] {
return &subtable[K, V]{
hash: hash,
capacity: capacity,
compare: compare,
size: 0,
slots: make([]slot[K, V], capacity),
}
}

211
table.go

@@ -1,41 +1,50 @@
package cuckoo package cuckoo
import ( import (
"errors"
"fmt" "fmt"
"iter" "iter"
"math/bits" "math/bits"
"strings" "strings"
) )
// A Table is hash table that uses cuckoo hashing to resolve collision. Create // ErrBadHash occurs when the hashes given to a [Table] cause too many key
// one with [NewTable]. Or if you want more granularity, use [NewTableBy] or // collisions. Discard the old table, rebuild it from your source data, and try:
// [NewCustomTable]. //
// 1. Different hash seeds. Equal seeds produce equal hash functions, which
// always cycle.
// 2. A different [Hash] algorithm.
var ErrBadHash = errors.New("bad hash")
// A Table is a hash table that uses cuckoo hashing to resolve collisions. Create
// one with [New]. Or if you want more granularity, use [NewBy] or
// [NewCustom].
type Table[K, V any] struct { type Table[K, V any] struct {
bucketA, bucketB bucket[K, V] tableA, tableB *subtable[K, V]
growthFactor uint64 growthFactor uint64
minLoadFactor float64 minLoadFactor float64
} }
// TotalCapacity returns the number of slots allocated for the [Table]. To get the // TotalCapacity returns the number of slots allocated for the [Table]. To get the
// number of slots filled, look at [Table.Size]. // number of slots filled, look at [Table.Size].
func (t Table[K, V]) TotalCapacity() uint64 { func (t *Table[K, V]) TotalCapacity() uint64 {
return t.bucketA.capacity + t.bucketB.capacity return t.tableA.capacity + t.tableB.capacity
} }
// Size returns how many slots are filled in the [Table]. // Size returns how many slots are filled in the [Table].
func (t Table[K, V]) Size() int { func (t *Table[K, V]) Size() int {
return int(t.bucketA.size + t.bucketB.size) return int(t.tableA.size + t.tableB.size)
} }
func log2(n uint64) (m int) { func log2(n uint64) (m int) {
return max(0, bits.Len64(n)-1) return max(0, bits.Len64(n)-1)
} }
func (t Table[K, V]) maxEvictions() int { func (t *Table[K, V]) maxEvictions() int {
return 3 * log2(t.TotalCapacity()) return 3 * log2(t.TotalCapacity())
} }
func (t Table[K, V]) load() float64 { func (t *Table[K, V]) load() float64 {
// When there are no slots in the table, we still treat the load as 100%. // When there are no slots in the table, we still treat the load as 100%.
// Every slot in the table is full. // Every slot in the table is full.
if t.TotalCapacity() == 0 { if t.TotalCapacity() == 0 {
@@ -45,115 +54,153 @@ func (t Table[K, V]) load() float64 {
return float64(t.Size()) / float64(t.TotalCapacity()) return float64(t.Size()) / float64(t.TotalCapacity())
} }
// resize clears all buckets, changes the sizes of them to a specific capacity, // insert attempts to put/update an entry in the table, without modifying the
// and fills them back up again. It is a helper function for [Table.grow] and // size of the table. Returns a displaced entry and 'homeless = true' if an
// [Table.shrink]; use them instead. // entry could not be placed after exhausting evictions.
func (t *Table[K, V]) resize(capacity uint64) error { func (t *Table[K, V]) insert(entry entry[K, V]) (displaced entry[K, V], homeless bool) {
entries := make([]entry[K, V], 0, t.Size()) if t.tableA.update(entry.key, entry.value) {
for k, v := range t.Entries() { return
entries = append(entries, entry[K, V]{k, v})
} }
t.bucketA.resize(capacity) if t.tableB.update(entry.key, entry.value) {
t.bucketB.resize(capacity) return
}
for _, entry := range entries { for range t.maxEvictions() {
if err := t.Put(entry.key, entry.value); err != nil { if entry, homeless = t.tableA.insert(entry); !homeless {
return err return
}
if entry, homeless = t.tableB.insert(entry); !homeless {
return
} }
} }
return nil return entry, true
} }
// grow increases the table's capacity by the [Table.growthFactor]. If the // resized creates an empty copy of the table, with a new capacity for each
// subtable.
func (t *Table[K, V]) resized(capacity uint64) *Table[K, V] {
return &Table[K, V]{
growthFactor: t.growthFactor,
minLoadFactor: t.minLoadFactor,
tableA: t.tableA.resized(capacity),
tableB: t.tableB.resized(capacity),
}
}
// resize creates a new [Table.resized] with 'capacity', inserts all items into
// the array, and replaces the current table. It is a helper function for
// [Table.grow] and [Table.shrink]; use them instead.
func (t *Table[K, V]) resize(capacity uint64) bool {
updated := t.resized(capacity)
for k, v := range t.Entries() {
if _, failed := updated.insert(entry[K, V]{k, v}); failed {
return false
}
}
*t = *updated
return true
}
// grow increases the table's capacity by the growth factor. If the
// capacity is 0, it increases it to 1. // capacity is 0, it increases it to 1.
func (t *Table[K, V]) grow() error { func (t *Table[K, V]) grow() bool {
var newCapacity uint64 var newCapacity uint64
if t.TotalCapacity() == 0 { if t.TotalCapacity() == 0 {
newCapacity = 1 newCapacity = 1
} else { } else {
newCapacity = t.bucketA.capacity * t.growthFactor newCapacity = t.tableA.capacity * t.growthFactor
} }
return t.resize(newCapacity) return t.resize(newCapacity)
} }
// shrink reduces the table's capacity by the [Table.growthFactor]. It may // shrink reduces the table's capacity by the growth factor. It may
// reduce it down to 0. // reduce it down to 0.
func (t *Table[K, V]) shrink() error { func (t *Table[K, V]) shrink() bool {
return t.resize(t.bucketA.capacity / t.growthFactor) return t.resize(t.tableA.capacity / t.growthFactor)
} }
// Get fetches the value for a key in the [Table]. // Get fetches the value for a key in the [Table]. Matches the comma-ok pattern
func (t Table[K, V]) Get(key K) (value V, ok bool) { // of a builtin map; see [Table.Find] for plain indexing.
if item, ok := t.bucketA.get(key); ok { func (t *Table[K, V]) Get(key K) (value V, ok bool) {
if item, ok := t.tableA.get(key); ok {
return item, true return item, true
} }
if item, ok := t.bucketB.get(key); ok { if item, ok := t.tableB.get(key); ok {
return item, true return item, true
} }
return return
} }
// Find fetches the value of a key. Matches direct indexing of a builtin map;
// see [Table.Get] for a comma-ok pattern.
func (t *Table[K, V]) Find(key K) (value V) {
value, _ = t.Get(key)
return
}
// Has returns true if a key has a value in the table. // Has returns true if a key has a value in the table.
func (t Table[K, V]) Has(key K) (exists bool) { func (t *Table[K, V]) Has(key K) (exists bool) {
_, exists = t.Get(key) _, exists = t.Get(key)
return return
} }
// Put sets the value for a key. Returns error if its value cannot be set. // Put sets the value for a key. If it cannot be set, an error is returned.
func (t *Table[K, V]) Put(key K, value V) (err error) { func (t *Table[K, V]) Put(key K, value V) (err error) {
if t.bucketA.update(key, value) { var (
return nil entry = entry[K, V]{key, value}
} homeless bool
)
if t.bucketB.update(key, value) { for range defaultGrowthLimit {
return nil if entry, homeless = t.insert(entry); !homeless {
} return
entry, eviction := entry[K, V]{key, value}, false
for range t.maxEvictions() {
if entry, eviction = t.bucketA.evict(entry); !eviction {
return nil
} }
if entry, eviction = t.bucketB.evict(entry); !eviction { // Both this and the growth limit are necessary: this catches bad hashes
return nil // early when the table is sparse, while the latter catches cases where
// growing never helps.
if t.load() < t.minLoadFactor {
return fmt.Errorf("hash functions produced a cycle at load %d/%d: %w", t.Size(), t.TotalCapacity(), ErrBadHash)
}
// It is theoretically possible to have a table with a larger capacity
// that is valid. But this chance is astronomically small, so we ignore
// it in this implementation.
if grew := t.grow(); !grew {
return fmt.Errorf("could not redistribute entries into larger table: %w", ErrBadHash)
} }
} }
if t.load() < t.minLoadFactor { return fmt.Errorf("could not place entry after %d resizes: %w", defaultGrowthLimit, ErrBadHash)
return fmt.Errorf("bad hash: resize on load %d/%d = %f", t.Size(), t.TotalCapacity(), t.load())
}
if err := t.grow(); err != nil {
return err
}
return t.Put(entry.key, entry.value)
} }
// Drop removes a value for a key in the table. Returns an error if its value // Drop removes a value for a key in the table. Returns whether the key had
// cannot be removed. // existed.
func (t *Table[K, V]) Drop(key K) (err error) { func (t *Table[K, V]) Drop(key K) bool {
t.bucketA.drop(key) occupied := t.tableA.drop(key) || t.tableB.drop(key)
t.bucketB.drop(key)
if t.load() < t.minLoadFactor { if t.load() < t.minLoadFactor {
return t.shrink() // The error is not handled here, because table-shrinking is an internal
// optimization.
t.shrink()
} }
return nil return occupied
} }
// Entries returns an unordered sequence of all key-value pairs in the table. // Entries returns an unordered sequence of all key-value pairs in the table.
func (t Table[K, V]) Entries() iter.Seq2[K, V] { func (t *Table[K, V]) Entries() iter.Seq2[K, V] {
return func(yield func(K, V) bool) { return func(yield func(K, V) bool) {
for _, slot := range t.bucketA.slots { for _, slot := range t.tableA.slots {
if slot.occupied { if slot.occupied {
if !yield(slot.key, slot.value) { if !yield(slot.key, slot.value) {
return return
@@ -161,7 +208,7 @@ func (t Table[K, V]) Entries() iter.Seq2[K, V] {
} }
} }
for _, slot := range t.bucketB.slots { for _, slot := range t.tableB.slots {
if slot.occupied { if slot.occupied {
if !yield(slot.key, slot.value) { if !yield(slot.key, slot.value) {
return return
@@ -172,8 +219,8 @@ func (t Table[K, V]) Entries() iter.Seq2[K, V] {
} }
// String returns the entries of the table as a string in the format: // String returns the entries of the table as a string in the format:
// "table[k1:v1 h2:v2 ...]". // "table[k1:v1 k2:v2 ...]".
func (t Table[K, V]) String() string { func (t *Table[K, V]) String() string {
var sb strings.Builder var sb strings.Builder
sb.WriteString("table[") sb.WriteString("table[")
@@ -191,9 +238,9 @@ func (t Table[K, V]) String() string {
return sb.String() return sb.String()
} }
// NewCustomTable creates a [Table] with custom [Hash] and [EqualFunc] // NewCustom creates a [Table] with custom [Hash] and [EqualFunc]
// functions, along with any [Option] the user provides. // functions, along with any [Option] the user provides.
func NewCustomTable[K, V any](hashA, hashB Hash[K], compare EqualFunc[K], options ...Option) *Table[K, V] { func NewCustom[K, V any](hashA, hashB Hash[K], compare EqualFunc[K], options ...Option) *Table[K, V] {
settings := &settings{ settings := &settings{
growthFactor: DefaultGrowthFactor, growthFactor: DefaultGrowthFactor,
bucketSize: DefaultCapacity, bucketSize: DefaultCapacity,
@@ -207,8 +254,8 @@ func NewCustomTable[K, V any](hashA, hashB Hash[K], compare EqualFunc[K], option
return &Table[K, V]{ return &Table[K, V]{
growthFactor: settings.growthFactor, growthFactor: settings.growthFactor,
minLoadFactor: settings.minLoadFactor, minLoadFactor: settings.minLoadFactor,
bucketA: newBucket[K, V](settings.bucketSize, hashA, compare), tableA: newSubtable[K, V](settings.bucketSize, hashA, compare),
bucketB: newBucket[K, V](settings.bucketSize, hashB, compare), tableB: newSubtable[K, V](settings.bucketSize, hashB, compare),
} }
} }
@@ -216,10 +263,10 @@ func pipe[X, Y, Z any](a func(X) Y, b func(Y) Z) func(X) Z {
return func(x X) Z { return b(a(x)) } return func(x X) Z { return b(a(x)) }
} }
// NewTableBy creates a [Table] for any key type by using keyFunc to derive a // NewBy creates a [Table] for any key type by using keyFunc to derive a
// comparable key. Two keys with the same derived key are treated as equal. // comparable key. Two keys with the same derived key are treated as equal.
func NewTableBy[K, V any, C comparable](keyFunc func(K) C, options ...Option) *Table[K, V] { func NewBy[K, V any, C comparable](keyFunc func(K) C, options ...Option) *Table[K, V] {
return NewCustomTable[K, V]( return NewCustom[K, V](
pipe(keyFunc, NewDefaultHash[C]()), pipe(keyFunc, NewDefaultHash[C]()),
pipe(keyFunc, NewDefaultHash[C]()), pipe(keyFunc, NewDefaultHash[C]()),
func(a, b K) bool { return keyFunc(a) == keyFunc(b) }, func(a, b K) bool { return keyFunc(a) == keyFunc(b) },
@@ -227,10 +274,10 @@ func NewTableBy[K, V any, C comparable](keyFunc func(K) C, options ...Option) *T
) )
} }
// NewTable creates a [Table] using the default [Hash] and [EqualFunc]. Use // New creates a [Table] using the default [Hash] and [EqualFunc]. Use
// the [Option] functions to configure its behavior. Note that this constructor // the [Option] functions to configure its behavior. Note that this constructor
// is only provided for comparable keys. For arbitrary keys, consider // is only provided for comparable keys. For arbitrary keys, consider
// [NewTableBy] or [NewCustomTable]. // [NewBy] or [NewCustom].
func NewTable[K comparable, V any](options ...Option) *Table[K, V] { func New[K comparable, V any](options ...Option) *Table[K, V] {
return NewCustomTable[K, V](NewDefaultHash[K](), NewDefaultHash[K](), DefaultEqualFunc[K], options...) return NewCustom[K, V](NewDefaultHash[K](), NewDefaultHash[K](), DefaultEqualFunc[K], options...)
} }