Designing an Idiomatic Interface
Currently, the contract for the package has grown without deliberate design. More attention was paid to implementing the underlying functionality of the cuckoo hashing.
With the fundamentals of the algorithm built, our API contract should be revisited. It should align more closely with the following principles:
- Congruency to the builtin map. Our cuckoo table should have the same core functionality as Go's built-in map.
- Familiarity to the builtin map. If our cuckoo table behaves similarly to Go's standard map, our user will intuitively know how to use it. This lowers the cognitive load our developers must carry.
Current State
Interface of the Builtin Map
Listed below is every interface Go provides for the built-in map.
Also included are the functions from the maps package in the standard library.
Interfaces
| # | Builtin Interface | Description |
|---|---|---|
| 1 | m := make(map[K]V) | Returns an empty map using the built-in make() function. |
| 2 | m := make(map[K]V, hint) | Returns an empty map using make(), with a capacity 'hint'. This hint is how many items the map is expected to hold, not a measure of how large it is. |
| 3 | m := map[K]V{...} | Returns a map, optionally filled with the entries listed in place of the ellipsis. |
| 4 | var m map[K]V | Declares a variable that holds a map. This differs from #1 because m is uninitialized (nil) here: reads are allowed, but writes panic. |
| 5 | m[k] = v | Assigns the value v to the key k. |
| 6 | v := m[k] | Returns the value for k if it exists. Otherwise, v is the zero value of V. |
| 7 | v, ok := m[k] | Similar to #6, except ok reports whether k is present in m. This is comma-ok notation. |
| 8 | for k, v := range m | Iterates over every key-value pair in m. The order is unspecified and can differ between runs. |
| 9 | delete(m, k) | Removes the entry for k, if any. Returns no value. |
| 10 | clear(m) | Removes all entries from m. Returns no value. |
| 11 | n := len(m) | Returns the number of entries in m. If m is nil, returns 0. |
| 12 | m2 := maps.Clone(m) | Returns a shallow copy of m. |
| 13 | maps.Copy(dst, src) | Assigns every entry of src to dst, overwriting keys that already exist in dst. |
| 14 | ok := maps.Equal(m1, m2) | Returns true iff m1 and m2 have the same entries. |
| 15 | ok := maps.EqualFunc(m1, m2, fn) | Like #14, but values are compared with the custom function fn, which allows non-comparable values. |
| 16 | maps.DeleteFunc(m, fn) | Removes every entry in m which satisfies fn. Returns no value. |
| 17 | it2 := maps.All(m) | Returns a 2D iterator over every key-value pair. |
| 18 | it := maps.Keys(m) | Returns an iterator over every key. |
| 19 | it := maps.Values(m) | Returns an iterator over every value. There can be duplicates. |
| 20 | m := maps.Collect(seq) | Returns a new map holding every entry yielded by seq, a 2D iterator over key-value pairs. |
| 21 | maps.Insert(m, seq) | Assigns to m all key-value pairs in 2D iterator seq. Returns no value. |
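
A few of the rows above are easy to misremember, so here is a minimal sketch that exercises them with the standard library only. The function name describe is ours, chosen for illustration; the snippet assumes Go 1.21 or newer for the maps package.

```go
package main

import (
	"fmt"
	"maps"
)

// describe exercises rows 3, 6, 7, 9, 11, 12 and 14 of the table on a small map.
func describe() (missing int, found bool, n int, equal bool) {
	m := map[string]int{"a": 1, "b": 2} // row 3: map literal

	missing = m["zzz"]  // row 6: an absent key yields the zero value
	_, found = m["zzz"] // row 7: comma-ok reports presence
	delete(m, "a")      // row 9: removes the entry, no return value
	n = len(m)          // row 11: one entry is left

	c := maps.Clone(m) // row 12: shallow copy
	c["b"] = 20        // changing the copy leaves m untouched
	equal = maps.Equal(m, c) // row 14: false, the copy has diverged

	return missing, found, n, equal
}

func main() {
	var nilMap map[string]int // row 4: declared but nil
	// Reading from a nil map is safe; writing to it would panic.
	fmt.Println(len(nilMap), nilMap["x"])
	fmt.Println(describe())
}
```

The nil-map behavior in row 4 matters later, because any analog for var m Table[K, V] has to decide what reads and writes on an uninitialized table should do.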
Interface of go-cuckoo
On the other hand, here is the current contract for go-cuckoo.
Interfaces
| # | go-cuckoo Interface | Description |
|---|---|---|
| 1 | m := New(opts...) | Creates a table using the default hash and equal functions. The options configure its behavior. Confined to comparable keys. |
| 2 | m := NewBy(keyFunc, opts...) | Like #1, but allows any key type. A keyFunc is used to derive a comparable key. |
| 3 | m := NewCustom(hashA, hashB, equalFunc, opts...) | Like #1, but gives control over the hashes used, which allows any key type. An equalFunc determines key equality. |
| 4 | seq := m.Entries() | Returns an unordered 2D iterator of all key-value pairs in the table. |
| 5 | v := m.Find(k) | Returns the value for k in the table. |
| 6 | v, ok := m.Get(k) | Returns the value for k in the table. Also returns true if k exists, otherwise false. When false, v is undefined. |
| 7 | ok := m.Has(k) | Returns true if k is in the table. |
| 8 | err := m.Put(k, v) | Sets value v for key k. Returns an error if the entry could not be set. |
| 9 | n := m.Size() | Returns the number of items in m. |
| 10 | str := m.String() | Returns m as a string in the format "table[k1:v1 k2:v2 ...]". |
| 11 | cap := m.TotalCapacity() | Returns how many slots m has allocated. |
| 12 | ok := m.Drop(k) | Removes k from the table. Returns whether the key had existed. |
Determining Congruency
So, how does the core functionality compare?
Listed below is an analysis of every interface in Go's standard map.
Each is compared against what go-cuckoo offers, and categorized into the following groups:
- ✅ Covered: an analog exists.
- ⚠️ Partial: workaround available.
- ❌ Gap: no analog yet; addressed in Target State.
Specifically, we are checking functionality here:
does the built-in map offer anything that go-cuckoo does not?
We are checking accessibility, not discoverability.
The latter will be considered later.
✅ m := make(map[K]V)
The analog is m := New().
⚠️ m := make(map[K]V, hint)
This has no simple analog.
It is close to m := New(Capacity(hint)), but that sets the starting capacity, not the expected size.
For the built-in map, these are two separate things.
- Capacity is an internal measure, used to optimize space/speed. It is hidden from the user because it depends on the underlying implementation, which may change.
- Expected size is the number of items the map must be able to hold before it resizes. This is tangible and agnostic to implementation, which is why it is exposed to the user.
In short, this interface defines expected size, but Capacity() defines capacity.
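
To make the distinction concrete, here is a small sketch of how an expected size could be translated into a slot count. Everything in it is an assumption for illustration: the helper name slotsFor is hypothetical, and the 50% maximum load is an example figure, not go-cuckoo's actual growth policy.

```go
package main

import "fmt"

// slotsFor is a hypothetical helper. Given the number of entries a caller
// expects to store and an assumed maximum load (entries per 100 slots), it
// returns the smallest slot count that holds them without exceeding that load.
func slotsFor(expected, maxLoadPercent int) int {
	if expected <= 0 || maxLoadPercent <= 0 {
		return 0
	}
	// Ceiling division: the smallest n with expected*100 <= n*maxLoadPercent.
	return (expected*100 + maxLoadPercent - 1) / maxLoadPercent
}

func main() {
	// With an assumed 50% maximum load, 100 expected entries need 200 slots.
	fmt.Println(slotsFor(100, 50))
}
```

Under these assumptions, a caller who wants the semantics of make(map[K]V, hint) would pass something like slotsFor(hint, 50) to Capacity(), which is exactly the implementation detail the built-in map keeps hidden.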
❌ m := map[K]V{...}
This has no simple analog, the closest being:

    m := New[K, V]()
    for k, v := range startingEntries {
        m.Put(k, v)
    }

It is idiomatic, but far less ergonomic.
✅ var m map[K]V
The analog is var m Table[K, V].
✅ m[k] = v
The analog is err := m.Put(k, v).
✅ v := m[k]
The analog is v := m.Find(k).
✅ v, ok := m[k]
The analog is v, ok := m.Get(k).
✅ for k, v := range m
The analog is for k, v := range m.Entries().
✅ delete(m, k)
The analog is ok := m.Drop(k).
❌ clear(m)
There is no analog.
The easiest way to do this is to delete all items individually:

    for k := range m.Entries() {
        m.Drop(k)
    }

✅ n := len(m)
The analog is n := m.Size().
❌ m2 := maps.Clone(m)
There is no analog.
The easiest way to do this currently is to make a new table, and manually add the items:

    m2 := cuckoo.New[K, V]()
    for k, v := range m.Entries() {
        m2.Put(k, v)
    }

This gets complicated by the various options available to the user.
Furthermore, any custom EqualFunc, keyFunc or Hash is not transferred.
❌ maps.Copy(dst, src)
There is no analog.
The simplest way to do this is with a for-loop:

    for k, v := range src.Entries() {
        dst.Put(k, v)
    }

❌ ok := maps.Equal(m1, m2)
There is no analog.
Users have to manually check the key-value pairs to determine equality.
❌ ok := maps.EqualFunc(m1, m2, fn)
There is no analog.
Users have to manually check the key-value pairs to determine equality.
❌ maps.DeleteFunc(m, fn)
There is no analog.
Users have to manually delete keys.
✅ it2 := maps.All(m)
The analog is it2 := m.Entries().
⚠️ it := maps.Keys(m)
There is no simple analog.
A close neighbor is it2 := m.Entries().
Users can use this in a for-loop, and pick out just the keys:

    for k := range m.Entries() {
        // ...
    }

⚠️ it := maps.Values(m)
There is no simple analog.
A close neighbor is it2 := m.Entries().
Users can use this in a for-loop, and pick out just the values:

    for _, v := range m.Entries() {
        // ...
    }

❌ m := maps.Collect(seq)
There is no analog.
❌ maps.Insert(m, seq)
There is no analog.
Target State
Solving Congruency
The following changes will be made to accommodate congruency:
ok := maps.EqualFunc(m1, m2, fn)
To solve this, we need a new function:

    func EqualFunc[K, V1, V2 any](t1 *Table[K, V1], t2 *Table[K, V2], eq func(V1, V2) bool) bool

This function is free, and not bound as a receiver function.
(It is called as cuckoo.EqualFunc(t1, t2, eq), not t1.EqualFunc(t2, eq).)
The latter implies t1 has authority, when in fact neither table does.
Equality will be defined as:
- Neither table has a key the other doesn't.
- Each key has the same value in each table. The parameter eq determines this equality.
Custom key-equality functions (the equalFunc passed to NewCustom) complicate this, as they modulate key identity in tables.
If the two tables disagree on whether two keys are distinct, this function might break.
So, we must assume that:
- Both tables have EqualFunc's which 'agree' on the identity of the keys present in the tables. Agreement is defined as: if two keys are distinct in one table, they are distinct in the other.
ok := maps.Equal(m1, m2)
The addition of cuckoo.EqualFunc makes an implementation trivial:

    func Equal[K any, V comparable](t1, t2 *Table[K, V]) bool {
        return EqualFunc(t1, t2, DefaultEqualFunc[V])
    }

To conform with the standard library, this should be added as a new function. Once again, the function is free because it is symmetric.
maps.Insert(m, seq)
This functionality requires a new receiver:

    func (t *Table[K, V]) Insert(seq iter.Seq2[K, V]) error

A receiver fits better even though maps.Insert is a free function, because the operation is asymmetric.
Table t receives entries from seq.
maps.Insert is only free because Go's standard map is built into the language, and so cannot have receivers.
In terms of naming, t.Extend is more accurate, and has precedent in Python and Rust.
Ultimately, t.Insert() is a better choice because of
maps.Copy(dst, src)
To solve this, we must implement a new receiver:

    func (t *Table[K, V]) Copy(src *Table[K, V]) error

A receiver fits better even though maps.Copy is a free function, because 'copying' is asymmetric: dst is written into from src.
maps.Copy is only free because Go's standard map is built into the language, and so cannot have receivers.
The name t.Merge() might be more accurate, but it does not work because:
- t.Copy() matches Go's builtin copy(), and io.Copy(). The Go team used the same logic to name maps.Copy(). In this case, t.Merge() would be an outlier.
- t.Merge() implies some sort of conflict-resolution, when there is none. It simply overwrites the values.