
May 11, 2026
Most Ruby developers think of a Hash as a classic hash table: keys, values, O(1) lookups.
That’s only partially true.
Under the hood, Ruby uses a hybrid data structure — and understanding it can change how you think about performance, memory, and even API design.
The surprising part: small hashes aren’t hash tables
When you write:
options = { verbose: true, format: :json }
Ruby does not immediately allocate a full hash table.
For small hashes (roughly ≤ 8 entries in current CRuby), Ruby uses an Array Representation (AR table):
- Stored inline inside the Hash object (in recent CRuby versions)
- No separately allocated table
- Contiguous memory layout
- Optimized for cache locality
👉 Your small hash is effectively a compact array of key/value pairs.
This is not a naive linear scan
It’s tempting to think Ruby just loops through entries.
That’s not quite true.
From CRuby’s hash.c, lookup in small hashes looks like:
if (hints[i] == hint) {
    if (ar_equal(key, pair->key)) {
What’s happening:
- Ruby computes a hash for the key
- Stores a 1-byte hint, derived from that hash value, for each entry
- During lookup:
  - Compare the hint (very fast)
  - Only then check eql? (more expensive)
👉 So the real model is:
Linear scan + hash-based prefiltering + equality check
This is much faster than a naive scan.
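The scan-with-prefilter idea can be modeled in plain Ruby. This is a toy sketch, not CRuby's implementation: the TinyTable class, its method names, and the use of the low byte of #hash as the hint are assumptions made for illustration.

```ruby
# Toy model of an AR-style table: parallel arrays of 1-byte hints and pairs.
# Hypothetical class; the real structure lives in C inside hash.c.
class TinyTable
  MAX_SIZE = 8 # mirrors the small fixed capacity described in the article

  def initialize
    @hints = []
    @pairs = []
  end

  def store(key, value)
    hint = key.hash & 0xff
    index = find_index(key, hint)
    if index
      @pairs[index][1] = value
    else
      raise "table full; a real implementation would convert here" if @pairs.size >= MAX_SIZE
      @hints << hint
      @pairs << [key, value]
    end
    value
  end

  def fetch(key)
    index = find_index(key, key.hash & 0xff)
    index && @pairs[index][1]
  end

  private

  # Linear scan: cheap hint comparison first, eql? only on a hint match.
  def find_index(key, hint)
    @hints.each_index do |i|
      next unless @hints[i] == hint
      return i if @pairs[i][0].eql?(key)
    end
    nil
  end
end

table = TinyTable.new
table.store(:verbose, true)
table.store(1, "int")
table.store(1.0, "float")

table.fetch(1)        # => "int"
table.fetch(1.0)      # => "float"
table.fetch(:missing) # => nil
```

Most non-matching entries are rejected by the one-byte comparison, so eql? runs only for the rare entries whose hint happens to match.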
Why this is actually fast
For very small collections, this approach can outperform a traditional hash table because:
- No pointer chasing
- No extra allocations
- Sequential memory access
- Better CPU cache utilization
👉 Ruby is optimizing for real-world usage, not theoretical complexity.
Most hashes in Ruby code are small:
- options hashes
- keyword arguments
- configuration objects
When it becomes a “real” hash table
Once the hash grows beyond its small fixed capacity, Ruby switches to a different structure:
- ST table (st_table)
- Heap-allocated
- Open addressing
- O(1) average lookup
This transition is automatic.
From the source:
if (RHASH_AR_TABLE_SIZE(hash) >= RHASH_AR_TABLE_MAX_SIZE) {
    return 1;
}
Followed by:
ar_force_convert_table(hash, ...)
👉 When the AR table fills up:
- Ruby allocates a real hash table
- Recomputes hashes
- Reinserts entries
The conversion cost is paid once. After that, lookups stay O(1) on average as the hash keeps growing.
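One rough way to watch the switch from the outside is to track the reported size of a hash as it grows. The sketch below uses ObjectSpace.memsize_of from the objspace standard library; the helper name hash_growth_sizes is made up for this example. The exact byte counts depend on the Ruby version and platform, so no output is shown. On a CRuby build with an 8-entry AR limit you would expect a jump around the ninth insertion, but treat that as something to check rather than a guarantee.

```ruby
require "objspace"

# Returns the reported byte size of a hash after each of `max` insertions.
# Hypothetical helper, written only for this illustration.
def hash_growth_sizes(max)
  h = {}
  (1..max).map do |n|
    h[n] = n
    ObjectSpace.memsize_of(h)
  end
end

hash_growth_sizes(12).each_with_index do |bytes, i|
  puts format("%2d entries: %d bytes", i + 1, bytes)
end
```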
Hash lookup: the real contract
Ruby does not rely on hash values alone.
Lookup always follows two steps:
- Use hash → narrow candidates
- Use eql? → confirm identity
Example:
1 == 1.0                  # true
1.eql?(1.0)               # false
{1 => "a", 1.0 => "b"}    # two different keys
👉 Hash identity is based on eql?, not ==.
The rule you must never break
If you use custom objects as keys:
class User
  attr_reader :id # assumed: the id is exposed through a reader

  def initialize(id)
    @id = id
  end

  def hash
    id.hash
  end

  def eql?(other)
    other.is_a?(User) && other.id == id
  end
end
You must guarantee:
If a.eql?(b) → then a.hash == b.hash
Otherwise, lookups can miss a key that is eql? to the one you pass in, and the same logical key can end up stored twice.
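To see what a violation looks like, here is a deliberately broken key class. BrokenKey and its token field are hypothetical, invented for this sketch: eql? compares ids, while #hash returns a value that eql? ignores.

```ruby
# Hypothetical class that breaks the contract on purpose.
class BrokenKey
  attr_reader :id, :token

  def initialize(id, token)
    @id = id
    @token = token
  end

  def eql?(other)
    other.is_a?(BrokenKey) && other.id == id
  end

  def hash
    token # BUG: two objects that are eql? can return different hash values
  end
end

a = BrokenKey.new(1, 100)
b = BrokenKey.new(1, 200)

a.eql?(b)         # => true
a.hash == b.hash  # => false, which violates the contract

h = { a => "found" }
h[b]              # => nil, even though a.eql?(b)
```

Because the hash values differ, the lookup rejects the stored entry before eql? is ever consulted.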
Subtle edge case: floats
Ruby even normalizes tricky values like floats:
-0.0.eql?(0.0) # true
Internally, CRuby normalizes -0.0 to 0.0 before hashing, so both produce the same hash value.
👉 This avoids subtle lookup bugs.
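A short check of the two zeros as hash keys; the variable names are arbitrary.

```ruby
zero = 0.0
neg_zero = -0.0

zero.eql?(neg_zero)         # => true
zero.hash == neg_zero.hash  # => true

h = { zero => "zero" }
h[neg_zero]                 # => "zero"

h[neg_zero] = "updated"     # updates the existing entry instead of adding one
h.size                      # => 1
```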
Ordered hashes (since Ruby 1.9)
Ruby’s Hash also preserves insertion order:
h = {}
h[:a] = 1
h[:b] = 2
h.keys # => [:a, :b]
Iteration follows insertion order, and this is guaranteed behavior rather than an implementation accident.
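Two consequences of insertion ordering are worth seeing once: updating a key keeps its position, while deleting and re-adding it moves it to the end.

```ruby
h = { a: 1, b: 2, c: 3 }

h[:a] = 10    # updating an existing key keeps its position
h.keys        # => [:a, :b, :c]

h.delete(:a)
h[:a] = 1     # re-inserting after a delete appends at the end
h.keys        # => [:b, :c, :a]
```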
A deeper insight from the source
CRuby contains guards like:
if (UNLIKELY(!RHASH_AR_TABLE_P(hash))) {
This reveals something important:
👉 Hash operations must defend against user code executing during hashing.
For example, Ruby calls #hash on a key object, and that method mutates the hash itself, possibly growing it until it is no longer an AR table by the time the call returns.
Ruby’s implementation is built to handle these edge cases safely.
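A contrived sketch of that situation: SneakyKey is a hypothetical class whose #hash method writes to the very hash it is being used in. Nobody should write code like this, but it shows the kind of reentrancy the implementation has to survive.

```ruby
# Hypothetical key whose #hash method mutates the hash it is stored in.
class SneakyKey
  def initialize(target)
    @target = target
  end

  def hash
    @target[:side_effect] = true # user code running in the middle of a hash operation
    42
  end

  def eql?(other)
    equal?(other)
  end
end

h = {}
key = SneakyKey.new(h)
h[key] = "value"

h[key]  # => "value"
h.size  # => 2 (the sneaky key plus :side_effect)
```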
What this means for your code
1) Small hashes are extremely cheap
Don’t avoid hashes for performance in common cases:
def call(user, options = {})
This is already highly optimized.
2) Don’t use mutable keys
key = []
h = { key => "value" }
key << 1
h[key] # ❌ unpredictable (usually nil)
You’ve changed the key’s hash value, but the entry is still stored under the old one.
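If a key has already been mutated, Hash#rehash recomputes the stored hash values and makes the entry reachable again. The sketch also shows that String keys are a special case: Hash#[]= stores a frozen copy of an unfrozen String, so later changes to the original do not affect the key. Variable names are arbitrary.

```ruby
key = []
h = { key => "value" }

key << 1   # the entry is still filed under the old hash value
h[key]     # usually nil at this point

h.rehash   # recompute the hash value of every key
h[key]     # => "value"

# String keys: the hash keeps its own frozen copy.
name = +"ruby"
langs = {}
langs[name] = 1
name << "!"               # mutates the original string only

langs.keys.first          # => "ruby"
langs.keys.first.frozen?  # => true
```

Rehashing is a repair, not a pattern to rely on. Freezing keys, or using immutable values as keys, avoids the problem in the first place.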
3) Always implement hash and eql? together
Never override one without the other.
4) Understand the performance model
- Small hash → array-like behavior
- Large hash → real hash table
👉 This mental model helps when diagnosing performance issues.
The takeaway
Ruby’s Hash is not just a hash table.
It’s a hybrid structure that combines:
- Inline storage for small datasets
- Hash tables for scalability
- Hash hinting for fast lookups
All designed around how Ruby is actually used.
