Ruby Hash Isn’t Always a Hash Table (And Why That Matters)

May 11, 2026

Most Ruby developers think of a Hash as a classic hash table: keys, values, O(1) lookups.

That’s only partially true.

Under the hood, Ruby uses a hybrid data structure — and understanding it can change how you think about performance, memory, and even API design.


The surprising part: small hashes aren’t hash tables

When you write:

options = { verbose: true, format: :json }

Ruby does not immediately allocate a full hash table.

For small hashes (roughly ≤ 8 entries in current CRuby), Ruby uses an Array Representation (AR table):

  • Stored inline inside the object
  • No heap allocation
  • Contiguous memory layout
  • Optimized for cache locality

👉 Your small hash is effectively a compact array of key/value pairs.


This is not a naive linear scan

It’s tempting to think Ruby just loops through entries.

That’s not quite true.

From CRuby’s hash.c, lookup in small hashes looks like:

if (hints[i] == hint) {
if (ar_equal(key, pair->key)) {

What’s happening:

  • Ruby computes a hash for the key
  • Stores a 1-byte hint, derived from that hash, alongside each entry
  • During lookup:
      • Compare the hint (very fast)
      • Only then check eql? (more expensive)

👉 So the real model is:

Linear scan + hash-based prefiltering + equality check

This is much faster than a naive scan.
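The lookup model above can be sketched in plain Ruby. This is only a toy model: TinyTable, Entry, and hint_for are hypothetical names, and taking the low byte of #hash as the hint is an assumption made for illustration. CRuby does the real work in C.

```ruby
# Toy model of the small-hash lookup described above. TinyTable and its
# method names are hypothetical; CRuby implements this in C inside hash.c.
class TinyTable
  Entry = Struct.new(:hint, :key, :value)

  def initialize
    @entries = []
  end

  # Assumption for this sketch: the hint is the low byte of the key's #hash.
  def hint_for(key)
    key.hash & 0xff
  end

  def store(key, value)
    entry = find(key)
    if entry
      entry.value = value
    else
      @entries << Entry.new(hint_for(key), key, value)
    end
    value
  end

  def fetch(key)
    entry = find(key)
    entry && entry.value
  end

  private

  # Linear scan over the entries in insertion order.
  def find(key)
    hint = hint_for(key)
    @entries.find do |entry|
      # Cheap one-byte comparison first; eql? runs only on a hint match.
      entry.hint == hint && key.eql?(entry.key)
    end
  end
end

table = TinyTable.new
table.store(:verbose, true)
table.store(:format, :json)
table.store(:format, :xml)   # overwrites the existing entry
table.fetch(:format)         # => :xml
table.fetch(:missing)        # => nil
```

Most non-matching entries are rejected by the one-byte comparison alone, so eql? (which may be user code) rarely runs more than once per lookup.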


Why this is actually fast

For very small collections, this approach beats a traditional hash table because:

  • No pointer chasing
  • No extra allocations
  • Sequential memory access
  • Better CPU cache utilization

👉 Ruby is optimizing for real-world usage, not theoretical complexity.

Most hashes in Ruby code are small:

  • options hashes
  • keyword arguments
  • configuration objects

When it becomes a “real” hash table

Once the hash grows beyond its small fixed capacity, Ruby switches to a different structure:

  • ST table (st_table)
  • Heap-allocated
  • Open addressing
  • O(1) average lookup

This transition is automatic.

From the source:

if (RHASH_AR_TABLE_SIZE(hash) >= RHASH_AR_TABLE_MAX_SIZE) {
    return 1;
}

Followed by:

ar_force_convert_table(hash, ...)

👉 When the AR table fills up:

  • Ruby allocates a real hash table
  • Recomputes hashes
  • Reinserts entries

This cost happens once, then scales efficiently.
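One rough way to observe the switch from Ruby is to ask ObjectSpace how much memory hashes of different sizes report. This is a sketch that assumes CRuby and its objspace extension. The byte counts vary by Ruby version and platform, so treat them as an implementation detail rather than a contract.

```ruby
require "objspace"

# Build hashes with 0..10 entries and record what ObjectSpace reports.
# The exact byte counts depend on the Ruby version and platform.
sizes = (0..10).to_h do |n|
  h = {}
  n.times { |i| h[i] = i }
  [n, ObjectSpace.memsize_of(h)]
end

sizes.each { |n, bytes| puts "#{n} entries: #{bytes} bytes" }
```

If the small-table limit is 8 entries, the reported size should stay flat or nearly flat up to that point and then jump when the ninth entry forces the conversion.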


Hash lookup: the real contract

Ruby does not rely on hash values alone.

Lookup always follows two steps:

  1. Use hash → narrow candidates
  2. Use eql? → confirm identity

Example:

1 == 1.0 # true
1.eql?(1.0) # false
{1 => "a", 1.0 => "b"} # two different keys

👉 Hash identity is based on eql?, not ==.
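A short sketch of what that means for lookups, using only core Hash behavior:

```ruby
h = { 1 => "a" }

int_hit   = h[1]     # => "a"
float_hit = h[1.0]   # => nil, because 1.eql?(1.0) is false

h[1.0] = "b"         # stored as a separate key
count = h.size       # => 2
```

Even though 1 == 1.0, the Float never finds the Integer's entry, and assigning through it creates a second key.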


The rule you must never break

If you use custom objects as keys:

class User
  attr_reader :id

  def initialize(id)
    @id = id
  end

  def hash
    id.hash
  end

  def eql?(other)
    other.is_a?(User) && other.id == id
  end
end

You must guarantee:

If a.eql?(b) → then a.hash == b.hash

Otherwise, lookups can miss entries that eql? says are present, and your hash will behave unpredictably.
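To make the failure mode concrete, here is a sketch with two hypothetical key classes: SloppyUser overrides eql? but leaves the default identity-based #hash, while CarefulUser keeps the two consistent.

```ruby
# Hypothetical key class that overrides eql? but forgets #hash.
class SloppyUser
  attr_reader :id

  def initialize(id)
    @id = id
  end

  def eql?(other)
    other.is_a?(SloppyUser) && other.id == id
  end
end

# Same idea, but #hash agrees with eql?.
class CarefulUser
  attr_reader :id

  def initialize(id)
    @id = id
  end

  def hash
    id.hash
  end

  def eql?(other)
    other.is_a?(CarefulUser) && other.id == id
  end
end

careful = { CarefulUser.new(1) => "found" }
careful_hit = careful[CarefulUser.new(1)]   # => "found"

sloppy = { SloppyUser.new(1) => "found" }
sloppy_hit = sloppy[SloppyUser.new(1)]      # usually nil: the hashes differ
```

The sloppy lookup is not even reliably nil. In a small hash, two unrelated hash values can happen to share the same one-byte hint, in which case eql? runs and the lookup succeeds. That is the kind of unpredictability the rule protects against.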


Subtle edge case: floats

Ruby even normalizes tricky values like floats:

-0.0.eql?(0.0) # true

Internally, it ensures both produce the same hash.

👉 This avoids subtle lookup bugs.
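A quick check of that behavior from Ruby:

```ruby
neg_zero = -0.0
pos_zero = 0.0

same_hash = neg_zero.hash == pos_zero.hash   # => true
h = { pos_zero => "zero" }
lookup = h[neg_zero]                         # => "zero"
```

Because both the hash values and eql? agree, either zero finds the same entry.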


Ordered hashes (since Ruby 1.9)

Ruby’s Hash also preserves insertion order:

h = {}
h[:a] = 1
h[:b] = 2
h.keys # => [:a, :b]

This behavior is guaranteed by the language, and both internal representations keep entries in insertion order.
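Two details of the ordering rule are easy to miss, sketched here with a throwaway hash: updating an existing key keeps its position, while deleting and re-inserting moves it to the end.

```ruby
h = { a: 1, b: 2, c: 3 }

h[:a] = 10                  # updating an existing key keeps its position
after_update = h.keys       # => [:a, :b, :c]

h.delete(:a)
h[:a] = 1                   # deleting and re-inserting moves it to the end
after_reinsert = h.keys     # => [:b, :c, :a]
```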


A deeper insight from the source

CRuby contains guards like:

if (UNLIKELY(!RHASH_AR_TABLE_P(hash))) {

This reveals something important:

👉 Hash operations must defend against user code executing during hashing.

For example:

  • calling #hash on an object
  • that method mutates the hash itself

Ruby’s implementation is built to handle these edge cases safely.


What this means for your code

1) Small hashes are extremely cheap

Don’t avoid hashes for performance in common cases:

def call(user, options = {})

This is already highly optimized.


2) Don’t use mutable keys

key = []
h = { key => "value" }
key << 1
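h[key] # ❌ usually nil: the entry is still filed under the hash of []

You’ve changed the key’s hash value after insertion, so lookups no longer land on the stored entry. Hash#rehash re-indexes the entries, but immutable keys avoid the problem entirely.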
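If a key does get mutated, Hash#rehash rebuilds the index from each key's current hash value. The sketch below also shows the one built-in safeguard: when an unfrozen String is used as a key, Hash#[]= stores a frozen copy, so mutating the original afterwards does not disturb the entry.

```ruby
key = []
h = { key => "value" }

key << 1                      # key.hash no longer matches the stored entry
h.rehash                      # re-index every entry under its current hash
after_rehash = h[key]         # => "value"

# String keys are a special case: Hash#[]= stores a frozen copy of an
# unfrozen String, so later mutation of the original does not affect it.
name = +"id"
by_name = {}
by_name[name] = 42
name << "_changed"
still_there = by_name["id"]   # => 42
```

Relying on rehash is a repair, not a design. Freezing keys, or using Symbols, Integers, or Strings, avoids the situation in the first place.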


3) Always implement hash and eql? together

Never override one without the other.


4) Understand the performance model

  • Small hash → array-like behavior
  • Large hash → real hash table

👉 This mental model helps when diagnosing performance issues.


The takeaway

Ruby’s Hash is not just a hash table.

It’s a hybrid structure that combines:

  • Inline storage for small datasets
  • Hash tables for scalability
  • Hash hinting for fast lookups

All designed around how Ruby is actually used.
