Inside Ruby’s JSON Library: Complete Deep Dive

May 18, 2026

Introduction

This tutorial explores the internals of the JSON library that ships with CRuby, the reference Ruby interpreter.

The archive contains:

  • Native C parser implementation
  • Native C generator implementation
  • SIMD optimizations
  • Floating-point conversion algorithms
  • Buffer management infrastructure
  • Ruby wrapper APIs
  • JSON additions for Ruby core classes
  • Build system integration

Repository structure:

json/
├── parser/
│ └── parser.c
├── generator/
│ └── generator.c
├── simd/
│ └── simd.h
├── vendor/
│ ├── fpconv.c
│ ├── ryu.h
│ └── jeaiii-ltoa.h
├── fbuffer/
│ └── fbuffer.h
└── lib/json/

This tutorial covers the architecture, major APIs, internal methods, parsing pipeline, generator internals, memory management, and performance optimizations.


1. High-Level Architecture

Ruby’s JSON implementation consists of two primary components:

  1. Parser
  2. Generator

At the Ruby level:

JSON.parse(string)
JSON.generate(object)

At the native level:

Ruby API → C extension bindings → Parser / Generator engines → Low-level buffer and serialization systems

The parser transforms JSON text into Ruby objects.

The generator transforms Ruby objects into JSON strings.


2. Ruby-Level Public APIs

Parsing JSON

require 'json'
json = '{"name":"Ruby","year":1995}'
obj = JSON.parse(json)

Result:

{
"name" => "Ruby",
"year" => 1995
}

Generating JSON

JSON.generate({name: 'Ruby'})

Result:

{"name":"Ruby"}

Pretty Generation

JSON.pretty_generate({
name: 'Ruby',
version: '3.x'
})

3. Parser Internals

File:

json/parser/parser.c

This is the heart of the JSON parser.

Responsibilities:

  • Tokenization
  • Recursive descent parsing
  • String decoding
  • Unicode handling
  • Number parsing
  • Object/array construction
  • Error handling
  • Memory management

4. Parsing Pipeline

The parser roughly follows this flow:

Input String → Tokenizer → Character Scanner → Value Dispatcher → Object/Array Builders → Ruby Object Creation

JSON values are identified and dispatched:

{
[
"
true
false
null
numbers

Each token maps to a dedicated parsing routine.
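
The dispatch idea can be sketched in plain Ruby. This is a toy recursive-descent parser written for illustration only, not the code in parser.c: the class name TinyJSON and all of its methods are hypothetical, and it handles only a small subset of string escapes.

```ruby
require 'strscan'

# Toy parser: look at the next character, then hand off to a dedicated routine.
class TinyJSON
  ESCAPES = { 'n' => "\n", '"' => '"', '\\' => '\\' }.freeze

  def initialize(src)
    @s = StringScanner.new(src)
  end

  def parse
    value = parse_value
    skip_ws
    raise ArgumentError, "trailing data at #{@s.pos}" unless @s.eos?
    value
  end

  private

  def skip_ws
    @s.skip(/[ \t\r\n]*/)
  end

  def parse_value
    skip_ws
    case @s.peek(1)
    when '{' then parse_object
    when '[' then parse_array
    when '"' then parse_string
    when 't' then literal('true', true)
    when 'f' then literal('false', false)
    when 'n' then literal('null', nil)
    else parse_number
    end
  end

  def literal(text, value)
    @s.scan(/#{text}/) or raise ArgumentError, "expected #{text} at #{@s.pos}"
    value
  end

  def parse_number
    tok = @s.scan(/-?\d+(\.\d+)?([eE][+-]?\d+)?/) or
      raise ArgumentError, "unexpected token at #{@s.pos}"
    tok.match?(/[.eE]/) ? Float(tok) : Integer(tok, 10)
  end

  # Simplified: only \n, \" and \\ escapes are supported.
  def parse_string
    @s.getch == '"' or raise ArgumentError, "expected string at #{@s.pos}"
    out = +''
    until @s.peek(1) == '"'
      ch = @s.getch or raise ArgumentError, 'unterminated string'
      out << (ch == '\\' ? ESCAPES.fetch(@s.getch) : ch)
    end
    @s.getch # consume the closing quote
    out
  end

  def parse_array
    @s.getch # consume '['
    items = []
    skip_ws
    if @s.peek(1) == ']'
      @s.getch
      return items
    end
    loop do
      items << parse_value
      skip_ws
      case @s.getch
      when ',' then next
      when ']' then return items
      else raise ArgumentError, "expected , or ] at #{@s.pos}"
      end
    end
  end

  def parse_object
    @s.getch # consume '{'
    hash = {}
    skip_ws
    if @s.peek(1) == '}'
      @s.getch
      return hash
    end
    loop do
      skip_ws
      key = parse_string
      skip_ws
      @s.getch == ':' or raise ArgumentError, "expected : at #{@s.pos}"
      hash[key] = parse_value
      skip_ws
      case @s.getch
      when ',' then next
      when '}' then return hash
      else raise ArgumentError, "expected , or } at #{@s.pos}"
      end
    end
  end
end
```

Calling TinyJSON.new('{"name":"Ruby","year":1995}').parse returns a Hash with a String and an Integer value. The real parser follows the same shape but works on raw bytes in C.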


5. Core JSON Types

JSON supports:

JSON      Ruby
object    Hash
array     Array
string    String
number    Integer / Float
true      true
false     false
null      nil

The parser dynamically creates Ruby VALUE objects internally.
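
The mapping is easy to check from Ruby. A minimal sketch; the sample document and variable names are made up for illustration:

```ruby
require 'json'

# One value of each JSON type, keyed by a short label.
doc = '{"o":{},"a":[],"s":"x","i":1,"f":1.5,"t":true,"n":null}'

# Map each key to the Ruby class the parser produced for its value.
classes = JSON.parse(doc).transform_values(&:class)

classes.each { |key, klass| puts "#{key}: #{klass}" }
```

Note that true and null come back as instances of TrueClass and NilClass, which are the classes of Ruby's true and nil.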


6. VALUE and Ruby C API

CRuby internally represents all objects using VALUE.

Example:

VALUE obj;

This may represent:

  • String
  • Array
  • Hash
  • Integer
  • Float
  • Symbol
  • Any Ruby object

The JSON extension heavily uses:

rb_hash_new()
rb_ary_new()
rb_utf8_str_new()
INT2FIX()
DBL2NUM()

These bridge native C code with Ruby objects.


7. String Parsing

String parsing is one of the most complicated parts.

The parser must handle:

  • Escaped quotes
  • UTF-8
  • Unicode escapes
  • Backslashes
  • Control characters

Example JSON:

{"message":"hello\nworld"}

The parser converts:

\n

into a real newline.

Unicode sequences:

\u2764

must also be decoded.
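
A short sketch of both cases as seen from Ruby. The JSON text is written in single-quoted Ruby literals so that the backslashes reach the parser untouched; the variable names are illustrative only.

```ruby
require 'json'

# Single quotes keep \n and \u2764 as literal backslash sequences in the JSON text.
parsed = JSON.parse('{"message":"hello\nworld","heart":"\u2764"}')

message = parsed['message']   # contains a real newline character
heart   = parsed['heart']     # the single character U+2764, stored as UTF-8

# Characters outside the Basic Multilingual Plane arrive as a surrogate pair.
emoji = JSON.parse('["\ud83d\ude00"]').first

puts message.lines.length     # 2
puts heart.bytes.length       # 3 bytes in UTF-8
puts emoji.encoding           # UTF-8
```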


8. Number Parsing

JSON numbers are tricky.

The parser must distinguish:

1
1.5
1e10
-4.2

Ruby internally decides whether values become:

  • Integer
  • Float
  • BigDecimal (optional)

The library contains specialized numeric conversion systems for speed.
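
A minimal sketch of how those choices surface in Ruby. It assumes the bigdecimal library is available in your environment and uses the decimal_class parse option; the variable names are illustrative.

```ruby
require 'json'
require 'bigdecimal'

# Digits only become Integer; a fraction or an exponent makes it a Float.
classes = JSON.parse('[1, 1.5, 1e10, -4.2]').map(&:class)

# Integers beyond 64 bits are still exact, because Ruby Integers are arbitrary precision.
big = JSON.parse('[123456789012345678901234567890]').first

# decimal_class swaps Float for another class, here BigDecimal.
exact = JSON.parse('[0.1]', decimal_class: BigDecimal).first

p classes
p big
p exact.class
```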


9. Floating Point Conversion

Vendor directory:

json/vendor/

Includes:

  • fpconv.c
  • ryu.h
  • jeaiii-ltoa.h

These are highly optimized algorithms for:

  • float-to-string conversion
  • integer formatting
  • accurate serialization

This is extremely important because:

JSON.generate({pi: Math::PI})

must produce deterministic and accurate output.


10. Why Float Serialization Is Hard

Binary floating point cannot precisely represent many decimal values.

Example:

0.1 + 0.2

Result:

0.30000000000000004

JSON generators must:

  • minimize precision loss
  • avoid invalid representations
  • serialize efficiently
  • preserve round-trip accuracy

That is why the JSON library vendors specialized algorithms instead of relying on generic formatting routines.
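
The round-trip requirement and the "invalid representations" case can both be observed from Ruby. A small sketch with illustrative variable names:

```ruby
require 'json'

sum  = 0.1 + 0.2
text = JSON.generate([sum])        # shortest text that still identifies this exact double
back = JSON.parse(text).first      # parsing it yields the same Float again

puts text
puts back == sum

# NaN and Infinity have no JSON representation, so generation refuses them by default.
nan_rejected = begin
  JSON.generate([Float::NAN])
  false
rescue JSON::GeneratorError
  true
end
puts nan_rejected
```

Passing allow_nan: true relaxes the last check, at the cost of emitting text that strict JSON parsers will reject.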


11. The Ryu Algorithm

File:

vendor/ryu.h

Ryu is a modern high-performance float serialization algorithm.

Goals:

  • shortest decimal representation
  • exact round-tripping
  • high performance

This is advanced systems engineering.

Most Ruby developers never realize JSON generation relies on sophisticated numerical algorithms.


12. Integer Serialization

File:

vendor/jeaiii-ltoa.h

Optimized integer-to-string conversion.

Why this matters:

For number-heavy payloads, a large share of generation time goes to converting numbers into strings.

Even tiny optimizations can significantly impact:

  • APIs
  • Rails apps
  • Sidekiq
  • Redis pipelines
  • GraphQL
  • microservices

13. Generator Internals

File:

json/generator/generator.c

Responsibilities:

  • object traversal
  • string escaping
  • numeric serialization
  • recursion handling
  • indentation
  • encoding validation
  • output buffering

14. Generator Pipeline

Ruby Object → Type Detection → Serializer Dispatch → Buffer Writer → Escaping / Formatting → Final JSON String

The generator recursively traverses Ruby objects.

Example:

{
user: {
name: 'Alice'
}
}

becomes nested serializer calls.


15. Recursive Structures

JSON generators must protect against recursive objects.

Example:

arr = []
arr << arr

This creates a circular reference.

Without protection:

infinite recursion
stack overflow

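The generator guards against this with a nesting depth limit (max_nesting, 100 by default) and raises JSON::NestingError once the limit is exceeded.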
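
The protection is easy to trigger. In the versions of the library I am aware of, a self-referencing array fails with JSON::NestingError once the default depth limit (max_nesting, 100) is exceeded. A minimal sketch:

```ruby
require 'json'

arr = []
arr << arr                      # the array now contains itself

error = begin
  JSON.generate(arr)
  nil
rescue JSON::NestingError => e
  e                             # raised instead of recursing forever
end

puts error.class
puts error.message
```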


16. String Escaping

The generator escapes:

  • quotes
  • backslashes
  • control characters
  • unicode sequences

Example:

JSON.generate({x: 'a\nb'})

Output:

{"x":"a\\nb"}

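Note that the single-quoted Ruby literal 'a\nb' holds a literal backslash followed by n, not a newline, so the generator escapes the backslash itself. With the double-quoted "a\nb", the string contains a real newline and the output is {"x":"a\nb"}.

Escaping runs over every byte of every string, so it is performance-critical.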
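
The escaping rules can be checked one category at a time. A small sketch; expected JSON text is written in single-quoted Ruby literals so that each backslash is shown as it appears in the output.

```ruby
require 'json'

newline   = JSON.generate(["a\nb"])        # real newline in, \n escape out
quote     = JSON.generate(['say "hi"'])    # inner quotes are escaped
backslash = JSON.generate(['a\b'])         # single quotes: literal backslash in the input
control   = JSON.generate(["\u0001"])      # other control characters use \uXXXX
ascii     = JSON.generate(["\u2764"], ascii_only: true)  # non-ASCII escaped on request

puts newline, quote, backslash, control, ascii
```

Without ascii_only, non-ASCII characters are written through as UTF-8 rather than escaped.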


17. Buffer Management

File:

fbuffer/fbuffer.h

The JSON generator avoids repeated Ruby string allocations.

Instead it uses internal expandable buffers.

Benefits:

  • fewer allocations
  • reduced GC pressure
  • improved throughput
  • lower memory fragmentation

This matters enormously in high-throughput Rails APIs.
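
The idea can be illustrated at the Ruby level, as an analogy only. The two helper methods below are hypothetical and have nothing to do with the actual contents of fbuffer.h: one builds a new String per step, the other appends into a single preallocated buffer.

```ruby
# Naive: += allocates a fresh String on every iteration.
def join_naive(items)
  out = ''
  items.each { |s| out += s }
  out
end

# Buffered: one String with reserved capacity, appended to in place.
def join_buffered(items)
  buf = String.new(encoding: Encoding::UTF_8, capacity: 4096)
  items.each { |s| buf << s }
  buf
end

parts = Array.new(1000) { |i| "item#{i}," }
puts join_naive(parts) == join_buffered(parts)
```

Both produce the same text, but the buffered version creates far fewer intermediate objects for the garbage collector to clean up.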


18. SIMD Optimizations

File:

simd/simd.h

One of the most interesting parts of the library.

SIMD means:

Single Instruction Multiple Data

Modern CPUs can process multiple bytes simultaneously.

JSON parsing benefits heavily from:

  • vectorized scanning
  • delimiter detection
  • quote searching
  • whitespace skipping

This dramatically accelerates parsing.


19. Why SIMD Matters

Without SIMD:

scan one byte at a time

With SIMD:

scan 16–64 bytes simultaneously

This can massively improve throughput for:

  • APIs
  • streaming systems
  • JSON-heavy services
  • GraphQL
  • telemetry pipelines
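
Ruby has no direct access to SIMD instructions, so the following is only an analogy: a classic "SIMD within a register" (SWAR) trick that tests eight bytes for a quote character with a handful of integer operations. It is not what simd.h does, and the constants and method names are made up for this sketch.

```ruby
ONES  = 0x0101010101010101   # 0x01 in every byte
HIGHS = 0x8080808080808080   # 0x80 in every byte

# True if any of the 8 bytes packed in `word` equals `byte`.
def word_has_byte?(word, byte)
  x = word ^ (ONES * byte)             # matching bytes become 0x00
  ((x - ONES) & ~x & HIGHS) != 0       # non-zero only if some byte is zero
end

# Byte offset of the first double quote, checking 8 bytes per step.
def index_of_quote(str)
  bytes = str.b
  full  = bytes.bytesize / 8
  bytes.unpack("Q<#{full}").each_with_index do |word, i|
    next unless word_has_byte?(word, 0x22)
    return bytes.index('"', i * 8)     # pinpoint the byte inside this word
  end
  bytes.index('"', full * 8)           # leftover tail, scanned normally
end

p index_of_quote('abcdefghij"k')
p index_of_quote('no quotes here at all')
```

Real SIMD code applies the same "test many bytes, then pinpoint" pattern with 16 to 64 byte registers and dedicated compare instructions.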

20. Ruby Additions

Directory:

lib/json/add/

Provides JSON serialization support for:

  • Date
  • Time
  • BigDecimal
  • Rational
  • Complex
  • Struct
  • Set
  • Range
  • OpenStruct
  • Exception

Example:

require 'json/add/time'
JSON.generate(Time.now)

These additions extend Ruby core classes with JSON support.
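
A round trip shows what an addition does. This sketch uses Range instead of Time so the result is deterministic; the create_additions option tells the parser to rebuild objects from the json_class marker.

```ruby
require 'json'
require 'json/add/range'

text = JSON.generate(1..3)                           # an object tagged with "json_class"

plain    = JSON.parse(text)                          # default: just a Hash
restored = JSON.parse(text, create_additions: true)  # rebuilt into a Range

puts text
p plain['json_class']
p restored
```

Only enable create_additions for input you trust, since it lets the document choose which Ruby classes get instantiated.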


21. GenericObject

File:

lib/json/generic_object.rb

Allows JSON objects to behave dynamically.

Example:

obj = JSON.parse(json, object_class: JSON::GenericObject)
obj.user.name

Instead of:

obj['user']['name']

22. State Objects

File:

lib/json/ext/generator/state.rb

State objects configure generation behavior.

Options include:

  • indentation
  • spacing
  • ascii-only mode
  • max nesting
  • circular reference handling

Example:

JSON.generate(obj, indent: ' ')
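
Several of these options combine to produce the familiar pretty-printed layout. A minimal sketch, using a made-up sample object:

```ruby
require 'json'

obj = { name: 'Ruby', tags: %w[fast fun] }

compact = JSON.generate(obj)

# Spelling out the formatting options by hand.
pretty = JSON.generate(obj, indent: '  ', space: ' ', object_nl: "\n", array_nl: "\n")

puts compact
puts pretty
puts pretty == JSON.pretty_generate(obj)
```

In the versions I have checked, these four options match the defaults that JSON.pretty_generate applies.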

23. Extension Build System

File:

extconf.rb

Ruby uses mkmf to compile native extensions.

Typical flow:

extconf.rb → Makefile generation → Native compilation → Shared library

This is how:

json/ext/parser.so

gets produced.


24. Memory Management

Native extensions must cooperate with Ruby’s garbage collector.

The JSON extension carefully manages:

  • object references
  • temporary allocations
  • parser buffers
  • recursion state

Incorrect handling would cause:

  • segmentation faults
  • memory corruption
  • leaks
  • GC crashes

25. Error Handling

Parser errors become Ruby exceptions.

Example:

JSON.parse('{')

Raises:

JSON::ParserError

Internally:

rb_raise(...)

creates Ruby exceptions from C.
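
On the Ruby side this is an ordinary exception that can be rescued. A minimal sketch; the helper name safe_parse is hypothetical:

```ruby
require 'json'

# Returns the parsed value, or nil when the text is not valid JSON.
def safe_parse(text)
  JSON.parse(text)
rescue JSON::ParserError => e
  warn "invalid JSON: #{e.message}"
  nil
end

p safe_parse('{"ok":true}')
p safe_parse('{')
```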


26. Encoding Handling

JSON requires Unicode support.

The parser validates:

  • UTF-8 correctness
  • escape sequences
  • invalid byte patterns

Ruby’s encoding system integrates deeply with the parser.
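
The same validation is visible on the generation side, which is easier to demonstrate. In this sketch a Latin-1 byte sits inside a string tagged as UTF-8, so the generator refuses it until the string is transcoded properly. Variable names are illustrative.

```ruby
require 'json'

bad = "caf\xE9"                     # 0xE9 alone is not valid UTF-8
puts bad.valid_encoding?            # false

rejected = begin
  JSON.generate([bad])
  false
rescue JSON::GeneratorError
  true
end
puts rejected

# Reinterpret the bytes as Latin-1, then transcode to real UTF-8.
fixed = bad.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
json  = JSON.generate([fixed])
puts json
```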


27. Performance Characteristics

The native extension is dramatically faster than pure Ruby parsing.

Reasons:

  • direct memory access
  • fewer allocations
  • optimized loops
  • SIMD support
  • specialized serialization algorithms
  • buffer reuse

This is why the C extension remains essential.


28. JSON and Rails

Rails depends heavily on JSON.

Used everywhere:

  • APIs
  • ActiveSupport
  • ActionCable
  • Turbo
  • GraphQL
  • serializers
  • Redis payloads

That means Ruby’s JSON extension is one of the most performance-critical libraries in the ecosystem.


29. Security Considerations

JSON parsers must defend against:

  • deeply nested structures
  • huge payloads
  • invalid UTF-8
  • malicious recursion
  • parser bombs

The library includes limits and validation logic for safety.


30. Final Thoughts

Ruby’s JSON library is far more sophisticated than most developers realize.

Under a simple API:

JSON.parse(json)

exists:

  • native C parsers
  • SIMD acceleration
  • advanced numeric serialization
  • memory management systems
  • Unicode handling
  • recursive traversal engines
  • buffer optimization infrastructure

This library represents decades of accumulated runtime engineering and performance work inside the Ruby ecosystem.
