Running AI Locally for Ruby Development: A Practical Guide with Ollama, Aider, and Your Own Codebase

May 28, 2026

Ruby Stack News — by Germán Silva

There’s a quiet revolution happening in developer tooling, and it doesn’t require a cloud subscription, an API key, or sending your proprietary code to someone else’s server.

Over the past few months I’ve been experimenting with running large language models entirely on my own machine while working on Ruby-LibGD, my gem for image generation in Ruby, and the experience has been interesting enough to write about.

This article covers a lot of ground: the philosophy behind local AI, what these models actually do (and what they decidedly don’t do), how to set up a productive environment in VSCode, and what it means for Ruby and Rails developers specifically.

Why Run AI Locally?

The pitch for local AI is simple: your code never leaves your machine.

No telemetry. No training-data contribution. No monthly fee per seat.

For indie developers working on proprietary gems, startups with NDAs, or anyone who’s ever winced at a Terms of Service clause, local inference removes the ambiguity entirely.

The tools that make this practical in 2026 are:

Ollama — serves open-weight models through a local HTTP API (:11434) with a docker pull-like experience for downloading and managing models.
Aider — a terminal-based AI pair programmer with strong Git integration.
Continue — a VSCode extension that connects your editor directly to Ollama-served models for inline completions and chat.

Together, they form a stack that competes surprisingly well with cloud coding assistants for everyday Ruby work — without depending on external APIs.
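Because Ollama is just an HTTP server, you can also talk to it from plain Ruby with nothing but the standard library. The sketch below assumes Ollama is running on its default port and that a model tagged qwen2.5-coder:7b has been pulled; the method names are my own, not part of any gem.

```ruby
require "json"
require "net/http"
require "uri"

# Default local endpoint; assumes Ollama is listening on port 11434.
OLLAMA_URL = URI("http://localhost:11434/api/generate")

# Build the HTTP request without sending it, so it can be inspected first.
def build_generate_request(prompt, model: "qwen2.5-coder:7b")
  request = Net::HTTP::Post.new(OLLAMA_URL, "Content-Type" => "application/json")
  # stream: false asks for a single JSON object instead of a stream of chunks.
  request.body = JSON.generate(model: model, prompt: prompt, stream: false)
  request
end

# Send the prompt to the local server and return the generated text.
# The long read timeout is an assumption: local generation can be slow.
def ask_local_model(prompt, model: "qwen2.5-coder:7b")
  response = Net::HTTP.start(OLLAMA_URL.host, OLLAMA_URL.port, read_timeout: 300) do |http|
    http.request(build_generate_request(prompt, model: model))
  end
  JSON.parse(response.body).fetch("response")
end
```

With the server running, something like puts ask_local_model("Explain Enumerable#each_slice in two sentences") should print the model's answer. Nothing in that round trip leaves localhost.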

The Big Question: Does It Learn or Just Remember?

This is probably the most misunderstood part of local AI.

Local LLMs do not learn from you automatically.

When you run a model with Ollama, the weights on disk are frozen. The model responding at 9 AM is byte-for-byte identical to the model responding at 5 PM regardless of every conversation you’ve had in between.

What feels like learning is actually something more limited: in-context memory.

The Context Window: Working Memory, Not Long-Term Storage

Every LLM has a context window — a fixed-size buffer of tokens it can see at any given moment.

Think of it as temporary working memory.

Everything inside that window is “known.” Everything outside it effectively doesn’t exist for that inference call.

The practical implications are important:

Long conversations feel coherent because earlier discussion still exists in context.
Once the context window fills up, older information gets dropped.
Restart Ollama or start a new session and the model begins from zero again unless you manually reload context.

This is a fundamental architectural property, not a limitation of smaller models.

Even very large open-weight models work this way.
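To make the "working memory" idea concrete, here is a minimal sketch of a sliding window over a conversation. It assumes a crude estimate of about four characters per token, which is only a heuristic, and it is not how Ollama or any particular runtime trims context internally; the method names are hypothetical.

```ruby
# Very rough token estimate: about four characters per token.
# This heuristic is an assumption; real tokenizers vary by model.
def estimate_tokens(text)
  (text.length / 4.0).ceil
end

# Keep the most recent messages whose combined estimate fits the budget.
# Older messages are dropped first, like a window sliding over the conversation.
def trim_to_window(messages, max_tokens:)
  kept = []
  used = 0
  messages.reverse_each do |message|
    cost = estimate_tokens(message[:content])
    break if used + cost > max_tokens
    kept.unshift(message)
    used += cost
  end
  kept
end

history = [
  { role: "user", content: "a" * 40 },       # ~10 tokens
  { role: "assistant", content: "b" * 40 },  # ~10 tokens
  { role: "user", content: "c" * 40 }        # ~10 tokens
]

trimmed = trim_to_window(history, max_tokens: 25)
puts trimmed.map { |m| m[:content][0] }.join(", ")  # prints "b, c"
```

The first message falls out of the window even though nothing about the model changed. That is the whole effect: from the model's point of view, the dropped message was never said.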

What About Persistence?

You can approximate persistence by managing context yourself.

Useful techniques include:

System prompts — loading project-specific context and conventions at session start.
Aider’s repo map — automatic codebase summaries and structural awareness.
Conversation logs — saving and reinjecting previous sessions.
Fine-tuning — permanently modifying model weights using custom training data.

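Of these, only fine-tuning changes the weights themselves; the other three work by putting information back into the context window. For most day-to-day Ruby work, though, good context management already gets you surprisingly far.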
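The system-prompt and conversation-log techniques can be sketched in a few lines of Ruby. This is an illustration under assumptions: the role/content message shape follows the common chat format, and the method names and file layout are hypothetical.

```ruby
require "json"

# Build the message list for a new session: project conventions first,
# then any saved history, then the new question.
def build_session(system_prompt:, history:, question:)
  [{ "role" => "system", "content" => system_prompt }] +
    history +
    [{ "role" => "user", "content" => question }]
end

# Save the conversation (minus the system prompt) so a later session can reload it.
def save_history(path, messages)
  File.write(path, JSON.pretty_generate(messages.reject { |m| m["role"] == "system" }))
end

# Load a saved conversation, or start empty if nothing was saved yet.
def load_history(path)
  File.exist?(path) ? JSON.parse(File.read(path)) : []
end
```

A new session would call load_history, pass the result to build_session along with the project's conventions, and call save_history once the model has replied. The model itself stays stateless; the "memory" lives entirely in that file.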

Ruby-LibGD: A Real-World Test Case

My gem Ruby-LibGD wraps the LibGD image-processing library for Ruby — image compositing, drawing primitives, pixel manipulation, and graphics generation.

That made it an interesting benchmark for local AI because it combines:

Ruby,
native C extensions,
graphics processing,
and APIs that aren’t heavily represented in tutorials.

The results were mixed — but useful.

On routine tasks like:

generating documentation examples,
scaffolding RSpec tests,
creating small usage examples,
explaining unfamiliar code,
or drafting repetitive wrapper methods,

even smaller 7B coding models performed surprisingly well.

Where things became less reliable:

hallucinated LibGD function names,
incorrect assumptions about memory ownership,
unsafe C-extension patterns,
invented APIs that looked plausible but didn’t exist.

This is where human supervision remains absolutely essential.

The models are often directionally correct without being technically correct.
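One cheap guard against invented Ruby-level APIs is to check suggested method names against the real class before trusting them. The sketch below uses String as a stand-in so it runs anywhere; the helper name is hypothetical, and it does nothing for hallucinated C function names, which only the compiler and the library headers will catch.

```ruby
# Return the method names a model suggested that the class does not define.
# method_defined? only sees public and protected instance methods, so APIs
# built on method_missing would be reported as unknown.
def unknown_methods(klass, suggested_names)
  suggested_names.reject { |name| klass.method_defined?(name) }
end

# String stands in for the real class here; the second name is invented.
suggestions = %i[upcase reverse_words]
puts unknown_methods(String, suggestions).inspect  # prints [:reverse_words]
```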

What These Models Are Actually Good At (With Ruby)

After weeks of experimentation across Ruby-LibGD and Rails projects, the strengths and weaknesses became fairly predictable.

Strong Areas

Boilerplate generation
RSpec scaffolding
Documentation drafting
Refactoring small methods
Explaining unfamiliar gems
Commit message drafting
Small multi-file edits

Weak Areas

Rails version-specific behavior
Lesser-known gems
Complex ActiveRecord query construction
Deep metaprogramming
Database-aware changes without schema context

Always Review Carefully

Some areas should always receive full human review:

authentication,
authorization,
cryptography,
concurrency,
memory management,
performance-critical code paths.

Local AI doesn’t replace engineering judgment.

What it does do is reduce repetitive cognitive overhead.

Automatic Git Commits and Code Supervision

One of Aider’s strongest features is its Git integration.

By default, Aider commits each change it applies, with a generated message describing what changed.

Instead of:

misc fixes

you get something closer to:

Add GD::Image#composite alpha blending support

Implement pixel-level alpha compositing using gdImageCopyMerge.
Add corresponding RSpec tests covering opacity edge cases.

That transforms Git history into a readable audit trail of AI-assisted development.

You can:

git diff,
git revert,
git bisect,
or inspect individual changes exactly like any other workflow.

The supervision loop that works well is:

Request a change in natural language
Review the generated diff carefully
Accept the modification, or reject it (Aider's /undo command reverts its last commit)
Run the test suite
Let Aider attempt self-correction if tests fail

For Ruby projects with strong RSpec coverage, this becomes surprisingly effective.

The tests become the supervising authority rather than the model itself.
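The loop above can be written down as a small piece of Ruby. This is a sketch of the control flow only, not Aider's implementation: run_tests and request_fix are hypothetical callables you would wire to your own tools, and the retry limit is an arbitrary assumption.

```ruby
# Run one supervised step: test the change, allow a limited number of
# repair attempts, and report whether the suite ended up green.
# run_tests must return a truthy value when the suite passes.
def supervise(run_tests:, request_fix:, max_attempts: 2)
  attempts = 0
  until run_tests.call
    return :rejected if attempts >= max_attempts
    attempts += 1
    request_fix.call(attempts)
  end
  :accepted
end

# Example wiring for an RSpec project (left as a comment so nothing runs here):
#   run_tests   = -> { system("bundle", "exec", "rspec") }
#   request_fix = ->(n) { puts "Attempt #{n}: ask the model to fix the failures" }
#   supervise(run_tests: run_tests, request_fix: request_fix)
```

The point of the retry cap is that the test suite, not the model, decides when a change is done, and a change that cannot be made green in a couple of attempts goes back to a human.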

VSCode Setup with Continue

Continue is the VSCode extension that connects your editor to Ollama-served models.

The setup is straightforward.

Installation

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull coding models
ollama pull qwen2.5-coder:7b
ollama pull qwen2.5-coder:32b

# Verify Ollama
curl http://localhost:11434/api/tags

Then install the Continue extension from the VSCode marketplace.
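Before pointing the editor at Ollama, it can be worth confirming that both models were actually pulled. The sketch below does the same /api/tags check from Ruby; it assumes the response is a JSON object with a "models" array whose entries carry a "name" field, and the method names are my own.

```ruby
require "json"
require "net/http"
require "uri"

REQUIRED_MODELS = ["qwen2.5-coder:7b", "qwen2.5-coder:32b"].freeze

# Given the JSON body returned by /api/tags, list required models not yet pulled.
def missing_models(tags_json, required: REQUIRED_MODELS)
  installed = JSON.parse(tags_json).fetch("models", []).map { |m| m["name"] }
  required - installed
end

# Fetch the tag list from the local Ollama server (default port assumed).
def fetch_tags(base_url = "http://localhost:11434")
  Net::HTTP.get(URI("#{base_url}/api/tags"))
end
```

With Ollama running, puts missing_models(fetch_tags).inspect should print an empty array once both pulls have finished.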

Continue Configuration

A practical Ruby setup looks like this:

{
  "models": [
    {
      "title": "Qwen2.5-Coder 32B (Chat)",
      "provider": "ollama",
      "model": "qwen2.5-coder:32b",
      "apiBase": "http://localhost:11434"
    }
  ],
  "tabAutocompleteModel": {
    "title": "Qwen2.5-Coder 7B (Autocomplete)",
    "provider": "ollama",
    "model": "qwen2.5-coder:7b",
    "apiBase": "http://localhost:11434"
  }
}

The split between models matters.

Smaller models work well for autocomplete because latency matters more than reasoning quality.

Larger models work better for:

refactoring,
architecture discussions,
debugging,
and multi-file reasoning.

Using both together makes the workflow feel substantially more responsive.
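If you want to see the latency difference on your own hardware, Ollama's non-streaming responses include timing fields you can turn into a tokens-per-second figure. The sketch assumes the eval_count and eval_duration fields (the latter in nanoseconds); the sample numbers are made up for illustration, not measurements.

```ruby
# Compute generation speed from a parsed Ollama response hash.
# eval_count is the number of generated tokens; eval_duration is in nanoseconds.
def tokens_per_second(response)
  count = response.fetch("eval_count")
  nanos = response.fetch("eval_duration")
  return 0.0 if nanos.zero?
  (count / (nanos / 1_000_000_000.0)).round(1)
end

# Illustrative values only: 120 tokens generated in 4 seconds.
sample = { "eval_count" => 120, "eval_duration" => 4_000_000_000 }
puts tokens_per_second(sample)  # prints 30.0
```

Running the same prompt against each model and comparing the two figures gives a rough sense of which one is fast enough for autocomplete on your machine.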

Using Continue in Practice

Some of the most useful workflows:

Cmd/Ctrl + I — inline editing on selected code
Cmd/Ctrl + L — open chat sidebar
@codebase — ask questions about the indexed repository
@diff — inject the current Git diff into context

The @codebase functionality is what makes Continue useful beyond single-file edits.

Once the model can inspect repository structure, responses become dramatically more coherent.

Which Models Work Best for Ruby?

For Ollama-based Ruby development, the current landscape looks roughly like this:

| Model | Size | Strengths for Ruby | Notes |
|---|---|---|---|
| qwen2.5-coder:7b | ~4GB | Fast autocomplete, decent Ruby idioms | Great latency/performance balance |
| qwen2.5-coder:32b | ~20GB | Strong reasoning and Rails understanding | Best overall coding experience |
| deepseek-coder-v2 | varies | Good multi-file reasoning | Strong with Aider |
| granite-code:8b | ~5GB | Lightweight and permissive | Weaker Ruby/Rails idioms |

For most Ruby developers, the practical setup is:

a smaller model for autocomplete,
and a larger model for chat/refactoring.

If your hardware can’t comfortably run 32B models, 7B models are still genuinely useful productivity tools.
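A quick back-of-the-envelope calculation helps when deciding what your machine can hold. The sketch assumes roughly 4.5 bits per weight, a guess at typical 4-bit quantization overhead, and it counts the weights only; the context cache and runtime need memory on top of that.

```ruby
# Rough size of the weights alone, in gigabytes:
# parameters (in billions) * bits per weight / 8 bits per byte.
# The 4.5 bits-per-weight default is an assumption, not a measured value.
def estimated_weight_gb(params_billions, bits_per_weight: 4.5)
  (params_billions * bits_per_weight / 8.0).round(1)
end

puts estimated_weight_gb(7)   # prints 3.9
puts estimated_weight_gb(32)  # prints 18.0
```

Those estimates land in the same ballpark as the download sizes in the table. Treat them as a lower bound on the memory you need, not a guarantee that the model will run comfortably.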

The Bigger Picture for Ruby Developers

Local AI doesn’t eliminate difficult engineering work.

It removes friction.

Tasks like:

writing boilerplate,
scaffolding tests,
drafting documentation,
explaining unfamiliar code,
and repetitive refactoring

become substantially faster.

That leaves more mental bandwidth available for the work that actually matters:

architecture,
debugging,
API design,
performance analysis,
production reliability.

For the Ruby ecosystem specifically, there’s also an opportunity here.

Ruby and Rails are well represented in training data, but the ecosystem around local AI tooling still lags behind Python and JavaScript.

There is room for:

Ruby-focused workflows,
better prompts,
editor integrations,
benchmarking,
and tooling built specifically for Ruby developers.

This article is my contribution toward that discussion.

If you’re experimenting with local AI workflows for Ruby or Rails development, I’d genuinely be interested in hearing what setups and models are working well for you.

Ruby-LibGD is available on GitHub.

Ruby Stack News covers Ruby internals, Rails, tooling, open source, and systems programming across the Ruby ecosystem.

Published by ggerman
