Performance Guide: Creating 100k Records in Less Than 3 Seconds with Ruby on Rails

November 21, 2024

Working on large-scale projects often requires generating substantial amounts of test or dummy data efficiently. In this tutorial, we’ll explore various methods to create 100,000 records in Ruby on Rails while benchmarking their performance.

This article is inspired by "Performance Guide to Create 100k Records in Less Than 3s Using Ruby on Rails" (https://dev.to/pimp_my_ruby/performance-guide-to-create-100k-records-in-less-than-3s-using-ruby-on-rails-3k07).


Dataset Setup

For this guide, we’ll use the following Postgres schema:

# db/schema.rb
create_table "accounts", force: :cascade do |t|
  t.string "first_name"
  t.string "last_name"
  t.string "phone"
  t.string "email"
  t.string "role"
end

Using FactoryBot, we’ll prepare the data:

accounts = FactoryBot.build_list(:account, 100_000)
accounts_attributes = accounts.map(&:attributes)

The accounts variable holds unpersisted ActiveRecord objects, and accounts_attributes contains hashes with their attributes.


Methods to Create Records

Here are the methods we’ll evaluate:

1. .save and .save!

The simplest way to save records involves iterating through each object and calling .save:

accounts.each do |account|
  account.save
end

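However, this is slow because every call issues its own INSERT wrapped in its own transaction, so 100,000 records mean 100,000 round trips and 100,000 commits. .save! performs about the same; the difference is that it raises an exception when validation fails, whereas .save returns false.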

Optimization with Transactions

Wrapping the loop in a single transaction speeds it up, because the database commits once at the end rather than after every record:

Account.transaction do
  accounts.each(&:save)
end

2. .create and .create!

Account.create builds and saves a record from an attributes hash in a single call:

accounts_attributes.each do |attrs|
  Account.create(attrs)
end

.create! does the same but raises an exception when a record is invalid. As with .save, wrapping the loop in a single transaction helps:

Account.transaction do
  accounts_attributes.each do |attrs|
    Account.create!(attrs)
  end
end

Hashes Variant

.create also accepts an array of hashes:

Account.create(accounts_attributes)

This creates every record, but it is no faster than the explicit loop: .create simply iterates over the array and saves each record with its own query.

3. .insert_all

Introduced in Rails 6, .insert_all allows bulk inserts without validations or callbacks:

Account.insert_all(accounts_attributes)

This method is extremely fast because all rows are sent to the database in a single multi-row INSERT statement rather than one statement per record.
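One practical detail: calling attributes on an unsaved record returns an "id" key set to nil, and a single statement carrying 100,000 rows can get large. The sketch below is one way to prepare the rows, under those assumptions. The helper name insert_batches and the batch size are hypothetical, and it uses stand-in data so it runs without Rails.

```ruby
# Hypothetical helper (not part of Rails): drops a nil "id" from each row so
# the database can assign the primary key, then groups the rows into batches.
def insert_batches(rows, batch_size: 10_000)
  rows
    .map { |row| row.reject { |key, value| key.to_s == "id" && value.nil? } }
    .each_slice(batch_size)
    .to_a
end

# Stand-in data shaped like accounts_attributes from the setup section.
sample_rows = Array.new(25) do |i|
  { "id" => nil, "first_name" => "User#{i}", "email" => "user#{i}@example.com" }
end

batches = insert_batches(sample_rows, batch_size: 10)

# In a Rails app, each batch would then go to the bulk API, for example:
#   insert_batches(accounts_attributes).each { |batch| Account.insert_all(batch) }
```

Batching is optional. It trades a few extra statements for smaller individual queries, which may be easier on memory for very large datasets.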

4. .upsert_all

Similar to .insert_all, .upsert_all works in a single query, inserting new rows and updating rows that conflict with an existing unique index:

Account.upsert_all(accounts_attributes)

This method suits bulk updates and, like .insert_all, gets its speed from skipping callbacks and validations.
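On PostgreSQL, an upsert fails if the same conflict key appears twice in one statement ("ON CONFLICT DO UPDATE command cannot affect row a second time"), so it can help to deduplicate the rows first. The following is a minimal sketch with a hypothetical helper, dedupe_by, and it assumes a unique index on email, which the schema above does not define.

```ruby
# Hypothetical helper: keeps only the last row seen for each value of `key`,
# preserving the position of the first occurrence.
def dedupe_by(rows, key)
  rows.each_with_object({}) { |row, latest| latest[row[key]] = row }.values
end

rows = [
  { "email" => "a@example.com", "role" => "user" },
  { "email" => "b@example.com", "role" => "user" },
  { "email" => "a@example.com", "role" => "admin" }
]

unique_rows = dedupe_by(rows, "email")

# In a Rails app with a unique index on email (an assumption):
#   Account.upsert_all(unique_rows, unique_by: :email)
```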

5. activerecord-import

Using the activerecord-import gem:

# Shell: add the gem to the Gemfile
bundle add activerecord-import
# Ruby: bulk-import the attribute hashes
Account.import(accounts_attributes)

This gem is highly efficient: it runs model validations by default while still grouping the rows into multi-row INSERT statements.


Performance Benchmark

Here are the benchmark results for creating 100,000 records:
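To reproduce measurements of this kind on your own machine, a small harness built on Ruby's standard Benchmark module could look like the sketch below. The helper name time_strategies and the stand-in workloads are hypothetical. In a real run each lambda would wrap one of the methods above.

```ruby
require "benchmark"

# Hypothetical helper: runs each labelled strategy once and returns a hash
# of label => elapsed wall-clock seconds.
def time_strategies(strategies)
  strategies.transform_values { |strategy| Benchmark.realtime { strategy.call } }
end

# Stand-in workloads so the sketch runs without a database.
timings = time_strategies(
  "strategy_a" => -> { 1_000.times { |i| i.to_s } },
  "strategy_b" => -> { (0...1_000).map(&:to_s) }
)

# Print the fastest strategy first.
timings.sort_by { |_label, seconds| seconds }.each do |label, seconds|
  puts format("%-12s %.6fs", label, seconds)
end

# In a Rails console the lambdas would wrap the real calls, for example:
#   "insert_all" => -> { Account.insert_all(accounts_attributes) }
```

For a fair comparison, the table would need to be emptied between strategies (for example with Account.delete_all), and absolute numbers will vary with hardware and database configuration.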


Key Takeaways

Methods Comparison

  • Slow: .save and .create perform poorly for large datasets due to frequent database calls.
  • Fast: .insert_all and .upsert_all offer blazing-fast performance by leveraging batch SQL queries.
  • Balanced: activerecord-import balances speed and validations, making it suitable for scenarios where validations are required.
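The gap between the slow and fast groups comes down largely to how many statements reach the database. The toy illustration below only builds SQL strings and never touches a database. The helper names are hypothetical, and the generated SQL is only roughly the shape Rails produces.

```ruby
# Escapes a value as a SQL string literal. Illustration only: real code
# should rely on the database adapter's quoting, not hand-built SQL.
def quote(value)
  "'#{value.to_s.gsub("'", "''")}'"
end

# One INSERT statement per row, roughly what a .save or .create loop sends.
def per_row_statements(table, rows)
  rows.map do |row|
    values = row.values.map { |v| quote(v) }.join(", ")
    "INSERT INTO #{table} (#{row.keys.join(', ')}) VALUES (#{values})"
  end
end

# A single multi-row INSERT, roughly what a bulk API sends.
def bulk_statement(table, rows)
  columns = rows.first.keys
  tuples = rows.map { |row| "(#{columns.map { |c| quote(row[c]) }.join(', ')})" }
  "INSERT INTO #{table} (#{columns.join(', ')}) VALUES #{tuples.join(', ')}"
end

rows = [
  { "first_name" => "Ada", "email" => "ada@example.com" },
  { "first_name" => "Tim", "email" => "tim@example.com" }
]

per_row = per_row_statements("accounts", rows) # one statement per record
bulk = bulk_statement("accounts", rows)        # one statement in total
```

With 100,000 rows, the first approach yields 100,000 statements and the second yields one, which is why the bulk methods spend so much less time on round trips.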

Recommendations

  • For raw speed: Use .insert_all or .upsert_all when validations and callbacks are unnecessary.
  • For validations: Opt for activerecord-import.

Conclusion

Creating 100,000 records in Rails can be accomplished in seconds using optimized methods. Whether you prioritize speed, data integrity, or scalability, Rails offers a range of tools to suit your needs. By selecting the appropriate method and leveraging batch operations, you can significantly improve performance in large-scale applications.
