Ruby.CodeCompared.To/SQL

An interactive executable cheatsheet for Rubyists learning SQL

Ruby 4.0 SQLite 3.x
Basics
Run a simple expression
puts 2026 - 1995
SELECT 2026 - 1995 AS years_ago;
SQL can evaluate expressions without referencing any table. The AS keyword gives the result column a name. In Ruby you compute expressions directly.
Arithmetic and math functions
puts (-7).abs
puts 3.14159.round(2)
puts Math.sqrt(144)
SELECT abs(-7), round(3.14159, 2), sqrt(144);
SQL's built-in math functions (abs, round, sqrt, ceil, floor, pow, log) map closely to Ruby's Numeric methods and the Math module.
String functions
puts 'hello'.upcase
puts ' hello '.strip
puts 'hello world'.length
SELECT upper('hello'), trim(' hello '), length('hello world');
SQLite string functions (upper, lower, trim, length, substr, replace, instr) correspond directly to Ruby's String methods. SQL uses positional arguments; Ruby uses methods on the receiver.
String concatenation
puts 'Hello, ' + 'World!'
puts "Alice" + " " + "Smith"
SELECT 'Hello, ' || 'World!' AS greeting, 'Alice' || ' ' || 'Smith' AS full_name;
SQL uses || for string concatenation; Ruby uses + or string interpolation. The || operator is defined by the SQL standard; some databases use CONCAT() instead, but SQLite uses ||.
Date and time
require 'date'

puts Date.today
puts (Date.today >> 1).to_s
SELECT date('now') AS today, date('now', '+1 month') AS next_month;
SQLite stores dates as text in ISO 8601 format (YYYY-MM-DD). The date(), time(), and datetime() functions accept modifier strings like '+1 month', '-7 days', and 'start of year'. Ruby's Date and Time classes offer equivalent arithmetic.
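Ruby's Date arithmetic mirrors SQLite's modifier strings closely. A minimal sketch using only the stdlib date library (the fixed date and variable names are illustrative):

```ruby
require 'date'

today = Date.new(2026, 3, 15)             # fixed date so the output is stable

next_month   = today >> 1                 # like date(..., '+1 month')
week_earlier = today - 7                  # like date(..., '-7 days')
year_start   = Date.new(today.year, 1, 1) # like date(..., 'start of year')

puts next_month, week_earlier, year_start
```

Date#>> shifts by whole months and clamps the day when the target month is shorter, much as SQLite's '+1 month' modifier normalizes overflow dates.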
Reading Data
Select all rows
employees = [
  { id: 1, name: 'Alice', department: 'Engineering', salary: 90_000 },
  { id: 2, name: 'Bob', department: 'Marketing', salary: 75_000 },
  { id: 3, name: 'Carol', department: 'Engineering', salary: 85_000 },
]

employees.each { |employee| puts employee.values.join(', ') }

CREATE TABLE employees (id INTEGER, name TEXT, department TEXT, salary REAL);

INSERT INTO employees VALUES
  (1, 'Alice', 'Engineering', 90000),
  (2, 'Bob', 'Marketing', 75000),
  (3, 'Carol', 'Engineering', 85000);

SELECT * FROM employees;
Employee.all in ActiveRecord generates SELECT * FROM employees. The * wildcard selects every column; in production code, prefer naming columns explicitly to avoid surprises when the schema changes.
Select specific columns
employees = [
  { id: 1, name: 'Alice', department: 'Engineering', salary: 90_000 },
  { id: 2, name: 'Bob', department: 'Marketing', salary: 75_000 },
  { id: 3, name: 'Carol', department: 'Engineering', salary: 85_000 },
]

employees.each { |employee| puts "#{employee[:name]}: #{employee[:salary]}" }

CREATE TABLE employees (id INTEGER, name TEXT, department TEXT, salary REAL);

INSERT INTO employees VALUES
  (1, 'Alice', 'Engineering', 90000),
  (2, 'Bob', 'Marketing', 75000),
  (3, 'Carol', 'Engineering', 85000);

SELECT name, salary FROM employees;

Employee.select(:name, :salary) in ActiveRecord. Selecting only the columns you need reduces memory usage and network transfer, especially important when tables have many columns or large text/blob fields.
Computed column with alias
employees = [
  { name: 'Alice', salary: 90_000 },
  { name: 'Bob', salary: 75_000 },
  { name: 'Carol', salary: 85_000 },
]

employees.each do |employee|
  puts "#{employee[:name]}: #{(employee[:salary] / 12.0).round(2)}"
end

CREATE TABLE employees (id INTEGER, name TEXT, salary REAL);

INSERT INTO employees VALUES (1, 'Alice', 90000), (2, 'Bob', 75000), (3, 'Carol', 85000);

SELECT name, salary / 12.0 AS monthly_salary FROM employees;

Employee.select("name, salary / 12.0 AS monthly_salary") in ActiveRecord. The AS keyword renames a column in the result set; it does not affect the underlying data. Any expression can be aliased: arithmetic, function calls, or CASE expressions.
Distinct values
employees = [
  { name: 'Alice', department: 'Engineering' },
  { name: 'Bob', department: 'Marketing' },
  { name: 'Carol', department: 'Engineering' },
]

puts employees.map { |e| e[:department] }.uniq

CREATE TABLE employees (id INTEGER, name TEXT, department TEXT);

INSERT INTO employees VALUES (1, 'Alice', 'Engineering'), (2, 'Bob', 'Marketing'), (3, 'Carol', 'Engineering');

SELECT DISTINCT department FROM employees;

Employee.distinct.pluck(:department) in ActiveRecord. DISTINCT removes duplicate rows from the result. It applies to the entire row, not just one column: SELECT DISTINCT first_name, last_name returns unique (first, last) pairs.
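The whole-row behavior maps to calling uniq on arrays of value pairs in Ruby. A small sketch (the people data is made up for illustration):

```ruby
people = [
  { first: 'Alice', last: 'Smith' },
  { first: 'Alice', last: 'Brown' },
  { first: 'Alice', last: 'Smith' },
]

# Like SELECT DISTINCT first_name, last_name: deduplicate the
# (first, last) pair as a unit, not each column independently.
pairs = people.map { |p| [p[:first], p[:last]] }.uniq

pairs.each { |first, last| puts "#{first} #{last}" }
```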
Filtering
Filter by exact value
employees = [
  { name: 'Alice', department: 'Engineering', salary: 90_000 },
  { name: 'Bob', department: 'Marketing', salary: 75_000 },
  { name: 'Carol', department: 'Engineering', salary: 85_000 },
]

engineers = employees.select { |e| e[:department] == 'Engineering' }
engineers.each { |e| puts "#{e[:name]}: #{e[:salary]}" }

CREATE TABLE employees (id INTEGER, name TEXT, department TEXT, salary REAL);

INSERT INTO employees VALUES
  (1, 'Alice', 'Engineering', 90000),
  (2, 'Bob', 'Marketing', 75000),
  (3, 'Carol', 'Engineering', 85000);

SELECT name, salary FROM employees WHERE department = 'Engineering';
Employee.where(department: 'Engineering') in ActiveRecord. The WHERE clause filters rows before they are returned. String literals in SQL use single quotes; double quotes are for identifiers (table and column names).
Filter by comparison
employees = [
  { name: 'Alice', salary: 90_000 },
  { name: 'Bob', salary: 75_000 },
  { name: 'Carol', salary: 85_000 },
  { name: 'Dave', salary: 70_000 },
]

puts employees.select { |e| e[:salary] > 80_000 }.map { |e| "#{e[:name]}: #{e[:salary]}" }

CREATE TABLE employees (id INTEGER, name TEXT, salary REAL);

INSERT INTO employees VALUES (1, 'Alice', 90000), (2, 'Bob', 75000), (3, 'Carol', 85000), (4, 'Dave', 70000);

SELECT name, salary FROM employees WHERE salary > 80000;

Employee.where('salary > ?', 80_000) in ActiveRecord. The SQL comparison operators >, <, >=, <=, = and != (or <>) work on numbers, dates, and strings.
Multiple conditions: AND / OR
employees = [
  { name: 'Alice', department: 'Engineering', salary: 90_000 },
  { name: 'Bob', department: 'Marketing', salary: 75_000 },
  { name: 'Carol', department: 'Engineering', salary: 85_000 },
  { name: 'Dave', department: 'Marketing', salary: 92_000 },
]

result = employees.select { |e| e[:department] == 'Engineering' || e[:salary] > 90_000 }
result.each { |e| puts "#{e[:name]} (#{e[:department]}): #{e[:salary]}" }

CREATE TABLE employees (id INTEGER, name TEXT, department TEXT, salary REAL);

INSERT INTO employees VALUES
  (1, 'Alice', 'Engineering', 90000),
  (2, 'Bob', 'Marketing', 75000),
  (3, 'Carol', 'Engineering', 85000),
  (4, 'Dave', 'Marketing', 92000);

SELECT name, department, salary
FROM employees
WHERE department = 'Engineering' OR salary > 90000;
Employee.where(department: 'Engineering').or(Employee.where('salary > ?', 90_000)) in ActiveRecord. Use parentheses to control precedence when mixing AND and OR: (A OR B) AND C behaves differently from A OR (B AND C).
Filter against a list: IN
employees = [
  { name: 'Alice', department: 'Engineering' },
  { name: 'Bob', department: 'Marketing' },
  { name: 'Carol', department: 'Finance' },
  { name: 'Dave', department: 'Engineering' },
]

target_departments = ['Engineering', 'Finance']

result = employees.select { |e| target_departments.include?(e[:department]) }
result.each { |e| puts "#{e[:name]}: #{e[:department]}" }

CREATE TABLE employees (id INTEGER, name TEXT, department TEXT);

INSERT INTO employees VALUES
  (1, 'Alice', 'Engineering'),
  (2, 'Bob', 'Marketing'),
  (3, 'Carol', 'Finance'),
  (4, 'Dave', 'Engineering');

SELECT name, department FROM employees WHERE department IN ('Engineering', 'Finance');

Employee.where(department: ['Engineering', 'Finance']) in ActiveRecord. IN is equivalent to a series of OR equality checks. NOT IN excludes those values, but watch out: NOT IN (list) returns no rows if NULL is in the list.
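Ruby's negated include? is the analog of NOT IN, and it has no NULL surprise. A sketch (the data is made up; the SQL behavior is described in the comment):

```ruby
employees = [
  { name: 'Alice', department: 'Engineering' },
  { name: 'Bob',   department: 'Marketing' },
  { name: 'Carol', department: nil },
]

excluded = ['Engineering']

# In Ruby, nil is just another value: Bob and Carol both pass the filter.
# In SQL, WHERE department NOT IN ('Engineering', NULL) would return no
# rows at all, because every comparison against NULL yields UNKNOWN.
kept = employees.reject { |e| excluded.include?(e[:department]) }

kept.each { |e| puts e[:name] }
```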
Filter by range: BETWEEN
employees = [
  { name: 'Alice', salary: 90_000 },
  { name: 'Bob', salary: 75_000 },
  { name: 'Carol', salary: 85_000 },
  { name: 'Dave', salary: 70_000 },
]

result = employees.select { |e| (75_000..87_000).cover?(e[:salary]) }
result.each { |e| puts "#{e[:name]}: #{e[:salary]}" }

CREATE TABLE employees (id INTEGER, name TEXT, salary REAL);

INSERT INTO employees VALUES (1, 'Alice', 90000), (2, 'Bob', 75000), (3, 'Carol', 85000), (4, 'Dave', 70000);

SELECT name, salary FROM employees WHERE salary BETWEEN 75000 AND 87000;

Employee.where(salary: 75_000..87_000) in ActiveRecord. BETWEEN low AND high is inclusive on both ends, equivalent to salary >= 75000 AND salary <= 87000. Ruby's Range#cover? is also inclusive by default for numeric ranges.
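Ruby can also express an end-exclusive range with three dots, something BETWEEN has no keyword for. A sketch contrasting the two (sample salaries are illustrative):

```ruby
salaries = [70_000, 75_000, 85_000, 87_000, 90_000]

# Two dots: inclusive on both ends, like BETWEEN 75000 AND 87000.
inclusive = salaries.select { |s| (75_000..87_000).cover?(s) }

# Three dots: excludes the upper bound; in SQL you would write
# salary >= 75000 AND salary < 87000 instead.
exclusive = salaries.select { |s| (75_000...87_000).cover?(s) }

puts inclusive.inspect
puts exclusive.inspect
```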
Pattern matching: LIKE
employees = [
  { name: 'Alice Smith', email: 'alice@example.com' },
  { name: 'Bob Jones', email: 'bob@example.com' },
  { name: 'Alice Brown', email: 'alice.b@example.com' },
]

result = employees.select { |e| e[:name].start_with?('Alice') }
result.each { |e| puts "#{e[:name]} - #{e[:email]}" }

CREATE TABLE employees (id INTEGER, name TEXT, email TEXT);

INSERT INTO employees VALUES
  (1, 'Alice Smith', 'alice@example.com'),
  (2, 'Bob Jones', 'bob@example.com'),
  (3, 'Alice Brown', 'alice.b@example.com');

SELECT name, email FROM employees WHERE name LIKE 'Alice%';

Employee.where("name LIKE ?", "Alice%") in ActiveRecord. In LIKE patterns, % matches any sequence of characters and _ matches exactly one character. SQLite's LIKE is case-insensitive for ASCII letters by default; in PostgreSQL, use ILIKE (or LIKE with lower()) for case-insensitive matching.
NULL checks: IS NULL / IS NOT NULL
employees = [
  { name: 'Alice', manager_id: nil },
  { name: 'Bob', manager_id: 1 },
  { name: 'Carol', manager_id: 1 },
  { name: 'Dave', manager_id: nil },
]

top_level = employees.select { |e| e[:manager_id].nil? }
top_level.each { |e| puts e[:name] }

CREATE TABLE employees (id INTEGER, name TEXT, manager_id INTEGER);

INSERT INTO employees VALUES (1, 'Alice', NULL), (2, 'Bob', 1), (3, 'Carol', 1), (4, 'Dave', NULL);

SELECT name FROM employees WHERE manager_id IS NULL;

Employee.where(manager_id: nil) in ActiveRecord. In SQL, NULL represents an unknown or absent value. Never use = NULL or != NULL; they always evaluate to UNKNOWN, not TRUE. Always use IS NULL or IS NOT NULL. In Ruby, nil behaves like a normal falsy value, so there is no such gotcha.
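The contrast is easy to see from the Ruby side: nil compares like an ordinary value, while SQL NULL propagates UNKNOWN. A tiny sketch:

```ruby
manager_id = nil

# Ruby: nil == nil is simply true, so equality checks against nil work.
equal_check = (manager_id == nil)
nil_check   = manager_id.nil?

# The SQL analogs behave differently:
#   manager_id = NULL   -> UNKNOWN, so the WHERE clause matches nothing
#   manager_id IS NULL  -> TRUE for NULL rows, which is why IS NULL exists
puts equal_check, nil_check
```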
Sorting & Limiting
Sort results: ORDER BY
employees = [
  { name: 'Alice', salary: 90_000 },
  { name: 'Bob', salary: 75_000 },
  { name: 'Carol', salary: 85_000 },
  { name: 'Dave', salary: 92_000 },
]

sorted = employees.sort_by { |e| -e[:salary] }
sorted.each { |e| puts "#{e[:name]}: #{e[:salary]}" }

CREATE TABLE employees (id INTEGER, name TEXT, salary REAL);

INSERT INTO employees VALUES (1, 'Alice', 90000), (2, 'Bob', 75000), (3, 'Carol', 85000), (4, 'Dave', 92000);

SELECT name, salary FROM employees ORDER BY salary DESC;

Employee.order(salary: :desc) in ActiveRecord. Without ORDER BY, the row order returned by a SQL query is undefined; do not rely on insertion order. ASC (ascending) is the default; DESC must be stated explicitly.
Sort by multiple columns
employees = [
  { name: 'Carol', department: 'Engineering', salary: 85_000 },
  { name: 'Alice', department: 'Engineering', salary: 90_000 },
  { name: 'Bob', department: 'Marketing', salary: 75_000 },
  { name: 'Dave', department: 'Marketing', salary: 80_000 },
]

sorted = employees.sort_by { |e| [e[:department], -e[:salary]] }
sorted.each { |e| puts "#{e[:department]}: #{e[:name]} (#{e[:salary]})" }

CREATE TABLE employees (id INTEGER, name TEXT, department TEXT, salary REAL);

INSERT INTO employees VALUES
  (1, 'Carol', 'Engineering', 85000),
  (2, 'Alice', 'Engineering', 90000),
  (3, 'Bob', 'Marketing', 75000),
  (4, 'Dave', 'Marketing', 80000);

SELECT name, department, salary FROM employees ORDER BY department ASC, salary DESC;

Employee.order(department: :asc, salary: :desc) in ActiveRecord. Multiple ORDER BY columns are evaluated left to right: rows are first sorted by department, then within each department by salary descending.
Limit results: LIMIT
employees = [
  { name: 'Dave', salary: 92_000 },
  { name: 'Alice', salary: 90_000 },
  { name: 'Carol', salary: 85_000 },
  { name: 'Bob', salary: 75_000 },
]

top_three = employees.sort_by { |e| -e[:salary] }.first(3)
top_three.each { |e| puts "#{e[:name]}: #{e[:salary]}" }

CREATE TABLE employees (id INTEGER, name TEXT, salary REAL);

INSERT INTO employees VALUES (1, 'Dave', 92000), (2, 'Alice', 90000), (3, 'Carol', 85000), (4, 'Bob', 75000);

SELECT name, salary FROM employees ORDER BY salary DESC LIMIT 3;

Employee.order(salary: :desc).limit(3) in ActiveRecord. LIMIT should almost always be used with ORDER BY; without ordering, which rows you get is undefined and can vary between queries.
Pagination with OFFSET
employees = [
  { name: 'Dave', salary: 92_000 },
  { name: 'Alice', salary: 90_000 },
  { name: 'Carol', salary: 85_000 },
  { name: 'Eve', salary: 82_000 },
  { name: 'Bob', salary: 75_000 },
  { name: 'Frank', salary: 71_000 },
]

page_size = 2
page_number = 2 # 1-based

page = employees.sort_by { |e| -e[:salary] }
                .drop((page_number - 1) * page_size)
                .first(page_size)

page.each { |e| puts "#{e[:name]}: #{e[:salary]}" }

CREATE TABLE employees (id INTEGER, name TEXT, salary REAL);

INSERT INTO employees VALUES
  (1, 'Dave', 92000),
  (2, 'Alice', 90000),
  (3, 'Carol', 85000),
  (4, 'Eve', 82000),
  (5, 'Bob', 75000),
  (6, 'Frank', 71000);

SELECT name, salary FROM employees ORDER BY salary DESC LIMIT 2 OFFSET 2;
Employee.order(salary: :desc).limit(2).offset(2) in ActiveRecord. OFFSET n skips the first n rows. This "skip-limit" pagination works well for small datasets but becomes slow on large tables because the database still scans all skipped rows. For large datasets, keyset pagination (WHERE id > last_seen_id) is more efficient.
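Keyset pagination carries a cursor (the last id seen) instead of an offset. An in-memory sketch of the idea, with illustrative names (employees, last_seen_id):

```ruby
employees = (1..10).map { |i| { id: i, name: "Employee #{i}" } }

page_size    = 3
last_seen_id = 3 # cursor returned with the previous page

# Keyset pagination: WHERE id > last_seen_id ORDER BY id LIMIT page_size.
# With an index on id, the database can seek straight to the cursor
# instead of scanning and discarding all the skipped rows.
page = employees
  .select { |e| e[:id] > last_seen_id }
  .sort_by { |e| e[:id] }
  .first(page_size)

page.each { |e| puts e[:name] }
```

The client then remembers the last id of this page (here 6) as the cursor for the next request.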
Aggregates
Count rows
employees = [
  { name: 'Alice', department: 'Engineering' },
  { name: 'Bob', department: 'Marketing' },
  { name: 'Carol', department: 'Engineering' },
]

puts "Total: #{employees.count}"
puts "Unique departments: #{employees.map { |e| e[:department] }.uniq.count}"

CREATE TABLE employees (id INTEGER, name TEXT, department TEXT);

INSERT INTO employees VALUES (1, 'Alice', 'Engineering'), (2, 'Bob', 'Marketing'), (3, 'Carol', 'Engineering');

SELECT count(*) AS total, count(DISTINCT department) AS unique_departments FROM employees;
Employee.count and Employee.distinct.count(:department) in ActiveRecord. count(*) counts all rows including those with NULL values. count(column) counts only non-NULL values in that column.
Sum and average
salaries = [90_000, 75_000, 85_000, 70_000]

puts "Total payroll: #{salaries.sum}"
puts "Average salary: #{(salaries.sum.to_f / salaries.count).round(2)}"

CREATE TABLE employees (id INTEGER, name TEXT, salary REAL);

INSERT INTO employees VALUES (1, 'Alice', 90000), (2, 'Bob', 75000), (3, 'Carol', 85000), (4, 'Dave', 70000);

SELECT sum(salary) AS total_payroll, avg(salary) AS average_salary FROM employees;

Employee.sum(:salary) and Employee.average(:salary) in ActiveRecord. avg() ignores NULL values: a row with NULL salary is not counted. This differs from dividing sum by count(*), which would treat missing salaries as zero.
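The difference can be modeled in Ruby with compact, which drops nils the way avg() ignores NULLs. A sketch (the salary data, including the nil, is made up):

```ruby
salaries = [90_000, nil, 85_000, 70_000] # one employee has no recorded salary

known = salaries.compact

# Like avg(salary): sum of non-NULL values / count of non-NULL values.
avg_like = known.sum.to_f / known.count

# Like sum(salary) / count(*): the missing salary drags the average down,
# as if it were zero.
naive = known.sum.to_f / salaries.count

puts avg_like.round(2)
puts naive.round(2)
```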
Minimum and maximum
employees = [
  { name: 'Alice', salary: 90_000, hire_date: '2020-03-15' },
  { name: 'Bob', salary: 75_000, hire_date: '2019-07-01' },
  { name: 'Carol', salary: 85_000, hire_date: '2021-11-20' },
]

puts "Lowest salary: #{employees.min_by { |e| e[:salary] }[:salary]}"
puts "Highest salary: #{employees.max_by { |e| e[:salary] }[:salary]}"
puts "Earliest hire: #{employees.min_by { |e| e[:hire_date] }[:hire_date]}"

CREATE TABLE employees (id INTEGER, name TEXT, salary REAL, hire_date TEXT);

INSERT INTO employees VALUES
  (1, 'Alice', 90000, '2020-03-15'),
  (2, 'Bob', 75000, '2019-07-01'),
  (3, 'Carol', 85000, '2021-11-20');

SELECT min(salary) AS lowest, max(salary) AS highest, min(hire_date) AS earliest_hire FROM employees;

Employee.minimum(:salary) and Employee.maximum(:salary) in ActiveRecord. min() and max() work on strings and dates too: SQLite compares date strings lexicographically, which works correctly for ISO 8601 format (YYYY-MM-DD).
Group rows: GROUP BY
employees = [
  { name: 'Alice', department: 'Engineering', salary: 90_000 },
  { name: 'Bob', department: 'Marketing', salary: 75_000 },
  { name: 'Carol', department: 'Engineering', salary: 85_000 },
  { name: 'Dave', department: 'Marketing', salary: 80_000 },
]

groups = employees.group_by { |e| e[:department] }

groups.each do |department, members|
  avg = members.sum { |e| e[:salary] } / members.count.to_f
  puts "#{department}: #{members.count} employees, avg $#{avg.round}"
end

CREATE TABLE employees (id INTEGER, name TEXT, department TEXT, salary REAL);

INSERT INTO employees VALUES
  (1, 'Alice', 'Engineering', 90000),
  (2, 'Bob', 'Marketing', 75000),
  (3, 'Carol', 'Engineering', 85000),
  (4, 'Dave', 'Marketing', 80000);

SELECT department, count(*) AS headcount, avg(salary) AS avg_salary
FROM employees
GROUP BY department;

Employee.group(:department).count and Employee.group(:department).average(:salary) in ActiveRecord. Every column in the SELECT list must either be in the GROUP BY clause or wrapped in an aggregate function; this rule trips up many SQL newcomers.
Filter groups: HAVING
employees = [
  { name: 'Alice', department: 'Engineering', salary: 90_000 },
  { name: 'Bob', department: 'Marketing', salary: 75_000 },
  { name: 'Carol', department: 'Engineering', salary: 85_000 },
  { name: 'Dave', department: 'Marketing', salary: 80_000 },
  { name: 'Eve', department: 'Finance', salary: 95_000 },
]

groups = employees.group_by { |e| e[:department] }

groups.select { |_, members| members.count > 1 }.each do |dept, members|
  avg = members.sum { |e| e[:salary] } / members.count.to_f
  puts "#{dept}: #{members.count} employees, avg $#{avg.round}"
end

CREATE TABLE employees (id INTEGER, name TEXT, department TEXT, salary REAL);

INSERT INTO employees VALUES
  (1, 'Alice', 'Engineering', 90000),
  (2, 'Bob', 'Marketing', 75000),
  (3, 'Carol', 'Engineering', 85000),
  (4, 'Dave', 'Marketing', 80000),
  (5, 'Eve', 'Finance', 95000);

SELECT department, count(*) AS headcount, avg(salary) AS avg_salary
FROM employees
GROUP BY department
HAVING count(*) > 1;

Employee.group(:department).having("count(*) > 1").count in ActiveRecord. HAVING filters after grouping; it is for aggregate conditions what WHERE is for row conditions. You cannot use WHERE to filter on count(*) because WHERE runs before grouping.
Conditional aggregation with CASE
employees = [
  { name: 'Alice', department: 'Engineering', salary: 90_000 },
  { name: 'Bob', department: 'Marketing', salary: 75_000 },
  { name: 'Carol', department: 'Engineering', salary: 85_000 },
  { name: 'Dave', department: 'Marketing', salary: 80_000 },
]

puts "Total: #{employees.count}"
puts "High earners (>80k): #{employees.count { |e| e[:salary] > 80_000 }}"
puts "Engineering payroll: #{employees.select { |e| e[:department] == 'Engineering' }.sum { |e| e[:salary] }}"

CREATE TABLE employees (id INTEGER, name TEXT, department TEXT, salary REAL);

INSERT INTO employees VALUES
  (1, 'Alice', 'Engineering', 90000),
  (2, 'Bob', 'Marketing', 75000),
  (3, 'Carol', 'Engineering', 85000),
  (4, 'Dave', 'Marketing', 80000);

SELECT
  count(*) AS total,
  count(CASE WHEN salary > 80000 THEN 1 END) AS high_earners,
  sum(CASE WHEN department = 'Engineering' THEN salary ELSE 0 END) AS engineering_payroll
FROM employees;

A CASE WHEN ... THEN ... END expression inside an aggregate function is a powerful pattern: it counts or sums only rows that satisfy a condition, all in a single pass over the data. This is sometimes called "pivot aggregation" and has no direct ActiveRecord equivalent; it requires raw SQL or Arel.
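The single-pass character of the SQL query can be mirrored in Ruby with each_with_object, accumulating all three results in one iteration instead of three separate passes. A sketch over the same sample data:

```ruby
employees = [
  { name: 'Alice', department: 'Engineering', salary: 90_000 },
  { name: 'Bob',   department: 'Marketing',   salary: 75_000 },
  { name: 'Carol', department: 'Engineering', salary: 85_000 },
  { name: 'Dave',  department: 'Marketing',   salary: 80_000 },
]

# One pass over the rows, like the CASE-inside-aggregate query:
totals = employees.each_with_object({ total: 0, high_earners: 0, engineering_payroll: 0 }) do |e, acc|
  acc[:total] += 1
  acc[:high_earners] += 1 if e[:salary] > 80_000
  acc[:engineering_payroll] += e[:salary] if e[:department] == 'Engineering'
end

puts totals.inspect
```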
Joins
INNER JOIN
departments = [{ id: 1, name: 'Engineering' }, { id: 2, name: 'Marketing' }]
employees = [
  { name: 'Alice', department_id: 1, salary: 90_000 },
  { name: 'Bob', department_id: 2, salary: 75_000 },
  { name: 'Carol', department_id: 1, salary: 85_000 },
]

result = employees.filter_map do |employee|
  dept = departments.find { |d| d[:id] == employee[:department_id] }
  "#{employee[:name]}: #{dept[:name]} - #{employee[:salary]}" if dept
end

puts result

CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE employees (id INTEGER, name TEXT, department_id INTEGER, salary REAL);

INSERT INTO departments VALUES (1, 'Engineering'), (2, 'Marketing'), (3, 'Finance');
INSERT INTO employees VALUES (1, 'Alice', 1, 90000), (2, 'Bob', 2, 75000), (3, 'Carol', 1, 85000);

SELECT employees.name, departments.name AS department, employees.salary
FROM employees
INNER JOIN departments ON employees.department_id = departments.id;

Employee.joins(:department).select('employees.*, departments.name AS dept_name') in ActiveRecord. An INNER JOIN returns only rows where the join condition matches in both tables; employees with no matching department (or with a NULL department_id) are excluded from the result.
LEFT JOIN (include unmatched rows)
departments = [{ id: 1, name: 'Engineering' }, { id: 2, name: 'Marketing' }, { id: 3, name: 'Finance' }]
employees = [
  { name: 'Alice', department_id: 1 },
  { name: 'Bob', department_id: 2 },
  { name: 'Carol', department_id: 1 },
  { name: 'Dave', department_id: nil },
]

result = employees.map do |employee|
  dept = departments.find { |d| d[:id] == employee[:department_id] }
  "#{employee[:name]}: #{dept ? dept[:name] : 'NULL'}"
end

puts result

CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE employees (id INTEGER, name TEXT, department_id INTEGER);

INSERT INTO departments VALUES (1, 'Engineering'), (2, 'Marketing'), (3, 'Finance');
INSERT INTO employees VALUES (1, 'Alice', 1), (2, 'Bob', 2), (3, 'Carol', 1), (4, 'Dave', NULL);

SELECT employees.name, departments.name AS department
FROM employees
LEFT JOIN departments ON employees.department_id = departments.id;

Employee.left_outer_joins(:department) in ActiveRecord. A LEFT JOIN keeps all rows from the left table even when there is no match in the right table; the unmatched right-side columns are NULL. Use a LEFT JOIN combined with WHERE right.id IS NULL to find records that lack an association.
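The LEFT JOIN plus WHERE ... IS NULL "anti-join" pattern corresponds to rejecting rows that have any match in Ruby. A sketch finding departments with no employees (sample data is illustrative):

```ruby
departments = [{ id: 1, name: 'Engineering' }, { id: 2, name: 'Marketing' }, { id: 3, name: 'Finance' }]
employees = [
  { name: 'Alice', department_id: 1 },
  { name: 'Bob',   department_id: 2 },
]

# The SQL equivalent:
#   SELECT departments.name FROM departments
#   LEFT JOIN employees ON employees.department_id = departments.id
#   WHERE employees.id IS NULL;
empty_departments = departments.reject do |dept|
  employees.any? { |e| e[:department_id] == dept[:id] }
end

empty_departments.each { |dept| puts dept[:name] }
```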
JOIN with aggregation
departments = [{ id: 1, name: 'Engineering' }, { id: 2, name: 'Marketing' }]
employees = [
  { name: 'Alice', department_id: 1, salary: 90_000 },
  { name: 'Bob', department_id: 2, salary: 75_000 },
  { name: 'Carol', department_id: 1, salary: 85_000 },
  { name: 'Dave', department_id: 2, salary: 80_000 },
]

departments.each do |department|
  members = employees.select { |e| e[:department_id] == department[:id] }
  avg = members.empty? ? 0 : members.sum { |e| e[:salary] } / members.count.to_f
  puts "#{department[:name]}: #{members.count} employees, avg $#{avg.round}"
end

CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE employees (id INTEGER, name TEXT, department_id INTEGER, salary REAL);

INSERT INTO departments VALUES (1, 'Engineering'), (2, 'Marketing');
INSERT INTO employees VALUES (1, 'Alice', 1, 90000), (2, 'Bob', 2, 75000), (3, 'Carol', 1, 85000), (4, 'Dave', 2, 80000);

SELECT departments.name, count(employees.id) AS headcount, avg(employees.salary) AS avg_salary
FROM departments
LEFT JOIN employees ON employees.department_id = departments.id
GROUP BY departments.id, departments.name;

Joining and then grouping is one of the most common SQL patterns. Using a LEFT JOIN here ensures that departments with no employees still appear in the results with a count of zero; an INNER JOIN would silently omit them.
Self join (hierarchical data)
employees = [
  { id: 1, name: 'Alice', manager_id: nil },
  { id: 2, name: 'Bob', manager_id: 1 },
  { id: 3, name: 'Carol', manager_id: 1 },
  { id: 4, name: 'Dave', manager_id: 2 },
]

employees.each do |employee|
  manager = employees.find { |e| e[:id] == employee[:manager_id] }
  puts "#{employee[:name]} - manager: #{manager ? manager[:name] : 'none'}"
end

CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);

INSERT INTO employees VALUES (1, 'Alice', NULL), (2, 'Bob', 1), (3, 'Carol', 1), (4, 'Dave', 2);

SELECT workers.name AS employee, managers.name AS manager
FROM employees AS workers
LEFT JOIN employees AS managers ON workers.manager_id = managers.id;

A self join joins a table to itself using table aliases. It is the standard way to query hierarchical data (org charts, category trees, threaded comments) stored as an adjacency list in a single table. In ActiveRecord: belongs_to :manager, class_name: 'Employee', optional: true (optional, because top-level employees have no manager) and has_many :reports, class_name: 'Employee', foreign_key: :manager_id.
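A single self join only reaches one level up. To walk the whole chain of command, SQL needs a recursive query (WITH RECURSIVE); in Ruby the same adjacency list can be walked with plain recursion. A sketch (the chain helper is a name invented for this example):

```ruby
employees = [
  { id: 1, name: 'Alice', manager_id: nil },
  { id: 2, name: 'Bob',   manager_id: 1 },
  { id: 3, name: 'Carol', manager_id: 1 },
  { id: 4, name: 'Dave',  manager_id: 2 },
]

by_id = employees.to_h { |e| [e[:id], e] }

# Follow manager links upward until an employee with no manager is reached.
def chain(employee, by_id)
  return [employee[:name]] if employee[:manager_id].nil?
  [employee[:name]] + chain(by_id[employee[:manager_id]], by_id)
end

puts chain(by_id[4], by_id).join(' -> ')
```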
Writing Data
Insert a row: INSERT INTO
employees = []

employees << { id: 1, name: 'Alice', department: 'Engineering', salary: 90_000 }
employees << { name: 'Bob', department: 'Marketing', salary: 75_000 }

employees.each { |e| puts e.values.join(', ') }

CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, department TEXT, salary REAL);

INSERT INTO employees VALUES (1, 'Alice', 'Engineering', 90000);
INSERT INTO employees (name, department, salary) VALUES ('Bob', 'Marketing', 75000);

SELECT * FROM employees;

Employee.create(name: 'Bob', department: 'Marketing', salary: 75_000) in ActiveRecord. The column-list form of INSERT INTO table (col1, col2) VALUES (...) is preferred in production; it remains correct when columns are added or reordered later.
Insert multiple rows
catalog = []

new_products = [
  { id: 1, name: 'Widget', price: 9.99, category: 'Hardware' },
  { id: 2, name: 'Gadget', price: 24.99, category: 'Electronics' },
  { id: 3, name: 'Doohickey', price: 4.99, category: 'Hardware' },
]

catalog.concat(new_products)
catalog.each { |product| puts "#{product[:name]}: $#{product[:price]}" }

CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL, category TEXT);

INSERT INTO products VALUES
  (1, 'Widget', 9.99, 'Hardware'),
  (2, 'Gadget', 24.99, 'Electronics'),
  (3, 'Doohickey', 4.99, 'Hardware');

SELECT * FROM products;

Product.insert_all([...]) in ActiveRecord inserts multiple rows in a single SQL statement, far more efficient than calling .create in a loop (note that insert_all skips model validations and callbacks). A multi-row INSERT sends one round-trip to the database instead of one per record.
Update rows: UPDATE … SET … WHERE
employees = [
  { id: 1, name: 'Alice', salary: 90_000 },
  { id: 2, name: 'Bob', salary: 75_000 },
  { id: 3, name: 'Carol', salary: 85_000 },
]

employees.each do |employee|
  employee[:salary] = (employee[:salary] * 1.1).round if employee[:salary] < 85_000
end

employees.each { |e| puts "#{e[:name]}: #{e[:salary]}" }

CREATE TABLE employees (id INTEGER, name TEXT, salary REAL);

INSERT INTO employees VALUES (1, 'Alice', 90000), (2, 'Bob', 75000), (3, 'Carol', 85000);

UPDATE employees SET salary = salary * 1.1 WHERE salary < 85000;

SELECT name, salary FROM employees;
Employee.where('salary < ?', 85_000).update_all('salary = salary * 1.1') in ActiveRecord. UPDATE ... SET ... WHERE modifies every matching row in one SQL statement without loading records into memory. Omitting the WHERE clause updates every row in the table.
Delete rows: DELETE FROM … WHERE
employees = [
  { id: 1, name: 'Alice', active: true },
  { id: 2, name: 'Bob', active: false },
  { id: 3, name: 'Carol', active: true },
  { id: 4, name: 'Dave', active: false },
]

employees.select! { |employee| employee[:active] }
employees.each { |employee| puts "#{employee[:id]}: #{employee[:name]}" }

CREATE TABLE employees (id INTEGER, name TEXT, active INTEGER);

INSERT INTO employees VALUES (1, 'Alice', 1), (2, 'Bob', 0), (3, 'Carol', 1), (4, 'Dave', 0);

DELETE FROM employees WHERE active = 0;

SELECT * FROM employees;

Employee.where(active: false).delete_all in ActiveRecord skips callbacks and deletes directly. Use destroy_all when before_destroy / after_destroy callbacks must run, but note that destroy_all loads every record into memory first, making it much slower on large tables.
Upsert: INSERT OR REPLACE
settings = { 'theme' => 'light', 'language' => 'en' }
updates = { 'theme' => 'dark', 'timezone' => 'UTC' }

settings.merge!(updates)
settings.each { |key, value| puts "#{key}: #{value}" }

CREATE TABLE settings (key TEXT PRIMARY KEY, value TEXT);

INSERT INTO settings VALUES ('theme', 'light'), ('language', 'en');
INSERT OR REPLACE INTO settings VALUES ('theme', 'dark'), ('timezone', 'UTC');

SELECT * FROM settings;
Setting.upsert({ key: 'theme', value: 'dark' }, unique_by: :key) in ActiveRecord. SQLite's INSERT OR REPLACE deletes the conflicting row and inserts a new one (resetting auto-increment IDs and all columns). The SQL standard form, INSERT INTO ... ON CONFLICT(key) DO UPDATE SET value = excluded.value, updates only specific columns and is supported in SQLite 3.24+.
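Hash#merge with and without a block is a handy way to model the two conflict policies. Without a block the incoming value wins (like DO UPDATE); with a block that keeps the old value, conflicts are ignored (like ON CONFLICT DO NOTHING). A sketch:

```ruby
settings = { 'theme' => 'light', 'language' => 'en' }
updates  = { 'theme' => 'dark', 'timezone' => 'UTC' }

# ON CONFLICT ... DO UPDATE: the incoming value wins on a key conflict.
do_update = settings.merge(updates)

# ON CONFLICT ... DO NOTHING: keep the existing value on a key conflict.
do_nothing = settings.merge(updates) { |_key, old_value, _new_value| old_value }

puts do_update.inspect
puts do_nothing.inspect
```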
Schema
Create a table
class CreateEmployees < ActiveRecord::Migration[8.0]
  def change
    create_table :employees do |table|
      table.string :name, null: false
      table.string :department, null: false
      table.decimal :salary, null: false
      table.date :hire_date, null: false, default: -> { 'CURRENT_DATE' }
      table.boolean :active, null: false, default: true
    end
  end
end

CREATE TABLE employees (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  name TEXT NOT NULL,
  department TEXT NOT NULL,
  salary REAL NOT NULL,
  hire_date TEXT NOT NULL DEFAULT (date('now')),
  active INTEGER NOT NULL DEFAULT 1
);

Rails migrations abstract the SQL DDL: create_table generates the CREATE TABLE statement for whatever database is configured. Rails automatically adds the id column (INTEGER PRIMARY KEY AUTOINCREMENT) unless you specify id: false.
SQLite data types
# SQLite's five storage classes map directly to Ruby types:
#   INTEGER -> Ruby Integer
#   REAL    -> Ruby Float
#   TEXT    -> Ruby String
#   BLOB    -> Ruby String (binary encoding)
#   NULL    -> Ruby nil

puts [42, 3.14, "hello", nil].map(&:class)

CREATE TABLE type_examples (integer_col INTEGER, real_col REAL, text_col TEXT);

INSERT INTO type_examples VALUES (42, 3.14, 'hello');

SELECT typeof(integer_col), typeof(real_col), typeof(text_col) FROM type_examples;

SQLite uses dynamic "type affinity": a column declared as TEXT can store an integer. SQLite 3.37+ supports STRICT tables that enforce declared types. Rails maps database types to Ruby types automatically (e.g., :decimal to BigDecimal, :boolean to true/false).
Constraints: NOT NULL, UNIQUE, CHECK, DEFAULT
class CreateUsers < ActiveRecord::Migration[8.0]
  def change
    create_table :users do |table|
      table.string :email, null: false, index: { unique: true }
      table.string :username, null: false
      table.string :role, null: false, default: 'user'
      table.integer :score, null: false, default: 0
      table.check_constraint 'score >= 0', name: 'score_non_negative'
    end
  end
end

CREATE TABLE users (
  id INTEGER PRIMARY KEY,
  email TEXT NOT NULL UNIQUE,
  username TEXT NOT NULL,
  role TEXT NOT NULL DEFAULT 'user',
  score INTEGER NOT NULL DEFAULT 0 CHECK (score >= 0)
);

Database constraints are enforced regardless of which application (or which language) accesses the database. ActiveRecord model validations (like validates :score, numericality: { greater_than_or_equal_to: 0 }) are application-level only; a direct SQL INSERT bypasses them. Rely on database constraints for true data integrity. Since Rails 6.1 the migration DSL exposes CHECK constraints via check_constraint.
Foreign key constraint
class Department < ApplicationRecord
  has_many :employees, dependent: :restrict_with_exception
end

class Employee < ApplicationRecord
  belongs_to :department
end

# Migration:
#   add_foreign_key :employees, :departments

CREATE TABLE departments (
  id INTEGER PRIMARY KEY,
  name TEXT NOT NULL
);

CREATE TABLE employees (
  id INTEGER PRIMARY KEY,
  name TEXT NOT NULL,
  department_id INTEGER NOT NULL REFERENCES departments(id)
);
In SQLite, foreign key enforcement is disabled by default and must be enabled with PRAGMA foreign_keys = ON at the start of each connection. Rails 5+ sets this pragma automatically via the SQLite3 adapter. Without it, the REFERENCES declaration is parsed but not enforced.
Modify schema: ALTER TABLE
class AddFieldsToEmployees < ActiveRecord::Migration[8.0] def change add_column :employees, :salary, :decimal rename_column :employees, :salary, :annual_salary change_column :employees, :annual_salary, :decimal, null: false, default: 0 remove_column :employees, :legacy_field, :string end end
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT); ALTER TABLE employees ADD COLUMN salary REAL DEFAULT 0; ALTER TABLE employees RENAME COLUMN salary TO annual_salary; -- SQLite 3.35+ only: -- ALTER TABLE employees DROP COLUMN legacy_field;
SQLite's ALTER TABLE is more limited than other databases: before SQLite 3.25.0 you could not rename columns, before 3.35.0 you could not drop them, and changing a column's type or constraints still requires recreating the table. Rails migrations handle this complexity automatically — the migration DSL generates the correct SQL for each database adapter, including table rebuilds when necessary.
Indexes
Create an index
class AddIndexesToEmployees < ActiveRecord::Migration[8.0] def change add_index :employees, :department add_index :employees, :salary # Composite index: add_index :employees, [:department, :salary] end end
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, department TEXT, salary REAL); CREATE INDEX idx_employees_department ON employees (department); CREATE INDEX idx_employees_salary ON employees (salary); -- Composite index (covers WHERE department = ? ORDER BY salary): CREATE INDEX idx_employees_dept_salary ON employees (department, salary);
Indexes dramatically speed up queries that filter, sort, or join on the indexed column. The trade-off is slightly slower writes and more storage. Index every foreign key column and any column that frequently appears in WHERE or ORDER BY clauses. A composite index (a, b) can also serve queries on a alone but not queries on b alone.
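A rough mental model in plain Ruby: an index behaves like a precomputed lookup structure keyed on the column, so a query can fetch matching rows without inspecting every one (illustrative only; a real B-tree index also supports range scans and ordering).

```ruby
employees = [
  { id: 1, name: 'Alice', department: 'Engineering' },
  { id: 2, name: 'Bob', department: 'Marketing' },
  { id: 3, name: 'Carol', department: 'Engineering' },
]

# Full table scan: inspect every row on every query.
scan = employees.select { |e| e[:department] == 'Engineering' }

# "Index": build the lookup table once, then fetch matching rows directly.
index = employees.group_by { |e| e[:department] }
lookup = index.fetch('Engineering', [])

puts scan == lookup
```

The trade-off is visible here too: the index costs memory and must be updated on every write, which is why indexing every column is counterproductive.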
Unique index
class AddUniqueIndexToUsersEmail < ActiveRecord::Migration[8.0] def change add_index :users, :email, unique: true # Partial unique index (unique among active users only): # execute "CREATE UNIQUE INDEX idx_users_email_active ON users (email) WHERE active = 1" end end
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, active INTEGER DEFAULT 1); CREATE UNIQUE INDEX idx_users_email ON users (email); -- Partial unique index — unique only among active users: CREATE UNIQUE INDEX idx_users_email_active ON users (email) WHERE active = 1;
A unique index is both a performance index and a data integrity constraint. ActiveRecord's validates :email, uniqueness: true checks uniqueness at the application level, but it has a race condition in concurrent environments. The database-level unique index is the only reliable guarantee — it is enforced atomically by the database engine.
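The race can be simulated deterministically in plain Ruby (no real threads; the interleaving is spelled out by hand): both requests pass the application-level check before either one inserts.

```ruby
users = []
email = 'dup@example.com'

# Request A and request B both run the uniqueness check first...
check_a = users.none? { |u| u[:email] == email }
check_b = users.none? { |u| u[:email] == email }

# ...then both insert, because each saw an empty table at check time.
users << { email: email } if check_a
users << { email: email } if check_b

duplicates = users.count { |u| u[:email] == email }
puts "copies of #{email}: #{duplicates}"
```

A database unique index makes the check and the insert a single atomic step, so the second insert would fail instead.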
Query analysis: EXPLAIN QUERY PLAN
# In Rails: Employee.where(department: 'Engineering').explain # Output (with index): # QUERY PLAN # `--SEARCH employees USING INDEX idx_employees_department (department=?) # Output (without index): # QUERY PLAN # `--SCAN employees puts "Run Employee.where(...).explain in the Rails console to see the query plan"
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, department TEXT); CREATE INDEX idx_employees_department ON employees (department); EXPLAIN QUERY PLAN SELECT * FROM employees WHERE department = 'Engineering';
EXPLAIN QUERY PLAN shows whether SQLite uses an index (SEARCH ... USING INDEX) or performs a full table scan (SCAN employees). A full scan on a large table with a selective WHERE clause usually indicates a missing index. ActiveRecord's .explain method runs the equivalent and formats the output.
Transactions
Atomic batch: BEGIN / COMMIT
accounts = { 'Alice' => 1000.0, 'Bob' => 500.0 } begin transfer = 200 accounts['Alice'] -= transfer accounts['Bob'] += transfer puts "Transfer committed" rescue => error puts "Transfer failed: #{error.message}" end accounts.each { |owner, balance| puts "#{owner}: #{balance}" }
CREATE TABLE accounts (id INTEGER PRIMARY KEY, owner TEXT, balance REAL); INSERT INTO accounts VALUES (1,'Alice',1000),(2,'Bob',500); BEGIN TRANSACTION; UPDATE accounts SET balance = balance - 200 WHERE owner = 'Alice'; UPDATE accounts SET balance = balance + 200 WHERE owner = 'Bob'; COMMIT; SELECT owner, balance FROM accounts;
ActiveRecord::Base.transaction { ... } wraps the block in BEGIN/COMMIT. Any exception raised inside the block automatically triggers ROLLBACK. Transactions guarantee atomicity — either all operations succeed together or none of them take effect. Without a transaction, a server crash between the two UPDATEs would leave the accounts in an inconsistent state.
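A toy version of that pattern in plain Ruby (a hypothetical helper, not the ActiveRecord API): snapshot the state, apply the block, and restore the snapshot if anything raises.

```ruby
def transaction(state)
  snapshot = Marshal.load(Marshal.dump(state)) # deep copy as the "before" image
  yield state
rescue => error
  state.replace(snapshot) # rollback: restore the snapshot
  raise error
end

accounts = { 'Alice' => 1000.0, 'Bob' => 500.0 }
transaction(accounts) do |acc|
  acc['Alice'] -= 200
  acc['Bob'] += 200
end
puts accounts # both updates applied together, or neither
```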
Undo a transaction: ROLLBACK
accounts = { 'Alice' => 1000.0, 'Bob' => 500.0 } original = accounts.dup begin accounts['Alice'] -= 1500 # more than Alice has raise "Insufficient funds" if accounts['Alice'] < 0 accounts['Bob'] += 1500 puts "Transfer committed" rescue => error accounts.replace(original) puts "Rolled back: #{error.message}" end accounts.each { |owner, balance| puts "#{owner}: #{balance}" }
CREATE TABLE accounts (id INTEGER PRIMARY KEY, owner TEXT, balance REAL); INSERT INTO accounts VALUES (1,'Alice',1000),(2,'Bob',500); BEGIN TRANSACTION; UPDATE accounts SET balance = balance - 1500 WHERE owner = 'Alice'; ROLLBACK; SELECT owner, balance FROM accounts;
Inside a Rails transaction block, raise ActiveRecord::Rollback to cancel the transaction without propagating the exception to the caller. Any other exception both rolls back the transaction and propagates upward. After a ROLLBACK, the database is exactly as it was before the BEGIN.
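The two exception behaviors can be sketched in plain Ruby, with a hypothetical Rollback class standing in for ActiveRecord::Rollback:

```ruby
class Rollback < StandardError; end

def transaction(state)
  snapshot = state.dup
  yield
rescue Rollback
  state.replace(snapshot) # silent cancel: the caller never sees an exception
rescue => error
  state.replace(snapshot)
  raise error # any other error rolls back AND propagates
end

accounts = { 'Alice' => 1000.0 }
transaction(accounts) do
  accounts['Alice'] -= 1500
  raise Rollback, 'insufficient funds' if accounts['Alice'].negative?
end
puts accounts['Alice'] # restored, and no exception escaped the block
```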
Partial rollback: SAVEPOINT
log_entries = ['Starting import'] checkpoint = log_entries.dup log_entries << 'Risky step executed' # Roll back to checkpoint β€” discard the risky step log_entries.replace(checkpoint) log_entries << 'Risky step skipped β€” safe path taken' log_entries.each_with_index { |entry, index| puts "#{index + 1}: #{entry}" }
CREATE TABLE log_entries (id INTEGER PRIMARY KEY AUTOINCREMENT, message TEXT); INSERT INTO log_entries (message) VALUES ('Starting import'); SAVEPOINT before_risky_step; INSERT INTO log_entries (message) VALUES ('Risky step executed'); ROLLBACK TO SAVEPOINT before_risky_step; INSERT INTO log_entries (message) VALUES ('Risky step skipped β€” safe path taken'); RELEASE SAVEPOINT before_risky_step; SELECT id, message FROM log_entries;
Nested ActiveRecord::Base.transaction { ... } blocks are fused into the parent transaction by default; pass requires_new: true to the inner block to get a real savepoint. With requires_new: true, raising ActiveRecord::Rollback inside the nested block rolls back only that nested scope without affecting the outer transaction. RELEASE SAVEPOINT (like committing a nested transaction) merges the savepoint's work into the surrounding transaction.
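Savepoints amount to a stack of snapshots, one per nesting level. A minimal plain-Ruby sketch of the same log example:

```ruby
log = ['Starting import']
savepoints = []

savepoints.push(log.dup)        # SAVEPOINT before_risky_step
log << 'Risky step executed'
log.replace(savepoints.pop)     # ROLLBACK TO SAVEPOINT: discard only that step
log << 'Safe path taken'        # work before and after the savepoint survives

puts log
```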
Subqueries
Scalar subquery (inline value)
employees = [ { name: 'Alice', salary: 90_000 }, { name: 'Bob', salary: 75_000 }, { name: 'Carol', salary: 85_000 }, { name: 'Dave', salary: 70_000 }, ] average = employees.sum { |e| e[:salary] } / employees.count.to_f result = employees.map { |e| { **e, above_avg: (e[:salary] - average).round } } result.sort_by { |e| -e[:above_avg] }.each do |e| puts "#{e[:name]}: #{e[:above_avg] >= 0 ? '+' : ''}#{e[:above_avg]}" end
CREATE TABLE employees (id INTEGER, name TEXT, salary REAL); INSERT INTO employees VALUES (1,'Alice',90000),(2,'Bob',75000),(3,'Carol',85000),(4,'Dave',70000); SELECT name, salary, round(salary - (SELECT avg(salary) FROM employees)) AS above_avg FROM employees ORDER BY above_avg DESC;
A scalar subquery is a SELECT that returns exactly one row and one column, used inline as a value expression. The database evaluates an uncorrelated scalar subquery once and substitutes the result. Most databases raise a runtime error when a scalar subquery returns more than one row; SQLite instead silently uses only the first row, so keep scalar subqueries aggregate-based (as here) or add LIMIT 1 to make the intent explicit.
Subquery in IN clause
departments = [ { id: 1, name: 'Engineering', profitable: true }, { id: 2, name: 'Marketing', profitable: false }, { id: 3, name: 'Sales', profitable: true }, ] employees = [ { name: 'Alice', department_id: 1 }, { name: 'Bob', department_id: 2 }, { name: 'Carol', department_id: 1 }, { name: 'Dave', department_id: 3 }, ] profitable_ids = departments.select { |d| d[:profitable] }.map { |d| d[:id] } puts employees.select { |e| profitable_ids.include?(e[:department_id]) }.map { |e| e[:name] }
CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT, profitable INTEGER); CREATE TABLE employees (id INTEGER, name TEXT, department_id INTEGER); INSERT INTO departments VALUES (1,'Engineering',1),(2,'Marketing',0),(3,'Sales',1); INSERT INTO employees VALUES (1,'Alice',1),(2,'Bob',2),(3,'Carol',1),(4,'Dave',3); SELECT name FROM employees WHERE department_id IN (SELECT id FROM departments WHERE profitable = 1);
Employee.where(department_id: Department.where(profitable: true).select(:id)) in ActiveRecord composes the two queries into a single efficient SQL statement. The inner SELECT is executed by the database engine without loading intermediate results into Ruby memory.
EXISTS subquery
departments = [ { id: 1, name: 'Engineering' }, { id: 2, name: 'Marketing' }, { id: 3, name: 'Finance' }, ] employees = [ { name: 'Alice', department_id: 1 }, { name: 'Bob', department_id: 2 }, ] staffed_departments = departments.select { |dept| employees.any? { |e| e[:department_id] == dept[:id] } } puts staffed_departments.map { |d| d[:name] }
CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT); CREATE TABLE employees (id INTEGER, name TEXT, department_id INTEGER); INSERT INTO departments VALUES (1,'Engineering'),(2,'Marketing'),(3,'Finance'); INSERT INTO employees VALUES (1,'Alice',1),(2,'Bob',2); SELECT name FROM departments WHERE EXISTS ( SELECT 1 FROM employees WHERE employees.department_id = departments.id );
EXISTS stops scanning as soon as it finds one matching row — it is often more efficient than IN for large subsets because it short-circuits. NOT EXISTS finds records that lack an association. The conventional SELECT 1 inside EXISTS is idiomatic; any expression works since only the existence of a row matters.
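Ruby's any? short-circuits the same way. A counter makes the early exit visible:

```ruby
employees = [
  { name: 'Alice', department_id: 1 },
  { name: 'Bob', department_id: 2 },
  { name: 'Carol', department_id: 1 },
]

inspected = 0
found = employees.any? do |e|
  inspected += 1          # count how many rows the predicate actually sees
  e[:department_id] == 1
end

puts "found: #{found} after inspecting #{inspected} of #{employees.size} rows"
```

The first row matches, so the remaining rows are never inspected.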
Correlated subquery
employees = [ { name: 'Alice', department: 'Engineering', salary: 90_000 }, { name: 'Bob', department: 'Marketing', salary: 75_000 }, { name: 'Carol', department: 'Engineering', salary: 85_000 }, { name: 'Dave', department: 'Marketing', salary: 80_000 }, ] dept_max = employees.group_by { |e| e[:department] } .transform_values { |members| members.max_by { |e| e[:salary] }[:salary] } top_earners = employees.select { |e| e[:salary] == dept_max[e[:department]] } top_earners.each { |e| puts "#{e[:name]} (#{e[:department]}): #{e[:salary]}" }
CREATE TABLE employees (id INTEGER, name TEXT, department TEXT, salary REAL); INSERT INTO employees VALUES (1,'Alice','Engineering',90000),(2,'Bob','Marketing',75000), (3,'Carol','Engineering',85000),(4,'Dave','Marketing',80000); SELECT name, department, salary FROM employees AS outer_row WHERE salary = ( SELECT max(salary) FROM employees AS inner_row WHERE inner_row.department = outer_row.department );
A correlated subquery references a column from the outer query (outer_row.department here). It runs once for each outer row. For the "top earner per group" pattern, a window function — RANK() OVER (PARTITION BY department ORDER BY salary DESC) — is often cleaner and more efficient.
CTEs
Basic CTE: WITH … AS
employees = [ { name: 'Alice', department: 'Engineering', salary: 90_000 }, { name: 'Bob', department: 'Marketing', salary: 75_000 }, { name: 'Carol', department: 'Engineering', salary: 85_000 }, { name: 'Dave', department: 'Marketing', salary: 80_000 }, { name: 'Eve', department: 'Engineering', salary: 95_000 }, ] dept_avgs = employees.group_by { |e| e[:department] } .transform_values { |m| m.sum { |e| e[:salary] } / m.count.to_f } above_average = employees.select { |e| e[:salary] > dept_avgs[e[:department]] } above_average.each { |e| puts "#{e[:name]}: #{e[:salary]} (dept avg: #{dept_avgs[e[:department]].round})" }
CREATE TABLE employees (id INTEGER, name TEXT, department TEXT, salary REAL); INSERT INTO employees VALUES (1,'Alice','Engineering',90000),(2,'Bob','Marketing',75000), (3,'Carol','Engineering',85000),(4,'Dave','Marketing',80000),(5,'Eve','Engineering',95000); WITH department_averages AS ( SELECT department, avg(salary) AS avg_salary FROM employees GROUP BY department ) SELECT e.name, e.salary, round(da.avg_salary) AS dept_avg FROM employees AS e JOIN department_averages AS da USING (department) WHERE e.salary > da.avg_salary;
A CTE (Common Table Expression) names a subquery so it can be referenced like a table elsewhere in the query. CTEs make complex queries readable by giving intermediate results meaningful names — like assigning a computed value to a variable before using it. ActiveRecord 7.1+ supports CTEs directly with Model.with(department_averages: subquery); earlier versions need raw SQL via .from(Arel.sql("WITH ... SELECT ...")) or the activerecord-cte gem.
Chained CTEs
employees = [ { name: 'Alice', department: 'Engineering', salary: 90_000 }, { name: 'Bob', department: 'Marketing', salary: 75_000 }, { name: 'Carol', department: 'Engineering', salary: 85_000 }, { name: 'Dave', department: 'Marketing', salary: 80_000 }, { name: 'Eve', department: 'Finance', salary: 95_000 }, ] totals = employees.group_by { |e| e[:department] }.map do |dept, members| { department: dept, headcount: members.count, avg_salary: members.sum { |e| e[:salary] } / members.count.to_f } end ranked = totals.sort_by { |t| -t[:avg_salary] }.each_with_index.map { |t, i| { **t, rank: i + 1 } } ranked.each { |row| puts "#{row[:rank]}. #{row[:department]}: #{row[:headcount]} people, avg $#{row[:avg_salary].round}" }
CREATE TABLE employees (id INTEGER, name TEXT, department TEXT, salary REAL); INSERT INTO employees VALUES (1,'Alice','Engineering',90000),(2,'Bob','Marketing',75000), (3,'Carol','Engineering',85000),(4,'Dave','Marketing',80000),(5,'Eve','Finance',95000); WITH totals AS ( SELECT department, count(*) AS headcount, avg(salary) AS avg_salary FROM employees GROUP BY department ), ranked AS ( SELECT *, rank() OVER (ORDER BY avg_salary DESC) AS rank FROM totals ) SELECT rank, department, headcount, round(avg_salary) AS avg_salary FROM ranked;
Multiple CTEs are separated by commas and evaluated in order. A later CTE can reference earlier ones by name. Think of each CTE as a named pipeline stage — totals computes department summaries, then ranked adds rank numbers. This "step-by-step assembly" style is far more readable than equivalent deeply nested subqueries.
Recursive CTE (tree traversal)
employees = [ { id: 1, name: 'Alice', manager_id: nil }, { id: 2, name: 'Bob', manager_id: 1 }, { id: 3, name: 'Carol', manager_id: 1 }, { id: 4, name: 'Dave', manager_id: 2 }, { id: 5, name: 'Eve', manager_id: 2 }, ] def build_org(employees, manager_id, depth, path_prefix) employees.select { |e| e[:manager_id] == manager_id }.flat_map do |employee| path = path_prefix.empty? ? employee[:name] : "#{path_prefix} → #{employee[:name]}" [{ name: employee[:name], depth: depth, path: path }] + build_org(employees, employee[:id], depth + 1, path) end end build_org(employees, nil, 0, '').sort_by { |r| r[:path] } .each { |r| puts "#{' ' * r[:depth]}#{r[:name]}" }
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER); INSERT INTO employees VALUES (1,'Alice',NULL),(2,'Bob',1),(3,'Carol',1),(4,'Dave',2),(5,'Eve',2); WITH RECURSIVE org_chart AS ( SELECT id, name, manager_id, 0 AS depth, name AS path FROM employees WHERE manager_id IS NULL UNION ALL SELECT e.id, e.name, e.manager_id, oc.depth + 1, oc.path || ' → ' || e.name FROM employees AS e JOIN org_chart AS oc ON e.manager_id = oc.id ) SELECT name, depth, path FROM org_chart ORDER BY path;
A recursive CTE has two parts joined by UNION ALL: the anchor (the base case — top-level rows where manager_id IS NULL) and the recursive member (which references the CTE itself to add one more level). This is the standard SQL way to walk tree structures stored as adjacency lists — org charts, category hierarchies, folder trees.
Window Functions
ROW_NUMBER: rank within a group
employees = [ { name: 'Alice', department: 'Engineering', salary: 90_000 }, { name: 'Bob', department: 'Marketing', salary: 75_000 }, { name: 'Carol', department: 'Engineering', salary: 85_000 }, { name: 'Dave', department: 'Marketing', salary: 80_000 }, { name: 'Eve', department: 'Engineering', salary: 95_000 }, ] result = employees.group_by { |e| e[:department] }.flat_map do |dept, members| members.sort_by { |e| -e[:salary] }.each_with_index.map { |employee, index| { **employee, rank_in_dept: index + 1 } } end result.sort_by { |e| [e[:department], e[:rank_in_dept]] } .each { |e| puts "#{e[:department]} ##{e[:rank_in_dept]}: #{e[:name]} (#{e[:salary]})" }
CREATE TABLE employees (id INTEGER, name TEXT, department TEXT, salary REAL); INSERT INTO employees VALUES (1,'Alice','Engineering',90000),(2,'Bob','Marketing',75000), (3,'Carol','Engineering',85000),(4,'Dave','Marketing',80000),(5,'Eve','Engineering',95000); SELECT name, department, salary, row_number() OVER (PARTITION BY department ORDER BY salary DESC) AS rank_in_dept FROM employees;
Window functions compute a value for each row using a set of related rows — without collapsing them into groups the way GROUP BY does. PARTITION BY divides rows into independent windows (one per department here); ORDER BY inside the window function determines the ranking order. ActiveRecord has no direct equivalent; use .select(Arel.sql(...)) or the window_functions gem.
RANK and DENSE_RANK
scores = [ { player: 'Alice', score: 95 }, { player: 'Bob', score: 87 }, { player: 'Carol', score: 95 }, { player: 'Dave', score: 82 }, { player: 'Eve', score: 87 }, ] sorted = scores.sort_by { |s| -s[:score] } sorted.each do |entry| rank = sorted.index { |s| s[:score] == entry[:score] } + 1 dense_rank = sorted.map { |s| s[:score] }.uniq.index(entry[:score]) + 1 puts "#{entry[:player]}: score #{entry[:score]}, rank #{rank}, dense_rank #{dense_rank}" end
CREATE TABLE scores (player TEXT, score INTEGER); INSERT INTO scores VALUES ('Alice',95),('Bob',87),('Carol',95),('Dave',82),('Eve',87); SELECT player, score, rank() OVER (ORDER BY score DESC) AS rank, dense_rank() OVER (ORDER BY score DESC) AS dense_rank FROM scores;
RANK() skips numbers after ties — two players tied at rank 1 means the next player is rank 3. DENSE_RANK() never skips — the next distinct score gets the next consecutive rank. ROW_NUMBER() assigns unique sequential numbers with no ties, even among equal values.
Running total and moving average
sales = [ { month: '2026-01', amount: 12_000 }, { month: '2026-02', amount: 15_000 }, { month: '2026-03', amount: 9_500 }, { month: '2026-04', amount: 18_000 }, { month: '2026-05', amount: 14_000 }, ] running_total = 0 sales.each_with_index do |sale, index| running_total += sale[:amount] window = sales[[0, index - 2].max..index] moving_avg = (window.sum { |s| s[:amount] } / window.count.to_f).round puts "#{sale[:month]}: $#{sale[:amount]} | running: $#{running_total} | 3-mo avg: $#{moving_avg}" end
CREATE TABLE sales (id INTEGER, month TEXT, amount REAL); INSERT INTO sales VALUES (1,'2026-01',12000),(2,'2026-02',15000),(3,'2026-03',9500), (4,'2026-04',18000),(5,'2026-05',14000); SELECT month, amount, sum(amount) OVER (ORDER BY month) AS running_total, round(avg(amount) OVER (ORDER BY month ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)) AS moving_avg_3mo FROM sales;
SUM(amount) OVER (ORDER BY month) accumulates the sum row by row — a running total. The ROWS BETWEEN 2 PRECEDING AND CURRENT ROW frame limits the window to the current row and the two rows before it, producing a 3-month moving average. Expressing these patterns with ActiveRecord alone requires raw SQL — they are impractical with the query DSL.
LAG and LEAD: look backward/forward
sales = [ { month: '2026-01', amount: 12_000 }, { month: '2026-02', amount: 15_000 }, { month: '2026-03', amount: 9_500 }, { month: '2026-04', amount: 18_000 }, { month: '2026-05', amount: 14_000 }, ] sales.each_with_index do |sale, index| prev = index > 0 ? sales[index - 1] : nil pct_change = prev ? ((sale[:amount] - prev[:amount]) / prev[:amount].to_f * 100).round(1) : nil change_str = pct_change ? "#{pct_change >= 0 ? '+' : ''}#{pct_change}%" : 'n/a' puts "#{sale[:month]}: $#{sale[:amount]} | MoM change: #{change_str}" end
CREATE TABLE sales (id INTEGER, month TEXT, amount REAL); INSERT INTO sales VALUES (1,'2026-01',12000),(2,'2026-02',15000),(3,'2026-03',9500), (4,'2026-04',18000),(5,'2026-05',14000); SELECT month, amount, lag(amount, 1) OVER (ORDER BY month) AS prev_month, round( (amount - lag(amount,1) OVER (ORDER BY month)) / lag(amount,1) OVER (ORDER BY month) * 100, 1 ) AS pct_change FROM sales;
LAG(column, n) accesses the value n rows back; LEAD(column, n) accesses n rows ahead. Both return NULL when the offset falls outside the window. Month-over-month percentage change is one of the most common analytics patterns — it requires a window function or a self join and has no clean ActiveRecord equivalent.
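For LAG specifically, Ruby's each_cons(2) is an idiomatic stand-in: it yields each row together with its predecessor, which is exactly the pairing the month-over-month calculation needs.

```ruby
sales = [
  { month: '2026-01', amount: 12_000 },
  { month: '2026-02', amount: 15_000 },
  { month: '2026-03', amount: 9_500 },
]

# Each consecutive pair is (previous row, current row), like LAG(amount, 1).
changes = sales.each_cons(2).map do |prev, curr|
  pct = ((curr[:amount] - prev[:amount]) / prev[:amount].to_f * 100).round(1)
  { month: curr[:month], pct: pct }
end

changes.each { |c| puts "#{c[:month]}: #{c[:pct] >= 0 ? '+' : ''}#{c[:pct]}%" }
```

Like LAG returning NULL for the first row, each_cons simply produces no pair for it.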