Tests & Security High severity

Plaintext PII

Storing personally identifiable information (national IDs, tax numbers, medical data, bank account numbers) in plaintext columns of the production database, violating GDPR, HIPAA, and most industry compliance frameworks.

Before / After

Problematic Pattern
# db/schema.rb
create_table :patients do |t|
  t.string :name
  t.string :email
  t.string :ssn           # plaintext
  t.string :tax_number    # plaintext
  t.text   :medical_notes # plaintext
end

# Any developer with read access to the DB
# sees all sensitive data.
# Backups contain plaintext.
# Hard to satisfy HIPAA or GDPR security requirements.

Target Architecture
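# Gemfile
gem 'lockbox'
gem 'blind_index' # separate gem that provides the blind_index macro

# db/migrate - add encrypted columns
add_column :patients, :ssn_ciphertext, :text
add_column :patients, :tax_number_ciphertext, :text
add_column :patients, :medical_notes_ciphertext, :text
# email needs its own ciphertext column plus a blind index column
add_column :patients, :email_ciphertext, :text
add_column :patients, :email_bidx, :string
add_index  :patients, :email_bidx

# app/models/patient.rb
class Patient < ApplicationRecord
  has_encrypted :ssn, :tax_number, :medical_notes

  # For searchable fields, add a blind index:
  has_encrypted :email
  blind_index :email
end

# Migration: copy data, verify, drop old columns.
# Keep keys outside the database, use a separate key per environment,
# and rotate them regularly.
# Rails 7+ alternative: ActiveRecord Encryption
#   (encrypts :ssn, deterministic: true)
#   deterministic mode is only needed for columns you query by value.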

Why this hurts

Plaintext PII in the database is accessible to every process that holds valid credentials: the Rails application, background workers, read replicas, analytics pipelines, and any developer with production access. A SQL injection vulnerability anywhere in the application exposes every sensitive column. A leaked backup file (S3 misconfiguration, stolen laptop) contains a complete copy of the data, and the leak is permanent because a copy that has left your control cannot be recalled or re-encrypted. Incident response scope balloons because discovery requires reviewing every code path that reads the affected columns.

GDPR Article 32 names “the pseudonymisation and encryption of personal data” among the appropriate technical measures for protecting personal data. The HIPAA Technical Safeguards at 45 CFR 164.312(a)(2)(iv) list encryption and decryption of electronic protected health information as an addressable implementation specification, which means an organization must either implement it or document why an equivalent alternative is reasonable. Neither regulation technically mandates column-level encryption over disk-level encryption, but auditors increasingly treat column-level as the expected control. Fines under GDPR can reach 4% of annual global turnover, and HIPAA penalties can reach $1.5 million per year for repeated violations of the same provision.

The secondary exposures are often worse than the primary database. Log aggregators capture request parameters that include PII verbatim if filter_parameters is not configured properly. Error trackers (Sentry, Bugsnag) attach request bodies and database query parameters to exception reports, which replicate the plaintext to third-party systems with their own retention policies. Analytics platforms and data warehouses receive PII via ETL pipelines and retain it indefinitely. Each copy expands the compliance scope to cover the downstream system.
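The log-redaction step can be illustrated with a stand-alone sketch. In a Rails app the real control is the filter_parameters setting shown in the first comment; the SENSITIVE_KEYS constant and the redact_params helper below are hypothetical names that only approximate its behavior (substring match on key names, nested hashes and arrays) so the effect can be seen without booting Rails.

```ruby
# In a Rails app, the equivalent configuration would be:
#   Rails.application.config.filter_parameters += %i[ssn tax_number medical_notes]
#
# Stand-alone approximation for illustration only.
# SENSITIVE_KEYS and redact_params are hypothetical names, not Rails APIs.
SENSITIVE_KEYS = %w[ssn tax_number medical_notes].freeze

# Recursively replace the values of sensitive keys before anything is logged.
def redact_params(value)
  case value
  when Hash
    value.each_with_object({}) do |(key, inner), out|
      sensitive = SENSITIVE_KEYS.any? { |name| key.to_s.include?(name) }
      out[key] = sensitive ? "[FILTERED]" : redact_params(inner)
    end
  when Array
    value.map { |item| redact_params(item) }
  else
    value
  end
end

params = {
  "patient" => {
    "name"   => "Ada",
    "ssn"    => "123-45-6789",
    "visits" => [{ "medical_notes" => "example note" }]
  }
}

puts redact_params(params).inspect
```

The same key list usually needs to be mirrored in the error tracker's own scrubbing settings, since those tools capture request data independently of the Rails logger.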

Column-level encryption with a library like lockbox or the ActiveRecord::Encryption module built into Rails 7+ solves the at-rest requirement by storing ciphertext with a key held outside the database. The application decrypts in memory only when needed, so leaked database backups contain ciphertext instead of plaintext. A blind_index column allows searching by encrypted field using a separate HMAC-derived column, trading off some information leakage for query-ability. Key rotation, per-environment keys, and strict filter_parameters configuration close the remaining gaps.
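To make the two mechanisms concrete, here is a minimal sketch using only Ruby's bundled OpenSSL bindings. It is not how lockbox or ActiveRecord Encryption are implemented internally; the helper names (encrypt_field, decrypt_field, blind_index_for), the nonce-tag-ciphertext layout, and the in-process random keys are all assumptions made for illustration. In a real application the keys would come from a secrets manager or environment variable, and a maintained library should do this work.

```ruby
require "openssl"

# Hypothetical keys, generated in-process for the demo only.
ENCRYPTION_KEY = OpenSSL::Random.random_bytes(32)
INDEX_KEY      = OpenSSL::Random.random_bytes(32)

# Encrypt with AES-256-GCM. A fresh random nonce per call means the same
# plaintext produces a different ciphertext every time.
def encrypt_field(plaintext, key)
  cipher = OpenSSL::Cipher.new("aes-256-gcm")
  cipher.encrypt
  cipher.key = key
  nonce = cipher.random_iv            # 12 bytes for GCM
  cipher.auth_data = ""
  body = plaintext.empty? ? "".b : cipher.update(plaintext)
  body += cipher.final
  # Store nonce + 16-byte auth tag + ciphertext as strict Base64 text.
  [nonce + cipher.auth_tag + body].pack("m0")
end

# Reverse of encrypt_field; raises OpenSSL::Cipher::CipherError if the
# stored value was tampered with or the key is wrong.
def decrypt_field(encoded, key)
  raw = encoded.unpack1("m0")
  nonce, tag, body = raw[0, 12], raw[12, 16], raw[28..]
  cipher = OpenSSL::Cipher.new("aes-256-gcm")
  cipher.decrypt
  cipher.key = key
  cipher.iv = nonce
  cipher.auth_tag = tag
  cipher.auth_data = ""
  plain = body.empty? ? "".b : cipher.update(body)
  (plain + cipher.final).force_encoding(Encoding::UTF_8)
end

# Blind index: a keyed, deterministic digest of the normalized value.
# Equal inputs give equal digests, which enables exact-match lookups
# and is also the information it leaks.
def blind_index_for(value, key)
  OpenSSL::HMAC.hexdigest("SHA256", key, value.strip.downcase)
end

stored_ssn   = encrypt_field("123-45-6789", ENCRYPTION_KEY)
stored_email = encrypt_field("ada@example.com", ENCRYPTION_KEY)
email_bidx   = blind_index_for("ada@example.com", INDEX_KEY)

puts decrypt_field(stored_ssn, ENCRYPTION_KEY)
puts email_bidx == blind_index_for(" Ada@Example.com ", INDEX_KEY)
```

A lookup by email then compares the HMAC of the search term against the stored index column, so the database never sees the plaintext. Using a separate key for the index keeps a compromise of one key from exposing both values.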
