5 Steps To Change A Column Type With Zero Downtime Deployment In Rails
Tags: migrations, rails, postgres, audited gem
Background
It’s no secret that Ruby on Rails is a great web framework to build something fast. However, with mature mid-size to large web applications come more responsibilities and concerns.
Prior to Rails 5.1 both primary and foreign keys by default were set to a 32-bit integer type. When the application grows, the database has to store more data. That means sooner or later every developer starts wondering what to do when the table hits the magical limit number. Luckily, with Rails 5.1 primary keys were defaulted to big integers
On the other hand, there is always a chance that some popular gems in the Rails community have not caught up with the new change.
One of such gems we encountered was audited
gem. We use this gem to log changes of some of our models in our Postgres database.
The Challenge
The audited
gem stores data into a polymorphic audits
table where auditable_id
column represents primary key of the certain model we want to track and auditable_type
column represents the class name of that model.
At the time of this writing, auditable_id
column is an int
type. There is an open PR to make necessary changes to the gem for switching columns from int
to the bigint
type but, for now, it hasn’t been merged yet.
We had to make this change ourselves as we store and track more and more data.
The challenge here, however, was that our audits
table became large enough to start thinking about uninterrupted zero downtime deployment.
The Solution
If we followed the standard approach for changing a column type in the table we could use the change_column
method and go on with our day.
change_column :audits, :auditable_id, :bigint
But this approach is risky to deploy to production.
We strive to follow the best practices and have uninterrupted deployments. To achieve that we also use the strong_migrations
gem.
According to the strong_migrations
gem, using the change_column
method blocks reading and writing on the entire table. This method re-writes the whole table. As suggested by the author of the gem, the better approach would be to follow multiple steps.
After doing our research, we decided to introduce a 5 step approach to avoid production outages during our deployments.
First deployment
- Create a temporariy column
temp_auditable_id
- Write data to both columns:
auditable_id
andtemp_auditable_id
Second deployment
- Route the
auditable_id
attribute to read/write to thetemp_auditable_id
column, with read fallback onauditable_id
and ultimately on_auditable_id
column, which will be introduced in the third deployment. - Remove the
before_save
callback that populates data totemp_auditable_id
column fromauditable_id
.
Third deployment
- Rename
auditable_id
column to_auditable_id
. - Rename
temp_auditable_id
column toauditable_id
. After the third deployment, verify in the production database that_auditable_id
column has the exact same values asauditable_id
.
Fourth deployment
- Ignore
_auditable_id
column with theignored_columns
method.
Fifth deployment
- Remove other monkey-patches from
Audited::Audit
deployment. - Remove
_auditable_id
column.
Now that we have a plan, let’s discuss each part in detail.
Step 1
The first step would be to create a new temporary column and write data to both auditable_id
and temp_auditable_id
columns.
To achieve this we added temp_auditable_id
column to the audits
table.
disable_ddl_transaction!
def change
add_column :audits, :temp_auditable_id, :bigint
add_index :audits, [:auditable_type, :temp_auditable_id, :version], name: "temp_auditable_index", algorithm: :concurrently
end
We also had to monkey-patch the gem to add a before_save
callback to be able to store new data into the temporary column.
# frozen_string_literal: true
module AuditTempColumns
extend ActiveSupport::Concern
included do
before_save :save_to_temp_columns
end
private
def save_to_temp_columns
self.temp_auditable_id = auditable_id
end
end
Audited::Audit.include(AuditTempColumns)
Last but not least, we had to backfill the existing data into the temporary column. That could be achieved in multiple ways. We chose to do it via a job and call this job from the migration.
class MaintenanceTasks::AuditsBackfillingJob < ApplicationJob
queue_as :default
def perform
::Audited::Audit.where(temp_auditable_id: nil).in_batches do |records|
records.update_all("temp_auditable_id = auditable_id")
sleep(0.01) # throttle
end
end
end
This is how our migration changed.
disable_ddl_transaction!
def up
add_column :audits, :temp_auditable_id, :bigint
add_index :audits, [:auditable_type, :temp_auditable_id, :version], name: "temp_auditable_index", algorithm: :concurrently
MaintenanceTasks::AuditsBackfillingJob.set(wait: 2.hours).perform_later
end
def down
remove_index :audits, name: "temp_auditable_index"
remove_column :audits, :temp_auditable_id, if_exists: true
end
It is a good idea to check that values in both columns are the same.
We can run the following command in our Rails console after the deployment is complete and data is backfilled.
Audited::Audit.where("temp_auditable_id <> auditable_id").count
Once we deployed Step 1 and made sure that data was backfilled properly, it was time to move on to the Step 2 of our plan.
Step 2
For the second deployment we had to do the following:
- Route the
auditable_id
to fallback totemp_auditable_id
if it exists. If thetemp_auditable_id
column does not exist, fall back to eitherauditable_id
or_auditable_id
columns. - Remove the
before_save
callback that populates data totemp_auditable_id
column fromauditable_id
. Instead we have to introduce custom getters and setters to ensure the integrity of the stored data.
# frozen_string_literal: true
module AuditTempColumns
extend ActiveSupport::Concern
# Reading data
def auditable_id
self[:temp_auditable_id] || self[:_auditable_id] || super
end
# Writing data
def auditable_id=(value)
self[:temp_auditable_id] = value if temp_auditable_id_column_exists?
self[:_auditable_id] = value if _auditable_id_column_exists?
super if auditable_id_column_exists?
end
def temp_auditable_id=(value)
self.auditable_id = value
end
def _auditable_id=(value)
self.auditable_id = value
end
private
def temp_auditable_id_column_exists?
ActiveRecord::Base.connection.column_exists?(:audits, :temp_auditable_id)
end
def _auditable_id_column_exists?
ActiveRecord::Base.connection.column_exists?(:audits, :_auditable_id)
end
def auditable_id_column_exists?
ActiveRecord::Base.connection.column_exists?(:audits, :auditable_id)
end
end
Audited::Audit.include(AuditTempColumns)
We added ability to read and write data to _auditable_id
in this deployment in advance so that the code changes get to production before the further updates in the migrations.
Step 3
Once Step 2 is deployed, we follow up on renaming the columns. Here we have to make next changes:
- Rename
auditable_id
column to_auditable_id
. - Rename
temp_auditable_id
column toauditable_id
.
We broke down this task into 2 subtasks. In the first step we renamed the columns.
def change
safety_assured do
rename_column :audits, :auditable_id, :_auditable_id
rename_column :audits, :temp_auditable_id, :auditable_id
end
end
In the second step we renamed the indexes.
def change
rename_index :audits, 'auditable_index', '_auditable_index'
rename_index :audits, 'temp_auditable_index', 'auditable_index'
end
We can run this command to verify that both auditable_id
and _auditable_id
columns have the same values:
Audited::Audit.where("_auditable_id <> auditable_id").count
Step 4
Next, we need to prepare our code for removing _auditable_id
column. It is safer to ignore the column before completely removing it. At this point we also do not need to use any custom getters and setters.
# frozen_string_literal: true
module AuditTempColumns
extend ActiveSupport::Concern
included do
self.ignored_columns = ["_auditable_id"]
end
end
Audited::Audit.include(AuditTempColumns)
Step 5
After deploying all 4 parts, we made it to the end! This is the last step of this journey and it includes the following actions:
- Removing
_auditable_id
column. - Removing monkey-patches from
Auditable::Audit
.
We need to completely delete AuditTempColumns
file and add one last migration.
def change
# This column is ignored in the app
safety_assured { remove_column :audits, :_auditable_id, :int }
end
Conclusion
Applying any change to existing large tables requires additional attention and a carefully thought out deployment plan.
I try to follow the best practices described in the strong_migrations
gem when I write new migrations. Having multiple deployments for a small task such as changing a column type can seem as a big amount of work.
At the same time, it’s always better to keep in the back of our minds that one small migration can cause an outage to the application in producation and create an unpleasant experience for the end users.