Database migration architecture
Status: proposed
Deciders: @lorenyu, @daphnegold, @chouinar, @Nava-JoshLong, @addywolf-nava, @sawyerh, @acouch, @SammySteiner
Date: 2023-06-05
Context and Problem Statement
What is the most optimal setup for database migrations infrastructure and deployment? This will break down the different options for how the migration is run, but not the tools or languages the migration will be run with, that will be dependent on the framework the application is using.
Questions that need to be addressed:
How will the method get the latest migration code to run?
What infrastructure is required to use this method?
How is the migration deployment re-run in case of errors?
Decision Drivers
Security
Simplicity
Flexibility
Considered Options
Run migrations from GitHub Actions
Run migrations from a Lambda function
Run migrations from an ECS task
Run migrations from self-hosted GitHub Actions runners
Decision Outcome
Pros
No changes to the database network configuration are needed. The database can remain inaccessible from the public internet.
Database migrations use the same application image and task definition as the base application.
Cons
Other options considered
Run migrations from GitHub Actions using a direct database connection
Temporarily update the database to be accessible from the internet and allow incoming network traffic from the GitHub Action runner's IP address. Then run the migrations directly from the GitHub Action runner. At the end, revert the database configuration changes.
Pros:
Simple. Requires no additional infrastructure
Cons:
This method requires temporarily exposing the database to incoming connections from the internet, which may not comply with agency security policies.
Run migrations from a Lambda function
Pros:
Relatively simple. Lambdas are already used for managing database roles.
The Lambda function can run from within the VPC, avoiding the need to expose the database to the public internet.
The Lambda function is separate from the application service, so we avoid the need to modify the service's task definition.
Cons:
Lambda functions have a maximum runtime of 15 minutes, which can limit certain kinds of migrations.
Run migrations from self-hosted GitHub Actions runners
Pros
If a project already uses self-hosted runners, this can be the simplest option as it provides all the benefits running migrations directly from GitHub Actions without the security impact.
Cons
The main downside is that this requires maintaining self-hosted GitHub Action runners, which is too costly to implement and maintain for projects that don't already have it set up.
Related ADRS:
Last updated
Was this helpful?