Using Alembic API from inside application code

Python Problem Overview

I am using SQLite as an application file format (see here for why you would want to do this) for my PySide-based desktop application. That is, when a user uses my app, their data is saved in a single database file on their machine. I am using the SQLAlchemy ORM to communicate with the databases.

As I release new versions of the application, I may modify the database schema. I don't want users to have to throw away their data every time I change the schema, so I need to migrate their databases to the newest format. Also, I create temporary databases a lot to save subsets of the data for use with some external processes. I want to create these databases with alembic so they are tagged with the proper version.

I have a few questions:

Is there a way to call alembic from inside my Python code? I think it's weird to have to use Popen to a pure Python module, but the docs just use alembic from the command line. Mainly, I need to change the database location to wherever the user's database is located.
If that's not possible, can I specify a new database location from the command line without editing the .ini file? This would make calling alembic through Popen not a big deal.
I see that alembic keeps its version information under a simple table called alembic_version, with one column called version_num and a single row specifying the version. Can I add an alembic_version table to my schema and populate it with the latest version when I create new databases so there is no overhead? Is that even a good idea; should I just use alembic to create all databases?

I have alembic working great for the single database I use to develop with in my project's directory. I want to use alembic to conveniently migrate and create databases in arbitrary locations, preferably through some sort of Python API, and not the command line. This application is also frozen with cx_Freeze, in case that makes a difference.

Thanks!

Python Solutions

Solution 1 - Python

Here's what I've learned after hooking up my software to alembic:

Is there a way to call alembic from inside my Python code?

Yes. As of this writing the main entry point for alembic is alembic.config.main, so you can import it and call it yourself, for example:

import alembic.config
alembicArgs = [
    '--raiseerr',
    'upgrade', 'head',
]
alembic.config.main(argv=alembicArgs)

Note that alembic looks for migrations in the current directory (i.e., os.getcwd()). I've handled this by using os.chdir(migration_directory) before calling alembic, but there may be a better solution.

Can I specify a new database location from the command line without editing the .ini file?

Yes. The key lies in the -x command line argument. From alembic -h (surprisingly, I wasn't able to find a command line argument reference in the docs):

optional arguments:
 -x X                  Additional arguments consumed by custom env.py
                       scripts, e.g. -x setting1=somesetting -x
                       setting2=somesetting

So you can create your own parameter, e.g. dbPath, and then intercept it in env.py:

alembic -x dbPath=/path/to/sqlite.db upgrade head

then for example in env.py:

def run_migrations_online():   
    # get the alembic section of the config file
    ini_section = config.get_section(config.config_ini_section)

    # if a database path was provided, override the one in alembic.ini
    db_path = context.get_x_argument(as_dictionary=True).get('dbPath')
    if db_path:
        ini_section['sqlalchemy.url'] = db_path

    # establish a connectable object as normal
    connectable = engine_from_config(
        ini_section,
        prefix='sqlalchemy.',
        poolclass=pool.NullPool)

    # etc

Of course, you can supply the -x parameter using argv in alembic.config.main, too.

I agree with @davidism about using migrations vs metadata.create_all() :)

Solution 2 - Python

If you look at the commands API page from the alembic docs you see an example of how to run the CLI commands directly from a Python application. Without going through the CLI code.

Running alembic.config.main has the downside that the env.py script is executed which may not be what you want. For example, it will modify your logging config.

Another, very simple way is to use the "command API" linked above. For example, here is a small helper function which I ended up writing:

from alembic.config import Config
from alembic import command

def run_migrations(script_location: str, dsn: str) -> None:
    LOG.info('Running DB migrations in %r on %r', script_location, dsn)
    alembic_cfg = Config()
    alembic_cfg.set_main_option('script_location', script_location)
    alembic_cfg.set_main_option('sqlalchemy.url', dsn)
    command.upgrade(alembic_cfg, 'head')

I am using the set_main_option method here to be able to run the migrations on a different DB if needed. So I can simply call this as follows:

run_migrations('/path/to/migrations', 'postgresql:///my_database')

Where you get those two values (path and DSN) from is up to you. But this seems to be very close to what you want to achieve. The commands API also has the stamp() methods which allows you mark a given DB to be of a specific version. The example above can be easily adapted to call this.

Solution 3 - Python

This is a very broad question, and actually implementing your idea will be up to you, but it is possible.

You can call Alembic from your Python code without using the commands, since it's implemented in Python too! You just need to recreate what the commands are doing behind the scenes.

Admittedly, the docs aren't in very good shape since these are still relatively early releases of the library, but with a little digging you will find the following:

Create a Config
Use the Config to create a ScriptDirectory
Use the Config and the ScriptDirectory to create an EnvironmentContext
Use the EnvironmentContext to create a MigrationContext
Most commands use some combination of methods from Config and MigrationContext

I've written an extension to provide this programmatic Alembic access to a Flask-SQLAlchemy database. The implementation is tied to Flask and Flask-SQLAlchemy, but it should be a good place to start. See Flask-Alembic here.

Regarding your last point about how to create new databases, you can either use Alembic to create the tables, or you can use metadata.create_all() then alembic stamp head (or equivalent python code). I recommend always using the migration path to create the tables, and ignoring the raw metadata.create_all().

I don't have any experience with cx_freeze, but it should be fine as long as the migrations are included in the distribution and the path to that directory in the code is correct.

Solution 4 - Python

Here is a purely programmical example of howto configure and call alembic commands programmatically.

The directory setup (for easier code reading)

.                         # root dir
|- alembic/               # directory with migrations
|- tests/diy_alembic.py   # example script
|- alembic.ini            # ini file

And here is diy_alembic.py

import os
import argparse
from alembic.config import Config
from alembic import command
import inspect

def alembic_set_stamp_head(user_parameter):
    # set the paths values
    this_file_directory = os.path.dirname(os.path.abspath(inspect.stack()[0][1]))
    root_directory      = os.path.join(this_file_directory, '..')
    alembic_directory   = os.path.join(root_directory, 'alembic')
    ini_path            = os.path.join(root_directory, 'alembic.ini')

    # create Alembic config and feed it with paths
    config = Config(ini_path)
    config.set_main_option('script_location', alembic_directory)    
    config.cmd_opts = argparse.Namespace()   # arguments stub

    # If it is required to pass -x parameters to alembic
    x_arg = 'user_parameter=' + user_parameter
    if not hasattr(config.cmd_opts, 'x'):
        if x_arg is not None:
            setattr(config.cmd_opts, 'x', [])
            if isinstance(x_arg, list) or isinstance(x_arg, tuple):
                for x in x_arg:
                    config.cmd_opts.x.append(x)
            else:
                config.cmd_opts.x.append(x_arg)
        else:
            setattr(config.cmd_opts, 'x', None)

    #prepare and run the command
    revision = 'head'
    sql = False
    tag = None
    command.stamp(config, revision, sql=sql, tag=tag)

    #upgrade command
    command.upgrade(config, revision, sql=sql, tag=tag)

The code is more or less a cut from this Flask-Alembic file. It is a good place to look at other commands usage and details.

Why this solution? - It was written in a need of creating an alembic stamps, upgrades and downgrades when running automated tests.

os.chdir(migration_directory) interfered with some tests.
We wanted to have ONE source of database creation and manipulation. "If we crate and manage databases with alembic, alembic but not metadata.create_all() shell be used for tests too".
Even if the code above is longer than 4 lines, alembic showed itself as a good controllable beast if driven this way.

Solution 5 - Python

See documentation of alembic.operations.base.Operations:

    from alembic.runtime.migration import MigrationContext
    from alembic.operations import Operations

    conn = myengine.connect()
    ctx = MigrationContext.configure(conn)
    op = Operations(ctx)

    op.alter_column("t", "c", nullable=True)

Solution 6 - Python

I'm not using Flask so I couldn't make use of the Flask-Alembic library that was already recommended. Instead after quite a bit of tinkering I coded up the following short function to run all of the applicable migrations. I keep all of my alembic-related files under a submodule (folder) called migrations. I actually keep the alembic.ini together with the env.py, which is perhaps a little bit unorthodox. Here's a snippet from my alembic.ini file to adjust for that:

[alembic]
script_location = .

Then I added the following file in the same directory and named it run.py. But wherever you keep your scripts, all you'd need to do is modify the code below to point to the correct paths:

from alembic.command import upgrade
from alembic.config import Config
import os


def run_sql_migrations():
    # retrieves the directory that *this* file is in
    migrations_dir = os.path.dirname(os.path.realpath(__file__))
    # this assumes the alembic.ini is also contained in this same directory
    config_file = os.path.join(migrations_dir, "alembic.ini")
    
    config = Config(file_=config_file)
    config.set_main_option("script_location", migrations_dir)
        
    # upgrade the database to the latest revision
    upgrade(config, "head")

Then with that run.py file in place, it allows me to do this in my main code:

from mymodule.migrations.run import run_sql_migrations


run_sql_migrations()

Solution 7 - Python

For anyone else trying to achieve a flyway-esque result with SQLAlchemy, this worked for me:

Add migration.py to your project:

from flask_alembic import Alembic

def migrate(app):
    alembic = Alembic()
    alembic.init_app(app)
    with app.app_context():
        alembic.upgrade()

Call it on application startup after your db has been initialized

application = Flask(__name__)
db = SQLAlchemy()
db.init_app(application)
migration.migrate(application)

Then you just need to do the rest of the standard alembic steps:

Initialize your project as alembic

alembic init alembic

Update env.py:

from models import MyModel
target_metadata = [MyModel.Base.metadata]

Update alembic.ini

sqlalchemy.url = postgresql://postgres:postgres@localhost:5432/my_db

Assuming your SQLAlchemy models are already defined, you can autogenerate your scripts now:

alembic revision --autogenerate -m "descriptive migration message"

If you get an error about not being able to import your model in env.py, you can run the following in your terminal fo fix

export PYTHONPATH=/path/to/your/project

Lastly, my migration scripts were getting generated in the alembic/versions directory, and I had to copy them to the migrations directory for alembic to pick them up.

├── alembic
│   ├── env.py
│   ├── README
│   ├── script.py.mako
│   └── versions
│       ├── a5402f383da8_01_init.py  # generated here...
│       └── __pycache__
├── alembic.ini
├── migrations
│   ├── a5402f383da8_01_init.py  # manually copied here
│   └── script.py.mako

I probably have something misconfigured, but it is working now.

Solution 8 - Python

Alembic exposes all of it's commands as importable callables under alembic.command.

https://alembic.sqlalchemy.org/en/latest/api/commands.html

I wrote this wrapper, just so I would be able to setup custom logging via python code.

import logging

import alembic.command
import alembic.config

from somewhere import config_logging


def run():
    config_logging()

    log = logging.getLogger(__name__)

    if len(sys.argv) < 3:
        log.error("command must be specified")
        exit(1)

    else:
        command_name = sys.argv[2]

    try:
        command = getattr(alembic.command, name)

    except AttributeError:
        log.error(f"{name} is not a valid alembic command")
        exit(2)

    config = alembic.config.Config()
    config.set_main_option("script_location", "path/to/alembic")
    config.set_main_option("sqlalchemy.url", "postgres://...")

    command(config, *sys.argv[3:])

Solution 9 - Python

Not really an answer, but i had a hardtime with that, so I wanted to share:

How to pass an x_argument programatically with alembic.command.upgrade:

class CmdOpts:
    x = {"data=true"}

here data=true is what i pass as x_argument in command line

    alembic_config = AlembicConfig(ini_location)
    setattr(alembic_config, "cmd_opts", CmdOpts())
    alembic_config.cmd_opts.x = {"data": True}

Content Type	Original Author	Original Content on Stackoverflow
Question	John David Reaver	View Question on Stackoverflow
Solution 1 - Python	ForeverWintr	View Answer on Stackoverflow
Solution 2 - Python	exhuma	View Answer on Stackoverflow
Solution 3 - Python	davidism	View Answer on Stackoverflow
Solution 4 - Python	MajesticRa	View Answer on Stackoverflow
Solution 5 - Python	moomima	View Answer on Stackoverflow
Solution 6 - Python	soapergem	View Answer on Stackoverflow
Solution 7 - Python	Matthew	View Answer on Stackoverflow
Solution 8 - Python	thnee	View Answer on Stackoverflow
Solution 9 - Python	Yohann	View Answer on Stackoverflow