How to work around lack of support for foreign keys across databases in Django

DjangoDjango ModelsDjango Orm

Django Problem Overview


I know Django does not support foreign keys across multiple databases (originally Django 1.3 docs)

But I'm looking for a workaround.

##What doesn't work

I have two models each on a separate database.

routers.py:

class NewsRouter(object):
    def db_for_read(self, model, **hints):
        if model._meta.app_label == 'news_app':
            return 'news_db'
        return None

    def db_for_write(self, model, **hints):
        if model._meta.app_label == 'news_app':
            return 'news_db'
        return None

    def allow_relation(self, obj1, obj2, **hints):
        if obj1._meta.app_label == 'news_app' or obj2._meta.app_label == 'news_app':
            return True
        return None

    def allow_syncdb(self, db, model):
        if db == 'news_db':
            return model._meta.app_label == 'news_app'
        elif model._meta.app_label == 'news_app':
            return False
        return None

Model 1 in fruit_app/models.py:

from django.db import models

class Fruit(models.Model):
    name = models.CharField(max_length=20)

Model 2 in news_app/models.py:

from django.db import models

class Article(models.Model):
    fruit = models.ForeignKey('fruit_app.Fruit')
    intro = models.TextField()

Trying to add a "Article" in the admin gives the following error because it is looking for the Fruit model on the wrong database ('news_db'):

DatabaseError at /admin/news_app/article/add/

(1146, "Table 'fkad_news.fruit_app_fruit' doesn't exist")

##Method 1: subclass IntegerField

I created a custom field, ForeignKeyAcrossDb, which is a subclass of IntegerField. Code is on github at: https://github.com/saltycrane/django-foreign-key-across-db-testproject/tree/integerfield_subclass

fields.py:

from django.db import models


class ForeignKeyAcrossDb(models.IntegerField):
    '''
    Exists because foreign keys do not work across databases
    '''
    def __init__(self, model_on_other_db, **kwargs):
        self.model_on_other_db = model_on_other_db
        super(ForeignKeyAcrossDb, self).__init__(**kwargs)

    def to_python(self, value):
        # TODO: this db lookup is duplicated in get_prep_lookup()
        if isinstance(value, self.model_on_other_db):
            return value
        else:
            return self.model_on_other_db._default_manager.get(pk=value)

    def get_prep_value(self, value):
        if isinstance(value, self.model_on_other_db):
            value = value.pk
        return super(ForeignKeyAcrossDb, self).get_prep_value(value)

    def get_prep_lookup(self, lookup_type, value):
        # TODO: this db lookup is duplicated in to_python()
        if not isinstance(value, self.model_on_other_db):
            value = self.model_on_other_db._default_manager.get(pk=value)

        return super(ForeignKeyAcrossDb, self).get_prep_lookup(lookup_type, value)

And I changed my Article model to be:

class Article(models.Model):
    fruit = ForeignKeyAcrossDb(Fruit)
    intro = models.TextField()

The problem is, sometimes when I access Article.fruit, it is an integer, and sometimes it is the Fruit object. I want it to always be a Fruit object. What do I need to do to make accessing Article.fruit always return a Fruit object?

As a workaround for my workaround, I added a fruit_obj property, but I would like to eliminate this if possible:

class Article(models.Model):
    fruit = ForeignKeyAcrossDb(Fruit)
    intro = models.TextField()

    # TODO: shouldn't need fruit_obj if ForeignKeyAcrossDb field worked properly
    @property
    def fruit_obj(self):
        if not hasattr(self, '_fruit_obj'):
            # TODO: why is it sometimes an int and sometimes a Fruit object?
            if isinstance(self.fruit, int) or isinstance(self.fruit, long):
                print 'self.fruit IS a number'
                self._fruit_obj = Fruit.objects.get(pk=self.fruit)
            else:
                print 'self.fruit IS NOT a number'
                self._fruit_obj = self.fruit
        return self._fruit_obj

    def fruit_name(self):
        return self.fruit_obj.name

##Method 2: subclass ForeignKey field

As a second attempt, I tried subclassing the ForeignKey field. I modified ReverseSingleRelatedObjectDescriptor to use the database specified by forced_using on the model manager of Fruit. I also removed the validate() method on the ForeignKey subclass. This method did not have the same problem as method 1. Code on github at: https://github.com/saltycrane/django-foreign-key-across-db-testproject/tree/foreignkey_subclass

fields.py:

from django.db import models
from django.db import router
from django.db.models.query import QuerySet


class ReverseSingleRelatedObjectDescriptor(object):
    # This class provides the functionality that makes the related-object
    # managers available as attributes on a model class, for fields that have
    # a single "remote" value, on the class that defines the related field.
    # In the example "choice.poll", the poll attribute is a
    # ReverseSingleRelatedObjectDescriptor instance.
    def __init__(self, field_with_rel):
        self.field = field_with_rel

    def __get__(self, instance, instance_type=None):
        if instance is None:
            return self

        cache_name = self.field.get_cache_name()
        try:
            return getattr(instance, cache_name)
        except AttributeError:
            val = getattr(instance, self.field.attname)
            if val is None:
                # If NULL is an allowed value, return it.
                if self.field.null:
                    return None
                raise self.field.rel.to.DoesNotExist
            other_field = self.field.rel.get_related_field()
            if other_field.rel:
                params = {'%s__pk' % self.field.rel.field_name: val}
            else:
                params = {'%s__exact' % self.field.rel.field_name: val}

            # If the related manager indicates that it should be used for
            # related fields, respect that.
            rel_mgr = self.field.rel.to._default_manager
            db = router.db_for_read(self.field.rel.to, instance=instance)
            if getattr(rel_mgr, 'forced_using', False):
                db = rel_mgr.forced_using
                rel_obj = rel_mgr.using(db).get(**params)
            elif getattr(rel_mgr, 'use_for_related_fields', False):
                rel_obj = rel_mgr.using(db).get(**params)
            else:
                rel_obj = QuerySet(self.field.rel.to).using(db).get(**params)
            setattr(instance, cache_name, rel_obj)
            return rel_obj

    def __set__(self, instance, value):
        raise NotImplementedError()

class ForeignKeyAcrossDb(models.ForeignKey):

    def contribute_to_class(self, cls, name):
        models.ForeignKey.contribute_to_class(self, cls, name)
        setattr(cls, self.name, ReverseSingleRelatedObjectDescriptor(self))
        if isinstance(self.rel.to, basestring):
            target = self.rel.to
        else:
            target = self.rel.to._meta.db_table
        cls._meta.duplicate_targets[self.column] = (target, "o2m")

    def validate(self, value, model_instance):
        pass

fruit_app/models.py:

from django.db import models


class FruitManager(models.Manager):
    forced_using = 'default'


class Fruit(models.Model):
    name = models.CharField(max_length=20)

    objects = FruitManager()

news_app/models.py:

from django.db import models

from foreign_key_across_db_testproject.fields import ForeignKeyAcrossDb
from foreign_key_across_db_testproject.fruit_app.models import Fruit


class Article(models.Model):
    fruit = ForeignKeyAcrossDb(Fruit)
    intro = models.TextField()

    def fruit_name(self):
        return self.fruit.name

##Method 2a: Add a router for fruit_app

This solution uses an additional router for fruit_app. This solution does not require the modifications to ForeignKey that were required in Method 2. After looking at Django's default routing behavior in django.db.utils.ConnectionRouter, we found that even though we expected fruit_app to be on the 'default' database by default, the instance hint passed to db_for_read for foreign key lookups put it on the 'news_db' database. We added a second router to ensure fruit_app models were always read from the 'default' database. A ForeignKey subclass is only used to "fix" the ForeignKey.validate() method. (If Django wanted to support foreign keys across databases, I would say this is a Django bug.) Code is on github at: https://github.com/saltycrane/django-foreign-key-across-db-testproject

routers.py:

class NewsRouter(object):
    def db_for_read(self, model, **hints):
        if model._meta.app_label == 'news_app':
            return 'news_db'
        return None

    def db_for_write(self, model, **hints):
        if model._meta.app_label == 'news_app':
            return 'news_db'
        return None

    def allow_relation(self, obj1, obj2, **hints):
        if obj1._meta.app_label == 'news_app' or obj2._meta.app_label == 'news_app':
            return True
        return None

    def allow_syncdb(self, db, model):
        if db == 'news_db':
            return model._meta.app_label == 'news_app'
        elif model._meta.app_label == 'news_app':
            return False
        return None


class FruitRouter(object):
    def db_for_read(self, model, **hints):
        if model._meta.app_label == 'fruit_app':
            return 'default'
        return None

    def db_for_write(self, model, **hints):
        if model._meta.app_label == 'fruit_app':
            return 'default'
        return None

    def allow_relation(self, obj1, obj2, **hints):
        if obj1._meta.app_label == 'fruit_app' or obj2._meta.app_label == 'fruit_app':
            return True
        return None

    def allow_syncdb(self, db, model):
        if db == 'default':
            return model._meta.app_label == 'fruit_app'
        elif model._meta.app_label == 'fruit_app':
            return False
        return None

fruit_app/models.py:

from django.db import models


class Fruit(models.Model):
    name = models.CharField(max_length=20)

news_app/models.py:

from django.db import models

from foreign_key_across_db_testproject.fields import ForeignKeyAcrossDb
from foreign_key_across_db_testproject.fruit_app.models import Fruit


class Article(models.Model):
    fruit = ForeignKeyAcrossDb(Fruit)
    intro = models.TextField()

    def fruit_name(self):
        return self.fruit.name

fields.py:

from django.core import exceptions
from django.db import models
from django.db import router


class ForeignKeyAcrossDb(models.ForeignKey):

    def validate(self, value, model_instance):
        if self.rel.parent_link:
            return
        models.Field.validate(self, value, model_instance)
        if value is None:
            return

        using = router.db_for_read(self.rel.to, instance=model_instance)  # is this more correct than Django's 1.2.5 version?
        qs = self.rel.to._default_manager.using(using).filter(
                **{self.rel.field_name: value}
             )
        qs = qs.complex_filter(self.rel.limit_choices_to)
        if not qs.exists():
            raise exceptions.ValidationError(self.error_messages['invalid'] % {
                'model': self.rel.to._meta.verbose_name, 'pk': value})

##Additional information

##Update

We implemented the last method after tweaking our routers some more. The whole implementation has been pretty painful which makes us think that we must be doing it wrong. On the TODO list is writing unit tests for this.

Django Solutions


Solution 1 - Django

You could make a view in the database that has the cross database query in it, then define the model for the view in a separate file to keep syncdb working.

Happy programming. :)

Solution 2 - Django

I know that Djano-nosql has support for keys and such though some magic from http://www.allbuttonspressed.com/projects/django-dbindexer. Maybe some of that could help.

From the description:

"you can just tell the dbindexer which models and fields should support these queries and it'll take care of maintaining the required indexes for you."

-Kerry

Solution 3 - Django

As to the ForeignKeyAcrossDb part, couldn't you possibly make some adjustments to your class inside __init__? Check if the appropriate field is Integer if not, load it from the database, or do anything else that is required. Python __class__es can be changed at runtime without much problem.

Solution 4 - Django

After breaking my head some days, I managed to get my Foreign Key ON THE SAME BANK!

Can be made ​​a change over the FORM to seek a FOREIGN KEY in a different bank!

First, add a RECHARGE of FIELDS, both directly (crack) my form, in function init

app.form.py

# -*- coding: utf-8 -*-
from django import forms
import datetime
from app_ti_helpdesk import models as mdp

#classe para formulario de Novo HelpDesk
class FormNewHelpDesk(forms.ModelForm):
    class Meta:
        model = mdp.TblHelpDesk
        fields = (
        "problema_alegado",
        "cod_direcionacao",
        "data_prevista",
        "hora_prevista",
        "atendimento_relacionado_a",
        "status",
        "cod_usuario",
        )

    def __init__(self, *args, **kwargs):
        #-------------------------------------
        #  using remove of kwargs
        #-------------------------------------
        db = kwargs.pop("using", None)

        # CASE use Unique Keys
        self.Meta.model.db = db
        
        super(FormNewHelpDesk, self).__init__(*args,**kwargs)

        #-------------------------------------
        #   recreates the fields manually
        from copy import deepcopy
        self.fields.update(deepcopy( forms.fields_for_model( self.Meta.model, self.Meta.fields, using=db ) ))
        #
        #-------------------------------------

        #### follows the standard template customization, if necessary

        self.fields['problema_alegado'].widget.attrs['rows'] = 3
        self.fields['problema_alegado'].widget.attrs['cols'] = 22
        self.fields['problema_alegado'].required = True
        self.fields['problema_alegado'].error_messages={'required': 'Necessário informar o motivo da solicitação de ajuda!'}


        self.fields['data_prevista'].widget.attrs['class'] = 'calendario'
        self.fields['data_prevista'].initial = (datetime.timedelta(4)+datetime.datetime.now().date()).strftime("%Y-%m-%d")

        self.fields['hora_prevista'].widget.attrs['class'] = 'hora'
        self.fields['hora_prevista'].initial =datetime.datetime.now().time().strftime("%H:%M")
        
        self.fields['status'].initial = '0' 				#aberto
        self.fields['status'].widget.attrs['disabled'] = True
        
        self.fields['atendimento_relacionado_a'].initial = '07'
        
        self.fields['cod_direcionacao'].required = True
        self.fields['cod_direcionacao'].label = "Direcionado a"
        self.fields['cod_direcionacao'].initial = '2'
        self.fields['cod_direcionacao'].error_messages={'required': 'Necessário informar para quem é direcionado a ajuda!'}

        self.fields['cod_usuario'].widget = forms.HiddenInput()

calling the Form from the View

app.view.py

form = forms.FormNewHelpDesk(request.POST or None, using=banco)

Now, the change in the source Code DJANGO

Only fields of type ForeignKey, ManyToManyField and OneToOneField can use the 'using', so added an IF ...

django.forms.models.py

# line - 133: add using=None
def fields_for_model(model, fields=None, exclude=None, widgets=None, formfield_callback=None, using=None):

# line - 159

if formfield_callback is None:
    #----------------------------------------------------
	from django.db.models.fields.related import (ForeignKey, ManyToManyField, OneToOneField)
	if type(f) in (ForeignKey, ManyToManyField, OneToOneField):
		kwargs['using'] = using

	formfield = f.formfield(**kwargs)
    #----------------------------------------------------
elif not callable(formfield_callback):
	raise TypeError('formfield_callback must be a function or callable')
else:
	formfield = formfield_callback(f, **kwargs)

ALTER FOLLOW FILE

django.db.models.base.py

alter

# line 717
qs = model_class._default_manager.filter(**lookup_kwargs)

for

# line 717
qs = model_class._default_manager.using(getattr(self, 'db', None)).filter(**lookup_kwargs)

Ready :D

Solution 5 - Django

A foreign key field implies that you can

  • query on the relationship by joining ie fruit__name
  • check referential integrity
  • ensure referential integrity upon deletes
  • admin raw id lookup functionality
  • (some more...)

The first use case would always be problematic. Probably there are some other foreign key special cases in the codebase which also wouldn't work.

I run a rather large django site and we are currently using a plain integerfield. For now i would think subclassing the integerfield and adding the id to object conversion would be easiest (in 1.2 that required patching some bits of django, hope that improved by now) Will let you know what solution we find.

Solution 6 - Django

Ran into a similar problem of needing to reference (mostly) static data across multiple (5) databases. Made a slight update to the ReversedSingleRelatedObjectDescriptor to allow setting the related model. It doesn't implement the reverse relationship atm.

class ReverseSingleRelatedObjectDescriptor(object):
"""
This class provides the functionality that makes the related-object managers available as attributes on a model
class, for fields that have a single "remote" value, on the class that defines the related field. Used with
LinkedField.
"""
def __init__(self, field_with_rel):
    self.field = field_with_rel
    self.cache_name = self.field.get_cache_name()

def __get__(self, instance, instance_type=None):
    if instance is None:
        return self

    try:
        return getattr(instance, self.cache_name)
    except AttributeError:
        val = getattr(instance, self.field.attname)
        if val is None:
            # If NULL is an allowed value, return it
            if self.field.null:
                return None
            raise self.field.rel.to.DoesNotExist
        other_field = self.field.rel.get_related_field()
        if other_field.rel:
            params = {'%s__pk' % self.field.rel.field_name: val}
        else:
            params = {'%s__exact' % self.field.rel.field_name: val}

        # If the related manager indicates that it should be used for related fields, respect that.
        rel_mgr = self.field.rel.to._default_manager
        db = router.db_for_read(self.field.rel.to, instance=instance)
        if getattr(rel_mgr, 'forced_using', False):
            db = rel_mgr.forced_using
            rel_obj = rel_mgr.using(db).get(**params)
        elif getattr(rel_mgr, 'use_for_related_fields', False):
            rel_obj = rel_mgr.using(db).get(**params)
        else:
            rel_obj = QuerySet(self.field.rel.to).using(db).get(**params)
        setattr(instance, self.cache_name, rel_obj)
        return rel_obj

def __set__(self, instance, value):
    if instance is None:
        raise AttributeError("%s must be accessed via instance" % self.field.name)

    # If null=True, we can assign null here, but otherwise the value needs to be an instance of the related class.
    if value is None and self.field.null is False:
        raise ValueError('Cannot assign None: "%s.%s" does not allow null values.' %
                         (instance._meta.object_name, self.field.names))
    elif value is not None and not isinstance(value, self.field.rel.to):
        raise ValueError('Cannot assign "%r": "%s.%s" must be a "%s" instance.' %
                         (value, instance._meta.object_name, self.field.name, self.field.rel.to._meta.object_name))
    elif value is not None:
        # Only check the instance state db, LinkedField implies that the value is on a different database
        if instance._state.db is None:
            instance._state.db = router.db_for_write(instance.__class__, instance=value)

    # Is not used by OneToOneField, no extra measures to take here

    # Set the value of the related field
    try:
        val = getattr(value, self.field.rel.get_related_field().attname)
    except AttributeError:
        val = None
    setattr(instance, self.field.attname, val)

    # Since we already know what the related object is, seed the related object caches now, too. This avoids another
    # db hit if you get the object you just set
    setattr(instance, self.cache_name, value)
    if value is not None and not self.field.rel.multiple:
        setattr(value, self.field.related.get_cache_name(), instance)

and

class LinkedField(models.ForeignKey):
"""
Field class used to link models across databases. Does not ensure referrential integraty like ForeignKey
"""
def _description(self):
    return "Linked Field (type determined by related field)"

def contribute_to_class(self, cls, name):
    models.ForeignKey.contribute_to_class(self, cls, name)
    setattr(cls, self.name, ReverseSingleRelatedObjectDescriptor(self))
    if isinstance(self.rel.to, basestring):
        target = self.rel.to
    else:
        target = self.rel.to._meta.db_table
    cls._meta.duplicate_targets[self.column] = (target, "o2m")

def validate(self, value, model_instance):
    pass

Solution 7 - Django

This solution is originally written for one managed database with migrations and one or more legacy databases with models Meta managed=False connected at database level to the same database. If a db_table option contains a database name plus table name quoted correctly by ' ` ' (MySQL) or by ' " ' (other db), e.g. db_table = '"DB2"."table_b"', then it is not quoted any more by Django. Queries are compiled by Django ORM correctly, even with JOINs:

class TableB(models.Model):
    ....
    class Meta:    
        db_table = '`DB2`.`table_b`'    # for MySQL
        # db_table = '"DB2"."table_b"'  # for all other backends
        managed = False

Query set:

>>> qs = TableB.objects.all()
>>> str(qs.query)
'SELECT "DB2"."table_b"."id" FROM DB2"."table_b"'

That is supported by all db backends in Django.

(It seems that I started a bounty on a duplicate new question where my answer continues.)

Solution 8 - Django

Inspired by @Frans ' comment. My workaround is to do this in business layer. In the example given this question. I would set fruit to an IntegerField on Article, as "not to do integrity check in data layer".

class Fruit(models.Model):
    name = models.CharField()

class Article(models.Model):
    fruit = models.IntegerField()
    intro = models.TextField()

Then honor reference relation in application code (business layer). Take Django admin for example, in order to display fruit as a choice in Article's add page, you populate a list of choices for fruit manually.

# admin.py in App article
class ArticleAdmin(admin.ModelAdmin):
    class ArticleForm(forms.ModelForm):
        fields = ['fruit', 'intro']

        # populate choices for fruit
        choices = [(obj.id, obj.name) for obj in Fruit.objects.all()]
        widgets = {
            'fruit': forms.Select(choices=choices)}
    
    form = ArticleForm
    list_diaplay = ['fruit', 'intro']

Of course you may need to take care of form field validation (integrity check).

Solution 9 - Django

I have a new solution for django v1.10. There are two parts. It works with django.admin and django.rest-framework.

  1. Inherit the ForeignKey class and create ForeignKeyAcrossDb, and override the validate() function, based on this ticket and this post.

>

class ForeignKeyAcrossDb(models.ForeignKey):
        def validate(self, value, model_instance):
            if self.remote_field.parent_link:
                return
            super(models.ForeignKey, self).validate(value, model_instance)
            if value is None:
                return
            using = router.db_for_read(self.remote_field.model, instance=model_instance)
            qs = self.remote_field.model._default_manager.using(using).filter(
                **{self.remote_field.field_name: value}
            )
            qs = qs.complex_filter(self.get_limit_choices_to())
            if not qs.exists():
                raise exceptions.ValidationError(
                    self.error_messages['invalid'],
                    code='invalid',
                    params={
                        'model': self.remote_field.model._meta.verbose_name, 'pk': value,
                        'field': self.remote_field.field_name, 'value': value,
                    },  # 'pk' is included for backwards compatibility
                )

2. In field declaration, use db_constraint=False, for example,

album=ForeignKeyAcrossDb(Singer, db_constraint=False, on_delete=models.DO_NOTHING)

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionsaltycraneView Question on Stackoverflow
Solution 1 - DjangoadorablepuppyView Answer on Stackoverflow
Solution 2 - DjangoKerry HatcherView Answer on Stackoverflow
Solution 3 - DjangojulxView Answer on Stackoverflow
Solution 4 - DjangoarannasousaView Answer on Stackoverflow
Solution 5 - DjangoThierryView Answer on Stackoverflow
Solution 6 - DjangoJonathanView Answer on Stackoverflow
Solution 7 - DjangohynekcerView Answer on Stackoverflow
Solution 8 - DjangoTianwei ChenView Answer on Stackoverflow
Solution 9 - DjangoDianshengView Answer on Stackoverflow