Django query annotation with boolean field
Python Problem Overview
Let's say I have a Product model with products in a storefront, and a ProductImage table with images of the product. Each product can have zero or more images. Here's a simplified example:
from django.db import models

class Product(models.Model):
    product_name = models.CharField(max_length=255)
    # ...

class ProductImage(models.Model):
    # on_delete is required since Django 2.0
    product = models.ForeignKey(Product, related_name='images', on_delete=models.CASCADE)
    image_file = models.CharField(max_length=255)
    # ...
When displaying search results for products, I want to prioritize products which have images associated with them. I can easily get the number of images:
from django.db.models import Count
Product.objects.annotate(image_count=Count('images'))
But that's not actually what I want. I'd like to annotate it with a boolean field, have_images, indicating whether the product has one or more images, so that I can sort by that:
Product.objects.annotate(have_images=(?????)).order_by('-have_images', 'product_name')
How can I do that? Thanks!
Python Solutions
Solution 1 - Python
I eventually found a way to do this using Django 1.8's new conditional expressions:
from django.db.models import Case, Count, IntegerField, Value, When

q = (
    Product.objects
    .filter(...)
    .annotate(image_count=Count('images'))
    .annotate(
        have_images=Case(
            When(image_count__gt=0, then=Value(1)),
            default=Value(0),
            output_field=IntegerField(),
        )
    )
    .order_by('-have_images')
)
And that's how I finally found incentive to upgrade to 1.8 from 1.7.
Solution 2 - Python
As of Django 1.11 it is possible to use Exists. The example below comes from the Exists documentation:
>>> from django.db.models import Exists, OuterRef
>>> from datetime import timedelta
>>> from django.utils import timezone
>>> one_day_ago = timezone.now() - timedelta(days=1)
>>> recent_comments = Comment.objects.filter(
...     post=OuterRef('pk'),
...     created_at__gte=one_day_ago,
... )
>>> Post.objects.annotate(recent_comment=Exists(recent_comments))
Solution 3 - Python
Use conditional expressions and set output_field to BooleanField:
from django.db.models import BooleanField, Case, Count, Value, When

Product.objects.annotate(image_count=Count('images')).annotate(
    has_image=Case(
        When(image_count=0, then=Value(False)),
        default=Value(True),
        output_field=BooleanField(),
    )
).order_by('-has_image')
Solution 4 - Python
Read the docs about extra():
qs = Product.objects.extra(select={'has_images': 'CASE WHEN images IS NOT NULL THEN 1 ELSE 0 END' })
I tested it and it works. But ordering (order_by) or filtering (where) by this field doesn't work for me (Django 1.8), despite what the docs say:
> If you need to order the resulting queryset using some of the new fields or tables you have included via extra() use the order_by parameter to extra() and pass in a sequence of strings. These strings should either be model fields (as in the normal order_by() method on querysets), of the form table_name.column_name or an alias for a column that you specified in the select parameter to extra().
qs = qs.extra(order_by = ['-has_images'])
qs = qs.extra(where = ['has_images=1'])
> FieldError: Cannot resolve keyword 'has_images' into field.
I found that https://code.djangoproject.com/ticket/19434 is still open.
So if you run into the same trouble, you can use raw().
Solution 5 - Python
If performance matters, my suggestion is to add a hasPictures boolean field (with editable=False) to Product. Then keep its value correct through ProductImage model signals (or by overriding the save and delete methods).
Advantages:
- Index friendly.
- Better performance. Avoids joins.
- Database agnostic.
- Coding it will raise your Django skills to the next level.
Solution 6 - Python
When you have to annotate existence with some filters, the Sum annotation can be used. For example, the following annotates whether there are any GIFs in images:
from django.db.models import Case, IntegerField, Sum, Value, When

Product.objects.filter(
).annotate(
    animated_images=Sum(
        Case(
            When(images__image_file__endswith='gif', then=Value(1)),
            default=Value(0),
            output_field=IntegerField()
        )
    )
)
This actually counts them, but any Pythonic check such as if product.animated_images: works the same as if it were a boolean.