ActiveRecord Query Union
Ruby on-RailsActiverecordUnionActive RelationRuby on-Rails Problem Overview
I've written a couple of complex queries (at least to me) with Ruby on Rail's query interface:
watched_news_posts = Post.joins(:news => :watched).where(:watched => {:user_id => id})
watched_topic_posts = Post.joins(:post_topic_relationships => {:topic => :watched}).where(:watched => {:user_id => id})
Both of these queries work fine by themselves. Both return Post objects. I would like to combine these posts into a single ActiveRelation. Since there could be hundreds of thousands of posts at some point, this needs to be done at the database level. If it were a MySQL query, I could simply user the UNION
operator. Does anybody know if I can do something similar with RoR's query interface?
Ruby on-Rails Solutions
Solution 1 - Ruby on-Rails
Here's a quick little module I wrote that allows you to UNION multiple scopes. It also returns the results as an instance of ActiveRecord::Relation.
module ActiveRecord::UnionScope
def self.included(base)
base.send :extend, ClassMethods
end
module ClassMethods
def union_scope(*scopes)
id_column = "#{table_name}.id"
sub_query = scopes.map { |s| s.select(id_column).to_sql }.join(" UNION ")
where "#{id_column} IN (#{sub_query})"
end
end
end
Here's the gist: https://gist.github.com/tlowrimore/5162327
Edit:
As requested, here's an example of how UnionScope works:
class Property < ActiveRecord::Base
include ActiveRecord::UnionScope
# some silly, contrived scopes
scope :active_nearby, -> { where(active: true).where('distance <= 25') }
scope :inactive_distant, -> { where(active: false).where('distance >= 200') }
# A union of the aforementioned scopes
scope :active_near_and_inactive_distant, -> { union_scope(active_nearby, inactive_distant) }
end
Solution 2 - Ruby on-Rails
I also have encountered this problem, and now my go-to strategy is to generate SQL (by hand or using to_sql
on an existing scope) and then stick it in the from
clause. I can't guarantee it's any more efficient than your accepted method, but it's relatively easy on the eyes and gives you a normal ARel object back.
watched_news_posts = Post.joins(:news => :watched).where(:watched => {:user_id => id})
watched_topic_posts = Post.joins(:post_topic_relationships => {:topic => :watched}).where(:watched => {:user_id => id})
Post.from("(#{watched_news_posts.to_sql} UNION #{watched_topic_posts.to_sql}) AS posts")
You can do this with two different models as well, but you need to make sure they both "look the same" inside the UNION -- you can use select
on both queries to make sure they will produce the same columns.
topics = Topic.select('user_id AS author_id, description AS body, created_at')
comments = Comment.select('author_id, body, created_at')
Comment.from("(#{comments.to_sql} UNION #{topics.to_sql}) AS comments")
Solution 3 - Ruby on-Rails
Based on Olives' answer, I did come up with another solution to this problem. It feels a little bit like a hack, but it returns an instance of ActiveRelation
, which is what I was after in the first place.
Post.where('posts.id IN
(
SELECT post_topic_relationships.post_id FROM post_topic_relationships
INNER JOIN "watched" ON "watched"."watched_item_id" = "post_topic_relationships"."topic_id" AND "watched"."watched_item_type" = "Topic" WHERE "watched"."user_id" = ?
)
OR posts.id IN
(
SELECT "posts"."id" FROM "posts" INNER JOIN "news" ON "news"."id" = "posts"."news_id"
INNER JOIN "watched" ON "watched"."watched_item_id" = "news"."id" AND "watched"."watched_item_type" = "News" WHERE "watched"."user_id" = ?
)', id, id)
I'd still appreciate it if anybody has any suggestions to optimize this or improve the performance, because it's essentially executing three queries and feels a little redundant.
Solution 4 - Ruby on-Rails
You could also use Brian Hempel's active_record_union gem that extends ActiveRecord
with an union
method for scopes.
Your query would be like this:
Post.joins(:news => :watched).
where(:watched => {:user_id => id}).
union(Post.joins(:post_topic_relationships => {:topic => :watched}
.where(:watched => {:user_id => id}))
Hopefully this will be eventually merged into ActiveRecord
some day.
Solution 5 - Ruby on-Rails
Could you use an OR instead of a UNION?
Then you could do something like:
Post.joins(:news => :watched, :post_topic_relationships => {:topic => :watched})
.where("watched.user_id = :id OR topic_watched.user_id = :id", :id => id)
(Since you are joins the watched table twice I'm not too sure what the names of the tables will be for the query)
Since there are a lot of joins, it might also be quite heavy on the database, but it might be able to be optimized.
Solution 6 - Ruby on-Rails
How about...
def union(scope1, scope2)
ids = scope1.pluck(:id) + scope2.pluck(:id)
where(id: ids.uniq)
end
Solution 7 - Ruby on-Rails
Arguably, this improves readability, but not necessarily performance:
def my_posts
Post.where <<-SQL, self.id, self.id
posts.id IN
(SELECT post_topic_relationships.post_id FROM post_topic_relationships
INNER JOIN watched ON watched.watched_item_id = post_topic_relationships.topic_id
AND watched.watched_item_type = "Topic"
AND watched.user_id = ?
UNION
SELECT posts.id FROM posts
INNER JOIN news ON news.id = posts.news_id
INNER JOIN watched ON watched.watched_item_id = news.id
AND watched.watched_item_type = "News"
AND watched.user_id = ?)
SQL
end
This method returns an ActiveRecord::Relation, so you could call it like this:
my_posts.order("watched_item_type, post.id DESC")
Solution 8 - Ruby on-Rails
There is an active_record_union gem. Might be helpful
https://github.com/brianhempel/active_record_union
> With ActiveRecordUnion, we can do:
>
> the current user's (draft) posts and all published posts from anyone
> current_user.posts.union(Post.published)
> Which is equivalent to the following SQL:
SELECT "posts".* FROM (
SELECT "posts".* FROM "posts" WHERE "posts"."user_id" = 1
UNION
SELECT "posts".* FROM "posts" WHERE (published_at < '2014-07-19 16:04:21.918366')
) posts
Solution 9 - Ruby on-Rails
In a similar case I summed two arrays and used Kaminari:paginate_array()
. Very nice and working solution. I was unable to use where()
, because I need to sum two results with different order()
on the same table.
Solution 10 - Ruby on-Rails
Heres how I joined SQL queries using UNION on my own ruby on rails application.
You can use the below as inspiration on your own code.
class Preference < ApplicationRecord
scope :for, ->(object) { where(preferenceable: object) }
end
Below is the UNION where i joined the scopes together.
def zone_preferences
zone = Zone.find params[:zone_id]
zone_sql = Preference.for(zone).to_sql
region_sql = Preference.for(zone.region).to_sql
operator_sql = Preference.for(Operator.current).to_sql
Preference.from("(#{zone_sql} UNION #{region_sql} UNION #{operator_sql}) AS preferences")
end
Solution 11 - Ruby on-Rails
Less problems and easier to follow:
def union_scope(*scopes)
scopes[1..-1].inject(where(id: scopes.first)) { |all, scope| all.or(where(id: scope)) }
end
So in the end:
union_scope(watched_news_posts, watched_topic_posts)
Solution 12 - Ruby on-Rails
I would just run the two queries you need and combine the arrays of records that are returned:
@posts = watched_news_posts + watched_topics_posts
Or, at the least test it out. Do you think the array combination in ruby will be far too slow? Looking at the suggested queries to get around the problem, I'm not convinced that there will be that significant of a performance difference.
Solution 13 - Ruby on-Rails
Elliot Nelson answered good, except the case where some of the relations are empty. I would do something like that:
def union_2_relations(relation1,relation2)
sql = ""
if relation1.any? && relation2.any?
sql = "(#{relation1.to_sql}) UNION (#{relation2.to_sql}) as #{relation1.klass.table_name}"
elsif relation1.any?
sql = relation1.to_sql
elsif relation2.any?
sql = relation2.to_sql
end
relation1.klass.from(sql)
end
Solution 14 - Ruby on-Rails
When we add UNION to the scopes, it breaks at time due to order_by clause added before the UNION.
So I changed it in a way to give it a UNION effect.
module UnionScope
def self.included(base)
base.send(:extend, ClassMethods)
end
module ClassMethods
def union_scope(*scopes)
id_column = "#{table_name}.id"
sub_query = scopes.map { |s| s.pluck(:id) }.flatten
where("#{id_column} IN (?)", sub_query)
end
end
end
And then use it like this in any model
class Model
include UnionScope
scope :union_of_scopeA_scopeB, -> { union_scope(scopeA, scopeB) }
end