ActiveRecord Query Union

Ruby on-RailsActiverecordUnionActive Relation

Ruby on-Rails Problem Overview


I've written a couple of complex queries (at least to me) with Ruby on Rail's query interface:

watched_news_posts = Post.joins(:news => :watched).where(:watched => {:user_id => id})
watched_topic_posts = Post.joins(:post_topic_relationships => {:topic => :watched}).where(:watched => {:user_id => id})

Both of these queries work fine by themselves. Both return Post objects. I would like to combine these posts into a single ActiveRelation. Since there could be hundreds of thousands of posts at some point, this needs to be done at the database level. If it were a MySQL query, I could simply user the UNION operator. Does anybody know if I can do something similar with RoR's query interface?

Ruby on-Rails Solutions


Solution 1 - Ruby on-Rails

Here's a quick little module I wrote that allows you to UNION multiple scopes. It also returns the results as an instance of ActiveRecord::Relation.

module ActiveRecord::UnionScope
  def self.included(base)
    base.send :extend, ClassMethods
  end

  module ClassMethods
    def union_scope(*scopes)
      id_column = "#{table_name}.id"
      sub_query = scopes.map { |s| s.select(id_column).to_sql }.join(" UNION ")
      where "#{id_column} IN (#{sub_query})"
    end
  end
end

Here's the gist: https://gist.github.com/tlowrimore/5162327

Edit:

As requested, here's an example of how UnionScope works:

class Property < ActiveRecord::Base
  include ActiveRecord::UnionScope

  # some silly, contrived scopes
  scope :active_nearby,     -> { where(active: true).where('distance <= 25') }
  scope :inactive_distant,  -> { where(active: false).where('distance >= 200') }

  # A union of the aforementioned scopes
  scope :active_near_and_inactive_distant, -> { union_scope(active_nearby, inactive_distant) }
end

Solution 2 - Ruby on-Rails

I also have encountered this problem, and now my go-to strategy is to generate SQL (by hand or using to_sql on an existing scope) and then stick it in the from clause. I can't guarantee it's any more efficient than your accepted method, but it's relatively easy on the eyes and gives you a normal ARel object back.

watched_news_posts = Post.joins(:news => :watched).where(:watched => {:user_id => id})
watched_topic_posts = Post.joins(:post_topic_relationships => {:topic => :watched}).where(:watched => {:user_id => id})

Post.from("(#{watched_news_posts.to_sql} UNION #{watched_topic_posts.to_sql}) AS posts")

You can do this with two different models as well, but you need to make sure they both "look the same" inside the UNION -- you can use select on both queries to make sure they will produce the same columns.

topics = Topic.select('user_id AS author_id, description AS body, created_at')
comments = Comment.select('author_id, body, created_at')

Comment.from("(#{comments.to_sql} UNION #{topics.to_sql}) AS comments")

Solution 3 - Ruby on-Rails

Based on Olives' answer, I did come up with another solution to this problem. It feels a little bit like a hack, but it returns an instance of ActiveRelation, which is what I was after in the first place.

Post.where('posts.id IN 
      (
        SELECT post_topic_relationships.post_id FROM post_topic_relationships
          INNER JOIN "watched" ON "watched"."watched_item_id" = "post_topic_relationships"."topic_id" AND "watched"."watched_item_type" = "Topic" WHERE "watched"."user_id" = ?
      )
      OR posts.id IN
      (
        SELECT "posts"."id" FROM "posts" INNER JOIN "news" ON "news"."id" = "posts"."news_id" 
        INNER JOIN "watched" ON "watched"."watched_item_id" = "news"."id" AND "watched"."watched_item_type" = "News" WHERE "watched"."user_id" = ?
      )', id, id)

I'd still appreciate it if anybody has any suggestions to optimize this or improve the performance, because it's essentially executing three queries and feels a little redundant.

Solution 4 - Ruby on-Rails

You could also use Brian Hempel's active_record_union gem that extends ActiveRecord with an union method for scopes.

Your query would be like this:

Post.joins(:news => :watched).
  where(:watched => {:user_id => id}).
  union(Post.joins(:post_topic_relationships => {:topic => :watched}
    .where(:watched => {:user_id => id}))

Hopefully this will be eventually merged into ActiveRecord some day.

Solution 5 - Ruby on-Rails

Could you use an OR instead of a UNION?

Then you could do something like:

Post.joins(:news => :watched, :post_topic_relationships => {:topic => :watched})
.where("watched.user_id = :id OR topic_watched.user_id = :id", :id => id)

(Since you are joins the watched table twice I'm not too sure what the names of the tables will be for the query)

Since there are a lot of joins, it might also be quite heavy on the database, but it might be able to be optimized.

Solution 6 - Ruby on-Rails

How about...

def union(scope1, scope2)
  ids = scope1.pluck(:id) + scope2.pluck(:id)
  where(id: ids.uniq)
end

Solution 7 - Ruby on-Rails

Arguably, this improves readability, but not necessarily performance:

def my_posts
  Post.where <<-SQL, self.id, self.id
    posts.id IN 
    (SELECT post_topic_relationships.post_id FROM post_topic_relationships
    INNER JOIN watched ON watched.watched_item_id = post_topic_relationships.topic_id 
    AND watched.watched_item_type = "Topic" 
    AND watched.user_id = ?
    UNION
    SELECT posts.id FROM posts 
    INNER JOIN news ON news.id = posts.news_id 
    INNER JOIN watched ON watched.watched_item_id = news.id 
    AND watched.watched_item_type = "News" 
    AND watched.user_id = ?)
  SQL
end

This method returns an ActiveRecord::Relation, so you could call it like this:

my_posts.order("watched_item_type, post.id DESC")

Solution 8 - Ruby on-Rails

There is an active_record_union gem. Might be helpful

https://github.com/brianhempel/active_record_union

> With ActiveRecordUnion, we can do: > > the current user's (draft) posts and all published posts from anyone > current_user.posts.union(Post.published) > Which is equivalent to the following SQL:

SELECT "posts".* FROM (
  SELECT "posts".* FROM "posts"  WHERE "posts"."user_id" = 1
  UNION
  SELECT "posts".* FROM "posts"  WHERE (published_at < '2014-07-19 16:04:21.918366')
) posts

Solution 9 - Ruby on-Rails

In a similar case I summed two arrays and used Kaminari:paginate_array(). Very nice and working solution. I was unable to use where(), because I need to sum two results with different order() on the same table.

Solution 10 - Ruby on-Rails

Heres how I joined SQL queries using UNION on my own ruby on rails application.

You can use the below as inspiration on your own code.

class Preference < ApplicationRecord
  scope :for, ->(object) { where(preferenceable: object) }
end

Below is the UNION where i joined the scopes together.

  def zone_preferences
    zone = Zone.find params[:zone_id]
    zone_sql = Preference.for(zone).to_sql
    region_sql = Preference.for(zone.region).to_sql
    operator_sql = Preference.for(Operator.current).to_sql
    
    Preference.from("(#{zone_sql} UNION #{region_sql} UNION #{operator_sql}) AS preferences")
  end

Solution 11 - Ruby on-Rails

Less problems and easier to follow:

    def union_scope(*scopes)
      scopes[1..-1].inject(where(id: scopes.first)) { |all, scope| all.or(where(id: scope)) }
    end

So in the end:

union_scope(watched_news_posts, watched_topic_posts)

Solution 12 - Ruby on-Rails

I would just run the two queries you need and combine the arrays of records that are returned:

@posts = watched_news_posts + watched_topics_posts

Or, at the least test it out. Do you think the array combination in ruby will be far too slow? Looking at the suggested queries to get around the problem, I'm not convinced that there will be that significant of a performance difference.

Solution 13 - Ruby on-Rails

Elliot Nelson answered good, except the case where some of the relations are empty. I would do something like that:

def union_2_relations(relation1,relation2)
sql = ""
if relation1.any? && relation2.any?
  sql = "(#{relation1.to_sql}) UNION (#{relation2.to_sql}) as #{relation1.klass.table_name}"
elsif relation1.any?
  sql = relation1.to_sql
elsif relation2.any?
  sql = relation2.to_sql
end
relation1.klass.from(sql)

end

Solution 14 - Ruby on-Rails

When we add UNION to the scopes, it breaks at time due to order_by clause added before the UNION.

So I changed it in a way to give it a UNION effect.

module UnionScope
  def self.included(base)
    base.send(:extend, ClassMethods)
  end

  module ClassMethods
    def union_scope(*scopes)
      id_column = "#{table_name}.id"
      sub_query = scopes.map { |s| s.pluck(:id) }.flatten
      where("#{id_column} IN (?)", sub_query)
    end
  end
end

And then use it like this in any model

class Model
  include UnionScope
  scope :union_of_scopeA_scopeB, -> { union_scope(scopeA, scopeB) }
end

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionLandonSchroppView Question on Stackoverflow
Solution 1 - Ruby on-RailsTim LowrimoreView Answer on Stackoverflow
Solution 2 - Ruby on-RailsElliot NelsonView Answer on Stackoverflow
Solution 3 - Ruby on-RailsLandonSchroppView Answer on Stackoverflow
Solution 4 - Ruby on-RailsdgilperezView Answer on Stackoverflow
Solution 5 - Ruby on-RailsKyle d'OliveiraView Answer on Stackoverflow
Solution 6 - Ruby on-RailsRichard WanView Answer on Stackoverflow
Solution 7 - Ruby on-RailsrichardsunView Answer on Stackoverflow
Solution 8 - Ruby on-RailsMike LyubarskyyView Answer on Stackoverflow
Solution 9 - Ruby on-RailssekrettView Answer on Stackoverflow
Solution 10 - Ruby on-Railsjoeyk16View Answer on Stackoverflow
Solution 11 - Ruby on-RailsDmitry PolushkinView Answer on Stackoverflow
Solution 12 - Ruby on-RailsJeffrey Alan LeeView Answer on Stackoverflow
Solution 13 - Ruby on-RailsEhudView Answer on Stackoverflow
Solution 14 - Ruby on-RailsHaroon KhalidView Answer on Stackoverflow