Rails: uniq vs. distinct
Ruby on-Rails Problem Overview
Can someone briefly explain the difference in use between the methods uniq and distinct?
I've seen both used in similar contexts, but the difference isn't quite clear to me.
Ruby on-Rails Solutions
Solution 1 - Ruby on-Rails
Rails queries act like arrays, so .uniq produces the same result as .distinct, but:
- .distinct is an SQL query method
- .uniq is an array method
Note: In Rails 5+, Relation#uniq is deprecated; Relation#distinct is recommended instead.
See http://edgeguides.rubyonrails.org/5_0_release_notes.html#active-record-deprecations
Hint:
Calling .includes before .uniq/.distinct can slow down or speed up your app, because:
- uniq won't spawn an additional SQL query
- distinct will
Both return the same results, though.
Example:
users = User.includes(:posts)
puts users
# First SQL query, for includes
users.uniq
# No SQL query! (here you speed up your app)
users.distinct
# Second SQL query, with DISTINCT! (here you slow down your app)
This can be useful when building a performant application.
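To make the query-count argument concrete without a database, here is a minimal plain-Ruby sketch. `SketchRelation` and `DistinctSketchRelation` are hypothetical stand-ins invented for illustration, not ActiveRecord classes, and the logged strings only imitate SQL. The assumption is that the relation has already been loaded before uniq is called, as in the example above.

```ruby
# Hypothetical stand-in for a lazily loaded relation; NOT ActiveRecord itself.
class SketchRelation
  attr_reader :queries

  def initialize(rows, log = [])
    @rows = rows      # pretend these rows live in the database
    @queries = log    # log of the "SQL" we would have sent
    @loaded = nil
  end

  # Loads the rows once and caches them, like an evaluated relation.
  def to_a
    @loaded ||= begin
      @queries << "SELECT users.* FROM users"
      @rows.dup
    end
  end

  # In-memory de-duplication: reuses the cached rows.
  def uniq
    to_a.uniq
  end

  # Query-level de-duplication: returns a new, unloaded relation
  # that shares the same query log.
  def distinct
    DistinctSketchRelation.new(@rows, @queries)
  end
end

class DistinctSketchRelation < SketchRelation
  def to_a
    @loaded ||= begin
      @queries << "SELECT DISTINCT users.* FROM users"
      @rows.uniq
    end
  end
end

users = SketchRelation.new(%w[ann bob ann])
users.to_a                                   # first "query"
in_memory = users.uniq                       # works on the cached rows
queries_after_uniq = users.queries.size
from_db = users.distinct.to_a                # triggers a second "query"
queries_after_distinct = users.queries.size
```

Both paths end with the same rows; the only difference the sketch records is how many "queries" were logged along the way.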
Hint:
The same applies to:
- .size vs. .count
- present? vs. .exists?
- map vs. pluck
Solution 2 - Ruby on-Rails
Rails 5.1 removed the uniq method from ActiveRecord::Relation in favor of the distinct method.
- If you call uniq on a query, it just converts the ActiveRecord::Relation to an Array.
- You cannot continue the query chain after uniq (i.e. you cannot do User.active.uniq.subscribed; it will throw the error "undefined method subscribed for Array").
- If your DB is large and you want to fetch only the required distinct entries, it's good to use the distinct method on the ActiveRecord::Relation query.
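The chaining point can be sketched in plain Ruby. `ChainSketch` is a hypothetical stand-in for a chainable relation, not ActiveRecord, and its `subscribed` method only imitates a scope. The idea it illustrates is that a distinct-style call returns another chainable object, while de-duplicating through an Array leaves you with an Array that knows nothing about scopes.

```ruby
# Hypothetical stand-in for a chainable relation; NOT ActiveRecord itself.
class ChainSketch
  include Enumerable

  def initialize(rows)
    @rows = rows
  end

  def each(&block)
    @rows.each(&block)
  end

  # Stand-in for a scope such as `subscribed`; returns another chainable object.
  def subscribed
    ChainSketch.new(@rows.select { |row| row[:subscribed] })
  end

  # Stand-in for `distinct`; the result stays chainable.
  def distinct
    ChainSketch.new(@rows.uniq)
  end
end

rows = [
  { name: "ann", subscribed: true },
  { name: "ann", subscribed: true },
  { name: "bob", subscribed: false }
]
relation = ChainSketch.new(rows)

# Still chainable after distinct.
chained = relation.distinct.subscribed.to_a

# A plain Array from here on, so the scope is no longer available.
as_array = relation.to_a.uniq
error_class =
  begin
    as_array.subscribed
  rescue NoMethodError => e
    e.class
  end
```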
Solution 3 - Ruby on-Rails
From the documentation:
> uniq(value = true)
>
> Alias for ActiveRecord::QueryMethods#distinct
Solution 4 - Ruby on-Rails
This doesn't exactly answer your question, but what I know is:
In an ActiveRecord context, uniq is just an alias for distinct, and both remove duplicates from the query result set (at one level, so to speak).
In an array context, uniq is powerful enough to remove duplicates even when the elements are themselves arrays. For example:
arr = [["first"], ["second"], ["first"]]
and if we do
arr.uniq
the answer will be: [["first"], ["second"]]
So even if the elements are nested arrays, uniq compares their contents and removes the duplicates.
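A short runnable version of the array case, in plain Ruby with no Rails needed. Array#uniq compares elements using their hash and eql? methods, which is why two inner arrays with equal contents count as duplicates. The second half, with the `pairs` data, is an added illustration of the block form and is not part of the original answer.

```ruby
arr = [["first"], ["second"], ["first"]]

# Inner arrays with equal contents are treated as duplicates.
deduped = arr.uniq

# Array#uniq also accepts a block that picks the value to compare on;
# the first element seen for each key is kept.
pairs = [["a", 1], ["b", 2], ["a", 3]]
by_first = pairs.uniq { |pair| pair.first }
```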
Hope it helps you in some ways.