Find all records which have a count of an association greater than zero
SqlRuby on-RailsRuby on-Rails-3ActiverecordSql Problem Overview
I'm trying to do something that I thought it would be simple but it seems not to be.
I have a project model that has many vacancies.
class Project < ActiveRecord::Base
has_many :vacancies, :dependent => :destroy
end
I want to get all the projects that have at least 1 vacancy. I tried something like this:
Project.joins(:vacancies).where('count(vacancies) > 0')
but it says
SQLite3::SQLException: no such column: vacancies: SELECT "projects".* FROM "projects" INNER JOIN "vacancies" ON "vacancies"."project_id" = "projects"."id" WHERE ("projects"."deleted_at" IS NULL) AND (count(vacancies) > 0)
.
Sql Solutions
Solution 1 - Sql
-
To get Projects with at least 1 vacancy:
Project.joins(:vacancies).group('projects.id')
-
To get Projects with more than 1 vacancy:
Project.joins(:vacancies).group('projects.id').having('count(project_id) > 1')
-
Or, if
Vacancy
model sets counter cache:belongs_to :project, counter_cache: true
then this will work, too:
Project.where('vacancies_count > ?', 1)
Inflection rule for vacancy
may need to be specified manually?
Solution 2 - Sql
joins
uses an inner join by default so using Project.joins(:vacancies)
will in effect only return projects that have an associated vacancy.
UPDATE:
As pointed out by @mackskatz in the comment, without a group
clause, the code above will return duplicate projects for projects with more than one vacancies. To remove the duplicates, use
Project.joins(:vacancies).group('projects.id')
UPDATE:
As pointed out by @Tolsee, you can also use distinct
.
Project.joins(:vacancies).distinct
As an example
[10] pry(main)> Comment.distinct.pluck :article_id
=> [43, 34, 45, 55, 17, 19, 1, 3, 4, 18, 44, 5, 13, 22, 16, 6, 53]
[11] pry(main)> _.size
=> 17
[12] pry(main)> Article.joins(:comments).size
=> 45
[13] pry(main)> Article.joins(:comments).distinct.size
=> 17
[14] pry(main)> Article.joins(:comments).distinct.to_sql
=> "SELECT DISTINCT \"articles\".* FROM \"articles\" INNER JOIN \"comments\" ON \"comments\".\"article_id\" = \"articles\".\"id\""
Solution 3 - Sql
Yeah, vacancies
is not a field in the join. I believe you want:
Project.joins(:vacancies).group("projects.id").having("count(vacancies.id)>0")
Solution 4 - Sql
# None
Project.joins(:vacancies).group('projects.id').having('count(vacancies) = 0')
# Any
Project.joins(:vacancies).group('projects.id').having('count(vacancies) > 0')
# One
Project.joins(:vacancies).group('projects.id').having('count(vacancies) = 1')
# More than 1
Project.joins(:vacancies).group('projects.id').having('count(vacancies) > 1')
Solution 5 - Sql
Performing an inner join to the has_many table combined with a group
or uniq
is potentially very inefficient, and in SQL this would be better implemented as a semi-join that uses EXISTS
with a correlated subquery.
This allows the query optimiser to probe the vacancies table to check for the existence of a row with the correct project_id. It doesn't matter whether there is one row or a million that have that project_id.
That's not as straightforward in Rails, but can be achieved with:
Project.where(Vacancies.where("vacancies.project_id = projects.id").exists)
Similarly, find all projects that have no vacancies:
Project.where.not(Vacancies.where("vacancies.project_id = projects.id").exists)
Edit: in recent Rails versions you get a deprecation warning telling you to not to rely on exists
being delegated to arel. Fix this with:
Project.where.not(Vacancies.where("vacancies.project_id = projects.id").arel.exists)
Edit: if you're uncomfortable with raw SQL, try:
Project.where.not(Vacancies.where(Vacancy.arel_table[:project_id].eq(Project.arel_table[:id])).arel.exists)
You can make this less messy by adding class methods to hide the use of arel_table
, for example:
class Project
def self.id_column
arel_table[:id]
end
end
... so ...
Project.where.not(
Vacancies.where(
Vacancy.project_id_column.eq(Project.id_column)
).arel.exists
)
Solution 6 - Sql
In Rails 4+, you can also use includes or eager_load to get the same answer:
Project.includes(:vacancies).references(:vacancies).
where.not(vacancies: {id: nil})
Project.eager_load(:vacancies).where.not(vacancies: {id: nil})
Solution 7 - Sql
I think there's a simpler solution:
Project.joins(:vacancies).distinct
Solution 8 - Sql
Without much Rails magic, you can do:
Project.where('(SELECT COUNT(*) FROM vacancies WHERE vacancies.project_id = projects.id) > 0')
This type of conditions will work in all Rails versions as much of the work is done directly on the DB side. Plus, chaining .count
method will work nicely too. I've been burned by queries like Project.joins(:vacancies)
before. Of course, there are pros and cons as it's not DB agnostic.
Solution 9 - Sql
You can also use EXISTS
with SELECT 1
rather than selecting all the columns from the vacancies
table:
Project.where("EXISTS(SELECT 1 from vacancies where projects.id = vacancies.project_id)")
Solution 10 - Sql
The error is telling you that vacancies is not a column in projects, basically.
This should work
Project.joins(:vacancies).where('COUNT(vacancies.project_id) > 0')