Find all records which have a count of an association greater than zero

SqlRuby on-RailsRuby on-Rails-3Activerecord

Sql Problem Overview


I'm trying to do something that I thought it would be simple but it seems not to be.

I have a project model that has many vacancies.

class Project < ActiveRecord::Base

  has_many :vacancies, :dependent => :destroy

end

I want to get all the projects that have at least 1 vacancy. I tried something like this:

Project.joins(:vacancies).where('count(vacancies) > 0')

but it says

SQLite3::SQLException: no such column: vacancies: SELECT "projects".* FROM "projects" INNER JOIN "vacancies" ON "vacancies"."project_id" = "projects"."id" WHERE ("projects"."deleted_at" IS NULL) AND (count(vacancies) > 0).

Sql Solutions


Solution 1 - Sql

  1. To get Projects with at least 1 vacancy:

    Project.joins(:vacancies).group('projects.id')

  2. To get Projects with more than 1 vacancy:

    Project.joins(:vacancies).group('projects.id').having('count(project_id) > 1')

  3. Or, if Vacancy model sets counter cache:

    belongs_to :project, counter_cache: true

then this will work, too:

Project.where('vacancies_count > ?', 1)

Inflection rule for vacancy may need to be specified manually?

Solution 2 - Sql

joins uses an inner join by default so using Project.joins(:vacancies) will in effect only return projects that have an associated vacancy.

UPDATE:

As pointed out by @mackskatz in the comment, without a group clause, the code above will return duplicate projects for projects with more than one vacancies. To remove the duplicates, use

Project.joins(:vacancies).group('projects.id')

UPDATE:

As pointed out by @Tolsee, you can also use distinct.

Project.joins(:vacancies).distinct

As an example

[10] pry(main)> Comment.distinct.pluck :article_id
=> [43, 34, 45, 55, 17, 19, 1, 3, 4, 18, 44, 5, 13, 22, 16, 6, 53]
[11] pry(main)> _.size
=> 17
[12] pry(main)> Article.joins(:comments).size
=> 45
[13] pry(main)> Article.joins(:comments).distinct.size
=> 17
[14] pry(main)> Article.joins(:comments).distinct.to_sql
=> "SELECT DISTINCT \"articles\".* FROM \"articles\" INNER JOIN \"comments\" ON \"comments\".\"article_id\" = \"articles\".\"id\""

Solution 3 - Sql

Yeah, vacancies is not a field in the join. I believe you want:

Project.joins(:vacancies).group("projects.id").having("count(vacancies.id)>0")

Solution 4 - Sql

# None
Project.joins(:vacancies).group('projects.id').having('count(vacancies) = 0')
# Any
Project.joins(:vacancies).group('projects.id').having('count(vacancies) > 0')
# One
Project.joins(:vacancies).group('projects.id').having('count(vacancies) = 1')
# More than 1
Project.joins(:vacancies).group('projects.id').having('count(vacancies) > 1')

Solution 5 - Sql

Performing an inner join to the has_many table combined with a group or uniq is potentially very inefficient, and in SQL this would be better implemented as a semi-join that uses EXISTS with a correlated subquery.

This allows the query optimiser to probe the vacancies table to check for the existence of a row with the correct project_id. It doesn't matter whether there is one row or a million that have that project_id.

That's not as straightforward in Rails, but can be achieved with:

Project.where(Vacancies.where("vacancies.project_id = projects.id").exists)

Similarly, find all projects that have no vacancies:

Project.where.not(Vacancies.where("vacancies.project_id = projects.id").exists)

Edit: in recent Rails versions you get a deprecation warning telling you to not to rely on exists being delegated to arel. Fix this with:

Project.where.not(Vacancies.where("vacancies.project_id = projects.id").arel.exists)

Edit: if you're uncomfortable with raw SQL, try:

Project.where.not(Vacancies.where(Vacancy.arel_table[:project_id].eq(Project.arel_table[:id])).arel.exists)

You can make this less messy by adding class methods to hide the use of arel_table, for example:

class Project
  def self.id_column
    arel_table[:id]
  end
end

... so ...

Project.where.not(
  Vacancies.where(
    Vacancy.project_id_column.eq(Project.id_column)
  ).arel.exists
)

Solution 6 - Sql

In Rails 4+, you can also use includes or eager_load to get the same answer:

Project.includes(:vacancies).references(:vacancies).
        where.not(vacancies: {id: nil})

Project.eager_load(:vacancies).where.not(vacancies: {id: nil})

Solution 7 - Sql

I think there's a simpler solution:

Project.joins(:vacancies).distinct

Solution 8 - Sql

Without much Rails magic, you can do:

Project.where('(SELECT COUNT(*) FROM vacancies WHERE vacancies.project_id = projects.id) > 0')

This type of conditions will work in all Rails versions as much of the work is done directly on the DB side. Plus, chaining .count method will work nicely too. I've been burned by queries like Project.joins(:vacancies) before. Of course, there are pros and cons as it's not DB agnostic.

Solution 9 - Sql

You can also use EXISTS with SELECT 1 rather than selecting all the columns from the vacancies table:

Project.where("EXISTS(SELECT 1 from vacancies where projects.id = vacancies.project_id)")

Solution 10 - Sql

The error is telling you that vacancies is not a column in projects, basically.

This should work

Project.joins(:vacancies).where('COUNT(vacancies.project_id) > 0')

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionjphortaView Question on Stackoverflow
Solution 1 - SqlArtaView Answer on Stackoverflow
Solution 2 - SqljvnillView Answer on Stackoverflow
Solution 3 - SqlPeter AlfvinView Answer on Stackoverflow
Solution 4 - SqlDorianView Answer on Stackoverflow
Solution 5 - SqlDavid AldridgeView Answer on Stackoverflow
Solution 6 - SqlkonyakView Answer on Stackoverflow
Solution 7 - SqlYuri KarpovichView Answer on Stackoverflow
Solution 8 - SqlkonyakView Answer on Stackoverflow
Solution 9 - SqlK M Rakibul IslamView Answer on Stackoverflow
Solution 10 - SqlwkhatchView Answer on Stackoverflow