Using a Subquery Instead of Pluck

Ruby gets a bad rap for being slow, but as countless blog posts across the Internet have told anyone who'd ever be reading this one, it's all about how you write the code. Anything can be slow if you're writing mediocre code -- not that I don't write my fair share, of course. One easy performance issue to miss cropped up in some code I was reviewing recently though:

scope :has_foo, -> { where(id: Foo.pluck(:bar_id).uniq) }

That's perfectly valid code. Bar.has_foo.to_sql will return something like

select * from bars where id in (1, 2, 3)

and it returns exactly what you'd expect.

Except. You're suddenly slowing things way down. Why? Pluck returns an array.

The code above is actually querying the foos table for every bar_id, using Ruby to find the unique values, and then passing that into another query against the bars table. For a small table this might not even really be noticeable, but as the size of the table grows so does the time this is going to take. In the case of the code I was reviewing, the table is one of the biggest in our entire database and would have been very not fun for anyone trying to use this scope somewhere that performance matters (hint: everywhere).

Instead of using pluck, just use a regular ole subquery:

scope :has_foo, -> { where(id: Foo.select(:bar_id)) }

Now Bar.has_foo.to_sql will return something like

select * from bars where id in (select bar_id from foos)

Suddenly your database specifically designed to make queries like that blazingly fast has done all of the hard work for you in a single hit. Your code is faster with zero real effort.