Older posts...
Older posts...
23 Sep
Designing APIs in a resource-oriented architecture
23 Sep 2014
Designing APIs in a resource-oriented architecture
28 Sep
How I'm going to land my dream job
28 Sep 2014
How I'm going to land my dream job
1 Oct
Neural net training fail
1 Oct 2014
Neural net training fail
13 Oct
Pow + SSL without the hassle
13 Oct 2014
Pow + SSL without the hassle
17 Oct
Using machine learning to rank search results (part 1)
17 Oct 2014
Using machine learning to rank search results (part 1)
23 Oct
Using machine learning to rank search results (part 2)
23 Oct 2014
Using machine learning to rank search results (part 2)
9 Nov
Managing complexity in Go
9 Nov 2014
Managing complexity in Go
25 Nov
Remote work: an engineering leader's perspective
25 Nov 2014
Remote work: an engineering leader's perspective
19 Sep
Running A/B tests on our hosting infrastructure
19 Sep 2016
Running A/B tests on our hosting infrastructure
27 Mar
Every service is an island
27 Mar 2017
Every service is an island

Tip: Good indices in relational databases

ActiveRecord, like any other ORM (Hibernate, Datamapper, et al.) adds lots of sugar for your persistence needs, but it’s not magic.

On of the things it won’t do for you is make sure your queries run quickly.

Here’s a few tips to achieve what is often overlooked in Rubyland.

Whether you modify an existing named scope or add a new one, or when you write a new query, make sure you have the proper indices.

This particularly applies if you’re going to run non-trivial queries of course (admin backends, analytics, etc).

Compound indices

A chain of scopes results in (usually) one query. You should take into account all attributes (columns) that are used in :conditions, :join, :group, :having, and :order, as all those result in filtering and sorting–slow operations without indices.

Take the list of all those attributes: the table should have at least one compound index that includes all those attributes. It can sometimes be enough to already have an index on a subset of your query’s columns, but it is not enough to have 2 indices covering all your columns

If not, add a new index.

Index order

Ordering keys in indices is important: in general, order columns in your index from lowest to highest cardinality, typically flags and enumerations first, then foreign keys, timestamps, and finally your table’s primary key (:id).

Using parts of indices

When looking for an index to use, MySQL only use leftmost parts of compound indices. For instance, if querying on columns a, b, and c, an index on (a,b) will be used, but not an index on (b,c).

Issues with sorting

When sorting, only leftmost part of indices are used. To optimize sorting you’ll need an index that starts with your sorting criteria in the same order.

For instance, if sorting with :order => "created_at DESC, id DESC you need a compound index that looks like:

 add_index :things, [:created_at, :id]

Sort direction is important too. Don’t mix ASC and DESC or no indices will be used.

Using EXPLAIN

All the above is more guidelines than general rules.

To figure out whether your query’s efficient or not, just run your query prefixed with EXPLAIN. If it tells you it’s using temporary tables or filesort… your query’s probably not going to be very fast.

Example query

SELECT `orders`.*
     FROM `orders`
INNER JOIN `payments`
    ON payments.order_id = order.id
WHERE (order.status IN ('confirmed','pending')
GROUP BY order.id
ORDER BY order.created_at DESC, order.id DESC
LIMIT 0, 100

Efficient indices:

add_index :payments, [:order_id]
add_index :orders, [:status, :created_at, :id]