How to Use Lateral Joins in PostgreSQL
Once upon a time, my queries were a mess. I didn’t know how to use lateral joins, so I would copy-and-paste the same calculations over and over again in my queries.
Co-workers were starting to talk.
Lateral joins allow you to reuse calculations, making your queries neat and legible. Let's learn about lateral joins by rewriting an atrocious query together.
We'll use a cool sample dataset of real Kickstarter projects, if you'd like to follow along.
For each Kickstarter project, we want to calculate:
- total pledged in USD
- average pledge in USD
- USD over or under goal
- duration of the project in days
- daily shortfall / surplus, the extra USD needed daily to hit goal
Queries, Before and After
select (pledged / fx_rate) as pledged_usd, (pledged / fx_rate) / backers_count as avg_pledge_usd, (goal / fx_rate) - (pledged / fx_rate) as amt_from_goal, (deadline - launched_at) / 86400.00 as duration, ((goal / fx_rate) - (pledged / fx_rate)) / ((deadline - launched_at) / 86400.00) as usd_needed_daily from kickstarter_data;
Without lateral joins, see how often I reuse the same calculations:
Yuck. Not only does this make the query difficult to read, it introduces risk of typos or other errors if I ever need to make an update.
select pledged_usd, avg_pledge_usd, amt_from_goal, duration, (usd_from_goal / duration) as usd_needed_daily from kickstarter_data, lateral (select pledged / fx_rate as pledged_usd) pu lateral (select pledged_usd / backers_count as avg_pledge_usd) apu lateral (select goal / fx_rate as goal_usd) gu lateral (select goal_usd - pledged_usd as usd_from_goal) ufg lateral (select (deadline - launched_at)/86400.00 as duration) dr;
With lateral joins, I can define the calculation just once. I can then reference those calculations in other parts of my query.
lateral keyword allows us to access columns after the
FROM statement, and reference these columns "earlier" in the query ("earlier" meaning "written higher in the query").
SQL queries run in a different order than you might expect. In fact,
JOIN are the first statements run. Therefore it's no problem to reference columns after the
Image Credit: Julia Evans
Word of warning: stick to simple mathematical operations when writing lateral joins for calculations. Aggregate functions like
SUM() are not supported.
Happy querying! 🍭