As web developers, we deal with SQL often. Perhaps it’s mostly abstracted away from us using an ORM like ActiveRecord. But sometimes we find ourselves trying to get more complex data and reach for SQL. We hit an error and scratch our heads. Without a proper mental model of SQL, it’s very difficult to figure out what to do. Its syntax doesn’t share as much with languages like Ruby or JavaScript as we might like. In this three part series, I hope to give you strong mental models for three aspects of SQL: WHERE
and SELECT
, JOIN
, and GROUP BY
.
Test Data
PostgreSQL provides some sample datasets that we can use. For this post, I’m using the world
dataset. If you have a postgres server running, here’s how you can download, unpack, create, and import this dataset:
$ curl -O https://ftp.postgresql.org/pub/projects/pgFoundry/dbsamples/world/world-1.0/world-1.0.tar.gz
$ tar -zxvf world-1.0.tar.gz
$ createdb world
$ psql world < dbsamples-0.1/world/world.sql
Then, we can connect to our new database:
psql world
The dataset has three tables, city
, country
, and countrylanguage
. city
and country
are lists of places around the world and countrylanguage
lists the languages spoken in a given country. You can take a look around at what kind of data we have using \d
or \d [table name]
.
WHERE is for Rows
When querying, we can use the WHERE
clause to filter the rows we get back. To use WHERE
we specify a condition like “population less than 1000”:
SELECT *
FROM city
WHERE population < 1000;
I visualize the result of this query by highlighting every row where the value is less than 1000:
SELECT is for Columns
Conversely, if we want to get back only certain pieces of data (i.e. columns) of each row, we use SELECT
:
SELECT name, countrycode
FROM city;
Similarly, I like to visualize this by highlighting just the columns of the city
table:
Both at the Same Time
And the real power of SQL is that we can combine both together:
SELECT name, countrycode
FROM city
WHERE population < 1000;
Interestingly, this selects two columns (name, countrycode
), and uses a third column (population
) to filter the data. This works because WHERE
is run before SELECT
.
Parting Thoughts
Maybe you already knew this but needed a bit of practice. Play around with the results a bit and try some other queries. You can do multiple WHERE
queries by using AND
like: WHERE population < 1000 AND countrycode = 'AIA'
In part 2, we’ll talk about combining two tables together with JOIN
, and part 3 will cover GROUP BY
.