Blog

We write about Ember.js, Ruby on Rails as well as Elixir and Phoenix.

Back End & Full Stack

SQL Fundamentals

Most web applications rely on storing their data in a relational database, consisting of tables which are comprised of columns and rows. PostgreSQL, MySQL and SQLite are the most popular and established relational databases, and luckily, they have a LOT in common.

1
Module 1 Duration: 100 minutes

1 — Foundation of Relational Databases

Before we dive into our workshop project, we will spend some time to lay the foundation for relational databases and SQL. Learning whether a given task is best handled by your database or application layer is a big part of ensuring your apps perform well under heavy loads.

Agenda
Foundation of Relational Databases Duration: 15 minutes
9:00
Welcome and Tech Check

We’ll get to know each other and ensure everyone is set up for the workshop project.

Foundation of Relational Databases Duration: 30 minutes
9:15
Relational Algebra and Codd’s Relational Model

In 1970, Edgar Codd invented a new way to model large, organized and shared piles of data using the expressive semantics provided by relational algebra. Today, virtually all relational databases are still based on these fundamental principles. We’ll cover the conceptual models behind tables, columns, result sets and “joins”.

Foundation of Relational Databases Duration: 15 minutes
9:45
Structured Query Language

Virtually all relational databases use some variant of a (mostly) declarative programming language called Structured Query Language (SQL) to perform operations. We’ll learn what SQL looks like, and try writing a few statements together.

Foundation of Relational Databases Duration: 20 minutes
10:00
Three forms of SQL databases

We’ll focus on a few types of databases, all of which fit our definition of “SQL Databases”.

  • Hosted databases exist as a completely independent system component, often running on their own server. PostgreSQL and MySQL are among the most popular hosted relational database products today.
  • Embedded databases are often packaged with an application instance, often as a file on disk. From the outside world, it’s hard or impossible to separate “app” from “database”. We’ll be working extensively with SQLite - a very popular relational embedded database widely used in mobile, desktop and web applications.
  • Spreadsheets can also be regarded as databases. Although more limited than the other types of databases, a surprising amount of SQL syntax can be used to perform advanced queries and calculations! We’ll be using the google visualization API, which allows us to “query” a google spreadsheet using a SQL-like syntax and get JSON back.
Foundation of Relational Databases Duration: 20 minutes
10:20
DB Management Tools

We’ll look at a few tools that will help us on our journey to learn more about SQL databases.

2
Module 2 Duration: 210 minutes

2 — Retrieving Data

The first thing we will learn is how to get data out of a database in a variety of ways. We will begin with the simplest possible queries; move on to filtering our result set; join tables together to retrieve the data we are interested as quickly and easily as possible.

Agenda
Retrieving Data Duration: 20 minutes
10:40
SELECTing a collection of data

SELECT is the best and easiest way to begin working with a SQL database! But as we’ll see later on, it is by far the most complex type of query we’ll encounter.

Retrieving Data Duration: 30 minutes
11:00
EXERCISE: Selecting Columns

Selecting all columns in a table is generally inappropriate for a production app. We’ll explicitly pick which columns we need for several collections of data that our app needs, and witness the improved performance gained by making this simple change.

Retrieving Data Duration: 30 minutes
11:30
Filtering via WHERE clauses

It is often undesirable to work with all tuples or “rows” from a given table. Adding a WHERE clause to our SELECT query allows us to specify one or more criteria for filtering the result set, down to only what we are interested in.

Retrieving Data Duration: 30 minutes
12:00
EXERCISE 2 - Filtering via WHERE clauses

We’ll add WHERE clauses to the collection queries for two pages on our app.

  • On the products list page, we’ll allow the user to filter by those products that need to be reordered, those that are discontinued, or the full list,
  • On the customer list page, we’ll add a rudimentary search field and use a LIKE clause to find matching rows.
Retrieving Data Duration: 60 minutes
12:30
Lunch

Break for Lunch

Retrieving Data Duration: 20 minutes
13:30
LIMITing and ORDERing the result set

Particularly when working with large collections of data, it is important to be able to sort data the way we want and paginate or scroll through the results. We’ll learn how to use LIMIT and OFFSET to retrieve the records of interest, and ORDER BY to sort.

Retrieving Data Duration: 20 minutes
13:50
EXERCISE: Sorting and Paging

In our example app, the orders page has over 16,000 records. This is way too much data to show to users all at once. Even looking at an individual customer’s orders is a bit overwhelming. We’ll use the existing user interface for sorting and pagination, and modify the “orders list” and “customer orders list” queries as appropriate.

3
Module 3 Duration: 110 minutes

3 — Querying Across Tables

Time to put our newfound knowledge of relational algebra into practice! One of the great advantages of a relational database is the ability to mix tables together in queries, and aggregate or group across columns. Databases are built to do this kind of work, so it’s often much faster to build the right query than to move similar logic into our application code.

Agenda
Querying Across Tables Duration: 30 minutes
14:10
Inner and outer JOINs

There are five types of joins in most relational database systems, but we can get away with focusing almost entirely on the two categories: INNER and OUTER joins. We’ll learn about the distinction between these two types, and how to pick the right join operation for the job.

Querying Across Tables Duration: 30 minutes
14:40
EXERCISE: JOIN to replace ids with names

There are several places in our app where alphanumeric IDs are shown to users. Humans prefer referring to things by names, so let’s use JOIN to transform these references into records that are more user-friendly!

Querying Across Tables Duration: 20 minutes
15:10
Aggregate Functions and GROUP BY

Often, we are interested in summary data that is aggregated over a result set (example: “give me the number of products we have in each category”). Through using GROUP BY, we can define the records we are interested in. We can use aggregate functions like sum, count, group_concat to aggregate over duplicate data.

Querying Across Tables Duration: 30 minutes
15:30
EXERCISE: Aggregate Functions and GROUP BY

There are several places where some additional aggregate information is needed in order to “fix” the currently broken experience. Firstly, we need to get the subtotal of an order’s line items and display it prominently at the bottom of the order page. Then, we’ll count and concatenate aggregate results as we group records on the employee, customer, and product list pages.

4
Module 4 Duration: 135 minutes

4 — Creating, Updating and Deleting

Now that we have gotten used to the different ways to retrieve data from our database, we will learn how to create, manipulate and destroy records.

Agenda
Creating, Updating and Deleting Duration: 30 minutes
16:00
Creating and Deleting Records

CREATE and DELETE are considerably simpler than the SELECT statement we have been working with so far. More often than not, you’ll be building these queries with values entered by users, so this is a great time to discuss SQL injection attacks and how we can defend against them.

Creating, Updating and Deleting Duration: 30 minutes
16:30
EXERCISE: Creating and Updating Orders

We’ll build the proper queries for creating new orders and updating existing ones, being sure to avoid susceptibility to SQL injection attacks.

Creating, Updating and Deleting Duration: 30 minutes
17:00
Transactions

Transactions allow a sequence of SQL statements to be grouped together and treated by the database as one “all or nothing” unit. This important tool allows us to achieve an even higher level of data consistency and integrity - through the assurance that the entire transaction will either complete, or the database will be left totally unaffected.

Creating, Updating and Deleting Duration: 30 minutes
17:30
EXERCISE: Transactions

We’ll use a transaction to update our SQL statement for creating a new order.

Creating, Updating and Deleting Duration: 15 minutes
18:00
Wrap up and review

We’ll review everything we have covered so far, and set our sights on tomorrow’s topics.

5
Module 5 Duration: 150 minutes

5 — The Schema Evolves

Over time, you will often need to update the schema or “shape” of your data to meet your application’s needs. In this unit, we will learn about using migrations to manage these changes. We can apply database-level constraints via the schema to ensure that even if our application logic misbehaves, our data is always consistent.

Agenda
The Schema Evolves Duration: 30 minutes
9:00
Migrations

As a database-driven system evolves, we often need to make changes to its schema. We’ll discuss best practices for treating a database as a semi-free-standing system component and learn how to commit schema changes to a git repository along with our source code, to provide ourselves with a reliable way to maintain multiple environments across a team.

The Schema Evolves Duration: 30 minutes
9:30
Indices

When we create an index in a database, we are telling it to do some bookkeeping as records are added and updated. When the time comes to search in a particular way, results can be retrieved quickly and directly using indices.

The Schema Evolves Duration: 30 minutes
10:00
Boost JOIN performance via INDEXes

You may have noticed that our database query times increased over the last few exercises. One contributor to this problem has to do with the JOINs we added in exercise 4. Adding an index will tell the database to keep track of particular slices of data at all times, and should dramatically improve these JOINed queries

The Schema Evolves Duration: 20 minutes
10:30
Constraints

To ensure data integrity, sometimes we have the option to put constraints directly on a database. For example, if one record refers to another, we can require that the other record actually exists. We’ll look at several different database-level constraints we can put in place, including NOT NULL, UNIQUE indices and foreign keys.

The Schema Evolves Duration: 25 minutes
10:50
EXERCISE: Adding DB constraints

We’ll add a few constraints to our database to ensure that even if our business logic runs into problems, only consistent and valid records can be stored.

The Schema Evolves Duration: 15 minutes
11:15
Wrap up

We’ll recap everything we have learned, and provide some resources for further learning.

Get your team trained!