Relational Databases & Data Modeling

November 23, 2019

The past couple of weeks at Lambda School were the most challenging for me so far.

However, our instructor Sean Kirkby did a great job explaining the material thoroughly and breaking down concepts to make it easier to understand.

I learned:

The basics of relational Databases
The basics of SQL
How to use Knex.js to make queries, migrations and seeds
How to use the SQLite RMDBS and SQLite Studio
What foreign keys are and how to use them to make multi-table JOIN queries
The principles of data normalization
How to model data using DbDesigner

Notes

Relational Databases & SQL

Databases are collections of data organized for easy retrieval and manipulation.

A database is often necessary because our application needs data persistence (the data needs to be safely stored and remain untouched unless intentionally modified).

Relational databases are databases that store the data in tables.

Tables are made up of rows and columns, where each column is a field type and each row is a single record in the table.

Tables often have a primary key column, which is used to uniquely identify each row.

SQL (Structured Query Language) is the standard language used to manage databases and the data within them.

INSERT, SELECT, UPDATE and DELETE are the CRUD commands for SQL.

Examples:

SELECT first_name, last_name, salary
  FROM Employees;

INSERT INTO Employees (first_name, last_name, salary)
  VALUES ('Seong', 'Kim', 100000);

UPDATE Employees
  SET salary  = 1000000
  WHERE first_name = 'Seong'

DELETE FROM Employees
  WHERE first_name = 'Seong'

Note: In an actual query, we would use the primary key (usually an ID) instead of the first_name value to UPDATE and DELETE as there could be employees with the same first name

Knex

Knex.js is a query builder - a JavaScript library that allow us to interface with a database using a JS version of SQL.

The Knex version of the above examples would look like:

db('Employees')
  .select('first_name', 'last_name', 'salary');

db('Employees')
  .insert({ first_name: 'Seong', last_name: 'Kim', salary: 100000 });

db('Employees')
  .where({ first_name: 'Seong' })
  .update({ salary: 1000000 });

db('Employees')
  .where({ first_name: 'Seong' })
  .del();

SQLite & Schemas

A DBMS (Database management system) is a specialized software that allows us to create, access and manage our databases.

For relational databases, some examples of RDBMSs are Postgres, SQLite, MySql and Oracle.

SQLite is a is a lightweight RDBMS (not a database).

A database schema is the shape of our database. It defines what tables should exist, which columns should be in those tables, and any restrictions on each column.

When designing a single table, we need to ask three things:

What fields (or columns) are present? (eg. id, name, address, etc.)
What type of data do we expect for each field? (eg. integer, text, null, etc.)
Are there other restrictions needed for each column? (eg. default value, not null, unique, primary key)

In Knex can use migrations to define any changes to the structure of our database, such as adding new tables and modifying existing tables.

We can use seeds to prepopulate our database with sample data.

Multi-table Queries

Foreign keys are type of table field used to connect a record in one table to a record in another table. They're often integers that identify (rather than store) data.

We can use a SQL JOIN command with foreign keys to query data from multiple tables using a single SELECT command.

Eg.

SELECT * FROM Employees
  JOIN Departments
  ON Employees.department_id = Departments.id

The JOIN command joins every record in a table with a record in the second table and creates a temporary table with every record. (So if we had ten rows in one table and ten rows in the second table, there would be 100 rows in the temporary JOINed table.) The ON command filters through this temporary table and only SELECTs from the rows that meet the clause. So in the example above, we would only get Employees where Employees.department_id (foreign key) is equal to Deparments.id

Data Modeling

Data Normalization is the process of designing or refactoring databases for maximum consistency and minimum redundancy.

Normalization Guidelines:

Each record has a primary key.
No fields are repeated.
All fields relate directly to the key data.
Each field entry contains a single data point.
There are no redundant entries.

When modeling the data for multiple tables, there are different relationships between the tables.

One to One

Lets say we want to model employees and social security numbers.

Each employee has one SSN, and each SSN can be assigned to one employee.

This data can be represented in two tables: employees and social_security_numbers. (We can actually use one table, but sometimes it's good to keep separate concerns in separate tables.)

We can use a foreign key in one table to link to the other table.

One to Many

Lets say we want to model employees and office location.

Each employee can have one office, but one office can have many employees.

This can also be represented in two tables: employees and offices. But for one to many relationships, we want to foreign key to be in the 'many' table - so in this case we want to have a office_id foreign key in the employees table.

Many to Many

If we wanted to model employees and projects, each employee can have many projects, and each project can have many employees.

This is a many to many relationship.

In this case, we would need to introduce an intermediary table that hold foreign keys that reference the primary key on both employees and projects.

Overall it was a tough but rewarding two weeks. I'm enjoying the process of thinking about how to model data and build out the back-end of an application.

I'm looking forward to the Authentication and Testing section coming up, but next week is Thanksgiving break for Lambda School. I'll use the time to brush up on my CS fundamentals.

← back to all posts