{ MANOYLO }

PostgreSQL Basics Part 1

Create tables, insert and select values

SQL

Posted on Feb 25, 2020

Databases. The heart of any application. If you ever used a web framework like Ruby on Rails, you probably wondered how all those database migrations you keep typing in the command line work. I’m assuming that you have PostgreSQL (or postgres) installed on your machine. Let’s dive in into some basic postgres.

Design

Before writing any SQL, we need to think about what are we creating. So, for this example, let’s make a blogging social network database similar to Facebook.

Thinking about what that might look like, first things that come to mind are that we will have users, we will have articles and we will have comments. We would also want to be able to friend a user, comment on many posts, and favourite many articles.

So what tables do we have to create here?

Let’s start with:
  1. users table;
  2. articles table;
  3. and comments table.
To create users table, we would write:
create table users (
  id serial primary key, 
  first_name text not null, 
  last_name text not null, 
  age int, 
  email text unique not null
);
So what is happening here?
create table users is pretty self-explanatory. A database table needs an ID that is being auto-incremented each time a new record has been created. So, to do that, we will type id serial primary key. The id is the name of the column, serial means that the value will be auto-incremented. What is primary key?

Primary key

The primary key consists of one or more columns whose data contained within is used to uniquely identify each row in the table. Think of primary keys as your home address. Your street (table) has many buildings (which are table rows), but there is only one building with your number in that street - and that is the primary key.

When defining your table, you specify the primary key column. A table can have only one primary key. We can also have multiple columns set as primary key. Approach like this is used for something called “many to many” relationship, and we will get to that a little bit later.

While we are on the topic of keys, let’s mention the other key - foreign key.

Foreign key

A foreign key is one tables primary key (ID) stored as a column in another table. For example, let’s use the users and comments from this example. One user can have many comments, but a comment can only belong to a one user. This is called a “one to many” relationship. So how does a foreign key fit the picture?

Easily! When creating a new comment, we would assign user ID value to comments user_id column.

OK, let’s move on.

Create users table

To create our first table, we would write:
create table users (
  id serial primary key, 
  first_name text not null, 
  last_name text not null, 
  age int, 
  email text unique not null
);
This table needs data, so let’s add some. The way we do this is by using a command like this:
insert into table_name (
  column_1, 
  column_2, 
  ... 
  column_n
) values (
  value_1, 
  value_2, 
  ... 
  value_n
)
That being said, let’s add our first 3 users:
insert into users (
  first_name, 
  last_name, 
  age, 
  email
) values (
  'Marko', 
  'Manojlovic', 
  35, 
  'marko.manojlovic.bg@gmail.com'
);

insert into users (
  first_name, 
  last_name, 
  age, 
  email
) values (
  'Mark', 
  'Cuban', 
  61, 
  'mark.cuban@mavs.com'
);

insert into users (
  first_name, 
  last_name, 
  age, 
  email
) values (
  'Bill', 
  'Gates', 
  64, 
  'bill.gates@microsoft.com'
);
And done! Our users table now has 3 users!
To verify this, we can check it by typing:
select * from users;
What this does is selecting all (the * symbol) from a table, in our case from a users table.

Great! Let’s continue to our next step:

Articles table

Our articles table is pretty simple: it has an ID that is unique and a primary key, it has a title that is type of text and can’t be null (must have a value when creating it), it has a body that is also type of text and has a default value of ‘content not provided’, and we have a foreign key: an ID of user who is creating this article. Foreign key also can’t be null, or in plain English - each article created must belong to a user.

To create our articles table we would run:
create table articles (
  id serial primary key, 
  title text not null, 
  body text default 'content not provided', 
  user_id int references users(id) not null
);
Next, let’s create an article. This article was written by Marko (ID: 1), it is about rainy days in Vancouver (title: ‘Why does it rain so much in Vancouver?’) and he haven’t added any article body yet:
insert into articles (
  id, 
  title, 
  user_id
) values (
  default, 
  'Why does it rain so much in Vancouver?', 
  1
);
There are a few things you should notice in the line above:
  1. in our command, we have explicitly specified the ID column with a default value. This is really not needed, so you can skip that;
  2. we haven’t provided a body column, nor its value. If a column has no value, it will simply be created as null. However, in our case, body has a default value of ‘content not provided’, so that will be assigned as its value.

Changing values

A year has passed since our users have joined our social medial platform, thus their age has increased by 1. So we have to update their age accordingly.

We can do it by selecting users one by one, which on our case is perfectly fine to do since there are only 3 users. But, in larger applications, you won’t be updating values for each user manually.

Let’s see both ways of doing so.

Our user with ID 1 has a new age value that is his current age + 1. To do this, we would write:
update users set age = age + 1 where id = 1;
OK, let’s explain this word by word:
  1. update users => pretty self explanatory;
  2. set age = age + 1 => we are setting a new value to the column name called age by simply increasing its value by 1;
  3. where id = 1 => do all the things above for user with ID of 1.
Let’s imagine that our social media platform got super popular over night and now we have 1,00,000 users to update the age for. We can’t be writing the line above one million times, right?

Well, the answer is - it depends. It depends if you are willing to spend next 10 years of your life doing that every single day.

If you love yourself and do not like to suffer for really no reason, you should update all users age with one statement like this:
update users set age = age + 1;
So what is different between these two? We have left out the where id = 1; part from the second statement. And that’s it!

Now run select * from users to check if everything was updated. And it should be good to go.

Finding Nemo

So far, we have seen how to create a new table, how to add and update the content. However, one of the things you will be doing a lot is finding stuff in the database.

This is where all the fun starts!

So, what are we looking for? Let’s find first names for all of our users:
select first_name from users;
Since fetching only first names of all users is probably not very useful, so let’s also add an ID to this statement:
select id, first_name from users;
We can write as many column names as we want in a statement like above. Of course, we have to have those columns in our table to begin with.

Let’s write a bit more complex query. Why don’t we get all articles written by user with ID 1?
select * from articles a inner join users u on a.user_id = u.id;
OK, your head must be spinning, but, that’s just the beginning to be honest. It will spin more, and more. Just kidding.
Let’s figure out what is happening here:
  1. select * from articles => is a familiar statement, we are getting all columns from articles table;
  2. the character a after articles is not a typo, it’s called an alias. Since we have some logic, we can assign an alias to a table that we can use in the statement. So instead of typing articles all the time, we simply type a;
  3. inner join users u => OK, let’s pause here for a sec.

Inner join

A INNER JOIN creates a new result table by combining column values of two tables (table1 and table2) based upon the join-predicate. So, in plain English: the query compares each row of table1 (articles) with each row of table2 (users) to find all pairs of rows, which satisfy the join-predicate.

More about joins

There are a few different types of joins and each does a different thing. We’ve seen the inner join example above, where inner join returns all records that match the join-predicate. In the case above - return all articles where article user_id matches the user id.

Outer join

An outer join is used to return results by combining rows from two or more tables. But unlike an inner join, the outer join will return every row from one specified table, even if the join condition fails. For example, if we write a statement like this:
select title, u.id, u.first_name from articles full outer join users u on articles.user_id = u.id;
what we get as a result is a full join between articles and users table, so we get all users that have an article, and all users without any articles (all article values are null for these users).

You noticed the keyword full between articles and outer part.

Full join

A FULL JOIN returns all the rows from the joined tables, whether they are matched or not i.e. you can say a full join combines the functions of a LEFT JOIN and a RIGHT JOIN.

Left join

We also have something called left join. The LEFT JOIN clause allows you to query data from multiple tables. It returns all rows from the left table and the matching rows from the right table. If no matching rows found in the right table, NULL are used.

Let’s look at this from two angles: one from the users table and one from articles table. If we run this:
select first_name, a.title from users left join articles a on users.id = a.user_id;
What we will get here are all users from our database, with article title and article ID columns. However, since Bill didn’t write any articles, his articles title and articles IDs are all null.

If we do a similar query from articles angle, and write something like this:
select title, u.id, u.first_name from articles left join users u on articles.user_id = u.id;
we will get a list of all articles with users first name and ID who wrote them, but without Bill Gates since he didn’t write a single article yet.

Left join is called left because it returns all rows from left table and matching rows from the right table, which is articles table in our case. In the first example, querying from users angle, our users table was the left table, and if no matching rows found in the right table, NULL are used, which is true as Bill Gates don’t have any articles but we see him in results with null values for title and article ID.

From the second example, our articles table is our left table, and our users table is in this case our right table. Our query said “select title, user id, user first name from articles table and match user_id on articles table value with user id values from users table.”

So this query got all our articles records, it saw the user_id value in it, then it wen’t to users table, compared the user_id with user.id value and returned only those values. That’s why we don’t see Bill Gates in this example, as he doesn’t have any articles written thus his ID is not existing in articles table under user_id column.

Right join

Right join is opposite from left join. It will return all rows from the right table and matching rows from the left table, even if there are no matches in the left table. So it works opposite from left join.

Returning a single user

OK, all these joins…. Your head must be high in the clouds by now. Let’s see how we can return one record from the database. Let’s grab user with an ID of 1 and return the name column:
select first_name from users where id = 1;

Creating a new table - friends

OK, time to move on. Let’s create a new table in our database called friends. On almost all social network platforms you can add a friend, or follow another user. Well, our network is not any different.

Let’s see how that works:
create table friends (
  user_id1 int references users(id), 
  user_id2 int references users(id), 
  primary key (user_id1, user_id2)
);
You must be thinking “this guy is crazy!” Yes, I am. BUT, let’s see a bit closer what is going on here:
  • create table friends => self explanatory;
  • user_id1 int references users(id) => we are adding first foreign key to this table called user_id1; this is a value of user.id column from users table;
  • user_id2 int references users(id) => we are adding the second foreign key to this table called user_id2; this is a value of user.id column from users table as well;
  • primary key (user_id1, user_id2) => we are setting a primary key that is a combination of two foreign keys. This is also called a composite key.

What is a composite key?

If you remember from one of my paragraphs above, a primary key is defined as a key or database column which uniquely identifies each row in a database table. A composite key is a set of more than one key that, together, uniquely identifies each record. You will also notice that there is no ID column on this table. We don’t really need an ID for this example.

This approach is used when you want to declare a many to many relationship. In our example, Marko can be friend with Bill. Marko can be friend with Jane. But also Bill can be a friend with Jane and Adrian.

Let’s be friends, friend!

OK, let’s add the first record to our newly created friends table:
insert into friends (user_id1, user_id2) values (1, 2);
Marko just befriended Mark Cuban! Hooray!

Why is this useful?

Well, now we can write some crazy queries, that’s why! For example, one of the most important things about any social media platform is the feed. So, since our platform is a blogger oriented social media platform, the feed we would consider making is showing to a user all articles from the friends list. We can go even further, and show not only articles that Marko’s friends wrote, but also the articles they liked as well.

In part 2 we will write some crazy queries.