Database System and SQL

8 minute read

1. Introduction

SQL is a simple yet powerful language that helps analysts/developers manage and query data directly from a database, without having to copy it first.

2. Data/Database

Data is distinct pieces of information and A database is a well-organized collection of data that is stored in an electronic format. In other words, it is an electronic system that allows to easily 1) access, 2) manipulate, and 3) update data.

3. Database Management System (DBMS)

DBMS is a system such that simply various databases (collections of data) can be managed by a management system (a set of programs to store and retrieve those data). DBMS provides 4 ways to 1) create, 2) retrieve, 3) update, and 4) manage data.

4. Types of DB Architecture

When creating a network, a network administrator can two options to implement:

  • File-Server
  • Clinet-Server

Likewise, two types of database architecture comes. File-server systems primarily deal with file storage while client-server systems divide workload requests among a number of servers.

4.1. File-Server Architecture

In a file-server architecture, files are located on the local system so whenever clients send a request for a file through the network, the file server checks their authentication to send a response (sending the requested file/rejecting to send the file).

  1. Software (client) requests a file from a file-server.
  2. Software opens/changes/stores the file in memory and saves it back to the file server.

While a user requests a file from file-server architecture, the server locks the file to prevent other users from accessing it and sends the data to the user’s workstation. A file-server system connects to a network with shared computer file storage, eliminating the need for each station to have copies of the same files. If multiple clients access to the same file concurrently, then file locking is needed to prevent loss of updating data. However, file locking (or ESRI’s implementation) is very inconvenient because other clients should wait until the first one who has editing right finishes his/her job.

4.2. Client-Server Architecture

In the client-server architecture, the database is the server and any application that uses the data is a client.

  1. Software requests data from the database
  2. Data is held in memory, but when a change is made, a request (SQL statement) is generated.
  3. Once you save the data, these requests are passed to the database in the order they were generated.

Many clients can work simultaneously in the same data, minimizing a risk of data loss. Moreover, version control is an option to keep.

4.3. DBMS Architecture

The architecture of DBMS depends on the computer system on which it runs. For example, in a client-server DBMS architecture, the database systems at server machine can run several requests made by client machine. We will understand this communication with the help of diagrams.

There are three types of the DBMS architecture:

4.3.1. Single Tier

In this type, the database is readily available on the client machine so any request from the database doesn’t require a network connection.

4.3.2. Two Tier

In this type, the database system is present at the server machine and the DBMS application is present at the client machine. These two machines are connected with each other through a network, so whenever client machine requests to access the database present at server using SQL, the server perform the request on the database and returns the result back to the client.

4.3.3. Three Tier

In this type, server DBMS application layer is added between users and databases; a client DBMS application does not communicate directly with the database systems present at the server machine. Through a server DBMS application, a client DBMS application communicates with the database systems.

5. SQL

SQL (Structured Query Language) helps you to communicate with the database by commands. The following shows the features of SQL database:

  • Allows users to extract data from relational database.
  • Allows to create database and tables
  • Allows updating, inserting, deleting, and altering database and tables
  • Provides security and Allows to set a permission

5. Types of SQL Commands

SQL commands are traditionally divided into four categories:

  • DQL (Data Query Language): SELECT \(\rightarrow\) to retrieve data.
  • DDL (Data Definition Language): CREATE, DROP, ALTER \(\rightarrow\) to create and restructure database objects.
  • DML (Data Manipulation Language): SELECT, INSERT, UPDATE, DELETE \(\rightarrow\) to manipulate data within the tables.
  • DCL (Data Control Language): GRANT, REVOKE \(\rightarrow\) to control the distribution of privileges among users and create objects related to user access.

6. Entities, Fields(Attributes), and Records

Every relational database comprises of one or more tables that contain data about a designed entity.

An entity is type of thing, not a thing itself like a class in object-oriented language.

For instance, in the case that you own a company that makes model cars like miniature Hyundai and Ferraris(let’s simplify only two kinds), we need a table format to store information of two different models in stock. We need to build a general ‘Model’ entity table. When you store all the records (rows) of different models and all the important fields/attributes (columns) these models have, the entity is designed generally model vehicle, the record is for a specific model, and the fields are the qualities these specific models have.

prod_id prod_name prod_line quantity price
M1111 2015Accent Med_Sedan 3000 33.12

For each model vehicle we sell, we’ll create new records, rows to store information fields about our model vehicles: what the name is, which type it is, how many we have in stock, etc. The table below contains brief definitions of each of the above terms.

Term Definition
Entity A type of thing
Record An example of en entity (row)
Field An attribute of a record (column)

7. Data Types

Every column has an associated data type. This feature of RDBMS improves data quality by preventing the accidental introduction of wrong type and maintaining all data in each column to be the same type.

8. Primary keys

Records require an identifier field to find it easily like a key in dictionary; each record must have unique value to be differentiable. That unique value is called primary key. This primary key must be unique, non-null, and immutable.

We can create the above table with a piece of SQL like this:

CREATE TABLE products (
    prod_id int NOT NULL,
    prod_name varchar(70) NOT NULL,
    prod_line varchar(50) NOT NULL,
    quantity int NOT NULL,
    price decimal(10,2) NOT NULL,
    PRIMARY KEY (prod_id) )

This SQL statement has DDL to make a data structure and DML to update, retrieve, or delete the desired data.

9. NoSQL

Whereas SQL (relational databases) is useful for handling structured data that organizes elements of data and standardizes how they relate to one another and to different properties, NoSQL (non-relational databases) allows you to store and retrieve unstructured data using a dynamic schema for its flexible ability to create a unique structure, and can be document, graph, column, or even KeyValue organized as a data structure.

For instance, we have a hospital tracking medical relational database. In this case, each patient (record/row) have the field of id, test1, test2, test3, dob, and b_type.

id test1 test2 test3 dob b_type
Null Null Null Null 1-06 Null
2732 80 Null Null 1965 A-
Null Null Null Null 15 07 Null
2946 Null 92 Null 1965 Null
Null Null Null Null 16 07 Null
3650 86 Null Null 1965 O

In reality, most data miss values a lot and contain inconsistent values; poor data integrity and poor data quality. The values for the dob column spill out over two rows, with the date and month on an upper row and the year on the lower. It is possible that the first two rows indicate one patient whose ID: 2732, test1 result: 80, test3 result: 95, dob: 01-06-1965, and blood type: A-.

In many cases, we don’t want a big, unwieldy relational database full of super sparse tables like this one. We’d prefer to rearrange our data in a key-value table, as follows:

id key value
2732 test1 80
2732 test3 95
2732 dob 01-06-1965
2732 b_type A-
2946 test2 92
2946 dob 15-07-1965
3650 test1 86
3650 dob 16-07-1965
3650 b_type O

Now, the sparse table was transformed into the better table to analyze. NoSQL works by replacing complex relational databases. These tables use key-value data model. NoSQL uses no schema or dynamic schema, which means that our data is susceptible to bad data entry.

NoSQL has its strengths and weaknesses:

10. Entity-Relationship Diagrams

In relational databases, schemas constrain updates or changes to the database to ensure that the data continuously maintain the same data type for each column.

Suppose that we have a database containing the following tables: ‘Manager’, ‘Branch’, ‘Employee’, and ‘Location’. We can represent the relations between these tables with a diagram called Entity-Relationship Diagram.

To help us interpret this, we can use the following notations:

See how the diagram captures the descriptions of the specific relationships between the tables. For example, a given employee may not work at any branch, but if they do, they’ll work at just one. This accounts for the Optional Multiplicity of One relationship (Op 1:1) on the far right of our diagram. And since a given branch may have no employees, but if it does, it may have multiple employees, we have an Optional Multiplicity of Many relationships (Op 1:*) just to the left of the previous one; second from the furthest right.

11. Personal Relational Database Query Project (PostgreSQL and SQLite)

Please check here (.ipynb file in html) if you are interested in how I implemented practical data ; I explained when to use and how to use sql techniques. This is motivated by DataCamp’s Joining Data in SQL and DataCamp’s Intermediate SQL. The former one is queried via PostgreSQL and the latter one is studied via SQLite3. If you are interested in implementing my code, check here.

Categories:

Updated:

Leave a comment