Database and donuts

September 2, 2023

Hello There🐼

Let's talk database management.


In this blog we will talk about basic and advanced concepts related to database management and SQL.



Database

There are a lot of topics in DBMS. We are going to cover a few important ones here.


Cardinality and its ratio

  • Cardinality is a measure of how the entities are related to each other.
  • There are 4 types of cardinality ratio:
    1. One-to-One
    2. One-to-Many
    3. Many-to-One
    4. Many-to-Many

Extension and Intension

| Extension | Intension |
|---|---|
| The state of the database (the actual data stored in it) at a given time. | The internal structure and architecture of the database (its schema), defined when it is created. |

Keys

  • Keys play an important role in a database.
  • There are various kinds of keys:

Super key

  • Any set of attributes that uniquely identifies a row in a table (relation) is a super key.
  • The set of super keys is like a superset of all other keys.

Candidate key

  • A candidate key is a minimal super key: if you remove any attribute from it, it no longer identifies rows uniquely.
  • All candidate keys are super keys.

Primary key

  • The candidate key chosen to identify the rows of a table.
  • It is unique for every row, so it can be used to identify all other attributes in that row.
  • All primary keys are candidate keys.
  • Its value cannot be null.

Unique key

  • A candidate key that was not chosen as the primary key.
  • We can modify the values in these columns.
  • The value can be null.

Foreign key

  • An attribute in one table that references a key of another (parent) table.
  • It can only take values that exist in the parent table.
  • A foreign key usually refers to the primary key of the other table, though it may also refer to another unique key.
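
To see these keys in action, here is a small sketch using Python's built-in sqlite3 module. The tables dept and emp, their columns, and the sample names are made up for illustration, and the PRAGMA line is specific to SQLite (other databases enforce foreign keys without it).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when this is on

# Hypothetical tables, used only to illustrate the key types.
conn.execute("""
    CREATE TABLE dept (
        id   INTEGER PRIMARY KEY,   -- primary key: identifies each row
        name TEXT UNIQUE            -- unique key: no duplicates, but NULL is allowed
    )
""")
conn.execute("""
    CREATE TABLE emp (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER REFERENCES dept(id)   -- foreign key into the parent table
    )
""")

conn.execute("INSERT INTO dept (id, name) VALUES (1, 'Sales')")
conn.execute("INSERT INTO emp (id, name, dept_id) VALUES (10, 'Asha', 1)")  # parent row exists


def try_insert(sql):
    """Return True if the statement succeeds, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# dept 99 does not exist, so the foreign key rejects this row.
fk_ok = try_insert("INSERT INTO emp (id, name, dept_id) VALUES (11, 'Ravi', 99)")
# id 1 is already used, so the primary key rejects this row.
dup_pk_ok = try_insert("INSERT INTO dept (id, name) VALUES (1, 'HR')")
# 'Sales' is already used, so the unique key rejects this row.
dup_unique_ok = try_insert("INSERT INTO dept (id, name) VALUES (2, 'Sales')")

print(fk_ok, dup_pk_ok, dup_unique_ok)  # False False False
```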

Locks

| Shared Lock | Exclusive Lock |
|---|---|
| We can have many readers reading the same data. | We have only one writer writing the data. |
| Other transactions or clients are allowed to read the data at the same time. | No other reader or writer gets access to data that a transaction is writing at that moment. |

Prime and Non-Prime attributes

| Prime attributes | Non-prime attributes |
|---|---|
| The attributes which are part of a candidate key. | The attributes which are not part of any candidate key. |
| Say the candidate key is AB, with AB-->CD. | Say the candidate key is AB, with AB-->CD. |
| A and B are prime attributes. | C and D are non-prime attributes. |

Partial dependency

  • For a candidate key AB with AB-->CD, if there exists a dependency B-->C, B-->D, A-->C or A-->D, then this is called a partial dependency.
  • Basically, every non-prime attribute should depend on the whole candidate key, not on a part of it.

Transitive dependency

  • Say the candidate key is A, with A-->BC, and we also have the dependency B-->C.
  • A is a prime attribute.
  • B and C are non-prime attributes.
  • B, a non-prime attribute, identifies C, which is also a non-prime attribute.
  • This kind of situation is a transitive dependency.

Joins

Natural Join

  • Joins two tables on the attributes that have the same name in both tables.

Equi Join

  • It is similar to a natural join, but here you can join the tables on any attributes, compared for equality.
  • The joining attributes need not have the same name.

Self Join

  • Here we join the table with itself.

Left outer Join

  • After joining two tables we get the matching rows from both tables, and along with that we get all the remaining rows from the left table (with nulls in the right table's columns).
  • Syntax: SELECT * FROM emp LEFT OUTER JOIN dept ON emp.id = dept.id

Right outer Join

  • After joining two tables we get the matching rows from both tables, and along with that we get all the remaining rows from the right table (with nulls in the left table's columns).
  • Syntax: SELECT * FROM emp RIGHT OUTER JOIN dept ON emp.id = dept.id

Full outer Join

  • After joining two tables we get the matching rows from both tables, and along with that we get all the remaining rows from BOTH tables (with nulls where there is no match).
  • Syntax: SELECT * FROM emp FULL OUTER JOIN dept ON emp.id = dept.id
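
Here is a rough sketch of how the join results differ, again with Python's sqlite3 module. The rows and the columns emp_name and dept_name are invented for the example. Older SQLite versions do not support RIGHT and FULL OUTER JOIN, so the sketch gets the right-join result by swapping the two tables in a LEFT OUTER JOIN, which keeps the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data: id 1 exists in both tables,
    -- id 2 only in emp, id 3 only in dept.
    CREATE TABLE emp  (id INTEGER PRIMARY KEY, emp_name TEXT);
    CREATE TABLE dept (id INTEGER PRIMARY KEY, dept_name TEXT);
    INSERT INTO emp  VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO dept VALUES (1, 'Sales'), (3, 'HR');
""")

# Inner join: only the rows that match in both tables.
inner = conn.execute("""
    SELECT emp.id, emp_name, dept_name
    FROM emp JOIN dept ON emp.id = dept.id
    ORDER BY emp.id
""").fetchall()

# Left outer join: matching rows plus every remaining row of emp.
left = conn.execute("""
    SELECT emp.id, emp_name, dept_name
    FROM emp LEFT OUTER JOIN dept ON emp.id = dept.id
    ORDER BY emp.id
""").fetchall()

# Same rows as "emp RIGHT OUTER JOIN dept": the tables are swapped so that
# dept is the one whose rows are all kept.
right = conn.execute("""
    SELECT dept.id, emp_name, dept_name
    FROM dept LEFT OUTER JOIN emp ON emp.id = dept.id
    ORDER BY dept.id
""").fetchall()

print(inner)  # [(1, 'Asha', 'Sales')]
print(left)   # [(1, 'Asha', 'Sales'), (2, 'Ravi', None)]
print(right)  # [(1, 'Asha', 'Sales'), (3, None, 'HR')]
```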

Functional dependency

  • Given a dependency X-->Y, Y is said to be functionally dependent on X.
  • It holds if every value of X identifies exactly one value of Y.

| Trivial functional dependency | Non-trivial functional dependency |
|---|---|
| X-->Y | X-->Y |
| Y ⊆ X | Y ⊈ X |
| Ex: AB-->A | Ex: AB-->C |
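
One way to get a feel for this is to check a dependency against actual rows. The helper fd_holds and the sample rows below are made up for illustration. Note that data can only disprove a dependency; whether it truly holds is a design decision.

```python
def fd_holds(rows, lhs, rhs):
    """Check whether the functional dependency lhs --> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are lists of column names.
    """
    seen = {}  # maps each X value to the Y value it was first seen with
    for row in rows:
        x = tuple(row[col] for col in lhs)
        y = tuple(row[col] for col in rhs)
        # setdefault stores y the first time x appears, then returns the stored value
        if seen.setdefault(x, y) != y:
            return False  # the same X value points at two different Y values
    return True


# Hypothetical sample rows.
rows = [
    {"A": 1, "B": "x", "C": 10},
    {"A": 1, "B": "y", "C": 10},
    {"A": 2, "B": "x", "C": 20},
]

print(fd_holds(rows, ["A"], ["C"]))       # True: each A value has one C value
print(fd_holds(rows, ["A"], ["B"]))       # False: A=1 appears with both x and y
print(fd_holds(rows, ["A", "B"], ["A"]))  # True: trivial, because A is part of AB
```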

Closure set

  • It is the set of all attributes which can be identified by a given attribute or set of attributes.
  • For example, given A-->B and B-->C, (A)+ = ABC.
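
The closure can be computed by repeatedly applying the dependencies until nothing new is added. Below is a minimal sketch; it assumes single-letter attribute names so that a string like "AB" can stand for the set {A, B}, and the dependencies A-->B and B-->C are just an example.

```python
def closure(attrs, fds):
    """Return the closure of attrs under the functional dependencies fds.

    attrs is a string of single-letter attributes, e.g. "AB".
    fds is a list of (lhs, rhs) pairs, e.g. ("A", "B") for A-->B.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know every attribute on the left side,
            # we can add everything on the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


fds = [("A", "B"), ("B", "C")]  # example dependencies: A-->B, B-->C

print(sorted(closure("A", fds)))  # ['A', 'B', 'C']
print(sorted(closure("B", fds)))  # ['B', 'C']
print(sorted(closure("C", fds)))  # ['C']
```

If the closure of a set of attributes contains every attribute of the table, that set is a super key.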


Normalization

  • Normalization is breaking down a bigger table into smaller tables based on some rules.
  • It is needed to get rid of the problems related to INSERT, UPDATE and DELETE anomalies.
  • Or in simple terms, you can say that you want to get rid of redundancy.
  • Various normalization forms:
    1. 1NF
    2. 2NF
    3. 3NF
    4. BCNF
    5. 4NF

1NF

  • Each cell in a table should hold only one atomic value.
  • Or you can say that no cell should hold a multivalued attribute.
  • How to convert a table into 1NF if it is not already in 1NF:
    • Make a separate row for each value of a multivalued cell.
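
The conversion step can be sketched in a few lines of Python. The function to_1nf and the students data are hypothetical; a multivalued cell is represented here as a Python list.

```python
def to_1nf(rows, column):
    """Split every multivalued cell in `column` into one row per value."""
    flat = []
    for row in rows:
        values = row[column]
        if not isinstance(values, list):
            values = [values]  # already atomic, keep as a single value
        for value in values:
            new_row = dict(row)      # copy the other columns unchanged
            new_row[column] = value  # one atomic value per row
            flat.append(new_row)
    return flat


# Hypothetical table where the phone cell holds several values.
students = [
    {"id": 1, "name": "Asha", "phone": ["111", "222"]},
    {"id": 2, "name": "Ravi", "phone": "333"},
]

for row in to_1nf(students, "phone"):
    print(row)
# {'id': 1, 'name': 'Asha', 'phone': '111'}
# {'id': 1, 'name': 'Asha', 'phone': '222'}
# {'id': 2, 'name': 'Ravi', 'phone': '333'}
```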

2NF

  • We need to know the concept of prime and non-prime attributes to understand 2NF.
  • Consider the candidate key AB with the dependencies AB-->CD, B-->C and AB-->D.
  • Here C and D are non-prime attributes, and A and B are prime attributes.
  • C does not depend on the whole candidate key.
  • C can be derived from B alone, which is only a part of the candidate key.
  • This is known as a partial dependency.
  • Partial dependencies are not allowed in 2NF.
  • In simple terms, non-prime attributes should depend on the entire candidate key and not on a subset of it.
  • How to bring a table into 2NF:
    • Make one table with the candidate key and all the attributes which depend on the whole candidate key.
    • Make a separate table for each partial dependency, holding the part of the key and the attributes which depend on it.
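
The partial dependency in the example above can also be found mechanically. This is only a sketch: it assumes single-letter attribute names and a table with exactly one candidate key, so every attribute outside that key is non-prime. The function names are my own.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of a set of single-letter attributes under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(candidate_key, fds):
    """List (part_of_key, non_prime_attributes) pairs that break 2NF.

    Assumes candidate_key is the only candidate key of the table.
    """
    key = set(candidate_key)
    found = []
    # Try every proper, non-empty subset of the candidate key.
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            determined = closure(subset, fds) - key  # non-prime attributes it identifies
            if determined:
                found.append(("".join(subset), "".join(sorted(determined))))
    return found


# The example from the text: candidate key AB with AB-->CD, B-->C, AB-->D.
fds = [("AB", "CD"), ("B", "C"), ("AB", "D")]
print(partial_dependencies("AB", fds))  # [('B', 'C')]
```

Following the rule above, C would move to its own table together with B, and D would stay with the full key AB.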

3NF

  • We need to avoid transitive dependencies.
  • The table is in 3NF if every non-trivial functional dependency X-->Y satisfies at least one of these two rules:
    1. X should be a super key.
    2. Y should be a prime attribute.
  • How to bring a table into 3NF:
    • For each dependency that violates the rules, make a separate table for its left-hand side along with all the attributes which it identifies.

BCNF

  • There is only one condition:
  • For every non-trivial functional dependency X-->Y, the LHS (X) should be a super key.
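
The 3NF and BCNF rules can be checked dependency by dependency. The sketch below assumes single-letter attribute names and takes the candidate keys as input instead of computing them; the function names and the two example tables are made up for illustration.

```python
def closure(attrs, fds):
    """Closure of a set of single-letter attributes under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def violations(all_attrs, fds, candidate_keys):
    """Return the dependencies that break 3NF and the ones that break BCNF."""
    prime = set().union(*(set(key) for key in candidate_keys))
    broken = {"3NF": [], "BCNF": []}
    for lhs, rhs in fds:
        new_attrs = set(rhs) - set(lhs)
        if not new_attrs:
            continue  # trivial dependency, always allowed
        if closure(lhs, fds) >= set(all_attrs):
            continue  # LHS is a super key: fine for both 3NF and BCNF
        broken["BCNF"].append((lhs, rhs))
        if not new_attrs <= prime:
            broken["3NF"].append((lhs, rhs))  # RHS has a non-prime attribute
    return broken


# Example 1: candidate key A, with A-->B and B-->C (a transitive dependency).
print(violations("ABC", [("A", "B"), ("B", "C")], ["A"]))
# {'3NF': [('B', 'C')], 'BCNF': [('B', 'C')]}

# Example 2: candidate keys AB and AC, with AB-->C and C-->B.
print(violations("ABC", [("AB", "C"), ("C", "B")], ["AB", "AC"]))
# {'3NF': [], 'BCNF': [('C', 'B')]}
```

The second example is in 3NF but not in BCNF: C is not a super key, but B is a prime attribute.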

ACID

  • A-->Atomicity
    • A transaction should take place completely or not at all.
    • The transaction management component is responsible for atomicity.
  • C-->Consistency
    • A transaction should take the database from one consistent state to another consistent state.
    • No dedicated component takes care of consistency.
  • I-->Isolation
    • Each transaction should execute independently, without interfering with the others.
    • The concurrency control component is responsible for isolation.
  • D-->Durability
    • Once a transaction commits, its changes should be stored permanently.
    • The recovery management component is responsible for durability.

If any of the above properties is violated by a transaction, the database automatically performs a rollback operation.

A rollback takes the database back to its previous consistent state.
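
Here is a small sketch of atomicity and rollback using Python's sqlite3 module. The account table, the names and the amounts are invented. The CHECK constraint stands in for a consistency rule, and the `with conn:` block commits if everything succeeds and rolls back if an exception is raised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: a balance is never allowed to go below zero.
conn.execute("""
    CREATE TABLE account (
        name    TEXT PRIMARY KEY,
        balance INTEGER CHECK (balance >= 0)
    )
""")
conn.execute("INSERT INTO account VALUES ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(src, dst, amount):
    """Move money between two accounts as one transaction."""
    try:
        with conn:  # commit on success, rollback on any exception
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint failed, nothing was changed


ok = transfer("alice", "bob", 500)  # alice only has 100
balances = dict(conn.execute("SELECT name, balance FROM account"))

print(ok)        # False
print(balances)  # {'alice': 100, 'bob': 50}
```

The first UPDATE (crediting bob) succeeds on its own, but because the second one fails, the rollback undoes it as well.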


Transaction

It is a set of instructions that are executed on the database as a single unit.

There are a few concepts you need to understand for transactions and schedules:


Dirty read problem

| T1 | T2 |
|---|---|
| R(A) | |
| W(A) | |
| | R(A) |
| | Commit |
| Commit | |

  • A transaction modifies some data; here in this case it is A.
  • If another transaction reads that data before the first one commits, it is reading uncommitted data that may still be rolled back.
  • This is the dirty read problem.

Unrepeatable read problem

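| T1 | T2 |
|---|---|
| R(A) | |
| | R(A) |
| | W(A) |
| | W(B) |
| R(A) | |
| Commit | |
| | Commit |

  • The same data read at two different times during one transaction is not the same, because another transaction changed it in between.
  • This is the unrepeatable read problem.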

Phantom read problem

| T1 | T2 |
|---|---|
| R(A) | |
| | R(A) |
| | DELETE(A) |
| R(B) | |
| Commit | |
| | Commit |

  • Here a transaction is trying to read data which another transaction has deleted in the meantime.

Lost update problem

| T1 | T2 |
|---|---|
| R(A) | |
| | R(A) |
| W(A) | |
| | W(A) |
| Commit | |
| | Commit |

  • Here the data written by transaction T1 is overwritten by transaction T2.
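
The lost update schedule above can be replayed with a plain dictionary standing in for the database. This is only a toy simulation: the starting value 100 and the amounts 10 and 20 are made up, and no real transactions are involved.

```python
def interleaved():
    """Replay the schedule from the table: both reads happen before both writes."""
    db = {"A": 100}
    t1_value = db["A"]       # T1: R(A)
    t2_value = db["A"]       # T2: R(A)
    db["A"] = t1_value + 10  # T1: W(A), adds 10
    db["A"] = t2_value + 20  # T2: W(A), adds 20 to the stale value it read earlier
    return db["A"]


def serial():
    """Run T1 completely, then T2."""
    db = {"A": 100}
    db["A"] = db["A"] + 10   # T1: R(A), W(A)
    db["A"] = db["A"] + 20   # T2: R(A), W(A)
    return db["A"]


print(interleaved())  # 120, the +10 from T1 is lost
print(serial())       # 130
```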

Schedule

  • It is an ordering of the operations of a collection of transactions.
  • If there are n transactions in a given schedule, there are n! ways to arrange them one after the other, that is, n! serial schedules.

| Serial Schedule | Non-Serial Schedule |
|---|---|
| The transactions in the schedule take place one after the other, serially. | The operations of the transactions in the schedule are interleaved with one another. |

  • Serial schedules are always consistent.

Serializability

  • It is a way to check whether a non-serial schedule produces the same result as some serial schedule.
  • There are two methods to check if a schedule is serializable or not:

| Conflict Serializable | View Serializable |
|---|---|
| We try to swap the order of non-conflicting instructions and check whether we can reach a serial schedule. | It is a superset of conflict serializable. |
| Two instructions from different transactions conflict when they work on the same data and at least one of them is a write; conflicting instructions cannot be swapped. | If a schedule is conflict serializable then it is view serializable. |
| The conflicting pairs are Read-Write, Write-Read and Write-Write. | If a schedule is not conflict serializable then we have to perform some further checks. |
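
A common way to test conflict serializability is a precedence graph: draw an edge Ti-->Tj whenever an operation of Ti conflicts with a later operation of Tj, and check the graph for cycles. The sketch below uses that technique rather than literally swapping instructions. The schedule format (transaction, operation, item) is my own, and commits are left out.

```python
def conflict_serializable(schedule):
    """schedule is a list of (transaction, operation, item), operation 'R' or 'W'."""
    # Build the precedence graph.
    edges = {}
    for i, (t1, op1, item1) in enumerate(schedule):
        edges.setdefault(t1, set())
        for t2, op2, item2 in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if t1 != t2 and item1 == item2 and "W" in (op1, op2):
                edges[t1].add(t2)

    # Depth-first search for a cycle.
    WHITE, GREY, BLACK = 0, 1, 2
    color = {t: WHITE for t in edges}

    def has_cycle(node):
        color[node] = GREY
        for nxt in edges[node]:
            if color[nxt] == GREY:
                return True  # back edge: we found a cycle
            if color[nxt] == WHITE and has_cycle(nxt):
                return True
        color[node] = BLACK
        return False

    return not any(color[t] == WHITE and has_cycle(t) for t in edges)


# The lost update schedule: R1(A) R2(A) W1(A) W2(A)
lost_update = [("T1", "R", "A"), ("T2", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A")]
# A serial schedule: T1 completely, then T2.
serial = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]

print(conflict_serializable(lost_update))  # False (T1-->T2 and T2-->T1 form a cycle)
print(conflict_serializable(serial))       # True
```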

Recoverable schedule

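When there is a failure in the system while some transaction is reading or writing data, we need to be able to recover the data or bring the database back to its previous state. A schedule is recoverable if a transaction that reads data written by another transaction commits only after that other transaction has committed.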


Strict schedule

If a transaction has written a data item, then no other transaction is allowed to read or write that data item until the first transaction commits or aborts.


Concurrency control protocols

| Timestamp protocol | Lock based |
|---|---|
| Assign a timestamp to every transaction, so that when a conflict occurs we can resolve it based on the timestamps. | We place locks on the data. |

Views

  • It is a virtual table that contains data from one or more tables.
  • Created using CREATE VIEW view_name AS ....
  • Whenever the underlying tables change, the view reflects the change.
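
A quick sketch with Python's sqlite3 module shows a view following its underlying table. The emp table, the view name well_paid and the salary threshold are all made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table and view.
    CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
    INSERT INTO emp VALUES (1, 'Asha', 900), (2, 'Ravi', 400);

    CREATE VIEW well_paid AS
        SELECT id, name FROM emp WHERE salary >= 500;
""")

before = conn.execute("SELECT name FROM well_paid ORDER BY id").fetchall()

# Change the underlying table; the view is not touched directly.
conn.execute("UPDATE emp SET salary = 800 WHERE id = 2")

after = conn.execute("SELECT name FROM well_paid ORDER BY id").fetchall()

print(before)  # [('Asha',)]
print(after)   # [('Asha',), ('Ravi',)]
```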

TRIGGER

  • It is a special type of procedure which gets invoked automatically in response to an event.
  • Procedures are basically stored SQL queries which can be executed at any time.
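
Below is a minimal trigger sketch using Python's sqlite3 module. The tables emp and salary_log and the trigger name are invented for illustration. SQLite supports triggers but not stored procedures, so only the trigger part is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables: salary_log records every salary change.
    CREATE TABLE emp (id INTEGER PRIMARY KEY, salary INTEGER);
    CREATE TABLE salary_log (emp_id INTEGER, old_salary INTEGER, new_salary INTEGER);

    -- The event is an UPDATE of emp.salary; the trigger body runs automatically.
    CREATE TRIGGER log_salary_change
    AFTER UPDATE OF salary ON emp
    FOR EACH ROW
    BEGIN
        INSERT INTO salary_log VALUES (OLD.id, OLD.salary, NEW.salary);
    END;

    INSERT INTO emp VALUES (1, 400);
""")

# We only update emp; nobody inserts into salary_log by hand.
conn.execute("UPDATE emp SET salary = 500 WHERE id = 1")

log = conn.execute("SELECT * FROM salary_log").fetchall()
print(log)  # [(1, 400, 500)]
```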

Index in SQL

  • It is an on-disk structure associated with a table or a view.
  • It is used to retrieve the data faster, because it tells us where the required data is located.
  • It increases retrieval speed by reducing the number of pages that need to be searched to get the required data.
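
To see an index being picked up, we can ask SQLite for its query plan before and after creating one. This is a sketch with Python's sqlite3 module; the emp table, the generated rows and the index name idx_emp_city are made up, and the exact wording of the plan text differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
# Hypothetical data: 1000 employees spread over 50 cities.
conn.executemany(
    "INSERT INTO emp (name, city) VALUES (?, ?)",
    [(f"emp{i}", f"city{i % 50}") for i in range(1000)],
)

query = "SELECT name FROM emp WHERE city = 'city7'"


def plan(sql):
    """Return SQLite's query plan as one line of text (last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)


plan_before = plan(query)  # without an index, typically a full scan of emp
conn.execute("CREATE INDEX idx_emp_city ON emp (city)")
plan_after = plan(query)   # with the index, typically a search using idx_emp_city

print(plan_before)
print(plan_after)
print(len(conn.execute(query).fetchall()))  # 20 rows either way
```

The query returns the same rows in both cases; only the way the database finds them changes.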

SQL

  • Structured Query Language.
  • Used to talk with the database.
  • There are multiple types of commands in SQL, mainly:
    1. DDL (Data Definition Language)
    2. DML (Data Manipulation Language)
    3. DCL (Data Control Language)
    4. TCL (Transaction Control Language)

| DDL | DML | DCL | TCL |
|---|---|---|---|
| Commands which are used to define the data and its structures, like TABLES, DATABASE, VIEWS, TRIGGERS | Commands which are used to manipulate the existing data, like UPDATE, INSERT | Commands which are used to control access to the data, like GRANT | Commands which are used to control transactions in the database, like COMMIT |
| CREATE | INSERT | GRANT | COMMIT |
| DROP | UPDATE | REVOKE | -- |
| ALTER | DELETE | -- | -- |


Good Day 🐼