< Previous Page | Home Page | Next Page >

Using SQL in Python¶

After we have explored the most important pieces of SQL Syntax in the previous chapters, we'll now have a look at how we can use pyhton to work with SQL (specifically SQLite) databases.

Luckily, this is rather easy once we have understood the underlying SQL syntax.

In a first step, we need to install two libraries: sqlite3 and pandas. The first one helps us with connecting to and manipluating databases whereas pandas makes reading from a database very easy.

import sqlite3
import pandas as pd # it is convention to install pandas as pd

Loading a SQL Database in Python¶

Once we have imported the libraries above, we can connect our Python kernel to an existing database. This is done with the following command:

DB = sqlite3.connect('<our database>')

The sqlite3 module provides a straightforward interface for interacting with SQLite databases. A connection object is created using sqlite3.connect(). Once our session has ended, we must close the connection as follows:

DB.close()

While it is possible to directly read from the database using the sqlite3 library, using pandas is way easier.

Fetching Information via pd.read_sql¶

As stated above, it is preferable to use the pandas library, when trying to read data from an existing database. This is because pandas directly transforms the received information in a so called DataFrame (which will be discussed in the Data Science chapters). Habdlibg a DataFrame is much easier than reading the tuples or lists that are returned by the sqlite3 library.

To read a database we will use the pd.read_sql() function, which consists of the following syntax.

pandas.read_sql(sql, DB, index_col=None)

where

sql contains a string of the applicable SQL syntax (by convention we do triple strings ("""))
DB specifies our Database object from above
index_col specifies the column name (as a string), which will be used as the index for the returned DataFrame object. If index_col is not passed, index values starting from 0 appear to the left of the DataFrame’s rows.

As stated before, this function returns a DataFrame object.

Let's consider our STUDENT example from before.

ROLL_NO	NAME	AGE	BRANCH	CITY
10001	Kate	22	CSE	Mumbai
10002	Richard	21	ECE	Delhi
10003	Rick	33	ME	Chennai
10004	Peter	26	CSE	Delhi
10005	Peter	22	CSE	Noida
10006	Mark	22	ECE	Jaipur
10007	Brad	23	ME	Rampur

Let's say we wanted to

Open the database in python
Fetch the first four columns (we do not care about the city in this case) of students that are in the CSE branch, ordered according to their age.
Return the fetched information as a pandas DataFrame and print it

This would look as follows:

import sqlite3
import pandas as pd

DB = sqlite3.connect(STUDENT.db)

DataFrame = pd.read_sql("""
                          SELECT ROLL_NO, NAME, AGE, BRANCH
                          FROM 
                            STUDENT
                          WHERE
                            BRANCH = 'CSE'
                          ORDER BY
                            AGE ASC;
                            """, 
                            DB,
                            index_col = ['ROLL_NO'])



print(DataFrame)

DB.close()

Writing to a database via sqlite3¶

To manipulate an existing Database, pandas is, unfortunately, of little use. In such a case, we have to do the following:

First we connect the database to Python as explained above.

DB = sqlite3.connect('<our database>')

Once this is done, we open a so-called cursor object. A cursor can be understood as an iterable (a tuple or a list) that shows theresults from Database queries.

cursor = DB.cursor()

Using the .execute() method of the newly created cursor object, we are then able to directly pass SQL syntax to our Database.

cursor = cursor.execute("""UPDATE STUDENT SET Name='Brad'
                            WHERE NAME='Kate'""")

# we rename Kate to Brad

So far so good. If we want to print out the updated DataBase, we must consider that the cursor object, is an iterable. A print(cursor)would therefore merely result in an ugly string. This can be solved by printing out the individual rows of our Database.

for row in cursor:
  print(row)

Lastly, we need to close our Database - and thus save the changes to it.

DB.close()

The full documentation of sqlite3`s cursor object, can be found here (not relevant for the exam)

< Previous Page | Home Page | Next Page >