Title: Select data from a MariaDB database in Python
Earlier posts showed you to create a MariaDB database, add tables to it, and insert data into those tables. This post explains how to select data from the tables. There are any ways to select data and this post doesn't cover them all, but it does cover the most common type of SELECT statement. It also shows one way you can format results to display data in a table-like format.
Basic Queries
You use an SQL SELECT statement to select data from relational databases like MariaDB. Here's a basic format for a SELECT statement.
SELECT <fields> FROM <table> WHERE <conditions>
Here's an example.
SELECT * FROM Authors
The * tells the database to select all fields. This query doesn't contain a WHERE clause so it returns every record in the Authors table.
Here's a more complicated example that joins records from two tables.
SELECT * FROM Authors, Books
WHERE Authors.AuthorId = Books.BooksId
This query returns all fields from the Authors and Books tables. It combines the values from the tables where the AuthorId fields match.
Because this query returns all of the fields in both tables, and because both tables include an AuthorId field, that value will be included in the results twice. For example, you can see the value 1 repeated in the following row of the results.
(1, 'Rod', 'Stephens', 1, 'Beginning Database Design Solutions, Second Edition')
SQL is pretty good at understanding which table a field comes from if its name is distinct, but if a name appears in multiple tables, you need to qualify it with the table name to avoid confusing the database. In this example, both the Authors and Books tables contains an AuthorId field, so the WHERE clause must include the table names so the database can figure out which column to match up. (You could avoid that by giving the fields different names, like AuthorId in one table and AuthorNumber in the other, but that way lies madness! It's less much confusing if fields that hold the same values have the same names.)
Basic Fetches
The following select_records function executes a simple query and displays the results.
def select_records(cur):
'''Select the records and display the results.'''
query = '''
SELECT FirstName, LastName, Title
FROM Authors, Books
WHERE Authors.AuthorId = Books.AuthorId
ORDER BY .Authors.AuthorId
'''
cur.execute(query)
for first_name, last_name, title in cur:
print(f'{first_name} {last_name} wrote "{title}"')
print()
This code composes a SELECT and then uses the cursor's execute method to perform the query. After that, the cursor represents the query's results. Each row of the results is a tuple holding the selected data.
The function loops through the cursor's results, splits the tuple into variables, and uses an f-string to display the values. Here's the output.
Rod Stephens wrote "Beginning Database Design Solutions, Second Edition"
Rod Stephens wrote "Build Your Own Ray Tracer With Python"
Rod Stephens wrote "Beginning Software Engineering, Second Edition"
Rod Stephens wrote "Interview Puzzles Dissected"
Terry Pratchett wrote "Going Postal"
Terry Pratchett wrote "Making Money"
Christopher Moore wrote "Lamb: The Gospel According to Biff, Christ's Childhood Pal"
Christopher Moore wrote "Fool"
Christopher Moore wrote "The Serpent of Venice"
Christopher Moore wrote "Shakespeare for Squirrels"
Tabular Formatting
So far so good, but what if you want to display the output in a tabular form. To make this look table-like, we need to know the names of the columns and the lengths of the values in each field. Learning those requires a few new techniques.
Rather than explaining the tools you need and then presenting the code, I'll show you the code and explain the tools as we reach them.
def select_and_format_records(cur, query):
'''Execute the query and and display the results in columns.'''
cur.execute(query)
# Get the selected column names.
col_names = [description[0] for description in cur.description]
num_cols = len(col_names)
# Fetch the results into a list.
all_rows = cur.fetchall()
# Loop through the results to get column widths.
col_widths = []
for col in range(num_cols):
width = 0
for row in all_rows:
if len(row[col]) > width:
width = len(row[col])
col_widths.append(width)
# Print the column headers.
for col in range(num_cols):
print(f'{col_names[col]:<{col_widths[col]}}', end=' ')
print()
# Print a row of dashes.
for col in range(num_cols):
print(f'{"-" * col_widths[col]} ', end='')
print()
# Print the records.
for row in all_rows:
for col in range(num_cols):
print(f'{row[col]:<{col_widths[col]}}', end=' ')
print()
print()
This function takes a cursor and query as parameters. It uses the cursor to execute the query must as the previous function did.
The cursor's description property is a tuple of tuples describing the selected columns. Note that these are the columns created by the query and not necessarily columns in the tables. We'll talk about that more in a bit.
The code uses a list comprehension to pull each field's first description value (which is the field's name) from the descriptions and saves them in the col_names list.
The next issue to address is how we figure out how long the data values are. It would be nice if we could loop through the cursor's rows to see how long the values are and then rewind the cursor to process the data again. Unfortunately cursors are forward-only so once you advance to a new row, there's no going back.
There are two common solutions. First, if the data isn't too huge, you can do what this example does and use the cursor's fetchall method to fetch all of the data and save it in a list. Then you can move through the list as necessary.
The second solution is useful if there's too much data to conveniently hold it in a list. In that case, you can execute the query and loop through it to get the data values' lengths. Then you run the query again to process the data.
Having built the all_rows list, the code loops through it and finds the length of the longest value in each column.
Now the code is ready to start printing. It loops through the column names and prints them on a single line. It uses an f-string to pad each column's name to its column width.
After it displays the column names, the code does roughly the same thing to display a row of dashes below the field names.
The code repeats those steps one more time to display the data values.
The program calls select_and_format_records to execute this query.
SELECT CONCAT(FirstName, " ", LastName) AS Author, Title AS "Book Title"
FROM Authors, Books
WHERE Authors.AuthorId = Books.AuthorId
Notice that this query concatenates the author's first and last names with a space in between and gives the result the alias "Author." It gives the Title field the alias "Book Title." That alias includes a space, so it must be enclosed in quotes.
Here's the result produced by select_and_format_records.
Author Book Title
----------------- ----------------------------------------------------------
Rod Stephens Beginning Database Design Solutions, Second Edition
Rod Stephens Build Your Own Ray Tracer With Python
Rod Stephens Beginning Software Engineering, Second Edition
Rod Stephens Interview Puzzles Dissected
Terry Pratchett Going Postal
Terry Pratchett Making Money
Christopher Moore Lamb: The Gospel According to Biff, Christ's Childhood Pal
Christopher Moore Fool
Christopher Moore The Serpent of Venice
Christopher Moore Shakespeare for Squirrels
Sorted Results
In the previous results, the rows are arranged sort of randomly. In this example, they're arranged in the order in which they were created, but in general the database doesn't guarantee any particular order.
To arrange the results consistently, you can use an ORDER BY clause.
SELECT CONCAT(FirstName, " ", LastName) AS Author, Title AS "Book Title"
FROM Authors, Books
WHERE Authors.AuthorId = Books.AuthorId
ORDER BY Author, `Book Title`
The ORDER BY clause orders the results first by Author and then, if there's a tie, by Title. Here's the sorted result.
Author Book Title
----------------- ----------------------------------------------------------
Christopher Moore Fool
Christopher Moore Lamb: The Gospel According to Biff, Christ's Childhood Pal
Christopher Moore Shakespeare for Squirrels
Christopher Moore The Serpent of Venice
Rod Stephens Beginning Database Design Solutions, Second Edition
Rod Stephens Beginning Software Engineering, Second Edition
Rod Stephens Build Your Own Ray Tracer With Python
Rod Stephens Interview Puzzles Dissected
Terry Pratchett Going Postal
Terry Pratchett Making Money
Now the results are ordered properly, but there's one weird thing I want to mention. The ORDER BY clause uses the alias Author with no problem, but the alias "Book Title" contains a space. To avoid confusing the database, you need to enclose that alias in back quotes aka back ticks. Normal single or double quotes make the database think you're trying to sort on a string literal rather than the alias. (That's pretty silly but, as notable philosopher Bruce Hornsby said, "That's just the way it is.")
Most developers avoid using spaces inside aliases (and other names like table and field names) to avoid weird programming effects and general confusion.
Conclusion
This post explains how to use Python to select data from a MariaDB database. The select_and_format_records function selects data and uses the selected field names, together with a generous dollop of Python, to format the results in a table.
Download the example to experiment with it and to see additional details.
|