Title: Use GROUP BY to aggregate related records from a MariaDB database in Python
Consider the following table definition.
CREATE OR REPLACE TABLE Sales (
Salesperson VARCHAR(50) NOT NULL,
Date DATE NOT NULL,
Amount DECIMAL(10,2) NOT NULL
This table holds sales data over time. From my previous posts, you could easily select that data, but what if you want to aggregate it to show each salesperson's total sales as in the following?
Name Date Amount
------ ---------- ----------
Alice 2026-04-01 30.00
2026-04-01 65.00
2026-04-01 100.00
Total 195.00
Bobby 2026-04-01 20.00
2026-04-01 70.00
2026-04-01 87.00
Total 177.00
Cindy 2026-04-01 47.00
2026-04-01 90.00
2026-04-01 130.00
Total 267.00
This example shows several options for displaying this kind of aggregated data.
Basic Selection
The easiest way to display this data is to just select it and display all of the table's rows. The following select_all function uses this approach.
def select_all(cur):
'''Select the records and display the results by group.'''
query = '''
SELECT *
FROM Sales
ORDER BY Salesperson, Date, Amount
'''
cur.execute(query)
print(f'{"Name":<6} {"Date":<10} {"Amount":>10}')
print('------ ---------- ----------')
for name, date, amount in cur:
print(f'{name:<6} {date:%Y-%m-%d} {amount:>10}')
print()
This is pretty straightforward and produces the following less-than-perfect result.
Name Date Amount
------ ---------- ----------
Alice 2026-04-01 30.00
Alice 2026-04-01 65.00
Alice 2026-04-01 100.00
Bobby 2026-04-01 20.00
Bobby 2026-04-01 70.00
Bobby 2026-04-01 87.00
Cindy 2026-04-01 47.00
Cindy 2026-04-01 90.00
Cindy 2026-04-01 130.00
At least each salesperson's results are grouped together, but we can do better.
GROUP BY
The SQL GROUP BY statement lets you group related records and execute an aggregating function on them. Aggregating functions include
COUNT, MAX, MIN, SUM, and AVG.
The following display_totals function demonstrates this approach.
def display_totals(cur):
'''Use GROUP BY to display only the totals.'''
query = '''
SELECT Salesperson, SUM(Amount)
FROM Sales
GROUP BY Salesperson
ORDER BY Salesperson
'''
cur.execute(query)
print(f'{"Name":<6} {"Total":>10}')
print('------ ----------')
for name, amount in cur:
print(f'{name:<6} {amount:>10}')
print()
The GROUP BY statement groups records that share the same Salesperson value and applies the SUM function to those records' Amount fields. That returns the sum of the Amount values, which is the total amount sold by that person.
Here's the result.
Name Total
------ ----------
Alice 195.00
Bobby 177.00
Cindy 267.00
That's great and fairly easy if all you want is the summary. You need to do more work if you want to see each salesperson's sales records plus a total.
Single Query
One way to get each salesperson's records plus a total is to select all of the records ordering them by Salesperson. You then loop through the records and keep track of each person's records and total. The following select_all_by_group function takes this approach.
def select_all_by_group(cur):
'''Select the records and display the results by group.'''
query = '''
SELECT *
FROM Sales
ORDER BY Salesperson, Date, Amount
'''
cur.execute(query)
print(f'{"Name":<6} {"Date":<10} {"Amount":>10}')
print('------ ---------- ----------')
prev_name = ''
total = 0
for name, date, amount in cur:
# Print the name if it has changed.
the_name = ' '
if name != prev_name:
if prev_name != '':
# Display the previous person's total.
print(f'{"":<6} {"Total":>10} {total:>10}\n')
prev_name = name
the_name = name
total = 0
total += amount
print(f'{the_name:<6} {date:%Y-%m-%d} {amount:>10}')
# Display the last person's total.
print(f'{"":<6} {"Total":>10} {total:>10}\n')
This code executes the query. It then displays Name, Date, and Amount headers and starts looping through the results.
If the current record's Name value differs from the previous name, the records are switching from one salesperson to the next. In that case, the code prints the previous person's total and sets the_name equal to the new person's name.
The code then prints the current name (which is either the current person's name or six spaces to make things line up) followed by the current record's date and amount.
When the function has finished looping through the results, it prints the last person's total. Here's the result.
Name Date Amount
------ ---------- ----------
Alice 2026-04-01 30.00
2026-04-01 65.00
2026-04-01 100.00
Total 195.00
Bobby 2026-04-01 20.00
2026-04-01 70.00
2026-04-01 87.00
Total 177.00
Cindy 2026-04-01 47.00
2026-04-01 90.00
2026-04-01 130.00
Total 267.00
Multiple Queries
The previous solution used a single query to loop through all of the Sales records. The last approach shown here uses multiple queries, one for each salesperson. That makes it slightly easier to do the bookkeeping needed to print names and totals.
Here's the select_all_by_group2 function.
def select_all_by_group2(cur):
'''Select the records and display the results by group.'''
query = 'SELECT DISTINCT Salesperson FROM Sales ORDER BY Salesperson'
cur.execute(query)
# Save the names into a list.
salespeople = [row[0] for row in cur.fetchall()]
# Display the data for each salesperson.
for person in salespeople:
print(person)
query = f'''
SELECT Date, Amount
FROM Sales
WHERE Salesperson = "{person}"
ORDER BY Date
'''
cur.execute(query)
total = 0
for date, amount in cur:
# Add to this person's total.
total += amount
# Print the date and amount.
print(f' {date:%Y-%m-%d} {amount:>10}')
# Display the person's total.
print(f' {"Total":>10} {total:>10}\n')
The code first executes the query SELECT DISTINCT Salesperson FROM Sales ORDER BY Salesperson to get a list of the distinct Salesperson values. It then loops through those values.
For each value, the code prints the person's name and then composes a query similar to SELECT Date, Amount FROM Sales WHERE Salesperson = "Alice" ORDER BY Date. The function then loops through that person's results.
For each of that person's results, the code adds to the person's total and displays the current record's date and amount.
When it finishes printing values for every salesperson, the code prints that person's total. Here's the result.
Alice
2026-04-01 100.00
2026-04-01 30.00
2026-04-01 65.00
Total 195.00
Bobby
2026-04-01 70.00
2026-04-01 20.00
2026-04-01 87.00
Total 177.00
Cindy
2026-04-01 90.00
2026-04-01 130.00
2026-04-01 47.00
Total 267.00
This is slightly different from the previous result but that's just to show a new format. You could use this technique to produce the earlier format if you like.
Conclusion
The SQL GROUP BY statement lets you aggregate values in related rows. If you want to display detail plus aggregated values, you're going to have to loop through a set of records and do your own calculations.
Download the example to experiment with it and to see additional details.
|