Title: Use Gaussian elimination to solve a system of equations in Python
This example shows how to use Gaussian elimination to solve a linear system of equations of the form:
A1*x1 + B1*x2 + ... + N1*xn = C1
A2*x1 + B2*x2 + ... + N2*xn = C2
...
An*x1 + Bn*x2 + ... + Nn*xn = Cn
Here's an example:
2 * x1 + 1 * x2 - 1 * x3 = 8
-3 * x1 - 1 * x2 + 2 * x3 = -11
-2 * x1 + 1 * x2 + 2 * x3 = -3
The solution to these equations gives values for x1, x2, and x3 that make all three equations true. (In this example, you can verify that the values 2, 3, and -1 work.)
Terminology
Here's a very quick overview of how Gaussian elimination works. The whole process replies on just a few facts.
Fact 1
If you multiply both sides of an equation by a non-zero value, the truth of the equation remains unchanged.
For example, if 3 × x = 9, then multiplying both sides by 2 gives 6 × x = 18.
Fact 2
If you add one equation to another, then the truth of the equations still holds.
For example, consider these equations:
9 * x1 + 4 * x2 = 7
4 * x1 + 3 * x2 = 8
If you add those equations, you get:
13 * x1 + 7 * x2 = 15
Fact 3
You can swap the order of the equations without changing the truth of the whole system.
(This fact isn't used as often as the others.)
Before I explain the technique, you should know that it's often convenient to represent a system of linear equations with a matrix of coefficients multiplied by a vector of variables x1, x2, ..., xn, resulting in a vector of constants C1, C2, ..., Cn.
| A1 B1 ... N1 | | x1 | | C1 |
| A2 B2 ... N2 | | x2 | | C2 |
| ... | × |... | = |... |
| An Bn ... Nn | | xn | | Cn |
Here's what the earlier example looks like in this format.
| 2 1 -1 | | x1 | | 8 |
| -3 -1 2 | x | x2 | = | -11 |
| -2 1 2 | | x3 | | -3 |
Sometimes it's handy to keep all of the values in one place so we make an augmented matrix that includes the result values (the Cs). When writing this out, we usually put a line between the coefficients and the result values like this:
| A1 B1 ... N1 | C1 |
| A2 B2 ... N2 | C2 |
| ... |... |
| An Bn ... Nn | Cn |
And here's our example again in augmented matrix form:
| 2 1 -1 | 8 |
| -3 -1 2 | -11 |
| -2 1 2 | -3 |
Gaussian Elimination
Now that we have an augmented matrix, here's how Gaussian elimination works.
Step 1
Use the first equation to zero out the entries in the first column of the equations below the first one.
To do that, we use Facts 1 and 2 to add a multiple of the first equation to each of the following equations. For example, consider the example again. The first equation starts with 2 and the second equation starts with -3. If we multiply the first equation by 3/2 and then add the result to the second equation, the second equation's first entry becomes 0. To maintain the truth of the equations, we also need to multiply the other terms in the first equation and add them to the second equation. Here's the full result after this operation.
| 2 1 -1 | 8 |
| 0 0.5 0.5 | 1 |
| -2 1 2 | -3 |
Now we do the same thing for the third equation. That equation begins with -2, so we multiply the first equation by 1 and add them to get this:
| 2 1 -1 | 8 |
| 0 0.5 0.5 | 1 |
| 0 2 1 | 5 |
Step 2
Use the second equation to zero out the entries in the second column of the equations below the second one.
This is the same thing we just did for the first column, but this time for the second column. If you look at the example, you'll see that the second equation starts with 0.5 and the third equation starts with 2, so we multiply the second equation by -4 and add it to the third to get this:
| 2 1 -1 | 8 |
| 0 0.5 0.5 | 1 |
| 0 0 -1 | 1 |
For a larger system of equations, you would continue as in Steps 1 and 2 using the next equation to zero out the corresponding column in the equations below it. You continue until all of the entries to the lower left of the diagonal are 0. At that point, the matrix is called upper triangular because all of the non-zero entries are in a triangle in the upper right part of the matrix.
Unfortunately, one odd thing can happen while you're processing the matrix. Sometimes you'll want to use the Nth equation to zero out the Nth column, but that equation has a 0 in that column. In that case, you cannot multiply the equation by something to zero out the other equations' entries.
This is when we invoke Fact 3. You swap that equation with an equation lower in the matrix that does not have a 0 in that column and you continue as normal.
Eventually you will either reach an upper triangular form or all of the equations below have 0 in the Nth column. If the former happens, read on. If the later happens, there is no unique solution to the system of equations.
Backsolving
At this point, the matrix is in upper triangular form and you can backsolve to find the solution. Here's the final value of the example augmented matrix.
| 2 1 -1 | 8 |
| 0 0.5 0.5 | 1 |
| 0 0 -1 | 1 |
The last equation is really shorthand for 0 * x1 + 0 * x2 + -1 * x3 = 1 which simplifies to -x3 = 1. That's easy to solve! That means x3 = -1.
Now you can go back up one equation (hence the term backsolve) and use that result to solve the equation. The second equation means 0 * x1 + 0.5 * x2 + 0.5 * x3 = 1. But now we know x3 = -1 so we can plug that in and simplify to get 0.5 * x2 + 0.5 * (-1) = 1 or 0.5 * x2 = 1.5. Solving that gives x2 = 3.
Now you can plug the values for x2 and x3 into the first equation to get 2 * x1 + 1 * x2 -1 * x3 = 8. Plugging in the values gives 2 * x1 + 3 + 1 = 8 which simplifies to 2 * x1 = 4 and x1 = 2.
Therefore the solution is [2, 3, -1] and Bob's your uncle.
I want to make one final note before we move on. When backsolving, you need somewhere to store the results. One convenient way to handle that is to add another column to the augmented matrix. Initially those entries are all zeros and you fill them in while you backsolve. (That is the way the Python code shown shortly does it.)
Python Code
Here comes the Python code. To make using the code a bit easier, I wrote a load_augmented_matrix function to create an augmented matrix. The gaussian_elimination function uses that helper function and then solves the system of equations.
load_augmented_matrix
Here's the load_augmented_matrix function.
import math
import copy
def load_augmented_matrix(coeffs, values):
'''Use the coefficients and values to create an augmented matrix.'''
# Start with a copy of the coefficients.
arr = copy.deepcopy(coeffs)
# Add the values and an extra entry at the end of each row.
for r in range(len(coeffs)):
arr[r].append(values[r])
arr[r].append(0)
return arr
The code first uses copy.deepcopy to make a copy of the coeffs matrix. It must be a deep copy because coeffs is a list of lists. If you just used copy, then the result would be a list containing the original sub-lists. Then when we later modify the copy, we would mess up the original values.
After making the deep copy, the code loops through the values (the Cs in the earlier discussion) and adds them to the ends of their rows in the arr copy. It also adds a 0 entry to hold the results during backsolving.
When it's done, the function returns the augmented matrix arr.
gaussian_elimination
Here's the gaussian_elimination function.
def gaussian_elimination(coeffs, values):
'''Solve the system of equations.'''
# arr is the augmented matrix.
arr = load_augmented_matrix(coeffs, values)
# The values num_rows and num_cols are the number of rows
# and columns in the original matrix, not the augmented matrix.
num_rows = len(arr)
num_cols = len(arr[0]) - 2
# Start solving.
for r in range(num_rows - 1):
# Zero out all entries in column r after this row.
# See if this row has a non-zero entry in column r.
if math.isclose(arr[r][r], 0, abs_tol=0.00001):
# Too close to zero. Try to swap with a later row.
for r2 in range(r + 1, num_rows):
if not math.isclose(arr[r2][r], 0, 0.00001):
# This row will work. Swap them.
arr[r], arr[r2] = arr[r2], arr[r]
break
# If this row has a non-zero entry in column r, use it.
if not math.isclose(arr[r][r], 0, abs_tol=0.00001):
# Zero out this column in later rows.
for r2 in range(r + 1, num_rows):
factor = -arr[r2][r] / arr[r][r]
for c in range(r, num_cols + 1):
arr[r2][c] = arr[r2][c] + factor * arr[r][c]
# At this point, we have an upper-triangular array.
# print('Upper-triangular:')
# print_array(arr)
# print()
# See if we have a solution.
if math.isclose(arr[-1][-3], 0):
# We have no solution.
# See if all of the entries in this row are 0.
if all(math.isclose(value, 0) for value in arr[num_rows - 1]):
raise ValueError('The solution is not unique')
else:
raise ValueError('There is no solution')
else:
# Backsolve.
for r in range(num_rows - 1, -1, -1):
tmp = arr[r][num_cols]
for r2 in range(r + 1, num_rows):
tmp -= arr[r][r2] * arr[r2][num_cols + 1]
arr[r][num_cols + 1] = tmp / arr[r][r]
# print('Backsolved:')
# print_array(arr)
# print()
# Pull out the results.
results = [row[-1] for row in arr]
# Return the results.
return results
This code calls load_augmented_matrix to prepare the augmented matrix. It then loops through the rows using row r to zero out entries in column r. Before it tries to do the zeroing, it checks that row r column r is not zero (or too close to it). If that entry is close to zero, the code loops through later rows trying to find one with a non-zero entry. If it finds one, it swaps the two rows.
Having (hopefully) found a row with a non-zero entry, the code loops through the later rows and zeros out that position.
After it has done its best to make an upper-triangular matrix, the function checks to see if it succeeded and, if it did not, raises an appropriate error.
If everything's cool, the code backsolves and stores the results in the augmented matrix's last column.
The column then pulls the solution values out into a new list and returns it.
check
The following check method evaluates a system of equations and prints the differences between the calculated values and a list of expected values.
def check(coeffs, xs, values):
for r in range(len(coeffs)):
total = 0
for c in range(len(xs)):
total += coeffs[r][c] * xs[c]
diff = values[r] - total
print(f'Diff {diff:.2e}: {values[r]:4} = {total}')
This code loops through the equations. For each equation, it multiplies the coefficients by the xs values and adds the results. It then prints the total for that equation and the expected value in the values list.
Main Program
Here's how the main program solves one system of equations.
# Define the coefficients and values.
coeffs = [
[ 1, 1, 1, 1, 1],
[ 32, 16, 8, 4, 2],
[ 243, 81, 27, 9, 3],
[1024, 256, 64, 16, 4],
[3125, 625, 125, 25, 5],
]
values = [1, -1, 8, -56, 569]
# Solve.
xs = gaussian_elimination(coeffs, values)
# Display the inputs.
print('Coefficients:')
for row in coeffs:
print(' [' + ', '.join(f'{value:4d}' for value in row) + ']')
print('Values:')
print(' [' + ', '.join(f'{value:4d}' for value in values) + ']')
print()
# Display the results.
print('Results:')
print(' [' + ', '.join(f'{value:.2f}' for value in xs) + ']')
print()
# Check.
check(coeffs, xs, values)
This code creates a coefficients array and a list of values. It then calls gaussian_elimination to solve the system of equations.
Next, the code displays the coefficients, values, and results. It finishes by calling the check method to verify the results.
Conclusion
Here are some other equations you can try to test the program:
| 1 -3 1| |x1| | 4|
| 2 -8 8| *|x2| = |-2| has solution 3, -1, -2.
|-6 3 -15| |x3| | 9|
| 9 3 4| |x1| | 7|
| 4 3 4| *|x2| = | 8| has solution -0.2, 4, -0.8.
| 1 1 1| |x3| | 3|
| 2 -1 1| |x1| |1|
| 3 2 -4| *|x2| = |4| has no solution.
|-6 3 -3| |x3| |2|
| 1 -1 2| |x1| |-3|
| 4 4 -2| *|x2| = | 1| has no unique solution.
|-2 2 -4| |x3| | 6|
For a more general and theoretical discussion on Gaussian elimination, see the article Gaussian Elimination by Eric W. Weisstein at MathWorld--A Wolfram Web Resource.
Download the example to experiment with it and to see additional details.
|