[Rod Stephens Books]
Index Books Python Examples About Rod Contact
[Mastodon] [Bluesky]
[Build Your Own Ray Tracer With Python]

[Beginning Database Design Solutions, Second Edition]

[Beginning Software Engineering, Second Edition]

[Essential Algorithms, Second Edition]

[The Modern C# Challenge]

[WPF 3d, Three-Dimensional Graphics with WPF and C#]

[The C# Helper Top 100]

[Interview Puzzles Dissected]

Title: Find a polynomial least squares fit for a set of points in Python

[A set of points fitted with a polynomial least squares fit in Python]

This example shows how to make a polynomial least squares fit to a set of data points. It's kind of confusing, but you can get through it if you take it one step at a time.

The example Find a linear least squares fit for a set of points in Python explains how to find a line that best fits a set of data points. If you haven't read that example yet, you should do so now because this example follows the same basic strategy.

Overview

With a degree d polynomial least squares fit, you need to find the coefficients A0, A1, ... Ad to make the following equation fit the data points as closely as possible:

A0 * x0 + A1 * x1 + A2 * x2 + ... + Ad * xd

The goal is to minimize the sum of the squares of the vertical distances between the curve and the points.

Keep in mind that you know all of the points so, for given coefficients, you can easily loop through all of the points and calculate the error.

If you store the coefficients in a list, then the following function calculates the value of the function F(x) at the point x.

def polynomial(coeffs, x): '''Calculate the polynomial function's value.''' total = 0 x_factor = 1 for i, coeff in enumerate(coeffs): total += x_factor * coeff x_factor *= x return total

The following function uses the polynomial function to calculate the total error squared between the data points and the polynomial curve.

def error_squared(points, coeffs): '''Return the error squared for this polynomial curve fit.''' total = 0 for point in points: dy = point[1] - polynomial(coeffs, point[0]) total += dy * dy return total

So far this shouldn't be too hard. Now let's talk about the calculations, calculus, and all that.

Calculations

The following equation shows the error function:

    Sum[(yi - (A0 * x^0 + A1 * x^1 + A2 * x^2 + ... + Ad*x^d)^2]

To simplify this, let Ei be the error in the ith term so:

    Sum[(Ei)^2]

The steps for finding the best solution are the same as those for finding the best linear least squares solution:

  • Take the partial derivatives of the error function with respect to the variables, in this case A0, A1, ... Ad.
  • Set the partial derivatives equal to 0 to get d + 1 equations and d + 1 unknowns A0, A1, ... Ad.
  • Solve the equations for A0, A1, ... Ad.

As was the case in the previous example, this may sound like an intimidating problem. Fortunately the partial derivatives of the error function are simpler than you might think because that function only involves simple powers of the As. For example, the partial derivative with respect to Ak is:

    Sum[2 * Ei * partial(Ei, Ak)]

The partial derivative of Ei with respect to Ak contains lots of terms involving powers of xi and different As, but with respect to Ak all of those are constants except the single term Ak * xik. All of the other terms drop out leaving:

    Sum[2 * Ei * (-xi^k)]

If you substitute the value of Ei, multiply the -xik term through, and add the As separately you get:

    equation

As usual, this looks pretty messy, but if you look closely you'll see that most of the terms are values that you can calculate using the xi and yi values. For example, the first term is simply the sum of the products of the yi values and the xi values raised to the kth power. The next term is A0 times the sum of the xi values raised to the kth power. Because the yi and xi values are all known, this equation is the same as the following for a particular set of constants S:

    equation

This is a relatively simple equation with d + 1 unknowns A0, A1, ..., Ad.

When you take the partial derivatives for the other values of k as k varies from 0 to d, you get d + 1 equations with d + 1 unknowns, and you can solve for the unknowns.

Linear least squares is a specific case where d = 1 and it's easy to solve the equations. For the more general case, you need to use a more general method such as Gaussian elimination. (Now you know why my last post was about Gaussian elimination.) For an explanation of Gaussian elimination, see Use Gaussian elimination to solve a system of equations in Python.

Solving the Equations

The following code shows how the example program finds polynomial least squares coefficients.

def find_polynomial_least_squares_fit(points, degree): '''Find the least squares polynomial fit.''' # Allocate space for (degree + 1) equations with (degree + 1) terms each. coeffs = [[0 for c in range(degree + 1)] for r in range(degree + 1)] constants = [0 for r in range(degree + 1)] # Calculate the coefficients for the equations. for j in range(degree + 1): # Calculate the coefficients for the jth equation. # Calculate the constant term for this equation. constants[j] = 0 for point in points: constants[j] -= (point[0] ** j) * point[1] # Calculate the other coefficients. for a_sub in range(degree + 1): # Calculate the next coefficient. coeffs[j][a_sub] = 0 for point in points: coeffs[j][a_sub] -= point[0] ** (a_sub + j) # Solve the equations. answer = gaussian_elimination(coeffs, constants) # Return the result. return answer

The code simply builds the coeffs and constants lists to hold the coefficients (the Ss in the previous equation) and values, and then calls the gaussian_elimination method to find the As.

[A degree 4 polynomial fitting a curve that should probably be fit by a degree 2 polynomial] Give the program a try. It's pretty cool!

Conclusion

Tip: Use the smallest degree that makes sense for your problem. If you use a very high degree, the curve will fit the points very closely but it will probably emphasize structure that isn't really there. For example, the picture on the right fits a degree 4 polynomial to points that really should be fit with a degree 2 polynomial.

Download the example to experiment with it and to see additional details.

© 2024 - 2025 Rocky Mountain Computer Consulting, Inc. All rights reserved.