Title: Map a trapezoidal image onto a rectangle in Python
I often take pictures of products and, because I don't have a studio, the results are often a little crooked. They're nowhere near as crooked as the picture shown here, but the human eye is pretty good at noticing when objects don't line up with a picture's edges, so this is an issue even when the error is much smaller.
I'd been meaning for a long time to build a tool that fixes this, but it seemed like a lot of work. The mapping is a perspective transformation, and I've done a lot of three-dimensional programming, so I knew what was involved: set up equations to represent the points on the trapezoid and the destination rectangle, solve them to create a transformation matrix, and then apply the matrix to the image.
When I finally decided to give it a try, I discovered that cv2 (the Python module for OpenCV, the Open Source Computer Vision Library) can do most of the work for you!
Run the program and use the File menu's Open command to load an image file. Then click four times to define the trapezoid.
When you select the fourth point, the program performs the mapping. Use the File menu's Save As command to save the resulting image. Use the File menu's Reset command to restore the original image so you can try again.
IMPORTANT: You must click to define the trapezoid's points in the order: upper left, upper right, lower right, and lower left. If you use a different order, the program will not preserve the image properly. It may rotate the image, reflect it, or turn it inside out. (Try it, it's pretty interesting.)
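If you would rather not depend on the click order, one common trick is to sort the four points by the sum and difference of their coordinates. The sketch below is not part of the example program; order_corners is a hypothetical helper, and it assumes image coordinates (Y increasing downward) and a quadrilateral that is not rotated too far from upright. For heavily rotated shapes it can pick the wrong corners.

```python
import numpy as np

def order_corners(points):
    """Return four points ordered upper left, upper right, lower right, lower left.

    Hypothetical helper, not from the original program. Assumes image
    coordinates (Y grows downward) and a roughly upright quadrilateral.
    """
    pts = np.asarray(points, dtype=np.float32)
    sums = pts.sum(axis=1)          # x + y: smallest at upper left, largest at lower right
    diffs = pts[:, 1] - pts[:, 0]   # y - x: smallest at upper right, largest at lower left
    return np.float32([
        pts[np.argmin(sums)],   # upper left
        pts[np.argmin(diffs)],  # upper right
        pts[np.argmax(sums)],   # lower right
        pts[np.argmax(diffs)],  # lower left
    ])

# Points clicked in an arbitrary order.
clicked = [(190, 140), (20, 30), (10, 120), (180, 10)]
ordered = order_corners(clicked)
print(ordered)
```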
To perform the mapping, the program builds two numpy arrays, one holding the points that define the trapezoid and one holding the points that define the rectangle onto which you want to map it. It calls cv2.getPerspectiveTransform to get a transformation matrix that maps the first array of points to the second, and then uses cv2.warpPerspective to apply the transformation to the image.
Here's the key piece of code.
    def rectify(self):
        image = cv2.imread(self.filename)
        if image is None:
            messagebox.showinfo('File Error', f'Cannot find file {self.filename}.')
            return
        # Get the result rectangle's dimensions.
        wid = max(
            dist(self.corners[0], self.corners[1]),
            dist(self.corners[2], self.corners[3]))
        hgt = max(
            dist(self.corners[0], self.corners[3]),
            dist(self.corners[1], self.corners[2]))
        # Make margins 10% of width/height.
        margin_x = int(wid * 0.1)
        margin_y = int(hgt * 0.1)
        pts1 = np.float32(self.corners)  # Source points
        pts2 = np.float32([
            [margin_x, margin_y],
            [margin_x + wid, margin_y],
            [margin_x + wid, margin_y + hgt],
            [margin_x, margin_y + hgt]])  # Destination points
        # Get the transformation matrix.
        matrix = cv2.getPerspectiveTransform(pts1, pts2)
        # Get output image dimensions.
        result_wid = int(wid + 2 * margin_x)
        result_hgt = int(hgt + 2 * margin_y)
        # Transform.
        new_image = cv2.warpPerspective(image,
            matrix, (result_wid, result_hgt))
        # Convert colors from BGR to RGB.
        new_image = cv2.cvtColor(new_image, cv2.COLOR_BGR2RGB)
        # Remove the current corners.
        self.corners = []
        # Convert to PIL image and save.
        self.current_pil_image = Image.fromarray(new_image)
        self.show_current_image()
This code first uses cv2.imread to load the image in a format that cv2 can understand.
Next, the code calculates the size that the output rectangle should have. To find the width, it picks the longer of the trapezoid's two "horizontal" edges. (This is one place where the code assumes you have entered the trapezoid's vertices in the order upper left, upper right, lower right, and lower left.) Similarly, it calculates the output rectangle's height by picking the longer of the trapezoid's two "vertical" sides.
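The rectify method calls a dist helper that isn't shown above. Here is a minimal sketch of what it might look like, assuming it is a plain Euclidean distance between two (x, y) points; output_size is a hypothetical wrapper added only to make the width and height calculation easy to try on its own.

```python
import math

def dist(p, q):
    """Euclidean distance between two (x, y) points (assumed behavior)."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def output_size(corners):
    """Width and height of the target rectangle (hypothetical helper).

    Expects corners in the order upper left, upper right,
    lower right, lower left, as the program requires.
    """
    ul, ur, lr, ll = corners
    wid = max(dist(ul, ur), dist(lr, ll))  # longer "horizontal" edge
    hgt = max(dist(ul, ll), dist(ur, lr))  # longer "vertical" edge
    return wid, hgt

# Top edge is 6 long, bottom edge 9, left edge 4, right edge 5.
corners = [(0, 0), (6, 0), (9, 4), (0, 4)]
wid, hgt = output_size(corners)
print(wid, hgt)
```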
Next, the code calculates horizontal and vertical margins equal to 10% of the rectangle's width and height. Feel free to adjust the margin sizes if you like.
The code then builds the two arrays of points. The trapezoid's vertices are stored in self.corners, so the code simply converts it into a numpy array. It uses the rectangle's dimensions and the margin values to build the array holding the rectangle's corners.
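To experiment with the margins separately from the rest of the program, the destination array can be built by a small standalone function. This is only a sketch: destination_points and its margin_fraction parameter are hypothetical names, not part of the example program, but the arithmetic mirrors the code above.

```python
import numpy as np

def destination_points(wid, hgt, margin_fraction=0.1):
    """Build the rectangle's corners and the output image size.

    Hypothetical helper. Returns the points in the order upper left,
    upper right, lower right, lower left, plus (width, height).
    """
    margin_x = int(wid * margin_fraction)
    margin_y = int(hgt * margin_fraction)
    pts = np.float32([
        [margin_x, margin_y],
        [margin_x + wid, margin_y],
        [margin_x + wid, margin_y + hgt],
        [margin_x, margin_y + hgt]])
    size = (int(wid + 2 * margin_x), int(hgt + 2 * margin_y))
    return pts, size

pts2, result_size = destination_points(100, 50)
print(pts2)
print(result_size)
```

Passing margin_fraction=0 makes the rectified trapezoid fill the whole output image.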
Now the code calls cv2.getPerspectiveTransform to get the transformation matrix. It calculates the desired size of the result image and calls cv2.warpPerspective to apply the matrix.
The cv2 library manipulates images in BGR format but PIL prefers RGB, so the program calls cv2.cvtColor to convert the image into the RGB format.
Next, the code clears the self.corners list so you can try again. It converts the cv2 image into a PIL image and calls show_current_image to display the result.
The picture on the right shows the result after rectifying the image shown at the top of this post.
Note that three-dimensional images are distorted by the process. If the trapezoid is close to a rectangle already, then the effect will probably not be noticeable. However, if you have many three-dimensional objects (like a chess board covered in pieces) and the adjustment is large, the pieces will probably show some distortion.
In fact, the picture on the right shows distortion by displaying the box's side (at the bottom of the picture). There's no way a normal camera can make the box's front look rectangular and still show any of the box's other sides.
Of course, the program includes a lot of details that I haven't described here. For example, many of the images I work with are too big to fit on the screen, so the program lets you scale the image. Download the example to see how the program does that, builds its interface, loads and saves files, lets you click to select points, and more.