Title: Split images into halves in Python

[This program splits the images in a directory either horizontally or vertically in Python]

This is the second in a series of posts leading to a flash card application that you can use to learn about Japanese hiragana characters.

Sometimes I start a project that begins simply enough but that requires me to write a couple of other programs. In this case I wanted to write a simple flash card program to display images of Japanese hiragana characters so I could learn their pronunciation. The program would display a character. Then when you clicked or pressed a button, it would show you a second image showing a mnemonic giving the character's pronunciation.

I found some images that I liked at ToFuGu, but that site places each character and its mnemonic on a single image. That leads to this program to split images.

Even this isn't completely trivial because the character and its mnemonic are not centered in their halves of the image. If you split an image down the middle, each piece sits near the left or right edge of its half instead of in the center.

This program lets you split images into halves either horizontally (which is what we need here) or vertically (just in case you need that later). The program then optionally processes the halves to remove whitespace around the edges so you can display the results centered.

Using the Program

To use the program, enter the name of the directory that contains the images to split. Also enter an output directory. Use the radio buttons to indicate whether you want to split the images horizontally or vertically, and check the Remove Whitespace box if you like. When you click Split, the program splits the image files in the input directory and saves the results in the output directory with _a and _b appended to their names. For example, the file ku.png is split into files ku_a.png and ku_b.png.

Getting Started

When you click the Split button, the following code executes.

    # Module-level imports that this method relies on.
    import tkinter as tk
    from pathlib import Path

    def split(self):
        '''Split the images.'''
        # Enumerate the images in the From directory.
        extensions = ['.png', '.jpg', '.gif', '.tiff', '.jpeg', '.bmp']
        from_path = Path(self.from_dir_var.get())
        to_path = Path(self.to_dir_var.get())

        # Make the output directory if needed.
        to_path.mkdir(parents=True, exist_ok=True)

        # Process the files.
        for item in from_path.iterdir():
            if item.is_file():
                if item.suffix.lower() in extensions:
                    # Split this file.
                    self.split_file(item, to_path)

        # Clear the canvas so we know we are done.
        self.canvas.delete(tk.ALL)

This event handler converts the "from" and "to" directory paths into Path objects. It creates the output directory if it doesn't already exist and then uses from_path.iterdir() to loop through the contents of the input directory.

The from_path.iterdir() method returns both files and directories, so the program uses each item's is_file method to see if it is a file.

If an item is a file, the code gets its suffix (extension), converts it into lower case, and sees if the extension is in the extensions list, which contains the most common image file extensions.

If the item is a file with an image file extension, the code calls split_file to split the image into two new files.
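The filtering steps above can also be pulled into a small standalone helper. This is only a sketch: the name list_image_files and the IMAGE_EXTENSIONS constant are made up for illustration and are not part of the example program.

```python
from pathlib import Path

# Extensions treated as images, matching the list used by the split method.
IMAGE_EXTENSIONS = {'.png', '.jpg', '.gif', '.tiff', '.jpeg', '.bmp'}

def list_image_files(from_path, extensions=IMAGE_EXTENSIONS):
    '''Return the image files directly inside from_path, sorted by name.'''
    return sorted(
        item for item in Path(from_path).iterdir()
        # Skip subdirectories and compare extensions without regard to case.
        if item.is_file() and item.suffix.lower() in extensions
    )
```

Separating the enumeration from the splitting like this makes it easy to check which files the program would process before it writes anything.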

After it processes all of the files in the input directory, the split method clears the program's Canvas widget. Shortly you'll see how the program uses the canvas to display the file it is working on.

Splitting Images

The following split_file method splits an image file into two new images.

    # Module-level import that this method relies on (Pillow).
    from PIL import Image, ImageTk

    def split_file(self, item, to_path):
        '''Split this file.'''
        # Load the image.
        filename = item.resolve()
        if PRINT_ORIGINAL_NAME:
            print(filename)
        image = Image.open(filename)
        image = image.convert('RGB')    # Use RGB format.

        # Resize to fit the canvas.
        image, x, y = fit_image(image, 0, 0,
            self.canvas.winfo_width(), self.canvas.winfo_height())

        # Display the resized image.
        self.photo_image = ImageTk.PhotoImage(image)
        self.canvas.delete(tk.ALL)
        self.canvas.create_image(x, y, image=self.photo_image, anchor=tk.NW)
        self.canvas.update()

        # Split the image into two pieces.
        # Note: crop takes (left, upper, right, lower), and the right and
        # lower coordinates are exclusive, so we do not subtract 1 from them.
        if self.split_var.get() == 0:
            # Split horizontally.
            wid = image.width // 2
            hgt = image.height
            image1 = image.crop((0, 0, wid, hgt))
            image2 = image.crop((wid, 0, image.width, hgt))
        else:
            # Split vertically.
            wid = image.width
            hgt = image.height // 2
            image1 = image.crop((0, 0, wid, hgt))
            image2 = image.crop((0, hgt, wid, image.height))

        # If we should remove whitespace, do so.
        if self.remove_whitespace_var.get():
            # Remove whitespace around the edges.
            image1 = remove_whitespace(image1)
            image2 = remove_whitespace(image2)

        # Create the pieces' file names.
        filename1 = to_path / f'{item.stem}_a{item.suffix}'
        filename2 = to_path / f'{item.stem}_b{item.suffix}'
        if PRINT_NEW_NAMES:
            print(filename1)
            print(filename2)

        # Save the pieces.
        image1.save(filename1)
        image2.save(filename2)

The method's inputs are a Path object representing an image file and the path to the output directory.

The code first uses the file's Path object's resolve method to get the full path to the file. It then uses Image.open to load the file into a PIL image and calls its convert method to make sure the image is using the RGB color model.

Next, the program uses the fit_image function described in my earlier post Fit an image to a target rectangle and center it in Python to get a version of the image that fits this program's Canvas widget. It converts the PIL image into a PhotoImage, saving it in the instance attribute self.photo_image so it won't be garbage collected.

The code deletes any previous items on the canvas, displays the new image there, and updates the canvas so you can see it. (If you don't update the canvas, you don't see anything and the program is a lot more boring.)

Next, the method gets down to the work of splitting the image into two pieces. It checks self.split_var to see if we want to split the image horizontally or vertically, and splits the image accordingly.
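To make the geometry concrete, here is a minimal sketch that computes the two crop rectangles without needing an image. The helper name split_boxes is hypothetical. It relies on the fact that the box passed to a PIL image's crop method is (left, upper, right, lower) with the right and lower coordinates exclusive, so two boxes that share the midpoint tile the image exactly.

```python
def split_boxes(width, height, horizontally=True):
    '''Return two (left, upper, right, lower) boxes that tile the image.'''
    if horizontally:
        # Left and right halves share the vertical line x = mid.
        mid = width // 2
        return (0, 0, mid, height), (mid, 0, width, height)
    # Top and bottom halves share the horizontal line y = mid.
    mid = height // 2
    return (0, 0, width, mid), (0, mid, width, height)
```

For an image that is 101 pixels wide and 40 pixels tall, a horizontal split gives boxes 50 and 51 pixels wide. You would pass each box to the image's crop method to get the corresponding piece.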

If the Remove Whitespace checkbox is checked, the program calls the remove_whitespace function (described next) to remove whitespace around the images' edges.

The program then composes the names of the new image files that it will create. For example, the first name consists of the "to" directory's path, the input file's stem (file name without path or extension), the extra text _a, and the input file's suffix (extension). Note that this means the output files have the same format (PNG, JPG, etc.) as the input file.
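The naming rule is easy to try on its own with pathlib. In this sketch the function name piece_paths is an assumption made for illustration; the f-strings are the same ones the method uses.

```python
from pathlib import Path

def piece_paths(item, to_path):
    '''Return the output paths for the two pieces of the file item.'''
    item = Path(item)
    to_path = Path(to_path)
    # stem is the file name without its extension; suffix is the extension.
    return (to_path / f'{item.stem}_a{item.suffix}',
            to_path / f'{item.stem}_b{item.suffix}')
```

For example, piece_paths('input/ku.png', 'output') produces paths ending in ku_a.png and ku_b.png inside the output directory.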

Finally, the method saves the two new images.

Removing Whitespace

The following remove_whitespace function removes the whitespace around the edges of a PIL image.

    def remove_whitespace(image):
        '''Remove whitespace around the image's edges.'''
        bounds = non_white_bounds(image)
        return image.crop(bounds)

This function calls the non_white_bounds function described next to get the bounds of the pixels in the image that are not whitespace. It then uses the image's crop method to extract and return the non-whitespace area.

Here's the non_white_bounds function that finds the bounds of the area that contains the image's non-whitespace pixels.

    def non_white_bounds(image):
        '''Find the bounds of the image's non-white pixels.'''
        pixels = image.load()

        # Consider pixels with brightness less than this to be non-whitespace.
        cutoff = 3 * 255

        # Find ymin.
        ymin = None
        for y in range(image.height):
            for x in range(image.width):
                (r, g, b) = pixels[x, y]
                if r + g + b < cutoff:
                    ymin = y
                    break
            if ymin is not None:
                break

        # See if all pixels are white.
        if ymin is None:
            return None

        # Find ymax.
        ymax = None
        for y in range(image.height-1, -1, -1):
            for x in range(image.width):
                (r, g, b) = pixels[x, y]
                if r + g + b < cutoff:
                    ymax = y
                    break
            if ymax is not None:
                break

        # Find xmin.
        xmin = None
        for x in range(image.width):
            for y in range(image.height):
                (r, g, b) = pixels[x, y]
                if r + g + b < cutoff:
                    xmin = x
                    break
            if xmin is not None:
                break

        # Find xmax.
        xmax = None
        for x in range(image.width-1, -1, -1):
            for y in range(image.height):
                (r, g, b) = pixels[x, y]
                if r + g + b < cutoff:
                    xmax = x
                    break
            if xmax is not None:
                break

        # crop treats the right and lower coordinates as exclusive,
        # so add 1 to keep the last non-white column and row.
        return (xmin, ymin, xmax + 1, ymax + 1)

This function first uses the image's load method to give us fast access to the image's pixels.

Next, the code sets a cutoff value. If the sum of a pixel's red, green, and blue color components is less than this cutoff, the function regards it as a dark pixel that is not whitespace.

This example sets cutoff so a pixel is whitespace only if its red, green, and blue components are all 255 so the pixel is completely white. You can make cutoff smaller to treat slightly darker pixels as whitespace if you like.
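The cutoff test is small enough to check by hand. The following sketch wraps it in a hypothetical helper named is_whitespace, which is not part of the example program, so you can see how changing cutoff moves the boundary.

```python
def is_whitespace(pixel, cutoff=3 * 255):
    '''Return True if the (r, g, b) pixel counts as whitespace.'''
    r, g, b = pixel
    # A pixel is whitespace unless its component sum falls below cutoff.
    return r + g + b >= cutoff
```

With the default cutoff, only (255, 255, 255) is whitespace. With cutoff set to 3 * 250, a light gray pixel such as (250, 250, 250) also counts as whitespace.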

The function then finds the smallest Y coordinate that contains a non-whitespace pixel. To do that, it makes y loop from 0 up to, but not including, the image's height. For each y value, it makes x loop across the image's width, and it examines the pixel at position (x, y). If the sum of that pixel's red, green, and blue color components is less than cutoff, it is not whitespace. In that case, the code sets ymin to the current y value and breaks out of its loops.

There's one odd situation that may occur. If the image is completely whitespace, then the loop doesn't assign ymin so it remains None. In that case, the non_white_bounds function returns None as the bounds for the non-whitespace pixels. If you look back at the remove_whitespace function, you'll see that it uses the bounds in a call to the image's crop method. It turns out that, if crop receives the parameter None, it returns a copy of the image. What all this means is, if an image file is completely blank, the program creates two half-sized images that are also blank.

Assuming the non_white_bounds function finds ymin, it performs similar steps to find the maximum Y coordinate and the minimum and maximum X coordinates for the area containing the image's non-whitespace pixels.

The function finishes by returning the bounds.
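If you want to experiment with the search logic without loading an image, the following sketch runs the same idea on a plain list of rows of (r, g, b) tuples. It is an alternative formulation rather than the program's code: the name non_white_bounds_grid is made up, and it collects the rows and columns that contain dark pixels instead of scanning inward from each edge. It returns the right and lower values as exclusive coordinates, which is what a PIL image's crop method expects.

```python
def non_white_bounds_grid(rows, cutoff=3 * 255):
    '''Return (left, upper, right, lower) for the non-white pixels, or None.

    rows is a list of rows, each a list of (r, g, b) tuples.
    The right and lower values are exclusive.
    '''
    # Indices of rows that contain at least one non-white pixel.
    dark_rows = [y for y, row in enumerate(rows)
                 if any(sum(pixel) < cutoff for pixel in row)]
    if not dark_rows:
        return None     # Every pixel is whitespace.

    # Indices of columns that contain at least one non-white pixel.
    dark_cols = [x for x in range(len(rows[0]))
                 if any(sum(row[x]) < cutoff for row in rows)]
    return (dark_cols[0], dark_rows[0], dark_cols[-1] + 1, dark_rows[-1] + 1)
```

This version examines every pixel, so it does more work than the edge-inward scans on large images, but it is handy for small tests of the bounds logic.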

Conclusion

To summarize:

split loops through the contents of the input directory and calls split_file for the image files there
split_file splits a file horizontally or vertically and optionally calls remove_whitespace on the pieces
remove_whitespace calls non_white_bounds to find the area containing non-whitespace pixels and crops out that area
non_white_bounds loops through an image's pixels and returns the bounds of the area that contains non-whitespace pixels

It's a lot of steps, but they're all pretty straightforward taken one at a time.

In my next post, we'll perform one last pre-processing step before getting to the final flash card app. Meanwhile, download the example to experiment with it and to see additional details.