I recently needed to recover the coordinates of several cropped sub-images within the larger image they were taken from. This article is a memo summarizing how I did it.
The method uses OpenCV's SIFT (Scale-Invariant Feature Transform) to match feature points between each template image and the original image, estimates an affine transformation from those matches, and uses it to obtain the coordinates.
The following code matches each template image (the PNG files in templates_dir) against a specified large image (image_path) using SIFT, and obtains each template's coordinates within the original image.
This article introduced a method for estimating where sub-images are located in the original image, using SIFT-based feature point matching and an affine transformation estimated from the matches.
SIFT is used for feature extraction (included in the main OpenCV module since version 4.4.0, after the patent expired)
BFMatcher is used for feature matching, and RANSAC is used to reject outlier matches
The estimated affine transformation is used to compute each sub-image's coordinates and draw a rectangle on the original image
Result images are saved to visualize where each sub-image is located
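The coordinate step in the list above can be sketched with plain NumPy. Given a 2x3 affine matrix (as returned by OpenCV's affine estimators) and the template's size, the template's four corners are mapped into the original image. The helper names `template_corners_in_image` and `bounding_box` are hypothetical and exist only for illustration.

```python
import numpy as np


def template_corners_in_image(matrix, width, height):
    """Map the template's corner points through a 2x3 affine matrix.

    Corners are returned in the order top-left, top-right,
    bottom-right, bottom-left, as (x, y) rows.
    """
    corners = np.array(
        [[0, 0], [width, 0], [width, height], [0, height]], dtype=np.float64
    )
    matrix = np.asarray(matrix, dtype=np.float64)
    # Affine map: [x', y'] = A @ [x, y] + t, with A = matrix[:, :2], t = matrix[:, 2]
    return corners @ matrix[:, :2].T + matrix[:, 2]


def bounding_box(corners):
    """Axis-aligned box (x_min, y_min, x_max, y_max) enclosing the corners."""
    x_min, y_min = np.floor(corners.min(axis=0))
    x_max, y_max = np.ceil(corners.max(axis=0))
    return int(x_min), int(y_min), int(x_max), int(y_max)
```

The corner array can be cast to `np.int32` and passed to `cv2.polylines` to draw the outline, and the annotated image written out with `cv2.imwrite`. Drawing the transformed polygon rather than the axis-aligned box keeps the outline correct when the sub-image is rotated.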
This method can be applied to tasks such as locating partial images in historical maps, OCR region detection, and image comparison.
Future challenges:
Correction for rotated images
Consideration of faster algorithms than SIFT (ORB, AKAZE, etc.)
Processing speed optimization (feature point filtering)
Some points may be incomplete, but I hope this serves as a useful reference.