Smart Cropping: How FontFinder Isolates the Right Text from Background Noise
A company logo usually contains both a symbol and text. A screenshot might include navigation bars, buttons, and multiple typographic hierarchies. A product photo has decorative elements, photography, and perhaps a dozen fonts competing for attention.
FontFinder's smart cropping step solves a fundamental challenge: finding the text region that the user actually cares about, and cropping to exactly that region.
The Challenge
If we feed a full logo image to our AI model, it has to process the icon, the brand mark, the decorative elements, and the text all at once. The text — the part that contains font information — might occupy only 20% of the image. The rest is noise from the model's perspective.
Step 1: Morphological Dilation
Individual characters are separated by small gaps, so in a binarised image each letter forms its own connected component, and a dotted letter such as 'i' forms two. A word like "FontFinder" therefore breaks into many separate connected components (roughly one per letter).
We use morphological dilation with a horizontal structuring element to connect nearby components into unified text blobs. Because the kernel is wide but short, it spreads each character sideways much more than up or down. The kernel width is tuned to bridge the gaps between characters in the same word without merging text from different lines.
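To make the idea concrete, here is a minimal pure-Python sketch of dilation with a 1-pixel-tall horizontal kernel. It is an illustration, not FontFinder's actual code: the function names, the toy mask, and the kernel width of 3 are all assumptions. In an OpenCV pipeline the same step would typically be a call to cv2.dilate with a rectangular kernel from cv2.getStructuringElement.

```python
def dilate_horizontal(mask, kernel_width):
    """Dilate a binary mask with a 1 x kernel_width structuring element.

    mask is a list of rows of 0/1 values. kernel_width should be odd so the
    kernel is centred on each pixel. (Hypothetical helper for illustration.)
    """
    half = kernel_width // 2
    out = []
    for row in mask:
        width = len(row)
        new_row = [0] * width
        for x in range(width):
            lo = max(0, x - half)
            hi = min(width, x + half + 1)
            # A pixel turns on if any pixel under the kernel is on.
            if any(row[lo:hi]):
                new_row[x] = 1
        out.append(new_row)
    return out


def count_runs(row):
    """Count separate runs of foreground pixels in one row."""
    runs, prev = 0, 0
    for value in row:
        if value and not prev:
            runs += 1
        prev = value
    return runs


# Three "letters" separated by 2-pixel gaps on the first row.
# The second row is empty and must stay empty after dilation.
mask = [
    [1, 1, 0, 0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]
dilated = dilate_horizontal(mask, kernel_width=3)
```

In this toy mask the three separate runs merge into a single run, while the empty row below is untouched because the kernel has no vertical extent.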
Step 2: Contour Detection
After dilation, we run OpenCV's contour detection to find all connected regions in the dilated binary image. We filter contours by aspect ratio and area, because text blocks have characteristic proportions that distinguish them from icons, decorative elements, and noise artifacts.
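The sketch below shows the shape of this step on a tiny mask. To stay dependency-free it swaps OpenCV's contour detection for a plain breadth-first flood fill over 4-connected pixels, which yields the same bounding boxes for simple blobs. The function names and the thresholds (min_area, min_aspect) are hypothetical values chosen for the toy data, not FontFinder's tuned settings.

```python
from collections import deque


def find_boxes(mask):
    """Return (x, y, w, h) bounding boxes of 4-connected foreground regions."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            min_x = max_x = x0
            min_y = max_y = y0
            while queue:
                x, y = queue.popleft()
                min_x, max_x = min(min_x, x), max(max_x, x)
                min_y, max_y = min(min_y, y), max(max_y, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((min_x, min_y, max_x - min_x + 1, max_y - min_y + 1))
    return boxes


def filter_text_boxes(boxes, min_area=6, min_aspect=1.5):
    """Keep boxes that are large enough and wider than tall (assumed thresholds)."""
    return [
        (x, y, w, h) for (x, y, w, h) in boxes
        if w * h >= min_area and w / h >= min_aspect
    ]


mask = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 1, 0],   # wide blob: text-like
    [0, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 1],   # square blob (icon-like) and a 1-pixel speck
    [1, 1, 0, 0, 0, 0, 0, 0],
]
boxes = find_boxes(mask)
text_boxes = filter_text_boxes(boxes)
```

Here the flood fill finds three regions, and only the wide 6×2 blob survives the area and aspect-ratio filter.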
Step 3: Bounding Box Selection
From the filtered contours, we compute bounding boxes and score them based on:
- Area — larger text regions are preferred
- Aspect ratio — text is typically wider than tall
- Edge density — text has high edge density (many character strokes)
- Position — central regions are preferred over edge regions
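One way to combine these four cues is a weighted sum, sketched below. The weights, the normalisation of each cue, and the names score_box and pick_best are illustrative assumptions; the article does not say how FontFinder actually weighs the criteria. Edge density is taken as a precomputed number (the fraction of edge pixels inside the box) to keep the example short.

```python
import math


def score_box(box, image_size, edge_density, weights=(0.4, 0.2, 0.2, 0.2)):
    """Combine the four cues into one score between 0 and 1.

    box is (x, y, w, h); image_size is (width, height); edge_density is the
    fraction of edge pixels inside the box, assumed to be computed elsewhere.
    The weights and normalisations are illustrative assumptions.
    """
    x, y, w, h = box
    img_w, img_h = image_size
    w_area, w_aspect, w_edges, w_centre = weights

    # Area: fraction of the image covered by the box.
    area = (w * h) / (img_w * img_h)

    # Aspect ratio: 0 for square or tall boxes, saturating at 1 for 4:1 and wider.
    aspect = min(max(w / h - 1.0, 0.0) / 3.0, 1.0)

    # Position: 1 at the image centre, falling to 0 at a corner.
    dx = (x + w / 2) - img_w / 2
    dy = (y + h / 2) - img_h / 2
    centre = 1.0 - math.hypot(dx, dy) / math.hypot(img_w / 2, img_h / 2)

    return (w_area * area + w_aspect * aspect
            + w_edges * edge_density + w_centre * centre)


def pick_best(candidates, image_size):
    """candidates is a list of (box, edge_density) pairs; return the best box."""
    return max(candidates, key=lambda c: score_box(c[0], image_size, c[1]))[0]


candidates = [
    ((50, 80, 100, 25), 0.30),   # wide, central, stroke-heavy: text-like
    ((0, 0, 40, 40), 0.10),      # square, in a corner: icon-like
]
best = pick_best(candidates, image_size=(200, 200))
```

With these made-up candidates, the wide central box outscores the square corner box on aspect ratio, edge density, and position, so it is selected.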
Step 4: Padding and Letterboxing
After cropping to the text bounding box, we add a small padding margin (10% of each dimension) to ensure no character strokes are cut off. Then we resize to our model's input dimensions (224×224) using letterbox scaling — adding white borders rather than distorting the aspect ratio.
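The geometry of this step can be sketched without any image library. The example assumes the 10% padding is applied on every side and clamped to the image bounds, which is one reading of the description above. The helper names are hypothetical. The actual pixel resize and white fill would be done by an image library and are left out here.

```python
def pad_box(box, image_size, pad_frac=0.10):
    """Grow (x, y, w, h) by pad_frac of its width and height on every side,
    clamped to the image bounds. (Per-side padding is an assumption.)"""
    x, y, w, h = box
    img_w, img_h = image_size
    pad_x, pad_y = round(w * pad_frac), round(h * pad_frac)
    left, top = max(0, x - pad_x), max(0, y - pad_y)
    right = min(img_w, x + w + pad_x)
    bottom = min(img_h, y + h + pad_y)
    return left, top, right - left, bottom - top


def letterbox_geometry(w, h, target=224):
    """Return (new_w, new_h, offset_x, offset_y) for fitting a w x h crop into
    a target x target canvas without changing its aspect ratio."""
    scale = min(target / w, target / h)
    new_w = min(target, max(1, round(w * scale)))
    new_h = min(target, max(1, round(h * scale)))
    # Centre the resized crop; the remaining canvas would be filled with white.
    return new_w, new_h, (target - new_w) // 2, (target - new_h) // 2


padded = pad_box((50, 80, 100, 30), image_size=(200, 200))
geometry = letterbox_geometry(padded[2], padded[3])
```

For a 100×30 text box, the padded crop is 120×36. It scales to 224×67 and sits 78 pixels from the top of the 224×224 canvas, with white bands above and below.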
Manual Override: The Crop Tool
Automatic cropping works for most images, but some logos and designs are complex enough to fool our algorithm. That's why FontFinder also gives users a manual crop tool — draw a box around exactly the text you want identified, and that region goes directly to the model, bypassing the automatic cropping entirely.