Behind the Tech · Jan 2025 · 6 min read · By FontFinder Engineering

Smart Cropping: How FontFinder Isolates the Right Text from Background Noise

A company logo usually contains both a symbol and text. A screenshot might include navigation bars, buttons, and multiple typographic hierarchies. A product photo combines decorative elements, the photographed scene itself, and perhaps a dozen fonts competing for attention.

FontFinder's smart cropping step solves a fundamental challenge: finding the text region that the user actually cares about, and cropping to exactly that region.

The Challenge

If we feed a full logo image to our AI model, it has to process the icon, the brand mark, the decorative elements, and the text all at once. The text — the part that contains font information — might occupy only 20% of the image. The rest is noise from the model's perspective.

Step 1: Morphological Dilation

Individual characters are separated by small gaps. Most letters form a single connected component (the two vertical strokes and the crossbar of an 'H' all touch), while letters such as 'i' and 'j' split into two. A word like "FontFinder" therefore breaks into many separate connected components, roughly one per letter.

We use morphological dilation with a horizontal structuring element to connect nearby components into unified text blobs. The kernel width is tuned to bridge the gaps between characters in the same word, and because the element stretches horizontally rather than vertically, it does not merge separate text lines.
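
To make the effect of a horizontal structuring element concrete, here is a minimal, dependency-free sketch. It is an illustration under stated assumptions, not FontFinder's implementation: the function names, the list-of-rows mask format, and the kernel width of 3 are all hypothetical. With OpenCV, the same operation would normally be a cv2.dilate call with a kernel from cv2.getStructuringElement(cv2.MORPH_RECT, (width, 1)).

```python
def dilate_horizontal(mask, kernel_width):
    """Dilate a binary mask (list of rows of 0/1) with a 1 x kernel_width element.

    A pixel becomes foreground if any pixel within kernel_width // 2 columns
    of it, on the same row, is foreground. Rows never influence each other.
    """
    half = kernel_width // 2
    width = len(mask[0])
    dilated = []
    for row in mask:
        new_row = []
        for x in range(width):
            window = row[max(0, x - half): x + half + 1]
            new_row.append(1 if any(window) else 0)
        dilated.append(new_row)
    return dilated


def count_runs(row):
    """Count runs of consecutive foreground pixels in a single row."""
    return sum(1 for x, v in enumerate(row) if v and (x == 0 or not row[x - 1]))


# Hypothetical scanline: two "letters" 2 pixels apart, then a 6-pixel gap
# before the next "word".
row = [1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1]
merged = dilate_horizontal([row], kernel_width=3)[0]

print(count_runs(row), "runs before dilation")
print(count_runs(merged), "runs after dilation")
```

With a width of 3, each stroke grows by one pixel on either side, so the 2-pixel gap closes while the 6-pixel gap survives. A wider kernel would start merging neighbouring words as well, which is why the width has to be tuned.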

Step 2: Contour Detection

After dilation, we run OpenCV's contour detection to find all connected regions in the binarised image. We filter contours by aspect ratio and area — text blocks have characteristic proportions that distinguish them from icons, decorative elements, and noise artifacts.
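
The sketch below shows the find-then-filter idea in plain Python. It swaps OpenCV's contour detection for a simple 4-connected flood fill that returns bounding boxes directly (in OpenCV the usual calls would be cv2.findContours followed by cv2.boundingRect), and it measures bounding-box area rather than true contour area. The thresholds min_area and min_aspect are made-up values for illustration, not the ones FontFinder uses.

```python
from collections import deque


def component_boxes(mask):
    """Return (x, y, w, h) bounding boxes of 4-connected foreground regions."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            min_x = max_x = x0
            min_y = max_y = y0
            while queue:
                x, y = queue.popleft()
                min_x, max_x = min(min_x, x), max(max_x, x)
                min_y, max_y = min(min_y, y), max(max_y, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((min_x, min_y, max_x - min_x + 1, max_y - min_y + 1))
    return boxes


def looks_like_text(box, min_area=8, min_aspect=1.5):
    """Keep boxes that are large enough and clearly wider than tall.

    Both thresholds are hypothetical and would need tuning on real images.
    """
    _, _, w, h = box
    return w * h >= min_area and w / h >= min_aspect


# Hypothetical dilated mask: a wide text blob, a small square icon, a speck.
mask = [[0] * 12 for _ in range(6)]
for y in (1, 2):
    for x in range(1, 9):
        mask[y][x] = 1          # 8 x 2 text blob
for y in (4, 5):
    for x in (1, 2):
        mask[y][x] = 1          # 2 x 2 icon
mask[4][10] = 1                 # single-pixel noise

boxes = component_boxes(mask)
text_boxes = [b for b in boxes if looks_like_text(b)]
print(text_boxes)
```

Only the wide blob passes both checks: the icon fails on area and aspect ratio, and the speck fails on area.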

Step 3: Bounding Box Selection

From the filtered contours, we compute bounding boxes and score them based on:

Area — larger text regions are preferred
Aspect ratio — text is typically wider than tall
Edge density — text has high edge density (many character strokes)
Position — central regions are preferred over edge regions
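
One way the four criteria above could be combined is a weighted sum of normalised terms. The sketch below is an assumption about the general shape of such a score, not the formula FontFinder uses: the equal weights, the aspect-ratio cap of 5, the normalisations, and the precomputed edge_density input (a fraction between 0 and 1) are all hypothetical.

```python
import math


def score_box(box, image_size, edge_density, weights=(0.25, 0.25, 0.25, 0.25)):
    """Score a candidate text box; higher is better. All terms lie in [0, 1]."""
    x, y, w, h = box
    img_w, img_h = image_size

    # Area: fraction of the image covered by the box.
    area = (w * h) / (img_w * img_h)

    # Aspect ratio: reward wide boxes, saturating at 5:1 (hypothetical cap).
    aspect = min(w / h, 5.0) / 5.0

    # Position: 1 at the image centre, falling to 0 at the corners.
    cx, cy = x + w / 2, y + h / 2
    dist = math.hypot(cx - img_w / 2, cy - img_h / 2)
    position = 1.0 - dist / math.hypot(img_w / 2, img_h / 2)

    terms = (area, aspect, edge_density, position)
    return sum(wt * term for wt, term in zip(weights, terms))


# Hypothetical candidates in a 400 x 200 image: (box, edge density).
image_size = (400, 200)
candidates = [
    ((100, 70, 200, 60), 0.40),   # centred wordmark
    ((10, 10, 50, 50), 0.15),     # square icon in the corner
]

best_box, best_density = max(
    candidates, key=lambda c: score_box(c[0], image_size, c[1])
)
print(best_box)
```

Here the centred wordmark outscores the corner icon on every term. In practice the weights would be tuned against labelled examples, since a large decorative banner can otherwise beat a small but relevant line of text.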

Step 4: Padding and Letterboxing

After cropping to the text bounding box, we add a small padding margin (10% of each dimension) to ensure no character strokes are cut off. Then we resize to our model's input dimensions (224×224) using letterbox scaling — adding white borders rather than distorting the aspect ratio.
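
The geometry of this step can be sketched in a few lines. The code assumes "10% of each dimension" means 10% of the width is added on the left and right and 10% of the height on the top and bottom, which is one reading of the description. The function names are hypothetical, and the pixel operations themselves are left out; with OpenCV they would typically be cv2.resize followed by cv2.copyMakeBorder with a white fill.

```python
def pad_box(box, image_size, pad_frac=0.10):
    """Grow an (x, y, w, h) box by pad_frac per side, clamped to the image."""
    x, y, w, h = box
    img_w, img_h = image_size
    pad_x, pad_y = round(w * pad_frac), round(h * pad_frac)
    left, top = max(0, x - pad_x), max(0, y - pad_y)
    right, bottom = min(img_w, x + w + pad_x), min(img_h, y + h + pad_y)
    return left, top, right - left, bottom - top


def letterbox_layout(w, h, target=224):
    """Return (new_w, new_h, offset_x, offset_y) for a target x target canvas.

    The crop is scaled uniformly so its longer side equals target, then
    centred; the remaining canvas is border (white, in the article's case).
    """
    scale = target / max(w, h)
    new_w, new_h = max(1, round(w * scale)), max(1, round(h * scale))
    offset_x, offset_y = (target - new_w) // 2, (target - new_h) // 2
    return new_w, new_h, offset_x, offset_y


# Hypothetical text box inside a 400 x 200 image.
padded = pad_box((100, 70, 200, 60), image_size=(400, 200))
layout = letterbox_layout(padded[2], padded[3])
print(padded)   # (80, 64, 240, 72)
print(layout)   # (224, 67, 0, 78)
```

In this example the 240×72 padded crop is scaled to 224×67 and placed 78 pixels from the top of the canvas, so the letter shapes keep their proportions instead of being stretched to a square.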

Manual Override: The Crop Tool

Automatic cropping works for most images, but some logos and designs are complex enough to fool our algorithm. That's why FontFinder also gives users a manual crop tool — draw a box around exactly the text you want identified, and that region goes directly to the model, bypassing the automatic cropping entirely.