Automatic visual classification of very similar instances, a.k.a. fine-grained classification, is the problem of assigning images to classes where all instances differ only by minute details. Examples include flower types or specific bird species. In this project, we propose to use recognized scene texts images to aid in fine-grained classification.
When text is present in natural scenes, the text is typically there to give semantic meaning beyond what is obvious from exclusively visual cues. For instance, in figure above, the left and middle images share a very similar scene layout. However, if one wants to group these images based on the semantics of the scenes, the middle and the right images belong together because they share the same business name ("Starbucks").
In this project, we focus on fine-grained classification of Buildings into their sub-classes such as Cafe, Tavern, Diner, etc. by detecting scene text in images. We introduce a text detection method that does not try to detect all possible foreground text regions but instead aims to reconstruct the scene background to eliminate non-text regions. Object cues such as color, contrast, and objectiveness are used in corporation with a random forest classifier to detect background pixels in the scene.
Con-Text dataset is built from sub-categories of the ImageNet "building" and "place of business" sets to evaluate fine-grained classification. The dataset consists of 28 categories with 24,255 images in total. Note that this dataset is not specifically build for text recognition and thus not all the images have text in them. Moreover, high variability of text size, location, resolution and style and, uncontrolled environmental settings ( illumination ) make text recognition from this dataset harder.
@InProceedings{Karaoglu17,
author = "Sezer Karaoglu and Ran Tao and Jan van Gemert and Theo Gevers",
title = "Con-Text: Text Detection for Fine-grained Object Classification",
booktitle = "IEEE Transactions on Image Processing (TIP)",
year = "2017"
}
Sezer Karaoglu is supported by Dutch national program COMMIT.
For the questions about the dataset please contact Sezer Karaoglu and Jan C. van Gemert, (s.karaoglu[at]uva.nl and j.c.vangemert[at]uva.nl)