Logical Structure Detection for Heterogeneous Document
Classes
Leon Todoran, Marco Aiello, Christof Monz, and Marcel Worring
In: Proceedings of Document Recognition and Retrieval VIII,
SPIE, 2001, pp. 99-110
We present a fully implemented system based on generic document
knowledge for detecting the logical structure of documents for which
only general layout information is assumed. In particular, we focus on
detecting the reading order. Our system integrates components based on
computer vision, artificial intelligence, and natural language
processing techniques. The prominent feature of our framework is its
ability to handle documents from heterogeneous collections. The system
has been evaluated on a standard collection of documents to measure
the quality of the reading order detection. Experimental results for
each component and the system as a whole are presented and discussed
in detail. The performance of the system is promising, especially when
considering the diversity of the document collection.
@InProceedings{todo:01logi,
  author    = {Todoran, L. and Aiello, M. and Monz, C. and Worring, M.},
  title     = {Logical Structure Detection for Heterogeneous Document Classes},
  booktitle = {Proceedings of Document Recognition and Retrieval {VIII}},
  pages     = {99--110},
  year      = 2001,
  publisher = {SPIE}
}