Border Noise Removal of Camera-Captured Document Images using Page-Frame Detection

Syed Saqib Bukhari, Faisal Shafait, Thomas Breuel

4th International Workshop on Camera-Based Document Analysis and Recognition (CBDAR-11), September 22, Beijing, China, Springer, 2011
Camera-captured document images usually contain two main types of marginal noise: textual noise (coming from neighboring pages) and non-textual noise (resulting from the page surrounding and/or binarization process). These types of marginal noise degrade the performance of the preprocessing (dewarping) of camera-captured document images and subsequent document digitization/recognition processes. Page frame detection is one of the newly investigated areas in document image processing, which is used to remove border noise and to identify the actual content area of document images. In this paper, we present a new technique for page frame detection of camera-captured document images. We use text and nontext contents information to find the page frame of document images. We evaluate our algorithm on the DFKI-I (CBDAR 2007 Dewarping Contest) dataset. Experimental results show the effectiveness of our method in comparison to other state-of-the-art page frame detection approaches.

BibTeX:

@inproceedings{bukhari2011border,
       abstract = {Camera-captured document images usually contain
two main types of marginal noise: textual noise (coming
from neighboring pages) and non-textual noise (resulting from
the page surrounding and/or binarization process). These types
of marginal noise degrade the performance of the preprocessing
(dewarping) of camera-captured document images and subsequent
document digitization/recognition processes. Page frame
detection is one of the newly investigated areas in document
image processing, which is used to remove border noise and
to identify the actual content area of document images. In this
paper, we present a new technique for page frame detection
of camera-captured document images. We use text and nontext
contents information to find the page frame of document
images. We evaluate our algorithm on the DFKI-I (CBDAR
2007 Dewarping Contest) dataset. Experimental results show
the effectiveness of our method in comparison to other
state-of-the-art page frame detection approaches.},
       month = {9}, 
       year = {2011}, 
       title = {Border Noise Removal of Camera-Captured Document Images using Page-Frame Detection}, 
       booktitle = {4th International Workshop on Camera-Based Document Analysis and Recognition (CBDAR-11)}, 
       address = {Beijing, China}, 
       publisher = {Springer}, 
       author = {Syed Saqib Bukhari and Faisal Shafait and Thomas Breuel}, 
       keywords = {Border Noise Removal, Page Frame Detection, Camera-Captured Document Images},
       url = {http://www.dfki.de/web/forschung/publikationen/renameFileForDownload?filename=Bukhari-Page-Frame-CBDAR11.pdf&file_id=uploads_1187}
}