Tools and Metrics for Document Analysis Systems Evaluation

Any development within a specific research field such as document analysis and recognition brings with it the need for data and for corresponding measurement and test tools. This chapter introduces the basic issues of evaluation methods for different kinds of document analysis systems and modules, with special emphasis on the tools and metrics available and in use today.

This chapter is organized as follows: After an introduction in section “Introduction,” which defines the terms used in document analysis system evaluation and gives an overview of evaluation processes, different evaluation metrics are discussed in section “Evaluation Metrics.” These metrics cover the different aspects of document analysis presented in Chaps. 2 (Document Creation, Image Acquisition and Document Quality) to 8 (Text Segmentation for Document Recognition) of this Handbook, ranging from image-processing evaluation metrics to specialized metrics for selected applications, e.g., character/text recognition. Section “Evaluation Tools” presents an overview of ground-truth file structures and a selection of available ground-truthing tools, and also lists performance evaluation tools and competitions organized in recent years.
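
As a concrete illustration of the character/text recognition metrics discussed in section “Evaluation Metrics,” the short Python sketch below computes an edit-distance-based character accuracy, a formulation commonly used in OCR benchmarking. It is a minimal example only; the function names and sample strings are illustrative and are not taken from this chapter or from any particular evaluation tool.

    # Minimal sketch: edit-distance-based character accuracy for OCR output.
    # Accuracy = (n - #edit errors) / n, where n is the length of the ground-truth
    # text; the value can become negative if the recognition result contains more
    # errors than there are reference characters.

    def edit_distance(ref: str, hyp: str) -> int:
        """Levenshtein distance between a reference and a hypothesis string."""
        prev = list(range(len(hyp) + 1))
        for i, r in enumerate(ref, start=1):
            curr = [i]
            for j, h in enumerate(hyp, start=1):
                cost = 0 if r == h else 1
                curr.append(min(prev[j] + 1,          # deletion
                                curr[j - 1] + 1,      # insertion
                                prev[j - 1] + cost))  # substitution
            prev = curr
        return prev[-1]

    def character_accuracy(ground_truth: str, recognized: str) -> float:
        """Character accuracy of a recognition result against its ground truth."""
        n = len(ground_truth)
        errors = edit_distance(ground_truth, recognized)
        return (n - errors) / n if n else 1.0

    if __name__ == "__main__":
        gt  = "Document Analysis and Recognition"
        ocr = "Docurnent Analysis and Recognltion"
        print(f"character accuracy: {character_accuracy(gt, ocr):.3f}")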

Further Reading

Many books on performance evaluation and benchmarking are available, especially on the benchmarking of computer systems, but none is devoted specifically to the evaluation of document analysis methods. Chapters on the evaluation of methods can, however, be found in most books on image processing, pattern recognition, and document analysis. Readers interested in modern evaluation and benchmarking methods in general may find more details in the following recently published books:

  1. Madhavan R, Tunstel E, Messina E (eds) (2009) Performance evaluation and benchmarking of intelligent systems. Springer, New York
  2. Obaidat MS, Boudriga NA (2010) Fundamentals of performance evaluation of computer and telecommunication systems. Wiley, Hoboken