CFP

DALL 2026: The Second International Workshop on Document Analysis of Low-Resource Languages

Vienna, Austria, September 3, 2026

Conference web page	https://icdar-dall.github.io/
Submission link	https://easychair.org/conferences/?conf=dall2026
Submission deadline	May 30, 2026

Topics: optical character recognition, document image processing, layout analysis, natural language processing

Document analysis for low-resource languages matters for several reasons, spanning cultural preservation, data scarcity, linguistic research, and technological applications. First, low-resource languages often embody unique cultural and historical contexts. Document analysis facilitates the digitization and preservation of these linguistic materials, providing crucial resources for understanding human history and cultural evolution. For instance, large collections of scanned documents exist for many endangered languages, and these can be analyzed to create valuable linguistic and cultural repositories. Second, low-resource languages typically lack large-scale annotated datasets, which makes training machine learning models difficult. Document analysis techniques, such as Optical Character Recognition (OCR) and document layout analysis, enable the extraction and structuring of data from existing documents, thereby mitigating data scarcity. Third, document analysis plays a pivotal role in improving machine translation: monolingual data extracted through OCR can be used to improve translation for low-resource languages, which is particularly important for languages with limited parallel corpora. Fourth, document analysis supports linguistic research by enabling the study of language variation and historical documentation, shedding light on the evolution and unique features of these languages. Finally, document analysis makes low-resource language documents more accessible and usable. For example, advances in OCR systems for non-Latin scripts allow researchers to extract text more efficiently from scanned documents, enabling applications such as content summarization and information retrieval. In summary, low-resource document analysis is not only a vital tool for cultural preservation but also a key driver of language technology development and academic research.

Submission Guidelines

This workshop invites original contributions in both theoretical and applied research domains. All submissions must adhere to the formatting guidelines specified on the ICDAR 2026 official website.

Manuscripts that do not adhere to the page limit (maximum 17 pages, everything included), the formatting guidelines (Springer Lecture Notes format), or the anonymization requirements will be rejected without review.
The authorship of submitted manuscripts is final. No changes to the list of authors will be allowed once a paper has been submitted!
The review of conference papers will be double-blind. Authors should not include their names, affiliations, or acknowledgements in submitted manuscripts, and should cite their own earlier work in the third person so that their identity is not revealed indirectly. Authors will be given the opportunity to respond to reviews in a rebuttal phase.
Submitted papers must adhere to the Springer Lecture Notes in Computer Science (LNCS) format; see the Springer guidelines and templates.
Submissions will be accepted through the workshop's EasyChair submission portal. At least one author of each accepted paper must complete workshop registration to present the work. Detailed submission procedures are available on the ICDAR 2026 official website.

Contact

yongtso@163.com

List of Topics

Document Image Processing for Low-Resource Languages
Optical Character Recognition (OCR) for Printed Text in Low-Resource Languages
Logical Layout Analysis for Low-Resource Languages
Handwritten Text Recognition (HTR) for Manuscripts and Historical Documents
Natural Language Processing for Understanding Documents Written in Low-Resource Languages
Scene Text Detection and Recognition for Low-Resource Languages
Gold-Standard Benchmarks and Datasets for Low-Resource Languages
Document Analysis Systems for Low-Resource Languages
Multimodal Large Language Models (MLLMs) for Low-Resource Document Understanding
Large Language Model (LLM)-Driven Document Analysis Technologies for Low-Resource Languages

Organizing Committee

Yong,Tso, Xizang University, China
Brian Kenji Iwana, Kyushu University, Japan
Yu,Yongbin, University of Electronic Science and Technology, China

Publication

Accepted papers will be published in the ICDAR 2026 workshop proceedings.

Venue