DALL 2026: The Second International Workshop on Document Analysis of Low-Resource Languages
Vienna, Austria, September 3, 2026
| Conference website | https://icdar-dall.github.io/ |
| Submission link | https://easychair.org/conferences/?conf=dall2026 |
| Submission deadline | May 1, 2026 |
Document analysis for low-resource languages matters for several reasons, spanning cultural preservation, data scarcity, linguistic research, and technological applications.

First, low-resource languages often embody unique cultural and historical contexts. Document analysis facilitates the digitization and preservation of these linguistic materials, providing crucial resources for understanding human history and cultural evolution. For instance, large collections of scanned documents exist for many endangered languages and can be analyzed to create valuable linguistic and cultural repositories.

Second, low-resource languages typically lack large-scale annotated datasets, which makes training machine learning models difficult. Document analysis techniques such as Optical Character Recognition (OCR) and document layout analysis enable the extraction and structuring of data from existing documents, thereby mitigating data scarcity. Document analysis also helps improve machine translation: monolingual data extracted through OCR can be used to improve machine translation for low-resource languages, which is particularly critical for languages with limited parallel corpora.

Third, document analysis supports linguistic research by enabling the study of language variation and historical documentation, shedding light on the evolution and unique features of these languages.

Finally, document analysis enhances the accessibility and usability of low-resource language documents. For example, advances in OCR systems for non-Latin scripts allow researchers to extract text more efficiently from scanned documents, enabling applications such as content summarization and information retrieval.

In summary, low-resource document analysis is not only a vital tool for cultural preservation but also a key driver of language technology development and academic research.
Submission Guidelines
This workshop invites original contributions in both theoretical and applied research domains. All submissions must adhere to the formatting guidelines specified on the ICDAR 2026 official website.
- Manuscripts not adhering to the page limit (maximum 17 pages all included), the formatting guidelines (Springer Lecture Notes format) or the anonymization requirements will be rejected without review.
- The authorship of submitted manuscripts is final. No changes to the list of authors will be allowed once a paper has been submitted!
- The review of conference papers will be double blind. Authors should not include their names, affiliations, or acknowledgements in submitted manuscripts, and should cite their own earlier work in the third person so that their identity is not revealed indirectly. Authors will be given the opportunity to respond to reviews in a rebuttal phase.
- Submitted papers must adhere to the Springer Lecture Notes in Computer Science (LNCS) format (Springer Guidelines/Templates).
Submissions will be accepted through the workshop's EasyChair submission portal. At least one author of each accepted paper must complete workshop registration to present the work. Detailed submission procedures are available on the ICDAR 2026 official website.
Contact
- yongtso@163.com
List of Topics
- Document Image Processing for Low-Resource Languages
- Optical Character Recognition (OCR) for Printed Text in Low-Resource Languages
- Logical Layout Analysis for Low-Resource Languages
- Handwriting Text Recognition (HTR) for Manuscripts and Historical Documents
- Natural Language Processing for Understanding Documents Written in Low-Resource Languages
- Scene Text Detection and Recognition for Low-Resource Languages
- Gold-Standard Benchmarks and Datasets for Low-Resource Languages
- Document Analysis Systems for Low-Resource Languages
- Multimodal Large Language Models (MLLMs) for Low-Resource Document Understanding
- Large Language Model (LLM)-Driven Document Analysis Technologies for Low-Resource Languages
Organizing Committee
- Yong Tso, Xizang University, China
- Brian Kenji Iwana, Kyushu University, Japan
- Yu Yongbin, University of Electronic Science and Technology, China
Publication
Accepted papers will be published in the ICDAR 2026 workshop proceedings.
Venue
The workshop will be held in Vienna, Austria, on September 3, 2026.
