This book provides readers with a brief account of the history of Language Identification (LI) research and a survey of the features and methods most used in LI literature. LI is the problem of determining the language in which a document is written and is a crucial part of many text processing pipelines. The authors use a unified notation to clarify the relationships between common LI methods. The book introduces LI performance evaluation methods and takes a detailed look at LI-related shared tasks. The authors identify open issues, discuss the applications of LI and related tasks, and propose future directions for research in LI.
ISBN: | 9783031458217 |
Publication date: | 3rd January 2024 |
Author: | Tommi Jauhiainen, Marcos Zampieri, Timothy J Baldwin, Krister Lindén |
Publisher: | Springer, an imprint of Springer International Publishing |
Format: | Hardback |
Pagination: | 148 pages |
Series: | Synthesis Lectures on Human Language Technologies |
Genres: | Natural language and machine translation; Artificial intelligence; Computational and corpus linguistics; Probability and statistics; Applied computing; Computer science |