A diffusion-based model of language learning and interlingual distance
https://doi.org/10.25587/2310-5453-2025-2-67-74
Abstract
Understanding the process of language learning and quantifying interlingual relationships are central challenges in linguistics, cognitive science, and language education. In this paper, we propose a novel framework that models second language acquisition as a diffusion process within a structured, multidimensional space of languages. We introduce a formal measure of interlingual distance, grounded in linguistic features, to quantify structural and functional differences between languages. Building on Barenblatt-type nonlinear diffusion models, we represent language learning as a multicontinua diffusion process in which distinct components of language (phonetics, grammar, vocabulary, and pragmatics) are treated as separate, interacting continua. Each continuum evolves according to its own diffusion dynamics, capturing the heterogeneous difficulty and pace of learning across linguistic subsystems, while the interaction between the continua reflects the coupling between linguistic competencies in real-world acquisition. We can validate this model with empirical data on second language learning rates across various language pairs, demonstrating that diffusion distances in each continuum correlate with observed learning difficulties in the corresponding language domain. This approach offers a new theoretical lens on language learning and provides a predictive framework for curriculum design, learner modeling, and applications in multilingual NLP and AI systems.
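To make the multicontinua idea concrete, the following is a minimal illustrative sketch, not the authors' scheme. It assumes a one-dimensional "language axis" on [0, 1], with the learner's native language at x = 0 and the target language at x = 1, and it uses linear diffusion with a simple pairwise exchange term in the spirit of Barenblatt-type dual-continuum models, whereas the paper builds on nonlinear variants. The function name `simulate`, the diffusivities `k`, the exchange coefficient `r`, and the boundary conditions are all hypothetical choices made for illustration.

```python
def simulate(k, r, n=21, dt=1e-4, steps=5000):
    """Explicit finite-difference sketch of coupled linear diffusion in 1D.

    k     : list of diffusivities, one per continuum (hypothetical values)
    r     : exchange coefficient coupling every pair of continua (assumed)
    n     : number of grid nodes on the language axis [0, 1]
    dt    : time step
    steps : number of time steps
    Returns u, where u[a][i] is the value of continuum a at node i.
    """
    h = 1.0 / (n - 1)
    m = len(k)
    # Sufficient condition for the explicit scheme to stay stable and bounded.
    if dt * (2.0 * max(k) / h**2 + r * (m - 1)) > 1.0:
        raise ValueError("time step too large for the explicit scheme")

    # Start with no competence anywhere except at the native-language end.
    u = [[0.0] * n for _ in range(m)]
    for a in range(m):
        u[a][0] = 1.0  # assumed Dirichlet condition at x = 0

    for _ in range(steps):
        new = [row[:] for row in u]
        for a in range(m):
            for i in range(1, n):
                left = u[a][i - 1]
                # Mirror node gives a zero-flux condition at x = 1 (assumed).
                right = u[a][i + 1] if i < n - 1 else u[a][i - 1]
                laplacian = (left - 2.0 * u[a][i] + right) / h**2
                # Exchange term: each continuum is pulled toward the others.
                exchange = sum(u[b][i] - u[a][i] for b in range(m) if b != a)
                new[a][i] = u[a][i] + dt * (k[a] * laplacian + r * exchange)
        u = new
    return u


if __name__ == "__main__":
    # Two hypothetical continua: a fast one and a slow one.
    labels = ["fast continuum", "slow continuum"]
    for r in (0.0, 1.0):
        u = simulate([1.0, 0.1], r=r)
        for name, row in zip(labels, u):
            print(f"r={r}: {name} at target end = {row[-1]:.3f}")
```

In this toy setting, a larger diffusivity lets a continuum reach the target end sooner, and a nonzero `r` lets the slower continuum benefit from the faster one. How the actual model defines the space of languages, the distance measure, and the coupling is specified in the paper itself and is not reproduced here.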
About the Authors
A. V. Grigorev
Russian Federation
Aleksandr V. Grigorev – Cand. Sci. (Physics and Mathematics), Associate Professor, Institute of Mathematics and Information Science, Scientific Research Department “Computing Technologies”
Yakutsk
Researcher ID: H-7502-2016
Scopus Author ID: 57194029133
Elibrary AuthorID: 7855-8090
Z. Guo
China
Zhenwei Guo – Cand. Sci. (Physics and Mathematics), Teacher, School of Mathematical Sciences
Liaocheng
Researcher ID: GQO-9442-2022
Scopus Author ID: 57215305659
References
1. Dörnyei Z. The Psychology of Second Language Acquisition. Oxford: Oxford University Press; 2009.
2. Ellis NC. Selective attention and transfer phenomena in SLA: Contingency, cue competition, salience, interference, overshadowing, blocking, and perceptual learning. Applied Linguistics. 2006;27(2):164-194. DOI: 10.1093/applin/aml015
3. Vabishchevich PN, Grigoriev AV. Numerical modeling of fluid flow in anisotropic fractured porous media. Numerical Analysis and Applications. 2016;9(1):45-56. DOI: 10.1134/S1995423916010055
4. Grigorev AV, Vabishchevich PN. Two-level approach for numerical modeling of blood flow in the liver lobule. Journal of Numerical Analysis, Industrial and Applied Mathematics. 2022;16(1-2):15-28.
5. Barenblatt GI, Zheltov IP, Kochina IN. Basic concepts in the theory of seepage of homogeneous liquids in fissured rocks. Journal of Applied Mathematics and Mechanics. 1960;24(5):1286-1303. DOI: 10.1016/0021-8928(60)90107-6
6. Rabinovich M, Ordan N, Wintner S. Found in translation: Reconstructing phylogenetic language trees from translated texts. Transactions of the Association for Computational Linguistics. 2017;5:169-182. DOI: 10.1162/tacl_a_00058
7. Devlin J, Chang MW, Lee K, Toutanova K. BERT: Pre-training of deep bidirectional transformers for language understanding. Proceedings of NAACL-HLT. 2019;4171-4186. DOI: 10.18653/v1/N19-1423
8. Artetxe M, Schwenk H. Massively multilingual sentence embeddings for zero-shot cross-lingual transfer and beyond. Transactions of the Association for Computational Linguistics. 2019;7:597-610. DOI: 10.1162/tacl_a_00288
9. Conneau A, Khandelwal K, Goyal N, et al. Unsupervised cross-lingual representation learning at scale. Proceedings of ACL. 2020;8440-8451. DOI: 10.18653/v1/2020.acl-main.747
10. Lake BM, Baroni M. Generalization without systematicity: On the compositional skills of sequence-to-sequence recurrent networks. Proceedings of ICML. 2018;80:2873-2882. Available at: https://proceedings.mlr.press/v80/lake18a.html
Review
For citations:
Grigorev A.V., Guo Z. A diffusion-based model of language learning and interlingual distance. Arctic XXI Century. 2025;(2):67-74. https://doi.org/10.25587/2310-5453-2025-2-67-74