Tajik language breaks new ground with first AI model launch

The first AI model built specifically for the Tajik language and its regional dialects has been officially unveiled, Trend reports, citing the developer, zehnlab.ai.

The new model, called Soro, marks a significant step forward in making artificial intelligence more inclusive for underrepresented languages.

According to the developers, Tajik has historically been underrepresented in the training data of major large language models such as GPT and Llama. Soro was created to change that by focusing on the linguistic richness of Tajik, including its non-standard syntax, dialectal variations, and unique patterns of real-life usage.

The team behind Soro says the model is designed not just to answer simple queries, but to understand and reflect the cultural and linguistic diversity of the people of Tajikistan. Multimodal capabilities are still under development, and Soro aims to bridge the gap between technology and local language communities.

Meanwhile, on June 25, President of Tajikistan Emomali Rahmon inaugurated the Darya AI Artificial Intelligence Management Center in the Darvoz district. The center was established as part of the national initiative declaring 2025–2030 the Years of Development of the Digital Economy and Innovation.

Source: en.trend.az