ANKARA: In the misty mountains and coastal cities of Turkey’s eastern Black Sea, the Laz language, which once echoed through the lanes of small villages, now lingers mainly in the quiet homes of great-grandparents, and activists are turning to AI in a bid to stop it from vanishing.
The Istanbul-based Laz Institute has been seeking Laz volunteers to train AI in the language through Mozilla’s Common Voice program, creating a digital archive that could preserve a language increasingly under threat of extinction. Experts, however, caution that other methods, including education, institutional support and community engagement, are needed to keep the Laz language alive.
Laz, a South Caucasian language, is spoken primarily in Turkey’s northeastern Black Sea provinces of Rize, President Recep Tayyip Erdogan’s hometown, and Artvin, close to the border with Georgia. To a lesser extent, it is also spoken in Turkey’s northwestern Marmara region and in parts of southwest Georgia.
Laz people in Turkey are often associated with qualities long woven into the country’s folklore and jokes: short tempers, candidness, and sometimes a tendency to overthink. Turkey’s well-known comedian Cem Yilmaz offers a glimpse of this temperament in a personal anecdote about a Black Sea native from a region whose people, whether Laz or not, share many of the same traits: when asked for directions to the airport, the local paused and deadpanned, “Airport? You mean the one … for airplanes?”
A fading language
Beyond these playful stereotypes, ethnic Laz people confront a more serious reality: They are striving to assert their identity as a distinct group with its own language, traditions and customs. UNESCO classifies the Laz language as “endangered,” citing shrinking daily use, weakened transmission to children and its lack of official status in either Turkey or Georgia.
As Turkey does not collect official data on ethnic populations, the exact number of Laz people living in the country is unknown, though estimates place it close to 600,000. Of them, only half are believed to actively speak Laz.
According to Irfan Cagatay, the author of “The Laz in the Late Ottoman Empire 1877-1923,” one of the most comprehensive studies on Laz identity in Turkey, the decline of the language accelerated after the founding of the modern Turkish Republic, as nation-building efforts discouraged people from speaking Laz and other minority languages. In 1924, the country established Turkish as the official language of instruction, sidelining the teaching of other Muslim-minority languages in schools.
“Turkish authorities prioritized the language as a marker of unity, extending government services, schools and public life in Turkish,” Cagatay told Al-Monitor.
Urban integration brought Laz communities further into Turkish society as a whole, Cagatay added. Long isolated in the steep mountains of the Black Sea, residents had less incentive to use Laz in daily life, and more reason to speak Turkish.
“Eventually, Laz became a language spoken only in grandparents’ and even great-grandparents’ homes,” he added.
Teaching Laz to AI
As Laz gradually retreated from daily life, activists sought new ways to preserve it, and the Laz Institute has been at the forefront of those efforts for more than a decade.
The institute’s most recent initiative turns to AI to create digital archives of the language. The institute has been working for two years to integrate Laz into Mozilla’s Common Voice program, an open-source platform that trains voice recognition technology using recordings submitted by ordinary speakers.
Photo taken during the Laz language training for teachers in 2022 (Laz Institute)
“This project is important not only for increasing Laz’s visibility and raising awareness about the language,” Ismail Avci, the director of the Laz Institute, told Al-Monitor, “but also for building a serious repository of data.”
Mozilla’s Common Voice program, launched in 2017, initially aimed to build a diverse open voice dataset that anyone could use for voice recognition technologies, according to the program’s website. Over time, it expanded to support endangered and underrepresented languages. As of 2025, Common Voice hosts recordings in over 140 languages, many of which are low-resource or endangered, including Yauyos Quechua from Peru, Tush from the North Caucasus and Bebele from Cameroon, Mozilla’s Open Multilingual Speech Fund noted on its website.
Through Common Voice, volunteers read short sentences aloud and others verify the clips; the validated audio then becomes part of a public database for use by developers and researchers.
Hoping that bringing Laz into this ecosystem will secure its presence in the digital world, the institute has prepared nearly 10,000 sentences in Laz for inclusion in the database, about half of which have already been uploaded, Avci said.
Now, the project needs Laz-speaking volunteers who can record sentences. In his column for Turkey’s independent news platform Bianet, Avci called on Laz speakers to volunteer for the initiative.
“Even by the most conservative estimate, we need around 250 volunteers, but realistically closer to 1,000,” Avci told Al-Monitor. He hopes that the project will strengthen community ties and spark a broad mobilization among Laz speakers.
The project does not have a set timeline, and its pace will depend largely on how many volunteers come forward.
Laz’s journey: Margins to mainstream
Turkey has long treated minority languages as if they threatened to erode national unity, especially amid tensions with the Kurdish population and its demands for cultural and linguistic rights. The mood began to change in the late 1990s, when reforms aimed at advancing Turkey’s accession negotiations with the European Union opened a narrow but meaningful space for cultural expression. Musician Kazim Koyuncu, a native of Artvin, brought the language into the mainstream through his Laz album, which gave the language its first true national audience in 1995.
A larger shift arrived during the 2012-2015 peace talks between Ankara and the Kurdistan Workers Party, which has waged an armed insurgency for Kurdish self-rule since 1984. As part of the opening, the Erdogan government enacted political reforms allowing elective Kurdish and other minority-language courses in public schools, a small but symbolic gesture that acknowledged Turkey’s multilingual reality.
The Laz Institute, founded in 2013, took advantage of that opening, training teachers and assembling the materials required to bring Laz into the classroom.
Yet early enthusiasm for the Laz-language elective classes in Rize and Artvin eventually fizzled, Avci said. Amid a lack of incentives to learn Laz and dwindling institutional support, as minority-language courses were seen as costly for the state, the Education Ministry began adding other electives to its curriculum, positioning them as more practical alternatives to minority-language classes.
AI is no real cure
Against this backdrop, teaching Laz to machines is, at best, a modest line of defense for a language under threat.
“Of course, this initiative won’t remove Laz from danger of disappearance,” Avci acknowledged.
“Having AI learn Laz means the language survives in the digital world, as long as that world exists, even if it faces the risk of disappearing in daily life,” he said.
Many endangered languages, including Laz, suffer from scarce documentation, which can make AI training unreliable, according to an MIT Technology Review article published in September.
Machine learning models thrive on large, high-quality annotated datasets, and for low-resource languages, there often aren’t enough written texts, recordings or verified translations to train the models properly, the article noted.
Ultimately, preserving Laz will require more than AI or isolated classroom efforts, both experts agreed.
“Laz needs positive discrimination,” Cagatay said, such as mandatory Laz classes in schools in predominantly Laz towns, or requiring knowledge of Laz for certain civil-service positions in those areas.
