Improving Access to Hyperlocal Information in Rural Communities

A team of researchers and students from the International Institute of Information Technology-Bangalore (IIIT-B) has developed a search interface that makes hyperlocal information easier to access for rural communities. The initiative targets a gap faced by people living outside urban areas, who have far more limited access to information than city dwellers.

Rural communities often rely on community radio, local newspapers, and volunteer organizations for information specific to their locality. However, the knowledge these sources produce is usually local in scope and not available on the internet, which makes it difficult for people to retrieve it later. Language barriers hinder access further.

To address these challenges, the IIIT-B team developed a search interface called Graama-Kannada Audio Search, specifically designed for colloquial audio content in the Kannada language. The interface allows users to search for and access hyperlocal information from the Tumakuru region in audio format. The team collaborated with Namma Halli Radio, a community-owned Wi-Fi mesh radio run by Janastu NGO in the Tumakuru region, to incorporate their audio corpus into the search model.

Using automatic speech recognition (ASR) models, the audio from Namma Halli Radio was transcribed into text. When a user searches for a specific keyword, the transcribed text is matched to deliver results in audio format. The interface supports searches in both Kannada and English, and the audio results are timestamped to pinpoint the exact location of the keyword.
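The lookup described above can be illustrated with a minimal sketch. It assumes the ASR output is stored as timestamped text segments. The `Segment` type, its field names, and the `search_segments` function are hypothetical and are not taken from the IIIT-B application. It only shows how a keyword match in a transcript can point back to a position in the audio.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed stretch of audio (hypothetical structure)."""
    audio_file: str   # file the segment was transcribed from
    start: float      # segment start, in seconds
    end: float        # segment end, in seconds
    text: str         # ASR transcript of the segment


def search_segments(segments, keyword):
    """Return every segment whose transcript contains the keyword.

    Matching is a case-insensitive substring test, so it works for
    English and for scripts without letter case, such as Kannada.
    """
    needle = keyword.casefold()
    return [s for s in segments if needle in s.text.casefold()]


# Illustrative data only; not from the Namma Halli Radio corpus.
segments = [
    Segment("episode_01.mp3", 0.0, 12.5, "Welcome to the village radio programme"),
    Segment("episode_01.mp3", 12.5, 30.0, "Today we talk about ನೀರು supply in the village"),
    Segment("episode_02.mp3", 5.0, 18.0, "The market opens on Monday"),
]

for hit in search_segments(segments, "ನೀರು"):
    print(f"{hit.audio_file} @ {hit.start:.1f}s: {hit.text}")
```

Each result carries the audio file and start time, so a player could jump straight to the moment the keyword is spoken.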

By training language models on colloquial dialects such as Graama Kannada, using data from the community radio’s audio corpus, the team aims to address a limitation of large language models, which often fail to capture the heterogeneity of the human population.

Although the application currently supports only text-based search, the team plans to add audio-based search in the future. This would let users run voice searches in the Tumakuru dialect or other regional dialects, making information more accessible to rural communities.

In addition to serving community members, the search interface gives the general public a window into the regional culture and local information of the Tumakuru region. The web application lists the most-searched words, which offers clues about the corpus and helps those less familiar with the community understand it.

Converting the audio from community radio to text was a significant challenge for the team. However, with the introduction of advanced ASR models like OpenAI’s Whisper and Meta’s multilingual model, the team achieved better results. They also addressed the issue of spelling mistakes in the transcripts by implementing a relaxed criterion for matching search queries.
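The article does not say which relaxed criterion the team used. As one possible illustration, the sketch below accepts a transcript word when its similarity to the query exceeds a threshold, using `difflib.SequenceMatcher` from the Python standard library. The function name `fuzzy_find` and the threshold of 0.8 are assumptions made for this example, not details of the actual system.

```python
from difflib import SequenceMatcher


def fuzzy_find(query, transcript, threshold=0.8):
    """Return transcript words that approximately match the query.

    A word is accepted when its similarity ratio to the query is at
    least `threshold` (0.8 is an arbitrary, illustrative value), so
    small spelling differences in the transcript still produce a hit.
    """
    target = query.casefold()
    matches = []
    for word in transcript.split():
        ratio = SequenceMatcher(None, target, word.casefold()).ratio()
        if ratio >= threshold:
            matches.append(word)
    return matches


# Illustrative transcript with a spelling variant of the query word.
transcript = "the panchayath meeting is tomorrow"
print(fuzzy_find("panchayat", transcript))
```

A lower threshold tolerates more transcription errors but also returns more unrelated words, so the value would need tuning against real transcripts.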

The development of this search interface marks a significant step toward improving access to hyperlocal information for rural communities. By bridging the gap between localized knowledge and digital accessibility, this initiative has the potential to empower rural residents with valuable information that was previously inaccessible.

This article is sourced from the blog elperiodicodearanjuez.es
