
Sony and AI Singapore Collaborate on Southeast Asian Language AI Model


Sony Research has announced a collaboration with AI Singapore (AISG) to help refine and test the Southeast Asian Languages in One Network (SEA-LION) large language model (LLM), with a special focus on Indian languages. This partnership aims to strengthen the model’s ability to represent Southeast Asian languages on a global scale.

Focus on Indian Languages in SEA-LION Model

The SEA-LION model is designed to address the linguistic diversity of Southeast Asia, focusing on languages spoken across the region. Sony Research will work closely with AISG to fine-tune the model, with particular attention to Indian languages such as Tamil. Tamil is spoken by an estimated 60 to 85 million people globally, primarily in India and Southeast Asia.

SEA-LION has been pre-trained and instruction-tuned on 981 billion language tokens, including 623 billion English tokens, 128 billion Southeast Asian tokens, and 91 billion Chinese tokens. This vast dataset will help the LLM better understand and represent regional languages and cultures.
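
To put those token counts in proportion, the short sketch below computes each category's share of the 981 billion total. The figures are the ones reported above; the "Unspecified" label is an assumption used here for the remainder, which the announcement does not break down, and the function name is illustrative only.

```python
# Token counts in billions, as reported in the article.
TOTAL_TOKENS_B = 981
reported = {
    "Southeast Asian": 128,
    "Chinese": 91,
    "English": 623,
}


def token_shares(total, breakdown):
    """Return each category's share of the total as a percentage.

    The remainder not covered by the breakdown is reported under
    "Unspecified" (an assumed label; the article does not describe it).
    """
    shares = {name: 100 * count / total for name, count in breakdown.items()}
    shares["Unspecified"] = 100 * (total - sum(breakdown.values())) / total
    return shares


shares = token_shares(TOTAL_TOKENS_B, reported)
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

On these figures, English accounts for roughly 63.5% of the training tokens and Southeast Asian content for roughly 13%, with about 139 billion tokens (around 14%) not attributed to any of the three listed categories.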

Sony's Role in the Partnership

As part of the collaboration, Sony will contribute its expertise in large language model development, particularly in speech generation, content analysis, and recognition. Sony’s research team in India will focus on enhancing the LLM’s performance for Indian languages and will provide feedback on the overall model’s capabilities.

The integration of Tamil language support is expected to improve the performance of applications utilizing the SEA-LION model. AISG’s senior director of AI products, Leslie Teo, emphasized the importance of sharing best practices to further develop the model.

Wider Industry Involvement and Global Significance

In addition to Sony, industry giants such as IBM and Google are involved in the ongoing development and fine-tuning of the SEA-LION LLM. The goal is to give developers access to a comprehensive model that they can use to build customized AI applications tailored to diverse linguistic and cultural needs.

Hiroaki Kitano, president of Sony Research, emphasized the critical role of diversity and localization in AI models, noting that limited access to LLMs has hindered research and the development of technologies that are both representative and equitable for the global population. “In Southeast Asia specifically, there are more than 1,000 different languages spoken by the citizens of the region. This linguistic diversity underscores the importance of ensuring AI models and tools are designed to support the needs of all populations around the world,” he said.

Sony’s Broader Research Initiatives

Sony Research, established in 2023, is committed to advancing technology in AI, content creation, and virtual spaces. The company’s research spans areas like model compression, neural rendering, and AI-powered products for entertainment sectors such as music, movies, and gaming.

One of Sony's latest patents, filed in 2024, focuses on detecting harassment in multiplayer games and virtual reality environments using biometric data. The system can analyze speech and emotional states to identify victims of harassment, an example of Sony’s work on user safety in digital spaces.