AiNews.com
Meta FAIR Releases Innovative AI Research Models and Datasets

Alicia Shapiro
June 19, 2024 • Estimated Reading Time: 4 minutes

[Image: illustration of AI research and innovation at Meta FAIR, with elements representing text-to-music generation, image-to-text conversion, and AI-generated speech detection.]

Meta FAIR has announced the public release of several new research artifacts aimed at advancing AI innovation. These contributions emphasize Meta's commitment to openness, collaboration, and the development of a robust AI ecosystem. By sharing these resources, Meta FAIR hopes to inspire the research community to explore and apply AI in novel ways.

Commitment to Open Research

For over a decade, Meta’s Fundamental AI Research (FAIR) team has been at the forefront of AI advancement through open research. With the rapid pace of innovation, collaboration with the global AI community remains crucial. Meta FAIR's open science approach aims to build AI systems that benefit everyone, fostering global connectivity.

Recent Research Releases

Meta FAIR is excited to share six new research artifacts, each focusing on core themes of innovation, creativity, efficiency, and responsibility. These include:

Image-to-text and text-to-music generation models
A multi-token prediction model
A technique for detecting AI-generated speech

By making these early research works publicly available, Meta FAIR aims to encourage further iterations and responsible advancements in AI.

Meta Chameleon Models

Meta Chameleon is a family of models that can take text and images as input and produce text and images as output, all within a single unified architecture. Unlike many current multimodal models, which rely on diffusion for images, Chameleon represents both text and images as tokens, which simplifies the design and makes it easier to scale. Today, Meta FAIR is releasing components of the Chameleon 7B and 34B models under a research-only license. The released versions support mixed-modal inputs and text-only outputs.

Access Chameleon Models Here.
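
To make the tokenization idea concrete, here is a minimal sketch of how text and image content could share one token sequence. It is an illustration only, not Chameleon's actual tokenizer: the vocabulary sizes, the `BOI`/`EOI` marker tokens, and the function names are all hypothetical, and the image codes are assumed to come from some separate image quantizer that is not shown.

```python
from typing import List, Tuple

# Hypothetical sizes, chosen only for illustration.
TEXT_VOCAB_SIZE = 1000      # text token ids occupy 0..999
IMAGE_CODEBOOK_SIZE = 512   # image codes occupy 0..511 before shifting

# Hypothetical marker tokens placed after both vocabularies.
BOI = TEXT_VOCAB_SIZE + IMAGE_CODEBOOK_SIZE  # begin-of-image
EOI = BOI + 1                                # end-of-image


def image_to_token_ids(codes: List[int]) -> List[int]:
    """Shift image codes past the text vocabulary and wrap them in markers."""
    for code in codes:
        if not 0 <= code < IMAGE_CODEBOOK_SIZE:
            raise ValueError(f"image code out of range: {code}")
    return [BOI] + [TEXT_VOCAB_SIZE + code for code in codes] + [EOI]


def build_sequence(segments: List[Tuple[str, List[int]]]) -> List[int]:
    """Flatten interleaved ("text" | "image", ids) segments into one sequence."""
    sequence: List[int] = []
    for kind, ids in segments:
        if kind == "text":
            sequence.extend(ids)
        elif kind == "image":
            sequence.extend(image_to_token_ids(ids))
        else:
            raise ValueError(f"unknown segment kind: {kind}")
    return sequence


mixed = build_sequence([("text", [5, 6]), ("image", [0, 3]), ("text", [7])])
print(mixed)
```

Because everything ends up as integer ids in one shared space, a single sequence model can in principle read and emit either modality without a separate image-generation pipeline.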

Multi-Token Prediction Approach

Traditional language models are trained to predict only the next token (roughly, the next word) in a sequence, an objective that is simple but inefficient. Meta FAIR’s new approach trains models to predict several future tokens at once, with the aim of improving both efficiency and speed. The pre-trained models for code completion are now available under a non-commercial/research-only license.

Get Multi-Token Models Here.
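
The difference between the two training objectives can be sketched by looking at the targets each one builds from the same sequence. This is a simplified illustration, not Meta FAIR's training code: the function name `multi_token_targets` and the parameter `n_future` are hypothetical, and the token ids are made up.

```python
from typing import List, Tuple


def multi_token_targets(
    tokens: List[int], n_future: int
) -> List[Tuple[List[int], List[int]]]:
    """Pair each context prefix with the next n_future tokens to predict."""
    if n_future < 1:
        raise ValueError("n_future must be at least 1")
    examples = []
    # Stop early enough that every prefix has a full set of n_future targets.
    for t in range(len(tokens) - n_future):
        context = tokens[: t + 1]
        targets = tokens[t + 1 : t + 1 + n_future]
        examples.append((context, targets))
    return examples


sequence = [10, 11, 12, 13, 14]

# Standard objective: one target token per context.
next_token_examples = multi_token_targets(sequence, n_future=1)

# Multi-token objective: several target tokens per context.
multi_token_examples = multi_token_targets(sequence, n_future=3)

for context, targets in multi_token_examples:
    print(context, "->", targets)
```

With `n_future=1` this reduces to ordinary next-token prediction, so the multi-token setup can be read as a generalization of the standard objective in which each context supplies more than one training signal.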

Text-to-Music Generation with JASCO

The new JASCO model improves control over text-to-music generation by incorporating specific chords or beats as conditioning inputs. This model enhances creative outputs by combining symbolic and audio-based conditions. Meta FAIR is releasing the research paper and a sample page, with inference code and the pre-trained model to follow.

Listen to JASCO Samples Here.

AudioSeal: AI-Generated Speech Detection

AudioSeal is an audio watermarking technique designed for localized detection of AI-generated speech, meaning it can pinpoint AI-generated segments within a longer audio clip rather than only labeling the clip as a whole. According to Meta FAIR, this approach makes detection significantly faster and suitable for large-scale, real-time applications. AudioSeal is being released under a commercial license, continuing Meta FAIR’s efforts to ensure the responsible use of AI tools.

Access AudioSeal Here.
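
Localized detection can be pictured as turning per-frame scores into time spans. The sketch below does not use the AudioSeal library or its API. It assumes some detector has already produced one watermark score between 0 and 1 per audio frame, and the function name, threshold, and scores are hypothetical.

```python
from typing import List, Tuple


def watermarked_spans(
    frame_scores: List[float], threshold: float = 0.5
) -> List[Tuple[int, int]]:
    """Return (start, end) frame index pairs, end exclusive, where the score
    stays at or above the threshold."""
    spans = []
    start = None
    for i, score in enumerate(frame_scores):
        if score >= threshold and start is None:
            start = i                      # a flagged run begins
        elif score < threshold and start is not None:
            spans.append((start, i))       # the run ended on the previous frame
            start = None
    if start is not None:
        spans.append((start, len(frame_scores)))  # run reaches the end of the clip
    return spans


# Made-up scores: two flagged regions inside otherwise unflagged audio.
scores = [0.1, 0.2, 0.9, 0.95, 0.8, 0.3, 0.1, 0.7, 0.9]
spans = watermarked_spans(scores)
print(spans)
```

Working at the frame level is what allows a detector to flag only the generated portion of a clip that mixes real and synthetic speech.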

PRISM Dataset for LLM Feedback

The PRISM dataset, developed in collaboration with external partners, maps the sociodemographics and preferences of 1,500 participants across 75 countries. This dataset, which captures feedback on live conversations with 21 different LLMs, aims to foster more inclusive AI development and broaden participation in technology design.

Get the PRISM Dataset Here.

Addressing Geographical Disparities in Text-to-Image Models

Meta FAIR's recent research efforts focus on improving the geographical and cultural diversity of text-to-image models. The introduction of “DIG In” indicators and the contextualized Vendi Score guidance aims to enhance the representation diversity of generated images. This work underscores Meta FAIR's dedication to creating AI systems that reflect global diversity.

Access DIG In Code and Annotations: