AiNews.com
Runway’s Gen-3 AI Tool Allegedly Trained on Copyrighted Content

Alicia Shapiro
July 26, 2024 • Estimated Reading Time: 3 minutes

A clean and organized document screenshot showing keywords like "beach" and "rain" alongside the names of Runway employees, suggesting involvement in scraping YouTube and other copyrighted content without permission for training the Gen-3 AI video generation tool. The background includes logos of popular media companies such as Pixar, Netflix, Disney, and Sony, highlighting the ethical and legal concerns of using copyrighted material for AI training

A leaked document obtained by 404 Media indicates that part of the data used to train Gen-3, Runway's latest AI video generation tool, may have been sourced from YouTube channels and other copyrighted content without permission. The report raises ethical and legal questions about how AI companies gather training data for their models.

Content Scraping Practices

The document, which includes 14 spreadsheets, suggests that videos from popular media companies such as Pixar, Netflix, Disney, and Sony may have been used to train Gen-3. Although 404 Media could not verify every video mentioned, the document offers a rare look at how an AI company might collect copyrighted material for training.

Employee Involvement

A former Runway employee told 404 Media that staff were tasked with finding videos or channels related to specific keywords like "beach" or "rain." These employees reportedly ran a YouTube video downloader through a proxy so they could scrape content without being blocked by Google. The leaked spreadsheets also included 14 links to non-YouTube sources, some of which were sites dedicated to streaming pirated media.

Ethical and Legal Concerns

Using copyrighted content without permission to train AI models is a contentious issue. Runway, which is partly funded by Google, could face significant legal repercussions if these practices are confirmed. The situation highlights the broader debate over AI content theft and the need for clear guidelines and ethical standards in AI training.

Performance Issues

Separately from the controversy, the Gen-3 model appears to have performance issues. Ars Technica reported that the Gen-3 Alpha version created a video of a cat with human hands, suggesting that the training data or the methodology might need further refinement.