- AiNews.com
- Posts
- EU AI Act Checker Highlights Compliance Gaps in Big Tech AI Models
EU AI Act Checker Highlights Compliance Gaps in Big Tech AI Models
Image Source: ChatGPT-4o
EU AI Act Checker Highlights Compliance Gaps in Big Tech AI Models
Several prominent artificial intelligence models are failing to meet key European regulations, particularly in areas such as cybersecurity resilience and discriminatory output, according to data reported by Reuters. The European Union has long debated new AI regulations, which gained traction following the release of OpenAI’s ChatGPT in late 2022. As AI usage expanded and public concerns over its risks grew, lawmakers drew up specific rules for "general-purpose" AIs (GPAI).
New Compliance Tool for AI Models
A new tool designed to evaluate the compliance of AI models with the upcoming EU AI Act has been welcomed by EU officials. Created by Swiss startup LatticeFlow AI in collaboration with research institutes ETH Zurich and Bulgaria’s INSAIT, the tool tests models developed by tech giants such as Meta and OpenAI across a wide range of categories. These categories include technical robustness and safety, and the models are given a score between 0 and 1 based on their performance.
Performance Scores and Leaderboard
LatticeFlow published a leaderboard on Wednesday showing that models from Alibaba, Anthropic, OpenAI, Meta, and Mistral all received average scores of 0.75 or higher. However, the tool also revealed critical shortcomings in some models, indicating areas where companies may need to focus additional resources to ensure compliance with the EU’s regulations.
Examples of Compliance Issues
The Large Language Model (LLM) Checker, developed by LatticeFlow, exposed specific issues across several models. For example, OpenAI’s GPT-3.5 Turbo scored 0.46 in the discriminatory output category, highlighting challenges around biases related to gender, race, and other factors. Similarly, Alibaba’s Qwen1.5 72B Chat received an even lower score of 0.37 for the same category. In the prompt hijacking category, which refers to a type of cyberattack that tricks AI models into revealing sensitive information, Meta’s Llama 2 13B Chat scored 0.42, while Mistral’s 8x7B Instruct received a 0.38.
Top Performing Models
Among the models tested, Anthropic’s Claude 3 Opus—a Google-backed model—received the highest overall score, 0.89, indicating stronger compliance with the current standards set out by the AI Act.
Enforcement and Future Implications
The EU AI Act is expected to come into full effect over the next two years, and the LLM Checker serves as an early indicator of areas where AI models may fall short of the law. Companies failing to comply with the AI Act could face fines of €35 million ($38 million) or 7% of their global annual revenue. LatticeFlow’s CEO, Petar Tsankov, stated that while the test results were overall positive, they also highlighted gaps that need to be addressed. Tsankov emphasized that with a stronger focus on compliance optimization, companies could better prepare for the upcoming regulatory requirements.
EU's Reaction
While the European Commission cannot officially verify external tools, it has been kept informed throughout the development of the LLM Checker and views the tool as a crucial early step in translating the AI Act into actionable technical requirements. A Commission spokesperson stated, "The Commission welcomes this study and AI model evaluation platform as a first step in translating the EU AI Act into technical requirements."
What This Means for the AI Industry
The introduction of LatticeFlow’s LLM Checker represents a major step forward in the enforcement of the EU AI Act, offering tech companies an early glimpse into where their models might be non-compliant. As the Act begins to take effect, companies will need to prioritize areas like cybersecurity resilience and bias mitigation to avoid hefty fines and meet the new standards.
This tool not only provides developers with a roadmap to improve their models but also signals a shift toward greater transparency and accountability in the AI industry. With the EU setting a global precedent, the findings from the LLM Checker could push companies to invest heavily in ensuring their models meet regulatory requirements, driving further innovation in AI safety and ethical development.