Mistral Launches Codestral Mamba for Faster and Longer Code Generation

Alicia Shapiro
July 17, 2024

Mistral, a well-funded French AI startup known for its open-source AI models, has introduced two new large language models (LLMs): Codestral Mamba and Mathstral. Codestral Mamba is built on the Mamba architecture, while Mathstral builds on Mistral 7B; both aim to improve efficiency and performance in their respective domains.

Introducing Codestral Mamba

Codestral Mamba 7B, Mistral’s latest code-generating model, uses the Mamba architecture rather than a transformer. Mamba replaces the attention mechanism of traditional transformer models with a state space model whose inference cost grows linearly with sequence length, which yields faster inference and the ability to handle longer contexts. This makes Codestral Mamba well suited to local coding projects and scenarios that require rapid responses.
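To make the scaling argument concrete, here is a toy sketch in Python. It is not Mamba itself: real Mamba layers use learned, input-dependent parameters and a vector-valued state, whereas the scalar constants `a`, `b`, `c` and the function names below are assumptions chosen purely for illustration. The point is only that a state space recurrence carries a fixed-size state from token to token, while an attention layer consults a cache that grows with every token.

```python
from typing import List


def ssm_scan(xs: List[float], a: float = 0.9, b: float = 1.0, c: float = 0.5) -> List[float]:
    """Toy scalar state space recurrence (illustrative, not Mamba's actual layer).

    h_t = a * h_{t-1} + b * x_t
    y_t = c * h_t
    """
    h = 0.0  # the entire "memory" of the past is this one number
    ys = []
    for x in xs:
        h = a * h + b * x  # constant work per token
        ys.append(c * h)
    return ys


def attention_cache_sizes(n_tokens: int) -> List[int]:
    """Number of cached entries an attention layer attends over at each step."""
    return [t + 1 for t in range(n_tokens)]


if __name__ == "__main__":
    print(ssm_scan([1.0, 0.0, 0.0]))
    print(attention_cache_sizes(4))
```

In this toy setting, summing the cache sizes gives n(n+1)/2 attention lookups for n tokens, against n state updates for the recurrence, which is where the quadratic versus linear scaling comes from.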

Mistral’s testing showed that Codestral Mamba can process inputs of up to 256,000 tokens, double the 128,000-token context window of OpenAI’s GPT-4o. In Mistral’s benchmarks, Codestral Mamba outperformed other open-source models such as CodeLlama 7B, CodeGemma-1.1 7B, and DeepSeek v1.5 7B on HumanEval.
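As a rough illustration of what the larger window means in practice, the sketch below checks whether a request fits in each model's context. The 128,000-token figure for GPT-4o is the value implied by the article's "double" comparison; the helper `fits_in_context` and the example token counts are hypothetical, and the sketch assumes the prompt and the generated completion share one window.

```python
CODESTRAL_MAMBA_CONTEXT = 256_000  # tokens, as reported in the article
GPT_4O_CONTEXT = 128_000  # tokens, implied by the article's "double" comparison


def fits_in_context(prompt_tokens: int, max_new_tokens: int, context_limit: int) -> bool:
    """Return True if the prompt plus the requested completion fits in the window.

    Assumes prompt and completion share a single context window.
    """
    return prompt_tokens + max_new_tokens <= context_limit


if __name__ == "__main__":
    print(CODESTRAL_MAMBA_CONTEXT / GPT_4O_CONTEXT)
    # Hypothetical request: a 200,000-token codebase dump plus a 4,000-token answer.
    print(fits_in_context(200_000, 4_000, CODESTRAL_MAMBA_CONTEXT))
    print(fits_in_context(200_000, 4_000, GPT_4O_CONTEXT))
```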

Developers can access and modify Codestral Mamba through its GitHub repository and Hugging Face, where it is released under the open-source Apache 2.0 license. Mistral claims that the earlier Codestral model outperformed other code generators such as CodeLlama 70B and DeepSeek Coder 33B.

A table comparing the performance of various code-generating AI models in different benchmarks. Codestral Mamba (7B) shows high performance in HumanEval, MBPP, Spider, CruX, HumanEval C++, HumanEval Java, and HumanEval JavaScript, outperforming CodeGemma-1.1 7B, CodeLlama 7B, and DeepSeek v1.5 7B in several metrics. Larger models like Codestral (22B) and CodeLlama 34B are also included for comparison

Image Source: VentureBeat

The Rise of Code Generation AI

AI-driven code generation has become one of the most widely adopted applications of LLMs, with tools like GitHub’s Copilot, Amazon’s CodeWhisperer, and Codeium gaining traction. Codestral Mamba enters this competitive field as an openly licensed option that developers can run and modify themselves.

Mathstral: Advancing Math Reasoning

Alongside Codestral Mamba, Mistral has launched Mathstral 7B, a model designed for math-related reasoning and scientific discovery. Developed in collaboration with Project Numina, Mathstral has a 32K context window and is released under the open-source Apache 2.0 license. According to Mistral, Mathstral outperforms other models of similar size on math reasoning benchmarks, and its results improve further when more computation is spent at inference time, for example by majority voting over several sampled answers (the maj@16 results in Mistral’s benchmarks). Users can use the model as is or fine-tune it to meet specific needs.
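Spending more inference-time computation often means sampling several answers to the same problem and keeping the most common one, which is what a maj@16 score measures with 16 samples. The following is a minimal sketch of that voting step only; the function name `majority_vote` and the sample answers are hypothetical, and it assumes the final answers have already been extracted from the model's outputs as strings.

```python
from collections import Counter
from typing import List


def majority_vote(answers: List[str]) -> str:
    """Return the most frequent answer; ties go to the answer seen first."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    # Light normalization so that " 42 " and "42" count as the same answer.
    counts = Counter(a.strip() for a in answers)
    # Counter preserves insertion order, and max() returns the first maximum.
    return max(counts, key=counts.get)


if __name__ == "__main__":
    # Hypothetical final answers from 16 independent samples of one problem.
    samples = ["42"] * 9 + ["41"] * 4 + ["7"] * 3
    print(majority_vote(samples))
```

A single sample can land on a wrong answer by chance; voting helps when the correct answer is the one the model produces most often across samples.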

A table comparing the performance of various math reasoning AI models in different benchmarks. Mathstral 7B shows strong performance in MATH, GSM8K (8-shot), Odyssey Math maj@16, GRE Math maj@16, AMC 2023 maj@16, and AIME 2024 maj@16, outperforming DeepSeek Math 7B, Llama 3 8B, GLM4 9B, QWen2 7B, and Gemma2 9B in several metrics

Image Source: VentureBeat

Mistral’s Competitive Edge

Mistral continues to compete with major AI developers like OpenAI and Anthropic by releasing models such as these under open-source licenses. The company recently secured $640 million in Series B funding, raising its valuation to nearly $6 billion. This investment round included contributions from tech giants such as Microsoft and IBM.

Conclusion

Mistral’s launch of Codestral Mamba and Mathstral extends its lineup into code generation and mathematical reasoning. By adopting the Mamba architecture for Codestral Mamba and releasing both models under the Apache 2.0 license, Mistral gives developers long-context, fast-inference tools they can run and adapt themselves.