Cerebras CePO Brings Test-Time Computation to Llama

Meta’s Llama 3.3 70B now outperforms Llama 3.1 405B, thanks to test-time computation. 

Cerebras, the AI hardware and inference solution provider, has announced a new technique called CePO (Cerebras Planning and Optimization) that ‘drastically’ improves the reasoning capabilities of Meta’s Llama models. 

Cerebras applies the much-coveted test-time computation technique to the Llama 3.3 70B model, which then outperforms the Llama 3.1 405B model across several benchmarks while ‘maintaining interactive speeds of 100 tokens per second’. Cerebras has also released detailed technical documentation outlining the capabilities of CePO. 

“While models like OpenAI o1 and Alibaba QwQ have demonstrated the power of additional computation at inference time, CePO brings these capabilities to Llama – the world’s most popular open-source LLM family,” said Cerebras in the announcement. 
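CePO’s exact planning-and-optimization loop is covered in Cerebras’s technical documentation rather than here, but a common way to spend additional computation at inference time is best-of-N sampling with self-consistency voting: sample several independent reasoning paths and keep the most frequent final answer. A minimal sketch of that general pattern (not CePO’s actual pipeline; `extract_answer` and `sample_fn` are hypothetical placeholders):

```python
from collections import Counter
from typing import Callable

def extract_answer(completion: str) -> str:
    # Hypothetical helper: take the last line of a reasoning trace as the answer.
    return completion.strip().splitlines()[-1]

def best_of_n(prompt: str, sample_fn: Callable[[str], str], n: int = 8) -> str:
    """Best-of-N self-consistency: spend extra inference-time compute by
    sampling n independent reasoning paths, then majority-vote the answers."""
    answers = [extract_answer(sample_fn(prompt)) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]
```

Here, `sample_fn` would wrap a temperature-sampled call to a Llama 3.3 70B endpoint. The trade-off is latency, since every extra sample multiplies token generation, which is why Cerebras highlights sustaining 100 tokens per second while doing it.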

Cerebras also compared its technique with GPT-4 Turbo and Claude 3.5 Sonnet, achieving ‘comparable performance’ on most benchmarks. However, no comparison was made with the industry-leading reasoning model, OpenAI’s o1. 

For example, the Llama 3.3 70B model scored 53.3% on the GPQA benchmark, whereas the o1 model scored 76%. While OpenAI hasn’t revealed the number of parameters in o1, it almost certainly has significantly more than 70B. 

“By bringing these capabilities to the Llama family of models, we’re democratizing access to sophisticated reasoning techniques previously limited to closed commercial systems,” said Andrew Feldman, CEO and Co-founder of Cerebras Systems. 

Cerebras plans to open-source the CePO framework. The company also aims to develop more ‘advanced prompting frameworks that leverage comparative reasoning’ and synthetic datasets optimised for inference-time computing. 

Cerebras is using the latest version of Meta’s Llama, Llama 3.3, which Meta announced only a few days ago. According to Meta, the model delivers ‘leading performance’ in synthetic data generation and supports an expanded context length of 128k tokens. 

A few days ago, Meta also unveiled a new technique called Chain of Continuous Thought, or COCONUT, which overcomes a limitation of the Chain of Thought (CoT) technique, where the reasoning process is generated explicitly in natural-language tokens. 

Instead of making the model convert its internal state into words after each step, COCONUT feeds the model’s final hidden state back in directly as the starting point for the subsequent step, keeping the reasoning in continuous space. 
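A minimal sketch of that idea, assuming a Hugging Face-style causal LM that accepts `inputs_embeds` (an illustrative reading of COCONUT, not Meta’s released code):

```python
import torch

def latent_reasoning_steps(model, inputs_embeds: torch.Tensor,
                           n_latent_steps: int = 4) -> torch.Tensor:
    # Standard CoT would decode the last hidden state into a token here.
    # COCONUT-style latent reasoning instead feeds that hidden state straight
    # back in as the next input embedding, so the "thought" never leaves
    # continuous space.
    for _ in range(n_latent_steps):
        out = model(inputs_embeds=inputs_embeds, output_hidden_states=True)
        last_hidden = out.hidden_states[-1][:, -1:, :]  # (batch, 1, hidden_dim)
        inputs_embeds = torch.cat([inputs_embeds, last_hidden], dim=1)
    # After the latent steps, ordinary token decoding can resume from here.
    return inputs_embeds
```

This works in Llama-style architectures because the hidden-state dimension matches the input-embedding dimension, so the final hidden state can stand in for a token embedding.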

Reasoning models are the next big thing in the ecosystem today. While OpenAI just unveiled the full version of the o1 model, it also faces strong competition from the East: China’s DeepSeek R1 Lite reportedly offers better reasoning capability than o1 and is also available as an open-source model. 



Supreeth Koundinya

Supreeth is an engineering graduate who is curious about the world of artificial intelligence and loves to write stories on how it is solving problems and shaping the future of humanity.