Researchers open source Sky-T1, a ‘reasoning’ AI model that can be trained for less than $450


So-called AI reasoning models have become easier – and cheaper – to develop.

On Friday, NovaSky, a team of researchers based at UC Berkeley’s Sky Computing Laboratory, released Sky-T1-32B-Preview, a reasoning model that is competitive with an earlier version of OpenAI’s o1 on a number of key benchmarks. Sky-T1 appears to be the first truly open reasoning model in the sense that it can be replicated from scratch; the team has released the dataset used to train it as well as the necessary training code.

“Remarkably, Sky-T1-32B-Preview was trained for less than $450,” the team wrote in a blog post, “demonstrating that it is possible to replicate high-level reasoning abilities cheaply and efficiently.”

$450 might not sound that affordable. But it was not long ago that the price of training a model with comparable performance often ranged in the millions of dollars. Synthetic training data, or training data generated by other models, has helped reduce costs. Palmyra X 004, a model recently released by the AI company Writer and trained almost entirely on synthetic data, reportedly cost just $700,000 to develop.

Unlike most AI, reasoning models effectively check their own work, which helps them avoid some of the pitfalls that normally trip up models. Reasoning models take a little longer, usually seconds to minutes longer, to arrive at solutions compared to a typical non-reasoning model. The advantage is that they tend to be more reliable in areas such as physics, science and mathematics.

The NovaSky team says it used a different reasoning model, Alibaba’s QwQ-32B-Preview, to generate the initial training data for Sky-T1, then “curated” the data mix and used OpenAI’s GPT-4o-mini to refactor the data into a more workable format. Training the 32-billion-parameter Sky-T1 took about 19 hours on a rack of 8 Nvidia H100 GPUs. (Parameters roughly correspond to a model’s problem-solving skills.)
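The reported figures can be checked with simple arithmetic. The short sketch below multiplies the training time by the GPU count to get total GPU-hours, then divides the $450 ceiling by that total. The resulting hourly rate is only an implied upper bound derived from the article's numbers; the team's actual rental price is not stated here, and "about 19 hours" is itself approximate.

```python
# Figures as reported in the article (training time is approximate).
TRAINING_HOURS = 19
NUM_GPUS = 8
COST_CEILING_USD = 450


def gpu_hours(hours: float, gpus: int) -> float:
    """Total GPU-hours consumed by a run on `gpus` devices for `hours`."""
    return hours * gpus


def implied_max_hourly_rate(budget_usd: float, total_gpu_hours: float) -> float:
    """Highest per-GPU hourly price that still fits within the budget."""
    return budget_usd / total_gpu_hours


total = gpu_hours(TRAINING_HOURS, NUM_GPUS)
rate = implied_max_hourly_rate(COST_CEILING_USD, total)
print(f"{total} GPU-hours, at most ${rate:.2f} per GPU-hour")
```

In other words, the sub-$450 claim holds as long as each H100 was rented for under roughly $3 an hour.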

According to the NovaSky team, Sky-T1 outperforms an early preview version of o1 on MATH500, a collection of “competition-level” math challenges. The model also beats the o1 preview on a set of difficult problems from LiveCodeBench, a coding benchmark.

However, Sky-T1 falls short of the o1 preview on GPQA-Diamond, which contains physics, biology and chemistry questions that a PhD graduate would be expected to know.

It is also important to note that OpenAI’s GA release of o1 is a stronger model than the preview version of o1, and that OpenAI is expected to release an even better reasoning model, o3, in the weeks ahead.

But the NovaSky team says that Sky-T1 marks only the beginning of its journey to develop open source models with advanced reasoning capabilities.

“Going forward, we will focus on developing more efficient models that maintain strong reasoning performance and exploring advanced techniques that further increase the efficiency and accuracy of the models at test time,” the team wrote in the post. “Stay tuned as we make progress on these exciting initiatives.”


