OpenAI has officially launched its new AI model, o1, designed to tackle complex tasks like coding and math problems more efficiently. This model is the first in a planned series that aims to enhance the reasoning capabilities of AI. Alongside the primary o1 model, OpenAI has also introduced a more cost-effective version, called o1-mini, for those looking for a smaller-scale solution.
The o1 model was previously referred to in rumors as Project Strawberry and Q*, and its release marks a significant step forward in AI development.
OpenAI’s o1 model
OpenAI views the launch of the o1 model as a significant step towards creating AI systems with more human-like reasoning abilities. The company highlights that o1 is more capable at writing code and solving complex, multi-step problems than its predecessors.
In its documentation, OpenAI explains that the o1 models are trained with reinforcement learning, allowing them to handle complex reasoning tasks. The model is designed to “think before responding,” meaning it can generate a detailed thought process before providing an answer, improving accuracy in challenging scenarios.
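As a loose analogy for "thinking before responding" (this is an illustration of the idea only, not how o1 works internally), consider a solver that writes out intermediate steps and then derives its final answer from them, rather than emitting an answer directly:

```python
# Toy illustration: produce explicit intermediate steps, then derive
# the answer from those steps. The function and problem are made up
# for illustration; o1's actual reasoning process is hidden.
def solve_with_steps(a: int, b: int, c: int):
    steps = []
    subtotal = a * b
    steps.append(f"Step 1: multiply {a} by {b} -> {subtotal}")
    total = subtotal + c
    steps.append(f"Step 2: add {c} -> {total}")
    return steps, total

steps, answer = solve_with_steps(7, 8, 5)
print("\n".join(steps))
print("Answer:", answer)  # 7*8 + 5 = 61
```

Making each step explicit is what lets errors be caught mid-reasoning instead of only in the final answer, which is the intuition behind o1's improved accuracy on multi-step problems.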
o1 is already accessible to ChatGPT Plus and Team users, with Enterprise and Edu subscribers gaining access next week. For free-tier users, the release of the more affordable o1-mini version is planned but has no set launch date yet. Regarding API pricing, o1 is priced at $15 per 1 million input tokens and $60 per 1 million output tokens, notably higher than GPT-4's API pricing.
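Using the per-token prices quoted above, a quick sketch of estimating the cost of a single o1 API request (prices are those reported in this article and may change):

```python
# Estimate o1 API cost from token counts, using the prices above:
# $15 per 1M input tokens, $60 per 1M output tokens.
O1_INPUT_PRICE = 15.00 / 1_000_000   # dollars per input token
O1_OUTPUT_PRICE = 60.00 / 1_000_000  # dollars per output token

def o1_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in dollars for one o1 request."""
    return input_tokens * O1_INPUT_PRICE + output_tokens * O1_OUTPUT_PRICE

# Example: a 2,000-token prompt that produces 10,000 tokens of output.
print(f"${o1_cost(2_000, 10_000):.2f}")  # → $0.63
```

Note that o1's hidden reasoning tokens count as output tokens for billing, so output-heavy pricing matters more for this model than for its predecessors.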
AI “thinks” before responding.
What distinguishes the o1 model is its use of reinforcement learning, a training technique in which the system learns by receiving rewards or penalties, allowing it to solve problems autonomously. This process enables the model to follow a "chain of thought," mimicking human-like step-by-step reasoning.
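The reward/penalty loop can be illustrated with a deliberately tiny example: an agent that learns which of two actions pays off purely from rewards (+1) and penalties (-1). This is a toy epsilon-greedy bandit, nothing like the scale or specifics of OpenAI's actual training setup:

```python
import random

random.seed(0)

values = [0.0, 0.0]  # the agent's estimated value of each action
counts = [0, 0]      # how often each action has been tried

def reward(action: int) -> int:
    # Hypothetical environment: action 1 is secretly the better choice.
    return 1 if random.random() < (0.8 if action == 1 else 0.2) else -1

for step in range(1000):
    # Epsilon-greedy: mostly exploit the current best estimate,
    # occasionally explore at random.
    if random.random() < 0.1:
        a = random.randrange(2)
    else:
        a = max(range(2), key=lambda i: values[i])
    r = reward(a)
    counts[a] += 1
    values[a] += (r - values[a]) / counts[a]  # incremental mean of rewards

best = max(range(2), key=lambda i: values[i])
print("learned best action:", best)
```

No one tells the agent which action is correct; the preference emerges from accumulated rewards and penalties alone, which is the core idea behind reinforcement learning.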
According to OpenAI, this advanced reasoning capability not only improves performance but also enhances the safety and robustness of the models. For instance, the models can better apply safety policies when addressing potentially unsafe prompts, producing more controlled responses.
The company emphasizes that incorporating a reasoning chain before responding can unlock significant benefits while reducing some risks associated with more powerful AI systems. However, OpenAI acknowledges ongoing challenges, particularly the issue of hallucinations, where the model generates incorrect or nonsensical information. While internal tests show that o1 and o1-mini hallucinate less often than their GPT-4 counterparts, anecdotal feedback suggests they may still struggle in this area.
Moreover, o1 doesn’t surpass GPT-4 in factual knowledge or web navigation. Despite this, OpenAI believes the release of o1 signals the beginning of a new phase in AI development, focusing on more complex reasoning tasks that were previously beyond the capabilities of earlier models.