The age of reasoning AI is well underway.
OpenAI kickstarted the generative AI revolution anew with its o1 reasoning model, introduced in September 2024. o1 takes longer to answer questions but delivers improved performance, especially on complex, multi-step problems in math and science. Since then, the commercial AI field has been flooded with copycats and competitors.
There’s DeepSeek’s R1, Google’s Gemini 2 Flash Thinking, and today’s release of LlamaV-o1, all of which seek to offer built-in “reasoning” similar to OpenAI’s o1 and its upcoming o3 model family. These models engage in “chain-of-thought” (CoT) prompting, or “self-prompting,” forcing themselves to reflect, double back, and check their own work mid-generation, and ultimately arrive at a better answer than simply firing one off from their embeddings as fast as possible, the way other large language models (LLMs) do.
However, the higher cost of o1 and o1-mini ($15.00 per million input tokens, compared to $1.25 per million input tokens for GPT-4o on OpenAI’s API) has made some hesitant about whether the expected performance gains justify it. Is it really worth paying 12 times more than a typical cutting-edge LLM?
Still, the number of converts is growing. But the key to unlocking the true value of reasoning models may lie in users prompting them differently.
Over the weekend, Shawn Wang, founder of the AI news service Smol, published a guest post on his Substack from Ben Hylak, formerly an interface designer at Apple for visionOS (the software that powers the Vision Pro spatial computing headset). The post went viral because Hylak convincingly explains how he coaxes OpenAI’s o1 model into producing incredibly valuable (to him) outputs.
In short: instead of writing a prompt for the o1 model the way a human user normally would, you should consider writing a “brief” — a more detailed explanation that front-loads lots of context about what you want the model to output, who the user is, and what format you want the model to deliver the information in.
Hylak writes on Substack:
With most models, we’ve been trained to tell the model how we want it to answer us. For example: “You are an expert software engineer. Think slowly and carefully.”
This is the opposite of how I’ve found success with o1. I don’t instruct it on the how — only the what. Then I let o1 take over, plan, and solve its own steps. This is what autonomous reasoning is for, and it can actually be much faster than manually reviewing and chatting as the “human in the loop.”
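The brief-writing advice above can be sketched as a small helper that assembles a prompt from goal, context, and output format. The function name and field labels here are illustrative, not taken from Hylak’s post; in practice the assembled brief would be sent as a single user message to a reasoning model’s API.

```python
def build_brief(goal: str, context: str, output_format: str) -> str:
    """Assemble a 'brief'-style prompt for a reasoning model:
    state WHAT you want (not how to think), front-load context,
    and pin down the output format, then let the model plan its
    own steps."""
    return (
        f"Goal:\n{goal}\n\n"
        f"Context:\n{context}\n\n"
        f"Return your answer in this format:\n{output_format}\n"
    )

# Hypothetical example in the spirit of Hylak's hiking prompt.
brief = build_brief(
    goal="Recommend three day hikes near San Francisco I haven't done.",
    context=(
        "I'm an intermediate hiker. I've already done Mount Tam and "
        "Lands End. I prefer loops under 10 miles with ocean views."
    ),
    output_format="A numbered list: hike name, distance, and why it fits.",
)
print(brief)
```

Note that nothing in the brief tells the model to “think step by step” or assigns it a persona — the reasoning model is left to decide its own plan.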
Hylak also includes a helpful annotated screenshot of an example o1 prompt that produced a useful result: a list of hikes.
The blog post was so helpful that Greg Brockman, OpenAI’s president and co-founder, reshared it on his X account with the message: “o1 is a different kind of model. To achieve great performance, it must be used in a new way relative to standard chat models.”
I tried it myself in my on-again, off-again quest to become fluent in Spanish, and here is the result, for those curious. It’s probably not as impressive as Hylak’s well-constructed prompt and response, but it definitely shows strong potential.
Separately, even with non-reasoning LLMs such as Claude 3.5 Sonnet, there may be room for regular users to improve their prompting to get better, less constrained results.
As Louis Arge, a former Teton.ai engineer and current creator of the neuromodulation device openFUS, wrote, LLMs tend to trust their own prompts, and he gave an example of how he persuaded Claude to “stop being a coward” by first “causing a fight” with it over its outputs.
All of this shows that prompt engineering remains a valuable skill even as the AI era advances.