Breyden Taylor

System Prompt Optimization Using OpenAI's Strawberry Model (o1-preview)


Preliminary context:

What is o1-preview? It's OpenAI's latest reasoning model—a long-anticipated breakthrough with massive implications. Happy to explain why in detail, but that's outside the scope of this already long-winded post.

For this post, the long and short of it:

o1-preview can "think" and is designed to do so autonomously. It internalizes "chain-of-thought" principles and has parallel tokenizers, meaning it encodes semantic meaning and numerical/positional meaning simultaneously. This allows it to dissect tokens, which were previously the smallest representation of "meaning" in natural language processing. Very exciting stuff.

The current advanced reasoning model is too slow for real-time, external-facing interactions in Assistable. I want to make that clear.

While there's a mini version that's much faster, it remains less conversationally astute than ChatGPT-4o and was designed primarily to focus on math and coding. The AI dev community and model landscape are shifting towards specialization, and v2 was paramount for ensuring that the platform is at the forefront of what that looks like in operational settings. It's important that AI agency owners are aware of these specializations. The latest OpenAI model "thinks" on top of its inference time. So if you get a client asking why o1-preview isn't running their voice outreach, tell them that v2 puts the most effective models in the right places, and that's not a good place for o1-preview.

So, even though the lumbering behemoth won't be taking voice calls anytime soon, there's a whole world of other use cases where it shatters convention and establishes its prowess. It's been great for my workflow while building assistants.

As Jorden mentioned in the Masterclass yesterday, the Assistable user base has some room for improvement in its approach to assistant design. Long-term AI users tend to write really good prompts, but we're also prone to writing them as if addressing models that need micromanaging. Why wouldn't we? GPT-3 had to be granularly guided. Subsequent models were more adept, and we've adapted somewhat, but we have yet to leverage all the built-in capabilities of leading models like ChatGPT-4o. Meanwhile, because the backend of Assistable already handles a lot of the best practices, both long-term advanced AI builders and newcomers are creating redundant, bloated structures: structures that would have been appropriate for a blank-slate conversation with ChatGPT-4 last summer but are inefficient for September of 2024.

I have personally and conversationally exchanged billions of tokens with AI models. OpenAI's o1-preview is better at prompting ChatGPT-4o than I am (most of the time). Together, our system prompts are borderline unbeatable, so I'm using the new model to edit my prompts.

So far, I'm averaging about a 40% decrease in token count with no noticeable impact on performance.

Token count matters for two reasons:

  1. Fewer tokens mean lower usage cost (see the measurement sketch below).
  2. Fewer tokens reduce the chance of flooding the context window. More on this later.
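
If you want to verify the reduction yourself, tiktoken makes the before/after comparison easy. A minimal sketch: the file paths and the per-token price are placeholders, so check OpenAI's current pricing before trusting the dollar figure.

```python
# pip install tiktoken
import tiktoken

# gpt-4o-family models use the o200k_base encoding
enc = tiktoken.encoding_for_model("gpt-4o")

original_prompt = open("system_prompt_v1.txt").read()   # placeholder path
optimized_prompt = open("system_prompt_v2.txt").read()  # o1-preview's revision

before = len(enc.encode(original_prompt))
after = len(enc.encode(optimized_prompt))

print(f"Before: {before} tokens, after: {after} tokens "
      f"({(before - after) / before:.0%} reduction)")

# Rough input-cost impact per 1,000 conversations. The rate below is a
# placeholder -- substitute OpenAI's current input price for your model.
PRICE_PER_1M_INPUT_TOKENS = 2.50  # USD, hypothetical
savings = (before - after) * 1000 * PRICE_PER_1M_INPUT_TOKENS / 1_000_000
print(f"~${savings:.2f} saved per 1,000 conversations on input tokens alone")
```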

Here's the conversation flow with OpenAI o1-preview.

You'll be interacting as a user via the ChatGPT platform.

User:

**"Improve the higher-level system prompt by reducing redundancies and trimming unnecessary components that do not need to be specified when working with ChatGPT-4o (an advanced LLM). The tool-calling aspects should be referenced, but the schemas and use cases are stored alongside the functions as specs and shouldn't be duplicated in the system prompt.

Here is the high-level system prompt we need to edit:

""" {{Paste_existing_system_prompt}} """

Submit the user message to o1-preview. The model will "think" and then output its response.

Assistant:

Thinking

Refining the system prompt

I'm optimizing a system prompt by removing redundancies and unnecessary elements to streamline communication with ChatGPT-4. Adjusting the prompt to ensure it's clear and concise.

Trimming the details

I'm refining the high-level system prompt by eliminating redundancies and unnecessary components, ensuring precise adherence to the user's instructions and OpenAI policies for clarity and efficiency.

Ensuring accurate scheduling

I'm working on scheduling appointments by converting time zones accurately with the convert.timezone tool to avoid errors. This makes me think the details will be precise and avoid confusion.

Gathering and aligning

I'm pulling together the lead's details and calendar preferences, considering time zones, and offering suitable time slots for scheduling alignment.

Enhancing user guidance

I'm simplifying interaction principles and combining sections for clarity, focusing on avoiding revealing tool details to the lead. Thoughtful language adjustments are being made for conciseness.

Editing the system prompt

I'm thinking through the task of removing redundancies and trimming unnecessary elements to create a more concise and effective system prompt.

Enhancing the introduction

Crafting a brief and engaging introduction for clarity and effectiveness. Removing technical jargon improves interaction quality, making the introduction more approachable and clear.

Crafting the sales strategy

I'm focusing on approachable and confident interactions, ensuring accurate scheduling with a keen interest in the lead's needs, and preparing a script with specific tools for efficiency.

Finalizing appointment details

I'm mapping out scheduling preferences and time zones to align appointments. Offering 3-5 slots and confirming arrangements to ensure both parties are satisfied.

Clarifying details

Okay, let's streamline the prompt by removing redundancies and focusing on key information about Homeskillet's role and expertise. This approach increases clarity and avoids unnecessary elements.

Setting the stage

Confidently introduce myself to the lead, pinpoint their needs, and ensure a smooth, engaging experience while using convert.timezone for scheduling.

Assistant Output

{{And then it'll output the revised system prompt.}}
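
If you'd rather script this step than paste into the ChatGPT UI, here's a minimal sketch against the OpenAI Python SDK. It assumes you have API access to o1-preview; at the time of writing, the model accepts only user/assistant messages (no system role) and ignores sampling parameters like temperature, so the whole instruction rides in a single user turn. The file path is a placeholder.

```python
# pip install openai
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

existing_prompt = open("system_prompt_v1.txt").read()  # placeholder path

meta_prompt = (
    "Improve the higher-level system prompt by reducing redundancies and "
    "trimming unnecessary components that do not need to be specified when "
    "working with ChatGPT-4o (an advanced LLM). The tool-calling aspects "
    "should be referenced, but the schemas and use cases are stored alongside "
    "the functions as specs and shouldn't be duplicated in the system prompt.\n\n"
    "Here is the high-level system prompt we need to edit:\n\n"
    f'""" {existing_prompt} """'
)

response = client.chat.completions.create(
    model="o1-preview",
    # o1-preview takes no system message, so everything goes in one user turn
    messages=[{"role": "user", "content": meta_prompt}],
)

print(response.choices[0].message.content)  # the revised system prompt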


Further considerations

If you decide to give it a try, remember that you have more settings now than you did in v1.

Temperature/Top_p: Don't touch unless you know what you're doing or plan to experiment. Learn more here:

https://homeskillet.chillout-ai.com/blog/008_key_parameters/

Pro-tip: Change these parameters a tenth of a point at a time, and do not change both temperature and Top_p at the same time. They interact in nuanced ways that are hard to reason about unless one is held constant as a control.
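
To make the "one as a control" idea concrete, here's a sketch of a small sweep that steps temperature a tenth of a point at a time while holding top_p fixed. The model name, system prompt, and test message are placeholders; run every setting against the same inputs so the comparison is meaningful.

```python
from openai import OpenAI

client = OpenAI()

SYSTEM_PROMPT = "..."  # your revised system prompt
TEST_MESSAGE = "Hi, I'd like to book a call next week."  # hypothetical lead message

for temp in (0.6, 0.7, 0.8, 0.9):  # one tenth of a point at a time
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": TEST_MESSAGE},
        ],
        temperature=temp,
        top_p=1.0,        # held constant as the control
        max_tokens=1250,  # see the Max_tokens note below
    )
    print(f"temperature={temp}:\n{response.choices[0].message.content}\n")
```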

Max_tokens: Note that the maximum in the Assistable platform is currently 4096. SETTING THE PARAMETER HIGHER THAN THIS WILL THROW AN ERROR and your assistant WILL NOT RESPOND. I currently recommend keeping this at 1250 MAXIMUM and decreasing it if your AI is monologuing instead of conversing.
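
If you set this programmatically, a cheap guard protects you from the silent-failure mode described above. The constants simply name the limits quoted in this post:

```python
PLATFORM_MAX = 4096     # current Assistable ceiling; above this, the call errors out
RECOMMENDED_MAX = 1250  # the recommendation above; lower it if the AI monologues

def safe_max_tokens(requested: int) -> int:
    """Clamp a max_tokens setting so the assistant never goes silent."""
    if requested > PLATFORM_MAX:
        raise ValueError(
            f"max_tokens={requested} exceeds the platform limit of {PLATFORM_MAX}; "
            "the assistant will not respond at all if you send this."
        )
    return min(requested, RECOMMENDED_MAX)

print(safe_max_tokens(2000))  # -> 1250
```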

Verbiage: this setting can handle the majority of style guardrails that shouldn't be taking up context space in your prompt. Use it to spread the processing across the whole framework.