How to Fine-tune GPT-3.5 Turbo: Step-by-step with No Code

Miha Cacic Blog Post Author Profile Pic
By Miha Cacic
August 23rd, 2023
Fine-tune GPT-3.5 Turbo

OpenAI just released GPT-3.5 Turbo for fine-tuning and it's a game-changer.

According to OpenAI, a fine-tuned GPT-3.5 model can easily match the quality of GPT-4 when it’s fine-tuned on high-quality examples—and you only need 50 to start seeing clear quality improvements versus few-shot learning.

Plus, it’s 90% cheaper to get completions from a fine-tuned GPT-3.5 model than a fine-tuned GPT-3 model.

Better quality and lower cost? Game on! 🤩

But hol' up a second. 

OpenAI also made significant changes to the fine-tuning process for GPT-3.5 Turbo. Figuring out the nuances of formatting the new JSONL and making the API calls to fine-tune can feel like being stuck in an escape room you didn’t ask to be in.

That’s why we created this guide that will get a shiny new fine-tuned GPT-3.5 model in your hands much faster with no code — and you might even have fun doing it.

1. Start with data

Don't have data at hand? No problem.

I have some for you, so you can fine-tune along this guide — Helpful AI Clerk.

Helpful AI Clerk Dataset

2. Upload CSV to Entry Point

Entry Point AI is a platform that helps you fine-tune AI models without writing a single line of code.

You can start for free, no credit card required.

Simply open the app and log in:

Entry Point AI Open App

You'll land on your Dashboard.

Create a new project by clicking the (+) button.

Create New Project

You'll then get to select from a few presets.

Click on them to get familiar with typical project formats and examples.

Fine-tuning Presets

Whenever you're uploading custom data, though, I suggest going with the Blank blueprint.

We'll name our project "AI Clerk".

After clicking "Create", you'll land on the Project Overview.

Name Fine-tuning Project

To navigate to the Data Import page either click "Import", or "Import CSV":

Import CSV for Fine-tuning

Now click "Choose .csv file":

Import CSV Training Dataset

Keep in mind that on the free plan, you can have a max of 50 examples in your organization at any time. According to OpenAI, that’s enough to start seeing clear improvements versus prompting alone.

Next upload the CSV of the Google Sheet I've shared with you before.

(Here's how to download your Google Sheet as a CSV:)

Download CSV

After your CSV is in Entry Point, select which columns in the Google Sheet should go to the prompt or completion:

Map CSV Columns to Prompt and Completion Fields

This is essentially choosing your input variables and then what kind of output you expect to get back. In our case, we're giving the AI three shopping cart items and getting back a recommended item for our hypothetical customer, along with a rationalization for why that item would be good.

Note: Entry Point uses the term 'fields' instead of 'columns'. For all intents and purposes, they are the same.

Finally, you can dedicate a percentage of your examples to be Validation Examples.

Allocate Validation Examples to Dataset

Entry Point won't include these in your training data.

Instead, validation examples are used to automatically test your models after they're fine-tuned.

Click "Finish" and the import will start. Wait a few seconds and you should see your example count go up in the sidebar.

Example Count in Sidebar

There we go — our examples:

Training Examples for Helpful AI Clerk

3. Format examples in bulk using the Prompt Template

Now we're going to prepare the example format for fine-tuning.

Open the Templates page.

Templates Page

You should see the default template that was created when you imported the data.

Essentially, the 'Shopping Cart' column is on the left (in the Prompt), and the 'Suggested Item' and 'Reasoning' columns are on the right (in the Completion).

The variables with double curly braces will be replaced with the row values from the Google Sheet. You can do whatever you want with this. Wrap them inside additional instructions…. Leave them blank, as is… Add little tags that help the model understand your task better (as I have in my Completion:)

LLM Prompt and Completion Templates

Note: We all know GPT-3.5 Turbo has a chat interface, with alternating "User" and "Assistant" inputs and outputs. Yet the templates in Entry Point are tagged as "Prompt" and "Completion". This is the same. The "User" is the "Prompt", and the "Assistant" is the "Completion".

(The system message is left blank for now, although we'll add that option soon.)

Now, I'd like my model to keep the output text a bit more separated. So I'm going to add an empty line between my 'Suggested Item' and my 'Reasoning'.

Line break in completion

After I press save, all my examples get updated.

Examples Updated from Template

And if you want to edit individual examples, you can hover the example and click the pen icon.

Example Edit Icon

Then change the details as you wish.

Edit Training Example

In conclusion, Entry Point Templates eliminate the need to format your data using Python (phew).

4. Connect Entry Point to OpenAI

We’re fine-tuning GPT-3.5 Turbo, which is a model made and hosted by OpenAI.

Meaning, we need to connect Entry Point to OpenAI.

To do this, open the Integrations page in Entry Point from the top navigation bar and click on OpenAI.

OpenAI Fine-tuning Integration

Now we need to get an API key from OpenAI.

Go to this page:

Then click 'Create new secret key'...

Create OpenAI Secret Key

And name it something like “Entry Point AI” so you remember where you’re using it.

Create new OpenAI key for Entry Point integration

Once you have it, copy it.

Copy OpenAI Secret Key

And paste it into Entry Point here:

Paste API Key into Entry Point

5. Synthesize more examples (if you want)

Now that we’re connected to OpenAI, we can fine-tune a model.

Sometimes though, we don't have enough examples to create a high-quality model.

Because creating new training examples by hand is a drag, we created Data Synthesis. This feature let's you expand your dataset automatically using AI.

Let’s take a quick peek at Data Synthesis.

Synthesize Data in Entry Point AI

You can choose which model you want to use to generate examples, add Alignment text to steer the kind of examples you want it to generate, and set how many to produce at a time.

Synthesis Settings

You can even have it automatically save the examples, or manually add the best ones.

Add Synthetic Data to LLM Fine-tuning Dataset

When you add a new example, you can edit it first to ensure it meets your standard of quality.

Save Synthetic Training Example

6. Run the fine-tune

Okay, we already have plenty of data for our Helpful AI Clerk model. Let’s get back to fine-tuning GPT-3.5.

Go to Fine-tunes and click the plus button.

Start GPT-3.5 Fine-tune

On the 'Start a fine-tune' page, you can select a base model.

Choose GPT-3.5 Turbo.

GPT-3.5 Turbo Fine-tuning

You can press "Show Advanced" to view and edit hyperparameters for the fine-tuning job.

The only hyperparameter available for GPT-3.5 Turbo at its time of release is N Epochs. If this is your first time fine-tuning, just leave the default. You can always learn more about hyperparameters and play with them later.

GPT-3.5 Hyperparameters

Note the estimated time to complete the fine-tuning job, because this can vary from a few minutes to several hours depending on how smoothly things are going at OpenAI and how many fine-tuning jobs are backed up in the queue.

Press Start and watch the magic happen.

Fine-tuning Job Status: Started

Entry Point will show your fine-tuning job as “Preparing” initially. This is where it’s writing the JSONL files (both training and validation datasets, if applicable) and uploading them to OpenAI. Once the training data is uploaded to OpenAI, it will show the job as “Started.” That means it’s in the queue.

GPT-3.5 Turbo Fine-tune Completed

You’ll receive an email when your fine-tune is ready and the status will update to "Completed."

Congratulations, you successfully fine-tuned GPT-3.5!

7. Test the fine-tune

Now you can use your model in a variety of ways, including directly from the OpenAI playground.

Fine-tuned GPT-3.5 Model in OpenAI Playground

Entry Point also has a playground designed to work perfectly with your fine-tuned model that lets you leverage the structure of your fields and templates to input data more easily and with less room for formatting errors.