https://github.com/openai/openai-python/blob/main/chatml.md

openai/openai-python, chatml.md, created by @logankilpatrick (#238)
Latest commit 75c90a7, Mar 1, 2023.

Traditionally, GPT models consumed unstructured text. ChatGPT models instead expect a structured format, called Chat Markup Language (ChatML for short). ChatML documents consist of a sequence of messages. Each message contains a header (which today consists of who said it, but in the future will contain other metadata) and contents (which today is a text payload, but in the future will contain other datatypes). We are still evolving ChatML, but the current version (ChatML v0) can be represented with our upcoming "list of dicts" JSON format as follows:

```json
[
 {"token": "<|im_start|>"},
 "system\nYou are ChatGPT, a large language model trained by OpenAI. Answer as concisely as possible.\nKnowledge cutoff: 2021-09-01\nCurrent date: 2023-03-01",
 {"token": "<|im_end|>"}, "\n",
 {"token": "<|im_start|>"},
 "user\nHow are you",
 {"token": "<|im_end|>"}, "\n",
 {"token": "<|im_start|>"},
 "assistant\nI am doing well!",
 {"token": "<|im_end|>"}, "\n",
 {"token": "<|im_start|>"},
 "user\nHow are you now?",
 {"token": "<|im_end|>"}, "\n"
]
```

You could also represent it in the classic "unsafe raw string" format. Note that this format inherently allows injections from user input containing special-token syntax, similar to SQL injection:

```
<|im_start|>system
You are ChatGPT, a large language model trained by OpenAI. Answer as concisely as possible.
Knowledge cutoff: 2021-09-01
Current date: 2023-03-01<|im_end|>
<|im_start|>user
How are you<|im_end|>
<|im_start|>assistant
I am doing well!<|im_end|>
<|im_start|>user
How are you now?<|im_end|>
```

## Non-chat use-cases

ChatML can be applied to classic GPT use-cases that are not traditionally thought of as chat.
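The two representations above carry the same information, and going from the structured form to the raw string is a mechanical serialization. A minimal Python sketch of that rendering, assuming the `<|im_start|>`/`<|im_end|>` tokens shown in this document (the helper name and the `(role, content)` tuple structure are illustrative, not part of the library):

```python
# Render a sequence of (role, content) messages into the raw ChatML string
# format described above. The special tokens come from this document; the
# function itself is an illustrative sketch, not an official API.

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def render_chatml(messages):
    """Serialize (role, content) pairs to the raw-string ChatML format."""
    parts = []
    for role, content in messages:
        # Each message: <|im_start|>{role}\n{content}<|im_end|>\n
        parts.append(f"{IM_START}{role}\n{content}{IM_END}\n")
    return "".join(parts)


conversation = [
    ("system", "You are ChatGPT, a large language model trained by OpenAI. "
               "Answer as concisely as possible."),
    ("user", "How are you"),
    ("assistant", "I am doing well!"),
]
print(render_chatml(conversation))
```

Note that this naive rendering is exactly why the raw format is "unsafe": if `content` comes from an end user and contains `<|im_end|>`, it would be serialized verbatim and could forge message boundaries.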
For example, instruction following (where a user asks the AI to complete an instruction) can be implemented as a ChatML query like the following:

```json
[
 {"token": "<|im_start|>"},
 "user\nList off some good ideas:",
 {"token": "<|im_end|>"}, "\n",
 {"token": "<|im_start|>"},
 "assistant"
]
```

We do not currently allow autocompleting of partial messages, so a query like the following is not supported:

```json
[
 {"token": "<|im_start|>"},
 "system\nPlease autocomplete the user's message.",
 {"token": "<|im_end|>"}, "\n",
 {"token": "<|im_start|>"},
 "user\nThis morning I decided to eat a giant"
]
```

Note that ChatML makes explicit to the model the source of each piece of text, and particularly shows the boundary between human and AI text. This gives an opportunity to mitigate and eventually solve injections, as the model can tell which instructions come from the developer, the user, or its own input.

## Few-shot prompting

In general, we recommend adding few-shot examples using separate system messages with a `name` field of `example_user` or `example_assistant`. For example, here is a 1-shot prompt:

```
<|im_start|>system
Translate from English to French
<|im_end|>
<|im_start|>system name=example_user
How are you?
<|im_end|>
<|im_start|>system name=example_assistant
Comment allez-vous?
<|im_end|>
<|im_start|>user
{{user input here}}<|im_end|>
```

If adding instructions in the system message doesn't work, you can also try putting them into a user message. (In the near future, we will train our models to be much more steerable via the system message. But to date, we have trained only on a few system messages, so the models pay much more attention to user examples.)
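The same 1-shot prompt can be expressed in the "list of dicts" messages format that the chat completions API in this repository accepts, using the `example_user`/`example_assistant` names recommended above. The final user message here is a stand-in for the `{{user input here}}` placeholder, and the model name in the commented call is illustrative:

```python
# Few-shot prompting via system messages with a "name" field, expressed as
# the messages list passed to the chat completions API. The roles and names
# follow the recommendation in this document; the concrete user input and
# model name are illustrative.

few_shot_messages = [
    {"role": "system", "content": "Translate from English to French"},
    {"role": "system", "name": "example_user",
     "content": "How are you?"},
    {"role": "system", "name": "example_assistant",
     "content": "Comment allez-vous?"},
    # Stand-in for the {{user input here}} placeholder:
    {"role": "user", "content": "Good morning!"},
]

# With the openai package, this list would be passed as, e.g.:
# response = openai.ChatCompletion.create(
#     model="gpt-3.5-turbo",
#     messages=few_shot_messages,
# )
```

Keeping the examples in `system` messages (rather than fake `user`/`assistant` turns) makes clear to the model that they are demonstrations supplied by the developer, not part of the live conversation.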