OpenAI has created a less toxic version of GPT-3

OpenAI has created a new version of the GPT-3 language model that produces less offensive language, less misinformation and fewer errors overall, using techniques from AI alignment research.

To create the model, called InstructGPT, the researchers used reinforcement learning from human feedback. They hired 40 people to rate GPT-3 responses to a range of pre-written prompts, such as "Write a story about a wise frog named Julius" or "Write a creative ad for the following product to run on Facebook."

Responses that the evaluators judged to be more in line with the apparent intent of the prompt writer received higher scores. Responses that were insulting, violent or otherwise inappropriate were marked down.

The developers used the evaluators' feedback as the reward signal in a reinforcement learning algorithm that trained InstructGPT to produce responses matching the intent of the prompt.
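The idea of using human ratings as a reward can be illustrated with a toy sketch. It is not OpenAI's training code: the three canned responses, their scores, and the function names are hypothetical, and a simple REINFORCE update on a tiny softmax policy stands in for whatever algorithm the researchers actually used.

```python
import math
import random


def softmax(logits):
    """Convert raw preference scores (logits) into probabilities."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_policy(rewards, steps=2000, lr=0.1, seed=0):
    """Toy REINFORCE loop: the policy picks one canned response per step
    and is nudged toward responses that earned a higher human score."""
    rng = random.Random(seed)
    logits = [0.0] * len(rewards)
    baseline = 0.0  # running average reward, used to reduce variance
    for _ in range(steps):
        probs = softmax(logits)
        choice = rng.choices(range(len(rewards)), weights=probs)[0]
        reward = rewards[choice]
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        # Gradient of log pi(choice) w.r.t. the logits is one_hot(choice) - probs.
        for i in range(len(logits)):
            grad = (1.0 if i == choice else 0.0) - probs[i]
            logits[i] += lr * advantage * grad
    return softmax(logits)


# Hypothetical evaluator scores for three candidate responses to one prompt:
# index 0 follows the prompt's intent, 1 is off-topic, 2 is inappropriate.
human_scores = [0.9, 0.2, 0.1]
final_probs = train_policy(human_scores)
print([round(p, 3) for p in final_probs])
```

After training, most of the probability mass should sit on the response the evaluators scored highest. A real system would update the weights of a full language model rather than three numbers, but the feedback loop has the same shape.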

OpenAI found that users preferred InstructGPT's responses to GPT-3's more than 70% of the time.

The researchers also compared versions of the new model at different sizes. They found that responses from an InstructGPT model with 1.3 billion parameters were preferred over those from the GPT-3 model with 175 billion parameters. This suggests that alignment work may be an easy way to improve language models, rather than just increasing their size, the organization said.

"This is the first time these alignment techniques have been applied to a real product," said Jan Leike, one of the leaders of the alignment team at OpenAI.

However, the researchers said, InstructGPT still makes simple mistakes and sometimes gives inappropriate or meaningless answers. For example, if it is given a prompt that contains a falsehood, it will treat that falsehood as true.

OpenAI has made InstructGPT the default model for API users. GPT-3 is still available, but the organization does not recommend using it.

OpenAI has previously tried to mitigate the bias and toxicity of the base model. While progress has been made, the developers have acknowledged a number of unresolved issues and broader problems in adapting GPT-3 for use in society.

In November 2021, OpenAI trained a language model to solve math problems.

In September 2021, the lab's researchers taught GPT-3 to generate short summaries of fiction books.

