Inside ChatGPT: How AI Chatbots Work

By now, you’ve heard about ChatGPT and its text-generating capabilities. It has passed a business school exam, baffled teachers looking to spot cheaters and helped people craft emails to co-workers and loved ones.

It accomplished these tasks remarkably well because tests, essays and emails call for correct answers. But being correct isn’t really the goal of ChatGPT – it’s a by-product of its actual goal: producing text that sounds natural.

So how do AI chatbots work, and why do they get some answers right and others really wrong? Here’s a look inside the box.

The technology behind large language models like ChatGPT is similar to the predictive text feature you see when you type a message on your phone. Your phone evaluates what has been typed and calculates the odds of what is likely to follow, based on its model and what it has observed of your previous behavior.
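The idea can be sketched in a few lines of code. This is a minimal, invented illustration of count-based next-word suggestion, not how any particular keyboard actually works:

```python
from collections import Counter, defaultdict

# Toy predictive text: learn from a tiny sample of "previous behavior"
# by counting which word tends to follow which, then suggest the most
# frequent followers. The sample text below is invented.
history = (
    "i am running late . i am on my way . "
    "i am almost there . see you on friday ."
).split()

followers = defaultdict(Counter)
for current, nxt in zip(history, history[1:]):
    followers[current][nxt] += 1

def suggest(word, k=3):
    """Return the k most frequently observed next words after `word`."""
    return [w for w, _ in followers[word].most_common(k)]

print(suggest("am"))  # the three words seen after "am" in the sample
```

A real keyboard model works with vastly more data and context than a single previous word, but the underlying question is the same: given what came before, what is most likely to come next?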


Anyone familiar with the process knows how many different directions a string of text can branch into.

Unlike the phone’s predictive text feature, ChatGPT is generative (the G in GPT). It does not make one-off predictions. Instead, it is meant to generate coherent strings of text across multiple sentences and paragraphs. The output is meant to make sense, read as if a person wrote it, and match the prompt.
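That loop (predict a word, append it, predict again) can be sketched as follows. The "model" here is a hard-coded toy standing in for a real network, so everything is invented for illustration:

```python
import random

# A sketch of the generative loop: predict a next word, append it, and
# feed the longer string back in until the text ends or a length limit
# is reached. TOY_MODEL is a stand-in for a real trained network.
TOY_MODEL = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"ran": 1.0},
    "sat": {"down": 1.0},
    "ran": {"away": 1.0},
}

def next_word_probs(words):
    # A real model conditions on the entire context; this toy only
    # looks at the most recent word.
    return TOY_MODEL.get(words[-1], {"<end>": 1.0})

def generate(prompt, max_words=5, seed=0):
    random.seed(seed)
    words = prompt.split()
    for _ in range(max_words):
        probs = next_word_probs(words)
        choices, weights = zip(*probs.items())
        word = random.choices(choices, weights=weights)[0]
        if word == "<end>":
            break
        words.append(word)
    return " ".join(words)

print(generate("the"))
```

Note that the next word is sampled from the probabilities rather than always taking the single most likely option; that is one reason the same prompt can produce different outputs.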

So what helps it pick a good next word, then another after that, and so on?

Internal reference

There is no database of facts or dictionary inside the machine to help it “understand” words. Instead, the system treats words mathematically, as sets of values. You can think of these values as representing some quality a word might have. For example, is the word complimentary or critical? Sweet or sour? Low or high?

Theoretically, you could set these values however you like and find which word you land closest to. Here’s a fanciful example to illustrate the idea: imagine a generator designed to return a different fruit based on three attributes. Changing any of the attributes changes which fruit comes out.
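In code, such a generator amounts to a nearest-neighbor lookup in attribute space. The fruits, attribute names and numbers below are all invented for illustration:

```python
import math

# Each fruit is a point in a three-attribute space. Given any attribute
# settings, return the fruit whose values are closest.
FRUITS = {
    # (sweetness, sourness, size), each on a 0-1 scale; values invented
    "lemon":      (0.2, 0.9, 0.3),
    "strawberry": (0.8, 0.3, 0.2),
    "watermelon": (0.7, 0.1, 1.0),
    "grapefruit": (0.4, 0.7, 0.6),
}

def nearest_fruit(sweetness, sourness, size):
    query = (sweetness, sourness, size)
    # Pick the fruit at the smallest Euclidean distance from the query.
    return min(FRUITS, key=lambda f: math.dist(FRUITS[f], query))

print(nearest_fruit(0.9, 0.2, 0.25))  # closest to strawberry's values
```

Slide any attribute far enough and the nearest point, and therefore the output word, changes.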


This technique, called word embedding, is not new. It originated in the field of linguistics in the 1950s. While the example above uses only three “adjectives,” in a large language model the number of “adjectives” per word would be in the hundreds, allowing for a very precise way of identifying words.

Learning to make meaning

When a model is new, the adjectives associated with each word are assigned at random, which is not very useful, because the model’s predictive power depends on their being finely tuned. To get there, it must be trained on a lot of content. That is the “large” part of large language model.

A system like ChatGPT may be fed millions of web pages and digital documents. (Think of the entirety of Wikipedia, big news sites, blogs and digitized books.) The machine works through the training data one sequence at a time, blanking out a word and making a “guess” at the values that most closely represent what belongs in the blank. When the correct answer is revealed, the machine can use the difference between what it guessed and the actual word to improve.
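The nudge-toward-the-answer idea can be shown in miniature. This toy adjusts a single three-value "word" rather than billions of parameters, and all the numbers are invented:

```python
# A toy fill-in-the-blank training step: the model makes a guess, the
# answer is revealed, and the difference (the error) is used to nudge
# the guess toward the answer. Repeat many times over many examples.
target = [0.8, 0.3, 0.2]   # values representing the hidden word
guess = [0.0, 0.0, 0.0]    # the untrained model's starting guess
learning_rate = 0.5        # how far to move toward the answer each step

for step in range(10):
    # Error = difference between the actual word's values and the guess.
    errors = [t - g for t, g in zip(target, guess)]
    # Nudge each value a fraction of the way toward the answer.
    guess = [g + learning_rate * e for g, e in zip(guess, errors)]

print([round(g, 3) for g in guess])  # now very close to [0.8, 0.3, 0.2]
```

Real training distributes this kind of correction across every parameter in the network, for every blanked-out word, across billions of examples, which is why it takes so much computing power.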

It’s a long process. OpenAI, the company behind ChatGPT, hasn’t released details about how much training data was fed into ChatGPT or the computing power used to train it, but researchers from Nvidia, Stanford University and Microsoft estimated that, using 1,024 GPUs, it would have taken 34 days to train GPT-3, ChatGPT’s predecessor. One analyst estimated that the cost of the computational resources to train and run large language models could run into the millions.

ChatGPT also has an additional layer of training, referred to as reinforcement learning from human feedback. Whereas the earlier training is about getting the model to fill in missing text, this stage is about getting it to put together strings that are coherent, accurate and conversational.

During this phase, people evaluate the machine’s responses, flagging output that is incorrect, unhelpful or even downright nonsensical. Using that feedback, the machine learns to predict whether humans will find its responses useful. OpenAI says this training makes its model’s output safer, more relevant and less likely to distort facts. This, the researchers say, is what helps ChatGPT’s responses better align with human expectations.
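A heavily simplified sketch of that feedback loop, with invented responses and a scoring rule far cruder than real reinforcement learning from human feedback:

```python
# Humans label responses as helpful or not; the system adjusts scores
# so it can predict those judgments. Real RLHF trains a separate reward
# model and then fine-tunes the chatbot against it; this toy only shows
# the direction of the feedback. All data here is invented.
feedback = [
    ("Here is a clear, polite answer.", True),     # human: helpful
    ("As an AI I cannot ever help you.", False),   # human: unhelpful
    ("Sure! Step 1 is to draft an outline.", True),
]

scores = {response: 0.0 for response, _ in feedback}

for _ in range(20):  # repeated passes over the human judgments
    for response, helpful in feedback:
        # Nudge each response's score toward the human's verdict.
        scores[response] += 0.1 if helpful else -0.1

def predicts_helpful(response):
    """The system's learned prediction of whether humans like this."""
    return scores.get(response, 0.0) > 0

print({r: round(s, 1) for r, s in scores.items()})
```

The important point is that nothing here checks facts: the system only learns what kinds of responses people rate well.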

At the end of the process, no record of the original training data remains inside the model. It contains no facts or quotes to point to – only how related or unrelated words are to one another in writing.

Putting the training to use

This set of values turns out to be surprisingly powerful. When you type a query into ChatGPT, it translates everything into numbers using what it learned during training. Then it runs the same series of arithmetic operations described above to predict the next word in its response. This time, there is no hidden word to reveal; it just predicts.

Thanks to its ability to refer back to earlier parts of a conversation, it can keep this up page after page, producing realistic, human-sounding text that is sometimes, but not always, true.
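That "memory" is simpler than it sounds: on each turn, the whole transcript so far is fed back in as context. Here is a toy sketch, with a hypothetical model_reply function standing in for the real model:

```python
# A sketch of conversational context: there is no separate memory
# module; earlier turns simply ride along as part of the input.
def model_reply(context: str) -> str:
    # Hypothetical stand-in: a real model would generate word by word
    # from the full context. This toy just keys off one word.
    return "My name is Bot." if "name" in context.lower() else "Hello!"

transcript = []

def chat(user_message):
    transcript.append(f"User: {user_message}")
    context = "\n".join(transcript)   # every earlier turn is included
    reply = model_reply(context)
    transcript.append(f"Assistant: {reply}")
    return reply

chat("Hi there!")                 # no mention of "name" yet
print(chat("What's your name?"))  # context now contains "name"
```

Because the transcript grows with every turn, the model can refer back to anything said earlier – up to the fixed amount of context it can hold.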



At this point, there is a lot of disagreement about what AI is or what it will be capable of, but one thing is widely agreed upon – and featured prominently on the interfaces of ChatGPT, Google Bard and Microsoft Bing: these tools should not be relied upon when precision is required.

Large language models learn to recognize styles of text, not facts. Many models, including ChatGPT, have knowledge cutoff dates, which means they can’t go online to learn new information. (This is unlike Microsoft’s Bing chatbot, which can query online resources.)

A large language model is also only as good as the material used to train it. Because the models identify patterns between words, feeding an AI dangerous or racist text means the AI will learn dangerous or racist text patterns.

OpenAI says it has built some guardrails to prevent ChatGPT from serving up that kind of content, and ChatGPT says it is “trained to reject inappropriate requests,” as we discovered when it refused to write an angry email demanding a raise. But the company also acknowledges that ChatGPT will still occasionally “respond to malicious instructions or display biased behaviour.”


There are many useful ways to take advantage of technology now, such as crafting cover letters, summarizing meetings, or planning meals. The big question is whether improvements in the technology can overcome some of its shortcomings, and enable it to create truly authoritative text.


Drawings by Joila Karman. In the “Pride and Prejudice” graphic, Google Bard, OpenAI GPT-1 and ChatGPT were given the prompt “Please sum up Pride and Prejudice by Jane Austen in one sentence.” BigScience Bloom was asked to finish the sentence “On Pride and Prejudice, Jane Austen.” All responses were collected on May 11, 2023. In the email graphic, OpenAI ChatGPT was given the directions: “Write a positive email asking for a raise,” “Write a neutral email asking for a raise,” “Write an upset email asking for a raise,” “Write an angry email asking for a raise.” All responses were collected May 8, 2023.

