What is ChatGPT: An Easy Explanation for Non-Techies
For an ordinary user, ChatGPT is simply a website where you can chat with a virtual bot about all kinds of topics.
ChatGPT is currently one of the hottest keywords on social networks, yet not everyone clearly understands the nature of this AI program. Below, VietNamNet Newspaper presents an article by security expert Nguyen Hong Phuc that explains ChatGPT in simple terms for readers who are not familiar with technology.
A Simple Understanding of ChatGPT
For an ordinary user, ChatGPT is simply a website where you can chat with a virtual bot about all kinds of topics.
This bot was created by OpenAI, a company co-founded in 2015 by Elon Musk, Sam Altman, and others, with the initial mission of "preventing the dangers of AI".
How is ChatGPT created?
ChatGPT is an artificial intelligence computer program. Technically, it is usually called an AI model (in Vietnamese, "artificial intelligence data model"), but since it is ultimately digital data running on a computer, calling it a program is not wrong.
The term "AI model" has two parts: "model" (a data model) and "AI" (artificial intelligence). The idea is that intelligence comes from data: pour in more data, and more intelligence emerges.
Creating an AI model involves the following steps: data collection, data selection, data labeling, and training.
At its most basic, teaching an AI is easy, like this dialogue:
Question: What is your name?
Answer: My name is ChatGPT
Question: What is VietNamNet?
Answer: VietNamNet is an electronic newspaper in Vietnam.
We then teach the AI to remember this information (training) and save its memorized "brain" as an AI model (a model checkpoint). Later, to use it, we load that brain, with the memory containing the above information, into a computer (inference); ask the corresponding question and the AI recalls the knowledge it was taught and answers "exactly what it was taught".
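The "rote learning" just described can be sketched in a few lines of code. This is only a toy illustration of the train/checkpoint/inference cycle, not how real AI models work internally; here the "brain" is nothing more than a lookup table saved and reloaded like a checkpoint:

```python
# A toy illustration of "rote learning": the model is just a lookup table
# of question-answer pairs, saved and reloaded like a model checkpoint.

import json

# Step 1: "training" -- memorize the labeled question-answer pairs.
training_pairs = {
    "What is your name?": "My name is ChatGPT",
    "What is VietNamNet?": "VietNamNet is an electronic newspaper in Vietnam.",
}

# Step 2: save the memorized "brain" to disk format (like a checkpoint).
checkpoint = json.dumps(training_pairs)

# Step 3: "inference" -- load the checkpoint back and answer from memory.
brain = json.loads(checkpoint)

def answer(question: str) -> str:
    # The toy model can only repeat exactly what it was taught.
    return brain.get(question, "I was never taught that.")

print(answer("What is your name?"))  # My name is ChatGPT
print(answer("What is the moon?"))   # I was never taught that.
```

Ask it anything outside its training pairs and it has nothing to say, which is exactly the limitation of memorization-style training that the Transformer later overcame.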
In fact, over the past decades AI has been specialized for many specific tasks, such as AI to support aircraft design, AI to simulate combat, AI in games... but almost no large company invested in AI for language. It was not until 2017 that a technological breakthrough made AI training dramatically more effective, especially for language AI.
Language, specifically writing, is the achievement that created human civilization. Humans store their knowledge in writing, so to understand language (writing) is to understand human knowledge. This is the core insight behind language AI. Before 2017, it was very difficult to make computers understand the meaning of a sentence.
So what happened in 2017?
In August 2017, scientists at Google Brain, Google's AI research unit founded in 2011, introduced an algorithm called the Transformer (the name echoes the famous Transformers robot movies).
The Transformer algorithm was a breakthrough, specifically in language AI training. Before it, if humans wanted to teach an AI, they had to create a training dataset of question-answer pairs (labeled data), as above, and the machine really only memorized those pairs without "understanding" the meaning of the sentences: the huge difference between rote learning and comprehension.
Even more simply: after 2017, we just pour in as much text data as possible, and the computer automatically figures out what that data means, instead of us having to spell the meaning out.
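The idea of learning from raw, unlabeled text can be illustrated with a toy sketch. This is a drastic simplification (a real Transformer learns vastly richer patterns than word-pair counts), but it shows how patterns emerge from text alone, with no question-answer labels:

```python
# A minimal sketch of self-supervised learning from raw text: just by
# counting which word follows which, the program picks up patterns on
# its own -- no labeled question-answer pairs needed.

from collections import Counter, defaultdict

corpus = (
    "vietnam is a country in southeast asia . "
    "vietnam is a country with a long history . "
    "hanoi is the capital of vietnam ."
)

# "Training": for every word, count which words tend to follow it.
next_word_counts = defaultdict(Counter)
words = corpus.split()
for current, following in zip(words, words[1:]):
    next_word_counts[current][following] += 1

# The program has "figured out", unaided, that "vietnam" is usually
# followed by "is".
print(next_word_counts["vietnam"].most_common(1))  # [('is', 2)]
```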
An excerpt from Google's Transformer announcement reads: "With transformers, computers can see the same patterns humans see."
Google generously released the detailed documentation of the Transformer algorithm for everyone to access, and made its implementation open source. The entire AI research community benefited from Google's invention, among them OpenAI, a company founded in 2015 that had no notable achievements until after 2017.
Within months of Google's Transformer announcement, the first language AIs based on the new algorithm appeared in droves. In June 2018, OpenAI released its first Transformer-based AI, GPT-1, applying the algorithm very quickly, faster than Google itself.
GPT stands for Generative Pre-trained Transformer, that is, a pre-trained Transformer program that generates text.
GPT was created with one main purpose: "generating words". You play a word-continuation game with it: you write a sentence, it reads that sentence and then, based on the knowledge stored in its memory, "generates words" to continue what you wrote.
For example:
You entered: Vietnam is
ChatGPT: Vietnam is a country located in Southeast Asia...
This is what seems "magical": you send ChatGPT a sentence, and it sends a sentence back. In fact, it is not answering you; it is playing the word-continuation game, "generating words" to continue the meaning of the sentence you typed into the chat.
GPT-1 was the first generation of what became ChatGPT. It was a fairly small AI, small in both size and complexity.
In the world of language AI, complexity, which corresponds to the "intelligence" of the AI, is measured by the number of parameters. Roughly speaking, this reflects how many layers of meaning the AI can extract from the texts used to teach it.
To get answers like these, scientists at OpenAI collected a huge amount of human-written text.
To train GPT, scientists at OpenAI collected a large volume of human-written text, mostly from Wikipedia, encyclopedias, and major public newspapers: on the order of hundreds of gigabytes and hundreds of millions of documents. After collecting it, they cleaned and filtered the content, then fed the texts to the AI to read many, many times. Each time it read the block of data, it found another layer of meaning behind the words; the more passes, the more layers of meaning.
But training AIs to such a deep understanding of human written language leads to a very serious problem that, to date, no AI scientist has solved: judging "true" versus "false". AI cannot understand what is "true" or "false".
An AI can see many layers of meaning in a sentence but cannot "understand whether that meaning is right or wrong", because right and wrong are relative; for humans they are fragile and controversial, even the cause of fights.
Moreover, the huge amount of text that OpenAI's scientists collect to train the AI is not all biased toward what is "correct" by human social norms, because the volume of data is far beyond their ability to filter.
For example, they may collect texts saying the earth is round, and also texts saying the earth is flat. The data contains both true and false information. When the AI reads and re-reads those texts to find layers of meaning, it finds the "true" and the "false" meanings alike, but it has no consciousness with which to recognize which information is true and which is false. The AI simply remembers everything, and when asked later, it simply answers from its memory, without distinguishing right from wrong.
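This memory-without-truth problem can be illustrated with a toy sketch (a hypothetical example, not OpenAI's actual pipeline): a program that counts every claim it reads stores "round" and "flat" side by side, with no notion of which one is true:

```python
# A toy illustration of memory without truth: the program stores every
# statement it reads, true or false alike, with no way to tell them apart.

from collections import defaultdict

documents = [
    "the earth is round",
    "the earth is round",
    "the earth is flat",
]

# "Training": remember every completion of "the earth is ..." ever seen.
memory = defaultdict(int)
for doc in documents:
    claim = doc.split()[-1]
    memory[claim] += 1

# Both claims sit side by side in memory; the program has counts, not truth.
print(dict(memory))  # {'round': 2, 'flat': 1}
```

The program can report that "round" appeared more often, but frequency is not truth: had the false claim dominated the collected texts, it would be remembered just as faithfully.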
Companies like Google, Facebook, IBM, and Microsoft have repeatedly announced breakthrough language AIs for answering questions, only to quickly pull them down; you can find articles about this in major newspapers. Mostly it was because the AI answered some questions with a bias toward a "wrong" meaning that is unacceptable by current human social standards: respect for gender, religion, and ethnicity, the accuracy of events that have happened, truths that humans have agreed are true...
Large companies adhere to standards of information accuracy; assessing that AI cannot yet solve the problem of perceiving right and wrong, they decided it was best not to go public.
GPT-3 is the same: it also produces passages that violate human standards of right and wrong, sometimes wrong to the point of being unacceptable.
GPT-3 was on its way to becoming popular when the Covid-19 pandemic broke out globally. As the situation grew more tense from mid-2020, the stream of pandemic news overwhelmed news about GPT-3.
GPT-3 and OpenAI were forgotten by the public until the end of 2022, when OpenAI decided to run a marketing program to see whether it could revive interest in language AI.
So they adapted GPT-3 into ChatGPT, making it easier to use. Instead of a website where people type in text, adjust parameters, and get back a paragraph, ChatGPT comes as a chat program: a chat box where you enter a question, and the AI plays the word-generation game on it, replying in the form of an answer.
To summarize ChatGPT's success formula over the past month: a language AI trained deeply enough to generate meaningful sentences convincing to readers + the unethical streak of an AI technology company + a suitable UI/UX (chat) = ChatGPT.
AI can see many layers of meaning in a sentence, but cannot "understand whether that meaning is right or wrong".
(Expert Nguyen Hong Phuc)