What is ChatGPT? Understand ChatGPT in one article
What is ChatGPT? Recently, OpenAI released ChatGPT, a model that interacts through dialogue. Thanks to its intelligence, it has been welcomed by many users. ChatGPT is also a close relative of OpenAI's previously released InstructGPT.
The ChatGPT model was trained with RLHF (reinforcement learning from human feedback). Perhaps the arrival of ChatGPT is also the prelude to the official launch of OpenAI's GPT-4. What is GPT? From GPT-1 to GPT-3
The Generative Pre-trained Transformer (GPT) is a deep-learning text-generation model trained on data available on the internet. It is used for question answering, text summarization, machine translation, classification, code generation, and dialogue. GPT-1 was born in 2018, which was also the first year of pre-trained models in NLP (natural language processing).
In terms of performance, GPT-1 has a certain generalization ability and can be used on NLP tasks beyond the ones it was supervised on. Its common tasks include:
- Natural language inference: judging the relationship between two sentences (entailment, contradiction, or neutral)
- Question answering and commonsense reasoning: given an article and several candidate answers, output the likelihood of each answer
- Semantic similarity: judging whether two sentences are semantically related
- Classification: deciding which category the input text belongs to
Although GPT-1 performs reasonably on tasks it was not fine-tuned for, its generalization ability is far below that of fine-tuned supervised models, so GPT-1 can only be considered a fairly good language-understanding tool rather than a conversational AI. GPT-2 arrived as scheduled in 2019. However, GPT-2 did not change the original network design much; it simply used more parameters and a larger dataset: the largest model has 48 layers and 1.5 billion parameters, and the learning objective is to apply the unsupervised pre-trained model directly to supervised tasks.
In terms of performance, beyond its comprehension ability, GPT-2 demonstrated for the first time a powerful talent for generating content: reading summaries, chatting, continuing a text, making up stories, and even generating fake news, phishing emails, or role-playing online, all without hesitation. After "getting bigger," GPT-2 indeed showed general and powerful capabilities, and it achieved the best performance at the time on several specific language-modeling tasks.
Later, GPT-3 came into being. As an unsupervised model (now often called a self-supervised model), it can complete almost all natural language processing tasks, such as search-style question answering, reading comprehension, semantic inference, machine translation, article generation, and automatic question answering. The model also performs well on many of these tasks: for example, on French-English and German-English machine translation it reached the best level at the time, and its automatically generated articles are nearly indistinguishable from human writing (humans identified them correctly only 52% of the time, roughly equivalent to random guessing). Even more surprising, it achieved almost 100% accuracy on two-digit addition and subtraction, and it can even generate code from a task description.
A multi-purpose unsupervised model with good results seems to give people hope for general artificial intelligence, which is perhaps the main reason the GPT-3 model is so influential. What is the GPT-3 model, then? From a machine-learning perspective, GPT-3 is in fact a simple statistical language model. A language model models the probability distribution over word sequences: it takes the fragment seen so far as the condition and predicts the probability distribution of the word appearing at the next position.
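In symbols, a language model factors the probability of a word sequence with the chain rule and models each next-word conditional (this is standard notation, not anything specific to GPT-3):

$$P(w_1, w_2, \ldots, w_n) = \prod_{t=1}^{n} P(w_t \mid w_1, \ldots, w_{t-1})$$

GPT models each conditional $P(w_t \mid w_1, \ldots, w_{t-1})$ with a Transformer decoder that attends to all of the preceding words.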
On the one hand, a language model can measure how well a sentence conforms to the grammar of a language (for example, to judge whether a human-machine dialogue system's reply is natural and fluent); on the other hand, it can be used to generate new sentences. For example, given the fragment "It's 12:00 noon, let's go to the restaurant together," the language model can predict which words are likely to follow "the restaurant."
An ordinary language model would predict that the next word is "eat"; a powerful language model can capture the time information and predict "eat lunch," which fits the context. In general, whether a language model is powerful depends on two things. First, whether the model can use all of the historical context: in the example above, if the long-range semantic information "12 noon" cannot be captured, the language model can hardly predict the next word "lunch."
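As a concrete illustration of this next-word prediction, here is a minimal sketch using the publicly released GPT-2 model through the Hugging Face transformers library (the library choice and the English example sentence are assumptions for illustration, not from the original article):

```python
# Minimal next-word prediction demo with GPT-2
# (requires: pip install torch transformers)
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

context = "It's 12:00 noon, let's go to the restaurant and have"
inputs = tokenizer(context, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

# Probability distribution over the next token, from the last position
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)
for prob, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode(idx)!r}: p={prob.item():.3f}")
```

A model that puts most of its probability mass on "lunch" here, rather than on a generic continuation, is exactly the "powerful" model described above.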
Second, it depends on whether there is a rich enough history for the model to learn from, that is, whether the training corpus is rich enough. Since a language model is self-supervised, its optimization goal is to maximize the probability the model assigns to the text it has seen, so any text can serve as training data without labeling. Because GPT-3 is stronger, has far more parameters, and its training data covers more topics, it is clearly superior to the previous-generation GPT-2.
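The self-supervised objective mentioned here can be written as maximizing the log-likelihood of the training text with respect to the model parameters $\theta$:

$$\max_{\theta} \sum_{t=1}^{n} \log P_\theta(w_t \mid w_1, \ldots, w_{t-1})$$

Every position in every sentence supplies its own "label" (the next word), which is why no manual annotation is needed.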
As the largest dense neural network at the time, GPT-3 can turn a web-page description into the corresponding code, imitate human narration, write custom poems, generate game scripts, and even imitate deceased philosophers to speculate about the true meaning of life. GPT-3 requires no fine-tuning: for a new task, it only needs a few examples of the desired output (few-shot learning).
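Few-shot learning works by placing a handful of input-output examples directly in the prompt, with no gradient updates at all. A minimal sketch of how such a prompt is assembled (the English-to-French task mirrors the well-known example from the GPT-3 paper; the code itself is only illustrative):

```python
# Few-shot prompting: the "training" happens inside the prompt itself.
examples = [
    ("sea otter", "loutre de mer"),
    ("cheese", "fromage"),
]
query = "peppermint"

prompt = "Translate English to French.\n\n"
for english, french in examples:
    prompt += f"{english} => {french}\n"
prompt += f"{query} =>"

print(prompt)
# The assembled prompt is sent to the model as-is; GPT-3 infers the
# task pattern from the examples and completes the final line,
# with no parameter updates (few-shot learning).
```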
It can be said that GPT-3 seems to satisfy all of our imaginings of a language expert. Note: the above mainly draws on the following articles:
1. "GPT-4 is about to be released, approaching the human brain; many industry leaders can't sit still" - Xu Jiecheng, Yun Zhao - official account 51CTO Technology Stack - 2022-11-24 18:08
2. "This article answers your curiosity about GPT-3! What is GPT-3? Why is it so good?" - Zhang Jiajun, Institute of Automation, Chinese Academy of Sciences - 2020-11-11 17:25, Beijing
3. "The Batch: 329 | InstructGPT, a friendlier, gentler language model" - official account DeepLearningAI - 2022-02-07 12:30
What are the issues with GPT-3? GPT-3, however, is not perfect. One of the main current concerns about artificial intelligence is that chatbots and text-generation tools may learn from all the text on the internet regardless of quality, and so produce wrong, malicious, or even offensive output, which directly affects their downstream applications.
OpenAI has also indicated that it will release a more powerful GPT-4 in the near future:
A comparison of GPT-3, GPT-4, and the human brain (image source: Lex Fridman @ YouTube). It is said that GPT-4 will be released next year, that it will be able to pass the Turing test, and that it will be advanced to the point of being indistinguishable from humans; in addition, the cost of introducing GPT-4 to enterprises will also drop significantly.
ChatGPT and InstructGPT When it comes to ChatGPT, we have to mention its predecessor, InstructGPT. In early 2022, OpenAI released InstructGPT. In that work, compared with GPT-3, OpenAI used alignment research to train a more truthful, less harmful language model that better follows user intent: InstructGPT. InstructGPT is a fine-tuned new version of GPT-3 that minimizes harmful, untruthful, and biased outputs.
How does InstructGPT work? The developers improved GPT-3's output quality by combining supervised learning with reinforcement learning from human feedback. In this learning, humans rank the model's candidate outputs, and the reinforcement-learning algorithm rewards the model for producing outputs similar to the highly ranked ones.
The training dataset starts with prompts, some of which come from GPT-3 users' input, such as "Tell me a story about a frog" or "Explain the moon landing to a 6-year-old in a few sentences." The developers split the prompts into three sets and created responses for each set in different ways.
Human writers wrote responses to the first set of prompts. The developers fine-tuned a trained GPT-3 on these responses, turning it into InstructGPT so that it generates a response for each prompt. The next step was to train a model to give higher rewards to better responses: for the second set of prompts, the fine-tuned model generated multiple responses per prompt.
Given a prompt and two responses, a reward model (another pre-trained GPT-3) learned to compute higher rewards for highly rated responses and lower rewards for poorly rated ones. The developers then used the third set of prompts and a reinforcement-learning method, Proximal Policy Optimization (PPO), to further fine-tune the language model.
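The reward model's training signal can be written as a pairwise ranking loss; the form below is the one used in the InstructGPT paper, where $y_w$ is the response the labeler preferred over $y_l$ for prompt $x$, $r_\theta$ is the scalar reward, and $\sigma$ is the sigmoid:

$$\mathcal{L}(\theta) = -\,\mathbb{E}_{(x,\,y_w,\,y_l)}\left[\log \sigma\big(r_\theta(x, y_w) - r_\theta(x, y_l)\big)\right]$$

Minimizing this loss pushes the reward of the preferred response above the reward of the rejected one.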
After a prompt is given, the language model generates a response, and the reward model assigns it a reward; PPO then uses this reward to update the language model. (Reference: The Batch: 329 | InstructGPT, a friendlier, gentler language model - official account DeepLearningAI - 2022-02-07 12:30)
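As background on PPO itself, its core update maximizes a clipped surrogate objective (this is the general form from the original PPO paper, not something specific to InstructGPT; in the RLHF setting the policy is the language model, the actions are generated tokens, the rewards come from the reward model, and in practice a KL penalty keeps the policy close to the original model):

$$L^{\text{CLIP}}(\theta) = \hat{\mathbb{E}}_t\left[\min\big(r_t(\theta)\,\hat{A}_t,\ \operatorname{clip}(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon)\,\hat{A}_t\big)\right], \qquad r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\text{old}}}(a_t \mid s_t)}$$

The clipping keeps each update step from moving the language model too far from the behavior the reward model was trained to score.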
Why does this matter? The core point is that artificial intelligence needs to be responsible. OpenAI's language models can assist in education, as virtual therapists, as writing aids, in role-playing games, and more. In these fields, social bias, misinformation, and toxic content are especially troublesome, and systems that can avoid these defects are far more useful.
How do the training processes of ChatGPT and InstructGPT differ? Overall, ChatGPT, like the InstructGPT described above, is trained with RLHF (reinforcement learning from human feedback); the difference lies in how the data is set up for training (and how it is collected).
(An explanation: the earlier InstructGPT setup gave one input and one output and compared the output against the training data, so there was reward but no punishment. The ChatGPT setup gives one input and multiple outputs, and then has people rank those outputs from "most human-like" down to "nonsense"; having the model learn the way humans rank is a supervised-learning-style strategy. See the sketch below.) Thanks to Dr. Zhang Zijian for this paragraph.
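One practical payoff of collecting rankings rather than single judgments: a ranking over K outputs yields K(K-1)/2 pairwise comparisons for the reward model. A toy sketch of that bookkeeping (the function name and sample strings are made up for illustration):

```python
from itertools import combinations

def ranking_to_pairs(ranked_responses):
    """Turn one human ranking (best first) into (preferred, rejected)
    training pairs for the reward model: K responses -> K*(K-1)//2 pairs."""
    # combinations preserves order, so each pair is (better, worse)
    return list(combinations(ranked_responses, 2))

# Four model outputs for a single prompt, ordered best-to-worst by a labeler
ranking = ["response A", "response B", "response C", "response D"]
for preferred, rejected in ranking_to_pairs(ranking):
    print(f"prefer {preferred!r} over {rejected!r}")  # 6 pairs in total
```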
What are the limitations of ChatGPT? They include the following: a) In the reinforcement-learning (RL) stage of training, there is no specific source of truth or standard answer with which to answer your question. b) The trained model is more cautious and may refuse to answer (to avoid false positives). c) Supervised training may mislead or bias the model, because the ideal answer depends on what the model knows rather than on what the human demonstrator knows.
Note: ChatGPT is sensitive to wording. Sometimes the model fails to respond to one phrasing, yet a slight rewording of the question gets a correct answer. Trainers tend to prefer longer answers because they look more comprehensive, which leads to overly verbose responses. The model also tends to overuse certain phrases, and when the initial prompt or question is ambiguous, it does not appropriately ask for clarification.
Finally, if ChatGPT were open for you to use, what would you do with it?