Microsoft open-sources DeepSpeed-Chat: train ChatGPT-like 100-billion-parameter models faster and at up to 15x lower cost - Development details

After leveraging OpenAI's GPT-4 to bring ChatGPT-like functionality to Bing Chat, Bing Image Creator, Microsoft 365 Copilot, Azure OpenAI Service, and GitHub Copilot X, Microsoft has now announced DeepSpeed-Chat, a low-cost open-source solution for RLHF training built on Microsoft's open-source deep learning optimization library DeepSpeed. The company claims that anyone can create a high-quality ChatGPT-style model, even with just a single GPU.

The company said that despite the great efforts of the open-source community, there is still no large-scale system that supports end-to-end reinforcement learning from human feedback (RLHF), which makes it difficult to train a powerful ChatGPT-like model. ChatGPT is trained with the RLHF method from the InstructGPT paper, which differs substantially from the usual pre-training and fine-tuning of large language models, so existing deep learning systems face various limitations when training ChatGPT-like models. To make such models accessible to ordinary data scientists and researchers, and to truly popularize RLHF training in the AI community, Microsoft released DeepSpeed-Chat.

DeepSpeed-Chat has the following three core functions:

  • Simplified training and enhanced inference for ChatGPT-style models: a single script covers the whole workflow, from taking a Hugging Face pre-trained model, through running all three steps of InstructGPT-style training with the DeepSpeed-RLHF system, to producing your own ChatGPT-like model. An easy-to-use inference API is also provided so users can test conversational interactions after training.
  • DeepSpeed-RLHF pipeline: DeepSpeed-RLHF reproduces the training pipeline of the InstructGPT paper, ensuring a one-to-one correspondence with its three steps: a) supervised fine-tuning (SFT), b) reward model fine-tuning, and c) reinforcement learning with human feedback (RLHF). It also provides data abstraction and blending capabilities so users can train on multiple datasets from different sources.
  • DeepSpeed-RLHF system: it unifies DeepSpeed's training engine and inference engine into a single hybrid engine (DeepSpeed Hybrid Engine, or DeepSpeed-HE) for RLHF training. DeepSpeed-HE seamlessly switches between inference and training modes within RLHF, drawing on optimizations from DeepSpeed-Inference such as tensor parallelism and high-performance CUDA kernels for language generation, while the training side benefits from ZeRO- and LoRA-based memory optimization strategies. DeepSpeed-HE also performs intelligent memory management and data caching automatically across the different stages of RLHF.
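The three-step pipeline reproduced by DeepSpeed-RLHF, and the inference/training mode switch that DeepSpeed-HE accelerates, can be sketched in outline. This is a minimal illustrative toy in plain Python under stated assumptions, not DeepSpeed-Chat's actual API: every function, flag, and field name below is a hypothetical placeholder.

```python
# Illustrative sketch of the InstructGPT-style three-step RLHF pipeline
# that DeepSpeed-RLHF reproduces. All names are hypothetical placeholders,
# not the real DeepSpeed-Chat API.

def supervised_fine_tune(base_model, demonstrations):
    """Step 1 (SFT): fine-tune the base model on prompt/response demos."""
    return {"name": base_model, "stage": "sft", "data": len(demonstrations)}

def train_reward_model(base_model, comparisons):
    """Step 2: fit a reward model on human preference comparisons."""
    return {"name": base_model, "stage": "reward", "data": len(comparisons)}

def rlhf_ppo(actor, reward_model, prompts):
    """Step 3: PPO loop: generate responses with the actor (inference mode),
    score them with the reward model, then update the actor (training mode).
    DeepSpeed-HE's hybrid engine exists to make this mode switch cheap."""
    for prompt in prompts:
        response = f"{actor['name']} answers: {prompt}"  # generation phase
        score = len(response) % 5                        # stand-in reward
        _ = score  # a real PPO step would update the actor's weights here
    actor["stage"] = "rlhf"
    return actor

# Chain the three steps, as a single launcher script would:
demos = [("What is RLHF?", "Reinforcement learning from human feedback.")]
prefs = [("resp_a", "resp_b", 0)]  # annotator prefers resp_a
actor = supervised_fine_tune("opt-1.3b", demos)
rm = train_reward_model("opt-350m", prefs)
final = rlhf_ppo(actor, rm, ["Explain DeepSpeed-Chat."])
print(final["stage"])  # → rlhf
```

The key structural point the sketch illustrates is that step 3 interleaves generation (inference) and weight updates (training) on the same model, which is why a hybrid engine that switches modes cheaply pays off.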

The documentation highlights DeepSpeed-Chat's advantages over other advanced solutions in efficiency and economy: it is more than 15 times faster than existing systems, training an OPT-13B model on Azure in only 9 hours and an OPT-30B model in only 18 hours, at a cost of less than $300 and $600 respectively.

In terms of speed and scalability, a 13B model can be trained in as little as 1.25 hours, and a huge 175B model in under a day on a 64-GPU cluster. As for accessibility and the popularization of RLHF, even models with more than 13 billion parameters can be trained on a single GPU; on the same hardware, DeepSpeed-HE can also run 6.5B and 50B models, an improvement of up to 7.5x.

Despite the persistent voices of opposition and concern about the development of ChatGPT-like large language models, Microsoft appears to be pressing ahead with its AI work at full speed. Responding to the release, former Meta AI expert Elvis enthusiastically noted that DeepSpeed-Chat provides the end-to-end RLHF pipeline for training ChatGPT-like models that Alpaca and Vicuna lack, addressing the challenges of cost and efficiency, and called it "Microsoft's impressive open source effort… a big deal".

For more details, check out the official documentation.


