Open-source LLM provider MosaicML has announced the release of its most advanced models to date, the MPT-30B Base, Instruct, and Chat.
These state-of-the-art models were trained on the MosaicML Platform using NVIDIA's latest-generation H100 accelerators, and MosaicML claims they offer superior quality compared to the original GPT-3 model.
With MPT-30B, businesses can leverage the power of generative AI while maintaining data privacy and security.
Since their launch in May 2023, the MPT-7B models have gained significant popularity, with over 3.3 million downloads. The newly released MPT-30B models provide even higher quality and open up new possibilities for various applications.
MosaicML’s MPT models are optimised for efficient training and inference, allowing developers to build and deploy enterprise-grade models with ease.
One notable achievement of MPT-30B is that it surpasses the quality of GPT-3 while using only 30 billion parameters, compared to GPT-3's 175 billion. This makes MPT-30B easier to run on local hardware and significantly cheaper to deploy for inference.
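A rough back-of-the-envelope calculation illustrates why the parameter count matters for deployment. The helper below is purely illustrative (not from MosaicML), and it assumes 16-bit weights, which is a common but not universal choice for inference:

```python
# Illustrative sketch: approximate memory needed just to store model
# weights, assuming 16-bit (2-byte) precision. Activations, KV cache,
# and runtime overhead would add to these figures.
def weight_memory_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

mpt_30b = weight_memory_gb(30_000_000_000)    # 30B parameters
gpt_3 = weight_memory_gb(175_000_000_000)     # 175B parameters
print(f"MPT-30B weights: ~{mpt_30b:.0f} GB, GPT-3 weights: ~{gpt_3:.0f} GB")
# → MPT-30B weights: ~60 GB, GPT-3 weights: ~350 GB
```

Under this assumption, MPT-30B's weights fit on a single high-memory accelerator or a small multi-GPU server, whereas a 175-billion-parameter model requires a sizeable multi-GPU cluster just to hold its weights.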
The cost of training custom models based on MPT-30B is also considerably lower than the estimates for training the original GPT-3, making it an attractive option for enterprises.
Furthermore, MPT-30B was trained on longer sequences of up to 8,000 tokens, enabling it to handle data-heavy enterprise applications. Its performance is underpinned by NVIDIA's H100 GPUs, which deliver increased throughput and faster training times.