Microsoft’s training Small Language Models to outperform ChatGPT
2 min. read
Published on June 27, 2023
Microsoft may have a $10 billion investment and partnership with OpenAI for its ChatGPT Large Language Model, but it looks like the company could be hedging its bets with smaller transformer technology of its own.
Microsoft researchers recently revealed that their Phi-1 1.3B transformer-based language model beat much larger models on coding benchmarks such as HumanEval and MBPP, even outperforming partner OpenAI's ChatGPT at coding tasks.
Combining "textbook quality" data from The Stack and StackOverflow, and training for four days on eight NVIDIA A100 GPUs, the team built roughly 7 billion high-quality training tokens in total: about 6 billion filtered from those datasets using GPT-4-based classifiers, plus synthetic textbook data generated with GPT-3.5.
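The filtering step described above can be sketched roughly as follows. Microsoft has not published its filtering code, and the GPT-4-based classifier is not public, so `quality_score` below is a stand-in heuristic, not the actual method:

```python
# Hypothetical sketch of "textbook quality" data filtering, loosely modeled
# on the approach described above. quality_score is a toy stand-in for the
# GPT-4-based quality classifier, which Microsoft has not released.

def quality_score(snippet: str) -> float:
    """Toy proxy for an LLM-based quality classifier: favors
    well-commented, self-contained code snippets."""
    lines = snippet.strip().splitlines()
    if not lines:
        return 0.0
    commented = sum(1 for ln in lines if ln.lstrip().startswith("#"))
    has_def = any(ln.lstrip().startswith(("def ", "class ")) for ln in lines)
    return min(1.0, commented / len(lines) + (0.5 if has_def else 0.0))

def filter_corpus(snippets: list[str], threshold: float = 0.5) -> list[str]:
    """Keep only snippets the classifier rates at or above the threshold."""
    return [s for s in snippets if quality_score(s) >= threshold]

corpus = [
    "x=1",                                                   # low quality
    "# add two numbers\ndef add(a, b):\n    return a + b",   # higher quality
]
print(filter_corpus(corpus))  # only the commented, self-contained snippet survives
```

The real pipeline would run a learned classifier over billions of tokens, but the shape is the same: score every candidate snippet, keep only those above a quality threshold.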
Not only did Phi-1 1.3B outperform some of its larger language model counterparts, it did so using fewer parameters.
While the researchers may be popping bottles of champagne in excitement, Phi-1 1.3B's achievements are tempered by its comparatively limited versatility. Unlike larger models, Phi-1 1.3B gains ground through specialized training in Python programming, and as such it misses out on broader API knowledge, leaving it with less domain-specific coverage than larger models tend to have.
At the end of the day, Phi-1 1.3B's success highlights how much higher-quality training data can do to optimize a language model's output.
Microsoft's other SLM, Orca, has also outperformed ChatGPT in similar testing, further lending credence to the idea that high-quality data can shrink the resource demands of LLMs.
Microsoft is planning to open-source Phi-1 1.3B through Hugging Face, but as of now there is no official release date.
Kareem Anderson
Networking & Security Specialist
Kareem is a journalist from the Bay Area, now living in Florida. His passion for technology and content creation is unmatched, driving him to create well-researched articles and incredible YouTube videos.
He is always on the lookout for everything new about Microsoft, focusing on making easy-to-understand content and breaking down complex topics related to networking, Azure, cloud computing, and security.