Microsoft’s training Small Language Models to outperform ChatGPT
2 min. read
Published on June 27, 2023
Microsoft may have a $10 billion investment and partnership with OpenAI for its ChatGPT Large Language Model, but it looks like the company could be hedging its bets with smaller transformer technology of its own.
Microsoft researchers recently revealed that their Phi-1 1.3B transformer-based language model beat much larger models on coding benchmarks such as HumanEval and MBPP, even outperforming partner OpenAI's ChatGPT at coding tasks.
Combining "textbook quality" data from The Stack and StackOverflow, and training for four days on eight NVIDIA A100 GPUs, the team built roughly 7 billion high-quality training tokens in total: about 6 billion filtered from those datasets using GPT-4-based classifiers, plus synthetic textbook data generated with GPT-3.5.
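The filtering step described above can be sketched roughly as follows. Microsoft has not published its filtering code, and the GPT-4-based classifier is not public, so `quality_score` below is a stand-in heuristic, not the actual method:

```python
# Hypothetical sketch of "textbook quality" data filtering, loosely modeled
# on the approach described above. quality_score is a toy stand-in for the
# GPT-4-based quality classifier, which Microsoft has not released.

def quality_score(snippet: str) -> float:
    """Toy proxy for an LLM-based quality classifier: favors
    well-commented, self-contained code snippets."""
    lines = snippet.strip().splitlines()
    if not lines:
        return 0.0
    commented = sum(1 for ln in lines if ln.lstrip().startswith("#"))
    has_def = any(ln.lstrip().startswith(("def ", "class ")) for ln in lines)
    return min(1.0, commented / len(lines) + (0.5 if has_def else 0.0))

def filter_corpus(snippets: list[str], threshold: float = 0.5) -> list[str]:
    """Keep only snippets the classifier rates at or above the threshold."""
    return [s for s in snippets if quality_score(s) >= threshold]

corpus = [
    "x=1",                                                   # low quality
    "# add two numbers\ndef add(a, b):\n    return a + b",   # higher quality
]
print(filter_corpus(corpus))  # only the commented, self-contained snippet survives
```

The real pipeline would run a learned classifier over billions of tokens, but the shape is the same: score every candidate snippet, keep only those above a quality threshold.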
Not only did Phi-1 1.3B outperform some of its larger language model counterparts, it did so using fewer parameters.
While the researchers may be popping bottles of champagne in excitement, Phi-1 1.3B's achievements are tempered by its comparatively limited versatility. Unlike larger models, Phi-1 1.3B gains ground through specialized training in Python programming, and as such it misses out on broader API knowledge, leaving it with less domain-specific coverage than larger models tend to have.
At the end of the day, Phi-1 1.3B's success highlights how much higher-quality training data can do to optimize a language model's output.
Microsoft's other SLM, Orca, has also outperformed ChatGPT in similar testing, further lending credence to the idea that high-quality data can shrink the resource demands of LLMs.
Microsoft is planning to open-source Phi-1 1.3B through Hugging Face, but as of now there is no official release date.
Kareem Anderson
Networking & Security Specialist
Kareem is a journalist from the Bay Area, now living in Florida. His passion for technology and content creation is unmatched, driving him to create well-researched articles and incredible YouTube videos.
He is always on the lookout for everything new about Microsoft, focusing on making easy-to-understand content and breaking down complex topics related to networking, Azure, cloud computing, and security.