“The original agreement on the European Economic Area was signed in August 1992.”
Let’s try translating that sentence to French.
Taking a first pass at the phrase “the European Economic Area”, you might be tempted to go word by word, arrive at “la européen économique zone” and be met with a confused stare from the Parisian you decided to bother on the TGV.
As any Frenchman/Frenchwoman will tell you, the noun “zone” comes first in French, “economic” precedes “European”, and gendered agreement rules dictate that “European” be translated to “européenne” in the feminine form to match the feminine noun “zone.” Given the context of gender agreements and flipped orders, the correct translation is “la zone économique européenne.”
This is an example that researchers at Google Brain used to illustrate the magic of the “transformer” model in a 2017 paper titled “Attention is all you need” - the greatest thing to happen to Natural Language Processing/Understanding (NLP/U) since sliced bread.
Here’s what was unique about transformer models - “Positional encoding” let them keep track of word order without reading a sentence one word at a time, so they could be trained in parallel, much, much faster, on larger datasets - or slices of the internet. Self-attention enabled these models to understand a word in the context of other words around it. For example, a transformer can learn to interpret “bark” differently when used alongside “tree”, versus “dog.”
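To make those two ideas a little more concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the four-dimensional word vectors are hypothetical and hand-picked, the helper names are ours, and the attention step skips the learned query/key/value weight matrices a real transformer would apply. It only shows the mechanics: sinusoidal position vectors are added to each word, and every word's output becomes a weighted blend of the words around it.

```python
import math

def positional_encoding(position, d_model):
    """Sinusoidal position vector: sin on even dimensions, cos on odd ones."""
    encoding = []
    for i in range(d_model):
        angle = position / (10000 ** (2 * (i // 2) / d_model))
        encoding.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return encoding

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention.

    Simplification (an assumption of this sketch): queries, keys and values
    are the input vectors themselves; a real transformer first multiplies
    them by learned weight matrices.
    """
    d = len(vectors[0])
    outputs, weights = [], []
    for query in vectors:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
                  for key in vectors]
        row = softmax(scores)
        weights.append(row)
        outputs.append([sum(w * v[i] for w, v in zip(row, vectors))
                        for i in range(d)])
    return outputs, weights

# Hypothetical 4-dimensional "embeddings", hand-picked for illustration only.
TOY_EMBEDDINGS = {
    "bark": [1.0, 1.0, 0.0, 0.0],
    "tree": [0.0, 2.0, 0.0, 1.0],
    "dog": [2.0, 0.0, 1.0, 0.0],
}

def contextualize(words):
    """Add position information to each word vector, then apply self-attention."""
    inputs = []
    for pos, word in enumerate(words):
        pe = positional_encoding(pos, 4)
        inputs.append([e + p for e, p in zip(TOY_EMBEDDINGS[word], pe)])
    return self_attention(inputs)

bark_near_tree, _ = contextualize(["bark", "tree"])
bark_near_dog, _ = contextualize(["bark", "dog"])
print("bark next to tree:", [round(x, 3) for x in bark_near_tree[0]])
print("bark next to dog: ", [round(x, 3) for x in bark_near_dog[0]])
```

The two printed vectors for “bark” differ because each one is a blend of “bark” with its neighbour. That, in miniature, is how the same word can end up with a different representation next to “tree” than next to “dog.”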

Understanding the underlying meaning and context of an input is a pretty powerful thing, and is the secret sauce powering all the cutting-edge NLP-powered tools you’re likely to encounter today: Grammarly’s grammatical error detection, GitHub Copilot, Gmail’s “autocomplete”, CopyAI’s AI copywriter, and even DALL-E 2.
Today, if you want to leverage the power of transformers, and don’t have the processing power of Google to build them from scratch (we’ll pause while you check) you can, thanks to three européens who open-sourced the transformers library (and many others too, btw) and are well on their way to democratizing machine learning - Clem Delangue, Julien Chaumond, and Thomas Wolf - co-founders of 🤗 - Hugging Face.
Dirt bikes to Chatbots to the GitHub for ML
If you ever took an ATV out for a spin in France, there’s a non-zero probability that it was once in the Hugging Face founder Clem Delangue’s garden equipment shop. Clem’s first steps into the world of entrepreneurship were through selling ATVs and dirt bikes imported from China on eBay - who were so impressed that they asked him to come intern with them. Clem met machine learning and machine learning met Clem when the co-founder of Moodstocks, a start-up working on image recognition tech, accosted him at an e-commerce trade show. After a short stint there, Clem started up on his own, with no ATVs this time. Bit by the ML bug, his work on a collaborative note-taking app idea connected him with a fellow entrepreneur building a collaborative e-book reader - Julien Chaumond.

The duo met with Chaumond’s friend from college, who was now active in ML research, and together they set out to build an “open-domain Conversational AI” - the sort of AI that features in the movie “Her.”
“We’re building an AI so that you’re having fun talking with it. When you’re chatting with it, you’re going to laugh and smile - it’s going to be entertaining.” - Clem Delangue, CEO & Co-founder
The original app was a Tamagotchi-like friend chatbot that could talk back to you coherently about a wide range of topics, but also detect emotions in text, and adapt its tone accordingly.
The big pivot
Julien says that the chatbot was an excuse for the early team to dive into the state-of-the-art NLP and the bleeding-edge research of the time. It was an early runaway success, with ~100,000 DAU (daily active users) at its peak and decent retention numbers. For the early team of 5-6 NLP-heads however, the heart was where the tech was. And frustratingly, the massive leaps they made to their underlying tech did not translate to breakthroughs in consumer usage. Accuracy improvements in the Hugging Face bot’s responses didn’t seem to correlate with growth or retention.
Around two years later, the “Attention is all you need” paper marked the beginning of the age of transformers. Hugging Face, who had already released parts of the powerful library powering their chatbot as an open-source project on GitHub, open-sourced the hot new thing in NLP and made it available to the community.
Today, Transformers is the most widely adopted software library for machine learning models to deal with NLP applications and has 63.3k stars and 14.9k forks on GitHub.

On May 7, they raised $100 million in Series C funding at a $2B valuation led by Lux Capital with major participation from Sequoia and Coatue. The hotly contested round also saw support from existing investors Addition, NBA star Kevin Durant, a_capital, SV Angel, Betaworks, AIX Ventures, Rich Kleiman from Thirty Five Ventures, Olivier Pomel (co-founder & CEO at Datadog), and others.
Racing to the cutting-edge
Since pivoting away from the chatbot, Hugging Face has been on a mission to advance and democratize artificial intelligence through open source and open science.
To become the GitHub for machine learning.
With ~100,000 pre-trained machine-learning models and <10,000 datasets currently hosted on the platform, Hugging Face enables the community and 10,000+ companies including Grammarly, Chegg, and others to build their own NLP capabilities, share their own models, and more.
Hugging Face’s rise to the cutting-edge is mirrored in the star history of Transformers on GitHub compared to other leading open-source projects - even Confluent, MongoDB, and Databricks.

How did the NYC/Paris-based company named after an emoji, with just under 10 employees until 2019, rise to the very top and become one of the most prestigious companies for a data scientist to work for today? Let’s try and break down the sorcery.
Community-led growth
Even in the early days of Hugging Face, the founders were quick to notice that the community of people interested in “large language models applied to text” was dense. In open-sourcing their early libraries, they stumbled upon a handful of super-users in the community. After Google released model weights for the language representation model BERT in TensorFlow, the very first starting point of Hugging Face’s repo was moving this model to PyTorch - it was here that they really discovered their core group of contributors.
Hugging Face aspires to build the #1 community for machine learning. The commitment to the community is seared into the cultural fabric of the company:
“Everyone in the team should have some kind of focus on usage on community. If we want to build the best community for ML, we want to make sure everyone on the team is passionate about working with the community.” - Julien Chaumond
Hugging Face taps into some key community dynamics that drive engagement and growth. Chief among them is the Hugging Face Hub. The team started building the Hub when they found the need for a platform where users of the transformers and datasets libraries could easily share their models or datasets. They hacked together a simple way for the community to publish to AWS S3, etc. if they wanted to. The Hub hosts Git-based repositories, which are storage spaces that can contain all user files. It currently hosts three repo types:
Spaces - The recently launched Hugging Face Spaces empowers members of the community to become creators and contributors. Spaces are a simple way to build and share apps with built-in version control and Git-based workflows. Over 200 spaces are live on the website today.

Datasets - Hugging Face is home to ~4.8k datasets with a broad range of use cases, tasks, and languages.
Models - Hugging Face is home to ~45k models with applications ranging from image classification and segmentation, audio classification, automatic speech recognition, zero-shot classification, and more.
Here’s the RoBERTa base model making a strong case for simplicity in life:
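If you would like to poke at the model yourself, a rough sketch with the `transformers` library looks like this. The `pipeline("fill-mask", ...)` call and the `roberta-base` model name are real; the helper names `top_fillers` and `demo`, and the prompt, are made up for this example and are not taken from the demo above.

```python
def top_fillers(fill_mask, text, k=3):
    """Return the k best guesses for the masked word as (word, score) pairs.

    `fill_mask` is any callable that returns a list of dicts carrying
    "token_str" and "score" keys, best guess first. That is the shape the
    fill-mask pipeline returns for a prompt with a single mask.
    """
    predictions = fill_mask(text)
    return [(p["token_str"].strip(), round(p["score"], 3))
            for p in predictions[:k]]

def demo():
    """Download roberta-base from the Hub and fill in a hypothetical prompt."""
    # Imported here so the helper above works without the library installed.
    from transformers import pipeline  # needs `pip install transformers torch`
    fill_mask = pipeline("fill-mask", model="roberta-base")
    # RoBERTa marks the blank with <mask>; the prompt is invented for this sketch.
    return top_fillers(fill_mask, "The goal of life is <mask>.")
```

Calling `demo()` downloads the model weights on first use, so expect a short wait. What it returns depends on the model, so no output is shown here.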

Hugging Face also hosts an Inference API that lets users access models via a programming interface, and “AutoTrain” them.
Open-source to open doors
CEO and co-founder Clem believes that with open-source models, Hugging Face can harness the power of a community to do things differently - “deliver a 1000 times more value” than a proprietary tool, he says, drawing parallels to Elastic and MongoDB. Clem says that in the field of NLP, the team has always felt like they had been standing on the shoulders of giants and that no one company - not even the legacy tech giants - can push the envelope by themselves.
What started with open-sourcing PyTorch BERT and GPT led to a snowballing effect that has propelled Hugging Face to where they are today. In a field like NLP or Machine Learning, Clem believes that the worst position to be in is to be in competition with research labs and open source projects. He says that even monetizing 1% of the value created while tapping into the power of the community can often be more than enough to grow even a publicly-traded company.
Usage is deferred revenue
“Given how valuable machine learning is and how mainstream it’s becoming, usage is deferred revenue,” - Clem Delangue
For Hugging Face, monetization is still an early play - they started their paid offerings just last year, and already count 1000+ companies as customers including Intel, eBay, Pfizer, and Roche. The advances in transfer learning meant that transformers are effective not just in NLP, but in other domains as well. The opportunity to become the GitHub for machine learning was apparent and Hugging Face decided to pounce on that opportunity. With ~$10M in revenue in the bank along with most of their $40M Series B from March 2021, and a strong community driving them forward, the product-led growth flywheel is ready to kick into high gear.

On the GTM front, with the hot $100M Series C in the bag, Hugging Face is hiring AEs, BDRs, and Enterprise Sales to layer into their organic growth flywheel.
“I don’t really see a world where machine learning becomes the default way to build technology and where Hugging Face is the No. 1 platform for this, and we don’t manage to generate several billion dollars in revenue.” - Clem
Oh, hello there, GPT-4
When you talk about Hugging Face's blazing trail to the forefront of NLP, you can't ignore the turbo boost provided by models like ChatGPT and GPT-4 from OpenAI. Imagine Hugging Face as the sleek sports car of the NLP world, and GPT-4 - the nitrous injection making it go vroom.
ChatGPT and GPT-4 themselves are proprietary and are not part of the Transformers library, but by giving developers open models from the same transformer family to work with, Hugging Face didn't just ride the wave; they became the wave. And oh boy, the developer community noticed. With the ability to tweak and play with models that resembled Iron Man suits in the NLP universe, developers flocked, collaborated, and innovated.
And the results? Staggering. Hugging Face's platform, already bustling with pre-trained models and datasets, became the pit stop for every NLP enthusiast wanting a taste of that GPT magic.
From garage startups to tech giants like Grammarly and Chegg, everyone wanted in on the action. It's like Hugging Face had suddenly unveiled a next-gen racetrack, and everyone with a need for speed in the NLP domain wanted a piece of it.
What’s next for Hugging Face? [MASK]
With a team of ~140 that’s rapidly growing every month, and offices in New York and Paris, team Hugging Face is edging closer and closer to their goal of democratizing machine learning, and making bleeding-edge AI accessible to even companies and teams without the resources to build them from scratch.
Transformers evolving to become a general-purpose architecture for speech, computer vision, and even protein structure prediction holds Hugging Face in good stead - at the intersection of overlapping domains.
With what Andrej Karpathy calls Software 2.0 around the corner if not already here, and open-domain conversational AI still an open problem, there are exciting milestones to hit before 🤗 becomes the first company to go public on Nasdaq with an emoji, instead of the three-letter ticker.