Building the Open Source AI Revolution (with Hugging Face CEO, Clem Delangue)

ACQ2 Episode

October 14, 2024
October 13, 2024

We sit down with Hugging Face CEO Clem Delangue to understand the current state of the open source AI ecosystem. Hugging Face is the leading platform to host and collaborate on AI models, datasets, and applications. They also have a compute offering for AI builders to train their models directly on the platform. Clem has a contrarian take on the future: there will not be just a few major foundation model companies with everyone using their APIs. Rather, thousands of companies will have their own specialized AI models built in-house for their particular use case. It's obviously a very dynamic landscape and we'll have to see how it shakes out, but Clem has a pretty great viewpoint to see it all, working with their 5 million registered Hugging Face users!

Transcript: (disclaimer: may contain unintentionally confusing, inaccurate and/or amusing transcription errors)

Ben: Clem Delangue, welcome to ACQ2.

Clem: Thanks for having me.

Ben: It’s a pleasure to have you here. We have heard so much about Hugging Face over the last few years. It just feels appropriate in this moment to talk to you about the company directly.

Clem: I feel like we’re at a very critical time for AI, and with Hugging Face we have the pleasure and the honor to be at the center of it. So excited to be able to share some of the things that we’re seeing.

Ben: I think the listeners who are tuning into this and saying, what is this episode going to be about? We want to frame it as you should come in and you don’t need to know anything about AI and you should walk out with a pretty clear understanding of open source AI, the more closed ecosystem, what is the difference between the two, what are the trade-offs, what are the virtues of each one.

We’re going to tell it through the Hugging Face story, so what role do you play in the ecosystem, who do you work with, who do you not, how did this thing spring up out of quite an unlikely place given the name of your company. We’ll work our way backwards. At this moment in time today, how do you describe what Hugging Face is?

Clem: Hugging Face has been lucky to become the number one platform for AI builders. AI builders are the new software engineers in a way. In the previous paradigm of technology, the way you would build technology was by writing code. You would write a million lines of code and it would create a product like Facebook, Google, all the products that we use in our day-to-day life.

Now today, the way that you create technology is by training models, using data sets, and building AI apps. Most of the people that do that today are using the Hugging Face platform to find models, find data sets, and build apps. We have over five million AI builders that are using the platform every day to do that.

Ben: The ecosystem around Hugging Face in many ways reminds me of the 2008–2010 era of the Web 2.0 RESTful APIs that everybody was publishing. You could suddenly daisy chain together a million different companies—

David: Mashups.

Ben: Services into, yeah, the API mashups. It feels like there’s a loose analogy to at least the movement that you’re on is similar to that one. What can we create with a bunch of these more open, flexible building blocks?

Clem: It’s super exciting because it’s replacing some of the previous capabilities. Now we are starting to see search being built with AI. You’re starting to see social networks being built with AI. But at the same time, it’s empowering new use cases. It’s unlocking new capabilities that weren’t possible before to some extremes.

Some people are talking about super intelligence AGI, completely new things that we weren’t even thinking about in the past. We are at this very interesting time where the technology is starting to catch up to the use cases and we’re seeing the emergence of a million new things that weren’t possible before.

Ben: That’s cool. And just so listeners understand the scale at which you’re operating, Hugging Face is currently valued as of recording at $4.5 billion. Investors include Nvidia, Salesforce, Google, Amazon, Intel, AMD, Qualcomm, IBM. It’s a pretty wild set. What are some metrics that you care about as a company that you can use to describe the scale at which developers are using it today?

Clem: I was saying that we have five million AI builders using the platform, but more interestingly, I think it’s the frequency and volume of usage that they have on the platform. Collectively, they shared over three million models, data sets, and apps on the platform.

Some of these models you might know them, might have heard of them, like Llama 3.1. Maybe you’ve heard of Stable Diffusion for image. Maybe you’ve heard of Whisper for audio or Flux for image. We’re going to cross soon one million public models that have been shared on the platform and almost as many that have not been shared and that companies are using internally, privately for their use cases.

David: The analogy and model for you guys really is just like GitHub except for AI models, right? You can publish open source, open to everybody, and companies can also use internal closed source repositories for their own use, right?

Clem: Yeah. It’s a new paradigm. AI is quite different than traditional software. It’s not going to be exactly the same. But we are similar in the sense that we’re the most used platform for this new class of technology builders.

For GitHub it was software engineers and for us it’s AI builders. To add, to reflect the usage side of things, one interesting metric is now that a model or dataset or app is built every 10 seconds on the Hugging Face platform. I don’t know how long this podcast is going to last, but by the end of this podcast we’re going to have a few hundred more models, data sets, and apps built on the Hugging Face platform.

Ben: And to continue to maybe torture the repository comparison, the set of things that need to exist besides I’m going to upload a pile of code that everyone can see and potentially attempt to modify, it’s also the data sets themselves. It’s also a platform to actually run applications, and also a compute platform where if you want to train a model that is also possible on Hugging Face, right?

Clem: Yeah. One additional aspect that sometimes people underestimate is a lot of features around collaboration for building AI. The truth is that you don’t build AI by yourself as a single individual. You need the help of everybody in your team, but also sometimes people in other teams in your company or even people in the field.

Things like the ability to comment on the model, on the dataset, on the app, to version your code, your models, your data sets, to report bugs, to comment and add reviews about your code, your models, your data sets. These are some of the most used features on the platform because it enables bigger and bigger teams to build AI together.

That’s something we’re seeing at companies, is that a few years ago maybe there was a small team of 5–10 people leading the AI teams at companies. Now, it’s much bigger teams. For example, at Microsoft, at Nvidia, at Salesforce we have thousands of users using the Hugging Face platform altogether, privately, and publicly.

Ben: I have a whole bunch of questions, philosophical ones about where AI goes from here and how the mental model for the AI ecosystem is different than previous generations. But to get there I think it’s helpful to understand how you arrived here. In 2016 you co-founded a company named after the Unicode code point Hugging Face the emoji. As far as I can tell, it was an emoji that you could talk to as a chatbot aimed primarily at teenagers. Is that right?

Clem: Absolutely correct. There was a long journey.

David: You started neither an AI infrastructure company nor did you even start in the current era of AI.

Clem: No, but we did start based on our excitement and passion for AI, even if we weren’t even calling it AI at the time. We were saying more machine learning, deep learning. I was lucky enough, I think it’s now almost 15 years ago or a few years more, to work at a startup in Paris that was called Moodstocks, where we were doing machine learning for computer vision.

This was before a lot of people were talking about AI. It made me realize the potential for the new technology and the way we could change things with AI. When we started Hugging Face with my co-founders, Julien and Thomas, we were super excited about the topic and we were like, okay, it’s going to enable a lot of new things.

Let’s start with a topic that is both scientifically challenging and fun. We started with conversational AI. We were like, at the time, okay, Siri, Alexa, they suck. We remember our Tamagotchis, which were these fun virtual pets that you would play with. Let’s build an AI Tamagotchi, like a conversational AI that would be fun to talk to. That’s what we did. We worked on it for three years. We raised our first two rounds of funding on this idea. Shout out to our first investors who invested in a very different idea than what we are today.

Ben: Who were your early investors?

Clem: Our earliest investor was Betaworks in New York.

Ben: Oh yeah. I had no idea.

Clem: Yes, with John Borthwick and Matt Hartman who were our first supporters, really backed us when we were random French dudes with no specific background or credentials, with broken English.

Ben: I assume you’re now the most valuable company Betaworks has ever invested in.

Clem: Yes, and I’m even more proud of the fact that we’re now the company that they’ve invested the most money in, so we are the biggest bet that they’ve made. They’ve been extremely, extremely supportive. We also had the support of a bunch of very important, impactful angel investors, like Richard Socher, who is the founder of you.com and was the chief scientist at Salesforce at the time. Then the support of the Conway family, with A.Capital run by Ronny Conway that led our next round, and Ron Conway who also supported us throughout the early days of Hugging Face.

Ben: That’s awesome. This was all still for the, ‘I’m going to chat with an emoji’ idea. And to put a finer point on it, you started the company in 2016. 2017 is when the transformer paper gets released from Google. We are not yet to the era of even people in the AI community really knowing LLMs are close on the forefront, like OpenAI hadn’t made their big pivot yet. The state-of-the-art for natural language processing is still pretty limited, small models trained on very particular well-cleaned data sets. Is that right?

Clem: Yeah, surprisingly or luckily that’s what led to what Hugging Face is today because at the time, the way you were doing conversational AI is by stitching a bunch of different models, which would do very different tasks. You would need one model to extract information from the text, one model to detect the intent of the sentence, one model to generate the answer, one model to understand the emotion linked with the model.

Very early on in the journey of Hugging Face, we started to think about how do you build a platform and abstraction layer that allows you to have multiple models with multiple data sets, because we wanted the chatbot to be able to talk about the weather, talk about sports, talk about so many different topics that you needed a bunch of different data sets. That was the foundation to what Hugging Face is today, this platform to host so many models, so many data sets.

It’s a very interesting fate, a very interesting thing. Obviously it reinforces for people who are listening the importance of being flexible, being opportunistic, and being able to seize new opportunities, even three years in. For us it was three years in with maybe $6 million raised, completely changing what we are doing, what we are going after, what we are building.

Obviously we don’t regret it at all, but it’s a good learning for everyone listening that even with $6 million raised three years in, you can still pivot and find a new direction for your company and do it for the best.

David: How did those conversations start? How did they go? How much time did it take to go from talking about it to doing it?

Clem: Surprisingly, the transition wasn’t as hard as we thought. It all started from an initiative from Thomas, who’s our third co-founder and our chief scientist. I think it’s right at the time when BERT, the first very popular transformer model, came out.

Ben: That’s Google’s model?

Clem: Google’s model that they open sourced, I think on a Friday. That day, I remember really vividly Thomas told us, oh, there’s this new transformer model that came out from Google. It’s amazing, but it sucks because it’s in TensorFlow. At the time, the most popular language for AI was (and still is, actually) PyTorch. It was like, oh, I think I’m going to spend the weekend porting this model into PyTorch.

Julien and I were like, okay, yeah, if you don’t have anything better to do during your weekend, just have fun, do it. On Monday he released a PyTorch version of BERT, tweeted about it, and I think his tweet got maybe a thousand likes. For us at the time we were like, what is happening here? We broke the Internet. A thousand Twitter likes? That’s insane.

Ben: The developer demand is so obviously, at that point in time, for PyTorch, but since it was born out of Google, of course they’re going to implement it in TensorFlow. They had to use their own endorsed stack. It’s just waiting there for the first person to realize, oh my God, this thing needs to exist in PyTorch to go and get all the Internet points by doing that.

Clem: Yeah. I guess it’s another gift from fate or from the universe to us that we managed to seize thanks to the work of Tom. After that, we saw the interest double down on it and I think six months later we told our investors, look, this is adoption. This is the usage that we’re getting on this new platform. We think we need to pivot from one to another. Luckily they were all super supportive, and that’s what led to the pivot and to the direction that we took.

David: Wow. How did you take Thomas porting BERT from TensorFlow into PyTorch into the idea of, oh there should actually be a platform for this.

Clem: It was very organic. What we did is really followed community feedback. What happened is after this first model release, we just started to hear from other scientists building other models who expressed interest in adding their models to our library. I think at the time it was things like XLNet actually coming, if I’m not mistaken, from Guillaume Lample who is the founder of Mistral now. I think it was GPT-2 from the OpenAI team at the time, which was open source.

Ben: That’s right. It used to be OpenAI.

Clem: Yes, and they told us that they wanted to add their models, so we really followed the community feedback on it. That’s what took it from a single model repository. I think the first name was Pre-trained PyTorch BERT to (I think it was) PyTorch-Transformers to Transformers. Then it expanded to the Hugging Face platform as we see it now.

Ben: And that’s the thing you got famous for was that transformers library. You were the steward of that open source project, and you constructed the Hugging Face platform around it to host and facilitate all the community interaction on transformers. It turned out, oh my gosh, there are a lot of other people who are building something that looks like our transformers library that also want a place for that same infrastructure.

Clem: Exactly, with the same process. At some point, users in the community started to tell us, oh, I have bigger models. I can’t host them on GitHub anymore. All right, let’s build a platform for that. Oh I want to host my data sets, but I want to be able to search in my data sets to see is there good data, bad data, how can I filter my data and things like that. We started to build that and a few months later we realized that basically we built a new GitHub for AI.

Our development has always been very community-driven, really following the feedback from the community. I think that’s a big part of the reason why we’ve been so successful over the years and why the community has contributed so much to our platform and to our success. We couldn’t be anywhere close to where we are without the millions of AI builders, contributors that are sharing open models, open data sets, open apps that are contributing with comments, with bug fixes. It’s the main reason for success today.

Ben: You’re famously open. You really embrace this. We literally will build the product that the community tells us they want. Internally, you have a very open policy. The Twitter account, your social media accounts are actually accessible (I think) by all employees, right?

Clem: Yes.

Ben: As someone who is a champion of open source, how much openness is too much openness? Like you’re not a DAO. You don’t do the thing where you publish everyone’s salaries, I don’t think. What do you like to be open versus what do you feel is good that it’s proprietary?

Clem: What we like to do is to give tools for companies to be more open than they would be without us, but without forcing them in any way. I was mentioning the number of models, data sets, and apps that are built on the platform. Something that people don’t know as well is that half of them are actually private. Companies are just using them internally for their own AI systems that they’re not sharing.

We’re completely fine with that because we understand that some companies build more openness than others. But we want to provide them tools to open what they feel comfortable opening. Sometimes it’s not a big model, it’s not big data sets. They can share a research paper on the platform because obviously openness is even more important for science than it is for AI in general.

Progressively, it allows them to share more and contribute more to the world because ultimately we believe that openness and open source AI, open science is really the tide that lifts all boats, that enables everyone to build, that enables everyone to understand, to get transparency on how AI is working, not working, and ultimately leads to a safer future.

A lot of people right now are talking about AGI. I’m incredibly scared of a non-decentralized AGI. If only one company, one organization gets to AGI, I think that’s when the risk is the highest, versus if we can give access to the technology to everyone, not only private companies but also policy makers, nonprofits, civil society, I think it creates a much safer future and the future I’m much more excited about.

Ben: I was going to not go here because it’s almost too much of a shiny question to ask, but we’re talking AGI so we have to do it. Do you feel that the models today are on a path to AGI or do you feel like AGI is something completely separate and these are not stepping stones to it?

Clem: Well I think they’re building blocks for AGI surely in the sense that we are learning how to build some better technologies. But I think at the same time there’s some sort of a misconception based on the name of the technology itself.

We call it AI, artificial intelligence, so in people’s minds it brings associations with sci-fi, with acceleration, with singularity. Whereas for me, what I’m seeing on the ground is that it’s just a new paradigm to build technology. I prefer to call it almost software 2.0. Like you had software before. You have software 2.0.

I think it will keep improving in the next few years the way software has kept improving in the past few years. But it’s not because we call it AI that it makes it closer to some Robocop scenario of an all-dominating AI system that is going to take over the world.

Ben: It does feel like there are these two different things that masquerade under the same name as AI. One of them is, I like software 2.0 because software gave humans leverage to do more and to scale more with a small set of humans. This new era of software really feels like it’s just that on steroids. The richness of applications that you can build very quickly is astonishing and is another 10x improvement on top of the amazing software paradigms that we had until now.

There is a completely separate thing, which is things that pass the Turing test. I’m talking to something and I’m pretty convinced that thing is a human, but it’s not. It is a little bit funny to me that these are both referred to as AI when one is really just leverage for builders on how much they can make.

Clem: It’s also maybe because we overestimate the second field that you’re talking about. To me it doesn’t feel incredibly difficult and incredibly mind-blowing that we finally managed to build a chatbot.

David: You thought you could do it in 2016, right?

Clem: Yeah. If anything, I’m surprised that we didn’t manage to build a good chatbot before. To me, even that falls into development of the technology for the past few decades. I think sometimes we forget because we’re so entrenched on [...] today and we are more impressed with progress of today than progress in the past.

Imagine the first bicycles that were going faster than humans. Imagine the first computer that can retrieve information much better than humans. Imagine the first time you would go on Google and find any information in a matter of a few seconds. These are all impressive progress. Now we take them for granted, but they were impressive progress.

I think technology continues to progress the way it’s been progressing for the past few years. Obviously, some of the builders of these technologies are hyping it and are excited about it, which is normal. But as a society, I think it’s good to keep some moderation and understand that the technology will keep improving. That we need to take it into the direction that is positive for us, for society, for humans, and that everything is going to be fine, that we’re not going to fall into a doomsday scenario in a few months because of a chatbot.

Ben: Fascinating. It’s funny, as you were talking, you linked it to the bicycle. I always think back to the Steve Jobs quote, “A computer is a bicycle for the mind,” which is in many ways saying it’s leverage. It’s a way for the mind to output way more than it otherwise could have, the way that a bicycle does to someone walking. It’s almost like this software 2.0 is a bicycle for the bicycle for the mind. It’s like a compounded bicycle.

David: You’ve been there for this whole arc of the modern development of AI. How would you characterize open versus closed over the last (call it) 6–7 years that you’ve been in this? Does it feel like the pendulum has shifted significantly during that time? Or is it like, oh no, well there was always open and closed. You go back to the beginning and like, well Facebook and Google were closed, and back then the academic research community was open. How do you view it?

Clem: First the debate itself is a bit misleading because the truth is that open source is kind of the foundation to all AI. Something that people forget is even the closed source companies are using open source quite a lot. If you think about OpenAI, if you think about Anthropic, they’re using open research, they’re using open source quite a lot. It’s almost like two different layers of the stack where open source, open science is here, and then you can build like closed source on top of this open source foundation.

But I do think if you look at the field in general, that it has become less open than it used to be. We talked about 2017, 2018, 2019. At that time, most of the research was shared publicly by the research community. That’s how transformers emerged. That’s how BERT emerged. Players like Google, OpenAI at the time were sharing most of their AI research and their models, which in my opinion led to the time that we are now.

It’s all this openness and this collaborativeness between the fields that led to much faster progress than we would’ve had if everything was closed source. OpenAI took transformers, did GPT-2, GPT-3, and that led to where we are today. For the past few years, maybe 2–3 years, it became a bit less open or a lot less open depending on your point of view. Probably because more commercial considerations are starting to play a factor. Also because I think there have been some misleading arguments around the safety of openness against closedness, which leads to something weird where open source and open science is not as celebrated as it used to be.

Ben: Yeah, maybe talk about that. What is the argument and why do you feel it is misleading?

Clem: There are a lot of people emphasizing the existential risk of AI to justify the fact that it shouldn’t be as open as it is, saying that it’s better not to share research because it’s dangerous.

David: A bad actor gets a hold of this and could do bad things.

Clem: Exactly. That’s not the first time that such arguments have been used. Actually, in every technology cycle, if you look at it, it’s the same. Like books are dangerous. They shouldn’t be given to everyone. They should be controlled just by a few organizations. You need a license to write a book, to share a book.

Ben: It feels like that’s never happened though in the software industry. Yes, that happened in the nuclear era, but I don’t remember any of this around like, oh my god, software as a service, that’s terribly dangerous. Or the mobile apps, ah, make sure state actors don’t get a hold of that.

Clem: Yeah, it’s true. Maybe the cycle has been faster with AI between people not knowing about the technology at all to everyone knowing. It creates more fear, more ability for people to manipulate, and people to mislead. Maybe the name played a big factor. When you call it artificial intelligence, it’s much more scary than when you call it software.

David: Even back in the day, the world viewed what was happening as like, oh, it’s a bunch of nerds. It was its own community and the norms of the community were around openness. It was really just coming out of the hippie movement in the Bay Area, frankly in the 60s and 70s. But now the stakes are way higher.

Clem: The competitive environment is quite different too. The early days of software, I think it was easier for new companies, new actors to emerge than now where you have much more concentration of power in the hands of a few big technology companies. That might play a role.

For me, one of the most important things in support of openness is that hopefully it’s going to empower thousands of new AI companies to be built, which is incredibly exciting. Big companies are doing a lot of good and they’re doing a great job in many aspects. But I think if we can use this change in paradigm between software and AI as a way to redistribute the cards, change things, and empower a new generation of companies, of founders, of CEOs, of team members to play a bigger role in the world, it would be great. I think it would align in a way more the challenges and the preoccupations of society with what companies are actually building. I’m excited to try to do that.

David: For listeners who haven’t seen this firsthand, I was over the weekend with a good friend of mine who is a startup founder, non-technical, has a small bootstrapped company, decided to essentially build an AI product around it 10 days ago, built it, well probably decided a month ago, built it over the course of a couple of weeks being non-technical—I’m sure using Hugging Face—launched it, and it’s completely transformed his business. The output of it as a product is mind-blowing and world class, thanks to these AI tools.

Clem: It’s incredibly exciting. That’s one of the reasons why I feel like we don’t need the doomsday scenario of AI or the AGI super intelligence talks about AI because just the fact that it’s a completely new paradigm to build all tech is exciting enough.

It’s already like thinking about how many people it will empower, how many new capabilities, how many new startups, companies it’s going to create, is exciting enough for me and for a lot of people. It’s going to change a lot of things in the ways that you build companies—you build startups as you mentioned—the way you invest in startups. I know a lot of investors are listening to this podcast. I think it’s going to completely change the way you invest in startups.

I’ve played a little bit with investment at this point. I’ve done 100 angel investments in the past two years, mostly in the community around Hugging Face. I think we’re starting to see that building an AI startup is very different from building a software startup in many ways. That is (I think) impactful for the way you think about investing and returns for firms.

For example, it seems like it’s the first time that you’re seeing so many of these startups with very heavy need for capital, for compute, like Mistral that we know, with OpenAI. I think it changed a little bit the way you think about investment, returns on investment, burn for startups.

Ben: That category of companies requires way, way more capital. But there are not that many foundational model companies.

Clem: I think there could be. If you think of it, most of the investment now is going towards foundational LLMs. But it’s just one modality, text. What about foundational models for video? What about foundational models for biology, for chemistry, for audio, for image? What if foundational model companies are actually just normal AI companies the same way software companies were the new type, the new default for all companies in the software paradigm?

The truth is that we don’t know yet. I think it’s still too early to tell exactly what are the recipes for AI startups. That’s why it’s super exciting as an investor too because the truth is you can’t apply the same playbook that you used to in software. In software it was so mature that you had the playbooks. You need a co-founder, CTO, CEO, small team, and then you do the lean startup, and then you follow your rounds and then you get to the highest probability of success.

In AI, it’s completely different. For example, most of the founders actually are not software engineers anymore. They’re scientists. It’s a totally completely different game. The lean startup doesn’t work anymore because they need heavy capital investment before any return. What I’m saying is that it just completely changes the game and you have to forget everything that you’ve learned, everything that you’ve internalized, and start from scratch.

Ben: It’s funny, where I thought you were going to go with this was AI companies or companies that use AI can be just a few people and get huge output because they’re just using the API as provided by these foundational model companies and there’s an extreme amount of leverage to produce great value for customers with few employees.

You took it completely the other direction, which I think is quite contrarian and said most AI companies or perhaps you were saying most dollars deployed into AI will require new foundational models, and therefore they’re going to be these unbelievably large investments to get these step function advancements in a lot of different fields. Am I hearing you right?

Clem: Yeah. I think the truth is that nobody knows yet. I’m not saying that I’m 100% sure that it’s going to go that way, but I’m saying that it’s possible. That’s why it’s exciting to see how it’s going to evolve in the next few years.

Ben: One easy way you win that argument is that the dollars consumed by foundational model companies are so large, that even if there are a thousand times more regular startups consuming APIs provided by AI companies, it’s still the case that most investment dollars will actually go to foundational models and large training runs.

Clem: If you look at some of the successful companies so far, if you look at Hugging Face, if you look at OpenAI, companies like that, I don’t think they acted in the traditional way you would expect a software company to act. Maybe on OpenAI, they started with a billion dollar raise, did open source open science for 6–7 years, and then started a completely new model.

For Hugging Face, we operated in fully open source for many years, really community driven, very different organization than what everyone was telling us to do. I think there’s something to be said about really throwing away the playbooks, throwing away the learnings from the software paradigm, and really start from scratch, maybe start from first principles, and build a new model, a new playbook for AI.

Ben: Has Hugging Face as a company been particularly capital-intensive? And if so, why?

Clem: We haven’t. We raised a bit more than $500 million so far over the course of seven years. We actually spent less than half of that, and we’re lucky enough to be profitable.

Ben: Congratulations.

Clem: Which is quite unusual for most AI startups. We have a different model than some other AI companies.

David: I assume you all don’t have nearly the same capital expenditure requirements that (say) an OpenAI does in terms of compute and training.

Clem: Yeah, and we already have enough free usage that, with a quite straightforward and quite permissive freemium model, we can easily get to a level of revenue that is meaningful. We have some specificities for sure that allow us to do that.

It was also an intentional decision for us because as a community platform, we want to make sure that we’re not going to be here just for a year, two years. When people build on top of you when they contribute to the platform, I think you have some responsibility towards them to be here for the long term. Finding a profitable, sustainable business model that doesn’t prevent us from doing open source and sharing most of the platform for free was important for us to be able to deliver to the community that we are catering to.

Ben: Your customers do use Hugging Face for very capital-intensive things, training these models, but that doesn’t show up in your financials as, oh my God, we had to sink a billion dollars into a training run. You partner with a cloud provider on the backend and pass it along to whoever’s doing the training run, right?

Clem: Yeah. We try to find sustainable ways to do that, either by partnering with the cloud providers, or by providing enough value so that companies that are buying the compute are okay with paying a markup on the compute that makes it high margin for us, or by providing paid features that are basically 100% margins.

For example, a lot of companies are now subscribed to our enterprise hub offering, which is an enterprise version of the hub, which is obviously a different economics than selling compute.

Ben: Very proven business model. You get to choose how you make money. Are you marking up compute? Are you selling SaaS? Are you going the enterprise route and developing this custom package for every engagement?

I’m very curious on the routes where you choose to apply a margin or a markup on top of compute. What is it? Because clearly you’re not ashamed of this and I think it’s a great business model. What is it that Hugging Face can provide where a customer goes, yeah, I’ll do it through Hugging Face instead of going and figuring out how to do it myself directly on a cloud provider?

Clem: We’ve never been so interested in taking part in the race to the bottom on compute. It’s a much more challenging business model than a lot of people think, especially with the hyperscalers being in such a position of strength, both in terms of offering but also in terms of cash flow, giving them the ability to do a lot of things that other organizations wouldn’t be able to do.

The way we think about it is, instead of taking part in this race to the bottom, we are trying to provide enough value both with the platform, the features, and the compute, so that companies are comfortable paying a sustainable amount of money for it.

When you use, for example, the platform, and when you use offerings like the inference endpoints or spaces GPUs on the platform, the idea is that it’s so integrated with the features of the platform that it actually makes it 10 times easier for you as a company to use that as a bundle versus using just the platform and then going to a cloud provider for the compute.

It’s what I call locked-in compute. It’s not the kind of compute that you can trade in, where it doesn’t really matter to you if you switch from AWS, Google Cloud, or another provider. It’s more that we make the experience so much more seamless, so much less complex, which is the name of the game for AI (AI is still complex for most companies), that at the end of the day, yes, companies are paying more for it, but instead of having 10 ML engineers, maybe they’re going to have 1 or 2.

David: The alternative to this would be you have your AI researchers working on models and then when you want to go train or deploy it not through Hugging Face, you basically need a whole nother team of AI infrastructure deployment engineers, right?

Clem: Yeah. As we mentioned before, we’re in the early days of AI monetization. Today no one knows what is a profitable, sustainable business model for AI. Even the big players. OpenAI is of course generating a lot of revenue, but the question of profitability and sustainability of this revenue is still an open question. And I think they’re going to figure it out and I hope they’re going to figure it out. But we’re so early in figuring out business models for AI that there’s a lot to build, so that is extremely exciting.

Ben: And I would argue you’re not figuring out any business model. You are using time tested, proven ways to make money where you occupy a particular part in the value chain, where you’re providing a rich set of experiences to developers, they’re willing to pay for that directly, they’re willing to pay for it in the chain of slightly more expensive compute. The nice thing is you get to innovate on all the AI things without having to build a business model from scratch.

These foundational model companies, that is where there’s this big question of what exactly is the business model, especially when the consumer expectation with interacting with all these AI chat-style agents is that that is free for a huge set of functionality.

Clem: The beauty of the position we are in is that if you are the number one platform that AI builders are using, and if AI becomes the default to build all tech, it’s pretty obvious that there’s a sustainable massive business model around it. Otherwise, we would be doing something wrong.

That’s why we’re so much focused on the usage, on the community, because we believe if we keep figuring that out, if we keep leading on the usage and the adoption, if we keep empowering the community to use our tools and be successful with our tools, there are going to be good things in the future for us, for Hugging Face, and hopefully for the community.

Ben: There are some businesses that are just perfect, like you analyze them. Visa is a good example and you’re like, man, there is basically nothing wrong with this business model. Everything about it is just glorious if you are a shareholder of Visa. And every business shy of Visa has these things where you’re like, that’s an exceptional thing about that business and here’s the thorn in my side that as I’m operating this business, I just can’t escape this thing that sucks.

We’ve talked a lot about all the ways in which you’ve positioned yourself in a remarkable place in the emerging AI ecosystem. What’s the thing that you have to deal with where you’re like, ugh, it is such a thorn in my side.

Clem: For us, inherently we have to almost take a step back from the communities that we’re empowering. That’s a little bit the curse of the platforms. If you think for example of GitHub, it’s probably the company in the past 20 years that has empowered the most the way you build technology. Because virtually all software engineers have used GitHub as the way of collaboratively building, and yet people don’t talk about them. They don’t talk about the product.

It’s not as visible as Facebook, Google, or these companies can be. We have some curse around, I would say, visibility, maybe sexiness. We will never be like an OpenAI in terms of sexiness and hotness and people talking about us, and we will always stay a little bit in the background.

David: Back in the day though, when GitHub was in its earlier years and was a startup, it was very—

Ben: The hundred million dollars series A. I still remember it.

David: Yeah, I remember that for sure. It had plenty of buzz. But to your point, as an infrastructure company or a developer platform writ large, in your case an AI builder platform, you’re more behind the scenes.

Clem: And then another challenge for us is that yes, AI is starting to be mainstream in terms of usage, but if you really look at it, the underlying technology foundations are still evolving really fast. There’s this constant battle between building mature, stable platforms and solutions, but at the same time innovating, iterating fast enough so that you don’t miss the next wave.

For us, more like a company building aspect, it’s something that we always worried about. We have 250 team members in the company. We say that we always want to stay an order of magnitude fewer team members than our peers. We could be 2,000 people, but we prefer to be 200 rather than 2,000 as a way to reconcile this difficult challenge between building really, really fast, but really building tools that scale. That’s an important challenge for us for sure.

David: That’s such a good point you made a minute ago that I hadn’t really considered. We might still be in the sort of Yahoo, AltaVista era of foundational model companies. Many of them are very successful. You know about them, and as you were saying, they make a lot of revenue, but are they fundamentally profitable endeavors yet? Probably not.

Clem: I think we are. Even when you think about how companies are building with AI, to me, an AI company using an API sounds very unintuitive or it doesn’t sound like the optimal way to build AI and more almost like a transitional time where the technology is still a bit too hard for all companies to build AI themselves. But I would be surprised if it didn’t happen.

It’s almost like the early days of software where you had to use, I don’t remember what they were at the time, but like Squarespace, you had to use a no-code platform to build a website.

Ben: Dreamweaver and Microsoft Frontpage, yeah.

Clem: Yeah. Before technology companies could learn, before software engineers could learn to build code themselves. We might be at the same point in AI, where companies are using APIs because they haven’t yet built the capabilities, the trust, the ability to do AI themselves. At some point they will. They know their customers, they know their constraints, they know the value that they’re providing.

At some point in history, all tech companies will be AI companies and that means that all companies, all these technologies, they’re going to build their own models, optimize their own models, fine tune their own models for their own use case, for their own constraints, for their own domains.

Ben: I think this is pretty contrarian too. Coming into this conversation I would’ve fallen in the opposite camp of there are going to be 5–8 players, maybe even consolidating more from there, that need to spend $10–$100 billion every couple of years, and no one else has that ability to spend or attract that research talent, so we all consume their APIs. You’re proposing a very opposite future.

Clem: I’m a bit biased obviously by the usage that we see.

Ben: You’re a lot closer to it than we are.

Clem: As I was saying there’s a new model, dataset, or app that is built on Hugging Face every 10 seconds. I can’t believe that these new models are just created for the sake of new models. I think what we’re seeing is that you need new models because they’re optimized for a specific domain. They’re optimized for a specific latency, for a specific hardware, for a specific use case, so they’re smaller, more efficient, cheaper to run.

Ultimately, I believe in a world where there are almost as many models as code repositories today, and actually if you think about models, they’re somehow similar to code repositories. It’s a tech stack. A model is like a tech stack. I can’t imagine that only a few players are going to build the tech stacks, and that everyone else is just going to try to ping them through APIs to use their tech stacks. I envision a bit of a different one.

Ben: It makes sense. Implicit in your comment is that 99.9%-something of models are inexpensive to train and do inference on, and they’re small and they’re purpose-built. It’s nice that this thing happened in the last 3 years where these God models seem to be able to do everything better than all the specialized models that people spent 10 years building before. But that’s a blip in time and we’re going to shift back to specialized cheap models handling a lot of the labor as everyone gets better at the state of the art.

Clem: Or something in between. It’s always a gradient. I think some companies, some contexts, some use cases will require very large generalist models. Like when you’re doing a ChatGPT, yes, of course you need a big generalist model because users are asking everything.

But when you’re building a banking customer support chatbot, you don’t really need it to tell you the meaning of life. You can save some of the parameters to make sure that your chatbot is smaller, has been trained more on the data that is relevant to you, that is going to cost you less, that is going to reply faster. That’s of course also very dependent on the use cases that you plan to use AI for.

David: I’m curious. If you’re listening to this and thinking about starting a company, thinking about starting an AI company, maybe you have a use case or a vertical use case knowledge that you want to go after, what are the ingredients and skillsets that you need on your team? If you buy what you’re saying of like, hey, you could use APIs, but like really ultimately you want to build your own model, what do you need to build your own model and build a great one?

Clem: For me, the main difference between the software paradigm and the AI paradigm is that AI is much more science-driven than software. It’s a bit of a paradox because in software sometimes we call people computer scientists. But the reality is that they’re not really scientists in the true sense of it.

Ben: Such a misnomer.

David: They’re engineers, yeah.

Ben: This always bothered me studying computer science in college. All of the other sciences are things that occur in our natural world—biology, chemistry, physics—and computer science is like, no, you’re learning how a thing that is manmade works and how to operate it.

Clem: Yeah. To me that’s the main difference between the software paradigm and the AI paradigm. When it comes to founding teams and capabilities, I think having more science background is actually like a must. Having one co-founder who is a scientist, I think is a big, big plus. If you look at most of the successful AI companies, they actually have a science co-founder. We do at Hugging Face. I think OpenAI has. Of course with [...] that’s one big thing.

David: How would you describe the difference in mindset and skillset between a traditional software startup and the engineering skillset you need for that versus the scientist skillset and the research skillset?

Clem: Timing is very different: the way you look at how fast to build something, ship something. When I was working more at software startups, we had the cult of shipping really fast. This might not be as true for AI. I think you want to ship as fast as you can, but realistically, to train a model and optimize a model is more at best a matter of months than a matter of days. You probably want to look differently at how you’re shipping, how fast you’re shipping, how you are iterating on things.

The skills are quite different too. I think an AI scientist has the potential to be more skilled at pure math than an engineer. I think thinking more in terms of, how can I make foundational or meaningful progress compared to the state of the art, and you like looking at bigger scale of improvement.

I think in the software paradigm you can almost think like, okay, if I make my product 5% better than others, it’s going to be enough because I’m going to make it 5% better now and then in two weeks, 5% more and in two weeks, 5% more. At some point you’ll have enough differential in terms of value add to get users and convince and retain users. For science, it’s almost like you don’t create any value. You work on something for 6 months and then after 6 months you have something 10 times better than the existing.

In a way, that’s what OpenAI did. They worked for six years, barely releasing anything or anything successful, but at some point they were able to release something that was probably 10 times better than others. That’s a different way of looking at it too.

Ben: I’d push back on that. I think that’s a little bit revisionist history. I’m sure you were watching OpenAI very closely. It felt like they were releasing all sorts of stuff. None of it had any commercial value and all of it felt super researchy. But that thing where they trained Universe on Grand Theft Auto, I mean the GPT and GPT-2 weren’t known in the mainstream, but it was pretty remarkable watching that.

I think them going all in on the transformer and deciding, hey, we need to fundamentally change the set of things that we’re working on. I think that company has worked incredibly fast, shipped pretty fast, and now they’re shipping faster than ever because they’re actually in this arms race. I definitely don’t think of them as a, go away, think and build for 10 years, and then finally release something.

Clem: They did release a lot of things, but compared to their size and their scale, knowing that they started with a $1 billion investment, maybe they were releasing one thing every three or six months. Relative to their size, scale, and the amount of money that they raised, I think they were shipping and releasing far fewer things than a typical software company would have with their budget. But I agree with you that it was an iterative way.

David: I guess to the point too, if you have a large model, you’re not going to do continuous deployment because you have to retrain the model if nothing else.

Clem: Yeah, it’s just a different approach. The best advice I give to people is to trash their lean startup book when they’re starting an AI company, because I think these things have been so ingrained into our minds, into our way of building like software entrepreneurs, that it’s really easy to fall into the trap of doing it without even realizing we do it, instead of completely changing the paradigm, changing the operating system of the startup builder, which in my opinion, leads to much better results.

Ben: Well, Clem, this brings us to a topic that I’ve been wanting to ask you about, which I think will be our last major topic for today. In the discussion of which approach will win in the marketplace of open source versus closed source AI, there’s a pretty compelling argument, which is, as more real time training data is required, people’s interactions with an application will become incredibly valuable to fine tune or train the next version of the model.

There’s a compelling argument that closed source AI will win because they’re just going to get all of that directly from users, when you own the model and the application and you have tightly integrated everything. Versus in the open source world, like great, you publish something and then a bunch of people fork it and they build their own applications, and then the real time interaction data with the application doesn’t make its way all the way back upstream to make the model smarter. How do you think about that?

Clem: Well, I think a lot of people are thinking and talking about moats and economies of scale for AI. I think that all of that is an open question at this point. I think nobody really knows how to create a moat or how to generate economies of scale for AI.

My intuition is that they’re not going to be so different from the software paradigm, and that you’re going to find the same moats maybe applied differently, but you are going to have the cost economies of scale. Similar maybe to a cloud or hardware provider who can get an advantage from larger scale to reduce prices.

I think you’re going to have social moats or the network effect. That’s more like the game that we play in where when you have collaborative usage in a way, like your platform becomes more and more useful the more users you have. It makes it difficult for anyone to compete with you. That’s why GitHub has never really been challenged or that’s why social networks are arguably very hard to compete with.

They’re going to maybe be more intense than in the software paradigm. Maybe the cost moat of compute will be more extreme, but it’s an open question because if you think about some of the current winners, they didn’t have so much of these advantages from the get-go.

If you look at OpenAI, they didn’t really have more access to data than most companies. They ended up scraping the web and getting data that everyone else could get.

If you look at Hugging Face, I don’t think that going in we had any specific advantage, except being as community driven as we were, which enabled us to develop the social network effects.

It’s still an open question. I would be careful of people and companies overplaying and overhyping one moat compared to others. Even if you think ethically about the world that we need, I hope that we are not going to have just a few companies winning. It would be a shame, it would be quite sad if we ended up with just five companies winning in AI. I think it would be dangerous.

Imagine if only a few companies were able to do software, we would be in a very different world than we are today. I hope many companies win. I think the technology is impactful enough so that there can be almost more AI companies winning than software companies in the past. That’d be very exciting to me.

Ben: And you make a very credible argument that it’s going to empower more people than ever to build products. It stands to reason that there should be more companies or at least more attempts to start companies that can serve a particular customer need in this generation than any previous generation before.

Clem: AI is the opportunity of the century to shake things up, break the monopolies, break and flag the established positions, and do something a bit new.

David: I’m curious to get there. Do you think that we need just a lot more people getting trained in how to be AI builders and AI scientists? Or do we need the tools and infrastructure to get a lot easier to use? Or both?

Clem: Both, but I think it’s much more important that we get many more AI builders than we do today. If you’re looking at Hugging Face, as I was saying, we have five million AI builders. We can assume like most AI builders are using Hugging Face one way or another. You can estimate that there are around five million AI builders in the world today.

There are probably around 50 million software engineers or software builders, depending on how you set this definition. I think GitHub has over 100 million users. A lot of them obviously are not software engineers, but probably half of them.

We are still in the early innings. It wouldn’t be surprising if in a few years you had more AI builders than software builders. Maybe in a few years you’re going to have 50 million, 100 million AI builders, or even more, because the beauty of AI is that it’s a bit less constrained than software in the way that people can contribute to it.

In software building, you have to learn a programming language and write lines of code, which is a pretty high barrier to entry, versus for AI you can be considered an AI builder if you contribute expertise, if you contribute data to a model that improves the model. Maybe we’re going to have 10 times more AI builders than software builders, which would be also good for the world because it would mean that more people could contribute, could understand, and could shape the technology more aligned with what they want.

I think sometimes in San Francisco, in Silicon Valley or in tech in general, we forget that it’s a very small number of people shaping products for a much bigger number of people. Whereas if you maybe include more people in the building process, you can not only build better products, but more inclusive products, maybe products that can solve more social issues than we’ve been solving. That’s quite an exciting future for sure.

Ben: Well, Clem, I can’t imagine a better place to leave it. Where should listeners go to learn more about you or Hugging Face or get involved?

Clem: huggingface.co, actually .com. We just got the .com a few days ago.

Ben: Hey, congratulations.

Clem: Yes. It’s a good example that you shouldn’t sweat the small things early on. Our name, Hugging Face, is obviously very unusual for the things we do. Our domain name for seven years, we kept huggingface.co, but it didn’t create too many problems for us. I’m on Twitter, I share a lot on X and on LinkedIn, so you can follow me there or ask me questions there, and happy to answer.

Ben: Awesome. Well thank you so much and listeners, we’ll see you next time.

David: We’ll see you next time.

Note: Acquired hosts and guests may hold assets discussed in this episode. This podcast is not investment advice, and is intended for informational and entertainment purposes only. You should do your own research and make your own independent decisions when considering any financial transactions.
