Founded in late 2015, OpenAI maintains the philosophy that giant corporations should not control the development of advanced technology. The non-profit organization aims to research artificial intelligence (AI) to discover its potential and its benefits to society. The goal is to produce open-source software and applications that allow researchers to develop AI systems. Since its founding, the organization has racked up several impressive achievements, which are the primary focus of this article.
What is OpenAI?
OpenAI is a research laboratory, founded as a non-profit organization, that aims to promote AI technology that benefits society.
It was founded in late 2015 by several entrepreneurs, including Elon Musk and Sam Altman.
Together, they pledged $1 billion to support the development of AI systems that are developer-friendly.
Although Musk resigned from the board after three years, he remained a donor and an advocate for OpenAI.
Since then, the organization has seemingly drifted away from its initial objective of not developing software in order to generate financial returns.
In 2019, OpenAI accepted a $1 billion investment from Microsoft, one of the world’s most prominent tech companies.
On January 23rd, 2023, Microsoft and OpenAI finalized a multi-billion-dollar, multi-year deal in which Microsoft provides the infrastructure OpenAI needs to keep developing and operating its products.
In return, Microsoft gets commercial exclusivity in the integration and distribution of these products.
As OpenAI explained:
This multi-year, multi-billion dollar investment from Microsoft follows their previous investments in 2019 and 2021, and will allow us to continue our independent research and develop AI that is increasingly safe, useful, and powerful.
As Microsoft announced, the deal revolves around three pillars:
Supercomputing at scale
Microsoft will increase our investments in the development and deployment of specialized supercomputing systems to accelerate OpenAI’s groundbreaking independent AI research. We will also continue to build out Azure’s leading AI infrastructure to help customers build and deploy their AI applications on a global scale.
New AI-powered experiences
Microsoft will deploy OpenAI’s models across our consumer and enterprise products and introduce new categories of digital experiences built on OpenAI’s technology. This includes Microsoft’s Azure OpenAI Service, which empowers developers to build cutting-edge AI applications through direct access to OpenAI models backed by Azure’s trusted, enterprise-grade capabilities and AI-optimized infrastructure and tools.
Exclusive cloud provider
As OpenAI’s exclusive cloud provider, Azure will power all OpenAI workloads across research, products and API services.
OpenAI Products Throughout the Years
The organization was structured to be non-profit to focus on its main goal — researching AI technology.
The primary purpose of OpenAI is to develop artificial intelligence that has a positive, long-term impact.
There are always risks in advancing a powerful technology such as AI.
A technology this complex can easily be abused.
With such great power at stake, OpenAI makes it its mission to steer the technology toward a positive and prosperous future.
Overall, the organization develops technologies to empower people to utilize AI for the betterment of the world.
Much of OpenAI's early research focused on a machine learning paradigm called reinforcement learning.
In it, an agent learns by trial and error: it takes actions, receives rewards, and uses that feedback as the basis for future actions. With that in mind, here are the products and applications that OpenAI developed throughout the years.
One of the first software tools the organization created is called Gym.
It is an open-source library that researchers can use to develop and compare reinforcement learning algorithms.
It gives developers a wide range of environments in which to test AI agents.
The toolkit also standardizes how environments are defined in AI research publications, which makes published results easier to reproduce.
By late 2017, the Gym documentation site was no longer maintained, and active work had moved to OpenAI’s GitHub page.
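To make the idea of a reinforcement learning environment more concrete, here is a minimal sketch of the agent-environment loop. It does not use the Gym library itself: the CoinGuessEnv class, its reset and step methods, and the reward scheme are hypothetical names chosen for illustration, loosely modeled on the reset/step style of interface that such toolkits expose.

```python
import random


class CoinGuessEnv:
    """Hypothetical toy environment (not part of Gym): guess a biased coin."""

    def __init__(self, episode_length=10, seed=0):
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.steps = 0

    def reset(self):
        """Start a new episode and return the initial (dummy) observation."""
        self.steps = 0
        return 0

    def step(self, action):
        """Apply an action (0 or 1) and return (observation, reward, done, info)."""
        # The hidden coin lands on 1 with probability 0.8.
        coin = 1 if self.rng.random() < 0.8 else 0
        reward = 1.0 if action == coin else 0.0
        self.steps += 1
        done = self.steps >= self.episode_length
        return 0, reward, done, {}


def run_episode(env, policy):
    """Run one episode with the given policy and return the total reward."""
    obs = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = policy(obs)
        obs, reward, done, _info = env.step(action)
        total_reward += reward
    return total_reward


if __name__ == "__main__":
    env = CoinGuessEnv(episode_length=10, seed=0)
    print("always guess 1:", run_episode(env, lambda obs: 1))
    print("always guess 0:", run_episode(env, lambda obs: 0))
```

A learning algorithm would replace the fixed policies above with one that adjusts its guesses based on the rewards it collects. Keeping the environment behind a small, uniform interface is what lets researchers compare different algorithms on the same task.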
RoboSumo involves humanoid “meta-learning” robots that compete against one another.
The main goal is to let simulated agents learn physical skills, including ducking, pushing, and moving around.
Inside the arena, competition pushes the agents to overcome adversity and adapt to changing conditions.
The research found that when an agent trained this way was placed in a whole new environment with high winds, it applied what it had learned through adversarial training in a generalized way.
The Debate Game is another application developed by OpenAI in 2018.
Machines debate various toy problems in the presence of a human judge.
In hopes of developing explainable AI, this research explores the influence of AI in making crucial decisions.
OpenAI developed Dactyl to manipulate objects using a Shadow Dexterous Hand.
Using the same reinforcement learning algorithm code as OpenAI Five, Dactyl explores AI’s role in robotics.
Generative Models of OpenAI
One of the most important subjects that OpenAI has explored is generative models.
Researchers use these models to probe the capabilities of artificial intelligence.
The approach involves collecting a large volume of data from some domain and training a model to generate data like it.
For instance, a generative model trained on books learns to produce text that resembles them.
For this to work, the neural networks have a number of parameters significantly smaller than the amount of data they are trained on.
As a result, the models cannot simply memorize the data and must discover its essential structure in order to generate it.
The first publication on generative pre-training (GPT) of a language model appeared in June 2018.
Published on the OpenAI website, it showed how a generative model of language acquires knowledge by pre-training on large amounts of text.
As the successor to GPT, GPT-2 is a generative model trained to predict the next word in 40 gigabytes of internet text.
This transformer-based language model has 1.5 billion parameters and was trained on a dataset of 8 million web pages.
The model is trained on a simple objective: predict the next word after being given a string of text on a particular topic.
Because the training data spans diverse domains, the model can continue a sample text in a way that fits its context.
GPT-2 can perform tasks involving question answering, reading comprehension, summarization, and translation.
The model picks up these tasks from raw text alone, without any task-specific training data.
As a result, the tasks are achieved without supervision.
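The next-word objective can be illustrated with a deliberately tiny stand-in. The sketch below is not a transformer and not how GPT-2 works internally: it is a simple bigram count model that only looks at the single previous word, whereas GPT-2 conditions on all of the preceding text. The function names and the toy corpus are hypothetical, chosen for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    follow_counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        follow_counts[current_word][next_word] += 1
    return follow_counts


def predict_next_word(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy corpus (an assumption for illustration); no labels are needed,
# the raw text itself provides the training signal.
corpus = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."
model = train_bigram_model(corpus)

if __name__ == "__main__":
    for prompt in ["the", "sat", "on"]:
        print(prompt, "->", predict_next_word(model, prompt))
```

The point of the sketch is that the training signal comes for free from the text: every word is the "label" for the words before it, which is why this kind of training can scale to very large amounts of raw data.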
Following concerns about the potential abuse of such advanced technology, OpenAI did not initially release the full trained GPT-2 model.
However, those who were interested could still experiment with a smaller model to try out its capabilities.
OpenAI first introduced GPT-3 in May 2020.
Access to the private beta was limited to a small number of people who had requested it.
In September 2020, GPT-3 was licensed exclusively to Microsoft.
As the successor to GPT-2, it improves predictive capacity when exposed to streams of text in a range of different styles.
With its parameter count increased to 175 billion, GPT-3 stands as a leading language model that overcomes certain limitations of GPT-2.
ChatGPT and the explosion of AI commercial use cases
With the rise of generative models and the rapid improvement of GPT-3, a number of companies have been built on top of these players.
OpenAI’s latest release, a conversational interface called ChatGPT, left everyone in the industry astounded.
When I tested it, I was blown away by its practical capabilities.
We thought AI would start from the bottom, disrupting physical labor and less specialized work first, and only then move upstream to knowledge workers.
Instead, the opposite happened!
AI works incredibly well right now for creative endeavors, thus showing that it’s very hard to predict the evolution of technology.
This means that the whole knowledge economy might be the first to be completely redefined by AI.
But what’s ChatGPT?
I asked it to define itself, and that is what it said!
I asked ChatGPT whether it would kill Google.
Right now, on top of OpenAI and other generative models, we’ve already seen the rise of multi-billion-dollar companies!
In 2022 alone:
- Stability AI announced $101 million in funding for open-source artificial intelligence.
- Jasper AI, a startup developing what it describes as an “AI content” platform, raised $125 million at a $1.5 billion valuation.
- OpenAI, valued at nearly $20 billion, is in advanced talks with Microsoft for more funding.
Every startup, in the next five years, will become an AI company.
This will give rise to even leaner startups able to build trillion-dollar empires with small teams!
OpenAI Business Model
Right now, OpenAI is organized around two entities: a non-profit and a “capped-profit” organization.
This transition started in 2019.
The AI research lab
In a post back in June 2016, as OpenAI ramped up its research into generative models, they also explained the goal of what, at the time, was primarily a research lab.
As they explained back then:
OpenAI’s mission is to build safe AI, and ensure AI’s benefits are as widely and evenly distributed as possible. We’re trying to build AI as part of a larger community, and we want to share our plans and capabilities along the way. We’re also working to solidify our organization’s governance structure and will share our thoughts on that later this year.
Back in 2016, as OpenAI started to ramp up its research on generative models, which required substantial computing power, it partnered up with Microsoft.
As Microsoft also announced back in 2016, the main goal of the partnership would be to “democratize Artificial Intelligence (AI), to take it from the ivory towers and make it accessible for all.”
Microsoft would do it through a four-pronged approach, which the company described as follows:
- Harness artificial intelligence to fundamentally change how we interact with the ambient computing, the agents, in our lives.
- Infuse every application that we interact with, on any device, at any point in time, with intelligence.
- Make these same intelligent capabilities that are infused in our own apps — the cognitive capabilities — available to every application developer in the world.
- Building the world’s most powerful AI supercomputer and making it available to anyone, via the cloud, to enable all to harness its power and tackle AI challenges, large and small.
That was the beginning of a partnership that would lead to one of the most important deals of our time, the multi-billion-dollar commercial partnership between Microsoft and OpenAI!
The turning point, GPT-2
By February 2019, as OpenAI had been working on scaling generative models for a few years, it became clear that the path forward was promising.
And that became apparent with the release of GPT-2.
As OpenAI explained back then:
We’ve trained a large-scale unsupervised language model which generates coherent paragraphs of text, achieves state-of-the-art performance on many language modeling benchmarks, and performs rudimentary reading comprehension, machine translation, question answering, and summarization—all without task-specific training.
GPT-2 was the successor of GPT, released in June 2018.
As OpenAI explained back in 2018, the approach works in two stages:
- First they train a transformer model on a very large amount of data in an unsupervised manner — using language modeling as a training signal.
- Then they fine-tune this model on much smaller supervised datasets to help it solve specific tasks.
The turning point of this approach was the removal of the key drawbacks of supervised learning.
Indeed, as OpenAI explained back in 2018:
Supervised learning is at the core of most of the recent success of machine learning. However, it can require large, carefully cleaned, and expensive to create datasets to work well. Unsupervised learning is attractive because of its potential to address these drawbacks. Since unsupervised learning removes the bottleneck of explicit human labeling it also scales well with current trends of increasing compute and availability of raw data. Unsupervised learning is a very active area of research but practical uses of it are often still limited.
This step is critical to understanding the intersection between Cloud and AI.
While unsupervised learning did provide a leap forward for those large generative models, it also proved to be extremely expensive, in terms of computing power, to perform this sort of training (which needs to be done only once).
Indeed, in 2018, OpenAI highlighted the massive compute requirements of generative models like GPT going forward.
Compute Requirements: Many previous approaches to NLP tasks train relatively small models on a single GPU from scratch. Our approach requires an expensive pre-training step – 1 month on 8 GPUs. Luckily, this only has to be done once and we’re releasing our model so others can avoid it. It is also a large model (in comparison to prior work) and consequently uses more compute and memory — we used a 37-layer (12 block) Transformer architecture, and we train on sequences of up to 512 tokens. Most experiments were conducted on 4 and 8 GPU systems. The model does fine-tune to new tasks very quickly which helps mitigate the additional resource requirements.
Yet with the Microsoft Azure partnership in place, pre-training these models with a massive amount of computing power became feasible.
And that also led to the release of GPT-2.
This model was trained simply to predict the next word in 40GB of Internet text.
GPT-2 has 1.5 billion parameters and was trained on a dataset of 8 million web pages.
GPT-2 was a direct scale-up of GPT, with more than 10X the parameters and trained on more than 10X the amount of data.
That’s it, scale!
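As a rough back-of-the-envelope check on what "scale" means here, the sketch below computes the ratios between model sizes and the pre-training budget quoted earlier. The 1.5 billion and 175 billion parameter counts and the "1 month on 8 GPUs" figure come from this article; the 117 million parameter count for the original GPT and the 30-day month are assumptions added for illustration.

```python
# Parameter counts. GPT-2 and GPT-3 figures are quoted in the article;
# the original GPT figure (117 million) is an assumption, not from the article.
PARAMS = {
    "GPT": 117e6,
    "GPT-2": 1.5e9,
    "GPT-3": 175e9,
}


def scale_factor(smaller, larger):
    """How many times more parameters `larger` has than `smaller`."""
    return PARAMS[larger] / PARAMS[smaller]


def gpu_days(num_gpus, days):
    """Total compute budget expressed in GPU-days."""
    return num_gpus * days


if __name__ == "__main__":
    print(f"GPT -> GPT-2: {scale_factor('GPT', 'GPT-2'):.1f}x parameters")
    print(f"GPT-2 -> GPT-3: {scale_factor('GPT-2', 'GPT-3'):.1f}x parameters")
    # "1 month on 8 GPUs", assuming a 30-day month.
    print(f"GPT pre-training: about {gpu_days(8, 30)} GPU-days")
```

Under these assumptions, each generation is one to two orders of magnitude larger than the previous one, which is consistent with the "more than 10X" description above and helps explain why access to large-scale cloud compute became so important.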
Yet, GPT-2 was only released in a small version, as OpenAI feared it could be used to manipulate content on the web.
Thus, OpenAI opted for a staged release of GPT-2 involving the gradual release of a family of models over time.
The purpose of this staged release of GPT-2 would be to give people time to assess the properties of these models, discuss their societal implications, and evaluate the impacts of release after each stage.
Eventually, in the same year, OpenAI would release a larger version of GPT-2, taking into account a few key factors:
- The ease of use (by various users) of different model sizes for generating coherent text.
- The role of humans in the text generation process.
- The likelihood and timing of future replication and publication by others.
- Evidence of use in the wild and expert-informed inferences about unobservable uses.
From the research lab to a for-profit, capped organization
As OpenAI research quickly moved forward, the team might have realized that by further scaling these models up (by having more parameters and more data), they could create a much more powerful version of GPT.
That was also the time in which OpenAI transitioned from a research lab into a hybrid, two-headed entity.
Indeed, a few months before announcing a one-billion-dollar partnership with Microsoft (which might have sped up the development of GPT-3), OpenAI announced the creation of an LP.
As OpenAI explained in March 2019, when the LP was created: “We’ve created OpenAI LP, a new ‘capped-profit’ company that allows us to rapidly increase our investments in compute and talent while including checks and balances to actualize our mission.”
As OpenAI highlighted, once they had understood that most of the progress, from GPT-2 going forward, was about scale (more computing power needed to get more parameters in, and more data to process) the creation of the LP was a way to bring in more capital to scale these models.
Yet the LP operates as a capped organization, meaning that returns to investors and employees are limited: beyond a certain threshold, any additional value generated by OpenAI’s models goes back to the non-profit organization.
In addition, the OpenAI LP’s overall mission would be aligned with the non-profit, thus ensuring the creation and adoption of safe and beneficial AGI—ahead of generating returns for investors.
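The capped structure described above can be sketched as simple arithmetic. This is an illustration only: the split_returns function and its cap_multiple parameter are hypothetical, the article does not state the actual threshold, and the 100x multiple used in the example is an assumed value.

```python
def split_returns(investment, gross_return, cap_multiple):
    """Split a gross return between a capped investor and the non-profit.

    The investor receives at most `investment * cap_multiple`; anything
    above that cap flows to the non-profit. All names are hypothetical.
    """
    if investment < 0 or gross_return < 0 or cap_multiple <= 0:
        raise ValueError("investment and gross_return must be >= 0, cap_multiple > 0")
    cap = investment * cap_multiple
    to_investor = min(gross_return, cap)
    to_nonprofit = gross_return - to_investor
    return to_investor, to_nonprofit


if __name__ == "__main__":
    # Assumed cap of 100x on a 10 (million) investment, for illustration only.
    print(split_returns(10, 500, 100))   # below the cap
    print(split_returns(10, 1500, 100))  # above the cap
```

Below the cap, the investor keeps the entire return. Above it, the excess is redirected to the non-profit, which is how the structure tries to keep the mission ahead of investor returns.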
To keep OpenAI LP on track with its mission, the capped for-profit company is controlled by the OpenAI Nonprofit’s board.
The Nonprofit board comprises employees of the LP, such as Greg Brockman (Chairman & President), Ilya Sutskever (Chief Scientist), and Sam Altman (CEO), and non-employees Adam D’Angelo (co-founder and CEO of Quora), Reid Hoffman (co-founder of LinkedIn, now owned by Microsoft), Will Hurd, Tasha McCauley, Helen Toner, and Shivon Zilis.
Investors in the LP included Microsoft, Reid Hoffman’s charitable foundation, and Khosla Ventures.
As OpenAI split its structure into two entities (OpenAI LP, led by the Nonprofit OpenAI), it also set rules to prevent conflicts of interest.
Indeed, only a minority of board members are allowed to hold financial stakes in the partnership at once.
Furthermore, only board members without such stakes can vote on decisions where the interests of limited partners and OpenAI Nonprofit’s mission may conflict—including any decisions about making payouts to investors and employees.
When it was established in 2019, OpenAI LP already counted around 100 people organized into three main areas:
- Capabilities (advancing what AI systems can do).
- Safety (ensuring those systems are aligned with human values).
- Policy (ensuring appropriate governance for such systems).
That is what OpenAI’s organizational structure still looks like today.
The one-billion-dollar partnership with Microsoft
As OpenAI established its LP, it also deepened the partnership with Microsoft, announcing a one-billion-dollar investment in July 2019.
As OpenAI explained, the partnership would be used “to develop a hardware and software platform within Microsoft Azure which will scale to AGI.”
OpenAI and Microsoft would jointly develop new Azure AI supercomputing technologies, with Microsoft becoming OpenAI’s exclusive cloud provider.
This would prove a critical and winning bet for Microsoft Azure, which developed its capabilities in large-scale AI systems thanks to OpenAI.
As Microsoft explained, in 2019:
- Microsoft and OpenAI will jointly build new Azure AI supercomputing technologies.
- OpenAI will port its services to run on Microsoft Azure, which it will use to create new AI technologies and deliver on the promise of artificial general intelligence.
- Microsoft will become OpenAI’s preferred partner for commercializing new AI technologies.
With the financial backing from Microsoft, and the ability to run these AI models on a powerful AI cloud infrastructure like Azure, it was finally possible for the OpenAI team to further scale up its models, which led to the release of GPT-3!
By June 2020, GPT-3 would be released together with its APIs, which would enable the explosion of a whole industry.
And a few months after, in September 2020, OpenAI announced the commercial licensing of GPT-3 to Microsoft. That would be a turning point for a few reasons.
First, the open APIs enabled anyone to build a business on top of the foundational layer.
Second, it showed that with further scale, those generative models could get much more powerful.
Third, it further sealed the partnership with Microsoft, which, with Azure AI Cloud infrastructure, became the de facto computing platform for OpenAI.
ChatGPT, the iPhone moment. A blessing and a curse!
In November 2022, OpenAI released ChatGPT, a conversational interface powered by the latest version of GPT and fine-tuned on human interactions.
That broke the internet!
Yet that highlighted, to me, that scaling ChatGPT is not just a technological issue but primarily a business modeling issue.
Indeed, in order for OpenAI to keep maintaining and sustaining momentum, it needed to avoid an implosion due to this hypergrowth.
That put OpenAI in a squeeze. First, it had to make sure it could keep sustaining the free traffic of ChatGPT.
Second, it had to make sure the research team could keep focusing on the advancement of GPT-4.
Third, and just as important, it had to maintain the open API infrastructure at the foundation of the business ecosystem that had grown up since 2020.
Those things combined are shaping the OpenAI-Microsoft deal.
Ok, now it makes sense to go back and revisit the deal in light of the timeline above!
ChatGPT premium, and in search of a business model!
Indeed, if you have not been following, it seems that Microsoft is planning to pour $10 billion into the partnership with OpenAI in exchange for a substantial stake in the for-profit organization.
On the one hand, OpenAI is looking into a premium version of ChatGPT, as shared by Greg Brockman (Chairman & President).
On the other hand, OpenAI is looking for a further infusion of capital from Microsoft.
It will be interesting to watch whether the deal will change anything in the corporate structure of OpenAI.