Noam Shazeer

According to his LinkedIn profile, researcher Noam Shazeer “invented much of the current revolution in large language models” such as the transformer architecture in 2017. His primary areas of study are AI, machine learning, deep learning, and machine translation. 

In the sections that follow, let’s take a look at this lesser-known figure in artificial intelligence and machine learning. 

Google

Shazeer had been invited to study at Duke University as part of a mathematics scholarship, but at some point decided he wanted to become a computer scientist. 

After graduating from Duke, he took up a role at Google as a software engineer in 2000 where he remained on and off for almost two decades. He joined Google because it seemed “like an interesting place to work” despite overlooking the company at a 1999 jobs fair at Berkeley. 

In his job interview with Gmail creator Paul Buchheit, Shazeer was asked how he would implement a spelling corrector. His answer, which was a far better solution than Google had in place at the time, was to verify queries using statistics against logs of what other users had typed.
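The idea behind that answer can be illustrated with a minimal sketch. It is not Google’s implementation: the query log, the function names, and the single-edit candidate search are all assumptions made for illustration. The sketch counts how often each word appears in logged queries, then replaces a typed word with the most frequently logged word that is at most one edit away.

```python
from collections import Counter

# Hypothetical query log, invented for illustration only.
QUERY_LOG = [
    "britney spears", "britney spears", "britney spears",
    "britney spears", "britney spears", "brittney spears",
    "weather forecast", "weather forecast",
]

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def word_counts(queries):
    """Count how often each word appears across all logged queries."""
    counts = Counter()
    for query in queries:
        counts.update(query.lower().split())
    return counts


def edits1(word):
    """Return every string one delete, swap, replace, or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    swaps = [left + right[1] + right[0] + right[2:]
             for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:]
                for left, right in splits if right for c in ALPHABET]
    inserts = [left + c + right for left, right in splits for c in ALPHABET]
    return set(deletes + swaps + replaces + inserts)


def correct_word(word, counts):
    """Pick the most frequently logged word among the word and its one-edit neighbours."""
    candidates = {w for w in edits1(word) if w in counts}
    if word in counts:
        candidates.add(word)
    if not candidates:
        return word  # nothing in the log resembles this word; leave it alone
    # Ties on frequency are broken alphabetically so the result is deterministic.
    return max(candidates, key=lambda w: (counts[w], w))


def correct_query(query, counts):
    """Correct each word of a query independently."""
    return " ".join(correct_word(w, counts) for w in query.lower().split())


if __name__ == "__main__":
    counts = word_counts(QUERY_LOG)
    print(correct_query("brittney speers", counts))
```

Because the choice is driven by frequency alone, a rare misspelling that does appear in the log (here “brittney”) is still corrected to its far more common neighbour. A production system would presumably weigh much more context than this toy version does.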

Shazeer also worked on the first version of Google’s calculator and, as we hinted at earlier, was a key contributor to the transformer architecture.

Google AdSense

While Google’s search algorithm is well-known and extensively studied, few people know of the company’s second most important algorithm: PHIL.

Developed by Shazeer and Georges Harik, PHIL (Probabilistic Hierarchical Inferential Learner) was an algorithm that decided which AdSense ads should be served on specific pages and, in the process, avoided showing inappropriate or irrelevant content. 

PHIL ultimately became the heart of the AdSense system. According to author Steven Levy, this ran counter to the general belief that AdSense was powered by the advertising technology of Applied Semantics, which Google acquired in 2003. 

Two patents filed by Shazeer and Harik in 2002 and 2004 provide some backstory to the algorithm’s development, and the ideas they discuss about how words and concepts could be classified were arguably precursors to later work in AI and ML.

The two patents are:

  1. US Patent 7,383,258 – Method and apparatus for characterizing documents based on clusters of related words, and
  2. US Patent 7,231,393 – Method and apparatus for learning a probabilistic generative model for text. 

Harik Shazeer Labs

In 2008, Harik and Shazeer founded the private non-profit Harik Shazeer Labs in Los Angeles, California. Shazeer serves as President and Director, while Harik is a Director as well as the company’s CFO and Secretary.

Little is known about the company or what projects it is involved in, but at some point, Harik and Shazeer were joined by ML research scientist Sergey Pankov. Pankov has been at the company for over 14 years and enjoys difficult and ambitious problems related to ML, AI, control theory, and theoretical physics.

Project LaMDA

Over his time with Google, Shazeer was promoted to a principal role where he worked on project LaMDA. This somewhat secretive project involves chatbots trained to converse like humans with large language models that consume trillions of words scraped from the internet.

Project LaMDA started life as Meena – the brainchild of fellow Google employee Daniel De Freitas. Though principally assigned to YouTube, De Freitas worked on a chatbot that could mimic human conversation as a side project.

Meena was trained on 40 billion words from social media discussions – a number that far eclipsed the eight million web pages GPT-2 was trained on. While Google never released a version that researchers could test, work on the project continued. 

Shazeer then joined the team and Meena was renamed Project LaMDA – a loose acronym for Language Model for Dialogue Applications. In 2020, Shazeer and De Freitas managed to integrate the chatbot into Google Assistant but were again stymied by Google executives who were not interested in a public demo.

CEO Sundar Pichai tried to appease the pair, asking them to continue their work without committing to a release date. Frustrated, Shazeer and De Freitas left Google in 2021 after long and successful stints at the company.

Character.ai

Shazeer co-founded the chatbot Character.ai with Google counterpart Daniel De Freitas in November 2021. The chatbot, which amassed hundreds of thousands of users in the first three weeks of beta tests, enables users to converse with impersonations of well-known figures, real and fictional, such as Donald Trump, Sherlock Holmes, and Albert Einstein. 

Character.ai can also be used as a text-based adventure game where the user is talked through various scenarios – one of the most popular is one in which an AI is in control of a spaceship and tries to elude an enemy. Other users have created chatbots of deceased relatives or their favorite authors.

In an interview with The Washington Post, Shazeer said the pair left Google to share new AI tech with as many people as possible: “I thought, ‘Let’s build a product now that can help millions and billions of people. Especially in the age of covid, there are just millions of people who are feeling isolated or lonely or need someone to talk to.’”

While Shazeer’s intentions were seen as noble, his and De Freitas’ exit from Google was part of a larger so-called “brain drain” of talent from major AI companies.

There is no doubt that Google’s reluctance to publicly test new AI (and slow response to the industry in general) also played a part in Shazeer striking out on his own to found Character.ai.

Key takeaways

  • According to his LinkedIn profile, machine learning researcher Noam Shazeer “invented much of the current revolution in large language models” such as the transformer architecture in 2017.
  • After graduating from Duke, he took up a role at Google as a software engineer in 2000 where he remained on and off for almost 20 years. In his interview with Gmail creator Paul Buchheit, Shazeer came up with a spell corrector that landed him the job.
  • Shazeer co-founded the chatbot Character.ai with Google counterpart Daniel De Freitas in November 2021. Among other things, the chatbot enables users to converse with impersonations of various well-known figures.

Connected Business Model Analyses

AGI

Generalized AI consists of devices or systems that can handle all sorts of tasks on their own. Work on generalized AI eventually led to the development of machine learning. As an extension of AI, machine learning (ML) uses algorithms to build programs that automate actions. Without being explicitly programmed, systems can learn and improve with experience. ML explores large sets of data to find common patterns and formulate analytical models through learning.

Deep Learning vs. Machine Learning

Machine learning is a subset of artificial intelligence where algorithms parse data, learn from experience, and make better decisions in the future. Deep learning is a subset of machine learning where numerous algorithms are structured into layers to create artificial neural networks (ANNs). These networks can solve complex problems and allow the machine to train itself to perform a task.

DevOps

DevOps refers to a series of practices used to automate software development processes. The term combines “development” and “operations” to emphasize how functions integrate across IT teams. DevOps strategies promote seamless building, testing, and deployment of products. The aim is to bridge the gap between development and operations teams and streamline development as a whole.

AIOps

AIOps is the application of artificial intelligence to IT operations. It has become particularly useful for modern IT management in hybridized, distributed, and dynamic environments. AIOps has become a key operational component of modern digital-based organizations, built around software and algorithms.

Machine Learning Ops

Machine Learning Ops (MLOps) describes a suite of best practices that successfully help a business run artificial intelligence. It consists of the skills, workflows, and processes to create, run, and maintain machine learning models to help various operational processes within organizations.

OpenAI Organizational Structure

OpenAI is an artificial intelligence research laboratory that transitioned into a for-profit organization in 2019. The corporate structure is organized around two entities: OpenAI, Inc., which is a single-member Delaware LLC controlled by OpenAI non-profit, and OpenAI LP, which is a capped, for-profit organization. OpenAI LP is governed by the board of OpenAI, Inc. (the foundation), which acts as a General Partner. At the same time, Limited Partners comprise employees of the LP, some of the board members, and other investors like Reid Hoffman’s charitable foundation, Khosla Ventures, and Microsoft, the leading investor in the LP.

OpenAI Business Model

OpenAI has built the foundational layer of the AI industry. With large generative models like GPT-3 and DALL-E, OpenAI offers API access to businesses that want to develop applications on top of its foundational models while being able to plug these models into their products and customize these models with proprietary data and additional AI features. On the other hand, OpenAI also released ChatGPT, developed around a freemium model. Microsoft also commercializes OpenAI’s products through its commercial partnership.

OpenAI/Microsoft

OpenAI and Microsoft partnered up from a commercial standpoint. The history of the partnership started in 2016 and consolidated in 2019, with Microsoft investing a billion dollars into the partnership. It’s now taking a leap forward, with Microsoft in talks to put $10 billion into this partnership. Microsoft, through OpenAI, is developing its Azure AI Supercomputer while enhancing its Azure Enterprise Platform and integrating OpenAI’s models into its business and consumer products (GitHub, Office, Bing).

Stability AI Business Model

Stability AI is the entity behind Stable Diffusion. Stability AI makes money from its AI products and from providing AI consulting services to businesses. It monetizes Stable Diffusion via DreamStudio’s APIs, while also releasing it as open source for anyone to download and use. Stability AI also makes money via enterprise services, where its core development team offers enterprise customers the chance to service, scale, and customize Stable Diffusion or other large generative models to their needs.

