What is PaLM 2?

PaLM 2 is a next-generation large language model (LLM). Developed by Google, the model features improved coding, reasoning, and multi-lingual capabilities.

Understanding PaLM 2

PaLM 2 is Google’s latest attempt to assert itself in the AI industry and develop a viable alternative to OpenAI’s GPT-4. PaLM 2 – short for Pathways Language Model 2 – was announced by CEO Sundar Pichai at the annual Google I/O developer conference in May 2023.

The model is available in four different sizes for a variety of use cases. The sizes, from smallest to largest, are Gecko, Otter, Bison, and Unicorn. Google notes in particular that “Gecko is so lightweight that it can work on mobile devices and is fast enough for great interactive applications on-device, even when offline.”

PaLM 2 also underpins ChatGPT competitor Google Bard and can be fine-tuned into smaller LLMs to support more specialized AI tools. Current examples include the medical diagnostic tool Med-PaLM 2 and security threat detector Sec-PaLM, but the company notes that PaLM 2 already powers over 25 Google products and features.

Some of these include YouTube and Workspace apps such as Gmail and Google Docs. However, such is PaLM 2’s prevalence that most users already interact with the model on a daily basis without realizing it.

The three pillars of PaLM 2

PaLM 2’s improvements over predecessor PaLM are centered around three pillars:

  1. Reasoning – PaLM 2 was trained on a broader dataset that includes websites with mathematical expressions and various scientific journals. Consequently, the model displays superior logic, common sense reasoning, and mathematical capabilities.
  2. Coding – PaLM 2 was also pre-trained on numerous source code datasets. It excels at languages such as JavaScript and Python and, impressively, can generate or debug specialized code in Verilog, Fortran, and Prolog, among others.
  3. Multilinguality – PaLM 2 was trained on text spanning more than 100 languages, which has improved its ability to translate nuanced text such as poems, riddles, and idioms. In addition to solving this complex problem, the model is also able to pass advanced language proficiency exams at a “mastery” level.
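To illustrate how developers might tap these capabilities, the sketch below queries a PaLM 2 model (the Bison size, exposed as text-bison) through Google’s Vertex AI Python SDK. It assumes the google-cloud-aiplatform package is installed and GCP credentials are configured; the build_translation_prompt helper is a hypothetical wrapper, not part of the SDK.

```python
# Sketch: calling a PaLM 2 model (Bison size) via the Vertex AI SDK.
# Assumes `google-cloud-aiplatform` is installed and Google Cloud
# credentials/project are already configured in the environment.

def build_translation_prompt(text: str, target_language: str) -> str:
    """Hypothetical helper that frames a nuanced-translation request."""
    return (
        f"Translate the following text into {target_language}, "
        f"preserving any idioms and tone:\n\n{text}"
    )

def translate_with_palm2(text: str, target_language: str) -> str:
    # Imported lazily so the prompt helper works without the SDK installed.
    from vertexai.language_models import TextGenerationModel

    model = TextGenerationModel.from_pretrained("text-bison")
    response = model.predict(
        build_translation_prompt(text, target_language),
        temperature=0.2,        # low temperature favors faithful translation
        max_output_tokens=256,
    )
    return response.text

if __name__ == "__main__":
    # Prints only the constructed prompt; the API call requires credentials.
    print(build_translation_prompt("Break a leg!", "French"))
```

The same pattern applies to the reasoning and coding pillars: only the prompt changes, while the model name selects which size (Gecko, Otter, Bison, or Unicorn) handles the request where available.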

PaLM 2 training parameters

Google has not explicitly stated how many parameters PaLM 2 contains, but it does explain that the model was trained on diverse sources incorporating “web documents, books, code, mathematics, and conversational data.”

Predecessor PaLM had 540 billion parameters, and since PaLM 2 is smaller, faster, and more efficient, one can assume the improved version incorporates fewer parameters.

In any case, the number of parameters also depends on the size of the model. TechCrunch reports that “one of the more capable PaLM 2 models was trained on 14.7 billion parameters,” while NLP researcher and coder Aman Sanger posited that the largest size (Unicorn) was likely trained on closer to 100 billion parameters.
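These parameter counts matter in practice because they roughly determine how much memory a model needs, which is why only the smallest sizes can run on-device. The back-of-the-envelope sketch below estimates the memory required just to hold the weights, assuming 2 bytes per parameter (16-bit weights); real deployments vary with quantization and runtime overhead.

```python
# Rough arithmetic: memory needed to store model weights alone,
# assuming 16-bit (2-byte) weights. Quantization and runtime
# overhead change the real figures considerably.

def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate gigabytes required to hold the raw weights."""
    return num_params * bytes_per_param / 1024**3

sizes = [
    ("PaLM (540B parameters)", 540e9),
    ("PaLM 2 'capable' model (14.7B, per TechCrunch)", 14.7e9),
    ("Unicorn estimate (100B, per Aman Sanger)", 100e9),
]

for name, params in sizes:
    print(f"{name}: ~{weight_memory_gb(params):.0f} GB at 16-bit")
```

Even under these optimistic assumptions, a 540-billion-parameter model needs on the order of a terabyte for its weights, while a ~15-billion-parameter model fits in tens of gigabytes, which helps explain Google’s emphasis on smaller, faster variants.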

Key takeaways:

  • PaLM 2 is a next-generation large language model (LLM). Developed by Google, the model features improved coding, reasoning, and multi-lingual capabilities.
  • PaLM 2 is available in four different sizes for a variety of use cases. These sizes, from smallest to largest, include Gecko, Otter, Bison, and Unicorn. PaLM 2 powers ChatGPT competitor Google Bard and can be fine-tuned into smaller LLMs to support more specialized AI tools.
  • Google has not explicitly stated how many parameters it used to train PaLM 2, but the smaller, faster, and more efficient successor to PaLM likely features around 15 billion parameters for smaller versions and 100 billion for the largest Unicorn size.

Key Highlights

  • Introduction to PaLM 2:
    • PaLM 2 stands for Pathways Language Model 2.
    • It is developed by Google and is a next-generation large language model (LLM).
    • PaLM 2 is designed with enhanced capabilities in coding, reasoning, and multilingual understanding.
  • PaLM 2’s Importance:
    • Google aims to establish PaLM 2 as a competitor to OpenAI’s GPT-4, asserting itself in the AI industry.
    • The model addresses the need for improved language understanding and generation across various applications.
  • PaLM 2 Versions:
    • PaLM 2 comes in four different sizes, each tailored for specific use cases: Gecko, Otter, Bison, and Unicorn.
    • The sizes range from lightweight and fast models suitable for mobile devices to larger, more capable models.
  • PaLM 2 Applications:
    • Google Bard, a competitor to ChatGPT, is built on the foundation of PaLM 2.
    • PaLM 2 can also be fine-tuned to create smaller specialized language models for specific AI tools.
    • It powers over 25 Google products and features, including YouTube, Workspace, Gmail, and more.
  • Three Pillars of Improvement:
    • Reasoning: PaLM 2’s training data includes scientific journals and mathematical expressions, leading to enhanced logic, reasoning, and mathematical capabilities.
    • Coding: PaLM 2 is pre-trained on source code datasets, excelling in languages like JavaScript and Python, and capable of generating and debugging specialized code.
    • Multilinguality: Trained on text from over 100 languages, PaLM 2 improves translation of nuanced text, handles poems, riddles, and idioms, and can achieve a high level of proficiency in multiple languages.
  • Training Parameters:
    • Google has not explicitly disclosed the number of parameters used to train PaLM 2.
    • It was trained on diverse sources including web documents, books, code, mathematics, and conversational data.
    • Smaller models likely have around 15 billion parameters, while larger models (like Unicorn) might have been trained on around 100 billion parameters.
  • Implications and Takeaways:
    • PaLM 2 represents Google’s advancement in language model technology.
    • It addresses the need for improved coding, reasoning, and multilingual capabilities.
    • Different model sizes cater to various use cases, and PaLM 2 powers multiple Google products.
