What Is Cognitive Load Theory? Cognitive Load Theory In A Nutshell

Cognitive load theory (CLT) argues that instructional design improves when it takes into account the role and limitations of working memory. The theory rests on the premise that the brain can only do so many things at once, so individuals should be selective about what they ask it to process.

Understanding cognitive load theory

Cognitive load theory was developed in the late 1980s by psychologist John Sweller, who argued that instructional design could be used to reduce cognitive load in students. 

The theory is based on two commonly accepted ideas:

  • There is a limit to how much new information the brain can process at any given time. New information is handled by working memory, which can only store a few pieces of information for a very short duration.
  • There are no limits to how much stored information the brain can process at any given time. Stored information is accessed from long-term memory, where it may be held semi-permanently.

If the working memory of a student is overloaded, there is a risk they will not understand the content being taught to them. With regular practice, however, learning can be facilitated as information is recalled from long-term memory with little conscious effort. Since this knowledge is accessed subconsciously, the working memory is freed up to learn something else.

Ultimately, the goal of cognitive load theory is to develop models of instruction that support the way the human brain learns.

The three types of cognitive load

The theory defines three types of cognitive load, where cognitive load refers to the amount of working memory resources being used.

Following is a look at each type:

  1. Intrinsic load – the inherent complexity of the material or skill, measured by the number of elements that need to be learned. When there are a large number of interacting elements, a novice learner experiences a high intrinsic load. Intrinsic load therefore depends on both the complexity of the learning material and the learner’s prior level of knowledge or understanding.
  2. Germane load – the load placed on working memory by the process of learning itself, in other words, by transferring information to long-term memory where it becomes knowledge. This process is facilitated by schemas, or frameworks organizing elements of information according to how they should be used. For example, a mathematics student will use the BODMAS mnemonic to help them remember the correct order for completing calculations. Crucially, schemas reduce working memory load because they are single elements of information representing complex or multi-faceted knowledge.
  3. Extraneous load – caused by cognitive activities that do not contribute to learning. In most cases, the information presented is poorly designed and may be confusing, unnecessary, or excessive. The teacher may also instruct in a way that is similarly complex.

Five principles for reducing cognitive load

In 2002, educational psychologist Richard E. Mayer built on Sweller’s research to create five principles for reducing cognitive load:

  1. The Coherence Principle – reduce the amount of information to only what is critical and relevant to learning. Simplicity and clarity should be favored over style, both in teaching materials and in the way instructions are given.
  2. The Signalling Principle – important written information should be highlighted in whatever way the teacher deems appropriate. Teachers should alter their pacing and intonation when teaching verbally and should avoid speaking in a monotone voice.
  3. The Redundancy Principle – teachers should avoid instructing by simply reading information aloud from a screen. Presenting the same words in two forms at once adds to cognitive load without adding meaning.
  4. Spatial Contiguity – to reduce cognitive load, it is also important to show related topics or items close to each other. If a diagram is included in the course content, the annotations should be included on the same page.
  5. Temporal Contiguity – similar to the fourth principle, but with time instead of proximity. Related concepts or items must be mentioned in quick succession. Hours or days should not elapse before a link is made between two related concepts. Spatial and temporal learning can be facilitated by using context, which links the information to a relatable student situation and reduces germane cognitive load.

Key takeaways:

  • Cognitive load theory is a theory of instructional design based on the role and limitations of working memory on learning.
  • Cognitive load theory describes three forms of cognitive load, which consume limited resources in working memory. These are intrinsic load, germane load, and extraneous load.
  • Five principles for reducing cognitive load were later added to the theory in 2002 by Richard E. Mayer. Among other things, teachers must favor a simple and clear instructional style and avoid reading off a screen.

Connected Thinking Frameworks

Convergent vs. Divergent Thinking

Convergent thinking occurs when the solution to a problem can be found by applying established rules and logical reasoning. Divergent thinking, by contrast, is an unstructured problem-solving method where participants are encouraged to develop many innovative ideas or solutions to a given problem. Convergent thinking might work for larger, mature organizations, whereas divergent thinking is more suited to startups and innovative companies.

Critical Thinking

Critical thinking involves analyzing observations, facts, evidence, and arguments to form a judgment about what someone reads, hears, says, or writes.

Systems Thinking

Systems thinking is a holistic means of investigating the factors and interactions that could contribute to a potential outcome. It is about thinking non-linearly, and understanding the second-order consequences of actions and input into the system.

Vertical Thinking

Vertical thinking is a problem-solving approach that favors a selective, analytical, structured, and sequential mindset. The focus of vertical thinking is to arrive at a reasoned, defined solution.

Maslow’s Hammer

Maslow’s Hammer, otherwise known as the law of the instrument or the Einstellung effect, is a cognitive bias causing an over-reliance on a familiar tool. This can be expressed as the tendency to overuse a known tool (perhaps a hammer) to solve issues that might require a different tool. The problem is persistent in the business world, where familiar tools or frameworks may be used in the wrong context (like business plans used as planning tools instead of only investors’ pitches).

Peter Principle

The Peter Principle was first described by Canadian sociologist Laurence J. Peter in his 1969 book The Peter Principle. The Peter Principle states that people are continually promoted within an organization until they reach their level of incompetence.

Straw Man Fallacy

The straw man fallacy describes an argument that misrepresents an opponent’s stance to make rebuttal more convenient. The straw man fallacy is a type of informal logical fallacy, meaning the flaw lies in the content of the argument and the way it is reasoned rather than in its logical form.

Streisand Effect

The Streisand Effect is a paradoxical phenomenon where the act of suppressing information to reduce visibility causes it to become more visible. In 2003, Barbra Streisand attempted to suppress aerial photographs of her Californian home by suing photographer Kenneth Adelman for an invasion of privacy. Adelman, who Streisand assumed was paparazzi, was instead taking photographs to document and study coastal erosion. In her quest for more privacy, Streisand’s efforts had the opposite effect.


Heuristics

As highlighted by German psychologist Gerd Gigerenzer in the paper “Heuristic Decision Making,” the term heuristic is of Greek origin, meaning “serving to find out or discover.” More precisely, a heuristic is a fast and accurate way to make decisions in the real world, which is driven by uncertainty.

Recognition Heuristic

The recognition heuristic is a psychological model of judgment and decision making. It is part of a suite of simple and economical heuristics proposed by psychologists Daniel Goldstein and Gerd Gigerenzer. The recognition heuristic argues that inferences are made about an object based on whether it is recognized or not.

Representativeness Heuristic

The representativeness heuristic was first described by psychologists Daniel Kahneman and Amos Tversky. The representativeness heuristic judges the probability of an event according to the degree to which that event resembles a broader class. For example, when the description of a person named John matches the stereotype we may hold for an archaeologist, most people asked to guess his occupation will choose that option.

Take-The-Best Heuristic

The take-the-best heuristic is a decision-making shortcut that helps an individual choose between several alternatives. The take-the-best (TTB) heuristic decides between two or more alternatives based on a single good attribute, otherwise known as a cue. In the process, less desirable attributes are ignored.
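The procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not a reference implementation: it handles the two-alternative case, treats each cue as a simple yes/no attribute, and assumes the cues are already ranked from most to least important. The function name `take_the_best`, the laptop options, and the cue names are all hypothetical.

```python
from typing import Mapping, Optional, Sequence, Tuple

# An option is a (name, cues) pair; cues map a cue name to True/False.
Option = Tuple[str, Mapping[str, bool]]


def take_the_best(first: Option, second: Option,
                  ranked_cues: Sequence[str]) -> Optional[str]:
    """Pick between two options using the first cue that tells them apart.

    ranked_cues is assumed to be ordered from most to least important.
    Returns the name of the chosen option, or None if no cue discriminates.
    """
    first_name, first_cues = first
    second_name, second_cues = second
    for cue in ranked_cues:
        a = first_cues.get(cue, False)   # a missing cue counts as "no"
        b = second_cues.get(cue, False)
        if a != b:
            # Stop at the first discriminating cue; all later cues are ignored.
            return first_name if a else second_name
    return None  # no cue separates the options


# Hypothetical example: both laptops have long battery life, so the
# decision falls to the next cue in the ranking.
laptop_a = ("Laptop A", {"long_battery": True, "lightweight": False, "cheap": True})
laptop_b = ("Laptop B", {"long_battery": True, "lightweight": True, "cheap": False})
ranking = ["long_battery", "lightweight", "cheap"]

choice = take_the_best(laptop_a, laptop_b, ranking)
print(choice)  # Laptop B
```

Note that Laptop A wins on price, but that cue is never consulted because the decision is already made at "lightweight". This is the sense in which less important attributes are ignored.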


Biases

The concept of cognitive biases was introduced and popularized by the work of Amos Tversky and Daniel Kahneman in 1972. Biases are seen as systematic errors and flaws that make humans deviate from the standards of rationality, thus making us inept at making good decisions under uncertainty.

Bundling Bias

The bundling bias is a cognitive bias in e-commerce where a consumer tends not to use all of the products bought as a group, or bundle. Bundling occurs when individual products or services are sold together as a bundle. Common examples are tickets and experiences. The bundling bias dictates that consumers are less likely to use each item in the bundle. This means that the value of the bundle and indeed the value of each item in the bundle is decreased.

Barnum Effect

The Barnum Effect is a cognitive bias where individuals believe that generic information – which applies to most people – is specifically tailored for themselves.

First-Principles Thinking

First-principles thinking – sometimes called reasoning from first principles – is used to reverse-engineer complex problems and encourage creativity. It involves breaking down problems into basic elements and reassembling them from the ground up. Elon Musk is among the strongest proponents of this way of thinking.

Ladder Of Inference

The ladder of inference is a conscious or subconscious thinking process where an individual moves from a fact to a decision or action. The ladder of inference was created by academic Chris Argyris to illustrate how people form and then use mental models to make decisions.

Six Thinking Hats Model

The Six Thinking Hats model was created by psychologist Edward de Bono in 1986, who noted that personality type was a key driver of how people approached problem-solving. For example, optimists view situations differently from pessimists. Analytical individuals may generate ideas that a more emotional person would not, and vice versa.

Second-Order Thinking

Second-order thinking is a means of assessing the implications of our decisions by considering future consequences. Second-order thinking is a mental model that considers all future possibilities. It encourages individuals to think outside of the box so that they can prepare for every eventuality. It also discourages the tendency for individuals to default to the most obvious choice.

Lateral Thinking

Lateral thinking is a business strategy that involves approaching a problem from a different direction. The strategy attempts to remove traditionally formulaic and routine approaches to problem-solving by advocating creative thinking, therefore finding unconventional ways to solve a known problem. This sort of non-linear approach to problem-solving can, at times, create a big impact.

Bounded Rationality

Bounded rationality is a concept attributed to Herbert Simon, an economist and political scientist interested in how we make decisions in the real world. He believed that rather than optimizing (which was the mainstream view in previous decades), humans follow what he called satisficing.

Dunning-Kruger Effect

The Dunning-Kruger effect describes a cognitive bias where people with low ability in a task overestimate their ability to perform that task well. Consumers or businesses that do not possess the requisite knowledge make bad decisions. What’s more, knowledge gaps prevent the person or business from seeing their mistakes.

Occam’s Razor

Occam’s Razor states that one should not increase (beyond reason) the number of entities required to explain anything. All things being equal, the simplest solution is often the best one. The principle is attributed to 14th-century English theologian William of Ockham.

Mandela Effect

The Mandela effect is a phenomenon where a large group of people remembers an event differently from how it occurred. The Mandela effect was first described by Fiona Broome, who believed that former South African President Nelson Mandela died in prison during the 1980s. While Mandela was released from prison in 1990 and died 23 years later, Broome remembered news coverage of his death in prison and even a speech from his widow. Of course, neither event occurred in reality. But Broome was later to discover that she was not the only one with the same recollection of events.

Crowding-Out Effect

The crowding-out effect occurs when public sector spending reduces spending in the private sector.

Bandwagon Effect

The bandwagon effect tells us that the more people within a group adopt a belief or idea, the more likely other individuals in that group are to adopt it too. This is the psychological effect that leads to herd mentality. In marketing, it can be associated with social proof.
