About Intelligence
Looked at in one way, everyone knows what intelligence is; looked at in another way, no one does.
Robert J. Sternberg, 2000
For centuries, intelligence has been a concept both revered and elusive, often quantified by metrics like IQ tests and cognitive ability scores. These measures have long served as the standard for determining one’s capacity for learning, reasoning, and problem-solving. However, as our understanding of cognition evolves, so too must our definitions of intelligence. In this light, François Chollet, a prominent figure in the field of artificial intelligence, offers a fresh perspective in his paper “On the Measure of Intelligence.” Chollet challenges traditional views and proposes that true intelligence lies not merely in solving problems but in the ability to generalize and adapt to new, unforeseen challenges efficiently.
It seems to us that in intelligence there is a fundamental faculty, the alteration or the lack of which, is of the utmost importance for practical life. This faculty is […] the faculty of adapting one’s self to circumstances.
Alfred Binet, 1916
Chollet’s Core Ideas
At the heart of Chollet’s argument is the idea that intelligence is primarily about generalization ability. According to Chollet, intelligence should not be measured by how well an individual or machine can solve a problem it has seen before, but rather by how well it can navigate and solve new problems based on limited prior knowledge. This perspective shifts the focus from rote memorization and specific skill acquisition to the broader and more adaptable cognitive process of generalization.
In addition to generalization, Chollet emphasizes the importance of efficiency in intelligence. Efficiency, in this context, refers to the ability to achieve goals with minimal resources — be it time, energy, or information. This contrasts with the brute-force methods often used in artificial intelligence today, where success is achieved through overwhelming amounts of data and computational power. Chollet argues that true intelligence, whether human or artificial, is marked by the ability to learn and adapt quickly and efficiently, using minimal resources to generate maximal outcomes.
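Chollet actually formalizes this idea as skill-acquisition efficiency, but the intuition can be sketched very loosely in code: compare two learners by how many training examples each needs before reaching a target skill level on data it has never seen. The snippet below is my own toy illustration, not Chollet’s formal measure; the synthetic dataset, the two models, and the accuracy threshold are arbitrary placeholders.

```python
# Illustrative only: compare two learners by how much experience each needs
# to reach a fixed skill level (accuracy) on held-out examples.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in task; here the "skill" is simply test accuracy.
X, y = make_classification(n_samples=5000, n_features=20, n_informative=5,
                           class_sep=2.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

def examples_needed(make_model, threshold=0.9):
    """Smallest training-set size at which the learner reaches `threshold`
    accuracy on held-out data (None if it never does)."""
    for n in (50, 100, 250, 500, 1000, 2000, len(X_train)):
        model = make_model()
        model.fit(X_train[:n], y_train[:n])
        if model.score(X_test, y_test) >= threshold:
            return n
    return None

# The learner that reaches the same skill with less experience is, in this
# narrow sense, the more "efficient" one.
print("logistic regression:", examples_needed(LogisticRegression))
print("decision tree:      ", examples_needed(DecisionTreeClassifier))
```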
Comparison with Traditional Views & Levels of Adaptability
Chollet’s definition of intelligence contrasts sharply with traditional views that focus on specific cognitive abilities, such as those measured by IQ tests. IQ tests, for instance, evaluate intelligence based on a narrow set of skills, such as verbal reasoning, mathematical ability, and pattern recognition. While these are important aspects of cognition, they do not fully capture the breadth of what it means to be intelligent.
Traditional measures also tend to overlook the importance of context and adaptability. An individual may excel in a controlled testing environment but struggle to apply that knowledge in real-world situations. Chollet’s framework addresses this gap by prioritizing the ability to generalize from past experiences to novel scenarios. In doing so, it offers a more holistic view of intelligence, one that encompasses not just what we know but how we use that knowledge in unfamiliar contexts.
To label and measure this view of intelligence, Chollet introduces the notion of “degrees of generalization.”
A system’s degree of generalization describes how far it can adapt beyond the situations it was explicitly designed or trained for. Roughly speaking, the more degrees of freedom a system has (the more variables and choices it can cope with in a given situation), the more flexible and adaptable it will be when facing new challenges.
Chollet distinguishes four degrees of generalization:
1. Absence of Generalization
Absence of generalization reflects a lack of degrees of freedom — such systems operate within strict boundaries, unable to adapt or apply their “knowledge” to new, unknown situations. In contrast, a system with higher degrees of freedom has the capacity to generalize effectively, leveraging its adaptability to handle uncertainty and novelty.
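As a toy illustration of this lowest degree (my own example, not one from the paper), think of a hard-coded lookup table: it answers perfectly for the exact inputs it was built around and has nothing to offer for anything else.

```python
# A system with no generalization: a hard-coded lookup table.
# It handles exactly the cases its designer anticipated and nothing more.
KNOWN_ANSWERS = {
    "2+2": "4",
    "capital of France": "Paris",
}

def rigid_system(query: str) -> str:
    # No degrees of freedom: an unseen query cannot be adapted to.
    return KNOWN_ANSWERS.get(query, "I cannot handle this input.")

print(rigid_system("2+2"))   # -> "4"
print(rigid_system("3+3"))   # -> "I cannot handle this input."
```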
2. Local Generalization or “Robustness”
Local Generalization refers to a system’s ability to handle new instances that fall within a known distribution for a specific task or a narrowly defined set of tasks. This level of generalization involves adapting “known unknowns” — situations that are unfamiliar but fall within the expected range of variations.
For example, consider an image classifier trained to distinguish between cats and dogs. If the classifier is capable of accurately identifying new 150x150 RGB images it has never seen before, as long as they come from the same distribution of images it was trained on, it demonstrates local generalization. This adaptation to anticipated variations within a fixed context is a foundational aspect of machine learning, and has been a central focus of the field since the 1950s.
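To make the cats-and-dogs example a bit more concrete, here is a rough sketch of what such a classifier might look like in Keras. The directory path and hyperparameters are placeholders; the point is simply that, once trained, the model is expected to handle new 150x150 images well only as long as they come from the same distribution it was trained on.

```python
# Sketch of a cats-vs-dogs classifier that exhibits *local* generalization:
# it handles unseen images, but only ones drawn from the training distribution.
# The directory path and hyperparameters are illustrative placeholders.
import tensorflow as tf
from tensorflow.keras import layers

train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/cats_vs_dogs/train",   # hypothetical folder of labeled images
    image_size=(150, 150),
    batch_size=32,
)

model = tf.keras.Sequential([
    layers.Rescaling(1.0 / 255, input_shape=(150, 150, 3)),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # cat vs. dog
])

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(train_ds, epochs=5)

# New 150x150 RGB images from the *same* distribution will be classified well;
# images from a very different distribution (sketches, X-rays, ...) may not be.
```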
While local generalization is an important step in the broader spectrum of generalization, it is somewhat limited. It allows a system to adapt within a well-defined scope but does not extend to more complex or entirely new tasks. In the realm of all creatures, from simple organisms to humans, local generalization reflects an essential but basic level of intelligence. More advanced degrees of generalization involve adapting to entirely new environments or challenges, far beyond the known distributions of experiences.
3. Broad Generalization or “Flexibility”
Broad generalization represents a more advanced and comprehensive form of adaptability. This type of generalization is the ability of a system to handle a wide variety of tasks and environments, even those that were not anticipated by the system’s creators. Broad generalization reflects the capability to adapt to “unknown unknowns” — situations that are completely novel and beyond the scope of the system’s training or design.
Let’s use a couple of examples to illustrate this degree of generalization.
Imagine a Level 5 (L5) self-driving vehicle, or a domestic robot designed to walk into any kitchen chosen at random and make a cup of coffee (passing Steve Wozniak’s “coffee test”). To demonstrate broad generalization, the self-driving car must be able to navigate not just well-mapped city streets, but also handle unexpected situations such as unfamiliar road layouts, sudden weather changes, or unpredictable pedestrian behavior — scenarios that the system’s creators may not have specifically programmed for. Similarly, a domestic robot capable of broad generalization would need to adapt to various kitchen layouts, different types of coffee machines, and potentially even language differences in the household.
“As Long as They Come from the Same Distribution” vs. Broad Generalization:
- In local generalization (as explained earlier), the system operates effectively within a “known distribution,” meaning it can handle new instances that are similar to the examples it was trained on. However, this is limited to familiar situations within a narrow context.
- In broad generalization, the system is not confined to a specific distribution of tasks or environments. It must adapt to entirely new, unforeseen challenges without requiring additional training or human intervention. The system should be capable of generalizing its knowledge and skills across a broad range of tasks, environments, and even unexpected scenarios. This level of generalization is much closer to human-like flexibility in problem-solving and adaptability.
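The contrast above can be made concrete with a small, deliberately artificial sketch: train a simple model, test it on new samples from the same distribution, then test it again after shifting that distribution. The accuracy drop marks exactly where local generalization ends; the feature shift used here is just a crude stand-in for real-world novelty.

```python
# Illustrative sketch: a model that generalizes locally (same distribution)
# degrades when the test data drifts away from the training distribution.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# In-distribution: new samples from the same data-generating process.
print("in-distribution accuracy:     ", model.score(X_test, y_test))

# Crude stand-in for a distribution shift: translate every feature.
X_shifted = X_test + 3.0
print("shifted-distribution accuracy:", model.score(X_shifted, y_test))
```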
4. Extreme Generalization
Extreme generalization represents the highest level of adaptability and intelligence, far beyond what current artificial systems can achieve. This form of generalization involves handling entirely new tasks and challenges that share only abstract or distant similarities with previously encountered situations. It’s the ability to apply knowledge and skills across an almost unlimited range of tasks and domains, adapting to “unknown unknowns” in ways that are not confined by prior experience or specific training.
Let’s break down this concept with an example:
Imagine a scenario where a human is faced with a task they have never encountered before, such as learning a completely new skill like playing a musical instrument they’ve never seen, or surviving in an unfamiliar environment with entirely new challenges. Despite the novelty of these situations, a human can draw on their vast array of experiences, abstract reasoning, and creativity to devise solutions. They can recognize patterns, draw analogies to similar but not identical past experiences, and apply their understanding in a highly flexible way. This ability to transfer knowledge across vastly different domains and tasks is what defines extreme generalization.
Human-Centric Extreme Generalization:
Within the scope of human experience, extreme generalization can be referred to as “generality.” This type of generalization allows humans to tackle novel challenges and situations that previous generations may never have encountered, such as navigating modern technology or solving unprecedented global problems. This adaptability is a key aspect of human cognition, allowing us to innovate, explore, and thrive in a constantly changing world.
Extreme generalization is the pinnacle of intelligence, representing the ability to adapt to any new task or domain, no matter how unfamiliar, by leveraging abstract reasoning and highly flexible cognitive processes. It’s a defining characteristic of human intelligence, allowing us to navigate a vast and unpredictable world in ways that no current AI system can replicate. This type of generalization is not just about handling new tasks within a known framework, but about mastering completely unknown challenges across any possible domain.
As of now, no AI systems possess true broad generalization in the way it’s defined by François Chollet and others in the field of artificial intelligence. Broad generalization, which involves the ability to handle a wide range of tasks and environments without human intervention and adapt to unforeseen situations, remains a goal that current AI systems have yet to achieve.
Current State of AI and Generalization:
- Narrow AI (or Weak AI): Most AI systems today are examples of narrow AI, meaning they are designed and trained to perform specific tasks within a limited domain. These systems, such as image classifiers, language models, and recommendation systems, can demonstrate local generalization by handling new instances within a familiar context. However, they struggle with tasks or environments outside their training data and predefined scope.
- Multi-Task Learning: Some AI models can perform multiple tasks within related domains, such as models that can handle both image recognition and text generation. However, even these systems do not exhibit broad generalization because they still rely on pre-defined tasks and data distributions. In my opinion, these methods are nothing more than a ‘brute force’ approach or “hallucinated broad generalization.” They can’t seamlessly transition to completely new and unforeseen tasks without additional training.
- Transfer Learning: Transfer learning allows AI models to apply knowledge gained from one task to a different but related task (see the sketch after this list). While this is a step towards broader generalization, it is still limited to scenarios where there is some overlap or similarity between tasks. The ability to adapt to entirely new domains or environments without retraining remains elusive.
- Reinforcement Learning in Complex Environments: AI systems that use reinforcement learning, such as those trained to play video games or navigate simulations, can exhibit some degree of adaptability. However, they are typically effective only within the specific environments they were trained in. Extending their capabilities to real-world scenarios with unpredictable challenges is still a major hurdle.
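To ground the transfer-learning point above, here is a minimal Keras sketch of the standard pattern: reuse a network pretrained on ImageNet as a frozen feature extractor and train only a small new head for a related image task. Even in this setup, the new task still needs its own labeled data and training; the dataset referenced in the final comment is hypothetical.

```python
# Transfer learning sketch: reuse features learned on ImageNet for a new,
# related image task. Knowledge transfers only because the domains overlap;
# the new task still needs labeled data and training of its own.
import tensorflow as tf
from tensorflow.keras import layers

base = tf.keras.applications.MobileNetV2(
    input_shape=(150, 150, 3),
    include_top=False,      # drop the original ImageNet classifier head
    weights="imagenet",
)
base.trainable = False      # freeze the pretrained features

model = tf.keras.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(1, activation="sigmoid"),  # new head for the new binary task
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# model.fit(new_task_dataset, epochs=5)   # hypothetical dataset for the new task
```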
Implications for Artificial Intelligence
Chollet’s ideas have profound implications for the development of artificial intelligence. Much of the AI that dominates today’s landscape relies on deep learning models, which achieve impressive results by processing vast amounts of data. These models excel at the tasks they are trained for but often fail when confronted with new problems outside their training set. This limitation is a direct result of their reliance on brute-force learning from massive datasets rather than on the ability to generalize efficiently.
I certainly believe that, by adopting Chollet’s perspective, AI research could shift towards developing systems that mimic human-like adaptability and efficiency. Instead of focusing on models that require massive datasets and computational power, the goal would be to create AI that can learn and adapt from limited information, much like a human does. This approach could lead to more robust and versatile AI systems capable of operating effectively in dynamic and unpredictable environments.
Human Intelligence: A Broader Perspective
While Chollet’s framework is rooted in artificial intelligence, it also offers valuable insights into human cognition. If we view intelligence through the lens of generalization and efficiency, we can begin to rethink how we approach education, problem-solving, and creativity. For instance, education systems could shift from emphasizing memorization and standardized testing to fostering critical thinking and adaptability. Problem-solving could focus more on innovative thinking and less on following established procedures. Creativity, often seen as separate from intelligence, could be redefined as a core component of intelligent behavior.
This broader perspective on intelligence also has implications for how we assess and value different types of intelligence. Chollet’s framework encourages us to recognize and nurture diverse cognitive abilities, understanding that intelligence is not a one-size-fits-all trait but a multifaceted and dynamic process.
Conclusion: A New Measure for a New Age
As our world becomes increasingly complex and interconnected, our understanding of intelligence must evolve. François Chollet’s “On the Measure of Intelligence” offers a timely and thought-provoking redefinition of what it means to be intelligent. By focusing on generalization and efficiency, Chollet provides a framework that is not only more aligned with the realities of both human and artificial cognition but also more adaptable to the challenges of the future.
In embracing this new measure of intelligence, we can open the door to advancements in artificial intelligence, education, and our broader understanding of human potential. Intelligence, in this new age, is not just about what we know or how quickly we can solve a problem — it’s about how we adapt, how we learn from the unknown, and how efficiently we can navigate the complexities of an ever-changing world.
References:
Chollet, F. (2019). On the Measure of Intelligence. arXiv preprint arXiv:1911.01547. Available at: https://arxiv.org/abs/1911.01547