Ingredients of Intelligence
First draft: 2022-12-18
Introduction
The title may sound catchy, but being catchy is not the intention of this post. It is meant as a serious list of directions for future research (you can also just treat it as an overview).
This post contains what I think is a somewhat comprehensive recipe for building intelligent systems, but some important ingredients are probably still missing. I will update the list whenever I notice that an important concept is not included.
Definition
Before we get to the list, let’s start by defining a measure for intelligence.
For that, I think the definition from Shane Legg and Marcus Hutter is appropriate, because it makes general capability central:
\[\begin{align*} \Upsilon(\pi) &\dot{=} \sum_{\mu \in E} 2^{-K(\mu)} V_\mu^\pi \\ &= \sum_{\mu \in E} 2^{-K(\mu)} \frac{1}{\Gamma} \mathbb{E}\left[ \sum_{i=1}^{\infty} \gamma^i r_i \right] \\ &= \sum_{\mu \in E} 2^{-K(\mu)} \frac{1}{\sum_{i=1}^{\infty} \gamma^i} \mathbb{E}\left[ \sum_{i=1}^{\infty} \gamma^i r_i \right] \end{align*}\]where $\Upsilon(\pi)$ measures the universal intelligence of an agent with policy $\pi$. It is the sum of the agent's performance $V_\mu^\pi$ (the expected discounted return, i.e. the value of the starting state) over environments $\mu \in E$, weighted by the factor $2^{-K(\mu)}$, which gives more weight to simpler environments (those with low Kolmogorov complexity $K(\mu)$). The normalization constant $\Gamma = \sum_{i=1}^{\infty} \gamma^i$ keeps each value bounded, assuming bounded rewards $r_i$.
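As a small illustration of how this weighted sum behaves, here is a toy numerical sketch in Python. The Kolmogorov complexity $K(\mu)$ is incomputable in general, so the complexities and values below are made-up numbers, chosen only to show how the $2^{-K(\mu)}$ weighting lets simple environments dominate the measure:

```python
# Toy illustration of the Legg-Hutter measure: Upsilon(pi) = sum_mu 2^(-K(mu)) * V_mu^pi.
# K(mu) cannot actually be computed; the complexities (in bits) and the normalized
# values below are invented purely for illustration.
environments = [
    # (name, assumed Kolmogorov complexity K(mu), estimated value V_mu^pi in [0, 1])
    ("coin-flip",   5, 0.50),
    ("grid-world", 20, 0.90),
    ("atari-like", 60, 0.70),
]

upsilon = sum(2 ** (-k) * v for _, k, v in environments)
print(f"Upsilon(pi) ~ {upsilon:.6f}")  # dominated by the simplest environment
```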
The list
| Ingredient | Purpose | Implementation Candidates |
| --- | --- | --- |
| Learning (the most important one!) | Acquiring new knowledge by updating your beliefs when your experience deviates from your expectation. | Reinforcement Learning, Unsupervised Learning, Supervised Learning |
| Curiosity | Efficient exploration. | e.g. Feature-Space Curiosity |
| Dreaming | Replaying past experiences to consolidate them into long-term memory and to learn faster. | (Prioritized) Experience Replay, Forward-Forward Algorithm |
| World models & planning | A world model enables experiences in an imaginary world; if the model is good, learning becomes sample efficient because we no longer need to interact with the real environment as much. Planning means thinking ahead about how trajectories will play out and using this information to select better actions; it is only possible with a model of the world to look ahead with. | Model-based RL |
| Function approximation | Compressing knowledge to generalize concepts (as a side effect, also converting different modalities into thoughts and back into possibly other modalities). | (Deep) Neural Networks |
| Attention | Focusing on some parts of the input data more than on others to make better predictions (see the sketch below the table). | Transformers |
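To make the attention row a bit more concrete, here is a minimal NumPy sketch of scaled dot-product attention, the core operation inside Transformers. It leaves out the learned query/key/value projections, masking, and multi-head machinery, and the toy input is random data:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # similarity between queries and keys
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over the keys
    return weights @ V                                # weighted sum of the values

# Tiny example: 3 tokens with 4-dimensional embeddings (random toy data).
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
print(scaled_dot_product_attention(x, x, x))
```

Using the same matrix for queries, keys, and values corresponds to plain self-attention without any learned parameters.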
TODO
- add pointers to research papers for each ingredient
- Curiosity: https://pathak22.github.io/noreward-rl/
- DreamerV3 (Model-based): https://arxiv.org/abs/2301.04104v1
References
- Shane Legg and Marcus Hutter - Universal Intelligence: A Definition of Machine Intelligence
- Wiki: Kolmogorov complexity