First draft: 2022-12-18

Introduction

I recognize that the title sounds catchy, but catchiness is not the point of this post. It is meant as a serious list of directions for future research (you can also just treat it as an overview).

This post includes what I think might be a somewhat comprehensive recipe for building intelligent systems, although some important ingredients are probably still missing. I will update the list whenever I notice that an important concept is not yet included.

Definition

Before we get to the list, let’s start by defining a measure for intelligence.

For that, I think the definition from Shane Legg and Marcus Hutter is appropriate, because it makes general capability central:

\[\begin{align*} \Upsilon(\pi) &\doteq \sum_{\mu \in E} 2^{-K(\mu)} \, V_\mu^\pi \\ &= \sum_{\mu \in E} 2^{-K(\mu)} \, \frac{1}{\Gamma} \, \mathbb{E}\!\left[ \sum_{i=1}^{\infty} \gamma^i r_i \right] \\ &= \sum_{\mu \in E} 2^{-K(\mu)} \, \frac{1}{\sum_{i=1}^{\infty} \gamma^i} \, \mathbb{E}\!\left[ \sum_{i=1}^{\infty} \gamma^i r_i \right] \end{align*}\]

where $\Upsilon(\pi)$ measures the universal intelligence of an agent with policy $\pi$. It sums the agent's performance $V_\mu^\pi$ (the expected discounted reward, i.e. the value of the starting state) over all environments $\mu \in E$, with the weighting factor $2^{-K(\mu)}$ giving more weight to performance in simpler environments, i.e. those with low Kolmogorov complexity $K(\mu)$.
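To make the definition concrete, here is a toy numeric sketch (my own illustration, not from the paper; the complexities and values are made up). A handful of environments with assumed $K(\mu)$ and already-normalized values $V_\mu^\pi$ are combined exactly as in the formula:

```python
import numpy as np

# Toy illustration of the Legg-Hutter measure (all numbers are made up).
# Each environment mu gets an assumed Kolmogorov complexity K(mu) and a
# normalized value V_mu^pi in [0, 1] achieved by some fixed policy pi.
K = np.array([3, 5, 9, 14])          # complexity of each environment (bits)
V = np.array([0.9, 0.7, 0.4, 0.2])   # normalized value achieved by the policy

weights = 2.0 ** -K                  # simpler environments weigh more
upsilon = np.sum(weights * V)        # Upsilon(pi)
print(f"Universal intelligence (toy): {upsilon:.4f}")
```

Note how the weighting means that a policy doing well in the two simple environments already dominates the score; performance in the complex environments barely moves it.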

The list

| Ingredient | Purpose | Implementation candidates |
| --- | --- | --- |
| Learning (the most important one!) | Acquiring new knowledge by updating your beliefs when your experience deviates from your expectation. | Reinforcement Learning, Unsupervised Learning, Supervised Learning |
| Curiosity | Efficient exploration. | e.g. feature-space curiosity (sketched below) |
| Dreaming | Recalling past experiences to consolidate them into long-term memory and to learn faster. | (Prioritized) Experience Replay (sketched below), Forward-Forward Algorithm |
| World models & planning | A world model lets the agent gather experience in an imagined world; if the model is good, learning becomes more sample efficient because fewer interactions with the real environment are needed. Planning means simulating ahead how candidate trajectories will play out and using that information to select better actions, which is only possible with a model of the world to look into. | Model-based RL (sketched below) |
| Function approximation | Compressing knowledge to generalize concepts (and, as a side effect, converting different modalities into thoughts and back into possibly other modalities). | (Deep) Neural Networks |
| Attention | Focusing on some parts of the input more than on others to make better predictions. | Transformers (sketched below) |
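For the curiosity ingredient, here is a minimal sketch of a feature-space curiosity bonus. It is my own simplification of the intrinsic-curiosity idea (see the Pathak et al. link below): the intrinsic reward is the forward-model prediction error in a feature space. The encoder and forward model below are fixed random stand-ins for what would normally be learned networks:

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed random projection as a stand-in for a learned feature encoder.
ENC = rng.normal(size=(16, 4))        # maps 4-d observations to 16-d features

def encode(obs):
    return np.tanh(ENC @ obs)

# Hypothetical linear forward model: predicts next features from current
# features and a one-hot action (these weights would normally be learned).
W = rng.normal(scale=0.1, size=(16, 16 + 3))

def curiosity_bonus(obs, action_onehot, next_obs):
    phi, phi_next = encode(obs), encode(next_obs)
    pred = W @ np.concatenate([phi, action_onehot])
    # Intrinsic reward = prediction error: high where the forward model
    # is still wrong, i.e. where the agent should explore.
    return float(np.mean((pred - phi_next) ** 2))

obs, nxt = rng.normal(size=4), rng.normal(size=4)
print(curiosity_bonus(obs, np.array([1.0, 0.0, 0.0]), nxt))
```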
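For the dreaming ingredient, a minimal sketch of prioritized experience replay: transitions are sampled in proportion to their (assumed known) TD error, so surprising experiences are "dreamed about" more often. This is the proportional variant stripped down for illustration, without the sum-tree and importance-sampling weights of the full algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

class PrioritizedReplay:
    """Toy proportional prioritized replay (no sum-tree, no IS weights)."""

    def __init__(self, alpha=0.6):
        self.alpha = alpha                     # how strongly to prioritize
        self.transitions, self.priorities = [], []

    def add(self, transition, td_error):
        self.transitions.append(transition)
        self.priorities.append((abs(td_error) + 1e-6) ** self.alpha)

    def sample(self, batch_size):
        p = np.array(self.priorities)
        p /= p.sum()                           # surprising transitions replay more often
        idx = rng.choice(len(self.transitions), size=batch_size, p=p)
        return [self.transitions[i] for i in idx]

buffer = PrioritizedReplay()
for i in range(100):
    buffer.add(("obs", i), td_error=rng.normal())
print(buffer.sample(4))
```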
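For world models & planning, a minimal sketch of planning by random shooting: roll several random action sequences through a world model (here an assumed-known toy model of a 1-d point that should reach the origin), then execute the first action of the best imagined trajectory and replan:

```python
import numpy as np

rng = np.random.default_rng(0)

def model(state, action):
    """Hypothetical known world model: a 1-d point nudged by the action."""
    next_state = state + 0.1 * action
    reward = -abs(next_state)          # closer to the origin is better
    return next_state, reward

def plan(state, horizon=10, n_candidates=64):
    best_return, best_first_action = -np.inf, 0.0
    for _ in range(n_candidates):
        actions = rng.uniform(-1, 1, size=horizon)
        s, total = state, 0.0
        for a in actions:              # imagine the trajectory in the model
            s, r = model(s, a)
            total += r
        if total > best_return:
            best_return, best_first_action = total, actions[0]
    return best_first_action           # act, then replan (MPC-style)

print(plan(state=2.0))
```

Methods like DreamerV3 (linked below) replace the hand-written model with a learned one and the random shooting with a learned policy, but the idea of acting on imagined trajectories is the same.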
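And for attention, the scaled dot-product attention at the core of Transformers fits in a few lines of numpy:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)      # how much each query attends to each key
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over keys
    return weights @ V                 # weighted mix of the values

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(5, 8)) for _ in range(3))
print(attention(Q, K, V).shape)        # (5, 8)
```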

TODO

  • add pointers to research papers for each ingredient

  • Curiosity: https://pathak22.github.io/noreward-rl/
  • DreamerV3 (Model-based): https://arxiv.org/abs/2301.04104v1

References

  1. Shane Legg and Marcus Hutter - Universal Intelligence: A Definition of Machine Intelligence
  2. Wikipedia: Kolmogorov complexity
