Model-Based and Model-Free
When studying reinforcement learning algorithms, the
terms model-based and model-free come up frequently. Simply put,
model-based means having full knowledge of the environment. Here the
environment is everything the agent interacts with as the MDP unfolds, and it
is described by the elements of the MDP: the states, actions, state transition
probabilities, rewards, and discount factor. If all of these are known
explicitly, as in the diagrams examined earlier, the setting is model-based.
Conversely, if the environment is a black box that simply returns the next
state and reward for a given state and action, the setting is model-free.
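To make the distinction concrete, here is a minimal sketch in Python, assuming a toy two-state MDP invented for illustration (the names P, BlackBoxEnv, and step are hypothetical, not from any particular library). The model-based view exposes the full transition table; the model-free view hides the same dynamics behind a step() call.

```python
import random

# Model-based view: the dynamics are known explicitly.
# P[(state, action)] is a list of (probability, next_state, reward) triples.
P = {
    ("s0", "a0"): [(0.7, "s1", 1.0), (0.3, "s0", 0.0)],
    ("s0", "a1"): [(1.0, "s0", 0.0)],
    ("s1", "a0"): [(1.0, "s1", 0.0)],
    ("s1", "a1"): [(0.4, "s0", 5.0), (0.6, "s1", 1.0)],
}

# Model-free view: the same dynamics hidden inside a black box.
# The agent only observes the next state and reward returned by step().
class BlackBoxEnv:
    def __init__(self):
        self._state = "s0"

    def step(self, action):
        outcomes = P[(self._state, action)]   # internal, not visible to the agent
        weights = [prob for prob, _, _ in outcomes]
        _, next_state, reward = random.choices(outcomes, weights=weights)[0]
        self._state = next_state
        return next_state, reward
```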
The major practical difference between model-based and
model-free reinforcement learning is whether the agent can predict the next
state. In the example above, each state is connected by arrows to the states
it can reach at the next time step, so the possible successors can be read off
the diagram directly, with no complicated algorithm required. This is
model-based reinforcement learning. In a model-free setting, the agent cannot
predict in advance which state it will move to at the next time step, so it
must learn from sampled experience using more elaborate algorithms. Most of
the problems reinforcement learning is used to solve are model-free.
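Continuing the sketch above (same assumed P and BlackBoxEnv), the snippet below illustrates why this matters: with the model, an expected one-step reward can be computed exactly by enumerating the known transitions, whereas without the model the agent can only estimate the same quantity by repeatedly sampling the black box.

```python
# With the model: enumerate the known transitions and compute the expectation.
def expected_reward(state, action):
    return sum(prob * reward for prob, _, reward in P[(state, action)])

# Without the model: estimate the same quantity from sampled experience.
def sampled_reward(action, episodes=1000):
    total = 0.0
    for _ in range(episodes):
        env = BlackBoxEnv()              # fresh black box, always starts in "s0"
        _, reward = env.step(action)
        total += reward
    return total / episodes

print(expected_reward("s0", "a0"))       # exact: 0.7 * 1.0 + 0.3 * 0.0 = 0.7
print(sampled_reward("a0"))              # approaches 0.7 as episodes grows
```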