RL

Gradient

A gradient represents the slope of a function in a multidimensional space. Earlier, we used partial differentiation with respect to 'x' to see how the slope changes along the x-axis on a three-dimensional graph while keeping 'y' constant. A gradient, however, invo…
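As a rough illustration of the idea above, the sketch below estimates a gradient numerically by taking a partial derivative along each axis in turn while holding the other variables constant. The function `bowl`, the helper name `numerical_gradient`, and the step size `h` are assumptions chosen for this example and do not come from the original post.

```python
def numerical_gradient(f, point, h=1e-5):
    """Estimate the gradient of f at `point` with central differences.

    Each component is a partial derivative: one coordinate is nudged
    by +h and -h while every other coordinate stays fixed.
    """
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        grad.append((f(*forward) - f(*backward)) / (2 * h))
    return grad


def bowl(x, y):
    # Hypothetical example surface: f(x, y) = x^2 + y^2
    return x ** 2 + y ** 2


# Analytically the gradient is (2x, 2y), so at (1, 2) it is close to (2, 4).
grad = numerical_gradient(bowl, [1.0, 2.0])
print(grad)
```

The returned list has one entry per input variable, which is what distinguishes a gradient from a single partial derivative.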

멀티코어

Scalar and Vector

Let’s first look at the concepts of scalars and vectors. Scalars are quantities with only magnitude and no direction, like mass, exam scores, or height. Vectors, on the other hand, have both magnitude and direction, like magnetic force, velocity, …


Partial Derivative

A partial derivative is a type of differentiation. The function we looked at earlier had one variable (x). A partial derivative, on the other hand, applies when a function has two or more variables. For instance, if a function is f(x, y), where th…


Derivatives (Differentiation)

A derivative gives the rate of change of a function at a given point. Before discussing derivatives, let’s first look at what a rate of change is. The rate of change describes how much a quantity changes, and includes average rates of change and insta…
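To make the distinction between the two kinds of rate of change concrete, here is a minimal sketch. The function `square` and both helper names are assumptions made for illustration; the instantaneous rate is only approximated, using a small central difference rather than an exact limit.

```python
def average_rate_of_change(f, a, b):
    """Change in f divided by change in x over the interval [a, b]."""
    return (f(b) - f(a)) / (b - a)


def instantaneous_rate_of_change(f, x, h=1e-6):
    """Approximate the derivative of f at x with a central difference."""
    return (f(x + h) - f(x - h)) / (2 * h)


def square(x):
    # Hypothetical example function: f(x) = x^2
    return x ** 2


avg = average_rate_of_change(square, 1.0, 3.0)   # (9 - 1) / (3 - 1)
inst = instantaneous_rate_of_change(square, 2.0)  # close to 2 * 2
print(avg, inst)
```

Shrinking the interval `[a, b]` around a point makes the average rate approach the instantaneous one, which is the intuition behind the derivative.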


Setting Up the Development Environment

To program for reinforcement learning, you’ll need to install several programs. In data analysis, installing Anaconda provides most of the required programs by default, making it convenient. However, here we’ll go through the process of setting up t…


Deep Learning

Neurons (nerve cells) are the cells that make up the nervous system. They send and receive electrical signals to communicate with other neurons, playing a key role in distributing and storing information. The human brain is composed of hundreds of b…


Classification Analysis

Let’s explore binary classification analysis, which categorizes data into two types. Classification analysis is also a type of supervised learning. Here, we’ll examine simple two-dimensional (X, Y) data. On the graph, multiple Xs are located at the …


Linear Regression Analysis

To understand the concept of machine learning, let's examine simple one-dimensional linear regression analysis. Linear regression analysis is a type of supervised learning used to create a predictive model that can forecast outcomes for unknown …
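A one-dimensional linear regression of the kind described above can be sketched with the closed-form least-squares solution. This is one common way to fit a line, not necessarily the method the full post uses; the data points and the name `fit_line` are made up for this example.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # forecast for an unseen input x = 5
print(slope, intercept, prediction)
```

The fitted `slope` and `intercept` form the predictive model; applying them to an input that was not in the training data is the forecasting step.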


Machine Learning

Machine Learning is an AI technology that learns and continuously improves performance without explicit programming. Machine learning algorithms establish a mathematical model for a specific field and complete this model by training with data, enabl…


Q-Learning

Earlier, we explained that the Q-function is used to control policies in MC and TD. In environments where MC and TD are applied, complete information about the model is not available (model-free), so the next state is unknown. Therefore, it is not…
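For orientation, the standard tabular Q-learning update can be sketched as below. The table layout (a dict mapping each state to a list of action values), the function name `q_learning_update`, and the values of `alpha` and `gamma` are assumptions for this example rather than details taken from the post.

```python
def q_learning_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one Q-learning step in place and return the new value.

    The target uses the best action value in the next state (the max),
    regardless of which action the agent actually takes next.
    """
    best_next = max(q[next_state])
    td_target = reward + gamma * best_next
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


# Hypothetical two-state, two-action table.
q_table = {0: [0.0, 0.0], 1: [1.0, 2.0]}

# Target = 1 + 0.9 * max(1.0, 2.0) = 2.8; new value = 0 + 0.5 * 2.8.
new_q = q_learning_update(q_table, state=0, action=1, reward=1.0, next_state=1)
print(new_q)
```

Only an observed transition `(state, action, reward, next_state)` is needed, so no model of the environment is required.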


On-Policy and Off-Policy, Importance Sampling

On-Policy and Off-Policy: Everything we have studied so far is on-policy, because the policy used for evaluation (π) and the policy used for control (π) are the same. In on-policy TD learning, one more timestep is taken to…


SARSA

In TD, the state-value function was used for policy evaluation, while the Q-function (Action-Value Function) was used only for policy control. However, both policy evaluation and control can be conducted using the Q-function since it contains valu…
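A minimal sketch of the standard SARSA update follows, assuming a tabular Q-function stored as a dict of lists. The function name `sarsa_update` and the values of `alpha` and `gamma` are illustrative assumptions. The name SARSA comes from the five quantities the update consumes: state, action, reward, next state, next action.

```python
def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.5, gamma=0.9):
    """Apply one SARSA step in place and return the new value.

    The target uses the value of the action actually chosen in the
    next state, so evaluation follows the same policy that acts.
    """
    td_target = reward + gamma * q[next_state][next_action]
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


# Hypothetical two-state, two-action table.
q_table = {0: [0.0, 0.0], 1: [1.0, 2.0]}

# Target = 1 + 0.9 * q[1][0] = 1.9; new value = 0 + 0.5 * 1.9.
new_q = sarsa_update(q_table, state=0, action=1, reward=1.0,
                     next_state=1, next_action=0)
print(new_q)
```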


TD (Temporal Difference Learning)

MC has one drawback: the state-value function is calculated after the episode is completed, which slows down learning. To address this, a new concept called Temporal Difference Learning (TD) was introduced. Temporal Difference Learning (TD) (1) …
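The contrast with MC can be sketched with the standard one-step TD, or TD(0), update, which adjusts a state value after every single transition instead of waiting for the episode to finish. The dict-based value table, the name `td0_update`, and the values of `alpha` and `gamma` are assumptions for this example.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one TD(0) step to the state-value table in place.

    The estimate for `state` moves toward reward + gamma * V(next_state),
    using the current estimate of the next state as a stand-in for the
    rest of the episode's return.
    """
    td_target = reward + gamma * values[next_state]
    td_error = td_target - values[state]
    values[state] += alpha * td_error
    return values[state]


# Hypothetical value table for two states.
values = {0: 0.0, 1: 1.0}

# Target = 1 + 0.9 * 1.0 = 1.9; new value = 0 + 0.1 * 1.9.
new_v = td0_update(values, state=0, reward=1.0, next_state=1)
print(new_v)
```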
