Gradient
A gradient represents the slope of a function in a multidimensional space. Earlier, we used partial differentiation with respect to 'x' to see how the slope changes along the x-axis on a three-dimensional graph while keeping 'y' constant. A gradient, however, invo…
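As a rough illustration of the idea above, the gradient at a point can be approximated by collecting the partial derivative along each axis. This is a minimal sketch: the surface `f`, the helper `numerical_gradient`, and the step size `h` are assumptions made for the example and do not come from the original post.

```python
def f(x, y):
    # Hypothetical example surface: f(x, y) = x^2 + 3y^2
    return x ** 2 + 3 * y ** 2


def numerical_gradient(func, x, y, h=1e-5):
    # Central difference along x while y is held constant
    df_dx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    # Central difference along y while x is held constant
    df_dy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    # The gradient is the pair of partial derivatives
    return df_dx, df_dy


grad = numerical_gradient(f, 1.0, 2.0)
print(grad)  # close to the analytic gradient (2x, 6y) = (2, 12)
```

The two components together point in the direction in which `f` increases fastest at (1, 2).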
Let’s first look at the concepts of scalars and vectors. Scalars are quantities with only magnitude and no direction, like mass, exam scores, or height. Vectors, on the other hand, have both magnitude and direction, like magnetic force, velocity, …
A partial derivative is a type of differentiation. The function we looked at earlier had one variable (x). A partial derivative, on the other hand, applies when a function has two or more variables. For instance, if a function is f(x, y), where th…
A derivative gives the rate of change of a function at a given point. Before discussing derivatives, let’s clarify what a rate of change is. The rate of change represents the amount of change, including average change rates and insta…
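The difference between an average and an instantaneous rate of change can be sketched numerically. The function `g` and the helper names below are hypothetical, chosen only to make the example concrete.

```python
def g(x):
    # Hypothetical example function: g(x) = x^2
    return x ** 2


def average_rate(func, a, b):
    # Change in the function value divided by the change in x over [a, b]
    return (func(b) - func(a)) / (b - a)


def instantaneous_rate(func, x, h=1e-6):
    # Shrinking the interval around x approximates the derivative at x
    return (func(x + h) - func(x - h)) / (2 * h)


avg = average_rate(g, 1.0, 3.0)    # (9 - 1) / (3 - 1) = 4
inst = instantaneous_rate(g, 1.0)  # approximately g'(1) = 2
print(avg, inst)
```

As the interval shrinks toward a single point, the average rate approaches the instantaneous rate, which is the derivative.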
To program for reinforcement learning, you’ll need to install several programs. In data analysis, installing Anaconda provides most of the required programs by default, making it convenient. However, here we’ll go through the process of setting up t…
Neurons (nerve cells) are the cells that make up the nervous system. They send and receive electrical signals to communicate with other neurons, playing a key role in distributing and storing information. The human brain is composed of hundreds of b…
Let’s explore binary classification analysis, which categorizes data into two types. Classification analysis is also a type of supervised learning. Here, we’ll examine simple two-dimensional (X, Y) data. On the graph, multiple Xs are located at the …
To understand the concept of machine learning, let's examine simple one-dimensional linear regression analysis. Linear regression analysis is a type of supervised learning used to create a predictive model that can forecast outcomes for unknown …
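One-dimensional linear regression can be sketched with the closed-form least-squares solution. The data points and the helper `fit_line` are made up for illustration, and the original post may fit the line differently (for example, with gradient descent).

```python
def fit_line(xs, ys):
    # Least-squares fit of y = slope * x + intercept
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data generated from y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # forecast for an unseen input x = 5
print(slope, intercept, prediction)
```

The fitted line recovers the relationship in the training data and can then be used to predict the output for inputs it has not seen.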
Machine learning is an AI technology that learns from data and continuously improves its performance without being explicitly programmed. Machine learning algorithms establish a mathematical model for a specific field and complete this model by training with data, enabl…
Earlier, we explained that the Q-function is used for policy control in MC and TD. In environments where MC and TD are applied, complete information about the model is not available (model-free), meaning the next state is unknown. Therefore, it is not…
On-Policy and Off-Policy
All the content we have studied so far pertains to on-policy. This is because the policy used for evaluation (π) and the policy used for control (π) are the same. In on-policy learning in TD, one more timestep is taken to…
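One common way to see the on-policy versus off-policy distinction in TD control is to compare the update targets of SARSA (on-policy) and Q-learning (off-policy). Both algorithm names, the Q-table values, and the state and action labels below are assumptions for illustration and are not taken from the original post.

```python
gamma = 0.9   # discount factor (assumed value)
reward = 0.5  # reward observed on the transition (assumed value)

# Hypothetical Q-values for the next state "s2"
q_table = {("s2", "left"): 1.0, ("s2", "right"): 2.0}
actions = ["left", "right"]

# Action actually chosen in "s2" by the policy being followed
next_action = "left"

# On-policy (SARSA): bootstrap from the action the policy really takes next
sarsa_target = reward + gamma * q_table[("s2", next_action)]

# Off-policy (Q-learning): bootstrap from the greedy action, whatever is taken
q_learning_target = reward + gamma * max(q_table[("s2", a)] for a in actions)

print(sarsa_target, q_learning_target)
```

The on-policy target depends on the next action that is actually taken, which is why one more timestep is needed before the update can be made.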
MC has one drawback: the state-value function is calculated after the episode is completed, which slows down learning. To address this, a new concept called Temporal Difference Learning (TD) was introduced. Temporal Difference Learning (TD) (1) …
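The standard TD(0) update shows how a state value can be adjusted after a single step instead of waiting for the episode to finish. This is a minimal sketch: the state names, step size `alpha`, discount `gamma`, and the helper `td0_update` are assumptions for the example.

```python
alpha = 0.1  # step size (assumed value)
gamma = 0.9  # discount factor (assumed value)

# Hypothetical current value estimates for two states
values = {"A": 0.0, "B": 1.0}


def td0_update(values, state, reward, next_state):
    # TD target: immediate reward plus the discounted estimate of the next state
    target = reward + gamma * values[next_state]
    # Move the current estimate a fraction alpha toward the target
    values[state] += alpha * (target - values[state])
    return values[state]


# One observed transition A -> B with reward 1.0 updates V(A) immediately
new_value = td0_update(values, "A", 1.0, "B")
print(new_value)  # 0.0 + 0.1 * (1.0 + 0.9 * 1.0 - 0.0)
```

Because the update uses the estimate of the next state rather than the full return, learning can proceed at every timestep.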