Definition of

Supervised learning

Feature extraction

Supervised learning enables feature extraction.

Supervised learning is a machine learning model that is based on the use of labeled data to train algorithms that can classify them or anticipate results from them. What the model does is adjust its weights as more data is entered.

The supervision referred to in the name of the model is carried out by a human being . A person is in charge of labeling the data: therefore, it is not a fully automated learning that a system can develop without human intervention.

Supervised learning concept

To understand what supervised learning is, you must first pay attention to several notions. It should be clear that supervised learning is a technique of automatic learning or machine learning , which in turn is an area of ​​artificial intelligence .

The idea of ​​artificial intelligence refers to certain intellectual or cognitive capabilities expressed by algorithms or systems. In this way, machines can imitate - in a certain way - human intelligence.

Machine learning , in this framework, is a set of techniques that make it possible for a system to learn: that is, to improve its performance based on the use of data and experience. As we already indicated, in the case of supervised learning it is necessary for an individual to interact with the model.

big data

With supervised learning, data analysis is optimized.

The techniques

Supervised learning can be carried out using different techniques and algorithms. Among these resources are artificial neural networks , which seek to imitate the functioning of the human brain by using layers of nodes to process input data.

Each of the nodes is formed with the inputs , the weights , a threshold ( bias ) and the output . The activation of the node occurs when the input value manages to exceed the threshold; In that case, the data advances to the next layer of the neural network. Supervised learning, in this context, is achieved with adjustments that are based on the loss function via gradient descent.

Linear regression , meanwhile, recognizes the link between a dependent variable and one or more independent variables. This way you can predict results. If there is a single independent variable, it is called simple linear regression ; if there are more, it is multiple linear regression .

Logistic regression , for its part, is used if the dependent variables present binary outputs (that is, if they are categorical). That is why they are used above all to solve problems of this type of classification .

Support vector machines (SVM) are also used for regression and data classification. With them, the construction of a hyperplane (the decision limit) is carried out that separates the data according to its class.

Among the machine learning algorithms there is also the one known as K -nearest neighbors (KNN) . It is used to classify data points according to their association and proximity to other data, calculating the distance and then assigning a category based on the highest frequency average.

The algorithm mentioned as random forest , on the other hand, is a series of uncorrelated decision trees that, when merged, minimize variance and make more accurate predictions.

Preference analysis

Recommender systems typically use supervised learning.

Examples of supervised learning

An example of supervised learning is spam detection . Through these algorithms, it is possible to train systems to recognize patterns. In this way, incoming messages are examined and those that are considered spam are identified and separated from the rest.
Another example of supervised learning occurs in sentiment analysis . In this framework, the algorithms work with big data, extracting and classifying the information that is posted in the comments. Artificial intelligence allows us to generate knowledge about emotions and even context.

Image analysis to locate and categorize objects and predictive modeling to anticipate results that help make decisions are other examples of the use of supervised learning.

Unsupervised machine learning

It is common that, when referring to supervised learning, reference is also made to unsupervised learning . The difference between the two is that supervised learning uses labeled data, while unsupervised learning uses unlabeled data.

What unsupervised machine learning does is look for patterns that contribute to generating associations or groupings without people intervening. Gaussian mixture , k-means and hierarchical models are the most used algorithms in this case.

It should be noted that a third model is semi-supervised learning , which works with a set of labeled data and other unlabeled ones.

When choosing one or the other, it should be considered that supervised learning is more expensive and time- consuming since it is not easy to label the data appropriately.