What is machine learning?
Two well-known definitions of machine learning help to understand the concept.
"The field of study that gives computers the ability to learn without being explicitly programmed." (Arthur Samuel)
"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E." (Tom Mitchell)
Supervised learning
Supervised learning refers to the process of giving an algorithm a data set in which every example comes with its expected "right answer". The task for the algorithm is then to produce a model whose outputs match the right answers for those examples.
In this process the algorithm shapes the model by comparing its outputs with the answers provided by a human. It is called supervised learning because an algorithm learning from the training data set can be thought of as a student whose learning process is supervised by a teacher.
Supervised learning problems can be grouped into two main categories:
- Regression problems. The model predicts a real (continuous) value, such as a price or a weight. Example: predicting the price of a house given features of the house such as size, location…
- Classification problems. Given a data set, the output variable is a category. For example, an algorithm that determines whether an e-mail is spam or not: the expected output can be 1 (spam) or 0 (not spam). There can also be more than two output categories, which are also known as class labels.
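The two categories above can be sketched in a few lines of plain Python. This is only a minimal illustration, not a reference implementation: the house sizes, prices, word counts and labels are made-up data, and the function names (fit_line, predict_price, fit_centroids, classify) are hypothetical. The regression part fits a straight line by ordinary least squares; the classification part uses a nearest-centroid rule, chosen here only because it is short.

```python
# --- Regression: predict a real value (house price) from a feature (size) ---
# Made-up training data: size in square metres -> price in thousands.
sizes = [50.0, 70.0, 90.0, 110.0]
prices = [150.0, 210.0, 270.0, 330.0]  # the "right answers" given to the algorithm


def fit_line(xs, ys):
    """Ordinary least squares for the model y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(sizes, prices)


def predict_price(size):
    """Apply the learned model to a house that was not in the training data."""
    return slope * size + intercept


# --- Classification: predict a category (1 = spam, 0 = not spam) ---
# Made-up training data: number of suspicious words in each e-mail -> label.
word_counts = [0.0, 1.0, 2.0, 8.0, 9.0, 10.0]
labels = [0, 0, 0, 1, 1, 1]


def fit_centroids(xs, ys):
    """Compute the mean feature value (centroid) of each class label."""
    centroids = {}
    for label in set(ys):
        members = [x for x, y in zip(xs, ys) if y == label]
        centroids[label] = sum(members) / len(members)
    return centroids


centroids = fit_centroids(word_counts, labels)


def classify(word_count):
    """Return the label whose centroid is closest to the given value."""
    return min(centroids, key=lambda label: abs(word_count - centroids[label]))


print(predict_price(100.0))  # 300.0
print(classify(7.0))         # 1 (closer to the spam centroid)
```

In both cases the model is shaped only by the labelled examples: changing the prices or the labels changes what the model predicts.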
Unsupervised learning
Unsupervised learning makes it possible to approach problems without knowing the expected output. Algorithms derive structure from a given data set without humans knowing in advance what the results will look like. It is called unsupervised learning because there are no correct answers and therefore no feedback based on the prediction results.
Unsupervised learning problems can be grouped into two main categories:
- Clustering. The algorithm groups the data and finds patterns inside a given data set. For example, grouping users by their behaviour while interacting with new software.
- Association. The algorithm finds associations, relationships and dependencies between entities in a given data set. Example: two products that are usually bought together at a bookstore.
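Both categories can be illustrated with a small sketch in plain Python. The data (session lengths, shopping baskets) is made up and the function name kmeans_1d is hypothetical. The clustering part is a simplified one-dimensional k-means with fixed starting centres; the association part only counts how often two products appear in the same basket, which is a simple stand-in for a full association-rule algorithm. Note that no "right answers" are supplied anywhere.

```python
from collections import Counter
from itertools import combinations

# --- Clustering: group users by behaviour, with no labels given ---
# Made-up data: minutes each user spent in the software per session.
minutes = [2.0, 3.0, 4.0, 30.0, 32.0, 34.0]


def kmeans_1d(values, centers, iterations=10):
    """Simplified 1-D k-means: assign each value to its nearest centre,
    then move each centre to the mean of the values assigned to it."""
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old centre if no value was assigned to it.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Assumption: start from the smallest and largest values as initial centres.
centers, groups = kmeans_1d(minutes, [min(minutes), max(minutes)])

# --- Association: find products that are often bought together ---
# Made-up data: each set is one customer's purchase at a bookstore.
baskets = [
    {"novel", "bookmark"},
    {"novel", "bookmark", "notebook"},
    {"novel", "notebook"},
    {"bookmark", "novel"},
]

pair_counts = Counter()
for basket in baskets:
    # Sort so that the same two products always form the same pair.
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

most_common_pair, count = pair_counts.most_common(1)[0]

print(centers)                  # [3.0, 32.0]
print(most_common_pair, count)  # ('bookmark', 'novel') 3
```

The algorithm discovers the two user groups and the frequent product pair from the structure of the data alone.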
References