K-Means = k-means is an algorithm developed for clustering tasks. In simple words, given a set of observations, k-means searches for k groups of tightly similar observations: each observation is assigned to the nearest of k centroids, which are in turn iteratively updated based on the newly assigned observations. The term “k-means” was first used in 1967, but the idea dates back to Hugo Steinhaus in 1956.
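As a rough illustration of the assign-and-update loop described above, here is a minimal Python sketch (the function name, the random initialisation and the convergence check are illustrative choices, not part of the original formulation):

```python
import numpy as np

def k_means(X, k, n_iters=100, seed=0):
    """Minimal k-means sketch: X is an (n_samples, n_features) array."""
    rng = np.random.default_rng(seed)
    # Initialise the k centroids by picking k random observations.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assign each observation to its nearest centroid.
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Update each centroid as the mean of its newly assigned observations.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # Assignments are stable: the algorithm has converged.
        centroids = new_centroids
    return labels, centroids
```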
The General Problem Solver = similar to the Logic Theorist, the General Problem Solver was conceived to perform general problem solving, e.g. automated theorem proving or the Tower of Hanoi game. In contrast to its predecessor, the algorithm exploits a new search method, called means-ends analysis, which limits the search for the correct solution in order to keep the algorithm computationally efficient.
The Perceptron = another important milestone was reached in 1957, when the psychologist Frank Rosenblatt designed a new type of artificial neuron, inspired by the design of the McCulloch and Pitts neuron. Rosenblatt’s neuron, called the Perceptron, had the ability to automatically update its synaptic weights, based on some input and some expected output. The update was performed following the principles of Hebbian learning, through the delta rule, which triggers an update every time the neuron misclassifies the input. Rosenblatt’s neuron is at the basis of modern research in deep learning. Moreover, multiple perceptrons can be stacked and connected together to form a multilayer perceptron (MLP), to perform complex classification/regression tasks.
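A minimal Python sketch of a misclassification-driven update in the spirit of the delta rule described above (the function name, the {-1, +1} labels and the learning rate are illustrative assumptions):

```python
import numpy as np

def perceptron_train(X, y, lr=0.1, epochs=10):
    """Minimal perceptron sketch; y holds labels in {-1, +1}."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            # Delta-rule-style update: change the weights only when the
            # neuron misclassifies (or sits exactly on the boundary of) x_i.
            if y_i * (np.dot(w, x_i) + b) <= 0:
                w += lr * y_i * x_i
                b += lr * y_i
    return w, b
```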
Support Vector Machine = the SVM is an algorithm developed by the two Russian scientists Vladimir Vapnik and Alexey Chervonenkis. The algorithm can be seen as an extension of the Perceptron, with the additional requirement that the learned separating boundary must be equally far from the nearest points on either side of it (the maximum-margin principle). In 1992, SVMs were further extended to deal with non-linearly separable data, by using the “kernel trick”.
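To illustrate the kernel trick, the sketch below uses scikit-learn (a library not mentioned in the text) to compare a linear and an RBF-kernel SVM on data that is not linearly separable:

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Non-linearly separable data: two concentric circles.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# A linear boundary cannot separate the circles, while an RBF kernel
# (the "kernel trick") handles them easily.
linear_svm = SVC(kernel="linear").fit(X, y)
rbf_svm = SVC(kernel="rbf").fit(X, y)

print("linear kernel accuracy:", linear_svm.score(X, y))
print("RBF kernel accuracy:", rbf_svm.score(X, y))
```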
Computational complexity theory = theoretical computer scientists soon realised that not all decidable problems were easily solvable in practice. In other words, as the input size grew linearly, some problems required only a modest increase in computation time, while others became computationally infeasible. From these considerations, Hartmanis and Stearns wrote On the Computational Complexity of Algorithms, establishing the basic concepts of computational complexity theory. Later, in 1971, Cook and Levin proved the existence of a class of problems, called NP-complete, that are widely believed to be computationally intractable. This result effectively pushed the community into embracing a paradigm shift, searching for approximate solutions to NP-complete problems instead of solving them exactly.
Fukushima’s Neocognitron = inspired by early studies on the cat visual cortex, the Japanese researcher Kunihiko Fukushima introduced a new connectionist architecture, the Neocognitron. This model introduced two important layer types that were later adopted in convolutional neural networks (CNNs), and in the following years Fukushima proposed various methods for training the network weights.
Hopfield Network = the Hopfield network, named after its inventor John Hopfield, introduced a new kind of neural model in the field of connectionist AI, the recurrent neural network (RNN). This model repeatedly sends its output back as new input to re-process it, and updates its weights according to Hebbian learning rules, giving it the ability to “remember” states through associative memory. Hopfield networks have been successfully used to solve problems such as the travelling salesman problem.
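A minimal sketch of the associative-memory behaviour described above, assuming binary (+1/-1) patterns and a synchronous update rule (both simplifying choices):

```python
import numpy as np

def hopfield_train(patterns):
    """Hebbian weight matrix for a set of binary (+1/-1) patterns."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)  # Hebbian rule: strengthen co-active units.
    np.fill_diagonal(W, 0)   # No self-connections.
    return W / len(patterns)

def hopfield_recall(W, state, n_steps=10):
    """Repeatedly feed the output back as input until the state settles."""
    for _ in range(n_steps):
        state = np.sign(W @ state)
        state[state == 0] = 1  # Break ties towards +1.
    return state
```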
WordNet = with improvements in computer hardware and the development of more efficient learning algorithms, the need for more training data grew and dataset sizes started to increase accordingly. One notable example is provided by the WordNet project. Started by George Miller in 1985, WordNet aimed to provide a large lexical database of English words grouped into sets of cognitive synonyms and interlinked according to semantic and lexical relations. WordNet has been effectively used in a number of NLP tasks.
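Assuming the NLTK interface to WordNet (one of several ways to access the database), a short example of querying synonym sets and semantic relations might look like this:

```python
import nltk
nltk.download("wordnet", quiet=True)  # Fetch the WordNet data if missing.
from nltk.corpus import wordnet as wn

# Synsets: the sets of cognitive synonyms that contain the word "pig".
for synset in wn.synsets("pig"):
    print(synset.name(), "-", synset.definition())

# Semantic relations, e.g. the hypernyms ("is-a" links) of the first sense.
print(wn.synsets("pig")[0].hypernyms())
```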
Backpropagation = in 1986, Rumelhart, Hinton and Williams addressed a well-known limitation regarding the difficulty of training MLPs. In their paper, the authors popularised a previously known method, named backpropagation, which applies the chain rule of derivatives to compute the gradients required by the gradient descent algorithm for the weight update. Backpropagation helped connectionist AI regain popularity after the first AI winter.
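A toy sketch of backpropagation on the XOR problem, with a single hidden layer and a squared-error loss (the layer sizes, learning rate and number of iterations are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Toy XOR problem: a single perceptron cannot solve it, a small MLP can.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)  # hidden layer
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)  # output layer
lr = 0.5

for _ in range(10000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: the chain rule applied layer by layer (squared-error loss).
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # Gradient descent weight update.
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)

print(np.round(out, 2))  # Typically approaches [[0], [1], [1], [0]]
```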
CNN = another interesting achievement in machine learning is the convolutional neural network (CNN). Inspired by Fukushima’s Neocognitron, the researcher Yann LeCun proposed this new kind of connectionist architecture to recognise hand-written ZIP code digits, and used the backpropagation algorithm to train the network. CNNs are nowadays still a method of choice for almost all computer vision tasks.
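A minimal CNN sketch in PyTorch, in the spirit of digit recognition (the framework, layer sizes and input shape are illustrative assumptions, not LeCun’s original architecture):

```python
import torch
import torch.nn as nn

# Minimal CNN sketch for digit-like inputs (28x28 grayscale, 10 classes);
# the layer sizes are illustrative, not the original ZIP-code network.
class SmallCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(16 * 4 * 4, 10)

    def forward(self, x):
        x = self.features(x)                   # convolution + pooling feature maps
        return self.classifier(x.flatten(1))   # flatten and classify

model = SmallCNN()
print(model(torch.zeros(1, 1, 28, 28)).shape)  # torch.Size([1, 10])
```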
Vanishing Gradient = the ’90s witnessed another AI winter, mainly due to the discovery of the vanishing gradient problem. Sepp Hochreiter illustrated the problem in his diploma thesis. In particular, the researcher described the dampening effect that some activation functions have on the backpropagated error, slowing down the convergence of neural networks such as MLPs.
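The dampening effect can be illustrated numerically: the derivative of the sigmoid is at most 0.25, so backpropagating through many sigmoid layers multiplies the error by a factor that shrinks exponentially. A small illustrative computation:

```python
import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
d_sigmoid = lambda z: sigmoid(z) * (1 - sigmoid(z))  # at most 0.25

# Backpropagating through n sigmoid layers multiplies the error signal by
# roughly one derivative per layer, so the gradient shrinks exponentially.
for n_layers in (2, 5, 10, 20):
    gradient_factor = d_sigmoid(0.0) ** n_layers
    print(n_layers, "layers -> gradient factor ~", gradient_factor)
```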
Kernel Methods = kernel methods arose in the ’90s. They do not learn weights; instead, they store a labelled subset of observations and classify a new instance based on its similarity to the observations in that subset. The similarity is computed by means of a kernel function, which is suitable for fast computation.
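A minimal sketch of similarity-based classification with a Gaussian (RBF) kernel; the kernel-weighted vote used here is a simplification for illustration, not a full kernel machine:

```python
import numpy as np

def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel: similarity between two observations."""
    return np.exp(-gamma * np.sum((x - z) ** 2))

def kernel_classify(x_new, X_stored, y_stored, gamma=1.0):
    """Label a new instance by a kernel-weighted vote over the stored subset."""
    sims = np.array([rbf_kernel(x_new, x, gamma) for x in X_stored])
    scores = {label: sims[y_stored == label].sum() for label in np.unique(y_stored)}
    return max(scores, key=scores.get)
```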
IBM Deep Blue Beats Kasparov = on February 10-17, 1996, the American company IBM organised a match between the world chess champion Garry Kasparov and a computer-chess system developed over the previous decade, Deep Blue. The match ended 4-2 for the Russian champion. However, in May 1997 a second match was organised, in which an updated version of the system was able to beat Kasparov. This event is perhaps one of the best known in the history of artificial intelligence, and it soon marked the start of a new era in which AI was considered “mature” enough to compete with human thinking.
ImageNet Challenge = in 2006, the researcher Fei-Fei Li started working on the creation of a large public image dataset with the aim of providing more accessible data for machine learning research. Her vision materialised in the creation of ImageNet, a dataset containing 14 million images organised into more than 20,000 categories. In addition, a subset of 1,000 classes was selected to propose an image classification challenge, which soon became popular in the computer vision research field. With its huge size, ImageNet effectively helped the transition into the deep learning paradigm.
AlexNet = AlexNet, named after one of its designers, Alex Krizhevsky, is a deep neural model that is considered a milestone in the computer vision research field. This model was one of the first GPU-implemented neural models to achieve “superhuman” performance in the ImageNet challenge. AlexNet effectively started a new “AI spring” for connectionist models, with the main effect of pushing the AI community to gradually adopt deep architectures.
Word2Vec = in 2013, the Google team led by Tomas Mikolov proposed a group of related models, called Word2vec, able to compute effective word embeddings for various NLP tasks. The model is simple yet effective, and it employs a shallow neural network with a single hidden layer.
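Assuming the Gensim library (not mentioned in the text), a small illustrative usage of Word2vec might look as follows; the toy corpus and hyperparameters are placeholders:

```python
from gensim.models import Word2Vec

# Toy corpus: each sentence is a list of tokens.
sentences = [
    ["the", "pig", "is", "a", "cute", "animal"],
    ["the", "dog", "is", "a", "loyal", "animal"],
    ["cats", "and", "dogs", "are", "common", "pets"],
]

# Train a small skip-gram model (vector_size and window are illustrative).
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

print(model.wv["pig"].shape)         # a 50-dimensional embedding
print(model.wv.most_similar("dog"))  # nearest words in the embedding space
```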
AlphaGo beats human Go players = from a computational point of view, the game of Go is more difficult to learn than chess, because of its huge number of possible moves. Thus, standard computer-chess algorithms were only able to beat amateur Go players. In 2015, however, Google’s DeepMind team proposed a new model, AlphaGo, which was able to beat several professional players. To achieve this goal, the DeepMind team used a combination of reinforcement learning, deep learning and tree search. AlphaGo proved the effectiveness of the reinforcement learning approach for addressing complex games such as Go, as well as incomplete-information games such as poker, or even video games.
The Bitter Lesson = in March 2019, the researcher Richard Sutton published a blog post entitled The Bitter Lesson, where he described his personal point of view on the current situation in the machine learning community. In particular, Sutton observes that much of the improvement in the history of machine learning is due to increases in computational power. The result is that the most effective methods are those that leverage massive computational power to run simple algorithms such as tree search. Sutton thus encourages researchers to change their mindset: instead of focusing on encoding their own knowledge into models, they should focus on scalable methods based on search and learning.
GPT-3 = GPT-3 is the third version in a series of generative models based on the transformer architecture, and it is used for addressing NLP tasks. Since its debut, GPT-3 has impressed the public with its performance and has been used in a wide variety of applications, including AI pair programming.
WHAT IS MACHINE LEARNING?
ML is a complex, evolving field of research. As such, the organisation of ML knowledge has not been fully established yet. Its interdisciplinary nature usually poses more than one difficulty to newcomers, and the subject can be approached from different points of view.
The Various Modalities of Information
Human learning comes from the observation of real-world information. Accordingly, every ML
algorithm learns from the wide variety of data that can be digitised.
One of the most common types of information processed by ML algorithms is the image. Because of their immediacy, images are one of the primary sources of learning during childhood. Visual concepts spontaneously arise upon a quick look at an image, thanks to the capacity of our brain to process and correlate visual stimuli while filtering out unneeded information. Such a sophisticated ability has been the subject of study and modelling since the very start of ML research. Consider the following image:
Your computer displays it as a set of coloured dots, called pixels. Each pixel is represented as a triplet of three values, and their combination encodes a colour intensity (orange, brown, red, etc.). The first thing you note is that the image contains a cute pig. This task is often referred to by the machine learning community as Image Classification or Object Recognition and, roughly, consists in describing the image in its entirety.
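To make the pixel representation concrete, here is a toy example of an image stored as an array of RGB triplets (the specific values are arbitrary):

```python
import numpy as np

# A toy 2x2 RGB "image": each pixel is a triplet of intensities in [0, 255].
image = np.array([
    [[255, 160, 122], [139,  69,  19]],  # a salmon-like pixel, a brown pixel
    [[255,   0,   0], [255, 165,   0]],  # a red pixel, an orange pixel
], dtype=np.uint8)

print(image.shape)  # (height, width, 3): one triplet per pixel
print(image[0, 1])  # the pixel at row 0, column 1 -> [139, 69, 19]
```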
You can move one step forward and describe the image in more detail, by telling where the pig is located in the image, as follows:
This task is referred to as object detection and consists in localising the object in the image, by providing the pixel coordinates of an area where the object is located. This can be done by providing four values, representing two opposite corners of a rectangle, e.g. (10, 15) and (20, 30).
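A minimal illustration of such a bounding-box representation, assuming the two pairs in the example above are (x, y) corner coordinates:

```python
# A bounding box given as two opposite corners of a rectangle, in pixel
# coordinates (the values reuse the example from the text).
box = {"label": "pig", "x_min": 10, "y_min": 15, "x_max": 20, "y_max": 30}

width = box["x_max"] - box["x_min"]   # 10 pixels wide
height = box["y_max"] - box["y_min"]  # 15 pixels tall
print(width, height)
```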