AVERAGE RECIPROCAL HIT-RANK
The Average Reciprocal Hit-Rank (ARHR) is a modified version of recall: the denominator is the same, but in the numerator each hit is weighted by a fraction, 1 divided by the ranking of item i, that is, the position of the item in the recommendation list. The weight is equal to 1 if the item is in position 1, 0.5 if the item is in position 2, and so on. This is a useful metric, but technically speaking it is not a ranking metric in the strict sense, since it does not compare the ranking provided by the user with the ranking provided by the system.
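For illustration, here is a minimal Python sketch of how the per-user ARHR described above could be computed; the function name, the arguments, and the use of the test-set size as the recall-style denominator are assumptions made for this example.

```python
def average_reciprocal_hit_rank(recommended, relevant):
    """ARHR for one user: like recall, but each hit is weighted by 1/rank."""
    score = 0.0
    for rank, item in enumerate(recommended, start=1):
        if item in relevant:
            score += 1.0 / rank        # weight 1 at position 1, 0.5 at position 2, ...
    return score / len(relevant)       # same denominator as recall

# Example: items 10 and 30 are relevant; 10 is ranked 1st, 30 is ranked 3rd.
print(average_reciprocal_hit_rank([10, 20, 30, 40], {10, 30}))  # (1 + 1/3) / 2 ≈ 0.67
```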
22/09/2022
CONTENT BASED FILTERING
In content-based filtering, we recommend items based on their attributes. In other words, we want to understand how similar two items are based on their attributes, and we compare items through those attributes (for example, liking films that share the same characteristics). The main assumption at the foundation of content-based methods is that a user who expressed a preference for an item will probably like similar items.
In order to represent the attributes of a pool of items, we use the Item Content Matrix, or ICM (row = item, column = attribute of the items). For example, an attribute can be the presence of the actor Harrison Ford for the movie Star Wars. Each cell of the matrix contains a value that is either zero or one: if a cell contains a one, it means that item i has attribute a.
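To make the ICM concrete, here is a tiny hypothetical example in Python (numpy); apart from Star Wars and Harrison Ford, the movies and attributes are chosen only for illustration.

```python
import numpy as np

# Hypothetical ICM: rows = items (movies), columns = attributes.
# A 1 in cell (i, a) means that item i has attribute a.
attributes = ["Harrison Ford", "Sci-Fi", "Western"]
items      = ["Star Wars", "Blade Runner", "Back to the Future 3"]

ICM = np.array([[1, 1, 0],    # Star Wars: Harrison Ford, Sci-Fi
                [1, 1, 0],    # Blade Runner: Harrison Ford, Sci-Fi
                [0, 1, 1]])   # Back to the Future 3: Sci-Fi, Western
```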
MEASURING SIMILARITY
How do we measure the similarity between two items i and j, based on their attributes? We can look at items i and j as two vectors. Each vector has a number of elements equal to the total number of attributes available in the ICM. The vectors are binary: their values can be either 0 or 1. The value 1 means that the item has that attribute, while the value 0 means that the item doesn't have it. (In the example figure, the 0 values are left empty to keep the figure clearer.)
If two items have many attributes in common, we can assume that they are very similar. In a more formal way, the number of common elements between two binary vectors can be computed with the dot product: the similarity is the dot product of the two items' vectors.
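As a minimal sketch (with made-up attribute vectors), the count of common attributes is exactly the dot product:

```python
import numpy as np

# Two items as binary attribute vectors (illustrative values).
item_i = np.array([1, 1, 0, 1, 0])
item_j = np.array([1, 1, 1, 0, 0])

# The dot product counts the attributes the two items have in common.
print(item_i @ item_j)   # 2 shared attributes
```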
COSINE SIMILARITY
In some cases, the similarity can be improved by normalizing the dot product. We take the dot product between the two vectors and divide it by the product of the lengths (norms) of the two vectors. The similarity computed this way is the cosine of the angle between the two vectors of attributes. Graphically, this angle is the one included between the two vectors, that is, the two items. The more similar i and j are, the larger the cosine.
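Using the same toy vectors as before, a sketch of the cosine similarity (dot product divided by the product of the vector lengths):

```python
import numpy as np

item_i = np.array([1, 1, 0, 1, 0])
item_j = np.array([1, 1, 1, 0, 0])

# Cosine similarity: dot product normalized by the Euclidean norms of the two vectors.
cosine = (item_i @ item_j) / (np.linalg.norm(item_i) * np.linalg.norm(item_j))
print(cosine)   # 2 / (sqrt(3) * sqrt(3)) ≈ 0.67
```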
SUPPORT: THE SIZE OF THE SAMPLE
Another important concept is the support. The support of a vector is the number of non-zero elements in the vector. For example, in the matrix on the left the support is small, while in the matrix on the right the support is larger. Looking at the similarity between the items in the two matrices, through the cosine similarity given before, we would say that the items in the first matrix are more similar to each other than the items in the second matrix. But is this true? Or do we have to take something else into account? (left: small support, right: large support)
SHRINKING
In order to shrink the similarity, so that only items that are very similar and have a large support get a high value, we introduce a new term in the cosine similarity. This new term is a constant, added to the denominator, called the Shrink Term.
Coming back to the previous matrices, and choosing 3 as the shrink term, we can notice that the two similarities have changed: 0.25 < 0.5 (i = rows, j = columns). Now the items in the second matrix are more similar than the ones contained in the first one. This happens because, if the denominator of the cosine similarity is large, the shrink term has a smaller effect than in the situation where the denominator is small.
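A sketch of the shrunk cosine; the vectors below are not the actual matrices from the figure, they are simply chosen so that the two results reproduce the 0.25 < 0.5 comparison mentioned above.

```python
import numpy as np

def shrunk_cosine(v_i, v_j, shrink=3):
    """Cosine similarity with a constant shrink term added to the denominator.

    When the support is small the shrink term dominates and pushes the
    similarity down; when the support is large its effect is negligible.
    """
    return (v_i @ v_j) / (np.linalg.norm(v_i) * np.linalg.norm(v_j) + shrink)

small = (np.array([1, 0, 0]), np.array([1, 0, 0]))   # small support, plain cosine = 1
large = (np.array([1, 1, 1]), np.array([1, 1, 1]))   # large support, plain cosine = 1
print(shrunk_cosine(*small))   # 1 / (1 + 3) = 0.25
print(shrunk_cosine(*large))   # 3 / (3 + 3) = 0.5
```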
SIMILARITY MATRIX
The similarities computed across all pairs of items constitute the similarity matrix. The element (i, j) of the matrix describes how much item i is similar to item j. The similarity matrix, computed with the dot product or with the shrunk cosine, is a symmetric matrix: the similarity between i and j is equal to the similarity between j and i.
ESTIMATING RATING
Once you have the similarity matrix, you can estimate the rating of user u on item i by relying on the past ratings of the same user on other items j. The estimated rating of user u on item i is the summation of the user's past ratings on items j, each multiplied by the similarity between item j and item i, all divided by a normalization term. This normalization is useful if we want to estimate the ratings as accurately as possible. However, if your goal is a Top-N recommendation, you may leave the normalization out. Top-N task: recommend the N items with the highest estimated rating.
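A minimal sketch of the estimation just described (the function and variable names are assumptions): the score is the sum of the user's past ratings weighted by the similarities, optionally divided by the sum of the similarities.

```python
import numpy as np

def estimate_rating(user_ratings, sim_to_i, normalize=True):
    """Estimate the rating of one user on item i.

    user_ratings: the user's past ratings on all items (0 = not rated)
    sim_to_i:     similarities between every item j and the target item i
    """
    score = user_ratings @ sim_to_i
    if normalize:                       # useful for accurate rating prediction
        score /= sim_to_i.sum() + 1e-9  # small constant avoids division by zero
    return score                        # for a Top-N task the normalization can be dropped
```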
MATRIX NOTATION
In this section we describe content-based filtering using a matrix notation that helps us write the equations of our recommender models in a more compact way. As we have seen, we can estimate the rating of user u on item i by relying on the past ratings of the same user on other items j. We can rewrite the formula so that the summation over the ratings becomes a vector-matrix product: the vector is the whole user profile, not just the single element j as before, and the similarity is no longer just the similarity between items i and j but the entire item similarity matrix. We can generalize this formula even further and extend it to all users: the estimated rating matrix is the product of the User Rating Matrix and the Similarity Matrix.
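In code, the matrix form is a single matrix product; the toy URM and S below are illustrative.

```python
import numpy as np

# URM: User Rating Matrix (users x items); S: item-item similarity matrix (items x items).
URM = np.array([[5.0, 0.0, 3.0],
                [0.0, 4.0, 0.0]])
S   = np.array([[1.0, 0.2, 0.7],
                [0.2, 1.0, 0.1],
                [0.7, 0.1, 1.0]])

# Estimated ratings for all users and all items at once.
R_hat = URM @ S
```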
K-NEAREST NEIGHBOURS (KNN)
K-Nearest Neighbours is a technique used to simplify, and get the most out of, a similarity matrix. The similarity matrix is a dense matrix (few empty cells), which makes it heavy in a computational sense: storing a big data structure is expensive in terms of both memory and time. Moreover, the similarity matrix contains values that are mostly small, which introduces a lot of noise in the data: since most similarity values are small and very close to each other, it becomes difficult to distinguish items through them. Overall, these problems lead to lower quality recommendations when using the full similarity matrix.
The solution is the K-Nearest-Neighbours technique. This method consists in keeping, for each item, only the K most similar items in the similarity matrix, where K is an integer. As pointed out in the example, with K equal to two, only the green similarity values are kept in the matrix. The formula for the estimation of the ratings now keeps track of the K nearest neighbours: in practice, the estimated ratings are calculated only on the items j that are part of the K nearest neighbours of item i.
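A sketch of the KNN pruning of a similarity matrix, assuming we keep, for every item (column), only its K largest similarities and zero out the rest:

```python
import numpy as np

def keep_k_nearest(S, k=2):
    """Keep only the k most similar items for each column of S, zeroing the rest."""
    S_knn = np.zeros_like(S)
    for i in range(S.shape[1]):
        col = S[:, i].copy()
        col[i] = 0                      # ignore the self-similarity
        top_k = np.argsort(col)[-k:]    # indices of the k most similar items
        S_knn[top_k, i] = col[top_k]
    return S_knn
```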
THE CHOICE OF K
The quality of the recommender algorithm depends on the value of K. If K is too small, the model doesn't have enough data to make reliable estimations. On the other hand, if K is too large, the data will contain too much noise.
NON-BINARY ATTRIBUTES
One main improvement can be obtained by introducing non-binary attributes. We have seen that an item can either have or not have a specific attribute. However, there can be intermediate cases. What if, for example, Back to the Future 3 is both a science fiction and a western film? Using binary weights, we have to say that it has both the science fiction attribute and the western attribute. But is that right? It is arguably more science fiction than western, but how can we depict this difference in the ICM? By using non-binary weights: now we can specify how much "Back to the Future 3" is science fiction and how much it is western. The Item Content Matrix can be further improved by introducing attribute weights.
As we can see, movies 1 and 3 have some attributes in common (similar title, actors and directors), but a different year of production and cost. Are these attributes equally important when comparing movies? In the same example, movies 1 and 2 are similar for year and cost, but really different for title, actors and directors. So, is the first movie more similar to the second or to the third one? We can decide it if we add attribute weights. Here, for example, we consider the attribute "title", which has a weight of 0.8, more important than the "year" attribute, which has a weight of only 0.5.
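A small sketch of how attribute weights could be applied to the ICM before computing similarities; the 0.8 (title) and 0.5 (year) weights come from the example above, the rest of the numbers are made up.

```python
import numpy as np

# Non-binary ICM (how much each item has each attribute), illustrative values.
#                 title   year
ICM = np.array([[1.0,    1.0],
                [0.0,    1.0],
                [1.0,    0.0]])

# Attribute weights: "title" (0.8) counts more than "year" (0.5).
weights = np.array([0.8, 0.5])

# Scale every attribute column by its weight before computing item similarities.
ICM_weighted = ICM * weights
```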
TF-IDF TECHNIQUES
The techniques presented are used to automatically adjust the weights of attributes in the Item Content
Matrix. The TF-IDF is given by the product of two terms, the Term Frequency (TF), and the Inverse
Document Frequency (IDF). The main principle of these techniques is to balance the weights of the
attributes depending on their frequency of appearance in the items.
The first component is the TF (Term Frequency): TF(a, i) = N_{a,i} / N_i, where N_{a,i} is the number of appearances of attribute a in item i (often equal to 1, which can give a lot of attributes with very small term frequency values) and N_i is the total number of attributes of item i.
The second component is the IDF (Inverse Document Frequency): IDF(a) = log10(N_items / N_a), where N_items is the total number of items and N_a is the number of items with attribute a. It aims at solving the problem of small Term Frequency values for rare attributes.
Example using the TF technique: for the highlighted item i (third row), the TF for attribute a has the value 1/3; in another case, for the highlighted item i, the TF for attribute c has the value zero; for item j, the TF for attribute a has the value 1/6. If the analyzed item has many attributes, the weight of each single attribute becomes small.
Example using the IDF technique for normalizing the weights: let's start by computing the IDF for attribute a. The result is zero: in fact, if the attribute has value 1 in all the items, it has no informative content! Analyzing the IDF for attribute b, we can see that the value becomes different from zero. Finally, for attribute c the value of the IDF is 0.6. In conclusion, the product between TF and IDF gives more balanced values for the weights of the attributes.
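A minimal sketch of TF-IDF reweighting of a binary ICM; the base-10 logarithm is an assumption, chosen because it reproduces an IDF of roughly 0.6 for an attribute appearing in 1 item out of 4, as in the example above.

```python
import numpy as np

def tf_idf(ICM):
    """Reweight a binary ICM with TF-IDF.

    TF(a, i) = N_ai / N_i           (appearances of attribute a in item i over
                                     the number of attributes of item i)
    IDF(a)   = log10(N_items / N_a) (total items over items having attribute a)
    """
    n_items = ICM.shape[0]
    tf  = ICM / np.maximum(ICM.sum(axis=1, keepdims=True), 1)  # row-wise normalization
    idf = np.log10(n_items / np.maximum(ICM.sum(axis=0), 1))   # per-attribute rarity
    return tf * idf
```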
12/10/2022
ITEM SIMILARITY: IMPLICIT RATINGS
In the item-based collaborative filtering technique, the idea is to calculate the similarity between each pair of items according to how many users have rated them both. Then we use the ratings specified by the user for those items to predict whether he or she will like the target items.
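As a minimal sketch of the co-rating idea with implicit ratings, the item-item similarity can be obtained directly from the URM (toy data below):

```python
import numpy as np

# Implicit URM: 1 if the user interacted with the item, 0 otherwise (toy data).
URM = np.array([[1, 1, 0, 1],
                [0, 1, 1, 0],
                [1, 1, 0, 0]])

# Entry (i, j) counts how many users have rated both item i and item j.
co_rated = URM.T @ URM
```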