Mar-24-2020, 09:44 PM

I was given two files: one with each person's name and their ratings, which I put into a dictionary, and one with the authors' names and the books they've written, which I also put into a dictionary.

I was then asked to create an algorithm based on this.

A program can calculate how similar two users are by treating each of their ratings as a vector and calculating the dot product of these two vectors. The dot product is simply the sum of the products of each of the corresponding elements.

For example, suppose we had 3 books in our database and User A rated them [5, 3, -5], User B rated them [1, 5, -3], User C rated them [5, -3, 5], and User D rated them [1, 3, 0]. The similarity between Users A and B is calculated as: (5 x 1) + (3 x 5) + (-5 x -3) = 5 + 15 + 15 = 35. The similarity between Users A and C is: (5 x 5) + (3 x -3) + (-5 x 5) = 25 - 9 - 25 = -9. The similarity between Users A and D is (5 x 1) + (3 x 3) + (-5 x 0) = 5 + 9 + 0 = 14. We see that if both people like a book (rating it with a positive number) it increases their similarity and if both people dislike a book (both giving it a negative number) it also increases their similarity.
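A minimal sketch of the dot-product step might look like the following. It assumes each user's ratings are stored as a list in a dictionary keyed by name, with every list in the same book order; the function name `similarity` and the dictionary name `ratings` are my own choices, not something from the assignment.

```python
def similarity(ratings_a, ratings_b):
    """Dot product of two rating lists that are in the same book order."""
    return sum(a * b for a, b in zip(ratings_a, ratings_b))


# The four users from the example in the assignment text.
ratings = {
    "User A": [5, 3, -5],
    "User B": [1, 5, -3],
    "User C": [5, -3, 5],
    "User D": [1, 3, 0],
}

for name in ("User B", "User C", "User D"):
    print(name, similarity(ratings["User A"], ratings[name]))
# User B 35
# User C -9
# User D 14
```

The printed values match the hand calculation in the example (35, -9 and 14).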

Once you have calculated the pair-wise similarity between User A and every other user, you can then identify whose ratings are most similar to User A’s. In this case, User B is most similar to User A, so we would recommend to User A the top books from User B’s list that User A hasn't already read.
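One way the recommendation step could be sketched is shown here. It rests on a few assumptions that the assignment text does not state: a rating of 0 means "has not read the book", a "top book" is anything the similar user rated at least `min_rating`, and the book titles, user names and ratings are made up for illustration. The function names are hypothetical too.

```python
def similarity(ratings_a, ratings_b):
    """Dot product of two rating lists that are in the same book order."""
    return sum(a * b for a, b in zip(ratings_a, ratings_b))


def most_similar(user, ratings):
    """Name of the other user whose ratings have the highest dot product."""
    others = [name for name in ratings if name != user]
    return max(others, key=lambda name: similarity(ratings[user], ratings[name]))


def recommend(user, ratings, books, min_rating=1):
    """Books the most similar user liked that `user` has not read.

    Assumption: a rating of 0 means the book has not been read.
    """
    best = most_similar(user, ratings)
    picks = [
        (score, title)
        for title, score, own in zip(books, ratings[best], ratings[user])
        if own == 0 and score >= min_rating
    ]
    picks.sort(reverse=True)  # highest-rated first
    return [title for score, title in picks]


# Hypothetical data, not taken from the assignment files.
books = ["Book 1", "Book 2", "Book 3", "Book 4"]
ratings = {
    "Ann": [5, 3, 0, 0],
    "Ben": [3, 5, 5, 1],
    "Cat": [-5, 1, 3, 5],
}

print(most_similar("Ann", ratings))      # Ben
print(recommend("Ann", ratings, books))  # ['Book 3', 'Book 4']
```

If your files use a different marker for unread books, only the `own == 0` check would need to change.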

I am confused and stuck on how to implement this.
