{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Example: MovieLens dataset\n",
"\n",
"We will use the R package [recommenderlab](https://cran.r-project.org/web/packages/recommenderlab/index.html) and the __100k-MovieLense__ dataset. The data was collected through the MovieLens web site\n",
"(movielens.umn.edu) between September 1997 and April 1998. The dataset contains about 100k ratings (1-5) from 943 users on 1664 movies. Each user has rated at least 19 movies. Note that the ratings matrix is stored with users as rows and movies as columns (unlike the convention used in the lectures)."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"943 x 1664 rating matrix of class ‘realRatingMatrix’ with 99392 ratings."
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"library(recommenderlab)\n",
"data(MovieLense)\n",
"MovieLense\n",
"nusers=dim(MovieLense)[1]\n",
"nmovies=dim(MovieLense)[2]"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" Min. 1st Qu. Median Mean 3rd Qu. Max. \n",
" 19.0 32.0 64.0 105.4 147.5 735.0 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# check how many movies each user has rated\n",
"summary(rowCounts(MovieLense))"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
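{
"data": {
"text/html": [
"<table>\n",
"<thead><tr><th scope=col>title</th><th scope=col>year</th><th scope=col>url</th><th scope=col>unknown</th><th scope=col>Action</th><th scope=col>Adventure</th><th scope=col>Animation</th><th scope=col>Children's</th><th scope=col>Comedy</th><th scope=col>Crime</th></tr></thead>\n",
"<tbody>\n",
"\t<tr><td>Toy Story (1995)</td><td>1995</td><td>http://us.imdb.com/M/title-exact?Toy%20Story%20(1995)</td><td>0</td><td>0</td><td>0</td><td>1</td><td>1</td><td>1</td><td>0</td></tr>\n",
"\t<tr><td>GoldenEye (1995)</td><td>1995</td><td>http://us.imdb.com/M/title-exact?GoldenEye%20(1995)</td><td>0</td><td>1</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"\t<tr><td>Four Rooms (1995)</td><td>1995</td><td>http://us.imdb.com/M/title-exact?Four%20Rooms%20(1995)</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"\t<tr><td>Get Shorty (1995)</td><td>1995</td><td>http://us.imdb.com/M/title-exact?Get%20Shorty%20(1995)</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td></tr>\n",
"\t<tr><td>Copycat (1995)</td><td>1995</td><td>http://us.imdb.com/M/title-exact?Copycat%20(1995)</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td></tr>\n",
"</tbody>\n",
"</table>\n"
],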
"text/latex": [
"\\begin{tabular}{r|llllllllll}\n",
" title & year & url & unknown & Action & Adventure & Animation & Children's & Comedy & Crime\\\\\n",
"\\hline\n",
"\t Toy Story (1995) & 1995 & http://us.imdb.com/M/title-exact?Toy\\%20Story\\%20(1995) & 0 & 0 & 0 & 1 & 1 & 1 & 0 \\\\\n",
"\t GoldenEye (1995) & 1995 & http://us.imdb.com/M/title-exact?GoldenEye\\%20(1995) & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\\\\n",
"\t Four Rooms (1995) & 1995 & http://us.imdb.com/M/title-exact?Four\\%20Rooms\\%20(1995) & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\\\\n",
"\t Get Shorty (1995) & 1995 & http://us.imdb.com/M/title-exact?Get\\%20Shorty\\%20(1995) & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\\\\n",
"\t Copycat (1995) & 1995 & http://us.imdb.com/M/title-exact?Copycat\\%20(1995) & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\\\\n",
"\\end{tabular}\n"
],
"text/plain": [
" title year url \n",
"1 Toy Story (1995) 1995 http://us.imdb.com/M/title-exact?Toy%20Story%20(1995) \n",
"2 GoldenEye (1995) 1995 http://us.imdb.com/M/title-exact?GoldenEye%20(1995) \n",
"3 Four Rooms (1995) 1995 http://us.imdb.com/M/title-exact?Four%20Rooms%20(1995)\n",
"4 Get Shorty (1995) 1995 http://us.imdb.com/M/title-exact?Get%20Shorty%20(1995)\n",
"5 Copycat (1995) 1995 http://us.imdb.com/M/title-exact?Copycat%20(1995) \n",
" unknown Action Adventure Animation Children's Comedy Crime\n",
"1 0 0 0 1 1 1 0 \n",
"2 0 1 1 0 0 0 0 \n",
"3 0 0 0 0 0 0 0 \n",
"4 0 1 0 0 0 1 0 \n",
"5 0 0 0 0 0 0 1 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"MovieLenseMeta[1:5,1:10] # movie metadata (feature vectors) is also available, but we do not use it here"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can visualise part of the ratings matrix. There is a lot of missing data!"
]
},
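{
"cell_type": "markdown",
"metadata": {},
"source": [
"To put a rough number on how sparse the matrix is, the sketch below divides the number of observed ratings by the number of cells, using only the figures printed above (99392 ratings, 943 users, 1664 movies). The variable names are illustrative and not part of `recommenderlab`.\n",
"\n",
"```r\n",
"# figures taken from the printed summary of MovieLense\n",
"n_ratings <- 99392\n",
"n_cells <- 943 * 1664          # users x movies\n",
"\n",
"# fraction of the matrix that is actually observed (about 6%)\n",
"n_ratings / n_cells\n",
"```\n",
"\n",
"So roughly 94% of the entries are missing."
]
},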
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA0gAAANICAMAAADKOT/pAAADAFBMVEUAAAABAQECAgIDAwME\nBAQFBQUGBgYHBwcICAgJCQkKCgoLCwsMDAwNDQ0ODg4PDw8QEBARERESEhITExMUFBQVFRUW\nFhYXFxcYGBgZGRkaGhobGxscHBwdHR0eHh4fHx8gICAhISEiIiIjIyMkJCQlJSUmJiYnJyco\nKCgpKSkqKiorKyssLCwtLS0uLi4vLy8wMDAxMTEyMjIzMzM0NDQ1NTU2NjY3Nzc4ODg5OTk6\nOjo7Ozs8PDw9PT0+Pj4/Pz9AQEBBQUFCQkJDQ0NERERFRUVGRkZHR0dISEhJSUlKSkpLS0tM\nTExNTU1OTk5PT09QUFBRUVFSUlJTU1NUVFRVVVVWVlZXV1dYWFhZWVlaWlpbW1tcXFxdXV1e\nXl5fX19gYGBhYWFiYmJjY2NkZGRlZWVmZmZnZ2doaGhpaWlqampra2tsbGxtbW1ubm5vb29w\ncHBxcXFycnJzc3N0dHR1dXV2dnZ3d3d4eHh5eXl6enp7e3t8fHx9fX1+fn5/f3+AgICBgYGC\ngoKDg4OEhISFhYWGhoaHh4eIiIiJiYmKioqLi4uMjIyNjY2Ojo6Pj4+QkJCRkZGSkpKTk5OU\nlJSVlZWWlpaXl5eYmJiZmZmampqbm5ucnJydnZ2enp6fn5+goKChoaGioqKjo6OkpKSlpaWm\npqanp6eoqKipqamqqqqrq6usrKytra2urq6vr6+wsLCxsbGysrKzs7O0tLS1tbW2tra3t7e4\nuLi5ubm6urq7u7u8vLy9vb2+vr6/v7/AwMDBwcHCwsLDw8PExMTFxcXGxsbHx8fIyMjJycnK\nysrLy8vMzMzNzc3Ozs7Pz8/Q0NDR0dHS0tLT09PU1NTV1dXW1tbX19fY2NjZ2dna2trb29vc\n3Nzd3d3e3t7f39/g4ODh4eHi4uLj4+Pk5OTl5eXm5ubn5+fo6Ojp6enq6urr6+vs7Ozt7e3u\n7u7v7+/w8PDx8fHy8vLz8/P09PT19fX29vb39/f4+Pj5+fn6+vr7+/v8/Pz9/f3+/v7////i\nsF19AAAACXBIWXMAABJ0AAASdAHeZh94AAAgAElEQVR4nO3de2AU5b3w8V8CJELAQFC5eOMu\nVFAQkHJEkYsVLdZ7taCiINBIFS8otva8Vm1FG61az6k9oIdX7BG5CLSkavHKa4v2CBZE5SCo\nR48X8EQiBAwhyc47s7vZnd3sE2aHZ+fZbL6fPybZzMyzD+t+3Us2M2IBOGRiegJALiAkQANC\nAjQgJEADQgI0ICRAA0ICNCAkQANCAjQgJEADQgI0ICRAA0ICNCAkQANCAjQgJEADQgI0ICRA\nA0ICNCAkQANCAjQgJEADQgI0ICRAA0ICNCAkQANCAjQgJEADQgI0ICRAA0ICNCAkQANCAjQg\nJEADQgI0ICRAA0ICNCAkQANCAjQgJEADQgI0ICRAA0ICNCAkQANCAjQgJEADQgI0ICRAA0IC\nNCAkQANCAjQgJEADQgI0ICRAA0ICNCAkQANCAjQgJEADQgI0ICRAA0ICNCAkQANCAjQgJEAD\nQgI0yImQ9s2tNTuB6rk1ZidQc1u12QnUzt1ndgJ1c6uMXn9OhPS+7DA7ge3yqdkJfCIfmp3A\nl7LF7AQq5B2j109IOhASIRm9dk0IiZAISQNCIiRC0oCQCImQNCAkQiIkDQiJkAhJA0IipNwL\n6Z31gVsma4K/UrdV8mezEyiXP5qdwF9kudkJvCzPuC5t93RX3dnUgO+md7fXHtJaAUzL3+3l\nvjq8yTHeTut+rz2kFwp3BW+ngetkAtk7gb/J/3q5rw5qMqS/pXW/z0BIukcE0rSZkIBDR0iA\nBoQEaEBIgAaEBGhASIAGhARoQEiABoQEaEBIgAaEBGhASIAGhARoQEiABoQEaE
BIgAaEBGhA\nSIAGhARoQEiABoQEaEBIgAaEBGhASIAGhARoQEiABoQEaEBIgAaEBGhASIAGhARokImQqj92\nKMclJOSeTIT0bPiH06KXlo0oHvuWezUhIfdkIqSy7itt0fNhludPWnB68TbXakJC7slESKXj\nXBfGjA1Ze7rNdf2EkJB7MhHS2dOt2obvK2S+vZzRx7WakJB7MhFS39H98nqW1YW/3xRe+XBh\nKL6akJB7tIR0xdywrZFt69qUPFpeKneHL6yR9+3lItkdH4uQkHu0hDT20rANkW33L95uL6e2\nCz8krZEt9vJJqYiPRUjIPV5DKuyglvIXsivkA+fLRllnLx8p4KkdcloGQvr0+Xp7uUp2OBcq\nZKG9nNXLtZ6QkHsyENJGWW0vpx0feRA6c6L9ZK/Hra71hITck4GQQhNKyp6+RpZb1gPjqqzy\nvJvLLyje7lpPSMg9mXiNVFnavWjkc/Y306TSspYMO3zMevdqQkLuydibDWqEhNxDSIAGhARo\nQEiABoTkWeiOGSq/CmQCyGKE5FmldDkmtSPzA5kAshgheVYpZ12a2mhCavEIyTNCghoheUZI\nUCMkzwgJaoTkGSFBjZA8IySoEZJnhAQ1QvKMkKBGSJ4REtQIyTNCghoheUZIUCMkzwgJaoTk\nGSFBjZA8IySoEZJnhAQ1QvKMkKBGSJ4REtQIyTNCghoheUZIUCMkzwgJaoTkGSFBjZA8IySo\nEZJnhAQ1QvKMkKBGSJ59IyVdUuvUKpAJZLEvJ4xXuc303IJBSN797j6Vfw9mAtnrTenWPbXi\nAabnFgxCggZvyslDUjuWkNwICU0hJEKCBoTkNaQOitfZtqMIqcUjJEKCBoRESNCAkAgJGhAS\nIUEDQiIkaEBIhAQNCImQoAEhERI0ICRCggaEREjQgJAICRoQEiFBA0IiJGhASIQEDQiJkKAB\nIRESNCAkQoIGhERI0ICQCAkaEBIhQQNCIiRosF5aKeQPMj23YBASNAi9+qLKFtNzC0bGQ6r+\n2OG+EkJC7slQSPVjJ0e/e1Yc01zrCAm5J0Mh/U4aQirrvtL2tmsdISH3ZCakj9q3bwipdFzy\nSkJC7slISPVjrjytIaSzp1u1iWsJCbknIyH9a9evYyH1Hd0vr2dZnWutz5Aav2vRgJBgnJaQ\nThga9kp044+KVlkNIdW1KXm0vFTudo3lM6TG71o0ICQYpyWk2/4tbEdk2/oxdkQNIe1fvN1e\nTm3nekjyGVLjdy0aEBKMy8BTuyc6ba+sHHFp5YH4j1bIB/ELPkNq/K5FA0KCcRkI6RaJWOlc\n+PT5enu5SnbE1/sMqfG7Fg0ICcZlIKTtr9oGjn81PPBGWW0vpx0fiq/3GVKjdy3qdkUtJySY\ntlm2Ndwfm9os7Y8IhV8jPTCuKjShpOzpa2S5a5W/kBq/a3GVNMjzNSKgz2uxe6M80sRm/kKa\nJpVWZWn3opHPuVf5C6nxuxYV66MeLfA1IqDPZnmp4f74bRObZcunvxPetWjAayQY12z+jCLF\nuxYNCAnGNZuQUrxr0YCQYFyzCSnFuxYNCAnGeQ3pyD5KvYN5jdT4XYsGhATjmk9IaoQE4wgJ\n0ICQAA0ICdCAkAANCMm7Lz5U2RnMBLLYJ8rbpsnPcKZrp/JqvtR5NT4Qkmf7WotKUSATyGLv\nKG8aOUXn9bRTXk3rfTqvJ32E5FmlzFuc2s/zA5lAFntTnlTcNlO1Hvs7Tx3sNzqvJ32E5Fml\nPLI6tV8RkjyruG1KCcmNkAipKYRESJ4RkhohEZJnhKRGSITkGSGpERIheUZIaoRESJ4Rkhoh\nEZJnhKRGSITkGSGpERIheUZIaoRESJ4RkhohEZJnhKRGSITkGSGpERIheUZIaoRESJ4Rkhoh\nEZJnhKRGSITkGSGpERIheUZIaoRESJ4RkhohEZJnhKRGSITkGSGpERIheUZIaoRE
SJ4Rkhoh\nEZJnlXLN3NQuz62Q1i9VWVGr2OVNuUVx24w9Wjna9vSnRkguzTWkmpN6qXw3kAkE5eTWhakV\nyBuKXf67n+qmOSJPMVhh/pXpT62JkL4+pH/zISMkJBt0zJDUBstf0x7sd4cpBhvSeXL6U+MR\nyYWQshwh+UBISEZIPhASkhGSD4SEZITkAyEhGSH5QEhIRkg+eA3p6JOVTiKk3EJIPhASkhGS\nD4SEZITkAyEhGSH5QEhIRkg+EBKSEZIPhIRkhOQDISEZIflASEhGSD4QEpIRkg+EhGSE5AMh\nIRkh+UBISEZIPhASkhGSDxkKqX5s7FZaNqJ47FvudYSU5QjJhwyF9DtpuJXK8yctOL14m2sd\nIWU5QvIhMyF91L59w600ZmzI2tNtrmslIWU5QvIhIyHVj7nytOitVCHz7eWMPq61hJTlBnXo\nktpRfkJqrRisS1vTIa1THBzW9l7ag2kJ6YVdYaGGrf+169cNIW0KV/ZwYSg+FiFluXvGq5y3\nI+3BNpytHO0P6U9tqnKwSXVpD3Zdx8EKRfelPZiWkKKWRjf+qGiV1RDSGnnfXi6S3fGxCAlZ\n4bpRigP/rz5hXtqDaQlp2Ydh0f8n1I+xI4qHtMVePikV8bEICVkh+0JKfI30RKftlZUjLq08\n4FzYKOvs5SMFPLVDtsn2kG6JPtNb6VyokIX2clYv13pCQlbI9pC2v2obOP7VyMBnTrSs/T1u\nda0nJGSFbA8pLPwa6YFxVVZ53s3lFxS7zylFSMgKzSekaVJpWUuGHT5mvXsVISErNIuQmkBI\nyAqElISQ4AchJSEk+EFISQgJfhBSEkKCH4SUhJDgByElIST4QUhJCAl+EFISQoIfhJSEkOAH\nISUhJPhBSEkICX4QUhJCgh+ElISQ4IeRkHqNUjqNkNAcEVISQoIfhJSEkOAHISUhJON2nKM8\ncuNPTc9NycgBIgkJTXhTRp2RWp8BpuemtF55xOKfbk17MEKCBm/Knb9M7QfZG5JWhAQNCImQ\noAEhERI0ICRCggaEREjQgJAICRoQEiFBA0IiJGhASIQEDQiJkKABIRESNCAkQoIGhERI0ICQ\nCAkaEBIhQQNCIiRoQEiEBA0IiZCgASEREjQgJEKCBoRESNCAkAgJGhASIUEDQiIkaLBeDmub\nWpuBpucWDEKCBvWrl6psMj23YBASoAEhARoQEqABIQEaEBKgASEBGjSHkObcEv6ybETx2LdS\nrSckGNcMQtpWEg6pPH/SgtOLt6XYgJBgXNaHtHZUawmHNGZsyNrTbW6KTQgJxnkNacAEpe9l\nNKTNZWXhR6QKmW8vZ/RJsQkhwbisD8nW2wlpU/haHi4MxX5cvytqOSHBtM2yLXp3rGxqM/Mh\nrZH37eUi2R378RRpkJfuiIBmr8XujfLbJjbzH1L1xw73w57fkLbYyyelIvbjr9ZHPVqQ7oiA\nZpvlpYb7494mNksrpD2ze7QbvDh64dlwpNNcq/2FtFHW2ctHCkKN1/MaCcZl4jXS5OL7F18s\nz0UulHVfaXvbtdpfSBWy0F7O6pViPSHBuAyEVCmPWVZdn8mRS6XjksfyF5J15kTL2t/j1hTr\nCQnGZSCkraOdVzNjL45cOnu6VZs4ls+QyvNuLr+geHuK9YQE4zLz9nftV4sPi75I6ju6X17P\nsjrXSp8hWUuGHT5mfar1hATjtIR0T+SPir+ObT5PZFbkXYG6NiWPlpfK3a6x+NAqco+WkLr3\nCnshtvkXa+8tjHyYZ/9i58nY1HauhyRCQu7J2Ccb5hYciF9YIR/ELxASck8GQlp8ovPws0Cq\nnAufPl9vL1fJjvh6QkLuyUBIf5NX7OXkHuELG2W1vZx2vOv3qISE3JOBkOpHdn1o8VTn96cP\njKsKTSgpe/oaWe5aT0jIPZl4jbRjytFFw5bYj0HTpNKqLO1eNPI592pCQu5pDn9GcT
CEBOMI\nCdCAkAANCAnQgJAADbIjpOrPq9MaIhEhwTjjIYXeuvvMjiLS8cy7Ux790QNCgnGGQ6p7api0\nGnz5rJ/NunxwKxn+hzrlXk0gJBhnNqS3h7WfsqbhWBF711zdfvg/0hoqgpBgnNmQjro/8Ygr\ne+87Kq2hIggJxpkNaXejlY1/cnCEBOOMv9kQVf9RU0cDaxohwbgsCGnt1e9ZFUOk1Wxf7zRY\nhIQsYD6k5/PkDet6GTdUnkhrmDhCgnHmQxrVbm19fZdh1v6SkWkNE0dIMM58SJ0uc2bxoGWd\nd0Raw8QREowzH9Lh51vWb2SDZV3dLq1h4ggJxpkPaWjHqgPf6V5v1fTtn9YwcYQE48yH9IQc\n31N+Zr08VH6e1jBxhATjzIdUf1fnVj+osu6Uid+kNUwcIcE48yFVWaEa+8uHH6c485E3hATj\nzIdUMK7sHd8NhRESjDMf0gki0n3qkq9Tb+wFIcE48yFZO5Zdf3Ke5H/3F2+kNUwcIcG4LAjJ\nsWv1zSXi92AOQYX04lKVtcFMANnLa0iDL1a68JBDqll337nFIp3SGiYuoJD2SJuC1Fq3CWQC\nyGLmQ3rpzjGHiXS+8OF/1Kc1TFxAIVXKWZemNjo/kAkgi5kPyY7o0n/Z7DciByHBOPMhtZL8\nobOXfZHWEIkICcaZD2nvy3ed1V6k11Xz30trmDhCgnHmQ3LUbvjtDztl+7t2hAS17Ahp57JZ\nJ4pk+Z9REBLUzIcUiUgG3/by/rSGiSMkGGc+JDuiLlcs2pF6W08ICcaZD2nsfb5/gRRFSDDO\nfEiOb949hM+sEhLMy4KQvvn5kfbTu5Kf+v27PkKCeeZD2nuCdL3wuou7Sf99aQ0TR0gwznxI\nt8jtztt1++fIrWkNE0dIMM58SINPivx9bP2JQ9IaJo6QYJz5kNpdGf3miqK0hokjJBhnPqQT\nh0e+hoYNTGuYOEKCceZDKpXfOM/tQr+R0rSGiSMkGGc+pF3HyqCf3POTQXLsrrSGiSMkGGc+\nJOvza1uJSKtrP09rFBdCgnFZEJJl1Wx9dWtNWmMkICQYZz6k2MEha+5Na5g4QoJxhkP6x9lH\nFn73RWvPw1dNPLULf9iHZstsSJtbi7SVVq+dKo7vpDVMHCHBOLMhXSQ37QltG1EkMzft2FGd\n1iguhATjzIZ0/HHOmczflH6H9BdJgYXUo39qxxJSQB6Zq3L/oZ2I4ZCZDSl/grPcJz9Ia4Bk\nAYVU98PxKlcHMgFYeaLk+69w9DAbklzs/uIXB9FvMQjJhZDgFyG5EBL8IiQXQoJfhOTiDumo\nsx3RL2enNUwcIbUYhOTiDilRWsPEEVKLQUgurl62JUprmDhCajFaVkh7ZvdoN3hxw6VlI4rH\nvuVe7feBR4mQWoyWFdLk4vsXXyzPRS6U509acHqx+8GGkOBXiwqpUh6zrLo+kyOXxowNWXu6\nzXWtd4V081dJ17PzJh//BkJqMVpUSFtHb7GXYyNvaVfIfHs5o49rvSuk6R1u+kf8M1KhDbM7\nzPDxbyCkFqNFhWSr/WrxYZEXSZvCKx8udH2k0P3Ubu0w6T9z4bqtX2xdt3BGPzn1dT//BkJq\nMXI8pPNnhG2ObT5PZFYknTXyvr1cJLvjYyW8Rgqtu+rI6C1x5FVv+vs3EFKL0fxDGnG10lWN\nQ/pi7b2FkZdFa8R5nvekVMTHSn6zoX7Tol/f/utFm3z/KQUhtRg5HlKq3yPNLTjgfNko6+zl\nIwWKp3ZaEFKL0aJCWnyi89d6C6TKuVAhC+3lrF6u9YQEv1pUSH+TV+zl5B6RS2dOtKz9Pdwn\nmiAk+NWiQqof2fWhxVOdR6IHxlVZ5Xk3l19QvN21npDgV4sKydox5eiiYUvsV0XTpNKylgw7\nfMx692pCgl8tK6SDICT4RUguhAS/CMklZUj1H+1NaxA3QmoxCMklOaS1V79nVQyRVrPr
0hom\njpBaDEJySQrp+Tx5w7pexg2VJ9IaJo6QWgxCckkKaVS7tfX1XYZZ+0tGpjVM3AutlUduvC/9\n0eYqBzvva58TTM8dyglMTP6rkwZfnqvc5xeqq/mDcpfew1Vrjj5DseKM7srRfq6awDPKXb73\nV9U+6o5E9R/n7bOV1/O46mrmKXc55xPFLuZD6nSZM4sHLeu8I9IaJu6F/DMUevpos9fQS1I7\nX/7hc4Lp6X+KYgIXyN8Vu7wuJyiOptzlJNXVTOmhuJpLWg1WTmCMYs1Q6aeawADVBKYfq5pA\npwdU+zTxiFSp2OXxItXV9L5UdTXfHajaJ/8vil3Mh3T4+Zb1G9lgWVe3S2uYuBfyf6lwlp+Q\nblid2jNBhXSdYgLLmwjpEsXx/QerQxqvuJrVBTOUE/i1Ys0NcpFiAqeoQzpTNYE+fkJSPbV7\nvJvqar6vDmmKap/W2RvS0I5VB77Tvd6q6ds/rWHiCImQCMl6Qo7vKT+zXh4qymfTB0FIhERI\nVv1dnVv9oMq6Uyb6fd+FkAiJkGwh50TMH37s+wQ3hERIhGQ9+h9p7Z0CIRESIVlFndPaOwVC\nIiRCsm4Q5a/fPCIkQiIkq/4Xxyz4r4pKR1rDxBESIRGS1blzfsNtkdYwcYRESIRkzYxLa5g4\nQiIkQtKAkAiJkBw17637yvdvkQiJkAjJ8cWUw0RWlp+1OeXWHhASIRGStbOPDJwkK98o7LQ9\n9fYHRUiEREjWDXJ3/cey0trQampaw8QREiERknX8kJDlhGQN75nWMHGEREiEZLW70oqEdBV/\n2BdGSITkSVJIwwfUhUMKDRua1jBxhERIhGTdJT+pdkKaL7enNUwcIRESIVkHRkrXCTL+VBn4\nbVrDxBESIRGSZVWXHW3fDp3v2JPWKC6EREiEFLbn3UM5ZBwhERIhRR3Ssb8JiZAIScOxv/Pb\nKrQZlf5ofQvbp1Yk7/icYHpOVE9gvWKXdVKg0GqI6mquaa24mvaimkB7aatYUShtVBMYqJrA\nTOUE8h9S7dNaHVKVYpeFeaqraXOZ6mpOK1DeAi8pdjEfkoZjf7dZquLj83tvKQf7k+/zrqdl\ng3ICf1T9r6Z2lXIf5aPofyt3KVuoWnPPM4oVz9yjHO1t1QQ+Ve6ybJdqn9eU+7yo2mXPcuU+\nys+kbVbusqJGtYvxkDQc+5uD6MM0ryGdfp3Sj40f+5uQYJr5kDQc+5uQYJr5kDQc+5uQYJr5\nkDQc+5uQYJr5kLwe+7v6Y0eq2RISjDMf0sGP/T3nFmf5bPi3BdNSrCckGJcNIR3EtpJwSGXd\nV9pS/VaCkGCc2ZC6xPS/4uXUW68d1VrCIZWOUw1ISDDObEhFMfmSd2/qCZaVRR6Rzp5u1aYe\nkJBgXLY8tdu3umue6qNsvcMh9R3dL69nWaoPyRASjMuWkCzrFZms2CEcUl2bkkfLS+Xu+I+n\nxj6vmJfW9QP6vRb/+Oy/NLFZEG829FR94j8c0v7FzkcMp7aLPyR98WLUvIK0rh/Qb7Msj94d\nX27qL1SDCGmc6iNCkad2YSvkg8breWoH47LnqZ017BjFinBInz7v/BHDKtnReD0hwbjsCami\n3RjFDuGQNspqeznt+BS/tiUkGJc1IdVMVr5IC4cUmlBS9vQ1sjzFekKCcWZDmh1zZS8ZcECx\nQ+Q1UmVp96KRz6VaT0gwzmxI7r+5v8j3gYQICcaZDSl+dJbXd6c1RgJCgnFZ8xrpEBASjCMk\nQANCAjQgpGS7P1T5QufVILcQUrKTlMfyzNum83qQUzIeUuNDLWR5SL3UR8UN5tjfaI4yEVLd\ngwPaDXw0+jHtxodaICTknkyENC/vhqdL8+6KXGh8qAVCQu7JQEihTtPt5fVtI38Y3vhQC4SE\n3JOBkD6TVfZymXwUvtT4UAuEhNyTgZCqtzgng72xYF/4UuNDLRASco+WkLr3CnvBtcPCVr
PD\nXxsfaoGQkIO0hBQ9y1T849ufXSKTImdkanyoBUJCDsrI75GWduy1IuEHCYdaICTknkyEtEym\nVjd8n+JQC4SE3JOBkGqOmhY/sEKKQy0QEnJPBkJ6WUrLHHutB8ZVpTjUAiEh92QgpPnRu92X\n1jSpTHGoBUJC7uHT38kICT4QUjJCgg+ElIyQ4AMhJSMk+OA1pPFzleYQElo8QkpGSPCBkJIR\nEnwgpGSEBB8IKRkhwQdCSkZI8IGQkhESfCCkZE2E9OulCn/VOYFNqmtZurpe5/UovaOcwJ/q\nDr53Rv1VObVXDM+MkJJN6KTQUdoUpNa6jc4JnFjYPrUi2aDzepQGq/9X8mYgE1Br3Vrxn6CN\nVJmdGSF5VilnXZra6Hyd19P/utWpLZe/67wepUHqkLQ+9PqQN1rxn+As+cbszAjJM0IiJDVC\n8oyQCEmNkDwjJEJSIyTPCImQ1AjJM0IiJDVC8oyQCEmNkDwjJEJSIyTPCImQ1AjJM0IiJDVC\n8oyQCEmNkDwjJEJSIyTPCImQ1AjJM0IiJDVC8oyQCEmNkDwjJEJSIyTPCImQ1AjJM0IiJDVC\n8oyQCEmNkDwjJEJSIyTPCImQ1AjJM0IiJDVC8oyQCEmNkDwjJEJSIyTPKqVH/9SO1RvSKZek\ndgEh5R2n+E/QI6CQHlNVMFW+8rI/IVlW7cXjVa7QeT23K6/mnJ06r0fp5+oJfBHIBNQmK6d2\ncTBHUy7o2CW1jrLVy/6EBNgKzlA8t/wnQgI8IyRAA0ICNCAkQINDDmniL5XuIiS0FIQEaEBI\ngAaEBGhASIAGhARoQEiABoQEaEBIgAaEBGhASIAGhARoQEiABoQEaBBoSHUPDmg38NGGv6Ff\nNqJ47Fvu1YSEZivQkObl3fB0ad5dkQvl+ZMWnF68zbWakNBsBRlSqNN0e3l929rwpTFjQ9ae\nbnNd6wkJzVaQIX0mq+zlMvnIuVAh8+3ljD6u9YSEZivIkKq3fGsvbyzY51zYFF75cGEovp6Q\n0GxlNqRlH4a5D9G3sNXs8Nc18r69XCS746sICX5NVR4gclJAB4g88pjUOusIKWppbPPPLpFJ\nNeHv1sgWe/mkVMTHIiT4lac+mnIwhyx+WHUIoKni6TC4TYf0wq6w2NO3pR17rYh+u1HW2ctH\nCnhqBw2Mh6Tk9SD66bz9vUymVjd8XyEL7eWsXq7VhAS/WlRINUdNcz3+nDnRsvb3uNW1npDg\nV4sK6WUpLXPstR4YV2WV591cfkHxdtd6QoJfLSqk+dF/2ZfWNKm0rCXDDh+z3r2ekOBXiwrp\nYAgJfhGSCyHBL0JyIST4RUguhAS/CMmFkOAXIbkQEvwiJBdCgl+E5EJI8IuQXAgJfhGSCyHB\nL0JyIST4RUguhAS/CMmFkOAXIbkQEvxq/iFd+BulMkJCQAjJhZDgFyG5EBL8IiQXQoJfhORC\nSPCLkFwCDGlZL6Xf6p4FAtBaHVKV2Znldkj3FvZUKJqlexYIwCtLVV4wPLMcD6loiEJHQoJO\nhARoQEiABoQEaEBIgAaEBGiQ9SHVPTig3cBHndOxLRtRPPatVJsQEozL+pDm5d3wdGneXZZV\nnj9pwenF21JsQkgwLttDCnWabi+vb1trjRkbsvZ0m5tiG0KCcdke0meyynLOAfhRhcy3v5nR\nJ8U2hATjsj2k6i3f2ssbC/ZtCl/Lw4WhxtsQEozL9pDCFraaba2R9+3vFsnu2E+vjX1eMU+1\nIyEhIK/FPz77uyY2MxnSZ5fIpBo7pC32909KRezn//Ni1LwC1a6EhIBsluUN98em/qDDYEhL\nO/ZaYX/ZKOvs5SMFPLVDNsr6p3bLZGq187VCFtrLWb1SbEJIMC7bQ6o5alr0MejMiZa1v8et\nKbYhJBiX7SG9LKVljr1Wed7N5RcUb0+xDSHBuGwPaX70nZAvLWvJsMPHrE+1DSHBuGwPyQtC
\ngnGEBGhASIAGhARoQEiABjkeUruTFIqv3aWivJ69yl326f4npTuBvcFMAEq5HdLD6gNzqi1T\nDPZtoXKXjrr/SSkdaKucQPv6QGYApdwOqfZDlXMG36xQ/IRisEoZdW5qI/J1/5NS2ieTFHO+\nTGoDmQGUcjsktUu+qzoBYYk6pLMuTW10UCH9WDHn6YRkGiEREjQgJEKCBoRESNCAkAgJGhAS\nIUEDQiIkaEBIhAQNvIb0owVKjxESIbV4hERI0ICQCAkaZDyk6o8d7ishJJ8IKYtlKKQ5tzR8\n92z408nTXOsIySdCymKZCWlbSSyksu4rbW+7VhKST4SUxTIR0tpRrSUWUum45LEIySdCymKZ\nCGlzWVn8Eens6VbSf2RC8ik4fkUAABE7SURBVImQsliGXiP1joXUd3S/vJ5lda51hOQTIWUx\nLSHd9m9hO+Lbx0Kqa1PyaHmp3O0ai5B8IqQspiWkE4aGvRLfPhbS/sXO0bqntnM9JBGST4SU\nxTL+1C5shXwQv0BIPhFSFst0SJ8+7xzfZpW4nvYRkk+ElMUyHdJGWW0vpx3vOtEeIflESFks\noyE9MK4qNKGk7OlrZLlrHSH5REhZLKMhTZNKq7K0e9HI59zrsiOkAdcodFCHNPSM1E4KKqTz\nFHM+N3tD2vei0ufpj7ZBOdjf9U89LS31099zOim9oNhl/3HKXU44pPl7VdtDOYHeKc5RnR3m\nK48OK1elP1ob9WhV+ueejpYaEgLyO/Vdf3L6o+WpR/tG/9zTQUjIKEJKQEjwh5ASEBL8IaQE\nhAR/CCkBIcEfQkpASPCHkBIQEvwhpASEBH8IKQEhwR9CSkBI8IeQEhAS/CGkBIQEfwgpASHB\nH0JKQEjwh5ASEBL8IaQEhAR/CCkBIcEfQkpASPCHkBIQEvwhpASEBH8IKQEhwR9CSkBI8IeQ\nEgya8rTSIkKCEiElICT4Q0gJCAn+rBuvND/90X6kHOxCwwdtJiRAA0ICNCAkQANCAjQgJEAD\nQgI0ICRAA0ICNCAkQANCAjQgJEADQgI0ICRAA0ICNCAkQANCAjQgJEADQgI0ICRAA0ICNCAk\nQANCAjQgJEADQgI0ICQku/9SlUlfKXapmKTcZ57Oqf1EeTXX1uu8nvQREpINKjoitc7yV8Uu\nb0rPXqkdMUDn1PKKFVPryCGLDx0h6TXomCGpDW4ipIsUjxSn6A2pr2Jq/Qnp0BGSXoTkAyEh\nGSH5QEhIRkg+EBKSEZIPhIRkhORDhkKac0vs22Ujise+5V5HSFmOkHzITEjbSmIhledPWnB6\n8TbXSkLKcoTkQyZCWjuqtcRCGjM2ZO3pNte1mpCyHCH5kImQNpeVxR6RKsQ5U+iMPq7VhJTl\nCMmHDL1G6t0Q0qbwyocLQ/F1hJTlCMkHryHNWK20UsZGbrMN8e1jIa2R9+3lItkdX0dIWY6Q\nfNAS0hVzw7bGt3eFtMVePikV8XWElOUIyQctITXx1G6jrLOXjxTw1K75ICQfMh1ShSy0l7N6\nudYRUpYjJB8yHZJ15kTL2t/jVtc6QspyhORDRkN6YFyVVZ53c/kFxdtd6wgpyxGSDxkNaZpU\nWtaSYYePWe9eR0hZjpB8yFBITSGkLEdIPhASkhGSD4SEZITkAyEhGSH5QEhIRkg+EBKSEZIP\nhIRkJ7cuTK0g/HmvVP5TihQKT9Q5tVZtFFNrI3t0Xo/SGYrjYPY6Rj7zsj8htSDr/k3lyQOK\nXWoXKfdRPYj5Uq68mhU6r0atYOT5qY2QrQffm5CAsIKrf5naJEICPCMkQANCAjQgJECDHAip\n+mNHqjfrCQlBab4h1T04oN3AR+ss61lxTEuxCSEhKM03pHl5NzxdmneXZZV1X2l7O8UmhISg\nNNuQQp2m28vr29ZapeNU2xASgtJsQ/pMVtnLZfKRdfZ0qzb1NoSEoDTbkKq3fGsvbyzYZ/Ud\n
3S+vZ1ldim0ICUFptiGFLWw126prU/JoeancHf/pTGmQl/aIgC9NhBS7N/6+if1NhvTZJTKp\nxtq/2DkS0dR28YekT16MmleQ5oiAT02EtLDh/ririf0NhrS0Y6/4J3tXyAeNt+CpHYLSfJ/a\nLZOp1c7XT5+vt5erZEfjTQgJQWm2IdUcNS1y2PCNstpeTjs+1HgbQkJQmm1IL0tpmWNvaEJJ\n2dPXyPIU2xASgtJsQ5offSvkS6uytHvRyOdSbUNICEqzDckLQkJQCAnQgJAADQgJ0ICQAA1y\nPKRdwdtp4DqZgPEJFFxyS2rnewzp6sVKT5kOaa0Axnk60urwJofYkNb9XntI1jvrA7dM1gR/\npW6r5M9mJ1AufzQ7gb/I8uCvtPypuMfkXtellzzdVXc0Nfi76d3t9YdkwPupPvQXpO3yqdkJ\nfCIfmp3Al7LF7AQq5B2j109IOhASIRm9dk0IiZAISQNCIiRC0oCQCImQNCAkQiIkDQiJkAhJ\nA0IiJELSYN9cxdEqg1I9t8bsBGpuqzY7gdq5+8xOoG5uldHrz4mQANMICdCAkAANCAnQgJAA\nDQgJ0ICQAA1yICT1iaFbFG4Go7dADoSkPjF0IObcEv6ybETx2LdMzsDUzRA7Qbepm8DDGcID\nkAMhqU8MHYRtJeG7cXn+pAWnF28zOANTN0PsBN2mbgIPZwgPQA6EpD4xdOatHdVawnfjMWND\n1p5ucw3OwNDNED9Bt6GbwMsZwgOQAyGpTwydeZvLysKPBxUy317O6GNuBqZuhtgJuk3dBF7O\nEB6AHAhJfWLoQPR27sabwkdBe7gwxQmjApqBqZshdoJuUzeBlzOEB6D5h9T4xNDBCt+N18j7\n9nKR7DY2A6M3g3OCbqM3QeozhAep+YfU+MTQwYqG5Pw5zpNSYWwGBm+GyAm6Dd4EqjOEB6n5\nhxSR8sTQgQjfjTfKOnv5SIHBp3ZhJm6G6Am6zd0EBz9DeACaf0hNnBg6EL0jbzYstJezepmb\ngbGboeEE3cZuAg9nCA9A8w+piRNDByLyeHDmRPvZVY9bzc3A1M0QO0G3qZvAyxnCA9D8Q2ri\nxNCBiIRUnndz+QXF283NwNTNEDtBt6mbwMsZwgPQ/ENq4sTQgYi+Qlky7PAx603OwNDNED9B\nt6GbwMsZwgOQAyEB5hESoAEhARoQEqABIQEaEBKgASEBGhASoAEhARoQEqABIQEaEBKgASEB\nGhASoAEhARoQEqABIQEaEBKgASEBGhASoAEhaSS9/e454+bwl9d+dExhjwkvJK2cLOkfG373\nUf/wOxf4QUgahUNaLU+lvePrHb6yl/U3iBw9/jsiMxPX+gnJum+4sdMKtEiEpJHfkELDb3K+\n/LP0ch5GNvWTJxNW+wqpqn36PcM/QtLIb0hviBPQB62PiJz/9B0ZnLDaV0jWlOGmjj3bIhGS\nRk5IZzsHK7SDOHDPiKKeNznP2GYV77/xhCMu2LGvtE/7MZvtH9QvPLW45AzXK6GrTnDu87fJ\nfdHLl5+807J2lQ4sGjJnnxUJ6ftFzopamZxivJnFtb847rCBTyQOvUYMndK2ZSIkjZyQ1syW\n6Qurrf3/JP2vGCx9v7Tv+EXnDLn1DDl5+HfmnCV97Vcud0vx+Ze2y1/bsFv9keEXRWPkC9dY\nnx8nw64cJP2/aRxS8ngzi685+rqZRfJswtB7842dMqolIiSNXE/tHpBZdVboLrnavuPL92vt\nl0EyqtoKjZePrFDn46ssa62zKmJT+DwO1rHt3M/FrpUH7RdPt8k/Nw4paTxrppxgP/K9Jpcn\nDj34zOD+5SAkjVwhHd3VOdVI/YltD9h3fOfEQbfIn+3lPfbzrZr8XvZrnvo33mvYbVF4g9q8\nfq6haloNdE5SUt31yBQhJY5nh/QH+9tQ0bjEoS/vGMg/GmGEpFE8pD0y4WPHJHnPvuPbr3is\nO2
SrvSxz7vjnyYkPvVsf3+3X4VVWp06uobbKT8JfL5RvGoeUNN7MyLm1Oo9LHHqWVGf434s4\nQtIoHtK70mCdfYd23oy7Q7ZZ0Tt+1e1dRbrOjp0j8vbIi6MR8nX0B3+Z+fyrck/42+vkXVdI\nByIhJY03U3Y5K52Q3EP/TD4P6N8NQtIqHtLXMn5lxFeN7/j2k6/1D54iQxoelKKPSNPk99Ef\nXCQvNTwiXWzXFQ/pc0VIlc5KJyT30DwiBYmQNHK9RioZEf7Jm+WhRnf8D+982f4uNNZ5nyAs\n8hrJ+pt0+yZ8+cO2hd/WtBrkvPWwv3tJ9DVSG+eDCqsOElLC0LxGChIhaRQN6XHLuZ87yw2F\n4xvf8T+WYTV2IkNbVUV3i75rZ10lfTfYX7YOlXudd+0esh9e5sjtkZCmyF8sa9fJBwkpYWje\ntQsSIWkUDukVGfTTKmvPiXLqlFNbdXyn8R0/9H3pN/W8ErmhYbfo75Gs6gtFuo0b1Fp+YD/6\nfH6snHrlwNjvkVbLYVNLjx17XNMhuYfe24rfIwWIkDQKh1Rz0WGdv7asb28b3LbH1c6dvdEd\n/5uf9mtbMmJB/EOlkU822M/JVp7XpU2vs1eEL+0qPbHdyXP2WtGPCD016LCuN+3rfZDXSK6h\n+WRDoAgpC7whb+sf9OphfNYuQISUBULDb9Q+5t4OfPo7SISUDf5fh526h7x/GH+PFCRCygoz\nbtI84O6jMvBsEWqEBGhASIAGhARoQEiABoQEaEBIgAaEpENZ+E+P2p186VLn0wRTRHwendHn\nnp9NOaHtwB/vcL49PvJXUCM0bQyvCEmHstjf8c0MBR/S1uLwVXd837K+zTtYG2ltDM8ISQc7\npJF33nnDUPte+bRl/fPJJ/+Xv3H87XmOyDVLrrWn4PxFRoc7HQv0bAzPCEkHO6Rb7C+h20X6\nBf5R0f358k/2lzNEdlpLD/bwktbG8I6QdIiGZB3oKvK2dZk4f+IwVFpVTOrSc+qOnZOO7XSO\ncyBH6/OpJxYNud45wMI4kc/mnNR+2Arnx+vOP66w92XOR3oie1p1j4w5svPpv9qv3nCLyAkN\nV/6+yG32lzkir1q/dP7yz+UFkfZfWbs6ivzx4BvjEBCSDg0hWaUiz8RCyuvnvAI5pbez7Fpp\nWX8vCb8kOWZLuI+R4QvPW9afIq9U8tc0hLRvROQnA/5XuaE7pJ2PPbbe/nKNyH9aV8qlEzse\nc9HW2MyuErnJsh8oL/eyMfwjJB1iIf1K5J5YSNLt3h+3Eimcc8eRTl/1w+TwRRsfzJdzw320\nnf3Qd0W+Z1nfkbyH1vzqMBnQEJJ9v+//+JP2/tOVG24tLh6WOIVPiuSIb61TI621e6PhxxVH\nSuFbbaXzTi8bwz9C0iEW0u9FfhwP6a+W9QNxjuj9G5FfWWtF5tmbTBV5z+njPyxrV2s5zjqQ\nJ90+s6x/mXL1gcietYXSwX4s2ttV8ipVGzbydneR31uhYjnsoeftaxgaO2zeMyIdJHwESQ8b\nwzdC0sH9iPTLeEj2HX62yCuWtdJ5oLIj69q7d+/OIsucPpzHiGOlsxU6xn62dto9bzn35vCe\nH0jktcsNIn9XbZjsT0UiP7Ws2qeesh9eQqeJ/HfDmtB59oPOOSFvG8M3QtLB/RppSTwkKxzS\n69GQbo/9suk+pw/nOAvH231Ybw4K//C4JdGQXoq8H2Dd74yl2DBRyH6+2Mb1JvYd4ZdUUa/Z\nu7zsdWP4RUg6xN616yayURXSQ7G3zqyEPqzQhv8z2HkTYUviI9Js5zCtig0ThH4sctTrznf/\n8+qrzolkbgtfaUT96fYep9V72xi+EZIODb9H+plI/5AqpD+L/Mz+yWevv17h7uM/Z816zbI+\nPkfk/8ZeIx0eeY0klaoNE/yryNGfhr9b5bwWsw4MiOzUsLKLyG+9bQzfCEmHyCcbZg8X59mY\nKqT9
vaXo8fVP95bDv3b38a5Iz8dfe2aEyN+ie84VGfDvi2Lv2qXa8IPOnWO/TK21HwbPDX9E\n4ZMd7aTwxvtHSPwXRJ+0lw4fdJSij71sDP8ISYf4Z+2uDylDsv7SNrxJm6UJfYRmRncdF33X\nztobfV960C7lhu7fI70Zu/LXrSci3/T73+i60ASRu51XW98LHXxjHAJC0qHh098/DB/bURWS\n9cFlfQt7TNloJb5Gqntm3HEFRwx9sCr2yYbah0cfWTLqvhr1hu6QFrvasNade3SHEb+IHT1/\nkf2CqMr69miRfz/4xjgEhARoQEiABoQEaEBIgAaEBGhASIAGhARoQEiABoQEaEBIgAaEBGhA\nSIAGhARoQEiABoQEaEBIgAaEBGhASIAGhARoQEiABoQEaEBIgAaEBGhASIAGhARoQEiABoQE\naEBIgAb/H8D/P/NQ7Vf3AAAAAElFTkSuQmCC",
"text/plain": [
"plot without title"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"image(MovieLense[sample(nusers,25),sample(nmovies,25)])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will next run user-based collaborative filtering (UBCF), item-based collaborative filtering (IBCF) and alternating least squares (ALS) on this dataset. Let us first prepare the data. Users are split into a training set ($90\\%$) and a test set ($10\\%$), so we train our models on the ratings of 848 users. For each of the 95 test users, 12 ratings are given to the recommender as input for making predictions, and the remaining ratings are held out to compute prediction accuracy."
]
},
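{
"cell_type": "markdown",
"metadata": {},
"source": [
"The sizes quoted above can be checked by hand. This is a small arithmetic sketch: rounding the training set size down with `floor` is an assumption, chosen because it reproduces the 848 training users, and it is not taken from the package source. The variable names are illustrative.\n",
"\n",
"```r\n",
"n_users <- 943\n",
"n_train <- floor(0.9 * n_users)   # 848 training users\n",
"n_test  <- n_users - n_train      # 95 test users\n",
"n_known <- 12 * n_test            # 1140 ratings given to the recommender\n",
"c(train = n_train, test = n_test, known = n_known)\n",
"```\n",
"\n",
"The last figure should agree with the number of ratings reported for the `known` part of the test set."
]
},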
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Evaluation scheme with 12 items given\n",
"Method: ‘split’ with 1 run(s).\n",
"Training set proportion: 0.900\n",
"Good ratings: NA\n",
"Data set: 943 x 1664 rating matrix of class ‘realRatingMatrix’ with 99392 ratings."
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"848 x 1664 rating matrix of class ‘realRatingMatrix’ with 88557 ratings."
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"95 x 1664 rating matrix of class ‘realRatingMatrix’ with 1140 ratings."
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"95 x 1664 rating matrix of class ‘realRatingMatrix’ with 9695 ratings."
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"## create a 90/10 train/test split of users; 12 ratings per test user are \"known\", the rest \"unknown\"\n",
"evlt <- evaluationScheme(MovieLense, method=\"split\", train=0.9,\n",
" given=12)\n",
"evlt\n",
"tr <- getData(evlt, \"train\"); tr\n",
"tst_known <- getData(evlt, \"known\"); tst_known\n",
"tst_unknown <- getData(evlt, \"unknown\"); tst_unknown"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create a UBCF recommender, using Pearson similarity and 50 nearest neighbours. "
]
},
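{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the idea concrete, here is a minimal base-R sketch of a user-based prediction on a toy matrix: Pearson similarity on co-rated movies, followed by a similarity-weighted average over the nearest neighbours. The toy data, the choice of 2 neighbours and all variable names are assumptions for illustration; the implementation in `recommenderlab` may differ in details such as normalisation and weighting.\n",
"\n",
"```r\n",
"# toy ratings: rows are users, columns are movies, NA means not rated\n",
"R <- matrix(c(5, 4, NA, 1, 2,\n",
"              4, 5,  2, 1, 2,\n",
"              1, 2,  5, 4, 5,\n",
"              5, 3,  1, 2, NA),\n",
"            nrow = 4, byrow = TRUE,\n",
"            dimnames = list(paste0(\"u\", 1:4), paste0(\"m\", 1:5)))\n",
"\n",
"target <- \"u1\"   # user we predict for\n",
"item <- \"m3\"     # movie that u1 has not rated\n",
"others <- setdiff(rownames(R), target)\n",
"\n",
"# Pearson correlation between the target and every other user,\n",
"# computed on the movies that both users have rated\n",
"sims <- sapply(others, function(v)\n",
"  cor(R[target, ], R[v, ], use = \"pairwise.complete.obs\"))\n",
"\n",
"# keep the 2 most similar users among those who rated the movie\n",
"rated <- others[!is.na(R[others, item])]\n",
"nn <- names(sort(sims[rated], decreasing = TRUE))[1:2]\n",
"\n",
"# similarity-weighted average of the neighbours' ratings\n",
"sum(sims[nn] * R[nn, item]) / sum(sims[nn])\n",
"```\n",
"\n",
"Here `u3` is negatively correlated with `u1`, so it is not selected and the prediction is driven by `u2` and `u4`, who both rated the movie low."
]
},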
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"95 x 1664 rating matrix of class ‘realRatingMatrix’ with 156940 ratings."
]
},
"metadata": {},
"output_type": "display_data"
},
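{
"data": {
"text/html": [
"<table>\n",
"<tbody>\n",
"\t<tr><th scope=row>RMSE</th><td>1.0914818</td></tr>\n",
"\t<tr><th scope=row>MSE</th><td>1.1913325</td></tr>\n",
"\t<tr><th scope=row>MAE</th><td>0.8711042</td></tr>\n",
"</tbody>\n",
"</table>\n"
],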
"text/latex": [
"\\begin{tabular}{r|l}\n",
"\tRMSE & 1.0914818\\\\\n",
"\tMSE & 1.1913325\\\\\n",
"\tMAE & 0.8711042\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"| | |\n",
"|---|---|\n",
"| RMSE | 1.09148177801425 |\n",
"| MSE | 1.19133247173715 |\n",
"| MAE | 0.871104173103724 |\n",
"\n",
"\n"
],
"text/plain": [
" [,1] \n",
"RMSE 1.0914818\n",
"MSE 1.1913325\n",
"MAE 0.8711042"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"## create a user-based CF recommender using training data\n",
"rcmnd_ub <- Recommender(tr, \"UBCF\",\n",
" param=list(method=\"pearson\",nn=50))\n",
"\n",
"## create predictions for the test users using known ratings\n",
"pred_ub <- predict(rcmnd_ub, tst_known, type=\"ratings\"); pred_ub\n",
"\n",
"## evaluate predictions on the held-out \"unknown\" ratings\n",
"acc_ub <- calcPredictionAccuracy(pred_ub, tst_unknown)\n",
"as(acc_ub,\"matrix\")"
]
},
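{
"cell_type": "markdown",
"metadata": {},
"source": [
"The three accuracy measures reported above are simple functions of the prediction errors on the held-out ratings. The sketch below computes them by hand on a toy example; the vectors `truth` and `pred` are made up for illustration, and missing ratings (`NA`) are skipped.\n",
"\n",
"```r\n",
"truth <- c(4, 3, NA, 5, 1)          # held-out ratings, NA = not rated\n",
"pred  <- c(3.5, 3.0, 2.8, 4.0, 2.0) # predicted ratings\n",
"\n",
"err <- pred - truth                 # NA wherever the true rating is missing\n",
"mse  <- mean(err^2, na.rm = TRUE)   # mean squared error: 0.5625\n",
"rmse <- sqrt(mse)                   # root mean squared error: 0.75\n",
"mae  <- mean(abs(err), na.rm = TRUE) # mean absolute error: 0.625\n",
"c(RMSE = rmse, MSE = mse, MAE = mae)\n",
"```\n",
"\n",
"Note that MSE is just RMSE squared, which also holds for the UBCF figures: $1.0914818^2 \\approx 1.1913$."
]
},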
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
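{
"data": {
"text/html": [
"<table>\n",
"<thead><tr><th></th><th scope=col>Toy Story (1995)</th><th scope=col>GoldenEye (1995)</th><th scope=col>Four Rooms (1995)</th><th scope=col>Get Shorty (1995)</th><th scope=col>Copycat (1995)</th></tr></thead>\n",
"<tbody>\n",
"\t<tr><th scope=row>5</th><td>4</td><td>3</td><td>NA</td><td>NA</td><td>NA</td></tr>\n",
"\t<tr><th scope=row>6</th><td>4</td><td>NA</td><td>NA</td><td>NA</td><td>NA</td></tr>\n",
"\t<tr><th scope=row>7</th><td>NA</td><td>NA</td><td>NA</td><td>5</td><td>NA</td></tr>\n",
"\t<tr><th scope=row>13</th><td>3</td><td>3</td><td>NA</td><td>5</td><td>1</td></tr>\n",
"\t<tr><th scope=row>35</th><td>NA</td><td>NA</td><td>NA</td><td>NA</td><td>NA</td></tr>\n",
"\t<tr><th scope=row>44</th><td>4</td><td>NA</td><td>NA</td><td>NA</td><td>4</td></tr>\n",
"\t<tr><th scope=row>54</th><td>4</td><td>NA</td><td>NA</td><td>NA</td><td>NA</td></tr>\n",
"\t<tr><th scope=row>65</th><td>3</td><td>NA</td><td>NA</td><td>NA</td><td>NA</td></tr>\n",
"</tbody>\n",
"</table>\n"
],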
"text/latex": [
"\\begin{tabular}{r|lllll}\n",
" & Toy Story (1995) & GoldenEye (1995) & Four Rooms (1995) & Get Shorty (1995) & Copycat (1995)\\\\\n",
"\\hline\n",
"\t5 & 4 & 3 & NA & NA & NA\\\\\n",
"\t6 & 4 & NA & NA & NA & NA\\\\\n",
"\t7 & NA & NA & NA & 5 & NA\\\\\n",
"\t13 & 3 & 3 & NA & 5 & 1\\\\\n",
"\t35 & NA & NA & NA & NA & NA\\\\\n",
"\t44 & 4 & NA & NA & NA & 4\\\\\n",
"\t54 & 4 & NA & NA & NA & NA\\\\\n",
"\t65 & 3 & NA & NA & NA & NA\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"| | Toy Story (1995) | GoldenEye (1995) | Four Rooms (1995) | Get Shorty (1995) | Copycat (1995) |\n",
"|---|---|---|---|---|---|\n",
"| 5 | 4 | 3 | NA | NA | NA |\n",
"| 6 | 4 | NA | NA | NA | NA |\n",
"| 7 | NA | NA | NA | 5 | NA |\n",
"| 13 | 3 | 3 | NA | 5 | 1 |\n",
"| 35 | NA | NA | NA | NA | NA |\n",
"| 44 | 4 | NA | NA | NA | 4 |\n",
"| 54 | 4 | NA | NA | NA | NA |\n",
"| 65 | 3 | NA | NA | NA | NA |\n",
"\n",
"\n"
],
"text/plain": [
" Toy Story (1995) GoldenEye (1995) Four Rooms (1995) Get Shorty (1995)\n",
"5 4 3 NA NA \n",
"6 4 NA NA NA \n",
"7 NA NA NA 5 \n",
"13 3 3 NA 5 \n",
"35 NA NA NA NA \n",
"44 4 NA NA NA \n",
"54 4 NA NA NA \n",
"65 3 NA NA NA \n",
" Copycat (1995)\n",
"5 NA \n",
"6 NA \n",
"7 NA \n",
"13 1 \n",
"35 NA \n",
"44 4 \n",
"54 NA \n",
"65 NA "
]
},
"metadata": {},
"output_type": "display_data"
},
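{
"data": {
"text/html": [
"<table>\n",
"<thead><tr><th></th><th scope=col>Toy Story (1995)</th><th scope=col>GoldenEye (1995)</th><th scope=col>Four Rooms (1995)</th><th scope=col>Get Shorty (1995)</th><th scope=col>Copycat (1995)</th></tr></thead>\n",
"<tbody>\n",
"\t<tr><th scope=row>5</th><td>2.478321</td><td>2.357157</td><td>2.336074</td><td>2.428473</td><td>2.368132</td></tr>\n",
"\t<tr><th scope=row>6</th><td>3.458488</td><td>3.374429</td><td>3.510538</td><td>3.429548</td><td>3.370482</td></tr>\n",
"\t<tr><th scope=row>7</th><td>3.859838</td><td>3.703389</td><td>3.658794</td><td>3.728707</td><td>3.717585</td></tr>\n",
"\t<tr><th scope=row>13</th><td>2.707438</td><td>2.584175</td><td>2.479966</td><td>2.549201</td><td>2.564746</td></tr>\n",
"\t<tr><th scope=row>35</th><td>2.838027</td><td>2.750989</td><td>2.745954</td><td>2.755377</td><td>2.722942</td></tr>\n",
"\t<tr><th scope=row>44</th><td>3.475823</td><td>3.387616</td><td>3.239503</td><td>3.332724</td><td>3.312472</td></tr>\n",
"\t<tr><th scope=row>54</th><td>4.007823</td><td>3.810384</td><td>3.815898</td><td>3.863014</td><td>3.807695</td></tr>\n",
"\t<tr><th scope=row>65</th><td>3.811664</td><td>3.641846</td><td>3.575492</td><td>3.651777</td><td>3.683020</td></tr>\n",
"</tbody>\n",
"</table>\n"
],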
"text/latex": [
"\\begin{tabular}{r|lllll}\n",
" & Toy Story (1995) & GoldenEye (1995) & Four Rooms (1995) & Get Shorty (1995) & Copycat (1995)\\\\\n",
"\\hline\n",
"\t5 & 2.478321 & 2.357157 & 2.336074 & 2.428473 & 2.368132\\\\\n",
"\t6 & 3.458488 & 3.374429 & 3.510538 & 3.429548 & 3.370482\\\\\n",
"\t7 & 3.859838 & 3.703389 & 3.658794 & 3.728707 & 3.717585\\\\\n",
"\t13 & 2.707438 & 2.584175 & 2.479966 & 2.549201 & 2.564746\\\\\n",
"\t35 & 2.838027 & 2.750989 & 2.745954 & 2.755377 & 2.722942\\\\\n",
"\t44 & 3.475823 & 3.387616 & 3.239503 & 3.332724 & 3.312472\\\\\n",
"\t54 & 4.007823 & 3.810384 & 3.815898 & 3.863014 & 3.807695\\\\\n",
"\t65 & 3.811664 & 3.641846 & 3.575492 & 3.651777 & 3.683020\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"| | Toy Story (1995) | GoldenEye (1995) | Four Rooms (1995) | Get Shorty (1995) | Copycat (1995) |\n",
"|---|---|---|---|---|---|\n",
"| 5 | 2.4783212734142 | 2.35715702600034 | 2.33607354369403 | 2.4284726877311 | 2.36813185924013 |\n",
"| 6 | 3.45848782979558 | 3.37442870643056 | 3.51053825584847 | 3.42954754792015 | 3.370482154003 |\n",
"| 7 | 3.85983836543683 | 3.70338871618972 | 3.65879374387208 | 3.72870660076816 | 3.71758450635809 |\n",
"| 13 | 2.70743788499186 | 2.58417509882385 | 2.47996550422415 | 2.54920073040395 | 2.56474551987803 |\n",
"| 35 | 2.83802708977108 | 2.75098921652312 | 2.74595406548076 | 2.75537737061299 | 2.72294221863129 |\n",
"| 44 | 3.47582324288164 | 3.38761593881076 | 3.23950262351902 | 3.33272442627577 | 3.3124720707517 |\n",
"| 54 | 4.00782294054929 | 3.81038409847356 | 3.81589814844269 | 3.86301365456674 | 3.8076948329858 |\n",
"| 65 | 3.81166438448772 | 3.64184641041554 | 3.57549202803257 | 3.65177695648019 | 3.68302010826899 |\n",
"\n",
"\n"
],
"text/plain": [
" Toy Story (1995) GoldenEye (1995) Four Rooms (1995) Get Shorty (1995)\n",
"5 2.478321 2.357157 2.336074 2.428473 \n",
"6 3.458488 3.374429 3.510538 3.429548 \n",
"7 3.859838 3.703389 3.658794 3.728707 \n",
"13 2.707438 2.584175 2.479966 2.549201 \n",
"35 2.838027 2.750989 2.745954 2.755377 \n",
"44 3.475823 3.387616 3.239503 3.332724 \n",
"54 4.007823 3.810384 3.815898 3.863014 \n",
"65 3.811664 3.641846 3.575492 3.651777 \n",
" Copycat (1995)\n",
"5 2.368132 \n",
"6 3.370482 \n",
"7 3.717585 \n",
"13 2.564746 \n",
"35 2.722942 \n",
"44 3.312472 \n",
"54 3.807695 \n",
"65 3.683020 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"## compare the true held-out (\"unknown\") ratings with the predictions\n",
"as(tst_unknown, \"matrix\")[1:8,1:5]\n",
"as(pred_ub, \"matrix\")[1:8,1:5]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let us repeat the procedure with IBCF. On this dataset, with these settings, it does not perform as well as UBCF."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
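{
"data": {
"text/html": [
"<table>\n",
"<thead><tr><th></th><th scope=col>RMSE</th><th scope=col>MSE</th><th scope=col>MAE</th></tr></thead>\n",
"<tbody>\n",
"\t<tr><th scope=row>UBCF</th><td>1.091482</td><td>1.191332</td><td>0.8711042</td></tr>\n",
"\t<tr><th scope=row>IBCF</th><td>1.683765</td><td>2.835066</td><td>1.2843314</td></tr>\n",
"</tbody>\n",
"</table>\n"
],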
"text/latex": [
"\\begin{tabular}{r|lll}\n",
" & RMSE & MSE & MAE\\\\\n",
"\\hline\n",
"\tUBCF & 1.091482 & 1.191332 & 0.8711042\\\\\n",
"\tIBCF & 1.683765 & 2.835066 & 1.2843314\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"| | RMSE | MSE | MAE |\n",
"|---|---|---|---|\n",
"| UBCF | 1.09148177801425 | 1.19133247173715 | 0.871104173103724 |\n",
"| IBCF | 1.68376546140195 | 2.83506612901011 | 1.28433140154139 |\n",
"\n",
"\n"
],
"text/plain": [
" RMSE MSE MAE \n",
"UBCF 1.091482 1.191332 0.8711042\n",
"IBCF 1.683765 2.835066 1.2843314"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"## repeat with the item-based approach\n",
"rcmnd_ib <- Recommender(tr, \"IBCF\",\n",
" param=list(method=\"pearson\",k=50))\n",
"pred_ib <- predict(rcmnd_ib, tst_known, type=\"ratings\")\n",
"acc_ib <- calcPredictionAccuracy(pred_ib, tst_unknown) \n",
"acc <- rbind(UBCF = acc_ub, IBCF = acc_ib); acc"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We next try the alternating least squares approach (`ALS`). We will use latent attributes of dimension $k=20$. "
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
" | RMSE | MSE | MAE |
\n",
"\n",
"\tUBCF | 1.091482 | 1.191332 | 0.8711042 |
\n",
"\tIBCF | 1.683765 | 2.835066 | 1.2843314 |
\n",
"\tALS | 1.022098 | 1.044684 | 0.8146836 |
\n",
"\n",
"
\n"
],
"text/latex": [
"\\begin{tabular}{r|lll}\n",
" & RMSE & MSE & MAE\\\\\n",
"\\hline\n",
"\tUBCF & 1.091482 & 1.191332 & 0.8711042\\\\\n",
"\tIBCF & 1.683765 & 2.835066 & 1.2843314\\\\\n",
"\tALS & 1.022098 & 1.044684 & 0.8146836\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"1. 1.09148177801425\n",
"2. 1.68376546140195\n",
"3. 1.02209769383002\n",
"4. 1.19133247173715\n",
"5. 2.83506612901011\n",
"6. 1.04468369573264\n",
"7. 0.871104173103724\n",
"8. 1.28433140154139\n",
"9. 0.814683619911017\n",
"\n",
"\n"
],
"text/plain": [
" RMSE MSE MAE \n",
"UBCF 1.091482 1.191332 0.8711042\n",
"IBCF 1.683765 2.835066 1.2843314\n",
"ALS 1.022098 1.044684 0.8146836"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"rcmnd_als <- Recommender(tr, \"ALS\",\n",
" param=list(n_factors=20))\n",
"pred_als <- predict(rcmnd_als, tst_known, type=\"ratings\")\n",
"acc_als <- calcPredictionAccuracy(pred_als, tst_unknown) \n",
"acc <- rbind(UBCF = acc_ub, IBCF = acc_ib, ALS = acc_als); acc"
]
},
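{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a reminder of what ALS fits, a generic form of the regularized matrix factorization objective is sketched below. The notation is an assumption made for this note ($r_{ij}$ is the rating of user $i$ for movie $j$, $\\Omega$ is the set of observed ratings, and $u_i, v_j \\in \\mathbb{R}^k$ are the latent factors), and the exact penalty used by the package may differ:\n",
"\n",
"$$\\min_{u, v} \\sum_{(i,j) \\in \\Omega} \\left(r_{ij} - u_i^\\top v_j\\right)^2 + \\lambda \\left( \\sum_i \\|u_i\\|^2 + \\sum_j \\|v_j\\|^2 \\right)$$\n",
"\n",
"With the $v_j$ held fixed, the problem in each $u_i$ is a ridge regression with a closed-form solution, and vice versa; ALS alternates between these two steps. Here $k$ corresponds to `n_factors` and $\\lambda$ to the `lambda` parameter."
]
},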
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The results of ALS look favourable to those of memory-based methods. However, each method has a number of tuning parameters (type of similarity, number of neighbours, number of latent factors, regularization parameters) so further comparisons are needed. There is a number of other methods -- below we query the registry of implemented methods."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"$ALS_realRatingMatrix\n",
"Recommender method: ALS for realRatingMatrix\n",
"Description: Recommender for explicit ratings based on latent factors, calculated by alternating least squares algorithm.\n",
"Reference: Yunhong Zhou, Dennis Wilkinson, Robert Schreiber, Rong Pan (2008). Large-Scale Parallel Collaborative Filtering for the Netflix Prize, 4th Int'l Conf. Algorithmic Aspects in Information and Management, LNCS 5034.\n",
"Parameters:\n",
" normalize lambda n_factors n_iterations min_item_nr seed\n",
"1 NULL 0.1 10 10 1 NULL\n",
"\n",
"$ALS_implicit_realRatingMatrix\n",
"Recommender method: ALS_implicit for realRatingMatrix\n",
"Description: Recommender for implicit data based on latent factors, calculated by alternating least squares algorithm.\n",
"Reference: Yifan Hu, Yehuda Koren, Chris Volinsky (2008). Collaborative Filtering for Implicit Feedback Datasets, ICDM '08 Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, pages 263-272.\n",
"Parameters:\n",
" lambda alpha n_factors n_iterations min_item_nr seed\n",
"1 0.1 10 10 10 1 NULL\n",
"\n",
"$IBCF_realRatingMatrix\n",
"Recommender method: IBCF for realRatingMatrix\n",
"Description: Recommender based on item-based collaborative filtering.\n",
"Reference: NA\n",
"Parameters:\n",
" k method normalize normalize_sim_matrix alpha na_as_zero\n",
"1 30 \"Cosine\" \"center\" FALSE 0.5 FALSE\n",
"\n",
"$POPULAR_realRatingMatrix\n",
"Recommender method: POPULAR for realRatingMatrix\n",
"Description: Recommender based on item popularity.\n",
"Reference: NA\n",
"Parameters:\n",
" normalize aggregationRatings aggregationPopularity\n",
"1 \"center\" new(\"standardGeneric\" new(\"standardGeneric\"\n",
"\n",
"$RANDOM_realRatingMatrix\n",
"Recommender method: RANDOM for realRatingMatrix\n",
"Description: Produce random recommendations (real ratings).\n",
"Reference: NA\n",
"Parameters: None\n",
"\n",
"$RERECOMMEND_realRatingMatrix\n",
"Recommender method: RERECOMMEND for realRatingMatrix\n",
"Description: Re-recommends highly rated items (real ratings).\n",
"Reference: NA\n",
"Parameters:\n",
" randomize minRating\n",
"1 1 NA\n",
"\n",
"$SVD_realRatingMatrix\n",
"Recommender method: SVD for realRatingMatrix\n",
"Description: Recommender based on SVD approximation with column-mean imputation.\n",
"Reference: NA\n",
"Parameters:\n",
" k maxiter normalize\n",
"1 10 100 \"center\"\n",
"\n",
"$SVDF_realRatingMatrix\n",
"Recommender method: SVDF for realRatingMatrix\n",
"Description: Recommender based on Funk SVD with gradient descend.\n",
"Reference: NA\n",
"Parameters:\n",
" k gamma lambda min_epochs max_epochs min_improvement normalize verbose\n",
"1 10 0.015 0.001 50 200 1e-06 \"center\" FALSE\n",
"\n",
"$UBCF_realRatingMatrix\n",
"Recommender method: UBCF for realRatingMatrix\n",
"Description: Recommender based on user-based collaborative filtering.\n",
"Reference: NA\n",
"Parameters:\n",
" method nn sample normalize\n",
"1 \"cosine\" 25 FALSE \"center\"\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"recommenderRegistry$get_entries(dataType = \"realRatingMatrix\")"
]
}
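,
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The registry lists the default parameters of each method, which suggests a simple way to carry out the further comparisons mentioned earlier. As a minimal sketch, the helpers below (`fit_and_score` and `score_grid`, hypothetical names introduced here) fit a method on a training set and return its accuracy on held-out ratings. They only wrap the `Recommender`, `predict` and `calcPredictionAccuracy` calls already used in this notebook.\n",
"\n",
"```r\n",
"library(recommenderlab)\n",
"\n",
"# fit `method` on `train`, predict from the `known` test ratings,\n",
"# and score the predictions against the `unknown` test ratings\n",
"fit_and_score <- function(method, train, known, unknown, parameter = NULL) {\n",
"  rec <- Recommender(train, method, parameter = parameter)\n",
"  pred <- predict(rec, known, type = \"ratings\")\n",
"  calcPredictionAccuracy(pred, unknown)\n",
"}\n",
"\n",
"# score one method over a grid of values of a single parameter\n",
"score_grid <- function(method, name, values, train, known, unknown) {\n",
"  res <- t(sapply(values, function(v) {\n",
"    fit_and_score(method, train, known, unknown, setNames(list(v), name))\n",
"  }))\n",
"  rownames(res) <- paste0(name, \"=\", values)\n",
"  res\n",
"}\n",
"```\n",
"\n",
"For example, `score_grid(\"UBCF\", \"nn\", c(10, 25, 50), tr, tst_known, tst_unknown)` would vary the number of neighbours, and `fit_and_score(\"SVD\", tr, tst_known, tst_unknown, list(k = 20))` would try one of the other registered methods. Choosing parameters by test-set accuracy gives optimistic error estimates, so a separate validation split would be preferable for actual tuning."
]
}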
],
"metadata": {
"kernelspec": {
"display_name": "R",
"language": "R",
"name": "ir"
},
"language_info": {
"codemirror_mode": "r",
"file_extension": ".r",
"mimetype": "text/x-r-source",
"name": "R",
"pygments_lexer": "r",
"version": "3.4.3"
},
"latex_envs": {
"bibliofile": "biblio.bib",
"cite_by": "apalike",
"current_citInitial": 1,
"eqLabelWithNumbers": true,
"eqNumInitial": 0
}
},
"nbformat": 4,
"nbformat_minor": 1
}