{"id":1288,"date":"2013-10-21T17:29:54","date_gmt":"2013-10-21T15:29:54","guid":{"rendered":"https:\/\/blog.zitnik.si\/?p=1288"},"modified":"2015-08-05T15:37:29","modified_gmt":"2015-08-05T13:37:29","slug":"machine-learning-for-everyone-predictionio","status":"publish","type":"post","link":"https:\/\/blog.zitnik.si\/?p=1288","title":{"rendered":"Machine Learning for everyone &#8211; PredictionIO"},"content":{"rendered":"<p><a href=\"http:\/\/prediction.io\/\" target=\"_blank\">PredictionIO<\/a> (<a href=\"http:\/\/prediction.io\/\" target=\"_blank\">http:\/\/prediction.io\/<\/a>) is an open-source machine learning (ML) server. Its goal is to make personalization and recommendation algorithms more accessible to programmers without ML knowledge. It includes a recommendation engine and a similarity engine, which can be instantiated, configured and evaluated via a web-based GUI.<\/p>\n<p>Because of the limited number of integrated ML methods, I do not think this product should be called a &#8220;<em>machine learning server<\/em>&#8221; yet. I was curious how the system works, so I tested it. In this post I review how to install and use the server.<\/p>\n<h3>1. Installation<\/h3>\n<p>First we need to install the server and its dependencies. I was using Mac OS X Mavericks (10.9, GM):<\/p>\n<ul>\n<li>We need to install <a href=\"http:\/\/www.mongodb.org\" target=\"_blank\">MongoDB<\/a> (<a href=\"http:\/\/www.mongodb.org\" target=\"_blank\">http:\/\/www.mongodb.org<\/a>). At the time of writing, version 2.4.6 was available. 
To run the database, we need to create a data directory and start the service:<\/li>\n<\/ul>\n<pre>$ mkdir -p \/data\/db\r\n$ .\/mongod<\/pre>\n<ul>\n<li>Then we need to download <a href=\"http:\/\/hadoop.apache.org\/\" target=\"_blank\">Apache Hadoop<\/a> (<a href=\"http:\/\/hadoop.apache.org\/\" target=\"_blank\">http:\/\/hadoop.apache.org\/<\/a>) and add it to the PATH.<\/li>\n<li>The PredictionIO server and the MongoDB connector should be installed as described in the getting started guide:<\/li>\n<\/ul>\n<pre>git clone https:\/\/github.com\/mongodb\/mongo-hadoop.git\r\ncd mongo-hadoop\r\ngit checkout r1.1.0\r\n.\/sbt publish-local\r\ncd ..\r\n\r\ngit clone https:\/\/github.com\/PredictionIO\/PredictionIO.git\r\ncd PredictionIO\r\nbin\/build.sh\r\nbin\/package.sh<\/pre>\n<h3>2. Run the server<\/h3>\n<p>After packaging the distribution, we can run the server from\u00a0<em>dist\/target\/PredictionIO-&lt;version&gt;<\/em>. First we run the setup script\u00a0<em>.\/bin\/setup.sh<\/em> and then start the server with\u00a0<em>.\/bin\/start-all.sh<\/em>.<\/p>\n<p>The server is accessible only to registered users, who can be added with the command\u00a0<em>.\/bin\/users<\/em>. 
After that, we can log in to the server on the default port:\u00a0<a href=\"http:\/\/localhost:9000\/\" target=\"_blank\">http:\/\/localhost:9000\/<\/a>.<\/p>\n<p><a href=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/Screen-Shot-2013-10-21-at-14.34.26.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter\" alt=\"Screen Shot 2013-10-21 at 14.34.26\" src=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/Screen-Shot-2013-10-21-at-14.34.26-300x127.png\" width=\"300\" height=\"127\" \/><\/a><\/p>\n<p>Later, if we see the message\u00a0&#8220;This feature will be available soon.&#8221;, we need to run the setup script again and restart the server.<\/p>\n<h3>3. Write an example application<\/h3>\n<p>First, we create an application. This step produces an App Key, which our script uses. Then we create an engine; we chose the recommendation engine. We need to define item types and some basic recommendation parameters. Afterwards we select a recommendation algorithm and set its parameters.<\/p>\n<p>The main idea is to have a set of users and a set of items, and to predict new items for new or existing users.<\/p>\n<p><a href=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/a.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-1292\" alt=\"a\" src=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/a-300x71.png\" width=\"300\" height=\"71\" srcset=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/a-300x71.png 300w, https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/a-1024x244.png 1024w, https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/a.png 1381w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p><a href=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/b.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-1293\" alt=\"b\" 
src=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/b-300x154.png\" width=\"300\" height=\"154\" srcset=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/b-300x154.png 300w, https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/b-1024x527.png 1024w, https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/b.png 1127w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p><a href=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/c.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-1294\" alt=\"c\" src=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/c-300x200.png\" width=\"300\" height=\"200\" srcset=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/c-300x200.png 300w, https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/c.png 634w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p><a href=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/d.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-1295\" alt=\"d\" src=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/d-300x157.png\" width=\"300\" height=\"157\" srcset=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/d-300x157.png 300w, https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/d.png 915w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p><a href=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/e.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-1296\" alt=\"e\" src=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/e-300x214.png\" width=\"300\" height=\"214\" srcset=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/e-300x214.png 300w, https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/e.png 997w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p><a 
href=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/f.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-1297\" alt=\"f\" src=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/f-300x169.png\" width=\"300\" height=\"169\" srcset=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/f-300x169.png 300w, https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/f-1024x579.png 1024w, https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/f.png 1194w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p><a href=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/g.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-1298\" alt=\"g\" src=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/g-300x176.png\" width=\"300\" height=\"176\" srcset=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/g-300x176.png 300w, https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/g-1024x602.png 1024w, https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/g.png 1196w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>&nbsp;<\/p>\n<p>Secondly, we need to populate the database via our program and then we can call functions to get new predictions. We published our sample code on GitHub (<a href=\"https:\/\/github.com\/szitnik\/prediction-io-Test\" target=\"_blank\">https:\/\/github.com\/szitnik\/prediction-io-Test<\/a>). 
The key idea was to have 4 users and their friendships (modelled as <em>view<\/em> actions) and to predict possible new friendships.<\/p>\n<p style=\"text-align: center;\"><a href=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/h.png\"><img loading=\"lazy\" decoding=\"async\" alt=\"h\" src=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/h-300x146.png\" width=\"300\" height=\"146\" \/><\/a><\/p>\n<p style=\"text-align: center;\"><a href=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/i.png\"><img loading=\"lazy\" decoding=\"async\" alt=\"i\" src=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/i-300x106.png\" width=\"300\" height=\"106\" \/><\/a><\/p>\n<p><a href=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/j.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter\" alt=\"j\" src=\"https:\/\/blog.zitnik.si\/wp-content\/uploads\/2013\/10\/j.png\" width=\"246\" height=\"164\" \/><\/a><\/p>\n<p>After we inserted the data, the system calculated all possibilities and stored them in the MongoDB database:<\/p>\n<pre style=\"padding-left: 30px;\">13\/10\/21 16:22:50 INFO mongodb.MongoDbCollector: Putting key for output: null { \"uid\" : \"1_XWoman\" , \"iid\" : \"1_xwoman\" , \"score\" : 0.8630746441103142 , \"itypes\" : [ \"person\"] , \"algoid\" : 2 , \"modelset\" : true}\r\n 13\/10\/21 16:22:50 INFO mongodb.MongoDbCollector: Putting key for output: null { \"uid\" : \"1_XWoman\" , \"iid\" : \"1_mirco\" , \"score\" : 0.7472373332042158 , \"itypes\" : [ \"person\"] , \"algoid\" : 2 , \"modelset\" : true}\r\n...\r\n 13\/10\/21 16:22:50 INFO mongodb.MongoDbCollector: Putting key for output: null { \"uid\" : \"1_Mirco\" , \"iid\" : \"1_jurcek\" , \"score\" : 0.8126793796567733 , \"itypes\" : [ \"person\"] , \"algoid\" : 2 , \"modelset\" : true}\r\n 13\/10\/21 16:22:50 INFO mongodb.MongoDbCollector: Putting key for output: null { \"uid\" : \"1_Mirco\" , \"iid\" : \"1_mirco\" , \"score\" : 0.6338884061675459 , 
\"itypes\" : [ \"person\"] , \"algoid\" : 2 , \"modelset\" : true}\r\n 13\/10\/21 16:22:50 INFO mongodb.MongoDbCollector: Putting key for output: null { \"uid\" : \"1_Mirco\" , \"iid\" : \"1_xwoman\" , \"score\" : 0.5107068611450168 , \"itypes\" : [ \"person\"] , \"algoid\" : 2 , \"modelset\" : true}\r\n...\r\n 13\/10\/21 16:22:50 INFO mongodb.MongoDbCollector: Putting key for output: null { \"uid\" : \"1_Johan\" , \"iid\" : \"1_mirco\" , \"score\" : 0.9135908855406835 , \"itypes\" : [ \"person\"] , \"algoid\" : 2 , \"modelset\" : true}\r\n 13\/10\/21 16:22:50 INFO mongodb.MongoDbCollector: Putting key for output: null { \"uid\" : \"1_Johan\" , \"iid\" : \"1_johan\" , \"score\" : 0.7490173941424351 , \"itypes\" : [ \"person\"] , \"algoid\" : 2 , \"modelset\" : true}<\/pre>\n<p>After that we ran some queries to get new friend recommendations:<\/p>\n<pre style=\"padding-left: 30px;\">Retrieve top 1 recommendations for user Mirco\r\nRecommendations: jurcek\r\nRetrieve top 1 recommendations for user XWoman\r\nRecommendations: mirco<\/pre>\n<p>The web-based GUI also supports parameter tuning and evaluation methods for recommendation algorithms.<\/p>\n<p>To conclude, I believe the PredictionIO project is a nice start at bringing ML methods (although I do not agree with the ML naming here :)) closer to a large group of programmers.<\/p>","protected":false},"excerpt":{"rendered":"<p>PredictionIO (http:\/\/prediction.io\/) is an open-source machine learning (ML) server. Its goal is to make personalization and recommendation algorithms more accessible to programmers without ML knowledge. It includes a recommendation engine and a similarity engine, which can be instantiated, configured and evaluated via a web-based GUI. 
Due to a limited number of integrated&#8230;<\/p>\n<div class=\"more-link-wrapper\"><a class=\"more-link\" href=\"https:\/\/blog.zitnik.si\/?p=1288\">Continue reading<span class=\"screen-reader-text\">Machine Learning for everyone &#8211; PredictionIO<\/span><\/a><\/div>\n","protected":false},"author":1,"featured_media":1289,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[21],"tags":[],"class_list":["post-1288","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-research","entry"],"_links":{"self":[{"href":"https:\/\/blog.zitnik.si\/index.php?rest_route=\/wp\/v2\/posts\/1288","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.zitnik.si\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.zitnik.si\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.zitnik.si\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.zitnik.si\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1288"}],"version-history":[{"count":2,"href":"https:\/\/blog.zitnik.si\/index.php?rest_route=\/wp\/v2\/posts\/1288\/revisions"}],"predecessor-version":[{"id":1302,"href":"https:\/\/blog.zitnik.si\/index.php?rest_route=\/wp\/v2\/posts\/1288\/revisions\/1302"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blog.zitnik.si\/index.php?rest_route=\/wp\/v2\/media\/1289"}],"wp:attachment":[{"href":"https:\/\/blog.zitnik.si\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1288"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.zitnik.si\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1288"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.zitnik.si\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1288"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}