
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>daniel shiffman &#187; netflix</title>
	<atom:link href="http://www.shiffman.net/category/netflix/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.shiffman.net</link>
	<description></description>
	<lastBuildDate>Tue, 24 Jan 2012 03:41:01 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=</generator>
		<item>
		<title>Alien vs. Predator</title>
		<link>http://www.shiffman.net/2006/10/10/alien-vs-predator/</link>
		<comments>http://www.shiffman.net/2006/10/10/alien-vs-predator/#comments</comments>
		<pubDate>Wed, 11 Oct 2006 04:01:23 +0000</pubDate>
		<dc:creator>Daniel</dc:creator>
				<category><![CDATA[blog]]></category>
		<category><![CDATA[netflix]]></category>
		<category><![CDATA[p5]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://www.shiffman.net/2006/10/10/alien-vs-predator/</guid>
		<description><![CDATA[This is a quick visualization of data from the netflix prize. A vertical bar is drawn for every customer rating a movie. Ratings go from 1 to 5 stars (represented top to bottom.) Note how &#8220;Alien&#8221; (on the left) received &#8230; <a href="http://www.shiffman.net/2006/10/10/alien-vs-predator/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.flickr.com/photos/shiffman/266636904/" title="Photo Sharing"><img src="http://static.flickr.com/112/266636904_6a8c054fc5_m.jpg" width="240" height="135" alt="Alien vs. Predator" /></a></p>
<p>This is a quick visualization of data from the <a href="http://netflixprize.com">netflix prize</a>.  A vertical bar is drawn for every customer rating a movie.  Ratings go from 1 to 5 stars (represented top to bottom.)  Note how &#8220;Alien&#8221; (on the left) received many ratings of 4 and 5 stars, but &#8220;Predator&#8221; (on the right) mostly received ratings of 4 stars.  This depicts approximately 50,000 customer ratings.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.shiffman.net/2006/10/10/alien-vs-predator/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Netflix Challenge</title>
		<link>http://www.shiffman.net/2006/10/10/netflix-challenge/</link>
		<comments>http://www.shiffman.net/2006/10/10/netflix-challenge/#comments</comments>
		<pubDate>Wed, 11 Oct 2006 01:28:14 +0000</pubDate>
		<dc:creator>Daniel</dc:creator>
				<category><![CDATA[blog]]></category>
		<category><![CDATA[netflix]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://www.shiffman.net/2006/10/10/netflix-challenge/</guid>
		<description><![CDATA[Netflix recently released 100 million movie rating records as part of a contest to improve its movie recommendation system. The problem: I know how I rated a whole bunch of movies. I know how everyone else has rated a whole &#8230; <a href="http://www.shiffman.net/2006/10/10/netflix-challenge/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Netflix recently released 100 million movie rating records as part of a <a href="http://netflixprize.com">contest</a> to improve its movie recommendation system.  </p>
<p><strong>The problem:</strong></p>
<p>I know how I rated a whole bunch of movies.  I know how everyone else has rated a whole bunch of movies.   For any given movie that I have not yet rated (but others have), predict how I would rate it based on my and everyone else&#8217;s rating history.  Netflix uses the root mean squared error (RMSE) to evaluate results.  In other words, let&#8217;s guess that I would give the movie <a href="http://www.imdb.com/title/tt0087957/">Purple Rain</a> a rating of 5, when in reality, I would only rate it a 4.  And let&#8217;s also guess that I would rate <a href="http://www.imdb.com/title/tt0045152/">Singin&#8217; in the Rain</a> a 3.5 when my true rating is a 5.  Here&#8217;s how we would calculate the RMSE:</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">Purple Rain Prediction Error:  5 - 4 = 1
Singin' in the Rain Prediction Error: 3.5 - 5 = -1.5
&nbsp;
Squaring each error:  1*1 = 1, -1.5*-1.5 = 2.25
Add the squares of all errors together = 3.25
&nbsp;
MSE = Sum of Squares divided by Total Guesses = 3.25 / 2 = 1.625
&nbsp;
RMSE = square root of MSE = sqrt(1.625) = 1.275</pre></div></div>
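
<p>The same calculation can be sketched in a few lines of Python.  This is only an illustration, assuming the predictions and the true ratings arrive as two equal-length lists; the function name <code>rmse</code> is made up for this example and is not anything Netflix provides.</p>

```python
import math


def rmse(predictions, actuals):
    """Root mean squared error between predicted and true ratings."""
    if len(predictions) != len(actuals) or not predictions:
        raise ValueError("need two non-empty lists of the same length")
    # Square each prediction error, average the squares, then take the root.
    squared_errors = [(p - a) ** 2 for p, a in zip(predictions, actuals)]
    mse = sum(squared_errors) / len(squared_errors)
    return math.sqrt(mse)


# Purple Rain: predicted 5, true 4.  Singin' in the Rain: predicted 3.5, true 5.
score = rmse([5, 3.5], [4, 5])
print(round(score, 3))  # prints 1.275
```

<p>Because the errors are squared before averaging, one large miss hurts the score more than several small ones.</p>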

<p>Let&#8217;s take a simple algorithm to solve the problem: for any user rating any movie, predict the future rating as the global average rating for that movie.  This algorithm produces an RMSE of 1.05, not too shabby.  The RMSE for Netflix&#8217;s Cinematch system (which presumably employs <a href="http://en.wikipedia.org/wiki/Collaborative_filtering">collaborative filtering</a> techniques) is around 0.95, a mere 10% improvement over that baseline.   The problem is indeed a difficult one.   Netflix will award a one million dollar prize to anyone who can improve on Cinematch by an additional 10%.</p>
<p>I submitted my first prediction file today, mostly as a test, nowhere near the <a href="http://www.netflixprize.com/leaderboard">leaderboard</a>, with the following algorithm:</p>
<p>A customer C will rate a movie M based on the following function:</p>
<p>rating(C,M) = 0.5 * (the global netflix average rating for movie M) + 0.5 * (the customer&#8217;s average rating)</p>
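<p>As a rough sketch, the function above could be written in Python like this.  The toy ratings, the customer names, and the helper names (<code>mean</code>, <code>blended_rating</code>, <code>predict</code>) are all hypothetical and only there to make the example runnable; they are not taken from the Netflix data set or from my submission code.</p>

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def blended_rating(movie_mean, customer_mean, weight=0.5):
    """Weighted blend of the movie's average and the customer's average."""
    return weight * movie_mean + (1 - weight) * customer_mean


# Hypothetical toy ratings as (customer, movie, stars); not Netflix data.
ratings = [
    ("alice", "Alien", 5),
    ("alice", "Predator", 4),
    ("bob", "Alien", 4),
    ("carol", "Predator", 3),
]


def predict(customer, movie, ratings):
    """Predict rating(C, M) from the movie's and the customer's averages."""
    movie_stars = [s for c, m, s in ratings if m == movie]
    customer_stars = [s for c, m, s in ratings if c == customer]
    return blended_rating(mean(movie_stars), mean(customer_stars))


# bob averages 4 stars and Predator averages 3.5 stars.
print(predict("bob", "Predator", ratings))  # prints 3.75
```

<p>Setting <code>weight</code> to 1 falls back to the plain movie-average prediction, so the two approaches are easy to compare.</p>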
<p>My RMSE?  </p>
<blockquote><p>
Your prediction file submitted 2006-10-10 21:31:56 has been decompressed and processed.<br />
The computed RMSE for the quiz subset was 1.0147.
</p></blockquote>
<p>More to come. . . </p>
]]></content:encoded>
			<wfw:commentRss>http://www.shiffman.net/2006/10/10/netflix-challenge/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

