5. FRUSTRATION
Hard to find movies you would enjoy
General ratings don't work (IMDB)
e.g. movies you enjoy don't necessarily have a high rating
6. THE IDEA
What if ...
you had a clone of yourself
it could watch movies
and you could do other things
then it could tell you what is good and what is not
just like twins
7. TWINS
What if ...
You had not 1 twin
but 2 twins
or 10 twins
or 100 twins
or 1000 twins
8. PLAN
1. Select movies you like and don't like
2. Run magic algorithm
3. Get personalised movie recommendations
10. ALGORITHM
Step 1. Find your top Twins
Match each user with every other user
Calculate compatibility rating between each pair of users
11. REPLACE INTO `user_twins`
(`user_id`, `twin_id`, `avg_difference`,
 `percent`, `movies_matched`, `level`, `updated_at`)
(select user_id, twin_id, avg_difference, percent, movies_matched,
    -- level 1 is the closest match, level 9 the weakest
    if(percent > 92.5, 1,
    if(percent > 90, 2,
    if(percent > 87.5, 3,
    if(percent > 85, 4,
    if(percent > 82.5, 5,
    if(percent > 80, 6,
    if(percent > 77.5, 7,
    if(percent > 75, 8, 9)))))))),
    CURRENT_TIMESTAMP
 from
   (select sr.user_id as user_id, sr.twin_id as twin_id,
       -- note: this is a sum of rating differences, despite the column name
       Sum(single_points) as avg_difference,
       (10-(Sum(single_points)/count(*)))*10 as percent,
       count(*) as movies_matched
    from
      (select mr1.user_id as user_id,
          mr2.user_id as twin_id,
          ABS(mr1.rating-mr2.rating) as single_points
       from user_votes as mr1 join user_votes as mr2
       on mr1.movie_id = mr2.movie_id and mr1.user_id <> mr2.user_id
       where mr1.user_id = %d)
      as sr
    group by sr.user_id, sr.twin_id
    having count(*) >= %d) as t2
);
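For readers who prefer to follow the logic outside SQL, here is a minimal Python sketch of what the query above computes for one user. It is an illustration, not the production code: the `votes` dictionary, the function names `find_twins` and `twin_level`, and the assumption that ratings sit on a 10-point scale (implied by the `10 - ...` in the percent formula) are all hypothetical.

```python
def twin_level(percent):
    """Map a compatibility percentage to a level from 1 (closest) to 9 (weakest)."""
    thresholds = [92.5, 90, 87.5, 85, 82.5, 80, 77.5, 75]
    for level, threshold in enumerate(thresholds, start=1):
        if percent > threshold:
            return level
    return 9


def find_twins(votes, user_id, min_matched):
    """Score every other user against user_id.

    votes: {user_id: {movie_id: rating}} (hypothetical in-memory stand-in
    for the user_votes table; ratings assumed to be on a 10-point scale).
    """
    mine = votes[user_id]
    twins = []
    for other_id, theirs in votes.items():
        if other_id == user_id:
            continue
        shared = mine.keys() & theirs.keys()  # movies both users rated
        if not shared or len(shared) < min_matched:
            continue
        total_diff = sum(abs(mine[m] - theirs[m]) for m in shared)
        percent = (10 - total_diff / len(shared)) * 10
        twins.append({
            "twin_id": other_id,
            "percent": percent,
            "movies_matched": len(shared),
            "level": twin_level(percent),
        })
    # Best matches first
    return sorted(twins, key=lambda t: t["percent"], reverse=True)
```

Calling `find_twins(votes, user_id, min_matched)` returns one entry per candidate twin, sorted from most to least compatible, which mirrors the rows the query writes into `user_twins`.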
12. ALGORITHM
Step 2. Calculate personalized movie ratings
Pick 1000 best twins
Combine their ratings
Predict rating for each movie
Pick the best movies the user hasn't seen
select ...
Power(10-level, ln(count(*))/ln(2.5)) * count(*) * avg(rating) as rating_points_sum,
Power(10-level, ln(count(*))/ln(2.5)) * count(*) as vote_points_sum,
ln(count(*))/ln(2.5) as power,
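The fragment above shows only the weighting terms, so the following Python sketch fills in the rest with stated assumptions: that votes are grouped per movie and per twin level, and that the predicted rating is the sum of `rating_points_sum` divided by the sum of `vote_points_sum` across levels. Neither the grouping nor the final division appears on the slide; the function name `predict_ratings` and the tuple input are hypothetical.

```python
import math
from collections import defaultdict


def predict_ratings(twin_votes, seen):
    """Predict ratings for movies the user has not seen.

    twin_votes: iterable of (level, movie_id, rating) tuples, one per twin vote.
    seen: set of movie ids the user has already rated.
    """
    # Assumed grouping: one bucket per (movie, twin level)
    groups = defaultdict(list)
    for level, movie_id, rating in twin_votes:
        if movie_id not in seen:
            groups[(movie_id, level)].append(rating)

    rating_points = defaultdict(float)
    vote_points = defaultdict(float)
    for (movie_id, level), ratings in groups.items():
        n = len(ratings)
        power = math.log(n) / math.log(2.5)   # ln(count(*))/ln(2.5)
        weight = (10 - level) ** power * n    # Power(10-level, power) * count(*)
        rating_points[movie_id] += weight * (sum(ratings) / n)
        vote_points[movie_id] += weight

    # Assumed final step: weighted average across levels
    predicted = {m: rating_points[m] / vote_points[m] for m in vote_points}
    return sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)
```

Under this weighting a single vote at any level counts as 1, while several agreeing votes from close twins (low level numbers) grow faster than linearly, so a handful of level-1 twins can outweigh a lone distant one.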
13. PROBLEM
Where do you get your movie catalogue?
IMDB.com - good and expensive
TMDB.org - okay and free
70. LESSONS
Keep history of your progress
Don't strive for perfection
Release as soon as you can
Have a blog
Keep asking for feedback
JavaScript frameworks are good
Browsers are not as fast
125. PERFORMANCE ISSUES
800+kb .js file
1000+ animated objects
Optimised filtering
Smart fetching
How many times to redraw
Chrome vs others
126. LESSONS
Grow and change ideas
Browser performance is hard
Communicating ideas is hard
Iterate
Ember.js + d3.js is a killer combo
Caching and CDN - FTW!
Web analytics to measure success