IIOIOOIOO double banned
Joined: 08 Dec 2006
|
Posted: Sun Feb 13, 2011 7:12 am |
|
|
| Luvcraft, do not write the recco algorithm yourself. Look up the top10 list for netflix's last round and offer a gainshare of 5% to anyone on the list who will let you use their engine. 10% if they manage the customization for your content. |
|
IIOIOOIOO double banned
Joined: 08 Dec 2006
|
Posted: Tue Feb 15, 2011 12:39 am |
|
|
| luvcraft wrote: |
| Endless wrote: |
Look at how much room it frees up as well |
that IS very pretty, but it would require creating custom "letterboxed" images for all 7,700+ games in the system, and also unfortunately it kinda de-emphasizes the top row, which is contrary to what we want. :(
|
I've solved this problem before. It is a great technique for dealing with source images of varying size. I would do it consistently for all images, instead of just the top row. To get it prepped quickly, here's what you do.
Get an image manipulation library and write a little program (could probably be done with a scripting language and an automatable image editor too) to generate 10 images based on pre-set relative "rectangle selections" of the source image.
Write a second quick program that does nothing other than display the 10 images on the screen and ask you to select which one looks best. Maybe add an option for "no good choice." You and your crew can split it up and shred through in a few hours. You will use this tool forever.
Tips:
1) Make the rectangle cuts relative to image dimensions.
2) Spend a couple of hours looking at a sampling to identify the "most likely hit" areas.
3) Make it so that you can start and stop the exercise anywhere in the list.
4) Make it so that you can redo this easily in the future... i.e. when you add a new platform or to keep up with new releases.
5) Make it so that the feedback only requires one click.
6) If you have others to help you, web-enable the app. The extra hands will make up for delays in rendering and display.
7) Add a capability to your site for people to submit their own replacement images. Give their name a star or some crap when they contribute.
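A rough sketch of the first program's core, in Python, assuming nothing about the real site. The ten relative rectangles below are placeholders (tip 2 is how you would pick real ones), and `candidate_boxes` and `pending` are hypothetical names. It only computes pixel boxes and tracks which titles still need a decision; the actual cropping and display are left to whatever image library and UI you use.

```python
from typing import Dict, List, Optional, Tuple

# Placeholder crop rectangles as (left, top, right, bottom) fractions of the
# source image. Assumption: ten presets, chosen here only for illustration.
RELATIVE_RECTS: List[Tuple[float, float, float, float]] = [
    (0.00, 0.00, 1.00, 0.50),  # top half
    (0.00, 0.25, 1.00, 0.75),  # middle band
    (0.00, 0.50, 1.00, 1.00),  # bottom half
    (0.00, 0.00, 1.00, 0.33),
    (0.00, 0.33, 1.00, 0.66),
    (0.00, 0.66, 1.00, 1.00),
    (0.10, 0.10, 0.90, 0.50),
    (0.10, 0.30, 0.90, 0.70),
    (0.10, 0.50, 0.90, 0.90),
    (0.00, 0.10, 1.00, 0.60),
]


def candidate_boxes(
    width: int,
    height: int,
    rects: List[Tuple[float, float, float, float]] = RELATIVE_RECTS,
) -> List[Tuple[int, int, int, int]]:
    """Scale each relative rectangle to pixel coordinates (tip 1)."""
    return [
        (round(l * width), round(t * height), round(r * width), round(b * height))
        for l, t, r, b in rects
    ]


def pending(titles: List[str], choices: Dict[str, Optional[int]]) -> List[str]:
    """Titles with no recorded decision yet, in list order (tip 3).

    `choices` maps a title to the index of the chosen crop, or None for
    "no good choice". Anything already in the dict is skipped on resume.
    """
    return [title for title in titles if title not in choices]
```

If you use Pillow, each tuple is in the (left, upper, right, lower) order that `Image.crop` takes. Saving `choices` to disk after every click is what makes stopping and restarting painless.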
Oh, also: Give achievement type things on your site later on. Website achievements make your site stickier.
Oh, also, also: Create a way for Netflix users to "friend" reccr on Netflix. You will find correlations between movie tastes and game tastes. That's a cool feature: Tell me your name on Netflix and I will recc games.
A final also: Are you able to access friend network data when people auth through Facebook? (I think so.) If you can, take advantage of that. Being able to offer new users reccs on the spot will be great.
Final, final also: Do you do any tagging for your games? If you don't have quantitative tags on your items, network-based recommendations fail. That's why your engine thinks that one guy wants all cave games just because other games on his list match a user who also likes STGs. If you're not going to fix that, then just make the whole thing transparent and say "People who like games that you like also like these, with the greatest overlap." Until you fix it, your "correct match %" will be very low permanently. If that's the case, you shouldn't rank recommendations. |
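To make the "just be transparent" option concrete, here is a minimal sketch in Python. It is not the site's engine; `overlap_recommendations` and the `likes` structure are hypothetical. It reports, for each game the user has not listed, how much overlap supports it, so the number can be shown as-is instead of being dressed up as a match percentage.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def overlap_recommendations(
    likes: Dict[str, Set[str]], user: str, top_n: int = 5
) -> List[Tuple[str, int]]:
    """Return (game, overlap) pairs for games liked by users who share likes.

    The overlap for a game is the total number of liked games that `user`
    has in common with the other users who like it. No tags are used, so
    this is pure "people who like what you like also like these".
    """
    mine = likes[user]
    support: Counter = Counter()
    for other, theirs in likes.items():
        if other == user:
            continue
        shared = len(mine & theirs)
        if shared == 0:
            continue  # no common ground, so this user tells us nothing
        for game in theirs - mine:
            support[game] += shared
    return support.most_common(top_n)
```

With likes of {A, B} for one user, {A, B, C} for a second and {A, D} for a third, the first user gets C with an overlap of 2 and D with an overlap of 1. That is an honest statement of what the data supports, nothing more.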
|
IIOIOOIOO double banned
Joined: 08 Dec 2006
|
Posted: Tue Feb 15, 2011 12:58 am |
|
|
Honestly, my recommendation would be to look through the rankings from that recent Netflix competition and hire a single person from one of the top 20 teams to give you guidance on how to do this. It's hard (very) to get beyond that first "hey, this looks like it might work" point and to a place where your accuracy will be high enough to retain visitors. If you have no money, offer an ownership stake. Everyone who finished behind rank 1 came up with an awesome approach but got nothing for it.
I'd offer myself up but am bound by multiple restrictive agreements as a result of past technology sales. I only offer this guidance freely because it's generally well known and discussed at public presentations, etc.
P.S. You won't need the image thing forever if you make this work. If you are able to drive sales, especially for back-catalog items, publishers will feed you the content in whatever way you like it so that their stuff will appear on your site. |
|
IIOIOOIOO double banned
Joined: 08 Dec 2006
|
Posted: Tue Feb 15, 2011 12:59 am |
|
|
| PPS: Cloud Tags are not meaningful meta-data. They're cool for natural search stuff but terrible for quantitative analysis. |
|
IIOIOOIOO double banned
Joined: 08 Dec 2006
|
Posted: Wed Feb 16, 2011 12:48 am |
|
|
Luvcraft, please: Let me get this one for you.
But seriously, and I mean no offense by this, the likelihood is high that your result accuracy will degrade as your network grows more dense, both as more users enter the system and as individuals rate more games. Your accuracy will end up lower than 20% r-squared. Eventually, you will end up with everyone being recommended hentai games because they like STGs. You will struggle spectacularly with people who have multiple niche genre interests, as those interests largely have poor inferential value from one to the next. As I said, it appears (from reported false hits) that your engine largely relies on "similar profile inference."
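For anyone unsure what the r-squared figure refers to, here is a small sketch in Python, assuming the usual coefficient of determination is meant: the share of variance in actual ratings that the predictions explain. The function name is just for illustration.

```python
from typing import Sequence


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot.

    1.0 means perfect predictions; 0.0 means no better than always
    guessing the mean rating; negative means worse than that.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    mean = sum(actual) / len(actual)
    ss_tot = sum((a - mean) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("actual ratings have no variance")
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1.0 - ss_res / ss_tot
```

Under that reading, "20% r-squared" would correspond to a return value of 0.2.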
If you intend to deliver this as a marketable product, the quality of the engine will need to be much better. Services like Netflix and Amazon have set the bar pretty high in this arena. Consumers have grown used to engines that "just work" and will not be pleased by poor performance in this area. Will your engine ever be able to predict whether a person will enjoy a game that just came out and almost nobody in your network has played? Nope. Will your engine ever be able to understand that a user JUST DOESN'T LIKE some type of game? I'm sure you're a great programmer but this is a difficult problem you're trying to solve and you seem resistant to feedback from a person who actually knows how to do what you're trying to do. Find someone to help you, even if it means diluting your ownership stake. Giving away 25% of all future profits on a service that actually grows is better than keeping 100% of profits from a venture which goes nowhere.
I appreciate new projects and I very much hope that you succeed. Please be willing to learn from those that came this way before you. |
|
IIOIOOIOO double banned
Joined: 08 Dec 2006
|
Posted: Thu Feb 17, 2011 6:53 am |
|
|
Sorry if I was confusing. English is not my first (or third) language.
Here's a good exercise to better understand. Get your hands on the Netflix sample data. Apply your algorithm using no metadata beyond Title and Format. Break the sample set into 5% chunks and feed your engine increasingly large sets (5%, 10%, 15%, etc.) while applying the resultant information to the prediction test files included. Up through 25% of the set, it probably gets better (because it goes from random to something). Beyond that, accuracy will likely begin decreasing again.
Associative set-based prediction engines that only use network selections (overlap in liked games between set participants) more or less always coalesce into highly segregated spires of recommendation overlaps and ultimately fail in their ability to distinguish "dislikes" unless there is a sufficiently strong relationship between the core area of overlap and tangential interests. I.e., if all people who like STGs also like H-games, your engine will do just fine. If not, accuracy will begin to degrade as "Set Megatrends" (sorry, struggling for a better way to describe this without maths) emerge through the growth of your network. Your problem is further complicated by not having any metadata to allow matrix/vector-based methods to distinguish the characteristics useful for predicting segmenting factors.
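The exercise could be wired up roughly like this in Python. Everything here is an assumption for illustration: ratings are (user, item, rating) tuples, `item_mean_model` is a trivial stand-in for your own engine, and error is measured as RMSE (lower is better) in place of an accuracy percentage.

```python
import math
import random
from typing import Callable, List, Tuple

Rating = Tuple[str, str, float]             # (user, item, rating)
Predictor = Callable[[str, str], float]     # (user, item) -> predicted rating


def item_mean_model(train: List[Rating]) -> Predictor:
    """Stand-in engine: predict each item's mean rating, else the global mean."""
    sums, counts = {}, {}
    for _user, item, rating in train:
        sums[item] = sums.get(item, 0.0) + rating
        counts[item] = counts.get(item, 0) + 1
    global_mean = sum(r for _, _, r in train) / len(train)

    def predict(_user: str, item: str) -> float:
        if item in counts:
            return sums[item] / counts[item]
        return global_mean

    return predict


def rmse(predict: Predictor, test: List[Rating]) -> float:
    """Root mean squared error of the predictor on held-out ratings."""
    return math.sqrt(sum((predict(u, i) - r) ** 2 for u, i, r in test) / len(test))


def learning_curve(
    train: List[Rating],
    test: List[Rating],
    fit: Callable[[List[Rating]], Predictor] = item_mean_model,
    step: float = 0.05,
    seed: int = 0,
) -> List[Tuple[float, float]]:
    """Fit on 5%, 10%, ... 100% of the training data and score each fit."""
    shuffled = train[:]
    random.Random(seed).shuffle(shuffled)   # fixed seed keeps runs comparable
    n_steps = round(1 / step)
    curve = []
    for k in range(1, n_steps + 1):
        size = max(1, round(len(shuffled) * k / n_steps))
        curve.append((k / n_steps, rmse(fit(shuffled[:size]), test)))
    return curve
```

Swap your own engine in through `fit` and look at where the curve bottoms out. If error starts climbing again as the fraction grows, that is the degradation described above.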
I should have better-clarified my challenge: Can your engine predict with anything above random accuracy that a user will like a game that NOBODY has rated yet? Everyone else's can.
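To show why metadata matters for that challenge, here is a minimal content-based sketch in Python. It is one simple way to score a game nobody has rated, not a claim about how any production engine works; the tag names and the functions `tag_profile` and `score_unrated` are made up for the example.

```python
from typing import Dict, Set


def tag_profile(
    liked: Set[str], disliked: Set[str], tags: Dict[str, Set[str]]
) -> Dict[str, float]:
    """Build a per-user tag weight: +1 per liked game, -1 per disliked game."""
    profile: Dict[str, float] = {}
    for game in liked:
        for tag in tags.get(game, ()):
            profile[tag] = profile.get(tag, 0.0) + 1.0
    for game in disliked:
        for tag in tags.get(game, ()):
            profile[tag] = profile.get(tag, 0.0) - 1.0
    return profile


def score_unrated(game_tags: Set[str], profile: Dict[str, float]) -> float:
    """Average profile weight over the new game's tags.

    Works for a game with zero ratings because it only needs the game's
    tags. A negative score is a predicted dislike, which pure overlap
    between users cannot express.
    """
    if not game_tags:
        return 0.0
    return sum(profile.get(tag, 0.0) for tag in game_tags) / len(game_tags)
```

A user who likes two games tagged "stg" and dislikes one tagged "visual-novel" gets a positive score for a brand new "stg" release and a negative one for a new "visual-novel", with no other user involved.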
Is there no source for the necessary metadata? If not, and if this is a commercial venture, consider using Mechanical Turk to gather it. You can probably collect that info for USD 0.10 per title and also have the opportunity to resell it.
I've invested more time than intended in this conversation. I am not going to solve this problem for you, for various reasons. As a starting point for your exploration, check out this article:
http://blog.smellthedata.com/2009/06/netflix-prize-tribute-recommendation.html
I think it does a good job of taking a complex problem and posing it in simple terms... especially for a developer. Read it, and the commentary, and you will see several techniques named which are worth investigating.
Welcome to the Internet. All of the easy problems were solved in the 90s. |
|
IIOIOOIOO double banned
Joined: 08 Dec 2006
|
Posted: Thu Feb 17, 2011 6:56 am |
|
|
| Also, good luck. I look on with interest. |
|
IIOIOOIOO double banned
Joined: 08 Dec 2006
|
Posted: Fri Oct 05, 2012 1:58 am |
|
|
| IIOIOOIOO wrote: |
| Luvcraft, do not write the recco algorithm yourself. Look up the top10 list for netflix's last round and offer a gainshare of 5% to anyone on the list who will let you use their engine. 10% if they manage the customization for your content. |
Sorry for your loss. Had you nailed it, you could've sold the algorithm and database to Gamefly, Inc before they went subscription-focused. |
|