Boardgames-O-Matic
My capstone project helps users of BoardGameGeek.com (BGG), the premier board game website in the world, navigate the sea of available board games and identify games to check out based on their past rating habits. Coming into this course, I was determined to work on recommender systems for my capstone and came up with three ideas. The first two were food-related and the third had to do with board games. I chose board games because the dataset was easy to obtain and users explicitly rate games on a scale of 1-10 based on a rubric.

I did a small amount of web scraping to extract a portion of the games for my analysis and made API calls to the website to obtain user and rating information. In all, I obtained almost 8 million ratings from more than 120,000 users across 1,807 games. With a sparse rating matrix in hand, I used the cosine similarity neighborhood method and a couple of latent factor methods to derive a top 20 list for each user. I created a base recommender class with a subclass for each model. I conducted offline evaluation using RMSE, and the results showed that the best method was non-negative matrix factorization with weighted alternating least squares, although an SVD with 10 latent factors produced very similar top 20 lists with an RMSE that differed by only 0.0001.

I then built a web app with Flask, hosted on AWS, that gave users who had rated at least 10 games on BGG the opportunity to get three top 20 lists from three different models and to rate whether each list was useful and accurate. Within two days of going live, I received 222 respondents, with about 74% of them liking at least one of the lists. Looking back, I am most proud of seeing this data project through the entire pipeline, from data collection, EDA, and modeling to obtaining live online feedback on the usefulness of the system, which actually matters more than offline evaluation metrics.
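
To make the neighborhood approach and the RMSE evaluation concrete, here is a minimal sketch in pure Python. It is not the project's actual code: the toy ratings, the game and user names, and every function name (`item_vectors`, `cosine`, `predict`, `top_n`, `rmse`) are hypothetical, and it assumes an item-based variant of cosine similarity, which the summary above does not specify. A matrix with millions of ratings would need a sparse matrix library rather than plain dictionaries.

```python
import math

# Made-up toy data: user -> {game: rating on the 1-10 scale}.
ratings = {
    "user_1": {"game_a": 8, "game_b": 7, "game_c": 9},
    "user_2": {"game_a": 6, "game_b": 7, "game_d": 8},
    "user_3": {"game_a": 9, "game_c": 8, "game_d": 5},
}


def item_vectors(ratings):
    """Invert user -> game ratings into game -> {user: rating}."""
    vectors = {}
    for user, games in ratings.items():
        for game, score in games.items():
            vectors.setdefault(game, {})[user] = score
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse rating vectors (missing = 0)."""
    dot = sum(a[u] * b[u] for u in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(user, game, ratings, vectors):
    """Similarity-weighted average of the user's ratings for other games."""
    numerator = denominator = 0.0
    for other, score in ratings[user].items():
        if other == game:
            continue
        sim = cosine(vectors[game], vectors[other])
        numerator += sim * score
        denominator += abs(sim)
    return numerator / denominator if denominator else None


def top_n(user, ratings, vectors, n=20):
    """Rank the games the user has not rated by predicted score."""
    unrated = [g for g in vectors if g not in ratings[user]]
    scored = [(g, predict(user, g, ratings, vectors)) for g in unrated]
    scored = [(g, s) for g, s in scored if s is not None]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:n]


def rmse(pairs):
    """Root mean squared error over (predicted, actual) rating pairs."""
    return math.sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))


vectors = item_vectors(ratings)
recommendations = top_n("user_1", ratings, vectors)
```

In this sketch, `recommendations` holds (game, predicted score) pairs for the games `user_1` has not rated. For offline evaluation, one would hold out some known ratings, predict them, and pass the (predicted, actual) pairs to `rmse`.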