Even in the time of COVID-19, Meetups are still happening virtually! Last week I tuned into Dataiku's talk on "Recommender Systems and Deep Models at Etsy" presented by Moumita Bhattacharya, Senior Data Scientist at Etsy. This was an introduction to recommender systems for me and I thought it would be an interesting topic to share, as recommender systems are present on many websites that people use on a daily basis.
For those unfamiliar with Etsy, it is a two-sided marketplace where sellers offer handmade items and buyers go to purchase personalized handmade goods. One feature of Etsy is that it provides personalized item recommendations to buyers based on their previous interactions on the site: in other words, a recommender system.
Recommender systems rose to prominence with the Netflix Prize, a challenge to predict user ratings for movies a user had not seen before. Using matrix factorization, a class of collaborative filtering algorithms, competitors were able to predict a user's rating for an unseen movie using other users' ratings across all movies. While the Netflix Prize was a pioneer in the field of recommender systems, one of the main challenges of this approach is that it expected all users to provide some kind of explicit feedback. Additionally, there are questions about how trustworthy that feedback is; are users even watching the entire movie before providing a rating? Or are they just giving it 5 stars because they know that it's a good movie?
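To make the matrix factorization idea concrete, here is a minimal sketch: we learn a low-dimensional vector for each user and each movie so that their dot product approximates the observed ratings, then use those vectors to predict a rating the user never gave. All data, hyperparameters, and the simple SGD training loop below are illustrative, not the actual Netflix Prize solutions.

```python
import numpy as np

def factorize(ratings, n_factors=2, lr=0.02, reg=0.02, epochs=1000, seed=0):
    """ratings: list of (user_idx, item_idx, rating) triples.
    Returns user factors U and item factors V with U[u] @ V[i] ~ rating."""
    rng = np.random.default_rng(seed)
    n_users = max(u for u, _, _ in ratings) + 1
    n_items = max(i for _, i, _ in ratings) + 1
    U = rng.normal(scale=0.1, size=(n_users, n_factors))
    V = rng.normal(scale=0.1, size=(n_items, n_factors))
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - U[u] @ V[i]          # prediction error on this rating
            u_row = U[u].copy()            # snapshot before updating U[u]
            U[u] += lr * (err * V[i] - reg * U[u])
            V[i] += lr * (err * u_row - reg * V[i])
    return U, V

# Toy ratings: users 0 and 1 share taste; user 2 likes the opposite movies.
ratings = [(0, 0, 5), (0, 1, 4), (1, 0, 5), (1, 1, 4), (1, 2, 1),
           (2, 0, 1), (2, 2, 5)]
U, V = factorize(ratings)
# Predict user 0's rating for movie 2, which user 0 never rated:
predicted = U[0] @ V[2]
```

Because user 0's factors end up close to user 1's, the prediction for movie 2 is pulled toward user 1's low rating, which is exactly the "other users' ratings" mechanism described above.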
This is where Etsy did things differently; instead of using explicit user feedback, they turned to implicit user feedback. Implicit feedback includes all of a user's interactions on a site: the items they purchased, the items they favorited, the items they clicked on, and even the items they saw but did not interact with. Etsy wanted to use this kind of information, in combination with user and listing features, to train machine learning models that predict the probability of purchase as their recommender system. With these probabilities of purchase, they could also infer how new users might make purchases.
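One common way to operationalize implicit feedback is to turn each interaction type into a label plus a confidence weight for the loss function. The event weights below are purely hypothetical; the talk did not disclose Etsy's actual scheme.

```python
# Hypothetical confidence weights per interaction type. Stronger signals
# (a purchase) count more in training than weak ones (a mere impression).
EVENT_WEIGHT = {"purchase": 10.0, "favorite": 4.0, "click": 1.0, "impression": 0.1}

def build_training_rows(events):
    """events: list of (user_id, item_id, event_type) tuples.
    Returns (user, item, label, confidence) rows, where label is 1 only
    for a purchase and confidence weights that row's contribution to the loss."""
    rows = []
    for user, item, event in events:
        label = 1 if event == "purchase" else 0
        rows.append((user, item, label, EVENT_WEIGHT[event]))
    return rows

rows = build_training_rows([
    ("u1", "handmade_mug", "purchase"),
    ("u1", "wool_scarf", "favorite"),
    ("u1", "print_set", "impression"),
])
```

Note that even the no-interaction impressions become (weak) negative examples, which is what lets the model learn from items a user saw and skipped.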
A problem that Etsy faced is that they had too many users and too many product listings to compute all of these purchase probabilities directly. Therefore, they did it in a two-stage process: 1) "Candidate Selection," in which millions of listings are filtered down to hundreds of relevant items, and 2) "Ranker," in which those items are ordered by relevance using a more precise (and more expensive) model.
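The two stages can be sketched as a cheap scoring pass over the full catalogue followed by a precise re-ranking of the survivors. The embeddings, catalogue size, and scoring functions below are stand-ins for illustration, not Etsy's production models.

```python
import numpy as np

rng = np.random.default_rng(42)

# Pretend embeddings: one user vector and a large catalogue of listing vectors.
user_vec = rng.normal(size=8)
item_vecs = rng.normal(size=(10_000, 8))

def candidate_selection(user_vec, item_vecs, k=100):
    """Stage 1: cheap dot-product scoring over every listing,
    keeping only the top-k candidates."""
    scores = item_vecs @ user_vec
    return np.argsort(scores)[-k:][::-1]

def ranker(user_vec, item_vecs, candidates, score_fn):
    """Stage 2: a slower, more precise scoring function applied
    only to the short candidate list."""
    return sorted(candidates,
                  key=lambda i: score_fn(user_vec, item_vecs[i]),
                  reverse=True)

candidates = candidate_selection(user_vec, item_vecs)
# Stand-in for the real ranking model: a (made-up) penalized similarity.
ranked = ranker(user_vec, item_vecs, candidates,
                lambda u, v: float(u @ v) - 0.01 * float(v @ v))
```

The point of the split is cost: the expensive model runs on hundreds of items instead of millions, while the cheap pass keeps recall high enough that the good items survive to stage two.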
Originally, the second-stage "Ranker" used a linear model, but more recently there has been experimentation with non-linear models and deep neural networks (DNNs); a DNN is better suited here because it can capture non-linear interactions between user and listing features that a linear model misses. After experimentation, it was found that a 3-layer neural network was optimal for predicting the likelihood of purchase.
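A 3-layer network of this kind can be sketched as a plain forward pass: two hidden layers with non-linear activations, then a sigmoid output squashed into a purchase probability. The layer widths and activations here are assumptions; the talk only stated that a 3-layer network worked best.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative layer widths: 16 input features -> two hidden layers -> 1 output.
sizes = [16, 32, 16, 1]
weights = [rng.normal(scale=0.1, size=(m, n)) for m, n in zip(sizes, sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def purchase_probability(features):
    """Forward pass: hidden layers use ReLU, the output layer a sigmoid,
    so the result is a probability in (0, 1)."""
    h = features
    for W, b in zip(weights[:-1], biases[:-1]):
        h = relu(h @ W + b)
    return float(sigmoid(h @ weights[-1] + biases[-1])[0])

p = purchase_probability(rng.normal(size=16))
```

In a real system the weights would of course be trained (e.g. by minimizing a weighted cross-entropy over the implicit-feedback labels) rather than randomly initialized.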
Once Etsy developed their recommender system for relevant items, they then wanted to see if they could optimize for both relevance and profit at the same time. So, instead of only showing users relevant items, or showing them items priced from highest to lowest, they wanted to find an optimum where they could show relevant items without compromising on revenue. Therefore, they introduced a revenue term into their model, and were successful at optimizing for revenue without compromising on relevance.
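One simple way to picture a revenue term is as a blended ranking score: relevance (the purchase probability) plus a weighted expected-revenue component. The formula and the trade-off weight `alpha` below are hypothetical; the talk did not specify the exact form of Etsy's objective.

```python
def revenue_aware_score(p_purchase, price, alpha=0.1):
    """Blend relevance with expected revenue (probability * price).
    alpha is a hypothetical trade-off weight; alpha = 0 recovers
    the pure-relevance ranking."""
    return p_purchase + alpha * p_purchase * price

# (name, purchase probability, price) — made-up listings.
listings = [("mug", 0.30, 10.0), ("quilt", 0.28, 100.0), ("sticker", 0.05, 2.0)]

by_relevance = sorted(listings, key=lambda x: x[1], reverse=True)
by_blend = sorted(listings,
                  key=lambda x: revenue_aware_score(x[1], x[2]),
                  reverse=True)
```

With `alpha > 0`, the pricier quilt can overtake the mug despite a slightly lower purchase probability, which is the "optimize revenue without abandoning relevance" trade-off in miniature.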
When it came to evaluating the model, the metrics used included Area Under the Curve (AUC) for relevance, Normalized Discounted Cumulative Gain (NDCG) for ranking quality in terms of both relevance and price, and price-based metrics such as profit. When compared against linear, logistic, and weighted logistic regression baselines, the new revenue-relevance model was in fact able to attain the highest profit and AUC among these. And we can see below an example of how different the recommended results are:
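For readers unfamiliar with the two ranking metrics named above, here is a minimal sketch of each: AUC as the fraction of (positive, negative) pairs the model orders correctly, and NDCG as the discounted gain of the model's ordering divided by that of the ideal ordering. The toy inputs are illustrative.

```python
import numpy as np

def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly;
    ties count as half. labels are 0/1, scores are model outputs."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = [(p > n) + 0.5 * (p == n) for p in pos for n in neg]
    return sum(wins) / len(wins)

def ndcg(relevances):
    """relevances: gains listed in the order the model ranked the items.
    Returns DCG of that order divided by DCG of the ideal order."""
    def dcg(rels):
        return sum(r / np.log2(i + 2) for i, r in enumerate(rels))
    return dcg(relevances) / dcg(sorted(relevances, reverse=True))

perfect_auc = auc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1])
perfect_ndcg = ndcg([3, 2, 1])   # items already in ideal order
worse_ndcg = ndcg([1, 2, 3])     # best item ranked last
```

A revenue-aware ranker is typically checked against both: AUC guards the relevance side, while an NDCG computed over price-weighted gains reflects how much revenue the top of the list carries.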