On the heels of a report showing the inefficacy of government-run cyber security, it’s imperative to understand the limitations of your system and model. As that article shows, in addition to bureaucratic risk the government also needs to worry about gaming-the-bureaucracy risk! Government snafus aside, data science has enjoyed considerable success in the past few years. Despite this success, models can fail in surprising ways. Last year we saw how deep neural nets for image recognition fail on noisy data.
As these examples show, a lot can be learned by breaking models. Model builders of all stripes must consider the limitations of their models, and probing those limitations should be a requisite step in the validation stage. As a fun exercise, below I present some ways to confuse models at popular web destinations. Can you figure out how a model will fail based on this behavior?
Product Recommendations
Netflix
Netflix is known for using collaborative filtering, including matrix factorization methods like SVD.
Algorithm
- Choose a genre (e.g. Movies With A Strong Female Lead)
- For each movie, alternate the rating between 1 and 5 stars
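To see why this pattern is confusing, here is a minimal sketch of the two steps above. The movie titles are hypothetical placeholders, and the summary statistics are only a stand-in for whatever Netflix actually computes; the point is just that the ratings within the genre average out to "neutral" while being as spread out as possible.

```python
from statistics import mean, pstdev

def alternating_ratings(movies, low=1, high=5):
    """Rate movies in list order, alternating between low and high stars."""
    return {title: (low if i % 2 == 0 else high) for i, title in enumerate(movies)}

# Hypothetical placeholder titles standing in for one genre's catalog.
genre_movies = [f"movie_{i}" for i in range(8)]
ratings = alternating_ratings(genre_movies)

# Average rating for the genre and how widely the ratings are spread.
genre_mean = mean(ratings.values())
genre_spread = pstdev(ratings.values())

# Mean-centered ratings, a common preprocessing step in collaborative filtering.
centered = [r - genre_mean for r in ratings.values()]

print(genre_mean, genre_spread, sum(centered))
```

With an even number of titles the genre mean lands on 3 stars, so any model that summarizes taste by genre sees no preference at all, even though every individual rating is extreme.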
Amazon
Amazon is known for using item-to-item collaborative filtering.
Algorithm
Make a separate purchase for each item in a list. For each item do the following:
- Choose a dimension or combination of dimensions e.g. gender, age, department
- Browse related (i.e. similar) items in the given dimension
- Now browse items at the opposite end of that dimension (or something unrelated)
- Add the item you actually want to purchase to the cart
- Check out
Example: Choose a baby car seat. View car seats plus related items (e.g. strollers). Now view a bunch of scooters for old people, such as the Pride 3 Wheel Celebrity X Scooter. Now add your purchase item and check out.
Alternative: If you have disposable income, actually buy the car seat and scooter and donate them to a charity afterward.
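The browsing procedure above can be illustrated with a toy co-occurrence counter. This is only a sketch under simple assumptions: item names are hypothetical, and "related" just means "viewed in the same session", which is far cruder than anything Amazon actually runs.

```python
from collections import Counter
from itertools import combinations

def session_cooccurrence(sessions):
    """Count how often each unordered pair of items shares a session."""
    counts = Counter()
    for items in sessions:
        for a, b in combinations(sorted(set(items)), 2):
            counts[(a, b)] += 1
    return counts

def related_items(counts, target):
    """Collect co-occurrence counts for every item seen alongside target."""
    related = Counter()
    for (a, b), n in counts.items():
        if a == target:
            related[b] += n
        elif b == target:
            related[a] += n
    return related

# Hypothetical sessions from ordinary shoppers.
normal_sessions = [
    ["car_seat", "stroller"],
    ["car_seat", "stroller", "diaper_bag"],
]
# A session following the recipe: similar items, then the opposite extreme.
confusing_session = ["car_seat", "stroller", "mobility_scooter"]

before = related_items(session_cooccurrence(normal_sessions), "car_seat")
after = related_items(
    session_cooccurrence(normal_sessions + [confusing_session]), "car_seat"
)

print(before.most_common())
print(after.most_common())
```

A single session has little weight here, but each repetition adds another count linking the car seat to the scooter, which is why the recipe calls for a separate purchase per item.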
Social Media
The Facebook News …
Source: r-bloggers.com