Invisible Guides Through the Galaxy

We are visiting the digital galaxy again - today, as any other day. So grand and immense, expanding by the second. For years, we have roamed here, visiting place after place, yet we have not seen even a fraction of what it has to offer. You would think we would become rather overwhelmed at some point. Yet we seldom are. Maybe it's due to the invisible helpers we have by our side, guiding us through these various landscapes and obstacles toward our desired destination. Although many know these guides and interact with them every day, few know their name: recommender systems.

Alright, let’s get down to earth for a second and talk about these systems. This blog post aims to be a short introduction to the topic, including a bit about what recommender systems are, how they work, and why they are important. If you want the extremely short version, it can be summarized in one sentence: Recommender systems are systems which can recommend something to you based on information about you, other users, and the content they recommend to you. That’s it. Now, let’s dig in.

What Are Recommender Systems?

Since its beginning, the Internet has expanded immensely, containing more information to consume and options to choose from than ever before. To avoid being overwhelmed while traversing the digital landscape, users rely on their appointed guides - the recommender systems - to bring them safely to their destination. The guides provide the users with personalized recommendations by navigating and filtering content to fit the users’ preferences and needs. Today, the systems have become such an integrated part of the users’ lives that they almost appear invisible. For instance, in 2015 Netflix revealed that the choices for about 80% of the streaming hours were influenced by the recommendation algorithm. That is a lot!

Recommending a good movie to watch late at night is not the only thing these systems are good at. You can find them in all sorts of online services and you rarely go a day without using one. Picture this: On the way to work, you open Spotify and the first thing to pop up is the new Discover Weekly playlist, containing your latest obsessions. The break is spent scrolling through the Facebook feed, providing you with updates about friends and family, memes, celebrity news, and cute dogs. On the way home, you walk through a busy street filled with blissful couples and Christmas decorations. Your hand pulls up the phone, and swipes left and right until the matchmaking guru Tinder can find you a suitable date. The day is topped off by putting your feet up on the couch and beginning the great hunt for Christmas presents from the comfort of your home. You find yourself leaning on Amazon’s trusted recommendations, as your family never seems to wish for anything. All of these applications rely on recommender systems, and yet there exist even more variants.

How Do Recommender Systems Work?

You may have noticed that your Netflix feed looks quite different from the first time you opened it. Somehow, it has become much better at guessing which types of movies you would like to watch. It might make you wonder how Netflix does this. How is a machine able to know what sort of movie you would like to watch or which product you would like to buy? The short answer is machine learning - most often. But we don’t really like big, scary buzzwords in this blog post, so I will try to give a few pointers without getting too technical, just so we can grasp the concepts.

Let's begin by looking at the recommendation process. There are usually three phases: collecting user data, processing the data to generate recommendations, and returning a list of recommendations to the user. First, in order to make personalized recommendations, the system needs a lot of data about the user - the more, the merrier. This data is usually collected in two ways. One way is to ask the user for data explicitly - such as the user’s gender, age or nationality - when the user begins using the app. Another way is to implicitly collect the data through the user’s interactions with the system. For example, when a user clicks on a product, it might indicate the user’s interest in that sort of product. The same principle applies to the genres a user has previously listened to, or the videos they have liked.

Once the needed data has been collected, the system uses it to try to extract the user’s preferences and predict which items they will like. I make it sound as if this happens in one go, but that is not entirely true. As mentioned earlier, the system usually uses some form of machine learning. This means that the system keeps a model containing all the users and their preferences, which can be used to make recommendations. It would take a lot of time to update the model every time a user clicks on a new item, and therefore this is usually done periodically. For instance, if you watch a new movie on Netflix you may not notice much difference in your feed until the next day. So, when a user logs into a streaming platform, the system only needs to check the model for preferences and generate some recommendations to send back.
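To make the three phases a bit more concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the catalogue, the user names, the event log and the function names `rebuild_model` and `recommend_on_login` are assumptions, and simply counting genres is a stand-in for the machine learning model a real service would train.

```python
from collections import Counter, defaultdict

# Hypothetical catalogue: movie title -> genre.
CATALOGUE = {
    "Alien": "sci-fi",
    "Arrival": "sci-fi",
    "Dune": "sci-fi",
    "Notting Hill": "romance",
    "Love Actually": "romance",
    "Paddington": "family",
}

# Phase 1: implicit data, logged as (user, movie watched) while users interact.
event_log = [
    ("ana", "Alien"),
    ("ana", "Arrival"),
    ("ana", "Notting Hill"),
    ("ben", "Paddington"),
]

def rebuild_model(events):
    """Phase 2, run periodically (for example nightly): summarise each user."""
    genre_counts = defaultdict(Counter)
    watched = defaultdict(set)
    for user, movie in events:
        genre_counts[user][CATALOGUE[movie]] += 1
        watched[user].add(movie)
    return {"genre_counts": genre_counts, "watched": watched}

def recommend_on_login(model, user, top_n=2):
    """Phase 3: a cheap lookup against the precomputed model."""
    counts = model["genre_counts"].get(user, Counter())
    seen = model["watched"].get(user, set())
    unseen = [movie for movie in CATALOGUE if movie not in seen]
    # Rank unseen movies by how often the user watched that genre; ties by title.
    ranked = sorted(unseen, key=lambda movie: (-counts[CATALOGUE[movie]], movie))
    return ranked[:top_n]

model = rebuild_model(event_log)
print(recommend_on_login(model, "ana"))  # ['Dune', 'Love Actually']

# A new event is logged immediately, but the feed only changes after a rebuild.
event_log.append(("ana", "Dune"))
print(recommend_on_login(model, "ana"))  # still ['Dune', 'Love Actually']
model = rebuild_model(event_log)
print(recommend_on_login(model, "ana"))  # ['Love Actually', 'Paddington']
```

The point of the split is that the expensive step (`rebuild_model`) runs on a schedule, while the step that runs at login only reads from the finished model.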

Recommendation Strategies

Now, let’s look at some common strategies for deriving preferences and generating these recommendations. It all comes down to similarity - similarity between users and similarity between items. What do I mean by similar users? Have you ever bought a nice sweater in an online shop, and then been prompted with a pair of pants and a text saying: Other people who bought this sweater also bought these? That means there is a group of users out there who share some of your preferences, judging by your previous purchases. The same concept can be applied to movie recommendations. Netflix might look through your watch history, and search for other users who have watched many of the same movies. As you seem to have some common interests, Netflix might recommend some movies from those users’ watch histories which you have not seen yet. This strategy is called collaborative filtering. It is by far the most popular strategy and is used by many of the big applications.
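The watch-history idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the users and movies are invented, the function names are hypothetical, and Jaccard overlap is just one simple way to measure how similar two users are. It is not a description of what Netflix actually runs.

```python
# Hypothetical watch histories: user name -> set of movie titles.
watch_history = {
    "you":   {"Alien", "Blade Runner", "The Matrix"},
    "alex":  {"Alien", "Blade Runner", "The Matrix", "Arrival"},
    "sam":   {"Alien", "Notting Hill", "Love Actually"},
    "robin": {"Blade Runner", "The Matrix", "Dune"},
}

def jaccard(a, b):
    """Overlap between two watch histories (0 = nothing shared, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend_for_user(user, histories, top_n=3):
    """Score each unseen movie by the similarity of the users who watched it."""
    seen = histories[user]
    scores = {}
    for other, movies in histories.items():
        if other == user:
            continue
        similarity = jaccard(seen, movies)
        for movie in movies - seen:
            scores[movie] = scores.get(movie, 0.0) + similarity
    # Highest score first; ties are broken alphabetically.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [movie for movie, _ in ranked[:top_n]]

print(recommend_for_user("you", watch_history))
# ['Arrival', 'Dune', 'Love Actually']
```

Alex has watched everything you have, so Alex's one extra movie ends up at the top. Sam only shares a single movie with you, so Sam's romantic comedies get a much lower score.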

Content-based filtering is another strategy that tries to generate recommendations based on the similarity between items rather than users. An item may refer to a product, a movie, or anything which can be recommended to the users. In this case, the similarity is based on the content of the items and the available information about them. For example, take a webpage which recommends articles. If a user reads a lot of articles about a certain topic, such as Computer Science, the system will assume they are interested in that topic and find other related articles which they might like to read. However, this strategy is not well suited for all content types. In order to find out the characteristics of an item, there must be some way to analyze the content and extract this information, which makes the strategy rather hard to apply to videos and music.
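Here is a small sketch of the article example in Python. It assumes that every article already comes with a list of topic tags, which is exactly the kind of extracted information the paragraph above says can be hard to obtain. The articles, tags and function names are hypothetical, and cosine similarity over tag counts is one common choice rather than the only one.

```python
from collections import Counter
from math import sqrt

# Hypothetical articles: title -> topic tags describing the content.
articles = {
    "Intro to Sorting Algorithms": ["computer science", "algorithms"],
    "How Compilers Work": ["computer science", "programming"],
    "Graph Search Explained": ["computer science", "algorithms"],
    "Learning Python": ["programming", "python"],
    "Sourdough for Beginners": ["cooking", "baking"],
}

def build_profile(read_titles):
    """Count how often each tag appears in the articles the user has read."""
    profile = Counter()
    for title in read_titles:
        profile.update(articles[title])
    return profile

def cosine(profile, tags):
    """Cosine similarity between the user's tag counts and one article's tags."""
    item = Counter(tags)
    dot = sum(profile[tag] * count for tag, count in item.items())
    norm = sqrt(sum(v * v for v in profile.values())) * sqrt(sum(v * v for v in item.values()))
    return dot / norm if norm else 0.0

def recommend_articles(read_titles, top_n=2):
    """Rank unread articles by similarity to the user's reading profile."""
    profile = build_profile(read_titles)
    scores = {
        title: cosine(profile, tags)
        for title, tags in articles.items()
        if title not in read_titles
    }
    # Keep only articles that share at least one tag with the profile.
    related = [title for title, score in scores.items() if score > 0]
    return sorted(related, key=lambda title: (-scores[title], title))[:top_n]

print(recommend_articles(["Intro to Sorting Algorithms", "How Compilers Work"]))
# ['Graph Search Explained', 'Learning Python']
```

Notice that no other users are involved here. The recommendation depends only on what this one reader has read and on how the articles are described.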

There is also a third common strategy called hybrid filtering, which combines the two previous strategies. This is done to utilize their strengths and offset their weaknesses.
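One simple way to combine the two strategies is to blend their scores with a weight. The sketch below assumes both strategies have already produced a score between 0 and 1 for each item. The scores, the item names and the `hybrid_scores` function are made up for illustration, and real hybrid systems can combine the strategies in more elaborate ways.

```python
def hybrid_scores(collaborative, content_based, weight=0.5):
    """Blend two score dictionaries (item -> score between 0 and 1).

    weight is the share given to the collaborative score; the rest goes
    to the content-based score. Items missing from one side count as 0 there.
    """
    items = set(collaborative) | set(content_based)
    return {
        item: weight * collaborative.get(item, 0.0)
        + (1 - weight) * content_based.get(item, 0.0)
        for item in items
    }

# Hypothetical scores from the two strategies.
collaborative = {"Arrival": 0.9, "Dune": 0.6}
content_based = {"Dune": 0.8, "Brand New Release": 0.7}

blended = hybrid_scores(collaborative, content_based, weight=0.5)
ranking = sorted(blended, key=lambda item: (-blended[item], item))
print(ranking)  # ['Dune', 'Arrival', 'Brand New Release']
```

In this toy example, "Brand New Release" has no collaborative score at all, as would be the case for an item nobody has watched yet, but it still makes the list thanks to its content-based score.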

Why Are Recommender Systems Important?

We know that the greatest advantage of the recommender systems is their ability to help users navigate through content and find what they are looking for. And we know this is quite a valuable ability due to the enormous amount of content on the Internet. In other words, the users are gaining a great deal from using these systems. But what about the businesses behind these systems, what are they gaining?

There is an overwhelming number of businesses and services, all of which are fighting daily for the users’ attention. Therefore, providing the users with the best recommendations and experience has a great impact on business, and is taken quite seriously by many companies. Let’s look at a quick example. In 2006, Netflix launched an open competition where teams could compete to find the best collaborative filtering algorithm to predict users’ movie ratings. The competition lasted for almost three years, and in 2009 the winning team received a staggering US$1,000,000! The time and money Netflix spent on this competition demonstrates how important the recommendation experience is to many companies. After all, the quality of recommendations may determine whether a company gains or loses a customer, or how much time a customer spends on their platform.

It might be good to keep in mind that the recommender systems do not only cater to the users’ needs, but also serve the interests of the company. Therefore, there might come a time when the company’s interests and the users’ interests are in conflict. For example, a social media platform which tries to keep the user entertained for as long as possible may not be considerate of the user’s personal goal to spend less time there.

So, What Have We Learned?

Let’s summarize. We have now learned that recommender systems are systems which recommend interesting content and products to the users. This can be achieved by extracting the user’s preferences from either data about users (collaborative filtering), data about items (content-based), or both (hybrid). The systems mitigate the problem of information overload on the Internet, and are beneficial to users and businesses alike. The conclusion? I get to watch good movies. Netflix gets my money. Everyone’s happy.

Relevant resources recommended by the author

Recommender Systems: The Textbook - link.springer.com

How Recommender Systems Work (Netflix/Amazon) - www.youtube.com

Introduction to Recommender System - towardsdatascience.com

Recommender systems: introduction and challenges.