What is a recommender system (or recommendation engine)?

Dec 15th 2021

Have you had to make a big decision recently?

Maybe you needed to choose where to move or were in the market for a big-ticket item like a mattress. Maybe you did a ton of research online and asked your friends or nearest neighbors their opinions to try to land on a definitive answer about which choice would be the absolute best, the one that would bring you 100% satisfaction: the city recommended as the #1 place to live, with no air pollution and great schools. Or the product with all five-star reviews (not just 4.5) and exactly the right features for you; the crystal-clear winner.

If you’re like millions of people weighing decisions, you got overwhelmed. You had trouble deciding on the size, color, make, model, the optional features. You felt yourself sinking into mental quicksand, with your clarity about the best choice growing muddier by the minute.

More choices, less happiness

You can end up feeling “decisioned out” to the point that you don’t even have the energy to make smaller, less consequential choices, either.

If you’re like most people who want to make perfect choices, you’ve been afflicted by “choice paralysis,” an inability to choose anything at all because there are simply too many options with too many variations to sift through. Maybe you’ve ended up walking away from buying a new item after all, or giving up and randomly picking from seemingly popular items. You don’t have all day.

The phenomenon of “too much choice” sounds like a joke, but it’s a problem that has been widely documented. Hick’s Law states that the time required to reach a decision increases logarithmically with the number of choices available. And authors have written whole books about it, including The Paradox of Choice: Why More Is Less, by Barry Schwartz. “The fact that some choice is good doesn’t necessarily mean that more choice is better,” he says.

Bleak and potentially depressing as it can be for people to have too many choices, the good news is that recommender systems, built on data science, can narrow the options and make choosing easier.

What’s a recommendation engine?

What exactly is a “recommender system”? It’s a data-filtering tool that uses machine-learning algorithms to determine consumer behavior patterns, predict user ratings or preferences they’d have, and recommend relevant items to them. In short, it’s software that filters information to help users choose, and it provides them with the most relevant user-based suggestions.

A recommender system narrows the number of items a particular user has to consider so that they can keep browsing calmly, confidently select an item, and feel good enough about the experience that they could do it again, or even want to.

Recommender systems are all the rage. Most major tech companies that have an online presence are taking advantage of the ability to make “intelligent” recommendations based on user profiles, which, in addition to improving customer satisfaction, helps companies optimize by boosting their user engagement and retention rates.

For example:

YouTube suggests videos to watch, as well as which ones to automatically start playing next, based on the person’s user history
Facebook suggests adding new Facebook friends that users may know in the real world (offline)
LinkedIn suggests jobs that people may be interested in applying for based on various factors

Classic online retailers are also big fans of recommender systems. The most notable example and pioneering leader in product recommendations driven by deep learning is Amazon.com: In 2021, the site accounted for half of all U.S. ecommerce sales (Statista).

Big tech is the primary arena for personalized recommendations, but this technology is also being used for more-specific and focused applications, such as restaurant choices, online dating, social media, and financial services.

How recommendation engines work

What goes on in the inner workings of a recommender system? A recommendation engine does its work in four phases:

1. Data collection

Data is gathered from people in two ways: implicitly, without their knowledge; and explicitly.

Implicit data gathering consists of the software essentially following the user around online and noting what they do. When they go to the same page three times in a row or put an item in their cart, the data collection functionality takes note and, judging by what’s collected, starts to infer the user’s typical behavior.

Explicit data gathering takes place when a user is asked to or voluntarily offers information, such as when they write a review, rate a product, like a social media post or leave a comment, or follow an author or musician. This type of data collection also infers user preferences.

One drawback of explicit data gathering is that since people don’t always take the time to rate or review products they buy or provide information in other ways, the available data may be incomplete.
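
As a rough illustration of how implicit and explicit signals might be combined, the sketch below folds a stream of events into one preference score per user-item pair. The event names, the weights, and the `build_preferences` helper are assumptions made for this example, not values taken from any particular system.

```python
from collections import defaultdict

# Hypothetical weights: how strongly each implicit event signals interest.
IMPLICIT_WEIGHTS = {"view": 1.0, "add_to_cart": 3.0, "purchase": 5.0}


def build_preferences(events):
    """Fold (user, item, event_type, value) tuples into {user: {item: score}}."""
    scores = defaultdict(lambda: defaultdict(float))
    for user, item, event_type, value in events:
        if event_type == "rating":
            # Explicit signal: center a 1-5 star rating on 3 so that
            # low ratings count against the item.
            scores[user][item] += 2.0 * (value - 3)
        else:
            # Implicit signal: unknown event types contribute nothing.
            scores[user][item] += IMPLICIT_WEIGHTS.get(event_type, 0.0)
    return {user: dict(items) for user, items in scores.items()}


events = [
    ("ana", "mattress", "view", None),
    ("ana", "mattress", "view", None),
    ("ana", "mattress", "view", None),
    ("ana", "mattress", "add_to_cart", None),
    ("ana", "pillow", "rating", 1),
    ("ben", "pillow", "purchase", None),
]
prefs = build_preferences(events)
```

Here three page views plus one add-to-cart give "ana" a positive score for the mattress, while her one-star rating pushes the pillow below zero. Note that "ben" has no rating at all, which is the incompleteness problem described above.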

2. Storage

All of this interesting user data is filed away (for instance, in a NoSQL or standard SQL database) for future reference.
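
A minimal sketch of the storage step, assuming a single SQL table of interaction events. It uses Python’s built-in `sqlite3` module with an in-memory database; the table and column names are hypothetical, and a production system would likely use a different store and schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute(
    """
    CREATE TABLE interactions (
        user_id    TEXT NOT NULL,
        item_id    TEXT NOT NULL,
        event_type TEXT NOT NULL,   -- e.g. 'view', 'add_to_cart', 'rating'
        value      REAL,            -- rating value, NULL for implicit events
        created_at TEXT DEFAULT CURRENT_TIMESTAMP
    )
    """
)
rows = [
    ("ana", "mattress", "view", None),
    ("ana", "mattress", "add_to_cart", None),
    ("ben", "mattress", "rating", 5.0),
]
conn.executemany(
    "INSERT INTO interactions (user_id, item_id, event_type, value) "
    "VALUES (?, ?, ?, ?)",
    rows,
)
conn.commit()

# Later, the analysis step can read the events back, grouped per item.
event_counts = dict(
    conn.execute(
        "SELECT item_id, COUNT(*) FROM interactions GROUP BY item_id"
    ).fetchall()
)
```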

3. Analysis

Through analysis (which can run in batch, in real time, or in near real time), the system looks at data relationships and identifies items that show similar user engagement data.

4. Filtering

Finally, the data is filtered to collect relevant information so that intelligent, appropriate recommendations can be made to users. Different algorithms can be used when recommending products. Content-based algorithms recommend items similar to those the user has already engaged with, whereas collaborative and cluster-based algorithms also draw on what other users are doing.

The importance of relationships in a recommendation system

The relationships between elements in the collected data are the “glue” that gives recommender systems an understanding of customers’ preferences and helps them know what people want.

Three types of relationship between users and items are looked at in data analysis:

The user-item relationship: Users have preferences for certain items and types of products
The item-item relationship: People may like items that are similar looking or described in a similar way, such as books in the same genre or foods in the same type of cuisine
The user-user relationship: People with similar backgrounds (for example, in the same age group) may have similar tastes

In addition to focusing on these relationships, recommender systems look at data about:

User behavior: details about people’s online activity, collected when they do things like go to particular web pages, give product ratings, click on items, engage with content (e.g., by watching a movie on Netflix), and buy things
User demographics: This is what it sounds like — information about age, ethnicity, gender, education, career, family structure, income, and religious beliefs
Product attributes: product-related information such as item genre or type of cuisine

As long as we’re making lists of three…there are also three different types of recommender system: those that do collaborative filtering, those that do content-based filtering, and those that do both types.

Collaborative filtering

A collaborative filtering recommender engine focuses on collecting and analyzing data on user behavior, activities, and preferences in order to predict what a user likes based on their similarity to other users. It does not analyze or understand the content itself (the product, book, or video).

This type of filtering:

Uses a matrix-style formula to suggest items based on what it knows about the user
Applies the logic that if one user likes item A and another user likes both item A and item B, then the first user might also like item B
Predicts new interactions based on earlier ones
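
The “likes A, so might like B” logic in the list above can be sketched in a few lines. The data and the `also_liked` helper are hypothetical; the function only counts how often unseen items appear among users who share at least one liked item with the target user.

```python
from collections import Counter

# Hypothetical data: the set of items each user has liked.
likes = {
    "ana": {"A"},
    "ben": {"A", "B"},
    "cal": {"A", "B", "C"},
    "dee": {"D"},
}


def also_liked(target_user, likes):
    """Rank unseen items by how many like-minded users liked them."""
    seen = likes[target_user]
    votes = Counter()
    for other, items in likes.items():
        if other == target_user or not (seen & items):
            continue  # skip the target and users with no liked item in common
        votes.update(items - seen)  # one vote per unseen item
    return [item for item, _ in votes.most_common()]


suggestions = also_liked("ana", likes)  # B gets two votes, C gets one
```

Nothing in this sketch looks at what the items actually are; it works purely from overlapping behavior.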

There are two ways of filtering collaboratively: memory based and model based.

Memory-based collaborative filtering methods

This type of collaborative filtering:

Identifies clusters of similar users, then uses the interactions of some users in a cluster to predict those of the others
Identifies clusters of items rated by user A, then uses them to predict user A’s interaction with a similar item B
Can struggle with matrix sparsity, because the number of user-item interactions may be too low to generate high-quality clusters
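
One possible user-based version of this approach is sketched below. It assumes cosine similarity computed over the items two users have both rated, which is one of several common similarity measures; the ratings and function names are made up for the example.

```python
import math

# Hypothetical ratings: {user: {item: rating on a 1-5 scale}}.
ratings = {
    "ana": {"book1": 5, "book2": 1},
    "ben": {"book1": 5, "book2": 1, "book3": 5},
    "cal": {"book1": 1, "book2": 5, "book3": 1},
}


def cosine(u, v):
    """Cosine similarity over the items both users have rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)


def predict(user, item, ratings):
    """Similarity-weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine(ratings[user], other_ratings)
        num += sim * other_ratings[item]
        den += abs(sim)
    return num / den if den else None


predicted = predict("ana", "book3", ratings)  # roughly 3.9
```

Because "ana" rates like "ben" and unlike "cal", the prediction for book3 is pulled toward ben’s 5. When users share no rated items, every similarity is 0 and `predict` returns `None`, which is the sparsity problem in miniature.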

Model-based collaborative filtering methods

This type of collaborative filtering:

Relies on machine learning and data-mining techniques
Strives to train models to make predictions; for example, it might use existing interactions a user has had with an item to help a model predict the person’s top-5 liked items
Has an advantage over memory-based collaborative filtering in that it can recommend larger numbers of items to larger numbers of users
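
As one example of a model-based method, the sketch below trains a small matrix factorization model with stochastic gradient descent and then ranks a user’s unrated items. Matrix factorization is only one of many possible models, and the ratings, factor count, learning rate, and regularization value here are assumptions chosen for illustration.

```python
import random

# Hypothetical observed ratings as (user_index, item_index, rating) triples.
observed = [
    (0, 0, 5.0), (0, 1, 4.0),
    (1, 0, 5.0), (1, 1, 4.0), (1, 2, 1.0),
    (2, 1, 1.0), (2, 2, 5.0),
]
n_users, n_items, n_factors = 3, 3, 2

rng = random.Random(0)  # fixed seed so runs are repeatable
P = [[rng.uniform(0.0, 0.5) for _ in range(n_factors)] for _ in range(n_users)]
Q = [[rng.uniform(0.0, 0.5) for _ in range(n_factors)] for _ in range(n_items)]


def predict(u, i):
    """Predicted rating: dot product of user and item factor vectors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))


def rmse():
    """Root-mean-square error over the observed ratings."""
    total = sum((r - predict(u, i)) ** 2 for u, i, r in observed)
    return (total / len(observed)) ** 0.5


rmse_before = rmse()
learning_rate, reg = 0.01, 0.02
for _ in range(500):
    for u, i, r in observed:
        err = r - predict(u, i)
        for f in range(n_factors):
            pu, qi = P[u][f], Q[i][f]
            # Gradient step with L2 regularization on both factor vectors.
            P[u][f] += learning_rate * (err * qi - reg * pu)
            Q[i][f] += learning_rate * (err * pu - reg * qi)
rmse_after = rmse()

# Rank the items user 0 has not rated and keep the top 5.
rated_by_0 = {i for u, i, _ in observed if u == 0}
unseen = [i for i in range(n_items) if i not in rated_by_0]
top = sorted(unseen, key=lambda i: predict(0, i), reverse=True)[:5]
```

Once trained, the model can score any user-item pair with a single dot product, which is part of why this style of approach can scale to larger catalogs and user bases.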

One drawback of collaborative filtering is that it takes time and has to start somewhere. A new user arriving on the site, or a new product being introduced, creates a cold-start problem: the system needs enough user-item interactions to work, so it can’t give useful product recommendations until the person has interacted with enough items (or enough people have interacted with the product). Companies may be able to decrease the lag time by directly asking people for data (such as their age or interests) when they sign up on a site, and pairing that collected information with item metadata that’s already available to relate the new user to the existing items.
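
The sign-up workaround can be sketched as matching a new user’s stated interests against item metadata. Everything here is hypothetical: the tags, the catalog, and the `cold_start_recommend` helper are stand-ins for whatever metadata a real site already has.

```python
# Hypothetical item metadata: tags already known for each catalog item.
item_tags = {
    "tent": {"outdoors", "camping"},
    "novel": {"books", "fiction"},
    "cookbook": {"books", "cooking"},
}


def cold_start_recommend(signup_interests, item_tags, k=2):
    """Rank items by how many of their tags match interests given at sign-up."""
    scored = [
        (len(tags & signup_interests), item)
        for item, tags in item_tags.items()
    ]
    # Keep items with at least one match; highest overlap first, ties by name.
    matches = sorted(
        ((score, item) for score, item in scored if score > 0),
        key=lambda pair: (-pair[0], pair[1]),
    )
    return [item for _, item in matches[:k]]


recs = cold_start_recommend({"books", "cooking"}, item_tags)
```

A new user who ticks “books” and “cooking” gets the cookbook first and the novel second, without having clicked on anything yet.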

Content-based filtering

With filtering according to content, similar items are grouped together based on their features. Recommendation algorithms take into account the customer’s preferences plus descriptions of items (e.g., their genre, product type, color, or word length). You know a system is filtering based on content when the interface tells you “If you like that, you might also like this.” Or when it shows you an item that it remembers you were interested in in the past, thinking you could still be interested in it.
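
A minimal sketch of “If you like that, you might also like this,” assuming each item is described by a set of features and that similarity is measured with the Jaccard index (shared features divided by all features). The feature sets and helper names are illustrative, and real systems often use richer representations such as text embeddings.

```python
# Hypothetical item features (genre, product type, and so on).
features = {
    "dune": {"sci-fi", "novel", "classic"},
    "neuromancer": {"sci-fi", "novel", "cyberpunk"},
    "cookbook": {"cooking", "reference"},
}


def jaccard(a, b):
    """Share of features two items have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_items(liked_item, features, k=2):
    """Items most similar to `liked_item`, excluding the item itself."""
    scores = {
        other: jaccard(features[liked_item], feats)
        for other, feats in features.items()
        if other != liked_item
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


you_might_like = similar_items("dune", features, k=1)
```

No other users appear anywhere in this code; the recommendation comes entirely from the item descriptions.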

What are the drawbacks of a recommendation engine using content-based filtering?

You have limited insight; you can only recommend what’s similar to what a person is viewing, buying, or using
If user interaction shows that someone is interested only in certain categories, the system has no way to reliably recommend items in other categories
There’s not enough data available about new users when they first get started

Hybrid approaches

A recommendation engine that uses hybrid filtering utilizes both collaborative and content-based data. Because each approach can cover for the other’s blind spots, a hybrid can often produce more relevant recommendations than either one alone.

Netflix is a good example of a company using hybrid recommender filtering on its website. It accounts for users’ interests (collaborative filtering) and also for movie descriptions and features (content-based filtering).
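
One simple way to combine the two approaches is a weighted blend of their scores, sketched below. This is an illustrative assumption, not a description of how Netflix or any other company builds its system; the scores, the `weight` parameter, and the `hybrid_scores` helper are hypothetical.

```python
def hybrid_scores(collab, content, weight=0.5):
    """Blend two {item: score} dicts; `weight` is the collaborative share."""
    items = set(collab) | set(content)
    return {
        item: weight * collab.get(item, 0.0)
        + (1 - weight) * content.get(item, 0.0)
        for item in items
    }


# Hypothetical scores in the 0-1 range from the two recommenders.
collab = {"movie_a": 0.9, "movie_b": 0.2}
content = {"movie_b": 0.8, "movie_c": 0.6}

blended = hybrid_scores(collab, content, weight=0.5)
ranking = sorted(blended, key=blended.get, reverse=True)
```

With equal weights, movie_b ranks first because both recommenders give it some support, and movie_c still gets a score even though no collaborative data exists for it.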

Decisions, decisions

Recommender systems are a sanity saver: they can help people break through choice paralysis, confidently choose what’s best, and keep enough energy and brain cells for those bigger, tougher decisions.

Will we get to the point where we delegate even our “big” decisions to online expertise? You might ask yourself a big question, such as “Where shall we go on vacation?” or “Who’s the best plumber to fix my overflowing sink?” only to end up checking the web for low plane fares or looking on Yelp for plumber reviews.

So for all intents and purposes, that time is already here…

Want a great customer experience provided by the right recommendation engine? Contact us today.

For more information

The anatomy of high-performance recommender systems (part 1)
