Many gurus will tell you that as long as you write for humans you will be fine. This is true in part. In fact, although the LinkedIn algorithm picks up content based on quality and relevance, I believe it is important to know how LinkedIn defines "quality and relevance".
Indeed, the algorithm assesses the content on the platform based on the guidelines it receives from humans. Thus, knowing how the algorithm works - I argue - is critical to being successful on LinkedIn. Why? By understanding the algorithm, you understand what the platform is aiming for and how it is evolving.
In addition, even if you don't plan to build a strategy on LinkedIn but you're a user who is curious about what's behind a platform that recently crossed half a billion members, this article is for you.
- Everything you need to know about LinkedIn Feed Algorithm
- Pass the content quality assessment
- Precision and Recall mechanism to select relevant content
- The FollowFeed mechanism of virality
- How LinkedIn created the Activity Graph from a bug to emphasize organic content
- Avoid low-quality content to go hyper-viral
- LinkedIn feed content syndication summarized
- Leverage the network effect to grow your LinkedIn audience
- LinkedIn feed algorithm explained
Everything you need to know about LinkedIn Feed Algorithm
As explained by Rushi Bhatt, director of Engineering at LinkedIn:
Keeping the LinkedIn feed relevant by identifying unprofessional and spammy content is critical to maintaining the quality our members’ content consumption experiences. In this post, we describe the various processes and algorithms that keep our feed cleared of spam and relevant to our members.
Thus, the relevance of the content is assessed via negativa. In short, the LinkedIn algorithm is constantly trying to understand whether a piece of content is spammy or not:
The LinkedIn spam-fighting strategy, Source: engineering.linkedin.com
It does that by performing small tests. In other words, instead of pushing the update out to most of your network, the LinkedIn algorithm starts with a small number of people; if those people find the content engaging, the selection begins to expand. Therefore, when you post content on LinkedIn it goes through three main players:
- users
- the algorithm
- human editors
The users give their vote with Likes, Comments, and Shares. The algorithm acts as a sort of middleman between the users and the human editors to keep the flow going. When the content gets a low content quality score (which tries to answer "is the content good?"), it triggers a check by a human editor to see whether it's spammy. If instead the content passes the content quality score, it's cleared and displayed to a small number of users who can give their vote.
After the vote (with likes, comments, and shares), quality classifier and virality predictor algorithms keep assessing the content, which can trigger another check by human editors. If it passes the test (is the content good?), then it gets cleared and displayed back to users.
That is why posting stuff that is interesting to people, rather than focusing on the algorithm, is critical. In fact, if your post is going viral yet it might be spammy, it will still be checked by human editors who might demote it! That is because assessing whether the content is low quality or not also depends on factors that are not easy to weigh.
In short, as of now, it might be easier to fool algorithms than human editors. However, the algorithms do play a critical role in filtering out content. In fact, relying on humans alone to do that for a platform with over half a billion users might be impossible.
What are the steps the LinkedIn feed algorithm takes to make a post or article go viral?
Pass the content quality assessment
As LinkedIn specifies:
The role of the LinkedIn feed is to provide timely, professional content. What may pass as acceptable content on a general social network may not be a pleasing experience for a professional social network like LinkedIn. We would like to eliminate as much low-quality content from the site as possible. At the same time, we do not want to be overzealous about filtering content from the site, because that could lead to more false positives and user dissatisfaction. In other words, we need to strive for high precision and recall for our classification and labeling.
LinkedIn's main aim is to provide "timely, professional content." This definition is critical. In fact, as LinkedIn gets closer and closer to Facebook, it is worth understanding that LinkedIn's business model is quite different.
As of 2015 (before LinkedIn was acquired by Microsoft), more than 50% of LinkedIn's revenues came from hiring and premium subscriptions. Instead, if we look at Facebook's revenue streams as of 2017, you can see that they mostly come from advertising. Why is this important at all to understand the LinkedIn feed algorithm?
Willingly or not, companies' decisions are influenced by the way they monetize. Where Facebook is highly dependent on its feed for monetization, LinkedIn is less so. That means LinkedIn also has more freedom to shape its feed in a way that is more in line with the subscription-based users and HR professionals who are part of the platform.
Thus, the LinkedIn algorithm is focused on keeping low-quality content out of the feed, while making sure not to filter too aggressively, which would lead to false positives (cases in which something seems spammy but in fact is not). How does it do that?
Precision and Recall mechanism to select relevant content
As specified on Wikipedia:
In pattern recognition, information retrieval and binary classification, precision (also called positive predictive value) is the fraction of relevant instances among the retrieved instances, while recall (also known as sensitivity) is the fraction of relevant instances that have been retrieved over the total amount of relevant instances. Both precision and recall are therefore based on an understanding and measure of relevance.
In short, this is a sort of balancing mechanism. On the one hand, precision focuses on finding relevant instances, while on the other hand recall focuses on completeness. So imagine there are ten posts showing up on your feed, of which only five seem to be relevant to you. This means that the precision is 5/10. In short, you get what you're looking for half the time.
However, imagine that in your network of one hundred connections, thirty posts that might have been relevant to you were published at that given time. Yet you only got five. This means the recall is 5/30, as you missed another twenty-five potentially relevant posts.
In other words, this mechanism tries to answer two specific questions:
- how useful is the content shown?
- and how much relevant information is shown to each user?
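The arithmetic of the ten-post example can be checked with a few lines of Python. This is only a sketch of the two standard definitions; the function and variable names are made up for illustration and say nothing about how LinkedIn computes these metrics internally.

```python
from fractions import Fraction


def precision(relevant_shown: int, total_shown: int) -> Fraction:
    """Fraction of the posts shown that were relevant."""
    return Fraction(relevant_shown, total_shown)


def recall(relevant_shown: int, total_relevant: int) -> Fraction:
    """Fraction of all relevant posts that were actually shown."""
    return Fraction(relevant_shown, total_relevant)


# Numbers taken from the example in the text.
posts_shown = 10          # posts that appeared in your feed
relevant_shown = 5        # of those, the ones that were relevant
relevant_in_network = 30  # relevant posts published in your network

p = precision(relevant_shown, posts_shown)
r = recall(relevant_shown, relevant_in_network)

print(f"precision = {p}")  # 1/2
print(f"recall    = {r}")  # 1/6
```

A feed can score well on one metric and badly on the other, which is why the two are balanced against each other.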
Back in 2012, LinkedIn introduced a feed infrastructure called Sensei, a distributed data system that also supported the LinkedIn feed. As explained by LinkedIn, Sensei was both a search engine and a database. However, in 2014 LinkedIn set out to build FollowFeed, which was launched in March 2016 and today powers the LinkedIn feed experience. Why does it matter at all?
FollowFeed uses the concept of a timeline of activities (a member shares an article, a member is mentioned in an article, etc.) to compose the feed for each member. To compose the feed, LinkedIn uses a model called "Fan-out-on-write" which, as explained:
Feed for each viewer is pre-computed, materialized and kept ready for retrieval using a simple lookup query. This is made possible by fanning out a content record to pre-materialized feeds of multiple entities.
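To make the quoted idea more concrete, here is a minimal in-memory sketch of fan-out-on-write: every new record is copied into each follower's pre-built feed at publish time, so reading a feed is a simple lookup. The class, method names, and data layout are hypothetical and chosen only for illustration; this is not LinkedIn's implementation.

```python
from collections import defaultdict


class FanOutOnWriteFeed:
    """Toy feed store: the work happens when content is written, not when it is read."""

    def __init__(self):
        self.followers = defaultdict(set)  # actor -> viewers following that actor
        self.feeds = defaultdict(list)     # viewer -> pre-materialized feed, newest first

    def follow(self, viewer: str, actor: str) -> None:
        self.followers[actor].add(viewer)

    def publish(self, actor: str, record: str) -> None:
        # Fan out: push the record into every follower's stored feed.
        for viewer in self.followers[actor]:
            self.feeds[viewer].insert(0, record)

    def read(self, viewer: str, limit: int = 10) -> list:
        # Reading is a plain lookup of the pre-computed feed.
        return self.feeds.get(viewer, [])[:limit]


feed = FanOutOnWriteFeed()
feed.follow("ana", "val")
feed.follow("ben", "val")
feed.publish("val", "Val shared a text post")
print(feed.read("ana"))
```

The trade-off of this design is that reads are cheap while a single post from a member with many followers triggers many writes.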
How LinkedIn created the Activity Graph from a bug to emphasize organic content
LinkedIn's strategy is based on user engagement. The strength of LinkedIn - I believe - lies in its business model. In fact - as advertising is only one of the revenue streams LinkedIn relies on - the company can focus on engagement without affecting its bottom line too much.
This is highlighted by LinkedIn in June 2017:
The story of the almost year-long project behind LinkedIn’s Activity Graph begins with a bug report, as things usually go. We noticed that sometimes, sponsored content (i.e., an ad) would show up in the first position in a member’s feed. This is against our internal best practices and something we actively try to avoid; we want the most interesting organic content to be the first thing a member sees, not an ad.
In other words, once the LinkedIn team figured out that a sponsored post could hijack the feed algorithm, they worked out a way to prevent this so that organic content could be emphasized over sponsored content.
What is organic content?
it consists of the pieces of member-generated content in the feed, which we call “Activities.” An Activity is defined by three main components: Actor, Verb, and Object. An example in prose would be “Val shared a text post,” or “Vivek liked a comment.” We present these Activities as cards in the feed UI.
Thus, each time you're liking, sharing, or writing a text post, this can be defined as an activity, which will be labeled by LinkedIn as organic content.
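The Actor, Verb, Object structure from the quote maps naturally onto a small record type. The sketch below is only an illustration of that three-part definition; the class and field names are assumptions, not LinkedIn's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    """One piece of organic content: who did what to which object."""
    actor: str
    verb: str
    obj: str  # "object" in LinkedIn's wording; renamed to avoid shadowing the Python builtin

    def as_prose(self) -> str:
        return f"{self.actor} {self.verb} {self.obj}"


# The two examples given in the quote.
activities = [
    Activity("Val", "shared", "a text post"),
    Activity("Vivek", "liked", "a comment"),
]

for activity in activities:
    print(activity.as_prose())
```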
In short, it appears that during a process the LinkedIn team calls "decoration", a piece of spammy organic content was removed before it could be shown to users. This allowed sponsored content (an ad) to take the first slot, which was supposed to be reserved for organic content.
Before moving on, keep this in mind: not all Likes are born equal.
Beware, LinkedIn isn't Facebook
When you like, share, or post something, this enters your activity graph. Thus, each of those activities should be done strategically if you're using LinkedIn for business. For instance, if you're liking something, you might want to avoid liking a video of cats (unless, of course, you sell accessories or food for cats).
Why? First, this will enter your activity graph, and thus influence the feed algorithm and what you will see next in your LinkedIn feed. Second, when you like something, this acts as a vote or recommendation that you offer to your network. In short, a Like on LinkedIn weighs much more than a like on Facebook.
Thus, before liking the next funny cat video (assuming the LinkedIn feed algorithm doesn't demote it), beware of that!
LinkedIn also introduced the concept of low-quality content (called LQ).
While the precision and recall mechanism allows the LinkedIn algorithm to select relevant content by trying to keep out content that is spammy or low-quality, there is an issue of scalability. In fact, as LinkedIn put it:
Having a few low-quality shares go hyper-viral can cause dissatisfaction for a very large number of members.
Thus, since the risk of having low-quality content go hyper-viral is too high, the algorithm would rather stop something "suspicious" than allow it to go viral.
As explained by the LinkedIn engineering team the mechanism is the following:
A set of classifiers labels the content in one of three ways: spam, low quality, or clear.
It is important to understand that this process happens in near real time. Thus, as soon as you hit the publish button, and before the content gets shown to your network, it has already been labeled by the LinkedIn algorithm.
If the content gets classified as spam or low quality, it might get either demoted or passed to a human editor. It is important to understand that, as a professional network, LinkedIn gets most of its value from keeping its feed as clean from spam as possible. Thus, when a piece of content can be deemed spam or low quality, it gets demoted rather than risk having it go viral.
If the content passes the quality score assessment it gets cleared to gather some audience data.
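The labeling step described above can be pictured as a tiny routing function. The sketch below is purely illustrative: the label names follow the outcomes described in this section, while the score scale, the threshold values, and the function names are assumptions, since LinkedIn does not publish them.

```python
from enum import Enum


class Label(Enum):
    SPAM = "spam"
    LOW_QUALITY = "low quality"
    CLEAR = "clear"


def label_content(quality_score: float,
                  spam_threshold: float = 0.2,
                  clear_threshold: float = 0.6) -> Label:
    """Map a quality score in [0, 1] to a label (thresholds are hypothetical)."""
    if quality_score < spam_threshold:
        return Label.SPAM
    if quality_score < clear_threshold:
        return Label.LOW_QUALITY
    return Label.CLEAR


def next_step(label: Label) -> str:
    """Decide what happens to a post right after it is labeled."""
    if label is Label.CLEAR:
        return "show to a small audience"
    return "demote or send to human review"


for score in (0.1, 0.4, 0.9):
    label = label_content(score)
    print(f"score={score}: {label.value} -> {next_step(label)}")
```

The point of the sketch is the order of operations: the label is assigned at creation time, before anyone in the network sees the post.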
As it gathers audience
At this stage, the LinkedIn algorithm needs to gather data from the audience in the poster's network to assess whether the content is worth spreading further. However, to avoid the risk of having low-quality content go viral, the LinkedIn feed algorithm keeps monitoring a few aspects:
- reach of the original poster,
- members interacting with the content,
- temporal signals such as the velocity of likes, shares, and comments,
- and the computed content quality scores
This sort of recipe allows the algorithm to understand two main things. First, whether the content is likely to go viral. Second, whether the content that is likely to go viral is also potentially low-quality content. Those analyses run every few hours to keep the feed as clean as possible.
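As a rough illustration of those two questions, the sketch below combines just two of the monitored signals, engagement velocity and the content quality score. Everything in it is hypothetical: the field names, the thresholds, and the three outcomes are assumptions made for the example, not LinkedIn's actual virality predictor.

```python
from dataclasses import dataclass


@dataclass
class PostSignals:
    interactions: int     # likes + comments + shares in the observed window
    window_hours: float   # length of the observed window
    quality_score: float  # computed content quality score in [0, 1]


def velocity(signals: PostSignals) -> float:
    """Interactions per hour, a simple temporal signal."""
    return signals.interactions / signals.window_hours


def assess(signals: PostSignals,
           viral_velocity: float = 50.0,
           min_quality: float = 0.5) -> str:
    """Answer: is it likely to go viral, and if so, is it potentially low quality?"""
    likely_viral = velocity(signals) >= viral_velocity
    if likely_viral and signals.quality_score < min_quality:
        return "hold for review"
    if likely_viral:
        return "keep expanding"
    return "keep monitoring"


print(assess(PostSignals(interactions=300, window_hours=2, quality_score=0.3)))
print(assess(PostSignals(interactions=300, window_hours=2, quality_score=0.9)))
print(assess(PostSignals(interactions=10, window_hours=2, quality_score=0.3)))
```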
There is an aspect that is highlighted by the LinkedIn engineering team:
Most of the easy to catch content is filtered out at creation time, leaving behind hard to classify instances in the feed.
Thus, once again, when you hit the publish button on LinkedIn, your content might already be checked for quality assurance at creation time, before it even gets shown to anyone. As the feedback from LinkedIn members grows, the LinkedIn feed algorithm gathers data to push the content through the network.
Members also play a key role in assessing content quality
When members report low-quality or spammy content, those reports are taken into account for two reasons. First, to reassess and bring down content that the feed algorithm or human editors were not able to detect. Second, this data gets fed to the LinkedIn algorithm so it can learn what LinkedIn members find valuable and what they do not.
Now that we know how the algorithm works, how can you leverage the LinkedIn network effect?
Leverage the network effect to grow your LinkedIn audience
Start by analyzing your LinkedIn network with http://socilab.com/#home.
Making sure you widen your network is important. Yet size matters only relatively. Knowing whom you need to talk to in order to build your brand or business is crucial. For instance, I used socilab to understand how my network is clustered.
There are a few metrics you might want to monitor to leverage the "network effect" (although the tool I used only analyzes 499 connections max):
The effective size of your LinkedIn network
That tells you that not all contacts are born equal as some overlap with your network. Those overlapping might add less value in terms of reach. However, they might also be important to build a strong brand. Thus, if you're trying to build a brand with social media it might make sense to have a clustered network initially. Then as you get known in that cluster it makes sense to expand that network to contacts that don't overlap to have a greater reach.
LinkedIn Network constraints
That is an index that measures how distributed your network is. While a widespread network might be good for virality, it might be less so for building a reliable brand. Thus, you need to balance those two aspects.
LinkedIn Network Density
It shows how close-knit your network is in terms of actual ties compared to possible ties. The denser your network, the more your contacts know each other. Once again, while this might be good initially to take over a niche, once you've become known in that niche you might want to lower the density of your network.
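The "actual ties compared to possible ties" ratio is the standard definition of density for an undirected network: among n contacts there are n(n-1)/2 possible ties. The sketch below applies that textbook formula to a made-up four-person network; whether socilab computes its figure in exactly this way is an assumption.

```python
from math import comb


def network_density(contacts: list, ties: list) -> float:
    """Actual ties among contacts divided by all possible ties between them."""
    possible = comb(len(contacts), 2)  # n * (n - 1) / 2 pairs
    if possible == 0:
        return 0.0
    # Treat ties as undirected and ignore duplicates such as (a, b) and (b, a).
    actual = {frozenset(tie) for tie in ties}
    return len(actual) / possible


# Hypothetical contacts and the ties among them.
contacts = ["Ana", "Ben", "Cleo", "Dev"]
ties = [("Ana", "Ben"), ("Ben", "Cleo"), ("Ana", "Cleo")]

print(network_density(contacts, ties))  # 3 actual ties out of 6 possible
```

In this toy network, Dev knows none of the other contacts, so adding more people like Dev lowers the density and widens the reach.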
LinkedIn Network Hierarchy
Hierarchy assesses how dependent you are on a few focal contacts (imagine most of your contacts know you through your boss!). In general, you want your network to be distributed to avoid depending on a few contacts. In short, in networking too, you might want to avoid having all your eggs in one basket.
LinkedIn Network Betweenness
That tells you the bridging opportunities (for instance, which of your contacts might get you closer to a cluster or a contact that can widen your network quickly). For instance, I noticed that there are two contacts in my network who bridge me to an industry in which I'm weaker. I reconnected with those two contacts to widen my network.
LinkedIn feed algorithm explained