Link Graphs & Information Retrieval

A couple of days ago on LinkedIn, I came across a discussion thread where someone wanted the community’s recommendations on the “best course out there for mastering SEO” for a complete beginner. There is no such thing as mastering SEO with a one-off course.

Truly mastering SEO is a continuous learning process and if I could go back and tell myself 7 years ago where to start, this post sums up what I’d say. You can quickly grasp the more advanced SEO knowledge if you first understand how the web is organised and how search engines retrieve information.

To dig deeper into any of the topics I discuss, follow the links under “I want to know more!” at the end of each section. They lead to sub-sections that cover the topic in more detail.

What Do SEOs Do?

An SEO is essentially a one-person marketing agency. You need a wide variety of tools in your SEO toolbox in addition to technical, on-page and off-page implementation skills.

Many people stop once they have a general understanding of these three roles that highly skilled, professional SEOs fulfil, because there is a prevailing idea that an SEO’s job is only to get links to show up on Google’s first page. That is part of what an SEO does, but it is far from the whole truth. Focusing on one thing alone will make your strategies ineffective in the long run.

Successful SEOs develop purposeful information assets. They understand how information exists on the web and they know how it is organised and retrieved. They also understand human behaviour behind information consumption and know what their target audience needs at the various stages during the purchase process.

In my opinion, an SEO’s real job is to build logical links between pieces of information on a multitude of topics and to direct people towards the information assets that a business has to offer.

A Better Way To Learn SEO

The most important lesson in SEO that I’ve learnt so far is this:

Change is the only constant and everything else is controlled by Google.

There’s no point in looking for easy work-arounds, because long-term success has no shortcuts. Search engine optimisation is not a set of solitary, separate tactics; it’s an intricate strategy.

An indispensable SEO should have answers to the following questions:

What is the Link Graph?
What is PageRank?
What is the difference between Semantic and Entity search?
What are the fundamentals of Link Building?
Which Technical SEO tactics should every SEO know?

Yeah, it’s not the sexiest way to learn SEO. Things like keyword analysis, competitor research, content modelling and information architecture will get your boss’s attention because they sound ten times cooler. But if you start here, you’ll be less likely to get caught up in short-term tactics and better equipped to develop long-term SEO strategies.

How Is The Web Structured?

Have you ever thought about how the world wide web is organised? I mean, there are more than 700 million active websites and 350 million registered domains, and on top of that 150,000 new URLs are created each day. Each day!

An Introduction To Link Graph

Let’s kick off this series by understanding how the web is structured using Graphs. A graph is a map that shows how different objects are connected to each other. Here’s a more technical definition of the graph-theory from

A graph is a formal mathematical representation of a network (“a collection of objects connected in some fashion”).

Each object in the graph is called a node. Corresponding to the connections (or lack thereof) in a network are edges in a graph. Each edge in a graph joins two distinct nodes.

A Link Graph is the map of the world wide web.


The image above shows a Link Graph where the coloured circles numbered 1, 2, 3, 4, 5 are nodes that are connected to each other. The dotted lines represent the edges connecting the nodes in the graph.
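To make the idea of nodes and edges concrete, here is a minimal Python sketch of a link graph stored as an adjacency list. The five page numbers mirror the nodes described above, but the edges themselves are made up for illustration and are not read off the figure; `build_adjacency` is a hypothetical helper name.

```python
# Hypothetical link graph with five pages (nodes) numbered 1-5.
# Each pair is (linking page, linked page); the edges are invented for this example.
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (5, 1)]


def build_adjacency(edge_list):
    """Map each node to the sorted list of nodes it links to."""
    adjacency = {}
    for source, target in edge_list:
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set())  # nodes with no outlinks still appear
    return {node: sorted(targets) for node, targets in adjacency.items()}


link_graph = build_adjacency(edges)
for page, outlinks in sorted(link_graph.items()):
    print(f"page {page} links to {outlinks}")
```

Storing the graph as "page to outlinks" is only one possible layout, but it keeps the two ideas visible: the dictionary keys are the nodes and every entry in a list is an edge.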

A Link Graph is how the internet is organised by search engines. Google specifically uses its index (a model of the web as a graph) to categorise and prioritise documents available on the web.

Essentially, understanding what a Link Graph is and how it works is fundamental to understanding more advanced topics such as crawling, indexing and link building.

I want to know more!

Link Graphs and How Search Engines Use Link Graph to Organise the Web

Graphs And Connectivity-Based Ranking

Now that you’re familiar with Link Graphs (and, if you clicked on the link, with the different types of web graphs and their application in structuring the web), let’s look at how the graph of the web is utilised by Google’s search algorithm.

The web is a massive Link Graph where millions of documents are connected with each other via hyperlinks. Again, we’ll start by looking at the bigger picture first and later take a deeper look at how PageRank utilises the Link Graph.

This is what a Connectivity-Based Ranking system means:

The value of a document available on the web is determined by the number of links (directed and undirected), from other documents available on the web, that are pointing to it.
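In its simplest form, this definition boils down to counting inbound links. The sketch below assumes a tiny, made-up set of pages and links and ranks them by that count; the page names and the `rank_by_inbound_links` helper are hypothetical, and real ranking systems weigh far more than a raw count.

```python
from collections import Counter

# Hypothetical pages and hyperlinks: each pair is (linking page, linked page).
links = [
    ("blog", "home"),
    ("about", "home"),
    ("partner-site", "home"),
    ("home", "blog"),
    ("partner-site", "blog"),
    ("home", "about"),
]


def rank_by_inbound_links(link_pairs):
    """Return (page, inbound count) pairs, most linked-to page first."""
    inbound = Counter(target for _, target in link_pairs)
    # Pages that only link out still belong in the ranking, with a count of 0.
    for source, _ in link_pairs:
        inbound.setdefault(source, 0)
    # Sort by count (descending), then by name so ties are ordered predictably.
    return sorted(inbound.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_by_inbound_links(links)
for page, count in ranking:
    print(f"{page}: {count} inbound links")
```

A raw count like this is easy to inflate: anyone can publish extra pages that link to their own site, which is why counting alone is not enough.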

The Connectivity-Based Ranking system uses both the Link Graph and the Co-Citation Graph to index, categorise and rank webpages.

The concept of a Connectivity-Based Ranking system is intuitive and simple, which is why it’s the foundation of almost every online search engine in existence. On its own, though, it is vulnerable to rigging. That’s where PageRank comes in.

I want to know more!

How PageRank Utilises Query Dependencies, Neighbourhood Subgraphs and HITS Algorithm in Indexing and Ranking the Web

Levelling Up: How Link Graphs Assist In Information Retrieval

A good SEO understands that, in order to produce the best results, every single element of Google’s link analysis and processing algorithm needs to be taken into account:

Number of Inbound Links.
Number of Outbound Links.
HITS Authority Score (Quality of inbound links).
HITS Hub Score (Quality of outbound links).
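The hub and authority scores in the list above come from the HITS algorithm. The sketch below is the basic textbook HITS iteration on a small, made-up graph, not a description of Google’s production system; the page names and the `hits` function are hypothetical, and every linked page is assumed to appear as a key in the graph.

```python
import math

# Hypothetical link graph: page -> pages it links to.
graph = {
    "directory": ["guide", "shop"],
    "forum": ["guide"],
    "blog": ["guide"],
    "guide": ["shop"],
    "shop": [],
}


def hits(outlinks, iterations=50):
    """Return (hub, authority) score dicts from the basic HITS iteration."""
    pages = list(outlinks)
    hub = {p: 1.0 for p in pages}
    authority = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority: sum of the hub scores of the pages linking in.
        authority = {p: 0.0 for p in pages}
        for source, targets in outlinks.items():
            for target in targets:
                authority[target] += hub[source]
        # Hub: sum of the authority scores of the pages linked to.
        hub = {p: sum(authority[t] for t in outlinks[p]) for p in pages}
        # Normalise both score sets so the values stay bounded.
        for scores in (authority, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, authority


hub, authority = hits(graph)
for page in graph:
    print(f"{page}: hub={hub[page]:.3f} authority={authority[page]:.3f}")
```

In this toy graph, "guide" ends up with the highest authority score because the most hubs point to it, and "directory" ends up as the strongest hub because it links to both authorities.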

The final concept I want to introduce in this article is Crawl Bandwidth.

In addition to assisting in ranking and indexing, the link graph of the web also helps search crawlers determine which webpages to crawl next and how frequently to crawl them. Crawling the entire web requires massive bandwidth, so naturally, the goal of each crawl is to fetch high-quality webpages.

One way for search crawlers to determine which pages to crawl during each session is by visiting the pages with the highest quality inbound and outbound links from previous crawl sessions. Since bandwidth is both costly and a limited resource, how frequently and how extensively a crawler will visit your site depends on how well your content scores on the Hub, Authority and PageRank factors.
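The prioritisation described above can be pictured as a priority queue with a fixed budget. This is a simplified sketch under stated assumptions: the URLs, the single combined quality score per page and the `plan_crawl` helper are all hypothetical, and real crawlers schedule visits with many more signals.

```python
import heapq

# Hypothetical quality scores carried over from a previous crawl session.
page_scores = {
    "/": 0.92,
    "/pricing": 0.71,
    "/blog/link-graphs": 0.64,
    "/tag/misc": 0.12,
    "/archive/2015": 0.05,
}


def plan_crawl(scores, budget):
    """Pick up to `budget` URLs to fetch, highest score first."""
    # heapq is a min-heap, so negate each score to pop the best page first.
    heap = [(-score, url) for url, score in scores.items()]
    heapq.heapify(heap)
    plan = []
    while heap and len(plan) < budget:
        _, url = heapq.heappop(heap)
        plan.append(url)
    return plan


crawl_plan = plan_crawl(page_scores, budget=3)
print(crawl_plan)
```

With a budget of three fetches, the low-scoring tag and archive pages are never visited in this session, which is the practical consequence of limited crawl bandwidth.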


The most important thing to remember for an SEO is that search crawlers are programs that follow a strict set of instructions, even when they’ve started learning on their own. Where these crawlers go and what they fetch is completely under your control. So my advice to you is to take your time to develop pages that deliver value to search users and use on-page, off-page and technical SEO tactics to always put your best foot forward.
