07.07.2016

Search Leeds: Dixon Jones on how to evaluate the whole web

This article was updated on: 07.02.2022

Dixon Jones is marketing director at Majestic, one of the largest crawlers on the planet. His talk at Search Leeds 2016 was all about how companies like Google (and Majestic) evaluate the whole web, and what the SEO industry can learn from this to get results. Here are our notes from his talk:

Takeaways

  • Ideas connect in groups
    • Gaining trust within a group is the key to success
  • Topics help Google to connect entities
  • Your web page/site may be an entity…
    • But it’s probably part of larger collections of objects about topics, just as your branding is not just your website, and never has been. Search engines are starting to understand more about your brand and how it interacts with the wider world
  • The more entities that cite you as a topic authority…
    • The more relevant you are for that topic.
  • If your content isn’t connected to defined entities, it’s dead in the water
  • Action point for SEO: Be famous for something
    • Get the most famous person in your field to cite you

How do we make decisions?

Two of the best places to start when you’re looking at this topic:

The Chimp Paradox by Prof Steve Peters

Hooked: How to Build Habit-Forming Products by Nir Eyal

These two books start to figure out what we’re doing as people; not just on the web but in life, and they’re both well worth reading. They bring us to the big question here:

Why do we trust search engines?

More specifically, why do we use a search engine to find out something instead of asking someone we know? We could ask a friend, who we know is motivated either to not let us down, be altruistic or look good in our eyes (or a combination of the three). But billions of us use search engines, despite the fact that search engines are motivated by selling to us.

Why is this so? To answer the question, we need a bit of psychology.

“When we use search engines we’re acting on emotion, and emotion works 20 times faster than the reason part of our brain.”

If we really thought about it, we would know that we would be better off asking a friend, but Google is faster. It has to be, otherwise we’d realise we’re being conned. Because it’s fast, we get instant gratification. And we also know we won’t be embarrassed. Our logical brain works too slowly and Google wins.

Second to this is the fact that using a search engine then becomes a habit, so Google wins. Search engines haven’t so much earned our trust as learned our trust.

That’s the why, now for the ‘how’

Search engines retrieve information through the collection, grouping, indexing and matching of data. Groups make search better by eliminating as many irrelevant results as possible, just as you would head to the right section of a library to find a specific book. If search engines can split data into smaller segments, each query has far less to look through and retrieval is sped up enormously. This helps hugely with spam too.
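The library analogy can be sketched in a few lines of Python. This is only an illustration of grouping before matching, not how Google or Majestic actually store data: the corpus, the group names and the helper functions `build_grouped_index` and `search` are all made up for the example.

```python
from collections import defaultdict

# Hypothetical toy corpus: (doc_id, group, text). Groups are invented for illustration.
DOCS = [
    (1, "travel", "cheap flights to leeds"),
    (2, "travel", "leeds hotel deals"),
    (3, "video", "leeds festival highlights video"),
    (4, "local", "leeds coffee shop opening hours"),
]

def build_grouped_index(docs):
    """Build one inverted index (word -> set of doc ids) per group."""
    index = defaultdict(lambda: defaultdict(set))
    for doc_id, group, text in docs:
        for word in text.split():
            index[group][word].add(doc_id)
    return index

def search(index, group, query):
    """Return ids of documents in one group that contain every query word."""
    postings = [index[group].get(word, set()) for word in query.split()]
    return sorted(set.intersection(*postings)) if postings else []

index = build_grouped_index(DOCS)
print(search(index, "travel", "leeds"))  # only the travel segment is scanned
print(search(index, "video", "leeds"))   # same word, different segment
```

Because each lookup touches only one segment, documents in the other groups are never considered, which is the "right section of the library" effect described above.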

Google categorises by:

  • IP
  • Image search
  • Video Search
  • Local listings
  • Maps
  • Travel

All of this makes Google’s retrieval process far more efficient, because it groups information by data type. However, Google also spends lots of time and money on semantic search: understanding pages and words which mean the same thing. It uses this approach to break topics down into concepts. That is good news for SEOs, who are more interested in defining by topic than by data type.

How Majestic categorises the web

Majestic emulates Google’s PageRank. It has worked out that content connects in groups, and that keywords gravitate into groups as well. It sees and analyses links, and has worked out that it can categorise the internet based on these groups.

However, there are drawbacks to this approach…

  • You need a universal data set for it to work best (i.e. the whole internet!)
  • Every signal is small
  • Individually prone to error or opinion
  • At scale the error rate decreases and confidence increases

More examples of this at https://blog.majestic.com/research/pagerank-trustflow-search-universe/

Majestic then converts that crawl to a number from 0 to 100: the Trust Flow score, which is now also categorised. The calculation is iterative and is based not on the number of links, but on the quality and grouping of links.
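The talk did not give the formula, so the following is not Majestic’s Trust Flow calculation. It is a minimal sketch of a generic seed-based trust propagation, assuming a hand-picked set of trusted seed pages, a damping factor of 0.85 and a final rescale to 0-100. The function name `trust_scores`, the example graph and every parameter are assumptions made for illustration.

```python
def trust_scores(links, seeds, damping=0.85, iterations=50):
    """links: dict page -> list of pages it links to. seeds: set of trusted pages.

    Illustrative only: trust starts at the seeds and flows along links,
    so a page's score depends on who links to it, not on how many links it has.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    base = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(base)
    for _ in range(iterations):
        # Each round, seeds are topped up and every page passes on a share of its trust.
        nxt = {p: (1 - damping) * base[p] for p in pages}
        for page, targets in links.items():
            if targets:
                share = damping * trust[page] / len(targets)
                for t in targets:
                    nxt[t] += share
        trust = nxt
    top = max(trust.values())
    if top == 0:
        return {p: 0 for p in pages}
    # Rescale so the most trusted page scores 100.
    return {p: round(100 * v / top) for p, v in trust.items()}

# Hypothetical link graph: two spam pages link only to each other.
links = {
    "seed": ["a", "b"],
    "a": ["c"],
    "b": ["c"],
    "spam1": ["spam2"],
    "spam2": ["spam1"],
}
scores = trust_scores(links, seeds={"seed"})
print(scores)
```

In this toy graph the two spam pages have as many inbound links as `a` and `b`, but they score 0 because none of their links trace back to a trusted source.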

Google is now re-defining authority

Jon M. Kleinberg talked about hubs and authorities back in 1999. Google took this view for a long time, then said ‘hey, we can do better than that’, and so the Knowledge Graph was born. This has changed everything for SEO, and we’re now at a fork in the road.

Old approach: start by measuring the power of every page

Vs

New approach: chasing the entity over the link

As this feature from Google’s research blog shows, Google now wants to change the way we think about authority. It wants us to stop thinking about authority as a page and start thinking about authority as a thing (Teaching machines to read between the lines).
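For contrast, the older page-level view can be made concrete. Below is a minimal sketch of the hubs-and-authorities iteration associated with Kleinberg: a good authority is linked to by good hubs, and a good hub links to good authorities. The example graph and page names are invented, and this is a textbook-style illustration rather than anything Google or Majestic is known to run in this form.

```python
import math

def hits(links, iterations=20):
    """links: dict page -> list of pages it links to. Returns (hub, authority) scores."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of pages linking in.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for t in targets:
                auth[t] += hub[src]
        # Hub score: sum of the authority scores of pages linked to.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        # Normalise both vectors so the values stay bounded.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth

# Hypothetical graph: a directory and a blog both point at an expert page.
links = {
    "directory": ["expert", "brand"],
    "blog": ["expert"],
    "expert": [],
}
hub, auth = hits(links)
print(max(auth, key=auth.get))  # page with the highest authority score
print(max(hub, key=hub.get))    # page with the highest hub score
```

Everything here is scored per page and per link. The entity view described above instead asks what thing a page is about and who vouches for that thing.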

 

Read more from Dixon Jones

https://twitter.com/dixon_jones

@tryMajestic

https://majestic.com/