One of the questions I get asked all the time is how the Google search engine works. This post is a short overview of how Google search works and how you can use that knowledge to improve your Google rankings.
If you don’t want to read this whole post, here is a short video from Matt Cutts explaining how Google works.
Google’s first step is to crawl the web to discover the pages on it.
Google uses software known as “web crawlers” to discover publicly available webpages. The most well-known crawler is called “Googlebot.” Crawlers look at webpages and follow links on those pages, much like you would if you were browsing content on the web. They go from link to link and bring data about those webpages back to Google’s servers. This is one reason links are so important to getting your page discovered and ranking well on Google.
The crawl process begins with a list of web addresses from past crawls and sitemaps provided by website owners. As Google’s crawlers visit these websites, they look for links to other pages to visit. The software pays special attention to new sites, changes to existing sites and dead links. This means that if you publish new content often, your website could get crawled more often, which in turn can help your rankings.
Computer programs decide which sites to crawl, how often, and how many pages to fetch from each site. Google doesn’t accept payment to crawl a site more often for its web search results. It cares more about having the best possible results because in the long run that’s what’s best for users and for its business.
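To make the crawl process a bit more concrete, here is a minimal sketch in Python. It is not how Googlebot is actually built. It just shows the idea of starting from a seed list, following links from page to page, and noting dead links. The `TOY_WEB` dictionary, the `example.com` addresses and the function names are all made up for illustration, and the "web" lives in memory so nothing is fetched over the network.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web": address -> HTML. A real crawler would fetch over HTTP.
TOY_WEB = {
    "https://example.com/": '<a href="https://example.com/about">About</a> '
                            '<a href="https://example.com/blog">Blog</a>',
    "https://example.com/about": '<a href="https://example.com/">Home</a>',
    "https://example.com/blog": '<a href="https://example.com/post-1">Post</a> '
                                '<a href="https://example.com/missing">Gone</a>',
    "https://example.com/post-1": "<p>No links here.</p>",
}


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, web):
    """Breadth-first crawl. Returns (pages visited in order, dead links found)."""
    queue = deque(seed_urls)
    seen = set(seed_urls)
    visited, dead = [], []
    while queue:
        url = queue.popleft()
        html = web.get(url)
        if html is None:
            # Nothing lives at this address: record it as a dead link.
            dead.append(url)
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:  # never queue the same address twice
                seen.add(link)
                queue.append(link)
    return visited, dead


visited, dead = crawl(["https://example.com/"], TOY_WEB)
print(visited)
print(dead)
```

Notice that a page can only be found if some other page (or the seed list) points to it, which is the practical reason links and sitemaps matter for discovery.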
The next step is to index all this data from the web crawls so Google can use the data from your website quickly.
The web is like an ever-growing public library with billions of books and no central filing system. Google essentially gathers the pages during the crawl process and then creates an index, so it knows exactly how to look things up. Much like the index in the back of a book, the Google index includes information about words and their locations. When you search, at the most basic level, Google’s algorithms look up your search terms in the index to find the appropriate pages.
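The "index in the back of a book" idea is usually called an inverted index. Here is a small Python sketch of one, assuming a handful of made-up pages. The page names, the text and the `build_index` and `search` functions are hypothetical, and Google’s real index stores far more than word positions.

```python
import re
from collections import defaultdict

# Hypothetical pages standing in for crawled content.
PAGES = {
    "page-a": "Dogs are loyal. Dogs love walks.",
    "page-b": "A list of dog breeds and cat breeds.",
    "page-c": "Cats and dogs can share a home.",
}


def build_index(pages):
    """Map each word to {page: [positions of that word on the page]}."""
    index = defaultdict(dict)
    for page, text in pages.items():
        words = re.findall(r"[a-z]+", text.lower())
        for position, word in enumerate(words):
            index[word].setdefault(page, []).append(position)
    return index


def search(index, query):
    """Return the pages that contain every word in the query, sorted by name."""
    words = re.findall(r"[a-z]+", query.lower())
    if not words:
        return []
    matches = set(index.get(words[0], {}))
    for word in words[1:]:
        matches &= set(index.get(word, {}))
    return sorted(matches)


index = build_index(PAGES)
print(index["dogs"])
print(search(index, "dog breeds"))
```

Looking a word up in the index is a single dictionary lookup, no matter how many pages were crawled, which is why building the index ahead of time makes searching fast.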
The search process gets much more complex from there. When you search for “dogs” you don’t want a page with the word “dogs” on it hundreds of times. You probably want pictures, videos or a list of breeds. Google’s indexing systems note many different aspects of pages, such as when they were published, whether they contain pictures and videos, and much more. With the Knowledge Graph, Google is continuing to go beyond keyword matching to better understand the people, places and things you care about.
For a typical query, there are thousands, if not millions, of webpages with helpful information. Algorithms are the computer processes and formulas that take your questions and turn them into answers. Today Google’s algorithms rely on more than 200 unique signals or “clues” that make it possible to guess what you might really be looking for. These signals include things like the terms on websites, the freshness of content, your region and PageRank.
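PageRank is the one signal in that list with a well-known published form, so it is worth a quick sketch. The Python below is a simplified textbook version, not Google’s production system, and the link graph is invented for the example. The damping value of 0.85 is a commonly used default, and the sketch assumes every linked page also appears as a key in the graph.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: {page: [pages it links to]}. Returns {page: score}; scores sum to 1."""
    pages = list(links)
    n = len(pages)
    ranks = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_ranks = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page with no outgoing links spreads its score over all pages.
            targets = outlinks if outlinks else pages
            share = damping * ranks[page] / len(targets)
            for target in targets:
                new_ranks[target] += share
        ranks = new_ranks
    return ranks


# Hypothetical four-page site: every page links back to "home".
LINKS = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "post"],
    "post": ["home"],
}

ranks = pagerank(LINKS)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy graph "home" comes out on top because every other page links to it. That is the intuition behind treating a link as a vote for the page it points to.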
Presenting the Results
A Google search doesn’t just dive into this index and fish around for what it needs. That would take a long time and return a lot of garbage. Several factors are used to present the most relevant search results, and this is where the algorithm comes in. Some of these factors are known and others are kept confidential to thwart malcontents who might try to unfairly rig the system (read: spammers and other scum and villainy). So Google takes the index built from the pages it has crawled and runs it through its search algorithm to find the best content that answers your specific query.
The Future of Google Search
Google is becoming even more clever with its search. It is developing its algorithm to better understand the user’s real search intent. This is called semantic search. Semantic search is a data searching technique in which a search query aims not only to find keywords, but also to determine the intent and contextual meaning of the words a person is using for search.
Google is also presenting a wider variety of content in the search results than ever before. Check out this article from Dr Pete at Moz where he finds a huge range of different types of search results.
So How Can Knowing How Search Engines Work Help You?
Knowing how a search engine works in this basic sense means you can make some smart decisions about your search engine optimisation.
In the first instance you should make sure your site can be crawled by Google. If it cannot be crawled then it cannot be indexed, and therefore it cannot be found. There are many reasons why a site might be blocking Googlebot, but usually it is because the robots.txt file is instructing Googlebot not to crawl the site.
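If you want to check this yourself, Python’s standard library includes a robots.txt parser. The sketch below tests two made-up robots.txt files against a few `example.com` addresses. The rules, the addresses and the `can_googlebot_fetch` helper are hypothetical, and the standard library parser may not interpret every rule exactly the way Googlebot does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks every crawler from the whole site.
BLOCKING_RULES = """\
User-agent: *
Disallow: /
"""

# Hypothetical robots.txt that only blocks the admin area.
ALLOWING_RULES = """\
User-agent: *
Disallow: /admin/
"""


def can_googlebot_fetch(robots_txt, url):
    """Return True if the given robots.txt text lets Googlebot fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


print(can_googlebot_fetch(BLOCKING_RULES, "https://example.com/blog/post"))
print(can_googlebot_fetch(ALLOWING_RULES, "https://example.com/blog/post"))
print(can_googlebot_fetch(ALLOWING_RULES, "https://example.com/admin/settings"))
```

A single `Disallow: /` line under `User-agent: *` is enough to keep the whole site out of the crawl, so it is worth checking that file first if your pages are not showing up.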
You can also help Google along by making sure you have a well-designed site. By that I don’t mean pretty graphics, but good site architecture, so that Google can move through your site from link to link. You should also make sure you have submitted a sitemap to Google through Webmaster Tools.
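A sitemap is just an XML file listing the addresses you want crawled. As a rough sketch, the Python below builds a bare-bones one using the standard sitemap namespace. The `build_sitemap` function and the `example.com` addresses are made up for illustration, and most site platforms or plugins will generate this file for you.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal sitemap XML string with one <url><loc> entry per address."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages on a small site.
sitemap_xml = build_sitemap([
    "https://example.com/",
    "https://example.com/about",
    "https://example.com/blog",
])
print(sitemap_xml)
```

The protocol also allows optional fields such as a last-modified date for each page, which this sketch leaves out to keep things short.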
Content is also important in ranking well. As you can see, Google ranks pages that offer unique, valuable content that answers a specific search query. So what is your customer asking? Answer that question, or those questions, in your content.
The Google search engine is a complex beast, and Google is constantly changing it to help deliver the best answers to people’s problems. Hopefully this brief overview has given you a sense of how it all fits together.