How Search Engines Work

Clearpath Technology

Remember that there are many factors that contribute to how search engines work, and some of them are closely guarded secrets. Google, the most popular search engine, keeps parts of its operation under wraps to prevent fraud and dishonest manipulation of its results.

Web Spiders and Bots

In the early days of the Internet, a website often had to be submitted directly to a search engine in order to be listed there. This was often the only way to gain a listing on a site like Yahoo!, because directory sites relied on human editors to select the topics and contents of their listings. Some search engines had rudimentary ways to search pages on the Internet, but they had not yet developed the ability to filter results very well. A user could not be sure that the results would even be related to the keywords he or she typed into the search engine.
Today, however, no submission is necessary in order to be listed in search engines. As the technology has evolved, search engine listings have become much more refined and certainly more user-friendly. Today’s search engines invest a great deal of time, money and research into providing a search service that gives the user the best web experience possible. In fact, the search engine business model relies upon this: the more people who visit a search engine, the more web traffic that site receives, and the more web traffic, the more people see advertising on that search engine. Since search engines frequently rely upon advertising to support the site, this works out for everyone. The user gets better search results, and the search engine gets more visitors.
How do search engines provide an improved web experience for their users? One development that has improved the typical web search is the spider, or bot. A web spider is a program run by the search engine. Instead of relying on the slow pace of a human editor, search engines can now rely on computer programs that never stop scanning websites on the Internet. The sole purpose of these programs is to “crawl” the web all day, every day, looking over every website they encounter. They move through websites, checking links and examining keywords, and they also look at HTML elements within the pages, such as page descriptions, meta tags, and page titles.
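As a rough illustration of how such a crawler works, here is a minimal sketch in Python. The three-page “web” below is an invented in-memory stand-in for real HTTP fetches, and the crawl loop simply visits every page it can reach by following links, once each:

```python
from html.parser import HTMLParser

# A toy "web": page URLs mapped to their HTML.
# (A real spider would fetch these over HTTP; these pages are made up.)
PAGES = {
    "/home": '<html><title>Home</title><a href="/cats">Cats</a><a href="/dogs">Dogs</a></html>',
    "/cats": '<html><title>Cats</title><a href="/home">Home</a></html>',
    "/dogs": '<html><title>Dogs</title><a href="/cats">Cats</a></html>',
}

class LinkParser(HTMLParser):
    """Collects the href targets of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

def crawl(start):
    """Breadth-first crawl: visit each reachable page exactly once,
    queueing up every link found along the way."""
    queue, seen = [start], set()
    while queue:
        url = queue.pop(0)
        if url in seen or url not in PAGES:
            continue
        seen.add(url)
        parser = LinkParser()
        parser.feed(PAGES[url])
        queue.extend(parser.links)
    return seen

print(sorted(crawl("/home")))  # every page reachable from /home is visited once
```

The `seen` set is what keeps the spider from crawling the same page forever when sites link back to each other, as `/home` and `/cats` do here.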

Basically, these spiders compile a large amount of data from sites across the web. Here is a list of the elements within a website that search engine spiders scan:

·    Text within the website
·    Links within the website
·    Page descriptions embedded within HTML
·    Keywords embedded within HTML
·    Photos, along with their descriptions and alternate (alt) text
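The elements listed above can be pulled out of a page with a small HTML scanner. This sketch uses only Python’s standard-library `html.parser`, and the sample page is invented for illustration:

```python
from html.parser import HTMLParser

class PageScanner(HTMLParser):
    """Collects the page elements a spider looks at: visible text,
    links, meta description/keywords, title, and image alt text."""
    def __init__(self):
        super().__init__()
        self.text, self.links, self.alt_text = [], [], []
        self.meta = {}
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self.links.append(attrs["href"])
        elif tag == "meta" and "name" in attrs:
            self.meta[attrs["name"]] = attrs.get("content", "")
        elif tag == "img" and "alt" in attrs:
            self.alt_text.append(attrs["alt"])
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif data.strip():
            self.text.append(data.strip())

# A made-up page containing each element type from the list above.
html = ('<html><head><title>Kitten Care</title>'
        '<meta name="description" content="Caring for kittens">'
        '<meta name="keywords" content="kittens, cats"></head>'
        '<body><p>Feeding your kitten</p>'
        '<a href="/toys">Toys</a>'
        '<img src="cat.jpg" alt="A sleeping kitten"></body></html>')

scanner = PageScanner()
scanner.feed(html)
print(scanner.title)                    # Kitten Care
print(scanner.meta, scanner.links, scanner.alt_text)
```

Each callback fires as the parser walks through the page, which is essentially what a spider does on a much larger scale.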
All of this information is then compiled into a series of databases maintained by the search engine. These databases are what eventually help search engine users find the information they want. When a user goes to a search engine and searches for the term “kittens,” the engine looks up the keyword “kittens” in its indexed databases and presents the matching pages to the user. In order to give the best results, the search engine also sorts this data by relevancy, putting the sites that are most related to the keyword or keywords at the top of the listing. By “indexing” all these sites, including their links and keywords, search engines can provide a rank for each of these pages.
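That index-and-rank step can be sketched with a tiny inverted index in Python. The pages below and the frequency-based relevancy score are invented for illustration; real engines combine many more signals, such as links and page titles:

```python
from collections import defaultdict

# Toy crawled pages: URL -> the text a spider collected from it (made up).
pages = {
    "/cats":   "kittens love to play and kittens sleep a lot",
    "/pets":   "pets include kittens cats and dogs",
    "/plants": "watering your garden plants",
}

# Build an inverted index: each word maps to {page: how often it appears}.
index = defaultdict(dict)
for url, text in pages.items():
    for word in text.lower().split():
        index[word][url] = index[word].get(url, 0) + 1

def search(keyword):
    """Look a keyword up in the index and rank the matching pages by
    relevancy -- here simply how often the word occurs on the page."""
    matches = index.get(keyword.lower(), {})
    return sorted(matches, key=matches.get, reverse=True)

print(search("kittens"))  # /cats mentions "kittens" twice, so it ranks first
```

Because the index is built once, ahead of time, answering a query is just a dictionary lookup plus a sort, which is why search engines can respond so quickly.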