Friday, June 29, 2007

Estimating Traffic to your Web Site Due to SEO: Some Intuitive Formulas

Estimating the traffic that an Internet Marketing Campaign (IMC) can generate is difficult because it is impossible to tell precisely how much new traffic will result. Fortunately, there are ways to work around this difficulty. Some of these approaches are listed below:

  • Use Industry benchmarks or guidelines
  • Apply "experience", "common sense", or "best guess" formulas
  • Establish a baseline traffic workload for ongoing campaigns using web analytics software and go from there
In the following subsections, we discuss how the above can be used to estimate the amount of traffic generated by an IMC; namely, the increased volume to an e-commerce site resulting from SEO, PPC, Banner Advertising, e-mail marketing, affiliate marketing, blogs, and forums.

Increased Traffic Volume from SEO

Recall that one of the objectives of SEO is to increase the volume and quality of traffic that a web site receives from search engines. In general, the earlier a site is presented in the search engine results page (SERP), the higher it ranks. In fact, one of the goals of SEO is to get organic rankings for the site in the top 10 listings of the SERP. The reason is that studies have shown that when a site is listed in the top 10 of the SERP, the likelihood of the user clicking the link associated with that site is high. Studies have also shown that a link listed after the first 2 pages is rarely clicked.

The next question is: what precisely does SEO involve? The following 3 steps (very) briefly describe this activity.

1. The first thing that is usually done is key term analysis. Usually, 10 or 20 keywords and phrases are devised that best describe the business and the products/services that prospective customers would be searching for.

2. The next step involves search engine submission, which consists of submitting the website to major search engines and directories.

3. Finally, the content of the web site is analyzed. This includes amending the HTML code and text of the existing pages, and if necessary creating additional pages relevant for the targeted key terms.

Recently (August 2006), AOL released search engine statistics confirming some of the studies discussed above. The click-through rates from the AOL statistics are summarized in Figs. 1 and 2.

Fig 1. Percent share of click throughs for the top 10 ranking positions

Fig 2. Percent share of click throughs for ranking positions 7 – 13, 21, 31, and 41

Figs. 1 and 2 show the percentage of click-throughs at different ranking positions in the SERPs. Note that a “click-through” is the act of clicking on a link in a SERP after a search. For instance, Fig. 1 shows that there is a 42.4% chance the user will click on the first link returned, an 11.8% chance the user will click on the second link returned, and so on. Fig. 2 indicates that users become very unlikely to click links that appear later in the SERPs.

From a probabilistic perspective, Figs. 1 and 2 give the (conditional) probability that the link at rank k in the SERPs is chosen, given that the user clicks, that is,

P(rank k chosen | click).

It should be noted that, for the purpose of estimating the traffic increase due to SEO, we assume that if optimization is performed correctly (as discussed in steps 1-3 above), then the site will get a high ranking (i.e., in the top 10), and that its position in the SERPs returned for search queries will be distributed according to the AOL search engine statistics shown in Figs. 1 and 2.

So to approximate the probability that one of the top 10 links in the SERPs is chosen, we can use the following formula:
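
The formula is not reproduced in the text here. A reconstruction consistent with the definition of P(rank k chosen | click) above, and offered only as an assumption, sums the per-rank click shares of Fig. 1 over the first ten positions. The numbering (1) is inferred from the later reference to Eqns. (1) and (2).

```latex
P(\text{top 10 chosen} \mid \text{click})
  \approx \sum_{k=1}^{10} P(\text{rank } k \text{ chosen} \mid \text{click})
\tag{1}
```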

Using the AOL data, other interesting statistics can be mined. For instance, there were 36,389,567 lines of data, 19,442,629 user click-through events, and 16,946,938 queries without user click-throughs. Given this information, we can approximate the (unconditional) probability that a user will click on a link in one of the SERPs after a query. This is given by
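
As a quick check of this estimate, the counts quoted above can be divided directly. The sketch below assumes, as the text does, that each line of data counts as one query outcome, so the click probability is click events divided by total lines. The variable names are illustrative, and the result corresponds to what is later referred to as Eq. (2).

```python
# Counts quoted in the text from the AOL search data.
total_lines = 36_389_567        # all lines of data
click_events = 19_442_629       # queries followed by a click-through
no_click_queries = 16_946_938   # queries with no click-through

# The two outcome counts should account for every line of data.
assert click_events + no_click_queries == total_lines

# Unconditional probability that a query is followed by a click.
p_click = click_events / total_lines
print(f"P(click) is approximately {p_click:.4f}")
```

With these counts, slightly more than half of all queries lead to a click.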

Also, from Eqns. (1) and (2), we can estimate the probability that a result from a query was in the top 10 and the user clicked on one of the links. This follows from the definition of conditional probability (a rearrangement of Bayes’ rule), which gives
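
The resulting expression is not reproduced in the text. Under the definitions above, a reconstruction (an assumption, including the numbering) is simply the product of the two earlier quantities:

```latex
P(\text{top 10 chosen} \cap \text{click})
  = P(\text{top 10 chosen} \mid \text{click}) \cdot P(\text{click})
\tag{3}
```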

The impact SEO has upon a web site (i.e., a measure of how well it is optimized) could, in principle, be determined by estimating the volume of searches for the key terms used to optimize the site. Remember that the more key terms are embedded in the HTML code of the site, the more likely it is for that web site to be retrieved when a key term is used in a search.

Following this line of reasoning, the increase in traffic to a web site due to SEO can be estimated by approximating the traffic volume for a set of key terms.

In fact, a popular SEO tool, WordTracker, can be used for this purpose. The formula they use to estimate daily search volume for any key term is the following:

As an example, for a key term that appears 9,000 times in WordTracker’s database, the number of daily searches that are predicted is:

It should be noted that every day, on average, WordTracker collects about 4.10 million search terms from sources that are reported to account for 0.63% of searches across all engines. By combining these two figures, WordTracker estimates that the total number of daily searches across all engines is approximately 650.5 million.
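
The way these two figures combine can be sketched in a couple of lines: if the collected terms are a known share of all searches, dividing the collected count by that share scales it up to all engines. This is an illustration using the rounded figures quoted above, not WordTracker's own code, and the variable names are assumptions.

```python
# Figures quoted in the text (both are rounded averages).
collected_per_day = 4.10e6       # search terms collected per day
share_of_all_searches = 0.0063   # 0.63% of searches across all engines

# Scale the collected sample up to the whole search market.
total_daily_searches = collected_per_day / share_of_all_searches
print(f"{total_daily_searches / 1e6:.1f} million searches per day")
```

Because both inputs are rounded, the result lands within a fraction of a percent of the quoted 650.5 million rather than matching it exactly.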

One approach to obtaining a (rough) estimate of the increased volume of traffic due to SEO is to determine which key terms are most “representative” of the “optimized” web site and which key terms are most “representative” of the “un-optimized” web site. Usually, 10 to 20 key terms for each case should suffice. Then WordTracker (or a similar tool) can be used to compute the expected number of searches for each of the cases. Once this is done, the percentage increase in volume for the new key terms can be computed by the following formula:
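
The formula is not reproduced in the text. A standard percentage-change expression fits the description. Here S_old and S_new are assumed symbols for the total estimated daily searches over the “un-optimized” and “optimized” key term sets, respectively:

```latex
\%\ \text{increase in search volume}
  = \frac{S_{\text{new}} - S_{\text{old}}}{S_{\text{old}}} \times 100\%
```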

To calculate the (increased) volume in traffic for a web site, we need to multiply the estimated daily search volume for the old key terms by the estimated percent increase in search volume for the new key terms. This is then multiplied by the probability that a result from a query was in the top 10 and the user clicked on one of the links. Finally, all of this is divided by 10, since we assume that it is equally likely for the link to be in any one of the top 10 positions in the 1st SERP (when the user clicks). The following gives this formula:
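
The calculation described in this paragraph can be sketched as a small function. This is a minimal illustration under stated assumptions: the function name, parameter names, and all input values are hypothetical, the percent increase is expressed as a fraction, and the top-10 click share of 0.90 is a placeholder rather than a value read from Fig. 1.

```python
def estimated_daily_traffic_increase(old_daily_searches, new_daily_searches,
                                     p_top10_and_click, positions=10):
    """Estimate extra daily visits following the steps in the text.

    old volume x fractional increase x P(top 10 and click) / positions
    """
    if old_daily_searches <= 0:
        raise ValueError("old_daily_searches must be positive")
    # Percent increase in search volume, expressed as a fraction.
    fractional_increase = (
        (new_daily_searches - old_daily_searches) / old_daily_searches
    )
    # Dividing by `positions` reflects the assumption that the link is
    # equally likely to sit at any of the top 10 positions.
    return (old_daily_searches * fractional_increase
            * p_top10_and_click / positions)


# Hypothetical inputs, for illustration only.
p_click = 0.5343             # rounded ratio of the AOL counts in the text
p_top10_given_click = 0.90   # placeholder, NOT taken from Fig. 1
extra_visits = estimated_daily_traffic_increase(
    old_daily_searches=1_000,
    new_daily_searches=1_500,
    p_top10_and_click=p_top10_given_click * p_click,
)
print(f"{extra_visits:.1f} additional visits per day")
```

Note that multiplying the old volume by the fractional increase reduces to the difference between the new and old search volumes, so the estimate is driven by how many additional searches the new key terms capture.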
