Archive for the ‘Data Science’ Category

Data-Driven Web Design: Examining Link Sizes, Densities, and Click-Throughs

April 24th, 2015 by Dan

Many publishers would likely argue that the design of the website is as important for enticing readers to engage with the content as the content itself—humans, unfortunately, do judge books by their covers. The Guardian, The Atlantic, and The Wall Street Journal are just a few of the many publishers who have redesigned their websites this year.

We wondered if we could use our data to give insight into just how important web design is—a concept we call “data-driven web design.” Are there aspects of a page’s design that correlate to increased traffic, and even better, increased engagement?

Font sizes and colors, link sizes, link density, interaction, responsiveness: These are elements we can analyze for their ability to draw traffic to content and perhaps even contribute (along, of course, with the content itself) to keeping people there. Do people prefer to read articles surrounded by few links, large fonts, and bright colors? Or, are sparse, simple sites with undecorated text better? For those of us keen on data, could you use these attributes to predict how many people will be drawn to the content?

Understanding how page elements relate to click-throughs is by no means a new idea. For as long as Google AdSense has been around, there have been all kinds of smart people who’ve tried to figure out just how ad size relates to click-through rates (CTR). But ads and articles are very different beasts. Do the same rules that hold true for ads hold true for articles? Does link size matter? Is it the only thing? Are there even any rules at all?

We here at Chartbeat like to focus on engagement, but as a first pass, we wanted to examine how the almighty click-through relates to the size and distribution of links on a homepage. We examined a measure of click-through probability: the clicks per minute per active visitor (CPV). The data used in this analysis are the same data that power one of our most popular products, our Heads Up Display.
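To make the CPV definition concrete, here is a minimal sketch of the arithmetic. The function name, the choice to treat the active-visitor count as an average over the observation window, and the sample numbers are all assumptions for illustration, not a description of how the Heads Up Display computes it.

```python
def clicks_per_active_visitor(clicks: int, minutes: float, active_visitors: float) -> float:
    """Clicks per minute per active visitor (CPV) for a single link.

    Assumes `active_visitors` is the average number of visitors on the
    page during the observation window (an illustrative simplification).
    """
    if minutes <= 0 or active_visitors <= 0:
        raise ValueError("minutes and active_visitors must be positive")
    return clicks / (minutes * active_visitors)


# Hypothetical numbers: 12 clicks on a link over 10 minutes,
# with an average of 400 visitors on the page.
cpv = clicks_per_active_visitor(12, 10, 400)
print(cpv)
```

Normalizing by both time and audience size is what lets links on a small site be compared with links on a large one.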

We looked at data from 294 publishing sites during several different times of day across several days to sample a variety of conditions. Much of what we found is not surprising: people click where the design guides them to click. For instance, the majority of clicks happen at page depths of 400 to 600 pixels, where most main content links are located (Figure 1). The other most probable places for clicks are the locations of menus on the left and right sides of the page. Nothing surprising here. As far as link sizes go, intuition holds as well: One would expect larger links, which likely represent headline articles, to drive greater traffic. This is certainly true. As a link’s area grows, so generally do the clicks per active visitor (Figure 2).

[Figure 1: Where do visitors click?]

[Figure 2: How does link size relate to click-throughs?]

Larger links correlate with higher click-throughs, but what about link density? For sites with a lot of closely packed links, does this dilute click-through rates? After all, there are only so many concurrent users to split across content. As a proxy for density, we looked at the median distance between links on a site. The data show that CPV decreases approximately linearly as the median distance between links grows from 450 pixels to about 2,000 pixels. Sites with more closely spaced links perform about two and a half times better than sites with distant links. It seems users prefer denser sites (Figure 3).
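The post does not say exactly how the distance between links was measured, so the following is only one plausible reading: take each link's center point and compute the median distance to its nearest neighbor. The function name and the sample coordinates are made up for illustration.

```python
import math
from statistics import median


def median_nearest_link_distance(centers):
    """Median distance (in pixels) from each link to its nearest neighbor.

    `centers` is a list of (x, y) link centers. Using nearest-neighbor
    distance is an assumption; the original analysis may have differed.
    """
    if len(centers) < 2:
        raise ValueError("need at least two links")
    nearest = []
    for i, (x1, y1) in enumerate(centers):
        best = min(
            math.hypot(x1 - x2, y1 - y2)
            for j, (x2, y2) in enumerate(centers)
            if i != j
        )
        nearest.append(best)
    return median(nearest)


# Four hypothetical links laid out on a 400 x 300 pixel grid.
links = [(0, 0), (0, 300), (400, 0), (400, 300)]
print(median_nearest_link_distance(links))
```

A smaller value means a denser page under this proxy.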

[Figure 3: Does link density affect click-throughs?]

These two pieces of evidence seem to contradict each other, though, because the distance between large links is necessarily large (assuming, of course, the links aren’t nested!). You might think, “Wait… if I have a lot of large links, I’ll have huge CPV, but they will be spaced far apart, so I’ll have a small CPV!” But, in reality, the data only reflect a common website design principle: a few large links interspersed with many smaller, closely spaced links.

In fact, if you ponder these data long enough, it seems that we run into a chicken-and-egg problem. Click-throughs force a tautology. Design forces people to click in certain places, so they do. And we measure this. See why engagement matters?

In any case, the data back up our intuition when it comes to determining how many people will click through to a given piece of content. Given a large enough dataset in which you know where a link is on a page, its height and width, how many people are on the page, and how many are currently engaged with content, you could likely obtain a reasonable prediction for the CPV. And perhaps using this knowledge, one might use such a model to guide the redesign of a website.

We decided to try this (not the site redesign part, the modeling part!). Simple statistical models we have recently built can predict CPV for a link to within 0.007 clicks per minute per active visitor for 92% of links. This might seem impressive, but for context, only four websites in the set we analyzed have a median CPV greater than this. There is much more work to do until we can really answer the question of whether design can predict attraction to and engagement with content, but the way forward is promising. Colors, font sizes, responsiveness: the design space is large. These can draw people in, but ultimately, it is the content that will keep people there.

So, the next time you are thinking of undergoing an overhaul or redesign, stare closely at your Heads Up Display. Think about link size, link density, and ask yourself what you can do to draw people into that fabulous content.

The Evolution of Dark Social: Correcting Attribution in the Mobile App Age

April 16th, 2015 by Chris

Over the past few years, Internet traffic has seen major changes. As smartphones become more ubiquitous, more and more people are spending a significant amount of time on the web on mobile devices, and in particular, via mobile applications. In October, more than half of the time Internet users spent online was via mobile and tablet applications.

With the rise in mobile application traffic, there has been a parallel increase in unattributed traffic to articles on the web—a bucket of traffic referred to as dark social. This category of traffic encompasses not only the visitors who enter a URL directly, but also those who click on links from email, instant messaging, and many mobile and desktop applications. Unattributed traffic can also result from a number of technical issues that cause referrer information to be omitted from a known traffic source. The lack of clear attribution for this traffic is a big problem: for most domains on our network, dark social accounts for 20% to 40% of overall external traffic to articles. Because of the popularity of mobile applications, the percentage of dark social traffic among mobile users is even higher.

Fortunately, the problem of dark social is becoming more widely acknowledged throughout the industry. Individual domains have long tried to manually alleviate the problem by including tracking tags and custom URLs on their social content, but are increasingly looking for additional tools to confront the problem head on. Analytics providers continue to refine their offerings and take a leading role in driving the conversation. Major referrer sources are doing more to ensure that their traffic is properly acknowledged. We’ll take a look at some of these developments.

One way of getting a handle on this attribution problem is to look carefully at traffic patterns among the articles on your site. For a large majority of the articles we have looked at, dark social traffic closely correlates in time with other attribution sources. For instance, several of the most popular mobile applications for Reddit do not pass referrer information. Consequently, when we see spikes in Reddit-based traffic on desktop, we tend to see a corresponding spike of dark social traffic on mobile. This suggests that a large portion of dark social traffic is really just misattribution of known referrers. As a result, for individual articles, you can explicitly attribute much of this traffic to the correct sources.

Chartbeat is now leveraging user agent profiles to disambiguate a significant chunk of dark social mobile application traffic. Many major mobile applications such as Facebook, Twitter, Pinterest, Drudge Report, and Flipboard set a tag in the user agent to identify the application. For example, in the following user agent, the tag “[FBAN/FBIOS…]” identifies the use of the Facebook application on iOS:

Mozilla/5.0 (iPhone; CPU iPhone OS 8_1_2 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Mobile/12B440 [FBAN/FBIOS;FBAV/21.0.0.25.14;FBBV/6017145;FBDV/iPhone7,2;FBMD/iPhone;FBSN/iPhone OS;FBSV/8.1.2;FBSS/2; FBCR/AT&T;FBID/phone;FBLC/en_US;FBOP/5]

In many cases, we saw an immediate difference after Chartbeat started capturing missing referrers for these user agent-tagged mobile applications. For instance, we saw the traffic attributed to mobile Facebook use jump as much as 40% from previously misattributed dark social traffic.
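As a rough sketch of the idea, the snippet below falls back to a user agent tag only when no referrer was sent. The function name, the tag table, and the labels are hypothetical; only the Facebook iOS tag comes from the example above, and this is not Chartbeat's actual implementation.

```python
# Hypothetical mapping from user agent tags to a referrer label.
# Only the FBAN/FBIOS tag is taken from the example user agent;
# the label itself is an assumption for illustration.
APP_TAGS = {
    "FBAN/FBIOS": "facebook-ios-app",
}

EXAMPLE_UA = (
    "Mozilla/5.0 (iPhone; CPU iPhone OS 8_1_2 like Mac OS X) "
    "AppleWebKit/600.1.4 (KHTML, like Gecko) Mobile/12B440 "
    "[FBAN/FBIOS;FBAV/21.0.0.25.14;FBBV/6017145;FBDV/iPhone7,2]"
)


def attribute_visit(user_agent, referrer=None):
    """Return the referrer if present, else an app label, else 'dark-social'."""
    if referrer:
        return referrer  # an explicit referrer always wins
    for tag, label in APP_TAGS.items():
        if tag in user_agent:
            return label  # app identified from its user agent tag
    return "dark-social"  # nothing to go on


print(attribute_visit(EXAMPLE_UA))
print(attribute_visit("Mozilla/5.0 (Windows NT 6.1)"))
```

Checking the referrer first keeps the user agent strictly as a fallback, so visits that already carry attribution are never relabeled.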

Several large sites have also made recent efforts to try to pass along referrer information more of the time. In early 2014, Yahoo made a sitewide conversion to use HTTPS instead of HTTP by default, causing referrer data to be dropped. Recently, however, we have observed changes from the Yahoo site that now allow the referrer to be passed for both Yahoo Search and Yahoo News. Facebook also recently announced that it fixed a bug that was causing referrer data to get lost on outgoing mobile application clicks. This fix is particularly notable because of how much traffic originates from the social network.

We can see the results of these changes across our network. Figure 1 shows how the share of dark social traffic has evolved over the second half of 2014. While dark social on desktop is relatively stable, we can see a significant drop in dark social for both mobile and tablet devices in November, concurrent with the Facebook fix. (We also see a corresponding rise in Facebook traffic.)

dark-social-share-of-traffic-by-referrer

As more sites pay closer attention to the analytics needs of their publishers and as more mobile applications pass referrer information or user agent identification, perhaps we can make further inroads into the problem of missing attribution. Still, even with the most recent efforts, dark social share remains at a third of external traffic. We still see close time series correlations for major drivers of traffic such as Facebook and Reddit. It is apparent that we’ve made strong progress in mitigating dark social traffic on mobile and tablet devices; but as a share of traffic, dark social on mobile is still significantly higher than dark social on desktop. Unfortunately, we can’t give up on tracking codes and custom URLs quite yet.

How Engaged Time Affects Reading Comprehension

December 22nd, 2014 by Chris

We recently got an interesting question from a client about the connection between engaged time and understanding in news articles. A priori, one may think that there should be a strong correlation (someone quickly skimming through an article should not be expected to retain as much as someone carefully reading), but there are some reasons this might not be the case.

Journalists are taught to get to the point quickly in their news articles, using an inverted pyramid style. One of the presumed benefits is to allow readers to exit the story at any point they like and still retain the important information from the story. One of the most common phrases a young journalist hears is often “don’t bury the lede” for this very reason. Perhaps, then, it might be that a reader gleans most of his/her understanding from a story from the first seconds of reading, and the marginal value of spending more time reading the rest of the story is relatively small. We decided to put the assertion to the test.

First, let’s you and I conduct an informal experiment right now. I’m going to present you with a few dense bullet points of the findings of the study, and I’d like you to decide for yourself whether you feel like you have anything to gain from reading further.

  • We conducted a survey of over 1000 people to investigate the association between how long a reader is engaged reading a news article and what s/he takes away from it.
  • We confirmed that there is a strong association between fact recall and engaged time.
  • Readers engaged for more than a minute were almost twice as likely to recall specific facts about the article as readers who spent less than 15 seconds. This was true even when the fact was found in the first lines of the article.
  • Further, we found evidence that readers who spend more time engaged are more likely to agree with the author’s conclusions.

Now that you’ve gotten the facts, you can feel free to leave this blog post and spend your valuable time elsewhere, but if you’ll indulge me, I think you’ll find that your understanding may be heightened if you let me go into a little more detail about the experiment and the results.

Experimental Design

In order to test the relationship between reading comprehension and engagement, we carefully considered how best to induce paid participants to act like internet users reading news articles. We’ve previously established that it’s not common for a reader to read all the way through an article. In a typical article, the most likely (modal) behavior for a reader is to leave after only about 15 seconds. The simplest survey design would be to instruct readers to read through an article however they like and then ask questions about it afterwards. The problem with structuring an experiment like this is that a paid participant just doesn’t act like a typical internet user. We tried this with a quick pilot study. The average reading time for the article we selected was an order of magnitude higher than what you might expect for a typical reader.

It struck me that one important consideration that this design was obviously lacking was an element of choice. When a reader visits your site, s/he is making a choice to spend valuable time reading your content instead of reading a different article, looking at funny cat pictures, or spending time off of the internet entirely. In the naive design, we’d effectively purchased time from people to take our survey, so they felt compelled to “do a good job” and carefully read through the article, even though that’s not what we wanted them to do at all.

Once we settled on a design that included elements of choice, we got more sensible results. We put together a simple website that showed one of eight different news articles with a button to flip from one to the next. The participants were instructed to read as much or as little of each article as they liked. After five minutes, the site would redirect users to a survey page with five multiple-choice questions about one particular article, an opinion piece about Iranian air strikes:

  1. A detail question asking about a fact from the first paragraph of the article.
  2. A detail question about a fact from near the end of the article.
  3. An “attention check” designed to weed out respondents who were not reading the questions.
  4. A conceptual question asking for a summary of the author’s thesis.
  5. An opinion question relating to the author’s message.

We asked 1,000 paid participants on Amazon’s Mechanical Turk to take our survey. Of these, we eliminated about 10% for various reasons (e.g., failing the attention check question, multiple submissions from the same device, not clicking through to the article we asked questions about). When we look at an article, our engagement metrics look reasonable:

[Figure: histogram of engaged time]

The graph above shows a peak engagement of about 20 seconds, gradually tapering off as time increases. This reassures us that we are sampling a population that models internet viewership reasonably well.

We can also intuitively understand the pattern of responses.

[Figure: bar chart of respondents answering each question correctly]

The majority of readers could correctly identify a detail from the beginning of the article and summarize the author’s thesis, but fewer were able to answer a question about a detail near the end of the article. This can be explained by the fact that relatively few people read through the article. In truth, the 37% that answered the question correctly is likely as high as it is only because you could expect 25% correct from random chance.

Fact Recall by Engaged Time

Looking deeper, we find a strong association between fact recall and engaged time. Readers spending more than a minute were almost twice as likely to recall specific facts about the article as readers who spent less than 15 seconds (approximately the top and bottom quartiles of the engaged time of the responses).

Let’s look specifically at conceptual understanding. Roughly 40% of participants engaged for less than 15 seconds correctly assessed the message of the article, compared to more than 80% of those engaged for more than a minute. We’ve plotted recall against engagement below, with its associated 95% confidence interval. The slope of the logistic regression tells us that for every 15 seconds of increased engagement we can expect to see about a 30% increase in the odds of answering the question correctly.

[Figure: logistic regression of correct answers to question 4 against engaged time]

The complete results are summarized in the following table. In particular, for each of the questions, we see a positive association between recall and engaged time, even when the relevant information was found at the very beginning of the article.

Respondents Answering Correctly by Engaged Time

Question                           Overall   < 15 seconds   > 1 minute   Increase in odds per 15-second increase in engaged time
Conceptual Understanding           62%       42%            81%          32%
Detail from Beginning of Article   63%       39%            81%          31%
Detail from End of Article         37%       27%            44%          8%
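An increase in odds is easy to misread as an increase in probability, so here is a small sketch of how the two relate. The function name is hypothetical, and the starting probability and number of steps are illustrative inputs loosely based on the table, not additional results from the study.

```python
def probability_after(p0: float, odds_increase: float, steps: float) -> float:
    """Probability after the odds grow by `odds_increase` for each of `steps` steps.

    Odds are p / (1 - p); a 32% increase multiplies the odds by 1.32.
    """
    if not 0 < p0 < 1:
        raise ValueError("p0 must be strictly between 0 and 1")
    odds = p0 / (1 - p0) * (1 + odds_increase) ** steps
    return odds / (1 + odds)


# Illustrative only: start at a 42% chance of a correct answer and apply
# a 32% increase in odds for three additional 15-second steps.
p = probability_after(0.42, 0.32, 3)
print(round(p, 3))
```

Because the multiplier applies to the odds, the gain in probability per step shrinks as the probability approaches 100%.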

A matter of opinion

The opinion question was adapted from a June 2014 CBS News/New York Times Poll: “Do you favor or oppose the United States working with Iran in order to try and resolve the situation in Iraq?” In the original poll, 53% were in favor, 39% opposed, and 8% were unsure. The first observation about the results is that respondents were much less likely to express an opinion, but given the survey population this is not surprising. The second observation is that readers of this article, which supported this position, were relatively more likely to agree with the author’s position. In particular, the portion of responses agreeing with the author varied significantly between people who engaged with the article for more than a minute and those who engaged for less than 15 seconds:

[Figures: Q5_top_pieplot and Q5_bottom_pieplot, pie charts of responses to question 5]

I caution that the experiment does not show causation. It’s probably the case, for instance, that readers who more strongly agree with the author are more likely to be more engaged with the article. However, it is at least plausible that reading the article helped shape the opinions of the readers (though we would have to do more tests to find out how much this might be the case in general).

It’s worth explicitly noting that we only effectively ran this experiment for one particular article. Whereas we think the questions we’ve raised and attempted to answer are important, and that the results we’ve shown are useful directional indications, it’s clear that the magnitude of the results will depend on the questions asked and the article itself.

Win Hearts and Minds

We’ve seen before that engaged time affects things like brand recall for advertisements, so the result that engaged time affects reading comprehension is not altogether surprising. What is interesting is the extent to which this is borne out in the results. This just adds to the growing body of evidence that capturing the attention of your readers gives you the opportunity to win both their hearts and their minds.

For writers, I suppose this conclusion is both a blessing and a curse. Yes, you’re still going to have to spend time polishing the second halves of your articles; but if you focus on keeping your readers engaged, they will ultimately take away more from what you’re saying. And isn’t that the point of effective journalism?

The State of Dark Social in 2014

December 4th, 2014 by Chris

Here at Chartbeat, we have a long history of trying to shed light on the sources of your traffic. Since 2012, we’ve helped illuminate the phenomenon known as dark social—where traffic is likely to come from social sources, yet lacks explicit referrer attribution. Two years later, Internet traffic looks a lot different than it previously did. Mobile and application traffic have grown significantly. More sites are moving to HTTPS. Usage patterns are evolving. We wanted to take the opportunity to look into the current state of dark social and dive deeply not only into potential causes, but also potential disambiguations of this nebulous block of traffic. As a result of these investigations, we found a way to attribute a sizable chunk of dark social (up to half!) to application traffic.

So what exactly is dark social? Here’s a brief recap. Back in the Wild West of web analytics, we tagged any traffic coming in without a referrer field as direct traffic. Many people have attempted to exhaustively list what might cause an empty referrer field, but the typical explanation of a visitor typing in the URL directly was unsatisfying for article content. The alternate explanation that these visitors came from IMs, emails, or apps seemed much more likely, and so we categorized them as social instead. This social traffic came to be known as “dark social” and has made regular appearances at the top of referrer lists ever since.

Dark Social Volume

These days, dark social accounts for about a third of external traffic to sites across our network. The exact amount varies quite a bit depending on the particular site in question, but most sites have a chunk ranging from significant to extremely significant. The following graph shows a rough distribution of the percent of external traffic classified as dark social for a given domain for a sample of Chartbeat’s data, with the mean given in red:

We can break this data out further. The number is markedly higher on mobile, with upwards of 50% of mobile external traffic lacking a referrer on some sites. This is already a critical problem (how are we to analyze our top traffic drivers if we can’t attribute half of our traffic?), and since mobile’s share of traffic is increasing, it’s only going to get worse.

Potential Sources of Dark Social

In order to get a handle on the drivers of the problem, we did an empirical analysis of potential sources of dark social by setting up a site, posting links to it on various traffic sources, and clicking those links from a wide variety of traffic sources. The goal was to determine which traffic sources can be reliably assumed to not be dark social (because they always successfully set the referrer) and which do contribute to dark social (because they always lack a referrer or sometimes lack a referrer).

We were specifically interested in looking at some of the most popular social mobile apps. The following table shows whether some of the combinations of sites and modes of interaction successfully passed a referrer in our testing (with the caveat that we only tested the current versions of the applications and were not exhaustive using all different browsers and operating systems):

Referrer Passed?   Desktop   Mobile Browser   Mobile App
Facebook           Mostly    Yes              Sometimes
Twitter            Yes       Yes              Yes
Reddit             Yes       Yes              No
Tumblr             Yes       Yes              Yes
Gmail              No        No               No
IM/Text            No        No               No

We can see that major traffic sources are generally good about allowing their data to be tracked. However, there were some interesting exceptions.

  1. Facebook’s desktop site sometimes doesn’t set a referrer if the onclick listener is avoided (for instance, if you open a link in a new tab/window).
  2. Desktop and mobile traffic from Reddit.com sets a referrer, but the top apps for reading Reddit all do not set a referrer.
  3. It became clear in further analysis (see below) that the Facebook app only sometimes sets the referrer.

Beyond these notes, things generally worked as expected: email, IM, and most mobile apps were dark social; social networks and major sources of external traffic (even some using HTTPS, like Facebook and Google) were not.

Disambiguating Traffic with Time Series

The above findings raised more questions than they settled. If less well measured sources like email and IM drive a significant portion of traffic, do they at least correlate well with more explicitly measurable sources of traffic? For applications like Facebook and Reddit that do not always send referrer data, is there a way to identify their contributions within patterns of dark social traffic? We found that in many cases, the answers to these questions were a resounding yes.

For the next phase of our analysis, we wanted to take a look at the time series data for specific articles to try to identify patterns in the traffic. If a popular story were to break, you’d expect to see different responses in different traffic sources. For a site like Reddit, you might expect traffic to be tightly peaked and highly correlated with the story’s ranking on the home page. For a site like Facebook, the interest might fade out more gradually as it filters through different people’s feeds. You might expect instant messaging to yield a tighter, shorter-tailed traffic distribution than a medium like email. The following plot shows an interesting example of a story that illustrates some of these features. There was a distinct spike in Reddit-driven traffic that lasted all of four hours followed by a more prolonged pickup in Facebook traffic.

The most interesting observation here is how well correlated dark social traffic is with the identifiable sources. In this example, you could be convinced that the dark social is really just misattributed traffic from Facebook and Reddit. Some evidence for this:

  • The residual traffic is almost non-existent, and in particular, the amount of internal and search traffic is negligible.
  • If a secondary social sharing mechanism like email or IM were driving a significant amount of traffic, we’d expect to see some delay in the dark social time series from the sharp spike in traffic.

We can further break these numbers down by examining the difference between mobile and desktop traffic. In the following graph, we zoom into the Reddit spike in traffic in the above article.

We can see a stark divergence in traffic patterns by device, which confirms some of our earlier findings. We have at least strong anecdotal evidence that a large portion of Reddit mobile traffic comes from apps and is being categorized as dark social.

We can examine the patterns of traffic for the Facebook-driven portion of the time series as well:

Here, the picture is not quite as cut-and-dried as before. Dark social comprises only a small percentage of overall desktop traffic, but commands a fairly significant chunk of mobile traffic. This pattern is typical across other articles. When we observe Facebook traffic, we can almost always find a corresponding amount of dark social traffic. The actual amount of dark social traffic relative to Facebook traffic can vary significantly by article and by site, but will generally be much higher on mobile devices. As Facebook is such a large driver of mobile traffic in general, this can help explain some of the difference we see between desktop dark social share and mobile dark social share.

Of course, it’s difficult to disambiguate where dark social is coming from at scale, because it’s a mix of traffic from many referrers. But, for a large majority of stories, if we look at the top 10 referrers and correlate the time series of traffic that they send with dark social’s time series, we find a referrer whose series matches very closely, which strongly suggests that that particular story is getting its dark social from that particular referrer.

This suggests that, while we can’t just flip a switch and disambiguate all traffic, a careful analysis of a particular story is likely to be able to turn up the source of the majority of its dark social. Of course, this won’t always work: there are still person-to-person shares (IM, email, etc.), shares on apps with no corresponding website, and so forth that account for a chunk of dark social. Still, if we look at correlations between dark social traffic and other traffic sources (a rudimentary and blunt tool to be sure), we see that fewer than 25% of stories have time series that have less than 80% correlation, with many being much more highly correlated.
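The per-story comparison described above can be sketched in a few lines. This is a minimal illustration that assumes a plain Pearson correlation over equally spaced visit counts; the function names and the sample series are made up, and the original analysis may have used a different measure.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no timing information
    return cov / math.sqrt(var_x * var_y)


def best_matching_referrer(dark_social, referrer_series):
    """Return (referrer, correlation) for the referrer closest to dark social."""
    scores = {name: pearson(dark_social, s) for name, s in referrer_series.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Made-up visit counts per time interval for a single story.
dark = [5, 40, 90, 30, 10, 6]
referrers = {
    "reddit.com": [10, 80, 180, 60, 20, 12],
    "facebook.com": [20, 22, 25, 30, 34, 36],
}
name, score = best_matching_referrer(dark, referrers)
print(name, round(score, 2))
```

A high correlation only shows that the two series rise and fall together; as the text notes, it is a blunt tool and does not prove where any individual visit came from.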

Disambiguating App Traffic Using User Agents

In this analysis, we discovered that many major apps set a string in the user agent that can be used to identify the app, even in the case that the app doesn’t set a referrer. Facebook, Buzzfeed, Twitter, QQ, Baidu, and others all do this. By looking at this user agent string and using it to identify the referrer, we’re able to disambiguate a non-trivial portion of dark social traffic and correctly attribute it to specific mobile apps. We recently implemented this change, and if you happened to be looking closely at your dashboard around 6pm last night, you might have seen your m.facebook.com traffic jump up by 40% and your dark social fall by 5-10% when we flipped the switch. While this is only a small piece of the overall dark social share, it is a clear step in the right direction. As more apps take similar measures, this approach has the potential to help reverse the growth of the dark social problem.

Going Forward

As we get more data from the User Agent change, it will be interesting to see how much of the relationship between dark social and some of the major applications remains. Will the relationship between dark social and Facebook mobile traffic disappear? Well, probably not, because there will still be people who see a link on Facebook and then share it through text or email or other means.

Still, the general approach of looking into your articles’ traffic patterns is quite fruitful — you’re likely to be able to identify the source of dark social for specific stories if you choose to dive in (feel free to reach out if you’d like advice on how to do it using our historical APIs).

As always, we’ll be keeping a keen eye on dark social. Please feel free to reach out to me with any questions, specifically about your traffic or generally about dark social, at christopher@chartbeat.com.

Economics of Ad Refreshing

November 5th, 2014 by Justin

Editor’s Note: This article originally appeared in the fall 2014 issue of the Chartbeat Quarterly, our once-a-season data science magazine.

When a television program goes to commercial break, we see a series of 30-second spots, rather than one continuous advert. That three minutes of commercial time generates more collective value to advertisers when it’s split up than if it were given to a single advertiser. So what happens if we apply the same principle to ads on the Internet?

Our research suggests that the longer an ad is in view, the greater the likelihood that a person will recall the brand behind the advertisement. However, according to multiple studies, after a short period of time, the effect of time on brand recall is greatly diminished (Figure 1).

[Figure 1: effect of exposure time on brand recall]

This means that ads with higher active exposure time have higher value to advertisers, but only to a point. So why not exploit this fact by “refreshing” an ad after a fixed amount of time?

Ad refreshing is not a new idea, but it is unpopular because ads refreshed traditionally (after a certain amount of wall clock time has passed) are unlikely to be seen. A series of non-viewable ads has no value to advertisers. On the other hand, if we refresh ads once they’ve been in view for a set amount of time, we can ensure that an ad was seen for a fair amount of time before changing it over to a new one and that the new ad will be viewed.

This is an exciting idea because refreshing ads generates a large number of new viewable impressions. Traditionally, if a user is reading a page for two minutes with an ad in view, this person will only be exposed to one ad in a given position. If we refresh each ad after it’s viewed for 30 seconds, however, each single impression becomes four, generating three additional impressions, each of which is viewed. Table 1 shows the impact of different ad refresh times on viewable impressions and average exposure times across the Chartbeat network.

[Table 1: impact of ad refresh times on viewable impressions and average exposure times]
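The two-minute example above works out as follows. This is a minimal sketch with a hypothetical function name; it assumes the ad stays in view for the whole period and that a new ad is loaded only once the previous one has accumulated the full refresh time in view.

```python
import math


def viewable_impressions(viewed_seconds: float, refresh_seconds: float) -> int:
    """Ads shown in one position when each is swapped after `refresh_seconds` in view."""
    if refresh_seconds <= 0:
        raise ValueError("refresh_seconds must be positive")
    if viewed_seconds <= 0:
        return 0  # the position was never in view
    return math.ceil(viewed_seconds / refresh_seconds)


# Two minutes in view with a 30-second refresh.
total = viewable_impressions(120, 30)
print(total, "impressions,", total - 1, "of them additional")
```

Without refreshing, the same two minutes would yield a single impression.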

From Table 1, we see that the number of viewable impressions on a typical site can be increased by as much as 93% if a 10-second ad refresh is used. This has the effect of almost doubling the inventory of viewable impressions on a site. On the other hand, we also see that this reduces the time that people spend with individual ads on average, because we are limiting the amount of time people can spend with an individual ad. This means that each refreshed impression has slightly less value to an advertiser than before. Because of this, we can probably expect that advertisers would require a discount to compensate for the loss of time.

So, is ad refreshing worth it? Does the value of an increased inventory of viewable impressions offset the loss in value to each refreshed impression? To answer this question, we will investigate the economic ramifications of ad refreshing.

Our goal is to compare the value of the ad inventory on a typical site with and without ad refreshing.

According to research at Yahoo, the closer an ad is to the start of a session, the more likely a user is to recall the brand represented in that ad. This means that when refreshing ads, the ads shown first have more value than the ads shown later. In fact, the researchers suggest that showing more than two ads in a single session is unlikely to be effective. Therefore, for our comparison we will only analyze single ad refreshes within an ad unit and we will make the following assumptions:

  1. Value of ad exposures to an advertiser can be quantified by recognition and recall.
  2. This value to advertisers correlates directly to revenue for the publisher.
  3. The values of first and second ad impressions are represented in Figure 2, which relates exposure time to recall and recognition.

[Figure 2: exposure time versus recall and recognition for first and second ad impressions]

We use these assumptions to calculate a baseline value of the ad inventory for a typical site without ad refreshing and compare this to the value of the ad inventory using different ad refresh times.

As we can see in Table 2, ad refreshing does result in an increase in ad revenue. This means that the increase in viewable impression inventory outweighs the loss in value to refreshed impressions thanks to the diminishing returns in recall shown in Figure 2.

[Table 2: value of ad inventory with and without ad refreshing]

Researchers agree that refreshing ads this way should increase a site’s revenue, and I think this calculation bears this out. For example, with a 10-second ad refresh the typical site gains 93% extra inventory of viewable impressions, and a 12% increase in revenue. Even with our relatively conservative calculation that only allowed for a single ad refresh, we see a healthy increase in revenue. For this reason, it seems likely that ad refreshing will be a significant source of new revenue for online publishers.


 
