Author Archive

Update: a reader wrote in with the great suggestion of examining the effect of direct quotations in headlines. We found that headlines with direct quotes are 14% more likely to win headline tests than average headlines, making them the second most effective headline style we’ve tested. Please comment or get in touch with other suggestions for headline styles to examine!

Writing a catchy headline that captures the attention of your audience is, without question, an art form. As demonstrated in this headline, blindly following guidelines can lead to copy that sounds cliché at best and actively off-putting at worst. Still, effective headline writing can make quite a difference in the success of your content — after all, readers have to get to the actual articles somehow — so it can be expensive to get wrong.

Chartbeat Engaged Headline Testing enables content creators and editors to become better headline writers. By testing copy in real time, newsrooms can challenge assumptions about what kinds of headline constructions work well and which don’t.

Accordingly, we would like to turn that introspective lens on some of our own recommendations for how best to use our tool, and then on some commonly cited "tips and tricks" for getting the most out of your headlines. As a caveat, while we have the luxury of being able to plot general trends in a rich dataset of over 100 publishers and almost 10,000 headline tests, each publisher and audience is different. We encourage you to take a look at your own data and put some of our findings to the test (literally!) to see what works best for you.

Verifying Best Practices for Engaged Headline Testing

To help our clients get started with our tool, we often give them a list of best practices. Here are a few examples:

  • Test in Higher Traffic Positions
  • Don’t be Afraid to Test Multiple Variants
  • Test Distinct Differences

We like to encourage users to conduct headline tests that converge to a winner quickly, so that winning headlines spend the most possible time with the largest possible audience.

This raises the question of what "converging to a winner quickly" means, and to answer it, I would like to appeal to our data for an overall view. The graph below shows a histogram of experiments by the number of headline trials — that is, the number of unique visitors that see one of the tested headlines:

[Figure: histogram of headline tests by number of trials]

About half of conclusive experiments (those that determine a winner) need fewer than 2,500 trials to converge. More than 85% need fewer than 10,000 trials. That said, identifying an average convergence time for your site will depend on the amount of traffic you have and how “evergreen” your content is.

For the sake of example, let's imagine a publisher that gets 100 trials per minute. They want to see their experiments finish within 25 minutes. The above statistics imply that only about half of this publisher's experiments will finish before we reach 25 * 100 = 2,500 trials.


Click-Through Rate
Now, let’s take a look at how we can leverage higher traffic (click-through rate) positions to optimize for convergence time. The following graph is a density plot of number of trials needed for convergence against the CTR of the winning headline:

[Figure: density plot of trials needed for convergence vs. CTR of the winning headline]

While there is a fair amount of noise in the plot, the main indication is that the number of trials needed is roughly inversely proportional to the CTR of the slot. So what does this mean in practice? If a publisher tests in a prominent headline position getting an 8% CTR on the page, the test will converge in roughly a quarter of the trials needed in a below-the-fold position getting a 2% CTR. That brings our convergence rate (within 25 minutes) from 50% to closer to 90%. Pretty astounding.
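To make that inverse relationship concrete, here is a rough back-of-the-envelope sketch. It uses the classical two-proportion sample-size formula rather than our actual testing algorithm, and the relative lift, significance level, and power are illustrative assumptions — the point is the scaling, not the absolute numbers:

```python
# Rough sanity check (not Chartbeat's testing algorithm): for a fixed *relative*
# lift, the classical two-proportion sample-size formula scales roughly as 1/CTR,
# so a higher-CTR slot needs proportionally fewer trials.
from scipy.stats import norm

def trials_per_variant(base_ctr, relative_lift=0.25, alpha=0.05, power=0.80):
    """Approximate trials per variant to detect a given relative CTR lift."""
    p1 = base_ctr
    p2 = base_ctr * (1 + relative_lift)
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2

print(trials_per_variant(0.08))  # prominent slot, ~8% CTR
print(trials_per_variant(0.02))  # below-the-fold slot, ~2% CTR: roughly 4x more trials
```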


Number of Headline Variants
Finally, let’s graph the number of headline variants in each experiment:

[Figure: distribution of the number of headline variants per experiment]

Right now, we see that more than two-thirds of our headline tests are basic A/B tests, meaning only 2 variants. There are clear pros and cons to testing additional headline options. On the negative side, you need to actually write more headlines, and I can sympathize with the creative burden. (Unfortunately, taking the lazy way out by merely tweaking a word or rearranging a sentence tends to have less impact than highlighting different viewpoints or angles.) Also, adding an additional (average) headline will often hurt convergence time, because you need extra trials to explore the added headline.

[Table: amount by which the winning headline's CTR exceeds the average headline's, by number of variants tested]

But, as demonstrated in the table above, there is also a clear benefit to testing additional headlines. The table shows the amount by which the winning headline exceeds an average headline, by number of headlines tested. The winning headline in a five-variant experiment typically has a CTR more than 50% higher than the average headline, whereas you may only see a 23% benefit for a standard A/B test. This widening gap between winner and mean follows directly from the variance in the CTR of each headline: the more variants you draw, the further the best one tends to sit above the average (see the sketch below). Another consideration is how often the original headline (Variant A) ends up as the winning headline. Admittedly, this result depends fairly strongly on how organizations come up with their headlines; but even in the A/B case, publishers have been significantly rewarded for using the additional variant. In some extreme cases, we have seen publishers use as many as 17 (!) different variants in a single headline test, successfully converging in fewer than 10,000 trials (!!).
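To see why this follows from variance across headlines, here is a toy Monte Carlo sketch. The lognormal distribution and its spread are illustrative assumptions, not quantities measured from our data; the qualitative point is that the expected lift of the maximum over the mean grows as more variants are drawn:

```python
# Toy simulation: draw k hypothetical headline CTRs per experiment and measure how
# far the best one sits above the average. The distribution and spread are made up
# for illustration; the printed lift grows with the number of variants.
import numpy as np

rng = np.random.default_rng(0)

def winner_lift_over_mean(k, spread=0.45, n_sims=20000):
    ctrs = rng.lognormal(mean=np.log(0.05), sigma=spread, size=(n_sims, k))
    return (ctrs.max(axis=1) / ctrs.mean(axis=1) - 1).mean()

for k in range(2, 6):
    print(k, round(winner_lift_over_mean(k), 3))
```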

Testing the Efficacy of Common Headline Themes

We wanted to take a closer look at the characteristics that make up a good headline. Some of the essence of a great headline, such as Vincent A. Musetto's "Headless Body in Topless Bar," can never be fully captured in categorical variables; but there are common tropes used to capture audience attention. With the help of headline guides, other headline studies, and raw expertise, we compiled a list of 12 commonly cited themes:

  1. Does the headline contain a question?
  2. Does the headline have a number?
  3. Does the headline use adjectives?
  4. Does the headline use question words (e.g., ‘who’, ‘what’, ‘where’, ‘why’)?
  5. Does the headline use demonstrative adjectives (e.g., ‘this’, ‘these’, ‘that’, ‘those’)?
  6. Does the headline use articles (e.g., ‘a’, ‘an’, ‘the’)?
  7. Is the headline in the 90th percentile of length (73 characters or greater)?
  8. Is the headline in the 10th percentile of length (32 characters or fewer)?
  9. Does the headline contain the name of a person?
  10. Does the headline contain any named entity (e.g., person, place, organization)?
  11. Does the headline use positive superlatives (‘best’, ‘always’)?
  12. Does the headline use negative superlatives (‘worst’, ‘never’)?

For this exercise, Spacy.io was used for the natural language processing tasks, including entity recognition and part-of-speech tagging for English language sites.
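For readers who want to try this on their own headlines, here is a simplified sketch of how a few of these properties could be extracted with spaCy. The model name and the exact word lists are assumptions for illustration, not necessarily what we used:

```python
# Requires: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

QUESTION_WORDS = {"who", "what", "where", "why"}
DEMONSTRATIVES = {"this", "these", "that", "those"}
ARTICLES = {"a", "an", "the"}
POSITIVE_SUPERLATIVES = {"best", "always"}
NEGATIVE_SUPERLATIVES = {"worst", "never"}

def headline_features(headline):
    doc = nlp(headline)
    words = {t.lower_ for t in doc}
    return {
        "has_question": "?" in headline,
        "has_number": any(t.like_num for t in doc),
        "has_adjective": any(t.pos_ == "ADJ" for t in doc),
        "has_question_word": bool(words & QUESTION_WORDS),
        "has_demonstrative": bool(words & DEMONSTRATIVES),
        "has_article": bool(words & ARTICLES),
        "is_long": len(headline) >= 73,   # 90th percentile of length
        "is_short": len(headline) <= 32,  # 10th percentile of length
        "has_person": any(ent.label_ == "PERSON" for ent in doc.ents),
        "has_entity": len(doc.ents) > 0,
        "has_positive_superlative": bool(words & POSITIVE_SUPERLATIVES),
        "has_negative_superlative": bool(words & NEGATIVE_SUPERLATIVES),
    }

print(headline_features("These 5 Charts Explain Why Headline Testing Works"))
```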

There are a number of statistical challenges in sorting out which characteristics have real significance and which are spurious outliers. First, when making multiple significance tests, it is important to control the familywise error rate via Bonferroni correction; otherwise, you greatly increase the likelihood of spurious results. Second, there are a number of confounding variables to consider. Raw CTR is appealing for its simplicity, but it could very well be the case that short headlines, for instance, are much more likely to be tested in leaderboard spots at the top of busy homepages, so even if they are inferior to other headlines in the same spot, their raw CTR ends up being higher. This is a form of Simpson's Paradox.
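As a concrete illustration of the familywise-error control (with made-up p-values, one per headline property), a Bonferroni correction might look like this:

```python
# Minimal illustration of Bonferroni correction; the 12 p-values are hypothetical.
from statsmodels.stats.multitest import multipletests

p_values = [0.001, 0.004, 0.012, 0.03, 0.2, 0.5] + [0.7] * 6
reject, p_adjusted, _, _ = multipletests(p_values, alpha=0.05, method="bonferroni")

# Each raw p-value is effectively compared against 0.05 / 12 ≈ 0.0042, so only
# the first two "significant-looking" results survive the correction here.
print(list(zip(p_values, reject)))
```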

We will look at two alternative metrics of headline success. The first is scaled CTR: instead of comparing CTRs globally, we look at the ratio of a given headline's CTR to the CTR of the headline that won its experiment. With this metric, the average scaled CTR of a headline is close to 77% in this data set, so we use that 77% as a benchmark to see whether a particular property has a beneficial effect.

The second metric is winner propensity. We look at the set of experiments that compare headlines with a given property against headlines without it, and calculate how often we would expect headlines with that property to win if the winner of each experiment were chosen at random. We then check whether headlines with the property win more often than that baseline.
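Here is a sketch of how both metrics could be computed, assuming a pandas DataFrame with one row per headline variant and illustrative column names (experiment_id, ctr, won, has_property); this is not our production code:

```python
import pandas as pd

def scaled_ctr(df: pd.DataFrame) -> pd.Series:
    # Each headline's CTR divided by the CTR of its experiment's winning headline
    # (assumes exactly one winning row per experiment).
    winner_ctr = df["experiment_id"].map(
        df.loc[df["won"]].set_index("experiment_id")["ctr"]
    )
    return df["ctr"] / winner_ctr

def winner_propensity(df: pd.DataFrame, prop: str = "has_property"):
    # Restrict to experiments that mix headlines with and without the property.
    has_any = df.groupby("experiment_id")[prop].any()
    has_all = df.groupby("experiment_id")[prop].all()
    mixed_ids = has_any.index[has_any & ~has_all]
    d = df[df["experiment_id"].isin(mixed_ids)]
    # Expected wins if each experiment's winner were drawn uniformly at random...
    expected = d.groupby("experiment_id")[prop].mean().sum()
    # ...versus the number of experiments the property's headlines actually won.
    observed = d.loc[d["won"], prop].sum()
    return observed, expected
```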

[Table: scaled CTR and winner-propensity results for each headline property]

Results
The results were somewhat mixed. Only long headlines and headlines with demonstrative adjectives show significantly higher scaled CTR, and only headlines with demonstrative adjectives and numbers show a higher propensity to be declared the winner of a given headline test. The presence of articles actually detracts significantly from scaled CTR.

It’s worth discussing the one unambiguous result in a bit more detail. Demonstrative adjectives can actually be used in multiple ways in a headline. You can use them to create intrigue in clickbait-ish fashion: “These simple tricks will leave you speechless” or “You’ve never tasted anything like this.” There are also quite a few examples in our dataset of using demonstrative adjectives as a temporal specifier: “GOP Debate this evening,” for instance. In the future, as we collect more data, we can think about drilling down more granularly into specific constructions.

Perhaps more interesting than the positive results is the lack of significance among other factors that have been cited to be useful in capturing the attention of an audience. “Use terse, punchy headlines”; “Ask questions”; “Name drop.” None of these properties show much predictive power in the general case.

“That’s right, writers: We’ve proven that ‘5 Ways To Write The Best Headline Ever’ isn’t actually that effective.”

Final Thoughts
So where does that leave us? If you want to be an effective headline writer, maybe there is no substitute for creativity and attention. Watch for patterns in the headlines that end up floating to the top. Take the time to discuss what worked and what didn’t. Avoid the formulas and cliches. Be liberal with your use of headline testing, so that you can harness feedback from your readers in real time.

If there are any other ideas that you would like us to take a look at in the data, especially as our repository of tests grows, please don’t hesitate to reach out.


Over the past few years, Internet traffic has seen major changes. As smartphones become more ubiquitous, more and more people are spending a significant amount of time on the web on mobile devices, and in particular, via mobile applications. In October, more than half of the time Internet users spent online was via mobile and tablet applications.

With the rise in mobile application traffic, there has been a parallel increase in unattributed traffic to articles on the web—a bucket of traffic referred to as dark social. This category of traffic encompasses not only the visitors who enter a URL directly, but also those who click on links from email, instant messaging, and many mobile and desktop applications. Unattributed traffic can also result from a number of technical issues that cause referrer information to be omitted from a known traffic source. The lack of clear attribution for this traffic is a big problem: for most domains on our network, dark social accounts for 20% to 40% of overall external traffic to articles. Because of the popularity of mobile applications, the percentage of dark social traffic among mobile users is even higher.

Fortunately, the problem of dark social is becoming more widely acknowledged throughout the industry. Individual domains have long tried to manually alleviate the problem by including tracking tags and custom URLs on their social content, but are increasingly looking for additional tools to confront the problem head on. Analytics providers continue to refine their offerings and take a leading role in driving the conversation. Major referrer sources are doing more to ensure that their traffic is properly acknowledged. We’ll take a look at some of these developments.

One way of getting a handle on this attribution problem is to look carefully at traffic patterns among the articles on your site. For a large majority of the articles we have looked at, dark social traffic closely correlates in time with attributed traffic sources. For instance, several of the most popular mobile applications for Reddit do not pass referrer information. Consequently, when we see spikes in Reddit-based traffic on desktop, we tend to see a corresponding spike of dark social traffic on mobile. This suggests that a large portion of dark social traffic is really just misattribution of known referrers. As a result, for individual articles, you can explicitly attribute much of this traffic to the correct sources.

Chartbeat is now leveraging user agent profiles to disambiguate a significant chunk of dark social mobile application traffic. Many major mobile applications such as Facebook, Twitter, Pinterest, Drudge Report, and Flipboard set a tag in the user agent to identify the application. For example, in the following user agent, the tag “[FBAN/FBIOS…]” identifies the use of the Facebook application on iOS:

Mozilla/5.0 (iPhone; CPU iPhone OS 8_1_2 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Mobile/12B440 [FBAN/FBIOS;FBAV/21.0.0.25.14;FBBV/6017145;FBDV/iPhone7,2;FBMD/iPhone;FBSN/iPhone OS;FBSV/8.1.2;FBSS/2;FBCR/AT&T;FBID/phone;FBLC/en_US;FBOP/5]
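A minimal sketch of the idea (not our production implementation): when the referrer is missing, look for an application tag in the user agent. The patterns for apps other than the Facebook example above are assumptions and would need to be verified against observed traffic:

```python
import re

# Hypothetical tag patterns; only the [FBAN/...] form is taken from the example above.
APP_UA_PATTERNS = {
    "m.facebook.com": re.compile(r"\[FB(?:AN|_IAB)/"),
    "twitter.com": re.compile(r"Twitter", re.IGNORECASE),
    "flipboard.com": re.compile(r"Flipboard", re.IGNORECASE),
}

def infer_app_referrer(user_agent: str, referrer: str) -> str:
    """Fall back to user-agent tags when no referrer is present."""
    if referrer:
        return referrer
    for source, pattern in APP_UA_PATTERNS.items():
        if pattern.search(user_agent):
            return source
    return "dark social"

ua = ("Mozilla/5.0 (iPhone; CPU iPhone OS 8_1_2 like Mac OS X) "
      "AppleWebKit/600.1.4 (KHTML, like Gecko) Mobile/12B440 "
      "[FBAN/FBIOS;FBAV/21.0.0.25.14;FBDV/iPhone7,2]")
print(infer_app_referrer(ua, referrer=""))  # -> m.facebook.com
```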

In many cases, we saw an immediate difference after Chartbeat started capturing missing referrers for these user agent-tagged mobile applications. For instance, we saw the traffic attributed to mobile Facebook use jump as much as 40% from previously misattributed dark social traffic.

Several large sites have also made recent efforts to try to pass along referrer information more of the time. In early 2014, Yahoo made a sitewide conversion to use HTTPS instead of HTTP by default, causing referrer data to be dropped. Recently, however, we have observed changes from the Yahoo site that now allow the referrer to be passed for both Yahoo Search and Yahoo News. Facebook also recently announced that it fixed a bug that was causing referrer data to get lost on outgoing mobile application clicks. This fix is particularly notable because of how much traffic originates from the social network.

We can see the results of these changes across our network. Figure 1 shows how the share of dark social traffic has evolved over the second half of 2014. While dark social on desktop is relatively stable, we can see a significant drop in dark social for both mobile and tablet devices in November, concurrent with the Facebook fix. (We also see a corresponding rise in Facebook traffic.)

[Figure 1: dark social share of traffic by device type, second half of 2014]

As more sites pay closer attention to the analytics needs of their publishers, and as more mobile applications pass referrer information or user agent identification, perhaps we can make further inroads into the problem of missing attribution. Still, even with the most recent efforts, dark social's share remains at about a third of external traffic, and we still see close time series correlations with major drivers of traffic such as Facebook and Reddit. It is apparent that we've made strong progress in mitigating dark social traffic on mobile and tablet devices; but as a share of traffic, dark social on mobile is still significantly higher than dark social on desktop. Unfortunately, we can't give up on tracking codes and custom URLs quite yet.

We recently got an interesting question from a client about the connection between engaged time and understanding in news articles. A priori, one may think that there should be a strong correlation (someone quickly skimming through an article should not be expected to retain as much as someone reading carefully), but there are reasons this might not be the case.

Journalists are taught to get to the point quickly in their news articles, using an inverted pyramid style. One of the presumed benefits is that readers can exit the story at any point and still retain the important information. This is the very reason "don't bury the lede" is one of the most common phrases a young journalist hears. Perhaps, then, a reader gleans most of his or her understanding of a story in the first seconds of reading, and the marginal value of spending more time on the rest of the story is relatively small. We decided to put that assertion to the test.

First, let’s you and I conduct an informal experiment right now. I’m going to present you with a few dense bullet points of the findings of the study, and I’d like you to decide for yourself whether you feel like you have anything to gain from reading further.

  • We conducted a survey of over 1000 people to investigate the association between how long a reader is engaged reading a news article and what s/he takes away from it.
  • We confirmed that there is a strong association between fact recall and engaged time.
  • Readers engaged for more than a minute were almost twice as likely to recall specific facts about the article as readers who spent less than 15 seconds. This was true even when the fact was found in the first lines of the article.
  • Further, we found evidence that readers who spend more time engaged are more likely to agree with the author’s conclusions.

Now that you’ve gotten the facts, you can feel free to leave this blog post and spend your valuable time elsewhere, but if you’ll indulge me, I think you’ll find that your understanding may be heightened if you let me go into a little more detail about the experiment and the results.

Experimental Design

In order to test the relationship between reading comprehension and engagement, we carefully considered how best to induce paid participants to act like internet users reading news articles. We've previously established that it's not common for a reader to read all the way through an article; in a typical article, the most likely (modal) behavior is to leave after only about 15 seconds. The simplest survey design would be to instruct readers to read through an article however they like and then ask questions about it afterwards. The problem with structuring an experiment like this is that a paid participant just doesn't act like a typical internet user. We tried this with a quick pilot study, and the average reading time for the article we selected was an order of magnitude higher than what you might expect for a typical reader.

It struck me that one important element this design lacked was choice. When a reader visits your site, s/he is making a choice to spend valuable time reading your content instead of reading a different article, looking at funny cat pictures, or spending time off the internet entirely. In the naive design, we'd effectively purchased time from people to take our survey, so they felt compelled to "do a good job" and carefully read through the article, even though that's not what we wanted them to do at all.

Once we settled on a design that included an element of choice, we got more sensible results. We put together a simple website that showed one of eight different news articles, with a button to flip from one to the next. Participants were instructed to read as much or as little of each article as they liked. After five minutes, the site redirected users to a survey page with five multiple-choice questions about one particular opinion article on Iranian air strikes:

  1. A detail question asking about a fact from the first paragraph of the article.
  2. A detail question about a fact from near the end of the article.
  3. An “attention check” designed to weed out respondents who were not reading the questions.
  4. A conceptual question asking for a summary of the author’s thesis.
  5. An opinion question relating to the author’s message.

We asked 1,000 paid participants on Amazon’s Mechanical Turk to take our survey. Of these, we eliminated about 10% for various reasons (e.g., failing the attention check question, multiple submissions from the same device, not clicking through to the article we asked questions about). When we look at an article, our engagement metrics look reasonable:

[Figure: histogram of engaged time across survey participants]

The graph above shows a peak engagement of about 20 seconds, gradually tapering off as time increases. This reassures us that we are sampling a population that models internet viewership reasonably well.

We can also intuitively understand the pattern of responses.

[Figure: share of respondents answering each question correctly]

The majority of readers could correctly identify a detail from the beginning of the article and summarize the author's thesis, but fewer were able to answer a question about a detail near the end of the article. This can be explained by the fact that relatively few people read that far. In truth, the 37% who answered the end-of-article question correctly is likely as high as it is only because random guessing alone would yield 25%.
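As a back-of-the-envelope correction for guessing (our own rough estimate, not part of the survey analysis), with four answer choices the share who genuinely recalled the end-of-article detail is closer to 16%:

```python
# Standard correction-for-guessing: subtract the chance rate and rescale.
observed, chance = 0.37, 0.25
true_recall = (observed - chance) / (1 - chance)
print(round(true_recall, 2))  # ~0.16
```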

Fact Recall by Engaged Time

Looking deeper, we find a strong association between fact recall and engaged time. Readers spending more than a minute were almost twice as likely to recall specific facts about the article as readers who spent less than 15 seconds (approximately the top and bottom quartiles of engaged time among the responses).

Let's look specifically at conceptual understanding. Roughly 40% of participants engaged for less than 15 seconds correctly assessed the message of the article, compared to more than 80% of those engaged for more than a minute. We've plotted recall against engagement below, with its associated 95% confidence interval. The slope of the logistic regression tells us that for every additional 15 seconds of engagement, we can expect about a 30% increase in the odds of answering the question correctly.
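For the curious, that odds figure can be derived from the fitted coefficient roughly like this, assuming a DataFrame with engaged_time in seconds and correct coded 0/1 (the column names are illustrative, not our actual analysis code):

```python
import numpy as np
import statsmodels.api as sm

def odds_increase_per_15s(df):
    # df: one row per respondent, engaged_time in seconds, correct in {0, 1}.
    X = sm.add_constant(df[["engaged_time"]])
    model = sm.Logit(df["correct"], X).fit(disp=0)
    beta = model.params["engaged_time"]  # change in log-odds per extra second
    return np.exp(15 * beta) - 1         # fractional increase in odds per extra 15 seconds
```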

[Figure: logistic regression of conceptual-understanding recall vs. engaged time, with 95% confidence interval]

The complete results are summarized in the following table. In particular, for each of the questions, we see a positive association between recall and engaged time, even when the relevant information was found at the very beginning of the article.

Respondents Answering Correctly by Engaged Time
Question | Overall | < 15 seconds | > 1 minute | Increase in Odds per 15-Second Increase in Engaged Time
Conceptual Understanding | 62% | 42% | 81% | 32%
Detail from Beginning of Article | 63% | 39% | 81% | 31%
Detail from End of Article | 37% | 27% | 44% | 8%

A matter of opinion

The opinion question was adapted from a June 2014 CBS News/New York Times Poll: “Do you favor or oppose the United States working with Iran in order to try and resolve the situation in Iraq?” In the original poll, 53% were in favor, 39% opposed, and 8% were unsure. The first observation about the results is that respondents were much less likely to express an opinion, but given the survey population this is not surprising. The second observation is that readers of this article, which supported this position, were relatively more likely to agree with the author’s position. In particular, the portion of responses agreeing with the author varied significantly between people who engaged with the article for more than a minute and those who engaged for less than 15 seconds:

[Figures: opinion-question responses for readers engaged more than a minute vs. less than 15 seconds]

I caution that the experiment does not show causation. It's probably the case, for instance, that readers who more strongly agree with the author are more likely to stay engaged with the article. However, it is at least plausible that reading the article helped shape readers' opinions (though we would have to do more tests to find out how much this is the case in general).

It's worth explicitly noting that we effectively ran this experiment for only one article. While we think the questions we've raised and attempted to answer are important, and that the results are useful directional indications, the magnitude of the results will clearly depend on the questions asked and the article itself.

Win Hearts and Minds

We’ve seen before that engaged time affects things like brand recall for advertisements, so the result that engaged time affects reading comprehension is not altogether surprising. What is interesting is the extent to which this is borne out in the results.  This just adds to the growing body of evidence that capturing the attention of your readers gives you the opportunity to win both their hearts and their minds.

For writers, I suppose this conclusion is both a blessing and a curse. Yes, you're still going to have to spend time polishing the second halves of your articles; but if you focus on keeping your readers engaged, they will ultimately take away more from what you're saying. And isn't that the point of effective journalism?

Here at Chartbeat, we have a long history of trying to shed light on the sources of your traffic. Since 2012, we’ve helped illuminate the phenomenon known as dark social—where traffic is likely to come from social sources, yet lacks explicit referrer attribution. Two years later, Internet traffic looks a lot different than it previously did. Mobile and application traffic have grown significantly. More sites are moving to HTTPS. Usage patterns are evolving. We wanted to take the opportunity to look into the current state of dark social and dive deeply not only into potential causes, but also potential disambiguations of this nebulous block of traffic. As a result of these investigations, we found a way to attribute a sizable chunk of dark social (up to half!) to application traffic.

So what exactly is dark social? Here's a brief recap. Back in the Wild West of web analytics, we tagged any traffic coming in without a referrer field as direct traffic. Many people have attempted to exhaustively list what might cause an empty referrer field, but the typical explanation of a visitor typing in the URL directly was unsatisfying for article content. The alternate explanation, that these visitors came from IMs, emails, or apps, seemed much more likely, so we categorized them as social instead. This social traffic came to be known as "dark social" and has made regular appearances at the top of referrer lists ever since.

Dark Social Volume

These days, dark social accounts for about a third of external traffic to sites across our network. The exact amount varies quite a bit depending on the particular site in question, but most sites have a chunk ranging from significant to extremely significant. The following graph shows a rough distribution of the percent of external traffic classified as dark social for a given domain for a sample of Chartbeat’s data, with the mean given in red:

[Figure: distribution of dark social's share of external traffic across domains, with the mean shown in red]

We can break this data out further. The number is markedly higher on mobile, with upwards of 50% of mobile external traffic lacking a referrer on some sites. This is already a critical problem — how are we to analyze our top traffic drivers if we can’t attribute half of our traffic? — and since mobile’s share of traffic is increasing, it’s only going to get worse.

[Figure: dark social's share of external traffic broken out by device type]

Potential Sources of Dark Social

In order to get a handle on the drivers of the problem, we did an empirical analysis of potential sources of dark social: we set up a site, posted links to it on a variety of traffic sources, and clicked those links from a wide range of devices, browsers, and applications. The goal was to determine which traffic sources can reliably be assumed not to be dark social (because they always successfully set the referrer) and which do contribute to dark social (because they always or sometimes lack a referrer).

We were specifically interested in looking at some of the most popular social mobile apps. The following table shows whether some combinations of sites and modes of interaction successfully passed a referrer in our testing (with the caveat that we only tested the current versions of the applications and did not exhaustively test every browser and operating system):

Referrer Passed? | Desktop | Mobile Browser | Mobile App
Facebook | Mostly | Yes | Sometimes
Twitter | Yes | Yes | Yes
Reddit | Yes | Yes | No
Tumblr | Yes | Yes | Yes
Gmail | No | No | No
IM/Text | No | No | No

We can see that major traffic sources are generally good about allowing their data to be tracked. However, there were some interesting exceptions.

  1. Facebook’s desktop site sometimes doesn’t set a referrer if the onclick listener is avoided (for instance, if you open a link in a new tab/window).
  2. Desktop and mobile traffic from Reddit.com sets a referrer, but the top apps for reading Reddit all do not set a referrer.
  3. It became clear in further analysis (see below) that the Facebook app only sometimes sets the referrer.

Beyond these notes, things generally worked as expected: email, IM, and most mobile apps were dark social; social networks and major sources of external traffic (even some using HTTPS, like Facebook and Google) were not.

Disambiguating Traffic with Time Series

The above findings raised more questions than they settled. If less well measured sources like email and IM drive a significant portion of traffic, do they at least correlate well with more explicitly measurable sources of traffic? For applications like Facebook and Reddit that do not always send referrer data, is there a way to identify their contributions within patterns of dark social traffic? We found that in many cases, the answers to these questions were a resounding yes.

For the next phase of our analysis, we wanted to take a look at the time series data for specific articles to try to identify patterns in the traffic. If a popular story were to break, you’d expect to see different responses in different traffic sources. For a site like Reddit, you might expect traffic to be tightly peaked and highly correlated with the story’s ranking on the home page. For a site like Facebook, the interest might fade out more gradually as it filters through different people’s feeds. You might expect instant messaging to yield a tighter, shorter-tailed traffic distribution than a medium like email. The following plot shows an interesting example of a story that illustrates some of these features. There was a distinct spike in Reddit-driven traffic that lasted all of four hours followed by a more prolonged pickup in Facebook traffic.

[Figure: per-referrer traffic time series for a single story, showing a sharp Reddit spike followed by a more prolonged Facebook pickup]

The most interesting observation here is how well correlated dark social traffic is with the identifiable sources. In this example, you could be convinced that the dark social is really just misattributed traffic from Facebook and Reddit. Some evidence for this:

  • The residual traffic is almost non-existent, and in particular, the amount of internal and search traffic is negligible.
  • If a secondary social sharing mechanism like email or IM were driving a significant amount of traffic, we’d expect the dark social time series to lag behind the sharp spike in attributed traffic.

We can further break these numbers down by examining the difference between mobile and desktop traffic. In the following graph, we zoom in on the Reddit spike in traffic for the article above.

[Figure: the Reddit traffic spike broken out by desktop vs. mobile]

We can see a stark divergence in traffic patterns by device, which confirms some of our earlier findings. We have at least strong anecdotal evidence that a large portion of Reddit mobile traffic comes from apps whose traffic is categorized as dark social.

We can examine the patterns of traffic for the Facebook-driven portion of the time series as well:

[Figure: the Facebook-driven portion of the traffic broken out by desktop vs. mobile]

Here, the picture is not quite as cut-and-dried as before. Dark social comprises only a small percentage of overall desktop traffic but commands a fairly significant chunk of mobile traffic. Across other articles, this pattern is typical: when we observe Facebook traffic, we can almost always find a corresponding amount of dark social traffic. The actual amount of dark social traffic relative to Facebook traffic varies significantly by article and by site, but it will generally be much higher on mobile devices. Since Facebook is such a large driver of mobile traffic in general, this helps explain some of the difference we see between desktop and mobile dark social share.

Of course, it's difficult to disambiguate where dark social is coming from at scale — it's a mix of traffic from many referrers. But for a large majority of stories, if we look at the top 10 referrers and correlate the time series of the traffic they send with dark social's time series, some referrer turns out to be a very close match, which strongly suggests that the story is getting its dark social from that particular referrer.
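In code, that matching step might look something like the following sketch, assuming per-referrer traffic for a single story has been binned into a pandas DataFrame with one column per referrer (including a "dark social" column); the names are illustrative:

```python
import pandas as pd

def best_dark_social_match(traffic: pd.DataFrame, top_n: int = 10) -> pd.Series:
    """traffic: rows are time bins, columns are referrers, including 'dark social'."""
    dark = traffic["dark social"]
    candidates = traffic.drop(columns=["dark social"])
    # Keep the story's biggest referrers by total volume...
    top = candidates.sum().nlargest(top_n).index
    # ...and correlate each of their time series with the dark social time series.
    return candidates[top].corrwith(dark).sort_values(ascending=False)
```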

This suggests that, while we can't just flip a switch and disambiguate all traffic, a careful analysis of a particular story is likely to turn up the source of the majority of its dark social. Of course, this won't always work: there are still person-to-person shares (IM, email, etc.), shares on apps with no corresponding website, and so forth that account for a chunk of dark social. Still, if we look at correlations between dark social traffic and other traffic sources (a rudimentary and blunt tool, to be sure), we see that fewer than 25% of stories have dark social time series with less than 80% correlation, and many are much more highly correlated.

Disambiguating App Traffic Using User Agents

In this analysis, we discovered that many major apps set a string in the user agent that can be used to identify the app, even in the case that the app doesn’t set a referrer. Facebook, Buzzfeed, Twitter, QQ, Baidu, and others all do this. By looking at this user agent string and using it to identify the referrer, we’re able to disambiguate a non-trivial portion of dark social traffic and correctly attribute it to specific mobile apps. We recently implemented this change, and if you happened to be looking closely at your dashboard around 6pm last night, you might have seen your m.facebook.com traffic jump up by 40% and your dark social fall by 5-10% when we flipped the switch. While this is only a small piece of the overall dark social share, it is a clear step in the right direction. As more apps take similar measures, this approach has the potential to help reverse the growth of the dark social problem.

Going Forward

As we get more data from the User Agent change, it will be interesting to see how much of the relationship between dark social and some of the major applications remains. Will the relationship between dark social and Facebook mobile traffic disappear? Well, probably not, because there will still be people who see a link on Facebook and then share it through text or email or other means.

Still, the general approach of looking into your articles’ traffic patterns is quite fruitful — you’re likely to be able to identify the source of dark social for specific stories if you choose to dive in (feel free to reach out if you’d like advice on how to do it using our historical APIs).

As always, we’ll be keeping a keen eye on dark social. Please feel free to reach out to me with any questions, specifically about your traffic or generally about dark social, at christopher@chartbeat.com.