Today Incapsula released its annual bot traffic report for 2014. In it, we point out a number of trends about bots—the Internet’s automated inhabitants—including:

  • Bot traffic directed to smaller websites has surpassed 80%
  • The slow demise of RSS is driving down bot traffic
  • Impersonator bots are on the rise
  • Bad bots threaten small and big websites alike

Based on this research, it appears that bot miscreants pose a ubiquitous threat. No matter what size or type of web presence you oversee, it’s going to be visited by bots—frequently. In this post we’ll show you how the Incapsula solution (including our free plan) blocks bad bots.

Hands-off Client Classification

Dealing with bad bots using Incapsula couldn’t be simpler. We’ve gone to painstaking lengths to develop a completely automated system capable of identifying, classifying, and blocking malicious bots with no manual intervention. That is not to say we’ve implemented an iron-fisted, one-size-fits-all approach to dealing with bots. Quite the contrary: Incapsula is designed to be a no-touch, low false-positive solution, and the key is our client classification engine.

Conceptually, the Incapsula client classification system can be thought of as concentric rings, or sequential layers of analysis, that determine whether a website visitor is human and, if not, what its intentions are.

Here’s a more detailed look at the process Incapsula uses to identify and classify bots for you:

Incapsula - Client Classification

Step 1: Looking at Header Data

By inspecting HTTP headers, Incapsula gains valuable insight into visitors, including various clues as to whether each is human or automated, and whether or not it is malicious. It’s important to note that headers can be faked, so they should never be the sole criterion for a blocking decision. Instead, this information should be combined with additional criteria to make a more informed determination.
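
To make the idea concrete, here is a minimal, purely illustrative sketch (not Incapsula’s actual engine) of pulling bot-related clues out of request headers. The token list and signal names are invented for the example, and, as noted above, any of these values can be spoofed.

```python
# Illustrative only: score a request's headers for bot-like clues.
# Header values are easily faked, so these hints are combined with
# other signals rather than used alone for a blocking decision.

KNOWN_BOT_TOKENS = ("googlebot", "bingbot", "pingdom", "curl", "python-requests")

def header_signals(headers: dict) -> dict:
    """Return coarse, spoofable hints extracted from HTTP headers."""
    ua = headers.get("User-Agent", "").lower()
    return {
        "claims_known_bot": any(token in ua for token in KNOWN_BOT_TOKENS),
        "missing_user_agent": not ua,
        # Real browsers normally send Accept-Language; many simple scripts do not.
        "missing_accept_language": "Accept-Language" not in headers,
    }

signals = header_signals({"User-Agent": "Googlebot/2.1 (+http://www.google.com/bot.html)"})
print(signals)  # claims_known_bot: True, missing_accept_language: True
```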

Step 2: IP and ASN Verification

The IP and ASN verification process is next on our checklist. Here we check who owns the visitor’s IP address and ASN, and whether that ownership matches the identity the visitor claims. This can be used to identify malicious bots posing as legitimate ones.

For example, if a bot claims to belong to a search engine like Google, but neither its IPs nor its ASN match that company, it’s a telltale sign that it’s likely a dangerous impostor.
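
A common way to perform this kind of check, sketched below under the assumption of a visitor claiming to be Googlebot, is a reverse DNS lookup followed by a forward-confirming lookup. This stdlib-only example omits the ASN ownership check Incapsula also performs.

```python
import socket

def is_verified_googlebot(ip: str) -> bool:
    """Reverse-resolve the IP, then forward-confirm the hostname."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)            # reverse DNS
        if not hostname.endswith((".googlebot.com", ".google.com")):
            return False
        forward_ips = socket.gethostbyname_ex(hostname)[2]   # forward confirmation
        return ip in forward_ips
    except (socket.herror, socket.gaierror):
        return False

# Should print True for a genuine Googlebot crawler IP, and False for an
# impostor sending a Googlebot user-agent from an unrelated network.
print(is_verified_googlebot("66.249.66.1"))
```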

Step 3: Behavior Monitoring

Additional useful information can be garnered from visitors’ behavior and their requests. During analysis by our web application firewall, suspicious and malicious requests are flagged or blocked, and this information is fed into our classification engine. Incapsula also tracks indicators of automation, such as the order and rate of requests, irregular browsing patterns, and abnormal interaction between clients and servers.
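
One of the simpler behavioral signals, request rate, can be illustrated with a sliding-window check like the sketch below. The window size and threshold are arbitrary example values, not Incapsula’s actual tuning.

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 10
MAX_REQUESTS_PER_WINDOW = 50          # well above normal human browsing speed

recent_requests = defaultdict(deque)  # client IP -> timestamps of recent requests

def looks_automated(client_ip, now=None):
    """Flag clients whose request rate exceeds a human-plausible ceiling."""
    now = now if now is not None else time.time()
    window = recent_requests[client_ip]
    window.append(now)
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()              # drop timestamps outside the window
    return len(window) > MAX_REQUESTS_PER_WINDOW
```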

Step 4: IP Reputation

IP reputation is another powerful tool Incapsula uses to quickly filter out bad bots. Thanks to our global network footprint and the number of customers we protect, Incapsula is uniquely positioned to perform large-scale analysis of automated clients. Once we’ve identified a bad bot, a signature is created for it, and all traffic across our network is then screened against that signature. This type of crowdsourcing lets disparate websites across the entire Incapsula community actively participate in their own security, thereby benefiting the whole.
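
Conceptually, the crowdsourcing works like the sketch below: a detection made on any protected site adds a signature to a shared store, and every other site’s traffic is screened against it. The in-memory set and signature format are stand-ins for Incapsula’s network-wide reputation system, not its real data model.

```python
bad_bot_signatures = set()           # stand-in for the shared reputation store

def signature(client_ip, user_agent):
    return f"{client_ip}|{user_agent}"

def report_bad_bot(client_ip, user_agent):
    """Called when any protected site identifies a malicious client."""
    bad_bot_signatures.add(signature(client_ip, user_agent))

def is_known_bad(client_ip, user_agent):
    """Every site benefits from detections made anywhere on the network."""
    return signature(client_ip, user_agent) in bad_bot_signatures

report_bad_bot("198.51.100.7", "EvilScraper/1.0")
print(is_known_bad("198.51.100.7", "EvilScraper/1.0"))   # True
```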

Step 5: Client Technology Fingerprinting

Not all malicious bots are naïve. For example, last year Incapsula reported a DDoS attack initiated by browser-based bots using legitimate user-agents and correct header data. They even went so far as to mimic human-like behavior.

To thwart such attacks—which are becoming more common—our algorithms are augmented with additional security features. These empower them to dig deeper, looking at attributes such as a JavaScript footprint and cookie/protocol support.
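
One such tactic can be sketched as follows: serve a small JavaScript challenge that writes a server-issued token into a cookie, then check that the token comes back. A client that claims to be a browser but never completes the round trip lacks the JavaScript and cookie support a real browser would have. The token scheme and names here are invented for illustration, not Incapsula’s implementation.

```python
import hashlib
import secrets

def make_challenge(session_id: str, server_secret: str):
    """Return the JS snippet to embed and the token the server expects back."""
    nonce = secrets.token_hex(8)
    expected = hashlib.sha256(f"{session_id}:{nonce}:{server_secret}".encode()).hexdigest()
    js_snippet = f'document.cookie = "js_check={expected}; path=/";'
    return js_snippet, expected

def passed_challenge(cookies: dict, expected: str) -> bool:
    """True only if the client executed the JavaScript and stored the cookie."""
    return cookies.get("js_check") == expected
```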

Assessing the Automated Threat You Face

As stated earlier, Incapsula protects users from bad bots by default. However, if you want to “pop the hood” on our client classification engine and get über familiar with the bot traffic on your website, here’s how to become a bot-killing pro.

The first step to using Incapsula to mitigate bad bots visiting your site is to understand your traffic. How much of it is automated? How much is malicious? Where is it coming from?

To answer these questions, we suggest starting at the traffic dashboard in Incapsula’s user interface (Figure 1). Just below the fold you’ll find aggregated visitor statistics comparing humans with automated clients (bots) that have accessed your website. Not all reported bots are malicious; good ones include Google, Pingdom, Bing and others.

Banishing Bad Bots - Figure 1

Figure 1

According to our newly released 2014 bot report, human visitors account for 44% of traffic on average, versus 56% for bots. You’ll likely see similar numbers reflected in your own analytics.

Taking a Deep Dive Into Your Bot Traffic

Banishing Bad Bots - Figure 2

Figure 2

To learn which of the reported bots are suspicious or malicious, click the Security tab (Figure 2). Here you’ll be able to see how many bad/suspicious bots Incapsula has blocked.

Banishing Bad Bots - Figure 3

Figure 3

Scrolling down further, you’ll see the top bot agents that visit your website.

Banishing Bad Bots - Figure 4

Figure 4

Now that you have an overview of your bot visitors, let’s examine the details and then take action to protect your website. Within the Threats breakdown table, click View Incidents at the far right of the Bot Access Control row. Here you can access per-incident event logs and single out specific actions taken by the bad bots interacting with your site.

Banishing Bad Bots - Figure 5

Figure 5

This next screen (Figure 5) displays your website security logs. It already has an active filter that shows only bot-related events.

Banishing Bad Bots - Figure 6

Figure 6

To focus on a specific bot type, select it from the Visitor Type choices (Figure 6 and Figure 5, highlighted). For example, selecting the Comment Spam Bot filter displays only incident results for this bot type.

Banishing Bad Bots - Figure 7

Figure 7

Once you identify a bot you want to eradicate, options within the Actions dropdown menu (Figure 7, highlighted) let you blacklist its IP or block the bot entirely by selecting Add to Bad Bots.

Banishing Bad Bots - Figure 8

Figure 8

Dealing With Unwanted Visitors

To deal with bots more generally, using site-wide rules, you’ll want to use Incapsula’s bot access control feature. It is located within the Settings menu (Figure 1): navigate to Settings > Security > Bot Access Control.

Banishing Bad Bots - Figure 9

Figure 9

Your account should reflect the default settings shown in Figure 9. This configuration automatically blocks bad bots while maintaining access for good ones. You can adjust either setting, of course. To restrict the list of good bots that can access your site, click the Good Bots link (upper-right, Figure 8) and deselect the ones you want to exclude (Figure 9).

Banishing Bad Bots - Figure 10

Figure 10

You can also identify specific bots as being bad, thereby disallowing access to your website. To do this, click the Also block link (upper-right, Figure 8) and type in a bot you would like Incapsula to blacklist (Figure 10). If it’s in our bot database, then you can simply select it and we’ll make sure it never bothers you again.

Banishing Bad Bots - Figure 11

Figure 11

If you want to tighten the screws on potentially automated patrons, you can engage CAPTCHAs for suspected bots. In this mode, if a visitor appears to be a bot but cannot be positively identified as one by Incapsula’s client classification engine, we challenge it with a CAPTCHA test like the one in Figure 11.

Banishing Bad Bots - Figure 12

Figure 12

Check the Require all other Suspected Bots to pass a CAPTCHA test box (Figure 12) to enable this feature.

Banishing Bad Bots - Figure 13

Figure 13

If you want to ensure a bot does have access to your website, you can create a whitelist rule for it by clicking the Add exception link (lower-right, Figure 12). Here you’re able to create a custom rule based on a part of your website (URL), Client app ID, IP address, Country, or User agent (Figure 13).

Between Incapsula’s out-of-the-box bot mitigation and these additional tools, you’re fully empowered to weed out pesky robotic troublemakers from your web traffic. All of these features are included in every Incapsula plan, including our free plan, to help you better protect your websites.

To learn more about this year’s trends in Internet bot activity, check out the Incapsula 2014 Bot Traffic Report.

Would you like to write for our blog? We welcome stories from our readers, customers and partners. Please send us your ideas: blog@incapsula.com