‘Semalt’ is a dubious SEO tool whose unscrupulous behavior already caused concern to many website owners. Playing a lead role in what appears to be a large-scale referrer spam campaign, Semalt is often accused of ignoring ‘robots.txt’ directives and overbearing servers with a slew of suspicious-looking requests.

In this post, we shed light on Semalt’s suspicious activity and explain our rationale for preventing this bot from accessing any Incapsula-protected websites, unless manually permitted to do so by webmasters.




Referrer SPAM 101

Referrer spam occupies a niche within the spamming ecosystem. Somewhere between Facebook clickjacking campaigns and 419 scams, it is a scheme to improve search engine rankings by exploiting the bad practices of unsuspecting webmasters.

The perpetrators’ goal here is to create backlinks to a certain URL by abusing publicly-available access logs.

Their first step is to locate vulnerable websites by using scanner bots, which typically serve a double function: crawling the Web to locate a vulnerable target and executing the attack itself once a vulnerability is found.

In the case of referrer spam, such scanner bots access thousands of websites in bulk, sending out requests with a fabricated ‘Referer’ header (the HTTP header name is spelled with a single ‘r’) that holds the URL of a website they are trying to boost.

As with all other types of incoming traffic, all such requests are automatically recorded in the website’s access log. In this case, because the log is presented on a publicly available webpage, the spam bot’s visit is recorded as an HTML link, using the information from the fake ‘Referer’ header.

Later on, when these links are crawled by search engines, they have the potential to improve the spammer’s SEO rankings.
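To make the mechanism concrete, here is a minimal sketch of how a webmaster might spot such entries in their own access log. It assumes the server writes the common "combined" log format; the sample lines, IP addresses and the ‘spam.example’ URL are made up for illustration, and the function name is ours.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Combined Log Format (assumed):
# ip ident user [time] "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ '
    r'"(?P<referer>[^"]*)" "[^"]*"$'
)

def referer_hosts(lines):
    """Count how often each referring host appears in the given log lines."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip lines that are not in the expected format
        host = urlparse(match.group("referer")).hostname
        if host:  # "-" (no referer) has no hostname and is ignored
            counts[host] += 1
    return counts

# Made-up sample lines; addresses and URLs are placeholders.
SAMPLE_LOG = [
    '203.0.113.7 - - [01/Jan/2014:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "http://spam.example/boost" "Mozilla/5.0"',
    '198.51.100.9 - - [01/Jan/2014:10:00:05 +0000] "GET /about HTTP/1.1" 200 734 "http://spam.example/boost" "Mozilla/5.0"',
    '192.0.2.44 - - [01/Jan/2014:10:00:09 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]

print(referer_hosts(SAMPLE_LOG).most_common())
```

A referring host that shows up across many unrelated client IPs, as ‘spam.example’ does in the sample, is the kind of pattern described above.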

Semalt's MO - Referrer Spam Campaign

Granted, such spam activity does not constitute a security threat per se. Having said that, the existence of such ‘SEO leeches’ can cause long-term SEO damage to their prey, ranging from demotion in search engine result pages (SERP) to complete SERP blacklisting.

Enter Semalt

A few months ago Incapsula saw the first indications of a large-scale referrer spam campaign. The focal point of this spam activity was a service named Semalt whose bots were employing referrer spam techniques on an impressive scale and were aggressive enough to draw our (and our clients’) attention.

Semalt links in access log


On its website, Semalt describes its service as ‘a professional webmaster analytics tool’. However, a Google search for “Semalt” yields mostly negative comments, directed toward the company and its services.

The comments originate from numerous users, many of whom complain about Semalt ignoring ‘robots.txt’ directives. We’ve also seen hundreds of people taking to Twitter to call out Semalt for its questionable tactics and to speculate about the company’s actual activity.

First rule of Spam

Probably the most antagonizing behavior of all is Semalt’s claim that you can complete an online form to remove your website from its crawling list. Instead of stopping the flood of unwanted requests, submitting the removal form seems to result in an increase in Semalt bot traffic.

One Soundfrost to Spam them All

It should be noted that Semalt is not your typical bot.

Our analysis shows that the company uses a QtWebKit browser engine to avoid detection. Consequently, Semalt bots can execute JavaScript and hold cookies, thereby enabling them to avoid common bot filtering methods (e.g., asking a bot to parse JavaScript). Because of their ability to execute JavaScript, these bots also appear in Google Analytics reports as ‘human’ traffic.

Recently, substantial evidence revealed that Semalt isn’t running a regular crawler. Instead, to generate bot traffic, the company appears to be using a botnet spread by malware hidden in a utility called Soundfrost.

Our data shows that, using this malware-infested utility, Semalt has already infected hundreds of thousands of computers to create a large botnet. This botnet has been incorporated in Semalt’s referrer spam campaign and, quite possibly, several other malicious activities.

Soundfrost botnet - 290,000 unique IPs recorded over the last 30 days


To put things in numbers, during the last 30 days we saw Semalt bots attempting to access over 32% of all websites on our service, with spamming attempts originating from over 290,000 different IP addresses around the globe.

Soundfrost botnet - geo-distribution

As evidenced by the IP distribution data above, Semalt’s botnet is quite widespread, with most of the affected IPs located in South America.

Beyond providing Semalt with the scale it needs to operate, this botnet also helps Semalt’s bots avoid rudimentary security practices, such as IP blacklisting and rate-limiting. Coupled with its ability to overcome challenge-based detection mechanisms, this makes Semalt’s shady activity that much more concerning.
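A small sketch shows why per-IP rate limiting struggles against a distributed source. The limit of 100 requests per window, the request counts and the synthetic addresses are all assumptions chosen for illustration; they are not figures from our data.

```python
from collections import Counter

PER_IP_LIMIT = 100  # hypothetical: max requests allowed per IP in one time window

def ips_over_limit(request_ips, limit=PER_IP_LIMIT):
    """Return the set of IPs whose request count exceeds the per-IP limit."""
    counts = Counter(request_ips)
    return {ip for ip, n in counts.items() if n > limit}

# One noisy client: 5,000 requests from a single address.
single_source = ["198.51.100.1"] * 5000

# A distributed source: the same 5,000 requests spread over 1,000 synthetic
# addresses, five requests each.
distributed = [f"10.0.{i // 250}.{i % 250}" for i in range(1000) for _ in range(5)]

print(len(ips_over_limit(single_source)))  # the single client is flagged
print(len(ips_over_limit(distributed)))    # no address crosses the limit
```

The total volume is identical in both cases, but in the second one every individual address stays far below the threshold, so a per-IP counter has nothing to flag.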

Blocked by Default

With the record of Semalt’s spam activity in hand, and with numerous requests to block the service coming from our clients, we added Semalt to our ‘Bad Bots’ rules baseline, blocking it by default for all Incapsula accounts.
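For webmasters who run their own Python application, a referrer-based filter is one simple way to approximate such a rule. The sketch below is not how Incapsula's rule works; it is a minimal WSGI middleware, and the blocklist entry, function names and sample URLs are assumptions made for illustration.

```python
from urllib.parse import urlparse

# Hypothetical blocklist; the domain below is an assumption for illustration.
BLOCKED_REFERER_DOMAINS = {"semalt.com"}

def is_blocked_referer(referer, blocked=BLOCKED_REFERER_DOMAINS):
    """True if the Referer's host is a blocked domain or one of its subdomains."""
    host = urlparse(referer or "").hostname
    if not host:
        return False
    return any(host == d or host.endswith("." + d) for d in blocked)

def block_referer_spam(app):
    """Wrap a WSGI app and answer 403 to requests carrying a blocked Referer."""
    def middleware(environ, start_response):
        if is_blocked_referer(environ.get("HTTP_REFERER")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden"]
        return app(environ, start_response)
    return middleware

print(is_blocked_referer("http://semalt.semalt.com/crawler.php"))
print(is_blocked_referer("http://example.org/"))
```

A filter like this only looks at the Referer header, so it stops working as soon as the spammer changes the advertised domain.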

We hope that the combined efforts of the Internet community will help put an end to Semalt’s illicit activity and help dissuade other services from using similar tactics in their business practices.

