As everyone reading this message knows, spam (junk email) is an ongoing problem on the Internet. It has continued to increase on the Internet at large, and the servers at my own organization receive more on average every week. This year (2006) in particular, Internet security companies monitoring the Internet’s email traffic detected a massive spike in spam volume.
In addition to being more voluminous, these spam messages are becoming more dangerous as well. Phishing, or the attempt to fraudulently obtain personal information from Internet users, is becoming more prevalent than ever before and attacks have become focused and targeted – often for obtaining IDs, passwords, credit card numbers, and other data used to impersonate or defraud victims. Additionally, spyware / adware continue to use email as a "vector of attack" or as a method to coax unsuspecting computer users to malicious websites…
Updated and Historical Statistics:
In Feb. 2005, my company was receiving approximately 40,000 email messages per day from the Internet.
By Oct. 2006, this number had increased to nearly 60,000 messages per day, close to a 40% increase in ~18 months. Our blocking performance in Feb. 2005 was 87% of all email, which allowed an average of 5,000 messages (spam and not spam) into the internal email system per day. Not a bad blocking rate, but still extremely disruptive on a per-user basis. Our latest blocking performance as of Oct. 2006 is over 98%, allowing an average of only 950 messages into our internal systems per day (detail below). To put that into perspective, our company has around 810 email users in the US at this time, so 950 emails equates to a little more than 1 message per user per day allowed in – legitimate and spam combined. In other words, as suspected, the effectiveness of the anti-spam services employed internally has improved over time.
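For anyone who wants to check the arithmetic, here is a minimal Python sketch of the figures above. The function names are my own for illustration. The Oct. 2006 daily total is an assumption: the text says "nearly 60,000", so the sketch tries both 55,000 and 60,000 to show the blocking rate stays above 98% either way.

```python
def allowed_per_day(total_per_day, block_rate):
    """Messages that get past the filter each day."""
    return total_per_day * (1.0 - block_rate)


def implied_block_rate(total_per_day, allowed):
    """Fraction of mail blocked, given how many messages got through."""
    return 1.0 - allowed / total_per_day


def per_user(allowed, users):
    """Average messages delivered per user per day."""
    return allowed / users


# Feb. 2005: 40,000 messages per day at an 87% blocking rate.
# This works out to about 5,200, which the post rounds to 5,000.
feb_2005_allowed = allowed_per_day(40_000, 0.87)

# Oct. 2006: 950 messages allowed in; daily totals are assumed values.
oct_2006_rate_low = implied_block_rate(55_000, 950)
oct_2006_rate_high = implied_block_rate(60_000, 950)

# 950 delivered messages spread across roughly 810 users.
oct_2006_per_user = per_user(950, 810)

print(f"Feb. 2005 allowed per day: {feb_2005_allowed:.0f}")
print(f"Oct. 2006 block rate: {oct_2006_rate_low:.1%} to {oct_2006_rate_high:.1%}")
print(f"Oct. 2006 messages per user per day: {oct_2006_per_user:.2f}")
```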
Massive up-tick in spam – 2006
In or around June 2006, a large and sustained spike in spam email distribution was detected by spam tracking and blocking companies. This event is likely due to the growing use of malicious networks called botnets to distribute spam email (see Sidebar at bottom of this post for botnet info). One of the most common uses of botnets is to distribute spam using the ensnared computers as relays for bulk email. This creates detection problems for us, since spam suddenly (not gradually) originates from many thousands of different sources at once, which doesn’t give the artificial intelligence models time to react, detect, and efficiently block new spam messages. Additionally, the malicious software used to create botnets is usually installed through known and previously unknown vulnerabilities in Microsoft’s Internet Explorer browser and Windows operating systems. Sometimes no user interaction is even required to become victimized, which is why it is so important to keep up with Microsoft’s Windows and Office updates as they are released.
The charts and numbers
Below is a graph of the June 2006 event – notice that from June onward the volume of new spam messages doubled, tripled, and more, and that rate has more or less been sustained through this week. Red represents total spam, blue represents known blocked hosts, and gold represents new spam events. The y-axis is a scaled representation of volume and the x-axis represents dates (in weeks) in mm-dd-yy format.
Note: graph modified from original – Thanks to TQM3 for the continued research and service to the community
Anti-spam vendor Postini reports that nearly 80% of all email on the Internet is from known compromised systems hosting spam, while a more in-depth content analysis by multiple vendors has shown that less than 4% of all Internet email is legitimate email. This means that in practice, more than 96% of all email is junk mail of some kind, which is staggering. While this may seem dubious at first, note the following per-day averages culled from my company’s own anti-spam server logs:
Average per-day incoming Internet mail stats: Note that we block outright 97.5% of all email received, and when combined with quarantined mail, this increases to 98.3%.
This percentage means the mail that reaches my users’ inboxes without any user action represents, on average, less than 2% of all the email that senders attempt to deliver. If we were to eliminate spam filtering, users could expect 50-70 more spam messages per user per day on average. The result would be email "lost in the shuffle," productivity loss, a massive increase in resource utilization on email servers, and a lot of angry internal customers ;-)
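As a rough check on the 50-70 figure, here is a small Python sketch. It assumes a daily volume of 55,000 messages (based on the "55,000+" mentioned elsewhere in this post), the 98.3% combined block-plus-quarantine rate, roughly 810 users, and that everything currently blocked or quarantined is spam. The function and constant names are hypothetical.

```python
DAILY_TOTAL = 55_000         # assumed average incoming messages per day
COMBINED_BLOCK_RATE = 0.983  # blocked outright plus quarantined
USERS = 810                  # approximate number of US email users


def extra_spam_per_user(daily_total, block_rate, users):
    """Extra messages per user per day if nothing were filtered.

    Treats every currently blocked or quarantined message as spam,
    which is a simplifying assumption.
    """
    blocked_per_day = daily_total * block_rate
    return blocked_per_day / users


estimate = extra_spam_per_user(DAILY_TOTAL, COMBINED_BLOCK_RATE, USERS)
print(f"Extra spam per user per day without filtering: {estimate:.0f}")
```

Under those assumptions the estimate lands inside the 50-70 range quoted above.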
Another interesting stat above – on average, we only receive 6 virus-infected email messages per day out of 55,000+… a rounding error in raw number terms. This points to the larger trend that email is no longer the main conduit for spreading viruses, as was once the case; rather, it is being used to make money from spamming, phishing, identity theft, and other forms of organized crime.
Note: The caveat to this statistic is that our anti-spam server drops traffic from known spammer IP addresses and subnets, prior to the virus scanner analyzing the message. It could be that there are many more virus-infected email messages being dropped before virus analysis if those virus-infected messages come from known spammer IPs.
The massive up-tick in spam generated and sustained since June 2006 has created a "law of big numbers" problem: even at a higher blocking percentage, the sheer volume means a larger raw number of spam messages gets through than did before June 2006. Botnets are the primary cause of this effect and are the single biggest threat on the Internet at large today.
Sidebar: For those not familiar with the term, botnets are the result of a coordinated installation of a certain type of malicious software designed to allow surreptitious, central control of many computers. After gaining control of these computers (sometimes numbering in the tens or hundreds of thousands), hackers can use them to perform coordinated attacks against other systems and to gather and amalgamate information on large numbers of people (for identity fraud, etc.). Botnets are largely used in organized cybercrime today. The client computers that have this software installed are called "bots" or "zombies" since control of their operation has been seized by the hacker and they are no longer autonomous. The "net" part comes from the fact that they operate as a distributed network of computing resources – thus, botnet.
. . . . . .. . . . . .
"24-hour banking; I don’t have time for that"