What spam is and how to deal with spammers on the Internet. An overview of the Kaspersky Anti-Spam system. What anti-spam methods exist

Dear friends and users of our site, SpaceWolf is with you again, and today we will talk about the pressing problem of spam. The solution described here will help you get rid of spam sent through a feedback form, spam comments, and spam orders in an online store.

First, the pros and cons of this method.

Pros:

  1. Works well against bots.
  2. Quick to install in the message sending form.
  3. Minimal code (3 lines).
  4. Requires no special knowledge, apart from knowing where the main files are located.

Cons:

  1. Users who have JavaScript disabled will not pass the check and therefore cannot send a message.

That's basically everything. Let's start the installation:

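1) Add an additional hidden field to your form (a comment form, feedback form, or product order form) with name="check" and id="check", and leave its value empty (value=""). Example:

<input type="hidden" name="check" id="check" value="">

2) In the same form, add a handler to the submit button (“Send”, “Write”, “Leave a review” or whatever you call it) that writes the word “stopSpam” into the hidden field when the button is clicked. For example:

onclick="document.getElementById('check').value = 'stopSpam';"

3) In the script that processes the form, add the following check:

if (!isset($_POST["check"]) || $_POST["check"] !== "stopSpam") exit("Spam detected");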

Anti-spam protection - how it works

The principle is as simple as the code itself. It relies on the fact that spam bots generally do not run JavaScript. When a regular user clicks the “order” button, the word “stopSpam” is written into our hidden field; in the case of a robot, the field stays empty. Why does it stay empty? The robot fills in all the fields except our hidden field with id=”check”, so the "check" variable remains blank and the mail is not sent. When a real user clicks the button, the JavaScript we attached to the button runs and fills in the field, so the check passes.
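The server-side half of this check can be sketched in a few lines. This is only an illustration, written in Python rather than PHP; the function name passes_js_check and the dict standing in for the submitted form fields are my assumptions, not part of the original script.

```python
EXPECTED_TOKEN = "stopSpam"  # the word the button's JavaScript writes into the hidden field


def passes_js_check(form: dict) -> bool:
    """Return True only if the hidden "check" field holds the expected word.

    `form` stands in for the submitted POST fields (hypothetical name).
    A bot that did not run the button's JavaScript sends the field
    empty or not at all, so the comparison fails.
    """
    return form.get("check", "") == EXPECTED_TOKEN
```

A handler would call passes_js_check first and stop processing when it returns False, just as the PHP line calls exit().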

I advise using this method together with a CAPTCHA; the effect will be better.

Well, that's all. If the article helped you, write comments, repost and don’t forget to say “Thank you” in the comments.

If anyone has other problems or questions, leave them in the comments, we will be happy to find a solution together. We are waiting for your messages!

Kaspersky Personal Security Suite is a new Kaspersky Lab product designed for comprehensive protection of your home computer. The program provides reliable protection against viruses, hackers and spam at the same time. The Kaspersky Anti-Spam module is one element of this home computer protection system. First of all, note that Kaspersky Anti-Spam is not an independent product and does not work separately from Kaspersky Personal Security Suite. To some extent this is a disadvantage, since users cannot use Kaspersky Anti-Spam on its own, but comprehensive protection also has undoubted advantages.

Antivirus protection and firewall have been reviewed more than once on the pages of our publication. Therefore, in this article we will look exclusively at the operation of the antispam module.

Kaspersky Anti-Spam is built on the intelligent SpamTest technology, which provides: fuzzy comparison (that is, one that triggers even on an incomplete match) of the message being checked with samples - messages previously identified as spam; identification of phrases characteristic of spam in the message text; detection of images previously used in spam emails. In addition to the criteria listed above, formal parameters are also used to identify spam, including:

  • "black" and "white" lists that the user can maintain;
  • various features of mail message headers that are characteristic of spam, for example signs that the sender's address has been forged;
  • techniques used by spammers to deceive mail filters - random sequences, replacing and doubling letters, white-on-white text, and others;
  • checking not only the text of the message itself, but also attached files in plain text, HTML, MS Word, RTF and other formats.

Installation of the antispam module

The module is installed together with Kaspersky Personal Security Suite. When choosing installation options, users whose email client is not a Microsoft program can skip installing the module for Microsoft Outlook.

It should be noted that Kaspersky Anti-Spam scans any correspondence received via the SMTP mail protocol. Thanks to this, it can filter out spam in any email program, but more on that below.

Integration into Microsoft Outlook Express

The program does not have its own interface as such. In Microsoft Outlook Express, the Kaspersky Anti-Spam module is integrated as a menu and as an additional panel.

There is some inconvenience in using this panel, although it has nothing to do with the antispam module itself. Because of the way Microsoft Outlook Express works, the Kaspersky Anti-Spam panel cannot be docked in a place that is convenient for the user. Each time you start the program, the panel will appear third. You will have to keep moving it to a convenient place or come to terms with this state of affairs.

Program operation

When mail arrives, Kaspersky Anti-Spam analyzes the incoming correspondence. If spam is detected, the message is marked with a special label [!! SPAM] in the Subject field and placed in the Deleted Items folder. Messages recognized as not spam are not marked in any way and are processed by the mail program according to the established rules. If the program is not sure that a message is spam, the [?? Probable Spam] label is added and the message is placed in the Inbox for the user to make the final decision. In addition, the program uses two more types of labels: one for messages with obscene content and one for automatically generated messages, for example messages from email robots.

Thanks to such labels, you can organize the work of Kaspersky Anti-Spam with any other email program. It is enough to create rules in your email client to sort emails by these tags. In Microsoft Outlook itself, such folders are created with one click of a button in the antispam module settings window.
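Such a sorting rule is easy to picture. Below is a minimal sketch of what a client-side rule does with the two labels quoted above; the folder names and the function name folder_for are hypothetical, and real email clients express the same thing through their own rule editors.

```python
# Subject labels quoted in the text above; the folder names are hypothetical.
LABEL_TO_FOLDER = {
    "[!! SPAM]": "Spam",
    "[?? Probable Spam]": "Probable spam",
}


def folder_for(subject: str, default: str = "Inbox") -> str:
    """Pick a destination folder from the label the filter put in the Subject."""
    stripped = subject.lstrip()
    for label, folder in LABEL_TO_FOLDER.items():
        if stripped.startswith(label):
            return folder
    return default  # unlabeled mail follows the normal rules
```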

Training the program

The program can be trained in two ways: by classifying the messages the user receives as spam or not spam, and by downloading updates from the Kaspersky Lab server. The first method tailors the program to the user's personal mail; the second lets it respond quickly to mass spam outbreaks on the Internet.

When you launch it for the first time, Kaspersky Anti-Spam extracts all addresses from the Microsoft Outlook address book and places them in the "Friends List". All messages from these senders will be treated by the antispam module as not spam and passed through without checking. The user can later edit this list by adding or removing senders. In addition to the "Friends List" there is also an "Enemies List". Any correspondence received from senders on the Enemies List will be unconditionally classified as spam.

Senders are added to the friends or enemies list by simply clicking a special button on the Kaspersky Anti-Spam panel. Training is carried out there as well. If a spam email gets through, you just need to click the “This is spam” button. A window will appear in which the user must tell the program what to do with this message.

The “Send as an example of spam” command generates a message to Kaspersky Lab reporting the spam for further training. This command is optional. You can also skip adding the author to the enemies list, but you should definitely add the message to the spam samples. This is how the program is trained on personal correspondence.

Since Kaspersky Anti-Spam does not integrate into other email clients, in those programs it can only be trained through updates received from the Kaspersky Lab server. Unfortunately, this training option cannot adapt the program to the specifics of personal mail.

Settings

In the program settings you can: specify the location of the module databases, if the user wants them to be stored in a non-standard location; disable or enable filtering; set update parameters and view statistics.

The Kaspersky Anti-Spam module provides fairly complete protection of user mail from spam. Like any other program of this kind, it requires training. While this training is taking place, legitimate emails may be mistakenly recognized as spam and vice versa. A relative disadvantage is that the module cannot delete obvious spam on the server, so the user still has to spend traffic downloading these unwanted messages. On the other hand, with this approach to spam filtering, not a single valuable message will be lost. In all other respects, Kaspersky Anti-Spam deserves the most serious attention, especially considering the integration of the module with other programs that ensure the security of the user’s computer.

Modern spam mailings are distributed in hundreds of thousands of copies within just a few tens of minutes. Most often, spam is sent through user computers infected with malware - zombie networks. What can counter this onslaught? The modern IT security industry offers many solutions, and anti-spam developers have various technologies in their arsenal. However, no existing technology is a magic “silver bullet” against spam. There is simply no universal solution. Most modern products combine several technologies; otherwise their effectiveness would be low.

The most well-known and common technologies are listed below.

Blacklists

Also known as DNSBLs (DNS-based Blackhole Lists). This is one of the oldest antispam technologies. Mail coming from IP addresses that appear on the list is blocked.

  • Pros: A blacklist blocks 100% of mail from a suspicious source.
  • Cons: Blacklists produce a high level of false positives, so they should be used with caution.
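To make the mechanism concrete: a DNSBL lookup is an ordinary DNS query for a name built from the reversed octets of the IP address followed by the list's zone. The sketch below shows that construction; the zone dnsbl.example.org is a placeholder, not a real list, and the helper names are my own.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone.

    The octets are reversed and the zone is appended, so 192.0.2.1 in the
    placeholder zone dnsbl.example.org becomes 1.2.0.192.dnsbl.example.org.
    """
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """True if the name resolves, which a DNSBL uses to mean "listed".

    Any lookup failure, including a network outage, counts as "not listed"
    here, so this sketch fails open.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False
```

Failing open is a deliberate choice in this sketch: given the false-positive problem mentioned above, rejecting mail because the list itself is unreachable would make things worse.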

Mass-mailing detection (DCC, Razor, Pyzor)

The technology identifies mass messages in the mail flow that are completely identical or differ only slightly. Building a working “mass” analyzer requires huge mail flows, so this technology is offered by large vendors who have significant volumes of mail to analyze.

  • Pros: If the technology triggers, it is guaranteed to have detected a mass mailing.
  • Cons: Firstly, a “large” mailing may not be spam but quite legitimate mail (for example, Ozon.ru and Subscribe.ru send thousands of almost identical messages, and this is not spam). Secondly, spammers know how to “break through” such protection using intelligent technologies. They use software that generates different content - text, graphics, etc. - in every spam message. As a result, mass-mailing detection does not work.
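The idea of "identical or only slightly different" can be illustrated with a simple similarity measure. The sketch below compares messages by their overlapping three-word sequences (shingles) and the Jaccard index. It is a generic near-duplicate technique chosen for illustration; DCC, Razor and Pyzor use their own checksum schemes, which are not reproduced here.

```python
def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word sequences of the given size."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str) -> float:
    """Jaccard index of the two shingle sets: 1.0 identical, 0.0 unrelated."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)
```

A detector would count how many recent messages exceed some similarity threshold. It also shows why the generated "noise" described above works: changing one word in every few breaks most of the shared shingles.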

Checking Internet Message Headers

Spammers write special programs to generate spam messages and distribute them instantly. At the same time, they make mistakes in the design of headers; as a result, spam does not always comply with the requirements of the RFC mail standard, which describes the header format. These errors can be used to identify a spam message.

  • Pros: The process of recognizing and filtering spam is transparent, regulated by standards and quite reliable.
  • Cons: Spammers learn quickly, and errors in spam headers are becoming fewer and fewer. Used alone, this technology will stop no more than a third of all spam.
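As a small illustration of a header check: RFC 5322 requires exactly one Date field and exactly one From field in every message. The sketch below flags messages that break that rule. Real filters apply many more checks; the function name header_problems is my own.

```python
from email import message_from_string

REQUIRED_ONCE = ("Date", "From")  # fields RFC 5322 requires exactly once


def header_problems(raw_message: str) -> list:
    """Return a list of header violations found in a raw message."""
    msg = message_from_string(raw_message)
    problems = []
    for name in REQUIRED_ONCE:
        values = msg.get_all(name) or []
        if not values:
            problems.append(f"missing {name}")
        elif len(values) > 1:
            problems.append(f"duplicate {name}")
    return problems
```

An empty list means the message passed these two checks, nothing more; as noted above, well-written spam software produces perfectly valid headers.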

Content filtering

Also one of the old, proven technologies. The message is checked for spam-specific words, text fragments, pictures and other characteristic features of spam. Content filtering began with analysis of the message subject and of those parts of the message that contained text (plain text, HTML), but now spam filters check all parts, including graphic attachments.

As a result of the analysis, a text signature can be built or the “spam weight” of a message can be calculated.

  • Pros: Flexibility and the ability to fine-tune quickly. Systems based on this technology adapt easily to new types of spam and rarely make mistakes in distinguishing spam from normal mail.
  • Cons: Updates are usually required. The filter is tuned by specially trained people, sometimes by entire antispam laboratories. Such support is expensive, which affects the cost of the spam filter. Spammers invent special tricks to circumvent this technology: they introduce random “noise” into spam, making it difficult to find and evaluate the spam characteristics of a message. For example, they put non-letter characters into words (with this technique the word viagra may look like vi_a_gra or vi@gr@), generate variable colored backgrounds in images, etc.
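A "spam weight" calculation, together with a simple countermeasure to the noise trick, might look like the sketch below. The word list, the weights and the look-alike table are invented for illustration; a real filter gets them from a maintained database.

```python
import re

# Hypothetical weights, for illustration only.
SPAM_WORDS = {"viagra": 5.0, "casino": 3.0, "free": 1.0}
# Characters commonly substituted for letters (an assumed, incomplete table).
LOOKALIKES = str.maketrans({"@": "a", "0": "o", "1": "i", "$": "s"})


def normalize(text: str) -> str:
    """Undo simple obfuscation: look-alike characters and in-word separators."""
    text = text.lower().translate(LOOKALIKES)
    # Drop separators placed between letters, so "vi_a_gra" becomes "viagra".
    return re.sub(r"(?<=[a-z])[_*](?=[a-z])", "", text)


def spam_weight(text: str) -> float:
    """Sum the weights of all known spam words found in the normalized text."""
    words = re.findall(r"[a-z]+", normalize(text))
    return sum(SPAM_WORDS.get(w, 0.0) for w in words)
```

The message is then treated as spam when the weight exceeds a chosen threshold. Every new obfuscation trick needs a new normalization rule, which is exactly the maintenance cost described above.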

Content filtering: Bayes

Statistical Bayesian algorithms are also designed for content analysis. Bayesian filters do not require constant tuning. All they need is prior training. After that, the filter adjusts to the email topics typical for this particular user. Thus, if a user works in education and runs training courses, then for that user messages on this topic will not be recognized as spam. For those who do not need offers to attend training courses, the statistical filter will classify such messages as spam.

  • Pros: Customization.
  • Cons: Works best on an individual mail flow. Setting up Bayes on a corporate server with heterogeneous mail is a difficult and thankless task, and the end result will be much worse than for individual mailboxes. If the user is lazy and does not train the filter, the technology will not be effective. Spammers specifically work on bypassing Bayesian filters, and they succeed.
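How the training shapes the result can be seen in a toy naive Bayes classifier. This is a minimal sketch with whitespace tokenization and Laplace smoothing; the class name TinyBayes is my own, and it assumes at least one spam and one non-spam ("ham") message has been trained before classifying.

```python
import math
from collections import Counter


class TinyBayes:
    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.messages = Counter()

    def train(self, text: str, label: str) -> None:
        """Record the words of one message under "spam" or "ham"."""
        self.words[label].update(text.lower().split())
        self.messages[label] += 1

    def _log_score(self, tokens, label) -> float:
        vocab = len(set(self.words["spam"]) | set(self.words["ham"]))
        total = sum(self.words[label].values())
        # Prior: the share of training messages with this label.
        score = math.log(self.messages[label] / sum(self.messages.values()))
        for token in tokens:
            # Laplace smoothing keeps unseen words from zeroing the score.
            score += math.log((self.words[label][token] + 1) / (total + vocab))
        return score

    def is_spam(self, text: str) -> bool:
        tokens = text.lower().split()
        return self._log_score(tokens, "spam") > self._log_score(tokens, "ham")
```

A user who trains "training schedule for teachers" as ham gets such messages delivered, while another user who trains the same text as spam gets them filtered. This is the per-user adaptation described above, and also the reason one shared model on a corporate server works poorly.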

Greylisting

A temporary refusal to accept a message. The refusal comes with an error code that all mail systems understand, and after some time they resend the message. Programs that send spam usually do not resend the message in this case.

  • Pros: Yes, this is also a solution.
  • Cons: Delay in mail delivery. For many users, this solution is unacceptable.
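The logic fits in a few lines. The sketch below keeps the first time each (client IP, sender, recipient) combination was seen and answers with a temporary failure until a minimum delay has passed. The 300-second delay, the reply strings and the in-memory dictionary are assumptions for illustration; real implementations also expire old entries.

```python
import time


class Greylist:
    def __init__(self, min_delay: float = 300.0):
        self.min_delay = min_delay  # seconds a new sender must wait (assumed value)
        self.first_seen = {}        # (client_ip, sender, recipient) -> first attempt time

    def check(self, client_ip: str, sender: str, recipient: str, now=None) -> str:
        """Return an SMTP-style reply: temporary failure first, accept on a late retry."""
        now = time.time() if now is None else now
        key = (client_ip, sender, recipient)
        first = self.first_seen.setdefault(key, now)
        if now - first < self.min_delay:
            return "451 try again later"
        return "250 ok"
```

The delivery delay mentioned under the cons is visible directly: even a perfectly legitimate first message waits at least min_delay plus the sending server's retry interval.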

Introduction to the problem

We all know what spam is because we have either encountered it or read about it. We all know how spammers collect email addresses. It is also no secret that spam cannot be completely defeated. The problem is how, with minimal effort, to give the best possible protection to users who leave their contact details on your website.

Previously tested methods of protection

The biggest threat to mailboxes comes from programs that download websites and harvest email addresses from the text of pages. They either download only your site or wander across the whole network like search engines. If your site is small, the following automatic text replacement is quite sufficient protection:

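<?php
// Reconstructed: the character classes and the placeholder address are assumptions.
$text = preg_replace('~(<a [^>]+href=)(["\']?)mailto:([-\w.]+)([-\w])@([-\w])([-\w.]+\.[a-z]{2,4})\2([ >])~i',
    '$1"mailto:nospam@example.com" onMouseover="this.href=\'mai\' + \'lto:$3\' + \'$4\' + \'%40\' + \'$5\' + \'$6\';"$7',
    $text);
?>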

Unfortunately, it won't work if you have a large site. Take spectator.ru, whose author was one of the first to use this method. If I were a spammer, I would go into the personal settings, tick the “do not show ears” checkbox, set 1000 reviews per page, and capture the cookies with Proxomitron. Then, using a download tool or a PHP script, I would download the pages with comments (sending the cookies that carry those settings) and extract the addresses with a regular expression. That would give me a small database for advertising mailings.

There were a couple more protection methods in which the mailto: link was automatically replaced with some other one, but the effect remained the same - when you clicked on it, the system mail client would create a message to the desired address. Neither of them stood up to criticism.

Meet the hedgehogs

Obviously, it is difficult to come up with a method of protection other than the one already tested - providing a form on the site for sending a message. Let's start designing it. The advantages of this method are obvious: no one will be able to harvest addresses for a spam database from your website. It will not be possible to send messages while hiding one's address, as spammers do - the web server will record the sender's IP address. Lists of public anonymous proxy servers are regularly updated, and it is easy to block access from them.

Form sender

Let's start with this, because this is the most difficult part.

When installing a form sender on a site, it is important to protect it from hooligan attacks, which can be no less of a nuisance than spam. Therefore, we will have to put a lot of effort into this.

First, let's protect ourselves from accidental double clicks and from many identical requests. The idea is this: a message will not be sent if the user has not previously opened the page with the form, and each opening of the page allows only one message to be sent. This can be done using the sessions built into PHP. When the page with the form is opened, we start a session and save a variable in it, say $flag. We output the session ID as a hidden element at the very end of the form. The user enters a message and submits the form. On receiving the form, the script resumes the session and checks the presence and value of the $flag variable. If the variable does not exist, this is a repeated click: the message is not sent and an error message is displayed. If the variable exists and the form data suits us (the required fields are filled in), the script sends the message and destroys the session.
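The one-message-per-page-view rule can be sketched as a one-time token. Here an in-memory set stands in for the PHP session storage, which is my simplification; the class and method names are hypothetical.

```python
import secrets


class FormTokens:
    """In-memory stand-in for the session mechanism described above."""

    def __init__(self):
        self._issued = set()

    def issue(self) -> str:
        """Call when the form page is rendered; put the result in a hidden field."""
        token = secrets.token_hex(16)
        self._issued.add(token)
        return token

    def consume(self, token: str) -> bool:
        """Call on submit; returns True exactly once per issued token."""
        if token in self._issued:
            self._issued.remove(token)
            return True
        return False
```

A double click submits the same token twice, so the second submission is rejected. A request sent without ever opening the form has no valid token at all.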

Secondly, let's protect ourselves from smart hooligans by keeping message logs. If the user submits a correctly filled out form, the script will look through the logs and check what is there. Specifically, it should be forbidden to:

* send messages to the same address more often than once per a set period
* send the same text to different addresses
* simply use the form sender too often - say, more than 10 messages per day per user
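The three rules above translate into a single check against the log. In this sketch the log is a plain list of entries; the 600-second gap and the one-day window for the duplicate-text rule are assumed values, and "user" could be an IP address or a session owner depending on what the real script records.

```python
from collections import namedtuple

LogEntry = namedtuple("LogEntry", "time user address text")

MIN_GAP = 600        # seconds between messages to one address (assumed value)
DAILY_LIMIT = 10     # messages per user per day, as in the list above
DAY = 24 * 60 * 60


def may_send(log, now, user, address, text) -> bool:
    """Apply the three log-based rules to a new message."""
    recent = [e for e in log if now - e.time < DAY]
    if any(e.address == address and now - e.time < MIN_GAP for e in recent):
        return False  # same address too soon
    if any(e.text == text and e.address != address for e in recent):
        return False  # same text sent to a different address
    if sum(1 for e in recent if e.user == user) >= DAILY_LIMIT:
        return False  # user has used up the daily quota
    return True
```

When may_send returns True, the script would send the message and append a new LogEntry to the log.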

We output the session ID at the very end of the form so that an attacker has to download and parse the entire form, which is harder than simply sending HTTP requests. Naturally, the form sender reports errors in the message, such as a missing return address.

The resulting form sender code turned out to be too large to include in the text. It is available as an archive on the website. The script appears to work and send messages.

Replacing addresses in text

Now the form sender is ready, and you need to replace all emails with links to it. Of course, you shouldn't do this manually. For myself, I wrote a script that automatically replaces addresses with links to the form sender.

...Disadvantages: more time for arranging links (compensated by the directory of links), the user, when hovering the cursor over a link, does not see what address he will go to. (Dmitry Smirnov, “Ideal author’s project, hypertextuality”)

All the mentioned disadvantages can be easily eliminated if you use code similar to the one I will now describe and show.

There is nothing complicated here; if these are links, then “more time for arrangement” is not required. On my site I use an engine script that is called by all pages, so it’s not a problem to add code to it or call it from it that replaces addresses. Mailing addresses were and are written directly in the text of the pages, but before being displayed to the user they are replaced with the required text. Compiling a database of links or email addresses is not a problem.

So what does an address replacer do? It searches the text for “mailto:” links, extracts the addresses from them, and sends a query to the database to count (count(*)) how many of the addresses on the page are already in a special table. If there are new addresses on the page, their number will be greater than the query result. In that case, a query selects the stored address values, and those already present in the table are excluded from the list. The remaining addresses are written to the table with an INSERT query.

As for address identifiers, in my opinion it is better to use something that a site visitor cannot guess. Can you imagine the link /email.php?id=10 leading to the form sender? What a temptation to put 11, 12, etc. there and try sending them all a message. Therefore, I decided to use the md5 hash of the addresses as identifiers. It’s unlikely that anyone will bother to brute-force the hash. With a directory of links you can get by with an ID, but then you have to select all the values from the database, and replacing addresses with their hashes is much simpler.

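A command of the following form is executed:

<?php
// Reconstructed: the character classes are assumptions. A callback replaces the /e modifier, which PHP 7 removed.
$text = preg_replace_callback('~(<a [^>]+href=)(["\']?)mailto:([-\w.]+@[-\w.]+\.[a-z]{2,4})\2(.*?>)~i',
    function ($m) { return $m[1] . $m[2] . '/email.php?email=' . urlencode(md5($m[3])) . $m[2] . $m[4]; }, $text);
?>

...which replaces addresses with their hashes. I did not dare turn the remaining addresses in the text into links; instead I simply replaced them with addresses like vasya_at_pupkin_dot_ru. The auto-replacement code is also in the archive.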
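Both replacements, the hashed link and the _at_/_dot_ rewrite, can be shown together in a short sketch. It is written in Python for illustration and is not the code from the archive; the address pattern is a simplified assumption, and only the /email.php?email= link format and the vasya_at_pupkin_dot_ru style are taken from the text.

```python
import hashlib
import re

EMAIL = r"[-\w.]+@[-\w.]+\.[a-z]{2,4}"  # simplified address pattern (assumption)
MAILTO_HREF = re.compile(r'(href=)(["\']?)mailto:(' + EMAIL + r')\2', re.IGNORECASE)
BARE_EMAIL = re.compile(EMAIL, re.IGNORECASE)


def hide_addresses(html: str) -> str:
    """Point mailto: links at the form sender and mask addresses left in the text."""
    def to_form_link(m):
        digest = hashlib.md5(m.group(3).encode("utf-8")).hexdigest()
        return f"{m.group(1)}{m.group(2)}/email.php?email={digest}{m.group(2)}"

    html = MAILTO_HREF.sub(to_form_link, html)
    # Addresses remaining in plain text become vasya_at_pupkin_dot_ru style.
    return BARE_EMAIL.sub(
        lambda m: m.group(0).replace("@", "_at_").replace(".", "_dot_"), html)
```

The form sender would then look up the address by its hash in the table described earlier, so the real address never appears in the page source.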

Bottom line

Hiding email addresses from visitors is quite easy. The auto-replacement mechanism requires no additional effort, and you can continue writing site pages as if nothing had happened. The difficulties arise in protecting the form sender from web hooligans. This protection requires a lot of effort and complex code, so I have not yet used the code I wrote on the site. You can download an archive with the address replacer and the form sender, but I ask you very much: do not put it on your site in the form in which you downloaded it. I myself don’t know how reliably it works.
