A quick way to check the indexing of pages in Yandex and Google. Submitting a site for indexing

Quite often, a new site cannot be found in Yandex. Even if you type its name in the search bar. The reasons for this may be different. Sometimes search engines simply don’t yet know that a new resource has appeared. To figure out what’s going on and solve the problem, you need to register your site with Yandex.Webmaster.

What is site indexing in Yandex

First, let's figure out how search engines find out about new sites or about changes to existing ones. Yandex has a special program called a search robot. This robot surfs the Internet and looks for new pages. Sometimes it goes back to old ones and checks whether anything new has appeared on them.

When the robot finds a useful page, it adds it to its database. This database is called the search index. When we look for something in the search, we see sites from this database. Indexing is when the robot adds new documents there.

The robot cannot crawl the entire Internet every day; it does not have enough capacity for that. Therefore, it needs help: you have to tell it about new pages or about changes to old ones.

What is Yandex.Webmaster and why is it needed?

Yandex.Webmaster is an official service from Yandex. You need to add a website to it so that the robot knows about its existence. With its help, resource owners (webmasters) can prove that this is their site.

You can also see in Webmaster:

  • when the robot visited and which pages it crawled;
  • which pages it indexed and which it did not;
  • which search queries bring people to the site;
  • whether there are any technical errors.

Through this service you can set up a website: set the region, prices of goods, protect your texts from theft. You can ask the robot to re-visit the pages where you made changes. Yandex.Webmaster makes it easy to move to https or another domain.

How to add a new website to Yandex.Webmaster

Go to the Webmaster panel. Click "Login". You can enter the login and password that you use to log into Yandex mail. If you don't have a Yandex account yet, you will have to register.

After logging in, you will be taken to a page with a list of added resources. If you have not used the service before, the list will be empty. To add a new resource, click the “+” button.

On the next page, enter the address of your site and confirm its addition.

At the last stage you need to confirm your rights, that is, prove to Yandex that you are the owner. There are several ways to do this.

How to confirm rights to a website in Yandex.Webmaster

The easiest way to confirm rights in Yandex.Webmaster is to add a file to the site. To do this, click on the "HTML File" tab.

A small file will download. You'll need this file now, so save it somewhere you can see it. For example, on the desktop. Do not rename the file! There is no need to change anything about it.

Now upload this file to your website. File managers or FTP are typically used for this, but on many platforms you don't need any of that. Just go to the back office and click "Files", then at the top of the page choose "Add file" and select the file you downloaded earlier.

Then return to the Yandex.Webmaster panel and click the “Check” button. After successfully confirming access rights, your site will appear in the list of added ones. Thus, you have informed Yandex.Webmaster about the new site.

Yandex.Webmaster meta tag

Sometimes the method described above does not work, and the owners cannot confirm the rights to the site in Webmaster. In this case, you can try another method: add a line of code to the template.

In Webmaster go to the "Meta Tag" tab. You will see a line that needs to be added to the HTML code.
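
For reference, the verification line is a meta tag that goes inside the <head> section of your home page. A minimal sketch, with a placeholder value instead of the unique code Webmaster generates for you, might look like this:

    <head>
        <!-- the content value below is a placeholder; use the code shown in Yandex.Webmaster -->
        <meta name="yandex-verification" content="0123456789abcdef" />
    </head>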

Users can contact technical support and ask them to insert this code. This will be done as part of a free revision.

Once the code has been added, click the "Check" button in Webmaster. Congratulations, you have registered your site with the search engine!

Preliminary setup of Yandex.Webmaster

The site has been added to the search service, and now the robot will definitely come and index it. This usually takes up to 7 days.

Add a link to your sitemap

In order for the robot to index the resource faster, add the sitemap.xml file to Webmaster. This file contains the addresses of all pages of the resource.

Online stores already have this file configured and should be added to Webmaster automatically. If this does not happen, add a link to sitemap.xml in the “Indexing” - “Sitemap Files” section.
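
If your platform does not generate this file for you and you need to create sitemap.xml by hand, a minimal sketch (the example.ru addresses below are illustrative) might look like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
        <!-- one <url> entry per page of the site -->
        <url>
            <loc>https://example.ru/</loc>
        </url>
        <url>
            <loc>https://example.ru/catalog/</loc>
        </url>
    </urlset>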

Check robots.txt

The robots.txt file lists the pages that the robot does not need to visit: the cart, checkout, back office and other technical pages.

By default, a robots.txt file is created for you and usually does not need to be modified. Just in case, we recommend checking it for errors. To do this, go to “Tools” - “Analysis of robots.txt”.

Set the site region

On the “Site Information” - “Region” page, you can set the region of the site. For online stores, these are the cities, regions and countries where purchased goods are delivered. If you don’t have a store, but a directory or blog, then the region will be the whole world.

Set the sales region as shown in the screenshot:

What else is Webmaster useful for?

On the "Search queries" page you can see the phrases that bring visitors to you from search.

The “Indexing” section displays information about when the robot visited the site and how many pages it found. The “Site Moving” subsection will help you if you decide to install an SSL certificate and switch to https. The “Page Re-Crawling” subsection is also extremely useful: in it you can point the robot to pages whose information has changed, and on its next visit the robot will index them first.

On the “Products and Prices” page of the “Site Information” section, you can provide information about your online store. To do this, the resource must be configured to upload data on products and prices in YML format. With the correct setup, prices and delivery information will be displayed in search results for product pages.

If you want to improve the visibility of your company in Yandex services, you should use the “Useful Services” section. In Yandex.Directory, you can specify the phone number, address of your store, and opening hours. This information will be displayed directly in Yandex results. This will also add you to Yandex.Maps.

Yandex.Metrica - another important tool for the owner of an Internet resource, showing traffic data. Statistics and dynamics of site traffic are displayed in easy-to-analyze tables, charts and graphs.

After connecting to the Yandex.Webmaster and Yandex.Metrica services, you will receive a sufficient amount of information to manage the site’s positions and traffic. These are indispensable tools for website owners who want to promote their resources in the most popular search engine in Russia.

The next step in website promotion is working with Google's similar service, Search Console. That's all, good luck with your promotion!

What is site indexing? How does it happen? You can find answers to these and other questions in this article. Site indexing (in search engines) is the process by which a search engine robot adds information about a site to its database; this database is then used to search for information on the web projects that have gone through this procedure.

Data about web resources most often consists of keywords, articles, links and documents. Audio, images and so on can also be indexed. The algorithm for identifying keywords depends on the search engine.

There are some restrictions on the types of content that can be indexed (for example, Flash files and JavaScript).

Managing indexing

Indexing a website is a complex process. To manage it (for example, to prohibit a particular page from being included), you need to use the robots.txt file and directives such as Allow, Disallow, Crawl-delay, User-agent and others.

Special tags and attributes that hide parts of a resource's content from the Google and Yandex robots are also used to control indexing.

In the Google search engine, new sites are indexed in anywhere from a couple of days to one week, and in Yandex in one to four weeks.

Do you want your site to show up in search engine results? Then it must be processed by Rambler, Yandex, Google, Yahoo and so on. You must inform the search engines about the existence of your website, and then their spiders will crawl it in whole or in part.

Many sites have not been indexed for years. The information contained on them is not seen by anyone except their owners.

Processing methods

Site indexing can be done in several ways:

  1. The first option is to add it manually. You need to enter your site data through special forms offered by search engines.
  2. In the second case, the search engine robot finds your website on its own, using links from other resources that lead to your project, and indexes it. This method is the most effective: if a search engine finds a site this way, it considers it significant.

Deadlines

Site indexing is not very fast; the time frame varies, starting from 1-2 weeks. Links from authoritative resources (with high PageRank and TIC) significantly speed up the addition of the site to the search engine database. Today Google is considered the slowest, although until 2012 it could do this job within a week. Unfortunately, everything changes very quickly. It is known that Mail.ru works with websites in this area for about six months.

Not every specialist can get a website indexed quickly. The time it takes to add new pages of an already indexed site to the database is affected by how often its content is updated. If fresh information constantly appears on a resource, the system considers it frequently updated and useful for people, and the process speeds up.

You can monitor the progress of website indexing in special sections for webmasters or on search engines.

Changes

So, we have already figured out how the site is indexed. It should be noted that search engine databases are frequently updated. Therefore, the number of pages of your project added to them may change (either decrease or increase) for the following reasons:

  • search engine sanctions against the website;
  • presence of errors on the site;
  • changes in search engine algorithms;
  • poor hosting (the server on which the project is located is unavailable), and so on.

Yandex answers to common questions

Yandex is a search engine used by many people. It ranks fifth among search systems in the world in terms of the number of search requests processed. After you add a site, it may take quite a long time before it appears in the database.

Adding a URL does not guarantee that it will be indexed. This is just one way of telling the robot that a new resource has appeared. If your site has few or no links from other sites, adding it manually will help the robot discover it faster.

If indexing does not occur, check whether there were any failures on the server at the moment the Yandex robot made its request. If the server reported an error, the robot will stop and try again during the next crawl. Yandex employees cannot increase the speed of adding pages to the search engine database.

Indexing a site in Yandex is a rather difficult task. Don't know how to add a resource to a search engine? If there are links to it from other websites, you do not need to add the site manually: the robot will find and index it automatically. If you don't have such links, you can use the Add URL form to tell the search engine that your site exists.

It is important to remember that adding a URL does not guarantee that your creation will be indexed (or how quickly it will be indexed).

Many people are interested in how long it takes to index a website in Yandex. Employees of this company make no guarantees and predict no deadlines. As a rule, once the robot has learned about the site, its pages appear in search within two days, though sometimes it takes a couple of weeks.

Processing process

Yandex is a search engine that requires accuracy and attention. Site indexing consists of three parts:

  1. The search robot crawls the resource pages.
  2. The content of the site is recorded in the database (index) of the search system.
  3. After 2-4 weeks, after updating the database, you can see the results. Your site will appear (or not appear) in search results.

Indexing check

How to check site indexing? There are three ways to do this:

  1. Enter the name of your business in the search bar (for example, “Yandex”) and check each link on the first and second pages. If you find the URL of your site there, it means the robot has completed its task.
  2. You can also enter your site's URL in the search bar. You will see how many of its pages are shown, that is, indexed.
  3. Register on the webmasters' pages in Mail.ru, Google, Yandex. After you pass the site verification, you will be able to see the results of indexing and other search engine services created to improve the performance of your resource.

Why does Yandex refuse?

Indexing a site in Google works as follows: the robot enters all pages of the site into the database, low-quality and high-quality alike, without selection. But only useful documents take part in ranking. Yandex, on the other hand, excludes web junk right away: it can index any page, but the search engine eventually weeds out all the garbage.

Both systems have an additional index. For both, low-quality pages affect the ranking of the website as a whole. There is a simple philosophy at work here. A particular user's favorite resources will rank higher in search results. But this same individual will have difficulty finding a site that he didn’t like last time.

That is why it is necessary first to keep copies of web documents out of the index, check for empty pages, and prevent low-quality content from getting into the search results.

Speeding up Yandex

How can you speed up site indexing in Yandex? Several techniques help; they are described below.

Intermediate actions

What should you do before a web page is indexed by Yandex? The domestic search engine should consider your site the primary source. That is why, even before publishing an article, it is worth adding its text to the “Original Texts” form in Webmaster. Otherwise, plagiarists may copy the entry to their resource and end up in the database first, and in the end they would be recognized as the authors.

Prohibition

What is a site indexing ban? You can apply it either to the entire page or to a separate part of it (a link or a piece of text). In fact, there is both a global indexing ban and a local one. How is this implemented?

Let's consider prohibiting adding a website to the search engine database in Robots.txt. Using the robots.txt file, you can exclude indexing of one page or an entire resource category like this:

    User-agent: *
    Disallow: /kolobok.html
    Disallow: /foto/

The first line indicates that the instructions apply to all search robots, the second prohibits indexing of the kolobok.html file, and the third forbids adding the entire contents of the foto folder to the database. If you need to exclude several pages or folders, list them all in robots.txt.

To prevent an individual page from being indexed, you can use the robots meta tag. It differs from robots.txt in that it gives instructions to all search robots at once. This meta tag follows the general principles of the HTML format and should be placed in the page header, between the <head> and </head> tags. A ban entry, for example, could be written like this: <meta name="robots" content="noindex, nofollow">.
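
A sketch of such a page header, assuming you want to forbid both indexing the page and following its links, might look like this:

    <head>
        <title>Page title</title>
        <!-- forbids all robots to index this page and to follow its links -->
        <meta name="robots" content="noindex, nofollow" />
    </head>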

Ajax

How does Yandex index Ajax sites? Today, Ajax technology is used by many website developers, and of course it offers great possibilities: with it you can create fast and responsive interactive web pages.

However, the robot "sees" a web page differently than the user and the browser do. For example, a person sees a comfortable interface with dynamically loaded content. For a search robot, the content of the same page may be empty or may consist only of the static HTML content that is generated without scripts.

Ajax sites often use URLs with a #, but the search robot does not use that part: the portion of the URL after the # is usually discarded. This needs to be taken into account. Instead of a URL like http://site.ru/#example, the robot makes a request to the main page of the resource located at http://site.ru. This means that the content of the page may not make it into the database and, as a result, will not appear in search results.

To improve the indexing of Ajax sites, Yandex supported changes in the search robot and the rules for processing URLs of such websites. Today, webmasters can indicate to the Yandex search engine the need for indexing by creating an appropriate scheme in the resource structure. To do this you need:

  1. Replace the # symbol in the page URL with #!. The robot will then understand that it can request an HTML version of the content for this page.
  2. The HTML version of the content of such a page should be placed at a URL where #! is replaced by ?_escaped_fragment_=.
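
Taking the illustrative address from the example above, the correspondence would look like this:

    URL shown to users:         http://site.ru/#!example
    URL requested by the robot: http://site.ru/?_escaped_fragment_=example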

What is indexing? This is the process of a robot receiving the content of your site's pages and including that content in search results. If we look at the numbers, the indexing robot’s database contains trillions of website page addresses. Every day the robot requests billions of such addresses.

But this whole large process of indexing the Internet can be divided into small stages:


First, the indexing robot must learn that a page has appeared on your site, for example by indexing other pages on the Internet and finding links, or by downloading the sitemap file you have set up. Once the robot learns about the page, it plans to crawl it, sends a request for the page to your server, receives the content and includes it in the search results.

This entire process is an exchange between the indexing robot and your website. While the requests sent by the indexing robot hardly change (only the page address changes), your server's response to the robot's request depends on many factors:

  • your CMS settings;
  • the hosting provider's settings;
  • the work of intermediate providers.

And this response can change. First of all, when requesting a page, the robot receives the following service response from your site:


These are HTTP headers. They contain various service information that allows the robot to understand what content will be transmitted now.
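
As a sketch, such a service response (all values here are illustrative) might look like this:

    HTTP/1.1 200 OK
    Date: Tue, 14 Feb 2017 10:00:00 GMT
    Content-Type: text/html; charset=utf-8
    Content-Length: 5120
    Last-Modified: Mon, 13 Feb 2017 18:30:00 GMT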

I would like to focus on the first header - this is the HTTP response code that indicates to the indexing robot the status of the page that the robot requested.

There are several dozen such HTTP code statuses:


I'll tell you about the most popular ones. The most common response code is HTTP-200. The page is available, it can be indexed, included in search results, everything is fine.

The opposite of this status is HTTP-404: the page is not on the site, there is nothing to index and nothing to include in the search. When changing the structure of a site and changing the addresses of internal pages, we recommend setting up a 301 redirect on the server. It will tell the robot that the old page has moved to a new address, and that it is the new address that needs to be included in the search results.
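
How exactly the redirect is configured depends on your server. As a sketch, on an Apache server a single moved page could be redirected with a line like this in the .htaccess file (the paths and domain are illustrative):

    Redirect 301 /old-page.html https://example.ru/new-page.html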

If the page content has not changed since the last time a robot visited the page, it is best to return an HTTP-304 code. The robot will understand that there is no need to update the pages in the search results and the content will not be transferred either.

If your site is unavailable for a short period of time, for example when doing some work on the server, it is best to configure HTTP-503. It tells the robot that the site and server are currently unavailable and that it should come back a little later. In the case of short-term unavailability, this will prevent pages from being excluded from the search results.

In addition to these HTTP codes and page statuses, you also need to directly obtain the content of the page itself. If for a regular visitor the page looks like this:


it has pictures, text and navigation and everything is very beautiful, then for the indexing robot any page is just a set of source code, HTML code:
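
In other words, instead of the rendered page the robot receives something like this (a deliberately simplified sketch):

    <!DOCTYPE html>
    <html>
    <head>
        <title>Example page</title>
        <meta name="description" content="Example description">
    </head>
    <body>
        <h1>Example page</h1>
        <a href="/catalog/">Catalog</a>
        <p>Text content, prices, delivery terms...</p>
        <script src="/js/main.js"></script>
    </body>
    </html>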


Various meta tags, text content, links, scripts, a lot of all kinds of information. The robot collects it and includes it in search results. It seems that everything is simple: they requested a page, received the status, received the content, and included it in the search.

But it’s not without reason that the Yandex search service receives more than 500 letters from webmasters and site owners stating that certain problems have arisen with the server’s response.

All these problems can be divided into two parts:

These are problems with the HTTP response code and problems with the HTML code, with the direct content of the pages. There can be a huge number of reasons for these problems. The most common is that the indexing robot is blocked by the hosting provider.


For example, you launched a website and added a new section. The robot begins to visit your site more often, increasing the load on the server. The hosting provider sees this in its monitoring and blocks the indexing robot, so the robot cannot access your site. You go to your resource and everything is fine, everything works, the pages are beautiful, everything opens, but the robot cannot index the site. Or the site is temporarily unavailable, for example because you forgot to pay for the domain name and the site has been down for several days. The robot comes to the site, finds it inaccessible, and under such conditions the site can disappear from the search results after a while.

Incorrect CMS settings, for example when updating or switching to another CMS or updating the design, can also cause pages on your site to disappear from the search results: for example, a prohibiting meta tag in the source code of the site's pages or an incorrectly set canonical attribute. Make sure that after any changes you make to the site, the pages remain accessible to the robot.

The Yandex.Webmaster tool for checking the server response will help you with this:


You can see what HTTP headers your server returns to the robot, and the contents of the pages themselves.


The “indexing” section contains statistics where you can see which pages are excluded, the dynamics of changes in these indicators, and do various sorting and filtering.


There is also the “Site Diagnostics” section, which I already mentioned today. If your site becomes unavailable to the robot, you will receive a corresponding notification and recommendations on how to fix it. If no such problems arise, the site is accessible, responds with code 200 and contains correct content, then the robot begins to visit, in automatic mode, all the pages it knows about. This does not always lead to the desired results, so the robot's activity can be limited in a certain way. The robots.txt file exists for this; we'll talk about it in the next section.

Robots.txt

The robots.txt file itself is a small text document that lies in the root folder of the site and contains strict rules for the indexing robot that must be followed when crawling the site. The advantage of the robots.txt file is that you do not need any special or specialized knowledge to use it.

All you have to do is open Notepad, enter rules in a certain format, and then simply save the file on the server. Within a day, the robot begins to use these rules.

If we take an example of a simple robots.txt file, here it is, just on the next slide:
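
A minimal sketch of such a file, with an illustrative site.ru domain and illustrative sections, might look like this:

    User-agent: *
    Disallow: /admin/
    Disallow: /cart/
    Allow: /catalog/
    Sitemap: https://site.ru/sitemap.xml
    Host: https://site.ru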


The “User-agent:” directive shows which robots the rules are intended for; then come the allowing and denying directives and the auxiliary Sitemap and Host directives. That's a little theory; now I would like to move on to practice.

A few months ago I wanted to buy a pedometer, so I turned to Yandex.Market for help with the choice. I moved from the main page of Yandex to Yandex.Market and landed on the service's home page.


Below you can see the address of the page I went to. The service's address also had a parameter appended to it that identified me as a user on the site.

Then I went to the “catalog” section.


I selected the desired subsection and configured the sorting parameters, price, filter, how to sort, and manufacturer.

I received a list of products, and the page address has already grown.

I went to the desired product, clicked on the “add to cart” button and continued checkout.

During my short journey, the page addresses changed in a certain way.


Service parameters were added to them, which identified me as a user, set up the sorting, and told the site owner where I had come from to reach this or that page of the site.

I think such pages, service pages, will not be very interesting to search engine users. But if they are available to the indexing robot, they may be included in the search, since the robot essentially behaves like a user.

It goes to one page, sees a link it can follow, follows it, loads the data into its database and continues crawling the entire site. This category of addresses also includes users' personal data, such as delivery information or contact details.

Naturally, it is better to keep them out of the index, and this is exactly what the robots.txt file will help you with. You can go to Webmaster this evening, click around, and see which pages of your website are actually available.

In order to check robots.txt there is a special tool in Webmaster:


You can upload your robots.txt file, enter page addresses, and see whether they are accessible to the robot or not.


Make some changes, see how the robot reacts to these changes.

Errors when working with robots.txt

In addition to such a positive effect - closing service pages, robots.txt can play a cruel joke if handled incorrectly.

Firstly, the most common problem when using robots.txt is the closing of really necessary site pages, those that should be in the search and shown for queries. Before you make changes to robots.txt, be sure to check whether the page you want to close is showing up for search queries. Perhaps a page with some parameters is in the search results and visitors come to it from search. Therefore, be sure to check before using and making changes to robots.txt.

Secondly, if your site uses Cyrillic addresses, you won't be able to specify them in robots.txt in their direct form; they must be encoded. Since robots.txt is an international standard that all indexing robots follow, Cyrillic cannot be specified explicitly and such addresses will definitely need to be encoded.

The third most popular problem is different rules for the robots of different search engines: for one indexing robot all pages were closed, while for another nothing was closed at all. As a result, everything is fine in one search engine and the desired page is in the search, while another search engine picks up trash, various garbage pages and who knows what else. Make sure that if you set a ban, it is set for all indexing robots.

The fourth most popular problem is using the Crawl-delay directive when it is not necessary. This directive allows you to influence the frequency of requests from the indexing robot. A practical example: a small website was placed on small hosting and everything was fine. Then a large catalog was added, the robot came, saw a bunch of new pages, started accessing the site more often, increased the load, brought the site down and it became inaccessible. The owners set the Crawl-delay directive, the robot sees it and reduces the load; everything is fine, the site works, everything is perfectly indexed and appears in the search results. After some time the site grows even more and is transferred to a new hosting provider that is ready to cope with a large number of requests, but they forget to remove the Crawl-delay directive. As a result, the robot understands that a lot of pages have appeared on your site, but cannot index them simply because of the directive that was set. If you have ever used the Crawl-delay directive, make sure it is not there now and that your hosting is ready to handle the load from the indexing robot.
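
For reference, the directive itself is a single line in robots.txt; the value below (a delay in seconds between requests) is illustrative:

    User-agent: *
    Crawl-delay: 2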


In addition to the described functionality, the robots.txt file allows you to solve two very important tasks - get rid of duplicates on the site and indicate the address of the main mirror. This is exactly what we will talk about in the next section.

Duplicates


By duplicates we mean several pages of the same site that contain absolutely identical content. The most common example is pages with and without a slash at the end of the address. Also, a duplicate can be understood as the same product in different categories.

For example, roller skates can be listed both for girls and for boys, and the same model can be in two sections at the same time. And thirdly, these are pages with an insignificant parameter. As in the Yandex.Market example, this is the session ID parameter, which does not change the content of the page at all.

To detect duplicates and see which pages the robot is accessing, you can use Yandex.Webmaster.


In addition to statistics, you can also see the addresses of the pages the robot downloaded, along with the response code and the time of the last request.

Problems that duplicates lead to

What's so bad about duplicates?

Firstly, the robot begins to access absolutely identical pages of the site, which creates an additional load not only on your server, but also affects the crawling of the site as a whole. The robot begins to pay attention to duplicate pages, and not to those pages that need to be indexed and included in search results.


The second problem is that duplicate pages, if they are accessible to the robot, can end up in search results and compete with the main pages for queries, which, naturally, can negatively affect the site being found for certain queries.

How can you deal with duplicates?

First of all, I recommend using the "canonical" tag to point the robot to the main, canonical page, which should be indexed and shown for search queries.
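
A sketch of such a tag, placed in the <head> section of a duplicate page and pointing to an illustrative canonical address, looks like this:

    <link rel="canonical" href="https://site.ru/catalog/rollers/" />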

In the second case, you can use a 301 server redirect, for example, for situations with a slash at the end of the address and without a slash. We set up redirection - there are no duplicates.


And thirdly, as I already said, this is the robots.txt file. You can use both deny directives and the Clean-param directive to get rid of insignificant parameters.
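
A sketch of both approaches in robots.txt, using the session ID parameter from the Yandex.Market example and illustrative paths (the Clean-param directive is processed by the Yandex robot):

    User-agent: Yandex
    Disallow: /cart/
    Clean-param: session_id /catalog/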

Site mirrors

The second task that robots.txt allows you to solve is to point the robot to the address of the main mirror.


Mirrors are a group of absolutely identical sites, like duplicates, except that they are different sites at different addresses. Webmasters usually deal with mirrors in two cases: when they want to move to a new domain, or when they need to make the site available at several addresses.

For example, you know that when users type your address or the address of your website in the address bar, they often make the same mistake - they misspell, put the wrong character, or something else. You can purchase an additional domain in order to show users not a stub from the hosting provider, but the site they really wanted to go to.

Let's focus on the first point, because it is with this that problems most often arise when working with mirrors.

I advise you to carry out the entire moving process according to the following instructions. A small instruction that will allow you to avoid various problems when moving to a new domain name:

First, you need to make both sites accessible to the indexing robot and place absolutely identical content on them. Also make sure that the robot knows about the existence of the sites; the easiest way is to add them to Yandex.Webmaster and confirm the rights to them.

Secondly, using the Host directive, point the robot to the address of the main mirror - the one that should be indexed and be in the search results.
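
In the robots.txt of both sites this is a single line; the domain below is illustrative:

    Host: https://newsite.ru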

Then wait for the sites to be merged and for all indicators to be transferred from the old site to the new one.


After that you can set up a redirect from the old address to the new one. It is a simple set of instructions; if you are moving, be sure to use it. I hope there won't be any problems with the move.

But, naturally, errors arise when working with mirrors.

First of all, the most important problem is the lack of an explicit indication to the indexing robot of the address of the main mirror, the address that should be in the search. Check that your sites have a Host directive in their robots.txt and that it points to exactly the address you want to see in the search.

The second most popular problem is using redirection to change the main mirror in an existing group of mirrors. What's happening? The old address, since it redirects, is not indexed by the robot and is excluded from search results. In this case, the new site does not appear in the search, since it is not the main mirror. You lose traffic, you lose visitors, I think no one needs this.


And the third problem is the inaccessibility of one of the mirrors during the move. The most common example: the site's content was copied to a new address, while the old address was simply switched off - the domain name was not paid for and it became unavailable. Naturally, such sites will not be merged; they must both be accessible to the indexing robot.

Useful links in the work:

  • You will find more useful information in the Yandex.Help service.
  • All the tools I talked about, and even more, are available in the beta version of Yandex.Webmaster.

Answers to questions

“Thank you for the report. Is it necessary to disable indexing of CSS files for the robot in robots.txt or not?”

We do not recommend closing them at this time. It is better to leave CSS and JavaScript open, because we are now working to ensure that the indexing robot recognizes both the scripts and the styles on your site and sees the page the way a visitor sees it in a regular browser.

“Tell me, if the site URLs are the same for the old and the new, is that normal?”

It's okay. Basically, you just update the design, add some content.

“The site has a category that consists of several pages: slash, page1, page2, up to 10, for example. All the pages have the same category text, and it turns out to be duplicated. Will this text be a duplicate, or should it somehow be closed, with noindex on the second and further pages?”

First of all, since the content on the first pagination page and on the second page is generally different, they will not be duplicates. But you need to expect that the second, third and further pagination pages can get into the search and show up for some relevant query. On pagination pages I would recommend using the canonical attribute, ideally pointing to the page on which all products are collected, so that the robot does not include the pagination pages in the search. People very often point canonical to the first pagination page: the robot comes to the second page, sees the product and the text, does not include the page in the search, and understands from the attribute that it is the first pagination page that should be included in the search results. Use canonical; as for closing the text itself, I think there is no need.
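
A sketch of what this might look like on the second pagination page, assuming an illustrative category that has a "view all" page:

    <!-- placed in the <head> of https://site.ru/catalog/page2/ -->
    <link rel="canonical" href="https://site.ru/catalog/all/" />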

Source (video): How to set up site indexing - Alexander Smirnov

Magomed Cherbizhev

By and large, if your resource is good and well made, there should be no problems with its indexing. If the site, even if not 100%, meets the search engines' requirement of being made "for people", they will be happy to visit it and index everything new that is added.

But be that as it may, the first step in promoting a site is adding it to the search engine index. Until the resource is indexed, there is by and large nothing to promote, because search engines will not know about it at all. Therefore, in this article I will look at what site indexing in Yandex is and how to submit a resource for indexing. I'll also tell you how to check whether a site or a separate page is included in the Yandex index and what to do to speed up indexing by Yandex.

Indexing a site in Yandex means the Yandex search robots crawling your site and entering all open pages into the database. The Russian search engine's spider adds data about the site to the database: its pages, pictures, videos and documents that can be found through search. The search bot also indexes links and other elements that are not hidden by special tags and files.

The main ways to index a resource:

    Forced - you must submit the site for indexing to Yandex through a special form.

    Natural - the search spider manages to independently find your site by moving from external resources that link to the website.

The time it takes to index a site in Yandex is different for everyone and can range from a couple of hours to several weeks.

This depends on many factors: what values ​​are in Sitemap.xml, how often the resource is filled, how often mentions of the site appear on other resources. The indexing process is cyclical, so the robot will come to you at (almost) equal intervals of time. But with what frequency depends on the factors mentioned above and the specific robot.

The spider can index the entire website (if it is small) or a separate section (this applies to online stores or media). On frequently updated resources, such as media and news portals, so-called fast robots operate for quick site indexing in Yandex.

Sometimes technical problems (or problems with the server) may arise on the project; in this case Yandex indexing of the site will not take place, and the search engine may resort to the following scenario:

  • immediately drop the unindexed pages from the database;
  • re-index the resource after a certain time;
  • mark the pages that were not indexed for exclusion from the database, and if it does not find them during re-indexing, throw them out of the index.

How to speed up site indexing in Yandex

How to speed up indexing in Yandex is a common question on various webmaster forums. In fact, the life of the entire site depends on indexing: the position of the resource in the PS, the number of clients from them, the popularity of the project, profit, in the end.

I have prepared 10 methods that I hope will be useful to you. The first five are standard for constant indexing of a resource, and the next five will help you speed up the indexing of your site in Yandex:

    bookmarking services;

    RSS feed – will ensure the broadcast of new materials from your resource to subscribers’ emails and RSS directories;

    link exchanges - will ensure a stable increase in dofollow links from quality donors, if they are selected correctly (how to select correctly);

    site directories – if you have not yet registered your site in directories, I advise you to do so. Many people say that directories died long ago or that registering in them will kill a site - this is not true. More precisely, it is not the whole truth: if you register in all directories in a row, your resource will indeed only suffer from it. But with a correct selection of trusted and good catalogs, the effect will undoubtedly be there.

Checking site indexing in Yandex

  • The site and url operators. If you want to check the indexing of a site in Yandex, you can use the standard search engine operators site: and url: followed by your domain (naturally, substitute your own domain for mine).

  • RDS bar. I consider it the best and fastest way to check the indexing of a page in Yandex. This plugin can be installed on all popular browsers and will immediately provide detailed information about the number of site pages in the index and the presence of specific material in it. With this extension, you will not waste time manually entering URLs in services or searches. In general, I recommend it, the RDS bar is extremely convenient:
  • Serphunt service. A multifunctional resource with which you can analyze a site: assessing the effectiveness and monitoring of sites, analyzing competitors' pages, checking positions and site indexing. You can check page indexing for free using this link: https://serphunt.ru/indexing/. Thanks to batch checking (up to 50 addresses) and the high reliability of the results, this service is among the three best in my opinion.

  • XSEO service. A set of tools for webmasters; in XSEO.in you can check site indexing in Yandex and also get a lot of additional useful information about your resource:

  • PR-CY and CY-PR services. A couple more services that will provide you with information about the total number of indexed pages:

  • Sitereport service. An excellent service that will point out all your mistakes in working on the site. It also has an "Indexation" section, where information is presented for each page of the site, indicating whether or not it is indexed in the Yandex and Google search engines. Therefore, I recommend using this resource to detect problems on the site and to check mass indexing in Yandex:

With Google everything is very simple. You need to add your site to webmaster tools at https://www.google.com/webmasters/tools/, then select the added site, thus getting into the Search Console of your site. Next, in the left menu, select the “Scanning” section, and in it the “View as Googlebot” item.

On the page that opens, in the empty field, enter the address of the new page that we want to quickly index (taking into account the already entered domain name of the site) and click the “Crawl” button to the right. We wait until the page is scanned and appears at the top of the table of addresses previously scanned in a similar way. Next, click on the “Add to Index” button.

Hurray, your new page is instantly indexed by Google! In just a couple of minutes you will be able to find it in Google search results.

Fast indexing of pages in Yandex

In the new version of the webmaster tools, a similar tool for adding new pages to the index became available. Accordingly, your site must also first be added to Yandex Webmaster. To get there, select the desired site in Webmaster, go to the "Indexing" section and select the "Page Re-Crawling" item. In the window that opens, enter the addresses of the new pages that you want to quickly index (one link per line).

Unlike Google, indexing in Yandex does not yet happen instantly, but it is getting there. With the above actions you will inform the Yandex robot about the new page, and it will be indexed within half an hour to an hour - at least that is what my personal experience shows. Perhaps the speed of page indexing in Yandex depends on a number of parameters (the reputation of your domain, your account and/or others). In most cases, you can stop there.

If you see that the pages of your site are being indexed poorly by Yandex, here are several general recommendations on how to deal with this:

  • The best, but also the most difficult, recommendation is to attract the Yandex fast robot to your website. To do this, it is advisable to add fresh materials to the site every day, preferably 2-3 or more. Moreover, add them not all at once, but spread out over time, for example in the morning, afternoon and evening. It is even better to maintain approximately the same publication schedule (roughly the same times for adding new materials). Many also recommend creating an RSS feed for the site so that search robots can read updates directly from it.
  • Naturally, not everyone will be able to add new materials to the site in such volumes; it's good if you can add 2-3 materials per week. In this case you can't really count on the Yandex fast robot, but you can try to get new pages into the index in other ways. The most effective is considered to be posting links to new pages on boosted Twitter accounts. Using special programs like Twidium Accounter, you can "pump up" the number of Twitter accounts you need and use them to quickly push new site pages into the search engine index. If you cannot post links to boosted Twitter accounts yourself, you can buy such posts through special exchanges. One post with your link will cost on average 3-4 rubles or more (depending on the strength of the selected account). But this option will be quite expensive.
  • The third option for quick indexing is to use the http://getbot.guru/ service, which for just 3 rubles will help you achieve the desired effect with a guaranteed result. It is well suited for sites with an infrequent publication schedule. There are also cheaper rates; it is better to look at their details and differences on the service's own website. Personally, I am very pleased with this service as an indexing accelerator.

Of course, you can also add new posts to social bookmarks, which in theory should also help the site get indexed quickly. But the effectiveness of such an addition will also depend on the level of your accounts. If you have little activity on them and you use accounts only for such spam, then there will be practically no useful output.
