Today we will talk about an unpleasant phenomenon called duplicates. What are duplicates? A duplicate is a page of your site that is present in the search index under more than one URL, so the index contains several different links to the same page. Almost every site has duplicates. Why do you have to find and fix them? There are two reasons:
The first reason is relevance. A search engine may consider the duplicate, not the main page you are promoting, to be the relevant one and show it to users for a query. The duplicate then gets into the search results and competes with other sites on deliberately unequal terms, because it doesn’t have the support of internal and external links.
The second reason is indexing. A search engine has a certain quota for indexing your site: only a limited number of pages can be indexed in a given amount of time. Imagine that you have prepared the content, optimized the pages and want them to appear in the search index as soon as possible. But they don’t get into the index, because the search engine is spending that time on duplicate pages, and it is the duplicates that go into the index.
These problems compel us to look for duplicates and fix them. Below we will talk about the three basic classes of duplicates: page duplicates, site mirrors and incomplete duplicates.
Let’s start with duplicate pages. What are they? You have one page that is available at different URLs. For example, a product page in an online store is listed in several product categories and has a different URL in each of them. The content is the same, the URLs are different. Or the product has a print version of its description: again the same content under a different link. But most often duplicate pages come from sorting, site search and pagination. Pagination is the page-by-page navigation of the site.
How do you find duplicate pages? Register your site in Google Search Console and Yandex Webmaster. In Google Search Console go to Search Appearance, then HTML Improvements. If it reports duplicate meta descriptions, you have duplicate pages. Open these duplicate descriptions and look at them: if a page is clearly a duplicate, copy its URL and set it aside so the problem can be fixed later.
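The same check can be sketched in a few lines of Python for pages you have already crawled yourself. This is only an illustrative sketch: the `pages` mapping, its URLs and the function name `find_duplicate_descriptions` are made up for the example, and it assumes you have collected each page's meta description by some other means.

```python
from collections import defaultdict


def find_duplicate_descriptions(pages):
    """Group URLs that share the same meta description.

    `pages` is a hypothetical mapping of URL -> meta description text.
    Returns a dict of description -> sorted list of URLs, keeping only
    descriptions that occur on more than one URL.
    """
    groups = defaultdict(list)
    for url, description in pages.items():
        # Normalize whitespace and case so trivial differences don't hide a duplicate.
        key = " ".join(description.split()).lower()
        if key:  # pages without a description are ignored
            groups[key].append(url)
    return {desc: sorted(urls) for desc, urls in groups.items() if len(urls) > 1}


# Made-up sample data: one product reachable through two categories and a print version.
pages = {
    "https://mysite.ru/phones/model-x": "Buy Model X with delivery",
    "https://mysite.ru/sale/model-x": "Buy Model X with delivery",
    "https://mysite.ru/phones/model-x?print=1": "Buy  Model X with delivery ",
    "https://mysite.ru/phones/model-y": "Buy Model Y with delivery",
}

duplicates = find_duplicate_descriptions(pages)
for description, urls in duplicates.items():
    print(description, urls)
```

A shared description is only a hint, so each group still has to be opened and checked by hand, just like the report in the webmaster tools.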
In Yandex Webmaster, look at the section listing the pages in search, that is, all the pages present in the Yandex index, and check whether any of them are duplicates. You can do it even more simply: open Google and Yandex, enter the search operator site:mysite.ru (with your own domain name), look through all the pages that are present in the search index and, if there are duplicates, copy their addresses.
What do you do with duplicates? How do you fix them? You can set up a permanent server-side 301 redirect from the duplicate page to the main page. If that is not an option, you can close specific pages or entire directories with duplicates from indexing in the robots.txt file, but this method has its drawbacks, because pages closed this way sometimes still end up in the search index. Search engines recommend declaring the canonical address of the page with the rel="canonical" attribute of the link tag.
The next big problem is site mirrors. Mirrors are sites with the same content that are available at different addresses. If both www.site.ru and site.ru open, they are mirrors. You can also check for mirrors by entering site.ru and www.site.ru in the address bar and adding index.php or just a slash at the end. Every such request should end up at one and the same address.
What do you do with mirrors if you find them? They should be “glued” together. Set up a permanent server-side 301 redirect to the main mirror. In Google Search Console, choose the site, click the gear icon in the upper right corner and choose the preferred domain. In Yandex Webmaster, choose the main mirror in the indexing settings section. Also specify the main mirror in the Host directive in the robots.txt file.
The next group is incomplete duplicates: the same content found on different pages of your site. Most often these are the article announcements in category listings, where the announcement is taken from the beginning of the article itself. Another example is a category description repeated on every page of pagination.
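Incomplete duplicates can also be looked for programmatically. The sketch below is a rough illustration, not a finished tool: it assumes you already have the visible text of each page in a hypothetical `page_texts` mapping, treats each non-empty line as a text block, and reports blocks that appear on more than one page. The URLs, texts and the function name `find_repeated_blocks` are invented for the example.

```python
from collections import defaultdict


def find_repeated_blocks(page_texts, min_length=20):
    """Return text blocks that occur on more than one page.

    `page_texts` is a hypothetical mapping of URL -> visible page text,
    with one text block per line. Blocks shorter than `min_length`
    characters are skipped to ignore menu items and similar noise.
    """
    seen_on = defaultdict(set)
    for url, text in page_texts.items():
        for line in text.splitlines():
            block = " ".join(line.split())  # collapse repeated whitespace
            if len(block) >= min_length:
                seen_on[block].add(url)
    return {block: sorted(urls) for block, urls in seen_on.items() if len(urls) > 1}


# Made-up sample: a category page that reuses the first sentence of an article.
page_texts = {
    "https://mysite.ru/blog/": "Our blog\nDuplicates are pages available at several URLs.",
    "https://mysite.ru/blog/duplicates": (
        "Duplicates are pages available at several URLs.\n"
        "The rest of the article is unique to this page."
    ),
}

repeated = find_repeated_blocks(page_texts)
for block, urls in repeated.items():
    print(block, urls)
```

Exact line matching is a deliberate simplification. It catches text copied word for word, which is the typical case for announcements and repeated descriptions, but not text that has been lightly reworded.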
You go through the pages and the category description is the same everywhere. Another case is a conversion block at the bottom of the page (order from us, phone, email and some other text) that is repeated on various pages of the site. You can find incomplete duplicates just by looking at your site.
What do you do with them if you find them? First of all, they can be made unique: write a unique announcement for each article. You can also close the directories with incomplete duplicates from indexing in the robots.txt file, or hide the repeated text in the noindex tag for Yandex. Or, as in the case of the category description, simply don’t show it on the pagination pages.
That’s all for duplicates. If you still have questions, please feel free to ask them in the comments. Good luck!