For a website, it is important that its pages are indexed by Google so that they appear in users' searches. When creating new content or updating existing content, it is common to use Search Console to notify Google and get the content indexed as soon as possible, either manually or through sitemap files.
Sometimes, however, the opposite is needed: making a URL that currently ranks stop appearing in the results. This is what is known as de-indexing a URL in Google.
What does "de-index a URL" mean?
De-indexing a URL is the process of removing a page from Google's index so that it stops appearing on the results pages.
Reasons to de-index a URL
Although at first glance it may not seem sensible to remove a page that already ranks in Google, there are reasons why doing so is recommended, such as:
- Obsolete pages. URLs that are no longer useful because their content is outdated.
- Low-quality content. Pages with very low-quality content (thin content) that you do not want to update.
- Pages indexed by mistake. Temporary pages that get indexed accidentally, for example when a template is applied to a website and it takes time to adapt all of its content.
- Duplicate content. When two pages share the same content, de-indexing one of them avoids a penalty from Google.
3 ways to de-index a URL
There are different ways to remove a URL from the Google index. Here are three of the most common:
1. De-index a URL from Google Search Console
Search Console is a free Google tool that allows you to perform various tasks, including requesting the removal of your own site's URLs from the search results.
To remove a page from the index, go to the removal option in Google Search Console and enter its URL. You can choose between clearing the cached copy, removing it directly from the search results (de-indexing it), or both at the same time.
2. De-index a URL with the noindex meta robots tag
The noindex meta tag can be used to instruct Google to remove a page from its index. The tag has the following format: <meta name="robots" content="noindex">, and it must be added to the HTML of every page that you do not want indexed or want to de-index.
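As a minimal sketch of where the tag goes (the title and body text are placeholders), the directive is placed inside the page's <head>:

```html
<!DOCTYPE html>
<html>
  <head>
    <!-- Tells search engine crawlers not to include this page in their index -->
    <meta name="robots" content="noindex">
    <title>Page to be removed from the index</title>
  </head>
  <body>
    <p>Page content…</p>
  </body>
</html>
```

Keep in mind that Google must be able to crawl the page in order to see this tag, so the same URL should not be blocked in robots.txt at the same time.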
3. De-index a URL with Disallow in robots.txt
The robots.txt file is used to tell Google's bots which pages they may and may not crawl. Using the Disallow directive in this file, you can keep pages out of the SERPs, although what it really does is block Google's crawler from accessing them; a blocked page can still remain indexed if other sites link to it.
To correctly de-index a URL, it is better to use the two previous methods, and reserve the robots.txt Disallow for preventing pages from being indexed in the first place.
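As an illustrative sketch (the /drafts/ path is a placeholder), a Disallow rule in robots.txt looks like this:

```
User-agent: *
# Block all crawlers from accessing anything under /drafts/
Disallow: /drafts/
```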
4. Other options
There are other options to de-index a URL in Google, such as returning a 410 status code (the page has been permanently removed) or leaving it as a 404 error (not found), for instance.
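As a minimal sketch, assuming an nginx server and a hypothetical /obsolete-page/ path, the removed URL could be made to return a 410 like this:

```nginx
# Return "410 Gone" for a page that has been permanently removed,
# so crawlers drop it from the index on their next visit.
location = /obsolete-page/ {
    return 410;
}
```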
By submitting a sitemap after the URLs have been set to return 410 or 404 codes, many URLs can be de-indexed in bulk, so you do not have to remove them one by one in Google Search Console.
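As an illustrative sketch (example.com and the paths are placeholders), such a sitemap simply lists the removed URLs so that Google recrawls them and sees the 410 or 404 response:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- These URLs now return 410/404; listing them prompts a recrawl -->
  <url>
    <loc>https://www.example.com/obsolete-page/</loc>
  </url>
  <url>
    <loc>https://www.example.com/old-promo/</loc>
  </url>
</urlset>
```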
Although it may not seem like it at first, de-indexing pages from Google is very useful in many circumstances, since it may be necessary to remove a URL that is negatively affecting the site's search positioning.
From Google Search Console you can easily de-index any URL of a website, blog, or online store.