Google Index Checker:
Getting your web pages indexed by Google (and other search engines) is essential. Pages that are not indexed cannot rank.
How do you see how many pages you have indexed?
- Use the site: operator (for example, searching site:example.com shows pages Google has indexed for that domain).
- Check the status of your XML sitemap submissions in Google Search Console.
- Check your overall index status.
Each of these will give you a different number, but why they differ is another story.
For now, let’s just analyze a decrease in the number of indexed pages reported by Google.
If your pages are not indexed, it may be a sign that Google does not like your pages or cannot easily crawl them. As a result, if your number of indexed pages starts to decrease, it may be because:
- You have been hit with a Google penalty.
- Google thinks your pages are irrelevant.
- Google cannot crawl your pages.
Here are some tips for diagnosing and fixing a decreasing number of indexed pages.
1. Do your pages load correctly?
Make sure they return the correct 200 HTTP header status.
Has the server experienced frequent or long periods of downtime? Did the domain recently expire and get renewed late?
Action item
You can use a free HTTP header status checking tool to determine whether the appropriate status is returned. For very large sites, standard crawling tools such as Xenu, DeepCrawl, Screaming Frog, or Botify are worth trying.
The correct header status is 200. Sometimes 3xx (other than 301), 4xx, or 5xx errors show up; none of these are good news for the URLs you want indexed.
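If you prefer to script the check, here is a minimal sketch assuming Python with the requests library; the URLs are hypothetical placeholders for your own list.

```python
# Minimal HTTP status check for a list of URLs (requires the requests library).
import requests

# Hypothetical URLs; replace with the pages you expect to be indexed.
urls = [
    "https://example.com/",
    "https://example.com/blog/",
]

for url in urls:
    try:
        # allow_redirects=False so 3xx responses are reported as-is.
        response = requests.get(url, allow_redirects=False, timeout=10)
        status = str(response.status_code)
    except requests.RequestException as exc:
        status = f"error ({exc.__class__.__name__})"
    flag = "OK" if status == "200" else "CHECK"
    print(f"{flag:<5} {status:<10} {url}")
```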
2. Have your URLs changed recently?
Sometimes a change to the CMS, the backend programming, or a server setting results in a change of domain, subdomain, or folder, and consequently changes the URLs of a site.
Search engines may remember the old URLs, but if they are not redirected correctly, many pages can end up deindexed.
Action item
Hopefully, a copy of the old site can still be visited in some way so you can take note of all the old URLs and map 301 redirects to their corresponding new URLs. A quick script like the sketch below can confirm that the redirects are in place.
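This is a minimal sketch, again assuming Python with the requests library; the redirect map is a hypothetical example of old-to-new URL pairs.

```python
# Verify that old URLs return a 301 pointing at the expected new URLs
# (requires the requests library).
import requests

# Hypothetical mapping of old URLs to their new locations.
redirect_map = {
    "https://example.com/old-page/": "https://example.com/new-page/",
    "https://old.example.com/": "https://www.example.com/",
}

for old_url, expected in redirect_map.items():
    response = requests.get(old_url, allow_redirects=False, timeout=10)
    location = response.headers.get("Location", "")
    if response.status_code == 301 and location == expected:
        print(f"OK    {old_url} -> {location}")
    else:
        print(f"CHECK {old_url}: status {response.status_code}, Location {location!r}")
```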
3. Have you solved duplicate content issues?
Fixing duplicate content often involves implementing canonical tags, 301 redirects, noindex meta tags, or disallow rules in robots.txt. All of these can lead to a decrease in indexed URLs.
This is one case where a decrease in indexed pages can be a good thing.
Action item
Since this is good for your site, the only thing you need to do is double-check that this is indeed what is causing the decrease in indexed pages and nothing else, as in the spot-check sketched below.
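Here is a rough way to spot-check individual URLs, assuming Python with the requests library; the URL is a hypothetical example, and the string checks are only approximations (a proper HTML parser would be more robust).

```python
# Spot-check whether a dropped URL carries an intentional deduplication signal.
import requests
from urllib.robotparser import RobotFileParser

# Hypothetical URL that disappeared from the index.
url = "https://example.com/some-duplicate-page/"

# Is the URL blocked by robots.txt?
robots = RobotFileParser("https://example.com/robots.txt")
robots.read()
print("Blocked by robots.txt:", not robots.can_fetch("Googlebot", url))

# Rough string checks for noindex and canonical signals in the HTML.
html = requests.get(url, timeout=10).text.lower()
print("Has noindex meta tag:  ", 'content="noindex' in html)
print("Has canonical link tag:", 'rel="canonical"' in html)
```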
4. Are your pages timing out?
Some servers have bandwidth restrictions because of the cost associated with higher bandwidth; these servers may need to be upgraded. Sometimes the problem is hardware-related and can be solved by upgrading your processing power or memory.
Some sites block IP addresses when visitors access too many pages at a certain pace. This setting is a strict way to avoid DDoS attacks, but it can also have a negative impact on your site.
Typically, this is monitored with a pages-per-second setting, and if the threshold is set too low, normal search engine crawling may hit it and the robots cannot crawl the site properly.
Action item
If the server is throttling bandwidth, it might be a good time to upgrade your service.
If it is a server processing/memory issue, then in addition to upgrading the hardware, check whether you have server-side caching in place, which will reduce the load on the server.
If anti-DDoS software is in place, loosen the settings or whitelist Googlebot so that it is never blocked. Beware, though: there are fake Googlebots out there, so make sure you verify Googlebot correctly (a sketch of that check follows). Verifying Bingbot works in a similar way.
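Google's documented method for verifying Googlebot is a reverse DNS lookup followed by a forward lookup confirming the same IP. Here is a minimal sketch in Python using only the standard library, with a hypothetical IP address as the example.

```python
# Verify a claimed Googlebot request via reverse DNS plus forward confirmation.
import socket

def is_real_googlebot(ip_address: str) -> bool:
    try:
        # Reverse lookup: genuine Googlebot IPs resolve to googlebot.com or google.com.
        hostname, _, _ = socket.gethostbyaddr(ip_address)
        if not hostname.endswith((".googlebot.com", ".google.com")):
            return False
        # Forward lookup: the hostname must resolve back to the same IP address.
        return ip_address in socket.gethostbyname_ex(hostname)[2]
    except (socket.herror, socket.gaierror):
        return False

# Example IP pulled from server logs (hypothetical).
print(is_real_googlebot("66.249.66.1"))
```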
5. Do search engine robots see your site differently?
- Sometimes what search engine spiders see is different from what we see.
- Some developers build sites in their preferred way without knowing the SEO implications.
- From time to time, a favorite CMS is used without checking whether it is search engine friendly.
- Sometimes it is done on purpose by an SEO who tries to cloak content in an attempt to game the search engines.
- Other times, the website has been compromised by hackers, who show a different page to Google to promote their hidden links or cloak 301 redirects to their own site.
- In the most serious cases, pages infected with malware are automatically deindexed by Google once detected.
Action item
The Google Search Console Fetch and Render feature is the best way to see whether Googlebot sees the same content as you do.
You can also try running the page through Google Translate even if you have no intention of translating the language, or view Google’s cached version of the page, but there are ways to cloak content from these as well. A rough comparison like the sketch below can also help.
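As a rough first pass, you can fetch the same URL with a browser User-Agent and a Googlebot User-Agent and compare the responses. This is a minimal sketch assuming Python with the requests library and a hypothetical URL; note that sites which verify Googlebot by IP will not be fooled by the header alone, so treat a match as only a partial signal.

```python
# Rough cloaking check: compare responses served to a browser User-Agent
# and a Googlebot User-Agent (requires the requests library).
import requests

# Hypothetical URL to inspect.
url = "https://example.com/"

user_agents = {
    "browser": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}

bodies = {}
for name, ua in user_agents.items():
    response = requests.get(url, headers={"User-Agent": ua}, timeout=10)
    bodies[name] = response.text
    print(f"{name}: status {response.status_code}, {len(response.text)} bytes")

print("Identical content for both user agents:", bodies["browser"] == bodies["googlebot"])
```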