A webpage must go through two steps to appear in Google search: crawling and indexing.
Typically, the process is fast and easy: Google crawls the website and then processes it for indexing.
However, there are situations where Google crawls a webpage but does not index it, preventing it from appearing in search results.
In Google Search Console, these pages carry the “Not Indexed” status type “Crawled – Currently Not Indexed.” Pages that Google has crawled but not indexed appear on virtually every website.
Some may find these figures worrisome; typically, the more pages a website has, the more pages are assigned this status. However, the figures do not reveal the whole picture: there isn’t much to see beyond the number of pages and the list of URLs.
This article will explain the ‘Crawled – Currently Not Indexed’ status in Google Search Console.
What is “Crawled – Currently Not Indexed”?
Crawled – Currently Not Indexed is a status category in Google Search Console under Excluded. Google has previously crawled the URLs listed under this status but opted not to index them. According to the Google Search Console Help Center, Google may still decide to index these URLs in the future.
If you open your Google Search Console account and see pages such as tags, categories, archives, and feed, there is no need to worry. However, if you notice significant pages like landing pages, product pages, and blog posts here, you may need to start evaluating these pages separately.
URL inspection tool
The URL Inspection Tool in Google Search Console can also notify you of URLs that have been Crawled but are not yet indexed.
The top area of the tool tells you whether the URL can be located on Google. For example, if the examined URL is in the Index Coverage report’s Excluded category, the URL Inspection Tool will report: “The page is not in the index, but not due to an error.”
More detailed information regarding the current Coverage status of the inspected URL can be seen below — in the instance above, the URL was Crawled – currently not indexed.
Are Your Webpages Crawled but Not Indexed? Let’s Fix That!
Don’t miss out on potential traffic and visibility. Resolve the “Crawled – Currently Not Indexed” status in Google Search Console with Linkilo’s expert SEO guidance. Automate internal link suggestions, avoid link cannibalization, and achieve better SEO performance with our powerful tool.
Unlock Your Content’s Full Potential! Experience seamless internal linking and elevate your website’s indexing potential!
Why is Google not indexing these pages?
When Google first crawls a website or a single page, it takes time for Google to process the collected data. There are several factors to consider, but the main one is the website’s size.
If a website contains hundreds of pages and constantly adds new ones, Google must limit the number of pages it indexes and prioritize which pages to index first. Additional pages will ultimately be indexed if the other criteria are met.
Lack of importance
Google’s algorithm is intelligent enough to determine whether a given page is valuable to users. If Google decides not to index a page, it is simply saying the page is not significant for people at the moment; it will reconsider the page the next time it crawls it.
Thin content
Google may opt not to index a page if its content is insufficient, commonly known as thin content. A page with thin content is thought to give little value to users. As noted earlier, Google aims to use its resources efficiently, so it will prioritize crawling other pages that bring value to users.
How to resolve crawled – currently not indexed status
Unlike statuses under Errors, where you can manually verify that fixes have been made, there is no way to manually inform Google that you’ve improved the pages under Excluded.
According to Google Search Console Help, there is no need for human requests to re-index pages with this status because Google will ultimately reassess the content.
However, if you want to ensure that Google indexes these sites the next time they crawl them, follow these steps.
Improve content
Improving a web page’s content entails more than increasing the word count; it means providing genuinely useful information to users when they visit your website. Consider how a particular page contributes to the user’s journey on your website.
This applies to pages you want users to see. If they are unneeded pages, such as archives and feed pages, it is perfectly fine to leave them alone. Alternatively, you might consider blocking them from being crawled to conserve crawl budget.
Duplicate content
Check your website for content that duplicates the pages you’re trying to index. If you discover any, add a canonical tag on the duplicate pages pointing to the original content you wish to index.
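As a minimal sketch, a canonical tag is a `link` element placed in the duplicate page’s `<head>` that points at the URL you want indexed (the domain and path here are hypothetical):

```html
<!-- In the <head> of the duplicate page -->
<link rel="canonical" href="https://www.example.com/original-article/" />
```

Google treats the canonical as a strong hint, not a command, so keep the duplicate and original pages substantially similar for the hint to be honored.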
Content that violates copyright
This is the classic case of plagiarism: someone copies someone else’s content (even if it is in another language) and makes minor changes to “hide” the fact.
Legal repercussions aside, such a page breaks Google’s guidelines and will not be indexed.
In addition, the same pattern of excluding duplicate pages is used when Google decides whether to accept a site into its partner programs, such as Google AdSense.
So keep your blog 100 percent unique and free of plagiarism; plagiarized content on your site may well be the cause of your indexing problems.
Increase internal links
Increasing the number of internal links to the webpage solves two problems: first, it makes Google crawl the page more frequently, and second, it gives the page more importance. If you have other content on your website, like blog posts, I strongly advise you to include a few internal links to pages Google has crawled but not yet indexed.
Reduce click-depth
The number of clicks a user needs to arrive at a specific page is called click depth. If a person has to click several times to reach the intended page, this is bad for the user experience, and Google may deem the page unimportant. Keeping important pages within one or two clicks is a good rule of thumb; I wouldn’t go past four clicks, because that is already too deep.
JavaScript rendering
JavaScript is a difficult one to diagnose. Without understanding the code and how a crawler works, it’s hard to tell precisely how Google renders your page or how much render time the crawler allows your website.
In general, crucial components and content should be rendered within 5 seconds. However, it’s likely to be an issue if your web page uses many resources, has a lot of render-blocking scripts, or has to make many API requests. In other words, you may be unable to render essential sections of your website before the crawler times out.
This is often referred to as partial rendering. If you don’t have the technical know-how, you can work with a technical SEO professional and developer to figure out how to reduce JavaScript resources on page load.
If the JavaScript renders on the client side, you can disable it in the browser settings or install one of several popular Chrome extensions to see what the web page looks like without JavaScript. Are there any significant differences? If, for example, the navigation or large blocks of text are missing, Google may not be rendering the page correctly.
You can also use Google’s PageSpeed Insights and the Network tab in Chrome Developer Tools to see which JavaScript resources are loaded. The waterfall graphic displays loading priority as well as file/script sizes. There are several methods for resolving JavaScript issues, but when it comes to fixing JavaScript errors, your developer will always be your best friend.
RSS feed URLs
RSS feeds are obsolete. When was the last time you subscribed to a blog through its RSS feed? Maybe in 2007? Popular CMS platforms, such as WordPress, duplicate your content by appending the ‘/feed’ suffix to the end of the URL.
On the one hand, Google is sophisticated enough these days to disregard ‘/feed’ URLs, so they shouldn’t directly influence the SEO of your website.
On the other hand, ‘/feed’ URLs present a perfect opportunity for spammers to crawl your website and potentially point low-quality directory links or other nefarious activity at it. Ideally, you should prevent feed URLs from being created by installing a plugin, and configure a robots.txt directive to stop crawlers from crawling ‘/feed’ URLs.
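A minimal robots.txt fragment for this might look like the following; it assumes WordPress-style feed URLs, and the `*` wildcard is supported by Googlebot:

```text
# Block crawling of WordPress feed URLs
User-agent: *
Disallow: /feed/
Disallow: /*/feed/
```

Note that robots.txt only blocks crawling, not indexing, so pair it with a plugin that stops the feed URLs from being generated in the first place.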
Paginated URLs
Pagination can be a big problem for SEO if done wrong. Most importantly, look at which paginated pages appear in the ‘Crawled – currently not indexed’ report. For example, if it’s Author paginated pages, I wouldn’t be too concerned unless Author pages are crucial to your business.
If you want bots to stop crawling author pages, you can add a robots meta tag to noindex the pages, and later follow it with a directive in your robots.txt file.
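For illustration, the robots meta tag that keeps a page out of the index looks like this. Google has to be able to crawl the page to see the tag, so only add a robots.txt disallow after the pages have dropped out of the index:

```html
<!-- In the <head> of each author page -->
<meta name="robots" content="noindex, follow" />
```

The `follow` value lets link equity keep flowing through the page even though the page itself stays unindexed.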
Above all, you want to ensure that web spiders can access your website’s pagination through normal <a href=""> HTML tags, not ones that are unduly reliant on JavaScript rendering. And, for the love of all that is good in the world, please refrain from using infinite scrolling; search bots dislike it.
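Crawlable pagination can be as plain as the following sketch (the URLs are illustrative); each page is reachable through a standard anchor element rather than a JavaScript click handler:

```html
<nav aria-label="Pagination">
  <a href="/blog/page/1/">1</a>
  <a href="/blog/page/2/">2</a>
  <a href="/blog/page/3/">3</a>
  <a href="/blog/page/2/">Next</a>
</nav>
```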
Out-of-stock products
If you own an e-commerce store, you should check the availability of product pages on your website. Google can occasionally deindex unavailable products to make SERPs as relevant as possible for end-users.
In this case, first confirm that the products really are unavailable or out of stock. If they are not, try submitting an indexing request for the product page through the GSC account associated with your website, and ensure the page is included in your shop’s product sitemap.
Document files
Consider if a PDF, Excel, Word, or PowerPoint file type URL in a status report should be indexed. That is totally up to you and the goals of your website.
To see if your website has any files indexed by Google, type the following query into Search (without the quotation marks):
“site:yourdomain.com filetype:pdf”
or swap ‘pdf’ for one of:
- xls
- xlsx
- doc
- docx
- svg
- txt
- ppt
- pptx
If any files come up, you have to decide whether or not to index them. Hopefully, it won’t be an internal board report or your company’s quarterly financials. You should remove any sensitive information as quickly as possible.
URLs with query strings
Different implementations of your website can result in query string URLs. Internal search, faceted navigation, and pagination are the most common.
A clothing eCommerce site, for example, can let visitors filter products using a faceted menu. In other words, if I’m shopping for boots, I can narrow down the products on the page using that menu. Selecting a color, size, and lace style might append the following elements to the URL:
https://www.ecommercesite.com/men/shoes/boots?color=black&size=11&shoelace
Depending on how many filters are in the faceted menu, the parameters can generate millions of URL variations. That’s not ideal from a technical SEO standpoint.
As a result, it is essential to identify which query string URLs show up in the ‘Crawled – currently not indexed’ report. It is better to have a strategy for dealing with query string URLs than to rely on Google. After all, letting query string parameters get indexed will negatively influence your crawl budget and create significant index bloat.
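One common approach, sketched here as an illustration rather than a universal fix, is to disallow crawling of the specific filter parameters in robots.txt (the parameter names are hypothetical and should match your own faceted menu):

```text
# Keep crawlers out of faceted-filter URL variations
User-agent: *
Disallow: /*?color=
Disallow: /*?size=
Disallow: /*&color=
Disallow: /*&size=
```

Be careful not to block parameters that produce pages you actually want indexed, such as a paginated `?page=` series you rely on for discovery.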
Content audit
A website will have some obsolete content at some point in time. A content audit will help you identify pages that lack value and improve the page. In circumstances where adding more content to the page is unsuitable, consider alternatives such as:
- Removing the page entirely
- Adding noindex meta tag
- Redirecting to a more relevant page
When you apply a noindex meta tag, the content remains on your website for your audience, but search engines are told not to consider it for indexing. According to Google’s John Mueller, site quality is only considered for pages intended to be indexed.
You can enhance the overall site quality by methodically deleting low-quality pages from the index. Site quality does not change overnight. It takes time for Google to pick up the signals, reprocess them, and reassess the overall quality of your site.
Your website is only a few months old.
If your site is still relatively young, this might be the source of your Google indexing issues.
As we saw earlier, having a good network of well-established internal links is one of the best ways to ensure your site’s pages get indexed by Google.
And in the case of a newly launched site, there aren’t enough pages on which to build this network of links, which might lead to indexing issues.
These are only the most common causes, and while there are many other possibilities, it would be impossible and pointless to list them all here.
With the possible causes for your problem identified, let’s move on to the practical: what you should do to try to reverse the indexing problem and have Google rank your page.
Difference between Crawled – Currently Not Indexed and Discovered – Currently Not Indexed
“Crawled – Currently Not Indexed” and “Discovered – Currently Not Indexed” are two different statuses that might confuse some people. The primary difference is that under “Crawled – Currently Not Indexed,” Google has previously identified and crawled the page and has decided not to index it.
With “Discovered – Currently Not Indexed,” Google discovered the page through links on other pages but chose not to crawl it, and hence not to index it.
This means that pages marked “Discovered – Currently Not Indexed” are less important to Google than those marked “Crawled – Currently Not Indexed.”
Validate the fix for Crawled – currently not indexed
After you’ve done everything you can to improve your content quality, you can hit “Validate Fix,” and Google will “eventually” get around to validating those fixes:
After you click on “Validate Fix,” you will see the button changed to “See Details”:
Key takeaway
In summary, the ‘Crawled – Currently Not Indexed’ status type in Google Search Console may not give much information, but it is helpful to know which areas of our websites Google neglects. This gives us SEOs an opportunity to optimize.
It is also worth noting that this status does not require immediate attention unless your crucial pages are listed here. Remember to give Google a reason to index the page and rank it in search results the next time it crawls the website.
If you submitted a URL to Google Search Console and received the message Crawled – Currently Not Indexed, that implies Google crawled but did not index the page. As a result, the URL will not appear in search results for the time being.
The Crawled – Currently Not Indexed notification does not mean that your page has any issues. It is unnecessary to resubmit the URL to be crawled and indexed. It may or may not be indexed in the future, depending on Google’s decision.
If you believe this page is important enough to appear in search results and the live test displays no errors, you can request indexing.
Ready to Boost Your Website’s Visibility in Search Results?
Take control of your website’s indexing status with Linkilo. Our intelligent internal linking plugin analyzes your content, provides relevant link suggestions, and helps prevent SEO penalties. Make informed decisions for better SEO results and drive more organic traffic to your site.
Get Started with Linkilo Today! Take the first step towards higher search rankings and improved user experience!