Troubleshooting Yoast SEO: Fixing Robots.txt and Sitemap Issues in Google Search Console
Many WordPress users rely on the Yoast SEO plugin to manage their technical SEO, but sometimes encounter frustrating errors in Google Search Console (GSC) related to their robots.txt file and XML sitemaps. These issues can prevent search engines from properly crawling and indexing a site. Based on common community reports, this guide explains the most frequent problems and their solutions.
Common Robots.txt and Sitemap Errors
Users often report the following messages in Google Search Console:
- "Robots.txt unreachable" or "could not be fetched"
- "Your Sitemap appears to be an HTML page"
- General failures to fetch the sitemap_index.xml
Why Do These Errors Happen?
These errors are rarely due to a fundamental bug in the Yoast SEO plugin itself. Instead, they are typically caused by external factors that interfere with how search engine bots access these crucial files. Common causes include:
- Caching: Overly aggressive caching by plugins, servers, or services like Cloudflare can serve an old, incorrect, or even HTML version of the sitemap or robots.txt file.
- Configuration Changes: Recent changes to your site, such as moving to a new host, switching from HTTP to HTTPS, or reconfiguring DNS (e.g., moving to Cloudflare), can cause discrepancies.
- Conflicting Instructions: A misconfigured robots.txt file that blocks access (e.g., Disallow: /) will prevent all crawling.
- Google's Crawler Delays: Sometimes, the error is transient on Google's end and resolves itself with time.
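As a rough illustration of how these causes surface, the sketch below maps a fetched response to the symptoms listed above. The function name diagnose_response and the sample values are hypothetical, and the HTML check is a simple heuristic rather than the logic Google actually applies.

```python
from typing import List


def diagnose_response(status: int, content_type: str, body: str) -> List[str]:
    """Map a fetched robots.txt or sitemap response to likely symptoms."""
    findings = []
    # Anything other than 200 can surface as "unreachable" or "could not be fetched".
    if status != 200:
        findings.append(f"HTTP {status}: crawlers may report the file as unreachable")
    # A cached or error page is usually HTML, which is not valid for either file.
    head = body.lstrip()[:200].lower()
    if "html" in content_type.lower() or head.startswith(("<!doctype html", "<html")):
        findings.append("Response looks like HTML, not plain text or XML")
    return findings


# Hypothetical sample: a cached HTML page served where the sitemap should be.
print(diagnose_response(200, "text/html; charset=UTF-8", "<!DOCTYPE html><html>..."))
# A healthy sitemap response yields an empty list.
print(diagnose_response(200, "text/xml; charset=UTF-8", "<sitemapindex/>"))
```

The status code, Content-Type header, and body can be taken from any HTTP client or from your browser's developer tools.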
Step-by-Step Troubleshooting Guide
1. Clear All Caches
This is the most critical first step. You must clear every cache that could be serving a stale file.
- Plugin Cache: Clear the cache from any caching plugin you use (e.g., LiteSpeed Cache, W3 Total Cache, WP Optimize).
- Server Cache: Contact your web host to clear any server-level caching (e.g., Varnish, Memcached).
- CDN Cache: If you use a Content Delivery Network like Cloudflare, purge its cache through the CDN's dashboard.
- Browser Cache: Clear your own browser's cache and cookies or test in an incognito/private window.
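To see which caching layer may be involved before purging, it can help to look at the response headers of robots.txt or the sitemap. The sketch below is a minimal example; the helper name cache_hints and the header list are assumptions chosen for illustration, and your stack may use different headers.

```python
from typing import Dict, Mapping

# Headers that often reveal a caching layer. Illustrative, not exhaustive.
CACHE_HEADERS = (
    "age",
    "cache-control",
    "cf-cache-status",    # Cloudflare
    "x-cache",            # many CDNs and proxies
    "x-litespeed-cache",  # LiteSpeed
    "x-varnish",          # Varnish
)


def cache_hints(headers: Mapping[str, str]) -> Dict[str, str]:
    """Return the cache-related headers present in a response, keyed in lowercase."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {name: lowered[name] for name in CACHE_HEADERS if name in lowered}


# Hypothetical headers copied from a sitemap response.
sample_headers = {"Content-Type": "text/xml", "CF-Cache-Status": "HIT", "Age": "8640"}
print(cache_hints(sample_headers))
```

A hit status combined with a large Age value suggests the file is being served from a cache rather than generated fresh.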
2. Verify and Reset the Robots.txt File
First, manually check your robots.txt file by visiting yoursite.com/robots.txt. If it is missing, contains errors, or has a Disallow: / directive, you need to fix it in the Yoast SEO settings.
- In WordPress, go to Yoast SEO > Tools > File editor.
- Edit the robots.txt file in the text editor. A standard, permissive configuration looks like this:
    # START YOAST BLOCK
    # ---------------------------
    User-agent: *
    Disallow:

    Sitemap: https://www.yoursite.com/sitemap_index.xml
    # ---------------------------
    # END YOAST BLOCK

- Save changes and clear your caches again.
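One way to confirm that a robots.txt file does not block crawling is to test it with Python's standard urllib.robotparser module. This is a minimal sketch: the helper name can_crawl and the sample URL are hypothetical, and Python's parser may differ from Google's in edge cases.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# An empty Disallow value permits everything; "Disallow: /" blocks everything.
permissive = "User-agent: *\nDisallow:\n"
blocking = "User-agent: *\nDisallow: /\n"

print(can_crawl(permissive, "https://www.yoursite.com/sample-page/"))  # True
print(can_crawl(blocking, "https://www.yoursite.com/sample-page/"))    # False
```

Paste the text served at yoursite.com/robots.txt into the first argument to check the live file.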
3. Verify and Resubmit Your Sitemap
Manually visit your sitemap by going to Yoast SEO > Settings > General > Site Features > XML Sitemaps and clicking "View the XML sitemap". Ensure it loads as XML code, not as an HTML page.
- In Google Search Console, remove any previously submitted sitemaps.
- Only resubmit the main index file: https://www.yoursite.com/sitemap_index.xml. Do not submit individual sitemaps (e.g., page-sitemap.xml).
- After resubmitting, be patient. Google Search Console is not real-time, and it can take several days to weeks to recrawl and update the status.
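To check that the index really is XML rather than an HTML page, you can parse a saved copy of it. The sketch below assumes the file follows the sitemaps.org sitemap index format; the function name child_sitemaps and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def child_sitemaps(xml_text: str) -> List[str]:
    """Return the <loc> URLs of a sitemap index, or raise ValueError if it is not one."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        raise ValueError(f"not well-formed XML (possibly an HTML page): {exc}") from exc
    if root.tag != f"{{{SITEMAP_NS}}}sitemapindex":
        raise ValueError(f"unexpected root element: {root.tag}")
    return [loc.text.strip() for loc in root.iter(f"{{{SITEMAP_NS}}}loc") if loc.text]


# Hypothetical sitemap index with two child sitemaps.
sample = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://www.yoursite.com/post-sitemap.xml</loc></sitemap>
  <sitemap><loc>https://www.yoursite.com/page-sitemap.xml</loc></sitemap>
</sitemapindex>"""
print(child_sitemaps(sample))
```

A ValueError here points to the same problem Google reports as "Your Sitemap appears to be an HTML page".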
4. Check for HTTPS and WordPress Settings
If your sitemap URLs are showing HTTP instead of HTTPS, it usually means your WordPress settings are incorrect.
- Go to WordPress Dashboard > Settings > General.
- Ensure both the WordPress Address (URL) and Site Address (URL) begin with https://.
- Save the settings.
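A quick way to spot leftover HTTP entries is to scan a saved sitemap for URLs that still start with http://. This is a minimal sketch; the function name insecure_locs and the sample data are hypothetical, and it matches any element named loc regardless of namespace.

```python
import xml.etree.ElementTree as ET
from typing import List


def insecure_locs(xml_text: str) -> List[str]:
    """Return every <loc> URL in a sitemap that uses http:// instead of https://."""
    root = ET.fromstring(xml_text)
    return [
        el.text.strip()
        for el in root.iter()
        if el.tag.endswith("}loc")
        and el.text
        and el.text.strip().lower().startswith("http://")
    ]


# Hypothetical sitemap where one URL was generated with the old HTTP address.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.yoursite.com/old-page/</loc></url>
  <url><loc>https://www.yoursite.com/new-page/</loc></url>
</urlset>"""
print(insecure_locs(sample))
```

An empty list after saving the settings and clearing caches suggests the sitemap is now generated with the HTTPS address.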
5. Temporarily Disable Features
As a test, try temporarily disabling the XML sitemaps feature in Yoast SEO and then re-enabling it. This can sometimes reset the sitemap generation.
- Go to Yoast SEO > Settings > General > Site Features.
- Toggle the "XML Sitemaps" switch off and save changes.
- Then, toggle it back on and save changes again.
When to Look Elsewhere
If you have followed all the steps above and the errors persist, the issue likely lies outside of the Yoast SEO plugin.
- Hosting Issues: Contact your web hosting provider. Ask them to check for server-level firewalls, mod_security rules, or other configurations that might be blocking access to robots.txt or sitemap_index.xml for Googlebot's user-agent.
- Plugin Conflicts: Try disabling all other plugins except Yoast SEO to see if the issue resolves. If it does, reactivate them one by one to find the culprit.
- .htaccess File: Check your site's root .htaccess file for any rules that might be redirecting or blocking access to these files.
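For the .htaccess check, a simple text scan can narrow down which lines to review. The sketch below is only a heuristic: the function name suspicious_rules, the keyword pattern, and the sample rules are all hypothetical, and a match means the line deserves a look, not that it is wrong.

```python
import re
from typing import List, Tuple

# Keywords worth reviewing; this heuristic does not interpret Apache directives.
PATTERN = re.compile(r"robots\.txt|sitemap|googlebot", re.IGNORECASE)


def suspicious_rules(htaccess_text: str) -> List[Tuple[int, str]]:
    """Return (line number, line) for non-comment lines that mention the keywords."""
    hits = []
    for number, line in enumerate(htaccess_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped and not stripped.startswith("#") and PATTERN.search(stripped):
            hits.append((number, stripped))
    return hits


# Hypothetical .htaccess fragment that denies the sitemap index to Googlebot.
sample = r"""# Custom rules
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} Googlebot [NC]
RewriteRule ^sitemap_index\.xml$ - [F]
"""
for number, rule in suspicious_rules(sample):
    print(number, rule)
```

Share any flagged lines with your host if you are unsure what they do, since removing rewrite rules blindly can break other parts of the site.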
By methodically working through these steps, you can resolve most robots.txt and sitemap availability errors and get your site back on track for being properly indexed by search engines.
Related Support Threads
- Problem with load favicon in google search results: https://wordpress.org/support/topic/problem-with-load-favicon-in-google-search-results/
- Your Sitemap appears to be an HTML page. Please use a supported sitemap: https://wordpress.org/support/topic/your-sitemap-appears-to-be-an-html-page-please-use-a-supported-sitemap-2/
- Include favicon to the sitemap: https://wordpress.org/support/topic/include-favicon-to-the-sitemap/
- robot.txt issue: https://wordpress.org/support/topic/robot-txt-issue-2/
- website getting redirect to google search console: https://wordpress.org/support/topic/website-getting-redirect-to-google-search-console/
- Non-canonical page in sitemap, http not https: https://wordpress.org/support/topic/non-canonical-page-in-sitemap-http-not-https/
- Robots.txt is not available: https://wordpress.org/support/topic/robots-txt-is-not-available/
- An issue with Robots.txt through subdomains: https://wordpress.org/support/topic/an-issue-with-robots-txt-through-subdomains/
- Error with Sitemap and Google Search Console: https://wordpress.org/support/topic/error-with-sitemap-and-google-search-console/
- Google can’t fetch Yoast’s sitemap_index.xml: https://wordpress.org/support/topic/google-cant-fetch-yoasts-sitemap_index-xml/
- No information is available for this page. Learn why: https://wordpress.org/support/topic/no-information-is-available-for-this-page-learn-why/
- Acces to my robots.txt file is blocked for no reason: https://wordpress.org/support/topic/acces-to-my-robots-txt-file-is-blocked-for-no-reason/
- Google Search Console Error With Yoast Sitemap: https://wordpress.org/support/topic/google-search-console-error-with-yoast-sitemap/
- robot.txt blocking but all looks fine: https://wordpress.org/support/topic/robot-txt-blocking-but-all-looks-fine/
- Sitemap Html error: https://wordpress.org/support/topic/sitemap-html-error/
- Lighthouse is not able to fetch robots.txt: https://wordpress.org/support/topic/lighthouse-is-not-able-to-fetch-robots-txt/
- Yoast sitemap identified as html: https://wordpress.org/support/topic/yoast-sitemap-identified-as-html/
- Google can’t fetch Yoast’s sitemap_index.xml: https://wordpress.org/support/topic/google-cant-fetch-yoasts-sitemap_index-xml-2/
- Robots.txt not updating: https://wordpress.org/support/topic/robots-txt-not-updating-2/
- Your Sitemap appears to be an HTML page. Please use a supported sitemap: https://wordpress.org/support/topic/your-sitemap-appears-to-be-an-html-page-please-use-a-supported-sitemap/
- Robots.txt unreachable for Google Search Console: https://wordpress.org/support/topic/robots-txt-unreachable-for-google-search-console/
- Google Search Console Blocked: Robots.txt unreachable: https://wordpress.org/support/topic/google-search-console-blocked-robots-txt-unreachable/
- Robots.txt and sitemap.xml unreachable: https://wordpress.org/support/topic/robots-txt-and-sitemap-xml-unreachable/
- sitemap recognised as html file: https://wordpress.org/support/topic/sitemap-recognisez-as-html-file/
- Host had problems last week – Server connectivity – High fail rate last week: https://wordpress.org/support/topic/host-had-problems-last-week-server-connectivity-high-fail-rate-last-week/