
Understanding and Managing Your robots.txt File with XML Sitemap Generator for Google

Plugin: XML Sitemap Generator for Google | Sep 16, 2025 | 14 related threads


A common point of confusion for users of the 'XML Sitemap Generator for Google' plugin is how it interacts with the WordPress robots.txt file. Many users want to add custom rules, like blocking search engines from specific directories, but are unsure how to proceed. This guide explains the two types of robots.txt files and how to manage them effectively with this plugin.

The Core of the Confusion: Virtual vs. Physical robots.txt

The confusion often stems from the plugin's setting labeled "Add sitemap URL to the virtual robots.txt file." The description for this option states: "The virtual robots.txt generated by WordPress is used. A real robots.txt file must NOT exist in the site directory!"

Let's break down what this means:

  • Virtual robots.txt: WordPress generates this file dynamically on each request. It does not exist in your website's file structure, so you will not find it via FTP. Its content is built by WordPress core and can be modified by plugins through the robots_txt filter. The 'XML Sitemap Generator for Google' plugin uses this mechanism to append your sitemap URL to the virtual output (see the example after this list).
  • Physical robots.txt: This is a static text file physically located in the root directory of your website (e.g., public_html/robots.txt). You can create and edit this file directly via your web host's file manager or FTP.
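For context, here is roughly what the virtual output looks like on a public site running a recent WordPress version with no customizations. Treat this as an illustration: the exact Sitemap line (if any) depends on your WordPress version and which sitemap plugin is active.

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Sitemap: https://yourwebsite.com/wp-sitemap.xml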

The plugin's note means that its virtual method only works if a physical robots.txt file does not already exist. If a physical file is present, it is served instead of the virtual one (usually directly by the web server, before WordPress even loads), so the plugin cannot add the sitemap line to it.
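If you are unsure which kind of robots.txt your site is serving, you can check for a physical file from code. This is a minimal sketch, assuming you are comfortable adding a snippet to functions.php; the admin notice and its wording are illustrative, not part of the plugin.

// Show a dashboard warning if a physical robots.txt exists,
// since that file prevents WordPress from serving the virtual one.
add_action( 'admin_notices', function () {
    if ( file_exists( ABSPATH . 'robots.txt' ) ) {
        echo '<div class="notice notice-warning"><p>';
        echo 'A physical robots.txt was found in the site root; the virtual robots.txt (and any sitemap line added to it) is not being served.';
        echo '</p></div>';
    }
} );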

How to Add Custom robots.txt Rules

Based on the community discussions, there are two primary methods to add custom rules like Disallow: /wp-content/themes/ or Disallow: /images/.

Solution 1: Use a Physical robots.txt File (Recommended for Customization)

This is the most straightforward method if you need full control over your robots.txt content.

  1. In the 'XML Sitemap Generator for Google' settings, uncheck the option "Add sitemap URL to the virtual robots.txt file."
  2. Using your web host's file manager or an FTP client, create a new text file named robots.txt in your website's root directory (the same folder that contains wp-config.php).
  3. Edit this new file and add all your desired rules. You must manually add your sitemap URL to this file. A basic example would be:
    User-agent: *
    Disallow: /wp-admin/
    Allow: /wp-admin/admin-ajax.php
    Disallow: /wp-content/themes/
    Disallow: /images/
    
    Sitemap: https://yourwebsite.com/sitemap.xml
    
  4. Save the file (and, if you edited it locally, upload it to your server via FTP). You can verify the result by visiting https://yourwebsite.com/robots.txt in a browser.

This method gives you complete control and ensures there is no conflict with the plugin.

Solution 2: Use a Filter in WordPress (Advanced)

For users who are comfortable adding code to their site, you can use the WordPress robots_txt filter to modify the virtual output. This requires adding code to your theme's functions.php file or a custom functionality plugin.

function my_custom_robots_rules( $output, $public ) {
    // Append custom rules to the existing virtual output.
    // $public is '1' when the site is visible to search engines.
    $output .= "\nDisallow: /wp-content/themes/\n";
    $output .= "Disallow: /images/\n";
    return $output;
}
// Priority 10, two accepted arguments ($output and $public).
add_filter( 'robots_txt', 'my_custom_robots_rules', 10, 2 );

Important Note: As noted in one thread, the 'XML Sitemap Generator for Google' plugin hooks into this filter with a very late priority. In some cases, it might overwrite custom rules added by other plugins or your theme. If you encounter this, you may need to adjust the priority of your own filter.
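If that happens, re-register your callback with a larger priority number so it runs later than the plugin's callback and your rules are appended last. The value 99 below is an assumption; the priority the plugin actually uses is not documented here, so you may need a higher number (up to PHP_INT_MAX).

// A higher priority number means the callback runs later. 99 is a
// guess; raise it if your rules are still being overwritten.
add_filter( 'robots_txt', 'my_custom_robots_rules', 99, 2 );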

Key Takeaways

  • The plugin itself does not create a full robots.txt file. It only adds the sitemap URL to WordPress's virtual output.
  • You cannot directly edit the virtual robots.txt content from the plugin's admin screen. Customization requires either creating a physical file or using a code filter.
  • If you have a physical robots.txt file, you must disable the plugin's virtual option and manage your sitemap URL manually within that physical file.

By understanding the difference between virtual and physical files, you can choose the method that best fits your technical comfort level and website needs.
