How to Configure Robots.txt in Magento 2
October 02, 2020
Configuring robots.txt matters for any website that works on its SEO. In particular, when you configure the sitemap to allow search engines to index your store, you also need to give web crawlers instructions in the robots.txt file so that they do not crawl the pages you have disallowed. The robots.txt file, which resides in the root of your Magento installation, is a set of directives that search engines such as Google, Yahoo, and Bing recognize and follow. In this post, I will walk through how to configure the robots.txt file so that it works well with your site.
Steps to Configure robots.txt in Magento 2
- On the Admin panel, click Stores. In the Settings section, select Configuration.
- Select Design under General in the panel on the left.
- Open the Search Engine Robots section, and continue with the following:
  - In Default Robots, select one of the following:
    - INDEX, FOLLOW
    - NOINDEX, FOLLOW
    - INDEX, NOFOLLOW
    - NOINDEX, NOFOLLOW
  - In the Edit Custom instruction of robots.txt File field, enter custom instructions if needed.
  - In the Reset to Defaults field, click the Reset to Default button if you need to restore the default instructions.
- When complete, click Save Config.
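The Default Robots value chosen in these steps is typically emitted as a `<meta name="robots">` tag in the head of each page. As a rough way to check the setting after saving, the Python sketch below extracts that tag from a page's HTML. The `SAMPLE_PAGE` markup and the `meta_robots` helper are illustrative assumptions, not actual Magento output or a Magento API; in practice you would pass in the HTML of one of your own store pages.

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    """Collects the directives from a <meta name="robots"> tag, if present."""

    def __init__(self):
        super().__init__()
        self.directives = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # Normalize "noindex, follow" and "NOINDEX,FOLLOW" to the same list.
            self.directives = [
                part.strip().upper() for part in content.split(",") if part.strip()
            ]


def meta_robots(html):
    """Return the meta robots directives found in html, or None if there are none."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page markup, standing in for a real storefront page.
SAMPLE_PAGE = (
    '<html><head><meta name="robots" content="NOINDEX,FOLLOW"/></head>'
    "<body></body></html>"
)

print(meta_robots(SAMPLE_PAGE))
```

If the printed directives do not match the option you selected, the page may be served from a cache that has not been refreshed yet.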
Examples of Custom Robots.txt file
- Allows Full Access
User-agent:*
Disallow:
- Disallows Access to All Folders
User-agent:*
Disallow: /
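To see how these two rule sets behave, the following minimal sketch feeds each one to Python's standard `urllib.robotparser` module and asks whether a crawler may fetch a URL. The `example.com` URL and the `parse_rules` helper are assumptions made for illustration only.

```python
from urllib.robotparser import RobotFileParser


def parse_rules(robots_txt):
    """Build a parser from robots.txt text instead of downloading the file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# The two examples above, copied as-is.
ALLOW_ALL = """User-agent:*
Disallow:
"""
BLOCK_ALL = """User-agent:*
Disallow: /
"""

# Hypothetical store URL used only for the check.
URL = "https://example.com/women/tops.html"

allow_all = parse_rules(ALLOW_ALL)
block_all = parse_rules(BLOCK_ALL)

print(allow_all.can_fetch("Googlebot", URL))  # an empty Disallow permits everything
print(block_all.can_fetch("Googlebot", URL))  # "Disallow: /" blocks every path
```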
Default Robots.txt for Magento 2
User-agent: *
Disallow: /lib/
Disallow: /*.php$
Disallow: /pkginfo/
Disallow: /report/
Disallow: /var/
Disallow: /catalog/
Disallow: /customer/
Disallow: /sendfriend/
Disallow: /review/
Disallow: /*SID=
Disallow: /*?
# Disable checkout & customer account
Disallow: /checkout/
Disallow: /onestepcheckout/
Disallow: /customer/
Disallow: /customer/account/
Disallow: /customer/account/login/
# Disable Search pages
Disallow: /catalogsearch/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
# Disable common folders
Disallow: /app/
Disallow: /bin/
Disallow: /dev/
Disallow: /lib/
Disallow: /phpserver/
Disallow: /pub/
# Disable Tag & Review (Avoid duplicate content)
Disallow: /tag/
Disallow: /review/
# Common files
Disallow: /composer.json
Disallow: /composer.lock
Disallow: /CONTRIBUTING.md
Disallow: /CONTRIBUTOR_LICENSE_AGREEMENT.html
Disallow: /COPYING.txt
Disallow: /Gruntfile.js
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /nginx.conf.sample
Disallow: /package.json
Disallow: /php.ini.sample
Disallow: /RELEASE_NOTES.txt
# Disable sorting (Avoid duplicate content)
Disallow: /*?*product_list_mode=
Disallow: /*?*product_list_order=
Disallow: /*?*product_list_limit=
Disallow: /*?*product_list_dir=
# Disable version control folders and others
Disallow: /*.git
Disallow: /*.CVS
Disallow: /*.zip$
Disallow: /*.svn$
Disallow: /*.idea$
Disallow: /*.sql$
Disallow: /*.tgz$
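Several of the rules above rely on the `*` wildcard and the `$` end-of-URL anchor, which major search engines support. Python's built-in `urllib.robotparser` only does plain prefix matching, so the sketch below translates a rule into a regular expression to show which paths such patterns catch. It is a simplified illustration under stated assumptions: the helper names `rule_to_regex` and `is_blocked` are made up here, the sample paths are hypothetical, and the code ignores `Allow` rules and longest-match precedence.

```python
import re


def rule_to_regex(rule_path):
    """Translate a robots.txt path pattern into a compiled regular expression.

    '*' matches any run of characters; a trailing '$' anchors the end of the URL.
    """
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape the literal pieces and join them with ".*" where a '*' appeared.
    body = ".*".join(re.escape(piece) for piece in rule_path.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(url_path, disallow_rules):
    """Return True if any non-empty Disallow pattern matches url_path."""
    return any(rule_to_regex(rule).match(url_path) for rule in disallow_rules if rule)


# A few rules taken from the listing above.
RULES = ["/checkout/", "/*?*product_list_mode=", "/*.php$"]

for path in [
    "/checkout/cart/",
    "/shoes.html?product_list_mode=list",
    "/index.php",
    "/index.php/about",
    "/shoes.html",
]:
    print(path, is_blocked(path, RULES))
```

Note that `/*.php$` blocks `/index.php` but not `/index.php/about`, because `$` requires the URL to end right after `.php`.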
More Robots.txt examples
Block Googlebot from a folder
User-agent: Googlebot
Disallow: /subfolder/
Block Googlebot from a page
User-agent: Googlebot
Disallow: /subfolder/page-url.html
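As a quick check of the two Googlebot examples, this sketch uses Python's standard `urllib.robotparser` to confirm that the rules affect only Googlebot and only the listed folder or page. The `example.com` host, the extra file names, and the `can_crawl` helper are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, user_agent, url):
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


BLOCK_FOLDER = "User-agent: Googlebot\nDisallow: /subfolder/\n"
BLOCK_PAGE = "User-agent: Googlebot\nDisallow: /subfolder/page-url.html\n"

BASE = "https://example.com"  # hypothetical store domain

# Folder rule: everything under /subfolder/ is off limits to Googlebot only.
print(can_crawl(BLOCK_FOLDER, "Googlebot", BASE + "/subfolder/anything.html"))
print(can_crawl(BLOCK_FOLDER, "Bingbot", BASE + "/subfolder/anything.html"))

# Page rule: only the single page is blocked; its siblings stay crawlable.
print(can_crawl(BLOCK_PAGE, "Googlebot", BASE + "/subfolder/page-url.html"))
print(can_crawl(BLOCK_PAGE, "Googlebot", BASE + "/subfolder/other-page.html"))
```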
Common Web crawlers (Bots)
Here are some common bots on the internet, listed by the User-agent value to use in robots.txt.
User-agent: Googlebot
User-agent: Googlebot-Image
User-agent: Googlebot-Video
User-agent: Bingbot
User-agent: Slurp # Yahoo
User-agent: DuckDuckBot
User-agent: Baiduspider
User-agent: YandexBot
User-agent: facebot # Facebook
User-agent: ia_archiver # Alexa
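Several `User-agent` lines can share one set of rules when they are listed together above the `Disallow` lines. The sketch below builds such a group for a few of the bots listed here and verifies it with Python's standard `urllib.robotparser`. The `build_group` helper, the chosen paths, and the `example.com` URLs are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser


def build_group(user_agents, disallowed_paths):
    """Return robots.txt text with one group shared by all given user agents."""
    lines = [f"User-agent: {agent}" for agent in user_agents]
    lines += [f"Disallow: {path}" for path in disallowed_paths]
    return "\n".join(lines) + "\n"


robots_txt = build_group(
    ["Googlebot", "Bingbot", "DuckDuckBot"],
    ["/checkout/", "/customer/"],
)
print(robots_txt)

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Bots named in the group are blocked from the listed paths.
print(parser.can_fetch("Bingbot", "https://example.com/checkout/cart/"))
# A bot that is not named in any group is not restricted by it.
print(parser.can_fetch("YandexBot", "https://example.com/checkout/cart/"))
```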