
Robots.txt

What Is a Robots.txt File and Why Is It Important?
Google and other search engines aim to provide their users with the most accurate results as quickly as possible. The software belonging to these search engines, which we call bots, crawls web pages and adds them to their indexes so that they can be served to users. However, we may not want some of our pages to be crawled and indexed by search engine bots. In such cases, a simple text file called robots.txt can help us.

What Is a Robots.txt File?
A text file that gives directives to search engine bots, usually telling them which pages they can or cannot access, is called robots.txt. Thanks to this file, we can ensure that some of our pages or groups of pages, or even the entire website, are not crawled by search engine bots. I can almost hear some of you asking, "Why would I want that?" But in some cases, closing a page or group of pages to crawling with a robots.txt file can be a very logical move.

For example, we may have some private pages that we do not want search engine bots to access and index. Or, because we have a large number of web pages, we may want to use the resource allocated to us by search engine bots, that is, our crawl budget, more efficiently. This way, we can direct search engine bots toward our more important pages. In such special cases, a robots.txt file can be a life saver.

Why Is the Robots.txt File Important?
Search engine bots check this file first, before visiting our website. After that, they start crawling our web pages in light of the commands in this file. Hence, we need to make absolutely sure that every command in the robots.txt file is correct. Otherwise, we may accidentally close all, or an important part, of our website to crawling. This can lead to a major disaster in terms of our SEO performance.

The other scenario is using robots.txt to optimize our crawl budget: we can close our junk pages to crawling with this file. This can also be reflected positively in our SEO performance, because we can ensure that search engine bots spend the resources allocated to our website on our really important pages. That is exactly why the robots.txt file and the commands it contains are very important for a website.

How to Create a Robots.txt File?
To create the robots.txt file, you can use a text editor such as Notepad or TextEdit. If you know the necessary commands, all you need is a simple text file with the "txt" extension.

You can also use free robots.txt creation tools for this operation. You can find such tools by searching for "robots.txt generator" in search engines. Using these tools will also reduce the likelihood of making mistakes in the commands.


Important Robots.txt Commands
There are some basic commands that you will see in every robots.txt file. Below you can find an example robots.txt file, the commands it contains, and their meanings.

User-agent: With this command, we specify the search engine bot we want to address. The "User-agent: *" command in the example below means that we allow all search engine bots to crawl our web pages. We can also give different directives to different search engine bots by using the "User-agent" command more than once.

Allow: With this command, we specify the web pages or groups of pages that we want search engine bots to be able to access.

Disallow: With this command, we specify the web pages or groups of pages that we do not want search engine bots to access.

Sitemap: With this command, we point search engine bots to the address of our sitemap.
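Putting these commands together, an example robots.txt file might look like the sketch below (the domain and sitemap URL are illustrative):

    # Address all search engine bots
    User-agent: *
    # This single page under /demo/ may be crawled
    Allow: /demo/example-content
    # All other pages whose URL starts with /demo/ may not be crawled
    Disallow: /demo/
    # Location of the sitemap (illustrative URL)
    Sitemap: https://www.example.com/sitemap.xml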

To summarize the example robots.txt file above: this website is open to all bots, but search engine bots are asked not to access pages containing the /demo/ path, except for the "/demo/example-content" page. Finally, the address of the sitemap is presented to the search engine bots with the "Sitemap" command. We can use the above commands more than once and create different strategies.

Sample Scenarios
To make this information a little more memorable and understandable, let's study a few simple robots.txt examples together.

- The directives address all search engine bots
- No pages should be crawled by search engine bots
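The directives for this scenario might look like this:

    User-agent: *
    # "/" covers the entire site, so nothing may be crawled
    Disallow: /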

- The website is open to all search engine bots
- All pages can be crawled by search engine bots
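A file for this scenario can be as simple as:

    User-agent: *
    # An empty Disallow value blocks nothing, so the whole site may be crawled
    Disallow: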

- The website is open to all search engine bots
- Pages whose URL starts with "/kobiweb/" should not be crawled
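The directives for this scenario might look like this:

    User-agent: *
    Disallow: /kobiweb/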

- The website is open to all search engine bots
- Pages whose URL starts with "/kobiweb/" should not be crawled
- But the "/kobiweb/our-team" page may be crawled
- The Googlebot-Image bot should not crawl URLs ending in ".jpg"
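A sketch of the directives for this scenario (the exact "our-team" slug is illustrative):

    User-agent: *
    Disallow: /kobiweb/
    # The more specific Allow rule keeps this single page crawlable
    Allow: /kobiweb/our-team

    User-agent: Googlebot-Image
    # Block image URLs that end in .jpg
    Disallow: /*.jpg$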

Important Note: The "$" sign at the end of the command means URLs that end this way, while the "/*" at the beginning means that it does not matter what comes before it in the URL.

- Googlebot should not crawl any URL containing "/trial/"
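One way to express this might be the following (the wildcard pattern is illustrative and may need adjusting to the site's URL structure):

    User-agent: Googlebot
    # Block any URL whose path contains "trial/"
    Disallow: /*trial/*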

Note: Using the "*" sign at the beginning and at the end of the command, as above, means that no matter what comes before or after it, any URL containing "/trial/" should not be crawled.

Robots.txt Test Tool
If you want to make a change to the commands contained in the robots.txt file, I recommend first using Google's robots.txt Test Tool. The corresponding tool can be found here.

With this tool, you can easily test whether the commands you have added previously, or commands you add on the tool for trial purposes, are working correctly. Especially if you have added more than one "Disallow" and "Allow" command in mixed combinations, it is useful to test with this tool whether sample pages are crawlable before publishing.

In addition, if you have added commands for Google's different search engine bots, you can also test whether the command you have added works for the corresponding bot by clicking the "Googlebot" button in the lower right section.


If you want to publish the final version of the file after performing the necessary tests, it will be enough to click on the “Submit” button.

Search Console Controls
After logging in to the Search Console tool, you can access many details about your web pages that are or are not in the index by clicking the "Coverage" button in the left menu.

Here you can also see pages that are blocked due to commands in your robots.txt file, as well as pages that are indexed even though they are blocked.

 

Here you may find that a page you want crawled is blocked due to a command in the robots.txt file, or that some pages that should not be indexed are in the index. That is why it is worth checking the pages listed here periodically, especially some time after updating the commands in the robots.txt file.

Important Points and Things to Know
We have already mentioned how important the commands contained in the robots.txt file are. A single character you use, or a small detail related to this file, can cause very different problems. Therefore, it is necessary to pay special attention to some points.

Main directory: The robots.txt file should be located in the root directory, in the form "example.com/robots.txt".

File name: The file name should be “robots” in lowercase letters, and the file extension should be “txt”.

Number of files: A website must have only one robots.txt file.

Robots.txt for subdomains: You can create a separate robots.txt file for the subdomains you create. Suppose you have created a separate subdomain for your blog. In this example, the robots.txt address should be in the form "blog.example.com/robots.txt".

Case sensitivity: Search engine bots and commands are case sensitive. For example, if you added a command such as "Disallow: /demo/", a page in the form "example.com/DEMO/" will still be crawlable. But if you use all your URLs in lowercase, you do not have to worry about this.

Conflicting commands: Make sure that the commands do not contradict each other. Pay attention to which pages are covered by which commands and whether commands override one another, especially in mixed combinations where more than one "Disallow" or "Allow" command is used.
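For instance, in a hypothetical combination like the one below, Google generally applies the most specific (longest) matching rule, so the single guide page stays crawlable while the rest of "/blog/" is blocked; the page path is illustrative:

    User-agent: *
    Disallow: /blog/
    # The longer, more specific rule takes precedence for this page
    Allow: /blog/robots-txt-guide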

Sitemap command: Add the Sitemap command at the very beginning or at the very end of the file. If you have given commands to different search engine bots and the Sitemap command is left between these command groups, it may look as if you are pointing the sitemap only to that particular search engine bot.
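For example, with more than one bot addressed, the Sitemap line can be kept at the very end of the file, outside all of the command groups (the paths and URL are illustrative):

    User-agent: *
    Disallow: /private/

    User-agent: Googlebot-Image
    Disallow: /*.png$

    # Sitemap declared once, at the end of the file
    Sitemap: https://www.example.com/sitemap.xml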

Robots.txt and indexing: The "Disallow" command used in the robots.txt file is not a command not to index the page. It only ensures that search engine bots do not access the corresponding page or pages. Therefore, the page can still be indexed; to prevent it from being indexed, you can use the "noindex" tag.

File types: You can issue commands via the robots.txt file not only for web pages, but also for image, video, audio and resource files.

Off-site links: Even if you close a web page to crawling with the robots.txt file, if that page is linked from other websites, search engine bots can index the page by following those links. At this point, it is again useful to use the "noindex" tag.

Robots.txt requirement: Not every website has to use a robots.txt file. When search engine bots visit a website and find that there is no robots.txt file, they simply crawl and index the pages in the normal way. However, using one is recommended.
