I noticed my site vanished from Google's index because of this

13 replies
Google cannot crawl my site because it says it cannot access my robots.txt. The only thing is that I never installed a robots.txt on this site. Google has crawled my site thousands of times, and now this. Any thoughts? My site is www.todds-cleaningservice.com. Any help would be wonderful.

  • Profile picture of the author angshuy2k
    Please check your robots.txt file and see how it is set up.
    • Profile picture of the author Dmreed4311
      Originally Posted by angshuy2k View Post

      Please check your robots.txt file and see how it is set up.
      That's just it: I do not have a robots.txt on my site.
      Signature

      Carpet Doctor
      212 east Ross Ave.
      Tampa Fl 33602
      813-440-8335

  • Profile picture of the author SteveJohnson
    You need to ask your hosting company why your server is returning a 500 Internal Server Error when robots.txt is requested. THAT is probably why Google is giving you a 'we choked on your site' message.
    Signature

    The 2nd Amendment, 1789 - The Original Homeland Security.

    Gun control means never having to say, "I missed you."

  • Profile picture of the author Patrick
    The OP clearly said that he does not have robots.txt in his FTP...

    If there is no robots.txt, then Google will crawl and index your site normally. robots.txt is used ONLY if you want to apply special rules or ask Google not to crawl a particular section...

    Reference is here just to prove I am not talking nonsense.....

    https://developers.google.com/webmas...x/docs/faq#h01

    About the crawling, Dmreed: I faced the same issue with one site, but I noticed that Google was still crawling and indexing it. After a couple of days, the error had vanished. I read a lot of articles on the web about this, and really no one knows why it happens.
    • Profile picture of the author SteveJohnson
      Originally Posted by Patrick View Post

      The OP clearly said that he does not have robots.txt in his FTP...

      If there is no robots.txt, then Google will crawl and index your site normally. robots.txt is used ONLY if you want to apply special rules or ask Google not to crawl a particular section...

      Reference is here just to prove I am not talking nonsense.....

      https://developers.google.com/webmas...x/docs/faq#h01

      About the crawling, Dmreed: I faced the same issue with one site, but I noticed that Google was still crawling and indexing it. After a couple of days, the error had vanished. I read a lot of articles on the web about this, and really no one knows why it happens.
      Had you bothered to check for the file, you would have found that the request returns a 500 Server Error.

      From https://developers.google.com/webmas...ocs/robots_txt :
      5xx (server error) Server errors are seen as temporary errors that result in a "full disallow" of crawling. The request is retried until a non-server-error HTTP result code is obtained. A 503 (Service Unavailable) error will result in fairly frequent retrying. To temporarily suspend crawling, it is recommended to serve a 503 HTTP result code. Handling of a permanent server error is undefined.
      The absence of a physical robots.txt file does not necessarily mean that one is not served to the googlebot when it asks for one.
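The status-code handling described in the quoted passage can be sketched roughly as follows. This is only an illustration, not Google's actual logic: the function name `crawl_outcome` and the outcome labels are made up for this example, and treating the whole 4xx range like a 404 ("no robots.txt, crawl normally") is a simplifying assumption based on what is said elsewhere in this thread.

```python
def crawl_outcome(status):
    """Map the HTTP status of a robots.txt request to a rough crawl decision.

    Hypothetical helper for illustration; the labels are not official terms
    except "full disallow", which comes from the quoted documentation.
    """
    if 200 <= status < 300:
        # The file was served: the crawler reads and obeys its rules.
        return "parse rules"
    if 400 <= status < 500:
        # Assumption: handled like a missing robots.txt, so crawling proceeds.
        return "full allow"
    if 500 <= status < 600:
        # Per the quoted passage: temporary error, no crawling, retried later.
        return "full disallow"
    # Redirects and anything else are outside the scope of this sketch.
    return "undetermined"


for code in (200, 404, 500, 503):
    print(code, "->", crawl_outcome(code))
```

Under these assumptions, a 404 for robots.txt is harmless, while a 500 blocks crawling until the server stops returning it.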

      • Profile picture of the author Patrick
        Originally Posted by SteveJohnson View Post

        Had you bothered to check for the file, you would have found that the request returns a 500 Server Error..
        I repeat again that the OP said there is no robots.txt file on his server....

        His domain Carpet Cleaning Las Vegas-Todds Cleaning service opens fine...

        Robots.txt file - http://www.todds-cleaningservice.com/robots.txt

        It does not give an HTTP error; it gives a HostGator "page not found" page, which means there is no such page or file on the server.

        So what's the point here?

        Moreover quoted from google...

        Does my website need a robots.txt file?

        No. When Googlebot visits a website, we first ask for permission to crawl by attempting to retrieve the robots.txt file. A website without a robots.txt file, robots meta tags or X-Robots-Tag HTTP headers will generally be crawled and indexed normally.
        And since the OP said that he has no robots.txt file, Google should crawl the site normally. I already said this in my last post.
  • Profile picture of the author Weblover50
    It says Google found 7 errors while attempting to get robots.txt. So either there was a server error that is solved now (a 404 error is perfectly okay, so what we are seeing now is not a problem), or Googlebot is seeing something else due to some settings or .htaccess rules on the server.

    The solution is to use the "Fetch as Googlebot" tool in Webmaster Tools and make sure that robots.txt returns the same 404 page we are seeing now. Or you may even create a simple robots.txt that allows all. In any case, if "Fetch as Googlebot" returns a page without error, Google should start indexing again. Otherwise, take corrective steps based on what you see.
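For a quick check from your own machine, something like the sketch below requests robots.txt and reports the HTTP status code. It is not a substitute for "Fetch as Googlebot": the User-Agent string is only an imitation, the server may still answer the real crawler differently, and the function name `robots_status` and the `opener` parameter are hypothetical names chosen for this example.

```python
import urllib.error
import urllib.request

# Assumption: a Googlebot-like User-Agent, sent only so the server handles
# the request roughly the way it would handle the crawler's request.
GOOGLEBOT_UA = (
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)


def robots_status(site, opener=urllib.request.urlopen):
    """Return the HTTP status code the server gives for <site>/robots.txt."""
    url = site.rstrip("/") + "/robots.txt"
    request = urllib.request.Request(url, headers={"User-Agent": GOOGLEBOT_UA})
    try:
        with opener(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; the code is the status.
        return err.code


# Example call (needs network access, so it is left commented out):
# print(robots_status("http://www.todds-cleaningservice.com"))
```

A result of 200 or 404 is fine; anything in the 500 range is the case to take to the hosting company.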
    Signature

    Hosting specials - Hostgator Review and Inmotion Coupon

    • Profile picture of the author Dmreed4311
      Thank you for all the replies. I contacted my hosting company and they fixed the error that was giving Google the 500 error message. I learned that when Google tries to crawl your site, it checks for a robots.txt file, and if it neither finds one nor gets a 404 error message back, it will not crawl the site. It was an error on my hosting company's end that sadly lost me ranking on 7 of my sites. Hopefully they will come back.

      • Profile picture of the author SteveJohnson
        Originally Posted by Patrick View Post

        I repeat again that the OP said there is no robots.txt file on his server....

        His domain Carpet Cleaning Las Vegas-Todds Cleaning service opens fine...

        Robots.txt file - http://www.todds-cleaningservice.com/robots.txt

        It does not give an HTTP error; it gives a HostGator "page not found" page, which means there is no such page or file on the server.

        So what's the point here?

        Moreover quoted from google...

        And since the OP said that he has no robots.txt file, Google should crawl the site normally. I already said this in my last post.
        The point is, as you will see below, there WAS a 500 error being returned when the robots.txt file was requested, and it was causing the googlebot to not crawl the site, as *I* said.

        Are we done with the 'mine is bigger than yours' back-and-forth?

        Originally Posted by Dmreed4311 View Post

        Thank you for all the replies. I contacted my hosting company and they fixed the error that was giving Google the 500 error message. I learned that when Google tries to crawl your site, it checks for a robots.txt file, and if it neither finds one nor gets a 404 error message back, it will not crawl the site. It was an error on my hosting company's end that sadly lost me ranking on 7 of my sites. Hopefully they will come back.

        • Profile picture of the author Patrick
          Originally Posted by SteveJohnson View Post

          The point is, as you will see below, there WAS a 500 error being returned when the robots.txt file was requested, and it was causing the googlebot to not crawl the site, as *I* said.

          Are we done with the 'mine is bigger than yours' back-and-forth?
          Heh. I knew that was coming. I won't respond to your baseless "off-topic" questions, because I'm sure you don't know the difference between a "500 error" and a "404 error".

          "I learned that when Google tries to crawl your site, it checks for a robots.txt file, and if it neither finds one nor gets a 404 error message back"
          I agree with SmallBiz's reply that it can be something to do with www and non www.

          However the OP's issue is solved so....
  • Profile picture of the author davidfrankk
    Having a robots.txt file is very important for a site nowadays. It tells search engine bots which directories of a website are off-limits to which bots.
    Add a robots.txt file and add it to Webmaster Tools. The problem will be resolved.
  • Profile picture of the author SmallBizWebsites
    One thing you should do, PROMPTLY, is edit your .htaccess file to fix the URL canonicalization problem on your site, which loads as both www [dot] todds-cleaningservice.com and todds-cleaningservice.com.

    http://www.todds-cleaningservice.com/robots.txt and
    http://todds-cleaningservice.com/robots.txt

    are two different URLs.
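To make the canonicalization idea concrete, here is a small sketch of the mapping a redirect rule would perform: both hostnames end up at one preferred host. It only illustrates the intended result and is not the .htaccess rule itself; the function name `canonicalize` is hypothetical, and picking the www form as the canonical host is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit


def canonicalize(url, canonical_host="www.todds-cleaningservice.com"):
    """Rewrite the www and non-www variants of a URL to one canonical host.

    Sketch only: URLs with an explicit port or on another domain are
    returned unchanged.
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Derive the bare domain from the canonical host (strip a leading "www.").
    bare = canonical_host[4:] if canonical_host.startswith("www.") else canonical_host
    if host in (bare, "www." + bare):
        parts = parts._replace(netloc=canonical_host)
    return urlunsplit(parts)


for candidate in (
    "http://www.todds-cleaningservice.com/robots.txt",
    "http://todds-cleaningservice.com/robots.txt",
):
    print(candidate, "->", canonicalize(candidate))
```

Once the server redirects every request this way, search engines see a single URL for each page instead of two.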

    I would also get rid of the default 404 error page, which is a blatant advertisement for Hostgator, and replace it with one of your own, to look more professional.

    To fix the robots.txt problem, why don't you CREATE one, which allows all robots to visit? The contents should be the following two lines of code:

    Code:
    User-agent: *
    Allow: /
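If you want to sanity-check those two lines before uploading them, Python's standard `urllib.robotparser` can evaluate them. Keep in mind this is Python's parser, not Google's, so it is only a rough check; the helper name `can_crawl` and the contrasting block-all rules are added here purely for illustration.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_lines, url, agent="Googlebot"):
    """Return True if the given robots.txt lines let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse rules from a list of lines, no network
    return parser.can_fetch(agent, url)


# The two-line file suggested above, plus a block-all file for contrast.
ALLOW_ALL = ["User-agent: *", "Allow: /"]
BLOCK_ALL = ["User-agent: *", "Disallow: /"]
HOME = "http://www.todds-cleaningservice.com/"

print("allow-all file permits crawling:", can_crawl(ALLOW_ALL, HOME))
print("block-all file permits crawling:", can_crawl(BLOCK_ALL, HOME))
```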
  • Profile picture of the author electrickiwi
    Are you using WordPress? If so, perhaps the "Discourage Search Engines" box has been inadvertently checked?

    Otherwise, I'm unsure what the problem could be but hope you find a resolution!
