fyi- still wrestling with search bots

Discussion in 'APUG System Announcements' started by Sean, Feb 8, 2005.

  1. Sean

    Sean Admin Staff Member Admin

    Messages:
    9,316
    Joined:
    Aug 29, 2002
    Location:
    New Zealand
    Shooter:
    Multi Format
    We got hit by Googlebot again even though I had measures in place to turn it away. I'm looking into it, and things seem to be stable.
     
  2. Sean

    Sean Admin Staff Member Admin

    Messages:
    9,316
    Joined:
    Aug 29, 2002
    Location:
    New Zealand
    Shooter:
    Multi Format
    Well, from what I can tell, the methods I put in place to block these bots can take up to 30 days to kick in. I'll just keep an eye on things until then.
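
The post does not say which blocking method was used. Assuming it was a robots.txt file (a common choice, but only an assumption here), a minimal sketch like the following can check locally what a given set of rules tells a crawler, without waiting for the crawler to pick them up. The rules in `ROBOTS_TXT`, the `is_allowed` helper, and the example.com URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules; the thread does not show the real file.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow: /gallery/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "OtherBot"):
        for url in ("http://example.com/forum/", "http://example.com/gallery/1"):
            print(agent, url, is_allowed(ROBOTS_TXT, agent, url))
```

This only verifies what the rules say. It cannot show when a crawler will re-read the file, which may be where a delay like the one mentioned above comes from.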
     
  3. Michael Mutmansky

    Michael Mutmansky Member

    Messages:
    345
    Joined:
    Sep 7, 2002
    Location:
    Sacramento
    Shooter:
    ULarge Format
    Sean,

    Enlighten me a little. What does the bot do that causes problems? Also, how will the data on APUG be searchable if you block the bots (or is that the price to pay for not permitting a bot to crawl the site)?


    ---Michael
     
  4. Sean

    Sean Admin Staff Member Admin

    Messages:
    9,316
    Joined:
    Aug 29, 2002
    Location:
    New Zealand
    Shooter:
    Multi Format
    Well, the bots are like leeches. They find your site and latch onto it, sometimes for days, pulling all the data from your pages. I think what's happening is that these bots hit something they think should exist but doesn't, for example a gallery image or a forum post. The bot gets stuck on this, something glitches, and the CPU usage goes to 100%. People on dedicated servers with a lot of grunt can usually ride these things out, but we don't have that luxury yet. Most of the site is already indexed by the major engines, so we still get some good visibility for now. When I have more time I'll try to get to the bottom of this and get some indexing back online.
     
  5. rbarker

    rbarker Member

    Messages:
    2,222
    Joined:
    Oct 31, 2004
    Location:
    Rio Rancho,
    Shooter:
    Multi Format
    FWIW, I just read an article in one of the IT mags that indicated some nefarious sites are using Google searches to probe web sites for security vulnerabilities. I wonder if this might be what's happening, Sean.
     
  6. Sean

    Sean Admin Staff Member Admin

    Messages:
    9,316
    Joined:
    Aug 29, 2002
    Location:
    New Zealand
    Shooter:
    Multi Format
    It's possible, but everything about the IP address indicates that it is one of their "googlebots". Seems the net is getting more and more like a battlefield :sad:
     
  7. John McCallum

    John McCallum Member

    Messages:
    2,410
    Joined:
    Apr 25, 2004
    Location:
    New Zealand
    Shooter:
    Multi Format
    Hey what'd we do??? :tongue: :tongue:
     
  8. mark

    mark Member

    Messages:
    5,267
    Joined:
    Nov 13, 2003
    Sean unleashed the dreaded analoguebot to do battle with the evil invading googlebot. Analoguebot took out Google. :cool:
     
  9. garryl

    garryl Member

    Messages:
    542
    Joined:
    Jul 19, 2003
    Location:
    Fort Worth,
    Shooter:
    35mm
  10. edz

    edz Member

    Messages:
    685
    Joined:
    Dec 4, 2002
    Location:
    Munich, Germ
    Shooter:
    Multi Format
    This is very much a part of the world of a few of the PHP kits. They make the silly assumption that if a resource does not exist, they should deliver a default resource instead, and sometimes this gets recursive. One solution, if you can't fix it, is to deliver the pages via a reverse proxy server.
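
To illustrate the point about default resources, here is a minimal sketch in Python (not PHP, and not the forum software's actual code). `PAGES`, `DEFAULT_PAGE`, and both resolver functions are hypothetical names. With the fallback behaviour, every made-up URL looks like a valid page to a crawler. With the strict behaviour, the crawler gets a 404 and can stop.

```python
from typing import Tuple

# Hypothetical in-memory "site" standing in for real forum and gallery pages.
PAGES = {
    "/forum/1": "First post",
    "/gallery/7": "Image 7",
}
DEFAULT_PAGE = "/forum/1"


def resolve_with_fallback(path: str) -> Tuple[int, str]:
    """Problem pattern: a missing resource silently returns a default page."""
    body = PAGES.get(path, PAGES[DEFAULT_PAGE])
    return 200, body


def resolve_strict(path: str) -> Tuple[int, str]:
    """Safer pattern: a missing resource is reported as 404 Not Found."""
    if path not in PAGES:
        return 404, "Not Found"
    return 200, PAGES[path]


if __name__ == "__main__":
    missing = "/gallery/999"
    print(resolve_with_fallback(missing))
    print(resolve_strict(missing))
```

The reverse proxy suggested above is a separate mitigation and is not shown here.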