  1. #1
    Sean's Avatar
    Join Date
    Aug 2002
    Location
    New Zealand
    Shooter
    Multi Format
    Posts
    8,566
    Blog Entries
    7
    Images
    15

    FYI: still wrestling with search bots

    We got hit by Googlebot again even though I had measures in place to turn it away. I'm looking into it, and things seem to be stable.

  2. #2
    Sean's Avatar
    Join Date
    Aug 2002
    Location
    New Zealand
    Shooter
    Multi Format
    Posts
    8,566
    Blog Entries
    7
    Images
    15
    Well, from what I can tell, the methods I put in place to block these bots can take up to 30 days to kick in. I'll just keep an eye on things until then.
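For readers wondering what "turning a bot away" usually looks like: the thread does not say which method was used, but the most common one is a robots.txt rule. The sketch below is only an illustration under that assumption. The rules in `ROBOTS_TXT`, the helper name `is_allowed`, and the example.com URLs are all hypothetical; the check itself uses Python's standard `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules; the thread does not show the site's actual robots.txt.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow: /gallery/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "SomeOtherBot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, "http://example.com/forum/"))
```

Note that robots.txt is only a request. A well-behaved crawler picks up the change the next time it re-reads the file, so the effect is not immediate, and a badly behaved one can ignore it entirely.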

  3. #3

    Join Date
    Sep 2002
    Location
    State College, PA
    Shooter
    ULarge Format
    Posts
    336
    Sean,

    Enlighten me a little. What does the bot do that causes problems? Also, how will the data on APUG be searchable if you block the bots (or is that the price to pay for not permitting a bot to crawl the site)?


    ---Michael
    www.mutmansky.com
    B&W photography in Silver, Palladium, and gum bichromate.

  4. #4
    Sean's Avatar
    Join Date
    Aug 2002
    Location
    New Zealand
    Shooter
    Multi Format
    Posts
    8,566
    Blog Entries
    7
    Images
    15
    Well, the bots are like leeches. They find your site and latch onto it, sometimes for days, pulling all the data from your pages. I think what's happening is that these bots might hit something they think should exist but doesn't, for example a gallery image or a forum post. It gets stuck on this, glitches, and the CPU usage goes to 100%. People on dedicated servers with a lot of grunt can usually ride these things out, but we don't have that luxury yet. Most of the site is already indexed by the major engines, so we still get some good visibility for now. When I have more time I'll try to get to the bottom of this and get some indexing back online.

  5. #5
    rbarker's Avatar
    Join Date
    Oct 2004
    Location
    Rio Rancho, NM
    Shooter
    Multi Format
    Posts
    2,222
    Images
    2
    FWIW, I just read an article in one of the IT mags that indicated some nefarious sites are using Google searches to probe web sites for security vulnerabilities. I wonder if this might be what's happening, Sean.
    "You can't depend on your eyes if your imagination is out of focus." -Mark Twain

    Ralph Barker
    Rio Rancho, NM

  6. #6
    Sean's Avatar
    Join Date
    Aug 2002
    Location
    New Zealand
    Shooter
    Multi Format
    Posts
    8,566
    Blog Entries
    7
    Images
    15
    Possible, maybe, but all indications from the IP address show that it is one of their "googlebots". Seems the net is getting more and more like a battlefield.
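On the question of whether an IP address really belongs to Googlebot: the usual check is a reverse DNS lookup on the IP, a test that the resulting hostname ends in googlebot.com or google.com, and then a forward lookup to confirm the hostname maps back to the same IP. The sketch below is a rough illustration of that check, not what was actually run here. The function name `is_verified_googlebot` and the injectable `reverse`/`forward` parameters are made up for the example so the logic can be exercised without network access.

```python
import socket
from typing import Callable, List

# Hostname suffixes accepted as Google crawler hosts in this sketch.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def _reverse(ip: str) -> str:
    """Reverse DNS: IP address -> primary hostname."""
    return socket.gethostbyaddr(ip)[0]


def _forward(host: str) -> List[str]:
    """Forward DNS: hostname -> list of IPv4 addresses."""
    return socket.gethostbyname_ex(host)[2]


def is_verified_googlebot(
    ip: str,
    reverse: Callable[[str], str] = _reverse,
    forward: Callable[[str], List[str]] = _forward,
) -> bool:
    """Reverse-then-forward DNS check for a claimed Googlebot IP."""
    try:
        host = reverse(ip)
    except OSError:  # no PTR record or lookup failure
        return False
    if not host.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False
    try:
        # The forward lookup must lead back to the original IP.
        return ip in forward(host)
    except OSError:
        return False
```

A visitor that only claims to be Googlebot in its user-agent string fails this check, because the reverse lookup of its IP will not land in one of those domains.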

  7. #7

    Join Date
    Apr 2004
    Location
    New Zealand
    Shooter
    Multi Format
    Posts
    2,410
    Images
    4
    Hey what'd we do???

  8. #8

    Join Date
    Nov 2003
    Posts
    5,243
    Images
    9
    Quote Originally Posted by John McCallum
    Hey what'd we do???
    Sean unleashed the dreaded analoguebot to do battle with the evil invading googlebot. Analogue bot took out google.
    Technological society has succeeded in multiplying the opportunities for pleasure, but it has great difficulty in generating joy. Pope Paul VI

    So, I think the "greats" were true to their visions, once their visions no longer sucked. Ralph Barker 12/2004

  9. #9
    garryl's Avatar
    Join Date
    Jul 2003
    Location
    Fort Worth, TX
    Shooter
    35mm
    Posts
    542
    Images
    2
    I think it's time to call in "Optimus Prime"!
    http://www.141empire.com/141dateline...or/optimus.htm

  10. #10
    edz

    Join Date
    Dec 2002
    Location
    Munich, Germany
    Shooter
    Multi Format
    Posts
    685
    Quote Originally Posted by g
    is that these bots might hit something they think should exist but doesn't. For example maybe a gallery image or a forum post, it gets stuck on this, glitches, and the cpu usage goes to 100%.
    This is very much a part of the world of a few of the PHP kits. They make the silly assumption that if a resource does not exist, a default resource should be delivered instead, and sometimes this gets recursive. One solution, if you can't fix it, is to deliver the pages via a reverse proxy server.
    Edward C. Zimmermann
    BSn R&D // http://www.nonmonotonic.net
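To make the "default resource gets recursive" failure mode concrete, here is a small sketch. It is written in Python rather than PHP, and it is not taken from any particular forum or gallery kit. The `PAGES` table, `DEFAULT_PATH`, and both function names are hypothetical. The first function keeps the fallback behaviour but caps the number of hops; the second simply answers 404 for anything that does not exist.

```python
from typing import Dict, Tuple

# Hypothetical in-memory "site"; a real kit would look up files or database rows.
PAGES: Dict[str, str] = {"/forum/1": "first post"}
DEFAULT_PATH = "/default"  # deliberately absent from PAGES to show the failure mode


def serve_with_fallback(path: str, max_hops: int = 5) -> Tuple[int, str]:
    """Fall back to a default page, but give up instead of looping forever."""
    hops = 0
    while path not in PAGES:
        hops += 1
        # Stop if the default itself is missing or we have bounced too often.
        if path == DEFAULT_PATH or hops > max_hops:
            return 404, "Not Found"
        path = DEFAULT_PATH
    return 200, PAGES[path]


def serve_strict(path: str) -> Tuple[int, str]:
    """Simplest alternative: a missing resource is just a 404."""
    if path in PAGES:
        return 200, PAGES[path]
    return 404, "Not Found"


if __name__ == "__main__":
    print(serve_with_fallback("/gallery/missing.jpg"))
    print(serve_strict("/forum/1"))
```

Without the guard in the first function, a crawler requesting a missing image would keep the server chasing a default that is not there, which fits the CPU spike described earlier in the thread.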



 
