  1. #1
    Sean's Avatar
    Join Date
    Aug 2002
    Location
    New Zealand
    Shooter
    Multi Format
    Posts
    8,518
    Blog Entries
    7
    Images
    15

    APUG Server Stability Issues

    Hi All,

    We still have some instability with the server. It is being looked into aggressively and I'll post updates as I get them. At the moment we are crashing around once per day, usually with an outage of 15-30 minutes at most. I should have another update for you tomorrow. Thanks for your patience, especially those of you who host your personal sites on the server.

    Sean

  2. #2
    Sean
    *update*

    Looks like we're probably going to be migrated to another server. I'm not sure of the exact time yet, so I will post more details when I get them. The host is stunned that we've had 2 problem servers. He runs over 200 of these servers, many of them the same models we have had trouble with, and they are typically very stable.

  3. #3
    Sean
    Hi, we just had another outage. They may have tracked the issue down to a problem with cPanel (cPanel is a system control panel utility that resides on the server). I will post more details soon.

    Thanks,
    Sean

  4. #4
    Sean
    OK, they have narrowed the problem down to a bug in the server's motherboard, which is unstable with 4 GB of RAM installed. This has been observed on several other servers and the vendor will be sending out a replacement board. I'll keep you posted.

    Thanks,
    Sean

  5. #5
    Sean
    Hi guys, I'm not sure how long the database had problems. The site was fine when I went out to get some groceries, and I came back to find it down. Since the site was technically still up, my monitoring service did not page me. I have repaired the database.

    Thanks

  6. #6
    Sean
    OK, I have upgraded my monitoring service to hit the site every 5 minutes and do a page text verification: it loads a specific page and searches for a specific word. If this word is not found, I'll get paged. This now covers the site in 2 ways:

    1) If the site cannot be accessed, I'll be notified.
    2) If the APUG forum database has problems but the site is still running, I'll be notified. (When the forum database has problems, the page cannot display the word the monitoring service is looking for, and this triggers an alert.)
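
    For anyone curious, the check described above could be sketched roughly like this in Python. This is only an illustration under stated assumptions, not the monitoring service's actual code: the names fetch_page, check_site and monitor, the 10 second timeout, and whatever URL and keyword you pass in are all made up for the example.

```python
import time
import urllib.request
from typing import Callable, Optional


def fetch_page(url: str, timeout: float = 10.0) -> str:
    """Download a page and return its body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def check_site(
    url: str,
    keyword: str,
    fetch: Callable[[str], str] = fetch_page,
) -> Optional[str]:
    """Return None if the site looks healthy, otherwise a short reason."""
    try:
        body = fetch(url)
    except OSError as exc:
        # Case 1: the site cannot be reached at all
        # (urllib's URLError and socket timeouts are OSError subclasses).
        return f"site unreachable: {exc}"
    if keyword not in body:
        # Case 2: the site answers, but the expected word is missing,
        # e.g. because the forum database failed and the page did not render.
        return f"keyword {keyword!r} missing from page"
    return None


def monitor(
    url: str,
    keyword: str,
    alert: Callable[[str], None],
    fetch: Callable[[str], str] = fetch_page,
    interval_seconds: float = 300,  # 5 minutes, as in the post
    sleep: Callable[[float], None] = time.sleep,
    max_checks: Optional[int] = None,  # None means run forever
) -> None:
    """Run the check repeatedly and call alert() whenever it fails."""
    checks = 0
    while max_checks is None or checks < max_checks:
        problem = check_site(url, keyword, fetch)
        if problem is not None:
            alert(problem)
        checks += 1
        sleep(interval_seconds)
```

    The fetch, sleep and alert arguments are passed in so the logic can be tried without a live site or a real pager; in practice alert would be whatever sends the page or email.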

    I expect we'll have around 2 to 3 more outages of 30 minutes or so before the server hardware is swapped out. I'll be able to get us back up more quickly with the new monitoring in place. Thanks for your continued patience with the server issues.

    Sean

  7. #7
    Sean
    *update* for those interested..

    The issue might be resolved. The host has narrowed our problem to one of two things.

    1)
    A bug in the cPanel software (cPanel is the front-end control system for the server). This bug was driving up CPU load and causing crashes on several servers. The cPanel company acknowledged the bug 5 days ago and released a fix today, which was applied to our server. After the fix was applied I ran a few utilities that would have crashed us a few days ago, and today they ran without issue. That is either a fluke or a good sign that we may be getting somewhere.

    2)
    There may be an issue with our motherboard having problems with 4 GB of memory installed. If we have another crash, this is most likely the problem.

    So the current plan is:
    -monitor things now that cPanel has been patched; if we stay up more than 4 days we should be OK
    -if we crash again, remove 2 GB of memory from the server in order to stabilize it, then schedule a motherboard replacement within a few days' time
    -if for some reason the motherboard replacement fails, we will move to another new server
    -failing that, we would investigate alternative hosting options, but I doubt it will come to this. I certainly hope it does not, because our operating costs would then more than double.

    Thanks,
    Sean

  8. #8
    Sean
    OK. Well, it looks like the software patch failed to help. I thought we were in the clear because we had 40 hours without a crash, but then we did crash. We are now focusing on the memory issues on the motherboard. The board seems to have problems with 4 GB of memory, so we have removed 2 GB. The last outage was a bit longer since we had to remove the memory. We'll see how things go now. Thanks

  9. #9
    Sean
    Hi All,

    Looks like we still have some problems with the server, but I want you to know that we are on the verge of moving to a new host that provides mission-critical service. This does not come cheap (3x the cost of the current system), so I have spent the last week working to acquire financial backing for us to do this. Everything is falling into place and I'll be making another announcement regarding this transition soon.

    Thanks,
    Sean



 
