  1. #11
    Aggie's Avatar
    Join Date
    Jan 2003
    Location
    So. Utah
    Shooter
    Multi Format
    Posts
    4,925
    Images
    6
    Sean, Tim is telling me this, so bear with me if it sounds a bit blonde.

    Posts are not emails. Posts are data base entries. This allows them to be searched and indexed. Expiration is generally against what your data base is supposed to do. You don't want data to just expire and disappear into thin space. Least of all in a community like this. There is a way to lighten the load without expiring the data, which means hacking the forum php to set the default view to posts from only a certain recent time period, usually a month. Generally it will also let you set how far back you want to view posts. But by setting the default value to a more current time period you lighten the load, because when the data base is searched for posts it doesn't return every single value, only those posts in that specified time period.

    I hope you understand that, 'cause it just went over my head. To think I gave birth to a hacker-knowledgeable geek.
    Non Digital Diva
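
The default-window idea in post #11 can be sketched roughly as follows. This is a minimal illustration, not the forum software's actual code: the `posts` table, its columns, the index name and the 30-day default are all assumptions made for the example. The point is that old rows stay in the database; the default query simply does not ask for them.

```python
import sqlite3
from datetime import datetime, timedelta

DEFAULT_WINDOW_DAYS = 30  # assumed default; the post only says "usually a month"


def recent_posts(conn, now, days=DEFAULT_WINDOW_DAYS):
    """Return posts newer than `days` before `now`; nothing is deleted."""
    cutoff = (now - timedelta(days=days)).isoformat()
    rows = conn.execute(
        "SELECT id, title FROM posts WHERE posted_at >= ? ORDER BY posted_at DESC",
        (cutoff,),
    )
    return rows.fetchall()


# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, posted_at TEXT)")
conn.execute("CREATE INDEX idx_posts_posted_at ON posts (posted_at)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [
        (1, "old thread", "2004-01-05T10:00:00"),
        (2, "recent thread", "2005-02-20T09:30:00"),
        (3, "newest thread", "2005-03-01T18:45:00"),
    ],
)

now = datetime(2005, 3, 2, 12, 0, 0)
print(recent_posts(conn, now))             # default one-month window
print(recent_posts(conn, now, days=3650))  # a reader widens the window
```

A reader who wants older threads passes a larger `days` value, which matches the post's remark that the window can usually be set as far back as one likes.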

  2. #12
    edz

    Join Date
    Dec 2002
    Location
    Munich, Germany
    Shooter
    Multi Format
    Posts
    685
    Quote Originally Posted by Aggie
    Sean, Tim is telling me this, so bear with me if it sounds a bit blonde.
    I don't quite understand what chromatics have to do with anything?

    Posts are not emails. Posts are data base entries.
    You are confusing context with storage models. Posts are like emails--- and too can be viewed as entries in a data base--- in that they are static and don't change--- that we have mechanisms to edit a message are aside the issues. The typical RDBMS system is designed to handle volatile and dynamic information. The volatile aspect of our forums is the topic. It tracks the development, like a mail folder, of a subject but each of the elements, the contributions, are static. In a relational database one needs to design for volatility such as seats as in the question "How many seats are left on the flight to San Francisco?". Data banks should more properly be called data markets. A forum, by contrast, is a collection of static and unchanging objects.

    The model for these forums is little else other than a threaded mailing list or what we've come to call Usenet News lifted over to a web interface (at first mail->web but later also web->web). As a little aside I created the Mail->Web genre a good dozen years ago: see the w3c.org web museum.....

    This allows them to be searched and indexed.
    No. Indexing and search is of information and have nothing to do with it. Relational database systems, in fact, are poorly suited to the task. Again you are confusing things, this time applications with organization and storage models.

    Expiration is generally against what your data base is supposed to do.
    You don't understand what "expiration" in HTTP means. If it's not set then the data can expire immediately, since it's not been defined. If I don't know when data expires then I also can't assume that it will never expire, or that the data expires once I get a copy (as the case might be in a ticket reservation system). If data is volatile then this is what you may want: to get someone to keep asking for data. This is where the modification date/time enters the picture. One then asks if the data has been changed since the last time one asked and got it. But if it too is undefined then one will probably need to assume that the data might have been changed (most browsers allow one to set this on a per-session etc. basis, but it's in the hands of the client and not the server). There are also some features for a hash of context to try to distinguish between changes, but I think I'm getting too deep into the fine details of designing spiders and search engines (which I do) and less sites.

    You don't want data to just expire and disappear into thin space. Least of all in a community like this.
    That's why one needs to set an expiration date at a distant point in the future and set the modification date. It's up to the site administration to try to control how clients (and web spiders are in this capacity nothing other than clients with a code of behaviour) behave.
    Edward C. Zimmermann
    BSn R&D // http://www.nonmonotonic.net
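
The HTTP mechanics described in post #12 can be sketched as below: the server sends `Expires` and `Last-Modified`, and a returning client asks whether anything changed since its last copy. This is a minimal sketch, not the forum's implementation. The `respond` function, its parameters and the one-year lifetime are hypothetical, and treating the "hash" mentioned in the post as an `ETag` computed from the body is an assumption.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime
import hashlib


def respond(body, last_modified, request_headers, now, max_age_days=365):
    """Return (status, headers, body) for a GET on a mostly static post.

    `last_modified` and `now` must be timezone-aware UTC datetimes.
    """
    etag = '"' + hashlib.sha1(body.encode("utf-8")).hexdigest() + '"'
    headers = {
        "Last-Modified": format_datetime(last_modified, usegmt=True),
        "Expires": format_datetime(now + timedelta(days=max_age_days), usegmt=True),
        "ETag": etag,
    }
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # When the client sends a validator hash, it takes precedence.
        unchanged = if_none_match == etag
    else:
        since = request_headers.get("If-Modified-Since")
        unchanged = since is not None and last_modified <= parsedate_to_datetime(since)
    if unchanged:
        return 304, headers, ""  # client copy is still good; send no body
    return 200, headers, body


now = datetime(2005, 3, 2, 12, 0, tzinfo=timezone.utc)
modified = datetime(2005, 3, 1, 18, 45, tzinfo=timezone.utc)

status, headers, _ = respond("post text", modified, {}, now)
print(status, headers["Last-Modified"])

revisit = {"If-Modified-Since": headers["Last-Modified"]}
print(respond("post text", modified, revisit, now)[0])
```

If a post is edited, its modification time (and the hash of its body) changes, so a client that revalidates gets the full page again instead of a 304.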

  3. #13
    Andy K's Avatar
    Join Date
    Jul 2004
    Location
    Sunny Southend, England.
    Shooter
    Multi Format
    Posts
    9,422
    Images
    81
    Quote Originally Posted by Bob F.
    MSNBOT seems a very hungry creature - you might want to use robots.txt to stop the sod crawling too deeply... This thread makes interesting reading (http://www.webmasterworld.com/forum97/73.htm).


    Cheers, Bob.
    Why am I not surprised that a Microsoft product caused the problem? :rolleyes:


    -----------My Flickr-----------
    Anáil nathrach, ortha bháis is beatha, do chéal déanaimh.
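
The robots.txt suggestion quoted in post #13 might look something like the rules below, checked here with Python's standard `urllib.robotparser`. The paths are hypothetical, and `Crawl-delay` is a non-standard directive that only some crawlers honor, so this is an illustration rather than a tested configuration for this site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: slow msnbot down and keep it out of deep, dynamic pages.
ROBOTS_TXT = """\
User-agent: msnbot
Crawl-delay: 60
Disallow: /forum/search.php
Disallow: /forum/member.php

User-agent: *
Disallow: /forum/search.php
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(parser.can_fetch("msnbot", "/forum/showthread.php?t=1"))  # thread pages stay crawlable
print(parser.can_fetch("msnbot", "/forum/member.php?u=5"))      # member pages are blocked
print(parser.crawl_delay("msnbot"))                             # requested delay in seconds
```

robots.txt is advisory: a well-behaved spider follows it, but nothing in the file itself enforces the limits.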

  4. #14
    KenM's Avatar
    Join Date
    Apr 2003
    Location
    Calgary, Alberta
    Shooter
    4x5 Format
    Posts
    800
    Lots of good stuff edz, but keep in mind that thread contents can change after a period of time. People go back and re-edit posts - Aggie knows about this. Setting a very long expiration on the response would cause these changes to not be re-indexed in the short term.
    Cheers!

    -klm.

  5. #15
    kwmullet's Avatar
    Join Date
    Jan 2004
    Location
    Denton, TX, US
    Shooter
    Multi Format
    Posts
    889
    Images
    16
    Also, Sean, if your service provider has a good set of Cisco skills, they can throttle the bandwidth available to traffic from certain address blocks and/or domains, so if there's a commonality to the sources of the spiders, you could let them wander and do all they want, just within the bandwidth of a dialup connection or two. Doing so at the border router would also benefit the rest of their sites.

    -KwM-
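
Post #15 is about throttling at the border router, which would be done in router configuration rather than code. The same idea, capping a known address block to roughly dialup speed, can be sketched at the application level with a token bucket. Everything here is an assumption for illustration: the address block is a documentation range, and the 7000 bytes per second figure approximates one 56 kbit/s line.

```python
import ipaddress


class TokenBucket:
    """Allow `rate` bytes per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = 0.0

    def allow(self, size, now):
        # Refill for the time elapsed since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


# Hypothetical crawler address block mapped to its shared bandwidth budget.
THROTTLED = {
    ipaddress.ip_network("192.0.2.0/24"): TokenBucket(rate=7000, capacity=7000),
}


def admit(client_ip, size, now):
    """Return True if a response of `size` bytes may be sent to `client_ip` now."""
    address = ipaddress.ip_address(client_ip)
    for network, bucket in THROTTLED.items():
        if address in network:
            return bucket.allow(size, now)
    return True  # everyone else is unthrottled
```

Calling `admit("192.0.2.10", 5000, now=0.0)` succeeds, while a second 5000-byte response a tenth of a second later is refused until the bucket refills; clients outside the block are never limited.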

  6. #16

    Join Date
    Oct 2003
    Location
    California
    Posts
    12
    Originally Posted by Aggie
    Sean, Tim is telling me this, so bear with me if it soumds a bit blonde.
    I don't quite understand what chromatics have to do with anything?
    --------
    I dont quite understand why this concept is foreign to you. Must be chromaticly challenged.
    ---------
    Quote:
    Posts are not emails. Posts are data base entries.

    You are confusing context with storage models. Posts are like emails--- and too can be viewed as entries in a data base--- in that they are static and don't change--- that we have mechanisms to edit a message are aside the issues. The typical RDBMS system is designed to handle volatile and dynamic information. The volatile aspect of our forums is the topic. It tracks the development, like a mail folder, of a subject but each of the elements, the contributions, are static. In a relational database one needs to design for volatility such as seats as in the question "How many seats are left on the flight to San Francisco?". Data banks should more properly be called data markets. A forum, by contrast, is a collection of static and unchanging objects.
    --------
    I have to break this one down into its component flawed arguements to best show why this is a wrong interpretation.

    Quote: You are confusing context with storage models.
    --------- You are confusing context with method. Just because you view the forums LIKE an email system does not an email system make. You arent logging on to a pop3 server to transmit a properly formated message to then be sent through the magic smoke in the wires over the interweb to another pop3 email server to then be acessed by the end user. What you are doing is acessing an interactive php script that generates a form that is formated and then sent to the same server. It is indexed and added to a database. When a user requests the forum listing hes not logging on to his email server to get his local copy of an email. No. He is makeing a call to the database which then sends back a properly formated page with the requested data on it. There is no CC or BCC option.
    --------
    Quote: Posts are like emails--- and too can be viewed as entries in a data base--- in that they are static and don't change--- that we have mechanisms to edit a message are aside the issues.
    -------- Firstly learn the language. A runon sentance with syntax errors abounding. Posts are not emails but you could make an arguement about certain similartities. The biggest diffrence is that emails are distributed to multiple email servers so that there are multiple independant copies of the email. Your not supposed to edit the contents of the email once sent. Forum posts are semi-static. They can be edited only by the same user or a user with greater admin acess. They are specificly designed to be editable. The mechanisms to edit a message are Not besides the point.


    Quote: The typical RDBMS system is designed to handle volatile and dynamic information.
    --------- Firstly a definition of RDBMS for those too lazy to use google.
    *Short for relational database management system and pronounced as separate letters, a type of database management system (DBMS) that stores data in the form of related tables. Relational databases are powerful because they require few assumptions about how data is related or how it will be extracted from the database. As a result, the same database can be viewed in many different ways.*
    Databases in an environment such as apug Should handle volatile and dynamic information. The entries are specificly designed to be editable and thus dynamic. Entries can be removed or moved or even stored in a diffrent place and thus is volatile. However other than describeing something like APUG's forum database i dont see what this sentance is supposed to imply.

    Quote: The volatile aspect of our forums is the topic. It tracks the development, like a mail folder, of a subject but each of the elements, the contributions, are static.
    -------- The volatile aspect is the numerus catagories and sub catagories. The contents. The entire database is volatile. Again you can construe that the forums are LIKE a mail folder but image and perception does not a mail folder make. The elements are dynamic. They change based upon user input. They can be changed at any time by user intervention. Theres nothing mystical nor hard about this concept.

    Quote: In a relational database one needs to design for volatility such as seats as in the question "How many seats are left on the flight to San Francisco?".
    ------- Are you implying that in the forum database specificly designed and marketed to thousands of users that there must be a limited number of users able to acess the forum at any given time?

    Quote: Data banks should more properly be called data markets. A forum, by contrast, is a collection of static and unchanging objects.
    ------- Data banks should more properly be called Data Banks. The b in banks should be capitalized. Its important you know. A forum, by contrast, is an ever expanding collection of dynamic entries that can be changed at any time by an end user working from a remote connection to a centralized database.


    Quote: The model for these forums is little else other than a threaded mailing list or what we've come to call Usenet News lifted over to a web interface (at first mail->web but latter also web->web). As a little aside I created the Mail->Web genre a good dozen years ago: see the w3c.org web museum.....
    -------- The model for these forums is little else other than phpBB with several phpHacks and a skin that matches the design. This is the defacto standard for web forums but the administrator could have used a diffrent system as his model. To have the arrogance to presume exactly and definately the model is an insult to him. Plus Usenet is being phased out by major isp providers. The system is DEAD. Oh and its nice that you did something relavent in the past dozen years.. :rolleyes:

    ---------
    Quote:
    This allows them to be searched and indexed.

    No. Indexing and search is of information and have nothing to do with it. Relational database systems, in fact, are poorly suited to the task. Again you are confusing things, this time applications with organization and storage models.
    ------- Use english. Were not useing ebonics here. Indexing and search is of information? wtf? Relational database systems allow the user to do a search for RELATED information. Not just specific entries. Just as all these posts are RELATED to the origional Post(entry) they are related to the FORUM(catagory) which is related to the FORUMS(Forum object on the webserver). There are numerus other relationships that are too tedious to point out to your narrow field of view. Mabey you should stop useing antiquated old email listing systems (usenet) and join the modern world. Im confusing things again am i? Your confuseing stupid for english. Before you post please write it down and ask someone to grammar check you because the first and last sentance of this block DONT MEAN ANYTHING. Use English.
    ----------
    Quote:
    Expiration is generally against what your data base is supposed to do.

    You don't understand what "expiration" in HTTP means. If its not set then the data can expire immediately since its not been defined. If I don't know when data expires then I also can't assume that it will never expire or that the data expires once I get a copy (as the case might be in a ticket reservation system). If data is volatile then this is want you may want to get someone to keep asking for data. This is where the modification date/time enter the picture. One then asks if the data has been changed since the last time one asked and got it.. But if it too is undefined then one will probably need to assume that the data might have been changed (most browsers allow one to set this on a per-session etc. basis but its in the hands of the client and not server). There are also some features for a hash of context to try to distinguish between changes but I think I'm getting too deep into the fine details of designing spiders and search engines (which I do) and less sites.
    ---------- Expiration means the data expires at a certain point. IE it is no longer relavent. IE it should not be viewed. IE you are confuseing things. IF Expiration is NOT SET then the data is not supposed to expire. If its not set you cant assume its supposed to expire at all. All you know is that the expiration is not set. The forums dont operate on a token key system. There is no reservation. If the data is volatile then you may want the data to be volatile. The context of the data and how it is acessed determines if you want people to keep asking for it. Because most of this site has dynamic generation of data probably through php scripts then by its very design you want users to continualy request the newest data. The browser should always assume in this case that the data HAS changed. A spider or robot just follow links and there is code to stop them from trying to index past a certain point. The robot.txt and setting iptables are the most common ways.
    ---------
    Quote: You don't want data to just expire and disappear into thin space. Least of all in a community like this.

    That's why one needs to set an expiration date at a distant point in the future and set the modification date. Its up to the site administration to try to controll how clients (and web spiders are in this capacity nothing other than clients with a code of behaviour) behave.
    -------- That is why you DONT set the expiration data Period. End of Story. You do not want your forum data to expire. You already have built into the database systems to limit the data returned by the database on client request. That data should not expire unless the forum administrator wishes it to. Your method requires continualy reseting the expiration date when anything changes. A simpler step is to just not expire the data.
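
The relationships described in post #16 (posts belong to a thread, threads belong to a forum, and a post can be edited in place) can be sketched with a small relational schema. The table and column names are hypothetical and are not taken from the forum software actually running the site.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE forums  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE threads (id INTEGER PRIMARY KEY,
                      forum_id INTEGER REFERENCES forums(id), title TEXT);
CREATE TABLE posts   (id INTEGER PRIMARY KEY,
                      thread_id INTEGER REFERENCES threads(id),
                      author TEXT, body TEXT, edited_at TEXT);
INSERT INTO forums  VALUES (1, 'Site Feedback');
INSERT INTO threads VALUES (1, 1, 'Spiders and server load');
INSERT INTO posts   VALUES (1, 1, 'alice', 'First post', NULL);
INSERT INTO posts   VALUES (2, 1, 'bob', 'A reply', NULL);
""")

# Editing a post is an ordinary UPDATE on its row.
conn.execute(
    "UPDATE posts SET body = ?, edited_at = ? WHERE id = ?",
    ("A reply (edited)", "2005-03-02T12:00:00", 2),
)

# One query walks the post -> thread -> forum relationships.
rows = conn.execute(
    """
    SELECT forums.name, threads.title, posts.author, posts.body
    FROM posts
    JOIN threads ON posts.thread_id = threads.id
    JOIN forums  ON threads.forum_id = forums.id
    WHERE forums.id = ?
    ORDER BY posts.id
    """,
    (1,),
).fetchall()

for row in rows:
    print(row)
```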

  7. #17

    Join Date
    Nov 2002
    Location
    New Jersey
    Posts
    963
    Certainly more than I needed to know... Maybe we should move this to the lounge...

  8. #18
    SchwinnParamount's Avatar
    Join Date
    Nov 2004
    Location
    Tacoma, WA
    Shooter
    4x5 Format
    Posts
    1,011
    Images
    42
    Valthonis,

    There is no reason for you to make personal attacks against edz. He may be wrong but doesn't deserve rudeness. As an aside, before attacking his grammar you should check your own. Use a spell checker too. You have spelling errors too numerous to mention.

    I am also an RDBMS admin/software developer but can spell and write the Queen's English. We should not let technical expertise excuse us from the requirement to write well.

  9. #19
    Andy K's Avatar
    Join Date
    Jul 2004
    Location
    Sunny Southend, England.
    Shooter
    Multi Format
    Posts
    9,422
    Images
    81
    Bloody hell Schwinn, you actually bothered reading all that? lol!


    -----------My Flickr-----------
    Anáil nathrach, ortha bháis is beatha, do chéal déanaimh.

  10. #20
    edz

    Join Date
    Dec 2002
    Location
    Munich, Germany
    Shooter
    Multi Format
    Posts
    685
    Quote Originally Posted by Valthonis
    Originally Posted by Aggie

    --------
    I have to break this one down into its component flawed arguements to best show why this is a wrong interpretation.
    Skipping much of the drivel, I see a lack of understanding.

    Quote: You are confusing context with storage models.
    --------- You are confusing context with method. Just because you view the forums LIKE an email system does not an email system make. You arent logging on to a pop3 server to transmit a properly formated message to
    POP3 is just a little protocol designed to pass around some e-mail messages that conform to a certain standard. One should never confuse a transfer protocol with the content of the messages being transferred. One should never confuse the syntactics of the message, its form, with its context. One should never confuse the grammar with the model. One should not confuse metaphor with concrete instance. A story is a story and pixies don't in real life fly.

    Quote: The typical RDBMS system is designed to handle volatile and dynamic information.
    --------- Firstly a definition of RDBMS for those too lazy to use google.
    *Short for relational database management system and pronounced as separate letters, a type of database management system (DBMS) that stores data in the form of related tables. Relational databases are powerful because they require few assumptions about how data is related or how it will be extracted from the database. As a result, the same database can be viewed in many different ways.*
    So? And they are ill-suited for many uses. RDBMSs are typically not terribly good at searching for something in anything.

    Databases in an environment such as apug Should handle volatile and dynamic information. The entries are specificly designed to be editable and thus dynamic.
    No. One really needs to distinguish between the need/desire to be able to handle some changes and the need/desire to handle volatility. It's like the difference in evaporation between a glass of oil and a glass of acetone. In designing RDBMSs--- and I've designed a few, including even work some decades ago to go as far as to embed an RDBMS in a disk controller--- one has a set of goals and constraints. Once one gets rid of the demand for volatility and "instants" in contrast to snapshots one can start to use other approaches, models and algorithms. I have some customers that have been using my software (and implementation of some of my algorithms) to search through many millions and millions of human genome records. RDBMS systems like Oracle, running even on large sexy big Sun servers, just can't handle things. On a cheap PC from the supermarket we can search through many GBs of data in milliseconds. We can do this also searching through complete structure (yes, we can search anywhere in an XML/SGML tree including even among unnamed siblings). Nearly all of the RDBMSs tend to be very poor at polymorphism. A developer of a system using an RDBMS needs to define his data model with great care. You don't want to feed an RDBMS with just anything, just anyhow.
    Look over at photo.net, a very well developed RDBMS-backed forum system. .. why do you think (beyond the VC-deathtrap that ArsDigita fell into) they don't offer search (despite even starting off using Illustra and then Oracle)?



    Quote: The model for these forums is little else other than a threaded mailing list or what we've come to call Usenet News lifted over to a web interface (at first mail->web but latter also web->web). As a little aside I created the Mail->Web genre a good dozen years ago: see the w3c.org web museum.....
    -------- The model for these forums is little else other than phpBB with several phpHacks and a skin that matches the design. This is the defacto standard for web forums but the administrator could have used a diffrent
    I don't understand the relevance of popularity--- the crux of your "defacto standard" as in Microsoft Windows to totalitarian and corrupt governments to ... I don't want to touch upon why PHP is currently popular nor the current trends of "development".

    PHP is, however, not a standard but a popular piece of software. Most of the forums that have been developed using PHP seem to assume a similar data model but there is little in PHP to dictate the model nor is there any indication that these models have been chosen by careful design to scale. Because they don't scale well.

    One can model the messages and threads and trees of discussions in an RDBMS but it's not terribly efficient for either search, discovery or retrieval. RDBMSs tend to be very good at storage but have a high overhead for indexes. The systems work fine as long as they are small but as they grow and get "large" (the semantics for "large" change in response to developments in computers and storage devices) they become unusable. There are means and ways to try to address these deficits.
    Edward C. Zimmermann
    BSn R&D // http://www.nonmonotonic.net
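
Post #20 argues that search over static messages is better served by structures other than relational tables, without naming one. One common such structure is an inverted index, sketched below. This is a generic textbook illustration and is not the poster's software or algorithms; the sample texts and function names are made up for the example.

```python
from collections import defaultdict
import re


def build_index(posts):
    """Map each lowercase word to the set of post ids that contain it."""
    index = defaultdict(set)
    for post_id, text in posts.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(post_id)
    return index


def search(index, query):
    """Return sorted ids of posts containing every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return sorted(result)


# Hypothetical sample messages keyed by post number.
posts = {
    11: "Posts are data base entries",
    12: "Relational database systems are poorly suited to search",
    20: "RDBMSs tend to be good at storage but have a high overhead for indexes",
}
index = build_index(posts)

print(search(index, "database systems"))
print(search(index, "data base"))
```

Because the messages rarely change, the index can be built once and updated only when a post is added or edited, which is the "snapshot" property the post relies on.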
