Thread: Important issues for WWW forums

  1. #21
    Moderator
    Join Date
    Jan 2001
    Posts
    8,651

    Kirk, the new website is definitely up - nice and easy to get around. The "articles" link brings up the menu of articles, but the individual article links don't work. Ditto with the resume link from the biography page.

  2. #22
    Kirk Gittings
    Join Date
    Mar 2004
    Location
    Albuquerque, Nuevo Mexico
    Posts
    9,864

    Oren,

    Odd, I get my old site even when I clear the cache and history.
    Thanks,
    Kirk

    at age 73:
    "The woods are lovely, dark and deep,
    But I have promises to keep,
    And miles to go before I sleep,
    And miles to go before I sleep"

  3. #23

    Join Date
    Jun 2002
    Posts
    9,487

    It's 404ing the articles, Kirk.

  4. #24

    Join Date
    Dec 2004
    Posts
    192

    "But he's a bastard and that is something I will comfortably say to the g@d@mn tratiors at the New York Times."

    Well, at least you can't accuse them of being pinko liberal left-wing and anti-administration - after sitting on the story of Presidential law-breaking for a whole year.

  5. #25

    Join Date
    Jul 2005
    Posts
    953

    ' You can prevent any website that you control from being "crawled" '

    Oren, this is not actually correct. What you can do is put code in place that effectively asks well-behaved crawlers not to crawl or archive your site. You can also block the IP addresses of known crawlers. What you cannot do is stop crawlers that are unknown to your code, or badly behaved crawlers that use constantly changing (false) IP addresses, from crawling your site. Many web archive sites are badly behaved. Google happens to be one of the few well-behaved crawlers.

    The only effective way to ensure your site is not crawled is to password-protect it. For most web sites this is self-defeating.
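
    To illustrate the "ask well-behaved crawlers" point, here is a minimal sketch of the check a polite crawler might run before fetching a page, using Python's standard urllib.robotparser module. The robots.txt contents, the example.com URLs, and the helper name allowed() are assumptions made up for illustration. Nothing forces a crawler to run this check, which is exactly the limitation described above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration; not taken from any real site.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if the sample robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; a real crawler would download them first.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    checks = [
        ("Googlebot", "http://example.com/articles/index.html"),
        ("Googlebot", "http://example.com/private/notes.html"),
        ("ia_archiver", "http://example.com/articles/index.html"),
    ]
    for agent, url in checks:
        verdict = "may fetch" if allowed(agent, url) else "is asked not to fetch"
        print(f"{agent} {verdict} {url}")
```

    The rules are purely advisory: a badly behaved crawler simply skips the can_fetch() step and requests the page anyway.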

  6. #26

    Join Date
    Oct 1998
    Location
    Columbus, OH
    Posts
    42

    Kirk- I read your exchange on Mark Justice Hinton's site but I can't see where I can buy your book. Oh wait, here it is:

    http://www.amazon.com/gp/product/offer-listing/0826312780/ref=dp_olp_2//103-6467018-9327028?condition=all
    http://www.amazon.com/gp/product/offer-listing/0826312772/ref=lp_g_1/103-6467018-9327028?%5Fencoding=UTF8

    (snicker, snicker) ...and yes, I bought one.

    -Ben

  7. #27
    Moderator
    Join Date
    Jan 2001
    Posts
    8,651

    "Oren, this is not actually correct. What you can do is put code to effectively ask well behaved crawlers not to crawl or archive your site."

    OK. I know that Google and the Internet Archive ("Wayback Machine") are well-behaved in this sense, but I don't know as much about malicious sites. Thanks for the correction. Just for my own education, off the top of your head can you point to any specific archive sites that are ill-behaved in this way? I'm curious as to exactly what they're doing with the information. Harvesting email addresses for spam?

  8. #28
    Kirk Gittings
    Join Date
    Mar 2004
    Location
    Albuquerque, Nuevo Mexico
    Posts
    9,864

    Thank you Ben!

    Frank, sorry I have no idea what that means.
    Thanks,
    Kirk

  9. #29
    Moderator
    Join Date
    Jan 2001
    Posts
    8,651

    Kirk, "404" is just the code for the "page cannot be found" screen that you get when a link doesn't lead anywhere.
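
    For anyone curious what that looks like in practice, here is a small sketch using Python's standard library. The helper names describe(), is_broken() and fetch_status() are made up for illustration, and fetch_status() needs a network connection, so it is defined but not called here.

```python
import urllib.error
import urllib.request
from http import HTTPStatus


def describe(status_code: int) -> str:
    """Return the code with its standard reason phrase, e.g. for 404."""
    status = HTTPStatus(status_code)  # raises ValueError for unknown codes
    return f"{status.value} {status.phrase}"


def is_broken(status_code: int) -> bool:
    """Treat any 4xx (client error) or 5xx (server error) as a broken link."""
    return 400 <= status_code < 600


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Ask the server for a page's headers only and return the status code."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx responses; the code is on the exception.
        return err.code


if __name__ == "__main__":
    for code in (200, 404):
        label = "broken" if is_broken(code) else "ok"
        print(f"{describe(code)}: {label}")
```

    Running fetch_status() over each article link on a site would show which ones are "404ing".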

  10. #30

    Join Date
    Jul 2005
    Posts
    953

    Off the top of my head? No. The last time I looked, which was quite a while ago, I found material I didn't expect to find, since I had used noarchive on some pages. I just decided it wasn't worth worrying about. The big boys seem to be quite well behaved and the rest are quite insignificant.

    For more info you can look at:

    http://www.robotstxt.org/wc/robots.html
    http://searchenginewatch.com/

    A quick look at my stats shows that approximately 200 different robots have visited my site over the last year. I have no idea what they are all doing with the information they extract. Some may be feeding search engines, others doing analysis of some kind, others archiving. None can be harvesting my address for email spam, because my email address doesn't appear anywhere on my web site.
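
    A rough sketch of how such a robot count might be pulled from a server access log, in Python. Everything here is an assumption for illustration: the sample log lines are invented, the log is assumed to end each line with the quoted user-agent string (as in the common "combined" format), and a visitor is guessed to be a robot if its user agent contains "bot", "crawler" or "spider".

```python
import re
from collections import Counter

# Invented sample lines; 192.0.2.x is a documentation-only address range.
SAMPLE_LOG = """\
192.0.2.1 - - [10/Jan/2006:10:00:00 +0000] "GET /robots.txt HTTP/1.1" 200 68 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"
192.0.2.2 - - [10/Jan/2006:10:01:00 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1)"
192.0.2.3 - - [10/Jan/2006:10:02:00 +0000] "GET /articles/ HTTP/1.1" 200 2048 "-" "ExampleSpider/0.1"
192.0.2.1 - - [10/Jan/2006:10:03:00 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"
"""

# The user agent is assumed to be the last double-quoted field on the line.
USER_AGENT = re.compile(r'"([^"]*)"\s*$')
ROBOT_HINTS = ("bot", "crawler", "spider")  # crude heuristic, not a standard


def count_robots(log_text: str) -> Counter:
    """Count requests per robot name, keyed by the text before the first '/'."""
    counts = Counter()
    for line in log_text.splitlines():
        match = USER_AGENT.search(line)
        if match is None:
            continue
        agent = match.group(1)
        if any(hint in agent.lower() for hint in ROBOT_HINTS):
            counts[agent.split("/")[0]] += 1
    return counts


if __name__ == "__main__":
    robots = count_robots(SAMPLE_LOG)
    print(f"{len(robots)} different robots seen")
    for name, hits in robots.most_common():
        print(f"  {name}: {hits} requests")
```

    This only counts robots that identify themselves. A badly behaved crawler can send a browser-like user agent and would not show up at all.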
