Re: Air-l digest, Vol 1 #94 - 4 msgs
number of web pages worldwide

Regarding the number of web pages worldwide, I was informed that C. Lee Giles at Penn State (www.ist.psy.edu/faculty/giles.html) has developed some tools to sample and estimate the number of pages. I also learned of the distinction between the dark web (pages inaccessible to crawlers) and the pages that are available and referenced via search engines. It seems that one estimate puts the dark web at 90% of total pages.

Greg Monaco

Gregory E. Monaco, Ph.D.
Program Director, Advanced Networking
National Science Foundation
703-292-8948
Check the slightly-out-of-date, fairly technical article from the WWW9 conference (2000). It made a lot of headlines at the time, as it was a collaboration between IBM and AltaVista / Compaq.

http://www.almaden.ibm.com/cs/k53/www9.final/

They estimate only / as many as 56 million "strongly connected" web pages, which they found by a variety of techniques, including random IP address checking and web crawls, in a larger set of 200 million pages. One might question their technique, as Google claims to have indexed over "1,387,529,000 web pages".

Danyel
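For readers unfamiliar with the term: a set of pages is "strongly connected" when every page in it can reach every other page by following links. The sketch below is not the WWW9 authors' code or data. It is a minimal illustration, using Kosaraju's two-pass algorithm, of how such a core could be picked out of a crawled link graph. The five-page graph and the names TOY_LINKS and strongly_connected_components are made up for the example.

```python
from collections import defaultdict

def strongly_connected_components(links):
    """Return the strongly connected components of a link graph as a list of sets.

    `links` is an iterable of (source_page, target_page) pairs.
    Uses Kosaraju's algorithm with explicit stacks instead of recursion.
    """
    graph, reverse, nodes = defaultdict(list), defaultdict(list), set()
    for src, dst in links:
        graph[src].append(dst)
        reverse[dst].append(src)
        nodes.update((src, dst))

    # Pass 1: depth-first search, recording pages in order of completion.
    visited, order = set(), []
    for start in sorted(nodes):
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                # All outgoing links explored: this page is finished.
                order.append(node)
                stack.pop()

    # Pass 2: follow links backwards, starting from the last-finished pages.
    assigned, components = set(), []
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component, todo = set(), [start]
        while todo:
            node = todo.pop()
            component.add(node)
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    todo.append(prev)
        components.append(component)
    return components

# Hypothetical toy crawl: a, b, c link in a cycle; d is only linked to;
# e only links in. Only a, b, c can all reach one another.
TOY_LINKS = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("e", "a")]

components = strongly_connected_components(TOY_LINKS)
core = max(components, key=len)
pages = {page for link in TOY_LINKS for page in link}
print(sorted(core), len(core) / len(pages))
```

In this toy graph the core is three pages out of five. The pages outside it (one reachable only from the core, one that only links into it) are the kind of pages a crawl still finds but that fall outside the strongly connected count.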
Google's index includes dead pages which are cached but otherwise lost, FWTW.

-----Original Message-----
From: air-l-admin@aoir.org [mailto:air-l-admin@aoir.org]On Behalf Of Danyel Fisher
Sent: Tuesday, August 28, 2001 11:53 AM
To: air-l@aoir.org
Subject: [Air-l] Worldwide Web Pages

_______________________________________________
Air-l mailing list
Air-l@aoir.org
http://www.aoir.org/mailman/listinfo/air-l
And how does one count dynamically generated pages? Are there as many web pages as books available through Amazon? Is Google's index counted as only two web pages even though almost every instance of the second one is different?

-----Original Message-----
From: air-l-admin@aoir.org [mailto:air-l-admin@aoir.org]On Behalf Of monaco
Sent: Tuesday, August 28, 2001 11:38 AM
To: air-l@aoir.org
Subject: [Air-l] Re: Air-l digest, Vol 1 #94 - 4 msgs
participants (3)
- Danyel Fisher
- Ellis Godard
- monaco