Sunday, 1 February 2009

4.3.2 The size of the Internet

How big is the Internet? Does Google index every website?
These questions are hard to answer, but some published figures [31] make a
rough estimate possible. According to those sources, in 2005 the Internet was
estimated at 5 million terabytes and Google's index at 170 terabytes, which
would mean that Google was processing only about 0.0034% of it. However, the
Internet also contains what we call the invisible web: websites whose owners
do not want their content indexed, as well as websites protected by a
password. In 2004 this invisible web was estimated to be 500 times bigger than
the visible web. It has also been said that Google started indexing parts of
the invisible web only recently [32].
A rough calculation then gives us:
Internet size = Visible Web + Invisible Web;
Internet size = Visible Web + 500 * Visible Web = 501 * Visible Web;
Visible Web = Internet size / 501;
Visible Web = 5 000 000 / 501;
Visible Web ≈ 9980 terabytes;
Google's share of the Visible Web = Google Index / Visible Web = 170 / 9980;
Google's share of the Visible Web ≈ 1.7%
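
The same arithmetic can be reproduced with a few lines of Python. This is only a sketch of the calculation above: the constants are the 2004 and 2005 estimates quoted in the text, and the function and variable names are my own, chosen for illustration.

```python
# Figures quoted in the text (2005 estimates, in terabytes).
INTERNET_SIZE_TB = 5_000_000
GOOGLE_INDEX_TB = 170
# 2004 estimate: the invisible web is about 500 times the visible web.
INVISIBLE_TO_VISIBLE_RATIO = 500


def visible_web_size(internet_size_tb, ratio):
    """Internet = visible + ratio * visible, so visible = Internet / (1 + ratio)."""
    return internet_size_tb / (1 + ratio)


def share_percent(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100 * part / whole


visible_tb = visible_web_size(INTERNET_SIZE_TB, INVISIBLE_TO_VISIBLE_RATIO)
google_share = share_percent(GOOGLE_INDEX_TB, visible_tb)

print(f"Visible web: about {visible_tb:.0f} TB")
print(f"Google's share of the visible web: about {google_share:.1f}%")
```

Changing the ratio or the index size shows how sensitive the result is to the input estimates, which is the main reason to treat the final percentage with caution.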
This figure is of course only an estimate and could be full of errors, but I
find it more useful than no information at all. I would also stress that
Google has no interest in indexing poor-quality websites, and that technology
has evolved over the last few years, so the real share today is probably far
higher than 1.7%. Whatever the exact result, my point is the following:
Google is not the Internet and does not process the whole web... but does
the whole web need to be indexed?
