folks realize that using the "number of hits returned on google" is a hilariously bad way to prove a point -- right?
Wrong. What's wrong with using the vast internet resources as a quasi-corpus for natural languages (if you avoid certain pitfalls, which I alluded to in my last message)?
because people assume that all texts that are available are represented, which according to the google people they are *not*. in other words, the sample that you are pulling numbers from is neither complete nor perfect - so your results won't be either. do you understand what google does well enough (details of the algorithm, et cetera) to know what the weaknesses are? oh, you say they haven't published enough information for you to know? that's what i thought. :|
I am afraid this is how your argument sounds to me. Why should it be wrong to use the number of google hits under all circumstances?
i think your tone is pretty crass.
If I want to show that Canada is better known than Vanuatu (http://googlefight.com/index.php?lang=en_GB&word1=canada&word2=vanuatu), why would the comparison of google hits be inadmissible? (There are a number of reasons why the "Vanuatu" hits are inflated, but that is of no concern here.)
popularity of a term is one of the few instances in which comparative occurrence within the google corpus *might* be useful. it would depend on your question, and on whether the data available from the particular google server you're connected to is appropriate for answering it. --elijah