Interesting post here about the Google Ngram Viewer and its limitations. One possible limitation is whether the amount of literature included for each time period is normalized. In other words, is every period represented in equal amounts in Google Books, or are there more texts from the 20th century onward than from earlier periods? Since all the data comes from Google Books itself, is the viewer just a raw reading of the data, or are the years normalized? I have only casually looked into this tool, so I don’t know the answer. If anybody knows the Google Ngram Viewer well or uses it on a regular basis, feel free to comment.
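To make the question concrete, here is a minimal sketch of the difference between a raw count and a normalized (relative) frequency. The numbers are invented purely for illustration and are not real Google Books data; the names `raw_counts`, `total_words`, and `relative_frequency` are my own and do not come from any Google tool.

```python
# Hypothetical figures, made up for illustration only (not Google Books data).
raw_counts = {1850: 120, 1950: 1200}                 # times a word appears in that year's books
total_words = {1850: 1_000_000, 1950: 10_000_000}    # all words in that year's books


def relative_frequency(year):
    """Return the word's share of all words printed in the given year."""
    return raw_counts[year] / total_words[year]


for year in sorted(raw_counts):
    # Print the raw count next to the normalized share for comparison.
    print(year, raw_counts[year], f"{relative_frequency(year):.4%}")
```

In this made-up case the raw count is ten times higher in 1950, yet the word's share of everything printed is identical in both years. That is the gap I am asking about: a chart of raw counts would mostly show that more books exist for later years, while a chart of normalized frequencies would not.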
When Google’s Ngram Viewer was the topic of a post on Science-Based Medicine, I knew it was becoming mainstream. No longer happy to only be toyed with by linguists killing time, the Ngram Viewer had entranced people from other walks of life. And I can understand why. Google’s Ngram Viewer is an impressive service that allows you to quickly and easily search for the frequency of words and phrases in millions of books. But I want to warn you about Google’s Ngram Viewer. As a corpus linguist, I think it’s important to explain just what Ngram Viewer is, what it can be used to do, how I feel about it, and the praise it has been receiving since its inception. I’ll start out simple: despite all its power and what it seems to be capable of, looks can be deceiving.
Have we learned nothing?
Jann Bellamy wrote a post