The power of web-based corpus research has become even more accessible thanks to Google Ngrams. It's funny how Zimmer writes about how this tool can suck up time: anyone using it may spend hours trying different combinations and comparisons. Good!
“Bigger, Better Google Ngrams: Brace Yourself for the Power of Grammar” (The Atlantic): Back in December 2010, Google unveiled an online tool for analyzing the history of language and culture as reflected in the gargantuan corpus of historical texts that…
Just wanted to give you notice that I will be giving a workshop at JALT CALL 2010. (That's the Japan Association for Language Teaching's Computer Assisted Language Learning conference, in case you don’t live in Japan.)
This workshop is devoted to one internet tool: Wordle, created by Jonathan Feinberg. I am offering it here to make teachers in Japan aware of its potential. We will create some word clouds in the lab and brainstorm as a group about how to apply them in the classroom.
You too can witness the action because I am presenting this via Google Docs Presentation software, which can allow users to see the presentation online!
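For anyone curious about what sits underneath a word cloud: as far as I understand it, Wordle sizes each word by how often it appears, after leaving out very common words. Here is a minimal Python sketch of that counting step. It is not Wordle's actual code; the stop-word list, the function name `word_frequencies`, and the sample text are all my own assumptions for illustration.

```python
from collections import Counter
import re

# A tiny, hand-picked stop list (an assumption); real tools use a much longer one.
STOP_WORDS = {"the", "a", "an", "and", "to", "of", "in", "is", "i", "it"}


def word_frequencies(text, top=10):
    """Return the `top` most frequent non-stop words with their counts."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top)


# Made-up sample text, for illustration only.
sample = "The cat sat. The cat slept. A dog barked at the cat."
print(word_frequencies(sample, top=3))
```

The words with the highest counts are the ones a word cloud would draw largest, which is why a student's favorite words jump out so quickly.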
I think the toughest thing about building a corpus is that if you are working alone, you have to input all the data yourself. Actually, this stage takes the most time for what I want to do.
For the project that I presented at the Temple University Japan Colloquium 2009, the whole process involves entering data by hand, typing each essay one by one. Luckily, junior high L2 writing is short, so to cut down on typing I look for any phrases that appear consistently across the essays and write them in a separate Word document. Then I hit Ctrl+C (hitting C twice) to open the clipboard and save the common phrases. How well this works depends on how similar the texts are. When I have to enter a new essay, I paste in the copied text along with the annotation information. Then I add or remove phrases according to what was actually written. This way the labor of inputting data is relieved somewhat and my arms can take a break.
If anyone knows of a better way to do this than the above method on Word or maybe Excel, I would be happy to hear from you.
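For what it's worth, the phrase-spotting step could probably be automated once a few essays have been typed in. The sketch below is only a rough idea in Python, not something I have used on the real data: it counts how many essays contain each three-word sequence and keeps the ones that show up in at least two. The function names, the choice of three words, and the sample essays are assumptions made up for illustration.

```python
from collections import Counter
import re


def ngrams(text, n):
    """Return the set of n-word sequences in one essay (lowercased)."""
    words = re.findall(r"[a-z']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def common_phrases(essays, n=3, min_essays=2):
    """Count how many essays contain each n-word phrase; keep the frequent ones."""
    counts = Counter()
    for essay in essays:
        counts.update(ngrams(essay, n))  # a set, so each essay counts once
    return [(p, c) for p, c in counts.most_common() if c >= min_essays]


# Made-up sample essays, for illustration only.
essays = [
    "I went to the park with my friend. It was fun.",
    "I went to the sea with my family. It was fun.",
    "I went to the zoo. I saw a panda.",
]

for phrase, count in common_phrases(essays):
    print(count, phrase)
```

The phrases it prints are candidates for the paste-in template, so each new essay would start from the shared wording and only the differences would need typing.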
I don’t know how many of you are interested in corpus linguistics, but I chanced upon the mother of all sites on the topic, with a guide to the programs available for doing research, free or for purchase.
The web-getting programs are something I want to look into. I think there are better things to do than highlight text, copy it, and do it all again. Way too time-consuming! The idea of clicking a button and getting a bunch of data for a DIY corpus sounds really nice right now.
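To give a sense of what such a program does behind the button, here is a minimal sketch using only Python's standard library. It is not how any of the listed tools work internally; the class and function names are my own, and it assumes the page is plain HTML that decodes as UTF-8. It downloads one page and keeps the visible text, skipping scripts and style sheets.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text chunks that are outside script/style blocks.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Strip tags from an HTML string and join the text chunks with spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def fetch_text(url):
    """Download one page and return its text (assumes UTF-8 content)."""
    with urlopen(url) as response:
        html = response.read().decode("utf-8", errors="replace")
    return html_to_text(html)
```

Calling `fetch_text` on a page address and saving the result to a text file would replace one round of highlight, copy, and paste. Before collecting pages in bulk, it is worth checking that the site allows it.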
Everything you wanted to know about corpus linguistics… but were afraid to ask.