Last week, Professor McCarthy gave this online talk for ALS DELTA candidates. It was a brilliant presentation and I am very glad I attended it!
🗝️ Key takeaways.
Top 40
The top 40 words in a written corpus are grammatical items.

In a spoken corpus, the top items include hesitations, pronouns, not, do, so, etc.

We all hesitate and take our time. However, when students hesitate, we think it’s a problem... but natural spoken English includes hesitation!
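If you would like to try the "top 40" idea on your own texts or transcripts, here is a minimal Python sketch. It is only an illustration, not the method used for the corpora in the talk: the function name `top_words`, the simple tokenising rule, and the toy transcript are all my own assumptions.

```python
import re
from collections import Counter

def top_words(text, n=40):
    """Return the n most frequent word forms in text, lowercased."""
    # Keep apostrophes so contractions such as "don't" stay whole.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens).most_common(n)

# Toy transcript invented for illustration; not data from the talk.
sample = "Er, I think, um, I do not know, so I do not mind, er, so yes"

for word, count in top_words(sample, 5):
    print(word, count)
```

Even in this tiny invented sample, the hesitation marker "er" sits near the top next to "I", "do", "not" and "so", which is the kind of pattern a real spoken corpus shows at scale.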
Collocations and chunks
Collocations: a statistical measure of words occurring together.
Chunks are words cemented, glued, stuck together firmly.
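To make "statistical measure" a little more concrete, here is a rough sketch using pointwise mutual information (PMI), one common way of scoring how strongly two words attract each other. I am assuming PMI purely for illustration; the talk did not say which measure was used, and the function name `pmi_scores`, the `min_count` threshold, and the toy text are my own.

```python
import math
from collections import Counter

def pmi_scores(tokens, min_count=2):
    """Score each adjacent word pair by pointwise mutual information (PMI)."""
    word_counts = Counter(tokens)
    pair_counts = Counter(zip(tokens, tokens[1:]))
    n_words = len(tokens)
    n_pairs = max(n_words - 1, 1)
    scores = {}
    for (w1, w2), count in pair_counts.items():
        if count < min_count:
            continue  # pairs seen only once give unreliable scores
        p_pair = count / n_pairs
        p_w1 = word_counts[w1] / n_words
        p_w2 = word_counts[w2] / n_words
        # Positive PMI: the pair occurs more often than chance would predict.
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores

# Toy text invented for illustration.
tokens = "we do business here and they do business there and we make plans".split()

for pair, score in sorted(pmi_scores(tokens).items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

A real corpus would need far more text and a higher `min_count`, but the principle is the same: pairs with a high score are candidates for collocations.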
Learner corpus
How creating a learner corpus (an electronic collection of learners’ responses) can help teachers:
We can identify which miscollocations are ‘fossilised’. Here are examples of miscollocations with make, at levels B1-C2. We can see that make business is the most ‘stubborn’ error.

We can measure error reduction by level.
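As a rough idea of what measuring error reduction could look like with your own learner corpus, here is a small sketch that turns raw error counts into a rate per 1,000 words for each level. The counts below are invented for illustration and are not the figures from the talk; the helper name `errors_per_thousand` is also my own.

```python
def errors_per_thousand(error_count, word_count):
    """Normalise an error count so levels with different text sizes compare fairly."""
    return 1000 * error_count / word_count

# Invented counts for illustration: level -> (miscollocations found, total words)
toy_corpus = {"B1": (30, 10000), "B2": (18, 12000), "C1": (6, 12000)}

rates = {
    level: errors_per_thousand(errors, words)
    for level, (errors, words) in toy_corpus.items()
}

for level, rate in rates.items():
    print(f"{level}: {rate:.1f} errors per 1,000 words")
```

Normalising by the number of words matters because higher-level learners usually write longer texts, so raw counts alone can hide a real reduction.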

Binomials
Binomials are two words connected with and/or in a fixed order.
For example, in English we say black and white, not white and black, and we say pros and cons, not cons and pros.
Students may produce them inaccurately because of:
- L1 transfer: in Greek, we say white and black
- memory problems
- lack of exposure
Here are examples of binomial-related confusion at different levels:

Finally
Professor McCarthy said learners can function with 4,000 words – these will cover them for 90% of what they want to do.
Another thing I’ve learned from Prof. McCarthy
Every single slide, every single piece of information was interesting. He presented and explained everything clearly, without rushing any part. He paused and gave us thinking time. His slides included useful tables and charts rather than text overload. His voice and whole presence were calm and very positive. Food for thought!
Hi Rachel.
I like the way you describe Prof. McCarthy. It seems his demeanour has influenced the way you engaged with the information. He sure comes across as someone one could be around all day.
It was intriguing to read the 40 most frequent spoken words. If I understand this correctly, this corpus was gathered from recorded conversations of members of the UK public between 2012 and 2016, wasn’t it? Also, you highlight a crucial concern of mistreating hesitations as problems.
By the way, I have started to collate samples of repeated spoken errors from my learners. Hopefully, I will be able to put it to good use in the near future 😎
I missed that info about country/year, I’m afraid. Good luck with your learner corpus! Would love to find out how it goes!