I wanted to compare how each of these tools handled each step of my 9-step prompt for providing feedback to learners of any language, so I made a table in Word with the transcripts for each step for each tool. I did this on my PC.
I checked the CEFR level of each transcript and found some that were probably too difficult for a good B1/B2 learner. I also asked Copilot to compare the five transcripts for each step, evaluate them, and tell me which was best. Copilot excelled at this: apart from giving me a detailed analysis, it also gave me a table evaluating all the aspects of each step for each tool.
The Word file is here.
I added a column showing the ranking from 1 to 5, where 1 was the best. I then copied all the tables into Excel and created a final summary table in which the rankings for each step were added up for each tool. The tool with the lowest total would be the best overall.
The Excel file is here.
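The rank-sum tally described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the spreadsheet itself: the function name `total_ranks`, the labels "Tool A" to "Tool C", the step names, and all the ranks are made-up placeholders, not the real figures.

```python
from collections import defaultdict


def total_ranks(rankings_by_step):
    """Sum each tool's rank across all steps; a lower total is better."""
    totals = defaultdict(int)
    for step_ranks in rankings_by_step.values():
        for tool, rank in step_ranks.items():
            totals[tool] += rank
    # Sort ascending so the best (lowest) total comes first.
    return sorted(totals.items(), key=lambda item: item[1])


# Placeholder ranks for illustration only, NOT the results from the Excel file.
example = {
    "Step 1": {"Tool A": 1, "Tool B": 2, "Tool C": 3},
    "Step 2": {"Tool A": 2, "Tool B": 1, "Tool C": 3},
    "Step 3": {"Tool A": 1, "Tool B": 3, "Tool C": 2},
}

for tool, total in total_ranks(example):
    print(f"{tool}: {total}")
```

One thing a rank sum like this hides is how far apart the tools were within a step: finishing a close second and a distant second both add 2 to the total.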
The winner, by one point, was Gemini, closely followed by Copilot and then Copilot 365. The AI Language Coach, which I had vibe-coded myself, was a long way behind, and ChatGPT was the worst by a small margin.
These are the steps that I compared:
- Original transcript: Gemini rated this as B2, but the others said B1+/B2
- Explaining the corrections: between B2+ and C1, except for ChatGPT (B2/B2+)
  - Gemini is the most effective for learning because it is clear, detailed, and explains both the correction and the reasoning behind it.
- The corrected version: mostly B2, but one was B1+/B2 and another B2+
  - Copilot 365 is the best corrected version for learning and for natural, advanced English.
- Explaining the changes to make it more colloquial: B2+, B2/B2+, B2+/C1
  - Gemini is the best explanation for learning how to make a monologue more colloquial. It’s clear, thorough, and models the kind of language it teaches.
- The more colloquial version: B2, B2+, B2/B2+
  - Copilot is the best colloquial version. It’s lively, idiomatic, and sounds just like a native speaker telling a travel story.
- Explaining the changes to make it half a CEFR level higher: B2+/C1, but some were C1 or C1/C1+
  - Gemini is the best version for a colloquial rewrite at half a CEFR level higher, with advanced vocabulary and idioms.
- The version half a CEFR level higher: B2/B2+, but ChatGPT B2/B2+ and AI Language Coach C1
  - Gemini is the best version for a colloquial rewrite at half a CEFR level higher, with advanced vocabulary and idioms.
- Explaining the changes to make it one CEFR level higher: C1/C1+, C1
  - Copilot is the best explanation for how to make the text one CEFR level higher, keeping it lively, idiomatic, and expressive. It’s clear, engaging, and models the advanced, natural style it describes.
- The version one further CEFR level higher: C1, C1/C1+, C1+ (AI Language Coach)
  - Copilot is the best version for one CEFR level higher, keeping it lively, idiomatic, and expressive.
- What does ____ mean?: B2+ on average, but ChatGPT B1/B1+
  - Gemini is the best for thoroughness and depth.
  - Copilot is the best for clarity, examples, and learner engagement.
- What level on the CEFR scale was I?: B2+ on average, but Gemini had C1
  - Copilot 365 is the best reply: it’s clear, detailed, supportive, and gives both a direct answer and actionable feedback.
Conclusions
- If AI Language Coach is worth developing further, I need to find a way to:
  - ensure the language it uses is at the learner’s level
  - allow learners to upload videos as well as audio files
  - add a slider so learners can adjust the speed of the Read Aloud
- The step that goes a further full CEFR level higher should be changed to go a further half level higher.
- Further experimentation is needed with pasting the ChatGPT prompt generated by Turboscribe into Gemini and the two versions of Copilot.
  - This experimentation needs to be done using an Android phone and my iPad.