Where
Evaluate Translation Quality with METEOR Score
What
The problem statement has several misleading and confusing points:
1. Reasoning
- The matched unigrams should be 'rain', 'from', 'the', 'sky' in the version we are asked to implement, i.e. exact unigram matching without considering semantic similarity of unigrams, as per this note in learn.md:
- "computes precision (4/6) and recall (4/5)": the recall should also be 4/6, since the reference and candidate sentences have the same length (see the first sketch after this list).
2. "Learn about the topic" section, example subsection
In learn.md, in the example section for this setup, the chunk calculation is wrong.

In METEOR, the matched unigrams are grouped into the smallest possible number of chunks, where a single chunk is a run of unigrams that are consecutive in both the candidate and the reference sentences; a chunk ends as soon as a non-matching unigram is hit. Thus the chunking given in the example (2 chunks) is not the minimal grouping (see the second sketch below): https://en.wikipedia.org/wiki/METEOR#:~:text=In%20order%20to%20compute%20this%20penalty%2C%20unigrams%20are%20grouped%20into%20the%20fewest%20possible%20chunks%2C%20where%20a%20chunk%20is%20defined%20as%20a%20set%20of%20unigrams%20that%20are%20adjacent%20in%20the%20hypothesis%20and%20in%20the%20reference
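
For point 1, here is a minimal sketch of exact unigram matching and the resulting precision/recall. The function names, whitespace tokenization, and greedy left-to-right matching are assumptions for illustration only, not the exercise's actual API:

```python
def exact_unigram_alignment(candidate, reference):
    """Greedily align each candidate token to an unused, identical reference token.

    Exact string matching only: no stemming, synonym, or paraphrase modules.
    Returns the token lists and a list of (candidate_index, reference_index) pairs.
    """
    cand, ref = candidate.lower().split(), reference.lower().split()
    used, alignment = set(), []
    for i, tok in enumerate(cand):
        for j, ref_tok in enumerate(ref):
            if j not in used and tok == ref_tok:
                alignment.append((i, j))
                used.add(j)
                break
    return cand, ref, alignment


def precision_recall(candidate, reference):
    """Unigram precision = matches / candidate length, recall = matches / reference length."""
    cand, ref, alignment = exact_unigram_alignment(candidate, reference)
    m = len(alignment)
    return m / len(cand), m / len(ref)
```

With six-token candidate and reference sentences and four exact matches ('rain', 'from', 'the', 'sky'), this gives precision = recall = 4/6, which is the correction proposed in point 1.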
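For point 2, here is a sketch of the fewest-chunks grouping from the Wikipedia definition linked above: a new chunk starts whenever the next matched pair is not adjacent to the previous one in both sentences. Again, the names are hypothetical, and the greedy alignment above does not explore alternative alignments to break ties the way full METEOR does:

```python
def count_chunks(alignment):
    """Count the fewest chunks for a given alignment.

    `alignment` is a list of (candidate_index, reference_index) pairs.
    A chunk is a maximal run of matches adjacent in BOTH the candidate and
    the reference, so a new chunk starts whenever either index jumps.
    """
    if not alignment:
        return 0
    alignment = sorted(alignment)          # walk matches in candidate order
    chunks = 1
    for (ci, ri), (pci, pri) in zip(alignment[1:], alignment):
        if ci != pci + 1 or ri != pri + 1:
            chunks += 1                    # adjacency broke, open a new chunk
    return chunks
```

The fragmentation penalty then uses this count: in the standard formulation, penalty = 0.5 * (chunks / matches)^3, so overstating the number of chunks also overstates the penalty.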
