Missing Data in MWPBench Test Dataset #1662

Open
yjdeng1 opened this issue Dec 2, 2024 · 0 comments

yjdeng1 commented Dec 2, 2024

Describe
Model I am using (UniLM, MiniLM, LayoutLM ...): UniLM

Issue Description:

I have noticed a problem with the MWPBench test dataset downloaded from the UniLM repository. According to the paper, the test dataset should contain 18,408 entries. After downloading and checking it, however, I found that it contains only 17,470 entries, which is 938 fewer than expected.

After careful investigation, I found that the missing entries correspond to the AGIEval-Math Competition data.

Details:

Expected number of entries: 18,408
Actual number of entries: 17,470
Missing content: AGIEval-Math Competition

Impact:
Evaluation results obtained on the downloaded test set may be inaccurate and may not be directly comparable with those reported in the paper.
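
For anyone who wants to reproduce the count, here is a minimal sketch of how the check could be done. It assumes the test set is stored as JSON Lines and that each record carries a field naming its source subset; the field name `source` and the file name `mwpbench_test.jsonl` are hypothetical and may not match the actual release.

```python
import json
from collections import Counter
from typing import Iterable

EXPECTED_TOTAL = 18_408  # entry count quoted from the paper in this issue


def count_by_source(lines: Iterable[str], key: str = "source") -> Counter:
    """Count JSON Lines records grouped by the value of `key`.

    `key` is a hypothetical field name; adjust it to the real schema.
    """
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        record = json.loads(line)
        counts[record.get(key, "<missing key>")] += 1
    return counts


def shortfall(counts: Counter, expected: int = EXPECTED_TOTAL) -> int:
    """Return how many entries are missing relative to `expected`."""
    return expected - sum(counts.values())


def report(path: str, key: str = "source") -> None:
    """Print per-subset counts, the total, and the shortfall for a file."""
    with open(path, encoding="utf-8") as f:
        counts = count_by_source(f, key)
    for name, n in sorted(counts.items()):
        print(f"{name}: {n}")
    print("total:", sum(counts.values()), "missing:", shortfall(counts))
```

Calling `report("mwpbench_test.jsonl")` on the downloaded file would list the per-subset counts, which makes it easy to see whether a whole subset is absent rather than a few scattered entries.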

Request:
Could you please look into this issue and provide an updated dataset that includes all the entries mentioned in the paper? This would ensure fair and accurate evaluation of models.

Thank you for your attention to this matter.
