Issue with the easy difficulty test set in the multi-author-writing-style-analysis-2024

My program runs normally on all other datasets, but it fails on the easy difficulty test set.
The exception is:
"FileNotFoundError: [Errno 2] No such file or directory: '//builds/code-research/tira/multi-author-writing-style-analysis-2024/test-datasets/multi-author-writing-style-analysis-2024/pan24-multi-author-analysis-test-20240426-test//easy/problem-3.txt'"
Could there be a problem with the easy difficulty test set?

Hi,

may I ask @maik_froebe to look into this? Maybe the config is wrong (the path doesn't look correctly formatted to me). Thank you.

Hi,

I had a look and modified the baseline: pan-code/clef24/multi-author-analysis/baselines at master · pan-webis-de/pan-code · GitHub

This baseline now runs through.

I see that the numbering of the problem files in the test set has gaps. For example, the test set contains easy/problem-1.txt and easy/problem-2.txt, then skips 3 to 9 and continues with easy/problem-10.txt.

This explains the error above, as the software tries to access the file easy/problem-3.txt that does not exist.
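On the software side, one way to avoid this error is to iterate over the files that are actually present instead of counting up from 1. The following is only a minimal sketch, not the code of the modified baseline: the function name `list_problem_files` is hypothetical, and it assumes the inputs follow the `problem-<number>.txt` naming seen in the error message.

```python
import re
from pathlib import Path

# Matches names such as "problem-10.txt" and captures the number.
# Assumption: all input files follow this naming pattern.
PROBLEM_FILE = re.compile(r"problem-(\d+)\.txt")


def list_problem_files(directory):
    """Return the problem files that exist in `directory`, sorted by number."""
    numbered = []
    for path in Path(directory).glob("problem-*.txt"):
        match = PROBLEM_FILE.fullmatch(path.name)
        if match:
            numbered.append((int(match.group(1)), path))
    # Sort numerically so that problem-10 comes after problem-2.
    numbered.sort(key=lambda item: item[0])
    return [path for _, path in numbered]
```

Looping over `list_problem_files(input_dir / "easy")` then never touches a file such as problem-3.txt if it is not on disk, so gaps in the numbering no longer raise a FileNotFoundError.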

Does this feedback help to resolve the problem? Alternatively, we could change this on the dataset side.

Best regards,

Maik

Dear all,

In a private chat with @evazangerle, we found out that the easy difficulty test set was incomplete: some files were missing. That is, problem-3.txt was missing by accident.
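For anyone who wants to check a local copy of a dataset for such gaps, here is a minimal sketch. It is not the check that was run on TIRA; the function name `missing_problem_numbers` is hypothetical, and it assumes the `problem-<number>.txt` naming and that the numbering should be consecutive between the lowest and highest file present.

```python
import re
from pathlib import Path


def missing_problem_numbers(directory):
    """Return the numbers absent between the lowest and highest problem file."""
    present = set()
    for path in Path(directory).glob("problem-*.txt"):
        # Assumption: files are named problem-<number>.txt.
        match = re.fullmatch(r"problem-(\d+)\.txt", path.name)
        if match:
            present.add(int(match.group(1)))
    if not present:
        return []
    return [n for n in range(min(present), max(present) + 1) if n not in present]
```

For a directory that contains only problem-1.txt, problem-2.txt and problem-10.txt, this returns the numbers 3 to 9. Note that it cannot detect files missing after the highest number that is present.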

We have now uploaded a new, complete version of the test dataset. We also added the training dataset and a small spot-check dataset (containing the first 10 problem statements of the training dataset) as public datasets, so that submissions can be run on some more datasets with unblinded results.

Sorry for the inconvenience. The new test dataset has now been checked multiple times :slight_smile:

Best regards,

Maik