Will the results of running our scripts on the test dataset automatically be submitted to the leaderboard? Or only scores that are greater than the baseline score?
Also, will all our scores be submitted, or only the best score?
Thanks for the information.
Thank you for your message!
We will add all valid runs (i.e., those that do not fail) to the leaderboard, regardless of their score.
This also means that you can have multiple runs on the leaderboard (we will show all runs per group, i.e., not only the best run per group).
You can make as many submissions as you like.
We have not yet decided internally when we will reveal the full leaderboard (this is something that I have to discuss with my PhD supervisors). Personally, I really like the approach of the TREC conference, where the full leaderboard is revealed at the conference (because this ensures that the notebooks are more focused on insights). I will discuss this internally and report back with the decision.
Regardless of when the complete leaderboard is revealed, you will receive the scores of your runs on the test set (plus additional data such as the ground truth) as soon as all your submissions are finalized (we contact each team individually to ensure that there are no pending submissions; an extension of a few days is possible on an individual basis). In other words, you will get all the information that you need for your notebook paper.
Does this answer your question?
Yes, this answers my question. Thanks for the info.