PAN26 submissions runs results \ paper ablation timing

Dear all,

Two questions about the submissions workflow and the paper.

1. Access to my own per-run scores. In the TIRA submissions tab I can
download the outputs of each of my runs, which lets me reconstruct my own
per-submission scores on the ELOQUENT and PAN base test sets — even though
the public leaderboard isn’t out yet. Is this intended? And am I allowed to
cite these (my own runs’ numbers) in the participant notebook before the
official results are published?

2. Retrospective ablations vs. the paper deadline. I understand
ablations are ideally done during R&D, but with the tight timeline I’m only
now in a position to run them retrospectively to tell a stronger “what made
it work” story. The problem: on a 14B backbone each run is ~17h on our A10G,
and a meaningful ablation set is ~20-40h wall-clock — which doesn’t fit the
~2-day gap before the notebook deadline.

So a small suggestion for future PAN editions: a slightly larger buffer
between the software deadline, the public results release, and the paper
deadline. It would especially help stronger/winning approaches do proper
retrospective ablations — purely in the interest of scientific rigor.

Thanks!

Thanks for reaching out!

For your first point:
Yes, it is intended as you describe it.

For your second point:
We use the submitted software only for executions on the test set. We should not run ablations on the test set.

There might be a misunderstanding about what the notebook paper that we collect now is. It is really intended as a “notebook”. The aspects you describe are all important and make sense, but they would usually not make it into the notebook paper, which sits at the beginning of the “academic lifecycle”. Evaluations are not even required in it, and neither is an ablation study. The only goal of the notebook paper is that others can understand the idea of your approach.

As you said, the other aspects need more time. For that reason, if an approach worked well, one would usually extend the work described in the notebook paper into a conference paper submission (e.g., at ECIR or a subsequent CLEF). This is the “intended” lifecycle: the notebook paper describes the rough idea, and if the approach worked (and one is still interested in continuing to work on it), one can extend the notebook paper with ablations and further evaluations into a conference paper. There is plenty of time for this, as the next suitable deadlines are likely in September/October (e.g., for ECIR 2027).

If you do not want to continue working on it (sometimes interests shift, or one does not have the time anymore), you could include additional evaluations and/or ablations in the camera-ready version of your notebook paper (the deadline is in a bit more than a month).

Best regards,

Maik

Most complete! Thank you Maik.
– Yurii
