Hello,
I’d like to submit results to be evaluated for the trigger detection task. If this is not possible, could we gather the labels for the test partition?
Thank you,
-John
Dear John,
Thanks for reaching out!
The trigger detection datasets (including train, validation, and test) are available for download on Zenodo: PAN23 Trigger Detection
Best regards,
Maik
I’m having issues reconciling the “labels” in the works files with those in the labels files.
From the third line of the validation works file (pan23-trigger-detection-validation/works.jsonl):
{
  "work_id": "23800645",
  "labels": [1, 0, 0, 1, …]
}
And from the corresponding line in pan23-trigger-detection-validation/labels.jsonl:
{
  "work_id": "23800645",
  "labels": ["pornographic-content", "sexual-assault"]
}
According to the README and paper, the labels are ordered by their count:
- pornographic-content: graphic display of sex, play, toys, technique descriptions
- sexual-assault: physical and sexual violence, rape, sexual abuse, non-consensual actions
With pornographic-content and sexual-assault as the first two labels, I would expect the label vector to start [1, 1, 0, …] rather than [1, 0, 0, 1, …].
Which file should I trust? Unfortunately, the test partition doesn’t have the labels alongside the works, so I can’t cross-check there.
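A minimal sketch of the check I am doing, in case I am misreading something. The label order here is abbreviated to four entries and assumed from the README’s count-based ordering; positions 2 and 3 are my own placeholders, not taken from the README:

```python
# Assumed, abbreviated label order: the README puts 'pornographic-content'
# and 'sexual-assault' first; the last two entries are placeholders.
README_ORDER = ["pornographic-content", "sexual-assault", "violence", "death"]

def names_from_vector(vector, order=README_ORDER):
    """Return the label names whose positions in `vector` are set to 1."""
    return [name for name, bit in zip(order, vector) if bit == 1]

# The two records quoted above, truncated to four positions.
works_labels = [1, 0, 0, 1]                                   # from works.jsonl
labels_file_labels = ["pornographic-content", "sexual-assault"]  # from labels.jsonl

decoded = names_from_vector(works_labels)
print(decoded)                                        # -> ['pornographic-content', 'death']
print(sorted(decoded) == sorted(labels_file_labels))  # -> False: the two files disagree
```

Under that assumed ordering the two files disagree, which is what prompted my question.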
Sincere thanks for your time!
Using mutual information between the binary columns and the string labels, I get the following label order; it looks like it lines up with the “labels” arrays in the works files:
['pornographic-content', 'violence', 'death', 'sexual-assault', 'abuse',
 'blood', 'suicide', 'pregnancy', 'child-abuse', 'incest', 'underage',
 'homophobia', 'self-harm', 'dying', 'kidnapping', 'mental-illness',
 'dissection', 'eating-disorders', 'abduction', 'body-hatred', 'childbirth',
 'racism', 'sexism', 'miscarriages', 'transphobia', 'abortion', 'fat-phobia',
 'animal-death', 'ableism', 'classism', 'misogyny', 'animal-cruelty']
Each index in the array representation corresponds to a fixed label, so index 0 is always ‘pornographic-content’, and so on.
The task’s code includes functions to convert the string representation
["pornographic-content", "sexual-assault"]
to the array representation
[1, 0, 0, 1, …]
See: pan-code/clef23/trigger-detection/evaluation/util.py at master · pan-webis-de/pan-code · GitHub
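For anyone who wants to sidestep the repository code, the conversion itself is small. A sketch, using only the first four entries of the fixed label order above for brevity (the function names are mine, not necessarily those in util.py):

```python
# Fixed label order, abbreviated for illustration; the full list has
# 32 entries, starting with 'pornographic-content'.
LABEL_ORDER = ["pornographic-content", "violence", "death", "sexual-assault"]
INDEX = {name: i for i, name in enumerate(LABEL_ORDER)}

def to_array(names):
    """String representation -> fixed-length binary array."""
    vec = [0] * len(LABEL_ORDER)
    for name in names:
        vec[INDEX[name]] = 1
    return vec

def to_names(vec):
    """Binary array -> string representation, in fixed-index order."""
    return [LABEL_ORDER[i] for i, bit in enumerate(vec) if bit]

print(to_array(["pornographic-content", "sexual-assault"]))  # -> [1, 0, 0, 1]
print(to_names([1, 0, 0, 1]))  # -> ['pornographic-content', 'sexual-assault']
```

Note that the two functions are inverses as long as the fixed order is used consistently.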