Input Data for Task 2

thisisausername · December 2, 2022, 2:04pm

I have a clarifying question:

Does the input data for Task 2 contain the “tags” field or not? For example, if I want to use the spoiler type (phrase, passage, multi) in my solution for task 2, do I need to implement a solution to task 1?

It would make sense for the tags to be provided for task 2, to keep the two tasks separate, but the task website reads “This field is only available in the training and validation dataset (not during test)”, which suggests otherwise.

Thanks in advance!
Simon

maik_froebe · December 2, 2022, 4:39pm

Dear Simon,

Sorry for the ambiguity.
The tags field is missing only for task 1. For task 2, it is available and can be used.

I have updated the documentation accordingly.

Best Regards,

Maik

thisisausername · December 20, 2022, 1:50pm

Dear Maik,

can you check whether the “tags” field is available in the “task-2-spoiler-generation-validation” dataset?

The data I load from the input.jsonl does not seem to contain this field.

Data before processing:
DatasetDict({
    validation: Dataset({
        features: ['uuid', 'postId', 'postText', 'postPlatform', 'targetParagraphs', 'targetTitle', 'targetDescription', 'targetKeywords', 'targetMedia', 'targetUrl'],
        num_rows: 800
    })
})
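
For anyone who wants to reproduce this check without the datasets library, here is a minimal sketch. It assumes input.jsonl is a JSON Lines file with one JSON object per line; the helper name count_missing_field is hypothetical and not part of the task tooling.

```python
import json


def count_missing_field(jsonl_path, field="tags"):
    """Return (total_records, records_missing_field) for a JSON Lines file."""
    total = 0
    missing = 0
    with open(jsonl_path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                # Skip blank lines, e.g. a trailing newline at the end of the file.
                continue
            record = json.loads(line)
            total += 1
            if field not in record:
                missing += 1
    return total, missing
```

Calling count_missing_field("input.jsonl") returns the number of records and how many of them lack the field; if the second number equals the first, the field is absent from the whole file, as in the feature list above.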

maik_froebe · December 20, 2022, 2:07pm

Dear Simon,

Thank you for notifying us!
The field was indeed missing from the validation dataset by mistake.
I have now fixed this.

The tags field is now available in the task-2-spoiler-generation-validation dataset (it was already correct for the task-2-spoiler-generation-test).

Can you please test again?
It should work now.

Thanks and best regards,

Maik

maik_froebe · December 22, 2022, 1:29pm

Cool, I saw that your submission has now gone through successfully.
Nice, thanks for trying again!

Best regards,

Maik