Pre-Registration

maik_froebe · October 27, 2023, 5:10am

We want to promote new collaborations among potential participants of the Workshop on Open Web Search. To this end, we maintain a public, non-binding pre-registration list below. It collects ideas for components together with the potential participants who have expressed interest in working on them, so that everyone can coordinate and avoid overlapping work. If you have additional ideas for components that you would like to see in the list, or if you want to express interest in working on a component, please write us a short message or post in the forum below. We will incorporate all suggestions into the list and clean up responses to this forum once they have been included, to keep the forum concise. We hope the pre-registration also provides collaboration ideas for the ECIR’24 Collab-a-thon.

maik_froebe · October 27, 2023, 5:11am

Pre-Registration for Document Processors

Document Processor	Description	Expressed Interest to Work on This
Corpus Graph	Calculate similarity scores between documents in the corpus to improve recall.	Sean MacAvaney et al.
DeepCT Document Reduction	Calculate the importance of terms in their context to remove unimportant terms.	Sean MacAvaney et al.
Document Expansion with DocT5Query{--}		Sean MacAvaney et al.
Genre Classification	Detect the aim of a web page, e.g., is the goal of a page to educate visitors or to sell something?	Sebastian Schmidt et al.
Keyphrase Extraction	Reduce the length of a document drastically by extracting only the top keyphrases.	Already Implemented as Baseline
Medical Document Classification	Binary classification to detect medical content.	Ferdinand Schlatt et al.
SEO detection from ad blockers	Use XPath and Regex rules of ad blockers like ABP to extract some SEO features for ranking/re-ranking or removal of ads from a web page
Spam Classification	Is a given web page a spam page that aims to deliver unwanted payload?	Mohammed Al-Maamari et al.
TextStat Document Analysis	Calculate readability scores etc. with the textstat library.
Comparative Stance Detection	Given a comparative query with two comparison objects and a sentence, identify whether the sentence takes a stance for either of the two objects or takes no stance.	Alexander Bondarenko et al.
…
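
To make the idea of a document processor more concrete, here is a minimal sketch of a document-reduction step in the spirit of the keyphrase extraction entry above. It is not the existing baseline: it swaps in plain term-frequency scoring over single terms, and the function name `top_keyterms` and the tiny stopword list are assumptions made only for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from any baseline).
STOPWORDS = {
    "a", "an", "and", "the", "of", "to", "in", "is", "for",
    "on", "with", "that", "it", "as", "are", "was", "by",
}


def top_keyterms(text: str, k: int = 5) -> list[str]:
    """Return the k most frequent non-stopword terms of a document."""
    # Lowercase the text and keep only alphanumeric tokens.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


if __name__ == "__main__":
    doc = (
        "The Hubble telescope is a space telescope. The telescope was "
        "launched in 1990 and the Hubble mission is ongoing."
    )
    print(top_keyterms(doc, k=2))
```

A real component would likely score multi-word phrases and use corpus statistics, but the shape stays the same: a document goes in, and a much shorter representation comes out.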

Pre-Registration for Query Processors

Query Processor	Description	Expressed Interest to Work on This
Archive Query Log Expansion	Given a query, get k highly similar queries with SERPs/linked documents from the Archive Query Log	Heinrich Reimer et al.
Detection of Comparative Queries	Given a query, classify if it is comparative (e.g., `should I buy a PlayStation or an XBox?`) or not	Alexander Bondarenko et al.
Entity Linking (precision-oriented)	Given a query like `source of the nile`, link to the most prominent entities, e.g., the Nile as a river.
Entity Linking (recall-oriented)	Given a query like `source of the nile`, link to all possible interpretations, e.g., the board game and the river.	Marcel Gohsen et al.
Long Query Reduction	Given a long/verbose query, remove unimportant terms.
Spelling Correction	Correct the spelling of queries (maybe especially interesting for dense retrievers, e.g., T5 tokenizes `obama` into 4 tokens, but `Obama` into a single token)	Gustav Lahmann et al.
RM3 Expansion on the AQL	Use the top-ranked documents from the Archive Query Log as relevance feedback to RM3/Bo1/etc.	Justin Löscher et al.
Query Intent Prediction	Is a query informational, navigational, or transactional?	Daria Alexander et al.
Query Performance Prediction	Given a list of queries, order them by their predicted effectiveness.
Query Segmentation	Detect query terms that belong together, e.g., segment the query `hubble telescope achievements` into `['hubble telescope', 'achievements']`	Already Implemented as Baseline
Query Variants with ChatGPT
…
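
As an illustration of what a query processor could look like, the following is a minimal sketch of the query segmentation example from the table. It is not the existing baseline: it assumes a hypothetical set of known phrases (`known_phrases`) and greedily matches the longest known phrase at each position, falling back to single terms.

```python
def segment_query(
    query: str, known_phrases: set[str], max_len: int = 3
) -> list[str]:
    """Greedily split a query into known phrases and single terms."""
    terms = query.lower().split()
    segments = []
    i = 0
    while i < len(terms):
        # Try the longest candidate first; a single term always matches.
        for n in range(min(max_len, len(terms) - i), 0, -1):
            candidate = " ".join(terms[i:i + n])
            if n == 1 or candidate in known_phrases:
                segments.append(candidate)
                i += n
                break
    return segments


if __name__ == "__main__":
    # Hypothetical phrase dictionary, for illustration only.
    phrases = {"hubble telescope"}
    print(segment_query("hubble telescope achievements", phrases))
```

With the hypothetical dictionary above, the query `hubble telescope achievements` is split into `['hubble telescope', 'achievements']`, matching the example in the table. Where the phrase dictionary comes from (e.g., n-gram statistics or titles from a knowledge base) is left open here.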

Pre-Registration for Full-Rank/Re-Rank Approaches (Query-Document Processors)

Approach	Description	Expressed Interest to Work on This
Axiomatic re-ranking		Heinrich Reimer et al.
Query-dependent document summarization
Splade		Thibault Formal et al.
…