Project TechLord: Doxxing and Malicious Traffic

An interesting new task idea: detect doxxing and, more broadly, pages whose sole purpose is to drive traffic to malicious content, e.g. harassment. I was surprised we didn't have a similar task already. Thoughts?


Indeed, this sounds interesting to me.
I am not aware of such a task, but it would definitely be very valuable in the open web search context.

Last year, Mohammed Al-Maamari (cc ) and Michael Dinzinger (cc @michaeld) organized a task on web spam classification, which is somewhat related:

Building the dataset might be the main challenge, don't you think?

We might use the Instagram dataset mentioned in "Identification of cyber harassment and intention of target users on social media platforms" by S. Abarna, J.I. Sheeba, S. Jayasrilakshmi, and S. Pradeep Devaneyan.

cyberbullying (30%), trolling (25%), sexual harassment (23%), doxing (19%), cyberstalking (16%), online impersonation (15%), revenge porn (10%), cyber defamation (8%), hacking (8%), and message bombing (6%)

Interestingly enough, the cyberharasser we have here checks most of these category boxes. My hypothesis is that serial cyberharassers tend to use every manner and method available (except those requiring a physical-world connection), which might make it possible to identify them. Generally, though, unmasking and pursuing them would require legal action, either by law enforcement or through lawsuits.
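To make the hypothesis a bit more concrete, here is a rough sketch of how one could test it on a labeled dataset: count how many distinct harassment categories each account appears in, and flag accounts above a threshold. Everything here is an assumption for illustration: the `(account, category)` input format, the function names, the toy account IDs, and the threshold of 4 are all made up, and the category labels are simply the ones from the list above.

```python
from collections import defaultdict

# Category labels taken from the list quoted above.
CATEGORIES = {
    "cyberbullying", "trolling", "sexual harassment", "doxing",
    "cyberstalking", "online impersonation", "revenge porn",
    "cyber defamation", "hacking", "message bombing",
}


def category_coverage(labeled_posts):
    """Map each account to the set of distinct categories it was labeled with.

    labeled_posts: iterable of (account, category) pairs (hypothetical format).
    Labels outside CATEGORIES are ignored.
    """
    seen = defaultdict(set)
    for account, category in labeled_posts:
        if category in CATEGORIES:
            seen[account].add(category)
    return dict(seen)


def flag_serial(labeled_posts, min_categories=4):
    """Return accounts seen in at least min_categories distinct categories.

    The threshold is an arbitrary illustrative choice, not a validated value.
    """
    coverage = category_coverage(labeled_posts)
    return sorted(a for a, cats in coverage.items() if len(cats) >= min_categories)


# Toy, made-up labels purely to show the shape of the input.
posts = [
    ("acct_a", "doxing"),
    ("acct_a", "trolling"),
    ("acct_a", "cyberstalking"),
    ("acct_a", "online impersonation"),
    ("acct_a", "doxing"),  # repeated category, counted once
    ("acct_b", "trolling"),
    ("acct_b", "trolling"),
]

flagged = flag_serial(posts)
print(flagged)
```

A flag like this would only be a signal for further review, not an identification; as noted above, actually unmasking anyone is a matter for legal action.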

So the Chelmis and Yao dataset should suffice.