Detecting antisocial behavior in text

The words we use and our writing styles can reveal information about our preferences, thoughts, emotions and intentions. Using this information, I developed machine learning models that can detect antisocial behaviors, such as hate speech and indications of violence, from texts, as part of my recently defended PhD dissertation, titled “Leveraging emotion and word based features for antisocial behavior detection in user-generated content.”

Historically, most attempts to address antisocial behavior have been made from educational, social and psychological points of view. My PhD research, however, demonstrated the potential of using natural language processing techniques to develop state-of-the-art solutions to detect antisocial behavior in written communication.

The research created solutions that can be integrated into web forums or social media websites to automatically or semi-automatically detect potential incidents of antisocial behavior with high accuracy, allowing fast and reliable warnings and interventions before possible acts of violence are committed.

One of the great challenges in detecting antisocial behavior is first defining what precisely counts as antisocial behavior and then determining how to detect such phenomena. Thus, using an exploratory and interdisciplinary approach, I applied natural language processing techniques to identify, extract, and utilize the linguistic features, including emotional features, pertaining to antisocial behavior.

The research investigated emotions and their role or presence in antisocial behavior. Literature in the fields of psychology and cognitive science shows that emotions have a direct or indirect role in instigating antisocial behavior. To analyze emotions in written language, the research therefore created a novel emotion analysis resource. This resource further contributes to sub-fields of natural language processing, such as emotion and sentiment analysis.
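As a rough illustration of what combining word-based and emotion-based features can look like, here is a minimal Python sketch. The tiny lexicon, the function names, and the feature format are all assumptions made for this example; they are not the emotion resource or the models from the dissertation.

```python
import re
from collections import Counter

# Hypothetical toy lexicon mapping words to emotion labels. This is an
# assumption for illustration only, not the resource built in the dissertation.
EMOTION_LEXICON = {
    "hate": "anger",
    "furious": "anger",
    "afraid": "fear",
    "scared": "fear",
    "happy": "joy",
    "glad": "joy",
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def extract_features(text):
    """Return word counts and emotion counts in a single feature dictionary."""
    features = Counter()
    for token in tokenize(text):
        # Word-based feature: how often each token occurs.
        features["word=" + token] += 1
        # Emotion-based feature: how often each lexicon emotion is evoked.
        emotion = EMOTION_LEXICON.get(token)
        if emotion is not None:
            features["emotion=" + emotion] += 1
    return dict(features)


features = extract_features("I hate this, I am furious and scared.")
print(features["emotion=anger"], features["emotion=fear"])
```

A feature dictionary of this kind could then be passed to a standard classifier; a real system would need a far larger lexicon and labeled training data.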

Research on antisocial behavior in written language had also been hampered by the lack of an adequate collection of texts, so the research additionally created a novel corpus of antisocial behavior texts. The corpus allows, and will continue to allow, deeper insight into how antisocial behavior is expressed in written language.

The study showed that natural language processing techniques can help detect antisocial behavior, which is a step towards its prevention in society. With continued research on the relationships between natural language and societal concerns and with a multidisciplinary effort in building automated means to assess the probability of harmful behavior, much progress can be made.

The doctoral dissertation is available for download at: http://epublications.uef.fi/pub/urn_isbn_978-952-61-2464-3/index_en.html

In the press:

Hilary Lamb (13 April 2017), Computers taught to recognise hate speech and violent language, Engineering and Technology, https://eandt.theiet.org/content/articles/2017/04/computers-taught-to-recognise-hate-speech-and-violent-language/

University of Eastern Finland (12 April 2017), New machine learning models can detect hate speech, violence from texts, ScienceDaily, www.sciencedaily.com/releases/2017/04/170412091222.htm

Terhi Nevalainen (11 April 2017), Tietokone voi tunnistaa terroristin [A computer can identify a terrorist], Karjalainen, http://www.karjalainen.fi/uutiset/uutis-alueet/kotimaa/item/138639-tietokone-voi-tunnistaa-terroristin


Also published on Medium.