Detecting Indirect Forms of Offensive Language using Commonsense Reasoning and Conceptual Modeling of Social Stereotypes
Offensive user-generated comments in online social communities are a widespread problem, due in part to the arguably impersonal nature of the internet, which often provides a stage for many kinds of harassment, such as trolling/flaming, hate speech, and cyberbullying. The problem is further complicated in the corporate/enterprise work environment, where the growing adoption of online social media and social networks (e.g., Facebook, Twitter, Yammer) as viable everyday business tools now allows employees to engage with customers in new and much more visible ways. In these environments, offensive language and slander are not merely a matter of personal offense; they are a source of professional and legal liability for companies, which are increasingly held responsible for the content their employees post online while at work. Companies are therefore seeking better means of filtering and scrutinizing employees' outgoing social media posts, permitting only those that do not contain offensive language. Offensive language takes many forms and is a highly complex, multifaceted problem to tackle, as it often draws on sociological, psychological, and linguistic factors. Some forms are very explicit and easily detectable, while others rely on more implicit, metaphorical uses of language. This thesis focuses on the implied, non-explicit forms of offensive language, specifically the allusion to social stereotypes. This contrasts with previous research, which has so far primarily addressed the detection of explicit, surface forms of offensive language.