Critical discourse analysis (CDA) aims to understand the link "between language and the social" (Mautner and Baker, 2009) and attempts to demystify social construction and power relations (Gramsci, 1999). Corpus linguistics, by contrast, deals with the principles and practice of studying language as it is produced in large bodies of textual data (Oostdijk, 1991). In my thesis, I have aimed to combine the CDA approach with corpus linguistics by means of machine learning, with the intention of deconstructing dominant discourses that create, maintain and deepen fault lines between social groups and classes. As an instance of this technological framework, I have developed a tool for understanding and defining the discourse on Islam in global mainstream media sources. My hypothesis is that coverage in several mainstream news sources disproportionately contextualizes Muslims as a group embroiled in conflict. This hypothesis rests on the assumption that discourse on Islam in mainstream global media tends to lean toward the dangerous "clash of civilizations" frame. To test it, I have developed a prototype tool, the "Said-Huntington Discourse Analyzer", that automatically classifies news articles on a normative scale, one that measures the "clash of civilizations" polarization of an article on the basis of conflict. The tool also extracts semantically meaningful conversations for a media source using Latent Dirichlet Allocation (LDA) topic modeling, allowing users to discover frames of conversation on the basis of their Said-Huntington index classification. I evaluated the classifier on human-classified articles and found its accuracy to be 99.03%. Generally, text analysis tools uncover patterns and trends in the data without delineating the 'ideology' that permeates the text.
The machine learning tool presented here classifies media discourse on Islam in terms of conflict and non-conflict, and attempts to shed light on the 'ideology' that permeates the text. In addition, the tool provides textual analysis of news articles based on CDA methodologies.
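
As a rough illustration of the conflict / non-conflict classification and of its evaluation against human labels, the sketch below trains a small multinomial Naive Bayes model on a handful of invented headlines and reports its accuracy on held-out examples. It is not the Said-Huntington Discourse Analyzer itself: the choice of Naive Bayes, the toy headlines, and the names `NaiveBayes`, `tokenize` and `accuracy` are all assumptions made for this example, and the tiny data set says nothing about the accuracy reported above.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


class NaiveBayes:
    """Hypothetical multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = {label: Counter() for label in set(labels)}
        self.doc_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, float("-inf")
        for label in sorted(self.word_counts):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each known word.
            score = math.log(self.doc_counts[label] / total_docs)
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, texts, labels):
    """Fraction of articles whose predicted label matches the human label."""
    correct = sum(model.predict(t) == y for t, y in zip(texts, labels))
    return correct / len(labels)


# Invented, human-labelled toy headlines (illustrative only).
train_texts = [
    "militants attack village in border clash",
    "bombing kills dozens amid sectarian violence",
    "troops battle insurgents after deadly raid",
    "community celebrates festival with charity dinner",
    "museum opens exhibition of calligraphy and art",
    "students win science prize at university",
]
train_labels = ["conflict"] * 3 + ["non-conflict"] * 3

test_texts = [
    "deadly attack follows border violence",
    "university hosts art festival",
]
test_labels = ["conflict", "non-conflict"]

model = NaiveBayes().fit(train_texts, train_labels)
print(accuracy(model, test_texts, test_labels))
```

In practice the held-out set would be the human-classified articles, kept separate from the training data so that the reported accuracy reflects articles the classifier has not seen.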

