Automated Discovery and Analysis of Social Networks from Threaded Discussions
Date
2008
Authors
Gruzd, Anatoliy
Haythornthwaite, Caroline
Abstract
To gain greater insight into the operation of online social networks, we applied Natural Language
Processing (NLP) techniques to text-based communication to identify and describe underlying social
structures in online communities. This paper presents our approach and preliminary evaluation for
content-based, automated discovery of social networks. Our research question is: What syntactic and
semantic features of postings in threaded discussions help uncover explicit and implicit ties between
network members, and which provide a reliable estimate of the strengths of interpersonal ties among the
network members? To evaluate our automated procedures, we compare the results from the NLP
processes with social networks built from basic who-to-whom data, and a sample of hand-coded data
derived from a close reading of the text.
For our test case, and as part of ongoing research on networked learning, we used the archive of threaded
discussions collected over eight iterations of an online graduate class. We first associate personal names
and nicknames mentioned in the postings with class participants. Next, we analyze the context in which
each name occurs to determine whether there is an interpersonal tie between the
sender of the posting and the person mentioned in it. Because information exchange is a key factor in the
operation and success of a learning community, we estimate and assign weights to the ties by measuring
the amount of information exchanged between each pair of nodes; information in this case is
operationalized as counts of important concept terms in the postings as derived through the NLP analysis.
Finally, we compare the resulting network(s) against those derived by other means, including basic
who-to-whom data derived from posting sequences (e.g., whose postings follow whose). In this
comparison we evaluate what is gained in understanding network processes by our more elaborate
analysis.
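
The steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the sample postings, the alias table, the concept-term list, and the function names are all hypothetical, and two simplifying assumptions stand in for the NLP analysis. First, any mention of a participant's name or nickname is treated as a tie, with no analysis of the surrounding context. Second, "information" is the count of terms from a hand-supplied concept list rather than terms derived automatically.

```python
import re
from collections import Counter

def name_network(posts, aliases, concept_terms):
    """Directed ties sender -> mentioned person, weighted by concept-term counts.

    posts: list of (sender, text) in posting order (hypothetical format).
    aliases: lowercase name or nickname -> participant id (assumed given).
    concept_terms: set of lowercase terms treated as "information" (assumed given).
    """
    weights = Counter()
    for sender, text in posts:
        tokens = re.findall(r"[a-z']+", text.lower())
        info = sum(1 for t in tokens if t in concept_terms)
        mentioned = {aliases[t] for t in tokens if t in aliases}
        for person in mentioned:
            if person != sender:  # ignore self-mentions
                weights[(sender, person)] += info
    return weights

def chain_network(posts):
    """Basic who-to-whom ties: each poster is linked to the previous poster."""
    weights = Counter()
    for (prev_sender, _), (sender, _) in zip(posts, posts[1:]):
        if sender != prev_sender:
            weights[(sender, prev_sender)] += 1
    return weights

# Invented sample thread, for illustration only.
posts = [
    ("ann", "I think metadata and indexing are related."),
    ("bob", "Ann, good point about metadata. Cataloging matters too."),
    ("cat", "I agree with Bobby on cataloging."),
    ("ann", "Bob, your indexing example helped."),
]
aliases = {"ann": "ann", "bob": "bob", "bobby": "bob", "cat": "cat"}
concept_terms = {"metadata", "indexing", "cataloging"}

name_net = name_network(posts, aliases, concept_terms)
chain_net = chain_network(posts)

# Ties found through name mentions that posting order alone misses.
only_in_names = set(name_net) - set(chain_net)
print(dict(name_net))
print(dict(chain_net))
print(only_in_names)
```

In this toy thread the last posting follows one by "cat" but addresses Bob by name, so the posting-sequence network links "ann" to "cat" while the name-based network links "ann" to "bob". Differences of this kind are what the comparison between the two networks is meant to surface.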
Keywords
social networks, named entity recognition, natural language processing, collaborative learning