Poster session 1

Quantitative Analysis of Books about How to Write Narrative Text: Extracting Characteristics of Screenwriting, Playwriting, and Fiction Writing

  • Ryoichi Takahashi (Tokyo Institute of Technology)
  • Hajime Murai (Tokyo Institute of Technology)
  • Takehiro Inohara (Tokyo Institute of Technology)

The purpose of this research is to build a basis for the scientific analysis of how to write narrative text. Utilizing quantitative analysis of books about how to write narrative text written by skillful professional writers, we attempt to investigate the similarities and differences between screenwriting, playwriting, and fiction writing.

Research in psychology and cognitive science explores the creative narrative writing process as a manifestation of creativity (Kaufman & Kaufman, 2009). However, research on the writing strategies of skillful professional writers has mostly taken the form of case studies; few studies have been sample-based (quantitative), because collecting large amounts of psychological data from such writers is difficult. Moreover, little is known about how writing strategies differ between narrative genres. According to expertise theory, an expert's knowledge is domain specific rather than domain general. Hence, it can be assumed that skillful writers' knowledge differs between narrative genres, yet most studies have not focused on this difference.

In this study, to understand writers' strategies and knowledge, we analyze books about how to write narrative text written by skillful professional authors. “Books about how to write narrative text” here refers to reference works on how to write a screenplay, playscript, or novel. These books do not directly reflect the writers' cognition during the actual production process. Nevertheless, they are the most suitable form of data for investigating writers' strategies and key ideas in narrative writing, since their content reflects the aspects of the narrative writing process to which the writers attach importance. Applying data mining techniques to these books, we enumerate their content.

We analyze thirteen screenwriting books, ten playwriting books, and ten fiction-writing books, all selected based on their place in's bestseller rankings (the top 100 in each category). All the books were written in English and published in the United States, since it can be assumed that American books about how to write narrative text are of high quality given the commercial success of American narrative content (for example, Hollywood movies, Broadway plays, and popular American fiction). We convert the books to plain text and then use TreeTagger (Schmid, 1994) to assign part-of-speech and lemma tags to the corpus.
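As a rough illustration of the tagging step, the sketch below reads tagger output in the tab-separated "token, part-of-speech tag, lemma" layout that TreeTagger prints by default and counts noun lemmas. The sample lines, the function names, and the choice to keep only the tags NN and NNS are assumptions made for this example; the abstract does not describe the authors' actual preprocessing.

```python
from collections import Counter

# Hypothetical tagger output: one token per line as "token<TAB>POS<TAB>lemma".
SAMPLE_TAGGED = "\n".join([
    "The\tDT\tthe",
    "characters\tNNS\tcharacter",
    "drive\tVVP\tdrive",
    "the\tDT\tthe",
    "story\tNN\tstory",
    ".\tSENT\t.",
])


def parse_tagged(text):
    """Return (token, pos, lemma) triples, skipping lines without three fields."""
    rows = []
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) == 3:
            rows.append(tuple(parts))
    return rows


def noun_lemma_counts(rows):
    """Count lemmas of common nouns (assumed here to be tags NN and NNS)."""
    return Counter(
        lemma.lower() for _, pos, lemma in rows if pos in {"NN", "NNS"}
    )


rows = parse_tagged(SAMPLE_TAGGED)
noun_counts = noun_lemma_counts(rows)
print(noun_counts.most_common())
```

Working with lemmas rather than surface forms means that "character" and "characters" are counted as the same concept.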

Analysis 1: Using the network centrality analysis functions of NetworkX (Hagberg, Swart, & Schult, 2008), we mathematically identify key nouns referring to concepts related to writing narrative text. As a result, six key concepts common to all genres are extracted: character, story, scene, time, action, and audience/reader.
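The abstract does not say how the noun network is built or which centrality measure is used. The following is a minimal sketch under two assumptions: nouns are linked when they occur in the same sentence, and importance is measured by degree centrality. It is written in plain Python so that it runs without dependencies; the normalization (degree divided by n - 1) matches that of NetworkX's `degree_centrality`. The input sentences are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical input: the noun lemmas found in each sentence.
sentences = [
    ["character", "story"],
    ["character", "scene", "action"],
    ["story", "scene"],
    ["character", "audience"],
]


def build_cooccurrence_graph(sentences):
    """Link every pair of distinct nouns that share a sentence."""
    adjacency = defaultdict(set)
    for nouns in sentences:
        for a, b in combinations(sorted(set(nouns)), 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def degree_centrality(adjacency):
    """Degree of each node divided by the number of other nodes."""
    n = len(adjacency)
    if n < 2:
        return {node: 0.0 for node in adjacency}
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}


graph = build_cooccurrence_graph(sentences)
centrality = degree_centrality(graph)
# Rank nouns from most to least central, breaking ties alphabetically.
ranked = sorted(centrality, key=lambda node: (-centrality[node], node))
print(ranked)
```

Other measures, such as betweenness or eigenvector centrality, could be substituted without changing how the graph is constructed.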

Analysis 2: Using dependency analysis with MaltParser (Nivre et al., 2007), we analyze verbs closely related to the six key concepts (nouns). The results show that the verbs that have a dependency relation to the key concepts differ by the narrative genre under discussion. This result suggests that screenwriters, playwrights, and novelists approach these key concepts from different points of view.
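One way to picture this step is sketched below, assuming the parser output is available in the tab-separated CoNLL-X layout (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, ...). The sample sentence, its dependency labels, and the helper name are hypothetical, and the sketch only covers the case where a verb is the direct head of a key noun. The authors' actual criteria for "closely related" verbs are not given in the abstract.

```python
from collections import Counter

KEY_NOUNS = {"character", "story", "scene", "time", "action", "audience", "reader"}

# Hypothetical parsed sentence in CoNLL-X layout (tags and labels are illustrative).
SAMPLE_CONLL = "\n".join([
    "1\tWriters\twriter\tN\tNNS\t_\t2\tSBJ\t_\t_",
    "2\tdevelop\tdevelop\tV\tVVP\t_\t0\tROOT\t_\t_",
    "3\tthe\tthe\tD\tDT\t_\t4\tNMOD\t_\t_",
    "4\tcharacter\tcharacter\tN\tNN\t_\t2\tOBJ\t_\t_",
    "5\t.\t.\tP\tSENT\t_\t2\tP\t_\t_",
])


def verbs_governing_key_nouns(conll_sentence, key_nouns):
    """Count (verb lemma, noun lemma) pairs where the verb heads a key noun."""
    tokens = {}
    for line in conll_sentence.splitlines():
        cols = line.split("\t")
        if len(cols) < 8:
            continue  # skip blank or malformed lines
        tokens[int(cols[0])] = {
            "lemma": cols[2].lower(),
            "pos": cols[4],
            "head": int(cols[6]),
        }
    pairs = Counter()
    for tok in tokens.values():
        head = tokens.get(tok["head"])  # None when the head is the root (0)
        if tok["lemma"] in key_nouns and head and head["pos"].startswith("V"):
            pairs[(head["lemma"], tok["lemma"])] += 1
    return pairs


verb_noun_pairs = verbs_governing_key_nouns(SAMPLE_CONLL, KEY_NOUNS)
print(verb_noun_pairs)
```

Aggregating such pairs separately for each genre would give per-genre verb profiles for every key concept, which can then be compared.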

Analysis 3: We manually build a sentiment lexicon based on Plutchik's (1980) model of emotional categories. Extracting words expressing sentiments from the corpus, we count word frequencies for each category and analyze them using a chi-squared test. We find that the frequency of sentiment words differs significantly by genre. This result suggests that the affective characteristics of screenwriting, playwriting, and fiction writing differ across these genres.
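The chi-squared test of independence on a genre-by-emotion frequency table can be sketched as follows. The counts are invented for illustration and are not the study's data, and only three of Plutchik's emotion categories are shown to keep the table small. The statistic is computed by hand here; a statistics library would also supply the p-value.

```python
# Hypothetical word counts (NOT the study's data): rows are genres,
# columns are three emotion categories (joy, fear, anger).
observed = {
    "screenwriting": [30, 10, 20],
    "playwriting": [20, 20, 20],
    "fiction": [10, 30, 20],
}


def chi_squared(table):
    """Return the chi-squared statistic and degrees of freedom for a table.

    Assumes every row and column total is nonzero.
    """
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for row, row_total in zip(rows, row_totals):
        for obs, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            statistic += (obs - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, dof


statistic, dof = chi_squared(observed)
print(statistic, dof)
```

With these toy counts every expected cell is 20, giving a statistic of 20 on 4 degrees of freedom, which exceeds the 5% critical value of about 9.49, so independence of genre and emotion category would be rejected for this invented table.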

Keywords: quantitative analysis, creative writing, screenplay, playscript, novel.


  1. Hagberg, A., Swart, P. and Schult, D.: “Exploring Network Structure, Dynamics, and Function using NetworkX,” Proceedings of the 7th Python in Science Conference, (2008).
  2. Kaufman, S. B. and Kaufman, J. C. (eds.): The Psychology of Creative Writing, Cambridge University Press (2009).
  3. Nivre, J., Hall, J., Nilsson, J., Chanev, A., Eryigit, G., Kubler, S., Marinov, S. and Marsi, E.: “MaltParser: A Language-Independent System for Data-Driven Dependency Parsing,” Natural Language Engineering, Vol. 13, pp. 95–135 (2007).
  4. Plutchik, R.: Emotion: A Psychoevolutionary Synthesis, New York: Harper and Row (1980).
  5. Schmid, H.: “Probabilistic Part-of-Speech Tagging using Decision Trees,” Proceedings of the International Conference on New Methods in Language Processing, (1994).