Computational Literary Analysis Toolkit

Computational Literary Analysis Toolkit

Completed | 2021 - 2023

Overview

An open-source toolkit for analyzing literary texts using natural language processing and machine learning.

About the Toolkit

The Computational Literary Analysis Toolkit (CLAT) is a Python library designed to make advanced text analysis accessible to humanities researchers without extensive programming backgrounds.

Features

  • Style Analysis: Measure and compare authorial style
  • Topic Modeling: Discover themes across large corpora
  • Sentiment Analysis: Track emotional arcs in narratives
  • Character Networks: Visualize character relationships
  • Comparative Analysis: Compare texts across multiple dimensions

Usage

from clat import TextAnalyzer

# Load and analyze a text
analyzer = TextAnalyzer("path/to/novel.txt")
style = analyzer.get_style_metrics()
topics = analyzer.topic_model(n_topics=10)
sentiment = analyzer.sentiment_over_time()

Impact

The toolkit has been downloaded over 5,000 times and is used by researchers at universities worldwide. It has been featured in workshops at major digital humanities conferences.

Future Development

While the initial project is complete, we continue to maintain the toolkit and welcome community contributions through our GitHub repository.

Collaborators

  • Research Assistant Team
  • Computer Science Department Collaborators