Text Annotation Application


The Text Annotation Toolkit (TAT) is a collection of independent components packaged together into one open-source software application. TAT was engineered to support the document classification process and workflow. It addresses three fundamental issues: tracking changes in a working corpus, saving the data used to train classifiers so that results are reproducible, and providing a mechanism for interacting with copyright-protected corpora. TAT is built on the robust Open IDE (Oracle, 2010) framework, which gives plugin developers access to standard, well-tested libraries and saves years of development time. The main goal of TAT is to minimize the labor-intensive process of creating labelled data that can be used to train, test, and deploy machine learning models for automated text annotation. Additionally, TAT gives researchers an easy way to automatically reproduce prior results. The toolkit can facilitate the annotation of text using different machine learning packages as well as corpora with different metadata specifications.

Download TAT (Mac/Linux/Win)