Annotation manual for the Penn Parsed Corpora of Historical English and the Parsed Corpus of Early English Correspondence 2

Beatrice Santorini
(January 2022)

This annotation manual revises the previous versions (2004, 2016). It is heavily indebted to the original document developed by Ann Taylor and Anthony Kroch for Middle English (Kroch and Taylor 2000), as well as to the spirit of the guidelines for the Penn Treebank (Marcus, Santorini, and Marcinkiewicz 1993).

The substance of the guidelines remains largely unchanged, but the annotation scheme has been streamlined (see Changes), and the differences between the annotation of Middle English and (Early) Modern English have been reduced. Certain differences remain, however, which are occasioned by the syntactic differences between Middle English and later stages of the language. Except where necessary, the examples in the body of the manual are from (Early) Modern English.

The present guidelines are in force for:

They will be in force for the next release of the Penn Parsed Corpora of Historical English (PPCHE), expected in 2024 or 2025. Follow the links below for information about the current (2016) release of the PPCHE.

The guidelines were developed for English, but they have also served as a foundation for annotation guidelines for parsed corpora of various other Germanic and Romance languages. In general, the present guidelines apply by default, except where they are overruled by language-particular considerations set out in a corpus-particular manual.

Suggestions for improvement may be sent to Beatrice Santorini (beatrice DOT santorini AT gmail DOT com).


Thanks are due to the following institutions and individuals for support and assistance:
