Annotation differences among the corpora


Adverbial NP (NP-ADV)

FACE TO FACE and the like have been assimilated to EACH IN HIS OWN WAY and the like, with NP brackets surrounding the initial word in both cases.

Ciphers

Text used as cipher is tagged as CIPHER in the PCEEC2. In the other corpora, such text is tagged with its ordinary part of speech (N, NUM).

Collective noun

In the PPCME2, collective nouns (FOLK, HORS, PEOPLE, etc.) are tagged as N. In early texts, before the generalization of plural -S, it can be quite difficult to distinguish reliably between singular and plural. For texts from the Middle English period M1, we have therefore tried to follow the translation that accompanies the text, or when this is lacking, a separate translation. For details, consult the information for the individual texts. See Singular, collective, or plural noun for further discussion.

In the later corpora, PEOPLE is tagged as singular (N) after an unambiguously singular determiner (A, THAT, THIS), and as plural (NS) otherwise.
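
The rule for PEOPLE in the later corpora can be sketched as a small function. This is only an illustration: the function name and the idea of passing in the lemma of a directly preceding determiner are assumptions, not part of any corpus tooling.

```python
# Hypothetical helper illustrating the later-corpora rule for PEOPLE.
# Assumption: the caller supplies the lemma of the directly preceding
# determiner (or None if there is none).
SINGULAR_DETERMINERS = {"A", "THAT", "THIS"}

def tag_people(preceding_determiner=None):
    """Return 'N' after an unambiguously singular determiner, else 'NS'."""
    if (preceding_determiner is not None
            and preceding_determiner.upper() in SINGULAR_DETERMINERS):
        return "N"
    return "NS"
```

Under these assumptions, tag_people("this") gives N, while tag_people("the") and tag_people() both give NS.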

Comparative or superlative quantifier (LESS, LEAST; MUCH, MORE, MOST)

In the PPCME2, LESS, LEAST and MUCH, MORE, MOST are generally tagged as adjectives (ADJ, ADJR, ADJS), rather than as quantifiers (Q, QR, QS), under the conditions described below. The distinction between the adjectival use and the pure quantifier use is not always easy to make in a consistent way and becomes more difficult over time. In the later corpora, these items are therefore uniformly tagged as quantifiers (Q, QR, QS). See Comparative adjective as head of ADJP and Superlative adjective as head of ADJP for further discussion.

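In the PPCME2, the items under discussion are tagged as adjectives in the contexts marked in the examples below: after a determiner (whether the head noun is overt or elided) and as the predicate of an ordinary clause or a small clause. Without a determiner, they are tagged as quantifiers.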

a/D michel/ADJ lust                                                    ← ADJ (with preceding determiner)

fram +te/D michel/ADJ conseil of +te vntrew

ouer +tat/D michele/ADJ water

his/PRO$ michele/ADJ wisdom

of mani mann +de is on michele/Q dwele/N on him seluen                 ← Q (no determiner)

and +te/D more/ADJR fysches/NS swolwen +te lesse/ADJR                  ← ADJ (elided head noun)

' and I shall ensure you ye shall have the/D more/ADJR                 ← ADJ precedes noun
worship than ever ye had . '

And hem thinketh +tat the/D more/ADJR peyne &
the/D more/ADJR tribulacioun +tat +tei suffren for loue of
here god , the/D  more/ADJR ioye/N +tei schull haue
in another world

for he shal ben michel/ADJ bifore gode                                 ← predicate of ordinary clause

Ne dowte we not how byleue may now be lesse/ADJR and now be more/ADJR

and maki+t his myracle more/ADJ                                        ← predicate of small clause

Concessive clause

ALL BE IT (THAT), ALBEIT

In the PPCME2, ALL BE IT (THAT) clauses, like SO BE IT (THAT) clauses, are treated similarly to V1 conditionals. ALL is tagged Q, surrounded by ADVP brackets, and treated as a daughter of CP-ADV. This is not intended as the correct analysis of the construction; it is simply meant to fit in with the annotation of V1 conditionals.

( (CP-ADV (ADVP (Q all))			                  ← ALL BE IT
          (IP-SUB (BEP be)
                  (NP-SBJ-1 (PRO it))
                  (CP-THT-1 (C that)
                            (IP-SUB (NP-SBJ (PRO it))
                                    (BEP was)
                                    (ADJP-PRD (FP but)
                                              (ADJ litel))))))
  (ID CMASTRO-M3,673.C1.362))

( (CP-ADV (ADVP (Q al))			                          ← ALL BE IT
          (IP-SUB (BED were)
                  (NP-SBJ-1 (PRO it))
                  (ADVP (ADV so))
                  (CP-THT-1 (C that)
                            (IP-SUB (NP-SBJ (PRO she))
                                    (ADVP-TMP (ADV right) (ADV now))
                                    (BED were)
                                    (ADJP-PRD (ADJ deed))))))
  (ID CMCTMELI-M3,217.C1b.19))

In the later corpora, ALBEIT (like HOWBEIT) is treated as a unitary adverb (when used absolutely) or as a unitary preposition (when introducing a subordinate clause).

(ADVP (ADV Al_be_it))

( (PP-LFD (P albeit)		                                  ← ALBEIT
	  (CP-ADV (C 0)
		  (IP-SUB (NP-SBJ (PRO he))
			  (BED was)
			  (ADJP-PRD (ADV sore)
				    (ADJ ennamored)
				    (PP (P vpon)
					(NP (PRO her)))))))
  (ID MORERIC-1513-E1-H,55.117))

HOW BE IT (THAT), HOWBEIT

In the PPCME2, HOW BE IT (THAT) clauses are treated as adverbial free relatives.

( (ADVP (CP-FRL (WADVP-1 (WADV how))		        ← HOW BE IT
		(C 0)
		(IP-SUB (ADVP *T*-1)
			(BEP be)
			(NP-SBJ-2 (PRO it))
			(CP-THT-2 (C 0)
				  (IP-SUB (NP-SBJ (PRO thou))
					  (HVP hast)
					  (ADVP-TMP (ADV often))
					  (ADVP-TMP (ADV before))
					  (PP (P in)
					      (NP (PRO$ thy) (ADJ yonge) (N age)
						  (CONJP (CONJ and)
							 (NX (ADJ myddell) (N age)))))
					  (VBN dyvydyd)
					  (NP-OB1 (PRO$ thy) (N lyfe)))))))
  (ID CMINNOCE-M4,11.189))

In the later corpora, HOWBEIT (like ALBEIT) is treated as a unitary adverb (when used absolutely) or as a unitary preposition (when introducing a subordinate clause).

(ADVP (ADV How_be_it))

( (PP (P Howbeit)					← HOWBEIT
      (CP-ADV (C 0)
	      (IP-SUB (NP-SBJ=2 (EX there))
		      (VBD came)
		      (NP-2 (OTHER other)
			    (NS boats))
		      (PP (P from)
			  (NP (NPR Tiberias))))))
  (ID AUTHNEW-1611-E2-H,6,20J.699))

SO BE IT (THAT)

In the PPCME2, SO BE IT (THAT) clauses, like ALL BE IT (THAT) clauses, are treated similarly to V1 conditionals. SO is tagged ADV, surrounded by ADVP brackets, and treated as a daughter of CP-ADV. Again, this is not intended as the correct analysis of the construction; it is simply meant to fit in with the annotation of V1 conditionals.

( (CP-ADV (ADVP (ADV so))				← SO BE IT
          (IP-SUB (BEP be)
                  (NP-SBJ-1 (PRO hit))
                  (CP-THT-1 (C that)
                            (IP-SUB (NP-SBJ (PRO thou))
                                    (BEP be)
                                    (NEG nat)
                                    (NP-PRD (NPR sir) (NPR Launcelot))))))
  (ID CMMALORY-M4,191.2805))

In the later corpora, SO BE IT is treated as a word-order variant of IT BE SO and is annotated accordingly.

( (PP-LFD (P IF)
	  (CP-ADV (C 0)
		  (IP-SUB (ADVP (ADV SO))		← SO BE IT
			  (BEP BE)
			  (NP-SBJ=1 (PRO IT))
			  (, ,)
			  (CP-THT-1 (C THAT)
				    (IP-SUB (PP (P IN)
						(NP (Q ANY) (N TRIANGLE)))
					    (, ,)
					    (NP-SBJ (D THE)
						    (N SQUARE)
						    (PP (P OF)
							(NP (D THE) (ONE ONE) (N SYDE))))
					    (BEP BE)
					    (ADJP-PRD (ADJ =L)
						      (PP (P TO)
							  (NP (D THE)
							      (NUM .IJ.)
							      (NS SQUARES)
							      (PP (P OF)
								  (NP (D THE)
								      (OTHER OTHER)
								      (NUM IJ.)
								      (NS SIDES)))))))))))
  (ID RECORD-1551-E1-H,2.E4V.275))

Disfluencies

Disfluencies are indicated only in the PPCMBE2; in the other corpora, they hardly ever occur.

DO

In Middle English, DO can be ambiguous between a causative (ECM) main verb and a periphrastic auxiliary. The default in the PPCME2 is to treat ambiguous cases as causative except when a causative reading is impossible. Causative DO dies out in the course of Middle English, and so instances of DO in the later corpora that could in principle be treated as ambiguous are instead uniformly treated as periphrastic.

Foreign text

In the PPCHE, sequences of foreign words (FW) are surrounded by brackets labeled LATIN or FOREIGN (any language other than Latin). In the PCEEC2, such sequences are uniformly labeled FOREIGN.

Left dislocation

In the PPCME2, PPs are annotated as left-dislocated only if the resumptive element is a matching PP (that is, if the prepositions in the left-dislocated and resumptive phrases are identical, and if the objects of the prepositions corefer); there are only a handful of examples.

( (IP-IMP (VBI Thynk)
	  (ALSO eek)
	  (CP-THT (C that)
		  (IP-SUB (PP-LFD (P of)
				  (NP (SUCH swich)
				      (N seed)
				      (PP (P as)
					  (CP-CMP (WPP-1 0)
						  (C 0)
						  (IP-SUB (PP *T*-1)
							  (NP-SBJ (NS cherles))
							  (VBP spryngen))))))
			  (, ,)
			  (PP-RSP (P of)
				  (NP (SUCH swich) (N seed)))
			  (NP-SBJ=2 *exp*)
			  (VBP spryngen)
			  (NP-2 (NS lordes))))
	  (. .))
  (ID CMCTPARS-M3,314.C1.1103))
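
The mechanical half of the PPCME2 condition (identical prepositions in the left-dislocated and resumptive PPs) can be checked with a short script. The sketch below is illustrative only: the function names are hypothetical, the tree is abbreviated from the example above, and coreference of the prepositional objects cannot be verified this way and still has to be judged by a reader.

```python
import re

def parse(text):
    """Parse one bracketed tree into nested lists; leaves are strings."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1  # skip "("
        items = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                items.append(node())
            else:
                items.append(tokens[pos])
                pos += 1
        pos += 1  # skip ")"
        return items

    return node()

def find(tree, label):
    """Yield every subtree whose first element is exactly `label`."""
    if isinstance(tree, list):
        if tree and tree[0] == label:
            yield tree
        for child in tree:
            yield from find(child, label)

def head_preposition(pp):
    """Return the lowercased word under the first P daughter, or None."""
    for child in pp[1:]:
        if isinstance(child, list) and child and child[0] == "P":
            return child[1].lower()
    return None

def matching_prepositions(tree):
    """True if the first PP-LFD and first PP-RSP share their preposition."""
    lfd = [head_preposition(pp) for pp in find(tree, "PP-LFD")]
    rsp = [head_preposition(pp) for pp in find(tree, "PP-RSP")]
    return bool(lfd) and bool(rsp) and lfd[0] == rsp[0]

# Abbreviated version of the example tree (relative clause omitted).
example = """
( (IP-SUB (PP-LFD (P of) (NP (SUCH swich) (N seed)))
          (PP-RSP (P of) (NP (SUCH swich) (N seed)))
          (VBP spryngen)
          (NP-2 (NS lordes)))
  (ID CMCTPARS-M3,314.C1.1103))
"""

print(matching_prepositions(parse(example)))
```

The check compares only the first PP-LFD and the first PP-RSP it finds, which is enough for a single clause like this one but would need refining for trees containing several dislocations.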

In the later corpora, the use of the -LFD dash tag is extended beyond such cases to indicate a relationship between a pre-subject PP and any more broadly resumptive PP or ADVP.

( (IP-MAT-SPE (PP-LFD (P Though)
		      (CP-ADV-SPE (C 0)
				  (IP-SUB-SPE (NP-SBJ (PRO I))
					      (VBP beare)
					      (NP-OB1 (N record))
					      (PP (P of)
						  (NP (PRO$ my) (N selfe))))))
	      (, ,)
	      (ADVP-RSP (ADV yet))
	      (NP-SBJ (PRO$ my) (N record))
	      (BEP is)
	      (ADJP-PRD (ADJ true))
	      (. :))
  (ID AUTHNEW-1611-E2-H,8,1J.1028))

NEED(S) and other adverbial genitives

In the PPCME2, NEED(S) and other originally adverbial genitives are treated as bare nouns (N) heading NP-ADV. These forms die out in the course of Early Modern English. Where they survive in the later corpora (notably in the collocation MUST NEEDS), they are tagged as adverbs heading a bare ADVP.

PPCME2:        (NP-ADV (N needs))
Later corpora: (ADVP (ADV needs))
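
When results from the PPCME2 and the later corpora are compared, it can help to bring the two annotations into one shape first. The following is a minimal sketch of such a rewrite, not an official conversion: the function name is hypothetical, and the pattern covers only the spellings "need" and "needs", whereas Middle English spellings vary.

```python
import re

# Assumption: only the spellings "need" and "needs" are matched here;
# other adverbial genitives and other spellings are left untouched.
_NEEDS = re.compile(r"\(NP-ADV \(N (needs?)\)\)", re.IGNORECASE)

def to_later_style(tree_text):
    """Rewrite (NP-ADV (N needs)) as (ADVP (ADV needs)) in a tree string."""
    return _NEEDS.sub(r"(ADVP (ADV \1))", tree_text)
```

Text that does not match the pattern is returned unchanged, so the function can be applied to whole trees.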

Participial clause as complement of verb

In the PPCME2, participial clauses very rarely function as complements of verbs, as in "I don't mind creating exams, but the worst part is grading them." As a result, such cases are annotated as bare IP-PPL rather than as IP-PPL-OB1 or IP-PPL-PRD, the labels used in the later corpora. See Participial clause without subject for details.

Possessive clitic

In the PCEEC2, the possessive clitic 's is always split off from its host noun. In the Penn corpora, it is split off only when the clitic takes scope over a constituent larger than its host. See Genitive/possessive modifier of N for details.