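|  | example | singular / basic | plural | possessive singular | possessive plural |
| --- | --- | --- | --- | --- | --- |
| common noun | school | N | NS | N$ | NS$ |
| proper noun | Kentucky | NPR | NPRS | NPR$ | NPRS$ |
| compound quantifier | everyone | Q+N | – | Q+N$ | – |
| ordinary pronoun | they, their | PRO | – | PRO$ | – |
| wh- pronoun | who | WPRO | – | WPRO$ | – |
| expletive with NP associate | there, it | EX | – | – | – |
| ordinary determiner | the, that, those | D | – | – | – |
| wh- determiner | which | WD | – | – | – |
| Possessive head (see NP-POS) | 's | POS | – | – | – |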
five mile_NS out of town
in five year_NS
Bare gerunds are tagged as N when this leads to a significant
simplification of the structure.
Many words referring to groups of people are systematically ambiguous
between a nominal and an adjectival use. If the ambiguous word bears
overt plural morphology or occurs in a syntactic context where an
overtly marked plural would be possible, it is tagged NPR(S).
Offices on their own are tagged as N. Offices as part of a
person's name are tagged as NPR.
Morphologically complex quantifiers beginning with ANY-, EVERY-, NO-, or
SOME- are tagged as Q+N($).
See Known issues.
Ordinary (referential) personal pronouns are tagged as PRO
or PRO$, as are the reciprocal pronouns.
( (IP-MAT (NP-SBJ (PRO I))
(VP (VBP enjoy)
(NP-OB1 (N hunting)))
(PUNC .)))
( (IP-MAT (NP-SBJ (PRO I))
(VP (VBP enjoy)
(NP-OB1 (IP-PPL (VP (VAG hunting)))))
(PUNC .)))
Proper noun (NPR)
The following words are tagged as proper nouns when used as nouns rather
than as adjectives.
He is a Catholic_NPR. (Analogously: Baptist, Christian, Italian.)
They are Catholics_NPRS.
They are Catholic_ADJ.
the Civil_NPR War_NPR
the War_NPR between_NPR the_NPR States_NPRS
World_NPR War_NPR One_NPR
the president_N of the United States
President_NPR Roosevelt_NPR
the Bible_NPR
Compound quantifier (Q+N)
everything_Q+N
someone's_Q+N$
Pronoun (PRO)
The treatment of expletive IT depends on
whether it is construed with an NP
associate. See IT for details and examples.
Wh- pronouns on their own are tagged as WPRO. By contrast, the
same items preceding overt nouns are tagged as WD.
Expletive THERE (or its pronunciation
variant THEY) is tagged as EX.
When construed with an NP associate (that is, when analogous to standard
existential THERE), expletive IT is also tagged EX.
See IT for details.
Wh- determiners (WD) are homonymous
with wh- pronouns (WPRO) but
distinguished from them by preceding an overt noun rather than standing
alone.
For so-called intrusive A, see X.
Wh- pronoun (WPRO)
Which_WPRO do you want ?
the girl who_WPRO came
Expletive with NP associate (EX)
Ordinary determiner (D)
Ordinary determiners, whether intransitive or transitive, are tagged
as D.
this_D and that_D
this_D one and that_D one
Wh- determiner (WD)
Which_WD one do you want ?
What_WD kind is most popular ?
Verbs and related categories
Verbs consist of a basic label (for BE, DO, GET, HAVE, or an ordinary
verb), with variants depending on whether the verb functions as the
infinitive, present or past tense, active or passive participle, gerund,
or imperative. Verbs are tagged with the same POS tags, regardless of
whether they are used as main verbs or auxiliary verbs.
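| Lexical | infinitive | present | past | participle, active | participle, passive | gerund | imperative |
| --- | --- | --- | --- | --- | --- | --- | --- |
| be | BE | BEP | BED | BEN | – | BAG | BEI |
| do | DO | DOP | DOD | DON | DAN | DAG | DOI |
| get | GT | GTP | GTD | GTN | GAN | GTG | GTI |
| have | HV | HVP | HVD | HVN | HAN | HAG | HVI |
| other verbs | VB | VBP | VBD | VBN | VAN | VAG | VBI |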
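The verb tag table above can be read as a lookup from verb class and form to POS tag. The following Python sketch encodes it that way. The names `FORMS`, `VERB_TAGS`, and `verb_tag` are illustrative assumptions, not part of any corpus tooling, and the sketch assumes that any lemma other than BE, DO, GET, or HAVE belongs to the "other verbs" row.

```python
# Forms in the column order of the verb tag table.
FORMS = ("infinitive", "present", "past", "active participle",
         "passive participle", "gerund", "imperative")

# One row per verb class; None marks the cell shown as "–" in the table.
VERB_TAGS = {
    "be":    ("BE", "BEP", "BED", "BEN", None,  "BAG", "BEI"),
    "do":    ("DO", "DOP", "DOD", "DON", "DAN", "DAG", "DOI"),
    "get":   ("GT", "GTP", "GTD", "GTN", "GAN", "GTG", "GTI"),
    "have":  ("HV", "HVP", "HVD", "HVN", "HAN", "HAG", "HVI"),
    "other": ("VB", "VBP", "VBD", "VBN", "VAN", "VAG", "VBI"),
}


def verb_tag(lemma, form):
    """Return the POS tag for a verb lemma in the given form, or None.

    Lemmas without their own row fall back to the "other verbs" row
    (an assumption of this sketch). An unknown form raises ValueError.
    """
    row = VERB_TAGS.get(lemma.lower(), VERB_TAGS["other"])
    return row[FORMS.index(form)]
```

For example, `verb_tag("get", "gerund")` looks up the GET row and returns its gerund tag, while `verb_tag("be", "passive participle")` returns None because the table has no tag in that cell.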
Aspectual A
Aspectual A is treated as a clitic
that attaches as a sister to gerunds (whether verbal or nominal).
( (VP (A a=)
(VAG coming)
(PP (P round)
(NP (D the)
(N mountain)))))
( (IP-MAT (NP-SBJ (EX They))
(VP (BED was)
(NP-LGS (NP (A a=) (N bickering))
(CONJP (CONJ and)
(NP (A a=) (N fussing))))
(VAG going)
(RP on))
(PUNC .)))
Aspectual marker ASP
ASP is treated as a sister to a (default) past tense verb.
( (IP-MAT (NP-SBJ (PRO She))
(VP (ASP done)
(VBD left))
(PUNC .)))
FOR
When FOR is used as a complementizer, it is tagged with its own POS tag
and attaches as a sister of infinitival TO.
( (IP-MAT (NP-SBJ (PRO I@))
(VP (MD @'d)
(VP (VB prefer)
(IP-INF (FOR for)
(NP-SBJ (PRO them))
(TO to)
(VP (VB leave)))))
(PUNC .)))
( (IP-MAT (NP-SBJ (PRO I))
(VP (VBP want)
(IP-INF (FOR for)
(TO to)
(VP (VB leave))))
(PUNC .)))
FROM
When FROM functions as the (implicit) head of a gerund, analogous to TO
in infinitival clauses, it is tagged with its own POS tag and attaches
as a sister of the gerund VP.
( (IP-MAT (NP-SBJ (PRO They))
(VP (MD could@)
(NEG @n't)
(VP (VB prevent)
(IP-ECM (NP-SBJ (PRO him))
(FROM from)
(VP (VAG losing)
(NP (D the)
(N job))))))
(PUNC .)))
MD
Ordinary modals: CAN, COULD, MAY, MIGHT, MUST, SHALL, SHOULD, WILL, WOULD
Pseudo-modals: DARE, NEED, OUGHT
MD counts as the head of a verb phrase. In standard English, modals are necessarily finite, but in Appalachian English, they occur in nonfinite contexts (or at least appear to do so). In particular, multiple modals are both tagged as MD for ease of retrieval (even if a multiple modal analysis turns out not to be correct).
( (IP-MAT (NP-SBJ (PRO You))
          (VP (MD can)
              (VP (VB try)
                  (NP-OB1 (PRO it))
                  (RP out)))
          (PUNC .)))
( (IP-MAT (NP-SBJ (PRO she))
          (VP (DOD did@)
              (NEG @n't)
              (VP (MD might)
                  (VP (HV =uv)
                      (VP (VBN went)
                          (PP (P to)
                              (NP (N school)))
                          (NP-MSR (QP (Q some)))))))
          (PUNC .)))
( (IP-MAT (CONJ and)
          (NP-SBJ (D a)
                  (N body))
          (VP (MD might)
              (VP (MD could)
                  (VP (VB fall))))
          (PUNC .)))
The pseudo-modals DARE and NEED are tagged MD when they precede negation (even if they govern an infinitival clause rather than a bare VP).
( (IP-MAT (NP-SBJ (PRO They))
          (VP (MD dared)                         ← modal DARE
              (NEG not)
              (IP-INF (TO to)
                      (VP (VB come))))
          (PUNC .)))
( (IP-MAT (NP-SBJ (PRO They))
          (VP (DOD did@)
              (NEG @n't)
              (VP (VB dare)                      ← verbal DARE
                  (IP-INF (TO to)
                          (VP (VB come)))))
          (PUNC .)))
( (IP-MAT (NP-SBJ (PRO You))
          (VP (MD need@)                         ← modal NEED
              (NEG @n't)
              (VP (VB come)))
          (PUNC .)))
( (IP-MAT (NP-SBJ (PRO You))
          (VP (DOP do@)
              (NEG @n't)
              (VP (VB need)                      ← verbal NEED
                  (IP-INF (TO to)
                          (VP (VB come)))))
          (PUNC .)))
The pseudo-modal OUGHT is always tagged MD.
( (IP-MAT (NP-SBJ (PRO You))
          (VP (MD ought)
              (IP-INF (TO to)
                      (VP (VB try)
                          (NP-OB1 (PRO it)))))
          (PUNC .)))
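The tagging rules for modals and pseudo-modals stated above can be summarized as a small decision function. This is only a sketch: `modal_tag` and its `precedes_negation` parameter are hypothetical names, and the function assumes the word has already been identified as an instance of the lexeme in question. It returns None wherever the rules above do not call for MD, which covers the cases the examples label as verbal DARE and verbal NEED.

```python
ORDINARY_MODALS = {"CAN", "COULD", "MAY", "MIGHT", "MUST",
                   "SHALL", "SHOULD", "WILL", "WOULD"}


def modal_tag(lemma, precedes_negation=False):
    """Return "MD" if the rules above call for it, otherwise None."""
    lemma = lemma.upper()
    # Ordinary modals and OUGHT are tagged MD regardless of context.
    if lemma in ORDINARY_MODALS or lemma == "OUGHT":
        return "MD"
    # DARE and NEED are tagged MD only when they precede negation.
    if lemma in {"DARE", "NEED"} and precedes_negation:
        return "MD"
    # Otherwise the word receives an ordinary verb tag (not decided here).
    return None
```

Under these assumptions, `modal_tag("need", precedes_negation=True)` corresponds to "You needn't come", and `modal_tag("need")` to "You don't need to come".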
Silent modals. Appalachian English features notorious amounts of verbal syncretism. In particular, it is often difficult or impossible to tell whether a particular verb form is a (nonstandard) past tense form or a bare form (infinitive) licensed by the silent counterpart of WOULD, marking habitual aspect. Our conventions for annotating silent WOULD (and silent modals more generally) are as follows. If the overt modal would be appropriate in context and the larger context contains a licensing instance of that modal, we assume a silent modal in the clause at issue. The silent modal contains a reference to the licensing modal in the form of a plus or minus sign, followed by a counter. The sign indicates whether the licensing modal occurs in the previous (-) or following (+) context. The counter indicates the distance in sentence tokens from the silent modal; for a licensing modal in the same sentence token, the counter is zero (0). In some instances, a silent modal analysis seems clearly appropriate, but there is no licensing overt modal in the context. Such cases are indicated by appending "x" to the silent modal.
( (IP-MAT (NP-SBJ (PRO We@))
          (VP (MD @'d)
              (VP (VB go)
                  (ADVP-DIR (ADV home))
                  (ADVP-TMP (NP-MSR (QP (Q many))
                                    (NS years))
                            (ADV ago))))
          (PUNC ,)))
( (IP-MAT (CONJ and)
          (NP-SBJ (PRO we))
          (VP (MD 0-1)
              (VP (VB celebrate)
                  (NP-OB1 (NPR Christmas))
                  (ADVP-LOC (ADV there))))
          (PUNC .)))
( (IP-MAT (NP-SBJ (PRO We))
          (VP (MD 0+1)
              (VP (VB go)
                  (ADVP-DIR (ADV home))
                  (ADVP-TMP (NP-MSR (QP (Q many))
                                    (NS years))
                            (ADV ago))))
          (PUNC ,)))
( (IP-MAT (CONJ and)
          (NP-SBJ (PRO we@))
          (VP (MD @'d)
              (VP (VB celebrate)
                  (NP-OB1 (NPR Christmas))
                  (ADVP-LOC (ADV there))))
          (PUNC .)))
( (IP-MAT (CP-ADV (C When)
                  (IP-SUB (NP-SBJ (PRO we))
                          (VP (MD 0+0)
                              (VP (VB go)
                                  (ADVP-DIR (ADV home))
                                  (ADVP-TMP (NP-MSR (QP (Q many))
                                                    (NS years))
                                            (ADV ago))))))
          (PUNC ,)
          (NP-SBJ (PRO we@))
          (VP (MD @'d)
              (VP (VB celebrate)
                  (NP-OB1 (NPR Christmas))
                  (ADVP-LOC (ADV there))))
          (PUNC .)))
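The sign-and-counter convention for silent modals can be decoded mechanically. The sketch below assumes, based on the examples above, that a silent modal is written as `0` followed by the sign and counter (`0-1`, `0+1`, `0+0`). The unlicensed form is assumed to be written `0x`; the text only says that "x" is appended, so that spelling is a guess. The names `SilentModal` and `parse_silent_modal` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class SilentModal(NamedTuple):
    licensed: bool             # False for the assumed "0x" form
    direction: Optional[str]   # "previous" (-) or "following" (+)
    distance: Optional[int]    # distance in sentence tokens; 0 = same token


# "0" plus either sign and counter, or "x" (the "0x" spelling is an assumption).
_SILENT_MODAL = re.compile(r"0(?:([+-])(\d+)|x)")


def parse_silent_modal(label):
    """Decode a silent modal label; return None for any other string."""
    match = _SILENT_MODAL.fullmatch(label)
    if match is None:
        return None
    sign, counter = match.groups()
    if sign is None:
        return SilentModal(False, None, None)
    direction = "previous" if sign == "-" else "following"
    return SilentModal(True, direction, int(counter))
```

With these assumptions, `parse_silent_modal("0-1")` describes a silent modal licensed by an overt modal one sentence token earlier, and `parse_silent_modal("0+0")` one licensed later in the same sentence token.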
( (IP-INF (IP-INF (TO to)
                  (VP (BE be)))
          (CONJP (CONJ or)
                 (IP-INF (NEG not)
                         (TO to)
                         (VP (BE be))))))
|  | basic | comparative | superlative | wh- |
| --- | --- | --- | --- | --- |
| adjective | ADJ | ADJR | ADJS | – |
| adverb | ADV | ADVR | ADVS | WADV |
| quantifier | Q | QR | QS | – |
| numeral | NUM | – | – | – |
Degree heads are discussed in detail
in Degree and comparative constructions.
Numeral (see Numbers)
Quantifier
Quantifiers include the following items:
ALL, EVERY, MANY, MORE, MOST, NO, NONE, SOME
Doubly-marked comparatives are annotated as sequences of the comparative quantifier MORE and a comparative adjective or adverb.
( (ADJP (QP (QR more)) (ADJR happier)))
( (ADVP (QP (QR more)) (ADVR quicker)))
Analogously for doubly-marked superlatives.
( (ADJP (QP (QS most)) (ADJS happiest)))
( (ADVP (QP (QS most)) (ADVS quickest)))
| Category | Example | POS tag |
| --- | --- | --- |
| complementizer | because, that | C |
| coordinating conjunction | and, or | CONJ |
| focus particle | only | FP |
| foreign word | mangia | FW |
| interjection | oh | INTJ |
| negation | not, n't | NEG |
| particle | down, in, up | RP |
| preposition | about, in | P |
| punctuation | . , ? | PUNC |
| wh- complementizer | if, whether | WQ |
| unknown | – | X |
AND, BUT, BOTH, EITHER, OR, NEITHER, NOR
Anacoluthic clause-medial conjunctions are tagged as X.
( (IP-MAT (CONJ-TEMP And)
          (CP-ADV (C after)
                  (IP-SUB (NP-SBJ (PRO we))
                          (VP (GTD got)
                              (VP (VAN organized)))))
          (PUNC ,)
          (X and)
          (NP-SBJ (NS things))
          (VP (VBD begin)
              (IP-INF (TO to)
                      (VP (VB pick)
                          (RP up))))
          (PUNC .)))
( (IP-MAT (PP (P through)
              (NP (N spring)
                  (PP (P of)
                      (NP (D the)
                          (N year)))))
          (PAREN (IP-MAT (NP-SBJ (PRO you))
                         (VP (MD might)
                             (NEG not)
                             (VP (VB believe)
                                 (NP-OB1 (PRO it))))))
          (X but)
          (NP-SBJ (PRO they))
          (VP (HVD had)
              (INTJ uh)
              (IP-ECM (NP-SBJ (NS men))
                      (VP (A a=)
                          (VAG going)
                          (ADVP-X (ADV around)))))
          (PUNC ,)))
Dangling conjunctions (AND, BUT, OR) are included as part of a preceding sentence token. They precede BREAK and attach as high as structurally possible (as daughter of IP, generally the root IP).
( (IP-MAT (NP-SBJ (PRO He))
          (VP (VBD looks)
              (NP-PRD (D the)
                      (N part)))
          (PUNC ,)
          (CONJ but)
          (CODE <BREAK>)))
The following words can count as focus particles. BUT and JUST also have other uses.
BUT, EVEN, JUST, ONLY
Focus particles attach as daughters of the phrase with which they are
construed.
False start (FS)
Foreign word (FW)
TO BE ADDED
Interjection (INTJ)
INTJ is used for expressions like the following (along with spelling and
pronunciation variants):
AH, AMEN, AW, AYE,
BOO, BYE,
DANG, DARN, DARNIT, DOGGONE,
EH, EW,
GAD, GAW, GEE, GOLLY, GOOD-BYE, GOSH,
HA, HEH, HELLO, HEY, HI, HM, HOWDY, HUSH,
JEEZ,
KABOOM,
LORDY,
MM,
NAH, NAW, NO, NOPE, NUH-UH,
OKAY, OKIE-DOKIE, OH, OOPS, OY,
PHEW, PSH,
SH, SHOO, SHUCKS,
UH, UH-UH, UH-HUH, UM,
WHOA, WHEW, WHIZ, WHOO, WOO, WOOPS, WOW,
YAY, YEAH, YES, YESSIRREE, YUP
Honorary interjections. In addition to these true interjections, the following words, when used as interjections on their own without any modification, are treated as honorary interjections and tagged INTJ. The annotation of honorary interjections forming part of a larger expression is discussed in INTJP.
ALRIGHT, BOY, DUDE, FUCK and related expressions, GOD, GOODNESS, HEAVENS, HELL, LIKE, LORD, MAN, PLEASE, RIGHT, SHOOT, WELL, WHY
Quotative markers. When
introducing quotation phrases
(QTP), ALL and
LIKE are tagged as INTJ.
Negation (NEG)
For negation in conjunction contexts, see also Negation, ALSO, and related particles.
Otherwise, the default is for NEG to attach as high as structurally
possible.
Particles are a subtype of preposition and are defined as items
belonging to the following list.
Particle (RP)
ABOUT, ACROSS, BY, DOWN, FRO, IN, OFF, ON, OUT, OVER, THROUGH, (stressed) TO, UNDER, UP, WITH (in varieties that allow COME WITH)
Depending on their syntactic context, the above items are tagged as
RP or P. For details concerning individual items, follow the links in
the list. For more general discussion, notably on which tag is
appropriate and whether RP projects a phrase (PP) or not,
see Internal structure of phrases.
When the possessive morpheme 'S (or bare apostrophe) takes scope over a
constituent larger than a simple NP, it
is split off and tagged as POS,
which then functions as the head of a possessive NP (NP-POS). See the
dash tag -POS for examples and
discussion.
X is used for words whose POS tag is unknown. It can also be used for
words whose tag is mysterious in context.
Possessive morpheme (POS)
Preposition (P)
Prepositions include the homonymous items listed
under Particle and others (DURING, EXCEPT,
SINCE, etc.).
Wh- complementizer (WQ)
WQ is the POS tag for IF when it
heads an indirect question and for WHETHER.
Unknown or mysterious word tag (X)
POS tags that are ambiguous between two (or even three) known tags are handled in a more informative manner; the possible tags are concatenated as illustrated under POS ambiguity.
( (IP-MAT (CONJ-TEMP and)
          (ADVP-TMP (ADV then))
          (CP-ADV (C-ADV if)
                  (IP-SUB (NP-SBJ (PRO we))
                          (VP (VBD took)
                              (NP-OB1 (D the)
                                      (ADJP (ADJ whole))
                                      (N family)))))
          (CODE {inhaling})
          (PUNC ,)
          (X that)
          (NP-SBJ (PRO he@))
          (VP (MD @'d)
              (VP (VB take)
                  (NP-OB1 (D the)
                          (N wagon))))
          (PUNC .)))
( (IP-MAT (NP-SBJ (PRO She))
          (VP (VBD died)
              (X in)
              (ADVP-TMP (NP-MSR (NUMP (ADVP (ADV about))
                                      (NUM two))
                                (NS weeks))
                        (ADV ago)))
          (PUNC ,)))
( (FRAG (INTJ Yeah)
        (PUNC ,)
        (NP-SBJ (PRO I))
        (XX (X like)
            (IP-INF (TO ta)
                    (VP (HV =uv)
                        (VP (VBN died)
                            (CP-ADV (C-ADV when)
                                    (IP-SUB (NP-SBJ (PRO I))
                                            (VP (VBD moved)
                                                (ADVP-DIR (RP over)
                                                          (ADV here)))))))))
        (PUNC .)))
Intrusive A. X is used to annotate what the OED calls intrusive A. However, THIS-A-WAY and THAT-A-WAY are treated as simple lexical items.
( (IP-MAT (NP-SBJ (PRO They))
          (VP (BED was)
              (NP-PRD (FP just)
                      (ADJP (X a=)
                            (ADJ little))
                      (NS kids)))
          (PUNC .)))
-COMP labels count as word-level tags. In other words, they count as heads of phrases and are always dominated by a phrasal label that matches the syntactic category of the -COMP label.
Apart from the head (generally the rightmost element), the categories of a compound word's constituents do not necessarily match the category of the entire compound.
( (N-COMP (ADJ high) (N school)))
( (N-COMP (ADJ Social) (N Security)))
( (N-COMP (ADJ Social) (N Security) (N card)))
( (ADJ-COMP (ADJ Greek) (ADJ Orthodox)))
( (ADJ-COMP (ADJ bright) (ADJ red)))
( (NP (D a) (N-COMP (N coal) (N miner))))
( (NP (QP (Q several)) (N-COMP (N mine) (NS inspectors))))
( (NP (D a) (N-COMP (NPR Christmas) (N present))))
( (NP (D a) (ADJP (ADJ major)) (N-COMP (N coal) (N mining) (N operation))))
( (NP (D the) (N-COMP (NPR C) (NPR and) (NPR O) (N canal))))
N-COMP is not recursive, but in some cases, further internal structure is indicated (notably, when a compound noun contains a PP).
( (NP (D the) (N-COMP (NPR District) (PP (P of) (NP (NPR Columbia))))))
( (NP (N-COMP (ADJP (ADJ Fourth)) (PP (P of) (NP (NPR July))) (N fireworks))))
Cities or counties together with their states or countries are treated as compound nouns. As usual, N-COMP is non-recursive.
( (NP (N-COMP (NPR Harlan) (NPR County))))
( (NP (N-COMP (NPR Louisville) (PUNC ,) (NPR Kentucky))))
( (NP (N-COMP (NPR Harlan) (NPR County) (PUNC ,) (NPR Kentucky))))
( (NP (N-COMP (NPR New) (NPR York) (NPR City))))
( (NP (N-COMP (NPR New) (NPR York) (PUNC ,) (NPR New) (NPR York))))
( (NP (N-COMP (NPR Hiroshima) (PUNC ,) (NPR Japan))))
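The statement that N-COMP is non-recursive lends itself to an automatic check. The Python sketch below reads one bracketed tree in the notation used above and reports whether any N-COMP node has another N-COMP as an immediate daughter. Both function names are hypothetical, and treating "non-recursive" as a ban on immediate N-COMP daughters only is this sketch's interpretation, not something the guidelines spell out.

```python
import re


def parse_tree(text):
    """Parse one bracketed tree into nested (label, children) tuples.

    The outer wrapper "( ... )" becomes a node with an empty label;
    words are kept as plain strings among the children.
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    if not tokens or tokens[0] != "(":
        raise ValueError("tree must start with '('")
    pos = 0

    def node():
        nonlocal pos
        pos += 1                              # consume "("
        label = ""
        if tokens[pos] not in ("(", ")"):
            label = tokens[pos]
            pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])  # a word
                pos += 1
        pos += 1                              # consume ")"
        return (label, children)

    return node()


def has_nested_ncomp(tree):
    """Return True if some N-COMP node has an N-COMP daughter."""
    label, children = tree
    for child in children:
        if isinstance(child, tuple):
            if label == "N-COMP" and child[0] == "N-COMP":
                return True
            if has_nested_ncomp(child):
                return True
    return False
```

A flat compound such as `( (N-COMP (ADJ Social) (N Security) (N card)))` passes the check, whereas a hypothetical nested bracketing of the same words would be flagged. An unbalanced tree makes `parse_tree` fail with an IndexError rather than return a partial result.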