See Conjunction and gapping for discussion of the following labels: CONJP, ADJX, ADVX, NX, NUMX
This section of the documentation focuses on the internal structure of phrases, abstracting away from the grammatical or semantic roles that they play within larger structures, particularly within clauses. These roles are indicated by so-called dash tags and discussed in the section on Grammatical and semantic functions. The internal structure of clauses (IPs and CPs) is also discussed separately.
The general schema for phrases consists of a unique head, possibly accompanied by syntactic dependents. Dependents is a cover term for complements (arguments) and modifiers (adjuncts); for more discussion, see Grammatical and semantic functions. Dependents are uniformly labelled as phrases, regardless of how many words they contain. There are very few exceptions, which follow because of other annotation conventions; see Heads for discussion.
( (ADJP (ADVP (ADV very))    ← ADVP = pre-head dependent
        (ADJ happy)          ← head of ADJP
        (PP (P with)         ← PP = post-head dependent
            (NP (D the)
                (N outcome)))))

( (ADVP (NP-MSR (NUMP (NUM two))    ← NP-MSR = pre-head dependent
                (NS years))
        (ADV ago)))                 ← head of ADVP

( (NP (D the)               ← head (exceptional sister to N)
      (ADJP (ADJS best))    ← ADJP = dependent of N
      (N outcome)))         ← head of NP
Phrase | Possible head |
---|---|
ADJP | ADJ, ADJR, ADJS, ADJ-COMP |
ADVP | ADV, ADVR, ADVS |
INTJP | INTJ |
NP | N, NS, NPR, NPRS, Q+N, N-COMP; in the absence of the above: D, EX, PRO |
NUMP | NUM, NUM-COMP |
PP | P, RP |
QP | Q, QR, QS |
VP | MD, inflectional variants of BE, DO, GT, HV, VB |
wh- phrase | same as for corresponding ordinary phrase |
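The table can be read as a lookup from phrase label to admissible head tags. The following Python sketch is only an illustration of that reading and is not part of any annotation tooling. The names `parse`, `POSSIBLE_HEADS`, `NP_FALLBACK`, and `head_candidates` are hypothetical. VP and wh- phrases are left out because their full tag inventories are not spelled out in the table, and the sketch assumes that dash tags on a phrase label can be stripped at the first hyphen.

```python
import re

# Possible heads per phrase, copied from the table above.
# VP and wh- phrases are omitted (their tag inventories are not listed in full).
POSSIBLE_HEADS = {
    "ADJP": {"ADJ", "ADJR", "ADJS", "ADJ-COMP"},
    "ADVP": {"ADV", "ADVR", "ADVS"},
    "INTJP": {"INTJ"},
    "NP": {"N", "NS", "NPR", "NPRS", "Q+N", "N-COMP"},
    "NUMP": {"NUM", "NUM-COMP"},
    "PP": {"P", "RP"},
    "QP": {"Q", "QR", "QS"},
}
# Heads of NP only in the absence of the tags listed above.
NP_FALLBACK = {"D", "EX", "PRO"}


def parse(text):
    """Parse one bracketed tree into nested (label, children) tuples.

    Words and traces are kept as plain strings inside the children list.
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1  # consume "("
        label = ""
        if tokens[pos] not in ("(", ")"):
            label = tokens[pos]
            pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume ")"
        return (label, children)

    return node()


def head_candidates(node):
    """Return the labels of the daughters that the table allows as heads."""
    label, children = node
    base = label.split("-")[0]  # assumption: ADJP-PRD -> ADJP
    child_labels = [c[0] for c in children if isinstance(c, tuple)]
    allowed = POSSIBLE_HEADS.get(base, set())
    found = [lab for lab in child_labels if lab in allowed]
    if not found and base == "NP":
        found = [lab for lab in child_labels if lab in NP_FALLBACK]
    return found


tree = parse("( (ADVP (NP-MSR (NUMP (NUM two)) (NS years)) (ADV ago)))")
print(head_candidates(tree[1][0]))  # ['ADV']
```

The outer wrapper of each tree has an empty label, so the phrase of interest is `tree[1][0]`. A result with more than one entry would point to two potential heads as sisters.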
As mentioned earlier, phrases each have a unique head. It should not, therefore, be possible for two categories from the preceding table to be sisters. This is generally true, but there are a very few exceptions, which follow from the fact that not all POS categories always project phrases in our system. (Recall that our annotation conventions are intended to facilitate searches and do not express a theoretical commitment to the details of a particular structure.) The ramifications are discussed in more detail in the appropriate sections.
( (IP-MAT (NP-SBJ (PRO He)) (VP (BED was) (ADJP-PRD (NP-MSR (NUMP (NUM twenty-one)) (NS years)) (ADJ old))) (PUNC ,))) ( (IP-MAT (CONJ and) (NP-SBJ (PRO she)) (VP (BED was) (ADJP-PRD (NP-MSR (NUMP (NUM eighteen)) (NS 0)) (ADJ 0))) (PUNC .)))
ordinary: | ADJP (with linking verb or, rarely, with P), ADVP, NP, PP, VP |
clausal: | CP, IP |
other: | FRAG, QTP, XX |
NUMP and QP do not serve as complements on their own, but must be
dominated by some type of NP. (Stranded IP-level QP, an apparent
exception, is not a complement.)
Modifier
As mentioned earlier, modifiers (by virtue of being dependents) are
treated as phrases (apart from particles in
particle-verb combinations).
At the VP/IP level, modifiers generally bear a dash tag indicating a semantic function. But ADVPs are bare if they are not marked as directional (-DIR), locative (-LOC), or temporal (-TMP) modifiers, and PPs are always unmarked for function.
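As a rough illustration of the statement about ADVP and PP modifiers, the Python sketch below splits a node label into its category, dash tags, and numeric index, and reports dash tags that the paragraph above would not lead one to expect. The function names and the `FUNCTION_TAGS_ALLOWED` table are hypothetical, the treatment of a purely numeric suffix as a coindexation index is an assumption, and the check is meaningful only for nodes already known to be modifiers at the VP/IP level.

```python
# Dash tags expected on VP/IP-level modifiers, per the paragraph above.
# Assumption: categories not listed here are simply not checked.
FUNCTION_TAGS_ALLOWED = {
    "ADVP": {"DIR", "LOC", "TMP"},
    "PP": set(),  # PPs are always unmarked for function
}


def split_label(label):
    """Split 'ADVP-TMP' or 'WNP-1' into (category, dash tags, index)."""
    category, *rest = label.split("-")
    index = None
    tags = []
    for part in rest:
        if part.isdigit():
            index = int(part)  # assumption: numeric suffix = index
        else:
            tags.append(part)
    return category, tags, index


def unexpected_tags(label):
    """Return the dash tags on a modifier label that the text does not mention."""
    category, tags, _ = split_label(label)
    if category not in FUNCTION_TAGS_ALLOWED:
        return []
    return [t for t in tags if t not in FUNCTION_TAGS_ALLOWED[category]]


print(split_label("ADVP-TMP"))    # ('ADVP', ['TMP'], None)
print(unexpected_tags("PP-LOC"))  # ['LOC']
```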
Modifiers at levels lower than the VP/IP level generally do not bear
a dash tag. But measure NPs are always marked as such
(-MSR), and adnominal NP modifiers are
marked as possessive (-POS).
Head or dependent?
It is sometimes difficult to resolve the dependency relations among
phrases, particularly in cases with several spatial adverbs, particles,
and prepositions. The following considerations, based on diagnostic
distributional patterns, should resolve doubtful cases. The distinction
between complements and modifiers is not explicitly represented in the
annotation, but taking the distinction into account generally
facilitates the decision process. The section on the internal structure
of PPs contains further examples
of difficult cases.
ok come from in the house
* comefromin the house (* on intended reading)
* come fromin the house
( (VP (VB come) (PP (P from) (PP (P in) (NP (D the) (N house))))))

ok stay in from the rain
* stayinfrom the rain
ok stay infrom the rain
( (VP (VB stay) (PP (RP in) (PP (P from) (NP (D the) (N rain))))))
ok live up in the house
ok liveupin the house
* live upin the house
( (VP (VB live)
      (PP (PP (RP up))    ← like this
          (P in)
          (NP (D the) (N house)))))
( (VP (VB live)
      (PP (RP up)         ← not like this
          (PP (P in)
              (NP (D the) (N house))))))
ok go over to the house
ok gooverto the house
ok go overto the house
( (VP (VB go)
      (PP (RP over)         ← like this
          (PP (P to)
              (NP (D the) (N house))))))
( (VP (VB go)
      (PP (PP (RP over))    ← not like this
          (P to)
          (NP (D the) (N house)))))

( (ADVP-TMP (ADV early)     ← like this
            (PP (P in)
                (NP (D the) (N morning)))))
( (PP (ADVP (ADV early))    ← not like this
      (P in)
      (NP (D the) (N morning))))
( (ADJP (ADVP (ADV very)) (ADJ proud) (PP (P of) (NP (D the) (N result))))) ( (ADJP (ADJR happier) (PP (P than) (NP (NP-POS (PRO$ their)) (NS competitors))))) ( (ADJP (ADJ happy) (CP-THT (C that) (IP-SUB (NP-SBJ (PRO they)) (VP (VBD won)))))) ( (PP (P for) (ADJP (ADJ sure))))
( (ADVP (QP (QR more)) (ADV happily) (PP (P than) (ADVP (ADV before))))) ( (ADVP (NP-MSR (QP (Q many)) (NS years)) (ADVR later)))
ADVP is the default category for phrases with adverbial function. This is true even if it gives rise to category mismatches in connection with exocentric structures.
( (IP-MAT (NP-SBJ (PRO I)) (VP (VBD arrived) (ADVP-TMP (CP-FRL (WNP-1 (ADVP (ADV just)) (WD what) (N time)) (IP-SUB (NP-SBJ (PRO he)) (VP (NP-TMP *T*-1) (VBD left)))))) (PUNC .)))
The label INTJP is used in the following cases:
( (FRAG (INTJ Well) (INTJ hi) (PUNC ,) (ELAB (INTJP (INTJ hi) (ADVP (ADV there))) (ADVP (ADV again))) (PUNC .)))
( (QTP (IP-MAT (NP-SBJ (N-COMP (NPR Mister) (NPR Long)))
               (VP (VBD told) (NP-OB2 (PRO me)) (IP-INF (TO to) (VP (VB go) (PP (P with) (NP (N-COMP (NPR Emory) (NPR Cook)))))))
               (PUNC ,)
               (INTJP (PP (P by)    ← BY GOD
                          (NP (NPR God)))))
       (PUNC ,)))

( (FRAG (INTJ Mmhmm)
        (PUNC ,)
        (INTJ oh)
        (INTJP (NP (N dear)))    ← DEAR
        (PUNC .)))

( (FRAG (INTJ Oh)
        (INTJP (ADJP (ADJ good))    ← GOOD GRACIOUS
               (ADJP (ADJ gracious)))
        (PUNC ,)))

( (IP-MAT (INTJ Oh)
          (INTJP (NP (NP-POS (PRO$ my))))    ← MY
          (PUNC ,)
          (NP-SBJ (PRO we))
          (VP (QP (Q all)) (DOD did))
          (PUNC .)))

( (FRAG (INTJP (NP (PRO$ My) (NPR God))) (PUNC .)))

( (FRAG (INTJP (NP (PRO$ My) (N goodness))) (PUNC .)))

( (FRAG (INTJ Oh) (INTJP (NP (NP-POS (PRO$ my)) (INTJ gosh))) (PUNC .)))

( (IP-MAT (NP-SBJ (D that@))
          (VP (HVP @'s)
              (VP (BEN been)
                  (INTJ uh)
                  (CODE {hesitating})
                  (INTJP (WNP (WPRO what)))    ← WHAT
                  (PUNC ,)
                  (NP-PRD (NUMP (NUM forty-some)) (NS year))))
          (PUNC ?)
          (CODE {laughing})))
Otherwise, interjections are annotated as bare INTJ.
( (FRAG (INTJ Well) (PUNC ,) (INTJ oh) (INTJ gaw) (PUNC ,) (NP (N-COMP (NPR Bob) (NPR Martin))) (PUNC .)))

( (IP-MAT (INTJ Goodness)    ← honorary INTJ
          (PUNC ,)
          (NP-SBJ (D that))
          (VP (MD must) (VP (HV have) (VP (BEN ben) (ADJP-PRD (ADJ tough)))))
          (PUNC .)))

( (CP-EXC (INTJ God)    ← honorary INTJ
          (PUNC ,)
          (WADJP (WADVP (WADV how)) (ADJ disappointing))
          (PUNC .)))
For simplicity, our annotation scheme treats noun phrases as projections of N rather than of D. (Recall that this does not imply a theoretical rejection of the DP analysis of noun phrase structure.) D, EX and PRO function as heads of NP in the absence of N.
In general, NPs do not dominate other bare NPs, except in connection with conjunction and calendar dates.
( (NP (N water))) ( (NP (NPR Paris))) ( (NP (ADJP (ADJ pretty)) (NS pictures))) ( (NP (NUMP (NUM three)) (QP (QR more)) (NS examples))) ( (NP (D those))) ( (NP (EX there))) ( (NP (PRO ourselves)))
Because noun phrases are treated as projections of N, prenominal determiners violate our convention that phrases have a unique head (or more precisely, dominate at most one word-level category).
( (NP (D a) (ADJP (ADVP (ADV very)) (ADJ nice)) (N room))) ( (NP (D the) (N opportunity) (PP (P of) (NP (D a) (N lifetime))))) ( (NP (D those) (NS books))) ( (NP (D those) (NUMP (NUM two)) (NS books)))
More than with any other phrase type, noun phrases can be headed by a silent nominal head. As mentioned in Known issues, this silent head is not always explicitly indicated, but it informs the annotation. In particular, it prevents other categories (notably, ADJ, NUM, and Q) from serving as heads of NP; rather, when immediately dominated by NP, these categories invariably function as nominal modifiers.
( (NP (D the) (ADJP (ADJ rich)))) ( (NP (D the) (ADJP (ADJ English)))) ( (NP (D those) (NUMP (NUM two)))) ( (NP (QP (Q many)) (QP (QR more)) (PP (P of) (NP (D that) (N type)))))
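One practical consequence, combined with the convention that dependents are labelled as phrases, is that a bare ADJ, NUM, or Q should not turn up as an immediate daughter of NP. The Python sketch below is a hypothetical consistency check built on that reading. The names `parse`, `BARE_MODIFIER_TAGS`, and `bare_modifiers_under_np` are invented for the illustration, and only the three tags named in the paragraph are checked.

```python
import re

# Word-level tags named in the paragraph above; comparative and superlative
# variants are deliberately not added, since the text does not list them.
BARE_MODIFIER_TAGS = {"ADJ", "NUM", "Q"}


def parse(text):
    """Parse one bracketed tree into nested (label, children) tuples.

    Words and traces are kept as plain strings inside the children list.
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1  # consume "("
        label = ""
        if tokens[pos] not in ("(", ")"):
            label = tokens[pos]
            pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume ")"
        return (label, children)

    return node()


def bare_modifiers_under_np(node):
    """Collect (NP label, daughter tag) pairs where ADJ, NUM, or Q
    is immediately dominated by an NP instead of by ADJP, NUMP, or QP."""
    label, children = node
    hits = []
    for child in children:
        if not isinstance(child, tuple):
            continue  # skip words and traces
        if label.split("-")[0] == "NP" and child[0] in BARE_MODIFIER_TAGS:
            hits.append((label, child[0]))
        hits.extend(bare_modifiers_under_np(child))
    return hits


print(bare_modifiers_under_np(parse("( (NP (D the) (ADJP (ADJ rich))))")))  # []
print(bare_modifiers_under_np(parse("( (NP (D the) (ADJ rich)))")))  # [('NP', 'ADJ')]
```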
Postnominal ADJPs are generally annotated
as predicates (PRD) of
reduced relative clauses (IP-RRC).
But postnominal ENOUGH and SUCH are annotated as bare ADJP, since they
do not allow an IP-RRC paraphrase.
NUMP
The following examples illustrate simple instances of NUMP. The
annotation of number expressions more generally raises special issues
and is discussed in full detail in Numbers.
( (NP (NUMP (NUM one)) (N house))) ( (NP (NUMP (NUM five)) (NS houses))) ( (NP (NUMP (ADVP (ADV about)) (NUM five)) (NS houses))) ( (NP (NUMP (ADVP (ADV around)) (NUM ten)) (NS children))) ( (NP (NUMP (ADVP (ADV probably)) (NUM five)) (NS houses))) ( (NP (NUMP (NUMP (NUM two)) (CONJP (CONJ or) (NUMP (ADVP (ADV maybe)) (FP even) (NUM three)))) (NS houses)))
( (PP (P without) (NP (D the) (N shadow) (PP (P of) (NP (D a) (N doubt))))))
Particles are a subtype of preposition. They are tagged as P or as RP and sometimes fail to project PP.
( (PP (P in)
      (NP (D the)    ← NP complement
          (N house))))

( (FRAG (WNP (WPRO What))
        (PP (P about)    ← ADVP complement
            (ADVP (ADVR later)))))

( (FRAG (WNP (WPRO What))
        (PP (P about)    ← PP complement
            (PP (P in)
                (NP (D the) (N house))))))

( (PP (ADVP (ADV right))    ← modifier
      (P in)
      (NP (D the) (N house))))
( (VP (VB give) (NP-OB1 (PRO it)) (RP up)))
( (VP (VB give) (RP up) (NP-OB1 (D the) (N house))))
( (PP (P in) (NP (QP (Q all)) (NP-POS (PRO$ my)) (N growing) (RP up))))

( (VP (VB set) (NP-OB1 (PRO it)) (RP down)))    ← bare, without PP
( (VP (VB set)
      (NP-OB1 (PRO it))
      (PP (ADVP (ADV right))    ← modified, with PP
          (RP down))))
Difficult cases. Sequences involving several adverbs, prepositions, and particles are a prime breeding ground for doubtful cases concerning dependency relations. They are resolved according to the general principles in Head or dependent? The following examples illustrate the principles in action.
ok stay in from the rain
* stayinfrom the rain
ok stay infrom the rain
( (VP (VB stay) (PP (RP in) (PP (P from) (NP (D the) (N rain))))))

ok keep out of trouble
* keepoutof trouble
ok keep outof trouble
( (VP (VB keep) (PP (RP out) (PP (P of) (NP (N trouble))))))
ok run on down the hill
ok runondown the hill
ok run ondown the hill
( (VP (VB run) (PP (RP on) (PP (P down) (NP (D the) (N hill))))))

ok jingle on home
ok jingleonhome
ok jingle onhome
( (VP (VB jingle) (PP (RP on) (ADVP (ADV home)))))

ok run on down through the forest
ok runondown through the forest
ok run ondown through the forest
ok ... on down through the forest
ok ... ondownthrough the forest
ok ... on downthrough the forest
( (VP (VB run) (PP (RP on) (PP (RP down) (PP (P through) (NP (D the) (N forest)))))))

ok the path on down through to home
ok the pathondown through to home
ok the path ondown through to home
ok ... on down through to home
ok ... ondownthrough to home
ok ... on downthrough to home
ok ... down through to home
ok ... downthroughto home
ok ... down throughto home
( (NP (D the) (N path) (PP (RP on) (PP (RP down) (PP (RP through) (PP (P to) (ADVP (ADV home))))))))
ok live on down the hill
ok liveondown the hill
* live ondown the hill (* on intended reading)
( (VP (VB live) (PP (PP (RP on)) (P down) (NP (D the) (N hill)))))

ok live on down in that area
ok liveondown in that area
* live ondown in that area (* on intended reading)
ok down in that area
ok downin that area
* downin that area (* on intended reading)
( (VP (VB live) (PP (PP (RP on)) (PP (RP down)) (P in) (NP (D that) (N area)))))
( (NP (QP (Q many)) (NS accidents))) ( (NP (QP (Q many)) (QP (QR more)) (NS accidents))) ( (NP (QP (ADVP (ADV overly)) (Q many)) (NS accidents))) ( (NP (QP (ADVP (NP-MSR (QP (Q all))) (ADVR too)) (Q many)) (NS accidents)))
( (VP (MD might) (VP (HV have) (VP (BEN been) (VP (BAG being) (VP (VAN built)))))))
Verbal modifiers attach as low as is consistent with the meaning.
A and ASP do not project their own VP, but attach as sisters of the adjacent verb.
( (VP (A a=) (VAG hunting))) ( (IP-MAT (NP (PRO They)) (VP (ASP done) (VBD told) (NP-OB2 (PRO me)))))
( (WADJP (WADVP (WADV how)) (ADJ beautiful)))
( (WADVP (WADVP (WADV how)) (ADV quickly)))
In wh- CPs where a trace has adverbial function and its silent antecedent could be WADVP or WNP, the default category is WADVP. This is true even if it gives rise to category mismatches across the CP.
( (IP-MAT (NP-SBJ (PRO I)) (VP (VBP remember) (NP-OB1 (D the) (ADJP (ADJ first)) (N time) (CP-REL (WADVP-1 (WADV 0)) (IP-SUB (NP-SBJ (PRO he)) (VP (ADVP-TMP *T*-1) (VBD came)))))) (PUNC .)))
( (WNP (WD which) (N terminal))) ( (WNP (WD what) (ADJP (ADJ quick)) (N service))) ( (WNP (WPRO what)))
The proper analysis of WHAT A + noun is not clear, but we annotate it as follows:
( (WNP (WNP (WPRO what)) (D a) (N nightmare)))
( (WNP (WNUMP (WQP (WADVP (WADV how)) (Q many)) (NUM thousand)) (NS feet)))
( (WPP (P at) (WNP (WD what) (N point)))) ( (WPP (ADVP (ADV just)) (P to) (WNP (WD what) (N extent)))) ( (WPP (P during) (WNP (WPRO which))))
( (WNP (WQP (WADVP (WADV how)) (Q many)) (NS people))) ( (WNP (WQP (WADVP (WADV how)) (Q much)) (N work)))
FRAG is used to label the following types of material:
Root VPs missing more than just the subject are treated as FRAG, and the elided material is not included in the main parse (though it may be indicated by an ELL[ipsis] comment).
( (FRAG (CODE {ELL:Do_you}) (VP (VB Wan@) (IP-INF (TO @na) (VP (VB come) (ADVP (ADV along))))) (PUNC ?)))
Finite VPs without an overt subject are treated as FRAG if the VP is the short answer to a preceding question.
( (CP-QUE-MAT (IP-SUB (BEP Are) (NP-SBJ (PRO you)) (VP (VAG coming))) (PUNC ?))) ( (FRAG (VP (ADVP (ADV Sure)) (BEP am)) (PUNC !)))
Otherwise, finite VPs without a subject are treated as IP with an empty subject.
( (IP-MAT (NP-SBJ (PRO I@)) (VP (MD @'ll) (VP (VB try))) (PUNC .))) ( (IP-MAT (NP-SBJ (PRO *pro*)) (VP (MD might) (NEG not) (VP (VB make) (NP-OB1 (PRO it)))) (PUNC ,) (ADVP (ADV though)) (PUNC .)))
Although the final P in its label is mnemonic for "phrase", QTP is not an endocentric category; in other words, QTP is not the projection of a QT head.
QTP encloses direct speech and can be a root node (see Direct speech for examples). The material dominated by QTP is annotated in the ordinary way (with the result that QTP is always unary-branching).
( (IP-MAT (NP-SBJ (PRO She)) (VP (VBD said) (PUNC ,) (QTP (IP-MAT (NP-SBJ (PRO I@)) (VP (BEP @m) (VP (VAG coming)))))) (PUNC .))) ( (IP-MAT (NP-SBJ (PRO She)) (VP (VBD said) (PUNC ,) (QTP (FRAG (ADVP (NEG Not) (ADVP (ADVR so)) (ADV fast))))) (PUNC .))) ( (IP-MAT (NP-SBJ (PRO She)) (VP (VBD said) (PUNC ,) (QTP (INTJ Hello))) (PUNC .)))
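The unary-branching property lends itself to a mechanical check. The Python sketch below is a hypothetical illustration, not an existing tool: `parse`, `IGNORED`, and `qtp_violations` are invented names. It assumes that punctuation and CODE nodes should be disregarded when counting the daughters of QTP, since a QTP elsewhere in this documentation dominates a clause followed by a PUNC node.

```python
import re

# Assumption: these daughters do not count toward branching.
IGNORED = {"PUNC", "CODE"}


def parse(text):
    """Parse one bracketed tree into nested (label, children) tuples.

    Words and traces are kept as plain strings inside the children list.
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1  # consume "("
        label = ""
        if tokens[pos] not in ("(", ")"):
            label = tokens[pos]
            pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume ")"
        return (label, children)

    return node()


def qtp_violations(node):
    """Return every QTP node that does not dominate exactly one
    daughter once punctuation and CODE nodes are set aside."""
    label, children = node
    daughters = [c for c in children if isinstance(c, tuple)]
    bad = []
    if label.split("-")[0] == "QTP":
        content = [c for c in daughters if c[0] not in IGNORED]
        if len(content) != 1:
            bad.append(node)
    for child in daughters:
        bad.extend(qtp_violations(child))
    return bad


ok = parse("( (IP-MAT (NP-SBJ (PRO She)) (VP (VBD said) (PUNC ,) (QTP (INTJ Hello))) (PUNC .)))")
print(len(qtp_violations(ok)))  # 0
```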
When not a root node, QTP generally functions as a complement of an
ordinary verb of saying, but it can also be a complement of BE or GO.
Especially in the case of BE, QTP is generally preceded by a quotative
marker (ALL, LIKE), which is tagged INTJ.
XX
See also FRAG.
XX is used in the following cases:
( (CP-REL (WNP-1 (WPRO 0)) (C 0) (IP-SUB (NP-SBJ *T*-1) (VP (VBD hit) (ADVP (ADVP (ADV right)) (RP in)) (ADVP (ADV agin)) (XX (X at) (PP (P of) (NP (PRO it))))))))
( (ADVP-DIR (ADV back) (CODE {unintelligible}) (XX (NP (NP (D the) (N day) (PP (P of) (NP (NP-POS (PRO$ my)) (N birth)))) (CONJP (CONJ and) (NP (QP (Q all)))))))) ( (IP-MAT (CODE {COM:break_in_tape}) (XX (CODE {inaudible}) (IP-INF (TO to) (VP (VB lose)))) (PUNC ,) (FS (PRO he) (ADV just) (PRO he) (FS di-) (PRO he) (DOD did@) (NEG @n't) (FS ha-)) (NP-SBJ (PRO he)) (VP (BED was@) (NEG @n't) (VP (VAN hurt) (NP-MSR (QP (Q none))) (ADVP (ADV thataway)))) (PUNC ,)))
( (NP (NP (LS (N a)) (D this) (N one)) (PUNC ,) (CONJP (NP (LS (N b)) (D that) (N one))) (PUNC ,) (CONJP (CONJ and) (NP (LS (N c)) (D the) (ADJP (ADJ other)) (N one))))) ( (NP (NP (LS (N number) (NUM one)) (D a) (N tent)) (PUNC ,) (CONJP (CONJ and) (NP (LS (N number) (NUM two)) (D a) (N flag)))))
Nonlinguistic material.
( (META {coughing}))
Paralinguistic material is enclosed in META brackets and integrated into the larger context in the ordinary way. META can, but need not, have a dominating phrasal node.
( (IP-MAT (CONJ and) (NP-SBJ (PRO I)) (VP (VBD said) (PUNC ,) (QTP (META C-A-P-E))) (PUNC .))) ( (FRAG (NP (NP (FP Just) (D the) (NS initials)) (PUNC ,) (CONJP (CONJ or) (NP (META E-L-V-I-E)))) (PUNC ?))) ( (IP-MAT (NP-SBJ (PRO They)) (VP (HVD had) (META (NUM one) (NUM two) (NUM three)) (NP-OB1 (NUMP (NUM four)) (NS children))) (PUNC .)))
Metalinguistic material (material that is mentioned rather than used in the ordinary way) is annotated as usual. Bare words mentioned as words are only given their part-of-speech tag and do not project a phrase. The material is then enclosed in META brackets, analogously to how direct speech is annotated and enclosed in QTP brackets. Depending on the conventions of individual projects, the META constituent may be redundantly set off by single quotes. The function of the META constituent in the larger context is annotated as usual. This differs from the treatment of QTP, which is not explicitly annotated as, say, the direct object of a verb of saying.
As mentioned, titles of books, songs, and so on, count as instances of mention (vs. use). Common nouns may be capitalized in accordance with standard orthographic convention in titles, but they are not tagged as proper nouns unless they would be so tagged outside of the title context.
( (IP-MAT (NP-SBJ (PRO I))
          (VP (DOP do@)
              (NEG @n't)
              (VP (VB use)
                  (NP-OB1 (D the)
                          (N word)
                          (PUNC ')
                          (ELAB (META (VB utilize))))))    ← no VP
          (PUNC ')
          (PUNC .)))

( (IP-MAT (NP-SBJ (NP-POS (PRO$ My))
                  (N copy)
                  (PP (P of)
                      (PUNC ')
                      (NP (META (NP (N Murder)    ← capitalized, but tagged as common noun
                                    (PP (P on)
                                        (NP (D the)
                                            (N-COMP (NPR Orient)    ← compound proper noun outside of title context
                                                    (NPR Express)))))))))
          (PUNC ')
          (VP (HVP has) (VP (VBN disappeared)))
          (PUNC .)))