Alpino parser

Utterances are parsed with the Alpino parser. The Alpino parser uses the Alpino grammar during parsing, and one has to know some major aspects of this grammar (and, when actually defining a query, every fine detail of the structures at hand).

Alpino uses a grammatical formalism inspired by Head-driven Phrase Structure Grammar (HPSG, [Pollard & Sag 1987, 1994]), and internally it uses Directed Acyclic Graphs (DAGs) as the data type for syntactic structures. However, Alpino outputs syntactic structures in accordance with conventions agreed upon in large Dutch treebank projects (the projects for the Spoken Dutch Corpus treebank and for the LASSY treebank). These conventions require trees as the data type for syntactic structures and impose several additional requirements. Alpino meets these requirements, but in some cases it retains properties of its own internal syntactic structures, especially when these are richer than the standard tree structures. We will point out some of these cases below.

Alpino syntactic structures

The syntactic structures that Alpino generates are trees, and they are encoded in XML in accordance with the alpino_ds DTD. The top-level element is alpino_ds (the root node), with directly below it:

  • parser (optional) for properties of the parser

  • node (obligatory): the syntactic structure

  • sentence (obligatory): the sentence parsed

  • metadata (optional) for metadata

  • comments (optional) for comments, usually generated by the Alpino parser

We use the following sentence to illustrate the main characteristics of Alpino syntactic structures:

Het slechte weer heeft al schade aangericht

Clicking on the sentence opens a web page where you can view its syntactic structure. The viewer used there is the GrETEL viewer, and it shows only a subset of the properties of the nodes in this syntactic structure. The full XML representation of this structure is:

<alpino_ds version="1.3">
  <node begin="0" cat="top" end="7" id="0" rel="top">
        <node begin="0" cat="smain" end="7" id="1" rel="--">
          <node begin="0" cat="np" end="3" id="2" index="1" rel="su">
                <node begin="0" end="1" frame="determiner(het,nwh,nmod,pro,nparg,wkpro)" id="3" infl="het" lcat="detp" lemma="het" lwtype="bep" naamval="stan" npagr="evon" pos="det" postag="LID(bep,stan,evon)" pt="lid" rel="det" root="het" sense="het" wh="nwh" word="Het"/>
                <node aform="base" begin="1" buiging="met-e" end="2" frame="adjective(e)" graad="basis" id="4" infl="e" lcat="ap" lemma="slecht" naamval="stan" pos="adj" positie="prenom" postag="ADJ(prenom,basis,met-e,stan)" pt="adj" rel="mod" root="slecht" sense="slecht" vform="adj" word="slechte"/>
                <node begin="2" end="3" frame="noun(het,mass,sg)" gen="het" genus="onz" getal="ev" graad="basis" id="5" lcat="np" lemma="weer" naamval="stan" ntype="soort" num="sg" pos="noun" postag="N(soort,ev,basis,onz,stan)" pt="n" rel="hd" rnum="sg" root="weer" sense="weer" word="weer"/>
          </node>
          <node begin="3" end="4" frame="verb(hebben,sg_heeft,aux_psp_hebben)" id="6" infl="sg_heeft" lcat="smain" lemma="hebben" pos="verb" postag="WW(pv,tgw,met-t)" pt="ww" pvagr="met-t" pvtijd="tgw" rel="hd" root="heb" sc="aux_psp_hebben" sense="heb" stype="declarative" tense="present" word="heeft" wvorm="pv"/>
          <node begin="0" cat="ppart" end="7" id="7" rel="vc">
                <node begin="0" end="3" id="8" index="1" rel="su"/>
                <node begin="4" end="5" frame="adverb" id="9" lcat="advp" lemma="al" pos="adv" postag="BW()" pt="bw" rel="mod" root="al" sense="al" word="al"/>
                <node begin="5" end="6" frame="noun(de,count,sg)" gen="de" genus="zijd" getal="ev" graad="basis" id="10" lcat="np" lemma="schade" naamval="stan" ntype="soort" num="sg" pos="noun" postag="N(soort,ev,basis,zijd,stan)" pt="n" rel="obj1" rnum="sg" root="schade" sense="schade" word="schade"/>
                <node begin="6" buiging="zonder" end="7" frame="verb(hebben,psp,ninv(transitive,part_transitive(aan)))" id="11" infl="psp" lcat="ppart" lemma="aan_richten" pos="verb" positie="vrij" postag="WW(vd,vrij,zonder)" pt="ww" rel="hd" root="richt_aan" sc="part_transitive(aan)" sense="richt_aan" word="aangericht" wvorm="vd"/>
          </node>
        </node>
  </node>
  <sentence>Het slechte weer heeft al schade aangericht</sentence>
  <comments>
        <comment>Q#ng1648721663|Het slechte weer heeft al schade aangericht|1|1|-6.54556328848</comment>
  </comments>
</alpino_ds>
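
As a minimal sketch of how such a file can be read, the following Python fragment parses a shortened copy of the example with the standard library module xml.etree.ElementTree and lists the elements directly below alpino_ds. Most attributes are left out for brevity, and the variable names are illustrative assumptions, not part of any Alpino tooling.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example structure; most attributes are omitted here.
EXAMPLE = """
<alpino_ds version="1.3">
  <node begin="0" cat="top" end="7" id="0" rel="top">
    <node begin="0" cat="smain" end="7" id="1" rel="--">
      <node begin="0" cat="np" end="3" id="2" index="1" rel="su">
        <node begin="0" end="1" id="3" lemma="het" pt="lid" rel="det" word="Het"/>
        <node begin="1" end="2" id="4" lemma="slecht" pt="adj" rel="mod" word="slechte"/>
        <node begin="2" end="3" id="5" lemma="weer" pt="n" rel="hd" word="weer"/>
      </node>
      <node begin="3" end="4" id="6" lemma="hebben" pt="ww" rel="hd" word="heeft"/>
      <node begin="0" cat="ppart" end="7" id="7" rel="vc">
        <node begin="0" end="3" id="8" index="1" rel="su"/>
        <node begin="4" end="5" id="9" lemma="al" pt="bw" rel="mod" word="al"/>
        <node begin="5" end="6" id="10" lemma="schade" pt="n" rel="obj1" word="schade"/>
        <node begin="6" end="7" id="11" lemma="aan_richten" pt="ww" rel="hd" word="aangericht"/>
      </node>
    </node>
  </node>
  <sentence>Het slechte weer heeft al schade aangericht</sentence>
</alpino_ds>
"""

root = ET.fromstring(EXAMPLE.strip())

# Elements directly below alpino_ds (here only the two obligatory ones).
top_level = [child.tag for child in root]
sentence = root.findtext("sentence")
top_node = root.find("node")

print(root.tag, top_level)
print(sentence)
print(top_node.get("cat"), top_node.get("rel"))
```

The optional elements parser, metadata and comments would show up in top_level in the same way when they are present in a file.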

The main characteristics of Alpino syntactic structures are:

  • The structures are constituent trees. So they contain nodes not only for words but also for phrases (constituents). In the example sentence, the phrases are the nodes labeled smain (declarative main clause), np (noun phrase), ppart (past participial phrase), and top. The phrases in these syntactic structures are phrases at an abstract level of analysis, not necessarily phrases at the surface level (see also below, under order).

  • Top node: each syntactic structure always has a node labeled top as its root.

  • Attribute-value pairs: Nodes have attribute-value pairs (feature name/feature value pairs), implemented as XML attribute-value pairs.

  • Nodes for words have an attribute pt (simple part of speech code), nodes for phrases have an attribute cat (syntactic category). In exceptional cases pt can be lacking in nodes for words, but then there is always the pos attribute (the Alpino-internal attribute for part of speech tag). Nodes for words also always have the attribute word (for the actual surface form of the word).

  • Order: the nodes in a tree occur in a certain order, but this order has no meaning. A tree with two nodes switched in order is thus equivalent to the original tree.

  • Surface order: the surface order of nodes and words is indicated by the attributes begin and end of a node, whose values are integers represented as strings. These attributes are not shown in the GrETEL tree viewer. The value of the begin attribute of the leftmost word is ‘0’, and the value of the end attribute of a word is always equal to str(int(begin) + 1), so it is ‘1’ for the first word. A node without a cat, pt or pos attribute but with an index attribute, informally called an ‘empty node’, has its begin and end attributes equal to the corresponding attributes of its antecedent (see below for more information about such nodes and their antecedents). Phrases also have begin and end attributes: the begin attribute of a phrase is equal to the smallest value of the begin attributes of its direct children, and its end attribute is equal to the largest value of the end attributes of its direct children.

    For the example sentence given above the values of the begin and end attributes can be represented in the easiest way as follows:

    • 0 het 1 slechte 2 weer 3 heeft 4 al 5 schade 6 aangericht 7

    where the number preceding a word is the value of its begin attribute, and the number following a word is the value of its end attribute.

    Note that nodes for two phrases N1 and N2 can have different relations with regard to linear order:

    • N1 can precede N2

    • N1 can follow N2

    • N1 can contain N2

    • N1 can be contained in N2

    • N1 can overlap with N2

  • If a phrase consists of just one word, no phrase node occurs. So, there is no advp (adverbial phrase) above the adverb al, no ap (adjectival phrase) above the adjective slechte, and no detp (determiner phrase) above the article (lid) het. This is due to one of the conventions agreed upon in the Dutch treebank projects. Internally, Alpino does have phrasal nodes in such cases, and Alpino retains some information about this in the syntactic structures (attribute lcat, see below).

  • Nodes have a label for a grammatical relation (as a value of the attribute rel):

    • e.g. su (subject) for the np het slechte weer, mod (modifier) for the modifiers slechte and al, obj1 (direct object) for schade, det (determiner) for the article het, vc (verbal complement) for the past participial phrase al schade aangericht, and hd (head) for heeft (head of the main clause) and aangericht (head of the participial clause). The smain node is labeled with the grammatical relation -- (two hyphens), and the top node has relation top.

    • Conceptually it is wrong to treat a grammatical relation as a property of a node (it should be a label of the edge), and in some cases this leads to more complex operations.

    • Because grammatical relations are made explicit by means of values for an attribute and are not encoded configurationally, the structures can be relatively flat.

    • We often use the notation rel/poscat to describe a node with relation rel and with pt or cat poscat (e.g. su/np, hd/ww).

    • For an overview of the relations that Alpino distinguishes, see https://paqu.let.rug.nl:8068/info.html#rel

  • Nodes can have a value for the attribute index. A node with relation rel, pt or cat poscat and index i is notated as follows in this document: rel / poscat:i

  • Certain words have multiple grammatical relations in the syntactic structure. In these cases, next to the normal node (which we will call the antecedent) one or more additional nodes are present with just an index and a grammatical relation (and begin and end attributes), but no other attributes, in particular not pt, pos, cat or word. These nodes are coindexed with the antecedent. In the example sentence, the phrase het slechte weer is the subject of heeft (see the dominating su/np:1 node) and the subject of aangericht, represented here by the additional su/:1 node under vc/ppart. These ‘empty nodes’ are used for cases in which a word or phrase plays multiple roles in a sentence, for example in constructions such as:

    • Control: ‘ik vroeg hem dat te doen’: hem object of vroeg and subject of dat te doen.

    • Subject to subject raising: ‘Het lijkt te regenen’: het subject of lijkt and of regenen.

    • Subject to object raising: ‘ik zag hem dat doen’: hem object of zag and subject of doen.

    • Passives (‘NP-movement’): ‘het huis werd geschilderd’: het huis subject of werd and object of geschilderd.

    • Wh-movement in questions, relative clauses etc. ‘Wat heeft hij gekocht’: wat head of the question and object of gekocht.

    • Ellipsis (e.g., heel zeldzaam en complex): heel a modifier of zeldzaam and of complex.

  • Auxiliary verbs are not distinguished from lexical verbs in Alpino. All are treated the same. An ‘auxiliary verb’ such as heeft in the example sentence therefore takes a participial phrase as a complement.

  • Words of a particular part of speech are often used as if they were of a different part of speech. Sometimes these words are conversions, i.e. they have actually changed their part of speech. In all such cases the pt attribute represents the original part of speech. The different use is sometimes indicated by a different attribute, as noted with each example. Many words can act as a word with a different pt, e.g.

    • infinitives as a noun: het lezen van boeken (pt=ww, positie=nom)

    • participles as an adjective: hij is erg opgewonden (pt=ww, often pos=adj)

    • prenominal participles, which are probably often both a verb and an adjective: de door de mensen gekochte spullen (pt=ww, positie=prenom)

    • adjectives as a noun: de zieke bleef thuis (pt=adj, positie=nom)

    • numerals as a noun: in 2022 (pt=tw, positie=vrij)

  • Alpino does not have an equivalent of what is called the complementive in the Syntax of Dutch [Broekhuis et al. 2015, 239]. What comes closest is:

    • phrases with grammatical relation ld for locative and directional complements

    • phrases with grammatical relation predc for predicative complements

  • Note that a traditional notion such as ‘gezegde’ (predicate) is not directly present in Alpino structures. However the W code in TARSP and the SGG code in STAP require this. One must construct these notions using a query.

  • Alpino will always produce a syntactic structure for an input string (unless it crashes, or stops because the string is too long). If it cannot connect all the constituents it has found into one structure by its normal rules, it puts them under the top node in a sequence with the grammatical relation dp (discourse part). In case of multiple options, it tends to make the earlier constituents as big as possible, which is not always good for SASTA, because false starts come first and should be as short as possible. Examples (square brackets enclose the first main clause found):

    • [toen heeft een mei een van de meisjes] [heeft mij opgevangen]

    • [dat lukte mij niet dus toen] heb ik uiteindelijk uh met de via de gang naar de voordeur gegaan

    • [ik weet niet hoe] ik bij thuis ben gekomen

  • Adverbs: Words that are traditionally classified as adverbs are either adjectives (adj) or adverbs (bw) in Alpino. The main rule is that an adverb that is also an adjective is treated as an adj; other adverbs are treated as bw. Adverbial pronouns (ervan, hierover, etc.) are also considered adverbs, and are treated as a single word in the grammar (and not as two words which happen to be written together). There is no special property for R-words. R-words can function as an adverb or as a pronoun, but they are always treated as pronouns (vnw).
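
The points on order and surface order above can be sketched in Python as follows. The fragment is a shortened copy of the example sentence in which the sibling nodes have deliberately been put in a different order, to show that only begin and end determine surface order. The function names span, surface_words and linear_relation are illustrative assumptions, not part of Alpino or GrETEL.

```python
import xml.etree.ElementTree as ET

# Shortened example with siblings shuffled: document order carries no meaning.
XML = """
<alpino_ds version="1.3">
  <node begin="0" cat="top" end="7" id="0" rel="top">
    <node begin="0" cat="smain" end="7" id="1" rel="--">
      <node begin="0" cat="ppart" end="7" id="7" rel="vc">
        <node begin="6" end="7" id="11" pt="ww" rel="hd" word="aangericht"/>
        <node begin="0" end="3" id="8" index="1" rel="su"/>
        <node begin="5" end="6" id="10" pt="n" rel="obj1" word="schade"/>
        <node begin="4" end="5" id="9" pt="bw" rel="mod" word="al"/>
      </node>
      <node begin="3" end="4" id="6" pt="ww" rel="hd" word="heeft"/>
      <node begin="0" cat="np" end="3" id="2" index="1" rel="su">
        <node begin="2" end="3" id="5" pt="n" rel="hd" word="weer"/>
        <node begin="0" end="1" id="3" pt="lid" rel="det" word="Het"/>
        <node begin="1" end="2" id="4" pt="adj" rel="mod" word="slechte"/>
      </node>
    </node>
  </node>
</alpino_ds>
"""

def span(node):
    """Return the (begin, end) positions of a node as integers."""
    return int(node.get("begin")), int(node.get("end"))

def surface_words(tree):
    """Return the words of a structure in surface order."""
    words = [n for n in tree.iter("node") if n.get("word") is not None]
    return [n.get("word") for n in sorted(words, key=span)]

def linear_relation(n1, n2):
    """Classify how the span of n1 relates to the span of n2.

    Identical spans are reported as 'contains'.
    """
    b1, e1 = span(n1)
    b2, e2 = span(n2)
    if e1 <= b2:
        return "precedes"
    if e2 <= b1:
        return "follows"
    if b1 <= b2 and e2 <= e1:
        return "contains"
    if b2 <= b1 and e1 <= e2:
        return "is contained in"
    return "overlaps"

root = ET.fromstring(XML.strip())
by_id = {n.get("id"): n for n in root.iter("node")}

print(" ".join(surface_words(root)))
print(linear_relation(by_id["2"], by_id["6"]))   # su/np versus hd/ww heeft
print(linear_relation(by_id["2"], by_id["7"]))   # su/np versus vc/ppart
```

Note that the ppart node spans positions 0 to 7 because its empty su node inherits the span of its antecedent, so the np counts as contained in the ppart even though the two are sisters in the tree.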

As stated before, not all these characteristics are due to Alpino. Alpino itself often yields slightly different structures, but the Alpino-structures are adapted to conform to the conventions agreed upon in the consortia that created the Spoken Dutch Corpus and the Lassy treebanks. Alpino syntactic structures have often kept information about the original Alpino structure. For example, in Alpino structures, single word phrases do have a phrasal node in the structure. The category of this node is indicated in the structures in the attribute lcat. For other examples of Alpino properties in the syntactic structures, see Alpino properties.
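
A small sketch of how the lcat attribute can be used to recover the category of a single-word phrase: the hypothetical helper phrasal_category returns cat for phrase nodes and falls back to lcat for word nodes. The node attributes are taken from the example sentence; the helper itself is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

def phrasal_category(node):
    """Return cat for phrase nodes, lcat for single-word phrases, else None."""
    return node.get("cat") or node.get("lcat")

# The subject np of the example sentence, reduced to the relevant attributes.
np = ET.fromstring(
    '<node cat="np" rel="su" begin="0" end="3">'
    '<node pt="lid" lcat="detp" rel="det" word="Het" begin="0" end="1"/>'
    '<node pt="adj" lcat="ap" rel="mod" word="slechte" begin="1" end="2"/>'
    '<node pt="n" lcat="np" rel="hd" word="weer" begin="2" end="3"/>'
    '</node>'
)

categories = [(n.get("word"), phrasal_category(n)) for n in np.iter("node")]
print(categories)
```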

Grammatical Properties

Nodes in structures generated by Alpino have properties encoded in the form of attribute value pairs. These properties can be divided into a number of categories:

  • General properties of nodes

  • General properties of nodes for words

  • D-COI properties

  • Phrase properties

  • Alpino properties

General properties of nodes

All nodes have the following attributes:

  • id: a unique identifier for that node within the current structure.

  • rel: the grammatical relation the node bears. Even the top node has this property. Conceptually, a grammatical relation is a property between nodes, either between a node for a word and a node for another word, or between a node for a word and its parent node. However, in Alpino it has been implemented as a property of a node. A full list of the possible values for this attribute and explanation of their interpretation can be found in https://paqu.let.rug.nl:8068/info.html#rel . A list of possible values is given here (taken from the module treebankfunctions.py):

    allrels = ['hdf', 'hd', 'cmp', 'sup', 'su', 'obj1', 'pobj1', 'obj2',
               'se', 'pc', 'vc', 'svp', 'predc', 'ld', 'me',
               'predm', 'obcomp', 'mod', 'body', 'det', 'app', 'whd',
               'rhd', 'cnj', 'crd', 'nucl', 'sat', 'tag', 'dp',
               'top', 'mwp', 'dlink', '--']
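
As an illustration, the following sketch checks the rel value of a node against the list above and renders a node in the rel/poscat:i notation used in this document. The function names has_known_rel and notation are assumptions made for this example; they are not claimed to exist in treebankfunctions.py.

```python
import xml.etree.ElementTree as ET

# Copy of the list of relations given above.
ALLRELS = ['hdf', 'hd', 'cmp', 'sup', 'su', 'obj1', 'pobj1', 'obj2',
           'se', 'pc', 'vc', 'svp', 'predc', 'ld', 'me',
           'predm', 'obcomp', 'mod', 'body', 'det', 'app', 'whd',
           'rhd', 'cnj', 'crd', 'nucl', 'sat', 'tag', 'dp',
           'top', 'mwp', 'dlink', '--']

def has_known_rel(node):
    """Check that the rel attribute of a node is one of the listed relations."""
    return node.get("rel") in ALLRELS

def notation(node):
    """Render a node as rel/poscat, or rel/poscat:i when it has an index."""
    poscat = node.get("cat") or node.get("pt") or ""
    text = f"{node.get('rel')}/{poscat}"
    index = node.get("index")
    return f"{text}:{index}" if index is not None else text

subject = ET.Element("node", rel="su", cat="np", index="1")
empty = ET.Element("node", rel="su", index="1")
head = ET.Element("node", rel="hd", pt="ww", word="heeft")

for node in (subject, empty, head):
    print(notation(node), has_known_rel(node))
```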
    

All nodes have the attributes begin and end, and they can (but do not have to) have the attribute index:

  • index: an identifier to relate one node to another node. Indexes are present on “empty” nodes (see above) and their antecedent to accommodate phrases and words that play multiple roles in a sentence.

  • begin: to indicate the begin surface position of the node

  • end: to indicate the end surface position of the node
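
The use of index can be sketched as follows: the code looks up the antecedent of an ‘empty node’ and reads the words it covers. It works on a shortened copy of the example sentence; is_empty_node, antecedents and resolve are hypothetical helper names chosen for this illustration.

```python
import xml.etree.ElementTree as ET

XML = """
<node begin="0" cat="smain" end="7" id="1" rel="--">
  <node begin="0" cat="np" end="3" id="2" index="1" rel="su">
    <node begin="0" end="1" id="3" pt="lid" rel="det" word="Het"/>
    <node begin="1" end="2" id="4" pt="adj" rel="mod" word="slechte"/>
    <node begin="2" end="3" id="5" pt="n" rel="hd" word="weer"/>
  </node>
  <node begin="3" end="4" id="6" pt="ww" rel="hd" word="heeft"/>
  <node begin="0" cat="ppart" end="7" id="7" rel="vc">
    <node begin="0" end="3" id="8" index="1" rel="su"/>
    <node begin="6" end="7" id="11" pt="ww" rel="hd" word="aangericht"/>
  </node>
</node>
"""

def is_empty_node(node):
    """An 'empty node' has an index but no cat, pt or pos attribute."""
    return node.get("index") is not None and not any(
        node.get(att) is not None for att in ("cat", "pt", "pos")
    )

def antecedents(tree):
    """Map each index value to the non-empty node that carries it."""
    return {
        n.get("index"): n
        for n in tree.iter("node")
        if n.get("index") is not None and not is_empty_node(n)
    }

def resolve(node, table):
    """Return the antecedent of an empty node, or the node itself otherwise."""
    return table[node.get("index")] if is_empty_node(node) else node

smain = ET.fromstring(XML.strip())
table = antecedents(smain)

# The subject of the participial phrase is an empty node coindexed with the np.
ppart = smain.find(".//node[@cat='ppart']")
ppart_subject = ppart.find("node[@rel='su']")
resolved = resolve(ppart_subject, table)
subject_words = [n.get("word") for n in resolved.iter("node") if n.get("word")]
print(ppart_subject.get("id"), "->", resolved.get("id"), subject_words)
```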

General properties of nodes for words

The general properties for nodes for words are:

  • lemma : for the lemma of the word occurrence

  • word: for the actual word form of the word occurrence. This retains case, accents and other diacritics

In general, almost all conditions in queries must be formulated in terms of the attribute lemma in order to take into account different case variants (Een, EEN, een), different accent variants (héél, heel), repeated vowels (heeeeeel, heel) and reduced and emphatic variants (ik, ‘k, k, ikke). If one really is interested in a particular word form, one has to deal with case and diacritic variants oneself.
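
The difference between conditions on lemma and conditions on word can be sketched as follows. The nodes below are hand-written for illustration (they are not actual Alpino output), and the helper normalise is an assumption showing one possible way to deal with case and diacritic variants oneself.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Hand-written nodes with only the attributes needed for this illustration.
TREE = ET.fromstring(
    '<node cat="top" rel="top">'
    '<node lemma="een" word="Een"/>'
    '<node lemma="een" word="EEN"/>'
    '<node lemma="een" word="een"/>'
    '<node lemma="heel" word="héél"/>'
    '<node lemma="heel" word="Heel"/>'
    '</node>'
)

def normalise(word):
    """Lower-case a word form and strip combining diacritics."""
    decomposed = unicodedata.normalize("NFD", word.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# A condition on lemma covers all variants; a condition on word only one.
by_lemma = TREE.findall(".//node[@lemma='een']")
by_word = TREE.findall(".//node[@word='een']")

# Matching on word forms requires explicit normalisation.
heel_forms = [
    n.get("word")
    for n in TREE.iter("node")
    if n.get("word") is not None and normalise(n.get("word")) == "heel"
]

print(len(by_lemma), len(by_word), heel_forms)
```

This normalisation handles case and accents only; variants with repeated vowels or reduced forms would need additional treatment.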

D-COI properties

The grammatical properties for words follow the conventions of the D-COI postags as described in [Van Eynde 2005].

[Van Eynde 2005: 72] gives the following list of what he calls ‘partitions’.

  • [P01] TOKENTYPE = woord, speciaal, leesteken

  • [P02] POS = substantief, adjectief, werkwoord, telwoord, voornaamwoord, lidwoord, voorzetsel, voegwoord, bijwoord, tussenwerpsel.

  • [P03] NTYPE = soortnaam, eigennaam.

  • [P04] GETAL = getal (enkelvoud, meervoud).

  • [P05] GRAAD = basis, comparatief, superlatief, diminutief.

  • [P06] GENUS = genus (zijdig (masculien, feminien), onzijdig).

  • [P07] NAAMVAL = standaard (nominatief, oblique), bijzonder (genitief, datief).

  • [P08] POSITIE = prenominaal, nominaal, postnominaal, vrij.

  • [P09] BUIGING = zonder, met-e, met-s.

  • [P10] GETAL-N = zonder-n, meervoud-n.

  • [P11] WVORM = persoonsvorm, buigbaar (infinitief, onvdw, voltdw).

  • [P12] PVTIJD = tegenwoordig, verleden, conjunctief.

  • [P13] PVAGR = enkelvoud, meervoud, met-t.

  • [P14] NUMTYPE = hoofdtelwoord, rangtelwoord.

  • [P15] VWTYPE = pr (persoonlijk, reflexief), reciprook, bezittelijk, vb (vragend, betrekkelijk), exclamatief, aanwijzend, onbepaald.

  • [P16] PDTYPE = pronomen (adv-pronomen), determiner (gradeerbaar).

  • [P17] PERSOON = persoon (1, 2 (2v, 2b), 3 (3p (3m, 3v), 3o)).

  • [P18] STATUS = vol, gereduceerd, nadruk.

  • [P19] NPAGR = agr (evon, rest (evz, mv)), agr3 (evmo, rest3 (evf, mv)).

  • [P20] LWTYPE = bepaald, onbepaald.

  • [P21] VZTYPE = initieel (versmolten), finaal.

  • [P22] CONJTYPE = nevenschikkend, onderschikkend.

  • [P23] SPECTYPE = afgebroken, onverstaanbaar, vreemd, deeleigen, meta, commentaar, achtergrond, afkorting, symbool.

The notation V(v1, …, vn) here means that V is a supertype of v1, …, vn.

[Van Eynde 2005: 75-87] also provides a full list of the 320 different tags with examples. These tags take the form of a string, which has some internal (but complicated) structure. The actual values that occur are short (often abbreviated) versions of the values one sees in the partitions. An example tag is N(soort,ev,basis,zijd,stan). The attribute postag is used to store the tags in Alpino nodes.

Each individual value of this complex postag value is also stored in a separate attribute. This is a list of the attribute names, and for each the list of possible values they allow:

attvals = [  ('pt', ['adj', 'bw', 'let', 'lid', 'mwu', 'n',  'spec',
              'tsw', 'tw', 'vg', 'vnw', 'vz', 'ww']),
             ('wvorm', ['buigbaar', 'inf', 'od', 'pv', 'vd' ]),
             ('pvagr', ['ev', 'met-t', 'mv']),
             ('pvtijd', ['conj', 'tgw', 'verl']),
             ('positie', ['prenom', 'nom', 'vrij']),
             ('buiging', ['zonder', 'met-e']),
             ('getal-n', ['zonder-n', 'mv-n']),
             ('ntype', ['soort', 'eigen']),
             ('getal', ['getal', 'ev', 'mv']),
             ('graad', ['basis', 'comp', 'sup', 'dim']),
             ('genus', ['genus', 'zijd', 'masc', 'fem', 'onz']),
             ('naamval', ['stan', 'nomin', 'obl', 'bijz', 'gen', 'dat']),
             ('numtype', ['hoofd', 'rang']),
             ('vwtype', ['pr', 'pers', 'refl', 'recip', 'bez',
                         'vb', 'vrag', 'betr', 'excl', 'aanw', 'onbep']),
             ('pdtype', ['pron', 'adv-pron', 'det', 'grad']),
             ('persoon', ['1', '2', '2v', '2b', '3', '3p', '3m', '3v', '3o']),
             ('stat', ['vol', 'red', 'nadr']),
             ('npagr', ['agr', 'evon', 'rest', 'evz', 'mv',
                        'agr3',  'evmo', 'rest3', 'evf', 'mv']),
             ('lwtype', ['bep', 'onbep']),
             ('vztype', ['init', 'versm', 'fin']),
             ('conjtype', ['neven', 'onder']),
             ('spectype', ['afgebr', 'onverst', 'vreemd',
                           'deeleigen', 'meta', 'comment', 'achter', 'afk', 'symb'])
          ]

Note that the attribute name for (bare) part of speech tag is pt.

In principle, each node for a word has a pt attribute, but there are a few exceptions, in cases where Alpino cannot assign any value to the pt attribute. The attribute postag will then have the value NA(), which is not an officially valid value in the D-COI tags.
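
The relation between the postag string and the separate attributes can be sketched as follows. The code splits a tag of the flat form shown above into its head and its values, and checks that each value also occurs as the value of some other attribute of the node. The names parse_postag and agrees_with_attributes are hypothetical, and deriving pt by lower-casing the head of the tag is an assumption that holds for the nodes of the example sentence.

```python
import re

# A tag consists of an upper-case head and a comma-separated list of values.
POSTAG = re.compile(r"^(?P<head>[A-Z]+)\((?P<values>[^()]*)\)$")

def parse_postag(postag):
    """Split a tag such as 'N(soort,ev,basis,zijd,stan)' into head and values."""
    match = POSTAG.match(postag)
    if match is None:
        raise ValueError(f"not a well-formed postag: {postag!r}")
    values = [v for v in match.group("values").split(",") if v]
    return match.group("head"), values

def agrees_with_attributes(attrs):
    """Check that pt and the other attributes reflect the postag of a node."""
    head, values = parse_postag(attrs["postag"])
    others = {v for k, v in attrs.items() if k != "postag"}
    return attrs.get("pt") == head.lower() and all(v in others for v in values)

# Attributes copied from the node for 'heeft' in the example sentence.
heeft = {"postag": "WW(pv,tgw,met-t)", "pt": "ww",
         "wvorm": "pv", "pvtijd": "tgw", "pvagr": "met-t"}
# A node without pt, as described above.
unknown = {"postag": "NA()", "pos": "verb"}

print(parse_postag("N(soort,ev,basis,zijd,stan)"))
print(parse_postag("BW()"))
print(agrees_with_attributes(heeft), agrees_with_attributes(unknown))
```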

Phrase properties

Phrases have the property cat:

  • cat: syntactic category of the phrase. Possible values are:

    allcats = ['smain', 'np', 'ppart', 'ppres', 'pp', 'ssub', 'inf', 'cp', 'du',
               'ap', 'advp', 'ti', 'rel', 'whrel','whsub', 'conj', 'whq', 'oti',
               'ahi', 'detp', 'sv1', 'svan', 'mwu', 'top', 'cat', 'part']
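
Using the attributes discussed so far, nodes can be classified as phrase nodes, word nodes or empty nodes. The following is a minimal sketch under the assumptions stated in this document (phrases have cat, words have pt or at least pos, empty nodes have only an index); the function node_kind is a hypothetical helper.

```python
from collections import Counter
import xml.etree.ElementTree as ET

def node_kind(node):
    """Classify a node as 'phrase', 'word', 'empty' or 'unknown'."""
    if node.get("cat") is not None:
        return "phrase"
    if node.get("pt") is not None or node.get("pos") is not None:
        return "word"
    if node.get("index") is not None:
        return "empty"
    return "unknown"

# The participial phrase of the example sentence, reduced to a few attributes.
ppart = ET.fromstring(
    '<node cat="ppart" rel="vc">'
    '<node index="1" rel="su"/>'
    '<node pt="bw" rel="mod" word="al"/>'
    '<node pt="n" rel="obj1" word="schade"/>'
    '<node pt="ww" rel="hd" word="aangericht"/>'
    '</node>'
)

kinds = Counter(node_kind(n) for n in ppart.iter("node"))
print(dict(kinds))
```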
    

Alpino properties

Alpino retains Alpino properties in automatically parsed syntactic structures. One needs these only rarely. Some that have been used so far are frame, lcat, special, and stype.

This is a list of the Alpino attributes and an indication of the possible values (derived by querying the automatically parsed Van Kampen corpus in GrETEL and for some attributes the automatically parsed Lassy-Groot in PaQu):

  • aform: base, compar, super

  • case: both, dat_acc, gen, no_obl, nom, obl

  • comparative: als, dan, e_als

  • def: def, indef

  • frame: more than 2400 different values for frame.

  • gen: both, de, het, sg

  • iets: whether an adjective in the s-form can co-occur with iets: true (or absent)

  • infl: at least 40 different values to indicate the inflectional properties of a word

  • lcat: value taken from cat for the category of the phrasal node for a single word phrase

  • neclass: named entity class: LOC, MISC, ORG, PER

  • num: bare_meas, both, de, meas, pl, sg

  • per: for person, mainly occurring in pronouns, with values such as fir, inv, je, thi, u, u_thi

  • pos: Alpino-internal attribute for part of speech. Values: , adj, adv, comp, comparative, det, fixed, name, noun, num, part, pp, prep, pron, punct, tag, verb, vg.

  • pron: only has the value true, and is present on possessive pronouns and genitive nouns (mama’s huis)

  • refl: only has the value refl, and is present on reflexive pronouns (without zelf)

  • rnum: Values are sg and pl, usage not fully clear to me.

  • root: in most cases equal to the lemma, but not in the case of diminutives (suffix _DIM added to the lemma), verbs (equal to stem, plus separable prefix if any, separated by underscore).

  • sc: subcategorisation patterns, over 450 different values

  • sense: often equal to root, but adds e.g. a particular preposition if this yields a different sense (e.g. klaar-met, kapot-van, zich-trek_aan-van)

  • special: a_noun, aanhaal_both, aanhaal_links, aanhaal_rechts, anders, cleft_het, comp, dir, dubb_punt, eenmaal, enumeration, er, er_loc, ge_v_noun, gen, het, hoe, hoofd, iets, intensifier, komma, left, loc, me_intensifier, meas_mod, mod, name, np, nparg, num_predm, post, post_n_n, post_wh, postadj, postadv, postlocadv, postn, postnp, postp, pre_det_quant, pre_num_adv, predm, punt, rang, sentence, strpro, tmp, uitroep, v_noun, vraag, waar, wkpro.

  • status: different forms of a word: vol, nadr, red

  • stype: describes the sentence type in an attribute of the verb: declarative, imparative [sic!], topic_drop, whquestion, ynquestion

  • tense: for the tense of verbs: present, past

  • vform: gerund for present participles, psp for past participles. For other words the value appears to be adj in all cases

  • wh: whether a word is a wh-word or not. Values are nwh, rel (e.g. dat, die), rwh (welk, wiens), wh (wat), and ywh (wie, waarom). The distinction between wh and ywh is not clear.

  • wk: Value: yes, for weak variants of words, e.g. es instead of eens.

Some of these are explained in https://urd2.let.rug.nl/~vannoord/DCOI/AnnotationGuide.html
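
Because stype is an attribute of the verb and not of the clause, finding the sentence type of a clause requires looking at its head. A minimal sketch, using attribute values from the example sentence and a hypothetical helper sentence_type:

```python
import xml.etree.ElementTree as ET

def sentence_type(clause):
    """Return the stype of the head (rel='hd') child of a clause node, if any."""
    for child in clause:
        if child.get("rel") == "hd" and child.get("stype") is not None:
            return child.get("stype")
    return None

# The main clause of the example sentence, reduced to a few attributes.
smain = ET.fromstring(
    '<node cat="smain" rel="--">'
    '<node cat="np" rel="su" index="1"/>'
    '<node pt="ww" rel="hd" word="heeft" lemma="hebben" root="heb"'
    ' stype="declarative" tense="present"/>'
    '<node cat="ppart" rel="vc">'
    '<node pt="ww" rel="hd" word="aangericht" lemma="aan_richten" root="richt_aan"/>'
    '</node>'
    '</node>'
)

ppart = smain.find("node[@cat='ppart']")
print(sentence_type(smain), sentence_type(ppart))
```

In this example only the finite verb carries stype, so the function returns None for the participial phrase.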

Clauses in Alpino

Finite clauses can have any of the following values for the attribute cat:

  • smain: for main declarative clauses where the finite verb is not initial. e.g. ik weet dat niet. Main clauses with topicalised phrases (e.g. dat weet ik niet) also have the category smain, and do not differ from clauses that have no topicalised phrases except by the order of the words (indicated by means of the begin and end attributes).

  • whq: for main clause wh-questions, e.g. hoe doe je dat dan. The whq node contains a wh-phrase or word with relation whd (hoe) and an sv1 node with relation body (doe je dat dan)

  • whsub: for subordinate wh-questions, e.g. (weet jij) waar dat was. The whsub node contains a wh-phrase or word with relation whd (waar) and an ssub node with relation body (dat was)

  • sv1: for finite clauses with an initial finite verb. sv1 clauses can be of many different types:

    • main clause yes-no question, e.g., heb je geen telefoon bij je?

    • main clause imperative, e.g., kom hier

    • main clause declarative clause with topic drop: weet ik niet meer (dat omitted)

    • body part of a whq phrase (see above), e.g., hoe doe je dat dan

    • main clause wh-question with an omitted wh-phrase, e.g. is dat? (wat omitted), is ie nou? (waar omitted)

  • cp: for subordinate clauses introduced by a subordinate conjunction, e.g. dan zei ik dat ik kan vliegen, toen ik klaar was gingen we naar oma. The cp contains the conjunction with relation cmp and an ssub clause with relation body. Note that cp is also used for nonclausal expressions introduced by a subordinate conjunction, e.g. net als je grote broer

  • rel: for relative clauses introduced by a relative pronoun or phrase, e.g. een jongen die ook Maria heet, de man wiens vrouw ziek is. A rel clause consists of a relative pronoun or phrase with relation rhd and a body part of category ssub. Note that main clauses that start with a pronoun that can be a relative pronoun (die, dat) are sometimes incorrectly analysed as involving a relative clause (e.g. die zijn van mama)

  • whrel: for relatives introduced by a wh-pronoun, including free relatives, e.g., ik versta niet wat je allemaal zegt, het park waar ik wandel. A whrel clause consists of a relative pronoun with relation rhd and a body part of category ssub. Alpino cannot always correctly distinguish whrel clauses from subordinate wh-questions.

  • svan: for clauses (and other phrase types) introduced by van, e.g. zegt van ja kom jij eens mee

  • ssub: the body part of various types of clauses:

    • body part of a whsub clause, e.g., weet jij waar dat was

    • body part of a cp clause, e.g., toen ik klaar was

    • body part of a rel clause, e.g. een jongen die ook Maria heet

    • body part of a whrel clause, e.g. ik versta niet wat je allemaal zegt

See also https://rug-compling.github.io/dact/cookbook/#sentence-types
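
The composition of the finite clause types that consist of an introducing element plus a body can be summarised in a small table and checked on a node. This is a sketch only: the table CLAUSE_PARTS restates the descriptions above, the function has_expected_parts is hypothetical, and the tree is a hand-written skeleton, not actual Alpino output. Since cp is also used for nonclausal expressions, a negative result for a cp node does not necessarily indicate a parse error.

```python
import xml.etree.ElementTree as ET

# cat of the clause -> (relation of the introducing element, cat of the body)
CLAUSE_PARTS = {
    "whq": ("whd", "sv1"),
    "whsub": ("whd", "ssub"),
    "cp": ("cmp", "ssub"),
    "rel": ("rhd", "ssub"),
    "whrel": ("rhd", "ssub"),
}

def has_expected_parts(clause):
    """Check a clause node against CLAUSE_PARTS; None if its cat is not listed."""
    expected = CLAUSE_PARTS.get(clause.get("cat"))
    if expected is None:
        return None
    intro_rel, body_cat = expected
    children = {child.get("rel"): child for child in clause}
    body = children.get("body")
    return intro_rel in children and body is not None and body.get("cat") == body_cat

# Hand-written skeleton for 'hoe doe je dat dan'.
whq = ET.fromstring(
    '<node cat="whq" rel="--">'
    '<node rel="whd" word="hoe"/>'
    '<node rel="body" cat="sv1"/>'
    '</node>'
)
smain = ET.Element("node", cat="smain", rel="--")

print(has_expected_parts(whq), has_expected_parts(smain))
```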

Nonfinite clauses can have any of the following values for the attribute cat:

  • inf: for bare infinitival phrases: e.g., hij wilde een boek lezen. Infinitival phrases as a whole utterance are usually analysed as an NP with a substantivised infinitive.

  • ti: for infinitival phrases introduced by te, e.g. hij heeft geprobeerd een boek te lezen, even when the phrase is discontinuous as in hij heeft een boek proberen/geprobeerd te lezen. Such phrases consist of the adposition te (pt=vz) with relation cmp and a body clause with cat=inf.

  • oti: for infinitival phrases introduced by om and te, e.g. hij heeft geprobeerd om een boek te lezen. Such phrases consist of the adposition om (pt=vz) with relation cmp and a body clause with cat=ti.

  • ahi: for infinitival phrases introduced by aan het, e.g. Hij is een boek aan het lezen. Such phrases consist of the multiword unit (mwu) aan het with relation cmp and a body clause with cat=inf.

  • ppart: for past participle phrases: hij heeft een boek gelezen, door mensen gekochte spullen

  • ppres: for present participle phrases: goed werkende praktijkvoorbeelden, uitgaande van de beschikbare gegevens …, deze processen-verbaal zijn geldend tot het bewijs van het tegendeel.
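
The nesting of oti, ti and ahi phrases described above can be sketched as follows: the code peels off the cmp elements one by one until it reaches the innermost body. The trees are hand-written skeletons (not actual Alpino output, with mwp as the assumed relation of the parts of the multiword unit), and the functions words_of and unwrap are hypothetical helpers.

```python
import xml.etree.ElementTree as ET

def words_of(node):
    """Join the word attributes found in a node and its descendants."""
    return " ".join(
        n.get("word") for n in node.iter("node") if n.get("word") is not None
    )

def unwrap(node):
    """Follow cmp/body pairs from an oti, ti or ahi node to the innermost body."""
    chain = []
    while node is not None and node.get("cat") in ("oti", "ti", "ahi"):
        cmp_node = node.find("node[@rel='cmp']")
        chain.append((node.get("cat"), words_of(cmp_node) if cmp_node is not None else ""))
        node = node.find("node[@rel='body']")
    return chain, node

# Skeleton for 'om een boek te lezen': oti contains ti, which contains inf.
oti = ET.fromstring(
    '<node cat="oti">'
    '<node rel="cmp" pt="vz" word="om"/>'
    '<node rel="body" cat="ti">'
    '<node rel="cmp" pt="vz" word="te"/>'
    '<node rel="body" cat="inf"><node rel="hd" pt="ww" word="lezen"/></node>'
    '</node>'
    '</node>'
)

# Skeleton for 'aan het lezen': the cmp is a multiword unit.
ahi = ET.fromstring(
    '<node cat="ahi">'
    '<node rel="cmp" cat="mwu">'
    '<node rel="mwp" word="aan"/><node rel="mwp" word="het"/>'
    '</node>'
    '<node rel="body" cat="inf"><node rel="hd" pt="ww" word="lezen"/></node>'
    '</node>'
)

for phrase in (oti, ahi):
    chain, body = unwrap(phrase)
    print(chain, body.get("cat"))
```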