Arabic Treebank Guidelines


A general description of basic clause structure in the Penn Arabic Treebank can be found here. A description of the notation can be found here.

An example of a nice short simple sentence:


1. Noun Phrase Structure

NP example:

Complements

Complements are genitive or possessive, or are arguments of the verbal form of the noun; they are obligatory.

The argument/adjunct distinction is shown structurally inside NPs. Argument/complement constituents are children of NP, sister to the head noun: (NP head (NP argument)).

Arguments are genitive, possessive, or (for deverbal head nouns) constituents that would be arguments of the verb that the noun derived from.

NP with NP argument -- the NP argument (NP maHal~) "(of) place" is a sister of the head noun SAHib "owner" itself:

Some more examples:

madiynap luwnog byt$ "city (of) Long Beach" and
wilAyap kAliyfuwroniyA "state (of) California"

NP with a long string of complement NPs: makAn tawAjad qiyAdap >arokAn waHadAt wizArap Al+dAxiliy~ap "place (of) existence (of) leaders (of) general staff (of) units (of) interior ministry"

(NP dawlit (NP miSr))
(NP track (NP Salzburg))
(NP maTar (NP New York))
statement that 715-2-7 (14)
715-2-7 (NP speaking (PP in the name of (NP someone))) -- (NP Al-mutaHad~ivi (PP bi->ismi (NP quw~Ati wixArati))...

Determiners, Quantifiers, and other pre-nominal modification

Flat NP.
(NP any agreement) 715-7-4 (26-27)
any land 715-15-4 (18-19)
(NP this book)
(NP five people) 715-11-1 (18-19)
(NP all books)
(NP some books)
715-1-2
715-7-2 (24-26)
715-16-2 (60-62) third cup

Quantifiers

We make the distinction between quantifiers acting as true quantifiers and quantifiers acting as NPs. True quantifiers are flat, as above: (NP many schools). However, when the quantifier is acting as a noun, it is given its own NP label: (NP (NP one) (NP schools)).

Examples:
715-6-1 (24-27)

Some tests for making this distinction in Arabic:
might be case? singular vs plural? definiteness?

Note: ahad is a noun, not a quantifier.

Adjuncts

Adjuncts are descriptive; they are not possessive and not obligatory.

Adjunct constituents are sister to the NP that contains the head noun, child of the NP that contains both: (NP (NP head) (NP adjunct)). For the most part, we do not distinguish among levels or "scope" of modification -- all adjuncts are at the same level, sisters of the head NP.
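The structural contrast between complements and adjuncts can be read directly off a bracketed tree. The following is a minimal sketch only, not part of the treebank tooling: the helper names `parse` and `np_dependents` are hypothetical, and the test "does the NP contain a bare head word among its children" is a simplifying assumption that ignores traces and punctuation.

```python
def parse(text):
    """Parse one bracketed tree into (label, children); leaves are plain strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                      # skip ")"
        return (label, children)

    return node()


def np_dependents(tree):
    """Classify the phrasal children of an NP as complements or adjuncts."""
    label, children = tree
    phrases = [c for c in children if isinstance(c, tuple)]
    has_bare_head = any(isinstance(c, str) for c in children)
    if has_bare_head:
        # (NP head (NP argument)): phrases are sisters of the head word itself.
        return {"complements": [p[0] for p in phrases], "adjuncts": []}
    # (NP (NP head) (XP adjunct)): the first NP holds the head, the rest adjoin.
    return {"complements": [], "adjuncts": [p[0] for p in phrases[1:]]}


print(np_dependents(parse("(NP SAHib (NP maHal~))")))
print(np_dependents(parse("(NP (NP wikalap) (NP Frans Pres))")))
```

A flat NP such as (NP President Clinton) has no phrasal children, so both lists come back empty.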

NP with PP adjunct -- the NP containing the head noun (NP Al+mu$ar~adi+iyona) "the homeless" and the PP adjunct (PP-LOC fiy...) "in..." are sisters, both children of a containing NP:

Some more examples:
(NP (NP sarikap=company) (NP Greyhound)) 715-1-1
(NP (NP wikalap=agency) (NP Frans Pres))
(NP (NP maTar=airport) (NP JFK))
(NP (NP qanAt) (NP ?aljaziira))
(NP (NP jari:dat) (NP >al>akAm))
agency itar tass 715-2-9
reflexive 715-6-3 (51-53)
(NP (NP the algerian/ADJ) (NP name)) in spite of adj 715-17-1 (7-10)

Names in apposition

Names in apposition are the exception to the 'all adjuncts on the same level' rule. The whole NP prior to the appositive name is annotated as usual, but the appositive name is an adjunct to that full NP, which is to say, there is an extra NP level: (NP (NP (NP head noun) (PP pp adjunct)) (NP appositive name))

Examples:
1015-35-3 (8-12)

Here is a more complex example, where the head noun (ra}iys president) has a complement (Al+wuzarA' the ministers), a modifying adjective (Al+<isorA}iyliy~ Israeli), and a name in apposition (<iyhuwd bArAk Ehud Barak), which is adjoined to the entire NP:

Flat

Determiners, quantifiers:
(NP Three books)
(NP This book)
(NP Any books)
715-1-2

Titles preceding the name of a person are flat:

Al+malik Ebd All~ah Al_vAniy "the king Ebd Allah next"

(NP President Clinton) 715-1-1???
(NP President Mubarak)
(NP Colonel Smith)

Single word noun with a single word adjective:
(NP the-book the-red)
(NP minister Egyptian)

Numbers

Flat, or QP (Quantity Phrase).

QP (Quantity Phrase) is used when a multi-word number precedes a noun. Single-word numbers preceding a noun are flat.

In this example, "52 thousand" is a multi-word number preceding the noun "dollar", so it is a QP.
52 >alof duwlAr "52 thousand dollar"

In this example, "more than 1600" is treated as a complex number, a QP, preceding the head noun "farm".
>akovar min 1600 mazoraEap "more than 1600 farm(s)"

Again, "approximately twenty" is treated as a complex number, a QP.
HawAlaY Ei$oriyona ziyArap "approximately twenty visit(s)"
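The single-word versus multi-word rule can be sketched as a small helper. This is an illustration only: the function name `number_np` is hypothetical, and placing the QP as a child of the NP, sister to the noun, is an assumption carried over from Penn English Treebank practice rather than something the examples above spell out.

```python
def number_np(number_words, noun):
    """Bracket a number followed by a noun: flat if one word, QP if several."""
    number = " ".join(number_words)
    if len(number_words) > 1:
        # Multi-word number: wrap it in a QP (assumed to sit inside the NP).
        return f"(NP (QP {number}) {noun})"
    # Single-word number: flat NP, no QP.
    return f"(NP {number} {noun})"


print(number_np(["52", ">alof"], "duwlAr"))
print(number_np(["three"], "books"))
```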

(NP three books) flat NP, no QP
715-1-1 middle
3 or 4 days 715-7-4 (15-19)
more than 3000 wounded 1015-35-6 (27-31)

Resumptive Pronouns

Trace of NP-TPC or of WHNP adjoined to the overt resumptive pronoun:
(NP (NP ha) (NP-1 *T*))

In this example, the resumptive pronoun of the WH- trace is the object of a preposition.
Al~atiy yataEar~aD qisom min hA "which is exposed a portion of it(which)"

In this example, the resumptive pronoun is the object of the preposition bayona. This is a particular structure modifying an NP -- it is done as a relative clause with a null relative pronoun whose trace is adjoined to the resumptive pronoun hum.
bayona hum valAv nisA' "among them(whom) (are) three women"

This is an example where the object pronoun is resumptive in a relative clause:
Al+>arADiy Al~atiy yamolik hA muzAriEuwna biyD "the territories which white farmers control them(which)"

example in 715-1-6
subject resumptive ex
resumptive pronoun with TPC subject in an equational S 4-22-02 715-59-5
also 715-7-4 (36-45)

Relative Clauses

ALWAYS adjoined to the NP they modify:
(NP (NP the book) (SBAR which....))

The relative clause SBAR (which white farmers control) is adjoined to the head NP (territories):

See the section on Relative Clauses under Subordinate Clauses below for more information about relative clause structure.

Discontinuous Constituents/Rightward Movement

Rightward-moved constituents (usually complements or modifiers of NPs) are coindexed with an empty element *ICH* (Interpret Constituent Here) at the location where they originate.

Examples:
715-3-3
ICH 715-2-3 (3, 14)

Right Node Raising: Right node raised constituents are similarly coindexed with an empty element *RNR* (Right Node Raising) in each of the positions where the constituent is interpreted.

Examples:
715-5-5 (6-14)

Occasionally something which is not exactly a constituent has been moved rightward. Usually this happens with second conjuncts, where both the conjunction and the second conjunct are moved (as in "I ate lunch on Tuesday and dinner"). When this happens, the entire moved portion is given the node label NAC (for Not A Constituent) and then coindexed with an empty *ICH* adjoined to the first conjunct.

Examples:
715-4-1 (15-27)

A parallel example of normal, unmoved coordination:
715-4-3 (20-30)

Clitics

Cliticized determiners are left attached to the noun/adjective. Possessive pronoun clitics are split from the noun, but are annotated as a flat NP:
(NP the book- -ha)

NPs are split from cliticized prepositions, complementizers, conjunctions, etc. (any category that would affect the syntactic tree, i.e. that would not leave a simple flat NP):
(PP li- (NP -book))
(NP (NP the book) wa- (NP -the+paper))
(SBAR ana- (S (NP-TPC-1 -hu) (VP ....)))

A Note on Case Marking

Difficult NP Structure cases:

NX 715-1-3 (the only one we've seen...)
NAC?? need this example


2. Verb Phrase Structure

(NB: The VP is often the same as the S, if nothing precedes the verb.)

As in the Penn English Treebank, the distinction between arguments and adjuncts of the verb or verb phrase is made through the use of functional dashtags rather than with a structural difference. Both arguments and adjuncts are children of the VP node. No distinction is made between VP-level modification and S-level modification. All constituents that appear before the verb are children of S and sisters of VP; all constituents that appear after the verb are children of VP.

ARGUMENTS of the verb are: NP-SBJ, NP-OBJ, SBAR (no dashtag or -NOM-SBJ/OBJ), S (no dashtag or -NOM-SBJ/OBJ), PP-DTV, PP-CLR (closely/clearly related -- a PP the annotator's intuition says is an argument, though it doesn't fall into one of the official argument categories).

ADJUNCTS are: any XP with any other adverbial dashtag, PP (no dashtag), ADVP (no dashtag).
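The two lists above amount to a lookup on the node label. The sketch below is one possible reading, not an official tool: the name `vp_child_role` is hypothetical, the set of adverbial tags is taken from the Adverbial Modification section, NP-DTV is added from the Objects section, and any label that fits neither list is reported as "other" rather than guessed at.

```python
import re

# Argument labels as listed above; NP-DTV comes from the Objects section.
ARGUMENTS = {
    "NP-SBJ", "NP-OBJ", "NP-DTV", "PP-DTV", "PP-CLR",
    "SBAR", "SBAR-NOM-SBJ", "SBAR-NOM-OBJ",
    "S", "S-NOM-SBJ", "S-NOM-OBJ",
}
# Adverbial function tags named in the Adverbial Modification section.
ADVERBIAL_TAGS = {"TMP", "LOC", "DIR", "PRP", "MNR", "ADV"}


def vp_child_role(label):
    """Return 'argument', 'adjunct' or 'other' for a child of VP."""
    label = re.sub(r"-\d+$", "", label)        # drop a coindex such as -1
    if label in ARGUMENTS:
        return "argument"
    tags = set(label.split("-")[1:])
    if tags & ADVERBIAL_TAGS or label in {"PP", "ADVP"}:
        return "adjunct"
    return "other"                             # e.g. VP, ADJP-PRD


for label in ["NP-SBJ-1", "NP-TMP", "PP", "PP-CLR", "SBAR-ADV", "ADJP-PRD"]:
    print(label, vp_child_role(label))
```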

In this example, the NP-SBJ is the subject, NP-OBJ is the object of the verb, and NP-TMP is an adverbial (temporal) NP:

Subjects

The subject (labeled NP-SBJ) is inside the VP, after the verb.

A simple sentence with NP subject following the verb:

If there is no overt lexical subject, an empty subject (NP-SBJ *) is inserted following the verb.

A simple sentence with pro-drop:

The subject can be pro-drop even if it is semantically empty:
715-9-7 (1-12) It appears that John is happy

Note: The object of a preposition can NEVER be the subject of a sentence!

Pre-verbal/Topicalized Subjects

If the subject precedes the verb, it is labeled NP-TPC and traced to (NP-SBJ *T*) following the verb.
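The three subject configurations described in this section (post-verbal, pro-drop, and pre-verbal/topicalized) can be summarized in a small builder. This is a sketch under stated assumptions: the function name `clause` is hypothetical, the glosses are placeholders, and the index 1 is hard-coded for illustration.

```python
def clause(verb, subject=None, preverbal=False):
    """Build the skeleton of a simple clause in bracketed notation."""
    if subject is None:
        # Pro-drop: an empty subject is inserted after the verb.
        return f"(S (VP {verb} (NP-SBJ *)))"
    if preverbal:
        # Pre-verbal subject: NP-TPC, traced to the post-verbal subject slot.
        return f"(S (NP-TPC-1 {subject}) (VP {verb} (NP-SBJ-1 *T*)))"
    # Default: the subject follows the verb inside the VP.
    return f"(S (VP {verb} (NP-SBJ {subject})))"


print(clause("reported", "the king"))
print(clause("reported"))
print(clause("reported", "the king", preverbal=True))
```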

A topicalized NP subject trace:

Objects

NP objects of the verb are labeled NP-OBJ. Ditransitive objects are labeled NP-DTV or PP-DTV, as appropriate.

An example of a sentence with two objects (one labeled NP-OBJ and the other labeled NP-DTV) is seen in

715-7-2 (6-9)
815-72-24 nominate someone-DTV director-OBJ

Clitics

Cliticized object pronouns are split from the verb:
(VP read- (NP-SBJ *) (NP-OBJ -ha))

Sentential Complements (S and SBAR)

Sentential complements of the verb are unlabeled S or SBAR:
(S (VP reported (NP-SBJ the king) (SBAR that...)))
(S (VP said (NP-SBJ the king) " (S ...) " ))

Adverbial Modification (PP, ADVP, NP-ADV, S-ADV, SBAR-ADV)

All adverbial modification of the sentence and the verb phrase appears within the VP. PPs (Prepositional Phrases) and ADVPs (Adverb Phrases) are by default adverbial. NP, S and SBAR all need some kind of adverbial function tag when they are analyzed as having adverbial function.

A specific adverbial function tag is used for all adverbials whenever it is appropriate: -TMP temporal, -LOC locative, -DIR directional, -PRP purpose, -MNR manner. If no specific function is appropriate, -ADV must be used for adverbial noun phrases and clauses: NP-ADV, S-ADV and SBAR-ADV.
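The tagging rule for adverbials can be written out as follows. This is a sketch only: the function name `adverbial_label` and the English keys ("temporal", "locative", ...) are hypothetical conveniences, and the function assumes the constituent has already been judged to be adverbial.

```python
SPECIFIC_TAGS = {
    "temporal": "TMP",
    "locative": "LOC",
    "directional": "DIR",
    "purpose": "PRP",
    "manner": "MNR",
}
# Categories that are not adverbial by default and so must carry a tag.
NEEDS_TAG = {"NP", "S", "SBAR"}


def adverbial_label(category, function=None):
    """Return the label for an adverbial constituent of the given category."""
    tag = SPECIFIC_TAGS.get(function)
    if tag:
        return f"{category}-{tag}"       # a specific tag is always preferred
    if category in NEEDS_TAG:
        return f"{category}-ADV"         # fallback for NP, S and SBAR
    return category                      # PP and ADVP are adverbial by default


print(adverbial_label("PP", "locative"))
print(adverbial_label("SBAR"))
print(adverbial_label("ADVP"))
```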

Closely Related Prepositional Phrases (PP-CLR)

PPs that are "CLosely Related" to the verb are given the -CLR function tag. This is used for all PPs that seem to be complements of the verb, with the exception of ditransitive verbs where PP-DTV is used.

KANA and her sisters

kAna and her sisters take a subject (usually NP-SBJ) and a predicate. The predicate is shown with the -PRD function tag. It is used with all non-verbal predicates: NP-PRD, ADJP-PRD, PP-PRD.

List of KANA sisters: remain, become, seem, etc.

Examples:
(S (VP KANA (NP-SBJ the book) (ADJP-PRD red)))
(S (VP becomes (NP-SBJ the book) (ADJP-PRD red)))
(S (VP seems (NP-SBJ the book) (ADJP-PRD red)))
715-1-3 badA

Here is the list of kAna and Sisters in Arabic:

>aSbaHa 'to become (in the morning)'
>amsA 'to become (in the evening)'
Dal~a 'to persist'
bAta 'to keep doing something'
>aDHA 'to become (in the afternoon)'
labiva 'to keep to'
baqiy~a 'to remain doing something'
jaEala 'to begin doing something'
>axa*a 'to start doing something'
mA zAla 'to continue'
mA dAma 'to last, to continue'
mA fati}a 'to go on doing something'
mA >infak~a 'to continue doing something'

KANA as an Auxiliary Verb

kAna can also be used as an auxiliary verb, in which case it does not have a subject of its own and it takes a VP complement. KANA is the only auxiliary verb in Arabic (i.e., zAla is NOT an auxiliary).

(S (VP KANA (VP reported (NP-SBJ the king) (SBAR that...))))

vs. zAla, which is not an auxiliary, 715-61-5

Examples:
kanat auxiliary with qad, subject between kana and verb 715-10-4 (1-4.5)

When the subject appears between KANA and the main verb, it is treated as a topicalized subject of the main verb, but it does not have the -TPC tag:

(S (VP KANA (NP-1 the king) (VP reported (NP-SBJ-1 *T*) (SBAR that...))))
ex in 715-2-7

Serial Verbs

KANA is the only auxiliary verb in Arabic. Any other verb that is followed by a second verb is analyzed as a verb with a sentential complement. When the complement sentence has a pro-drop subject, it can be coreferenced with the subject of the first verb.

(S (VP continued (NP-SBJ-1 the king) (S (VP report (NP-SBJ-1 *) (SBAR that...)))))
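The contrast between kAna as an auxiliary and any other verb followed by a second verb can be sketched as a builder. This is an illustration under assumptions: the function name `verb_sequence` is hypothetical, the glosses are placeholders, and the coindexing of the pro-drop subject is shown as if it always applied, although the text above says only that it can be coreferenced.

```python
def verb_sequence(first, second, subject):
    """Bracket a verb followed by a second verb."""
    if first == "kAna":
        # Auxiliary: no subject of its own, takes a VP complement.
        return f"(S (VP {first} (VP {second} (NP-SBJ {subject}))))"
    # Any other verb: a verb with a sentential complement whose
    # pro-drop subject is coindexed with the subject of the first verb.
    return f"(S (VP {first} (NP-SBJ-1 {subject}) (S (VP {second} (NP-SBJ-1 *)))))"


print(verb_sequence("kAna", "reported", "the king"))
print(verb_sequence("continued", "report", "the king"))
```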

Examples:
715-10-6 (15-20)

Passive Verbs

Verbs in the passive form always have a passive object trace which is coindexed to the subject: (NP-OBJ-1 *)
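A simple consistency check for this rule might look like the following. It is a sketch only: the function name `passive_trace_ok` is hypothetical, the sample trees are invented glosses rather than corpus trees, and the check assumes the subject carries its index on the label in the form NP-SBJ-1.

```python
import re


def passive_trace_ok(tree):
    """True if the tree has a passive trace (NP-OBJ-n *) coindexed with a subject."""
    traces = re.findall(r"\(NP-OBJ-(\d+) \*\)", tree)
    subjects = re.findall(r"\(NP-SBJ-(\d+) ", tree)
    return any(index in subjects for index in traces)


# Hypothetical glossed trees, for illustration only.
good = "(S (VP was-read (NP-SBJ-1 the book) (NP-OBJ-1 *)))"
bad = "(S (VP was-read (NP-SBJ the book)))"
print(passive_trace_ok(good), passive_trace_ok(bad))
```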

passive example

The passive trace is the same, even if the subject is topicalized:

passive with TPC example

Passive with logical subject, NP-LGS:
715-12-3 (4-7)

Middle Verbs

Middle construction example in 715-61-2 "be-composed", Form 5 p. 24 bottom table in Fischer

taC1aC2aC3~a (tafaEal~a)

Floating Quantifiers

example in 715-61-2. Do it as ADVP in VP? Not sure this was totally decided 4-24-02.


3. Coordination

Coordination is done as adjunction (Z (Z ) and (Z )); coordination has the same structure at all phrase levels.
This is an example of NP coordination:

SBAR and SBAR coordination 715-12-1 (23-33)

When constituents of different types are coordinated, the outer coordination-level node label is UCP (Unlike Coordinated Phrase). Any shared function tags are put on the UCP label, and not on the lower labels.
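The choice of the outer label can be sketched as follows. This is an illustration, not the annotation tool: the function name `coordination_label` is hypothetical, each input label is the category plus the function tags that conjunct would carry on its own, and treating shared tags the same way for like-category coordination is an assumption.

```python
def coordination_label(conjunct_labels):
    """Return the label of the node that dominates a coordination."""
    parts = [label.split("-") for label in conjunct_labels]
    categories = {p[0] for p in parts}
    first_tags = parts[0][1:]
    # Keep only the function tags that every conjunct shares.
    shared = [t for t in first_tags if all(t in p[1:] for p in parts)]
    category = categories.pop() if len(categories) == 1 else "UCP"
    return "-".join([category] + shared)


print(coordination_label(["PP-TMP", "SBAR-TMP"]))
print(coordination_label(["S", "SBAR", "S"]))
print(coordination_label(["NP", "NP"]))
```

Once the shared tags are on the outer node, the conjuncts themselves are left with their bare category labels.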

example in 715-1-4 S and SBAR and S
UCP-TMP 715-1-10
715-61-2 coordinated SBAR relatives, need WH 0 for second... 4-24-02
715-4-3 (20-30)

Initial wa

Sentence-initial wa is treated as having a discourse rather than coordinating function, and as such is put inside the S. However, all other instances of wa are treated as true coordination.

This is an example of sentence-initial wa:
715-4-4 (first S in the guidelines)

This is an example of NP coordination:

Gapping (VP Template Gapping)

Examples:
715-61-6
715-5-3 (15-34)
with *?* 715-17-3 (0-23, whole tree)


4. Subordinate Clauses

Verbs of "Saying"

Direct Speech

Direct "quoted" speech is treated as a complement of the verb of saying, however it is quoted (i.e., null complementizers are not inserted for direct speech).

(S (VP reported (NP-SBJ the king) " (S I'm going home) " ))
(S (VP reported (NP-SBJ the king) " (SBAR that (S I'm going home)) " ))

Examples:
715-11-4 whole tree

Indirect Speech

N.B.: may not be relevant for Arabic????

Indirect speech is always treated as an SBAR complement of the verb of saying. If there is no overt complementizer, a null complementizer (0) is inserted.

(S (VP reported (NP-SBJ the king) (SBAR that (S he will leave))))
(S (VP reported (NP-SBJ the king) (SBAR 0 (S he will leave))))
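The null-complementizer rule for indirect speech is small enough to state as a helper. This is a sketch: the function name `indirect_speech_sbar` is hypothetical and the clause is passed in already bracketed.

```python
def indirect_speech_sbar(clause, complementizer=None):
    """Wrap a clause in SBAR, inserting the null complementizer 0 if none is overt."""
    comp = complementizer if complementizer else "0"
    return f"(SBAR {comp} {clause})"


print(indirect_speech_sbar("(S he will leave)", "that"))
print(indirect_speech_sbar("(S he will leave)"))
```

Direct speech is different: as described above, no null complementizer is inserted for it.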

Relative Clauses

Relative clauses are always adjoined to the NP they modify. The relative clause is an SBAR that always begins with a WH- word (alaty, ala*y, mA, when, where, why) or a null WH- word (0) if there is no overt WH- word. The WH- is coreferenced with a trace that fills its function in the clause.
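The requirement that every relative clause starts with a WH- element coindexed with a trace can be checked mechanically. The following is a sketch under assumptions: the function name `relative_clause_ok` is hypothetical, the index is assumed to be written on the WH label (e.g. WHNP-1) and on the trace label (e.g. NP-1), and the sample tree is a hand-built illustration using words from the resumptive pronoun example, not a corpus tree.

```python
import re


def relative_clause_ok(sbar):
    """True if the SBAR opens with an indexed WH phrase that has a matching *T* trace."""
    match = re.match(r"\(SBAR \(WH[A-Z]+-(\d+) ", sbar)
    if not match:
        return False
    index = match.group(1)
    return re.search(rf"\([A-Z-]+-{index} \*T\*\)", sbar) is not None


# Hypothetical tree: overt WH word with a resumptive object pronoun.
overt = ("(SBAR (WHNP-1 Al~atiy) (S (VP yamolik (NP-SBJ muzAriEuwna biyD) "
         "(NP-OBJ (NP hA) (NP-1 *T*)))))")
# Hypothetical tree: null WH word (0) with a subject trace.
null_wh = "(SBAR (WHNP-1 0) (S (VP left (NP-SBJ-1 *T*))))"
print(relative_clause_ok(overt), relative_clause_ok(null_wh))
```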

subject relative
object relative
object of PP relative
adverbial relative
WH 0 relative 715-3-2
adj-prd relative WH 0 715-4-1 (6)
relative traced to lower clause 715-9-7 (23.5-33)
rel cl with resumptive object pronoun 715-16-3 (15-29)

Resumptive pronouns in relative clauses

The trace of the WHNP is adjoined to the overt resumptive pronoun:
(NP (NP ha) (NP-1 *T*))

even if the resumptive pronoun is possessive:
(NP book (NP (NP his) (NP-1 *T*)))
the majority of whom - resumptive possessive pronoun, equational sentence, WH0 715-4-6 (4-16)
resumptive OBJ 715-9-3 (29.5-38)
the majority of which 1015-35-6 (21.5-25)

Coordination

Multiple relative clauses modifying the same NP can be coordinated, as coordinated SBARs:

715-7-1 coord rel SBARs WH0 and Alatiy

The above example also illustrates the use of the null relative pronoun (WHNP 0) with passive relative clauses.

Free Relatives

Free relatives have the internal structure of relative clauses (SBAR with a WH and its trace), but function externally as nouns. Therefore, they receive the "nominal" function tag -NOM: SBAR-NOM. In Arabic, they are headed by mA when it means alaty.

free rel ex 715-3-2
also 715-1-7
free rel object of PP 715-10-1 (30-35.5)
free rel object of PP 715-11-1 (41-45.5)

Note that while mA normally heads only free relatives, it may appear heading a relative clause that modifies an NP:
715-6-3 (21 and on)

Special cases

SBAR vs. SBAR-ADV

SBAR complements of the verb are plain SBAR with no function tag. Adverbial SBARs must have an adverbial function tag:
reported that complement
arrived when temporal
will do this if ADV, if in 715-2-6 (36)
when SBAR-TMP 715-10-4 (26-27.5)
if possible SBAR-ADV 715-11-5 (17-18.5)

ana hu can be done as a flat complementizer:
715-10-6 (4-15 or 20)

However, the hu can also be a topicalized subject pronoun. The fact that the clitic can be any personal pronoun (not just hu) is evidence that this construction is not purely a flat complementizer of "ana hu".
715-12-2 (31-33.5) with iy !

S vs. S-ADV

S complements of the verb are plain S with no function tag. Adverbial Ss must have an adverbial function tag:
reported direct speech complement
continued serial verb complement
hal -ADV 715-9-2 (12-14)
masdar -ADV 715-2-8, 715-4-1, 715-4-5 (30-37)
equational -ADV
small clause
among whom 715-1-8
or coord S??? among them 715-61-12
while, fiy Hiyn S-TMP 715-15-3 (44-51)

PP vs. SBAR

PP if its complement is NP, SBAR if its complement is S.
examples???

li SBAR 715-11-5 (19-34)

Flat multi-word complementizers

A preposition that is not a required argument of the verb (i.e., not PP-CLR) is annotated as flat pre-modification of an SBAR complementizer.

EalaY >an 715-16-4 (7-8)

Small Clauses

Small clauses are complements of verbs like consider, find, call, name. They are shown as an S with an NP-SBJ and a -PRD predicate.

small clause example, passive and TPC 715-7-2 (35-39 or 46)
with rank/classify, WH, passive 715-8-1 (9-13)
passive, TPC 715-12-2 (35-39 or 45)

Small clauses can be complements of the same set of verbs, even if the verb is in the passive form. When the verb is passive, the subject of the small clause is the passive trace.

example series from 4-24-02 Simba -- active, passive, relative clause, relative passive

Active Small Clause

Passive Small Clause

Passive Small Clause with Topicalized Subject

passive small clause example

The passive trace is the same, even if the subject is topicalized:

passive small clause with TPC example

Other subordinate clauses

"if ... or not" example 715-2-6

Expletive SBAR and hu: 715-2-10
expletive S with hu 715-6-2 (6-34)
empty expletive? or not? 715-1-11
empty ex 715-61-2


5. Participles, Gerunds and Masdar

Examples:
715-11-1 (10.5-16) adj->VP rel cl active mufAEil
715-11-1 (13.5-16) adj->VP rel cl passive mafEuwl
715-11-5 (2.5-7) adj->VP active
715-11-5 (21.5-24) adj->VP participle
715-11-5 (19-end?) S-NOM obj of PP, with hA subject

M A S

 

GERUNDS & PARTICIPLES

 

[ Draft dated: Tues, June 25, 2002]

 

I. Distribution of S, S-NOM, S-ADV, NP, ADJP

 

The use of S, S-NOM, S-ADV, NP and ADJP for gerunds and participles is purely distributional. This distribution assumes that you already know whether the word is a verb or a noun/adjective (see II. and III. below).

 

 

 

 

Null subjects of verbal gerunds can be coindexed to another NP in the sentence if they have a coreferenced interpretation.

 

II. All masdar (= MAS / >ism Al-fiEl), present participle (= PRP / >ism Al-fAEil) and past participle (= PSP / >ism Al-mafEuwl) constructions are analyzed by default as NPs or ADJPs, depending on the context. Below are a number of tests to confirm this default interpretation. However, evidence of verbal arguments, modification or interpretation overrides this default and leads to a VP analysis (see III. below).

 

1. The MAS/PRP/PSP is a single word (or with a possessive pronoun clitic) → NP

           

A.        yakuwnu nAjimAF Ean >istidAmihA

bi-Al-ragmi min rafDihi

yawma mawtihi

B.       

 

C.        zAra Al-maHbuwbu Habiybatahu

 

2.a. The MAS/PRP/PSP itself has a determiner (Al-) → NP

           

A.        Al-Eawdap <ilAy <iyran

Al-bud'i  bi-<iEAdati tawziyEi Al->arADiy...

Al-<ifrATi fiy $urbi Al-kuHuwli

baEda Al-tazaw~udi bi-Al-miyAhi

 

B.        EalaY jamEi  Al-zujAjAti Al-fArigap

Al-mutaHad~ivu bi-{ismi qiyAdati Al->arkAni Al-ruwsiy~ati

Al-muqiymuwna fiY Al-garb

Al-qim~atu Al-munEaqidatu fiy kAmb dayfid

Al-duwali Al-muSad~irati li Al-nafTi...

luwng biyt$ Al-wAqiEatu EalaY

nufuwvu wA$inTuwn Al-muhaymini fiy…

 

C.        li-Al mu$Arakati fiy <iEAdati <iEmArihA

min Al-muqar~ari >an...

 Al->awSAti  Al-muqar~abati min Al-ri{Asap  Al-<iyrAniy~ap

Al-Hariyqi Al-mundalaEi fiy biylyuwn

qim~at $arm Al-$ayx Al-mutawaq~aEati gadAF

>ilaY  >iETA'i  Al-EalAqAti  Al-mutamay~azati bayna ...

Al-t$iyki  milAn, Al-muqAli min manSibihi

... Al-muSan~afatu 12 Ealamiy~AF

 

2.b. The MAS/PRP/PSP itself has a determiner (Al-) and modifies an NP (or is itself a predicate) → ADJP

 

N.B. A test to distinguish between NP and ADJP is to try following the MAS/PRP/PSP with jidAF "very". If it's still good, then the MAS/PRP/PSP is an ADJP.

 

            Examples:

 

ADJP-PRD : Al-nadwatu  Al-muqar~ari EaqduhA fiy...

ADJP in NP : mat$il~A Al-Ealimu  bi-mustawA  Al-lAEibiyna Al-suEudiy~ina

ADJP/flat in NP:    Al-yawmu Al-mawEuwdu

                                   QayS, Al-maHbuwbu Al-majnuwnu

 

3. The MAS/PRP/PSP is modified by an adjective → NP

 

            A.        tawziyE  Ea$wa>iy~in  li-Al->arADiy

 

            B.        ruwsyap Al-rAEiy~ati Al-vAniy~ati li…

 

            C.        Al-kuwayt Al-dawlatu Al-muSad~iratu Al->uwlA li-Al-nafTi

 

4. The MAS/PRP/PSP has a GENITIVE NP argument → NP

           

A.        mun*u  qiyAmi Al-vawrati Al-<islAmiy~ap

mun*u {inbilAji Al-fajri

HuSuwli Al-hujuwmi Al-$iy$Aniy~

suquwTi qatlaY muEZamuhum min Al-filasTiyniy~iyna

fiy makAni tawAjudi qiyAdati waHadAti wizArati Al-dAxiliy~ati

{indilAEi Al-HarA}iqi fiy Al-gAbAti

tawziyEi  Al->arADiy

sanaquwmu bi-tawfiyri <iqAmatihim

tam~a taxfiyfu Hid~ati  Al-HarA}iqi

li-nazEi fatiyli Al->azmati fiy Al-$arq Al->awSaT

li-tanZiymi HayAtihim

<I$AratAF <ilaY rafDi Al-{igtisAli wa…

EalaY >uhbati <ilqA'i HumuwlatihA

sayakuwnu jaElu waqfi <iTlAqi Al-nAri …

Hub~u  Al-banAti

 

B.        Hamilatu Al-laqabi...

 

C.        musAbaqatu ka>si Al-Ealami

 

N.B.

 

(a) The GEN may, however, appear in a SBJ or OBJ relationship with a "verbal" MAS (Fischer #386.b), as in Hub~u Al-banAti / >aklu Al-dajAji, which can be "the girls' loving" / "chicken feed" or "loving (the) girls" / "eating the chickens." Unless there is a strong indication from the context that leads towards a verbal interpretation, these are all → NP

 

(b) When the GEN and ACCU are formally indistinguishable (especially with DUAL and PL forms -- see Fischer #140), as in <ilaY <iSAbati jundiy~ayni ruwsiy~ayni {ivnayni, the default choice is → NP

 

(c) Note that this test refers only to NP arguments of the participle. If a preposition intervenes, this test does not apply! (see below for PPs)

 

5. The MAS/PRP/PSP is modified by a PP (no strong verbal reading) → NP or ADJP

 

N.B. A test to distinguish between NP and ADJP is to try following the MAS/PRP/PSP with jidAF "very". If it's still good, then the MAS/PRP/PSP is an ADJP.

 

A.        tamhiydAF li-Eawdap >al-EA}ilAti >al-<iyrAniy~ati

<I$Arap <ilaY rafDi Al-{igtisAli  wa -<idmAnihi  EalaY...

qumtu  >ikrAmAF  lahu..

{iEtibArAF  min tam~uwz yuwliyuw

 

B.        yakuwnu nAjimAF Ean >istidAmihA

 

kamA >aElana mutaHad~ivun bi >ismi Al-jamArik…

ADJP : majmuwEatin >amiriykiy~atin muEAriDatin li...

ADJP : $arikatin mutaXaS~iSatin fi Sin~Eati Al-nafT

ADJP : >inna firaqa Al->inqAD mudrikatun li-kulli mA sabaq

 

C.        ADJP : mawjuwdAF fiy maTAri xAn qalEap

ADJP : kAnat mawjuwdatAF EalaY maqrabtin min qiyAdati Al-arkAni

ADJP : >anna Al-gaw~Asata mujah~azatun bi 42 SaruwxAF

ADJP : nabAtAt nAdirAF jid~AF muhad~adatin bi-Al->inqirAD…

ADJP: fiy Eulbatin mawDuwEatin fi maxba>in

 

 

III. Evidence of verbal arguments, modification or interpretation overrides the above default and leads to a VP analysis of masdar, present participle and past participle constructions.  Below are a number of tests for the verbal interpretation.

 

1. The MAS/PRP/PSP has an ACCUSATIVE NP argument → VP

           

A.        bi-tasjiyli-hi 3.42 mitrap   

                       

B.        Al-bAligatu min Al-Eumuri EamAF

            MA Hamidun Al-Suwqa >il~A man rabiHa

            Lastu bi-Al-jAHidi faDlakum

 

C.        tam~at muHaSaratu gAlibiy~ata Al-HarA}iqi

Al-lAEibi Al-mutaSad~ari baTala Al-mawsimi

 

 

 

2. The MAS/PRP/PSP has any true ADVP modification → VP

 

A.        bi-Al-ragmi min rafDihi sAbiqAf

 

B.        fal-Eamaliy~atu jAriy~atun Haliy~AF

 

C.        mat$il~A Al-Ealimu tamAmAF 

bi-mustawA Al-lAEibiyna AlsuEudiy~in

 

3. 'Hal': If the 'Hal' MAS/PRP/PSP is lexicalized as an adverb, then it is analyzed as ADVP. If the 'Hal' MAS/PRP/PSP does not have a strong verbal reading, but does modify the matrix verb in the clause, it is analyzed as NP-ADV. If the 'Hal' MAS/PRP/PSP has a strong predicate reading requiring a subject, it is analyzed as an ADJP-PRD in an S-ADV with the empty subject coindexed to the coreferent NP in the clause.

 

Need strong examples….

 

B.        tAbiEatap li...

mutawaj~ihAF >ilay

mu$iyrAF <ilaY HuSuwli  XTA

lAHiqAF <ilaY Al-majmuwEap Al~atiy

bi-Al->u'Suwli muntaSirAF EalaY xalfiy~ati Al-muwAjahAt fiy Al->arADiy

 

4. The MAS/PRP/PSP has a very strong event reading in the context → VP

 

The other rules would all give → NP, but the strong event reading overrides them → VP
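The interaction between the default analysis (II.) and the verbal overrides (III.) can be summarized as a decision procedure. This is a simplified sketch, not the annotators' actual workflow: the class `Deverbal`, its field names and the function `analyze` are hypothetical, the 'Hal' cases of test III.3 are left out, and the jidAF test is used as the only way of choosing between NP and ADJP.

```python
from dataclasses import dataclass


@dataclass
class Deverbal:
    """Observations about one masdar or participle in context."""
    accusative_np_argument: bool = False   # test III.1
    advp_modifier: bool = False            # test III.2
    strong_event_reading: bool = False     # test III.4
    accepts_jidAF: bool = False            # the "very" test from II.2.b and II.5


def analyze(word):
    """Return the phrase type: verbal evidence overrides the nominal default."""
    if (word.accusative_np_argument
            or word.advp_modifier
            or word.strong_event_reading):
        return "VP"
    # Default analysis: ADJP if it can be followed by jidAF, otherwise NP.
    return "ADJP" if word.accepts_jidAF else "NP"


print(analyze(Deverbal()))
print(analyze(Deverbal(accepts_jidAF=True)))
print(analyze(Deverbal(accepts_jidAF=True, advp_modifier=True)))
```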

 


6. PP and ADVP Structure

Prepositional Phrases almost always have a single NP complement.
(PP-LOC fiy (NP Egypt))

Flat PPs

Multi-word prepositions are annotated as flat with an NP complement.
bada >an 715-1-8
siway li
lA buda min
la Hawola
the rest of the list???

If the PP is a required argument of the verb (PP-CLR), it can have an SBAR complement, a construction which is fairly common in Arabic. Here is an example of a PP with an ana complement:
715-11-3 (3-end of SBAR)
715-11-5 (27-34)

gayor can be a preposition, particle, adverb or conjunction, depending on context. Here is an example where it is a conjunction: 715-11-2 (22).

An ADVP can have a PP child, if the adverb head is the primary adverbial and the PP modifies it.

Examples:
715-16-2 (??) badalAF min
715-16-6 (44-46) badalAF min

On the other hand, if the adverb modifies the PP, the PP is the primary structure, and the ADVP is a child of PP.

Examples:
715-16-12 (35-37) especially with the presence


7. Miscellaneous Constructions

An unordered miscellany of difficult constructions...

Coreference

In this treebank, we show syntactic coreference through coindexing, but we do not show discourse coreference. This means that when two items are coreferenced, one of them must be an empty category. It also means that we do not show the coreference of pronouns.

Dates

When months appear with two names, they are treated as a two-word noun phrase, and therefore they need to have their own NP level. (NP 28 (PP of (NP (NP Sept. / Sept. ) (ADJP past))))

Examples:
28 of Sep/Sep past 1015-35-6 (13-17)

More examples of constructions involving dates:
715-16-1 (26-33) from 10 to 19 July - endpoints, so 2 separate PPs

Compass directions

Compass directions are basically calques in Arabic, and they are done flat:
715-11-1 (24-26) south east

Sports scores

Sports scores such as "6-4" in "The Phillies won 6-4" should be done as a flat ADVP: (ADVP 6-4).

Examples:
715-5-1 (28-29)

Comparatives

Done as adjunction.


8. Arabic Constructions

Nominal Sentences

Nominal sentences are analyzed as sentences where the subject is "topicalized" and precedes the verb. If the subject precedes the verb, it is labeled NP-TPC and traced to (NP-SBJ *T*) following the verb.

A topicalized NP subject trace:

Verbal Sentences

Verbal sentences are analyzed as sentences where the subject follows the verb. Other adverbial modification may precede the verb. The subject (labeled NP-SBJ) is inside the VP, after the verb.

A simple sentence with NP subject following the verb:

If there is no overt lexical subject, an empty subject (NP-SBJ *) is inserted following the verb.

A simple sentence with pro-drop:

Verbal sentence with adverbial material preceding the verb:

on tuesday came the king... example

Equational Sentences

Equational sentences are analyzed as sentences that must have a subject (-SBJ) and a predicate (-PRD). An "equational" sentence with an adjectival predicate:

Some more examples:
PP-PRD with SBAR-SBJ 715-2-6 (30)

Masdar

See section Participles, Gerunds and Masdar above. Masdar is analyzed as a verbal gerund. S-ADV

715-2-8
715-68-1 with NP-OBJ
715-68-2 2 NP objects???
715-61-11 adding SBAR
715-9-3 (29.5-38) S-NOM
715-17-1 (18-28) S-NOM with hi subject
715-11-1 (28-36) ditransitive, object of PP

Here is an example of an ADJP that is NOT masdar:
715-11-5 (2-7)

Mufaal

We do not annotate "reduced relatives" as reduced in Arabic. Since the subject follows the verb, the subject trace of WH-movement has to be shown (and so there is no "reduction" for Arabic). These relatives are annotated as passive verbs with WH 0 or as ADJP-PRD with a WH 0.

WH0 with ADJP-PRD and a resumptive possessive pronoun in the subject 715-4-5 (23-26.5)
715-9-3 (29.5-38)

Hal

S-ADV 715-9-2 (12-14)

WHADVP with Hal, 715-12-4 (21-34.5)

KANA and her Sisters

kAna and her sisters take a subject (usually NP-SBJ) and a predicate. The predicate is shown with the -PRD function tag. It is used with all non-verbal predicates: NP-PRD, ADJP-PRD, PP-PRD.

List of KANA sisters: remain, become, seem, etc.

Examples:
(S (VP KANA (NP-SBJ the book) (ADJP-PRD red)))
(S (VP becomes (NP-SBJ the book) (ADJP-PRD red)))
(S (VP seems (NP-SBJ the book) (ADJP-PRD red)))

Here is the list of kAna and Sisters in Arabic:

>aSbaHa 'to become (in the morning)'
>amsA 'to become (in the evening)'
Dal~a 'to persist'
bAta 'to keep doing something'
>aDHA 'to become (in the afternoon)'
labiva 'to keep to'
baqiy~a 'to remain doing something'
jaEala 'to begin doing something'
>axa*a 'to start doing something'
mA zAla 'to continue'
mA dAma 'to last, to continue'
mA fati}a 'to go on doing something'
mA >infak~a 'to continue doing something'

Clitics

Clitics that play a role in the syntactic structure are split off into separate tokens (e.g., object pronouns cliticized to verbs, subject pronouns cliticized to complementizers, cliticized prepositions, etc.). Clitics that do not affect the structure are not separated (e.g., determiners).
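The splitting rule can be sketched as a tokenizer over pre-segmented words. This is an illustration only: the function name `tokenize` and the segment type names are hypothetical, the input is assumed to be already segmented and typed, and the "+" and "-" conventions are modeled on the glossed examples in the Clitics subsection under Noun Phrase Structure, such as (PP li- (NP -book)) and wa- (NP -the+paper).

```python
# Clitic types that affect the tree and are split into their own tokens.
SPLIT_OFF = {
    "object pronoun", "subject pronoun", "possessive pronoun",
    "preposition", "complementizer", "conjunction",
}


def tokenize(segments):
    """segments: (form, kind) pairs in surface order; returns treebank tokens."""
    tokens, current = [], ""
    for form, kind in segments:
        if kind in SPLIT_OFF:
            if current:
                tokens.append(current)
                current = ""
            tokens.append(form)
        else:
            # Determiners and stems stay attached, joined with "+".
            current = f"{current}+{form}" if current else form
    if current:
        tokens.append(current)
    # Mark every seam between split tokens with a hyphen on both sides.
    last = len(tokens) - 1
    return [("-" if i > 0 else "") + t + ("-" if i < last else "")
            for i, t in enumerate(tokens)]


print(tokenize([("li", "preposition"), ("book", "noun")]))
print(tokenize([("wa", "conjunction"), ("the", "determiner"), ("paper", "noun")]))
print(tokenize([("read", "verb"), ("ha", "object pronoun")]))
```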

PP with a cliticized object pronoun, split apart so that the NP can be shown:

Subject pronoun cliticized to a complementizer, split so that the structure can be shown:

Initial wa

Sentence-initial wa is treated as having a discourse rather than coordinating function, and as such is put inside the S. However, all other instances of wa are treated as true coordination (see the section on Coordination above for a discussion of coordinated structures).

This is an example of sentence-initial wa:
715-4-4 (first S in the guidelines)

This is an example of NP coordination:

The various uses of mA

1. Relative Pronoun mA (with trace)

mA "what; whatever"
man "who, whoever"
mA*A "what"
li-mA*A "for what, why"
mahmA "whatever"
>ay~u (+ GEN) "which of…?"
>ay~umA "whichever"
>ayna "where?"
>aynamA "wherever"
matA "when?"
matA mA "whenever"
Hayvu-mA "wherever"
kayfa "how"
kayfa mA "however"

Examples:

mA liy? "what is with me?"
mA laka? "what is with you?"
mA lahu kA*ibAF? "For what is he lying?"
man liy? "Who do I have?"

mA in free relatives/SBAR-NOM
mA sAEadahA EalaY Al-fawzi huw~a >as~ukuwt
[ niEma/bi>sa + mA ] : PRED + SBAR-SBJ
niEma mA >amarta bihi
bi>sa mA SanaEta
mA >agraba mA najiduhu fiy manzilihA
mA can be used to express uncertainty as in:
>akaltu mA >akaltu "I ate whatever I ate"
hum mA hum "they are whatever they are"

2. Quantifier/Indefinite mA "some"

yawmin mA "some day"
>amrN mA "some question"
mA $awqK "much longing"
Eam~A qaliylK "almost"
bimA raHmatK "for kindness" ("Expletive mA"; see Blachère)

mA min and man min "so many, so much"
mA min >aHadin yuqad~iru Eamalakum mivla mA >uqad~iruhu
mA min >insAniK hunA yaHtAju >ilayhi
mA min yawmiK >il~A wa ta*ak~artuhu
mA min quwwatin kAnat tastaTiyEu >al-wuquwfa fiy wajhihi
(See Oliverius page 66)
yawmAF mA "some day"
fiy HAlatK mA "in any state"
mA "as long as" + PERFECT
lan nadxulahA mA dAmuw fiyhA (mA + perfect verb + future)

3. Particle mA (PRT)

a) Negative mA [compare to: lA, lam, laysa] mA (>inta) baxiylN --- NOM
lasta (>anta) baxiylAF---ACCU
mA liy
mA bAlu … (see Fischer # 285.1 & #434.1)
mA muHam~aduN >il~A rasuwluN "Muhammad is (nothing) but a messenger"
mA huw~a laka bi jArin "he is not for you a neighbor"
mA hA*a ba$arAF
mA >in + mA "not at all"
mA … >il~A >an…."no sooner …than…"

b) Exclamative mA [ mA >at~aEaj~ubiy~ap] + ACCU Examples: mA >ajmalahA!
mA kAna >aSbarahu 'How patient was he!'
mA >afEala + NP (ACC) or Relative mA
mA >agraba mA najiduhu fiy manzilihA
mA >a$rafa zaydAF (Blachère 192)
mA >ajmala Al-binta
mA >ajmalahA

4. Subordinating Complementizer mA (mA >al-maSdariy~ah) "the fact that" mA "as long as"
>im~A "if"
lam~A "after"
>i*A mA "if"
>lam~A >an "after, when"
Eam~A "about that which" -----Ean mA
EindamA "when" --------Einda mA
baynamA "while"
bimA
fimA
kaviyrAF mA "it is frequent that…" [Blachère, page 220]

It introduces a verbal clause (see Fischer #416): e.g. Eajabtu min mA Darabtahu
mA + PERFECT_VERB (see Fischer #462)
"while" >agu*~u Tarfiy mA badat liy jAratiy "I lower my eyes while my neighbor appears before me"
"as long as"
"as often as"
kul~amA + PERFECT-VERB "everytime that…, whenever, as often as"
"The more…the more" (see Fischer #463)


Please send e-mail to Ann Bies if you have any questions, comments, additions, etc.