Multilingualism effects in an elicitation study on Differential Object Marking in Cusco (Peru) and Misiones (Argentina)

Although Differential Object Marking in Spanish (a-marking of direct objects) has been extensively studied from different perspectives and with different methods, its status and functioning in multilingual and language contact settings have thus far received little attention. This paper presents and compares data from monolingual and bilingual speakers of Spanish from two regions in Latin America, namely Argentina/The River Plate and Peru. An experimental elicitation study reveals that there are considerable differences in the DOM systems of Spanish monolinguals vs. bilinguals and between the bilingual groups, with the latter showing more individual variability and lower rates of a-marking in general. Our findings also suggest that within monolingual groups, the variation of a-marking is strongest for semantics-driven factors rather than syntax-driven ones. From a methodological perspective, we introduce an effective tool for collecting oral production data for a wide range of different DOM-sensitive syntactic configurations.


An open question in the study of Differential Object Marking
Since Bossong (1982), the split in Spanish direct object marking between zero marking and the presence of the marker a has come to be known as Differential Object Marking (DOM). As the examples in (1) show, some objects obligatorily receive the marker, while others cannot receive it.
(1) a. Veo *(a) María.
       see.prs.1sg dom M.
       'I see Maria.'
    b. Veo (*a) la bicicleta.
       see.prs.1sg dom the bicycle
       'I see the bicycle.'
The topic in itself has always been a controversial one in the study of Spanish grammar and has received broad attention from different perspectives: synchronic, diachronic and variational (for a detailed overview cf. Fábregas 2013). It has also been claimed that the extension of DOM, especially with respect to inanimate objects, may differ regionally. Company Company (2002) discusses data from Mexico, suggesting an important increase of a-marking in indefinite and inanimate objects. Similar claims have been made for the Rio de la Plata region (Dumitrescu 1997; Montrul 2013; Hoff 2018) and corpus studies have found that factors of different relative strength favour a-marking in different varieties (Barraza 2003; Alfaraz 2011; Balasch 2011; Tippets 2011). Many regions, however, lack detailed empirical studies, this being the case for most of the countries along the Pacific coast in South America, including Peru. The same holds for contact scenarios. While there is work on heritage speakers and acquisition of Spanish as a foreign language (cf. Section 2), empirical work on DOM in contact scenarios of predominantly Spanish-speaking countries is rare (cf., however, Mayer/Sánchez, this volume). Furthermore, many relevant configurations which, according to the literature, show a strong affinity to DOM have not received special attention in the above-mentioned studies, most probably because they are not frequent enough in spontaneous spoken data. In many cases, it is not clear whether constructions with particular DOM-relevant features (e.g. reversible predicates based on verbs such as reemplazar 'replace' or seguir 'follow') have been included in the analysis of what we will call canonical transitive sentences, such as (1), or whether they have been excluded from the analysis in those studies. The same applies to the other configurations introduced in 1.2.
The present paper addresses this gap both methodologically and empirically. It introduces an approach based on sentence elicitation, performed by speakers from contact regions and control groups with a predominantly monolingual background. The sentence elicitation procedure was set up with an experimental design in order to assess the use of a wider range of relevant structural configurations in a given variety while controlling for a number of factors. Exact replication also allows for the collection of highly comparable data. Furthermore, it allows us to test marginal or infrequent configurations in a more reliable way. In the ideal case, such experimental studies can be backed up with spontaneous spoken language data, acceptability rating tasks, and metalinguistic interviews in a combined approach for a final assessment.
In the remainder of this introductory section, we present a series of configurations characteristic of Spanish DOM, which we will use to check for putative variation patterns in the empirical study. Section 2 addresses the current issues in the investigation of DOM in multilingual settings, with special focus on the two contact regions under discussion. Section 3 presents a description of the elicitation study and its results. Section 4 discusses the findings, and Section 5 presents some conclusions.

DOM as a multifactorial phenomenon
Given that Spanish DOM is a multifactorial phenomenon, different properties should be considered when characterizing the DOM system of a given variety of Spanish. In this Section, we introduce some nominal and verbal semantic properties, as well as configurations in structure and discourse well known to be relevant to DOM.
Animacy, definiteness and specificity: This is the "core contrast" associated with DOM. NPs with reference to humans and definite interpretations always receive the marker (1a). Other animate NPs (e.g. with reference to animals) are also a-marked or show some degree of variation, but they certainly do not reject a-marking. For inanimates in canonical transitive structures as in (1b), the natural intuition in most contexts is strong rejection, hence the claims of ungrammaticality in the literature. However, it has been observed that a-marking occurs sporadically with inanimates both in spontaneous spoken and written language (cf. García García 2014 for a monograph-length discussion).
The specificity contrast in inanimate objects is exemplified in (2), an ambiguous sentence, which can mean that María is just looking for someone who fulfils the requirements of translating from or to German (unmarked), or, that there is a previously identified German translator she is trying to find (a-marked).
search.prs.3sg dom a translator German
'María is looking for a German translator.' (López 2012, 10)
According to López (2012, 10), "[t]he object in this sentence can be prefixed by accusative A. With accusative A, it can have a specific reading. Without accusative A, it can only be nonspecific." Similar contrasts can be observed by modifying the object with 'a certain' or 'no matter who' (cierto/cualquiera), the latter blocking a-marking according to López (2012, 17), or with subjunctive/indicative alternations. For further DOM patterns, the properties of the entire construction have to be taken into consideration.
Verb semantics: After the inherent properties of the object noun and its discourse status, the properties of the verb as the main predicate of the sentence also constitute a crucial factor of DOM (von Heusinger/Kaiser 2011; García García 2014). Among these properties of the verb, a primary focus of attention has been on affectedness, which refers to the "persistent change in an event participant" (von Heusinger/Kaiser 2011, 594) and is therefore a crucial ingredient in the definition of transitivity. Von Heusinger/Kaiser (2011) use a scaled notion of affectedness in their empirical study and rank the verbs by the degree to which the participant is transformed or involved, as determined by the meaning of the verb.
[Figure 1: affectedness scale (von Heusinger/Kaiser 2011)]
The generalization expressed by the affectedness scale is that there is a decrease of a-marking from left to right, as verified by von Heusinger/Kaiser (2011) in their corpus study.
Another important factor involves the semantic roles defined by the argument structure of the verb. García García (2014, 22) proposes a relational notion of semantic roles, namely a 'decline of agentivity' between agent and patient. Agentivity is of special importance for the a-marking of inanimate objects, as García García (2014) exemplifies with a class of verbs which he labels "reversible predicates" (García García 2014, 147). This class comprises positioning and substitution verbs, such as preceder and sustituir. In the appropriate readings, such predicates do not express an 'incline' in agentivity between their arguments. Thus, a-marking is a possible and perhaps even necessary strategy to differentiate subject and direct object.
(3) El artículo acompaña al/ *el sustantivo.
    the article accompany.prs.3sg dom+the/ the noun
    'The article accompanies the noun.' (García García 2014, 144)
Doubled structures: A further generalization that has emerged from the study of a-marking on inanimates is that certain more complex structures involving secondary predication allow for the a-marking of objects that would be incompatible with it according to the animacy criterion. One case in point is that of verbs which allow for double accusative constructions, such as considerar ('consider'), llamar ('call') and caracterizar ('characterize').
(4) Algunos gramáticos […] no consideran oración a
    some grammarians neg consider.prs.3pl sentence dom
    la secuencia con verbo.
    the sequence with verb
    'Some grammarians do not consider the sequence with a verb (to be) a sentence.'
(5) Consideren estos datos.
    consider.imp.3pl these data
    'Consider these data.' (García García 2014, 49)
In contrast to (5), where a-marking is excluded, there are two elements qualified to fill the direct object slot in (4): the predicative complement oración and the complex NP complement la secuencia con verbo. Such configurations are reported to show high frequencies of a-marking (Weissenrieder 1991, 150). Interestingly, López (2012, 10) claims that in such configurations ("small clause complements" in his terminology) an animate argument is also obligatorily marked if it is indefinite and non-specific (6).
(6) Considero *(a) un estudiante inteligente.
    consider.prs.1sg dom a student intelligent
    'I consider a student to be intelligent.'
García García (2014, 103) presents a more differentiated picture of such constructions. One of the results of his corpus study suggests that the adjacency of the two "objects" is a decisive factor. Sentences where the direct object and the predicative are not adjacent showed a-marking in only 21% of cases, whereas adjacent constructions confirm López' intuition and exhibit a-marking in 100% of cases.
Ditransitive sentences, in which the indirect object typically is an animate NP, represent another case of doubled structures, in the sense that if the direct object of such a sentence is animate and specific, both objects look the same overtly. It has been reported that the a-marking of the direct object is highly disfavoured in such structures.
present.prf.3sg dom his woman to his friends
'Pedro introduced his wife to his friends.' (García García 2014, 53)
Complex objects: AcI structures (8) are similar to the double accusative structures presented above in that they also have an object-related secondary predication (García García 2014, 51). These constructions also allow for the a-marking of inanimate objects, especially if the object receives a more agentive description (Torrego 1999, 1792).
(García García 2014, 51-52)
López (2012, 23-25) also discusses such constructions ("clause union") and observes that for perception and causation verbs, a-marking for animate indefinites is obligatory regardless of specificity. García García (2014, 106) reports that such causative structures also allow for the a-marking of inanimate objects.
(López 2012, 24)
Secondary predicates also play a role in the context of the transitive verb tener, which is notorious for rejecting a-marking in most contexts: "A marked object is ungrammatical as the complement of haber 'have' (existential) and tener 'have' (possessor or relator). […] The data surrounding tener are extremely intricate. Tener can mean something close to 'hold' or 'get', in which case a marked object is possible. The VP headed by tener can include a secondary predicate, in which case a marked object is again possible" (López 2012, 20).
have.prs.3sg dom a son in the army
'María has a son in the army.'
García García (2014, 50) presents similar data and adds that inanimates may also be a-marked in such constructions. He also claims that a-marking is the preferred option with animates:
have.prs.3sg dom a computer calculate.ger the problem
'Ana has a computer calculating the problem.'
One possible explanation for these findings is that the marked direct object in such configurations can be interpreted as the subject of the secondary predicate and hence as having more agentive properties. It is not the goal of this study to explore the patterns introduced in this Section in greater detail or from a theoretical perspective. Rather, they are listed and explained in order to show that determining the status of DOM in a given variety of Spanish involves taking into consideration very different configurations, and to introduce the types of structures that have been included in the elicitation experiment, where all the distinctions mentioned above are taken into account.

Methodological and grammatical considerations with respect to the elicitation experiment
As outlined above, the goal of this study was to collect equivalent language production data on the variational properties of DOM in Spanish and to compare different contact scenarios. Previous empirical studies using spontaneous spoken data reported considerable variation in the use of Spanish DOM in certain configurations. However, it is not clear to what degree different corpora of spontaneous spoken language from different varieties are actually comparable. Furthermore, many of the configurations introduced above do not occur frequently enough in common corpora to allow for a solid understanding of their behaviour. Both of these issues can be addressed by controlled elicitation. We decided to collect production data rather than acceptability judgments as a first step, because our primary interest was to know what structures speakers actually produce. In Likert-scale acceptability ratings, strictly speaking, only contrasts between sentences can be interpreted. Hence, if there is a correlation between acceptability and use, it is only an indirect one. As mentioned in the introductory section, ideally we should work towards a combination of methods, and this study adds a hitherto missing type of data to the overall picture.
Apart from the methodological considerations, the choice of grammatical configurations that have been included in the study, as well as the number of configurations tested, need further explanation. Obviously, not all DOM-sensitive phenomena described above can be tested thoroughly in one experiment. Therefore, we decided to focus on two configurations, which represent more than half of the experimental items, in order to achieve robust results for them. Four additional configurations are included in the remaining part of experimental items. The data on these four constructions individually are not as robust as for the first two, but they can still provide an exploratory impression of what is possible in these configurations. Section 3.1 provides a detailed description of these six configurations and how they were implemented in the experiment. Since one of the main goals of the experiment was to collect variational data, all these configurations include indefinite and/or inanimate objects in at least one manipulation, often contrasting them with animate and/or definite objects. Therefore, the experiment does not test configurations where a-marking has been shown to be obligatory in previous research, such as strong pronouns and proper names (for the latter cf. example 1a above). Often, Spanish DOM is explained or characterized by making use of the animacy/definiteness hierarchy, which is given in a simplified form here in (15).
(15) pronouns > proper names > definite, animate > indefinite, animate > inanimate
There is wide consensus in the literature that, starting from the left side of the scale, a-marking is categorical with pronouns, proper names and definite animates. This is why our study focuses on the "right side" of the scale, where things are less clear and where variation is expected. For the sake of concision, other DOM-related phenomena, such as clitic doubling and leísmo, among others, will also not be discussed in this study.

Multilingual acquisition, language attrition and contact
In contrast to the detailed accounts of DOM in Spanish or other individual languages, there are only a few studies dedicated to DOM in multilingual settings. The Spanish-English contact scenario is among the best explored of such constellations: Ticio (2015) investigated the early acquisition of the Spanish DOM system by children growing up in a simultaneous acquisition scenario, while Montrul/Bowles (2009) is a study of DOM in heritage speakers of Spanish in the United States. Accounts of other constellations include Döhla (2011), who discusses different contact scenarios with American Indian languages, and Montrul/Gürel (2015) and  presenting experimental data of learners of Spanish in Turkey and Romania, respectively. Montrul/Bowles (2009) consider heritage speakers in two experiments which include a general proficiency test, an oral production task, and different acceptability judgment tasks. They show that lower proficiency tends to correlate with a decrease in the production of the a-marking of objects that should be marked, and with increasing insecurity in the acceptability judgment tasks. Ticio (2015) finds that, in contrast to monolinguals, bilingual children did not acquire the DOM system in the period under study (until the age of 3;6) and that bilingual acquisition differs from monolingual acquisition in a fundamental way: "[...] DOM seems to be difficult or almost impossible to acquire for L2 learners, and it results in a range of error productions among HS and adult or school-age bilinguals" (Ticio 2015, 70). Similar findings had already been reported by Montrul/Sánchez-Walker (2013) for school-age Spanish-English bilingual children, with omission rates of expected marking of over 65% in some cases.
In contrast to Ticio's claims, Döhla (2011, 27) speculates that "[s]ince DOM is very common [cross-linguistically], we suppose that, in case of language contact, and first and foremost bilingualism, a language with DOM can easily transfer the morphosyntactic feature to another language without DOM or exert influence on another language that exhibits DOM." The same author discusses examples of American Indian languages that presumably already had a DOM system prior to contact with Spanish, and he suggests that in these cases contact does not play a role. On the other hand, for indigenous languages with more recent traits of DOM, Döhla suggests that Spanish might very well have triggered or reinforced its evolution. As a prime example, he cites Paraguayan Guaraní, which exhibits a DOM system similar to that of Spanish. The author concludes that more empirical data is necessary in order to assess the role of contact in all the scenarios discussed.
The basic idea in both Montrul/Gürel (2015) and  is that the existence of a DOM system in the L1, as in Turkish or Romanian, might enhance the acquisition of DOM in another language, such as Spanish, despite some structural differences. They derive the predictions of their study from the so-called Feature Reassembly Hypothesis, according to which grammatical features of lexical and functional items are bundled differently from one language to the other. Consequently, L2 learners would need to work out how the features are bundled in the target language. In this process, reconfiguration of the feature bundles of L1 comes into play (Montrul/Gürel 2015, 290). The results of these studies confirm this basic assumption: Among the Turkish participants, even L3 learners with lower proficiency perform quite well, while learners with higher proficiency significantly outperform the Spanish-English bilinguals and heritage speakers from the previously reported studies. For Romanian, a language genetically and structurally closer to Spanish, the enhancement effect is even stronger than for Turkish. This is not the place to discuss the different theories of acquisition on which these works are based. The goal of the present study is not to argue in favour of or against a certain model of language contact or acquisition, but rather to begin filling an empirical gap in the literature: There is hardly any empirical work on Spanish DOM in scenarios of contact involving bi- or multilingual territories, such as certain regions of the Andes (cf. also Mayer/Sánchez, this volume) or the Misiones Province in Argentina. The following Section summarizes the most important facts about this linguistic space for the purpose of this paper.

The contact scenarios: Andean Spanish and multilingualism in Misiones
Andean Spanish has been identified as a supranational macrovariety of Spanish showing a series of features at all structural levels that diverge from normative standards. Many of these features have been described and studied in some detail. Escobar (2011) provides a detailed overview of the literature here, with a special focus on the Spanish-Quechua contact scenario, which plays a crucial role in the development of this variety. It is well known for a tendency towards OV word order in contrast to other varieties of Spanish; it has some morphological and many lexical borrowings from Quechua, as well as some phonological peculiarities, such as the distinction between /ʝ/ and /ʎ/ (otherwise uncommon in American varieties), strengthening and preservation of consonants and reduction of unstressed vowels, among many other features. Interestingly, however, there is no mention of DOM in the literature on this contact scenario. Mayer/Sánchez (this volume) discuss Spanish-Quechua contact data from Huánuco (central Peru) among other contact scenarios in Peru. According to their data, a-marking is quite frequent in Huánuco Spanish, unlike in contact scenarios with languages such as Asháninka or Shipibo. On a more anecdotal note, one could also mention the possible emergence of a new DOM marker in the variety of Cajamarca (northern Peru). In this variety, the substitution of the DOM marker a with onde has been documented in the writings of Ciro Alegría, whose rural characters from that region use this form (Bossong 2008, 93). However, the precise status of this form is unclear. For the southern regions of Peru, including Cusco, even less is known. The data and analysis presented below are therefore first steps towards filling this gap.
The northeastern Argentinian region of Misiones has only recently come to be known as a crossroads of language contact. Originally colonized by Jesuits, it was subject to a territorial dispute between Brazil, Paraguay and Argentina until the end of the 19th century, a dispute related to the Paraguayan Wars, and is now the youngest province of northern Argentina. At the turn of the 20th century, it was almost entirely repopulated by foreign settlers, many of whom were from Central Europe (e.g. Ukraine, Poland, Germany). While the remnants of Slavic and Germanic linguistic heritage are still detectable, the majority language of the Province today is Spanish. Current contact languages are Portuguese, particularly in the villages on the banks of the river Uruguay (bordering Brazil), and Guaraní, which is still spoken within the indigenous population. While the population is conscious of its plurilingualism and its particular linguistic identity, a comprehensive description of the province's linguistic situation remains a desideratum: whereas the rather impressionistic description of the habla misionera by Amable (1975) focusses on lexicon and phraseology, an unpublished dissertation by Sanicky (1981) concentrates on phonology. Recent work by de Ramos (2017) confirms the existence of widespread "leísmo", already briefly mentioned in cross-variational studies (Fernández-Ordóñez 1999, 1347-1349), which is attributed to language contact with Guaraní and possibly related to DOM.

The elicitation study
As noted in the introductory section, the elicitation tasks combine different DOM-sensitive contexts under one general setting. The tasks consist of spontaneously producing a sentence in which input material presented on a display has to be used and some very general instructions followed. Sentences are recorded by means of the SpeechRecorder software (Draxler/Jänsch 2004) for subsequent analysis, in this case, checking for the presence of a-marking on direct objects.

Design and materials
While following an experimental design and striving to avoid confounding factors, we were also interested in receiving the most natural output possible. For some sentences, it was important to ensure specific reference, hence a context sentence had to be included. For others, the relative order of the direct object with other elements had to be established. This implied a delicate balance between nudging towards the intended structures and favouring spontaneous speech (cf. Bautista-Maldonado/Montrul 2019 for a similar technique). Four different versions of the production task were implemented: In Task1, a context sentence was presented (in black letters) together with some additional unconnected words (in red letters), which had to be used in the production task. The additional unconnected words consisted of two NPs and an inflected verb. Participants were asked to create a sentence with the two NPs and the verb, taking the context sentence into consideration. They were explicitly allowed to add more words to the sentence and to arrange the presented material and the additionally included words as they liked. In Task2, a sentence (in black letters) was presented together with two NPs and an inflected verb (in red letters) and participants were asked to paraphrase the presented sentence with the two NPs and the inflected verb. They were explicitly allowed to add more words to the sentence and to arrange the presented material and the additionally included words as they liked. In Task3 and Task4, unconnected words and phrases were presented (in red letters) and participants were asked to build a sentence with this material. They were explicitly allowed to add more words to the sentence. While Task3 required maintaining the relative order of the presented words, Task4 permitted the rearrangement of the presented material (and any additionally included words) at the participant's discretion. 
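The four task versions described above differ only in what is shown in black and in whether the presented material may be rearranged. As a reading aid, they can be summarized in a small data structure. The following Python sketch is purely illustrative: the class name, field names and helper function are hypothetical and do not correspond to the software used in the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskType:
    """One version of the production task (hypothetical representation)."""
    name: str
    black_text: str      # what is shown in black; "" if nothing is shown
    may_add_words: bool  # participants may add further words
    may_reorder: bool    # participants may rearrange the red material


# Properties taken from the prose description of Task1 to Task4.
TASKS = (
    TaskType("Task1", "context sentence", True, True),
    TaskType("Task2", "sentence to be paraphrased", True, True),
    TaskType("Task3", "", True, False),
    TaskType("Task4", "", True, True),
)


def fixed_order_tasks(tasks=TASKS):
    """Return the names of the tasks in which the given word order must be kept."""
    return [task.name for task in tasks if not task.may_reorder]


print(fixed_order_tasks())
```

Laid out this way, it is easy to see that only Task3 fixes the relative order of the presented words, while all four versions allow additional words.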
Figure 2 gives one example of how Task1 was prompted by written stimuli and displayed on the participants' screen.
[Figure 2: example of a Task1 stimulus as displayed on the participants' screen]
As can be seen in Figure 2, context sentences or sentences to be paraphrased were presented in black on a white screen. The unconnected words for sentence production were presented in a separate line below in red letters. These chunks of material for sentence construction were graphically separated from each other in the presentation of stimuli. Usually, a vertical bar separated words or phrases. If two nominal arguments were presented, sometimes an arrow was used between them. This arrow was included as a non-verbal strategy to induce transitivity. Thus, the arrow always pointed from the potential subject to the potential object or, in the double accusative set (see below), from the potential object NP to the predicative complement. Participants were not explicitly told about the function of the arrow. Upon request, it was explained that it represented a connection in meaning between the two elements which had to be involved with one another in the sentence. This separation by bars and arrows is included in the reproduction of the example material below.
Six different types of experimental items were created in order to cover the grammatical configurations introduced in Section 1.2, resulting in a total of 40 experimental items. All six sets of these items crossed two conditions (2x2 design), and the items were distributed over four lists, with each participant assigned one list. In this way, each participant saw each item in only one of the four conditions and all participants saw the same number of conditions of each set. The first set contained four repeated measures per condition and list and the second set two repeated measures per condition and list. These lists were randomized by the experimental software for each participant.
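The distribution of items over lists can be illustrated with a short Python sketch of a Latin-square rotation. The set sizes follow the description in the text (16 items in the first set, eight in the second, four in each of the remaining four sets, 40 in total). Everything else is an assumption made for illustration: the rotation rule, the set labels, the function names and the seeded shuffle are hypothetical, since the text does not state how the lists were actually constructed.

```python
import random
from collections import Counter

# Items per set as described in the text: 16 + 8 + 4 * 4 = 40.
SET_SIZES = {"set1": 16, "set2": 8, "set3": 4, "set4": 4, "set5": 4, "set6": 4}
N_CONDITIONS = 4  # the four cells of each 2x2 design
N_LISTS = 4


def build_lists():
    """Assign every item to exactly one condition per list (assumed rotation)."""
    lists = [[] for _ in range(N_LISTS)]
    for set_name, size in SET_SIZES.items():
        for item in range(size):
            for list_index in range(N_LISTS):
                # Shifting by the list index rotates each item through all conditions.
                condition = (item + list_index) % N_CONDITIONS
                lists[list_index].append((set_name, item, condition))
    return lists


def presentation_order(list_items, seed):
    """Return a per-participant random order of one list (assumed seeded shuffle)."""
    rng = random.Random(seed)
    return rng.sample(list_items, k=len(list_items))


lists = build_lists()
# Number of items per (set, condition) cell in the first list.
counts = Counter((set_name, condition) for set_name, _, condition in lists[0])
```

Under these assumptions, every list contains 40 items, with four items per condition in the first set, two per condition in the second set, and one per condition in each of the remaining sets.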
The first and largest set of items was aimed at comparing specific and non-specific indefinite objects (animate and inanimate) in combination with verbs presenting different degrees of affectedness. In order to ensure specific readings, three context sentences were created for every item. While the first described a scenario without introducing any referent, the second introduced two referents as potential subjects related to two potential animate objects, and the third used the same potential subjects but combined with two inanimate potential objects. The material for sentence construction presented along with the context sentences contained two indefinite NPs that matched the introduced referents of the second context sentence. (16) shows such an item with the three context sentences and the two sets of unconnected linguistic material that were presented for sentence construction.
(16) Complete materials of one experimental item of the specificity/animacy set: Context sentences with and without introducing the animate object referents:
    a. En el puerto se encontraron muchas personas.
       'At the harbor, many people met.'
    b. En un crucero, dos pasajeros y dos tripulantes se encontraron en el mismo bar.
       'On a cruise ship, two passengers and two crew members met in the same bar.'
    Stimuli presented with each of the context sentences (a) and (b)
This design allows us to observe putative interactions between specificity and animacy. Since it has been claimed that a higher degree of affectedness enhances a-marking, we controlled for this by including four verbs of each of the affectedness groups defined by von Heusinger/Kaiser (2011), cf. Figure 1. In total, the specificity/animacy set had 16 items and the corresponding task was Task1.
The second set was created in order to test reversible predicates. The predictions from the literature are that inanimates are always a-marked in symmetric configurations (both arguments inanimate) and that inanimate objects are also a-marked with animates in the so-called "reversible" interpretations. Eight verbs that allow for reversible structures were selected and combined with animate as well as inanimate subjects and objects in all four possible configurations, resulting in four target sentences per verb/item. For each sentence, a paraphrase was created using the two arguments but not the same verb. This paraphrase was presented in Task2 as a point of departure. The participants were asked to paraphrase the presented sentence using the reversible verb and the two arguments. Such an experimental item is given in (17). Given that all four possible configurations of animate vs. inanimate in subject and object positions were included, the results can be analyzed with respect to relative agentivity as well as for the claim of obligatory a-marking in reversible readings.
Set 3 tested double accusative structures. Four verbs licensing such structures were chosen and combined with two sets of two possible objects each. The first set contained an animate NP as a potentially referential expression and a noun that was more apt to serve as a predicative complement. The second set contained an inanimate NP as a potentially referential expression and also a noun that was more apt to serve as a predicative complement. In this way, the claim about high rates of a-marking of inanimates can be verified and a-marking of inanimates and animates can be directly compared. Since it is difficult to control for (non-)adjacency of the two objects within the general design of the study (cf. the findings of García García 2014, 103, presented in Section 1.2), we decided to test for a related factor, namely canonical syntactic configuration vs. structures that included displacement. "Displacement" was created by manipulating the relative order of the two objects. The items were presented in the following way: first, the verb in the third person plural of the past tense (indefinido), and second, the two potential direct objects, either the set with the animate or the set with the inanimate NP. By also manipulating the order of the two potential objects, four conditions were created, as shown in Example (18). Only the material for the construction of the sentence was presented (no context etc.) and participants were asked not to change the given order of the words (Task3).
Set 4 was intended to look at ditransitive structures. Four verbs with a "transferential" argument structure were chosen and combined with a person name, a second NP containing a kinship noun and a third NP denoting either an animal or an inanimate. The second and third NP were presented either with an indefinite article or a possessive. Example (19) shows the four manipulations of one item in this group. In this way, the effects of ditransitive structures can be tested for interactions between animacy and definiteness.
Set 5 was created to assess DOM in AcI structures. More precisely, it tests whether the potential degree of agentivity of the object would increase the rate of a-marking for human indefinites. Four animate nouns were chosen as possible objects and combined with appropriate verbs in the infinitive. Each of these four nouns was then presented either with a causative verb or a perception verb in the third person plural of the perfective past tense (indefinido) and with an intensification adverb modifying the infinitive or no further modification, again resulting in four manipulations, as presented in (20). The causation/perception manipulation provides a contrast of agentivity from verbal semantics, and the intensification adverb, according to the literature, could further increase the agentivity of the object.
Finally, Set 6 tested secondary predicates of the verb tener in combination with agentivity contrasts. The items were created in the following way: kinship term / muy + adjective / gerund (secondary predicate verb) + location / tener in the first person singular of the present tense. Four kinship terms were combined with four manipulations: the verb either denoted a more or less strenuous activity and the adjective denoted a low or a high degree of involvement. The location was chosen to fit the activity expressed by the verb (cf. examples in 21). This allowed us to compare different degrees and sources of agentivity. We considered adjectives overtly expressing strong emotions like happy, excited and agitated to convey more involvement than adjectives not overtly expressing emotions but other transitory states, such as relaxed, exhausted or ill.
The elicitation was conducted in the following way: The stimulus material was presented on a separate screen for participants, which was connected to the laptop of the experimenter. The audio data was recorded by a directional microphone (RØDE NGT4), also connected to the same laptop via an audio interface (ZOOM U-22). Participants only saw their own screen. The experimenter was seated across the table and had an overview of the experiment from the laptop. The instructions were presented on the screen and commented on by the experimenter. First, the four tasks were introduced. The experimenter told the participants that they were free to add words to the ones presented to them in order to create the sentence, but that they had to use all those that were presented without modifying their form, and also that they were allowed to arrange the words as they liked, except in Task3. Participants were then informed that the tasks would be randomized and that they would always receive a brief instruction for whichever of the four tasks they had to perform. They were also given the total number of sentences to be created. 
After these general instructions, the recording procedure was explained: Participants would be given time to read the complete instructions and stimulus material on the screen and to think. Once they had the sentence in mind and gave the experimenter a signal, the microphone was activated and their answer was recorded. Participants were able to ask questions between the recordings, but the experimenter would not comment on possible sentences they created. The experimenter would only intervene in the following cases: (i) if a participant distributed the stimulus words in more than one sentence (including coordinated structures); (ii) if a participant modified the stimulus material (verb form, determiner, etc.); (iii) if a participant changed the predefined word order in Task3. The experiment was carried out in closed rooms whenever possible, although with some Quechua L1 speakers in Cusco, recordings had to be conducted outside.

Participants
Participants were recruited randomly through a variety of strategies, such as social media, personal contacts and by spontaneously inviting people in public spaces. In the urban areas, we restricted the pool of participants to university students, excluding students of disciplines with an analytical focus on language (philology, linguistics, literature, etc.). For the rural regions, it was not possible to maintain this restriction. Participants received a monetary reward for their participation. All participants understood that participation was entirely voluntary and that they could interrupt or stop their participation at any time.
In this study, we report the results of 32 participants, 16 from Argentina and Montevideo and 16 from Peru. Half of the participants of each region came from a bilingual location. For Cusco and Misiones, this is the total number of participants in our sample. The same number of participants was randomly selected from larger samples taken in Lima and Montevideo for comparison. Table 1 provides more information about the participants, the total number of recorded sentences and the number of "collaborative" sentences, i.e. sentences that could be included in the results and used for the subsequent analysis. The opening lines of Section 3.3 explain the annotation of the recordings and the classification of the elicited sentences in detail. The participants from Lima and Montevideo all had Spanish as their L1, and they lived and studied in their respective cities. As for knowledge of further languages, the students from Lima all had some command of English, with one student additionally mentioning French and one Portuguese. The students from Montevideo had also learned English to different degrees, with some additionally mentioning Portuguese or Italian. Participants from Misiones all had Spanish as their L1 but reported regular contact with Portuguese in their daily lives. They were accustomed to watching Portuguese television, to seeing and hearing the language in other media and to using it with Portuguese-speaking people. Most also mentioned Guaraní, German, Polish or Russian as heritage languages still spoken by their elders, but they claimed not to be able to speak such languages themselves (except for one participant, who had an active knowledge of German). All participants had finished secondary school; one was currently attending college but had not yet graduated. Among the other seven, five worked in agriculture and commerce and two had retired.
The participants from Cusco were more heterogeneous in their linguistic profiles. Four lived and worked in Cusco City and were only fluent in Spanish. Some had some basic knowledge of English and some reported understanding isolated Quechua words but not being able to speak the language. They all held university degrees and worked in the local university administration or in the tourist sector. The other four participants came from the surrounding areas of Cusco, were native speakers of Quechua and acquired Spanish at school. They were all bilingual in Spanish and Quechua and reported switching freely between the two languages, although most of them preferred to speak Quechua whenever possible. One of them had only finished primary school, two were secondary school graduates and one was a student at the local university. They all worked in agriculture and transportation.

Results
The experimental design had four lists, and each participant of each group received a different list, yielding a complete set of responses for the entire elicitation experiment. Unfortunately, 20 responses to one of the bilingual lists in Cusco were not recorded due to a technical failure. Recordings were transcribed and annotated for further analysis. In a first step, the transcription of each item was compared with the expected output in order to determine whether the participant had been cooperative, partly cooperative, or uncooperative. A trial was considered cooperative if the participant used the direct object and, where applicable, the subject according to the outlined transitive structure. If the participant uttered a sentence with a transitive structure but did not use the arguments as expected, this was considered partly cooperative. Other utterances were discarded as uncooperative. For the analysis, we considered cooperative trials as well as those partly cooperative trials where the grammatical configuration established in the condition was not violated. Thus, if animate subject and animate object exchanged places within a transitive sentence, this would still be taken into consideration for analysis.
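The inclusion rule described above can be written out as a minimal sketch. The function names and the three boolean inputs are hypothetical labels for the annotation decisions and are not part of the original annotation scheme.

```python
def classify(transitive: bool, arguments_as_expected: bool) -> str:
    """Label a trial following the three-way distinction in the text."""
    if transitive and arguments_as_expected:
        return "cooperative"
    if transitive:
        # transitive structure, but arguments not used as expected
        return "partly cooperative"
    return "uncooperative"


def included(transitive: bool, arguments_as_expected: bool,
             configuration_preserved: bool) -> bool:
    """Keep cooperative trials, plus partly cooperative ones whose
    condition-defining grammatical configuration is not violated."""
    label = classify(transitive, arguments_as_expected)
    if label == "cooperative":
        return True
    return label == "partly cooperative" and configuration_preserved


# Animate subject and animate object swapped: still analyzed.
print(included(True, False, True))
```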
For the purpose of comparison, we established idealized predictions about each individual condition, based on findings in the literature. For configurations claimed to have obligatory a-marking, we set the expectation to 100% of a-marking responses, and for configurations reported to reject a-marking the expectation was 0%. When variation was expected, we set the prediction to 50%. For instance, according to previous studies, a-marking is considered to be obligatory in all four manipulations of the reversible predicates dataset (set 2, cf. example 17 above). Hence, the general expectation would be 100% a-marking in this subset of responses. In the case of the animacy/specificity dataset, only 37.5% of a-marking is predicted, more specifically 100% for specific animates, 50% for unspecific animates and 0% for both inanimate conditions, cf. example (16) above. This allows us to calculate an overall expectation of a-marking for the whole experiment. Table 2 reports the expected overall performance together with that of the different regions and localities. Table 2 suggests that overall, the Argentina/Montevideo group performed almost exactly as expected, whereas Peru showed a considerably lower overall percentage of a-marking. However, looking more closely at the four localities under investigation, it transpires that the monolingual groups for both regions actually outperform the predictions, while the multilingual groups, especially Cusco, show a drop in the overall rate of a-marking. In what follows we will first take a closer look at the variation found in dataset 1 (specificity and animacy) and 2 (reversible predicates), since the robustness of these datasets allows us to identify interesting patterns of variation; we will then consider trends in individual performances, where the remaining four configurations of the experiment will also be taken into consideration.
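The expectation for a dataset is the mean of its per-condition predictions. The following minimal sketch reproduces the two figures given above (37.5% for set 1, 100% for set 2), assuming that every condition contributes the same number of items; the dictionary keys are illustrative labels only.

```python
# Idealized predictions per condition: obligatory = 1.0, variable = 0.5,
# rejected = 0.0 (values as stated in the text).
expected_set1 = {
    ("animate", "specific"): 1.0,
    ("animate", "non-specific"): 0.5,
    ("inanimate", "specific"): 0.0,
    ("inanimate", "non-specific"): 0.0,
}

# Set 2 (reversible predicates): a-marking is claimed to be obligatory
# in all four subject/object animacy configurations.
expected_set2 = {
    (subject, obj): 1.0
    for subject in ("animate", "inanimate")
    for obj in ("animate", "inanimate")
}


def expected_rate(conditions):
    """Mean expected a-marking rate, assuming equally weighted conditions."""
    return sum(conditions.values()) / len(conditions)


print(expected_rate(expected_set1))  # 0.375
print(expected_rate(expected_set2))  # 1.0
```

The overall expectation for the whole experiment would be obtained in the same way over all conditions of all six sets; the predictions for sets 3 to 6 are not listed here and are therefore left out of the sketch.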

Animacy and specificity
Before considering the variation between the different groups outlined above, a differentiation has to be made with respect to Cusco. While the participants from this area obviously all had some connection to and experience with Quechua, four of them had Spanish as their L1, while the other four were Quechua natives.
Our results show that this leads to a considerable contrast in performance, and therefore we will report the two groups from Cusco separately. Figure 3 shows the percentages of a-marking in the four manipulations of the first set of stimuli. As expected, a-marking with human nouns is mostly very high, although never reaching categorical marking. Cross-regional variation in the data is limited, with the exception of the only group that consists of non-native speakers of Spanish, namely the Quechua-Spanish bilinguals from Cusco.3 There, the rate of a-marking with human-reference nouns is roughly comparable to that of a-marking with inanimates in the other regions. Specificity plays a less prominent role than animacy in this dataset. Only in Misiones do specific human nouns trigger a-marking considerably more often than non-specific nouns. For inanimate objects, only Lima shows a notable contrast with respect to specificity. However, a clear tendency or interaction across groups cannot be identified, neither for animate- nor for inanimate-reference nouns. Furthermore, note that inanimates are a-marked in more than 10% of cases in Montevideo and Lima, at around 20% in Misiones, but almost never in Cusco.

Reversible predicates
For reversible predicates, the literature predicts categorical marking. While the overall rates of a-marking actually turn out to be very high, even for inanimate objects, the supposed generalization cannot be confirmed, as shown in Figure 4. Instead, we find an interesting pattern of variation across groups. Again, the L1 Quechua speakers hardly ever employ a-marking. The Spanish L1 group from Cusco, on the other hand, comes closest to categorical marking. Only in the "prototypical" pattern with animate subject and inanimate object do we find that the rate of a-marking is not 100%. Lima and Montevideo also have very high rates for the same three conditions, while in Misiones only the symmetric configuration with human nouns achieves very high percentages of a-marking, with lower rates of a-marking in general compared to the other Spanish L1 groups. In all varieties, the "prototypical" pattern produces the lowest rate of a-marking.

The remaining four configurations
The remaining four configurations will not be discussed individually because, due to the rather low number of observations in each group of speakers, the variation for particular conditions across groups could not be interpreted with a high degree of certainty. Nevertheless, the global pattern found for the previous two sets of stimuli is confirmed. Montevideo, Lima and Spanish L1 speakers from Cusco show the highest rates of a-marking in the expected conditions, while it remains close to zero in the L1 Quechua group from Cusco. Speakers from Misiones perform somewhere between these two extremes.
On closer inspection, Quechua L1 speakers only show some marginal a-marking in the AcI dataset, while it is zero for all other configurations. The AcI dataset has a-marking at very high or categorical levels for all manipulations in the other groups. Example (22) repeats the manipulations from Example (20) in a combined presentation.
(22) Hicieron / vieron correr (rápidamente) a un mensajero.
make.prf.3pl see.prf.3pl run.inf fast dom a messenger
'They {made a messenger run/saw a messenger running} (fast).'
The double accusative structures ('They considered a computer scientist an expert', cf. example 18 for details) are second highest in eliciting a-marking, while secondary predication with tener elicited the lowest numbers of all configurations (cf. example 21). The blocking effect expected for the ditransitive structures did not affect definite animate objects in Montevideo and Cusco (with Spanish L1) at all, while the data from Lima and Misiones show a drop in marking for this highly DOM-favouring context ('Cristina gave a parrot to her sister yesterday'). In the remaining three conditions, where we have either indefinite or inanimate features on objects (or both), there is only one single observation of a-marking in the whole dataset. This could mean that the blocking effect is stronger for configurations with "optional" marking, but further research is needed to confirm this possibility.
The shared characteristic of these four configurations (AcI, ditransitives, complex objects featuring predicatives or secondary predication) is that they are defined by some structural or constructional property. Thus, their specific syntax plays a more prominent role. The canonical transitive sentences of the first set of stimuli and the reversible predicates of set 2, on the other hand, are straightforward SVO sentences without further structural complexity, and the semantic or discourse properties of the object, such as animacy or definiteness, can be considered as more decisive for the use of a-marking than the specific structure of the sentence. This contrast will be used in the next Section for generating profiles of the performance of individual speakers.

Individual profiles
Comparing individual performances yields two further insights: First, we can observe the variability and dispersion within each group, something that has not been shown in the figures above; second, it allows us to see how much overlap there is between the different groups. For this analysis, we calculated two indices for each participant. The first index is the mean rate of a-marking in all conditions of the four predominantly syntax-driven configurations, and the second is the mean rate of a-marking in all conditions of the merely semantics-driven configurations. When plotted against each other, the picture in Figure 5 emerges. The EXP-point in Figure 5 represents the indices of the predicted results and is located roughly at the midpoint of both axes. Looking at the general pattern of the five groups, Montevideo and Lima show a denser clustering in the upper half of the plot with very few outliers and a high degree of overlap. In terms of the two axes, there seems to be hardly any difference with regard to syntax (y-axis) while it could be argued that Montevideo has somewhat higher a-marking rates on the semantic dimension (x-axis), since most of the respective dots are further to the right than those representing Lima. The three remaining groups show much more dispersion, also partly occupying the lower half of the plot. The eight speakers from Misiones have a wide range of dispersion on both axes. They are clearly the most dispersed group, showing no clear cluster. The Spanish L1 group from Cusco also shows more dispersion, but essentially on the syntactic dimension, while they cluster around the midpoint of the scale as far as semantics is concerned. Again, the Quechua L1 speakers from Cusco require special comment. As can be seen from the plot, and as already mentioned in footnote 3, a great deal of a-marking in this group is due to just one participant, while the other speakers show almost no marking at all. 
For two speakers, the overall rate of a-marking is zero, while one participant marked one object from the canonical transitive set and one from the reversible predicates set. The performance of this one exceptional Quechua L1 speaker is closer to the Spanish L1 groups than to the rest of the speakers of Quechua, but even as an outlier, his profile is still located in the transitional area between his fellow Quechua natives and the core of the Spanish L1 speakers.
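The two indices per participant can be illustrated with a minimal sketch. It assumes that each index is the proportion of a-marked responses pooled over all included trials of the relevant sets; the text could also be read as a mean over per-condition rates, which would differ when conditions have unequal numbers of trials. The set labels, the function name and the sample trials are hypothetical and do not come from the dataset.

```python
from collections import defaultdict

# Hypothetical labels for the six stimulus sets.
SYNTAX_DRIVEN = {"double_accusative", "ditransitive", "aci",
                 "secondary_predication"}
SEMANTICS_DRIVEN = {"canonical_transitive", "reversible"}


def marking_indices(trials):
    """trials: iterable of (participant, stimulus_set, a_marked) tuples.

    Returns {participant: {"syntax": rate, "semantics": rate}}, with None
    for a dimension in which the participant has no included trials.
    """
    pooled = defaultdict(lambda: {"syntax": [], "semantics": []})
    for participant, stimulus_set, a_marked in trials:
        if stimulus_set in SYNTAX_DRIVEN:
            pooled[participant]["syntax"].append(a_marked)
        elif stimulus_set in SEMANTICS_DRIVEN:
            pooled[participant]["semantics"].append(a_marked)
    return {
        participant: {
            dimension: (sum(marks) / len(marks) if marks else None)
            for dimension, marks in dimensions.items()
        }
        for participant, dimensions in pooled.items()
    }


# Invented sample data, for illustration only.
sample = [
    ("P1", "canonical_transitive", True),
    ("P1", "canonical_transitive", False),
    ("P1", "reversible", True),
    ("P1", "aci", True),
    ("P1", "ditransitive", False),
    ("P2", "reversible", False),
    ("P2", "aci", False),
]
indices = marking_indices(sample)
print(indices["P1"])
```

Each participant then corresponds to one point in a plot such as Figure 5, with the semantics index on the x-axis and the syntax index on the y-axis.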

Discussion
In this Section, we would like to focus on the following three issues with respect to the results presented above: (i) the reliability of the data, (ii) the variation found in the data and its import for claims in the literature, and (iii) the value of the results regarding the status of DOM in the examined varieties. While the data collection followed a strict experimental design and the same protocol in all locations, and thus allows for a high degree of comparability of the linguistic material under investigation, one question that arises is the robustness of the findings, since the sample of participants for each location is relatively small. Another issue that could be raised is that the elicitation tasks are somewhat artificial and hence might not represent normal language use. For both caveats, it is important to point out that the results presented are part of a larger research project on the variation of DOM in different locations of the Spanish-speaking world. Wall et al. (2020) present a more robust dataset of more than 40 participants from the same experiment in Lima and Montevideo. Expanding the dataset for these two varieties does not change the general tendencies drastically. In fact, for Lima all participants cluster around the same region indicated in Figure 5 above. While the expanded dataset for Montevideo shows more dispersion in this respect, it does not reach the amount found for Misiones. Unfortunately, there are no larger datasets for the contact zones. Nevertheless, there are more than 270 individual recorded sentences for each location in the dataset presented above, of which (still per region) more than 100 correspond to the canonical transitive set and more than 50 to the reversible predicates set. Thus, at least for these two datasets we have a considerable number of data points per speaker. It goes without saying that these results should by no means be considered as final and representative for the respective regions. 
However, this is true for any isolated experiment, for which replication is crucial. Regarding the two contact zones, it is furthermore unclear whether we should assume stable varieties in these contact scenarios in the first place, and it is even less clear what representativity would mean even for larger groups of speakers in those areas. What the findings of this study can provide is a first indication of putative differences in the two contact regions with respect to predominantly monolingual speakers. They also can give us a first impression of some general tendencies for canonical transitive sentences and reversible predicate constructions.
As for the artificiality of the elicitation process, participants were asked about their experiences with the tasks and as to the possible purpose of the experiment. Almost no one was able to guess the research subject; only one participant noted that he had been adding the preposition a to his sentences multiple times, but he was not able to identify the part of speech of interest to us or comment on the argument structure of the sentences. Some participants reported that they needed time to get used to the task, which was not a problem since there were no time constraints in the experiment. While the form of presentation of the stimuli requires a certain degree of literacy, which was checked for beforehand, none of the recruited participants found it impossible to construct sentences out of the presented material. Most found the experience interesting or challenging in a positive way, and none aborted the experiment. It is of course impossible to say whether participants would produce exactly the same amount of variation in more spontaneous conversations, yet the elicited sentences do reflect some of the general tendencies reported in the literature. Also, the results do not show inconsistencies or contradictory behaviour. In our experience, a-marking of direct objects is also relatively unsusceptible to the drawbacks of a more artificial elicitation task. The form does not carry any prominent social or expressive meaning and its use is highly unconscious and automatized. Neither in the literature nor in our fieldwork experience have we come across evidence suggesting that a-marking of direct objects might require a notable amount of preparatory processing or that performance constraints would have a strong impact on it. Therefore, we argue that our results are quite a good approximation to normal language use.
While our results reproduce tendencies that have been described in the literature, they do not fully match the predictions we derived from prior studies. As has been pointed out above, these predictions are idealized and should be taken with a grain of salt. However, we would also like to answer the question as to why the results diverge from the predictions in the way they do, at least for the two more robust datasets. One first case in point would be that animate specific indefinites should be categorically marked, and they are not. As Figure 3 shows, they are at around 80% for most groups, only exceptionally reaching 90%. Of course, reaching 100% in performance could be considered unrealistic in general, and even more so since the use of a marker has a probabilistic component. In our view, however, a score as low as 80% can probably not be attributed solely to confounding factors. Rather, we suspect that in addition, the context sentences did not always work as expected and the priming context intended to implicate specificity might not have been strong enough. This seems to be corroborated by the fact that the (non-)specificity manipulation of our context sentences did not produce contrasts in most groups and that in most of them, a-marking rates are indeed slightly higher when the referent was not introduced in the context sentence. Thus, the lower numbers of a-marking on animate indefinites (specific and non-specific) could be due to speakers constructing the (non-)specificity of those referents based on factors other than the cues from the context sentences. The role of the context sentence in this kind of elicitation study clearly needs further investigation.
The sentences with reversible predicates produced very high rates of a-marking in general but clearly did not lead to categorical marking, with the exception of the Spanish L1 group from Cusco, where we have categorical marking in three out of four conditions. It is important to recall that the arguments in this set of stimuli were given as definite NPs. Unlike the set of canonical transitive sentences, where the arguments were formally indefinite, these stimuli produce the expected high rate of a-marking, averaging around 90% in most groups. Here, context cannot be invoked to explain the lack of a-marking. Further investigation is needed in order to determine whether other factors might be involved here, or whether this is the range of the probabilistic component in DOM for definite animates in language use or whether it is a consequence of the given task after all. While this issue cannot be resolved here, what we can learn from our results is that the claims in the literature have been oversimplified, since they do not differentiate between the four possible combinations of (in)animate subjects and objects. Our results, however, show that for the "prototypical" alignment of animate subject and inanimate object, the rate of a-marking is considerably lower than for the other three conditions, although remaining above 50% in most groups. Another interesting observation is that the symmetrical alignment with inanimate arguments produces considerably higher rates of a-marking than the "prototypical" alignment. This is not only a new observation; it is also strong evidence for theories that argue for a "global" explanation of a-marking where not only local factors (i.e. the object domain and how the object is related to the verb) are considered as relevant, but also the configuration of other parts of the sentence (for instance, the type of subject).
Turning to the third and final open issue, as was noted in Section 1.3, this study focuses on the "righthand" side of the animacy and definiteness scale, as provided in (15). This is the area where variation is expected to be present, namely in the transition from definite animates to indefinite animates. Thus, pronouns and proper names are not included in the study. Nevertheless, a number of conclusions can still be drawn with respect to the other three categories, which constitute an important part of the scale. First, compared to the other four groups, the Quechua L1 group clearly stands out by producing almost no a-marking.4 It is of course not possible to determine whether the Quechua L1 speakers feature a DOM system as part of their Spanish grammar at all, given the previously mentioned restrictions of the study. In any case, that DOM system would not involve the marking of animate definites, which are well represented in the first two sets of stimuli. This finding is unexpected on the view that DOM should be easily transferable for speakers of languages that have nominative-accusative alignment in their L1 (Döhla 2011). If it is easily transferable to their L1, it should arguably first be easily acquirable in the L2. Quechua is a language with nominative-accusative alignment, but while our participants freely switch between Spanish and Quechua in their daily lives, they have not acquired DOM as expected. Compared to learners of Spanish whose L1 has DOM (such as Romanian or Turkish), who acquired a very good command of Spanish DOM after a few years (Montrul/Gürel 2015), this is remarkable. Thus, nominative-accusative alignment alone might not be sufficient for an "easy transfer". 
It should be recalled that for speakers of a nominative-accusative language lacking DOM, such as English, this property of Spanish is among the most difficult to master, and that in heritage speakers, the DOM system is among the first features to be lost by interference (Montrul/Bowles 2009; Montrul/Walcker-Mayer 2013). We cannot exclude the possibility that our Quechua speakers show a-marking on stressed pronouns, but even if they do, this system would be rather reminiscent of such rudimentary systems as those found, for instance, in Portuguese, but not the highly grammaticalized ones common in Spanish. Interestingly, the DOM system of the Spanish L1 speakers from Cusco comes closest to the predictions in the literature and shows little variation: for reversible predicates, categorical a-marking in three out of four conditions; for canonical transitive sentences, high rates of a-marking on animates and practically none on inanimates. The issue of whether this pattern is generalized among Spanish L1 speakers in that region needs further investigation.
Finally, for inanimate objects we have been able to confirm a-marking in the Rio de la Plata region and show that in Misiones and Lima similar rates can be expected in canonical transitive sentences. This is the first dataset that allows for such direct comparison. Tippets (2011) reports 8% of a-marking on inanimates for Buenos Aires, a slightly lower rate than ours for Montevideo, and considerably lower than the results from Misiones. However, since it is not clear whether Tippets only considered what we call canonical transitive sentences (we suspect that this was not the case), it is difficult to relate our findings to those. Already the direct comparison between canonical transitives and reversible predicates shows that although both sentences have a simple SVO structure, a-marking rates are very different. We can expect this contrast to become stronger for different and more complex structures.

Summary and conclusion
We have presented a new elicitation tool for collecting highly comparable datasets on the variational range of DOM in Spanish, and the first results for four varieties from two regions in South America. The focus of the study was on two varieties from zones where Spanish is in contact with other languages, namely Cusco and Misiones. For both contact zones we included reference groups from predominantly monolingual surroundings: Lima as a point of comparison for Cusco and Montevideo as a representative of River Plate Spanish and reference point for Misiones. The stimuli included in the elicitation task provided data on six DOM-sensitive constructions, two of which were explored in this study in more detail.
As for the contact regions, the general findings were that Quechua L1 speakers produced almost no a-marking in any set of stimuli. We have not investigated the use of Quechua object syntax of these speakers, but given that most of them do not show signs of a developed DOM system in their Spanish, it seems doubtful that they have transferred DOM from Spanish into their variety of Quechua. Thus, although we did not investigate the contact language, a highly plausible interpretation is that no support has been found for the speculation in Döhla (2011) that DOM systems being common in many languages makes them easily transferable from one language to another. The findings of our study are in line with other recent empirical findings on multilingual settings and on the acquisition of Spanish DOM, which is not easily acquired when the L1 does not have a similar DOM system. The speakers from Misiones do have an articulated DOM system, yet it differs in some aspects from that of the predominantly monolingual speakers. For that region, we also found stronger contrasts between individual speakers than in the predominantly monolingual zones, where we observed stronger clustering.
While other experimental studies have focused on fewer phenomena in favour of more statistical power in the results, our findings are, for the time being, limited to a preliminary overview. However, the method used in the present study can also yield more robust quantitative results if the database is expanded. With our method, we were able to replicate several general tendencies described in the literature, such as a-marking on inanimates and high rates of a-marking in sentences featuring reversible predicates, and we showed that these cases have to be treated separately in empirical terms. We also provided, for the first time, a highly comparable dataset that allows for direct comparison of the variation in the varieties under investigation and that promises even more interesting findings once applied to further regions of the Spanish-speaking world.