Word segmentation is a task that every language learner must deal with. Segmenting words from fluent speech is complex because speech is not produced as words in isolation but as a string of words with no identifiable gaps marking word boundaries (Cutler & Otake, 1994; Gonzalez-Gomez & Nazzi, 2013). This poses a problem for infants (Cutler & Otake, 1994; LaCross, 2015; Mintz et al., 2018), who have yet to acquire a lexicon from speech: they must find out where words start and end in the speech stream. Previous research on infants’ word segmentation shows that infants rely on a range of cues from their ambient language to detect words in continuous speech (Echols, Crowhurst & Childers, 1997; Gonzalez-Gomez & Nazzi, 2013; Höhle & Weissenborn, 2003; Jusczyk, Houston & Newsome, 1999; Mattys & Jusczyk, 2001; Mintz et al., 2018). The word segmentation cues explored in infants so far include: (1) transitional/statistical probabilities, that is, the likelihood that a syllable occurs given the preceding syllable (e.g., Saffran, Aslin & Newport, 1996); (2) prosodic regularities, in particular rhythmic units such as the strong/weak stress pattern of words in languages like English (e.g., Echols et al., 1997; Orena & Polka, 2019); and (3) phonotactic regularities, that is, the rules or constraints governing the possible sound combinations, phoneme positions, and syllable structures in a language (e.g., Mattys & Jusczyk, 2001).
In the present study, we explore infants’ use of vowel harmony (henceforth, VH), a phonotactic constraint on the co-occurrence of vowels within a word or syllable, as a cue for early word segmentation. VH requires that the vowels of a disyllabic or multisyllabic word share specific phonological features. For example, in an Advanced Tongue Root (ATR) harmony language like Akan, a Kwa language of the Niger-Congo family, vowels are grouped into two sets according to tongue root posture: +ATR (/i, e, u, o/) and -ATR (/ɪ, ʊ, ɛ, ɔ/). The ATR VH constraint in Akan requires that, with few exceptions, only vowels from one set occur within a word, for example, +ATR: kube ‘water’; -ATR: ɛkɔm ‘hunger’. Since VH constraints require the vowels of adjacent syllables within a word to agree in a given feature, disharmony between two adjacent syllables may signal a likely word boundary. Studies on adult speakers of VH languages indeed show that adults use the absence of harmony between adjacent syllables as an indication of word boundary location (e.g., Suomi, McQueen, & Cutler, 1997; Vroomen, Tuomainen, & de Gelder, 1998).
Two previous studies have found evidence that infants are aware of vowel harmony cues and exploit them for segmentation (Van Kampen, Parmaksiz, van de Vijver & Höhle, 2008; Mintz et al., 2018). Van Kampen et al. (2008) investigated backness harmony in 9-month-old infants learning Turkish, a VH language, and found that infants used the absence of harmony between adjacent syllables as a word boundary cue for segmenting bisyllables from auditorily presented text passages. Mintz et al. (2018) investigated 7-month-old English-learning infants’ sensitivity to vowel harmony in a series of experiments, half of which focused on vowel identity and the other half on backness harmony. Their goal was to explore whether infants learning English, a language without VH, would rely on VH as a universal bias for speech segmentation. In their segmentation experiments, infants were briefly exposed (for less than a minute) to a continuous speech stream (e.g., …ditepubobidetupo…). At test, when presented with bisyllabic sequences that had been present in the speech stream, infants preferred listening to sequences like dite and pubo, which contain only front or only back vowels respectively, over sequences like detu and bodi, which combine front and back vowels. Results from these studies suggest that infants are sensitive to vowel harmony cues and can use them to segment words even after minimal exposure to speech containing vowel harmony patterns.
Previous studies on VH cues for word segmentation have focused on monolingually raised infants (Van Kampen et al., 2008: infants learning Turkish, a VH language; Mintz et al., 2018: infants learning English, a non-VH language); no study so far has examined infants simultaneously learning a VH and a non-VH language. Investigating infants with simultaneous exposure to diverse languages is relevant because multilingual exposure can affect how infants process their ambient language(s). Some prior research suggests that infants growing up bilingually might, in each of their two languages, perform on par with monolingual infants when segmenting speech. For example, monolingual French- and English-learning infants have been found to segment words from their native language but not from a non-native language, presumably because the two languages have different rhythm patterns (Orena & Polka, 2019; Polka & Sundara, 2003), whereas French-English bilingual infants can segment bisyllabic words from both languages. Likewise, in a different study with French-English bilingual infants, Orena and Polka (2019) tested infants in an intermixed dual-language task: infants heard English-French mixed text passages containing two target words, one from each language. The authors found that bilingual infants could segment bisyllabic words in both their dominant and their less dominant language. These findings are interesting because, in other domains of perception, such as perceptual preferences and speech discrimination, bilingual infants are often found to perform better in their dominant than in their non-dominant language (e.g., Liu & Kager, 2015; Sebastián-Gallés & Bosch, 2002). Together, these findings suggest that bilingual language experience has an impact on infants’ early word segmentation routines. Therefore, the type of input that the multilingual infants in our study receive (both VH and non-VH languages) and the relative amount of input in a vowel harmony language might be revealing about how these infants exploit harmony cues during segmentation.
We know from the literature (e.g., Gonzalez-Gomez et al., 2019; Van Kampen et al., 2008) that monolingual infants learning a VH language process their native language differently from infants learning a non-VH language: while the former show a listening preference for syllable sequences that follow the VH pattern of their input, the latter show no such preference, suggesting an influence of language input on infants’ speech perception. This raises the question of whether the degree of language exposure in multilingual infants affects their exploitation of VH cues in speech segmentation.
This study is the first to examine whether multilingually raised infants learning at least one ATR harmony language and at least one non-VH language can make use of vowel harmony when detecting word boundaries in fluent speech. We focus on a specific type of vowel harmony, Advanced Tongue Root (ATR) harmony in Akan, whose acquisition has not yet been investigated (apart from our own pre-registered study on ATR harmony preference, currently in progress; see https://osf.io/9m3z4). Exploring infants’ exposure to an ATR harmony language alongside non-vowel-harmony languages will contribute to our understanding of how the degree of exposure to multiple, diverse languages affects infants’ ability to use vowel harmony, specifically ATR harmony, for segmentation.
The aim of this research is to address two main questions. The first is whether multilingual infants exposed to at least one ATR harmony language and at least one non-vowel-harmony language rely on ATR harmony cues for speech segmentation. To address this question, multilingually raised infants performed a word segmentation task with naturally recorded passages in Akan (familiarization) followed by isolated bisyllabic nonwords (test), which were also embedded in the passages as target words. Within the passages, target words occurred in either a harmony context or a disharmony context, in which the vowels of the bisyllabic nonword either harmonized or disharmonized in ATR with the vowels of an attached suffix. The second question is whether the amount of exposure to an ATR harmony language modulates the use of harmony cues in speech segmentation.