Journal of English Language Studies

Multimodal learning environments allow teachers to present material rich in technical words and symbols to non-native learners through picture-print as well as picture-spoken words materials. Many laboratory and experimental studies have investigated the positive effect of multimodality on pronunciation learning, but little is known about applying it in teaching English consonants that are unavailable in the Indonesian language. This naturalistic case study explored the experience of five pre-service English teachers when learning English consonants unavailable in the Indonesian language, i.e., /ð/; /θ/; /ʃ/; and /ʒ/. The data were collected by means of students' reflective journals and an interview. Findings suggest that the participants utilized the picture-word materials to understand the consonants' phonetic symbols. The picture-spoken words materials facilitated students' sound production practices. Last, the students' motivation to acquire intelligibility and comprehensibility is shaped by their identity as pre-service English teachers, who believe that correct pronunciation reflects their professional competence and improves their confidence. © 2023 JELS and the Authors - Published by JELS.


INTRODUCTION
English communication is becoming increasingly important as people's mobility, collaborative study programs, business networks and information technology continue to grow. Unfortunately, non-native English speakers from various cultural backgrounds and native languages struggle with pronunciation (e.g., Hassan, 2014; Kosasih, 2021; Sahatsathatsana, 2017). Many errors made by non-native speakers might be attributed to unconscious influence from their native languages (Flege, 1995). The effect of the native language can be so strong that learners may be unaware of the variations. A number of studies have investigated these pronunciation problems. For example, a case study revealed that, as a result of first language interference, Sudanese students had problems with multi-pronounced English vowels and with consonants such as /z/ and /ð/, /s/ and /θ/, /b/ and /p/, /ʃ/ and /tʃ/ (Hassan, 2014). Similarly, Thai students admitted that the different sound systems of English and Thai influenced their pronunciation of [θ], [ð] and [ʤ] (Sahatsathatsana, 2017). Meanwhile, another study found many mispronunciations of /ə/, /θ/, /ŋ/ and /æ/ by Turkish students (Mehmet, 2015). All these errors might be caused by the absence of the corresponding sounds in the participants' respective first languages. Indonesian students face similar pronunciation problems (Aulia, 2018; Kosasih, 2021; Simarmata & Pardede, 2018; Tambunsaribu & Simatupang, 2021). Likewise, the consonant inventory, parts of which are unavailable in many of the world's languages, is proposed to be taught in the Lingua Franca Core (Jenkins, 2000).
In the context of classroom instruction, repetitive aural-oral drills are important to build up a store of sound-memory (Connor, 1980). This can be achieved by students paying close attention to the sample pronunciation of a word and practicing uttering the word. Language learning, however, is a multimodal endeavor. To improve pronunciation in a new language, learners access not only auditory information about speech sounds and patterns, but also visual information about articulatory movements and processes. Mayer (2001) proposed the multimodality principle in multimedia learning, under which material rich in technical words and symbols is presented to non-native learners in both pictures and printed text. Some research has investigated the contribution of bimodal input to pronunciation. Mitterer and McQueen (2009) found that Dutch participants who watched videos with subtitles developed more accurate L2 phonolexical representations than those who watched videos without subtitles. In another study, Wisniewska and Mora (2018) investigated L2 learners' ability to integrate auditory and textual input through L2-captioned video and found that the learners' speech processing skill grew. So, it was assumed that auditory and textual inputs facilitate pronunciation attainment. Likewise, audio-visual aids, such as articulatory gestures, were found useful for speech recognition and for building students' acceptable pronunciation because the two modalities allow learners' working memory capacity to expand (Li, 2016; Li & Somlak, 2019; Ong'onda & Muindi, 2016).
Skills and content knowledge are important for teachers; therefore, teacher education programs should emphasize the acquisition of these skills and this knowledge (Shulman, 1987). Several studies have examined how pre-service teachers of English as a Foreign Language view their English skills vis-à-vis their professional preparedness. For example, a survey concluded that pre-service teachers' two sources of Foreign Language Teaching Anxiety were their perceptions of a low level of language proficiency and fear of negative evaluation (Aydin, 2016). Likewise, the language barrier was a factor contributing to their professional preparedness (Hourani, 2013). Nevertheless, Sasmito and Wijaya (2022) found that there was no correlation between English skill and willingness to be a teacher. Meanwhile, Gungor and Yaylı (2012) concluded that the correlation between pre-service EFL teachers' self-efficacy and anxiety perceptions was low.
While previous studies were conducted under laboratory or experimental conditions, the present study was designed to fill the gap by analyzing the use of articulatory gestures in an actual pronunciation classroom. Involving pre-service English teachers, this research investigated the embodiment of multimodality instruction in teaching English consonants that are not found in the Indonesian language, i.e., /ð/; /θ/; /ʃ/; and /ʒ/. This study explored the Indonesian students' information processing, learning strategies and motivation when learning in a picture-print and picture-spoken words materials environment.

METHOD

Research Design
This study employed qualitative research. A case study design was utilized to provide a comprehensive and contextualized understanding of a phenomenon (Yin, 2009). In particular, the current research employed a case study to explore the experience of students in a multimodality pronunciation learning environment.

Participants and classroom context
This study involved five participants who voluntarily participated in the research. They were Kartini, Zahra, Faiza, Nebu and Malik (pseudonyms). This study took place at an English Education Department at a public university in Indonesia. For this study, four meetings were conducted on production, perception and representation of the English consonants unavailable in the Indonesian language, i.e., /ð/; /θ/; /ʃ/; and /ʒ/ (see Table 1). Mayer (2001) noted that the modality principle is likely to be effective under boundary conditions. Due to the characteristics of the students as non-native English learners, the study adopted both picture-print and picture-spoken words materials. Accordingly, every lesson began with presenting the picture-print materials, which consisted of the phonetic alphabets of the consonants and articulatory information of speech sounds. Then, assuming the students' comprehension of the technical words and symbols, the lessons proceeded with picture-spoken words materials, which provide the articulatory movements and processes of the consonants. Figures 1 and 2 illustrate samples of the multimodality materials adopted from Rachel's English (2016) and used for this study.

to make necessary adjustments to the questions. Some modifications were made to ambiguous questions and terms. Students' journal entries and interview were in Bahasa Indonesia, the participants' and authors' mother tongue, and were then translated into English by authors two and three.

Data Analysis Procedures
There were 20 reflective journal entries in total and a 75-minute interview transcription. To avoid bias, the data were analyzed independently by authors two and three. The analysis consisted of three phases. The first phase covered the students' experience in processing information in the multimodal learning environment. The second phase covered the students' learning strategies. Last, there was one unanticipated phase, the analysis of learning motivation. During the interview, participants mentioned the importance of English skills to their identity as pre-service teachers.
The reflective journals were submitted through Google Forms and then exported to Excel for analysis. The transcription was made manually by author two, and author three then checked it for accuracy against the original recordings. To further sort the data, each participant's transcript and journal entries were placed into two columns in Excel: the researcher's questions/prompts in the first column and the participant's responses in the second column (see Table 2). The analysis was conducted inductively as described in Creswell & Guetterman (2019). The first step was coding the data, which involved labeling the data. Then, the codes were examined for any overlapping and redundant data. Finally, the codes were collapsed into broad themes.
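The coding procedure described above can be illustrated with a minimal sketch, which is not the authors' actual Excel workflow. It assumes each spreadsheet row holds a prompt, a response, and the codes assigned by hand. The Zahra row follows the example in Table 2, while the other rows' prompts and codes, the codebook `CODE_TO_THEME`, and the function `collapse_into_themes` are hypothetical names introduced only for illustration.

```python
from collections import defaultdict

# Rows mirroring the two-column layout (prompt, response) plus hand-assigned codes.
# Only the Zahra row follows Table 2; the other prompts and codes are hypothetical.
rows = [
    {
        "participant": "Zahra",
        "prompt": "What about the video?",
        "response": "The video (picture-spoken words) helps to mimic "
                    "the process of making sounds.",
        "codes": ["mimic", "visualize"],
    },
    {
        "participant": "Malik",
        "prompt": "How did you use the materials?",  # hypothetical prompt
        "response": "I compare my pronunciation with the video materials.",
        "codes": ["compare"],  # hypothetical code
    },
    {
        "participant": "Kartini",
        "prompt": "How do you find English sounds?",  # hypothetical prompt
        "response": "English sounds are different from bahasa Indonesia.",
        "codes": ["unfamiliar sounds"],  # hypothetical code
    },
]

# Hypothetical codebook: the broad theme each code collapses into.
CODE_TO_THEME = {
    "mimic": "Learning strategy",
    "visualize": "Learning strategy",
    "compare": "Learning strategy",
    "unfamiliar sounds": "Information processing",
}


def collapse_into_themes(rows, code_to_theme):
    """Group codes under broad themes, dropping repeated codes."""
    themes = defaultdict(set)  # a set removes redundant (repeated) codes
    for row in rows:
        for code in row["codes"]:
            # Codes missing from the codebook are kept visible as "Uncoded".
            themes[code_to_theme.get(code, "Uncoded")].add(code)
    return {theme: sorted(codes) for theme, codes in themes.items()}


themes = collapse_into_themes(rows, CODE_TO_THEME)
for theme, codes in themes.items():
    print(f"{theme}: {'; '.join(codes)}")
```

Because two analysts coded the data independently, the same function could be run on each analyst's rows and the resulting theme dictionaries compared to locate disagreements.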

Researcher's question/prompt: What about the video?
Participant's response: The video (picture-spoken words) helps to mimic the process of making sounds. I visualize my pronunciation while watching the explanation.
Codes: Mimic; visualize
Theme: Learning strategy

RESULT
This study aimed at investigating how five pre-service English teachers processed information, applied learning strategies and developed motivation when learning four English consonants, i.e., /ð/; /θ/; /ʃ/; and /ʒ/, in multimodal learning environments. It was found that the picture-word materials allowed students to understand the consonants' phonetic symbols. Meanwhile, picture-spoken words materials were valuable for students' sound production practices. Last, the students' identity as pre-service teachers shaped their motivation to increase intelligibility and comprehensibility. The following paragraphs elaborate on each finding and provide discussion vis-à-vis the key findings of this research.

Information processing
Kartini, Zahra, and Faiza admitted that their pronunciation ability was overall not good. In regards to pronouncing the four consonants, they admitted to having problems because they are not accustomed to the sounds.
English sounds are different from bahasa Indonesia. I am not used to it. It's difficult. I need a lot of time to learn English pronunciation. (Kartini, Zahra, Faiza)
The students explained that the phonetic symbols of the consonants and their respective articulatory gestures help them to understand the sounds' representation. The materials also allow them to discover the sound production process. The video materials, which contain moving pictures and narration of the consonants' phonetic symbols and production, provide an opportunity for them to visualize the sound production process and practice accordingly. Last, the three students noted that they always require time to process information and that the two modes of learning materials accommodate this need. Nebu and Malik added that the picture-spoken materials helped them distinguish correct from incorrect pronunciation so that they could build more accurate sound production.

Learning strategies
Kartini and Zahra said that they often hesitate to ask questions during class, so they tend to review the lessons independently. Hence, these multimodal learning environments allow them to explore the materials by themselves and comprehend them further.
I am a slow learner. I feel that I always need extra time to understand compared to my friends, but I am too shy to ask the teacher. So, the video helps [me] to use the symbol, the sound production process and the pronunciation model at the same time.

DISCUSSION
Informed by the conceptual framework of the multimodality learning environment (Mayer, 2001; Tritch, 2018), this research focused on how five pre-service English teachers process information, apply learning strategies and get motivated to learn with picture-print and picture-spoken words materials when learning consonants unavailable in the Indonesian language. The findings offer insights into the students' complex learning experience and provide implications for the embodied work of multimodality in teaching English pronunciation in EFL classrooms.
First, in regards to pronouncing the four English consonants, i.e., /ð/; /θ/; /ʃ/; and /ʒ/, the five participants admitted that they are not accustomed to the sounds and hence have problems in pronouncing them. This is the case because the consonants are not available in the Indonesian language. Indeed, the absence of equivalent sounds in students' first language may hinder their pronunciation of the target language (Hassan, 2014; Mehmet, 2015; Tambunsaribu & Simatupang, 2021). Also, phonological awareness is an important factor in students' pronunciation learning (Kosasih, 2021).
Learning pronunciation in a multimodal environment allows participants to process information about consonants' phonetic symbols and related terminology. Indeed, Li (2016) concluded that audiovisual bimodal input significantly facilitated improvement in the perception of unfamiliar sounds. Likewise, the picture-spoken materials allow the students to visualize the sound production process (Abel et al., 2015). All in all, the bimodal learning environment assisted the students' comprehension (Sankey et al., 2010).
In regards to the learning strategy, the five participants applied the same strategy: they utilized the moving picture-spoken materials to practice the sounds' production by following the instructions. They understand the importance of drills in acquiring accurate pronunciation, so they make use of the bimodal materials to practice continuously and independently. This is in line with Li & Somlak (2019), who found that the use of the audiovisual modality in an EFL classroom for Mandarin speakers' pronunciation provided learners with a chance to practice. Likewise, Kurniadi (2020) noted that multimodal materials allow learners to listen to the model pronunciation while simultaneously practicing.
the experience and the unfamiliar English sounds via the visual sound production.
Last, this learning experience can contribute to the pre-service teachers' professional preparedness. Thus, a multimodality learning environment which combines picture, motion picture and text might be beneficial to English pronunciation instruction.

CONCLUSION
The present study delineated the embodiment of a multimodality learning environment in teaching English pronunciation of consonants unavailable in the Indonesian language. This study focused on how five pre-service English teachers process information, apply learning strategies and get motivated when learning with picture-print and picture-spoken words materials. The conclusion is limited to the context of the participants of this study. It was found that the picture-word materials serve as visual clues which facilitate the students' understanding of sound representations by providing the consonants' phonetic symbols and related terminology. Meanwhile, the picture-spoken materials facilitate independent sound production practices. Last, the students' motivation to increase intelligibility and comprehensibility is shaped by their identity as pre-service English teachers. They believe that correct pronunciation lowers their anxiety and increases their professional competence. So, the combination of picture, motion picture and text learning materials might contribute positively to the pronunciation learning of students of English as a foreign language.
Some limitations of this study should be acknowledged. Caution should be exercised when generalizing the findings to other contexts and students, as they were based on five students' learning experience. Future research may consider testing students' pronunciation before and after the integration of a multimodality learning environment.
The study was conducted in a Phonetics and Phonology (ING356) course during the 2019 academic year. Twenty-three Indonesian undergraduate students were enrolled in the course, and five of them volunteered to participate in the study. The course was a compulsory 16-week, three-credit-hour course. The class was held once a week for 150 minutes per session. The objectives of the course were for students to acquire (1) basic phonetics skills, i.e., production and perception of speech sounds and transcription using the International Phonetic Alphabet, as well as (2) knowledge of rules, representations and analysis of sound patterns. The first author was a non-native speaker of English who had been teaching the course for 5 years.

Figure 1. Sample of picture-spoken words material

The picture-text is helpful. I learn about the [speech] organs I need to use to produce different sounds. The picture-spoken words help to mimic the process of making sounds. (Zahra)
I compare my pronunciation with the video materials. (Malik)
the multimodal materials because they allow them to subsequently see the phonetic alphabet, listen to the pronunciation model in the video and match their own pronunciation with the model. They admitted that their independence grew in the multimodal environment. Regular access to the model also trains their intelligibility, which leads to the development of their confidence about their English comprehension in general.
I slowly become confident about my pronunciation because I can practice without worrying about my friends' comments. (Kartini)

Table 1. Lesson plan

Table 2. Example of Zahra's data layout and coding sheet