Corpus collection
The NOCANDO Corpus is a corpus of spoken narrative texts. It was created by recording free, picture-based narrations by native speakers of five languages: Catalan, Italian, Spanish, English, and German.
Speakers
The participants were mostly students at the Universitat Pompeu Fabra in Barcelona. A smaller number came from different working environments.
Catalan and Spanish speakers were undergraduate or graduate students, with a mean age of 22 for Catalan (range 18 to 30) and 20 for Spanish (range 17 to 29). All were from Catalonia except one Catalan speaker from the Comunidad Valenciana and one Spanish speaker from Castilla y León.
Italian, English, and German speakers were mostly recently arrived Erasmus students, with a mean age of 29 for Italian (range 20 to 56), 27 for English (range 20 to 41), and 34 for German (range 22 to 67). Italian speakers came from different parts of Italy. English speakers came from the United States and the UK. German speakers came from different parts of Germany.
Methodology
Speakers were asked to tell a story by following the pictures of three wordless picture books:
- Mayer, M. (1973). Frog on his own. New York: Dial Books, Penguin.
- Mayer, M. (1974). Frog goes to dinner. New York: Puffin Books, Penguin.
- Mayer, M. and Mayer, M. (1975). One frog too many. New York: Dial Books, Penguin.
For 40 speakers, recordings were made in an acoustically isolated room provided by the Universitat Pompeu Fabra. For the remaining 28 speakers, recordings were made in a regular room at the Universitat Pompeu Fabra with a digital recorder.
The three books were presented to each speaker in random order. Speakers could browse each book before starting the narration.
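The source does not say how the random order was produced. As a minimal sketch of one way to draw a reproducible random book order per speaker, assuming a hypothetical speaker identifier such as "S01" and a hypothetical `book_order` helper:

```python
import random

# Titles of the three picture books used in the narration task.
BOOKS = ["Frog on his own", "Frog goes to dinner", "One frog too many"]


def book_order(speaker_id, seed=0):
    """Return the three books in a random order for one speaker.

    speaker_id and seed are illustrative assumptions; seeding the
    generator with both makes the order reproducible per speaker.
    """
    rng = random.Random(f"{seed}-{speaker_id}")
    # sample() with k equal to the list length yields a random permutation.
    return rng.sample(BOOKS, k=len(BOOKS))


if __name__ == "__main__":
    for speaker in ["S01", "S02", "S03"]:
        print(speaker, book_order(speaker))
```

Calling `book_order` twice with the same identifier and seed returns the same permutation, which makes the presentation order easy to document afterwards.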
- Total number of speakers: 68
- Total number of narrations: 222
- Total duration: ca. 16 hours (2 to 9 minutes per narration)
| | Catalan | Italian | Spanish | German | English |
|---|---|---|---|---|---|
| Speakers | 19 | 16 | 13 | 9 | 11 |
| Recording time (h:mm:ss) | 4:02:43 | 4:04:32 | 2:35:20 | 2:09:13 | 2:32:20 |
| Word count | 37,555 | 27,392 | 25,077 | 15,944 | 21,970 (estimated) |
| Segment count | 5,856 | 4,306 | 3,801 | 2,154 | 3,140 (estimated) |
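The per-language figures can be summed to cross-check the corpus totals. The following is a minimal sketch with the values copied from the table. The dictionary layout and the `parse_duration` helper are illustrative assumptions, and the English word and segment counts are estimates, so those two totals are approximate.

```python
from datetime import timedelta

# Per-language figures copied from the table above.
corpus = {
    "Catalan": {"speakers": 19, "time": "4:02:43", "words": 37555, "segments": 5856},
    "Italian": {"speakers": 16, "time": "4:04:32", "words": 27392, "segments": 4306},
    "Spanish": {"speakers": 13, "time": "2:35:20", "words": 25077, "segments": 3801},
    "German": {"speakers": 9, "time": "2:09:13", "words": 15944, "segments": 2154},
    # English word and segment counts are estimated in the source.
    "English": {"speakers": 11, "time": "2:32:20", "words": 21970, "segments": 3140},
}


def parse_duration(text):
    """Convert an 'h:mm:ss' string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


total_speakers = sum(row["speakers"] for row in corpus.values())
total_time = sum((parse_duration(row["time"]) for row in corpus.values()), timedelta())
total_words = sum(row["words"] for row in corpus.values())
total_segments = sum(row["segments"] for row in corpus.values())

print(total_speakers)  # 68
print(total_time)      # 15:24:08
print(total_words)     # 127938
print(total_segments)  # 19257
```

The speaker total matches the 68 speakers reported above. The per-language recording times sum to 15:24:08, which the summary rounds to ca. 16 hours.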