Help:Audio pronunciations

You can request an audio pronunciation. Alternatively, you may add your own by uploading a .ogg file to Commons and then linking here using Template:audio in accordance with our layout conventions.

Recording

Equipment

To create audio files, you will need a microphone. A headset style is best. You will also need recording software. The audio format of choice is Ogg Vorbis, because it is a free format. We recommend that you download Audacity for free.

Technique

Now, record a word. At first, just experiment with the software. Try it a few times and listen to the results. Adjust the microphone input volume until the biggest waves take up 50-70% of the height of the graphical window. If the recording is too quiet, it will be hard to hear. If it is too loud, that is, visually going off the edge of the graphical window, the sound will distort.
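The 50-70% guideline can also be checked numerically. The following is a minimal sketch, assuming the samples have already been read into a list of floats scaled so that 1.0 is the full height of the window. The function names `peak_fraction` and `level_advice` are illustrative and are not part of Audacity.

```python
def peak_fraction(samples):
    """Return the largest absolute sample value (1.0 = full scale)."""
    return max((abs(s) for s in samples), default=0.0)


def level_advice(samples, low=0.5, high=0.7):
    """Classify a recording against the 50-70% guideline."""
    peak = peak_fraction(samples)
    if peak >= 1.0:
        return "clipping"      # waveform reaches the edge: distortion
    if peak < low:
        return "too quiet"     # turn the input volume up
    if peak > high:
        return "above range"   # not distorting yet, but louder than advised
    return "ok"
```

For example, `level_advice([0.1, -0.6, 0.3])` falls inside the recommended band, while any sample at or beyond full scale is reported as clipping.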

Move the microphone around. Place it close enough to your mouth to pick up the sound well, but off to the side, so that you don't blow air directly into it when you speak. (Blowing air at a microphone causes a distracting popping sound in your recordings.) If you can say "Peter Piper" without hearing puff-puff-puff, then it's in the right place.

Besides that, listen for clarity of your recordings. Put some inflection in your voice and enunciate, maybe a bit more than usual, but do not alter your pronunciation. Make sure nothing is chopped off at the beginning or end. Use the trim and silence buttons to remove extra noise at either end. Leave a bit of empty space, but not too much.

One more thing, before beginning

You'll also need to create an account on Commons. Besides being better equipped to police multimedia content, Commons allows all projects to share the content you'll create. However, if you have a visual disability or use a text-based web browser, please note that the registration process for Commons is optimized for sighted people using a mainstream graphical web browser who plan to upload visual images. If you plan to upload audio pronunciations and you cannot see images in your browser, you will need to ask an administrator to help you create a Commons account.

Procedure

Once you can consistently create clean sound files, you'll need a routine to upload them. The following is Dvortygirl's well-honed routine. With practice, it takes less than one minute per word.

  • Record a word.
  • Select the word in Audacity, removing excess blank space and noise at the beginning and end. (From the Edit menu, choose the option Remove Audio and its sub-option Trim.)
  • With the word still selected, standardize the volume. (From the Effect menu, choose the option Amplify... and click OK.)
  • Listen to the selection to make sure you didn't clip off too much at the end, and there aren't any stray noises in it.
  • Go to File, Export Selection as Ogg Vorbis.
  • Type the file name in the box. Call it ll-cc-word.ogg, where ll is the language code (en for English), cc is your country code (us, uk, fr, etc.), and word is the word that you have pronounced. Thus, the word "associate" in UK English would be titled en-uk-associate.ogg.
  • In a Commons tab, paste the following, updated for your information.
[[Category:English pronunciation|associate]]
Pronunciation of the term in US English, recorded by [[User:Yourname|Yourname]], 
14 June 2006
  • Choose a license for your audio. The dual license, the first choice under "own work", works well.
  • Review the information and click upload.
  • Go to the word article and paste
* {{audio|en-us-{{subst:PAGENAME}}.ogg|Audio (US)|lang=en}}

If you are doing batch audio, keep this template in your paste buffer.

  • Check that the word is blue linked in the Wiktionary article. To update a red link on a page that you already saved, upload the correct file to Commons and append ?action=purge to the end of the URL for the Wiktionary article.
  • For the Commons tab, just hit the back button and replace the word in the file path and the template.
  • There are a few cases where you'll want to change the audio template text to match the file name. Heteronyms are one of them:
* {{audio|en-us-associate-noun.ogg|Audio (US)|lang=en}}
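When recording in batches, the file name, the template line, and the purge URL from the procedure above can be generated rather than typed. The following is a minimal sketch; the function names are hypothetical, and the default `base` URL assumes the English Wiktionary.

```python
from urllib.parse import quote


def audio_filename(lang, country, word, suffix=""):
    """Build a file name of the form ll-cc-word.ogg.

    suffix is an optional disambiguator for heteronyms, e.g. "noun".
    """
    parts = [lang, country, word]
    if suffix:
        parts.append(suffix)
    return "-".join(parts) + ".ogg"


def audio_template(filename, label, lang):
    """Build the wikitext line to paste into the entry."""
    return "* {{audio|%s|%s|lang=%s}}" % (filename, label, lang)


def purge_url(word, base="https://en.wiktionary.org/wiki/"):
    """Build the entry URL with ?action=purge appended.

    The default base is an assumption (English Wiktionary).
    """
    return base + quote(word.replace(" ", "_")) + "?action=purge"
```

For instance, `audio_template(audio_filename("en", "us", "associate", "noun"), "Audio (US)", "en")` reproduces the heteronym line shown above.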

Shtooka Recorder

Shtooka Recorder is a Windows program designed specifically for recording pronunciations. The recorder and other modules are available for download at the Project page. The recorder takes in a list of words from a text file and displays them. It highlights one word at a time. In recording mode, it detects when the volume of speech into the microphone goes above and then drops below a certain threshold for a given amount of time. When it determines it has recorded a word or phrase, it saves a file and highlights the next item in the list.

Shtooka Recorder includes settings for the thresholds, times, file naming, and file tags.
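The threshold-and-time detection described above can be illustrated with a short sketch. This is not Shtooka's actual implementation. It assumes the input is a list of per-frame volume levels, and the names `find_utterances`, `threshold`, and `min_silence` are hypothetical.

```python
def find_utterances(levels, threshold, min_silence):
    """Return (start, end) frame index pairs, end exclusive.

    An utterance starts at the first frame whose level exceeds
    `threshold` and ends once the level has stayed at or below it
    for `min_silence` consecutive frames.
    """
    utterances = []
    start = None      # index where the current utterance began
    quiet = 0         # consecutive quiet frames seen inside it
    for i, level in enumerate(levels):
        if level > threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_silence:
                # End just after the last loud frame.
                utterances.append((start, i - quiet + 1))
                start = None
                quiet = 0
    if start is not None:
        # Input ended mid-utterance; close it after the last loud frame.
        utterances.append((start, len(levels) - quiet))
    return utterances
```

A short dip below the threshold (shorter than `min_silence`) does not split a word, which is why a pause setting that is too small tends to chop words apart.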

The arrow keys will advance to the next word or backtrack to the previous word.

Note: The creator of Shtooka is also collecting freely licensed pronunciation files for use in the language projects he is pursuing. He asks that you contact him about uploading your pronunciations via the email address on the Shtooka website.

Creating a large number of files

The following is User:Neitrāls vārds's workflow for creating pronunciation audio files, offered for anyone looking for inspiration on how to set up their own workflow and, of course, for NV's own future reference.

Programs like Shtooka let you automate recording audio files, but in NV's opinion external devices such as a smartphone's voice recorder will often be superior to microphones connected to your computer or built into laptops (likely of questionable quality). The following workflow also allows for extensive post-processing.

The workflow uses four pieces of software, all completely free: three apps that run without installation (AWB, Ant Renamer and Vicuña Uploader) and one program that needs installing (Audacity). Counting Notepad++, that makes five.

The following should allow for approximately 100 recordings per hour (this is "from A to Z": from saving your list of terms to be read to having audio files in the relevant entries). The actual reading out of the terms takes up the bulk of the time. Thus, e.g., 5 hours = 500 files (and smoke breaks are included in that calculation, I think).

  • Ideally someone has added {{rfap}} to the entries lacking audio pronunciation, which are then neatly listed in a category "Requests for audio pronunciation (X language)". The benefit of this is twofold. If not, consider quickly going through the entries you want to create pronunciation files for and adding a "Pronunciation" header with rfap using Wiktionary:AWB.
  • Copy the contents of that category into a UTF-8 encoded text file and remove any junk that category pages have, like "A", "A cont.", etc.
  • Choose the best device that you have (most likely it is going to be a smartphone.)
  • Decide what helps you enunciate words; in NV's experience, brushing your teeth and your tongue right before recording is what you need (considering you want to record hundreds of words).
  • Take your phone keeping the mic as close to your mouth as possible (most likely how you would talk on it) and start reading the list. Make sure to pause between the words (maybe 3 secs?) If you accidentally make a noise right after reading out a word (like coughing) pause and read it again.
  • Now that you've read a good 100 (150, 200...) terms, you have an audio file that you can transfer to your computer and import into Audacity.
  • Almost all controls are grayed out while Audacity is in "playback mode"; click the stop button (the square) first.
  • Run noise reduction on it.
    • Click and drag to select a "silent" fragment (which probably has white noise), then choose Effect > Noise Removal and click Get Noise Profile. After that, go back to the tracks, press Ctrl + A to select the entire timeline, open Noise Removal again, and click OK.
  • Turn up the volume as much as you can.
    • Effect > Normalize, set the dB to 0.0 (the lower the number, the louder the result), OK. (Ideally the highest peaks should reach the border of the box; if many peaks go outside the box, it's too much, so try 0.1 dB, etc.)
  • If you want, get rid of failed pronunciations and accidental noises: highlight them by dragging the cursor, click the square button, and click the button with scissors ("Cut").
    • Perhaps an even better way is to use the mute audio in selection button near the scissors button, which gets rid of unwanted artifacts while saving "real estate" between audio files.
  • Select the option to detect sound/cut it out in separate files. Make sure you set a decent "margin" before deciding how close to cut (I think 0.75 secs worked but I'm not sure.) If it's too little all of them will sound like they're cutting off too soon. This is also the reason to pause a bit between reading them out. This gives "wiggle room."
    • Analyze > Sound Finder (a pause between sounds of 1 sec and a margin of 0.75 secs seem to be good). Check for sounds that didn't have enough pause between them and were joined under one label; drag that label under the first sound and press Ctrl + B to create a new label for the second one. The margins can overlap if two sounds are very close.
    • After that: File > Export Multiple
  • Set a prefix for the file names before exporting (at least your lang code.) In my opinion it's great to specify your native city, e.g., many lv files have lv-riga- and many French files fr-paris-, etc.
  • Export the files as .ogg.
  • Listen to them again while looking at the terms list to make sure all of them correspond exactly to the list (maybe open them as a playlist in WMPlayer...). If they don't sound good, delete them all, go back to Audacity, and try different cutting-out/exporting settings.
  • Open the exported files in Ant Renamer and use the contents of your terms list to rename the files, replacing the numbers that Audacity assigned with the terms, e.g., lv-riga-1, lv-riga-2... become lv-riga-acs, lv-riga-auss, etc. At this point I do not remember if Ant Renamer had the option to append new names to an already existing prefix. If it didn't, you can simply prepend your lang code prefix to your terms list in Notepad++ and completely rename your file list (assuming that the files are in exactly the same order as your list).
    • Having the exact name you want in your terms list seems easier than trying to figure out how to preserve this prefix in Ant Renamer, this also means that it's pointless specifying it during exporting.
    • Have terms list open in Notepad++, Search > Replace, tick Regular expressions, enter ^ in Find what, enter your lang (and city) in Replace with, hit Replace all.
  • Open commons:Commons:VicuñaUploader and upload them to Commons with the relevant category.
  • Now that they are uploaded open either the category you took your terms list from in Wiktionary:AWB or paste the list in AWB and choose to replace the rfap template under the Pron. heading with {{audio|[your lang code]-[your city]-{{subst:PAGENAME}}.ogg|lang=[your lang code]}}.
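The list-handling steps above (cleaning the copied category listing, prepending the prefix, and matching numbered exports to terms) can also be sketched in Python. This is a minimal illustration and not part of NV's workflow, which uses Notepad++ and Ant Renamer. All function names are hypothetical, and the heading pattern is an assumption about how the copied category text looks; note that it would also drop genuine one-character terms.

```python
import re

# Heuristic: a category heading is a single character, optionally
# followed by " cont." (assumption about the copied page text).
HEADING = re.compile(r"^.( cont\.)?$")


def clean_category_listing(text):
    """Return the terms from pasted category text, dropping blank
    lines and headings such as "A" or "A cont."."""
    terms = []
    for line in text.splitlines():
        line = line.strip()
        if line and not HEADING.match(line):
            terms.append(line)
    return terms


def add_prefix(terms, prefix):
    """Same effect as replacing ^ with the prefix in Notepad++."""
    return [prefix + term for term in terms]


def numeric_key(name):
    """Sort key using the last run of digits in the file name,
    so that file 10 sorts after file 2."""
    numbers = re.findall(r"\d+", name)
    return int(numbers[-1]) if numbers else -1


def rename_plan(exported, terms, prefix, ext=".ogg"):
    """Pair each exported file with its new name, in list order."""
    if len(exported) != len(terms):
        raise ValueError("file count does not match term count")
    new_names = [name + ext for name in add_prefix(terms, prefix)]
    return list(zip(exported, new_names))
```

Sorting the exported names with `numeric_key` before calling `rename_plan` guards against the files and the terms list drifting out of order. The plan is only a list of (old, new) pairs, so it can be reviewed before any file is actually renamed.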