User:OrenBochman/bots/ipa

From Wiktionary, the free dictionary

IPA-BOT

  1. A bot to automate IPA entry generation, using:
    1. the spelling.
    2. a phonemic model.
    3. all the existing IPA data.

Features

  1. knowledge-based version (rule-based).
    1. start with languages that have simple spelling-to-sound maps, like Hungarian and Swahili.
    2. add phonemic adjustments:
      1. assimilation
      2. elision
  2. data-based version (statistical).
    1. HMM trained on input/output data (spelling paired with IPA).
    2. use existing text to do.
  3. per-language on/off flag.
  4. check flag: add a template requesting human checking (e.g. for proper nouns).
  5. hybrid.
    1. use both models and some discriminator to choose between their outputs.
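As a rough illustration of the rule-based idea, here is a minimal Python sketch: a longest-match spelling-to-sound lookup, followed by a single regressive voicing-assimilation pass, plus a check flag for proper nouns and unmapped letters. The table is a small illustrative subset of Hungarian, the assimilation rule is deliberately simplified, and every name here (`G2P`, `to_phonemes`, `assimilate`, `transcribe`) is hypothetical rather than part of any existing bot.

```python
# Toy spelling-to-sound table: an illustrative subset, NOT a complete Hungarian model.
G2P = {
    "cs": "tʃ", "sz": "s", "zs": "ʒ", "gy": "ɟ", "ny": "ɲ", "ty": "c",
    "a": "ɒ", "á": "aː", "e": "ɛ", "é": "eː", "i": "i", "o": "o", "u": "u",
    "b": "b", "d": "d", "k": "k", "l": "l", "m": "m", "n": "n",
    "p": "p", "r": "r", "s": "ʃ", "t": "t", "z": "z",
}
MAX_LEN = max(len(grapheme) for grapheme in G2P)

# Voiceless -> voiced obstruent pairs (simplified assumption), and the reverse.
VOICED = {"p": "b", "t": "d", "s": "z", "ʃ": "ʒ"}
DEVOICED = {voiced: voiceless for voiceless, voiced in VOICED.items()}


def to_phonemes(word):
    """Greedy longest-match lookup; returns (phoneme list, saw_unknown_letter)."""
    word = word.lower()
    phonemes, unknown, i = [], False, 0
    while i < len(word):
        for size in range(MAX_LEN, 0, -1):
            chunk = word[i:i + size]
            if chunk in G2P:
                phonemes.append(G2P[chunk])
                i += size
                break
        else:
            # No rule matched: pass the letter through and remember to flag it.
            phonemes.append(word[i])
            unknown = True
            i += 1
    return phonemes, unknown


def assimilate(phonemes):
    """Regressive voicing assimilation: an obstruent copies the voicing of the next one."""
    out = list(phonemes)
    for i in range(len(out) - 2, -1, -1):  # right to left, so chains propagate
        nxt = out[i + 1]
        if nxt in DEVOICED and out[i] in VOICED:      # next is voiced
            out[i] = VOICED[out[i]]
        elif nxt in VOICED and out[i] in DEVOICED:    # next is voiceless
            out[i] = DEVOICED[out[i]]
    return out


def transcribe(word):
    """Return (ipa, needs_check); proper nouns and unmapped letters get flagged."""
    phonemes, unknown = to_phonemes(word)
    needs_check = unknown or word[:1].isupper()
    return "".join(assimilate(phonemes)), needs_check
```

For example, `transcribe("népdal")` applies the assimilation pass to the `pd` cluster, while `transcribe("Pest")` comes back with the check flag set because the word is capitalised. A real rule set would need the full alphabet, long consonants, elision, and the language's exceptions to assimilation.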

Issues

QA: for each language, train on 95% of the existing annotations and test on the remaining 5%.
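A minimal sketch of that split and a word-level accuracy check, in Python. It assumes the existing annotations for one language are available as `(spelling, ipa)` pairs and that a model is any callable from spelling to IPA; the function names `split_annotations` and `word_accuracy` are hypothetical.

```python
import random


def split_annotations(entries, test_fraction=0.05, seed=0):
    """Shuffle (spelling, ipa) pairs and hold out test_fraction of them for testing."""
    shuffled = list(entries)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def word_accuracy(model, test_set):
    """Fraction of held-out words whose predicted IPA matches the annotation exactly."""
    if not test_set:
        return 0.0
    correct = sum(1 for spelling, ipa in test_set if model(spelling) == ipa)
    return correct / len(test_set)
```

Exact-match accuracy is a strict measure; a phoneme-level edit distance would give partial credit, but that is a design choice the plan above does not specify.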

Other Features

  1. poll:
    1. is there interest in generating TTS voice files for entries?
    2. is there interest in generating hyphenation as well?

Resources

  1. open-source TTS projects with language models and scripts for TTS.
    1. MBROLA
    2. Sphinx
    3. Hspell
  2. CMUdict (the CMU Pronouncing Dictionary) for English.
  3. MALLET for graphical models.
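To show how CMUdict could feed the statistical version, here is a small Python sketch that parses one dictionary line and converts its ARPAbet phones to IPA. It assumes the plain-text layout of a word followed by space-separated phones, with `;;;` comment lines and `(1)`-style variant markers. The mapping table is a deliberately tiny, partial sample, stress digits are dropped, and the function names are hypothetical.

```python
# Partial ARPAbet -> IPA table for illustration only; a real table covers every phone.
ARPABET_TO_IPA = {
    "HH": "h", "AH": "ʌ", "L": "l", "OW": "oʊ",
    "K": "k", "AE": "æ", "T": "t",
}


def parse_cmudict_line(line):
    """Return (word, phones) for an entry line, or None for blanks and comments."""
    line = line.strip()
    if not line or line.startswith(";;;"):
        return None
    word, *phones = line.split()
    word = word.split("(")[0]  # drop a variant marker such as "(1)"
    return word.lower(), phones


def arpabet_to_ipa(phones):
    """Map ARPAbet phones to an IPA string, ignoring stress except for unstressed AH."""
    out = []
    for phone in phones:
        base = phone.rstrip("012")   # strip the trailing stress digit, if any
        stress = phone[len(base):]
        if base == "AH" and stress == "0":
            out.append("ə")          # unstressed AH is conventionally schwa
        else:
            out.append(ARPABET_TO_IPA[base])  # KeyError signals a phone missing from the table
    return "".join(out)
```

Pairs produced this way could serve as the input/output data for training, or as a reference for checking English output.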