In short, the search pattern is treated as a Perl 5 regular expression.
In its raw format (which is what your pattern will try to match), each
part-of-speech vocabulary entry consists of a word or phrase field, followed
by the field delimiter × (character 215 in ISO 8859-1, which lies outside
7-bit ASCII; Perl pattern \xD7), followed by the part-of-speech field, which
is coded using the ASCII symbols from the table below (case is significant).
Example raw string:
yes×vN
To find all adjectives containing "cat" (in any case):
cat.*\xD7.*(?-i:A)
Noun | N |
Plural | p |
Noun Phrase | h |
Verb (usu participle) | V |
Verb (transitive) | t |
Verb (intransitive) | i |
Adjective | A |
Adverb | v |
Conjunction | C |
Preposition | P |
Interjection | ! |
Pronoun | r |
Definite Article | D |
Indefinite Article | I |
Nominative | o |
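The raw format and the example pattern above can be tried out with a short sketch. This is an illustration only: it uses Python's `re` module as a stand-in for Perl 5 (the scoped flag `(?-i:...)` and the `\xD7` escape behave the same way for this pattern), it assumes the search is case-insensitive by default (inferred from the "(in any case)" wording), and the helper names `parse_entry`, `search`, and the `SAMPLE` entries are hypothetical. Only `yes×vN` comes from the text above.

```python
import re

# Field delimiter between the word field and the part-of-speech field.
DELIM = "\xd7"  # the × character, code 215

# Part-of-speech codes copied from the table above (case is significant).
POS_CODES = {
    "N": "Noun", "p": "Plural", "h": "Noun Phrase",
    "V": "Verb (usu participle)", "t": "Verb (transitive)",
    "i": "Verb (intransitive)", "A": "Adjective", "v": "Adverb",
    "C": "Conjunction", "P": "Preposition", "!": "Interjection",
    "r": "Pronoun", "D": "Definite Article", "I": "Indefinite Article",
    "o": "Nominative",
}


def parse_entry(raw):
    """Split a raw entry into (word, list of part-of-speech names)."""
    word, _, codes = raw.partition(DELIM)
    return word, [POS_CODES[c] for c in codes]


def search(pattern, entries):
    """Return the raw entries matching the pattern (assumed case-insensitive)."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [e for e in entries if rx.search(e)]


# Hypothetical sample entries for illustration; not taken from the real data.
SAMPLE = ["yes\xd7vN", "catlike\xd7A", "CATTY\xd7A", "cat\xd7N", "scatter\xd7Vti"]

if __name__ == "__main__":
    print(parse_entry("yes\xd7vN"))
    print(search(r"cat.*\xD7.*(?-i:A)", SAMPLE))
```

The `(?-i:A)` group switches case-insensitivity off just for the part-of-speech code, so `A` (Adjective) is not confused with a lowercase code, while the word part `cat` still matches in any case.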
Each pronunciation vocabulary entry consists of a word or phrase field, followed by a field delimiter of a single space " ", followed by the IPA-equivalent field, which is coded using the following ASCII symbols (case is significant). Spaces between words in the word or phrase field or in the pronunciation field are denoted with an underscore "_".
/&/ | "a" in "dab" |
/(@)/ | "a" in "air" |
/A/ | "a" in "far" |
/eI/ | "a" in "day" |
/@/ | "a" in "ado" or the glide "e" in "system" (diphthong schwa) |
/-/ | "ir" glide in "tire" or the "dl" glide in "handle" or the "den" glide in "sodden" (diphthong little schwa) |
/b/ | "b" in "nab" |
/tS/ | "ch" in "ouch" |
/d/ | "d" in "pod" |
/E/ | "e" in "red" |
/i/ | "e" in "see" |
/f/ | "f" in "elf" |
/g/ | "g" in "fig" |
/h/ | "h" in "had" |
/hw/ | "w" in "white" |
/I/ | "i" in "hid" |
/aI/ | "i" in "ice" |
/dZ/ | "g" in "vegetably" |
/k/ | "c" in "act" |
/l/ | "l" in "ail" |
/m/ | "m" in "aim" |
/N/ | "ng" in "bang" |
/n/ | "n" in "and" |
/Oi/ | "oi" in "oil" |
/A/ | "o" in "bob" |
/AU/ | "ow" in "how" |
/O/ | "o" in "dog" |
/oU/ | "o" in "boat" |
/u/ | "oo" in "too" |
/U/ | "oo" in "book" |
/p/ | "p" in "imp" |
/r/ | "r" in "ire" |
/S/ | "sh" in "she" |
/s/ | "s" in "sip" |
/T/ | "th" in "bath" |
/D/ | "th" in "the" |
/t/ | "t" in "tap" |
/@/ | "u" in "cup" |
/@r/ | "u" in "burn" |
/v/ | "v" in "average" |
/w/ | "w" in "win" |
/j/ | "y" in "you" |
/Z/ | "s" in "vision" |
/z/ | "z" in "zoo" |
Stress or emphasis is marked in the data with the marks:
' | (uncurled apostrophe) marks primary stress |
, | (comma) marks secondary stress |
Moby Pronunciator contains many common names and phrases borrowed from other languages; special sounds include (case is significant):
"A" | "a" in "ami" |
"N" | "n" in "Francoise" |
"R" | "r" in "Der" |
/x/ | "ch" in "Bach" |
/y/ | "eu" in "cordon bleu" |
"Y" | "u" in "Dubois" |
Words and phrases adopted from languages other than English are given in the unaccented form of their Roman spelling. For example, "etude" has an accented initial "e" in its original spelling but is spelled without the accent in the Moby Pronunciator II database.
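As a rough illustration of the pronunciation entry layout described above (word field, one space, pronunciation field, with underscores standing in for spaces), the following sketch splits an entry and counts its stress marks. The function names are hypothetical, and the sample entry is made up from symbols in the tables above purely for illustration; it is not copied from the database, and the placement of the stress marks in it is an assumption.

```python
def parse_pron_entry(raw):
    """Split a raw entry into (word, pronunciation), restoring spaces."""
    # The first space is the field delimiter; spaces inside either field
    # are stored as underscores, so they are converted back afterwards.
    word, _, pron = raw.partition(" ")
    return word.replace("_", " "), pron.replace("_", " ")


def count_stress(pron):
    """Count primary (') and secondary (,) stress marks in a pronunciation."""
    return {"primary": pron.count("'"), "secondary": pron.count(",")}


if __name__ == "__main__":
    # Made-up entry for illustration only; not taken from the real data.
    word, pron = parse_pron_entry("ice_age 'aIs_,eIdZ")
    print(word, "|", pron, "|", count_stress(pron))
```

Because the delimiter is a plain space, a search pattern aimed at the pronunciation field can anchor on that space in the same way the part-of-speech patterns anchor on \xD7.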
Stress is indicated by means of a numeral [012] attached to a vowel:
0 | no stress |
1 | primary stress |
2 | secondary stress |
Phoneme | Example | Translation |
---|---|---|
AA | odd | AA D |
AE | at | AE T |
AH | hut | HH AH T |
AO | ought | AO T |
AW | cow | K AW |
AY | hide | HH AY D |
B | be | B IY |
CH | cheese | CH IY Z |
D | dee | D IY |
DH | thee | DH IY |
EH | Ed | EH D |
ER | hurt | HH ER T |
EY | ate | EY T |
F | fee | F IY |
G | green | G R IY N |
HH | he | HH IY |
IH | it | IH T |
IY | eat | IY T |
JH | gee | JH IY |
K | key | K IY |
L | lee | L IY |
M | me | M IY |
N | knee | N IY |
NG | ping | P IH NG |
OW | oat | OW T |
OY | toy | T OY |
P | pee | P IY |
R | read | R IY D |
S | sea | S IY |
SH | she | SH IY |
T | tea | T IY |
TH | theta | TH EY T AH |
UH | hood | HH UH D |
UW | two | T UW |
V | vee | V IY |
W | we | W IY |
Y | yield | Y IY L D |
Z | zee | Z IY |
ZH | seizure | S IY ZH ER |
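A transcription in this notation is a space-separated list of phoneme codes, with the stress numeral attached to vowels as described above. The sketch below separates each code from its optional stress digit. The names `parse_phonemes`, `TOKEN`, and `STRESS` are hypothetical, and the sample transcription with its stress digit is an assumption added for illustration (the table above lists "hut" without stress digits).

```python
import re

# One token: an uppercase phoneme code, optionally followed by a stress digit.
TOKEN = re.compile(r"^([A-Z]+)([012])?$")

# Stress numerals as listed in the stress table above.
STRESS = {"0": "no stress", "1": "primary stress", "2": "secondary stress"}


def parse_phonemes(transcription):
    """Return a list of (phoneme, stress description or None) pairs."""
    result = []
    for tok in transcription.split():
        m = TOKEN.match(tok)
        if m is None:
            raise ValueError("unrecognised token: " + tok)
        phoneme, digit = m.groups()
        # Consonants carry no digit, so their stress is reported as None.
        result.append((phoneme, STRESS.get(digit)))
    return result


if __name__ == "__main__":
    # Illustrative transcription; the stress digit is assumed.
    print(parse_phonemes("HH AH1 T"))
```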