Press your lips together. Air flows through your nose. Vocal cords vibrate.

Americans pronounce "media" as MEE-dee-uh (/ˈmidiə/). In "media", the "d" between vowels is pronounced as a quick tap: the tongue briefly touches the ridge behind the upper teeth. This is the same tap as the Flap T, a hallmark of natural-sounding American speech. So instead of a heavy, fully held "d", you get a light MEE·dee·uh. Stress falls on the first syllable, so keep everything else short and quick. You'll hear it in sentences like "He uses mixed media to create textured and layered compositions" or "Social media platforms are under scrutiny for spreading misinformation"; more examples below.
Record yourself saying "media" and play it back.
3 syllables, 5 sounds.
The textbook way isn't wrong — it's just not how anyone actually says it.
In "media", the "d" sits between vowels. In that position /t/ or /d/ becomes a quick tap [ɾ], which sounds like a soft D: the tongue briefly taps the ridge behind the upper teeth.
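For readers who like to see a rule spelled out, the tap can be sketched as a small Python function. This is a simplified illustration, not a full model of American English: the vowel inventory, the function name `apply_flap`, and the convention of marking a stressed vowel with a leading "ˈ" are all assumptions made for this sketch. It taps /t/ or /d/ only when it sits between two vowels and the following vowel is unstressed.

```python
# Simplified flapping rule (illustrative assumption, not a complete phonology):
# /t/ or /d/ between two vowels becomes the tap [ɾ] when the next vowel
# is unstressed. A stressed vowel is written with a leading "ˈ".

VOWELS = {"i", "ɪ", "e", "ɛ", "æ", "ə", "ʌ", "ɑ", "ɔ", "o", "ʊ", "u"}

def apply_flap(phonemes):
    """Return a copy of `phonemes` with intervocalic /t/ and /d/ tapped."""
    result = list(phonemes)
    for k in range(1, len(phonemes) - 1):
        prev, cur, nxt = phonemes[k - 1], phonemes[k], phonemes[k + 1]
        # The preceding vowel may be stressed or unstressed.
        prev_is_vowel = prev.lstrip("ˈ") in VOWELS
        # The following vowel must carry no stress mark.
        next_is_unstressed_vowel = nxt in VOWELS
        if cur in ("t", "d") and prev_is_vowel and next_is_unstressed_vowel:
            result[k] = "ɾ"
    return result

media = ["m", "ˈi", "d", "i", "ə"]
print("".join(apply_flap(media)))
```

In this sketch the "d" of "media" is tapped because it follows the stressed "ee" and precedes an unstressed vowel, while a /t/ before a stressed vowel (as in a hypothetical input `["ə", "t", "ˈæ", "k"]`) is left alone.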
Stress falls on the first syllable, not the others. Stretch MEE — keep everything else short and quick.
Don't pronounce the unstressed syllables too fully. The middle "dee" stays short, and the final syllable reduces to a schwa, the lazy "uh" sound, in casual speech.