Touch the tip of your tongue to the roof of your mouth just behind your teeth. Add vocal cord vibration as you release.

Americans pronounce "diversity" as duh-VUR-suh-tee (/dəˈvɜrsəɾi/). The "t" between vowels sounds like a quick "d": the tongue briefly taps the ridge behind the upper teeth. This is called the Flap T, a hallmark of natural-sounding American speech. So instead of a crisp duh·VUR·suh·tee, you get something closer to duh·VUR·suh·dee. Stress falls on the second syllable, so keep everything else short and quick. You'll hear it in sentences like "The theory of evolution explains the diversity of species on Earth" or "Advocates are calling for greater diversity in leadership positions" (more examples below).
4 syllables, 8 sounds.
The textbook way isn't wrong — it's just not how anyone actually says it.
In "diversity", the "t" between vowels is a flap: /t/ (or /d/) in this position becomes a quick tap [ɾ] that sounds like a soft "d". The tongue briefly taps the ridge behind the upper teeth.
Stress falls on the second syllable. Stretch VUR and keep everything else short and quick.
Don't pronounce the first syllable too fully. Because it is unstressed, it reduces to a schwa, the lazy "uh" sound, in casual speech.
Americans use a relaxed retroflex R — the tongue curls back rather than rolling. The R is one continuous sound with the vowel before it, not two separate sounds.