V vs W — "vest" and "west" are not the same word

For /v/, your top teeth press into your bottom lip and buzz against it. For /w/, nothing touches at all: the lips round and glide. They are not close sounds you tighten or loosen, but two different mechanisms, and one fingertip on your lip tells them apart.

Say vest. Now say west. Only the first sound is supposed to change between them, and that first sound is where a lot of otherwise fluent English speakers quietly merge two things the language keeps apart. For vest, your top teeth press into your bottom lip and buzz against it. For west, nothing in your mouth touches anything at all: your lips round into a small circle, then spring open into the vowel. If both words started the same way for you, teeth-on-lip both times or rounded-lips both times, you just folded /v/ and /w/ into a single sound. So do most people who share your first language.

This particular merger sorts speakers by where they come from. German speakers usually push both words toward vest, because the letter w in German already says /v/ and the language has no /w/ of its own. Hindi and much of Indian English run the two together in the middle, so vine and wine land on the same in-between sound and very can come out closer to wery. Russian and Polish speakers tend to harden west into vest. The details differ by language, but the result is the same lost contrast, and it is one English relies on constantly.

The letters v and w stand for two unrelated sounds, not two flavors of one. /v/ is made by resting your top teeth on your bottom lip and forcing voiced air through the gap, so it buzzes and you can hold it as long as your breath lasts. /w/ touches nothing: you round your lips as if to say oo, then glide straight into the next vowel, which is why you cannot hold it without it turning back into a vowel. The whole contrast lives in one question: do your teeth touch your lip, or not? Most learners only have to build one of the two. Whichever sound your first language is missing is the one to drill.

Two different sounds

English spells these two sounds with letters that look like relatives. W is literally double-u, drawn over the centuries as two Vs or two Us stitched together. The letters are cousins. The sounds are not even in the same family.

The V sound is a fricative: you make a narrow gap and force air through it until it hisses or buzzes. It belongs with /f/, /z/, and the th in this. Because it runs on friction, you can hold it. Take a breath and buzz vvvvv until you run out of air, and the sound stays steady the whole time.

The W sound is a glide, what phoneticians call an approximant: the air never gets squeezed, it just slides past. It belongs with the y in yes, another glide that slides out of a vowel shape into the sound after it. You cannot hold it. Try to stretch wwww and within a fraction of a second it collapses into ooo, a plain vowel. That collapse is the tell. A /v/ is a place your mouth can sit; a /w/ is a move your mouth makes on the way to somewhere else.

The gap between them runs through dozens of everyday word pairs:

/v/: teeth on lip | /w/: rounded lips
vest | west
vine | wine
vet | wet
veal | wheel
verse | worse
vary | wary
vow | wow
viper | wiper
vile | while
veered | weird

Read down both columns. If the two sides come out of your mouth sounding the same, that is the merger this article takes apart. The fix is mechanical and quick, because you can feel the difference with your fingertip.

Why “basically the same” is the wrong frame

Most people who blur v and w think of them as neighbors, two close sounds separated by a little more or less effort. They feel similar because both involve the lips, and if your first language uses only one of the two, your ear files both of them under that single category. But “close” is the wrong picture. The two sounds are made by completely different mechanisms. They even rule each other out: you physically cannot make one while your mouth is set up for the other.

The difference is not how hard you push or how long you hold. It is whether your top teeth touch your bottom lip. Teeth, or no teeth.

Try it with a finger. Rest a fingertip lightly across your lips and say very. You should feel your top teeth land on your bottom lip and stay there, vibrating, through the whole first sound. Now say we. Your lips push forward into a small ring, and your teeth never come near them. There is no in-between setting that produces both. The teeth are either down and buzzing, which gives you /v/, or up and clear while the lips round, which gives you /w/. Once you notice the switch, you stop hunting for a sound somewhere “between” them, because there is nothing in between to find.

How to make each sound

For most learners only one of these is genuinely missing, so the trick is to build it out of the one you already own and lean on the contrast between them.

Start with the /v/. Rest your top front teeth gently on your bottom lip, the same place you put them for /f/ in fan, then turn your voice on and push air through. /f/ and /v/ are the same mouth: /f/ is the whispered one, /v/ is the one with the motor running. If you feel a buzz in your lip and your throat at the same moment, that is it. Hold it for a second or two to prove to yourself it can last, the way fff can. A short, stopped sound that cannot be held is a /b/, not a /v/, and that swap is its own common slip.

The /w/ is built completely differently. Push your lips forward into a tight circle, the shape you make for the oo in food, and keep your teeth clear of everything. Say oo, then open straight into ah: oo-ah, oo-ah. Speed that up and the oo stops being a vowel and turns into the consonant: wah. That motion, lips rounding and then releasing into a vowel, is the entire sound. The /w/ is the /uː/ vowel set in motion, a movement rather than a position, which is exactly why you cannot freeze it the way you can freeze a /v/.

The most common mistake is letting your teeth decide for themselves. If west keeps arriving as vest, your teeth are touching down when they should stay clear; round the lips and keep the teeth off the lip entirely. If vest drifts toward west or best, the teeth never landed; put them back on the lip and add the buzz. All you are really training is the ability to throw that one switch on purpose instead of leaving it to habit. That reset is hardest in the middle of a word, in river or away, where the teeth have to drop and lift between two vowels with no time to think about it.

Which spelling is which

Here the page is on your side, which is not always true in English. The letter v says /v/ almost without exception: very, even, love, give, travel. English words rarely end in a bare v, which is why have, give, and live tack on a silent e, but the sound underneath is still a clean /v/. The letter w says /w/ nearly as reliably: west, win, away, water.

The wrinkles are all on the w side, and they are worth knowing:

Spelling | Says | Examples
wh- | /w/ | which, what, when, whale (for most American speakers, which = witch)
wh- in who, whole, whose | /h/, w silent | a small set; whoa and whopper keep the /w/
wr- | /r/, w silent | write, wrong, wrist, wrap
w in a few words | silent | two, sword, answer
w inside a vowel spelling | part of the vowel, no /w/ | low, saw, now, few

One sound trap hides in plain sight: the word of. It is spelled with an f but pronounced uhv, with a real /v/ at the end. That is the rare case where an f on the page is a /v/ in the mouth, and it slips past learners because of is so small and so frequent that nobody stops to listen to it.

Train the ear before the mouth

You cannot reliably produce a contrast you cannot hear. Plenty of learners can place a clean /v/ and a clean /w/ when they are thinking hard about each one, then lose the difference the instant a real sentence goes by, because the ear never learned to flag which one just happened. Perception comes first, and for this pair it usually comes fast, because the cue is concrete.

The drill is minimal pairs fed to you out of order. Have a partner, or a text-to-speech voice, say one word from a pair at random: vest or west, vine or wine. Your only job is to sort: name which one you heard, without producing anything yet. When you can label fifteen in a row without hesitating, your ear has built the category and your mouth has a target to aim at. A quieter version needs no partner: take a minute of American speech with a transcript, an interview or a clip from a show, and underline every word that starts with v or w. Replay each one and answer one question only, teeth or no teeth. You are not trying to talk. You are teaching your ear to stop folding the two sounds into one, which is the step that makes the mouth-work stick.
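If you would rather script the random-order drill than recruit a partner, the logic is small enough to sketch. What follows is only an illustration, not a tool this article ships: the function names, the `get_guess` callback, and the `max_prompts` cap are all made up for the example, and the audio playback (a text-to-speech voice, a recording) is left for you to plug in.

```python
import random

# A few minimal pairs from the table above: (/v/ word, /w/ word).
MINIMAL_PAIRS = [
    ("vest", "west"),
    ("vine", "wine"),
    ("vet", "wet"),
    ("veal", "wheel"),
    ("verse", "worse"),
]

TARGET_STREAK = 15  # "fifteen in a row" from the drill description


def next_prompt(rng):
    """Pick a random pair, then a random side of that pair.

    Returns (word_to_play, correct_label), where the label is "v" or "w".
    """
    v_word, w_word = rng.choice(MINIMAL_PAIRS)
    if rng.random() < 0.5:
        return v_word, "v"
    return w_word, "w"


def update_streak(streak, guess, correct_label):
    """Extend the streak on a correct label; reset it to zero on a miss."""
    return streak + 1 if guess == correct_label else 0


def run_drill(get_guess, rng=None, max_prompts=1000):
    """Keep prompting until the listener labels TARGET_STREAK in a row.

    get_guess(word) is a hypothetical callback: in real use it would play
    the word aloud and return the listener's answer, "v" or "w".
    Returns the number of prompts used, or None if max_prompts ran out.
    """
    rng = rng or random.Random()
    streak = 0
    for count in range(1, max_prompts + 1):
        word, label = next_prompt(rng)
        streak = update_streak(streak, get_guess(word), label)
        if streak >= TARGET_STREAK:
            return count
    return None
```

The one design choice worth noting is that a single miss resets the streak to zero, which matches "fifteen in a row" rather than "fifteen in total". For the drill to train your ear, `get_guess` has to play the word without showing it; reading the spelling off the screen would defeat the purpose.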

Practice phrases

Read these out loud, twice each. Every line makes your mouth switch between /v/ and /w/, which is harder and more useful than drilling either sound alone. Slow down until each word lands on the sound you meant, then bring the pace back up.

  1. The vet drove west in a van.
  2. We poured the wine beside the vine.
  3. Will you wear the velvet vest?
  4. Vivian waved from the window.
  5. It works well enough to live with.
  6. Every winter the weather turns vile.
  7. The view from up here never looked worse.
  8. Wave the white flag and give up.
  9. Victor went straight over the wall.
  10. We have never driven this far west.

If a line trips you up at speed, that is the point of stacking both sounds into one breath. The teeth have to come down and lift back off several times a sentence, and getting that timing automatic is what you are really drilling.

How different first languages handle this

Where you start depends on which of the two sounds your first language already gave you, and which way it pulls when the other one is missing. None of this is a deficiency. It is just the shape of the gap you are closing.

Your L1 | Default slip | What to focus on
German | west → vest (no native /w/) | Build /w/ from the oo glide. On w-words, keep your teeth off the lip entirely.
Dutch | w drifts toward v; v toward f | Round the lips fully for /w/ and let the teeth stay clear; switch the voicing firmly on for /v/.
Hindi, Indian English | v and w merge to one middle sound | Split them on purpose: teeth down and buzzing for /v/, teeth clear and lips rounded for /w/.
Russian, Polish | west → vest | Round the lips into a full glide and keep the teeth clear. Polish already has this sound (it is the letter ł); the trap there is reading English w as the /v/ it spells in Polish.
Spanish | vest → best (/v/ becomes /b/) | Your /w/ is already fine. For /v/, put the teeth on the lip; it is a buzz, not a /b/.
Japanese | vest → best; weak, under-rounded w | Build /v/ with the teeth-buzz, and round the lips harder on /w/ so it does not flatten out.
Korean | vest → best | The /w/ is yours already. Focus on the teeth-on-lip buzz that turns your /b/ into a /v/.
Mandarin Chinese | very → wery (/v/ becomes /w/) | Build /v/ with the teeth and keep it sharply separate from the /w/ you already use.
Arabic | very → fery (/v/ becomes /f/) | You have /w/. Make /v/ by voicing your /f/: same mouth, motor on.
Turkish | v softens to w between vowels | Keep English /v/ firmly teeth-on-lip even between vowels, where your habit is to loosen it.

FAQ

Why do my V and W sounds come out the same in English?

Almost always because your first language has only one of the two, or a single sound that sits between them, and you are using it for both English letters. German, Russian, and Polish speakers tend to push w toward /v/; Hindi and Indian English merge the pair into one middle sound; Spanish, Japanese, and Korean speakers usually have /w/ but turn /v/ into a /b/. The fix is the same in every case: build the missing sound and learn to switch your teeth on and off on purpose.

What is the difference between the /v/ and /w/ sounds in English?

/v/ is a fricative: your top teeth touch your bottom lip and voiced air buzzes through the gap, and you can hold the sound steady for as long as your breath lasts. /w/ is a glide: nothing touches, your lips round like the start of oo, and you slide immediately into the next vowel, so it cannot be held. The quickest test is a fingertip on the lip. Teeth land and vibrate for /v/; lips round and the teeth stay clear for /w/.

How do I stop pronouncing 'west' as 'vest'?

Your teeth are touching your bottom lip when they should stay clear. Set up for the /w/ first: round your lips into a tight circle as if starting the sound oo, keep your teeth well off the lip, then glide into the rest of the word. If it helps, exaggerate the lip rounding at first. The error is almost never about effort; it is the teeth landing out of habit, so the fix is keeping them up.

Is the English W sound really a vowel?

It is made like one but works like a consonant. /w/ is shaped from the /uː/ vowel, the oo in food, set in motion toward another vowel: say oo-ah quickly and the oo becomes a /w/. That vowel-like shaping is why phoneticians call it a semivowel. In a word, though, it behaves as a consonant, starting the syllable and taking a rather than an (a window, never an window).

Does it matter if I confuse V and W when speaking English?

Usually context rescues you, but not always, and the pairs that collapse are common: vest and west, vine and wine, veil and whale. Even when listeners work out your meaning, a consistent v/w swap is one of the most noticeable accent markers in English, because native speakers hear the two as completely separate sounds rather than near-misses. It is also one of the easiest markers to clear, which makes it worth the few days of practice.

Why do some speakers pronounce 'very' as 'wery'?

Their first language has a /w/ but no /v/, so the nearest available sound stands in. Mandarin is the classic case: there is no native /v/, and the rounded glide is the closest neighbor, so very drifts to wery. The repair is to build a true /v/ by setting the top teeth on the bottom lip and adding voice, then keeping it distinct from the /w/ that was filling in for it.

The v/w merger is one of the most audible accent markers in English and one of the least stubborn, because the whole difference comes down to a single moving part: a fingertip on your lip tells you instantly which sound you just made. Spend a few days only hearing the contrast, then a week running your teeth on and off through the practice lines above. The two sounds pull apart quickly once your mouth stops treating them as one.

By SayWaader Editorial

SayWaader Editorial is the editorial voice of SayWaader, a pronunciation coach for advanced English speakers. We write what we’d say to a friend who’s done sounding textbook‑y. Read our methodology note for how the writing actually happens.
