Lift your bottom lip to touch the very bottom of your top front teeth. Blow air through this contact point without voicing.

Americans pronounce "photos" as FOH-tohz (/ˈfoʊɾoʊz/). The "t" sits between two vowels, so it sounds like a quick "d": the tongue briefly taps the ridge behind the upper teeth. This is called the Flap T, a small detail that separates classroom English from native-sounding speech. Stress falls on the first syllable, so keep the second syllable short and quick. You'll hear it in sentences like "Let's find some old photos" or "I felt a wave of nostalgia when I saw my old photos"; more examples below.
Record yourself saying "photos" and play it back. The mic stays on your device — nothing's uploaded.
2 syllables, 5 sounds. Explore each sound's mouth shape and how it's made.
Quickly bounce the front of your tongue against the roof of your mouth. Don't stop the airflow — just a quick tap.

Start with your mouth slightly open, then close your jaw a little as your lips round. Move your tongue back a bit, then raise the back of it.
Same position as S, but add vocal cord vibration. Feel the buzz.

The textbook way isn't wrong — it's just not how anyone actually says it.
In "photos", the "t" between vowels becomes a quick tap [ɾ] that sounds like a soft "d": the tongue briefly taps the ridge behind the upper teeth. A /d/ between vowels becomes the same tap.
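If it helps to see the rule written out step by step, here is a minimal sketch in Python. It assumes a word is already split into its sounds, and it applies only the "between vowels" condition described above. The `flap` function and the `VOWELS` set are hypothetical names made up for this illustration, and the vowel list is a rough, incomplete inventory. Real speech is pickier: the tap usually needs the following vowel to be unstressed, which is why the "t" in "attack" stays a true "t". This sketch ignores stress.

```python
# Illustrative only: VOWELS is an assumed, incomplete set of American English vowel symbols.
VOWELS = {"i", "ɪ", "eɪ", "ɛ", "æ", "ɑ", "ɔ", "oʊ", "ʊ", "u", "ʌ", "ə", "ɚ", "aɪ", "aʊ", "ɔɪ"}


def flap(segments):
    """Return a copy of the sound list with /t/ or /d/ replaced by the tap [ɾ]
    wherever it sits directly between two vowels (stress is ignored here)."""
    result = list(segments)
    for i in range(1, len(segments) - 1):
        between_vowels = segments[i - 1] in VOWELS and segments[i + 1] in VOWELS
        if segments[i] in ("t", "d") and between_vowels:
            result[i] = "ɾ"
    return result


# "photos" as 5 sounds: the "t" has a vowel on each side, so it becomes a tap.
photos = ["f", "oʊ", "t", "oʊ", "z"]
print("".join(flap(photos)))  # prints foʊɾoʊz
```

A "t" at the start of a word or next to a consonant, as in "stop", has no vowel on one side, so the sketch leaves it alone.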
Stress falls on the first syllable, not the second. Stretch FOH and keep the second syllable short and quick.