This page is a browser front end for the Seedance 2.0 model. Type a description of a shot, or submit up to 12 mixed files (photos, clips, and audio) as a single combined job. The result is a video whose motion, lighting, and sound stay consistent from frame to frame.
Where a lot of text-to-video tools stop at a few silent seconds, Seedance 2.0 renders picture and sound in the same pass, including lip-synced lines, room tone, and effects that track whatever is moving on screen. Paired with output up to native 4K, that makes it a reasonable fit for social clips, product shots, pre-vis, and portfolio pieces where the audio needs to sound finished, not added later.
The two modes map to how you'd actually brief a shot: Text to Video for starting from nothing but a description, and Universal Reference for stacking images, clips, and audio together. An @ tag in your prompt tells the model which file governs identity, camera movement, or sound.