ByteDance introduces Seedance 2.5, a 30-second native 4K AI video model that takes in 50 reference inputs.
ByteDance introduced Seedance 2.5 at its Volcano Engine FORCE conference in Beijing. The model generates 30-second native 4K videos from up to 50 multimodal reference inputs. The company skipped four intermediate version numbers, jumping straight from Seedance 2.0 to 2.5, to underline what it calls a generational advance.
Currently, an enterprise beta is available, with a public launch planned for early July. CEO Liang Rubo emphasized that reaching the summit of AI is the company’s primary goal and that its model-as-a-service initiative is developing into a core operation supported by long-term investments.
A key enhancement is reference capacity: the model now accepts up to 50 multimodal inputs, including images, audio clips, 3D white models, and style references, up from 12 in the earlier version. The expanded input range gives Seedance 2.5 significantly finer control over style, motion, and composition than a text prompt alone.
The model generates natively at 4K resolution rather than upscaling from a lower resolution, a crucial detail for professional production workflows. It supports 10-bit color depth, which yields smoother gradients and more latitude for color grading in post-production. ByteDance also claims a 20 percent improvement in prompt adherence, meaning fewer iterations are needed to reach a satisfactory output.
Audio is now processed alongside visual signals, ensuring that on-screen actions sync accurately with their sound effects. A new 3D white-box preview feature allows creators to produce low-fidelity animations before finalizing a full-quality render. Collectively, these enhancements position the model as a valuable production resource rather than just a novelty generator.
The announcement follows a three-month period in which ByteDance was compelled to add watermarking and IP protections to Seedance 2.0 after receiving cease-and-desist letters from Disney, Warner Bros. Discovery, Paramount, and Netflix. A viral deepfake involving Tom Cruise and Brad Pitt led to a formal complaint from the Motion Picture Association and criticism from SAG-AFTRA.
ByteDance halted the global rollout in mid-March but resumed it briefly through CapCut at the end of March, with face-blocking filters, C2PA watermarks, and detection of copyrighted characters in place. No timeline has been given for when the new model will be available in the United States.
Since February, the competitive landscape has changed significantly. OpenAI discontinued Sora in March after the video tool reached about a million users while costing approximately one million dollars a day to run and generating slightly over two million dollars in total revenue.
Google's Veo 3.1 has taken much of the market share, offering native 4K output, audio generation, and support for up to three reference images for style control. The new ByteDance model, however, accepts 50 reference inputs to Veo's three, a gap that matters most to professional users.
The AI video generation market has diversified rapidly, with Chinese models advancing faster in production tools than their Western counterparts. Third-party platforms such as Reallusion’s AI Studio have already built professional workflows on the predecessor model, while Runway’s fourth-generation tool has dropped out of the top 10 on Artificial Analysis.
The main concern remains whether the new model can penetrate global markets without reigniting the copyright conflicts that hindered its predecessor. ByteDance has the technological capability, a distribution network through CapCut’s 400 million monthly active users, and a streamlined process from generation to editing to sharing. However, it still needs to resolve issues with Hollywood, and each new feature that enhances the model’s capabilities also increases the stakes in this ongoing dispute.
