PixVerse Raises $300 Million in China's Largest Video Generation Funding Round

LatePost has exclusively learned that PixVerse recently completed a $300 million Series C round led by CDH Investments, with over 20 participating institutions including entertainment industry players such as China Ruyi and 37 Interactive Entertainment, local government-backed funds such as Yizhuang Guotou and Suchuangtou, and overseas institutions including UOB Venture Management (a subsidiary of United Overseas Bank) and Lion X Fund (under Singapore’s OCBC Bank).

This is the largest single funding round in China’s AI video generation sector to date.

PixVerse’s annual recurring revenue (ARR) exceeded $40 million by the end of 2025. According to the company’s public disclosures from October, its mobile apps — PixVerse (the international version) and PaiWo AI (the domestic Chinese version) — had surpassed 100 million total users, with monthly active users exceeding 16 million.

According to public reports and data, only a handful of China-based AI startups have surpassed $50 million in ARR, including Manus, Lovart, Genspark, and HeyGen.

Most video generation companies and products — such as Runway, Kuaishou’s Kling, Shengshu Technology, and MiniMax’s Hailuo — primarily serve relatively professional content creators through web-based platforms.

PixVerse also has this line of business, but beginning in the second half of 2024, the company pivoted to consumer-facing products, launching its mobile video generation and sharing app PixVerse in Q4 of that year. ByteDance’s Jimeng began internal testing in March of the same year, initially focused on image generation; the Sora App, which later attracted even greater attention, did not launch until October of the following year (2025).

On the model technology front, PixVerse’s latest milestone was the January release of PixVerse R1, which can generate video in real time and extend it indefinitely, a capability enabled by its autoregressive approach. “Real-time plus infinite visuals” has long been discussed as a coming trend in video generation, one that could unlock new experiences such as interactive content and generative games.
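
To make “real time plus infinite” concrete, here is a minimal conceptual sketch of an autoregressive generation loop. It is an illustration under stated assumptions, not PixVerse’s actual architecture or API; every name in it (model.predict_next, the 16-frame context window, and so on) is hypothetical. The key property is that each new chunk of frames is conditioned only on the prompt and a bounded window of already-generated frames, so output can be shown the moment it is produced and generation never has to stop.

```python
# Conceptual sketch: why autoregressive video generation can stream
# in real time and run indefinitely. All names are illustrative.
from collections import deque

CONTEXT_FRAMES = 16  # hypothetical sliding context window


def generate_stream(model, prompt, display):
    """Emit frames chunk by chunk, indefinitely (until interrupted)."""
    # Bounded memory is what makes unbounded length possible: only the
    # most recent CONTEXT_FRAMES frames condition the next prediction.
    context = deque(maxlen=CONTEXT_FRAMES)
    while True:
        # Predict the next chunk from the prompt plus recent frames.
        chunk = model.predict_next(prompt=prompt, context=list(context))
        for frame in chunk:
            display(frame)         # frame is usable immediately (real time)
            context.append(frame)  # slide the window forward

# A fixed-length diffusion model, by contrast, must denoise an entire
# N-frame clip before any frame can be shown, which caps clip length
# and rules out this kind of streaming interaction.
```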

Other companies exploring this direction include Sand.ai and Vivix AI. Sand.ai was founded by Cao Yue, co-founder of Lightyear, and open-sourced its autoregressive video generation model MAGI-1 in March 2025. Vivix AI was founded by Liu Yu, former executive research director at SenseTime, and focuses on “real-time interactive multimodal content.”

After Seedance 2.0 went viral and broke into the mainstream, the pressure cascaded to other video generation companies. PixVerse co-founder Xie Xuzhang told us that while they do feel some anxiety, user data suggests the impact of Seedance 2.0 on PixVerse has not been significant: “The video generation market is large enough that we haven’t yet reached a stage of direct competition.”

Below is LatePost’s interview with Xie Xuzhang, conducted around the time of this funding round, covering recent capital market shifts, industry developments such as Seedance 2.0, emerging competition, and PixVerse’s own “world model” exploration.

After Raising $300 Million: Continuing In-House Model Development, Globalization, and Consumer Products

LatePost: A single round of $300 million sets a record for video generation funding. Why did the investment market show this level of enthusiasm for this direction — and for PixVerse — at the end of 2025?

Xie Xuzhang: Because from the perspective of both attention and revenue, video generation is one of the fastest-growing areas across all AI verticals. Our peers have also raised substantial amounts. (Note: U.S. video generation company Runway completed a $315 million round in February 2026.)

LatePost: Why were there so many participating institutions in this round? Is it because many parties wanted in, but few had the conviction to write large checks?

Xie Xuzhang: When we launched the fundraise at the end of last year, we originally planned to raise $100 million. But investor interest — both domestic and international — was strong enough that we decided to raise more and build a war chest.

LatePost: How do you plan to spend the $300 million?

Xie Xuzhang: We’ll continue investing in R&D, exploring new business lines, and expanding into global markets. We want to build the best video model and further develop our R1 series.

LatePost: What does a single training run cost you? How much of the $300 million will go toward model R&D?

Xie Xuzhang: To develop a model of comparable — or even superior — quality, we use fewer than a thousand GPUs per month on average for training, costing roughly 10% of what our peers spend. We hope to increase that investment several-fold this year.

LatePost: Why can you operate at a fraction of the cost?

Xie Xuzhang: It’s a combination of advantages — model architecture, algorithms, engineering, and product capabilities. Outside observers tend to look for simple explanations, but there’s no single factor.

Previously, some foundation model companies that raised more money than we did had no shortage of talent or data, yet most of them still couldn’t produce a video generation model. That tells you that building a video generation model is inherently difficult.

LatePost: But top AI labs like OpenAI can still do it, and in a broad sense, they’re your competitors.

Xie Xuzhang: It’s true that when Sora first launched in early 2024, the narrative was “startups are finished,” “big tech will dominate everything,” “startups should build applications, not foundation models.” Despite all the external commentary, we were internally resolute about keeping model capabilities in our own hands.

Sora’s arrival also had an upside. We were founded in 2023, and before Sora, even doing video model research as a startup in China was contrarian; all the attention was on large language model companies. Sora turned our self-developed video generation from a contrarian bet into a consensus view. So why not keep pushing a little longer?

LatePost: When did that conviction get validated?

Xie Xuzhang: In the second half of 2024, when we released PixVerse V3. We had a superhero transformation effect that went viral globally. That’s when we felt we had gotten both the model and the product right.

LatePost: How did you come up with — or land on — “templates” as a product format?

Xie Xuzhang: We had decided to go consumer-facing, and adopting “templates” was a way to serve ordinary people. This aligned with our original mission when we founded the company — the technology was simply ready at that point.

LatePost: Isn’t ByteDance better positioned to do this?

Xie Xuzhang: At that time, they hadn’t reacted yet, which gave us an opening.

LatePost: What if ByteDance now invests more heavily in consumer-facing video generation? It has traffic, products, and models.

Xie Xuzhang: We’re not doing the same thing. ByteDance’s highest-traffic video products are Douyin and TikTok, which are primarily about short-video consumption. Our focus is empowering ordinary users who have never made a video before to create with AI, and new creators can experience AI-native creation and distribution on our platform.

LatePost: After all your exploration in video generation, which scenario do you think can truly support tens of billions in revenue?

Xie Xuzhang: This is already happening. In the United States, monthly video generation API call volume exceeds $100 million — that’s over 10 billion RMB in annual API volume.
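
For scale, the currency conversion behind that figure, assuming an exchange rate of roughly 7.1 RMB per USD (the rate is our approximation, not from the interview):

$$\$100\text{M/month} \times 12 = \$1.2\text{B/year} \approx 8.5\ \text{billion RMB},$$

so the “over 10 billion RMB” annualized figure implies monthly call volume running well above the $100 million floor, closer to $120 million.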

LatePost: Do you have any contrarian bets you’re still pursuing?

Xie Xuzhang: Continuing to invest in in-house model development after Seedance 2.0, continuing to build a global consumer product, continuing to serve ordinary users — all of these remain contrarian. The other one is continuing to invest in R1, our video-based world model.

Seedance 2.0’s Emergence Is a Good Thing — the Industry Isn’t at Direct Competition Yet

LatePost: Seedance 2.0 went viral. Do you feel anxious or pressured?

Xie Xuzhang: We do feel some anxiety. But from our founding in 2023 to now, we’ve been through this many times over three years — Sora, Kling, Veo, and more. There have been too many “world-changing” launches and too many “disruptions.”

Looking at the data, we haven’t been meaningfully impacted. Even when Sora 2 launched, the effect on us was minimal. The consumer video generation market is far larger than people imagine — we’re nowhere near the stage of direct competition.

LatePost: The Sora App has been out for almost half a year now. In hindsight, do you think it represents a real consumer platform opportunity?

Xie Xuzhang: At least judging by results, Sora App’s retention is significantly lower than PixVerse’s.

LatePost: According to Sensor Tower estimates, the Sora App’s 30-day retention is 8%. What’s yours?

Xie Xuzhang: On SimilarWeb, you can see that our bounce rate is lower than Sora’s. A lower bounce rate means users are more willing to engage with your site. Based on third-party data, both our app and web retention are the highest in the industry. (Note: Bounce rate is the share of visitors who leave after viewing only a single page, i.e., users who open the site or app and leave immediately without any meaningful interaction.)
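
For reference, the conventional web-analytics formula behind that note (SimilarWeb’s exact methodology may differ in detail):

$$\text{bounce rate} = \frac{\text{sessions with exactly one page view}}{\text{total sessions}}$$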

LatePost: With Seedance 2.0, could ByteDance’s Jimeng become a “more successful version of the Sora App”?

Xie Xuzhang: From what we understand, over the past six months to a year, Jimeng’s primary user base has been professional users. Whether that changes after Seedance 2.0 may take another month or two to determine. When the Sora App first launched, many people thought it would be a super app — but that was disproven within a month.

There’s another point: Jimeng targets the Chinese market; we target the global market. Seedance 2.0 is an excellent model, but will it give rise to the next super app? Not necessarily.

LatePost: What does your core user profile look like?

Xie Xuzhang: Many of our users are creating AI-generated videos for the first time. Globally, billions of people watch videos, but fewer than 10% of them are creators; the other 90-plus percent also have a desire to express themselves. We want to use AI to help them become creators.

LatePost: You’re also serving enterprise clients in professional content production. What’s the relationship with your consumer product?

Xie Xuzhang: The consumer side is the larger share, but enterprise professional content production is growing too. This year, for instance, we’re seeing clear revenue growth in industries like comic dramas.

Interactive Video Will Change the Logic of Content Production

LatePost: PixVerse recently released “World Model PixVerse R1,” a video generation model capable of real-time generation. Is this really a “world model”? Some have dismissed the label as a buzzword play.

Xie Xuzhang: There are multiple technical paths toward world models within the industry. When Sora launched, it called itself a “world simulator.” Runway has also released video-based world models. Our definition of a world model, put simply, is letting AI learn enough about how things work to produce a model that can predictively simulate physics, causality, space, and time.

Within this broad direction, different companies pursue it through video, 3D, robotics, and other approaches. I believe our model — which learns from video representations of the real world to construct a virtual world — is an important technical path.

LatePost: We’ve seen R1’s demo videos. When a user sends a prompt to interact with the scene, the visuals change somewhat abruptly and only hold for a few seconds before reverting to their original state. Is this a satisfactory result?

Xie Xuzhang: R1 has many use cases. We have some with stable storylines and others that are completely open-ended, designed for different scenarios. The model itself is fully open and has infinite possibilities, but it’s still at an early stage.

LatePost: Why are you investing in “real-time” and “infinite generation”?

Xie Xuzhang: Our team has always pursued forward-looking experiments. In 2023, when everyone was racing on language models, we were already building video. Later, when we focused on consumer AI templates, few others were doing that either. Real-time generation is the same story — we want to explore how video foundation models can find new applications and carve out a new path.

As for R1, our conviction is that the boundary between video and games will inevitably blur. Once video becomes interactive, entirely new content, users, and creative opportunities will emerge — so we need to position ourselves early.

LatePost: What specific new opportunities could interactive video generation create?

Xie Xuzhang: In R1, creation and consumption are fused. The viewer is the creator — they participate by spending tokens to interact. In the future, people may watch videos that share the same basic framework but whose visuals and storylines change in real time based on individual preferences. This will bring a fundamental shift to current content consumption models.

LatePost: How many people actually want to create while watching a video? Wouldn’t most people just want to sit back and relax?

Xie Xuzhang: Many users have ideas they’d like others to see. Perhaps through video generation, they can show others the images in their minds. Users are also more inclined to share this kind of content.

LatePost: R1 is currently a standalone web product. When will it be integrated into your mobile product?

Xie Xuzhang: It’s currently a separate product line. We’ll experiment with mobile formats while also considering further developing R1 into an AI-native video game engine.

LatePost: Since R1’s release, which industries have approached you for partnerships?

Xie Xuzhang: The gaming industry, primarily. Google’s Genie 3 recently disrupted traditional game engines, and R1 could similarly use AI-native models to reshape the foundational layer of game creation. It could become the core foundation for AI game engines. Future game development won’t require the grueling long development cycles of the past — whether it’s gameplay, visuals, or storytelling, AI can make everything lighter and more imaginative. More importantly, it can help people with creative ideas but no coding skills turn those ideas into real games.

At the same time, short drama and comic drama teams have been actively reaching out. Previously, all video was filmed first and then distributed to audiences. But imagine watching a short drama in the future where you choose whether the protagonist becomes a downtrodden live-in son-in-law — a popular Chinese drama trope — or starts a business. The storyline is up to the viewer. Once video becomes interactive, the fundamental logic of the entire content industry changes.

We’ll continue to focus on the underlying technology and look forward to exploring with partners in gaming, film and television, and smart devices to build the first meaningful product on this model.