Wan 2.7

Summary

Wan 2.7 is Alibaba Tongyi Lab's flagship open-source video generation model, released in late March 2026. It produces 15-second 1080p clips with first/last-frame control (an industry first among open-source video models), native synchronized audio, multi-reference conditioning (up to 5 reference videos), and instruction-based editing.

Overview

Wan 2.7 is Alibaba's flagship open-source video generation model, publicly released in late March 2026 (Wan 2.7-Video and Wan 2.7-Image), with broader API and DashScope availability rolled out in early April 2026. It is the third major iteration of the Wan video family from Alibaba's Tongyi Lab and the first to introduce native audio output, first/last frame control, video-to-video editing, and a "thinking mode" for image generation.

In early April 2026, shortly before the broader rollout, Alibaba made waves by topping the Artificial Analysis text-to-video leaderboard under the codename "HappyHorse-1.0", a stealth-tested precursor to Wan 2.7. Together with the Wan 2.2 series (the first open-source video model with an MoE architecture), this run of releases makes Alibaba arguably the top open-weight video generation lab globally.

Specifications

  • Developer: Alibaba Tongyi Lab
  • Release Date: Late March 2026 (Wan 2.7-Video / Wan 2.7-Image); broader rollout April 6–9, 2026
  • Type: Text-to-video, image-to-video, reference-to-video, video editing
  • Output: Up to 1080p video, up to 15 seconds (3× the duration of earlier Wan models)
  • Reference Inputs: Up to 5 reference videos for character/environment/motion-style guidance
  • License: Open-weight (Apache-2.0 family for many checkpoints; verify per-model card)
  • Distribution: Alibaba Cloud DashScope, WaveSpeedAI API, EachLabs, GitHub (Wan-Video org)

Capabilities

First-and-Last-Frame Control: Specify both opening and closing frames; the model generates everything in between. Industry-first capability for an open-source video model.
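
To make the workflow concrete, the sketch below shows how first/last-frame conditioning is typically exposed by hosted video APIs. It is a minimal illustration only: the endpoint URL, model identifier, and field names (first_frame, last_frame, duration_seconds) are hypothetical placeholders, not the documented DashScope or WaveSpeedAI schema; consult the provider's model card for the real interface.

```python
import base64
import requests

# NOTE: endpoint URL, model identifier, and field names below are hypothetical
# placeholders for illustration; the real DashScope / WaveSpeedAI schema for
# Wan 2.7 will differ -- consult the provider's model card.
API_URL = "https://example.com/v1/video/generations"
API_KEY = "YOUR_API_KEY"


def encode_image(path: str) -> str:
    """Read a local image and return it base64-encoded for the request body."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")


payload = {
    "model": "wan2.7-video",                            # hypothetical model id
    "prompt": "A paper boat drifts down a rain-soaked street at dusk.",
    "first_frame": encode_image("opening_frame.png"),   # anchors the clip's opening
    "last_frame": encode_image("closing_frame.png"),    # target the model interpolates toward
    "duration_seconds": 15,
    "resolution": "1080p",
}

resp = requests.post(API_URL, json=payload,
                     headers={"Authorization": f"Bearer {API_KEY}"})
resp.raise_for_status()
print(resp.json())  # typically a job id or a URL to the finished clip
```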

15-Second Clips: 3× the duration of earlier Wan generations; competitive with frontier closed models.

Multi-Reference Conditioning: Up to 5 reference video inputs to guide character continuity, environment style, and motion patterns simultaneously.
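
For multi-reference conditioning, a request body along the following lines could carry the reference clips and their roles. The "references" list and "role" tags are assumed field names for illustration, not confirmed API parameters; the same POST pattern as the sketch above would apply.

```python
import json

# Hypothetical request body for multi-reference conditioning; "references" and
# "role" are assumed field names, not the documented Wan 2.7 API.
payload = {
    "model": "wan2.7-video",
    "prompt": "The same explorer walks through a neon-lit night market in heavy rain.",
    "references": [  # up to 5 reference videos
        {"video_url": "https://example.com/refs/character.mp4",   "role": "character"},    # keep the protagonist consistent
        {"video_url": "https://example.com/refs/environment.mp4", "role": "environment"},  # borrow the market's look
        {"video_url": "https://example.com/refs/motion.mp4",      "role": "motion"},       # reuse the walking gait
    ],
    "duration_seconds": 15,
}

print(json.dumps(payload, indent=2))
```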

Native Audio: Audio output baked into the generation pipeline — synchronized with the video, removing the need for a separate post-hoc audio model.

Instruction-Based Video Editing: Change backgrounds, lighting, or style via natural language; full inpaint/outpaint and re-style of existing video.
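
Instruction-based editing would follow the same request pattern; the minimal sketch below assumes a hypothetical editing variant and field names (video_url, instruction, mask) purely for illustration.

```python
import json

# Hypothetical editing request; the model name and fields are illustrative only.
payload = {
    "model": "wan2.7-video-edit",  # assumed name for the editing variant
    "video_url": "https://example.com/clips/original.mp4",
    "instruction": "Replace the overcast sky with golden-hour light and regrade to warm film tones.",
    "mask": None,                  # optional region mask for inpaint/outpaint-style edits
}

print(json.dumps(payload, indent=2))
```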

Thinking Mode (Image): Wan 2.7-Image incorporates a reasoning step before generation, improving prompt adherence and composition logic.

Open-Source Distribution: Many of the released model variants are free to run locally on consumer hardware, a significant differentiator versus closed competitors such as Veo 4, Runway Gen-4.5, Kling 2.6, and Hailuo 2.3.

Limitations

While Wan 2.7 reaches 1080p / 15 seconds, [[Google DeepMind/Veo 4|Veo 4]] still leads on absolute frontier specs (4K @ 120fps, 15–30s clips). Audio quality and speech-sync are competitive but trail Veo 4 and [[Kuaishou/Kling 2.6|Kling 2.6]] on cinematic dialog scenes. As with other Chinese-origin models, U.S. enterprise adoption involves additional review around data handling and compliance.

Recent Developments

  • Early April 2026 — "HappyHorse-1.0" (Wan 2.7 precursor) tops the Artificial Analysis leaderboards for both text-to-video and image-to-video in blind tests.
  • April 6–9, 2026 — Wan 2.7-Video and Wan 2.7-Image roll out broadly via Alibaba Cloud DashScope and WaveSpeedAI.
  • May 6, 2026 — Wan 2.7-Video coverage emphasizes 15-second clips, prompt-driven generation, and open-source availability for local execution.
  • 2026 Wan family momentum — Wan 2.2 (MoE T2V/I2V), Wan 2.6 (creator-focused), and Wan 2.7 collectively sustain Alibaba's position as the top open-weight video lab.

Last Updated

May 9, 2026