Wan 2.5 is the multimodal mid-tier: faster than 2.6 at slightly lower fidelity, and significantly more capable than 2.2. It introduces native audio-visual generation (sound synthesised alongside video) and a unified pipeline covering text-to-image (T2I), image-to-video (I2V), and text-to-video (T2V). It is a strong choice for rapid iteration and drafts before committing to a 2.6 final render.
Output
| Max duration | ~10 s |
| Resolution | 1080p-class |
| Frame rate | ~24 fps |
| Audio | Native A/V (sound generated with video) |
| Format | MP4 |
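
As a rough illustration of what the output limits imply, the sketch below estimates how many frames a maximum-length clip contains. The duration and frame rate are the approximate figures from the table above; the function name `approx_frame_count` is hypothetical and not part of any Wan tooling.

```python
def approx_frame_count(duration_s: float, fps: float) -> int:
    """Estimate the number of frames in a clip, rounded to the nearest frame."""
    return round(duration_s * fps)


# Approximate figures from the Output table (both are listed as "~" values).
MAX_DURATION_S = 10
FRAME_RATE_FPS = 24

frames = approx_frame_count(MAX_DURATION_S, FRAME_RATE_FPS)
print(f"A ~{MAX_DURATION_S} s clip at ~{FRAME_RATE_FPS} fps is roughly {frames} frames")
```

Because both inputs are approximate, the result is a ballpark figure for planning drafts, not an exact frame budget.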
Generation modes
| Text-to-video (T2V) | ✓ |
| Image-to-video (I2V) | ✓ |
| Text-to-image (T2I) | ✓ (unified pipeline) |
Quality traits
| Identity consistency | Good |
| Prompt adherence | Good |
| Motion stability | Improved vs 2.2 |
| Audio sync | Native (experimental) |
Access
| Free tier | ✓ |
| No download / install | ✓ |