Open Source AI Video Model Directory

A curated directory of open-source AI video generators, editors, and avatar models.

29 tracked models, sorted by release date (newest first).
| Model | Description | Creator | Release date | GitHub / Repo | Paper / Docs |
|---|---|---|---|---|---|
| LongCat-Video | 13.6B foundational video generator with long-term coherence. Supports image/video/text input and continuation. | Meituan | Oct 28, 2025 | LongCat-Video | arXiv:2510.22200 |
| Ditto | Instruction-based video editing framework. Supports high-fidelity scene/subject/style edits using natural-language instructions, built on a large curated instruction dataset for video editing. | Ditto Team (EzioBy) | Oct 17, 2025 | github.com/EzioBy/Ditto | arXiv:2510.15742 |
| FlashVSR | Real-time 4K video upscaling with diffusion. | OpenImagingLab | Oct 14, 2025 | FlashVSR | arXiv:2510.12747 |
| MoCha | End-to-end character replacement in video without keypoints. Requires only a first-frame mask and a reference. | Orange 3DV Team | Oct 2025 | MoCha | arXiv:2503.23307 |
| Ovi | Video + audio generation from text/image prompts. Twin diffusion backbone for video and audio. | Character AI | Sep 30, 2025 | Ovi | arXiv:2510.01284 |
| Wan-Alpha | High-quality text-to-video generation model supporting alpha-channel / transparent-background outputs. Built on the Wan 2.1-T2V-14B backbone and LightX2V for fast inference and alpha compositing. | WeChatCV (WeChat CV Lab) | Sep 30, 2025 (v1.0 release) | WeChatCV/Wan-Alpha | arXiv:2509.24979 |
| WanAnimate | Character animation & replacement model using video as reference. Integrated with Wan 2.2. | Alibaba Tongyi Lab | Sep 19, 2025 | Wan2.2 Animate | arXiv:2509.14055 |
| Lynx | High-fidelity personalized video generation model focused on identity preservation. Generates new videos of a specific person from one reference image, using ID-adapters and Ref-adapters for facial detail control. | ByteDance | Sep 18, 2025 | github.com/bytedance/lynx | arXiv:2509.15496 |
| Lucy Edit | Text-guided video editing model enabling object, style, character, and scene edits while preserving original motion. Built on a Wan2.2-5B-based architecture with efficient edit-conditioning. | DecartAI | Sep 18, 2025 | github.com/DecartAI/Lucy-Edit-ComfyUI | Lucy Edit Paper |
| HuMo | Multimodal (text/image/audio) model for talking human videos with strong subject and lip-sync consistency. | ByteDance | Sep 10, 2025 | HuMo | arXiv:2509.08519 |
| Stand-In | Plug-and-play module for maintaining facial identity during video generation across scenes or styles. | Tencent WeChat CV Lab | Sep 2025 | Stand-In | arXiv:2508.07901 |
| InfiniteTalk | Audio-driven long-form talking-video generator. Produces image-to-video and video-to-video talking portraits with full-body, head, and lip synchronization; supports unlimited video length and sparse-frame generation. | MeiGen-AI | Aug 19, 2025 | github.com/MeiGen-AI/InfiniteTalk | arXiv:2508.14033 |
| Wan 2.2 (14B) | Second-gen Wan model with Mixture-of-Experts. Enables cinematic 720p videos with better aesthetic and physical control. | Alibaba PAI / Tongyi Lab | Jul 29, 2025 | Wan2.2 | arXiv:2503.20314 |
| Wan 2.2 (5B) | Lightweight dense version of Wan 2.2 with a 3D-aware VAE. Can generate 5-second 720p/24 FPS video on a single high-end GPU. | Alibaba PAI / Tongyi Lab | Jul 29, 2025 | Wan2.2 | arXiv:2503.20314 |
| ReCamMaster | Novel-view video generation via camera trajectory input. Enables re-rendering videos with new motion. | Kuaishou & Zhejiang Univ | Jul 9, 2025 | ReCamMaster | arXiv:2503.11647 |
| FantasyPortrait | Multi-character animation with expression-level control. Synchronized expressions across faces. | Alibaba AMAP Lab | Jul 2025 | FantasyPortrait | arXiv:2507.12956 |
| EchoShot | Multi-shot video generation of the same subject with coherent identity across shots. | Beihang Univ / D2I Lab | Jul 2025 | EchoShot | arXiv:2506.15838 |
| MTVCraft | Audio-video generation framework that splits text into sound streams and aligns visuals. | BAAI | Jun 2025 | MTVCraft | arXiv:2506.08003 |
| Phantom | Identity-preserving text+image-to-video framework. Integrates with the Wan backbone and uses multi-subject memory. | ByteDance | May 27, 2025 | Phantom | arXiv:2502.11079 |
| ATI | Adds trajectory control to Wan models via a lightweight conditioning layer. | ByteDance | May 2025 | ATI | arXiv:2505.22944 |
| MiniMax-Remover | Object removal model trained with minimax optimization and distilled for fast inference. | Fudan Univ & Tencent | May 2025 | MiniMax-Remover | arXiv:2505.24873 |
| MultiTalk | Audio-driven multi-character video generation framework. Supports distinct voices and identity-mapped lip sync. | MeiGen-AI | May 2025 | MultiTalk | arXiv:2505.22647 |
| Hunyuan Avatar | Multi-character audio-driven avatar video generator. Supports emotion-aware speech animation, multi-speaker dialog videos, and realistic expression/motion using a multimodal diffusion transformer. | Tencent Hunyuan Lab | May 2025 | github.com/Tencent-Hunyuan/HunyuanVideo-Avatar | arXiv:2505.20156 |
| Uni3C | 3D-enhanced model with simultaneous camera and human pose control for video generation. | Alibaba DAMO | Apr 2025 | Uni3C | arXiv:2504.14899 |
| FantasyTalking | Talking-head video generator using portrait + audio. Includes body/gesture motion and emotion control. | Alibaba AMAP Lab | Apr 2025 | FantasyTalking | arXiv:2504.04842 |
| SkyReels V2 | Infinite-length text/image-to-video model with autoregressive stitching and cinematic control features. | Skywork AI | Apr 2025 | SkyReels | arXiv:2504.13074 |
| VACE | Unified framework for video creation and editing. Combines motion control, style, object manipulation, and more into one architecture. | Alibaba DAMO / Tongyi Lab | Mar 2025 | VACE | arXiv:2503.07598 |
| Wan 2.1 | First open-source model in the Wan series, in 14B and 1.3B versions. Handles text-to-video/image generation with strong object motion, scene consistency, and bilingual prompt support. | Alibaba PAI / Tongyi Lab | Feb 27, 2025 | Wan2.1 | Wan Paper (arXiv) |
| LivePortrait | Efficient portrait animation framework that transforms a single still image into a lifelike video with head/eye/face motion, and supports stitching and retargeting control for high-quality output. | Kuaishou Technology (KwaiVGI) | Jul 4, 2024 (code release) | LivePortrait | arXiv:2407.03168 |
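Every entry in the directory carries the same six fields, so the listing can be treated as structured data. The sketch below (the `VideoModel` class and `release_key` helper are illustrative, not part of any listed project) sorts a handful of rows from the table newest-first, matching the directory's ordering. Note that it parses both full dates ("Oct 28, 2025") and month-only dates ("Sep 2025"), which month-only rows implicitly treat as the first of the month:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class VideoModel:
    name: str
    creator: str
    released: str   # "Oct 28, 2025" or month-only "Sep 2025"
    arxiv: str      # arXiv identifier from the Paper / Docs column

def release_key(m: VideoModel) -> datetime:
    """Parse full or month-only release dates so rows sort newest-first.
    Month-only dates resolve to the first day of that month."""
    for fmt in ("%b %d, %Y", "%b %Y"):
        try:
            return datetime.strptime(m.released, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized release date: {m.released!r}")

# A few rows from the table above, deliberately out of order.
models = [
    VideoModel("VACE", "Alibaba DAMO / Tongyi Lab", "Mar 2025", "2503.07598"),
    VideoModel("LongCat-Video", "Meituan", "Oct 28, 2025", "2510.22200"),
    VideoModel("Stand-In", "Tencent WeChat CV Lab", "Sep 2025", "2508.07901"),
    VideoModel("FlashVSR", "OpenImagingLab", "Oct 14, 2025", "2510.12747"),
]

newest_first = sorted(models, key=release_key, reverse=True)
print([m.name for m in newest_first])
# -> ['LongCat-Video', 'FlashVSR', 'Stand-In', 'VACE']
```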

Ready to test?

Try Wan Animate and LivePortrait

Upload a character, record a driving video, and let our pipeline handle motion transfer, lip sync, and model deployment for you.
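The upload-then-drive flow above can be sketched as a job spec. Everything in this snippet is hypothetical (the function name, stage labels, and model identifiers are made up for illustration and are not a real API of this site, Wan Animate, or LivePortrait); it only shows the three stages the pipeline is described as handling:

```python
# Hypothetical sketch of the "upload a character, record a driving video"
# flow. None of these names come from a real API.

def build_motion_transfer_job(character_image: str, driving_video: str,
                              model: str = "wan-animate") -> dict:
    """Assemble a job spec covering the three described stages:
    motion transfer, lip sync, and deployment of the finished video."""
    if model not in {"wan-animate", "liveportrait"}:
        raise ValueError(f"unsupported model: {model}")
    return {
        "model": model,
        "inputs": {"character": character_image, "driver": driving_video},
        "stages": ["motion_transfer", "lip_sync", "deploy"],
    }

job = build_motion_transfer_job("hero.png", "driver.mp4")
print(job["stages"])
# -> ['motion_transfer', 'lip_sync', 'deploy']
```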