Why You’ll Love It
Sharper Detail
Preserves fine textures, hair, text, and edges with enhanced clarity and improved visual fidelity.
Prompt Precision
Accurately follows complex prompts, handling multiple subjects, layouts, and detailed stylistic instructions.
Stronger Motion
Delivers smoother, more natural motion with improved consistency and fewer visual distortions.
Cleaner Audio
Produces synchronized audio with reduced noise, fewer gaps, and clearer overall sound quality.
Native Portrait
Generates true vertical 1080 × 1920 videos optimized for portrait viewing, not cropped from landscape.
Better Image-to-Video
Ensures stable motion from images with less freezing, reduced drift, and improved frame consistency.
What Creators Are Saying
How LTX 2.3 Works
Frequently Asked Questions
What is LTX 2.3?
LTX 2.3 is the latest release in the LTX-2 model family, a diffusion transformer video model that generates high-fidelity video and synchronized audio from a single model. It supports text-to-video, image-to-video, and audio-to-video generation up to 1080p, including native portrait video.
What has improved over LTX 2?
LTX 2.3 improves four major areas over LTX 2: sharper detail, tighter prompt adherence, native portrait video support, and cleaner audio quality.
Does LTX 2.3 support image-to-video?
Yes. LTX 2.3 supports image-to-video generation and improves motion consistency, reducing freezing and unwanted artificial movement.
Can it generate vertical (portrait) video?
Yes. It supports native 9:16 portrait generation up to 1080 × 1920, trained specifically for portrait orientation instead of cropping from landscape.
Does LTX 2.3 generate audio?
Yes. LTX 2.3 generates synchronized audio and produces cleaner output, with fewer silence gaps and noise artifacts.
Are the model weights open?
Yes. The model weights are available on Hugging Face under an open license, including base, fp8, and distilled variants.
Can I run LTX 2.3 locally?
Yes. Multiple checkpoints are available for local use, including full, quantized, distilled, and latent upscaler variants.
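As a rough illustration of how the checkpoint variants above trade quality for resource cost, here is a minimal sketch of choosing one by available GPU memory. The file names and VRAM thresholds are hypothetical assumptions for illustration, not official release details.

```python
# Illustrative only: the checkpoint file names and VRAM thresholds below
# are hypothetical assumptions, not official LTX 2.3 release values.

def pick_checkpoint(vram_gb: float) -> str:
    """Choose a checkpoint variant for a given amount of GPU memory (GB)."""
    if vram_gb >= 40:
        return "ltx-2.3-full.safetensors"       # full-precision base model
    if vram_gb >= 24:
        return "ltx-2.3-fp8.safetensors"        # fp8-quantized, lower memory
    return "ltx-2.3-distilled.safetensors"      # distilled, smallest and fastest

print(pick_checkpoint(48))  # full-precision variant
print(pick_checkpoint(16))  # distilled variant
```

The general pattern holds regardless of exact names: quantized (fp8) and distilled checkpoints trade some fidelity for lower memory use and faster generation on consumer GPUs.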
Does it work with ComfyUI?
Yes. LTX 2.3 supports ComfyUI with updated custom nodes and reference workflows for text-to-video, image-to-video, and multi-stage generation.