Kling is a series of video models from the Chinese company Kuaishou that regularly ranks at the top of independent video generation benchmarks. Version 2.1 renders physics noticeably more realistically than its predecessors: water flows and splashes with believable eddies, fabric responds to wind and movement, and fire develops dynamically. This is why Kling is often called the best model for animating portraits and natural scenes.
The model operates in two modes. In Image-to-Video (I2V) mode, you upload a photo and receive a video clip with organic movement. In Text-to-Video (T2V) mode, you describe the scene in text, and the model generates it from scratch. In both modes, you can control the camera movement by adding instructions such as "slow pan right" or "cinematic zoom out" to the prompt.
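As a rough illustration of the prompt convention described above, a camera instruction can simply be appended to the scene description. The sketch below is plain string handling and does not call any Kling API; the helper name `with_camera_move` and the comma-separated layout are assumptions made for this example, not a documented prompt format.

```python
from typing import Optional


def with_camera_move(scene: str, camera_move: Optional[str] = None) -> str:
    """Return the scene description with an optional camera instruction appended.

    Hypothetical helper: the comma-separated layout is an assumption,
    not a format required by the model.
    """
    # Trim whitespace and a trailing period so the instruction attaches cleanly.
    scene = scene.strip().rstrip(".")
    if not camera_move:
        return scene
    return f"{scene}, {camera_move.strip()}"


prompt = with_camera_move("A waterfall in a misty forest at dawn.", "slow pan right")
print(prompt)  # A waterfall in a misty forest at dawn, slow pan right
```

The same string can serve as the full prompt in T2V mode or as the accompanying text in I2V mode.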
The price, 30 credits for I2V and 20 for T2V, reflects the high computational cost of generation. If your budget is limited, Wan 2.1 delivers good results for 4 credits.
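To make the trade-off concrete, the short sketch below works out how many generations a given credit budget covers at the prices quoted above. It assumes the prices apply per generated clip and that the Wan 2.1 price is the same regardless of mode; the dictionary keys and the function name are hypothetical labels chosen for this example.

```python
# Credit prices taken from the text; "per clip" is an assumption.
CREDITS_PER_CLIP = {
    "kling_i2v": 30,
    "kling_t2v": 20,
    "wan_2_1": 4,
}


def clips_within_budget(budget: int) -> dict[str, int]:
    """Return how many whole clips each option allows for the given budget."""
    return {name: budget // cost for name, cost in CREDITS_PER_CLIP.items()}


print(clips_within_budget(120))
# {'kling_i2v': 4, 'kling_t2v': 6, 'wan_2_1': 30}
```

With 120 credits, for instance, that is 4 Kling I2V clips against 30 Wan 2.1 clips, which is why drafting with the cheaper model and saving Kling for final renders can be a sensible way to stretch a budget.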