Discover The Best AI Websites & Models

27,931 AI tools across 88 categories in the best AI tools directory.

Most Used AI Models

zai-org/GLM-4.7

A state-of-the-art text generation model with 358B parameters, supporting English and Chinese, optimized for agentic reasoning, coding, and complex tool use.

Text Generation

MiniMaxAI/MiniMax-M2.1

MiniMax M2.1 is a state-of-the-art (SOTA) model designed specifically for real-world development and autonomous agents, focusing on coding, tool use, and long-horizon planning.

Text Generation

moonshotai/Kimi-K2.5

Kimi K2.5 is an open-source, native multimodal agentic model built through continual pretraining on approximately 15 trillion mixed visual and text tokens atop Kimi-K2-Base.

Any-to-Any

Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice

The Qwen3-TTS-Tokenizer-12Hz model can encode input speech into codes and decode them back into speech.

Text-to-Speech

openbmb/MiniCPM-o-4_5

A Gemini 2.5 Flash-Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Your Phone

Any-to-Any

deepseek-ai/DeepSeek-OCR-2

DeepSeek-OCR is a model designed to explore the boundaries of visual-text compression, investigating the role of vision encoders from an LLM-centric viewpoint.

Image-to-Text

zai-org/GLM-OCR

GLM-OCR is a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture.

Image-to-Text

PaddlePaddle/PaddleOCR-VL-1.5

PaddleOCR-VL-1.5 is the next-generation version of PaddleOCR-VL, achieving a new state-of-the-art accuracy of 94.5% on OmniDocBench v1.5.

Image-to-Text

Qwen/Qwen3-ASR-1.7B

The Qwen3-ASR family includes Qwen3-ASR-1.7B and Qwen3-ASR-0.6B, which support language identification and ASR for 52 languages and dialects.

Automatic Speech Recognition

Qwen/Qwen3-Coder-Next

An open-weight language model designed specifically for coding agents and local development.

Text Generation

Tongyi-MAI/Z-Image

An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Text-to-Image

openai/whisper-large-v3

Converts speech in audio to text.

Automatic Speech Recognition

openai/sora-2-pro

OpenAI's most advanced video generation model with synchronized audio.

Text-to-Video

openai/gpt-image-1.5

OpenAI's latest image generation model, with better instruction following and prompt adherence.

Text-to-Image

tencent/hunyuan-image-3

A powerful native multimodal model for image generation (PrunaAI squeezed)

Text-to-Image

stability-ai/stable-diffusion-3.5-large

A text-to-image model that generates high-resolution images with fine details. It supports various artistic styles and produces diverse outputs from the same prompt, thanks to Query-Key Normalization.

Text-to-Image

google/nano-banana

Google's latest image editing model in Gemini 2.5.

Text-to-Image

prunaai/p-image

A sub-1-second text-to-image model built for production use cases.

Text-to-Image

recraft-ai/recraft-v3

Recraft V3 (code-named red_panda) is a text-to-image model that can generate long texts and produce images in a wide range of styles. At the time of this listing, it is described as SOTA in image generation according to the Text-to-Image Benchmark by Artificial Analysis.

Text-to-Image