01.AI's multimodal vision-language model for image understanding and analysis.
Specifications
Context: 16K
Maximum Output: 16K
Input: text, image
Output: text
Pricing
Input: ¥6.00 / M tokens
Output: ¥6.00 / M tokens
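As a rough illustration of the per-token pricing above, the following sketch estimates the cost of a single request. The helper name `estimate_cost_cny` is hypothetical, and the sketch assumes the token counts are already known; how image inputs are converted to tokens is not stated on this page.

```python
from decimal import Decimal

# Listed prices from the Pricing section, in CNY per million tokens.
INPUT_PRICE_PER_MTOK = Decimal("6.00")
OUTPUT_PRICE_PER_MTOK = Decimal("6.00")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_cny(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: estimated cost in CNY for one request."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Cost is linear in tokens: tokens * (price per million) / 1,000,000.
    return (
        Decimal(input_tokens) * INPUT_PRICE_PER_MTOK
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_MTOK
    ) / TOKENS_PER_MTOK


if __name__ == "__main__":
    # Example: 12,000 input tokens and 1,000 output tokens.
    print(estimate_cost_cny(12_000, 1_000))
```

Using `Decimal` rather than floats avoids rounding drift when many small per-request costs are summed.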
Similar Models
Price (input/output): ¥0.99 / ¥0.99 per M tokens
Context: 16K · Max output: 16K · Availability: no data · TPS: no data
01.AI's fast and efficient language model for general-purpose tasks.

Price (input/output): $3.30 / $4.40 per M tokens
Context: 16K · Max output: 4K · Availability: no data · TPS: no data
GPT-3.5 Turbo variant with extended 16K token context window for longer conversations and documents.

Price (input/output): $4.40 / $17.60 per M tokens
Context: 32K · Max output: 4K · Availability: no data · TPS: no data
This is our first general-availability realtime model, capable of responding to audio and text inputs in realtime over WebRTC, WebSocket, or SIP connections.