Groq API Integration
Available Models
LLaMA2-70b-chat
Experience the power of LLaMA2-70b with Groq's LPU™ technology, which delivers responses in milliseconds. A minimal request sketch follows the list below.
- Ultra-low latency (~100ms)
- High-accuracy responses
- 70B-parameter model
- $0.70 per million input tokens, $0.70 per million output tokens
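
As a rough illustration, the sketch below sends a single chat completion to Groq's OpenAI-compatible endpoint from Swift. The endpoint URL, model identifier, and field names are assumptions based on the OpenAI-style schema rather than values from this page; verify them against Groq's current API reference.

```swift
import Foundation

// Hypothetical request/response types following the OpenAI-compatible
// chat completions schema; adjust field names to match Groq's actual API.
struct ChatMessage: Codable {
    let role: String
    let content: String
}

struct ChatRequest: Codable {
    let model: String
    let messages: [ChatMessage]
}

struct ChatResponse: Codable {
    struct Choice: Codable { let message: ChatMessage }
    let choices: [Choice]
}

func sendChat(apiKey: String, prompt: String) async throws -> String {
    // Endpoint and model id are assumptions; check Groq's documentation.
    let url = URL(string: "https://api.groq.com/openai/v1/chat/completions")!
    var request = URLRequest(url: url)
    request.httpMethod = "POST"
    request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.httpBody = try JSONEncoder().encode(
        ChatRequest(model: "llama2-70b-4096",
                    messages: [ChatMessage(role: "user", content: prompt)])
    )

    let (data, _) = try await URLSession.shared.data(for: request)
    let response = try JSONDecoder().decode(ChatResponse.self, from: data)
    return response.choices.first?.message.content ?? ""
}
```

Calling this from a SwiftUI view's `.task` modifier keeps the request asynchronous without blocking the UI.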
Mixtral-8x7b-chat
Mixtral offers a strong balance of speed and capability, with reported performance comparable to GPT-3.5 on many benchmarks. A cost-estimate sketch based on the listed prices follows the list below.
- Fastest Mixtral inference available
- Mixture-of-Experts architecture
- Excellent coding capabilities
- $0.27 per million input tokens, $0.27 per million output tokens
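
To see what those rates mean per request, here is a small cost-estimate helper. The struct and the example token counts are hypothetical; only the per-million-token prices come from the listings above, and real token counts would come from the API's usage data.

```swift
// Hypothetical pricing helper; the rates mirror the per-million-token
// prices listed above and should be updated if Groq's pricing changes.
struct ModelPricing {
    let inputPerMillion: Double
    let outputPerMillion: Double

    func cost(inputTokens: Int, outputTokens: Int) -> Double {
        Double(inputTokens) / 1_000_000 * inputPerMillion
            + Double(outputTokens) / 1_000_000 * outputPerMillion
    }
}

let llama2Pricing  = ModelPricing(inputPerMillion: 0.70, outputPerMillion: 0.70)
let mixtralPricing = ModelPricing(inputPerMillion: 0.27, outputPerMillion: 0.27)

// Example: 1,200 prompt tokens plus 300 completion tokens on Mixtral
// costs roughly 1,500 / 1,000,000 * $0.27 ≈ $0.0004.
let estimate = mixtralPricing.cost(inputTokens: 1_200, outputTokens: 300)
```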
Why Choose Groq?
Unmatched Speed
Groq's LPU™ technology delivers the fastest inference speeds in the industry, with response times measured in milliseconds.
- ~100ms latency
- No throttling
- Consistent performance
Cost Effective
Competitive pricing combined with superior speed makes Groq the most cost-effective solution for high-performance AI applications.
- Transparent pricing
- No minimum spend
- Pay only for what you use
Technical Specifications
- Secure API key storage with the iOS Keychain (a storage sketch follows this list)
- Native iOS integration with SwiftUI
- Support for streaming responses
- Automatic token counting and cost estimation
- Real-time response monitoring
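
As one possible shape for the Keychain storage mentioned in the first item, the sketch below saves and reads the API key with the Security framework. The service and account strings are placeholders rather than values from this project.

```swift
import Foundation
import Security

// Minimal Keychain wrapper for the Groq API key; identifiers are hypothetical.
enum APIKeyStore {
    private static let service = "com.example.groq"   // placeholder service name
    private static let account = "groq-api-key"       // placeholder account name

    static func save(_ key: String) throws {
        let query: [String: Any] = [
            kSecClass as String: kSecClassGenericPassword,
            kSecAttrService as String: service,
            kSecAttrAccount as String: account
        ]
        // Remove any existing item so SecItemAdd does not hit errSecDuplicateItem.
        SecItemDelete(query as CFDictionary)

        var attributes = query
        attributes[kSecValueData as String] = Data(key.utf8)
        attributes[kSecAttrAccessible as String] = kSecAttrAccessibleAfterFirstUnlock
        let status = SecItemAdd(attributes as CFDictionary, nil)
        guard status == errSecSuccess else {
            throw NSError(domain: NSOSStatusErrorDomain, code: Int(status))
        }
    }

    static func load() -> String? {
        let query: [String: Any] = [
            kSecClass as String: kSecClassGenericPassword,
            kSecAttrService as String: service,
            kSecAttrAccount as String: account,
            kSecReturnData as String: true,
            kSecMatchLimit as String: kSecMatchLimitOne
        ]
        var result: AnyObject?
        let status = SecItemCopyMatching(query as CFDictionary, &result)
        guard status == errSecSuccess, let data = result as? Data else { return nil }
        return String(data: data, encoding: .utf8)
    }
}
```

Reading the key with `APIKeyStore.load()` before building a request keeps the secret out of source code and UserDefaults.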