LIVE · Cloudflare Edge · Llama 3.1 8B

Cut your LLM bill by seventy-five percent.
Keep every drop of intelligence.

Semantic compression middleware for WhatsApp bots running in Hindi, Hinglish, Marathi, and Bangla. Compatible with GPT-4, Claude, Gemini — any LLM your bot already uses. One line of code. Margins expand from day one.

Hindi · 88% tokens saved
Marathi · 64% tokens saved
Bangla · 68% tokens saved
Edge latency · ~500ms avg p50
01 · What it does

A real Marathi message. Compressed in one hop.

Raw WhatsApp Message 438 chars · ~150 tokens

मला शिवाजी नगरमध्ये 2BHK फ्लॅट भाड्याने हवा आहे, बजेट 25 हजार आहे. शक्यतो फर्निश्ड असावे आणि पार्किंगची सोय पाहिजे.
(English: "I need a 2BHK flat to rent in Shivaji Nagar, budget is 25 thousand. Preferably furnished, and with parking.")

Compressed JSON → LLM 53 tokens · 64% saved
{
  "i": "rent_apartment",
  "l": "Shivaji Nagar",
  "bhk": "2BHK",
  "b": "25000",
  "f": "furnished",
  "amt": "parking"
}

Intent preserved. Entities preserved. Language bloat stripped. Your downstream LLM processes the same meaning with a fraction of the tokens.

02 · Calculate your savings

Your margin expansion, in rupees.

Conservative estimates based on GPT-4o / Claude Sonnet pricing and 70% compression.

[Interactive calculator: set your current monthly GPT-4o / Claude spend (₹100 – ₹1,00,000) to see your spend with Indic Engine (including the ₹5k retainer), net monthly savings, the days until the retainer pays for itself, and annual net savings.]

Assumes 280 input / 90 output tokens per msg · ₹0.21/₹0.85 per 1K tokens · 70% input compression
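The footnote's assumptions translate into simple arithmetic. A minimal sketch of the calculator's math; the 500,000-messages-per-month volume is illustrative, not a figure from this page:

```javascript
// Stated assumptions: 280 input / 90 output tokens per message,
// ₹0.21 / ₹0.85 per 1K input/output tokens, 70% input compression,
// ₹5,000/month retainer. Output tokens are unaffected by compression.
const IN_TOK = 280, OUT_TOK = 90;       // tokens per message
const IN_RATE = 0.21, OUT_RATE = 0.85;  // ₹ per 1K tokens
const COMPRESSION = 0.70;               // share of input tokens stripped
const RETAINER = 5000;                  // ₹ per month

function monthlyCosts(messages) {
  const before = messages * (IN_TOK * IN_RATE + OUT_TOK * OUT_RATE) / 1000;
  const after = messages *
    (IN_TOK * (1 - COMPRESSION) * IN_RATE + OUT_TOK * OUT_RATE) / 1000
    + RETAINER;
  return {
    before: Math.round(before),                   // current monthly spend, ₹
    after: Math.round(after),                     // with Indic Engine, ₹
    net: Math.round(before) - Math.round(after),  // you keep, ₹
  };
}

// e.g. 500,000 messages/month:
const { before, after, net } = monthlyCosts(500000);
```

Because compression applies to input tokens only, the overall bill reduction depends on your input/output token mix.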

Lock these savings →
03 · Integration

One line of code. Zero refactoring.

01

Change baseURL

Point your OpenAI/Anthropic SDK at Indic Engine. That's the integration.

02

We compress at edge

Cloudflare Worker extracts intent, compresses to dense JSON via Llama 3.1.

03

Your LLM, cheaper

Forwarded to your chosen LLM with your key. They bill you for compressed tokens.

// Before
const openai = new OpenAI({
  baseURL: "https://api.openai.com/v1",
  apiKey: process.env.OPENAI_KEY
});

// After — one line changed
const openai = new OpenAI({
  baseURL: "https://indic-engine.com/v1", // ← that's it
  apiKey: process.env.OPENAI_KEY
});
04 · Real numbers

Benchmarks from our edge.

View full benchmarks →
Language | Sample payload | Raw chars | Compressed tokens | Saved
Hindi | Real-estate enquiry | 518 | 20 | 88%
Bangla | Office-space search | 470 | 51 | 68%
Marathi | Rental enquiry | 438 | 53 | 64%
Hinglish | Customer complaint | intent extracted in <600ms | ~70%
05 · Pricing

One tier open. For now.

We're capping the beta at 10 agencies. Priced low because we're learning your workflows.

Beta · Open
Agency Beta
₹5,000 /month
+ ₹2 per 1,000 tokens saved over 100K
  • Up to 100K saved tokens / mo
  • All Indic languages supported
  • Concierge onboarding (2 hrs)
  • Founder-direct support
  • Vertical-tuned prompts
  • Monthly savings reports
Start with free audit
Agency Pro
₹15,000 /month
+ ₹1.5 per 1,000 tokens saved over 500K
  • Up to 500K saved tokens / mo
  • Priority Slack channel
  • Custom vertical prompts
  • 99.9% uptime SLA
  • Real-time dashboard access
Enterprise
Custom
Volume pricing · Dedicated infra
  • Unlimited tokens
  • Dedicated engineer
  • Private deployment option
  • Custom SLA
  • Routing to Sarvam/Bhashini
Contact founder
Honest boundaries

What Indic Engine isn't.

We'd rather lose a bad-fit deal than over-promise. If you're building something on this list, we'll tell you upfront.

  • Not a replacement for your LLM
    You still use GPT-4, Claude, or Gemini. We sit in front.
  • Not a translation service
    Output is structured JSON for your bot. Not translated prose.
  • Not a chatbot builder
    You already have the bot. We optimize what it already does.
  • Not ideal for long multi-turn reasoning
    Best for transactional messages — orders, queries, leads. Use raw LLM for complex reasoning flows.

Send us 100 anonymized messages.
We'll show you exactly what you're overpaying.

No meeting. No sales call. Just math on your real traffic. 24-hour turnaround.

Request your free audit
Integration Guide

Indic Engine Docs

One endpoint. One header. One line of code. Works with any OpenAI-compatible SDK (OpenAI, Anthropic Claude via proxy, Gemini-compatible clients).

1

Authentication

After onboarding, you'll receive a unique API key. Pass it in the Authorization header.

Authorization: Bearer ie_live_YOUR_API_KEY_HERE
Content-Type: application/json
2

Endpoint

Standard POST to our edge. Use the vertical query parameter to activate tuned extraction prompts.

POST https://indic-engine.com/v1/chat/completions?vertical=realestate

// Supported verticals:
// realestate · ecommerce · leadgen · default
3

Request body

{
  "input": "Bhaiya mujhe Hinjewadi phase 2 mein 3BHK chahiye, budget 1.5cr."
}

(English: "Brother, I need a 3BHK in Hinjewadi Phase 2, budget ₹1.5 crore.")
4

Response

Dense JSON. Forward data directly to your LLM's system prompt.

{
  "savings": "75%",
  "tokens": { "in": 120, "out": 30 },
  "data": "{\"i\":\"search\",\"l\":\"Hinjewadi phase 2\",\"bhk\":\"3BHK\",\"b\":\"1.5cr\"}"
}
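The "data" field arrives as a JSON string ready to drop into a system prompt. A sketch of that forwarding step, assuming the response shape shown in step 4; the `buildMessages` helper and the prompt wording are illustrative, not part of the API:

```javascript
// Build the message array that carries the compressed intent to your
// LLM. `engineResponse` follows the sample response shape above.
function buildMessages(engineResponse, userInstruction) {
  return [
    {
      role: "system",
      content: `User intent (compressed JSON): ${engineResponse.data}`,
    },
    { role: "user", content: userInstruction },
  ];
}

// Usage with the sample response:
const sample = {
  savings: "75%",
  tokens: { in: 120, out: 30 },
  data: '{"i":"search","l":"Hinjewadi phase 2","bhk":"3BHK","b":"1.5cr"}',
};
const messages = buildMessages(sample, "Suggest matching listings.");
// Pass `messages` to openai.chat.completions.create({ model, messages }).
```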
5

Full example (curl)

curl -X POST "https://indic-engine.com/v1/chat/completions?vertical=realestate" \
  -H "Authorization: Bearer ie_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{"input": "मला शिवाजी नगरमध्ये 2BHK हवे आहे"}'
# input (Marathi): "I need a 2BHK in Shivaji Nagar"
Built-in fail-safe

If the compression layer fails JSON validation for any reason, Indic Engine automatically returns the raw input untouched. Your bot never breaks. 100% uptime guaranteed — worst case, you lose the savings for that one message.
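The fail-safe amounts to a validate-or-passthrough branch. A minimal sketch of that logic, with a hypothetical helper name; this is not the production Worker code:

```javascript
// If the compressor's output is not valid JSON, pass the raw user
// message through unchanged, as described above.
function compressOrFallback(rawInput, compressedCandidate) {
  try {
    JSON.parse(compressedCandidate); // must parse as JSON
    return { payload: compressedCandidate, compressed: true };
  } catch {
    // Malformed or empty output: the bot sees exactly what the user sent.
    return { payload: rawInput, compressed: false };
  }
}
```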

LLM compatibility

Indic Engine is model-agnostic. The compressed output works with:

  • OpenAI GPT-4, GPT-4o, GPT-3.5
  • Anthropic Claude (Sonnet, Opus, Haiku)
  • Google Gemini (Pro, Flash)
  • Any OpenAI-SDK-compatible model
Edge Benchmarks

Benchmarks

Real measurements from our Cloudflare edge running Llama 3.1 8B. Every number below is reproducible — we'll run your own CSV against this same stack.

Avg edge latency
~500ms
Best compression
88%
Uptime guarantee
100%
Fallback on fail
0ms added

Per-language results

Sample real-world WhatsApp messages run through the production pipeline.

Hindi Devanagari script
88% saved
Raw
518 characters · real-estate enquiry
Compressed
20 tokens to LLM
Bangla Bengali script
68% saved
Raw
470 characters · office-space search
Compressed
51 tokens to LLM
Marathi Devanagari script
64% saved
Raw
438 characters · rental enquiry
Compressed
53 tokens to LLM
Hinglish Roman + Indic code-mixing
~70% saved
Customer complaint / rant intent extraction completed in <600ms with flawless entity preservation.

Failure behavior

If JSON validation fails (malformed output, empty response, timeout), Indic Engine instantly bypasses compression and returns your raw input. Your bot sees exactly what the end user sent. No retries. No hangs. No silent failures.

Trade-off: you lose the savings on that one message. Benefit: your client's bot never goes down because of us.

Free savings audit

Show us 100 messages. We'll show you the bill you could've had.

No payment. No commitment. We run your anonymized sample through our engine and email you the exact rupee savings — usually within 24 hours.

Opens your email client to send to founder@indic-engine.com. No data stored on this page.

What happens after you submit

  1. Founder replies within 2 hours asking for a CSV of 50–100 anonymized messages (or sample screenshots if easier).
  2. We run them through the engine and benchmark against your current LLM pricing.
  3. You receive a detailed savings report in a Google Sheet: monthly rupees saved, payback period, real compressed samples.
  4. If the numbers make sense for you, we onboard in 15 minutes. If not, you keep the report.

Welcome to Indic Engine.

Your retainer is active. Your margins just got wider.

What happens next
  1. Within 2 hours, our founder will email you from founder@indic-engine.com.
  2. You'll receive your unique API key and vertical-specific integration snippets.
  3. 15-minute Loom walkthrough of the one-line code change.
  4. First weekly savings report arrives next Monday.
Read docs while you wait →