Semantic compression middleware for WhatsApp bots running Hindi, Hinglish, Marathi, Bangla. Compatible with GPT-4, Claude, Gemini — any LLM your bot already uses. One line of code. Margins expand from day one.
मला शिवाजी नगरमध्ये 2BHK फ्लॅट भाड्याने हवा आहे, बजेट 25 हजार आहे. शक्यतो फर्निश्ड असावे आणि पार्किंगची सोय पाहिजे.
{ "i": "rent_apartment", "l": "Shivaji Nagar", "bhk": "2BHK", "b": "25000", "f": "furnished", "amt": "parking" }
Intent preserved. Entities preserved. Language-bloat stripped. Your downstream LLM processes the same meaning with a fraction of the tokens.
Conservative estimates based on GPT-4o / Claude Sonnet pricing and 70% compression.
Assumes 280 input / 90 output tokens per msg · ₹0.21/₹0.85 per 1K tokens · 70% input compression
Lock these savings →Point your OpenAI/Anthropic SDK at Indic Engine. That's the integration.
Cloudflare Worker extracts intent, compresses to dense JSON via Llama 3.1.
Forwarded to your chosen LLM with your key. They bill you for compressed tokens.
| Language | Sample payload | Raw chars | Compressed tokens | Saved |
|---|---|---|---|---|
| Hindi | Real-estate enquiry | 518 | 20 | 88% |
| Bangla | Office-space search | 470 | 51 | 68% |
| Marathi | Rental enquiry | 438 | 53 | 64% |
| Hinglish | Customer complaint | Intent extracted < 600ms | ~70% | |
We're capping the beta at 10 agencies. Priced low because we're learning your workflows.
We'd rather lose a bad-fit deal than over-promise. If you're building something on this list, we'll tell you upfront.
No meeting. No sales call. Just math on your real traffic. 24-hour turnaround.
Request your free auditOne endpoint. One header. One line of code. Works with any OpenAI-compatible SDK (OpenAI, Anthropic Claude via proxy, Gemini-compatible clients).
After onboarding, you'll receive a unique API key. Pass it in the Authorization header.
Standard POST to our edge. Use the vertical query parameter to activate tuned extraction prompts.
Dense JSON. Forward data directly to your LLM's system prompt.
If the compression layer fails JSON validation for any reason, Indic Engine automatically returns the raw input untouched. Your bot never breaks. 100% uptime guaranteed — worst case, you lose the savings for that one message.
Indic Engine is model-agnostic. The compressed output works with:
Real measurements from our Cloudflare edge running Llama 3.1 8B. Every number below is reproducible — we'll run your own CSV against this same stack.
Sample real-world WhatsApp messages run through the production pipeline.
If JSON validation fails (malformed output, empty response, timeout), Indic Engine instantly bypasses compression and returns your raw input. Your bot sees exactly what the end user sent. No retries. No hangs. No silent failures.
Trade-off: you lose the savings on that one message. Benefit: your client's bot never goes down because of us.
No payment. No commitment. We run your anonymized sample through our engine and email you the exact rupee savings — usually within 24 hours.
Your retainer is active. Your margins just got wider.