What is Sarvam AI?
Sarvam AI builds sovereign AI for India so that the country can be technologically self-reliant. The company was founded by Dr. Vivek Raghavan and Dr. Pratyush Kumar. It builds Indic-language models, voice-first agents, and infrastructure tailored to rural India. Sarvam relies on local data and local compute to protect India's digital independence.
(1) Geopolitical and Technological Origins
(a) The Global AI Arms Race
Global technology changed in the early 2020s. Generative AI systems such as OpenAI's GPT-4 and Google's Gemini came close to general-purpose reasoning. Countries came to see AI as a foundational element of national power, on par with defense or energy.
The United States leads through its dominance in silicon. China focuses on state-controlled technological power. This creates a divide, and countries in the "Global South" face a choice: become digital colonies that depend on imported intelligence, or build technological self-reliance.
India, the world's most populous digital democracy, sits in the middle of this shift. It succeeded as a global back office in the 1990s and 2000s, but generative AI poses a harder challenge.
We see three big problems for India:
- Training frontier models is extremely expensive.
- Specialized hardware, such as GPUs, is scarce.
- Training data is dominated by Western languages and cultures.
A model trained on the Common Crawl of the English-language internet absorbs Western biases, laws, and social norms. India has 22 officially recognized languages. Relying on Western models for government, education, and law introduces inefficiency and vulnerability.
Sarvam AI was created to meet this national need. It is not a typical consumer startup. The goal of "Sovereign AI" is to build, deploy, and govern AI using local infrastructure, data, and talent. Sarvam positions itself as the intelligence layer on top of the India Stack.
(b) The Founders: Deep Tech and Digital Public Infrastructure Come Together
We looked into the backgrounds of Sarvam AI's co-founders, Dr. Vivek Raghavan and Dr. Pratyush Kumar. Together they combine two strands of Indian technology: digital public infrastructure (DPI) and research on Indic-language computing.
Dr. Vivek Raghavan
- Former Chief Product Manager and Biometric Architect at the Unique Identification Authority of India (UIDAI).
- Helped build Aadhaar, which covers more than 1.4 billion people.
- Brings a "building for billions" mindset.
- Sees generative AI as a public utility that can serve people who do not speak English, not just a luxury for knowledge workers.
Dr. Pratyush Kumar
- Former researcher at Microsoft Research and IBM Research.
- Former professor at IIT Madras.
- Co-founder of AI4Bharat, an initiative that makes AI for Indian languages openly available.
- Led the work to fix the "tokenization bottleneck" for Indic scripts.
(c) Funding and Compute Infrastructure
Building Sovereign AI is capital-intensive. Compute, especially NVIDIA H100 GPUs, is very expensive.
Investment Data and Timeline:
- December 2023: Raised $41 million in a Series A round, one of the largest early-stage raises in Indian deep tech.
- Lead investor: Lightspeed Venture Partners.
- Participating investors: Peak XV Partners (formerly Sequoia India) and Khosla Ventures.
- Note: Vinod Khosla was OpenAI's first institutional backer. His investment supports the "regional champion" thesis, which holds that although OpenAI dominates the global English-language market, regional AI leaders will prevail in India, China, and Europe.
- Valuation in early 2026: around $200 million.
Compute is secured through a strategic partnership. Sarvam worked with Yotta Data Services to gain access to a cluster of 4,096 NVIDIA H100 GPUs. This aligns with the government's IndiaAI Mission and keeps compute resources physically in India, which satisfies data residency rules and reduces supply-chain risk.
(2) The Philosophy of Sovereign AI
(a) What is Sovereignty?
Sarvam defines "sovereignty" as having three parts: Data, Compute, and Utility.
- Sovereign Data: Western models carry a knowledge bias. GPT-4 is familiar with US copyright law but much less so with the Indian Penal Code or "Hinglish." Sarvam collects local datasets and digitizes Indian legal documents, literature, and government records. The result is models that reflect Indian realities.
- Sovereign Compute: This secures strategic autonomy. Relying on US-hosted cloud APIs puts national infrastructure at risk. By hosting models with Yotta in Navi Mumbai, Sarvam ensures that AI inference happens on Indian soil and under Indian law.
- Sovereign Utility: Sarvam's goal is to reach the "next billion users," or "Bharat." These users are mobile-first, voice-first, and often uncomfortable with English text interfaces. Instead of conventional text chatbots, Sarvam builds voice agents and multimodal interactions.
(b) The Split Between "India" and "Bharat"
Sarvam is trying to close the gap between "India" and "Bharat."
- "India": The English-speaking urban elite. They use ChatGPT comfortably to write code or emails.
- "Bharat": The rural majority, who speak local languages. A farmer in Vidarbha cannot ask ChatGPT about the right pesticide dosage in the Varhadi dialect of Marathi.
Sarvam wants to end this digital apartheid. The founders argue that AI can break down the literacy barrier: if an AI can speak and understand local dialects, users do not need to read to get information from the internet. Sarvam works with the government to deliver legal and agricultural services in people's own languages.
(c) Working with the IndiaAI Mission
The Indian government launched the IndiaAI Mission with a budget of Rs. 10,372 crore ($1.25 billion). The goal is a self-reliant AI ecosystem.
Sarvam receives funding for GPUs under this mission. The relationship benefits both sides: the government gains a homegrown technology champion, and Sarvam gains government-scale distribution. Sarvam has discussed collaboration with the National Payments Corporation of India (NPCI). The long-term goal is a "Unified Intelligence Interface" on the India Stack that works with UPI.
(3) Evolution of Technical Architecture and Models
We tracked Sarvam's technical roadmap. The company moved from adapting open-source models to training its own from scratch.
(a) Phase 1: OpenHathi and Tokenization
In December 2023, Sarvam released OpenHathi-Hi-v0.1, based on Meta's Llama-2-7B. Its most important innovation was the tokenizer.
The Token Fertility Issue:
Large language models break text into "tokens." Western tokenizers are optimized for the Latin alphabet. Faced with Devanagari (Hindi) script, they split words into small byte-level fragments. "Fertility" is the average number of tokens produced per word.
- English word on Llama-2 = roughly 1 token.
- Hindi word on Llama-2 = 4 to 8 tokens.
The Effect on the Economy:
LLM costs are based on the number of tokens, so processing Hindi costs four to eight times more than processing English. LLMs also have a fixed context window, such as 4,096 tokens. With high fertility, the model can hold far less of a Hindi conversation in memory.
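The arithmetic behind this can be sketched in a few lines. This is a rough illustration only: it assumes cost is strictly proportional to token count and uses the fertility range quoted above (1 token per English word, 4 to 8 per Hindi word). The function names are made up for this example.

```python
def relative_cost(fertility: float, english_fertility: float = 1.0) -> float:
    """Token (and therefore cost) multiplier versus English for the same word count."""
    return fertility / english_fertility


def words_in_context(context_tokens: int, fertility: float) -> int:
    """Approximate number of words that fit in a fixed context window."""
    return int(context_tokens // fertility)


CONTEXT_TOKENS = 4096  # context window size used as the example in the text

for label, fertility in [("English", 1.0), ("Hindi, low", 4.0), ("Hindi, high", 8.0)]:
    print(
        f"{label}: {relative_cost(fertility):.0f}x cost, "
        f"about {words_in_context(CONTEXT_TOKENS, fertility)} words fit in context"
    )
```

Under these assumptions, a 4,096-token window holds about 4,096 English words but only 512 to 1,024 Hindi words.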
The Solution:
Sarvam extended Llama-2's tokenizer with a large Hindi vocabulary and trained the embedding layers to make sense of the new tokens. This reduced fertility substantially.
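A toy example can show why adding vocabulary lowers the token count. The sketch below is not Sarvam's or Llama-2's tokenizer: it is a hypothetical greedy longest-match tokenizer with a byte-level fallback, and the vocabularies and word list are invented for illustration. Its numbers will not match those of a real model.

```python
def tokenize(word: str, vocab: set[str]) -> list[str]:
    """Greedy longest-match tokenizer; unknown characters fall back to UTF-8 bytes."""
    tokens, i = [], 0
    while i < len(word):
        # Try the longest substring starting at i that is in the vocabulary.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            # No match: emit one token per byte of this character.
            tokens.extend(f"<0x{b:02X}>" for b in word[i].encode("utf-8"))
            i += 1
    return tokens


def fertility(words: list[str], vocab: set[str]) -> float:
    """Average number of tokens per word."""
    return sum(len(tokenize(w, vocab)) for w in words) / len(words)


hindi_words = ["नमस्ते", "भारत"]          # illustrative sample only
latin_vocab = {"hello", "india"}           # hypothetical Latin-only vocabulary
extended_vocab = latin_vocab | set(hindi_words)  # vocabulary after adding Hindi entries

print("before:", fertility(hindi_words, latin_vocab))
print("after: ", fertility(hindi_words, extended_vocab))
```

Extending the vocabulary is only half of the approach described above. The new token embeddings still have to be trained, which this sketch does not cover.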
(b) Phase 2: Sarvam-1 (2 Billion Parameters)
Sarvam-1 came out in October 2024. It is a 2-billion-parameter model trained from scratch on a corpus of 2 trillion tokens covering 10 Indic languages, including synthetic data.
| Language | Llama-3 Fertility (Tokens/Word) | Sarvam-1 Fertility (Tokens/Word) | Efficiency Gain |
| Hindi | ~4.8 | 1.6 | 3.0x |
| Tamil | ~7.2 | 1.9 | 3.8x |
| Bengali | ~5.1 | 1.7 | 3.0x |
| Telugu | ~6.5 | 1.8 | 3.6x |
Data Interpretation: Because cost and latency scale with token count, a 3x gain in tokenizer efficiency means roughly a 3x lower cost and a 3x higher speed. This makes voice bots economically viable. On Indic versions of benchmarks such as MMLU and ARC-Challenge, Sarvam-1 outperformed bigger models like Gemma-2-2B and Llama-3.2-3B.
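The "Efficiency Gain" column is simply the ratio of the two fertility values. The short check below recomputes it from the table. The dictionary and function names are illustrative, and the Llama-3 figures are the approximate values quoted above.

```python
# (Llama-3 fertility, Sarvam-1 fertility) in tokens per word, from the table above.
FERTILITY = {
    "Hindi": (4.8, 1.6),
    "Tamil": (7.2, 1.9),
    "Bengali": (5.1, 1.7),
    "Telugu": (6.5, 1.8),
}


def efficiency_gain(baseline: float, improved: float) -> float:
    """How many times fewer tokens the improved tokenizer needs per word."""
    return baseline / improved


gains = {
    language: round(efficiency_gain(llama, sarvam), 1)
    for language, (llama, sarvam) in FERTILITY.items()
}

for language, gain in gains.items():
    print(f"{language}: {gain}x fewer tokens per word")
```

The recomputed ratios agree with the table to one decimal place.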
(c) Phase 3: Sarvam-M (24 Billion Parameters)
Sarvam-M is a hybrid model with 24 billion parameters that came out in May 2025. It is built on Mistral Small, a model from the French company Mistral AI. (Source: Reddit)
This release was controversial. Critics called it a "wrapper" model. Sarvam defended the approach, arguing that "post-training" is a discipline in its own right. Building on Mistral let Sarvam focus its compute on adding domain-specific knowledge, such as Indian law and reasoning, rather than on basic grammar. The model supports "hybrid reasoning"; for example, it can extract data from Hindi financial reports.
(d) Phase 4: The 30B and 105B MoE Models
Sarvam introduced Sarvam-30B and Sarvam-105B at the India AI Impact Summit 2026. Both were trained from the ground up using a Mixture-of-Experts (MoE) architecture.
MoE Architecture Breakdown:
In a dense model, every parameter runs for every query, which is slow. In a MoE model, the network is divided into "experts," each with its own area of specialization, and a router selects the right expert for each token.
- Sarvam-30B has a total of 30 billion parameters, but only 1 billion are active for each generated token. It offers the knowledge of a 30B model at roughly the cost and latency of a 1B model. It is intended for always-on, high-volume voice agents.
- Sarvam-105B has a total of 105 billion parameters, of which 9 billion are active per token. It is designed for complex reasoning and planning.
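The routing idea can be sketched with a toy top-k router. This is a generic illustration of MoE routing, not Sarvam's architecture: the number of experts, the value of `k`, the router logits, and the "experts" (plain Python functions here) are all assumptions made for the example.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax over router logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_logits: list[float], k: int = 1) -> list[tuple[int, float]]:
    """Pick the top-k experts and renormalize their weights to sum to 1."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x: float, experts, router_logits: list[float], k: int = 1) -> float:
    """Run only the selected experts and mix their outputs by router weight."""
    return sum(weight * experts[i](x) for i, weight in route(router_logits, k))


def active_fraction(active_billions: float, total_billions: float) -> float:
    """Share of parameters that actually run for each token."""
    return active_billions / total_billions


# Four stand-in experts; a real expert is a feed-forward sub-network.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 1, lambda x: x * x]
logits = [0.1, 2.0, -1.0, 1.5]  # hypothetical router scores for one token

print(moe_forward(3.0, experts, logits, k=1))   # only expert 1 runs
print(f"Sarvam-30B active share:  {active_fraction(1, 30):.1%}")
print(f"Sarvam-105B active share: {active_fraction(9, 105):.1%}")
```

Using the parameter counts quoted above, only about 3% of Sarvam-30B and under 9% of Sarvam-105B would run per token, which is where the cost saving comes from.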
Global Benchmarks:
Sarvam benchmarked the 105B model against leading global models. The company reported that it outperformed DeepSeek R1, a 671B-parameter Chinese model, on some "cultural reasoning" and "Indic reasoning" tasks, and that it offered a better price-to-performance ratio than Google's Gemini Flash for tasks in Indian languages. (Source: Times of India)
(e) Sarvam Vision: The State Space Model (SSM)
India runs on paperwork, so computer vision is very important. Self-attention in standard Vision Transformers (ViTs) scales quadratically with the number of image patches: doubling the patch count roughly quadruples the attention cost.
The backbone of Sarvam Vision is a State-Space Model (SSM). SSMs scale linearly with sequence length, so Sarvam Vision can read very high-resolution pages without downsampling. This preserves fine details in complex scripts such as Malayalam.
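The difference between quadratic and linear scaling is easy to see numerically. The sketch below uses idealized cost models (n squared for self-attention, n for a state-space scan) and ignores constants and every other layer. The token counts are arbitrary examples, not Sarvam Vision figures.

```python
def attention_cost(n_tokens: int) -> int:
    """Idealized self-attention cost: every token attends to every token."""
    return n_tokens ** 2


def ssm_cost(n_tokens: int) -> int:
    """Idealized state-space scan cost: one pass over the sequence."""
    return n_tokens


def growth(cost_fn, n_tokens: int, factor: int = 2) -> float:
    """How much the cost grows when the token count is multiplied by `factor`."""
    return cost_fn(n_tokens * factor) / cost_fn(n_tokens)


for n in (1_024, 4_096, 16_384):
    print(
        f"{n:>6} tokens -> doubling costs "
        f"{growth(attention_cost, n):.0f}x (attention) vs {growth(ssm_cost, n):.0f}x (SSM)"
    )
```

Each doubling of the token count costs four times as much for attention but only twice as much for the linear model, and the gap compounds as pages get larger.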
olmOCR-Bench accuracy results:
- Sarvam Vision: 84.3%
- Gemini Pro: 81.2%
- ChatGPT: 62.4%
(4) The Full-Stack Product Ecosystem
Sarvam is more than just a model factory; it is also a vertical "Product Studio."
(a) Sarvam Studio
Sarvam Studio is a platform for producing entertainment and educational media. It removes the language barrier in media.
- Features: AI video translation and dubbing.
- Tech: Voice cloning preserves the original speaker's pitch, tone, and emotion.
- Results: In blind tests against ElevenLabs and YouTube Aloud, native speakers rated Sarvam higher for Dravidian languages. It consistently outperformed GPT-4 and Gemini Pro at translating difficult literary works.
(b) Sarvam Arya
Arya is a framework for "Agentic AI" orchestration. Standard frameworks like LangChain suffer from "Hallucination Sprawl," where small errors compound across steps.
- Method: Arya treats AI actions the way a database treats transactions.
- Ledgers: Record every change in state.
- Atomic Commits: A step is saved only if it passes validation. If it fails, the agent rolls back.
- Code: Agents are defined in the HashiCorp Configuration Language (HCL), which makes them stable and version-controlled for enterprise IT.
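The transaction idea in the list above can be sketched as follows. This is not Arya's API, and it uses Python rather than HCL. The class, method names, and validation checks are hypothetical, chosen only to show a ledger, a validated commit, and a rollback.

```python
import copy
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class TransactionalAgent:
    """Toy agent whose steps are committed only when validation passes."""

    state: dict = field(default_factory=dict)
    ledger: list = field(default_factory=list)

    def run_step(
        self,
        name: str,
        action: Callable[[dict], dict],
        check: Callable[[dict], bool],
    ) -> bool:
        # Work on a copy so a failed step cannot corrupt the committed state.
        candidate = action(copy.deepcopy(self.state))
        committed = check(candidate)
        # The ledger records every attempt, whether or not it was committed.
        self.ledger.append({"step": name, "committed": committed})
        if committed:
            self.state = candidate  # atomic commit
        return committed            # on failure the old state is kept (rollback)


agent = TransactionalAgent()
agent.run_step("set_amount", lambda s: {**s, "amount": 500}, lambda s: s["amount"] > 0)
agent.run_step("bad_amount", lambda s: {**s, "amount": -1}, lambda s: s["amount"] > 0)

print(agent.state)   # the invalid step left no trace in the state
print(agent.ledger)  # but both attempts are recorded
```

Because each step works on a copy, a hallucinated or invalid result is discarded before it can influence later steps, which is the property the "atomic commit" bullet describes.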
(c) Sarvam Akshar
Sarvam Vision powers Akshar, a B2B digitization tool.
- Workflow: Uses "Human-in-the-Loop." If the AI is less than 90% sure about a handwritten word, it flags the box for a human to check.
- Learning: It improves through Reinforcement Learning from Human Feedback (RLHF).
- Deployment: Currently used to digitize land records in Odisha.
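A human-in-the-loop workflow of this kind usually comes down to a confidence threshold. The sketch below assumes that words recognized with confidence below 90% are the ones sent for review. The function name, data format, and sample words are hypothetical and are not taken from Akshar.

```python
REVIEW_THRESHOLD = 0.90  # assumed cut-off, taken from the 90% figure in the text


def triage(words: list[tuple[str, float]], threshold: float = REVIEW_THRESHOLD):
    """Split OCR output into auto-accepted words and words needing human review."""
    accepted, needs_review = [], []
    for text, confidence in words:
        if confidence >= threshold:
            accepted.append(text)
        else:
            needs_review.append(text)
    return accepted, needs_review


# Hypothetical OCR output: (recognized text, model confidence)
ocr_output = [("plot", 0.99), ("no.", 0.97), ("4l7", 0.62), ("owner", 0.91)]

accepted, needs_review = triage(ocr_output)
print("accepted:", accepted)
print("for human review:", needs_review)
```

The corrections made by reviewers are what a feedback-based training loop would later learn from.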
(d) Sarvam Edge
Sarvam Edge models run offline on consumer devices.
- Hardware: Optimized for the Qualcomm Snapdragon 8 Gen 3 NPU.
- Speed: Real-Time Factor (RTF) of 0.12, meaning it processes speech roughly eight times faster than real time.
- Use Cases: Offline voice payments over UPI. Rural doctors can dictate patient notes without internet access, and the data stays private on their phones.
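Real-Time Factor is processing time divided by audio duration, so the speed-up is its reciprocal. The snippet below works through the 0.12 figure quoted above. The helper names are illustrative.

```python
def speedup(rtf: float) -> float:
    """How many times faster than real time a given Real-Time Factor is."""
    return 1.0 / rtf


def processing_seconds(audio_seconds: float, rtf: float) -> float:
    """Time needed to process a clip of the given length."""
    return audio_seconds * rtf


RTF = 0.12  # figure quoted for Sarvam Edge

print(f"speed-up: {speedup(RTF):.1f}x real time")
print(f"a 60 s clip takes about {processing_seconds(60, RTF):.1f} s")
```

An RTF of 0.12 works out to a little over eight times real time, so a one-minute clip takes roughly seven seconds.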
(e) Sarvam Kaze
Sarvam Kaze, a pair of smart glasses for blue-collar workers, is scheduled for release in May 2026.
- Function: A mechanic looks at an engine part. The glasses use Sarvam Vision to identify the part, display repair diagrams on the lens, and give instructions in the local language. This connects Sarvam directly with users instead of going through smartphones.
(5) Business Strategy and Market Adoption
(a) Pricing Strategy
| Service | Price | Strategy Note |
| Sarvam-M (Chat) API | Free (per token) | "Loss leader" to attract developers. |
| Speech-to-Text | ₹30 ($0.36) per hour | Undercuts Google Cloud/Azure. |
| Translation | ₹20 ($0.24) per 10k chars | Highly competitive for startups. |
| Enterprise Tiers | ₹50,000 ($600) / month | Designed for high-volume B2B use. |
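For a sense of scale, the listed per-unit prices can be turned into a rough monthly estimate. The sketch below uses only the rupee prices from the table. The usage volumes are invented, and the calculation ignores any free tier, volume discount, or tax.

```python
STT_INR_PER_HOUR = 30                 # Speech-to-Text price from the table
TRANSLATION_INR_PER_10K_CHARS = 20    # Translation price from the table
ENTERPRISE_TIER_INR_PER_MONTH = 50_000


def pay_as_you_go_inr(audio_hours: float, translated_chars: int) -> float:
    """Estimated monthly bill at the listed per-unit prices."""
    speech = audio_hours * STT_INR_PER_HOUR
    translation = translated_chars / 10_000 * TRANSLATION_INR_PER_10K_CHARS
    return speech + translation


# Hypothetical monthly usage for a small voice-bot deployment.
estimate = pay_as_you_go_inr(audio_hours=1_000, translated_chars=5_000_000)
print(f"pay-as-you-go estimate: ₹{estimate:,.0f}")
print("below enterprise tier price:", estimate < ENTERPRISE_TIER_INR_PER_MONTH)
```

At this invented volume the usage-based bill stays below the enterprise tier price. Heavier workloads would cross it.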
(b) Tata Capital: An Enterprise Case Study
Tata Capital needed to scale up its collections without hiring thousands of call-center agents. It deployed Sarvam Samvaad, an AI voice bot.
- The Problem: Customers spoke "Hinglish," and global models could not understand code-mixed sentences.
- The Solution: Sarvam's bot accurately identified what the customer wanted. It handles routine payment reminders and detects sentiment. If the person on the line is angry, it immediately transfers the call to a human agent.
(c) Government Partnerships: Federal AI
Sarvam follows a "Federal AI" model, partnering with state governments to build their own infrastructure.
- Odisha: Building a 50MW AI Data Center to host sovereign models. The state uses Sarvam Vision to monitor mining safety and provides learning-assistant bots for young people.
- Tamil Nadu: Building the Digital Sangam Sovereign AI Research Park. Developing "Vivasaya Nanban" (Farmer's Friend), a voice assistant for 8 million farming families that provides weather, soil, and market information in Tamil.
(6) Public Opinion, Disagreements, and Trust in the Community
The technical community is watching Sarvam closely. We looked at three major controversies.
(a) The "Wrapper" Allegations
When Sarvam-M first came out, Reddit users called Sarvam a "wrapper startup." They said that Sarvam had raised millions of dollars just to make small changes to the French Mistral model.
- Public Opinion: Nationalists were unhappy. They wanted a foundation model built entirely in India.
- Business Reality: Sarvam argued that an Indic-optimized fine-tune gives businesses more value for less money and time than building a weaker model from scratch. The from-scratch Sarvam-1 and the later MoE models helped restore confidence.
(b) Download Farming on Hugging Face
In late 2025, Reddit users noticed anomalies in Sarvam's Hugging Face download numbers.
- The Problem: Over the course of one weekend, Sarvam-M downloads went from 300 a day to 100,000 a day. Users accused Sarvam of bot-farming to inflate its popularity metrics.
- The Effect: This hurt trust in the open-source community and highlighted the pressure on startups to show investors that they are gaining traction.
(c) The Sovereignty Tension
Sovereign AI remains a contested idea.
- Nationalists: See Sarvam as a champion of independence, comparable to ISRO.
- Globalists: Say that Sovereign AI is a protectionist marketing gimmick. They argue that India should simply use the best models from around the world, like Llama 3.
- Sarvam's Position: The company does not want to compete with GPT-4 on general intelligence. It focuses on specific advantages: lower cost, better voice quality, and regional languages.
(7) Future Roadmap and Strategic Outlook (2026–2030)
Sarvam is moving from research startup to infrastructure provider. We reviewed its long-term plans.
(a) The "NPCI of AI"
Sarvam wants to be the foundational utility layer for AI in India, just as the NPCI built UPI for payments. It plans to integrate with ONDC (Open Network for Digital Commerce). A rural user will be able to speak to their phone in Bhojpuri and buy groceries from any store, with Sarvam providing the intelligence behind the transaction.
(b) Hardware Expansion
The release of Sarvam Kaze marks the beginning of a shift toward hardware.
- Sarvam Home: Smart speakers that are cheap and voice-first, made for rural dialects and meant to replace things like Alexa.
- Public Kiosks: AI kiosks that can talk to people and are set up in village councils (Panchayats) so that people can get government services directly.
(c) Financial Stability and the IPO Horizon
Sarvam's biggest risk is a high burn rate. Training huge 105B models costs tens of millions of dollars in compute and electricity. To survive, Sarvam needs to move beyond free APIs and win large enterprise contracts. Given how quickly its valuation has grown to $200 million, an IPO in India in 2028 or 2029 is likely. The company will probably position itself as a "Digital Infrastructure" stock.
Sarvam AI shows that a developing country can make its own AI layer. Sarvam built a strong moat by focusing on cost, connectivity, and the local language. If it works, Sarvam will show the way for digital sovereignty in the Global South.
Conclusion
Sarvam AI is a bold experiment in how technology shapes geopolitics. It is testing the idea that a developing country does not have to be a subordinate in the age of AI. By focusing squarely on India's specific problems (cost, connectivity, and language), Sarvam has carved out a defensible niche.
It faces serious technical challenges and skepticism about its originality, but its integration with India's digital public infrastructure gives it more staying power than pure-play startups. If Sarvam succeeds, it will not just make money; it will also show the "Global South" how to take back control of its digital space.
FREQUENTLY ASKED QUESTIONS (FAQs)
