Startup Vana Offers Payment for Personal Data to Train Artificial Intelligence

Startup Vana is developing a decentralized platform where users can provide their data to train AI models and receive cryptocurrency rewards. With a shortage of AI training data projected for the coming years, the initiative could become an important part of the solution.

G. Ostrov

May 20, 2025

The Approaching AI Data Crisis

According to research by the Epoch AI institute, the artificial intelligence industry may face a serious challenge between 2026 and 2032: a shortage of data for training new models. This could slow the development of AI technologies, which are becoming increasingly demanding in both the volume and the quality of training data.

As a promising solution to this problem, startup Vana has proposed a fundamentally new approach to data collection. The company is developing a decentralized network where ordinary users can provide their personal data for artificial intelligence training and receive compensation in the form of cryptocurrency tokens. The project has already attracted significant funding of $25 million from investors.

How the Vana Platform Works

The Vana platform integrates several dozen services through which users can share various types of data:

Posts from social networks
Browser history
Data from smart home devices
Biometric indicators from fitness trackers
Genome sequencing and other personal information

The company has already announced the development of its own AI model, Collective-1, which has 7 billion parameters. Its distinctive feature is that it is trained exclusively on platform user data. Training relies on the Flower system, which pools the computing power of participants' devices for distributed neural network training.
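To illustrate how training can be spread across participants' devices, here is a minimal sketch of federated averaging, a common technique in which each device trains on its own data and only the resulting model weights are combined. The article does not say which strategy Vana uses, and this is not the Flower API: the function names, the one-parameter model, and the toy datasets are all hypothetical.

```python
from typing import List, Tuple

# Each client's private dataset: (x, y) pairs that never leave the device.
Dataset = List[Tuple[float, float]]


def local_update(w: float, data: Dataset, lr: float = 0.01, epochs: int = 5) -> float:
    """Run a few gradient-descent steps for the model y = w * x on local data."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(updates: List[Tuple[float, int]]) -> float:
    """Combine client weights, weighting each by its number of samples."""
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


def train(clients: List[Dataset], rounds: int = 50) -> float:
    """Repeat: send the global weight out, train locally, average the results."""
    w = 0.0
    for _ in range(rounds):
        updates = [(local_update(w, data), len(data)) for data in clients]
        w = federated_average(updates)
    return w


# Hypothetical toy data: three devices, all drawn from y = 3x.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0)],
    [(0.5, 1.5), (1.5, 4.5), (2.5, 7.5)],
]
w = train(clients)
print(f"Learned weight: {w:.4f}")
```

The point of the design is that the coordinating server only ever sees weights and sample counts, not the underlying records.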

More than 1.3 million users have joined the Vana network so far, and the company plans to increase this number to 100 million participants. At that scale, the network could collect about 453 trillion tokens for training, far more than the volumes of data used to create current models. For comparison, Llama-3 was trained on 15 trillion tokens and Qwen3 on 36 trillion tokens.
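The figures above can be put in proportion with a few lines of arithmetic. The sketch below uses only the numbers quoted in the article; the assumption that every participant contributes the same amount of data is a simplification made here for illustration and is not a claim by Vana.

```python
# Figures quoted in the article.
CURRENT_USERS = 1_300_000
TARGET_USERS = 100_000_000
PROJECTED_TOKENS = 453 * 10**12
LLAMA3_TOKENS = 15 * 10**12
QWEN3_TOKENS = 36 * 10**12


def tokens_per_user(total_tokens: int, users: int) -> float:
    """Average contribution per user, assuming an even split (a simplification)."""
    return total_tokens / users


def scale_ratio(projected: int, reference: int) -> float:
    """How many times larger the projected corpus is than a reference corpus."""
    return projected / reference


per_user = tokens_per_user(PROJECTED_TOKENS, TARGET_USERS)
growth = TARGET_USERS / CURRENT_USERS

print(f"Implied tokens per user: {per_user:,.0f}")
print(f"User growth required: {growth:.1f}x")
print(f"Versus Llama-3: {scale_ratio(PROJECTED_TOKENS, LLAMA3_TOKENS):.1f}x")
print(f"Versus Qwen3: {scale_ratio(PROJECTED_TOKENS, QWEN3_TOKENS):.1f}x")
```

Under the even-split assumption, the target implies roughly 4.5 million tokens per participant and a corpus about 30 times the size of Llama-3's and about 12.6 times the size of Qwen3's.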

Democratization of Artificial Intelligence

The project's core philosophy is to democratize the creation and ownership of AI models. According to Vana co-founder Anna Kazlauskas: "You won't end up in a situation where one company controls an all-powerful AI model. Thanks to our approach, users will truly own the models. You get more advanced technology, and everyone benefits from it."

This model could address several problems of the modern AI industry at once:

Increase the availability of quality training data
Ensure fair compensation to individuals providing their data
Prevent monopolization of AI technologies by large corporations
Improve transparency and ethics in the data collection process

However, such an initiative also raises legitimate questions about confidentiality, data security, and potential risks. Many users are concerned about how reliably their personal data will be protected and what consequences its use for training AI models might have.

Nevertheless, as demand for AI training data grows and shortages are projected, such decentralized platforms may become an important link in the artificial intelligence ecosystem of the future.

Official website of the Vana project: https://www.vana.org
