Get high-quality datasets for your

We source, create, and label high-quality datasets in text, image, video, and audio.

DataHive popular datasets

Examples of the datasets we can provide:

E-Commerce Product Listings Dataset

Proprietary dataset of eCommerce listings from verified sellers, including metadata such as title, brand, description, pricing, and availability. Ideal for retail analytics, pricing models, and AI-powered product intelligence. A collection of Amazon product listings including metadata such as title, brand, description, pricing, availability, reviews, and more.
E-Commerce
Retail
Pricing
Availability
Product Data

E-Commerce Ratings and Reviews Dataset

Large-scale dataset of verified eCommerce ratings and customer reviews provided by online sellers. Enables sentiment analysis, product benchmarking, and brand perception modeling for AI and market research. Amazon Best Sellers Ratings and Reviews is a large-scale commercial dataset designed to give organizations deep visibility into what products people buy, how they review them, and how trends evolve in the consumer marketplace. It spans 1.5 million top-performing products and includes more than 100 million customer reviews.
E-Commerce
Ratings
Reviews
Consumer
Insights

European Languages Spoken Audio Dataset

Multilingual dataset featuring short scripted recordings from native speakers across France, Spain, Portugal, Germany, and Poland. Built for speech recognition, voice synthesis, and multilingual AI model training.
Audio
Speech
Voice
AI Training
Multilingual

European Languages Speech Transcription Dataset

Dataset of European speech recordings transcribed and validated by professional linguists. Optimized for ASR fine-tuning, accent detection, and multilingual voice model development across multiple European languages.
Audio
Speech
Transcription
Multilingual
AI Training

Global Video Dataset with Sentiment Annotations

Collection of over 1,000 hours of videos created by DataHive’s distributed contributors. Includes metadata, engagement metrics, and verified sentiment labels. Fully owned and IP-cleared for AI research and analysis.
Video
Multimedia
Sentiment
AI Training
Human-Created

Global Image and Photo Dataset

An extensive dataset of original photos and visual content created by DataHive’s distributed workforce. Includes metadata, contextual tags, and categories. Fully rights-owned and ready for computer vision and AI training.
Images
Photos
Visual Data
AI Training
Human-Created

Movies 2000–2024 Reviews Dataset

A curated entertainment dataset featuring movie reviews and metadata from 2000 to 2024, created and verified by DataHive’s distributed contributors. Fully rights-owned and cleared for AI and analytics use.
Movie Reviews
Entertainment
AI Training

Clients

Trusted by companies of all sizes, from startups to Fortune 500s

Investors

Alliance DAO
6MV
Side Door Ventures
Wave GP
Solana Ventures
Nural Capital
Race Capital
DCF God angel
Curved Ventures