DataHive popular datasets
Examples of the datasets we can provide:


E-Commerce Product Listings Dataset
Proprietary dataset of eCommerce listings from verified sellers, including metadata such as title, brand, description, pricing, and availability. Ideal for retail analytics, pricing models, and AI-powered product intelligence. A collection of Amazon product listings including metadata such as title, brand, description, pricing, availability, reviews, and more.
E-Commerce
Retail
Pricing
Availability
Product Data


E-Commerce Ratings and Reviews Dataset
Large-scale dataset of verified eCommerce ratings and customer reviews provided by online sellers. Enables sentiment analysis, product benchmarking, and brand perception modeling for AI and market research. Amazon Best Sellers Ratings and Reviews is a large-scale commercial dataset designed to give organizations deep visibility into what products people buy, how they review them, and how trends evolve in the consumer marketplace. It spans 1.5 million top-performing products and includes more than 100 million customer reviews.
E-Commerce
Ratings
Reviews
Consumer
Insights


European Languages Spoken Audio Dataset
Multilingual dataset featuring short scripted recordings from native speakers across France, Spain, Portugal, Germany, and Poland. Built for speech recognition, voice synthesis, and multilingual AI model training.
Audio
Speech
Voice
AI Training
Multilingual


European Languages Speech Transcription Dataset
Dataset of European speech recordings transcribed and validated by professional linguists. Optimized for ASR fine-tuning, accent detection, and multilingual voice model development across multiple European languages.
Audio
Speech
Transcription
Multilingual
AI Training


Global Video Dataset with Sentiment Annotations
Collection of over 1,000 hours of video created by DataHive’s distributed contributors. Includes metadata, engagement metrics, and verified sentiment labels. Fully owned and IP-cleared for AI research and analysis.
Video
Multimedia
Sentiment
AI Training
Human-Created


Global Image and Photo Dataset
An extensive dataset of original photos and visual content created by DataHive’s distributed workforce. Includes metadata, contextual tags, and categories. Fully rights-owned and ready for computer vision and AI training.
Images
Photos
Visual Data
AI Training
Human-Created


Movies 2000–2024 Reviews Dataset
A curated entertainment dataset featuring movie reviews and metadata from 2000 to 2024, created and verified by DataHive’s distributed contributors. Fully rights-owned and cleared for AI and analytics use.
Movie Reviews
Entertainment
AI Training
Clients
Trusted by companies of all sizes, from startups to Fortune 500s
Investors

Alliance DAO

6MV

Side Door Ventures

Wave GP

Solana Ventures

Nural Capital

Race Capital

DCF God (angel)

Curved Ventures