DBbun — Fuel for a Data-Hungry World

DBbun LLC creates unique, high-quality synthetic datasets for research, analytics, and machine learning. DBbun’s datasets are completely synthetic and generated using AI. DB stands for database, and bun stands for bundling many pieces of data together in one place. Each dataset is a carefully assembled mix of variables, statistics, and outcomes. Users can contact us for any inquiry, including requests to create custom datasets based on specific resources.
- Introducing DBbun: Synthetic Data for a Data-Hungry World
- From Story to Dataset: Turning Speculative Fiction Into Data
- From EMRBots (2015) to Clinically-Informed Synthetic Patient Populations (October 2025)
- How AI Turns a Neighborhood Into a 400-Year Story
- An AI-Created Vision of 2045 Health Sensing

Who Uses DBbun?

AI Companies: Organizations developing foundational or domain-specific AI models can benefit from novel datasets that do not exist anywhere else.
Educational Institutions & Instructors: Professors and trainers can use synthetic datasets for hands-on workshops. Students can safely practice machine learning, statistics, and prediction modeling.
Hackathons, Bootcamps, and Training Programs: Organizers can provide ready-to-use, realistic datasets for competitions and training exercises.
Startup Companies in Stealth or Early Growth: Startups need realistic datasets to test prototypes without privacy concerns. Synthetic data is also useful for showing traction to investors or validating product pipelines.
Consulting Firms & Independent Analysts: Consultants can run proof-of-concept analyses for clients without waiting for access to sensitive real-world data. Synthetic data helps them demonstrate methods, models, or dashboards.

Founder Background

DBbun was founded in September 2025 by Uri Kartoun, a data scientist, inventor, and PhD in Intelligent Systems with over 15 years of experience in real-world evidence, predictive modeling, and large-scale data solutions at Microsoft, IBM, and Harvard/Mass General Hospital. Uri is the author of 80+ patents and has developed pioneering methods for generating and analyzing complex datasets.

History & Inspiration

During his fellowship at Harvard/Mass General Hospital, Uri created EMRBots, a non-profit project that generated synthetic EMR-like data long before generative AI became popular.
EMRBots became widely used in teaching and research.
It inspired the development of a new type of neural network.
Its popularity and impact on the scientific community laid the groundwork for DBbun.

Contact

Users can contact us using the form below for any inquiry, including requests to create custom datasets based on specific resources. DBbun datasets span multiple domains, starting with healthcare but not limited to it.

Disclaimer

All generated data is based on public-domain sources only.
No real patient data is used.
DBbun datasets are intended for research, teaching, prototyping, and analytics.
They are not suitable for clinical decision support or direct patient care.
Users are solely responsible for how the data is applied.

Intellectual Property

DBbun LLC owns patent-pending technologies (utility patent applications) covering its unique approach to dataset generation and packaging.