Stage 1: free safety marketplace

Fixing data procurement for AI safety and alignment.

Data Canon is building the transparent marketplace for model training data, starting with safety. We consolidate alignment benchmarks, certify dataset quality, and make safety training data easier for labs, enterprises, and data vendors to find, compare, and use.

The goal is to remove the friction of variable pricing, tedious negotiations, and unknown quality, so the best data wins instead of the best marketing team.

Benchmark safety

Consolidate industry-leading AI safety and alignment benchmarks, and add our own views on honesty, intent alignment, moral reasoning, societal harm, and hazardous knowledge.

Prove dataset lift

Identify off-the-shelf datasets that improve model performance on safety benchmarks by training open-source models on them, then publish the measured lift.
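
As a rough illustration of what "measured lift" could mean in practice, the sketch below compares a model's benchmark scores before and after training on a dataset. It is an assumption for illustration, not Data Canon's actual methodology: the names `LiftResult` and `measure_lift` are hypothetical, the scores are made up, and it assumes higher scores are better.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LiftResult:
    """Score change on one benchmark (hypothetical structure)."""
    benchmark: str
    baseline: float  # score of the base model before training on the dataset
    trained: float   # score of the same model after training on the dataset

    @property
    def absolute_lift(self) -> float:
        # Assumes higher scores are better.
        return self.trained - self.baseline

    @property
    def relative_lift(self) -> float:
        if self.baseline == 0:
            raise ValueError("relative lift is undefined for a zero baseline")
        return self.absolute_lift / self.baseline


def measure_lift(baseline_scores: dict[str, float],
                 trained_scores: dict[str, float]) -> list[LiftResult]:
    """Pair up scores for benchmarks present in both runs, sorted by name."""
    shared = sorted(baseline_scores.keys() & trained_scores.keys())
    return [LiftResult(name, baseline_scores[name], trained_scores[name])
            for name in shared]


if __name__ == "__main__":
    # Illustrative numbers only, not real results.
    before = {"honesty": 0.5, "societal_harm": 0.8}
    after = {"honesty": 0.75, "societal_harm": 0.8}
    for result in measure_lift(before, after):
        print(result.benchmark, result.absolute_lift)
```

Reporting both the baseline and the trained score, rather than the lift alone, lets a buyer see where the model started as well as how far the dataset moved it.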

Host safety data

Create a marketplace where safety datasets are discoverable and available for purchase, with no transaction fee taken by Data Canon.

Demand transparency

Circulate a commitment to safety data transparency, asking labs and vendors to bring safety training data into the light, even when it is not free.