Data-labeling startup Scale AI is launching a new public benchmarking system to rank the best-performing artificial intelligence models for different regions, professions and age groups. The ranking system, called SEAL Showdown, builds on Scale's existing leaderboards and relies on votes from contributors in more than 100 countries to gauge the everyday usability of models.

"Participation is voluntary and unpaid, with contributors viewing it as a perk because they get free access to frontier models that typically cost hundreds of dollars a month," said Janie Gu, a product manager at Scale AI. Unlike other platforms that draw mostly tech enthusiasts, Scale's pool includes everyday users as well as physicians, lawyers and physicists, a diversity Gu said makes Showdown's results more representative.

The launch is part of a growing effort to address an ongoing issue in the AI sector. Tech firms are releasing new AI models at a rapid pace, but there is no industry consensus on how best to test these systems, or on who the best evaluators are. Several third-party services have attempted to fill that void, including LMArena, a popular platform for testing and ranking models.

For Scale, the release could help broaden its influence in the AI industry at a turbulent moment. Once viewed as a leader in the data-labeling market, Scale has lost notable customers, announced job cuts and faced growing competition in the months since it received a $14.3 billion investment from Meta Platforms Inc.
Scale describes Showdown as a “representative, reliable and human-centered” tool meant to help businesses choose the right model for specific markets, teams and tasks, rather than relying on a global average.
For more see the OODA Company Profile on Scale AI.