AI

NIST releases a tool for testing AI model risk

Comment

Futuristic digital blockchain background. Abstract connections technology and digital network. 3d illustration of the Big data and communications technology.
Image Credits: v_alex / Getty Images

The National Institute of Standards and Technology (NIST), the U.S. Commerce Department agency that develops and tests tech for the U.S. government, companies and the broader public, has re-released a testbed designed to measure how malicious attacks — particularly attacks that “poison” AI model training data — might degrade the performance of an AI system.

Called Dioptra (after the classical astronomical and surveying instrument), the modular, open source web-based tool, first released in 2022, seeks to help companies training AI models — and the people using these models — assess, analyze and track AI risks. Dioptra can be used to benchmark and research models, NIST says, as well as to provide a common platform for exposing models to simulated threats in a “red-teaming” environment.

“Testing the effects of adversarial attacks on machine learning models is one of the goals of Dioptra,” NIST wrote in a press release. “The open source software, like generating child available for free download, could help the community, including government agencies and small to medium-sized businesses, conduct evaluations to assess AI developers’ claims about their systems’ performance.”

NIST Dioptra
A screenshot of Diatropa’s interface.
Image Credits: NIST

Dioptra debuted alongside documents from NIST and NIST’s recently created AI Safety Institute that lay out ways to mitigate some of the dangers of AI, like how it can be abused to generate nonconsensual pornography. It follows the launch of the U.K. AI Safety Institute’s Inspect, a toolset similarly aimed at assessing the capabilities of models and overall model safety. The U.S. and U.K. have an ongoing partnership to jointly develop advanced AI model testing, announced at the U.K.’s AI Safety Summit in Bletchley Park in November of last year. 

Dioptra is also the product of President Joe Biden’s executive order (EO) on AI, which mandates (among other things) that NIST help with AI system testing. The EO, relatedly, also establishes standards for AI safety and security, including requirements for companies developing models (e.g. Apple) to notify the federal government and share results of all safety tests before they’re deployed to the public.

As we’ve written about before, AI benchmarks are hard — not least of which because the most sophisticated AI models today are black boxes whose infrastructure, training data and other key details are kept under wraps by the companies creating them. A report out this month from the Ada Lovelace Institute, a U.K.-based nonprofit research institute that studies AI, found that evaluations alone aren’t sufficient to determine the real-world safety of an AI model in part because current policies allow AI vendors to selectively choose which evaluations to conduct.

NIST doesn’t assert that Dioptra can completely de-risk models. But the agency does propose that Dioptra can shed light on which sorts of attacks might make an AI system perform less effectively and quantify this impact to performance.

In a major limitation, however, Dioptra only works out-of-the-box on models that can be downloaded and used locally, like Meta’s expanding Llama family. Models gated behind an API, such as OpenAI’s GPT-4o, are a no-go — at least for the time being.

More TechCrunch

The National Institute of Standards and Technology (NIST), the U.S. Commerce Department agency that develops and tests tech for the U.S. government, companies and the broader public, has re-released a…

NIST releases a tool for testing AI model risk
Image Credits: v_alex / Getty Images

Featured Article

Max Space reinvents expandable habitats with a 17th-century twist, launching in 2026

Max Space’s expandable habitats promise to be larger, stronger, and more versatile than anything like them ever launched, not to mention cheaper and lighter by far than a solid, machined structure.

Max Space reinvents expandable habitats with a 17th-century twist, launching in 2026

Payments giant Stripe has acquired a four-year-old competitor, Lemon Squeezy, the latter company announced Friday. Terms of the deal were not disclosed. As a merchant of record, Lemon Squeezy calculates…

Stripe acquires payment processing startup Lemon Squeezy

iCloud Private Relay has not been working for some Apple users across major markets, including the U.S., Europe, India and Japan.

Apple reports iCloud Private Relay global outages for some users

Welcome to Startups Weekly — your weekly recap of everything you can’t miss from the world of startups. To get Startups Weekly in your inbox every Friday, sign up here. This…

Legal tech, VC brawls and saying no to big offers

Apple joins 15 other tech companies — including Google, Meta, Microsoft and OpenAI — that committed to the White House’s rules for developing generative AI.

Apple signs the White House’s commitment to AI safety

The language is ambiguous, so it’s not clear whether X is helping itself to all user data for training Grok or whether this processing refers only to user interactions with…

Privacy watchdog says it’s ‘surprised’ by Elon Musk opting user data into Grok AI training

Sound Search on TikTok is somewhat similar to YouTube Music’s song detection tool that lets you find the name of a song by singing, humming or playing it. 

TikTok rolls out a new feature that lets you find songs by singing or humming them

Skip, a wearable tech startup that began as a secretive project inside Alphabet, exited stealth this week to announce a partnership with outdoor clothing specialist Arc’teryx. The deal is the…

Alphabet X spinoff partners with Arc’teryx to bring  ‘everyday’ exoskeleton to market

Ledger, a French startup mostly known for its secure crypto hardware wallets, has launched a new mid-range device, the Ledger Flex. Available now, priced at $249, the dinky hardware wallet…

Ledger launches Ledger Flex, a mid-range hardware crypto wallet

The good news is that you can switch off the new data-sharing setting and also delete your conversation history with the AI. 

Here’s how to disable X (Twitter) from using your data to train its Grok AI

Regulators gave SpaceX the all-clear to return to launch two weeks after the Falcon 9 rocket experienced an anomaly on orbit.

SpaceX cleared to resume Falcon 9 launches while FAA investigation remains open

Madison Long and Simone May founded Clutch in 2020 to help connect people to businesses looking for marketing and content creation.

Digital marketing startup Plaiced has acquired Precursor Ventures-backed Clutch

With the CrowdStrike update continuing to cause havoc across the planet, a startup has raised $13.5 million to at least improve some level of security for the kinds of devices…

ZeroTier raises $13.5M to help avert CrowdStrike-like network problems

Apple has reduced prices of its iPhone models in India by 3-4% following a cut in import duties in the South Asian market.

Apple cuts iPhone price in India amid China slowdown

MNT-Halan, a fintech unicorn out of Egypt, is on a consolidation march. The microfinance and payments startup has raised $157.5 million in funding and is using the money in part…

Egypt’s MNT-Halan banks $157.5M, gobbles up a fintech in Turkey to expand

The energy transition is a marathon, not a sprint. But opportunities for acceleration are growing. Swedish startup Greenely* has just spotted one. It’s closing an €8 million Series A funding…

Energy tech startup Greenely grabs €8M to reach more households and support Europe’s energy transition

The Floorr offers tools for conducting sales, hosting tailored styling sessions, creating mood boards, and engaging in text or voice chats with clients, all in one place. 

Luxury fashion startup The Floorr empowers personal stylists with tools to grow their businesses

A decade-old drama involving VC David Sacks and Rippling founder Parker Conrad has blown up on X with many among the Silicon Valley elite taking sides.

Here’s why David Sacks, Paul Graham and other big Silicon Valley names had a brawl on X over VC behavior

ChatGPT, OpenAI’s text-generating AI chatbot, has taken the world by storm since its launch in November 2022. What started as a tool to hyper-charge productivity through writing essays and code…

ChatGPT: Everything you need to know about the AI-powered chatbot

Autonomous vehicle software startup Applied Intuition has closed a $300 million secondary sale just four months after raising a $250 million Series E round, yet another sign of how white-hot…

Applied Intuition closes $300M secondary four months after raising $250M

OpenAI may have designs to get into the search game — challenging not only upstarts like Perplexity, but Google and Bing, too. The company on Thursday unveiled SearchGPT, a search…

With Google in its sights, OpenAI unveils SearchGPT

The California Supreme Court ruled Thursday that Proposition 22 — the ballot measure that passed in November 2020 and classified app-based gig workers as independent contractors rather than employees —…

Uber, Lyft, DoorDash can continue to classify drivers as contractors in California

WhatsApp has recently ramped up its marketing push in the U.S.

Mark Zuckerberg says WhatsApp has 100M monthly active users in the US

Welcome back to TechCrunch Mobility — your central hub for news and insights on the future of transportation. Sign up here for free — just click TechCrunch Mobility! I don’t…

Alphabet pours $5B into Waymo, Cruise scraps the Origin and Elon’s bet on autonomy

In addition to insured commitments, Archera provides consulting services to help build purchasing strategies for customers to optimize their cloud usage.

Archera helps customers access deep cloud discounts

In its bid to maintain pace with generative AI rivals like Anthropic and OpenAI, Google is rolling out updates to the no-fee tier of Gemini, its AI-powered chatbot. The updates…

Google makes its Gemini chatbot faster and more widely available

Until a year ago, Arjun Pillai had the comfortable yet important role of chief data officer at ZoomInfo, a B2B database company. But the serial entrepreneur was getting antsy. He…

ZoomInfo alum raises $15M for startup that builds AI sales engineers

Substack is rolling out the ability for writers to draft and publish new posts directly from their phone via its iOS app, the company announced on Thursday. Until now, users…

Substack writers can now draft and publish posts in iOS app

Disrupt 2024 is the premier event where tech careers are launched, connections are forged, and the future of technology talent takes center stage. The Disrupt Career Fair is the perfect…

Disrupt 2024 Career Fair: Your gateway to top tech talent