AI

OpenAI’s DevDay brings Realtime API and other treats for AI app developers

Comment

SAN FRANCISCO, CALIFORNIA - NOVEMBER 06: OpenAI CEO Sam Altman smiles during the OpenAI DevDay event on November 06, 2023 in San Francisco, California. Altman delivered the keynote address at the first-ever Open AI DevDay conference.(Photo by Justin Sullivan/Getty Images)
Image Credits: Justin Sullivan / Getty Images

It’s been a tumultuous week for OpenAI, full of executive departures and major fundraising developments, but the startup is back at it, trying to convince developers to build tools with its AI models at its 2024 DevDay. The company announced several new tools Tuesday, including a public beta of its “Realtime API”, for building apps with low-latency, AI-generated voice responses. It’s not quite ChatGPT’s Advanced Voice Mode, but it’s close.

In a briefing with reporters ahead of the event, OpenAI chief product officer Kevin Weil said the recent departures of chief technology officer Mira Murati and chief research officer Bob McGrew would not affect the company’s progress.

“I’ll start with saying Bob and Mira have been awesome leaders. I’ve learned a lot from them, and they are a huge part of getting us to where we are today,” said Weil. “And also, we’re not going to slow down.”

As OpenAI undergoes yet another C-suite overhaul – a reminder of the turmoil following last year’s DevDay – the company is trying to convince developers that it still offers the best platform to build AI apps on. Leaders say the startup has more than 3 million developers building with its AI models, but OpenAI is operating in an increasingly competitive space.

OpenAI noted it had cut costs for developers to access its API by 99% in the last two years, though it was likely forced to by competitors such as Meta and Google continuously undercutting their prices.

One of OpenAI’s new features, dubbed the Realtime API, will give developers the chance to build nearly real-time, speech-to-speech experiences in their apps, with the choice of using six voices provided by OpenAI. These voices are distinct from those offered for ChatGPT, and developers can’t use third party voices, in order to prevent copyright issues. (The voice ambiguously based on Scarlett Johansson’s is not available anywhere.)

During the briefing, OpenAI’s head of developer experience, Romain Huet, shared a demo of a trip planning app built with the Realtime API. The application allowed users to verbally speak with an AI assistant about an upcoming trip to London, and get low-latency responses. The Realtime API also has access to a number of tools, so the app was able to annotate a map with restaurant locations as it answered.

At another point, Huet showed how the Realtime API could speak on the phone with a human to inquire about ordering food for an event. Unlike Google’s infamous Duo, OpenAI’s API can’t call restaurants or shops directly; however, it can integrate with calling APIs like Twilio to do so. Notably, OpenAI is not adding disclosures so that its AI models automatically identify themselves on calls like this, despite the fact that these AI-generated voices sounds quite realistic. For now, it seems to be the developers’ responsibility to add this disclosure, something that could be required by a new California law.

As part of its DevDay announcements, OpenAI also introduced vision fine-tuning in its API, which will let developers use images, as well as text, to fine-tune their applications of GPT-4o. This should, in theory, help developers improve the performance of GPT-4o for tasks involving visual understanding. OpenAI’s head of product API, Olivier Godement, tells TechCrunch that developers will not be able to upload copyrighted imagery (such as a picture of Donald Duck), images that depict violence, or other imagery that violates OpenAI’s safety policies.

OpenAI is racing to match what its competitors in the AI model licensing space already offer. Its prompt caching feature is similar to the feature Anthropic launched several months agoallowing developers to cache frequently used context between API calls, reducing costs and improve latency. OpenAI says developers can save 50% using this feature, whereas Anthropic promises a 90% discount for it.

Lastly, OpenAI is offering a model distillation feature to let developers use larger AI models, such as o1-preview and GPT-4o, to fine-tune smaller models such as GPT-4o mini. Running smaller models generally provides cost savings compare to running larger ones, but this feature should let developers improve the performance of those small AI models. As part of model distillation, OpenAI is launching a beta evaluation tool so developers can measure their fine-tune’s performance within OpenAI’s API.

DevDay may make bigger waves for what it didn’t announce – for instance, there wasn’t any news on the GPT Store announced during last year’s DevDay. Last we’ve heard, OpenAI has been piloting a revenue share program with some of the most popular creators of GPTs, but the company hasn’t announced much since then.

Also, OpenAI says it’s not releasing any new AI models during DevDay this year. Developers waiting for OpenAI o1 (not the preview or mini version) or the startup’s video generation model, Sora, will have to wait a little longer.

More TechCrunch

It’s been a tumultuous week for OpenAI, full of executive departures and major fundraising developments, but the startup is back at it, trying to convince developers to build tools with…

OpenAI’s DevDay brings Realtime API and other treats for AI app developers
Image Credits: Justin Sullivan / Getty Images

The new capability, rolling out Tuesday, will help advertisers who want to enhance their Pinterest Product Pins (ads) and attract more clicks, according to Pinterest.

Pinterest rolls out genAI tools for product imagery to advertisers

Monorepos are becoming an increasingly popular way to manage source code, but they require a slightly different toolset. Google developed its own internal build and test tool on top of…

Aspect Build gets $3.85M to help developers create software with Bazel

Sometimes, a demo is all you need to understand a product. And that’s the case with Runware. If you head over to Runware’s website, enter a prompt and hit enter…

Runware uses custom hardware and advanced orchestration for fast AI inference

Where most startups aim to recreate the superheated, super-pressurized conditions inside of a star, Acceleron takes a different approach.

Acceleron Fusion has raised $15M to take another stab at cold fusion, filing reveals

Microsoft was ahead of the game in the world of enterprise AR.

Microsoft HoloLens 2 discontinued with no successor in sight

Get ready for TechCrunch Disrupt 2024, our signature event for startups of all stages, taking place at Moscone West in San Francisco from October 28-30. This year, we’re expecting a…

The complete agenda for the Disrupt Stage at TechCrunch Disrupt 2024

Last year, Sound Ventures, the 9-year-old, Beverly Hills, California-based venture firm led by general partners Ashton Kutcher, Guy Oseary, and Effie Epstein, announced a new $265 million AI fund that…

Ashton Kutcher, Effie Epstein, and Guy Oseary are coming to TechCrunch Disrupt 2024

Numa, a startup developing AI-powered automation tech for car dealerships, has raised fresh capital in a Series B round.

Numa raises $32M to bring AI and automation to car dealerships

Featured Article

How the FBI and Mandiant caught a ‘serial hacker’ who tried to fake his own death

Jesse Kipf was a prolific hacker who sold access to systems he hacked, had contacts with a notorious cybercrime gang, and tried to use his hacking skills to get off the grid for good.

How the FBI and Mandiant caught a ‘serial hacker’ who tried to fake his own death

Ford is slashing both the monthly and annual cost of its hands-free driver-assistance feature, BlueCruise, for new and existing owners in response to “customer and dealer” feedback, the company tells…

Ford cuts price of BlueCruise hands-free driving feature

Drones and sidewalk delivery robots promise to make last-mile delivery cheaper and more efficient, but they both have their limitations. Drones have trouble touching down in dense urban areas, and…

Serve Robotics and Wing to trial robot-to-drone delivery in Dallas

People participating on the open social web have a problem: it’s not yet possible to reach users on multiple sites like Bluesky, Mastodon, and Threads with a single post. While…

Croissant debuts a cross-posting app for Threads, Bluesky, and Mastodon

Microsoft has given its Copilot assistant on Windows a makeover — and a voice. Copilot can now read your screen, speak aloud, and more.

Microsoft Copilot can now read your screen, think deeply, and speak aloud to you

Microsoft has broadly launched Bing Generative Search, its answer to Google’s AI Overviews and other AI-powered search apps.

Microsoft brings AI-powered overviews to Bing

Microsoft is paying publishers for content as part of a new Copilot feature, Copilot Daily, that gives a spoken summary of current events.

Microsoft starts paying publishers for content surfaced by Copilot

Evil Corp maintains a “privileged” relationship with the Kremlin, and was often tasked with launching cyberattacks on behalf of Russia. 

UK unmasks LockBit ransomware affiliate as high-ranking hacker in Russia state-backed cybercrime gang

E-commerce giant eBay, facing stiff competition from newer rivals, has removed final-value sales fees for all items excluding cars sold domestically in the U.K. This mirrors a similar move the…

eBay removes UK seller fees to counter new wave of marketplace startups

Google is announcing new Chromebook models today with Samsung and Lenovo. With Samsung’s Galaxy Chromebook Plus model in particular, the company is also introducing a new multifunctional quick insert key.…

Google adds a multi-functional quick insert key and new AI features to Chromebook Plus

Anduril sued defense tech startup Salient Motion. It still raised $12 million with participation from Anduril investor a16z.

Palmer Luckey tried to crush aeronautics startup Salient Motion. But Anduril backer a16z invested.

The company laid out a plan it hopes will go a long way toward reversing fortunes and repairing relationships.

Sonos outlines turnaround plan following app disaster

A team of founders who sold their last company to Amazon to build a new unit within AWS is setting out to reinvent the tricky business of backing up organizations’…

Eon emerges from stealth with $127M to bring a fresh approach to backing up cloud infrastructure

Air Doctor’s platform helps travelers find doctors in other countries, and it has now raised $20 million in a Series B round after seeing strong traction. 

Air Doctor raises $20M to plug a gap in how people find doctors when they’re traveling

Featured Article

Sequoia backs Pydantic to expand beyond its open source data-validation framework

Sequoia is investing $12.5M in UK startup Pydantic to help it expand beyond its open source data-validation framework.

Sequoia backs Pydantic to expand beyond its open source data-validation framework

Invesco has raised the value of its stake in Swiggy, ascribing an implied valuation of about $13.3 billion to the Indian food delivery and quick-commerce startup.

Invesco raises its valuation of Swiggy to $13.3B

The world of WordPress, one of the most popular technologies for creating and hosting websites, is going through a very heated controversy. The core issue is the fight between WordPress…

The WordPress vs. WP Engine drama, explained

Anduril is expanding even further into the “ultimate high ground.”  The company, which is best known for AI-powered defense products that span air, land and sea, is partnering with satellite…

Anduril speeds up launch of defense payloads by buying Apex satellite buses off the shelf

With this merger, Dott and Tier didn’t want to build a conglomerate of micromobility services; the operation was all about scale.

Tier becomes Dott following the merger of the two micromobility companies

Meta’s AI-powered Ray-Bans have a discreet camera on the front, for taking photos not just when you ask them to, but also when their AI features trigger it with certain…

Meta won’t say whether it trains AI on smart glasses photos

A Y Combinator startup named PearAI launched with a tweet thread and YouTube video on Saturday and caused an immediate backlash.

Y Combinator is being criticized after it backed an AI startup that admits it basically cloned another AI startup