
Runware uses custom hardware and advanced orchestration for fast AI inference


Image Credits: Runware

Sometimes, a demo is all you need to understand a product. And that’s the case with Runware. If you head over to Runware’s website, enter a prompt and hit enter to generate an image, you’ll be surprised by how quickly Runware generates the image for you — it takes less than a second.

Runware is a newcomer in the AI inference, or generative AI, startup landscape. The company is building its own servers and optimizing the software layer on those servers to remove bottlenecks and improve inference speeds for image generation models. The startup has already secured $3 million in funding from Andreessen Horowitz’s Speedrun, LakeStar’s Halo II and Lunar Ventures.

The company doesn’t want to reinvent the wheel. It just wants to make it spin faster. Behind the scenes, Runware manufactures its own servers with as many GPUs as possible on the same motherboard. It has its own custom-made cooling system and manages its own data centers.

When it comes to running AI models on its servers, Runware has optimized the orchestration layer and tuned the BIOS and operating system to improve cold start times. It has developed its own algorithms that allocate inference workloads.

The demo is impressive by itself. Now, the company wants to turn all this research and development work into a business.

Unlike many GPU hosting companies, Runware isn’t going to rent its GPUs based on GPU time. Instead, it believes companies should be encouraged to speed up workloads. That’s why Runware is offering an image generation API with a traditional cost-per-API-call fee structure. It’s based on popular AI models such as Flux and Stable Diffusion.

“If you look at Together AI, Replicate, Hugging Face, all of them, they are selling compute based on GPU time,” co-founder and CEO Flaviu Radulescu told TechCrunch. “If you compare the amount of time it takes for us to make an image versus them, and then you compare the pricing, you will see that we are so much cheaper, so much faster.”

“It’s going to be impossible for them to match this performance,” he added. “Especially in a cloud provider, you have to run on a virtualized environment, which adds additional delays.”
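The difference between the two billing models can be made concrete with a small calculation. The sketch below is illustrative only: the hourly GPU price, the per-call fee and the generation times are made-up placeholders, not figures from Runware or any of the companies named above. It shows that under GPU-time billing the customer's cost per image grows with generation time, while a flat per-call fee stays constant, so the provider is the one who benefits from making inference faster.

```python
# Illustrative comparison of GPU-time billing and flat per-call billing.
# All prices and timings below are hypothetical placeholders.

def cost_per_image_gpu_time(seconds_per_image: float, gpu_price_per_hour: float) -> float:
    """Cost of one image when the customer pays for the GPU time it consumes."""
    return seconds_per_image * gpu_price_per_hour / 3600.0


def breakeven_seconds(price_per_call: float, gpu_price_per_hour: float) -> float:
    """Generation time at which GPU-time billing costs the same as the flat fee."""
    return price_per_call * 3600.0 / gpu_price_per_hour


GPU_PRICE_PER_HOUR = 3.60   # hypothetical dollars per GPU-hour
PRICE_PER_CALL = 0.002      # hypothetical flat fee per generated image

for seconds in (0.5, 2.0, 8.0):
    gpu_time_cost = cost_per_image_gpu_time(seconds, GPU_PRICE_PER_HOUR)
    print(f"{seconds:>4.1f}s per image: GPU-time ${gpu_time_cost:.4f} vs per-call ${PRICE_PER_CALL:.4f}")

print(f"break-even at {breakeven_seconds(PRICE_PER_CALL, GPU_PRICE_PER_HOUR):.1f}s per image")
```

With these placeholder numbers, any image that takes longer than the break-even time costs more under GPU-time billing than under the flat fee.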

As Runware is looking at the entire inference pipeline, and optimizing hardware and software, the company hopes that it will be able to use GPUs from multiple vendors in the near future. This has been an important endeavor for several startups as Nvidia is the clear leader in the GPU space, which means that Nvidia GPUs tend to be quite expensive.

“Right now, we use just Nvidia GPUs. But this should be an abstraction of the software layer,” Radulescu said. “We can switch a model from GPU memory in and out very, very fast, which allows us to put multiple customers on the same GPUs.

“So we are not like our competitors. They just load a model into the GPU and then the GPU does a very specific type of task. In our case, we’ve developed this software solution, which allows us to switch a model in the GPU memory as we do inference.”
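Runware has not published how its model switching works, so the following is only a rough sketch of the general idea Radulescu describes: keeping several models resident in a fixed amount of GPU memory and swapping them as requests arrive. It uses a least-recently-used eviction policy, which is an assumption made for illustration. The `ModelCache` class, the `fake_loader` function and the model sizes are all hypothetical, and no real GPU is involved.

```python
from collections import OrderedDict
from typing import Callable, Tuple


class ModelCache:
    """Toy LRU cache that tracks which models fit in a fixed memory budget."""

    def __init__(self, capacity_gb: float, loader: Callable[[str], Tuple[object, float]]):
        self.capacity_gb = capacity_gb
        self.loader = loader                 # model_id -> (model, size_gb)
        self._resident = OrderedDict()       # model_id -> (model, size_gb), oldest first
        self.used_gb = 0.0

    def get(self, model_id: str) -> object:
        """Return the model, loading it and evicting older models if needed."""
        if model_id in self._resident:
            self._resident.move_to_end(model_id)   # mark as most recently used
            return self._resident[model_id][0]

        model, size_gb = self.loader(model_id)
        if size_gb > self.capacity_gb:
            raise ValueError(f"{model_id} does not fit in {self.capacity_gb} GB")

        # Evict least recently used models until the new one fits.
        while self.used_gb + size_gb > self.capacity_gb:
            _, (_, evicted_size) = self._resident.popitem(last=False)
            self.used_gb -= evicted_size

        self._resident[model_id] = (model, size_gb)
        self.used_gb += size_gb
        return model

    def resident_ids(self) -> list:
        return list(self._resident)


# Hypothetical model sizes in GB, for demonstration only.
FAKE_SIZES = {"model-a": 24, "model-b": 7, "model-c": 4}


def fake_loader(model_id: str) -> Tuple[object, float]:
    """Stand-in for loading weights into GPU memory."""
    return f"<weights for {model_id}>", FAKE_SIZES[model_id]


cache = ModelCache(capacity_gb=32, loader=fake_loader)
for request in ("model-a", "model-b", "model-c"):
    cache.get(request)
print(cache.resident_ids(), cache.used_gb)
```

In this toy run the third request does not fit alongside the first two, so the least recently used model is evicted to make room. A production system would also have to account for how long each swap takes, which is where the speed Radulescu mentions matters.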

If AMD and other GPU vendors can create compatibility layers that work with typical AI workloads, Runware is well positioned to build a hybrid cloud that would rely on GPUs from multiple vendors. And that will certainly help if it wants to remain cheaper than competitors at AI inference.
