Meta’s Llama AI models get multimodal

people walking past Meta signage
Image Credits: TOBIAS SCHWARZ/AFP / Getty Images

Benjamin Franklin once wrote that nothing is certain except death and taxes. Let me amend that phrase to reflect the current AI gold rush: Nothing is certain except death, taxes, and new AI models, with the last of those three arriving at an ever-accelerating pace.

Earlier this week, Google released upgraded Gemini models, and, earlier in the month, OpenAI unveiled its o1 model. But on Wednesday, it was Meta’s turn to trot out its latest at the company’s annual Meta Connect 2024 developer conference in Menlo Park.

Llama’s multimodality

Meta’s multilingual Llama family of models has reached version 3.2, with the bump from 3.1 signifying that several Llama models are now multimodal. Llama 3.2 11B — a compact model — and 90B, which is a larger, more capable model, can interpret charts and graphs, caption images, and pinpoint objects in pictures given a simple description.

Given a map of a park, for example, Llama 3.2 11B and 90B might be able to answer questions like, “When will the terrain become steeper?” and “What’s the distance of this path?” Or, provided a graph showing a company’s revenue over the course of a year, the models could quickly spotlight the best-performing months of the bunch.

For developers who wish to use the models strictly for text applications, Meta says that Llama 3.2 11B and 90B were designed to be “drop-in” replacements for 3.1. Both 11B and 90B can be deployed with or without a new safety tool, Llama Guard Vision, that’s designed to detect potentially harmful (i.e., biased or toxic) text and images fed to or generated by the models.

In most of the world, the multimodal Llama models can be downloaded from and used across a wide range of cloud platforms, including Hugging Face, Microsoft Azure, Google Cloud, and AWS. Meta’s also hosting them on the official Llama site, Llama.com, and using them to power its AI assistant, Meta AI, across WhatsApp, Instagram, and Facebook.

Meta Llama 3.2
Image Credits: Meta

But Llama 3.2 11B and 90B can’t be accessed in Europe. As a result, several Meta AI features available elsewhere, like image analysis, are disabled for European users. Meta once again blamed the “unpredictable” nature of the bloc’s regulatory environment.

Meta has expressed concerns about — and spurned a voluntary safety pledge related to — the AI Act, the EU law that establishes a legal and regulatory framework for AI. Among other requirements, the AI Act mandates that companies developing AI in the EU commit to charting whether their models are likely to be deployed in “high-risk” situations, like policing. Meta fears that the “open” nature of its models, which give it little insight into how the models are being used, could make it challenging to adhere to the AI Act’s rules.

Also at issue for Meta are provisions in the GDPR, the EU’s broad privacy law, pertaining to AI training. Meta trains models on the public data of Instagram and Facebook users who haven’t opted out — data that in Europe is subject to GDPR guarantees. EU regulators earlier this year requested that Meta halt training on European user data while they assessed the company’s GDPR compliance.

Meta relented, while at the same time endorsing an open letter calling for “a modern interpretation” of GDPR that doesn’t “reject progress.”

Earlier this month, Meta said that it would resume training on U.K. user data after “[incorporating] regulatory feedback” into a revised opt-out process. But the company has yet to share an update on its training in the EU.

More compact models

Other new Llama models — models that weren’t trained on European user data — are launching in Europe (and globally) Wednesday.

Llama 3.2 1B and 3B, two lightweight, text-only models designed to run on smartphones and other edge devices, can be applied to tasks such as summarizing and rewriting paragraphs (e.g. in an email). Optimized for Arm hardware from Qualcomm and MediaTek, 1B and 3B can also tap tools such as calendar apps with a bit of configuration, Meta says, allowing them to take actions autonomously.

There isn’t a follow-up, multimodal or not, to the flagship Llama 3.1 405B model released in July. Given 405B’s massive size (it took months to train), it’s likely a matter of constrained compute resources. We’ve asked Meta if there are other factors at play and will update this story if we hear back.

Meta’s new Llama Stack, a suite of Llama-focused dev tools, can be used to fine-tune all the Llama 3.2 models: 1B, 3B, 11B, and 90B. Regardless of how they’re customized, the models can process up to around 100,000 words at once, Meta says.

A play for mindshare

Meta CEO Mark Zuckerberg often talks about ensuring all people have access to the “benefits and opportunities” of AI. Implicit in this rhetoric, however, is a desire that these tools and models be of Meta’s making.

Spending on models that it can then commoditize forces the competition (e.g. OpenAI, Anthropic) to lower prices, spreads Meta’s version of AI broadly, and lets Meta incorporate improvements from the open source community. Meta claims that its Llama models have been downloaded over 350 million times and are in use by large enterprises including Zoom, AT&T, and Goldman Sachs.

For many of these developers and companies, it’s immaterial that the Llama models aren’t “open” in the strictest sense. Meta’s license constrains how certain devs can use them; platforms with over 700 million monthly users must request a special license from Meta that the company will grant at its discretion.

Granted, there aren’t many platforms of that size without their own in-house models. But Meta isn’t being especially transparent about the process. When I asked the company this month whether it had approved a discretionary Llama license for a platform yet, a spokesperson told me that Meta “didn’t have anything to share on the topic.”

Make no mistake, Meta’s playing for keeps. It’s spending millions lobbying regulators to come around to its preferred flavor of “open” AI, and it’s ploughing billions into servers, datacenters, and network infrastructure to train future models.

None of the Llama 3.2 models solves the overriding problems with today’s AI, like its tendency to make things up and regurgitate problematic training data (e.g. copyrighted ebooks that might’ve been used without permission, the subject of a class action lawsuit against Meta). But, as I’ve written before, they do advance one of Meta’s key goals: becoming synonymous with AI, and in particular generative AI.
