Meta Llama: Everything you need to know about the open generative AI model

Llama illustration
Image Credits: Larysa Amosova via Getty

Like every big tech company these days, Meta has its own flagship generative AI model, called Llama. Llama is unusual among major models in that it’s “open,” meaning developers can download and use it however they please (with certain limitations). That’s in contrast to models like Anthropic’s Claude, OpenAI’s GPT-4o (which powers ChatGPT) and Google’s Gemini, which can only be accessed via APIs.

In the interest of giving developers choice, however, Meta has also partnered with vendors including AWS, Google Cloud and Microsoft Azure to make cloud-hosted versions of Llama available. In addition, the company has released tools designed to make it easier to fine-tune and customize the model.

Here’s everything you need to know about Llama, from its capabilities and editions to where you can use it. We’ll keep this post updated as Meta releases upgrades and introduces new dev tools to support the model’s use.

What is Llama?

Llama is a family of models — not just one:

  • Llama 8B
  • Llama 70B
  • Llama 405B

The latest versions are Llama 3.1 8B, Llama 3.1 70B and Llama 3.1 405B, all released in July 2024. They’re trained on web pages in a variety of languages, public code and files on the web, as well as synthetic data (i.e. data generated by other AI models).

Llama 3.1 8B and Llama 3.1 70B are compact models meant to run on devices ranging from laptops to servers. Llama 3.1 405B, on the other hand, is a large-scale model requiring (absent some modifications) data center hardware. Llama 3.1 8B and Llama 3.1 70B are less capable than Llama 3.1 405B, but faster. In fact, they’re “distilled” versions of 405B, optimized for low storage overhead and latency.
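
To get a feel for why the largest model calls for data center hardware, here is a rough back-of-the-envelope sketch in Python. It assumes the nominal parameter counts implied by the model names and 16-bit weights (2 bytes per parameter). Neither figure comes from Meta's documentation as quoted here, and the estimate ignores everything besides the weights themselves, such as activations and cached context.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Assumption (not from the article): 16-bit weights, i.e. 2 bytes each.
BYTES_PER_PARAM_16BIT = 2

# Nominal parameter counts implied by the model names (an approximation).
MODEL_PARAMS = {
    "Llama 3.1 8B": 8e9,
    "Llama 3.1 70B": 70e9,
    "Llama 3.1 405B": 405e9,
}


def weight_memory_gb(num_params: float,
                     bytes_per_param: int = BYTES_PER_PARAM_16BIT) -> float:
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for name, params in MODEL_PARAMS.items():
        print(f"{name}: ~{weight_memory_gb(params):.0f} GB of weights at 16-bit")
```

Under those assumptions, the 8B model's weights come to about 16 GB, the 70B model's to about 140 GB and the 405B model's to about 810 GB, which is more than any single consumer device holds. Storing weights at lower precision shrinks these numbers, which is one example of the kind of modification that can change the hardware picture.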

All the Llama models have 128,000-token context windows. (In data science, tokens are subdivided bits of raw data, like the syllables “fan,” “tas” and “tic” in the word “fantastic.”) A model’s context, or context window, refers to input data (e.g. text) that the model considers before generating output (e.g. additional text). Long context can prevent models from “forgetting” the content of recent docs and data, and from veering off topic and extrapolating wrongly.

Those 128,000 tokens translate to around 100,000 words or 300 pages, which for reference is around the length of “Wuthering Heights,” “Gulliver’s Travels” and “Harry Potter and the Prisoner of Azkaban.”
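
The conversion above can be turned into a quick estimator. This is a minimal Python sketch that takes the article's own round numbers (128,000 tokens, about 100,000 words, about 300 pages) as given. The function names are hypothetical, and real token counts depend on the tokenizer and the text, so treat the results as ballpark figures only.

```python
# Figures taken from the article's own approximation.
CONTEXT_TOKENS = 128_000
APPROX_WORDS = 100_000
APPROX_PAGES = 300

# Derived ratios (rough averages, not tokenizer-exact).
words_per_token = APPROX_WORDS / CONTEXT_TOKENS  # 0.78125 words per token
words_per_page = APPROX_WORDS / APPROX_PAGES     # about 333 words per page


def estimate_tokens(word_count: int) -> int:
    """Estimate how many tokens a text of `word_count` words would use."""
    return round(word_count / words_per_token)


def fits_in_context(word_count: int) -> bool:
    """Check whether the estimated token count fits in the context window."""
    return estimate_tokens(word_count) <= CONTEXT_TOKENS


if __name__ == "__main__":
    for words in (20_000, 100_000, 150_000):
        print(words, "words ->", estimate_tokens(words), "tokens,",
              "fits" if fits_in_context(words) else "does not fit")
```

By this estimate, a 150,000-word manuscript would need roughly 192,000 tokens and would have to be split or summarized before Llama could take it in one pass.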

What can Llama do?

Like other generative AI models, Llama can perform a range of different assistive tasks, like coding and answering basic math questions, as well as summarizing documents in eight languages (English, German, French, Italian, Portuguese, Hindi, Spanish and Thai). Most text-based workloads — think analyzing files like PDFs and spreadsheets — are within its purview; none of the Llama models can process or generate images, although that may change in the near future.

All the latest Llama models can be configured to leverage third-party apps, tools and APIs to complete tasks. They’re trained out of the box to use Brave Search to answer questions about recent events, the Wolfram Alpha API for math- and science-related queries and a Python interpreter for validating code. In addition, Meta says the Llama 3.1 models can use certain tools they haven’t seen before (but whether they can reliably use those tools is another matter).
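
In practice, tool use means the application around the model has to route each tool request to something that can fulfill it. The Python sketch below illustrates that routing step only. The request format (a dict with "tool" and "input" keys), the tool names and the stub functions are all hypothetical stand-ins invented for illustration. They are not Llama's actual tool-calling format, and the stubs do not call Brave Search, Wolfram Alpha or a real interpreter.

```python
from typing import Callable, Dict


# Hypothetical stand-ins for real tools. Each returns a placeholder string.
def search_stub(query: str) -> str:
    return f"[search results for: {query}]"


def math_stub(query: str) -> str:
    return f"[math answer for: {query}]"


def code_stub(source: str) -> str:
    return f"[interpreter output for: {source}]"


# Registry mapping a (hypothetical) tool name to the function that handles it.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search": search_stub,
    "math": math_stub,
    "code": code_stub,
}


def dispatch(tool_call: dict) -> str:
    """Route a tool request to its handler, or report an unknown tool."""
    name = tool_call.get("tool")
    handler = TOOLS.get(name)
    if handler is None:
        return f"[unknown tool: {name}]"
    return handler(tool_call.get("input", ""))


if __name__ == "__main__":
    print(dispatch({"tool": "search", "input": "latest Llama release"}))
    print(dispatch({"tool": "calendar", "input": "next Friday"}))
```

The fallback for unknown tools matters because, as noted above, a model may try to use tools it has not seen before, and the surrounding application still has to decide what to do with such a request.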


Where can I use Llama?

If you’re looking to simply chat with Llama, it’s powering the Meta AI chatbot experience on Facebook Messenger, WhatsApp, Instagram, Oculus and Meta.ai.

Developers building with Llama can download, use or fine-tune the model across most of the popular cloud platforms. Meta claims it has over 25 partners hosting Llama, including Nvidia, Databricks, Groq, Dell and Snowflake.

Some of these partners have built additional tools and services on top of Llama, including tools that let the models reference proprietary data and enable them to run at lower latencies.

Meta suggests using its smaller models, Llama 8B and Llama 70B, for general-purpose applications like powering chatbots and generating code. Llama 405B, the company says, is better reserved for model distillation — the process of transferring knowledge from a large model to a smaller, more efficient model — and generating synthetic data to train (or fine-tune) alternative models.

Importantly, the Llama license constrains how developers can deploy the model: App developers with more than 700 million monthly users must request a special license from Meta that the company will grant at its discretion.

What tools does Meta offer for Llama?

Alongside Llama, Meta provides tools intended to make the model “safer” to use:

  • Llama Guard, a moderation framework
  • Prompt Guard, a tool to protect against prompt injection attacks
  • CyberSecEval, a cybersecurity risk assessment suite

Llama Guard tries to detect potentially problematic content either fed into or generated by a Llama model, including content relating to criminal activity, child exploitation, copyright violations, hate, self-harm and sexual abuse. Developers can customize the categories of blocked content, and apply the blocks to all the languages Llama supports out of the box.

Like Llama Guard, Prompt Guard can block text intended for Llama, but only text meant to “attack” the model and get it to behave in undesirable ways. Meta claims that Prompt Guard can defend against explicitly malicious prompts (i.e. jailbreaks that attempt to get around Llama’s built-in safety filters) in addition to prompts that contain “injected inputs.”

As for CyberSecEval, it’s less a tool than a collection of benchmarks to measure model security. CyberSecEval can assess the risk a Llama model poses (at least according to Meta’s criteria) to app developers and end users in areas like “automated social engineering” and “scaling offensive cyber operations.”

Llama’s limitations

Llama comes with certain risks and limitations, like all generative AI models.

For instance, it’s unclear whether Meta trained Llama on copyrighted content. If it did, users might be liable for infringement if they end up unwittingly using a copyrighted snippet that the model regurgitated.

Meta at one point used copyrighted e-books for AI training despite its own lawyers’ warnings, according to recent reporting by Reuters. The company controversially trains its AI on Instagram and Facebook posts, photos and captions, and makes it difficult for users to opt out. What’s more, Meta, along with OpenAI, is the subject of an ongoing lawsuit brought by authors, including comedian Sarah Silverman, over the companies’ alleged unauthorized use of copyrighted data for model training.

Programming is another area where it’s wise to tread lightly when using Llama. That’s because Llama might — like its generative AI counterparts — produce buggy or insecure code.

As always, it’s best to have a human expert review any AI-generated code before incorporating it into a service or software.
