Large Language Models Challenge Social Media and XR: A Conversation Between Zuckerberg and Huang

Open-sourcing a large language model worth billions of dollars is a carefully crafted business strategy.


01 Training Large Models Is Expensive: How Will Meta Recoup Its Investment?

Meta's Money-Printing Business of Information Feeds and Recommendation Systems Is Being "Shaken" by Large Models

Jensen Huang: Mark, welcome to your first SIGGRAPH. Can you believe it? You're one of the pioneers of computing, a driver of modern computing, and I still had to invite you to SIGGRAPH myself. I'm glad you could make it.

Zuckerberg: Yeah, it should be fun. You've been talking for about five hours now, right?

Jensen Huang: Yes, that's SIGGRAPH for you. 90% of the people here are PhDs. The best thing about SIGGRAPH is that it's a conference that combines computer graphics, image processing, artificial intelligence, and robotics. Over the years, many companies have showcased and revealed amazing things here, like Disney, Pixar, Adobe, Epic Games, and of course, NVIDIA.

We've done a lot of work here this year: we published 20 papers at the intersection of AI and simulation. We're using AI to help simulation run at larger scale and higher speed, for example with differentiable physics, and we're using simulation to build simulated environments for AI and to generate synthetic data. These two fields are really merging.
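To make "differentiable physics" concrete, here is a minimal sketch in PyTorch (chosen since the conversation mentions it): a toy projectile simulation whose rollout is differentiable, so gradient descent can tune a launch parameter. It illustrates the idea only; it is not NVIDIA's actual simulation stack.

```python
import torch

# Toy "differentiable physics": integrate 2D projectile motion with explicit
# Euler steps, then backprop through the rollout to tune the launch speed so
# the projectile lands near a target distance.
def landing_x(v0, angle, steps=400, dt=0.01, g=9.81):
    vx, vy = v0 * torch.cos(angle), v0 * torch.sin(angle)
    x = torch.zeros(())
    y = torch.zeros(())
    for _ in range(steps):
        if y.item() < 0 and vy.item() < 0:   # landed: stop integrating
            break
        x, y, vy = x + vx * dt, y + vy * dt, vy - g * dt
    return x

target = 50.0                                 # desired landing distance (m)
v0 = torch.tensor(20.0, requires_grad=True)   # learnable launch speed
angle = torch.tensor(0.7)                     # fixed launch angle (rad)
opt = torch.optim.Adam([v0], lr=0.5)

for _ in range(200):
    opt.zero_grad()
    loss = (landing_x(v0, angle) - target) ** 2
    loss.backward()                           # gradients flow through the sim
    opt.step()

print(f"learned launch speed: {v0.item():.2f} m/s")
```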

Meta has actually done amazing work in AI. I find it interesting that when the media writes about Meta suddenly investing in AI over the past few years, it's as if they've forgotten the track record of FAIR (Facebook AI Research, Meta's AI research lab, established in 2013). In fact, we all use PyTorch, Meta's open-source deep learning framework and an indispensable tool in AI research and development, and Meta's work in computer vision, language models, and real-time translation has been groundbreaking.

The first question I want to ask you is, how do you view Meta's progress in generative AI? How will it enhance your business or introduce new capabilities?

Zuckerberg: Compared to you, we're still newcomers. But Meta has been attending SIGGRAPH for eight years now. In 2018, we first showed some of our hand-tracking work for our VR and mixed reality headsets. We've also shared a lot of progress on Codec Avatars, photorealistic avatars that can be rendered in consumer-grade headsets.

There's also a lot of work we've done on display systems, future prototypes and research aimed at making mixed reality headsets very thin. What I want is a very advanced optical stack, display system, and tightly integrated device.

So I'm glad to be here, and this year it's not just the metaverse we're talking about, but everything about AI. As you mentioned, we established FAIR before starting Reality Labs (Meta's metaverse R&D division), back when we were still called Facebook; now, of course, we're Meta. So we have years of accumulated work in AI.

Regarding generative AI, it's a fascinating revolution, and I think it will ultimately transform all the products we make. Take the information feeds and recommendation systems of Instagram and Facebook: we've been evolving them for two decades, and AI will change them further still.

The initial feeds were just about connections with friends, and in this case, feed ranking was key. Because if someone did something very important, like your cousin having a baby or something, you want it to appear at the top. If we buried it somewhere in your feed, you'd be very angry.

But in recent years, feeds have evolved to another stage where more of what you see is public content. Here, recommendation systems become super important, because it's no longer a few hundred or a few thousand posts from friends waiting to be shown to you, but millions of pieces of content. That becomes a very interesting recommendation problem.
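To illustrate why this scale shift changes the problem, here is a toy sketch of the standard two-stage retrieve-then-rank pattern for recommending from a very large pool. The embeddings, corpus size, and "freshness" re-ranking term are invented for illustration; this is not Meta's production system.

```python
import numpy as np

# With friends-only feeds you can score every post; with millions of public
# items you typically retrieve a small candidate set first, then run an
# expensive ranker only on that set.
rng = np.random.default_rng(0)
DIM, N_ITEMS, K = 64, 100_000, 500

items = rng.standard_normal((N_ITEMS, DIM)).astype(np.float32)
items /= np.linalg.norm(items, axis=1, keepdims=True)   # item embeddings
user = rng.standard_normal(DIM).astype(np.float32)
user /= np.linalg.norm(user)                            # user embedding

# Stage 1: cheap retrieval by cosine similarity (production systems use
# approximate nearest-neighbor indexes rather than a full scan).
sims = items @ user
candidates = np.argpartition(-sims, K)[:K]

# Stage 2: a heavier ranker re-scores only the candidates; stubbed here as
# similarity blended with a freshness prior.
freshness = rng.random(K).astype(np.float32)
final_scores = sims[candidates] + 0.1 * freshness
feed = candidates[np.argsort(-final_scores)][:25]       # top of the feed
print(feed[:5])
```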

And with generative AI, we'll soon enter a new phase. Most of the content you see on Instagram today is recommended to you, written by someone in the world, matching your interests, whether or not you follow these people. But in the future, some of this will be new content created by creators using tools, and some content will even be created instantly for you, or generated by synthesizing different existing content.

This is just one example of how our core business will evolve, which has already been evolving for 20 years, but few people realize it.

Unveiling Llama 4, Unlocking AI Assistants in Meta's Entire Product Family

Jensen Huang: However, people realize that one of the world's largest computing systems is the recommendation system.

Zuckerberg: It's a somewhat different path from the generative AI everyone is talking about now, although both are built on the Transformer architecture, both are moving toward increasingly general systems, and both embed unstructured data into features.
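As a rough illustration of what "embedding unstructured data into features" with a shared Transformer architecture can look like, here is a minimal PyTorch sketch; the vocabulary, layer sizes, and mean-pooling choice are assumptions for the example, not Meta's models.

```python
import torch
import torch.nn as nn

# A small Transformer encoder turns an unstructured token sequence (a caption,
# a video transcript, a post) into a fixed-size feature vector that downstream
# rankers can consume, regardless of the content type.
class ContentEncoder(nn.Module):
    def __init__(self, vocab=10_000, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, tokens):                 # tokens: (batch, seq_len) int64
        h = self.encoder(self.embed(tokens))   # (batch, seq_len, dim)
        return h.mean(dim=1)                   # mean-pool into one feature vector

enc = ContentEncoder()
batch = torch.randint(0, 10_000, (8, 32))      # 8 items, 32 tokens each
features = enc(batch)                          # (8, 64): same space for any content
print(features.shape)
```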

But the two approaches differ qualitatively. In the past, we trained different models for different types of content: one model for ranking and recommending Reels, Meta's short-video product, and another for ranking and recommending long videos, for example. Then you need some product work on top so the system can display any type of content inline.

As you build more and more general recommendation models, they keep getting better, because you can draw from one wide pool of content rather than inefficiently drawing from separate pools.

Now, as models become larger and more general, they keep improving. My dream is that one day all the content of Facebook or Instagram could be driven by a single AI model that unifies all these different content types and systems. In reality, the app has different recommendation objectives at different times: sometimes it's just showing you interesting content you want to see today, sometimes it's helping you build your long-term network of connections. These multimodal models tend to be better at identifying patterns, weak signals, and so on across objectives like these.
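One way to picture a single model serving different recommendation goals at different times is a shared trunk with several objective heads whose blend weights vary by session. The sketch below is a hypothetical illustration of that pattern in PyTorch, not Meta's ranker; the head names and weights are invented.

```python
import torch
import torch.nn as nn

# One shared trunk scores each (user, item) pair under multiple objectives;
# the caller blends the heads with weights that can change per surface or
# time of day, without retraining the trunk.
class UnifiedRanker(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(dim * 2, 256), nn.ReLU())
        self.heads = nn.ModuleDict({
            "engagement_today": nn.Linear(256, 1),       # interesting right now
            "long_term_connection": nn.Linear(256, 1),   # builds your network
        })

    def forward(self, user, item, weights):
        h = self.trunk(torch.cat([user, item], dim=-1))
        return sum(w * self.heads[name](h).squeeze(-1)
                   for name, w in weights.items())

ranker = UnifiedRanker()
user = torch.randn(4, 128)
items = torch.randn(4, 128)
# A session focused on today's interests can weight the objectives differently
# from one focused on long-term connections.
score = ranker(user, items, {"engagement_today": 0.7, "long_term_connection": 0.3})
print(score.shape)  # torch.Size([4])
```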

Jensen Huang: So AI is used so deeply in your company. You've been building GPU infrastructure to run these large recommendation systems for a long time.

Zuckerberg: Actually, we were a bit slow in using GPUs.

Jensen Huang: Yes, and you didn't even have to bring that up yourself (laughs).

Now, the really cool thing about using AI is that when I use WhatsApp, I feel like I'm collaborating with it. Imagine I'm typing, and it generates images to match what I write; when I change my wording, it generates different images. For example, if I type "an old Chinese man enjoying a glass of whiskey at sunset, three dogs beside him," it generates a pretty good picture.

Zuckerberg: On one hand, I think generative AI will be a huge upgrade to all our long-standing workflows and products.

But on the other hand, entirely new things can now be created and generated. AI assistants like Meta AI can already help you complete different tasks. In our vision, it will be genuinely creative, and over time it will be able to answer any question.

In the future, when we move from the Llama 3 models to Llama 4 and later versions, I don't think Meta AI will just be a chatbot where you ask a question and it answers. Instead, once it understands your intent, it will work autonomously over longer time horizons. You give it an intent, it gets started, and weeks or months of computational work later, it comes back and tells you the results. I think that will be very powerful.
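A schematic of what such a long-horizon assistant might look like as code, with a stubbed, hypothetical call_model standing in for an LLM endpoint so the sketch runs; nothing here is Meta's actual API.

```python
import json

# Given an intent, the agent plans, works through the steps over an extended
# period, and only reports back at the end.
def call_model(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM client, stubbed deterministically.
    if prompt.startswith("Plan:"):
        return json.dumps(["gather data", "run analysis", "draft report"])
    return f"done ({prompt[:40]}...)"

def long_horizon_agent(intent: str) -> str:
    state = {"intent": intent, "results": []}
    steps = json.loads(call_model(f"Plan: break this intent into steps: {intent}"))
    for step in steps:
        # In reality each step could be scheduled over days or weeks, with
        # checkpoints so the agent can resume after interruptions.
        state["results"].append(call_model(f"Execute {step!r} given {state}"))
    return call_model(f"Summarize for the user: {state['results']}")

print(long_horizon_agent("analyze how my posts performed this quarter"))
```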

Jensen Huang: As you said, today's AI works in a back-and-forth, question-and-answer style, but human thinking obviously isn't like that. Given a task or a problem, we consider multiple options, perhaps build a decision tree, and mentally simulate the different outcomes of each decision. AI will be able to do this kind of planning and decision-making in the future.
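The kind of planning described here, enumerating options and simulating their outcomes, is classically a depth-limited search over a decision tree. Below is a minimal, self-contained sketch with a toy simulator; the actions and value function are placeholders, not any production planner.

```python
# Enumerate candidate actions, "simulate" each one a few moves deep, and pick
# the branch with the best estimated outcome.
def plan(state, actions, simulate, value, depth=2):
    """Depth-limited search: returns the best action and its estimated value."""
    if depth == 0:
        return None, value(state)
    best_action, best_value = None, float("-inf")
    for action in actions(state):
        next_state = simulate(state, action)        # imagined outcome, not real
        _, v = plan(next_state, actions, simulate, value, depth - 1)
        if v > best_value:
            best_action, best_value = action, v
    return best_action, best_value

# Tiny worked example: walk a number toward 10 with +1 / -1 / *2 moves.
actions = lambda s: ["+1", "-1", "*2"]
simulate = lambda s, a: {"+1": s + 1, "-1": s - 1, "*2": s * 2}[a]
value = lambda s: -abs(10 - s)
print(plan(3, actions, simulate, value, depth=3))   # -> ('+1', 0): 3+1+1 then *2 hits 10
```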

When you talk about your vision for creator AI, I'm very excited to hear it. Why don't you tell everyone about your plans?

Zuckerberg: We've talked about it a little, but today we're rolling it out more broadly. I don't think there will be just one AI model. Some other companies in the industry are building a single centralized agent; we're taking a different approach.

We'll have the Meta AI assistant for you to use, but we want everyone who uses Meta's products to be able to create their own agents. Whether it's the millions of creators on the platform or the hundreds of millions of small businesses, they should be able to quickly build a business agent that interacts with their customers, handling things like sales and customer service.

So Meta is now starting to roll out more of what we call AI Studio, which is a set of tools that will ultimately allow every creator to build some kind of AI version of themselves, as a kind of agent or assistant that community members can interact with.

If you're a creator who wants more interaction with your community, you're limited by time and energy. A better option is letting creators build these AIs, trained on their own corpus in the way they want, to represent them. It's made very clear that you're not interacting with the creator themselves, but it's another interesting channel, much like creators posting content on these social systems, except an agent does it.

Similarly, I think people will create these intelligent agents for their own businesses.
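A hedged sketch of what such a creator or business agent could look like at its simplest: retrieve the most relevant passages from the owner's own corpus, then prompt a model to answer in their style while disclosing that it is an AI. The naive token-overlap retrieval and the stubbed generate function are placeholders (a real system would use embeddings and an LLM); this is not Meta's actual AI Studio API.

```python
# Toy corpus standing in for a creator's past posts.
creator_corpus = [
    "My rule for training: consistency beats intensity.",
    "I answer fan questions live every Friday.",
    "Recovery days matter as much as workout days.",
]

def retrieve(question: str, corpus: list[str], k: int = 2) -> list[str]:
    # Naive retrieval by shared-word count; real systems use embeddings.
    q = set(question.lower().split())
    return sorted(corpus, key=lambda doc: -len(q & set(doc.lower().split())))[:k]

def generate(prompt: str) -> str:
    # Stub for an LLM call, so the sketch runs end to end.
    return f"[stubbed model reply for: {prompt[:60]}...]"

def creator_agent(question: str) -> str:
    context = "\n".join(retrieve(question, creator_corpus))
    prompt = (
        "You are an AI version of this creator. Be clear you are an AI.\n"
        f"Creator's own words:\n{context}\n\nFan question: {question}"
    )
    return generate(prompt)

print(creator_agent("How often should I train?"))
```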