Classifying the Directions of Large AI Models
Currently, most companies adopt AI by taking a general-purpose large language model, training it further on industry-specific data, and calling the result a differentiated application. However, this may be a shortcut. Mike Knoop, co-founder of Zapier, believes that scaling up large language models only develops "memory," a capability that is distinct from intelligence itself. Such a model cannot understand enterprise scenarios and needs, and so fails to fully leverage the value of AI.
Additionally, the relationship between investment in GPU computing power and gains in large language model capability may show diminishing marginal returns. Once easily accessible public data is exhausted, relying on general-purpose large language models to overtake competitors in AI will prove an illusion.
This is even more unfavorable for enterprises. Companies often lose sight of their original goals when pursuing new technologies: they set out to solve specific problems but end up chasing concepts and forget the most fundamental issues.
The solution to this problem lies with AI companies. Sarah Tavel, partner at Benchmark, believes the best development direction is to start large model ventures based on specific customer needs. Alex Wang, co-founder of Scale AI, thinks that data is the bottleneck for AI model performance, not algorithms or computation. Data ultimately comes from multiple vertical industries, meaning AI companies should delve into industry domains and develop industry-specific large models that meet enterprise needs.
There are two key points in this process:
- Data issue: AI companies need to "understand" users and industries. Many companies have large amounts of underutilized data corpora.
- Management and iteration issue: Due to the diversity of industries and scenarios, it's currently difficult for one company to build large models across all domains.
Both Fourth Paradigm and Mike Knoop of Zapier point to automation as the key. Technologically, AutoML, program synthesis, and neural architecture search all involve automation and optimization processes to reduce manual intervention and improve efficiency and effectiveness. Mike Knoop believes AGI exploration should be based on program synthesis and neural architecture search, while Dai Wenyuan, founder of Fourth Paradigm, mentions that AutoML is the foundational technology for building countless industry-specific large models.
Dai Wenyuan calls AutoML "an art of failure." He argues that it delivers more value at Fourth Paradigm because the company has worked through numerous scenarios and knows how to align data and models with the needs of each one. Successes become results and failures become material for the next attempt, so automation accelerates iteration. As Alex Wang says, "Machine learning is a garbage in, garbage out framework." With high-quality industry data and the ability to correct errors continuously, however, reliable deployment of industry-specific large models should ultimately be achievable.
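To make the idea of automated model building concrete, here is a minimal sketch of the kind of loop that AutoML tools automate: fit several candidate models, score each on held-out data, keep the best, and log the failures. It is a toy illustration only and does not represent Fourth Paradigm's system or any real AutoML library. The synthetic data, the candidate models, and all names (`make_data`, `auto_select`, and so on) are assumptions made for this example.

```python
import random
from statistics import mean


def make_data(n=200, seed=0):
    """Synthetic 1-D regression data (hypothetical): y = x^2 plus noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-3, 3) for _ in range(n)]
    return [(x, x * x + rng.gauss(0, 0.3)) for x in xs]


def fit_mean(train):
    """Baseline: always predict the training mean."""
    m = mean(y for _, y in train)
    return lambda x: m


def fit_linear(train):
    """Ordinary least squares for a straight line, in closed form."""
    mx = mean(x for x, _ in train)
    my = mean(y for _, y in train)
    sxx = sum((x - mx) ** 2 for x, _ in train)
    sxy = sum((x - mx) * (y - my) for x, y in train)
    slope = sxy / sxx
    return lambda x: my + slope * (x - mx)


def fit_knn(k):
    """k-nearest-neighbour regression; returns a fit function for a given k."""
    def fit(train):
        def predict(x):
            nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
            return mean(y for _, y in nearest)
        return predict
    return fit


def mse(model, data):
    """Mean squared error of a model on (x, y) pairs."""
    return mean((model(x) - y) ** 2 for x, y in data)


def auto_select(data, candidates, holdout=0.25):
    """Try every candidate, score it on a held-out split, and rank the results.

    Candidates that raise are recorded in `failures` instead of stopping the
    search, so a failed trial still leaves a trace for the next iteration.
    """
    split = int(len(data) * (1 - holdout))
    train, valid = data[:split], data[split:]
    leaderboard, failures = [], []
    for name, fit in candidates.items():
        try:
            leaderboard.append((mse(fit(train), valid), name))
        except Exception as exc:
            failures.append((name, repr(exc)))
    leaderboard.sort()
    best = leaderboard[0][1] if leaderboard else None
    return best, leaderboard, failures


CANDIDATES = {
    "mean": fit_mean,
    "linear": fit_linear,
    "knn_k1": fit_knn(1),
    "knn_k5": fit_knn(5),
    "knn_k25": fit_knn(25),
}

if __name__ == "__main__":
    best, leaderboard, failures = auto_select(make_data(), CANDIDATES)
    print("best candidate:", best)
    for score, name in leaderboard:
        print(f"{name:8s} validation MSE = {score:.3f}")
```

Real AutoML systems search far larger spaces (features, architectures, hyperparameters) with smarter strategies than exhaustive trial, but the structure is the same: the search and evaluation are automated, and what a person contributes is the choice of data and of the scenario being modeled.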
Different AI Models: Ideas, Approaches, and Prospects
Some companies focused on general-purpose large models, represented by OpenAI, are developing horizontally and treat the large model as the entire product. Their business model is simply to sell large model capabilities. In contrast, companies like Fourth Paradigm and Glean are taking a different path, using AI technology to help enterprises make decisions in specific areas and so improve overall work effectiveness. Their business models differ accordingly.
Glean provides an AI-powered enterprise search and knowledge management platform that integrates multiple third-party application functions, becoming part of the workflow. It can also help enterprises train proprietary AI models using their own data, based on Glean's self-developed "trusted knowledge model."
Fourth Paradigm delves deeper into predicting and managing core business issues in industries. Its industry large model platform, AIOS 5.0, builds industry foundation large models based on X-modal data from various industry scenarios. At the capability level, it focuses on "Predict the Next X," where X represents the logic and results of major industries. At the usage level, it provides low-threshold modeling tools and scientist innovation service systems to achieve end-to-end industry large model construction, deployment, and management services.
This is typical of Chinese AI companies that grow out of an industrial base. Dai Wenyuan believes that China has an advantage in the number of scenarios and the volume of data. After enough scenarios are covered, connecting these models might also achieve AGI. By comparison, many popular industry large models are still just industry-specific large language models: large but not precise. Splitting an industry into narrower scenarios appears to require building many large models, but the amount of data involved in each narrow scenario is limited. With the help of automation technology, this approach may offer an alternative path to AGI at the application level.
Mike Knoop believes that progress toward AGI has hit obstacles after a period of rapid advance because the field relies too heavily on large language models and defines AGI as a system that can complete most tasks. In his view, AGI should instead focus on efficiently acquiring new capabilities and solving open-ended problems across scenarios.
NVIDIA CEO Jensen Huang has said that as large models develop, computers are transitioning from instruction-driven to intent-driven: "Future applications will do and execute in ways similar to how we do things, assembling teams of experts, using tools, reasoning, planning, and executing our tasks." This logic itself implies generality: large models are entering the physical world because decision-making in the physical world is also traceable.
A similar example is Palantir, originally a government-facing (To G) big data company that supported decision-making through data analysis and simulation modeling. Generative AI has transformed how it processes data, advancing its automation and data-driven decision-making and accelerating the growth of its enterprise (To B) AI business. Fourth Paradigm, by contrast, builds an industry-specific large model for each deterministic scenario, helping enterprises master their own applications and make effective decisions.