Is there an opportunity to create another ByteDance or Pinduoduo with artificial intelligence?
Compared with previous years, practitioners at this year's WAIC are no longer confined to discussing basic large models; more application-oriented products for vertical scenarios have emerged. This suggests that technical R&D teams are thinking more clearly and that AI is moving closer to the lives of ordinary people.
It has only been a short year and a half since ChatGPT was released at the end of 2022. Zhuang Minghao, Vice President and Chief Strategy Officer of Qiwan Technology, remembers that a year ago, discussions were still limited to AI applications in language and text. But now, the frequent hits in multi-modal sectors such as images, videos, audio, and music prove the vitality of the new wave.
In this AI wave of fierce competition, both startups and small and medium-sized companies have once again gained the opportunity to compete on the same stage as giant companies - just like the arena created by the mobile internet more than a decade ago.
Zhuang Minghao summarizes the current chaotic situation with the phrase "AI is a hidden card for entrepreneurs and an open card for big companies."
The so-called "open card for big companies" refers to the fact that for all large companies, doing AI today is a given, something that must be done to empower existing businesses, and something that is planned and paced. The "hidden card for entrepreneurs" refers to the fact that for startup companies, judging the direction of AI entrepreneurship is something that requires guessing and speculation.
However, the next generation of giants often hides among the opportunities that hidden cards create. "In 2010, when big companies were all doing wireless transformation of their businesses, no one would have thought that companies like Pinduoduo and Douyin would emerge a few years later," Zhuang said. "Open cards alone do not make a game. Only a forest can produce towering trees, and only a complex ecosystem can produce outstanding companies."
Ten years ago, Qiwan Technology caught that "hidden card" and launched TT Voice, filling a gap in the mobile voice market. Ten years later, at a new moment when open and hidden cards intertwine, what qualifies Qiwan Technology to sit at the table? Zhuang gave three reasons.
First, Qiwan has been deeply involved in vertical fields such as pan-entertainment for ten years and has a deep understanding of the ecosystem and users in this scenario. "We companies that do business are essentially doing it to meet user scenarios, satisfy user demands, and always stay close to user needs. This has always been the mission of startup companies and business companies," Zhuang said.
Second, in developing vertical models, Qiwan has invested for the long term in self-developed technology and has accumulated "unique" high-quality data. In the AI field, the importance of data far exceeds that of models, and companies with unique data will have a stronger competitive advantage.
Finally, Qiwan has highly sticky, highly active scenarios in which to experiment. As mentioned earlier, because its established products are popular, Qiwan's new technology can leave the laboratory quickly, be tested and polished by users and the market, and enter the positive development cycle of "R&D - efficiency improvement - revenue increase" as early as possible.
First become a specialist, then find new opportunities in your area of expertise
When mobile internet emerged in 2011, John Doerr, a partner at a famous venture capital firm, proposed the concept of "SoLoMo", which stands for Social, Local, and Mobile. When this concept was proposed, it was widely recognized as the future development trend of the internet and became the standard answer guiding many companies forward.
Returning to the present, AI is still in its early stages of development, and the standard answer everyone is waiting for has not yet appeared. Big companies and startups alike are constantly experimenting, weighing their options, and maneuvering against one another, and much remains unclear. But for some companies, this "chaos" is not an abyss but a ladder.
While big companies compete fiercely over general large models and big clients, Qiwan Technology is more like a "specialist with generalist thinking", more adept at solving problems and finding new paths in vertical scenarios. This is Qiwan's innate advantage and the source of its confidence, and it has already proven its ability.
"The rapidly evolving industry state and the rapid iteration of basic large model capabilities have brought challenges to companies doing engineering and application," Zhuang said. Companies that build products on top of large models are often "dragged along": they have barely finished their own adjustments when the underlying model changes again, which makes it difficult to ensure service stability.
Qiwan Technology's approach is to self-develop vertical "small" models based on its deep-rooted voice and pan-entertainment scenarios, do its own training, and form a "product-model parallel" development path. "In this field, we can ensure that this model is relatively stable and will not be particularly impacted by the rapid iteration of the underlying general large models," Zhuang said.
To date, Qiwan Technology's self-developed vertical large models have covered areas such as audio, music, and dialogue, with more specialized and user-friendly multi-modal understanding, generation, and interaction. In the previously released "2024 China Artificial Intelligence Industry Large Model Enterprise Competitiveness Top 100 Research Report", Qiwan's large model also ranked among the top 100.
For example, in the field of AI music, Qiwan Technology has developed the world's first multi-modal music composition large model, capable of text-to-music, audio-to-music, and even video-to-music generation, supporting AI lyric writing, automatic composition, arrangement, mixing, etc. It can solve users' music creation problems throughout the entire process, allowing ordinary music enthusiasts to truly achieve zero-threshold music creation.
AI music is a new track without standard answers. When Suno, the "ChatGPT of the music world", emerged out of nowhere and the world's attention was focused on this small sector, Qiwan Technology had already been cultivating it for many years. It can be said that Qiwan Technology is also one of the earliest enterprises in the industry to develop music large models and AI native application products.
In addition, building on self-developed technologies such as generative action large models and audio large models, Qiwan Technology has developed one-stop enterprise-level solutions such as digital humans and multilingual translation. Besides serving the game manufacturers and MCN institutions in its industrial chain for video content creation and overseas expansion, these solutions have been applied to scenarios such as intelligent customer service, local life, and film and tourism, reaching markets worth trillions, with partners including well-known enterprises such as China Telecom.
Almost all major products start with vertical groups and then gradually generalize to become national-level products. AI will structurally change user experience and industrial ecology, and the broad business opportunities bred in vertical application scenarios are self-evident.
The "democratization" of AI puts everyone on the same starting line. But for startup companies, Qiwan's path of achieving "product-model parallel" development in vertical industries may be worth learning from but cannot simply be copied. After all, accumulated data and industry know-how are key, and the longer the accumulation, the higher the barrier.
Using a "simple formula" to grasp the anchor of certainty
In the AI era of survival of the fittest, how can enterprises grasp certainty in uncertainty? Zhuang believes that what ultimately determines success or failure is always a "correct cliché", which is "staying close to user needs".
"Our mission has always been to solve user needs that have not been met through innovative technology and products," Zhuang said. No matter how technology evolves, how the capital environment changes, whether doing X+AI or AI Native, this underlying logic will never change.
From self-developing vertical large models to building a full-stack AI interaction technology industry ecosystem, Qiwan Technology has always insisted on starting from user needs. It picks typical vertical scenarios as pilots for breakthroughs and replicates what works in other scenarios only after a pilot succeeds, thereby reducing uncertainty in AI transformation. By "doing a little more" each time, it has made its products "a little more stable" and "a little more user-friendly" for users and customers. From this step-by-step "slow method", it gradually derived a simple formula of "one generates two, two generates three".
At the same time, facing AI that seems to be omnipotent, Qiwan Technology also emphasizes the "sense of boundaries" in use. This boundary includes both the boundary definition of AI integration with business scenarios and the boundary understanding of what technical level AI can achieve at the current stage.
"This year's WAIC conference reminds me of how I felt when participating in mobile internet conferences more than a decade ago," Zhuang said. The atmosphere in the huge exhibition hall next to the Bird's Nest in Beijing back then was just like the sensation WAIC is causing in Shanghai now.
Facing a more brutal competitive environment, Qiwan Technology has already established certain barriers and advantages in vertical fields and has made it a priority to secure its ticket to the big wave of the AI era. What new things will the pioneers in the AI industry bring next year? How can enterprises store more "ammunition"?
Zhuang believes that the AI industry will witness a moment of victory or defeat within a year or two. For now, it is like walking in a dark forest: you light a torch somewhere in the forest and can only illuminate the area around you, but as you slowly walk on, you will see faint lights in other places and discover more of your kind, until these lights connect and jointly welcome a brand new world.