
Zhao Hejuan: Are We Ready for the Bottleneck Period of GPT Large Models?

By Jany Hejuan Zhao, Sep 12, 2024, 9:22 a.m. ET

It is still difficult to determine how long this bottleneck period will last, but whether it is three months, six months, a year, or even longer, it may be a rather tough period for us.

I have been busy over the past few months with global market research and with setting up TMTPost's overseas office, so my video posts have been sporadic. Even so, I did not expect that a vlogger posting as erratically as I do would still receive so many questions from users, such as why GPT-5 has not been released yet, or whether startups in Silicon Valley have found any new application directions.

I can sense that everyone is anxious and worried, but what we need now is patience, not haste. A few months ago, I predicted in a public speech that the widely anticipated explosion of large model applications might not happen this year. It is too early to talk about an explosion; we can only call it a beginning.

The development of large models over the past few months has borne out my prediction. Rumors circulating in Silicon Valley tech circles say that internal testing of GPT-5 has failed, making a launch this year unlikely. The reason is that Transformer-based GPT models have hit a scaling bottleneck. Beyond chat and workflow-assistance applications such as programming, writing, design, and office work, they are hard to apply broadly across the market, and GPT-4 is already sufficient for those needs. Upgrading the existing architecture in the short term is unlikely to produce a qualitative breakthrough. The bottlenecks include the limits of the architecture's own scalability, data constraints, the difficulty of meeting safety and ethical requirements, and insufficient market demand caused by the limited range of applications.

Are Chinese entrepreneurs working on large models and their applications, who have traditionally been good at following trends, prepared to change course because of this?

It's hard to determine how long this bottleneck period will last, but whether it's three months, six months, a year, or even longer, it might be a tough time for us.

The good news is that, based on speculation in some American tech media reports and information I have gathered from various sources, OpenAI may be considering launching a model product in a new direction, no longer limited to the Transformer architecture, possibly named "Orion" (some say it might be named GPT-NEXT). Orion is the Latin name of a constellation, symbolizing strength, adventure, and discovery, with strong mythological connotations of exploring the unknown. NASA's Orion spacecraft was also named after it and is intended for future crewed deep-space exploration missions.

If this code name is indeed used, it may suggest that OpenAI views this as the beginning of a genuinely new era for AI, distinct from the previous GPT models. As of now, however, OpenAI has neither confirmed nor denied the code name. The Orion model is also part of OpenAI's internal project codenamed "Strawberry." Reports indicate that in internal demonstrations of the Strawberry system, Orion's reasoning capabilities are already far superior to those of the GPT models, while its hallucination rate is significantly lower.

Based on various pieces of information, I have summarized a few potential differences between Orion and the GPT models. Although Orion can still be classified as a large language model (LLM), it might have significant technical differences from the existing GPT series. Here are some possible technical differences and evolutionary directions:

1. Evolution of the foundational architecture

  • New architecture: Orion might adopt a newer model architecture, no longer confined to the Transformer architecture used by the GPT series. It might introduce more efficient model components or entirely new neural network designs to enhance computational efficiency, reasoning capabilities, and generation quality.

  • Modular design: Orion might employ a more modular design, making it easier to integrate different types of data sources (such as text, images, audio, etc.), thereby achieving multi-modal processing capabilities. This could be a significant distinction compared to the existing GPT models.

2. Multi-modal capabilities

  • Cross-modal learning: Orion might not just be a language model but a model capable of processing and generating various forms of data, such as images, sounds, and videos. This multi-modal capability could enhance the model's understanding and generation abilities by integrating different types of data, making it suitable for a broader range of application scenarios.

  • Joint Training: Orion may adopt a joint training approach, allowing the model to learn across multiple modalities simultaneously, thereby providing a more comprehensive and accurate understanding of context and content generation. This could be a key difference from traditional GPT models, which primarily focus on text data.
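
To make the joint-training idea above a bit more concrete, below is a minimal, purely illustrative sketch of the general technique: features from two modalities are projected into one shared embedding space, where matching text-image pairs can be pulled together by a contrastive objective. All names, dimensions, and data here are hypothetical stand-ins; this is not a description of OpenAI's architecture.

```python
import numpy as np

# Toy sketch of a shared embedding space for two modalities (assumed sizes).
rng = np.random.default_rng(0)
TEXT_DIM, IMAGE_DIM, SHARED_DIM = 512, 768, 256

# Stand-ins for the outputs of modality-specific encoders: 4 captions, 4 images.
text_features = rng.normal(size=(4, TEXT_DIM))
image_features = rng.normal(size=(4, IMAGE_DIM))

# Learnable projections into the shared space (randomly initialized here).
W_text = rng.normal(size=(TEXT_DIM, SHARED_DIM)) / np.sqrt(TEXT_DIM)
W_image = rng.normal(size=(IMAGE_DIM, SHARED_DIM)) / np.sqrt(IMAGE_DIM)

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

text_emb = l2_normalize(text_features @ W_text)
image_emb = l2_normalize(image_features @ W_image)

# Cosine similarities between every caption and every image. During joint
# training, a contrastive loss would push the diagonal (matching pairs) up
# and the off-diagonal entries down, so both modalities learn one space.
similarity = text_emb @ image_emb.T
print(similarity.round(2))
```

The point of the sketch is only that a single model can relate different data types to each other once they live in a shared representation, which is what a jointly trained multi-modal model would exploit.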

3. Security and Controllability

  • Enhanced Control Mechanisms: In terms of security and controllability, Orion may introduce more built-in control mechanisms to avoid generating harmful or inappropriate content. This could include new filtering algorithms, enhanced contextual understanding, and higher-level interpretability features.

  • Dynamic Adjustment Capability: Orion may possess stronger dynamic adjustment capabilities, allowing it to adjust the model's output style and content in real-time based on user needs and feedback, something that the GPT series currently struggles with in certain aspects.

4. Optimized Computing and Energy Efficiency

  • Computing Efficiency: Orion may employ new techniques to improve computing efficiency, such as sparse activation, compression, or other forms of optimization, reducing the consumption of computing resources while maintaining model performance (a brief, generic sketch of sparse activation follows at the end of this section). This could make Orion more cost-effective in practical applications than the GPT models.

  • Energy Efficiency: In terms of energy consumption, Orion may focus more on optimization, considering environmental impact and resource constraints, which will become increasingly important in the future development of AI.
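
As mentioned above, sparse activation is one of the better-known ways to cut compute without shrinking a model's total capacity. Below is a minimal mixture-of-experts-style sketch of the general idea, in which a router sends each token to only a couple of expert sub-networks. Everything here (expert counts, sizes, weights) is invented for illustration and says nothing about how Orion is actually built.

```python
import numpy as np

# Toy sparse-activation layer: only TOP_K of NUM_EXPERTS experts run per token.
rng = np.random.default_rng(1)
NUM_EXPERTS, TOP_K = 8, 2
D_MODEL, D_HIDDEN = 64, 256

router_w = rng.normal(size=(D_MODEL, NUM_EXPERTS)) / np.sqrt(D_MODEL)
experts_w1 = rng.normal(size=(NUM_EXPERTS, D_MODEL, D_HIDDEN)) / np.sqrt(D_MODEL)
experts_w2 = rng.normal(size=(NUM_EXPERTS, D_HIDDEN, D_MODEL)) / np.sqrt(D_HIDDEN)

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def sparse_layer(token):
    """Route one token vector through only the TOP_K highest-scoring experts."""
    scores = softmax(token @ router_w)            # one score per expert
    chosen = np.argsort(scores)[-TOP_K:]          # indices of the selected experts
    weights = scores[chosen] / scores[chosen].sum()
    out = np.zeros(D_MODEL)
    for w, e in zip(weights, chosen):
        hidden = np.maximum(token @ experts_w1[e], 0.0)  # expert feed-forward, ReLU
        out += w * (hidden @ experts_w2[e])
    return out

token = rng.normal(size=D_MODEL)
print(sparse_layer(token).shape)  # (64,): only 2 of the 8 experts did any work
```

Because most experts sit idle for any given token, the compute per token stays roughly constant even as the total parameter count grows, which is why techniques in this family are attractive once scaling costs become the bottleneck.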

5. Expansion of Application Scenarios

  • Industry Customization: Orion may focus more on industry-specific applications, providing optimized models tailored to the needs of different industries, rather than just being a general-purpose large language model. This customization could involve deeper integration of industry knowledge and training on specific domain data.

  • Enhanced Interactivity: Orion may significantly improve in terms of interactivity, better understanding user intentions and needs, and providing a more natural and intelligent interactive experience.

In summary, as a direction for the development of large language models, Orion may bring significant expansion and improvement in architecture, functionality, and application scenarios.

As for the future relationship between Orion and GPT, whether Orion turns out to be a short-term measure to get past the bottleneck, a long-term replacement for GPT, or one of two model routes evolving side by side will largely be determined by how OpenAI develops. We can only wait and see.
