OpenAI’s ChatGPT is a testament to the captivating power of a product done right. The platform’s reported rise to 100 million users in just two months was unprecedented at the time. Although the underlying technology had been available for a while, the sudden surge in adoption was driven by ChatGPT’s user-friendly experience. Now, established companies are seeking ways to integrate chatbots into their offerings, while numerous startups are hastily building businesses entirely on OpenAI’s chat completion API.
While we are familiar with using text chat for human interaction, the idea of fluent conversation with AI was science fiction until recently. The intuitive chat format lends itself to simple inquiries, iterative work, and conversation-like interactions we would have with a human. However, it is less natural for structured inputs (like filling out a form) or performing actions within software (like saving a document in a folder or hitting "send" on an email). Therefore, leaders should be cautious about assuming chat should now be the default method of AI interaction.
The potential of generative AI (GenAI) extends far beyond chatbots.
Instead of looking only to ChatGPT and its derivatives for product inspiration, leaders should consider how language models can be built into their product stack to provide real value. Traditional AI models excel at problems with specific inputs and outputs, but real-world problems are usually more complex and often break down into several stages. Using a different model at each stage can help bridge modalities (such as image to text) or address a specific aspect of the problem more effectively.
A prime example is "Nothing, Forever," an endless, AI-generated stream in the style of the sitcom Seinfeld. According to a Hacker News post and subsequent reports, one server orchestrates the AI/ML processes and API, while another program assembles and choreographs the cameras and characters. Although this is an experimental piece of art, it demonstrates the potential to automate complex, generative work by orchestrating multiple large language models and other services.
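To make the idea of chaining models concrete, here is a minimal sketch in Python. It is not how "Nothing, Forever" is built; the stage names and the two stub functions (`caption_image`, `write_scene`) are hypothetical stand-ins for calls to real models or services, and are assumed only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    name: str
    run: Callable[[object], object]  # stand-in for a call to a model or service


def caption_image(image_bytes: bytes) -> str:
    # Hypothetical image-to-text step; a real one would call a vision model.
    return f"an image of {len(image_bytes)} bytes"


def write_scene(caption: str) -> str:
    # Hypothetical text-to-text step; a real one would call a language model.
    return f"NARRATOR: We open on {caption}."


def run_pipeline(stages: List[Stage], payload: object) -> object:
    # Feed each stage's output into the next stage, in order.
    for stage in stages:
        payload = stage.run(payload)
    return payload


pipeline = [Stage("caption", caption_image), Stage("script", write_scene)]
result = run_pipeline(pipeline, b"\x00" * 16)
print(result)
```

Each stage changes the shape of the data (bytes to a caption, a caption to a line of script), which is the sense in which a pipeline can move between modalities. Keeping the stages as an ordered list also makes it straightforward to swap one model for another without touching the rest.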
Sequoia Capital also reports that many of the companies in its network are experimenting with AI in their tech stacks. Although not all of the applications are in production, they span a variety of use cases beyond chat. Product teams are buying and building ways to efficiently assemble context for their models, and they frequently combine multiple models and data types, drawing either on different API providers or on models trained for their specific purposes.
Multimodal AI also multiplies risks, in enterprise and in open-source.
It's crucial to note that the high potential of AI or LLM-based products comes with heightened risks. Many LLMs remain enigmatic 'black boxes,' and even models with openly released weights may not come with openly released training data. Although generative AI has a remarkable ability to generate content and work with large datasets, chaining models together introduces real risks of cascading and catastrophic mistakes. The output of each step needs to be strictly checked to ensure that the input to the subsequent step is acceptable. When failures occur, they can have serious consequences, as exemplified by a later incident with "Nothing, Forever." The stream was temporarily banned from Twitch after a service outage resulted in transphobic statements being broadcast from an unmoderated model.
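One way to picture a check between steps is a small wrapper that inspects each output before passing it on. The sketch below is an illustration under stated assumptions, not a recommended moderation system: the keyword `BLOCKLIST`, the `is_acceptable` check, and the `echo_model` stub are all hypothetical, and a production system would more likely rely on a dedicated moderation model or service.

```python
from typing import Callable

BLOCKLIST = {"forbidden"}  # placeholder terms, assumed for illustration
FALLBACK = "[content withheld]"


def is_acceptable(text: str) -> bool:
    # Reject empty output and any output containing a blocklisted word.
    words = {w.strip(".,!?").lower() for w in text.split()}
    return bool(text.strip()) and words.isdisjoint(BLOCKLIST)


def guarded(step: Callable[[str], str],
            check: Callable[[str], bool],
            fallback: str) -> Callable[[str], str]:
    # Wrap a pipeline step so its output is checked before it moves on.
    def wrapper(payload: str) -> str:
        output = step(payload)
        try:
            ok = check(output)
        except Exception:
            ok = False  # fail closed if the checker itself is unavailable
        return output if ok else fallback
    return wrapper


def echo_model(prompt: str) -> str:
    # Hypothetical stand-in for a generative model call.
    return prompt


safe_step = guarded(echo_model, is_acceptable, FALLBACK)
print(safe_step("hello there"))
print(safe_step("this is forbidden!"))
```

The design choice worth noting is that the wrapper fails closed: if the check rejects the output, or the check itself raises an error (for example, because a moderation service is down), the fallback text is returned instead of the raw model output.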
Although ChatGPT and other chat-centered experiences are capturing the current headlines, leaders must stay abreast of developments in the open-source community, such as novel applications of Stable Diffusion and the release of commercial code-completion models. In addition, they should explore opportunities to incorporate LLMs and other models to solve increasingly complex problems.
The power of generative AI extends far beyond successful chatbots like OpenAI's ChatGPT. Its real potential lies in solving complex, multimodal problems through the orchestration of different AI models. Yet this approach is not without risk: failures can be significant, and the 'black box' nature of many models presents its own challenges. Leaders should therefore embrace the broader possibilities of AI beyond user-facing interfaces, while remaining vigilant about the inherent complexities and risks involved.