The post office was home to one of the first widespread use cases of AI: using computers to decode handwriting and recognize addresses. Sean McGee, head of product management at Samsara, said that problem took over a decade to crack. A few years later came image recognition, a step up from handwriting recognition, which also took years to crack, he said.
“Now the breakthroughs are happening every week,” McGee said. “If you go on vacation, and you stop reading the newspaper, you will miss some major breakthrough in AI.”
AI and machine learning have been around for a long time, but generative AI kicked off in earnest with OpenAI’s launch of ChatGPT, said Toki Sherbakov, head of solutions architecture at OpenAI. Sherbakov shared insights into the future of AI in operations during a session at this year’s Samsara Beyond conference in San Diego.
[RELATED: Samsara's new AI features pinpoint safety]
When ChatGPT launched in 2022, OpenAI didn’t expect it to go viral like it did. It started with a silent launch – what the company considered a low-key research preview, Sherbakov said. The platform reached more than 100 million users within two months, kicking off the wave of generative AI.
“This stuff is just dramatically accelerating, and not only is AI getting smarter, but it's also getting dramatically cheaper,” McGee said. “It's the fastest thing that we've really seen in technology, and it's happening right now. The practical implication of that for operations is we can apply AI everywhere. This is no longer a thing where we can only do it for the highest value uses.”
Pathways to AI
The most common first step into AI, Sherbakov said, is getting it into the hands of workers to build AI literacy from the ground up. The second is thinking about how the organization can use AI to automate operations, such as repetitive workflows. The third is building custom AI into products that serve end users.
It starts with AI agents: a human gives the AI model a task, and the agent works through the problem on its own, Sherbakov said.
For example, OpenAI’s ChatGPT includes a feature called deep research, which condenses into minutes the hours of research it would take a human to perform.
Sherbakov said he asked ChatGPT about shipping and logistics to prepare for the conference and learned that Mexico recently overtook China as the U.S.’s largest trading partner and about how tariffs impact the logistics market.
“And all I did was send in the question. It went and searched 83 different data sources, thought for 14 minutes and gave me back a really nice synthesized report,” he said. “I could not do that. It would take me weeks to do that, probably, but this is now something in your pocket that can do really well.”
Users can also point the AI to specific knowledge bases or custom data sources to make it even more capable.
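Sherbakov didn’t go into the mechanics, but the simplest version of that pattern is to include text from your own data source directly in the prompt. Below is a minimal sketch, assuming the OpenAI Python SDK; the knowledge-base file, the question and the model name are placeholders, not anything described at the conference.

```python
# Minimal sketch: grounding a model's answer in a custom data source by
# putting that data in the prompt. Assumes the OpenAI Python SDK
# ("pip install openai") and an OPENAI_API_KEY in the environment.
from pathlib import Path

from openai import OpenAI

client = OpenAI()

# Hypothetical knowledge base: a plain-text export of internal shipping procedures.
knowledge = Path("shipping_procedures.txt").read_text()

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[
        {"role": "system",
         "content": "Answer using only the reference material provided."},
        {"role": "user",
         "content": f"Reference material:\n{knowledge}\n\n"
                    "Question: What is our cutoff time for same-day dispatch?"},
    ],
)

print(response.choices[0].message.content)
```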
Another agent in ChatGPT, called Operator, has the ability to use a computer like a human: it can operate inside a user interface and submit information. Sherbakov said it can be used to go into a platform like Salesforce and fill out forms.
“I think we're kind of just scratching the surface of what's possible with this,” he said. “It's getting really good at being able to navigate different user interfaces.”
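The session didn’t cover how a computer-use agent works under the hood, but the common pattern is an observe-act loop: the model looks at the current screen, proposes one UI action, and a harness carries it out. The sketch below is purely conceptual; every function and type in it is a hypothetical placeholder, not an OpenAI or Salesforce API.

```python
# Conceptual sketch of the observe-act loop behind a computer-use agent.
# Everything here is a hypothetical stub: a real harness would wire these
# functions to a browser and to a vision-capable model.
from dataclasses import dataclass


@dataclass
class Action:
    kind: str          # e.g. "click", "type", "done"
    target: str = ""   # e.g. a field name or on-screen element
    text: str = ""     # text to type, if any


def capture_screenshot() -> bytes:
    """Placeholder: grab the current screen so the model can see the UI."""
    return b""


def propose_next_action(task: str, screenshot: bytes) -> Action:
    """Placeholder: ask a vision-capable model for the single next UI action."""
    return Action(kind="done")


def execute_action(action: Action) -> None:
    """Placeholder: perform the click/type/scroll in the real interface."""


def fill_out_form(task: str, max_steps: int = 20) -> None:
    """Observe the screen, take one action, repeat until the model says done."""
    for _ in range(max_steps):
        screenshot = capture_screenshot()
        action = propose_next_action(task, screenshot)
        if action.kind == "done":
            break
        execute_action(action)


fill_out_form("Create a new lead in the CRM for Acme Freight")
```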
There is also the capability within ChatGPT to build custom agents around your own data to automate workflows at scale.
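One common way to build such a custom agent is to expose a step of your own workflow as a tool the model can decide to call. Here is a minimal sketch of that pattern using tool calling in the OpenAI Python SDK; the `flag_unsafe_driver` tool, its fields and the model name are hypothetical illustrations, not a Samsara or OpenAI feature.

```python
# Minimal sketch: a custom agent that can trigger one workflow step via tool
# calling. Assumes the OpenAI Python SDK; the tool and model name are placeholders.
import json

from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "flag_unsafe_driver",
        "description": "Open a coaching ticket for a driver flagged by telematics.",
        "parameters": {
            "type": "object",
            "properties": {
                "driver_id": {"type": "string"},
                "reason": {"type": "string"},
            },
            "required": ["driver_id", "reason"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{
        "role": "user",
        "content": "Driver D-481 had three hard-braking events today. "
                   "Decide whether to open a coaching ticket.",
    }],
    tools=tools,
)

# If the model chose to call the tool, hand its arguments to your own system.
for call in response.choices[0].message.tool_calls or []:
    args = json.loads(call.function.arguments)
    print(f"Would open ticket for {args['driver_id']}: {args['reason']}")
```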
“We really start to see this transforming many industries,” Sherbakov said.
The future of AI
Sherbakov said the next era of AI will see agents grow in their reasoning capabilities. That means agents will spend more time thinking before they answer, producing higher-quality, more accurate responses with fewer hallucinations (incorrect outputs from AI models).
“These reasoning models are getting a lot better, so for more complex problems, we're really going to see the ceiling be raised here for the capability of these recent models,” he said.
Another trend shaping the future of AI is native multi-modality: models trained from scratch on multiple data modalities, such as text, images and audio, at the same time, rather than built by combining separately trained components.
“If you use things like ChatGPT, it's very text based. These models can do a lot more than just understand text. They can hear; they can talk; they can see,” Sherbakov said. “So we'll see models that get a lot better at speaking, at listening and at being able to process videos and images to power more of these natural interactions with AI.”
And to round it all out, AI is getting faster and cheaper, he said, adding that ChatGPT is 99% cheaper now than when it initially launched.
“I expect this trend to continue,” Sherbakov said. “It's kind of like the cost of electricity. It ends up hopefully becoming ubiquitous, where it's something you wouldn't even really notice the cost.”