Computer vision a crucial bridge between AI and human intelligence, says Roboflow CEO

The ultimate aim of modern computing advancements, such as artificial intelligence and machine learning, is to make as much of the human experience as possible programmable.

And with the advancements in generative AI being led by companies such as Roboflow Inc., we might be witnessing the maturity of computer vision and the expansion of modern software capabilities all around.

“Roboflow exists to really make the world programmable,” said Joseph Nelson (pictured), co-founder and chief executive officer of Roboflow. “And our North Star is enabling developers predominantly to build that future. But the limiting reactant is how to enable computers and machines to understand things as well as people can. And, in many ways, computer vision is that missing element that enables anything you see to become software. If software is eating the world, computer vision makes the aperture infinitely wide.”

Nelson spoke with theCUBE industry analyst John Furrier at the AWS Startup Showcase: “Top Startups Building Generative AI on AWS” event, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed the current state of AI and how the playing field has advanced from just a few years ago. (* Disclosure below.)

LLMs and their impact on the AI landscape

Everyone’s talking about large language models, such as ChatGPT and Bard, and taking advantage of their vast spectrum of functions. However, even these super-capable tools have a notable deficiency, according to Nelson.

“The rise of large language models is showing what’s possible, especially with text,” he explained. “Although there’s this core missing element of understanding. The rise of large language models creates this new area of generative AI. In the context of computer vision, it is a lot of creating video and image assets and content. There’s also this whole surface area to understanding what’s already created — basically digitizing physical, real-world things.”

In essence, computer vision links virtual, AI-driven experiences to the physical ones with which we interact in our everyday lives. And mirroring these experiences will be crucial in cases such as the budding metaverse, Nelson added.

“The metaverse can’t be built if we don’t know how to mirror, create or identify the objects that we wanna interact with in our everyday lives,” he said. “Where computer vision comes to play, especially with what we’ve seen at Roboflow, is a little over 100,000 developers now have built with our tools over 10,000 pre-trained models using more than 100M labeled open-source images.”

Human intuition and decision-making, advanced as they are, remain fallible. Generative AI, as expressed in these LLMs, imbues computers with the logic, reasoning and critical thinking to fully understand visual and auditory input cues and compensate for human shortcomings, Nelson concluded.

Computer vision today vs. a few years ago

Computer vision describes a set of processes that give machines and computers the ability to act on visual data as effectively as humans. Typically, these capabilities have seen heavy use in tasks such as object identification, classification and manipulation.

“Then you have key point detection, which is where you see athletes on screen and each of their joints is outlined,” Nelson explained. “This is another more traditional type of problem in signal processing and computer vision.”

The subfield is bringing about a reimagining of what’s possible within artificial intelligence, setting the course for nano-level precision and accuracy in carrying out tasks. One example is Rivian Automotive Inc., an electric car company and Roboflow customer.

“One of our customers Rivian, in tandem with AWS, is tackling visual quality assurance and manufacturing in production processes,” Nelson explained. “Now, only Rivian knows what a Rivian is supposed to look like. Only they know the imagery of what their goods that are gonna be produced are. And then between those long tails of proprietary data with highly specific things in the center of the curve, you have a whole kind of messy middle type of problem.”

ML model requirements are only going to become even more complex. And as that happens, companies are going to rely on techniques like computer vision to efficiently and effectively feed those models with the most important resource of all, data.

“My mental model for how computer vision advances is this: You have that bell curve, and you have increasingly powerful models that eat outward,” Nelson stated. “And multimodality has a role to play in that; larger models also have a role to play in that. The existence of more compute and data also has a role to play in that.”

Here’s the complete video interview, part of SiliconANGLE’s and theCUBE’s coverage of the AWS Startup Showcase: “Top Startups Building Generative AI on AWS” event:

(* Disclosure: Roboflow Inc. sponsored this segment of theCUBE. Neither Roboflow nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Photo: SiliconANGLE

