THE BEST Photo to 3D AI Model!

April 1st 2022

NVIDIA's Instant NeRF turns a handful of photos into a 3D scene. The results have dramatically improved upon the first model I covered in 2020, called NeRF: the quality is comparable, if not better, and it is more than 1,000 times faster, after less than two years of research. The code is open source, and the paper is titled "Instant Neural Graphics Primitives with a Multiresolution Hash Encoding".

@whatsai

Louis Bouchard

I explain Artificial Intelligence terms and news to non-experts.


As if taking a picture weren't a challenging enough technological feat, we are now doing the opposite: modeling the world from pictures. I've covered amazing AI-based models that could take images and turn them into high-quality scenes. It is a challenging task that consists of taking a few images from the 2-dimensional picture world and working out how the object or person would look in the real world.

Take a few pictures and instantly have a realistic model to insert into your product. How cool is that?!

The results have dramatically improved upon the first model I covered in 2020, called NeRF. And this improvement isn’t only about the quality of the results. NVIDIA made it even better.

Not only is the quality comparable, if not better, but it is also more than 1,000 times faster, with less than two years of research.

Watch the video

References

►Read the full article: https://www.louisbouchard.ai/nvidia-photos-into-3d-scenes/
►NVIDIA's blog post (credit to video): https://blogs.nvidia.com/blog/2022/03/25/instant-nerf-research-3d-ai/
►NVIDIA's video: https://nvlabs.github.io/instant-ngp/assets/mueller2022instant.mp4
►Paper: Thomas Müller, Alex Evans, Christoph Schied and Alexander Keller, 2022, "Instant Neural Graphics Primitives with a Multiresolution Hash Encoding", https://nvlabs.github.io/instant-ngp/assets/mueller2022instant.pdf
►Project link: https://nvlabs.github.io/instant-ngp/
►Code: https://github.com/NVlabs/instant-ngp
►My Newsletter (A new AI application explained weekly to your emails!): https://www.louisbouchard.ai/newsletter/

Video Transcript

As if taking a picture wasn't a challenging enough technological feat, we are now doing the opposite: modeling the world from pictures. I've covered amazing AI-based models that could take images and turn them into high-quality scenes. It's a challenging task that consists of taking a few images in the two-dimensional picture world to create what the object or person would look like in the real world. You can easily see how useful this technology is for many industries like video games, animation, movies, or advertising: take a few pictures and instantly have a realistic model to insert into your product.

The results have dramatically improved upon the first model I covered in 2020, called NeRF, and this improvement isn't only about the quality of the results. NVIDIA made it even better. Not only is the quality comparable, if not better, but it's more than one thousand times faster, with less than two years of research.

This is the pace of AI research: exponential gains in quality and efficiency, a big factor that makes this field so incredible. You will be lost with the new techniques and quality of the results if you miss just a couple of days, which is why I first created this channel and why you should also subscribe. Just look at those 3D models. These cool models only needed a dozen pictures, and the AI guessed the missing spots and created this beauty in seconds. Something like this took hours to produce with NeRF. Let's dive into how they made this much progress on so many fronts in so little time.

But first, I'd like to take a few seconds to talk about Activeloop, an amazing company I recently stumbled on, and they are now sponsoring this video. Activeloop is becoming popular with its open-source dataset format for AI, Hub, one of the top 10 Python packages in 2021. With Activeloop Hub, you can treat your datasets as NumPy-like arrays. As a result, you have a simple dataset API for creating, storing, version-controlling, and querying AI datasets of any size. It's perfect for collaborating with your team and iterating on your datasets. The feature I like the most is being able to stream my datasets while training models in PyTorch or TensorFlow. This means anyone can access any slice of the data and start training models in seconds, no matter how big the dataset is. Just like that. How cool is that? With all these neat features, Hub definitely frees me from building data pipelines so I can train my models faster. Activeloop has just released more than 100 image, video, and audio datasets, available almost instantly with a single line of code. Try them out in your workflows and let me know in the comments below how it works. I'd love to know what you build with them.

Instant NeRF attacks the task of inverse rendering, which consists of rendering a 3D representation from pictures, a dozen in this case, approximating the real shape of the object and how light will behave on it so that it looks realistic in any new scene. Here, NeRF stands for neural radiance fields. I will only do a quick overview of how NeRFs work, as I already covered this kind of network in multiple videos, which I invite you to watch for more detail and a better understanding.

Quickly, NeRFs are a type of neural network. They take images and camera settings as inputs and learn how to produce an initial 3D representation of the objects or scenes in the picture, then fine-tune this representation using learned parameters from a supervised learning setting. This means that we need a 3D object and a few images of it at different known angles to train it, and the network will learn to recreate the object. To make the results as good as possible, we need pictures from multiple viewpoints, like this, to be sure we capture all or most sides of the object. And we train this network to understand general object shapes and light radiance. We are asking it to learn how to fill in the missing parts based on what it has seen before, and how light reacts to them in the 3D world.

Basically, it would be like asking you to draw a human without giving any details on the hands. You'd automatically assume the person has five fingers based on your knowledge. This is easy for us, as we have many years of experience under our belt and one essential thing current AIs are lacking: our intelligence. We can create links where there are none and do many unbelievable things. On the opposite side, AI needs specific rules, or at least examples to follow, which is why we need to show it what an object looks like in the real world during its training phase so it can improve. Then, after such a training process, you only feed the images with the camera angles at inference time, and it produces the final model in a few hours. Did I say a few hours? I'm sorry, I was still in 2021. It now does that in a few seconds.

This new version by NVIDIA, called Instant NeRF, is indeed 1,000 times faster than its NeRF predecessor from a year ago. Why? Because of multi-resolution hash grid encoding. Multi-what? Multi-resolution hash grid encoding. They explained it very clearly with this sentence:

"We reduce the cost with a versatile new input encoding that permits the use of a smaller network without sacrificing quality, thus significantly reducing the number of floating point and memory access operations."

In short, they change how the NeRF network sees the inputs, that is, our initial 3D model prediction, making them more digestible and information-efficient so that a smaller network can be used while keeping the quality of the outputs the same. Keeping such high quality with a smaller network is possible because we are not only learning the weights of the NeRF network during training but also the way we transform those inputs beforehand. So the input is transformed using trained functions, here steps one to four, compressed in a hash table to focus on valuable information extremely quickly, and then sent to a much smaller network in step five, as the inputs are similarly much smaller now. They store values of any type in the table, with keys indicating where they are stored, for super-efficient parallel modifications and removing the lookup time for big arrays during training and inference. This transformation and a much smaller network are why Instant NeRF is so much faster and why it made it into this video. And voilà, this is how NVIDIA is now able to generate 3D models like these in seconds.
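For readers who want to see the idea in code, here is a rough sketch of a multi-resolution hash grid encoding. It is a pure-Python illustration under stated assumptions, not NVIDIA's implementation (theirs runs on the GPU). The function names (`level_resolution`, `hash_corner`, `make_tables`, `encode`) are made up for this example, and the default values and hashing primes are what I recall from the paper, so check them against the paper before relying on them. The tables here are randomly initialized and never trained, every level is hashed even when a direct index would do, and the small network that would consume the features is left out.

```python
import math
import random

# Per-dimension multipliers for the spatial hash (assumed from the paper).
PRIMES = (1, 2654435761, 805459861)


def level_resolution(level, n_levels=16, n_min=16, n_max=512):
    """Grid resolution at a level, growing geometrically from n_min toward n_max."""
    if n_levels == 1:
        return n_min
    growth = math.exp((math.log(n_max) - math.log(n_min)) / (n_levels - 1))
    return int(math.floor(n_min * growth ** level))


def hash_corner(ix, iy, iz, table_size):
    """XOR the integer grid coordinates times their primes, then wrap to the table."""
    h = 0
    for coord, prime in zip((ix, iy, iz), PRIMES):
        h ^= coord * prime
    return h % table_size


def make_tables(n_levels, table_size, n_features, seed=0):
    """One table of small random feature vectors per level (learned in the real method)."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1e-4, 1e-4) for _ in range(n_features)]
         for _ in range(table_size)]
        for _ in range(n_levels)
    ]


def encode(point, tables, n_min=16, n_max=512):
    """Encode a point in [0, 1]^3 as the concatenation of per-level features."""
    n_levels = len(tables)
    features = []
    for level, table in enumerate(tables):
        res = level_resolution(level, n_levels, n_min, n_max)
        scaled = [c * res for c in point]
        base = [int(math.floor(s)) for s in scaled]
        frac = [s - b for s, b in zip(scaled, base)]
        out = [0.0] * len(table[0])
        # Trilinear interpolation over the 8 corners of the surrounding grid cell.
        for corner in range(8):
            offs = [(corner >> d) & 1 for d in range(3)]
            weight = 1.0
            for d in range(3):
                weight *= frac[d] if offs[d] else 1.0 - frac[d]
            idx = hash_corner(base[0] + offs[0], base[1] + offs[1],
                              base[2] + offs[2], len(table))
            for f in range(len(out)):
                out[f] += weight * table[idx][f]
        features.extend(out)
    return features


tables = make_tables(n_levels=4, table_size=2 ** 10, n_features=2)
vec = encode((0.3, 0.6, 0.9), tables)
print(len(vec))  # 4 levels x 2 features = 8 values
```

In the real method, the table entries are trained together with the network weights, which is what lets the network itself stay small.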

If this wasn't cool enough, I said that it can store values of any type, which means that this technique can be used not only with NeRFs but also with other super cool applications, like gigapixel images, that become just as incredibly efficient.

Of course, this was just an overview of this new paper attacking this super interesting task in a novel way. I invite you to read their excellent paper for more technical detail about the multi-resolution hash grid encoding approach and their implementation. A link to the paper and their code is in the description below. Thank you for watching the whole video. Please take a second to let me know what you think of the overall quality of the videos and new editing. I will see you next week with another amazing paper.


Watch more on YouTube: https://www.youtube.com/c/WhatsAI
