
What Does a Deep Learning Architect at NVIDIA Do? (Video Interview)

 1 year ago
source link: https://hackernoon.com/what-is-a-deep-learning-architect-at-nvidia-video-interview

March 20th, 2023
1 min
by @whatsai

Louis Bouchard

@whatsai

I explain Artificial Intelligence terms and news to non-experts.

Too Long; Didn't Read

Adam Grzywaczewski, a senior deep learning architect at NVIDIA, is interviewed in this podcast.

He obtained a Ph.D. in information retrieval systems in 2013, before AI was trendy. With over 6 years of experience at NVIDIA, Adam has helped several companies scale their models and is well-versed in NLP.

The interview covers topics such as Adam's Ph.D. research, the deep learning architect role, working at NVIDIA, his favorite tools, the challenges of scaling models, and more. Gain insights from an expert in NLP and model scaling.



Watch the Video

As the final installment in the NVIDIA partnership interview series, this podcast also includes a giveaway of an RTX 4080 GPU.

Louis: Adam Grzywaczewski, deep learning architect at Nvidia for six years. Before Nvidia, Adam did a PhD in information retrieval systems, which ended in 2013, and then worked as a research engineer at Jaguar Land Rover. This interview is the third and last interview in partnership with Nvidia and the GTC event, and it's your last chance to win an RTX 4080: you just have to attend the free GTC event, take a screenshot, and send it to me. You will see that there are a lot of incredibly interesting talks, including the ones we'll discuss in this interview. I hope you enjoy it.
Louis: Could you go over your background? I've seen that you've done a PhD and worked as a research scientist, and now you're at Nvidia. I would love to go back a few years, especially into your academic background, and then how you transitioned into Nvidia.

Adam: Sure. So, you know, life is a journey in a sense, and many things happen by accident. At university, when I was still in Poland, we had been toying with this idea, with my colleague, to start a business focused on finding tenders. I used to live in a large metropolitan area with a lot of different cities, each with a different tender process, so this is how I started looking at the problem of information retrieval. That bit didn't work out, but on the back of it, at some point I submitted an application for funding of a PhD focused on information search, the process of finding information, not necessarily on the internet. And funnily enough, despite the fact that statistically I didn't have very high chances, someone liked it. That particular application was rejected, but one of the reviewers suggested that I submit it to a different funding body, and there it was accepted. Funnily enough, this is how I started doing my PhD focused on, you know, finding information. I specialized in supporting software engineers in the software development process. Many of you have most likely heard about Visual Studio, the programming interface from Microsoft, and right now Microsoft has this thing called Codex. Effectively, my PhD was focused on building something like that, obviously not with neural networks, because that was before the time of neural networks, but with much more traditional approaches. Effectively, we had been trying to, you know, crawl various online code repositories and provide inline code recommendations that would go beyond what IntelliSense was then.
Louis: And when was this?

Adam: When was it? I think I started my PhD in, what was it, 2011. So definitely a time before... It's not that neural networks weren't known, but at that point in time no one perceived them as something that could work particularly well, and the computation didn't exist to train anything meaningful. So during my PhD I was, you know, dealing with much more conventional, algorithm-related work, and I also spent a lot of time looking at the human behaviour of search, so opportunistic programming: how people actually formulate their search terms. And funnily enough, when I was finishing my PhD, even before I finished, it turned out, because I did it in the Midlands, that nearby Jaguar Land Rover was just kicking off a part of its research department focusing on telematics, so connected cars, and they needed people who understand machine learning. I had a conversation that sounded interesting, so right after submitting my thesis, even before the defence of the thesis itself, I moved to Jaguar Land Rover Research. Then we did a lot of different things; for those who are interested, just go to YouTube and type "self-learning car Jaguar Land Rover". That was one of the projects that I helped to shape. I wasn't working on it myself, it was a very large group, but that was one of the bigger things I was focusing on, along with the rollout of telematics.

From there it was just a fairly organic journey. Part of those projects involved neural networks. Coincidentally, it wasn't a super aware life decision at that point in time; it wasn't obvious that neural networks would actually work as well as they do. But we had looked at Andrew Ng's and Baidu's work on automatic speech recognition, because this is something that was needed: automatic speech recognition in the car is complicated, you have a lot of background noise. So we looked at that and started to reproduce some of that work, and this is how I, beyond just a traditional machine learning background, gained a bit of a background in automotive. And it so happened that an Nvidia recruiter was searching for people, noticed my profile, and reached out. After I don't know how many conversations, and I think the process took nine months, because it wasn't that I applied and went through the interview process immediately, there was a bit of back and forth, I joined Nvidia in 2017 and started to help with, at that time, predominantly automotive as well: working a lot with OEMs, so car manufacturers, and Tier 1 suppliers, predominantly focusing on perception for the self-driving car and trying to help them define the process of training perception algorithms.

That was, at that point in time, super difficult. You have to appreciate that in 2017 a lot of things that we take for granted right now didn't work. Humankind didn't know how to scale neural networks; we just didn't, and that manifests in many different ways. We didn't know how to make them deeper, because they would explode and not converge. For those of you who don't believe me, just look at some older papers. For example, I think it was Inception: you'll notice that it has multiple losses, one at the top but also some on the side, so some tricks to stabilize the training process. And we definitely didn't know how to train well in a data-parallel way with large batch sizes, not to mention that the tools didn't exist: things like Horovod weren't available. So there was quite a lot of work around that. Also, the understanding of hardware was just starting to form, and we had to discover a lot of things that are obvious today. And that's the journey. On the back of that I learned how to scale neural networks, so when this new revolution in natural language processing started with GPT and BERT, it became obvious that unsupervised training had a chance of working, but that it would need very large models and very large datasets. I focused on that, and this is where I am right now.

Louis: Basically, you were already trying to scale the models that existed in those days; it's just that the hardware didn't allow it, and, you know, no one knew how to do it.

Adam: Yeah, so that knowledge didn't exist. I very distinctively remember NeurIPS 2017 in December; I attended, and there was a workshop on AI at HPC scale. No one knew. You know, there were people from Google and Facebook, and no one knew how to scale those models. It wasn't obvious what to do: when the batch size exceeds a certain threshold, very bad things happen to optimization. I think someone said that friends don't let friends train with large batch sizes, that it's good for their mental health, and that was true at that point in time. Now, obviously, you can train with extremely large batch sizes.
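The data-parallel training Adam describes can be sketched in a few lines. This is an illustrative toy under my own assumptions, not Horovod's or NVIDIA's actual code: each worker computes a gradient on its own shard of the data, an "allreduce" averages the per-worker gradients, and every worker applies the identical update. With equal shard sizes, the averaged gradient equals the gradient of the combined large batch, which is exactly why scaling out workers grows the effective batch size and stresses the optimizer.

```python
def local_gradient(w, shard):
    """Average gradient of the loss 0.5*(w*x - y)**2 over one worker's shard
    of (x, y) pairs, for a toy one-parameter linear model y ≈ w*x."""
    return sum((w * x - y) * x for x, y in shard) / len(shard)

def data_parallel_step(w, shards, lr=0.1):
    """One synchronous data-parallel SGD step across several workers."""
    # Each worker computes a gradient on its own shard (in parallel, in reality).
    grads = [local_gradient(w, shard) for shard in shards]
    # "Allreduce": average the per-worker gradients so every worker sees the same value.
    g = sum(grads) / len(shards)
    # Every worker applies the identical update, so the replicas stay in sync.
    return w - lr * g
```

With equally sized shards, one such step is mathematically the same as a single SGD step on the concatenated large batch, so adding workers multiplies the effective batch size without changing the update rule.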
Louis: So it was definitely good that you started at the right time, since you already had quite a lot of experience in the field before it got so much hype, like it has since 2018.

Adam: Let's see, you know, you say luck, and luck is a big part of life, yes. But you always look at the decisions ahead of you, you look at them critically, and you try to make a good decision at that point in time. And, you know, those kinds of changes that I made seemed logical. They could have ended up being a mistake, but fortunately they weren't.
Louis: Did you really want to do a PhD, or was it just to work on the project you had in mind?

Adam: How many people do you know who went straight out of university and knew what it even means, like, what it means to do a PhD and then have an academic career? That's a super abstract thing. Maybe if you have parents that are, you know, publishing for a living, but my father was a miner and my mum worked on the railway, so I clearly didn't have that understanding. You know, it sounded like a good idea at the time, and the fact that I managed to win, you know, a pot of money to pay for it and sustain me for a period of time helped a lot with the decision.
Louis: But did you already have an idea in mind? Like, right now in artificial intelligence, for example, a lot of people think that a PhD might be required to get a good job, and so they are just aiming for a title.

Adam: So think about it like you're an employer. You need to hire someone, you get 500 CVs, and you need to choose somehow whom you're going to interview. You cannot interview everyone, it's just not physically possible; you cannot even interview a large group of people, because every person you're going to interview is at least, I would say, two or three hours of work if you want to do it well and prepare. Now let's think about it: you really have to be, you know, quite selective at the get-go, and you need to use some heuristics. You get a CV, so what can you look at? Evidence, you know, evidence that would suggest that that person can do the job. A PhD can be one of them, and if you don't have a PhD and you're a young person, how else would you prove yourself? If you, you know, achieved something, published a paper, contributed substantially to an open source project... A PhD is not strictly required, but you need to start somehow, and a PhD is, in a sense, an easy way, because it's super prescribed: you go to the university, you follow a program, and typically, if you don't, you know, make some mistakes or have a lot of bad luck, at the end of it you'll have a PhD. You will have contributed to a bunch of projects and published a bunch of papers; it achieves something through which you can describe your skills. If you have some other way of doing it, and there are people who do, amazing; but gaining the same level of experience without going through a four-year PhD program is sometimes challenging. We did hire a group of amazing people into my broader team without PhDs, because they kind of proved that they can do the work they need to be doing.
Louis: And when you say you hired, were you one of the people making decisions, looking at the CV, or just looking at the profile and deciding?

Adam: So I have hired personally into my team, and prior to having a team I supported hiring, so I've interviewed and made notes; it's a critical assessment of their capabilities.
Louis: In that case, may I ask how you assess their capabilities, first before the interview but also during the interview? For example, when you say that there are different ways of proving that you can do the work, the PhD is one way, but what would be the other ways that you've seen or that you are looking for?

Adam: The most obvious thing, but you'd be surprised how few people do it, is to actually read the job description. I'm not kidding. A lot of people will send me a generic CV without any consideration of what it is that the employer is looking for. So reading the job description helps, and then taking the next step: after reading the job description, tailoring your CV so that you're answering, via the CV, the questions that need to be answered. So, okay, the employer said they want to see evidence of me knowing a certain technology; how about I include that information and not talk about something irrelevant? So that is the secret sauce, really: just reading the job description and, you know, showing a bit of empathy and trying to spend time helping the person, who might know nothing about the technology, read through the CV. In our case we can spend a lot of technical time going through them, but in many organizations you will have either an HR person or, I don't know, a broker who knows nothing about the technology, and they are just matching keywords. So if you don't do yourself a favor and actually read the job description and then include appropriate evidence, then you have no chance of getting through that first filter.
Louis: But then, when you get through this first round, there's the second round of, for example, selecting the best ten percent or something. So if, for example, you have a lot of resumes and CVs that include the skills you are looking for, are there any projects or academic backgrounds that are more interesting than others? Like, is it fine if someone's experience is all in Kaggle, different Kaggle competitions, or are you looking for someone who built a startup or pushed something online?

Adam: So I actually, and I think we'll talk about it later, because we've talked about it earlier, and really we should spend some time on it: building AI systems is not a trivial task, and they vary. I think I refer to it as almost like building cars. People who know how to build cars don't exist; that's just not possible. A car is a super complicated end-to-end system composed of countless different components, and you typically need a fairly large collection of specialists. So I don't think I have ever hired a person who knows "machine learning and AI". I'm typically looking much narrower than that: a person who understands inference, a person who understands automatic speech recognition models, a person who understands, to an extent, natural language processing, a person who shows evidence of experience in deploying computer vision pipelines into production, a person who understands embedded systems. Kaggle is maybe evidence of a person knowing, to an extent, traditional machine learning, because on Kaggle there are very rarely neural-network-based competitions. Most people really want to basically learn to do everything, and that's just not possible.

Louis: So yeah, rather than being a generalist, you would advise focusing on something, which will maybe help you build a stronger portfolio for a very specific job?

Adam: Is it possible to know everything? It's just not. There exists this small subset of people who seem to have a superhuman capability to consume information much faster than everyone else, and those people can know a bit more, yes, but it's just not possible. I'm trying to focus right now on two things, namely scaling natural language processing pipelines, and inference, so optimizing models for production. And I don't claim to be keeping up to date with the literature, because between reading I also have other things to do, like helping customers resolve issues, so it's just physically not possible. Obviously, some high level of general knowledge helps.

Louis: What is it like, for example, if someone gets through the first steps and starts the actual interviews? What does it look like? How many interviews, and, not really the questions, but what is the shape and format of the interviews?
Adam: Sure. So every organization, and more frequently even a group within the organization, will have its own way, depending on its own needs, capabilities, and bandwidth. So I cannot comment for Nvidia; I cannot even comment for my broader team. I can tell you how I do it. So I tend to be quite empathic, and I start by reading the CV. In the same way as I would like them to read the job description, I read the CV, and I typically just ask them about the things they have written in it and make sure that they actually understand them, because, you know, what's the point of asking them about something that is not listed there? I would hope that the CV already holds enough, you know, knowledge for me to believe that that person can do the job, and I'll typically just focus on that. And if there are certain gaps between the CV and what I need them to do, I'll also be focusing on that. So if they have read the job description and prepared for the interview by looking at all of the things listed there, they should be able to have a very meaningful conversation with me.
So an ideal preparation for such an interview would be, as you said, to look at the job description. Yes, read the description and understand it, but also look further into the bits of the description that you are not sure you are skilled in. You know, so let's say I'm looking for someone that understands, I don't know, the inference process, and I'm writing that I need someone that will be, among many things, supporting customers in quantization-aware training. There's a huge chance that someone will ask me about quantization-aware training. It might be a simple thing, and I might spend five minutes reading about it and understand it very well, and in fact it is. But if I don't even devote two minutes to understanding it on a high level, it shows, you know, to some extent, that I don't care.
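The quantization-aware training mentioned here can be illustrated with a toy sketch: during training, weights pass through a fake-quantization step (round to an int8 grid, then dequantize), so the model learns to tolerate the precision loss it will see at inference time. This is plain Python for illustration, not NVIDIA tooling; the helper name and the symmetric per-tensor scale are illustrative choices.

```python
def fake_quantize(x, num_bits=8):
    """Simulate int8 quantization: map x to an integer grid and back.

    In quantization-aware training this is applied in the forward
    pass, so gradients adapt the weights to the reduced precision.
    """
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    # Symmetric per-tensor scale (1.0 if all values are zero).
    scale = max(abs(v) for v in x) / qmax or 1.0
    quantized = [min(max(round(v / scale), qmin), qmax) for v in x]
    return [q * scale for q in quantized]  # dequantize back to float

weights = [0.02, -1.27, 0.8, 0.33]
approx = fake_quantize(weights)
# The round-trip error is bounded by half a quantization step.
step = max(abs(v) for v in weights) / 127
assert all(abs(a - w) <= step / 2 + 1e-12 for a, w in zip(approx, weights))
```

Real toolchains add learned scales, per-channel quantization, and straight-through gradient estimators on top of this basic quantize-dequantize idea.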

So, I've seen that you are a deep learning architect at NVIDIA. That's correct. Could you explain a bit what a deep learning architect is, and especially what it is in your case? Because I assume that, as with job titles in general, it may vary depending on the company.

That's correct. So at NVIDIA I think that title actually means something quite different to what it means in many other companies. Solution architects are part of pre-sales organizations, so their goal is to support customers in the adoption of our technology, but NVIDIA tends to focus on things that are difficult. That's almost like one of the key principles that drives the selection of what technologies NVIDIA is and isn't developing. As a consequence, solution architects tend to be very specialized, in quite a narrow field. But the role is, on a high level, quite straightforward to explain. In contrast to research scientists, who understand one very narrow topic very well, our role is to understand a substantially broader set of topics also relatively well, obviously nowhere near as well as the individual researchers, and to help bring all of those things together. Because, as I mentioned, building AI applications is almost like designing and manufacturing cars: a lot of pieces need to come together, and an architect is a person that can grasp those pieces, maybe not understand each and every component to its finest detail, but grasp those pieces together and bring them into a holistic, end-to-end working solution.

And what pieces are you working on? You mentioned that you were mainly working on scaling, but are there any other pieces?

So that's my core area of competence, so a lot of people throughout Europe, if they need to know something about that piece, would reach out to me, and I have other colleagues specializing in other things who are my go-to support people. For example, a colleague of mine, Miriam, specializes in Riva, which is our platform for conversational AI; another colleague I have quite a bit of respect for for his systems knowledge; and so on and so forth. So I specialize in that, but I support a broad range of customer activities, because especially when customers are just starting the journey, a lot of things I can cover with my knowledge: stuff associated with data preparation, enabling and establishing first pipelines, the development of first metrics, kicking off first jobs, monitoring their performance, doing error analysis, measuring the efficiency with which they execute, first deployment into what will become a production system; so the end-to-end process. We try to support customers. Some of them know all of that already, and then they have very specific questions, and then we just work with engineering to fix bugs that might be in the software. Some of them don't, and they require more holistic support.

So you mentioned customers. I assume you are part of a team at NVIDIA that helps other companies that ask for solution architects' help, is that correct?

Our mission is to support the broader community in the adoption of AI technologies. Maybe the word customer is a bit misleading, because we sell through partners, and you can buy our GPUs from AWS or Microsoft. But yes, our mission is to make sure that the broad community can and does adopt our technologies, and yes, that includes GPUs, but we are more of a software engineering company than a hardware manufacturer. The entire deep learning stack is filled with NVIDIA software, from CUDA and cuDNN to countless other libraries; we're a major contributor to PyTorch, TensorFlow, Triton Inference Server, TensorRT, and many, many others.

Could you go into a bit more detail on one specific recent project that you've had, or had to help with?

We have an upcoming conference called GTC, it starts on the 20th, and there are two projects that I've been supporting recently. I will not be able to go beyond what is already published on the website, I want to leave the surprise to the people that will actually be with me presenting at the event, but we have two talks, one with Jaguar Land Rover and another one with Deutsche Bank, and in both cases I was supporting them in the development of natural language processing capability. Totally different use cases, totally different sectors, but the same technology stack and the same problems. You know: I have a problem to solve, and that problem is important to me. How do I create a training dataset that is sufficiently big to allow us to achieve our goals? How do I scale the training process? Once I've succeeded and I've reached my metrics, how do I now serve this to a large group of users? So some fairly standard, yet not trivial to solve, problems: GPU utilization, making sure that it's high; reaching latency targets when executing those models; scaling those models, so what if the demand from the users is bumpy, how do I dynamically deploy a couple of additional GPUs and then scale down? All of those problems are obviously solvable, but they require a lot of tools and a lot of knowledge, so we help and point them in the right direction.
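One common lever for the GPU utilization and latency targets described above is server-side request batching. As a sketch, Triton Inference Server can be told to batch incoming requests dynamically in a model's `config.pbtxt`; the model name, backend, and the exact numbers here are illustrative assumptions, not a tuned recommendation:

```protobuf
# Hypothetical model configuration illustrating dynamic batching.
name: "nlp_model"
platform: "onnxruntime_onnx"
max_batch_size: 32

# Group individual requests into larger batches, trading a small,
# bounded queueing delay for higher GPU utilization.
dynamic_batching {
  preferred_batch_size: [ 8, 16 ]
  max_queue_delay_microseconds: 100
}
```

For bumpy demand, the serving layer is typically paired with an autoscaler that adds or removes GPU-backed replicas based on queue depth or utilization.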

And you said that the same technology was applied very differently in the two projects, so what are the similar challenges on both projects, or on all your projects? Is there a recurrent challenge, something that you know how to do right now, but that is complicated for other people to do, and thus they need your help?

So most of the problems, when you look at them from, again, miles away, seem simple. What is really difficult is the fact that life does not work like that: you don't get one thing, instead you get, every single day, a medium-scale problem to solve, and there are just a lot of them. You know: we want to label data. How much data do we need? Okay, we need to establish that. And how exactly do we label? Let's define that. Okay, that's a lot of data to label, I cannot do it in one evening, so who exactly will label the data? Okay, I'm not labeling the data, so I need to explain how exactly I want it labeled to all of those people that I just hired. Okay, they've labeled some data; how do I know that they've labeled it in a way that is appropriate? Okay, I cannot do all of that quality analysis, I need to maybe get a dedicated person to oversee that process. And you go through the pipeline, and all of those problems in themselves are not rocket science, but all of them need to be solved, and some of them actually are surprisingly tricky.
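The question of whether annotators labeled data "in a way that is appropriate" is commonly checked with an inter-annotator agreement statistic. A minimal sketch, using Cohen's kappa for two annotators (the label values are made up for illustration):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Agreement between two annotators, corrected for chance.

    1.0 means perfect agreement; 0.0 means no better than chance.
    (Undefined when chance agreement is 1, i.e. both annotators
    always pick the same single class.)
    """
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: probability both pick the same class independently.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

a = ["pos", "pos", "neg", "neg", "pos", "neg"]
b = ["pos", "neg", "neg", "neg", "pos", "neg"]
print(round(cohens_kappa(a, b), 3))  # → 0.667
```

A low kappa on a labeled sample is a signal to tighten the labeling guidelines before scaling the annotation effort up.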

And since you mentioned working mainly with NLP, in NLP you are also working with very large models and very large datasets, so a lot of compute, and a lot of storage required as well. So may I ask how the different, not customers, but the different people that are working with you are dealing with these very large models and datasets? What's your typical solution for that, how do you process that?

So two years ago this would have been a very difficult conversation, but today the tools just exist. You can go to your favorite search engine and search for, say, NeMo Megatron, and that's a tool with which you can largely just change the language code. I don't know, you want to build a model for Polish: you change the language code, it will download the Pile dataset, you click, and if you have the compute, it will train you a fairly decent language model of almost any size you want. I think we've published hyperparameters for models of up to 175 billion parameters, and it will scale perfectly as long as you have the right hardware, and by perfectly I mean linearly.

So that's not a problem. We have published reference designs for how to build systems that will scale linearly, we have this thing called SuperPOD reference architectures, and many, many people are reusing those, also for their own system designs. So we know exactly how to train, and prompt-tune or adapter fine-tune, those models. The literature exists today; this is not as big of a challenge as you might think. And, you know, for an individual the amount of hardware needed can be scary, if you think about how much your car costs, but then a typical large company will sometimes spend the same amount of money in a year on sandwiches in a canteen as what it takes to train those models, electricity included. So those are not as challenging problems as you might think, and there are many, many startups that have set up, either in-house or via some partners, large training systems that they use for those training jobs.

So, compared to the past, the current challenge is mainly to find which tools to use, and how to use them in a way that is cost-effective, rather than developing something yourself?

There are a lot of organizations that are developing tools themselves, but if you don't want to develop tools for training large language models, you don't have to. You just go to the NeMo website, you download the code assets, and you download the Docker container that packages everything you need, including the software. You configure a single YAML file to point it to where the data is located and to choose the size you want. Now, actually preparing the data is always difficult; it's not as easy as that. Yes, you can download the Pile and train on that, but that model will not necessarily have all the properties you want. The systems and hardware exist for training those models; it's not that difficult. Maybe the big challenge is the people: there aren't that many people that actually know that those tools exist, let alone know how to use them, or have any hands-on experience using them. That's just a function of all of this being very new. I mean, I think the BERT paper, wasn't that a large language model published in, what, 2018? And at the end of 2019 we had curves demonstrating that larger NLP architectures trained on larger datasets improve sample efficiency. So all of that is super new.

So, I assume this is one of the things that you will talk about in the two events at the coming GTC, so people can learn more about how to scale NLP models and how to deploy them?

Yeah, and we'll have dedicated talks devoted to that as well. So I think the two talks that I've mentioned are predominantly focused on what specifically Deutsche Bank and Jaguar Land Rover have done, but we will have dedicated talks also focusing on large language models from various different perspectives, from hardware to software, and we have this thing called the LLM service, which is a hosted large language model, so there will be all sorts.

Perfect. And for the audience, if you are interested in learning more about what we've just discussed, the talks will be in the description below, and, as Adam said, it's completely free, and it's during the upcoming event. So, just to go a bit more into your NVIDIA work, I would just like to ask a very basic question: what is your day-to-day life at NVIDIA, what are you doing on a regular basis?

Our job is quite flexible, and it really changes, not only with the technology landscape, but also with where our customers are. So when I joined in 2017, my job was dramatically different to what it is right now. In 2017 there wasn't that much adoption of deep neural networks; we had some selected customers that we were supporting on a day-to-day basis, and we were supporting a lot of academic and business events, preparing talks and doing evangelization. And now it's dramatically different. Right now a lot of those early customers have matured; we have very large inference deployments, so I personally have a couple of customers that have multiple thousands of instances of Triton Inference Server. We have regular engineering calls during which they ask questions, ask for features and raise bugs, and I have to work with product management to prioritize and resolve them. And we definitely now have less of a presence at those business-focused AI events, because there is less need to do that. I don't think I have a super strict daily routine. I have a schedule of engineering calls with a fairly large number of organizations, every week, maybe every two weeks, maybe every month, depending on the pace of their progress. I'll dial in, we'll have a conversation about the progress they've made, issues, challenges, maybe bugs; I would then try to resolve some of them myself, and pass some on to engineering. I also support the development of proposals, day-to-day conversations with customers, and answering the deeply technical questions around details. So, things such as sizing and scaling: how many GPUs do we need to train a model of this size on this dataset? Or how many do we need to be able to work with a team of six doing, I don't know, prompt learning or adapter tuning on this problem?
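Sizing questions like these are often answered with a well-known back-of-the-envelope rule: training a dense transformer costs roughly 6 * N * D floating-point operations for N parameters and D training tokens, divided by the sustained per-GPU throughput. The throughput and utilization figures below are illustrative assumptions, not NVIDIA guidance:

```python
def estimate_gpu_days(params, tokens, flops_per_gpu=312e12, utilization=0.4):
    """Back-of-the-envelope training cost for a dense transformer.

    Uses the common ~6 * N * D FLOPs approximation (N parameters,
    D training tokens). flops_per_gpu is peak throughput (an assumed
    312 TFLOP/s, e.g. FP16 tensor cores on an A100) and utilization
    is the assumed sustained fraction of that peak.
    """
    total_flops = 6 * params * tokens
    seconds_per_gpu = total_flops / (flops_per_gpu * utilization)
    return seconds_per_gpu / 86400  # convert GPU-seconds to GPU-days

# Example: a 5B-parameter model trained on 300B tokens.
gpu_days = estimate_gpu_days(5e9, 300e9)
print(f"{gpu_days:,.0f} GPU-days")  # → 835 GPU-days, i.e. ~52 days on 16 GPUs
```

The estimate ignores data pipeline and communication overheads, which is exactly why the assumed utilization fraction matters more than the peak number.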

And then it changes throughout the year as well. So closer to events like GTC we focus slightly more on the content, and sometimes prior to GTC we support the development of demos. And we have some customers that we work very closely with, lighthouse-account customers that we support more intensively. So, for example, I just published a paper with a company called InstaDeep on nucleotide transformers. I know very little about biology, admittedly, but I do know how to force neural networks to scale well, so I was hands-on helping them make sure that their very large language model works very well on proteins. So there isn't a single recipe that I follow on a day-to-day basis.

Super interesting. It seems, from all the interviews that I've had recently with people working at NVIDIA, that you have a very broad range of projects that you can participate in and learn from, so it seems really cool.

Yeah, NVIDIA is trying, through its culture, to be very agile and to quickly adapt to this admittedly insane technological landscape.

And would you say that this technology is more insane now than it was in 2018 when you first joined?

Dramatically, dramatically more. Almost every year I say the same thing: I've never seen such a fast rate of progress. It's unbelievable. If you took, I don't know, what was published just now on those instruction-tuned models to an academic conference two or three years ago, they would refer you to a mental institution. I don't think anyone would have believed that in such a short period of time those models would exhibit those types of behavior, especially since in most cases we are not necessarily explicitly training them to exhibit those behaviors. Those are emerging features; most of those models are just trained to predict the next token.

Yeah, it's indeed crazy. And how do you keep up with this rate of progress?

I don't, I don't think I do. So I have my day-to-day job, and I do a bit of reading. We have an amazing research team, and we have internal email and other newsletters where people just raise the most important stuff, so they do an amazing job. But it's just not possible. Not even across the entirety of AI; even right now, within natural language processing alone, it's borderline impossible to keep up with everything, and you have to learn to let go and focus on what it is that you're doing, what the problems are that you have in front of you, solve them, and try to add value like that.

So as the field is maturing like this, or just getting crazier, would you say that you have to be more and more specific in what you are doing, compared to five or six years ago?

Oh, definitely. We had that conversation with some of my colleagues, I don't remember exactly when, it doesn't matter I guess. When I joined NVIDIA in 2017, it was somehow possible, yet already challenging, for me to grasp everything AI at NVIDIA. Right now it's just not practical, not possible.

that's too much would you say is that

[00:39:40] : [00:39:43]

it's more challenging now compared to

[00:39:43] : [00:39:47]

then when you you know for example it's

[00:39:47] : [00:39:48]

different it's definitely different

[00:39:48] : [00:39:51]

challenges back then you had to learn a

[00:39:51] : [00:39:52]

lot about

[00:39:52] : [00:39:56]

everything basically or like you

[00:39:56] : [00:39:58]

it was difficult to know what to learn

[00:39:58] : [00:40:01]

about just because you you have it's not

[00:40:01] : [00:40:04]

as broad so you can allow yourself to

[00:40:04] : [00:40:06]

learn about a bit everything comparative

[00:40:06] : [00:40:09]

versus now you you have to learn a lot

[00:40:09] : [00:40:12]

about a very specific thing and

[00:40:12] : [00:40:14]

that yeah as I said the challenges are

[00:40:14] : [00:40:16]

are very different would you say that

[00:40:16] Interviewer: Would you say that it's harder now, or just hard in different ways, with different challenges and different things to do?

[00:40:26] Guest: I'm not sure it's harder. At that point in time it was also quite hard, because a lot of things were super non-obvious. Right now there is just a lot of very good quality research and engineering coming out, so you have to learn to let go; it's okay, you cannot know everything. At some point I was trying to keep up with both natural language processing and computer vision research, but I think I had to let go of computer vision. Obviously I know that Transformer architectures are very popular, and I occasionally skim a paper to make sure that I more or less know how to understand them, and I look a lot at multimodal architectures now, so that helps me stay somewhat up to speed, and on unsupervised models too. But if you were to ask me what the best architecture is right now for, I don't know, object detection on MS COCO, I don't know.

[00:41:24] Interviewer: Yeah, you can ask ChatGPT.

[00:41:27] Guest: Yeah, or I can go to Google, it doesn't matter. It's just that I don't know it off the top of my head. You have to let go; there's too much.

[00:41:33] Interviewer: Yeah, and this effect is even stronger now. When Google first came up, it already meant a different mindset: instead of trying to gather knowledge in your own mind, you just need to understand how to find the knowledge, because Google makes everything accessible at your fingertips. Now it's moving even further in that direction: basically you need to know what to type into ChatGPT, or whatever machine learning model will give you the answer (if it's not hallucinating, but that's another thing). What I think is that it's very dangerous for our own memory, just because we don't have to know nearly as many things as we did in the past, and so maybe our memory will just atrophy or something.

[00:42:29] Guest: No, it's still useful to memorize; it's just that you cannot rely on it alone. Expertise comes from the fact that you can bring facts together and combine them to create new insights.

[00:42:43] Interviewer: For your current project, which is something I think a lot of people are interested in too, scaling and natural language processing: this question is twofold. The first part is, what are your favorite tools to use, and the second is, what is your tech stack, the programming languages and other internal tools you are using?

[00:43:07] Guest: The tools that we're using are open source, and I use almost exclusively those. On a day-to-day basis, when it comes to NLP (and I do other stuff as well, don't get me wrong), we use NeMo. I use NeMo for natural language processing, but I also work quite a lot in this broader conversational AI space, so I support quite a few customers with automatic speech recognition as well, and less so with text-to-speech, and NeMo provides foundations for those models too. For users that don't want to be exposed to the low-level implementation details of NeMo, we also have this thing called TAO, which stands for Train, Adapt, Optimize, and which effectively helps people with less modeling knowledge to fine-tune models on their own datasets.

[00:44:12] Guest (cont.): So I use NeMo, and historically Megatron-LM as well, for a lot of natural language processing work. Prior to the generative models, when we were working with smaller BERT-like architectures, those were typically standalone models that you can still find in our Deep Learning Examples GitHub repository.

[00:44:37] Guest (cont.): On the inference front I predominantly, if not exclusively, work with Triton Inference Server, an open source inference server used by many organizations: by Microsoft, where it is integrated into Teams and Office, by American Express, and many others.
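For readers who haven't seen Triton before: each served model is described by a small protobuf-text configuration file. The sketch below is only illustrative; the model name, batch size, and tensor names are made-up values, not anything referenced in the interview.

```text
# config.pbtxt (hypothetical example)
name: "my_gpt"                 # made-up model name
backend: "fastertransformer"   # the Triton backend discussed below
max_batch_size: 8
input [
  { name: "input_ids", data_type: TYPE_INT32, dims: [ -1 ] }
]
output [
  { name: "output_ids", data_type: TYPE_INT32, dims: [ -1 ] }
]
instance_group [ { count: 1, kind: KIND_GPU } ]
```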

[00:44:55] Guest (cont.): Triton has a backend called FasterTransformer, which is still quite early but, in my opinion, already works quite well, and through which you can integrate even the largest language models for serving. Even if you have a model that doesn't fit into a single GPU, FasterTransformer has tensor- and pipeline-parallel implementations, so you can slice the model either vertically or horizontally and serve it like that.
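The "vertical" (tensor-parallel) slicing mentioned above can be sketched numerically: split a layer's weight matrix column-wise across devices, compute the partial products independently, and concatenate the results. This is a minimal NumPy illustration of the idea, not FasterTransformer's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))        # a batch of activations
W = rng.standard_normal((8, 16))       # full weight matrix of one layer

# Tensor-parallel slicing: each "GPU" holds half the columns of W
W0, W1 = np.split(W, 2, axis=1)

y0 = x @ W0                            # computed on device 0
y1 = x @ W1                            # computed on device 1
y = np.concatenate([y0, y1], axis=1)   # gather the partial outputs

assert np.allclose(y, x @ W)           # identical to the single-device result
```

Pipeline parallelism is the "horizontal" counterpart: instead of splitting individual weight matrices, whole layers are assigned to different devices and activations flow between them.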

[00:45:23] Guest (cont.): But if the model is so huge that it doesn't fit into however many GPUs you have in a single server, FasterTransformer also supports multi-node serving. And it has a lot of tricks up its sleeve around optimizing autoregressive inference, because models like GPT are autoregressive, meaning that you generate one token at a time and then do a lot of forward passes, so there's a need for a bit of trickery so that you don't compute the same thing over and over again.
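The classic version of that trickery is key/value caching: instead of re-running attention over the whole prefix from scratch at every decoding step, each token's keys and values are computed once and appended to a cache. Here is a deliberately tiny single-head sketch of the idea (using the same vector as query, key, and value for brevity), not any production implementation.

```python
import numpy as np

def attend(q, K, V):
    # one query attending over all keys/values (single head, softmax weights)
    logits = q @ K.T
    w = np.exp(logits - logits.max())
    w /= w.sum()
    return w @ V

rng = np.random.default_rng(0)
tokens = rng.standard_normal((6, 4))   # per-token projections, dim 4

# Naive decoding: recompute attention over the full prefix at every step
naive = [attend(tokens[t], tokens[:t + 1], tokens[:t + 1])
         for t in range(len(tokens))]

# KV cache: append each new key/value once, reuse the cache every step
K_cache, V_cache, cached = [], [], []
for t in range(len(tokens)):
    K_cache.append(tokens[t])
    V_cache.append(tokens[t])
    cached.append(attend(tokens[t], np.array(K_cache), np.array(V_cache)))

assert np.allclose(naive, cached)      # same outputs, no repeated K/V work
```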

[00:45:54] Guest (cont.): So that's roughly it. I work mostly with PyTorch, just because the tools are available for PyTorch.

[00:46:04] Guest (cont.): I also work with TensorRT quite a bit. TensorRT is a tool that we also provide to the community; it takes a neural network that was trained in, say, PyTorch, or that was exported to ONNX, and optimizes it for deployment on a specific GPU. It does things such as fusion of layers and kernels, post-training quantization, and quite a few other things.
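Post-training quantization, one of the optimizations just mentioned, can be illustrated in a few lines: map trained float weights to int8 using a scale derived from their observed range, then dequantize at use time. This is only a conceptual sketch of symmetric per-tensor PTQ, not TensorRT's calibration machinery.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal(1000).astype(np.float32)    # trained float32 weights

# Symmetric post-training quantization to int8: one scale for the tensor
scale = np.abs(w).max() / 127.0
w_int8 = np.clip(np.round(w / scale), -127, 127).astype(np.int8)

# Dequantize to check fidelity; the int8 tensor is what would actually ship
w_deq = w_int8.astype(np.float32) * scale

assert w_int8.dtype == np.int8
assert np.abs(w - w_deq).max() <= scale / 2 + 1e-6  # error within half a step
```

The int8 tensor is a quarter the size of the float32 one, and integer math maps onto fast GPU kernels, which is where the deployment win comes from.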

[00:46:34] Guest (cont.): And to that list I should add Riva, which I work with a lot. Riva is our stack for conversational AI: it provides pipelines for automatic speech recognition and text-to-speech, it now also has an early version of a chatbot maker, and it includes all of the satellite utilities around the technologies I just mentioned.

[00:46:58] Interviewer: Are you surprised by the fact that open source technologies are so powerful compared to proprietary technologies? To me it's kind of mind-blowing that everyone can access them, and can even build companies and make a profit off of tools that were developed openly, for everyone.

[00:47:25] Guest: I'm not sure I have a very strong opinion, or have ever thought about this problem in a lot of detail. It's definitely true for technologies that have a big community, and that is a blessing, yes. But at the same time there are countless open source projects that are maintained by just one person, and they have no chance of competing with a dedicated enterprise solution.

[00:47:50] Interviewer: Yeah, I was asking because right now we are in a world where, for example, there is ChatGPT, which is built and controlled entirely by a specific company, so you have to pay for it, and you cannot really (well, you can do a lot with it, but you cannot really look at its inner workings or modify it), whereas there is, for example, Stability AI with its open source work. I feel like just before Stable Diffusion, most open source projects were, not a pale copy, but usually behind the proprietary technologies, and I feel like there's a turnaround now, where open source technology is getting closer and closer to the companies, or even more powerful. Is that true?

[00:48:50] Guest: There are plenty of Apache Foundation projects that sit at the foundation of the internet, and Linux is open source too, so open source has always played a role. But there is always also a role for commercial products, because open source is not necessarily cheap. Let's say you have a small company, you take an open source project, something doesn't work, and the only person you have who understands that project cannot solve it. What do you do? You ask the community politely, and they may or may not help you. That's not a way to do business. So there is obviously value for many other companies in providing commercial services around those open source projects, or providing alternatives. Even at NVIDIA we have this thing called NVIDIA AI Enterprise, through which we provide support, for people that want it, for all of the open source tools: if you find a bug in PyTorch, we'll fix it for you. So there is value in those. I have participated in deployments of production systems, and if stuff goes wrong, you want to fix it.

[00:50:09] Interviewer: And I wanted to ask, similar to my other questions: what is the biggest, or the most recurrent, challenge in just deploying models? Is there something that is more difficult than the rest?

[00:50:24] Guest: You never deploy models; you deploy pipelines. I don't think I've ever participated in a project where you deployed just a model. The challenge is that you really deploy a lot of different things that need to work together and that all have different properties. In computer vision, let's say you get a video stream. The first thing you have to do is decode the stream, and decoding in itself is super non-trivial; you have to think about how you do it. If you just do it with some random library on a CPU, that alone would max out all of your servers and you would not be making much money. But if you choose the right codec, NVIDIA GPUs have hardware acceleration for video decoding, and suddenly that part becomes essentially free.

[00:51:12] Guest (cont.): But then, for the next step, you want to do some pre-processing. Say you've used the NVIDIA GPU for decoding, and now you want a library that does something, I don't know, trims the image, it doesn't really matter; but you choose a CPU library because your team knows it. That has implications: you have to make a memory copy from GPU memory over PCI Express to the host, and you're putting load on that link. So in most day-to-day life you don't deploy a model but a quite complicated multi-stage pipeline, typically composed of substantially more than one neural network.

[00:51:52] Guest (cont.): Even within computer vision, you might first do, I don't know, identification of regions of interest, classify those regions initially, and then pass them on to individual neural networks that do different things: in a shop setting, say, theft detection, vandalism detection, or analytics. You want a shared decode, because you don't want to decode the video stream over and over again for every one of those use cases. Then you do a common detection of locations, maybe a couple of other neural networks, and then you have another neural network that does tracking, or maybe a neural network plus a classical tracking algorithm: the classical tracking algorithm because it's cheap, and the neural network because if someone goes to the toilet, you want to be able to re-identify them afterwards, so for re-identification you'll have yet another neural network. So suddenly, what started as "oh, I'll deploy whatever, a Mask R-CNN" ends up being quite a big monstrosity of components, and all of them need to work, and all of them need to scale.
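The shape of that "monstrosity" can be sketched as a shared decode stage fanning out into task-specific stages. Everything below is a placeholder stub (the detector, the task heads, the frame counts are all invented) intended only to show the structure of such a pipeline, not any real deployment.

```python
from dataclasses import dataclass, field

@dataclass
class Frame:
    index: int
    regions: list = field(default_factory=list)    # detected regions of interest
    events: dict = field(default_factory=dict)     # per-use-case results

def decode(num_frames):
    # Shared decode: each frame is decoded exactly once for all downstream tasks
    for i in range(num_frames):
        yield Frame(index=i)

def detect_regions(frame):
    frame.regions = [("person", (0, 0, 10, 10))]   # stub common detector
    return frame

def theft_detection(frame):
    frame.events["theft"] = False                  # stub use-case head
    return frame

def analytics(frame):
    frame.events["visitors"] = len(frame.regions)  # stub use-case head
    return frame

def run_pipeline(num_frames, tasks):
    out = []
    for frame in decode(num_frames):               # one decode, many consumers
        frame = detect_regions(frame)              # shared detection stage
        for task in tasks:
            frame = task(frame)                    # fan out to each use case
        out.append(frame)
    return out

frames = run_pipeline(3, [theft_detection, analytics])
assert len(frames) == 3
assert frames[0].events == {"theft": False, "visitors": 1}
```

In production, each of these stages is its own component with its own failure modes and scaling behavior, which is exactly the point being made above.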

[00:53:03] Guest (cont.): Video analytics is easy in that respect, because the workload is like clockwork: 30 frames per second is 30 frames per second. But in many cases it's not like that; your customers come in waves at Christmas, and then they don't come at all on New Year's Eve, and you don't want to be paying for the hardware in between. It's such a multi-dimensional problem; inference requires quite a lot of work, from quite a lot of people, with quite a lot of different expertise.
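The capacity problem described above (load that comes in waves) is at its core a sizing calculation. A toy autoscaling rule based on Little's law, with entirely made-up traffic numbers, looks like this:

```python
import math

def replicas_needed(requests_per_sec, seconds_per_request, per_replica_concurrency):
    # Little's law: concurrent work in the system = arrival rate * service time
    in_flight = requests_per_sec * seconds_per_request
    return max(1, math.ceil(in_flight / per_replica_concurrency))

# A Christmas rush vs. a quiet New Year's Eve (hypothetical numbers)
assert replicas_needed(2000, 0.05, 16) == 7   # 100 concurrent requests / 16 per replica
assert replicas_needed(5, 0.05, 16) == 1      # scale down to a single replica
```

Real autoscalers add headroom, hysteresis, and warm-up time on top of this, but the underlying arithmetic is the same.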

[00:53:31] Guest (cont.): So the main challenges come with the complexity of the solution, as well as the complexity of production and the randomness behind it. It's just life: life is never that easy. Even things that seem super trivial when you look at them from a hundred miles away have a level of complexity once you start digging into the details. That level of complexity might not necessarily be high; none of the things I just mentioned is rocket science, but there is just a lot of it, and you have to systematically tackle every one of those problems. That requires a bit of patience, sometimes a bit of character, and, especially if you have a younger team without the experience of having done it before, quite a bit of time. And this is part of the support we provide: we have a very capable team that has done those things quite a few times, and that is the type of guidance we give to customers.

[00:54:33] Interviewer: My fourth question was about the two topics we discussed that you will be presenting at GTC. Is there anything else you wanted to mention about those topics, or could you at least summarize why people should tune in to those two specific talks you are giving at GTC?

[00:54:57] Guest: If you empathize with what I just said about the day-to-day complexity of solving problems, those talks will be great for you, because you'll hear two different groups, serving two dramatically different problems, talk about more or less the same pain points: the pain points of getting the first pipeline up and running, writing KPIs, and so on. You'll get their point of view on how to solve those problems, and hopefully it will help you plan your own projects a bit better and give you some ideas on what things to put in place, and in which order, because I think both of those talks are organized chronologically, in the sense that they go through the journey from the very beginning to the very end, highlighting all of the key things that caught them by surprise. They also go into the motivation in quite a bit of detail: why did those companies, those particular groups, decide to embark on that journey? That's also something that might help you, maybe if you're not in an engineering role but rather in a management role, to think more systematically about what is possible with natural language processing.

[00:56:15] Interviewer: So we should expect very applicable tips from real-world examples, basically.

[00:56:24] Guest: Yeah. People will obviously have different experiences, but yes.

[00:56:30] Interviewer: Awesome. Well, thank you very much for your time and for all the very valuable insights. It was really interesting, and I learned a lot just in this past hour or so. Thank you very much again; I was glad to have you on for this interview.

[00:56:47] Guest: Thank you, and have a great day.

Listen to the interview on your favorite streaming platforms like Spotify, Apple podcasts, or YouTube.


The lead image was generated using HackerNoon's Stable Diffusion AI Image Generator feature, via the prompt "a human architect standing and looking at a building".

by Louis Bouchard @whatsai. I explain Artificial Intelligence terms and news to non-experts.
Watch more on YouTube: https://www.youtube.com/c/WhatsAI
