Source: https://www.mindtheproduct.com/machine-learning-in-product-discovery-ha-phan-on-the-product-experience/

Machine learning in product discovery – Ha Phan on The Product Experience

BY The Product Experience ON APRIL 6, 2022

If used well, machine learning (ML) can be a key tool to gain the most effective data about your users. However, how do you conclude if it’s the right approach for your product and teams? We spoke with Ha Phan, Director of Discovery Products at Pluralsight, to develop a stronger understanding of how we can best use ML in product management.

Listener of The Product Experience? We need your help! Fill out this short survey to help us improve the overall experience of our community-led podcast. 


Featured Links: Follow Ha on LinkedIn and Twitter | Not Hotdog on the App Store | ‘A thank you letter to my team’ article by Ha Phan

Episode transcript

Lily Smith: 0:00

Randy, have I ever told you about when I worked for a search and discovery business? We played around a lot with machine learning. It was really fun. There was no front end, just data: inputs, outputs, lots of experiments.

Randy Silver: 0:15

I'm not sure I'd have enjoyed that one, Lily. I miss working on interfaces with designers. I mean, I can understand it in theory, but I'm not so sure that it would be quite right for me.

Lily Smith: 0:27

Well, it's a good thing that our guest today can talk to both of us. She's a former designer who ended up looking after search and discovery for Pluralsight, building out a whole machine learning team.

Randy Silver: 0:38

Ah, then it can only be half fun. She's one of my favourite product Twitter follows, and she's joining us to give us the lowdown on working with ML teams on a daily basis, why search is so interesting, and why she's the wrong person to answer some of our questions.

Lily Smith: 0:56

No more suspense. Let's discover what she had to say. See what I did there?

Randy Silver: 1:01

Oh God, my sense of humour is rubbing off on you. Save us all.

Lily Smith: 1:10

The Product Experience is brought to you by Mind the Product.

Randy Silver: 1:14

Every week, we talk to the best product people from around the globe about how we can improve our practice and build products that people love.

Lily Smith: 1:21

Check out mindtheproduct.com to catch up on past episodes and to discover an extensive library of great content and videos.

Randy Silver: 1:29

Browse for free, or become a Mind the Product member to unlock premium articles, unseen videos, AMAs, roundtables, discounts to our conferences around the world, and training opportunities.

Lily Smith: 1:42

Mind the Product also offers free ProductTank meetups in more than 200 cities. There's probably one near you.

Randy Silver: 1:52

Hi, thank you so much for joining us tonight. I’ve been looking forward to talking to you for a very long time.

Ha Phan: 1:58

Nice to meet you, Randy.

Randy Silver: 1:59

So before we get into the topic of the conversation, where we're going to be talking a lot about product and ML, can you just give us a quick intro? How did you get into product in the first place, and what are you up to these days?

Ha Phan: 2:13

Well, how I got into product: people have known me for quite some time as a UX designer. A friend coined the term "geriatric UX designer", because I've been doing it for a long time, so we call each other geriatric UX designers. I got into product, I think, when I was working at GoPro. I was working on some initiatives for R&D, and we were building experiments, prototypes. That experience, I think, was the gateway drug into product, because it taught me how to sell the initiative across the company. It also taught me a lot about thinking through research questions, how to find clarity from ambiguity. And when you're working with emerging tech, there's no baseline around feasibility, so the ambiguity is vast. I felt like that provided me with the foundation for product management, for figuring out what questions to ask. After GoPro, I went to work at a startup, where many of the GoPro people had gone, also working on some AI tech for smart homes, Internet of Things. And then when I applied at Pluralsight, I applied as a UX designer. I showed my case studies, and the hiring manager saw them and said, you should really be a product manager. So that was how I started.

Randy Silver: 3:50

Okay, it sounds like you went through a very natural evolution from one to the other. I'm curious: you went from design to product, you picked up a specialisation in machine learning, and you spent a lot of time working on search and discovery as well. What's the learning curve been like in making those changes and picking up those specialties?

Ha Phan: 4:12

So I think the learning curve is very steep, actually. But what helps me is that I have a solid foundation in information architecture from UX, and the R&D work at GoPro helped me to see the data model behind the interaction. It's not something I can explain; I just see it when I design. So when I started working in search, I understood what signals the system has to understand in order to figure out what relevance means. I just connected the dots between my UX experience in information architecture and the data model we use in building search relevance with machine learning. That's really high level, but when you talk to data scientists or machine learning engineers, or people who work on search, they'll tell you what metadata they need, and all those things are signals that the system needs to collect over time. Some signals are ones we curate into the dataset: metadata and tagging are signals that we curate. But then there are implicit and explicit signals. Explicit signals are things the user does explicitly, like clicking on something. Implicit signals are what the system infers: you can infer things from what the user doesn't do, and have that inform relevance as well. There are things like that I just understood from my work on emerging tech and brought with me to search. I didn't know what I knew at the time, but working on it, and working with engineers and asking a lot of questions, enabled me to see it within the system itself, rather than in the abstract terms we talk about day to day.
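To make that signal taxonomy concrete, here is a minimal sketch in Python of curated, explicit, and implicit signals feeding a relevance score. The class, field names, and weights are illustrative assumptions, not Pluralsight's actual system:

```python
from dataclasses import dataclass, field

@dataclass
class DocumentSignals:
    curated_tags: set = field(default_factory=set)  # curated: editorially applied metadata
    clicks: int = 0   # explicit: the user clicked this result
    skips: int = 0    # implicit: the result was shown but passed over

def relevance_boost(sig: DocumentSignals, query_tags: set) -> float:
    """Combine the three signal types into a single (made-up) relevance boost."""
    tag_overlap = len(sig.curated_tags & query_tags)
    impressions = sig.clicks + sig.skips
    click_rate = sig.clicks / impressions if impressions else 0.0
    skip_rate = sig.skips / impressions if impressions else 0.0
    # Explicit clicks push the score up; implicit skips pull it down,
    # i.e. inferring relevance from what the user does NOT do.
    return tag_overlap + 2.0 * click_rate - 0.5 * skip_rate
```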

Lily Smith: 6:16

You've talked about how you use some of your UX experience in your machine learning work, but I imagine it would also need quite a different approach from a product management perspective, compared to, for instance, a front-facing user interface versus a search product. So what were some of the ways in which you've managed those products? How do you go about it? I know that's a massive question, but yeah.

Ha Phan: 6:53

That's a hard question. To understand my journey, you've got to understand the context a bit. When I came on as the PM for search, we didn't have a search team. I actually asked to work on search; I felt it was a core capability that we needed to scale. At the time there was no engineering team, just a lone team member on search. So there were many challenges. First, we had to hire a team, and the first team members who came on were full-stack engineers, but building search is a very specific skill set. Over time we did add machine learning engineers and data scientists. The reason I bring that up is because it takes time for a team to skill up, not just in understanding the tools they use, but in understanding the problem space. At the beginning, none of us even understood how to build search. When we look at search, we see a query box and results, but it's very, very complex. I think search is one of the hardest things I've ever worked on. So what I had to do as a product manager was buy the team time to learn, not just the tooling, but how to measure relevance, what relevance meant. For the first deliverable, the MVP, I told the engineering team that the new search can't suck worse than the old search. I basically didn't add any features. In fact, we removed many of the features of the old search, because I discovered that people weren't really using the filters or anything. We just delivered a very basic search, looked at the key metrics, and made sure that nothing exploded. As long as those metrics were fine, it provided a baseline so the team could continue learning. So early on, what I did was protect the team a lot so they could learn and get up to speed: learning how to use Elasticsearch, how to build out our backend, understanding what use cases we were solving in search. There are many use cases we serve in search. Most people think search is just putting in a query, but there's a big difference between searching for a single word versus searching for phrases, because the system has to understand them differently. For example, if you search for two words, like "managing resources", interchanging the order of those words would change the results. It's very nuanced. Managing that team really means protecting the time they need to learn, and also trying to align the outcome with each thing they're learning. They're not just learning how to solve a customer problem; they're also learning how to scale out our platform and backend, how to improve relevance, how to build the first learning-to-rank model. For me, as the leader of that team (eventually I did get promoted and became Director of Discovery), I was always protecting the team when I felt they needed space to just figure things out. That was a long answer, but it was really hard.
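As a rough illustration of the word-order nuance, here is how the distinction might look with the Elasticsearch Python client (8.x-style API). The local cluster, the `courses` index, and the `title` field are assumptions:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster for the sketch

# A plain `match` query treats "managing resources" as a bag of words, so
# reversing the word order retrieves largely the same documents. A
# `match_phrase` query requires the terms in that order, which changes
# the result set: the nuance described above.
bag_of_words = es.search(index="courses", query={"match": {"title": "managing resources"}})
exact_order = es.search(index="courses", query={"match_phrase": {"title": "managing resources"}})
```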

Randy Silver: 10:38

It's a good intro. So I'm curious: you started off with a problem space, the problem of search and discovery, and you wanted to make it better, and you got to the point of saying machine learning was the ideal way of doing it. You could go the other way: take a look and say, "I want to do machine learning, and therefore I need to find a problem for it." But in this case you started with the problem, and you decided that this was the best solution. How did you and the team decide that machine learning was the right approach?

Ha Phan: 11:09

Well, you have to think about the application for machine learning, right? Sometimes you see product teams that are led by engineers (not that all products led by engineers are bad, just to put that out there), and they're building the thing for the thing's sake; they're not thinking about the problem. In terms of where we use machine learning, for search it was very clear. If you research how other people build search, one of the fundamental questions is how you improve relevance. There's a model called learning to rank, which means the system improves the ranking of results based on user behaviour. For us, we had to ask ourselves which signals we give this model so that it can improve the ranking. You can say, well, if users are clicking more on this result, it gets a higher ranking; that's a very simplistic way of putting it. Or better yet, you can say, if users are clicking on this result and then consuming a certain amount of the content, then the ranking for this result should be higher. Each time that happens, the model gets smarter, and it rearranges the results based on user behaviour. That's just one signal; there are many others, where you're saying this gets a higher score than that, and it kind of works itself out. I used that example because it's an obvious problem for a model to solve. There might be other problems where you have to think about how you might use machine learning. When people discover things, they either know what they're looking for, in which case they'll put in a query, or sometimes they don't know what they don't know, right? You're in this new space, you're learning. I think a lot of times when you have these exploratory use cases, that's where recommendation comes in. Recommendation can recommend content, or it can recommend topics. So you can ask yourself questions like: how do we get users to input more granular queries? But if you talk to beginners who are learning, they don't know what queries to put in, so then you have recommended topics or related queries; that's where that comes from. But then you have to define what a topic is. That's a whole other thing, an ontology: how do you know that chicken is related to, I don't know, turkey, for example? You have this ontology that informs you that these topics are related. I'll give you another example of where creativity comes in, and where qualitative research informs what problems machine learning can solve. I was talking to our users, and we have many kinds of users on our platform. I forgot to mention the context: this is Pluralsight, an e-learning platform for technologists. We have many kinds of learners. We have beginner learners, and we asked ourselves, what does a beginner learner mean? So we talked to learners.
If you're a practising engineer, a senior engineer who's been coding and solving problems for a while, and you learn a new language, you're not going to watch some overview course at the beginning; you can dive right in and try things by coding. But talk to someone new to coding, and they're going to watch some overview course. Where they start is very different depending on the context of where they begin. I learned this when I was talking to users, and when I brought it up, a data scientist said, okay, we're going to create an automated playlist for each of these contexts. We talked about it, I did this chicken scratch on a whiteboard, and the next week the data scientist came back with a prototype of these automated playlists: one for practitioners, one for beginners. So that's where you understand very concisely what the problem is, and then you bring in the model to help solve it.
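A minimal sketch of the click-plus-consumption signal described above, turning user behaviour into learning-to-rank training labels. The 25% consumption threshold and the label scheme are assumptions for illustration:

```python
def ltr_label(clicked: bool, fraction_consumed: float) -> int:
    """Turn user behaviour on one search result into a training label."""
    if clicked and fraction_consumed >= 0.25:
        return 2   # strong signal: clicked and actually consumed the content
    if clicked:
        return 1   # weak signal: clicked but bounced
    return 0       # shown but ignored

# Each (query, result) impression becomes a labelled example; a
# learning-to-rank model re-ranks results as these labels accumulate.
examples = [
    ("python basics", "course-123", ltr_label(True, 0.60)),
    ("python basics", "course-456", ltr_label(True, 0.02)),
    ("python basics", "course-789", ltr_label(False, 0.0)),
]
```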

Randy Silver: 15:55

Okay, that's an interesting example: you were really concise about the problem, but not the solution. A lot of the time, when product managers work with their engineers, we spend a lot of time in the problem space, defining it, but then we also spend a lot of time designing the solution with designers and engineers, going through estimation and everything else. The times I've talked to people who work in AI or ML, it's very different. As you just said, they went away and came back with an idea of how to do it. And I've had issues where people don't know how to estimate this; they don't know how long anything's going to take; they try it and see what happens. Is that what it's like? What's the difference between working in this space and a more traditional approach to designing solutions?

Ha Phan: 16:45

Well, first of all, you're kind of asking the wrong person, because I don't believe in the traditional approach, the whole waterfall-ish process where design does all this stuff up front. There's some of that, I think, that informs your long-term vision, but your iterations might not be quite the same. Regardless of whether you have an ML-powered product or not, I believe in continuous discovery: there's always a parallel track where we're doing continuous discovery, whether it's A/B testing, data analysis, or qualitative testing. And this is even more important when you're working on a data product, when you have an ML-powered product. On our team, we always have data scientists and machine learning engineers who are trying out different models. They're trying out new learning-to-rank models, they're testing things out, and sometimes they build prototypes that we look at and go, well, those false negatives make that model look kind of stupid, so how do we improve it? When you think about how you build software products, the big unknown is market fit, right? You really don't know whether it solves the problem, but you understand enough of the feasibility, because you've done it before or seen other people do it. But when you work with ML and AI, you don't really know what the model can do; you don't really know the data yet. There are a lot of unknowns, so you have to have time to figure them out, and time to figure out what the output of the model looks like. Let me think about this for a second. Okay, here's a good example. Imagine you're working on the Not Hotdog app from Silicon Valley, the HBO show. Did you guys watch it? I think a lot of people did. In one of the episodes, one of the engineers creates an app that can detect whether something presented in front of it is a hot dog or not. You take a picture of a thing, and the app says, this is a hot dog, or it's not a hot dog. If you have a sucky model, it gets it wrong too many times, and then users basically don't trust the app anymore. Somebody actually built this app, and I used it to test it: I took a picture of a Chinese sausage to see if it could infer that this was not a hot dog, and it guessed correctly. So I guess what I'm saying is that when you work on machine learning, the guesses have to be pretty good; there's a threshold of "pretty good" below which the user won't trust the model.
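As a toy sketch of that threshold idea: show an answer only when the classifier is confident, and abstain otherwise. `model.predict_proba` stands in for any real image classifier, and the 0.8 cutoff is an assumption:

```python
CONFIDENCE_THRESHOLD = 0.8

def classify_photo(model, image) -> str:
    # Hypothetical interface: P(image is a hot dog) in [0, 1].
    p_hotdog = model.predict_proba(image)
    if p_hotdog >= CONFIDENCE_THRESHOLD:
        return "Hotdog"
    if p_hotdog <= 1 - CONFIDENCE_THRESHOLD:
        return "Not hotdog"
    # Better to abstain than erode trust with too many wrong answers.
    return "Not sure"
```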

Randy Silver: 20:08

If 2022 is the year you're looking to advance your career, expand your network, get inspired, and bring the best products to market, then join Mind the Product for their next conference this May.

Lily Smith: 20:21

At MTPcon San Francisco + Americas, you'll soak up invaluable insights from an epic lineup of the best in product, covering a range of topics that will challenge and inspire you to step up as a product manager.

Randy Silver: 20:34

You've got the option to go fully digital for both days, or get the best of both worlds with a hybrid ticket: digital on day one and in person at SFJAZZ in San Francisco on day two. I was at the most recent edition of this event in London last year, and it was just awesome.

Lily Smith: 20:52

Get tickets now at mindtheproduct.com. So I guess, is that one of the challenges when building a machine learning product, how you measure its success?

Ha Phan: 21:10

Yeah, I think it depends on your use case. Like any other product, you're defining your KPIs. For our team, we measure search by content start rate, which is the rate at which users engage with content across all of the discovery experiences. But it also depends on the kinds of questions we're asking, because we define a query in many different ways. We define a query as something a user inputs into a query box, but we also surface topics that users can click on, and that triggers a search; in many different places, we recommend topics. So we utilise data within search in many different ways. We surface queries in different places, like most popular queries, or we surface topics, which are tags. We collect data, and then, depending on the question we want to ask, we can figure out engagement from different pages: do users go into browse and look at a topic, where that topic is a query, or do they put their query into the search box? It just depends on the question we're asking. But generally, for us, the holy grail is always the content start rate: did the user get into the content?
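A minimal sketch of the content start rate KPI as defined above: of the sessions that touched a discovery surface, how many got into content. The event names are hypothetical:

```python
def content_start_rate(events: list) -> float:
    """events: [{'session': 's1', 'type': 'search' | 'topic_click' |
    'recommendation_click' | 'content_start'}, ...]"""
    discovery = {e["session"] for e in events
                 if e["type"] in ("search", "topic_click", "recommendation_click")}
    started = {e["session"] for e in events if e["type"] == "content_start"}
    if not discovery:
        return 0.0
    # Fraction of discovery sessions that actually got into content.
    return len(discovery & started) / len(discovery)
```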

Lily Smith: 22:54

I did a short stint at a personalised search and recommendations business, and when you think about search and discovery, generally we're talking about the search bar, where you're inputting a query, like you say, and then content recommendations. Where do you see machine learning going in the future? Do you think there's a future where navigation, the whole structure, the whole information architecture of the site, begins to learn about you, in the way that Netflix does?

Ha Phan: 23:30

Yeah, it's already happening. In terms of navigation, most of the time we think of static things: this is the homepage, this is the browse page, this is the profile. Those are static. But topical exploration should be data-driven. Take Amazon, for example. Amazon might start out with books, so that's one kind of topic, and then Amazon adds more topics, so the list of topics changes. The other thing is that language changes. You might call a topic "chicken", but someone else calls a chicken a "hen". The system has to know that when you say "hen", you mean "chicken". So there's this ontological structure behind everything that basically understands what the user means when they say X, and it also enables us to provide dynamic topical exploration. There's this robust data structure that's always powering discovery, and I think that's really the most difficult part of building discovery experiences.
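A tiny sketch of that ontological structure: a synonym map that resolves what the user meant, plus a relatedness map that powers topical exploration. The entries are placeholder examples, not a real ontology:

```python
SYNONYMS = {"hen": "chicken"}              # language drift: two names, one concept
RELATED = {"chicken": ["turkey", "duck"]}  # ontology: which topics relate to which

def expand_query(term: str) -> dict:
    """Resolve a user's term to its canonical topic and related topics."""
    canonical = SYNONYMS.get(term, term)
    return {"canonical": canonical, "related_topics": RELATED.get(canonical, [])}

print(expand_query("hen"))  # {'canonical': 'chicken', 'related_topics': ['turkey', 'duck']}
```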

Lily Smith: 24:55

Yeah, I think that's really fascinating, and so is the direction it's going in. And I guess that's one of the other things I was thinking: as a product manager today, if we need search on our site, and content discovery or product discovery or whatever it is, not all of us are blessed with being able to lead a team of engineers to develop a solution. Do you think there are enough good out-of-the-box tools out there that enable us to bring some of this into our products? Or do you think that's a long way off right now?

Ha Phan: 25:36

I don't think there are, and I think there's a lot that we don't see. When you build other experiences, you see the UI; you can see enough to say, this is what it does, this is how things are structured. But when you build discovery experiences, much of the experience is invisible. 80 or 90% of your roadmap is invisible things that make all the difference in terms of the user experience and how the platform scales. So no, there aren't enough tools, and there will never be enough tools for that. I think a product person who runs these kinds of products must build in space for teams to find their path to clarity. We do that by prototyping, and by having regular meetings with each other. For example, every two weeks on my team we have a meeting between product, design, and data science, and it's amazing: data science comes and shows us their prototypes, anything they can think of that might provide value. Sometimes it's us coming to them with designs: hey, we think this is a good idea, we don't know how it works, but maybe there's something here. There are a lot of those kinds of sessions where we work collaboratively. I've also given the team a week to just build quick-and-dirty prototypes, just to see what we got. So there's some R&D involved. In order to reach a shared understanding of what you can do and what problems you can solve, you need those spaces.

Randy Silver: 27:34

So in these meetings, the data scientists and machine learning engineers come in and present something, and you say, it's interesting, it's getting there, but it's not quite what you need. In terms of prioritising the amount of time you're giving them, how do you know if it's something that would get there if you gave them another week?

Ha Phan: 27:57

Yeah, another week? No, no, they tell us. Usually, as a product leader, I try very, very hard not to promise anything, especially if there are a lot of unknowns. If there are a lot of unknowns, I've been known to be very vague; I'm the master of vague. Like anything, if you have more information, you're more confident giving more precise dates. If not, I'll come up with some really vague OKR. It's just psychological safety: I don't want to put my team in a position where they can't deliver, especially when there are so many unknowns. But what we normally do is have some broader goal, and then we place many bets towards it, like: just improve the content start rate. We think about different contexts, and then we usually brainstorm the roadmap with engineering and other technical folks, on a Miro board, thinking about what our biggest bets are. Usually, the data science folks will tell us, hey, this is going to take this much time to bring to production. But I usually try to run some kind of cheap test so we don't have to productise the model. We do a canned version: for example, with related searches, we just did canned related searches based on that model, just for the most popular queries, to test it. And having done this for a couple of years now, when you work with data products, it's best to release and get the data live. It has never hurt us to release. The more data you can get in a quicker amount of time, the better.
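A sketch of the "canned" cheap test described above: run the related-searches model once, offline, for only the most popular queries, and serve the frozen table live. `model.related` is a hypothetical interface:

```python
from collections import Counter

def can_related_searches(model, query_log: list, top_n: int = 100) -> dict:
    """Precompute related searches for only the most popular queries."""
    popular = [q for q, _ in Counter(query_log).most_common(top_n)]
    # Run the model once, offline; the frozen table is what ships.
    return {q: model.related(q) for q in popular}  # model.related is hypothetical

# At serve time there is no model in the request path, just a lookup:
#   related = canned.get(user_query, [])
```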

Lily Smith: 29:57

How do you handle confirmation bias with machine learning?

Ha Phan: 30:02

Oh, that's a tough one. I don't know a way around this, and let me explain why. When you build models, you need data, so the bias is built into what's already there. Let me give you an example. We have this thing on our site called a path, a learning path. A learning path is a sequence of courses that you take to learn a topic; that's the simple explanation. When users are trying to learn something for the first time, they sometimes just use a path, and then they'll go sequentially, one course after the next. So if I build a recommendation model to say "what should I learn next?", it's going to come up with a sequence that looks like the path, because that's what most people do, right? Another example: I don't think it's an accident that our top 100 queries match our tags. When we build the model, when we build the platform, how we order things, how we funnel users through their workflows, there's already intrinsic bias; it's a point of view. You can't get away from that, unless you're building something brand new, some brand-new model. But when you get the data from somewhere, that data has bias, because it was built for a certain purpose. So I can't answer your question, because I don't know if you can get away from bias.
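To make the path bias visible, here is a toy "what should I learn next?" model built from next-course transition counts; because most histories follow the curated path, the model simply reproduces the path's order. The data is illustrative:

```python
from collections import Counter, defaultdict

histories = [
    ["intro", "basics", "advanced"],  # most learners follow the curated path...
    ["intro", "basics", "advanced"],
    ["basics", "testing"],            # ...a few deviate
]

# Count course-to-next-course transitions observed in the data.
transitions = defaultdict(Counter)
for history in histories:
    for current, nxt in zip(history, history[1:]):
        transitions[current][nxt] += 1

def recommend_next(course):
    counts = transitions.get(course)
    return counts.most_common(1)[0][0] if counts else None

print(recommend_next("intro"))  # 'basics': the model has learned the path, bias and all
```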

Randy Silver: 31:45

Okay, I'm going to ask you one last question, and it's going to be an easier one, a question I hope you can actually answer, for us to go out on. For product managers, or product people who are coming into the space: what's one thing they need to know when they're starting to work in the ML space?

Ha Phan: 32:12

I don't know how to answer that question, I really don't. But I think it's the ability to see the signals in the system. I'll give you an example, and I think most people don't see this; you have to learn to see it. Remember when the iPhone first released Live Photos? Before Live Photos, a photo was just one single photo, right? When Live Photos came out, you had a moment, which is a clip, a moment in time; I think it's 60 frames, because it's about a second. So the definition of a moment changed. Now there's a key frame. When you look at that, you shouldn't just look at it as an interaction; you start to look at the structure of the metadata behind the thing. For example, if you look at the key frame, you know there's a key frame there, and the key frame is part of this bigger thing. The reason it's important to see this as a data structure is that, if I see that, I can ask questions like: is the key frame a face? Then I could infer, with facial recognition, that I could build something off of that; or maybe this whole moment is an action shot. Then I can think about how to utilise all those pieces of data. So that's why I can't answer the question of what each person needs to know coming into this, but I know that when I looked at an interaction, I looked at the data structure behind it, every minute thing. And I always say that the possibilities of what you can do with a thing depend on its metadata. Sorry, that sounds really ridiculous, but...
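A sketch of the "metadata behind the interaction" point: once a moment is a data structure with a key frame, new questions (is the key frame a face? is the clip an action shot?) become askable. All fields and detector hooks here are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Moment:
    frames: list          # the short clip captured around the shot
    key_frame_index: int  # which frame stands in for "the photo"

    def key_frame(self):
        return self.frames[self.key_frame_index]

def possible_features(moment: Moment, detects_face, detects_motion) -> list:
    """What you can build depends on the metadata; detectors are passed in."""
    features = []
    if detects_face(moment.key_frame()):
        features.append("facial recognition on the key frame")
    if detects_motion(moment.frames):
        features.append("action-shot highlight from the clip")
    return features
```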

Lily Smith: 34:18

No, not at all. I love the passion behind what you're saying. I can tell that you really enjoy the work that you do and get excited about it. So thank you so much for joining us and talking about it today. It's been really great talking to you.

Ha Phan: 34:35

Thank you, and thank you for your patience, because I have a really hard time explaining this to a lot of people.

Lily Smith: 34:42

No, it's been fantastic. The Product Experience is the first and the best podcast from Mind the Product. Our hosts are me, Lily Smith, and me, Randy Silver. Louron Pratt is our producer, and Luke Smith is our editor.

Randy Silver: 35:10

Our theme music is from Hamburg-based band Pau. That's P-A-U. Thanks to Arne Kittler, who curates both ProductTank and MTP Engage in Hamburg, and who also plays bass in the band, for letting us use their music. You can connect with your local product community via ProductTank, regular free meetups in over 200 cities worldwide.

Lily Smith: 35:31

If there's not one near you, maybe you should think about starting one. To find out more, go to mindtheproduct.com/producttank.

