About a year ago, Pinterest realized its searches didn’t work.
Users could type in something like “women’s hairstyles” and get results, but the results tended to heavily favor images of white people. So the Pinterest team tried something new: filtering search results by skin tone.
The feature rolled out in late April and gets prime real estate when users search for certain beauty-related topics, like hairstyles and makeup tutorials. It’s not perfect, Pinterest’s head of discovery product Omar Seyal tells Quartz, but the feature has increased engagement with a more diverse set of posts on the site, and is giving exposure to a group of creators who didn’t traditionally get top billing in search.
Bias in search algorithms plagues other sites, of course. Just look up “doctor” in Google Images.
Quartz spoke with Seyal about the algorithmic bias that skews search results in the first place, as well as how inclusivity can create a better social platform.
Quartz: What was the thought process behind adding something like skin tone search?
Seyal: The results when you searched for hairstyle ideas were broken. Forget the metrics: if you just looked at Twitter for “Pinterest hairstyle search,” you found users asking, “Why does Pinterest just show white women when I’m searching?”
As someone who knows how search or recommendations work, you know that, to oversimplify, these kinds of systems create a popular vote over time, where the dominant set of users votes some set of content to the top. You then realize a) we have actual users who are complaining about this problem, and b) fundamentally, the way this thing is built creates a destructive experience for users.
Q: When you say only white women were coming up in search, that’s something that happens pretty often in search. On Google Images, for example, you’ll type the word “doctor” and get images of white men in lab coats. Do you see what you’re doing as something that can have broader implications for more inclusive search?
S: Yeah, I’d like to. A really common strategy for making recommendation or search results better over time is called collaborative filtering. So you take all the people who make a query, and you take all of their engagement, and you use their engagement to vote content up or down. So with all the people who ever search for “CEO,” if most of them click on white men, those results float to the top. What you end up realizing is, and I’m not going to speak to other platforms, but on Pinterest specifically we don’t take an opinion of “this is the best and this is the worst,” but instead we’re taking an opinion on “this is the set of things you should explore to understand this query.” And it’s a huge imperative for us to show variety and show diversity and show a range of options so that people can explore and get to something explicitly relevant for them. So for us, if you type in “women’s hairstyles” and all you get are options or ways of ending up at hairstyles for white women with brown hair, then we’re failing.
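To make that dynamic concrete, here is a minimal sketch, with made-up data and names; it is not Pinterest’s actual system. It shows how engagement-weighted ranking lets the majority’s clicks crowd out everything else, and how a simple round-robin over an attribute like skin-tone range surfaces a broader set of results:

```python
from collections import defaultdict

# Hypothetical engagement log of (query, result_id, skin_tone_group) clicks.
# Everything here is illustrative; it sketches the "popular vote" dynamic
# Seyal describes, not Pinterest's code.
clicks = [
    ("womens hairstyles", "pin_a", "light"),
    ("womens hairstyles", "pin_a", "light"),
    ("womens hairstyles", "pin_b", "light"),
    ("womens hairstyles", "pin_c", "dark"),
]

def rank_by_engagement(clicks, query):
    """Pure popularity ranking: whatever the majority clicks floats to the top."""
    votes, groups = defaultdict(int), {}
    for q, result, group in clicks:
        if q == query:
            votes[result] += 1
            groups[result] = group
    ranked = sorted(votes, key=votes.get, reverse=True)
    return [(r, groups[r]) for r in ranked]

def rank_with_diversity(clicks, query, wanted_groups):
    """Round-robin across skin-tone groups so each one is represented up top."""
    by_group = defaultdict(list)
    for result, group in rank_by_engagement(clicks, query):
        by_group[group].append(result)
    mixed = []
    while any(by_group[g] for g in wanted_groups):
        for g in wanted_groups:
            if by_group[g]:
                mixed.append(by_group[g].pop(0))
    return mixed

print(rank_by_engagement(clicks, "womens hairstyles"))
# [('pin_a', 'light'), ('pin_b', 'light'), ('pin_c', 'dark')]
print(rank_with_diversity(clicks, "womens hairstyles", ["light", "dark"]))
# ['pin_a', 'pin_c', 'pin_b'] -- the less-clicked group still gets visibility
```

The first ranking shows the failure mode Seyal describes: the majority’s clicks decide the whole canvas. The second trades a little raw popularity signal for guaranteed representation across groups.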
Q: Was there anything surprising when you and the team implemented this?
S: The best thing to predict engagement is engagement, so I expected, in doing this, to take a bit of a metrics hit, or for this to not be as engaging a feature. And when we built it, I was kind of shocked to learn that it was actually more engaging. This means that taking this strategy of diversity, when done right, actually produces more engagement, or more satisfaction with the platform. You kind of understand that intuitively, but as a person who works with metrics over time, you can be a bit shocked to learn that you can do something that seems to run counter to engagement and actually improve engagement.
Q: And you’re not targeting ads or predicting ethnicity through skin tone at all, so how do you plan on walking the line between helpful and invasive?
S: One of the solutions that someone at one point floated was “what if we just personalize the results to be just hairstyle results for you?” The reality is that, for one, I think that’s possibly illegal if you’re actually classifying a person’s race, and it also breaks the use case. The use case is to allow anybody to explore any hairstyle on any skin tone. So while that opens up space for someone who doesn’t see what they want, it also allows people like a hairdresser to see the entire breadth of things. A hairdresser’s going to have customers with every single skin tone, so you can imagine they’re going to want to see every single range, so personalizing to their skin tone wouldn’t work very well.
Q: So it’s really a more in-depth search. I think that speaks to why it was more engaging, as it serves more of your users, right?
S: What I would like to do over time, now that we’ve created new space with these skin tone range filters, is to use what is popular and good within those and slowly mix them into the default canvas [search results]. But I’d never want to move to a place where we personalize the default canvas to our best guess of what skin tone you are, because, again, that would cut against us trying to show you the breadth of the world.
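One way to picture that mixing step (a hypothetical sketch, not Pinterest’s algorithm, with invented pin names): periodically promote the best not-yet-shown result from each skin-tone filter canvas into the default ranking, rather than personalizing the default canvas to a guessed skin tone.

```python
def blend_into_default(default_ranking, filter_rankings, every=3):
    """Every `every` positions, promote the top unseen result from each
    skin-tone filter canvas into the default canvas. Illustrative only."""
    blended, seen = [], set()
    filter_iters = [iter(r) for r in filter_rankings.values()]
    for position, item in enumerate(default_ranking):
        if position and position % every == 0:
            for it in filter_iters:
                promoted = next((c for c in it if c not in seen), None)
                if promoted is not None:
                    blended.append(promoted)
                    seen.add(promoted)
        if item not in seen:
            blended.append(item)
            seen.add(item)
    return blended

default = ["pin_1", "pin_2", "pin_3", "pin_4", "pin_5", "pin_6"]
filters = {"deep": ["pin_9", "pin_2"], "tan": ["pin_7"]}
print(blend_into_default(default, filters))
# ['pin_1', 'pin_2', 'pin_3', 'pin_9', 'pin_7', 'pin_4', 'pin_5', 'pin_6']
```

The default canvas stays the same for everyone; the promoted results come from what performed well inside each filter, not from a guess about the searcher.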
Q: Something I hear a lot is that unlabeled data is a challenge for diversity. Like, in healthcare and dermatology, they say there aren’t enough labeled images of people with lesions on darker skin. Do you agree with that?
S: This is what was shocking about it. In creating these skin tone filters, there now exist a number of incredibly small canvases, an incredibly small amount of data to choose from. In some cases, you’re like “Ugh, these aren’t so great.” But the reality is that if you give people an opt-in experience, where they can get to what they want, even if it’s not perfect, the default experience is so bad, so biased, that it is better for them.
We always say there’s not enough labeled data or not enough whatever, but the context there is to provide a perfect solution. In most cases there is enough labeled data to make a better solution for a minority group. That was kind of the eye-opening thing. Improving the experience a bit and showing that attention to the group of people who have this intent does two things: it makes them incrementally happier, and it actually creates more room in your system. Now there’s a place for creators who create this sort of content to get it surfaced and engaged with on equal and possibly higher footing. In my head I probably would have committed to the same labeling assumption. I guess what I learned was that the labeling assumption is predicated on building the perfect solution.
You don’t have to start with perfect. Over time you build perfect.
This interview has been edited and condensed for clarity.