Month: August 2018

Addressing bias by bringing in diversity and inclusion in search

When it comes to Web search, the Matthew Effect (the rich get richer) applies quite well. Things that show up at the top of a ranked list are likely to get clicked more, and as they get clicked more, they stay at the top. Of course, that’s not bad in itself; if certain things are relevant to one’s query every time, why not have them show up at the top? But this becomes problematic when objectionable and deceptive items get to the top for some reason.

Think about sensational news. People like it; more specifically, people like clicking on it. When the title of a story says “You won’t believe what NASA is hiding”, most people are naturally enticed to click on it. It doesn’t matter whether that story actually has anything substantive in it; it got clicked and, in many instances, shared. This all probably started with someone searching for ‘NASA’. Or, increasingly, with such a story coming up as paid content (an ad) next to a real story. A search system then treats the click as a signal of relevance, and the next time this story gets an even more prominent position, thus starting the vicious cycle of the Matthew Effect. We are victims of our own curiosity and our desire for attention from others!
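To make the cycle concrete, here is a minimal sketch of a ranker that reorders results purely by accumulated clicks. It is a toy model under stated assumptions, not how any real search engine works: the function name `simulate_click_feedback`, the item names, the per-item "appeal" values (how tempting a title is to click), and the position-bias numbers are all hypothetical. The model is deterministic; it adds expected clicks each round instead of sampling them.

```python
def simulate_click_feedback(initial_ranking, appeal, rounds=20,
                            position_bias=(0.5, 0.3, 0.15, 0.05)):
    """Toy expected-value model of a click-driven ranker (illustrative only).

    initial_ranking: item names, best position first.
    appeal: item name -> chance a user clicks the item once they notice it.
    position_bias: chance a user notices the item at each rank (assumed values).
    """
    if len(initial_ranking) > len(position_bias):
        raise ValueError("need one position-bias value per ranked item")

    clicks = {item: 0.0 for item in initial_ranking}
    ranking = list(initial_ranking)

    for _ in range(rounds):
        # Expected clicks this round depend on where the item sits
        # and on how tempting its title is, not on its substance.
        for position, item in enumerate(ranking):
            clicks[item] += position_bias[position] * appeal[item]
        # The ranker treats accumulated clicks as its only relevance signal.
        ranking = sorted(ranking, key=lambda item: clicks[item], reverse=True)

    return ranking, clicks


# A sensational story shown in second place, e.g. as an ad next to a real story.
start = ["real_story", "clickbait", "background", "archive"]
appeal = {"real_story": 0.4, "clickbait": 0.9, "background": 0.4, "archive": 0.4}

final_ranking, final_clicks = simulate_click_feedback(start, appeal)
print(final_ranking)
```

With these assumed numbers, the sensational item overtakes the real story after the first round and then keeps the top spot, because the top position itself brings in the most clicks. If every item has the same appeal, the starting order never changes, which is the Matthew Effect in its plainest form.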

There are many instances of this and other kinds of intentional and unintentional bias in today’s Web search, social media, and recommender systems, as people resort to dirty SEO (search engine optimization) techniques and click-bait, or simply want to spread rumors and “fake news”. It’s becoming an increasingly frustrating problem for service providers and users and, even more seriously, a threat to our democracy and free speech (and, some may even say, free will).

There is no single solution for this. But we have to try all that we can to fight such issues of bias and fairness on the Web. For my part, I have been addressing this through the use of collaboration: specifically, bringing diversity and inclusion into search. For example, in 2007-2008, as a part of my work with Jeremy Pickens and the late Gene Golovchinsky at FXPAL, I looked at having people with different backgrounds and skillsets work together in collaborative search. This was with algorithmic mediation. During 2008-2010, I continued exploring this idea from the user side (people identifying and leveraging different roles). With my doctoral student Roberto Gonzalez-Ibanez, I then worked on identifying opportunities for collaboration in search (2010-2012), so we could start creating synergies instead of biases. In 2013-2014, I worked with Laure Soulier and Lynda Tamine from IRIT in France to bring the system side and the user side together in an attempt to have people take on diverse roles while working on search tasks.

While this work on collaboration was going on, I was also exploring a parallel thread on social information seeking. This is where I studied how communities could come together to create more synergistic solutions, using the “wisdom of the crowd” idea rather than relying on any single individual’s ideas and opinions. This thread has run from 2008 to the present day.

Now, we have started a new thread in my lab that is aimed at addressing bias in search in an individual searcher’s setting. It’s not easy. First off, the notions of bias and fairness are not clearly defined, and our own understanding of them keeps evolving. Second, there are several layers of complexity here that involve data as well as the algorithms for ranking and organizing information, and it’s not always clear which of these layers is responsible for introducing bias. Third, there is a large amount of personalization in pretty much everything we see and consume on the Web today. Since this layer of personalization is, as the name suggests, personal, it’s hard to peel it off.

Still, I think this issue of bias and fairness on the Web is very important, and together with my colleagues and students, I am continuing to invest a lot of effort in this problem. I have been working on it for more than a decade from many different angles, and I feel like we are just getting started!