Creating a fairer search engine

It’s becoming increasingly important to understand, evaluate, and perhaps rethink our search results as they continue to show bias of various kinds. Given how much of our decision-making relies on search engine results, this is a problem that touches almost all aspects of our lives. Read about some of our new work in a new article by InfoSeekers Ruoyuan Gao and Chirag Shah:

Gao, R. & Shah, C. (2020). Toward Creating a Fairer Ranking in Search Engine Results. Information Processing & Management (IP&M), 57(1).

With the increasing popularity and social influence of search engines, various studies have raised concerns about the presence of bias in search results and the social responsibilities of IR systems. As an essential component of a search engine, ranking is a crucial mechanism for presenting search results or recommending items in a fair fashion. In this article, we focus on top-k diversity fairness ranking in terms of statistical parity fairness and disparate impact fairness. The former provides a balanced overview of search results in which the numbers of documents from different groups are equal; the latter provides a realistic overview in which the proportion of documents from each group reflects its overall proportion. Using 100 queries and the top 100 Google results per query as the data, we first demonstrate how topical diversity bias is present in the top web search results. Then, with our proposed entropy-based metrics for measuring the degree of bias, we reveal that the top search results are unbalanced and disproportionate to their overall diversity distribution. We explore several fairness ranking strategies to investigate the relationship between fairness, diversity, novelty, and relevance. Our experimental results show that, using a variant of the fair ε-greedy strategy, we can bring more fairness and diversity to search results without sacrificing relevance. In fact, relevance and diversity can both improve when diversity fairness is introduced. Additional experiments with TREC datasets containing 50 queries demonstrate the robustness of our proposed strategies and our findings on the impact of fairness. We present a series of correlation analyses of fairness and diversity, showing that statistical parity fairness correlates highly with diversity while disparate impact fairness does not. This provides clear and tangible implications for future work that aims to balance fairness, diversity, and relevance in search results.
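
The exact metrics and the fair ε-greedy variant are defined in the article itself. Purely as a rough, hypothetical illustration of the two ideas, here is a minimal Python sketch (not the authors’ implementation) of an entropy-based balance measure over group labels and an ε-greedy re-ranker that occasionally promotes documents from the currently least-represented group; the function names and data layout are invented for this sketch.

```python
import math
import random
from collections import Counter

def group_entropy(groups):
    """Normalized Shannon entropy of group labels in a ranked list.

    1.0 means groups are perfectly balanced; values near 0.0 mean one
    group dominates. A generic entropy measure, not necessarily the
    exact metric used in the paper.
    """
    counts = Counter(groups)
    n = sum(counts.values())
    k = len(counts)
    if n == 0 or k < 2:
        return 0.0
    h = -sum((c / n) * math.log(c / n, 2) for c in counts.values())
    return h / math.log(k, 2)  # divide by maximum possible entropy

def fair_epsilon_greedy(ranked, k=10, epsilon=0.2, seed=0):
    """Pick a top-k list from `ranked`, a relevance-ordered list of
    (doc_id, group) pairs: with probability epsilon take the best item
    from the least-represented group so far, otherwise take the next
    most relevant item.
    """
    rng = random.Random(seed)
    remaining = list(ranked)
    result = []
    while remaining and len(result) < k:
        if result and rng.random() < epsilon:
            counts = Counter(g for _, g in result)
            target = min({g for _, g in remaining},
                         key=lambda g: counts.get(g, 0))
            pick = next(i for i, (_, g) in enumerate(remaining) if g == target)
        else:
            pick = 0  # most relevant remaining item
        result.append(remaining.pop(pick))
    return result
```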

Connecting information need to recommendations

A new article published by InfoSeekers Shawon Sarkar, Matt Mitsui, Jiqun Liu, and Chirag Shah in Information Processing & Management (IP&M) shows how behavioral signals from a user in a search episode can be used to explicate their information need, the problems they perceive, and the help they may need.

Here are some highlights.

  • The amount of time spent on earlier search results can indicate potential problems in the next stage of the search process: difficulty articulating needs as queries, perceiving results as useless, and not finding useful sources.
  • While performing social tasks, users mostly searched with an entirely new query, whereas for cognitive and moderately to highly complex tasks, users used both new and substituted queries.
  • From users’ search behaviors, it is possible to predict the problems they are likely to face next.
  • Search behaviors can be mapped to a searcher’s situational need, along with their perceived barriers and needed help, at different stages of the search process.
  • By combining perceived problems with search behavioral features, it is possible to infer the help users need with a reasonable level of accuracy (78%); see the sketch after this list.
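
The article describes the actual features, models, and evaluation behind that 78% figure. Purely as a hypothetical illustration of the idea, the sketch below trains a generic classifier on invented session-level behavioral features (dwell time, query counts, clicks) to predict a made-up “help needed” label; none of the feature names, labels, or numbers come from the paper.

```python
# Hypothetical sketch: predicting the kind of help a searcher needs from
# session-level behavioral features. All features, labels, and values are
# invented for illustration; the actual features and models are in the paper.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Each row: [dwell_time_on_results_sec, num_queries, num_clicks, num_new_queries]
X = np.array([
    [12.0, 5, 1, 4],
    [95.0, 2, 6, 1],
    [40.0, 3, 2, 2],
    [ 8.0, 6, 0, 5],
    [70.0, 2, 4, 1],
    [15.0, 4, 1, 3],
    [88.0, 1, 5, 1],
    [30.0, 3, 2, 2],
])
# 1 = needs help articulating the query, 0 = needs help assessing sources
y = np.array([1, 0, 1, 1, 0, 1, 0, 0])

clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, y, cv=4)  # accuracy per fold
print("cross-validated accuracy:", scores.mean())
```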

Read more about it at https://www.sciencedirect.com/science/article/pii/S0306457319300457

A new NSF grant for explainable recommendations

Dr. Yongfeng Zhang from Rutgers University and Dr. Chirag Shah from the University of Washington are recipients of a new grant from NSF (3 years, $500k) to work on explainable recommendations. It’s a step toward curing the “runaway AI”!

https://www.nsf.gov/awardsearch/showAward?AWD_ID=1910154

Recommendation systems are essential components of our daily life. Today, intelligent recommendation systems are used in many Web-based systems, providing personalized information to support human decisions. Leading examples include e-commerce recommendations for everyday shopping, job recommendations for employment markets, and social recommendations to make people better connected. However, most recommendation systems merely suggest recommendations to users; they rarely tell users why such recommendations are provided. This is primarily due to the closed nature of the algorithms behind these systems, which are difficult to explain. The lack of good explainability sacrifices transparency, effectiveness, persuasiveness, and trustworthiness in recommendation systems. This research will allow personalized recommendations to be provided in more explainable manners, improving search performance and transparency. The research will benefit users of real systems through the researchers’ industry collaborations with e-commerce and social networks. New algorithms and datasets developed in the project will supplement courses in computer science and iSchool programs. Presentations of the work and demos will help engage wider audiences interested in computational research. Ultimately, the project will make it easier for humans to understand and trust machine decisions.

This project will explore a new framework for explainable recommendation that involves both system designers and end users. The system designers will benefit from structured explanations that are generated for model diagnostics. The end users will benefit from receiving natural language explanations for various algorithmic decisions. This project will address three fundamental research challenges. First, it will create new machine learning methods for explainable decision making. Second, it will develop new models to generate free-text natural language explanations. Third, it will identify key factors to evaluate the quality of explanations. In the process, the project will also develop aggregated explainability measures and release evaluation benchmarks to support reproducible explainable recommendation research. The project will result in the dissemination of shared data and benchmarks to the Information Retrieval, Data Mining, Recommender System, and broader AI communities.
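
The models to be developed are the subject of the grant itself. Purely as a hypothetical illustration of what an explanation attached to a recommendation can look like, the sketch below scores items by matching a user’s feature preferences against item attributes and emits a template natural-language explanation naming the most influential feature; the profile, catalog, and template are invented and are not the project’s method.

```python
# Hypothetical illustration of a feature-level explanation for a recommendation.
user_prefs = {"battery life": 0.9, "price": 0.6, "camera": 0.2}

items = {
    "Phone A": {"battery life": 0.8, "price": 0.4, "camera": 0.9},
    "Phone B": {"battery life": 0.9, "price": 0.7, "camera": 0.3},
}

def recommend_with_explanation(prefs, catalog):
    def score(attrs):
        return sum(prefs.get(f, 0) * v for f, v in attrs.items())
    best = max(catalog, key=lambda name: score(catalog[name]))
    # The feature that contributed most to the score becomes the explanation.
    top_feature = max(catalog[best],
                      key=lambda f: prefs.get(f, 0) * catalog[best][f])
    explanation = (f"We recommend {best} because you care about "
                   f"{top_feature} and it rates highly on that.")
    return best, explanation

item, why = recommend_with_explanation(user_prefs, items)
print(item)
print(why)
```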

It’s a new chapter for us – at UW in Seattle

It’s been a bit quiet on iBlog lately, and there is a good reason. The lab, along with me, has moved from Rutgers University in New Jersey to the University of Washington (UW) in Seattle. This happened between the end of the summer and the beginning of the fall. Things were so chaotic at the time that we even missed noticing, let alone celebrating, nine years of the lab!

This transition is still in progress. Most of the PhD students are still in NJ, but new students and projects are starting up with the lab in Seattle. Over the course of the next few weeks and months, we will be bringing more updates to our websites and social media channels.

It is a new chapter for us, indeed, but the journey goes on. We are still seekers!

Creating social impact through research

Over the years, our lab has done some really groundbreaking work in the fields of information seeking, interactive information retrieval, social and collaborative search, social media, and human-computer interaction. Almost all of it has been geared toward scholarly communities. That makes sense; after all, we operate in an academic research setting.

But lately I have been pondering how what we do could and should benefit society. And I don’t mean in subtle, indirect, or hypothetical ways. Sure, everything we do has a positive impact on people, starting with the people doing the work: it earns them course credits, diplomas, and salaries, and it helps educate students and train professionals in certain skills. But that is still a very small slice of the population. Beyond that, some of our research and the technologies developed through it have helped various government, educational, research, for-profit, and non-profit organizations further their agendas.

And yes, from time to time we have helped out the United Nations (UN) and a few other organizations more directly with their data problems.

That is still not enough. There are many important issues in the world to address and those of us in privileged positions should do more.

And that is why we launched a new effort called Science for Social Good (S4SG). Under this umbrella, we started rethinking some of our existing work and how it could help address issues of societal importance. Since we already had ties with the UN, and I regularly participate in some of its activities, it made sense to start with what the UN considers a set of important issues. As it happens, the UN has a list of 17 Sustainable Development Goals (SDGs), which it hopes will be met by the year 2030. We decided to be a part of the solution.

The UN’s list seemed comprehensive enough, so we started from it and first identified a few organizations that aim to address at least some of those SDGs. Then we looked inward to see which of our own activities could help with these SDGs. The result was a pleasant surprise: several of our projects directly connect to one or more of them. In other words, by solving those research and development problems, we are directly or indirectly helping the UN (and the world) meet those SDGs. Among the most common SDGs our projects address are Good Health and Well-being (SDG-3), Quality Education (SDG-4), and Reduced Inequalities (SDG-10).

More importantly, creating the S4SG platform has allowed us to rethink some of our future research activities and see whether we could better align them with societal impact in mind. This is not always easy, but it’s almost always worth doing.

Visit S4SG.org to learn more.

Addressing bias by bringing in diversity and inclusion in search

When it comes to Web search, the Matthew Effect (the rich get richer) applies quite well. Things that show up at the top of a ranked list are likely to get clicked more, and as they get clicked more, they stay at the top. That’s not bad in itself; if something is relevant to a query every time, why not show it at the top? But it becomes problematic when objectionable or deceptive items get to the top for some reason.

Think about sensational news. People like it; more specifically, people like clicking on it. When a story’s title says “You won’t believe what NASA is hiding”, most people are naturally enticed to click. It doesn’t matter whether the story has anything substantive in it; it gets clicked and, in many instances, shared. This may all have started with someone searching for ‘NASA’, or, increasingly, with such a story showing up as paid content (an ad) next to a real story. A search system then treats the click as a signal of relevance, and the next time the story gets an even more prominent position, starting the vicious cycle of the Matthew Effect. We are victims of our own curiosity and our desire for attention from others!
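
To make that feedback loop concrete, here is a small, deliberately toy simulation (not modeled on any real search engine): items are re-ranked by accumulated clicks each round, the top slot attracts more clicks, and a small initial lead snowballs. The item names and numbers are invented for this sketch.

```python
# Toy, deterministic simulation of the rich-get-richer loop: rank items by
# accumulated clicks, let the higher-ranked item attract more clicks, repeat.
clicks = {"solid NASA story": 5.0, "clickbait NASA story": 6.0}  # near-equal start
click_rate_by_rank = [0.7, 0.3]  # expected clicks per round at rank 1 vs rank 2

for _ in range(1000):
    ranking = sorted(clicks, key=clicks.get, reverse=True)
    for rank, item in enumerate(ranking):
        clicks[item] += click_rate_by_rank[rank]

print(clicks)  # the item that started with a small lead now dominates
```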

There are many instances of this and other kinds of intentional and unintentional bias in today’s Web search, social media, and recommender systems, as people resort to dirty tricks of SEO (search engine optimization), click-bait, and the deliberate spreading of rumors and “fake news”. It’s becoming an increasingly frustrating problem for service providers and users, and, even more seriously, a threat to our democracy and free speech (and, some may even say, free will).

There is no single solution for this, but we have to try all that we can to fight such issues of bias and fairness on the Web. For my part, I have been addressing this through collaboration, specifically by bringing diversity and inclusion into search. For example, in 2007-2008, as part of my work with Jeremy Pickens and the late Gene Golovchinsky at FXPAL, I looked at having people with different backgrounds and skill sets work together in collaborative search, with algorithmic mediation. During 2008-2010, I continued exploring this idea from the user side (people identifying and leveraging different roles). My doctoral student Roberto Gonzalez-Ibanez and I then worked on identifying opportunities for collaboration in search (2010-2012), so we could start creating synergies instead of biases. In 2013-2014, I worked with Laure Soulier and Lynda Tamine from IRIT in France to bring the system side and the user side together, letting people take on diverse roles while working on search tasks.

While this work on collaboration was going on, I was also exploring a parallel thread on social information seeking. Here I studied how communities could come together to create more synergistic solutions, using a “wisdom of the crowd” idea rather than relying on any single individual’s ideas and opinions. This thread has run from 2008 to the present day.

Now we have started a new thread in my lab aimed at addressing bias in search in an individual searcher’s setting. It’s not easy. First, the notions of bias and fairness are not clearly defined, and our own understanding of them keeps evolving. Second, there are several layers of complexity involving the data as well as the algorithms that rank and organize information, and it’s not always clear which of these layers is responsible for introducing bias. Third, there is a large amount of personalization in pretty much everything we see and consume on the Web today, and since that layer of personalization is, as the name suggests, personal, it’s hard to peel off.

Still, I think this issue of bias and fairness on the Web is very important, and together with my colleagues and students, I am continuing to invest a lot of effort in it. It’s been more than a decade of working on this problem from many different angles, and I feel like we are just getting started!