
Developing a search system that knows what you are looking for before you do.

Have you ever searched online to plan a trip, a wedding, a job hunt, or your next apartment? Searches like these can take hours, days, or even weeks, and they inevitably get interrupted by our daily routines: a coffee break, a trip to the restroom, dinner, or sleep. Each time, we have to pick up where we left off. These kinds of searches are called "interrupted search tasks".

We, like many other scientists, are working on this problem. Our approach is to identify and predict the sub-tasks of a complex search task and, based on that, help people complete the task more easily. Take wedding planning as an example: you need different kinds of information, such as food, dress, and venue. If you forget about the food, which is a sub-task of wedding planning, the system can proactively suggest it.

And how do we know when to suggest these things to you? First, we try to identify whether you are having trouble during a search. We found that the longer people stay on a search result page, the more likely they are to be having problems. To illustrate: imagine someone searching for "churches in Seattle." They spend a long time on the result page and, without clicking any of the results, type another query, "places for a wedding," and so on. The more queries a person issues without clicking through to any pages, the more likely they are struggling. If, on the other hand, the person does interact with the result page, for example by clicking into a result, we look at how many pages they bookmark. The more of those subsequent pages they bookmark, the more relevant results they have found and the fewer problems they have encountered. This is how we can tell whether people are finding what they are looking for. If we see that you are having trouble, we recommend things you might have missed.
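
To make this more concrete, here is a minimal sketch of how such signals might be turned into a "struggle" score. This is our illustration, not the model from the paper: the thresholds, weights, and session format are all made up for the example.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class QueryEvent:
    """One query issued in a search session."""
    query: str
    dwell_seconds: float   # time spent on the result page
    clicked_results: int   # results clicked through from this page
    bookmarks: int         # pages bookmarked after clicking through

def struggle_score(session: List[QueryEvent], long_dwell: float = 60.0) -> float:
    """Return a 0..1 score; higher means the searcher is more likely struggling.

    Heuristic only: repeated queries without clicks and long dwell on result
    pages push the score up, while bookmarking pages pulls it down.
    """
    if not session:
        return 0.0
    no_click = sum(1 for q in session if q.clicked_results == 0)
    long_no_click = sum(1 for q in session
                        if q.clicked_results == 0 and q.dwell_seconds >= long_dwell)
    bookmarked = sum(q.bookmarks for q in session)
    score = 0.5 * (no_click / len(session)) + 0.5 * (long_no_click / len(session))
    score -= 0.1 * bookmarked   # each bookmark is evidence of a relevant find
    return max(0.0, min(1.0, score))

# Example session: two reformulations without clicks, then a click and a bookmark.
session = [
    QueryEvent("churches in Seattle", 90, 0, 0),
    QueryEvent("places for a wedding", 75, 0, 0),
    QueryEvent("wedding venues Seattle", 40, 2, 1),
]
print(struggle_score(session))   # ~0.57: some signs of struggle early in the session
```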

So how do we know what you have missed? In other words, how do we know that "food", "dress", and "venue" are related to planning a wedding? We use what people have searched for in the past: the more frequently two topics are searched together, the stronger their relationship. Say 1,000 people who searched for "wedding" also searched for "dress" and "food", while only 5 also searched for "black dress". We can tell that "dress" and "food" have a stronger relationship to the topic "wedding" than "black dress" does, so we can recommend "dress" and "food" when the next person searches for "wedding."
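
A minimal sketch of this idea (our illustration, not the paper's method): count how often two topics appear in the same search session and suggest the strongest co-occurring topics. The toy query log and topic labels below are made up.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List

def cooccurrence_counts(sessions: List[List[str]]) -> Dict[frozenset, int]:
    """Count how often each pair of topics appears in the same search session."""
    counts: Counter = Counter()
    for topics in sessions:
        for a, b in combinations(sorted(set(topics)), 2):
            counts[frozenset((a, b))] += 1
    return counts

def suggest(topic: str, counts: Dict[frozenset, int], k: int = 3) -> List[str]:
    """Suggest the k topics most often searched together with `topic`."""
    related = [(next(iter(pair - {topic})), n)
               for pair, n in counts.items() if topic in pair]
    related.sort(key=lambda x: -x[1])
    return [t for t, _ in related[:k]]

# Toy query log: each inner list is one person's session, reduced to topics.
sessions = [
    ["wedding", "dress", "food"],
    ["wedding", "dress", "venue"],
    ["wedding", "food"],
    ["wedding", "black dress"],
]
counts = cooccurrence_counts(sessions)
print(suggest("wedding", counts))   # ['dress', 'food', 'venue']
```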

If you are interested in more detail about this topic, here is the paper we published recently: "Identifying and Predicting the States of Complex Search Tasks".

Soumik Mandal successfully defends his dissertation.

Soumik Mandal, Ph.D. student

Our Ph.D. student, Soumik Mandal, has successfully defended his dissertation titled “Clarifying user’s information need in conversational information retrieval”. The committee included  Chirag Shah (University of Washington, Chair), Nick Belkin (Rutgers University), Katya Ognyanova (Rutgers University), and Michel Galley (Microsoft).

Abstract

With traditional information retrieval systems, users are expected to express their information need adequately and accurately to get an appropriate response from the system. This setup generally works well for simple tasks; however, in complex task scenarios users have difficulty expressing their information need as accurately as the system requires. Therefore, the need to clarify the user's information need arises. In current search engines, support in such cases is provided in the form of query suggestion or query recommendation. However, in conversational information retrieval systems the interaction between the user and the system takes the form of a dialogue, so the system can better support such cases by asking clarifying questions. Current research in both natural language processing and information retrieval does not adequately explain how to form such questions or at what stage of the dialogue they should be asked. To address this gap, this research investigates the nature of conversation between a user and an expert intermediary in order to model the functions the expert performs to address the user's information need. More specifically, this study explores the ways an intermediary can ask questions of the user to clarify their information need in complex task scenarios.

InfoSeeking’s 10th Birthday

Looking back

In the fall of 2010, we started as a reading group of people who came together every week to read papers on information seeking/retrieval/behavior. The group was called the "Information seeking and behavior group". Dr. Chirag Shah has led the group since the beginning.

Quickly the reading group became a research group, as students and faculty started identifying projects that interested them and pooled resources to design studies and experiments.

In that same fall, as the group started getting traction and attracting more students, resources, and funding, we became InfoSeeking Lab.

In the beginning, the lab focused on issues of information seeking/retrieval and social media. As new members and interests were added, the lab explored many more areas, including wearable sensors, collaborative work, online communities, and conversational systems.

The methods for research also evolved from user studies to large-scale log analysis, and from ethnographic approaches to deep learning models.

Our achievements

For 10 years, we have been pushing forward knowledge in information seeking/retrieval and related topics in the information and data sciences field.

Throughout those years, the lab has received more than 4 million dollars in grants and gifts from federal and state agencies as well as private organizations. 

So far, the lab has produced 13 excellent PhD students and countless undergraduate and master's students who drive new ideas and innovations in the data sciences field. Our alumni have gone on to major universities around the world and to reputable companies such as Dropbox, eBay, Google, Sony, and TD Bank.

Some of the lab's early work laid the foundation for collaborative and social work by people from all walks of life. One of the outcomes was a system called Coagmento, which was extensively tested with and deployed in classrooms. When it was used in a NY-based high school, the teachers found that, for the first time, they could gain valuable insights into their students' work and help them in ways that were not possible before.

We have been at the forefront of developing new methodologies, tools, and solutions. We were one of the first to use an escape room as a method to understand how people seek information and solve problems.

We have contributed to the community and will continue to do so. The lab worked closely with the United Nations Data Analytics group to address several of the UN's Sustainable Development Goals (SDGs). As a result of that collaboration, the lab launched Science for Social Good (S4SG). All of our work builds toward the SDGs.

We have also worked with several private foundations and startups over the years to solve real-world problems. One example is our collaboration with Brainly, a startup from Poland focused on educational Q&A. With them, we worked on assessing the quality of content and detecting users with certain characteristics, such as those who are struggling. The solutions to these problems are extremely useful in education.

Looking back at the last 10 years and how glorious they have been, we are confident that the next decade will be even more amazing.

People can’t identify COVID-19 fake news

A recent study conducted by our lab, the InfoSeeking Lab at the University of Washington, Seattle, shows that people can't spot COVID-19 fake news in search results.

The study was done by having people choose between two sets of top-10 search results: one taken directly from Google and another manipulated by inserting one or two fake news results into the list.

This continues prior experiments from the lab with a similar setting, but in which random results were manipulated in the top 10. The outcomes all point in the same direction: people can't tell which search results have been manipulated.

“This means that I am able to sell this whole package of recommendations with a couple of bad things in it without you ever even noticing. Those bad things can be misinformation or whatever hidden agenda that I have”, said Chirag Shah, InfoSeeking Lab Director and Associate Professor at the University of Washington. 

This highlights an important problem that people don't pay attention to: they believe that what they see is true because it comes from Google, Amazon, or some other system they use daily. This is especially true for prime positions like the first 10 results, since multiple studies show that more than 90% of searchers' clicks are concentrated on the first page. Any manipulated information that makes it onto Google's first page of search results is therefore likely to be perceived as true.

In the current situation, people are worried and uncertain. Many of us seek updates about the situation daily, and Google is the top search engine we turn to. People need trustworthy information; however, many are taking advantage of people's fear and spreading misinformation for their own agenda. What would happen if the next piece of fake news said that the virus had mutated with an 80% fatality rate? What would it do to our community? Would people start hoarding food? Would people wearing masks in public be attacked? Would you be able to spot the fake news? The lab is continuing to explore these critical issues of public importance through its research work on FATE (Fairness Accountability Transparency Ethics).

For this finding, InfoSeeking researchers analyzed more than 10,000 answers, covering both randomly manipulated and fake-news-manipulated result lists, from more than 500 English-speaking people across the U.S.

InfoSeeking at The 14th RecSys Conference

RecSys is the premier international forum for the presentation of new research results, systems, and techniques in the broad field of recommender systems. 

We are thrilled to be involved in one of the most important annual conferences for the presentation and discussion of recommender systems research. This year, a paper our Lab Director, Chirag Shah, co-authored with Spotify – Investigating Listeners' Responses to Divergent Recommendations – is being presented at the conference.

Moreover, two of our InfoSeekers, Ruoyuan Gao and Chirag Shah, are also giving a tutorial on "Counteracting Bias and Increasing Fairness in Search and Recommender Systems".

About the Tutorial

Search and recommender systems have unprecedented influence on how and what information people access. On the one hand, these gateways create easy and universal access to online information; on the other, they create biases that have been shown to cause knowledge disparity and ill-informed decisions for information seekers. Most of the algorithms for indexing, retrieval, ranking, and recommendation are heavily driven by underlying data that is itself biased. In addition, the ordering of search and recommendation results creates position bias and exposure bias, due to the systems' considerable focus on relevance and user satisfaction. These and other forms of bias that are implicitly, and sometimes explicitly, woven into search and recommender systems are becoming increasingly serious threats to information seeking and sense-making processes. In this tutorial, we will introduce the issues of bias in search and recommendation and show how we could think about and create systems that are fairer, with increased diversity and transparency. Specifically, the tutorial will present several fundamental concepts such as relevance, novelty, diversity, bias, and fairness using socio-technical terminology taken from various communities, and dive deeper into metrics and frameworks that allow us to understand, extract, and materialize them. The tutorial will cover some of the most recent work in this area and show how this interdisciplinary research has opened up new challenges and opportunities for communities such as RecSys.
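
To make the notion of exposure bias a bit more concrete, here is a minimal sketch (our illustration, not a metric from the tutorial) of how exposure, discounted by rank position, is distributed across two groups of items in a ranking. The group labels and the logarithmic discount are assumptions for the example.

```python
import math
from typing import Dict, List

def exposure_by_group(ranking: List[str], groups: Dict[str, str]) -> Dict[str, float]:
    """Share of total exposure each group receives in a ranked list.

    Exposure at rank i (1-based) is discounted as 1 / log2(i + 1),
    so items near the top dominate what users actually see.
    """
    exposure: Dict[str, float] = {}
    for i, item in enumerate(ranking, start=1):
        g = groups[item]
        exposure[g] = exposure.get(g, 0.0) + 1.0 / math.log2(i + 1)
    total = sum(exposure.values())
    return {g: e / total for g, e in exposure.items()}

# Toy ranking: items d1..d6, half from group A, half from group B.
groups = {"d1": "A", "d2": "A", "d3": "A", "d4": "B", "d5": "B", "d6": "B"}
print(exposure_by_group(["d1", "d2", "d3", "d4", "d5", "d6"], groups))
# Group A gets roughly 64% of the exposure even though the groups are the same size.
```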

DATE

Session A on Sep 25 10:00 – Sep 25 11:00, Attend in Whova
Session B on Sep 25 21:00 – Sep 25 22:00, Attend in Whova

Jonathan Pulliza successfully defends his dissertation

Jonathan Pulliza, Ph.D. student

Our Ph.D. student, Jonathan Pulliza, has successfully defended his dissertation titled "Let the Robot Do It For Me: Assessing Voice As a Modality for Visual Analytics for Novice Users". The committee included Chirag Shah (University of Washington, Chair), Nina Wacholder (Rutgers University), Mark Aakhus (Rutgers University), and Melanie Tory (Tableau).

Pulliza's study focuses on understanding how a voice system facilitates novice users in Visual Analytics (VA). He found that participants chose to use the voice system because of its convenience, the ability to get a quick start on their work, and better access to some functions that they could not find in the traditional screen interface. Participants refrained from choosing voice because of their previous experiences: they felt that the voice system would not give them full access to the more complicated VA system. They then often chose to struggle with the visual interface instead of using the voice system for assistance.

Abstract

The growth of Visual Analytics (VA) systems has been driven by the need to explore and understand large datasets across many domains. Applications such as Tableau were developed with the goal of better supporting novice users to generate data visualizations and complete their tasks. However, novice users still face many challenges in using VA systems, especially in complex tasks outside of simple trend identification, such as exploratory tasks. Many of the issues stem from the novice users’ inability to reconcile their questions or representations of the data with the visualizations presented using the interactions provided by the system.

With the improvement in natural language processing technology and the increased prevalence of voice interfaces, there is a renewed interest in developing voice interactions for VA systems. The goal is to enable users to ask questions directly to the system or to indicate specific actions using natural language, which may better facilitate access to functions available in the VA system. Previous approaches have tended to build systems in a screen-only environment in order to encourage interaction through voice. Though they did produce significant results and guidance for the technical challenges of voice in VA, it is important to understand how the use of a voice system would affect novice users within their most common context instead of moving them into new environments. It is also important to understand when a novice user would choose to use a voice modality when the traditional keyboard and mouse modality is also available.

This study is an attempt to understand the circumstances under which novice users of a VA system would choose to interact using their voice in a traditional desktop environment, and whether a voice system better facilitates access to available functionalities. Given that users choose the voice system, do they choose different functions than those with only a keyboard and mouse? Using a Wizard of Oz setup in place of an automated voice system, we find that participants chose to use the voice system because of its convenience, the ability to get a quick start on their work, and in situations where they could not find a specific function in the interface. Overall, function choices were not found to be significantly different between those who had access to the voice system and those who did not, though there were a few cases where participants were able to access less common functions compared to the control group. Participants refrained from choosing voice because their previous experiences with voice systems had led them to believe that voice systems in general were not capable of addressing their task needs. They also felt that using the voice system was incongruent with gaining mastery of the underlying VA system, as its convenience could lead to its use as a crutch. Participants then often chose to struggle with the visual interface instead of using the voice system for assistance. In this way, they prioritized building a better mental model of the system over building a better sense of the data set and accomplishing the task.

Manasa Rath successfully defends her dissertation

Manasa Rath, Ph.D. student

Our Ph.D. student, Manasa Rath, has successfully defended her dissertation titled “Assessing the quality of user-generated content in the presence of automated quality scores”. The committee included  Chirag Shah (University of Washington, Chair), Vivek Singh (Rutgers University), Kaitlin Costello (Rutgers University), and Sanda Erdelez (Simmons University).

Students turn to online crowdsourced sites to fulfill their academic course requirements

Manasa investigated the quality of such user-generated content, whether it is correct, credible, clear, and complete, using a framework she developed. The framework was validated with multiple experts in the field. She then built an automated method to score the content accordingly and conducted a user study with 45 undergraduate students to see how, and to what extent, users consider quality while completing the tasks provided.

Abstract

With the proliferation of participatory web culture, individuals not only create but also consume content in crowdsourced environments such as blogs and question-answering systems. Studies have revealed that users often employ such content to make a range of choices, from everyday issues to important life decisions, without paying much attention to its quality. In recent years, studies have demonstrated K-12 students' over-reliance on these crowdsourced sites to fulfill their academic course requirements. Therefore, it is important to evaluate how cognizant users are of quality when evaluating content in these venues. But before identifying to what extent users make use of quality when evaluating user-generated content, it is important to learn what constitutes quality. To address these issues, this dissertation proceeds in three steps. First, it develops a conceptual framework for evaluating the quality of user-generated content, consisting of constructs such as correct, credible, clear, and complete. The second step validates the framework with the help of twelve experts (librarians) and uses this validation to generate automated methodologies that evaluate the quality of content and provide quality scores. The third step delves deeper into users' reliance on the automated quality scores through a user study. Forty-five undergraduate students were recruited to investigate their use of automated quality scores while completing tasks under three conditions: users provided with genuine quality scores, users provided with manipulated quality scores, and users provided with no quality scores (control). As prior research has indicated that task completion depends on the task type, the study provided users with different task types, such as ranking and synthesis, in the presence of quality scores. To further understand users' use of the quality scores while completing the tasks, the author used eye-tracking metrics such as the total number of gazes, gaze duration, and the number of gazes on the quality scores. Analyses were performed on fixation data, users' responses to pre- and post-task questionnaires, task scenarios, and interview transcripts. ANOVA and other statistical analyses were conducted, and no statistically significant differences were found between users' use of quality scores and the type of task. The qualitative data also showed that users primarily considered the constructs correct and credible from the charts, and that users made use of the quality scores primarily when they had little familiarity with the topic. The study provided insights into how and to what extent users considered quality while completing the tasks provided. The contribution of this human information behavior study is twofold: understanding users' reliance on an automated score provided by an intelligent tool, and studying users' confidence in considering quality when evaluating user-generated content.

Ruoyuan Gao successfully defends her dissertation

Ruoyuan Gao, Ph.D. student

Our Ph.D. student, Ruoyuan Gao, has successfully defended her dissertation titled “Toward a Fairer Information Retrieval System”. The committee included  Chirag Shah (University of Washington, Chair), Yongfeng Zhang (Rutgers University), Gerard de Melo (Rutgers University), and Fernando Diaz (Microsoft).

Ruoyuan investigated the bias present in search engine results to understand the relationship between relevance and fairness in those results. She developed a framework that could effectively identify the fairness and relevance bounds of a dataset. She also proposed an evaluation metric for ranking results that encodes fairness, diversity, novelty, and relevance. With this metric, she developed algorithms that optimize both diversity fairness and relevance for search results.

Abstract

With the increasing popularity and social influence of information retrieval (IR) systems, various studies have raised concerns on the presence of bias in IR and the social responsibilities of IR systems. Techniques for addressing these issues can be classified into pre-processing, in-processing and post-processing. Pre-processing reduces bias in the data that is fed into the machine learning models. In-processing encodes the fairness constraints as a part of the objective function or learning process. Post-processing operates as a top layer over the trained model to reduce the presentation bias exposed to users. This dissertation explored ways to bring the pre-processing and post-processing approaches, together with the fairness-aware evaluation metrics, into a unified framework as an attempt to break the vicious cycle of bias.

We first investigated the existing bias presented in search engine results. Specifically, we focused on the top-k fairness ranking in terms of statistical parity fairness and disparate impact fairness definitions. With Google search and a general purpose text cluster as a lens, we explored several topical diversity fairness ranking strategies to understand the relationship between relevance and fairness in search results. Our experimental results show that different fairness ranking strategies result in distinct utility scores and may perform differently with distinct datasets. Second, to further investigate the relationship of data and fairness algorithms, we developed a statistical framework that was able to facilitate various analysis and decision making. Our framework could effectively and efficiently estimate the domain of data and solution space. We derived theoretical expressions to identify the fairness and relevance bounds for data of different distributions, and applied them to both synthetic datasets and real world datasets. We presented a series of use cases to demonstrate how our framework was applied to associate data and provide insights to fairness optimization problems. Third, we proposed an evaluation metric for the ranking results that encoded fairness, diversity, novelty and relevance. This metric offered a new perspective of evaluating fairness-aware ranking results. Based on this metric, we developed effective ranking algorithms that optimized for diversity fairness and relevance at the same time. Our experiments showed that our algorithms were able to capture multiple aspects of the ranking and optimize the proposed fairness-aware metric.
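
As a rough illustration of optimizing relevance and fairness at the same time, here is a generic greedy re-ranking sketch (our example, not the algorithms from the dissertation): it mixes a relevance score with a penalty for over-exposing any one group. The candidate pool, scores, and the lam trade-off parameter are made up.

```python
from typing import Dict, List, Tuple

def fair_rerank(candidates: Dict[str, Tuple[float, str]],
                k: int, lam: float = 0.5) -> List[str]:
    """Greedy top-k re-ranking that trades off relevance against group balance.

    candidates: item -> (relevance score in [0, 1], group label)
    lam: 0 = pure relevance ranking, 1 = pure group balancing.
    """
    ranked: List[str] = []
    group_counts: Dict[str, int] = {}
    remaining = dict(candidates)
    while remaining and len(ranked) < k:
        def utility(item: str) -> float:
            rel, group = remaining[item]
            # Penalize groups that already occupy many of the chosen slots.
            share = group_counts.get(group, 0) / max(len(ranked), 1)
            return (1 - lam) * rel - lam * share
        best = max(remaining, key=utility)
        _, group = remaining.pop(best)
        group_counts[group] = group_counts.get(group, 0) + 1
        ranked.append(best)
    return ranked

# Toy candidate pool: group A items are slightly more relevant on average.
pool = {"a1": (0.95, "A"), "a2": (0.90, "A"), "a3": (0.88, "A"),
        "b1": (0.85, "B"), "b2": (0.80, "B"), "b3": (0.70, "B")}
print(fair_rerank(pool, k=4, lam=0.0))  # pure relevance: ['a1', 'a2', 'a3', 'b1']
print(fair_rerank(pool, k=4, lam=0.5))  # alternates groups: ['a1', 'b1', 'a2', 'b2']
```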

FATE project – What is it about and why does it matter?

FATE is one of our InfoSeeking Lab's most active projects. It stands for Fairness, Accountability, Transparency, and Ethics. FATE aims to address bias found in search engines like Google and discover ways to de-bias information presented to the end user while maintaining a high degree of utility.

Why does it matter?

There is plenty of past evidence of search algorithms reinforcing biased assumptions about certain groups of people. Below are some past examples of search bias related to the Black community.

Search engines suggested unpleasant words about Black women: the algorithm recommended words like 'angry', 'loud', 'mean', or 'attractive'. These auto-completions reinforced biased assumptions about Black women.

Credit: Safiya Noble

Search results showed images of Black people's natural hair as unprofessional, while showing images of white Americans' straight hair as professional hairstyles for work.

Credit: Safiya Noble

Search results for "three black teenagers" were dominated by mug shots of Black teens, while the results for "three white teenagers" showed smiling, happy white teenagers.

Credit: Safiya Noble

These issues existed for many years until someone uncovered them, which then sparked changes to solve them.

At FATE, we aim to address these issues and find ways to bring fairness when seeking information.

If you want to learn more about what we do or get updates on our latest findings, check out our FATE website.

Yiwei Wang successfully defends her dissertation

Yiwei Wang, Ph.D. student

Our Ph.D. student, Yiwei Wang, has successfully defended her dissertation titled “Authentic vs. Synthetic: A Comparison of Different Methods for Studying Task-based Information Seeking”. The committee included  Chirag Shah (University of Washington, Chair), Nick Belkin (Rutgers University), Kaitlin Costello (Rutgers University), and Diane Kelly (University of Tennessee Knoxville).

Abstract

In task-based information seeking research, researchers often collect data about users’ online behaviors to predict task characteristics and personalize information for users. User behavior may be directly influenced by the environment in which a study is conducted, and the tasks used. This dissertation investigates the impact of study setting and task authenticity on users’ searching behaviors, perceived task characteristics, and search experiences. Thirty-six undergraduate participants finished one lab session and one remote session in which they completed one authentic and one simulated task. The findings demonstrate that the synthetic lab setting and simulated tasks had significant influences mostly on behaviors related to content pages, such as page dwell time and number of pages visited per task. Meanwhile, first-query behaviors were less affected than whole-session behaviors, indicating the reliability of using first-query behaviors in task prediction. Subjective task characteristics—such as task motivation and importance—also varied in different settings and tasks. Qualitative interviews reveal why users were influenced. This dissertation addresses methodological limitations in existing research and provides new insights and implications for researchers who collect online user search behavioral data.