The Complexity of Database Inconsistency Measures.
Supervisor: Prof. Benny Kimelfeld.Ester Livshits
Postdoctoral fellow in the Computer Science Faculty at Technion, Israel
I am a postdoctoral fellow at Technion, Israel. Prior to this, I was a Postdoctoral Research Associate in the School of Informatics at the University of Edinburgh.
I did my PhD under the supervision of Prof. Benny Kimelfeld in the Computer Science Faculty at Technion, Israel.
My research focuses on foundational aspects of data management, particularly handling inconsistent data and explanations in databases.
The Complexity of Database Inconsistency Measures.
Supervisor: Prof. Benny Kimelfeld.We consider two situations where we wish to quantify the responsibility of individual database tuples to the outcome. The first is query answering, where we wish to provide an explanation as to why we obtained a specific answer. The second is database inconsistency, where the goal is to identify the most problematic tuples. Some tuples may contribute more than others to the outcome, which can be a bit in the case of a Boolean query, a tuple or a number for conjunctive and aggregate queries, respectively, or a number indicating how inconsistent the database is (i.e., an inconsistency measure). To quantify the contribution of tuples, we use the well-known Shapley value that was introduced in cooperative game theory in the 1950s and has found applications in a plethora of domains. We investigate the applicability of the Shapley value in the two settings, as well as the computational aspects of its calculation in terms of complexity, algorithms, and approximation.
The Shapley value is a widely known numerical measure in cooperative game theory and in many applications of game theory for assessing the contribution of a player to a coalition game. It has been established already in the 1950s, and is theoretically justified by being the very single wealth-distribution measure that satisfies some natural axioms. While this value has been investigated in several areas, it received little attention in data management. In this talk, we will discuss the application of the Shapley value to quantifying the contribution of tuples to query results by defining corresponding coalition games.
An inconsistent database is a database that violates a given set of constraints. A minimum repair of an inconsistent database is a consistent database obtained from it by a minimum number of cleaning operations (e.g. tuple deletions, value updates). The importance of computing a minimum repair arises in the challenge of data cleaning, where the goal is to eliminate errors and dirt from the database. In a fully automated cleaning system, a minimum repair is a good candidate, while in a cleaning system that involves the user, the cost of a minimum repair can serve as an educated estimate for the amount of effort needed to complete the cleaning process. This talk will cover complexity and algorithmic aspects of the problem of computing a minimum repair.
The ACM SIGMOD Research Highlight Award.
First place in the Technion’s Computer Science Department Research Day.
Technion Hiroshi Fujiwara cyber security research center graduate student scholarship.
Computer Science Faculty Excellence Scholarship.
4 x Rector’s List of Excellent Students, 2 x Dean’s List of Excellent Students, Technion.
Irwin and Joan Jacobs Fellowship for excellence in graduate studies and research.