
Computer Science

University of Toronto

SPIDER Data Cleaning Tool

Data quality is a serious concern for any organization that relies on data. In practice, data is often of poor quality for many reasons, including spelling mistakes, abbreviations, missing standards and inconsistent notation.

SPIDER is a declarative data cleaning tool. It incorporates a set of algorithms that help improve data quality on any relational data source. Its main advantage is that it is purely declarative: it can run against any relational data source, and it does not rely on standardized dictionaries or libraries of “clean data.” Its main features include a large collection of similarity measures expressed fully in SQL, heavy use of sampling for improved performance, statistical schema matching, and statistical methods for analyzing data and identifying quality problems.
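
To make the idea of a similarity measure "expressed fully in SQL" concrete, here is a minimal sketch of q-gram Jaccard similarity written as a single SQL query. It is an illustration only and is not taken from SPIDER: the `names` table, the `make_demo_db` and `qgram_jaccard` helpers, and the choice of SQLite (driven from Python for self-containment) are all assumptions made for this example.

```python
import sqlite3

# Hypothetical example query: q-gram Jaccard similarity between all pairs of
# rows in an assumed table `names(id, name)`. Not SPIDER's actual SQL.
Q_GRAM_JACCARD_SQL = """
WITH RECURSIVE pos(n) AS (
    -- Generate positions 1..(length of the longest name).
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM pos
    WHERE n < (SELECT MAX(LENGTH(name)) FROM names)
),
grams AS (
    -- Distinct lower-cased q-grams of each name.
    SELECT DISTINCT names.id AS id,
           SUBSTR(LOWER(names.name), pos.n, :q) AS gram
    FROM names
    JOIN pos ON pos.n <= LENGTH(names.name) - :q + 1
),
sizes AS (
    SELECT id, COUNT(*) AS cnt FROM grams GROUP BY id
),
overlap AS (
    -- Number of q-grams shared by each pair (pairs with none are omitted).
    SELECT a.id AS id_a, b.id AS id_b, COUNT(*) AS common
    FROM grams a
    JOIN grams b ON a.gram = b.gram AND a.id < b.id
    GROUP BY a.id, b.id
)
SELECT o.id_a, o.id_b,
       1.0 * o.common / (sa.cnt + sb.cnt - o.common) AS sim
FROM overlap o
JOIN sizes sa ON sa.id = o.id_a
JOIN sizes sb ON sb.id = o.id_b
ORDER BY sim DESC, o.id_a, o.id_b
"""


def make_demo_db(rows):
    """Create an in-memory database holding the assumed `names` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.executemany("INSERT INTO names (id, name) VALUES (?, ?)", rows)
    return conn


def qgram_jaccard(conn, q=3):
    """Return (id_a, id_b, similarity) for pairs sharing at least one q-gram."""
    return conn.execute(Q_GRAM_JACCARD_SQL, {"q": q}).fetchall()


if __name__ == "__main__":
    conn = make_demo_db([(1, "Toronto"), (2, "Toronot"), (3, "Ottawa")])
    for id_a, id_b, sim in qgram_jaccard(conn):
        print(id_a, id_b, round(sim, 3))
```

In this sketch the misspelled pair "Toronto" and "Toronot" shares three of seven distinct 3-grams, so it scores 3/7, while "Ottawa" shares no 3-grams with either and is not reported. Because the whole computation lives in the query, the same approach could in principle be ported to other relational engines that support recursive common table expressions, which is the portability argument the declarative design makes.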

For more information on SPIDER, please visit the project site.

Nick Koudas, Faculty
Mohammad Sadoghi, Graduate Student




All rights reserved copyright Computer Science, University of Toronto