Speaker: Juliana Freire
Title: Synthesizing Products for Online Catalogs
A comprehensive product catalog is essential to the success of Product Search engines and shopping sites such as Yahoo! Shopping, Google Product Search, and Bing Shopping. Given the large number of products and the speed at which they are released to the market, keeping catalogs up-to-date is a challenging task that calls for automated techniques. In this talk, I will discuss the problem of product synthesis, a key component of catalog creation and maintenance. Given a set of offers advertised by merchants, the goal is to identify new products and add them to the catalog, together with their (structured) attributes. A fundamental challenge in product synthesis is the scale of the problem. A Product Search engine receives data from thousands of merchants about millions of products; the product taxonomy contains thousands of categories, each with a different schema; and merchants use representations for products that differ from the ones used in the catalog of the Product Search engine. I will present a system that provides an end-to-end solution to the product synthesis problem and addresses the issues involved in data extraction from offers, schema reconciliation, and data fusion.
This is joint work with Hoa Nguyen, Ariel Fuxman, Stelios Paparizos and Rakesh Agrawal.
Prof. Freire’s recent research focuses on Web-scale data integration, big-data analysis and visualization, and provenance management. Prof. Freire is an active member of the database and Web research communities. She has co-authored over 120 technical papers and holds 8 U.S. patents. She is a recipient of an NSF CAREER Award and an IBM Faculty Award. Prof. Freire also holds an appointment in the Courant Institute of Mathematical Sciences. Prior to joining NYU, Prof. Freire was on the faculty at the University of Utah and, before that, a member of technical staff at Bell Laboratories (Lucent Technologies).