Pig, Making Hadoop Easy

Abstract

Pig is a platform for analyzing large data sets. It consists of a high-level language, Pig Latin, for expressing data analysis programs, coupled with infrastructure for evaluating these programs atop Hadoop's MapReduce platform. This talk will review the basic features of Pig, discuss recent additions to the system and work currently in progress, cover Pig performance, and consider areas for future development and research.

Bio

Alan Gates is the architect for the Pig team at Yahoo. He has been developing database and data processing technology for the last twelve years, including seven years at Yahoo working on storage and query engines for petabyte-sized data sets.