議程:Ad hoc Query- 輕輕鬆鬆查詢海量資料

議程摘要:

Apache Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs. We’ll present a case study of TrendMicro's integration of data analytics tools into its existing Hadoop-based, Pig-centric analytics platform. In our deployed solution, common data analytics tasks such as data sampling, feature generation, training, and testing can be accomplished quickly and directly in Pig, via carefully crafted loaders, storage functions, and user-defined functions.