Invited Talks

Title: Exploiting Label Relationship in Multi-Label Learning 

Zhi-Hua Zhou, Department of Computer Science & Technology, Nanjing University, China

In many real-world data mining tasks, one data object is often associated with multiple class labels simultaneously; for example, a document may belong to multiple topics, and an image can be tagged with multiple terms. Multi-label learning focuses on such problems, and it is well accepted that exploiting the relationships among labels is crucial; indeed, this is the essential difference between multi-label learning and conventional (single-label) supervised learning. In this talk, we will introduce some of our recent findings on the exploitation of label relationships.

Title: NIM: Scalable Distributed Stream Processing System on Mobile Network Data

Wei Fan, IBM T.J. Watson Research, Hawthorne, NY, USA

As a typical example of the New Moore's law, the amount of 3G mobile broadband (MBB) data has grown 15 to 20 times in the past two years (30TB to 40TB per day on average for a major city in China), and real-time processing and mining of these data are becoming increasingly necessary.
The overhead of storage and file transfer to HDFS, the delay in processing, etc. are making offline analysis of these datasets obsolete. Analyzing these datasets is non-trivial; example tasks include mobile personal recommendation, anomalous traffic detection, and network fault diagnosis. In this talk, we describe NIM (Network Intelligence Miner). NIM is a scalable and elastic streaming solution that analyzes MBB statistics and traffic patterns in real time and provides information for real-time decision making. The accuracy of NIM's statistical analysis and pattern recognition is identical to that of offline analysis, while NIM can process data at line rate. The design and unique features of NIM (e.g., balanced data grouping and its aging strategy) will be helpful not only for network data analysis but also for other applications.