Lectures
There will be four sets of lectures by the following distinguished lecturers:
- Torben Bach Pedersen - Managing Complex Multidimensional Data
Multidimensional database concepts such as cubes, dimensions with hierarchies, and measures are a cornerstone of business intelligence. However, the standard data models and system implementations (OLAP) for multidimensional databases are sometimes not able to capture the complexities of advanced real-world application domains. This lecture will focus on how to manage such complex multidimensional data, including complex dimension hierarchies, complex measures, and integration of multidimensional data with complex external data. The lecture will look at how complex multidimensional data emerge in complex application domains such as medical data, location-based services, energy data, music data, web data, and text data, and present solutions for these domains that support multidimensional business intelligence. Finally, the lecture will present current research challenges. The lecture is based on the knowledge typically obtained in a standard introductory database course.
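As a rough illustration of the basic concepts named above (a measure aggregated along dimension hierarchies), the following minimal sketch rolls a sales measure up from (city, day) to (country, month). The fact rows, the city-to-country mapping, and the function name `roll_up` are all hypothetical and are not taken from the lecture material.

```python
from collections import defaultdict

# Hypothetical fact rows: (city, day, sales). All values are invented.
FACTS = [
    ("Aalborg", "2012-01-03", 10),
    ("Aalborg", "2012-02-10", 5),
    ("Copenhagen", "2012-01-15", 7),
    ("Berlin", "2012-01-20", 4),
]

# Hypothetical location hierarchy: city -> country.
CITY_TO_COUNTRY = {
    "Aalborg": "Denmark",
    "Copenhagen": "Denmark",
    "Berlin": "Germany",
}


def roll_up(facts):
    """Aggregate the sales measure from (city, day) up to (country, month)."""
    cube = defaultdict(int)
    for city, day, sales in facts:
        country = CITY_TO_COUNTRY[city]  # one step up the location hierarchy
        month = day[:7]                  # one step up the time hierarchy (YYYY-MM)
        cube[(country, month)] += sales
    return dict(cube)
```

This simple form works only because each city maps to exactly one country and each day to exactly one month. Hierarchies where that assumption breaks are presumably among the complex cases the lecture addresses.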
- Andreas Zeller - Mining Programs and Processes
Mining Programs
A program fails. How can we locate the cause? A new generation of program analysis techniques automatically determines failure causes - in the input, in the set of code changes, or in the program state. In contrast to "classical" static analysis, these new techniques exploit the data from multiple concrete runs - and may even generate further runs as needed. In this lecture, we explore the state of the art in automated debugging in practice and research, using real-life case studies such as Mozilla and GCC. Finally, we discuss the current frontiers in debugging, and how future research may break them.
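One way to determine a failure cause in the input from multiple concrete runs is to shrink a failing input automatically, in the style of delta debugging. The sketch below is a simplified illustration, not the lecturer's implementation; the names `ddmin` and `fails` are assumptions, and `fails` stands in for running the program on a candidate input and checking whether the failure still occurs.

```python
def ddmin(data, fails):
    """Shrink the list `data` to a smaller list for which `fails` is still True.

    `fails` is a caller-supplied test function (hypothetical here): it
    returns True when the program under test fails on the given input.
    """
    assert fails(data), "the starting input must trigger the failure"
    n = 2  # current granularity: roughly how many chunks to split into
    while len(data) >= 2:
        chunk = max(1, len(data) // n)
        subsets = [data[i:i + chunk] for i in range(0, len(data), chunk)]
        reduced = False
        for i, subset in enumerate(subsets):
            # Everything except the i-th chunk.
            complement = [x for j, s in enumerate(subsets) if j != i for x in s]
            if fails(subset):
                # A single chunk suffices: continue with it at coarse granularity.
                data, n, reduced = subset, 2, True
                break
            if fails(complement):
                # The chunk was irrelevant: drop it and keep refining.
                data, n, reduced = complement, max(n - 1, 2), True
                break
        if not reduced:
            if n >= len(data):
                break  # already at single-element granularity: done
            n = min(len(data), n * 2)
    return data
```

For example, if a hypothetical failure occurs whenever the input contains both 3 and 7, calling `ddmin(list(range(10)), lambda d: 3 in d and 7 in d)` narrows the ten-element input down to those two elements. Each call to `fails` corresponds to one additional generated run of the program.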
Mining Processes
I need to change some piece of code. What do I need to consider? Is there anything else I need to change? To answer such questions, one needs to know not only the software, but also its development process. To learn about the process, one can mine software repositories such as version archives and bug databases - and learn, for instance, which changes are correlated with each other, and better yet, which activities are correlated with success or failure. In this lecture, we discuss the state of the art in mining repositories, using real-life development processes from groups such as Eclipse or Microsoft Windows. Finally, we discuss the current challenges and opportunities in research.
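The question "Is there anything else I need to change?" can be approached by counting which files were changed together in past commits. The following minimal sketch assumes the version archive has already been reduced to a list of file sets; the commit data, file names, function names, and the `min_count` threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical commits: each entry is the set of files changed together.
COMMITS = [
    {"parser.c", "parser.h"},
    {"parser.c", "parser.h", "lexer.c"},
    {"lexer.c"},
    {"parser.c", "parser.h"},
]


def co_changes(commits):
    """Count how often each unordered pair of files appears in the same commit."""
    pairs = Counter()
    for files in commits:
        for a, b in combinations(sorted(files), 2):
            pairs[(a, b)] += 1
    return pairs


def suggest(changed, commits, min_count=2):
    """List files that changed together with `changed` at least `min_count` times."""
    related = []
    for (a, b), count in co_changes(commits).items():
        if count >= min_count and changed in (a, b):
            related.append(b if a == changed else a)
    return sorted(related)
```

With this data, `suggest("parser.c", COMMITS)` proposes `parser.h`, since the two files changed together in three commits. Raw co-occurrence counts are only a starting point; a practical tool would likely also weigh how often each file changes on its own.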
Bio:
Andreas Zeller is a full professor of Software Engineering at Saarland University in Saarbrücken, Germany. His research concerns the analysis of large software systems and their development process; his students are funded by companies such as Google, Microsoft, and SAP. In 2010, Zeller was inducted as Fellow of the ACM for his contributions to automated debugging and mining software archives. In 2011, he received an ERC Advanced Grant, Europe's highest and most prestigious individual research grant, for work on specification mining and test case generation.
- Krishna P. Gummadi - Enabling the Social Web
Recently, online social networking sites such as Facebook, YouTube, and Twitter have become tremendously popular. Users join these sites to connect with other users and share content. With user populations running into hundreds of millions, these sites herald the emergence of the social Web. Despite its popularity, the social Web is still in its infancy, and existing social Web systems suffer from a number of shortcomings.
In this school, we will study three key trends that distinguish the social Web from the traditional Web, and the challenges they pose. First, social networking sites like Facebook have democratized content publishing, allowing individuals to share user-generated and personal content, e.g., family photos and videos. But sharing personal content over social sites raises severe privacy concerns. Second, social networking sites like YouTube leverage collaborative ranking and filtering of content to help users search for useful content, e.g., user ratings of YouTube videos. However, collaborative content ranking is susceptible to manipulation by malicious users. Third, social sites like Twitter enable word-of-mouth based discovery of information, e.g., Twitter users propagate information to their followers. But few understand the dynamics of word-of-mouth based content propagation. We will discuss recent research efforts to tackle these challenges and thereby enable the social Web, i.e., help it achieve its full potential.
- Robin Burke - Hybrid Recommender Systems
Recommender systems provide personalized support for users choosing among a large universe of options. Such systems have become a ubiquitous aspect of online commerce and information systems, and their study is a highly interdisciplinary research area touching on the fields of psychology, decision theory, marketing, machine learning, knowledge representation, human-computer interaction, and social computing, among others. This short course will begin with an introduction to the field of recommender systems, focusing on the diversity of types of data that can be brought to bear on recommendation decisions. We will look at the development of collaborative, content-based and knowledge-based recommendation, and consider how domain characteristics govern the applicability of these approaches.
The second part of the course will discuss the integration of multiple recommendation types in hybrids, beginning with simple two-part designs and extending to large-scale ensemble techniques. We will also discuss recommender system evaluation and the impact that evaluation methodologies have on research and applications.
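A weighted combination is one simple way to build a two-part hybrid: each component recommender scores the items, and the scores are merged by a weighted sum. The sketch below is only an illustration under that assumption; the function name, item names, scores, and weights are hypothetical and are not taken from the course.

```python
def weighted_hybrid(score_lists, weights):
    """Rank items by the weighted sum of scores from several recommenders.

    `score_lists` holds one {item: score} dict per component recommender;
    an item that a component does not score contributes 0 for that component.
    """
    combined = {}
    for scores, weight in zip(score_lists, weights):
        for item, score in scores.items():
            combined[item] = combined.get(item, 0.0) + weight * score
    # Highest combined score first; ties are broken by item name.
    return sorted(combined, key=lambda item: (-combined[item], item))


# Hypothetical scores from a collaborative and a content-based component.
collaborative = {"A": 0.9, "B": 0.4}
content_based = {"B": 0.8, "C": 0.6}
ranking = weighted_hybrid([collaborative, content_based], [0.5, 0.5])
```

Here item B ranks first because both components support it, even though neither ranks it highest alone. Treating a missing score as zero is a design choice that penalizes items only one component can score; other hybrid designs handle that case differently.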
We will conclude with an investigation of recommendation in social annotation systems. We will consider the different types of recommendation such systems require, and we will examine the application of hybrid recommendation for addressing them. The results of this work will lead us back to the question of domain characteristics, in particular the strength of association between the items, tags and users that make up a social tagging system.