Lectures
There will be five sets of lectures by the following distinguished lecturers.
Interactive Visual Analysis of Complex Data
Thomas Ertl
Abstract
During the last 25 years, visualization has developed into a scientific discipline of its own at the crossroads of computer graphics, human-computer interaction, and data analysis. Today, interactive visualization techniques play a crucial role in the process of understanding the huge datasets resulting from simulations, sensor measurements, and information systems. The introductory lecture will give an overview of the development and the current state of the field. We will present some standard methods for 3D scalar and vector fields as well as for unstructured and hierarchical data. Elaborate algorithms and efficient GPU implementations exist for many classical visualization problems. Advanced techniques will be introduced in the context of recent research projects of our institute. Closer cooperation with application domains often leads to real-life problems whose data size and dimensionality require new approaches that combine known techniques into innovative tools. One lecture will focus on scientific visualization techniques for flows, higher-order finite element data, molecular dynamics, and systems biology simulations; another lecture will present visual analytics approaches for patent databases and bibliographies, for social media and trajectory analytics, and for text visualization reaching into the e-humanities.
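As a toy illustration of the kind of data such scalar-field techniques operate on (not part of the lecture material; the synthetic field, grid resolution, and plotting choices below are invented for illustration), the following sketch contours one axis-aligned slice of a 3D scalar field:

```python
# A minimal sketch, assuming numpy and matplotlib: build a synthetic 3D
# scalar field on a regular grid and contour its central z-slice, one of
# the most basic building blocks of scalar-field visualization.
import numpy as np
import matplotlib.pyplot as plt

# Synthetic scalar field f(x, y, z) on a regular 64^3 grid.
x, y, z = np.meshgrid(*(np.linspace(-1.0, 1.0, 64),) * 3, indexing="ij")
field = np.exp(-(x**2 + y**2 + z**2)) * np.cos(4.0 * np.pi * z)

# Extract the central z-slice and draw filled iso-contours of the values.
slice_z = field[:, :, 32]
plt.contourf(slice_z.T, levels=16, cmap="viridis")
plt.colorbar(label="scalar value")
plt.title("Central slice of a synthetic scalar field")
plt.show()
```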
Biography
Thomas Ertl received an MSc in computer science from the University of Colorado at Boulder and a PhD in theoretical astrophysics from the University of Tübingen. Since 1999 he has been a full professor of computer science at the University of Stuttgart, where he now leads the Visualization and Interactive Systems Institute (VIS) and the Visualization Research Center of the University of Stuttgart (VISUS). Prior to that he was a professor of computer graphics and visualization at the University of Erlangen and a cofounder and member of the board of science+computing ag, a Tübingen-based IT company. His research interests include visualization, computer graphics, and human-computer interaction in general, with a focus on volume rendering, flow and particle visualization, parallel and hardware-accelerated graphics, large datasets and interactive steering, visual analytics of text collections and social media, and user interfaces and navigation systems for the blind. Dr. Ertl is coauthor of more than 400 scientific publications, and he has served on numerous program committees and as a papers co-chair for most conferences in the field. From 2007 to 2010 Dr. Ertl was Editor-in-Chief of the IEEE Transactions on Visualization and Computer Graphics (TVCG), and in 2011/2012 he served as Chairman of the Eurographics Association. He received the Outstanding Technical Contribution Award of the Eurographics Association and the Technical Achievement Award of the IEEE Visualization and Graphics Technical Committee in 2006. In 2007 he was elected a Member of the Heidelberg Academy of Sciences and Humanities. He received Honorary Doctorates from the Vienna University of Technology in 2011 and from the University of Magdeburg in 2014.
Machine learning for interpretable knowledge and better decisions
Peter A. Flach
Abstract
Machine learning, the art and science of algorithms that make sense of data, is rapidly becoming an established technology in many applications of computer science. Knowledge extracted from data is captured in models that link target variables with observable features in order to make predictions. The machine learning field is characterised by a very wide variety of model classes, each with different characteristics and properties that make them suitable for particular kinds of applications. In my lectures I will concentrate on tree and rule models, which have the distinct advantage of being interpretable by humans rather than black boxes. The logical nature of these models makes them particularly suited for dealing with structured data, which requires going beyond a simple attribute-value format. I will finally discuss how these and other models can be tuned to make the best possible predictions for the problem at hand.
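A minimal sketch of what "interpretable by humans" means for tree models, assuming scikit-learn (the dataset and tree depth are arbitrary choices, not taken from the lectures): fit a shallow decision tree and print it as readable if/else rules.

```python
# Fit a shallow decision tree on a toy dataset and render it as
# human-readable rules, the hallmark of interpretable tree/rule models.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(data.data, data.target)

# export_text prints the tree as nested threshold rules a person can audit.
print(export_text(tree, feature_names=list(data.feature_names)))
```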
Biography
Peter Flach has been Professor of Artificial Intelligence at the University of Bristol since 2003. An internationally leading researcher in the areas of mining highly structured data and the evaluation and improvement of machine learning models using ROC analysis, he has also published on the logic and philosophy of machine learning, and on the combination of logic and probability. He is the author of Simply Logical: Intelligent Reasoning by Example (John Wiley, 1994) and Machine Learning: the Art and Science of Algorithms that Make Sense of Data (Cambridge University Press, 2012). Prof. Flach is the Editor-in-Chief of the Machine Learning journal, one of the two top journals in the field, which has been published for over 25 years, first by Kluwer and now by Springer. He was Programme Co-Chair of the 1999 International Conference on Inductive Logic Programming, the 2001 European Conference on Machine Learning, the 2009 ACM Conference on Knowledge Discovery and Data Mining, and the 2012 European Conference on Machine Learning and Knowledge Discovery in Databases in Bristol.
Data Analytics for Automated Software Engineering
David Lo
Abstract
Building a working software system is not an easy task. Developers need to spend much effort and many resources to ensure that a software system is developed according to a set of functional and non-functional requirements. Quality assurance activities need to be performed to ensure that defects and issues plaguing a system are removed before it is released to the market. These development and quality assurance processes need to be repeated as a system evolves over a long period of time. Due to the complexity of software systems, there is a need for automated tool support to reduce the cost of developing and maintaining systems and to increase their reliability. To address this need, data analytics techniques have recently been used to recover useful and actionable information from passive software data and to convert it into automated tools that can help improve developers' productivity and reduce post-release bugs. Data analytics for automated software engineering is a new and emerging research topic that merges software engineering with fields such as data mining, information retrieval, machine learning, natural language processing, and many others. Tools and techniques from these fields are used, extended, specialized, or combined to recover actionable information from a diverse set of software data. The growth of this new research topic is propelled by the increasing availability of large amounts of software data on the internet, stored in source code repositories, discussion forums, blogs, microblogs, bug reporting systems, and many more data sources. This short course introduces participants to this exciting and emerging topic and is divided into three lectures: the first lecture introduces software engineering research problems, topics, datasets, and basic tools; the second lecture presents data mining techniques and how they can be used to automate software engineering tasks; the third lecture presents information retrieval techniques and their applications to solving software engineering problems.
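A minimal sketch of one idea mentioned above, applying information retrieval to software data: rank existing bug reports by textual similarity to a new one, e.g. to spot likely duplicates. The toy reports and the use of scikit-learn's TF-IDF are assumptions for illustration only; real systems covered in the lectures use far richer models.

```python
# Rank existing bug reports by TF-IDF cosine similarity to a new report.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

existing_reports = [
    "App crashes with NullPointerException when saving an empty file",
    "UI freezes while loading large projects over the network",
    "Crash on save: NullPointerException in FileWriter",
]
new_report = "NullPointerException thrown when I try to save a file"

vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform(existing_reports + [new_report])

# Similarity of the new report (last row) to every existing report.
scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
for score, report in sorted(zip(scores, existing_reports), reverse=True):
    print(f"{score:.2f}  {report}")
```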
Biography
David Lo is an assistant professor at the School of Information Systems, Singapore Management University, and is currently a visiting researcher at Microsoft Research, Redmond. He works at the intersection of software engineering and data mining research and has more than a hundred publications in these two areas. He received the Lee Foundation Fellowship for Research Excellence from Singapore Management University in 2009 for his research contributions in software engineering. He has won a number of research awards, including an ACM Distinguished Paper Award for his work on bug report management. He has served on the program committees of many top software engineering and data mining international conferences, including the ACM/IEEE International Conference on Software Engineering (ICSE), the IEEE/ACM International Conference on Automated Software Engineering (ASE), and the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD). He serves on the steering committee of the International Conference on Software Analysis, Evolution and Reengineering (SANER), which is a merger of two major conferences in software engineering, namely the European Conference on Software Maintenance and Reengineering (CSMR) and the Working Conference on Reverse Engineering (WCRE). He is a leading researcher in the emerging field of software analytics, which focuses on the design and development of specialized data analysis techniques to solve software engineering problems. He has delivered invited keynote speeches and lectures on the topic in many venues, such as the 2010 Workshop on Mining Unstructured Data, the 2013 Génie Logiciel Empirique Workshop, and the 2014 International Summer School on Leading Edge Software Engineering.
Reactive Cybersecurity: Attack, Detect, Evade
Pavel Laskov
Abstract
Computer security is a never-ending race between attack and defense. What makes this race particularly challenging is the value of the assets targeted by modern cyberattacks. On the "mass-market" end of cybercrime lie the personal credentials of millions of Internet users, such as credit card and bank account numbers and email and social network accounts, which can be exploited for monetary profit. On the other end of the attack spectrum lies the targeted penetration of highly sensitive corporate or governmental sites, with the aim of stealing corporate know-how and classified data, or even carrying out acts of sabotage. The strong "economic motivation" behind modern cyberattacks fuels a rapid development of novel attack methods and raises a major challenge for security technologies: to detect previously unseen threats.
This course will provide insights into the three main "ingredients" of modern cybersecurity: attack, detection and evasion techniques. The first lecture will review classical exploitation techniques, such as buffer overflows and return-oriented programming, as well as traditional countermeasures, e.g., stack canaries, ASLR and DEP. The second lecture will focus on methods for the detection of attacks in data and network traffic. It will cover algorithms for signature matching, anomaly detection and advanced content analysis. The last lecture will introduce the key techniques used by attackers to avoid detection, e.g., packing and blending, and present potential countermeasures on the defensive side.
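To make the anomaly-detection ingredient concrete, here is a heavily simplified sketch (pure standard-library Python; the toy payloads, the L1 distance, and the bigram choice are invented for illustration and are not the detection systems covered in the lectures): profile "normal" traffic by byte n-gram frequencies and flag payloads that deviate strongly from that profile.

```python
# Toy payload anomaly detection via byte n-gram frequency profiles.
from collections import Counter

def ngram_freqs(payload: bytes, n: int = 2) -> Counter:
    """Relative frequencies of byte n-grams in a payload."""
    grams = Counter(payload[i:i + n] for i in range(len(payload) - n + 1))
    total = sum(grams.values()) or 1
    return Counter({g: c / total for g, c in grams.items()})

def anomaly_score(payload: bytes, profile: Counter, n: int = 2) -> float:
    """L1 distance between the payload's n-gram profile and the baseline."""
    freqs = ngram_freqs(payload, n)
    keys = set(freqs) | set(profile)
    return sum(abs(freqs[k] - profile[k]) for k in keys)

# Toy "normal" HTTP requests used to build the baseline profile.
normal = [b"GET /index.html HTTP/1.1", b"GET /images/logo.png HTTP/1.1"]
profile = Counter()
for p in normal:
    profile += ngram_freqs(p)
profile = Counter({g: c / len(normal) for g, c in profile.items()})

print(anomaly_score(b"GET /about.html HTTP/1.1", profile))          # low score
print(anomaly_score(b"\x90\x90\x90\x31\xc0\x50\x68//sh", profile))  # much higher score
```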
Biography
Pavel Laskov graduated from the Moscow Institute of Radio, Electronics and Automation (Russia) in 1994 with a diploma in computer engineering. He received an M.Sc. and a Ph.D. in computer science from the University of Delaware (Newark, DE, USA) in 1996 and 2001, respectively. In 1997 he visited AT&T Research, where he was involved in the pioneering work on kernel methods of machine learning headed by V. Vapnik, the inventor of Support Vector Machines. From 2001 to 2010 he was a senior researcher at the Fraunhofer Institute FIRST in Berlin. In 2004 he started investigating machine learning methods for intrusion detection and has led the development of the self-learning intrusion detection system ReMIND. In 2009 he was awarded a Heisenberg Fellowship of the German Science Foundation and moved to the University of Tuebingen. His research interests span intrusion detection, static and dynamic malware analysis, security of machine learning algorithms, and many other related topics. He has published over 50 articles in refereed journals and conference proceedings and has served on the program committees of several international conferences.
Security of mobile communication systems
Valtteri Niemi
Abstract
The mini-course provides an introduction to the security features used in mobile communication systems. At the end of the course, challenges in securing future systems, such as 5G, are discussed. The course begins with some basics about communication security and cryptography, followed by the evolution of security in cellular systems. The most important security principles and mechanisms in GSM and 3G systems are explained. Then a closer look is taken at 4G (LTE) security, covering the following aspects:
· LTE security principles
· Authentication and Key Agreement
· Data protection
· LTE crypto-algorithms
· Security for intra-LTE mobility
· Interworking with other systems
· Lawful interception
· Security for home base stations
· Relay node security
Finally we take a look at characteristics of the planned 5G networks that are currently in the research phase. We discuss what kinds of new threats may emerge in this next evolution step and what kind of research is needed to address them.
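As a heavily simplified toy illustration of the challenge-response idea behind Authentication and Key Agreement (AKA), the sketch below has both sides share a secret key; the network issues a random challenge and verifies the response before keys are used. It deliberately uses standard-library HMAC-SHA-256 and is NOT the 3GPP AKA/MILENAGE algorithm set; all names and message contents are invented for illustration.

```python
# Toy shared-key challenge-response, loosely in the spirit of AKA.
import hmac, hashlib, os

shared_key = os.urandom(16)  # K, provisioned in the USIM and in the network

def compute_response(key: bytes, challenge: bytes) -> bytes:
    """Derive the expected response from the shared key and the challenge."""
    return hmac.new(key, challenge, hashlib.sha256).digest()

# Network side: issue a random challenge (RAND) and precompute the response.
rand = os.urandom(16)
expected = compute_response(shared_key, rand)

# Terminal side: derive the response and a session key from K and RAND.
response = compute_response(shared_key, rand)
session_key = hmac.new(shared_key, b"session" + rand, hashlib.sha256).digest()

# Network verifies the response in constant time before trusting the terminal.
print("authenticated:", hmac.compare_digest(response, expected))
```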
Biography
Valtteri Niemi received a PhD degree in Mathematics from the University of Turku, Finland, in 1989. After serving in various positions at the University of Turku, he was an Associate Professor in Mathematics at the University of Vaasa, Finland, during 1993-97. He joined Nokia Research Center (NRC), Helsinki, Finland, in 1997. During 2008-2010 he was at the new NRC laboratory in Lausanne, Switzerland. He was appointed a Nokia Fellow in January 2009. During 2011, Valtteri was back at NRC/Helsinki. Dr. Niemi contributed in several roles to Nokia research in the wireless security area, including cryptological aspects and privacy-enhancing technologies. Since 2012, Valtteri has been a Professor of Mathematics at the University of Turku, doing research in cryptology and its applications. Since 2014, Professor Niemi has also served as a high-end foreign expert at Xidian University. Dr. Niemi participated in the 3GPP SA3 (security) standardization group from its beginning, and during 2003-2009 he was the chairman of the group. Before 3GPP, Dr. Niemi took part in ETSI SMG 10 for GSM security specification work. He has published more than 60 scientific articles and is a co-author of four books and more than 20 patent families.