Seminar Topics

Monitoring & Analysing Mobility in Smart Cities

The Smart City concept involves gathering large amounts of data from different sensor sources (vehicle counters, public transport, cell tower data, etc.). The challenge lies in fusing these different data into something of value. A theme of interest here is analysing and predicting how the inhabitants of a city move about, as this helps make better decisions, for example to avoid traffic congestion.

RQ: How to manage the fact that sensors cannot observe 100% of mobility events in a city?
RQ: How far ahead can predictions about mobility be made, and what influences this?
RQ: How to determine the "weak link" data sources in sensor fusion?
BuScope: Fusing Individual & Aggregated Mobility Behavior for "Live" Smart City Services
[El-Tawab'20] A Framework for Transit Monitoring System Using IoT Technology: Two Case Studies
RQ: Do context detection and monitoring conflict with the Android framework's energy-saving and security policies?
[Chatterjee'20] Detecting Mobility Context over Smartphones using Typing and Smartphone Engagement Patterns
SimRa: Using crowdsourcing to identify near miss hotspots in bicycle traffic

C1: Evolution of Abstraction and its role in edge-cloud continuum

The evolution of abstraction in computer science has played a major role in the advancement of computing paradigms. Abstraction involves simplifying complex systems by isolating relevant details while hiding unnecessary complexities. It has evolved over the years, enabling programmers and computer scientists to work at higher levels of understanding and productivity. The student will focus on the following questions:

General evolution of abstraction, starting from machine language -> assembly language -> high-level programming language -> … -> declarative/domain-specific language -> etc. (10%)
What are the layers of abstraction in infrastructure management? (45%)
Layers of abstraction in an edge device (45%)

To get a basic understanding, you may start with the resources below:

The Evolution of Abstraction in Programming Languages https://apps.dtic.mil/sti/citations/ADA059394
A good post to read: “History Of Increasing The Level Of Abstraction”, https://www.progressivegardening.com/weather-data/history-of-increasing-the-level-of-abstraction.html
Abstraction in Computer Science Education: An Overview, https://files.eric.ed.gov/fulltext/EJ1329311.pdf

C2: Generative AI for Cloud Infrastructure Automation

Generative AI refers to a subset of artificial intelligence techniques and models that are designed to generate new content that is typically similar to, or in some cases entirely different from, existing data. Generative AI models learn from a dataset and then generate new data samples that exhibit patterns and characteristics present in the training data. Generative AI can be applied in cloud infrastructure automation to enhance various aspects of cloud management and operations, including resource provisioning and scaling, cost optimization, security and anomaly detection, infrastructure configuration management, and predictive maintenance. The research questions the student needs to answer:

What is meant by cloud infrastructure automation?
Can generative AI be applied in cloud infrastructure automation? If yes, how?

Some of the references to start with:

Some web posts:
- https://azure.microsoft.com/en-us/blog/scale-generative-ai-with-new-azure-ai-infrastructure-advancements-and-availability/
Get acquainted with cloud infrastructure automation

C3: Intelligence discoverability and observability in edge infrastructure

Discoverability is the degree to which something, especially a piece of content or information, can be found in a search of a file, database, or other information system. - Wikipedia
Observability is the ability to measure the internal states of a system by examining its outputs. - Splunk
This topic focuses on the discoverability and observability aspects of edge computing infrastructure (including edge intelligence, edge knowledge, IoT data, edge services, edge devices, etc.). Some of the precise questions the student needs to focus on are:

Q: What is the difference between Discoverability and Observability in general?
Q: What can be made discoverable and observable in edge computing?
Q: How do you see these two terms in the context of Edge computing?

Some resources to start with:

Discovery systems in ubiquitous computing (https://ieeexplore-ieee-org.ezproxy.utlib.ut.ee/document/1626210)
Edge-to-Edge Resource Discovery using Metadata Replication (https://zenodo.org/record/5851025/files/27.pdf)

C6: X Discovery: A short survey

A discoverability mechanism lets an entity X on a network be discovered. The entity X can be a service, a device, a network, or knowledge. For instance, a service discovery protocol is used by a client device to find out which services it can use on a server device. Edge computing, in turn, allows data generated by IoT devices or clients to be processed by the nearest computing device at the periphery of the network. This topic focuses on the study of discovery systems in edge computing, especially device, service, and knowledge discovery at the edge. Some of the questions the student needs to focus on are:

What is meant by X discoverability?
How does X discovery work at the edge, where X can be a device, a service, or knowledge?

Some of the references to start with:

Device Discovery in D2D Communication: A Survey (https://ieeexplore.ieee.org/abstract/document/8835011)
Service Discovery (https://www.dfki.de/~klusch/i2s/SD_essay_klusch2013.pdf)
Towards Service Discovery and Invocation in Data-Centric Edge Networks (https://ieeexplore.ieee.org/abstract/document/8888081)
Collaborative Learning-Based Industrial IoT API Recommendation for Software-Defined Devices: The Implicit Knowledge Discovery Perspective (https://ieeexplore.ieee.org/abstract/document/9208715)
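
To make the service discovery idea in this topic more concrete, here is a minimal sketch of a registry-based approach, in which services announce themselves with a time-to-live and clients look them up by type. It is only an illustration under simplifying assumptions: the class name, method names, service types, and addresses are all hypothetical, and real discovery protocols add networking, security, and failure handling that are omitted here.

```python
class ServiceRegistry:
    """Toy in-memory registry (hypothetical, for illustration only).

    Services announce themselves with a time-to-live (TTL); clients
    query by service type and only see entries that have not expired.
    """

    def __init__(self):
        # Maps (service_type, address) to the time at which the entry expires.
        self._entries = {}

    def register(self, service_type, address, ttl, now):
        """Announce (or refresh) a service; it stays visible until now + ttl."""
        self._entries[(service_type, address)] = now + ttl

    def discover(self, service_type, now):
        """Return the sorted addresses of live services of the given type."""
        return sorted(
            address
            for (stype, address), expiry in self._entries.items()
            if stype == service_type and expiry > now
        )


# Hypothetical edge services; time is passed in explicitly (in seconds)
# so the example is deterministic.
registry = ServiceRegistry()
registry.register("temperature", "10.0.0.5:8080", ttl=30, now=0)
registry.register("temperature", "10.0.0.7:8080", ttl=10, now=0)
registry.register("camera", "10.0.0.9:554", ttl=30, now=0)

found_early = registry.discover("temperature", now=5)
found_late = registry.discover("temperature", now=20)
```

The TTL is one common way to cope with edge devices that disappear without deregistering: an entry that is not refreshed simply stops being returned.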

[Already Taken] Topic 1: Edge Impulse: An MLOps Platform for Tiny Machine Learning

Edge Impulse is a cloud-hosted machine learning operations platform explicitly designed to develop ML models tailored to embedded and resource-constrained devices. This platform offers a range of automated features to streamline the training, validation, and deployment processes on edge devices with minimal hassle.

The following points are to be covered:

Architecture of Edge Impulse and its ecosystem.
Sample applications using Edge Impulse (for example, predictive maintenance).
How does Edge Impulse support reducing model complexity so that models can run on edge devices?

Related research articles to start with:

 Edge Impulse

[Already Taken] Topic 2: Intelligence at the extreme edge using TinyML

The rapid expansion of Information and Communication Technology (ICT) has accelerated the widespread adoption of IoT applications across various sectors, including smart factories. In these contexts, there is a growing emphasis on processing data close to the source devices to enable faster decision-making. The concept revolves around using nearby servers to execute Machine Learning algorithms. However, these algorithms demand significant memory and CPU resources for both training and execution, which is a challenge for devices with limited resources. To address this challenge, a collection of methods and tools facilitates the creation of memory- and CPU-efficient ML models that can be integrated into diverse embedded system architectures. One noteworthy platform within this landscape is TinyML. Hence, the primary objective of this investigation is to explore the TinyML framework or toolkit, which is tailored to developing Machine Learning applications for edge computing.

RQ1: What is the TinyML framework, and what is its architecture?
RQ2: What is Reformable TinyML?
RQ3: What is the taxonomy of Reformable TinyML?

Topic 3: Fledge

Fledge is an open-source framework and a collaborative community dedicated to addressing the needs of the industrial edge, with a primary focus on critical operations, predictive maintenance, situational awareness, and safety.

The following points are to be covered in the seminar:

Architecture of Fledge
Sample use cases and their implementation

Related research articles to start with:

Fledge

[Already Taken] Topic 4: Serverless Simulators

The main goal of this topic is to list and describe the various FaaS simulators. FaaS simulators are one way of estimating cost and performance metrics in advance, without running on real serverless cloud platforms, which helps developers test and tune their workloads.

RQ1: Why are FaaS simulators necessary?
RQ2: List and explain the architecture of the various FaaS simulators available.

Related research articles to start with:

SimFaaS: A Performance Simulator for Serverless Computing Platforms
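
To illustrate what estimating cost and performance without a real platform can look like, here is a deliberately small sketch that replays a list of request arrival times against a toy FaaS model. It is not the model used by SimFaaS or any other simulator. The function name, the parameters, the price, and the billing rule (only execution time is billed) are all assumptions chosen for illustration.

```python
def simulate_faas(arrivals, exec_time, cold_start, keep_alive, price_per_second):
    """Replay request arrival times (in seconds) against a toy FaaS model.

    Simplifying assumptions: every request runs for exec_time seconds, a
    cold start adds cold_start seconds of latency, an instance stays warm
    for keep_alive seconds after it finishes, each instance serves one
    request at a time, and only execution time is billed.
    """
    instances = []  # for each instance: the time at which it becomes free
    cold_starts = 0
    billed_seconds = 0.0
    latencies = []

    for t in sorted(arrivals):
        # Discard instances whose keep-alive window expired before t.
        instances = [free_at for free_at in instances if free_at + keep_alive >= t]
        idle = [free_at for free_at in instances if free_at <= t]
        if idle:
            # Reuse the most recently used warm instance.
            instances.remove(max(idle))
            latency = exec_time
        else:
            # No warm instance is free, so a new one must be started.
            cold_starts += 1
            latency = cold_start + exec_time
        instances.append(t + latency)
        billed_seconds += exec_time
        latencies.append(latency)

    return {
        "cold_starts": cold_starts,
        "mean_latency": sum(latencies) / len(latencies) if latencies else 0.0,
        "cost": billed_seconds * price_per_second,
    }


# Hypothetical workload and price, not taken from any real provider.
result = simulate_faas(
    arrivals=[0, 1, 2, 30, 100],
    exec_time=0.5,
    cold_start=1.0,
    keep_alive=60,
    price_per_second=0.00002,
)
```

Changing `keep_alive` or the arrival pattern and re-running shows how cold starts trade off against idle capacity, which is the kind of tuning question such simulators are meant to answer before deployment.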

Some of the other topics would be:

Digital twin model for IoT applications
COSCO: Edge and Cloud Computing Coupled Simulator
Is edge computing reality or hype?
eKuiper: Lightweight IoT data streaming analytics engine for edge computing

Topic P.1. Synthetic data generators for generic real-time realistic Smart City data

The goal of this topic is to study the approaches for synthetic IoT data generation for emulating the behavior of industrial and Smart City IoT devices. The second goal is to evaluate existing state-of-the-art synthetic data generation tools and IoT data anonymization approaches.

Research questions:

What are the scalable solutions for generating IoT data that can mimic the behavior of real-world IoT data streams?
Are there any automated solutions for generating synthetic IoT data based on real data that do not expose the content of the real data?
What are the limitations of existing tools and approaches? For example: can they be used on all possible types of IoT or Smart City data?

Related research articles to start with:

Haris Isakovic, Vanja Bisanovic, Bernhard Wally, Thomas Rausch, Denise Ratasich, Schahram Dustdar, Gerti Kappel, and Radu Grosu. Sensyml: Simulation environment for large-scale IoT applications. In IECON 2019 - 45th Annual Conference of the IEEE Industrial Electronics Society, volume 1, pages 3024–3030. IEEE, 2019.

[Already Taken] Topic P.2: Sensor Data fusion in Smart Cities

A large number of sensors have been, and are being, deployed in IoT networks and Smart Cities. However, it may not be feasible to deploy a new sensor every time a new type of observation or data needs to be detected. It is sometimes more feasible to use existing sensors and data to compute or extrapolate new observations by performing data fusion. The goal of this topic is to study and give an overview of data fusion and how it is being applied in these domains, and to provide an overview of the related challenges and research gaps.
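
As one small, concrete example of what data fusion can mean, the sketch below combines several independent estimates of the same quantity using inverse-variance weighting, so that noisier sources contribute less. This is only one simple technique among the many that a survey would cover. The function name, the sensor types, and the numbers are hypothetical.

```python
def fuse_estimates(estimates):
    """Fuse independent (value, variance) estimates of a single quantity.

    Each estimate is weighted by the inverse of its variance. Returns the
    fused value and the variance of the fused estimate.
    """
    if not estimates:
        raise ValueError("at least one estimate is required")
    if any(variance <= 0 for _, variance in estimates):
        raise ValueError("variances must be positive")

    weights = [1.0 / variance for _, variance in estimates]
    total_weight = sum(weights)
    fused_value = sum(w * value for w, (value, _) in zip(weights, estimates)) / total_weight
    fused_variance = 1.0 / total_weight
    return fused_value, fused_variance


# Hypothetical readings: vehicles per minute on one street segment.
readings = [
    (42.0, 4.0),   # roadside vehicle counter, fairly precise
    (50.0, 16.0),  # estimate derived from cell tower data, noisier
]
fused_value, fused_variance = fuse_estimates(readings)
```

The fused variance is smaller than that of either source, which is the basic argument for fusing existing data rather than relying on a single sensor, as long as the independence assumption roughly holds.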

Related research articles to start with:

Billy Pik Lik Lau, Sumudu Hasala Marakkalage, Yuren Zhou, Naveed Ul Hassan, Chau Yuen, Meng Zhang, U-Xuan Tan, A survey of data fusion in smart city applications, Information Fusion, Volume 52, 2019, Pages 357-374, ISSN 1566-2535, https://doi.org/10.1016/j.inffus.2019.05.004

Topic P.3: Estimating City scale energy consumption and balance

Smart Cities aim to use data collected in cities, together with predictive models, to improve the lives of citizens and to make more data-driven decisions. One aim of such cities is solar energy balance: covering as much of the electricity demand as possible with local energy production. Previous work has investigated the production side in the City of Tartu, but this topic aims to investigate how to model and estimate more granular energy consumption in the city (mainly in buildings) at large scale.

RQ1: What methods can be applied to estimate Smart City energy consumption at different granularities (whole city, city areas, city blocks)?
RQ2: What specific data needs to be collected (and with what granularity and precision) to estimate Smart City energy consumption?

Related research articles to start with:

Constantine E. Kontokosta and Christopher Tull. A data-driven predictive model of city-scale energy use in buildings. Applied Energy, 197:303–317, 2017.
Alessio Mastrucci, Olivier Baume, Francesca Stazi, and Ulrich Leopold. Estimating energy savings for the residential building stock of an entire city: A GIS-based statistical downscaling approach applied to Rotterdam. Energy and Buildings, 75:358–367, 2014.

Topic P.4: Enhancing the Smart City Metadata

A lot of data is being collected in Smart Cities, but a common issue with collecting huge amounts of data from many sensors and systems is that the description of the collected data is often lacking, and there is not enough systematic metadata mapping and documentation.

The goal of this topic is to investigate what methods exist and can be applied to enhance the metadata of already collected data to make it more usable, structured, and manageable for non-computer scientists.

Research questions:

What tools exist for efficient and useful metadata management?
What methods exist for automated metadata generation based on raw data, and are useful for Smart City data management?
How can devices (data sources) and raw data be accurately classified into common Smart City domain models? Institute of Computer Science, University of Tartu. Available at: https://comserv.cs.ut.ee/ati_thesis/datasheet.php?id=79773&language=en

Articles to start with:

Quarati, A., De Martino, M., & Rosim, S. (2021). Geospatial open data usage and metadata quality. ISPRS International Journal of Geo-Information, 10(1), 30.
De Nicola, A., & Villani, M. L. (2021). Smart city ontologies and their applications: A systematic literature review. Sustainability, 13(10), 5578.
Kaspar Kadalipp, "Knowledge Graphs for Cataloging and Making Sense of Smart City Data", 2024

Topic P.5. Predictive maintenance of IoT devices

System monitoring is a common activity to ensure good quality of service in applications deployed in edge environments. However, due to stochastic workloads and the environments in which they are deployed, devices may be prone to outages, faults, and errors. Faults can propagate, leading to performance degradation of the application and the system. In sensitive IoT applications, such as healthcare and industrial applications, even a small failure is undesirable and can have catastrophic consequences. It is therefore important to monitor the devices by logging data and predicting failures in advance. Predictive maintenance is a technique for monitoring a system and predicting its failures. For example, memory card faults are a major problem in IoT devices that store their program or data on such cards. Several factors implicitly affect SD card failures, such as deployment of devices in harsh environments, the development of bad blocks, and malware attacks.
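
As a rough illustration of predicting a failure in advance, the sketch below fits a straight line to a logged wear indicator and extrapolates when it would reach a failure threshold. This is only one very simple approach. The indicator (a bad-block count), the threshold, the data, and the function name are all hypothetical, and real SD card degradation need not be linear.

```python
def time_to_threshold(times, values, threshold):
    """Fit a least-squares line to (time, value) samples and return the
    time at which the line reaches the threshold.

    Returns None when the fitted trend is flat or decreasing, since the
    threshold is then never reached under this model.
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two paired samples")

    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("samples must span more than one point in time")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))

    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_v - slope * mean_t
    return (threshold - intercept) / slope


# Hypothetical log: bad-block count of one SD card, sampled every 10 days.
days = [0, 10, 20, 30, 40]
bad_blocks = [2, 4, 6, 8, 10]
predicted_day = time_to_threshold(days, bad_blocks, threshold=20)
```

With this made-up log, the fitted trend reaches the threshold of 20 bad blocks at around day 90, which would leave time to schedule a replacement before that point.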

Research questions:

What SD card performance and quality data can be collected in real-time?
What environment data (e.g. temperature, pressure) should be collected to augment SD card data?
What predictive maintenance techniques have been previously applied to predict SD card failures, and how well have they worked?

Related research articles to start with:

Civerchia, Federico, et al. "Industrial Internet of Things monitoring solution for advanced predictive maintenance applications." Journal of Industrial Information Integration 7 (2017): 4-12.
Lamorie, Joshua, and Francesco Ricci. "MicroSD Operational Experience and Fault-Mitigation Techniques." (2015).

Topic P.6. Backscatter networking for the Internet of Things

In backscatter radio communication, a device transmits data to a receiver by modulating and reflecting a signal from a third, external signal source, as opposed to generating the signal itself and sending it directly to the receiver. This approach is useful mainly because of its lower energy requirements: the transmitter does not have to generate a radio signal itself, only remodulate an existing one. As a result, this technology is also interesting for IoT devices. This topic studies the existing prototypes and solutions of backscatter networking for IoT. Backscatter radio communication exploits reflected (backscattered) signals to transmit data, where the backscattered signals can be reflections of ambient radio frequency (RF) signals or of RF signals from a dedicated carrier emitter.
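
The modulate-and-reflect principle described above can be sketched with an idealized, noise-free simulation. It assumes on-off keying, in which the tag switches between a high and a low reflection coefficient per bit, and a receiver that thresholds the mean energy of each bit period. The coefficients, the carrier frequency, the threshold, and the function names are illustrative assumptions, and the sketch ignores effects such as the direct carrier signal also arriving at the receiver.

```python
import math


def backscatter(carrier, bits, samples_per_bit, reflect=0.5, absorb=0.05):
    """Tag side: scale the incoming carrier by a high or low reflection
    coefficient depending on the current bit (on-off keying).

    The tag never generates a signal of its own; it only reflects more
    or less of the external carrier.
    """
    reflected = []
    for i, sample in enumerate(carrier):
        bit = bits[i // samples_per_bit]
        reflected.append(sample * (reflect if bit else absorb))
    return reflected


def decode(received, samples_per_bit, threshold):
    """Receiver side: compare the mean energy of each bit period with a
    threshold to recover the bits."""
    bits = []
    for start in range(0, len(received), samples_per_bit):
        chunk = received[start:start + samples_per_bit]
        energy = sum(s * s for s in chunk) / len(chunk)
        bits.append(1 if energy > threshold else 0)
    return bits


bits = [1, 0, 1, 1, 0]
samples_per_bit = 100
n_samples = len(bits) * samples_per_bit

# External carrier from a third device: a sine wave with 10 full cycles
# per bit period (0.1 cycles per sample).
carrier = [math.sin(2 * math.pi * 0.1 * k) for k in range(n_samples)]

received = backscatter(carrier, bits, samples_per_bit)
decoded = decode(received, samples_per_bit, threshold=0.02)
```

In this setup `decoded` equals `bits`, because a reflected bit period has a mean energy of about 0.125 and an absorbed one about 0.00125, well on either side of the chosen threshold.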

Research questions:

How usable is backscatter networking in real-world scenarios?
What are the limitations and drawbacks of using backscatter networking for IoT data collection?

Related research articles to start with:

Jung, J., Ryoo, J., Yi, Y., & Kim, S. M. (2020, June). Gateway over the air: Towards pervasive internet connectivity for commodity IoT. In Proceedings of the 18th International Conference on Mobile Systems, Applications, and Services (pp. 54-66).

Or propose your own ideas

Some examples:

Comparison of State-of-the-Art (SOTA) open-source IoT platforms, focusing on platforms that provide features for both device integration and data storage. The theoretical part involves a survey into the most popular SOTA platforms. The practical part involves setting up and demonstrating a selected candidate platform.
Designing digital twins for Smart City visualization and monitoring
Synthetic data generators for large-scale testing of IoT and Smart City systems

Cloud Computing and Smart City Seminar 2024/25 spring
