Seminar Topics

Topic SC1: Estimating City scale energy consumption and balance

Smart Cities aim to take advantage of data collected in cities and predictive models to improve the lives of citizens and to make more data-driven decisions. Solar energy balance, that is, covering as much of the electricity demand as possible with local energy production, is one aim of such cities. Earlier work has investigated the production side for Tartu City; this topic instead investigates how to model and estimate energy consumption in the city (mainly in buildings) at finer granularity and at large scale.

RQ1: What methods can be applied to estimating Smart City energy consumption at different granularities (whole city, city areas, city blocks)?
RQ2: What specific data needs to be collected (and with what granularity and precision) to estimate Smart City energy consumption?

Related research articles to start with:

Constantine E. Kontokosta and Christopher Tull. A data-driven predictive model of city-scale energy use in buildings. Applied Energy, 197:303–317, 2017.
Alessio Mastrucci, Olivier Baume, Francesca Stazi, and Ulrich Leopold. Estimating energy savings for the residential building stock of an entire city: A GIS-based statistical downscaling approach applied to Rotterdam. Energy and Buildings, 75:358–367, 2014.
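
As a rough illustration of what estimating consumption at a finer granularity can look like, a known city-level total can be spread over city blocks in proportion to a proxy variable such as heated floor area. This is a minimal sketch of simple proportional allocation, not the method of either article above; the block names, floor areas, and city total are made-up values.

```python
def downscale(city_total_kwh, floor_area_m2):
    """Split a city-level total across blocks in proportion to floor area."""
    total_area = sum(floor_area_m2.values())
    if total_area <= 0:
        raise ValueError("total floor area must be positive")
    return {block: city_total_kwh * area / total_area
            for block, area in floor_area_m2.items()}


# Hypothetical inputs: three city blocks and an annual city-level total.
blocks = {"block_a": 12_000.0, "block_b": 30_000.0, "block_c": 18_000.0}
estimates = downscale(1_200_000.0, blocks)

for block, kwh in estimates.items():
    print(f"{block}: {kwh:.0f} kWh")
```

A real model would likely replace the single proxy with a regression on several building attributes (age, type, heating system), which is where RQ2 about required data comes in.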

Topic SC2: Analysis of Smart City building renovation level and heating systems' effect on the energy consumption

Smart Cities aim to take advantage of data collected in cities and predictive models to improve the lives of citizens and to make more data-driven decisions. There exist simulators that can estimate the energy consumption of a building based on the type of building, floor layouts, and heating systems used. We also have some historical heating and energy data available from the Tartu City government buildings and SmartEnCity project. The goal of this topic is to investigate whether the results of such simulations can be adapted to the larger scale of Smart Cities to find and identify buildings that would gain the most benefit from renovations and heating system upgrades. This can lead to designing Smart City Energy Digital Twins.

RQ1: Can single building/apartment simulation results be extended to Smart City scale (city blocks, areas, whole city) visualizations, analysis, and decisions, without having to simulate every single apartment?
RQ2: Are there existing datasets of simulation results that can be applied in the Estonian context?
RQ3: What data is required about the Smart City apartments and buildings to be able to analyze and visualize their state? Is such data available from Estonian open state registries? What extra data needs to be collected?

Related research articles to start with:

Wang, Yangmin, et al. "The Impact of Energy Renovation on Continuously and Intermittently Heated Residential Buildings in Southern Europe." Buildings 12.9 (2022): 1316.
Jakobi, Marc, et al. "BIM use-case: model-based performance optimization." 12th International Conference on Solar Energy and Buildings (EuroSun 2018), Rapperswil, 10-13 September 2018. International Solar Energy Society, 2018.
Witzig, Andreas, et al. "Quantifying energy-saving measures in office buildings by simulation in 2D cross sections." 13th Nordic Symposium on Building Physics, NSB 2023. Department of the Built Environment, Aalborg University, 2023.

[Already taken] Topic SC3: Monitoring & Analysing Mobility in Smart Cities

The Smart City concept involves gathering large amounts of data from different sensor sources (vehicle counters, public transport, cell tower data, etc.). The challenge lies in fusing these different data sources into something of value. A theme of interest here is analysing and predicting how the inhabitants of a city move about, as this helps make better decisions, for example to avoid traffic congestion.

RQ: How to manage the fact that sensors cannot observe 100% of mobility events in a city?
RQ: How far ahead can predictions about mobility be made, and what influences this?
RQ: How to determine the "weak link" data sources in sensor fusion?
BuScope: Fusing Individual & Aggregated Mobility Behavior for "Live" Smart City Services
[El-Tawab'20] A Framework for Transit Monitoring System Using IoT Technology: Two Case Studies
RQ: Does context detection and monitoring conflict with the Android framework's energy-saving and security policies?
[Chatterjee'20] Detecting Mobility Context over Smartphones using Typing and Smartphone Engagement Patterns
SimRa: Using crowdsourcing to identify near miss hotspots in bicycle traffic

Topic SC4: Synthetic data generators for generic realistic Smart City data

The goal of this topic is to study the approaches for synthetic IoT data generation for emulating the behavior of industrial and Smart City IoT devices. The second goal is to evaluate existing state-of-the-art synthetic data generation tools and IoT data anonymization approaches.

Research questions:

What are the scalable solutions for generating IoT data that can mimic the behavior of real-world IoT data streams?
Are there any automated solutions for generating synthetic IoT data based on real data that do not expose the content of the real data?
What are the limitations of existing tools and approaches? For example, can they be used on all possible types of IoT or Smart City data?
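
To give a feel for what the simplest form of synthetic IoT data generation looks like, the sketch below emits a temperature-like stream built from a daily sine cycle plus Gaussian noise. It is only an illustrative baseline under assumed parameters (the mean, amplitude, noise level, and sampling interval are all made up), not one of the state-of-the-art tools the topic asks to evaluate.

```python
import math
import random


def synthetic_temperature(n_samples, interval_s=600, seed=42,
                          mean_c=10.0, daily_amplitude_c=5.0, noise_sd_c=0.5):
    """Yield (timestamp_s, temperature_c) pairs: daily sine cycle plus noise."""
    rng = random.Random(seed)  # fixed seed makes the stream reproducible
    for i in range(n_samples):
        t = i * interval_s
        daily = daily_amplitude_c * math.sin(2 * math.pi * t / 86_400)
        yield t, mean_c + daily + rng.gauss(0.0, noise_sd_c)


# One simulated day at 10-minute resolution.
readings = list(synthetic_temperature(144))
print(readings[:3])
```

A generator like this scales trivially but captures none of the faults, gaps, or cross-sensor correlations found in real streams, which is one of the limitations the research questions point at.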

Related research articles to start with:

Haris Isakovic, Vanja Bisanovic, Bernhard Wally, Thomas Rausch, Denise Ratasich, Schahram Dustdar, Gerti Kappel, and Radu Grosu. Sensyml: Simulation environment for large-scale IoT applications. In IECON 2019-45th Annual Conference of the IEEE Industrial Electronics Society, volume 1, pages 3024–3030. IEEE, 2019.

Topic SC5: Sensor Data fusion in Smart Cities

A large number of sensors have been and are being deployed in IoT networks and Smart Cities. However, it may not be feasible to deploy a new sensor every time a new type of observation or data needs to be detected. It is sometimes more feasible to use existing sensors and data to compute or extrapolate new observations by performing data fusion. The goal of this topic is to study and give an overview of data fusion and how it is being applied in these domains, and to provide an overview of the related challenges and research gaps.
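
One of the most basic fusion techniques is inverse-variance weighting, which combines several noisy measurements of the same quantity into a single estimate with lower variance. The sketch below is only an illustration of that one technique, assuming independent sensor errors with known variances; the readings are made-up values.

```python
def fuse(estimates):
    """Fuse (value, variance) pairs by inverse-variance weighting.

    Assumes the measurement errors are independent and the variances known.
    Returns the fused value and its variance.
    """
    weights = [1.0 / variance for _, variance in estimates]
    total_weight = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, estimates)) / total_weight
    return value, 1.0 / total_weight


# Hypothetical readings: a noisy sensor (variance 4.0) and a better one (1.0).
fused_value, fused_var = fuse([(20.0, 4.0), (22.0, 1.0)])
print(fused_value, fused_var)
```

The fused estimate lies closer to the more reliable sensor, and its variance is lower than that of either input. Deriving an entirely new type of observation from heterogeneous sensors, as described above, generally needs richer models than this.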

Related research articles to start with:

Billy Pik Lik Lau, Sumudu Hasala Marakkalage, Yuren Zhou, Naveed Ul Hassan, Chau Yuen, Meng Zhang, U-Xuan Tan, A survey of data fusion in smart city applications, Information Fusion, Volume 52, 2019, Pages 357-374, ISSN 1566-2535, https://doi.org/10.1016/j.inffus.2019.05.004

[Already Taken] Topic SC6: Estimating City scale energy consumption and balance

RQ1: What methods can be applied to estimating Smart City energy consumption at different granularities (whole city, city areas, city blocks)?
RQ2: What specific data needs to be collected (and with what granularity and precision) to estimate Smart City energy consumption?

Related research articles to start with:

Constantine E. Kontokosta and Christopher Tull. A data-driven predictive model of city-scale energy use in buildings. Applied Energy, 197:303–317, 2017.
Alessio Mastrucci, Olivier Baume, Francesca Stazi, and Ulrich Leopold. Estimating energy savings for the residential building stock of an entire city: A GIS-based statistical downscaling approach applied to Rotterdam. Energy and Buildings, 75:358–367, 2014.

[Already Taken] Topic SC7: Enhancing the Smart City Metadata

A lot of data is being collected in Smart Cities, but a common issue with collecting huge amounts of data from many sensors and systems is that the description of the collected data is often lacking, and there is not enough systematic metadata mapping and documentation.

The goal of this topic is to investigate what methods exist and can be applied to enhance the metadata of already collected data to make it more usable, structured, and manageable for non-computer scientists.

Research questions:

What tools exist for efficient and useful metadata management?
What methods exist for automated metadata generation based on raw data, and are useful for Smart City data management?
How to accurately classify devices (data sources) and raw data into common Smart City domain models?

Articles to start with:

Quarati, A., De Martino, M., & Rosim, S. (2021). Geospatial open data usage and metadata quality. ISPRS International Journal of Geo-Information, 10(1), 30.
De Nicola, A., & Villani, M. L. (2021). Smart city ontologies and their applications: A systematic literature review. Sustainability, 13(10), 5578.
Kaspar Kadalipp, "Knowledge Graphs for Cataloging and Making Sense of Smart City Data", 2024. Institute of Computer Science, University of Tartu. Available at: https://comserv.cs.ut.ee/ati_thesis/datasheet.php?id=79773&language=en

Topic C1: Generative AI for Cloud Infrastructure Automation

Generative AI refers to a subset of artificial intelligence techniques and models that are designed to generate new content that is typically similar to, or in some cases entirely different from, existing data. Generative AI models learn from a dataset and then generate new data samples that exhibit patterns and characteristics present in the training data. Generative AI can be applied in cloud infrastructure automation to enhance various aspects of cloud management and operations, including resource provisioning and scaling, cost optimization, security and anomaly detection, infrastructure configuration management, and predictive maintenance. The research questions the student needs to answer:

What is meant by cloud infrastructure automation?
Can generative AI be applied in cloud infrastructure automation? If yes, how?

Some of the references to start with:

A web post:
- https://azure.microsoft.com/en-us/blog/scale-generative-ai-with-new-azure-ai-infrastructure-advancements-and-availability/
Get acquainted with cloud infrastructure automation

Topic C2: Intelligence discoverability in edge infrastructure

Discoverability is the degree to which something, especially a piece of content or information, can be found in a search of a file, database, or other information system. - Wikipedia
Observability is the ability to measure the internal states of a system by examining its outputs. - Splunk
This topic focuses on the discoverability aspects of edge computing infrastructure (including edge intelligence, edge knowledge, IoT data, edge services, edge devices, etc.). Some of the precise questions the student needs to focus on are:

Q: What can be made discoverable and observable in edge computing?
Q: How do you see these two terms in the context of Edge computing?

Some resources to start with:

Discovery systems in ubiquitous computing (https://ieeexplore-ieee-org.ezproxy.utlib.ut.ee/document/1626210)
Edge-to-Edge Resource Discovery using Metadata Replication (https://zenodo.org/record/5851025/files/27.pdf)

Topic C2B: Observability in Edge Computing and AI

Edge computing is a distributed computing paradigm that moves data processing and storage closer to the devices and users that generate the data, rather than sending it all to a central data center. Edge intelligence is a combination of edge computing with artificial intelligence (AI), enabling AI algorithms to run directly on edge devices for real-time analytics and decision-making. It adds a layer of sophisticated local intelligence to the data processing capabilities of edge computing, allowing devices to understand, learn from, and act on data without a constant cloud connection. Observability in edge intelligence is a growing research area because edge systems are distributed, resource-constrained, and often run AI workloads.

RQ: What are the current techniques and challenges?
RQ: How well do existing tools and frameworks (e.g., Prometheus, OpenTelemetry, KubeEdge) support observability at the edge?
RQ: What are the current research gaps (e.g., lightweight observability for edge AI)?

References:

Observability in Fog Computing (https://arxiv.org/abs/2411.17753)
AI-Driven Lightweight Observability Framework for Edge Computing in IoT (https://www.researchgate.net/publication/389126723_AI-Driven_Lightweight_Observability_Framework_for_Edge_Computing_in_IoT)
A Survey on Observability of Distributed Edge & Container-Based Microservices (https://ieeexplore.ieee.org/abstract/document/9837035)
Managing Observability for Non-Deterministic Workloads in AI and ML Systems (https://ijetrm.com/issues/files/Apr-2024-26-1745688100-JUNE202421.pdf)

Topic C3: X Discovery: A short survey

A discoverability mechanism lets an entity X on a network be discovered. The entity X can be a service, a device, a network, or knowledge. For instance, a service discovery protocol is used by a client device to find out about the services it can use on a server device. Edge computing, in turn, allows data generated by IoT devices or clients to be processed by the nearest computing device at the periphery of the network. This topic focuses on the study of discovery systems in edge computing, especially device, service, and knowledge discovery at the edge. Some of the questions the student needs to focus on are:

What is meant by X discoverability?
How does X discovery work at the edge, where X can be a device, a service, or knowledge?

Some of the references to start with:

Device Discovery in D2D Communication: A Survey (https://ieeexplore.ieee.org/abstract/document/8835011)
Service Discovery (https://www.dfki.de/~klusch/i2s/SD_essay_klusch2013.pdf)
Towards Service Discovery and Invocation in Data-Centric Edge Networks (https://ieeexplore.ieee.org/abstract/document/8888081)
Collaborative Learning-Based Industrial IoT API Recommendation for Software-Defined Devices: The Implicit Knowledge Discovery Perspective (https://ieeexplore.ieee.org/abstract/document/9208715)

[Already Taken] Topic C4: Intelligence at the extreme edge

The swift expansion of Information and Communication Technology (ICT) has hastened the widespread implementation of IoT applications across various sectors, including smart factories. In these contexts, there is a growing emphasis on processing data in close proximity to the source devices to enable faster decision-making. The concept revolves around utilizing nearby servers to execute Machine Learning algorithms. However, these algorithms demand significant memory and CPU resources for both training and execution, which presents a challenge when dealing with devices that have limited resources. To address this challenge, a collection of methods and tools facilitates the creation of memory- and CPU-efficient ML models that can be integrated into diverse embedded system architectures. One noteworthy approach within this landscape is TinyML. Hence, the primary objective of this investigation is to explore the TinyML framework or toolkit, specifically tailored for the development of Machine Learning applications that prioritize edge computing.

RQ1: What is the TinyML framework, and what is its architecture?
RQ2: What is Reformable TinyML?
RQ3: What is the taxonomy of Reformable TinyML?

Topic C5: Predictive maintenance of IoT devices

System monitoring is a common activity to ensure good quality of service in applications deployed in edge environments. However, due to stochastic workloads and the environments in which they are deployed, devices may be prone to outages, faults, and errors. Faults can propagate, leading to performance degradation of the application and system. In sensitive IoT applications, such as healthcare and industrial applications, even a small failure is undesirable and can lead to catastrophic failure. It is therefore important to monitor the devices by logging data and to predict failures in advance. Predictive maintenance is a technique for monitoring a system and predicting its failures. For example, memory card faults are a major problem in IoT devices, where the program or data is stored on the card. Several factors implicitly affect SD card failures, such as the deployment of devices in harsh environments, the development of bad blocks, and malware attacks.

Research questions:

What SD card performance and quality data can be collected in real-time?
What environment data (e.g. temperature, pressure) should be collected to augment SD card data?
What predictive maintenance techniques have been previously applied to predict SD card failures, and how well have they worked?
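
As a minimal sketch of the predictive step, one naive baseline is to fit a straight line to a logged health indicator and extrapolate when it will cross a failure threshold. The indicator used here (a daily bad-block count), the threshold, and the sample values are all hypothetical; whether such a metric can actually be read from a given SD card is part of the first research question.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_threshold(days, bad_blocks, threshold):
    """Extrapolate the linear trend; None if the indicator is not growing."""
    slope, intercept = fit_line(days, bad_blocks)
    if slope <= 0:
        return None
    crossing_day = (threshold - intercept) / slope
    return max(0.0, crossing_day - days[-1])


# Hypothetical log: bad-block count sampled once per day.
days = [0, 1, 2, 3, 4]
bad_blocks = [2, 4, 6, 8, 10]
remaining = days_until_threshold(days, bad_blocks, threshold=20)
print(f"estimated days until threshold: {remaining}")
```

Real degradation is rarely linear, so a baseline like this mainly serves as a reference point against which the techniques found in the literature can be compared.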

Related research articles to start with:

Civerchia, Federico, et al. "Industrial Internet of Things monitoring solution for advanced predictive maintenance applications." Journal of Industrial Information Integration 7 (2017): 4-12.
Lamorie, Joshua, and Francesco Ricci. "MicroSD Operational Experience and Fault-Mitigation Techniques." (2015).

[Already taken] Topic C6: Backscatter networking for the Internet of Things

In backscatter radio communication, a device transmits data to a receiver by modulating and reflecting a signal from a third, external signal source, as opposed to generating the signal itself and sending it directly to the receiver. This approach is useful mainly because of its lower energy requirements: the transmitter does not have to generate a radio signal itself, only remodulate one. As a result, this technology is also interesting for IoT devices. This topic studies the existing prototypes and solutions of backscatter networking for IoT. Backscatter radio communication exploits the reflected backscattered signals to transmit data, where the backscattered signals can be the reflection of ambient radio frequency (RF) signals, the RF signals from the dedicated carrier emitter

Research questions:

How usable is backscatter networking in real-world scenarios?
What are the limitations and drawbacks of using backscatter networking for IoT data collection?

Related research articles to start with:

Jung, J., Ryoo, J., Yi, Y., & Kim, S. M. (2020, June). Gateway over the air: Towards pervasive internet connectivity for commodity iot. In Proceedings of the 18th international conference on mobile systems, applications, and services (pp. 54-66).

Topic C7: Serverless Simulators

The main goal of this topic is to list and describe the various FaaS simulators. FaaS simulators are one way of estimating cost and performance metrics in advance, without running workloads on real serverless cloud platforms, which helps developers test and tune their workloads.

RQ1: Why are FaaS simulators necessary?
RQ2: List and explain the architecture of various FaaS simulators available.
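
To make concrete what such a simulator estimates, the toy sketch below replays a list of request arrival times against a single function instance and counts cold starts and billed execution time. It is a deliberately simplified illustration, not the model used by SimFaaS or any other simulator: it assumes one instance, no concurrency or queueing, and made-up values for execution time, cold-start delay, and keep-alive window.

```python
def simulate(arrivals_s, exec_s=0.2, cold_start_s=1.0, keep_alive_s=600.0):
    """Replay arrivals against one instance; return (cold_starts, billed_s).

    Simplifying assumptions: a single instance, requests never overlap,
    and only execution time (not the cold-start delay) is billed.
    """
    cold_starts = 0
    billed_s = 0.0
    warm_until = None  # time until which the idle instance is kept warm
    for t in sorted(arrivals_s):
        if warm_until is None or t > warm_until:
            cold_starts += 1
            finish = t + cold_start_s + exec_s
        else:
            finish = t + exec_s
        billed_s += exec_s
        warm_until = finish + keep_alive_s
    return cold_starts, billed_s


# Hypothetical trace: two bursts separated by a long idle period.
cold, billed = simulate([0.0, 10.0, 2000.0, 2005.0])
print(f"cold starts: {cold}, billed seconds: {billed:.1f}")
```

Changing the keep-alive window or the arrival pattern and re-running is the kind of what-if experiment that would otherwise require deploying to a real platform.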

Related research articles to start with: SimFaaS: A Performance Simulator for Serverless Computing Platforms

Some of the other topics would be:

Digital twin model for IoT applications
COSCO: Edge and Cloud Computing Coupled Simulator
Is edge computing reality or hype?
eKuiper: Lightweight IoT data streaming analytics engine for edge computing

[Already Taken] Topic C8: Optimizing Serverless function performance and cost

The Serverless Functions billing model is based on the runtime, memory allocation, and number of function calls. Functions can be written in different programming languages and runtimes such as Go, Python, Rust, C#, etc. However, cost and performance depend on the runtime used, how well the functions are coded, how much memory is required, and a number of other characteristics. This topic aims to study the impact of different function design choices, such as the choice of programming language, on the run time, performance, and cost of the applications.
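
The billing model described above can be written down as a small formula: a compute charge proportional to run time multiplied by allocated memory, plus a per-request charge. The sketch below assumes that common GB-second structure; both prices are placeholder parameters chosen for illustration, not the rates of any actual provider.

```python
def monthly_cost(invocations, avg_duration_s, memory_mb,
                 price_per_gb_second, price_per_million_requests):
    """Estimate cost as compute (GB-seconds) plus a per-request charge."""
    gb_seconds = invocations * avg_duration_s * (memory_mb / 1024)
    compute = gb_seconds * price_per_gb_second
    requests = invocations / 1_000_000 * price_per_million_requests
    return compute + requests


# Hypothetical workload and placeholder prices.
cost = monthly_cost(
    invocations=2_000_000,
    avg_duration_s=0.5,
    memory_mb=512,
    price_per_gb_second=0.00002,
    price_per_million_requests=0.25,
)
print(f"estimated monthly cost: {cost:.2f}")
```

Under this structure, halving either the average duration or the memory allocation halves the compute part of the bill, which is why the choice of language runtime and memory size matters for cost.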

RQ1: How much does the choice of programming language affect the performance of Serverless functions? Which programming languages result in lower-cost Serverless functions?
RQ2: Which software design choices affect the performance and cost of serverless functions?
RQ3: Which other characteristics have a large impact on the cost of serverless functions?

Related research articles to start with:

Jackson, David, and Gary Clynch. "An investigation of the impact of language runtime on the performance and cost of serverless functions." 2018 IEEE/ACM International Conference on Utility and Cloud Computing Companion (UCC Companion). IEEE, 2018.
Eismann, Simon, et al. "Sizeless: Predicting the optimal size of serverless functions." Proceedings of the 22nd International Middleware Conference. 2021.
Sedefoğlu, Özgür, and Hasan Sözer. "Cost minimization for deploying serverless functions." Proceedings of the 36th Annual ACM Symposium on Applied Computing. 2021.

[Already taken] Topic C9: Serverless Big Data Processing and Warehousing

Serverless has been considered a computing model for lightweight event-based applications and has historically been considered unsuitable for large-scale data processing. The goal of this topic is to investigate whether this has recently changed with the addition of new approaches and solutions, or whether this claim still holds true.

RQ1: Is the serverless model suitable for big data processing or building data warehouses in comparison to more monolithic approaches?
RQ2: What characteristics of the serverless model are the main bottlenecks when large-scale data needs to be processed and queried?

Related research articles to start with:

Hellerstein, Joseph M., et al. "Serverless computing: One step forward, two steps back." arXiv preprint arXiv:1812.03651 (2018).
Aimer Bhat, Heeki Park, and Madhumonti Roy. 2021. Evaluating Serverless Architecture for Big Data Enterprise Applications. In 2021 IEEE/ACM 8th International Conference on Big Data Computing, Applications and Technologies (BDCAT '21). Association for Computing Machinery, New York, NY, USA, 1–8. DOI:https://doi.org/10.1145/3492324.3494169
Werner, Sebastian, and Stefan Tai. "A reference architecture for serverless big data processing." Future Generation Computer Systems 155 (2024): 179-192.

Or propose your own ideas

Some examples:

Designing digital twins for Smart City visualization and monitoring
Osmotic computing
Smart City ontologies
Smart city Data Marketplaces

Cloud Technology and Smart City Seminar, Autumn 2025/26
