Seminar Topics
Topics of Chinmaya Dehury
email: chinmaya.dehury@ut.ee
1. Predicting Cloud service demands. As we know, most frequently used apps such as Instagram, Twitter, Spotify, etc., are deployed in cloud environments. Sometimes the usage of such applications is very high and sometimes it is very low. But can we predict how heavily an app will be used in the next few hours? In short, what will the future demand for a cloud-based service be? This is the question we will answer in this topic.
- In this topic, we will focus on the following tasks:
- Find out how cloud resources are allocated to an app/service.
- Gather datasets on the resource usage of different cloud-based applications (2-4 use cases).
- Apply AI tools to predict future demand and verify the results against the dataset.
- What will you learn?
- Basics of cloud computing.
- How are the resources allocated to the applications?
- Basics of ML algorithms.
- How to apply ML tools to predict something?
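As a taste of the prediction task, here is a minimal sketch that fits a one-lag linear model to a usage series and forecasts the next hour. The hourly request counts are invented for illustration; a real study would use richer features and an ML library such as scikit-learn.

```python
# Minimal sketch: forecast the next hour's request load for a cloud service
# from its recent history using a one-lag linear model (an AR(1) fit).
# The hourly numbers below are synthetic, for illustration only.

def fit_ar1(series):
    """Least-squares fit of y[t] = a * y[t-1] + b."""
    xs = series[:-1]
    ys = series[1:]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

def predict_next(series):
    a, b = fit_ar1(series)
    return a * series[-1] + b

# Hourly request counts for a hypothetical app (synthetic upward trend).
history = [100, 110, 121, 133, 146, 161, 177]
forecast = predict_next(history)
```

On this synthetic series the fitted model continues the roughly 10% hourly growth; verifying such forecasts against held-out hours is exactly the evaluation step listed above.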
2. Survey of Reinforcement learning frameworks. Reinforcement learning is one of the three ML paradigms. Here, a software agent takes actions based on its understanding of the environment and its past experience, for example, finding a path from one location to another or solving a knight-prince problem. There are several frameworks for addressing different kinds of problems. In this topic, we will study different RL frameworks.
- In this topic, we will focus on the following tasks:
- Understanding the fundamental concept of Reinforcement Learning
- Survey of different RL frameworks (such as OpenAI Gym, DeepMind Lab, Amazon SageMaker RL, Dopamine, etc.)
- What will you learn?
- Basics of AI
- Basics of Reinforcement Learning
- Advantages and limitations of RL frameworks
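To make concrete the agent-environment loop that these frameworks standardise, here is a minimal sketch of tabular Q-learning on a toy 5-cell corridor where the agent must walk from cell 0 to a goal at cell 4. The environment, rewards, and hyperparameters are all invented for illustration.

```python
import random

# Toy pathfinding task: 5-cell corridor, start at cell 0, goal at cell 4.
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]          # move left, move right

def step(state, action):
    """Environment transition: clamp to the corridor, reward 1 at the goal."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == GOAL
    return nxt, (1.0 if done else 0.0), done

def train(episodes=500, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    random.seed(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # Q[state][action_index]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy action choice: explore sometimes, else exploit.
            if random.random() < eps:
                a = random.randrange(2)
            else:
                a = max((0, 1), key=lambda i: q[s][i])
            nxt, r, done = step(s, ACTIONS[a])
            # Q-learning update: bootstrap from the best next-state value.
            q[s][a] += alpha * (r + gamma * max(q[nxt]) - q[s][a])
            s = nxt
    return q

q = train()
policy = [max((0, 1), key=lambda i: q[s][i]) for s in range(N_STATES)]
```

After training, the greedy policy chooses "move right" in every non-goal cell. Frameworks such as OpenAI Gym formalise exactly this `step`/reward interface so that agents and environments can be swapped independently.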
3. Survey of Reinforcement learning techniques in cloud resource management. Reinforcement learning is one of the three ML paradigms. Here, a software agent takes actions based on its understanding of the environment and its past experience, for example, finding a path from one location to another or solving a knight-prince problem. In this topic, the student will focus on existing RL techniques for cloud resource management.
- In this topic, we will focus on the following tasks:
- Understanding the fundamental concept of Reinforcement Learning
- Basics of cloud computing
- Basics of resource management in cloud computing
- What will you learn?
- Basics of AI
- Basics of Reinforcement Learning
- What is resource management in cloud computing
4. Integration of IoT-Fog-Cloud environment. The student will be responsible for developing a framework that offers easy integration of IoT devices, the fog environment, and the cloud environment. For this, the student needs to use Apache NiFi, MiNiFi, OpenFaaS, and a fog environment set up on a Raspberry Pi 4 device. This project has already been started, and the student will resume the development work.
Topics of Mainak Adhikari
1. Federated Learning (FL) at edge devices. Unlike traditional ML approaches, FL with different DL approaches trains a distributed model, e.g., in edge computing, on the data stored in the local edge devices. Afterward, the trained gradient weights are aggregated and transferred to the centralized cloud servers to build a global intelligent edge-cloud model. Finally, the global model is pushed back to the edge devices for inference. More importantly, during data analysis with the FL technique, the data always stays on the local edge device, which minimizes data leakage and data transmission cost. Given these properties, and the need to improve the accuracy and training speed of the network model, FL-aware training at the edge is one of the important research challenges. Most FL-aware approaches can process hundreds of edge devices in parallel in a synchronized order to train the edge network. However, the limited processing capacity and battery endurance of the edge devices, combined with different offloading and scheduling strategies, make it difficult to synchronize the edge devices in each iteration. Thus, adapting FL to different edge devices and training the network is still a research challenge in edge computing.
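The aggregation step described above can be sketched as a FedAvg-style weighted average: each device trains locally, only its weight vector leaves the device, and the cloud averages the vectors weighted by local dataset size. The weight vectors and sample counts below are invented for illustration.

```python
# Minimal sketch of federated aggregation: the raw data never leaves the
# edge devices; only trained weight vectors are averaged in the cloud.

def federated_average(local_weights, sample_counts):
    """Weighted average of per-device weight vectors (FedAvg-style)."""
    total = sum(sample_counts)
    dim = len(local_weights[0])
    return [
        sum(w[i] * n for w, n in zip(local_weights, sample_counts)) / total
        for i in range(dim)
    ]

# Three edge devices, each reporting a 2-parameter model after local training.
weights = [[0.2, 1.0], [0.4, 1.2], [0.6, 1.4]]
counts = [100, 100, 200]            # local dataset sizes
global_model = federated_average(weights, counts)
# The global model would then be pushed back to each device for inference.
```

The synchronization challenge mentioned above appears exactly here: the averaging step must wait for (or otherwise handle) stragglers among the participating devices in each round.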
2. Distributed DRL on edge computing. Although DRL is a powerful ML tool for dynamically extracting network information and the resource utilization of edge devices, it can place a heavy burden on a single gateway or edge device when analyzing the performance metrics. To decrease the complexity of the DRL methodology and minimize the overhead on any single gateway or edge device, researchers aim to distribute the load among multiple gateways or edge devices using a distributed DRL methodology. In this approach, the performance parameters of the network and edge devices are distributed among nearby smart gateway devices and analyzed independently. The gradient NN layer updates the layer weights to form a group with neighboring gateway and edge devices based on proper allocation of the neural cells. As a result, the distributed DRL methodology converges to a stable result for real-time data offloading.
3. Cross-Domain Interoperability. The growing demand for cross-domain IoT technology emphasizes the need for IoT platform-aware interoperability with consistent resource sharing. To enable interoperability, IoT platforms must cooperate with other platforms vertically, share data with other platforms, and provide cross-platform services to IoT users registered through other platforms. This could be accomplished by implementing a cohesive abstraction layer for standard connectivity to the virtual servers exposed by the existing IoT platforms. However, all these issues still need to be addressed. From the security perspective, gateway devices can also be designed as a cross-platform interface for industrial applications. Hence, dynamic network optimization and cross-platform design for gateway devices could be a research scope for designing an interoperable IoT ecosystem.
4. Power and Energy efficient task offloading in hierarchical Fog-Cloud environment. Cloud computing is a scale-based computing model that consists of thousands of servers and consumes an extremely large amount of electricity, which increases the cost for the service provider and harms the environment. In contrast, fog devices are distributed globally with limited resource capacity and consume a minimal amount of energy while processing IoT applications. The energy consumption of the resources is directly proportional to the CO2 emission rate and the temperature of the computing devices, which also affects the environment. Moreover, unlike VM instances, containers hold a minimal amount of resources and therefore consume minimal energy. So, an energy-efficient offloading strategy is an important issue in the fog and cloud domain for reducing energy consumption and minimizing the CO2 emission rate and temperature of the computing devices. One energy-efficient scheduling strategy is to place IoT applications on local computing devices with minimum delay and transmission time.
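The placement idea can be sketched as follows, assuming a simple model in which energy = power draw x execution time and execution time = task workload / device capacity. All device names and figures here are hypothetical.

```python
# Minimal sketch of energy-aware offloading: estimate each candidate
# device's energy cost for a task and offload to the cheapest one.
# Device capacities (MIPS) and power draws (watts) are invented.

def energy_cost(workload_mi, capacity_mips, power_watts):
    """Energy in joules: power draw times estimated execution time."""
    exec_time_s = workload_mi / capacity_mips
    return power_watts * exec_time_s

def pick_device(workload_mi, devices):
    return min(devices, key=lambda d: energy_cost(workload_mi, d["mips"], d["watts"]))

devices = [
    {"name": "cloud-vm", "mips": 4000, "watts": 200.0},
    {"name": "fog-node", "mips": 1000, "watts": 10.0},
    {"name": "pi-edge",  "mips": 500,  "watts": 6.0},
]
best = pick_device(20000, devices)
```

In this toy example the cloud VM finishes fastest but burns the most energy, so the fog node wins; a full strategy would also weigh delay, transmission cost, and the CO2 proportionality noted above.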
5. IoT task Scheduling with dwindling resource requirements of the Fog devices. The main goal of the emerging technologies in a distributed environment is to utilize the resources of the computing devices efficiently. The service providers receive sensed data from various sensors in several contiguous stages, each having a specific size and resource requirements. Each application is associated with a response time and a deadline. So, the main goal of IoT task scheduling with dwindling resource requirements is to find an optimal computing device with sufficient resource capacity that meets the deadlines of the tasks while utilizing resources efficiently. For example, patient monitoring data should be offloaded to local fog devices for faster processing with minimum delay within the QoS constraints.
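A minimal sketch of such deadline-aware selection, assuming a simple completion-time model (current queue wait plus execution time) and hypothetical device figures: keep only the devices that can meet the deadline, then pick the smallest one that still qualifies, so larger devices stay free for heavier tasks.

```python
# Minimal sketch of deadline-aware device selection for an IoT task.
# Capacities (MIPS) and queue waits are invented for illustration.

def completion_time(task_mi, dev):
    """Estimated completion: current queue wait + execution time."""
    return dev["queue_s"] + task_mi / dev["mips"]

def schedule(task_mi, deadline_s, devices):
    feasible = [d for d in devices if completion_time(task_mi, d) <= deadline_s]
    if not feasible:
        return None                               # no device meets the deadline
    return min(feasible, key=lambda d: d["mips"])  # smallest adequate device

devices = [
    {"name": "fog-1", "mips": 800,  "queue_s": 1.0},
    {"name": "fog-2", "mips": 2000, "queue_s": 0.5},
    {"name": "cloud", "mips": 8000, "queue_s": 4.0},  # network delay folded in
]
choice = schedule(4000, 5.0, devices)
```

Here fog-1 misses the 5-second deadline, so the scheduler picks fog-2 rather than the over-provisioned cloud, matching the patient-monitoring example above where local fog processing meets the QoS constraint.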
6. Hybrid optimization Strategy on Edge/Fog computing. Hybrid optimization strategies with multiple NIMH algorithms are useful for making optimal decisions for complex problems. Edge computing has multiple conflicting parameters to weigh when making an optimal decision for various research challenges. For example, an edge controller must find a suitable edge device based on (a) the congestion of the network and (b) the CPU and memory availability of the active edge devices. In such scenarios, a single NIMH algorithm may not generate an optimal decision within a stipulated time period. Thus, a hybrid optimization strategy with multiple NIMH algorithms is useful: it divides the tasks into multiple categories and optimizes the QoS parameters independently to reach an optimal decision in the edge environment.
7. Topology Control in Edge computing. Topology control is a fundamental research aspect in a dynamic edge computing environment. An edge computing domain is made up of heterogeneous nodes that seamlessly interact with each other while reshaping the network topology. Thus, one of the most challenging tasks is to relocate the fog devices to optimal locations to repair or augment coverage, or to deploy or reallocate the static sensor nodes or objects. Both of these perspectives require an optimal approach to meet the objectives. NIMH/dynamic/greedy/deep learning algorithms help with self-relocation of the sensor nodes or edge devices based on user demands, which reduces the latency and energy consumption of data offloading and processing. Moreover, these algorithms find reliable communication networks for transmitting data to suitable edge devices in quick succession. This policy reduces the overall transmission time of real-time applications with minimum network overhead.
8. Online resource provisioning for real-time applications. Online resource provisioning for real-time data processing on local edge devices is one of the prominent research challenges in edge computing. Due to the streaming nature of real-time applications, knowing the volume of the data sequence in advance when selecting a suitable edge device is not feasible for the edge controller. In this regard, the edge controller needs to find an optimal edge device or server for each real-time application in immediate mode using a NIMH/dynamic/greedy/deep learning algorithm based on different QoS constraints, including resource availability, deadline and budget constraints, etc., without prior knowledge of the data volume. Moreover, selecting an optimal fog device in online mode can minimize the total queueing time and energy consumption of the applications.
Real-time data processing and Cloud Computing Frameworks -- Pelle Jakovits (Responsible person)
- (Already Taken) Orchestrating complex Data Pipelines processing real-time IoT data. The student should investigate existing data pipeline orchestration frameworks (such as Apache NiFi) and recent literature on this topic, concentrating on managing IoT data flows that fuse data from a large number of geographically distributed data sources and that may require deploying data processing tasks at different distances from the data sources (a Fog Computing scenario).
- Real-time Visualization of streaming data - The student should perform a literature study and present in the seminar the newest advances, best practices, and available solutions for visualizing large-scale streaming data in real time. If suitable open-source visualization tools are available in the context of this topic, the student should demonstrate real-time data visualization on an illustrative scenario.
- (Already Taken) Real-time visitor count estimation in lecture rooms - The Delta Building is a new building to house the Institute of Computer Science. Its construction is to be finished in 2020. There are plans for a number of different modern sensors to be placed in the building. The Computer Graphics and Virtual Reality lab's students are working on a real-time visualization of the people and activities inside the building. For that purpose, there is a desire to know how many people occupy each room (including the hallways) at any given moment. The goal of this topic is to study the state of the art in sensor analytics or image processing (or their fusion) and to propose a usable approach for real-time visitor count estimation in lecture rooms.
- (Already Taken) Comparison of state-of-the-art Data pipeline technologies - The goal of this topic is to introduce others to data pipeline technologies and compare the most popular state-of-the-art options (e.g. Apache NiFi vs. Apache Airflow vs. ...). The student should investigate related work in this field and examine the differences in supported features, usability, and performance/scalability.
- Large Scale Serverless Data Processing - Performance of data processing with Functions as a Service (FaaS) and comparison to traditional cloud-based Big Data stream processing.
- (Already Taken, HK) DataOps - Bringing DevOps to the data era. DataOps manages the entire data lifecycle, from data acquisition and cleaning to analysis and reporting. It recognizes the interconnected nature of the data analytics team and IT operations.
- (Already Taken) Comparison of State-of-the-Art (SOTA) open source IoT platforms, focusing on platforms that provide features for both device integration and data storage. The theoretical part involves a survey of the most popular SOTA platforms. The practical part may involve demonstrating a selected candidate platform.
- Docker performance aspects when running a large number of small Docker containers on resource-constrained devices.
- Viability of Serverless - Performance of FaaS cloud applications in comparison to microservice and monolithic applications in real-life scenarios.
Topics by Shivananda R Poojara
(Already Taken) Topic 1: Accelerating the faas-federation in an edge environment. This topic aims at developing an API to connect multiple FaaS providers across an edge environment, i.e., connecting two or more different FaaS provider types together, for instance Kubernetes (faas-netes) and Lambda (faas-lambda). This means you can have a single, centralized control plane but deploy to both AWS Lambda and Kubernetes at the same time. Contributions can be made to the open-source project https://github.com/openfaas-incubator/faas-federation.
Topic 2: Intelligent Epilepsy seizure prediction using fog/edge computing environments. This topic aims to design a system that predicts the occurrence of seizures in epilepsy patients by collecting various data from mobile phones and other behavioural parameters. Prediction can make use of existing novel machine learning algorithms. To detect seizures on the fly, whether at work, while travelling, or at any point in time, the fog environment can be used as the computation infrastructure to process the prediction tasks. This type of application is sensitive and requires extra attention to obtain early results and respond immediately. So, service latency and reliability are important parameters to consider when scheduling computation tasks in the fog environment. The topic also aims to design and develop an efficient scheduling algorithm for the fog environment and to use the power of serverless.
Topic 3: Minimizing the execution cost by faas-federation in an edge environment. This topic aims at designing an efficient faas-federation strategy to minimize the execution cost of an application. This can be achieved by efficiently allocating serverless functions to different FaaS providers for real-time processing across edge environments. The FaaS providers may include local FaaS clusters or commercial providers such as AWS Lambda or Azure Functions.
Topic 4: Recent trends in healthcare edge analytics.
Topic 5: Evolution of edge analytics and an industry approach.
Topic 6: Role of DevOps in edge analytics.
Topic 7: Data pipelines in autonomous vehicles.
(Already Taken) Topic 8: Explore FaaS for distributed data processing frameworks such as Apache Spark (https://github.com/Hydrospheredata/mist) and Dask (distributed processing using Python).
Topics by Jakob Mass
1. Bluetooth Low Energy and Mesh networks
Mesh networking is a kind of network topology based on local, direct device-to-device connections, allowing dynamic, non-hierarchical connectivity that is typically self-organizing. Bluetooth Low Energy (BLE) is a widely used wireless technology that is mainly focused on a star topology. To gain the benefits of mesh networking, recent works in both academia and industry have looked into creating mesh networks using BLE.
In this topic, the student should give a comprehensive overview of mesh networking, BLE and solutions which join the two. A recent survey paper provides support.
(Already Taken) 2. Overview and comparison of IoT device management frameworks/firmwares
Management of a large number of end devices (sensors, actuators) can become tedious without an additional framework. These frameworks typically aid with functions such as:
- updating configuration of devices (e.g. how often to publish data, which server to publish to)
- updating the code (pushing new firmware over the internet)
- remote logging
- managing duty cycles of the devices
In this topic, the student should analyse the 2 platforms below with an ESP32 or similar development board and present a comparison of them.