Data engineering for Conversion Master's (LTAT.02.026), 2022/23 spring

  1. Create a dev container with VS Code: Use the 'Add Dev Container Configuration' command to set up a development container in Visual Studio Code, which gives you a consistent and reproducible development environment.
  2. Open the ETL project or create a new one: Open your existing ETL project inside the dev container, or start a new project for working with the air quality data.
  3. Install Python packages: Install the Python packages needed for working with DuckDB and Parquet files, such as duckdb and pyarrow.
  4. Write air quality data as a Parquet file: Convert the air quality data to the Parquet format, a columnar format that stores data compactly and speeds up analytical queries.
  5. Query the Parquet file using DuckDB in Python: Use DuckDB to run SQL queries directly against the Parquet file from a Python script.
  6. Install R packages and query the Parquet file in R: Install the required R packages, such as duckdb and DBI, then run queries and analysis in R. This shows that the same Parquet file can be used from different programming languages.
  • Institute of Computer Science
  • Faculty of Science and Technology
  • University of Tartu
Contact the course organizers with questions about course organization or content.
The proprietary copyrights of educational materials belong to the University of Tartu. The use of educational materials is permitted for the purposes and under the conditions provided for in the copyright law for the free use of a work. When using educational materials, the user is obligated to give credit to the author of the educational materials.
The use of educational materials for other purposes is allowed only with the prior written consent of the University of Tartu.