University of Tartu, Institute of Computer Science

Hajusandmetöötlus pilves (Distributed Data Processing in the Cloud, LTAT.06.005), autumn 2018/19

Practice 2 - Introduction to MapReduce - Alternative exercise guides for Eclipse IDE

This page contains alternative guides, based on the Eclipse IDE, for exercises 2 and 3 of the second lab.

Exercise 4.2 Configuring Eclipse for Hadoop

  • Start Eclipse IDE
  • Most recent Eclipse releases include Maven project support. If yours does not, install the Maven plugin for Eclipse manually:
    • In Eclipse, go to Help -> Eclipse Marketplace
    • Search for maven
    • Install Maven integration for Eclipse (m2e)
  • Import the hadoop-mapreduce-examples project (located inside the hadoop-mapreduce-project folder) as a Maven project into Eclipse
    • Go to File -> Import -> Maven -> Existing Maven Projects -> Next
    • Set the Root directory to hadoop-mapreduce-project\hadoop-mapreduce-examples\ inside the previously unpacked Hadoop source folder.
  • Wait until Maven has finished configuring the project dependencies.
  • Open the pom.xml file inside the Project directory and solve any errors that appear.
    1. If you get an error about a connected Ant build Eclipse plugin, then choose to ignore it for the current project.
    2. If you get an error about jdk.tools:
      • Make sure your JAVA_HOME system variable points to the Java JDK installation path.
        • Open your System Control Panel: Control Panel -> System -> Advanced System Settings
        • Check whether JAVA_HOME is set. Its value should be the main directory of the Java 8 JDK on your computer.
        • If it does not exist, add a new JAVA_HOME system variable
      • If you still get the same error, add the following dependency to the Maven pom.xml:
          <dependency>
            <groupId>jdk.tools</groupId>
            <artifactId>jdk.tools</artifactId>
            <scope>system</scope>
            <systemPath>${JAVA_HOME}/lib/tools.jar</systemPath>
            <version>1.8.0_161</version>
          </dependency>
        • Change the <version>1.8.0_161</version> line to match the version of the Java JDK installed on your computer.
      • Update the Maven configuration for your project
        • Right-click your project (in the Eclipse project explorer) and choose Maven -> Update Project
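
The JAVA_HOME check described above can also be done from code. The following is a minimal sketch, not part of Hadoop or Eclipse: the class name JavaHomeCheck and its diagnose method are hypothetical helpers introduced here for illustration. It assumes a Java 8 JDK layout, where tools.jar sits in the lib folder (newer JDKs no longer ship this file).

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper (not part of Hadoop or Eclipse) that inspects JAVA_HOME. */
final class JavaHomeCheck {

    /** Returns a short diagnosis for the given JAVA_HOME value (may be null). */
    static String diagnose(String javaHome) {
        if (javaHome == null || javaHome.trim().isEmpty()) {
            return "JAVA_HOME is not set";
        }
        // Assumption: a Java 8 JDK, which keeps tools.jar under <JAVA_HOME>/lib.
        Path toolsJar = Paths.get(javaHome, "lib", "tools.jar");
        if (Files.isRegularFile(toolsJar)) {
            return "OK: found " + toolsJar;
        }
        return "JAVA_HOME is set, but " + toolsJar
                + " is missing (is this a JRE instead of a Java 8 JDK?)";
    }

    public static void main(String[] args) {
        // Reads the same environment variable that the pom.xml dependency refers to.
        System.out.println(diagnose(System.getenv("JAVA_HOME")));
    }
}
```

If the program reports that tools.jar is missing, JAVA_HOME most likely points to a JRE or to a JDK newer than Java 8, and the systemPath in the dependency above will not resolve.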

Exercise 4.3 Running the WordCount example in Eclipse

  • Create a new folder named input inside your Eclipse project. All input files for the WordCount MapReduce application go there.
  • Download 5 random books from Gutenberg in text (UTF-8) format:
    • http://www.gutenberg.org/ebooks/search/?sort_order=random
  • Move the downloaded text files into the input folder
  • Find the WordCount class inside your Eclipse project (org.apache.hadoop.examples package)
  • Try to execute the WordCount class in Eclipse
    • Right click on the WordCount class -> Run As -> Java Application
    • You will initially see an error concerning the number of supplied arguments.
  • Modify the run configuration of the WordCount class to specify the arguments that are passed to it.
    • Right-click the WordCount class -> Run As -> Run Configurations... -> Arguments
    • The WordCount class takes two command line arguments: the input folder and the output folder
    • Specify the previously created folder (where you moved the Gutenberg books) as the input folder and an arbitrarily named folder as the output folder
    • Relative folder paths are resolved against the Eclipse project's main folder, so the output folder is created there
  • If the execution is successful, the output is written to the part-r-00000 file inside the output folder
    • Before running the application again, delete or move the output folder, or specify a new one. Hadoop refuses to write into an output folder that already exists.
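
To get a feel for what the result file should contain, the counting logic can be approximated in plain Java without Hadoop. This is only a sketch under stated assumptions, not the Hadoop example itself: the class LocalWordCountSketch and its methods are hypothetical names introduced here. It splits lines on whitespace with StringTokenizer, sums the occurrences of each word, and writes tab-separated "word count" lines sorted by word. It also refuses to run when the output folder already exists, to mimic the behaviour described above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical plain-Java approximation of word counting; it does not use Hadoop. */
final class LocalWordCountSketch {

    /** Splits every line on whitespace and sums the occurrences of each token. */
    static Map<String, Integer> countWords(Iterable<String> lines) {
        Map<String, Integer> counts = new TreeMap<>(); // TreeMap keeps words sorted
        for (String line : lines) {
            StringTokenizer tokens = new StringTokenizer(line);
            while (tokens.hasMoreTokens()) {
                counts.merge(tokens.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Formats the counts as "word<TAB>count" lines. */
    static List<String> format(Map<String, Integer> counts) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            out.add(e.getKey() + "\t" + e.getValue());
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: LocalWordCountSketch <input folder> <output folder>");
            System.exit(2);
        }
        Path input = Paths.get(args[0]);
        Path output = Paths.get(args[1]);
        if (Files.exists(output)) {
            // Mimics the rule that the output folder must not exist yet.
            System.err.println("Output folder already exists: " + output);
            System.exit(1);
        }
        List<Path> files;
        try (Stream<Path> listing = Files.list(input)) {
            files = listing.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        List<String> lines = new ArrayList<>();
        for (Path file : files) {
            lines.addAll(Files.readAllLines(file, StandardCharsets.UTF_8));
        }
        Files.createDirectories(output);
        // The file name only imitates the Hadoop result file for easier comparison.
        Files.write(output.resolve("part-r-00000"), format(countWords(lines)),
                StandardCharsets.UTF_8);
    }
}
```

Running it with the arguments input and a fresh output folder name should produce a file that can be compared side by side with the Hadoop result. Small differences in ordering are possible, since this sketch sorts Java strings rather than raw bytes.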
