Transcription

Hadoop on FreeBSD with ZFS Tutorial
By Benedict Reuschling

This article provides a scripted tutorial using an Ansible playbook to build the cluster from the ground up. Manual installation and configuration quickly becomes tedious the more nodes are involved, which is why automating the installation removes some of the burden for BSD system administrators. The tutorial builds hadoop on ZFS to make use of the powerful features of that file system with an integrated volume manager to enhance HDFS's feature set.

Hadoop is an open-source, distributed framework using the map-reduce programming paradigm to split big computing jobs into smaller pieces. Those pieces are then distributed to all the nodes in the cluster (map step), where the participating datanodes calculate parts of the problem in parallel. In the reduce step, those partial results are collected and the final result is computed. Results are stored in the hadoop distributed filesystem (HDFS). Coordination is done via a master node called the namenode. Hadoop consists of many components and can run on commodity hardware without requiring many resources. The more nodes that participate in the cluster, and provided the problem can be expressed in map-reduce terms, the better the performance compared to running the calculation on a single node. Mostly written in Java, hadoop aims to provide enough redundancy to allow nodes to fail while still maintaining a functional compute cluster. A rich ecosystem of additional software grew up around hadoop, which makes the task of setting up a cluster of hadoop machines for big data applications a difficult one.

Requirements

This tutorial uses three FreeBSD 11.2 machines, either physical or virtual. Other and older BSD versions should work as well, as long as they support a recent version of Java/OpenJDK. OpenZFS is used here; if OpenZFS is not available, regular directories are fine, too. One machine will serve as the master node (called namenode) and the other two will serve as compute nodes (or datanodes in hadoop terms). They need to be able to connect to each other over the network. Also, an Ansible setup must be available for the playbook to work. This involves an inventory file that contains the three machines (a sample is sketched below) and the necessary software on the target machines (python 2.7 or higher) for Ansible to send commands to them.

Note that this playbook does not use the default paths used by the FreeBSD port/package of hadoop. This way, a higher version of hadoop can be used before the port gets updated. The default FreeBSD paths can easily be substituted when required. The configuration files presented in this tutorial contain only the minimal sections required to get a basic hadoop setup going. The FreeBSD port/package contains sample configuration files with many more configuration options than are initially needed. However, the port is a great resource for extending and learning about the hadoop cluster once it is set up.

Readers unfamiliar with Ansible should be able to abstract the setup steps and either implement them with a different configuration management system (puppet, chef, saltstack) or execute the steps manually.
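The inventory mentioned above is not part of the playbook itself. A minimal sketch of what it could look like on the control machine is shown here; the file name hosts and the group name hadoop are assumptions, and the hostnames must resolve in your environment:

# Write a minimal Ansible inventory on the control machine
# (hypothetical file name and group layout; adjust hostnames as needed).
cat > hosts <<'EOF'
[hadoop]
namenode
datanode1
datanode2
EOF

# Confirm that Ansible can reach all three machines via this inventory.
ansible -i hosts -m ping hadoop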

Tutorial

Defining Playbook Variables

To make management easier, a separate vars.yml file that holds all the variables is used. This file contains all the information needed to install the cluster in a central location. For example, when a higher version of hadoop should be used, only the hdp_ver variable must be changed.

java_home: "/usr/local/openjdk8"
hdp: "hadoop"
hdp_zpool_name: "{{hdp}}pool"
hdp_ver: "2.9.0"
hdp_destdir: "/usr/local/{{hdp}}{{hdp_ver}}"
hdp_home: "/home/{{hdp}}"
hdp_tmp_dir: "/{{hdp}}/tmp"
hdp_name_dir: "/{{hdp}}/hdfs/namenode"
hdp_data_dir: "/{{hdp}}/hdfs/datanode"
hdp_mapred_dir: "/{{hdp}}/mapred.local.dir"
hdp_namesec_dir: "/{{hdp}}/namesecondary"
hdp_keyname: "my_hadoop_key"
hdp_keytype: "ed25519"
hdp_user_password: "{{vault_hdp_user_pw}}"

File: vars.yml

The first line stores the location of the OpenJDK installed from ports. To save a bit of typing in the playbook and to replace common occurrences of the word hadoop, a shorter variable hdp is used as a prefix to all the rest of the variables. The zpool name (hadooppool in this tutorial) already makes use of the variable defined for hadoop's name. As mentioned above, the hadoop version is used to keep track of which version this cluster is based on. The remaining variables describe ZFS datasets (or directories) that the software uses for the hadoop user while running. The last couple of lines define the SSH key that hadoop needs to securely connect between the nodes of the cluster. Secrets like passwords for the hadoop user are stored in Ansible-vault. This way, the playbook can be shared with others without exposing the passwords set for that individual cluster.

To create a new vault, the ansible-vault(1) command with the create subcommand is used, followed by the path where the encrypted vault file should be stored.

% ansible-vault create vault.yml

After being prompted to create a passphrase to open the vault, an editor is opened on the vault file and secrets can be stored within. Refer to the Ansible documentation (https://docs.ansible.com/ansible/latest/reference_appendices/faq.html#how-do-i-generate-encrypted-passwords-for-the-user-module) on how to generate encrypted passwords that the Ansible user module can understand. The line in the vault should look like this, with yourpassword replaced by the password hash:

vault_hdp_user_pw: "yourpassword"

File: vault.yml
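One way to produce a hash that the user module accepts is Ansible's own password_hash filter, as described in the Ansible FAQ; the password and salt below are placeholders. The result can then be pasted into the encrypted vault with ansible-vault edit:

# Generate a SHA-512 crypt hash locally (placeholder password and salt;
# some platforms need the passlib Python module, as noted in the FAQ).
ansible all -i localhost, -m debug \
  -a "msg={{ 'yourpassword' | password_hash('sha512', 'mysecretsalt') }}"

# Paste the resulting hash into the encrypted vault file.
ansible-vault edit vault.yml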

Playbook Contents

The playbook itself is divided into several sections to help better understand what is being done in each of them. The first part is the beginning of the playbook, where it describes what the playbook does (name), which hosts to work on (hosts), and where the variables and the vault are stored (vars_files):

#!/usr/local/bin/ansible-playbook
- name: "Install a {{hdp}} {{hdp_ver}} multi node cluster"
  hosts: "{{host}}"
  vars_files:
    - vault.yml
    - vars.yml

The first line ensures that the playbook can run like a regular shell script once it is made executable (chmod +x). The name: describes what this playbook is doing and uses the variables defined in vars.yml. The hosts are provided on the command line later to make it more flexible to add more machines. Alternatively, when there is a predetermined number of hosts for the cluster, they can also be entered in the hosts: line.

Next, the tasks that the playbook should execute are defined (be careful not to use tabs for indentation, as this is YAML syntax):

  tasks:
    - name: "Install required software for {{hdp}}"
      package:
        name: "{{item}}"
      with_items:
        - openjdk8
        - bash
        - gtar

The first task installs OpenJDK from FreeBSD packages, bash for the hadoop user's shell, and gtar to extract the source tarball (the unarchive step later on) that was downloaded from the hadoop website. The datasets (or directories if ZFS cannot be used) are created in the next step:

    - name: "Create ZFS datasets for the {{hdp}} user"
      zfs:
        name: "{{hdp_zpool_name}}{{item}}"
        state: present
        extra_zfs_properties:
          mountpoint: "{{item}}"
          recordsize: "1M"
          compression: "lz4"
      with_items:
        - "{{hdp_home}}"
        - "/opt"
        - "{{hdp_tmp_dir}}"
        - "{{hdp_name_dir}}"
        - "{{hdp_data_dir}}"
        - "{{hdp_namesec_dir}}"
        - "{{hdp_mapred_dir}}"
        - "{{hdp_zoo_dir}}"

The datasets each use LZ4 compression and a record size of up to 1 megabyte. This is important to increase compression, as the hadoop distributed filesystem (HDFS) uses 128-MB blocks by default. The paths to the mount points are defined in the vars.yml file and will be used in the hadoop-specific config files again later on.
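The zfs module expects the pool referenced by hdp_zpool_name (hadooppool) to exist on every node already; creating it is not part of the playbook. A minimal sketch of doing that by hand, with purely hypothetical disk device names, could look like this:

# Create the pool the playbook expects on each cluster node
# (ada1 and ada2 are placeholder devices; pick your own layout).
zpool create hadooppool mirror ada1 ada2

# Verify the pool is online before running the playbook.
zpool status hadooppool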

- name: "Create the {{hdp}} User"user:name: "{{hdp}}"comment: "{{hdp}} User"home: "{{hdp localhome}}/{{hdp}}"shell: /usr/local/bin/bashcreatehome: yespassword: "{{vault hdp user pw}}"The hadoop processes should all be started and run under a separate user account, aptly named hadoop.This task will create that designated user in the system. The result looks like the following in the passworddatabase (the user ID might be a different one): grep hadoop /etc/passwdhadoop:*:31707:31707:hadoop User:/home/hadoop:/usr/local/bin/bashNext, the SSH keys need to be distributed for the hadoop user to be able to log into each cluster machinewithout requiring a password. Ansible‘s lookup functionality is used to read an SSH key that was generatedearlier on the machine running the playbook (it is recommended to generate this kind of separate key forhadoop using ssh-keygen). The SSH key must not have a passphrase, as the hadoop processes willperform the logins without any user interaction to enter it. The task will add the SSH public key to theauthorized keys file in /home/hadoop.- name: "Add SSH key for {{hdp}} User to authorized keys file"authorized key:user: "{{hdp}}"key: "{{ lookup('file', './{{hdp keyname}}.pub') }}"The public and private key must be placed in hadoop‘s home directory under .ssh. Since a variable hasbeen defined for the key, it is easy to provide the public (.pub extension) as well as the private key (no extension) without having to spell out its real name in this task. Additionally, the key is secured by setting a propermode and ownership so that no one else but hadoop has access to it.- name: "Copy public and private key to {{hdp}}'s .ssh directory"copy:src: "./{{item.name}}"dest: "{{hdp localhome}}/{{hdp}}/.ssh/{{item.type}}"owner: "{{hdp}}"group: "{{hdp}}"mode: 0600with items:- { type: "id {{hdp keytype}}", name: "{{hdp keyname}}" }- { type: "id {{hdp keytype}}.pub", name: "{{hdp keyname}}.pub" }The hadoop user is added to the AllowUsers line in /etc/ssh/sshd config to allow it access to eachmachine. The regular expression will make sure that any previous entries in the AllowUsers line are preserved and that the hadoop user is added to the end of the preexisting user list.- name: "Add {{hdp}} to AllowedUsers line in /etc/ssh/sshd config"replace:backup: nodest: /etc/ssh/sshd configregexp: ' (AllowUsers(?!.*\b{{ hdp }}\b).*) 'replace: '\1 {{ hdp }}'validate: 'sshd -T -f %s'July/August 201813

After the sshd_config change, SSH is restarted explicitly, as the playbook is going to make use of the hadoop SSH login soon. Note that an Ansible handler can't be used here, because it would be executed too late (at the end of the playbook when all tasks have been executed).

    - name: Restart SSH to make changes to AllowUsers take effect
      service:
        name: sshd
        state: restarted

The next task deals with collecting SSH key information from the nodes so that hadoop does not have to confirm the host key of the target system upon establishing the first connection. We need to be able to locally ssh into the master node itself, so we have to add 0.0.0.0, localhost, the IP address of each machine, and the master IP address (so that the client nodes know about it and don't require an additional task) to .ssh/known_hosts. That is what ssh-keyscan is doing in this task step. The variable {{workers}} will be provided on the command line later and contains all the machines that will act as datanodes to run map-reduce jobs. (Of course, these can also be placed in vars.yml when the number of machines is static and does not change.)

    - name: "Scan SSH Keys"
      shell: >
        ssh-keyscan 0.0.0.0 localhost
        "{{hostvars[inventory_hostname]['ansible_default_ipv4']['address']}}"
        "{{master}}" >> "{{hdp_home}}/.ssh/known_hosts"

    - name: "Scan worker SSH Keys one by one"
      shell: "ssh-keyscan {{item}} {{master}} >> {{hdp_home}}/.ssh/known_hosts"
      with_items: "{{workers}}"

To function properly, hadoop requires setting a number of environment variables. These include JAVA_HOME, HADOOP_HOME, and other variables that the hadoop user needs to make the hadoop cluster work with Java. The environment variables are stored in the .bashrc file that is deployed from the local Ansible control machine to the hadoop home directory on the remote systems.

The .bashrc file itself is provided as a template. This powerful functionality in Ansible makes it possible to store configuration files filled with Ansible variables (utilizing Jinja2 syntax). Deploying them is not just a simple copy operation: during transport to the remote machine, the variables are replaced with their actual values. In this case, {{hdp_destdir}} is replaced by /usr/local/hadoop2.9.0.

    - name: "Copy BashRC over to {{hdp_home}}/.bashrc"
      template:
        src: "./hadoop.bashrc.template"
        dest: "{{hdp_home}}/.bashrc"
        owner: "{{hdp}}"
        group: "{{hdp}}"

The template file itself needs to have the following additional content at the end:

export JAVA_HOME={{java_home}}
export HADOOP_HOME={{hdp_destdir}}
export HADOOP_INSTALL={{hdp_destdir}}
export PATH=$PATH:$HADOOP_INSTALL/bin
export HADOOP_PREFIX=/opt/{{hdp}}
export PATH=$PATH:$HADOOP_PREFIX/bin
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$JAVA_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_CLASSPATH=$JAVA_HOME/lib/tools.jar
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib/native"

Any users other than hadoop that want to run map-reduce jobs would also need these environment variables set, so consider copying that file to /etc/skel/.bashrc as well.
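With the key distributed and the host keys collected, it is worth confirming by hand that the hadoop user can reach every node non-interactively; a quick check using the hostnames from this tutorial:

# Run on the namenode: each command should print the remote hostname
# without any password or host-key prompt (BatchMode forbids prompts).
su - hadoop -c 'ssh -o BatchMode=yes datanode1 hostname'
su - hadoop -c 'ssh -o BatchMode=yes datanode2 hostname'
su - hadoop -c 'ssh -o BatchMode=yes localhost hostname'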

Deploying Hadoop Configuration Files

It is time to deploy the files that make up the hadoop distribution. It is basically a tarball that can be extracted to any directory, as it mostly contains JARs and config files. This way, hadoop can easily be copied around as a whole, since the directory contains everything needed to run hadoop. The files are available for download from the hadoop webpage (http://hadoop.apache.org/releases.html). There are a lot of supported versions that keep evolving rapidly, meaning that there will be new releases coming out at regular intervals. The bottom of that page lists how many bugs were fixed in each of the releases. Contrary to how it might sound, there is no need to keep up with the pace that the hadoop project sets, and an older release of hadoop can run for years if desired. A fairly recent release (2.9.0) was chosen for this article. Make sure to pick the binary distribution to download, as it takes additional time to build hadoop from sources. The file is called hadoop-2.9.0.tar.gz, and the name is constructed again in the playbook by using the definitions in the vars.yml file. Ansible's unarchive module takes care of extracting the tarball on the remote machine into {{hdp_destdir}}, which resolves to /usr/local/hadoop2.9.0. With the version included in the directory/dataset name, it is possible to install different versions of hadoop side by side for testing purposes.

    - name: "Unpack Hadoop {{hdp_ver}}"
      unarchive:
        src: "./{{hdp}}-{{hdp_ver}}.tar.gz"
        dest: "{{hdp_destdir}}"
        remote_src: yes
        owner: "{{hdp}}"
        group: "{{hdp}}"
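Because remote_src: yes is used, the task expects the tarball to already be present on the machine; how it gets there is left open. One possibility is fetching it beforehand, sketched here with fetch(1) against the Apache release archive (the URL is an assumption; verify the checksum against the values published on the release page):

# Download the Hadoop 2.9.0 binary distribution
# (archive.apache.org keeps old releases; current ones live on mirrors).
fetch https://archive.apache.org/dist/hadoop/common/hadoop-2.9.0/hadoop-2.9.0.tar.gz

# Compare the digest against the one listed on the release page.
sha256 hadoop-2.9.0.tar.gz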

Core Hadoop Configuration

The time has come to edit the fleet of configuration files that ship with hadoop. It can be overwhelming for beginners starting out with hadoop to understand which file needs to be changed. Unfortunately, the hadoop website does not do a good job of explaining what files need to be changed for a fully distributed hadoop cluster. In our experience, the documentation on the hadoop website is incomplete, and, even if followed to the letter, the result is not a functioning hadoop cluster. After a lot of trial and error, the author identified the important files needed to create a fully functional map-reduce cluster with the underlying HDFS on FreeBSD. At its core, four files form the site-specific configuration for this cluster, and they are named *-site.xml. They need to be changed and all of them reside in the configuration directory of the hadoop distribution. In this tutorial, that path is /usr/local/hadoop2.9.0/etc/hadoop/ and contains core-site.xml, yarn-site.xml, hdfs-site.xml, and mapred-site.xml.

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://{{master}}:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>{{hdp_tmp_dir}}</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>

core-site.xml Template

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>{{master}}</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>{{master}}:8025</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>{{master}}:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>{{master}}:8050</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>0.0.0.0:8088</value>
  </property>
</configuration>

yarn-site.xml Template

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file://{{hdp_name_dir}}</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file://{{hdp_data_dir}}</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.checkpoint.dir</name>
    <value>file://{{hdp_namesec_dir}}</value>
    <final>true</final>
  </property>
</configuration>

hdfs-site.xml Template

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapreduce.job.tracker</name>
    <value>{{inventory_hostname}}:8021</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>{{inventory_hostname}}:54311</value>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>{{hdp_mapred_dir}}</value>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>/mapredsystemdir</value>
  </property>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx200m</value>
  </property>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>{{inventory_hostname}}:10020</value>
  </property>
</configuration>

mapred-site.xml Template

The following task in the playbook takes care of putting them in the right place with properly replaced variables from the vars.yml definition file. (Particularly at this point, having a central Ansible variables file becomes invaluable, as typos and errors in these files cause a lot of headaches debugging an already complex distributed system like hadoop.)

    - name: "Templating *-site.xml files for the node"
      template:
        src: "./Hadoop275/freebsd/{{item}}.j2"
        dest: "{{hdp_destdir}}/etc/{{hdp}}/{{item}}"
        owner: "{{hdp}}"
        group: "{{hdp}}"
      with_items:
        - core-site.xml
        - hdfs-site.xml
        - yarn-site.xml
        - mapred-site.xml

A file called slaves (newer versions renamed it to workers) contains the names of hosts that should serve as datanodes. The machine defined as the master can also participate and work on map-reduce jobs, hence the localhost entry in the file. The task here adds the workers that are defined as parameters to the Ansible playbook to that file:

    - name: "Create and populate the slaves file"
      lineinfile:
        dest: "{{hdp_destdir}}/etc/{{hdp}}/slaves"
        owner: "{{hdp}}"
        group: "{{hdp}}"
        line: "{{item}}"
      with_items: "{{ workers }}"

Now that a bunch of file changes have been made to the installation, we need to be sure that the files are still owned by hadoop and not by the user running the Ansible script. This last task recursively sets ownership and group to the hadoop user on the files and directories that the playbook has touched so far.

    - name: "Give ownership to {{hdp}}"
      file:
        path: "{{item}}"
        owner: "{{hdp}}"
        group: "{{hdp}}"
        recurse: yes
      with_items:
        - "{{hdp_home}}"
        - "{{hdp_destdir}}"
        - "/{{hdp}}"

That's the complete playbook, called freebsd_hadoop2.9.0.yml, and it can be executed with the following command line:

% ./freebsd_hadoop2.9.0.yml -Kbe 'host=namenode:datanode1:datanode2 master=namenode' \
    -e '{"workers":["datanode1","datanode2"]}' --vault-id @prompt

The hosts namenode, datanode1, and datanode2 all need to be defined in Ansible's inventory file. The --vault-id @prompt parameter asks for the vault password that was defined when creating the vault.
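After a successful run, a quick spot check on the namenode confirms that the templated files landed where the later steps expect them (paths resolved from vars.yml):

# The slaves file should list both datanodes, and the *-site.xml files
# should be owned by the hadoop user.
cat /usr/local/hadoop2.9.0/etc/hadoop/slaves
ls -l /usr/local/hadoop2.9.0/etc/hadoop/*-site.xml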

Starting Hadoop and the First Map-reduce Job

After the playbook has run and there are no errors in the deployment, it is time to log into the namenode host and switch to the hadoop user (using the password that was set). A first test is to verify that this user can log into datanode1 and datanode2 without being prompted to confirm the host key or provide a password. If the login completes without any of these, then the hadoop services can be started. The first step is to format the distributed filesystem using the hdfs namenode command (the path to hadoop is in the .bashrc file, so the full path to the hdfs executable is omitted):

hadoop@namenode:~ % hdfs namenode -format

A couple of initialization messages scroll by, but there should be no errors at the end. Be careful when running this command a second time. Each time, a unique ID is generated to identify the HDFS from others. Unfortunately, the format is only done on the master node, not throughout the other cluster nodes. Hence, running it a second time will confuse the datanodes because they still retain the old ID. The solution is to wipe the directories defined in {{hdp_data_dir}} and {{hdp_tmp_dir}} of any previous content, both on the datanodes and the namenode.

Next, all the services that make up the hadoop system must be started in order. The following commands take care of that:

hadoop@namenode:~ % start-dfs.sh && start-yarn.sh && mr-jobhistory-daemon.sh start historyserver

To make sure all the processes have started successfully, run jps to verify that the following services have started on the namenode: NameNode, ResourceManager, JobHistoryServer, and SecondaryNameNode. The datanodes must have these processes in the jps output: NodeManager and DataNode. (Running jps during a map-reduce job shows more processes on the datanodes; they form the units of work the node is processing using the YARN framework.)

The cluster is ready to run its first map-reduce job. Hadoop provides sample jobs to get to know the framework without having to write a Java program first and pack it into a jar file to be executed as a job. One of these examples tries to calculate the value of pi using a Monte Carlo simulation. The following shell script can do that:

#!/bin/sh
export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar
hadoop jar ${HADOOP_HOME}/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.0.jar pi 16 1000000

Executing the shell script spawns mappers to calculate a subset of the Monte Carlo simulation. Depending on how many mappers are chosen (16 in this example), the accuracy of the result varies.

The job can be monitored using a browser that is pointed to the URL http://<the.namenode.ip.address>:8088. Browsing to http://<the.namenode.ip.address>:50070 displays the overall cluster status along with a filesystem browser for the HDFS and logs (manual refresh is required to get updated information on both pages).
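Besides the bundled example jobs, the HDFS side can be exercised directly with the hdfs client to confirm that the datanodes registered and that files can be written; a short smoke test, run as the hadoop user:

# Report capacity and the datanodes that registered with the namenode.
hdfs dfsadmin -report

# Create a home directory in HDFS, copy a small file in, and list it.
hdfs dfs -mkdir -p /user/hadoop
hdfs dfs -put /etc/rc.conf /user/hadoop/
hdfs dfs -ls /user/hadoop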

Another interesting sample is the random-text-writer, which creates a bunch of files in the HDFS across the nodes. A timestamp is used to make it possible to run this command multiple times in a row:

#!/bin/sh
export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar
timestamp=`date "+%Y%m%d%H%M"`
hadoop jar ${HADOOP_HOME}/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.0.jar randomtextwriter /random-text-${timestamp}

After both jobs have run without errors, the hadoop cluster is ready to accept more sophisticated map-reduce jobs. Writing those is left as a learning exercise for the reader; the internet is full of tutorials about how to write map-reduce jobs.

This tutorial closes with a view of the ZFS compression ratios achieved with the two jobs completed (results may vary). On the namenode:

hadoop@namenode:~ % zfs get refcompressratio hadooppool/<dataset>
NAME                  PROPERTY          VALUE   SOURCE
hadooppool/<dataset>  refcompressratio  26.28x  -

The datanodes also achieved quite a good compression ratio from the random-text-writer data:

NAME                  PROPERTY          VALUE   SOURCE
hadooppool/<dataset>  refcompressratio  2.35x   -

This shows that running hadoop on FreeBSD has benefits. OpenZFS is able to add additional protection to the data stored in HDFS and makes it possible to store more data on the underlying disks. In a big data world, this is an enormous win.

BENEDICT REUSCHLING joined the FreeBSD Project in 2009. After receiving his full documentation commit bit in 2010, he actively began mentoring other people to become FreeBSD committers. He is a proctor for the BSD Certification Group and joined the FreeBSD Foundation in 2015, where he is currently serving as vice president. Benedict has a Master of Science degree in Computer Science and is teaching a UNIX for software developers class at the University of Applied Sciences, Darmstadt, Germany.
