1.) Prepare to install Symphony DE on Linux or UNIX:
- If you have Linux, determine which installation file you need. Note that the MapReduce workload in Symphony DE is supported only on Linux 64-bit hosts.
- To find out the Linux version, enter:
uname -a
- To find out the glibc version, enter:
rpm -q glibc
- Determine the installation package and download.
For Linux, it is a .bin package (which contains .rpm files); for Solaris, it is a .tar.gz package.
- Check communication ports.
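For example, to verify that a port is not already in use (port 7869 is used here purely as an illustration; check the Symphony DE documentation for the actual list of required ports):
netstat -an | grep 7869
If the command returns no output, the port is free.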
2.) Install Platform Symphony DE on a UNIX host:
If you have root permissions, you can install using the default settings.
- Set the CLUSTERADMIN environment variable to your user account so that you can run Symphony DE without root permissions after installation. For example:
(bsh) export CLUSTERADMIN=user1
(tcsh) setenv CLUSTERADMIN user1
where user1 is your operating system user name.
- If you plan to use MapReduce (available only on Linux® 64-bit hosts), set JAVA_HOME to point to the directory where JDK 1.6 is installed. For example:
(bsh) export JAVA_HOME=/opt/java/j2sdk1.6.2
(tcsh) setenv JAVA_HOME /opt/java/j2sdk1.6.2
- If you plan to use the Hadoop Distributed File System (HDFS) with MapReduce:
- Set HADOOP_HOME to the installation directory where Hadoop is installed. For example:
(bsh) export HADOOP_HOME=/opt/hadoop-2.4.x
(tcsh) setenv HADOOP_HOME /opt/hadoop-2.4.x
- Set HADOOP_VERSION to one of the following values:
- The HDFS or Hadoop API version in your cluster:
- 2_4_x = version 2.4.x
- 1_1_1 = version 1.1.1
- The Cloudera version in your cluster:
- cdh5.0.2 = CDH 5.0.2
For example:
(bsh) export HADOOP_VERSION=2_4_x
(tcsh) setenv HADOOP_VERSION 2_4_x
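After setting these variables, you can sanity-check them against the Hadoop installation (a minimal check, assuming a bash shell; hadoop version is a standard Hadoop command):
echo $HADOOP_HOME $HADOOP_VERSION
$HADOOP_HOME/bin/hadoop version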
- Install the package:
- To install with default settings, run:
./symphony_DE_package_name.bin
Platform Symphony DE will be installed to the default directory of /opt/ibm/platformsymphonyde/de71.
- To install with custom settings, run:
rpm -ivh symphony_DE_package_name.rpm --prefix install_dir --dbpath dbpath_dir
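For example, with a hypothetical package file name and installation directories (substitute your actual package name and paths):
# hypothetical file name and paths; replace with your actual values
rpm -ivh symphonyde-de71-7.1.0.x86_64.rpm --prefix /usr/local/symphonyde --dbpath /usr/local/symphonyde/rpmdb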
3.) Install Hadoop:
STEP 1: Download the Hadoop .tar.gz package from the Apache Hadoop download website.
STEP 2: Select a directory in which to install Hadoop and extract the tarball there. For example, to install the 2.4.x distribution under /opt, enter:
cd /opt
tar xzvf hadoop-2.4.x.tar.gz
ln -s hadoop-2.4.x hadoop
STEP 3: Select a host to be the NameNode (for example, db03b10).
STEP 4: As user hadoop, configure the environment:
export JAVA_HOME=/usr/java/latest
export HADOOP_HOME=/opt/hadoop-2.4.x
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH:$HADOOP_HOME/bin
export HADOOP_VERSION=2_4_x
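These exports apply only to the current shell session. To persist them for the hadoop user, you can append them to the user's shell profile (a sketch assuming bash is the login shell):
cat >> ~/.bashrc <<'EOF'
export JAVA_HOME=/usr/java/latest
export HADOOP_HOME=/opt/hadoop-2.4.x
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH:$HADOOP_HOME/bin
export HADOOP_VERSION=2_4_x
EOF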
STEP 5: Specify the Java version used by Hadoop in the ${HADOOP_HOME}/etc/hadoop/hadoop-env.sh script file (${HADOOP_HOME}/conf/hadoop-env.sh in 1.x distributions). For example, change:
# The Java implementation to use. Required.
export JAVA_HOME=/usr/java/latest
STEP 6: Configure the directory where Hadoop will store its data files in the ${HADOOP_HOME}/etc/hadoop/core-site.xml file (conf/core-site.xml in 1.x). For example:
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop/data</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <!-- NameNode host -->
    <value>hdfs://db03b10:9000/</value>
  </property>
</configuration>
STEP 7: Ensure that the base directory you specified for temporary storage (hadoop.tmp.dir) exists and has the required ownership and permissions. For example:
mkdir -p /opt/hadoop/data
chown hadoop:hadoop /opt/hadoop/data
chmod 755 /opt/hadoop/data
STEP 8: Set values for the following properties in the ${HADOOP_HOME}/etc/hadoop/mapred-site.xml file (conf/mapred-site.xml in 1.x). Set the host and port where the MapReduce job tracker runs. For example:
<property>
  <name>mapred.job.tracker</name>
  <value>db03b10:9001</value>
</property>
STEP 9: Set the maximum number of map tasks that run simultaneously per TaskTracker. For example:
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>7</value>
</property>
STEP 10: Set the default block replication in the ${HADOOP_HOME}/etc/hadoop/hdfs-site.xml file (conf/hdfs-site.xml in 1.x). For example:
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
STEP 11: If you plan to use the high-availability HDFS feature (not available in the Developer Edition), configure the shared file system directory to be used for HDFS NameNode metadata storage in the hdfs-site.xml file. For example:
<property>
  <name>dfs.name.dir</name>
  <value>/share/hdfs/name</value>
</property>
STEP 12: Repeat steps 1 to 5 on every compute host.
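Instead of repeating the steps by hand, you can copy the extracted Hadoop tree to each compute host (a sketch assuming passwordless SSH for the hadoop user and the example compute hosts shown in the next step):
for host in db03b11 db03b12; do
    scp -r /opt/hadoop-2.4.x ${host}:/opt/
done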
STEP 13: Specify master and compute hosts. For example, in the 2.4.x distribution, the compute file is under the ${HADOOP_HOME}/etc/hadoop directory.
$ vim compute
$ cat compute
db03b11
db03b12
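With all hosts configured, a typical way to bring up and verify HDFS is shown below (a sketch assuming the 2.4.x layout; formatting erases any existing NameNode metadata, so run it only on first-time setup):
$HADOOP_HOME/bin/hdfs namenode -format
$HADOOP_HOME/sbin/start-dfs.sh
$HADOOP_HOME/bin/hdfs dfsadmin -report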