Installing Apache Spark on Ubuntu 12.04
A few easy steps
Installing Apache Spark involves only a few simple steps:
- Install Java
- Install Hadoop
- Install Scala
- Install Spark
Install Java on Ubuntu
Java can be installed from the webupd8team PPA:
- sudo add-apt-repository ppa:webupd8team/java
- sudo apt-get update
- sudo apt-get install oracle-java7-installer
After installation, you can test whether it works by typing java -version at the command prompt. This should print the installed Java version.
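The check above can also be wrapped in a small helper. This is only a sketch: the function name check_java is made up for illustration, and it reports nothing beyond the first line of what java -version prints.

```shell
# Print the first line of the Java version banner, or a hint if java is missing.
# check_java is a hypothetical helper name, not part of any package.
check_java() {
    if command -v java >/dev/null 2>&1; then
        # java -version writes to stderr, so merge it into stdout first
        java -version 2>&1 | head -n 1
    else
        echo "java not found on PATH"
        return 1
    fi
}
```

Calling check_java in a terminal after the installation should show a version line rather than the "not found" message.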
Install Hadoop on Ubuntu
Hadoop can be installed simply by downloading a .deb file:
- Go to http://www.apache.org/dyn/closer.cgi/hadoop/common/ and choose a mirror
- Choose the Hadoop version you prefer (I chose hadoop-1.2.1)
- Download the .deb file (I chose hadoop_1.2.1-1_x86_64.deb)
- When you open the file, the Ubuntu Software Center starts and installs it.
After installing Hadoop, open /etc/hadoop/hadoop-env.sh and change the line:
export JAVA_HOME=/usr/lib/jvm/java-6-sun
into
#export JAVA_HOME=/usr/lib/jvm/java-6-sun
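If you prefer not to edit the file by hand, the same change can be made with sed. This is a minimal sketch: the function name comment_out_java_home is made up for illustration, and it assumes the line starts with "export JAVA_HOME=" exactly as shown above.

```shell
# Comment out an active "export JAVA_HOME=" line in the given file.
# comment_out_java_home is a hypothetical helper name.
# sed keeps a backup of the original file next to it with a .bak suffix.
comment_out_java_home() {
    env_file="$1"
    sed -i.bak 's/^export JAVA_HOME=/#export JAVA_HOME=/' "$env_file"
}

# The file under /etc needs root, so on a real system run the sed line directly:
#   sudo sed -i.bak 's/^export JAVA_HOME=/#export JAVA_HOME=/' /etc/hadoop/hadoop-env.sh
```

Lines that are already commented out are left alone, because the pattern only matches at the start of a line.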
Install Scala on Ubuntu
Follow these steps:
- Download Scala from http://scala-lang.org/ and save it somewhere you can find it (e.g. ~/)
- at the command prompt, type:
- cd /usr/share
- sudo tar -zxf <location and name of the tgz file> (e.g. sudo tar -zxf ~/scala-2.10.3.tgz)
- link (ln -s) the executables to the /usr/bin location, e.g.:
- sudo ln -s /usr/share/scala-2.10.3/bin/scala /usr/bin/scala
- sudo ln -s /usr/share/scala-2.10.3/bin/scalac /usr/bin/scalac
- sudo ln -s /usr/share/scala-2.10.3/bin/fsc /usr/bin/fsc
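The three ln commands above follow the same pattern, so they can be written as a loop. This is a sketch under assumptions: the function name link_scala_tools is made up for illustration, and the two directories are passed in as arguments so that the version number is not hard-coded.

```shell
# Link scala, scalac and fsc from a Scala install directory into a bin directory.
# link_scala_tools is a hypothetical helper name.
link_scala_tools() {
    scala_home="$1"   # e.g. /usr/share/scala-2.10.3
    bin_dir="$2"      # e.g. /usr/bin
    for tool in scala scalac fsc; do
        ln -s "$scala_home/bin/$tool" "$bin_dir/$tool"
    done
}

# Writing to /usr/bin needs root, so run this from a root shell, e.g.:
#   link_scala_tools /usr/share/scala-2.10.3 /usr/bin
```

Afterwards, typing scala -version at the command prompt is a quick way to check that the links work.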
Install Spark on Ubuntu
Getting Spark up and running is easy, as described on http://spark.incubator.apache.org/docs/latest/:
- Go to http://spark.incubator.apache.org/downloads.html and download Spark.
- Unpack it at a preferred location
- Go to your Spark home directory in a terminal and type: sbt/sbt assembly
log4j.rootLogger = DEBUG, A1
log4j.appender.A1=org.apache.log4j.RollingFileAppender
log4j.appender.A1.File=SparkLog.log
log4j.appender.A1.MaxFileSize = 100KB
log4j.appender.A1.layout=org.apache.log4j.PatternLayout
log4j.appender.A1.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
# Ignore messages below warning level from Jetty, because it's a bit verbose
log4j.logger.org.eclipse.jetty=WARN
Now you are ready to make some Sparks!!!
There seems to be a dependency on Git. First run "sudo apt-get install git-core"