Akshay Terode

A tech boffin who loves simplifying things currently doing more in big data, focused in Python and Spark.

Apache PIG Installation In Ubuntu

29 Dec 2016 » bigdata, setup

This Pig tutorial briefs how to install and configure Apache Pig. Apache Pig is an abstraction over MapReduce. Pig is basically a tool to easily perform analysis of larger sets of data by representing them as data flows.

So, let’s start Pig Installation on Ubuntu.

Step 1 :- Pre-Requisite to Apache Pig

  • JAVA : Install java,and export path variables in .bashrc.
  • Hadoop : Install hadoop in pseudo-distribuited mode with YARN as resource manager.

Step 2 :- Download PIG

  • You can download latest version of Pig from here Download.
  • Click Download a release now.

Step 3 :- Extract Apache PIG

tar xvf pig-[version].tar.gz

Step 4 :- Move to your desired location. Preferably Hadoop Home

cd pig-[version]
sudo mv pig-[version]/* /$HADOOP_HOME

Step 5 :- Adding envrionment variables.

export PATH=$PATH:/home/noobwithskills/HADOOP_HOME/pig-[version]]/bin
export PIG_HOME=/home/noobwithskills/HADOOP_HOME/pig-[version]]/
export PIG_CLASSPATH=$HADOOP_HOME/conf

Step 6 :- Update bashrc.

source ~/.bashrc

Step 7 :- Check for pig-installation

pig -version

NOTE : If you don’t see a pig-version, try troubleshooting or comment down below .

Step 8 :- Starting Apache Pig

  • In Local Mode : Pig can acces your local filesystem files only.
    pig -x local
    
  • In Cluster Mode : Pig can acces HDFS files.
    pig
    
  • If all goes well you should see grunt > on your CLI.

So, finally, we have seen how to install Apache Pig on Ubuntu. We will learn Pig Programming in our further Pig tutorials. Feel free to ask your queries in the comment section.