Apache Spark 3 Installation on Ubuntu 20.04
How to install Apache Spark 3 on Ubuntu 20.04
What is Apache Spark:
- Apache Spark is a fast, general-purpose, unified analytics engine for large-scale data processing.
- It is written in Scala and also provides high-level APIs for Java, Python, and R.
Installation
- Step 1: Check that Java is installed (Spark runs on the JVM)
java -version
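If Java might not be installed yet, the check can be wrapped so that it fails with a readable message. This is a minimal sketch: check_java is a hypothetical helper name, and the openjdk-11-jdk package in the hint is an assumption about which JDK you want (Spark 3.0 runs on Java 8 or 11).

```shell
# check_java: hypothetical helper that reports whether a Java runtime is on the PATH.
check_java() {
  if command -v java >/dev/null 2>&1; then
    # java -version writes to stderr, so redirect it before taking the first line
    java -version 2>&1 | head -n 1
  else
    echo "java not found"
    return 1
  fi
}

# Assumption: OpenJDK 11 from the Ubuntu repositories is an acceptable JDK.
check_java || echo "Install a JDK first, e.g.: sudo apt install openjdk-11-jdk"
```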
- Step 2: Open a browser and go to the Spark download page
- Step 3: Download Apache Spark 3.
Click the download link for a Spark 3 release on that page.
Copy the resulting link and pass it to wget:
wget https://downloads.apache.org/spark/spark-3.0.1/spark-3.0.1-bin-hadoop2.7.tgz
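The link follows a regular pattern, so it can be assembled from the version and Hadoop profile. The sketch below rebuilds the URL used above; the variable names are illustrative, and the wget line is left commented out so nothing is downloaded by accident.

```shell
# Illustrative variable names; change the values to pick another release.
SPARK_VERSION="3.0.1"
HADOOP_PROFILE="hadoop2.7"
SPARK_PKG="spark-${SPARK_VERSION}-bin-${HADOOP_PROFILE}"
SPARK_URL="https://downloads.apache.org/spark/spark-${SPARK_VERSION}/${SPARK_PKG}.tgz"

echo "$SPARK_URL"
# wget "$SPARK_URL"   # uncomment to download
```

If the server answers with 404, the release may have been moved off the main download host; older releases are typically kept under the same path on archive.apache.org/dist/spark/.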
- Step 4: Untar the downloaded file
tar xvf spark-3.0.1-bin-hadoop2.7.tgz
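To unpack somewhere other than the current directory, tar accepts a target directory via -C. A small sketch, where extract_spark is a hypothetical helper and the destination in the example comment is an assumption:

```shell
# extract_spark ARCHIVE DEST: hypothetical helper that unpacks a tarball into DEST.
extract_spark() {
  archive="$1"
  dest="$2"
  if [ ! -f "$archive" ]; then
    echo "archive not found: $archive" >&2
    return 1
  fi
  mkdir -p "$dest"
  # x = extract, z = gunzip, f = read the named file, -C = switch to DEST first
  tar -xzf "$archive" -C "$dest"
}

# Example (assumes you want Spark under your home directory):
# extract_spark spark-3.0.1-bin-hadoop2.7.tgz "$HOME"
```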
Go to the bin folder inside the extracted directory
cd spark-3.0.1-bin-hadoop2.7/bin
Check the Spark version. From the bin folder, run ./spark-shell; its startup banner prints the version.
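spark-shell opens an interactive session, so for a quick check it can be handy to put Spark on the PATH and ask for the version directly. The sketch below assumes the archive was extracted into your home directory; adjust SPARK_HOME if it lives elsewhere.

```shell
# Assumption: the archive was extracted into $HOME.
export SPARK_HOME="$HOME/spark-3.0.1-bin-hadoop2.7"
export PATH="$SPARK_HOME/bin:$PATH"

# Print the version banner without opening the interactive shell.
if command -v spark-submit >/dev/null 2>&1; then
  spark-submit --version
else
  echo "spark-submit not found under $SPARK_HOME/bin"
fi
```

Adding the two export lines to ~/.bashrc keeps the setting across terminal sessions.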