# Spark on Azure HDInsight
An example Pulumi component that deploys a Spark cluster on Azure HDInsight.
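To give a feel for the component's shape, the sketch below builds an HDInsight-style cluster spec as a plain dictionary: a gateway login plus head, worker, and ZooKeeper node roles. Every name here (the function, the admin/ssh usernames, the VM sizes) is an illustrative assumption, not the actual component's API; the real resource definitions live in this repo's Pulumi program.

```python
# Illustrative sketch only -- names and values are assumptions,
# not the component's actual API.

def make_spark_cluster_args(resource_group: str, storage_account: str,
                            password: str) -> dict:
    """Build an argument dict in the general shape HDInsight expects:
    a gateway login plus head/worker/zookeeper node roles."""
    return {
        "resource_group_name": resource_group,
        "cluster_version": "3.6",
        "tier": "Standard",
        "gateway": {"username": "admin", "password": password},
        "roles": {
            "head_node":      {"vm_size": "Standard_D3_v2", "username": "sshuser"},
            "worker_node":    {"vm_size": "Standard_D3_v2", "username": "sshuser",
                               "target_instance_count": 3},
            "zookeeper_node": {"vm_size": "Standard_D3_v2", "username": "sshuser"},
        },
        "storage_account": storage_account,
    }

args = make_spark_cluster_args("spark-rg", "sparkstorage", "s3cret-Passw0rd")
print(sorted(args["roles"]))  # ['head_node', 'worker_node', 'zookeeper_node']
```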
## Running the App
Create a new stack:
```bash
$ pulumi stack init dev
```
Log in to the Azure CLI (you will be prompted to do this during deployment if you skip this step):

```bash
$ az login
```
Create a Python virtualenv, activate it, and install dependencies:
```bash
$ python3 -m venv venv
$ source venv/bin/activate
$ pip3 install -r requirements.txt
```

This installs the dependent packages needed for our Pulumi program.
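For reference, a `requirements.txt` for a program like this typically pins the Pulumi SDK and an Azure provider package. The exact package names and version ranges below are assumptions; check the file in this repo for the real contents:

```
pulumi>=3.0.0,<4.0.0
pulumi-azure>=4.0.0,<5.0.0
```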
Run `pulumi up` to preview and deploy changes:

```bash
$ pulumi up
Previewing changes:
...

Performing changes:
...
info: 5 changes performed:
    + 5 resources created
Update duration: 15m6s
```
Check the deployed Spark endpoint:
```bash
$ pulumi stack output endpoint
https://myspark1234abcd.azurehdinsight.net/

# For instance, Jupyter notebooks are available at:
# https://myspark1234abcd.azurehdinsight.net/jupyter/

# Follow https://docs.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-load-data-run-query to test it out
```
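If you want to use the endpoint from a script, a small helper like the one below (hypothetical, not part of the example) derives the Jupyter URL from the endpoint and shows one way to read the stack output via the Pulumi CLI:

```python
import subprocess

def jupyter_url(endpoint: str) -> str:
    """Append the Jupyter path to the cluster's gateway endpoint."""
    return endpoint.rstrip("/") + "/jupyter/"

def stack_endpoint() -> str:
    # Reads the `endpoint` stack output via the Pulumi CLI
    # (requires a deployed stack and an authenticated session).
    out = subprocess.run(["pulumi", "stack", "output", "endpoint"],
                         capture_output=True, text=True, check=True)
    return out.stdout.strip()

print(jupyter_url("https://myspark1234abcd.azurehdinsight.net/"))
# https://myspark1234abcd.azurehdinsight.net/jupyter/
```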