Spark on Azure HDInsight

View Code Deploy

An example Pulumi component that deploys a Spark cluster on Azure HDInsight.

Running the App

  1. Create a new stack:

    $ pulumi stack init dev
  2. Login to Azure CLI (you will be prompted to do this during deployment if you forget this step):

    $ az login
  3. Create a Python virtualenv, activate it, and install dependencies:

    This installs the dependent packages needed for our Pulumi program.

    $ python3 -m venv venv
    $ source venv/bin/activate
    $ pip3 install -r requirements.txt
  4. Run pulumi up to preview and deploy changes:

    $ pulumi up
    Previewing changes:
    ...
    
    Performing changes:
    ...
    info: 5 changes performed:
        + 5 resources created
    Update duration: 15m6s
  5. Check the deployed Spark endpoint:

    $ pulumi stack output endpoint
    https://myspark1234abcd.azurehdinsight.net/
    
    # For instance, Jupyter notebooks are available at https://myspark1234abcd.azurehdinsight.net/jupyter/
    # Follow https://docs.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-load-data-run-query to test it out