1. Hosting Jupyter Notebooks for Data Science on Droplets


    To host Jupyter Notebooks for data science on DigitalOcean using Pulumi, we will create a virtual server, which DigitalOcean calls a "Droplet". We will use the Pulumi DigitalOcean provider to define a Droplet that installs and starts a Jupyter Notebook server on first boot. The work breaks down into three steps:

    1. Create a DigitalOcean Droplet.
    2. Configure the Droplet with the necessary software (i.e., Jupyter).
    3. Make the Jupyter Notebook reachable over the network by binding the server to all interfaces and exporting the Droplet's public IP address.

    For the Droplet itself, we need either an image that ships with Jupyter installed or a plain image that we can set up with a startup script (the userData property, spelled user_data in the Python SDK). Here we use a simple startup script on an Ubuntu image that installs Jupyter Notebook on first boot.
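
    As a rough illustration of the startup-script approach, the script can be assembled in plain Python before it is handed to the Droplet. The helper name build_user_data and its parameters are hypothetical (they are not part of Pulumi or DigitalOcean). The sketch assumes the script runs as root on first boot and mirrors the minimal, unauthenticated setup described in this section.

```python
def build_user_data(port: int = 8888, ip: str = "0.0.0.0") -> str:
    """Return a bash startup script that installs and launches Jupyter Notebook.

    Hypothetical helper for illustration only; the returned string is meant
    to be passed as the Droplet's user data.
    """
    lines = [
        "#!/bin/bash",
        "apt-get update",
        "apt-get install -y python3-pip python3-dev",
        "pip3 install jupyter",
        # Assumption: the script runs as root, and Jupyter refuses to start
        # as root unless --allow-root is given.
        # Token and password checks are disabled here, as in this minimal setup.
        f"jupyter notebook --ip={ip} --port={port} --no-browser --allow-root "
        "--NotebookApp.token='' --NotebookApp.password=''",
    ]
    return "\n".join(lines) + "\n"


script = build_user_data()
print(script)
```

    Building the script from a function makes it easy to change the port or bind address in one place, for example build_user_data(port=9000).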

    Below is the detailed Pulumi program that will set up the Droplet:

    import pulumi
    import pulumi_digitalocean as digitalocean

    # The size slug represents the machine type. You can select different types
    # based on your budget or performance requirements, for example,
    # "s-1vcpu-1gb" for a basic Droplet.
    # The image is the base image for our Droplet; this could be changed to an
    # image with pre-installed Jupyter for more complex setups.
    droplet = digitalocean.Droplet(
        "jupyter-droplet",
        # Name to refer to the droplet.
        name="data-science-droplet",
        # Region where the droplet will be created.
        region="nyc3",
        # The size of the droplet (machine type).
        size="s-1vcpu-1gb",
        # The image to be used for the droplet. Here we are using Ubuntu 20.04.
        image="ubuntu-20-04-x64",
        # User data is executed when the droplet is initialized.
        # This script installs python3, pip and jupyter notebook.
        # Please note, this is a minimal setup. You might want to secure your
        # Jupyter server, configure HTTPS, etc., according to your requirements.
        user_data="""#!/bin/bash
    apt-get update
    apt-get install -y python3-pip python3-dev
    pip3 install jupyter
    # Start up jupyter notebook on all IPs, allowing access through the Droplet's IP.
    # The startup script runs as root, so --allow-root is required.
    jupyter notebook --ip=0.0.0.0 --no-browser --allow-root --NotebookApp.token='' --NotebookApp.password=''
    """,
    )

    # Export the IP address of the Droplet so we can access the Jupyter Notebook server remotely.
    pulumi.export("droplet_ip", droplet.ipv4_address)

    When you run this Pulumi program it will:

    • Provision a new Droplet based on Ubuntu 20.04 LTS.
    • Run an initialization script that will install Python, pip, and Jupyter Notebook.
    • Start the Jupyter Notebook server bound to all network interfaces on the Droplet so it can be accessed from your local browser.

    After the Pulumi program has finished running, you will have the Droplet's IP address as an output. This address can be used in your browser to access the Jupyter Notebook (e.g., http://<droplet-ip>:8888). By default, Jupyter Notebook runs on port 8888.
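
    To make this last step concrete, here is a small sketch that builds the notebook URL from the exported IP address and polls until the server answers. The function names notebook_url and wait_for_notebook are hypothetical, and 203.0.113.10 is a documentation-only placeholder address, not a real Droplet. Only the Python standard library is used.

```python
import time
import urllib.error
import urllib.request


def notebook_url(ip: str, port: int = 8888) -> str:
    """Build the HTTP URL of the notebook server for a given Droplet IP."""
    return f"http://{ip}:{port}"


def wait_for_notebook(url: str, attempts: int = 30, delay: float = 10.0) -> bool:
    """Poll the URL until any HTTP response arrives or attempts run out."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status code.
            return True
        except (urllib.error.URLError, OSError):
            # Not reachable yet (still installing, or port closed); retry.
            time.sleep(delay)
    return False


url = notebook_url("203.0.113.10")  # placeholder IP, replace with the real output
print(url)
```

    The real address can be read with pulumi stack output droplet_ip. Polling is useful because the startup script has to finish installing packages before the server starts listening.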

    Keep in mind the following:

    • The Droplet setup in this code is very basic and not secure for production use. The startup script disables Jupyter's token and password checks, so anyone who can reach port 8888 can run code on the Droplet. For real-world use, restrict access with firewall rules, protect the notebook with a password and SSL encryption, and consider reaching it through an SSH tunnel.
    • This example doesn't set up any persistent storage. If the Droplet is destroyed, data stored on it will be lost. Consider using volumes for persistent storage or regularly pushing your notebooks to a remote repository.
    • This walkthrough does not cover how to run Pulumi commands. You need Pulumi installed and DigitalOcean access configured (an API token) for the program to work.
    • For more details on how to use the resources mentioned in this program, you can refer to the DigitalOcean Droplet documentation.
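
    The firewall advice above could look roughly like the following sketch. It keeps the rules as plain data and only imports pulumi_digitalocean inside the function that creates the resource. The helper names, the resource name "jupyter-firewall", and the example CIDR 198.51.100.7/32 are assumptions for illustration. Firewall, FirewallInboundRuleArgs, and FirewallOutboundRuleArgs come from the Pulumi DigitalOcean provider, but check the argument names against the provider version you have installed.

```python
def jupyter_inbound_rules(admin_cidr: str, jupyter_port: int = 8888) -> list:
    """Inbound rules allowing SSH and Jupyter only from one address range."""
    return [
        {"protocol": "tcp", "port_range": "22", "source_addresses": [admin_cidr]},
        {
            "protocol": "tcp",
            "port_range": str(jupyter_port),
            "source_addresses": [admin_cidr],
        },
    ]


def create_jupyter_firewall(droplet, admin_cidr: str):
    """Attach a firewall to the Droplet (call this inside a Pulumi program)."""
    # Imported lazily so the rule data above can be inspected without Pulumi.
    import pulumi_digitalocean as digitalocean

    everywhere = ["0.0.0.0/0", "::/0"]
    return digitalocean.Firewall(
        "jupyter-firewall",  # hypothetical resource name
        droplet_ids=[droplet.id],
        inbound_rules=[
            digitalocean.FirewallInboundRuleArgs(**rule)
            for rule in jupyter_inbound_rules(admin_cidr)
        ],
        # Outbound traffic stays open so apt-get and pip can download packages.
        outbound_rules=[
            digitalocean.FirewallOutboundRuleArgs(
                protocol=proto, port_range="1-65535", destination_addresses=everywhere
            )
            for proto in ("tcp", "udp")
        ],
    )


rules = jupyter_inbound_rules("198.51.100.7/32")  # example CIDR, an assumption
print(rules)
```

    In the Droplet program, this would be called as create_jupyter_firewall(droplet, "<your-ip>/32") after the Droplet is defined. A firewall limits who can connect, but it does not replace enabling authentication on the notebook server itself.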