Posts Tagged jupyter

Data Science in the Cloud

Data science has advanced because tools like Jupyter Notebook hide complexity, letting practitioners run high-level code aimed at the specific problem they are trying to solve. Raising the level of abstraction makes a data scientist more productive by reducing the effort of trying multiple approaches to near zero, which encourages experimentation and better results.

Data scientists typically work locally, but they often store data for analyses and models in the cloud. There are clear advantages to using cloud resources for these tasks:

  • Data scientists generally don’t want to manage their storage and databases.
  • They need to be able to store large data sets cheaply.
  • They need large swings in capacity available on demand.

SDKs like boto3, AWS’s Python library, can create resources, but they still require domain expertise to manage and to architect a solution properly. The Pulumi Automation API improves on raw SDKs by providing high-level abstractions for creating and managing cloud services, letting data scientists concentrate on analyses and models without being well-versed in cloud APIs.
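As a rough illustration of what "infrastructure from application code" looks like (not code from the post itself), here is a minimal Automation API sketch in TypeScript, the language the Kubernetes post below uses; the project name, stack name, and bucket are illustrative assumptions, and the same API is also available in Python for a boto3-style workflow.

```typescript
import { LocalWorkspace } from "@pulumi/pulumi/automation";
import * as aws from "@pulumi/aws";

// Hypothetical helper: provision an S3 bucket for analysis data without
// leaving application code. All names here are illustrative assumptions.
async function ensureDataBucket(): Promise<string> {
    const stack = await LocalWorkspace.createOrSelectStack({
        projectName: "data-science-storage",
        stackName: "dev",
        // The Pulumi program is an inline function: declare the resources
        // you want and return the values you need back.
        program: async () => {
            const bucket = new aws.s3.Bucket("analysis-data");
            return { bucketName: bucket.bucket };
        },
    });
    await stack.setConfig("aws:region", { value: "us-west-2" });
    const result = await stack.up({ onOutput: console.info });
    return result.outputs.bucketName.value as string;
}

ensureDataBucket().then(name => console.log(`bucket ready: ${name}`));
```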

Read more →

Create Secure Jupyter Notebooks on Kubernetes using Pulumi

In this post, we will work through an example that shows how to use Pulumi to create Jupyter Notebooks on Kubernetes. Having worked on Kubernetes since 2015, I find that a couple of critical benefits jump out, and they may resonate with you as well:

  • You write everything in code (TypeScript in our example here); see the sketch after this list.
  • You don't need to initialize Tiller or Helm to work with existing Helm charts, such as the nginx-ingress-controller chart we use here.
  • The security patterns of Helm and Tiller are no longer concerns; instead, you focus on the RBAC of the actual service, the Jupyter notebook in this example.
  • You accomplish more with less YAML and iteratively work towards your use cases.
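To give a sense of the shape of such a program, here is a minimal TypeScript sketch of a Jupyter notebook Deployment plus an existing Helm chart deployed without Tiller; the image, labels, chart name, and repository URL are assumptions for illustration, not the post's exact code.

```typescript
import * as k8s from "@pulumi/kubernetes";

// Illustrative sketch only; image, names, and chart settings are assumptions.
const appLabels = { app: "jupyter-notebook" };

// The notebook itself: a plain Kubernetes Deployment declared in TypeScript.
const notebook = new k8s.apps.v1.Deployment("jupyter-notebook", {
    spec: {
        selector: { matchLabels: appLabels },
        replicas: 1,
        template: {
            metadata: { labels: appLabels },
            spec: {
                containers: [{
                    name: "notebook",
                    image: "jupyter/minimal-notebook",   // assumed image
                    ports: [{ containerPort: 8888 }],
                }],
            },
        },
    },
});

// Expose the notebook inside the cluster.
const notebookSvc = new k8s.core.v1.Service("jupyter-notebook", {
    metadata: { labels: appLabels },
    spec: {
        selector: appLabels,
        ports: [{ port: 80, targetPort: 8888 }],
    },
});

// Deploy an existing Helm chart without Tiller: Pulumi renders the chart's
// resources and applies them directly. (Chart name and repo are assumptions.)
const ingress = new k8s.helm.v3.Chart("nginx-ingress", {
    chart: "ingress-nginx",
    fetchOpts: { repo: "https://kubernetes.github.io/ingress-nginx" },
});

export const notebookService = notebookSvc.metadata.name;
```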

Read more →