Posts Tagged guest-post

Managing your MySQL databases with Pulumi

Tuesday, May 28, 2019

One of the most critical components of an application’s infrastructure is its database, and one of the most popular databases in use in the cloud today is MySQL.

Pulumi can already be used to create managed MySQL instances in a wide variety of clouds, including AWS, Azure and GCP. In addition to this, Pulumi recently added support for managing the MySQL instances themselves to manage permissions, create databases, and other common tasks.

In this post, we’ll walk through a quick tutorial of how to use this new Pulumi MySQL provider to manage existing and new MySQL databases.

Data science on demand: spinning up a Wallaroo cluster

Friday, Nov 2, 2018

This guest post is from Simon Zelazny of Wallaroo Labs. Find out how Wallaroo powered their cluster provisioning with Pulumi, for data science on demand.

Last month, we took a long-running pandas classifier and made it run faster by leveraging Wallaroo’s parallelization capabilities. This time around, we’d like to kick it up a notch and see if we can keep scaling out to meet higher demand. We’d also like to be as economical as possible: provision infrastructure as needed and de-provision it when we’re done processing.

If you don’t feel like reading the post linked above, here’s a short summary of the situation: there’s a batch job that you’re running every hour, on the hour. This job receives a CSV file and classifies each row of the file, using a Pandas-based algorithm. The run-time of the job is starting to near the one-hour mark, and there’s concern that the pipeline will break down once the input data grows past a particular point.

In the blog post, we show how to split up the input data into smaller dataframes, and distribute them among workers in an ad-hoc Wallaroo cluster, running on one physical machine. Parallelizing the work in this manner buys us a lot of time, and the batch job can continue processing increasing amounts of data.