With ZenML 0.6.3, you can now run your ZenML steps on Sagemaker and AzureML! It’s common for certain steps, such as model training, to require specific hardware, and this latest release gives you the power to switch out hardware for individual steps to support this.
We added a new Tensorboard visualization that you can use with our Kubeflow Pipelines integration. We handle the background processes needed to spin up this interactive web interface, which lets you visualize your model’s performance over time.
Behind the scenes we gave our integration testing suite a massive upgrade, fixed a number of smaller bugs and made documentation updates. For a detailed look at what’s changed, give our full release notes a glance.
As your pipelines become more mature and complex, you might want to use specialized hardware for certain steps of your pipeline. A clear example is wanting to run your training step on GPU machines that get spun up automagically without you having to worry too much about that deployment. Amazon’s Sagemaker and Microsoft’s AzureML both offer custom hardware on which you can run your steps.
The code required to add this to your pipeline and step definition is as minimal as can be. Simply add the following line above the step that you’d like to run on your cloud hardware:
@step(custom_step_operator='sagemaker') # or azureml
Sagemaker and AzureML offer specialized compute instances to run your training jobs, along with a beautiful UI to track and manage your models and logs. All you have to do is configure your ZenML stack with the relevant parameters and you’re good to go. You’ll have to set up the infrastructure with credentials; check out our documentation for a guide on how to do that.
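To make the idea of per-step hardware concrete, here is a toy model in plain Python, with no ZenML imports, of how a setting on a single step can route just that step to a different backend. Every name in it (toy_step, OPERATORS, run_pipeline) is hypothetical and made up for illustration; it is not how ZenML implements step operators, and the real decorator is the @step(custom_step_operator=...) line shown above.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical registry: operator name -> function that "runs" a step there.
# A real step operator would submit a cloud job; here we only tag the result.
OPERATORS: Dict[str, Callable[[Callable[[], str]], str]] = {
    "sagemaker": lambda fn: f"[sagemaker] {fn()}",
    "azureml": lambda fn: f"[azureml] {fn()}",
}


def toy_step(custom_step_operator: Optional[str] = None):
    """Toy decorator (not ZenML's) that records the requested operator."""

    def decorator(fn):
        fn.operator = custom_step_operator
        return fn

    return decorator


@toy_step()
def load_data() -> str:
    return "loaded data"


@toy_step(custom_step_operator="sagemaker")
def train_model() -> str:
    return "trained model"


def run_pipeline(steps) -> List[str]:
    """Run each step locally unless it names a registered operator."""
    results = []
    for step_fn in steps:
        if step_fn.operator is None:
            results.append(f"[local] {step_fn()}")
        else:
            results.append(OPERATORS[step_fn.operator](step_fn))
    return results


results = run_pipeline([load_data, train_model])
print(results)
```

Only the training step carries an operator name, so it is the only one dispatched away from the local runner; the rest of the pipeline is untouched.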
We’ll be publishing more about this use case in the coming days, so stay tuned for that!
Tensorboard is a way to visualize your machine learning models and training outputs. In this release we added a custom visualization for Kubeflow which allows you to see the entire history of a model logged by a step.
Behind the scenes, we implemented a TensorboardService which tracks and manages locally running Tensorboard daemons. This interactive UI runs in the background and works even while your pipeline is running. The easiest way to use this feature is to click the ‘Start Tensorboard’ button inside the Kubeflow UI.
This new functionality has also been integrated into our Kubeflow example from previous releases.
If you ever need a reminder of the function of a particular stack component, there’s a new explain command that works for all stack components (orchestrator, container registry and so on). Typing zenml orchestrator explain will output the relevant parts of the documentation that explain some basics about the orchestrator component.
ZenML now reports whether a step is using a cached version or is actually being executed. We also improved error messages when provisioning local Kubeflow resources with a non-local container registry.
Our test suite was thoroughly reimagined and reworked to get the most out of GitHub Actions. Alexej wrote about this on the ZenML blog in “How we made our integration tests delightful by optimizing the way our GitHub Actions run our test suite”. We also completed the implementation of all integration tests so that they now run as part of our test suite.
We enabled the use of generic step inputs and outputs as part of your pipeline.
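As a rough illustration, assuming “generic” here refers to parametrized type hints such as List[str] or Dict[str, int], the sketch below shows how such annotations can be mapped back to their concrete classes with the standard library. It uses only the typing module; resolve_annotation and count_tokens are hypothetical names for this example and not part of ZenML’s API.

```python
from typing import Any, Dict, List, get_origin


def resolve_annotation(annotation: Any) -> Any:
    """Return the concrete class behind a parametrized type hint.

    Hypothetical helper for illustration only.
    """
    origin = get_origin(annotation)
    # get_origin returns None for plain classes such as int or str.
    return origin if origin is not None else annotation


def count_tokens(sentences: List[str]) -> Dict[str, int]:
    """A plain function with generic input and output annotations."""
    counts: Dict[str, int] = {}
    for sentence in sentences:
        for token in sentence.split():
            counts[token] = counts.get(token, 0) + 1
    return counts


# Map every annotation (including the return type) to its concrete class.
hints = count_tokens.__annotations__
resolved = {name: resolve_annotation(hint) for name, hint in hints.items()}
print(resolved)
```

A framework that picks a handler per type needs some step like this, because List[str] itself is a typing alias rather than the list class.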
Finally, we made a number of under-the-hood dependency changes that you probably won’t notice, but that either reduce the overall size of ZenML or fix some old or deprecated packages. Notably, ZenML no longer supports Python 3.6.
Join our Slack to let us know what you think we should build next!