April 12, 2023 - Hamza Tahir - 6 mins read
Last updated: April 12, 2023
Today, we are excited to announce the launch of the ZenML Hub, a major game-changer for our open-source MLOps framework. This novel plugin system allows users to contribute and consume stack component flavors, pipelines, steps, materializers, and other pieces of code seamlessly in their ML pipelines.
The goal of ZenML is to standardize MLOps workflows across a rich and diverse ML tooling ecosystem. To achieve this, we have built key abstractions of components that allow users to integrate a variety of tooling and infrastructure backends without having to change their business layer logic. This allows MLOps practitioners to standardize processes, prevent vendor lock-in, and ensure reliability across their workflows.
Some examples of using these components in practice are:
Before today, these ZenML components were packaged within the core ZenML package and exposed via the zenml integration command line. While useful, this made it harder for users and our community to modify and contribute more of these components. With the launch of the ZenML Hub, we’re making big strides toward making these components more widely accessible and easier to use than ever. In the hub, ZenML stack component flavors, steps, materializers, and other pieces of useful code are packaged in plugins, which are accessible via a central registry that is available directly from the ZenML dashboard. Each plugin contains descriptions, tags, and helpful information on how to use it.
To use a plugin, one simply needs to install it as follows:
zenml hub install langchain_qa_example
… and then directly import and use it in Python, for example:
from zenml.hub.langchain_qa_example import qa_pipeline
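If you are not sure whether a plugin has been installed in the current environment, you can guard the import. The snippet below is a minimal sketch, not part of the ZenML API: the helper name load_hub_object is hypothetical, and the only thing it takes from this post is the zenml.hub.<plugin_name> import path used in the example above.

```python
import importlib
from typing import Any, Optional


def load_hub_object(plugin_name: str, object_name: str) -> Optional[Any]:
    """Return an object from an installed hub plugin, or None if unavailable.

    Hypothetical helper: it only assumes that installed plugins are
    importable as `zenml.hub.<plugin_name>`, as in the import shown above.
    """
    module_path = f"zenml.hub.{plugin_name}"
    try:
        module = importlib.import_module(module_path)
    except ImportError:
        # Either ZenML itself or the plugin is not installed.
        return None
    # Return None as well if the plugin does not expose the requested name.
    return getattr(module, object_name, None)


qa_pipeline = load_hub_object("langchain_qa_example", "qa_pipeline")
if qa_pipeline is None:
    print("Plugin not available; try: zenml hub install langchain_qa_example")
```

A plain import works just as well once the plugin is installed; the guard is only useful in scripts that should fail with a friendly message instead of a traceback.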
With the hub, extending ZenML has never been easier. Contributing a plugin is a breeze: simply create and submit a public GitHub repository (like this). After processing, your plugin is installable for all ZenML users, including your company and the community at large. This will foster a community-driven approach to building machine learning workflows. As more users contribute to the hub, the community will benefit from a growing repository of high-quality, reusable components that can be used to build more complex workflows. This in turn enables users to create more impactful and efficient models while also providing the opportunity to collaborate with other community members.
We believe that the ZenML Hub will help democratize MLOps by making it easier for everyone to contribute and consume code. By removing the barriers to entry for new contributors, we hope to accelerate innovation in the field and ultimately lead to more impactful solutions. The ultimate goal is to arrive at a set of standardized, reusable components that will help all of us who are putting models in production.
One of the key use cases for the ZenML Hub is sharing reproducible code across different projects within an organization. Take for example a company that is implementing multiple machine learning pipelines for various use cases. These pipelines may include steps such as data loading, preprocessing, feature engineering, model training, and evaluation. The ZenML Hub will enable the creation of commonly used steps that can then be shared across all projects, saving time and effort.
Imagine a process where one user in your organization creates a standard wrapper to run a preprocessing job on Spark, a training job on SageMaker, and a deployment job on AzureML. With the ZenML Hub, these components are discoverable, fully documented, and tested. The work is only done once, and the plugins can be updated, versioned, and maintained separately from the actual machine learning code.
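To make the "write once, reuse everywhere" idea concrete, here is a deliberately framework-free sketch in plain Python. It does not use the ZenML API: the names normalize, run_pipeline, project_a, and project_b are all hypothetical, and a real plugin would package ZenML steps rather than bare functions. The point is only that a shared step lives in one place while each project composes it with its own logic.

```python
from typing import Callable, List

# A "step" here is just a function from a list of numbers to a list of numbers.
Step = Callable[[List[float]], List[float]]


def normalize(values: List[float]) -> List[float]:
    """Shared preprocessing step: scale values into the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        # Avoid division by zero when all values are identical.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def clip_top(values: List[float]) -> List[float]:
    """Project-specific step (project A): cap every value at 0.9."""
    return [min(v, 0.9) for v in values]


def square(values: List[float]) -> List[float]:
    """Project-specific step (project B): square every value."""
    return [v * v for v in values]


def run_pipeline(steps: List[Step], data: List[float]) -> List[float]:
    """Apply each step in order, feeding the output of one into the next."""
    for step in steps:
        data = step(data)
    return data


# Both projects reuse the same shared step and add their own logic after it.
project_a = [normalize, clip_top]
project_b = [normalize, square]

print(run_pipeline(project_a, [0.0, 5.0, 10.0]))
print(run_pipeline(project_b, [0.0, 5.0, 10.0]))
```

If normalize ever needs a fix, it is changed once and both projects pick it up, which is the maintenance benefit described above.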
As part of this launch, we are excited to introduce several new plugins that have been added to the ZenML Hub. These plugins include standard steps and pipelines that can be easily and freely used for standard use cases with ZenML. We look forward to seeing how these new plugins will streamline the ML workflow and help everyone build better models faster.
One of the most intuitive examples to get started with is the langchain_qa_example plugin. The plugin features a simple pipeline and steps that allow users to fetch data from a variety of sources (via LangChain and LlamaIndex data loading steps), create an index, and answer a query across the corpus using a GPT-3.5 (and beyond) LLM powered by OpenAI. To reproduce it locally, simply run:
zenml hub install langchain_qa_example
export OPENAI_API_KEY=<YOUR_KEY> # get it from https://platform.openai.com/account/api-keys
from zenml.hub.langchain_qa_example import build_zenml_docs_qa_pipeline
pipeline = build_zenml_docs_qa_pipeline(question="What is ZenML?", load_all_paths=False).run()
(When you first run this pipeline, it will run a series of steps that will scrape the ZenML docs, and build an index. Subsequent runs will be faster and re-use the index because of ZenML’s internal cache.)
And there you go: you can now recreate a simple question-answering MLOps pipeline using cutting-edge LLMs and the latest libraries, and then go on to deploy it on custom infrastructure. Of course, if you want to use the individual steps or pipelines directly, feel free to check out the corresponding project and source code here.
In the future, we plan to add more plugins, with steps like an ONNX converter and step operators like SageMaker, Spark, EMR, etc. We’re also working on workflows to easily pull and fork public plugins, automated testing, and a playground to test steps.
For now (release 0.38.0 onwards), the ZenML Hub is officially supported within the main ZenML package. Get started with using your first plugin and start contributing today. Your contributions will help build a better future for machine learning and benefit the entire community.
As always, drop by the community Slack to give feedback or ask any questions. Thank you for your continued support, and stay tuned for more exciting updates from the ZenML team!
ZenML is currently hiring for a number of positions. Check out our careers page for more details!