The latest news, announcements, and technical background on ZenML.
November 26, 2021 - Baris Can Durak
When working on a project in Python, it is very likely that you will run into an issue where even the simplest of imports can trigger a chain of further imports, which in turn can cost you a few seconds of run time before you even start to use what you imported. Left unchecked, this eager consumption of time becomes even more apparent and annoying if you are working on a project where response time is critical and a wide variety of tools are in play. Let’s put this into perspective for a tool that handles Machine Learning workflows in a production setting.
November 19, 2021 - Adam Probst
This first episode of Pipeline Conversations is the kick-off for a series of talks in the broader Machine Learning Start-Up Space. Our goal is not to promote any product or company but to discuss and also educate the community on different topics. We will invite thought leaders in the MLOps space, talk about how to build an open-source startup in public, and also the biggest challenges in Machine Learning. There won’t be a strict script and our guests will mainly determine the content.
November 4, 2021 - Alex Strick van Linschoten
There’s nothing like working on tests to get you familiar with a codebase. I’ve been adding tests back into the ZenML codebase over the past couple of weeks, and as a relatively new employee here, it has been a really useful way to dive into how things work under the hood.
March 31, 2021 - Hamza Tahir
Today, Machine Learning powers the top 1% of the most valuable organizations in the world (Facebook, Alphabet, Amazon, Netflix, etc.). However, 99% of enterprises struggle to productionize ML, even though they possess hyper-specific datasets and exceptional data science departments.
January 28th, 2021 - Hamza Tahir
Every organization at any scale understands that leveraging the public cloud is a trade-off between convenience and cost. While cloud providers like Google, Amazon and Microsoft have immensely reduced the barrier to entry for machine learning, GPU costs still come at a premium.
December 21st, 2020 - ZenML Team
We did not start as an open-source Machine Learning tooling company. Our original goal was to transform the commercial vehicle industry with predictive analytics. After a few promising proofs-of-concept and projects on trucks, we steadily expanded to other commercial vehicles - and to other industries.
November 11th, 2020 - ZenML Team
For the last few months, we’ve been hard at work bringing ZenML to market as a commercial product. Our vision was and remains the commoditization of reproducible Machine Learning. Every pipeline for every model should forever be reproducible and have a deployable model at its end. All the invaluable lessons we’ve learned in this time, from customers, users, friends, and strangers, are now culminating in the next step towards our goal: we will make ZenML available under an Open Source License.
June 26th, 2020 - Hamza Tahir
Just a few days ago, I was able to share my thoughts on the state of Machine Learning in production, and why it’s (still) broken, at MLOps World 2020. Read on for a write-up of my presentation, or check out the recording of the talk on YouTube.
June 11th, 2020 - Hamza Tahir
One attempt to ensure that ML models generalize in unknown settings is splitting data. This can be done in many ways, from 3-way splits (train, test, eval) to k-fold splits with cross-validation. The underlying reasoning is that by training an ML model on a subset of the data and evaluating it on unseen data, one can judge much better whether the model has underfit or overfit during training.
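The 3-way split mentioned above can be sketched in a few lines of NumPy. The `three_way_split` helper below is purely illustrative (it is not a ZenML API), and the 70/15/15 ratios are an assumed example:

```python
import numpy as np

def three_way_split(X, y, train_frac=0.7, test_frac=0.15, seed=0):
    """Shuffle and split data into train/test/eval subsets (illustrative helper)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_train = int(len(X) * train_frac)
    n_test = int(len(X) * test_frac)
    train_idx = idx[:n_train]
    test_idx = idx[n_train:n_train + n_test]
    eval_idx = idx[n_train + n_test:]  # remainder goes to the eval split
    return (
        (X[train_idx], y[train_idx]),
        (X[test_idx], y[test_idx]),
        (X[eval_idx], y[eval_idx]),
    )

# Toy dataset of 100 points
X = np.arange(100).reshape(100, 1)
y = np.arange(100)
train, test, ev = three_way_split(X, y)
```

Evaluating on the held-out `eval` split, which the model never sees during training or model selection, is what makes the under/overfitting judgment meaningful.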
June 6th, 2020 - Hamza Tahir
Okay, let’s make it clear at the start: this post is NOT intended for people who are doing one-off, siloed projects like participating in Kaggle competitions, or doing hobby projects in Jupyter notebooks to learn the trade. The value of throw-away, quick, dirty script code is obvious there - and it has its place. Rather, it is intended for ML practitioners working in a production setting. So if you’re working in an ML team that is struggling to manage technical debt while pumping out ML models, this one’s for you.
May 17th, 2020 - ZenML Team
No way around it: I am what you call an “Ops guy”. In my career I have admin’ed more servers than I’ve written code. Over twelve years in the industry have left their permanent mark on me. For the last two of those years, I’ve been exposed to a new beast - Machine Learning. My hustle is bringing Ops knowledge to ML. These are my thoughts on that.
May 7th, 2020 - Baris Can Durak
In the last decade, machine learning applications have proven their capabilities and potential across many domains. Especially in the past few years, they have gained rapid prominence in the gaming industry, and there are now countless projects creating an endless array of models that interact with different games.
May 4th, 2020 - Hamza Tahir
Over the last few years at ZenML, we have regularly dealt with datasets that contain millions of data points. Today, I want to write about how we use our machine learning platform, ZenML, to build production-ready distributed training pipelines capable of processing millions of data points in a matter of hours. If you also want to build large-scale deep learning pipelines, sign up for ZenML for free here and follow along.
May 1st, 2020 - Hamza Tahir
Around 87% of machine learning projects never make it to production. There is a disconnect between machine learning done in Jupyter notebooks on local machines and models actually being served to end-users to provide real value.
February 27, 2020 - Hamza Tahir - Crossposted on Tensorflow Blog
Principal Component Analysis (PCA) is a dimensionality reduction technique, useful in many different machine learning scenarios. In essence, PCA reduces the dimension of input vectors in a way that retains the maximal variance in your dataset. Reducing the dimensionality of the model input can increase the performance of the model, reduce the size and resources required for training, and decrease non-random noise.
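As a rough illustration of what PCA does, the sketch below centers the data and projects it onto the directions of maximal variance via SVD. The `pca` function and the synthetic dataset are hypothetical examples for this teaser, not code from the post:

```python
import numpy as np

def pca(X, n_components):
    """Project X onto its directions of maximal variance (minimal SVD-based sketch)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Variance explained by each retained component
    explained_variance = (S ** 2) / (len(X) - 1)
    return X_centered @ components.T, explained_variance[:n_components]

rng = np.random.default_rng(0)
# Synthetic 3-D data that mostly varies along a single direction, plus small noise
X = rng.normal(size=(200, 1)) @ np.array([[3.0, 1.0, 0.5]]) + 0.1 * rng.normal(size=(200, 3))
Z, var = pca(X, n_components=2)
```

Because the data varies mostly along one direction, the first component captures far more variance than the second, which is exactly the property that lets PCA shrink the model input while retaining signal.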