Developing machine learning models is challenging and can be expensive, so it is important to have solid foundations in place that let teams put those models to work and derive value from them. A customer of One51 Consulting had the skills to create quality machine learning models; however, their team worked in an environment that made it arduous to manage and scale their efforts. The client was using a mix of virtual machines, local notebooks, and other resources, doing their best with what they had, but it was clear that a more organised approach could help them get more out of their work.
That’s where ML-Ops comes in. It’s a practice that combines machine learning, data engineering, and DevOps to make the whole process of going from model development to deployment smoother and more manageable. In this post, we’ll talk about how we used ML-Ops tools, specifically Azure ML and Azure DevOps pipelines, to help our customer improve their machine learning workflows and develop a framework for their development lifecycle. We’ll go over the problems we faced, the solutions we came up with, and what we learned from the experience.
Challenges & Solutions
One of the key challenges was the lack of a well-defined environment in which data scientists could develop models. As a result, they did a lot of work locally or on virtual machines they spun up themselves, which made collaboration difficult and hurt reproducibility. Azure ML addresses this directly: a compute instance can be set up for each data scientist, providing a secure, dedicated notebook environment for development. IT can use Azure DevOps templates to create and configure these compute instances, which ensures appropriate governance.
Another challenge was managing environments for different use cases. Different models required different packages, and managing those dependencies proved difficult. Azure ML addresses this with an environment registry that contains curated environments for common ML tasks. It also lets users register custom environments, keep them under version control, and share them between team members.
One requirement in our project was to bring a model through the ML-Ops framework and into production. We achieved this by:
- Taking a model the client had developed locally using PyCaret, a low-code development package that is useful for quick model development but not best suited to production applications.
- Refactoring the code into a form that Azure ML could run and that let us track various metrics.
- Creating an Azure ML training pipeline and executing it with an Azure DevOps pipeline.
- Registering the model and the training environment in the model and environment registries.
- Using Azure DevOps Pipelines to deploy the model to an endpoint, and creating a scoring script to preprocess incoming data and output the model results.
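To make the last step more concrete, here is a minimal sketch of the shape an Azure ML scoring script takes: an `init()` function that runs once when the deployment starts, and a `run()` function that handles each request. It is not the client's script. The feature names, the `load_model` helper, the `PlaceholderModel` class and the request layout (`{"data": [...]}`) are all assumptions made for illustration.

```python
import json
import os

# Hypothetical feature names; the real input schema is not given in this post.
FEATURES = ["age", "income"]

model = None  # populated by init()


class PlaceholderModel:
    """Stand-in for the registered model: predicts the mean of each row."""

    def predict(self, rows):
        return [sum(row) / len(row) for row in rows]


def load_model(model_dir):
    """Hypothetical loader. A real script would deserialise a file from model_dir."""
    return PlaceholderModel()


def init():
    """Runs once when the deployment starts."""
    global model
    # Azure ML exposes the registered model's folder through this variable.
    model_dir = os.getenv("AZUREML_MODEL_DIR", ".")
    model = load_model(model_dir)


def preprocess(records):
    """Convert JSON records to numeric rows, treating missing values as 0.0."""
    return [[float(rec.get(name) or 0.0) for name in FEATURES] for rec in records]


def run(raw_data):
    """Runs per request; raw_data is the request body as a JSON string."""
    try:
        records = json.loads(raw_data)["data"]
        return {"predictions": model.predict(preprocess(records))}
    except (AttributeError, KeyError, TypeError, ValueError) as exc:
        # Return the problem to the caller instead of failing the request.
        return {"error": str(exc)}
```

With this placeholder, calling `init()` and then `run('{"data": [{"age": 40, "income": 60}]}')` returns `{"predictions": [50.0]}`. Keeping `preprocess` as its own function makes it easier to test, and easier to move upstream later if that turns out to be the better home for it.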
The main complexity came from PyCaret, which is less well suited to Azure ML than lower-level packages. As a result, we needed some custom wrapper functions to integrate the model fully into Azure ML.
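As a rough illustration of what such a wrapper can look like (not the code used in this project), the sketch below adapts a model whose scoring goes through a library-specific function into an object with a plain `predict()` method. The names `ModelAdapter`, `toy_score_fn` and the `"label"` column are hypothetical, and the toy scoring function only stands in for a real library call.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


class ModelAdapter:
    """Give a model with a library-specific scoring call a plain predict() method."""

    def __init__(
        self,
        estimator: Any,
        score_fn: Callable[[Any, List[Record]], List[Record]],
        label_key: str,
    ) -> None:
        self._estimator = estimator
        self._score_fn = score_fn
        self._label_key = label_key

    def predict(self, records: List[Record]) -> List[Any]:
        # Delegate to the library call, then keep only the prediction column.
        scored = self._score_fn(self._estimator, records)
        return [row[self._label_key] for row in scored]


def toy_score_fn(estimator: Any, records: List[Record]) -> List[Record]:
    """Hypothetical stand-in for a library call that returns the rows with a prediction added."""
    return [{**rec, "label": estimator(rec)} for rec in records]


# Toy estimator: flags records whose "x" value is positive.
adapter = ModelAdapter(lambda rec: int(rec["x"] > 0), toy_score_fn, "label")
predictions = adapter.predict([{"x": 1.5}, {"x": -2.0}])
print(predictions)  # [1, 0]
```

The point of the adapter is that downstream code, such as a scoring script, only depends on `predict()`, so the underlying library can change without touching the deployment code.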
There is no one-size-fits-all approach to ML-Ops adoption. It is important to consider your team’s skills, expertise and ways of working, and to build a fit-for-purpose solution that still aligns with best practices. A key step is to evaluate the current state and agree the target state with stakeholders. From there, it is possible to map out a plan to improve maturity.
Segmentation of modelling steps is important. Data scientists often start working from raw data and complete several steps, including preprocessing, before getting to a final model output. When productionising such a model, it is important that the right tool is used for the job, and it may be more appropriate to move preprocessing steps upstream to data engineering pipelines.
ML-Ops will not make things easier at first. There is a learning curve, and doing things differently takes time. But it is important to persevere, as the eventual pay-off is worth the trade-off. An initial sacrifice of development time leads to more robust production models, better reproducibility and reliability, and an overall simpler path from proof of concept to production.
Our team brings extensive expertise in ML-Ops, ensuring you have access to best practices and proven strategies from the start. We can also help flatten the learning curve, allowing you to quickly grasp and implement a framework and practices that work for you.
Talk to us today to find out more!