Developing machine learning systems is painfully slow. Machine learning (ML) models take months to develop and deploy, and by the time they reach production, development velocity has slowed to a crawl. Let's understand why.
Developing ML systems is a complex endeavour. Machine learning practitioners (researchers, developers, engineers, data scientists) are called in to save the day. Yet despite their best intentions, ML teams struggle to operationalize and productionize models. ML systems in production are shaped by four factors: developer velocity, complexity, technical debt, and workflows.
Machine Learning Developer Velocity
Developer velocity is a measure of how quickly ML developers ship code: how much time is spent doing a task or completing a feature. As developers write "crappy code", velocity decreases over time. During the research or prototyping stage, the development velocity of ML practitioners is high, because developers focus on machine learning code. As ML projects shift into production, velocity drops: developers spend most of their time on non-ML code.
ML practitioners often take months to deploy models into production. In October 2019, Algorithmia surveyed 745 respondents about the state of machine learning: 40% of companies took more than a month to deploy a model into production, 28% took eight to 30 days, and only 14% took seven days or less.
Machine Learning Complexity
In the research or prototype stage, most ML models are simple, with core functionality, basic configuration, and data collection code. As ML practitioners are tasked with deploying their models online, they turn to "crappy code" to glue everything together.
In the production stage, most ML systems are complex, requiring "glue code": supporting code written to get data into and out of general-purpose ML packages and frameworks (TensorFlow, PyTorch, Keras). According to Google, a "mature system might end up being (at most) 5% machine learning code and (at least) 95% glue code." As ML practitioners deploy models into production, they face challenges that extend beyond ML code: configuration, data collection, feature extraction, data verification, ML management, analysis tools, process management tools, serving infrastructure, and monitoring.
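To make that 5%/95% split concrete, here is a minimal sketch of the glue code that typically surrounds a single model call in production. The request schema, validation rules, and stub model are illustrative assumptions, not from any real system; in practice the model would be a trained TensorFlow or PyTorch artifact.

```python
import json
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("serving")

REQUIRED_FIELDS = ["age", "income", "tenure"]  # hypothetical input schema


def model_predict(features):
    # Stand-in for a trained model; placeholder "score" for the sketch.
    return sum(features) / len(features)


def handle_request(raw_body):
    """Everything here except the model_predict call is glue code:
    parsing, validation, feature extraction, logging, serialization."""
    payload = json.loads(raw_body)                            # data in
    missing = [f for f in REQUIRED_FIELDS if f not in payload]
    if missing:                                               # data verification
        return json.dumps({"error": f"missing fields: {missing}"})
    features = [float(payload[f]) for f in REQUIRED_FIELDS]   # feature extraction
    score = model_predict(features)                           # the "5%" of ML code
    logger.info("scored request: %s", score)                  # monitoring hook
    return json.dumps({"score": score})                       # data out
```

Even in this toy version, one line of ML code is wrapped in a dozen lines of plumbing, and a real deployment adds configuration, retries, and infrastructure on top.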
Machine Learning Technical Debt and Workflows
Companies face two major challenges when developing machine learning models: technical debt and overly complex ML workflows. Technical debt comes in many forms: bad architecture choices, overly complex code, and a lack of code documentation. Complex ML workflows, meanwhile, require explicit resource management and lead to over-provisioning.
ML complexity leads to over-provisioning and slows development. The typical ML workflow consists of many steps, including, but not limited to, preprocessing, training, and tuning. This makes it challenging for ML practitioners to correctly provision and manage resources, and the resulting burden frequently causes over-provisioning and impairs productivity and developer velocity.
As ML practitioners move from prototype to production, they accumulate technical debt. According to Neil Ernst, the top reasons for technical debt are bad architecture choices, overly complex code, lack of code documentation, inadequate testing, and obsolete technology.
Development velocity should remain high throughout the entire machine learning project. The way to sustain high velocity is to use serverless architectures, which let ML practitioners focus on code, not serving infrastructure. Serverless computing provides managed infrastructure and automatic scaling, allowing ML practitioners to solve business problems instead of managing servers.
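As a sketch of what "focus on code, not serving infrastructure" looks like, here is a hypothetical AWS Lambda-style handler for model inference. The event shape and the stub model loader are illustrative assumptions; provisioning, scaling, routing, and patching would be handled by the platform, so this function is roughly all the serving code the ML team maintains.

```python
import json

_MODEL = None  # cached across warm invocations of the same container


def _load_model():
    # Placeholder for loading a serialized model (e.g. from disk or
    # object storage); a stub keeps the sketch runnable.
    return lambda xs: 1.0 if sum(xs) > 0 else 0.0


def lambda_handler(event, context):
    """AWS Lambda entry point: parse the request, run the model,
    return a JSON response. No server management anywhere."""
    global _MODEL
    if _MODEL is None:  # cold start: load once, reuse afterwards
        _MODEL = _load_model()
    features = json.loads(event["body"])["features"]
    prediction = _MODEL(features)
    return {"statusCode": 200, "body": json.dumps({"prediction": prediction})}
```

The module-level cache is a common pattern: the model loads on the first (cold) invocation and is reused while the function instance stays warm.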
Use serverless architectures to achieve high machine learning developer velocity!
Machine learning developers have a choice: keep writing crappy code or stop writing crappy code (see Figure below). This choice is the difference between a serverful architecture (AWS EC2) and a serverless one (AWS Lambda).
Figure: Velocity vs Time (Crisp)
How do companies benefit from adopting serverless machine learning? Companies that leverage serverless architectures move faster than their competition. Serverless eliminates manual scaling, simplifies workflows, and increases velocity for ML teams and practitioners.
Reduced Maintenance Costs
Companies must understand the real cost of machine learning systems in production. The biggest cost of maintaining existing ML models in production is the opportunity cost of developing new models. The more time is spent on maintaining or managing machine learning models in production, the less time is spent on developing new models that can impact business revenue and growth.
Using serverless computing, companies can save money on infrastructure thanks to the pay-per-use model. If no one is using your machine learning model, you pay nothing for it. If one million people invoke your model, you pay per invocation.
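As a back-of-the-envelope illustration of pay-per-use billing, here is a simple cost model. The prices are hypothetical placeholders, not current rates from any cloud provider; the point is only that cost scales with invocations and drops to zero at zero traffic.

```python
# Hypothetical prices; real cloud rates vary by provider, region, and tier.
PRICE_PER_MILLION_REQUESTS = 0.20  # dollars, assumed
PRICE_PER_GB_SECOND = 0.0000167    # dollars, assumed


def monthly_serverless_cost(invocations, memory_gb, avg_seconds_per_call):
    """Pay-per-use: zero invocations means zero compute cost."""
    request_cost = invocations / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    compute_cost = invocations * memory_gb * avg_seconds_per_call * PRICE_PER_GB_SECOND
    return request_cost + compute_cost


# Idle model: nobody calls it, nobody pays for it.
print(monthly_serverless_cost(0, 1.0, 0.1))           # 0.0
# One million predictions at 1 GB memory, 100 ms each:
print(monthly_serverless_cost(1_000_000, 1.0, 0.1))   # roughly $1.87 under the assumed prices
```

Contrast this with a serverful setup, where an always-on instance bills around the clock regardless of traffic.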
As companies or products are featured online (Product Hunt, Reddit, Hacker News) or by traditional media, servers are often overloaded by the surge of new visitors. The unexpected demand frequently takes servers down, leaving customers facing 404 or "server not responding" errors. With serverless, companies save face: capacity scales automatically to accommodate demand.
Startups, companies, and organizations are in a race against time, and the race is won by those who ship features into their products and services fastest. The companies that thrive are those that develop faster: they leverage serverless architectures, stop worrying about operations, and focus on ML development.
How do developers benefit from adopting serverless machine learning? ML teams that leverage serverless architectures dramatically boost developer productivity. Serverless greatly reduces technical debt, simplifies workflows, and increases velocity for ML teams and practitioners.
Low Technical Debt
Machine learning teams reduce technical debt thanks to opinionated serverless architectures, simplified code, and low cost. As companies write more code, they accumulate more technical debt. With serverless, much of that debt shifts to the cloud vendors (Amazon Web Services, Google Cloud, Microsoft Azure), whose large development and operations teams take care of the server infrastructure. Serverless architectures let ML practitioners write less code, and carry less technical debt, than traditional ML teams.
Simplified Machine Learning Workflows
Thanks to serverless, ML teams get managed infrastructure, automatic scaling, availability, and fault tolerance. As companies write glue code, they introduce complexity. With serverless, glue code is greatly reduced since cloud functions and cloud storage take care of the serving infrastructure, letting ML practitioners stop worrying about servers and focus on core ML code.
High Developer Velocity
As companies build out their technology, maintaining software and managing technical debt become real issues. Developers who sustain high development velocity can compete against other companies and reach the market faster.
Serverless machine learning allows developers to improve productivity, simplify workflows, and reduce code complexity. Serverless ML allows companies to deploy ML models with reduced maintenance costs, automatic scaling, and competitive advantage.
Modern machine learning teams use serverless architectures to ship faster.
Interested to learn how to use serverless machine learning in your team or organization? Schedule a 15-minute call with Slava Kurilyak, Founder/CEO at Produvia.