
Why we built an ML platform for developers—not just data scientists

Focusing on the people who build products

Caleb Kaiser

Mar 4 · 5 min read

Source: Robert Lucian Chiriac

Machine learning has, historically, been the purview of data science teams. This makes it a bit counterintuitive that we built Cortex, our open source ML infrastructure platform, primarily for software engineers.

While on the surface it seems like we chose the wrong user to emphasize, our decision reflects a fundamental shift within the machine learning ecosystem.

The rest of this article will explain this change in more detail, but the short version is that ten years ago, building a product that relied on ML—as opposed to using ML to generate a report—was only feasible for large tech companies. Now, ML has matured to the point where even solo engineers with little data science background can build machine learning applications.

In other words, there is a new group of engineers focused not on fundamental ML research, but on building products with machine learning—and they have a particular set of needs that differ from those of a researcher.

Machine learning now enables products—not just papers

Going all the way back to machine learning’s roots in the 1950s, the field has historically been research-focused—things like Arthur Samuel’s checkers-playing AI (1959) or IBM’s chess-playing Deep Blue (which beat Garry Kasparov in 1997).

Starting around 2010, there was a renewed interest in deep learning, with major tech companies releasing breakthroughs. Projects like Google Brain, DeepMind, and OpenAI (among others) began publishing new, state-of-the-art results.

These breakthroughs manifested as features in big companies’ products:

  • Netflix’s recommendation engine
  • Gmail’s smart compose
  • Facebook’s facial recognition tags

In addition, this renewed focus on machine learning—and particularly deep learning—led to the creation of better tools and frameworks, like Google’s TensorFlow and Facebook’s PyTorch, as well as open source models and datasets, like OpenAI’s GPT-2 and ImageNet.

With better tools, open source models, and accessible data, it became possible for small teams to train models for production. As a consequence of this democratization, a wave of new products has emerged, all of which at their core are “just” ML models wrapped in software. We refer to these products as ML-native.

The emergence of ML-native software

A lot of the early examples of ML-powered products feature machine learning that improves the user experience, but isn’t necessarily core to the product. You can still write emails without Gmail’s smart compose, or watch YouTube videos without the “Recommended For You” section, for example.

ML-native products are different in that their core functionality is a model making predictions, and we’re seeing them everywhere.

Take computer vision models:

  • Ezra, Zebra Medical, and Arterys are all startups that use computer vision models to analyze MRIs for anomalies.
  • SkinVision, SkinIQ, and TroveSkin all use your phone’s camera and a computer vision model to analyze your skin for everything from acne to melanoma.
  • Comma.ai, Pony.ai, and Phantom.ai all use computer vision models to help cars navigate autonomously.
  • Actuate (formerly Aegis AI), Athena Security, and Synapse Technology all use computer vision models to detect weapons in video footage.

And that’s just computer vision. You could make a similar list for natural language processing models, where startups like AI Dungeon (an AI choose-your-own-adventure game) have used NLP models to create completely interactive experiences.

Source: AI Dungeon

These products rely both on the research of data science teams—though sometimes it’s just an engineer fine-tuning an open source model—and on the design of software engineers.

And designing production software around a model, it turns out, is a specialty of its own.

Production machine learning has unique challenges

In order to make models accessible to engineers, there needs to be an interface that turns a trained model into something they can actually call—like a predict() function that takes input and returns the model’s prediction.
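As a rough illustration, a minimal version of that interface might look like the sketch below. The pickle serialization, model file name, and input field names are assumptions made for the example, not part of Cortex or any particular framework.

```python
# A minimal sketch of a predict() interface, assuming a model serialized with
# pickle that exposes a scikit-learn-style .predict() method. The file name
# and the input fields ("feature_a", "feature_b") are hypothetical placeholders.
import pickle

with open("model.pkl", "rb") as f:  # hypothetical serialized model
    model = pickle.load(f)

def predict(payload: dict) -> dict:
    """Turn a raw request payload into features and return the model's prediction."""
    features = [[payload["feature_a"], payload["feature_b"]]]
    prediction = model.predict(features).tolist()[0]  # convert numpy types to plain Python
    return {"prediction": prediction}
```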

One of the most popular design patterns for building this predict() function is realtime inference, in which a model is deployed as a microservice that engineers can query like any other API. For example, a smart compose-esque feature might take a user’s input text, query a prediction API, and return the predicted next word or phrase, like so:

Source: Write With Transformer
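To make the realtime inference pattern concrete, the client side of that flow might look like the short sketch below. The endpoint URL and the shape of the JSON response are assumptions for illustration, not any particular product’s API.

```python
# A sketch of a client calling a realtime prediction API.
# The endpoint URL and the JSON response format are hypothetical.
import requests

def complete_text(prompt: str) -> str:
    response = requests.post(
        "https://api.example.com/text-generator/predict",  # hypothetical endpoint
        json={"text": prompt},
        timeout=5,
    )
    response.raise_for_status()
    return response.json()["prediction"]

if __name__ == "__main__":
    print(complete_text("Machine learning has, historically, been"))
```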

And while wrapping a model in a JSON API is fairly straightforward, scaling it is difficult.
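For reference, the straightforward part might look like the minimal Flask sketch below, which wraps the hypothetical model-loading and predict() logic from the earlier sketch in a JSON endpoint; containerization, provisioning, and autoscaling are exactly what it leaves out.

```python
# A minimal Flask sketch of "wrapping a model in a JSON API".
# The model file and input fields are the same hypothetical placeholders as
# in the earlier predict() sketch; scaling concerns are deliberately omitted.
import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)

with open("model.pkl", "rb") as f:  # hypothetical serialized model
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict_endpoint():
    payload = request.get_json(force=True)
    features = [[payload["feature_a"], payload["feature_b"]]]
    prediction = model.predict(features).tolist()[0]
    return jsonify({"prediction": prediction})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```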

First, the model has to be loaded and queried within a microservice, probably using a framework like Flask, as in the sketch above. That microservice then needs to be containerized and deployed to the cloud (e.g. a Kubernetes cluster) in order to handle scale. On top of all of that, the cluster needs to be provisioned correctly to handle challenges specific to inference workloads, like:

  • The size of models. GPT-2, OpenAI’s state-of-the-art NLP model, is over 5 GB.
  • The high compute cost of inference. Many models require GPUs to compute a single inference in under a minute.
  • The challenges of concurrency. It’s not uncommon for just a couple of inferences to completely utilize a single instance, meaning instances need to aggressively autoscale to handle traffic.

And that’s without getting into the optimizations required to minimize cost.

These infrastructure challenges represent the largest remaining bottleneck preventing engineers from building products out of models.

The floodgates are inching open

You almost certainly already use ML-powered software—just look at the most used apps on your phone—but it is still a field dominated by a few massive companies.

Very quickly, however, we are seeing a generation of ML-native startups emerge on the back of improved tooling and frameworks, similar to how progress within web frameworks led to an explosion of web apps in the mid-to-late 2000s.

Infrastructure is one of the last hurdles preventing engineers from building software on top of machine learning, and by raising the level of abstraction around ML infra, ML-native software should benefit from the sort of boom we saw from the democratization of web and mobile.

That, in a nutshell, is why we built ML infrastructure for developers—not just data scientists.

