Low-latency model serving at scale helps data scientists deliver
models into production
SUNNYVALE, Calif.–(BUSINESS WIRE)–#AI–ParallelM,
the leader in MLOps, today released a new version of MCenter that
includes REST-based serving using Kubernetes to create a no-code,
autoscaling infrastructure for model serving supporting the leading
modeling frameworks. With this release, data scientists can quickly
create robust autoscaling REST services for their machine learning
models to better serve real-time applications in the cloud or on-premises.
“After working with advanced customers in the financial services
industry, we realized that our clients really needed a REST service
using autoscaling technology, like Kubernetes, to meet their business
requirements for high-volume and low-latency model serving at scale,”
said Sivan Metzger, CEO of ParallelM.
“So we went out and built it for them while keeping an elegant, simple
interface that’s intuitive and integrates with their existing modeling
The 1.3 release of MCenter specifically addresses the deployment
challenges of machine learning for real-time, production applications.
Unfortunately for data scientists, many existing data science tools
with REST interfaces were designed for testing model outputs, not for
production applications. This means that while these REST endpoints
are easy to set up, they cannot perform in real-world environments and
will fail when they are needed most. The new REST interface in MCenter
is intended for serving models in real time, with low latency and at
high volume, as real-world applications require. By using this more
robust REST endpoint, data scientists can be assured that their models
will be available to serve their business applications even under the
most punishing real-world conditions.
ParallelM’s MCenter gives data scientists the following benefits:
Autoscaling with Kubernetes – This new release increases the
scalability and performance of ParallelM MCenter across both batch and
real-time use cases by using Kubernetes to provide autoscaling
infrastructure. This industry-standard approach allows the
infrastructure to scale up and down with load as needed to optimize
resource utilization and manage costs for pay-as-you-go services.
Kubernetes also provides robust failover and eases infrastructure
monitoring and management. Whether companies are just starting with
machine learning or are already building advanced, real-time AI
applications, their platform for ML in production can scale to meet
their needs.
No Coding Required – The product is easy to use, with drag-and-drop
components that require no coding. This not only expands the number of
people who can build these high-quality pipelines but also reduces
the errors and risks that come from custom coding these applications.
New Integrated Components – Out-of-the-box components that are
pre-configured for the most common modeling frameworks including
Scikit Learn, PMML and H2O. This makes it easier to get started with
real-time serving for models built on these frameworks in minutes not
ParallelM is the first and only company
entirely focused on delivering machine learning operationalization
(MLOps) at scale. ParallelM’s breakthrough MCenter™ solution is built
specifically to power the deployment, management, and governance of
machine learning pipelines in production so that companies can scale
machine learning across their business applications. ParallelM’s
approach is that of a single, unified MLOps solution that embeds best
practice processes in technology, enabling all ML stakeholders to unlock
the business value of AI. Please visit www.parallelm.com
or email us at [email protected].
ParallelM and MCenter are trademarks of Parallel Machines, Inc. All
other trademarks are the property of their respective registered owners.
Trademark use is for identification only and does not imply sponsorship,
affiliation, or endorsement.
Marianne Dempsey/Jenna Beaucage