Exciting new release of SynapseML
Today we’re excited to announce SynapseML v0.10.0 (previously MMLSpark), the latest release of our open-source library that aims to simplify the creation of massively scalable machine learning pipelines.
This is great news for all our SynapseML fans out there, and this blog post covers the latest additions. First, a quick recap of what SynapseML is for readers who have not yet used the library.
SynapseML is a massively scalable (feel free to spin up hundreds of machines!) machine learning library built on Apache Spark. SynapseML makes it easy to train production-ready models that solve problems ranging from simple classification and regression to anomaly detection, translation, image analysis, speech-to-text, and just about any ML challenge you are facing. Under the hood, SynapseML integrates a wide array of ML technologies such as LightGBM, Vowpal Wabbit, ONNX, and the Cognitive Services into a single easy-to-use API compatible with MLflow. We know, we know, everyone hates when developers invent new APIs, but you can rest easy: SynapseML integrates cleanly into existing Spark ML APIs, so you can embed models directly into existing pipelines. We strive to make SynapseML available to developers wherever they work, and the library is available in a variety of languages, including Python, Scala, Java, and R. As of this release, SynapseML is also usable from .NET languages such as C# and F#.
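To make the Spark ML integration concrete, here is a minimal Python sketch of a SynapseML LightGBM classifier used as an ordinary stage in a Spark ML Pipeline. The helper name build_pipeline, the column names, and the hyperparameter values are illustrative assumptions, not part of the release; the sketch also assumes a Spark environment with the SynapseML package installed.

```python
# Illustrative hyperparameters (assumed values, not tuning recommendations).
LIGHTGBM_PARAMS = {
    "objective": "binary",
    "numIterations": 100,
    "learningRate": 0.1,
}


def build_pipeline(feature_cols, label_col="label", params=None):
    """Hypothetical helper: assemble features, then train LightGBM on them."""
    # Imports are kept inside the function so this file can be read and
    # imported on a machine without Spark or SynapseML installed.
    from pyspark.ml import Pipeline
    from pyspark.ml.feature import VectorAssembler
    from synapse.ml.lightgbm import LightGBMClassifier

    params = dict(LIGHTGBM_PARAMS if params is None else params)

    # Combine the raw numeric columns into the single vector column
    # that Spark ML estimators expect.
    assembler = VectorAssembler(inputCols=list(feature_cols), outputCol="features")

    # The SynapseML estimator is a regular Spark ML stage.
    classifier = LightGBMClassifier(
        featuresCol="features", labelCol=label_col, **params
    )
    return Pipeline(stages=[assembler, classifier])
```

With a training DataFrame train_df that has numeric columns such as "age" and "income" plus a "label" column (all assumed names), build_pipeline(["age", "income"]).fit(train_df) would return a fitted PipelineModel whose transform method can score new data.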
Highlights in this release
.NET, C#, and F# Support
Dear .NET community: we heard you! In SynapseML v0.10, we are adding full support for .NET languages like C# and F#. This means that you can now use everything in SynapseML from any of the .NET ecosystem languages, and even load up models you built in other languages like Python, R, and Java.
With the new .NET bindings, you can train a distributed LightGBM model with no fuss.
View our .NET getting started page for more details.
OpenAI Language Models
SynapseML also offers a simple and scalable way to leverage Azure Cognitive Services directly from Spark.
SynapseML currently supports over 50 Cognitive Services, and we are now expanding that to the new Azure OpenAI Service. This service allows users to tap into 175-billion-parameter language models (GPT-3) from OpenAI that can generate and complete text and code at near-human parity. GPT-3 requires only a small amount of input text to generate large volumes of relevant and sophisticated machine-generated text. GPT-3 can also solve a variety of ML tasks beyond completion, including summarization, translation, sentiment analysis, analogical reasoning, question answering, and more.
To learn more, check out the SynapseML OpenAI guide, or see a demonstration showing how simple it is to create a search engine on custom unstructured data using SynapseML and GPT-3.
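As a rough illustration of what calling the service from Spark can look like, the Python sketch below configures a completion transformer over a DataFrame column of prompts. It assumes the OpenAICompletion transformer is importable from synapse.ml.cognitive, as in the v0.10-era documentation (the module path may differ in other versions). The helper names, the resource and deployment names, and the column names are all hypothetical.

```python
def openai_endpoint(resource_name):
    """Build the endpoint URL for an Azure OpenAI resource (name is assumed)."""
    return "https://{}.openai.azure.com/".format(resource_name)


def build_completion(key, resource_name, deployment_name, max_tokens=200):
    """Hypothetical helper: configure a prompt-completion transformer."""
    # Imported lazily so this file loads without SynapseML installed.
    # Assumption: import path as documented for SynapseML v0.10.
    from synapse.ml.cognitive import OpenAICompletion

    return (
        OpenAICompletion()
        .setSubscriptionKey(key)              # Azure OpenAI resource key
        .setDeploymentName(deployment_name)   # name of the deployed model
        .setUrl(openai_endpoint(resource_name))
        .setMaxTokens(max_tokens)             # cap on generated tokens per prompt
        .setPromptCol("prompt")               # input column holding prompt text
        .setErrorCol("error")                 # per-row errors are written here
        .setOutputCol("completions")          # generated text is written here
    )
```

Given a DataFrame df with a string column named "prompt", build_completion(key, "my-resource", "my-deployment").transform(df) would add "completions" and "error" columns, with requests distributed across the cluster.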
Full Support for MLflow
MLflow is a platform for managing the machine learning lifecycle and streamlining machine learning development, including tracking experiments, packaging code into reproducible runs, and sharing and deploying models. We are very happy to announce that SynapseML models now integrate with MLflow, with full support for saving, loading, deployment, and autologging. This means that you can now manage the lifecycle of your SynapseML models the same way that you manage your other models using MLflow. Finally, not only did we enable full support for SynapseML, but we also contributed back to the MLflow project to enable autologging for all Spark ML models!
For more information, check out the MLflow in SynapseML getting started guide and our documentation on MLflow autologging.
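The saving, loading, and autologging workflow described above might look roughly like the following Python sketch, which uses MLflow's Spark flavor. The helper names and the "model" artifact path are assumptions for illustration, and whether a particular SynapseML estimator is picked up by autologging depends on how MLflow's model allowlist is configured in your environment.

```python
def model_uri(run_id, artifact_path="model"):
    """Build an MLflow 'runs:/' URI pointing at a model logged in a run."""
    return "runs:/{}/{}".format(run_id, artifact_path)


def train_and_log(pipeline, train_df, artifact_path="model"):
    """Hypothetical helper: fit a Spark ML pipeline and log it to MLflow."""
    # Imported lazily so this file loads without MLflow or Spark installed.
    import mlflow
    import mlflow.pyspark.ml
    import mlflow.spark

    # Turn on autologging for Spark ML estimators.
    mlflow.pyspark.ml.autolog()

    with mlflow.start_run() as run:
        model = pipeline.fit(train_df)
        # Explicitly save the fitted model as a run artifact.
        mlflow.spark.log_model(model, artifact_path)

    return model, model_uri(run.info.run_id, artifact_path)


def load_logged_model(uri):
    """Load a previously logged Spark model back as a PipelineModel."""
    import mlflow.spark

    return mlflow.spark.load_model(uri)
```

A typical round trip under these assumptions: model, uri = train_and_log(pipeline, train_df), followed later by load_logged_model(uri).transform(test_df) in another session.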
No Cluster, No Problem: Experiment in Browser with Binder
We know that Spark can be intimidating for first-time users, but fear not: with Binder, you can explore and experiment with SynapseML with zero setup, installation, infrastructure, or Azure account required. Simply visit our Binder site to get started in your browser!
There is more…
Finally, in addition to the highlights mentioned above, there are many other great updates in this release for Responsible AI, Azure Cognitive Services, LightGBM on Spark, Vowpal Wabbit, and other features. We could not capture all of it here, but you can read more about this release in the detailed release notes and learn all about the rich SynapseML capabilities on our website.
We would also like to acknowledge the developers and contributors, both internal and external, who helped create this version of SynapseML. We encourage you to learn more about these amazing developers through our contributor spotlight.
We really hope you enjoy the new version of SynapseML. Don’t hesitate to reach out and let us know what you think! Stay tuned for more exciting updates, and let us know you appreciate SynapseML by giving us a star on GitHub. ;)
- Mark Hamilton, Senior Software Engineer, Azure Synapse
- Nellie Gustafsson, Principal Product Manager, Azure Synapse