SOURCE: DataScience.com
May 23, 2017 16:22 ET
DataScience.com's new Python library, Skater, uses a combination of model interpretation algorithms to identify how models leverage data to make predictions.
LOS ANGELES, CA--(Marketwired - May 23, 2017) - DataScience.com has released a beta version of Skater, its new Python library for interpreting predictive models. Skater uses a combination of algorithms to explain the relationships between the data that go into a model and the predictions it makes, allowing users to assess a model's performance and identify key features.
As machine learning becomes increasingly popular in business applications -- such as scoring the creditworthiness of loan applicants, recommending relevant content to viewers, and predicting the amount of money online shoppers will spend -- interpretation is also becoming an indispensable part of the model-building process. Skater provides a common framework for describing predictive models regardless of the algorithm used to build them, giving data science practitioners the freedom to use the technique of their choice without worrying about its complexity.
"In many cases, a data scientist will use simple modeling techniques like linear regression or decision trees because the resulting model is easy to interpret," said DataScience.com Chief Strategy Officer William Merchan. "In effect, he or she is sacrificing performance for interpretability; for example, neural networks or ensembles are harder to explain but produce highly accurate predictions. Skater aims to eliminate this compromise."
Skater features model-agnostic partial dependence plots, a type of visualization that describes the modeled relationship between a predictor and a target, and variable importance, a measure of the degree to which features drive predictions. It also improves upon existing methods for model interpretation like Local Interpretable Model-Agnostic Explanations (LIME). Skater allows these methods to be applied to any machine learning model -- from ensembles to neural nets -- whether it is available locally or deployed as an API.
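To illustrate what "model-agnostic" means for these two techniques, the following is a minimal sketch in plain Python. It does not use Skater's API. The function names, the `black_box` stand-in model, and the synthetic data are all assumptions made for illustration, and variable importance is approximated here with a simple permutation-based error increase, which may differ from the measure Skater computes. The only thing the sketch requires of a model is a callable that maps rows to predictions, which is also all a model deployed behind an API would expose.

```python
import random
from statistics import fmean


def partial_dependence(predict_fn, rows, feature, grid):
    """Average prediction when one feature is forced to each grid value."""
    curve = []
    for value in grid:
        # Overwrite the chosen feature in every row, leave the rest untouched.
        modified = [row[:feature] + [value] + row[feature + 1:] for row in rows]
        curve.append(fmean(predict_fn(modified)))
    return curve


def permutation_importance(predict_fn, rows, targets, feature, rng):
    """Increase in mean squared error after shuffling one feature's column."""
    def mse(data):
        return fmean((p - t) ** 2 for p, t in zip(predict_fn(data), targets))

    shuffled = [row[feature] for row in rows]
    rng.shuffle(shuffled)
    permuted = [row[:feature] + [v] + row[feature + 1:]
                for row, v in zip(rows, shuffled)]
    return mse(permuted) - mse(rows)


def black_box(rows):
    """Hypothetical model: depends only on feature 0 and ignores feature 1.

    Any callable with this shape would work, including one that wraps
    an HTTP request to a deployed model.
    """
    return [3.0 * row[0] for row in rows]


rng = random.Random(0)
rows = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(500)]
targets = black_box(rows)

pd_curve = partial_dependence(black_box, rows, feature=0, grid=[-1.0, 0.0, 1.0])
importance_0 = permutation_importance(black_box, rows, targets, 0, rng)
importance_1 = permutation_importance(black_box, rows, targets, 1, rng)
```

In this toy setup the partial dependence curve for feature 0 follows the model's slope of 3, and shuffling the ignored feature 1 leaves the error unchanged, so its importance is zero. Neither function inspects the model's internals, which is why the same approach applies to ensembles and neural networks alike.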
With Skater, data science practitioners can:
"One of the key features of DataScience.com's enterprise data science platform is the ability to deploy models behind a REST API to make them instantly available for integration with dashboards or real-time applications," Merchan added. "Skater is helping us take that one step further by making it possible to explain the complicated models deployed in our platform -- or anywhere -- in a way that is understandable to both data science practitioners and, ultimately, non-technical stakeholders."
The Skater package is available on GitHub and can be installed from PyPI using pip.
For more information, visit www.datascience.com.
About DataScience.com:
DataScience.com provides an enterprise data science platform that combines the tools, libraries, and languages data scientists love with the infrastructure and workflows their organizations need. The DataScience.com Platform supports the way data scientists like to work, so they can solve the right problems, create better analyses, amplify their results, and put more work into production -- all from one place.