The HPCC Systems Production Machine Learning bundles support the parallelized creation and training of machine learning models and provide a large set of evaluation metrics for testing a trained model's performance. To help users monitor models more closely, a new set of evaluation methods covering cluster analysis, feature selection, and other commonly used tests has been proposed, implemented, and tested. The implementations are written entirely in Enterprise Control Language (ECL) and support features provided by the ML bundles, such as the Myriad Interface. The six evaluation metrics chosen are: Area Under the ROC Curve, F-Score, Hamming Loss, Adjusted Rand Index, Silhouette Coefficient, and the Chi-squared feature test. All six metrics were successfully implemented and tested. They can help users monitor their machine learning models more effectively and tune their hyperparameters.
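As an illustration of what two of the listed metrics compute, the following is a minimal single-machine sketch in Python. It is not the project's ECL implementation and does not use the ML bundles' API; the function names `f_score` and `hamming_loss` and their parameters are hypothetical and chosen only for this example.

```python
def f_score(y_true, y_pred, beta=1.0, positive=1):
    """F-beta score for binary labels (hypothetical helper, not the ECL API)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def hamming_loss(y_true, y_pred):
    """Fraction of label positions that differ across multi-label rows."""
    total = 0
    wrong = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            total += 1
            wrong += int(t != p)
    return wrong / total


if __name__ == "__main__":
    # Binary classification example: 2 true positives, 1 false positive, 1 false negative.
    print(f_score([1, 0, 1, 1, 0], [1, 1, 0, 1, 0]))
    # Multi-label example: 2 of 6 label positions are wrong.
    print(hamming_loss([[1, 0, 1], [0, 1, 0]], [[1, 1, 1], [0, 1, 1]]))
```

With `beta=1.0` the first function reduces to the familiar F1 score, the harmonic mean of precision and recall. A lower Hamming loss is better, with 0 meaning every label was predicted correctly.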
|Effective start/end date||06/1/19 → 09/1/19|