Artificial intelligence is likely to shape the coming decades more than any other technology of this era. Those who do not keep up risk falling behind in a world where everyday technology looks increasingly like science fiction.
Machine learning (ML) is the use of algorithms and computational statistics to learn from data without being explicitly programmed. It is a subfield of artificial intelligence within computer science. Although the field did not take off until recently, the term was coined in 1959 and much of the foundational research was done throughout the ’70s and ’80s. Its rise to prominence today has been enabled by the abundance of data, more efficient data storage, and faster computers.
The following services are provided by DLTK as part of Release 1.0.0.
All URIs below are relative to https://prod-kong.dltk.ai
Train a Classification Model | POST /machine/classification/train |
Train a Regression Model | POST /machine/regression/train |
Predicting from Classification Model | POST /machine/classification/predict |
Predicting from Regression Model | POST /machine/regression/predict |
Feedback from Regression Model | POST /machine/regression/feedback |
Feedback from Classification Model | POST /machine/classification/feedback |
Cluster Model | POST /machine/cluster |
Get Job Status | GET /machine/job/status?id={id} |
Get Job Output | GET /machine/output/findBy?jobId={id} |
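The table above can be turned into a small lookup helper. The following is a minimal sketch, not part of the DLTK SDK: the dictionary keys and the function name `endpoint_url` are illustrative names chosen here, while the methods, paths and base URL are copied from the table.

```python
from urllib.parse import urlencode, urljoin

BASE_URL = "https://prod-kong.dltk.ai"

# (method, path) pairs copied from the table above.
# The dictionary keys are illustrative names chosen for this sketch.
ENDPOINTS = {
    "classification_train": ("POST", "/machine/classification/train"),
    "regression_train": ("POST", "/machine/regression/train"),
    "classification_predict": ("POST", "/machine/classification/predict"),
    "regression_predict": ("POST", "/machine/regression/predict"),
    "regression_feedback": ("POST", "/machine/regression/feedback"),
    "classification_feedback": ("POST", "/machine/classification/feedback"),
    "cluster": ("POST", "/machine/cluster"),
    "job_status": ("GET", "/machine/job/status"),
    "job_output": ("GET", "/machine/output/findBy"),
}


def endpoint_url(name, **query):
    """Return (HTTP method, absolute URL) for a named endpoint."""
    method, path = ENDPOINTS[name]
    url = urljoin(BASE_URL, path)
    if query:
        # Query parameters such as id or jobId are appended to GET URLs.
        url += "?" + urlencode(query)
    return method, url


print(endpoint_url("job_status", id=969))
```

For example, `endpoint_url("job_output", jobId=20)` builds the URL for fetching the output of job 20.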
This API trains a classification model. Training takes some time, so the job status has to be checked until the job completes. Once the job is completed, the job output API returns the model info.
URI: POST
/machine/classification/train
api-key | Your App’s API Key |
library | dltk_ai / weka |
service | Classification |
task | Train |
config.name | Name of the model |
config.algorithm | Name of the algorithm. See the list of algorithms below. |
config.datasetUrl | Path of the train data after uploading to cloud storage. See here for more Info. |
config.label | Column name to be predicted. |
config.features | List of column names for training. |
config.trainPercentage | Percentage of data used for training, rest gets used for evaluating the model. |
config.saveModel | True / False |
config.params | Any configurations required for libraries. |
{ "library": "weka", "service": "classification", "task": "train", "config": { "name": "Player Churn Model", "algorithm": "NaiveBayesMultinomial", "datasetUrl": "/dltk-ai/library/player_train.csv", "label": "player_activity", "trainPercentage": 80, "features": ["stamina","challenges","achievements"], "saveModel": "true", "params": {} } }
Response:
{ "code": 200, "data": { "jobId": 969, "appId": 1558586024244, "name": "weka_classification_train", "library": "weka", "service": "Classification", "task": "TRAIN", "state": "RUN", "startTime": "2019-06-21T04:30:54.283+0000", "endTime": null, "request": { "library": "weka", "config": { "name": "Player Churn Model", "algorithm": "NaiveBayesMultinomial", "datasetUrl": "/dltk-ai/library/player_train.csv", "label": "player_activity", "trainPercentage": 80, "saveModel": "true", "params": {}, "features": ["stamina","challenges","achievements"] } }, "isStreamJob": false, "isJobStopped": null } }
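A training request like the one above can be prepared with the Python standard library. This is a minimal sketch under stated assumptions: the helper name `build_train_request` is hypothetical, the `Content-Type` header is an assumption (the documentation only lists `api-key`), and the algorithm name is taken from the supported list in the release notes. The request is built but not sent.

```python
import json
import urllib.request

BASE_URL = "https://prod-kong.dltk.ai"


def build_train_request(api_key, config, library="weka"):
    """Prepare (but do not send) a classification training request."""
    body = {
        "library": library,
        "service": "classification",
        "task": "train",
        "config": config,
    }
    return urllib.request.Request(
        BASE_URL + "/machine/classification/train",
        data=json.dumps(body).encode("utf-8"),
        # "api-key" comes from the documentation; Content-Type is an assumption.
        headers={"api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


config = {
    "name": "Player Churn Model",
    "algorithm": "NaiveBayesMultinomial",
    "datasetUrl": "/dltk-ai/library/player_train.csv",
    "label": "player_activity",
    "trainPercentage": 80,
    "features": ["stamina", "challenges", "achievements"],
    "saveModel": "true",
    "params": {},
}
request = build_train_request("YOUR_API_KEY", config)
print(request.get_method(), request.full_url)
```

Sending is left to the caller, for example with `urllib.request.urlopen(request)`. The `jobId` in the response is then used with the job status and job output APIs.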
This API gets predictions from a trained classification model. Once the job is completed, the job output API returns the file info from which you can get the predictions.
URI: POST
/machine/classification/predict
Headers
api-key | Your App’s API Key |
{ "library": "weka", "service": "classification", "config": { "datasetUrl": "/dltk-ai/library/player_test.csv", "modelUrl":"/dltk-ai/1/1550423221357-NaiveBayesMultinomial.mdl", "params":{ } } }
Response:
{ "code": 200, "data": { "jobId": 970, "appId": 1560322200284, "name": "weka_classification_predict", "library": "weka", "service": "Classification", "task": "PREDICT", "state": "RUN", "startTime": "2019-06-21T04:33:30.418+0000", "endTime": null, "request": { "library": "weka", "config": { "modelUrl": "/dltk-ai/1/1550423221357-NaiveBayesMultinomial.mdl", "datasetUrl": "/dltk-ai/library/player_test.csv", "features": ["stamina","challenges","achievements"] } }, "isStreamJob": false, "isJobStopped": null } }
This API retrains an already built classification model with a new dataset that has the same features and uses the same algorithm. Once the job is completed, the job output API returns the model info.
URI: POST
/machine/classification/feedback
Headers
api-key | Your App’s API Key |
{ "library":"weka", "service":"Classification", "task":"FEEDBACK", "config":{ "name":"Player Churn Model", "algorithm":"NaiveBayesMultinomial", "datasetUrl": "/dltk-ai/library/player_feedback.csv", "modelUrl": "/dltk-ai/library/1550423221357-NaiveBayesMultinomial.mdl", "feedbackDatasetUrl":"/dltk-ai/library/player_feedback.csv", "features":["stamina","challenges","achievements"], "trainPercentage": 80, "label": "player_activity", "saveModel":true, "params":{} } }
Response:
{ "code": 0, "data": { "jobId": 971, "appId": 1560322200284, "name": "weka_classification_feedback", "library": "weka", "service": "Classification", "task": "FEEDBACK", "state": "RUN", "startTime": "2019-06-21T04:37:47.999+0000", "endTime": null, "request": { "library": "weka", "config": { "name": "Player Churn Model", "label": "player_activity", "params": {}, "features": ["stamina","challenges","achievements"], "algorithm": "NaiveBayesMultinomial", "saveModel": "true", "datasetUrl": "/dltk-ai/library/player_feedback.csv", "trainPercentage": 80, "feedbackDatasetUrl": "/dltk-ai/library/player_feedback.csv", "modelUrl": "/dltk-ai/library/1550423221357-NaiveBayesMultinomial.mdl" } }, "isStreamJob": false, "isJobStopped": null } }
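Because a feedback job reuses the features and algorithm of the original model, one way to avoid mismatches is to derive the feedback body from the training config. The sketch below assumes this approach; `build_feedback_body` is a hypothetical helper, not an SDK function, and the field names are the ones shown in the sample request above.

```python
def build_feedback_body(train_config, model_url, feedback_dataset_url, library="weka"):
    """Build a classification feedback body from the original training config."""
    # Copy the training config so name, algorithm, label and features carry over
    # unchanged and the caller's dictionary is not modified.
    config = dict(train_config)
    config["modelUrl"] = model_url
    config["feedbackDatasetUrl"] = feedback_dataset_url
    return {
        "library": library,
        "service": "Classification",
        "task": "FEEDBACK",
        "config": config,
    }


train_config = {
    "name": "Player Churn Model",
    "algorithm": "NaiveBayesMultinomial",
    "datasetUrl": "/dltk-ai/library/player_feedback.csv",
    "label": "player_activity",
    "trainPercentage": 80,
    "features": ["stamina", "challenges", "achievements"],
    "saveModel": True,
    "params": {},
}
body = build_feedback_body(
    train_config,
    model_url="/dltk-ai/library/1550423221357-NaiveBayesMultinomial.mdl",
    feedback_dataset_url="/dltk-ai/library/player_feedback.csv",
)
print(body["config"]["features"])
```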
Description
This API trains a regression model. Training takes some time, so the job status has to be checked until the job completes. Once the job is completed, the job output API returns the model info.
URI: POST
/machine/regression/train
Headers
api-key | Your App’s API Key |
Attributes
library | dltk_ai / weka |
service | regression |
task | train |
config.name | Name of the model |
config.algorithm | Name of the algorithm. See the list of algorithms below. |
config.datasetUrl | Path of the train data after uploading to cloud storage. See here for more Info. |
config.label | Column name to be predicted. |
config.features | List of column names for training. |
config.trainPercentage | Percentage of data used for training, rest gets used for evaluating the model. |
config.saveModel | True / False |
config.params | Any configurations required for libraries. |
{ "library": "weka", "service": "regression", "task": "train", "config": { "name": "Housing Price Model", "algorithm": "LinearRegression", "datasetUrl": "/dltk-ai/library/hp_train.csv", "label": "price", "trainPercentage": 80, "features": ["area","parking_area"], "saveModel": "true", "params": {} } }
Response:
{ "code": 200, "data": { "jobId": 971, "appId": 1558586024244, "name": "weka_regression_train", "library": "weka", "service": "Regression", "task": "TRAIN", "state": "RUN", "startTime": "2019-06-21T04:30:54.283+0000", "endTime": null, "request": { "library": "weka", "config": { "name": "Housing Price Model", "algorithm": "LinearRegression", "datasetUrl": "/dltk-ai/library/hp_train.csv", "label": "price", "trainPercentage": 80, "saveModel": "true", "params": {}, "features": ["area","parking_area"] } }, "isStreamJob": false, "isJobStopped": null } }
Description
This API gets predictions from a trained regression model. Once the job is completed, the job output API returns the file info containing the predictions.
URI: POST
/machine/regression/predict
Headers
api-key | Your App’s API Key |
{ "library": "weka", "service": "regression", "config": { "datasetUrl": "/dltk-ai/library/hp_test.csv", "modelUrl":"/dltk-ai/1/1550423221357-LinearRegression.mdl", "params":{ } } }
Response:
{ "code": 0, "data": { "jobId": 972, "appId": 1560322200284, "name": "weka_regression_predict", "library": "weka", "service": "Regression", "task": "PREDICT", "state": "RUN", "startTime": "2019-06-21T04:33:30.418+0000", "endTime": null, "request": { "library": "weka", "config": { "modelUrl": "/dltk-ai/1/1550423221357-LinearRegression.mdl", "datasetUrl": "/dltk-ai/library/hp_test.csv", "features": ["area","parking_area"] } }, "isStreamJob": false, "isJobStopped": null } }
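Description
This API retrains an already built regression model with a new dataset that has the same features and uses the same algorithm. Once the job is completed, the job output API returns the model info.
URI: POST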
/machine/regression/feedback
Headers
api-key | Your App’s API Key |
{ "library":"weka", "service":"regression", "task":"FEEDBACK", "config":{ "name":"Housing Price Model", "algorithm":"LinearRegression", "datasetUrl": "/dltk-ai/library/hp_train.csv", "modelUrl": "/dltk-ai/1/1550423221357-LinearRegression.mdl", "feedbackDatasetUrl":"/dltk-ai/library/hp_feedback.csv", "features":["area","parking_area"], "trainPercentage": 80, "label": "price", "saveModel":true, "params":{} } }
Response:
{ "code": 0, "data": { "jobId": 974, "appId": 1560322200284, "name": "weka_regression_feedback", "library": "weka", "service": "Regression", "task": "FEEDBACK", "state": "RUN", "startTime": "2019-06-21T04:37:47.999+0000", "endTime": null, "request": { "library": "weka", "config": { "name": "Housing Price Model", "label": "price", "params": {}, "features": ["area","parking_area"], "algorithm": "LinearRegression", "saveModel": "true", "datasetUrl": "/dltk-ai/library/hp_train.csv", "trainPercentage": 80, "feedbackDatasetUrl": "/dltk-ai/library/hp_feedback.csv", "modelUrl": "/dltk-ai/1/1550423221357-LinearRegression.mdl" } }, "isStreamJob": false, "isJobStopped": null } }
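Description
This API groups the rows of a dataset into clusters based on the selected features. The job runs asynchronously, so the job status has to be checked before the job output is fetched.
URI: POST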
/machine/cluster
Headers
api-key | Your App’s API Key |
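config.name | Name of the model |
config.algorithm | Name of the clustering algorithm, for example KMeansClustering. |
config.datasetUrl | Path of the dataset after uploading to cloud storage. |
config.numOfClusters | Number of clusters to create. |
config.features | List of column names used for clustering. |
config.saveModel | True / False |
config.params | Any configurations required for libraries. |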
{ "library":"weka", "service":"Clustering", "task":"CLUSTER", "config":{ "name":"Clustering", "algorithm":"KMeansClustering", "datasetUrl":"/dltk-ai/library/moon_data.csv", "numOfClusters": 2, "saveModel": "True", "params":{}, "features":["X","Y"] } }
Response:
{ "code": 200, "data": { "jobId": 968, "appId": 1558586024244, "name": "weka_clustering_cluster", "library": "weka", "service": "Clustering", "task": "CLUSTER", "state": "RUN", "startTime": "2019-06-21T04:28:12.116+0000", "endTime": null, "request": { "library": "weka", "config": { "name": "Clustering", "algorithm": "KMeansClustering", "datasetUrl": "/dltk-ai/library/moon_data.csv", "numOfClusters": 2, "saveModel": "True", "params": {}, "features": [ "X", "Y" ] } }, "isStreamJob": false, "isJobStopped": null } }
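Description
This API returns the current status of a job. The state field reads RUN while the job is in progress and FINISH once it has completed.
URI: GET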
/machine/job/status?id={id}
Headers
api-key | Your App’s API Key |
None | None |
{ "id": 21, "name": "Player Churn Model", "library": "weka", "service": "Classification", "task": "PREDICT", "state": "FINISH", "startTime": "2019-02-17T18:25:19.587+0000", "endTime": "2019-02-17T18:25:24.583+0000", "msg": null, "request": { "library": "dltk_ai", "config": { "params": {}, "modelUrl": "/dltk-ai/1/1550427728251-NaiveBayesMultinomial_5044073238607802124mdl", "datasetUrl": "/dltk-ai/library/rg_test.csv" } } }
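Since jobs run asynchronously, a caller usually polls the job status until the state becomes FINISH. The following is a minimal sketch of such a loop. The function name `wait_for_job` and its parameters are illustrative; `fetch_status` stands for any callable that returns the status JSON shown above as a dictionary. Only the RUN and FINISH states appear in this documentation, so failure states are not handled here and a timeout is used instead.

```python
import time


def wait_for_job(fetch_status, job_id, poll_seconds=5.0, timeout_seconds=600.0,
                 sleep=time.sleep):
    """Poll fetch_status(job_id) until the job state is FINISH.

    Raises TimeoutError if the job has not finished within timeout_seconds.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status(job_id)
        if status.get("state") == "FINISH":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish in {timeout_seconds} s")
        sleep(poll_seconds)


# Usage with a stand-in for the real status call: two RUN replies, then FINISH.
_replies = iter(["RUN", "RUN", "FINISH"])
final = wait_for_job(lambda job_id: {"id": job_id, "state": next(_replies)},
                     21, poll_seconds=0)
print(final)
```

In real use, `fetch_status` would wrap a GET call to /machine/job/status?id={id} with the api-key header.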
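Description
This API returns the output of a completed job: the model info and evaluation metrics for a training job, or the prediction file info for a prediction job.
URI: GET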
/machine/output/findBy?jobId={id}
Headers
api-key | Your App’s API Key |
None | None |
{ "id": 9, "jobId": 20, "state": null, "output": { "eval": { "kappa": -0.05913503971756384, "recall": { "Active": 0.5723684210526315, "Churned": 0.3541666666666667 }, "correct": 104, "accuracy": 52, "revision": "14755", "rocCurve": { "values": [ [ 1, 1 ], [ 0.8958, 0.7368 ], ] }, "errorRate": 0.48, "inCorrect": 96, "precision": { "Active": 0.7372881355932204, "Churned": 0.2073170731707317 }, "areaUnderPRC": { "Active": 0.7848942279681246, "Churned": 0.237283172269615 }, "areaUnderROC": { "Active": 0.49506578947368424, "Churned": 0.518297697368421 }, "priorEntropy": 0.7986194718732207, "confusionMatrix": [ [ 17, 31 ], [ 65, 87 ] ], "numTrueNegatives": { "Active": 17, "Churned": 87 }, "numTruePositives": { "Active": 87, "Churned": 17 }, "trueNegativeRate": { "Active": 0.3541666666666667, "Churned": 0.5723684210526315 }, "truePositiveRate": { "Active": 0.5723684210526315, "Churned": 0.3541666666666667 }, "falseNegativeRate": { "Active": 0.4276315789473684, "Churned": 0.6458333333333334 }, "falsePositiveRate": { "Active": 0.6458333333333334, "Churned": 0.4276315789473684 }, "numFalseNegatives": { "Active": 65, "Churned": 31 }, "numFalsePositives": { "Active": 31, "Churned": 65 }, "pearsonCorrelation": { "challenges": 0.24937135217246517, "achievements": 0.18263960513415353, "stamina": 0.2493238592388467 }, "confusionMatrixHeaders": [ "Churned", "Active" ], "correlationCoefficient": 0, "mathewsCorrelationCoefficient": { "Active": -0.06379320872133686, "Churned": -0.06379320872133686 } }, "modelUrl": "/dltk-ai/1/1550427728251-NaiveBayesMultinomial_5044073238607802124mdl" } }
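Several of the evaluation fields in this output can be recomputed from `confusionMatrix` and `confusionMatrixHeaders`. The sketch below shows the arithmetic. It assumes that rows are actual classes and columns are predicted classes, which is inferred from the sample numbers rather than stated in the documentation; `per_class_metrics` is an illustrative helper name.

```python
def per_class_metrics(matrix, headers):
    """Compute accuracy, recall and precision from a confusion matrix.

    Assumes matrix[i][j] counts rows of actual class i predicted as class j.
    """
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(headers)))
    metrics = {"accuracy": correct / total, "recall": {}, "precision": {}}
    for i, label in enumerate(headers):
        actual = sum(matrix[i])                    # all rows truly in this class
        predicted = sum(row[i] for row in matrix)  # all rows predicted as this class
        metrics["recall"][label] = matrix[i][i] / actual if actual else 0.0
        metrics["precision"][label] = matrix[i][i] / predicted if predicted else 0.0
    return metrics


# Values taken from the sample job output above.
matrix = [[17, 31], [65, 87]]
headers = ["Churned", "Active"]
metrics = per_class_metrics(matrix, headers)
print(metrics)
```

With the sample values, 104 of the 200 evaluated rows are correct, which corresponds to the reported accuracy of 52 (percent) and error rate of 0.48.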
pip install dltk_ai
Create a DLTK client to perform the different tasks.
import dltk_ai
client = dltk_ai.DltkAiClient('Your API Key')
To use these services, register on the cloud.dltk.ai website and create a project. Copy your API key to use the different APIs.
Upload the dataset to DLTK’s cloud storage:
# Assumption: Dataset is importable from dltk_ai.dataset_types
from dltk_ai.dataset_types import Dataset

# Training dataset
train_file_store_response = client.store("Sample_Train.csv", Dataset.TRAIN_DATA)
train_data = train_file_store_response["fileUrl"]

# Testing dataset
test_file_store_response = client.store("Sample_Test.csv", Dataset.TEST_DATA)
test_data = test_file_store_response["fileUrl"]
To train a model, one needs to pass specific parameters. Parameters for training a model are:
Type: ‘Classification’ or ‘Regression’.
Algorithm: Algorithm with which the model will be trained, such as ‘LinearRegression’, ‘RandomForest’, etc.
Dataset: Dataset file location in dltk storage.
Label: Label or Target variable in the dataset file.
Features: Column name list which is to be used for model training.
Model name: The model name you want to give.
Library: Library for training the model. Currently DLTK supports weka, scikit-learn and H2O.
Train split percentage: Percentage of data to be used for training and the model will be tested against the remaining % of data.
train = client.train(
    "regression",
    "LinearRegression",
    train_data,
    "SalePrice",
    ["YearBuilt", "YearRemodAdd", "TotalBsmtSF", "AboveGrLiveAr",
     "TotalBathroom", "TotalRooms", "ParkingSpace"],
    model_name="Housing_Price_Model",
    lib="weka",
    train_percentage=80,
    save_model=True,
)
train_job_status_response = client.job_status(train["data"]["jobId"])
Once the job state changes to ‘FINISH’, get the model evaluation metrics:
train_job_output_response = client.job_output(train["data"]["jobId"])
model = train_job_output_response["output"]["modelUrl"]
predict_response = client.predict("regression", test_data, model)
A prediction job will be created. Once the job state is ‘FINISH’, get the predictions for the test dataset:
predict_job_status_response = client.job_status(predict_response["data"]["jobId"])
predict_job_output_response = client.job_output(predict_response["data"]["jobId"])
pred_file = predict_job_output_response['output']['predFileUrl']
The following are the release notes for Release 1.0.0.
Algorithms supported under classification are Logistic, MultilayerPerceptron, NaiveBayesMultinomial, RandomForest, LibSVM, AdaBoostM1, AttributeSelectedClassifier, Bagging, CostSensitiveClassifier, DecisionTable, GaussianProcesses, IBk, RandomTree and SMO.
Algorithms supported under Regression are LinearRegression, AdditiveRegression.
Feedback dataset should contain the same features as the original dataset.
Max train file upload size is 100 MB.
Max test file upload size is 50 MB.
The number of features selected for training cannot exceed 20.
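The limits in these release notes can be checked on the client side before a job is submitted. The following is a minimal sketch, not part of the DLTK SDK: `check_training_request` is a hypothetical helper, and treating 1 MB as 1024 * 1024 bytes is an assumption, since the notes do not define the unit.

```python
# Algorithm names copied from the release notes above.
SUPPORTED_ALGORITHMS = {
    "classification": {
        "Logistic", "MultilayerPerceptron", "NaiveBayesMultinomial",
        "RandomForest", "LibSVM", "AdaBoostM1", "AttributeSelectedClassifier",
        "Bagging", "CostSensitiveClassifier", "DecisionTable",
        "GaussianProcesses", "IBk", "RandomTree", "SMO",
    },
    "regression": {"LinearRegression", "AdditiveRegression"},
}
MAX_FEATURES = 20
MAX_TRAIN_FILE_BYTES = 100 * 1024 * 1024  # assumes 1 MB = 1024 * 1024 bytes


def check_training_request(service, algorithm, features, train_file_bytes):
    """Return a list of problems; an empty list means no limit is violated."""
    problems = []
    supported = SUPPORTED_ALGORITHMS.get(service.lower())
    if supported is None:
        problems.append(f"unknown service: {service}")
    elif algorithm not in supported:
        problems.append(f"{algorithm} is not listed for {service}")
    if len(features) > MAX_FEATURES:
        problems.append(f"{len(features)} features selected, limit is {MAX_FEATURES}")
    if train_file_bytes > MAX_TRAIN_FILE_BYTES:
        problems.append("train file is larger than 100 MB")
    return problems


print(check_training_request("regression", "LinearRegression", ["area", "parking_area"], 2048))
```

A similar check with a 50 MB limit could be applied to test files before upload.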