MAADSBML Jupyter Notebook
The MAADSBML Jupyter notebook can be downloaded from GitHub: maadsbmlcode.ipynb
Process Your Data File
To process your own data file, simply drop it in your local csvuploads folder.
Important
You MUST have the MAADSBML docker container running
The file must be CSV (comma separated values). Example CSVs here:
You MUST store your CSV file in your LOCAL csvuploads folder
The first column MUST be Date in the format: M/D/YYYY. The Date column is used for seasonality analysis. If you do not want seasonality analysis, this column is ignored by MAADSBML - but you STILL NEED THE DATE COLUMN.
Your CSV must contain column headings
The Dependent variable MUST be contained in this file
ALL DATA IN YOUR CSV MUST BE NUMERIC (with exception of column headers)
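The requirements above can be checked before a file is dropped into csvuploads. The following is a minimal sketch only: check_csv is a hypothetical helper, not part of maadsbml, and it assumes the dependent variable name must match a column header exactly (the notebook does not say whether matching is case sensitive).

```python
import csv
from datetime import datetime


def check_csv(path, dependentvariable):
    """Return a list of problems found; an empty list means the file passed."""
    problems = []
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
    if len(rows) < 2:
        return ["file needs a header row and at least one data row"]

    header, data = rows[0], rows[1:]
    if not header or header[0].strip().lower() != "date":
        problems.append("first column must be Date")
    if dependentvariable not in header:  # exact match is an assumption
        problems.append("dependent variable %r is not a column" % dependentvariable)

    for lineno, row in enumerate(data, start=2):
        if len(row) != len(header):
            problems.append("line %d: expected %d fields" % (lineno, len(header)))
            continue
        try:
            # M/D/YYYY, e.g. 5/21/2013 (leading zeros are optional)
            datetime.strptime(row[0].strip(), "%m/%d/%Y")
        except ValueError:
            problems.append("line %d: Date is not M/D/YYYY" % lineno)
        for name, value in zip(header[1:], row[1:]):
            try:
                float(value)
            except ValueError:
                problems.append("line %d: %s is not numeric" % (lineno, name))
    return problems
```

For example, check_csv("csvuploads/stockdata.csv", "close") would return an empty list if the file meets every rule listed above.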
MAADSBML Jupyter Notebook Explained
# IMPORT MAIN LIBRARIES
import maadsbml
import json
import os
import time
# Needed when running inside a Jupyter notebook; comment out these two lines otherwise
import nest_asyncio
nest_asyncio.apply()
# SET THE HOST AND PORT WHERE MAADSBML IS LISTENING FOR A CLIENT CONNECTION
host='http://127.0.0.1'
port=5595
# Change these two folders to the local paths that you used for the volume mappings in Docker.
# The paths below are Windows examples; on Linux/Mac use your own local paths.
localstagingfolder = "c:\\maads\\maadsbml\\staging" # change this folder to your local mapped staging folder
localexceptionfolder = "c:\\maads\\maadsbml\\exception" # change this folder to your local mapped exception folder
# This function is a system function to capture broken pipe network issues - DO NOT MODIFY
def readifbrokenpipe(jres,hasseasonality):
    # this function is called if there is a broken pipe network issue
    pkey=""
    algofile=""
    jsonalgostr = ""
    pkey= jres.get('AlgoKey')
    maadsbmlfile="%s/%s.txt.working" % (localstagingfolder,pkey)
    if hasseasonality == 1:
        algojsonfile="%s/%s_trained_algo_seasons.json" % (localexceptionfolder,pkey)
    else:
        algojsonfile="%s/%s_trained_algo_no_seasons.json" % (localexceptionfolder,pkey)
    i=0
    while True:
        time.sleep(5)
        i = i + 1
        if os.path.isfile(maadsbmlfile):
            continue
        elif os.path.isfile(algojsonfile):
            # Read the json
            with open(algojsonfile) as f:
                jsonalgostr = f.read()
            break # maadsbml finished
        #elif i > 400:
        #    print("ERROR: Could not find the JSON file - CHECK IF YOUR FILE PATHS ARE CORRECT!")
        #    break
    return jsonalgostr
# This is the MAIN ML Training function
# You must enter host, port, filename,dependentvariable,removeoutliers,hasseasonality,deepanalysis,company
# Deepanalysis will perform advanced algorithms but will take potentially hours to complete based on the
# size of your data
# You can also change the summer, shoulder and winter months
def hypertraining(host,port,filename,dependentvariable,removeoutliers,hasseasonality,deepanalysis,company):
    #host,port,
    #filename= raw data file in csv format - Note this file is stored on your host machine; the DOCKER container needs to be mapped to this volume using -v
    #dependentvariable= dependent variable name - this is the column name in the csv file
    # the file should have a Date column in the format Month/Day/Year
    #username= you can specify a username
    # mode=0
    #timeout=180 - you can modify this in seconds if your data file is large
    #company= change this to the name of your company
    #removeoutliers= specify 1 or 0, 1=remove outliers, 0=do not remove outliers
    #hasseasonality= specify 1 or 0 to indicate whether the data is affected by seasonality - 1 = seasonality, 0 = no seasonality
    #summer= specify the summer months i.e. '6,7,8', or set to -1 for no summer
    #winter= specify winter months i.e. '11,12,1,2', or -1 for no winter
    #shoulder= specify shoulder months i.e. '3,4,5,9,10', or -1 for no shoulder season
    #trainingpercentage= specify training percentage i.e. 70, the value represents a percentage to split training and test
    #shuffle= specify 1 or 0 to shuffle the data, 1= shuffle, 0 = no shuffle
    #deepanalysis= specify 1 or 0, 1=deepanalysis, note this will run through deeper algorithms but will take longer, 0 = no deep analysis
    #password='123', - leave as is
    #email='support@otics.ca', - leave as is
    #usereverseproxy=0, - leave as is
    #microserviceid='', leave as is
    #maadstoken='123' leave as is
    summer='6,7,8' # specify -1 if you don't want to analyse summer
    winter='11,12,1,2' # specify -1 if you don't want to analyse winter
    shoulder='3,4,5,9,10' # specify -1 if you don't want to analyse shoulder
    trainingpercentage=75
    shuffle=1
    res=maadsbml.hypertraining(host, port, filename, dependentvariable,removeoutliers,hasseasonality, summer,winter,shoulder,trainingpercentage, shuffle,
                               deepanalysis, 'admin', 1200,company)
    jres = json.loads(res)
    if jres.get('BrokenPipe') is not None: # check if the hypertraining function experienced a brokenpipe - if so wait
        try:
            res=readifbrokenpipe(jres,hasseasonality)
        except Exception as e:
            print(e)
    print(res)
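The readifbrokenpipe helper above waits indefinitely, because its timeout branch is commented out. As a minimal sketch only, a bounded version of the same polling idea might look like the following. wait_for_result and its parameters are hypothetical names, not part of maadsbml; the 5-second interval and the 400-poll limit are taken from the notebook code, and returning None on timeout is an assumption.

```python
import os
import time


def wait_for_result(workingfile, resultfile, poll_seconds=5, max_polls=400):
    """Poll until resultfile exists and workingfile is gone.

    Returns the text of resultfile, or None if max_polls is reached first.
    """
    for _ in range(max_polls):
        # Training is treated as finished once the .working marker has
        # disappeared and the trained-algorithm JSON is on disk.
        if os.path.isfile(resultfile) and not os.path.isfile(workingfile):
            with open(resultfile) as f:
                return f.read()
        time.sleep(poll_seconds)
    return None
```

A None result would then be a signal to check that the local staging and exception folder paths match the Docker volume mappings.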
Call the hypertraining function to train on your data.
filename='stockdata.csv'
dependentvariable='close'
removeoutliers=0
hasseasonality=0
deepanalysis=0
company='Your company'
hypertraining(host,port,filename,dependentvariable,removeoutliers,hasseasonality,deepanalysis,company)
Here is the output from the hypertraining function.
{
  "AlgoKey": "admin_stockdata_csv",
  "AlgoDetails": "RidgeRegression,0.996,allseason;LassoCV,0.995864,allseason;VotingRegressor,0.995778,allseason;LinearSVR,0.995763,allseason;HuberRegressor,0.99503,allseason;simpleregression_reg,0.995,allseason;ARDRegression,0.994911,allseason;BayesianRidge,0.994905,allseason;Lars,0.994774,allseason;LarsCV,0.994774,allseason;",
  "PDF": "/maads/agentfilesdocker/dist/maadsweb/pdfreports/admin_stockdata_csv_no_seasons.pdf",
  "Hasseasonality": "No",
  "Deep Analysis": "No",
  "Shuffled": "Yes",
  "Outliers Removed": "No",
  "Generated On": "2024-04-25 00:28:37",
  "Timezone": "UTC",
  "Username": "admin",
  "Dependentvariable": "close",
  "Filename": "/maads/agentfilesdocker/dist/maadsweb/csvuploads/stockdata.csv",
  "Host": "127.0.0.1",
  "Port": 5595,
  "AlgoJson": "/maads/agentfilesdocker/dist/maadsweb/exception/admin_stockdata_csv_trained_algo_no_seasons.json",
  "MainSortedAlgosInfoWeighted": "The numbers in the main sorted algorithms represent the average of the MAPE, R-Square, Explained Variance and Model Accuracy",
  "BESTALGO-ALLSEASON": "RidgeRegression",
  "MainSortedAlgos-Weighted-Allseason": "VotingRegressor,0.998;LinearSVR,0.998;HuberRegressor,0.998;RidgeRegression,0.997;LassoCV,0.997;simpleregression_reg,0.996;ARDRegression,0.996;BayesianRidge,0.996;RANSACRegressor,0.996;LassoLarsIC,0.996",
  "BESTALGOWEIGHTED-ALLSEASON": "VotingRegressor"
}
Once hypertraining has finished, its output contains the AlgoKey (also called pkey); use this AlgoKey as the input to hyperprediction. The output fields are:
JSON Field | JSON Value
AlgoKey | Key for your optimal algorithm. This is the main key.
AlgoDetails | Details about the algorithms. For example, RidgeRegression,0.996,allseason means RidgeRegression has a MAPE (Mean Absolute Percentage Error) of 0.996 with allseason (seasonality ignored).
PDF | Path where the PDF report is saved.
Hasseasonality | Yes for seasonality, No for no seasonality.
Deep Analysis | Yes for deep analysis, No for no deep analysis.
Shuffled | Yes for shuffled, No for no shuffling. Shuffling shuffles the training dataset.
Outliers Removed | Yes for outliers removed, No for no outliers removed.
Generated On | UTC time when training completed.
Timezone | UTC timezone.
Username | Username.
Dependentvariable | Dependent variable in the ML model.
Filename | Training data filename used.
Host | Host IP for MAADSBML.
Port | Port for MAADSBML.
AlgoJson | Path for the algorithm JSON.
MainSortedAlgosInfoWeighted | Description of how the weighted scores are calculated.
BESTALGO-ALLSEASON | Best algorithm.
MainSortedAlgos-Weighted-Allseason | All the main algorithms with their weighted scores.
BESTALGOWEIGHTED-ALLSEASON | Best weighted algorithm.
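The fields above can be read with the standard json module. The sketch below is illustrative only: parse_algo_details is a hypothetical helper, not part of maadsbml, and the res string is an abbreviated copy of the sample output shown earlier. It assumes every AlgoDetails entry has the form name,score,season.

```python
import json


def parse_algo_details(algodetails):
    """Split 'name,score,season;...' into a list of (name, score, season) tuples."""
    parsed = []
    for entry in algodetails.split(";"):
        entry = entry.strip()
        if not entry:  # the sample string ends with a trailing ';'
            continue
        name, score, season = [part.strip() for part in entry.split(",")]
        parsed.append((name, float(score), season))
    return parsed


# Abbreviated copy of the sample hypertraining output.
res = ('{"AlgoKey":"admin_stockdata_csv",'
       '"AlgoDetails":"RidgeRegression,0.996,allseason;LassoCV,0.995864,allseason;",'
       '"BESTALGO-ALLSEASON":"RidgeRegression",'
       '"BESTALGOWEIGHTED-ALLSEASON":"VotingRegressor"}')

jres = json.loads(res)
pkey = jres["AlgoKey"]  # this is the value to pass to hyperprediction
for name, score, season in parse_algo_details(jres["AlgoDetails"]):
    print(name, score, season)
print("best:", jres["BESTALGO-ALLSEASON"])
print("best weighted:", jres["BESTALGOWEIGHTED-ALLSEASON"])
```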
Pass the AlgoKey (or pkey) returned by hypertraining to hyperprediction:
def hyperprediction(pkey,host,port,inputdata,username):
    res=maadsbml.hyperpredictions(pkey,inputdata,host,port,username)
    print(res)
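The notebook does not spell out the format of inputdata. The sample prediction output further down lists "Fields":"Date,Open,High,Low,Volume" next to "InputData":"5/21/2013,52.650002,83.330002,2.120003,2674600", which suggests a comma-separated string of values in the same order as the independent variables. Under that assumption, a hypothetical helper (build_inputdata is not part of maadsbml) could assemble the string like this:

```python
def build_inputdata(fields, row):
    """Join the values of row, in the order given by fields, into one CSV string."""
    missing = [name for name in fields if name not in row]
    if missing:
        raise KeyError("missing values for: %s" % ", ".join(missing))
    return ",".join(str(row[name]) for name in fields)


# Field order taken from the "Fields" entry of the sample prediction output.
fields = ["Date", "Open", "High", "Low", "Volume"]
row = {"Date": "5/21/2013", "Open": 52.650002, "High": 83.330002,
       "Low": 2.120003, "Volume": 2674600}

inputdata = build_inputdata(fields, row)
print(inputdata)
```

Keeping the field order in one list avoids silently swapping columns when the values come from a dictionary or a data frame row.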
Important
Once you have executed hypertraining, use the AlgoKey (or pkey) from its output as the input to hyperpredictioncustom, and specify the algorithm and season you want to use for the prediction.
# This is the main function to perform predictions from the trained algo.
def hyperpredictioncustom(pkey,host,port,inputdata,username,algoname,season):
    res=maadsbml.hyperpredictions(pkey,inputdata,host,port,username,algoname,season)
    print(res)
Here is the output from the hyperprediction or hyperpredictioncustom functions:
{"hyperprediction":45.14,"AlgoKey":"admin_stockdata_csv","Season":"allseason","Algorithm":"RidgeRegression","Dependent
Variable":"close","Fields":"Date,Open,High,Low,Volume","Trained Model Accuracy":"0.996","Pickle Files":"/maads/agentfilesdocker/networks/Alberta-Electric-
System-Operator_AESO)_ADMIN_STOCKDATA_CSVALLSEASON_AG1_4_RidgeRegression_normal_1.00000000_946_.pkl, /maads/agentfilesdocker/networks/Alberta-Electric-System-
Operator_AESO)_ADMIN_STOCKDATA_CSVALLSEASON_AG1_4_RidgeRegression_normal_1.00000000_946_scalerx_.pkl, /maads/agentfilesdocker/networks/Alberta-Electric-System-
Operator_AESO)_ADMIN_STOCKDATA_CSVALLSEASON_AG1_4_RidgeRegression_normal_1.00000000_946_scalery_.pkl","CreatedOn":"2024-04-25,
00:29:27","InputData":"5/21/2013,52.650002,83.330002,2.120003,2674600","MicroService":"PREDICTIONSERVICE","Host":"127.0.0.1","Port":5495}
JSON Field | JSON Value
hyperprediction | The prediction.
AlgoKey | The AlgoKey.
Season | The season used for the prediction (for example, allseason).
Algorithm | This is the BEST algorithm determined by MAADSBML.
Dependent Variable | The dependent variable being predicted.
Fields | These are the independent variables.
Trained Model Accuracy | MAPE value for the trained algorithm.
Pickle Files | The Python pickle files for the algorithms and standardization.
CreatedOn | The date and time the prediction was generated.
InputData | The input data used for the prediction.
MicroService | The microservice that produced the prediction.
Host | The IP address of MAADSBML.
Port | The prediction port.
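As a rough illustration of how these fields fit together, the sketch below parses an abbreviated copy of the sample prediction output and pairs each name in Fields with the matching value in InputData. It assumes both strings are comma-separated and in the same order, which is what the sample suggests but is not stated by the notebook.

```python
import json

# Abbreviated copy of the sample hyperprediction output.
res = ('{"hyperprediction":45.14,"AlgoKey":"admin_stockdata_csv",'
       '"Season":"allseason","Algorithm":"RidgeRegression",'
       '"Dependent Variable":"close","Fields":"Date,Open,High,Low,Volume",'
       '"InputData":"5/21/2013,52.650002,83.330002,2.120003,2674600"}')

jres = json.loads(res)

# Pair each independent variable with the value that was sent for it.
inputs = dict(zip(jres["Fields"].split(","), jres["InputData"].split(",")))

print("predicted %s = %s" % (jres["Dependent Variable"], jres["hyperprediction"]))
print("algorithm: %s, season: %s" % (jres["Algorithm"], jres["Season"]))
for name, value in inputs.items():
    print("  %s = %s" % (name, value))
```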
Use the AlgoKey to find details on the algorithm.
def algoinfo(pk):
    res=maadsbml.algodescription(host,port,pk)
    print(res)