MAADSBML Python Library API ============================ `MAADSBML Python Library API `_ **Multi-Agent Accelerator for Data Science: Batch AutoML (MAADSBML)** *Revolutionizing Data Science with Artificial Intelligence* **Overview** MAADSBML combines Artificial Intelligence, Docker, Machine Learning. It automates the machine learning process, and finds the BEST algorithm for your data. It also produces a very detailed PDF report. MAADSBML is integrated with Docker Container. This library allows users to harness the power of agent-based computing using hundreds of advanced linear and non-linear algorithms. **Compatibility** - Python 3.8 or greater - Minimal Python skills needed **Copyright** - Author: Sebastian Maurice, PhD **Installation** - At the command prompt write: .. code-block:: console pip install maadsbml - This assumes you have [Downloaded Python](https://www.python.org/downloads/) and installed it on your computer. - You will also need to pull the MAADSBML Docker container: **maadsdocker/maads-batch-automl-otics** **Syntax** - There are literally two lines of code you need to write to train your data and make predictions: **Main functions:** - **hypertraining** Executes hundreds of agents, running hundreds of advanced algorithms and completes in minutes. A master agent then chooses the BEST algorithm that best models your data. - **hyperpredictions** After training, make high quality predictions - takes less than half a second (about ~100 milliseconds). Users can also generate predictions using **non-python code** such as JAVA. - **algodescription** Get detailed information on the optimal algorithm found during hypertraining - **abort** Abort the training process. - **rundemo** To run canned demo of the system to see how it works. - **finddistribution** Finds the best distribution for your continuous data **First import the Python library.** **import maadsbml** 1. **maadsbml.hypertraining(host, port, filename, dependentvariable, removeoutliers=0, hasseasonality=0, summer='6,7,8', winter='11,12,1,2', shoulder='3,4,5,9,10', trainingpercentage=70, shuffle=0, deepanalysis=0, username='admin', timeout=1200, company='otics', password='123', email='support@otics.ca',usereverseproxy=0, microserviceid='', maadstoken='123', mode=0)** **Parameters:** *host* : string, required - This is the IP address of the running Docker container - it is usually http://localhost *port* : int, required - This is the TRAINING PORT in the container. The default is port==5595 *filename* : string, required - This is the raw data file in csv format - Note this file is stored on your host machine - the DOCKER container needs to be mapped to this volume using -v *dependentvariable* : string, required - This is the dependent variable in your csv file. *removeoutliers* : int, optional, 1 or 0 - If 1, then outliers will be removed from your data. If 0, no outliers are removed. *hasseasonality* : int, optional, 1 or 0 - If 1, then your data will be modeled for seasonality: Winter, Summer, Shoulder. If 0, then your data will not be modeled for seasonality. If modeling for seasonality, ensure you have enough data points that covers the seasons, usually 1 year of data. *summer* : string, optional - Definition for summer months. This can be changed. *winter* : string, optional - Definition for winter months. This can be changed. *shoulder* : string, optional - Definition for shoulder months. This can be changed. *trainingpercentage* : int, optional, Default=70 - This is the split percentage between Training and Test data sets. It is defaulted to 70 (70% for training, 30% test). *shuffle* : number, 0 or 1, optional - Indicates whether to shuffle the training dataset or not, default=0. *deepanalysis* : int, optional - This will force MAADSBML to perform deeper analysis on your data. This could take 30-40 minutes. Set to 1 for deepanalysis, 0 for no deep analysis. *username* : string, optional - This identifies a user. You may want to change this if multiple users are running the same file. *company* : string, optional - This identifies your company. You may want to change this for the Report. *timeout* : int, optional - You can increase this if you receive a timeout error before the training is taking too long. The setting is in seconds. *password* : string, optional - leave as is *email* : string, optional - leave as is *usereverseproxy* : int, optional - leave as is *microserviceid* : string, optional - leave as is if not using a pass through service. *mode* : int, optional - leave as is *maadstoken* : string, optional - leave as is **Returns:** string JSON buffer, with the algorithm key (PKEY) and other details: - PKEY: : This is the key to the BEST algorithm and must be used when making predictions. **2. maadsbml.hyperpredictions(pkey,theinputdata,host,port,username,algoname='',seasonname='', usereverseproxy=0,microserviceid='', password='123',company='otics', email='support@otics.ca', maadstoken='123')** **Parameters:** *pkey* : string, required - This is the PKEY you received from the hypertraining function. *theinputdata* : string, required - These are the Xs for your model: For example if my model had 3 Xs then inputdata='5/21/2010,-14.3,-32.0,-12.0', with the first entry as Date: Date must be in the format: M/D/YYYY *host* : string, required - This is the IP address of the running Docker container - it is usually http://localhost *port* : int, required - This is the PREDICTION PORT in the container. The default is port==5495 (or 5595) *username* : string, required - The username you used in the hypertraining functions. Default is admin. *algoname* : string, optional - Enter the name of the algorithm to use, this can be retrieved from the hypertraining function. If this is empty, the BEST algorithm will be used by default. *seasonname* : string, optional - Enter the season to use (winter,summer,shoulder), this can be retrieved from the hypertraining function. If this is empty, the default season is used. *usereverseproxy* : int, optional - leave as is *microserviceid* : int, optional - leave as is *password* : string, optional - leave as is *company* : string, optional - change for reporting. *email* : string, optional - leave as is *maadstoken* : string, optional - leave as is **Returns:** string buffer containing the prediction, and other details. **3. maadsbml.abort(host,port=10000)** **Parameters:** *host* : string, required - This is the IP address of the Docker container: http://localhost *port* : string, optional - Port is fixed at 10000 **Returns:** Abort will shutdown and re-start your system. **4. maadsbml.rundemo(host,port,demotype=1,timeout=1200,usereverseproxy=0,microserviceid='')** **Parameters:** *host* : string, required - This is the IP address of the Docker container: http://localhost *port* : string, required - This is the TRAININGPORT, it is usually 5595. *demotype* : int, required - If demotype is 1, then a regression models is run; if demotype is 0 then a classification model is run. *timeout* : int, optional - The connection timeout between Python and the container, in seconds *usereverseproxy* : int, optional - leave as is *microserviceid* : string, optional - leave as is **Returns:** null **5. maadsbml.algodescription(host,port,pkey,timeout=300,usereverseproxy=0,microserviceid='')** **Parameters:** *host* : string, required - This is the IP address of the Docker container: http://localhost *port* : string, required - This is the TRAININGPORT, it is usually 5595. *pkey* : string, required - This is the PKEY from hypertraining. *timeout* : int, optional - The connection timeout between Python and the container, in seconds *usereverseproxy* : int, optional - leave as is *microserviceid* : string, optional - leave as is **Returns:** null **6. maadsbml.finddistribution(filename,varname,dataarray=[],folderpath='',imgname='distimage',common=1,topdist=5)** Parameters: filename : string, required Filename containing the raw data. This must be a CSV file. varname : string, required Name of the variable for your data. dataarray : array_like, optional Numpy array. Rather than pass a filename, you can pass in an array. folderpath : string, optional Folder path to store the output data, and distribution image file. imgname : string, optional Name of the image and json data. common : int, optional If set to 1, this will apply common distributions to your data. If Set to 0, it will iterate through roughly 80 distributions. topdist : int, optional The number of the TOP distributions to print. Returns: status,dist dataframe,name of best distribution,all JSON data