How to add a custom UI to monitor your model training & evaluation trials in less than 5 min in PySpark

Eden E. Canlilar
4 min read · Jun 22, 2020

Have you ever lost track of the models/hyper-parameters you’ve tested as you’re developing a machine learning model? Ever found yourself wishing that someone else would just keep track of it all for you? Well wish no longer! An automated solution with a customizable graphical user interface in your browser is here! AND it can be set up in less than five minutes! Oh and…. BONUS it’s open source… Translation: FREE! No joke.

If you're as excited as I was to hear about this, you're probably jumping up and down right now with pure glee or giving thanks to the machine learning Gods. Truthfully, this really was a dream come true for me, and I'm happy to share it with you because there are not many good resources out there about it yet.

All the code provided below was derived from a Udemy course called “PySpark Essentials for Data Scientists.” So if you want more details (there's a bunch more) or access to the full script, you can check it out by clicking here.

Interest piqued?

Read on and prosper!

Step 1

Set up the environment (this assumes you already have PySpark installed):

  • pip install mlflow (via the command line)
  • Start your PySpark session (code below)
  • Import dependencies (code below)
import pyspark
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("Mlflow").getOrCreate()

# MLflow dependencies
import mlflow
from mlflow.tracking import MlflowClient

# PySpark modeling libraries
from pyspark.ml.classification import DecisionTreeClassifier
from pyspark.ml.evaluation import MulticlassClassificationEvaluator

# For the fake dataframe
from pyspark.ml.linalg import Vectors

Step 2

Set up your client and create a list of all your experiments:

client = MlflowClient()
experiments = client.list_experiments()  # note: renamed to client.search_experiments() in MLflow 2.x

# To view list of all experiments:
for x in experiments:
    print(x)
    print(" ")

Step 3

Create a function to create runs tied to an experiment (a new one is created if the name you choose doesn’t already exist):

experiment_name = "Tree-Algorithms"

def create_run(experiment_name):
    # Creates the experiment if it doesn't already exist, then opens a new run in it
    mlflow.set_experiment(experiment_name)
    experiment = client.get_experiment_by_name(experiment_name)
    run = client.create_run(experiment.experiment_id)
    return run

Note: You’ll have to run this before EACH test result you want to capture.

Step 4

Read in your data or create some fake data and split into training and evaluation sets:

df = spark.createDataFrame([
    (0, Vectors.dense([1.0, 0.1, -1.0]),),
    (1, Vectors.dense([2.0, 1.1, 1.0]),),
    (1, Vectors.dense([2.0, 1.1, 1.0]),),
    (2, Vectors.dense([2.0, 1.1, 1.0]),),
    (1, Vectors.dense([2.0, 1.1, 1.0]),),
    (1, Vectors.dense([2.0, 1.1, 1.0]),),
    (2, Vectors.dense([2.0, 1.1, 1.0]),),
    (1, Vectors.dense([2.0, 1.1, 1.0]),),
    (1, Vectors.dense([2.0, 1.1, 1.0]),),
    (2, Vectors.dense([2.0, 1.1, 1.0]),),
    (1, Vectors.dense([2.0, 1.1, 1.0]),),
    (1, Vectors.dense([2.0, 1.1, 1.0]),),
    (2, Vectors.dense([2.0, 1.1, 1.0]),),
    (2, Vectors.dense([3.0, 10.1, 3.0]),)
], ["label", "features"])

seed = 40
train_val = 0.7
test_val = 1 - train_val
train, test = df.randomSplit([train_val, test_val], seed=seed)
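
If you want to use your own data instead of the toy dataframe, the same split works once you have a numeric label column and an assembled features vector. Here's a minimal sketch, assuming a hypothetical my_data.csv with a label column and numeric feature columns f1, f2 and f3 (swap in your own names):

from pyspark.ml.feature import VectorAssembler

# Hypothetical file and column names -- adjust to your own dataset
raw = spark.read.csv("my_data.csv", header=True, inferSchema=True)
assembler = VectorAssembler(inputCols=["f1", "f2", "f3"], outputCol="features")
df = assembler.transform(raw).select("label", "features")

train, test = df.randomSplit([train_val, test_val], seed=seed)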

Step 5

Train and test your model:

# Instantiate algorithm and fit model
classifier = DecisionTreeClassifier(maxDepth=5, maxBins=32)
fitModel = classifier.fit(train)

# Evaluate
MC_evaluator = MulticlassClassificationEvaluator(metricName="accuracy")
predictions = fitModel.transform(test)
accuracy = MC_evaluator.evaluate(predictions) * 100
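
A quick aside: the same predictions can be scored with other metrics simply by changing metricName on the evaluator. A small sketch (these metric names are standard MulticlassClassificationEvaluator options):

# Other views of the same predictions
f1 = MulticlassClassificationEvaluator(metricName="f1").evaluate(predictions)
weighted_precision = MulticlassClassificationEvaluator(metricName="weightedPrecision").evaluate(predictions)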

Step 6

Create a run and log your results to MLflow:

# Create a run (this is very important!)
run = create_run(experiment_name)

# Log metrics to MLflow
client.log_metric(run.info.run_id, "Accuracy", accuracy)

# Log any tags you want to MLflow (a lot of flexibility here)
classifier_name = type(classifier).__name__
client.set_tag(run.info.run_id, "Algorithm", classifier_name)
client.set_tag(run.info.run_id, "Random Seed", seed)
client.set_tag(run.info.run_id, "Train Percent", train_val)

# Log parameters to MLflow
paramMap = fitModel.extractParamMap()
for key, val in paramMap.items():
    if 'maxDepth' in key.name:
        client.log_param(run.info.run_id, "Max Depth", val)
    if 'maxBins' in key.name:
        client.log_param(run.info.run_id, "Max Bins", val)

# Set the run's status to finished (best practice)
client.set_terminated(run.info.run_id)
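
One nice extra that isn't in the original script: MLflow's Spark flavor can also store the fitted model itself as an artifact. A minimal sketch that reopens the run above by its id (you could equally log the model before calling set_terminated):

import mlflow.spark

# Attach the fitted PySpark model to the run created above
with mlflow.start_run(run_id=run.info.run_id):
    mlflow.spark.log_model(fitModel, "decision-tree-model")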

Step 7

Navigate to the MLflow UI to view your results! By default, wherever you run your script (Jupyter notebook, .py file or whatever), the tracking API writes data to files in a local ./mlruns directory. First you need to start the MLflow UI from the command line. So once you have the command prompt open, follow these steps.

  1. cd into the folder where the notebook (or script) you are running is stored (e.g. cd Documents). If you're not sure how to do this, here are resources for Mac and Windows users.
  2. Once you are in the folder (Documents in the example above), type 'mlflow ui' into the command line.
  3. Then paste this web address into your browser to view the UI: http://localhost:5000/#/
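
One gotcha: if your notebook and the 'mlflow ui' command aren't run from the same folder, the UI won't see your runs, because each process defaults to its own local ./mlruns directory. A way around this (a sketch, assuming the local UI server is up on the default port and accepting tracking requests) is to point the tracking API at the server at the top of your script, before you log anything:

import mlflow
from mlflow.tracking import MlflowClient

# Log to the running MLflow server instead of a local ./mlruns folder
mlflow.set_tracking_uri("http://localhost:5000")
client = MlflowClient()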

And whammy! The results you see in your browser should look something like this….

MLflow UI Output

If your column headings look different, just click the "Columns" button on the right above the table and check/uncheck the boxes next to whichever columns you want to show or hide.

Next Steps

Wasn't that easy?! Seriously, that's really all you have to do to get started. From here you can conduct as many runs as you want. I always like to test out several different algorithms with various hyper-parameter settings (e.g. maxBins/maxDepth) and then review the MLflow UI to see which one performed the best.
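
As a rough sketch of what that loop can look like (reusing the create_run helper, the train/test split and the evaluator from the steps above; the candidate settings are arbitrary):

from pyspark.ml.classification import DecisionTreeClassifier, RandomForestClassifier

# Try a few algorithm / hyper-parameter combinations, logging each as its own run
for candidate in [DecisionTreeClassifier(maxDepth=3),
                  DecisionTreeClassifier(maxDepth=10),
                  RandomForestClassifier(numTrees=50)]:
    fitModel = candidate.fit(train)
    accuracy = MC_evaluator.evaluate(fitModel.transform(test)) * 100

    run = create_run(experiment_name)  # remember: a fresh run for each result
    client.set_tag(run.info.run_id, "Algorithm", type(candidate).__name__)
    client.log_metric(run.info.run_id, "Accuracy", accuracy)
    client.set_terminated(run.info.run_id)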

The table in the UI is great because you can sort the results by the accuracy score to see right away what the best performer was! You can also export this table as a CSV file and email it to your boss to impress the pants off him/her.
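
If you'd rather pull that leaderboard into code than into a CSV, the client can query runs directly. A sketch, assuming at least one run logged a metric named "Accuracy" as above:

# Fetch the run with the best Accuracy for this experiment
experiment = client.get_experiment_by_name(experiment_name)
best = client.search_runs([experiment.experiment_id],
                          order_by=["metrics.Accuracy DESC"],
                          max_results=1)[0]
print(best.data.tags.get("Algorithm"), best.data.metrics["Accuracy"])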

Another cool thing I like to do is change the random seed once I think I have a well-performing model and make sure the accuracy score stays stable. If I see a significant change either way in the accuracy score, I know my model is unreliable.
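
That check is easy to automate too. A sketch along the lines of the code above, just re-splitting the data with a few different seeds so the spread shows up in the UI:

# Refit with several seeds and log each accuracy
for s in [7, 40, 99]:
    tr, te = df.randomSplit([train_val, test_val], seed=s)
    acc = MC_evaluator.evaluate(classifier.fit(tr).transform(te)) * 100

    run = create_run(experiment_name)
    client.set_tag(run.info.run_id, "Random Seed", s)
    client.log_metric(run.info.run_id, "Accuracy", acc)
    client.set_terminated(run.info.run_id)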

But that about does it for this post! I hope you all enjoyed this article and learned something new! Best of luck in your model tuning journey :)

Additional resources in case you need them

