Detecting Ghosts with Boo-leans: Part 3
Unleash the Power of Logistic Regression with Alteryx
Halloween has passed, but with the time change, darkness comes earlier and ghosts are among us for a longer part of the day. Luckily, in the last post we built our ghost detector, and we are ready to share it with our neighbors to make them feel a little safer on their walks home from work at night. To deploy the model, there are a few quick steps we need to follow.
First, we need to bring in our test file, or the data file that will hold live information. In our case it is just an Excel file, but for your organization it could also be a server that is always collecting data so that a score is available at any moment of the day (think of all the loan forms that get sent to banks and how quickly you get a result on whether you have been approved). From our previous Logistic Regression tool, we will now use the “O” anchor. To connect the live data and the “O” anchor, we go back to the predictive district and bring the Score Tool onto our canvas. When connecting these tools, it matters which input goes to which anchor. Your live data should connect to the “D” anchor of the Score Tool, and the model saved in the background of the Logistic Regression tool should connect to the “M” anchor.
When we open the Score Tool's output and scroll all the way to the right, we see two new columns, X_0 and X_1. Which column you use is a matter of preference. We will use X_1, and you will see that the number is a decimal. In this case we want to treat that number as the likelihood that what we are seeing is a ghost. For example, if the number is 0.85, the model says there is an 85% chance we are seeing a ghost. That is great to see, but we do not want people reading a percentage. Our goal is to say yes, you are seeing a ghost, or no, you are not. To do this, we need to create a new column and go back to the Logistic Regression tool to find the Optimal Probability Cutoff (which is found in the “I” anchor). This number decides whether we tell our neighbors it is a ghost or not. When we go to the tool, we see that my cutoff is 0.28. In the new column, which we will call “Ghost Predictor”, we write a formula: if X_1 is greater than 0.28 then 1, else 0. Now when someone sees a ghost and feeds in the attributes, all the end user will see is a 1 or a 0 in real time.
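For readers who want to see the cutoff rule outside of Alteryx, here is a minimal Python sketch of the same logic. It is not the Alteryx formula itself: the function name `ghost_predictor` and the sample scores are made up for illustration, and only the 0.28 cutoff and the "greater than" rule come from the steps above.

```python
# Optimal Probability Cutoff taken from the post; your own model's
# cutoff will likely be a different number.
CUTOFF = 0.28


def ghost_predictor(x_1: float, cutoff: float = CUTOFF) -> int:
    """Return 1 (ghost) if the X_1 score is greater than the cutoff, else 0."""
    return 1 if x_1 > cutoff else 0


# Hypothetical X_1 scores, standing in for the Score Tool's output column.
sample_scores = [0.85, 0.28, 0.10]
flags = [ghost_predictor(score) for score in sample_scores]
print(flags)
```

Note that a score exactly equal to the cutoff is flagged as 0 here, because the rule is strictly "greater than". If you would rather flag borderline sightings as ghosts, that comparison is the one place to change.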
To finish the deployment process, we need to figure out how to automate it and share the information with the public. Luckily, Alteryx has some great tools for this. If we do not need real-time scoring and decide to report results on a schedule (every 10 minutes or every hour), we can use Alteryx Server to produce outputs based on the ghost finder requests we have received over that period. If we want real-time scoring, Alteryx Promote will allow us to do this and embed those scores into our internal software processes (intranet, Salesforce, company app). Lastly, if we want to incorporate visual analytics so that we can track ghost sightings over time or see where people are spotting ghosts, we can use Alteryx’s new Interactive Charts or Tableau (which Alteryx will feed live data to).
Thank you to those who have been following this series. We look forward to doing more blog posts on Advanced Analytics case studies in 2019!
Author: Justin Grosz