MACHINE LEARNING FOR BUSINESS SERIES: DECISION TREES

Machine Learning has become a central topic of interest for the media, thanks to its recent successful applications in creating value across a variety of business scenarios. At Clariba, as experts in predictive analytics, we are active agents of its adoption and democratization, as we have been applying ML in our predictive solutions for a long time. When used intelligently and with the right methodology, Machine Learning techniques can deliver a performance boost to companies and organizations of all kinds.

With this series we aim to introduce newcomers to the different types of Machine Learning, its main techniques and algorithms, and their business uses. We also want to help demystify the term and give our clients and prospects ideas on how to integrate ML into their daily operational and decision-making processes.

What is Machine Learning?

The term Machine Learning was introduced by Arthur Samuel in 1959. It is a field of science that explores the development of algorithms that can learn from data and make predictions about it. The main difference from other common algorithms is the "learning" piece. Machine Learning algorithms are not a fixed sequence of steps executed in order to produce a predefined result. Instead, they are processes that seek to "learn" patterns from past events and build functions that can produce good predictions, with a degree of confidence.

Within the field of data analytics, Machine Learning is part of an area known as predictive analytics.

TYPES AND USAGE

As we just described, the learning piece is what best defines this kind of algorithm. Depending on the type of learning, they are commonly divided into supervised, unsupervised, semi-supervised and reinforcement learning algorithms.

We will start this series with an example of supervised learning algorithms.

Supervised learning algorithms try to find relationships and dependencies between a target output we want to predict (anything from churn to insurance fraud or the likely success of a sales promotion with different individuals) and the data we hold on other individuals from the past, including demographic characteristics or previous behavioural data. We use this past data as input variables to predict the most probable output value for new data, based on the relationships learned from previous data sets.
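To make the idea of "learning from past data" more concrete, here is a minimal sketch in plain Python. It is not one of the algorithms listed in this article, only an illustration: it learns a single cut-off on one input variable by choosing the value that misclassifies the fewest past cases. The function names and the toy data are hypothetical and invented for this example.

```python
from typing import List


def fit_threshold(values: List[float], labels: List[bool]) -> float:
    """Learn the cut-off that misclassifies the fewest past cases.

    The rule being learned is: predict True when value > threshold.
    """
    candidates = sorted(set(values))
    best_threshold, best_errors = candidates[0], len(values) + 1
    for threshold in candidates:
        # Count past cases where the rule disagrees with the known outcome.
        errors = sum((v > threshold) != y for v, y in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def predict(value: float, threshold: float) -> bool:
    """Apply the learned rule to a new, unseen observation."""
    return value > threshold


# Hypothetical past data: customer service calls and whether the customer churned.
past_calls = [0, 1, 1, 2, 3, 4, 5, 6]
past_churned = [False, False, False, False, False, True, True, True]

threshold = fit_threshold(past_calls, past_churned)
```

Once the cut-off has been learned, `predict(5, threshold)` applies it to a new customer. Real supervised algorithms work with many variables at once and guard against overfitting, but the pattern of fitting on past data and then predicting on new data is the same.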

The most typical supervised methods are:

  • Decision Trees
  • Linear Regression
  • Nearest Neighbour
  • Naive Bayes
  • Support Vector Machines (SVM)
  • Neural Networks

DECISION TREES

Classification and regression trees are commonly known as CART. The term was introduced by Leo Breiman to refer to Decision Tree algorithms that can be used for classification or regression predictive modelling problems.

The basic CART algorithm is the foundation for more advanced algorithms such as bagged decision trees, random forests and boosted decision trees.

Decision trees are generally used to predict the probability of an outcome for a new observation (individual, customer, ...) based on its attributes (age, demographics, purchasing behaviour, ...), using past data we hold on a sufficient number of similar observations or individuals. The outcome to predict is normally binary: yes / no (will churn / will not churn, will buy / will not buy, ...).

They are called trees because they can be represented as a binary tree in which each internal node, starting from the root, represents a single input variable (age, city, segment…) and a split point on that variable (assuming the variable is numeric). The leaf nodes of the tree contain the value of the output variable (will buy, will churn, …) we want to predict.
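The structure described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the class names `Split` and `Leaf`, the variables `age` and `monthly_spend`, and the split points are all hypothetical and do not come from a real model.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    """End of a branch: holds the predicted value of the output variable."""
    prediction: str


@dataclass
class Split:
    """Internal node: one numeric input variable and a split point on it."""
    variable: str
    split_point: float
    if_greater: Union["Split", Leaf]  # branch followed when value > split_point
    otherwise: Union["Split", Leaf]   # branch followed when value <= split_point


def predict(node: Union[Split, Leaf], observation: dict) -> str:
    """Walk from the root down to a leaf and return its prediction."""
    while isinstance(node, Split):
        if observation[node.variable] > node.split_point:
            node = node.if_greater
        else:
            node = node.otherwise
    return node.prediction


# Hypothetical tree: the root splits on age, one branch splits again on spend.
tree = Split(
    variable="age",
    split_point=40,
    if_greater=Leaf("will buy"),
    otherwise=Split(
        variable="monthly_spend",
        split_point=50,
        if_greater=Leaf("will buy"),
        otherwise=Leaf("will not buy"),
    ),
)
```

Calling `predict(tree, {"age": 30, "monthly_spend": 20})` follows the "otherwise" branch twice and ends at the "will not buy" leaf.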

Let’s start with a simple example, where we will try to predict gender based on height and weight of people. Below is a very simple example of a binary decision tree:

 

Imatge.jpg

The tree can be represented either as a graph or as a set of rules. For example, below is the decision tree above, described as a set of rules:

  1. If Height > 180 cm Then Male
  2. If Height <= 180 cm AND Weight > 80 kg Then Male
  3. If Height <= 180 cm AND Weight <= 80 kg Then Female

With the binary tree representation of the CART model described above, making predictions is relatively straightforward. Each time we evaluate a new individual, we can predict their gender from their height and weight, with a degree of confidence.
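The three rules above translate almost word for word into code. The function below is a minimal sketch; the name `predict_gender` and its parameter names are our own choices for illustration.

```python
def predict_gender(height_cm: float, weight_kg: float) -> str:
    """Apply the three rules of the example tree, from the root downwards."""
    if height_cm > 180:
        return "Male"    # Rule 1: Height > 180 cm
    if weight_kg > 80:
        return "Male"    # Rule 2: Height <= 180 cm AND Weight > 80 kg
    return "Female"      # Rule 3: Height <= 180 cm AND Weight <= 80 kg
```

Because the rules are checked from the root down, the second test only runs for people who are 180 cm or shorter, which is exactly how the second and third rules are written.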

Let’s see a decision tree in action, helping a US-based telco prevent churn.

BUSINESS CASE EXAMPLE: CHURN PREVENTION IN A TELCO

We have a sample dataset with the following attributes from a significant number of customers:

  • State: 2 characters representing the State
  • Account length: age of the account in days
  • Area code: telephone area code
  • International plan: Boolean yes/no indicating whether the client has contracted an international plan
  • Voice mail plan: Boolean yes/no showing whether the customer has voicemail activated
  • Number vmail messages: total number of voicemail messages managed
  • Total day minutes: average minutes of voice calls during daytime, per month
  • Total day calls: average number of calls during daytime, per month
  • Total day charge: average spending on day calls, per month
  • Total eve minutes: average minutes of voice calls during evening, per month
  • Total eve calls: average number of voice calls during evening, per month
  • Total eve charge: average spending on voice calls during evening, per month
  • Total night minutes: average minutes of voice calls during the night, per month
  • Total night calls: average number of voice calls during the night, per month
  • Total night charge: average spending on voice calls during the night, per month
  • Total intl minutes: average minutes on international calls, per month
  • Total intl calls: average number of international calls, per month
  • Total intl charge: average spending on international calls, per month
  • Customer service calls: number of calls to customer service
  • Churned: Boolean true / false indicating whether or not the customer churned from the company. This will be our target variable to predict and prevent.

The image below shows a small sample of the dataset we are using for our predictions.

Imatge5.jpg

We use SAP Predictive Analytics to help us build our decision tree and answer the business question: how could we prevent customers from churning, based on our historical data?

Let’s see the outcome:

Imatge2.jpg

The tree starts with the analysis of the whole population, which in our case is 3332 customers. The first thing we notice is that over 14% of these customers have churned. The first variable the algorithm identifies as decisive for predicting churn is “customer service calls”. As we can see in the box on the right-hand side of Fig 1, around 52% of clients who have called customer service more than 3 times end up churning. The company should act proactively: call those customers, listen to them and try to resolve their issues right after the 3rd call.
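The percentages shown in each box of the tree are simply churn rates for the customers that fall on each side of a split. The sketch below shows that calculation in plain Python. The eight records are hypothetical toy data invented for illustration, not rows from our dataset, and the helper names are our own.

```python
from typing import Dict, List, Tuple


def churn_rate(customers: List[Dict]) -> float:
    """Share of customers in the group who churned (0.0 for an empty group)."""
    if not customers:
        return 0.0
    return sum(c["churned"] for c in customers) / len(customers)


def split(customers: List[Dict], variable: str,
          split_point: float) -> Tuple[List[Dict], List[Dict]]:
    """Divide customers into those above the split point and the rest."""
    above = [c for c in customers if c[variable] > split_point]
    rest = [c for c in customers if c[variable] <= split_point]
    return above, rest


# Hypothetical toy records, for illustration only.
customers = [
    {"customer_service_calls": 0, "churned": False},
    {"customer_service_calls": 1, "churned": False},
    {"customer_service_calls": 1, "churned": False},
    {"customer_service_calls": 2, "churned": True},
    {"customer_service_calls": 3, "churned": False},
    {"customer_service_calls": 4, "churned": True},
    {"customer_service_calls": 5, "churned": False},
    {"customer_service_calls": 6, "churned": True},
]

frequent_callers, others = split(customers, "customer_service_calls", 3)
overall_rate = churn_rate(customers)
frequent_rate = churn_rate(frequent_callers)
others_rate = churn_rate(others)
```

A tree-building algorithm evaluates many candidate variables and split points in this way and keeps the ones that separate churners from non-churners most clearly.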

First prevention action: proactively call every customer right after their 3rd call to customer service and try to resolve their issues.

Let’s continue analysing the tree:

Imatge3.jpg

The next level tells us there is a strong association between not having voicemail activated and churning, especially among customers with low daytime spending. We could assume that those users do not use their phones for work, so they would get a better service with voicemail active during the day, allowing them to catch up on missed calls and messages at the end of the day.

Second prevention action: start a campaign offering free voicemail activation, and proactively inform the most relevant customer segment (day charge between $0 and $24.43 per month).

Finally, let’s analyse the bottom level of the tree by using “International Plan” as a branch separator:

Imatge4.jpg

As we can see, the churn rate is higher among customers who have an international calls plan, whatever the combination of the previous variables. This suggests that customers with an international calls plan are, on average, less satisfied and more likely to churn.
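One way to check a statement like this is to compare churn rates with and without the international plan inside each earlier branch, rather than across the whole population. The sketch below does that grouping in plain Python. The records and the field names `voice_mail_plan` and `international_plan` are hypothetical and used only for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def churn_rate_by(customers: List[Dict], *keys: str) -> Dict[Tuple, float]:
    """Churn rate for every combination of values of the given attributes."""
    counts = defaultdict(lambda: [0, 0])  # group -> [churned, total]
    for c in customers:
        group = tuple(c[k] for k in keys)
        counts[group][0] += int(c["churned"])
        counts[group][1] += 1
    return {g: churned / total for g, (churned, total) in counts.items()}


# Hypothetical toy records, for illustration only.
customers = [
    {"voice_mail_plan": True, "international_plan": True, "churned": True},
    {"voice_mail_plan": True, "international_plan": True, "churned": False},
    {"voice_mail_plan": True, "international_plan": False, "churned": False},
    {"voice_mail_plan": True, "international_plan": False, "churned": False},
    {"voice_mail_plan": False, "international_plan": True, "churned": True},
    {"voice_mail_plan": False, "international_plan": True, "churned": True},
    {"voice_mail_plan": False, "international_plan": False, "churned": True},
    {"voice_mail_plan": False, "international_plan": False, "churned": False},
]

rates = churn_rate_by(customers, "voice_mail_plan", "international_plan")

# The plan is associated with more churn only if this holds in every branch.
higher_in_every_branch = all(
    rates[(vm, True)] > rates[(vm, False)] for vm in (True, False)
)
```

In this toy data the international plan group churns more in both voicemail branches, which is the pattern the bottom level of the tree shows for the real dataset.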

Third prevention action: review the international plans and how well they fit the usage and needs of each customer segment.

Summary

Decision trees are an easy way to represent the frequencies of attributes that we suspect may be informative, helping to predict an outcome. They can be very useful for analysing the probable causes of good and bad business results and for helping us improve our service levels, increase customer retention or prevent fraud, among many other applications.

Most importantly, decision trees, like many other ML algorithms, are already available in many of the SAP BI solutions. Clariba can help you identify the Machine Learning processes that can add value to your business and integrate them into your existing BI ecosystem. Contact us and we’ll be delighted to support you on this journey.

References

https://machinelearningmastery.com/classification-and-regression-trees-for-machine-learning/

https://towardsdatascience.com/types-of-machine-learning-algorithms-you-should-know-953a08248861
