Machine Learning in Cyber-security

In the 21st century, people are busy discussing the possibilities that come with their data and the risks it faces when exposed on the internet. At the start of 2020, planning my year made me think about my life online: who has access to my data, how they access it, and the risk that comes with that. This matters to me because one of my objectives this year is to build my image, both in my career and in my personal life. Businesses, from small firms to big corporations, share the same worry: the fear of exposure. The field of cybersecurity comes to the rescue amid all these fears, but as we know, food without salt is merely a tasteless nutrient. Nowadays it is nearly impossible to deliver a cybersecurity technology without three key pillars:

  • Machine learning
  • Image recognition
  • Natural language processing

There is always someone behind a computer trying to find a weakness and, worst of all, hackers are themselves building machine learning algorithms to carry out nefarious activities.

Application of Machine Learning in cybersecurity

There are reasons why machine learning has become the talk of the town: it is applied in nearly all fields. With machine learning, cybersecurity systems can analyze and learn from patterns in order to prevent similar attacks, and can adapt to changing environments or behaviors. It is always a win for a cybersecurity team to be proactive rather than reactive, since this helps them prevent crimes from occurring and respond to attacks in real time. Machine learning is well suited to making that happen. It also optimizes resources by reducing the time spent on routine work.

When looking at applying machine learning to cybersecurity, we ought to answer three questions: the why, the what and the how. The why is the task: predicting, detecting, preventing, responding to or monitoring a threat. The what is the technical layer being protected, and the how is the way a particular area, such as the network, is checked. Below are different ways ML is applied in cybersecurity systems:

Network protection using ML

The network is not a single area; it is spread across different protocols and technologies such as Ethernet, wireless, or even software-defined networking. ML was previously used in network protection mainly with signature-based approaches. This has since changed, and ML is now applied through network traffic analytics, which analyzes traffic at each layer to detect attacks and anomalies.

Some of the basic applications are:

  • Predicting network packet parameters and comparing them with the normal ones using regression.
  • Identifying different kinds of network attacks such as spoofing and scanning using classification.
  • Doing forensic analysis using clustering.
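
As a rough illustration of the regression idea in the first bullet, the sketch below fits a straight line to made-up "normal" traffic and flags observations that stray far from what the line predicts. The two features (packets and bytes per second), the sample numbers and the three-sigma threshold are all assumptions chosen for illustration; a real product would use far richer features.

```python
from statistics import mean, pstdev

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Hypothetical "normal" traffic: packets per second and bytes per second.
normal_packets = [10, 20, 30, 40, 50, 60]
normal_bytes = [5100, 9900, 15200, 19800, 25100, 29900]

slope, intercept = fit_line(normal_packets, normal_bytes)

# The spread of the errors on normal traffic sets the alert threshold
# (assumed here: three standard deviations).
residuals = [b - (slope * p + intercept)
             for p, b in zip(normal_packets, normal_bytes)]
threshold = 3 * pstdev(residuals)

def is_anomalous(packets, observed_bytes):
    """Flag traffic whose byte count is far from the predicted value."""
    predicted = slope * packets + intercept
    return abs(observed_bytes - predicted) > threshold

print(is_anomalous(35, 17500))   # close to the learned pattern
print(is_anomalous(35, 90000))   # far more bytes than the line predicts
```

The same compare-prediction-with-reality pattern applies to any packet parameter; only the features and the model change.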

Endpoint protection with ML

Endpoint detection and response is widely advocated as the new generation of antivirus. ML applications at the endpoint differ depending on the kind of endpoint. Although the tasks are the same, each endpoint comes with its own specifics.

  • Malware protection on secure email gateways using clustering. A good example is separating legitimate attachments from the outliers.
  • Predicting the next system call of an executing process and comparing it with the real one using regression.
  • Dividing programs into different categories such as malware, spyware or ransomware using classification.

Application security

ML is used in web application firewalls (WAFs) and code analysis. Some applicable examples of ML are:

  • Detecting anomalies in HTTP requests using regression. A good example is XML external entity (XXE) injection.
  • Clustering user activities to identify distributed denial-of-service (DDoS) attacks.
  • Detecting known types of attack with classification.

Other applications of machine learning are mentioned below:

  • Monitoring user behavior with ML
  • Monitoring process behavior with the help of ML

Case Study

MIT ML platform

MIT's Computer Science and Artificial Intelligence Laboratory developed an adaptive machine learning security platform that helps analysts find the needles in the haystack. The main purpose of the system was to review millions of logins each day and filter the data to simplify human analysis. This reduced daily alarms to around 100 and helped raise the detection rate to 85%.

Darktrace algorithm

This algorithm was used by a casino in North America to detect an infiltration attack that found a soft spot into the network through a fish tank. The same algorithms were used to prevent attacks during the WannaCry ransomware outbreak.


Having seen the different ways machine learning can aid cybersecurity, we can understand why there is hype over the need for machine learning skills. It should be noted, however, that ML is not the final solution if you really want to protect your system. Concerns have been raised about its interpretability, mainly for deep learning algorithms, although we humans have the same problem with interpretation. There is also a big gap between the number of machine learning experts and the rapid growth in data. To advance cybersecurity, we ought to develop new approaches in machine learning to stay ahead of black-hat hackers.


  • To stay on the safe side, machine learning techniques in cybersecurity should be implemented hand in hand with human-aided analysis.
  • Build a taxonomy of the known machine learning and deep learning algorithms for cybersecurity.
  • Policies, including financial support, should be enacted to advance research in machine learning for cybersecurity.







The Machine learning wave in the financial services


In this article, I am going to talk about how machine learning is transforming the financial services industry. In a typical economy, the financial services sector is made up of banks, insurance companies, real estate firms, and credit and payment processing companies that serve both retail and commercial consumers. All these firms hold enormous volumes of historical customer data because of the large number of transactions they process daily. I will begin by defining what machine learning is and then relate it to the large datasets in financial services firms.

Machine learning is a subset of data science that uses statistical models that learn from experience (data) without being explicitly programmed. The models then draw insights from the data and make predictions when given a different dataset. Simply put, machine learning involves training a model on a dataset and then evaluating it on a separate test dataset to check its performance. Think of teaching a small child simple arithmetic such as addition and subtraction using a set of examples, then giving the child a test after the learning process.
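
To make the train-then-evaluate idea concrete, here is a minimal sketch on made-up data. The "model" is just a single learned cut-off on one hypothetical numeric feature, which is an assumption chosen to keep the example short; real models are far richer, but the split into training and test data works the same way.

```python
import random

random.seed(0)  # fixed seed so the shuffle is reproducible

# Made-up dataset: (feature value, label). Purely illustrative.
data = [(x, x >= 50) for x in range(0, 100, 5)]
random.shuffle(data)

# Hold out 25% of the rows for evaluation.
split = int(len(data) * 0.75)
train, test = data[:split], data[split:]

def accuracy(threshold, rows):
    """Share of rows where 'x >= threshold' matches the label."""
    return sum((x >= threshold) == label for x, label in rows) / len(rows)

# "Training": pick the cut-off that best separates the training rows.
candidates = sorted(x for x, _ in train)
best_threshold = max(candidates, key=lambda t: accuracy(t, train))

print("learned threshold:", best_threshold)
print("train accuracy:", accuracy(best_threshold, train))
print("test accuracy:", accuracy(best_threshold, test))
```

The test accuracy is the number that matters: it tells you how the model behaves on examples it has never seen, just like the child's test after the lesson.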

Having understood what machine learning is, let’s look at some of the major applications of machine learning in the financial services sector.

Underwriting and Credit scoring

Underwriting is the process banks use to decide whom to give credit to, how much to give and when to give it. In insurance, it is the process of gauging the risk associated with insuring a certain customer. Traditional underwriting models relied on a limited number of data points about the customer, mainly numerical features. This greatly limited their accuracy, resulting in underwriting losses. Over time, banks and insurance companies have collected large volumes of historical data about their clients from different sources, e.g. social media and CRMs. This data has been used to train machine learning models that draw on thousands of data points about the customer to predict the credit score with high precision. This has greatly reduced the underwriting losses previously incurred by banks and insurance firms.


Robo-advisors are digital platforms that provide automated, algorithm-driven financial planning services with little human supervision. Machine learning algorithms are used in robo-advisors for wealth management and product recommendation, and they typically perform the duties of a human investment advisor. Robo-advisors have helped mitigate the subjectivity associated with human advisors, charge less than the management fees paid to investment advisors, and recommend personalized products that improve customer experience.


The financial services sector has undergone many transformations over time, from manual bookkeeping to Excel spreadsheets to modern-day sophisticated accounting software. Despite these developments, manual repetitive tasks still exist in departments such as call centers. Applications such as chatbots, which are based on natural language processing and rely on machine learning algorithms, have been developed to automate the repetitive tasks in customer support centers. Well-trained chatbots can handle 80% of call center tasks. This has greatly reduced labor costs and improved customer experience, since the bots can be reached when a customer support representative is not available in the office.

Fraud detection

Financial institutions are prone to fraudulent transactions, which result in financial losses. Traditional fraud detection and prevention techniques such as whistleblowing and internal checks have proved ineffective over time. Deploying machine learning algorithms to monitor the numerous transactions has helped identify and flag fraudulent transactions in real time. With the large datasets available to these firms, the models are well trained and thus achieve high accuracy. The models have helped recover revenue that would otherwise be lost to fraudulent transactions.
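
A very simple way to picture transaction flagging is an outlier check against a customer's own history. The sketch below is only an assumption-laden toy: the amounts are invented, the single feature (transaction amount) and the z-score cut-off of 3 are arbitrary choices, and production fraud models combine many more signals.

```python
from statistics import mean, stdev

def flag_suspicious(history, new_amounts, z_cutoff=3.0):
    """Return the new amounts lying more than z_cutoff standard
    deviations away from the customer's historical mean."""
    mu, sigma = mean(history), stdev(history)
    return [a for a in new_amounts if abs(a - mu) > z_cutoff * sigma]

# Hypothetical past card payments for one customer.
history = [42.0, 55.5, 38.2, 61.0, 47.3, 52.8, 44.1, 58.6]

# Hypothetical incoming payments to be checked as they arrive.
incoming = [49.9, 63.0, 1250.0]

print(flag_suspicious(history, incoming))  # only the 1250.0 payment is flagged
```

Because the check only needs a mean and a standard deviation per customer, it can run on each transaction as it arrives, which is the real-time property described above.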

Case studies

  1. Citibank: the bank has invested heavily in Feedzai, a fraud detection machine learning platform that continuously scans large volumes of data to recognize emerging threats and alert customers in real time. It also offers customers on the digital platform protection against cyberattacks through its user-friendly omnichannel support.
  2. Zendrive: a mobile app that monitors the driving behavior of customers to potentially offer them significant discounts on car insurance premiums. This is in contrast with current insurance pricing models, which use variables like age, education and marital status.

Industry insights

Captricity has developed machine learning algorithms that can convert handwritten or typed forms into digital form with 99.9% accuracy. These algorithms are helping insurance firms reduce cycle times.

Conclusion and recommendations

The value of machine learning is growing daily, and it is hard to imagine the financial services sector not using it in the near future. The advantages of incorporating the algorithms into the sector's daily operations are evident. Although implementing the whole process can be costly and time-consuming, starting with the initial stages, such as solid data engineering, aggregation and visualization, and then applying open-source machine learning frameworks can effectively perform similar functions to purchased software and drive the business forward.

Thank you for reading through the article. I hope you have learned something.





Partial dependency of a decision on a specific data point?

What is the partial dependency of your decision on a specific data point?

The financial industry has taken a big step towards adopting advanced machine learning models in credit decisioning. However, implementing real-time scoring methods supported by machine learning remains a challenge. Institutional heads worry about implementing black-box models without understanding the transformations the data goes through. Risk managers would like to understand the computations that take place before a decision is reached, and in most cases this is a complex task. Take the example of a random forest model, which derives its decision from a forest of many decision trees and predicts whether someone will default or not. In most cases, the business would like to derive rules from such intelligent processes to guide it in deciding what an ideal client profile looks like.

In digital lending, the profile attributes generated will eventually be introduced into the scoring process, with points accorded to each feature. Unfortunately, in machine learning a combination of attributes affects the overall performance of the model, and selecting only a few features, especially where you have hundreds of important ones, will lessen the quality of your model. That is why the accuracy of most models implemented by credit reference bureaus (CRBs) in Kenya on the digital lending frontier hardly meets clients' expectations. Lenders eventually must invest in a debt collection team to follow up unpaid debts. In fact, CRB agencies achieve a 10-30% default rate, assuming all clients have to repay their loans by the due date. The easiest approach in the initial stages of introducing machine learning into an organization's ordinary business processes is to explain how the most important variables affect the main response variable. This is possible thanks to Friedman (2001), who introduced partial dependence plots.

A partial dependence plot can be viewed as a graphical counterpart to the coefficients of a linear regression model. Most people use traditional regression models, which let us extract considerable knowledge by breaking down the structure of the model and examining its coefficients. Often, however, we find ourselves running more advanced and complex models that require lots of tuning, stress testing and optimization, such as the XGBoost family, random forests and support vector machines. These models offer no coefficients to read off, so it is hard to estimate how the response variable depends on the predictor variables, and the models become difficult to interpret. Occasionally, we find ourselves trying to convince executives with model evaluation metrics such as precision, recall, accuracy and ROC AUC in a bid to explain how far we can fly with the new rocket (the ML model). To tell a more interesting story and connect with the business, consider making use of partial dependence plots.

Friedman (2001) faced a similar difficulty in interpreting his gradient boosting machine, and to address it he proposed partial dependence plots. These plots help you explain how each predictor (independent) variable in your dataset influences the response. It gets very interesting if you, as a data analyst, can explain things from this perspective.
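
The computation behind a partial dependence curve is simple enough to sketch by hand: fix one feature at a chosen value for every row, ask the model for predictions, and average them. In the sketch below, `toy_model` is a hypothetical stand-in for any trained black-box model, and the customers and the grid of ages are made up for illustration.

```python
def partial_dependence(predict, rows, feature_index, grid):
    """For each grid value, overwrite one feature in every row,
    predict, and average the predictions over all rows."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature_index] = value
            total += predict(modified)
        curve.append(total / len(rows))
    return curve

# Hypothetical stand-in for a trained model: returns a repayment
# score from (age, monthly income change).
def toy_model(row):
    age, income_change = row
    return 0.3 + (0.4 if age >= 45 else 0.0) + 0.002 * income_change

# Made-up customers: (age, monthly income change).
customers = [(25, -50), (38, 0), (52, 100), (61, 50)]

ages = [20, 30, 40, 50, 60]
curve = partial_dependence(toy_model, customers, 0, ages)
for age, value in zip(ages, curve):
    print(age, round(value, 3))
```

Plotting the grid values on the x axis against the averaged predictions on the y axis gives the partial dependence plot for that feature. The model is treated purely as a function, so the same loop works for a random forest or a boosted model.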

Digital Lending Case of Dependency Plotting

Let us assume we want to predict the likelihood of a customer repaying a digital loan. The output variable is True/False, where True means the customer is likely to repay on time and False means the customer is not, and there are two predictor variables: age and monthly income change. Please remember that the data used here is randomly generated for the purpose of sharing knowledge. The key question is: what is the effect of a change in income on the customer's ability to repay the loan on time? The following table represents the actual outcomes. We will train a machine learning model that learns the different combinations that determine the ability to repay.



[Table not preserved in this copy; surviving column headers: Income change, Actual Case]


I decided to run two high-performance models, namely Random Forest and XGBoost. Measured by ROC AUC, my models scored about 70%. In a real-world assignment this isn’t good at all, but with only two predictors I would say it is a good start. Month-on-month income change appears to be a stronger predictor than age.

Back to the main analysis, the partial plot. Plotting a partial dependence plot shows some interesting information. As you may know, a partial dependence plot shows the dependence of the predicted response on a single feature. The x axis displays the value of the selected feature, while the y axis displays the partial dependence. The partial dependence value is the amount by which the log-odds are higher or lower than those of the average probability. The log-odds for a probability p are defined as log(p / (1 - p)). They are strictly increasing in p, i.e. higher log-odds mean higher probability. The plot for age tells me that the relationship between age and the ability to repay on time becomes stronger starting at age 45.
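
For readers who want to check the log-odds definition numerically, here is a small sketch. The function names are my own, not from any library, and the probabilities are arbitrary sample values.

```python
import math

def log_odds(p):
    """Log-odds (logit) of a probability p, with 0 < p < 1."""
    return math.log(p / (1 - p))

def probability(z):
    """Inverse of log_odds: map log-odds back to a probability."""
    return 1 / (1 + math.exp(-z))

# Higher probability always gives higher log-odds.
for p in (0.1, 0.5, 0.9):
    print(p, round(log_odds(p), 3))
```

A probability of 0.5 has log-odds of exactly 0, so positive values on the y axis of the plot mean "more likely than even" and negative values mean "less likely than even" relative to the baseline being compared.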

What does that mean? This simply tells us that people of age 45 and above are less risky.

Applying the same method to other variables creates a client profile that can be used to define the qualities of a good customer for your business. Share your feedback.

Predictive analytics for better human resource management

What is our employees' involuntary turnover rate? What is the revenue per employee? What is our absenteeism rate? What is our time to hire? What is the human capital risk?

These are questions that any practicing human resource professional can easily answer from employee data. These and other commonly used metrics focus mainly on reporting. Though they may provide useful information to the department, in a data-driven economy this may be insufficient, since large chunks of data with greater insights remain unutilized in data warehouses. Recent developments in artificial intelligence, machine learning and big data analytics have produced various algorithms which, if well applied in human resource departments, can mine more useful insights. As a result, HR analytics has developed and is being used by organizations to understand and manage their employees effectively.

What is HR analytics?

HR analytics can be simply defined as the application of statistics, modeling, and analysis of employee-related factors to improve business outcomes. This enables the department to attract, manage and retain employees thus significantly improving the return on investment.  Predictive analytics is a branch of advanced analytics that deals with extracting information from data in order to determine patterns and forecast future outcomes and trends with an acceptable level of reliability.

Some applications in HR analytics.

Predictive analytics in HR can answer the following questions based on the current employee data.

  1. What is the probability of an employee leaving the company?
  2. What drives internal innovation?
  3. Which types of employees are at a higher risk of turnover in the future?
  4. Which new hires are likely to be a success?
  5. Which onboarding techniques have a higher retention and engagement rate?
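
As a hedged sketch of how the first question might be approached, the code below fits a tiny logistic regression by gradient descent and returns a probability of leaving. Everything here is an assumption for illustration: the two features (commute distance and years since last promotion), the six made-up employee records, and the learning rate and epoch count. A real project would use an established library and far more data.

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def train_logistic(rows, labels, lr=0.1, epochs=5000):
    """Fit weights and bias with plain batch gradient descent."""
    n_features = len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = sigmoid(score) - y
            grad_w = [g + error * xi for g, xi in zip(grad_w, x)]
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

def leave_probability(employee, weights, bias):
    """Predicted probability that this employee leaves."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, employee)) + bias)

# Made-up records: (commute in tens of km, years since last promotion).
employees = [(0.5, 1), (1.0, 0), (0.8, 2), (3.0, 4), (4.0, 3), (3.5, 5)]
left = [0, 0, 0, 1, 1, 1]  # 1 = the employee left the company

weights, bias = train_logistic(employees, left)

# Score two hypothetical current employees.
print(round(leave_probability((0.6, 1), weights, bias), 2))
print(round(leave_probability((3.8, 4), weights, bias), 2))
```

The output is a probability rather than a yes/no answer, which lets HR rank employees by risk and decide where to focus retention efforts.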

Here are some exciting examples of companies that have benefited from HR analytics.

IBM: Defining successful salespeople

Typically, an outstanding personality is considered a key trait of a successful salesperson. By comparing worker surveys with manager assessments, IBM found that the most salient trait for sales success was emotional courage.

Xerox: Increasing employee retention

The company carried out an analysis of how to retain its customer service employees. It found out that employees who lived nearby and had reliable transportation tended to stick to their jobs. Through the pilot program, the company was able to reduce its attrition rate by 20%.

Royal Dutch Shell: Identifying good idea-generators

The company analyzed a database of ideas generated by its 14,000 employees over several years. The idea generators were later asked to play a video game that was designed by data scientists, neuroscientists and psychologists as a way of testing their human potential.  Shell compared the results of the video game against the real-world results of the ideas generated. This showed the characteristics of employees whose ideas would succeed in the company.

What are the common data sources for the human resource department?

  • Employee surveys
  • Telemetric Data
  • Attendance records
  • Multi-rater reviews
  • Salary and promotion history
  • Employee's work history.
  • Demographic data
  • Personality/temperament data
  • Recruitment process
  • Employee databases


Recent developments in data management have revolutionized the general business environment, with a lot of focus put on the sales, marketing and finance departments. The human resource department can benefit equally from predictive analytics. Though algorithms will not show exactly what will happen, they will provide probabilities of events occurring, whether bad or good. This allows decision-makers to make informed decisions.