Understand cyber crime cases using a confusion matrix.

kundan singh
4 min read · Jun 6, 2021

In this blog, we will first discuss the confusion matrix, and then see how it helps us understand cyber crime cases.

What is a confusion matrix?

A confusion matrix is a summary of prediction results on a classification problem.

The numbers of correct and incorrect predictions are summarized with count values and broken down by class. This is the key to the confusion matrix.

The confusion matrix shows the ways in which your classification model is confused when it makes predictions.

It gives you insight not only into the errors being made by your classifier but more importantly the types of errors that are being made.

It is this breakdown that overcomes the limitation of using classification accuracy alone.
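As a minimal sketch of this idea, the per-class counts can be built by tallying every (actual, predicted) pair. The labels below are made up for illustration, and the helper name `confusion_counts` is hypothetical, not part of any library.

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return Counter(zip(actual, predicted))


# Hypothetical toy labels, made up for illustration only.
actual = ["attack", "attack", "normal", "normal", "normal", "attack"]
predicted = ["attack", "normal", "normal", "attack", "normal", "attack"]

counts = confusion_counts(actual, predicted)

# Each line is one cell of the confusion matrix.
for (a, p), n in sorted(counts.items()):
    print(f"actual={a:<6} predicted={p:<6} count={n}")
```

Pairs where the two labels agree are correct predictions; pairs where they differ show exactly which class was mistaken for which.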

Understanding True Positive, True Negative, False Positive and False Negative in a Confusion Matrix

True Positive (TP)

  • The predicted value matches the actual value
  • The actual value was positive and the model predicted a positive value

True Negative (TN)

  • The predicted value matches the actual value
  • The actual value was negative and the model predicted a negative value

False Positive (FP) — Type 1 error

  • The predicted value does not match the actual value
  • The actual value was negative but the model predicted a positive value
  • Also known as the Type 1 error

False Negative (FN) — Type 2 error

  • The predicted value does not match the actual value
  • The actual value was positive but the model predicted a negative value
  • Also known as the Type 2 error
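The four outcomes above can be sketched as a small function. This is only an illustration: the function name `outcome` and the choice of 1 as the positive label are assumptions, not something the definitions require.

```python
def outcome(actual, predicted, positive=1):
    """Return 'TP', 'TN', 'FP' or 'FN' for one binary prediction.

    `positive` is the label treated as the positive class (assumed to be 1 here).
    """
    if predicted == positive:
        # Model said "positive": correct only if the actual value is positive too.
        return "TP" if actual == positive else "FP"  # FP is the Type 1 error
    # Model said "negative": wrong if the actual value was positive.
    return "FN" if actual == positive else "TN"  # FN is the Type 2 error


# Hypothetical (actual, predicted) pairs, one for each outcome.
for a, p in [(1, 1), (0, 0), (0, 1), (1, 0)]:
    print(f"actual={a} predicted={p} -> {outcome(a, p)}")
```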

What is Cyber Crime?

Cybercrime is defined as a crime where a computer is the object of the crime or is used as a tool to commit an offense. A cybercriminal may use a device to access a user’s personal information, confidential business information, government information, or disable a device. It is also a cybercrime to sell or elicit the above information online.

Cybercrimes can generally be divided into two categories:

  1. Crimes that target networks or devices: Viruses, Malware, DoS Attacks
  2. Crimes using devices to participate in criminal activities: Phishing Emails, Cyberstalking, Identity Theft

DDoS Attacks

These are used to make an online service unavailable and take the network down by overwhelming the site with traffic from a variety of sources. Large networks of infected devices known as Botnets are created by depositing malware on users’ computers. The hacker then hacks into the system once the network is down.

Identity Theft

This cybercrime occurs when a criminal gains access to a user’s personal information to steal funds, access confidential information, or participate in tax or health insurance fraud. They can also open a phone/internet account in your name, use your name to plan a criminal activity, and claim government benefits in your name. They may do this by finding out users’ passwords through hacking, retrieving personal information from social media, or sending phishing emails.

Cyberstalking

This kind of cybercrime involves online harassment where the user is subjected to a plethora of online messages and emails. Typically cyberstalkers use social media, websites and search engines to intimidate a user and instill fear. Usually, the cyberstalker knows their victim and makes the person feel afraid or concerned for their safety.

Phishing

This type of attack involves hackers sending malicious email attachments or URLs to users to gain access to their accounts or computers. Cybercriminals are becoming more established and many of these emails are not flagged as spam. Users are tricked by emails claiming they need to change their password or update their billing information, giving criminals access.

Let me give you an example to better understand this. Suppose we have a classification dataset with 1000 data points of cyber-attack features. We fit a binary classifier (0 or 1: an attack happens or does not happen) and get the confusion matrix below:

The values of the confusion matrix are as follows:

  • True Positive (TP) = 560; meaning 560 positive class data points, where no attack happened, were correctly classified by the model.
  • True Negative (TN) = 330; meaning 330 negative class data points, where an attack happened, were correctly classified by the model.
  • False Positive (FP) = 60; meaning 60 negative class data points were incorrectly classified as belonging to the positive class. An attack actually happened, but the model predicted that no attack happened. This is the most dangerous error, because a real attack goes undetected.
  • False Negative (FN) = 50; meaning 50 positive class data points were incorrectly classified as belonging to the negative class. No attack actually happened, but the model predicted an attack.
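To see what these four counts say about the model, here is a short sketch that derives a few common metrics from them. It follows the example's own labeling, in which the positive class is "no attack" and the negative class is "attack", so the 60 false positives are the missed attacks. The variable names, including `missed_attack_rate`, are chosen here for illustration.

```python
# Counts taken from the example above; they sum to the 1000 data points.
tp, tn, fp, fn = 560, 330, 60, 50
total = tp + tn + fp + fn

# Share of all predictions that were correct.
accuracy = (tp + tn) / total

# Of everything predicted positive ("no attack"), how much really was positive.
precision = tp / (tp + fp)

# Of everything actually positive ("no attack"), how much the model found.
recall = tp / (tp + fn)

# Assumption from the example: the negative class is "attack", so
# tn + fp is the number of real attacks and fp is how many were missed.
missed_attack_rate = fp / (tn + fp)

print(f"accuracy           = {accuracy:.3f}")
print(f"precision          = {precision:.3f}")
print(f"recall             = {recall:.3f}")
print(f"missed attack rate = {missed_attack_rate:.3f}")
```

Accuracy alone comes out at 0.89, which looks good, yet 60 of the 390 real attacks (about 15%) are missed. That gap is the kind of detail the confusion matrix exposes and a single accuracy number hides.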

Being able to see which attacks a model misses and which alarms are false is one of the advantages of machine learning in the real world.

Thanks for reading.
