Crime Data Analysis

Mohammed Yahya
5 min read · Apr 22, 2021

Introduction:

In this analysis I examine crime patterns across the US and share some interesting insights I uncovered from the data. The data is published on the FBI website and is accessible to the public through an API.

US Crime:

Statistics on specific crimes are indexed in the annual Uniform Crime Reports by the Federal Bureau of Investigation (FBI) and by annual National Crime Victimization Surveys by the Bureau of Justice Statistics. In addition to the primary Uniform Crime Report known as Crime in the United States, the FBI publishes annual reports on the status of law enforcement in the United States. The report’s definitions of specific crimes are considered standard by many American law enforcement agencies. According to the FBI, index crime in the United States includes violent crime and property crime. Violent crime consists of four criminal offenses: murder and non-negligent manslaughter, rape, robbery, and aggravated assault; property crime consists of burglary, larceny, motor vehicle theft, and arson.

The basic aspects of a crime are the offender, the victim, the type of crime, its severity and level, and the location. These are the first questions law enforcement asks when investigating any situation. This information is entered into a government record through a police arrest report, also known as an incident report. These forms lay out all the information needed to put the crime in the system, and they provide a strong outline for other law enforcement agents to review. Society often has a misconception about crime rates because media coverage heightens fear. Recorded crime data fluctuates by type of crime depending on societal factors such as economics, the dark figure of crime, population, and geography.

Getting the data

The data was obtained from https://www.fbi.gov/wanted/api. The FBI Wanted API is designed to help developers easily get information on the FBI Wanted program. The API is a simple REST endpoint that accepts query parameters for options and returns application/json responses.

Initial Data Analysis

This section shows how the data is structured, so that readers can follow the rest of the analysis with ease.

A good start is to check the shape of our DataFrame, so we know how many rows and variables we have to work with.

This made it possible to identify the important columns needed for the crime analysis. Since we will be working with numerical data, those variables are typed as float and int.

Variables Dictionary

Let’s look at the variable names that will appear throughout the article and put together a dictionary to clear up any possible doubts.

· crime_place: The place where the crime occurred.

· title: The name or names of the criminal, with their possible first and last names.

· race: The race of the criminal or the victim.

· person_classification: The most important variable, as it identifies whether the person was a victim or the criminal.

· weight: The approximate weight of the criminal or the victim.

· case_details: The details of the case.

· case_description: The date, how the victim died, and where.

Fetching the data

The first step in any data analysis is to fetch the data. Here we fetch it through an API, using the requests library.

Requests allows you to send HTTP/1.1 requests extremely easily. There’s no need to manually add query strings to your URLs or to form-encode your PUT and POST data; nowadays you can simply use the json method.

The output data is in JSON format.

Once parsed, it takes the form of a dictionary of key: value pairs. We can access whichever keys we want, depending on the analysis we are doing and the columns we need.
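As a rough illustration of this step, the sketch below requests one page of results and reads a few keys from the returned dictionary. The endpoint URL, the page parameter, and the items and total keys reflect my understanding of the public Wanted API and are assumptions, not details confirmed in this article. The helper names fetch_wanted_page and item_titles are hypothetical.

```python
import requests

# Assumed endpoint of the FBI Wanted API; check the API page for the current URL.
API_URL = "https://api.fbi.gov/wanted/v1/list"


def fetch_wanted_page(page=1, session=None):
    """Return one page of results as a Python dictionary."""
    http = session or requests
    # `params` lets requests build the query string (?page=1) for us.
    response = http.get(API_URL, params={"page": page}, timeout=30)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()       # parse the JSON body into a dict


def item_titles(payload):
    """Collect the `title` value of every record under the assumed `items` key."""
    return [item.get("title") for item in payload.get("items", [])]


# Example call (performs a real network request):
# data = fetch_wanted_page(page=1)
# print(data.get("total"), item_titles(data)[:5])
```

Passing params as a dictionary keeps the URL clean, and using .get() on the result avoids a KeyError when a record lacks a field.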

Converting JSON to CSV:

Converting the JSON to CSV makes data cleaning easier and puts the data in a more understandable format.

Here I defined a variable named header holding the column names and wrote it to the file with the .writerow function. Then, using a for loop, I iterated through every record and wrote only the values required for those columns.

With this I was able to extract the required data from the JSON and convert it to CSV.
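The header-then-loop approach described above can be sketched with Python's built-in csv module. The column names are the ones from the variables dictionary. The mapping from each column to a JSON key (for example crime_place to field_offices) is a guess for illustration and should be adjusted to the actual payload. to_row and write_csv are hypothetical helper names.

```python
import csv

# Column names used in this article.
HEADER = ["crime_place", "title", "race", "person_classification",
          "weight", "case_details", "case_description"]

# Hypothetical mapping from CSV column to JSON key; adjust to the real payload.
JSON_KEYS = {
    "crime_place": "field_offices",
    "title": "title",
    "race": "race",
    "person_classification": "person_classification",
    "weight": "weight",
    "case_details": "details",
    "case_description": "description",
}


def to_row(item):
    """Pick only the required values from one JSON record, in header order."""
    row = []
    for column in HEADER:
        value = item.get(JSON_KEYS[column])
        # Lists (e.g. several field offices) are flattened into one cell.
        if isinstance(value, list):
            value = "; ".join(str(v) for v in value)
        row.append(value)
    return row


def write_csv(items, path):
    """Write the header row, then one row per record."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(HEADER)
        for item in items:
            writer.writerow(to_row(item))
```

Missing keys become empty cells in the file, which makes gaps easy to spot during cleaning.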

Reading CSV:

Reading the CSV back helps us understand its format and structure and check data quality.

First, I used the pandas library and its pd.read_csv function to read the data in the CSV file.

After analyzing the dataset, I figured out which columns are important for further analysis and which are not.
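A minimal sketch of this reading step follows. It loads a tiny in-memory CSV with placeholder rows (not real records) so that it runs on its own; in practice you would pass the path of the CSV file written earlier. load_crime_csv is a hypothetical helper name.

```python
import io

import pandas as pd

# Placeholder rows for illustration only, not real records.
SAMPLE_CSV = """crime_place,title,race,person_classification,weight
city_a,PLACEHOLDER ONE,group_x,criminal,180
city_b,PLACEHOLDER TWO,group_y,victim,
"""


def load_crime_csv(source, columns=None):
    """Read the CSV and optionally keep only the columns needed later."""
    df = pd.read_csv(source)
    if columns is not None:
        df = df[columns]
    return df


df = load_crime_csv(
    io.StringIO(SAMPLE_CSV),
    columns=["crime_place", "race", "person_classification", "weight"],
)
print(df.shape)         # (rows, columns)
print(df.dtypes)        # type pandas inferred for each column
print(df.isna().sum())  # missing values per column, a quick quality check
```

Checking shape, dtypes, and missing values right after loading is a quick way to decide which columns are worth keeping.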

Visualizing:

Visualization helps us gain insights from the data and understand the nature of crime and the trends and patterns in it.

With the help of matplotlib.pyplot I plotted “crime_place” against “race”. From the graph I could see that Native Americans were the group with the highest crime count in this dataset, and that Albuquerque was the place with the highest crime count.
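One possible way to draw such a chart, not necessarily the exact plot used here, is to count records per place and race and draw grouped bars on a matplotlib axis. The rows below are placeholders for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # render without a display; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder rows for illustration only.
df = pd.DataFrame({
    "crime_place": ["city_a", "city_a", "city_b", "city_a"],
    "race": ["group_x", "group_y", "group_x", "group_x"],
})

# Count records for every (crime_place, race) pair.
counts = pd.crosstab(df["crime_place"], df["race"])

fig, ax = plt.subplots(figsize=(8, 4))
counts.plot(kind="bar", ax=ax)  # one cluster of bars per place, one bar per race
ax.set_xlabel("crime_place")
ax.set_ylabel("number of records")
ax.legend(title="race")
fig.tight_layout()
# fig.savefig("crime_place_by_race.png")  # or plt.show() when working interactively
```

Printing counts alongside the chart makes it easy to confirm which place and group have the tallest bars.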

Using seaborn’s countplot I plotted the person_classification variable to compare the number of criminals and victims. I found that there were more criminals than victims.
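A count plot of this kind can be sketched as follows. The labels "criminal" and "victim" are placeholders, since the actual values stored in person_classification are not listed in this article.

```python
import matplotlib
matplotlib.use("Agg")  # render without a display; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Placeholder rows for illustration only.
df = pd.DataFrame(
    {"person_classification": ["criminal", "criminal", "victim", "criminal"]}
)

fig, ax = plt.subplots()
# countplot draws one bar per category, with height equal to the number of rows.
sns.countplot(data=df, x="person_classification", ax=ax)
ax.set_ylabel("number of records")

# The same numbers as a table, useful for checking the chart.
class_counts = df["person_classification"].value_counts()
print(class_counts)
```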

Conclusion:

With the help of an API I was able to extract the dataset and get some insights from the crime data. Through this analysis I identified the places where crime occurs the most in this dataset and found that Native Americans were the group most often involved.
