Network Analysis - Eurovision Dataset

Mohammed Yahya
Apr 26, 2021

Introduction
I chose the Eurovision dataset because it contains metadata, contest rankings and voting data for the songs that have competed in the Eurovision Song Contest.
The dataset is updated every year with the results of the contest. Using it, I set out to build a network analysis to understand the competition between countries and their participation in the contest.

Motivation
Because the dataset contains data on many countries and the votes between them, it is well suited to being described and visualized as a network graph.

Source of the dataset
I took the dataset from this website: https://eurovision.tv/story/the-results-eurovision-2018-dive-into-numbers, fetching it through an API from a Jupyter notebook.

About Eurovision
Each participating broadcaster, representing its country, chooses its performer (maximum 6 people) and song (maximum 3 minutes, not released before) through a national televised selection or an internal selection. Each country is free to decide whether to send its number-one star or the best new talent it can find. They have to do so before mid-March, the official deadline for sending entries. The winner of the Eurovision Song Contest is chosen through 2 Semi-Finals and a Grand Final.
Traditionally, 6 countries are automatically pre-qualified for the Grand Final: the so-called "Big 5" (France, Germany, Italy, Spain and the United Kingdom) and the host country.
The remaining countries take part in one of the 2 Semi-Finals. From each Semi-Final, the best 10 proceed to the Grand Final. This brings the total number of Grand Final participants to 26. Each act must sing live, while no live instruments are allowed.
After all songs have been performed, each country gives two sets of 1 to 8, 10 and 12 points: one set given by a jury of 5 music industry professionals, and one set given by viewers at home. Viewers can vote by telephone, SMS and through the official app.

Basics required
Before heading into our dataset, let's first discuss the basics required to build the network graph. First we need to understand nodes and edges.

Nodes usually represent entities in the network. They can hold their own properties (such as weight, size, position or any other attribute) and network-based properties (such as degree, the number of neighbors, or cluster, the connected component the node belongs to).
Edges represent the connections between nodes and can also hold properties (such as a weight representing the strength of the connection, a direction in case of an asymmetric relation, or a time if applicable).
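
To make nodes and edges concrete, here is a minimal sketch using the networkx package that is installed later in this article. The countries, the "region" attribute and the point value are made up for illustration and are not taken from the dataset.

```python
import networkx as nx

# A directed graph: an edge from A to B means "A gave points to B".
G = nx.DiGraph()

# Nodes can carry their own attributes ("region" is a hypothetical example).
G.add_node("Sweden", region="Nordic")
G.add_node("Israel")

# Edges can carry attributes too; here the weight is the number of points.
G.add_edge("Sweden", "Israel", weight=12)

print(G.nodes["Sweden"])            # attributes stored on the node
print(G.edges["Sweden", "Israel"])  # attributes stored on the edge
print(G.degree("Israel"))           # network-based property: number of connections
```

Because the graph is directed, the edge from Sweden to Israel does not imply an edge from Israel to Sweden.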

Centrality Measures
Highly central nodes play a key role in a network, serving as hubs for various network dynamics. However, the definition and importance of centrality differ from case to case and may call for different centrality measures:
· Degree: the number of neighbors of the node
· Eigenvector / PageRank: iterative circles of neighbors, so a node scores higher when its neighbors are themselves well connected
· Closeness: how close the node is to all other nodes
· Betweenness: the number of shortest paths passing through the node
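
The measures above can be compared on a small example. This is only a sketch on a made-up undirected toy graph (a triangle A-B-C with a tail C-D-E), not on the Eurovision data. It uses eigenvector centrality rather than PageRank for the second measure.

```python
import networkx as nx

# Toy graph: a triangle A-B-C with a tail C-D-E (made-up, for illustration only).
G = nx.Graph()
G.add_edges_from([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")])

degree = nx.degree_centrality(G)            # share of other nodes each node touches
eigenvector = nx.eigenvector_centrality(G, max_iter=1000)  # weight of well-connected neighbors
closeness = nx.closeness_centrality(G)      # inverse of the average distance to others
betweenness = nx.betweenness_centrality(G)  # share of shortest paths through the node

for name, scores in [
    ("degree", degree),
    ("eigenvector", eigenvector),
    ("closeness", closeness),
    ("betweenness", betweenness),
]:
    top = max(scores, key=scores.get)
    print(f"{name:12s} most central node: {top}")
```

In this toy graph node C connects the triangle to the tail, so it comes out on top for every measure. On real data the measures often disagree, which is why the choice of measure matters.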

Building a Network
We start building the network by fetching the data from the API with Python in a Jupyter notebook.
In the following sections we'll build and visualize the Eurovision network with the Python networkx package.

Installation

The first step is to install the prerequisites required to build the network graph. We need to install the networkx library using pip:

pip install networkx

Run in a Jupyter notebook, the command above installs the networkx library (in a notebook cell it is typically prefixed with ! or %).

Importing

Once the library is installed, we import it with Python's import statement:

import networkx as nx

Fetching the data:
The first step in any data analysis is to fetch the data. Here we fetch it through an API using the requests library.
Requests lets you send HTTP/1.1 requests with very little code. There is no need to manually add query strings to your URLs or to form-encode your PUT and POST data; these days you can simply use the json method. The API returns its data in JSON format.
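
A minimal sketch of the fetching step could look like the following. The article does not give the actual API endpoint, so API_URL is a placeholder. The fetch_votes helper and its get parameter are hypothetical names introduced here for illustration.

```python
import requests

# Placeholder endpoint: the real API URL is not given in this article.
API_URL = "https://example.com/eurovision-2018-votes.json"


def fetch_votes(url, get=requests.get, timeout=10):
    """Fetch a JSON document over HTTP and return it as Python objects.

    `get` defaults to requests.get; it is a parameter only so the function
    can be tried out without network access.
    """
    response = get(url, timeout=timeout)
    response.raise_for_status()  # raise an exception on HTTP error codes
    return response.json()       # decode the JSON body into dicts and lists


# Usage (not executed here because API_URL is only a placeholder):
# votes_json = fetch_votes(API_URL)
```

Calling raise_for_status() before decoding means a failed request stops the notebook early instead of producing a confusing error further down.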

Reading the data:
After fetching the dataset through the API, we need to read it to understand how the data is structured and what its features mean with respect to Eurovision.
Since each row holds all of the votes for one country, we'll melt the dataset to make sure that every row represents one vote (edge) between two countries (nodes). Then we can start building a directed graph using networkx.

Reading CSV:

After converting the JSON to a CSV file, we use pd.read_csv to inspect the data inside the CSV file, since the raw JSON is not easy to read. We can pick only the columns required for the analysis. This also lets us check data quality by looking for null values.
So first I used the pandas library, calling pd.read_csv to read the data in the CSV file.
The votes_data dataframe contains the number of points each country received from the other countries. We then need to transform it into an edge list of votes with the melt transformation.
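
The reading and melting steps might be sketched as follows. The CSV content is a made-up three-country toy table, not real contest results. The column names "Country", "Source Country" and "points" are assumptions about the layout.

```python
import io

import pandas as pd

# Made-up toy CSV (not real results). Each row is a receiving country and each
# remaining column holds the points awarded by one voting country.
csv_text = """Country,Sweden,Israel,Cyprus
Sweden,,8,10
Israel,12,,12
Cyprus,10,12,
"""

# With a real file this would be pd.read_csv("some_file.csv").
votes_data = pd.read_csv(io.StringIO(csv_text))

# Data-quality check: count the null values in every column.
missing = votes_data.isnull().sum()
print(missing)

# Melt the wide table into an edge list: one row per (voter, receiver) pair.
votes_melted = (
    votes_data.melt(id_vars=["Country"], var_name="Source Country", value_name="points")
    .dropna(subset=["points"])  # a country cannot vote for itself
    .reset_index(drop=True)
)
print(votes_melted)
```

After melting, every row describes a single vote, which is the shape networkx expects for an edge list.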

Building a NetworkX graph:

NetworkX is a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks.

Let’s build a directed, weighted networkx graph from the edgelist in votes_melted:
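
The original code for this step is not shown, so here is a sketch under stated assumptions: votes_melted has the columns "Source Country" (the voter), "Country" (the receiver) and "points". The rows are made-up toy values.

```python
import networkx as nx
import pandas as pd

# Made-up toy edge list standing in for the melted votes table.
votes_melted = pd.DataFrame(
    {
        "Source Country": ["Sweden", "Sweden", "Israel", "Israel", "Cyprus", "Cyprus"],
        "Country": ["Israel", "Cyprus", "Cyprus", "Sweden", "Israel", "Sweden"],
        "points": [12, 10, 12, 8, 12, 10],
    }
)

# Directed: edges run from the voting country to the receiving country.
# Weighted: the "points" column is stored as an attribute on every edge.
G = nx.from_pandas_edgelist(
    votes_melted,
    source="Source Country",
    target="Country",
    edge_attr="points",
    create_using=nx.DiGraph(),
)

print(G.number_of_nodes(), "countries,", G.number_of_edges(), "votes")
```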

Visualizing:

Visualization helps us gain insights from the data and understand its nature, trends and patterns.
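
One possible way to draw the graph is with matplotlib, which is an assumption here since this article does not name a plotting library. The graph below is a made-up toy example, and the layout, colors and sizes are arbitrary choices.

```python
import matplotlib.pyplot as plt
import networkx as nx

# Made-up toy votes: (voter, receiver, points).
G = nx.DiGraph()
G.add_weighted_edges_from(
    [
        ("Sweden", "Israel", 12),
        ("Sweden", "Cyprus", 10),
        ("Israel", "Cyprus", 12),
        ("Israel", "Sweden", 8),
        ("Cyprus", "Israel", 12),
        ("Cyprus", "Sweden", 10),
    ],
    weight="points",
)

# A fixed seed keeps the layout reproducible between runs.
pos = nx.spring_layout(G, seed=42)

# Thicker arrows for votes that carry more points.
widths = [G[u][v]["points"] / 4 for u, v in G.edges()]

fig, ax = plt.subplots(figsize=(6, 6))
nx.draw_networkx(
    G,
    pos,
    ax=ax,
    node_color="lightblue",
    node_size=1500,
    width=widths,
    arrows=True,
    connectionstyle="arc3,rad=0.1",  # curve edges so A->B and B->A do not overlap
)
ax.set_axis_off()
# In a notebook the figure renders inline; in a script, call plt.show().
```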

Checking the out_degree:

G.out_degree returns an OutDegreeView of (node, out_degree) pairs.

The node out_degree is the number of edges pointing out of the node. The weighted out-degree is the sum of the weights of those outgoing edges.
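
As a small sketch, both the plain and the weighted out-degree can be read from a toy graph. The votes are made up, and "points" is an assumed name for the edge attribute.

```python
import networkx as nx

# Made-up toy votes: (voter, receiver, points).
G = nx.DiGraph()
G.add_weighted_edges_from(
    [
        ("Sweden", "Israel", 12),
        ("Sweden", "Cyprus", 10),
        ("Israel", "Cyprus", 12),
        ("Israel", "Sweden", 8),
        ("Cyprus", "Israel", 12),
        ("Cyprus", "Sweden", 10),
    ],
    weight="points",
)

# Number of countries each country voted for.
votes_cast = dict(G.out_degree())

# Total points each country handed out.
points_given = dict(G.out_degree(weight="points"))

print(votes_cast)
print(points_given)
```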

However, the in-degree is what determines the victory:

G.in_degree returns an InDegreeView of (node, in_degree) pairs, or the in_degree of a single node.

The node in_degree is the number of edges pointing to the node. The weighted in-degree is the sum of the weights of those incoming edges, which here is the total number of points a country received.
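
Ranking countries by weighted in-degree could be sketched like this. The votes are made-up toy values, so the "winner" below says nothing about the real contest.

```python
import networkx as nx

# Made-up toy votes: (voter, receiver, points).
G = nx.DiGraph()
G.add_weighted_edges_from(
    [
        ("Sweden", "Israel", 12),
        ("Sweden", "Cyprus", 10),
        ("Israel", "Cyprus", 12),
        ("Israel", "Sweden", 8),
        ("Cyprus", "Israel", 12),
        ("Cyprus", "Sweden", 10),
    ],
    weight="points",
)

# Total points received by each country (weighted in-degree).
points_received = dict(G.in_degree(weight="points"))

# Sort from most to fewest points; the first entry is the winner.
ranking = sorted(points_received.items(), key=lambda item: item[1], reverse=True)
winner = ranking[0][0]

print(ranking)
print("Winner in this toy example:", winner)
```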

Examining elements of a graph:

We can examine the nodes and edges. Four basic graph properties facilitate reporting: G.nodes, G.edges, G.adj and G.degree. These are set-like views of the nodes, edges, neighbors (adjacencies), and degrees of nodes in a graph. They offer a continually updated read-only view into the graph structure. They are also dict-like in that you can look up node and edge data attributes via the views and iterate with data attributes using the methods .items() and .data('span'). If you want a specific container type instead of a view, you can specify one. Here we use lists, though sets, dicts, tuples and other containers may be better in other contexts.
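
The four views can be tried on a toy graph as follows. The votes are made up, and "points" is an assumed name for the edge attribute.

```python
import networkx as nx

# Made-up toy votes: (voter, receiver, points).
G = nx.DiGraph()
G.add_weighted_edges_from(
    [
        ("Sweden", "Israel", 12),
        ("Sweden", "Cyprus", 10),
        ("Israel", "Cyprus", 12),
        ("Israel", "Sweden", 8),
        ("Cyprus", "Israel", 12),
        ("Cyprus", "Sweden", 10),
    ],
    weight="points",
)

nodes = list(G.nodes)                       # all countries
edges = list(G.edges)                       # (voter, receiver) pairs
israel_voted_for = list(G.adj["Israel"])    # in a DiGraph, adj holds successors
degrees = dict(G.degree)                    # in-degree plus out-degree per node
edge_points = list(G.edges.data("points"))  # (voter, receiver, points) triples

print(nodes)
print(edges)
print(israel_voted_for)
print(degrees)
print(edge_points)
```

Converting a view to a list or dict takes a snapshot, while the view itself keeps tracking the graph as it changes.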

Conclusion:
This network analysis showed me why the technique matters: it helps us understand the relationships between entities in the form of a network structure. Network analysis is a complex and useful tool for various domains, especially within the rapidly growing social networks. Using network analysis, we were able to fetch the Eurovision dataset through an API and then analyze it. Through this dataset I was able to analyze the country-wise vote counts and the relationships between the countries.

References:

  1. https://eurovision.tv/about/how-it-works - About Eurovision
  2. https://networkx.org/ - About Network analysis
  3. https://towardsdatascience.com/social-network-analysis-from-theory-to-applications-with-python-d12e9a34c2c7 - About nodes and edges
  4. Eurovision Song Contest 2017: 5 of the best places to watch in the UK | HELLO! - Image
