Welcome to my Data Analysis page
I obtained this data set from Kaggle as a CSV file. It consists of 9 columns and 122 rows of values. It comes from a survey done in 2016 in which participants from countries across the globe were asked to rate their happiness. For each country, this score was tabulated along with the HDI (Human Development Index, which according to the United Nations Development Programme website is a summary measure of average achievement in key dimensions of human development: a long and healthy life, being knowledgeable and having a decent standard of living), the GDP (Gross Domestic Product), Beer, Spirit and Wine Per Capita values. The countries were also grouped by region and hemisphere.
Inspection of the raw data revealed that column C contained the misspelt value 'noth' in rows 48, 53, 72 and 86. These were changed to 'both' for the later analysis.
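The same correction could be scripted rather than made by hand. The following is only a minimal sketch: the column name "Hemisphere" is an assumption (the text only says "column C"), and the sample rows are made up for illustration, not taken from the survey.

```python
def fix_misspelt_value(rows, column="Hemisphere", wrong="noth", right="both"):
    """Return copies of the rows with the misspelt value replaced.

    The column name is an assumption; the write-up only calls it column C.
    """
    cleaned = []
    for row in rows:
        row = dict(row)  # copy so the raw data stays untouched
        if row[column].strip().lower() == wrong:
            row[column] = right
        cleaned.append(row)
    return cleaned


# Hypothetical sample rows, not values from the actual data set.
sample = [
    {"Country": "A", "Hemisphere": "north"},
    {"Country": "B", "Hemisphere": "noth"},
    {"Country": "C", "Hemisphere": "both"},
]

cleaned = fix_misspelt_value(sample)
print([row["Hemisphere"] for row in cleaned])
```

Working on copies keeps the raw rows available, so the correction can be checked against the original file afterwards.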
Organizing the data into a pivot table yielded the following:
This results in a strong linear correlation, with an r value of 0.815. An individual's happiness seems directly related to the HDI of their country.
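For readers who want to see how an r value like this is calculated, here is a small sketch of the Pearson correlation coefficient in plain Python. The function name and the example numbers are made up for illustration; they are not the survey values and will not reproduce 0.815.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance up to a constant factor).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: this divides by zero if either sequence is constant.
    return cov / math.sqrt(var_x * var_y)


# Hypothetical HDI and happiness values, for illustration only.
hdi = [0.50, 0.65, 0.80, 0.90]
happiness = [4.0, 5.0, 6.5, 7.0]
print(round(pearson_r(hdi, happiness), 3))
```

Values of r close to 1 or -1 indicate a strong linear relationship, and values close to 0 indicate a weak one.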
With this graph, several outlier values appear.
I filtered out the GDP values above 200, which are shown in red below, and graphed the remaining values. These outlier GDP values are probably a false indication of the state of the economy of the respective countries.
The values in red are GDP values > 200 and were filtered out
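The filtering step could be sketched as follows. The threshold of 200 comes from the text above; the column name "GDP_PerCapita" and the sample rows are assumptions made for illustration only.

```python
def split_by_gdp(rows, threshold=200.0, column="GDP_PerCapita"):
    """Split rows into those kept (GDP <= threshold) and the outliers above it.

    The column name is an assumption, not confirmed by the write-up.
    """
    kept = [row for row in rows if float(row[column]) <= threshold]
    outliers = [row for row in rows if float(row[column]) > threshold]
    return kept, outliers


# Hypothetical sample rows, not values from the actual data set.
sample = [
    {"Country": "A", "GDP_PerCapita": "53.5"},
    {"Country": "B", "GDP_PerCapita": "953"},
    {"Country": "C", "GDP_PerCapita": "200"},
]

kept, outliers = split_by_gdp(sample)
print([row["Country"] for row in kept])      # rows used for the new graph
print([row["Country"] for row in outliers])  # rows shown in red
```

Returning the outliers as a separate list, instead of discarding them, makes it easy to plot them in a different colour, as in the graph described here.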
I obtained a graph with a much more clearly defined linear regression pattern. The r value is 0.707, indicating a strong linear relationship between Happiness and GDP per Capita.
There is a moderate linear correlation between Happiness and Alcohol per Capita, with an r value of 0.547.
For Beer per Capita, an r value of 0.493 is obtained.
For Spirit per Capita, an r value of 0.256 is obtained.
For Wine per Capita, an r value of 0.450 is obtained.
It seems as though beer consumption is the most strongly correlated with a person's happiness and spirits the least.