R Visualization

Daniel Tafmizi

Dr. Friedman

Lis 4370

Module 9

GitHub: daniel.R/Work.R/LIS4370Rprog/wineClusterViz.R at main · DanielDataGit/daniel.R

I ran a clustering analysis on the wine dataset, which includes country, alcohol consumption in liters of wine, deaths per 100,000, heart disease per 100,000, and liver disease per 100,000.


Several correlations stand out. Unsurprisingly, heart disease and deaths are positively correlated. Alcohol and liver disease are also positively correlated. Interestingly, heart disease and alcohol are negatively correlated.
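A minimal sketch of how pairwise correlations like these could be checked in R. The data frame here is simulated stand-in data with assumed column names (alcohol, deaths, heart, liver), not the actual wine dataset, so the signs it prints will not match the ones described above.

```r
# Simulated stand-in data (hypothetical values, NOT the real wine dataset).
set.seed(1)
n <- 20
wine <- data.frame(
  alcohol = runif(n, 1, 10),      # liters of wine (assumed column name)
  deaths  = rnorm(n, 800, 100),   # deaths per 100,000
  heart   = rnorm(n, 200, 50),    # heart disease per 100,000
  liver   = rnorm(n, 20, 8),      # liver disease per 100,000
  row.names = paste("Country", seq_len(n))
)

# Pearson correlation between every pair of numeric columns.
cor_mat <- cor(wine)
print(round(cor_mat, 2))
```

With the real data, a positive entry for heart and deaths and a negative entry for heart and alcohol would correspond to the relationships noted above.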


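I chose k-means rather than kNN. K-means is an unsupervised clustering method, while kNN is a supervised classifier that needs labeled data, and k-means is also simple to run on a small dataset like this one. I think the algorithm did a great job of creating clusters. I found that three clusters gave the most uniform groups. It is interesting to see where each country ends up on the map.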
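A rough sketch of a three-cluster k-means run in R, again on simulated stand-in data with assumed column names rather than the real wine dataset. Scaling the columns first and using nstart = 25 are my assumptions for the illustration, not necessarily what the original script does.

```r
# Simulated stand-in data (hypothetical values, NOT the real wine dataset).
set.seed(42)
n <- 20
wine <- data.frame(
  alcohol = runif(n, 1, 10),
  deaths  = rnorm(n, 800, 100),
  heart   = rnorm(n, 200, 50),
  liver   = rnorm(n, 20, 8),
  row.names = paste("Country", seq_len(n))
)

# Standardize so no single variable dominates the distance calculation.
scaled <- scale(wine)

# k-means with three clusters; nstart = 25 tries several random starts.
fit <- kmeans(scaled, centers = 3, nstart = 25)
print(table(fit$cluster))

# Total within-cluster sum of squares for k = 1..6, one way to compare
# how tight the clusters are for different choices of k.
wss <- sapply(1:6, function(k) {
  kmeans(scaled, centers = k, nstart = 25)$tot.withinss
})
print(round(wss, 1))
```

Plotting wss against k and looking for the bend is one common way to back up a choice such as three clusters.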




I thought this was a really cool visualization. It combines a dendrogram with a heatmap, showing the clusters in a different style. The dendrogram shows five clusters in the hierarchy, and the heatmap lets us see where the differences lie. The first cluster has high heart disease and deaths, and low alcohol and liver disease. The second is the opposite of the first. The third is average overall but leans towards deaths and heart disease. The fourth has a spike in heart disease but is low on the others. The fifth has fairly low values for all variables.
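A sketch of how a heatmap with a dendrogram can be drawn with base R. This uses hclust, cutree, and heatmap from the stats package as an assumption; the original plot may have been made with a different package. The data is simulated stand-in data, not the real wine dataset.

```r
# Simulated stand-in data (hypothetical values, NOT the real wine dataset).
set.seed(7)
n <- 20
wine <- data.frame(
  alcohol = runif(n, 1, 10),
  deaths  = rnorm(n, 800, 100),
  heart   = rnorm(n, 200, 50),
  liver   = rnorm(n, 20, 8),
  row.names = paste("Country", seq_len(n))
)

# Standardize the columns so the heatmap colors are comparable.
m <- scale(as.matrix(wine))

# Hierarchical clustering of countries on Euclidean distance.
hc <- hclust(dist(m), method = "complete")

# Cut the tree into five groups, matching the five clusters described above.
groups <- cutree(hc, k = 5)
print(table(groups))

# Heatmap with the row dendrogram attached; scale = "none" because
# the matrix is already standardized.
heatmap(m, Rowv = as.dendrogram(hc), scale = "none", margins = c(6, 8))
```

Each row is a country and each column a variable, so a block of similar colors under one branch of the dendrogram is what marks a cluster.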

