Unit 7 Visualization
This week mainly covers data visualization.
Course link:
[https://www.edx.org/course/the-analytics-edge]
setwd("E:\\The Analytics Edge\\Unit 7 Visualization")
Read the data
WHO = read.csv("WHO.csv")
str(WHO)
'data.frame': 194 obs. of 13 variables:
$ Country : Factor w/ 194 levels "Afghanistan",..: 1 2 3 4 5 6 7 8 9 10 ...
$ Region : Factor w/ 6 levels "Africa","Americas",..: 3 4 1 4 1 2 2 4 6 4 ...
$ Population : int 29825 3162 38482 78 20821 89 41087 2969 23050 8464 ...
$ Under15 : num 47.4 21.3 27.4 15.2 47.6 ...
$ Over60 : num 3.82 14.93 7.17 22.86 3.84 ...
$ FertilityRate : num 5.4 1.75 2.83 NA 6.1 2.12 2.2 1.74 1.89 1.44 ...
$ LifeExpectancy : int 60 74 73 82 51 75 76 71 82 81 ...
$ ChildMortality : num 98.5 16.7 20 3.2 163.5 ...
$ CellularSubscribers : num 54.3 96.4 99 75.5 48.4 ...
$ LiteracyRate : num NA NA NA NA 70.1 99 97.8 99.6 NA NA ...
$ GNI : num 1140 8820 8310 NA 5230 ...
$ PrimarySchoolEnrollmentMale : num NA NA 98.2 78.4 93.1 91.1 NA NA 96.9 NA ...
$ PrimarySchoolEnrollmentFemale: num NA NA 96.4 79.4 78.2 84.5 NA NA 97.5 NA ...
Plot it the way we did before, with base R graphics
plot(WHO$GNI, WHO$FertilityRate)
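As you can see, the plot above is fairly crude. Next we use the ggplot2 library. My understanding is that ggplot takes an object-oriented approach: a plot is an object that holds the data and the aesthetic mapping, and geometries are added to it as layers.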
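To make the "plot as an object" idea concrete, here is a minimal sketch. It does not read WHO.csv; instead it uses rows 1, 2, 3 and 5 as they are displayed in the str() output above, and the names toy, base_plot and points_plot are made up for illustration.

```r
library(ggplot2)

# Small stand-in for WHO.csv: rows 1, 2, 3 and 5 as shown in the str() output
toy = data.frame(GNI = c(1140, 8820, 8310, 5230),
                 FertilityRate = c(5.4, 1.75, 2.83, 6.1))

# The base object stores the data and the aesthetic mapping, but draws nothing yet
base_plot = ggplot(toy, aes(x = GNI, y = FertilityRate))

# Adding a geometry returns a new plot object; base_plot can be reused as-is
points_plot = base_plot + geom_point()
line_plot = base_plot + geom_line()

# Nothing is drawn until the object is printed
print(points_plot)
print(line_plot)
```

Because base_plot is reused, swapping geom_point() for geom_line() only changes the layer, not the data or the axes.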
library(ggplot2)
scatterplot = ggplot(WHO, aes(x = GNI, y = FertilityRate))
Add the geom_point geometry
scatterplot + geom_point()
Make a line graph
scatterplot + geom_line()
Switch back to our points
scatterplot + geom_point()
Redo the plot with a different color, size, and point shape
scatterplot + geom_point(color = "blue", size = 3, shape = 17)
scatterplot + geom_point(color = "darkred", size = 3, shape = 8)
Add a title to the plot
scatterplot + geom_point(colour = "blue", size = 3, shape = 17) + ggtitle("Fertility Rate vs. Gross National Income")
Color the points by region
ggplot(WHO, aes(x = GNI, y = FertilityRate, color = Region)) + geom_point()
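The difference between this plot and the earlier blue-triangle one comes down to where color is given. The sketch below contrasts the two, assuming a toy data frame: the GNI and FertilityRate values are rows 1, 2, 3 and 5 from the str() output, while the Group column and its labels are hypothetical stand-ins for Region.

```r
library(ggplot2)

# Toy data; Group is a made-up label column standing in for Region
toy = data.frame(GNI = c(1140, 8820, 8310, 5230),
                 FertilityRate = c(5.4, 1.75, 2.83, 6.1),
                 Group = c("A", "B", "A", "B"))

# Setting: a constant passed outside aes() applies the same color to every point
fixed_color = ggplot(toy, aes(x = GNI, y = FertilityRate)) +
  geom_point(color = "blue")

# Mapping: a column named inside aes() gets one color per value, plus a legend
mapped_color = ggplot(toy, aes(x = GNI, y = FertilityRate, color = Group)) +
  geom_point()

print(fixed_color)
print(mapped_color)
```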
Is a country's fertility rate a good predictor of the percentage of the population under 15?
ggplot(WHO, aes(x = log(FertilityRate), y = Under15)) + geom_point()
Add a linear regression line to the plot:
ggplot(WHO, aes(x = log(FertilityRate), y = Under15)) + geom_point() + stat_smooth(method = "lm")
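With method = "lm", stat_smooth fits a straight line of y on x, so the line drawn above should correspond to a linear model of Under15 on log(FertilityRate). The following sketch fits that model directly in base R. It assumes a five-row stand-in for WHO.csv built from the first five values shown in the str() output (rounded as displayed there), so the coefficients are illustrative only and will not match the full data set.

```r
# First five rows as displayed in the str() output, including the missing value
toy = data.frame(FertilityRate = c(5.4, 1.75, 2.83, NA, 6.1),
                 Under15 = c(47.4, 21.3, 27.4, 15.2, 47.6))

# Same relationship the plot shows: Under15 against the log of FertilityRate.
# lm() drops rows with missing values by default.
model = lm(Under15 ~ log(FertilityRate), data = toy)

# Intercept and slope of the fitted line
coef(model)

# Number of rows actually used in the fit
nobs(model)
```

On the real data, summary(model) would also report how well the line fits.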
Unless otherwise stated, all posts on this blog are licensed under CC BY-NC-SA 4.0. Please credit Doraemonzzz when reposting!