
Network Visualization with ggplot2 (Part 2)

source link: http://www.10tiao.com/html/404/201806/2651058175/2.html
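About the author

Author: Wu Jian, University of Chinese Academy of Sciences. An R and statistics enthusiast who mainly writes about applying R and ArcGIS in ecology. Personal WeChat official account: 统计与编程语言 (Statistics and Programming Languages)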


Previous posts:

Drawing bar charts in R

Network Visualization with ggplot2 (Part 1)


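This post visualizes the email traffic among a company's employees as a network, in order to see who exchanges email with whom. The dataset comes from the 2014 VAST Challenge (Cook et al. 2014).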

1. Load the packages:

library(ggplot2)
library(GGally)
library(geomnet)
library(ggnetwork)
library(network)

2. Load the data:

data(email, package = "geomnet")  # the network has 55 vertices and 4743 edges


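3. Prepare the data:

# Reorder the columns so that columns 1 and 5 come first
email$edges <- email$edges[, c(1, 5, 2:4, 6:9)]

# Build the plotting data set, excluding mass mailings
emailnet <- fortify(
  as.edgedf(subset(email$edges, nrecipients < 54)),
  email$nodes)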
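The filter `nrecipients < 54` is what removes the mass mailings: with 55 employees, an email with 54 recipients presumably went to everyone else in the company. A minimal sketch of the same filter on a made-up edge list (the column names follow the code above; the values are hypothetical, not the VAST Challenge data):

```r
# Toy edge list; values are invented for illustration only.
edges <- data.frame(
  From = c("a", "a", "b", "c"),
  to = c("b", "c", "c", "a"),
  nrecipients = c(1, 54, 2, 54),
  stringsAsFactors = FALSE
)

# Keep only the rows for emails sent to fewer than 54 recipients.
targeted <- subset(edges, nrecipients < 54)
print(targeted)

# Number of mass-mailing rows that were dropped.
print(nrow(edges) - nrow(targeted))
```

Dropping these rows keeps company-wide announcements from connecting every pair of vertices and hiding the department structure.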


4. Plot the data:

set.seed(10312016)
ggplot(data = emailnet,
       aes(from_id = from_id, to_id = to_id)) +
  geom_net(layout.alg = "fruchtermanreingold",
           aes(colour = CurrentEmploymentType,
               group = CurrentEmploymentType,
               linewidth = 3 * (..samegroup.. / 8 + .125)),
           ealpha = 0.25, size = 4, curvature = 0.05,
           directed = TRUE, arrowsize = 0.5) +
  scale_colour_brewer("Employment Type", palette = "Set1") +
  theme_net() +
  theme(legend.position = "bottom")


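The plot above shows that employees in the same department email one another frequently, while only a few individuals exchange email across departments. Next, we check whether this pattern changes over time.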
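The visual impression that most email stays inside a department can also be checked numerically. The sketch below is not part of the original analysis; it uses a hypothetical four-person network (the names `nodes`, `type`, `from`, and `to` are assumptions for illustration) to compute the share of emails whose sender and recipient belong to the same group:

```r
# Hypothetical nodes and edges; not the VAST Challenge data.
nodes <- data.frame(
  label = c("a", "b", "c", "d"),
  type = c("Engineering", "Engineering", "Security", "Security"),
  stringsAsFactors = FALSE
)
edges <- data.frame(
  from = c("a", "b", "c", "d", "a"),
  to = c("b", "a", "d", "c", "c"),
  stringsAsFactors = FALSE
)

# Named lookup vector: node label -> department.
type_of <- setNames(nodes$type, nodes$label)

# TRUE where sender and recipient share a department.
same_group <- type_of[edges$from] == type_of[edges$to]

# Share of emails that stay inside one department.
within_share <- mean(same_group)
print(within_share)
```

A share close to 1 would support the reading of the plot; a share near the value expected under random mixing would not.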


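5. Prepare the data:

edges <- subset(email$edges, nrecipients < 54)  # exclude mass mailings
edges <- edges[, c("From", "to", "day")]        # keep sender, recipient, and day

em.net <- network(edges[, 1:2])                 # build the network object
set.edge.attribute(em.net, "day", edges[, 3])   # store the day on each edge

em.cet <- as.character(email$nodes$CurrentEmploymentType)
names(em.cet) <- email$nodes$label  # name by node label so the lookup below matches vertex names
em.net %v% "curr_empl_type" <- em.cet[network.vertex.names(em.net)]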
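The last assignment looks up each vertex's employment type by vertex name, which only works when the type vector carries matching names. A minimal base R sketch with hypothetical labels and types shows the difference between a named and an unnamed vector:

```r
# Hypothetical labels and employment types.
labels <- c("ann", "bob", "cy")
types <- c("Engineering", "Security", "Engineering")

# Vertex names may come in a different order than the node table.
vertex_names <- c("cy", "ann", "bob")

unnamed <- types
named <- setNames(types, labels)

# Indexing an unnamed vector by character finds no match.
print(unnamed[vertex_names])

# Indexing a named vector returns the types in vertex order.
print(named[vertex_names])
```

If the vertex attribute comes out as all `NA`, missing names on the lookup vector are a likely cause.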


6. Plot:

set.seed(7042016)
ggplot(ggnetwork(em.net, arrow.gap = 0.02, by = "day",
                 layout = "kamadakawai"),
       aes(x, y, xend = xend, yend = yend)) +
  geom_edges(
    aes(color = curr_empl_type),
    alpha = 0.25,
    arrow = arrow(length = unit(5, "pt"), type = "closed")) +
  geom_nodes(aes(color = curr_empl_type), size = 1.5) +
  scale_color_brewer("Employment Type", palette = "Set1") +
  facet_wrap(~day, nrow = 2, labeller = "label_both") +
  theme_facet(legend.position = "bottom")

The plot above shows the email traffic for each individual day, which gives a starting point for further analysis.
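One simple follow-up to the faceted plot is to tabulate how many emails were sent on each day. The sketch below uses a made-up edge list with the same column names as above; it is an illustration, not a result from the data set:

```r
# Hypothetical edges with a day column.
edges <- data.frame(
  From = c("a", "b", "a", "c", "b"),
  to = c("b", "a", "c", "a", "c"),
  day = c(6, 6, 7, 7, 7),
  stringsAsFactors = FALSE
)

# Count the emails sent on each day.
emails_per_day <- table(edges$day)
print(emails_per_day)
```

Comparing these counts across days helps tell a genuinely different communication pattern from a day that simply had less traffic.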


