
How can I make a heatmap with a large matrix?

I have a 1000*1000 matrix (which only includes the integers 0 and 1), but when I try to make a heatmap, an error occurs because it is too large.

How can I create a heatmap with such a large matrix?

6 answers

#1


13  

I can believe that the heatmap is, at least, taking a long time, because heatmap does a lot of fancy stuff that takes extra time and memory. Using dat from @bill_080's example:

## basic command: 66 seconds
t0 <- system.time(heatmap(dat))
## don't reorder rows (columns are still clustered): 43 seconds
t1 <- system.time(heatmap(dat,Rowv=NA))
## remove most fancy stuff (from ?heatmap): 14 seconds
t2 <- system.time( heatmap(dat, Rowv = NA, Colv = NA, scale="column",
             main = "heatmap(*, NA, NA) ~= image(t(x))"))
## image only: 13 seconds
t3  <- system.time(image(dat))
## image using raster capability in R 2.13.0: 1.2 seconds
t4 <- system.time(image(dat,useRaster=TRUE))

You might want to consider what you really want out of the heatmap -- i.e., do you need the fancy dendrogram/reordering stuff?
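
To judge whether the dendrogram work is worth it, one option is to time the clustering step and the drawing step separately. The following is only a rough sketch, not part of the timings above: the matrix is regenerated randomly, the size n is reduced so it runs quickly, and drawing goes to a null PDF device (so the drawing time will differ from an on-screen device).

```r
## Sketch: time the clustering that heatmap() does versus plain drawing.
## Assumptions: random 0/1 data, n smaller than 1000 to keep the run short.
n <- 300
dat <- matrix(as.integer(runif(n * n) > 0.5), nrow = n)

## by default heatmap() clusters rows and columns with hclust(dist(.))
t_clust <- system.time({
  hr <- hclust(dist(dat))      # row dendrogram
  hc <- hclust(dist(t(dat)))   # column dendrogram
})

## drawing only; pdf(NULL) opens a device that writes no file
pdf(NULL)
t_draw <- system.time(image(dat, useRaster = TRUE))
dev.off()

print(c(clustering = t_clust[["elapsed"]], drawing = t_draw[["elapsed"]]))
```

If the clustering time dominates as n grows, dropping the reordering (Rowv=NA, Colv=NA) is probably the cheapest fix.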

#2


11  

There is advice in this SO question about R memory management. If you can't allocate a 1000 by 1000 image, then you should probably stop trying to do stats on your mobile phone.
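
For a sense of scale, the memory involved can be estimated directly. This is a rough sketch with randomly generated data standing in for the real matrix; the sizes it prints are approximate and do not include whatever temporary copies heatmap() makes internally.

```r
## Sketch: approximate memory footprint of a 1000 x 1000 matrix.
n <- 1000
dat_dbl <- matrix(ifelse(runif(n * n) > 0.5, 1, 0), nrow = n)  # stored as double
dat_int <- matrix(as.integer(dat_dbl), nrow = n)               # stored as integer

print(object.size(dat_dbl), units = "Mb")  # about 8 bytes per cell
print(object.size(dat_int), units = "Mb")  # about 4 bytes per cell

## dist() keeps one double per pair of rows: n * (n - 1) / 2 values
n_pairs <- n * (n - 1) / 2
dist_mb <- n_pairs * 8 / 1024^2
print(dist_mb)
```

Storing the 0/1 values as integers halves the footprint of the matrix itself, though either way it is only a few megabytes.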

#3


8  

No errors when I try it. Here's the code:

 library(lattice)

 #Build the data
 nrowcol <- 1000
 dat <- matrix(ifelse(runif(nrowcol*nrowcol) > 0.5, 1, 0), nrow=nrowcol)

 #Build the palette and plot it
 pal <- colorRampPalette(c("red", "yellow"), space = "rgb")
 levelplot(dat, main="1000 X 1000 Levelplot", xlab="", ylab="", col.regions=pal(4), cuts=3, at=seq(0,1,0.5))

#4


5  

Try the raster package; it can handle huge raster files.
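
A minimal sketch of what that might look like, assuming the raster package is installed. The random data, the two colours and the temporary output file are illustrative choices, not something from the question.

```r
## Sketch, assuming the raster package is available.
if (suppressWarnings(require("raster", quietly = TRUE))) {
  n <- 1000
  dat <- matrix(as.integer(runif(n * n) > 0.5), nrow = n)

  r <- raster(dat)                 # wrap the matrix as a RasterLayer

  f <- tempfile(fileext = ".pdf")  # temporary file, only for this sketch
  pdf(f)
  plot(r, col = c("red", "yellow"))
  dev.off()
}
```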

#5


2  

For me

library(heatmap3)
nrowcol <- 1000
dat <- matrix(ifelse(runif(nrowcol*nrowcol) > 0.5, 1, 0), nrow=nrowcol)
heatmap3(dat,useRaster=TRUE)

works OK. useRaster=TRUE seems quite important for keeping memory use within limits. You can use the same argument in heatmap.2. Calculating the distance matrix for the hierarchical clustering is the main overhead in the calculation, but for large matrices heatmap3 uses the more efficient fastcluster package for that. With very large matrices, though, you will unavoidably run into trouble when trying to do distance-based hierarchical clustering. In that case you can still use the arguments Rowv=NA and Colv=NA to suppress the row and column dendrograms and use some other logic to sort your rows and columns, e.g.

nrowcol <- 5000
dat <- matrix(ifelse(runif(nrowcol*nrowcol) > 0.5, 1, 0), nrow=nrowcol)
heatmap3(dat,useRaster=TRUE,Rowv=NA,Colv=NA)

still runs without problems on my laptop with 8 GB of memory, whereas with the dendrograms included it already starts to struggle.
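
As one possible example of "other logic" for sorting, the rows and columns could be ordered by their totals before plotting. This is only a sketch: the choice of rowSums/colSums as the sorting rule is an assumption for illustration, and base image() stands in for the heatmap call so that no extra package is needed.

```r
## Sketch: order rows and columns by their totals instead of clustering.
## The sorting rule (row and column sums) is an illustrative assumption.
n <- 1000
dat <- matrix(as.integer(runif(n * n) > 0.5), nrow = n)

row_ord <- order(rowSums(dat))       # rows with the fewest 1s first
col_ord <- order(colSums(dat))       # columns with the fewest 1s first
dat_sorted <- dat[row_ord, col_ord]

f <- tempfile(fileext = ".pdf")      # temporary file, only for this sketch
pdf(f)
image(dat_sorted, useRaster = TRUE, col = c("white", "black"), axes = FALSE)
dev.off()
```

The pre-sorted matrix can then be passed on with Rowv=NA and Colv=NA so that the order is left as it is.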

#6


1  

You can also use heatmap.2 from the gplots package and simply turn off dendrograms, as these normally take up the most computation time (from my experience).

Also, have you considered directly printing your heatmap to a file via pdf(), png() or jpeg()?
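
A sketch of writing straight to a file, using base image() rather than heatmap.2 so it runs without extra packages. The one-pixel-per-cell PNG size, the colours and the temporary output path are assumptions for illustration; where no PNG device is available the sketch falls back to pdf().

```r
## Sketch: draw the matrix directly into a file instead of a screen device.
n <- 1000
dat <- matrix(as.integer(runif(n * n) > 0.5), nrow = n)

use_png <- isTRUE(capabilities("png"))
out <- tempfile(fileext = if (use_png) ".png" else ".pdf")  # hypothetical path

if (use_png) {
  png(out, width = n, height = n)    # one pixel per matrix cell
} else {
  pdf(out)                           # fallback when PNG is unavailable
}
par(mar = rep(0, 4))                 # no margins: the image fills the file
image(dat, useRaster = TRUE, col = c("white", "black"), axes = FALSE)
dev.off()

print(out)
```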
