Question:
I have a binary dataframe (53115 rows; 520 columns) and I want to make a correlation chart. I want to color the correlation values in red if they are greater than or equal to 0.95, otherwise blue.
correl = abs(round(cor(bin_mat),2))
pdf("corrplot.pdf", width = 200, height = 200)
a = corrplot(correl, order = "hclust", addCoef.col = "black", number.cex=0.8, cl.lim = c(0,1), col=c(rep("deepskyblue",19) ,"red"))
dev.off()
I was able to produce the graph, but in many cases the coloring is wrong (see the value 0.91 in the graph below).
datafile: file
How can I fix this so that the colors are correct?
Answer:
The problem is how the col option, when a custom palette is used, interacts with cl.lim. The package documentation discusses this. See what happens with and without cl.lim. As an example I'm using the mtcars dataset, which ships with R, with a cutoff of 0.8 for easier viewing:
library(corrplot)
correl <- cor(mtcars)
# Two plots side by side: with cl.lim (left) and without it (right)
par(mfrow = c(1, 2))
corrplot(abs(correl),
         addCoef.col = "black",
         cl.lim = c(0, 1),
         col = c(rep("deepskyblue", 9), "red")
)
corrplot(abs(correl),
         addCoef.col = "black",
         col = c(rep("deepskyblue", 9), "red")
)
Correlations lie on a continuum between -1 and 1. corrplot maps the colors onto that range (dropping the sign of the correlations is simply a bad idea) and cannot adjust the legend when the palette is customized.
Furthermore, unlike a p-value, where what matters is whether it falls above or below a threshold, with a correlation coefficient the magnitude itself is what matters. So corrplot was not designed with qualitative scales in mind.
One way to solve your case while keeping the signs is simply to build the colors for the -1 to 1 range, with the divisions you want. The legend will be useless, so drop it:
corrplot(correl,
         addCoef.col = "black",
         col = c("red", rep("deepskyblue", 8), "red"),
         cl.pos = "n"
)
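Just change the count passed to rep for your case (38): with 40 colors spread over -1 to 1, each color covers a band 0.05 wide, so the two red ends cover correlations beyond 0.95 in absolute value. That said, my advice is to use a continuous palette.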
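The count passed to rep can also be derived from the cutoff instead of being worked out by hand. The sketch below assumes that corrplot spreads the palette evenly over -1 to 1, which is what the mtcars example suggests. threshold_palette is a hypothetical helper, not part of corrplot. It only accepts cutoffs that split the range into a whole number of bands, and how a value sitting exactly on the cutoff gets colored is not guaranteed.

```r
# Hypothetical helper (not part of corrplot): builds a two-color palette whose
# outer bands start at +/- cutoff, assuming the palette is spread evenly
# over the range -1 to 1.
threshold_palette <- function(cutoff, low = "deepskyblue", high = "red") {
  stopifnot(cutoff > 0, cutoff < 1)
  # Each band must be (1 - cutoff) wide, so the range of width 2 needs this many
  n_bins <- 2 / (1 - cutoff)
  # Only cutoffs that give a whole number of bands can be represented exactly
  stopifnot(abs(n_bins - round(n_bins)) < 1e-8)
  n_bins <- as.integer(round(n_bins))
  c(high, rep(low, n_bins - 2), high)
}

pal <- threshold_palette(0.95)
length(pal)  # number of colors handed to corrplot

# Plot only if corrplot is installed; the legend is dropped as above
if (requireNamespace("corrplot", quietly = TRUE)) {
  corrplot::corrplot(cor(mtcars),
                     addCoef.col = "black",
                     col = pal,
                     cl.pos = "n")
}
```

For the 0.8 cutoff used with mtcars, threshold_palette(0.8) gives the same 10-color palette as the signed example above (red, 8 times deepskyblue, red).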