There are many different ways to do what you are asking. In the most literal sense, "bimodal" means there are two peaks. Usually though, you want the "two peaks" to be separated by some reasonable distance, and you want them to each contain a reasonable proportion of the total counts. Only you know what is "reasonable" for your situation, but the following approach might help.
- Create a histogram of the intensities
- Form the cumulative distribution with cumsum
- For different values of the "cut" between distributions (25%, 30%, 50%, …), compute the mean and standard deviation of the two distributions (above and below the cut).
- Compute the distance between the means divided by the sum of the standard deviations of the two distributions
- That quantity will be a maximum at the "best cut"
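The steps above can be sketched on a toy data set as follows. This is only a minimal sketch: the data values, the three cut fractions, and the choice to place each cut at the upper edge of a histogram bin are assumptions made for illustration, not part of the recipe itself.

```matlab
% Toy data: two groups of values (made up for illustration)
x = [1 2 3 4 5 21 22 23 24 25];

% Steps 1-2: histogram and cumulative distribution, normalised to 0..1
[h, centers] = hist(x, 0:30);
cumFrac = cumsum(h) / sum(h);

% Step 3: turn cumulative fractions into cut positions (assumed fractions)
fractions = [0.25 0.50 0.75];
q = zeros(size(fractions));
for k = 1:numel(fractions)
    cutIdx = find(cumFrac >= fractions(k), 1);  % first bin reaching that fraction
    cut = centers(cutIdx) + 0.5;                % assumption: cut at the bin's upper edge
    lo = x(x < cut);
    hi = x(x >= cut);
    % Step 4: distance between the means over the sum of standard deviations
    q(k) = (mean(hi) - mean(lo)) / (std(hi) + std(lo));
end

% Step 5: the best cut is the one that maximises q
[qBest, kBest] = max(q);
fprintf('best fraction = %.2f, q = %.2f\n', fractions(kBest), qBest);
% prints: best fraction = 0.50, q = 6.32
```

Here the 50% cut separates the two groups cleanly, so it gives the largest score; the 25% and 75% cuts each put part of one group on the wrong side, which inflates one standard deviation and lowers the score.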
You have to decide what value of that quantity represents "bimodal" for you. Here is some code that demonstrates what I am talking about. It generates bimodal distributions with increasing separation: two Gaussians with a growing offset delta between their means (delta increases in steps of one standard deviation). I compute the quantity described above and plot it against the cut position for each value of delta. I then fit a parabola through this curve over a range corresponding to ±1 sigma of the entire distribution. As you can see, when the distribution becomes more bimodal, two things happen:
- The curvature of this curve flips (it goes from a valley to a peak)
- The peak value, taken from the fit at the overall mean, increases (it is about 1.32 for a single Gaussian).
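As a sanity check on the single-Gaussian baseline, the value can be worked out in closed form. Cutting a Gaussian at its mean leaves two half-normal pieces, so the ratio depends only on the half-normal mean and standard deviation, and the width of the Gaussian cancels out. The sketch below assumes a cut exactly at the mean of an ideal Gaussian; the variable names are illustrative.

```matlab
% Single standard Gaussian cut at its mean: each half is a half-normal.
halfMean = sqrt(2/pi);        % mean of |z| for z ~ N(0,1)
halfStd  = sqrt(1 - 2/pi);    % standard deviation of |z|

% The means sit at -halfMean and +halfMean; both halves have the same spread
q0 = (2 * halfMean) / (2 * halfStd);
fprintf('q at the mean of a Gaussian = %.3f\n', q0);   % prints 1.324
```

Values noticeably above this baseline are what indicate that the two halves are more separated than a single Gaussian would make them.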
You can look at these quantities for some of your own distributions, and decide where you want to put the cutoff.
    % test for bimodal distribution
    close all
    for delta = 0:10:50
        % two Gaussians (sigma = 10) whose means are delta apart
        a1 = randn(100,100) * 10 + 25;
        a2 = randn(100,100) * 10 + 25 + delta;
        a3 = [a1(:); a2(:)];
        [h, hb] = hist(a3, 0:100);
        cs = cumsum(h);
        % restrict the cuts to the 20%..80% range of the cumulative distribution
        llimi = find(cs < 0.2 * max(cs(:)));
        ulimi = find(cs > 0.8 * max(cs(:)));
        llim = hb(llimi(end));
        ulim = hb(ulimi(1));
        cuts = linspace(llim, ulim, 20);
        dmean = mean(a3);
        dstd = std(a3);
        m = zeros(numel(cuts), 2);
        s = zeros(numel(cuts), 2);
        for ci = 1:numel(cuts)
            % split the data into the parts below and above the cut
            d1 = a3(a3 < cuts(ci));
            d2 = a3(a3 >= cuts(ci));
            m(ci, 1) = mean(d1);
            m(ci, 2) = mean(d2);
            s(ci, 1) = std(d1);
            s(ci, 2) = std(d2);
        end
        % distance between the means divided by the sum of the standard deviations
        q = (m(:, 2) - m(:, 1)) ./ sum(s, 2);
        figure;
        plot(cuts, q);
        title(sprintf('delta = %d', delta))
        % compute curvature of plot around mean:
        xlims = dmean + [-1 1] * dstd;
        indx = find(cuts > xlims(1) & cuts < xlims(2));
        pf = polyfit(cuts(indx), q(indx), 2);
        peak = polyval(pf, dmean);
        fprintf(1, 'coefficients: a = %.2e, peak = %.2f\n', pf(1), peak);
    end
Output values:
coefficients: a = 1.37e-03, peak = 1.32
coefficients: a = 1.01e-03, peak = 1.34
coefficients: a = 2.85e-04, peak = 1.45
coefficients: a = -5.78e-04, peak = 1.70
coefficients: a = -1.29e-03, peak = 2.08
coefficients: a = -1.58e-03, peak = 2.48
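One possible way to turn these two numbers into a yes/no decision is sketched below. It is only an illustration: the threshold `minPeak = 1.5`, the 15 evenly spaced cuts, and the use of evenly spaced normal quantiles (via `erfinv`) in place of random samples are all assumptions chosen to make the example repeatable, and the cutoff in particular should be tuned on your own data.

```matlab
% Deterministic stand-in for Gaussian samples: 2000 evenly spaced quantiles
p = ((1:2000) - 0.5) / 2000;
z = sqrt(2) * erfinv(2*p - 1);            % standard normal quantiles

minPeak = 1.5;   % assumed threshold, tune on your own data
flags = [];
for delta = [0 40]
    x = [10*z + 25, 10*z + 25 + delta];   % two Gaussians, sigma = 10
    % scan cuts within +-1 sigma of the whole distribution
    cuts = linspace(mean(x) - std(x), mean(x) + std(x), 15);
    q = zeros(size(cuts));
    for k = 1:numel(cuts)
        lo = x(x < cuts(k));
        hi = x(x >= cuts(k));
        q(k) = (mean(hi) - mean(lo)) / (std(hi) + std(lo));
    end
    pf = polyfit(cuts, q, 2);             % pf(1) is the curvature term
    peak = polyval(pf, mean(x));
    % call it bimodal when the curve is a peak AND clearly above the baseline
    isBimodal = (pf(1) < 0) && (peak > minPeak);
    flags(end+1) = isBimodal;
    fprintf('delta = %2d: a = %+.2e, peak = %.2f, bimodal = %d\n', ...
            delta, pf(1), peak, isBimodal);
end
```

With these assumptions the delta = 0 case is rejected and the delta = 40 case is accepted. Requiring both conditions is a design choice: near the sign flip of the curvature the peak value is still close to the single-Gaussian baseline, so either test alone would be less stable.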
Sample plots:
And the histogram for delta = 40: