My Study Notes: Binomial Sign Test

To understand the binomial sign test, you first need to understand the binomial distribution.

Binomial Distribution

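Bernoulli trial
A random experiment with only two possible outcomes, such as a coin toss.
When the trial is repeated, the probability of each outcome is the same in every trial, and the outcomes of the trials are independent of one another.
So how do we compute probabilities for trials with only two possible outcomes? Suppose we toss a coin that lands heads with probability p and tails with probability (1-p). Then the probability of getting k heads in n tosses is:

$P(X = k) = {}_{n} C_{k} \, p^{k} (1 - p)^{n - k}$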

For 10 tosses with p = 0.5, the probability of each possible number of heads is:

Number of heads	Probability
0	0.000976563
1	0.009765625
2	0.043945313
3	0.1171875
4	0.205078125
5	0.24609375
6	0.205078125
7	0.1171875
8	0.043945313
9	0.009765625
10	0.000976563
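As a rough check, the table above can be reproduced in R. This is only a minimal sketch: `dbinom()` and `choose()` are base R functions, and the variable names (`n`, `p`, `k`, `manual`, `builtin`, `tab`) are illustrative choices of mine.

```r
# Probability of k heads in n tosses: choose(n, k) * p^k * (1 - p)^(n - k)
n <- 10
p <- 0.5
k <- 0:n

manual  <- choose(n, k) * p^k * (1 - p)^(n - k)  # formula written out by hand
builtin <- dbinom(k, size = n, prob = p)         # the same values via base R

# Two-column table: number of heads and its probability
tab <- data.frame(heads = k, probability = builtin)
print(tab)
```

Both vectors should agree, and the probabilities add up to 1.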

Its distribution looks like this:

With 100 tosses, it looks like this:

And as the probability p varies (n = 100), it looks like this:

This is the binomial distribution.
Note: the binomial distribution is perfectly symmetric only when p = 0.50.
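The symmetry claim can be checked numerically. The sketch below compares a distribution with its mirror image; `is_symmetric` is a hypothetical helper written for this note, not a base R function, and the commented `barplot()` call is just one way to draw the shape.

```r
# Compare the distribution with its mirror image (k heads vs. n - k heads)
n <- 100
k <- 0:n

is_symmetric <- function(p) {
  probs <- dbinom(k, size = n, prob = p)
  isTRUE(all.equal(probs, rev(probs)))
}

is_symmetric(0.5)  # symmetric around n / 2
is_symmetric(0.2)  # skewed: the mass sits near n * p = 20

# Optional: draw one of the shapes
# barplot(dbinom(k, n, 0.2), names.arg = k, main = "n = 100, p = 0.2")
```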
OK, next let's look at the binomial test and the sign test.

Binomial Test

Use the binomial test when there are two possible outcomes. You know how many of each kind of outcome (traditionally called "success" and "failure") occurred in your experiment. You also have a hypothesis for what the true overall probability of "success" is. The binomial test answers this question: If the true probability of "success" is what your theory predicts, then how likely is it to find results that deviate as far, or further, from the prediction.
(from GraphPad)
Another site puts it this way:

The binomial test is used when an experiment has two possible outcomes (i.e. success/failure) and you have an idea about what the probability of success is. A binomial test is run to see if observed test results differ from what was expected.
Example: you theorize that 75% of physics students are male. You survey a random sample of 12 physics students and find that 7 are male. Do your results significantly differ from the expected results?
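The physics-student example can be tried with base R's `binom.test()`. This is a sketch of one way to run it; the variable name `res` is my own, and the 0.05 threshold is the conventional significance level rather than anything the quoted site prescribes.

```r
# 7 males among 12 sampled students; hypothesized proportion of males is 0.75
res <- binom.test(7, 12, p = 0.75)

res$p.value          # two-sided p-value
res$p.value < 0.05   # FALSE: the sample does not differ significantly from 75%
```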

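In other words, the binomial test applies when there are two possible outcomes and you have assumed probabilities for each of them. Given how many times one outcome was observed in an experiment, the binomial test judges whether the observed count is consistent with those probabilities. For example, suppose success and failure have probabilities 0.1 and 0.9. If 10 trials produce 3 successes, the test asks whether this experiment really follows the 0.1/0.9 probabilities.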
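The 0.1/0.9 example can be sketched in R as follows, assuming a two-sided test at the usual 0.05 level (the notes do not fix either choice). The names `res` and `upper_tail` are illustrative.

```r
# 3 successes in 10 trials, hypothesized success probability 0.1
res <- binom.test(3, 10, p = 0.1)
res$p.value                      # about 0.07

# The p-value here is entirely the upper tail P(X >= 3), because no outcome
# below the expected count is as unlikely as observing 3 successes
upper_tail <- pbinom(2, size = 10, prob = 0.1, lower.tail = FALSE)
upper_tail
```

At the 0.05 level this result would not be called significant, so 3 successes in 10 trials is not strong evidence against p = 0.1.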
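The page quoted below explains the binomial test relatively clearly. It describes the two-sided test; for a one-sided test, you only add up the probabilities on one side, that is, the outcomes at least as extreme as the observed one in the direction of interest.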

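Ten coins were tossed and only 2 came up heads. Is the coin biased?

Ten people were chosen at random from a population and asked their opinion: 2 were in favor and 8 against. Can we say that those against are also the majority in the population as a whole?

To think about these questions, we set up a model (the null hypothesis) in which the coin is not biased (or opinion in the whole population is evenly split). Then, assuming this null hypothesis is true, we compute the total probability of deviations at least as large as the one actually observed (2:8, 1:9, 0:10, and usually also the mirror images 8:2, 9:1, 10:0).
Note: this is not stated very clearly. For a two-tailed test you must also add the probabilities in the opposite direction. Alternatively, since the distribution is symmetric when p = 0.5, you can compute one tail only and compare it against a significance level of 0.05/2.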
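The two ways of handling a two-tailed test can be compared directly. This is a minimal sketch for the 2-heads-in-10 case; `alpha`, `one_tail` and `two_tail` are names I made up, and the shortcut of halving the significance level relies on the symmetry at p = 0.5.

```r
alpha <- 0.05

# One tail only: P(X <= 2) under a fair coin
one_tail <- binom.test(2, 10, p = 0.5, alternative = "less")$p.value

# Both tails: 0, 1, 2 heads plus the mirror images 8, 9, 10 heads
two_tail <- binom.test(2, 10, p = 0.5, alternative = "two.sided")$p.value

two_tail <= alpha       # decision using both tails
one_tail <= alpha / 2   # same decision using one tail and half the level
```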

This total probability is called the $p$-value. If the $p$-value is very small, the observed event is hard to explain under this model, so we infer that the coin is probably biased (or that opinion is not evenly split). If the $p$-value is large, all we learn is that nothing can be concluded from this much data. Taking the boundary between a large and a small $p$-value (the significance level) to be 0.05, when $p \le 0.05$ the departure from the null hypothesis is said to be "statistically significant", or the null hypothesis is said to be "rejected". The value 0.05 has no special meaning, but it is widely used by tradition (physics usually imposes far stricter conditions).

Now, the number of heads when tossing coins can be regarded as binomially distributed, so if heads and tails each occur with probability 1/2, the probability of getting $r$ heads is $_{10} C_{r} (1 / 2)^{10}$. The probabilities of the heads:tails splits 0:10, 1:9, 2:8, 8:2, 9:1 and 10:0 are, respectively,

> dbinom(c(0,1,2,8,9,10), 10, 0.5)
[1] 0.0009765625 0.0097656250 0.0439453125 0.0439453125
[5] 0.0097656250 0.0009765625

and their sum, that is, the $p$-value, is

> sum(dbinom(c(0,1,2,8,9,10), 10, 0.5))
[1] 0.109375

The same value can also be obtained with

> pbinom(2, 10, 0.5) * 2
[1] 0.109375

As described in detail later, the function binom.test() also performs the binomial test.

> binom.test(2, 10, 0.5)

 Exact binomial test

data:  2 and 10
number of successes = 2, number of trials = 10, p-value = 0.1094
...

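Therefore, at a significance level of 0.05, the difference between heads and tails is not statistically significant. For the survey, it means that one must not conclude from such a small sample that "those in favor are the minority".

With 1 head (1 person in favor), the $p$-value is about 0.02, which is significant at the 0.05 level.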
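The "about 0.02" figure for 1 head in 10 tosses can be verified in two ways. This is a small sketch using only base R; the names `by_hand` and `by_test` are illustrative.

```r
# Add up the extreme outcomes on both sides: 0, 1, 9 and 10 heads
by_hand <- sum(dbinom(c(0, 1, 9, 10), size = 10, prob = 0.5))  # 22/1024

# Or let binom.test() do the same calculation
by_test <- binom.test(1, 10, p = 0.5)$p.value

by_hand
by_test
```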

This is the idea behind the method that Fisher (R. A. Fisher, 1890-1962) called "tests of significance" (significance tests).

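In other words, the binomial test starts from the assumed probabilities of the two outcomes (here, 0.5 each for heads and tails) and computes the probability of a result at least as extreme as the one observed (2 heads, plus 1 head and 0 heads, and for a two-sided test also the mirror images 8, 9 and 10 heads). That probability is the p-value. If the p-value is statistically significant, the null hypothesis is rejected (the coin is biased, or the shares in favor and against are unequal). Otherwise the null hypothesis cannot be rejected: the data are consistent with a fair coin or an even split, although that does not prove it.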

Sign test

The sign test is a special case of the binomial case where your theory is that the two outcomes have equal probabilities. (from GraphPad)
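That sentence seems slightly off. It would be better to say that the sign test analyzes data reduced to categories: it turns each observation into a + or a - sign, and uses a probability calculation to judge whether the signs occur at random or show a tendency. For example, after patients take some drug, does the number of times they urinate go up or down compared with before? Mark an increase as + and a decrease as -, and the sign test can then be used to judge whether the drug has an effect.
So the sign test is a special kind of binomial test: it only looks at the two cases + and -, and under the null hypothesis each occurs with probability 0.5.
※1 Sources are inconsistent on this point. Some say the binomial test requires exactly two possible outcomes; others do not mention this, and use a die as the example: the probability of any given face is 1/6, so if a 6 comes up 15 times in 60 throws, has the die been tampered with? (The solution still works with the probabilities of "6" and "not 6", so there seems to be no contradiction.)
※ Many sources explain the sign test incompletely or incorrectly.
The actual procedure is the same as for the binomial test above (the sign test is just a special case of the binomial test).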
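The die example from note ※1 can be set up as a binomial test on "6" versus "not 6". The sketch below assumes a one-sided alternative (too many sixes), which is my reading of "tampered with" and not something the sources specify; `res` is an illustrative name.

```r
# 15 sixes in 60 throws; a fair die gives a six with probability 1/6
res <- binom.test(15, 60, p = 1/6, alternative = "greater")

res$estimate   # observed proportion of sixes, 15/60 = 0.25
res$p.value    # P(X >= 15) if the die is fair
```

The p-value is then compared with the chosen significance level in the usual way.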
This site has a good example of the sign test.

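Ten patients were given a sleeping drug, and their sleep increased by the following numbers of hours (Arthur R. Cushny and A. Roy Peebles, The Journal of Physiology 32, 501-510 (1905)):

1.9, 0.8, 1.1, 0.1, -0.1, 4.4, 5.5, 1.6, 4.6, 3.4
That is, only 1 of the 10 values is negative and the other 9 are positive. There are $2^{10} = 1024$ ways of assigning positive and negative signs in total, of which

all positive: $_{10} C_{0} = 1$ way

exactly one negative: $_{10} C_{1} = 10$ ways

exactly two negative: $_{10} C_{2} = (10 \times 9) / (2 \times 1) = 45$ ways

and so on. Adding up all the cases naturally gives $2^{10} = 1024$.
If a positive and a negative sign are equally likely, then

the probability that all are positive is 1/1024

the probability that exactly one is negative is 10/1024

the probability that exactly two are negative is 45/1024

The actual data have exactly one negative value out of 10, so adding the probability of this outcome to that of the more extreme one (all positive) gives 10/1024 + 1/1024 = 11/1024. Conversely, adding the probability that all are negative to the probability that exactly one is positive also gives 10/1024 + 1/1024 = 11/1024. So the probability that at most 1 of the 10 signs differs from the rest is 22/1024, and the $p$-value is about 0.02. In other words, this is an event that would happen by chance only about once in 50 times.
This kind of test is called the sign test.
In the sign test, data whose difference is 0 are excluded.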
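The counting argument above can be retraced in R with `choose()`. This is only a sketch of the arithmetic; `ways` and `p_value` are names chosen for illustration.

```r
n <- 10
ways <- choose(n, 0:n)   # number of sign patterns with 0, 1, ..., 10 negatives
ways[1:3]                # 1, 10, 45
sum(ways)                # 2^10 = 1024 patterns in total

# At most one sign differs from the rest:
# (0 or 1 negatives) plus the mirror image (0 or 1 positives)
p_value <- 2 * sum(ways[1:2]) / sum(ways)
p_value                  # 22/1024
```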
In general, if there are n nonzero values and m of them are positive (or negative), the sign test can be run as binom.test(m, n). For the example above:
> binom.test(1, 10)

 Exact binomial test

data:  1 and 10
number of successes = 1, number of trials = 10, p-value = 0.02148
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.002528579 0.445016117
sample estimates:
probability of success 
                   0.1 
which gives $p = 0.02148$.
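Starting from the raw differences instead of the counts, the same sign test might be wrapped up as below. `sign_test` is a hypothetical helper written for this note (base R has no function of that name); it drops zero differences and passes the count of positive signs to `binom.test()`.

```r
# Sign test on a vector of paired differences (a sketch, not a library function)
sign_test <- function(d, ...) {
  d <- d[d != 0]                                   # zero differences are excluded
  binom.test(sum(d > 0), length(d), p = 0.5, ...)  # positives out of nonzero values
}

increase <- c(1.9, 0.8, 1.1, 0.1, -0.1, 4.4, 5.5, 1.6, 4.6, 3.4)
res <- sign_test(increase)
res$p.value
```

Counting the positives (9 of 10) rather than the negatives (1 of 10) gives the same two-sided p-value, because the null probability is 0.5.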

Binomial test in R

A binomial test compares the number of successes observed in a given number of trials with a hypothesised probability of success. The test has the null hypothesis that the real probability of success is equal to some value denoted p, and the alternative hypothesis that it is not equal to p. The test can also be performed with a one-sided alternative hypothesis that the real probability of success is either greater than p or that it is less than p. (from Instant R; I find this explanation of the binomial test quite good too)

Exact Binomial Test

Description

Performs an exact test of a simple null hypothesis about the probability of success in a Bernoulli experiment.

Usage

binom.test(x, n, p = 0.5,
           alternative = c("two.sided", "less", "greater"),
           conf.level = 0.95)

Arguments

`x`	number of successes, or a vector of length 2 giving the numbers of successes and failures, respectively.
`n`	number of trials; ignored if `x` has length 2.
`p`	hypothesized probability of success.
`alternative`	indicates the alternative hypothesis and must be one of `"two.sided"`, `"greater"` or `"less"`. You can specify just the initial letter.
`conf.level`	confidence level for the returned confidence interval.

(from this page)
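A few calls that exercise the arguments listed above, as a sketch. The numbers reuse the 2-heads-in-10 example, and the variable names (`a`, `b`, `less`, `greater`, `ci95`, `ci99`) are illustrative.

```r
a <- binom.test(2, 10)        # x = successes, n = trials, p defaults to 0.5
b <- binom.test(c(2, 8))      # x = c(successes, failures); n is not needed

less    <- binom.test(2, 10, alternative = "less")  # H1: true probability < 0.5
greater <- binom.test(2, 10, alternative = "g")     # the initial letter is enough

ci95 <- binom.test(2, 10)$conf.int                     # default 95% interval
ci99 <- binom.test(2, 10, conf.level = 0.99)$conf.int  # wider 99% interval
```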
For example, take the earlier problem of 2 heads in 10 coin tosses and run the binomial test in R:

> binom.test(2,10,p=0.5,alternative="two.sided")

 Exact binomial test

data:  2 and 10
number of successes = 2, number of trials = 10, p-value = 0.1094
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.02521073 0.55609546
sample estimates:
probability of success 
                   0.2 
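Instead of reading the printed report, the individual numbers can be pulled out of the result object. A minimal sketch, where `res` and `decision` are illustrative names and 0.05 is the conventional significance level:

```r
res <- binom.test(2, 10, p = 0.5, alternative = "two.sided")

res$p.value     # 0.109375
res$conf.int    # 95% confidence interval for the true probability of success
res$estimate    # observed proportion of successes, 2/10

# Decision at the 0.05 level
decision <- if (res$p.value <= 0.05) "reject H0" else "cannot reject H0"
decision
```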

The sign test is a special case of the binomial test, so it is run in the same way. See here:
http://ogawas.cerp.u-toyama.ac.jp/e-stat/05.html

PS: 「よきにはからえ」 means "do as you see fit."
When someone asks you about something you cannot judge for yourself, saying this is a safe enough answer for the time being.

Labels: CMMI, HMLA, Statistics

Monday, November 19, 2018
