Two-Sample t-Test & Paired t-Test
The two-sample t-test is used to determine whether the difference between the means of two groups is statistically significant or is instead due to random chance. It helps answer questions such as whether the average success rate is higher after implementing a new sales tool than before, or whether the test results of patients who received a drug are better than those of patients who received a placebo.
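In other words, the two-sample t-test tells us whether the difference between the means of two groups is a statistically meaningful difference or just random fluctuation. It can answer questions such as whether average sales went up after adopting a new sales tool, or whether patients taking a drug improved more than those taking a placebo.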
※https://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm
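For an overview of 1-sample, 2-sample, and paired t-tests, see this article.
The 1-sample t-test, which I studied a few days ago (see here), tests whether the mean of one set of data equals some value m0.
The paired t-test compares paired data, such as before and after an improvement, or before and after taking a drug. It is essentially a 1-sample t-test as well, except that the one sample consists of the before/after differences.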
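The claim that a paired test is a 1-sample test on the differences can be checked directly. The following is a minimal sketch: the vectors `before` and `after` are made-up numbers for illustration, not data from the course.

```r
# Hypothetical before/after measurements for the same 8 subjects
before <- c(12.1, 11.4, 13.0, 12.7, 10.9, 11.8, 12.5, 13.3)
after  <- c(12.9, 11.9, 13.1, 13.5, 11.6, 12.0, 13.2, 13.9)

# Paired t-test on the two vectors
paired_res <- t.test(after, before, paired = TRUE)

# One-sample t-test on the per-subject differences, against mu = 0
diff_res <- t.test(after - before, mu = 0)

# Compare the t statistic and the p-value from the two calls
all.equal(unname(paired_res$statistic), unname(diff_res$statistic))
all.equal(paired_res$p.value, diff_res$p.value)
```

Both comparisons should come out TRUE, since the paired test works on the element-wise differences.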
The 2-sample t-test compares the difference between the means of two groups. Its t value is computed as follows:
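For reference, the usual textbook forms of the statistic can be written as below. The notation is an assumption of this note: $\bar{x}_i$, $s_i^2$ and $n_i$ stand for the sample mean, sample variance and sample size of group $i$. The first form is the unequal-variance (Welch) version, the second the pooled-variance version.

```latex
% Unequal variances (Welch)
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}

% Equal variances (pooled)
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}
```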
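The textbook's rule for choosing between paired and 2-sample is simple: when the two groups of data are independent of each other, use the 2-sample test; when the data come in pairs, use the paired test.
In R, 1-sample, 2-sample, and paired tests are all done with t.test.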
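As a rough illustration of how one function covers all three cases, here is a small sketch. The vectors `group_a` and `group_b` and the reference value 5 are invented for the example.

```r
# Hypothetical scores; values are made up for illustration
group_a <- c(5.1, 4.8, 6.0, 5.5, 5.9, 4.7, 5.3, 6.1)
group_b <- c(5.6, 5.4, 6.3, 6.2, 6.1, 5.5, 5.8, 6.6)

# 1-sample: is the mean of group_a equal to m0 = 5?
one_sample <- t.test(group_a, mu = 5)

# 2-sample: treat group_a and group_b as independent groups
two_sample <- t.test(group_a, group_b)

# Paired: treat element i of group_a and group_b as the same unit
paired <- t.test(group_a, group_b, paired = TRUE)

# The method string records which test was actually run
one_sample$method
two_sample$method
paired$method
```

Which call is appropriate depends on how the data were collected, not on the numbers themselves: the same two vectors can be passed either way, so the choice of `paired` has to come from the study design.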
t.test
Student's T-Test
Performs one and two sample t-tests on vectors of data.
- Keywords
- htest
Usage
t.test(x, ...)
# S3 method for default
t.test(x, y = NULL,
       alternative = c("two.sided", "less", "greater"),
       mu = 0, paired = FALSE, var.equal = FALSE,
       conf.level = 0.95, ...)
# S3 method for formula
t.test(formula, data, subset, na.action, ...)
Arguments
- x: a (non-empty) numeric vector of data values.
- y: an optional (non-empty) numeric vector of data values.
- alternative: a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less". You can specify just the initial letter.
- mu: a number indicating the true value of the mean (or difference in means if you are performing a two sample test).
- paired: a logical indicating whether you want a paired t-test.
- var.equal: a logical variable indicating whether to treat the two variances as being equal. If TRUE then the pooled variance is used to estimate the variance, otherwise the Welch (or Satterthwaite) approximation to the degrees of freedom is used.
- conf.level: confidence level of the interval.
- formula: a formula of the form lhs ~ rhs where lhs is a numeric variable giving the data values and rhs a factor with two levels giving the corresponding groups.
- data: an optional matrix or data frame (or similar: see model.frame) containing the variables in the formula formula. By default the variables are taken from environment(formula).
- subset: an optional vector specifying a subset of observations to be used.
- na.action: a function which indicates what should happen when the data contain NAs. Defaults to getOption("na.action").
- ...: further arguments to be passed to or from methods.
Details
The formula interface is only applicable for the 2-sample tests.
alternative = "greater" is the alternative that x has a larger mean than y.
If paired is TRUE then both x and y must be specified and they must be the same length. Missing values are silently removed (in pairs if paired is TRUE). If var.equal is TRUE then the pooled estimate of the variance is used. By default, if var.equal is FALSE then the variance is estimated separately for both groups and the Welch modification to the degrees of freedom is used.
If the input data are effectively constant (compared to the larger of the two means) an error is generated.
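As an illustration that is not part of the quoted help text, the sketch below exercises the formula interface and the var.equal switch described above. The data frame `scores` and its column names are hypothetical.

```r
# Hypothetical data frame: one numeric column and a two-level factor
scores <- data.frame(
  value = c(5.1, 4.8, 6.0, 5.5, 5.9, 4.7, 5.6, 5.4, 6.3, 6.2, 6.1, 5.5),
  group = factor(rep(c("control", "treated"), each = 6))
)

# Formula interface with the default var.equal = FALSE (Welch approximation)
welch <- t.test(value ~ group, data = scores)

# Same comparison with a pooled variance estimate
pooled <- t.test(value ~ group, data = scores, var.equal = TRUE)

# Pooled degrees of freedom are n1 + n2 - 2; Welch's are typically fractional
pooled$parameter
welch$parameter

# One-sided alternative: the first factor level ("control") plays the role of x,
# so "less" asks whether the control mean is below the treated mean
one_sided <- t.test(value ~ group, data = scores, alternative = "less")
one_sided$p.value
```

With the formula interface the order of the factor levels decides which group is treated as x, so it is worth checking levels(scores$group) before choosing "less" or "greater".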
Value
A list with class "htest" containing the following components:
- statistic: the value of the t-statistic.
- parameter: the degrees of freedom for the t-statistic.
- p.value: the p-value for the test.
- conf.int: a confidence interval for the mean appropriate to the specified alternative hypothesis.
- estimate: the estimated mean or difference in means depending on whether it was a one-sample test or a two-sample test.
- null.value: the specified hypothesized value of the mean or mean difference depending on whether it was a one-sample test or a two-sample test.
- alternative: a character string describing the alternative hypothesis.
- method: a character string indicating what type of t-test was performed.
- data.name: a character string giving the name(s) of the data.
from RDocumentation
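To see how the components described in the Value section are used in practice, here is a short sketch. The vectors `x` and `y` are invented sample data.

```r
# Hypothetical samples for illustration
x <- c(20.1, 19.8, 21.4, 20.7, 22.0, 19.5)
y <- c(18.9, 19.2, 20.1, 18.4, 19.9, 18.7)

res <- t.test(x, y)

class(res)      # the result is a list of class "htest"
names(res)      # component names available on the result

res$statistic   # t value
res$parameter   # degrees of freedom
res$p.value     # p-value
res$conf.int    # confidence interval for the difference in means
res$estimate    # in this two-sample call: the mean of x and the mean of y
```

Pulling values out with `$` is handy when the result feeds into later code, for example comparing res$p.value against a chosen significance level instead of reading it off the printed summary.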