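The following is a list of the main terms used in probability theory and statistics.
Probability theory [e 1] is a subfield of mathematics that studies questions about probability (also called chance). A probability is a number describing the outcome of a random process; for example, when a fair coin is tossed, the probability of heads is 50%. Probability is therefore indispensable for reasoning about uncertainty [1].
Statistics [e 2] studies how to collect, analyse and present data in every field of science. The empirical scientific method is inherently uncertain: in principle, merely drawing a sample raises the question of how likely it is that the sample in hand really represents the population, so statistical theory constantly relies on probability theory [2][3].
In addition, this means that probability theory and statistics are essentially tools for predicting the future from past experience under uncertainty, so the terms and concepts in this list are also very useful in machine learning, the field that studies how to teach artificial intelligence to learn [3].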
Descriptive statistics [e 126] are statistical values that describe a body of information quantitatively, including [69]:
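Inferential statistics [e 155], technically speaking, is the process of analysing data to infer the probability distribution behind the data. Such analyses usually compute indicators from the values of the cases in the data and use them to assess, for example, whether two variables are really related or whether the independent variable really affects the dependent variable, or even to estimate a mathematical model describing the phenomenon under study. Twenty-first-century statistics offers many inferential methods, each suited to different types of data. Experts in fields such as data science need to know these methods and when to use which one [78].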
Statistical correlation
Correlation [e 174] is defined in statistics as follows. Two variables $x$ and $y$ are positively correlated if $y$ tends to be high when $x$ is high and low when $x$ is low. They are negatively correlated if $y$ tends to be low when $x$ is high and high when $x$ is low. They have no apparent correlation [e 175] if the value of $x$ does little to predict the value of $y$ [91].
Pearson product-moment correlation coefficient [e 176]: a value commonly used to measure the correlation between two variables. The formula is [92]:
$$\rho_{X,Y} = \operatorname{corr}(X,Y) = \frac{\operatorname{cov}(X,Y)}{\sigma_X \sigma_Y} = \frac{\operatorname{E}[(X-\mu_X)(Y-\mu_Y)]}{\sigma_X \sigma_Y}$$
where
$\rho_{X,Y}$ is the Pearson product-moment correlation coefficient between the two variables $x$ and $y$;
$X$ is the $x$ value of a given case;
$Y$ is the $y$ value of the same case;
$\mu_X$ is the mean of the cases on $x$;
$\mu_Y$ is the mean of the cases on $y$;
$\sigma_X$ is the standard deviation of the cases on $x$;
$\sigma_Y$ is the standard deviation of the cases on $y$.
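The Pearson coefficient only summarises the overall linear association between two variables: even if it is 0, that does not mean the two variables are unrelated. In the accompanying set of scatter plots, the number above each plot is the Pearson coefficient, each point is a case, the X axis is the variable $x$ and the Y axis is the variable $y$; the plots show that many interesting relationships give a Pearson coefficient of 0 [92].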
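The following is a minimal sketch of the Pearson formula above, assuming the population form of the covariance and standard deviations (dividing by the number of cases). The function name `pearson` and the sample data are illustrative only. It also shows a relationship that is perfectly deterministic yet has a coefficient of 0.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (population form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # cov(X, Y) = E[(X - mu_X)(Y - mu_Y)]
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)

xs = [-2, -1, 0, 1, 2]
linear = [2 * x + 1 for x in xs]   # y rises with x in a straight line
parabola = [x ** 2 for x in xs]    # y depends on x, but not linearly

print(pearson(xs, linear))    # close to 1
print(pearson(xs, parabola))  # 0 even though y is a function of x
```

The parabola case illustrates why a coefficient of 0 should be read as "no linear association" rather than "no relationship".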
Covariance [e 177]: the numerator of the Pearson product-moment correlation coefficient formula, that is [93]:
$$\operatorname{cov}(X,Y) = \operatorname{E}\big[(X - \operatorname{E}[X])(Y - \operatorname{E}[Y])\big]$$
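Correlation does not imply causation [e 178]: an important principle in statistics stating that a correlation between two variables does not show a causal relationship between them. Suppose two variables $X$ and $Y$ are strongly correlated (the Pearson coefficient is large in magnitude). This is compatible with three possibilities:
$X$ causes $Y$;
$Y$ causes $X$;
$X$ and $Y$ share a common cause.
Many statistics students assume that a correlation between two variables indicates a causal relationship, but this is a mistake, so the statistics community uses this saying to remind students to be careful [94].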
Partial correlation [e 179]: the correlation between two variables once the influence of one or more other variables has been removed. Imagine two variables $X$ and $Y$ and $n$ confounding variables (see below) $Z = \{Z_1, Z_2, \ldots, Z_n\}$. The partial correlation between $X$ and $Y$ with the influence of $Z$ controlled for, written $\rho_{XY\cdot Z}$, is the correlation between $e_X$ and $e_Y$, where $e_X$ is the error left over when a linear regression is used to predict $X$ from $Z$, and $e_Y$ is defined in the same way for $Y$ [95].
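A minimal sketch of the residual-based definition above, assuming a single confounding variable $Z$ so that each regression is a simple least-squares line. The helper names and the data are hypothetical.

```python
import math

def mean(values):
    return sum(values) / len(values)

def pearson(a, b):
    ma, mb = mean(a), mean(b)
    num = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    den = math.sqrt(sum((p - ma) ** 2 for p in a) * sum((q - mb) ** 2 for q in b))
    return num / den

def residuals(target, z):
    """Errors left after predicting target from z with a least-squares line."""
    mz, mt = mean(z), mean(target)
    slope = (sum((p - mz) * (q - mt) for p, q in zip(z, target))
             / sum((p - mz) ** 2 for p in z))
    intercept = mt - slope * mz
    return [q - (intercept + slope * p) for p, q in zip(z, target)]

def partial_corr(x, y, z):
    """Correlation between e_X and e_Y, the residuals of X and Y given z."""
    return pearson(residuals(x, z), residuals(y, z))

z = [1, 2, 3, 4, 5, 6]
x = [2, 3, 5, 4, 7, 8]
y = [1, 4, 3, 6, 5, 9]
print(pearson(x, y), partial_corr(x, y, z))
```

With several confounders the same idea applies, but each regression becomes a multiple regression, which this sketch does not implement.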
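Intraclass correlation [e 180]: a measure of how internally consistent each group is. Imagine a data set with a number of cases that can be divided into several groups; a high intraclass correlation means that cases in the same group tend to have values close to one another. Several different formulas can be used to compute it [96][97].
(Figure: each blue dot is a case with a $y$ value, and the X axis shows which group the case belongs to; the left panel shows a high ICC (0.91) and the right panel a low ICC (-0.07).)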
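Rank correlation [e 181]: a correlation computed between two variables that are ranks, meaning that each case has a value such as "highest", "second highest" or "third highest" on each of the two variables [98].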
Spearman's rank correlation coefficient ([e 182], $r_s$): one way of computing a rank correlation, defined as the Pearson product-moment correlation coefficient between the ranks of the two variables, that is [99]:
$$r_s = \rho_{\operatorname{rg}_X,\operatorname{rg}_Y} = \frac{\operatorname{cov}(\operatorname{rg}_X,\operatorname{rg}_Y)}{\sigma_{\operatorname{rg}_X}\,\sigma_{\operatorname{rg}_Y}}$$
where $X_i, Y_i$ are the two variables under consideration and $\operatorname{rg}_{X_i}, \operatorname{rg}_{Y_i}$ are the ranks of the cases on $X_i, Y_i$ (first, second, and so on).
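As a sketch of the definition above, the code below ranks each variable and then applies the Pearson formula to the ranks. Giving tied values their average rank is an assumption of this sketch (a common convention), and the function names are illustrative.

```python
import math

def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    den = math.sqrt(sum((p - ma) ** 2 for p in a) * sum((q - mb) ** 2 for q in b))
    return num / den

def spearman(xs, ys):
    """Pearson correlation between the ranks of xs and the ranks of ys."""
    return pearson(average_ranks(xs), average_ranks(ys))

# y grows with x but not linearly; the ranks still agree perfectly
print(spearman([1, 2, 3, 4], [1, 4, 9, 16]))
```

Ranking from the smallest or from the largest value gives the same coefficient, as long as both variables are ranked in the same direction.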
Tau rank correlation coefficient ([e 183], $\tau$): let $(x_1,y_1),\ldots,(x_n,y_n)$ be a set of cases, each with a value on the two variables $X_i, Y_i$. Take any two cases $(x_i,y_i)$ and $(x_j,y_j)$. They are concordant [e 184] if $(x_i > x_j) \land (y_i > y_j)$ or $(x_i < x_j) \land (y_i < y_j)$; otherwise they count as discordant [e 185]. $\tau$ is computed as follows [100]:
$$\tau = \frac{\text{con} - \text{dis}}{\binom{n}{2}}$$
$\text{con}$: the number of concordant pairs
$\text{dis}$: the number of discordant pairs
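The pair counting above can be sketched directly. Following the definition as stated here, any pair that is not concordant (including tied pairs) is counted as discordant; treatments that handle ties separately would give different values when ties are present. The function name is illustrative.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """tau = (concordant - discordant) / (number of pairs of cases)."""
    con = dis = 0
    for (x_i, y_i), (x_j, y_j) in combinations(list(zip(xs, ys)), 2):
        if (x_i > x_j and y_i > y_j) or (x_i < x_j and y_i < y_j):
            con += 1
        else:
            dis += 1  # per the definition above, everything else is discordant
    n = len(xs)
    return (con - dis) / (n * (n - 1) / 2)  # n choose 2 pairs in total

print(kendall_tau([1, 2, 3], [1, 2, 3]))  # every pair concordant
print(kendall_tau([1, 2, 3], [1, 3, 2]))  # two concordant pairs, one discordant
```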
Autocorrelation [e 186]: the autocorrelation of a random process is the Pearson product-moment correlation coefficient between the values of the process at different points in time. Let $\{X_t\}$ be a process with randomness and let $t$ be any point in time. Suppose the process $\{X_t\}$ is run a number of times, with $X_t$ denoting the value the process gives at time $t$. The autocorrelation of the process between times $t_1$ and $t_2$, written $\operatorname{R}_{XX}(t_1,t_2)$, is defined as the Pearson coefficient between $X_{t_1}$ and $X_{t_2}$. Autocorrelation is widely used in signal processing, where it can measure how close a signal is to being completely random [101].
Cross-correlation [e 187]: the statistical correlation between two time series $f$ and $g$ across points in time. It can reflect the correlation between $f(t)$ (the value of $f$ at time $t$) and $g(t)$ (the value of $g$ at time $t$), or, after choosing a lag $\tau$, the correlation between $f(t)$ and $g(t+\tau)$ [102].
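A minimal sketch of lagged cross-correlation, assuming two series of equal length sampled at the same time points and using the Pearson coefficient over the overlapping part. The function names and the series are hypothetical.

```python
import math

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    den = math.sqrt(sum((p - ma) ** 2 for p in a) * sum((q - mb) ** 2 for q in b))
    return num / den

def cross_corr(f, g, lag=0):
    """Correlation between f(t) and g(t + lag) over the overlapping time points."""
    if lag >= 0:
        a, b = f[:len(f) - lag], g[lag:]
    else:
        a, b = f[-lag:], g[:len(g) + lag]
    return pearson(a, b)

f = [1, 3, 2, 5, 4, 7, 6, 9]
g = [0, 0] + f[:-2]  # g repeats f two time steps later

print(cross_corr(f, g, lag=0))
print(cross_corr(f, g, lag=2))  # aligning the delay gives a perfect match
```

Scanning a range of lags and picking the one with the largest correlation is one simple way to estimate the delay between two signals.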
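Orthogonal [e 188]: in statistics, saying that two independent variables (IVs) are orthogonal means that there is no statistical correlation between them [103]. See also multicollinearity.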
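A statistical model [e 242] is a kind of mathematical model. It carries a number of assumptions and models a process that generates data (the things observed). A researcher collects data and uses them to estimate the values of the model's parameters, in other words uses data to estimate the rules by which the world operates [141].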
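Regression models
(Figure: the two axes represent the two variables under study, x and y; each red dot is a case with a value on both variables. Regression analysis can estimate a line (both the green and the blue line are feasible) and shows that the two variables are roughly positively related.)
A regression model [e 254] is a commonly used kind of statistical model. A typical regression model has several independent variables and one dependent variable, usually continuous, and the algorithm tries to draw a line that expresses the relationship between the independent variables and the dependent variable [149].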
Example: $y = 0.5x + 30 + e$, where $y$ is the dependent variable, $x$ is the independent variable, $e$ is the residual [e 255], and 0.5 and 30 are parameters estimated from data.
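Linear regression model [e 256]: the simplest kind of regression model, in which the dependent variable is a linear combination of the independent variables [149].
Multiple regression model [e 259]: a regression model with more than one independent variable.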
Example: $y = 0.2x_1 + 1.65x_2 + 30 + e$, where $x_1$ is the first independent variable, $x_2$ is the second independent variable, and $e$ is the error.
Multivariate adaptive regression model [e 260]: a regression model of the following form:
$$\widehat{f}(x) = \sum_{i=1}^{k} c_i B_i(x)$$
where each $c_i$ is a constant coefficient and each $B_i(x)$ can be
the constant 1;
a hinge function [e 261], that is $\max(0, x - \text{constant})$ or $\max(0, \text{constant} - x)$ [note 7]; or
the product of two or more hinge functions [150].
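To make the basis functions concrete, here is a small sketch that evaluates a model of this form. The knot at 3.0 and the coefficients 5.0, 2.0 and -1.5 are made-up values for illustration; in practice they would be estimated from data.

```python
def hinge_right(x, knot):
    """max(0, x - knot): zero up to the knot, then rising."""
    return max(0.0, x - knot)

def hinge_left(x, knot):
    """max(0, knot - x): falling towards the knot, then zero."""
    return max(0.0, knot - x)

def predict(x):
    # f(x) = c1 * 1 + c2 * max(0, x - 3) + c3 * max(0, 3 - x)
    return 5.0 * 1 + 2.0 * hinge_right(x, 3.0) - 1.5 * hinge_left(x, 3.0)

for x in (1.0, 3.0, 5.0):
    print(x, predict(x))
```

The result is a piecewise linear curve whose slope changes at the knot, which is what lets such models follow non-linear relationships.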
Coefficient of determination ([e 262], $R^2$): reflects how much of the variance of a dependent variable can be predicted from the independent variables. The most general definition is:
$$R^2 = 1 - \frac{SS_{\rm res}}{SS_{\rm tot}}$$
where $SS_{\rm res}$ is the residual sum of squares after the regression analysis (reflecting how far the model's predicted values tend to be from the actual values), and $SS_{\rm tot}$ is the variance multiplied by the sample size (reflecting the overall variation in the sample). So if the model predicts perfectly, $SS_{\rm res} = 0$ and $R^2 = 1$ [151].
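A minimal sketch of this definition, computing $SS_{\rm tot}$ as the sum of squared deviations from the mean (which equals the population variance times the sample size). The function name and data are illustrative.

```python
def r_squared(actual, predicted):
    """R^2 = 1 - SS_res / SS_tot for paired actual and predicted values."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in actual)
    return 1 - ss_res / ss_tot

actual = [1, 2, 3]
print(r_squared(actual, [1, 2, 3]))  # perfect prediction
print(r_squared(actual, [2, 2, 2]))  # always predicting the mean
```

A model that always predicts the mean explains none of the variation, so it scores 0; predictions worse than that give a negative value under this definition.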
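Multicollinearity [e 263]: a problem that sometimes arises in multiple regression models, in which the value of one independent variable can be predicted linearly from the other independent variables, $x_1 = f(x_2, x_3, \ldots, x_i)$, with fairly high accuracy. Under multicollinearity, the estimated coefficients of the multiple regression model (the $\beta$ values) may change erratically in response to small changes in the model or the data. Multicollinearity can also cast doubt on the model's predictive power. In principle, changing $x_1$ in a multiple regression model while holding the other $x_i$ fixed shows how $y$ varies with $x_1$; but with multicollinearity, a change in $x_1$ goes together with changes in the other $x_i$, so "hold the other $x_i$ fixed and change $x_1$" is no longer feasible. For this reason, the statistics community has seriously discussed how multicollinearity should be handled [152].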
General linear [e 265] model: a way of writing several linear regression models at once, which can be expressed as [154]:
$$\mathbf{Y} = \mathbf{X}\mathbf{B} + \mathbf{U}$$
where $\mathbf{Y}$ is a matrix containing the dependent variables, $\mathbf{X}$ is a matrix containing the independent variables, $\mathbf{B}$ is a matrix containing the parameters, and $\mathbf{U}$ is a matrix containing the error values.
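Logistic [e 266] regression: the dependent variable is binary (it has only two possible values), for example "lose or win", while the independent variables can be continuous or discrete. Logistic regression can be used to predict, from each case's values on the independent variables, which of two classes the case belongs to; in video game research, for example, it can be used to estimate from a player's data whether the player will lose or win [155].
With $p = P(Y=1)$ and $1-p = P(Y=0)$, the model can be written as the equation:
$$\ell = \log_b \frac{p}{1-p} = \beta_0 + \beta_1 x_1 + \beta_2 x_2$$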
Sigmoid function: the following function:
$$S(x) = \frac{1}{1+e^{-x}} = \frac{e^x}{e^x+1}$$
(Figure: graph of the sigmoid function.)
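The sketch below connects the sigmoid function to the logistic regression equation, assuming the natural logarithm (base $b = e$), in which case the sigmoid is the inverse of the log-odds. The function names and coefficient values are hypothetical.

```python
import math

def sigmoid(x):
    """S(x) = 1 / (1 + e^(-x)). Very negative x can overflow math.exp;
    this sketch does not guard against that."""
    return 1 / (1 + math.exp(-x))

def logit(p):
    """Log-odds of a probability p, using the natural logarithm."""
    return math.log(p / (1 - p))

def predict_probability(x1, x2, b0, b1, b2):
    """P(Y = 1) for a two-variable logistic regression with given coefficients."""
    return sigmoid(b0 + b1 * x1 + b2 * x2)

print(sigmoid(0))                     # log-odds of 0 means even chances
print(sigmoid(logit(0.3)))            # recovers 0.3 up to rounding
print(predict_probability(1.0, 2.0, b0=-1.0, b1=0.5, b2=0.25))
```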
Autoregressive model [e 267]: a regression model for handling time series. Take a variable $x$ that changes over time and let $x_t$ be the value of $x$ at time $t$; an autoregressive model uses the variable's past values as independent variables to predict its current value:
$$x_t = f(x_{t-1}, x_{t-2}, \ldots, x_1)$$
Poisson [e 268] regression: a kind of regression analysis used on count data. The most basic model looks like this:
$$\log(\operatorname{E}(Y \mid \mathbf{x})) = \alpha + \mathbf{\beta}'\mathbf{x}$$
where $Y$ is the dependent variable (usually assumed to follow a Poisson distribution), $\mathbf{x}$ is a vector containing the independent variables, and $\alpha$ and $\mathbf{\beta}'$ are the parameters [156].
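Ordinary least squares [e 269]: one of the most commonly used methods for estimating the parameter values of a linear regression model. Methods of this kind choose the model's parameters so that the residual sum of squares (RSS) [e 270] is as small as possible (for making a quantity as large or as small as possible, see optimization). RSS is the number obtained by adding up the squares of all the error values [149]:
$$RSS = \sum_{i=1}^{n} e_i^2$$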
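For a model with a single independent variable, the parameters that minimise RSS have a well-known closed form, sketched below. The function names are illustrative, and the data are generated from the example line $y = 0.5x + 30$ used earlier in this article.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

def rss(xs, ys, slope, intercept):
    """Residual sum of squares: the sum of squared errors e_i."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

xs = [0, 10, 20, 30]
exact = [0.5 * x + 30 for x in xs]   # points lying exactly on the line
noisy = [31, 34, 41, 44]             # the same trend with some error

print(fit_line(xs, exact))
slope, intercept = fit_line(xs, noisy)
print(slope, intercept, rss(xs, noisy, slope, intercept))
```

Any other slope or intercept gives a larger RSS on the same data, which is what "least squares" refers to.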
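Stepwise [e 271] regression: a procedure for building a linear regression model and estimating its parameters one step at a time. It either
starts from a regression model with no independent variables and adds independent variables one by one, using predetermined rules at each step to decide which variable to add (forward); or
starts from a regression model containing all the independent variables and removes them one by one, checking how the model's predictive power changes and using predetermined rules at each step to decide which variable to remove (backward).
In twenty-first-century statistics, stepwise regression has been widely criticised, so few people use it [157].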
Linear relationship: two variables $x$ and $y$ are linearly related if plotting their values gives a straight line, with the equation [158]:
$$y = ax + b$$
where $a$ is a fixed parameter (the slope) and $b$ is the intercept [e 272].
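Fixed effects [e 273] model: a model whose parameters are fixed, or at least non-random, values [159].
Random effects [e 274] model: a model whose parameters are random variables [159].
Mixed [e 275] model: a model in which some parameters are fixed or non-random and others are random variables [159].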
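Nested [e 276] model: saying that model $A$ is nested within model $B$ means that the parameters of $A$ are a subset of those of $B$. Researchers can compare the goodness-of-fit indicators of different models to see which model describes the data in hand most effectively [160]. See also the concept of Occam's razor.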
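Hierarchical linear model (HLM) [e 277]: a statistical method that is very useful for multilevel analysis [e 278]. "Multilevel analysis" means that the sample contains $i$ groups, every individual belongs to one of the groups, and the researcher has reason to believe that the groups differ from one another systematically.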
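For example, a management study wants to analyse the employees (individuals) in a company (the sample), and each employee belongs to a work team (a group within the sample). The researcher has reason to believe that differences between teams (such as the team leader's leadership ability) affect the phenomenon under study, so the researcher runs an HLM, using an equation like the following to put variables from different levels into a single formula [161]:
$$Y_{ij} = \beta_{0j} + \beta_{1j}X_{ij} + \beta_{2}G_{j} + e_{ij}$$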
$Y_{ij}$ is a level-1 dependent variable (the subscript $i$ refers to the individual and the subscript $j$ to the group);
$X_{ij}$ is a level-1 independent variable;
$G_j$ is a level-2 (group-level) independent variable, whose value is the same for all members of the same group;
$\beta_{0j}$ is the intercept;
the remaining $\beta$ values are regression coefficients [e 279], reflecting how well the independent variable each is attached to predicts the dependent variable, and $e_{ij}$ is the error.
In words, the equation says that the value of $Y_{ij}$ is influenced by the values of the two variables $X_{ij}$ and $G_j$, and that when these two variables are used to predict $Y_{ij}$, the error is $e_{ij}$. Now imagine that $Y_{ij}$ is "the job performance of employee $i$ in work team $j$", $X_{ij}$ is "the physical health of employee $i$ in work team $j$", and $G_j$ is "the leadership ability of the leader of work team $j$".
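The researcher then collects data, runs the statistical analysis, and uses the data to estimate $\beta_{1j}$ and $\beta_2$. If the data show, for example, that an employee's physical health predicts job performance better than the leadership ability of the team's leader does (put simply, $\beta_{1j} > \beta_2$), then the researcher has found something useful (for an organisation that wants to improve employee performance, ensuring employees' health matters more than leadership ability) and can publish the result in a journal. HLM is common in social-science fields such as management, because these fields often face samples that contain several subgroups [162].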
Factor analysis
(Figure: a diagram of a latent variable model. The researcher wants to measure an unobservable factor $T$ (for example intelligence), so participants take a test with $k$ items, $X_1$, $X_2$, $X_3$, ..., $X_k$. Each item has an error value $e_i$ and a value $\lambda_i$, the factor loading, which roughly reflects how strongly the score on that item correlates with $T$.)
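Factor analysis [e 280] is a family of statistical methods for turning a large number of variables into a small number of factors [e 281]. There are many ways of doing it, but in general a factor that can explain the variation in several directly observed variables is inferred from those variables, and the resulting factor reflects their variation to some degree. To illustrate:
Imagine that the data set in hand has several observable [e 282] random variables $x_1, x_2, \ldots, x_p$, with means $\mu_1, \mu_2, \ldots, \mu_p$.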
Imagine there are $k$ latent variables [e 283] $F_j$, $j \in \{1, \ldots, k\}$, whose values cannot be observed directly; these $F_j$ are the so-called factors [note 8];
Before the factor analysis, the values of $F_j$ are unknown, and the purpose of factor analysis is to find the parameters in the following equations:
$$x_i - \mu_i = l_{i1}F_1 + \cdots + l_{ik}F_k + \varepsilon_i$$
where
$i \in \{1, \ldots, p\}$;
the $l_{ij}$ are parameters;
$\varepsilon_i$ is the error, with mean 0 and a finite variance; the variance of $\varepsilon_i$ may differ between different $i$.
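Suppose $p$ is large (there are many $x_i$) and the researcher finds it cumbersome to use all the $x_i$ in every computation, and suppose $k < p$. If the researcher finds the parameter values of the equations above, the values of the $F_j$ can be used to summarise the whole data set, achieving the effect of analysing with fewer variables [163].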
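Latent variable model [e 284]: a model describing the links between observable (or manifest) variables and the latent variables behind them.
Factor structure [e 285]: the "structure" of a factor, including information such as which observed variables reflect the factor and what each variable's factor loading [e 286] is.
Factor loading [e 287]: a number attached to each measured variable and the latent factor, giving the statistical correlation between that variable and the factor; for standardised variables it lies between -1 and 1, and its magnitude between 0 and 1. A large loading means the variable is strongly correlated with the latent factor. A small loading means the correlation is weak; researchers usually take this to mean that the variable does not really reflect the latent factor and consider removing it from the model.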
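Factor analysis can be divided into two broad types [164]:
Exploratory factor analysis [e 288]: factor analysis in which the researcher makes no prior assumptions and estimates from the data in hand both the number of factors and the form of each variable's equation for $x_i - \mu_i$;
Confirmatory factor analysis [e 289]: factor analysis in which the researcher already has a model before the analysis. The model specifies information such as how many factors there are and which factors each variable is a function of, and the analysis then computes indicators of how accurately the model describes the actual data.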
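Principal component analysis [e 290]: a kind of factor analysis. Imagine a set of cases, each with a value on two variables (see figure). Two lines can be drawn (the two arrows in the figure), each expressible as an equation in $x$ and $y$. The figure makes it clear that the line with the long arrow captures more of the variance, which means that the "component" represented by the long arrow is better at telling the cases apart and is therefore more "important". In the simplest case, an algorithm for principal component analysis works roughly like this [165]:
1. Take the data;
2. Draw a line, described by an equation involving the variables in the data;
3. Compute the variance along this line;
4. Change the parameters of the line;
5. Compute the variance along the new line;
6. Repeat steps 4 and 5 through all the specified possibilities, and finally output the line with the largest variance.
(Figure: illustration of principal component analysis; each point is a case, the two arrows represent two components, and the longer arrow is the more important component.)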
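The search described in the steps above can be sketched as a brute-force scan over directions. This is only an illustration of those steps under simplifying assumptions (two variables, candidate lines through the data at whole-degree angles); practical implementations usually rely on an eigendecomposition or singular value decomposition instead. The names and data are hypothetical.

```python
import math

def variance_along(points, angle_deg):
    """Variance of the points after projecting them onto a line at this angle."""
    theta = math.radians(angle_deg)
    ux, uy = math.cos(theta), math.sin(theta)
    projected = [x * ux + y * uy for x, y in points]
    m = sum(projected) / len(projected)
    return sum((p - m) ** 2 for p in projected) / len(projected)

def first_component_angle(points, candidates=range(180)):
    """Try every candidate angle and keep the one with the largest variance."""
    return max(candidates, key=lambda a: variance_along(points, a))

# cases spread mainly along the diagonal y = x
points = [(1, 1), (2, 2), (3, 3), (4, 4), (2, 3), (3, 2)]
best = first_component_angle(points)
print(best, variance_along(points, best))
```

The direction perpendicular to the best one captures the remaining variance and plays the role of the shorter arrow.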
Cronbach's alpha ([e 291], $\rho_T$): a value often used in psychometrics to gauge the reliability (see above) of a psychological test. Imagine a psychological test with $k$ items that all measure a single factor (for example 10 items measuring logical ability). After people take the test and the researcher has the data, the test's Cronbach's alpha ($\rho_T$) is given by [166][167]:
$$\rho_T = \frac{k^2\,\overline{\sigma_{ij}}}{\sigma_X^2}$$
where
$\overline{\sigma_{ij}}$ is the mean of the covariances [e 292] between each pair of items;
$\sigma_X^2$ is the sum of the item variances plus the sum of the covariances between items, that is
$$\sigma_X^2 = \sum_{i=1}^{k}\sum_{j=1}^{k}\sigma_{ij} = \sum_{i=1}^{k}\sigma_i^2 + \sum_{i=1}^{k}\sum_{j\neq i}^{k}\sigma_{ij}$$
(for the meaning of these symbols, see summation);
If Cronbach's alpha is large (close to 1), the variance of the items comes mainly from the covariances among them; put simply, the variation in the items is driven mainly by the correlation among them rather than by each item's independent variation. So when a set of items has a large Cronbach's alpha, the researcher has more reason to believe that the items measure the same latent factor [166].
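A minimal sketch of this formula, assuming the mean covariance is taken over pairs of different items and using population covariances throughout. The function names and the scores are made up for illustration.

```python
def covariance(a, b):
    """Population covariance of two equal-length score lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / len(a)

def cronbach_alpha(items):
    """items: one list of scores per test item, all over the same respondents."""
    k = len(items)
    between = [covariance(items[i], items[j])
               for i in range(k) for j in range(k) if i != j]
    mean_cov = sum(between) / len(between)
    # sigma_X^2: every variance and covariance summed (variance of the total score)
    total_var = sum(covariance(items[i], items[j])
                    for i in range(k) for j in range(k))
    return k ** 2 * mean_cov / total_var

consistent = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
mixed = [[1, 2, 3, 4, 5], [2, 1, 4, 3, 5], [1, 3, 2, 5, 4]]
print(cronbach_alpha(consistent))  # items move together perfectly
print(cronbach_alpha(mixed))
```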
Tensor decomposition [e 293]: "decomposing" data expressed as a tensor into simpler tensors and operations between them, for example [168]:
$$T = v_1 \otimes w_1 \otimes \alpha_1 + v_2 \otimes w_2 \otimes \alpha_2 + \cdots + v_N \otimes w_N \otimes \alpha_N$$
where $T$ is a more complex tensor, the $v$, $w$ and $\alpha$ are simpler tensors, and $\otimes$ is the tensor product (a mathematical operation). The purpose of tensor decomposition is to make the computations less cumbersome [168].
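Miscellaneous models
(Figure: illustration of survival analysis. Imagine a group of mice kept in a laboratory; the X axis shows time and the Y axis shows the percentage of mice still alive.)
(Figure: illustration of cluster analysis. The objects can be divided into three classes by their coordinates, and cluster analysis can be pictured as colouring the points.)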
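Survival analysis [e 301]: a set of statistical techniques for analysing how long it takes for an event to happen, for example how long it takes for an animal to die, in which case the event is "death". Survival analysis comprises a range of techniques that can answer questions such as:
Given a time $t$, what percentage of the individuals in the population will have died after $t$ has passed?
Among the individuals still alive, at what rate will they die?
... and so on. More advanced applications replace "death" with other events. Research on video games and human-computer interaction, for example, takes "the user abandons the game or product" as the event of interest in order to analyse how users behave with electronic products [175]. Survival analysis is also used in engineering and economics [176].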
Survival function ([e 302], $S(t)$): a function giving the probability ($P$) that the survival time ($T$) of an arbitrarily chosen individual exceeds $t$, that is [177]:
$$S(t) = P(\{T > t\})$$
$S(t)$ can take many different forms; the exponential function is one commonly used survival function.
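Two small sketches of a survival function: an exponential form, assuming a constant event rate `rate` (a hypothetical parameter name), and a simple empirical estimate from fully observed lifetimes. Real survival data usually include censored observations, which this sketch does not handle.

```python
import math

def exponential_survival(t, rate):
    """S(t) = exp(-rate * t): probability of surviving beyond time t."""
    return math.exp(-rate * t)

def empirical_survival(lifetimes, t):
    """Share of observed lifetimes strictly greater than t, i.e. P(T > t)."""
    return sum(1 for life in lifetimes if life > t) / len(lifetimes)

lifetimes = [1, 2, 3, 4]  # made-up survival times in days
print(exponential_survival(0, rate=0.5))
print(empirical_survival(lifetimes, 2))
```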
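Cluster analysis [e 303]: grouping objects so that the objects within a group (cluster) are similar to one another but dissimilar to objects outside the group. At its most basic, cluster analysis can be pictured as in the accompanying figure: each point has a position on the X axis (one variable) and the Y axis, and it is visible to the naked eye that the points fall into three broad classes (points of different colours). In each cluster, the points are close to one another and all far from points outside the cluster. Cluster analysis can be pictured as colouring the points to show which cluster each belongs to [178][179].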
Markov chain [e 304]: a random mathematical model for simulating a sequence of possible events. A Markov chain has a number of possible states, and each state $s_i$ has a list of numbers $[p_0, p_1, p_2, \ldots]$ giving the probabilities that the world moves from state $s_i$ to each other state. In statistics, a simple approach is to collect data and use them to estimate the values of $[p_0, p_1, p_2, \ldots]$, producing a model that can predict the pattern of change in the world [180].
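One simple way to estimate the transition probabilities from data, sketched here, is to count how often each state is followed by each other state in an observed sequence and normalise the counts. The weather sequence and the function name are made up for illustration.

```python
from collections import Counter, defaultdict

def estimate_transitions(sequence):
    """Estimate P(next state | current state) from consecutive pairs."""
    counts = defaultdict(Counter)
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    return {
        state: {nxt: c / sum(seen.values()) for nxt, c in seen.items()}
        for state, seen in counts.items()
    }

weather = ["sun", "sun", "rain", "sun", "rain", "rain", "sun"]
model = estimate_transitions(weather)
print(model["sun"])   # estimated probabilities of what follows a sunny day
print(model["rain"])
```

With so few observations the estimates are very rough; longer sequences give more stable values.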
Independent component analysis [e 305]: common in signal processing. It splits a signal influenced by several variables, $\boldsymbol{x} = (x_1, \ldots, x_m)^T$, into mutually independent parts, turning $\boldsymbol{x} = (x_1, \ldots, x_m)^T$ into $\boldsymbol{s} = (s_1, \ldots, s_n)^T$, where each $x_i$ is a linear combination of the $s_k$. Put simply, for each $x_i$,
$$x_i = a_{i,1}s_1 + \cdots + a_{i,k}s_k + \cdots + a_{i,n}s_n$$
(each $a_{i,k}$ reflects how strongly $s_k$ influences $x_i$),
where the $s_k$ should be as independent of one another as possible [181].
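Point process [e 306], also called a point field [e 307]: a statistical model pictured as randomly generating points in a space. It can be used for many analyses involving space. In geology, for example, when analysing earthquakes, the centre of an earthquake can be pictured as a point in space, and the occurrence of one earthquake may raise the chance of earthquake points appearing in the surrounding space (see aftershock). When building a statistical model to analyse earthquakes, the analyst can picture "the appearance of an epicentre" as a random process that generates a point in the space representing the ground [182].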
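Linear discriminant analysis [e 308]: taking several independent variables and finding a linear combination of all of them that separates things into several "classes".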
Put simply, these are events that cannot happen at the same time. For example, when three dice are thrown, "throwing $\{1,3,5\}$" and "throwing $\{2,4,6\}$" cannot both happen, but "throwing at least one 2" and "throwing at least one 4" can both happen.
$p$ is a parameter of the binomial distribution.
Beyond this, several further conditions are needed:
the change in $Y$ occurs after the change in $X$ in time;
when the common causes of $X$ and $Y$ are removed, the relationship between the two variables remains;
... and so on.
By 2020, Harman's test was no longer regarded as a reliable method.
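In practice, working with so many extremely small values can cause arithmetic underflow (the value to be handled is smaller than the smallest value the computer can represent), so computing $\text{Pr}(X|\theta)$ in practice takes some care.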
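A common workaround, shown here as a sketch rather than as the method the cited sources use, is to add logarithms of probabilities instead of multiplying the probabilities themselves. The 1000 identical probabilities are made-up values chosen to trigger underflow in double-precision floating point.

```python
import math

probabilities = [1e-5] * 1000  # many small per-observation probabilities

# Multiplying directly: the true value (1e-5000) is far too small for a float
naive_product = math.prod(probabilities)

# Summing logarithms keeps the same information in a representable number
log_likelihood = sum(math.log(p) for p in probabilities)

print(naive_product)    # underflows to 0.0
print(log_likelihood)   # a large negative but ordinary number
```

Because the logarithm is increasing, comparing log-likelihoods ranks parameter values in the same order as comparing the likelihoods themselves.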
If $a > b$, then $\max(a,b) = a$; otherwise $\max(a,b) = b$.
a
∈
b
{\displaystyle a\in b}
」意思係「
a
{\displaystyle a}
喺
b
{\displaystyle b}
呢個集 入面」。
Ethem Alpaydin (2004). Introduction to Machine Learning , MIT Press, ISBN 978-0-262-01243-0 .
Cohen, J. (1990). Things I have learned (so far) . American Psychologist , 45 , 1304-1312.
Pedro Domingos (September 2015), The Master Algorithm , Basic Books, ISBN 978-0-465-06570-7 .
Trevor Hastie, Robert Tibshirani and Jerome H. Friedman (2001). The Elements of Statistical Learning , Springer. ISBN 0-387-95284-5 .
Stephen Jones, 2010. Statistics in Psychology: Explanations without Equations . Palgrave Macmillan. ISBN 9781137282392 .
David J. C. MacKay. Information Theory, Inference, and Learning Algorithms . Cambridge: Cambridge University Press, 2003. ISBN 0-521-64298-1 .
VanderPlas, J. (2016). Python data science handbook: essential tools for working with data . O'Reilly Media, Inc.
Ian H. Witten and Eibe Frank (2011). Data Mining: Practical machine learning tools and techniques . Morgan Kaufmann, 664pp., ISBN 978-0-12-374856-0 .
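The English (or other foreign-language) versions of the jargon and proper nouns used in this article are as follows:
This article cites the following literature and web pages: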
Henk Tijms (2004). Understanding Probability . Cambridge Univ. Press.
Moses, Lincoln E. (1986). Think and Explain with Statistics , Addison-Wesley. pp. 1-3.
Hays, William Lee, (1973). Statistics for the Social Sciences , Holt, Rinehart and Winston, p.xii.
William Feller, An Introduction to Probability Theory and Its Applications , (Vol 1), 3rd Ed, (1968), Wiley.
Kolmogorov, Andrey (1950) [1933]. Foundations of the theory of probability . New York, USA: Chelsea Publishing Company.
Papoulis, A. (1984). "Bernoulli Trials". Probability, Random Variables, and Stochastic Processes (2nd ed.). New York: McGraw-Hill. pp. 57-63.
Emanuel Parzen (2015). Stochastic Processes . Courier Dover Publications. pp. 7, 8.
Gagniuc, Paul A. (2017). Markov Chains: From Theory to Implementation and Experimentation . USA, NJ: John Wiley & Sons. pp. 1-256.
Doyle, Peter G.; Snell, J. Laurie (1984). Random Walks and Electric Networks . Carus Mathematical Monographs. 22. Mathematical Association of America.
Edwards, A.W.F (2002). Pascal's arithmetical triangle: the story of a mathematical idea (2nd ed.). JHU Press.
Yao, Kai; Gao, Jinwu (2016). "Law of Large Numbers for Uncertain Random Variables". IEEE Transactions on Fuzzy Systems . 24 (3): 615-621.
Billingsley, Patrick (1999). Convergence of probability measures (2nd ed.). John Wiley & Sons. pp. 1–28.
Mahmoodian, Ebadollah S.; Rezaie, M.; Vatan, F. (March 1987). "Generalization of Venn Diagram". Eighteenth Annual Iranian Mathematics Conference . Tehran and Isfahan, Iran.
Miller, Scott; Childers, Donald (2012). Probability and Random Processes (Second ed.). Academic Press. p. 8. ISBN 978-0-12-386981-4 . "The sample space is the collection or set of 'all possible' distinct (collectively exhaustive and mutually exclusive) outcomes of an experiment."
Dawid, A. P. (1979). "Conditional Independence in Statistical Theory". Journal of the Royal Statistical Society, Series B . 41 (1): 1-31.
Ash, Robert B. (2008). Basic probability theory (Dover ed.). Mineola, N.Y.: Dover Publications. pp. 66–69.
Çınlar, Erhan (2011). Probability and stochastics . New York: Springer. p. 51.
Manikandan, S (1 January 2011). "Frequency distribution". Journal of Pharmacology & Pharmacotherapeutics . 2 (1): 54–55.
Deisenroth, Marc Peter; Faisal, A. Aldo; Ong, Cheng Soon (2019). Mathematics for Machine Learning . Cambridge University Press. p. 181.
Ali, Mir M. (1980). "Characterization of the Normal Distribution Among the Continuous Symmetric Spherical Class". Journal of the Royal Statistical Society . Series B (Methodological). 42 (2): 162–164.
Spanos, Aris (1999). Probability Theory and Statistical Inference . New York: Cambridge University Press. pp. 109–130.
MacGillivray, HL (1992). "Shape properties of the g- and h- and Johnson families". Communications in Statistics - Theory and Methods . 21: 1244–1250.
Altman, Douglas G; Bland, J Martin (2005-10-15). "Standard deviations and standard errors". BMJ: British Medical Journal . 331 (7521): 903.
Hazewinkel, Michiel, ed. (2001) [1994], "Joint distribution", Encyclopedia of Mathematics , Springer Science+Business Media B.V. / Kluwer Academic Publishers.
Dinov, Ivo; Christou, Nicolas; Sanchez, Juana (2008). "Central Limit Theorem: New SOCR Applet and Demonstration Activity". Journal of Statistics Education . ASA. 16 (2).
Lescroël, A. L.; Ballard, G.; Grémillet, D.; Authier, M.; Ainley, D. G. (2014). Descamps, Sébastien (ed.). "Antarctic Climate Change: Extreme Events Disrupt Plastic Phenotypic Response in Adélie Penguins". PLoS ONE . 9 (1): e85291.
Mulholland, H., & Jones, C. R. (2013). Fundamentals of statistics . Springer.
Clarkson, K. L., & Shor, P. W. (1989). Applications of random sampling in computational geometry, II. Discrete & Computational Geometry , 4(5), 387-421.
Ken Black (2004). Business Statistics for Contemporary Decision Making (Fourth (Wiley Student Edition for India) ed.). Wiley-India.
Defulio, Anthony (2012). "Quotation: Kahneman on Contingencies". Journal of the Experimental Analysis of Behavior . 97 (2): 182.
Fisher, R.A. (1922). "On the mathematical foundations of theoretical statistics". Philosophical Transactions of the Royal Society A . 222 (594-604): 309-368.
Messner SF (1992). "Exploring the Consequences of Erratic Data Reporting for Cross-National Research on Homicide". Journal of Quantitative Criminology . 8 (2): 155-173.
Patricia M. Shields and Nandhini Rangarajan. 2013. A Playbook for Research Methods: Integrating Conceptual Frameworks and Project Management . Stillwater, OK: New Forums Press.
Mangel, Marc; Samaniego, Francisco (June 1984). "Abraham Wald's work on aircraft survivability". Journal of the American Statistical Association . 79 (386): 259-267.
Rosenbaum, P.R. (2002). Observational Studies (2nd ed.). New York: Springer-Verlag.
Song, J. W., & Chung, K. C. (2010). Observational studies: cohort and case-control studies. Plastic and reconstructive surgery , 126(6), 2234-2242.
Abramson, J.J. and Abramson, Z.H. (1999). Survey Methods in Community Medicine: Epidemiological Research, Programme Evaluation, Clinical Trials (5th edition). London: Churchill Livingstone/Elsevier Health Sciences.
Likert, Rensis (1932). "A Technique for the Measurement of Attitudes". Archives of Psychology . 140: 1–55.
Robins, Richard; Fraley, Chris; Krueger, Robert (2007). Handbook of Research Methods in Personality Psychology . The Guilford Press. pp. 228.
Asher, Herbert: Polling and the Public. What Every Citizen Should Know (4th ed. CQ Press, 1998)
Dunning, Thad (2012). Natural experiments in the social sciences : a design-based approach . Cambridge: Cambridge University Press.
Shadish, William R.; Cook, Thomas D.; Campbell, Donald T. (2002). Experimental and quasi-experimental designs for generalized causal inference (Nachdr. ed.). Boston: Houghton Mifflin.
Kirk, R. E. (2012). Experimental design . Handbook of Psychology, Second Edition, 2.
Hinkelmann, Klaus; Kempthorne, Oscar (2008). Design and Analysis of Experiments, Volume I: Introduction to Experimental Design (2nd ed.). Wiley.
Hacking, Ian (September 1988). "Telepathy: Origins of Randomization in Experimental Design". Isis . 79 (3): 427–451.
Montgomery, Douglas C. (2013). Design and Analysis of Experiments (8th ed.). Hoboken, New Jersey: Wiley.
Dinardo, J. (2008). "natural experiments and quasi-natural experiments". The New Palgrave Dictionary of Economics . pp. 856–859.
"Introduction to Clinical Research Informatics ", Rachel Richesson, James Andrews
Stevens, S. S. (7 June 1946). "On the Theory of Scales of Measurement". Science . 103 (2684): 677–680.
Michell, J (1986). "Measurement scales and statistics: a clash of paradigms". Psychological Bulletin . 100 (3): 398–407.
K.D. Joshi, Foundations of Discrete Mathematics , 1989, New Age International Limited, page 7.
Iacobucci, D., Posavac, S. S., Kardes, F. R., Schneider, M. J., & Popovich, D. L. (2015). The median split: Robust, refined, and revived. Journal of Consumer Psychology , 25(4), 690-704.
Podsakoff, P.M.; MacKenzie, S.B.; Lee, J.-Y.; Podsakoff, N.P. (October 2003). "Common method biases in behavioral research: A critical review of the literature and recommended remedies". Journal of Applied Psychology . 88 (5): 879–903.
Lim, Christopher R.; Harris, Kristina; Dawson, Jill; Beard, David J.; Fitzpatrick, Ray; Price, Andrew J. (2015-07-01). "Floor and ceiling effects in the OHS: an analysis of the NHS PROMs data set". BMJ Open . 5 (7): e007765.
Cramer, Duncan; Howitt, Dennis Laurence (2005). The SAGE Dictionary of Statistics: A Practical Resource for Students in the Social Sciences (Third ed.). SAGE. p. 21 (entry "ceiling effect").
Carmines, E. G., & Zeller, R. A. (1979). Reliability and validity assessment (Vol. 17). Sage publications.
American Educational Research Association, Psychological Association, & National Council on Measurement in Education. (1999). Standards for Educational and Psychological Testing . Washington, DC: American Educational Research Association.
Cronbach, Lee J.; Meehl, Paul E. (1955). "Construct validity in psychological tests". Psychological Bulletin . 52 (4): 281-302.
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin , 56, 81-105.
Gravetter, Frederick J.; Forzano, Lori-Ann B. (2012). Research Methods for the Behavioral Sciences (4th ed.). Belmont, Calif.: Wadsworth. p. 78.
Data, C. E., & Using Descriptive Statistics Bartz, A. E. (1988). Basic statistical concepts . New York: Macmillan. Devore, J., and Peck.
Cox, D. R.; Lewis, P. A. W. (1966). The Statistical Analysis of Series of Events . London: Methuen.
Benjamini, Y. (1988). "Opening the Box of a Boxplot". The American Statistician . 42 (4): 257-262.
E. Kreyszig (1979). Advanced Engineering Mathematics (Fourth ed.). Wiley. p. 880, eq. 5.
Hashimzade, Nigar; Myles, Gareth; Black, John (2017-01-19). A Dictionary of Economics . Oxford University Press. p. 4.
Sarndal, Swenson, and Wretman (1992), Model Assisted Survey Sampling , Springer-Verlag.
Myers, Jerome L.; Well, Arnold D.; Lorch Jr., Robert F. (2010). "Developing fundamentals of hypothesis testing using the binomial distribution". Research design and statistical analysis (3rd ed.). New York, NY: Routledge. pp. 65–90.
Adèr, H. J.; Mellenbergh, G. J. & Hand, D. J. (2007). Advising on research methods: A consultant's companion . Huizen, The Netherlands: Johannes van Kessel Publishing.
Pillemer, D. B. (1991). "One-versus two-tailed hypothesis tests in contemporary educational research". Educational Researcher . 20 (9): 13–17.
Rubin, D. B.; Little, R. J. A. (2002). Statistical analysis with missing data . New York: Wiley.
Hoenig; Heisey (2001). "The Abuse of Power". The American Statistician . 55 (1): 19–24.
Dodge, Yadolah, ed. (1987). Statistical data analysis based on the L1-norm and related methods: Papers from the First International Conference held at Neuchâtel , August 31–September 4, 1987. North-Holland Publishing.
Gillies, D. (2018). Causality, probability, and medicine . Routledge.
Kazdin, A. E. (2007). Mediators and mechanisms of change in psychotherapy research. Annu. Rev. Clin. Psychol. , 3, 1-27.
Dayer, M. R., Mard-Soltani, M., Dayer, M. S., & Alavi, S. M. R. (2014). Causality relationships between coagulation factors in type 2 diabetes mellitus: path analysis approach. Medical journal of the Islamic Republic of Iran , 28, 59.
Granger, C. W. J. (1969). "Investigating Causal Relations by Econometric Models and Cross-spectral Methods". Econometrica . 37 (3): 424–438.
Miller, R.G. (1981). Simultaneous Statistical Inference 2nd Ed. Springer Verlag New York.
Dunn, Olive Jean (1961). "Multiple Comparisons Among Means". Journal of the American Statistical Association . 56 (293): 52-64.
Cohen, J.; Cohen P.; West, S.G. & Aiken, L.S. (2002). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Psychology Press.
Rodgers, J. L.; Nicewander, W. A. (1988). "Thirteen ways to look at the correlation coefficient". The American Statistician . 42 (1): 59–66.
Rice, John (2007). Mathematical Statistics and Data Analysis . Belmont, CA: Brooks/Cole Cengage Learning. p. 138.
Baba, Kunihiro; Ritei Shibata; Masaaki Sibuya (2004). "Partial correlation and conditional correlation as measures of conditional independence". Australian and New Zealand Journal of Statistics . 46 (4): 657–664.
Koch, Gary G. (1982). "Intraclass correlation coefficient". In Samuel Kotz and Norman L. Johnson (ed.). Encyclopedia of Statistical Sciences . 4. New York: John Wiley & Sons. pp. 213–217.
Cureton, Edward E. (1956). "Rank-biserial correlation". Psychometrika . 21 (3): 287–290.
Myers, Jerome L.; Well, Arnold D. (2003). Research Design and Statistical Analysis (2nd ed.). Lawrence Erlbaum. pp. 508.
Kendall, M. (1938). "A New Measure of Rank Correlation". Biometrika . 30 (1–2): 81-89.
Gubner, John A. (2006). Probability and Random Processes for Electrical and Computer Engineers . Cambridge University Press. p.388.
Tahmasebi, Pejman; Hezarkhani, Ardeshir; Sahimi, Muhammad (2012). "Multiple-point geostatistical modeling based on the cross-correlation functions". Computational Geosciences . 16 (3): 779–797.
Papoulis, Athanasios; Pillai, S. Unnikrishna (2002). Probability, Random Variables and Stochastic Processes. McGraw-Hill. p. 211.
O'Mahony, M. (1986). Sensory Evaluation of Food: Statistical Methods and Procedures. CRC Press. p. 487. ISBN 0-82477337-3.
Derrick, B; Toher, D; White, P (2017). "How to compare the means of two samples that include paired observations and independent observations: A companion to Derrick, Russ, Toher and White (2017)". The Quantitative Methods for Psychology. 13 (2): 120–126.
Howell, David (2002). Statistical Methods for Psychology. Duxbury. pp. 324–325.
Gueorguieva; Krystal (2004). "Move Over ANOVA". Arch Gen Psychiatry. 61 (3): 310–317.
Fujikoshi, Yasunori (1993). "Two-way ANOVA models with unbalanced data". Discrete Mathematics. 116 (1): 315–334.
Warne, R. T. (2014). "A primer on multivariate analysis of variance (MANOVA) for behavioral scientists". Practical Assessment, Research & Evaluation. 19 (17): 1–10.
Keppel, G. (1991). Design and analysis: A researcher's handbook (3rd ed.). Englewood Cliffs: Prentice-Hall, Inc.
McCulloch, J. Huston (1985). "On Heteroscedasticity". Econometrica. 53 (2): 483.
Bagdonavicius, V., Kruopis, J., Nikulin, M. S. (2011). Non-parametric tests for complete data. ISTE & Wiley: London & Hoboken.
Hettmansperger, T. P.; McKean, J. W. (1998). Robust nonparametric statistical methods. Kendall's Library of Statistics. Vol. 5 (First ed., rather than Taylor and Francis (2010) second ed.). London; New York: Edward Arnold; John Wiley and Sons, Inc. pp. xiv+467.
Rosenthal, Robert (1994). "Parametric measures of effect size". In H. Cooper and L. Hedges (eds.). The Handbook of Research Synthesis. pp. 231–244.
Everitt, Brian S. (2002). The Cambridge Dictionary of Statistics. Cambridge University Press. p. 128.
Neyman, J. (1937). "Outline of a Theory of Statistical Estimation Based on the Classical Theory of Probability". Philosophical Transactions of the Royal Society of London A. 236: 333–380.
Lindley, D. V. (1953). "Statistical Inference". Journal of the Royal Statistical Society, Series B. 16: 30–76.
Walter, E.; Pronzato, L. (1997). Identification of Parametric Models from Experimental Data. London, England: Springer-Verlag.
Rossi, Richard J. (2018). Mathematical Statistics: An Introduction to Likelihood Based Inference. New York: John Wiley & Sons. p. 227.
Golub, Gene H.; van der Vorst, Henk A. (2000). "Eigenvalue computation in the 20th century". Journal of Computational and Applied Mathematics. 123 (1–2): 35–65.
Rosenthal, G. & Rosenthal, J. (2011). Statistics and Data Interpretation for Social Work. Springer Publishing Company.
MacKinnon, D. P. (2008). Introduction to Statistical Mediation Analysis. New York: Erlbaum.
Baron, R. M. and Kenny, D. A. (1986). "The Moderator-Mediator Variable Distinction in Social Psychological Research – Conceptual, Strategic, and Statistical Considerations", Journal of Personality and Social Psychology, Vol. 51(6), pp. 1173–1182.
Aiken, L. S., West, S. G., & Reno, R. R. (1991). Multiple regression: Testing and interpreting interactions. Sage.
Pearl, J. (2009). Simpson's Paradox, Confounding, and Collapsibility. In Causality: Models, Reasoning and Inference (2nd ed.). New York: Cambridge University Press.
Horst, P. (1941). The prediction of personal adjustment. Social Science Research Council Bulletin, 48. New York, NY: Social Science Research Council.
Eisenhauer, J. G. (2008). "Degrees of Freedom". Teaching Statistics. 30 (3): 75–78.
Thabane, L., Mbuagbaw, L., Zhang, S., Samaan, Z., Marcucci, M., Ye, C., ... & Debono, V. B. (2013). A tutorial on sensitivity analyses in clinical trials: the what, why, when and how (archived at the Internet Archive, 26 May 2020). BMC Medical Research Methodology, 13(1), 92.
Kroese, D. P.; Brereton, T.; Taimre, T.; Botev, Z. I. (2014). "Why the Monte Carlo method is so important today". WIREs Comput Stat. 6 (6): 386–392.
Farcomeni, A.; Greco, L. (2013), Robust methods for data reduction, Boca Raton, FL: Chapman & Hall/CRC Press.
Van Geert, P. (2009). Nonlinear complex dynamical systems in developmental psychology. In S. J. Guastello, M. Koopmans, & D. Pincus (Eds.), Chaos and complexity in psychology: The theory of nonlinear dynamical systems (pp. 242–281). Cambridge University Press.
Cornell, J. E. & Mulrow, C. D. (1999). Meta-analysis. In: H. J. Adèr & G. J. Mellenbergh (Eds). Research Methodology in the social, behavioral and life sciences (pp. 285–323). London: Sage.
Cox, D. R. (2006), Principles of Statistical Inference, Cambridge University Press, p. 178.
Cox, D. R. (2006), Principles of Statistical Inference, Cambridge University Press, p. 197.
Huber-Carol, C.; Balakrishnan, N.; Nikulin, M. S.; Mesbah, M., eds. (2002), Goodness-of-Fit Tests and Model Validity, Springer.
Singh, R. (2009). Does my structural model represent the real phenomenon?: a review of the appropriate use of Structural Equation Modelling (SEM) model fit indices. The Marketing Review, 9(3), 199–212.
Sarstedt, M., Henseler, J. and Ringle, C. (2011), "Multi-group analysis in partial least squares (PLS) path modeling: alternative methods and empirical results", Advances in International Marketing, Vol. 22 No. 1, pp. 195–218.
Takayama, Akira (1985). Mathematical Economics (2nd ed.). New York: Cambridge University Press. p. 61.
Everitt, B. S.; Hand, D. J. (1981). Finite mixture distributions. Chapman & Hall.
Lindley, D. V. (1987). "Regression and correlation analysis", New Palgrave: A Dictionary of Economics, v. 4, pp. 120–123.
Friedman, J. H. (1991). "Multivariate Adaptive Regression Splines". The Annals of Statistics. 19 (1): 1–67.
Hughes, Ann; Grawoig, Dennis (1971). Statistics: A Foundation for Analysis. Reading: Addison-Wesley. pp. 344–348.
Farrar, Donald E.; Glauber, Robert R. (1967). "Multicollinearity in Regression Analysis: The Problem Revisited". Review of Economics and Statistics. 49 (1): 92–107.
Mardia, K. V.; Kent, J. T.; Bibby, J. M. (1979). Multivariate Analysis. Academic Press.
Constant, T., & Levieux, G. (2019, May). Dynamic difficulty adjustment impact on players' confidence. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (pp. 1–12).
Cameron, A. C.; Trivedi, P. K. (1998). Regression analysis of count data. Cambridge University Press.
Edwards, Harold M. (1995). Linear Algebra. Springer. p. 78.
Laird, Nan M.; Ware, James H. (1982). "Random-Effects Models for Longitudinal Data". Biometrics. 38 (4): 963–974.
Inness, M., Turner, N., Barling, J., & Stride, C. B. (2010). Transformational leadership and employee safety performance: a within-person, between-jobs design. Journal of Occupational Health Psychology, 15(3), 279. This management study used a nested model to analyse (put simply) how changing jobs and managers' leadership ability affect certain employee behaviours.
Hofmann, D. A., Griffin, M. A., & Gavin, M. B. (2000). The application of hierarchical linear modeling to organizational research.
Hofmann, D. A., & Gavin, M. B. (1998). Centering decisions in hierarchical linear models: Implications for research in organizations. Journal of Management, 24(5), 623–641.
Child, Dennis (2006), The Essentials of Factor Analysis (3rd ed.), Continuum International.
Polit, D. F.; Beck, C. T. (2012). Nursing Research: Generating and Assessing Evidence for Nursing Practice, 9th ed. Philadelphia, USA: Wolters Kluwer Health, Lippincott Williams & Wilkins.
Jolliffe, I. T. (1986). Principal Component Analysis. Springer Series in Statistics. Springer-Verlag.
Cho, E. (2016). Making reliability reliable: A systematic approach to reliability coefficients. Organizational Research Methods, 19(4), 651–682.
Green, S. B., & Yang, Y. (2009). Commentary on coefficient alpha: A cautionary tale. Psychometrika, 74(1), 121–135.
Kaplan, D. (2008). Structural Equation Modeling: Foundations and Extensions (2nd ed.). SAGE. pp. 79–88.
Vandenberg, Robert J.; Lance, Charles E. (2000). "A Review and Synthesis of the Measurement Invariance Literature: Suggestions, Practices, and Recommendations for Organizational Research". Organizational Research Methods. 3: 4–70.
Pearl, Judea (May 2018). The Book of Why. New York: Basic Books. p. 6.
Loehlin, J. C. (2004). Latent Variable Models: An Introduction to Factor, Path, and Structural Equation Analysis. Psychology Press.
Hamaker, Ellen; Kuiper, Rebecca; Grasman, Raoul (March 2015). "A Critique of the Cross-Lagged Panel Model". Psychological Methods. 20 (1): 102–116.
Mund, Marcus; Nestler, Steffen (September 2019). "Beyond the Cross-Lagged Panel Model: Next-generation statistical tools for analyzing interdependencies across the life course". Advances in Life Course Research. 41: 100249.
Allart, T., Levieux, G., Pierfitte, M., Guilloux, A., & Natkin, S. (2016, September). Design influence on player retention: A method based on time varying survival analysis. In 2016 IEEE Conference on Computational Intelligence and Games (CIG) (pp. 1–8). IEEE.
Collett, David (2003). Modelling Survival Data in Medical Research (Second ed.). Boca Raton: Chapman & Hall/CRC.
Kleinbaum, David G.; Klein, Mitchel (2012), Survival analysis: A Self-learning text (Third ed.), Springer.
Duran, B. S., & Odell, P. L. (2013). Cluster analysis: a survey (Vol. 100). Springer Science & Business Media.
Frades, I., & Matthiesen, R. (2010). Overview on techniques in cluster analysis. Bioinformatics Methods in Clinical Research, 81–107.
Gagniuc, Paul A. (2017). Markov Chains: From Theory to Implementation and Experimentation. USA, NJ: John Wiley & Sons. pp. 1–235.
Hyvärinen, Aapo (2013). "Independent component analysis: recent advances". Philosophical Transactions: Mathematical, Physical and Engineering Sciences. 371 (1984): 20110534.
Baddeley, A., Gregori, P., Mateu, J., Stoica, R., and Stoyan, D., editors (2006). Case Studies in Spatial Point Pattern Modelling, Lecture Notes in Statistics No. 185. Springer, New York.
Kohavi, Ron (1995). "A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection". International Joint Conference on Artificial Intelligence.
Rodriguez, J. D., Perez, A., & Lozano, J. A. (2009). Sensitivity analysis of k-fold cross validation in prediction error estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(3), 569–575.
Altman, D. G.; Bland, J. M. (June 1994). "Diagnostic tests. 1: Sensitivity and specificity". BMJ. 308 (6943): 1552.
Pontius, Robert Gilmore; Si, Kangping (2014). "The total operating characteristic to measure diagnostic ability for multiple thresholds". International Journal of Geographical Information Science. 28 (3): 570–583.
Kolmogorov, Andrey (1963). "On Tables of Random Numbers". Sankhyā Ser. A. 25: 369–375.
Kolmogorov, Andrey (1998). "On Tables of Random Numbers". Theoretical Computer Science. 207 (2): 387–395.
Taddy, Matt (2019). Business Data Science: Combining Machine Learning and Economics to Optimize, Automate, and Accelerate Business Decisions. New York: McGraw-Hill. p. 90.
Wit, Ernst; Edwin van den Heuvel; Jan-Willem Romeyn (2012). "'All models are wrong...': an introduction to model uncertainty". Statistica Neerlandica. 66 (3): 217–236.