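The third edition keeps the content framework of the previous edition, revises some of the text and R code, and adds some new material. The book has 11 chapters covering descriptive methods such as data visualization, inferential methods, and a number of statistical methods commonly used in practice. It is a statistics textbook in which every example is computed and analyzed in R; the solutions, figures, and tables for the examples are accompanied by detailed R code and results. The R version used in the book is 3.5.1.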
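The book can serve as a foundation course textbook for undergraduates majoring in statistics at colleges and universities, and as a statistics textbook for economics and management majors and for some majors in science, engineering, agriculture, forestry, and medicine. It is also a useful reference for practicing data analysts.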
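Jia Junping (賈俊平) is an associate professor at the School of Statistics, Renmin University of China, with many years of experience in statistics teaching and research. Published works include "Statistics", "Statistics: Based on R", "Statistics: Based on SPSS", "Statistics: Based on Excel", and "Data Visualization Analysis: Based on the R Language". Awards received include the second prize of the National Outstanding Teaching Achievement Award, the first prize of the Beijing Outstanding Teaching Achievement Award, the Baosteel Education Award, and the Beijing Economic and Technological Innovation Pacesetter title.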
Contents
Chapter 1 Data and R
1.1 Data and Statistics
1.1.1 What Is Statistics
1.1.2 Variables and Data
1.1.3 Sources of Data
1.2 Getting Started with R
1.2.1 Downloading and Installing R
1.2.2 Object Assignment and Execution
1.2.3 Viewing Help Files
1.2.4 Installing and Loading Packages
1.3 Creating R Data
1.3.1 Entering Data in R
1.3.2 Reading and Saving Data
1.3.3 Using and Editing Data
1.3.4 Converting Data Types
1.3.5 Generating Random Numbers
1.3.6 Sampling and Filtering Data
1.4 Writing R Functions
1.5 Graphics Control and Layout
1.5.1 The par Function
1.5.2 The layout Function
Exercises
Chapter 2 Data Visualization
2.1 Frequency Distributions of Data
2.1.1 Frequency Distributions of Categorical Data
2.1.2 Categorizing Numerical Data
2.2 Visualizing Categorical Data
2.2.1 Bar Charts and Their Variants
2.2.2 Pie Charts and Their Variants
2.3 Visualizing Numerical Data
2.3.1 Graphs Showing Data Distributions
2.3.2 Graphs Showing Relationships Between Variables
2.3.3 Graphs Comparing Similarity Across Multiple Samples
2.3.4 Time Series Plots
2.4 The Lorenz Curve
2.5 An Example of Plotting with ggplot2
2.6 Notes on Using Charts and Tables
Exercises
Chapter 3 Descriptive Statistics of Data
3.1 Statistics Describing Level
3.1.1 Mean
3.1.2 Quantiles
3.1.3 Mode
3.2 Statistics Describing Variation
3.2.1 Range and Interquartile Range
3.2.2 Variance and Standard Deviation
3.2.3 Coefficient of Variation
3.2.4 Standard Scores
3.3 Statistics Describing Distribution Shape
3.3.1 Coefficient of Skewness
3.3.2 Coefficient of Kurtosis
3.4 Comprehensive Description of Data
3.4.1 Several Commonly Used R Functions
3.4.2 An Example of Comprehensive Description
Exercises
Chapter 4 Probability Distributions of Random Variables
4.1 What Is Probability
4.2 Probability Distributions of Random Variables
4.2.1 Random Variables and Their Summary Measures
4.2.2 Probability Distributions of Random Variables
4.2.3 Several Other Important Statistical Distributions
4.3 Probability Distributions of Sample Statistics
4.3.1 Statistics and Their Distributions
4.3.2 Distribution of the Sample Mean
4.3.3 Distributions of Other Statistics
4.3.4 Standard Error of a Statistic
Exercises
Chapter 5 Parameter Estimation
5.1 Principles of Parameter Estimation
5.1.1 Point Estimation and Interval Estimation
5.1.2 Criteria for Evaluating Estimators
5.2 Interval Estimation of Population Means
5.2.1 Estimating One Population Mean
5.2.2 Estimating the Difference Between Two Population Means
5.3 Interval Estimation of Population Proportions
5.3.1 Estimating One Population Proportion
5.3.2 Estimating the Difference Between Two Population Proportions
5.4 Interval Estimation of Population Variances
5.4.1 Estimating One Population Variance
5.4.2 Estimating the Ratio of Two Population Variances
Exercises
Chapter 6 Hypothesis Testing
6.1 Principles of Hypothesis Testing
6.1.1 Stating the Hypotheses
6.1.2 Making a Decision
6.1.3 Reporting the Results
6.1.4 Effect Size
6.2 Tests of Population Means
6.2.1 Testing One Population Mean
6.2.2 Testing the Difference Between Two Population Means
6.3 Tests of Population Proportions
6.3.1 Testing One Population Proportion
6.3.2 Testing the Difference Between Two Population Proportions
6.4 Tests of Population Variances
6.4.1 Testing One Population Variance
6.4.2 Testing the Ratio of Two Population Variances
6.5 Nonparametric Tests
6.5.1 Tests of the Population Distribution
6.5.2 Tests of Population Location Parameters
Exercises
Chapter 7 Analysis of Categorical Variables
7.1 Goodness-of-Fit Test for One Categorical Variable
7.1.1 Equal Expected Frequencies
7.1.2 Unequal Expected Frequencies
7.2 Test of Independence for Two Categorical Variables
7.2.1 Contingency Tables and the χ² Test of Independence
7.2.2 Notes on Applying the χ² Test
7.3 Measures of Association for Two Categorical Variables
7.3.1 The ϕ Coefficient and Cramer's V Coefficient
7.3.2 Contingency Coefficient
Exercises
Chapter 8 Analysis of Variance
8.1 Principles of Analysis of Variance
8.1.1 What Is Analysis of Variance
8.1.2 Error Decomposition
8.2 One-Way Analysis of Variance
8.2.1 Mathematical Model
8.2.2 Testing Effects
8.2.3 Effect Size Analysis
8.2.4 Multiple Comparisons
8.3 Two-Way Analysis of Variance
8.3.1 Mathematical Model
8.3.2 Main Effects Analysis
8.3.3 Interaction Effects Analysis
8.4 Assumptions of Analysis of Variance and Their Tests
8.4.1 Normality Tests
8.4.2 Homogeneity of Variance Tests
8.5 Nonparametric Methods for One-Way Analysis of Variance
Exercises
Chapter 9 Simple Linear Regression
9.1 Determining Relationships Between Variables
9.1.1 Relationships Between Variables
9.1.2 Describing Correlation
9.1.3 Measuring the Strength of a Relationship
9.2 Model Estimation and Testing
9.2.1 Regression Model and Regression Equation
9.2.2 Least Squares Estimation of Parameters
9.2.3 Goodness of Fit of the Model
9.2.4 Significance Tests of the Model
9.3 Prediction Using the Regression Equation
9.3.1 Confidence Interval for the Mean
9.3.2 Prediction Interval for an Individual Value
9.4 Regression Model Diagnostics
9.4.1 Residuals and Residual Plots
9.4.2 Checking Model Assumptions
Exercises
Chapter 10 Multiple Linear Regression
10.1 The Multiple Linear Regression Model and Its Parameter Estimation
10.1.1 Regression Model and Regression Equation
10.1.2 Least Squares Estimation of Parameters
10.2 Goodness of Fit and Significance Tests
10.2.1 Goodness of Fit of the Model
10.2.2 Significance Tests of the Model
10.2.3 Model Diagnostics
10.3 Multicollinearity and Its Treatment
10.3.1 Multicollinearity and Its Identification
10.3.2 Variable Selection and Stepwise Regression
10.4 Relative Importance and Model Comparison
10.4.1 Relative Importance of Independent Variables
10.4.2 Model Comparison
10.5 Prediction Using the Regression Equation
10.6 Dummy Variable Regression
10.6.1 Introducing Dummy Variables into the Model
10.6.2 Regression with One Dummy Variable
Exercises
Chapter 11 Time Series Forecasting
11.1 Components of Time Series and Forecasting Methods
11.1.1 Components of a Time Series
11.1.2 Choosing and Evaluating Forecasting Methods
11.2 Exponential Smoothing Forecasting
11.2.1 General Form of Exponential Smoothing Models
11.2.2 Simple Exponential Smoothing Forecasting
11.2.3 Holt Exponential Smoothing Forecasting
11.2.4 Winters Exponential Smoothing Forecasting
11.3 Trend Extrapolation Forecasting
11.3.1 Linear Trend Forecasting
11.3.2 Nonlinear Trend Forecasting
11.4 Decomposition Forecasting
11.5 Time Series Smoothing
Exercises
Appendix 1 The Bootstrap Method for Confidence Intervals
Appendix 2 R Functions Used in This Book
Bibliography