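This book is an authoritative text in the field of digital video processing, offering an in-depth treatment of digital image and video processing techniques. The second edition updates nearly 80% of the material, and the book has been reorganized into eight new chapters covering image filtering, motion estimation, video segmentation and tracking, video filtering, image compression, and video compression, among other topics. It reflects in particular the important recent advances in signal processing and computer vision, as well as the latest applications in 3D video, ultra-high-resolution video, and digital cinema.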
Contents

1 Multi-Dimensional Signals and Systems
  1.1 Multi-Dimensional Signals
    1.1.1 Finite-Extent Signals and Periodic Signals
    1.1.2 Symmetric Signals
    1.1.3 Special Multi-Dimensional Signals
  1.2 Multi-Dimensional Transforms
    1.2.1 Fourier Transform of Continuous Signals
    1.2.2 Fourier Transform of Discrete Signals
    1.2.3 Discrete Fourier Transform (DFT)
    1.2.4 Discrete Cosine Transform (DCT)
  1.3 Multi-Dimensional Systems
    1.3.1 Impulse Response and 2D Convolution
    1.3.2 Frequency Response
    1.3.3 FIR Filters and Symmetry
    1.3.4 IIR Filters and Partial Difference Equations
  1.4 Multi-Dimensional Sampling Theory
    1.4.1 Sampling on a Lattice
    1.4.2 Spectrum of Signals Sampled on a Lattice
    1.4.3 Nyquist Criterion for Sampling on a Lattice
    1.4.4 Reconstruction from Samples on a Lattice
  1.5 Sampling Structure Conversion
  References
  Exercises
    Problem Set 1
    MATLAB Exercises

2 Digital Images and Video
  2.1 Human Visual System and Color
    2.1.1 Color Vision and Models
    2.1.2 Contrast Sensitivity
    2.1.3 Spatio-Temporal Frequency Response
    2.1.4 Stereo/Depth Perception
  2.2 Digital Video
    2.2.1 Spatial Resolution and Frame Rate
    2.2.2 Color, Dynamic Range, and Bit-Depth
    2.2.3 Color Image Processing
    2.2.4 Digital-Video Standards
  2.3 3D Video
    2.3.1 3D-Display Technologies
    2.3.2 Stereoscopic Video
    2.3.3 Multi-View Video
  2.4 Digital-Video Applications
    2.4.1 Digital TV
    2.4.2 Digital Cinema
    2.4.3 Video Streaming over the Internet
    2.4.4 Computer Vision and Scene/Activity Understanding
  2.5 Image and Video Quality
    2.5.1 Visual Artifacts
    2.5.2 Subjective Quality Assessment
    2.5.3 Objective Quality Assessment
  References

3 Image Filtering
  3.1 Image Smoothing
    3.1.1 Linear Shift-Invariant Low-Pass Filtering
    3.1.2 Bi-Lateral Filtering
  3.2 Image Re-Sampling and Multi-Resolution Representations
    3.2.1 Image Decimation
    3.2.2 Interpolation
    3.2.3 Multi-Resolution Pyramid Representations
    3.2.4 Wavelet Representations
  3.3 Image-Gradient Estimation, Edge and Feature Detection
    3.3.1 Estimation of the Image Gradient
    3.3.2 Estimation of the Laplacian
    3.3.3 Canny Edge Detection
    3.3.4 Harris Corner Detection
  3.4 Image Enhancement
    3.4.1 Pixel-Based Contrast Enhancement
    3.4.2 Spatial Filtering for Tone Mapping and Image Sharpening
  3.5 Image Denoising
    3.5.1 Image and Noise Models
    3.5.2 Linear Space-Invariant Filters in the DFT Domain
    3.5.3 Local Adaptive Filtering
    3.5.4 Nonlinear Filtering: Order-Statistics, Wavelet Shrinkage, and Bi-Lateral Filtering
    3.5.5 Non-Local Filtering: NL-Means and BM3D
  3.6 Image Restoration
    3.6.1 Blur Models
    3.6.2 Restoration of Images Degraded by Linear Space-Invariant Blurs
    3.6.3 Blind Restoration – Blur Identification
    3.6.4 Restoration of Images Degraded by Space-Varying Blurs
    3.6.5 Image In-Painting
  References
  Exercises
    Problem Set 3
    MATLAB Exercises
  MATLAB Resources

4 Motion Estimation
  4.1 Image Formation
    4.1.1 Camera Models
    4.1.2 Photometric Effects of 3D Motion
  4.2 Motion Models
    4.2.1 Projected Motion vs. Apparent Motion
    4.2.2 Projected 3D Rigid-Motion Models
    4.2.3 2D Apparent-Motion Models
  4.3 2D Apparent-Motion Estimation
    4.3.1 Sparse Correspondence, Optical-Flow Estimation, and Image-Registration Problems
    4.3.2 Optical-Flow Equation and Normal Flow
    4.3.3 Displaced-Frame Difference
    4.3.4 Motion Estimation is Ill-Posed: Occlusion and Aperture Problems
    4.3.5 Hierarchical Motion Estimation
    4.3.6 Performance Measures for Motion Estimation
  4.4 Differential Methods
    4.4.1 Lucas–Kanade Method
    4.4.2 Horn–Schunck Motion Estimation
  4.5 Matching Methods
    4.5.1 Basic Block-Matching
    4.5.2 Variable-Size Block-Matching
    4.5.3 Hierarchical Block-Matching
    4.5.4 Generalized Block-Matching – Local Deformable Motion
    4.5.5 Homography Estimation from Feature Correspondences
  4.6 Nonlinear Optimization Methods
    4.6.1 Pel-Recursive Motion Estimation
    4.6.2 Bayesian Motion Estimation
  4.7 Transform-Domain Methods
    4.7.1 Phase-Correlation Method
    4.7.2 Space-Frequency Spectral Methods
  4.8 3D Motion and Structure Estimation
    4.8.1 Camera Calibration
    4.8.2 Affine Reconstruction
    4.8.3 Projective Reconstruction
    4.8.4 Euclidean Reconstruction
    4.8.5 Planar-Parallax and Relative Affine Structure Reconstruction
    4.8.6 Dense Structure from Stereo
  References
  Exercises
    Problem Set 4
    MATLAB Exercises
  MATLAB Resources

5 Video Segmentation and Tracking
  5.1 Image Segmentation
    5.1.1 Thresholding
    5.1.2 Clustering
    5.1.3 Bayesian Methods
    5.1.4 Graph-Based Methods
    5.1.5 Active-Contour Models
  5.2 Change Detection
    5.2.1 Shot-Boundary Detection
    5.2.2 Background Subtraction
  5.3 Motion Segmentation
    5.3.1 Dominant-Motion Segmentation
    5.3.2 Multiple-Motion Segmentation
    5.3.3 Region-Based Motion Segmentation: Fusion of Color and Motion
    5.3.4 Simultaneous Motion Estimation and Segmentation
  5.4 Motion Tracking
    5.4.1 Graph-Based Spatio-Temporal Segmentation and Tracking
    5.4.2 Kanade–Lucas–Tomasi Tracking
    5.4.3 Mean-Shift Tracking
    5.4.4 Particle-Filter Tracking
    5.4.5 Active-Contour Tracking
    5.4.6 2D-Mesh Tracking
  5.5 Image and Video Matting
  5.6 Performance Evaluation
  References
  MATLAB Exercises
  Internet Resources

6 Video Filtering
  6.1 Theory of Spatio-Temporal Filtering
    6.1.1 Frequency Spectrum of Video
    6.1.2 Motion-Adaptive Filtering
    6.1.3 Motion-Compensated Filtering
  6.2 Video-Format Conversion
    6.2.1 Down-Conversion
    6.2.2 De-Interlacing
    6.2.3 Frame-Rate Conversion
  6.3 Multi-Frame Noise Filtering
    6.3.1 Motion-Adaptive Noise Filtering
    6.3.2 Motion-Compensated Noise Filtering
  6.4 Multi-Frame Restoration
    6.4.1 Multi-Frame Modeling
    6.4.2 Multi-Frame Wiener Restoration
  6.5 Multi-Frame Super-Resolution
    6.5.1 What Is Super-Resolution?
    6.5.2 Modeling Low-Resolution Sampling
    6.5.3 Super-Resolution in the Frequency Domain
    6.5.4 Multi-Frame Spatial-Domain Methods
  References
  Exercises
    Problem Set 6
    MATLAB Exercises

7 Image Compression
  7.1 Basics of Image Compression
    7.1.1 Information Theoretic Concepts
    7.1.2 Elements of Image-Compression Systems
    7.1.3 Quantization
    7.1.4 Symbol Coding
    7.1.5 Huffman Coding
    7.1.6 Arithmetic Coding
  7.2 Discrete-Cosine Transform Coding and JPEG
    7.2.1 Discrete-Cosine Transform
    7.2.2 ISO JPEG Standard
    7.2.3 Encoder Control and Compression Artifacts
  7.3 Wavelet-Transform Coding and JPEG 2000
    7.3.1 Wavelet Transform and Choice of Filters
    7.3.2 ISO JPEG 2000 Standard
  References
  Exercises
  Internet Resources

8 Video Compression
  8.1 Video-Compression Approaches
    8.1.1 Intra-Frame Compression, Motion JPEG 2000, and Digital Cinema
    8.1.2 3D-Transform Coding
    8.1.3 Motion-Compensated Transform Coding
  8.2 Early Video-Compression Standards
    8.2.1 ISO and ITU Standards
    8.2.2 MPEG-1 Standard
    8.2.3 MPEG-2 Standard
  8.3 MPEG-4 AVC/ITU-T H.264 Standard
    8.3.1 Input-Video Formats and Data Structure
    8.3.2 Intra-Prediction
    8.3.3 Motion Compensation
    8.3.4 Transform
    8.3.5 Other Tools and Improvements
  8.4 High-Efficiency Video-Coding (HEVC) Standard
    8.4.1 Video-Input Format and Data Structure
    8.4.2 Coding-Tree Units
    8.4.3 Tools for Parallel Encoding/Decoding
    8.4.4 Other Tools and Improvements
  8.5 Scalable-Video Compression
    8.5.1 Temporal Scalability
    8.5.2 Spatial Scalability
    8.5.3 Quality (SNR) Scalability
    8.5.4 Hybrid Scalability
  8.6 Stereo and Multi-View Video Compression
    8.6.1 Frame-Compatible Stereo-Video Compression
    8.6.2 Stereo and Multi-View Video-Coding Extensions of the H.264/AVC Standard
    8.6.3 Multi-View Video Plus Depth Compression
  References
  Exercises
  Internet Resources

A Ill-Posed Problems in Image and Video Processing
  A.1 Image Representations
    A.1.1 Deterministic Framework – Function/Vector Spaces
    A.1.2 Bayesian Framework – Random Fields
  A.2 Overview of Image Models
  A.3 Basics of Sparse-Image Modeling
  A.4 Well-Posed Formulations of Ill-Posed Problems
    A.4.1 Constrained-Optimization Problem
    A.4.2 Bayesian-Estimation Problem
  References

B Markov and Gibbs Random Fields
  B.1 Equivalence of Markov Random Fields and Gibbs Random Fields
    B.1.1 Markov Random Fields
    B.1.2 Gibbs Random Fields
    B.1.3 Equivalence of MRF and GRF
  B.2 Gibbs Distribution as an a priori PDF Model
  B.3 Computation of Local Conditional Probabilities from a Gibbs Distribution
  References

C Optimization Methods
  C.1 Gradient-Based Optimization
    C.1.1 Steepest-Descent Method
    C.1.2 Newton–Raphson Method
  C.2 Simulated Annealing
    C.2.1 Metropolis Algorithm
    C.2.2 Gibbs Sampler
  C.3 Greedy Methods
    C.3.1 Iterated Conditional Modes
    C.3.2 Mean-Field Annealing
    C.3.3 Highest Confidence First
  References

D Model Fitting
  D.1 Least-Squares Fitting
  D.2 Least-Squares Solution of Homogeneous Linear Equations
    D.2.1 Alternate Derivation
  D.3 Total Least-Squares Fitting
  D.4 Random-Sample Consensus (RANSAC)
  References

Glossary