Measuring Agreement – Models, Methods, and Applications
Summary
Presents statistical methodologies for analyzing common types of data from method comparison experiments and illustrates their applications through detailed case studies
Measuring Agreement: Models, Methods, and Applications covers the statistical evaluation of agreement between two or more methods of measuring a variable, with a primary focus on continuous data. The authors view the analysis of method comparison data as a two-step procedure in which an adequate model for the data is first found and inferential techniques are then applied to appropriate functions of the model parameters. The presentation is accessible to a wide audience and provides the necessary technical details and references. In addition, the authors present chapter-length explorations of data from paired measurements designs, repeated measurements designs, and multiple methods; data with covariates; and heteroscedastic, longitudinal, and categorical data. The book also:
Strikes a balance between theory and applications
Presents parametric as well as nonparametric methodologies
Provides a concise introduction to Cohen's kappa coefficient and other measures of agreement for binary and categorical data
Discusses sample size determination for trials on measuring agreement
Contains real-world case studies and exercises throughout
Provides a supplemental website containing the related datasets and R code
Measuring Agreement: Models, Methods, and Applications is a resource for statisticians and biostatisticians engaged in data analysis, consultancy, and methodological research. It is a reference for clinical chemists, ecologists, and biomedical and other scientists who deal with development and validation of measurement methods. This book can also serve as a graduate-level text for students in statistics and biostatistics.
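As a rough illustration of the two-step view described above (choose a model, then make inferences on a function of its parameters), the sketch below treats paired differences as approximately normal and computes the familiar 95% limits of agreement, mean difference plus or minus 1.96 standard deviations. It is not taken from the book or its supplemental R code: the function name `limits_of_agreement`, the multiplier default, and the four data pairs are assumptions made up for this example.

```python
import math
import statistics


def limits_of_agreement(x, y, z=1.96):
    """Return (mean difference, lower limit, upper limit) for paired data.

    Step 1 (model): assume the paired differences x - y are roughly normal.
    Step 2 (inference target): mean difference +/- z * SD of the differences.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(x, y)]
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    return mean_d, mean_d - z * sd_d, mean_d + z * sd_d


# Hypothetical measurements of the same four subjects by two methods.
method_a = [10.0, 12.0, 14.0, 16.0]
method_b = [9.0, 13.0, 13.0, 17.0]

mean_d, lower, upper = limits_of_agreement(method_a, method_b)
print(f"mean difference = {mean_d:.3f}, limits = ({lower:.3f}, {upper:.3f})")
```

Whether limits this wide count as acceptable agreement depends on the application; the calculation only summarizes the spread of the differences.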
Table of Contents
<p>1 Introduction</p>
<p>1.1 Preview</p>
<p>1.2 Notational Conventions</p>
<p>1.3 Basic Characteristics of a Measurement Method</p>
<p>1.3.1 A Statistical Model for Measurements</p>
<p>1.3.2 Quality Characteristics</p>
<p>1.4 Method Comparison Studies</p>
<p>1.5 Meaning of Agreement</p>
<p>1.6 A Measurement Error Model</p>
<p>1.6.1 Identifiability Issues</p>
<p>1.6.2 Model-Based Moments</p>
<p>1.6.3 Conditions for Perfect Agreement</p>
<p>1.6.4 Link to Test Theory</p>
<p>1.7 Similarity versus Agreement</p>
<p>1.7.1 Evaluation of Similarity</p>
<p>1.7.2 Evaluation of Agreement</p>
<p>1.8 A Toy Example</p>
<p>1.9 Controversies and Our View</p>
<p>1.10 Concepts Related to Agreement</p>
<p>1.11 Role of Confidence Intervals and Hypotheses Testing</p>
<p>1.11.1 Formulating the Agreement Hypotheses</p>
<p>1.11.2 Testing Hypotheses Using Confidence Bounds</p>
<p>1.11.3 Evaluation of Agreement Using Confidence Bounds</p>
<p>1.11.4 Evaluation of Similarity Using Confidence Intervals</p>
<p>1.12 Common Models for Paired Measurements Data</p>
<p>1.12.1 A Measurement Error Model</p>
<p>1.12.2 A Mixed-Effects Model</p>
<p>1.12.3 A Bivariate Normal Model</p>
<p>1.12.4 Limitations of the Paired Measurements Design</p>
<p>1.13 The Bland-Altman Plot</p>
<p>1.13.1 The Ideal Plot</p>
<p>1.13.2 A Linear Trend in the Bland-Altman Plot</p>
<p>1.13.3 Heteroscedasticity in the Bland-Altman Plot</p>
<p>1.13.4 Variations of the Bland-Altman Plot</p>
<p>1.14 Common Regression Approaches</p>
<p>1.14.1 Ordinary Linear Regression</p>
<p>1.14.2 Deming Regression</p>
<p>1.15 Inappropriate Use of Common Tests in Method Comparison Studies</p>
<p>1.15.1 Test of Zero Correlation</p>
<p>1.15.2 Paired t-test</p>
<p>1.15.3 Pitman-Morgan and Bradley-Blackwood Tests</p>
<p>1.15.4 Test of Zero Intercept and Unit Slope</p>
<p>1.16 Key Steps in the Analysis of Method Comparison Data</p>
<p>1.17 Chapter Summary</p>
<p>1.18 Bibliographic Note</p>
<p>Exercises</p>
<p>2 Common Approaches for Measuring Agreement</p>
<p>2.1 Preview</p>
<p>2.2 Introduction</p>
<p>2.3 Mean Squared Deviation</p>
<p>2.4 Concordance Correlation Coefficient</p>
<p>2.5 A Digression: Tolerance and Prediction Intervals</p>
<p>2.5.1 Definitions</p>
<p>2.5.2 Normally Distributed Data</p>
<p>2.6 Lin's Probability Criterion and Bland-Altman Criterion</p>
<p>2.7 Limits of Agreement</p>
<p>2.7.1 The Approach</p>
<p>2.7.2 Why Ignore the Variability?</p>
<p>2.7.3 Limits of Agreement versus Prediction and Tolerance Intervals</p>
<p>2.8 Total Deviation Index and Coverage Probability</p>
<p>2.8.1 The Approaches</p>
<p>2.8.2 Normally Distributed Differences</p>
<p>2.9 Inference on Agreement Measures</p>
<p>2.10 Chapter Summary</p>
<p>2.11 Bibliographic Note</p>
<p>Exercises</p>
<p>3 A General Approach for Modeling and Inference</p>
<p>3.1 Preview</p>
<p>3.2 Mixed-Effects Models</p>
<p>3.2.1 The Model</p>
<p>3.2.2 Prediction</p>
<p>3.2.3 Model Fitting</p>
<p>3.2.4 Model Diagnostics</p>
<p>3.3 A Large-Sample Approach to Inference</p>
<p>3.3.1 Approximate Distributions</p>
<p>3.3.2 Confidence Intervals</p>
<p>3.3.3 Parameter Transformation</p>
<p>3.3.4 Bootstrap Confidence Intervals</p>
<p>3.3.5 Confidence Bands</p>
<p>3.3.6 Test of Homogeneity</p>
<p>3.3.7 Model Comparison</p>
<p>3.4 Modeling and Analysis of Method Comparison Data</p>
<p>3.5 Chapter Summary</p>
<p>3.6 Bibliographic Note</p>
<p>Exercises</p>
<p>4 Paired Measurements Data</p>
<p>4.1 Preview</p>
<p>4.2 Modeling of Data</p>
<p>4.2.1 Mixed-Effects Model</p>
<p>4.2.2 Bivariate Normal Model</p>
<p>4.3 Evaluation of Similarity and Agreement</p>
<p>4.4 Case Studies</p>
<p>4.4.1 Oxygen Saturation Data</p>
<p>4.4.2 Plasma Volume Data</p>
<p>4.4.3 Vitamin D Data</p>
<p>4.5 Chapter Summary</p>
<p>4.6 Technical Details</p>
<p>4.6.1 Mixed-Effects Model</p>
<p>4.6.2 Bivariate Normal Model</p>
<p>4.7 Bibliographic Note</p>
<p>Exercises</p>
<p>5 Repeated Measurements Data</p>
<p>5.1 Preview</p>
<p>5.2 Introduction</p>
<p>5.2.1 Types of Data</p>
<p>5.2.2 Individual versus Average Measurement</p>
<p>5.2.3 Example Datasets</p>
<p>5.3 Displaying Data</p>
<p>5.3.1 Basic Plots</p>
<p>5.3.2 Interaction Plots</p>
<p>5.4 Modeling of Data</p>
<p>5.4.1 Unlinked Data</p>
<p>5.4.2 Linked Data</p>
<p>5.4.3 Model Fitting and Evaluation</p>
<p>5.5 Evaluation of Similarity and Agreement</p>
<p>5.6 Evaluation of Repeatability</p>
<p>5.6.1 Unlinked Data</p>
<p>5.6.2 Linked Data</p>
<p>5.7 Case Studies</p>
<p>5.7.1 Kiwi Data</p>
<p>5.7.2 Oximetry Data</p>
<p>5.8 Chapter Summary</p>
<p>5.9 Technical Details</p>
<p>5.9.1 Unlinked Data</p>
<p>5.9.2 Linked Data</p>
<p>5.10 Bibliographic Note</p>
<p>Exercises</p>
<p>6 Heteroscedastic Data</p>
<p>6.1 Preview</p>
<p>6.2 Introduction</p>
<p>6.2.1 Diagnosing Heteroscedasticity</p>
<p>6.2.2 Example Datasets</p>
<p>6.3 Variance Function Models</p>
<p>6.4 Repeated Measurements Data</p>
<p>6.4.1 A Heteroscedastic Mixed-Effects Model</p>
<p>6.4.2 Specifying the Variance Function</p>
<p>6.4.3 Model Fitting and Evaluation</p>
<p>6.4.4 Testing for Homoscedasticity</p>
<p>6.4.5 Evaluation of Similarity, Agreement, and Repeatability</p>
<p>6.4.6 Case Study: Cholesterol Data</p>
<p>6.5 Paired Measurements Data</p>
<p>6.5.1 A Heteroscedastic Bivariate Normal Model</p>
<p>6.5.2 Specifying the Variance Function</p>
<p>6.5.3 Model Fitting and Evaluation</p>
<p>6.5.4 Testing for Homoscedasticity</p>
<p>6.5.5 Evaluation of Similarity and Agreement</p>
<p>6.5.6 Case Study: Cyclosporin Data</p>
<p>6.6 Chapter Summary</p>
<p>6.7 Technical Details</p>
<p>6.7.1 Repeated Measurements Data</p>
<p>6.7.2 Paired Measurements Data</p>
<p>6.8 Bibliographic Note</p>
<p>Exercises</p>
<p>7 Data from Multiple Methods</p>
<p>7.1 Preview</p>
<p>7.2 Introduction</p>
<p>7.3 Displaying Data</p>
<p>7.4 Example Datasets</p>
<p>7.4.1 Systolic Blood Pressure Data</p>
<p>7.4.2 Tumor Size Data</p>
<p>7.5 Modeling Unreplicated Data</p>
<p>7.6 Modeling Repeated Measurements Data</p>
<p>7.6.1 Unlinked Data</p>
<p>7.6.2 Linked Data</p>
<p>7.7 Model Fitting and Evaluation</p>
<p>7.8 Evaluation of Similarity and Agreement</p>
<p>7.9 Evaluation of Repeatability</p>
<p>7.10 Case Studies</p>
<p>7.10.1 Systolic Blood Pressure Data</p>
<p>7.10.2 Tumor Size Data</p>
<p>7.11 Chapter Summary</p>
<p>7.12 Technical Details</p>
<p>7.13 Bibliographic Note</p>
<p>Exercises</p>
<p>8 Data with Covariates</p>
<p>8.1 Preview</p>
<p>8.2 Introduction</p>
<p>8.3 Modeling of Data</p>
<p>8.3.1 Modeling Means of Methods</p>
<p>8.3.2 Modeling Variances of Methods</p>
<p>8.3.3 Data Models</p>
<p>8.3.4 Model Fitting and Evaluation</p>
<p>8.4 Evaluation of Similarity, Agreement, and Repeatability</p>
<p>8.4.1 Measures of Agreement for Two Methods</p>
<p>8.4.2 Measures of Agreement for More Than Two Methods</p>
<p>8.4.3 Measures of Repeatability</p>
<p>8.4.4 Inference on Measures</p>
<p>8.5 Case Study</p>
<p>8.6 Chapter Summary</p>
<p>8.7 Technical Details</p>
<p>8.8 Bibliographic Note</p>
<p>Exercises</p>
<p>9 Longitudinal Data</p>
<p>9.1 Preview</p>
<p>9.2 Introduction</p>
<p>9.2.1 Displaying Data</p>
<p>9.2.2 Percentage Body Fat Data</p>
<p>9.3 Modeling of Data</p>
<p>9.3.1 The Longitudinal Data Model</p>
<p>9.3.2 Specifying the Mean Functions</p>
<p>9.3.3 Specifying the Correlation Function</p>
<p>9.3.4 Model Fitting and Evaluation</p>
<p>9.4 Evaluation of Similarity and Agreement</p>
<p>9.5 Case Study</p>
<p>9.6 Chapter Summary</p>
<p>9.7 Technical Details</p>
<p>9.8 Bibliographic Note</p>
<p>Exercises</p>
<p>10 A Nonparametric Approach</p>
<p>10.1 Preview</p>
<p>10.2 Introduction</p>
<p>10.3 The Statistical Functional Approach</p>
<p>10.3.1 A Weighted Empirical CDF</p>
<p>10.3.2 Distributions Induced by Empirical CDF</p>
<p>10.4 Evaluation of Similarity and Agreement</p>
<p>10.5 Case Studies</p>
<p>10.5.1 Unreplicated Blood Pressure Data</p>
<p>10.5.2 Replicated Blood Pressure Data</p>
<p>10.6 Chapter Summary</p>
<p>10.7 Technical Details</p>
<p>10.7.1 The Matrix</p>
<p>10.7.2 Estimation of</p>
<p>10.7.3 Influence Functions for the Measures</p>
<p>10.7.4 TDI Confidence Bounds</p>
<p>10.7.5 Summary of Steps</p>
<p>10.8 Bibliographic Note</p>
<p>Exercises</p>
<p>11 Sample Size Determination</p>
<p>11.1 Preview</p>
<p>11.2 Introduction</p>
<p>11.3 The Sample Size Methodology</p>
<p>11.3.1 Paired Measurements Design</p>
<p>11.3.2 Repeated Measurements Design</p>
<p>11.4 Case Study</p>
<p>11.5 Chapter Summary</p>
<p>11.6 Bibliographic Note</p>
<p>Exercises</p>
<p>12 Categorical Data</p>
<p>12.1 Preview</p>
<p>12.2 Introduction</p>
<p>12.3 Experimental Setups and Examples</p>
<p>12.3.1 Types of Data</p>
<p>12.3.2 Illustrative Examples</p>
<p>12.3.3 A Graphical Approach</p>
<p>12.4 Cohen's Kappa Coefficient for Dichotomous Data</p>
<p>12.4.1 Definition and Basic Properties: Two Raters</p>
<p>12.4.2 Sample Kappa Coefficient</p>
<p>12.4.3 Agreement with a Gold Standard</p>
<p>12.4.4 Unbiased Raters: Intraclass Kappa</p>
<p>12.4.5 Multiple Raters</p>
<p>12.4.6 Combining and Comparing Kappa Coefficients</p>
<p>12.4.7 Sample Size Calculations</p>
<p>12.5 Kappa Type Measures for More Than Two Categories</p>
<p>12.5.1 Two Fixed Raters with Nominal Categories</p>
<p>12.5.2 Two Raters with Ordinal Categories: Weighted Kappa</p>
<p>12.5.3 Multiple Raters</p>
<p>12.6 Case Studies</p>
<p>12.6.1 Two Raters with Two Categories</p>
<p>12.6.2 Weighted Kappa: Multiple Categories</p>
<p>12.7 Models for Exploring Agreement</p>
<p>12.7.1 Conditional Logistic Regression Models</p>
<p>12.7.2 Log-Linear Models</p>
<p>12.7.3 A Generalized Linear Mixed-Effects Model</p>
<p>12.8 Discussion</p>
<p>12.9 Chapter Summary</p>
<p>12.10 Bibliographic Note</p>
<p>Exercises</p>
<p>References</p>
<p>Dataset List</p>
<p>Index</p>