
LITERATURE REVIEW

MANUAL EXAMINATION OF THE SPINE: A SYSTEMATIC CRITICAL LITERATURE REVIEW OF REPRODUCIBILITY

Mette Jensen Stochkendahl, DC,a Henrik Wulff Christensen, DC, MD, PhD,b Jan Hartvigsen, DC, PhD,c Werner Vach, PhD,d Mitchell Haas, DC, MA,e Lise Hestbaek, DC, PhD,f Alan Adams, DC, MS, MSEd,g and Gert Bronfort, DC, PhDh

ABSTRACT

Objective: Poor reproducibility of spinal palpation has been reported in previously published literature, and authors of recent reviews have criticized study quality. This article critically analyzes the literature pertaining to the inter- and intraobserver reproducibility of spinal palpation to investigate the consistency of study results and assess the level of evidence for reproducibility.

Methods: Systematic review and meta-analysis were performed on relevant literature published from 1965 to 2005, identified using the electronic databases MEDLINE, MANTIS, and CINAHL and checking of reference lists. Descriptive data from included articles were extracted independently by 2 reviewers. A 6-point scale was constructed to assess the methodological quality of original studies. A meta-analysis was conducted among the high-quality studies to investigate the consistency of data, separately on motion palpation, static palpation, osseous pain, soft tissue pain, soft tissue changes, and global assessment. A standardized method was used to determine the level of evidence.

Results: The quality score of 48 included studies ranged from 0% to 100%. There was strong evidence that the interobserver reproducibility of osseous and soft tissue pain is clinically acceptable (κ ≥ 0.4) and that intraobserver reproducibility of soft tissue pain and global assessment are clinically acceptable. Other spinal procedures are either not reproducible or the evidence is conflicting or preliminary. (J Manipulative Physiol Ther 2006;29:475-485)

Key Indexing Terms: Reproducibility of Results; Palpation; Literature Review; Diagnostic Tests; Spine; Meta-Analysis

a Research Fellow, Nordic Institute of Chiropractic and Clinical Biomechanics, Part of Clinical Locomotion Science, Odense, Denmark.

b Senior Researcher, Nordic Institute of Chiropractic and Clinical Biomechanics, Part of Clinical Locomotion Science, Odense, Denmark.

c Senior Researcher, Nordic Institute of Chiropractic and Clinical Biomechanics, Part of Clinical Locomotion Science, Odense, Denmark; and Associate Professor, Institute of Sports Science and Clinical Biomechanics, Part of Clinical Locomotion Science, University of Southern Denmark, Denmark.

d Professor, The Department of Statistics, University of Southern Denmark, Denmark.

e Professor, Center for Outcomes Studies, Western States Chiropractic College, Portland, Ore.

f Senior Researcher, The Back Research Center, Backcenter Funen; and Part of Clinical Locomotion Science, University of Southern Denmark, Denmark.

g Professor, Texas Chiropractic College, Pasadena, Tex.

h Professor, Department of Research, Wolfe-Harris Center for Clinical Studies, Northwestern Health Sciences University, Bloomington, Minn.

This study was funded by the Nordic Institute of Chiropractic and Clinical Biomechanics, Odense, Denmark and the Foundation for Chiropractic Education and Research, grant no. 03-09-01.

Submit requests for reprints to: Mette Jensen Stochkendahl, DC, Nordic Institute of Chiropractic and Clinical Biomechanics, Research Department, Klosterbakken 20, DK-5000 Odense C, Denmark (e-mail: [email protected] ).

Paper submitted September 15, 2005; in revised form February 2, 2006.

0161-4754/$32.00 Copyright © 2006 by National University of Health Sciences. doi:10.1016/j.jmpt.2006.06.011

Biomechanical dysfunction is thought to be an important contributor to spinal pain, and manual palpation is a widely used procedure for the diagnosis of such dysfunctions among providers of manual medicine.1-3 Contrary to the expectations of many clinicians, unacceptable levels of reproducibility have been shown in the majority of the previously published literature, and authors of newer reviews have questioned the utility of manual examination procedures in spinal diagnosis altogether.4-7 Severe criticism has been directed at the design of the original studies, including the use of asymptomatic subjects,4,5 inexperienced observers,5 parallel testing,4 unclear definitions of positive findings and rating scales,4,6 weak description of study results,4,5,7 and the need for improvement in overall study quality.4,7 Furthermore, the dependence of Cohen's κ (the statistical method most widely used in studies of reproducibility) on the prevalence of positive findings and on the composition of the study population has been the subject of discussion.8,9

Unfortunately, these reviews themselves have important limitations. For instance, some deal with only a subset of manual examination procedures, such as chiropractic procedures only,4 1 spinal region,4,6,10 or motion palpation only.5 In only 3 reviews was a predefined quality system applied to assess study quality,4,6,7 and in none of the reviews were the number of studies, the methodological quality, and the consistency of the outcomes all considered, as recommended by van Tulder and others.11-13 Finally, in none of these reviews was the impact of the predefined criteria on the conclusions tested. Therefore, the value of palpation as a diagnostic tool is, at present, still unknown, and so are the abilities of practitioners of manual therapy to reliably diagnose spinal dysfunctions using palpation.

We therefore decided that another systematic review taking into account the above issues was warranted. Furthermore, a meta-analysis including comparable studies of adequate methodological standard and assessment of the consistency of study outcomes would be highly useful. The purpose of this paper is therefore to systematically review and critically assess the design and statistical methodology of the literature pertaining to reproducibility of spinal palpation, adopting standardized criteria for judging diagnostic studies. A meta-analysis was conducted to evaluate consistency of study outcomes. Finally, the level of evidence for the reproducibility of spinal palpation was determined.

METHODS

Definitions

Palpation was defined according to Bergmann and Petersen,1 and results of the original articles were analyzed according to the palpation procedure, using the following annotations: motion palpation (MP), static palpation (SP) (palpation for alignment and/or structure), osseous pain (OP) (pain generated from palpation of osseous structures), soft tissue pain (STP), soft tissue changes (STC), and global assessment (GA) (the latter was introduced to describe the use of 2 or more of the above procedures to make 1 single judgement on the presence/absence of mechanical dysfunction). Each palpation procedure could be applied under 5 conditions (standing, sitting, prone, supine, or side lying) and at different segmental levels. Consequently, a palpation procedure applied under a specific condition at 1 or more segmental levels is denoted a test. A paper could consider a single test or several tests and only 1 palpation procedure or several palpation procedures.

Reproducibility refers to the ability of a single observer to find the same result using the same diagnostic procedure in the same patient at 2 separate moments in time (intraobserver agreement) and/or the ability of 2 observers to find the same result of a given diagnostic procedure in a patient (interobserver agreement).14

Study Selection

Studies were identified by a comprehensive search of the MANTIS (1966-2005), CINAHL (1982-2005), and MEDLINE (1965-2005) databases using the index terms reproducibility, reliability, or observer variation in combination with palpation, motion palpation, physical examination procedures, or spine in text and abstracts. Bibliographies of retrieved documents were checked for any additional studies. The principal investigator (MJS) screened the documents retrieved from this search twice to determine eligibility according to inclusion and exclusion criteria, as listed in Figure 1.

Data Extraction

Using a checklist, data from included documents were extracted and recorded independently by 2 of the authors (MJS and HWC). Completed checklists were then compared, and discordances were resolved by discussion until consensus was reached. If consensus could not be reached, a third investigator (JH) was available to mediate.

Fig 1. Inclusion and exclusion criteria.

Stochkendahl et al. Spinal Palpation: A Systematic Review. Journal of Manipulative and Physiological Therapeutics, July/August 2006.



Assessment of Methodological Quality of Trials

No standardized and validated method for assessing the quality of reproducibility studies exists. Therefore, a 6-point scale was constructed based on recognized requirements for clinical trials of reproducibility and standard recommendations for systematic reviews of test accuracy.12,15,16 The operational definitions of the quality criteria are described in Figure 2. A study was considered high-quality if the methodological quality score, expressed as a percentage of the maximum score, was 50% or higher and low-quality if the score was less than 50%. The quality score reflects the relevance and appropriateness of 3 separate dimensions that may affect interpretation of results: study population, study design, and statistical analysis. The quality scoring of the trials was performed independently by 2 reviewers (MJS and HWC). Differences in scores were resolved through consensus by the 2 reviewers. The quality scores of the individual trials were used as part of the evidence determination.

Fig 2. Operational definitions of the quality criteria.

Table 1. Basic characteristics of the selected articles for the systematic review

No. of articles
Region                 Inter (n = 48)   Intra (n = 19)
Cervical               16               3
Thoracic               5                2
Lumbar                 19               8
SI joints              8                6

No. of tests considered
Palpation procedure    Inter (n = 58)   Intra (n = 26)
MP                     28               15
SP                     3                0
OP                     6                1
STP                    11               5
STC                    3                0
GA                     7                5

Fig 3. Flow chart of study inclusion in the meta-analysis of interobserver reproducibility studies.

Meta-Analysis

To assess the consistency of study outcomes in articles included in the systematic review, a meta-analysis was conducted. Not eligible for inclusion in the meta-analysis were (1) low-quality studies (<50%), (2) studies not using a binary classification of the test outcome, (3) studies not reporting any results at all, (4) studies using a binary outcome but not reporting κ values, and (5) studies not reporting an adequate description of the palpation procedure.

When possible, single results from included studies (κ and confidence intervals [CI]) were drawn directly from the original articles. If CIs were not reported in the original studies, CIs were calculated according to Altman17 if the necessary information (prevalence and sample size) was available. Results for individual segmental levels not in sequence were included separately in the analysis. In case of multiple reproducibility results reported for several pairs of observers or several spinal segments in sequence, we took the average of the reported κ values and computed a CI, again by applying the Altman formula with the original sample size. This is a conservative approach ignoring a possible gain in precision due to taking the average.
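The CI reconstruction described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: it uses the large-sample standard error se(κ) = sqrt(p_o(1 − p_o) / (n(1 − p_e)²)) usually attributed to Altman's textbook, and it assumes that both observers share the same prevalence of positive findings so that chance agreement p_e can be derived from a single prevalence figure. The review does not state how prevalence entered the calculation, and the function name `kappa_ci_from_prevalence` is hypothetical.

```python
import math


def kappa_ci_from_prevalence(kappa, prevalence, n, z=1.96):
    """Approximate CI for a reported kappa, given prevalence and sample size.

    Assumption (not stated in the review): both observers rate the same
    proportion of subjects as positive, so chance agreement is
    p_e = p^2 + (1 - p)^2.
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    if n <= 0:
        raise ValueError("sample size must be positive")

    p_e = prevalence ** 2 + (1.0 - prevalence) ** 2  # chance agreement
    p_o = kappa * (1.0 - p_e) + p_e                  # observed agreement implied by kappa
    if not 0.0 <= p_o <= 1.0:
        raise ValueError("kappa is not attainable at this prevalence")

    # Large-sample standard error of kappa
    se = math.sqrt(p_o * (1.0 - p_o) / (n * (1.0 - p_e) ** 2))
    return kappa - z * se, kappa + z * se


# Hypothetical study: kappa = 0.40, prevalence 50%, 100 subjects
low, high = kappa_ci_from_prevalence(0.40, 0.50, 100)
print(f"kappa 0.40, 95% CI {low:.2f} to {high:.2f}")
```

For an averaged κ, the same function would be called with the mean κ and the original sample size, which mirrors the conservative choice described above: the interval is no narrower than that of a single result.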

We displayed all available original results in a forest plot. No formal modeling and analysis of heterogeneity was performed because (1) information on the precision of the single results was not available in all studies, (2) we partially used a conservative assessment in the single studies, and (3) multiple results within a study cannot be regarded as independent.

Overall κ values were computed by first taking the mean κ value within each study and then averaging these mean κ values. Confidence intervals for the overall κ values are based on the empirical variation of the mean κ values and were only computed if at least 4 studies contributed to a mean κ value.
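The two-stage averaging can be illustrated with a short Python sketch. The review does not say which reference distribution was used for the interval; the sketch assumes a Student t interval on the study means, with critical values hardcoded for up to 21 studies. The function name `overall_kappa` and the example data are hypothetical.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t, indexed by degrees of freedom.
T_CRIT_95 = {
    3: 3.182, 4: 2.776, 5: 2.571, 6: 2.447, 7: 2.365, 8: 2.306,
    9: 2.262, 10: 2.228, 11: 2.201, 12: 2.179, 13: 2.160, 14: 2.145,
    15: 2.131, 16: 2.120, 17: 2.110, 18: 2.101, 19: 2.093, 20: 2.086,
}


def overall_kappa(kappas_by_study, min_studies_for_ci=4):
    """Average kappa within each study first, then across studies.

    Returns (overall, ci), where ci is None when fewer than
    min_studies_for_ci studies contribute.
    """
    study_means = [statistics.mean(values) for values in kappas_by_study.values()]
    overall = statistics.mean(study_means)
    k = len(study_means)
    if k < min_studies_for_ci:
        return overall, None

    df = k - 1
    if df not in T_CRIT_95:
        raise ValueError("no tabulated t critical value for this number of studies")

    # Standard error from the empirical variation of the study means
    se = statistics.stdev(study_means) / math.sqrt(k)
    half_width = T_CRIT_95[df] * se
    return overall, (overall - half_width, overall + half_width)


# Hypothetical kappa values; a study may report several results
example_results = {
    "study_A": [0.50, 0.60],
    "study_B": [0.40],
    "study_C": [0.30, 0.50],
    "study_D": [0.60],
}
overall, ci = overall_kappa(example_results)
print(overall, ci)
```

Averaging within studies first keeps a study that reports many segmental results from dominating the overall value, which fits the stated concern that multiple results within a study are not independent.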

In a secondary analysis, the association between several study characteristics and the mean κ value of the study was tested by an analysis of covariance, including the type of palpation, separately for the intra- and interobserver results. The study characteristics were as follows: publication year, definition of positive findings, segmental region, standardization (ie, agreement on procedure, written instructions, and training sessions), application condition, occupation, experience, symptomatic status of test population, and multiple tests.

Fig 4. Meta-analysis: intraobserver reproducibility.

Fig 5. Meta-analysis: interobserver reproducibility.

Assessment of the Level of Evidence

Criteria for determining the level of evidence for reproducibility of spinal palpation were adapted from the Agency for Health Care Policy and Research's guidelines for acute low back pain.18 This method has been used to assess the level of evidence of risk factors for low back pain in systematic reviews of epidemiological studies.13,19 The method takes into account all available included studies which describe a palpation procedure, report results, and use a valid statistical method (κ, weighted κ [κw], or intraclass correlation coefficient [ICC]).8

The system evaluates the evidence by taking into account (1) the number of studies, (2) the methodological quality expressed by quality scores, and (3) the consistency of the study outcomes. Consistency was checked by visual inspection of the forest plots. The rating system was applied to each palpation procedure. Five categories were used to describe evidence levels:

- Strong evidence: provided by generally consistent findings in multiple (≥2) high-quality studies
- Moderate evidence: provided by generally consistent findings in 1 high-quality study and 1 or more low-quality studies or in multiple (≥2) low-quality studies
- Preliminary evidence: only 1 study available
- Conflicting evidence: inconsistent findings in multiple (≥2) studies
- No evidence: no studies were identified
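The five categories above amount to a small decision rule, sketched here in Python. The function name `level_of_evidence` is hypothetical, and consistency is passed in as a yes/no judgement because the review established it by visual inspection of forest plots rather than by a formula. The example calls use study counts from Table 3.

```python
def level_of_evidence(n_high_quality, n_low_quality, consistent):
    """Map study counts and a consistency judgement to an evidence category."""
    total = n_high_quality + n_low_quality
    if total == 0:
        return "no evidence"
    if total == 1:
        return "preliminary"
    if not consistent:
        return "conflicting"      # inconsistent findings in 2 or more studies
    if n_high_quality >= 2:
        return "strong"           # consistent, multiple high-quality studies
    return "moderate"             # 1 HQ plus LQ studies, or LQ studies only


# Counts (HQ, LQ) taken from Table 3
print(level_of_evidence(8, 1, True))    # interobserver OP
print(level_of_evidence(1, 0, True))    # intraobserver OP
print(level_of_evidence(3, 1, False))   # interobserver SP
print(level_of_evidence(0, 0, True))    # intraobserver STC
```

Checking the number of studies before consistency keeps a single study from being labeled conflicting, which matches the requirement that conflicting evidence involves at least 2 studies.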

The level of acceptable reproducibility has traditionally, and somewhat arbitrarily, been set at κ > 0.4 in studies of manual medicine,8,20-25 and thus, a κ value above 0.4 was considered clinically acceptable reproducibility in this review. Levels of clinically acceptable reproducibility expressed in κw or ICC were arbitrarily chosen at 0.4 and 0.8, respectively.

Sensitivity Analysis

To test the robustness of the assumptions behind the weighting of the evidence, the prespecified cut points for adequate methodological quality (50%) and minimal clinically acceptable reproducibility (κ ≥ 0.4) were subjected to increases and decreases of ±25% in the quality score and ±0.1 in reproducibility.
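As a rough illustration of this kind of sensitivity analysis, the Python sketch below re-evaluates a set of studies over the 3 × 3 grid of cut points (quality 25%, 50%, 75%; κ 0.3, 0.4, 0.5). It is a simplification under stated assumptions: acceptability is judged here by the mean κ of the studies that pass the quality cut, whereas the review weighed consistency of findings as well. The function name `sensitivity_grid` and the study data are hypothetical.

```python
from statistics import mean

QUALITY_CUTS = (25, 50, 75)   # percent: prespecified 50% plus and minus 25%
KAPPA_CUTS = (0.3, 0.4, 0.5)  # prespecified 0.4 plus and minus 0.1


def sensitivity_grid(studies):
    """studies: list of (quality_pct, kappa) pairs.

    Returns {(quality_cut, kappa_cut): True/False/None}, where None means
    that no study passed the quality cut.
    """
    grid = {}
    for quality_cut in QUALITY_CUTS:
        passing = [kappa for quality, kappa in studies if quality >= quality_cut]
        for kappa_cut in KAPPA_CUTS:
            if not passing:
                grid[(quality_cut, kappa_cut)] = None
            else:
                grid[(quality_cut, kappa_cut)] = mean(passing) >= kappa_cut
    return grid


# Hypothetical studies: (quality score in percent, kappa)
hypothetical_studies = [(100, 0.55), (67, 0.45), (50, 0.38), (33, 0.20)]
for cuts, acceptable in sensitivity_grid(hypothetical_studies).items():
    print(cuts, acceptable)
```

A conclusion that stays the same across the whole grid can be called robust to the choice of cut points; one that flips between neighboring cells depends on them.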

RESULTS

Results of the Literature Search

More than 900 publications were retrieved, and 48 original articles published between 1980 and 2005 were included according to the inclusion criteria.20-67 In all 48 studies, interobserver reproducibility was reported, and in 19 studies, intraobserver reproducibility was also reported (Appendices A and B, available online at www.mosby.com/jmpt). All predefined categories of palpation, spinal segments, and application conditions were evaluated. In 25 articles, a single test was evaluated, and in 22 articles, multiple tests (parallel testing) were assessed. Classification of the palpation procedure was not possible in 1 study due to insufficient description.63 Altogether, 58 tests were considered for interobserver reproducibility and 26 tests for intraobserver reproducibility (Table 1). Motion palpation was the most frequently investigated palpation procedure, followed by studies of palpation for pain.

Fig 6. Flow chart of study inclusion in the assessment of level of evidence of interobserver reproducibility studies.

Table 2. Results of studies using ICC or κw, and low-quality studies included in the level of evidence of interobserver reproducibility

Palpation procedure   κw or ICC (reference*)       Low quality (reference*)
MP                    ICC: 0.09-0.25 (47)          κ: 0.05 (37)
                      ICC: 0.4-0.73 (49)           κ: 0.01 (56)
                      κw: 0.16-0.49 (21)           κ: 0.17-0.17 (57)
                      κw: 0.42-0.75 (40)
OP and STP            ICC: OP 0.27-0.85 (49)       κ: OP 0.00-1.0 (36)
                      ICC: OP 0.22-0.80 (20)       κ: STP 0.35-0.87 (36)
                      κw: OP 0.47-0.52 (25)
                      κw: STP 0.24-0.56 (25)
STC                                                κ: 0.07 (33)
SP                                                 κ: 0.14-0.37 (36)

κw represents weighted κ.
* In-text reference number.

Methodological Quality

The methodological quality of the studies ranged from 0% to 100% (Appendices C and D, available online at www.mosby.com/jmpt). Overall, 30 studies (63%) were of high quality; however, only 8 of 19 studies (42%) investigating intraobserver reproducibility were high-quality. The proportion of high-quality studies was higher among articles investigating the cervical and thoracic spine than among articles investigating the lumbar spine and the sacroiliac (SI) joints (67% vs 59%). A trend toward increasing quality was seen for more recent articles: the average quality score increased from 27% in articles published before 1988, to 48% in articles published between 1988 and 1995, and to 54% in articles published after 1996.

Meta-Analysis

Of 48 original studies addressing interobserver reproducibility, 22 were considered both high-quality and eligible for inclusion in the meta-analysis according to the predetermined criteria. Twenty-six articles were not included (Fig 3). Figures 4 and 5 give an overview of the single results available for the meta-analysis.

Eight original studies addressing intraobserver reproducibility were included in the meta-analysis (Fig 4). Eleven studies were not eligible: 10 studies were low-quality,34,37,48,53,60,61,63-66 and 1 paper did not use a binary classification of the test outcome.55 Results were only available for 4 procedures (STP, OP, MP, and GA). Within each procedure, results seem to be comparable and point to midrange to high-range κ values, except for the study of Meijne et al.39

With respect to interobserver reproducibility, most of the results for STP indicate midrange reproducibility (Fig 5). The exception is the results from Boline,58 which showed low-range reproducibility; however, the κ estimate was very imprecise here (large CI). For STC, the results suggest low-range reproducibility, whereas SP shows inconsistent results. Results of OP all suggest mid- to high-range κ values. Most of the results for MP suggest low reproducibility. κ values were inconsistent for GA but had wide, overlapping confidence intervals.

We found no significant effect of year of publication, segmental region, standardization of procedures, observer profession or experience, symptomatic status of test population, or number of tests performed on the κ values (data not shown). Thus, our investigation showed that most study characteristics had little influence on the study results. A notable exception was seen when comparing the application conditions, where sitting palpation was associated with slightly smaller κ values and standing palpation was associated with distinctly smaller κ values. These differences were significant (P = .042) for the interobserver studies, but the tendency could also be seen in the intraobserver studies (nonsignificant). We would also like to note that, in the intraobserver analysis, we observed a tendency toward low mean κ values in studies without parallel testing (κ = 0.23) compared with studies with parallel testing (κ = 0.61) (nonsignificant).

Evidence of Reproducibility

Thirty-one articles were available for the assessment of level of evidence, including 6 studies not reporting a binary outcome (Fig 6).20,21,25,40,47,49 Results from the 6 studies using weighted κ or ICC were not directly comparable to the studies using κ, but all 6 studies showed results with similar trends of low interobserver agreement on MP and higher interobserver agreement on evaluation of pain (Table 2). Similarly, we also included 5 low-quality studies, which showed similar trends (Table 2).33,36,37,56,57

Taking all 31 studies together, strong evidence of clinically acceptable intraobserver reproducibility (κ ≥ 0.4) was found for STP and GA (Table 3). Strong evidence for clinically acceptable interobserver reproducibility was found for OP and STP according to the predefined criteria for assessment of levels of evidence. Strong evidence of clinically unacceptable reproducibility was found for intraobserver MP and interobserver MP and STC. Conflicting evidence was found for interobserver reproducibility of SP and GA. Preliminary evidence of clinically acceptable reproducibility was found for intraobserver OP, and no evidence was found for intraobserver SP and STC.

Table 3. Articles included in the meta-analysis and the assessment of level of evidence in categories of palpation procedures

             Total no. of articles   No. of HQ articles     No. of test results    No. of articles eligible
             in systematic review    eligible for           used in the            for level of evidence
             (n = HQ/LQ)             meta-analysis          meta-analysis          (n = HQ/LQ)
Procedure    Inter      Intra        Inter      Intra       Inter      Intra       Inter      Intra
             (30/18)    (8/11)       (n = 22)   (n = 8)     (n = 57)   (n = 26)    (25/6)     (11/3)
OP           8/2        1/1          5          1           5          1           8/1        1/0
STP          8/2        2/1          7          2           11         5           8/1        2/0
MP           22/14      7/8          16         6           27         15          20/3       6/2
STC          5/2        0/0          3          0           3          0           3/1        0
SP           4/1        0/0          3          0           3          0           3/1        0
GA           4/1        2/1          4          2           7          5           4/0        2/1

                                                            Average κ value from the
             Conflicting evidence    Level of evidence      meta-analysis (95% CI)*
Procedure    Inter      Intra        Inter      Intra       Inter               Intra
OP           No         -            Strong     Pre         0.53 (0.32-0.74)    0.91
STP          No         No           Strong     Strong      0.42 (0.29-0.55)    0.65
MP           No         No           Strong     Strong      0.17 (0.10-0.24)    0.35 (0.13-0.58)
STC          No         -            Strong     No          0.03                -
SP           Yes        -            Conf       No          -                   -
GA           Yes        No           Conf       Strong      -                   0.44

HQ, High-quality; LQ, low-quality; Pre, preliminary; Conf, conflicting. A dash marks a cell with no entry.
* Calculated if 4 or more results were available.

Sensitivity Analysis

In the meta-analysis, only high-quality studies were included. If low-quality studies reporting binary outcomes and κ values or high-quality studies using κw or ICC had been included, the results would have been unaffected (data not shown).

Raising the cut point for adequate methodological quality from 50% to 75%, or any amount of decrease in the cut point, did not affect the weight of the evidence or the overall conclusions, except for intraobserver MP and intraobserver GA, where an increase to 75% would result in conflicting evidence derived from only 2 studies for intraobserver MP and moderate evidence for clinically acceptable intraobserver GA. Raising the cut point for clinical acceptability has an obvious impact, with results for pain being most robust due to high overall κ values.

DISCUSSION

Summary of Results

After reviewing studies dealing with reproducibility of manual palpation of the entire spine, including the SI joints, we found strong evidence for clinically acceptable reproducibility both within and between observers for palpation of OP and STP and within the same observer for GA. Strong evidence for clinically unacceptable levels of reproducibility for intra- and interobserver MP and STC was found. Intraobserver reproducibility was consistently higher than interobserver reproducibility, and reproducibility of palpation for pain response was consistently higher than reproducibility of palpation for motion.

The most recent and comprehensive review evaluating the reproducibility of spinal palpation, by Seffinger et al,7 applied different inclusion and general review criteria, and thus, only 27 of 44 articles and 9 of 19 high-quality articles included in this review were evaluated. Furthermore, we included several more recent publications and articles dealing with the SI joints and GA, and we evaluated single results from multiple test regimens. Our conclusions are based on predefined criteria and an evaluation of consistency of high-quality studies, a method not previously applied, whereas the conclusions by Seffinger et al7 were based on both high- and low-quality studies without an evaluation of consistency. The authors concluded that pain provocation tests are most reliable and that soft tissue paraspinal palpatory diagnostic tests are not reliable. Among the 12 highest-quality articles, pain provocation, motion, and landmark location tests were reliable within the same observer, but not always among observers under similar conditions. Overall, examiner discipline, experience level, consensus on procedures used, training, or the use of symptomatic subjects did not improve reliability. This is in agreement with our findings. Furthermore, we conclude that palpation of pain is reproducible both within and among observers, whereas MP may be reproducible within the same observer.

Methodological and Clinical Considerations

The experimental design of reproducibility studies has been criticized in previous reviews,4-7,68-71 and we found that 26 of 48 articles were of low methodological quality, had invalid statistical methods, or had insufficient reporting of palpation procedures or test results.

Comparability of the studies included in a review is an important requirement to ensure valid generalizations. We ensured comparability with respect to the palpation procedures used, but the studies were rather heterogeneous with respect to characteristics such as definition of positive findings, segmental region, standardization, occupation, experience, symptomatic status of test population, and parallel testing. However, our investigation showed that most study characteristics had little influence on the study results, with the exception of the application condition. In particular, standing palpation was associated with very low κ values. Among the reviewed studies, standing palpation is used solely in the "Gillet test" of SI biomechanical dysfunction, and only 2 studies reporting this condition were included in our analysis.39,59 However, both contributed to the evaluation of the inter- and intraobserver agreement of MP. If we remove these 2 studies, then the average κ for the interobserver agreement increases to 0.19 (0.13-0.26), and the intraobserver agreement increases to 0.44 (0.14-0.73), such that the intraobserver agreement of MP can be regarded as acceptable.

Poor reproducibility of MP may reflect the design of reproducibility studies, rather than the quality of the palpation procedure.29,30,72 Greater reproducibility may be attained by allowing positive findings in a neighboring spinal segment to count in assessing agreement.29 However, this implies that we define a new, different diagnostic test which, then, requires a clinical rationale of test meaningfulness, beyond just an increase in κ values.8 Further, parallel testing (test regimens) seems to aid the observer in making the clinical decision, thus enhancing reproducibility,30,42 a tendency we could also observe in our data. The acceptable intraobserver reproducibility for GA is also in line with this finding. However, when evaluating a combination of tests, information is only given about the reproducibility of the single test as part of this exact combination of tests.14,73 Moreover, we must be aware that conclusions on a single test from a study involving several tests may be only valid if the test is applied as part of this exact combination of tests. From a clinical perspective, increased reproducibility with parallel testing indicates that at this point, clinicians should not base their diagnosis on a single clinical examination finding such as palpation but, rather, conduct a range of tests. It is, however, premature to make clinical guidelines on how to use palpation because many aspects of palpation, such as the validity, still need to be investigated.

The reproducibility of palpation for pain response is consistently higher than palpation for motion and, consistently, substantially higher within an observer than among different observers. However, both palpatory pain studies and intraobserver studies in general have inherent problems with blinding of observers. In intraobserver studies, conscious and unconscious cues may render blinding of the observers impossible, and the independence of measures cannot be guaranteed. In palpatory pain studies, blinding of subjects is impossible. Both situations imply the risk of overestimating reproducibility. It should also be noted that intraobserver reproducibility is somewhat higher than interobserver reproducibility by definition (depending on the magnitude of observer by subject interaction).74

A dilemma between high internal validity and clinical applicability arises when designing studies of reproducibility. For example, training studies contrast maximal (ideal) reproducibility with actual reproducibility in practice. To enhance the internal validity, rigid testing conditions should be set up with considerations to blinding, randomization, standardization and training, and parallel testing. However, rigid enforcement of testing conditions often diverges from the clinical situation and, hence, may reduce the external validity. In a clinical situation, a mix of both asymptomatic and symptomatic patients will most likely present to practitioners of manual medicine. Therefore, the study population should consist of a mix of both symptomatic and asymptomatic subjects so that the reproducibility of the testing procedure has a relation to the characteristics of the study population.14 Finally, despite their use in everyday clinical routines, test procedures do not necessarily evaluate the clinical entity they are intended to evaluate, and it is therefore important to discuss the content of the test procedure.14,75

Statistical Considerations

κ is widely accepted as the statistical method of choice for evaluating agreement between 2 observers for a binary classification.8 It is, however, not without problems to use κ as the sole measure of observer agreement because information is lost when a 4-fold table is summarized into 1 number. Consequently, if a moderate κ value is obtained in a study of reproducibility, we do not know whether it is due to a difference in prevalence estimates between observers or whether observers lack agreement in spite of similar prevalence.
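The loss of information can be made concrete with a small Python sketch. It computes Cohen's κ from a 4-fold table using the standard definition κ = (p_o − p_e) / (1 − p_e) and also returns each observer's prevalence of positive findings. The two tables are hypothetical and were constructed to give the same κ; the function name `kappa_from_table` is likewise an illustration only.

```python
def kappa_from_table(a, b, c, d):
    """Cohen's kappa for a 4-fold table of two observers.

    a: both positive          b: observer 1 positive, observer 2 negative
    c: observer 1 negative,   d: both negative
       observer 2 positive
    Returns (kappa, prevalence_observer_1, prevalence_observer_2).
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("table is empty")
    p_o = (a + d) / n                 # observed agreement
    prev_1 = (a + b) / n              # positives according to observer 1
    prev_2 = (a + c) / n              # positives according to observer 2
    p_e = prev_1 * prev_2 + (1 - prev_1) * (1 - prev_2)  # chance agreement
    if p_e == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1 - p_e), prev_1, prev_2


# Hypothetical table 1: observers agree on prevalence but disagree on subjects
same_prevalence = kappa_from_table(35, 15, 15, 35)
# Hypothetical table 2: observer 2 calls twice as many subjects positive
different_prevalence = kappa_from_table(288, 12, 312, 388)
print(same_prevalence)
print(different_prevalence)
```

Both tables summarize to the same κ although the disagreement arises differently, which is why reporting the full table, or at least each observer's prevalence, alongside κ is informative.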

κ has been criticized for its dependence on the prevalence of positive findings, which limits its usefulness in meta-analyses, because studies with varying prevalence are typically compared. However, the composition of the study population may have greater impact on κ than the prevalence of positive findings.9 Both a binary outcome and a reported κ value were required for studies to be part of our meta-analysis. However, binary outcomes may vary according to the definition of positive findings (ie, prevalence is directly dependent on the definition of positive findings). For example, if the observer is asked to identify any hypomobile segment(s) in a spinal region, the prevalence can vary from 0% to 100%, depending on the study population. If the observer is to identify the most hypomobile segment, the overall prevalence of positive findings will be 100%, but at any particular segment under investigation, the prevalence of the most hypomobile can be 0% to 100%. However, we found no association between the prevalence of positive findings and κ values. This supports the view that the composition of the study populations is probably of greater importance than the prevalence of positive findings, as suggested by Vach.9
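The prevalence dependence that κ has been criticized for can be shown with a short Python sketch. It holds the observed agreement fixed and varies the prevalence of positive findings, assuming for simplicity that both observers share the same prevalence. The numbers are hypothetical, the function name `kappa_at_prevalence` is an illustration only, and the sketch shows the arithmetic of the criticism, not the empirical pattern found in this review.

```python
def kappa_at_prevalence(observed_agreement, prevalence):
    """Kappa implied by a fixed observed agreement at a given prevalence.

    Assumption: both observers have the same prevalence of positive
    findings, so chance agreement is p^2 + (1 - p)^2.
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    p_e = prevalence ** 2 + (1.0 - prevalence) ** 2
    return (observed_agreement - p_e) / (1.0 - p_e)


# Hypothetical: observers agree on 85% of subjects in every scenario
for prevalence in (0.5, 0.3, 0.1):
    kappa = kappa_at_prevalence(0.85, prevalence)
    print(f"prevalence {prevalence:.0%}: kappa = {kappa:.2f}")
```

With the same 85% raw agreement, κ falls as prevalence moves away from 50% because more of the agreement is attributed to chance.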

Different words and schemes have been used to evaluate the strength of reproducibility, but there are no definitive guidelines for interpreting good concordance.8,76 Moreover, little research has been done to establish minimal, clinically acceptable reproducibility, and perhaps more important than qualifying the strength of concordance, the quantitative reproducibility indices need to be evaluated in terms of their clinical application.8

Limitations of this Review

Different methodologies have been advocated for systematic reviews of trials addressing therapeutic efficacy,12 but little consensus exists when it comes to assessing the quality of reproducibility studies. We have chosen to evaluate the strength of evidence based on a best-evidence synthesis method, and this is one of the main differences between this review and previously published reviews on the same topic. Heterogeneity across studies, in terms of test procedures, inclusion criteria, study design, and presentation of results, may be masked by the best-evidence approach. Considerable heterogeneity in study characteristics was noted across studies included in this review. However, despite this heterogeneity, the meta-analysis showed very consistent overall findings and only moderate impact of the specific design characteristics on the study outcomes.

The exclusion from the meta-analysis of studies that did not report a binary outcome is another important difference between this and previous reviews. To compare studies of reproducibility, the same type of outcome and method of statistics must be applied. On this account, we had to




17. Altman DG. Some common problems in medical research. In: Altman DG, editor. Practical statistics for medical research. London: Chapman & Hall; 1991. p. 396-439.

18. Bigos S, Bowyer O, Braen G, et al. Acute low back problems in adults. Clinical Practice Guideline No. 14. AHCPR Publication No. 95-0642. Rockville (Md): Agency for Health Care Policy and Research, Public Health Service, U.S. Department of Health and Human Services; 1994 [December. Available from: www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=hstat6.chapter.25870].

19. Hartvigsen J, Lings S, Leboeuf-Yde C, Bakketeig L. Psychosocial factors at work in relation to low back pain and consequences of low back pain; a systematic, critical review of prospective cohort studies. Occup Environ Med 2004;61:e2.

20. Pool JJ, Hoving JL, De Vet HC, van Mameren H, Bouter LM. The interexaminer reproducibility of physical examination of the cervical spine. J Manipulative Physiol Ther 2004;27:84-90.

21. Fjellner A, Bexander C, Faleij R, Strender LE. Interexaminer reliability in physical examination of the cervical spine. J Manipulative Physiol Ther 1999;22:511-6.

22. Strender LE, Lundin M, Nell K. Interexaminer reliability in physical examination of the neck. J Manipulative Physiol Ther 1997;20:516-20.

23. Strender LE, Sjoblom A, Sundell K, Ludwig R, Taube A. Interexaminer reliability in physical examination of patients with low back pain. Spine 1997;22:814-20.

24. Keating JC, Bergmann TF, Jacobs GE, Finer BA, Larson K. Interexaminer reliability of eight evaluative dimensions of lumbar segmental abnormality. J Manipulative Physiol Ther 1990;13:463-70.

25. Viikari-Juntura E. Interexaminer reliability of observations in physical examinations of the neck. Phys Ther 1987;67:1526-32.

26. Sebastian D, Chovvath R. Reliability of palpation assessment in non-neutral dysfunctions of the lumbar spine. Orthop Phys Ther Pract 2004;16:23-6.

27. Hicks GE, Fritz JM, Delitto A, Mishock J. Interrater reliability of clinical examination measures for identification of lumbar segmental instability. Arch Phys Med Rehabil 2003;84:1858-64.

28. Downey B, Nicholas T, Niere K. Can manipulative physiotherapists agree on which lumbar level to treat based on palpation? Physiotherapy 2003;89:74-81.

29. Christensen HW, Vach W, Manniche C, Haghfelt T, Hartvigsen L, Høilund-Carlsen PF. Palpation of the upper thoracic spine: an observer reliability study. J Manipulative Physiol Ther 2002;25:285-92.

30. Horneij E, Hemborg B, Johnsson B, Ekdahl C. Clinical tests on impairment level related to low back pain: a study of test reliability. J Rehabil Med 2002;34:176-82.

31. Marcotte J, Normand MC, Black P. The kinematics of motion palpation and its effect on the reliability for cervical spine rotation. J Manipulative Physiol Ther 2002;25:E7.

32. Comeaux Z, Eland D, Chila A, Pheley A, Tate M. Measurement challenges in physical diagnosis: refining interrater palpation, perception and communication. J Bodyw Mov Ther 2001;5:245-53.

33. Ghoukassian M, Nicholls B, McLaughlin P. Inter-examiner reliability of the Johnson and Friedman percussion scan of the thoracic spine. J Osteopath Med 2001;4:15-20.

34. French SD, Green S, Forbes A. Reliability of chiropractic methods commonly used to detect manipulable lesions in patients with chronic low-back pain. J Manipulative Physiol Ther 2000;23:231-8.

35. Smedmark V, Wallin M. Inter-examiner reliability in assessing passive intervertebral motion of the cervical spine. Man Ther 2000;5:97-101.

36. van Suijlekom HA, de Vet HC, van den Berg SG, Weber WE. Interobserver reliability in physical examination of the cervical spine in patients with headache. Headache 2000;40:581-6.

37. Vincent-Smith B, Gibbons P. Inter-examiner and intra-examiner reliability of standing flexion test. Man Ther 1999;4:87-93.

38. Hawk C, Phongphua C, Bleecker J, Swank L, Lopez D, Rubley T. Preliminary study of the reliability of assessment procedures for indications for chiropractic adjustments of the lumbar spine. J Manipulative Physiol Ther 1999;22:382-9.

39. Meijne W, van Neerbos K, Aufdemkampe G, van der Wurff P. Intraexaminer and interexaminer reliability of the Gillet test. J Manipulative Physiol Ther 1999;22:4-9.

40. Lundberg G, Gerdle B. The relationships between spinal sagittal configuration, joint mobility, general low back mobility and segmental mobility in female homecare personnel. Scand J Rehabil Med 1999;31:197-206.

41. Cattrysse E, Swinkels RAH, Oostendorp RAB, Duquet W. Upper cervical instability: are clinical tests reliable? Man Ther 1997;2:91-7.

42. Jull G, Zito G. Inter-examiner reliability to detect painful upper cervical joint dysfunction. Aust J Physiother 1997;43:125-9.

43. McPartland JM, Goodridge JP. Counterstrain and traditional osteopathic examination of the cervical spine compared. J Bodyw Mov Ther 1997;1:173-8.

44. Tuchin P, Hart J, Colman R, Johnson C, Gee A, Edwards I, et al. Interexaminer reliability of chiropractic evaluation for cervical spine problems: a pilot study. Chiropr J Aust 1996;5:23-9.

45. Haas M. Reliability of manual end-play palpation of the thoracic spine. Chiropr Tech 1995;7:120-4.

46. Lindsay DM. Interrater reliability of manual therapy assessment techniques. Phys Ther Can 1995;47:173-80.

47. Binkley J, Stratford PW, Gill C. Interrater reliability of lumbar accessory motion mobility testing. Phys Ther 1995;75:786-92.

48. Inscoe EL, Witt PL, Gross MT, Mitchell RU. Reliability in evaluating passive intervertebral motion of the lumbar spine. J Man Manip Ther 1995;3:135-43.

49. Maher C, Adams R. Reliability of pain and stiffness assessments in clinical manual lumbar spine examination. Phys Ther 1994;74:801-9.

50. Hubka MJ, Phelan SP. Interexaminer reliability of palpation for cervical spine tenderness. J Manipulative Physiol Ther 1994;17:591-5.

51. Paydar D, Thiel H, Gemmell H. Intra- and interexaminer reliability of certain pelvic palpatory procedures and the sitting flexion test for sacroiliac joint mobility and dysfunction. J Neuromusculoskel Syst 1994;2:65-9.

52. Boline PD, Haas M, Meyer JJ, Kassak K, Nelson C, Keating JC. Interexaminer reliability of eight evaluative dimensions of lumbar segmental abnormality: part II. J Manipulative Physiol Ther 1993;16:363-74.

53. Mior SA, McGregor M, Schut B. The role of experience in clinical accuracy. J Manipulative Physiol Ther 1990;13:68-71.

54. Leboeuf C. Chiropractic examination procedures: a reliability and consistency study. J Aust Chiropr Assoc 1989;19:101-4.

55. Herzog W, Read LJ, Conway PJ, Shaw LD, McEwen MC. Reliability of motion palpation procedures to detect sacroiliac joint fixations. J Manipulative Physiol Ther 1989;12:86-92.

56. Nansel DD, Peneff AL, Jansen RD, Cooperstein R. Interexaminer concordance in detecting joint-play asymmetries in the cervical spines of otherwise asymptomatic subjects. J Manipulative Physiol Ther 1989;12:428-33.

57. Mootz RD, Keating JC, Kontz HP, Milus TB, Jacobs GE. Intra- and interobserver reliability of passive motion palpation of the lumbar spine. J Manipulative Physiol Ther 1989;12:440-5.


58. Boline PD. Interexaminer reliability of palpatory evaluations of the lumbar spine. Am J Chiropr Med 1988;1:5-11.

59. Carmichael JP. Inter- and intra-examiner reliability of palpation for sacroiliac joint dysfunction. J Manipulative Physiol Ther 1987;10:164-71.

60. Love RM, Brodeur RR. Inter- and intra-examiner reliability of motion palpation for the thoracolumbar spine. J Manipulative Physiol Ther 1987;10:1-4.

61. Bergström E, Courtis G. An inter- and intra-examiner reliability study of motion palpation of the lumbar spine in lateral flexion in the seated position. Eur J Chiropr 1986;34:121-41.

62. Mior SA, King R. Intra and interexaminer reliability of motion palpation in the cervical spine. J Can Chiropr Assoc 1985;29:195-9.

63. Deboer KF, Harmon R, Tuttle CD, Wallace H. Reliability study of detection of somatic dysfunctions in the cervical spine. J Manipulative Physiol Ther 1985;8:9-16.

64. Potter NA, Rothstein JM. Intertester reliability for selected clinical tests of the sacroiliac joint. Phys Ther 1985;65:1671-5.

65. Johnston WL, Allan BR, Hendra JL, Neff DR, Rosen ME, Sills LD, et al. Interexaminer study of palpation in detecting location of spinal segmental dysfunction. J Am Osteopath Assoc 1983;82:839-45.

66. Gonella C, Paris SV, Kutner M. Reliability in evaluating passive intervertebral motion. Phys Ther 1982;62:436-44.

67. Wiles MR. Reproducibility and interexaminer correlation of motion palpation findings of the sacroiliac joints. J Can Chiropr Assoc 1980;24:59-69.

68. Oldreive WL. Manual therapy rounds. A critical review of the literature on tests of the sacroiliac joint. J Man Manip Ther 1995;3:157-61.

69. Keating JC. Inter-examiner reliability of motion palpation of the lumbar spine: a review of quantitative literature. Am J Chiropr Med 1989;2:107-10.

70. Panzer DM. The reliability of lumbar motion palpation. J Manipulative Physiol Ther 1992;15:518-24.

71. Haas M. The reliability of reliability. J Manipulative Physiol Ther 1991;14:199-208.

72. Humphreys K, Delahaye M, Peterson CK. An investigation into the validity of cervical spine motion palpation using subjects with congenital block vertebrae as a "gold standard". BMC Musculoskelet Disord 2004;5:19.

73. van Deursen L, Patijn J, Ockhuysen A, Vortman BJ. The value of some clinical tests of the sacro-iliac joint. Man Med 1990;5:96-9.

74. Feldt LS, McKee ME. Estimation of the reliability of skill tests. Res Q 1958;29:279-93.

75. Haas M, Groupp E, Panzer D, Partna L, Lumsden S, Aickin M. Efficacy of cervical endplay assessment as an indicator for spinal manipulation. Spine 2003;28:1091-6.

76. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977;33:159-74.

77. Huxley R, Neil A, Collins R. Unravelling the fetal origins hypothesis: is there really an inverse association between birthweight and subsequent blood pressure? Lancet 2002;360:659-65.

78. Chan AW, Hrobjartsson A, Haahr MT, Gotzsche PC, Altman DG. Empirical evidence for selective reporting of outcomes in randomized trials: comparison of protocols to published articles. JAMA 2004;291:2457-65.



APPENDIX A

Reference | Test procedure | Segmental level/patient position | Study population (no. [M/F], category, symptomatic status) | Examiners (no., occupation, experience) | Standardization
Christensen et al29 | MP STP | T1-T8 Sitting + prone | 107 (68/39) Outpatient Sympt + Asympt | 2 Chiropractors; experience NR | +
Horneij et al30 | MP STP | T7-L5 prone | 84 (sex NR) Gen pop Sympt + Asympt | 3 Physiotherapists, 18-25 y | +
French et al34 | GA | T11-L5 + SI observers own choice | 19 (14/5) Recruitment NR Sympt | 5 Chiropractors 5-18 y |
Vincent-Smith and Gibbons37 | MP | SI standing | 9 (5/4) Edu/staff Asympt | 9 Osteopathic stud 4-5 y | +
Hawk et al38 | GA | T12-S1 Observers own choice | 18 (14/4) Edu/staff Sympt + Asympt | 4 Chiropractors: 2 >20 y, 2 <3 y |
Meijne et al39 | MP | SI Standing | 41 (41/0) Edu/staff Sympt + Asympt | 2 Physiotherapy stud; experience NR | +
Cattrysse et al41 | GA | Cx supine + sitting | 11 (sex NR) Research Status NR | 4 Manual practitioners 1.5-13 y |
Inscoe et al48 | MP | T12-S1 Side posture | 6 (2/4) Edu/staff Sympt | 2 Physiotherapists 4-5 y | +
Paydar et al51 | MP OP | SI Sitting | 32 (17/15) Edu/staff Asympt | 2 Chiropractic stud 1 y | +
Mior et al53 | MP | SI | >15 (sex NR) Recruitment NR Status NR | 74 Chiropractic stud, experience NR; 2 Chiropractors >5 y | +/
Leboeuf54 | MP OP STP | Lx + SI sitting | 45 (29/16) Gen pop Sympt | 4 Chiropractic stud; experience NR | NR
Herzog et al55 | MP | SI Standing | 11 (sex NR) Prim Care Sympt + Asympt | 10 Chiropractors 1-11 y | +
Mootz et al57 | MP | Lx Sitting | 60 (sex NR) Edu/staff Status NR | 2 Chiropractors 7 + 10 y | +
Love and Brodeur60 | MP | T1-L5 Sitting | 32 (32/0) Edu/staff Status NR | 8 Chiropractic stud 1 y |
Carmichael59 | MP | SI Standing | 54 (sex NR) Edu/staff Asympt | 10 stud. 1-3 y | +
Bergström and Courtis61 | MP | Lx Sitting | 100 (sex NR) Edu/staff Status NR | 2 Chiropractic stud.; experience NR |
Deboer et al63 | Insuff descrip | Cx Sitting | 40 (40/0) Research + Edu/staff Asympt | 3 Chiropractors |
Mior and King62 | MP | C1 Supine | 62 (sex NR) Edu/staff Status NR | | NR
Gonella et al66 | MP | T12-S1 | 5 (0/5) Edu/staff Asympt | 5 Physiotherapists 3-20 y | +

Cx, cervical spine; Tx, thoracic spine; NR, not reported; NA, not applicable; Sympt, symptomatic; Asympt, asymptomatic; Prim Care, primary care; Edu/staff, educational (students) or staff members; Gen pop, general population; Outpatient, outpatient clinic; Research, research setting; Stud, student; M/F, male/female; PA, percentage agreement; CI, confidence interval; Neuro, neurologic testing, such as sensitivity, reflexes, muscular strength; Clin, clinical testing, such as active and passive range of motion, axial compression test, manual traction test, straight leg raise, and shoulder abduction test.



Additional procedures; definition of positive findings/acceptable reliability | Statistics (type, prevalence/CI reported) | Summary of results, κ (PA) | Quality score
Abnormality κ > 0.5 | κ (expanded κ): +/+ | MP: 0.13-0.45 (0.60-0.68) (82%-88%); STP: 0.34-0.57 (0.63-0.77) (81%-88%) | 100%
Muscle length Pain | κ: /+ | MP: 0.56-0.78 (78%-89%); STP: 0.64-0.78 (83%-89%) | 50.0%
History posture x-ray Neuro Clin Joint in need of adjustment; allows ±1 segment | κ: / | 0.21 to 1.00 (30%-100%) | 25.0%
Unsymmetrical movement, L >< R | κ: / | 0.46 (42%) | 25.0%
Manual examination Joint in need of adjustment (segment and functional unit) | κ: +/ | segment: 0.1 to 0.85; unit: 0.1 to 0.77 | 50.0%
Fixation | κ: /+ | 0.03-0.08 (71%-83%) | 75.0%
3 tests of instability Instability | κ: / | 0.27 to 1.0 (63.6%-100%) | 75.0%
Mobility | Percent agreement | | 0%
Posture Restriction tenderness | κ: /se | MP: 0.29 (58%); OP: 0.91 (97%) | 50.0%
Fixation | κ: / | NR | 25.0%
NR | Percent agreement | | 25.0%
Gait analysis Fixation, 3-point scale | Percentage agreement, χ2 | | 50.0%
Fixation | κ: +/ | 0.09 to 0.48 | 25.0%
Most hypomobile motor unit | Pearson | | 0%
Fixation | κ: +/se | 0.31 (90%) | 50.0%
Fixation | Percent agreement | | 0%
Fixation Pain Muscle | κ | | 25.0%
Fixation | κ: +/ | 0.37-0.52 (71%-79%) | 50.0%
Mobility, 7-point scale | Mean, SD | | 0%

Journal of Manipulative and Physiological Therapeutics, July/August 2006. Stochkendahl et al. Spinal Palpation: A Systematic Review


APPENDIX B



Reference | Test procedure | Segmental level/patient position | Study population (number [M/F], category, symptomatic status) | Examiners (number, occupation, experience) | Standardization
Pool et al20 | MP OP | Cx Supine | 32 (12/20) Primary care Sympt | 2 Physiotherapists; experience NR | +
Hicks et al27 | MP OP | Lx Prone | 63 (25/38) Outpatient + Research Sympt | 3 Physiotherapist, 1 Physiotherapist/chiropractor 3-8 y | +
Downey et al28 | MP | Lx Prone | 60 (28/32) Prim Care Sympt | 6 Physiotherapists 3-11 y | -
Sebastian and Chovvath26 | MP | L5 Sitting + prone | 31 (sex NR) Recruitment NR Sympt | 2 Physiotherapists 5-8 y | +
Christensen et al29 | MP STP | T1-T8 Sitting + prone | 107 (68/39) Outpatient Sympt + Asympt | 2 Chiropractors; experience NR | +
Horneij et al30 | MP STP | T7-L5 Prone | 84 (sex NR) Gen pop Sympt + Asympt | 3 Physiotherapists 18-25 y | +
Marcotte et al31 | MP | Cx Supine | 3 (sex NR) Edu/staff Asympt | 24 Chiropractic stud + 1 Chiropractor; experience NR | +
Comeaux et al32 | MP STC | C2-T8 Sitting | 54 (27/28) Gen pop Status NR | 3 Occupation NR >10 y |
Ghoukassian et al33 | STC | Tx Sitting | 19 (19/0) Recruitment NR Asympt | 10 Osteopathic stud 2 y | +
French et al34 | GA | T11-L5 + SI Observers own choice | 19 (14/5) Recruitment NR Sympt | |
Smedmark and Wallin35 | MP | C1-3 + C7-T1 Sitting + prone + side lying | 61 (15/46) Prim Care Sympt | 2 Physiotherapists >25 y | +
Van Suijlekom et al36 | SP OP STP | Cx Position NR | 24 (13/11) Outpatient + Research Sympt | 2 Neurologists; experience NR |
Vincent-Smith and Gibbons37 | MP | SI Standing | 9 (5/4) Edu/staff Asympt | 9 Osteopathic stud. 4-5 y | +
Hawk et al38 | GA | T12-S1 Observers own choice | 18 (14/4) Edu/staff Sympt + Asympt | 4 Chiropractors: 2 >20 y, 2 <3 y |
Meijne et al39 | MP | SI Standing | 41 (41/0) Edu/staff Sympt + Asympt | 2 Physiotherapy stud.; experience NR | +
Fjellner et al21 | MP | C0-C5 Sitting + supine | 48 (8/40) Edu/staff + Gen pop Asympt | 2 Physiotherapists 6 + 12 y | +
Lundberg and Gerdle40 | MP | Lx Side posture | 156 (0/156) Gen pop Status NR | | +
Strender et al22 | MP SP OP STP STC | C0-C3 Supine | 50 (13/37) Gen pop Sympt + Asympt | 2 Physiotherapists 21 + 23 y | +
Strender et al23 | MP STP | Lx Prone | 71 (28/43) Outpatient + Prim Care Sympt | 2 Physiotherapists, 2 Physicians; experience NR | +
Cattrysse et al41 | GA | Cx Supine + sitting | 11 (sex NR) Research Status NR | 4 Manual practitioners 1.5-13 y |




Additional procedures; definition of positive findings/acceptable reliability | Statistics (type, prevalence/CI reported) | Summary of results, κ (PA) | Quality score
Clin Mobility Pain, 11-point scale κ > 0.4, ICC > 0.75 | κ and ICC (2,1): +/ | MP: -0.09-0.63 (48%-90%); OP: 0.22-0.80 (40.6%-87.4%) | 50%
Clin General mobility test Mobility Pain | κ: +/+ | MP: -0.02-0.26 (52%-69%); OP: 0.25-0.55 (65%-87%) | 50%
History Clin Most symptomatic level | κ: +/+ | 0.37 | 50%
- Dysfunction | κ: +/ | 0.69 | 16.7%
- Abnormality κ > 0.5 | κ (expanded κ): +/+ | MP: 0.03-0.0 (0.22-0.24) (68%-80%); STP: 0.38 (0.67-0.70) (77%-79%) | 100%
Muscle length Pain | κ: /+ | MP: 0.12-0.49 (61%-77%); STP: 0.31-0.88 (80%-95%) | 66.7%
Fixation Inclination ≤6° | κ: +/se | 0.337-0.682 (81%-90%) | 16.7%
The most dysfunctional segment | κ: +/ | NR | 50.0%
The most significant area of tissue tension | κ: / | 0.07 | 33.3%
History Posture X-ray Neuro Clin Joint in need of adjustment; allows ±1 segment | κ: / | 0.16 to 0.25 (48%-64%) | 50.0%
4 tests of mobility Stiffness (reduced mobility) | κ: / | 0.28-0.43 (79%-87%) | 66.7%
History Clin Tender points Facet joint pain Impairment | κ: / | SP: 0.14-0.37; OP: 0.0-1.0; STP: 0.35-0.87 | 33.3%
Unsymmetrical movement, L >< R | κ: / | 0.05 (42%) | 16.7%
Manual examination Joint in need of adjustment (segment and functional unit) | κ: +/ | segment: 0.42 to 0.44; unit: 0.39 to 0.54 | 66.7%
Fixation | κ: /+ | 0.05 to 0.0 (76%-77%) | 66.7%
Clin If not normal κ > 0.4 | κ (w): +/+ | 0.16 to 0.49 (41%-92%) | 66.7%
Posture Clin Mobility, 5-point scale | κ (w): /+ | 0.42-0.75 | 66.7%
Clin Mobility Consistency Pain Difference between L/R, the most pronounced side κ > 0.4 | κ: +/+ | MP: 0.05-0.15 (26%-44%); SP: 0.24 (70%); OP: 0.37 (58%); STP: 0.31-0.52 (62%-68%); STC: .18 (36%) | 75.0%
Clin Neuro Mobility Normality versus pathology κ > 0.4 | κ: +/+ | MP: PT: 0.38-0.75 (72%-88%), MD: -0.08-0.24 (48%-62%); STP: PT: 0.27-0.56 (72%-86%), MD: 0.22-0.40 (71%-76%) | 66.7%
3 tests of instability Instability | κ: / | 0.64 to 1.0 (18%-100%) | 83.3%


APPENDIX B. continued





Reference | Test procedure | Segmental level/patient position | Study population (number [M/F], category, symptomatic status) | Examiners (number, occupation, experience) | Standardization
Jull and Zito42 | GA | C0-C3 Position NR | 40 (12/28) Outpatient Sympt + Asympt | |
McPartland and Goodridge43 | MP SP STC | C0-C3 Position NR | 7 + 11 (1/6 + 5/6) Research + Edu/staff Sympt + Asympt | 2 Osteopaths 10 + 40 y; 36 Osteopathic stud | NR
Tuchin et al44 | GA | C1-C7 Position NR | 53 (sex NR) Edu/staff Sympt + Asympt | |
Haas45 | MP | T3-T12 Sitting | 73 (2/3 males) Edu/staff Sympt/Asympt | 2 Chiropractors >15 y | +
Lindsay46 | MP | Lx + SI Supine + prone | 8 (sex NR) Gen pop Asympt | 2 Physiotherapists 6 + 10 y |
Binkley et al47 | MP | L1-S1 Prone | 18 (9/9) Outpatient Sympt | 6 Physiotherapists 6-13 y | +
Inscoe et al48 | MP | T12-S1 Side posture | 6 (2/4) Edu/staff Sympt | 2 Physiotherapists 4-5 y | +
Maher and Adams49 | MP OP | Lx Prone | 90 (34/56) Prim Care Sympt | 6 Physiotherapists 8-21 y |
Hubka and Phelan50 | SP | C0-C7 Sitting | 30 (11/19) Private Clinic Sympt | 2 Chiropractors 1 + 5 y |
Paydar et al51 | MP OP | SI Sitting | 32 (17/15) Edu/staff Asympt | 2 Chiropractic stud. 1 y | +
Boline et al52 | OP STP | Lx prone | 28 (+/+) Prim Care Sympt | 3 Chiropractors; experience NR | NR
Keating et al24 | MP SP OP STP STC | Lx Prone + sitting | 46 (20/26) Recruitment NR Sympt + Asympt | 3 Chiropractors 2-10 y | +
Mior et al53 | MP | SI | >15 (sex NR) Recruitment NR Status NR | 74 Chiropractic stud., experience NR; 2 Chiropractors >5 y | +/
Leboeuf54 | MP OP STP | Lx + SI Sitting | 45 (29/16) Gen pop Sympt | 4 Chiropractic stud; experience NR | NR
Herzog et al55 | MP | SI Standing | 11 (sex NR) Prim Care Sympt + Asympt | 10 Chiropractors 1-11 y | +
Nansel et al56 | MP | Middle + lower Cx Sitting + supine | 270 (approximately 50% males) Edu/staff Asympt | | +
Mootz et al57 | MP | Lx Sitting | 60 (sex NR) Edu/staff Status NR | 2 Chiropractors 7 + 10 y | +
Boline58 | MP STP STC | Lx Sitting | 50 (27/23) Edu/staff + Outpatient + Prim Care Sympt + Asympt | | +
Carmichael59 | MP | SI Standing | 54 (sex NR) Edu/staff Asympt | 10 stud. 1-3 y | +
Love and Brodeur60 | MP | T1-L5 Sitting | 32 (32/0) Edu/staff Status NR | 8 Chiropractic stud 1 y |
Viikari-Juntura25 | OP STP | Cx Seated | 69 (29/23) Outpatient Sympt | 1 Physician, 1 Physiotherapist; experience NR | +






Additional procedures; definition of positive findings/acceptable reliability | Statistics (type, prevalence/CI reported) | Summary of results, κ (PA) | Quality score
Manual examination Most dysfunctional segment Order of magnitude | κ: / | 0.25-1.0 | 66.7%
Dysfunction. Facet joint tenderness. Tissue texture. (Rating 0-10) | κ: / | MP: 0.34 (67%); SP: 0.53 (77%); STC: 0.19 (70%) | 58.3%
Manual examination Vertebral dysfunction | Logistic regression, χ2 | | 16.7%
End play restriction | κ: /SE | 0.14 | 100%
Posture Clin Muscle length Beyond slight anomaly | κ: +/ | Lx: 0.30 to 0.0 (14%-50%); SI: 0.0-0.60 (75%-86%) | 66.7%
- Motion, 9-point scale | ICC/+ | 0.09-0.25 | 33.3%
Mobility | Percent agreement | | 16.7%
Stiffness, 11-point scale Pain, 11-point scale | ICC (1,1): +/+ | MP: 0.40 to 0.73; OP: 0.27-0.85 | 58.3%
The most tender spot | κ: +/+ | 0.68 (77%) | 75.0%
Posture Restriction Tenderness | κ: /se | MP: 0.09 (34%); OP: 0.73 (91%) | 50.0%
Posture Dermothermography Surface electromyography Presence of abnormality | κ: +/ | OP: 0.48-0.90 (75%-96%); STP: 0.40-0.78 (89%) | 50.0%
Posture Dermothermography Temperature Misalignment Pain Fixation κ > 0.4 | κ: +/ | MP: 0.07-0.09; SP: 0.0; OP: 0.48; STP: 0.30; STC: 0.07 | 75.0%
Fixation | κ: / | NR | 16.7%
NR | Percent agreement | | 16.7%
Gait analysis Fixation, 3-point scale | Percentage agreement, χ2 | | 50.0%
The side of greatest resistance (L >< R) - marked segment. | κ: +/ | 0.01 (46%-54%) | 16.7%
- Fixation | κ: +/ | 0.17 to 0.17 | 33.3%
Presence of severe abnormality, fixation | κ: +/ | MP: 0.05 to 0.31 (78%-91%); STP: 0.03 to 0.49 (90%-96%); STC: 0.10-0.31 (70%) | 66.7%
Fixation | κ: +/se | 0.02 (85%) | 50.0%
Most hypomobile motor unit | Pearson | | 16.7%
Neuro Clin Tenderness Rating (0-3) κ > 0.4 | κ (w): +/ | OP: 0.47-0.52; STP: 0.24-0.56 | 50.0%




APPENDIX B continued





Reference | Test procedure | Segmental level/patient position | Study population (number [M/F], category, symptomatic status) | Examiners (number, occupation, experience) | Standardization
Bergström and Courtis61 | MP | Lx Sitting | 100 (sex NR) Edu/staff Status NR | 2 Chiropractic stud.; experience NR |
Mior and King62 | MP | C1 Supine | 62 (sex NR) Edu/staff Status NR | | NR
Deboer et al63 | Insuff descrip | Cx Sitting | 40 (40/0) Research + Edu/staff Asympt | |
Potter and Rothstein64 | MP | SI Standing + sitting + side posture + prone | 17 (10/7) Outpatient Sympt | 8 Physiotherapists 2-18 y | +
Johnston et al65 | STC | C7-T12 Standing | 30 (sex NR) Edu/staff Status NR | 1 Osteopaths, 5 Osteopathic stud; experience NR | NR
Gonella et al66 | MP | T12-S1 | 5 (0/5) Edu/staff Asympt | 5 Physiotherapists 3-20 y | +
Wiles67 | MP | SI | 46 (sex NR) Edu/staff Asympt | 12 Chiropractors, average 2.75 y | NR






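Additional procedures; definition of positive findings/acceptable reliability | Statistics (type, prevalence/CI reported) | Summary of results, κ (PA) | Quality score
Fixation | Percent agreement | | 0%
Fixation | κ: +/ | 0.15 (61%) | 50.0%
Fixation Pain Muscle | κ | | 50.0%
13 SI joint tests Restriction | Percentage agreement, χ2 | | 33.3%
Decreased rebound/dullness | Percent agreement | (79%-86%) | 0%
Mobility, 7-point scale | Mean, SD | | 16.7%
Restriction, 5-point scale | Percentage agreement, Pearson | | 0%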



APPENDIX C. Intra-observer reproducibility studies

Reference | Case mix | Blinding of observers to confounding info | Subject blinding | κ/ICC | Total (max 4 points) | Total percentage
Christensen et al29 | 1 | 1 | 1 | 1 | 4 | 100.00
Horneij et al30 | 1 | 0 | 0 | 1 | 2 | 50.00
French et al34 | 0 | 0 | 0 | 1 | 1 | 25.00
Vincent-Smith and Gibbons37 | 0 | 0 | 0 | 1 | 1 | 25.00
Hawk et al38 | 1 | 0 | 0 | 1 | 2 | 50.00
Meijne et al39 | 0 | 1 | 1 | 1 | 3 | 75.00
Cattrysse et al41 | 0 | 1 | 1 | 1 | 3 | 75.00
Inscoe et al48 | 0 | 0 | 0 | 0 | 0 | 0.00
Paydar et al51 | 1 | 0 | 0 | 1 | 2 | 50.00
Mior et al53 | 0 | 0 | 0 | 1 | 1 | 25.00
Leboeuf54 | 1 | 0 | 0 | 0 | 1 | 25.00
Herzog et al55 | 1 | 0 | 1 | 0 | 2 | 50.00
Mootz et al57 | 0 | 0 | 0 | 1 | 1 | 25.00
Love and Brodeur60 | 0 | 0 | 0 | 0 | 0 | 0.00
Carmichael59 | 0 | 0 | 1 | 1 | 2 | 50.00
Bergström and Courtis61 | 0 | 0 | 0 | 0 | 0 | 0.00
Deboer et al63 | 0 | 0 | 1 | 0 | 1 | 25.00
Mior and King62 | 0 | 0 | 1 | 1 | 2 | 50.00
Gonella et al66 | 0 | 0 | 0 | 0 | 0 | 0.00

APPENDIX D. Inter-observer reproducibility studies

Reference | Randomized order of observer | Case mix | Blinding of observers to other observers | Blinding of observers to confounding info | Subject blinding | κ/ICC | Total (max 6 points) | Total percentage
Pool et al20 | 0 | 1 | 1 | 0 | 0 | 1 | 3 | 50.00
Hicks et al27 | 0 | 1 | 1 | 0 | 0 | 1 | 3 | 50.00
Downey et al28 | 0 | 1 | 1 | 0 | 0 | 1 | 3 | 50.00
Sebastian and Chovvath26 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 16.67
Christensen et al29 | 1 | 1 | 1 | 1 | 1 | 1 | 6 | 100.00
Horneij et al30 | 1 | 1 | 1 | 0 | 0 | 1 | 4 | 66.67
Marcotte et al31 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 16.67
Comeaux et al32 | 0 | 0 | 1 | 1 | 1 | 0 | 3 | 50.00
Ghoukassian et al33 | 0 | 0 | 1 | 0 | 0 | 1 | 2 | 33.33
French et al34 | 1 | 0 | 1 | 0 | 0 | 1 | 3 | 50.00
Smedmark and Wallin35 | 1 | 1 | 1 | 0 | 0 | 1 | 4 | 66.67
Van Suijlekom et al36 | 0 | 1 | 0 | 0 | 0 | 1 | 2 | 33.33
Vincent-Smith and Gibbons37 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 16.67
Hawk et al38 | 1 | 1 | 1 | 0 | 0 | 1 | 4 | 66.67
Meijne et al39 | 0 | 0 | 1 | 1 | 1 | 1 | 4 | 66.67
Fjellner et al21 | 1 | 0 | 1 | 0 | 1 | 1 | 4 | 66.67
Lundberg and Gerdle40 | 1 | 1 | 1 | 0 | 0 | 1 | 4 | 66.67
Strender et al22 | 1 | 1 | 1 | 0 | 0.5 | 1 | 4.5 | 75.00
Strender et al23 | 1 | 0 | 1 | 0 | 1 | 1 | 4 | 66.67
Cattrysse et al41 | 1 | 0 | 1 | 1 | 1 | 1 | 5 | 83.33
Jull and Zito42 | 0 | 1 | 1 | 1 | 0 | 1 | 4 | 66.67
McPartland and Goodridge43 | 1 | 1 | 1 | 0 | 0.5 | 0 | 3.5 | 58.33
Tuchin et al44 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 16.67
Haas45 | 1 | 1 | 1 | 1 | 1 | 1 | 6 | 100.00
Lindsay46 | 1 | 0 | 1 | 0 | 1 | 1 | 4 | 66.67
Binkley et al47 | 0 | 1 | 0 | 0 | 0 | 1 | 2 | 33.33
Inscoe et al48 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 16.67
Maher and Adams49 | 1 | 0 | 1 | 0 | 0.5 | 1 | 3.5 | 58.33




APPENDIX D. continued

Reference | Randomized order of observer | Case mix | Blinding of observers to other observers | Blinding of observers to confounding info | Subject blinding | κ/ICC | Total (max 6 points) | Total percentage
Hubka and Phelan50 | 1 | 1 | 1 | 0 | 0.5 | 1 | 4.5 | 75.00
Paydar et al51 | 1 | 1 | 0 | 0 | 0 | 1 | 3 | 50.00
Boline et al52 | 1 | 1 | 0 | 0 | 0 | 1 | 3 | 50.00
Keating et al24 | 1 | 1 | 1 | 0 | 0.5 | 1 | 4.5 | 75.00
Mior et al53 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 16.67
Leboeuf54 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 16.67
Herzog et al55 | 0 | 1 | 1 | 0 | 1 | 0 | 3 | 50.00
Nansel et al56 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 16.67
Mootz et al57 | 1 | 0 | 0 | 0 | 0 | 1 | 2 | 33.33
Boline58 | 1 | 1 | 1 | 0 | 0 | 1 | 4 | 66.67
Carmichael59 | 0 | 0 | 1 | 0 | 1 | 1 | 3 | 50.00
Love and Brodeur60 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 16.67
Viikari-Juntura25 | 1 | 1 | 0 | 0 | 0 | 1 | 3 | 50.00
Bergström and Courtis61 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00
Mior and King62 | 0 | 0 | 1 | 0 | 1 | 1 | 3 | 50.00
Deboer et al63 | 1 | 0 | 1 | 0 | 1 | 0 | 3 | 50.00
Potter and Rothstein64 | 0 | 1 | 1 | 0 | 0 | 0 | 2 | 33.33
Johnston et al65 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00
Gonella et al66 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 16.67
Wiles67 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00
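
The totals and percentages tabulated above are consistent with a simple rule: sum the item scores (0, 0.5, or 1 per item) and express the sum as a percentage of the maximum number of points. The sketch below reproduces that arithmetic; the function name and the validation of item values are assumptions made for illustration, not part of the published scoring method.

```python
def quality_percentage(item_scores, max_points):
    """Return (total, percentage) for one study's quality items.

    item_scores: per-item scores, each assumed to be 0, 0.5, or 1.
    max_points: maximum attainable total (6 for inter-observer,
                4 for intra-observer studies in the tables above).
    """
    for score in item_scores:
        if score not in (0, 0.5, 1):
            raise ValueError(f"unexpected item score: {score}")
    total = sum(item_scores)
    return total, round(100 * total / max_points, 2)


# Item scores as tabulated for Strender et al22 (inter-observer, max 6 points)
print(quality_percentage([1, 1, 1, 0, 0.5, 1], 6))
```

For intra-observer studies the same function would be called with `max_points=4`.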

