A Human Face Asymmetries in Facial Actions

APPENDIX R

Reliability

An argument could be made that demonstrating reliability was not necessary in this study. A similar scoring technique was shown to be reliable in the Ekman et al. (1981) study, and the main coder (myself) was the same in both studies and had already demonstrated reliability. The purpose of examining reliability here was to affirm the previously established reliability and to investigate reliability for different AUs.

Table R1 shows the reliability coefficients for intercoder agreement for each action scored. Pearson correlations were based on the actual numerical scores assigned by the coders, but percent correct and Kappas were based on the category (Left, Right, or Symmetrical) of asymmetry. This category was determined simply by whether the score was left, right, or symmetrical, regardless of numerical value. Actions were selected for reliability scoring from each condition in the study except the simulations of emotion.
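The three agreement measures described above can be illustrated with a minimal sketch. The scores below are hypothetical, not data from the study, and the sign convention (negative for left, positive for right, zero for symmetrical) is an assumption made only for illustration. The Kappa shown is the standard Cohen's Kappa for two coders; the appendix does not state which variant was used.

```python
from collections import Counter
from math import sqrt

def category(score):
    """Map a signed asymmetry score to a category.
    Assumed convention: negative = Left, positive = Right, zero = Symmetrical."""
    if score < 0:
        return "Left"
    if score > 0:
        return "Right"
    return "Symmetrical"

def percent_agreement(cats_a, cats_b):
    """Percentage of events on which both coders chose the same category."""
    matches = sum(a == b for a, b in zip(cats_a, cats_b))
    return 100.0 * matches / len(cats_a)

def cohen_kappa(cats_a, cats_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(cats_a)
    p_obs = sum(a == b for a, b in zip(cats_a, cats_b)) / n
    freq_a, freq_b = Counter(cats_a), Counter(cats_b)
    p_exp = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_exp == 1.0:  # both coders used a single identical category
        return 1.0
    return (p_obs - p_exp) / (1.0 - p_exp)

def pearson_r(xs, ys):
    """Pearson correlation on the continuous (signed) scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores (-5 to 5) from two coders for the same eight events.
coder_a = [-2, 0, 3, 1, 0, -1, 2, 0]
coder_b = [-1, 0, 2, 0, 0, -3, 2, 1]

cats_a = [category(s) for s in coder_a]
cats_b = [category(s) for s in coder_b]

agreement = percent_agreement(cats_a, cats_b)
kappa = cohen_kappa(cats_a, cats_b)
r = pearson_r(coder_a, coder_b)
print(f"% agreement = {agreement:.0f}, kappa = {kappa:.2f}, r = {r:.2f}")
```

Because Kappa discounts chance agreement among only three categories, it can be considerably lower than percent agreement for the same data, which is the pattern seen in Table R1.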

In general, coefficients in Table R1 were high enough to establish asymmetry scoring as a reliable method. This reliability is reflected in Tables R2 and R3, which show that coders agreed about the results of the scoring. Evidence for validity beyond the reliability coefficients was clear because the results obtained by different coders are similar. Table R2 shows that coders agreed on the relative proportions of left, right, and symmetrical subjects for each requested AU. Table R3 shows consistent agreement between the two coders on the results of the experiment involving spontaneous actions. Table R3 Part 1 shows the distribution of left, right, and symmetrical subjects for three spontaneous movements. Both coders scored roughly equal proportions of asymmetrical subjects, and they agreed on the ratio of right to left subjects for each movement. There were more right than left subjects for AU 6/7, roughly equal numbers of left and right subjects for AU 20, and slightly more left than right subjects for AU 12.

Table R3 Part 2 looks at the distribution of left, right, and symmetrical scores within the startle conditions (a breakdown of the "startle" entries in Part 1). For AU 20, coders agreed about the proportion of asymmetries and that there were roughly equal numbers of right and left asymmetries in all three conditions and in total (slightly, but not significantly, more right). For AU 6/7 they agreed on the tendency for more right than left asymmetries in each condition and in total (significantly more for both coders). This tendency was stronger in Jim's than in Joe's scores, perhaps because Jim also scored relatively more asymmetries than Joe. That Jim scored more asymmetries of spontaneous AU 6/7 than Joe is the only point of disagreement in these results, but as explained below, both coders agreed about the degree of asymmetry in the spontaneous conditions versus the requested conditions.

The most cogent comparisons for assessing whether coders validly scored the degree of asymmetry are comparisons of how the two coders scored different conditions. Table R3 Part 3 compares the degree of asymmetry observed between pairs of startle conditions. Both coders agreed that there were no differences between conditions for either AU 6/7 or AU 20. Of course, the failure to find differences could be due to weaknesses in the experiment, but the agreement of coders about these results does not implicate invalidity, as disagreement about results would.
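The kind of pairwise tally reported in Table R3 Part 3 can be sketched as follows. This is only an illustration under stated assumptions: the scores are hypothetical, the function names are invented for the example, and "degree of asymmetry" is taken to be the absolute value of a signed score, which the appendix does not state explicitly.

```python
def tally_greater_asymmetry(scores_a, scores_b):
    """Count subjects whose asymmetry is greater in condition A, greater in
    condition B, or equal. Degree of asymmetry is assumed to be |score|."""
    counts = {"a_greater": 0, "b_greater": 0, "equal": 0}
    for a, b in zip(scores_a, scores_b):
        degree_a, degree_b = abs(a), abs(b)
        if degree_a > degree_b:
            counts["a_greater"] += 1
        elif degree_b > degree_a:
            counts["b_greater"] += 1
        else:
            counts["equal"] += 1
    return counts

def with_percents(counts):
    """Attach a rounded percentage to each count, in the N (%) style."""
    total = sum(counts.values())
    return {key: (n, round(100 * n / total)) for key, n in counts.items()}

# Hypothetical signed scores for six subjects in two startle conditions.
unanticipated = [2, -1, 0, 3, -2, 0]
anticipated = [1, -1, 0, -4, 0, 1]

counts = tally_greater_asymmetry(unanticipated, anticipated)
print(with_percents(counts))
```

Running the same tally separately on each coder's scores and comparing the resulting rows is the comparison the paragraph above describes.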

Table R4 compares the degree of asymmetry in deliberate versus spontaneous actions based on the scores of two different coders. For all three AUs, the coders agreed on the results of this comparison. AUs 12 and 20 were significantly more asymmetrical in the deliberate condition, but AU 6/7 did not differ. Thus, even though Jim scored more asymmetries of AU 6/7 within the spontaneous condition than Joe, they agreed on the more important issue of the comparison between conditions.
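As a rough cross-check on the direction of these comparisons, the counts in Table R4 can be run through a two-sided sign test. This is a substitute technique, not the analysis used in the study: the reported tests were Wilcoxon signed-ranks tests, which need the individual subjects' scores, whereas the sign test uses only the counts of subjects favoring each condition and drops ties. The function name is hypothetical.

```python
from math import comb

def sign_test_p(n_first, n_second):
    """Two-sided exact sign test for counts of subjects favoring each of two
    conditions (ties excluded), under the null that each is equally likely."""
    n = n_first + n_second
    k = min(n_first, n_second)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# (Deliberate >, Spontaneous >) counts taken from Table R4; "Equal" is dropped.
table_r4 = {
    ("20", "Jim"): (19, 7),
    ("20", "Joe"): (13, 7),
    ("6/7", "Jim"): (13, 17),
    ("6/7", "Joe"): (15, 11),
    ("12", "Jim"): (13, 1),
    ("12", "Joe"): (11, 2),
}

for (au, coder), (deliberate, spontaneous) in table_r4.items():
    p = sign_test_p(deliberate, spontaneous)
    print(f"AU {au:>3} {coder}: {deliberate} vs {spontaneous}, p = {p:.3f}")
```

Because the sign test ignores the size of each difference, it is less sensitive than the Wilcoxon test, so its p-values need not match the significance levels reported in the table note. It is useful here mainly for checking that both coders' counts point in the same direction.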

Coder Bias

The influence of coder bias was possible in this study because the main coder was the experimenter and knew the hypotheses. However, the scoring procedure eliminated most of the opportunity for bias to occur. Important hypotheses of the study involved comparison of different conditions that were scored at least one month apart. This interval, combined with the large number of subjects, precluded remembering how subjects had been scored. The coder therefore did not know how to score in a way consistent with hypotheses about relations between conditions. Other important hypotheses involved relations between the facial scores and questionnaire variables, which the scorer also did not know. This ignorance not only minimized bias, but also strengthened the scorer's motivation to record asymmetry as accurately as possible in order to maximize the opportunity to test these hypotheses.

In cases where the main coder's knowledge could have biased scoring, the reliability of scoring is evidence that any such bias could not have been significant. Three secondary coders who were naive to the hypotheses all attained acceptable reliability. In the one comparison between two naive coders, reliability was at the same level as between the main and secondary coders. Also important is that coders agreed about the results.

Finally, many of the results of the experiment either did not confirm the hypotheses or they contradicted them. This is true even for scoring that was most susceptible to bias. For example, the hypothesis that deliberate requested actions would be lateralized left was contradicted for many AUs that were lateralized right. Such inconsistencies between hypotheses and findings indicate that coder bias could not have been an important factor in this study.

Another possible source of bias could have been a perceptual bias to look at one side of the face more than the other. Since coders saw the face only in its normal orientation, laterality might be an artifact of this bias. This bias could not have been significant in this study because the laterality of AUs changed between conditions and differed with AU.

 

Table R1. Reliability coefficients for intercoder agreement, by action scored

| AU scored | N | Pearson r | Kappa | % correct |
|---|---|---|---|---|
| 1 | 83 | .45 | .29 | 54 |
| 2 | 82 | .81 | .56 | 77 |
| 12 | 67 | .83 | .55 | 72 |
| 20 | 155 | .55 | .28 | 53 |
| 6 | 141 | .65 | .30 | 56 |
| 7 | 141 | .85 | .62 | 77 |

NOTE: N = number of events involved in coefficients. Pearson r's are based on continuous scores (-5 to 5) as assigned by the coders; percent correct and Kappas are based on "Left, Symmetrical, or Right" categories. All probabilities for coefficients p < .0001.

 

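Table R2. Proportions of left, right, and symmetrical subjects for each requested AU, by coder

Joe and Jim:

| AU | Category | Joe N | Joe % | Jim N | Jim % |
|---|---|---|---|---|---|
| 12 | Left | 8 | 47 | 12 | 70 |
| | Right | 5 | 29 | 5 | 30 |
| | Symm. | 4 | 24 | 0 | 0 |
| 6 | Left | 7 | 25 | 8 | 33 |
| | Right | 9 | 32 | 4 | 17 |
| | Symm. | 12 | 43 | 12 | 50 |
| 7 | Left | 5 | 15 | 8 | 24 |
| | Right | 12 | 36 | 11 | 33 |
| | Symm. | 15 | 45 | 14 | 42 |
| 6+7 Total Squint | Left | 10 | 30 | 13 | 39 |
| | Right | 10 | 30 | 10 | 30 |
| | Symm. | 13 | 39 | 10 | 30 |
| 20 | Left | 8 | 31 | 11 | 35 |
| | Right | 10 | 38 | 13 | 42 |
| | Symm. | 8 | 31 | 7 | 22 |

Joe and Bob:

| AU | Category | Joe N | Joe % | Bob N | Bob % |
|---|---|---|---|---|---|
| 1 | Left | 4 | 25 | 0 | 0 |
| | Right | 8 | 50 | 12 | 75 |
| | Symm. | 4 | 25 | 4 | 25 |
| 2 | Left | 2 | 13 | 1 | 7 |
| | Right | 12 | 80 | 13 | 87 |
| | Symm. | 1 | 7 | 1 | 7 |
| 1+2 Total | Left | 4 | 25 | 2 | 12 |
| | Right | 12 | 75 | 13 | 81 |
| | Symm. | 0 | 0 | 1 | 6 |
| 20 | Left | 3 | 25 | 3 | 25 |
| | Right | 5 | 41 | 6 | 50 |
| | Symm. | 4 | 33 | 3 | 25 |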

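Table R3 Part 1. Distribution of left, right, and symmetrical subjects for three spontaneous movements, by coder

| AU | Condition | Category | Joe N | Joe % | Jim N | Jim % |
|---|---|---|---|---|---|---|
| 6+7 Total | Startle | Left | 6 | 18 | 4 | 9 |
| | | Right | 15 | 45 | 21 | 64 |
| | | Symm. | 12 | 36 | 9 | 27 |
| 20 | Startle | Left | 7 | 24 | 12 | 38 |
| | | Right | 8 | 28 | 11 | 34 |
| | | Symm. | 14 | 48 | 9 | 28 |
| 12 | Aren't you happy? | Left | 4 | 27 | 5 | 33 |
| | | Right | 1 | 7 | 1 | 7 |
| | | Symm. | 10 | 67 | 9 | 60 |

NOTE: For startle conditions, scores are across all 3 startle noise conditions.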

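Table R3 Part 2. Distribution of left, right, and symmetrical scores within the startle conditions, by coder (Unant. = Unanticipated, Ant. = Anticipated, Inhib. = Inhibit)

| AU | Coder | | Unant. Left | Unant. Right | Unant. Symm. | Ant. Left | Ant. Right | Ant. Symm. | Inhib. Left | Inhib. Right | Inhib. Symm. | Total Left | Total Right | Total Symm. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 20 | Jim | N | 7 | 7 | 15 | 2 | 8 | 16 | 3 | 5 | 12 | 12 | 20 | 43 |
| | | % | 24 | 24 | 52 | 8 | 31 | 62 | 15 | 25 | 60 | 16 | 27 | 57 |
| 20 | Joe | N | 3 | 4 | 18 | 3 | 4 | 14 | 3 | 4 | 7 | 9 | 12 | 39 |
| | | % | 12 | 16 | 72 | 14 | 19 | 67 | 21 | 28 | 50 | 15 | 20 | 65 |
| 6/7 | Jim | N | 2 | 14 | 17 | 2 | 14 | 16 | 0 | 13 | 20 | 4 | 41 | 53 |
| | | % | 6 | 42 | 52 | 6 | 44 | 50 | 0 | 39 | 61 | 4 | 42 | 54 |
| 6/7 | Joe | N | 4 | 7 | 20 | 1 | 9 | 21 | 2 | 6 | 22 | 7 | 22 | 63 |
| | | % | 13 | 22 | 64 | 3 | 29 | 68 | 7 | 20 | 73 | 8 | 24 | 68 |

NOTE: Across conditions, the coders agreed that more right than left asymmetries of AUs 6/7 occurred (p < .05), but no laterality for AU 20. The number of subjects was not equal for each coder in situations where the coders did not agree on the number of actions to score.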

Table R3 Part 3. Number of subjects (Ss) with greater asymmetry in each condition, for pairs of startle conditions; cells are N (%)

| AU | Coder | Unant. vs. Ant.: Unant.> | Ant.> | Eq. | Unant. vs. Inhibit: Unant.> | Inhib.> | Eq. | Ant. vs. Inhibit: Ant.> | Inhib.> | Eq. |
|---|---|---|---|---|---|---|---|---|---|---|
| 20 | Jim | 8(33) | 6(25) | 10(42) | 4(20) | 5(25) | 11(55) | 2(12) | 3(18) | 12(70) |
| 20 | Joe | 3(17) | 2(11) | 13(72) | 3(23) | 5(38) | 5(38) | 4(33) | 4(33) | 4(33) |
| 6/7 | Jim | 7(22) | 9(28) | 16(50) | 7(21) | 6(18) | 20(61) | 7(22) | 5(16) | 20(62) |
| 6/7 | Joe | 8(28) | 7(24) | 14(50) | 8(28) | 7(24) | 14(50) | 6(20) | 6(20) | 17(60) |

Table R4. Degree of asymmetry in deliberate versus spontaneous actions: number of subjects in each outcome, by coder

| AU | Coder | Conditions | Deliberate > | Spontaneous > | Equal |
|---|---|---|---|---|---|
| 20 | Jim | Startle vs Request | 19 | 7 | 4 |
| 20 | Joe | Startle vs Request | 13 | 7 | 3 |
| 6/7 | Jim | Startle vs Request | 13 | 17 | 3 |
| 6/7 | Joe | Startle vs Request | 15 | 11 | 7 |
| 12 | Jim | Happy Qs vs Smile Req | 13 | 1 | 2 |
| 12 | Joe | Happy Qs vs Smile Req | 11 | 2 | 2 |

NOTE: Scores for the spontaneous startle actions were averages of the three startle noise conditions. Scores for the deliberate requested actions were averages of the several actions performed. Only one spontaneous happy action was scored for each subject. Wilcoxon signed-ranks tests showed that deliberate actions were more asymmetrical than spontaneous actions of AU 20 (Joe, p < .05; Jim, p < .001), but there was no difference for AUs 6/7.