Steering students past the ‘true model myth’

OZCOTS 2021

Damjan Vukcevic

…with Margarita Moreno-Betancur, John Carlin, Sue Finch, Ian Gordon & Lyle Gurrin

9 July 2021

1 / 17

Student's predicament

$X_1, X_2, \dots, X_n \sim N(\mu_1, \sigma_1^2)$

$Y_1, Y_2, \dots, Y_m \sim N(\mu_2, \sigma_2^2)$

Want to compare $\mu_1$ and $\mu_2$

Can we assume $\sigma_1 = \sigma_2$?

In R, which of these should we run?

t.test(x, y)
t.test(x, y, var.equal = TRUE)
2 / 17
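
For reference, the two calls correspond to the following standard test statistics. The symbols $\bar{X}$, $\bar{Y}$ (sample means) and $S_1^2$, $S_2^2$ (sample variances) do not appear on the slide and are introduced here as assumptions for illustration.

```latex
% Pooled variance: t.test(x, y, var.equal = TRUE)
\[
T_{\text{pooled}} = \frac{\bar{X} - \bar{Y}}{S_p \sqrt{\frac{1}{n} + \frac{1}{m}}},
\qquad
S_p^2 = \frac{(n - 1) S_1^2 + (m - 1) S_2^2}{n + m - 2},
\qquad
\text{df} = n + m - 2
\]

% Welch approximation: t.test(x, y)
\[
T_{\text{Welch}} = \frac{\bar{X} - \bar{Y}}{\sqrt{\frac{S_1^2}{n} + \frac{S_2^2}{m}}},
\qquad
\text{df} \approx
\frac{\left( \frac{S_1^2}{n} + \frac{S_2^2}{m} \right)^2}
     {\frac{(S_1^2 / n)^2}{n - 1} + \frac{(S_2^2 / m)^2}{m - 1}}
\]
```

Both statistics share the same numerator; only the standard error and the degrees of freedom change with the variance assumption.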

The 'true model myth'

Analysis process:

  1. Determine the best model
  2. Derive (all of the) answers from this model

Implicit assumptions:

  • Our goal is to find the 'true' model
  • We can use our 'best' model as if it were the 'true' model

(Similar to misuse of statistical significance?
An overly 'black and white' view of the data?
Ignores model uncertainty...)

3 / 17

Antidotes

The idea of a 'statistical investigation'

A statistical investigation will typically investigate multiple models

(...and we might never need to choose between them!)

Let's show such examples to students.

4 / 17

Reasons for using multiple models

  1. Comparing & optimising performance
  2. Exploring different assumptions
  3. Exploring different questions
  4. Varying the desired estimation properties
5 / 17

1. Comparing & optimising performance

Routinely done for predictive modelling

...including creating ensembles of multiple models

6 / 17
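
A minimal sketch of this idea in R, assuming simulated data and two arbitrarily chosen candidate models (none of this comes from the talk). It compares hold-out prediction error for each model and for a simple unweighted average of their predictions.

```r
set.seed(1)  # hypothetical simulated data, not from the talk
n <- 200
x <- runif(n, 0, 10)
y <- sin(x) + 0.1 * x + rnorm(n, sd = 0.3)
dat <- data.frame(x = x, y = y)

# Simple hold-out split: first 150 rows for fitting, last 50 for evaluation
train <- dat[1:150, ]
test  <- dat[151:200, ]

# Two candidate models (an arbitrary choice for illustration)
fit_lin  <- lm(y ~ x, data = train)
fit_poly <- lm(y ~ poly(x, 5), data = train)

pred_lin  <- predict(fit_lin, newdata = test)
pred_poly <- predict(fit_poly, newdata = test)
pred_ens  <- (pred_lin + pred_poly) / 2  # unweighted ensemble of the two

# Root mean squared error on the held-out data
rmse <- function(pred, obs) sqrt(mean((pred - obs)^2))
results <- c(linear     = rmse(pred_lin, test$y),
             polynomial = rmse(pred_poly, test$y),
             ensemble   = rmse(pred_ens, test$y))
print(round(results, 3))
```

Here no single model is treated as 'true'; the models are judged only by how well they predict.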

Example: species distribution modelling

7 / 17

2. Exploring different assumptions

Common scenario: sensitivity analyses

8 / 17

Example: t-test

Different variances (Welch approximation)

t.test(x, y)

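$t = 7.85$

$\text{df} = 36.1$

$p\text{-value} = 2.5 \times 10^{-9}$

$95\%\ \text{CI} = (3.00, 5.08)$

Pooled variance

t.test(x, y, var.equal = TRUE)

$t = 7.32$

$\text{df} = 43$

$p\text{-value} = 4.5 \times 10^{-9}$

$95\%\ \text{CI} = (2.93, 5.15)$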

9 / 17
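
The comparison above can be sketched as a small sensitivity analysis in R. The talk's `x` and `y` are not available, so the data below are simulated; the sample sizes (25 and 20) are an assumption chosen only so that the pooled test has 43 degrees of freedom, and the other numbers will not match the slide. The helper `summarise_test` is hypothetical.

```r
set.seed(2021)  # hypothetical data; the talk's x and y are not available
x <- rnorm(25, mean = 14, sd = 2.5)
y <- rnorm(20, mean = 10, sd = 1.5)

welch  <- t.test(x, y)                    # unequal variances (R's default)
pooled <- t.test(x, y, var.equal = TRUE)  # pooled variance

# Hypothetical helper: pull out the quantities shown on the slide
summarise_test <- function(tt) {
  c(t        = unname(tt$statistic),
    df       = unname(tt$parameter),
    p.value  = tt$p.value,
    ci.lower = tt$conf.int[1],
    ci.upper = tt$conf.int[2])
}

# One row per variance assumption, so the results can be read side by side
comparison <- rbind(welch  = summarise_test(welch),
                    pooled = summarise_test(pooled))
print(signif(comparison, 3))
```

If the two rows lead to the same practical conclusion, there may be no need to choose between the two models at all.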

Example: prior sensitivity analysis

10 / 17
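
The slide's own example is not recoverable here. As a stand-in, the following is a minimal sketch of a prior sensitivity analysis, assuming a binomial proportion with conjugate Beta priors. The data, the three priors and the helper `posterior_summary` are all hypothetical.

```r
successes <- 14  # hypothetical data, not from the talk
trials    <- 20

# Three candidate Beta(a, b) priors for the success probability
priors <- list(
  uniform   = c(a = 1,   b = 1),
  jeffreys  = c(a = 0.5, b = 0.5),
  sceptical = c(a = 5,   b = 5)    # hypothetical prior concentrated around 0.5
)

# Conjugate update: Beta(a, b) prior gives a
# Beta(a + successes, b + failures) posterior
posterior_summary <- function(prior) {
  a_post <- prior[["a"]] + successes
  b_post <- prior[["b"]] + trials - successes
  c(mean  = a_post / (a_post + b_post),
    lower = qbeta(0.025, a_post, b_post),
    upper = qbeta(0.975, a_post, b_post))
}

# One row per prior: posterior mean and 95% credible interval
sensitivity <- t(sapply(priors, posterior_summary))
print(round(sensitivity, 3))
```

Each row is a different model of the same data; the analysis reports how much the conclusions move across them rather than picking one.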

Example: causal inference

11 / 17

3. Exploring different questions

Some clear examples:

  • Changing the response variable
  • Changing the 'primary' explanatory variables

But sometimes it's less clear...

12 / 17

Example: ANOVA vs polynomial regression

13 / 17

Example: ANOVA vs polynomial regression

14 / 17
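
The slide's figures are not recoverable here. One way to set up such a comparison in R is sketched below, assuming a hypothetical numeric explanatory variable `dose` with five levels and a simulated response `resp`. The ANOVA model asks whether the group means differ at all, while the polynomial model asks how the mean changes with `dose`.

```r
set.seed(42)  # hypothetical data, not from the talk
dose <- rep(c(1, 2, 3, 4, 5), each = 8)
resp <- 2 + 1.5 * dose - 0.2 * dose^2 + rnorm(length(dose), sd = 0.5)
dat  <- data.frame(dose = dose, resp = resp)

fit_anova <- lm(resp ~ factor(dose), data = dat)   # one free mean per dose level
fit_quad  <- lm(resp ~ poly(dose, 2), data = dat)  # smooth quadratic trend in dose

# With five dose levels, the quadratic model is nested within the
# one-way ANOVA model, so an F-test can compare the two fits directly.
print(anova(fit_quad, fit_anova))
```

Both fits can be reported together, since each answers a slightly different question about the same data.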

4. Varying the desired estimation properties

Typical trade-off: bias vs variance

15 / 17

Example: ANOVA vs polynomial regression

16 / 17
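
The bias vs variance trade-off between these two models can be sketched with a small simulation. Everything below is an assumption for illustration: the design, the true mean function (deliberately not a quadratic), the noise level and the target dose. The code estimates the mean response at `dose = 3` under both models across repeated simulated data sets.

```r
set.seed(7)  # hypothetical simulation, not from the talk
dose <- rep(1:5, each = 4)
true_mean <- function(d) 2 + 0.5 * (d == 3)  # hypothetical truth: a bump at dose 3
target <- data.frame(dose = 3)

# Simulate one data set and estimate the mean at dose 3 with each model
one_run <- function() {
  dat <- data.frame(dose = dose,
                    resp = true_mean(dose) + rnorm(length(dose), sd = 1))
  fit_anova <- lm(resp ~ factor(dose), data = dat)
  fit_quad  <- lm(resp ~ poly(dose, 2), data = dat)
  c(anova = unname(predict(fit_anova, newdata = target)),
    quad  = unname(predict(fit_quad,  newdata = target)))
}

est <- replicate(1000, one_run())  # 2 x 1000 matrix of estimates

# Empirical bias and variance of each estimator at the target dose
bias     <- rowMeans(est) - true_mean(3)
variance <- apply(est, 1, var)
print(round(rbind(bias = bias, variance = variance), 3))
```

Under these assumptions the ANOVA estimate uses only the four observations at dose 3, so it is unbiased but noisier, while the quadratic fit borrows strength from all doses, which lowers the variance at the cost of some bias. Which is preferable depends on the estimation properties one wants.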

When teaching students...

Create examples that feature multiple models/techniques.

Handy reference

Possible reasons for using multiple models:

  1. Comparing & optimising performance
  2. Exploring different assumptions
  3. Exploring different questions
  4. Varying the desired estimation properties
17 / 17
