# The aim of this investigation is to examine whether or not the number of people per doctor affects a country’s average life expectancy Paper

The aim of this investigation is to examine whether or not the number of people per doctor affects a country’s average life expectancy.

The life expectancy of many less economically developed countries is lower than that of more economically developed countries. Generally, better-developed countries have more doctors relative to their population, so I wish to determine whether this is a factor that affects life expectancy.

I chose this investigation because I’m interested in geography, particularly travelling. I plan to take a gap year after my A-levels, prior to university, and hope to visit many areas of the world, including less economically developed countries. This led to an interest in how death rates vary between countries, and I decided to compare life expectancy with the number of people per doctor to see whether this influences the death rate in any way.

DATA COLLECTION:

Firstly, I collected a list of all the countries in the world and their doctor-to-patient ratios. I took all of my data from a single school atlas borrowed from the college library, so that every figure was obtained in the same year. The countries were listed alphabetically and each was assigned a number. Using the random function on a graphics calculator, I generated random numbers until I had a sample of 50 countries. When a number was generated a second time, I ignored it and went on to the next number.
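The sampling step described above can be sketched in Python. This is only an illustration: the total of 190 countries, the seed and the function name `draw_sample` are assumptions (the text does not state how many countries the atlas listed), and Python's `random` module stands in for the graphics calculator's random function.

```python
import random


def draw_sample(population_size, sample_size, rng):
    """Draw distinct numbers from 1..population_size, skipping any repeats."""
    chosen = []
    seen = set()
    while len(chosen) < sample_size:
        number = rng.randint(1, population_size)  # both ends inclusive
        if number in seen:
            continue  # generated before: ignore it and move to the next number
        seen.add(number)
        chosen.append(number)
    return chosen


# Hypothetical values: 190 numbered countries, fixed seed for repeatability.
rng = random.Random(2024)
sample = draw_sample(190, 50, rng)
print(sorted(sample))
```

Each number in `sample` would then be looked up in the alphabetical list to find the corresponding country.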


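| Data No. | People per doctor | Average life expectancy (years) |
|---|---|---|
| 1 | 7358 | 45 |
| 2 | 769 | 73 |
| 3 | 1250 | 69 |
| 4 | 555 | 83 |
| 5 | 14300 | 47 |
| 6 | 1316 | 75 |
| 7 | 370 | 73 |
| 8 | 298 | 71 |
| 9 | 455 | 78 |
| 10 | 385 | 77 |
| 11 | 257 | 70 |
| 12 | 709 | 74 |
| 13 | 9090 | 73 |
| 14 | 5000 | 58 |
| 15 | 885 | 76 |
| 16 | 244 | 68 |
| 17 | 885 | 76 |
| 18 | 10000 | 53 |
| 19 | 5825 | 61 |
| 20 | 57300 | 44 |
| 21 | 3448 | 69 |
| 22 | 909 | 75 |
| 23 | 1111 | 76 |
| 24 | 667 | 70 |
| 25 | 357 | 78 |
| 26 | 2000 | 52 |
| 27 | 5000 | 47 |
| 28 | 5556 | 45 |
| 29 | 2500 | 69 |
| 30 | 333 | 79 |
| 31 | 2500 | 63 |
| 32 | 6423 | 65 |
| 33 | 1667 | 62 |
| 34 | 588 | 78 |
| 35 | 625 | 70 |
| 36 | 6667 | 52 |
| 37 | 5000 | 86 |
| 38 | 20000 | 56 |
| 39 | 476 | 77 |
| 40 | 406 | 77 |
| 41 | 50000 | 45 |
| 42 | 476 | 77 |
| 43 | 33333 | 49 |
| 44 | 303 | 78 |
| 45 | 10000 | 68 |
| 46 | 435 | 73 |
| 47 | 699 | 72 |
| 48 | 556 | 70 |
| 49 | 375 | 81 |
| 50 | 14300 | 37 |
| Total | 293961 | 3340 |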


Modelling Procedures:

I decided to use Excel to input my data in table format (shown above). From this table I used Excel to draw a scatter diagram of all the data.

Scatter Diagram to Compare Life Expectancies to People Per Doctor For 50 Random Countries

The scatter diagram gives a good visual representation of the data and shows that the points are spread in a roughly elliptical shape. From this I can make the initial assumption that both variables are random and approximately normally distributed. Because the data is roughly elliptical, it was appropriate to fit a regression line. The regression line shows visually how strong or weak the correlation is, and in this instance the data shows a relatively strong negative correlation. The strength of the correlation can be calculated using Pearson’s Product Moment Correlation Coefficient.

To do this I used Excel to set up a table consisting of $x_i$, $y_i$, $x_i^2$, $y_i^2$ and $x_i y_i$, together with the sum of each column (shown in the table below).


| Data No. | People per doctor ($x_i$) | Average life expectancy ($y_i$) | $x_i^2$ | $y_i^2$ | $x_i y_i$ |
|---|---|---|---|---|---|
| 1 | 7358 | 45 | 54140164 | 2025 | 331110 |
| 2 | 769 | 73 | 591361 | 5329 | 56137 |
| 3 | 1250 | 69 | 1562500 | 4761 | 86250 |
| 4 | 555 | 83 | 308025 | 6889 | 46065 |
| 5 | 14300 | 47 | 204490000 | 2209 | 672100 |
| 6 | 1316 | 75 | 1731856 | 5625 | 98700 |
| 7 | 370 | 73 | 136900 | 5329 | 27010 |
| 8 | 298 | 71 | 88804 | 5041 | 21158 |
| 9 | 455 | 78 | 207025 | 6084 | 35490 |
| 10 | 385 | 77 | 148225 | 5929 | 29645 |
| 11 | 257 | 70 | 66049 | 4900 | 17990 |
| 12 | 709 | 74 | 502681 | 5476 | 52466 |
| 13 | 9090 | 73 | 82628100 | 5329 | 663570 |
| 14 | 5000 | 58 | 25000000 | 3364 | 290000 |
| 15 | 885 | 76 | 783225 | 5776 | 67260 |
| 16 | 244 | 68 | 59536 | 4624 | 16592 |
| 17 | 885 | 76 | 783225 | 5776 | 67260 |
| 18 | 10000 | 53 | 100000000 | 2809 | 530000 |
| 19 | 5825 | 61 | 33930625 | 3721 | 355325 |
| 20 | 57300 | 44 | 3283290000 | 1936 | 2521200 |
| 21 | 3448 | 69 | 11888704 | 4761 | 237912 |
| 22 | 909 | 75 | 826281 | 5625 | 68175 |
| 23 | 1111 | 76 | 1234321 | 5776 | 84436 |
| 24 | 667 | 70 | 444889 | 4900 | 46690 |
| 25 | 357 | 78 | 127449 | 6084 | 27846 |
| 26 | 2000 | 52 | 4000000 | 2704 | 104000 |
| 27 | 5000 | 47 | 25000000 | 2209 | 235000 |
| 28 | 5556 | 45 | 30869136 | 2025 | 250020 |
| 29 | 2500 | 69 | 6250000 | 4761 | 172500 |
| 30 | 333 | 79 | 110889 | 6241 | 26307 |
| 31 | 2500 | 63 | 6250000 | 3969 | 157500 |
| 32 | 6423 | 65 | 41254929 | 4225 | 417495 |
| 33 | 1667 | 62 | 2778889 | 3844 | 103354 |
| 34 | 588 | 78 | 345744 | 6084 | 45864 |
| 35 | 625 | 70 | 390625 | 4900 | 43750 |
| 36 | 6667 | 52 | 44448889 | 2704 | 346684 |
| 37 | 5000 | 86 | 25000000 | 7396 | 430000 |
| 38 | 20000 | 56 | 400000000 | 3136 | 1120000 |
| 39 | 476 | 77 | 226576 | 5929 | 36652 |
| 40 | 406 | 77 | 164836 | 5929 | 31262 |
| 41 | 50000 | 45 | 2500000000 | 2025 | 2250000 |
| 42 | 476 | 77 | 226576 | 5929 | 36652 |
| 43 | 33333 | 49 | 1111088889 | 2401 | 1633317 |
| 44 | 303 | 78 | 91809 | 6084 | 23634 |
| 45 | 10000 | 68 | 100000000 | 4624 | 680000 |
| 46 | 435 | 73 | 189225 | 5329 | 31755 |
| 47 | 699 | 72 | 488601 | 5184 | 50328 |
| 48 | 556 | 70 | 309136 | 4900 | 38920 |
| 49 | 375 | 81 | 140625 | 6561 | 30375 |
| 50 | 14300 | 37 | 204490000 | 1369 | 529100 |
| Total | 293961 | 3340 | 8309085319 | 230540 | 15274856 |


Pearson’s Product Moment Correlation Coefficient

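This is denoted by $r$:

$$r = \frac{S_{xy}}{S_x S_y} = \frac{\frac{1}{50}\sum x_i y_i - \bar{x}\,\bar{y}}{S_x S_y}$$

where $S_x$ is the standard deviation of $x$, $S_y$ is the standard deviation of $y$, and $S_{xy}$ is the covariance:

$$S_{xy} = \frac{1}{50}\sum x_i y_i - \bar{x}\,\bar{y}$$

$S_x = 11588.897$

$S_y = 12.312$

$S_{xy} = -87234.776$

$r = -0.624$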
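As a cross-check, the calculation of r can be sketched in Python directly from the 50 data pairs in the table. The function name `pearson_r` and the variable names are illustrative, not part of the original working, and the sketch divides by n = 50 for the covariance and for both standard deviations. That choice is an assumption: it reproduces the covariance of -87234.776 and r of about -0.624, whereas the quoted values of Sx and Sy appear to correspond to the n - 1 form of the standard deviation.

```python
import math

# People per doctor (x) and average life expectancy (y), rows 1 to 50 of the table.
x = [
    7358, 769, 1250, 555, 14300, 1316, 370, 298, 455, 385,
    257, 709, 9090, 5000, 885, 244, 885, 10000, 5825, 57300,
    3448, 909, 1111, 667, 357, 2000, 5000, 5556, 2500, 333,
    2500, 6423, 1667, 588, 625, 6667, 5000, 20000, 476, 406,
    50000, 476, 33333, 303, 10000, 435, 699, 556, 375, 14300,
]
y = [
    45, 73, 69, 83, 47, 75, 73, 71, 78, 77,
    70, 74, 73, 58, 76, 68, 76, 53, 61, 44,
    69, 75, 76, 70, 78, 52, 47, 45, 69, 79,
    63, 65, 62, 78, 70, 52, 86, 56, 77, 77,
    45, 77, 49, 78, 68, 73, 72, 70, 81, 37,
]


def pearson_r(xs, ys):
    """Return (covariance, r) using the 1/n forms throughout (an assumption)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum(a * b for a, b in zip(xs, ys)) / n - mean_x * mean_y
    s_x = math.sqrt(sum(a * a for a in xs) / n - mean_x ** 2)
    s_y = math.sqrt(sum(b * b for b in ys) / n - mean_y ** 2)
    return s_xy, s_xy / (s_x * s_y)


s_xy, r = pearson_r(x, y)
print("covariance:", round(s_xy, 3))
print("r:", round(r, 3))
```

The column sums used here (for example `sum(x)` and `sum(y)`) can also be compared against the Total row of the table.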

Hypothesis Test

I’m going to test my data at the 5% significance level, where ρ is the population product moment correlation coefficient.

H0: ρ = 0 (no correlation between people per doctor and life expectancy)

H1: ρ < 0 (negative correlation between people per doctor and life expectancy)

I’m using a one-tailed test because the initial scatter diagram and Pearson’s Product Moment Correlation Coefficient both indicate that the correlation, if significant, will be negative.

* n = 50, r = -0.624, r (critical value) =

Using the tables of critical values of r for n = 50, the magnitude of my value of r (0.624) is greater than the critical value at the 5% significance level.

H0 can therefore be rejected and H1: ρ < 0 (negative correlation between people per doctor and life expectancy) accepted. This shows that at the 5% significance level there is evidence of negative correlation between people per doctor and life expectancy.
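The test can also be sketched numerically. The tabulated critical value is not reproduced in the text, so this sketch takes a different route and should be read as an assumption rather than the method used above: it converts r to a t statistic with n - 2 degrees of freedom and compares it with a one-tailed 5% critical t value of 1.677 for 48 degrees of freedom, taken from standard t tables. All names are illustrative.

```python
import math

n = 50
r = -0.624

# Assumption: one-tailed 5% critical value of t for 48 degrees of freedom,
# taken from standard t tables (not from the original text).
T_CRIT = 1.677

# Under H0 (rho = 0), this statistic follows a t distribution with n - 2 df.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Equivalent critical value on the r scale, implied by T_CRIT.
r_crit = T_CRIT / math.sqrt(T_CRIT ** 2 + (n - 2))

# One-tailed test for negative correlation: reject H0 if t is below -T_CRIT.
reject_h0 = t_stat < -T_CRIT

print("t statistic:", round(t_stat, 2))
print("implied critical |r|:", round(r_crit, 3))
print("reject H0:", reject_h0)
```

Working on the t scale avoids needing a table of critical values of r, and rejecting H0 when `t_stat < -T_CRIT` is equivalent to rejecting it when r falls below `-r_crit`.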


Regression Line

Using the equation for a regression line:

$$y - \bar{y} = \frac{S_{xy}}{S_x^2}\,(x - \bar{x})$$

I’ve generated an equation to calculate the value of y from x:

$$y - 66.8 = \frac{-87234.776}{11588.897^2}\,(x - 5879.22)$$
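The regression line of y on x can be sketched in Python from the same 50 data pairs. The names `regression_line`, `slope` and `intercept` are illustrative. The sketch uses the 1/n form for both the covariance and the variance of x, which is an assumption; it gives a gradient of roughly -0.00066, which may differ slightly from a gradient worked out with the quoted value of Sx.

```python
# People per doctor (x) and average life expectancy (y), rows 1 to 50 of the table.
x = [
    7358, 769, 1250, 555, 14300, 1316, 370, 298, 455, 385,
    257, 709, 9090, 5000, 885, 244, 885, 10000, 5825, 57300,
    3448, 909, 1111, 667, 357, 2000, 5000, 5556, 2500, 333,
    2500, 6423, 1667, 588, 625, 6667, 5000, 20000, 476, 406,
    50000, 476, 33333, 303, 10000, 435, 699, 556, 375, 14300,
]
y = [
    45, 73, 69, 83, 47, 75, 73, 71, 78, 77,
    70, 74, 73, 58, 76, 68, 76, 53, 61, 44,
    69, 75, 76, 70, 78, 52, 47, 45, 69, 79,
    63, 65, 62, 78, 70, 52, 86, 56, 77, 77,
    45, 77, 49, 78, 68, 73, 72, 70, 81, 37,
]


def regression_line(xs, ys):
    """Least-squares line of y on x: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum(a * b for a, b in zip(xs, ys)) / n - mean_x * mean_y
    var_x = sum(a * a for a in xs) / n - mean_x ** 2
    slope = s_xy / var_x
    intercept = mean_y - slope * mean_x  # the line passes through the means
    return slope, intercept


slope, intercept = regression_line(x, y)


def predict(people_per_doctor):
    """Predicted life expectancy for a given number of people per doctor."""
    return intercept + slope * people_per_doctor


print("slope:", slope)
print("intercept:", intercept)
print("prediction for 1000 people per doctor:", predict(1000))
```

Predictions are only sensible for values of x inside the range of the sample; the line should not be extrapolated far beyond it.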

Conclusion

The scatter diagram gives a good initial indication of negative correlation between people per doctor and life expectancy, suggesting that countries with low life expectancy tend to have a greater number of people per doctor than countries with higher life expectancy.

Pearson’s Product Moment Correlation Coefficient measures the strength of the correlation between the data, i.e.

* if r = 0 (no correlation)
* if r = -1 (perfect negative correlation)
* if r = 1 (perfect positive correlation)

Because my calculation gave me the value of r equal to -0.624 it supported the initial interpretation of the data having negative correlation and indicated that the negative correlation was of a reasonable strength.

I decided to carry out a hypothesis test on the data. This was done by comparing r (-0.624) with the corresponding critical value of r from the tables, and it showed negative correlation between people per doctor and life expectancy at the 5% significance level.


Accuracy

My raw data is likely to be highly accurate because the figures were obtained from the CIA (Central Intelligence Agency) web site, so I can be confident that all of the data is recent and, for my investigation, reliable. The only error likely to occur comes from the ever-changing patient-to-doctor ratio, although this would have been accounted for before the raw data was published by the CIA. I found this to be the most accurate and up-to-date source of information available to me.

The calculations themselves are also as accurate as I could make them. I used Excel to calculate Pearson’s Product Moment Correlation Coefficient, the means, the standard deviations and the covariance, and then checked these by hand using a calculator and the formulae included in my investigation. I kept the results to 3 significant figures, as accuracy beyond this wasn’t necessary for this particular investigation.

The regression line was also drawn by Excel rather than by hand, so as to be as accurate as possible.

The only inaccuracy that I felt might have affected my investigation is a particularly significant outlier or anomalous result (a result more than two standard deviations from the mean). This could have caused the standard deviation of x to increase and that of y to decrease relative to the rest of the data, leading to a possible inaccuracy in my covariance and Pearson’s Product Moment Correlation Coefficient. The anomaly is highlighted in my scatter diagram (including the regression line) to show how the regression line changes to incorporate this outlier, which is another possibly affected factor in my investigation.
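The two-standard-deviation rule mentioned above can be sketched in Python as a way of listing candidate outliers. The function name `outlier_rows` is illustrative, and the use of the 1/n standard deviation (`statistics.pstdev`) is an assumption. Note that a rule of this kind may flag more than one row, so the result is a list of candidates to inspect rather than a single confirmed anomaly.

```python
import statistics

# People per doctor (x) and average life expectancy (y), rows 1 to 50 of the table.
x = [
    7358, 769, 1250, 555, 14300, 1316, 370, 298, 455, 385,
    257, 709, 9090, 5000, 885, 244, 885, 10000, 5825, 57300,
    3448, 909, 1111, 667, 357, 2000, 5000, 5556, 2500, 333,
    2500, 6423, 1667, 588, 625, 6667, 5000, 20000, 476, 406,
    50000, 476, 33333, 303, 10000, 435, 699, 556, 375, 14300,
]
y = [
    45, 73, 69, 83, 47, 75, 73, 71, 78, 77,
    70, 74, 73, 58, 76, 68, 76, 53, 61, 44,
    69, 75, 76, 70, 78, 52, 47, 45, 69, 79,
    63, 65, 62, 78, 70, 52, 86, 56, 77, 77,
    45, 77, 49, 78, 68, 73, 72, 70, 81, 37,
]


def outlier_rows(values, k=2.0):
    """Return the 1-based row numbers lying more than k SDs from the mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # 1/n form (assumption)
    return {i for i, v in enumerate(values, start=1) if abs(v - mean) > k * sd}


# A row is a candidate outlier if it is extreme in either variable.
flagged = sorted(outlier_rows(x) | outlier_rows(y))
for row in flagged:
    print(row, x[row - 1], y[row - 1])
```

Re-running the correlation and regression calculations with the flagged rows removed would show how much they influence the results.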