Ilorin Journal of Education, Vol. 27 August, 2007

CONSTRUCTION AND VALIDATION OF A GENERAL SCIENCE APTITUDE TEST (GSAT) FOR NIGERIAN JUNIOR SECONDARY SCHOOL GRADUATES

Ariyo, Akinyele Oyetunde (PhD), International Centre for Educational Evaluation, Institute of Education, University of Ibadan, Ibadan, Nigeria. [email protected] com +2348034292924

Abstract

This paper reports a study whose major purpose was to develop and validate a General Science Aptitude Test (GSAT) for Junior Secondary School graduates seeking admission into senior secondary school one in Nigeria.
The specific objectives were to describe the various stages in the development and validation of GSAT and to determine the psychometric properties of the instrument. The Pearson Product Moment Correlation, the test difficulty index, the discrimination index and the Kuder-Richardson 21 statistic were used for the analysis of results. The results show that the GSAT was moderately difficult for the sampled students (average item difficulty of 0.39), while the instrument was found to be reliable, with an internal consistency of 0.90.
The inter-correlation among the GSAT’s sub-scales was found to be substantial. GSAT is recommended for use in other parts of the world.

Introduction

There is strong agreement among educationists and psychologists on the utility of aptitude tests in educational and vocational decisions about placement, streaming, admission and classification of students and job seekers (Gay, 1980; Macklem, 1990). According to Gay (1980), aptitude tests help the teacher to form more realistic expectations of students’ abilities and facilitate the identification of underachievers.
The terms intelligence, ability, and aptitude are often used interchangeably to refer to behaviour that is used to predict future learning or performance. However, subtle differences exist between the terms. The tests designed to measure these attributes differ in several significant ways. Like intelligence tests, aptitude tests measure a student’s overall performance across a broad range of mental capabilities. But aptitude tests also often include items which measure more specialized abilities, such as verbal and numerical skills, that predict scholastic performance in educational programs.
Compared to achievement tests, aptitude tests cover a broader area and look at a wider range of experiences. Achievement tests tend to measure recent learning and are closely tied to particular school subjects. Aptitude tests tell us what a student brings to the task regardless of the specific curriculum that the student has already experienced. The difference between aptitude and achievement tests is sometimes a matter of degree. Some aptitude and achievement tests look a lot alike. In fact, the higher a student goes in levels of education, the more the content of aptitude tests resembles achievement tests.
This is because the knowledge that a student has already accumulated is a good predictor of success at advanced levels. In the literature there is no single definition of aptitude. Some investigators define aptitude as the characteristics of a person that are regarded as indices of his capacity to acquire, through future training, some specific set of responses (Aiken, 1988; Gronlund, 1981). In contrast, other investigators see it as a natural or innate capacity for a particular performance (Yejide, 1973; Thorndike & Hagen, 1977).
Aptitude tests are cognitive (intellectual) measures used to predict future performance in some activities such as school learning and other forms of accomplishment (Aiken, 1988; Gronlund, 1981; Sax, 1980). Aptitude tests are widely used in schools and industries. Teachers might want to administer an aptitude test to help them identify students who have the potential to perform well in physics in the future so that they can be offered physics instruction. This is in line with previous research findings, in which a positive link between aptitude and later performance had already been established (Aiken, 1988; Gagne & Briggs, 1979; Gronlund, 1981; Thorndike & Hagen, 1977).
The United States Employment Service developed the General Aptitude Test Battery (GATB), which is widely used in the United States by State Employment offices and has been made available as a model or starting point for the development of aptitude batteries in other countries (Tittle, 1990). The Armed Services Vocational Aptitude Battery (ASVAB) is the most widely used aptitude battery in United States high schools. The Differential Aptitude Tests (DAT) were designed primarily for educational and vocational counselling in U.S.A. secondary schools (Tittle, 1984). An early work on the differential test is a battery devised for the selection of apprentices in the metal trades. Aptitude testing gained prominence when it became apparent that intelligence tests were rather limited in their coverage of special abilities or talents (Denga, 1987). Ikeotuonye (1986) validated the Differential Aptitude Test (DAT) in Kaduna State of Nigeria. The DAT battery consists of eight independent tests, namely: Verbal Reasoning (VR), Numerical Ability (NA), Abstract Reasoning (AR), Clerical Speed and Accuracy (CSA), Mechanical Reasoning (MR), Space Relations (SR), Language Usage: Spelling (LU-I) and Language Usage: Grammar (LU-II).
The specific objective of the study was to determine whether the location of schools, in terms of rural-urban categorization, had any effect on the students’ scores on the Differential Aptitude Tests. The sample was made up of 75 girls and 325 boys (400 students) from eight secondary schools in Kaduna State. The results of the study showed that the students in urban areas performed better than the students in rural areas. It was thus established empirically that the experiential background of the students had observable effects on their aptitudes.
Attempts have been made to develop tests of intelligence and creativity; several investigators (Bakare, 1972; McCarthy, 1973; Ohuche & Ohuche, 1973; Yoloye, 1973) adapted various standardized tests of intelligence in measuring pupils’ achievements. Also, Akinboye (1977) developed a test aimed at measuring the creative abilities of post-primary students. In Nigeria, the Federal Government established the Nigerian Aptitude Test Unit (NATU) in 1963, which has helped the West African Examination Council (WAEC) in constructing and administering aptitude tests (WAEC NEWS, 1989).
In order to reduce the dependence on foreign tests, the Test Development and Research Office (TEDRO), a branch of WAEC, came into existence in 1963. TEDRO has designed over twenty-one (21) types of aptitude tests. WAEC aptitude tests were designed for the selection of candidates into various education programmes such as secondary schools, vocational/technical schools, schools of nursing, the polytechnics and occupational services. The tests are useful in streaming students into various courses of study, for example, the scholastic or purely academic science, technical and commercial (Soriyan, 1978).
Yejide (1979) indicated that the predictor tests for science ability designed by WAEC are a battery of eight aptitude tests. The battery includes the Verbal Analogies (VAL), Reading Comprehension (RDL), Memory (MEM), Graph (GPH), Arithmetic (RTA), Tables (TAB) and Science Information (SCI) subtests. They are called I-D tests and were specifically designed for use in Africa. They were developed between 1960 and 1964 in a project supported by the U.S. Agency for International Development, which gave the contract to the American Institute for Research.
The introduction of the 6-3-3-4 system of education in Nigeria in 1982 constituted a major boost for psychological testing. The Guidance and Counselling Unit of the Federal Ministry of Education, Nigeria, commissioned experts from the Nigerian universities and representatives from WAEC to develop and standardize aptitude tests for use in counselling and placement of students after junior secondary education. The tests were standardized in 1986 (Denga, 1987). At present in Nigeria, the administration of aptitude tests is a yearly exercise in the Federal unity schools.
The National Examinations Council (N.E.C.O.), based at Minna, usually administers a series of aptitude tests to junior secondary three (JS III) students in Federal unity schools for the purpose of placement in different classes, such as arts, commercial, science and technical classes, at senior secondary school one. The developmental trend in aptitude testing reveals the need for more studies in the country involving aptitude testing, especially studies that cut across Federal, State and privately owned secondary schools.
Uses of Aptitude Tests

In general, aptitude test results have three major uses:

Instructional: Teachers can use aptitude test results to adapt their curricula to match the level of their students, or to design assignments for students who differ widely. Aptitude test scores can also help teachers form realistic expectations of students. Knowing something about the aptitude level of students in a given class can help a teacher identify which students are not learning as much as could be predicted on the basis of aptitude scores. For instance, if a whole class were performing less well than would be predicted from aptitude test results, then curriculum, objectives, teaching methods, or student characteristics might be investigated.

Administrative: Aptitude test scores can identify the general aptitude level of a high school, for example. This can be helpful in determining how much emphasis should be given to college preparatory programs. Aptitude tests can be used to help identify students to be accelerated or given extra attention, for grouping, and in predicting job training performance.
Guidance: Guidance counselors use aptitude tests to help parents develop realistic expectations for their child’s school performance and to help students understand their own strengths and weaknesses.

According to Macklem (1990), research data show that individually administered aptitude tests have the following qualities:
1. They are excellent predictors of future scholastic achievement.
2. They provide ways of comparing a child’s performance with that of other children in the same situation.
3. They provide a profile of strengths and weaknesses.
4. They assess differences among individuals.
5. They have uncovered hidden talents in some children, thus improving their educational opportunities.
6. They are valuable tools for working with handicapped children.

Research indicates that self-awareness and self-control are the building blocks upon which people skills are built. In other words, without awareness and control of your emotions and knowledge of how emotions affect your behaviors, there is little, if any, foundation upon which to build people skills (Emotional Intelligence Screening EQ test01). “People skills” is a term that encompasses a number of important competencies such as social and organizational awareness and the ability to manage relationships well. People skills are more a product of environment than of genes. The fact is that no one is too old to learn, practice and acquire new people skills, or emotional intelligence.

On account of the considerable utility of aptitude tests, therefore, several approaches have been tried in the overall effort to produce valid and reliable instruments for the measurement of aptitude.
Among the various approaches available for the construction of aptitude tests are:
1. The differential test approach, the component ability test approach, the work sample test approach and the analogous test approach (Horrocks and Schoonover, 1968).
2. The process of similarities, verbal analogies, memory, comprehension and boxes approaches (WAEC I-D).
3. Comparison approaches (Cooley, 1958).

In the differential approach, a number of relatively distinct abilities believed to be of major importance in assessing and predicting human behaviour in areas of general and special abilities are measured. The component ability test is a test of a single ability. The analogous test represents the job either by duplicating its pattern in miniature or by simulating the job without presenting the examinee with an exact reproduction of it. One particular advantage of the analogous test is that it is not necessary to identify the abilities which underlie the task, since at least part of the actual job performance is simulated. The work sample test requires the examinee to perform all or part of the working operations of a given job under non-testing conditions.
The WAEC I-D approach consists of using similarities, analogies, memory, comprehension, and boxes, among others, in the construction of aptitude test items. The comparison approach involves specific abilities needed by practicing scientists. One of the methods of statistical analysis useful for this approach is factor analysis, which analyses the interrelationships among a battery of tests. For the purpose of constructing GSAT, an attempt was made to combine the various approaches discussed in the literature.
This study developed and validated a General Science Aptitude Test for Junior Secondary School graduates seeking admission into Senior Secondary School.

Research Questions

1. What is the internal consistency of the General Science Aptitude Test (GSAT)?
2. What is the discrimination index of GSAT?
3. What is the difficulty index of GSAT?
4. What are the relationships among the GSAT subscales?

Methodology

In constructing the present Science Aptitude Test, an extensive review of literature was first undertaken to unearth the various delimitations of aptitude tests and aptitude test construction techniques by different authors.
Five broad areas of science aptitude were identified, which formed the five component parts of the instrument. They are Biology (BIO), Chemistry (CHE), General Reasoning (GER), Mathematics (MAT) and Physics (PHY). After obtaining these broad science areas, the next step consisted of writing items, bearing in mind the building principles in the various types of aptitude test construction approaches. The items obtained in this way were then subjected to vigorous editing and formulation. The items judged satisfactory on these bases were then grouped into the five subset areas of GSAT in terms of logical and content analysis. This grouping produced 16 items in each of the five areas, making 80 items in all. With respect to the mode of response to the test items, it was decided that multiple-choice items with four options should be used.

Sampling Procedure for Trial Testing

At this stage, an experimental version of the General Science Aptitude Test (GSAT) was trial tested. Five Junior Secondary Schools (JSS) in Jos, Nigeria (one Federal school, two State schools and two Voluntary Agency schools) were randomly chosen for the trial testing.
In each of the five schools, 10 students in Junior Secondary Three (JS 3) were chosen, making a total of 25 boys and 25 girls. The GSAT was administered to those 50 students and a score of 1 was assigned to each correct response to each item. The discrimination and difficulty indices of each item, as well as the average discrimination index and difficulty index in each sub-test, were obtained in order to determine the value and position of each item and each sub-test in the whole test. Based on these results, the GSAT items were reviewed. Using the test difficulty indices, the sub-tests were re-arranged in order of difficulty: the BIO sub-test of GSAT was found to be the easiest and the PHY sub-test the most difficult, so BIO was put first in the test while the PHY items were put last. Similarly, in each sub-test, the items were re-arranged in order of difficulty level. The easier items (i.e. the items that many students passed) were put before the difficult items (i.e. the items many students failed). Bloom (1976) suggested that good item difficulty should fall within the range of 0.40 to 0.60.
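The item statistics described above can be illustrated with a short sketch. It assumes the conventional definitions, which the paper does not spell out: difficulty as the proportion of examinees answering an item correctly, and discrimination as the difference between the proportions correct in the upper and lower 27% of scorers. The function name `item_indices` and the response matrix `trial` are hypothetical and are not the study's data.

```python
from typing import List, Tuple


def item_indices(responses: List[List[int]], fraction: float = 0.27) -> List[Tuple[float, float]]:
    """Return (difficulty, discrimination) for each item.

    responses[s][i] is 1 if student s answered item i correctly, else 0.
    Difficulty is the proportion of all students answering correctly.
    Discrimination is the proportion correct in the upper-scoring group
    minus the proportion correct in the lower-scoring group.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    # Rank students by total score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)
    # Size of each comparison group (assumed: rounded, at least one student).
    group = max(1, round(fraction * n_students))
    upper, lower = ranked[:group], ranked[-group:]

    results = []
    for i in range(n_items):
        difficulty = sum(row[i] for row in responses) / n_students
        p_upper = sum(row[i] for row in upper) / group
        p_lower = sum(row[i] for row in lower) / group
        results.append((difficulty, p_upper - p_lower))
    return results


# Hypothetical trial data: 10 students by 4 items (not the study's data).
trial = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

for number, (p, d) in enumerate(item_indices(trial), start=1):
    print(f"item {number}: difficulty={p:.2f}, discrimination={d:.2f}")
```

Sorting the items of a sub-test by decreasing difficulty value would then place the easier items first, as in the arrangement described above.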
To maximize the psychometric properties of GSAT, items with the lowest and highest discrimination and difficulty indices, computed from the results of the upper 27% and lower 27% of scorers, were eliminated until 10 items remained in each sub-test. The final form of the GSAT therefore consists of 50 items, 10 in each sub-test area.

Reliability and Validity of the General Science Aptitude Test (GSAT)

Reliability

In determining the reliability of GSAT, three groups of secondary school students in JSS 3, whose ages ranged between 12 and 18 years, were used. Group 1 consisted of 44 JSS 3 students (24 boys and 20 girls).
Group 2 consisted of 44 JSS 3 students (16 males and females). Group 3 consisted of 135 JSS 3 students (86 boys and 49 girls). The Group 1 students represented the Federal owned school type, the Group 2 students represented the State owned school type and the Group 3 students represented the Voluntary Agency owned school type. In all, 221 JSS 3 students in Jos were involved in the exercise. The performance of these students on GSAT and its sub-tests was analysed using the Kuder-Richardson formula 21 in order to estimate the internal consistency of the test.
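For reference, the Kuder-Richardson formula 21 needs only the number of items k, the mean M and the variance s² of the total scores: KR-21 = (k / (k - 1)) × (1 - M(k - M) / (k s²)). The sketch below is a minimal illustration under the assumption that the population variance is used; the paper does not state which variance estimate was applied, and the function name `kr21` and the scores are hypothetical.

```python
from statistics import mean, pvariance
from typing import Sequence


def kr21(total_scores: Sequence[float], n_items: int) -> float:
    """Kuder-Richardson formula 21 reliability estimate.

    total_scores: each examinee's number of correct answers.
    n_items: number of dichotomously scored items in the test (k).
    Assumption: population variance of the total scores is used.
    """
    m = mean(total_scores)
    variance = pvariance(total_scores)
    if n_items < 2 or variance == 0:
        raise ValueError("KR-21 needs at least two items and non-zero score variance")
    return (n_items / (n_items - 1)) * (1 - (m * (n_items - m)) / (n_items * variance))


# Hypothetical total scores on a 10-item sub-test (not the study's data).
scores = [2, 4, 6, 8, 10]
print(f"KR-21 = {kr21(scores, n_items=10):.2f}")
```

KR-21 treats all items as equally difficult, so it tends to give a somewhat lower estimate than KR-20 when item difficulties vary.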
The internal consistency estimate, which is a measure of reliability, is shown in Table 1 below.

Table 1: Internal Consistencies of GSAT and its Subscales

GSAT and its Subscales    Internal Consistency (KR-21)    No. of Items
GSAT                      0.90                            50
Biology                   0.68                            10
Chemistry                 0.62                            10
General Reasoning         0.61                            10
Mathematics               0.43                            10

Table 2 shows the summary of discrimination indices of GSAT and its subscales.

Table 2: Discrimination Indices of GSAT and its Subsets

GSAT and its Subsets      Discrimination Index
GSAT                      0.8
Biology                   0.51
Chemistry                 0.43
General Reasoning         0.43
Mathematics               0.28
Physics                   0.22
Average                   0.37

Table 3 shows the test difficulty indices of GSAT and its subsets.

Table 3: GSAT Difficulty Index Table

GSAT and its Subsets      Test Difficulty Index
GSAT                      0.39
Biology                   0.52
Chemistry                 0.43
General Reasoning         0.49
Mathematics               0.29
Physics                   0.31
Average Test Difficulty   0.39

Validity of the Instrument

One of the most valued forms of validity for an aptitude test is predictive validity. Providing predictive data would involve following up subjects for a number of years, and the relative newness of GSAT precludes furnishing such data at the moment.
Other types of validity usually established for aptitude tests are content validity and face validity. A look at GSAT shows that the items in the test are science oriented and lie within the experience of Junior Secondary School (JSS) students in Jos, Nigeria. One other aspect of validity pertains to the inter-correlation among the five sub-tests of GSAT. Table 4 shows these inter-correlations.

Table 4: Inter-correlations among the five GSAT Sub-tests (N – 21)

GSAT Sub-Test        Biology   Chemistry   General Reasoning   Mathematics   Physics
Biology                        0.47        0.54                0.28          0.26
Chemistry                                  0.42                0.22          0.30
General Reasoning                                              0.25          0.7
Mathematics                                                                  0.22
Physics

The correlation coefficients were found to be statistically significant. As one would expect, there is substantial inter-correlation among the pairs of sub-tests and they all cluster on one common factor called science aptitude.

Normative Data for the GSAT

In this instrument, provisions are made for obtaining ipsative scores and normative scores. Ipsative scores permit comparisons of the relative strength of characteristics within the individual, while normative scores permit external comparisons of the student’s performance with a normative sample.
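The distinction between the two kinds of score can be sketched as follows. This is an illustration only, under two assumptions not stated in the paper: normative scores are expressed as z-scores against a norm group, and ipsative scores as deviations from the student's own mean across sub-tests (one common way of ipsatizing). The norm values are the Federal school-type means and standard deviations reported in Table 5; the student profile and the function names are hypothetical.

```python
from typing import Dict, Tuple

# (mean, standard deviation) per sub-test for the Federal school type, from Table 5.
FEDERAL_NORMS: Dict[str, Tuple[float, float]] = {
    "Biology": (6.52, 1.92),
    "Chemistry": (5.50, 2.20),
    "General Reasoning": (4.90, 1.83),
    "Mathematics": (3.45, 2.02),
    "Physics": (4.05, 1.85),
}


def normative_scores(raw: Dict[str, float], norms: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """z-score of each sub-test against the norm group (external comparison)."""
    return {name: (score - norms[name][0]) / norms[name][1] for name, score in raw.items()}


def ipsative_scores(raw: Dict[str, float]) -> Dict[str, float]:
    """Each sub-test score minus the student's own mean (within-person comparison)."""
    own_mean = sum(raw.values()) / len(raw)
    return {name: score - own_mean for name, score in raw.items()}


# Hypothetical student profile: number correct out of 10 on each sub-test.
student = {
    "Biology": 8,
    "Chemistry": 6,
    "General Reasoning": 5,
    "Mathematics": 3,
    "Physics": 3,
}

z_scores = normative_scores(student, FEDERAL_NORMS)
own_profile = ipsative_scores(student)
for name in student:
    print(f"{name}: normative z = {z_scores[name]:+.2f}, ipsative = {own_profile[name]:+.1f}")
```

By construction the ipsative deviations sum to zero, so they show only which sub-tests are relative strengths for the individual, whereas the z-scores locate the student relative to the chosen norm group.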
The first step in the provision of normative data has been completed by administering the GSAT to a random sample of JSS 3 students in the three secondary school types (Federal, State and Voluntary Agency owned schools) in Jos. Table 5 shows the means and standard deviations (SD) of overall students’ performance on GSAT according to school type.

Table 5: Summary of the Means and Standard Deviations on Overall Students’ Performances according to School Types

School type        Statistic   Biology   Chemistry   General Reasoning   Mathematics   Physics   GSAT
Voluntary Agency   N           132       132         132                 132           131       132
                   X           5.1       4.21        3.83                2.77          2.86      19.34
                   SD          2.53      2.42        2.45                1.72          1.89      7.75
Federal            N           42        42          42                  42            42        42
                   X           6.52      5.50        4.90                3.45          4.05      24.33
                   SD          1.92      2.20        1.83                2.02          1.85      5.64
State              N           47        47          47                  47            47        47
                   X           4.30      2.96        2.60                2.28          3.09      15.40
                   SD          2.51      1.62        1.70                1.73          2.01      5.19

Note: N = number of subjects; X = mean score; SD = standard deviation.

From Table 5, it can be observed that the students from the Federal school type have the highest mean scores both on the overall GSAT and on its sub-tests. This was followed by the performance of students from the Voluntary Agency school type.
The students from the State school type were found to perform least well in all the sub-tests except Physics. This result has great implications for educational practitioners and policy makers. Student, school, and out-of-school factors that have been found to be detrimental to students’ performance and aptitude in science need to be addressed.

Discussions

The present study constructed and validated a general science aptitude test for junior secondary school graduates as a psychological instrument to assist in placing students into different arms of senior secondary one classes.
The findings of the present study reveal that the GSAT items constructed and validated have high internal consistency. This suggests that the instrument is reliable. It could be adopted and adapted in the country and in other parts of the world to place students into senior secondary school one. The average discrimination and difficulty indices of GSAT were found to be 0.37 and 0.39 respectively. This also attests to the validity of the instrument, in line with Bloom’s (1976) assertion that good item difficulty falls within the range of 0.40 to 0.60. It was also discovered that there was a substantial inter-correlation among the GSAT subscales. This also buttresses the fact that the present instrument is both valid and reliable. Therefore the present instrument is relevant and applicable to Junior Secondary School graduates. The instrument can be obtained by users from the author.

Recommendations

Placement of Junior Secondary School graduates into senior secondary one classes should be based on their performance in aptitude tests. Students who do well in GSAT, for instance, are expected to be placed in science classes in senior secondary one.
Students who do not do well in GSAT may be given other types of aptitude tests that will predict their future learning and capability. Researchers should concentrate more effort on factors that could improve students’ aptitude, since previous research has identified aptitude as a key factor in students’ performance in school. For instance, Ariyo’s (2006) findings revealed that physics general aptitude has the highest causal influence on senior secondary school students’ physics achievement in Oyo State.
Out of the nine predictor variables hypothesized to exert causal influence on achievement in physics, four variables (school type, student gender, student attitude to physics and physics general aptitude) significantly exert such causal influence directly. Of the four variables, physics general aptitude makes the highest contribution to achievement in physics; it is followed by students’ gender, then students’ attitude towards physics, and then school type. Physics general aptitude accounted for 50.87% of the total effect on the criterion variable and 46.6% of the direct effect. Work similar to the present study should be done in the arts and humanities to make the present study more robust. There should also be closer monitoring services in the State school type across the country and in other parts of the world. Both science students and teachers from the State school type should be given incentives that will motivate them to do well in science subjects. Positive attitudes towards science subjects should be reinforced among students.