
Coping with Complexity and Statistics


1. Introductory remarks

In this second talk, I'd like to turn to the issue of how to cope with the multiple factors in classroom-based research. Let me start with a specific example to demonstrate what I mean. Tom, your little five-year-old son, says to you: "Mummy, I'm hungry." With little hesitation, you can reach a conclusion: Tom is hungry because his stomach is empty. Or in other words, Tom is hungry because he has not eaten anything for the last three hours. With equally little hesitation, you can find a solution to the problem: give him some food.

As we may all agree, under normal circumstances, your analysis of the cause and your solution to the problem are quite adequate and effective. We may say that the situation is not complicated at all. For the sake of comparison, let us call a situation like this a single-cause-solution situation.

Now think about this question: Richards, one of your students, comes up to you and says: "Sir, why is my listening ability so poor?" Can you diagnose the cause and offer him a solution right away? No, it is very difficult. Why so? Because there are so many factors that may contribute to Richards' poor listening ability. It is not easy to pinpoint a particular one. And to make things worse, the poor listening ability may be caused jointly by multiple factors, such as Richards' limited vocabulary, too little practice in listening, a hearing problem, materials that are too difficult for him, etc. In fact, with joint activities such as teaching-learning, the causes of problems are bound to be complex, and solutions are bound to be difficult to come by. Hence this talk on how to cope with complexity and statistics.

Ah, you may realize that I have smuggled in the term statistics without a warning. This is deliberate. I put the two terms side by side to imply that a statistical method can be used to tackle a complex problem. So my talk will be divided into two main sections: (1) given a complex issue, how to analyze it; (2) how to think statistically and apply a statistical method to a problem.

2. How to analyze a complex issue

2.1 Be aware of the complexity

The most dangerous thing in classroom-based research is simple-mindedness about its complexity. Given a problem, we hit upon an idea and assume that it is the cause or the solution. It seems to me very important for you to start with full awareness of the fact that research into problems in teaching-learning situations is bound to be complex. It is therefore quite beneficial to draw a skeleton picture of how complex the situation can be. I have drawn one for myself, and I'd like to share it with you.

Learning is a dynamic process. Take a toddler learning to walk, for example. It starts from what it can do: it can sit, crawl, and support itself on all four limbs. Then, with a walker, it all of a sudden plunges forward, both fiercely and fearlessly, with its mother holding her breath. The top news will soon find its way to all corners of the community: IT CAN WALK NOW!

Surely it makes a brave start, and deserves celebration. However, no mother will forget the fact that the achievement is only temporary. Without sufficient follow-up reinforcement, her toddler will stay crawling on the floor. So here we have a simple case of the learning process a toddler goes through in learning to walk.

(Note, incidentally, that for a baby to learn to walk it has to learn how to fall first. If it cannot manage to fall, it cannot hope to walk. Is it revealing for Chinese learners of English to know this? For them, learning to fall is like making mistakes while trying to produce English.)

For a Chinese learner, learning a foreign language such as English is quite analogous to a toddler learning to walk properly, as shown below. The language learning process is of course far more complicated. However, the underlying principle governing the learning process is in fact the same, and we have a lot to learn from this analogy, as we shall see later.

1. We start with what learners can do, e.g. they can pronounce the alphabet, have a vocabulary of 300 words, etc. As no two learners can be exactly the same, the so-called same starting level of proficiency is only an approximation.

2. It is almost certain that learners' motivation to learn affects their learning, although it is difficult to tell how large the effect is.

3. The learning process consists of the interaction among the teacher, the learners and the materials. The teaching methodology and classroom management, the way the materials are provided, learners' learning styles, and the length of learning are factors, any or all of which may affect the effectiveness of learning.

4. Through the learning process, learners become able to do things which they were unable to do before. The crucial point at this stage of learning is that the achievement is only temporary and unstable, and that it can easily get lost. Learners are generally unaware of this danger, assuming that once they are able to do certain language tasks, they will automatically be able to reproduce them when the need arises. It must be pointed out that it often takes a great deal of reinforcement before the skills learners acquire temporarily become genuinely internalized and readily reproducible. So follow-up reinforcement is a crucial factor for long-term achievement. This accounts for the fact that students whose learning is exam-driven tend to lose the skills very quickly after the exam.

To sum up, the effectiveness of learning is a very complex issue. It depends on multiple factors, the implication of which for classroom-based research is that we have to make decisions on which factors are to be treated as constants and which as variables. We turn to this question in the next section.

2.2 Constants vs. variables

Now we are fully aware that there are many factors that contribute to the effectiveness of learning. Ideally we would design a research project with the objective of finding all of them out and showing which factor plays what role. Such an ideal project can be designed, but it is obviously unfeasible. What is realistic and feasible in real-life situations is to keep some factors stable, to be called constants, while examining one or two factors, to be called variables, to see what role they play.

Suppose that learners' participation in oral activities is unsatisfactory, and that we want to do some research so as to improve the situation. As we all know, learners' participation in oral activities is influenced by multiple factors: motivation, age, gender, the nature of the oral activities, the classroom setting, the way the activities are conducted, the level of proficiency, etc. may all affect it. Obviously we cannot investigate them all. So we have to decide which ones are going to be taken as insignificant, that is, unlikely to be the causes of the problem.

Suppose that the learners in question are freshmen at college. This feature prompts us to treat age and the level of proficiency as constants. Suppose we have also found that among the inactive learners, boys and girls seem to be equal in number. This prompts us to rule out gender as a potential variable. Through some further investigation we may eventually pinpoint one or two factors, say the way oral activities are conducted and the classroom seating arrangement, that are suspected to be the decisive causes of learners' inactive participation. Hence these two factors are treated as the variables of our research project.
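The sorting of factors in this example can be written down as a small sketch. The factor names come from the example itself; the two levels given for each variable ("teacher-led" versus "pair work", "rows" versus "circle") are hypothetical placeholders added only for illustration, not findings from any actual study.

```python
from itertools import product

# Factors held stable in the example (all learners are college freshmen).
constants = ["age", "level of proficiency"]

# Factor ruled out after a first look (inactive boys and girls equal in number).
ruled_out = ["gender"]

# The two suspected causes, treated as variables.
# The levels listed for each one are hypothetical placeholders.
variables = {
    "way oral activities are conducted": ["teacher-led", "pair work"],
    "classroom seating arrangement": ["rows", "circle"],
}

def conditions(variables):
    """Return every combination of variable levels as a list of dicts."""
    names = list(variables)
    return [dict(zip(names, combo)) for combo in product(*variables.values())]

for condition in conditions(variables):
    print(condition)
```

With two variables at two levels each, the sketch lists four conditions to compare. Every extra variable or level multiplies that number, which is one practical reason to keep a small project to one or two variables.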

The effective control of constants and variables contributes to the reliability of our research. The general rule is that the more effective the control is, the more reliable the research. For a B.A. dissertation we do not expect very effective control. This also accounts for the fact that learners are encouraged to choose a small topic with fewer factors, stable or variable. For an M.A. dissertation we do expect effective control. For a Ph.D. thesis, strict control is required.

To conclude this section, we must emphasize that the decision on which factors are constants and which are variables is by no means easy to make. Errors are likely to be committed. It is one of the areas where learners need help from tutors.

3. Work with some elementary statistics

Now let us compare the following statements:

1) The learners' inactive participation in oral activities is caused by the poor design of the oral tasks.
2) The learners' inactive participation in oral activities is mainly due to the poor design of the oral tasks.
3) The learners' inactive participation in oral activities is often due to the poor design of the oral tasks.
4) The learners' inactive participation in oral activities is in some cases due to the poor design of the oral tasks.
5) The learners' inactive participation in oral activities is found to be 60 percent due to the poor design of the oral tasks, and 40 percent due to the poor seating arrangement.
6) In our study involving 100 freshmen at BTVU, it is found that the learners' inactive participation in oral activities is 60 percent due to the poor design of the oral tasks, and 40 percent due to the poor seating arrangement.

The first statement is an absolute statement without any hedges. It is the most informative, and at the same time the most vulnerable. As it stands, it implies that the learners' inactive participation in oral activities is caused only by the poor design of the oral tasks. This statement can easily be refuted and proved false. Statements 2-4 are hedged by "mainly", "often", and "in some cases". In comparison with the first, they are much safer and less vulnerable to attack. However, they are less accurate, vaguer, and hence less informative. The fifth statement sounds quite informative and reliable, because it offers statistical information. However, a closer look at it will soon reveal its weakness, namely that it does not tell us how many learners were investigated. It appears to be much more accurate than the first four statements, but as a matter of fact it can be quite misleading. The last statement, in comparison with the previous five, is the most informative, the most accurate, and the most reliable. However, this reliability is comparative. The exact degree of reliability cannot be given until the way the research was conducted has been carefully evaluated.
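To see why the missing group size in statement 5 matters, here is a minimal sketch. It rests on an assumption not made in the statements themselves: that "60 percent" means 60 percent of the learners surveyed pointed to task design. It then uses the textbook normal-approximation interval for a proportion, which is one common way to attach a margin to a percentage, not a method prescribed in this talk. The function name and the group sizes 10 and 1000 are hypothetical; only the 100 comes from statement 6.

```python
import math

def proportion_interval(count, n, z=1.96):
    """Approximate 95% interval for a proportion (normal approximation)."""
    p = count / n
    margin = z * math.sqrt(p * (1 - p) / n)
    # Clamp to the valid range for a proportion.
    return max(0.0, p - margin), min(1.0, p + margin)

# The same "60 percent" reported from groups of different (hypothetical) sizes.
for n in (10, 100, 1000):
    count = round(0.6 * n)
    low, high = proportion_interval(count, n)
    print(f"n={n:4d}: 60% could plausibly lie between {low:.0%} and {high:.0%}")
```

Under these assumptions, 60 percent from ten learners is compatible with anything from roughly 30 to 90 percent, while 60 percent from a hundred learners narrows to roughly 50 to 70 percent. The percentage alone does not tell the reader which of these situations applies.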

What is the nature of a statistical statement? Well, it comes from counting figures. What does a figure, say 1, 2, 3, 4, 5, mean? A figure is the reduction of all the characteristics of individuals to an abstraction. This can be tested by trying to answer the following questions:

How many tables are there in this room?
How many chairs are there in this room?
How many people are there in this room?
How many computers are there in this room?

Your answers will be made of figures only, which have little to do with what is being counted: tables, chairs, people, computers, or whatnot. The implication of this fact for a statistical statement is that it only captures a general tendency, ignoring individual characteristics.

You may wonder if there is any point in drawing your attention to it. The points are twofold.

First, as pointed out above, teaching-learning is a very complex process involving many factors. It is very difficult to control them in order to examine each of them. In other words, we cannot reach a conclusion that is absolute and beyond all doubt. Because of this, what we can actually achieve is only a tendency statement with various degrees of certainty or reliability. It is in this regard that statistics offers the best help.

Second, statistics operates with probabilities. Suppose that there are ten factors that might contribute to the learners' inactive participation in oral activities. Since we do not know beforehand which factor plays what role, the safe strategy for approaching the problem is to set aside any unwarranted bias towards some of them, and to treat all of them equally by giving them the same degree of probability. This represents a statistically minded approach to a complex issue such as that of teaching-learning.
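The idea of giving ten factors the same degree of probability can be made concrete with a very small sketch. The factor labels are hypothetical placeholders, since the ten factors are not named here, and the function name is my own.

```python
from fractions import Fraction

def equal_probabilities(factors):
    """Give every candidate factor the same starting probability."""
    share = Fraction(1, len(factors))
    return {factor: share for factor in factors}

# Ten unnamed candidate factors; the labels are hypothetical placeholders.
factors = [f"factor {i}" for i in range(1, 11)]
starting_point = equal_probabilities(factors)

print(starting_point["factor 1"])    # each factor's share
print(sum(starting_point.values()))  # the shares add up to the whole
```

Each factor starts at one in ten, and the ten shares together make up the whole. This is only the starting position; evidence collected during the research is what would justify moving any factor up or down.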

Nowadays, statistics has developed and branched into a range of disciplines: there are statistics for finance, the stock exchange, medicine, physics, engineering, linguistics, chess games, to name but a few. Sophisticated computer software has also been written. The most popular package is perhaps SPSS, which is commercially available. With this software we can do quite a few statistical calculations just by clicking.

Statistics requires a special treatment, which obviously goes beyond the present talk. The message I want to drive home is that we have to think statistically instead of in absolute terms!

Questions for you to reflect upon:

1. The author says that a baby learning to walk and a Chinese learner learning English share the same underlying principle. Can you spell out this principle? Or can you point out in what ways the two types of learning are similar?

2. Why do you think it is important to treat some factors as constants, and others as variables? Do you think that constants are always constants while variables are always variables?

3. In what way is the distinction between constants and variables related to problem analysis, and to cause analysis in particular?

4. Why does the author say that statistical statements only capture the general tendency?