Poverty comparisons and household survey design

Poverty comparisons - an increasingly important starting-point for welfare analysis - are almost always based on household surveys. They therefore require that one be able to distinguish underlying differences in the populations being compared from sampling variation: standard errors must be calculated. So far, this has largely been done on the assumption that the household surveys are simple random samples. But household surveys are more complex than this. We show that taking into account sampling design has a major effect on standard errors for well-known poverty measures: they can increase by around one-half. The report also shows that making only a partial correction for sample design (taking into account clustering, but not stratification, whether explicit or implicit) can be as misleading as not taking any account at all of sampling design.
The Living Standards Measurement Study

The Living Standards Measurement Study (LSMS) was established by the World Bank in 1980 to explore ways of improving the type and quality of household data collected by statistical offices in developing countries. Its goal is to foster increased use of household data as a basis for policy decisionmaking. Specifically, the LSMS is working to develop new methods to monitor progress in raising levels of living, to identify the consequences for households of past and proposed government policies, and to improve communications between survey statisticians, analysts, and policymakers.

The LSMS Working Paper series was started to disseminate intermediate products from the LSMS. Publications in the series include critical surveys covering different aspects of the LSMS data collection program and reports on improved methodologies for using Living Standards Survey (LSS) data. More recent publications recommend specific survey, questionnaire, and data processing designs and demonstrate the breadth of policy analysis that can be carried out using LSS data.

LSMS Working Paper Number 129

Poverty Comparisons and Household Survey Design

Steven Howes and Jean Olson Lanjouw

The World Bank
Washington, D.C.

Copyright © 1997 The International Bank for Reconstruction and Development/THE WORLD BANK, 1818 H Street, N.W., Washington, D.C. 20433, U.S.A.

All rights reserved. Manufactured in the United States of America. First printing April 1997.

To present the results of the Living Standards Measurement Study with the least possible delay, the typescript of this paper has not been prepared in accordance with the procedures appropriate to formal printed texts, and the World Bank accepts no responsibility for errors.
Some sources cited in this paper may be informal documents that are not readily available.

The findings, interpretations, and conclusions expressed in this paper are entirely those of the author(s) and should not be attributed in any manner to the World Bank, to its affiliated organizations, or to members of its Board of Executive Directors or the countries they represent. The World Bank does not guarantee the accuracy of the data included in this publication and accepts no responsibility whatsoever for any consequence of their use. The boundaries, colors, denominations, and other information shown on any map in this volume do not imply on the part of the World Bank Group any judgment on the legal status of any territory or the endorsement or acceptance of such boundaries.

The material in this publication is copyrighted. Requests for permission to reproduce portions of it should be sent to the Office of the Publisher at the address shown in the copyright notice above. The World Bank encourages dissemination of its work and will normally give permission promptly and, when the reproduction is for noncommercial purposes, without asking a fee. Permission to copy portions for classroom use is granted through the Copyright Clearance Center, Inc., Suite 910, 222 Rosewood Drive, Danvers, Massachusetts 01923, U.S.A.

ISBN: 0-8213-3862-5
ISSN: 0253-4517

Steven Howes is an economist at the World Bank; this paper was written while he was in the Poverty and Human Resources Division of the Bank's Policy Research Department. Jean Olson Lanjouw is Assistant Professor in the Department of Economics at Yale University.

Library of Congress Cataloging-in-Publication Data

Howes, Stephen R., 1964- Poverty comparisons and household survey design / Stephen R. Howes and Jean Olson Lanjouw. p. cm. - (LSMS working paper, ISSN 0253-4517; no. 129) Includes bibliographical references. ISBN 0-8213-3862-5 1. Poverty-Statistical methods. 2. Household surveys-Statistical methods. I.
Lanjouw, Jean Olson. II. Title. III. Series. HC79.P6H68 1997 339.2'2-dc21 96-53397 CIP

Table of Contents

Foreword
Abstract
Acknowledgments
1. Introduction
2. Household survey designs
3. Estimators of totals and means and their variances appropriate for complex survey designs
4. Poverty and other welfare measures
5. Some examples
6. Concluding comments
Appendix I: Proofs
References

Tables

Table 1: Features of sample design from some recent national household consumption surveys, and a comparison with simple random sampling
Table 2: Sample design for Pakistan and Ghana LSMS surveys
Table 3: Sample design effects for mean expenditure, household size and various poverty measures for two household surveys

Foreword

The ability to monitor poverty is crucial to assessing the success of policies designed to improve standards of living. With the LSMS household surveys and others now available, many developing countries now have the data-base required to undertake this policy monitoring. Earlier LSMS papers have shown how to approach the measurement of poverty statistically, so as to be able to distinguish real changes from sampling variation. This paper extends the earlier work to show how to take into account typical sample designs in calculating statistical measures of poverty change. It uses LSMS data sets both to show how household surveys differ greatly from the simple random sample paradigm and to illustrate the importance of basing statistical formulae on the actual sample design used.

Lyn Squire, Director
Policy Research Department

Abstract

Poverty comparisons - an increasingly important starting-point for welfare analysis - are almost always based on household surveys.
They therefore require that one be able to distinguish underlying differences in the populations being compared from sampling variation: standard errors must be calculated. So far, this has largely been done on the assumption that the household surveys are simple random samples. But household surveys are more complex than this. We show that taking into account sampling design has a major effect on standard errors for well-known poverty measures: they can increase by around one-half. We also show that making only a partial correction for sample design (taking into account clustering, but not stratification, whether explicit or implicit) can be as misleading as not taking any account at all of sampling design.

Acknowledgments

We would like to thank, for their provision of data, information and/or comments: Benu Bidani, Gaurav Datt, Mark Foley, Paul Glewwe, Margaret Grosh, Dean Jolliffe, Peter Lanjouw, Martin Ravallion, Chris Scott, Kinnon Scott, Salman Zaidi, and Qing-hua Zhao. Mr Ranzam of Pakistan's Federal Bureau of Statistics also provided us with useful information.

1. Introduction

Has poverty increased or fallen? Is urban or rural poverty higher? Will some policy under consideration reduce or increase poverty? These are typical of the questions asked in poverty analyses. To provide answers, recourse is required to household surveys. But a survey is not a census. It is a sample, with a size typically numbering in the thousands of households, from which conclusions concerning populations typically numbering in the millions must be drawn. This leads to the fundamental problem that any comparative analysis must distinguish population differences from sampling variation. A series of recent papers have stressed the importance of this and have provided the tools by which standard errors can be calculated (Howes, 1993, Kakwani, 1993, Pudney and Sutherland, 1994, Ravallion, 1994).
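As an illustration of what distinguishing population differences from sampling variation involves, the following minimal sketch compares the headcount index (the share of households below a poverty line) in two samples. It is not taken from the paper: the function names are hypothetical, the headcount index is chosen only for simplicity, and the standard error formula assumes that each sample is a simple random sample and ignores the finite population correction.

```python
import math


def headcount(expenditures, poverty_line):
    """Share of sampled households with expenditure below the poverty line."""
    poor = sum(1 for x in expenditures if x < poverty_line)
    return poor / len(expenditures)


def srs_standard_error(p, n):
    """Standard error of a proportion, ASSUMING a simple random sample
    and ignoring the finite population correction."""
    return math.sqrt(p * (1.0 - p) / n)


def compare_headcounts(sample_a, sample_b, poverty_line):
    """Return both headcounts and a z-statistic for their difference,
    treating the two samples as independent."""
    p_a = headcount(sample_a, poverty_line)
    p_b = headcount(sample_b, poverty_line)
    se_diff = math.sqrt(
        srs_standard_error(p_a, len(sample_a)) ** 2
        + srs_standard_error(p_b, len(sample_b)) ** 2
    )
    return p_a, p_b, (p_a - p_b) / se_diff


if __name__ == "__main__":
    # Hypothetical expenditure data: 30% poor in sample A, 40% poor in sample B.
    sample_a = [1.0] * 30 + [3.0] * 70
    sample_b = [1.0] * 40 + [3.0] * 60
    print(compare_headcounts(sample_a, sample_b, poverty_line=2.0))
```

A z-statistic that is small in absolute value suggests that the observed difference could be due to sampling variation alone. The calculation is only as reliable as the standard errors that enter it.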
The problem with the current state of play is that, in presenting statistical methods and results for use in poverty comparisons, the assumption has been made that the household surveys being analyzed are simple random samples of the populations from which they are drawn. In fact, however, they are not. Household surveys are far more complex in their design. This can best be seen by analogy. Consider each household in the population to be represented by a number written on a piece of paper. All pieces of paper are of equal size and are placed in a hat. Then a household survey would be a simple random sample if it were selected by blindly drawing numbers from the hat. Household surveys differ in a number of ways from this simple model:

* One may have many hats from which sub-samples are drawn: often populations are first divided into strata, each of which may be considered a separate hat or sub-population.
* There will probably be hats within hats. A random selection of clusters, such as villages, is invariably first made from the population (or from each stratum). Households are then randomly drawn from these smaller clusters.
* Some numbers (households) have a higher probability of selection than others.
* The selection of numbers may not be blind, that is, random. Instead, it may be systematic: the numbers may be lined up and every nth one chosen.

As we will see, this is only the start of a fairly long list of complexities which household surveys incorporate.

What are the implications for statistical poverty (and welfare or inequality) analysis of these various features? This is the central question which this paper addresses. We present estimators of the variance of poverty measures appropriate for typical survey designs. And we assess the influence departures from a simple random sample are likely to have on the precision with which poverty estimates can be made.
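The hat analogy above can be sketched as a small simulation. The code below is an illustrative sketch only, not the paper's method: the population, its parameters (200 villages of 50 households, a shared village-level expenditure component) and all function names are hypothetical. It compares the spread of headcount estimates across repeated samples of 200 households drawn either from one hat (a simple random sample) or from hats within hats (20 clusters, then 10 households in each).

```python
import random
import statistics


def make_population(rng, n_clusters=200, households_per_cluster=50):
    """Hypothetical population: households in the same village share a
    village-level component, so expenditures are correlated within clusters."""
    population = []
    for _ in range(n_clusters):
        village_effect = rng.gauss(0.0, 1.0)
        population.append(
            [village_effect + rng.gauss(0.0, 1.0)
             for _ in range(households_per_cluster)]
        )
    return population


def headcount(sample, poverty_line):
    """Share of sampled households below the poverty line."""
    return sum(1 for x in sample if x < poverty_line) / len(sample)


def draw_srs(rng, all_households, n):
    """One hat: draw n households directly from the whole population."""
    return rng.sample(all_households, n)


def draw_two_stage(rng, population, n_clusters, n_per_cluster):
    """Hats within hats: draw clusters first, then households within each."""
    sample = []
    for cluster in rng.sample(population, n_clusters):
        sample.extend(rng.sample(cluster, n_per_cluster))
    return sample


def simulate(seed=0, replications=500, poverty_line=-0.5):
    """Standard deviation of the headcount estimate across repeated
    samples of 200 households under each of the two designs."""
    rng = random.Random(seed)
    population = make_population(rng)
    all_households = [x for cluster in population for x in cluster]
    srs = [headcount(draw_srs(rng, all_households, 200), poverty_line)
           for _ in range(replications)]
    clustered = [headcount(draw_two_stage(rng, population, 20, 10), poverty_line)
                 for _ in range(replications)]
    return statistics.stdev(srs), statistics.stdev(clustered)


if __name__ == "__main__":
    sd_srs, sd_clustered = simulate()
    print("spread under simple random sampling:", sd_srs)
    print("spread under two-stage cluster sampling:", sd_clustered)
```

Because households within a simulated village resemble one another, the clustered design yields estimates that vary more from sample to sample than the simple random sample of the same size. Stratification, unequal selection probabilities and systematic selection are not modelled in this sketch.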
Which departures have a substantial impact and which can be safely ignored for the sake of convenience? We show that, under sample designs commonly in use, conventional formulae may lead to estimates of standard errors for poverty measures which are only two-thirds the size they should be. That is, ignoring sample design can make us think estimates are considerably more precise than they actually are.

It should be noted at the outset that the key results we make use of - relating to the variances of sample means - have been known since the fifties, and are presented in several textbooks on the subject of sample design.1 However, we have not found any work which links the general results available for complex survey designs to the typical features of household surveys, let alone to poverty analysis. Moreover, the fact that these general results have been almost completely overlooked in the empirical and theoretical literature on poverty measurement suggests that there is a need to set out clearly the formulae required and to provide a strong motivation for their use.2

In the next section, we provide a more formal and detailed treatment of the various ways in which household surveys can differ from the simple random sample model. In Section 3, we provide the basic formulae. Section 4 applies these formulae to poverty measures. Section 5 gives some examples of the importance of taking into account sample design. Section 6 concludes. The appendix provides proofs of the paper's key results.

1. Kish (1965) is the classic on this subject. It gives what is still probably the most comprehensive treatment, though not the simplest. Som (1973) provides a very clear presentation of results. Hansen, Hurwitz and Madow (1953) provide proofs. Levy and Lemeshow (1991) provide an introduction, as does Scheaffer, Mendenhall and Ott (1990).

2. Deaton (1994) provides an excellent introduction to and analysis of household surveys and to some extent fulfills these two aims.
However, his treatment does not cover many of the common problems raised by the use of household surveys (such as when one has stratification and clustering, or raising factors and clustering, or all three, or when one is estimating per capita (rather than household) means). Scott and Amenuvegbe (1989, pp. 55-57) cover - in relation to a survey of Mauritania - the joint use of clustering and (implicit) stratification (not raising factors). However, their intention is to provide only formulae, and they add neither motivation nor explanation. Rodgers and Rodgers (1992) use an alternative method to that provided in this paper (balanced repeated replications - see footnote 16), but provide only results. In general it would seem that poverty analysis of developing countries pays less attention to sample design and its implications than analysis of developed countries (see Rodgers and Rodgers, 1992, and Duncan and Rodgers, 1991, as examples for the United States). Incorporation of sample design also seems to be much more widespread among demographers than economists (see Cleland and Scott, 1987).

2. Household survey designs

In this section, we discuss eight sampling features one needs to be aware of when analyzing household surveys. We then (in 2.9) present examples showing how different surveys incorporate various of them. This section, like the rest of the paper, draws most of its examples from surveys conducted as part of the World Bank Living Standards Measurement Study (LSMS). This is simply because we are more knowledgeable about these surveys' designs than others. However, LSMS surveys do present a range of household survey designs. Since each survey in the series is carried out in conjunction with the statistical bureau of the country in which it is being conducted, different LSMS surveys incorporate different designs, depending on prevailing practice in the countries concerned.

2.1. Clustering

One feature which most household surveys share is that they are clustered. That is, the first selection (from the population or sample frame) is not of households, but of some higher level units such as villages or street blocks, known as clusters. This is the case for all nationwide household surveys, though some very small surveys (of one or several villages, say) are not clustered. As we will see, clustering leads to higher variances. Its justification is purely practical. By concentrating sampled households in a small number of geographical areas, clustering drastically reduces survey costs per household.

Under some sample selection procedures, a cluster can be selected more than once: that is, more than one group of households, say, can be selected from a single cluster. To avoid confusion, we refer to each selection of a group of households (or ultimate sampling units - see 2.3 below) from the cluster as a 'cluster take'.

2.2. Stratification

Many surveys are stratified, typically into geographical regions, such as urban/rural and provincial, but also by other characteristics. The difference between strata and clusters is simply explained. Both strata and clusters divide the sampling frame exhaustively and exclusively. If both are present, the clusters sub-divide the strata. All strata are included in the sample (each with its designated sample size), but only a selection of clusters are included in the sample.

Stratification is very common, but not universal. Stratification with equal sample rates in the strata ensures a more representative sample overall, and so reduces variance. It can also be used to ensure that one obtains sufficient observations from small sub-populations of interest.

Note that what we call 'stratification' here is sometimes referred to as 'explicit stratification' to distinguish it from implicit stratification, a form of systematic sampling discussed in 2.7.
We analyze both forms of stratification, but when we use the term 'stratification' without qualification we are referring to the explicit variety.3

2.3. Ultimate sampling unit

The ultimate sampling unit is the smallest level of population unit sampled by the survey. For most household surveys, the ultimate sampling unit is the household. That is, after first selecting clusters, and then after possibly some intermediate selection stages (see 2.4), the final selection of elements is of households. But this is not universal. In the Nicaragua LSMS, the ultimate sampling unit was groups of five households: each cluster was divided into groups of five households (on the basis of geographical proximity) and a selection of these groups, rather than of individual households, was made. Of course, by selecting groups of households one is selecting individual households. However, the ultimate sampling unit is the lowest level at which sampling occurs (and hence is the unit in which sample size is measured). In the Nicaraguan case, there was no sampling below the group level: all households within any selected group were chosen.4

2.4. Number of random selection stages

The selection process of many samples is two-stage. That is, once clusters have been selected, the ultimate sampling units, typically households (see above), are selected directly from the clusters. However, more than two stages may also be used, especially in large countries. For example, in the Russian Longitudinal Monitoring Survey, raions (regions) provide the first-level clusters. Then voting districts are selected from the chosen raions. These serve as second-stage clusters. In a third stage, households are chosen from the selected voting districts. Note that the reference here is to the number of sampling stages, so stratification is never regarded as the first stage.

3. About half of the World Bank LSMS surveys are explicitly stratified. As far as we know, all the rest are implicitly stratified.

4.
Compare the typical case in which the final sampling is of households. Then households are the ultimate sampling unit, and not individuals, even though by selecting households one is selecting individuals. By contrast, if one samples from a list of individuals or if one samples individuals within households (as in the case of some fertility surveys), then individuals do indeed become the ultimate sampling units.

2.5. Unequal probabilities of selection

Many household surveys are not self-weighting. This means that some households have a higher chance of being selected than others. Variable weights (known as raising or expansion factors) have to be used to prevent estimators being biased as a result. Formally, raising factors can be defined as a set of weights such that the weighted sum of the sample observations of a given variable is an unbiased estimator of the population total of the variable.

There are three main reasons why a survey may not be self-weighting. First, when strata are used, the sample may not be distributed over the strata in accordance with the distribution of the population. Instead disproportionate stratification may be used, and some areas deliberately over-represented. These may be areas, typically urban, in which sampling is cheaper or they may be small sub-national political units, such as small provinces, for which one wants to ensure a minimum sample size. Second, even if the survey is intended to be self-weighting, it can end up not being so, owing, for example, to non-response. Finally, a common method o