






Proceedings 



Abha Aggarwal
Scientist E, Deputy Director,
National Institute of Medical Statistics,
ICMR, New Delhi 110029.
India.
Corresponding Author:
Abha Aggarwal
Scientist E,
Deputy Director,
National Institute of Medical Statistics,
ICMR, New Delhi 110029.
India.
Email: aabha54@gmail.com
History: Received 15 Sep 2011; Accepted 17 Sep 2011; Published Online 09 Oct 2011
DOI: http://dx.doi.org/10.7713/ijms.2011.0044
Abstract
Research is an academic activity, and the term should be used in its technical sense. It refers to a systematic method consisting of enunciating the research problem, formulating a hypothesis, collecting facts or data, analysing the facts and reaching certain conclusions, either in the form of solution(s) to the problem concerned or in the form of generalisations for some theoretical formulation. The purpose of research is to discover answers to questions through the application of scientific procedures. This article discusses sampling in research. It is designed mainly to equip beginners with knowledge of the general issues in sampling, its purpose, the types of sampling design, and guidance on deciding the sample size needed to obtain quality results.

Keywords: Sampling studies; sampling bias; research design.
Research methodology is a way to systematically solve the research problem. It refers to a systematic method consisting of enunciating the research problem, formulating a hypothesis, collecting facts or data, analysing the facts and reaching certain conclusions, either in the form of solution(s) to the problem concerned or in the form of generalisations for some theoretical formulation. In research methodology we study the various steps that a researcher generally adopts in studying his research problem, along with the logic behind them. It is necessary for the researcher to know not only the research methods and techniques but also the methodology. Researchers need to know not only how to develop certain indices or tests, how to calculate the mean, the mode, the median, the standard deviation or chi-square, and how to apply particular research techniques, but also which of these methods or techniques are relevant and which are not, what they mean, and why. Researchers also need to understand the assumptions underlying various techniques and the criteria by which they can decide that certain techniques and procedures will be applicable to certain problems and others will not. All this means that the researcher must design his methodology for his problem, as it may differ from problem to problem. In research the scientist has to expose the research decisions to evaluation before they are implemented. He has to specify very clearly and precisely what decisions he selects and why he selects them, so that they can also be evaluated by others [1].
Research methodology has many dimensions, and research methods constitute one part of it. The scope of research methodology is therefore wider than that of research methods. When we talk of research methodology we talk not only of the research methods but also of the logic behind the methods we use in the context of our research study. We explain why we are using a particular method or technique and why we are not using others, so that the research results can be evaluated either by the researcher himself or by others.
Sampling Issues
Research follows a specific process consisting of a series of actions or steps, from literature review to report writing with interpretation of the findings, that are needed to carry out research effectively. Sampling design with an appropriate sample size is one of the important steps in carrying out any research study. The foremost question in the mind of a researcher is how many subjects to include in the study and then how to select them. Whenever we conduct a study we use the word population, which in everyday conversation refers to a large collection of human beings or other living organisms. Since it is usually not possible to cover each and every unit of the population, we draw a sample of units from the population and draw inferences about the population on the basis of the sample. A sample is a subgroup of the individuals in the population and should be representative of it. Statistical inference thus relies on the formal projection of knowledge from the sample to the population, and this projection rests on the assumption that the sample is representative of the population from which it is drawn. Representativeness is sought through random sampling, in which the individuals to be included are selected from the population by chance methods. Thus, the first step is to select the sampling procedure, which requires knowledge of the sampling frame. The sampling frame is a complete list, or other form of identification, of the individuals in the population to be sampled. For example, if the aim is to sample adult residents of Delhi, one useful way is to define the sampling frame as the individuals listed in the current electoral registers (lists of people entitled to vote at elections). The method of data collection is called the design of the sample survey.
Thus the researcher must decide the way of selecting a sample, popularly known as the sample design. In other words, a sample design is a definite plan, determined before any data are actually collected, for obtaining a sample from a given population. Thus, the plan to select 12 of a city’s 200 drugstores in a certain way constitutes a sample design. After selecting a sample, we estimate population parameters on the basis of the sample and draw inferences about the population. Whatever sampling method is used, and however good it is at selecting the sampling units, a sample can never reproduce exactly the various characteristics of the population unless the sample is itself the whole population; some discrepancies are bound to occur. The discrepancy between a sample estimate and the population value that would be obtained in the same manner is known as sampling error, and its magnitude is measured by the standard error. The standard error measures the variability of an estimate (for example, the sample mean) across different samples drawn from the same population, so it should be as small as possible. It can be controlled through the size of the sample: as the number of observations increases, the standard error decreases. Its reciprocal is taken as a measure of the precision of the estimator.
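The relation between sample size and standard error can be illustrated with a minimal Python sketch. It assumes the estimate of interest is the sample mean, whose standard error is commonly estimated as s/sqrt(n); the population values, the seed and the function name `standard_error_of_mean` are hypothetical and serve only as an illustration.

```python
import math
import random
import statistics


def standard_error_of_mean(sample):
    """Estimated standard error of the sample mean: s / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


# Hypothetical population of 10,000 measurements (illustrative values only).
rng = random.Random(1)
population = [rng.gauss(50, 10) for _ in range(10_000)]

# Larger samples from the same population give smaller standard errors.
for n in (25, 100, 400):
    sample = rng.sample(population, n)
    print(n, round(standard_error_of_mean(sample), 3))
```

Because the standard error shrinks roughly with the square root of n, quadrupling the sample size only about halves it.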
Sampling Techniques
Samples can be either probability samples or non-probability samples. With probability samples each element has a known probability of being included in the sample, whereas non-probability samples do not allow the researcher to determine this probability. Probability samples are those based on simple random sampling, systematic sampling, stratified sampling and cluster/area sampling, whereas non-probability samples are those based on convenience sampling, judgment sampling and quota sampling techniques. The important sample designs are briefly described below:
1. Simple Random Sampling
2. Probability Proportional to Size (PPS)
3. Stratified Random Sampling
4. Cluster Sampling
5. Systematic Sampling
6. Inverse Sampling
7. Snowball Sampling
8. Lot Quality Assurance Sampling (LQAS)
Simple Random Sampling
It is a very simple method of selecting a sample. If the complete sampling frame is available, one can opt for this sampling plan. Random number tables are available for selecting the units. Selection can be with or without replacement. With this method every unit has an equal chance of selection, so the sample can be expected to be representative of the population. After selecting the sample, its mean and standard error (S.E.) can be calculated and inferences drawn about the whole population. This sampling plan is applicable in many medical situations.
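A minimal Python sketch of simple random sampling without replacement is given below. It uses a pseudo-random number generator in place of a random number table; the frame of 200 record numbers and the sample size of 12 are hypothetical.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; every unit has the same chance."""
    rng = random.Random(seed)
    return rng.sample(frame, n)


# Hypothetical sampling frame: record numbers 1 to 200.
frame = list(range(1, 201))
sample = simple_random_sample(frame, 12, seed=42)
print(sorted(sample))
```

Fixing the seed makes the selection reproducible, which helps when a sample has to be documented or audited.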
Sampling with Varying Probability
This is also known as probability proportional to size (PPS) sampling. If the units vary considerably in size, simple random sampling may not be appropriate, since it does not take into account the possible importance of the larger units in the population. In such situations we use PPS. For example, villages with a larger geographical area are likely to have a larger population, hence a bigger sample is required to represent such a village. Similarly, in Delhi the Municipal Corporation of Delhi (MCD) has more blocks than the New Delhi Municipal Council (NDMC), so blocks are selected proportionally. A sampling procedure in which units are selected with probability proportional to some measure of their size is known as PPS sampling.
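One common way to carry out PPS selection is the cumulative total method, sketched below in Python. This is an illustration under stated assumptions, not the procedure of any particular survey: selection is with replacement, sizes are positive integers, and the village names and populations are hypothetical.

```python
import bisect
import itertools
import random


def pps_sample(sizes, n, seed=None):
    """Select n units with replacement, with probability proportional to size.

    sizes maps each unit name to a non-negative integer size measure.
    """
    rng = random.Random(seed)
    names = list(sizes)
    cumulative = list(itertools.accumulate(sizes[name] for name in names))
    total = cumulative[-1]
    chosen = []
    for _ in range(n):
        r = rng.randint(1, total)  # random number between 1 and the total size
        # Pick the first unit whose cumulative total reaches r.
        chosen.append(names[bisect.bisect_left(cumulative, r)])
    return chosen


# Hypothetical villages with their population sizes.
villages = {"Village A": 100, "Village B": 300, "Village C": 600}
print(pps_sample(villages, 5, seed=7))
```

With these sizes, Village C is expected to be drawn about six times as often as Village A over many selections.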
Stratified Random Sampling
When the population is heterogeneous, or the variability in the population values is large, simple random sampling is not suitable for selecting the population units, because all the selected units may happen to come from the same group and the sample will then not be representative of the population. For example, if the characteristic is income, all the selected units may come from the lower income group, or all from the higher income group. To get better precision in this situation, it is better to divide the population into several groups, each of which is more homogeneous than the entire population, and to draw a random sample of predetermined size from each group. The groups into which the population is divided are called strata, and the procedure is known as stratified random sampling. A sample selected by this method is representative of the population.
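The procedure can be sketched in Python as follows. The income strata, their sizes and the predetermined sample size for each stratum are hypothetical; how those sizes should be allocated is a separate design decision that this sketch does not address.

```python
import random


def stratified_sample(strata, allocation, seed=None):
    """Draw a simple random sample of predetermined size from each stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(units, allocation[name])
        for name, units in strata.items()
    }


# Hypothetical strata: household identifiers grouped by income level.
strata = {
    "low income": [f"L{i}" for i in range(1, 501)],
    "middle income": [f"M{i}" for i in range(1, 301)],
    "high income": [f"H{i}" for i in range(1, 201)],
}
# Hypothetical allocation (here proportional to stratum size).
allocation = {"low income": 50, "middle income": 30, "high income": 20}

sample = stratified_sample(strata, allocation, seed=11)
for name, units in sample.items():
    print(name, len(units))
```

Every stratum is guaranteed to appear in the sample, which is what protects against drawing all units from one income group.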
Cluster Sampling
When a sampling frame of individual units is not available, we cannot use simple random sampling (SRS) or stratified random sampling. In this situation we select a cluster of units as the ultimate sampling unit; when the sampling unit is a cluster, the procedure is known as cluster sampling. For example, a list of households may be available but not a list of persons, or a list of areas in Delhi may be available but not a list of households within each area. Clusters are formed so that the variation within clusters is as large as possible and the variation between clusters is as small as possible. Once the clusters have been selected, we enumerate all the units in each selected cluster. In this respect the procedure is just the reverse of stratified sampling, where the strata are made internally homogeneous. For a given number of units, cluster sampling generally gives less precise estimates than stratified sampling, so the cost of achieving a given precision can be higher; it is nevertheless widely used in practice because a frame of individual units is so often unavailable.
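A minimal Python sketch of single-stage cluster sampling follows. It assumes a frame of clusters (areas) is available and that every household in a selected area is enumerated; the area names and household identifiers are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Select whole clusters at random and enumerate every unit in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical frame of areas; households are listed only after selection.
areas = {
    "Area 1": ["HH1", "HH2", "HH3"],
    "Area 2": ["HH4", "HH5"],
    "Area 3": ["HH6", "HH7", "HH8", "HH9"],
    "Area 4": ["HH10", "HH11"],
}
selected = cluster_sample(areas, 2, seed=5)
for area, households in selected.items():
    print(area, households)
```

Only the list of areas is needed before fieldwork; the household lists stand in for what would be compiled on the ground in the selected areas.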
Systematic Sampling
In this sampling procedure only the first unit is selected randomly, and the rest are selected automatically according to a predetermined pattern. Suppose the population is of size N and we want to draw a sample of size n; then k = N/n is called the sampling interval. We draw a random number i between 1 and k. The i-th unit is selected first, and the remaining units are the (i+k)-th, the (i+2k)-th, and so on. Such a sample is known as a systematic sample. The method is used extensively on account of its low cost and the simplicity of selecting the sample. In the National Family Health Survey (NFHS) we used this sampling plan because a listing of households was available and a fixed number of houses had to be selected, so it was better to select by this method than by SRS.
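The selection rule can be sketched in Python as below. The sketch assumes that N is a multiple of n (otherwise the interval is rounded down and the last few units of the frame can never be selected); the household listing is hypothetical.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Select units i, i+k, i+2k, ... with a random start i in 1..k."""
    k = len(frame) // n            # sampling interval k = N / n
    rng = random.Random(seed)
    start = rng.randint(1, k)      # random start i, 1 <= i <= k
    # Positions are 1-based in the text, so subtract 1 for list indexing.
    return [frame[start - 1 + j * k] for j in range(n)]


# Hypothetical listing of 200 households; select 20 of them.
households = list(range(1, 201))
print(systematic_sample(households, 20, seed=3))
```

Only one random number is needed, which is why the method is simple to apply in the field once a listing exists.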
Inverse Sampling
When the event to be estimated is rare, the usual methods of estimation are unsatisfactory; even a large sample may not be sufficient to estimate a rare event. In inverse sampling we fix the number of rare events to be collected and continue sampling until the required number of events has been obtained. In this sampling procedure the sample size (n) is therefore a random variable. For example, retinal detachment after surgery is rare; if a study is to be designed, we have to fix the number of retinal detachment cases needed for studying the factors related to it. Recently an inverse sampling approach has been developed for estimating the disease burden of leprosy [2].
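The stopping rule can be sketched in Python as follows. The estimator (m - 1)/(n - 1) used here is the standard unbiased estimator of a proportion under inverse binomial sampling; it is an assumption added for illustration and is not taken from this article. The function `inverse_sample`, the callable `draw_unit` and the event probability of 0.02 are hypothetical.

```python
import random


def inverse_sample(draw_unit, m):
    """Sample until m events are seen; return (n, estimated proportion).

    draw_unit is a callable returning True when the sampled unit has the event.
    """
    if m < 2:
        raise ValueError("m must be at least 2 for the estimator (m-1)/(n-1)")
    n = 0
    events = 0
    while events < m:
        n += 1
        if draw_unit():
            events += 1
    return n, (m - 1) / (n - 1)


# Hypothetical rare event occurring in about 2% of units.
rng = random.Random(9)
n, estimate = inverse_sample(lambda: rng.random() < 0.02, m=10)
print(n, round(estimate, 4))
```

The number of units examined, n, differs from run to run, which is the sense in which the sample size is a random variable.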
Snowball Sampling
This sampling procedure is also used for estimating a parameter when the event under study is rare. For example, maternal death is a rare event, which makes the maternal mortality ratio difficult to estimate. In snowball sampling we first identify, through some key informants in the village, a few households where a maternal death has occurred. Each of these households is then asked to name other households where a similar event has occurred, those households are asked in turn, and so on. By this method the maternal deaths in an area can be covered by contacting only a few related households [3].
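The referral process can be sketched in Python as a breadth-first traversal of a referral network. The household labels and the `referrals` mapping are hypothetical; in practice the referrals are collected during fieldwork rather than known in advance.

```python
from collections import deque


def snowball(seeds, referrals):
    """Follow referrals from the seed households until no new ones appear."""
    found = set(seeds)
    queue = deque(seeds)
    while queue:
        household = queue.popleft()
        for other in referrals.get(household, ()):
            if other not in found:
                found.add(other)
                queue.append(other)
    return found


# Hypothetical referrals: each household names others it knows of.
referrals = {
    "H1": ["H2", "H3"],
    "H2": ["H1", "H4"],
    "H5": ["H6"],
}
print(sorted(snowball(["H1"], referrals)))
```

In this example households H5 and H6 are never reached from the seed H1, which suggests why the choice of key informants and seed households matters for coverage.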
LQAS
Traditional survey methods, which are generally costly and time consuming, are usually used for estimation. These techniques cannot be used to identify areas of low or high immunisation coverage. Lot Quality Assurance Sampling (LQAS) is a rapid technique which can be used to identify poorly performing areas. LQAS was originally developed in industry for deciding whether to accept or reject a lot, a decision that depends on the number of defective items found in a small sample. In recent years there have been several attempts to use this technique in the health system to evaluate the performance of health programmes. LQAS may be used to identify the areas where the performance of a programme is poor, so that interventions can be made in those areas.
The strategy and goals of LQAS in the health field are similar to those in manufacturing. For example, the immunisation programme of a health unit may be defined as acceptable if the proportion of fully immunised children in its catchment population is, say, 80% or higher. As another example, suppose a researcher is interested in whether a community has a 10% prevalence of HIV infection. Traditional sampling procedures such as simple random sampling, stratified sampling and cluster sampling would be useful if we wanted to estimate the prevalence of HIV infection. Here, however, we are mainly interested in testing a hypothesis relative to a threshold prevalence level beyond which health planners will intervene.
In health applications one is interested in classifying a given operational area as having satisfactory or unsatisfactory performance with respect to various indicators relating to health goals and objectives. The technique is now used in the health field to evaluate health programmes. It is generally applicable to small areas; for example, a subcentre or a block can be considered as a lot [4].
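The accept or reject decision can be sketched in Python using binomial probabilities. The sample size of 19 children and the decision value of 6 unimmunised children are hypothetical choices made for illustration; they are not taken from this article, and the function names are likewise illustrative.

```python
from math import comb


def prob_accept(n, d, coverage):
    """Probability of accepting a lot: at most d unimmunised children
    among n sampled, when the true coverage is `coverage`."""
    p = 1 - coverage  # probability that a sampled child is unimmunised
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(d + 1))


def classify_lot(unimmunised, d):
    """Apply the decision rule to the number of unimmunised children found."""
    return "acceptable" if unimmunised <= d else "unacceptable"


n, d = 19, 6  # hypothetical sample size and decision value
for coverage in (0.8, 0.5):
    print(coverage, round(prob_accept(n, d, coverage), 3))
print(classify_lot(4, d), classify_lot(9, d))
```

Comparing the acceptance probability at the target coverage with that at a clearly inadequate coverage indicates how well a given pair (n, d) separates good lots from poor ones.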
References
1. Kothari CR. Research Methodology: Methods & Techniques. Second Edition. New Age International Publishers; 2004.
2. Aggarwal A, Pandey A. Inverse sampling for study of leprosy. Indian J Med Res 2010;132:438-41.
3. Singh P, Pandey A, Aggarwal A. House to house survey vs. snowballing survey technique for capturing maternal deaths in India: a pilot study in search of a cost effective method. Indian J Med Res 2007;125:606.
4. Aggarwal A, Pandey A. Lot Quality Assurance Sampling in Evaluation of Immunization Programme. In: Biostatistician Aspects of Health and Population. Hindustan Publisher; 2006. pgs. 2024.





