African Journal Shukla Et Al

Embed Size (px)

Citation preview

  • 8/2/2019 African Journal Shukla Et Al

    1/12

    African Journal of Mathematics and Computer Science Research Vol. 5(4), pp. 78-89, 15 February, 2012Available online at http://www.academicjournals.org/AJMCSRDOI: 10.5897/AJMCSR11.159ISSN 2006-97312012 Academic Journals

    Full Length Research Paper

    Utilization of mixture of 1 X,X and 2X in imputation formissing data in post-stratification

    D. Shukla1, D. S. Thakur2 and N. S. Thakur3

    1Department of Mathematics and Statistics, Dr. H. S. Gour Central University, Sagar, (M.P.), India.

    2School of Excellence, Sagar (M.P.), India.

    3Centre for Mathematical Sciences (CMS), Banasthali University, Rajasthan, India.

    Accepted 10 January, 2012

    To estimate the population mean using auxiliary variable there are many estimators available in

    literature like-ratio, product, regression, dual-to-ratio estimator and so on. Suppose that all theinformation of the main variable is present in the sample but only a part of data of the auxiliary variableis available. Then, in this case none of the aforementioned estimators could be used. This paperpresents an imputation based factor-type class of estimation strategy for population mean in presenceof missing values of auxiliary variables. The non-sampled part of the population is used as animputation technique in the proposed class. Some properties of estimators are discussed andnumerical study is performed with efficiency comparison to the non-imputed estimator. An optimumsub-class is recommended.

    Key words: Imputation, non-response, post-stratification, simple random sampling without replacement(SRSWOR), respondents (R).

    INTRODUCTION

    In sampling theory, the problem of mean estimation of apopulation is considered by many authors like Srivastavaand Jhajj (1980, 1981), Sahoo (1984, 1986), Singh(1986), Singh et al. (1987), Singh and Singh (1991),Singh et al. (1994), Sahoo et al. (1995), Sahoo andSahoo (2001), and Singh and Singh (2001). Sometimesin survey situations, a small part of sample remains non-responded (or incomplete) due to many practicalreasons. Techniques and estimation procedures areneeded to develop for this purpose. The imputation is awell defined methodology by virtue of which this kind ofproblem could be partially solved. Ahmed et al. (2006),Rao and Sitter (1995), Rubin (1976) and Singh and Horn(2000) have given applications of various imputationprocedures. Hinde and Chambers (1990) studied thenon-response imputation with multiple source of non-response. The problem of non-response in samplesurveys immensely looked into by Hansen and Hurwitz(1946), Grover and Couper (1998), Jackway and Boyce

    *Corresponding author. E-mail: [email protected].

    (1987), Khare (1987), Khot (1994), Lessler and Kalsbeek(1992).

    When the response and non-response part of thesample is assumed into two groups, it is closed to calupon as post-stratification. Estimation problem in samplesurvey, in the setup of post-stratification, under nonresponse situation is studied due to Shukla and Dubey(2001, 2004, and 2006). Some other useful contributionsto this area are by Holt and Smith (1979), Jagers et al(1985), Jagers (1986), Smith (1991), Agrawal and Panda(1993), Shukla and Trivedi (1999, 2001, 2006), Wywia(2001), Shukla et al. (2002, 2006). When a sample is fulof response over main variable but some of auxiliaryvalues are missing, it is hard to utilize the usuaestimators. Traditionally, it is essential to estimate thosemissing observations first by some specific estimationtechniques. One can think of utilizing the non-sampledpart of the population in order to get estimates of missingobservations in the sample. These estimates could beimputed into actual estimation procedures used for thepopulation mean. The content of this research work takesinto account the similar aspect for non-responding valuesof the sample assuming post-stratified setup and utilizing

  • 8/2/2019 African Journal Shukla Et Al

    2/12

    the auxiliary source of data.

    Symbols and setup

    Let U = (U1, U2 , ........., UN) be a finite population of N

    units with Y as a main variable and X the auxiliaryvariable. The population has two types of individuals likeN1 as number of "respondents (R)" and N2 "non-respondents (NR)", (N = N1+N2). Their populationproportions are expressed like W1 = N1/Nand W2 = N2/N.Quantities W1 and W2 could be guessed by past data or

    by experience of the investigator. Further, let Y and X be the population means of Yand Xrespectively. Somesymbols are as follows:

    R-group, Respondents group or group of those whoresponses in survey;

    NR-group, Non-respondents group or group of those whodenied to response during survey;

    1Y , Population mean of R-group of Y;

    2Y , Population mean of NR-group of Y;

    1X , Population mean of R-group of X;

    2X , Population mean of NR-group of X;2

    1YS , Population mean square of R-group of Y;2

    2YS , Population mean square of NR-group of Y;2

    1XS , Population mean square of R-group of X;2

    2XS , Population mean square of NR-group of X;YC1 , Coefficient of variation of Yin R-group;

    YC2 , Coefficient of variation of Yin NR-group;

    XC1 , Coefficient of variation of Xin R-group;

    XC2 , Coefficient of variation of Xin NR-group;

    , Correlation coefficient in population between Xand Y;

    N, Sample size from population of size Nby SRSWOR;

    1n , Post-stratified sample size coming from R-group;

    2n , Post-stratified sample size from NR-group;

    1y , Sample mean of Ybased on n1 observations of R-

    group;

    2y , Sample mean of Ybased on n2 observations of NR-

    group;

    1x , Sample mean of X based on n1 observations of R-

    group;

    2x , Sample mean of Xbased on n2 observations of NR-

    group;

    1 . Correlation Coefficient of population of R-group;

    2 , Correlation Coefficient of population of NR-group

    Shukla et al. 79

    Further, consider few more symbolic representations:

    2

    1

    2

    1

    11

    11

    111

    WnN

    WnN

    nWnED ;

    2

    2

    2

    2

    22

    21

    111

    WnN

    WnN

    nWnED

    N

    YNYNY 2211

    ; 2211

    N

    XNXNX

    ASSUMPTIONS

    Consider following in light of Figure 1 before formulatingan imputation based estimation procedure:

    1) The sample of size nis drawn by SRSWOR and poststratified into two groups of size n1 and n2 (n1 + n2 = naccording to R and NR group respectively2) The information about Y variable in sample iscompletely available.

    3) The sample means of both groups 1y and 2y are

    known such that

    2211

    nynyny which is sample mean on n units.

    4) The population means 1X and X are known.

    5) The population size Nand sample size nare knownAlso, N1 andN2 are known by past data, past experienceor by guess of the investigator (N1 + N2 = N).

    6) The sample mean of auxiliary information 1x is only

    known for R-Group, but information about 2x of NR-group

    is missing. Therefore

    2211

    n

    xnxnx

    could not be obtained due to absence o

    2x .

    7) Other population parameters are assumed known, in

    either exact or in ratio from except the Y , 1Y and 2Y .

    PROPOSED CLASS OF ESTIMATION STRATEGY

    To estimate population meanY , in setup of Figure 1, a

    problem to face is of missing observations related to 2x

    therefore, usual ratio, product and regression estimatorsare not applicable. Singh and Shukla (1987) have

    proposed a factor type estimator for estimating population

    mean Y . Shukla et al. (1991), Singh and Shukla (1993)and Shukla (2002) have also discussed properties offactor-type estimators applicable for estimatingpopulation mean. But all these cannot be useful due tounknown information

    2x . In order to solve this, an

    imputation 4

    *

    2x is adopted as:

    42

    *x =

    nN

    XfXfnXN 21 1

    (1)

  • 8/2/2019 African Journal Shukla Et Al

    3/12

    80 Afr. J. Math. Comput. Sci. Res.

    Population (N=N1 + N2)

    R-group NR-group

    N1 1Y 1X N2 2Y 2X

    2

    1YS

    21X

    S 2

    2YS

    22XS

    Sample (n=n1+ n2)

    R-group NR-group

    n1 1y n2

    1x 2y 2x

    Figure 1. Setup to estimate population meanY .

    The logic for this imputation is to utilize the non-sampledpart of the population of X for obtaining an estimate of

    missing 2x and generate 2x for x as describe as

    follows:

    And

    4x =

    21

    42211

    NN

    xNxN*

    (2)

    The proposed class of imputed factor-type estimator is:

    [( yFT

    )D]k

    =

    4

    4

    2211

    xCXfBA

    xfBXCA

    N

    yNyN

    (3)

    Where k0 and kis a constant and

    A = (k 1) (k 2 ); B= (k 1) (k 4); C= (k 2) (k 3) (k 4); f= n/ N.

    LARGE SAMPLE APPROXIMATION

    Consider the following for large n:

    422

    311

    222

    111

    1

    1

    1

    1

    eXx

    eXx

    eYy

    eYy

    (4)

    where,1e ,

    2e ,

    3e and

    4e are very small numbers 0ie

    (i= 1,2,3,4).

    Using the basic concept of SRSWOR and the concept ofpost-stratification of the sample ninto n1 and n2 (Cochran2005; Hansen et al., 1993; Sukhatme et al., 1984; Singhand Choudhary, 1986; Murthy, 1976), we get

    0EEE

    0EEE

    0EEE

    0EEE

    244

    133

    222

    111

    nee

    nee

    nee

    nee

    (5)

    Assume the independence of R-group and NR-grouprepresentation in the sample, the following expressioncould be obtained:

    12121 EEE nee

    1

    2

    1

    1

    11E nC

    NnY

    21

    1

    11E

    YC

    Nn

    211

    1Y

    CN

    D

    (6)

  • 8/2/2019 African Journal Shukla Et Al

    4/12

    22222 eEE nEe

    2

    2

    2

    2

    11E nC

    NnY

    222

    1YC

    ND

    (7)

    1

    2

    3

    2

    3EEE nee

    2

    11

    1X

    CN

    D

    (8)

    and

    2

    22

    2

    4

    1E XC

    NDe

    (9)

    13131

    EEE neeee

    1111

    1

    11E nCC

    NnXY

    XY

    CCN

    D1111

    1

    (10)

    0EEE212121

    n,neeee(11)

    0E 41 ee (12)

    0,EEE 213232 nneeee (13)

    XY

    CCN

    Dee222242

    1E

    (14)

    0E 43 ee (15)The expressions (11), (12), (13) and (15) are true underthe assumption of independent representation of R-groupand NR-group units in the sample. This is introduced tosimplify mathematical expressions.

    Shukla et al. 81

    Theorem 1

    The estimator [( yFT

    )D]k

    could be expressed under large

    sample approximation in the form:

    [( yFT

    )D

    ]k

    = 3Y[1 + s

    1W

    1e

    1+ s

    2W

    2e

    2][1 + (

    3

    3)e

    3(

    3

    3)

    3 2

    3e + (

    3

    3) 2

    3

    3

    3e

    Proof

    Rewrite 4

    x as in Equation 2:

    4x =

    21

    4

    *

    2211

    NN

    xNxN

    Where,

    42

    *x =

    nN

    XfXfnXN 21 1

    )4(

    x =

    nN

    XfXfnXNNeXN

    N

    21

    2311

    )1(1

    1

    =

    N

    XfXfnXNpeXN 21311 11

    =

    N

    eXNXfpnXpnfXpNXN 3112111 1

    =

    N

    eXNXfpnXpnfNXpN 311211 1

    = 3112121 1 erWr)f(pfrpfWpX

    = 311 erWX (16)

    Where,

    p=nN

    N

    2 ; = p+ (W1- pf

    2)r1pf(1-f) r2.

    Now, the estimator [( yFT

    )D]k

    under approximation and

    using (16) is

    [( yFT

    )D]k

    =

    4

    4

    2211

    xCXfBA

    xfBXCA

    N

    yNyN

    =

    XerWCXfBA

    XerWfBXCA

    N

    eYNeYN

    311

    31122

    211

    111

    =

    311

    311

    2221111

    erCWCfBA

    erfBWCfBAeWseWsY

    =

    343

    321

    2221111

    e

    eeWseWsY

    = 3 2221111 eWseWsY (1 + 3e3)(1 + 3e3)-1

  • 8/2/2019 African Journal Shukla Et Al

    5/12

    82 Afr. J. Math. Comput. Sci. Res.

    Where,

    1 = A + fB+ C; 2 = fBW1r1; 3 = A + fB+ C; 4 = CW1r1;

    r1 =X

    X1 ; r2 =X

    X2 ; s1 =Y

    Y1 ; s2 =Y

    Y2 ; 3 =1

    2

    ; 3 =

    3

    4

    ; 3 =

    3

    1

    .

    We can further express the above into following:

    = 3Y [1 + s1W1e1+ s2W2e2] (1 + 3e3) (1 3e3+2

    32

    3e

    3

    3 33e .)

    =3Y [1+s1W1e1+s2W2e2][1+(33)e3(33)32

    3e+(3 3)

    2

    33

    3e .]

    BIAS AND MEAN SQUARED ERROR

    Define E(.) for expectation, B(.) for bias and M(.) formean squared error, then the first order ofapproximations could be established for i,j= 1, 2, 3, .....as

    0E 21 jiee when i+j>2

    0E 31 jiee when i+j>2

    0E 32 jiee when i+j>2 (17)

    Theorem 2

    The [( yFT

    )D]k

    is a biased estimator of Y with the amount

    of bias to the first order of approximation:

    YXXkDFT CWsCN

    DCYy 111113133133 1

    1B

    Proof

    YyykDFTkDFT

    EB

    Using theorem 1 and taking expectations

    E[( yFT

    )D]k

    = 3YE[1 + (3 3)e3 (33) 32

    3e

    + s1W1e1{1 + (33) e3 (33)

    32

    3e }

    + s2W2e2 {1+ (33) e3 (33)32

    3e }]

    = 3Y[1(33)3 E(2

    3e )+ s1W1(33) E(e1e3) + s2W2(33)E(e2e3)]

    =

    21133331

    1 XCN

    DY

    XY

    CCN

    DWs11111133

    1

    =

    YXX

    CWsCCN

    DY11111311333

    11

    Therefore,

    YyykDFTkDFT

    EB

    =

    YXX CWsCN

    DCY 1111131331331

    1

    Theorem 3

    The mean squared error of [( yFT

    )D]k

    is

    M [( yFT

    )D]

    =

    2

    2

    2

    2

    2

    2

    2

    3211113

    2

    12

    2

    1

    2

    111

    2

    3

    2 12

    11

    YXYXYCWs

    NDCCsJCJCsJ

    NDY

    Where

    2

    1

    2

    31 WJ , 333333332 12 J ; )(12( 33313 WJ

    Proof

    M [( yFT

    )D]k

    = E[{( y FT)D}kY ]2

    Using Theorem 1, we can express

    M[( yFT

    )D]k

    = E[ 3Y{1 + s1W1e1+ s2W2e2}{1+ (3 3)e3 (3 3)32

    3e + (3 3)

    2

    3

    3

    3e

    ...}Y]2

    Using large sample approximations of (17) we could

    express

    =2

    Y E [(31)+3{(3 3)e3 (33)3 23e +(s1W1e1+ s2W2e2)+ (3 3)(s1W1e1 + s2W2e2)

    e3}]2

    =2

    Y E [(3 1)2+ 2

    3 {(3 3)

    2 23e

    + 21s

    2

    1W

    2

    1e

    + 22s

    22

    W2

    2e +2s1s2W1W2e1e2

    + 2(3 3)(s1W1e1 e3 + s2W2e2e3)}+ 23(3 1) {(3 3)e3 (3 3)32

    3e

    + (s1W1e1 + s2W2e2) + (3 3) (s1W1e1e3 + s2W2e2e3)}]

    Using (5), (11) and (12) we rewrite,

    =2

    Y [ (31)2+ 23 {(33)

    2E( 23e )+ 21s

    21W

    E( 21e )+ 22s

    22W E(

    2

    2e ) +2(33) s1W1 E(e1 e3)}

    + 23(3 1){(33)3 E(2

    3e

    )+ (3 3)s1W1 E(e1e3 )}]

    2

    11

    2

    1

    2

    1

    2

    11

    2

    33

    2

    3

    2

    3

    2 111

    YXC

    NDWsC

    NDY

    XYYCC

    NDWsC

    NDWs

    11111133

    2

    22

    2

    2

    2

    2

    12

    1

    2

    1133333

    112

    XC

    ND

    XYCC

    NDWs

    11111133

    1

    2

    1

    2

    1

    2

    1

    2

    31

    2

    3

    2 11

    YCWs

    NDY 12 2

    133333333 XC

    XYCCWs 111113333 12

    22

    22

    22

    232

    1YCWs

    ND

    212

    2

    1

    2

    111

    2

    3

    2 11 XY CJCsJ

    NDY

    2

    2

    2

    2

    2

    2

    2

    3211113

    12 YXY CWs

    NDCCsJ

  • 8/2/2019 African Journal Shukla Et Al

    6/12

    SOME SPECIAL CASES

    The term A, B and C are functions of k. In particular,there are some special cases:

    Case 1

    When k= 1

    A = 0; B= 0; C= 6; 1= -6; 2= 0; 3= - 6; 4= -6r1w1; 3 = 0; 3 =

    11Wr

    ; 3 =1 ;

    J1 = 2

    2

    1

    W; J2 = 4

    2

    1

    2

    1 23

    )(Wr ; J3 = 3

    2

    11 2

    )(Wr ;

    The estimator [( yFT

    )D]k

    along with bias and m.s.e. under

    case i is:

    [( yFT

    )D]k =1

    =

    )(

    x

    X

    N

    yNyN4

    2211

    (18)

    B[( yFT

    )D]k =1

    =Y-3[(1-)2+

    N

    D1

    1

    r12

    1W C1X{r1C1X -s11C1Y}]

    (19)

    M[( yFT

    )D]k=1

    =2

    Y -4[(1)22 + 21W

    N

    D1

    1{2

    2

    1s

    2

    1YC + (3 - 2) 2

    1r

    2

    1XC

    + 2(- 2)r1s11C1YC1X}+

    ND1

    2 2 2

    2W

    2

    2s2

    2YC ]

    (20)

    Case 2

    When k= 2

    A = 0; B= 2; C= 0; 1= -2f; 2 = -2fW1r1; 3 = -2f; 4 = 0;

    3 =1

    11

    Wr ; 3 = 0;3 =; J1=22

    1 W ; J2=2

    1

    2

    1Wr ; J3 = )(Wr 12

    2

    11 ;

    [( yFT

    )D]k =2

    =

    X

    x

    N

    yNyN)( 4

    2211

    (21)

    B [( yFT

    )D]k =2

    =

    XYCCsrWN

    D)(Y 111112

    11

    11

    (22)

    M [( yFT

    )D]k =2

    =2

    Y [( 1)2 + 21W

    ND

    11

    {2

    2

    1s

    2

    1YC + r12 2

    1XC + 2(21)s

    1r1

    1C

    1YC

    1X}

    +

    N

    D1

    22 22W

    2

    2s 2

    2YC ]

    (23)

    Shukla et al. 83

    Case 3

    When k= 3

    A = 2; B= 2; C= 0; 1= 2(1- f); 2 = -2fW1r1; 3= 2(1-f); 4=0; 3 =

    f

    rfW

    1

    11

    3 = 0; 3 =f

    f

    1

    1 ; J1=

    2

    2

    1

    2

    1

    1

    f

    Wf

    ; J2 = 2

    2

    1

    2

    1

    2

    1 f

    rWf

    ; J3 =

    21

    2

    1

    1

    12

    f

    ffrfW

    ;

    [( yFT

    )D]k =3

    =

    X)f(

    xfX

    N

    yNyN)(

    1

    4

    2211

    (24)

    B [( yFT

    )D]k =3

    =

    XYCCsrWN

    D)()f(fY 111112

    11

    1 111

    (25)

    M [( yFT

    )D]k =3

    =2

    Y (1f)-2

    [f2(1)2+

    N

    D1

    1

    2

    1W {(1f)2 2

    1s2

    1YC + f2 2

    1r2

    1XC

    + 2(2ff 1)fr1s

    1

    1C

    1YC

    1X}+

    N

    D1

    2(1f)2 22W

    2

    2s 2

    2YC ]

    (26)

    Case 4

    When k= 4;

    A = 6; B= 0; C= 0 ; 1= 6; 2 = 0; 3 = 6; 4 = 0; 3 = 0; 3= 0; 3 =

    1; J1=2

    1W ; J2= 0; J3 = 0;

    [( yFT

    )D]k =4 =

    N

    yNyN 2211

    (27)

    B[( yFT

    )D]k =4

    = 0(28)

    V[( yFT

    )D]k =4

    = 22

    2

    22

    22

    2

    1

    2

    12

    11

    11YY CYW

    NDCYW

    ND

    (29)

    ESTIMATOR WITHOUT IMPUTATION

    Throughout the discussion, the assumption is unknown

    value of 2x . This is imputed by the term 42*

    x , to provide

    the generation of 4

    x [Equations (1) and (2)]. Suppose

    the 2x is known, then there is no need of imputation and

    the proposed (2) and (3) reduces into:

    N

    xNxNx 2211(*)

    (30)

  • 8/2/2019 African Journal Shukla Et Al

    7/12

    84 Afr. J. Math. Comput. Sci. Res.

    *

    (*)

    kwFT xCX)fBA(

    xfBX)CA(

    N

    yNyNy 2211

    (31)

    Where, kis a constant , 0 < k< and

    A = (k

    1) (k

    2); B= (k

    1) (k

    4); C= (k

    2) (k

    3) (k

    4); f= n/N.

    Theorem 4

    The estimator kwFT

    y is biased for Y with the amount

    of bias

    X

    '

    YX

    ''

    kwFTCrCsCrW

    NDyB 11211111

    2

    1121

    1

    X

    '

    YX CrCsCrWND 222222222

    22

    1

    Where,

    CfBA/fB' 1 ; CfBA/C' 2 .

    Proof

    The estimator kwFT

    y could be approximate like:

    *

    *

    kwFTxCXfBA

    xfBXCA

    N

    yNyNy 2211

    N

    eYNeYN 222111 11

    422311

    422311

    11

    11

    eXNeXNCXfBAN

    eXNeXNfBXCAN

    422311

    422311222111

    erWerWCCfBA

    erWerWfBCfBAeYWeYWY

    4223111222111 1 erWerWeYWeYWY

    ' 142231121

    erWerW'

    Expanding above using binominal expansion, and

    ignoring ljki ee terms for ( k+ l) > 2, (k, l= 0, 1, 2 ........),(i, j= 1, 2, 3, 4 ); the estimator result into

    212222111121 11 eYWeYWYYy kwFT (32)

    Where,

    1 = 42231121 erWerW'' ; 2 = 2

    422311212erWerW

    '''

    And

    W1 r1 +W2 r2 = 1 holds.

    Further, one can derive up to first order of approximationaccording to the following;

    (i) E [1] = 0

    (ii) E 21 =

    222

    2

    2

    2

    2

    2

    11

    2

    1

    2

    1

    2

    21

    11XX

    ''C

    N

    DrWC

    N

    DrW

    (iii) E 2 =

    222

    2

    2

    2

    2

    2

    11

    2

    1

    2

    1212

    11 XX

    '''C

    NDrWC

    NDrW

    (iv) E 11e = YX'' CCN

    DrW 111111211

    (v) E 12 e = YX'' CCN

    DrW 222222211

    (vi) E 21e = 10under0 n

    (vii) E 22 e = 10under0 n

    The bias of estimator without imputation is

    YyykwFTkwFT

    EB

    )1()1()( 212222111121 eYWeYWYE

    212221111

    EY)e(EYWeEYW

    X

    '

    YX

    ''CrYCYCrW

    ND 11111111

    2

    1121

    1

    X

    '

    YX CrYCYCrWND 221222222

    22

    1

    X

    '

    YX

    ''CrCsCrW

    NDY 21211111

    2

    1121

    1

    X

    '

    YX CrCsCrWN

    D 222222222

    22

    1

    Theorem 5

    The mean squared error of the estimator kwFT

    y is

    XYXYkwFT CCTTCTCTWN

    DYy 111212

    1

    2

    2

    2

    1

    2

    1

    2

    11

    2 21

    M

    XYXY CCSSCSCSWN

    D22221

    2

    2

    2

    2

    2

    2

    2

    1

    2

    222

    1

    Where,

    T1 = s1; T2 = 121 r'' ; S1 = s2; S2 = 221 r'' ;

  • 8/2/2019 African Journal Shukla Et Al

    8/12

    Proof

    2EM YyykwFTkwFT

    [( yFT

    )w]k

    2212222111121 11E eYWeYWY

    1111222222212121212 E2EEE eYYWeYWeYWY

    2121211222 E2E2 eeYYWWeYYW

    222

    2

    2

    2

    2

    2

    11

    2

    1

    2

    121

    2 112

    XX

    ''C

    NDrWC

    NDrWY

    11 222

    2

    2

    2

    2

    2

    11

    2

    1

    2

    1

    YY C

    NDsWC

    NDsW

    XY

    ''CC

    NDrWsW 1111112111

    12

    XY

    ''CC

    NDrWsW 2222222122

    12

    XYXY CCTTCTCTWN

    DY 11121212221212112 21

    XYXY CCSSCSCSWN

    D 222212

    2

    2

    2

    2

    2

    2

    1

    2

    22 21

    Remark 1

    At k= 1, k= 2, k= 3 and k= 4, there are some specialcases of non-imputed estimators with the respective biasand mean squared error as laid down with the following

    Case 5

    When k= 1

    A = 0 ; B= 0 ; C= 6 ; '1 = 0;'

    2 = 1; T1 = s1;

    T2 = -r1; S1 = s1; S2 =r2;

    *kwFT x

    X

    N

    yNyNy 2211

    1

    XYXkwFTCrCsCrW

    NDYy

    1111111

    2

    111

    1B

    XYXCrCsCrW

    ND

    2222222

    2

    22

    1

    XYXYkwFTCCrsCrCsW

    NDYy 111112121212121121 21M

    XYXY CCrsCrCsWN

    D 222222

    2

    2

    2

    2

    2

    2

    2

    2

    22 21

    Case 6

    When k= 2

    A = 0; B= 2; C= 0; '1 = 1;'

    2 = 0; T1 = s1; T2 = r1; S1= s2; S2 = r2.

    Shukla et al. 85

    X

    x

    N

    yNyNy

    kwFT

    *

    2221

    2

    YXYXkwFTCCrsW

    NDCCrsW

    NDYy 22222

    2

    2211111

    2

    112

    11B

    XYXY

    kwFT CCrsCrCsW

    NDYy

    11111

    2

    1

    2

    1

    2

    1

    2

    1

    2

    11

    2

    2

    21

    M

    XYXY CCrsCrCsWN

    D 222222

    2

    2

    2

    2

    2

    2

    2

    2

    22 21

    Case 7

    When k= 3

    A = 2; B= 2; C= 0; '1 = -f/(1-f)-1; '2 =0; T1= s1; T2= r1 f(1- f)

    -1

    S1= s2; S2= r2 f(1- f)-1

    ;

    Xf

    xXN

    yNyNykwFT )1(

    *

    22113

    XYkwFTCCsrW

    NDffYy

    11111

    2

    11

    1

    3

    11B

    XYCCsrW

    ND 22222

    2

    22

    1

    2

    1

    2

    1

    222

    1

    2

    1

    2

    11

    2

    31

    1M XYkwFT CrffCsWN

    DYy XYCCsff 1111112

    + 22222221

    YCsWN

    D

    XYX CCrsffCrff 222221222222 121

    Case 8

    When k= 4, then

    A = 6 ; B= 0 ; C= 0; '1 = 0;'

    2 = 0; T1= s1; T2= 0; S1= s2; S2= 0

    N

    yNyNy

    kwFT

    2211

    4

    0B4

    kwFT

    y

    2

    2

    2

    2

    2

    22

    2

    1

    2

    1

    2

    11

    2

    4

    11V YYkwFT CsWN

    DCsWN

    DYy

    NUMERICAL ILLUSTRATION

    Consider two populations I and II given in Appendix Aand B. Both populations are divided into two parts as R-group and NR-group having size N1 and N2 respectively(N= N1 + N2). The population parameters are displayed inTables 1 to 6.

    DISCUSSION

    Using the imputation for 2x by the mixture of three

  • 8/2/2019 African Journal Shukla Et Al

    9/12

    86 Afr. J. Math. Comput. Sci. Res.

    Table 1. Parameters of population - I (from Appendix A).

    Parameter Entire population For R-group For NR-group

    Size N= 180 N1 =100 N2= 80

    Mean Y 03.159Y 60.1731 Y 81.1402 Y

    Mean X 22.113X 45.1281 X 19.942 X

    Mean square Y 18.22052 YS 36.25322

    1 YS 90.12192

    2 YS

    Mean square X 61.19722 XS 86.23002

    1 XS 179242

    2 .S X

    Coefficient of variation of Y 295.0YC 2899.01 YC 248.02 YC

    Coefficient of variation of X 392.0X

    C 373.01 XC 323.02 XC

    Correlation coefficient 897.0XY 857.01 XY 956.02 XY

    Table 2. Parameters of population - ii (from Appendix B).

    Parameter Entire population For R-group For NR-group

    Size N= 150 N1 = 90 N2= 60Mean Y 77.63Y 33.661 Y 92.592 Y

    Mean X 2.29X 72.301 X 92.262 X

    Mean Square Y 87.2992 YS 33.3492

    1 YS 35.2062

    2 YS

    Mean Square X 43.1102 XS 67.112

    2

    1 XS 081002

    2 .S X

    Coefficient of Variation of Y 272.0YC 282.01 YC 2397.02 YC

    Coefficient of Variation of X 3599.0XC 345.01 XC 3716.02 XC

    Correlation coefficient 8093.0XY 8051.01 XY 8084.02 XY

    Let samples of size n = 40 and n = 30 are drawn from population I and II respectively by SRSWOR and post-stratified into R and NR-groups. The sample values are in Tables 3 and 4.

    Table 3. Sample values for population i.

    Parameter Entire sample R-group NR-group

    Size n= 40 n1 = 28 n2= 12

    Fraction f =0.22 - -

    Table 4. Sample values for population ii

    Parameter Entire sample R-group NR-group

    Size n= 30 n1 = 20 n2= 10

    Fraction f =0.2 - -

    population means is performed. The proposed class is inEquation (3) with bias in theorem 2 and mean squarederror in theorem 3. The class contains some specialimputed estimators for value k= 1, k= 2, k= 3 and k= 4.A non-imputed class of estimator is also developed whichhas bias and mean squared error derived in theorem 4and 5. This class also has some special cases. Thecomputation over two population is made whose

    description is given in appendix. The two random sampleof size n = 40 and n = 30 are drawn from populations and II respectively and post-stratified into two groupsThe Tables 5 and 6 are presenting a numericacomparison between imputation and non-imputation classover the two populations in terms of their bias and m. se. The imputation technique (1) is effective because thereis not much increase in the mean squared error due to

  • 8/2/2019 African Journal Shukla Et Al

    10/12

    Shukla et al. 87

    Table 5. Bias and M.S.E. comparisons of kDFT

    y .

    EstimatorPopulation I Population II

    Bias M.S.E. Bias M.S.E.

    [( yFT

    )D

    ]k

    =1

    -2.8775 18.6000 -1.1874 9.8618

    [( yFT

    )D

    ]k=2

    3.3127 232.4162 3.4360 41.4794

    [( yFT

    )D

    ]k=3

    -4.0370 8.4710 -0.6377 6.3540

    [( yFT

    )D

    ]k=4

    0 43.6500 0 9.2675

    Table 6. Bias and MSE comparison of kwFT

    y .

    EstimatorPopulation I Population II

    Bias M.S.E. Bias M.S.E.

    1kwFTy 0.1433 12.9589 0.1095 6.0552

    2kwFTy 0.3141 216.3024 0.1599 46.838

    3kwFT

    y 0.096 4.327 0.031 5.2423

    4kwFT

    y 0 43.65 0 9.2662

    imputation. The k= 3 seems to be a good choice. The k=2 performs worst for both sots of data.

    CONCLUSIONS

    The technique of mixture of X , 1X , 2X performs welland the imputed factor-type estimator is very close tonon-imputed in terms of mean squared error when k= 1,2, and 3 holds. The choice k= 3 is better over the othertwo. Performance over population II is superior than I. itseems that factor type lass is able to replace the non-responded observation in a nice manner.

    ACKNOWLEDGEMENT

    Authors are thankful to the referee for his valuablesuggestions.

    REFERENCES

    Agrawal MC, Panda KB (1993). An efficient estimator in post-stratification. Metron, 51(3-4): 179-187.

    Ahmed MS, Al-titi O, Al-Rawi Z, Abu-dayyeh W (2006). Estimation of apopulation mean using different imputation method. Stat. Transit.,7(6): 1247-1264.

    Cochran WG (2005). Sampling Techniques.Third Edition, John Willey &Sons, New Delhi.

    Grover RM, Couper MP (1998). Non-response in household surveysJohn Wiley and Sons, New York.

    Hansen MH, Hurwitz WN (1946). The problem of non-response insample surveys. J. Am. Stat. Assoc., 41: 517-529.

    Hansen MH, Hurwitz WN, Madow WG (1993). Sample Survey Methodsand Theory. John Wiley and Sons, New York.

    Hinde RL, Chambers RL (1990). Non-response imputation with multiplesource of non-response. J. Off. Stat., 7: 169-179.

    Holt D, Smith TMF (1979).Post-stratification. J. R. Stat. Soc., Ser. A.143: 33-46.

    Jackway PT, Boyce RA (1987).Response including techniques of maisurveys. Aus. J. Stat., 29: 255-263.

    Jagers P (1986). Post-stratification against bias in sampling. Int. StatRev., 54: 159-167.

    Jagers P, Oden A, Trulsoon L (1985). Post-stratification and ratioestimation. Int. Stat. Rev., 53: 221-238.

    Khare BB (1987). Allocation in stratified sampling in presence of non-response.Metron, 45(I/II): 213-221.

    Khot PS (1994). A note on handling non-response in sample surveys. JAm. Stat. Assoc., 89: 693-696.

    Lessler JT, Kalsbeek WD (1992). Non-response error in surveys. JohnWiley and Sons, New York.Murthy MN (1976). Sampling Theory and Methods. Statistica

    Publishing Society, Calcutta.Rao JNK, Sitter RR (1995). Variance estimation under two phase

    sampling with application to imputation for missing data. Biometrika82: 453-460.

    Rubin DB (1976). Inference and missing data. Biometrika, 63: 581-593.Sahoo LN (1984). A note on estimation of the population mean using

    two auxiliary variables. Ali. J. Stat., pp. 3-4, 63-66.Sahoo LN (1986). On a class of unbiased estimators using mult

    auxiliary information. J. Ind. Soc. Ag. Stat, 38(3): 379-382.Sahoo LN, Sahoo RK (2001). Predictive estimation of finite population

    mean in two-phase sampling using two auxiliary variables. Ind. SocAg. Stat., 54(4): 258-264.

  • 8/2/2019 African Journal Shukla Et Al

    11/12

    88 Afr. J. Math. Comput. Sci. Res.

    Sahoo LN, Sahoo J, Mohanty S (1995). New predictive ratio estimator.J. Ind. Soc. Ag. Stat., 47(3): 240-242.

    Shrivastava SK, Jhajj HS (1980). A class of estimators using auxiliaryinformation for estimation finite population variance. Sankhya C, 42:87-96.

    Shrivastava SK, Jhajj HS (1981). A class of the population mean in asurvey sampling using auxiliary information. Biometrika, 68: 341-343.

    Shukla D (2002). F-T estimator under two phase sampling. Metron,

    59(1-2): 253-263.Shukla D, Dubey J (2001). Estimation in mail surveys under PSNR

    sampling scheme. Ind. Soc. Ag. Stat., 54(3): 288-302.Shukla D, Dubey J (2004). On estimation under post-stratified two

    phase non-response sampling scheme in mail surveys. Int. J.Manage. Syst., 20(3): 269-278.

    Shukla D, Dubey J (2006). On earmarked strata in post-stratification.Stat. Transit., 7(5): 1067-1085.

    Shukla D, Trivedi M (1999). A new estimator for post-stratified samplingscheme. Proceedings of NSBA-TA (Bayesian Analysis), Eds. RajeshSingh, pp. 206-218.

    Shukla D, Trivedi M (2001). Mean estimation in deeply stratifiedpopulation under post-stratification. Ind. Soc. Ag. Stat., 54(2): 221-235.

    Shukla D, Trivedi M (2006). Examining stability of variability ratio withapplication in post-stratification. Int. J. Manage. Syst., 22(1): 59-70.

    Shukla D, Bankey A, Trivedi M (2002). Estimation in post-stratificationusing prior information and grouping strategy. Ind. Soc. Ag. Stat.,55(2): 158-173.

    Shukla D, Singh VK, Singh GN (1991). On the use of transformation infactor type estimator. Metron, 49(1-4): 359-361.

    Shukla D, Trivedi M, Singh GN (2006). Post-stratification two-waydeeply stratified population. Stat. Transit., 7(6): 1295-1310.

    Singh GN, Singh VK (2001). On the use of auxiliary information insuccessive sampling. J. Ind. Soc. Ag. Stat., 54(1): 1-12.

    Singh D, Choudhary FS (1986). Theory and Analysis of Sample SurveyDesign. Wiley Eastern Limited, New Delhi.

    Singh HP (1986). A generalized class of estimators of ratio product andmean using supplementary information on an auxiliary character inPPSWR scheme. Guj. Stat. Rev., 13(2): 1-30.

    Singh HP, Upadhyaya LN, Lachan R (1987). Unbiased produc

    estimators. Guj. Stat. Rev., 14(2): 41-50.Singh S, Horn S (2000). Compromised imputation in survey sampling

    Metrika, 51: 266-276.Singh VK, Shukla D (1987). One parameter family of factor type ratio

    estimator. Metron, 45(1-2): 273-283.Singh VK, Shukla D (1993). An efficient one-parameter family of factor

    type estimator in sample survey. Metron, 51(1-2): 139-159.Singh VK, Singh GN (1991). Chain type estimator with two auxiliary

    variables under double sampling scheme. Metron, 49: 279-289.Singh VK, Singh GN, Shukla D (1994). A class of chain ratio estimato

    with two auxiliary variables under double sampling scheme. SankhyaSer. B., 46(2): 209-221.

    Smith TMF (1991). Post-stratification. Statisticians, 40: 323-330.Sukhatme PV, Sukhatme BV, Sukhatme S, Ashok C (1984). Sampling

    Theory of Surveys with Applications. Iowa State University PressI.S.A.S. Publication, New Delhi.

    Wywial J (2001). Stratification of population after sample selection. StatTransit., 5(2): 327-348.

  • 8/2/2019 African Journal Shukla Et Al

    12/12

    Shukla et al. 89

    Appendix A. Population I (N= 180); R-group: (N1=100).

    Y: 110 75 85 165 125 110 85 80 150 165 135 120 140 135 145X: 80 40 55 130 85 50 35 40 110 115 95 60 70 85 115

    Y: 200 135 120 165 150 160 165 145 215 150 145 150 150 195 190

    X: 150 85 80 100 25 130 135 105 185 110 95 75 70 165 160

    Y: 175 160 165 175 185 205 140 105 125 230 230 255 275 145 125X: 145 110 135 145 155 175 80 75 65 170 170 190 205 105 85

    Y: 110 110 120 230 220 280 275 220 145 155 170 195 170 185 195

    X: 75 80 90 165 160 205 215 190 105 115 135 145 135 110 145

    Y: 180 150 185 165 285 150 235 125 165 135 130 245 255 280 150X: 135 110 135 115 125 205 100 195 85 115 75 190 205 210 105

    Y: 205 180 150 205 220 240 260 185 150 155 115 115 220 215 230

    X: 110 105 110 175 180 215 225 110 90 95 85 75 175 185 190

    Y: 210 145 135 250 265 275 205 195 180 115

    X: 170 85 95 190 215 200 165 155 150 175

    NR-group: (N2=80)

    Y: 85 75 115 165 140 110 115 13.5 120 125 120 150 145 90 105

    X: 55 40 65 115 90 55 60 65 70 75 80 120 105 45 65

    Y: 110 90 155 130 120 95 100 125 140 155 160 145 90 90 95

    X: 70 60 85 95 80 55 60 75 90 105 125 95 45 55 65

    Y: 115 140 180 170 175 190 160 155 175 195 90 90 80 90 80

    X: 75 105 120 115 125 135 110 115 135 145 45 55 50 60 50

    Y: 105 125 110 120 130 145 160 170 180 `145 130 195 200 160 110

    X: 65 75 70 80 85 105 110 115 130 95 65 135 130 115 55

    Y: 155 190 150 180 200 160 155 170 195 200 150 165 155 180 200

    X: 115 130 110 120 125 145 120 105 100 95 90 105 125 130 145

    Y: 160 155 170 195 200

    X: 120 115 120 135 150

    Appendix B. Population II (N=150); R-group (N1=90).

    Y: 90 75 70 85 95 55 65 80 65 50 45 55 60 60 95

    X: 30 35 30 40 45 25 40 50 35 30 15 20 25 30 40

    Y: 100 40 45 55 35 45 35 55 85 95 65 75 70 80 65

    X: 50 10 25 25 10 15 10 25 35 55 35 40 30 45 40

    Y: 90 95 80 85 55 60 75 85 80 65 35 40 95 100 55

    X: 40 50 35 45 35 25 30 40 25 35 10 15 45 45 25

    Y: 45 40 40 35 55 75 80 80 85 55 45 70 80 90 55

    X: 15 15 20 10 30 25 30 40 35 20 25 30 40 45 30

    Y: 65 60 75 75 85 95 90 90 45 40 45 55 60 65 60

    X: 25 40 35 30 40 35 40 35 15 25 15 30 30 25 20

    Y: 75 70 40 55 75 45 55 60 85 55 60 70 75 65 80

    X: 25 20 35 30 45 10 30 25 40 15 25 30 35 30 45NR-group (N2=60)

    Y: 40 90 95 70 60 65 85 55 45 60 65 60 55 55 45

    X: 10 30 30 30 25 30 40 25 15 20 30 30 35 25 20

    Y: 65 80 55 65 75 55 50 55 60 45 40 75 75 45 70

    X: 35 45 30 30 40 15 15 20 30 15 10 40 45 10 30

    Y: 65 70 55 35 35 50 55 35 55 60 30 35 45 55 65

    X: 30 40 30 10 15 25 30 15 20 30 10 20 15 30 30

    Y: 75 65 70 65 70 45 55 60 85 55 60 70 75 65 80

    X: 30 35 40 25 45 10 30 25 40 15 25 30 35 30 45