Announcement
  • AMS Conversation on Non-Academic Employment: Friday, January 15, 9:30am to 11am, Moscone West Room 2008
  • You can also drop by Booth No. 812 on Friday, January 15, 6-7 pm, to talk to the people there about the Certificate in Quantitative Finance (CQF), a 6-month, part-time course for those interested in derivatives, development, quantitative trading or risk management.
Structural Equation Modeling with New Development for Mixed Designs
Plus Introductions to Time Series Modeling, Bootstrap Resampling, & Partial Correlation Network Analysis (PCNA)
Kathryn Sharpe, Wei Zhu
Outline
  • Part I: SEM & Introduction to Time Series Models
  • A brief introduction to SEM with an eating disorder example
  • An fMRI time series study + AR and MV models in SEM notation
  • SEM for Mixed Designs with a PET study example
  • Part II: PCNA and Bootstrap Resampling
  • Partial Correlation Network Analysis
  • Bootstrap Resampling
Part I: SEM & Introduction to Time Series Models
1. A brief introduction to SEM with an eating disorder example
SEM Basics
  • SEM is a set of usually inter-related linear regression equations.
  • SEM without latent variables is called Path Analysis.
  • SEM is a confirmatory procedure. We cannot create hypotheses by way of SEM, only test a particular hypothesized model against a data set.
Assumptions of SEM
  • Large samples: SEM researchers suggest a sample size of at least ten times the number of parameters to be estimated.
  • The variables follow a multivariate normal distribution.
Now, we present a simple example of SEM.
The Hypothesis
We received a hypothesis and data from a psychologist interested in determining what factors in a young woman’s life influence her risk for developing an eating disorder.
“The following path is proposed as a representation of the relationship and inter-relationships among age of menarche, press for thinness, body image, and self-concept. Age at onset of menarche will lead to a more negative body image and self-concept which will lead to an increased press for thinness and an increase in eating disordered symptomatology.”
We can test the validity of her hypothesis using SEM.
The Data
Summary of variables incorporated into the model:
  • Age of first menstrual period
  • Body Image score
  • Self Concept score: measured differently for adolescents and for adults, so we have two separate models
  • Drive for thinness
  • Risk for developing an eating disorder
The psychologist evaluated many aspects of each participant’s life and combined the results to create a score for each variable.
Listwise/Pairwise Deletion
What do we do when we have missing measurements? There are several ways a researcher can deal with missing data values. Two of them are listed here. In both cases, we assume that any missing data are missing completely at random. If much data is lost in deletion, imputation should be considered.
  • Listwise Deletion: We delete a subject from the study if any of its measurement values are missing.
  • Pairwise Deletion: We omit from a particular calculation those subjects who do not have the corresponding measurement value. The subjects are present in any calculation for which their value exists.
In our study, we use listwise deletion to delete those subjects who do not have the age of first menstrual period variable. We lose less than 10% of our data in deletion.
Creating Two Models
We have two groups of subjects as classified by the grade of the participant. Grade is a variable included in the data but not used in the SEM model, as it was not present in the hypothesis.
  • If a subject has grade ≤ 12, she is an adolescent.
  • If a subject has grade >12, she is an adult.
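The two deletion rules and the grade split can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names (am, bi, sw, dt, rd, grade), and the helper functions are hypothetical and are not taken from the study's data or code.

```python
# Hypothetical records; None marks a missing measurement.
subjects = [
    {"am": 12.0, "bi": 30.0, "sw": 25.0, "dt": 4.0, "rd": 10.0, "grade": 10},
    {"am": None, "bi": 28.0, "sw": 22.0, "dt": 6.0, "rd": 12.0, "grade": 11},
    {"am": 13.5, "bi": None, "sw": 27.0, "dt": 3.0, "rd": 8.0, "grade": 14},
    {"am": 11.0, "bi": 35.0, "sw": 20.0, "dt": 9.0, "rd": 15.0, "grade": 13},
]
MODEL_VARS = ("am", "bi", "sw", "dt", "rd")


def listwise(records, variables):
    """Keep only subjects for whom every listed variable is observed."""
    return [r for r in records if all(r[v] is not None for v in variables)]


def pairwise(records, var_a, var_b):
    """Keep the subjects usable for one calculation involving two variables."""
    return [r for r in records if r[var_a] is not None and r[var_b] is not None]


def split_by_grade(records):
    """Grade <= 12 is treated as adolescent, grade > 12 as adult."""
    adolescents = [r for r in records if r["grade"] <= 12]
    adults = [r for r in records if r["grade"] > 12]
    return adolescents, adults


complete = listwise(subjects, MODEL_VARS)   # drop a subject if anything is missing
has_am = listwise(subjects, ("am",))        # drop only subjects missing age of first period
sw_dt = pairwise(subjects, "sw", "dt")      # subjects usable for the SW-DT covariance
adolescents, adults = split_by_grade(complete)
```

Pairwise deletion keeps more subjects per calculation, but each covariance is then based on a different subset, which is one reason a single listwise rule is simpler to reason about.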
The psychologist in this study first requested two separate models: one for adults and one for adolescents.
Path Diagrams
[Figure: two path diagrams, one for adolescents and one for adults, each over Age of Menstruation (AM), Body Image (BI), Self Worth (SW; Adolescent or Adult), Drive for Thinness (DT), and Risk for Disorder (RD).]
Directional arrows indicate cause and effect.
The Equations
Because there are no latent variables, we can write basic regression equations for each variable. Our system, as entered in the SAS code that follows, is:
BI = b1 AM + b2 SW + E1
SW = b3 AM + b4 BI + E2
DT = b5 BI + b6 SW + E3
RD = b7 DT + E4
SEM Programs
The most popular software packages for SEM are:
  • EQS (Peter Bentler)
  • AMOS
For this example, we will use PROC CALIS. It takes our linear equations (previous slide) and estimates the parameters for the model. Then it evaluates the goodness of fit of the model.
SAS Code
    /* SAS correlation procedure: COV requests covariances, NOCORR suppresses */
    /* Pearson correlations, OUTP= specifies the output dataset (type covariance matrix) */
    proc corr cov nocorr data=eddata outp=edcova(type=cov);
    run;

    /* PROC CALIS: use cov rather than corr */
    proc calis cov mod data=edcova;
      /* Give the linear equations describing the system */
      Lineqs
        bi = b1 am + b2 sw + E1,
        sw = b3 am + b4 bi + E2,
        dt = b5 bi + b6 sw + E3,
        rd = b7 dt + E4;
      /* Give variances of exogenous variables (here the error variances): we must set */
      /* each error variance equal to a parameter; otherwise it is assumed to be 0 */
      Std E1-E4 = The1-The4;
      /* BI and SW are correlated, so we must estimate the covariance of their error terms */
      Cov E1 E2 = Ps1;
    run;
Results of SAS Analysis: Adolescents
SAS gives us parameter estimates, error estimates, and t-values for each path included in the model. We use a t-test to determine which paths are significant. In addition, we can calculate the confidence interval. If zero is contained in the interval, then the path is not significant.
[Figure: adolescent path diagram over AM, BI, Adolescent SW, DT, and RD, with estimates and confidence intervals on the paths: -.1232 (-.77, .52); -1.9912 (-4.02, .04); -.0525 (-2.02, 1.92); -.1040 (-.27, .06); -.2699 (-1.33, .79); .3341 (.04, .63); .8292 (.72, .94).]
Results of SAS Analysis: Adults
Here again, we use a t-test and/or the confidence intervals to evaluate which paths are significant.
[Figure: adult path diagram over AM, BI, SW, DT, and RD, with estimates and confidence intervals on the paths: -.4157 (-2.89, 2.06); .2262 (-.46, .99); -.2998 (-2.19, 1.59); -.1289 (-.33, .07); .4255 (-.43, 1.28); .8285 (.55, 1.11); .8960 (.77, 1.03).]
Goodness of Fit
After reporting the parameter estimates, SAS reports many different measures of fit, so we can evaluate the model in any way we choose. The more measures we use to evaluate our model, the better. A good fit does not necessarily mean a perfect model. We can still have unnecessary variables or be missing important ones.
[Figure: SAS fit statistics output for the adolescent and adult models.]
By convention, a model is “good” if:
  • GFI > .90 (or .95),
  • small Chi-Square value,
  • large p-value,
  • RMSEA estimate close to zero.
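As a rough illustration of why a small chi-square goes together with an RMSEA near zero, the sketch below computes the RMSEA point estimate from a chi-square statistic using the commonly cited definition. The function name and the input numbers are hypothetical, and SAS may use a slightly different sample-size term (N rather than N - 1), so treat the values as indicative only.

```python
import math


def rmsea(chi_square, df, n_subjects):
    """RMSEA point estimate: sqrt(max(chi2 - df, 0) / (df * (N - 1)))."""
    if df <= 0 or n_subjects <= 1:
        raise ValueError("need df > 0 and more than one subject")
    # Excess of the chi-square over its degrees of freedom, floored at zero.
    excess = max(chi_square - df, 0.0)
    return math.sqrt(excess / (df * (n_subjects - 1)))


# Hypothetical fit results, not taken from the study.
good = rmsea(chi_square=2.1, df=3, n_subjects=120)   # chi-square below its df
poor = rmsea(chi_square=24.0, df=3, n_subjects=120)  # chi-square far above its df
```

When the chi-square does not exceed its degrees of freedom the estimate is exactly zero, and it grows as the chi-square grows relative to the degrees of freedom and the sample size.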
Part I: SEM & Introduction to Time Series Models
2. An fMRI time series study + AR and MV models in SEM notation
SEM for Brain Functional Pathway Analysis
SEM can be utilized to study brain functional pathways based on brain imaging studies, including positron emission tomography (PET) and functional magnetic resonance imaging (fMRI) studies. Each PET scan generates only one 3D image, while each fMRI scan generates a time series of images recording continuous brain functional activity.
Experiment: Study of Visual Attention fMRI Data
Visual Attention Task:
  • Subjects: 28 volunteers (14 males and 14 females)
  • Methods: three-ball tracking task
  • 6 identified cortical regions for the visual attention network: cerebellum (CEREB), left posterior parietal cortex (PPC), left anterior parietal cortex (APC), thalamus (THAL), supplementary motor area (SMA), left lateral prefrontal cortex (LPFC)
[Figure: task timeline. Attentive tracking, repeated 5 times: cue task, cue target balls, attentive tracking, response, show target balls; durations shown: 1.25 s, 1.5 s, 7.75 s, 1 s, 1 s. Passive viewing, repeated 5 times: static balls, cue task, static balls, passive viewing.]
Study Design
  • Data: fMRI data at the coordinates corresponding to the 6 regions, from each individual data set.
  • Extracting the activation (onset) conditions.
  • Hemodynamic delay: the first two time points (2 TR, 6 sec) at each onset.
[Figure: block design with alternating ON/OFF and PASSIVE/ACTIVE labels; six blocks of 60 sec and 20 images each.]
Structure of fMRI Data
[Figure: structure of the fMRI data.]
Contemporaneous Pathway
The initial path model: consideration of the prior literature and inspection by experts identified 7 anatomically possible directional paths in the left brain hemisphere.
[Figure: path diagram over LPFC, SMA, APC, THAL, PPC, and CEREB.]
Longitudinal Pathways Modeled by MAR(1)
[Figure: longitudinal paths from time t-1 to time t.]
Unified SEM of Visual Attention fMRI Data
  • The original model with the longitudinal relations and the contemporaneous relations in the same path diagram.
  • This model is used as the original model through all approaches to the visual attention fMRI data.
[Figure: unified path diagram over SMA, LPFC, THAL, APC, PPC, and CEREB at times t-1 and t.]
Autoregressive & Moving Average Models
  • AR(1) Residuals
  • MV(1) Residuals
  • AR(1) Response
Autoregressive & Moving Average Models in SEM Notation
[Figures: AR(1) Residuals, MV(1) Residuals, and AR(1) Response in SEM notation.]
Autoregressive & Moving Average Models
  • AR(k) Residuals
  • MV(k) Residuals
  • AR(k) Response
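For readers who have not met these processes, here is a minimal Python sketch of the first-order cases, reading MV(1) as a moving average of order 1. The symbols phi and theta, the function names, and the unit-shock input are assumptions made for illustration; the slides' own equations were not recoverable, and the sketch shows only the basic processes, not the residual-versus-response distinction drawn in the diagrams.

```python
def ar1(shocks, phi, start=0.0):
    """AR(1): y[t] = phi * y[t-1] + e[t]."""
    series, previous = [], start
    for e in shocks:
        previous = phi * previous + e
        series.append(previous)
    return series


def ma1(shocks, theta, start_shock=0.0):
    """MA(1): y[t] = e[t] + theta * e[t-1]."""
    series, previous_shock = [], start_shock
    for e in shocks:
        series.append(e + theta * previous_shock)
        previous_shock = e
    return series


shocks = [1.0, 0.0, 0.0, 0.0]       # a single unit shock at the first time point
ar_path = ar1(shocks, phi=0.5)      # the shock decays geometrically
ma_path = ma1(shocks, theta=0.5)    # the shock is felt for exactly one extra period
```

The contrast in the two outputs is the main point: an autoregressive term carries a shock forward indefinitely with shrinking weight, while a moving average term of order 1 forgets it after one step. The order-k versions add k lagged terms in the same way.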
[Figures: a sales chart extended slide by slide, with the forecaster's running commentary.]
  • We’ll guess same as last month
  • We’ll guess same as last month plus a little more for a possible trend
  • This is easy, who needs forecasting
  • Continue with our successful method: guess the same as last month plus a little more for a possible trend
  • Definitely looks like a trend
  • Trend might be a tad steeper than I thought
  • Oops
  • Momentary deviation, trend will continue
  • See, I told you this was easy!
  • Trend will continue
  • Oops, another momentary fluctuation
  • Trend should continue
  • Oh oh!
  • Sales have leveled off: let’s average the last few points
  • Oh oh, maybe things are going downhill
  • Let’s be conservative and assume a negative trend
  • Thank goodness, we are still basically level
  • We’ll guess same as last month
  • This stuff is easy
  • We have for sure leveled off
  • Big trouble!!! Chief forecaster Smith and CEO Smothers fired!
  • New chief forecaster points out the obvious trend
  • Remarkable turnaround in sales. New CEO Smithers given credit
  • Still looks like a trend to me
  • Maybe not!
  • Level except for anomaly
  • Have things turned around?
  • I’ll hedge my bets
  • Things have turned around. Perhaps Smithers truly is a genius
  • Trend up!
  • Not bad!
  • Revise trend a tad
  • Smithers makes cover of Fortune
  • This is easy!!
  • No big deal, trend continues (in an unrelated matter Smithers cashes out stock options)
  • Heads will surely roll soon
  • Let’s be cautiously optimistic
  • Smithers called before board
  • Perhaps we overreacted
  • We will guess level
  • Back to normal!
  • Smithers fired!
Part I: SEM & Introduction to Time Series Models
3. SEM for Mixed Designs with a PET study example
Structural Equation Modeling
Structural equation modeling can be used to estimate the strength of the relationships between variables. Regression-style equations are generated from a path diagram, as shown below.
[Figure: path diagram and the regression-style equations generated from it.]
Note: An endogenous (dependent, Y) variable (shown in blue) is one being pointed to (influences present in the model).
An exogenous (independent, X) variable (shown in pink) is one with no arrows coming in (any existing influences are not present in the system under consideration). All variables are centered about their means.
Matrix Form and Estimation
The model is:
Y = ΒY + ΓX + ζ
where Β is a matrix containing the path coefficients from endogenous variables, Y, to endogenous variables, Γ is a matrix containing the path coefficients from exogenous variables, X, to endogenous variables, Y, and ζ is the vector of errors in the equations.
We are interested in the correctness of the model. If the model is correct, and we know the values of the parameters, then the population covariance matrix should be reproduced exactly. So our fundamental hypothesis for these models is: does the covariance matrix implied by the model reproduce the population covariance matrix?
Maximum Likelihood Estimation
We estimate the parameters through the maximum likelihood method, assuming the measured variables are distributed multivariate normal. The joint distribution function is shown below.
[Equation: joint multivariate normal density.]
When observations of each variable are taken from N subjects, the likelihood function is:
[Equation: likelihood function for N subjects.]
The traditional fit function is a manipulation of the likelihood function. We minimize the following function over all model parameters to choose their values.
[Equation: fit function.]
We can calculate a standard error for each parameter to use in analysis of significance (each parameter has an asymptotic normal distribution). The standard errors come from the asymptotic covariance matrix:
[Equation: asymptotic covariance matrix.]
Example Results for Brain Functional Pathway Example
Using this methodology, we can evaluate the hypothesized pathway for data from a set of independent subjects, with their measurements for a set of variables recorded under a single condition (normal subjects with no stimulus, for example). Results of our example are shown at the right.
[Figure: estimated path diagram for the example.]
Red paths indicate significant positive relationships at the α = .05 significance level (one-sided).
The top value is the path coefficient, the second value is the corresponding standard error, and the third value is the corresponding t or z value.
SEM Software Capability
  • SAS: PROC CALIS implements estimation of single-group SEMs. As of version 9.1.3, SAS has no multiple group comparison capability. In version 9.2, SAS has incorporated a new procedure called TCALIS that is capable of the multiple group comparisons other software does. (SAS technical support)
  • LISREL: Implements estimation of single-group SEMs. LISREL does have multiple group comparison capability, by constraining parameters to be equal in the goodness-of-fit comparison framework. LISREL uses less intuitive keywords, and its code-generating user interface is not always accurate. (Joreskog 1996)
  • EQS: Implements estimation of single-group SEMs. EQS also has multiple group comparison capability, by constraining parameters to be equal in the goodness-of-fit comparison framework. The user interface is easy to use and understand and generates code accurately. EQS is also capable of estimating multilevel SEMs. (Byrne 2006)
These programs are good if your model fits your data well, but they are only appropriate for independent group analyses. We need a new method for correlated data.
Example of Mixed Design Data: The Brain Reward Study
The reward pathway is of interest. We are interested in whether group membership and expectation affect the strength of each path.
[Figure: design layout. Normal (Group 0) and Abusers (Group 1), each under No Drug and Drug.]
The study measures two groups of subjects, 16 normal subjects and 25 cocaine abusers, each under two conditions. Subjects are told they will receive either placebo or methylphenidate (a cocaine-like substance) and they receive placebo before the PET scan is taken. As a result, there are two factors of interest: group membership and drug expected.
The regions of interest are: the amygdala (AMYG), orbital frontal cortex (OFC), anterior cingulate gyrus (CG), ventral striatum (VS2), thalamus (THAL), insula (INS), putamen (PUT), and motor frontal cortex (MFC).
Existing Methods for Multiple Group or Repeated Measures Analysis
Our data consist of measurements from two groups, each evaluated under two conditions, so we will need an analysis method that can manage both. To date, multiple groups and repeated measures are not handled at the same time. This is illustrated in the title of chapter four of John C. Loehlin’s text, Latent Variable Models, “Fitting Models Involving Repeated Measures or Multiple Groups.” This text, published in 2004, reports that models can have either structure, but not both.
Currently, there is only one method available for comparison of independent groups in the context of SEM. This Nested Goodness-of-Fit method is implemented in the current SEM software as the only option for multiple-group comparison (Jaccard and Wan 1996). The current SEM model for repeated measures data is known as Latent Growth Modeling (Meredith and Tisak 1990).
SEM for Mixed Designs: Multiple Groups with Repeated Measures
For each path in the hypothesized diagram, we reparametrize the path coefficient to reflect possible changes from the normal subject receiving placebo due to group membership and receiving methylphenidate.
  • Dataset 1: Normal subjects receiving placebo (group=0, drug=0)
  • Dataset 2: Normal subjects receiving methylphenidate (group=0, drug=1)
  • Dataset 3: Cocaine abusers receiving placebo (group=1, drug=0)
  • Dataset 4: Cocaine abusers receiving methylphenidate (group=1, drug=1)
SEM for Mixed Designs: Brain Reward Study
Original model:
[Equation: original model.]
Because we have reparametrized path coefficients with multipliers of 0 and 1, for each group and condition combination, we will have a model that contains summed path coefficients preceding each variable.
Therefore, the variables in the model are still distributed as multivariate normal.
Mixed Model (2 Groups under 2 Conditions):
[Equation: mixed model.]
Mixed Model, considering group 1 under drug 0:
[Equation: mixed model for group 1 under drug 0.]
Implementation of SEM for Mixed Designs
We can construct the Β and Γ matrices as shown below. (Below we consider only group 0, as group 1’s matrices can be constructed similarly.) Once we have Φ and Ψ, shown on the next slide, we can construct the implied model covariance matrix as in the single-group case, shown below.
[Equations: Β and Γ matrices for group 0, and the implied model covariance matrix.]
Correlating the Errors in the Equations: Ψ
We must address the fact that we are not dealing simply with independent groups, but with two groups measured under two conditions each. Two sets of measurements taken from the same group will be correlated, so we must incorporate this correlation into the model. We will incorporate the correlation of subjects measured under two conditions by correlating the errors in the equations for each subject. For example, the Ψ matrix for group 0 would be:
[Equation: Ψ matrix for group 0.]
Correlating the Errors of the Independent Variables: Φ
Additionally, to incorporate the correlation of the repeated measures, we will correlate the errors of the independent variables, X, through the Φ matrix. To do so, we add terms to the Φ matrix that correlate the error vectors for the repeated measures, as shown below. X1 contains measures of group 0 under both conditions. The two conditions’ errors should be correlated, as should those for group 1. We allow all errors to be correlated freely.
[Equation: Φ matrix.]
Estimation of SEM for Mixed Designs
We can consider the likelihood function to be a product of the likelihood functions of the two independent groups. Each group will have 18 variables (9 for each condition) that are distributed as multivariate normal. Note: All variables have been centered about their means. Consider the 18 variables that comprise measurements from group 0.
Their joint distribution is
[Equation: joint multivariate normal density of Z.]
We take N observations of Z, from N different subjects from our chosen group, and the resulting likelihood function for Z is
[Equation: likelihood function for Z.]
The likelihood function for group 1 can be constructed in the same way. Then to create the full likelihood function, we can multiply the independent group likelihoods.
Estimation of SEM for Mixed Designs
As in the single-group SEM derivation, we can manipulate the likelihood function into a fitting function of the traditional format:
[Equation: fitting function.]
The standard errors of the parameter estimates are the square roots of the respective diagonal elements of the matrix shown below.
[Equation: asymptotic covariance matrix of the parameter estimates.]
Numerical Implementation of SEM for Mixed Designs
  • A MATLAB script file creates the list of parameters to be estimated and sets starting values (as suggested by Bollen).
  • The script calls built-in MATLAB function fminunc to minimize the likelihood-based fitting function derived on the previous slide, yielding the parameter estimates.
  • We calculate standard errors
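The workflow in these bullets (list the parameters, pick starting values, minimize a likelihood-based fitting function) can be sketched in miniature. Everything here is an assumption made for illustration: the one-equation model y = gamma*x + zeta, the sample covariance matrix S, the starting values, and the fit function, which is taken to be the standard ML discrepancy ln|Σ(θ)| + tr(SΣ(θ)⁻¹) - ln|S| - p. The slides use MATLAB's fminunc on the mixed-design model; this sketch swaps in a simple derivative-free compass search written in plain Python and does not reproduce the authors' script.

```python
import math


def implied_cov(gamma, phi, psi):
    """Model-implied covariance of (y, x) for y = gamma*x + zeta,
    with var(x) = phi and var(zeta) = psi."""
    return [[gamma * gamma * phi + psi, gamma * phi],
            [gamma * phi, phi]]


def f_ml(theta, S):
    """ML discrepancy for two observed variables (p = 2)."""
    gamma, phi, psi = theta
    if phi <= 0 or psi <= 0:
        return math.inf  # keep the search inside the admissible region
    (a, b), (_, d) = implied_cov(gamma, phi, psi)
    det_sigma = a * d - b * b
    det_s = S[0][0] * S[1][1] - S[0][1] ** 2
    # trace(S * Sigma^-1), using Sigma^-1 = [[d, -b], [-b, a]] / det_sigma
    trace = (S[0][0] * d - 2 * S[0][1] * b + S[1][1] * a) / det_sigma
    return math.log(det_sigma) + trace - math.log(det_s) - 2


def compass_search(f, start, step=0.5, tol=1e-6, max_sweeps=100000):
    """Try +/- step along each coordinate; halve the step when nothing improves."""
    theta, best = list(start), f(list(start))
    for _ in range(max_sweeps):
        if step < tol:
            break
        improved = False
        for i in range(len(theta)):
            for delta in (step, -step):
                trial = theta[:]
                trial[i] += delta
                value = f(trial)
                if value < best:
                    theta, best, improved = trial, value, True
        if not improved:
            step /= 2
    return theta, best


# Hypothetical sample covariance matrix of (y, x) and starting values.
S = [[2.0, 0.8], [0.8, 1.0]]
(gamma_hat, phi_hat, psi_hat), f_min = compass_search(lambda t: f_ml(t, S), [0.0, 0.5, 0.5])
```

Because this toy model is saturated, the minimum of the fit function is zero and the estimates reproduce S exactly (gamma = 0.8, phi = 1.0, psi = 1.36), which makes it easy to confirm that the optimizer has converged. A gradient-based routine such as fminunc would normally be preferred for models with many parameters.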