MichaelBishop + statistics   382

Fixed effects and identification « Statistical Modeling, Causal Inference, and Social Science
It is possible that I’m completely wrong, but I am somewhat surprised by the comments so far. I thought I immediately understood what people meant by “I prefer fixed effects over random effects because I care about identification”, yet no one has attempted a translation. As Andrew has pointed out in a paper, “fixed effects” is used with various meanings, but I suppose that what’s meant here is the standard usage: fixed effects as unit dummy variables (and perhaps, additionally, time dummies) that control for unmeasured variance between units (points in time) to the extent that it is stable over time (units). This presupposes longitudinal data, although some authors have presented models that use, for example, family fixed effects (dummies). In econometrics, “the identification problem” refers to the problem of being able to make causal claims on the basis of observational data; fixed effects are thought to help with that because they control for an extra portion of the variance (see Stuart Buck’s example above). For a programmatic view, see Halaby, Charles N., 2004: “Panel Models in Sociological Research”, Annual Review of Sociology 30: 507-44; for the technique’s limitations, see Bjerk, David, 2009: “How Much Can We Trust Causal Interpretations of Fixed-Effects Estimators in the Context of Criminality?” Journal of Quantitative Criminology 25: 391-417.
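
To make the "unit dummies" reading concrete, here is a minimal sketch in Python with NumPy. Everything in it is an assumption made for illustration (the simulated panel, the true slope of 2.0, and the helper names `simulate_panel`, `pooled_slope`, `within_slope`, `lsdv_slope`); none of it comes from the comment or the cited papers. It simulates a stable unmeasured unit trait that is correlated with the regressor, then compares pooled OLS with two equivalent fixed-effects estimators: demeaning within units, and regressing on one dummy per unit.

```python
import numpy as np

def simulate_panel(n_units=200, n_periods=5, beta=2.0, seed=0):
    """Hypothetical panel: a stable unit trait drives both x and y."""
    rng = np.random.default_rng(seed)
    unit = np.repeat(np.arange(n_units), n_periods)
    alpha = rng.normal(size=n_units)               # unmeasured, stable over time
    x = alpha[unit] + rng.normal(size=unit.size)   # regressor correlated with the trait
    y = beta * x + 3.0 * alpha[unit] + rng.normal(size=unit.size)
    return unit, x, y

def pooled_slope(x, y):
    """OLS slope ignoring units; absorbs the omitted unit trait."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

def within_slope(unit, x, y):
    """Fixed-effects slope via demeaning x and y within each unit."""
    n = unit.max() + 1
    counts = np.bincount(unit, minlength=n)
    xd = x - (np.bincount(unit, weights=x, minlength=n) / counts)[unit]
    yd = y - (np.bincount(unit, weights=y, minlength=n) / counts)[unit]
    return (xd @ yd) / (xd @ xd)

def lsdv_slope(unit, x, y):
    """Fixed-effects slope via one dummy variable per unit (no intercept)."""
    dummies = (unit[:, None] == np.arange(unit.max() + 1)).astype(float)
    X = np.column_stack([x, dummies])
    return np.linalg.lstsq(X, y, rcond=None)[0][0]

if __name__ == "__main__":
    unit, x, y = simulate_panel()
    print("pooled OLS slope:", pooled_slope(x, y))
    print("within slope:    ", within_slope(unit, x, y))
    print("dummy-var slope: ", lsdv_slope(unit, x, y))
```

In this simulation the dummy-variable and within estimates coincide, and both sit near the assumed true slope, while the pooled estimate is pulled away by the omitted trait. The sketch only covers confounders that are constant within a unit; a trait that changes over time would not be removed.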
fixed-effects  statistics  mixed_models  HLM  Andrew_Gelman 
6 weeks ago by MichaelBishop
Graphing Likert scale responses « Statistical Modeling, Causal Inference, and Social Science
dk says:
October 23, 2010 at 8:57 am
consider using z-score for y-axis. Often the likert measures don't have much intrinsic meaning; worse, they *appear* to have more meaning intrinsically than they do (it's a mistake to assume "slightly agree" vs. "slightly disagree," e.g., is some critically important division for opinion in the world; maybe it is, but probably it isn't). Usually what the likert measure is good for is extracting meaningful effects (of a treatment, say, or of some individual characteristic) by capturing observable degrees of variance in some attitude or latent disposition that you have independent reason to think matters in the world (debates over the optimal size of likert scales are focused on this: the tradeoff between effect size & noise as the number of points on the scale increases). Using the z-score transformation of the measure for the y-axis focuses attention on *that* because readers can see how large the variance was between conditions or between different subjects relative to variance across the sample, & aren't tempted to attribute meaning to arbitrary units in the raw likert measure. Also, using z-score as y-axis avoids potentially incorrect interpretations based on where scores fall on the mean or how many likert units the differences between conditions or groups span. If, e.g., the sample mean on an 11-pt measure is 6, & you have two conditions that have means of 5 & 7 & SEMs of 0.05, people will think, "gee, everybody is pretty average & there's really not much difference between subjects in the two conditions." Sigh. Preempting this inference is what motivates people to truncate their y-axis, leading others to say, "hey, don't do that! That's creating a misleading view of your effect size!" Well, *not* doing it can be misleading too if people are unable to get a good sense of variance & effect sizes from looking at bars plotted over the whole scale.
So use a z-score, usually with -1 & +1 as lower & upper bounds (nothing misleading about that), & plot your data (in the form of bars w/ CIs or whatever) within that.
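
A minimal sketch of the transformation dk describes, in Python with NumPy. The ratings are made up to mirror the 11-point example (condition means of 5 and 7, sample mean of 6), and the helper name `zscore_summary` is hypothetical. It standardizes each rating against the whole sample and then reports each condition's mean and standard error in z units, which are the numbers one would plot as bars with error bars.

```python
import numpy as np

def zscore_summary(ratings, condition):
    """Return {condition: (mean z, SEM of z)}, standardizing on the full sample."""
    ratings = np.asarray(ratings, dtype=float)
    condition = np.asarray(condition)
    # z-score against the pooled sample mean and SD, not per condition
    z = (ratings - ratings.mean()) / ratings.std(ddof=1)
    summary = {}
    for c in np.unique(condition):
        zc = z[condition == c]
        summary[str(c)] = (zc.mean(), zc.std(ddof=1) / np.sqrt(zc.size))
    return summary

if __name__ == "__main__":
    # Made-up data: condition A centered on 5, condition B centered on 7
    ratings = [3, 4, 5, 6, 7] * 20 + [5, 6, 7, 8, 9] * 20
    condition = ["A"] * 100 + ["B"] * 100
    for name, (mean_z, sem_z) in zscore_summary(ratings, condition).items():
        print(name, round(mean_z, 3), round(sem_z, 3))
```

Because the standardization uses the pooled sample, the two condition means land symmetrically around zero, and the gap between them is expressed relative to how much respondents vary overall.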
Carl says:
October 25, 2010 at 7:36 am
Thanks dk – that was a really good suggestion and I'm trying it for my survey data results.

Thom says:
November 11, 2010 at 4:55 am
I'd have to disagree with dk's suggestion. If scales "don't have much intrinsic meaning" then taking z scores doesn't add meaning. They merely express the scores in terms of the sample SD. This can differ for all sorts of reasons that have nothing to do with what you are measuring (e.g., if the ratings are high or low the z scores will be bigger because of ceiling or floor effects flattening the SD).
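
The ceiling-effect point can be illustrated with a small simulation in Python with NumPy. This is a sketch under stated assumptions, not anything from the thread: ratings are modeled as normally distributed latent attitudes rounded onto a 7-point scale and clipped at its ends, and the helper names `likert_sd` and `raw_gap_in_z_units` are hypothetical.

```python
import numpy as np

def likert_sd(latent_mean, latent_sd=1.5, points=7, n=5000, seed=0):
    """Sample SD of ratings when latent attitudes are rounded and clipped to 1..points."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(latent_mean, latent_sd, size=n)
    ratings = np.clip(np.rint(latent), 1, points)
    return ratings.std(ddof=1)

def raw_gap_in_z_units(raw_gap, sd):
    """Express a fixed raw-scale difference in units of the sample SD."""
    return raw_gap / sd

if __name__ == "__main__":
    sd_mid = likert_sd(latent_mean=4.0)      # attitudes centered mid-scale
    sd_ceiling = likert_sd(latent_mean=6.5)  # attitudes pressed against the top
    print("SD mid-scale:   ", sd_mid)
    print("SD near ceiling:", sd_ceiling)
    print("0.5-point gap in z units, mid:    ", raw_gap_in_z_units(0.5, sd_mid))
    print("0.5-point gap in z units, ceiling:", raw_gap_in_z_units(0.5, sd_ceiling))
```

Under these assumptions the same half-point raw difference looks larger in z units when responses pile up at the ceiling, because the sample SD shrinks there.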

Furthermore if the scales are of a disagree/agree type, the most psychologically important information on the scale is probably whether they disagree or agree and this can be obscured by the z scoring.

I do agree that the variability is important and that plotting with CIs is sensible (and z scores with CIs is probably better than raw scores without CIs). However, confounding size of effect with its variability in the sample is problematic for interpretation and decreases transparency.
likert  survey  visualization  statistics  R  Andrew_Gelman 
9 weeks ago by MichaelBishop