cshalizi + visual_display_of_quantitative_information 60
PM's Question Time: The price elasticity of labor-saving devices
7 weeks ago by cshalizi
"Fourth, the presentist bias in this chart is extreme in two ways. First, we forget things that we don't count as "technology" anymore (e.g., toilets, coal furnaces, sewing machines), and so they are left off. Second, we don't know what innovations are at low levels of adoption right now--imagine someone in 1960 trying to predict the adoption arc for personal computers!--and so our current rates of adoption are vastly overestimated compared to what the same chart will look like in 50 years."
to:blog
visual_display_of_quantitative_information
technological_change
the_present_before_it_was_widely_distributed
7 weeks ago by cshalizi
Taylor & Francis Online :: Graphical Diagnostics for Markov Models for Categorical Data - Journal of Computational and Graphical Statistics - Volume 20, Issue 2
8 weeks ago by cshalizi
"Markov models are widely used as a method for describing categorical data that exhibit stationary and nonstationary autocorrelation. However, diagnostic methods are a largely overlooked topic for Markov models. We introduce two types of residuals for this purpose: one for assessing the length of runs between state changes, and the other for assessing the frequency with which the process moves from any given state to the other states. Methods for calculating the sampling distribution of both types of residuals are presented, enabling objective interpretation through graphical summaries. The graphical summaries are formed using a modification of the probability integral transformation that is applicable for discrete data. Residuals from simulated datasets are presented to demonstrate when the model is, and is not, adequate for the data. The two types of residuals are used to highlight inadequacies of a model posed for real data on seabed fauna from the marine environment."
to:NB
visual_display_of_quantitative_information
statistics
markov_models
to_teach:undergrad-ADA
8 weeks ago by cshalizi
Taylor & Francis Online :: Dissimilarity Plots: A Visual Exploration Tool for Partitional Clustering - Journal of Computational and Graphical Statistics - Volume 20, Issue 2
8 weeks ago by cshalizi
"For hierarchical clustering, dendrograms are a convenient and powerful visualization technique. Although many visualization methods have been suggested for partitional clustering, their usefulness deteriorates quickly with increasing dimensionality of the data and/or they fail to represent structure between and within clusters simultaneously. In this article we extend (dissimilarity) matrix shading with several reordering steps based on seriation techniques. Both ideas, matrix shading and reordering, have been well known for a long time. However, only recent algorithmic improvements allow us to solve or approximately solve the seriation problem efficiently for larger problems. Furthermore, seriation techniques are used in a novel stepwise process (within each cluster and between clusters) which leads to a visualization technique that is able to present the structure between clusters and the micro-structure within clusters in one concise plot. This not only allows us to judge cluster quality but also makes misspecification of the number of clusters apparent. We give a detailed discussion of the construction of dissimilarity plots and demonstrate their usefulness with several examples. Experiments show that dissimilarity plots scale very well with increasing data dimensionality."
to:NB
visual_display_of_quantitative_information
clustering
data_mining
to_teach:data-mining
8 weeks ago by cshalizi
Taylor & Francis Online :: Functional Boxplots - Journal of Computational and Graphical Statistics - Volume 20, Issue 2
8 weeks ago by cshalizi
"This article proposes an informative exploratory tool, the functional boxplot, for visualizing functional data, as well as its generalization, the enhanced functional boxplot. Based on the center outward ordering induced by band depth for functional data, the descriptive statistics of a functional boxplot are: the envelope of the 50% central region, the median curve, and the maximum non-outlying envelope. In addition, outliers can be detected in a functional boxplot by the 1.5 times the 50% central region empirical rule, analogous to the rule for classical boxplots. The construction of a functional boxplot is illustrated on a series of sea surface temperatures related to the El Niño phenomenon and its outlier detection performance is explored by simulations. As applications, the functional boxplot and enhanced functional boxplot are demonstrated on children growth data and spatio-temporal U.S. precipitation data for nine climatic regions, respectively. This article has supplementary material online."
to:NB
visual_display_of_quantitative_information
statistics
functional_data_analysis
8 weeks ago by cshalizi
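A minimal sketch of the band-depth ordering the functional boxplot abstract above relies on, assuming bands formed by pairs of curves and curves sampled on a common grid. The function names and toy curves are hypothetical, and the 1.5-times outlier rule is not implemented here; this is an illustration, not the paper's code.

```python
from itertools import combinations
from math import ceil


def band_depth(curves):
    """Fraction of curve pairs whose pointwise min/max band contains each curve.

    curves: list of equal-length lists of numbers (hypothetical input format).
    """
    pairs = list(combinations(range(len(curves)), 2))
    depths = []
    for c in curves:
        inside = 0
        for i, j in pairs:
            a, b = curves[i], curves[j]
            # The curve counts only if it stays inside the band at every grid point.
            if all(min(x, y) <= v <= max(x, y) for x, y, v in zip(a, b, c)):
                inside += 1
        depths.append(inside / len(pairs))
    return depths


def central_region(curves, fraction=0.5):
    """Deepest curve plus the pointwise envelope of the deepest `fraction` of curves."""
    depths = band_depth(curves)
    order = sorted(range(len(curves)), key=lambda i: depths[i], reverse=True)
    keep = order[:max(1, ceil(fraction * len(curves)))]
    grid = range(len(curves[0]))
    lower = [min(curves[i][t] for i in keep) for t in grid]
    upper = [max(curves[i][t] for i in keep) for t in grid]
    return curves[order[0]], lower, upper


# Toy data: five flat curves at levels 0..4 (made up for illustration).
toy = [[float(level)] * 3 for level in range(5)]
median, lower, upper = central_region(toy)
```

The ordering is over whole curves rather than over values at each grid point, which is where this differs from drawing a separate classical boxplot at every value of the independent variable.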
Four Ways to Slice Obama’s 2013 Budget Proposal - Interactive Feature - NYTimes.com
february 2012 by cshalizi
Not sure how useful this is as an actual visualization, but very nice as eye candy. (And, if you look at the department totals, as an illustration of "an insurance company with an army".)
visual_display_of_quantitative_information
us_politics
economic_policy
via:flowing_data
to:blog
february 2012 by cshalizi
A General Framework for Dimensionality-Reducing Data Visualization Mapping
february 2012 by cshalizi
"In recent years, a wealth of dimension-reduction techniques for data visualization and preprocessing has been established. Nonparametric methods require additional effort for out-of-sample extensions, because they provide only a mapping of a given finite set of points. In this letter, we propose a general view on nonparametric dimension reduction based on the concept of cost functions and properties of the data. Based on this general principle, we transfer nonparametric dimension reduction to explicit mappings of the data manifold such that direct out-of-sample extensions become possible. Furthermore, this concept offers the possibility of investigating the generalization ability of data visualization to new data points. We demonstrate the approach based on a simple global linear mapping, as well as prototype-based local linear mappings. In addition, we can bias the functional form according to given auxiliary information. This leads to explicit supervised visualization mappings with discriminative properties comparable to state-of-the-art approaches."
in_NB
dimension_reduction
visual_display_of_quantitative_information
data_analysis
data_mining
manifold_learning
to_teach:data-mining
february 2012 by cshalizi
Hive Plots - Linear Layout for Network Visualization - Visually Interpreting Network Structure and Content Made Possible
december 2011 by cshalizi
Examine carefully. God knows hairballs are not very useful. There's apparently an R package.
in_NB
to_read
network_data_analysis
visual_display_of_quantitative_information
via:dsparks
to_teach:complexity-and-inference
re:stacs
december 2011 by cshalizi
[1111.1855] Fréchet means of curves for signal averaging and application to ECG data analysis
november 2011 by cshalizi
"Signal averaging is the process that consists in computing a mean shape from a set of noisy signals. In the presence of geometric variability in time in the data, the usual Euclidean mean of the raw data yields a mean pattern that does not reflect the typical shape of the observed signals. In this setting, it is necessary to use alignment techniques for a precise synchronization of the signals, and then to average the aligned data to obtain a consistent mean shape. In this paper, we study the numerical performances of Fréchet means of curves which are extensions of the usual Euclidean mean to spaces endowed with non-Euclidean metrics. This yields a new algorithm for signal averaging without a reference template. We apply this approach to the estimation of a mean heart cycle from ECG records."
to:NB
statistics
data_analysis
visual_display_of_quantitative_information
november 2011 by cshalizi
IDV User Experience: Shipping Mix
october 2011 by cshalizi
I'm not 100% sure what we're supposed to learn from this, other than that shipping concentrates at ports and straits, but it's cute.
visual_display_of_quantitative_information
maps
trade
logistics
via:schweitzer
october 2011 by cshalizi
[1110.3917] How to Evaluate Dimensionality Reduction? - Improving the Co-ranking Matrix
october 2011 by cshalizi
"The growing number of dimensionality reduction methods available for data visualization has recently inspired the development of quality assessment measures, in order to evaluate the resulting low-dimensional representation independently from a method's inherent criteria. Several (existing) quality measures can be (re)formulated based on the so-called co-ranking matrix, which subsumes all rank errors (i.e. differences between the ranking of distances from every point to all others, comparing the low-dimensional representation to the original data). The measures are often based on the partitioning of the co-ranking matrix into 4 submatrices, divided at the K-th row and column, calculating a weighted combination of the sums of each submatrix. Hence, the evaluation process typically involves plotting a graph over several (or even all possible) settings of the parameter K. Considering simple artificial examples, we argue that this parameter controls two notions at once, that need not necessarily be combined, and that the rectangular shape of submatrices is disadvantageous for an intuitive interpretation of the parameter. We debate that quality measures, as general and flexible evaluation tools, should have parameters with a direct and intuitive interpretation as to which specific error types are tolerated or penalized. Therefore, we propose to replace K with two parameters to control these notions separately, and introduce a differently shaped weighting on the co-ranking matrix. The two new parameters can then directly be interpreted as a threshold up to which rank errors are tolerated, and a threshold up to which the rank-distances are significant for the evaluation. Moreover, we propose a color representation of local quality to visually support the evaluation process for a given mapping, where every point in the mapping is colored according to its local contribution to the overall quality."
--- Look at this carefully, and see if it could be taught in data mining (and whether it's worth doing so.)
to:NB
dimension_reduction
statistics
data_analysis
visual_display_of_quantitative_information
to_teach:data-mining
october 2011 by cshalizi
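A rough sketch of the co-ranking matrix described in the abstract above, together with the simplest summary built from its upper-left K-by-K block. It assumes squared Euclidean distances and ties broken by point index; both choices, and all function names, are assumptions made for illustration. The two-parameter weighting the paper proposes is not reproduced.

```python
def ranks(points):
    """ranks(points)[i][j] = rank of point j among point i's neighbours (1 = nearest)."""
    n = len(points)
    out = []
    for i in range(n):
        d = [sum((a - b) ** 2 for a, b in zip(points[i], points[j])) for j in range(n)]
        # Ties are broken by index (an assumption; other conventions exist).
        order = sorted((j for j in range(n) if j != i), key=lambda j: (d[j], j))
        r = [0] * n
        for pos, j in enumerate(order, start=1):
            r[j] = pos
        out.append(r)
    return out


def coranking(high, low):
    """Q[k-1][l-1] = number of ordered pairs (i, j) with rank k in `high` and rank l in `low`."""
    n = len(high)
    rh, rl = ranks(high), ranks(low)
    q = [[0] * (n - 1) for _ in range(n - 1)]
    for i in range(n):
        for j in range(n):
            if i != j:
                q[rh[i][j] - 1][rl[i][j] - 1] += 1
    return q


def block_preservation(q, k):
    """Sum of the upper-left k-by-k block of Q, divided by k * n so it lies in [0, 1]."""
    n = len(q) + 1
    return sum(q[a][b] for a in range(k) for b in range(k)) / (k * n)


# Made-up example: four points on a line, an order-preserving map and a scrambled one.
high = [(0, 0), (1, 0), (2, 0), (3, 0)]
faithful = [(0,), (1,), (2,), (3,)]
scrambled = [(0,), (3,), (1,), (2,)]
q_good = coranking(high, faithful)
q_bad = coranking(high, scrambled)
```

Plotting `block_preservation(q, k)` against k gives the kind of curve over K that the abstract says evaluations typically rely on; off-diagonal mass in Q is where the rank errors live.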
R Graph Gallery - Donations Welcome - Romain Francois, Professional R Enthusiast
october 2011 by cshalizi
The R Graph Gallery is an under-utilized resource, and sending a little money Romain's way is not a bad thing.
R
programming
to_teach:statcomp
statistics
visual_display_of_quantitative_information
october 2011 by cshalizi
Visualization methods for longitudinal social networks and stochastic actor-oriented modeling
july 2011 by cshalizi
"As a consequence of the rising interest in longitudinal social networks and their analysis, there is also an increasing demand for tools to visualize them. We argue that similar adaptations of state-of-the-art graph-drawing methods can be used to visualize both, longitudinal networks and predictions of stochastic actor-oriented models (SAOMs), the most prominent approach for analyzing such networks. The proposed methods are illustrated on a longitudinal network of acquaintanceship among university freshmen."
social_networks
network_data_analysis
visual_display_of_quantitative_information
statistics
july 2011 by cshalizi
Disease Maps: Epidemics on the Ground, Koch
july 2011 by cshalizi
"In the seventeenth century, a map of the plague suggested a radical idea—that the disease was carried and spread by humans. In the nineteenth century, maps of cholera cases were used to prove its waterborne nature. More recently, maps charting the swine flu pandemic caused worldwide panic and sent shockwaves through the medical community. In Disease Maps, Tom Koch contends that to understand epidemics and their history we need to think about maps of varying scale, from the individual body to shared symptoms evidenced across cities, nations, and the world. "
books:noted
maps
epidemiology
history_of_science
history_of_medicine
contagion
plague
visual_display_of_quantitative_information
disease
medicine
july 2011 by cshalizi
The Washington Monthly - The Magazine - The Information Sage
may 2011 by cshalizi
This confirms my sense from his books that Tufte is probably a complete pain in the ass to deal with, though genius must be excused much. (Also, I will now take odds that he will succumb to the Brain Eater within 10 years, which will be much more of a tragedy than usual.)
tufte.edward
visual_display_of_quantitative_information
via:?
cult_followings
to:blog
may 2011 by cshalizi
Visualizing the Bubble « Sustainable Cities and Transport
march 2011 by cshalizi
Looks like there ought to be some sort of data collapse possible here (a common time trend multiplied by some city-specific noise?).
to_teach:undergrad-ADA
visual_display_of_quantitative_information
mortgage_crisis
march 2011 by cshalizi
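One way to check the data-collapse guess in the note above, reading "city-specific noise" as a per-city scale factor (an assumption): if each city's series is a common trend times its own constant, dividing every series by its own mean should put all of them on a single curve. The city labels and numbers below are made up.

```python
def rescale(series):
    """Divide a series by its own mean, removing any constant multiplicative factor."""
    mean = sum(series) / len(series)
    return [x / mean for x in series]


def collapse_spread(panel):
    """Largest gap across cities, at any time point, after rescaling each series."""
    scaled = [rescale(s) for s in panel.values()]
    return max(max(col) - min(col) for col in zip(*scaled))


# Hypothetical panel: one shared trend, three different scale factors.
trend = [1.0, 1.5, 2.0, 1.2]
panel = {
    "city_a": [100.0 * t for t in trend],
    "city_b": [250.0 * t for t in trend],
    "city_c": [40.0 * t for t in trend],
}
spread = collapse_spread(panel)
```

A spread near zero means the curves coincide after rescaling; on real data the interesting question is how far from zero it is, and whether the leftover differences look like noise or like structure.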
Functional Boxplots - Journal of Computational and Graphical Statistics - 0(0):1
february 2011 by cshalizi
Isn't this just a boxplot for every value of the independent variable??
visual_display_of_quantitative_information
statistics
functional_data_analysis
february 2011 by cshalizi
Building a Better Word Cloud « Zero Intelligence Agents
february 2011 by cshalizi
I like the point that the axes in a plot should _mean_ something. Not sure that these are the best choices however --- what if I want to just deal with one document, or for that matter with three?
visual_display_of_quantitative_information
text_mining
february 2011 by cshalizi
Information-Theoretic Methods for the Visual Analysis of Climate and Flow Data (Tutorial Slides)
october 2010 by cshalizi
My reaction to this must, I imagine, be a little bit like how a proud parent feels when they hear from someone else about their child doing something worthwhile.
visual_display_of_quantitative_information
complexity_measures
computational_mechanics
via:georg
janicke.heiki
to:blog
october 2010 by cshalizi
Cytoscape: Analyzing and Visualizing Network Data
april 2010 by cshalizi
I need to replace dot for big graphs.
networks
visual_display_of_quantitative_information
software
via:aaron_clauset
april 2010 by cshalizi
yEd - Graph Editor
april 2010 by cshalizi
I need to replace dot for big graphs.
networks
visual_display_of_quantitative_information
software
via:aaron_clauset
april 2010 by cshalizi
Gephi, graph exploration and manipulation software
april 2010 by cshalizi
I need to replace dot for big graphs.
networks
visual_display_of_quantitative_information
software
via:rob_h
april 2010 by cshalizi
[1003.0529] A Unified Algorithmic Framework for Multi-Dimensional Scaling
march 2010 by cshalizi
"In this paper, we propose a unified algorithmic framework for solving many known variants of MDS. Our algorithm is a simple iterative scheme with guaranteed convergence, and is modular; by changing the internals of a single subroutine in the algorithm, we can switch cost functions and target spaces easily. In addition to the formal guarantees of convergence, our algorithms are accurate; in most cases, they converge to better quality solutions than existing methods, in comparable time."
multidimensional_scaling
dimension_reduction
visual_display_of_quantitative_information
to_teach:data-mining
data_mining
march 2010 by cshalizi
[0911.3349] Seeing Science
november 2009 by cshalizi
"The ability to represent scientific data and concepts visually is becoming increasingly important due to the unprecedented exponential growth of computational power during the present digital age. The data sets and simulations scientists in all fields can now create are literally thousands of times as large as those created just 20 years ago. Historically successful methods for data visualization can, and should, be applied to today's huge data sets, but new approaches, also enabled by technology, are needed as well. Increasingly, "modular craftsmanship" will be applied, as relevant functionality from the graphically and technically best tools for a job are combined as-needed, without low-level programming."
visual_display_of_quantitative_information
have_read
automating_craft
november 2009 by cshalizi
Visualizing Empires Decline
november 2009 by cshalizi
It's a start, but: where's China? Russia? The Ottomans and the Habsburgs? The US? Japan? Holland? (Also: what're the units, area or population?)
visual_display_of_quantitative_information
imperialism
world_history
via:idlethink
november 2009 by cshalizi
Schneier on Security: Police Data Mining Done Right
june 2009 by cshalizi
Sounds more like straight-up visualization than data-mining --- not that that's bad! The human visual cortex is a powerful pattern-recognition technology, albeit largely undocumented and without any basis in existing theory.
data-mining
police
to_teach:data-mining
visual_display_of_quantitative_information
june 2009 by cshalizi
The world economy is tracking or doing worse than during the Great Depression (update) | vox - Research-based policy analysis and commentary from leading economists
june 2009 by cshalizi
Holy shit.
financial_crisis_of_2007--
great_depression
economics
visual_display_of_quantitative_information
eichengreen.barry
via:krugman
june 2009 by cshalizi
Invariant co-ordinate selection
june 2009 by cshalizi
"A general method for exploring multivariate data by comparing different estimates of multivariate scatter is presented. The method is based on the eigenvalue–eigenvector decomposition of one scatter matrix relative to another. In particular, it is shown that the eigenvectors can be used to generate an affine invariant co-ordinate system for the multivariate data. Consequently, we view this method as a method for invariant co-ordinate selection."
statistics
data_analysis
visual_display_of_quantitative_information
principal_components
june 2009 by cshalizi
Visualizing Data using t-SNE
may 2009 by cshalizi
SNE = stochastic neighbor embedding
manifold_learning
machine_learning
hinton.geoffrey
van_der_maaten.laurens
visual_display_of_quantitative_information
to_teach:data-mining
may 2009 by cshalizi
Grammar of Graphics 2 (R)
january 2009 by cshalizi
Nice-looking graphics system for R; draft book and R package.
R
visual_display_of_quantitative_information
via:kjhealy
books:noted
january 2009 by cshalizi
[0812.1242] Mapping change in large networks
december 2008 by cshalizi
Heard Martin talk about this at SFI last week. Nice, though I think the MDL frame-tale needs some work.
The "alluvial diagrams" are very pretty.
minimum_description_length
rosvall.martin
bergstrom.carl
kith_and_kin
network_data_analysis
have_read
re:network_differences
community_discovery
visual_display_of_quantitative_information
bootstrap
statistics
clustering
hypothesis_testing
re:stacs
to_teach:complexity-and-inference
citation_networks
bibliometry
december 2008 by cshalizi
PHD Comics: Enrollment vs. Unemployment Rate
october 2008 by cshalizi
Co-integration of time series: a case study
cartoons
funny:geeky
academia
time_series
visual_display_of_quantitative_information
cham.jorge
october 2008 by cshalizi
Red State, Blue State, Rich State, Poor State: Home
june 2008 by cshalizi
Andrew Gelman et al.'s forthcoming book about American voting habits.
us_politics
statistics
debunking
inequality
gelman.andrew
visual_display_of_quantitative_information
red_state_blue_state
kith_and_kin
running_dogs_of_reaction
books:recommended
june 2008 by cshalizi
Wordle - My mind as a c. 1965 book cover design
june 2008 by cshalizi
I like the fact that the largest single element is the one reminding me to transfer things to the notebooks...
social_media
pretty_pictures
visual_display_of_quantitative_information
via:vaguery
june 2008 by cshalizi
CCNR - Gallery
may 2008 by cshalizi
Assorted network images, curated by the Barabasi group.
networks
visual_display_of_quantitative_information
to_teach:complexity-and-inference
via:dpfeldman
may 2008 by cshalizi
all of inflation's little parts - the new york times
may 2008 by cshalizi
This is a great little picture.
economics
visual_display_of_quantitative_information
inflation
cox.amanda
via:chl
may 2008 by cshalizi
all streets | ben fry
may 2008 by cshalizi
Map of the continental US showing _only_ streets and roads.
maps
infrastructure
design
fry.ben
something_about_america
via:unfogged
pretty_pictures
visual_display_of_quantitative_information
may 2008 by cshalizi
Skyeome.net » Blog Archive » Fashionable Networks
april 2008 by cshalizi
Skye is right; compared to what actual designers produce, our graphics are _painfully ugly_ and _uncompelling_. How can we do better?
visual_display_of_quantitative_information
design
bender-de_moll.skye
april 2008 by cshalizi
Visualizing Social Networks (Freeman)
february 2008 by cshalizi
History of social network visualization from Moreno onward.
social_networks
network_data_analysis
visual_display_of_quantitative_information
freeman.linton_c.
february 2008 by cshalizi
WikiPediaVision (beta)
october 2007 by cshalizi
Watch where people are editing Wikipedia, as they edit Wikipedia
visual_display_of_quantitative_information
funny:geeky
wikipedia
via:logista
october 2007 by cshalizi
The Topography of Poverty in the United States: A Spatial Analysis Using County-Level Data From the Community Health Status Indicators Project
october 2007 by cshalizi
"A distinctive north–south demarcation of low versus high poverty concentrations was found, along with isolated pockets of high and low poverty within areas in which the predominant poverty rates were opposite. This pattern can be described as following
statistics
poverty
inequality
america
american_south
visual_display_of_quantitative_information
via:john-burke
to_teach:data-mining
to:blog
october 2007 by cshalizi