cshalizi + dimension_reduction 31
Lam , Yao : Factor modeling for high-dimensional time series: Inference for the number of factors
9 days ago by cshalizi
"This paper deals with the factor modeling for high-dimensional time series based on a dimension-reduction viewpoint. Under stationary settings, the inference is simple in the sense that both the number of factors and the factor loadings are estimated in terms of an eigenanalysis for a nonnegative definite matrix, and is therefore applicable when the dimension of time series is on the order of a few thousands. Asymptotic properties of the proposed method are investigated under two settings: (i) the sample size goes to infinity while the dimension of time series is fixed; and (ii) both the sample size and the dimension of time series go to infinity together. In particular, our estimators for zero-eigenvalues enjoy faster convergence (or slower divergence) rates, hence making the estimation for the number of factors easier. In particular, when the sample size and the dimension of time series go to infinity together, the estimators for the eigenvalues are no longer consistent. However, our estimator for the number of the factors, which is based on the ratios of the estimated eigenvalues, still works fine. Furthermore, this estimation shows the so-called “blessing of dimensionality” property in the sense that the performance of the estimation may improve when the dimension of time series increases. A two-step procedure is investigated when the factors are of different degrees of strength. Numerical illustration with both simulated and real data is also reported."
to:NB
dimension_reduction
factor_analysis
time_series
high-dimensional_statistics
inference_to_latent_objects
9 days ago by cshalizi
[1205.2609] Which Spatial Partition Trees are Adaptive to Intrinsic Dimension?
13 days ago by cshalizi
"Recent theory work has found that a special type of spatial partition tree - called a random projection tree - is adaptive to the intrinsic dimension of the data from which it is built. Here we examine this same question, with a combination of theory and experiments, for a broader class of trees that includes k-d trees, dyadic trees, and PCA trees. Our motivation is to get a feel for (i) the kind of intrinsic low dimensional structure that can be empirically verified, (ii) the extent to which a spatial partition can exploit such structure, and (iii) the implications for standard statistical tasks such as regression, vector quantization, and nearest neighbor search."
to:NB
decision_trees
prediction
regression
statistics
dimension_reduction
machine_learning
13 days ago by cshalizi
Cook , Forzani , Rothman : Estimating sufficient reductions of the predictors in abundant high-dimensional regressions
7 weeks ago by cshalizi
"We study the asymptotic behavior of a class of methods for sufficient dimension reduction in high-dimension regressions, as the sample size and number of predictors grow in various alignments. It is demonstrated that these methods are consistent in a variety of settings, particularly in abundant regressions where most predictors contribute some information on the response, and oracle rates are possible. Simulation results are presented to support the theoretical conclusion."
to:NB
regression
dimension_reduction
sufficiency
statistics
7 weeks ago by cshalizi
[1203.6130] Spectral dimensionality reduction for HMMs
8 weeks ago by cshalizi
"Hidden Markov Models (HMMs) can be accurately approximated using co-occurrence frequencies of pairs and triples of observations by using a fast spectral method in contrast to the usual slow methods like EM or Gibbs sampling. We provide a new spectral method which significantly reduces the number of model parameters that need to be estimated, and generates a sample complexity that does not depend on the size of the observation vocabulary. We present an elementary proof giving bounds on the relative accuracy of probability estimates from our model. (Correlaries show our bounds can be weakened to provide either L1 bounds or KL bounds which provide easier direct comparisons to previous work.) Our theorem uses conditions that are checkable from the data, instead of putting conditions on the unobservable Markov transition matrix."
to:NB
to_read
markov_models
statistics
machine_learning
dimension_reduction
re:AoS_project
spectral_clustering
8 weeks ago by cshalizi
Randomized Online PCA Algorithms with Regret Bounds that are Logarithmic in the Dimension
february 2012 by cshalizi
"We design an online algorithm for Principal Component Analysis. In each trial the current instance is centered and projected into a probabilistically chosen low dimensional subspace. The regret of our online algorithm, that is, the total expected quadratic compression loss of the online algorithm minus the total quadratic compression loss of the batch algorithm, is bounded by a term whose dependence on the dimension of the instances is only logarithmic.
"We first develop our methodology in the expert setting of online learning by giving an algorithm for learning as well as the best subset of experts of a certain size. This algorithm is then lifted to the matrix setting where the subsets of experts correspond to subspaces. The algorithm represents the uncertainty over the best subspace as a density matrix whose eigenvalues are bounded. The running time is O(n2) per trial, where n is the dimension of the instances."
to:NB
online_learning
dimension_reduction
machine_learning
learning_theory
warmuth.manfred
principal_components
low-regret_learning
"We first develop our methodology in the expert setting of online learning by giving an algorithm for learning as well as the best subset of experts of a certain size. This algorithm is then lifted to the matrix setting where the subsets of experts correspond to subspaces. The algorithm represents the uncertainty over the best subspace as a density matrix whose eigenvalues are bounded. The running time is O(n2) per trial, where n is the dimension of the instances."
february 2012 by cshalizi
A General Framework for Dimensionality-Reducing Data Visualization Mapping
february 2012 by cshalizi
"In recent years, a wealth of dimension-reduction techniques for data visualization and preprocessing has been established. Nonparametric methods require additional effort for out-of-sample extensions, because they provide only a mapping of a given finite set of points. In this letter, we propose a general view on nonparametric dimension reduction based on the concept of cost functions and properties of the data. Based on this general principle, we transfer nonparametric dimension reduction to explicit mappings of the data manifold such that direct out-of-sample extensions become possible. Furthermore, this concept offers the possibility of investigating the generalization ability of data visualization to new data points. We demonstrate the approach based on a simple global linear mapping, as well as prototype-based local linear mappings. In addition, we can bias the functional form according to given auxiliary information. This leads to explicit supervised visualization mappings with discriminative properties comparable to state-of-the-art approaches."
in_NB
dimension_reduction
visual_display_of_quantitative_information
data_analysis
data_mining
manifold_learning
to_teach:data-mining
february 2012 by cshalizi
Modified Locally Linear Embedding Using Multiple Weights
december 2011 by cshalizi
Stabilizing LLE by (as it were) model averaging. Needs at least a reference in the data-mining lecture on LLE.
"The locally linear embedding (LLE) is improved by introducing multiple linearly independent local weight vectors for each neighborhood. We characterize the reconstruction weights and show the existence of the linearly independent weight vectors at each neighborhood. The modified locally linear embedding (MLLE) proposed in this paper is much stable. It can retrieve the ideal embedding if MLLE is applied on data points sampled from an isometric manifold. MLLE is also compared with the local tangent space alignment (LTSA). Numerical examples are given that show the improvement and efficiency of MLLE."
to:NB
to_teach:data-mining
dimension_reduction
manifold_learning
via:gmg
spectral_clustering
"The locally linear embedding (LLE) is improved by introducing multiple linearly independent local weight vectors for each neighborhood. We characterize the reconstruction weights and show the existence of the linearly independent weight vectors at each neighborhood. The modified locally linear embedding (MLLE) proposed in this paper is much stable. It can retrieve the ideal embedding if MLLE is applied on data points sampled from an isometric manifold. MLLE is also compared with the local tangent space alignment (LTSA). Numerical examples are given that show the improvement and efficiency of MLLE."
december 2011 by cshalizi
[1110.3917] How to Evaluate Dimensionality Reduction? - Improving the Co-ranking Matrix
october 2011 by cshalizi
"The growing number of dimensionality reduction methods available for data visualization has recently inspired the development of quality assessment measures, in order to evaluate the resulting low-dimensional representation independently from a methods' inherent criteria. Several (existing) quality measures can be (re)formulated based on the so-called co-ranking matrix, which subsumes all rank errors (i.e. differences between the ranking of distances from every point to all others, comparing the low-dimensional representation to the original data). The measures are often based on the partioning of the co-ranking matrix into 4 submatrices, divided at the K-th row and column, calculating a weighted combination of the sums of each submatrix. Hence, the evaluation process typically involves plotting a graph over several (or even all possible) settings of the parameter K. Considering simple artificial examples, we argue that this parameter controls two notions at once, that need not necessarily be combined, and that the rectangular shape of submatrices is disadvantageous for an intuitive interpretation of the parameter. We debate that quality measures, as general and flexible evaluation tools, should have parameters with a direct and intuitive interpretation as to which specific error types are tolerated or penalized. Therefore, we propose to replace K with two parameters to control these notions separately, and introduce a differently shaped weighting on the co-ranking matrix. The two new parameters can then directly be interpreted as a threshold up to which rank errors are tolerated, and a threshold up to which the rank-distances are significant for the evaluation. Moreover, we propose a color representation of local quality to visually support the evaluation process for a given mapping, where every point in the mapping is colored according to its local contribution to the overall quality." --- Look at this carefully, and see if it could be taught in data mining (and whether it's worth doing so.)
to:NB
dimension_reduction
statistics
data_analysis
visual_display_of_quantitative_information
to_teach:data-mining
october 2011 by cshalizi
Practical Approaches to Principal Component Analysis in the Presence of Missing Values
august 2010 by cshalizi
From a quick skim, it looks too advanced to actually teach in 350, but potentially a handy reference.
principal_components
dimension_reduction
to_teach:data-mining
statistics
data_mining
to_teach:undergrad-ADA
august 2010 by cshalizi
"Dimension Reduction and Adaptation in Conditional Density Estimation" - Journal of the American Statistical Association - 105(490):761
july 2010 by cshalizi
Sounds neat: "An orthogonal series estimator of the conditional density of a response given a vector of continuous and ordinal/nominal categorical predictors ... The estimator is based on writing a conditional density as a sum of orthogonal projections on all possible subspaces of reduced dimensionality and then estimating each projection via a shrinkage procedure [which] uses a universal thresholding and a dyadic-blockwise shrinkage for low and high frequencies, respectively. The estimator is data-driven, is adaptive to underlying smoothness of a conditional density, and attains a minimax rate of the mean integrated squared error convergence. ... if a conditional density depends only on a subgroup of predictors, then the estimator seizes the opportunity and attains a corresponding minimax rate of convergence. The latter property relaxes the notorious “curse of dimensionality.” ... the estimator is fast, because neither projections nor shrinkages are computation-intensive."
density_estimation
dimension_reduction
statistics
estimation
shrinkage
july 2010 by cshalizi
[1003.0529] A Unified Algorithmic Framework for Multi-Dimensional Scaling
march 2010 by cshalizi
"In this paper, we propose a unified algorithmic framework for solving many known variants of \mds. Our algorithm is a simple iterative scheme with guaranteed convergence, and is \emph{modular}; by changing the internals of a single subroutine in the algorithm, we can switch cost functions and target spaces easily. In addition to the formal guarantees of convergence, our algorithms are accurate; in most cases, they converge to better quality solutions than existing methods, in comparable time. "
multidimensional_scaling
dimension_reduction
visual_display_of_quantitative_information
to_teach:data-mining
data_mining
march 2010 by cshalizi
Elements of Statistical Learning: data mining, inference, and prediction. 2nd Edition.
october 2009 by cshalizi
Free PDF! (Still, I find my bound physical copy much more convenient.)
books:recommended
machine_learning
data_mining
statistics
learning_theory
estimation
cross-validation
ensemble_methods
classifiers
regression
graphical_models
clustering
dimension_reduction
bootstrap
via:arthegall
have_read
october 2009 by cshalizi
Diffusion Maps @ CRAN
october 2009 by cshalizi
Diffusion maps in R. (How handy that the TA for the class wrote the package...)
R
richards.joey
diffusion_maps
manifold_learning
dimension_reduction
to_teach:data-mining
kith_and_kin
spectral_clustering
lee.ann
october 2009 by cshalizi
Robust Manifold Unfolding with Kernel Regularization
september 2009 by cshalizi
Cool stuff, and I kind of want to teach this, or at least a kiddy version of it, in 350, but I was planning on doing dimensionality reduction long before the kernel trick. Hmmm....
kernel_methods
dimension_reduction
manifold_learning
wahba.grace
statistics
machine_learning
to_read
to_teach:data-mining
september 2009 by cshalizi
Laplacian Eigenmaps for Dimensionality Reduction and Data Representation (Belkin and Niyogi)
august 2009 by cshalizi
Using the spectral graph theory to get new features, for visualization, clustering, &c. Also includes a good discussion of the local linear embedding algorithm. Will try to teach this in 350 but may make heads explode.
to_teach:data-mining
clustering
graph_theory
spectral_clustering
dimension_reduction
niyogi.partha
august 2009 by cshalizi
[0908.3400] Decomposing data sets into skewness modes
august 2009 by cshalizi
"We derive the nonlinear equations satisfied by the coefficients of linear combinations that maximize their skewness when their variance is constrained to take a specific value. In order to numerically solve these nonlinear equations we develop a gradient-type flow that preserves the constraint. In combination with the Karhunen-Lo\`eve decomposition this leads to a set of orthogonal modes with maximal skewness. For illustration purposes we apply these techniques to atmospheric data; in this case the maximal-skewness modes correspond to strongly localized atmospheric flows. We show how these ideas can be extended, for example to maximal-flatness modes."
dimension_reduction
data_analysis
principal_components
karhunen-loeve_decomposition
statistics
august 2009 by cshalizi
"Statistical Theory and Methods for Complex, High-Dimensional Data"
june 2009 by cshalizi
Loads of talks.
statistics
machine_learning
model_selection
graphical_models
regression
latent_variables
principal_components
factor_analysis
dimension_reduction
lasso
bioinformatics
track_down_references
via:shivak
june 2009 by cshalizi
A Meta-Analysis of Variance Accounted for and Factor Loadings in Exploratory Factor Analysis
may 2009 by cshalizi
Shorter Peterson: Your results look like a factor analysis of pure noise. Have a nice day. (Also, a citation in support of the folk wisdom that factor analysis doesn't work any better as data reduction than simple principal components analysis.)
factor_analysis
statistics
to:NB
to_teach:data-mining
via:moritz-heene
re:g_paper
dimension_reduction
principal_components
to_teach:undergrad-ADA
may 2009 by cshalizi
[0707.0481] Treelets -- An Adaptive Multi-Scale Basis for Sparse Unordered Data
november 2007 by cshalizi
Now with discussion. Think about teaching in data mining - probably too much...
to:NB
statistics
kith_and_kin
dimension_reduction
lee.ann
nadler.boaz
wasserman.larry
to_teach:data-mining
november 2007 by cshalizi
related tags
bioinformatics ⊕ books:recommended ⊕ bootstrap ⊕ chaos ⊕ classifiers ⊕ clustering ⊕ compressed_sensing ⊕ cross-validation ⊕ data_analysis ⊕ data_mining ⊕ decision_trees ⊕ density_estimation ⊕ diffusion_maps ⊕ dimension_reduction ⊖ dsm ⊕ dynamical_systems ⊕ ensemble_methods ⊕ entropy_estimation ⊕ estimation ⊕ factor_analysis ⊕ gaussian_processes ⊕ genetics ⊕ graphical_models ⊕ graph_theory ⊕ have_read ⊕ high-dimensional_statistics ⊕ inference_to_latent_objects ⊕ in_NB ⊕ karhunen-loeve_decomposition ⊕ kernel_estimators ⊕ kernel_methods ⊕ kith_and_kin ⊕ lasso ⊕ latent_variables ⊕ learning_theory ⊕ lee.ann ⊕ linear_regression ⊕ low-regret_learning ⊕ luca.diana ⊕ machine_learning ⊕ manifold_learning ⊕ markov_models ⊕ model_selection ⊕ multidimensional_scaling ⊕ nadler.boaz ⊕ niyogi.partha ⊕ online_learning ⊕ popular_social_science ⊕ prediction ⊕ principal_components ⊕ R ⊕ re:AoS_project ⊕ re:g_paper ⊕ regression ⊕ richards.joey ⊕ roeder.kathryn ⊕ shrinkage ⊕ spectral_clustering ⊕ statistics ⊕ sufficiency ⊕ time_series ⊕ to:NB ⊕ to_read ⊕ to_teach:data-mining ⊕ to_teach:undergrad-ADA ⊕ track_down_references ⊕ via:arthegall ⊕ via:gmg ⊕ via:moritz-heene ⊕ via:shivak ⊕ visual_display_of_quantitative_information ⊕ wahba.grace ⊕ warmuth.manfred ⊕ wasserman.larry ⊕Copy this bookmark: