Rudy Beran, University of California-Davis
             
              Estimating Many Means: A Mosaic of Recent Methodologies
                A fundamental data structure is the k-way layout of observations, 
                complete or incomplete, balanced or unbalanced. The cells of the 
                layout are indexed by all k-fold combinations of the levels of 
                the k covariates (or factors). Replication of observations within 
                cells may be rare or nonexistent. Observations may be available 
                for only a subset of the cells. The problem is to estimate the 
                mean observation, or mean potential observable, for each cell 
                in the k-way layout. Equivalently, the problem is to estimate 
                an unknown regression function that depends on k covariates. 
              This talk unifies a mosaic of recent methodologies for estimating 
                means in the general k-way layout or k-covariate regression problem. 
                Included are penalized least squares with multiple quadratic penalties, 
                associated Bayes estimators, associated submodel fits, multiple 
                Stein shrinkage, and functional data analysis as a limit scenario. 
                The focus is on the choice of tuning parameters to minimize estimated 
                quadratic risk under a minimally restrictive data model; and on 
                the asymptotic risk of the chosen estimator as the number of observed 
                cells in the k-way layout tends to infinity.
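                To fix ideas, a schematic form of penalized least squares
                with multiple quadratic penalties (an illustrative sketch in
                the simplest complete, balanced case with one observation
                vector $y$ on the cells, not the talk's exact formulation) is
                \[
                \hat{\mu}(\lambda) = \operatorname*{argmin}_{\mu}\Big\{\|y-\mu\|^{2}
                + \sum_{j=1}^{J}\lambda_{j}\,\mu^{\top}Q_{j}\,\mu\Big\}
                = \Big(I+\sum_{j=1}^{J}\lambda_{j}Q_{j}\Big)^{-1}y,
                \]
                with the nonnegative tuning parameters $\lambda_1,\dots,\lambda_J$
                chosen to minimize an estimate of the quadratic risk, e.g. a
                Mallows-type criterion
                $\|y-\hat\mu(\lambda)\|^{2}+2\sigma^{2}\operatorname{tr}S(\lambda)-n\sigma^{2}$,
                where $S(\lambda)$ is the smoothing matrix above.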
            
            ========================
              Jiahua Chen, University of British Columbia
             
              Advances in EM-test for Finite Mixture Models
                Making valid and effective inferences for finite mixture models 
                is known to be technically challenging. Due to the non-regularity, 
                the likelihood ratio test was found to diverge to infinity if 
                the parameter space is not artificially confined to a compact 
                set. Even under the compactness assumption, the limiting distribution 
                is often a function of the supremum of some Gaussian process. 
                Such results are of theoretical interest but not useful in applications. 
                Recently, many new tests have been proposed to address this problem. 
                The EM-test has been found superior in many respects. For many 
                classes of finite mixture models, we have tailor-designed EM-tests 
                with easy-to-use limiting distributions. Simulations indicate 
                that the limiting distributions approximate the finite-sample 
                distributions well in the examples investigated. 
                A general procedure for choosing the tuning parameter has also 
                been developed.
            
            
              ========================
              Xihong Lin, Harvard University
             
              Hypothesis Testing and Variable Selection for Studying 
                Rare Variants in Sequencing Association Studies
                Sequencing studies are increasingly being conducted to 
                identify rare variants associated with complex traits. The limited 
                power of classical single marker association analysis for rare 
                variants poses a central challenge in such studies. We propose 
                the sequence kernel association test (SKAT), a supervised, flexible, 
                computationally efficient regression method to test for association 
                between genetic variants (common and rare) in a region and a continuous 
                or dichotomous trait, while easily adjusting for covariates. As 
                a score-based variance component test, SKAT can quickly calculate 
                p-values analytically by fitting the null model containing only 
                the covariates, and so can easily be applied to genome-wide data. 
                Using SKAT to analyze a genome-wide sequencing study of 1000 individuals, 
                by segmenting the whole genome into 30kb regions, requires only 
                7 hours on a laptop. Through analysis of simulated data across 
                a wide range of practical scenarios and triglyceride data from 
                the Dallas Heart Study, we show that SKAT can substantially outperform 
                several alternative rare-variant association tests. We also provide 
                analytic power and sample size calculations to help design candidate 
                gene, whole exome, and whole genome sequence association studies. 
                We also discuss variable selection methods to select causal variants.
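                As an illustration of the score-based variance component form 
                described above, the sketch below computes a SKAT-style statistic 
                for a continuous trait with a weighted linear kernel, using a 
                Satterthwaite moment-matching approximation in place of the exact 
                Davies p-value. The function and variable names are our own, and 
                the simplifications are assumptions; this is not the authors' 
                released implementation.

                    import numpy as np
                    from scipy import stats

                    def skat_sketch(y, X, G, w):
                        """y: (n,) trait; X: (n,p) covariates incl. intercept;
                        G: (n,m) genotypes; w: (m,) variant weights (assumed given)."""
                        n, p = X.shape
                        # Null model containing only the covariates: hat matrix, residuals.
                        H = X @ np.linalg.solve(X.T @ X, X.T)
                        e = y - H @ y
                        sigma2 = e @ e / (n - p)
                        GW = G * w                    # weighted genotypes; kernel K = GW GW'
                        Q = np.sum((e @ GW) ** 2)     # score statistic e' K e
                        # Satterthwaite: match mean/variance of Q to a scaled chi-square.
                        GWc = GW - H @ GW             # (I - H) GW
                        C = GW.T @ GWc                # m x m; tr(C) = tr((I-H)K(I-H))
                        tr1, tr2 = np.trace(C), np.sum(C * C)
                        scale, df = tr2 / tr1, tr1 ** 2 / tr2
                        return Q, stats.chi2.sf(Q / (sigma2 * scale), df)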
            
             
            
            ========================
              Ejaz Ahmed, University of Windsor
             
              System/Machine Bias versus Human Bias: Generalized Linear 
                Models
                Penalized and shrinkage regression have been widely used in high-dimensional 
                data analysis. Much recent work has studied penalized least 
                squares methods in linear models. In this talk, 
                I consider estimation in generalized linear models when there 
                are many potential predictor variables and some of them may have 
                no influence on the response of interest. In the context of 
                two competing models, where one model includes all predictors and 
                the other restricts the variable coefficients to a candidate linear 
                subspace based on prior knowledge, we investigate the relative 
                performance of the absolute penalty estimator (APE) and shrinkage 
                estimators in the direction of the subspace. We develop a large-sample 
                asymptotic analysis for the shrinkage estimators. The asymptotics 
                and a Monte Carlo simulation study show that the shrinkage estimator 
                performs better than benchmark estimators. Further, it performs 
                better than the APE when the dimension of the restricted parameter 
                space is large. The estimation strategies considered in this talk 
                are also applied to a real-life data set for illustration. 
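                Schematically, a Stein-type shrinkage estimator of the kind 
                studied here moves the full-model estimator $\hat\beta_F$ toward 
                the subspace-restricted estimator $\hat\beta_R$ (the exact 
                constants in the talk may differ):
                \[
                \hat{\beta}^{S} = \hat{\beta}_{R}
                + \Big(1-\frac{c}{T_{n}}\Big)\big(\hat{\beta}_{F}-\hat{\beta}_{R}\big),
                \]
                where $T_n$ is a test statistic for the restriction and $c$ a 
                shrinkage constant; a positive-part version truncates the 
                shrinkage factor at zero.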
              
              
            
            ========================
              Pierre Alquier, Université Paris 7 and CREST 
             
              Bayesian estimators in high dimension: PAC bounds and Monte 
                Carlo methods
                Coauthors: Karim Lounici (Georgia Institute of Technology), Gérard 
                Biau (Université Paris 6)
                The problem of sparse estimation in high dimension has received a 
                lot of attention in the last ten years. However, finding an estimator 
                with both satisfactory statistical and computational properties 
                is still an open problem. For example, the LASSO can be computed 
                efficiently, but its statistical properties require strong assumptions 
                on the observations. On the other hand, BIC does not require such 
                hypotheses but cannot be computed efficiently in very high dimension. 
                We propose here the so-called PAC-Bayesian method (McAllester 
                1998, Catoni 2004, Dalalyan and Tsybakov 2008) as an alternative 
                approach. We build a Bayesian estimator that satisfies a tight 
                PAC bound, and compute it using reversible jump Markov Chain Monte 
                Carlo methods. A first version, proposed in a joint work with 
                Karim Lounici, deals with the linear regression problem while 
                the work with Gérard Biau extends these results to the 
                single index model.
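                The PAC-Bayesian construction rests on a Gibbs posterior; in 
                schematic form (details as in the cited references),
                \[
                \hat{\rho}_{\lambda}(d\theta) \propto \exp\{-\lambda\, r_{n}(\theta)\}\,\pi(d\theta),
                \]
                where $r_n$ is the empirical risk, $\pi$ a sparsity-favoring 
                prior, and $\lambda>0$ an inverse temperature; the reversible 
                jump moves explore models of different sparsity levels.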
            
            ========================
              Shojaeddin Chenouri, University of Waterloo
              Coauthors: Sam Behseta (California State University, Fullerton)
              
             
              Comparison of Two Populations of Curves with an Application 
                in Neuronal Data Analysis
                Often in neurophysiological studies, scientists are interested 
                in testing hypotheses regarding the equality of the overall intensity 
                functions of a group of neurons when recorded under two different 
                experimental conditions. In this talk, we consider such a hypothesis 
                testing problem. We propose two test statistics: a parametric 
                test based on the Hotelling's $T^2$ statistic, as well as a nonparametric 
                one based on the spatial signed-rank test statistic of M\"{o}tt\"{o}nen 
                and Oja (1995). We implement these tests on smooth curves obtained 
                via fitting Bayesian Adaptive Regression Splines (BARS) to the 
                intensity functions of neuronal Peri-Stimulus Time Histograms 
                (PSTH). 
                Through simulation, we show that the powers of our proposed tests 
                are extremely high even when the number of sampled neurons, and 
                the number of trials per neuron are small. Finally, we apply our 
                methods on a group of motor cortex neurons recorded during a reaching 
                task.
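                For concreteness, a minimal sketch of the parametric side of the 
                comparison is given below: a standard two-sample Hotelling's $T^2$ 
                test applied to vectors summarizing each neuron's fitted curve 
                (e.g., BARS coefficients or curve evaluations on a grid). This is 
                an illustrative reimplementation under normality and equal-covariance 
                assumptions, not the authors' code.

                    import numpy as np
                    from scipy import stats

                    def hotelling_two_sample(X1, X2):
                        """Rows of X1 (n1 x p) and X2 (n2 x p) are per-neuron summaries."""
                        n1, p = X1.shape
                        n2 = X2.shape[0]
                        d = X1.mean(axis=0) - X2.mean(axis=0)
                        # Pooled covariance under the equal-covariance assumption.
                        S = ((n1 - 1) * np.cov(X1, rowvar=False)
                             + (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)
                        T2 = (n1 * n2 / (n1 + n2)) * d @ np.linalg.solve(S, d)
                        F = T2 * (n1 + n2 - p - 1) / ((n1 + n2 - 2) * p)
                        return T2, stats.f.sf(F, p, n1 + n2 - p - 1)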
            
            
              ========================
              Kjell Doksum, University of Wisconsin-Madison
              Coauthors: Fan Yang, Kam Tsui
              
             
              Biomedical large scale inference
                I will describe methods used by population and medical geneticists 
                to analyse associations between disease and genetic markers. These 
                methods can handle data with hundreds of thousands of variables 
                by using dual principal component analysis. I will compare these 
                methods to frequentist and Bayesian methods from the field of 
                statistics. 
            
            
              ========================
              Yang Feng, Columbia University
              Coauthors: Tengfei Li, Wen Yu, Zhiliang Ying, Hong Zhang 
             
              Loss Adaptive Modified Penalty in Variable Selection
                For variable selection, balancing sparsity and stability is a 
                very important task. In this work, we propose the Loss Adaptive 
                Modified Penalty (LAMP) where the penalty function is adaptively 
                changed with the type of the loss function. For generalized linear 
                models, we provide a unified form of the penalty corresponding 
                to the specific exponential family. We show that LAMP can have 
                asymptotic stability while achieving oracle properties. In addition, 
                LAMP could be seen as a special functional of a conjugate prior. 
                An efficient coordinate-descent algorithm is proposed and a balancing 
                method is introduced. Simulation results show that LAMP has competitive 
                performance compared with several well-known penalties.
            
            ========================
              D. A. S. Fraser, University of Toronto
             
              High-Dimensional: The Barrier and Bayes and Bias
                We all aspire to breach the barrier and we do; and yet it 
                always reforms, more formidable. In the context of a statistical 
                model and data, two familiar approaches are: slicing, which 
                uses only a data-slice of the model, namely the likelihood function, 
                perhaps with a calibrating weight function or prior; and bridging, 
                which uses derivatives at infinity to cantilever back over the 
                barrier to first, second, and third order. Both have had remarkable 
                successes, and both involve risks that can be serious.
              We have all had confrontations with the boundary, and I'll start 
                with a comment on my first impact. The slicing I refer to is the 
                use of the data slice, the likelihood function, as the sole or 
                primary model summary. This can be examined in data-standardized 
                units free from model curvature, and the related gradient of the 
                log-prior then gives a primary calibration of the prior; the initiative 
                in this direction is due to Welch and Peers (1963), but its prescience 
                was largely overlooked. The bridging is the Taylor expansion about 
                infinity, with analysis from asymptotics. From these we obtain 
                an order-of-magnitude calibration of the effect of a prior on 
                the basic slice information; this leads to the direction and the 
                magnitude of the bias that derives from using a prior to 
                do a statistical analysis.
            
            ========================
              Xin Gao, York University 
              Coauthors: Peter Song, Yuehua Wu 
             
               
                Model selection for high-dimensional data with applications 
                  in feature selection and network building
                  For high-dimensional data sets with complicated dependency 
                  structures, the full likelihood approach often leads to intractable 
                  computational complexity. This imposes difficulty on model selection 
                  as most of the traditionally used information criteria require 
                  the evaluation of the full likelihood. We propose a composite 
                  likelihood version of the Bayesian information criterion (BIC) 
                  and establish its consistency property for the selection of 
                  the true underlying marginal model. Under some mild regularity 
                  conditions, the proposed BIC is shown to be selection consistent, 
                  where the number of potential model parameters is allowed to 
                  increase to infinity at a certain rate of the sample size. In 
                  this talk, we will also discuss the result that using a modified 
                  Bayesian information criterion (BIC) to select the tuning parameter 
                  in penalized likelihood estimation of Gaussian graphical model 
                  can lead to consistent network model selection even when $P$ 
                  increases with $N,$ as long as all the network edges are contained 
                  in a bounded subset.
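                  Schematically, the criterion replaces the full log-likelihood 
                  in BIC with a composite one (the exact penalty in the talk 
                  accounts for the composite-likelihood information):
                  \[
                  \mathrm{cBIC} = -2\,c\ell_{n}(\hat{\theta}) + d_{n}^{*}\log n,
                  \]
                  where $c\ell_n$ is the maximized composite log-likelihood and 
                  $d_n^{*}$ an effective number of parameters.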
                
              
            
             ========================
              Xiaoli Gao, Oakland University
             
               
                LAD Fused Lasso Signal Approximation
                  The fused lasso penalty is commonly used in signal processing
                  when the hidden true signals are sparse and blocky. The $\ell_1$ 
                  loss enjoys robustness when the additive noise is 
                  contaminated by outliers. In this work, we study the asymptotic 
                  properties of an LAD-fused-lasso model used as a signal approximation 
                  (LAD-FLSA). We first investigate the estimation consistency 
                  properties of an LAD-FLSA estimator. Then we provide some conditions 
                  under which an LAD-FLSA estimator can be both block selection 
                  consistent and sign consistent. We also provide an unbiased 
                  estimate for the generalized degrees of freedom (GDF) of the 
                  LAD-FLSA modeling procedure for any given tuning parameters. 
                  The effect of the unbiased estimate is demonstrated using simulation 
                  studies.
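                  In the standard fused lasso signal approximator notation, the 
                  LAD variant studied here replaces the squared-error loss with 
                  the $\ell_1$ loss:
                  \[
                  \hat{\mu} = \operatorname*{argmin}_{\mu\in\mathbb{R}^{n}}
                  \sum_{i=1}^{n}\big|y_{i}-\mu_{i}\big|
                  + \lambda_{1}\sum_{i=1}^{n}|\mu_{i}|
                  + \lambda_{2}\sum_{i=2}^{n}\big|\mu_{i}-\mu_{i-1}\big|,
                  \]
                  so that $\lambda_1$ controls sparsity and $\lambda_2$ the 
                  blockiness of the fitted signal.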
              
            
            
              ========================
              Yulia Gel, University of Waterloo
              Coauthors: Peter Bickel, University of California, Berkeley
            
             
              Banded regularization of autocovariance matrices in application 
                to parameter estimation and forecasting of time series
                This talk addresses a "large p-small n" problem 
                in a time series framework and considers properties of banded 
                regularization of an empirical autocovariance matrix of a time 
                series process. Utilizing the banded autocovariance matrix enables 
                us to fit a much longer model to the observed data than typically 
                suggested by AIC, while controlling how many parameters are to 
                be estimated precisely and the level of accuracy. We present results 
                on asymptotic consistency of banded autocovariance matrices under 
                the Frobenius norm and provide a theoretical justification for optimal 
                band selection using cross-validation. Remarkably, the cross-validation 
                loss function for banded prediction is related to the conditional 
                mean square prediction error (MSPE) and, thus, may be viewed as 
                an alternative model selection criterion. The proposed procedure 
                is illustrated by simulations and application to predicting sea 
                surface temperature (SST) index in the Nino 3.4 region.
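                The banding operation itself is elementary; the sketch below 
                (illustrative names and an AR(1) example of our own, with the 
                band chosen arbitrarily rather than by the talk's cross-validation 
                procedure) shows how an empirical autocovariance matrix is banded.

                    import numpy as np

                    def sample_autocov(x, max_lag):
                        """Biased sample autocovariances gamma_hat(0..max_lag)."""
                        n = len(x)
                        xc = x - x.mean()
                        return np.array([xc[:n - h] @ xc[h:] / n for h in range(max_lag + 1)])

                    def banded_autocov_matrix(x, dim, band):
                        """dim x dim Toeplitz autocovariance matrix, banded at lag
                        `band`: entries with |i - j| > band are set to zero."""
                        gamma = sample_autocov(x, dim - 1)
                        i, j = np.indices((dim, dim))
                        lag = np.abs(i - j)
                        M = gamma[lag]
                        M[lag > band] = 0.0
                        return M

                    # Illustrative use on a synthetic AR(1) series.
                    rng = np.random.default_rng(0)
                    x = np.zeros(500)
                    for t in range(1, 500):
                        x[t] = 0.6 * x[t - 1] + rng.standard_normal()
                    Sigma_banded = banded_autocov_matrix(x, dim=20, band=5)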
            
            ========================
              Jiashun Jin, Carnegie Mellon University
              Coauthors: Pengsheng Ji
            
             
              UPS delivers optimal phase diagram in high dimensional 
                variable selection
                We consider a linear regression model where both $p$ and $n$ are 
                large but $p > n$. The vector of coefficients is unknown but 
                is sparse in the sense that only a small proportion of its coordinates 
                is nonzero, and we are interested in identifying these nonzero 
                ones. We propose a two-stage variable selection procedure which 
                we call the {\it UPS}. This is a Screen and Clean procedure, in 
                which we screen with the Univariate thresholding, and clean with 
                the Penalized MLE.
                In many situations, the UPS possesses two important properties: 
                Sure Screening and Separable After Screening (SAS). These properties 
                enable us to reduce the original regression problem to many small-size 
                regression problems that can be fitted separately. As a result, 
                the UPS is effective both in theory and in computation. The lasso 
                and the subset selection are well-known approaches to variable 
                selection. However, somewhat surprisingly, there are regions where 
                neither the lasso nor the subset selection is rate optimal, even 
                for very simple design matrices. The lasso is non-optimal because 
                it is too loose in filtering out fake signals (i.e., noise that 
                is highly correlated with a signal), and the subset selection 
                is non-optimal because it tends to kill one or more signals in 
                correlated pairs, triplets, etc.
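                A toy sketch of the two-stage idea follows; the threshold rule 
                and the use of the lasso as a stand-in for the Penalized MLE 
                cleaning step are our illustrative assumptions, not the paper's 
                specification.

                    import numpy as np
                    from sklearn.linear_model import Lasso

                    def ups_sketch(X, y, t, lam):
                        """X: (n, p) design with standardized columns; y: (n,) response."""
                        n = X.shape[0]
                        # Stage 1 (screen): keep coordinates with large univariate score.
                        scores = np.abs(X.T @ y) / np.sqrt(n)
                        keep = np.where(scores > t)[0]
                        # Stage 2 (clean): penalized fit restricted to the survivors.
                        beta = np.zeros(X.shape[1])
                        if keep.size > 0:
                            fit = Lasso(alpha=lam).fit(X[:, keep], y)
                            beta[keep] = fit.coef_
                        return np.nonzero(beta)[0], beta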
              
            
            ========================
              Timothy D. Johnson, University of Michigan, Department of Biostatistics 
              
            
             
               
                Computational Speedup in Spatial Bayesian Image Modeling 
                  via GPU Computing
                  Spatial modeling is a computationally complex endeavor due to 
                  the spatial correlation structure in the data that must be taken 
                  into account in the modeling. This endeavor is even more computationally 
                  complex for 3D data/images---curse of dimensionality---and within 
                  the Bayesian framework due to posterior distributions that are 
                  not analytically tractable and thus must be approximated via 
                  MCMC simulation. For point reference data, dimension reduction 
                  techniques, such as Gaussian predictive process models, have 
                  alleviated some of the computational burden; however, for image 
                  data and point pattern data, these dimension reduction techniques 
                  may not be applicable. Two examples are a population-level fMRI 
                  hierarchical model where image correlation is accounted for 
                  in the weights of a finite mixture model and a log-Gaussian 
                  Cox process model of lesion location in patients with Multiple 
                  Sclerosis. Both of these models are extremely computationally 
                  intense due to the complex nature of the likelihoods and the 
                  size of the 3D images. However, both likelihoods are amenable 
                  to parallelization. Although the MCMC simulation itself cannot be 
                  parallelized, by making small, rather straightforward changes to 
                  the code and porting the likelihood computation to a graphics 
                  processing unit (GPU), I have achieved over two orders of magnitude 
                  improvement in computational efficiency in these two problems.
              
            
            
              ========================
              Abbas Khalili, McGill University
              Coauthors: Shili Lin; Dept. of Statistics, The Ohio State University 
              
             
              Regularization in finite mixture of regression models with 
                diverging number of parameters
                Feature (variable) selection has become a fundamentally 
                important problem in the recent statistical literature. Often, 
                in applications, many variables are introduced to reduce possible 
                modeling biases. The number of introduced variables thus depends 
                on the sample size, which reflects the estimability of the 
                parametric model. In this paper, we consider the problem of feature 
                selection in finite mixture of regression models when the number 
                of parameters in the model can increase with the sample size. 
                We propose a penalized likelihood approach for feature selection 
                in these models. Under certain regularity conditions, our approach 
                leads to consistent variable selection. We carry out a simulation 
                study to evaluate the performance of the proposed approach under 
                controlled settings. A real data set on Parkinson's disease is 
                also analyzed. The data concern whether dysphonic features extracted 
                from the patients' speech signals recorded at home can be used 
                as surrogates to study PD severity and progression. Our analysis 
                of the PD data yields interpretable results that can be of important 
                clinical value. The stratification of dysphonic features for 
                patients with mild and severe symptoms leads to novel insights 
                beyond the current literature.
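                A generic form of the penalized log-likelihood underlying such 
                approaches (the talk's penalty and weighting may differ) is
                \[
                \tilde{\ell}_{n}(\boldsymbol{\theta}) =
                \sum_{i=1}^{n}\log\Big\{\sum_{k=1}^{K}\pi_{k}\,
                f\big(y_{i};\,\mathbf{x}_{i}^{\top}\boldsymbol{\beta}_{k},\,\sigma_{k}\big)\Big\}
                - \sum_{k=1}^{K}\sum_{j=1}^{p_{n}} p_{\lambda_{n}}\!\big(|\beta_{kj}|\big),
                \]
                maximized over the mixing proportions $\pi_k$ and component 
                regression coefficients $\boldsymbol\beta_k$, with $p_n$ allowed 
                to grow with $n$.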
            
            ========================
              Peter Kim, University of Guelph
             
              Testing Quantum States for Purity
              The simplest states of finite quantum systems are the pure states. 
                This paper is motivated by the need to test whether or not a given 
                state is pure. Because the pure states lie in the boundary of 
                the set of all states, the usual regularity conditions that justify 
                the standard large-sample approximations to the null distributions 
                of the deviance and the score statistic are not satisfied. For 
                a large class of quantum experiments that produce Poisson count 
                data, this paper uses an enlargement of the parameter space of 
                all states to develop likelihood ratio and score tests of purity. 
                The asymptotic null distributions of the corresponding statistics 
                are chi-squared. The tests are illustrated by the analysis of 
                some quantum experiments involving unitarily correctable codes.
              
            
            ========================
              Samuel Kou, Harvard University
              Coauthors: Benjamin Olding
             
              Multi-resolution inference of stochastic models from partially 
                observed data
                
                Stochastic models, diffusion models in particular, are widely 
                used in science, engineering and economics. Inferring the parameter 
                values from data is often complicated by the fact that the underlying 
                stochastic processes are only partially observed. Examples include 
                inference of discretely observed diffusion processes, stochastic 
                volatility models, and double stochastic Poisson (Cox) processes. 
                Likelihood-based inference faces the difficulty that the likelihood 
                is usually not available even numerically. The conventional approach 
                discretizes the stochastic model to approximate the likelihood. 
                To achieve desirable accuracy, one has to use a highly dense 
                discretization; however, dense discretization usually imposes 
                an unbearable computational burden. In this talk we will introduce 
                the framework of Bayesian multi-resolution inference to address 
                this difficulty. By working at different resolution (discretization) 
                levels simultaneously and by letting the resolutions talk to each 
                other, we substantially improve not only the computational efficiency, 
                other, we substantially improve not only the computational efficiency, 
                but also the estimation accuracy. We will illustrate the strength 
                of the multi-resolution approach by examples.
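                For example, for a diffusion 
                $dX_t = \mu(X_t;\theta)\,dt + \sigma(X_t;\theta)\,dW_t$, the 
                discretized likelihood is typically built from the Euler scheme
                \[
                X_{t+\delta} \approx X_{t} + \mu(X_{t};\theta)\,\delta
                + \sigma(X_{t};\theta)\sqrt{\delta}\,Z_{t},
                \qquad Z_{t}\sim N(0,1),
                \]
                whose accuracy improves, and whose cost grows, as the resolution 
                $\delta$ is refined; the multi-resolution framework couples 
                several such $\delta$ levels within one posterior simulation.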
              
            
            ========================
              Hua Liang, University of Rochester Medical Center
              Coauthors: Hansheng Wang and Chih-Ling Tsai
            
             
              Profiled Forward Regression for Ultrahigh Dimensional Variable 
                Screening in Semiparametric Partially Linear Models
                For semiparametric partially linear models, we develop a profiled 
                forward regression (PFR) algorithm for ultrahigh dimensional variable 
                screening. The PFR algorithm effectively combines the ideas of 
                nonparametric profiling and forward regression. This allows us 
                to obtain a uniform bound for the absolute difference between 
                the profiled and original predictors. Based on this important 
                finding, we are able to show that the PFR algorithm discovers 
                all relevant variables within a few fairly short steps. Numerical 
                studies are presented to illustrate the performance of the proposed 
                method.
            
            ========================
              Yufeng Liu, University of North Carolina  at Chapel Hill 
              Coauthors: Helen Hao Zhang (NCSU) and Guang Cheng (Purdue) 
            
             
              Automatic Structure Selection for Partially Linear Models
                Partially linear models provide good compromises between linear 
                and nonparametric models. However, given a large number of covariates, 
                it is often difficult to objectively decide which covariates are 
                linear and which are nonlinear. Common approaches include hypothesis 
                testing methods and screening procedures based on univariate scatter 
                plots. These methods are useful in practice; however, testing 
                the linearity of multiple functions for large dimensional data 
                is both theoretically and practically challenging, and visual 
                screening methods are often ad hoc. In this work, we tackle this 
                structure selection problem in partially linear models from the 
                perspective of model selection. A unified estimation and selection 
                framework is proposed and studied. The new estimator can automatically 
                determine the linearity or nonlinearity for all covariates and 
                at the same time consistently estimate the underlying regression 
                functions. Both theoretical and numerical properties of the resulting 
                estimators are presented.
            
            
            
            ========================
              Jinchi Lv, University of Southern California
             
              Non-Concave Penalized Likelihood with NP-Dimensionality
                Coauthors: Jianqing Fan (Princeton University)
                Penalized likelihood methods are fundamental to ultra-high dimensional 
                variable selection. How high dimensionality such methods can handle 
                remains largely unknown. In this paper, we show that in the context 
                of generalized linear models, such methods possess model selection 
                consistency with oracle properties even for dimensionality of 
                Non-Polynomial (NP) order of sample size, for a class of penalized 
                likelihood approaches using folded-concave penalty functions, 
                which were introduced to ameliorate the bias problems of convex 
                penalty functions. This fills a long-standing gap in the literature, 
                in which the dimensionality was previously allowed to grow only 
                slowly with the sample size. Our results are also applicable to penalized likelihood 
                with the L1-penalty, which is a convex function at the boundary 
                of the class of folded-concave penalty functions under consideration. 
                The coordinate optimization is implemented for finding the solution 
                paths, whose performance is evaluated by a few simulation examples 
                and the real data analysis.
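                A canonical member of the folded-concave class is the SCAD penalty 
                of Fan and Li (2001), defined through its derivative
                \[
                p_{\lambda}'(t) = \lambda\Big\{\mathbf{1}(t\le\lambda)
                + \frac{(a\lambda-t)_{+}}{(a-1)\lambda}\,\mathbf{1}(t>\lambda)\Big\},
                \qquad t\ge 0,\; a>2,
                \]
                which applies the full $\ell_1$ rate to small coefficients and no 
                penalization to large ones, thereby ameliorating the bias of 
                convex penalties.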
              
            
            ========================
              Bin Nan, University of Michigan
              Coauthors: Xuejing Wang, Ji Zhu, Robert Koeppe
            
             
              Sparse 3D Functional Regression via Haar Wavelets
                PET imaging has great potential to aid diagnosis of neurodegenerative 
                diseases, such as Alzheimer's disease or mild cognitive impairment. 
                Commonly used region-of-interest analysis loses detailed voxel-level 
                information. Here we propose a three-dimensional functional linear 
                regression model, treating the PET images as three-dimensional 
                functional covariates. Both the images and the functional regression 
                coefficient are expanded using the same set of Haar wavelet bases. 
                The functional regression model is then reduced to a linear regression 
                model. We find that sparsity of the original functional regression 
                coefficient can be achieved through sparsity of the regression 
                coefficients in the reduced model after the wavelet transformation. 
                The lasso procedure can then be implemented with the level of the 
                Haar wavelet expansion as an additional tuning parameter.
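                A minimal end-to-end sketch of this pipeline, using PyWavelets 
                for the Haar expansion and scikit-learn for the lasso, is given 
                below; the array sizes, decomposition level, and penalty value 
                are illustrative assumptions, and the data are synthetic.

                    import numpy as np
                    import pywt
                    from sklearn.linear_model import Lasso

                    def haar_features(image3d, level):
                        """Flatten a 3D image into its Haar wavelet coefficients."""
                        coeffs = pywt.wavedecn(image3d, 'haar', level=level)
                        arr, _ = pywt.coeffs_to_array(coeffs)
                        return arr.ravel()

                    # images: list of 3D arrays standing in for PET volumes; y: responses.
                    rng = np.random.default_rng(1)
                    images = [rng.standard_normal((8, 8, 8)) for _ in range(50)]
                    y = rng.standard_normal(50)

                    level = 2      # the extra tuning parameter noted above
                    Z = np.vstack([haar_features(img, level) for img in images])
                    fit = Lasso(alpha=0.1).fit(Z, y)   # sparse fit in the wavelet domain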
            
            ========================
              Annie Qu, Department of Statistics, University of Illinois at Urbana-Champaign 
              
              Coauthors: Peng Wang, University of Illinois at Urbana-Champaign; 
              Guei-feng Tsai, Center for Drug Evaluation of Taiwan 
             
              Conditional Inference Functions for Mixed-Effects Models 
                with Unspecified Random-Effects Distribution
                In longitudinal studies, mixed-effects models are important for 
                addressing subject-specific effects. However, most existing approaches 
                assume a normal distribution for the random effects, and this 
                could affect the bias and efficiency of the fixed-effects estimator. 
                Even in cases where the estimation of the fixed effects is robust 
                with a misspecified distribution of the random effects, the estimation 
                of the random effects could be invalid. We propose a new approach 
                to estimate fixed and random effects using conditional quadratic 
                inference functions. The new approach does not require the specification 
                of likelihood functions or a normality assumption for random effects. 
                It can also accommodate serial correlation between observations 
                within the same cluster, in addition to mixed-effects modeling. 
                Other advantages include not requiring the estimation of the unknown 
                variance components associated with the random effects, or the 
                nuisance parameters associated with the working correlations. 
                Real data examples and simulations are used to compare the new 
                approach with the penalized quasi-likelihood approach, and SAS 
                GLIMMIX and nonlinear mixed effects model (NLMIXED) procedures.
            
            ========================
              Sunil Rao, University of Miami, Division of Biostatistics
              Coauthors: Hemant Ishwaran, Cleveland Clinic
            
             
              Mixing Generalized Ridge Regressions
                Hoerl and Kennard proposed generalized ridge regression (GRR) 
                almost forty years ago as a means to overcome the deficiency of 
                least squares in multicollinear problems. Because high-dimensional 
                regression problems naturally involve correlated predictors, in 
                part due to the nature of the data and in part due to artifact 
                of the dimensionality, it is reasonable to consider GRR for addressing 
                these problems. We study GRR in problems in which the number of 
                predictors exceeds the sample size. We describe a novel geometric 
                interpretation of GRR in terms of a uniquely defined least squares 
                estimator. However, the GRR estimator is constrained to lie in a 
                low-dimensional subspace, which limits its effectiveness. To overcome this, we 
                introduce a mixing GRR procedure using easily constructed exponential 
                weights and establish a finite sample minimax bound for this procedure. 
                A dimensionality effect appears in the bound and poses a problem 
                in ultra-high dimensions, which we address by using a mixing GRR 
                for filtering variables. We study the performance of this procedure 
                as well as a hybrid method using a range of examples.
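                For reference, the Hoerl-Kennard generalized ridge estimator 
                assigns each predictor its own ridge parameter,
                \[
                \hat{\beta}_{\mathrm{GRR}}(\Lambda) = (X^{\top}X+\Lambda)^{-1}X^{\top}y,
                \qquad \Lambda=\operatorname{diag}(\lambda_{1},\dots,\lambda_{p}),\;\lambda_{j}\ge 0,
                \]
                and, schematically, the mixing procedure aggregates such fits, 
                $\hat\beta=\sum_{m}w_{m}\,\hat\beta_{\mathrm{GRR}}(\Lambda_{m})$, 
                with exponential weights $w_m$ that downweight poorly fitting 
                $\Lambda_m$ (the exact weights are as in the talk).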
              
            
            ========================
              
            Enayetur Raheem, University of Windsor/Windsor-Essex County 
              Health Unit
              Coauthors: Kjell Doksum, S. E. Ahmed
              
             
              Absolute Penalty and B-spline-based Shrinkage Estimation 
                in Partially Linear Models
                 In the context of a partially linear regression model (PLM), 
                we utilize shrinkage and absolute penalty estimation techniques 
                for simultaneous model selection and parameter estimation. Ahmed 
                et al. (2007), in a similar setup, considered a kernel-based estimate 
                of the nonparametric component, whereas a B-spline basis is considered 
                in our setup. We develop shrinkage semiparametric estimators that 
                improve upon the classical estimators when nuisance covariates 
                are present in the model. In comparing two models, with 
                and without the nuisance covariates, the shrinkage estimators 
                take an adaptive approach: the information contained 
                in the nuisance variables is utilized if it is tested to be useful 
                for the overall fit of the model. Bias expressions and risk properties 
                of the estimators are obtained. An application of the proposed methods 
                to a real data set is provided. 
              Since the B-spline can be incorporated into a regression model 
                easily, we attempted to numerically compare the performance of 
                our proposed method with the lasso. While both shrinkage and lasso 
                outperform classical estimators, shrinkage estimators perform 
                better than lasso in terms of prediction errors when there are 
                many nuisance variables and the sample size is moderately large.
            
            ========================
              Xiaotong Shen, School of Statistics, University of Minnesota
              Coauthors: Hsin-Cheng Huang
              
             
              On simultaneous supervised clustering and feature selection
                In network analysis, genes are known to work in groups according 
                to their biological functionality, and distinctive groups reveal 
                different gene functionalities. In such a situation, identifying 
                grouping structures as well as informative genes becomes critical 
                in understanding progression of a disease. Motivated from gene 
                network analysis, we investigate, in a regression context, simultaneous 
                supervised clustering and feature selection over an arbitrary 
                undirected graph, where each predictor corresponds to one node 
                in the graph and existence of a connecting path between two nodes 
                indicates possible grouping between the two predictors. In this 
                talk, I will discuss methods for simultaneous supervised clustering 
                and feature selection over a graph, and argue that supervised 
                clustering and feature selection are complementary for identifying 
                a simpler model with higher predictive performance. Numerical 
                examples will be given in addition to theory.
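                One natural penalized formulation of this idea (a schematic, not 
                necessarily the talk's exact criterion) augments a selection 
                penalty with a grouping penalty along the graph:
                \[
                \min_{\beta}\;\|y-X\beta\|^{2}
                + \lambda_{1}\sum_{j=1}^{p}|\beta_{j}|
                + \lambda_{2}\sum_{(j,j')\in E}\big|\beta_{j}-\beta_{j'}\big|,
                \]
                where $E$ is the edge set of the graph, so connected predictors 
                are encouraged to share coefficients while irrelevant ones are 
                removed.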
            
            ========================
              Christopher G. Small, University of Waterloo
             
              Multivariate analysis of data in curved shape spaces
                We consider some statistical methods for the analysis of images 
                and objects whose shapes are encoded as points in Kendall shape 
                spaces. Standard multivariate methods, applicable to data in Euclidean 
                spaces, do not directly apply to such contexts. The talk highlights 
                the necessity for methods which respect the essentially non-Euclidean 
                nature of shape spaces. An application to data from anthropology 
                will be given.
            
             
            ========================
              Hao Helen Zhang, North Carolina State University
              Coauthors: Wenbin Lu and Hansheng Wang
              
             
              On Sparse Estimation for Semiparametric Linear Transformation 
                Models
                Semiparametric linear transformation models have received 
                much attention due to their high flexibility in modeling survival 
                data. A useful estimating equation procedure was recently proposed 
                by Chen et al. (2002) for linear transformation models to jointly 
                estimate parametric and nonparametric terms. They showed that 
                this procedure can yield a consistent and robust estimator. However, 
                the problem of variable selection for linear transformation models 
                is less studied, partially because a convenient loss function 
                is not readily available under this context. We propose a simple 
                yet powerful approach to achieve both sparse and consistent estimation 
                for linear transformation models. The main idea is to derive a 
                profiled score from the estimating equation of Chen et al. (2002), 
                construct a loss function based on the profiled score and its 
                variance, and then minimize the loss subject to a shrinkage 
                penalty. We show that the resulting estimator is consistent for 
                both model estimation and variable selection. Furthermore, the 
                estimated parametric terms are asymptotically normal and can achieve 
                higher efficiency than that yielded by the estimating equations. 
                We suggest a one-step approximation algorithm which can take advantage 
                of the LARS path algorithm. Performance of the new procedure is 
                illustrated through numerous simulations and real examples including 
                one microarray data.
            
            ========================
              Hongtu Zhu, Department of Biostatistics and Biomedical Research 
              Imaging Center, UNC-Chapel Hill 
             
              Smoothing Imaging Data in Population Studies
                Coauthors: Yimei Li, Yuan Ying, Runze Li, Steven Marron, Ja-an 
                Lin, Jianqing Fan, John H. Gilmore, Martin Styner, Dinggang Shen, 
                Weili Lin
                Motivated by recent work studying massive imaging data in large 
                neuroimaging studies, we propose various multiscale adaptive smoothing 
                models (MARM) for spatially modeling the relation between high-dimensional 
                imaging measures on a three-dimensional (3D) volume or a 2D surface 
                and a set of covariates. Statistically, MARM can be regarded 
                as a novel generalization of functional principal component analysis 
                (fPCA) and varying coefficient models (VCM) in higher dimensional 
                space compared to the standard fPCA and VCM. We develop novel 
                estimation procedures for MARMs and systematically study their 
                theoretical properties. We conduct Monte Carlo simulation and 
                real data analyses to examine the finite-sample performance of 
                the proposed procedures.
            
            ========================
            
             
              
                S. Ejaz Ahmed and Saber Fallahpour, Department of Mathematics 
                and Statistics
                University of Windsor
              
              
             
             
               
                
                 
                  L1 Penalty and Shrinkage Estimation in Partially Linear 
                    Models with Random Coefficient Autoregressive Errors
                    In partially linear models (PLM) we consider methodology 
                    for simultaneous model selection and parameter estimation 
                    with random coefficient autoregressive errors using lasso 
                    and shrinkage strategies. The current work is an extension 
                    of Ahmed et al. (2007), who considered a PLM with random 
                    errors. We provide natural adaptive estimators that significantly 
                    improve upon the classical procedures in the situation where 
                    some of the predictors are nuisance variables that may or 
                    may not affect the association between the response and the 
                    main predictors. In the context of two competing partially 
                    linear regression models (full and sub-models), we consider 
                    an adaptive shrinkage estimation strategy. We develop the 
                    properties of these estimators using the notion of asymptotic 
                    distributional risk. The shrinkage estimators (SE) are shown 
                    to have a higher efficiency than the classical
                    estimators for a wide class of models. For the lasso-type 
                    estimation strategy, we devise efficient algorithms to obtain 
                    numerical results. We compare the relative performance of 
                    lasso with the shrinkage and other estimators. Monte Carlo 
                    simulation experiments are conducted for various combinations 
                    of the nuisance
                    parameters and sample size, and the performance of each method 
                    is evaluated in terms of simulated mean squared error. The 
                    comparison reveals that the lasso and shrinkage strategies 
                    outperform the classical procedure. The SE performs better 
                    than the lasso strategy in the effective part of the parameter 
                    space when, and only when, there are many nuisance variables 
                    in the model. A data example is showcased to illustrate the 
                    usefulness of suggested methods.
                 
               
              Reference:
              Ahmed, S. E., Doksum, K. A., Hossain, S. and You, 
                J. (2007). Shrinkage, pretest and absolute penalty estimators 
                in partially linear models. Australian & New Zealand Journal of 
                Statistics, 49, 435-454.
              
            
             
              ======================== 
              Billy Chang, Ph.D. Candidate (Biostatistics), Dalla Lana School 
                of Public Health, University of Toronto
                Author: Billy Chang and Rafal Kustra
             
             
               
                Regularization for Nonlinear Dimension 
                  Reduction by Subspace Constraint
                  Sparked by the introduction of Isomap and Locally Linear Embedding 
                  in 2000, nonlinear approaches to dimension reduction have 
                  received unprecedented attention during the past decade. Although 
                  the flexibility of such methods has provided scientists with powerful 
                  tools for feature extraction and visualization, their applications 
                  have focused mainly on large-sample, low-noise settings. In 
                  small-sample, high-noise settings, model regularization is necessary 
                  to avoid over-fitting. Yet, over-fitting issues for nonlinear 
                  dimension reduction have not been widely explored, even for 
                  earlier methods such as kernel PCA and multi-dimensional scaling.
                  
                  Regularization for nonlinear dimension reduction is a non-trivial 
                  task; while an overly-complex model will over-fit, an overly-simple 
                  model cannot detect highly nonlinear signals. To overcome this 
                  problem, I propose performing nonlinear dimension reduction 
                  within a lower-dimensional subspace. As such, one can increase 
                  the model complexity for nonlinear pattern search, while over-fitting 
                  is avoided as the model is not allowed to traverse through all 
                  possible dimensions. The crux of the problem lies in finding 
                  the subspace containing the nonlinear signal, and I will discuss 
                  a Kernel PCA approach for the subspace search, and a principal 
                  curve approach for nonlinear basis construction.
              
            
            ========================
              Abdulkadir Hussein, Ejaz Ahmed and Marwan Al-Momani, U of Windsor
             
               
                 
                  To homogenize or not to homogenize: The case of linear 
                    mixed models
                    The problem of whether given data support heterogeneous 
                    or homogeneous models has a long history, and perhaps its major 
                    manifestation is in the form of generalized linear mixed models. 
                    By heterogeneous models we mean models where diversity among 
                    possible subpopulations is accommodated by using variance 
                    components. Among other areas, this problem arises in economics, 
                    finance, and biostatistics under various names such as panel, 
                    longitudinal, or cluster-correlated data. Homogeneity is a 
                    desired property, while heterogeneity is often a fact of life. 
                    In order to reconcile these two types of models and seek unity 
                    in diversity, we propose and explore several shrinkage-type 
                    estimators for the regression coefficients as well as 
                    for the variance components. We examine the merits of the 
                    different estimators by using asymptotic risk assessment measures 
                    and Monte Carlo simulations. We apply the proposed 
                    methods to income panel data. 
                 
              
            
             
              ========================
                Azadeh Shohoudi, McGill
             
             
              Variable Selection in Multipath Change-point 
                Problems
              Follow-up studies are frequently carried out to 
                study the evolution of one or several measurements taken on subjects 
                through time. When a stimulus is administered to subjects, it 
                is of interest to study the reaction times, or change-points. One 
                may want to select the covariates that accelerate reaction to 
                the stimulus. Selecting effective covariates in this setting poses 
                a challenge when the number of covariates is large. We develop 
                such a methodology and study the large-sample behavior of the method. 
                Small-sample behavior is studied by means of simulation. The 
                method is applied to a Parkinson's disease data set.
                
              
            
             
              ========================
                Xin Tong, Princeton University 
                Coauthors: Philippe Rigollet (Princeton University) 
             
             
               
                Neyman-Pearson classification, convexity and stochastic 
                  constraints
                  Motivated by problems of anomaly detection, this paper implements 
                  the Neyman-Pearson paradigm to deal with asymmetric errors in 
                  binary classification with a convex loss. Given a finite collection 
                  of classifiers, we combine them and obtain a new classifier 
                  that satisfies simultaneously the two following properties with 
                  high probability: (i) its probability of type~I error is below 
                  a pre-specified level, and (ii) its probability of type~II 
                  error is close to the minimum possible. The proposed classifier 
                  is obtained by solving an optimization problem with an empirical 
                  objective and an empirical constraint. New techniques to handle 
                  such problems are developed and have consequences on chance 
                  constrained programming. 
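                  In schematic form, the proposed classifier solves an empirically 
                  constrained program of the type
                  \[
                  \hat{h} \in \operatorname*{argmin}_{h\in\mathcal{H}}\;\hat{R}_{1}(h)
                  \quad\text{subject to}\quad \hat{R}_{0}(h)\le\alpha+\kappa,
                  \]
                  where $\hat R_0$ and $\hat R_1$ are empirical convex surrogates 
                  of the type~I and type~II errors, $\alpha$ is the prescribed 
                  type~I level, and $\kappa$ a small slack term (the precise slack 
                  is as in the paper).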
               
            
            
             
              
            
            ========================
              Chen Xu (Dept. of Stat., UBC), Song Cai (Dept. of Stat., UBC)
             
              Soft Thresholding-Based Screening for Ultrahigh-Dimensional 
                Feature Spaces
                Variable selection and feature extraction are fundamental 
                for knowledge discovery and statistical modeling with high-dimensionality. 
                To reduce the computational burden, variable screening techniques, 
                such as the Sure Independence Screening (SIS; Fan and Lv, 2008), 
                are often used before the formal analysis. In this work, we propose 
                another computationally efficient procedure for variable screening 
                through a soft thresholding-based iteration (namely, the soft 
                thresholding screening, STS). The STS efficiently screens 
                out most of the irrelevant features (covariates) while keeping those 
                important ones in the model with high probability. With the dimensionality 
                reduced from high to low, the refined model after STS then serves 
                as a good starting point for further selection. The excellent 
                performance of STS is supported by various numerical studies.
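                The iteration at the heart of such a procedure can be sketched as 
                below; the step size, threshold level, and stopping rule shown 
                are illustrative choices of ours, not the tuned STS settings.

                    import numpy as np

                    def soft(z, lam):
                        """Soft-thresholding operator applied componentwise."""
                        return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

                    def sts_sketch(X, y, lam, step=None, n_iter=100):
                        """Iterate a gradient step followed by soft thresholding and
                        return the indices of the features that survive."""
                        n, p = X.shape
                        if step is None:
                            step = 1.0 / np.linalg.norm(X, 2) ** 2  # 1/L for the gradient
                        beta = np.zeros(p)
                        for _ in range(n_iter):
                            beta = soft(beta + step * X.T @ (y - X @ beta), step * lam)
                        return np.nonzero(beta)[0]   # screened-in feature indices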