In general, my research is on numerical optimization problems arising in statistical inference. Specifically, I focus on creating, analyzing, and implementing optimization algorithms for settings where the size of the data set or the dimension of the problem is so large that classical optimization methods become inefficient or inapplicable.
Structured Identifiability of Dynamical Systems under Partial Observability and Random Driving Inputs. In progress. Keywords: identifiability, dynamical systems, power systems.
A Statistical Approach to Dynamic Load Modelling and Identification with High Frequency Measurements. In progress. Keywords: statistical estimation, power systems.
Kalman-based Stochastic Gradient Method for Generalized Linear Models. In progress. Keywords: statistical estimation.
On Why SGD Fails in Practice: Stalling, Conditioning, Divergence, and Non-convex Objectives. In progress. Keywords: optimization, machine learning. (arXiv)
Patel, V. Kalman-based Stochastic Gradient Method with Stop Condition and Insensitivity to Conditioning. SIAM Journal on Optimization 2016. Keywords: optimization, machine learning, statistical estimation. (arXiv, DOI, abstract)
Abstract. Modern proximal and stochastic gradient descent (SGD) methods are believed to efficiently minimize large composite objective functions, but such methods have two algorithmic challenges: (1) a lack of fast or justified stop conditions, and (2) sensitivity to the objective function's conditioning. In response to the first challenge, modern proximal and SGD methods guarantee convergence only after multiple epochs, but such a guarantee renders proximal and SGD methods infeasible when the number of component functions is very large or infinite. In response to the second challenge, second order SGD methods have been developed, but they are marred by the complexity of their analysis. In this work, we address these challenges on the limited, but important, linear regression problem by introducing and analyzing a second order proximal/SGD method based on Kalman Filtering (kSGD). Through our analysis, we show kSGD is asymptotically optimal, develop a fast algorithm for very large, infinite or streaming data sources with a justified stop condition, prove that kSGD is insensitive to the problem's conditioning, and develop a unique approach for analyzing the complex second order dynamics. Our theoretical results are supported by numerical experiments on three regression problems (linear, nonparametric wavelet, and logistic) using three large publicly available datasets. Moreover, our analysis and experiments lay a foundation for embedding kSGD in multiple epoch algorithms, extending kSGD to other problem classes, and developing parallel and low memory kSGD implementations.
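The Kalman-filtering view of SGD described in the abstract above can be illustrated with a minimal sketch for linear regression: the coefficient vector is treated as a static state, each observation as a noisy measurement, and the usual Kalman gain and covariance updates produce the parameter estimate. The function name, the diffuse prior, and the fixed noise variance below are my own illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def ksgd_linear(X, y, noise_var=1.0):
    """Kalman-filter-style recursive update for linear regression.

    Treats the unknown coefficient vector beta as a static state and each
    observation pair (x_i, y_i) as a noisy measurement y_i = x_i' beta + eps,
    with eps having variance noise_var.
    """
    n, p = X.shape
    beta = np.zeros(p)       # state estimate
    P = 1e3 * np.eye(p)      # state covariance (diffuse prior; an assumption)
    for i in range(n):
        x = X[i]
        Px = P @ x
        s = x @ Px + noise_var            # innovation variance
        K = Px / s                        # Kalman gain
        beta = beta + K * (y[i] - x @ beta)
        P = P - np.outer(K, Px)           # covariance (conditioning) update
    return beta, P

# Example: recover known coefficients from simulated noisy data
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))
true_beta = np.array([2.0, -1.0, 0.5])
y = X @ true_beta + 0.1 * rng.normal(size=2000)
beta_hat, P = ksgd_linear(X, y, noise_var=0.01)
```

Because the gain adapts to the running covariance P rather than a fixed step size, this style of update is what gives the method its insensitivity to the problem's conditioning; P also supplies the uncertainty estimate behind the stop condition.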
Patel, V. A Low-Memory Kalman Filter for Large-Scale, Online Learning. In progress. Keywords: optimization, machine learning.
Patel, V. An Optimal Randomized Iterative Method for Linear Systems and Linear Regression. In progress. Keywords: optimization, linear algebra.
Maldonado, D.A., Patel, V., Anitescu, M., Flueck, A. A Statistical Approach to Dynamic Load Modelling and Identification with High Frequency Measurements. Accepted to Power & Energy Society General Meeting 2017. Keywords: statistical estimation, power systems. (preprint, abstract)
Abstract. As distribution systems become less passive and more complex, accurate dynamic load models are essential to the safe and reliable operation of the network. Dynamic load modelling can be addressed using black-box models, which, while effective in some circumstances, do not provide insight into elements in the network. On the other hand, dynamic load modelling using white-box models can provide insight, but requires extensive knowledge about the number, parameters and type of elements composing the load, which is generally not available. In this work, using white-box modelling, load aggregation, available knowledge of the network, and high frequency measurements, we contribute a statistical methodology to estimate the number, parameters and types of elements composing the load, and also quantify the uncertainty in these estimates. We validate our aggregation technique and estimation framework using simulated data, and test the sensitivity of these results to our underlying assumptions.
Documentation: Coming soon.
Description: A simple implementation of Stochastic Gradient Descent (SGD) and Kalman-based Stochastic Gradient Descent (kSGD) for the R Language on both regular and large data sets. For working with large data sets, the implementation depends on the bit and ffbase packages.
Nota bene: This is not the fastest implementation of the kSGD algorithm, because it is written entirely in R. I am working on a C version with an R interface to improve calculation speed.
I started my Ph.D. studies at the University of Chicago, Department of Statistics, in October 2013. My coursework focused on probability theory, applied mathematics, and machine learning. I have been the teaching assistant for a number of courses, and I taught an introductory calculus-based statistics course for undergraduates in the Winter quarter of 2015. My thesis research is advised by Mihai Anitescu.
Senior Computational Mathematician, Argonne National Laboratory
Professor, University of Chicago
In June 2013, I received a master's degree from the University of Cambridge. My coursework in Part III of the Mathematical Tripos covered a range of fields, from compressed sensing and numerical methods for PDEs to such theoretical fields as the theory of generalized functions and nonparametric statistical theory. As a member of Churchill College, my studies were supervised by James Norris.
Director of the Statistical Laboratory
Professor, University of Cambridge
Between September 2008 and May 2012, I completed a Bachelor of Science at Rutgers University, majoring in both Applied Physics and Biomathematics with a minor in Inorganic Chemistry. My research was in Biomedical Engineering: under the advisement of David Shreiber, I studied the biomechanics of nervous system tissue through experiments and computer simulations.
Graduate Director, Biomedical Engineering
Professor, Rutgers University
Teaching Assistant. I have assisted in teaching a number of undergraduate and graduate courses: Elementary Statistics, Numerical Linear Algebra, Sample Surveys, and Nonparametric Inference.
Data-Intensive Computing Reading Group. In Autumn 2015, I started a reading group on data-intensive computing systems. Here is my original reading list. If you are interested in joining, subscribe here.
Student Representative. From October 2014 to September 2015, I served as the Student Representative for the Department of Statistics to the Dean's Student Advisory Committee. In this capacity, I also represented student interests to the Statistics faculty.
PSD Co-Organizer. During the 2014 to 2015 academic year, I helped start and organize a series of graduate student lectures to encourage interdisciplinary conversations between the departments in the Physical Sciences Division.
SIAM Travel Award. Awarded to travel to SIAM UQ 2016 in Lausanne, Switzerland.
These are some of my notes from lectures, courses, and books on certain topics. If you find errors, please email me. Also, some sections are missing, which I plan to complete over time.