Warsaw R Enthusiasts Meetup
- 18:00 - 18:05 - Welcome
- 18:05 - 18:45 - Causal Inference with Missing Values: Treatment Effect Estimation of Tranexamic Acid on Mortality for Traumatic Brain Injury Patients - Julie Josse
- 18:45 - 19:15 - Pizza break sponsored by Appsilon
- 19:15 - 19:55 - Nonparametric imputation by data depth - Pavlo Mozharovskyi
- 19:55 - 20:00 - A few words from Appsilon
We are excited to announce the next meetup. We have the pleasure of hosting two international guests and active members of the R community: Julie Josse and Pavlo Mozharovskyi.
Julie's abstract: In healthcare or social sciences research, prospective observational studies are frequent, relatively easily put in place (compared to experimental randomized trials, for instance) and allow for different kinds of posterior analyses such as causal inference. Average treatment effect (ATE) estimation, for instance, is possible through the use of propensity scores, which make it possible to correct for treatment assignment biases in the non-randomized study design. However, a major caveat of large observational studies is their complexity and incompleteness: the covariates are often taken at different levels and stages, they can be heterogeneous – categorical, discrete, continuous – and almost inevitably contain missing values. The problem of missing values in causal inference has long been ignored and has only recently gained some attention due to the non-negligible impact in terms of power and bias induced by complete case analyses. We propose several consistent doubly robust average treatment effect estimators which directly account for missing values and compare them to complete case ATE estimators applied to imputed data, i.e. to complete data obtained by replacing every missing value by at least one plausible one, and to the recently proposed method of Kallus et al. We assess the performance of our estimators on a large prognostic database containing detailed information about over 15,000 severely traumatized patients in France. Using the proposed ATE estimators and this database, we study the effect on mortality of tranexamic acid administration to patients with traumatic brain injury in the context of critical care management.
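For readers new to doubly robust estimation, here is a minimal Python sketch of the augmented inverse probability weighting (AIPW) idea on complete toy data. It is not one of the estimators from the talk and does not handle missing values. The function name `aipw_ate`, the toy `rows`, and the choice of per-stratum averages as propensity and outcome models are all assumptions made for illustration.

```python
from collections import defaultdict

def aipw_ate(rows):
    """AIPW estimate of the ATE from (x, t, y) triples.

    x is a discrete covariate, t is 0/1 treatment, y is the outcome.
    Nuisance models are plain per-stratum averages (an illustrative choice).
    Every stratum must contain both treated and control units.
    """
    n_x = defaultdict(int)        # units per stratum
    n_treated = defaultdict(int)  # treated units per stratum
    n_xt = defaultdict(int)       # units per (stratum, arm)
    sum_y = defaultdict(float)    # outcome totals per (stratum, arm)
    for x, t, y in rows:
        n_x[x] += 1
        n_treated[x] += t
        n_xt[(x, t)] += 1
        sum_y[(x, t)] += y

    total = 0.0
    for x, t, y in rows:
        e = n_treated[x] / n_x[x]             # propensity score e(x)
        mu1 = sum_y[(x, 1)] / n_xt[(x, 1)]    # outcome model, treated
        mu0 = sum_y[(x, 0)] / n_xt[(x, 0)]    # outcome model, control
        # Outcome-model contrast plus inverse-probability-weighted residuals.
        total += (mu1 - mu0) + t * (y - mu1) / e - (1 - t) * (y - mu0) / (1 - e)
    return total / len(rows)

# Toy data built as y = 2*x + 1*t, so the true effect is 1.
# Treatment is more frequent when x = 1, which confounds a naive comparison.
rows = [(0, 1, 1.0), (0, 0, 0.0), (0, 0, 0.0), (0, 0, 0.0),
        (1, 1, 3.0), (1, 1, 3.0), (1, 1, 3.0), (1, 0, 2.0)]

treated = [y for _, t, y in rows if t == 1]
control = [y for _, t, y in rows if t == 0]
naive = sum(treated) / len(treated) - sum(control) / len(control)
ate = aipw_ate(rows)
print(naive, ate)
```

On this toy data the naive difference in means overstates the effect because treated units tend to have the larger covariate value, while the AIPW estimate recovers the effect built into the data.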
Pavlo's abstract: We present a single imputation method for missing values which borrows the idea of data depth, a measure of centrality defined for an arbitrary point of a space with respect to a probability distribution or data cloud. This consists in iterative maximization of the depth of each observation with missing values, and can be employed with any properly defined statistical depth function. For each single iteration, imputation reverts to optimization of quadratic, linear, or quasiconcave functions that are solved analytically, by linear programming, or by the Nelder-Mead method. As it accounts for the underlying data topology, the procedure is distribution free, allows imputation close to the data geometry, can make predictions in situations where local imputation (k-nearest neighbors, random forest) cannot, and has attractive robustness and asymptotic properties under elliptical symmetry. It is shown that a special case, when using the Mahalanobis depth, has a direct connection to well-known methods for the multivariate normal model, such as iterated regression and regularized PCA. The methodology is extended to multiple imputation for data stemming from an elliptically symmetric distribution. Simulation and real data studies show good results compared with existing popular alternatives. The method has been implemented as an R package.
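To make the Mahalanobis special case concrete, here is a minimal Python sketch for bivariate data. It is not the R package mentioned above. The function name `impute_mahalanobis_2d`, the use of plain moment estimates for the mean and covariance, and the fixed iteration count are assumptions made for illustration. With those estimates held fixed, the value of a missing coordinate that maximizes the Mahalanobis depth is the regression prediction from the observed coordinate, so the loop amounts to iterated regression.

```python
def impute_mahalanobis_2d(data, n_iter=100):
    """Impute None entries in a list of [x, y] pairs.

    Assumes at most one missing coordinate per row and a non-constant
    observed column. Illustrative sketch only.
    """
    pts = [list(p) for p in data]
    missing = [(i, j) for i, p in enumerate(data) for j in (0, 1) if p[j] is None]

    # Start from the mean of the observed values in each column.
    for j in (0, 1):
        observed = [p[j] for p in data if p[j] is not None]
        col_mean = sum(observed) / len(observed)
        for i, jj in missing:
            if jj == j:
                pts[i][j] = col_mean

    n = len(pts)
    for _ in range(n_iter):
        # Moment estimates from the currently completed data.
        mu = [sum(p[j] for p in pts) / n for j in (0, 1)]
        var = [sum((p[j] - mu[j]) ** 2 for p in pts) / n for j in (0, 1)]
        cov = sum((p[0] - mu[0]) * (p[1] - mu[1]) for p in pts) / n
        for i, j in missing:
            k = 1 - j  # index of the observed coordinate
            # Depth-maximizing value: minimizes the Mahalanobis distance
            # over the missing coordinate given the observed one.
            pts[i][j] = mu[j] + cov / var[k] * (pts[i][k] - mu[k])
    return pts

# Points on the line y = 2x, with y missing at x = 4.
data = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, None]]
completed = impute_mahalanobis_2d(data)
print(completed[4][1])
```

In this toy example the imputed value converges towards 8, which lies outside the range of the observed y values. An imputation that averages the observed values of nearby rows cannot produce such a value, which illustrates the remark about local methods in the abstract.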
The meetup sessions will be recorded and made available on the YouTube channel, thanks to Appsilon.