With the increasing availability of electronic health records, public health surveys, and real-time medical monitoring, health data analysis has become a vital tool in shaping clinical and policy decisions. Advanced data analytics methods are playing a growing role in understanding disease progression, optimizing care delivery, and improving population health outcomes. From predictive modeling and multivariate analysis to functional data analysis, machine learning, and anomaly detection, these techniques are transforming how we extract actionable insights from complex health data. This page highlights my applied work in this space, where I leverage statistical and computational tools to explore meaningful patterns in health-related datasets and support evidence-based decision-making.

Detecting Cardiac Abnormalities from ECG Signals:

Electrocardiograms (ECGs) are widely used in clinical settings to monitor heart activity and diagnose heart disease. Given the high mortality associated with cardiac conditions, early and accurate detection of abnormal ECG patterns is critical. In recent years, there has been growing interest in using computational methods to support this diagnostic process.

In this project, I explored the classification of ECG signals as normal or abnormal using functional data analysis techniques. Treating each ECG signal as a curve, I applied B-spline basis expansion and Functional Principal Component Analysis (FPCA) to extract informative features. A classification model was then trained on these features to distinguish between healthy and irregular cardiac activity. The analysis used the publicly available ECG200 dataset from the UCR Time Series Classification Archive.
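The steps above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the code used in the project: the curves are synthetic stand-ins for ECG200, FPCA is approximated by PCA on the smoothed curves evaluated on an equally spaced grid, and logistic regression (fit by plain gradient descent) stands in for the classifier. The helper `bspline_basis`, the 15 basis functions, and the 3 retained components are illustrative choices.

```python
import numpy as np

def bspline_basis(x, n_basis, degree=3):
    """Evaluate n_basis clamped B-spline basis functions on [0, 1] (Cox-de Boor)."""
    interior = np.linspace(0.0, 1.0, n_basis - degree + 1)[1:-1]
    knots = np.concatenate([np.zeros(degree + 1), interior, np.ones(degree + 1)])
    # Degree 0: indicator of each half-open knot interval.
    B = np.zeros((len(x), len(knots) - 1))
    for j in range(len(knots) - 1):
        B[:, j] = (knots[j] <= x) & (x < knots[j + 1])
    B[x == 1.0, n_basis - 1] = 1.0  # assign the right endpoint to the last interval
    # Raise the degree one step at a time, skipping zero-width knot spans.
    for d in range(1, degree + 1):
        B_new = np.zeros((len(x), len(knots) - 1 - d))
        for j in range(len(knots) - 1 - d):
            left = knots[j + d] - knots[j]
            right = knots[j + d + 1] - knots[j + 1]
            if left > 0:
                B_new[:, j] += (x - knots[j]) / left * B[:, j]
            if right > 0:
                B_new[:, j] += (knots[j + d + 1] - x) / right * B[:, j + 1]
        B = B_new
    return B

# Synthetic two-class curves (assumption: NOT the ECG200 data).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 96)
base = np.sin(2 * np.pi * t)
bump = np.exp(-0.5 * ((t - 0.5) / 0.05) ** 2)  # extra feature in "abnormal" curves
X = np.vstack([base + rng.normal(0, 0.2, (100, t.size)),
               base + bump + rng.normal(0, 0.2, (100, t.size))])
y = np.repeat([0, 1], 100)
perm = rng.permutation(len(y))
train, test = perm[:100], perm[100:]

# 1) B-spline smoothing: least-squares fit of each curve on the basis.
B = bspline_basis(t, n_basis=15)
coef, *_ = np.linalg.lstsq(B, X.T, rcond=None)
X_smooth = (B @ coef).T

# 2) Discretized FPCA: principal components of the centered training curves.
mean_curve = X_smooth[train].mean(axis=0)
_, _, Vt = np.linalg.svd(X_smooth[train] - mean_curve, full_matrices=False)
scores = (X_smooth - mean_curve) @ Vt[:3].T

# 3) Logistic regression on standardized FPCA scores.
mu, sd = scores[train].mean(axis=0), scores[train].std(axis=0)
Z = np.column_stack([np.ones(len(y)), (scores - mu) / sd])
w = np.zeros(Z.shape[1])
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(Z[train] @ w)))
    w -= 0.5 * Z[train].T @ (p - y[train]) / len(train)

accuracy = ((Z[test] @ w > 0).astype(int) == y[test]).mean()
print("test accuracy on synthetic curves:", accuracy)
```

The mean curve, principal components, and score scaling are all estimated on the training curves only, so the held-out curves play no part in feature extraction.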

Logistic Group Lasso Classifier to Predict Colon Cancer:

In this project, I developed a logistic Group Lasso classifier to predict the presence of colon cancer from gene expression data. The dataset contains 72 samples and 100 predictors, obtained by expanding each of 20 original genes with 5 B-spline basis functions, so that every gene corresponds to a group of 5 predictors.

The model was trained using the first 50 samples and evaluated on the remaining 22 to assess predictive accuracy. Group Lasso was used to encourage group-wise sparsity, allowing for the selection of relevant genes while accounting for the grouped structure introduced by the B-spline expansion.
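A setup of this kind can be sketched in Python as follows. It is a hedged illustration rather than the project's implementation: the expression values and labels are synthetic, the helpers `bspline_basis` and `fit_group_lasso` are hypothetical names, the solver is a basic proximal gradient method with block soft-thresholding, and the penalty weight (30% of the smallest value that zeroes out every group) is an arbitrary choice. Only the dimensions (72 samples, 20 genes, 5 basis functions, 50/22 split) come from the description above.

```python
import numpy as np

def bspline_basis(x, n_basis, degree=3):
    """Evaluate n_basis clamped B-spline basis functions on [0, 1] (Cox-de Boor)."""
    interior = np.linspace(0.0, 1.0, n_basis - degree + 1)[1:-1]
    knots = np.concatenate([np.zeros(degree + 1), interior, np.ones(degree + 1)])
    B = np.zeros((len(x), len(knots) - 1))
    for j in range(len(knots) - 1):
        B[:, j] = (knots[j] <= x) & (x < knots[j + 1])
    B[x == 1.0, n_basis - 1] = 1.0  # assign the right endpoint to the last interval
    for d in range(1, degree + 1):
        B_new = np.zeros((len(x), len(knots) - 1 - d))
        for j in range(len(knots) - 1 - d):
            left = knots[j + d] - knots[j]
            right = knots[j + d + 1] - knots[j + 1]
            if left > 0:
                B_new[:, j] += (x - knots[j]) / left * B[:, j]
            if right > 0:
                B_new[:, j] += (knots[j + d + 1] - x) / right * B[:, j + 1]
        B = B_new
    return B

def fit_group_lasso(X, y, groups, lam, n_iter=2000):
    """Logistic group lasso via proximal gradient; the intercept is unpenalized."""
    n, p = X.shape
    b0 = np.log(y.mean() / (1 - y.mean()))
    beta = np.zeros(p)
    # Step size 1/L, with L an upper bound on the gradient's Lipschitz constant.
    step = 4 * n / np.linalg.norm(np.column_stack([np.ones(n), X]), 2) ** 2
    for _ in range(n_iter):
        resid = 1.0 / (1.0 + np.exp(-(b0 + X @ beta))) - y
        b0 -= step * resid.mean()
        beta = beta - step * X.T @ resid / n
        for g in np.unique(groups):  # block soft-thresholding, one group at a time
            idx = groups == g
            norm = np.linalg.norm(beta[idx])
            thresh = step * lam * np.sqrt(idx.sum())
            beta[idx] = 0.0 if norm <= thresh else (1 - thresh / norm) * beta[idx]
    return b0, beta

# Synthetic stand-in data (assumption: NOT the colon cancer dataset).
rng = np.random.default_rng(1)
n, n_genes, n_basis = 72, 20, 5
genes = rng.normal(size=(n, n_genes))
y = (genes[:, 0] + genes[:, 1] > 0).astype(float)  # hypothetical signal in genes 0 and 1
train, test = np.arange(50), np.arange(50, n)

# Expand each gene into 5 B-spline features: 20 groups of 5 = 100 predictors.
lo, hi = genes[train].min(axis=0), genes[train].max(axis=0)
scaled = np.clip((genes - lo) / (hi - lo), 0.0, 1.0)
X = np.hstack([bspline_basis(scaled[:, j], n_basis) for j in range(n_genes)])
X = X - X[train].mean(axis=0)
groups = np.repeat(np.arange(n_genes), n_basis)

# Smallest penalty at which every group is zero, then fit at a fraction of it.
grad0 = X[train].T @ (y[train].mean() - y[train]) / len(train)
lam_max = max(np.linalg.norm(grad0[groups == g]) / np.sqrt(n_basis)
              for g in range(n_genes))
b0, beta = fit_group_lasso(X[train], y[train], groups, lam=0.3 * lam_max)

selected = [g for g in range(n_genes) if np.any(beta[groups == g] != 0)]
pred = (b0 + X[test] @ beta > 0).astype(float)
print("selected genes:", selected)
print("test accuracy:", (pred == y[test]).mean())
```

Because the threshold acts on the norm of each 5-coefficient block, a gene is either dropped entirely or kept with all of its basis coefficients, which is what makes the selection gene-wise rather than coefficient-wise. In practice the penalty weight would typically be chosen by cross-validation on the training samples.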