SPUR 2020: An Unsupervised Learning Analysis to Identify Subgroups in Women with Pregnancy Complications


Around 10% of pregnant women in the US suffer from placental insufficiency due to several common pregnancy complications, including preeclampsia, placental abruption, intrauterine fetal growth restriction (IUFGR), stillbirth, and Neonatal Intensive Care Unit (NICU) admission. Despite extensive research on modeling the causes of placental insufficiency, identifying the relationships among predictors in pregnant women is still an open research area. Most current studies analyze only a few categories of predictors. In this study, we will use broad categories of predictors: demographics (age, race, ethnicity, sex of fetus); social determinants of health (income, education, social support); ultrasound parameters; medications and vaccinations; food intake; vital signs; and placental analyses. We will focus on a real multi-site dataset containing cohorts of women from different states who had one or more of the complications above that cause placental insufficiency. We will analyze the predictors recorded during the women's pregnancy visits with respect to the complications observed at their delivery visits. We will use unsupervised machine learning techniques to reduce the number of predictors in the dataset and to identify patient subgroups that show potentially frequent relationships among predictors with respect to the women's complications. This project entails collaboration between informatics researchers and clinical experts to identify challenges and propose relevant solutions. We will require the student selected for the project and the lab students to exchange ideas and solve problems together.


Student Role

The student selected for the project will use the dataset to pull patient predictors such as demographic data, symptoms, and signs, as well as timestamped repeated-measure data such as vital signs, laboratory tests, procedure codes, and medications. The student will review the literature to identify how to use these factors with respect to pregnancy complications. The student will master machine learning concepts through the sklearn and tensorflow packages and will use these concepts and packages to apply data cleaning methods and descriptive analysis to filter out noise and non-risk factors, respectively, from the data. She or he will also run both the machine learning techniques currently used in clinics and our lab's proposed solution on the pulled data. The student will then prepare a report comparing the results of the different solutions, which will require her or him to draft a publication including a literature review, exploratory analysis, results, discussion, limitations, and future work. If the student lacks the expertise to complete these tasks, the lab will support her or him with the necessary documents, resources, and guides. Dr. Abdelrahman will mentor the project student and the lab students and will encourage the lab students to communicate and exchange ideas to help the project student acquire the necessary expertise. The project student will communicate with the lab students and will attend meetings with other lab members and project partners. The student will regularly present her or his findings from the meetings and the project tasks to get feedback from the lab members.
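As a rough illustration of the kind of workflow described above (clean the data, reduce the number of predictors, then look for patient subgroups), here is a minimal sklearn sketch. Everything in it is an assumption made for illustration: the data are synthetic random numbers rather than the project dataset, and median imputation, PCA, and k-means are stand-ins chosen for brevity, not the lab's proposed solution or the techniques the project will necessarily use.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a patient-by-predictor table (NOT real patient data):
# two hypothetical groups of 100 patients with 12 numeric predictors each.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=1.0, size=(100, 12))
group_b = rng.normal(loc=3.0, scale=1.0, size=(100, 12))
X = np.vstack([group_a, group_b])

# Blank out roughly 5% of the values to mimic missing measurements.
missing = rng.random(X.shape) < 0.05
X[missing] = np.nan

# Data cleaning and predictor reduction (illustrative choices):
# fill gaps with the column median, standardize, keep 3 principal components.
preprocess = make_pipeline(
    SimpleImputer(strategy="median"),
    StandardScaler(),
    PCA(n_components=3, random_state=0),
)
X_reduced = preprocess.fit_transform(X)

# Subgroup identification: k-means with an assumed number of subgroups (2).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_reduced)

# Number of patients assigned to each subgroup.
subgroup_sizes = np.bincount(labels)
print(subgroup_sizes)
```

In practice the number of components and the number of subgroups would have to be chosen from the data and reviewed with the clinical experts, since the interpretation of the resulting subgroups is a clinical question as much as a technical one.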


Student Learning Outcomes & Benefits

The student selected for the project will learn from two perspectives: the clinical domain and unsupervised learning. From the clinical perspective, the student will gain insights into (i) pregnancy complications, (ii) the characteristics of each complication, (iii) the predictors of each complication, and (iv) the interpretation of the resulting subgroups. From the machine learning perspective, she or he will learn the concepts and technical aspects of the sklearn and tensorflow packages, which will enable her or him to apply them in the project. Of note, if the student has this expertise before joining the lab, she or he will gain more practice and hands-on skills to help propose a new solution to the project questions. On the other hand, if the student does not have enough knowledge or skills to gain such an understanding, Dr. Abdelrahman will either strengthen the student’s knowledge or adapt or reduce the outcomes of each project perspective separately so that she or he benefits as much as possible from the project. The student will acquire presentation, communication, collaboration, and publication skills and will be trained to work both independently and in a group. These skills will support her or him in a future data science career in industry or academia.


Samir E. Abdelrahman
Assistant Professor

Biomedical Informatics
School of Medicine

Dr. Abdelrahman has more than a decade of experience mentoring approximately twenty-five undergraduate and fifteen graduate students. His goal is to gradually educate students about the basic concepts so that, eventually, they become completely independent in developing and validating their ideas. Therefore, he starts every project by setting up clear milestones with clear definitions of goals, input, and output. He meets with the involved students at the time of the first milestone to introduce the project details and team and to discuss their opinions about the next milestones. Based on these discussions, Dr. Abdelrahman sets the time intervals for the project’s regular meetings, informs the students how they should report their findings and communicate with each other, and suggests modifications to the milestones, if needed. At subsequent milestones, he assesses the students’ performance and identifies any challenges that need to be addressed. He also encourages the students who have successfully solved their problems to develop new ideas. For those who face challenges, he meets with each student individually to understand her or his problems and to find an adequate solution. If the solution requires any change in the student’s activities, Dr. Abdelrahman adapts the milestones to reflect the student’s skills and thoughts. As a mentor, he also encourages his lab students to share ideas, and he organizes social meetings so the students can interact outside the lab environment.