rmadhu2131

May 20, 2023 · 2 min

Diagnosing Gestational Diabetes using ML

Gestational diabetes mellitus (GDM) can lead to complications for both mother and baby. Treatment includes special meal plans and scheduled physical activity, and it may also include daily blood glucose testing and insulin injections. Early screening can improve pregnancy outcomes by reducing complications such as emergency cesarean section, neonatal hypoglycemia and macrosomia. While working on gestational diabetes data, a question came up: if GDM could be predicted at a patient's first visit from a few basic biomarkers, it might be helpful for the patient. So machine learning has been used on the gestational diabetes data to predict a patient's chances of developing GDM in later trimesters.

Importing Libraries :

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.graph_objs as go
import plotly.offline as py
from collections import Counter

from sklearn import metrics, preprocessing
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import KNNImputer, SimpleImputer
from sklearn.linear_model import LogisticRegression  # needed for the logistic regression model used later
from sklearn.metrics import accuracy_score, f1_score, precision_score, roc_auc_score
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split, StratifiedKFold

from imblearn.under_sampling import RandomUnderSampler
from xgboost import XGBClassifier
 

# Loading data (file upload dialog in Google Colab)
from google.colab import files
uploaded = files.upload()

Reading the data :
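The reading step itself is not shown here, so the following is only a minimal sketch. It assumes the uploaded file is a CSV and relies on `files.upload()` returning a dictionary that maps each file name to its raw bytes. The helper name `read_uploaded_csv`, the file name `gdm_data.csv` and the two sample rows are hypothetical stand-ins, not the real dataset.

```python
import io

import pandas as pd


def read_uploaded_csv(uploaded):
    """Read the first uploaded file (a filename -> bytes mapping) into a DataFrame."""
    filename = next(iter(uploaded))  # first (usually the only) uploaded file
    return pd.read_csv(io.BytesIO(uploaded[filename]))


# Stand-in for the dict returned by files.upload(); the content is made up for illustration.
uploaded_example = {"gdm_data.csv": b"SystolicBP,DiastolicBP\n120,80\n135,90\n"}
df_example = read_uploaded_csv(uploaded_example)
print(df_example.shape)
```

In Colab the same helper could be called directly on the `uploaded` variable from the previous step.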

Understanding the dataset :
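A first look at a dataset usually covers its size, column types and missing values. The sketch below shows one possible way to collect these with pandas; the helper `summarize` and the toy frame are hypothetical and only illustrate the idea.

```python
import pandas as pd


def summarize(df):
    """Collect a few basic facts about a DataFrame."""
    return {
        "shape": df.shape,                                  # (rows, columns)
        "dtypes": df.dtypes.astype(str).to_dict(),          # column name -> dtype name
        "missing_per_column": df.isnull().sum().to_dict(),  # column name -> count of nulls
    }


# Toy frame with made-up values, standing in for the real dataset.
toy = pd.DataFrame({"BMI": [20.4, None, 38.4], "GDM_Diagnoised": ["No", "Yes", "Yes"]})
summary = summarize(toy)
print(summary["shape"])
print(summary["missing_per_column"])
```

On the real data, `df.head()`, `df.info()` and `df.describe()` give a similar overview interactively.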

Transforming all categorical columns into numerical columns :

In the same way, a label encoder can be fitted to the 'Vit D Deficiency' column; alternatively, a one-hot encoder can be used.
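As a rough illustration of both options, the sketch below encodes a toy 'Vit D Deficiency' column with scikit-learn's `LabelEncoder` and, as an alternative, with `pd.get_dummies`. The 'Yes'/'No' values and the `VitD` prefix are assumptions about how the column might look, not taken from the dataset.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy column with assumed 'Yes'/'No' categories.
raw = pd.DataFrame({"Vit D Deficiency": ["No", "Yes", "No", "Yes"]})

# Option 1: label encoding maps each category to an integer (classes are sorted alphabetically).
le = LabelEncoder()
encoded = raw.copy()
encoded["Vit D Deficiency"] = le.fit_transform(raw["Vit D Deficiency"])
print(list(le.classes_))
print(encoded["Vit D Deficiency"].tolist())

# Option 2: one-hot encoding creates one indicator column per category.
one_hot = pd.get_dummies(raw["Vit D Deficiency"], prefix="VitD")
print(list(one_hot.columns))
```

For a two-category column the two encodings carry the same information; one-hot encoding matters more when a column has three or more unordered categories.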

The biomarkers from visit 1 (SystolicBP, DiastolicBP, Weight, BMI, Age>30, Vit D Deficiency) are taken as X.

GDM_Diagnoised is taken as y.
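Selecting the features and the target could look like the sketch below. The column names are copied from the text and may differ from the exact headers in the real file; the toy rows, including the target values, are made up for illustration.

```python
import pandas as pd

# Column names as written in the text; the real file's headers may differ (assumption).
FEATURES = ["SystolicBP", "DiastolicBP", "Weight", "BMI", "Age>30", "Vit D Deficiency"]
TARGET = "GDM_Diagnoised"

# Toy rows with made-up values, standing in for the encoded dataset.
toy = pd.DataFrame(
    [
        [165.0, 112.0, 60.6, 20.4, 0, 0, 0],
        [138.0, 63.0, 94.5, 38.4, 1, 0, 1],
    ],
    columns=FEATURES + [TARGET],
)

X = toy[FEATURES]  # feature matrix: one row per patient, one column per biomarker
y = toy[TARGET]    # target vector: encoded GDM diagnosis
print(X.shape, y.shape)
```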

Null or missing values cannot be present when training the model, so they have to be imputed or removed first.
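One way to handle missing values is imputation. The sketch below uses `KNNImputer`, which is among the imports at the top, on a toy array; whether this imputer and `n_neighbors=2` match what was used on the real data is an assumption.

```python
import numpy as np
from sklearn.impute import KNNImputer

# Toy feature matrix with made-up values and two missing entries.
X_toy = np.array(
    [
        [120.0, 80.0],
        [np.nan, 85.0],
        [140.0, np.nan],
        [130.0, 90.0],
    ]
)

# Each missing entry is replaced by the mean of that feature over the nearest neighbours.
imputer = KNNImputer(n_neighbors=2)
X_filled = imputer.fit_transform(X_toy)
print(np.isnan(X_filled).any())
```

`SimpleImputer` (mean or median per column) is a simpler alternative when neighbour-based imputation is not needed.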

In the given dataset the numbers of patients with and without GDM are not balanced. To balance the classes, either undersampling or oversampling can be used. Here undersampling is implemented.

!pip install imbalanced-learn

from collections import Counter
from imblearn.under_sampling import RandomUnderSampler

# Randomly drop majority-class rows until both classes have the same count
rus = RandomUnderSampler(random_state=0)
X_resampled, y_resampled = rus.fit_resample(X, y)

# Class counts after resampling, and the new number of rows
print(sorted(Counter(y_resampled).items()), y_resampled.shape)

df2 = pd.DataFrame(X_resampled)
df2.head()

The undersampled data is used for training and testing the model. Logistic regression implemented on this data reaches an accuracy of around 67%.
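The training code is not shown, so the following is only a sketch of how a logistic regression model named `logmodel` could be trained and scored. It runs on synthetic data from `make_classification` with six features, not on the GDM dataset, so its accuracy says nothing about the 67% figure. The 80/20 split, `stratify` and `max_iter=1000` are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the undersampled data: 200 rows, 6 features, 2 classes.
X_demo, y_demo = make_classification(
    n_samples=200, n_features=6, n_informative=4, n_redundant=0, random_state=0
)

# Hold out 20% of the rows for testing, keeping the class ratio the same in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=0
)

logmodel = LogisticRegression(max_iter=1000)
logmodel.fit(X_train, y_train)

accuracy = accuracy_score(y_test, logmodel.predict(X_test))
print(f"accuracy: {accuracy:.2f}")
```

With the real data, `X_demo` and `y_demo` would be replaced by `X_resampled` and `y_resampled`.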

Given a patient's biomarker data as input, the model predicts whether the patient has GDM: the GDM_Diagnoised column is 'Yes' if the patient has GDM and 'No' if the patient does not.

If a patient's risk of gestational diabetes is known in advance, the necessary precautions can be taken early.

# Order: SystolicBP, DiastolicBP, Weight, BMI, Age>30, Vit D Deficiency
input_data = (165.0, 112.0, 60.6, 20.407797, 0, 0)

def gdm_diagnosis(input_data):
    # Convert the input tuple to a NumPy array
    input_data_as_numpy_array = np.asarray(input_data)
    # Reshape to (1, n_features) because we are predicting for a single instance
    input_data_reshaped = input_data_as_numpy_array.reshape(1, -1)
    prediction = logmodel.predict(input_data_reshaped)
    print(prediction)

gdm_diagnosis(input_data)

# A second patient, passed to the same gdm_diagnosis function defined above
input_data = (138.0, 63.0, 94.5, 38.387155, 1.0, 0.0)

gdm_diagnosis(input_data)

