3. Results and discussions
The genetic attributes are investigated by modeling a sensor network for each gene, which is tested on 40 gene databases (25 cancerous or hydrophilic and 15 non-cancerous or hydrophobic) (Table 2 and Table 3). The gene databases are downloaded from the public domain (http://www.ncbi.nlm.nih.gov; http://cgap.nci.nih.gov; http://www.genecards.org). The electrical responses of the sensor are simulated in the MATLAB (version R2009b) environment.
3.1. Behavior analysis of sensor network using Bode plot
The sensor networks representing genes are analyzed in the frequency domain by observing their spectra. Cancerous and non-cancerous genes are differentiated by investigating the correlation between the gene features and their simulated system behavior.
Fig. 2. Gene sensor responses in phase for cancer and non-cancer genes. At higher frequencies, the phase response of the cancer gene is negative, whereas that of the non-cancer gene is positive. A. NUP214 vs. LOC107815086 gene phase plots. B. spr1652 vs. SHBG gene phase plots.
The sensor behavior is studied using Bode magnitude and phase values in the frequency range of 1 Hz to 1 MHz, as detailed in Table 4. No marked differences are observed in the amplitude values; hence only the phase values of the electrical system models representing genes are considered. The plots in Fig. 2 are obtained by cascading the amino acid circuit models, where each amino acid has a constant Rb of 7 Ω for the backbone circuit and, for the side chain, an LSC or CSC whose value depends on the hydropathy index of the hydrophobic or hydrophilic amino acid. Fig. 2 exhibits significant differences in phase response between cancerous and non-cancerous genes, and the phase values are markedly distinguished from each other within the frequency range of 50 kHz to 1 MHz.
Fig. 3. Confusion matrix of binary classifier for gene classification. Genes are classified based on their hydrophilicity and hydrophobicity features.
Table 5. Performance evaluation metrics for genes at different frequencies.
Frequency (Hz) Accuracy MCC TP rate TN rate Precision (P) Precision (N)
The simulated results (Fig. 2) for cancerous genes show negative phase at higher frequencies because they are modeled by cascaded RC parallel circuits, which indicates that these genes contain a large amount of hydrophilic or polar amino acids. Non-cancerous genes, in contrast, exhibit positive phase at higher frequencies because they are realized by RL parallel circuits, which indicates that they are made up of a large amount of hydrophobic amino acids. Therefore, the cancerous and non-cancerous genes exhibit polar and nonpolar characteristics, respectively, which are clearly observed in the corresponding simulated phase responses, and the sensor realization matches the biological features (Stranzl et al., 2012) of genes.
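The sign argument above can be checked numerically with a minimal sketch. It is not the MATLAB model used in this work: it assumes that each amino acid section is the backbone resistance Rb = 7 Ω in parallel with a single side-chain capacitor or inductor, and that sections are cascaded in series. The component values (1 µF, 1 µH), the section count and all function names are hypothetical and chosen only for illustration.

```python
import cmath
import math

R_B = 7.0  # backbone resistance in ohms (value stated in the text)


def parallel(z1, z2):
    """Impedance of two elements connected in parallel."""
    return z1 * z2 / (z1 + z2)


def section_impedance(kind, value, freq_hz):
    """One amino acid section: R_B in parallel with a side-chain C or L.

    kind  -- "C" (hydrophilic side chain) or "L" (hydrophobic side chain)
    value -- capacitance in farads or inductance in henries (hypothetical)
    """
    omega = 2 * math.pi * freq_hz
    if kind == "C":
        z_side = 1 / (1j * omega * value)
    elif kind == "L":
        z_side = 1j * omega * value
    else:
        raise ValueError("kind must be 'C' or 'L'")
    return parallel(R_B, z_side)


def cascade_phase_deg(sections, freq_hz):
    """Phase (degrees) of a series cascade of sections (assumed topology)."""
    z_total = sum(section_impedance(k, v, freq_hz) for k, v in sections)
    return math.degrees(cmath.phase(z_total))


# Hypothetical limiting cases: an all-hydrophilic and an all-hydrophobic chain.
rc_chain = [("C", 1e-6)] * 10
rl_chain = [("L", 1e-6)] * 10

for f in (1.0, 1e3, 5e4, 1e6):
    print(f, cascade_phase_deg(rc_chain, f), cascade_phase_deg(rl_chain, f))
```

Under these assumptions every parallel RC section has a phase between -90° and 0°, and every parallel RL section a phase between 0° and 90°, so a chain dominated by one type keeps the corresponding sign. A real gene mixes both section types, and the sign of the total phase then depends on which contribution dominates at a given frequency.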
3.2. Performance evaluation of sensor characteristics
The gene datasets, collected from the national healthcare website, are classified using the modeled sensor. The sensor performance is judged by the receiver operating characteristic (ROC) curve and analyzed using the following metrics:
• Accuracy is the ratio of the number of correctly classified genes to the total number of genes.
• True positive rate or sensitivity (TPR) is the ratio of the number of correctly classified genes from the positive class, i.e. cancer (TP), to the number of all genes from the positive class (TP + FN).
• True negative rate or specificity (TNR) is the ratio of the number of
Fig. 4. ROC curve for cancer and non-cancer gene.
Fig. 5. Nyquist plots for cancer and non-cancer genes. Non-cancer genes exhibit a larger Nyquist curve because they are larger in size, whereas cancer genes have a smaller curve because of their smaller size.