The objective of the present study is to develop a piecewise constant hazard model using an Artificial Neural Network (ANN) to capture the complex shapes of hazard functions, which cannot be achieved with conventional survival analysis models such as the Cox proportional hazards model. We propose a more convenient alternative to the PEANN created by Fornili *et al*. for handling large amounts of data. In particular, it provides much better prediction accuracy than both Poisson regression and generalized estimating equations. This has been demonstrated with lung cancer patient data taken from the Surveillance, Epidemiology and End Results (SEER) program. The quality of the proposed model is evaluated using several error measurement criteria.

Precise prediction of survival and hazard has been a challenging task throughout the past years. Researchers have often used parametric methods for this purpose. However, these methods impose certain distributional assumptions on the hazard functions [

Artificial neural networks (ANNs) have been extremely popular in almost every field, including computer science, engineering and biomedicine, among others. Their strength is that they make predictions based on both individual variables and possibly complex interactions among them. In addition, ANNs are capable of handling nonlinear functions and non-additive effects. Moreover, they are free of any statistical assumptions. Thus, ANN-based survival analysis models serve as efficient alternatives to the conventional survival analysis models, with enhanced predictive power. One of the earliest works in survival analysis with ANNs was introduced by Faraggi and Simon [

In the present study, we have modified the PEANN model by combining it with another ANN model introduced by Mani et al. [

This paper is structured as follows. In Section 2, we introduce the new ANN system and the related theory, along with the other models that we used for comparison. Following that, we present our results. The final section discusses the implications and limitations.

Let T be the survival or the follow-up time for subjects ^{th} risk,

Then the corresponding survival function and the probability density function can be obtained by Equation (2) and Equation (3) as given below.

and

where

where the hazard of the r^{th} risk is assumed to be constant in the j^{th} time period

where

where

The kernel given in Equation (5) corresponds to the likelihood of the Poisson random variable

It has been shown that,

An alternative method is to group the exposure times and the similar

In this section, we introduce an efficient method of modeling the hazard function with artificial neural networks. ANNs allow flexible modeling of the hazard function without any distributional assumptions. Moreover, they capture the nonlinear effects of the risk factors.

Preceding our model, Fornili et al. [

Prior to using the proposed ANN model, the data need to be preprocessed. This process can be explained using a simple example. Consider three subjects, A, B and C, who have been observed for J years. Suppose we have information about their risk factors

In order to use the new ANN model, this information needs to be preprocessed as given in

Subject | Risk factor 1 | Risk factor 2 | Survival Time | Risk Type | Censor
---|---|---|---|---|---
A | 1 | 0 | 3 | 1 | 0
B | 1 | 1 | 4 | 2 | 0
C | 1 | 1 | 2 |  | 1

Subject | Risk factor 1 | Risk factor 2 | Risk 1 | Risk 2 | h(1) | h(2) | h(3) | h(4) | h(5) | ... | h(J)
---|---|---|---|---|---|---|---|---|---|---|---
A | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | ... | 1
B | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | ... | 1
C | 1 | 1 | 1 | 0 | 0 | 0 | 0.31 | 0.24 | 0.12 | ... | 0.01
C | 1 | 1 | 0 | 1 | 0 | 0 | 0.42 | 0.57 | 0.45 | ... | 0.63
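The encoding illustrated by the example subjects can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `target_rows`, its arguments, and the `empirical_hazard` lookup (a stand-in for the ratio-based hazard values assigned after censoring) are hypothetical, and the numbers passed in are copied from the example rows for subject C.

```python
def target_rows(covariates, time, risk, censored, n_periods, empirical_hazard):
    """Build (inputs, targets) training rows for one subject.

    covariates       : list of risk-factor values
    time             : observed survival or censoring time, in whole periods
    risk             : 1-based index of the cause of death (ignored if censored)
    censored         : True if the subject was censored at `time`
    n_periods        : number of periods J
    empirical_hazard : empirical_hazard[r][j] is an assumed, precomputed hazard
                       estimate for risk r+1 in period j+1 (hypothetical input)
    """
    n_risks = len(empirical_hazard)
    # A subject who died contributes one row (the observed risk);
    # a censored subject contributes one row per competing risk.
    risks = range(1, n_risks + 1) if censored else [risk]
    rows = []
    for r in risks:
        indicator = [1 if k == r else 0 for k in range(1, n_risks + 1)]
        targets = []
        for j in range(1, n_periods + 1):
            if censored:
                # 0 while known to be alive, empirical hazard afterwards
                targets.append(0.0 if j <= time else empirical_hazard[r - 1][j - 1])
            else:
                # 0 before the event period, 1 from the event period onward
                targets.append(0.0 if j < time else 1.0)
        rows.append((list(covariates) + indicator, targets))
    return rows


# Periods 1 and 2 hold unused placeholder zeros; the rest come from subject C's rows.
hazard = [[0.0, 0.0, 0.31, 0.24, 0.12],
          [0.0, 0.0, 0.42, 0.57, 0.45]]
rows_a = target_rows([1, 0], time=3, risk=1, censored=False,
                     n_periods=5, empirical_hazard=hazard)
rows_c = target_rows([1, 1], time=2, risk=None, censored=True,
                     n_periods=5, empirical_hazard=hazard)
```

With these inputs, subject A yields a single row and the censored subject C yields two rows, one per competing risk, mirroring the layout of the example.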

of the data. In this example, we have four inputs, the covariates

where^{th} time interval and, ^{th} time interval due

to the r^{th} risk. The ratio, ^{th}

risk for the period j. In summary, if the subject is alive, then

In developing the proposed ANN model, we used the hyperbolic tangent and the exponential activation functions in the hidden and the output layers. The proposed ANN structure is represented in
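As a rough illustration of the two activation functions named above, the following Python sketch computes a forward pass through a single hidden layer with hyperbolic tangent units and an output layer with exponential units. The function name `forward`, the layer sizes and all weight values are hypothetical, and treating each output as the hazard of one time period is an assumption made here for illustration; the sketch does not reproduce the paper's equations.

```python
import math


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_biases):
    """Forward pass: tanh hidden layer followed by an exponential output layer."""
    # Hidden layer: hyperbolic tangent of each weighted sum of the inputs.
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    # Output layer: the exponential activation keeps every output positive,
    # as required of a hazard rate.
    return [
        math.exp(sum(w * h for w, h in zip(row, hidden)) + b)
        for row, b in zip(output_weights, output_biases)
    ]


# Hypothetical toy network: 4 inputs, 2 hidden nodes, 3 outputs.
inputs = [1, 0, 1, 0]
hidden_weights = [[0.2, -0.1, 0.4, 0.0], [-0.3, 0.5, 0.1, 0.2]]
hidden_biases = [0.0, 0.1]
output_weights = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.6]]
output_biases = [-1.0, -1.5, -2.0]
hazards = forward(inputs, hidden_weights, hidden_biases,
                  output_weights, output_biases)
```

Because the output activation is exponential, the predicted values are strictly positive whatever the weights are, which is the property a hazard model needs.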

where

During the training, we minimized the regularized canonical error function given by Equation (8), where

We used a k-fold cross validation technique to find the optimal number of hidden nodes in each network. When using a k-fold cross validation technique, the training dataset is divided into k folds, where

The data for our study is selected from the Surveillance, Epidemiology and End Results (SEER) program [

In our analysis, four risk factors were used: age at diagnosis, tumor size, histology and the stage of the cancer. As can be seen from

 | Male | Female
---|---|---
Cause of Death | |
Lung | 13,029 (64%) | 10,303 (58%)
Other | 2724 (13%) | 1928 (11%)
Censored | 4767 (23%) | 5511 (31%)
Age at Diagnosis | |
45 - 49 years | 635 (3%) | 705 (4%)
50 - 54 years | 1320 (6%) | 1161 (7%)
55 - 59 years | 2206 (11%) | 1747 (10%)
60 - 64 years | 3208 (16%) | 2515 (14%)
65 - 69 years | 3757 (18%) | 3127 (18%)
70 - 74 years | 3723 (18%) | 3086 (17%)
75 - 79 years | 3187 (16%) | 2837 (16%)
80 - 84 years | 1793 (9%) | 1826 (10%)
85+ years | 691 (3%) | 738 (4%)
Stage of the Cancer | |
Localized | 5536 (27%) | 5525 (31%)
Regional | 7028 (34%) | 5816 (33%)
Distant | 7956 (39%) | 6401 (36%)
Histology Type | |
Adeno | 9162 (45%) | 10,056 (57%)
Squamous | 8492 (41%) | 5054 (28%)
Large Cell | 917 (4%) | 691 (4%)
Small-cell | 1949 (10%) | 1941 (11%)
Total | 20,520 | 17,742

We found the survival times of males and females to be significantly different from each other, which was already a known fact [

For both males and females, we created a training data set of 70% and a testing data set of 30%. The training set was used to train the models while the testing dataset was used to evaluate the prediction accuracies of the proposed models.
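A 70%/30% split of this kind can be sketched as below. The source does not say how subjects were assigned to the two sets, so the random shuffle, the fixed seed and the function name `train_test_split` are assumptions made purely for illustration.

```python
import random


def train_test_split(records, train_fraction=0.7, seed=0):
    """Shuffle the records and split them into training and testing sets."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(records)       # copy, leaving the caller's list untouched
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical subject identifiers; the split is done separately per gender.
male_ids = list(range(100))
train_ids, test_ids = train_test_split(male_ids)
```

Applying the function once to the male records and once to the female records keeps the two genders in separate training and testing sets, as described above.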

We started our analysis with Poisson regression models. However, according to the deviance and the Pearson chi-square statistics, none of those models were adequate [

Following that, we proceeded to build ANN models, creating both PEANN and our proposed ANN models. As mentioned earlier, we considered five weight decay values (0.01, 0.025, 0.05, 0.075 and 0.1), and 10-fold cross validation was used to find the optimal number of hidden nodes in each case. The optimal network was selected based on the minimum average validation error. Using each optimal network, we predicted the hazard and the corresponding survival probabilities for the testing data. To evaluate the prediction accuracies of ANN and GEE, we compared the actual survival times of the non-censored subjects with their predicted median survival times. For a better comparison, several prediction errors were considered for both males and females: the root mean square error (RMSE), the square root of the average squared difference between the actual and the predicted values; the mean absolute error (MAE), the average of the absolute errors; the mean percentage error (MPE), the average of the percentage errors; and the relative squared error (RSE), the total squared error normalized by the total squared error of the simple predictor, as given in

As can be seen from

Male | GEE | NewANN (decay 0.01) | PEANN (decay 0.01) | NewANN (decay 0.025) | PEANN (decay 0.025) | NewANN (decay 0.05) | PEANN (decay 0.05) | NewANN (decay 0.075) | PEANN (decay 0.075) | NewANN (decay 0.1) | PEANN (decay 0.1)
---|---|---|---|---|---|---|---|---|---|---|---
RMSE | 4.0986 | 2.3253 | 3.5967 | 2.2416 | 3.6277 | 2.2693 | 3.5190 | 2.2144 | 3.7136 | 2.3444 | 3.5561
MAE | 3.5155 | 1.69 | 2.8767 | 1.6226 | 2.9106 | 1.6412 | 2.7791 | 1.6174 | 3.0070 | 1.7292 | 2.8226
RSE | 8.4539 | 2.721 | 6.4604 | 2.5287 | 6.5724 | 2.5916 | 6.1844 | 2.4676 | 6.8873 | 2.7659 | 6.3154
MPE | −2.5349 | −0.6645 | −1.6856 | −0.6137 | −1.7456 | −0.6125 | −1.5952 | −0.5819 | −1.8934 | −0.7077 | −1.6561
Data Count | 4659 | 4659 | 4659 | 4659 | 4659 | 4659 | 4659 | 4659 | 4659 | 4659 | 4659

Female | GEE | NewANN (decay 0.01) | PEANN (decay 0.01) | NewANN (decay 0.025) | PEANN (decay 0.025) | NewANN (decay 0.05) | PEANN (decay 0.05) | NewANN (decay 0.075) | PEANN (decay 0.075) | NewANN (decay 0.1) | PEANN (decay 0.1)
---|---|---|---|---|---|---|---|---|---|---|---
RMSE | 4.3146 | 2.5209 | 3.9056 | 2.5232 | 3.9682 | 2.4737 | 3.8961 | 2.4969 | 3.9611 | 2.4871 | 3.9256
MAE | 3.8683 | 1.8927 | 3.2999 | 1.8896 | 3.3834 | 1.8529 | 3.2912 | 1.8700 | 3.3714 | 1.8655 | 3.3240
RSE | 8.6342 | 2.9475 | 7.0751 | 2.9529 | 7.3036 | 2.8383 | 7.0407 | 2.8916 | 7.2776 | 2.869 | 7.1476
MPE | −2.9081 | −0.8038 | −1.9276 | −0.811 | −2.0216 | −0.7844 | −1.9000 | −0.7757 | −2.0242 | −0.7689 | −1.9494
Data Count | 3568 | 3568 | 3568 | 3568 | 3568 | 3568 | 3568 | 3568 | 3568 | 3568 | 3568

than both GEE and PEANN with respect to RMSE and MAE for both genders. In addition, the RSEs of the new ANNs are smaller than those of the other two types of models. Although the predictions of the new ANNs have negative biases, which indicate underestimation of survival, these biases are considerably smaller in magnitude than those of the other two models. In particular, we found the smallest error values for the new ANN models with weight decay 0.05 for females and 0.075 for males. Further analysis of the hazard rates was carried out using those two models.
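The four error measures compared above can be computed as in the following Python sketch. It is an illustration under assumptions rather than the authors' code: the function name `prediction_errors` is hypothetical, the simple predictor in the RSE is taken to be the mean of the actual survival times, and the MPE uses the sign convention (actual − predicted) / actual expressed as a fraction, which may differ from the convention used for the reported values.

```python
import math


def prediction_errors(actual, predicted):
    """Return RMSE, MAE, MPE and RSE for paired actual/predicted survival times."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    squared_error = sum(e * e for e in errors)

    rmse = math.sqrt(squared_error / n)        # root of the mean squared error
    mae = sum(abs(e) for e in errors) / n      # mean of the absolute errors
    # Assumed sign convention: (actual - predicted) / actual, as a fraction.
    mpe = sum((a - p) / a for a, p in zip(actual, predicted)) / n
    # Squared error relative to always predicting the mean actual value
    # (assumed meaning of "simple predictor").
    mean_actual = sum(actual) / n
    rse = squared_error / sum((a - mean_actual) ** 2 for a in actual)

    return {"RMSE": rmse, "MAE": mae, "MPE": mpe, "RSE": rse}


# Hypothetical survival times (in years) for three non-censored subjects.
result = prediction_errors(actual=[1, 2, 4], predicted=[2, 2, 2])
```

An RSE above 1 means the model's total squared error exceeds that of the simple predictor, so the measure gives a scale-free way of comparing models fitted to the same test data.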

small cell carcinoma is higher in females than in males for all stages as confirmed by the [

We have introduced a new neural network architecture to model the piecewise constant hazard model. This provides a more convenient approach to handling a large number of observations over longer periods. In particular, our ANN model captures the complex shapes of the hazard functions in the presence of competing risks. Moreover, these ANN models can handle nonlinear and non-additive effects among the risk factors. The new method overcomes several limitations associated with the traditional piecewise constant hazard model. Our ANN model is capable of modeling the hazard functions even with a large amount of data, where the equivalent Poisson regression formulation of the piecewise constant hazard model fails. Importantly, the prediction accuracy of the survival times given by the proposed ANN model is higher than that of both the generalized estimating equation models and the PEANN model. However, the PEANN model is more suitable than our ANN model when there are time-dependent risk factors, as it is specially designed to deal with that kind of data. Our findings confirm that older patients have a relatively higher hazard than younger patients. Their hazard is usually greatest within the first two years after diagnosis, while for younger patients it tends to vary. However, we advise further analysis before making any clinical decisions.

In developing the proposed ANN model, we use the nonlinear Poisson regression model used in [

Rodrigo, H. and Tsokos, C.P. (2017) Artificial Neural Network Model for Predicting Lung Cancer Survival. Journal of Data Analysis and Information Processing, 5, 33-47. https://doi.org/10.4236/jdaip.2017.51003