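Abstract: Pulsar candidate selection is an important step in pulsar searches. To improve the accuracy of pulsar candidate selection, a candidate selection method based on self-normalizing neural networks is proposed. The method combines three techniques to improve candidate screening: self-normalizing neural networks, a genetic algorithm, and the synthetic minority over-sampling technique. The self-normalizing property of self-normalizing neural networks overcomes the vanishing and exploding gradient problems in training deep neural networks and greatly speeds up training. To eliminate redundancy in the sample data, a genetic algorithm is used to select among the features of the pulsar candidate samples, yielding an optimal feature subset. To address the severe class imbalance caused by the very small number of real pulsar samples in the data, the synthetic minority over-sampling technique is used to generate pulsar candidate samples, lowering the class imbalance ratio. With classification accuracy as the evaluation metric, experimental results on three pulsar candidate datasets show that the proposed method effectively improves the performance of pulsar candidate selection.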
Pulsar candidate selection based on self-normalizing neural networks
Received Date: 17 October 2019
Abstract: Pulsar candidate selection is an important step in the pulsar search task. Traditional candidate selection depends heavily on human inspection, which is subjective, time-consuming, and error-prone. A single pulsar survey with a modern radio telescope can produce millions of candidates, so manual selection becomes extremely difficult and inefficient. Research in recent years has therefore focused on machine learning. To improve the efficiency of pulsar candidate selection, we propose a candidate selection method based on self-normalizing neural networks. The method uses three techniques: self-normalizing neural networks, a genetic algorithm, and the synthetic minority over-sampling technique. Self-normalizing neural networks improve identification accuracy by applying deep neural networks to pulsar candidate selection. At the same time, their self-normalizing property overcomes the vanishing and exploding gradient problems in training deep neural networks, which greatly accelerates training. In addition, to eliminate redundancy in the sample data, we use a genetic algorithm to select features of pulsar candidates. Genetic-algorithm feature selection can be summarized in three steps: initializing the population, assessing population fitness, and generating new populations. Decoding the individual with the largest fitness value in the last generation yields the best subset of features. Because of radio frequency interference and noise, the candidates contain a large number of non-pulsar signals and only a few real pulsar signals. To address this severe class imbalance, we use the synthetic minority over-sampling technique to increase the number of pulsar candidates (the minority class) and reduce the degree of imbalance in the data.
By using k-nearest neighbors and linear interpolation to insert a new sample between two nearby minority-class samples according to certain rules, we can prevent the classifier from becoming biased towards the abundant non-pulsar class (the majority class). Experimental results on three pulsar candidate datasets show that, in deep architectures, the self-normalizing neural network achieves higher accuracy and faster convergence than the traditional artificial neural network, and that the genetic algorithm and the synthetic minority over-sampling technique effectively improve the selection performance for pulsar candidates.
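The three genetic-algorithm steps named in the abstract (initializing the population, assessing population fitness, generating new populations) can be illustrated with a minimal sketch. The abstract does not specify the operators or the fitness function, so everything concrete here is an assumption made for illustration: the function name `ga_select`, the binary-mask encoding, roulette-wheel selection, single-point crossover, bit-flip mutation, elitism, and all default parameter values. The `fitness` callable is hypothetical; in practice it might return a classifier's validation accuracy on the masked features.

```python
import numpy as np

def ga_select(fitness, n_features, pop_size=20, generations=30,
              crossover_rate=0.8, mutation_rate=0.02, seed=0):
    """Return indices of the features kept by the fittest individual.

    `fitness` is a user-supplied (hypothetical) callable that maps a
    0/1 mask of length n_features (n_features >= 2) to a score.
    """
    rng = np.random.default_rng(seed)
    # Step 1: initialize the population as random 0/1 feature masks.
    pop = rng.integers(0, 2, size=(pop_size, n_features))
    for _ in range(generations):
        # Step 2: assess the fitness of every individual.
        scores = np.array([fitness(ind) for ind in pop], dtype=float)
        elite = pop[scores.argmax()].copy()
        # Step 3: generate a new population.
        # Roulette-wheel selection; scores are shifted to be positive.
        weights = scores - scores.min() + 1e-9
        idx = rng.choice(pop_size, size=pop_size, p=weights / weights.sum())
        parents = pop[idx]
        children = parents.copy()
        # Single-point crossover on consecutive pairs of parents.
        for i in range(0, pop_size - 1, 2):
            if rng.random() < crossover_rate:
                cut = rng.integers(1, n_features)
                children[i, cut:] = parents[i + 1, cut:]
                children[i + 1, cut:] = parents[i, cut:]
        # Bit-flip mutation.
        flips = rng.random(children.shape) < mutation_rate
        children[flips] = 1 - children[flips]
        # Elitism: carry the best individual over unchanged.
        children[0] = elite
        pop = children
    # Decode the fittest individual of the last generation.
    scores = np.array([fitness(ind) for ind in pop], dtype=float)
    return np.flatnonzero(pop[scores.argmax()])

# Toy fitness (an assumption): features 0 and 2 help, every other costs 0.5.
def toy_fitness(mask):
    useful = mask[0] + mask[2]
    return useful - 0.5 * (mask.sum() - useful)

selected = ga_select(toy_fitness, n_features=6)
```

Because the elite individual is copied into each new generation, the best fitness in the population never decreases, although the search is stochastic and is not guaranteed to reach the global optimum.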
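The over-sampling step described above, which places a new sample by linear interpolation between a minority-class sample and one of its k nearest minority-class neighbors, can be sketched in NumPy as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the function name `smote`, the Euclidean distance, the uniform random choice of base sample and neighbor, and the default `k=5` are all assumptions.

```python
import numpy as np

def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic samples from minority-class samples."""
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)  # a sample has at most n - 1 neighbors
    # Pairwise Euclidean distances between minority samples.
    diff = minority[:, None, :] - minority[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbor
    neighbors = np.argsort(dist, axis=1)[:, :k]
    # Pick a base sample and one of its k nearest neighbors at random.
    base = rng.integers(0, n, size=n_new)
    picked = neighbors[base, rng.integers(0, k, size=n_new)]
    # Linear interpolation: new = base + gap * (neighbor - base), gap in [0, 1).
    gap = rng.random((n_new, 1))
    return minority[base] + gap * (minority[picked] - minority[base])

# Toy minority class (an assumption): five points in the unit square.
pulsars = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
synthetic = smote(pulsars, n_new=20, k=2)
```

Each synthetic point lies on the line segment between two existing minority samples, so the new samples stay within the region already occupied by the minority class rather than duplicating existing points exactly.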