Chinese Physical Society Journals

Convolutional network single-pixel imaging with fusion attention mechanism

CSTR: 32037.14.aps.74.20250010
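  • We propose a physics-driven convolutional network method for single-pixel imaging that incorporates a fused attention mechanism. A module combining channel and spatial attention is integrated into a randomly initialized convolutional network, and the physical model of single-pixel imaging constrains the network, yielding high-quality image reconstruction. Specifically, the spatial and channel attention mechanisms are merged into a single module and introduced into each layer of a multi-scale U-net convolutional network. This exploits the key weighting information that attention provides over the three-dimensional data cube while also drawing on the strong feature extraction capability of the U-net at different spatial frequencies. The method effectively captures image details, suppresses background noise, and improves reconstruction quality. Experiments show that, for image reconstruction at low sampling rates, the attention-based scheme outperforms conventional non-pretrained networks: it reconstructs image details visibly better and shows significant advantages in quantitative metrics such as peak signal-to-noise ratio and structural similarity, confirming its effectiveness and application prospects in single-pixel imaging.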
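As a rough illustration of what a fused channel-and-spatial attention module can look like, the NumPy sketch below follows the steps named in the abstract (pooling, a two-layer fully connected network for channel weights, a convolution over a pooled map for spatial weights). Everything else is an assumption made for illustration and is not taken from the paper: the reading of channel pooling as one average per channel, the mean-and-max pooling over channels, the 3x3 kernel, the reduction ratio `r`, the channel-then-spatial order, and the random weights.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    """x has shape (C, H, W). Returns x rescaled by one weight per channel."""
    pooled = x.mean(axis=(1, 2))             # global average pool -> (C,)
    hidden = np.maximum(w1 @ pooled, 0.0)    # first FC layer + ReLU -> (C // r,)
    weights = sigmoid(w2 @ hidden)           # second FC layer + sigmoid -> (C,)
    return x * weights[:, None, None]

def spatial_attention(x, kernel):
    """x has shape (C, H, W); kernel has shape (2, k, k) with k odd."""
    # Pool across channels (assumed: mean and max), giving a (2, H, W) map.
    pooled = np.stack([x.mean(axis=0), x.max(axis=0)])
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    _, height, width = x.shape
    attn = np.zeros((height, width))
    for i in range(height):
        for j in range(width):
            # Sliding-window correlation, as in a standard conv layer.
            attn[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return x * sigmoid(attn)[None, :, :]     # one weight per spatial position

def fused_attention(x, w1, w2, kernel):
    """Assumed order: channel attention first, then spatial attention."""
    return spatial_attention(channel_attention(x, w1, w2), kernel)

# Hypothetical sizes and random weights, for demonstration only.
rng = np.random.default_rng(0)
C, H, W, r = 8, 16, 16, 4
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
kernel = rng.standard_normal((2, 3, 3)) * 0.1
y = fused_attention(x, w1, w2, kernel)
print(y.shape)
```

Because both attention maps pass through a sigmoid, the module only rescales features by factors between 0 and 1, so the output keeps the input shape and can be dropped between existing U-net layers.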

     

    This paper presents a novel physics-driven convolutional neural network method for single-pixel imaging that integrates a fused attention mechanism. A module combining channel attention and spatial attention is incorporated into a randomly initialized convolutional network, and the physical model of single-pixel imaging constrains the network to achieve high-quality image reconstruction. Specifically, the spatial and channel attention mechanisms are combined into a single module and introduced into the layers of a multi-scale U-net convolutional network. In the spatial attention mechanism, convolution extracts the attention weights of each spatial region of the pooled feature map. In the channel attention mechanism, the three-dimensional feature map is pooled into a single-channel signal and fed into a two-layer fully connected network to obtain the attention weight of each channel. This approach not only uses the critical weighting information that the attention mechanism provides over the three-dimensional data cube but also exploits the strong feature extraction capability of the U-net across different spatial frequencies. The method effectively captures image details, suppresses background noise, and improves reconstruction quality. In the experiments, a single-pixel imaging optical setup acquires the bucket signals of two target images, "snowflake" and "basket". An arbitrary noisy image is fed into the randomly initialized neural network with the attention mechanism, and the mean square error between the simulated and measured bucket signals physically constrains the convergence of the network. The result is a reconstructed image that is consistent with the physical model.
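The physical constraint described above can be sketched as follows, assuming the usual single-pixel forward model in which each bucket value is the inner product of one illumination pattern with the object. The pattern type (random binary), the image size, the 25% sampling rate, and the direct gradient descent on pixel values are all assumptions for illustration. In particular, the pixel-wise descent is only a stand-in for updating the weights of the attention U-net, which is what the paper actually optimizes.

```python
import numpy as np

def bucket_signal(patterns, image):
    """patterns: (M, H, W), image: (H, W) -> (M,) simulated bucket values."""
    return np.tensordot(patterns, image, axes=([1, 2], [0, 1]))

def physics_loss(patterns, estimate, measured):
    """Mean square error between simulated and measured bucket signals."""
    return np.mean((bucket_signal(patterns, estimate) - measured) ** 2)

# Hypothetical setup: 8x8 object, 16 random binary patterns (25% sampling).
rng = np.random.default_rng(0)
H, W, M = 8, 8, 16
patterns = rng.integers(0, 2, size=(M, H, W)).astype(float)
target = rng.random((H, W))
measured = bucket_signal(patterns, target)   # plays the role of measured data

estimate = rng.random((H, W))                # arbitrary starting image
loss_start = physics_loss(patterns, estimate, measured)

# Step size 1/L, where L is the Lipschitz constant of the loss gradient.
flat = patterns.reshape(M, -1)
step = 1.0 / (2.0 / M * np.linalg.norm(flat, 2) ** 2)

for _ in range(200):
    residual = bucket_signal(patterns, estimate) - measured
    # Gradient of the MSE with respect to the image pixels.
    grad = (2.0 / M) * np.tensordot(residual, patterns, axes=([0], [0]))
    estimate = estimate - step * grad

loss_end = physics_loss(patterns, estimate, measured)
print(loss_start, loss_end)
```

With fewer measurements than pixels, many images reproduce the same bucket signal, so a low loss alone does not guarantee a faithful image. This is the gap that the network structure, and here the attention module, is meant to fill.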
The experimental results demonstrate that, at low sampling rates, the scheme with the attention mechanism not only reconstructs image details visibly better than conventional non-pretrained networks but also shows significant advantages in quantitative evaluation metrics such as peak signal-to-noise ratio (PSNR) and structural similarity (SSIM), confirming its effectiveness and application potential in single-pixel imaging.
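For reference, PSNR follows the standard definition 10·log10(peak² / MSE). The minimal sketch below assumes images normalized to the range [0, 1], so the peak value is 1.0; the paper's actual normalization is not stated here. SSIM is omitted because it involves local windowed statistics that do not fit a short example.

```python
import numpy as np

def psnr(reference, reconstruction, peak=1.0):
    """Peak signal-to-noise ratio in dB; `peak` is the maximum pixel value."""
    mse = np.mean((reference - reconstruction) ** 2)
    if mse == 0:
        return float("inf")   # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

# Worked example: a uniform error of 0.1 gives MSE = 0.01,
# so PSNR = 10 * log10(1 / 0.01), which is approximately 20 dB.
reference = np.full((4, 4), 0.5)
degraded = np.full((4, 4), 0.6)
print(psnr(reference, degraded))
```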

     
