Towards Clinical Deployment: A Deployment-Oriented Lightweight Transformer for Low-Latency Medical Image Segmentation

Haoyue Liu

Authors

Haoyue Liu, School of Computer Science, Beijing University of Information Science and Technology, Beijing 102206, China

Keywords:

Lightweight Transformer, Model compression, Real-time inference, Edge computing

Abstract

This paper investigates the system design of lightweight Transformer architectures for medical image segmentation. Standard Transformers perform well across many domains, but their large parameter counts and high computational complexity become a bottleneck in medical imaging, which often involves high-resolution volumetric data. To address these challenges, we propose three lightweight improvements: (1) a sparse attention mechanism that reduces computation by focusing on relevant regions of the image, (2) a modular design that allows network components to be configured flexibly according to task requirements, and (3) parameter sharing and pruning techniques that eliminate redundant connections while maintaining model accuracy. The proposed system offers significant advantages in clinical applications, particularly real-time surgical navigation and telemedicine. Because it runs efficiently on resource-constrained devices such as portable ultrasound machines and mobile diagnostic platforms, it enables precise medical image analysis with minimal latency. This work provides technical support for the development of precision medicine and inclusive healthcare, and offers potential solutions for resource-limited settings and remote healthcare delivery.
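
The abstract does not specify which sparsity pattern the attention mechanism uses. As a rough illustration of the general idea only, the sketch below restricts each token to a fixed local window of neighbours, which is one common way to make attention sparse. The function name `local_window_attention`, the `window` parameter, and the 1-D token layout are assumptions made for this example and are not taken from the paper.

```python
import math


def local_window_attention(queries, keys, values, window):
    """Scaled dot-product attention restricted to a local window.

    Illustrative assumption: token i attends only to tokens j with
    |i - j| <= window, so the cost grows with n * window rather than n * n.
    Inputs are lists of equal-length float vectors (one per token).
    """
    n = len(queries)
    d = len(queries[0])
    outputs = []
    for i, q in enumerate(queries):
        # Half-open index range [lo, hi) of the tokens this query may attend to.
        lo, hi = max(0, i - window), min(n, i + window + 1)
        # Scaled dot-product scores against the keys inside the window only.
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(d)
            for j in range(lo, hi)
        ]
        # Numerically stable softmax over the windowed scores.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        # Weighted sum of the value vectors inside the window.
        out = [
            sum(e / total * values[lo + t][c] for t, e in enumerate(exps))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


if __name__ == "__main__":
    tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    vals = [[1.0], [2.0], [3.0]]
    # With window=0 each token attends only to itself.
    print(local_window_attention(tokens, tokens, vals, window=0))
```

With `window=0` every token sees only itself, so the output equals the value vectors; widening the window until it covers the whole sequence recovers ordinary dense attention. A content-dependent rule (for example, keeping only the highest-scoring regions) would fit the abstract's description of "relevant regions" equally well, so this choice should be read as a placeholder.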

