SecFormer: Balancing Performance and Efficiency in Privacy-Preserving Inference for Transformer Models

A new framework called SecFormer has been introduced to address the challenge of Privacy-Preserving Inference (PPI) for large language models based on the Transformer architecture. The growing reliance on cloud-hosted large language models raises privacy concerns, especially when sensitive data is involved. Secure Multi-Party Computation (SMPC) has emerged as a way to preserve the privacy of both inference data and model parameters. However, applying SMPC to PPI for Transformer models incurs substantial overhead: nonlinear operations such as GeLU, Softmax, and LayerNorm map poorly onto existing SMPC protocols and dominate the communication and computation costs.
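To make the source of that overhead concrete, here is a minimal sketch of 2-party additive secret sharing, the basic primitive behind many SMPC frameworks. It is an illustration only, not SecFormer's actual protocol: linear operations are computed locally on shares at essentially no cost, while nonlinear operations require interactive subprotocols with multiple communication rounds.

```python
import secrets

# Work in a 64-bit ring, a common choice for additive secret sharing.
MOD = 2 ** 64

def share(x: int) -> tuple[int, int]:
    """Split x into two random-looking shares with x = (s0 + s1) mod MOD."""
    s0 = secrets.randbelow(MOD)
    s1 = (x - s0) % MOD
    return s0, s1

def reconstruct(s0: int, s1: int) -> int:
    """Recombine both shares to recover the secret."""
    return (s0 + s1) % MOD

a0, a1 = share(17)
b0, b1 = share(25)

# Addition is "free": each party adds its own shares locally,
# with no communication between the parties.
c0, c1 = (a0 + b0) % MOD, (a1 + b1) % MOD
assert reconstruct(c0, c1) == 17 + 25

# Nonlinear operations (GeLU, Softmax, LayerNorm) have no such local
# shortcut; they need interactive protocols, which is where PPI for
# Transformers pays its efficiency penalty.
```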

SecFormer takes a different approach to balancing performance and efficiency in PPI. Rather than substituting SMPC-friendly approximations for the model's nonlinear operations, which tends to degrade accuracy, it pairs careful model design with new privacy-preserving algorithms for the operations that dominate the cost: a privacy-preserving GeLU algorithm based on segmented polynomials, and efficient privacy-preserving algorithms for LayerNorm and Softmax. A plaintext sketch of the segmented-polynomial idea follows.
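The sketch below illustrates the segmented-polynomial idea in plaintext: GeLU is nearly 0 for very negative inputs and nearly the identity for large inputs, so only a middle interval needs a polynomial. The breakpoints, polynomial degree, and fitting procedure here are hypothetical choices for illustration, not the constants from the paper; in an actual SMPC protocol, the segment selection would itself be performed obliviously with secure comparisons.

```python
import math
import numpy as np

def gelu(x: float) -> float:
    """Exact GeLU: 0.5 * x * (1 + erf(x / sqrt(2)))."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

# Hypothetical breakpoints: outside them GeLU is ~0 or ~x, so only the
# middle segment needs a polynomial approximation.
LO, HI = -4.0, 3.0

# Fit one degree-6 polynomial (an illustrative choice) to the middle segment.
xs = np.linspace(LO, HI, 2001)
ys = np.array([gelu(x) for x in xs])
coeffs = np.polyfit(xs, ys, 6)

def gelu_segmented(x: float) -> float:
    """Piecewise approximation: constant tail, polynomial middle, identity tail."""
    if x <= LO:
        return 0.0
    if x >= HI:
        return x
    return float(np.polyval(coeffs, x))

# Check the quality of the approximation over a wider range.
grid = np.linspace(-6.0, 5.0, 1101)
max_err = max(abs(gelu_segmented(x) - gelu(x)) for x in grid)
print(f"max abs error: {max_err:.4f}")
```

In the SMPC setting, this style of approximation trades polynomial degree (more secure multiplications) against the number of segments (more secure comparisons), which is what makes segmented polynomials attractive for privacy-preserving GeLU.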

The framework's effectiveness was evaluated on the GLUE benchmark with Transformer models such as BERT-Base and BERT-Large. SecFormer outperformed state-of-the-art approaches, with average performance improvements of 5.6% for BERT-Base and 24.2% for BERT-Large. Compared with existing frameworks based on model design and on SMPC protocol optimization, SecFormer achieved speedups of 3.4x and 3.2x in PPI, respectively, while maintaining comparable performance.

SecFormer presents a scalable and effective solution for running large language models under stringent privacy requirements. By balancing performance and efficiency through coordinated model and protocol design, it addresses the privacy concerns raised by the growing use of cloud-hosted large language models and offers a promising direction for privacy-preserving inference with Transformer models.
