cronokirby

(2026-02) Scaling Sparse Matrix Computation for Secure Outsourced Computing

2026-02-18

Abstract

Sparse General Matrix-Matrix Multiplication (SpGEMM) is a fundamental but computationally intensive operation that underpins many scientific workloads, including numerous AI applications. With the increasing demands for data security, privacy-preserving computation techniques such as Fully Homomorphic Encryption (FHE) have gained significant attention for their ability to process sensitive data without decryption. Nonetheless, executing SpGEMM within the framework of FHE presents significant challenges. The most effective SpGEMM algorithms exploit matrix sparsity to minimize computational costs; however, FHE obscures both the data values and the sparsity structures. Prior FHE-based privacy-preserving computation frameworks either ignore the inherent sparsity of matrices and rely on dense General Matrix-Matrix Multiplication (GEMM), incurring substantial overhead from redundant homomorphic multiplications, or they attempt to exploit sparsity by encrypting only the non-zero values, which inadvertently exposes sensitive positional information. To address this gap and achieve a better balance between efficiency and privacy, we propose an efficient FHE-compatible SpGEMM framework that enables oblivious data and position processing under a Single Instruction Multiple Data (SIMD) scheme. Moreover, we extend our method to support an arbitrary number of sparse matrices (FHE-SpGEMCM). The efficiency analysis shows that our method achieves an average homomorphic computation cost of $(n_A n_B)^2 / (n^2 N)$, where $n_A$ and $n_B$ represent the number of nonzero elements in $A$ and $B$, respectively, $n$ is the shared inner dimension of the multiplication, and $N$ denotes the batch size used in FHE. Experimental results demonstrate that for square matrices of scale $2^9$, our scheme achieves an average speedup of $439.25\times$ and a $10.68\times$ reduction in memory consumption compared to state-of-the-art baselines that ignore sparsity.
Furthermore, when the scale increases to $2^{13}$, our method yields up to a $1201.77\times$ speedup over baselines that only exploit the sparsity of a single matrix.
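To get a feel for where the savings come from, here is a back-of-the-envelope sketch comparing the abstract's quoted average cost, $(n_A n_B)^2 / (n^2 N)$, against a naive dense-GEMM estimate of roughly $n^3 / N$ ciphertext multiplications under SIMD batching. The dense estimate and the example density and batch size are my own assumptions for illustration, not figures from the paper.

```python
def sparse_cost(n_a: int, n_b: int, n: int, batch: int) -> float:
    """Average homomorphic cost (n_A * n_B)^2 / (n^2 * N), as quoted
    in the abstract, with N the FHE batch (slot) size."""
    return (n_a * n_b) ** 2 / (n ** 2 * batch)


def dense_cost(n: int, batch: int) -> float:
    """Naive dense GEMM under SIMD batching: roughly n^3 / N ciphertext
    multiplications (an assumed estimate, used here only for comparison)."""
    return n ** 3 / batch


# Hypothetical example: 2^9 x 2^9 matrices at 1% density, batch size N = 2^13.
n = 2 ** 9
nnz = int(0.01 * n * n)  # nonzeros per matrix, assuming equal density
N = 2 ** 13

print("sparse-aware cost:", sparse_cost(nnz, nnz, n, N))
print("dense GEMM cost:  ", dense_cost(n, N))
```

The ratio of the two estimates is what drives the reported speedups: the sparser the inputs (smaller $n_A$, $n_B$ relative to $n^2$), the fewer homomorphic multiplications the sparsity-aware approach pays for.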