Published at : 28 Jan 2026
Volume : IJtech, Vol 17, No 1 (2026)
DOI : https://doi.org/10.14716/ijtech.v17i1.8195
| Hodaka Nishi | Department of Electrical Engineering and Computer Science, Tokyo University of Agriculture and Technology, 2-24-16 Naka-cho, Koganei-shi, Tokyo 184-8588, Japan |
| Shiro Yano | InfoTech Div., Toyota Motor Corporation, Otemachi Bldg. 6F, 1-6-1 Otemachi, Chiyoda-ku, Tokyo, 100-0004, Japan |
| Megumi Miyashita | Department of Electrical Engineering and Computer Science, Tokyo University of Agriculture and Technology, 2-24-16 Naka-cho, Koganei-shi, Tokyo 184-8588, Japan |
| Shunta Onishi | Department of Electrical Engineering and Computer Science, Tokyo University of Agriculture and Technology, 2-24-16 Naka-cho, Koganei-shi, Tokyo 184-8588, Japan |
| Yuta Goto | Department of Electrical Engineering and Computer Science, Tokyo University of Agriculture and Technology, 2-24-16 Naka-cho, Koganei-shi, Tokyo 184-8588, Japan |
| Toshiyuki Kondo | Department of Electrical Engineering and Computer Science, Tokyo University of Agriculture and Technology, 2-24-16 Naka-cho, Koganei-shi, Tokyo 184-8588, Japan |
Federated learning (FL) has emerged as a key paradigm for privacy-preserving machine learning on decentralized data. However, substantial communication costs often hinder its practical application, especially as deep learning models scale to millions or billions of parameters. This communication bottleneck becomes particularly acute in heterogeneous networks with resource-constrained clients. To address this challenge, this study proposes a novel FL framework that leverages black-box optimization, specifically the zeroth-order (ZO) method, to reduce communication overhead. The proposed method, named ZO-FedSGD, reframes the learning process to eliminate the need for transmitting high-dimensional model parameters. Instead, each communication round exchanges only a constant number of scalar values, including a random seed and function evaluations, making the communication cost independent of the model size. Extensive experiments were conducted to compare ZO-FedSGD with the existing FedAvg algorithm on the MNIST dataset. The evaluation focused on model accuracy and total communication efficiency. Our results reveal a trade-off: ZO-FedSGD required more rounds to converge and achieved a slightly lower final accuracy. However, it demonstrated superior communication efficiency: to reach 90% accuracy, ZO-FedSGD required approximately 10⁴ communicated parameters, compared to 10⁶ for FedAvg, a two-order-of-magnitude reduction. In conclusion, this study validates ZO-FedSGD as a viable and highly efficient alternative for FL in communication-constrained scenarios. It offers a new direction for designing scalable FL systems and a promising approach to the statistical heterogeneity problem.
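The seed-based exchange summarized above can be illustrated with a small sketch. The following Python snippet is a minimal, assumed reconstruction of the idea rather than the paper's implementation: the server broadcasts a single random seed, each client regenerates the same perturbation direction u from it, evaluates its local loss at two perturbed points, and uploads only two scalars; the server rebuilds u from the seed and applies the two-point estimate ĝ = (f(θ+μu) − f(θ−μu)) / (2μ) · u. The function names, the shared-direction-per-round design, and the values of μ and the step size are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of one seed-based zeroth-order FL round (illustrative
# names; mu, lr, and the shared-direction design are assumptions, not
# the paper's exact protocol).

def two_point_eval(loss_fn, theta, seed, mu=1e-3):
    """Client side: return the two scalar loss evaluations to upload.

    The perturbation direction u is regenerated locally from `seed`, so
    only the seed (downlink) and two scalars (uplink) cross the network;
    the cost is independent of len(theta).
    """
    u = np.random.default_rng(seed).standard_normal(theta.shape)
    return loss_fn(theta + mu * u), loss_fn(theta - mu * u)

def server_step(theta, client_evals, seed, mu=1e-3, lr=0.05):
    """Server side: rebuild u from the same seed and take one ZO-SGD step.

    Per-client two-point estimate: (f(theta+mu*u) - f(theta-mu*u))
    / (2*mu) * u; the scalar coefficients are averaged across clients.
    """
    u = np.random.default_rng(seed).standard_normal(theta.shape)
    scalars = [(fp - fm) / (2 * mu) for fp, fm in client_evals]
    return theta - lr * np.mean(scalars) * u

# Toy run: three clients with heterogeneous quadratic losses.
optima = [np.array([1.0, -2.0]), np.array([0.5, 0.0]), np.array([2.0, 1.0])]
theta = np.zeros(2)
for rnd in range(300):
    seed = rnd  # a single integer broadcast to all clients
    evals = [two_point_eval(lambda th, c=c: float(np.sum((th - c) ** 2)),
                            theta, seed) for c in optima]
    theta = server_step(theta, evals, seed)
print(theta)  # approaches the mean of the optima, about [1.17, -0.33]
```

With one seed down and two floats up per client per round, the traffic stays constant in the model dimension, which is the property the abstract quantifies as roughly 10⁴ versus 10⁶ communicated parameters.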
Keywords : Black-box optimization; Federated learning; Two-point estimation
Chen, X., Liu, S., Xu, K., Li, X., Lin, X., Hong, M., & Cox, D. (2019). ZO-AdaMM: Zeroth-order adaptive momentum method for black-box optimization. Advances in Neural Information Processing Systems, 32. https://doi.org/10.48550/arXiv.1910.06513
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In J. Burstein, C. Doran, & T. Solorio (Eds.), Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (pp. 4171–4186). Association for Computational Linguistics. https://doi.org/10.18653/v1/N19-1423
Dritsas, E., & Trigka, M. (2025). Federated learning for IoT: A survey of techniques, challenges, and applications. Journal of Sensor and Actuator Networks, 14(1), 9. https://doi.org/10.3390/jsan14010009
Feurer, M., & Hutter, F. (2019). Hyperparameter optimization. In F. Hutter, L. Kotthoff, & J. Vanschoren (Eds.), Automated machine learning: Methods, systems, challenges (pp. 3–33). Springer International Publishing. https://doi.org/10.1007/978-3-030-05318-5_1
Ghadimi, S., & Lan, G. (2013). Stochastic first- and zeroth-order methods for nonconvex stochastic programming. arXiv preprint. https://doi.org/10.48550/arXiv.1309.5549
Golovin, D., Karro, J., Kochanski, G., Lee, C., Song, X., & Zhang, Q. (2019). Gradientless descent: High-dimensional zeroth-order optimization. arXiv preprint. https://doi.org/10.48550/arXiv.1911.06317
Golovin, D., Solnik, B., Moitra, S., Kochanski, G., Karro, J., & Sculley, D. (2017). Google Vizier: A service for black-box optimization. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1487–1495. https://doi.org/10.1145/3097983.3098043
Guan, H., Yap, P.-T., Bozoki, A., & Liu, M. (2024). Federated learning for medical image analysis: A survey. Pattern Recognition, 151, 110424. https://doi.org/10.1016/j.patcog.2024.110424
Guo, D., Yang, D., Zhang, H., Song, J., Zhang, R., Xu, R., et al. (2025). DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning. arXiv preprint. https://doi.org/10.48550/arXiv.2501.12948
Hansen, N., Auger, A., Ros, R., Finck, S., & Posik, P. (2010). Comparing results of 31 algorithms from the black-box optimization benchmarking BBOB-2009. Proceedings of the 12th Annual Conference Companion on Genetic and Evolutionary Computation, 1689–1696. https://doi.org/10.1145/1830761.1830790
Jiang, Z., Chua, F.-F., & Lim, A. H.-L. (2025). Privacy-preserving data uploading scheme based on threshold secret sharing algorithm for Internet of Vehicles. International Journal of Technology, 16(3), 731–747. https://doi.org/10.14716/ijtech.v16i3.7260
Kairouz, P., McMahan, H. B., Avent, B., Bellet, A., Bennis, M., Bhagoji, A. N., et al. (2021). Advances and open problems in federated learning. Foundations and Trends® in Machine Learning, 14(1–2), 1–210. https://doi.org/10.1561/2200000083
Lai, F., Dai, Y., Singapuram, S., Liu, J., Zhu, X., Madhyastha, H., et al. (2022). FedScale: Benchmarking model and system performance of federated learning at scale. International Conference on Machine Learning, 11814–11827. https://doi.org/10.1145/3477114.3488760
LeCun, Y., Boser, B., Denker, J. S., Henderson, D., Howard, R. E., Hubbard, W., et al. (1989). Backpropagation applied to handwritten zip code recognition. Neural Computation, 1(4), 541–551. https://doi.org/10.1162/neco.1989.1.4.541
LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278–2324. https://doi.org/10.1109/5.726791
Li, J., Zhang, Y., Li, Y., Gong, X., & Wang, W. (2024). FedSparse: A communication-efficient federated learning framework based on sparse updates. Electronics, 13(24), 5042. https://doi.org/10.3390/electronics13245042
Li, L., Fan, Y., Tse, M., & Lin, K. Y. (2020). A review of applications in federated learning. Computers & Industrial Engineering, 149, 106854. https://doi.org/10.1016/j.cie.2020.106854
Li, Z., Ying, B., Liu, Z., Dong, C., & Yang, H. (2024). Achieving dimension-free communication in federated learning via zeroth-order optimization. arXiv preprint. https://doi.org/10.48550/arXiv.2405.15861
Lin, J., Zhu, L., Chen, W., Wang, W., & Han, S. (2023). Tiny machine learning: Progress and futures [Feature]. IEEE Circuits and Systems Magazine, 23(3), 8–34. https://doi.org/10.1109/MCAS.2023.3302182
Liu, S., Chen, P., Kailkhura, B., Zhang, G., Hero, A. O., & Varshney, P. K. (2020). A primer on zeroth-order optimization in signal processing and machine learning: Principles, recent advances, and applications. IEEE Signal Processing Magazine, 37(5), 43–54. https://doi.org/10.1109/MSP.2020.3003837
Liu, S., Chen, P., Zhu, W., & Carin, L. (2018). Zeroth-order stochastic variance reduction for nonconvex optimization. Advances in Neural Information Processing Systems. https://doi.org/10.48550/arXiv.1805.10367
Ma, X., Wang, J., & Zhang, X. (2025). Data-free black-box federated learning via zeroth-order gradient estimation. Proceedings of the AAAI Conference on Artificial Intelligence, 39(18), 19314–19322. https://doi.org/10.1609/aaai.v39i18.34126
McMahan, B., Moore, E., Ramage, D., Hampson, S., & y Arcas, B. A. (2017). Communication-efficient learning of deep networks from decentralized data. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 1273–1282. https://doi.org/10.48550/arXiv.1602.05629
Nesterov, Y., & Spokoiny, V. (2017). Random gradient-free minimization of convex functions. Foundations of Computational Mathematics, 17(2), 527–566. https://doi.org/10.1007/s10208-015-9296-2
Nguyen, D. C., Ding, M., Pathirana, P. N., Seneviratne, A., Li, J., & Poor, H. V. (2021). Federated learning for Internet of Things: A comprehensive survey. IEEE Communications Surveys & Tutorials, 23(3), 1622–1658. https://doi.org/10.1109/COMST.2021.3075439
Nguyen, D. C., Pham, Q.-V., Pathirana, P. N., Ding, M., Seneviratne, A., Lin, Z., et al. (2023). Federated learning for smart healthcare: A survey. ACM Computing Surveys, 55(3), 1–37. https://doi.org/10.1145/3501296
Reisizadeh, A., Mokhtari, A., Hassani, H., Jadbabaie, A., & Pedarsani, R. (2020). FedPAQ: A communication-efficient federated learning method with periodic averaging and quantization. International Conference on Artificial Intelligence and Statistics, 2021–2031. https://doi.org/10.48550/arXiv.1909.13014
Rieke, N., Hancox, J., Li, W., Milletari, F., Roth, H. R., Albarqouni, S., et al. (2020). The future of digital health with federated learning. NPJ Digital Medicine, 3(1), 1–7. https://doi.org/10.1038/s41746-020-00323-1
Rubinstein, R. Y., & Kroese, D. P. (2004). The cross-entropy method: A unified approach to combinatorial optimization, Monte Carlo simulation and machine learning. Springer Science & Business Media. https://doi.org/10.1007/978-1-4757-4321-0
Teo, Z. L., Jin, L., Li, S., Miao, D., Zhang, X., Ng, W. Y., et al. (2024). Federated machine learning in healthcare: A systematic review on clinical applications and technical architecture. Cell Reports Medicine, 5(2), 101419. https://doi.org/10.1016/j.xcrm.2024.101419
Turner, R., Eriksson, D., McCourt, M., Kiili, J., Laaksonen, E., Xu, Z., et al. (2021). Bayesian optimization is superior to random search for machine learning hyperparameter tuning: Analysis of the black-box optimization challenge 2020. NeurIPS 2020 Competition and Demonstration Track, 3–26. https://doi.org/10.48550/arXiv.2104.10201
Wang, S., Tuor, T., Salonidis, T., Leung, K. K., Makaya, C., He, T., et al. (2019). Adaptive federated learning in resource constrained edge computing systems. IEEE Journal on Selected Areas in Communications, 37(6), 1205–1221. https://doi.org/10.1109/JSAC.2019.2904348
Wang, X., Jin, Y., Schmitt, S., & Olhofer, M. (2023). Recent advances in Bayesian optimization. ACM Computing Surveys, 55(13s), 287:1–287:36. https://doi.org/10.1145/3582078
Wang, Y., Du, S., Balakrishnan, S., & Singh, A. (2018). Stochastic zeroth-order optimization in high dimensions. Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, 1356–1365. https://doi.org/10.48550/arXiv.1710.10551
Wei, K., Li, J., Ding, M., Ma, C., Yang, H. H., Farokhi, F., et al. (2020). Federated learning with differential privacy: Algorithms and performance analysis. IEEE Transactions on Information Forensics and Security, 15, 3454–3469. https://doi.org/10.1109/TIFS.2020.2988575
Wu, C., Wu, F., Lyu, L., Huang, Y., & Xie, X. (2022). Communication-efficient federated learning via knowledge distillation. Nature Communications, 13(1), 2032. https://doi.org/10.1038/s41467-022-29763-x
Yang, Q., Liu, Y., Chen, T., & Tong, Y. (2019). Federated machine learning: Concept and applications. ACM Transactions on Intelligent Systems and Technology, 10(2), 1–19. https://doi.org/10.1145/3298981
Yang, Y., Yang, Z., Wang, L., Zhu, L., & Wang, M. (2025). Dynamic personalized federated learning via representation-driven clustering. IEEE Internet of Things Journal. https://doi.org/10.1109/JIOT.2025.3577661
Zhang, C., Xie, Y., Bai, H., Yu, B., Li, W., & Gao, Y. (2021). A survey on federated learning. Knowledge-Based Systems, 216, 106775. https://doi.org/10.1016/j.knosys.2021.106775