ON-OFF SPACECRAFT RELATIVE CONTROL IN SLIDING MODE VIA REINFORCEMENT LEARNING
Keywords:
reinforcement learning, proximal policy optimization, spacecraft control, on-orbit servicing, on-off control, autonomous control systems.

ABSTRACT
The paper addresses the problem of on-off spacecraft relative control in sliding mode for autonomous on-orbit servicing operations under actuator amplitude limits, action discreteness, and parametric uncertainties. The goal is to develop and assess an approach that combines sliding-mode control with modern reinforcement-learning methods tailored for resource-constrained onboard implementation. The relative-motion dynamics are formulated in an orbital coordinate frame with normalized states and discretized in time. The impulsive nature of the actuation is captured by binary actions with pulse-width modulation, subject to constraints on the thrust level, pulse duration, and duty cycle. We propose a combined synthesis in which the sliding-surface parameters and switching rules are tuned via proximal policy optimization within an actor-critic architecture. The actor and critic are implemented as neural networks that approximate the policy and the value function, respectively: the actor takes the state vector as input and outputs the mean and standard deviation of the sliding-mode control-law parameters. The quality functional combines penalties on the state deviation and thrust use, thus enabling a trade-off among response speed, accuracy, and propellant consumption. Two uncoupled agents independently control the in-plane and out-of-plane components of the relative orbital motion. The proximal policy optimization hyperparameters are selected to balance learning time, stability, and control performance. The agents are trained and analyzed for four cases that differ in thrust level and weighting matrices. The results confirm the potential of this approach for autonomous spacecraft control under constraints and uncertainty. Compared with reported baselines, the trained agent shows superior robustness to plant-parameter uncertainty, which we attribute to the inherent robust properties of sliding-mode control. These findings have the potential to improve the efficiency and autonomy of on-orbit servicing operations.
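To make the control scheme concrete, the following minimal Python sketch illustrates the three ingredients the abstract describes: a linear sliding surface whose slope the agent tunes, an on-off thrust command produced by pulse-width modulation of the sign of the sliding variable, and a quadratic step cost that trades state deviation against thrust use. A small PPO-style actor head with mean and standard-deviation outputs shows how the control-law parameters would be sampled. All symbols, network sizes, and numerical values here are illustrative assumptions for a single translational axis, not the model or values used in the paper.

```python
# Minimal illustrative sketch (assumed names and values; not the paper's exact model).
import numpy as np
import torch
import torch.nn as nn

DT = 0.1          # control period [s] (assumed)
U_MAX = 0.05      # thrust acceleration amplitude limit [m/s^2] (assumed)
MIN_PULSE = 0.02  # minimum realizable pulse duration [s] (assumed)

def sliding_surface(x, c):
    """Linear sliding surface s = c*e + de for one translational axis.
    x = [position error e, velocity error de]; c > 0 is the surface slope
    tuned by the RL agent."""
    e, de = x
    return c * e + de

def pwm_on_off_thrust(s, duty):
    """Map the sliding variable to a binary (on-off) thrust pulse.
    The pulse width follows the commanded duty cycle, clipped to the
    control period; pulses shorter than MIN_PULSE cannot be realized."""
    width = np.clip(duty * DT, 0.0, DT)
    if width < MIN_PULSE:
        return 0.0, 0.0                     # pulse too short -> no firing
    return -np.sign(s) * U_MAX, width       # fire opposite to s at full amplitude

def step_cost(x, fired_width, q=np.diag([1.0, 0.1]), r=10.0):
    """Quadratic cost trading state deviation against propellant use."""
    x = np.asarray(x)
    return float(x @ q @ x) + r * (fired_width / DT)

class SMCParamActor(nn.Module):
    """PPO actor: maps the state to the mean and standard deviation of a
    Gaussian over the sliding-mode law parameters [c, duty]."""
    def __init__(self, state_dim=2, param_dim=2, hidden=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
        )
        self.mean = nn.Linear(hidden, param_dim)
        self.log_std = nn.Parameter(torch.zeros(param_dim))

    def forward(self, state):
        h = self.body(state)
        return self.mean(h), self.log_std.exp()

# One control step: sample SMC parameters, fire the thruster, evaluate the cost.
actor = SMCParamActor()
x = np.array([5.0, -0.1])  # normalized relative state (assumed)
mean, std = actor(torch.as_tensor(x, dtype=torch.float32))
c, duty = torch.distributions.Normal(mean, std).sample().clamp(0.01, 1.0).tolist()
u, width = pwm_on_off_thrust(sliding_surface(x, c), duty)
print(f"thrust={u:+.3f} m/s^2 for {width:.3f} s, cost={step_cost(x, width):.2f}")
```

In the setting described above, two such agents would run independently, one for the in-plane channel and one for the out-of-plane channel, each tuning the parameters of its own sliding-mode law.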