DEPLOYMENT CONTROL OF TRANSFORMABLE ROD STRUCTURES USING REINFORCEMENT LEARNING
Keywords:
transformable structure; reinforcement learning; neural network; deployment control.
Abstract
The task of controlling the deployment of transformable rod structures for space applications is studied. An example of such structures is a mesh antenna truss, which is deployed using a cable-pulley system.
The aim of the study is to develop an intelligent agent (IA) based on reinforcement learning (RL) that deploys the structure under consideration and holds it in the deployed position while meeting specified requirements. The main requirements concern the deployment time and minimal angular velocities of the V-folding rods at the final stage of deployment.
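As an illustration only, the two requirements can be combined into a single scalar cost. The sketch below is a hypothetical example, not the cost function used in the study: the function name, the weights `w_time` and `w_rate`, and the quadratic form of the rate penalty are all assumptions.

```python
def terminal_cost(t_deploy, omegas, t_required, w_time=1.0, w_rate=1.0):
    """Hypothetical terminal cost for one deployment attempt.

    t_deploy   -- time at which the structure reached the deployed position
    omegas     -- angular velocities of the V-folding rods at that moment
    t_required -- required deployment time (assumed to be an upper bound)
    """
    # Penalize only the part of the deployment time that exceeds the requirement.
    time_penalty = max(0.0, t_deploy - t_required)
    # Penalize residual angular velocities of the rods at the final stage.
    rate_penalty = sum(w * w for w in omegas)
    return w_time * time_penalty + w_rate * rate_penalty


# Example: 2 time units late, small residual rod velocities.
cost = terminal_cost(12.0, [0.1, -0.2], t_required=10.0)
```

The weights set the trade-off between finishing on time and arriving with low rod velocities; other cost shapes are equally plausible.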
During the research, methods of dynamic modeling of multibody systems, control theory, reinforcement learning, and computer simulation were used.
It is demonstrated that the RL methodology can overcome a number of difficulties inherent in traditional approaches to controlling the deployment of transformable rod structures. In particular, RL makes it possible to optimize the deployment system with models built in specialized multibody dynamics software, taking into account the necessary criteria and constraints.
The features of this approach to controlling the deployment of rod structures were investigated using a simplified model of one section of a transformable mesh antenna. The IA was designed on the basis of the actor-critic architecture. A structure of the IA neural networks is proposed that enforces the constraints on control actions and keeps the learning process stable. The proximal policy optimization algorithm is used to train the IA. Several cases are investigated, which differ in cost functions, actor activation functions, and friction parameters of the joints.
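Two ingredients named above can be sketched in a few lines: a bounded actor output and the clipped surrogate objective of proximal policy optimization. This is a minimal sketch under stated assumptions, not the network structure proposed in the study: the use of a scaled tanh to bound the control action, the limit `u_max`, and the function names are hypothetical, and `clip_eps=0.2` is a commonly used default rather than a value taken from the paper.

```python
import math


def bounded_action(pre_activation, u_max):
    """Map an unbounded actor output to the interval [-u_max, u_max].

    A scaled tanh is one common way to enforce a control constraint
    (an assumption here; the study compares several actor activations).
    """
    return u_max * math.tanh(pre_activation)


def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Clipped surrogate objective of PPO, averaged over a batch.

    logp_new / logp_old -- log-probabilities of the taken actions under
                           the updated and the data-collecting policy
    advantages          -- advantage estimates (e.g. from the critic)
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(lp_new - lp_old)
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Taking the minimum removes the incentive to move the policy
        # far from the one that collected the data.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)


# Example: a large pre-activation saturates at the control limit.
u = bounded_action(100.0, u_max=5.0)
```

The objective is maximized with respect to the actor parameters; the clipping limits how far a single update can move the policy, which is one reason PPO training tends to be stable.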
In cases where the dynamic properties of the model and the real structure differ significantly, the IA can be fine-tuned. This can be done by deploying the real structure, since the IA requires significantly fewer attempts for final fine-tuning than for preliminary training.
The practical value of the obtained results is that they facilitate the development of deployment control systems for space structures and improve the performance of these systems according to different specified criteria.