Laboratory for System Dependability (Research topics)

Our research applies stochastic models to analyze software systems facing uncertainty, evaluate system dependability, and optimize system design under cost constraints.
We are also interested in developing new dependability measures and efficient evaluation techniques.

Fog computing is a distributed computing paradigm that allows flexible allocation of processes and data to computers close to edge devices, user terminals, or IoT sensors in order to achieve low latency, save network bandwidth, and preserve security and privacy. Applications on a fog computing infrastructure can flexibly change their configuration and behavior depending on processing demands and environmental conditions. However, assuring system qualities such as performance, availability, and reliability becomes complicated because these qualities are not statically determined at design time. Drone-based image processing systems, for instance, can be made more reliable by offloading computation tasks to fog nodes through a wireless communication network. The desirable computation mode (e.g., local processing on the drone or offloading) can change depending on environmental uncertainties such as network connection reliability and workload intensity. Therefore, system quality design needs to take such environmental uncertainties into account in addition to system configurations. In this study, we use Stochastic Reward Nets to model such complex stochastic system behaviors, analyze system performance and availability quantitatively, and optimize system design.
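
To give a feel for this kind of quantitative analysis, the following is a minimal sketch and not the Stochastic Reward Net models used in our studies. It assumes the wireless link alternates between "up" and "down" as a two-state continuous-time Markov chain, attaches a throughput reward to each state, and compares an adaptive drone (offload when the link is up, process locally when it is down) with an offload-only drone. All rates, throughput values, and function names are hypothetical and chosen only for illustration.

```python
def steady_state_two_state(fail_rate, repair_rate):
    """Return steady-state probabilities (up, down) of a two-state CTMC.

    The chain leaves "up" at fail_rate and leaves "down" at repair_rate.
    """
    total = fail_rate + repair_rate
    return repair_rate / total, fail_rate / total


def expected_reward(probabilities, rewards):
    """Expected steady-state reward: sum of state probability times state reward."""
    return sum(p * r for p, r in zip(probabilities, rewards))


# Hypothetical parameters (assumptions, not measured values).
link_fail_rate = 0.5       # link failures per hour
link_repair_rate = 4.5     # link recoveries per hour
offload_throughput = 10.0  # images per second while offloading to a fog node
local_throughput = 2.0     # images per second when processing on the drone

p_up, p_down = steady_state_two_state(link_fail_rate, link_repair_rate)

# Adaptive mode: offload while the link is up, fall back to local processing otherwise.
adaptive = expected_reward((p_up, p_down), (offload_throughput, local_throughput))

# Offload-only mode: no work is done while the link is down.
offload_only = expected_reward((p_up, p_down), (offload_throughput, 0.0))

print(f"link availability: {p_up:.3f}")
print(f"adaptive throughput: {adaptive:.3f} images/s")
print(f"offload-only throughput: {offload_only:.3f} images/s")
```

A Stochastic Reward Net describes the same kind of state-and-reward structure at a higher level and is solved through an underlying Markov reward model, which is why a small Markov chain is a reasonable stand-in for illustration.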

Related publications
・F. Machida, Q. Zhang and E. Andrade, Performability analysis of adaptive drone computation offloading with fog computing. (FGCS2022) [link]
・Q. Zhang, F. Machida and E. Andrade, Performance bottleneck analysis of drone computation offloading to a shared fog node. (WoSoCer2022) [paper]
・F. Machida and E. Andrade, PA-offload: Performability-aware adaptive fog offloading for drone image processing. (ICFEC2021) [paper]
・F. Machida and E. Andrade, Availability modeling for drone image processing systems with adaptive offloading. (PRDC2021) [paper]

Recent advances in machine learning algorithms, together with increased computing power and the availability of big data, continue to expand the applications of machine learning systems. Autonomous vehicles, for instance, use deep learning to recognize traffic signs, obstacles, and pedestrians in images captured by the camera. However, machine learning is not perfect, as it can produce erroneous outputs for real-world input data. Engineers need to anticipate erroneous outputs from machine learning functions and adopt a suitable system architecture that takes reliability and safety into account. In this study, we leverage the idea of N-version programming, a well-known software fault-tolerance technique, to improve the reliability of machine learning systems. We propose different types of N-version machine learning architectures and develop reliability models to assess the reliability of these architectures. [More details]
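
The intuition behind such reliability models can be sketched with two simplified formulas. This is an illustrative sketch under stated assumptions, not the models from the publications listed here. The first function assumes a two-version system fails only when both versions err on the same input, and expresses the dependence between versions as a hypothetical conditional error probability. The second assumes three versions with identical, statistically independent error probabilities and counts the system as failed whenever at least two versions err.

```python
def two_version_failure_prob(p_err_first, p_second_errs_given_first_errs):
    """Probability that both versions err on the same input.

    p_second_errs_given_first_errs captures how similar the two versions are:
    a value equal to the second version's own error probability means
    independent errors, and a value of 1.0 means no diversity at all.
    """
    return p_err_first * p_second_errs_given_first_errs


def majority_vote_failure_prob(p_err):
    """Probability that at least two of three independent versions err."""
    return 3 * p_err**2 * (1 - p_err) + p_err**3


# Hypothetical error probabilities, for illustration only.
p = 0.1
print(two_version_failure_prob(p, 1.0))  # identical versions: no gain over one model
print(two_version_failure_prob(p, 0.5))  # partially diverse versions
print(two_version_failure_prob(p, p))    # independent versions
print(majority_vote_failure_prob(p))     # three independent versions, majority vote
```

The gap between the first three results shows why diversity between versions matters: the less likely the versions are to err on the same inputs, the lower the failure probability of the combined system.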

Related publications
・F. Machida, Using Diversities to Model the Reliability of Two-version Machine Learning Systems. (TETC2023) [link][pdf]
・Q. Wen and F. Machida, Characterizing Reliability of Three-version Traffic Sign Classifier System through Diversity Metrics. (ISSRE2023) [paper]
・M. Takahashi, F. Machida, and Q. Wen, How data diversification benefits the reliability of three-version image classification systems. (PRDC2022) [paper]
・Q. Wen and F. Machida, Reliability models and analysis for triple-model with triple-input machine learning systems. (DSC2022) [paper]
・F. Machida, On the diversity of machine learning models for system reliability. (PRDC2019) [paper][slide]
・F. Machida, N-version machine learning models for safety critical systems. (DSML2019) [paper][slide]

As many IoT application systems require software to operate continuously for long periods, operational software reliability has become a critical concern for dependable systems. It is well known that long-running software often suffers gradual degradation of performance and reliability over time, a phenomenon referred to as software aging. Software aging is typically caused by software bugs that are not easily detected in systems consisting of many interdependent software components. Therefore, system monitoring and statistical analysis are key to finding the trends and root causes of software aging phenomena. By predicting a potential aging trend and failure time, we can effectively apply preventive software maintenance techniques such as software rejuvenation and software life-extension. In this study, we use stochastic models to evaluate the effectiveness of preventive software maintenance techniques and find optimal maintenance schedules.
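
As a rough illustration of how a maintenance schedule can be optimized, the sketch below uses a textbook age-replacement style renewal model rather than the models from our publications. It assumes the time to an aging-related failure follows a Weibull distribution with an increasing failure rate, that the software is rejuvenated after a fixed interval if it has not failed first, and that rejuvenation causes less downtime than recovery from a failure. All parameter values and function names are hypothetical.

```python
import math


def weibull_survival(t, shape, scale):
    """Probability that the software has not failed by time t."""
    return math.exp(-((t / scale) ** shape))


def availability(interval, shape, scale, repair_time, rejuvenation_time, steps=1000):
    """Steady-state availability when rejuvenating every `interval` hours.

    Expected uptime per cycle is the integral of the survival function
    from 0 to `interval`, computed here with the trapezoidal rule.
    """
    h = interval / steps
    values = [weibull_survival(i * h, shape, scale) for i in range(steps + 1)]
    expected_uptime = h * (sum(values) - 0.5 * (values[0] + values[-1]))
    p_survive = values[-1]  # probability the cycle ends with rejuvenation
    expected_downtime = (1 - p_survive) * repair_time + p_survive * rejuvenation_time
    return expected_uptime / (expected_uptime + expected_downtime)


def best_interval(candidates, **params):
    """Return the candidate rejuvenation interval with the highest availability."""
    return max(candidates, key=lambda t: availability(t, **params))


# Hypothetical parameters in hours (assumptions for illustration only).
params = dict(shape=2.0, scale=100.0, repair_time=5.0, rejuvenation_time=0.5)
candidates = range(5, 205, 5)

best = best_interval(candidates, **params)
print(f"best interval: {best} h, availability: {availability(best, **params):.4f}")
print(f"almost no rejuvenation: {availability(1000, **params):.4f}")
```

Rejuvenating too often wastes time on maintenance, while rejuvenating too rarely lets costly failures occur, so the availability curve has an interior maximum under these assumptions.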

Related publications
・E. Andrade, R. Pietrantuono, F. Machida, and D. Cotroneo, A comparative analysis of software aging in image classifiers on cloud and edge. (IEEE TDSC 2023) [link]
・D. Dias, F. Machida, and E. Andrade, Analysis of software aging in a blockchain platform. (WoSAR2022) [paper]
・K. Watanabe and F. Machida, Availability analysis of a drone system with proactive offloading for software life-extension. (COINS2022) [paper]
・F. Machida et al., Lifetime extension of software execution subject to aging. (IEEE TR 2017) [link][paper]