Predicting Mechanical Failure of Electronic Assemblies
Enold Pierre-Louis
Comprehensive techniques developed at Aerospace can quantify the failure risk of electronic components—an important part of the mission assurance process.
Electronic assemblies in spacecraft and launch vehicles contain internal components that are susceptible to fatigue failures induced by thermal cycling and random vibration. Repeated deformation of components such as ASICs, hybrid circuits, and ceramic grid arrays can lead to fatigue failures of their solder joints, weld joints, adhesive bonds, and lead wires. These mechanical failures, which usually manifest themselves as electrical performance anomalies, can result in catastrophic mission loss or loss of critical functions.
Aerospace has developed integrated techniques combining finite-element stress modeling and probabilistic analysis to quickly and accurately assess the likelihood that an electronic subsystem will experience anomalies during ground processing or space operation. In some cases, these analytical techniques have helped prevent launch delays by substantiating that suspect components have adequate margins to survive worst-case launch excitations and mission thermal shifts. On other occasions, they have identified root failure causes and provided rationale for design enhancements or modified operational limits.
Basic Steps
One difficult challenge that arises periodically in space programs is the need to evaluate the risk of mechanical failure of a suspect component at an advanced stage of integration. A component can be viewed as suspect for a variety of reasons, but perhaps the most common is the failure of an identical or similar component during environmental testing. If the component has already been installed on the spacecraft, the cost and risk of removing and replacing it can be tremendous—especially if the spacecraft has moved to launch preparation. Thus, a thorough analysis is needed to determine the true risk of part failure.
A cracked field-programmable gate array (FPGA) lead from an electronic unit caused a functional failure. Characterization testing coupled with stress analyses showed that the fatigue failure of FPGA leads was caused by dynamic resonant coupling between the circuit card and the chassis. Eventually, the units installed in the flight hardware had to be replaced.
This risk analysis will typically involve researching the failure history of the component or similar components. It entails a detailed review of the test history, including the environments and loads the component was exposed to and the type of failures, if any, that occurred. It may also include a review of manufacturing records to identify process deviations and dimensional variations. This information is used to determine a fatigue-based damage level for the component, which can in turn be used in a statistical survival analysis. The goal is to determine the probability that the component will survive its remaining mission environments, including any remaining ground tests as well as launch and orbit. If the evaluation can show that the risk of failure is sufficiently small, then a decision can be made to use the suspect component, thereby saving the program time and money. The procedure generally encompasses four basic steps.
The first step is to determine all the critical environments that the component has already been exposed to during assembly and ground-level screening tests and that it will be subjected to during any additional system-level tests, launch, and orbit. This generally requires pedigree review of unit, subsystem, and system exposure histories and review of specifications for launch and the orbital mission environments.
The next step is to estimate the component's physical responses to these environmental loads, such as stress, strain, and deflection. For simple situations, algebraic formulas can be applied, but more commonly the finite-element method is used because the component resides within an assembly. These finite-element models must often account for complex material behaviors and usually require a combination of procedures to capture effects such as geometric and material nonlinearities, temperature- and time-dependent creep, coupled temperature and stress interactions, and friction at contact interfaces. These response models characterize the strains and stresses at critical locations in the suspect component.
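As an illustration of the kind of algebraic formula that can serve in a simple situation, the sketch below uses the common first-order estimate of cyclic shear strain in a solder joint caused by thermal-expansion mismatch: strain range = (distance from the neutral point × CTE difference × temperature swing) / joint height. This particular formula, the function name, and all numerical values are assumptions chosen for illustration; they are not taken from the Aerospace analyses, which generally rely on finite-element models.

```python
def mismatch_shear_strain(dnp_mm, joint_height_mm,
                          cte_component_ppm, cte_board_ppm, delta_t_c):
    """First-order cyclic shear strain range in a solder joint.

    dnp_mm            distance from the joint to the neutral point (mm)
    joint_height_mm   solder joint standoff height (mm)
    cte_*_ppm         coefficients of thermal expansion (ppm per deg C)
    delta_t_c         temperature swing of one cycle (deg C)
    """
    # Convert the CTE mismatch from ppm/degC to strain per degC.
    delta_alpha = abs(cte_board_ppm - cte_component_ppm) * 1e-6
    # Relative in-plane displacement divided by joint height gives shear strain.
    return dnp_mm * delta_alpha * delta_t_c / joint_height_mm


# Hypothetical ceramic package on a laminate board (illustrative values only).
strain = mismatch_shear_strain(dnp_mm=5.0, joint_height_mm=0.1,
                               cte_component_ppm=6.0, cte_board_ppm=16.0,
                               delta_t_c=50.0)
print(f"Estimated shear strain range: {strain:.4f}")
```

An estimate like this ignores lead compliance, creep, and board bending, which is why it is only a screening check before a finite-element model is built.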
The third step is to use analytical techniques to evaluate the physical damage to the component caused by the various loading events. For example, the analyst typically uses a strain-life fatigue model to determine the amount of damage induced on the component's solder joints by cyclic loads. The strain-life model is an empirically based relationship that relates the strain developed during one loading event or cycle to the number of loading cycles required to propagate a crack to failure. The extent of damage induced by each loading event can then be combined using a cumulative damage law. This approach allows the normalization of damage induced by various loading events (such as 10 to 100 large temperature cycles in acceptance testing and tens of thousands of small thermal shifts during the mission) into a generalized damage variable.
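The damage bookkeeping in this step can be sketched in a few lines. The example below assumes a Coffin-Manson form for the strain-life relationship and the Palmgren-Miner linear rule as the cumulative damage law; the article does not name the specific models used, so both are stand-ins. The fatigue coefficient, exponent, strain ranges, and cycle counts are placeholder values, not measured solder properties.

```python
def cycles_to_failure(strain_range, ductility_coeff=0.325, ductility_exp=-0.5):
    """Coffin-Manson life: strain_range / 2 = ductility_coeff * (2 * N_f) ** ductility_exp.

    Solved for N_f. The default coefficient and exponent are placeholders.
    """
    return 0.5 * (strain_range / (2.0 * ductility_coeff)) ** (1.0 / ductility_exp)


def miner_damage(events):
    """Palmgren-Miner sum over (applied_cycles, strain_range) loading events.

    Each event contributes applied_cycles / cycles_to_failure; under this
    rule, failure is predicted when the total reaches 1.0.
    """
    return sum(n / cycles_to_failure(strain) for n, strain in events)


# Hypothetical exposure history (illustrative values only).
history = [
    (100, 0.0065),      # large acceptance-test temperature cycles
    (20000, 0.00065),   # small thermal shifts during the mission
]
total_damage = miner_damage(history)
print(f"Generalized damage variable: {total_damage:.3f}")
```

Expressing every event as a fraction of consumed life is what allows a few large test cycles and many small mission cycles to be added on a common scale.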
The final step is to use probabilistic modeling to evaluate the statistical risk that the suspect component would fail during a specific damage event, such as launch liftoff or a 10-year mission. Testing has shown that fatigue failures can be accurately characterized with the Weibull distribution, which models the probability that a component will perform its required function during a specified period of time under stated loading conditions. For these assessments, time is replaced by the generalized damage variable (derived in step three), and the two-parameter (scale and shape) Weibull distribution is typically used. The Weibull scale parameter is estimated by accounting for all the components tested, using failure as well as nonfailure data. To substantiate that observed failures are "outliers," accelerated life testing of similar components is often required. Successful life testing to numerous mission lives can be used to show that most similar components will fail at damage levels significantly larger than those expected during the mission. When life testing is required, additional analyses must be performed to determine the sample size and duration of the life test. When the number of component failures experienced during hardware and life testing is too small to permit calculation of the variance of the distribution, the Weibull shape parameter is typically estimated from historical data. Quite often, the risk associated with a suspect component's failure during the mission is factored into a system reliability analysis to assess overall impact.
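A minimal sketch of this calculation follows. It assumes the shape parameter is fixed from historical data and estimates the scale parameter by maximum likelihood from failed and unfailed units together (often called a Weibayes estimate); the article does not state which estimator was used, so this is one plausible choice. The function names and all data values are hypothetical.

```python
import math


def weibull_scale(damages, failed, shape):
    """Maximum-likelihood scale estimate for a fixed (assumed) shape.

    damages  damage level each unit reached (at failure or at end of test)
    failed   True if the unit failed, False if it survived (suspension)
    """
    n_failures = sum(failed)
    if n_failures == 0:
        raise ValueError("at least one failure is needed for this estimate")
    # Every unit contributes its damage; only failures count in the divisor.
    return (sum(d ** shape for d in damages) / n_failures) ** (1.0 / shape)


def reliability(damage, scale, shape):
    """Two-parameter Weibull survival probability at a given damage level."""
    return math.exp(-((damage / scale) ** shape))


def conditional_survival(damage_now, damage_end, scale, shape):
    """Probability of surviving to damage_end given survival to damage_now."""
    return reliability(damage_end, scale, shape) / reliability(damage_now, scale, shape)


# Hypothetical test population: one failure and four survivors.
damages = [0.9, 1.5, 1.5, 2.0, 2.0]
failed = [True, False, False, False, False]
shape = 2.0  # assumed from historical data, not fitted

scale = weibull_scale(damages, failed, shape)
# Suspect unit has accumulated 0.06 damage and will reach 0.10 by end of mission.
p_survive = conditional_survival(0.06, 0.10, scale, shape)
print(f"scale = {scale:.3f}, mission survival probability = {p_survive:.5f}")
```

Conditioning on the damage already survived reflects that the suspect unit has passed its earlier tests; the resulting probability is the kind of input a system reliability analysis would use.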
Examples of Aerospace Work
These techniques have been effectively used to avert launch delays and prevent failures of critical systems in operational space vehicles. For example, detailed stress analyses performed by Aerospace showed that the functional failure of a transmitter during system-level thermal vacuum testing was caused by the large mismatch in thermal expansion between a copper ribbon and the composite circuit board, which resulted in solder-joint cracking. Detailed material characterization performed by Aerospace revealed that the circuit board had an extremely high coefficient of thermal expansion within a narrow temperature range below ambient. Consequently, stress and probability analyses showed that the transmitters in operation and awaiting launch would fail due to the large number of mission power cycles within the critical temperature range. Further analyses and life testing demonstrated that adequate mission reliability would be achieved by operating the devices at a temperature outside the critical temperature range. Accordingly, the operating temperature was modified for all transmitters already integrated in space vehicles. Aerospace then worked with the contractors to develop a modified S-shaped ribbon with an overall length that would not adversely affect radio-frequency performance while still providing ample stress relief; the design is being used to rework transmitters for future vehicles.
Aerospace analyses also showed that the likelihood of solder-joint failures in a converter could be greatly diminished by modifying the operational safe-hold procedures. The analyses showed that a combination of solder joints with minimal fillets and gold embrittlement was the cause of numerous cracked joints detected in several converters during acceptance thermal-cycle testing. Further analysis showed that the thermal cycles on orbit associated with eclipses would induce minimal additional fatigue damage, but that the thermal cooling shifts expected during safe-hold procedures could propagate solder cracks to failure. Probabilistic analyses showed that adequate system reliability could be achieved by maintaining power for the suspect converter during safe holds to limit additional fatigue damage. Rather than delay the mission, a simple change to the operational procedure was implemented.
Stress analyses performed by Aerospace showed that the functional failure of a transmitter during system-level thermal vacuum testing was caused by the large mismatch in thermal expansion between a copper ribbon and the composite circuit board, which resulted in solder-joint cracking. Probability analyses showed that the transmitters in mission use and awaiting launch would fail due to the large number of mission power cycles within the critical temperature range. Aerospace showed that changing the operational temperature limits would allow the existing flight units to be used as-is and helped develop a modified S-shaped ribbon, which significantly improved the fatigue margin for future units.
Similar techniques were used to avert a significant launch delay by showing that failures observed during life testing of solar arrays were primarily caused by overtesting. A plan to replace solar cells on flight panels was being formulated after an unacceptably high failure rate stemming from solder-joint fatigue was observed during the original life test; however, stress and fatigue analyses showed that the high current used for the bypass diode electrical cycling caused severe fatigue damage to solder joints already weakened by excessive thermal cycling. Accordingly, Aerospace worked with the contractors to develop requirements for a new life test based on more realistic operational conditions. Updated fatigue analyses enabled a significant reduction in the life-test requirements. Probabilistic analyses showed that the low failure rate achieved in the new life test substantiated that the solar cells for the upcoming flight would adequately meet power requirements until the end of their mission life.
In another investigation, Aerospace determined that repairing suspect components was most prudent. During spacecraft thermal vacuum testing, an electronic unit experienced functional failure as a result of numerous cracked field-programmable gate array (FPGA) leads. Subsequent inspections of other units revealed additional failed leads. Structural analyses coupled with characterization vibration testing showed that dynamic resonant coupling between the circuit card and the chassis caused fatigue failure of FPGA leads; however, because of schedule constraints, only low-level visual inspections and minimal repairs were initially conducted on the flight units. Aerospace convinced the contractor that additional inspections should be conducted during a schedule slip caused by an unrelated issue. A high-magnification inspection conducted by Aerospace revealed a partially cracked FPGA lead in the flight hardware, which was not detected during the initial inspection conducted by the contractor. Life validation testing, based on requirements derived by Aerospace analyses, showed that further crack growth was possible during launch. Accordingly, the contractor agreed to remove all the installed units, replace their FPGAs, and incorporate stiffeners to reduce the stresses on the leads caused by vibration.
Conclusion
Aerospace has played a vital role in the successful resolution of subassembly structural failures and anomalies for numerous space programs. While it is obviously important to detect and replace unsuitable components, it's equally important to have confidence in the suspect components that are retained. By performing detailed stress and probabilistic analyses and developing accelerated life-test programs, Aerospace helps program leads manage risk and make informed decisions with a high degree of confidence in mission success.





