by
Anonymous Coward
on December 2, 2015, 17:07
(#2927279)
In practice, the mean time between failures (MTBF, 1/λ) is often reported instead of the failure rate. This is valid and useful if the failure rate may be assumed constant – often used for complex units / systems, electronics – and is a general agreement in some reliability standards (Military and Aerospace). It does in this case only relate to the flat region of the bathtub curve, also called the "useful life period". Because of this, it is incorrect to extrapolate MTBF to give an estimate of the service life time of a component, which will typically be much less than suggested by the MTBF due to the much higher failure rates in the "end-of-life wearout" part of the "bathtub curve".
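Large-scale clusters (Score: 0)
There is a yardstick for how likely this kind of failure is, called FIT (Failure In Time): the number of malfunctions that occur in 10^9 hours of operation.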
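I heard this at a training session for the K computer: in a large cluster the figure is apparently on the order of 10 FIT per node, though I've forgotten the exact number. Now, suppose you compute on 100,000 nodes for 100 hours. Since 10 FIT is a 10^-8 chance of a malfunction per node-hour, estimating the probability of never hitting a single malfunction from the FIT above gives
(1 - 10^-8)^(10^5*10^2)
which comes to roughly 90%, i.e. about a 10% chance of at least one malfunction per run.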
The K computer actually has somewhat fewer nodes than this, but it's still a fairly realistic threat.
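As a rough check on this kind of estimate, here is a minimal Python sketch. It assumes a constant failure rate and independent nodes; the helper name `prob_no_failure` and its parameters are made up for illustration, and 10 FIT is only the order of magnitude recalled above, not a measured value.

```python
import math

FIT_HOURS = 1e9  # FIT counts failures per 10^9 device-hours


def prob_no_failure(fit_per_node: float, nodes: int, hours: float) -> float:
    """Probability that no node malfunctions during the whole run.

    Hypothetical helper: assumes a constant failure rate and
    independent nodes.
    """
    p_fail_per_node_hour = fit_per_node / FIT_HOURS
    node_hours = nodes * hours
    # (1 - p)^n, evaluated through log1p so the tiny p is not lost to rounding
    return math.exp(node_hours * math.log1p(-p_fail_per_node_hour))


if __name__ == "__main__":
    # 10 FIT per node, 100,000 nodes, 100 hours -> 10^7 node-hours
    print(f"{prob_no_failure(10, 100_000, 100):.3f}")
```

At 10 FIT per node, 10^7 node-hours gives about 0.90, while 10^8 node-hours gives about 0.37 (that is, e^-1), so the result is quite sensitive to the node count and run length you plug in.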
Re: (Score: 0)
Isn't that just MTBF expressed a different way?
Re: Large-scale clusters (Score: 0)
In practice, the mean time between failures (MTBF, 1/λ) is often reported instead of the failure rate. This is valid and useful if the failure rate may be assumed constant – often used for complex units / systems, electronics – and is a general agreement in some reliability standards (Military and Aerospace). It does in this case only relate to the flat region of the bathtub curve, also called the "useful life period". Because of this, it is incorrect to extrapolate MTBF to give an estimate of the service life time of a component, which will typically be much less than suggested by the MTBF due to the much higher failure rates in the "end-of-life wearout" part of the "bathtub curve".
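To make the relation concrete: under the constant-failure-rate assumption the quote describes, FIT and MTBF are two views of the same rate, with MTBF in hours equal to 10^9 divided by the FIT value. The sketch below is only illustrative; the function names are hypothetical, and the whole-system figure additionally assumes independent, identical nodes where any single node failure counts as a system failure.

```python
FIT_HOURS = 1e9            # FIT counts failures per 10^9 device-hours
HOURS_PER_YEAR = 24 * 365


def mtbf_hours_from_fit(fit: float) -> float:
    """MTBF in hours for a constant failure rate given in FIT."""
    if fit <= 0:
        raise ValueError("FIT must be positive")
    return FIT_HOURS / fit


def system_mtbf_hours(fit_per_node: float, nodes: int) -> float:
    """MTBF of the whole machine: failure rates of independent nodes add up."""
    return mtbf_hours_from_fit(fit_per_node * nodes)


if __name__ == "__main__":
    node_mtbf = mtbf_hours_from_fit(10)
    print(f"per-node MTBF: {node_mtbf:.0f} h "
          f"(about {node_mtbf / HOURS_PER_YEAR:.0f} years)")
    print(f"100,000-node MTBF: {system_mtbf_hours(10, 100_000):.0f} h")
```

With the 10 FIT figure from the thread, the per-node MTBF is 10^8 hours, or more than 11,000 years. No node will actually last that long, which is the quote's point about not reading MTBF as a service life, while the 100,000-node machine as a whole sees a malfunction about every 1,000 hours on average.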
Re: (Score: 0)
At least cite your source [wikipedia.org].
And while you're at it, boil it down to about 100 characters so that even a dimwit like me can follow.