Hottest 'fault-tolerance' Answers

11 votes

Boneheaded exceptions should not be caught. Then how to provide fault tolerance and reliability?

Fail as early as possible, and catch in context. Going by the definition on https://ericlippert.com/2008/09/10/vexing-exceptions/, a boneheaded exception isn't one that should not be caught; it's in ...

Duroth

900

answered Sep 29, 2021 at 12:09

7 votes

What difference and relation are between fault tolerance and (high) availability?

The basic concepts are orthogonal; however, they are related. One has to do with the availability of your application, and the other has to do with the correctness of your application. Remember, ...

Berin Loritsch

46.5k

answered Dec 29, 2019 at 16:15

6 votes

Boneheaded exceptions should not be caught. Then how to provide fault tolerance and reliability?

Your examples are both in the area of interfaces to systems that are not under your control, which is different from the interfaces between components that you control and where you can ensure that ...

Hans-Martin Mosner

18.6k

answered Sep 29, 2021 at 11:41

6 votes

Design pattern for objects in invalid states

No no no no. First, stop using floating point numbers to represent base 10 money. Ints work fine if you count pennies and remember to add the decimal point when presenting them as dollars. ...

candied_orange

120k

answered Sep 20, 2019 at 14:35
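
The integer-pennies advice in the excerpt above can be illustrated with a minimal sketch. The helper names `to_cents` and `format_dollars` are hypothetical and do not come from the answer; the point is only that the amount is stored as a whole number of pennies and the decimal point appears at presentation time.

```python
def to_cents(dollars: int, cents: int = 0) -> int:
    """Store an amount as a whole number of pennies (hypothetical helper)."""
    return dollars * 100 + cents


def format_dollars(amount_cents: int) -> str:
    """Add the decimal point only when presenting the amount."""
    sign = "-" if amount_cents < 0 else ""
    dollars, cents = divmod(abs(amount_cents), 100)
    return f"{sign}${dollars}.{cents:02d}"


# Binary floating point cannot represent most base-10 fractions exactly,
# so this sum is not exactly 0.30.
float_total = 0.10 + 0.20

# Integer pennies stay exact under addition and subtraction.
cent_total = to_cents(0, 10) + to_cents(0, 20)
```

Here `format_dollars(cent_total)` presents the exact total, while `float_total == 0.30` is false.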

5 votes

How does a distributed system both tolerate network partition and achieve consistency?

The CAP Theorem says that you can only achieve at maximum two out of the three properties of Consistency (every read receives the most recent write or an error), Availability (every read receives a ...

Jörg W Mittag

105k

answered Dec 28, 2019 at 20:07

4 votes

What is the crux of difference between N version programming and self monitoring architecture?

The difference is in what is done if the outputs are different: In the self-monitoring architecture, if the outputs are different then a fault is indicated; no recovery is possible - i.e. this is a ...

Philip Kendall

26.1k

answered Oct 17, 2021 at 15:03
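
A rough sketch of the contrast drawn in the excerpt above. The self-monitoring half follows the excerpt (a mismatch only indicates a fault). The N-version half is an assumption, since the excerpt is truncated: it uses majority voting, which is the usual description of that technique. All names are hypothetical.

```python
from collections import Counter


class FaultDetected(Exception):
    """Raised when redundant computations disagree and no result can be chosen."""


def self_monitoring(primary, monitor, x):
    """Two channels: a mismatch only indicates a fault, with no recovery."""
    a, b = primary(x), monitor(x)
    if a != b:
        raise FaultDetected(f"channels disagree: {a!r} != {b!r}")
    return a


def n_version(versions, x):
    """Assumed behaviour: run every version and accept the strict-majority result."""
    results = [version(x) for version in versions]
    value, count = Counter(results).most_common(1)[0]
    if count <= len(results) // 2:
        raise FaultDetected(f"no majority among {results!r}")
    return value


def double(x):
    return x * 2


def faulty(x):
    # Stand-in for a version with a bug.
    return x * 2 + 1
```

With these definitions, `n_version([double, double, faulty], 3)` still yields a result because the faulty version is outvoted, whereas `self_monitoring(double, faulty, 3)` can only report the disagreement.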

4 votes

Boneheaded exceptions should not be caught. Then how to provide fault tolerance and reliability?

"Fail fast" is a good default, but fault tolerance may be worth it in some cases. You just have to be really careful how you do it, because an unexpected exception means some of the program ...

JacquesB

62.3k

answered Sep 29, 2021 at 13:45

3 votes

How does a distributed system both tolerate network partition and achieve consistency?

How can we have both P and C in the second case? I will answer with a common real-world example. Common CP system in AWS cloud: Consider a distributed system made up of parts deployed to 3 ...

Jonas

14.9k

answered Jan 3, 2020 at 0:28

3 votes

Accepted

Unexpected shutdown before a saga completion

That’s not the way a saga works:
- every involved microservice performs a step, which is locally handled as a transaction.
- every completed step shall result in an event to be triggered
- the events must ...

Christophe

82.2k

answered Aug 5, 2020 at 13:58

3 votes

Accepted

Design pattern for objects in invalid states

How are responsibilities between classes? There is no single answer to that question. It's first a question of responsibilities: Shall using classes be responsible for verifying if they can do the ...

Christophe

82.2k

answered Sep 20, 2019 at 16:05

2 votes

How to guarantee HTTP message delivery in fault tolerant way

Your colleague is right. You can't eliminate all failure modes. The goal should be predictable failure modes, e.g. to meet a certain SLA perhaps you want 99.99% reliability and a response time of ...

John Wu

27k

answered Oct 11, 2018 at 9:18

2 votes

Accepted

How to guarantee HTTP message delivery in fault tolerant way

The database solution is definitely the best; transactional filesystems are not common, unless you consider that filesystems never fail (permission settings, disk full, ...). I'll detail a more ...

Walfrat

3,536

answered Oct 11, 2018 at 8:31

2 votes

Design pattern for objects in invalid states

Throwing exceptions on normal object access (or "Solution 1") follows the general pattern known in Python as "easier to ask for forgiveness than permission" (EAFP). This pattern is heavy on the user side ...

Diane M

2,116

answered Sep 20, 2019 at 20:10
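
To make the two calling styles concrete, here is a small sketch contrasting "ask forgiveness" with "look before you leap". The `Wallet` class, its methods, and `InsufficientFunds` are hypothetical stand-ins for the question's example, not code from the answer.

```python
class InsufficientFunds(Exception):
    """Raised when a wallet cannot cover the requested amount."""


class Wallet:
    def __init__(self, cents: int) -> None:
        self._cents = cents

    def can_spend(self, amount: int) -> bool:
        return 0 <= amount <= self._cents

    def spend(self, amount: int) -> None:
        # The object protects its own invariant and throws on misuse.
        if not self.can_spend(amount):
            raise InsufficientFunds(amount)
        self._cents -= amount


def pay_asking_forgiveness(wallet: Wallet, amount: int) -> bool:
    """Just try the operation; the caller handles the exception."""
    try:
        wallet.spend(amount)
    except InsufficientFunds:
        return False
    return True


def pay_asking_permission(wallet: Wallet, amount: int) -> bool:
    """Check first, then act; the caller must remember to check."""
    if wallet.can_spend(amount):
        wallet.spend(amount)
        return True
    return False
```

Both functions behave the same for a single caller; the difference is where the burden sits. In the first style every caller needs a `try`/`except`, which is the "heavy on the user side" cost the excerpt mentions.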

2 votes

What is the difference between masking and tolerating failures?

From what I understand, the two differ in the level of abstraction involved: "Masked" here means that lower levels "mask" failure transparently for higher levels of the system. Failure on a ...

Thomas Junk

9,623

answered Dec 25, 2019 at 8:12

2 votes

Feedback on Multi-Process Software Architecture

If it makes the code easier to read/debug and the system easier to reason about, then your decision to use three separate applications is a good one. Your reasoning for using a file to communicate the ...

Bart van Ingen Schenau

79k

answered Oct 16, 2024 at 9:50

2 votes

Boneheaded exceptions should not be caught. Then how to provide fault tolerance and reliability?

It's okay to sandbox (as narrowly as possible). While your ERP does indeed seem boneheaded, it is not boneheaded to sandbox a third party interaction. Just make it as narrow as possible. Also, it would be ...

John Wu

27k

answered Oct 4, 2021 at 23:26
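
One way to read "as narrow as possible" is a wrapper whose broad catch covers exactly one third-party call and nothing else. This is a sketch under that assumption; `call_third_party` and `ThirdPartyError` are hypothetical names, and the ERP client is represented by a plain callable.

```python
import logging

logger = logging.getLogger(__name__)


class ThirdPartyError(Exception):
    """Single, well-defined failure type exposed to the rest of the program."""


def call_third_party(operation, *args):
    """Run one third-party call inside the narrowest possible sandbox."""
    try:
        # Only this line is covered by the broad catch.
        return operation(*args)
    except Exception as exc:
        logger.warning("third-party call failed: %s", exc)
        # Translate into one known exception and keep the original as the cause.
        raise ThirdPartyError("third-party call failed") from exc


def flaky_erp_lookup(order_id):
    # Stand-in for an external system that throws something unexpected.
    raise ValueError(f"unexpected ERP state for order {order_id}")
```

Code outside the wrapper is not sandboxed, so bugs in your own components still fail fast instead of being swallowed.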

1 vote

Boneheaded exceptions should not be caught. Then how to provide fault tolerance and reliability?

On the net, there are too many articles with advice on how to handle various types of exceptions. Even the wording "to handle an exception" puts the focus on the exception object instead of ...

Ralf Kleberhoff

6,276

answered Oct 4, 2021 at 22:21

1 vote

How does a distributed system both tolerate network partition and achieve consistency?

I am going to add another perspective. CAP: if partitioning is happening, then the system may be either available or consistent. The million-dollar question is: what is partitioning? Let's say I have a ...

AndrewR

196

answered Mar 3, 2022 at 21:46

1 vote

Does stale data due to weak level of consistency count as Byzantine failure?

A Byzantine fault can appear to be both functioning and not functioning to different actors. A server can inconsistently appear both failed and functioning to failure-detection systems, presenting ...

Jonas

14.9k

answered Jan 3, 2020 at 0:46

1 vote

What is the difference between masking and tolerating failures?

Maybe a comparison could help you understand the difference. Imagine you're going to an e-commerce website. You found a product you want to buy and you click on the “Add to cart” button. Under the ...

Arseni Mourzenko

139k

answered Dec 24, 2019 at 20:37

1 vote

When do I stop being paranoid about my code failing?

A lot of it depends on what kind of application you are building and what SLAs you intend to provide. No system has been built to handle all scenarios perfectly so that the developer can rest. ...

skott

509

answered Nov 14, 2019 at 10:50

1 vote

When do I stop being paranoid about my code failing?

"if I do more checks, it becomes difficult to read through even for me." This is the worrying part. As you spend more time crafting your code it should become easier to read. Considering the sheer ...

Martin Maat

18.6k

answered Nov 14, 2019 at 6:41

1 vote

Design pattern for objects in invalid states

Solution 1 w/ YAGNI applied: public class Wallet { /// <summary> /// Indicates the amount of Cash in the wallet /// </summary> public double Cash ...

RandomUs1r

268

answered Sep 20, 2019 at 21:48

1 vote

Design pattern for objects in invalid states

I suspect you are suffering from “primitive obsession”. Having the validation code for valid states of cash in the wallet means anywhere else you use cash needs them too. If you create a new class ...

Adam B

1,660

answered Sep 20, 2019 at 21:43
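
A minimal sketch of the value-class idea in the excerpt above: the validation lives in one small type instead of being repeated wherever cash is used. The class name `Cash` and the non-negative rule are assumptions for illustration, since the excerpt is truncated before it names the new class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cash:
    """Hypothetical value object: a non-negative amount in whole cents."""

    cents: int

    def __post_init__(self) -> None:
        # The only place where the "valid cash" rule is written down.
        if not isinstance(self.cents, int) or self.cents < 0:
            raise ValueError(f"invalid cash amount: {self.cents!r}")

    def __add__(self, other: "Cash") -> "Cash":
        return Cash(self.cents + other.cents)

    def __sub__(self, other: "Cash") -> "Cash":
        # Constructing the result re-runs validation, so overdrawing raises.
        return Cash(self.cents - other.cents)
```

Because every `Cash` instance is validated on construction, a wallet or any other class holding one cannot end up with an invalid amount.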

1 vote

Concurrent fault-safe data structure

I think that you should look at using the TPL (Task Parallel Library by Microsoft). If I am correctly understanding the scenario outlined in your question, then this would provide you with the low ...

Chaplin Marchais

160

answered May 16, 2021 at 22:22
