Some Notes on Monitoring and Testing Load Balanced Resources
algorithms, load_balancing, probability
One of the challenges faced when designing end-to-end checks is encountered when you are facing a load balancing arrangement. It is necessary to repeat part of the check more than once so that you can be reasonably sure that you hit all the resources that are being load balanced. Repeat too many times and you push your resources (ah, nothing like stress testing a live production system). Repeat too few times, and node 12 will keep serving that critical error you thought you'd fixed.
Random
Perhaps the most common case is random or semi-random balancing. If the balancing is fair (that is if the probability of a request being assigned to a given resource is constant across all resources), this is a case of the coupon collectors problem. Essentially the problem is the same as someone collecting a full set of coupons (or baseball cards, world cup stickers or pokemon etc.) - the first few are easy to get but it becomes progressively more difficult to get new ones as your collection grows.
In this case, the problem scales as O(N ⋅ log(N)), where N is the number of resources. To calculate the number of times we need to repeat the checks to cover all the resources(C), we first calculate it's expectation:
〈C〉 = N ⋅ HN
= N ⋅ loge N + γN + o(1⁄N2)
≅ N ⋅ loge N + γN as N → ∞
Hn is the Harmonic Number of N, and γ is the Euler-Mascheroni constant. For small N we can find Hn by hand.
Hn = 1/1 + 1/2 + 1/3 + … + 1/n
We can then use the markov inequality to find a bound on the probability of coverage, which should be a satisfactory guide.
P(C ≥ xnHn) ≤ 1⁄x
Round Robin
The other common load balancing algorithm is set up for round robin balancing, when the resources are ordered in a list, and the balancer chooses them in sequence, restarting from the top when necessary.
In this case we will need to know the number of 'organic' requests coming through between one test and the next. We can calculate this if we know the request probability density L(t); which is something you really want to be keeping track of anyway. If we can assume our L(t) is cyclostationary, the expectation of the number of requests between checks that take with period T at time 0 is:
∫0NT L(t)/N dt
= L(t)T as NT → 0.
We can find the probability that no requests come through by using the markov inequality again, if needed.
If you are running the checks in a quiet period (or running them quickly enough) then you get full coverage at C = N with probability 1-PO, where PO is the probability of an an organic request. Unfortunately simply adding an extra check or two will not help; you'll have add another N-1, to get probability 1-PO2. If you do your checks at busy times L(t)T >> N, when it's almost certain that several requests have come through, you can delay or randomise the timing of the checks so that either
T >> σ(L(t))
or
σ(T) >> L(t)N
in which case it reduces to random.
Other Algorithms
If the load balancing is set up to favour 'least loaded' resources in some way, then we have a different kettle of fish. In fact, unless we're load balancing requests that always take the same time to complete this is usually a bad thing to have in place anyway, and even then it's highly questionable whether this added complexity helps.
Ellsberg's Wager
belief, decision, Dempster-Shafer, games, Pearl, probability, religion
There is a bag containing at least one token. Tokens must be either black or white. One token will be drawn at random from the bag and then returned to it, say at noon on Sunday. Now imagine that everyone in some community has one black token and one white token. They can place one (and only one) on a Magic Box before noon, and if it's colour matches the token that is drawn out at noon, then a Desirable Prize is dispensed. There's no penalty for getting it wrong. Which token should they choose? For those who are interested we're playing a game where the payoff matrix is identity.
| token choice |
| black | white |
the bag's outcome | black | 1 | 0 |
| white | 0 | 1 |
Obviously the best strategy depends on what the bag contains. Suppose it's a balanced bag that contains an equal number of white tokens and black tokens. Even small children will tell you that either token is equally good in this case, and when psychologists have investigated people's choices they about 50% of people choose black and 50% of people choose white. What if we have no idea what's in the bag except that nothing but a white or a black token will be pulled out? It's a bag of mystery! Again either token has a 50-50 chance. And given the choice, in this case too people will choose roughly equal proportions of each token.
So lets's be clear - the strategies for a one-off game are identical for balanced and mystery bags. Now suppose we offer people a choice of matching their token against a token drawn from either the balanced bag or the mystery bag. Most people choose to play against the balanced bag. Most of the people choosing the balanced bag will not switch to the mystery bag even if the game is repeated. This is known as Ellsberg's paradox. It's named after Daniel Ellsberg, better known for leaking the Pentagon Papers in an attempt to stop the Vietnam war, by the way. He's now an outspoken opponent of the USA's war in Iraq, and the frequently threatened attack on Iran.
See, in the case of the repeated game, the mystery bag offers the opportunity of learning to exploit any imbalance that may exist in the proportion of token colours, but in the balanced case you're always stuck at 50% and can never improve upon it. Ellsberg's paradox caused much consternation amongst many economists as their theories are frequently based on the assumption (spoken or unspoken, and frequently attended by hordes of similar assumptions) that all agents act rationally (ie. in their best interests). Psychologists aren't really sure why so many people choose the balanced bag - could it be a commitment to fairness, a preference for not making hard choices, fear of the unknown...? Nobody really knows for certain. But maybe we can make an educated guess.
Let's look more closely at how strategies for the one-off game are arrived at. In the case of the balanced bag we have an probability estimate of 50%, and a maximum confidence in that estimate - given that token colour is a random variable we can't do better. The balanced probability is considered to be unconditional. In the case of the bag of mystery we have an initial probability estimate of 50% but no confidence at all in it - it's really a representation of our own ignorance, what Bayesians would call a 'prior'.
A mathematical formalism that addresses this notion of confidence is called Dempster-Shafer theory, a generalisation of Bayesian statistics. Take the set ω of outcomes. In the case above ω = {black, white}. We are interested in the 'power set' or set-of-all-subsets of ω. We write this as 2ω = {ø, {black}, {white}, ω}. Now let's define a function called the mass m, which will be a little bit like probability in that it adds up to 1, but it's defined not on ω but on 2ω. We'll define m(ø) = 0 and then we can ignore the empty set completely.
The mass of a set s is m(s) and represents the 'weight of evidence' that the outcome is in that set, but not evidence for a particular member of that set. At the beginning of our game:
m |
Balanced |
Mystery |
ω |
0 |
1 |
{white} |
½ |
0 |
{black} |
½ |
0 |
As the game progresses, we will receive evidence for the mystery bag's internal state (the particular outcomes of each draw) - and gradually we will see m(ω) going down, and m({black}) and m({white}) rising. While we will never have the rock-solid certainty of the balanced bag system, we could conceivably find that one or other of the outcomes is in the lead and adapt our strategy accordingly.
For decision making purposes, we usually don't want to work with the masses themselves, but with an interval of probabilities. The Support S(p) where p ⊆ ω is defined as the sum of m(x) where x ∈ 2p. The Plausibility of p is the extent to which evidence against p leaves room for the possibility that the event is in p and is given by Pl(p) = 1 - S(¬ p). Also note that S(p) ≤ Pl(p).
To look at our table again:
|
Balanced |
Mystery |
|
m |
S |
Pl |
m |
S |
Pl |
ω |
0 |
0 |
0 |
1 |
1 |
1 |
{white} |
½ |
½ |
½ |
0 |
0 |
1 |
{black} |
½ |
½ |
½ |
0 |
0 |
1 |
This sort of reasoning has lead some people assert the nonexistence (or sometimes non-utility) of probability altogether (whatever that turns out to mean). Judea Pearl, an AI dude from UCLA, has argued that the support is actually a probability of provability, where provability is contrasted with truth. Judea Pearl is also the originator of much of the science of belief propagation (by 'belief' he means 'support') and causality research (I've become very interested in this recently).
As evidenced by the confused thinking on the subject of decision under uncertainty it appears a pervasive human cognitive bias is preference for the experience of irreducible risk over the experience of having to deal with our own ignorance. Knowing this bad mental habit is certainly useful in itself, but I was interested to find out that there is an entire branch of decision theory that is based on it, for those situations where it is actually appropriate. They call it 'info-gap theory' which is just about the worst name I could have imagined. It comes as no surprise that the 1950s are ultimately responsible for this crime against nomenclature.
The appropriate domain is wherever the precautionary principle applies wherever making the wrong decision leads to irreversible harm and there is no further opportunity to make the decision over again- medicine, biodiversity, etc. In info-gap theory we need make decisions based on optimising robustness under failure, not the expectation of 'utility' (or just 'payoff' in pure game theory). Pascal's wager is a fun example. Pascal's wager is the argument that you have nothing to lose from believing in a God because if (s)He doesn't exist you'll still be as just as dead as if you hadn't. There is a similar, but subtler, argument attributed to the Buddha.
There is an obvious problem with Pascal's approach; the assumption that Gods will invariably reward belief and/or punish denial. We don't know this. Now we're almost back to our bag of tokens; we have two levels of uncertainty. Drawing any conclusions, that is, deciding whether to believe in Gods, deciding which set of Gods to pick, deciding whether to decide whether to believe in Gods at all, and adapting your strategy to account for potential past or future lives is left as an exercise for the reader.