Some Notes on Monitoring and Testing Load Balanced Resources
algorithms, load_balancing, probability
One of the challenges in designing end-to-end checks arises when the resource you are testing sits behind a load balancer. You need to repeat part of the check enough times to be reasonably sure you have hit every resource behind it. Repeat too many times and you stress your resources (ah, nothing like stress testing a live production system). Repeat too few times, and node 12 will keep serving that critical error you thought you'd fixed.
Random
Perhaps the most common case is random or semi-random balancing. If the balancing is fair (that is, if the probability of a request being assigned to a given resource is the same across all resources), this is an instance of the coupon collector's problem. Essentially the problem is the same as someone collecting a full set of coupons (or baseball cards, world cup stickers, pokemon etc.) - the first few are easy to get, but it becomes progressively harder to get new ones as your collection grows.
In this case, the number of checks scales as O(N ⋅ log N), where N is the number of resources. To find how many times we need to repeat the check to cover all the resources (call this C), we first calculate its expectation:
⟨C⟩ = N ⋅ H_N
 = N ⋅ loge N + γN + 1/2 + O(1/N)
 ≈ N ⋅ loge N + γN as N → ∞
H_N is the Nth harmonic number, and γ ≈ 0.5772 is the Euler-Mascheroni constant. For small N we can compute H_N by hand:
H_N = 1/1 + 1/2 + 1/3 + … + 1/N
We can then use Markov's inequality to bound the probability that coverage takes much longer than expected, which should be a satisfactory guide:
P(C ≥ x ⋅ N ⋅ H_N) ≤ 1/x
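As a quick sanity check on the expectation above, a simulation sketch (Python here; N = 12, the trial count and the seeds are arbitrary choices):

```python
import math
import random

def checks_until_full_coverage(n, rng):
    """Simulate fair random balancing: count checks until every
    one of the n resources has been hit at least once."""
    seen = set()
    checks = 0
    while len(seen) < n:
        seen.add(rng.randrange(n))  # each check lands uniformly at random
        checks += 1
    return checks

def expected_checks(n):
    """The theoretical expectation N * H_N."""
    return n * sum(1.0 / k for k in range(1, n + 1))

n = 12
trials = 2000
rng = random.Random(1)
mean = sum(checks_until_full_coverage(n, rng) for _ in range(trials)) / trials
gamma = 0.5772156649  # Euler-Mascheroni constant
print(f"simulated mean:   {mean:.1f}")
print(f"N * H_N:          {expected_checks(n):.1f}")
print(f"N ln N + gamma N: {n * math.log(n) + gamma * n:.1f}")
```

For N = 12 the exact expectation N ⋅ H_12 is about 37.2, and the asymptotic form already lands within a check or so of it.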
Round Robin
The other common load balancing algorithm is round robin: the resources are ordered in a list, and the balancer chooses them in sequence, restarting from the top when it reaches the end.
In this case we need to know how many 'organic' requests come through between one check and the next. We can calculate this if we know the request rate L(t), which is something you really want to be keeping track of anyway. If we can assume L(t) is cyclostationary, the expected number of organic requests between checks made with period T, starting at time 0, is:
(1/N) ∫₀^{NT} L(t) dt
 ≈ L(0) ⋅ T as NT → 0.
We can bound the probability that no organic requests come through by using Markov's inequality again, if needed.
If you are running the checks in a quiet period (or running them quickly enough) then you get full coverage at C = N with probability 1 − P_O, where P_O is the probability of an organic request interleaving with your checks. Unfortunately, simply adding an extra check or two will not help; you'll have to add another N − 1 to get probability 1 − P_O². If you do your checks at busy times, L(t)T >> N, when it's almost certain that several requests have come through, you can delay or randomise the timing of the checks so that either
T >> σ(L(t))
or
σ(T) >> N/L(t)
in which case it reduces to random.
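A toy simulation (Python; the resource count, seed and traffic probability below are all arbitrary) of how organic traffic spoils round-robin coverage:

```python
import random

def round_robin_coverage(n_resources, n_checks, p_organic, rng):
    """Strict round-robin balancer: every request (ours or organic)
    advances the rotation by one. Returns the set of resources
    that our checks actually hit."""
    pointer = 0
    hit = set()
    for _ in range(n_checks):
        # Some number of organic requests may slip in first,
        # each silently advancing the rotation.
        while rng.random() < p_organic:
            pointer = (pointer + 1) % n_resources
        hit.add(pointer)
        pointer = (pointer + 1) % n_resources
    return hit

rng = random.Random(7)
n = 10
quiet = sum(len(round_robin_coverage(n, n, 0.0, rng)) == n for _ in range(1000))
busy = sum(len(round_robin_coverage(n, n, 0.5, rng)) == n for _ in range(1000))
print(f"full coverage in quiet periods: {quiet / 1000:.0%}")
print(f"full coverage with organic traffic: {busy / 1000:.0%}")
```

With no organic traffic, N checks cover all N resources every time; once organic requests start interleaving, the same N checks skip resources and coverage collapses.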
Other Algorithms
If the load balancing is set up to favour 'least loaded' resources in some way, then we have a different kettle of fish. In fact, unless we're load balancing requests that always take the same time to complete, this is usually a bad thing to have in place anyway, and even then it's highly questionable whether the added complexity helps.
The Polyglot Experiment
algorithms, computer_languages, polyglot
One of the traditional introductions to learning a programming language has been (since the time of K & R1) to write a program that prints "Hello, World!", or some suitable variation, to the screen. In the case of C it makes some sense as a learning tool, as it introduces the concepts of the main function, data types, printf, escape sequences, the preprocessor and the standard library.
#include <stdio.h>

int main(void)
{
    printf("Hello, World!\n");
    return 0;
}
Hello, World in C
The venerable 'Hello World' is still serving proudly as a sanity check2 on many thousands of newly set up development environments worldwide. But it has long since lost its status as a learning tool, because modern languages almost always reduce all this fun to a laconic one-liner3. For example,
print "Hello, World!\n";
perl
puts "Hello, World!"
ruby, or maybe tcl
print "Hello, World!"
python, but could be most kinds of BASIC or possibly lua
"Hello, world!" print
the io language likes to mix things up a bit
SELECT 'Hello, World!';
this should work in most dialects of SQL, and it isn't even Turing-complete!
You get the picture. In unix environments people are always printing strings to an output stream, and many languages earn their keep by making this easy and intuitive; so much so that there's nothing to learn. People have been taking the piss4. Frankly, they're right to do so - "Hello World" was only ever intended as a minimal check, to test that the compiler, I/O and so on were operating correctly.
So "what's your alternative you smart-ass?", I hear your all chorus. "How else are we to compare programming languages and illustrate their main features?". Well, one of the other main programs that follows up Hello World is something like the factorial function, or the fibonacci sequence. Both these demonstrate the way recursion works, but they omit all the I/O necessary for a language to be useful.
The Trabb Pardo-Knuth algorithm covers data structures (arrays/lists), mathematical operations, subroutines, I/O, iteration and program structure (conditionals). It is (I think) the algorithm Missy Elliott rapped about in "Work It": read 11 numbers in, apply a function (traditionally abs(x)^½ + 5x³) to each, then (in reverse order) print out each result if it is less than some limit (say 400), or "TOO BIG" otherwise. Missy glossed over a few details. In fact it was designed specifically to demonstrate the differences between imperative languages, and that's a problem with it - it's a very imperatively focused challenge.
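The steps above can be sketched in Python (assuming the traditional function abs(x)^½ + 5x³ and a limit of 400; the input values here are arbitrary):

```python
import math

def f(x):
    """The traditional TPK function: sqrt(|x|) + 5x^3."""
    return math.sqrt(abs(x)) + 5 * x ** 3

def tpk(values):
    """Trabb Pardo-Knuth: take 11 numbers, apply f, and report
    the results in reverse order, flagging overflows."""
    assert len(values) == 11
    for x in reversed(values):
        y = f(x)
        if y > 400:
            print(x, "TOO BIG")
        else:
            print(x, y)

tpk([0, 1, 2, 3, 4, -1, -2, -3, -4, 0.5, -0.5])
```

Even this tiny program exercises a list, a subroutine, a conditional, a loop and output - which is the whole point of TPK.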
So what else is there, something perhaps a bit more "real world" but still containing theoretical content? Something we can use to get a feel 'at a glance' for how the language looks at the world?
If you're of the opinion that real programmers don't eat quiche, or need to know whether the processor is big- or little-endian before sitting down to code, you may think this is all a bit namby-pamby or 'high level'. How good is the language for real work, like implementing crypto routines? The Tiny Encryption Algorithm is nice and simple.
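For reference, TEA is small enough to sketch in a few lines of Python (the key and plaintext below are arbitrary; the 32-bit masking stands in for C's unsigned overflow):

```python
MASK = 0xFFFFFFFF
DELTA = 0x9E3779B9  # round constant, derived from the golden ratio

def tea_encrypt(v, key, rounds=32):
    """Encrypt a pair of 32-bit words with a 128-bit key (four words)."""
    y, z = v
    s = 0
    for _ in range(rounds):
        s = (s + DELTA) & MASK
        y = (y + (((z << 4) + key[0]) ^ (z + s) ^ ((z >> 5) + key[1]))) & MASK
        z = (z + (((y << 4) + key[2]) ^ (y + s) ^ ((y >> 5) + key[3]))) & MASK
    return y, z

def tea_decrypt(v, key, rounds=32):
    """Run the rounds in reverse to recover the plaintext pair."""
    y, z = v
    s = (DELTA * rounds) & MASK
    for _ in range(rounds):
        z = (z - (((y << 4) + key[2]) ^ (y + s) ^ ((y >> 5) + key[3]))) & MASK
        y = (y - (((z << 4) + key[0]) ^ (z + s) ^ ((z >> 5) + key[1]))) & MASK
        s = (s - DELTA) & MASK
    return y, z

key = (0x01234567, 0x89ABCDEF, 0xFEDCBA98, 0x76543210)
ct = tea_encrypt((0xDEADBEEF, 0xCAFEBABE), key)
print(tea_decrypt(ct, key) == (0xDEADBEEF, 0xCAFEBABE))  # → True
```

In C the masking disappears, which is exactly the kind of thing this challenge is probing: how naturally does the language handle fixed-width integer arithmetic?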
And surely, in this age of the intertubes, a routing algorithm like Dijkstra's algorithm should be able to satisfy the criteria of usefulness and theoretical rigor, as it's both a classic, beautiful, algorithm and in use in actual (OSPF) routers today?
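A minimal sketch of Dijkstra's algorithm in Python, using a binary heap (the toy network below is made up for illustration):

```python
import heapq

def dijkstra(graph, source):
    """Shortest-path distances from source in a weighted digraph,
    given as {node: [(neighbour, cost), ...]}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, already relaxed via a shorter path
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

net = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 2), ("D", 6)],
    "C": [("D", 3)],
}
print(dijkstra(net, "A"))  # → {'A': 0, 'B': 1, 'C': 3, 'D': 6}
```

The interesting part for comparing languages is less the algorithm than the bookkeeping: how cleanly do graphs, priority queues and maps fall out of the standard library?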
And surely one of the most definitive features of computers is that they are machines which operate on symbols, converting from one representation to another? This is where the whole notion of a programming language comes from, after all... What about something like the shunting-yard algorithm for converting between infix and reverse Polish notation?
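A minimal shunting-yard sketch in Python, assuming pre-tokenised input and left-associative binary operators only:

```python
def to_rpn(tokens):
    """Dijkstra's shunting-yard algorithm: convert an infix token
    list to reverse Polish notation."""
    prec = {"+": 1, "-": 1, "*": 2, "/": 2}
    out, ops = [], []
    for tok in tokens:
        if tok in prec:
            # Pop operators of equal or higher precedence first.
            while ops and ops[-1] != "(" and prec[ops[-1]] >= prec[tok]:
                out.append(ops.pop())
            ops.append(tok)
        elif tok == "(":
            ops.append(tok)
        elif tok == ")":
            while ops[-1] != "(":
                out.append(ops.pop())
            ops.pop()  # discard the "("
        else:
            out.append(tok)  # operand goes straight to the output
    while ops:
        out.append(ops.pop())
    return out

print(to_rpn("3 + 4 * ( 2 - 1 )".split()))
# → ['3', '4', '2', '1', '-', '*', '+']
```

It's a nice test because it is pure symbol-pushing: two stacks, a precedence table, and nothing else.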
I propose to try to implement all the algorithms mentioned in this article in a set of programming languages that I would like to know better. I hope this will be a useful learning experience and will give me a more substantial set of trials to use when trying out new languages. I hope to manage one language a month.
It may be that the tests I have chosen are inadequate - for example none of them seriously address the extent of libraries for, say, matrix operations, database abstraction, GUI programming, regular expressions or XML or whatever. This could be said to undermine the advantages of languages like PHP, Perl, Objective C, VB.Net etc.
The quality of libraries is undoubtedly of the utmost importance to getting anything done with software. The problems with trying to evaluate a language by issuing library-driven challenges (such as, say, editing MP3 metadata or retrieving emails over POP3) are twofold. First, we end up testing the libraries' suitability for that purpose, rather than the language's suitability for implementing arbitrary algorithms. Secondly, the better the library, the less we learn - we could end up back at the same problem we started with; life just isn't hard enough for us!
footnotes
1. K & R: The first known reference is in fact in Kernighan's earlier tutorial on the 'B' language, but I think it's safe to say that that's not what made it traditional.
2. sanity. See the C2 wiki.
3. most. Java is keeping the faith - it is necessary to understand what a 'public static void' method is before attempting to design Hello World.
4.
Hello