OSPF, BGP, bah. If you're reading this, you probably like one or the other. Fuck you. They're both shit.

But Ja-aymz, I hear you cry, how can you make this assertion?

* BGP is optimised for the sort of static network that we do not have. There is a good reason why the internet requires micromanagement to the point of insanity - BGP doesn't actually give each host an accurate picture of the network. Which is fine, if you're not trying to create a city wide wireless network... There is a neat little test case on our network _right now_ that I can use to prove this, if only I could be fucked drawing the diagram. But consider clusters that are connected to major routers at each end, rather than by one single backbone connection. And of course TCP stalls are fun, since most people aren't using a wifi friendly implementation such as tcp_westwood (shame on you!). If you are into micromanagement, though, it does at least allow you to override other dumb fucks' config choices. And if you aren't, then you're fucked, because BGP requires micromanagement to work _at all_.

* OSPF is a fragile motherfucker, falling apart at the drop of an elephant. It theoretically works, but in reality it falls apart quicker than you can say "I thought BGP was going to save the world!". And this is ignoring the fact that area 0 has all of the downsides of BGP, with none of the good sides. OSPF implementations completely fail to interoperate without massaging, and indeed the default settings of most of them are total shite anyhow. One single dumb fuck can destroy the entire network with a stupid config choice. This is not a theoretical attack - this has happened before. It requires less micromanagement than BGP, which is great, but it doesn't provide the capability to do the management that you _do_ need to do.

Word!
----
Looking out a dirty old iface, down below
the cars in the city go rushing by
I sit here alone, and I wonder why

Bright lights, the music gets faster,
checking your timestamp, another glance
I'm not leaving, no honey, no not a chance.
----

I realise that this is a terribly, terribly difficult thing for people to grasp, but this is important - despite the name, and the shitty attempt to emulate it, WIFI IS NOT FUCKING ETHERNET. It is ESPECIALLY not cat-5 connected ethernet, not even 10M, let alone the 100M that everyone is trying to pretend that it is.

Today's proposed alternative is OLSR (http://olsr.org). OLSR is not perfect. It's not even good. It's got massive problems, in fact, which we'll talk about here. But most of those problems exist in functionality that OSPF and BGP don't even provide.

OLSR has three major advantages over the aforementioned two: it was designed for mesh routing, which vaguely resembles what we're trying to do; it supports both single hosts and networks; and it has at least some measure of link quality built into it. And, unlike BGP or OSPF, a human can actually configure the thing - even on Windows.

---
# A very complicated configuration. Sort of. The actual
# running config on hanuman.nodefus.wireless.org.au
# (10.10.64.130, 10.10.64.33, 10.10.48.28)
#
# Note the special case for the ethernet connections!

DebugLevel 0
ClearScreen yes
LinkQualityLevel 2
LinkQualityWinSize 100
UseHysteresis no

Hna4
{
    10.10.64.128 255.255.255.240
    10.10.64.32 255.255.255.240
    10.10.48.0 255.255.255.224
}

Interface "eth0" "wlan1" "wlan0"
{
    HelloInterval 2.0
    HelloValidityTime 20.0

    # pieces of 100mbit cat5 are always cheap :)
    LinkQualityMult 10.10.64.132 10.0
    LinkQualityMult 10.10.64.134 10.0
}

LoadPlugin "olsrd_httpinfo.so.0.1"
{
    PlParam "Net" "10.0.0.0 255.0.0.0"
}
---

One of the most glaring problems is the link quality calculation - the algorithm is 1/(packetLoss * packetLossBackToMe). Despite the names, each term is the fraction of packets that actually arrive in that direction, so a lossless link scores 1.0 and anything worse scores higher. Note any mention of speed in there? I didn't think you did...
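For the avoidance of doubt, here is that cost calculation as a rough Python sketch. It is not lifted from the olsrd source; the function name `link_cost` and the sample loss figures are mine. It assumes the two terms are delivery ratios (1 minus the loss), which is the only reading under which a perfect link doesn't divide by zero.

```python
def link_cost(delivery_fwd: float, delivery_rev: float) -> float:
    """Route cost = 1 / (forward delivery ratio * reverse delivery ratio).

    Assumption: both arguments are the fraction of packets that arrive,
    in the range (0, 1]. A link that delivers nothing has no finite cost.
    """
    if not (0.0 < delivery_fwd <= 1.0 and 0.0 < delivery_rev <= 1.0):
        raise ValueError("delivery ratios must be in (0, 1]")
    return 1.0 / (delivery_fwd * delivery_rev)


if __name__ == "__main__":
    # Symmetric loss in both directions, purely for illustration.
    for loss in (0.0, 0.1, 0.3, 0.5):
        ratio = 1.0 - loss
        print(f"{loss:.0%} loss each way -> cost {link_cost(ratio, ratio):.2f}")
```

Nothing in there knows or cares whether the link is 1M or 54M, which is rather the point.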
Our original thought was to combine it with the script from last month for calculating speed, and apply the output of that to LinkQualityMult - which would work great, if only you could reconfigure the router at runtime. But, alas, you cannot. There is a plugin interface, and this was our next thought - however, while it can fuck with the route table, it doesn't seem to be able to change actual settings. Which is a shame for other reasons that we'll get to momentarily.

The point is that you need to either know your link speed in advance, or trust that olsrd knows best. And while on wifi packet loss is actually a pretty good metric, it is far from perfect - some combinations of interface are just inherently better than others.

There is an argument that the algorithm is also not harsh enough on lossy links - a link with 30% packet loss in each direction will get a route cost of 2.04, whereas in reality that link is almost dead. We actually did experiment with changing the algorithm to 1/(packetLoss^2 * packetLossBackToMe^2), which gave results that I was happier with. Which is not to say that the current results are terrible, just that the GHO link only rates a cost of 1.41 right now, whereas a perfect wifi link is a 1.0 cost. Under the squared algorithm, that cost would be about 2.0, which is much more realistic. But I really don't want to maintain a forked tree - especially considering that pre-built binaries already exist for Linux, Windows, PocketPC, and the WRT54G. I'm sure that few would argue this point.

The other major complaint is that point to point links are non-obvious. UDP is the transport used, and unlike with OSPF, the unreliable transport actually works well here. But the default is to send to the broadcast address of the network - the official way to set up exclusive point to point links is to change a parameter called Ip4Broadcast. Which makes sense when you think about it, but is not something you'd guess.
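If you have a pile of point to point links to set up, generating the stanza beats remembering the knob. A minimal Python sketch follows. It assumes that "changing Ip4Broadcast" means pointing it at the peer's address, with one peer per interface. The function name `p2p_interface_stanza`, the interface name and the peer address are all made up for illustration; check the output against your own olsrd version before trusting it.

```python
import ipaddress


def p2p_interface_stanza(iface: str, peer: str) -> str:
    """Render an olsrd Interface block aimed at a single peer.

    Assumption: Ip4Broadcast is set to the peer's IPv4 address so that
    OLSR traffic goes to that host instead of the network broadcast.
    """
    # Raises ValueError if the peer is not a valid IPv4 address.
    ipaddress.IPv4Address(peer)
    return (
        f'Interface "{iface}"\n'
        "{\n"
        f"    Ip4Broadcast {peer}\n"
        "}\n"
    )


if __name__ == "__main__":
    # Hypothetical link: wlan1 talking only to 10.10.64.134.
    print(p2p_interface_stanza("wlan1", "10.10.64.134"))
```

Paste the result into the config and restart olsrd, since runtime reconfiguration is not on offer.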
I don't think you can have more than one Ip4Broadcast per interface, either, which is especially a problem because, due to other trivia, olsrd tends to fail with aliased interfaces. Ironically, in the name of sanity checking.

There is no management; the program knows best. This is a horrible pain in the ass sometimes... like now, for example. Of course, there is an argument that most MW people shouldn't be controlling their routers in the first place. The protocol is designed to be self configuring, and sometimes this even works. It's just a pain when it doesn't. And like I said, if you don't like it, who the fuck wants to maintain a parallel version? Not fucking me.

But still, the question is: is it better than OSPF? Almost certainly. Better than BGP? No, except if you're trying to do what we're trying to do. Then yes.