We are currently testing the use of OSPF costing.
This would mean that link speeds get utilised a lot better.
We’re looking into automating it, so that a script looks at the quality of a link at a specific interval and then sets the OSPF cost accordingly.
I’ve done this in Brackenfell so far and it’s made a major difference in speed and latency.
I feel confident that we can get this going, and hopefully it will encourage wuggers to look at their links more regularly, and to maintain and tune the hell out of them to get them as good as possible.
The better your link the lower the cost.
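To make the idea concrete, here is a rough sketch of how a script could turn link quality into a cost. This is not the script being tested in Brackenfell: the function name, the formula (sync rate scaled by CCQ, divided into a reference bandwidth) and all the numbers are assumptions purely for illustration.

```python
# Hypothetical mapping from link quality to OSPF cost.
# The reference bandwidth and the clamp limits are illustrative assumptions,
# not values used on the WUG.
REFERENCE_MBPS = 500.0
MIN_COST = 5
MAX_COST = 200


def link_cost(ccq_percent, sync_mbps):
    """Return an OSPF cost for a link: the better the link, the lower the cost."""
    if not 0 <= ccq_percent <= 100:
        raise ValueError("CCQ must be a percentage between 0 and 100")
    # Rough estimate of usable throughput: sync rate scaled by CCQ.
    effective_mbps = sync_mbps * ccq_percent / 100.0
    if effective_mbps <= 0:
        return MAX_COST
    cost = round(REFERENCE_MBPS / effective_mbps)
    # Clamp so that one link can never be "free" or infinitely expensive.
    return max(MIN_COST, min(MAX_COST, cost))


if __name__ == "__main__":
    print(link_cost(100, 54))  # healthy link
    print(link_cost(6, 6))     # badly degraded link, clamped to MAX_COST
```

A real version would read CCQ and sync rate from the router at each interval and would probably want some damping so that costs do not flap on every run.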
First off, I am not trying to rain on anyone’s parade. While I agree that this does sound like a promising idea in theory, in practice I do have a few concerns.
Let’s say that between Point A and Point B there are two distinct routes, namely route 1 and route 2. Route 1 consists of 10 hops, route 2 of 11. Then by traditional WUG OSPF standards (a cost of 10 per link), route 1 would have a total cost of 100 and route 2 of 110, meaning that any traffic moving directly between Point A and B would always choose route 1.
Now we implement dynamic costing based on link quality. Let’s say route 1’s links all perform at “standard” capacity and are assigned a cost of 10 each, for a total of 100. Route 2’s first 10 links are excellent and so are assigned a cost of 5 each, for a total of 50; however, its final link is very poor, with a low CCQ and high packet loss. This means that, unless you assign that final link a cost of more than 50, traffic will automatically choose route 2, leading to an overall decrease in throughput between Point A and B as well as an increase in latency and packet loss.
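The arithmetic in that example can be checked with a few lines of Python. The helper names are made up for illustration; the link costs are the ones from the example above.

```python
# Route 1: ten "standard" links at cost 10 each.
ROUTE_1 = [10] * 10


def route_2(last_link_cost):
    """Route 2: ten excellent links at cost 5, plus one poor final link."""
    return [5] * 10 + [last_link_cost]


def preferred_route(last_link_cost):
    """Compare summed path costs, which is how the lower-cost route wins."""
    cost_1 = sum(ROUTE_1)
    cost_2 = sum(route_2(last_link_cost))
    if cost_2 < cost_1:
        return "route 2"
    if cost_2 > cost_1:
        return "route 1"
    return "tie"


if __name__ == "__main__":
    for bad_link_cost in (40, 50, 51):
        print(bad_link_cost, preferred_route(bad_link_cost))
```

With the poor link costed at 40 the totals are 90 against 100, so route 2 wins; at exactly 50 the two routes tie; only from 51 upwards does traffic stay on route 1.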
Again, I am not trying to complain, merely raising a concern. OSPF has always been, and always will be, based on one deciding factor: distance. The shortest distance will always receive traffic preference, regardless of quality or capacity.
Should we not instead perhaps look into moving CTWUG towards something like a hybrid BGP/OSPF network? Given the rapid increase in network size, I don’t foresee our current system lasting too much longer.
This is pretty much why I am generally against doing these short hops: they create too many variables in the long run. With proper planning one can implement a link where it is actually required instead of throwing up a link just because. (This might not be applicable to every short hop, but I would put money on the majority of them being “why not” links, which is the wrong mentality.)
A good example for me is the renegadewolf-sapolisie link, which gave a lot of problems in the past. The RBs froze up in such a way that OSPF was still advertising the route as alive, but the link quality had dropped so far that the CCQ was 6 and the link synced at 6 Mbps (I can’t remember the exact numbers). This caused games to lag, and browsing was a huge problem.
Now let’s say the script runs through the RB. Seeing the sync speed and CCQ, it will adjust the OSPF cost accordingly (the cost will be very high for bad links). In a case where the link is that bad, even the RB next to it will follow another route. Browsing will resume and games can be played again. We are still working out the finer details of this and will keep you guys informed.
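As a rough illustration of that rerouting, here is a small shortest-path sketch over a made-up three-router topology. The router names and costs are hypothetical, and the code is a plain Dijkstra search over summed link costs rather than anything taken from a real router.

```python
import heapq


def shortest_path(graph, src, dst):
    """Dijkstra over summed link costs; returns (total_cost, path) or None."""
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, link_cost in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbour, path + [neighbour]))
    return None


def with_cost(graph, a, b, cost):
    """Return a copy of the graph with the a<->b link set to a new cost."""
    updated = {node: dict(links) for node, links in graph.items()}
    updated[a][b] = cost
    updated[b][a] = cost
    return updated


# Hypothetical triangle of routers, every link at the default cost of 10.
TOPOLOGY = {
    "rb-a": {"rb-b": 10, "rb-c": 10},
    "rb-b": {"rb-a": 10, "rb-c": 10},
    "rb-c": {"rb-a": 10, "rb-b": 10},
}

if __name__ == "__main__":
    print(shortest_path(TOPOLOGY, "rb-a", "rb-b"))
    # The script finds the rb-a <-> rb-b link in a bad state and raises its cost.
    degraded = with_cost(TOPOLOGY, "rb-a", "rb-b", 200)
    print(shortest_path(degraded, "rb-a", "rb-b"))
```

Before the change the direct link is used at a cost of 10; once the bad link is costed at 200, even the neighbouring router goes around via rb-c at a total cost of 20.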
Just remember that not all CCQ drops (especially as per your example) are the result of an actually bad link, but rather of the router locking up, as is the case with prom-king02, which requires manual intervention. Maybe have remote logging for CCQs that are found to be < 10%, so that if it is a case of the above someone can have a look and reboot, and when the script comes to check again all is well.
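Something along these lines could do the flagging. It is only a sketch: the function name, the logger name and the sample CCQ readings are assumptions, and only the 10% threshold comes from the suggestion above.

```python
import logging

logger = logging.getLogger("ccq-watch")  # hypothetical logger name

LOCKUP_CCQ_THRESHOLD = 10  # percent; below this, suspect a lock-up


def find_suspect_links(ccq_by_link, threshold=LOCKUP_CCQ_THRESHOLD):
    """Log a warning for every link whose CCQ is below the threshold.

    Returns the names of the flagged links so someone can check and reboot
    them before the costing script makes its next pass.
    """
    suspects = []
    for name, ccq in sorted(ccq_by_link.items()):
        if ccq < threshold:
            logger.warning(
                "CCQ %s%% on %s: possible router lock-up, needs a manual look",
                ccq, name,
            )
            suspects.append(name)
    return suspects


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    # Illustrative readings only, not real measurements.
    readings = {"prom-king02": 6, "renegadewolf-sapolisie": 85}
    print(find_suspect_links(readings))
```

For the remote part, the standard library's `logging.handlers.SysLogHandler` could be attached to the logger so the warnings land on a central syslog server, assuming one is available.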
I have to agree partially with Moonfruit about how things could be done, and likewise I’m not trying to rain on the parade. I’m a WUG newbie after all.
In a “complicated” multi-inter-link configuration such as the WUG has, the “right” way to do this is with BGP and MPLS TE. To work reliably, however, OSPF should provide inter-router communication while BGP provides optimised inter-route communication. Thus OSPF should be configured correctly anyway, and the efforts you guys are making to improve OSPF are still the right thing to do.
You’re right, BGP is needed, but OSPF costing needs to work first or BGP would be a waste.
The inter-area idea will be next if the OSPF costing test is successful.