I have been thinking for a long time now that our OSPF costing method needs to be looked at, for a couple of reasons. I’m starting this thread so that we can discuss various options and then agree on whether it should be kept as is or whether we adopt one of the other methods listed here. I know from discussions with other members that there are more ideas out there, so please list them if you feel they are a viable solution on the network.
Here’s my theory:
With the advent of Dual Polarity links there’s a huge difference in the amount of data each link can handle, and in most cases it would be better to route traffic over a slightly longer path than over a highly saturated single polarity hop. My primary goal, then, was to devise a method that takes the various link speeds throughout the network into account.
Additionally, under certain circumstances, we seem to be routing our traffic over some very long links. Data in the Northern Suburbs is routed over 2 x 30 km links to reach a node that’s 5 km away from its starting point, all because the shorter (in distance) route would have cost an extra 2 hops. This forces us to include distance in the calculation, as we want to keep the long links free for inter-area traffic.
The formula I’ve come up with is simple. For Ethernet links we keep using 1 as the Cost. For Wireless links we do the following:
Cost = 300 / Sync Speed (Mbps) × Distance (km)
Here are a few examples of the calculation on real links:
Mercury-Prometheus (Single Pol)
300/120 × 3 = 7.5
Prometheus-King02 (Dual Pol)
300/180 × 7 ≈ 12
Pmurgs-Pantera (Dual Pol)
300/90 × 39 = 130
Mercury-Ariel (Single Pol)
300/48 × 5 ≈ 31
Mercury-Moonfruit (Single Pol)
300/48 × 18 = 112.5
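To make the proposal concrete, here is a quick sketch of the formula in Python, run against the example links above (the function name is just illustrative; some of the costs above were rounded to the nearest integer):

```python
def wireless_cost(sync_mbps, distance_km):
    """Proposed cost for a wireless link: 300 / sync speed * distance."""
    return 300 / sync_mbps * distance_km

# The example links from this post (Ethernet links would stay at cost 1):
links = [
    ("Mercury-Prometheus", 120, 3),   # Single Pol
    ("Prometheus-King02", 180, 7),    # Dual Pol
    ("Pmurgs-Pantera", 90, 39),       # Dual Pol
    ("Mercury-Ariel", 48, 5),         # Single Pol
    ("Mercury-Moonfruit", 48, 18),    # Single Pol
]
for name, sync, dist in links:
    print(f"{name}: {wireless_cost(sync, dist):.1f}")
```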
So, what do you guys think of the idea? If you’ve got your own ideas, please list them here so we can discuss.
I agree, the OSPF cost has to be adjusted from the standard cost of 10.
Instead of using the sync rate, I’d say use the available bandwidth of the link (300/300 sync is not necessarily better than 240/240).
You can then use the standard OSPF cost formula: reference bandwidth / link speed = OSPF cost. Make the reference bandwidth 1000 and you’ll be able to differentiate between 1 Gbps and 100 Mbps ether links.
Adding the link distance into the equation sounds a bit overkill.
The only major disadvantage of modifying the OSPF cost on the wireless links is that it’s going to need constant admin. As link speeds increase/decrease, the OSPF cost has to be updated on both sides of the link.
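As a sketch, the standard formula with a reference bandwidth of 1000 would give the following costs (OSPF cost is an integer of at least 1, so the result is rounded and clamped):

```python
REFERENCE_BW = 1000  # Mbps

def ospf_cost(link_speed_mbps):
    """Standard OSPF costing: reference bandwidth / link speed."""
    # OSPF cost is an integer with a minimum of 1
    return max(1, round(REFERENCE_BW / link_speed_mbps))

print(ospf_cost(1000))  # 1 Gbps ether  -> 1
print(ospf_cost(100))   # 100 Mbps ether -> 10
print(ospf_cost(20))    # 20 Mbps wireless -> 50
```

This is exactly why a reference bandwidth of 1000 helps: with the default of 10 per interface, all three of those links would look identical to OSPF.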
I don’t see how this is going to work. You need OSPF on the interface to be able to transit data.
Alternative is to implement OSPF areas, and use filters to prevent intra-area traffic from going to a neighboring area and back. But for this, you’ll need clear boundaries between areas.
I don’t agree with the example given. Despite being single polarity, the Mercury-Moonfruit link usually does the most data on any given day (vs. Prometheus and Ariel). Yet your formula gives us a significantly higher costing.
Why not keep costing the same, and look at fixing/tuning the problem links that are bottlenecked? There are many OSPF links on the WUG that can’t handle more than 10 Mbps, so instead of tuning the costing, why not just have them improve their links? And if a link cannot be improved to “backbone standard”, should it not then be removed?
I don’t agree with the example given. Despite being single polarity, the Mercury-Moonfruit link usually does the most data on any given day (vs. Prometheus and Ariel). Yet your formula gives us a significantly higher costing.
Yes, the Mercury-Moonfruit link does handle the most data on any given day, but the theory is that the other links could do more work. Given the network topology, I would imagine that this link will still do the most data even after the change.
Why not keep costing the same, and look at fixing/tuning the problem links that are bottle-necked? There are many OSPF links on the WUG that can’t handle more than 10mbps, so instead of tuning the costing, why not just have them improve their links? And if the link cannot be improved to “backbone standard”, should it not then be removed?
Regardless of what comes out of this discussion, any link that doesn’t comply with OSPF criteria should be turned off until the issues have been resolved.
If all the links were tuned and optimised to comply with the rules, would it still be necessary to adjust the costing?
I feel that by adjusting costing, people with bad links have no incentive to improve them. But if one were to disable their links until they’re improved, surely that would get them sparking…
If all the links were tuned and optimised to comply with the rules, would it still be necessary to adjust the costing?
Yes, I would say so. Even when a link is tuned to perfection, the difference between a Dual Polarity link and an A link is big enough to warrant some sort of distinction. Maybe my formula isn’t exact, and it could be something even simpler, like:
Ethernet = 1
Dual Polarity = 10
Single Polarity N = 20
A = 30
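This simpler alternative is just a fixed lookup per link type, which could be sketched as (key names are illustrative; values are from the table above):

```python
# Fixed cost per link type, per the simplified proposal above
LINK_TYPE_COST = {
    "ethernet": 1,
    "dual_polarity": 10,
    "single_polarity_n": 20,
    "a": 30,
}

print(LINK_TYPE_COST["dual_polarity"])  # -> 10
```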
But then we still need to figure out the crossing of the “longhaul” links. You do have a valid point about people not improving their bad links, as this method would route traffic around somebody’s laziness, but that only works to a point. If we have a bad link in my area we will turn it off, and other area admins should be doing the same.
Aahh, but then again, to moonfruit’s point: why not upgrade the link(s) that are incapable of the same throughput? Is it really a requirement to rethink the costing for all OSPF links, or are there perhaps only some isolated instances where this would be beneficial beyond an obvious upgrade?
I might be missing what is actually meant here, but my take on King’s idea is that it prevents favouring a weak long link over a shorter (in distance) stronger link, which happens now because all costings are the same regardless of distance. Upgrading would be the first thought, but comparing identical gear, a shorter link will always have a better sync rate than a longer one and should theoretically deliver more bandwidth (hence the lower cost).
What would be ideal is to find a node on WIND where the current system is sending traffic over a link with lower overall bandwidth than an alternative link on the same node, obviously ending up at the same location. This would involve testing total throughput on each option and seeing whether the current costs made the right decision, bandwidth-wise. Latency-wise, I would suspect a slight increase after adjusting costing, as a higher-capacity route with more hops would be chosen.
EDIT: Just to throw something in there: a change like this, if implemented, would take quite some time, I’d say, and if it helps nothing then that’s a lot of wasted time to revert it back. I think some tests should be done first.
I like King’s idea.
I unfortunately do not agree with switching links off. Rather get the guys to tune the links better, or just raise the cost of the link!!!
Also, when I started on the wug, a 10M link was tha bomb
Yes, technology does improve, but for me to upgrade both sides of all my links is going to cost a packet!!!
Even with my “slow” wug links it still performs much better than my ADSL.
OSPF was never designed with latency in mind. Trying to get a fully working formula using latency and link distance will be nearly impossible and will cause a major admin headache.
Go for the costing formula OSPF was designed for - reference bandwidth/link speed = OSPF cost.
The biggest routing improvement will be on sites where multiple routers are interlinked via ether. If you stick with the default costing of 10 per interface, then OSPF will regard a 100 Mbps link as just as good as a 20 Mbps wireless link.
IMO, yes, bad links should be turned off; there shouldn’t be any reason to have a bad OSPF link live! If it’s not an OSPF bb/relay link then it really doesn’t matter! If those peeps can’t be bothered, why should anyone else? Yes, links should be upgraded; yes, links must be aligned… But really, even if you have a 120/120 Mbps link between points A and B, there are times when you have better routes to get to B. For example: A-Y 300 Mbps, Y-Z 300 Mbps, Z-B 300 Mbps. Yes, there are more hops, but these shorter links are likely more stable and have better capacity.
Of course this isn’t always the case around the wug. It also has to be considered that the wug is way more complex, and going through assessing each link this way isn’t really viable. Perhaps scripting something to do the calcs and modify the cost twice a day or something like that (just a thought)…
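A periodic script like that could look roughly like this. This is purely a hypothetical sketch: actually reading sync rates from and pushing costs to the routers (e.g. via the router’s API) is left out, and the sync rates here are just passed in as a plain dict with made-up numbers. It reuses the standard reference-bandwidth formula mentioned earlier in the thread:

```python
REFERENCE_BW = 1000  # Mbps, reference bandwidth from the standard formula

def recalc_costs(sync_rates):
    """Map each link's current sync rate (Mbps) to its new OSPF cost.

    sync_rates: dict of link name -> sync rate in Mbps.
    In a real job (run e.g. twice a day from a scheduler), the rates would
    be read from the radios and the resulting costs pushed to both sides
    of each link -- both steps are omitted in this sketch.
    """
    return {link: max(1, round(REFERENCE_BW / rate))
            for link, rate in sync_rates.items()}

# Example run with made-up sync rates:
print(recalc_costs({"Mercury-Prometheus": 120, "Mercury-Ariel": 48}))
```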
Additional to this, I think some links should be looked at. Looking at WIND is a nightmare with how all those links cross each other. Some peeps should be forced to inter-link with each other instead of going straight to the nearest sector! Granted, it’s not always possible, but it really is getting out of hand!
I’m aware of this. But if using the costing formula OSPF was designed for, then in most cases the shorter hops (which should have the most capacity) will be chosen. This in turn would inevitably mean more hops (Warlok’s descriptions above), and with each hop the latency increases.
I’m not saying it’s a bad thing, I’m just putting it out there.