Networking changes on TQ | EVE Online

Networking changes on TQ

2009-04-29 - Published by CCP Queeg500

Well, we have done tons of work upgrading the PC machines with more memory and faster hard drives, going from floppy drives to some RAMSAN thing. While this was happening, the network operations team has been looking at the most important thing (from the NOC standpoint): the network. Without a network connection to this interweb thing, EVE would be a great single player console game for some lonely guys in the Telecity datacenter ;P

The Internet, how does it really work (yikes!!)

When I started working with Internet routing, coming from a rigid corporate environment, I was surprised how little control you have over your traffic and how flaky the Internet really is. I had read in books about all the bells and whistles of traffic engineering, but I had never seen a real application until I met BGP and the Internet. As many of you know, the Internet started out as a few universities talking to each other. BGP (Border Gateway Protocol) is the main building block of the Internet. The Internet is built up of BGP-speaking peers (routers). Each origin (like CCP) has a network slice; we announce our network to our neighbors and hope they announce it to their other neighbors. We rely on our next-hop (upstream) neighbors to accept our prefix and carry the traffic.

The major problem with the Internet is that BGP is not very deterministic; it relies heavily on something called an AS-PATH. Each entity on the Internet has an AS (autonomous system) number. The path from client to CCP is usually the AS of the client's ISP, a Tier1 SP (top of the Internet), then CCP. There will often be other upstream providers of the client's ISP in between. What is the problem here? Within one AS there could be 10 router hops (as seen via traceroute). Those hops could go to Europe, the US and then back to Europe, yet they still count as a single AS hop. The best part is that there is a lot of meddling with the best paths. Some ISPs will have a good connection to some Tier1 providers, but may also have a cheaper connection with some lousy provider that they engineer the traffic to.
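
To make the AS-PATH point concrete, here is a minimal Python sketch. The AS labels and router-hop counts are invented for illustration, and real BGP compares several other attributes before and after AS-path length; this shows only that one comparison and what it cannot see.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """One candidate route to a destination prefix (illustrative only)."""
    as_path: tuple      # AS labels between us and the origin; hypothetical names
    router_hops: int    # invented count of routers a traceroute would show


def shortest_as_path(routes):
    """Pick the route with the fewest AS hops.

    Router hops and geography are invisible to this comparison.
    """
    return min(routes, key=lambda r: len(r.as_path))


candidates = [
    # Two ASes, but many routers and a hypothetical transatlantic detour.
    Route(as_path=("TransitA", "ClientISP"), router_hops=19),
    # Three ASes, but a short and direct path on the ground.
    Route(as_path=("TransitB", "RegionalC", "ClientISP"), router_hops=8),
]

best = shortest_as_path(candidates)
print("chosen AS path:", " -> ".join(best.as_path))
print("router hops on chosen path:", best.router_hops)
```

The route with fewer ASes wins even though it crosses more than twice as many routers, which is exactly the mismatch that traceroute exposes and the AS-PATH hides.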

One example of this problem is Vodafone in Iceland. They peer with a large transit network called Cogent and with T-Systems (Deutsche Telekom) in Europe. We don't peer with either of them. The shortest path from CCP to Vodafone Iceland is CCP-Telia-Cogent-Vodafone Iceland. The AS path, then, is two hops. The problem is that Cogent chooses to send all traffic to the US and then over to Iceland via Vodafone, and all of that still counts as one AS hop. What we must then do is manually engineer the traffic to use a longer AS path, through our multi-homed connection in Telecity to Tiscali, who peer with T-Systems, so that we reach Vodafone over the shortest real distance.
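
One common way to force a longer AS path is to raise the local preference of the route you want, since BGP compares local preference before AS-path length. The sketch below is an assumption about the mechanism, not a description of our actual configuration: the AS sequence of the Telecity path is read off the paragraph above, and the preference values are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """A candidate route toward Vodafone Iceland (simplified)."""
    as_path: tuple
    local_pref: int = 100   # 100 is a common default; an assumption here


def best_route(routes):
    """Prefer higher local preference first, then the shorter AS path."""
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path)))


via_telia = Route(("Telia", "Cogent", "Vodafone Iceland"))
via_telecity = Route(("Telecity", "Tiscali", "T-Systems", "Vodafone Iceland"))

# Left alone, the shorter AS path wins, detour through the US and all.
default_choice = best_route([via_telia, via_telecity])

# Manual engineering: give the longer path a higher (hypothetical) preference.
engineered = Route(via_telecity.as_path, local_pref=200)
tuned_choice = best_route([via_telia, engineered])

print("default:", " -> ".join(default_choice.as_path))
print("tuned:  ", " -> ".join(tuned_choice.as_path))
```

Note that this only steers traffic leaving our network; the return path is still chosen by the networks on the other side.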

So where does CCP stand

Until recently, we had two connections to the Internet: one was with Telia Sonera, a Tier1 service provider (//en.wikipedia.org/wiki/Tier_1_network). The other connection was with a multi-home peering platform operated by Telecity in London. We had limited control over our traffic. One important thing about BGP: you can control where you send the traffic, not how it returns.

So remember: when you do a traceroute to TQ, it's very likely that a traceroute from TQ back to you would look very different. (I'll discuss traceroute more below.)
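
Here is a toy Python model of why the two directions differ. Every network keeps its own next-hop choice per destination, so nothing forces the return path to mirror the outbound one. All network names and table entries are hypothetical.

```python
# Each network's own next-hop decision per destination (hypothetical names).
next_hop = {
    "Player":   {"TQ": "ISP"},
    "ISP":      {"TQ": "TransitA", "Player": "Player"},
    "TransitA": {"TQ": "TQ", "Player": "ISP"},
    "TransitB": {"TQ": "TQ", "Player": "ISP"},
    "TQ":       {"Player": "TransitB"},
}


def trace(src, dst, table, max_hops=16):
    """Follow each network's own next-hop choice from src to dst."""
    path = [src]
    while path[-1] != dst:
        if len(path) > max_hops:          # guard against forwarding loops
            raise RuntimeError("gave up: too many hops")
        path.append(table[path[-1]][dst])
    return path


forward = trace("Player", "TQ", next_hop)
reverse = trace("TQ", "Player", next_hop)

print("to TQ:  ", " -> ".join(forward))
print("from TQ:", " -> ".join(reverse))
```

In this toy table the player's ISP sends traffic out via TransitA while TQ answers via TransitB, so a trace run from the player's side never sees half of the round trip.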

We decided it was not enough to have only two providers, even if they are dual connected for redundancy. Internet service providers get overloaded and have all kinds of issues. Having only two limits our options for circumventing bad spots on the Internet. It was decided to get as close to the user as possible. This meant peering with the Tier1 service providers directly and, in some cases, trying to obtain direct peering with big end-user service providers. After doing some searching we found a company called PacketExchange. They are supplying us with direct peering with some of the major Tier1 providers and direct peering with 120 smaller SPs. This was a very good match for us and we jumped on it.

The hardware conundrum

The TQ core network infrastructure is built from Cisco Catalyst switches and routers. Up to now we have been using a Cisco Catalyst 6500 as our distribution switch when peering with the outside world. When this hardware was chosen, the Internet routing table was around 250k entries and this equipment could handle it with ease. The table is growing day by day and is close to 280k routes now. The router module in the 6500 is not specialized to handle a large number of BGP routes, so it became apparent that if we wanted to increase the number of peers we would have to change the network structure.

Initially we looked at a few different platforms. It was always clear that we would stick with Cisco, as we have good internal knowledge of the platform and it has proven to be very reliable. In the Cisco product line there are a few models that we could use: the old and proven 7200 series, the newcomer ASR1000 series running IOS XE software, and the supersized 7600-S chassis.

As we are CCP, we went for the supersized option and got the 7600-S platform with an RSP720-3CXL route processor. I read in some thread a suggestion that an IOS-based platform would give us challenges. I'm a true believer that an IOS-based platform is perfect for us. The setup is not that complex, and now we have a very robust routing platform with a very fast switchover in case of hardware failure. The Cisco 12K and CRS-1 are in the big service provider range; they are very expensive and offer no real additional value for our scenario.

Last week:

We installed the 7600s and turned on the links to PacketExchange. This was done softly, so we eased the changes in and hopefully no one noticed anything. There was a small hiccup when one of the fiber patch cables failed horribly, but that was quickly dealt with.

We now have direct peering with: Global Crossing, Teleglobe, KPN, Telia, AboveNet and Level3. We still have the multihomed connection in Telecity operational, but that one is going to be closed soon. Level3 is eating up most of the traffic as expected as they are by far the largest Tier1 service provider.

We will soon turn on peering with 120 smaller SPs through a multilateral peering platform that PacketExchange operates.

We are in fact a small Tier2 service provider, with one customer, TQ ;P (though we don't have any downstream providers).

Next steps:

We continue working on improving the network infrastructure to get as close to you as possible. We are now looking at providing a "looking glass" service, which will let you see how traffic travels from TQ to you. This is a very important tool for finding out whether the traffic is taking some long detour, as the traces you currently supply in the disconnect threads don't tell the whole story. We now have a real option of rerouting traffic and choosing the best path to the customer; with only two providers the options were very limited. Perhaps one day you can choose the outgoing provider through the web portal ;P Now, that would be very cool! But how, if, and when we should do this is still under investigation.

Just a few notes about traceroute:

When you do a traceroute, you send a packet to the destination with a low time to live (TTL) number in the header. Each router in the path subtracts one from the TTL, and if the TTL is 0 it drops the packet and sends an ICMP message back saying "time exceeded." Traceroute starts by sending a packet with a TTL of 1 and increments it each time to discover the hops.

When we see packets dropped in a traceroute, high latency numbers etc., it does not always indicate a real problem. Most Internet border routers are "big" routers; they have something called the forwarding plane, which is the part responsible for sending and receiving the packets passing through the router. Then we have the control plane, which is the "router." Even in a 7600 it's only a 1200 MHz RISC processor, and it's the one that is responding to pings and sending TTL expired messages. As with anything, TTL expired has been exploited for DoS attacks on routers, so it's usually rate limited; so is ICMP and other stuff that could impact the control plane of the router. The router will answer these messages when it feels like answering. If you ping a router you might see jumps in response time, and that is by no means an indication of how it is performing. It may tell you something about the housekeeping of the control plane, but not whether the router is overloaded. "Routers are usually lazy answering pings" ;P
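
The TTL mechanics and the "lazy router" effect can be sketched with a small simulation in Python. Nothing here touches the network: the router names are made up, and the `reply_budget` counter is a crude stand-in for control-plane rate limiting, not how any real router implements it.

```python
def probe(path, ttl):
    """Send one simulated packet along `path` with the given TTL.

    Returns the name of the router that reports "time exceeded",
    "*" if that router stays silent, or "reached" if the packet
    gets past every router.
    """
    for router in path:
        ttl -= 1                              # each router subtracts one
        if ttl == 0:
            # The packet is dropped here. Whether we hear about it depends
            # on the (rate-limited) control plane, not the forwarding plane.
            if router["reply_budget"] > 0:
                router["reply_budget"] -= 1
                return router["name"]
            return "*"
    return "reached"


def traceroute(path, max_ttl=30):
    """Probe with TTL 1, 2, 3, ... until the destination is reached."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        answer = probe(path, ttl)
        if answer == "reached":
            hops.append("destination")
            break
        hops.append(answer)
    return hops


# Hypothetical path; the middle router has used up its ICMP reply budget.
path = [
    {"name": "isp-edge", "reply_budget": 5},
    {"name": "transit-core", "reply_budget": 0},
    {"name": "tq-border", "reply_budget": 5},
]

hops = traceroute(path)
for number, hop in enumerate(hops, start=1):
    print(number, hop)
```

The second hop shows up as "*", yet the probes with higher TTLs still pass straight through that router and reach the destination. A silent or slow hop in the middle of a trace says the router did not answer, not that it failed to forward.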

I hope this first blog from the network team helps explain what we are doing, the problems we're facing and how we are tackling this ever changing thing called the Internet.