Creating a global streaming CDN with Wowza

I’ve posted in the past about the various components involved in doing edge/origin setups with Wowza, as well as startup packages and Route 53 DNS magic. In this post, I’ll tie those pieces together into a geographically distributed CDN built on Wowza.

For the purposes of this hypothetical CDN, we’ll do it all within EC2 (although it doesn’t have to be), with three edge servers in each region.

There are two ways to do load balancing within each region. One is to use the Wowza load balancer, and the other is round-robin DNS. The latter isn’t as evenly balanced as the former, but should work reasonably well. If you’re serving set-top devices like Roku, you’ll likely want to use DNS, as redirects aren’t well supported in the Roku code.

For each region, create the following DNS entries in Route 53. If you use your registrar’s DNS, you’ll probably want to create a dedicated domain name, as many registrar DNS servers don’t deal well with third-level domains (sub.domain.tld). For the purposes of this example, I’ll use nhicdn.net, the domain I use for Nerd Herd streaming services. None of these hostnames actually exist.

We’ll start with Wowza’s load balancer module. Since the load balancer isn’t tied to the origin server in any way (although they frequently run on the same system), you can run the load balancer on one of the edge servers in the region. Point loadbalancertargets.txt on each edge server in that region at that server, and point the edge Application.xml at the origin stream. Once the load balancer is up, create a simple CNAME entry in Route 53 named lb-[region].nhicdn.net, pointing to the hostname of the load balancer server:

lb-us-east.nhicdn.net CNAME ec2-10-20-30-40.compute-1.amazonaws.com
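
On the edge side, loadbalancertargets.txt is just a list of load balancer addresses, one per line, and the edge application points back at the origin stream. A minimal sketch, with hypothetical hostnames and application names (the exact Application.xml element depends on your Wowza version and stream type; for a liverepeater edge it’s the Repeater/OriginURL value):

loadbalancertargets.txt on each us-east edge:

ec2-10-20-30-40.compute-1.amazonaws.com

OriginURL in the edge Application.xml:

rtmp://origin-us-east.nhicdn.net/liveorigin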

If you opt to use DNS for load balancing, again point the Application.xml at the origin stream, and then create a weighted CNAME entry for each server in that region, all with the same host record and equal weights (1 works):

lb-us-east.nhicdn.net CNAME ec2-10-20-30-40.compute-1.amazonaws.com
lb-us-east.nhicdn.net CNAME ec2-10-20-30-41.compute-1.amazonaws.com
lb-us-east.nhicdn.net CNAME ec2-10-20-30-42.compute-1.amazonaws.com
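
If you’d rather script these weighted entries than click through the console, the boto-based route53 command-line tool (the same one used by the startup script later on) can create them; with that tool the arguments after the record value are TTL, a set identifier, and the weight. A sketch, with the zone ID as a placeholder:

route53 add_record ZONEID lb-us-east.nhicdn.net CNAME ec2-10-20-30-40.compute-1.amazonaws.com 60 edge-us-east1 1
route53 add_record ZONEID lb-us-east.nhicdn.net CNAME ec2-10-20-30-41.compute-1.amazonaws.com 60 edge-us-east2 1
route53 add_record ZONEID lb-us-east.nhicdn.net CNAME ec2-10-20-30-42.compute-1.amazonaws.com 60 edge-us-east3 1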

If you wish, you can also create a simple CNAME alias for each server for easier tracking – this is strictly optional:

edge-us-east1.nhicdn.net CNAME ec2-10-20-30-40.compute-1.amazonaws.com
edge-us-east2.nhicdn.net CNAME ec2-10-20-30-41.compute-1.amazonaws.com
edge-us-east3.nhicdn.net CNAME ec2-10-20-30-42.compute-1.amazonaws.com

Once you’ve done this in each region where you want servers, create a set of latency-based CNAME entries, one for each lb entry:

lb.nhicdn.net CNAME lb-us-east.nhicdn.net
lb.nhicdn.net CNAME lb-us-west.nhicdn.net
lb.nhicdn.net CNAME lb-europe.nhicdn.net
lb.nhicdn.net CNAME lb-brazil.nhicdn.net
lb.nhicdn.net CNAME lb-singapore.nhicdn.net
lb.nhicdn.net CNAME lb-japan.nhicdn.net
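
Latency-based entries are easiest to create in the Route 53 console, but if you want to script them, here’s a rough sketch using the modern AWS CLI (not the boto tool used elsewhere in these posts); the zone ID is a placeholder, and you’d repeat the change once per region with the matching SetIdentifier and Region values:

aws route53 change-resource-record-sets --hosted-zone-id ZONEID --change-batch '{
  "Changes": [{
    "Action": "CREATE",
    "ResourceRecordSet": {
      "Name": "lb.nhicdn.net",
      "Type": "CNAME",
      "SetIdentifier": "us-east",
      "Region": "us-east-1",
      "TTL": 60,
      "ResourceRecords": [{ "Value": "lb-us-east.nhicdn.net" }]
    }
  }]
}'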

Once your DNS entries are in place, point your players at lb.nhicdn.net, using either the name of the edge application (if you’re using DNS round-robin) or the redirect application (if you’re using the Wowza load balancer); clients will then be sent to the edge server or load balancer in the closest region.
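
As a concrete example (application and stream names here are hypothetical), a player using the DNS round-robin group would point straight at the edge application:

rtmp://lb.nhicdn.net/liveedge/myStream

while a player using the Wowza load balancer would connect to whatever application you’ve set up with the redirector module, which then bounces the client to the least-loaded edge:

rtmp://lb.nhicdn.net/redirect/myStream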

One other thing to consider is setting up a load balancer on your origin and using it to gather stats from the various servers. You can list multiple load balancers in loadbalancertargets.txt, so each server can report to both its regional load balancer and the global one; the global one isn’t used to redirect clients, only for statistics gathering. You can also run multiple load balancers in each region for redundancy and front them with a weighted CNAME entry.
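
In that case, loadbalancertargets.txt on an edge server in us-east might look something like this (hostnames hypothetical), with the first entry handling redirects and the second used purely for stats:

lb-us-east.nhicdn.net
stats.nhicdn.net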

If you would like assistance setting up a system like this for your organization, I am available for hire for consulting and/or training.

Adding EC2 instances to Route53

Today’s nifty bit of code is a startup/shutdown script for Linux that automatically adds an EC2 instance to Amazon’s Route 53 DNS when the instance starts up, and removes it when the instance shuts down. The script can also add the instance to a weighted round-robin group.

This makes use of the very useful Python-based boto tool, which is available in both the yum and Debian repositories under the package name python-boto.

Create this script in /etc/init.d, make it executable, and add it to the requisite rcX.d directories for startup and shutdown (see the commands after the script).


#!/bin/bash
#
#       /etc/rc.d/init.d/<servicename>
#
#       Registers DNS with Route 53
#
# chkconfig: 345 20 80
#

# Source function library.
. /etc/init.d/functions

# Best practice here is to create an IAM user/group with access to only 
# Route53 and use this keypair.
export AWS_ACCESS_KEY_ID=<Access Key ID Here>
export AWS_SECRET_ACCESS_KEY=<Secret Access Key here>

# Use the domain you have configured in Route 53
export DOMAIN=ianbeyer.com

# This is the hostname of the Weighted Round Robin Group.
export RR=wrr

# Gets the current public hostname - you don't want to use an A record with 
# the IP because the CNAME works in conjunction with Amazon's internal DNS
# to correctly resolve instances to their internal address
export PUB_HOSTNAME=`curl -s --fail http://169.254.169.254/latest/meta-data/public-hostname`

# Gets the Route 53 Zone ID
export ZONEID=`route53 ls | awk '($2 == "ID:"){printf "%s ",$3;getline;printf "%s\n",$3}' | grep $DOMAIN | awk '{print $1}'`

start() {
        echo -n "Registering host with Route 53 : "

# This is the base name of the host you want to use. It will increase the index
# number until it finds one that's not in use, and then use that. 

        export HOST=amz
        export HOSTINDEX=1

        INUSE=`route53 get $ZONEID | grep ${HOST}${HOSTINDEX}\.${DOMAIN} | wc -l`
        while [[ $INUSE -gt 0 ]]
        do
            HOSTINDEX=$((HOSTINDEX + 1))
            INUSE=`route53 get $ZONEID | grep ${HOST}${HOSTINDEX}\.${DOMAIN} | wc -l`
        done
        if [[ "$HOSTINDEX" == "1" ]]; then
            FQDN="${HOST}${HOSTINDEX}.${DOMAIN}"
        else
            # set the new fqdn hostname and shortname
            FQDN="${HOST}${HOSTINDEX}.${DOMAIN}"
            SHORTNAME="${HOST}${HOSTINDEX}"
        fi

        # Set the instance hostname -- If you want to make sure that bash
        # updates the prompt, run "exec bash" after the script. If you do so
        # in the script, bad stuff happens on startup.

        hostname $FQDN
        echo -n $FQDN


        # Add the host's own CNAME record
        RESULT=`route53 add_record $ZONEID $FQDN CNAME $PUB_HOSTNAME | grep "PENDING"`
        if [[ "$RESULT" == "" ]]; then
                echo "... failed.";
        else
                echo "... success.";
        fi

        # Add the host to the weighted round-robin group
        echo -n "Adding host to round-robin group...";

        # Checking to make sure it's not already there. 
        CNAME=`route53 get $ZONEID | grep $RR | grep $PUB_HOSTNAME | wc -l`
        if [[ $CNAME -eq 0 ]]; then
                RESULT=`route53 add_record $ZONEID $RR.$DOMAIN CNAME $PUB_HOSTNAME 60 ${HOST}${HOSTINDEX} 1 | grep "PENDING"`
                if [[ "$RESULT" == "" ]]; then
                        echo "... failed.";
                else
                        echo "... success.";
                fi
        else
                echo "already exists, ignoring";
        fi

}


stop() {
        echo -n "Deregistering host with Route 53 : "
        HOST=`hostname | cut -f1 -d.`
        FQDN=`hostname`

        # check to make sure it exists in Route53
        CNAME=`route53 get $ZONEID | grep $FQDN | grep $PUB_HOSTNAME | wc -l`
        if [[ $CNAME -gt 0 ]]; then
                RESULT=`route53 del_record $ZONEID $FQDN CNAME $PUB_HOSTNAME | grep "PENDING"`
                if [[ "$RESULT" == "" ]]; then
                        echo "... failed.";
                else
                        echo "... success.";
                fi
        else
                echo "... not found, ignoring";
        fi

        echo -n "Deregistering host from RR CNAME..."

        # Checking to make sure it exists
        CNAME=`route53 get $ZONEID | grep $RR | grep $PUB_HOSTNAME | wc -l`
        if [[ $CNAME -gt 0 ]]; then
                RESULT=`route53 del_record $ZONEID $RR.$DOMAIN CNAME $PUB_HOSTNAME 60 ${HOST} 1 | grep "PENDING"`
                if [[ "$RESULT" == "" ]]; then
                        echo "... failed.";
                else
                        echo "... success.";
                fi
        else
                echo "... not found, ignoring";
        fi

        # resets the hostname to default
        hostname `curl -s --fail http://169.254.169.254/latest/meta-data/hostname`


}

case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    restart)
        stop
        start
        ;;
    *)
        echo "Usage: <servicename> {start|stop|restart}"
        exit 1
        ;;
esac
exit $?
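
Once the script is saved and executable, wire it into the runlevels. Assuming you saved it as /etc/init.d/route53-dns (the name is arbitrary), something like this should do it:

chmod +x /etc/init.d/route53-dns

# Red Hat / CentOS / Amazon Linux
chkconfig --add route53-dns
chkconfig route53-dns on

# Debian / Ubuntu equivalent (note the script sources /etc/init.d/functions,
# which is Red Hat-specific, so you'd need to adjust that line)
update-rc.d route53-dns defaults 20 80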

Many thanks to Dave McCormick for his blog post on the subject from which I borrowed heavily.

Converting EC2 S3/instance-store image to EBS

Amazon’s instance-store images are convenient, but ephemeral in nature: once you terminate them, they’re history. If you want your data to persist, you want an EBS-backed instance that can be stopped and started at will without losing your info. Here’s the process I went through to convert a Wowza image to EBS for a client to use with a reserved instance. I’m going to assume no configuration changes for Wowza Media Server, as the default startup package is fairly full-featured. This process works for any other instance-store AMI as well; just ignore the Wowza bits if that’s your situation.

Boot up a 64-bit Wowza lickey (license key) instance. I was working in us-east-1, so I used ami-e6e4418f, which was the latest as of this blog post.

Once it’s booted up, log in.

Elevate yourself to root. You deserve it:

sudo su -

Stop the Wowza service:

service WowzaMediaServer stop

Delete the Server.guid file. This will cause new instances to regenerate their GUID.

rm /usr/local/WowzaMediaServer/conf/Server.guid

Go into the AWS management console and create a blank EBS volume in the same zone as your instance.

Attach that volume to your instance (I’m going to assume /dev/sdf here)
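
If you’d rather do those two steps from the command line, the ec2-api-tools equivalents look roughly like this (the size, zone, and volume/instance IDs are placeholders):

ec2-create-volume --size 10 --availability-zone us-east-1a
ec2-attach-volume vol-xxxxxxxx -i i-xxxxxxxx -d /dev/sdf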

Create a filesystem on it (note: while the console refers to it as /dev/sdf, Amazon Linux uses the Xen virtual disk notation /dev/xvdf):

mkfs.ext4 /dev/xvdf

Create a mount point for it, and mount the volume:

mkdir /mnt/ebs
mount /dev/xvdf /mnt/ebs

Sync the root and dev filesystems to the EBS disk:

rsync -avHx / /mnt/ebs
rsync -avHx /dev /mnt/ebs

Label the disk:

tune2fs -L '/' /dev/xvdf

Flush all writes and unmount the disk:

sync;sync;sync;sync && umount /mnt/ebs

Using the web console, create a snapshot of the EBS volume. Make a note of the snapshot ID.
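
The command-line equivalent (the volume ID is a placeholder) is roughly:

ec2-create-snapshot vol-xxxxxxxx --description "Wowza EBS root"

The snapshot ID comes back in the command’s output.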

Still in the web console, go to the instances and make a note of the kernel ID your instance is using. This will be aki-something. In this case, it was aki-88aa75e1.

For the next step, you’ll need an EC2 X.509 certificate and private key, which you get through the web console’s “Security Credentials” area. This is NOT the private key you use to SSH into an instance. You can have as many of these as you want; just keep track of the private key, because Amazon doesn’t keep it for you, and if you lose it, it’s gone for good.

Download both the private key and the certificate. You can either upload them to the instance, or open them in a text editor and paste the text into files on the instance; the best place to put them is /root. Once you have the files, set a couple of environment variables to make things easy:

export EC2_CERT=`pwd`/cert-*.pem
export EC2_PRIVATE_KEY=`pwd`/pk-*.pem

Once this is done, you’ll need to register the snapshot as an AMI. It’s important here to specify the root device name and to map out the ephemeral storage, as Wowza uses the ephemeral disks for content and logs. Ephemeral storage persists through a reboot, but not a termination; if you have data that needs to survive termination, use an additional EBS volume.

ec2-register --snapshot [snapshot ID] --description "Descriptive Text" --name "Unique-Name" --kernel [kernel ID] --block-device-mapping /dev/sdb=ephemeral0 --block-device-mapping /dev/sdc=ephemeral1 --block-device-mapping /dev/sdd=ephemeral2 --block-device-mapping /dev/sde=ephemeral3 --architecture x86_64 --root-device-name /dev/sda1

Once it’s registered, you should be able to boot it up and customize to your heart’s content. Once you have a configuration you like, right-click on the instance in the AWS web console and select “Create Image (EBS AMI)” to save to a new AMI.

Note: As of right now, I don’t think startup packages are working with the EBS AMI. I don’t know if they’re supposed to or not. 

Streaming on Amazon’s “SuperQuad”

I posted recently about using Amazon EC2’s cluster compute instances for big streaming projects. That post got me a call from a client in Texas who was planning to stream a big tennis tournament in Dallas and needed a server backend that could handle it, without going through the hassle and expense of setting up a CDN account for a single event. Of course, since everything is bigger in Texas, they wanted to stream to a large audience. They also wanted to be able to send a single high-definition stream for each of the two tournament courts, and then transcode down to a few different bandwidth-friendly bitrates. This called for not only big network horsepower, but big CPU horsepower as well.

I fired up the superquad (cc1.4xlarge), installed Wowza and the subscription license on it (Wowza pre-built AMIs do not exist for this instance type), and tuned it. I then created transcoder profiles for 480p, 360p, 240p, and 160p renditions, and we tested. Note that when you install Wowza yourself on an EC2 image, you don’t have access to the EC2-specific variables and classes out of the box; you’ll need to grab the EC2 jar file from one of Wowza’s prebuilt AMIs. In this case that wasn’t a factor, as I simply hardcoded the server’s public DNS name anywhere it was needed.

Once the tournament started, we were seeing big audience numbers, with bitrates on the box well in excess of 1Gbps. On day two, audiences started complaining about spotty stream performance, and some were running 15 minutes behind live.

After jumping into the logs, it became apparent that this 8-core/16-thread monster was starved for CPU! Wowza recommends that a transcoding system not exceed 50-55% CPU. We reduced the number of transcoded renditions to two (480p and 360p). In the process, I discovered that a malformed search-and-replace had altered the configuration so that every rendition was being transcoded at 1280×720, at extremely low bitrates. No wonder the poor thing was dying. Once everything was fixed, a full audience with both courts going clocked in around 40% CPU. At no point did Java heap usage exceed 3GB (in the tuning, I allowed it up to 8GB, the maximum Wowza recommends). Wowza seems to be exceedingly efficient with its memory usage. If you need to run heavier transcoding loads, you may want to look at what I call the “super-duper-octopus” (cc2.8xlarge), which has roughly double the resources of this instance.

CPU Usage - Note 2/7 when we were having trouble

Early Thursday, I checked the AWS usage stats for the month, and my jaw dropped. In three days of streaming, we’d clocked over five TERABYTES of data transfer. I expect I’ll bump into the next bandwidth tier (or come very close) by the end of the week. That’s what happens when you average around 1Gbps for the better part of 12 hours a day!

Network Usage (Bytes/Minute.. What the heck, Amazon?)

As for server cost, this instance type runs about the price of two extra-large instances (each capable of about 450Mbps) and doesn’t even break a sweat at these transfer rates. Had I parked this service on a VPS at another hosting provider, I would have blown through the monthly data cap by mid-Tuesday, and likely wouldn’t have had access to a 10Gbps pipe on the server. Meanwhile, when you start cranking out terabytes of data, the cost per gigabyte becomes a major factor: at 10TB of transfer, every penny per gigabyte adds $100 to the bandwidth tab.

Although a large portion of the audience for this event was in Europe (at one point, 60% of the audience was coming from Lithuania!), the cluster instances are currently only available in the us-east (Virginia) region. If performance for European users had gotten problematic, I could have set up a repeater in Amazon’s datacenter in Ireland. As it was, there were no complaints.

So that’s how a superquad works for large streaming events. If you want some help setting one up, or just want to rent mine for your event, drop me a line.

Go big, or go home!

I’m currently working on setting up Wowza on an EC2 “Cluster Compute Quadruple Extra Large” instance (or as I’ve heard it called, the “super-duper-quadruple”, which sounds like something I’d get at Five Guys). There’s no pre-built AMI for this one, so you have to use a stock Linux image (I use the standard Amazon one), install Wowza with a subscription license, and do the tuning yourself. But the payoff is this: for $1.30 an hour, you get a streaming server capable of delivering 10Gbps of data. On a 750Kbps stream, that’s over 13,000 concurrent clients. That’s about the same cost as nine or ten m1.small instances, which can deliver an aggregate of about 1.5Gbps. On a reserved instance, you can get this down to just under 75 cents an hour.
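
The back-of-the-envelope math behind that viewer count, using the numbers above:

10 Gbps ≈ 10,000,000 Kbps
10,000,000 Kbps ÷ 750 Kbps per viewer ≈ 13,333 concurrent viewers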

In addition to Ludicrous Speed on the network I/O, this instance comes with 8 multithreaded Xeon X5570 cores (at 2.93GHz), 23GB of RAM, and 1.7TB of local storage. (A quick speed test downloaded a half-gigabyte file in about four seconds, limited by the gigabit interface at the remote server.) This is roughly equivalent to a moderately configured Dell R710. There’s also a GPU-enabled version that adds a pair of NVIDIA Tesla GPUs.

If that’s not enough, you can go bigger, with 16 cores, 60GB of memory, and 3.5TB of storage. Recently, someone clustered just over a thousand of these instances into the 42nd-largest supercomputer in the world.

As of right now, these monster instances are only available in the us-east-1 region.


Wowza Media Server V3 for Amazon EC2

Wowza V3 pre-built AMIs are now available. The DevPay licensing remains, as does the pricing. The new AMI listing can be found on the Wowza V3 for EC2 page. Wowza has also added pre-built AMIs for subscription licenses, which are priced at standard instance rates. The caveat is that on DevPay, the premium add-on modules aren’t available; if all you’re doing is what you were doing on V2, that won’t change anything for you.

The license key instances can also be used as a basis for your own custom images. The license key can either be changed manually or included in your startup packages.

Amazon EC2 and DNS

Just discovered an interesting little tidbit about using DNS from within Amazon’s EC2:

When you resolve the public DNS name of an EC2 instance from another EC2 instance, you get the internal IP address of that instance. This is useful if you have multiple EC2 instances talking to each other.

For example, in a Wowza edge/origin setup, you might put your origin on an Elastic IP for consistency in your configuration and point your edge servers at it.

Now this may seem insignificant until you remember that any traffic between EC2 instances via the public IP (elastic IP or not) is going to incur a charge of 1 cent per gigabyte. If you’ve got a large streaming setup, that can add up.

If you want to use your own domain name for your server, be sure to use a CNAME record pointing to the instance’s public DNS name rather than an A record pointing to its public IP. The A record will always return the public IP. The CNAME tells the resolver what the public DNS name is for that instance, which EC2’s nameservers will then resolve to the internal address.
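
In Route 53 terms, the two options look like this (using the same redacted addresses as the lookups below):

wms.domain.org A 50.16.XXX.YYY
wms.domain.org CNAME ec2-50-16-XXX-YYY.compute-1.amazonaws.com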

With an A record to the public (elastic) IP:

ubuntu@ip-10-245-bar-baz:~$ nslookup wms.domain.org.
Non-authoritative answer:
Name:   wms.domain.org
Address: 50.16.XXX.YYY

With a CNAME record to the public DNS name:

ubuntu@ip-10-245-bar-baz:~$ nslookup wms.domain.org.
Non-authoritative answer:
wms.domain.org      canonical name = ec2-50-16-XXX-YYY.compute-1.amazonaws.com.
Name:   ec2-50-16-XXX-YYY.compute-1.amazonaws.com
Address: 10.114.AAA.BBB