Tribune Company's efforts to offload storage and processing to Microsoft's Azure
Editor’s note: Welcome to .NETRocks Conversations, excerpts of conversations from the .NET Rocks! weekly Internet audio talk show. Hosts Richard Campbell and Carl Franklin chat with a wide variety of .NET developer experts. This month's excerpt is from show 579, with guest Jerry Schulist, a solutions architect at Tribune Company, on how his firm is using Windows Azure to consolidate its data centers for significant cost savings.
Carl Franklin: Let me introduce our guest, Jerry Schulist. Jerry is a solutions architect at Tribune Company, where he specializes in .NET architecture and development. Today, Jerry is part of a small research and development team which focuses on innovative ways to deliver content along with contextual advertising. Welcome, Jerry, and we're talking about Azure today.
Jerry Schulist: Thank you.
Richard Campbell: So you were talking about the Tribune Company, right? Been around a long, long time, makes a whole lot of newspapers.
JS: That's right. The Tribune Company started back in the 1800s. We have eight newspapers today. In addition to that, we have 23 broadcasting stations. We own a radio station and a host of other websites and other media properties.
RC: Generally, it seems like you guys have made a jump to the web pretty thoroughly, like your particular sites—and I can't name all these different websites because there's a ton of them, but a lot of your content now is just automatically online as well as in the other mediums.
JS: Right. Our consumers are looking to consume that content from a wide range of different devices and platforms. They don't just want to consume it with the traditional means, whether that's newspaper, or radio, or television. Our goal is really to provide that content to those consumers in any way that they wish to consume it.
RC: So is Azure all about reducing cost for you guys?
JS: It's reducing cost, improving processes. It's a lot of things.
RC: So I guess the challenge here is getting to the point where once that given story is written up and so forth, you don't handle it anymore. It just shows up in the website, in the appropriate newspapers, and so forth.
JS: That's the end goal, yes. You know, produce it once, deliver it on any device and any platform that the consumer likes to view it on.
CF: I'm looking at the Microsoft case study, which we've shrinksterized at shrinkster.com/1ekd, and in there there are 2009 statistics for the Tribune Company. In 2009, you had eight newspapers, 23 television stations, 50 websites, 32 data centers, 4,000 servers, 75,000 square feet of raised floor...
RC: It sounds like air-conditioned server space.
JS: That's right.
CF: Two thousand software applications and 6.1 billion page views. Wow, that's a lot. But you can see the numbers continuously going up. Eight newspapers, 50 websites. Lots of data. So that is obviously the main thrust of your media.
JS: Right. We would like to consolidate down from those 32 data centers into three data centers... one in L.A., one in Chicago, and then we'd like the third one to be Windows Azure or just Azure in general.
CF: Now, they just announced the Azure in a box kind of thing.
RC: The Azure Appliance.
CF: The Azure Appliance, which really isn't an appliance. It's kind of a data center that you run yourself, but the boxes are all Microsoft managed. Is that part of your plan, or is this something new that has come out since?
JS: That is something new that has come out since, so we have not yet considered that. But because of the number of data centers and the overall number of servers that we have today, I think the more that we can get away from that business and just let Microsoft handle it, the better.
CF: That is the real draw of the cloud, obviously. It's to take all that stuff off your plate. Why would I want to run—just because it runs the operating system and it's managed, it's still on premises and it's still stuff that I've got to look after.
RC: So what pieces right now are you running in Azure?
JS: We're mainly using Azure for the storage aspect. We produce approximately 100GB of new content each day, and overall in a given year we're expecting that we're going to produce 100TB. Just scaling to meet that storage need is very difficult and expensive.
JS: So we're storing all our new content into Windows Azure storage. Each day we're going through an upload process as that content is being created, and we're uploading and pushing that content to Azure.
RC: In some ways it sounds like you're just replacing your SAN with this remote storage.
JS: Right. The SAN, but not necessarily just one SAN. We had SANs and other storage mechanisms at each of the different newspapers, at each of our different broadcasting stations, and really it's about that consolidation. We want to get it into this one centralized repository within Azure and then just make it available to all of our different locations and publications.
RC: Well, this gets back to that message you're talking about: being able to put a show or a story in one location and that it shows up in all the different mediums.
JS: Right. I mean, the content produced at the Chicago Tribune is not necessarily very specific to the Tribune. There are some local stories and some events that might occur that really only pertain to the residents of that area.
JS: But a lot of the content is very useable or very important to consumers outside of that very small localized area.
CF: Do you mind telling us how many instances of servers are running?
JS: Instances of which server?
CF: Of Azure. You know, Azure instances—virtual machines, if you want to call them that.
JS: Right. So we're using the worker roles out in Windows Azure to convert or to create thumbnail sizes of all the images that we upload to Windows Azure. That is really scaled throughout the day based on the number of images that are being uploaded. So we go from anywhere between probably five as the low end all the way up to potentially 50 different instances, based on the number of images that are being uploaded.
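Editor's note: The scaling Jerry describes, from about five worker instances up to potentially 50 depending on upload volume, can be pictured as a simple clamp on the image backlog. The following is a minimal sketch and not Tribune's actual scaling logic. The function name and the IMAGES_PER_INSTANCE throughput figure are assumptions made for illustration; only the bounds of 5 and 50 come from the conversation.

```python
import math

MIN_INSTANCES = 5    # low end mentioned in the interview
MAX_INSTANCES = 50   # high end mentioned in the interview

# Hypothetical figure: how many queued images one worker instance is
# expected to absorb. This number is NOT from the interview.
IMAGES_PER_INSTANCE = 200


def target_instances(pending_images: int) -> int:
    """Return how many worker instances to run for the current backlog.

    The raw estimate (backlog divided by per-instance capacity, rounded up)
    is clamped to the range [MIN_INSTANCES, MAX_INSTANCES].
    """
    if pending_images < 0:
        raise ValueError("pending_images must not be negative")
    needed = math.ceil(pending_images / IMAGES_PER_INSTANCE)
    return max(MIN_INSTANCES, min(MAX_INSTANCES, needed))
```

Under these assumed numbers, an empty queue still keeps five instances running, and a very large backlog never requests more than 50.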
RC: Now it's only Tribune employees that are uploading images, right?
JS: Today, yes, it's just Tribune-created content.
RC: You're not using the Azure engines to face the public at all.
JS: That's right, not today.
CF: So I guess, then, it's really helped with employee productivity in streamlining your worker process.
JS: Yeah, absolutely. You know, again, when one image gets uploaded, we go through and create multiple resizes of that image. There's no user interaction necessary, and there's a little bit of extra storage overhead involved in that because we're doing the pre-processing. But in the end, all of it is just ready to be consumed at that point. You know, depending on the platform that's going to be displaying it, you can just pick the appropriate size. It's already pre-generated for you, and there's no runtime cost here anymore because it's all been pre-generated.
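Editor's note: The trade-off Jerry describes, paying some extra storage up front so that nothing has to be resized at request time, can be sketched as two small steps: compute every rendition size once at upload, then have each platform pick from what already exists. This is an illustrative sketch only. The rendition names and pixel widths are hypothetical and are not the sizes Tribune uses, and the code only computes dimensions rather than touching image data.

```python
from typing import Dict, Tuple

# Hypothetical target widths in pixels; not taken from the interview.
TARGET_WIDTHS = {"thumbnail": 150, "mobile": 480, "web": 1024}


def rendition_sizes(width: int, height: int) -> Dict[str, Tuple[int, int]]:
    """Compute (width, height) for each rendition, preserving aspect ratio.

    This would run once, at upload time. Images are never upscaled.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    sizes = {}
    for name, target_width in TARGET_WIDTHS.items():
        new_width = min(width, target_width)
        new_height = max(1, round(height * new_width / width))
        sizes[name] = (new_width, new_height)
    return sizes


def pick_rendition(sizes: Dict[str, Tuple[int, int]], display_width: int) -> str:
    """Pick the narrowest pre-generated rendition that covers the display.

    Falls back to the widest rendition if none is wide enough. No resizing
    happens here; this is only a lookup over sizes that already exist.
    """
    by_width = sorted(sizes.items(), key=lambda item: item[1][0])
    for name, (rendition_width, _) in by_width:
        if rendition_width >= display_width:
            return name
    return by_width[-1][0]
```

For a 3000 by 2000 original, rendition_sizes yields a 150 by 100 thumbnail, a 480 by 320 mobile image, and a 1024 by 683 web image; a 400-pixel-wide display would then be served the "mobile" rendition.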
RC: How many images are we talking about here?
JS: Tens of thousands, hundreds of thousands. You know, per day it's going to vary, but in general the Tribune has years upon years of archives' worth of images going back all the way to the 1800s, and most of them aren't even digitized today. We only have approximately 25 years of digitized content.
JS: At some point, we will need to go through and digitize the remaining archive content, and that will just grow the overall content in the cloud exponentially.
CF: Right. You obviously had choices of S3 or Azure. Did you take a look at Amazon's offerings, and why didn't you choose them?
JS: No, we didn't take a look at Amazon and we ultimately ended up choosing Windows Azure because of our knowledge of .NET, as well as the tools and overall APIs that Windows Azure provides. The [Windows Azure] dev fabric is phenomenal. Being able to have your own localized instance of the cloud on your development machine is very, very powerful. Because of that localized instance, which is completely disconnected from the Internet, I was able to do a lot of our initial development on the plane going back and forth between our different locations.
RC: But you know, so far the only part where I could see you use AppFabric is the image resizing. Are there other pieces that you're starting to use this technology with?
JS: Well, the AppFabric or the development fabric is not limited to the worker roles. Everything is exposed, or I shouldn't say everything, but the majority of the Windows Azure offering is exposed with dev fabric, so we're able to simulate the storage, the worker queues that we're using or the Windows Azure queues. We're able to interact with the table storage. Literally you can develop almost the entire application without being connected to the production live instance of Windows Azure.
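Editor's note: The general idea behind developing offline, as Jerry describes, is that application code talks to a queue through a narrow interface, so a local stand-in can replace the cloud service during development. The sketch below illustrates that pattern in plain Python. It is not the Windows Azure dev fabric and does not use any Azure SDK; MessageQueue, InMemoryQueue, and drain are hypothetical names invented for this example.

```python
from collections import deque
from typing import Callable, Deque, Optional, Protocol


class MessageQueue(Protocol):
    """The minimal queue interface the worker code depends on (hypothetical)."""

    def put(self, message: str) -> None: ...

    def get(self) -> Optional[str]: ...


class InMemoryQueue:
    """Local stand-in for a cloud queue; needs no network connection."""

    def __init__(self) -> None:
        self._messages: Deque[str] = deque()

    def put(self, message: str) -> None:
        self._messages.append(message)

    def get(self) -> Optional[str]:
        # Return the oldest message, or None when the queue is empty.
        return self._messages.popleft() if self._messages else None


def drain(queue: MessageQueue, handle: Callable[[str], None]) -> int:
    """Process every queued message in order; return how many were handled."""
    processed = 0
    while (message := queue.get()) is not None:
        handle(message)
        processed += 1
    return processed
```

Because drain only depends on the MessageQueue interface, the same worker logic could later be pointed at a class that wraps a real cloud queue, which is the property that makes offline development practical.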
CF: With what you were doing, you must have done a cost analysis as well. Did you find that the cost for what you wanted to do was comparable?
JS: Yes. Amazon as well as Azure are very comparable to each other in terms of cost, so really again going back to our existing .NET experience and knowledge as well as the tools and so forth available through Windows Azure really made it more attractive and a better choice.
CF: OK. So as Richard was alluding to, you have this piece growing up there now. What are your plans for leaning on Azure in the future? What do they include?
JS: So today we rely pretty heavily on Akamai, and we're trying to scale that down to some degree. Windows Azure or Azure in general has a CDN that they recently made available, at least production quality. I think it is going to make a lot of sense for us to start kind of leveraging and relying on that CDN in order to make sure that our content is readily available with low latency across all our consumers in different platforms and devices that they like to retrieve it on.
CF: Yeah. So that's out of beta now.
RC: So why move away from Akamai? I mean, they are the biggest content delivery network out there.
JS: Akamai is expensive, and it's also a secondary process. We would have to figure out how to get the content out of Windows Azure over to Akamai so that it can be cached on their servers, on their edge servers. Whereas the Azure CDN, again, is able to pull directly from the Windows Azure storage accounts that we have all our content stored within and, you know, it's really pretty seamless as long as you're within the Windows Azure walls or boundary.
CF: So let's talk about savings. How much money do you expect to save, or have you already saved by moving into the cloud?
JS: Steve Gable, our CTO, recently said that we're expecting to save $1.5 million annually that we would have spent, and that's really going into the reduction of servers, the reduction of data centers, raw manpower, and everything else that's associated with maintaining our existing solution while we're moving to the cloud.
RC: But you're not there yet. You wanted to get down to three data centers... to me, it sounds like you'd save more money than that.
JS: Yeah. We're definitely not at our end goal. Those three data centers are where we would really like to be. Over the course of the next couple of years, I'm sure that's going to be consolidated down more and more as we get our content moved over to the cloud and we get more of our processes moved over to the cloud to handle rather than these on-premises servers and infrastructure that we have today.
RC: Yeah. I've got to think you've got more applications you want to move on to Azure.
JS: We have a lot of long-running applications today. We have mainframe servers, we have—you name it, everything in existence within our company.
RC: Well, yeah. I mean, you guys have been around the whole time computers have been around, so I imagine you've got every tier of computing technologies somewhere in the organization.
JS: Yeah. I think you could draw us on and hit just about every computing technology and every piece of hardware that has ever come out in the last hundred years within our company, and I think that goes back to the fact that Tribune is this older company. We've grown out of acquisition, and then we've acquired these different newspapers that had previous owners and previous solutions, and overall it's again that consolidation that we're trying to meet, this standardization of processes.
RC: To try and merge off the business units into common systems, and then they could turn off that old gear they've got.
JS: Right, yeah. Get them all onto these standardized processes and applications, phase out these older archaic mainframe systems, and it's just going to push us ahead into the future.
There's much more! Listen to or read the full interview at www.dotnetrocks.com.
Richard Campbell and Carl Franklin (both Microsoft Regional Directors) are the voices and brains behind .NET Rocks! (www.dotnetrocks.com). They interview experts to bring you insights into .NET technology and the state of software development. They’re more than dry technical interviewers—they have fun!