rtyler

Scaling, with your "smart platform choice"

At times I feel as if I am plugged directly into the internet, almost like an NSA wiretap on AT&T's backbone, silently sniffing along reading packets until something throws up a red flag. This applies especially to Python and .NET/Mono related blog posts, so when a fellow I know, Chris Messina, posted something titled "WordPressMU: Making a smart platform choice", it not surprisingly threw up a red flag. Chris and I tend not to see eye-to-eye on a lot of things, most notably microformats, Ruby on Rails, and some of the other "Web 2.0" style technologies/ideologies that Chris has embraced while I stand back and look on, casually remarking "OMGWTFBBQ" every now and then.

Chris opens the post with the following, in reference to a client of his:
Their current website is built in .NET and they’re getting to the point where things are about to start getting set in stone in terms of scaling and overall architecture and it kinda freaked me out that they’d continue down this path using a platform that I think offers little when it comes to organic community-building or much in the way of “doing web things right”.


Disclaimer: I'm not an expert on scaling, I just get yelled at when my code doesn't scale

Chris goes on to mention Ruby on Rails, Django, and WordPressMU, deciding on the third as the best option for building a people-powered Web 2.0 community. Some of the reasons given are employment, open source, web standards, community, scalability, politics, and a few others that don't matter. While Ruby on Rails, Django on mod_python, and WordPressMU on PHP are all good platforms to build upon, his outright dismissal of .NET (and in turn Mono) is unfounded and, in most cases, blatantly incorrect.

Examples


Looking at some popular sites around the internet, you can get a feel for exactly what it takes to scale:
  • MySpace
    • OS: Windows Server
    • Platform: ColdFusion/.NET
    • Database: MS SQL
    Originally, they were able to scale with ColdFusion, and have since switched over to .NET, making MySpace one of the largest sites running .NET, with one of the largest MS-SQL installations on the planet.
  • Facebook
    • OS: Linux
    • Platform: PHP
    • Database: MySQL
    Facebook has scaled with a combination of MySQL and PHP, with a good amount of customization of their internal build of PHP, and memcached running to ensure database calls are kept to a minimum.
  • Slide
    • OS: Linux
    • Platform: Python
    • Database: MySQL

    Okay, I'm throwing us in there for fun, but we've scaled with MySQL and mod_python with a hefty dose of secret sauce :)
  • Yahoo!
    • OS: FreeBSD
    • Platform: PHP
    • Database: MySQL
    Yahoo! obviously has to scale to serve a gigantic portion of the internet, and they're running PHP and MySQL, with C++ extensions written where they need to be in order to achieve extra speed.
  • Wikipedia
    • OS: Linux
    • Platform: PHP
    • Database: MySQL
    Wikipedia.org has scaled with MySQL and PHP, along with some Lucene indexing servers and memcached servers running around to improve read times from their database servers.

  • Microsoft
    • OS: Windows Server
    • Platform: .NET
    • Database: MS SQL
    Microsoft tends to eat their own dog food with most of their web sites and portals, running on .NET with MS-SQL, and according to Quantcast have 3 of the top 5 sites on the internet.
  • Apple
    • OS: Linux/Solaris/Mac OS X
    • Platform: WebObjects/PHP
    • Database: MySQL
    Apple runs a mix of WebObjects (Java), PHP, and god knows what else on varying platforms (Solaris, Linux, Mac OS X). I'm not 100% on exactly what's going on inside the web team at 1 Infinite Loop anymore, but they seem to be able to scale already with what they've got.

Counter-points


Employment
One of the points made is that it's easier to find PHP developers than Python or Ruby developers, which is probably true. .NET developers, however, are definitely going to be even easier to find, and in any case most developers worth your employment, especially at a startup, are going to need to be able to pick up new frameworks and technologies quickly.

Open Source
I will agree that having an open source platform to build on is a good idea, but it's certainly not a deal breaker for .NET, or whichever platform you choose to use. Starting a community, or a web business in general, doesn't matter if you can't get your product out the door as soon as possible. Nobody cares how "open" you are in your development process if you can't ship.

Web standards
Citing Channel 9 as an example of how (somehow) the .NET platform doesn't adhere to web standards and "open data formats" is one of the most ludicrous arguments I've ever seen. You can generate valid JSON, XML, SOAP, and XHTML from any platform, even mod_perl!

Community
Chris makes the argument that somehow his experience in dealing with the WordPress community extrapolates to developing an actual product, and that if you're going to build a community-oriented site, you had better use a platform that your community will approve of! Hint: it doesn't matter. 95% of your users probably won't care what you run, as long as they have the product to use.

Economics
The points about economics are certainly valid, in that it's far easier to find hosts and sysadmins familiar with PHP than with Rails or Django (Python). That doesn't mean that it scales, however. Once you get past a couple of million users, you need people who know what they're doing, with dedicated hardware, to help your web application scale.

Scalability
Talking about how scaling "feels" is absolutely absurd. You will feel pain; that's what happens when you have to scale. Standing back, looking at the code you've worked insanely hard on, and trying to figure out how to make it faster is painful, regardless of platform. If I've consulted with somebody about how to scale my architecture and they say "well, that doesn't feel right" without citing sources, strategies, or reasons, I'm going to find somebody else, or I'm going to fall on my face when the time comes to scale.

Politics
Just a quote:
However, I think people familiar with modern web design would agree with me that WordPress/PHP, Django or Rails are all superior choices over .NET when it come to the politics of technology development. In terms of openness, being forward-thinking and in terms of community outlook, any of these choices are going to net you a very different kind of response. Being keen to what each choice says about you is key to making a wise decision.
The first error is asking people involved in web design how you want to scale; you should probably ask people involved in systems architecture. Politics don't exist when you need to scale or ship product, it's that simple. What matters is what gets the job done the fastest, with the greatest net result.

End game


Chris' general ignorance of some of the features of ASP.NET 2.0, and his zealotry when it comes to buzzwords like "community, forward-thinking, people-powered, Web 2.0" and their ilk, don't surprise me, since he's not a developer. For example, in ASP.NET 2.0 you can have asynchronous pages: just as interlaced GIF images load progressively, you can have pages that load progressively, instead of needing the server to generate the entire page before it's piped back to the client. Of course, none of this matters, since architecture is the biggest hurdle when it comes to scaling, not platform. The more important questions to answer before you toss out your existing code base in favor of a more buzzword-compliant platform are:
  • How can you more efficiently handle database queries?
  • Can you cut out unnecessary database queries?
  • Can you switch over to a newer version (MySQL 4 and 5 I'm looking at you) of your database to improve performance?
  • Will it be effective to add a caching solution like memcached between your web farm and your database servers?
  • What can be relegated to progressive page loads, either via asynchronous pages in ASP.NET 2.0, or through the use of AJAX back to your web servers to retrieve more data instead of forcing a new page load?
  • Is this problem simply caused by not having enough servers?
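
To make the caching question above a bit more concrete, here is a minimal cache-aside sketch in Python. Everything in it is made up for illustration: TTLCache is an in-process stand-in for a memcached client, load_profile_from_db pretends to be an expensive query, and the "profile:<id>" key scheme and 60 second expiry are arbitrary assumptions. It shows the shape of the idea, not anybody's production setup.

```python
import time

# Counts how many times the fake "database" is hit (illustration only).
DB_STATS = {"queries": 0}


class TTLCache:
    """In-process stand-in for a memcached client: get/set with expiry."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)


def load_profile_from_db(user_id):
    """Hypothetical expensive query; here it only builds a dict."""
    DB_STATS["queries"] += 1
    return {"id": user_id, "name": "user%d" % user_id}


def get_profile(cache, user_id, ttl_seconds=60):
    """Cache-aside read: try the cache first, fall back to the database."""
    key = "profile:%d" % user_id
    profile = cache.get(key)
    if profile is None:
        profile = load_profile_from_db(user_id)
        cache.set(key, profile, ttl_seconds)
    return profile
```

Calling get_profile twice for the same user hits the fake database only once. The hard part in real life is deciding when cached data is allowed to be stale, which is exactly the kind of case-by-case architecture decision that no platform makes for you.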


Citing zero empirical evidence, not counting some useless benchmarks (scaling is far more case-by-case than doing benchmarkable operations), and going with whatever "the cool kids" are using is the quick road to failure. All of the sites I mentioned above have people whose job is to sit around all day and figure out how to squeeze more performance out of their architecture and help the sites grow with their userbase. Optimization is rarely about a complete rewrite, or any one trick; it's about finding where the bottlenecks are and doing whatever is possible to minimize them.
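
Finding the bottleneck can start as simply as timing the stages of a request and looking at which one dominates. The sketch below is only an assumption of how that might look in Python: timed, handle_request, and slowest are hypothetical names, and the sleep calls stand in for a slow query and a template render.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Total seconds spent per labelled stage, accumulated across requests.
timings = defaultdict(float)


@contextmanager
def timed(label):
    """Add the wall-clock time of the enclosed block to timings[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] += time.perf_counter() - start


def handle_request():
    """Hypothetical request handler with two instrumented stages."""
    with timed("database"):
        time.sleep(0.05)   # stand-in for a slow query
    with timed("template"):
        time.sleep(0.005)  # stand-in for rendering the page


def slowest(n=1):
    """Return the n stages with the largest accumulated time."""
    return sorted(timings.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

After a few calls to handle_request, slowest() points at the "database" stage, which is where the effort should go first. Real profiling tools give far more detail, but the principle is the same: measure, then optimize.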

As a final note, all the platforms referenced above just spit out pages. That's it. It's how you form the output that determines how "community friendly" or aesthetically pleasing the final product is. It's all fair game between the <html> tags :)