2011-05-19

Quick and dirty intro to REST within HTTP

RESTful web design over HTTP breaks down quite simply. It all boils down to how the request is formed: a verb, a URI, and sometimes a body. Servers handle this in many different ways, but it's simply about following the basic pattern:

(action) http://(host)/(resource(s))
Request Body: (description)

Action: Verb-based CRUD, where GET is retrieve, POST is create, PUT is update and DELETE is delete.
Resource: Noun-based object. Usually found in pairs (object/id) when not creating a new resource.
Description: POST and PUT have request bodies. The body should describe the details you want the object to have, or what you want to change.

Several examples of each type of action:

  • GET http://host.com/book/123abc789
  • POST http://host.com/book
    Request Body: {"book": {"name": "the super duper", "contents": "once upon a midnight dreary..."}}
  • DELETE http://host.com/book/987bca123
  • PUT http://host.com/book/123abc789
    Request Body: {"book": {"name": "new super"}}
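The verb-to-CRUD mapping above can be sketched in a few lines of Python. This is only an illustration, not a real server: the in-memory store, the counter-based IDs, the status codes returned, and the name handle are all assumptions made for the example.

```python
import itertools

# Hypothetical in-memory stand-in for whatever the server really persists to.
_store = {}                     # maps (resource, id) -> attributes
_next_id = itertools.count(1)   # simple ID generator for newly created resources


def handle(method, path, body=None):
    """Dispatch one request of the form (action) /(resource)[/(id)]."""
    parts = [p for p in path.split("/") if p]
    if not parts:
        return 404, None
    resource = parts[0]
    resource_id = parts[1] if len(parts) > 1 else None

    if method == "POST":                       # create: no id in the URI yet
        resource_id = str(next(_next_id))
        _store[(resource, resource_id)] = dict(body[resource])
        return 201, {"id": resource_id}

    key = (resource, resource_id)
    if key not in _store:
        return 404, None
    if method == "GET":                        # retrieve
        return 200, _store[key]
    if method == "PUT":                        # update only the fields supplied
        _store[key].update(body[resource])
        return 200, _store[key]
    if method == "DELETE":                     # delete
        del _store[key]
        return 204, None
    return 405, None                           # verb not part of the pattern
```

Calling handle("POST", "/book", {"book": {"name": "the super duper"}}) creates a book and returns its id, which can then be used in GET, PUT and DELETE calls against /book/(id).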

Here are a couple of examples of nested objects, where a shelf has rows, and a row has books:

  • GET http://host.com/shelf/23a/row/5/book/21
  • DELETE http://host.com/shelf/23a/row/4
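One way to read nested URIs like these on the server is to split the path into (noun, id) pairs. The function name parse_resource_path is made up for this sketch, and it assumes the host has already been stripped off so that only the path remains.

```python
def parse_resource_path(path):
    """Split '/shelf/23a/row/5/book/21' into [(noun, id), ...] pairs.

    A trailing noun with no id (as in a POST to '/book') is paired with None.
    """
    parts = [p for p in path.split("/") if p]
    pairs = []
    for i in range(0, len(parts), 2):
        noun = parts[i]
        ident = parts[i + 1] if i + 1 < len(parts) else None
        pairs.append((noun, ident))
    return pairs
```

The last pair names the object being acted on, and the pairs before it describe where that object lives.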

It becomes pretty darn obvious exactly what you're doing when you look at URIs in this manner. Every resource should have a default representation, but you should also be able to request a particular representation (if it makes sense for your implementation). For example:

  • GET http://host.com/shelf/23a/row/5/book/21.pdf
  • GET http://host.com/shelf/23a/row/5/book/21.html
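A minimal sketch of picking the representation off the end of the path follows. The default of "html" and the name split_representation are assumptions for illustration. Note that an id which itself contains a dot would be misread by this approach.

```python
import posixpath

DEFAULT_FORMAT = "html"  # assumed default; use whatever suits your implementation


def split_representation(path, default=DEFAULT_FORMAT):
    """Return (resource_path, format) for a path such as '/book/21.pdf'."""
    base, ext = posixpath.splitext(path)
    if ext:
        return base, ext[1:].lower()   # drop the leading dot
    return path, default               # no extension: fall back to the default
```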

Common REST pitfalls:

  • REST itself does not provide any kind of API listing, so the API needs to be published clearly somewhere for people to use.
  • You may not update more than one resource/object with a single call to the server.

This is a very basic primer, but if you follow these rules on both client and server, you should fall successfully into the pit of resource-oriented architecture. Don't be afraid to just try it out. It's a new way of organizing all of your URIs, and you're bound to have to try things a few different ways before finding the pattern that works best. Everything else in REST terms is just expanding on the concepts introduced here, in either server code or client code.

2011-05-17

Scaling Web Applications (Aka: "Does Rails Scale?")

I have spent a lot of time talking about Ruby and Rails with people recently. When you're in that domain, you can't help but have people asking you, "So, is it true? Rails can't scale?"

It's a fair question, if somewhat naive. A lot of people have heard about the issues that Twitter has had as it has become the monolithic application that it is. It's been blamed squarely on Rails, even by some of Twitter's management. It's just not that simple. The kind of application that will service the number of requests they are seeing is not the kind of application that happens by accident. However, I'm not going to attempt to answer whether or not Rails can scale, because I believe the question is fundamentally flawed. Instead, I want to talk about the concept of scalability, and why your question shouldn't be "Can X scale?" but rather, "How does X scale?"

Building an application is a bit like constructing a building. You have to choose the right materials, tools and techniques. You have to plan in advance. You have to make trade-offs about durability, flexibility, and all the other -ilities. Most web sites out there require very little scalability, because they'll never see more than a request every ten seconds. Some may get lucky and hit one request a second. The very best may see more! Consider that a million hits per day is only about twelve hits per second on average. That's really not all that impressive.
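The back-of-the-envelope figure is easy to check. The function name is made up for this sketch, and the result is a daily average, so real traffic will peak well above it.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_requests_per_second(requests_per_day):
    """Convert a daily request count into an average per-second rate."""
    return requests_per_day / SECONDS_PER_DAY


print(round(average_requests_per_second(1_000_000), 1))  # 11.6
```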

There are two widely accepted types of scaling: "scaling up" and "scaling out." Each has its pros and cons, but I feel it's important to define them before considering the bigger picture.

"Scaling up" refers to the application's ability to be placed on "Big Metal" (think old-time mainframes). These are applications that are meant to have one instance servicing every request. This is the easiest to conceptualize. You only need one program running, and as long as you buy powerful enough hardware, you can get away with a very large number of requests. There is a hard limit to this kind of system, though. When you aren't parallelizing tasks, you can end up with a lot of problems, such as deadlocks, stalls, and more. What happens on a hardware failure? How do you plan for that without having a second massively expensive install? You don't. Pure and simple. It's expensive. Very expensive. But it's simple to maintain.

"Scaling out" refers to the application's ability to run massively parallel across any number of systems. This could mean anything from commodity-level systems to high-powered blades (or even mainframes). It's not about the hardware. It's about the software. Run it wherever you want; the instances all cluster together. This kind of scalability requires a lot of advance planning and forethought to be able to run twenty instances of an application side by side and have them buzz along happily. This tends to be why many applications need to be reworked when they get to the point where thousands of users are accessing them regularly. But if your application is set up correctly, you can grow with it on demand, just by bringing up a few new servers to service more requests. Scaling out tends to be the preferred method for modern scaling needs. You don't have to anticipate your need; you buy hardware as you need it. Backups are only as costly as having a few extra systems standing by.

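As a rough illustration of the scaling-out idea, here is a toy round-robin pool that hands requests to identical application instances in turn and lets you add instances on demand. The class and instance names are hypothetical, and a real deployment would use a proper load balancer rather than anything like this.

```python
class RoundRobinPool:
    """Hand out identical application instances one after another."""

    def __init__(self, instances):
        self._instances = list(instances)
        self._position = 0

    def add(self, instance):
        """Bring a new instance into rotation (scaling out on demand)."""
        self._instances.append(instance)

    def next_instance(self):
        """Return the instance that should service the next request."""
        instance = self._instances[self._position % len(self._instances)]
        self._position += 1
        return instance
```

This only works if any instance can service any request, which is exactly the advance planning the paragraph above is talking about.
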
Now, taking the earlier example: Instead of having to service a million requests per day, what happens when you have to service a hundred million? Or more? You're now looking at more than one thousand requests per second. The same system that can happily buzz along and handle one or two, or even ten, requests per second will no longer be capable of handling the load. It will realistically be crushed under the weight of it. Crushed. If you didn't plan for it, it won't be capable of it. When you build a doghouse, you don't expect it to house hundreds of people, right?
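To put rough numbers on that jump, here is a sketch that estimates how many identical instances a scaled-out system would need. The per-instance capacity of ten requests per second and the 2x headroom factor are assumptions picked for illustration, not measurements of any real system.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def instances_needed(requests_per_day, per_instance_rps, headroom=2.0):
    """Estimate the instance count for a daily load.

    per_instance_rps is how many requests per second one instance can service;
    headroom multiplies the daily average to leave room for peaks.
    """
    average_rps = requests_per_day / SECONDS_PER_DAY
    return math.ceil(average_rps * headroom / per_instance_rps)


print(instances_needed(1_000_000, per_instance_rps=10))    # 3
print(instances_needed(100_000_000, per_instance_rps=10))  # 232
```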

That means that you need to think about how to handle that load. Build a foundation that can handle it. Pick tools and frameworks that you can vet.

Some key questions you really should be asking are: How many requests per second can your system service? Will the instances talk to each other? How? Are you persisting data? If you are, how many requests can your persistence tier handle? Can it scale out, too? How? Has someone else done what you're trying to do with the tools that you're using? At what scale? What pitfalls did they run into? How can you avoid them?

The bottom line is: don't fall into the Sucks/Rocks dichotomy. Especially if you haven't fully evaluated what you're talking about.

Remember: Facebook is written in PHP, YouTube is written in Python, Twitter is written in Ruby, Amazon's systems are written in multiple languages, as are Google's. It's not about the language. It's about how you utilize it.

2011-05-06

Parkinson's Law

It's pretty rare that I come across a saying about work that I feel compelled to share with people, but Parkinson's Law is a very important one to keep in mind in software development. What is it?

"Work expands so as to fill the time available for its completion."

This has been said in many different ways, in many different professions, or even in nature. Those include...

"Data expands to fill the space available for storage."
"Storage requirements will increase to meet storage capacity."
"Nature abhors a vacuum."

Why does this matter? One simple reason. Programmers, and all other IT people, tend to aim for the stars, and will happily spend from now until the end of eternity designing the absolutely perfect, beautiful system instead of shipping the product.

In the end, real developers ship.

Thinking in Code: JavaScript and Java

JavaScript is not Java.

Let me put that another way: JavaScript is to Java what Hamster is to Ham.

There is no direct relationship between them, other than the fact that both are programming languages, and so a way to instruct a computing environment to perform specific tasks. Their paradigms are completely different, and they are not built from the same kinds of features. Let me highlight some of the key features of each language:

JavaScript is...

  • a language built around first-class functions and prototype-based objects, rather than the class-based object orientation that Java uses
  • a dynamic, loosely typed language that is event driven
  • typically used in the web browser, though there are server-side implementations that are catching on

Java is...

  • a mostly object-oriented language
  • a statically, strongly typed language
  • typically used for implementing very powerful server-based processes which are cross-platform compatible

Every language has its own strengths and weaknesses, and its own appropriate way to code in it. This has given birth to many books such as "Effective Java," "Thinking in Java," "The Well-Grounded Rubyist" and "JavaScript: The Good Parts." Each of these books focuses on where its language excels. The reason there are so many is that each language has its own way of thinking. If they didn't, there wouldn't be any reason to have so many books.

Why Does it Matter?

I have run across an awful lot of JavaScript code that was clearly written by a Java developer: over-architected and excessively complex code which takes hundreds of times more CPU cycles than ten lines of well-written, clear, easy-to-use, self-documenting code would. This highlights the differences between the two languages and the thought processes behind them. The correct way to implement a particular feature in one programming language may not only be wrong in another language, but may go against the very intention and fabric of that language. This is why companies hire programmers for a particular application based on language, rather than just hiring the best generalist that they can find. Not that generalists don't have their place; by all means they do. But being well grounded in many languages takes not only a passion for coding, but motivation and drive. Despite what you may believe, you're probably not one. True generalists are few and far between, and when found, they are worth their weight in gold.