I have been involved in enterprise-level Java web application development for the better part of a decade at this point. I have deployed applications in the legal, education, healthcare, and business industries. I have worked with technologies ranging from JSPs and Servlets on to J2EE EJBs, Spring IoC and MVC, AspectJ, Struts, JSF, Seam, Wicket, JUnit, PowerMock, Maven, Ant, Hibernate, iBatis, Freemarker, Velocity, JNDI, JMS... The acronym-soup goes on and on. These technologies are all things that make the "Java development culture" proud!
I often find myself stunned at the choices made in this culture. Choices that make development more complex, reduce productivity, cause release delays, and just flat out waste time. I have had a lot of reasons quoted at me for why it is necessary (necessary!) to use Java for application development. These reasons range from "only Java is stable and scalable enough for us" to "that's where the developers are" and on to my personal favorite: "That's what I know, and it works great! Why should I change?" Ugh!
Why should you look at other languages and frameworks? Because most web applications developed in Java are wasteful! Nine hundred ninety-nine out of a thousand web applications don't need Java-level throughput. What's worse, most web applications developed in Java aren't even taking advantage of Java's speed, because they're mired in unnecessary frameworks doing excessive reflection. I couldn't even begin to guess how many hundreds of hours of my life I've wasted sitting and waiting for builds.
Before you say that I'm being unreasonable by lumping all Java web applications together, I will grant that there are some attempts to whittle these problems down. They just aren't what the larger community has embraced. Maybe you are the exception. Could be. If so, I admire your tenacity. Anyway, onward to what I perceive as the seven deadly sins of Java development...
Deadly Sin # 7: Cleverness
It's not easy managing Java developers. They're hard to pin down, and they spend a lot of time talking about the great architectures they are going to come up with (usually within other architectures that have already been implemented). You end up with a composition nightmare. Instead of the spaghetti code that used to result from bad coding, you get lasagna code: layer after layer of complex abstraction upon complex abstraction. One of the most important things to remember when developing is that code is twice as hard to test as it is to write. That means that if you write the most complex code you can, in the cleverest way you can, then you are, by definition, incapable of testing it thoroughly.
Deadly Sin # 6: Experience
Most Java developers show up straight out of college. Some have a background in coding, but most have done their time in internships. Usually that internship has solidified their knowledge of one particular piece of the pie that is Java, rather than broadening their horizons. Schools don't all have the same curricula, but at present they tend to preach Java. That is just plain dogmatic. It's as if they're saying that you need a bulldozer to paint a sidewalk. While you probably could do it, how much sense does that really make?
Generally speaking, it takes about three years to turn yourself into a well-rounded Java developer: to understand all the idioms, jargon, and idiosyncrasies of the language. During those three years you will be inundated with all sorts of concepts, like inversion of control and aspect-oriented programming, many of which are a direct result of the need for more dynamic language features in the code. That's just one big code smell. If you want a dynamic language, why not use one?
Deadly Sin # 5: Drive
Learning all there is to know about Java requires a certain amount of drive. It's not something you can dabble in. You either are a Java developer or you aren't. You know it or you don't. While that may be a boon in some cases, it can become difficult to see who is pulling the acronym-soup wool over your eyes. Sure, you know the words "inversion of control," but what does it do? What is its purpose? Is it right for your project? Does it make your code more or less maintainable? The party-line answer isn't good enough. That kind of drive is actually pretty rare. It also means that over the course of a Java developer's career, they are going to have to learn and relearn all sorts of technologies. The culture continues to get more clever, requiring more out of every developer who wants to be involved.
Deadly Sin # 4: Cost
This is gluttony, pure and simple. The cost to run a Java development environment is extreme. Java developers require the best machines, with the most memory. Their environment usually requires a continuous integration server, a versioning server, and more. All of these have to run on different machines, because the Java software that integrates their Java software is so bloated that it requires dedicated resources to run at acceptable speeds. Once you get past the development environment, you're probably looking at a fair-sized server farm just to run the basic application that you want. Perhaps you can get around that by running in a cloud, perhaps not. Most companies want total control, and that means that your Java apps require a massive investment before you even consider having your first real client on the system. This doesn't even take into account what Java developers consider themselves to be worth because of their lengthy college education and their ability to wade through the overly clever technologies they are working with.
Deadly Sin # 3: Time
Waiting on your IDE to load, to index, to churn through all of your code, to stop pausing while you're trying to write. Waiting on unit tests to run within your IDE. Waiting on builds. Waiting on compilation of your code. Copying new code over. Setting up Java servers and forcing them to run as you expect them to. Looking through hundreds of thousands of lines of log files to find out that one single little configuration error caused hours of downtime. These are just a few of the wastes of time in Java development. Every Java developer has lost hundreds of hours to these processes.
When you are building a web application, the mantra should be: deliver early, deliver often. Without clients, you're not getting feedback, and you're not making money. This is almost impossible when you're spending a fair amount of your time just forcing your development environment into submission, time and time again.
Deadly Sin # 2: Unproductive
Java is verbose. Extremely verbose. Not only does its syntax require five lines where one would do; the culture behind the language actively embraces the idea that more is better. If you looked at your average Java web app, you would find it replete with XML files, properties files, and javadoc blocks. Instead of running with sane configuration from the get-go, you have to define that sane baseline yourself, even for the frameworks that are supposed to ease your development cycle or provide particular functionality. You can't write self-documenting code, because you're working with something that is too low-level. This leads to a lot of time spent debugging frameworks and writing config files instead of writing feature-driven code.
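To make the five-lines-instead-of-one point concrete, here is a minimal sketch of the kind of ceremony I mean, using an everyday task: reading a text file line by line in plain (pre-Java 7) Java. The file name is hypothetical; the boilerplate is not.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class ReadFile {
    public static void main(String[] args) throws IOException {
        // A dozen lines of ceremony for a task that is a one-liner
        // in most dynamic languages.
        StringBuilder contents = new StringBuilder();
        BufferedReader reader = new BufferedReader(new FileReader("app.properties"));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                contents.append(line).append('\n');
            }
        } finally {
            reader.close(); // manual resource cleanup, no try-with-resources yet
        }
        System.out.print(contents);
    }
}
```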
And finally...
Deadly Sin # 1: Distractions
Java developers are clever. By nature they must be. The frameworks they work with all day, every day, enforce and reinforce that concept all the time. Instead of writing twenty lines of servlet code for a simple API, they'll pull in a framework. Struts, Jersey, Spring MVC. They'll waste hours setting up the config files. It will never enter their consciousness that they've just over-engineered a twenty-line program into something that requires hundreds of lines of XML.
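For reference, here is roughly what those twenty lines of servlet code look like. This is a hedged sketch of a hypothetical single-endpoint API, not anyone's production code: one class, no framework, and nothing in web.xml beyond a single servlet mapping.

```java
import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// The entire "API": answer GET /status with a JSON payload.
public class StatusServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        resp.setContentType("application/json");
        resp.getWriter().write("{\"status\":\"ok\"}");
    }
}
```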
This produces unreadable, unproductive, and unmaintainable code. A great many developers have been forced to maintain these codebases and have either become enamored of them (because they must be, to keep supporting them) or simply forced themselves to deal with it. Worse yet, most developers lose sight of the goal. They forget that their application still has constraints, because it is just a given that most companies will simply 'deal with' the extra cost of the hardware needed to run it.
Don't get lost in the mire. Know what you're developing. Know how much engineering it needs. Don't add eight layers when you only need two. Remember: keep it simple, stupid. The simpler your app is, the easier it will be for you to write the next big feature.
Stay tuned for my comparison of developing a web application in Java versus Ruby.
2011-12-13
2011-11-01
Toxic Perceptions in Development
Being a programmer in the corporate world is a somewhat arduous task. There are a lot of assumptions thrown around from within and without. Those outside the world of development tend to see development departments as something of a police-free zone, where anyone can do whatever they want without repercussions or having to adhere to deadlines. This couldn't be further from the truth.
Developers have a bit more leeway when it comes to technology, because they must. In order to do their job (you'll note this is unqualified; it's a requirement, not optional), developers must have software such as Firefox, Firebug, Eclipse, and Fiddler, plus open internet connections, and the list goes on and on. Corporate IT can't restrict them (or shouldn't), because all that will do is slow down the development cycle and force deadlines to be pushed back even further.
Deadlines come and go for development, it's true. This is the single biggest driver of all the churn and turnover in development groups. Miss a deadline, and one day you have 250 contractors on a massive project; within a month, you could be down to a hundred. Or you could go from 30 down to 15 almost overnight, with two managers stepping down. Both of these have occurred in my time with corporations. It's a given, because software development is more of an art than a science. It doesn't adhere well to dates, because understanding the "whole picture" of a project is something that very few people can do. Even when a person can, there are always unexpected hiccups and stopgap measures. Imagine a construction project suddenly having a supplier run out of building material. Timelines get pushed, and that doesn't make executives happy.
While the outside perception definitely hurts development on occasion, it's not the piece that strains developers the most. It's the perceptions from the inside.
During college, or whenever developers do their learning, they get taught certain ways of seeing problems. Many never question these perceptions. They start to feel as if there is only one right way to look at a problem, or a very limited number of ways to solve it, when that is the exact opposite of what companies are looking for. Innovation is about bringing creative, new ideas to the table. Trying new things that aren't simply extensions of old, stagnant ones.
To be clear about what I mean, here is an example: in a recent rewrite of an application, we went from a Java stack based on Struts 1, with some Hibernate thrown in, to another Java stack with Spring, iBatis, and a DMS library, with the hope that this would speed up productivity and make our code more maintainable.
The core problem in our move from one version of the application to the other is that we did not evaluate other languages and architectures that might have been better suited to developing a web application: Grails (Groovy), Ruby on Rails, Python/Django, et al. Developer preconceptions got in the way. This goes beyond just the language and patterns. The same scrutiny should extend into the broken areas of application development, where developers spend more time in XML files or property files than in actual code, and more time writing unit tests or performing builds than writing features.
I've said it before, and I'll say it again: a good developer should be able to learn new languages and new patterns quickly and easily. If they can't, I would call into question whether they fully understand application development.
2011-05-19
Quick and dirty intro to REST within HTTP
RESTful web design in terms of the HTTP protocol breaks down quite simply. It all boils down to how the URI is formed. Many systems handle this in different ways on the server, but it's simply about following the basic URI pattern:
(action) http://(host)/(resource(s))
Request Body: (description)
Action: Verb-based CRUD, where GET is retrieve, POST is create, PUT is update, and DELETE is delete.
Resource: Noun-based object. Usually found in pairs (object/id), except when creating a new resource.
Description: POST and PUT have request bodies. The body should describe what you want the object to contain, or what you want to change.
Several examples of each type of action:
- GET http://host.com/book/123abc789
- POST http://host.com/book
Request Body: {book: {name: "the super duper", contents: "once upon a midnight dreary..."}}
- DELETE http://host.com/book/987bca123
- PUT http://host.com/book/123abc789
Request Body: {book: {name: "new super"}}
Here are a couple examples of nested objects, where a shelf has rows, and a row has books:
- GET http://host.com/shelf/23a/row/5/book/21
- DELETE http://host.com/shelf/23a/row/4
It becomes pretty darn obvious exactly what you're doing when you look at URIs in this manner. The retrieved representation should have some kind of default, but you should also be able to specify a particular representation (if it makes sense for your implementation). For example:
- GET http://host.com/shelf/23a/row/5/book/21.pdf
- GET http://host.com/shelf/23a/row/5/book/21.html
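As I said above, servers handle this dispatch in different ways. Purely as a hedged illustration, here is a minimal sketch of verb-based CRUD for the /book URIs above, written as a plain Java servlet. It assumes a hypothetical /book/* servlet mapping, and the in-memory map stands in for a real persistence tier.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Verb-based CRUD for /book and /book/{id}: the URI names the
// resource, the HTTP method names the action.
public class BookServlet extends HttpServlet {
    private final Map<String, String> books = new ConcurrentHashMap<String, String>();

    // GET /book/{id} -> retrieve a representation of the resource
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        String json = books.get(id(req));
        if (json == null) {
            resp.sendError(HttpServletResponse.SC_NOT_FOUND);
            return;
        }
        resp.setContentType("application/json");
        resp.getWriter().write(json);
    }

    // POST /book -> create; the request body describes the new book
    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        String newId = UUID.randomUUID().toString();
        books.put(newId, body(req));
        resp.setStatus(HttpServletResponse.SC_CREATED);
        resp.setHeader("Location", "/book/" + newId);
    }

    // PUT /book/{id} -> update; the body describes what to change
    @Override
    protected void doPut(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        books.put(id(req), body(req));
    }

    // DELETE /book/{id} -> delete the resource
    @Override
    protected void doDelete(HttpServletRequest req, HttpServletResponse resp) {
        books.remove(id(req));
    }

    // "/123abc789" -> "123abc789"; assumes the id is present in the path
    private String id(HttpServletRequest req) {
        return req.getPathInfo().substring(1);
    }

    private String body(HttpServletRequest req) throws IOException {
        StringBuilder sb = new StringBuilder();
        BufferedReader reader = req.getReader();
        for (String line; (line = reader.readLine()) != null; ) {
            sb.append(line);
        }
        return sb.toString();
    }
}
```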
Common REST pitfalls:
- REST itself does not define any kind of API listing or discovery mechanism, so your API needs to be published clearly somewhere for people to use.
- You may not update more than one resource/object at a time with one call to the server.
While this is a very basic primer, if you follow these rules on both client and server, you should fall successfully into the pit of resource-oriented architecture. Don't be afraid to just try it out. It's a new way of organizing all of your URIs, and you're bound to try things a few different ways before finding the pattern that fits best. Everything else in REST terms just expands on the concepts introduced here, on either server code or client code.
2011-05-17
Scaling Web Applications (Aka: "Does Rails Scale?")
I have spent a lot of time talking about Ruby and Rails with people recently. When you're in that domain, you can't help but have people asking you, "So, is it true? Rails can't scale?"
It's a fair question, if somewhat naive. A lot of people have heard about the issues Twitter has had as it has become the monolithic application that it is. Those issues have been blamed squarely on Rails, even by some of Twitter's management. It's just not that simple. The kind of application that can service the number of requests they are seeing is not the kind of application that happens by accident. However, I'm not going to attempt to answer whether or not Rails can scale, because I believe the question is fundamentally flawed. Instead, I want to talk about the concept of scalability, and why your question shouldn't be "Can X scale?" but rather "How does X scale?"
Building an application is a bit like constructing a building. You have to choose the right materials, tools, and techniques. You have to plan in advance. You have to make trade-offs about durability, flexibility, and all the other -ilities. Most web sites require very little scalability, because they'll never see more than a request every ten seconds. Some may get lucky and see a hit every second. The very best may see more! Consider that a million hits per day works out to only about ten hits per second (there are 86,400 seconds in a day). That's really not all that impressive.
There are two widely accepted types of scaling: "scaling up" and "scaling out." Each has its pros and cons, but I feel it's important to define them before considering the bigger picture.
"Scaling Up" refers to the applications ability to be placed to "Big Metal"- Think old time main frames. They are applications that are meant to have one instance of the application servicing every request. This is the easiest to conceptualize. You only need one program running, and as long as you buy powerful enough hardware, you can get away with any number of requests. There is a hard limit to this kind of system, though. When you aren't parallelizing tasks, you can end up with a lot of downfalls. Such as deadlocks, stalls, and more. What happens on a hardware failure? How do you plan for that, without having a second massively expensive install? You don't. Pure and simple. It's expensive. Very expensive. But it's simple to maintain.
"Scaling Out" refers to the applications ability to be massively parallel on any number of given systems. This could be commodity level systems, on out to high powered blades (or even mainframes). It's not about the hardware. It's about the software. Run it wherever you want, they'll all cluster together. This kind of scalability requires a lot of advanced planning, and forethought to be able to run twenty applications side by side, and have them buzz along happily. This tends to be why many applications need to be reworked when they get to the point where thousands of users are accessing them regularly. But if your application is set up correctly, you can grow with it, on demand. Just by bringing up a few new servers to service more requests. Scaling out tends to be the preferred method of modern scaling needs. You don't anticipate your need, you buy hardware as you need it. Backups are only as costly as having a few extra systems standing by.
Now, take the earlier example. Instead of having to service a million requests per day, what happens when you have to service a hundred million? Or more? You're now looking at more than one thousand requests per second. The same system that can happily buzz along and handle one or two, or even ten, requests per second will no longer be capable of handling the load. It will realistically be crushed under the weight. Crushed. You didn't plan for it, so it won't be capable of it. When you build a doghouse, you don't expect it to house hundreds of people, right?
That means you need to think about how to handle that load. Build a foundation that can handle it. Pick tools and frameworks that you can vet.
Some key questions you really should be asking: How many requests per second can your system service? Will the instances talk to each other? How? Are you persisting data? If so, how many requests can your persistence tier handle? Can it scale out, too? How? Has someone else done what you're trying to do with the tools you're using? At what scale? What pitfalls did they run into? How can you avoid them?
The bottom line is... Don't fall into the Sucks/Rocks dichotomy. Especially if you haven't fully evaluated what you're talking about.
Remember: Facebook is written in PHP, YouTube in Python, and Twitter in Ruby, while Amazon's systems are written in multiple languages, as are Google's. It's not about the language. It's about how you utilize it.
2011-05-06
Parkinson's Law
It's pretty rare that I come across a saying about work that I feel compelled to share with people, but Parkinson's Law is a very important one to keep in mind in software development. What is it?
"Work expands so as to fill the time available for its completion."
This has been said in many different ways, in many different professions, and even about nature. Variations include...
"Data expands to fill the space available for storage."
"Storage requirements will increase to meet storage capacity."
"Nature abhors a vacuum."
Why does this matter? One simple reason: programmers, and all other IT people, tend to aim for the stars, and will happily spend from now until the end of eternity designing the absolutely perfect, beautiful system instead of shipping the product.
In the end, real developers ship.
Thinking in Code: JavaScript and Java
JavaScript is not Java.
Let me repeat that: JavaScript is to Java as hamster is to ham.
There is no direct relationship between the two, other than that both are programming languages: ways to instruct a computing environment to perform specific tasks. Their paradigms are completely different, and they are not built from the same kinds of features. Let me highlight some key features of each language:
JavaScript is...
- a functional language with facilities for mimicking class-based object-oriented notation, despite being prototype-based rather than class-based
- a dynamic, loosely typed, event-driven language
- typically used in the web browser, though server-side implementations are catching on
Java is...
- a mostly object oriented language
- a statically, strongly typed language
- typically used for implementing powerful server-based processes that are cross-platform
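One hedged sketch of that paradigm gap (the class and component names here are hypothetical): wiring up a button click in 2011-era Java takes an anonymous inner class implementing a single-method interface, where event-driven JavaScript does the same job in one line.

```java
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import javax.swing.JButton;
import javax.swing.JFrame;

// Wiring up a click handler, Java style: an anonymous inner class
// implementing a single-method interface. The JavaScript equivalent
// is one line:
//   button.onclick = function () { alert("clicked"); };
public class ClickDemo {
    public static void main(String[] args) {
        JFrame frame = new JFrame("Click demo");
        JButton button = new JButton("Click me");
        button.addActionListener(new ActionListener() {
            public void actionPerformed(ActionEvent e) {
                System.out.println("clicked");
            }
        });
        frame.add(button);
        frame.pack();
        frame.setVisible(true);
    }
}
```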
Why Does it Matter?
I have run across an awful lot of JavaScript code that was clearly written by a Java developer: over-architected, excessively complex code that takes hundreds of times the CPU cycles of ten lines of well-written, clear, self-documenting code. This highlights the differences between the two languages, and the two different thought processes behind them. The correct way to implement a particular feature in one programming language may not only be wrong in another, but may go against the very intention and fabric of that language. This is why companies hire programmers for a particular application based on language, rather than just hiring the best generalist they can find. Not that generalists don't have their place; by all means they do. But being well grounded in many languages takes not only a passion for coding but motivation and drive. Despite what you may believe, you're probably not such a generalist. They are few and far between, and when found, they are worth their weight in gold.