Saturday, December 31, 2011

Common Security Mistakes in Web Applications

Web application developers today need to be skilled in a multitude of disciplines. It’s necessary to build an application that is user friendly, highly performant, accessible and secure, all while executing partially in an untrusted environment that you, the developer, have no control over. I speak, of course, about the User Agent. It is most commonly a web browser, but in reality one never really knows what’s on the other end of the HTTP connection.
There are many things to worry about when it comes to security on the Web. Is your site protected against denial of service attacks? Is your user data safe? Can your users be tricked into doing things they would not normally do? Is it possible for an attacker to pollute your database with fake data? Is it possible for an attacker to gain unauthorized access to restricted parts of your site? Unfortunately, unless we’re careful with the code we write, the answer to these questions can often be one we’d rather not hear.
We’ll skip over denial of service attacks in this article, but take a close look at the other issues. To be more conformant with standard terminology, we’ll talk about Cross-Site Scripting (XSS), Cross-Site Request Forgery (CSRF), Phishing, Shell injection and SQL injection. We’ll also assume PHP as the language of development, but the problems apply regardless of language, and solutions will be similar in other languages.

1. Cross-Site Scripting (XSS)

Cross-site scripting is an attack in which a user is tricked into executing code from an attacker’s site (say evil.com) in the context of our website (let’s call it www.mybiz.com). This is a problem regardless of what our website does, but the severity of the problem changes depending on what our users can do on the site. Let’s look at an example.
Let’s say that our site allows the user to post cute little messages for the world (or maybe only their friends) to see. We’d have code that looks something like this:
<?php
  echo "$user said $message";
?>
To read the message in from the user, we’d have code like this:
<?php
  $user = $_COOKIE['user'];
  $message = $_REQUEST['message'];
  if($message) {
     save_message($user, $message);
  }
?>
<input type="text" name="message" value="">
This works only as long as the user sticks to messages in plain text, or perhaps a few safe HTML formatting tags. We’re essentially trusting the user to only enter safe text. An attacker, though, may enter something like this:
Hi there...<script src="h++p://evil.com/bad-script.js"></script>
(Note that I’ve changed http to h++p to prevent auto-linking of the URL).
When a user views this message on their own page, their browser loads bad-script.js into that page, and the script can do anything it wants. For example, it could steal the contents of document.cookie and use that to impersonate the user and possibly send spam from their account. More subtly, it could change the contents of the HTML page to do nasty things, possibly installing malware onto the reader’s computer. Remember that bad-script.js now executes in the context of www.mybiz.com.
This happens because we’ve trusted the user more than we should. If, instead, we only allow the user to enter contents that are safe to display on the page, we prevent this form of attack. We accomplish this using PHP’s filter extension.
We can change our PHP code to the following:
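<?php
  $user = filter_input(INPUT_COOKIE, 'user',
                         FILTER_SANITIZE_SPECIAL_CHARS);
  // filter_input() reads from one source at a time, so try POST first
  // and fall back to the query string.
  $message = filter_input(INPUT_POST, 'message',
                         FILTER_SANITIZE_SPECIAL_CHARS);
  if($message === null) {
     $message = filter_input(INPUT_GET, 'message',
                         FILTER_SANITIZE_SPECIAL_CHARS);
  }
  if($message) {
     save_message($user, $message);
  }
?>
<input type="text" name="message" value="">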
Notice that we run the filter on the input and not just before output. We do this to protect against the situation where a new use case may arise in the future, or a new programmer comes in to the project, and forgets to sanitize data before printing it out. By filtering at the input layer, we ensure that we never store unsafe data. The side-effect of this is that if you have data that needs to be displayed in a non-web context (e.g. a mobile text message/pager message), then it may be unsuitably encoded. You may need further processing of the data before sending it to that context.
Now chances are that almost everything you get from the user is going to be written back to the browser at some point, so it may be best to just set the default filter to FILTER_SANITIZE_SPECIAL_CHARS by changing filter.default in your php.ini file.
PHP has many different input filters, and it’s important to use the one most relevant to your data. Very often an XSS creeps in because we use FILTER_SANITIZE_SPECIAL_CHARS when we should have used FILTER_SANITIZE_ENCODED or FILTER_SANITIZE_URL or vice-versa. You should also carefully review any code that uses something like html_entity_decode, because this could potentially open your code up for attack by undoing the encoding added by the input filter.
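To make the encoding step concrete, here is a minimal sketch of the kind of transformation a special-characters filter performs. It is written in JavaScript purely for illustration. escapeHtml is a hypothetical helper, not PHP’s implementation, and the exact entities PHP emits may differ from the ones used here.

```javascript
// Hypothetical helper, for illustration only: encode the characters that
// let text break out of an HTML text node or a quoted attribute value.
function escapeHtml(text) {
  const entities = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;',
  };
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}

// The attacker's message from the example above.
const attack = 'Hi there...<script src="h++p://evil.com/bad-script.js"></script>';

// After encoding, the browser shows the markup as text instead of running it.
console.log(escapeHtml(attack));
```

The important property is that no raw angle bracket or quote survives, so the message can no longer open a tag or close an attribute.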
If a site is open to XSS attacks, then its users’ data is not safe.

2. Cross-Site Request Forgery (CSRF)

A CSRF (sometimes abbreviated as XSRF) is an attack where a malicious site tricks our visitors into carrying out an action on our site. This can happen if a user logs in to a site that they use a lot (e.g. e-mail, Facebook, etc.), and then visits a malicious site without first logging out. If the original site is susceptible to a CSRF attack, then the malicious site can do evil things on the user’s behalf. Let’s take the same example as above.
Since our application reads in input either from POST data or from the query string, an attacker could trick our user into posting a message by including code like this on their website:
<img src="h++p://www.mybiz.com/post_message?message=Cheap+medicine+at+..."
     style="position:absolute;left:-999em;">
Now all the attacker needs to do is get users of mybiz.com to visit their site. This is fairly easily accomplished by, for example, hosting a game, or pictures of cute baby animals. When the user visits the attacker’s site, their browser sends a GET request to www.mybiz.com/post_message. Since the user is still logged in to www.mybiz.com, the browser sends along the user’s cookies, thereby posting an advertisement for cheap medicine to all the user’s friends.
Simply changing our code to only accept submissions via POST doesn’t fix the problem. The attacker can change the code to something like this:
<iframe name="pharma" style="display:none;"></iframe>
<form id="pform"
      action="h++p://www.mybiz.com/post_message"
      method="POST"
      target="pharma">
<input type="hidden" name="message" value="Cheap medicine at ...">
</form>
<script>document.getElementById('pform').submit();</script>
This will POST the form to www.mybiz.com on the user’s behalf.
The correct way to protect against a CSRF is to use a single-use token tied to the user. This token can only be issued to a signed-in user, and is based on the user’s account, a secret salt and possibly a timestamp. When the user submits the form, this token needs to be validated. This ensures that the request originated from a page that we control. This token only needs to be issued when a form submission can do something on behalf of the user, so there’s no need to use it for publicly accessible read-only data. The token is sometimes referred to as a nonce.
There are several different ways to generate a nonce. For example, have a look at the wp_create_nonce, wp_verify_nonce and wp_salt functions in the WordPress source code. A simple nonce may be generated like this:
<?php
function get_nonce() {
  // $salt (a server-side secret) and $user are assumed to be set globally
  global $salt, $user;
  return md5($salt . ":"  . $user . ":"  . ceil(time()/86400));
}
?>
The timestamp we use is the current time rounded up to a whole day (86400 seconds), so the nonce stays valid until the current one-day window ends. A form loaded shortly before the window rolls over would otherwise be rejected moments later, so the verification step can also accept the nonce computed for the previous window. We could reduce the window for more sensitive actions (like password changes or account deletion). It doesn’t make sense to have this value larger than the session timeout time.
An alternate method might be to generate the nonce without the timestamp, but store it as a session variable or in a server side database along with the time when the nonce was generated. That makes it harder for an attacker to generate the nonce by guessing the time when it was generated.
<?php
function get_nonce() {
  // $salt (a server-side secret) and $user are assumed to be set globally
  global $salt, $user;
  $nonce = md5($salt . ":"  . $user);
  $_SESSION['nonce'] = $nonce;
  $_SESSION['nonce_time'] = time();
  return $nonce;
}
?>
We use this nonce in the input form, and when the form is submitted, we regenerate the nonce or read it out of the session variable and compare it with the submitted value. If the two match, then we allow the action to go through. If the nonce has timed out since it was generated, then we reject the request.
<?php
  if(!verify_nonce($_POST['nonce'])) {
     header("HTTP/1.1 403 Forbidden", true, 403);
     exit();
  }
  // proceed normally
?>
This protects us from the CSRF attack since the attacker’s website cannot generate our nonce.
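The article does not show the verification side, so here is a rough sketch of how a time-windowed nonce could be checked. It is written in JavaScript (Node.js) for illustration only. makeNonce and verifyNonce are hypothetical names, and the use of HMAC-SHA256 is an assumption of this sketch rather than what the PHP example above does.

```javascript
const crypto = require('crypto');

const WINDOW_SECONDS = 86400; // one day, as in the example above

// Hypothetical helper: derive a token from a server-side secret, the user
// and the time window that contains nowSeconds.
function makeNonce(secret, user, nowSeconds) {
  const bucket = Math.ceil(nowSeconds / WINDOW_SECONDS);
  return crypto.createHmac('sha256', secret)
    .update(user + ':' + bucket)
    .digest('hex');
}

// Accept the token for the current window or the previous one, so that a
// form loaded just before a window boundary is still accepted.
function verifyNonce(secret, user, submitted, nowSeconds) {
  if (typeof submitted !== 'string') return false;
  const candidates = [
    makeNonce(secret, user, nowSeconds),
    makeNonce(secret, user, nowSeconds - WINDOW_SECONDS),
  ];
  const given = Buffer.from(submitted);
  return candidates.some((expected) => {
    const wanted = Buffer.from(expected);
    // timingSafeEqual requires equal lengths, so check that first.
    return wanted.length === given.length &&
      crypto.timingSafeEqual(wanted, given);
  });
}
```

With this shape a token lives for between one and two windows, which is the trade-off for not rejecting forms that straddle a boundary.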
If you don’t use a nonce, your user can be tricked into doing things they would not normally do. Note that even if you do use a nonce, you may still be susceptible to a click-jacking attack.

3. Click-Jacking

While not on the OWASP top ten list for 2010, click-jacking has gained recent fame due to attacks against Twitter and Facebook, both of which spread very quickly due to the social nature of these platforms.
Since we use a nonce, we’re protected against CSRF attacks. However, if the user is tricked into clicking the submit button themselves, then the nonce won’t protect us. In this kind of attack, the attacker includes our website in an iframe on their own website. The attacker doesn’t have control over our page, but they do control the iframe element. They use CSS to set the iframe’s opacity to 0, and then use JavaScript to move it around such that the submit button is always under the user’s mouse. This was the technique used on the Facebook Like button click-jack attack.
Frame busting appears to be the most obvious way to protect against this; however, it isn’t foolproof. For example, adding the security="restricted" attribute to an iframe will stop any frame busting code from working in Internet Explorer, and there are ways to prevent frame busting in Firefox as well.
A better way might be to make your submit button disabled by default and then use JavaScript to enable it once you’ve determined that it’s safe to do so. In our example above, we’d have code like this:
<input type="text" name="message" value="">
<input id="msg_btn" type="submit" disabled="true">
<script type="text/javascript">
if(top == self) {
   document.getElementById("msg_btn").disabled=false;
}
</script>
This way we ensure that the submit button cannot be clicked unless our page runs in a top-level window. Unfortunately, it also means that users with JavaScript disabled will be unable to click the submit button.

4. SQL Injection

In this kind of attack, the attacker exploits insufficient input validation to run SQL of their own choosing against your database. XKCD has a humorous take on SQL injection:
http://xkcd.com/327/
Let’s go back to the example we have above. In particular, let’s look at the save_message() function.
<?php
function save_message($user, $message)
{
  $sql = "INSERT INTO Messages (
            user, message
          ) VALUES (
            '$user', '$message'
          )";

  return mysql_query($sql);
}
?>
The function is oversimplified here, but it exemplifies the problem. The attacker could enter something like
test');DROP TABLE Messages;--
When this gets passed to the database, it could end up dropping the Messages table, causing you and your users a lot of grief. This kind of an attack calls attention to the attacker, but little else. It’s far more likely for an attacker to use this kind of attack to insert spammy data on behalf of other users. Consider this message instead:
test'), ('user2', 'Cheap medicine at ...'), ('user3', 'Cheap medicine at ...
Here the attacker has successfully managed to insert spammy messages into the comment streams from user2 and user3 without needing access to their accounts. The attacker could also use this to download your entire user table that possibly includes usernames, passwords and email addresses.
Fortunately, we can use prepared statements to get around this problem. In PHP, the PDO abstraction layer makes it easy to use prepared statements even if your database itself doesn’t support them. We could change our code to use PDO.
<?php
function save_message($user, $message)
{
  // $dbh is a global database handle
  global $dbh;

  $stmt = $dbh->prepare('
                     INSERT INTO Messages (
                          user, message
                     ) VALUES (
                          ?, ?
                     )');
  return $stmt->execute(array($user, $message));
}
?>
This protects us from SQL injection by correctly making sure that everything in $user goes into the user field and everything in $message goes into the message field even if it contains database meta characters.
There are cases where it’s hard to use prepared statements. For example, if you have a list of values in an IN clause. However, since our SQL statements are always generated by code, it is possible to first determine how many items need to go into the IN clause, and add as many ? placeholders instead.
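The placeholder-counting idea can be sketched as follows. This is JavaScript for illustration only, and the same shape carries over to PDO. buildInQuery is a hypothetical helper, and the SELECT statement is an assumed query against the Messages table from the example.

```javascript
// Hypothetical helper: build a parameterised IN (...) query for a list of
// values. Only the "?" placeholders are interpolated into the SQL text; the
// values themselves travel separately as bound parameters.
function buildInQuery(users) {
  if (!Array.isArray(users) || users.length === 0) {
    throw new Error('IN clause needs at least one value');
  }
  const placeholders = users.map(() => '?').join(', ');
  return {
    sql: 'SELECT message FROM Messages WHERE user IN (' + placeholders + ')',
    params: users.slice(),
  };
}

const query = buildInQuery(['user1', 'user2', "x'); DROP TABLE Messages;--"]);
console.log(query.sql);
// SELECT message FROM Messages WHERE user IN (?, ?, ?)
```

The hostile third value never touches the SQL string. It is simply the third entry in params, to be handed to the driver when the statement is executed.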

5. Shell Injection

Similar to SQL injection, the attacker tries to craft an input string to gain shell access to your web server. Once they have shell access, they could potentially do a lot more. Depending on access privileges, they could add JavaScript to your HTML pages, or gain access to other internal systems on your network.
Shell injection can take place whenever you pass untreated user input to the shell, for example by using the system() or exec() functions or the backtick (``) operator. There may be more functions depending on the language you use when building your web app.
The solution is the same as for XSS attacks. You need to validate and sanitize all user inputs appropriately for where they will be used. For data that gets written back into an HTML page, we use PHP’s filter_input() function with the FILTER_SANITIZE_SPECIAL_CHARS filter. For data that gets passed to the shell, we use the escapeshellcmd() and escapeshellarg() functions. It’s also a good idea to validate the input to make sure it only contains characters from a whitelist. Always use a whitelist instead of a blacklist. Attackers find inventive ways of getting around a blacklist.
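As a rough illustration of whitelist validation, here is a small JavaScript sketch. isSafeFilename is a hypothetical validator and the allowed character set is an assumption. Such a check complements shell escaping rather than replacing it.

```javascript
// Hypothetical validator: accept only letters, digits, dot, underscore and
// hyphen, up to 64 characters. The first character must be a letter or
// digit, so the value cannot be mistaken for a command-line option.
function isSafeFilename(value) {
  return typeof value === 'string' &&
    /^[A-Za-z0-9][A-Za-z0-9._-]{0,63}$/.test(value);
}

console.log(isSafeFilename('report-2011.txt')); // true
console.log(isSafeFilename('a; rm -rf /'));     // false
console.log(isSafeFilename('$(whoami)'));       // false
```

Because the pattern lists what is allowed, new shell metacharacters or clever encodings are rejected by default, which is exactly what a blacklist cannot promise.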
If an attacker can gain shell access to your box, all bets are off. You may need to wipe everything off that box and reimage it. If any passwords or secret keys were stored on that box (in configuration files or source code), they will need to be changed at all locations where they are used. This could prove quite costly for your organization.

6. Phishing

Phishing is the process where an attacker tricks your users into handing over their login credentials. The attacker may create a page that looks exactly like your login page, and ask the user to log in there by sending them a link via e-mail, IM, Facebook, or something similar. Since the attacker’s page looks identical to yours, the user may enter their login credentials without realizing that they’re on a malicious site. The primary method to protect your users from phishing is user training, and there are a few things that you could do for this to be effective.
  1. Always serve your login page over SSL. This requires more server resources, but it lets the user’s browser verify that the page really comes from your server and not from a malicious site it has been redirected to.
  2. Use one and only one URL for user log in, and make it short and easy to recognize. For our example website, we could use https://login.mybiz.com as our login URL. It’s important that when the user sees a login form for our website, they also see this URL in the URL bar. That trains users to be suspicious of login forms on other URLs.
  3. Do not allow partners to ask your users for the credentials they use on your site. Asking for them is known as the Password Anti-Pattern. Instead, if partners need to pull user data from your site, provide them with an OAuth based API.
  4. Alternatively, you could use something like a sign-in image that some websites are starting to use (e.g. Bank of America, Yahoo!). This is an image that the user selects on your website, that only the user and your website know about. When the user sees this image on the login page, they know that this is the right page. Note that if you use a sign-in seal, you should also use frame busting to make sure an attacker cannot embed your sign-in image page in their phishing page using an iframe.
If a user is trained to hand over their password to anyone who asks for it, then their data isn’t safe.

Summary

While we’ve covered a lot in this article, it still only skims the surface of web application security. Any developer interested in building truly secure applications has to be on top of their game at all times. Stay up to date with various security related mailing lists, and make sure all developers on your team are clued in. Sometimes it may be necessary to sacrifice features for security, but the alternative is far scarier.
Finally, I’d like to thank the Yahoo! Paranoids for all their help in writing this article.

Further Reading

  1. OWASP Top 10 security risks
  2. XSS
  3. CSRF
  4. Phishing
  5. Code injection
  6. PHP’s input filters
  7. Password anti-pattern
  8. OAuth
  9. Facebook Like button click-jacking
  10. Anti-anti frame-busting
  11. The Yahoo! Security Center also has articles on how users can protect themselves online.

Commonly Confused Bits Of jQuery

The explosion of JavaScript libraries and frameworks such as jQuery onto the front-end development scene has opened up the power of JavaScript to a far wider audience than ever before. It was born of the need — expressed by a crescendo of screaming by front-end developers who were fast running out of hair to pull out — to improve JavaScript’s somewhat primitive API, to make up for the lack of unified implementation across browsers and to make it more compact in its syntax.
All of which means that, unless you have some odd grudge against jQuery, those days are gone — you can actually get stuff done now. A script to find all links of a certain CSS class in a document and bind an event to them now requires one line of code, not 10. To power this, jQuery brings to the party its own API, featuring a host of functions, methods and syntactical peculiarities. Some are confused or appear similar to each other but actually differ in some way. This article clears up some of these confusions.

1. .parent() vs. .parents() vs. .closest()

All three of these methods are concerned with navigating upwards through the DOM, above the element(s) returned by the selector, and matching certain parents or, beyond them, ancestors. But they differ from each other in ways that make them each uniquely useful.

parent(selector)

This simply matches the one immediate parent of the element(s). It can take a selector, which can be useful for matching the parent only in certain situations. For example:
$('span#mySpan').parent().css('background', '#f90');
$('p').parent('div.large').css('background', '#f90');
The first line gives the parent of #mySpan. The second does the same for parents of all <p> tags, provided that the parent is a div and has the class large.
Tip: the ability to limit the reach of methods like the one in the second line is a common feature of jQuery. The majority of DOM manipulation methods allow you to specify a selector in this way, so it’s not unique to parent().

parents(selector)

This acts in much the same way as parent(), except that it is not restricted to just one level above the matched element(s). That is, it can return multiple ancestors. So, for example:
$('li.nav').parents('li'); //for each LI that has the class nav, go find all its parents/ancestors that are also LIs
This says that for each <li> that has the class nav, return all its parents/ancestors that are also <li>s. This could be useful in a multi-level navigation tree, like the following:
    <ul id='nav'>
        <li>Link 1
            <ul>
                <li>Sub link 1.1</li>
                <li>Sub link 1.2</li>
                <li>Sub link 1.3</li>
            </ul>
        <li>Link 2
            <ul>
                <li>Sub link 2.1

                <li>Sub link 2.2

            </ul>
        </li>
    </ul>
    Imagine we wanted to color every third-generation <li> in that tree orange. Simple:
    $('#nav li').each(function() {
        if ($(this).parents('#nav li').length == 2)
            $(this).css('color', '#f90');
    });
    This translates like so: for every <li> found in #nav (hence our each() loop), whether it’s a direct child or not, see how many <li> parents/ancestors are above it within #nav. If the number is two, then this <li> must be on level three, in which case color it.

    closest(selector)

    This is a bit of a well-kept secret, but very useful. It works like parents(), except that it returns only one parent/ancestor. In my experience, you’ll normally want to check for the existence of one particular element in an element’s ancestry, not a whole bunch of them, so I tend to use this more than parents(). Say we wanted to know whether an element was a descendant of another, however deep in the family tree:
    if ($('#element1').closest('#element2').length == 1)
        alert("Yes - #element1 is a descendant of #element2!");
    else
        alert("No - #element1 is not a descendant of #element2");
    Tip: you can simulate closest() by using parents() and limiting it to one returned element.
    $($('#element1').parents('#element2').get(0)).css('background', '#f90');
    One quirk about closest() is that traversal starts from the element(s) matched by the selector, not from its parent. This means that if the selector passed to closest() matches the element(s) it is running on, it will return the element itself. For example:
    $('div#div2').closest('div').css('background', '#f90');
    This will turn #div2 itself orange, because closest() is looking for a <div>, and the nearest <div> to #div2 is itself.
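The difference in reach between the three methods can be modelled without a browser. The following is a plain JavaScript sketch over hypothetical node objects, not jQuery’s implementation. parentOf, parentsOf and closestOf are made-up names that mirror parent(), parents() and closest().

```javascript
// Hypothetical stand-in for DOM elements: each node knows its tag, id and parent.
function node(tag, id, parent) {
  return { tag, id, parent: parent || null };
}

// Like parent(): only the immediate parent, and only if it passes the test.
function parentOf(el, test) {
  return el.parent && test(el.parent) ? [el.parent] : [];
}

// Like parents(): every matching ancestor, nearest first, never el itself.
function parentsOf(el, test) {
  const found = [];
  for (let cur = el.parent; cur; cur = cur.parent) {
    if (test(cur)) found.push(cur);
  }
  return found;
}

// Like closest(): the first match only, and the search starts at el itself.
function closestOf(el, test) {
  for (let cur = el; cur; cur = cur.parent) {
    if (test(cur)) return [cur];
  }
  return [];
}

const div1 = node('div', 'div1');
const div2 = node('div', 'div2', div1);
const span = node('span', 'mySpan', div2);
const isDiv = (el) => el.tag === 'div';

console.log(parentOf(span, isDiv).map((el) => el.id));  // [ 'div2' ]
console.log(parentsOf(span, isDiv).map((el) => el.id)); // [ 'div2', 'div1' ]
console.log(closestOf(div2, isDiv).map((el) => el.id)); // [ 'div2' ]
```

The last line shows the quirk described above: asked for the closest div, div2 finds itself.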

    2. .position() vs. .offset()

    These two are both concerned with reading the position of an element — namely the first element returned by the selector. They both return an object containing two properties, left and top, but they differ in what the returned position is relative to.
    position() calculates positioning relative to the offset parent, that is, the nearest parent or ancestor of this element that is itself positioned (for example with position: relative). If no such parent or ancestor is found, the position is calculated relative to the document (i.e. its top-left corner).
    offset(), in contrast, always calculates positioning relative to the document, regardless of the position property of the element’s parents and ancestors.
    Consider the following two <div>s:
    Hello – I’m outerDiv. I have position: relative and left: 100px
    Hi – I’m #innerDiv. I have position absolute, left: 50px and top: 80px.
    Querying (no pun intended) the offset() and position() of #innerDiv will return different results.
    var position = $('#innerDiv').position();
    var offset = $('#innerDiv').offset();
    alert("Position: left = "+position.left+", top = "+position.top+"\n"+
          "Offset: left = "+offset.left+" and top = "+offset.top
    );

    3. .css('width') and .css('height') vs. .width() and .height()

    These methods, you won’t be shocked to learn, are concerned with calculating the dimensions of an element in pixels. Both pairs return the offset dimensions, which are the genuine dimensions of the element no matter how stretched it is by its inner content.
    They differ in the data types they return: css('width') and css('height') return dimensions as strings, with px appended to the end, while width() and height() return dimensions as integers.
    There’s actually another little-known difference that concerns IE (quelle surprise!), and it’s why you should avoid the css('width') and css('height') route. It has to do with the fact that IE, when asked to read “computed” (i.e. not explicitly set) dimensions, unhelpfully returns auto. In jQuery core, width() and height() are based on the .offsetWidth and .offsetHeight properties resident in every element, which IE does read correctly.
    But if you’re working on elements with dimensions explicitly set, you don’t need to worry about that. So, if you wanted to read the width of one element and set it on another element, you’d opt for css('width'), because the value returned comes ready appended with ‘px’.
    But if you wanted to read an element’s width() with a view to performing a calculation on it, you’d be interested only in the figure; hence width() is better.
    Note that each of these can simulate the other with the help of an extra line of JavaScript, like so:
    var width = $('#someElement').width(); //returns integer
    width = width+'px'; //now it's a string like css('width') returns
    var width = $('#someElement').css('width'); //returns string
    width = parseInt(width, 10); //now it's an integer like width() returns
    Lastly, width() and height() actually have another trick up their sleeves: they can return the dimensions of the window and document. If you try this using the css() method, you’ll get an error.
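When converting a css('width') string by hand, it helps to handle the auto case mentioned above explicitly. A minimal sketch follows. pxToNumber is a hypothetical helper, not part of jQuery.

```javascript
// Hypothetical helper: turn a CSS length such as "120px" into a number,
// returning null for anything that is not a plain pixel value (e.g. "auto").
function pxToNumber(value) {
  const match = /^(-?\d+(?:\.\d+)?)px$/.exec(String(value).trim());
  return match ? parseFloat(match[1]) : null;
}

console.log(pxToNumber('120px'));  // 120
console.log(pxToNumber('auto'));   // null
console.log(parseInt('auto', 10)); // NaN
```

A bare parseInt() quietly yields NaN for auto, which then poisons any arithmetic done with it. Returning null makes the missing value easier to spot.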

    4. .click() (etc) vs. .bind() vs. .live() vs. .delegate()

    These are all concerned with binding events to elements. The differences lie in what elements they bind to and how much we can influence the event handler (or “callback”). If this sounds confusing, don’t worry. I’ll explain.

    click() (etc)

    It’s important to understand that bind() is the daddy of jQuery’s event-handling API. Most tutorials deal with events with simple-looking methods, such as click() and mouseover(), but behind the scenes these are just the lieutenants who report back to bind().
    These lieutenants, or aliases, give you quick access to bind certain event types to the elements returned by the selector. They all take one argument: a callback function to be executed when the event fires. For example:
    $('#table td').click(function() {
        alert("The TD you clicked contains '"+$(this).text()+"'");
    });
    This simply says that whenever a <td> inside #table is clicked, alert its text content.

    bind()

    We can do the same thing with bind, like so:
    $('#table td').bind('click', function() {
        alert("The TD you clicked contains '"+$(this).text()+"'");
    });
    Note that this time, the event type is passed as the first argument to bind(), with the callback as the second argument. Why would you use bind() over the simpler alias functions?
    Very often you wouldn’t. But bind() gives you more control over what happens in the event handler. It also allows you to bind more than one event at a time, by space-separating them as the first argument, like so:
    $('#table td').bind('click contextmenu', function() {
        alert("The TD you clicked contains '"+$(this).text()+"'");
    });
    Now our event fires whether we’ve clicked the <td> with the left or right button. I also mentioned that bind() gives you more control over the event handler. How does that work? It does it by passing three arguments rather than two, with argument two being a data object containing properties readable to the callback, like so:
    $('#table td').bind('click contextmenu', {message: 'hello!'}, function(e) {
        alert(e.data.message);
    });
    As you can see, we’re passing into our callback a set of variables for it to have access to, in our case the variable message.
    You might wonder why we would do this. Why not just specify any variables we want outside the callback and have our callback read those? The answer has to do with scope and closures. When asked to read a variable, JavaScript starts in the immediate scope and works outwards (this is a fundamentally different behavior to languages such as PHP). Consider the following:
    var message = 'you left clicked a TD';
    $('#table td').bind('click', function(e) {
        alert(message);
    });
    message = 'you right clicked a TD';
    $('#table td').bind('contextmenu', function(e) {
        alert(message);
    });
    No matter whether we click the <td> with the left or right mouse button, we will be told it was the right one. This is because the variable message is read by the alert() at the time of the event firing, not at the time the event was bound.
    If we give each event its own “version” of message at the time of binding the events, we solve this problem.
    $('#table td').bind('click', {message: 'You left clicked a TD'}, function(e) {
        alert(e.data.message);
    });
    $('#table td').bind('contextmenu', {message: 'You right clicked a TD'}, function(e) {
        alert(e.data.message);
    });
    Events bound with bind() and with the alias methods (.mouseover(), etc) are unbound with the unbind() method.
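The scoping behaviour described above can be reproduced without jQuery or a browser. In this sketch, createEvents is a hypothetical, much simplified stand-in for an event system. It only stores a handler with its data and calls it later.

```javascript
// Hypothetical mini event system: handlers are stored now and called later.
function createEvents() {
  const handlers = {};
  return {
    bind(type, data, handler) {
      handlers[type] = { data, handler };
    },
    fire(type) {
      const entry = handlers[type];
      return entry.handler({ type, data: entry.data });
    },
  };
}

const events = createEvents();

// Shared variable: both handlers read it when they run, not when they are bound.
let message = 'you left clicked a TD';
events.bind('click', null, () => message);
message = 'you right clicked a TD';
events.bind('contextmenu', null, () => message);
const sharedResults = [events.fire('click'), events.fire('contextmenu')];

// Per-binding data: each handler gets its own value, captured at bind time.
events.bind('click', { message: 'You left clicked a TD' }, (e) => e.data.message);
events.bind('contextmenu', { message: 'You right clicked a TD' }, (e) => e.data.message);
const dataResults = [events.fire('click'), events.fire('contextmenu')];

console.log(sharedResults); // both entries say "you right clicked a TD"
console.log(dataResults);   // one entry per button, as intended
```

The first pair of handlers both report a right click because they share one variable. The second pair behaves correctly because the data object travels with each binding.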

    live()

    This works almost exactly the same as bind() but with one crucial difference: events are bound both to current and future elements — that is, any elements that do not currently exist but which may be DOM-scripted after the document is loaded.
    Side note: DOM-scripting entails creating and manipulating elements in JavaScript. Ever notice in your Facebook profile that when you “add another employer” a field magically appears? That’s DOM-scripting, and while I won’t get into it here, it looks broadly like this:
    var newDiv = document.createElement('div');
    newDiv.appendChild(document.createTextNode('hello, world!'));
    $(newDiv).css({width: 100, height: 100, background: '#f90'});
    document.body.appendChild(newDiv);

    delegate()

    A shortfall of live() is that, unlike the vast majority of jQuery methods, it cannot be used in chaining. That is, it must be used directly on a selector, like so:
    $('#myDiv a').live('mouseover', function() {
        alert('hello');
    });
    But not…
    $('#myDiv').children('a').live('mouseover', function() {
        alert('hello');
    });
    … which will fail, as it will if you pass direct DOM elements, such as $(document.body).
    delegate(), which was developed as part of jQuery 1.4.2, goes some way to solving this problem: it is called on a context element and accepts as its first argument a selector to match within that context. For example:
    $('#myDiv').delegate('a', 'mouseover', function() {
        alert('hello');
    });
    Like live(), delegate() binds events both to current and future elements. Handlers are unbound via the undelegate() method.
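Why do delegated handlers also catch future elements? One way to picture it is that a single handler sits on the context element and inspects where each event came from. The sketch below models that idea in plain JavaScript with hypothetical node objects. It is an illustration of the concept, not jQuery’s actual implementation.

```javascript
// Hypothetical stand-in for DOM elements.
function node(tag, parent) {
  return { tag, parent: parent || null };
}

// Set up one dispatcher for `root`. When an event arrives, walk up from the
// target towards root and call the handler for the first matching element.
function delegate(root, matches, handler) {
  return function dispatch(target) {
    for (let cur = target; cur && cur !== root; cur = cur.parent) {
      if (matches(cur)) return handler(cur);
    }
    return undefined;
  };
}

const myDiv = node('div');
const onMouseover = delegate(
  myDiv,
  (el) => el.tag === 'a',
  (el) => 'hello from ' + el.tag
);

// This link is created after the handler was set up, yet it is still handled.
const laterLink = node('a', myDiv);
const textInside = node('em', laterLink);

console.log(onMouseover(textInside));       // hello from a
console.log(onMouseover(node('p', myDiv))); // undefined
```

Nothing is ever attached to the link itself, which is why it does not matter when the link was added to the page.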

    Real-Life Example

    For a real-life example, I want to stick with DOM-scripting, because this is an important part of any RIA (rich Internet application) built in JavaScript.
    Let’s imagine a flight-booking application. The user is asked to supply the names of all passengers travelling. Entered passengers appear as new rows in a table, #passengersTable, with two columns: “Name” (containing a text field for the passenger) and “Delete” (containing a button to remove the passenger’s row).
    To add a new passenger (i.e. row), the user clicks a button, #addPassenger:
    $('#addPassenger').click(function() {
        var tr = document.createElement('tr');
        var td1 = document.createElement('td');
        var input = document.createElement('input');
        input.type = 'text';
        $(td1).append(input);
        var td2 = document.createElement('td');
        var button = document.createElement('button');
        button.type = 'button';
        $(button).text('delete');
        $(td2).append(button);
        $(tr).append(td1);
        $(tr).append(td2);
        $('#passengersTable tbody').append(tr);
    });
    Notice that the event is applied to #addPassenger with click(), not live('click'), because we know this button will exist from the beginning.
    What about the event code for the “Delete” buttons to delete a passenger?
    $('#passengersTable td button').live('click', function() {
        if (confirm("Are you sure you want to delete this passenger?")) {
            $(this).closest('tr').remove();
        }
    });
    Here, we apply the event with live() because the element to which it is being bound (i.e. the button) did not exist when the page loaded; it was DOM-scripted later, by the code that adds a passenger.
    Handlers bound with live() are unbound with the die() method.
    The convenience of live() comes at a price: one of its drawbacks is that you cannot pass an object of multiple event handlers to it. Only one handler.

    5. .children() vs. .find()

    Remember how the differences between parent(), parents() and closest() really boiled down to a question of reach? So it is here.

    children()

    This returns the immediate children of an element or elements returned by a selector. As with most jQuery DOM-traversal methods, it is optionally filtered with a selector. So, if we wanted to turn orange all <td>s in a table that contain the word “dog”, we could use this:
    $('#table tr').children('td:contains(dog)').css('background', '#f90');

    find()

    This works very similarly to children(), only it looks at both children and more distant descendants. It is also often a safer bet than children().
    Say it’s your last day on a project. You need to write some code to hide all <tr>s that have the class hideMe. But some developers omit <tbody> from their table mark-up, so we need to cover all bases for the future. It would be risky to target the <tr>s like this…
    $('#table tbody tr.hideMe').hide();
    … because that would fail if there’s no <tbody>. Instead, we use find():
    $('#table').find('tr.hideMe').hide();
    This says that wherever you find a <tr> in #table with .hideMe, of whatever descendancy, hide it.

    6. .not() vs. !.is() vs. :not()

    As you’d expect from functions named “not” and “is,” these are opposites. But there’s more to it than that, and these two are not really equivalents.

    .not()

    not() returns elements that do not match its selector. For example:
    $('p').not('.someclass').css('color', '#f90');
    That turns all paragraphs that do not have the class someclass orange.

    .is()

    If, on the other hand, you want to target paragraphs that do have the class someclass, you could be forgiven for thinking that this would do it:
    $('p').is('.someclass').css('color', '#f90');
    In fact, this would cause an error, because is() does not return elements: it returns a boolean. It’s a testing function to see whether any of the chain elements match the selector.
    So when is is() useful? Well, it’s useful for asking yes/no questions about elements and their properties. See the real-life example below.
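    If the distinction between testing and filtering feels abstract, a loose analogy with native arrays may help. The sketch below uses only built-in Array methods, not jQuery; the paragraphs data and the hasSomeClass helper are made up for illustration. Roughly, is() behaves like some(), and not() like filter() with the test negated.

```javascript
// Array analogy only: these are native Array methods, not jQuery calls.
var paragraphs = [
    { text: 'first', className: 'someclass' },
    { text: 'second', className: '' }
];

// Hypothetical test standing in for the selector '.someclass'.
function hasSomeClass(p) {
    return p.className === 'someclass';
}

// Like is(): answers a yes/no question about the whole set.
var anyMatch = paragraphs.some(hasSomeClass); // true

// Like not(): returns the subset that does NOT pass the test.
var withoutClass = paragraphs.filter(function (p) {
    return !hasSomeClass(p);
});
```

    A boolean has no css() method, which is why chaining after is() fails, whereas the filtered subset can still be acted upon.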

    :not()

    :not() is the pseudo-selector equivalent of the method .not(). It performs the same job; the only difference, as with all pseudo-selectors, is that you can use it in the middle of a selector string, and jQuery’s string parser will pick it up and act on it. The following example is equivalent to our .not() example above:
    $('p:not(.someclass)').css('color', '#f90');

    Real-Life Example

    As we’ve seen, .is() is used to test, not filter, elements. Imagine we had the following sign-up form. Required fields have the class required.
    <form id='myform' method='post' action='somewhere.htm'>
        <label>Forename *
        <input type='text' class='required' /></label>
        <br />
        <label>Surname *
        <input type='text' class='required' /></label>
        <br />
        <label>Phone number
        <input type='text' /></label>
        <br />
        <label>Desired username *
        <input type='text' class='required' /></label>
        <br />
        <input type='submit' value='GO' />
    </form>
    When submitted, our script should check that no required fields were left blank. If they were, the user should be notified and the submission halted.
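    $('#myform').submit(function() {
        // Test the current value of each required field. An attribute selector
        // such as [value=""] would only see the value written in the mark-up.
        // Passing a function to is() requires jQuery 1.6 or later.
        var hasBlank = $(this).find('input.required').is(function() {
            return $.trim(this.value) === '';
        });
        if (hasBlank) {
            alert('Required fields were left blank! Please correct.');
            return false; //cancel submit event
        }
    });
    Here we’re not interested in returning elements to manipulate them, but rather just in querying their existence. Our is() part of the chain merely checks whether any required field within #myform passes its test, i.e. is empty. It returns true if it finds one, which means a required field was left blank.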

    7. .filter() vs. .each()

    These two are concerned with iteratively visiting each element returned by a selector and doing something to it.

    .each()

    each() loops over the elements, but it can be used in two ways. The first and most common involves passing a callback function as its only argument; the callback is run on each element in succession. For example:
    $('p').each(function() {
        alert($(this).text());
    });
    This visits every <p> in our document and alerts its contents.
    But each() is more than just a method for running on selectors: it can also be used to handle arrays and array-like objects. If you know PHP, think foreach(). It can do this either as a method or as a core function of jQuery. For example…
    var myarray = ['one', 'two'];
    $.each(myarray, function(key, val) {
        alert('The value at key '+key+' is '+val);
    });
    … is the same as:
    var myarray = ['one', 'two'];
    $(myarray).each(function(key, val) {
        alert('The value at key '+key+' is '+val);
    });
    That is, for each element in myarray, in our callback function its key and value will be available to read via the key and val variables, respectively. The first of the two examples is the better choice, since it makes little sense to pass an array as a jQuery selector, even if it works.
    One of the great things about this is that you can also iterate over objects — but only in the first way (i.e. $.each).
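    To make the callback signature concrete, here is a simplified plain-JavaScript stand-in for $.each that handles both arrays and objects. The helper name each is an assumption for illustration, and jQuery's own implementation differs in detail; the sketch only mirrors the (key, val) argument order and the convention that returning false from the callback stops the loop.

```javascript
// Hypothetical, simplified stand-in for $.each, written in plain JavaScript.
function each(collection, callback) {
    var i, key;
    if (Array.isArray(collection)) {
        for (i = 0; i < collection.length; i++) {
            // Returning false from the callback stops the loop early.
            if (callback.call(collection[i], i, collection[i]) === false) {
                break;
            }
        }
    } else {
        for (key in collection) {
            if (Object.prototype.hasOwnProperty.call(collection, key)) {
                if (callback.call(collection[key], key, collection[key]) === false) {
                    break;
                }
            }
        }
    }
    return collection;
}

// Iterating over an object: key is the property name, val its value.
var prices = { apple: 1, pear: 2 };
var lines = [];
each(prices, function (key, val) {
    lines.push('The value at key ' + key + ' is ' + val);
});
```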
    jQuery is known as a DOM-manipulation and effects framework, quite different in focus from other frameworks such as MooTools, but each() is an example of its occasional foray into extending JavaScript’s native API.

    .filter()

    filter(), like each(), visits each element in the chain, but this time to remove it from the chain if it doesn’t pass a certain test.
    The most common application of filter() is to pass it a selector string, just like you would specify at the start of a chain. So, the following are equivalents:
    $('p.someclass').css('color', '#f90');
    $('p').filter('.someclass').css('color', '#f90');
    In which case, why would you use the second example? The answer is, sometimes you want to affect element sets that you cannot (or don’t want to) change. For example:
    var elements = $('#someElement div ul li a');
    //hundreds of lines later...
    elements.filter('.someclass').css('color', '#f90');
    elements was set long ago, so we cannot (indeed, may not wish to) change the set it holds, but we might later want to filter it.
    filter() really comes into its own, though, when you pass it a filter function to which each element in the chain in turn is passed. Whether the function returns true or false determines whether the element stays in the chain. For example:
    $('p').filter(function() {
        return $(this).text().indexOf('hello') != -1;
    }).css('color', '#f90');
    Here, for each <p> found in the document, if it contains the string hello, turn it orange. Otherwise, don’t affect it.
    We saw above how is(), despite its name, is not the positive counterpart of not(), as you might expect. Rather, use filter() as the positive counterpart of not().
    Note also that unlike each(), filter() cannot be used on arrays and objects.

    Real-Life Example

    You might be looking at the example above, where we turned <p>s containing hello orange, and thinking, “But we could do that more simply.” You’d be right:
    $('p:contains(hello)').css('color', '#f90');
    For such a simple condition (i.e. contains hello), that’s fine. But filter() is all about letting us perform more complex or long-winded evaluations before deciding whether an element can stay in our chain.
    Imagine we had a table of CD products with four columns: artist, title, genre and price. Using some controls at the top of the page, the user stipulates that they do not want to see products for which the genre is “Country” or the price is above $10. These are two filter conditions, so we need a filter function:
    $('#productsTable tbody tr').filter(function() {
        var genre = $(this).children('td:nth-child(3)').text();
        // Keep only digits and dots so the text can be read as a number
        var price = $(this).children('td:last').text().replace(/[^\d\.]+/g, '');
        // parseFloat keeps the cents; parseInt would turn 10.50 into 10
        return genre.toLowerCase() == 'country' || parseFloat(price) > 10;
    }).hide();
    So, for each <tr> inside the table, we evaluate columns 3 and 4 (genre and price), respectively. We know the table has four columns, so we can target column 4 with the :last pseudo-selector. For each product looked at, we assign the genre and price to their own variables, just to keep things tidy.
    For the price, we strip any characters that would prevent us from treating the value as a number. If the column contained the value $14.99 and we tried to compare it with 10 as it stands, the conversion would yield NaN (not a number) because of the $ sign, and the comparison would always be false. Hence we strip away everything that is not a digit or a dot, and we use parseFloat() rather than parseInt() so that the cents are not discarded.
    Lastly, we return true (meaning the row will be hidden) if either of our conditions is met (i.e. the genre is country or the price is above $10).
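    The row test can also be tried without a table. The sketch below pulls the same logic into two plain functions; the names parsePrice and shouldHide are hypothetical, and it assumes the user's rule is strictly “above $10”, so a $10.00 product stays visible.

```javascript
// Hypothetical helpers expressing the row test without the DOM.
function parsePrice(text) {
    // Strip everything except digits and dots, then keep the decimals.
    return parseFloat(text.replace(/[^\d.]+/g, ''));
}

function shouldHide(genre, priceText) {
    // Hide Country products and anything priced strictly above 10.
    return genre.toLowerCase() === 'country' || parsePrice(priceText) > 10;
}

shouldHide('Rock', '$14.99');   // true: the price is above $10
shouldHide('Country', '$5.00'); // true: the genre is Country
shouldHide('Rock', '$10.00');   // false: exactly $10 is not above $10
```

    Keeping the test in a plain function like this also makes it easy to check edge cases such as $10.50, which parseInt() would have truncated to 10.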

    8. .merge() vs. .extend()

    Let’s finish with a foray into more advanced JavaScript and jQuery. We’ve looked at positioning, DOM manipulation and other common issues, but jQuery also provides some utilities for dealing with the native parts of JavaScript. This is not its main focus, mind you; libraries such as MooTools exist for this purpose.

    .merge()

    merge() allows you to merge the contents of two arrays into the first array. This entails permanent change for the first array. It does not make a new array; values from the second array are appended to the first:
    var arr1 = ['one', 'two'];
    var arr2 = ['three', 'four'];
    $.merge(arr1, arr2);
    After this code runs, arr1 will contain four elements, namely one, two, three, four. arr2 is unchanged. (If you’re familiar with PHP, this function is similar to array_merge(), except that array_merge() returns a new array rather than changing its first argument.)
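    The in-place behaviour is the part that tends to surprise people, so here is a plain-JavaScript sketch of it, set against the native concat(), which leaves both arrays alone. The helper name mergeInto is hypothetical, and this is an illustration of the behaviour described above rather than jQuery's source.

```javascript
// Hypothetical helper: appends the items of `second` onto `first`, in place.
function mergeInto(first, second) {
    for (var i = 0; i < second.length; i++) {
        first.push(second[i]);
    }
    return first; // the same array object that was passed in
}

var arr1 = ['one', 'two'];
var arr2 = ['three', 'four'];

// Native concat() builds a new array and changes neither input.
var copy = arr1.concat(arr2);

// The in-place merge changes arr1 itself.
var result = mergeInto(arr1, arr2);
```

    After this runs, result and arr1 are the same four-element array, copy is a separate four-element array, and arr2 still has two elements.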

    .extend()

    extend() does a similar thing, but for objects:
    var obj1 = {one: 'un', two: 'deux'};
    var obj2 = {three: 'trois', four: 'quatre'};
    $.extend(obj1, obj2);
    extend() has a little more power to it. For one thing, you can merge more than two objects — you can pass as many as you like. For another, it can merge recursively. That is, if properties of objects are themselves objects, you can ensure that they are merged, too. To do this, pass true as the first argument:
    var obj1 = {one: 'un', two: 'deux'};
    var obj2 = {three: 'trois', four: 'quatre', some_others: {five: 'cinq', six: 'six', seven: 'sept'}};
    $.extend(true, obj1, obj2);
    Covering everything about the behaviour of JavaScript objects (and how merge interacts with them) is beyond the scope of this article, but you can read more here.
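    The difference between a shallow and a recursive merge only shows up when both objects carry the same nested property. The sketch below illustrates that with two simplified plain-JavaScript helpers; shallowExtend, deepExtend and isPlainObject here are hypothetical stand-ins written for this example, and they do not reproduce every rule jQuery applies.

```javascript
// Hypothetical check: true for objects created with {} or new Object().
function isPlainObject(value) {
    return Object.prototype.toString.call(value) === '[object Object]';
}

// Shallow merge: nested objects are replaced wholesale.
function shallowExtend(target, source) {
    for (var key in source) {
        if (Object.prototype.hasOwnProperty.call(source, key)) {
            target[key] = source[key];
        }
    }
    return target;
}

// Recursive merge: nested plain objects are merged property by property.
function deepExtend(target, source) {
    for (var key in source) {
        if (!Object.prototype.hasOwnProperty.call(source, key)) {
            continue;
        }
        if (isPlainObject(source[key])) {
            if (!isPlainObject(target[key])) {
                target[key] = {};
            }
            deepExtend(target[key], source[key]);
        } else {
            target[key] = source[key];
        }
    }
    return target;
}

var overrides = { labels: { ok: 'Yes' } };
var shallow = shallowExtend({ labels: { ok: 'OK', cancel: 'Cancel' } }, overrides);
var deep = deepExtend({ labels: { ok: 'OK', cancel: 'Cancel' } }, overrides);
```

    In the shallow result, labels.cancel is gone because the whole labels object was replaced; in the deep result it survives alongside the overridden labels.ok.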
    The difference between merge() and extend() in jQuery is not the same as it is in MooTools. There, one is used to amend an existing object, while the other creates a new copy.

    There You Have It

    We’ve seen some similarities, but more often than not intricate (and occasionally major) differences. jQuery is not a language, but it deserves to be learned as one, and by learning it you will make better decisions about what methods to use in what situation.
    It should also be said that this article does not aim to be an exhaustive guide to all jQuery functions available for every situation. For DOM traversal, for example, there’s also nextUntil() and parentsUntil().
    While there are strict rules these days for writing semantic and SEO-compliant mark-up, JavaScript is still very much the playground of the developer. No one will demand that you use click() instead of bind(), but that’s not to say one isn’t a better choice than the other. It’s all about the situation.