“Dale Aceptar 2012” online!

June 18th, 2012 by ger

We are very proud to announce the release of our latest project: Dale Aceptar 2012.


‘Dale Aceptar’ is part of a program run by the Sadosky Foundation aimed at fostering IT vocations. In their words:

It is an ambitious program, its objective is to interest high school students all over Argentina about the opportunities the ICT technology offers. The idea is that many students get interested in taking up their studies at universities or junior colleges within this orientation. We must remember that, currently, the ICTs sector suffers a lack of human resources, as the yearly number of graduates is unable to meet the companies’ requests.

The project itself consists of a programming and animation contest using Alice (or its Spanish counterpart, Rebecca), an innovative 3D programming environment that makes it easy to create an animation for telling a story, playing an interactive game, or sharing a video on the web. The students will have the opportunity to learn, have fun and win prizes.

We implemented this site using our Python/Django stack, which includes several preinstalled components: Memcached to improve responsiveness and performance, JS and CSS unification and minification, and Fabric to automate tasks such as deployments, backups and database migrations.

Additionally, a few Django apps were used for this project, including django-cms, which gives the client an easy way to publish content to the site, and pybb, which provides forums where users can interact with each other.

So far, almost 14,000 high school students have registered in Dale Aceptar and are currently participating in the contest. If you know any student who might be interested in an ICT-related career (perhaps without knowing it yet), invite them to participate!

Scraping websites: Having fun with node.js and jQuery

January 5th, 2012 by ger

I’ve been looking for an excuse to use node.js for a while, and the opportunity presented itself last week. While thinking about different alternatives for extracting certain information from a website, I realized it would be very useful to use jQuery selectors for this task. But, unlike the common jQuery use case, I wanted to execute this code on the server side, and I also wanted access to a database to persist the scraped data. So using jQuery within a browser was out of the question. This was when node.js appeared as a possible solution to my problem. After a bit of research I realized it was simple to use jQuery from a node.js application.

The problem was reduced to three simple steps:

  • Perform a request to the website we want to scrape.
  • Use jQuery to extract relevant information.
  • Persist the results to a database.

Each of these steps required a specific node.js module. The number of available modules is huge and choosing one is not always simple, but after a bit of testing I ended up using request, jQuery and sqlite3, all of them installed using npm.

With everything set up the implementation was very simple:

var $ = require('jQuery');
var request = require('request');
var sqlite3 = require('sqlite3');

// Open sqlite database and prepare insert statement
var db = new sqlite3.Database('sqlite3.db');
var stmt = db.prepare("insert into country(name, code, flat_image) values(?, ?, ?);");

// Perform GET request
request('http://en.wikipedia.org/wiki/ISO_3166-1', function(error, response, body) {
    if (error) { throw error; }

    // Find every country in the page:
    // Since there is no id or class to identify each country
    // we must rely on the page structure:
    // The first table with class wikitable contains the country list
    $(body).find('table.wikitable').first().find('tr').each(function(index) {
        // TR layout:
        // <tr>
        //   <td>
        //     <span class="flagicon">
        //       <img alt="" src="22px-Flag_of_Afghanistan.svg.png" width="22" height="15" class="thumbborder">&nbsp;
        //     </span>
        //     <a href="/wiki/Afghanistan" title="Afghanistan">Afghanistan</a>
        //   </td>
        //   <td><a href="/wiki/ISO_3166-1_alpha-2#AF" title="ISO 3166-1 alpha-2"><tt>AF</tt></a></td>
        //   <td><tt>AFG</tt></td>
        //   <td><tt>004</tt></td>
        //   <td><a href="/wiki/ISO_3166-2:AF" title="ISO 3166-2:AF">ISO 3166-2:AF</a></td>
        // </tr>
        var country = $(this);
        var code = $(country.children()[1]).text();
        var flag = country.find('img').attr('src');
        var name = $(country.children()[0]).text();

        console.log(name + ' (' + code + '): ' + flag);
        stmt.run(name, code, flag);
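    });

    // Every insert is queued by now: release the statement and close the database
    stmt.finalize();
    db.close();
});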

In this case we are scraping a list of countries from Wikipedia with some additional information for each one. Using jQuery we can easily obtain the list of countries (in this case a list of TR elements) and extract the name, code and flag from each TD. The selectors used and how each value is retrieved will vary from site to site but if you are familiar with jQuery it should not be hard to figure out.
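Scraped values often need a little cleanup before they are stored. The following is a minimal sketch, not part of the original script: `cleanText` and `absoluteUrl` are hypothetical helper names, and it assumes that cell text may carry non-breaking spaces (like the `&nbsp;` in the TR layout above) and that image `src` values may be relative to the page.

```javascript
// Node's built-in url module (the URL class is also a global in recent versions)
var URL = require('url').URL;

// Page the rows were scraped from; used as the base for relative links
var PAGE_URL = 'http://en.wikipedia.org/wiki/ISO_3166-1';

// Hypothetical helper: collapse runs of whitespace (including
// non-breaking spaces) and trim both ends of the scraped text
function cleanText(text) {
    return String(text).replace(/\s+/g, ' ').trim();
}

// Hypothetical helper: resolve a possibly relative or protocol-relative
// src against the page URL; returns null when the row has no image
function absoluteUrl(src, base) {
    if (!src) { return null; }
    return new URL(src, base).href;
}

console.log(cleanText('\u00a0Afghanistan\n'));
// prints "Afghanistan"
console.log(absoluteUrl('//upload.wikimedia.org/flag.png', PAGE_URL));
// prints "http://upload.wikimedia.org/flag.png"
```

In the scraper, these helpers would wrap the values read from each row just before they are passed to `stmt.run`.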

Using sqlite3 from node.js is also pretty straightforward. We just open the database, prepare the insert statement and execute it multiple times. It’s important to note that everything is asynchronous: each call only queues the insert and returns immediately. In this simple example that doesn’t really affect us. Using a different database is very similar; you’ll just have to find the right module for the job.
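If a script does need to know when all the inserts have finished, one option is to count completion callbacks. This is only a sketch under stated assumptions: `insertAll` is a hypothetical helper, and `fakeStmt` is a stand-in object so the example runs without a database. The only thing it relies on from sqlite3 is that a statement's `run` accepts a callback as its last argument.

```javascript
// Hypothetical helper: run one insert per row and call `done` exactly once,
// either after every insert has reported back or on the first error.
// `stmt` is anything with a sqlite3-style run(param..., callback) method.
function insertAll(stmt, rows, done) {
    var pending = rows.length;
    var failed = false;

    if (pending === 0) { return done(null, 0); }

    rows.forEach(function(row) {
        stmt.run(row.name, row.code, row.flag, function(err) {
            if (failed) { return; }
            if (err) { failed = true; return done(err); }
            pending -= 1;
            if (pending === 0) { done(null, rows.length); }
        });
    });
}

// Stand-in for a prepared statement (an assumption for this sketch only):
// it records the codes it receives and reports success right away.
var fakeStmt = {
    saved: [],
    run: function(name, code, flag, callback) {
        this.saved.push(code);
        callback(null);
    }
};

insertAll(fakeStmt, [
    { name: 'Afghanistan', code: 'AF', flag: 'af.png' },
    { name: 'Albania', code: 'AL', flag: 'al.png' }
], function(err, count) {
    if (err) { throw err; }
    console.log(count + ' rows inserted');
});
```

With a real prepared statement the callbacks fire later rather than immediately, but the counting logic stays the same.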

That’s it. With a few lines of code we can scrape a website using a familiar and powerful library like jQuery.

Simple MVC for PHP

January 17th, 2011 by ger

I consider myself a developer. Not a Java, Python or .NET developer. Just a developer. I don’t believe the language should have a significant impact on the kind of code you create, and that’s why I enjoy creating well-written applications in any language I happen to be using.

This is also true for PHP. Why do I mention PHP specifically? Because this is a language that has usually been associated with hard-to-understand, ugly code. I’m not going to get into this subject since a lot has already been written. Fortunately, thanks in part to frameworks like CakePHP, this idea is changing. I would strongly recommend using it for any moderately sized application. But what should you do if you need a simple two-page site and have little time to learn a new framework? I want to show you that it is easy to follow well-known practices (like MVC in this case) in PHP with little effort.

In my case, I’ve recently been working on a big PHP site developed years ago and it was actually a nightmare. Business and view logic were mixed up in ways you can’t even imagine. So when I started a small site for a marketing campaign, I wanted to do it the right way. Since I didn’t want to use CakePHP (or any other similar framework) for something so small, I decided to implement MVC by hand.

My application would consist of a simple model (plain PHP classes like User, Country, Score), several views (mainly HTML with minimal inline PHP) and the controllers. The main task here was to create a base controller to handle the interaction between the view and the model. I wanted to implement controllers like this:

class IndexController extends Controller {
    protected function get() {
        return new View('../resources/view/home.php');
    }
}

$controller = new IndexController();
$controller->start();

This is a simple controller with no logic. It just displays home.php (which is purely HTML).

A more complex case could be something like this:

class UserHomeController extends Controller {

    protected function get() {
        $user = User::getLoggedUser();
        // Default to an empty list so the view always receives a value
        $scores = array();
        if ($user != null && $user->isRegistered()) {
            $scores = Score::get($user);
        }

        return new View('../resources/view/homeuser.php', array(
            'scores' => $scores,
            'user' => $user));
    }
}

$controller = new UserHomeController();
$controller->start();

In this case we retrieve the logged-in user and their scores from the database. This information is passed to the view using the View object.

Finally, the controller for a registration form which needs to handle GET and POST requests in different ways:

class RegistrationController extends Controller {

    protected function get() {
        // Render registration form (view path assumed to follow the
        // layout used in the earlier examples)
        return new View('../resources/view/register.php');
    }

    protected function post() {
        // Persist user
        $user = new User();
        $user->name = $_POST['name'];
        $user->email = $_POST['email'];
        $user->setBirthdate($_POST['birthdate_year'], $_POST['birthdate_month'], $_POST['birthdate_day']);
        $user->registration_date = Date::now();

        if ($user->isValid()) {
            User::save($user);
        } else {
            header('HTTP/1.1 500 Internal Server Error');
            exit;
        }

        // Redirect to user home
        return new View('userhome.php', null, View::REDIRECT_ACTION);
    }
}

$controller = new RegistrationController();
$controller->start();

As you can see, the main idea is to separate logic completely from the view. Additionally, we can place common controller logic (session management, for instance) in one place.

The base Controller itself is a simple class that can be reused in any project:

/**
 * Base Controller for all pages
 * Handles session management, GET/POST requests and response rendering
 *
 * @author german
 */
class Controller {

    /**
     * This method should be called from the controller PHP to handle
     * the current request
     */
    public function start() {
        session_start();
        $this->init();

        if ($_SERVER['REQUEST_METHOD'] == 'POST') {
            $view = $this->post();
        } else {
            $view = $this->get();
        }

        if ($view != null) {
            $this->display($view);
        }
    }

    /**
     * Override this method to initialize the controller before handling
     * the request
     */
    protected function init() {
    }

    /**
     * GET request handler
     */
    protected function get() {
        $this->process();
    }

    /**
     * POST request handler
     */
    protected function post() {
        $this->process();
    }

    /**
     * Request handler. This method will be called if no method specific handler
     * is defined
     */
    protected function process() {
        throw new Exception($_SERVER['REQUEST_METHOD'] . ' request not handled');
    }

    /**
     * Populates the given object with POST data.
     * If no object is given a StdClass is created.
     * @param StdClass $obj
     * @return StdClass
     */
    protected function populateWithPost($obj = null) {
        if(!is_object($obj)) {
            $obj = new StdClass();
        }

        foreach ($_POST as $var => $value) {
            $obj->$var = trim($value); //here you can add a filter, like htmlentities ...
        }

        return $obj;
    }

    private function display($view) {
        if ($view->action == View::RENDER_ACTION) {
            $context = $view->context;
            include($view->url);
        } else if ($view->action == View::REDIRECT_ACTION) {
            header('Location: ' . $view->url);
        } else {
            throw new Exception('Unknown view action: ' . $view->action);
        }
    }
}

class View {
    const RENDER_ACTION = 'render';
    const REDIRECT_ACTION = 'redirect';

    public $url;
    public $context;
    public $action;

    public function __construct($url, $context=array(), $action=View::RENDER_ACTION) {
        $this->url = $url;
        $this->context = $context;
        $this->action = $action;
    }
}

I don’t want to keep adding code to this post, but you can imagine how the views are implemented: the controller copies the View object’s context array into a local $context variable before including the view file, so anything placed in it is available for the view to display.

As you can see, there’s really no reason to write spaghetti code in PHP (or any other language!), and even small projects can be implemented in a nice and elegant way.

This is my small contribution to eradicating that old PHP myth…

