Monday, August 3, 2009

Statistic Module: Ninth and tenth weeks report

This is my blog update which describes the last two weeks of my work.

I mostly worked on supporting visualization options on the backend side, but to understand that work better, let us first look at how visualizations are handled by the frontend. As mentioned a number of times before, we use the Google Visualization API. In order to display a visualization, a script has to be passed a dataTable object, which contains the actual data. Of course, we could create the object on the client side (for example by sending raw JSON with the statistic data and parsing that string to construct the dataTable), but we decided to send dataTable objects directly from the server.

As you could see in some previous revisions, the functions responsible for that were hardcoded in the Python code. It was an easy and quick solution at the beginning of the program, but we do not want anything like this in the final version.

When I was dealing with the stats collecting functions, I added a new JSON attribute to the statistic model which contained some information on how to collect a particular statistic. This time I did a very similar thing and added a new field: chart_options.

This is also a JSON string. Its format is still changing, but the field which refers to dataTable creation is "description". It describes all columns for the visualization and is used by the dataTable constructor.
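To make the role of "description" more concrete, here is a minimal sketch of how a column description could be turned into a dataTable on the server. It is not the actual project code: the layout of chart_options (a list of [id, type, label] triples under "description") and the helper name build_data_table are assumptions made for illustration. Only the "description" key comes from this post. The output follows the JSON layout that the Google Visualization dataTable accepts ("cols" and "rows").

```python
import json

# Hypothetical chart_options value. Only the "description" key is named in
# the post; the [id, type, label] layout of each column is an assumption.
CHART_OPTIONS = json.dumps({
    "description": [
        ["degree", "string", "Degree"],
        ["students", "number", "Students"],
    ]
})


def build_data_table(chart_options, rows):
    """Return a dict in the Google Visualization dataTable JSON layout."""
    description = json.loads(chart_options)["description"]
    # One column entry per described column.
    cols = [
        {"id": col_id, "type": col_type, "label": label}
        for col_id, col_type, label in description
    ]
    # Each row is a list of cells, and each cell wraps its value under "v".
    table_rows = [{"c": [{"v": value} for value in row]} for row in rows]
    return {"cols": cols, "rows": table_rows}


table = build_data_table(
    CHART_OPTIONS,
    [["undergraduate", 1200], ["master", 100]],
)
print(json.dumps(table))
```

The point of the design is that the Python code no longer needs to know the columns of any particular statistic: changing the visualization's columns only means editing the stored JSON string.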

For the last couple of days I have been working on adding support for different visualizations for a single statistic. Some initial work had been done by Mario, who added a select list inside a widget. However, the list of options was static and contained all possible visualizations. That was not sufficient, because for example we want a GeoMap for statistics like 'Students Per Country', but certainly not for 'Students Per Degree'. The list of visualizations applicable to a single statistic should be stored in the data model. I used the chart_options field again.

Having completed that, I started to work on the next issue. In the final goals list the mentors asked for some statistics for students with projects / students without projects / all students. I added those statistics a few weeks ago, but for each statistic we ended up with three different entities. That was pretty much sufficient, but we also needed three different widgets, and I really wanted to add the possibility of displaying all three options in a single visualization.

Now all three kinds of statistics are stored in a single entity whose final_json_string has, for example, the following format:
{undergraduate: [1200, 700, 500], master: [100, 40, 60], phd: [100, 90, 10]}
The first number is the count for all students, the second for students with projects, and the last for students without projects.

So for every entity, one can define multiple virtual statistics by choosing a subset of columns. For example, we have defined 4 statistics:
Students Per Degree (all) [column 1]
Students Per Degree (with projects) [column 2]
Students Per Degree (without projects) [column 3]
Students Per Degree (cumulative) [columns 1, 2, 3]
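The column selection can be sketched roughly as follows. The data is the example from above, with the keys quoted so that it parses as JSON. The VIRTUAL_STATISTICS mapping and the select_columns helper are hypothetical names used only for this illustration, and the column indices are zero-based here, so column 1 in the list above is index 0.

```python
import json

# Example data from the post, with keys quoted so that it is valid JSON.
FINAL_JSON_STRING = (
    '{"undergraduate": [1200, 700, 500], '
    '"master": [100, 40, 60], '
    '"phd": [100, 90, 10]}'
)

# Hypothetical mapping from virtual statistic to zero-based column indices.
VIRTUAL_STATISTICS = {
    "all": [0],
    "with projects": [1],
    "without projects": [2],
    "cumulative": [0, 1, 2],
}


def select_columns(final_json_string, columns):
    """Keep only the requested columns for every key of the statistic."""
    data = json.loads(final_json_string)
    return {
        key: [values[index] for index in columns]
        for key, values in data.items()
    }


with_projects = select_columns(
    FINAL_JSON_STRING, VIRTUAL_STATISTICS["with projects"]
)
cumulative = select_columns(
    FINAL_JSON_STRING, VIRTUAL_STATISTICS["cumulative"]
)
print(with_projects)
```

With this approach a virtual statistic is only a view over the stored entity, so adding a new one does not require collecting any data again.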

What is interesting (or not :-), we can set up a different list of possible visualizations for each virtual statistic. This is important, for example, for Students Per Country. For the first three kinds we can use a GeoMap, but it is not possible to use it for the fourth one.
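As a rough illustration, the per-statistic lists could look like the sketch below. The dictionary layout, the can_display helper, and the chart names other than GeoMap are assumptions. The real lists live in chart_options, whose format is still changing.

```python
# Hypothetical lists of visualizations allowed for each virtual statistic of
# 'Students Per Country'. Only the GeoMap rule is taken from the post.
VISUALIZATIONS = {
    "all": ["Table", "GeoMap"],
    "with projects": ["Table", "GeoMap"],
    "without projects": ["Table", "GeoMap"],
    # The cumulative statistic has three data columns, so no GeoMap here.
    "cumulative": ["Table", "ColumnChart"],
}


def can_display(kind, visualization):
    """Return True if the visualization is listed for the virtual statistic."""
    return visualization in VISUALIZATIONS.get(kind, [])


print(can_display("all", "GeoMap"))
print(can_display("cumulative", "GeoMap"))
```

The frontend select list can then be filled from the list for the currently chosen virtual statistic instead of from a static list of every possible visualization.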

Recently I have worked on adding the ability to switch between kinds of virtual statistic on /statistic/show/page. That is already done, but now I need to add support for it on the dashboard page, which is a little bit more sophisticated. Anyway, I am expecting to have it done by Wednesday. That day I am also going to send new patches with Sverre's suggestions taken into account.

That is basically all from me. As usual, I also spent some time fixing some older bugs and so on, but that is not worth mentioning.
