Thursday, April 20, 2017

Learning R

A few years ago I was a very active Java developer and advocate. I loved it, helped to organise meetup events and even spent many nights answering questions in the now-defunct Java forum. I'm sure some old time Java developers here will still remember those days.
It used to be quite fun; I learnt a lot myself just by researching and helping people with their issues. You can't provide a solution if you haven't compiled and run the code yourself, which is a great way to learn.
But sometime in 2009, I discovered Python by sheer chance. In one of the many online forums, someone was asking about open source anti-virus software and whether it was any good compared to paid software.
A few people suggested that ClamWin was quite good, and it was open source. This got me thinking that it would be nice to see how real-world open source applications were developed.
I downloaded the code and studied just about every file in the project folder. That was my very first exposure to Python and it was surprisingly easy to understand the code. The C code, on the other hand, was a bit trickier and somewhat confusing, to say the least :)
This led to me learning Python and eventually getting a job as a developer using the language, while slowly using less and less Java.

However, in the last 6 months or so, I have started messing about with R and I'm blown away by how quickly you can knock things together, especially when it comes to data.

I will be adding more R code here as time goes on - this is mainly for me to document my progress and to share something that others can hopefully find useful.

Wednesday, July 01, 2015

Parsing JSON With Python

Ok, in the last couple of weeks I've been working with Google Geocoding and realised how easy it is to parse the returned JSON in Python. In fact, I had been parsing the JSON result with JavaScript, which does a very nice job, but I needed, or rather preferred, to have better control than JavaScript provides.

And this is where Python comes in. If you have been working with Python for a while, you will have noticed that a JSON object maps directly onto a Python dictionary (and a JSON array onto a list). This makes accessing the keys and values quite effortless. To give you a simple example, here's the output of a simple address query using Google's Geocoding API:,%20London%20SW1H%200BG,%20United%20Kingdom&sensor=false

{
   "results" : [
      {
         "address_components" : [
            {
               "long_name" : "8-10",
               "short_name" : "8-10",
               "types" : [ "street_number" ]
            },
            {
               "long_name" : "Broadway",
               "short_name" : "Broadway",
               "types" : [ "route" ]
            },
            {
               "long_name" : "Westminster",
               "short_name" : "Westminster",
               "types" : [ "sublocality", "political" ]
            },
            {
               "long_name" : "London",
               "short_name" : "London",
               "types" : [ "locality", "political" ]
            },
            {
               "long_name" : "Greater London",
               "short_name" : "Gt Lon",
               "types" : [ "administrative_area_level_2", "political" ]
            },
            {
               "long_name" : "United Kingdom",
               "short_name" : "GB",
               "types" : [ "country", "political" ]
            },
            {
               "long_name" : "SW1H 0BG",
               "short_name" : "SW1H 0BG",
               "types" : [ "postal_code" ]
            },
            {
               "long_name" : "London",
               "short_name" : "London",
               "types" : [ "postal_town" ]
            }
         ],
         "formatted_address" : "8-10 Broadway, Westminster, London, Greater London SW1H 0BG, UK",
         "geometry" : {
            "location" : {
               "lat" : 51.49873430,
               "lng" : -0.13312210
            },
            "location_type" : "ROOFTOP",
            "viewport" : {
               "northeast" : {
                  "lat" : 51.50008328029150,
                  "lng" : -0.1317731197084980
               },
               "southwest" : {
                  "lat" : 51.49738531970850,
                  "lng" : -0.1344710802915020
               }
            }
         },
         "types" : [ "street_address" ]
      }
   ],
   "status" : "OK"
}

What is displayed above is the result of the URL we supplied to Google's Geocoding API. Of course, we couldn't display that to our users; that would be daft. Most of the time you'd want to parse this info and use the values in your application. In our case we had to extract the latitude and longitude for our application, so there was no need to display the JSON at all. Python makes this a breeze: all we need is Python's urllib2 module to open the URL and read the result.

So, to parse that and display the full address in human-readable format, it's as simple as doing something like this:

>>> import json
>>> import urllib2
>>> j = urllib2.urlopen(',%20London%20SW1H%200BG,%20United%20Kingdom&sensor=false')
>>> js = json.load(j)

Please note that we used load() instead of loads(). You would typically use loads() for strings, whereas load() reads from a file-like object such as an open file or, as in our example, the response returned by urlopen().
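
To make the difference concrete, here is a small sketch in Python 3 syntax (the session above uses Python 2's urllib2). It does not call the API; it assumes a tiny hand-written payload and uses io.StringIO and io.BytesIO as stand-ins for an open file and for an HTTP response. Reading bytes this way relies on json.load accepting binary input, which it does from Python 3.6 onwards.

```python
import io
import json

# A small hand-written payload, not a real API response.
text = '{"status": "OK", "results": []}'

# loads() parses a string that is already in memory.
from_string = json.loads(text)

# load() reads from any object with a read() method, such as an open file.
from_file = json.load(io.StringIO(text))

# An HTTP response yields bytes; BytesIO stands in for it here.
from_response = json.load(io.BytesIO(text.encode("utf-8")))

print(from_string == from_file == from_response)
print(from_string["status"])
```

All three calls should produce the same dictionary; the only difference is where the JSON text comes from.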

Now to display the address, we simply loop through the list of address components like so:

>>> ourResult = js['results'][0]['address_components']
>>> for rs in ourResult:
...     print rs['long_name']
...
8-10
Broadway
Westminster
London
Greater London
United Kingdom
SW1H 0BG
London
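
Looping over everything is fine for display, but often you only want one piece, say the postcode. Here is a rough sketch of how that could be done in Python 3. The helper name component_by_type is my own invention for illustration, and the list is a trimmed, hand-copied subset of the response above rather than a live result.

```python
# A trimmed copy of the address_components list from the response above.
address_components = [
    {"long_name": "8-10", "short_name": "8-10", "types": ["street_number"]},
    {"long_name": "Broadway", "short_name": "Broadway", "types": ["route"]},
    {"long_name": "United Kingdom", "short_name": "GB", "types": ["country", "political"]},
    {"long_name": "SW1H 0BG", "short_name": "SW1H 0BG", "types": ["postal_code"]},
]


def component_by_type(components, wanted, key="long_name"):
    """Hypothetical helper: first component value tagged with `wanted`, or None."""
    for component in components:
        if wanted in component["types"]:
            return component[key]
    return None


postcode = component_by_type(address_components, "postal_code")
country_code = component_by_type(address_components, "country", key="short_name")
print(postcode, country_code)  # prints: SW1H 0BG GB
```

Matching on the "types" list rather than on position means the lookup does not depend on the order of the components in the response.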

With that out of the way, what about our main focus, which is to print the coordinates of a particular address? Well, it turns out that this is even simpler.
This time we just navigate our way down until we find what we're after, which in this case is the latitude and longitude.

>>> ourResult = js['results'][0]['geometry']['location']
>>> print ourResult['lat'], ourResult['lng']
51.4987343 -0.1331221
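
Indexing straight into js['results'][0] will blow up if the query finds nothing, so in a real application it may be worth wrapping the lookup. Below is a minimal sketch in Python 3, assuming the response has the shape shown earlier. The function name extract_lat_lng and the sample dictionaries are made up for illustration; ZERO_RESULTS is the status value the Geocoding API uses for an empty result set.

```python
def extract_lat_lng(response):
    """Return (lat, lng) for the first result, or None if there is none."""
    if response.get("status") != "OK" or not response.get("results"):
        return None
    location = response["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


# Hand-written sample with only the keys the function needs.
sample = {
    "results": [
        {"geometry": {"location": {"lat": 51.4987343, "lng": -0.1331221}}}
    ],
    "status": "OK",
}

print(extract_lat_lng(sample))  # prints: (51.4987343, -0.1331221)
print(extract_lat_lng({"results": [], "status": "ZERO_RESULTS"}))  # prints: None
```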

One other cool thing about parsing JSON with Python is that you can quickly and easily map the results into your Django model and do anything Djangoey with it. Here's an example:

import json

# request.body holds the raw JSON payload (request.POST is for form data)
objs = json.loads(request.body)
# Iterate through the stuff in the list
for o in objs:
    # Do something Djangoey with o's name and message, like
    record = myDjangoModel(name=o['name'], message=o['message'])

Now, if you are looking to mix Python, JSON and Django together - it is as simple as what we've just shown above.
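
If you want to try the mapping idea without a Django project to hand, here is a rough stand-alone sketch in Python 3. The Message dataclass is a hypothetical stand-in for a Django model with name and message fields, and the JSON body is hand-written. Note that each decoded item is a plain dictionary, so the fields are read with o["name"] rather than o.name.

```python
import json
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical stand-in for a Django model with two fields."""
    name: str
    message: str


# Hand-written request body: a JSON array of objects.
body = '[{"name": "Ada", "message": "hello"}, {"name": "Bob", "message": "hi"}]'

# Each decoded item is a dict, so fields are looked up by key.
records = [Message(name=o["name"], message=o["message"]) for o in json.loads(body)]
for record in records:
    print(record.name, record.message)
```

With a real model you would then save each record, but the parsing and mapping step stays the same.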

Sunday, January 25, 2015

Testing Python: Applying Unit Testing, TDD, BDD and Acceptance Testing

This is a long overdue book. Python has been steadily growing in popularity in multiple industry verticals over the past decade and the different sub-cultures have brought a variety of programming styles and methodologies to the language. As more and more Python code is powering today's production and mission-critical systems, it becomes essential to ensure quality and reliability while maintaining efficient and agile development practices.

There has been a flurry of literature written in recent years about the overall benefits of agile methodologies and test-driven development. To a large extent, however, individual developers and teams have been left on their own to figure out how to put this philosophy into practice and extract maximum benefit from it in their trade.

David Sale's book alleviates this issue for Python developers by providing extremely pragmatic, timely and well-organized guidelines as well as many specific examples on how to approach the development and testing of complex systems in a way that amounts to building better software. This is not an academic work - it focuses exclusively on practical solutions to real-world industry problems.

Fairly short and to the point, the book mixes equal portions of justification and tutorial-style examples to work through.

The author has picked some of the best Python tools that are currently available for the various tasks at hand, but the reader will be well equipped to make their own choices once they understand the basic concepts.