Jan 17 2012

Snow Leopard and Qt/PyQt 4.8.x won't work

If you try to install the latest version of Qt (4.8.x), even with Homebrew, you may end up with a surprise like this:

    ImportError: dlopen(/usr/local/lib/python/PyQt4/QtWebKit.so, 2): Symbol not found: _kCFWebServicesProviderDefaultDisplayNameKey
      Referenced from: /Library/Frameworks/QtWebKit.framework/Versions/4/QtWebKit
      Expected in: /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation

This comes from an issue in Qt itself that doesn't seem likely to be resolved anytime soon. So you're warned: you have no choice but to revert to 4.7.x. Unless you want to buy Lion :)

Vale


Jan 15 2012

Handle Celery-dependent tests in Django and with django-jenkins

So one of these days you're going to realize that you need tests, and that "maybe" you also need to test components that depend on several Celery tasks.

Well, to help you make that day more productive and less painful, here are a few tips.

First, to make it work with django-celery, a short but good piece of documentation is available here: http://ask.github.com/django-celery/cookbook/unit-testing.html. It may seem too small, but it actually is enough. To sum up, all you need to make it work with ./manage.py test is to change the test runner to:

    TEST_RUNNER = 'djcelery.contrib.test_runner.CeleryTestSuiteRunner'

But that's not enough. Soon, just when you think everything's over, you'll want to deploy and run your tests through Jenkins. I don't know about you, but I find the django-jenkins project quite good and I use it every day. It has one flaw, though: it already has its own designated test runner, so if you try to execute your tests through ./manage.py jenkins, it won't work and you'll get at least a Connection Refused error.

Here's how to fix this. The documentation says you may replace the test runner, but the class you use needs to inherit from django_jenkins.runner.CITestSuiteRunner, and if you want the Celery tasks to work you need djcelery.contrib.test_runner.CeleryTestSuiteRunner.

Fair enough, here's what I did. I created a new class:

    from django_jenkins.runner import CITestSuiteRunner
    from djcelery.contrib.test_runner import CeleryTestSuiteRunner

    class MixedInTestRunner(CITestSuiteRunner, CeleryTestSuiteRunner):
        pass

Why not? I then used it as the new test runner by pointing the JENKINS_TEST_RUNNER setting at the newly created class.
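
As a minimal sketch, the settings could look like the fragment below. Only the two setting names and the class name come from the text above; the module path myproject.test_runner is a hypothetical location for the class and has to be adapted to your project.

```python
# settings.py (fragment)

# Runner used by ./manage.py test, as shown earlier.
TEST_RUNNER = 'djcelery.contrib.test_runner.CeleryTestSuiteRunner'

# Runner used by ./manage.py jenkins: the dotted path to the mixed-in class.
# 'myproject.test_runner' is a hypothetical module path, not a real package.
JENKINS_TEST_RUNNER = 'myproject.test_runner.MixedInTestRunner'
```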

And voilà.


Sep 24 2011

Power and control

In software development, as in many martial arts, you can become strong fairly quickly. You can set yourself goals (a belt, a victory / mastering a technology or finishing a personal project) and reach them quickly, depending on the language, the teacher and how much you put into it.

Some languages, just like some martial arts, push toward efficiency, rigor or quick results. In Tae Kwon Do or Jujitsu, the emphasis tends to be on speed of execution, and results come fairly fast. To put it in the usual terms: the learning curve is not very steep, and you can get imperfect results right from the start. In Shotokan Karate, on the other hand, the emphasis tends to be on a long and arid apprenticeship based on kata; you only have to watch The Karate Kid (the old version) to understand that there are more thrilling ways to learn the basics...

The parallel with programming is easy to see: scripting languages (Python, PHP, ...) focus more on results and execution, while historical languages like Java/C++ are more oriented toward rigor.

My point is this: power is acquired quickly in both computing and martial arts, but with growth and work it gets replaced by mastery of our art. Let me explain. Taking programming this time: at first you start with the language, only the language; then you move on and get used to design patterns, you get to grips with common architectures and related technologies, then working methods and testing methods.

But to an untrained eye, what will distinguish a senior developer who works in TDD from a "confirmed" junior who has acquired power but not mastery? Quite clearly, for equivalent specs, the senior developer will take longer to do the work, because of TDD, because of the rigor and the precision of movement he is striving for. You will tell me that in terms of quality there will be no contest, and most certainly so. But perfect, bug-free work does not exist within a reasonable time frame, and a few basic tests (carried out by QA) are often enough to weed out the usual errors.

In the long run mastery replaces power and efficiency, often too much.

Perhaps that is the point of learning new languages as you progress: not only to gain new points of view to explore (each language having its own paradigm), but also to rediscover the power you get at the beginning as a reward for having learned.

On that note,

Vale


Mar 25 2011

How to be a happy programmer (with Python) ? 2/3

In this series on the Python "features" that make me happy, I began last time with two concepts, the with statement and list comprehensions. Now I'm going to talk about multiple assignments and import aliases.
  • Multiple assignments

It's a simple idea that lets you return a series of values and, on the other end, assign those values to several names at the same time, for example when you're splitting a string or extracting groups from a regular expression:

    >>> split_me = "here,we, are,again"
    # splitting gives us a list of values
    >>> split_me.split(",")
    ['here', 'we', ' are', 'again']
    # if you don't know the number of values you're going to get,
    # you can't use this feature, for example:
    >>> (start, end) = split_me.split(",")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: too many values to unpack
    # but if you know that there are going to be n values:
    >>> (start, end) = split_me.split(" ")
    >>> start
    'here,we,'
    >>> end
    'are,again'
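
When the number of pieces is not known in advance, one possible workaround (a small sketch, not the only way) is to make the split itself return a fixed number of values. It reuses the split_me string from the session above and relies only on str.split with a maximum split count and on str.partition:

```python
split_me = "here,we, are,again"

# Cap the number of splits so exactly two values come back,
# however many commas the string contains.
start, rest = split_me.split(",", 1)
print(start)
print(rest)

# partition() always returns three values (head, separator, tail),
# even when the separator is missing, so the unpacking cannot fail.
head, sep, tail = "no-comma-here".partition(",")
print(head)
```

The trade-off is that the remaining values stay packed in a single string (rest) and have to be split again if they are needed.
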
  • Import aliases

It means what it says: when you import a library or a module, you can give it an alias, for example to get shortcuts in Django:

    # this is extracted from my own code:
    from django.http import HttpResponse, HttpResponseRedirect as redirect
    from django.shortcuts import render_to_response as render
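
Aliases are also handy outside Django. The sketch below shows a common standard-library idiom rather than anything from my own project: try one module, fall back to another, and expose both under the same name so the rest of the code does not care which one was loaded.

```python
# Try the C implementation first (it only exists on Python 2) and fall
# back to the standard module, under the same alias, when it is missing.
try:
    import cPickle as pickle
except ImportError:
    import pickle

# Whichever module was imported, the calling code only knows "pickle".
payload = pickle.dumps({"language": "Python"})
restored = pickle.loads(payload)
print(restored)
```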

I won't start talking about the standard library as a whole, or about libraries like NumPy, SciPy, scikit-learn, ... that make it so easy to just think in Python.

And you? Which Python constructs (2.x or 3.x) make you feel happy and efficient at the end of the day?

Mar 25 2011

Using TOR with Python

There are many occasions where being limited to your own IP address is a problem. I will obviously only refer to "rightful" cases where you need to use different IP addresses within a very short lapse of time. Let's say you want to test your website's localization functionality, or just access it from many different IP addresses and see how the system deals with it.

TOR is a wonderful tool for that. TOR (The Onion Router, to be specific) is a tool that sends your traffic through several relays before it reaches the Internet, so that you appear to come from the IP address of the last relay, and it lets you change the path you use dynamically. Its main goal is to help you "Protect your privacy and defend yourself against network surveillance and traffic analysis.". I used it a long time ago, when connections were so bad that using it was mostly a burden and not very practical.

But recently, when I had to spread HTTP requests across several IP addresses, I got frustrated by two facts:

  • First: I didn't have enough servers available to do it, and buying them from Amazon EC2 is not a solution, as those servers may have the same IP address, all accounts are limited to 5 Elastic IP addresses, and I don't want to buy $400 worth of servers that I may use for only 1 hour!
  • Second: I would need to build a complex distributed environment (like a MapReduce job) and... well, I don't want to. I don't need parallel tasks, I just need a varying exit point.

So after a frustrating night of coding, I found myself thinking back to TOR. I installed it again with Vidalia, and it worked! Using Vidalia I could change my IP address when I needed to and re-execute my tests with a brand new IP address. Of course it wasn't perfect, as I was sometimes going through 5 different relay points before reaching my goal, but as I was not downloading anything heavy, it went well. Still, I had to change my IP address by hand. So the real point of this article is: "how to change your IP address dynamically using Python and TOR?"

So first you need TOR installed, you can get it here: https://www.torproject.org/download/download.html.en, and you need to activate its remote control (the control port that TorCtl connects to).

Then I'm going to assume you have Python, Git and Pip installed. To be clear on the principles of all this, TOR offers a relatively complex control system that you can connect to using Telnet; you can see the full command list and a few examples on the TOR website and here: http://thesprawl.org/memdump/?entry=8 . I'm only going to refer to the command I want to use: "renew my route".

So to install TorCtl, the library we will use to control TOR, we'll clone the Git repository of the project and install the library using Pip (it's pure laziness, I admit, because python setup.py install works just fine too):

    $ git clone git://github.com/aaronsw/pytorctl.git
    Cloning into pytorctl...
    remote: Counting objects: 555, done.
    remote: Compressing objects: 100% (180/180), done.
    remote: Total 555 (delta 376), reused 552 (delta 375)
    Receiving objects: 100% (555/555), 145.62 KiB | 213 KiB/s, done.
    Resolving deltas: 100% (376/376), done.
    $ pip install pytorctl/
    Unpacking ./pytorctl
      Running setup.py egg_info for package from [...]
    Installing collected packages: torctl
      Running setup.py install for torctl
    Successfully installed torctl
    Cleaning up...

So now if Tor is running and you try to do this :

    $ python
    Python 2.6.1 (r261:67515, Jun 24 2010, 21:47:49)
    [GCC 4.2.1 (Apple Inc. build 5646)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from TorCtl import TorCtl
    >>> connection = TorCtl.connect(passphrase="lol")
    >>> connection.is_live()
    True

That's proof that it works; just call close() on the connection object to disconnect. Now all we need to do is use this kind of connection to send signals to TOR, and tell urllib2 to send its requests through the local HTTP proxy that sits in front of TOR (port 8118 in my setup). Let's go:

    import time
    import urllib2

    from TorCtl import TorCtl

    # using TOR: route HTTP requests through the local proxy
    proxy_support = urllib2.ProxyHandler({"http": "127.0.0.1:8118"})
    opener = urllib2.build_opener(proxy_support)
    urllib2.install_opener(opener)
    # every urlopen connection will then use the TOR proxy, like this one:
    urllib2.urlopen('http://www.google.fr').read()

    # and to renew my route when I need to change the IP:
    print "Renewing tor route, wait a bit for 5 seconds"
    conn = TorCtl.connect(passphrase="lol")
    conn.sendAndRecv('signal newnym\r\n')
    conn.close()
    # give TOR a few seconds to build the new route
    time.sleep(5)
    print "renewed"
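
Since the control system is a plain text protocol (the one you can also drive over Telnet), the same signal can be sent without TorCtl. The following is only a sketch under assumptions: the function names are mine, port 9051 is Tor's usual control port but may differ in your configuration, and the reply handling is deliberately naive (one recv per command).

```python
import socket


def build_newnym_commands(passphrase):
    """Return the two control-protocol lines that request a new route."""
    # Backslashes and double quotes must be escaped inside the quoted string.
    quoted = passphrase.replace("\\", "\\\\").replace('"', '\\"')
    return [
        'AUTHENTICATE "%s"\r\n' % quoted,
        "SIGNAL NEWNYM\r\n",
    ]


def renew_route(passphrase, host="127.0.0.1", port=9051, timeout=10):
    """Send the commands to a running Tor and return its raw replies.

    host and port are assumptions; check your own Tor configuration.
    """
    replies = []
    conn = socket.create_connection((host, port), timeout)
    try:
        for command in build_newnym_commands(passphrase):
            conn.sendall(command.encode("ascii"))
            # Naive read: assumes the whole reply arrives in one chunk.
            replies.append(conn.recv(1024).decode("ascii", "replace").strip())
    finally:
        conn.close()
    return replies
```

With Tor running and the passphrase accepted, renew_route("lol") should return replies starting with the status code 250; anything else means the command was refused.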

Now you know everything I know. On the receiving end, I don't think there's that much data available to identify the "source" of the request; if you've got any clue about that, tell me in the comments.

Vale


Mar 18 2011

How to be a happy programmer (with Python) ? 1/3

I've just watched Hilary Mason's talk at PyCon 2011: http://pycon.blip.tv/file/4878710/
And that got me thinking about all the Python constructs that make my day better, so I decided to make a list of them and what they mean.

  • With

The with keyword is the equivalent of the whole try, catch, finally triplet in Java for handling resources (files, database connections, remote connections, anything that can fail). The with statement is there to make sure that, for example when using a database connection, the transaction is started and stopped correctly. It can be used as follows:

    with open('/tmp/my_file', 'w') as p:
        p.write('Writing in a properly closed file resource.')

For a few more examples, see python-with-statement and the official presentation in What's New in Python 2.5. If you want to implement objects usable with the "with" statement, you need to see how context managers work.
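
To give an idea of what a context manager looks like, here is a minimal sketch. The Transaction class is hypothetical (it only records events in a list, it does not talk to any database); the __enter__ and __exit__ methods are the actual protocol the with statement relies on.

```python
class Transaction(object):
    """Hypothetical resource that records when it is opened and closed."""

    def __init__(self, log):
        self.log = log

    def __enter__(self):
        # Runs when the with block is entered.
        self.log.append("begin")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Always runs when the block is left, even after an exception.
        self.log.append("rollback" if exc_type else "commit")
        # Returning False lets any exception propagate to the caller.
        return False


events = []
with Transaction(events):
    events.append("work")

try:
    with Transaction(events):
        raise ValueError("something failed")
except ValueError:
    events.append("error seen by caller")

print(events)
```

The first block ends with a commit, the second with a rollback, and in both cases the cleanup code runs without a single explicit try/finally at the call site.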

  • List comprehensions

I shouldn't even have to explain how much joy you can get from using these, especially when you're dealing with object-oriented programming or immutable objects. Pretty much any time you need to apply a simple transformation to lists, dictionaries or anything iterable, a list comprehension is not only the most beautiful way but also, many times, the most efficient way. So here's an example:

    >>> words = 'The quick brown fox jumps over the lazy dog'.split()
    >>> print words
    ['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']
    >>>
    >>> stuff = [[w.upper(), w.lower(), len(w)] for w in words]
    >>> for i in stuff:
    ...     print i
    ...
    ['THE', 'the', 3]
    ['QUICK', 'quick', 5]
    ['BROWN', 'brown', 5]
    ['FOX', 'fox', 3]
    ['JUMPS', 'jumps', 5]
    ['OVER', 'over', 4]
    ['THE', 'the', 3]
    ['LAZY', 'lazy', 4]
    ['DOG', 'dog', 3]
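
The same syntax also filters, and it works for building dictionaries too. A short sketch on the same sentence (the variable names long_words and lengths are just for illustration):

```python
words = 'The quick brown fox jumps over the lazy dog'.split()

# An "if" clause filters the items while the list is being built.
long_words = [w.lower() for w in words if len(w) > 4]
print(long_words)

# A generator expression passed to dict() builds a mapping the same way;
# this form also works on Python versions that predate dict comprehensions.
lengths = dict((w.lower(), len(w)) for w in words)
print(lengths['jumps'])
```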

This was the first part of 3 showing the few Python syntax constructs that keep me from screaming when I'm trying to get things done and don't want to lose time! See you next time.

Vale


Mar 15 2011

How to debug Django using the Python Debugger PDB

Even if it seems like common sense, I found out that there aren't many sources that explain how to use PDB with Django's bundled web server. So here we go. Let's say you have some processing like this:

    import re

    def search(request):
        """
        search (it's written up there).
        """
        if request.method == 'POST':
            item = request.POST['item']
            # separate numeric part from string part and
            # add a % in case no numeric value is provided
            (num, test) = re.match(r"([\d]*)([\D]*)", item).groups()
            if not num:
                num = "%"
            # .... query in database

Now what we want to check is that the "not num" part is doing its job of replacing a missing numeric part with a wildcard. So we'll add the statement "import pdb; pdb.set_trace()" to set up a breakpoint that PDB will be able to use:

    import re

    def search(request):
        """
        search (it's written up there).
        """
        if request.method == 'POST':
            item = request.POST['item']
            # separate numeric part from string part and
            # add a % in case no numeric value is provided
            (num, test) = re.match(r"([\d]*)([\D]*)", item).groups()
            # the breakpoint will be here:
            import pdb; pdb.set_trace()
            if not num:
                num = "%"
            # .... query in database

And then all we need to do is run Django's standalone embedded web server under pdb. At first PDB (just like GDB) will wait for you to use the command c (continue) to launch the program, and it will only stop when it reaches a breakpoint:

    python -m pdb manage.py runserver

Eventually, when you reach the breakpoint, you can use locals() and globals() to see all the variables you can access. For more on how to use PDB and debugging in Django, I refer you to the nice tutorial by Mike Tigas.
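
The parsing step that the breakpoint inspects can also be checked on its own, outside Django. This is a small sketch: split_item is a hypothetical helper that isolates the regular expression and the wildcard fallback from the view above.

```python
import re


def split_item(item):
    """Split 'item' into its leading digits and the non-digit text after them."""
    num, text = re.match(r"(\d*)(\D*)", item).groups()
    if not num:
        # No leading digits: fall back to the SQL wildcard.
        num = "%"
    return num, text


print(split_item("42rue"))
print(split_item("rue"))
```

Calling it with and without a numeric prefix shows the two cases you would otherwise step through in the debugger.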

Vale


Oct 28 2010

A good oldie : 1996 Java Vs Python

I didn't think I would find something like this one day, but here it is: an original 1996 paper on "Two Next-Generation Languages Java and Python".

It really makes you think about time: both languages evolved so quickly, one now being the mainstream multi-purpose language (Java) and the other being more relevant than ever (Python). It also shows how much Sun invested in applet technology, even if it never took off...

Enjoy : http://www.rogermasse.com/papers/java-python96/

Vale


Oct 27 2010

A hosting service worth a look: AlwaysData

OK, this is most certainly not my area of expertise, but I am a Python programmer, mostly working with Django, and I am often looking for hosting.

AlwaysData.net is not a company I own shares in, but I really appreciate their working environment and their state of mind, and above all I have just discovered something even better than superb hosting for my Django applications: AlwaysData.net now supports PostGIS, the geographic extension for PostgreSQL.

So go check it out! Especially since hosting is free for projects under 10 MB.

Vale

http://www.alwaysdata.com/

Sep 17 2010

[Django] Append objects in request.session

This article is once again more of a reminder to myself; I hope it will help everyone else at the same time.

I was experiencing an issue with Django session objects lately: I wanted to save a list in my session object and something strange happened.

The first creation of the list went just fine. But when I tried to append objects to that list, everything looked fine in the method where I was appending the data, yet when I looped over the list from another method, I only found the first item.

Let me show you some logs to illustrate:

    # first method before append - state of the session object:
    [('saved', ['obj1']), ('_auth_user_backend', 'django.contrib.auth.backends.ModelBackend'), ('_auth_user_id', 1)]

    # first method after append:
    [('saved', ['obj1', 'obj2']), ('_auth_user_backend', 'django.contrib.auth.backends.ModelBackend'), ('_auth_user_id', 1)]

    # second method:
    [('saved', ['obj1']), ('_auth_user_backend', 'django.contrib.auth.backends.ModelBackend'), ('_auth_user_id', 1)]

Ok now this is strange... because it looks just like it worked, and then... no.

The answer came to me (as usual) by googling the question and finding the answer on the DjangoFoo website. Instead of doing something like this when appending to a list in a session object:

    if 'saved' not in request.session or not request.session['saved']:
        request.session['saved'] = [obj]
    else:
        request.session['saved'].append(obj)

What you need to do is :

    if 'saved' not in request.session or not request.session['saved']:
        request.session['saved'] = [obj]
    else:
        saved_list = request.session['saved']
        saved_list.append(obj)
        request.session['saved'] = saved_list

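It's not very Pythonic or elegant, but that's the way. The reason is that Django only saves the session when one of its dictionary values has been assigned or deleted; appending to a list stored in the session changes the list in place, so the session never notices. Setting request.session.modified = True after the append is the other way to signal the change. Django's documentation also gives the following advice: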

Use normal Python strings as dictionary keys on request.session. This is more of a convention than a hard-and-fast rule
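
To see why reassigning the key makes a difference, here is a minimal sketch. TrackingSession is a simplified stand-in written for this article, not Django's session class; it assumes, as Django's session documentation describes, that a save is only triggered when a key is assigned.

```python
class TrackingSession(dict):
    """Simplified stand-in for a session: it only notices key assignments."""

    def __init__(self, *args, **kwargs):
        super(TrackingSession, self).__init__(*args, **kwargs)
        self.modified = False

    def __setitem__(self, key, value):
        # Assigning a key is the only change this stand-in can detect.
        super(TrackingSession, self).__setitem__(key, value)
        self.modified = True


session = TrackingSession(saved=['obj1'])

# Mutating the list in place never goes through __setitem__ ...
session['saved'].append('obj2')
after_append = session.modified
print(after_append)

# ... while reassigning the key does, so a save would be triggered.
saved_list = session['saved']
saved_list.append('obj3')
session['saved'] = saved_list
after_assign = session.modified
print(after_assign)
```

The list itself is updated in both cases; what changes is whether the container holding it knows that something happened.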

Thank you for your time,
Vale