January 25, 2012
Thanks

Thanks for 14 great years.

November 11, 2011
Java dependencies

A little Python snippet for creating a dot representation of the dependencies in a Java app.

Let this little baby chew on your target/classes dir and you will get a dot-file on stdout.

#!/usr/bin/env python

import re
import os
import sys
from os.path import join, basename

LIB_REG_EXP = re.compile('[,+L#][a-z][A-Za-z0-9/]+')
PACKAGE_FILTER = ''
CLASS_PREFIX_REMOVE = ''

if len(sys.argv) > 2:
    PACKAGE_FILTER = sys.argv[2]

if len(sys.argv) > 3:
    CLASS_PREFIX_REMOVE = sys.argv[3]

def get_class_dependecies(class_file_contents):
    libraries = set(map(lambda x: x.replace('/','.'), [l[1:] for l in LIB_REG_EXP.findall(class_file_contents)]))
    return filter(lambda x: PACKAGE_FILTER in x, libraries)

class_dependecies = {}
for root, dirs, files in os.walk(sys.argv[1]):
    if '.svn' in dirs:
        dirs.remove('.svn')
    class_files = [f for f in files if f.endswith('.class') and '$' not in f]
    for class_file in class_files:
        class_content = open(join(root, class_file)).read()
        class_name = join(root, class_file).replace('/','.')[len(CLASS_PREFIX_REMOVE):-6]
        if PACKAGE_FILTER not in class_name:
            continue
        class_dependecies[class_name] = []
        for dep in get_class_dependecies(class_content):
            if dep == class_name:
                continue
            class_dependecies[class_name].append(dep)

print "digraph G {"
for key, deps in class_dependecies.items():
    for dep in deps:
        print '"%s" -> "%s";' % (key, dep)
print "}"

The second argument is a basic filter for the class names you want. The third argument is a prefix that gets stripped from the class names, which is how you remove the folder names (target.classes. in the usual case).
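
To see what the regular expression actually picks up, here is a minimal sketch that runs the same pattern over a made-up string. The sample text and the helper name extract_dependencies are mine, not part of the script above; real class files contain binary data around these descriptors, so treat this as an illustration of the heuristic only.

```python
import re

# Same pattern as in the script: a marker character, then something that
# looks like a lowercase package name followed by path segments.
LIB_REG_EXP = re.compile('[,+L#][a-z][A-Za-z0-9/]+')


def extract_dependencies(text, package_filter=''):
    """Return sorted, dotted class names found in text that contain package_filter."""
    names = set(match[1:].replace('/', '.') for match in LIB_REG_EXP.findall(text))
    return sorted(name for name in names if package_filter in name)


# Hypothetical fragment resembling the type descriptors found in a class file.
sample = 'Lus/klette/myproject/Foo;Ljava/util/List;(Lus/klette/myproject/Bar;)V'

print(extract_dependencies(sample, 'us.klette'))
```

Since the pattern expects a lowercase letter right after the marker, it assumes package names start in lowercase, and it can just as well match unrelated strings that happen to look like a path. That is usually good enough for a quick graph.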

Example:

python dotclass.py target/classes us.klette.myproject target.classes. > deps.dot

Have fun

12:08am  |   URL: http://tumblr.com/Z0E3fxBlyMwk
Filed under: java python tech 
October 8, 2011
Rot

Photo by [Will Scullin](http://www.falconerdevelopment.com/go/Will%20Scullin/)

I’m quite fascinated by how quickly code rots and becomes a mess, despite the programmers’ best intentions. A lot could be fixed by testing and refactoring more often, but that always seems to come second to getting shit done, and it does sometimes have unintended effects.

For some reason I see this more often in Java code than in other languages. Maybe it’s because of its class-madness creating deeply nested code (try to trace a call through Jersey, I dare you).

Jersey is actually a nice example of this. The code is in itself clean, documented and well tested, but it is still (in my opinion) a mess. It seems to me that abstraction carries not only a performance penalty, but a readability penalty as well. In this case the code appears to have been tested and refactored multiple times, and it is still hard to understand.

It seems there is a fine line between reusability and losing track of what happens inside the black box.

July 18, 2011
Take control over your data - banking

So, banking data is quite interesting. If one could get hold of one’s credit card transactions in an easy way, that would be sweet. But it’s not that easy.

My bank, DNBNor, lets you download either an Excel document or a CSV-formatted dump of the transactions against one of your accounts on a monthly basis. The programmer’s choice would be CSV, but I found the CSV exports to be spotty, so Excel it is.

But I don’t want to do my analysis in Excel, so Perl to the rescue. The following snippet takes an Excel document, exported from DNBNor, as its first argument, and prints a JSON representation of the data to STDOUT. It should be quite easy to manipulate this in any language of your choice.

#!/usr/bin/perl

use strict;
use utf8;
use warnings;
use Switch;
use Spreadsheet::ParseExcel;
use JSON::XS;

my $excel = Spreadsheet::ParseExcel->new;

die "Need an xls-file containing DNB Nor transaction list as first argument\n" unless @ARGV;

# Double quotes so that \n and $@ are interpolated in the error message
my $doc = $excel->Parse($ARGV[0]) or die("Could not parse excel file: $@\n");

my @transactions = ();

for(my $sheet=0; $sheet < $doc->{SheetCount} ; $sheet++) {
    my $current_sheet = $doc->{Worksheet}[$sheet];
    for(my $row = $current_sheet->{MinRow}; defined $current_sheet->{MaxRow} && $row <= $current_sheet->{MaxRow}; $row++) {
        # Skip header row
        if ($row == $current_sheet->{MinRow}){
            next;
        }
        my %transaction = ();
        for(my $column = $current_sheet->{MinCol}; defined $current_sheet->{MaxCol} && $column <= $current_sheet->{MaxCol} ; $column++) {
            my $cell = $current_sheet->{Cells}[$row][$column];
            if (defined($cell)){
                switch ($column){
                    case 0 { $transaction{'date'} = $cell->Value || undef; }
                    case 1 { $transaction{'description'} = $cell->Value || undef; }
                    case 2 { $transaction{'interest_date'} = $cell->Value || undef; }
                    case 3 {
                        $transaction{'debit'} = $cell->Value || undef;
                        $transaction{'credit'} = '0.00';
                        }
                    case 4 {
                        $transaction{'credit'} = $cell->Value || undef;
                        $transaction{'debit'} = '0.00';
                        }
                }
            }
        }
        push @transactions, \%transaction;
    }
}

my $json = JSON::XS->new->utf8->encode({'transactions' => \@transactions});
print STDOUT $json . "\n";

exit 0;

GIST available at https://gist.github.com/ea0c6ef9e712e2caec88

Requires Spreadsheet::ParseExcel and JSON::XS - both available on CPAN.
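
As a rough sketch of the "any language of your choice" part, here is how the JSON could be consumed from Python. It assumes the amounts come out as plain decimal strings such as "250.50" (your export may format numbers differently), and the sample data and the function name totals are made up for illustration.

```python
import json
from decimal import Decimal


def totals(json_text):
    """Sum the debit and credit columns of the exported transactions."""
    data = json.loads(json_text)
    debit = Decimal('0')
    credit = Decimal('0')
    for transaction in data['transactions']:
        # Missing or null amounts are counted as zero.
        debit += Decimal(transaction.get('debit') or '0')
        credit += Decimal(transaction.get('credit') or '0')
    return debit, credit


# Hypothetical data in the shape the Perl script prints.
sample = json.dumps({'transactions': [
    {'date': '01.07.2011', 'description': 'Groceries',
     'interest_date': '01.07.2011', 'debit': '250.50', 'credit': '0.00'},
    {'date': '15.07.2011', 'description': 'Salary',
     'interest_date': '15.07.2011', 'debit': '0.00', 'credit': '1000.00'},
    {'date': '20.07.2011', 'description': 'Coffee',
     'interest_date': '20.07.2011', 'debit': '49.50', 'credit': None},
]})

print(totals(sample))
```

Decimal is used instead of float so that money does not pick up rounding noise along the way.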

Have fun :-)

June 15, 2011
Word cloud of my thesis so far. I guess you might be able to guess what I’m writing about :-)

March 31, 2011
apache2-mpm-itk and libapache2-mod-wsgi

If you are running Apache 2 with mpm-itk and mod_wsgi on a Debian system, you’re a bit out of luck performance-wise.

From the MPM-ITK documentation:

mpm-itk is based on the traditional prefork MPM, which means it’s non-threaded; in short, this means you can run non-thread-aware code (like many PHP extensions) without problems. On the other hand, you lose out to any performance benefit you’d get with threads, of course; you’d have to decide for yourself if that’s worth it or not. You will also take an additional performance hit over prefork, since there’s an extra fork per request.

So mod_wsgi in embedded mode will make Apache load your whole Python project on each request. That seems quite inefficient, so we’ll run mod_wsgi in daemon mode instead.

mod_wsgi’s daemon mode launches background daemons that cache your code, and it even has built-in support for running under mpm-itk. The only problem is that for this support to be enabled, mod_wsgi must be compiled against an mpm-itk patched Apache 2. In Debian this is not the case, so the daemon sockets will be owned by www-data and not the user that you assigned to the vhost.

You will see something like this in your Apache error log:


[Fri Mar 18 13:11:40 2011] [error] [client 2001:700:300:9::188] (13)Permission denied: mod_wsgi (pid=22237): Unable to connect to WSGI daemon process 'myuser' on '/var/run/wsgi/wsgi.22197.0.1.sock' after multiple attempts.,
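
If you want to confirm that socket ownership is the culprit, a small sketch like the following reports who owns a given file. The helper names are hypothetical; point it at the socket path from your own error log.

```python
import os
import pwd


def user_name(uid):
    """Map a numeric uid to a user name, falling back to the number itself."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)


def file_owner(path):
    """Return the name of the user owning the file (or socket) at path."""
    return user_name(os.stat(path).st_uid)
```

On an affected box, calling file_owner() on the socket path from the log line would be expected to return www-data rather than the vhost user.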

I’ve filed a bug with Debian on this issue (#619252), but no response yet.

Luckily it’s not very hard to build your own mod_wsgi package to support daemon mode under mpm-itk.

Here is a quick how-to on the matter:


$ apt-get source libapache2-mod-wsgi
$ cd mod-wsgi-3.3
$ mkdir debian/patches

$ cat - > debian/patches/series
fix-mod-wsgi-for-itk.patch
^D

$ cat - > debian/patches/fix-mod-wsgi-for-itk.patch
Remove the #if defined check for mpm-itk support, and only support mpm-itk, as we only use mpm-itk
--- a/mod_wsgi.c
+++ b/mod_wsgi.c
@@ -10037,11 +10037,7 @@
*/

 if (!geteuid()) {
-#if defined(MPM_ITK)
 if (chown(process->socket, process->uid, -1) < 0) {
-#else
-if (chown(process->socket, ap_unixd_config.user_id, -1) < 0) {
-#endif
 ap_log_error(APLOG_MARK, WSGI_LOG_ALERT(errno), wsgi_server,
"mod_wsgi (pid=%d): Couldn't change owner of unix "
"domain socket '%s'.", getpid(),
^D

$ debchange -i
$ dpkg-buildpackage

The resulting package will work as expected. I haven’t tested whether it works on an Apache without mpm-itk, but I wouldn’t bet on it.

Now, go forth and hack some Python web apps!
