Index of categories

Non-technical posts.

Random quote

Be selfish when you ask, honest when you reply, and when others reply, take them seriously.

(me, late at night)

Posted Sun Jul 19 18:53:03 2015 Tags:

Love thy neighbor as thyself

‘Love thy neighbor as thyself’, words which astoundingly occur already in the Old Testament.

One can love one’s neighbor less than one loves oneself; one is then the egoist, the racketeer, the capitalist, the bourgeois; and although one may accumulate money and power, one does not of necessity have a joyful heart, and the best and most attractive pleasures of the soul are blocked.

Or one can love one’s neighbor more than oneself—then one is a poor devil, full of inferiority complexes, with a longing to love everything and still full of hate and torment towards oneself, living in a hell of which one lays the fire every day anew.

But the equilibrium of love, the capacity to love without being indebted to anyone, is the love of oneself which is not taken away from any other, this love of one’s neighbor which does no harm to the self.

(From Hermann Hesse, "My Belief")

I always have a hard time finding this quote on the Internet. Let's fix that.

Posted Wed May 20 11:35:15 2015 Tags:

Free as in Facebook

Yesterday we were in an airport. We tried to connect to the airport "free" wifi. It had a captive portal that asked for a lot of personal information before one could maybe get on the internet, and we gave up. Bologna Airport, no matter what they do to pretend that they like you, it's always clear that they don't.

I looked at the captive portal screen and I said: «ah yes, "free" wifi. Free as in Facebook».

We figured we had coined an expression that deserves to be reused.

Posted Mon Mar 9 10:58:49 2015 Tags:

<3 Bremen

Greedy people exploit our cultural differences to justify fighting wars to seize wealth for themselves, when sharing and enjoying cultural diversity makes us far richer than if we possess material wealth

Greedy people exploit our cultural differences to justify fighting wars to seize wealth for themselves, when sharing and enjoying cultural diversity makes us far richer than if we possess material wealth!

From a recursive mural on Halmerweg, Bremen.

Posted Sun Jan 11 22:20:11 2015 Tags:

No need to address me as "voi"

Come on, there's no need to address me as "voi" (the formal plural "you")

What do you mean?

Eh, you keep saying "you computer people", "you technicians", "you..."

Posted Fri Dec 19 15:55:20 2014 Tags:

On relationships

Good relationships are like a good video game

with an easy, intuitive interface

and lots of interesting content.

(Lynoure)

Posted Wed Aug 6 16:47:40 2014 Tags:

On love and sexual desire

Soundtrack: Skullcrusher Mountain (with lyrics)

After seeing a review of it, I just rewatched the first episode of Lupin III, with an eye on how sex and love are represented. To my eyes, they seem to be shown as mutually exclusive: the evil boss's sexual appetites have Fujiko tied up and (childishly) "raped". Lupin's love for Fujiko is shown as self-sacrifice, with him ending up in Zenigata's cuffs and enduring her betrayal. In the end, Lupin's sexual desire for Fujiko seems to put her once again in a disadvantaged position, as the evil boss's instrument of "rape" comes back into play.

In all the literature that I remember as I grew up, there has been love and there has been sexual desire. Love was wishing for the other person to be well, sexual desire was wishing for oneself to be well.

For some reason, in most of the literature there seemed to be an implicit rule saying that two people cannot both be well at the same time. Love meant being selfless, devoting oneself to the other person entirely and suffering for it. Sexual desire meant reducing the other person to an object of one's own pleasure, and the other person (usually a woman) would suffer from it.

Sometimes two people loved each other and wished for both to be well at the same time; they were usually torn apart by circumstances, or had to suffer and sacrifice a lot to be together, or one of them would die, for extra drama. What being together meant was usually not covered by the story, and tended to happen during that "happily ever after" that starts where the book or film ends.

Yet, I find that sex is a wonderful way for two people to be both happy at the same time, where I can desire my happiness and the other person's happiness, and where each feeling, each desire, each move contributes to both. In pleasing myself I please the other; in pleasing the other I please myself.

Is there a romantic story where the lovers do not just feel love, but also they long for, they desire each other? Where that desire is shown not by hitting the partner in the head with a club and dragging them to wake up tied up in a secret lair, but by inviting their partner to come closer, kissing them, holding them tight to their body, caressing them, discovering what gives them pleasure, opening up for them to play, each person riding their own, the other's, and their shared desire?

I cannot think of one. It sounds like I'll have to write one.

With my own life.

Posted Tue Jul 22 17:40:36 2014 Tags:

Abuse

I've recently spent a lot of effort trying to find, recover, reconnect, embrace, strengthen and grow my inner child, my actual identity, what I'm comfortable being. Now I figure that I feel abused when I perceive an assault against my identity.

I'm ok having had to study a shallow moralistic piece of literature at school. Having been expected to like it and embrace it, that feels like abuse.

I'm ok being told I made a mistake. Being humiliated for it, feels like abuse.

I'm ok being offered advice on how to avoid a mistake. But being expected to promptly follow that advice and change my life accordingly, that feels like abuse.

Someone offers help in cleaning my house? Fine. Someone barges in, takes over my personal space and reshapes it as they see fit? That feels like abuse.

If I'm sad, a hug's great. If the hug comes with a surprise grope at my bottom, that feels like abuse. I would not be bothered about the sexual assault, but about the denial of my current state of mind. The denial of feelings. Forget who you are, how you feel, and just be my sexual object. When it begins to deny my identity, then it starts feeling like abuse.

Being taught that "when you love a person, you must..." feels like abuse.

Being expected to have sex and like it at another person's whim, regardless of my real feelings, feels like abuse.

Being expected to like to have sex, or not like to have sex, regardless of my real feelings, feels like abuse.

The concept of "marital duty" feels like abuse. Two people who are expected to demand and provide sex, just because they are married, regardless of what they actually feel or need.

The concept of "training your boyfriend" sounds to me like nonconsensual manipulation. Like Abuse.

A lot of dating advice? Abuse. Abuse. Abuse.

A relationship based on me faking my identity, in order to keep another person close to me who wouldn't be otherwise, sounds to me like a relationship based on mutual abuse.

Why do we even have the concept of love potions in our culture, in our fairy tales?

"For your own good", abuse. Luckily, there are alternatives.

Being called "nice boobies" when you want to be called "Elizabeth", feels to me like abuse.

Being called "geek" when you want to be called "Enrico", feels to me like abuse.

Being called a stereotype when I want to be recognised as an individual, feels like abuse. When I grow up in an environment where that is the accepted way of addressing people, and so I learn to do the same, then I feel abused and I'm taught to abuse.

I think I had some pretty abusive role models when I grew up.

Being asked to do something I don't want to? Fine. Telling me off because I do not like it? Abuse.

Despising me because I like something another person doesn't? Abuse.

Despising me because I don't like something another person likes? Abuse.

As a kid I quickly had to learn to figure out what a person liked or disliked, and pretend accordingly. To this day, I still have a problem answering the simple question "what would you like to do today?"

It feels like abuse when my identity is denied. When I cannot make mistakes. When I cannot be vulnerable. When I have to be happy. When I have to be sad. When I have to care. When I must not care. When it doesn't matter how I feel, who I am, but I just have to feel something, like something, be something.

For my own good. Because someone else knows better. Because "that's the thing you do". Because.

When I was a kid, I felt very uncomfortable going to a carnival or a theme park, because I felt an expectation to enjoy it, regardless of what I was really feeling. An effort was made to make me happy, so I had to be. It was a place where I was not free to have my own mood. Abuse.

The doctor would touch, knock, hit, tell me to do this or that, but would never ask me how I was feeling. He would tell me how I was feeling, and he must have been right. Abuse.

There was lots of abuse against my identity in my growing up. I had to learn to blend in to protect myself. I had to learn to please. I had to pretend I liked it all. I got good at pretending.

In order to be accepted as a person I needed to pretend to be someone else. And the person that ended up being accepted was not me.

In "A Wizard of Earthsea", Ged turns into a falcon to run away, but stays a falcon for too long and forgets how to be a person again. The wizard Ogion turns him back by recognising him, accepting him, and speaking only one word: Ged's name.

I feel abused when I'm taken away from my identity, my feelings, my needs. I could even do that to myself, out of frustration, out of despair, out of habit.

I have started to recognise the moments when I'm not feeling abused, the people who accept me for what I am, there and then, who allow me to exist without judging me. I have started to accept all other moments as likely abuse attempts, and emotionally deal with them accordingly.

I have started to become conscious of when I'm abusing myself (this has been an interesting read), and stop, and ask myself why.

Posted Sat Jun 21 14:38:12 2014 Tags:

Fear of losing

If I am afraid of breaking my laptop, then I may leave it at home, and it will be as if I didn't have a laptop.

If I am afraid of losing faith, then I may closely follow the dictates of the church. I will be keeping the church's faith, but not mine.

If I am afraid of losing my children, they may have to run away from me to be free to grow into adults.

If I am afraid of losing my inner child, then I might not expose it to the world, and so my inner child will never live. I will just be a box, a mask of what the world expects from me, that shelters and cages the Me that would like to live.

If I am afraid of losing you, then I may get obsessed with preserving the beautiful image I have of you. I may stop seeing you, experiencing you as you live, think, grow. I may become afraid of your depth, of your being different each day, of your being alive. I may end up cherishing a perfect image of you in my head, while you have become a stranger to me.

I am really asking when I can accept answers.

I am really living when I can accept myself.

I am really loving when I can accept you.

Posted Mon Jun 16 09:21:52 2014 Tags:

Perfection

We like perfection.

Perfection is the ultimate achievement, there is nothing beyond.

Perfection is fully understood. It is not going to change, it is fact, we can rely upon it.

Perfection is final. Perfection is death.

Ideas can be perfect, and perfect ideas are easy to understand.

Perfect ideas are final and unchangeable. Perfect ideas are hard to correct, hard to refute.

Perfect ideas spread easily. They are helpful. They shed light on a little corner of our world, give it shape. They bring stability. They can be relied upon. Perfect ideas make good memes.

Perfect ideas are shared standards through which we act, interact, coordinate, cooperate. They don't change, so they are a solid base for habits, that make a bit of our life a little easier.

That we should not kill, is a perfect idea. So is racism. So are the ten commandments, so, for many, is love.

Thanks Lynoure for saying the right thing at the right time.

Posted Wed Jun 4 22:53:37 2014 Tags:

When I said "I love you"

All people ever say is: thank you (a celebration of life) and please (an opportunity to make life more wonderful). (Marshall Rosenberg)

I have said "I love you" many times in my life, and many times I have failed to say it, because, for me, it is not an easy thing to say.

It is not easy when I have no idea what the other person will make of it: will they be frightened? Will they feel awkward around me afterwards? Will they disappear from my life?

But do I know what I myself mean when I say it?

I have said "I love you" because I thought you somehow expected it of me. "please, consider me worthy of you".

I have said "I love you" to beg for affection. "please, love me back".

I have said "I love you" because I was grateful to you for existing in my life. "thank you".

I now understand why it has not been easy for me to say "I love you" when I was feeling, or imagining, that I had to say it.

I now understand why I have sometimes made myself awkward, as I was begging.

I now understand why, when I said "I love you" out of gratitude, when I said it to celebrate that you exist in my life, that's when I felt no trouble, no fear, and when I felt that my words really fit what I was feeling and what I wanted to say.

Posted Mon Jun 2 10:18:56 2014 Tags:

Habits

Beware of habits. I've seen them turn into expectations over time, without any kind of negotiation.

Posted Thu Feb 13 01:22:59 2014 Tags:

Original sin

I feel that I was somehow born innocent, an animal with sound, primal instincts. But if I had remained that way, I wouldn't have been able to function in a complex society, so I got an education. Education taught me what is expected of me in order to be accepted by my peers[1].

Education taught me more than that: it also taught me to enjoy what a complex society can give me: art, science, history, philosophy, adding depth of meaning and correlations to my perceptions and memories.

Education wasn't perfect, though. Some of my educators obsessed about some of the expectations, and gave me rules to follow that aren't really needed to interact with a society. Sit down for hours in silence without complaining. Don't talk back to figures of authority, even when they are abusing me. Don't ever feel that my efforts in a task have been enough, because there is always something more that I could have done. Do what people expect me to do, regardless of what I wish to do. Do what I need to do, not what I want to do.

So I grew up with a set of arbitrary expectations that weren't needed to function in a society and weren't in any way meeting any of my needs, yet I still felt them as a part of me, putting the same effort into meeting them that I would put into making myself happy. I like to call this set of learned, arbitrary, unneeded expectations "neurosis".

I like this as an interpretation of the "original sin" myth: in order to go from an innocent animal to a member of a complex society, I acquired a set of neuroses that make me behave in a meaningless way. Or, rephrased going along with the myth, that rob me of my innocence.

In some environments, like BDSM, many have a name for an unnegotiated practice forced on a person: they call it "abuse". It made sense to accept that I have been abused many times while being educated, because my educators also had their own baggage of parasitic expectations, of neuroses. And since I don't currently know how I can ever be sure that I have freed myself from all my neuroses, I feel I should accept that it is possible that I do and will abuse others; by accepting it as a possibility, I hope at least to be able to realise as soon as possible that I am doing it, and try to stop.

I have now realised that my life is far simpler and more rewarding when expectations are negotiated between all the parties involved. I have recently spent a substantial amount of my energy in recognising and renegotiating many of the expectations and the neuroses that I had learnt while growing up, and, who knows, perhaps this effort will continue throughout my life.

So there you go, an agnostic atheist freeing himself of his original sin, and regaining his paradise lost one step at a time, alive, on earth.

And enjoying every moment of it.

[1] I suspect that is why people educated in a high class society tend to be accepted more easily by a high class society: they conform to the right set of expectations. But I digress.

Posted Sat Feb 8 17:48:27 2014 Tags:

On political correctness

I am reading "Four ways to forgiveness" by Ursula Le Guin. This is how the book begins:

“On the planet O there has not been a war for five thousand years,” she read, “and on Gethen there has never been a war.” She stopped reading, to rest her eyes and because she was trying to train herself to read slowly, not gobble words down in chunks the way Tikuli gulped his food. “There has never been a war”: in her mind the words stood clear and bright, surrounded by and sinking into an infinite, dark, soft incredulity.

What would that world be, a world without war? It would be the real world. Peace was the true life, the life of working and learning and bringing up children to work and learn. War, which devoured work, learning, and children, was the denial of reality. But my people, she thought, know only how to deny. Born in the dark shadow of power misused, we set peace outside our world, a guiding and unattainable light. All we know to do is fight. Any peace one of us can make in our life is only a denial that the war is going on, a shadow of the shadow, a doubled unbelief.

That made me realise that several times I perceived political correctness as "only a denial that the war is going on".

Last night I watched this video, which I felt was very relevant to this.

Indeed, I have more respect for someone who listens to me, makes a good effort to understand what I said, and then rudely disagrees, than for someone who very politely ignores everything I am trying to say.

This is how I feel this experience can be distilled into an actionable item:

Replying to an email is probably going to be useless, unless I am willing to make an effort to understand what the other person is trying to say, from their own point of view.

Posted Wed Jan 29 10:44:14 2014 Tags:

Liberation vs expectations

I feel like I found a key struggle: liberation vs expectations.

Many ideas liberate me, like polyamory, atheism, anarchism, nonviolent communication, sexual liberation, discordianism, free software, nonjudgemental environments, E-prime, silliness, love, pataphysics, dada, constructivism, rolling naked in the mud, fantasy, BDSM exploration, pansexuality, this very idea that I'm trying to describe, crying, laughing, mindfulness, what have you.

Sometimes, however, they become an end rather than a means: they give me expectations and judgements, and take some other freedom away.

I feel like I found where to draw the line: the point between freeing myself of my limits and making myself new ones.

It's one of those epiphany moments. Weeeeeee!

Posted Sun Jan 12 12:17:39 2014 Tags:

Zero-kilometre spelling

Is the international spelling alphabet too globalised for you? Would you like to get back in touch with the place where you were born and grew up?

As of today there is a script that does just that: tell it where you live, and it builds your zero-kilometre spelling alphabet.

$ git clone git@gitorious.org:trespolo/osmspell.git
$ cd osmspell
$ ./osmspell "San Giorgio di Piano"
1: San Giorgio di Piano, BO, EMR, Italia
2: San Giorgio di Piano, Via Codronchi, San Giorgio di Piano, BO, EMR, Italia
3: San Giorgio Di Piano, Via Libertà, San Giorgio di Piano, BO, EMR, Italia
Choose one: 1
Center: 44.6465332, 11.3790398
A Argelato, Altedo
B Bentivoglio, Bologna, Boschi
C Cinquanta, Castagnolo Minore, Castel Maggiore, Cento
D Dosso
E Eremo di Tizzano
F Funo di Argelato, Finale Emilia, Ferrara, Fiesso
G Gherghenzano, Galliera, Gesso
I Il Cucco, Irnerio, Idice
L Località Fortuna, Lovoleto, Lippo
M Malacappa, Massumatico, Minerbio, Marano
N Navile
O Osteriola, Ozzano dell'Emilia, Oca
P Piombino, Padulle, Poggio Renatico, Piave
Q Quarto Inferiore, Quattrina
R Rubizzano, Renazzo, Riale
S San Giorgio di Piano, Saletto
T Torre Verde, Tintoria, Tombe
U Uccellino
V Venezzano Mascarino, Vigarano Mainarda, Veduro
X XII Morelli
Z Zenerigolo, Zola Predosa

The data comes from OSM, and the script is a good example of how to use its geolocation API (fast) and its geographic query API (slow).
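
For the curious, the fast geolocation step boils down to a single HTTP call. Here is a minimal sketch in Python of such a lookup, assuming the public Nominatim endpoint and the requests library (osmspell's actual code may differ):

    # Hypothetical illustration, not osmspell's actual code: geocode a place
    # name with OSM's Nominatim and print the candidate locations.
    import requests

    r = requests.get("https://nominatim.openstreetmap.org/search",
                     params={"q": "San Giorgio di Piano", "format": "json"},
                     headers={"User-Agent": "osmspell-example"})
    r.raise_for_status()
    for idx, place in enumerate(r.json(), start=1):
        print(idx, place["display_name"], place["lat"], place["lon"])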

Posted Sat Jan 4 00:38:16 2014 Tags:

Poem: "Washing machine"

I thought it was fleece,

but now it's felt.

Posted Tue Dec 3 22:32:23 2013 Tags:

Shops

Christmas songs should only ever be played on Christmas day.

In church.

At midnight.

Unless I happen to be there.

Posted Mon Dec 2 14:07:58 2013 Tags:

Explanation of umarell

Umarell /uma'rɛl/ (oo-mah-rell), n; pl. Umarells. People in a community who offer all sorts of comments to those who are trying to get some work done, but who are not doing any work themselves.

Etymology and further details

Umarell is a word that entered Italian slang in Bologna and is spreading to nearby towns, occasionally even across Italy. It comes from the Bolognese for "cute/odd little man".

"Umarells" are those people, usually retired men, who spend time watching construction works, often holding their hands behind their back, occasionally commenting on what is going on, sometimes trying to tell the workers what to do.

It's easy to find examples on the internet; the word was popularised by a blog collecting photos, which has even been published as a book.

With some Italian Debian friends, we realised that umarell is the perfect word for those people in a community who offer all sorts of comments to those who are trying to get some work done, but who are not doing any work themselves.

I think that it is a word that fits perfectly, and since I'm likely going to use it blissfully anywhere, here is a page that temporarily explains what it means until the Oxford English Dictionary picks it up.

Posted Fri Sep 20 13:27:07 2013 Tags:

Random notes from that other lightning talk session

If you can, have dinner with your upstreams at least once a year.

Posted Sat Aug 31 12:35:09 2013 Tags:

On codes of conduct

A criticism of the status quo, with a simple proposal: video and lyrics.

With compassion unyielding to grudge, mother, I learnt how to love.

Posted Sat Aug 24 18:15:31 2013 Tags:

Random notes from that other lightning talk session

YKINMKBYKIOK: "Your Kink Is Not My Kink But Your Kink Is Ok"

As far as I'm concerned, this puts the vim vs emacs quarrel to rest, for good.

Posted Fri Aug 23 18:11:30 2013 Tags:

Random notes from that other lightning talk session

The BDSM Free Software Definition:

I refuse to be bound by software I cannot trust and negotiate with.

Posted Wed Aug 21 08:16:15 2013 Tags:

Random notes from DebConf

Compersion, n: the feeling you get when someone else also takes good care of one of your packages.

Posted Tue Aug 20 17:44:52 2013 Tags:

Components in a system

But what we are discovering is that if we see ourselves as components in a system,

that it is very difficult to change the world.

It is a very good way of organising things, even rebellions,

but it offers no ideas about what comes next.

(from All Watched Over By Machines of Loving Grace)

Posted Wed Mar 13 01:30:34 2013 Tags: tags/quotes

Che coss'è l'amor

Recently I've often found myself wondering "what is love", and every time this starts playing in my head: "♪...chiedilo al vento / che sferza il suo lamento sulla ghiaia del viale del tramonto... ♫" ("ask the wind, which whips its lament on the gravel of the sunset avenue").

Socially, love has always been sold to me as something indescribable and yet, at the same time, extremely codified. Love is that thing that makes you feel like this; and then if you don't act like this it's not true love, and you must treat the other person like this, and you have to be spontaneous, except... The TV and music I grew up with, like it or not, often pushed models that I rarely shared, and that I would sum up with this.

It has always been hard for me to know when I could say "I love you". The problem is not so much interpreting my own feelings, as understanding which meanings I am throwing at both of us the moment I say it. For this reason, until I am sure of what "I love you" means to both of us, I tend to reason with concepts that are (for me) defined a bit more clearly, like: desire, trust, intimacy, smells, mutual understanding, laughing, physical contact, curiosity...

It's a pity: every now and then I feel that pleasant surrender towards the other person that makes me want to say "I love you!", and I end up not doing it.

That said, ever since I saw "Una de zombis", whenever I'm asked "but are you in love?" I can't help thinking of this.

Posted Wed Feb 20 02:02:30 2013 Tags:

On praising people, and on success

This morning I was pointing out to friends how excellent mako's post on Aaron Swartz is, and I thought it'd be nice if we didn't have to wait for people to die before telling the world how awesome and inspirational they are.

Then Russ posted an article about work, success and motivation and I went to tell my friends how awesome and inspirational he is.

I, too, see myself as somehow successful, and I, too, don't identify in the usual stereotype of success. I don't want to stop being a craftsman to become a manager, I don't get a high from having power over other people, I don't define my value in terms of my profits.

At a glance, people don't see me as successful until they get to know me better. Then they realise that I'm not at all unhappy about my life.

I have a job that I like, I write Free Software and it gets used and appreciated, my colleagues are friends, who respect me and my opinion, and I respect them and theirs.

I can work from home. In fact, I can work from everywhere as long as I have my laptop with me. I can sustain a long distance relationship because I can work from the house of my partner when I'm visiting. Two days ago I worked from the bar of a farm on top of a hill, because I was on the road, it was close by, and what the hell, it's a wonderful place to be.

To me success means that I can care about the quality of my life, that I have the luxury of caring about little things that make my day, of trying to make good ideas sustainable, of working a bit more when I'm on fire, and of working a bit less when there's something wonderful in the world to see, or someone interesting in the world to meet.

Russ, the way I read your article, you are questioning what "success" means, and you are spot on. People should be able to define "success" as whatever works for them and pursue it freely. Only then does success become something worth praising when it is achieved. Only then does it become inspirational.

I like how you managed to put into words something that has been for a long time in some corner of my mind and I hadn't yet managed or bothered to bring into the spotlight.

You have the insight and the confidence to see something in a non-mainstream way and say: "you know what? That actually makes sense."

Sometimes I read one of your posts, nod a lot, and realise how important something is, what an important part of myself it actually is. And now that you have taken it out for me to see, I can appreciate how valuable it is and make sure I don't accidentally lose it.

Thanks! That's another one I owe you. And it's exactly the kind of thing I shouldn't wait to tell you.

Posted Fri Jan 25 12:45:49 2013 Tags:

Yet another Ubuntu anecdote

Some posts on planet made me remember of a little Canonical-related story of mine.

Many years ago I contracted briefly for Canonical. It was interesting and fun.

At the time I had no experience of being temporarily hired by a foreign company, so I rang my labour union for an appointment, to check with them that everything was all right.

The phone call went more or less like this:

Me:

Hello. I have received this contract for temporary employment by a foreign company and I wondered if I could book an appointment to come show it to you to see if it's all ok.

Their answer rather cut me short:

Hi. Be careful! People get temporary employment from obscure companies with the headquarters, like, in the Isle of Man, they do the job, the company disappears and they never get paid. There's bad stuff out there!

I looked at the contract, the heading said something like "Canonical ltd, Douglas, Isle of Man".

I was certain that the union people would have never understood what was going on. I politely thanked them for their time and hung up. However, to this day I still regret that I didn't insist:

Uh, yes, the company is indeed in the Isle of Man. But what if I told you that it's owned by an astronaut?

I just signed the contract and had a good time.

Posted Sat Jan 15 10:35:36 2011 Tags:

The fear and the desire

Results 1 - 10 of about 5,470 for "la paura e la voglia di essere nudi" ("the fear and the desire to be naked"). (0.34 seconds)

Results 1 - 10 of about 26,500 for "la paura e la voglia di essere soli" ("the fear and the desire to be alone"). (0.10 seconds)

No wonder, then, that we have become what we have become.

Posted Wed May 5 13:36:24 2010 Tags:

If this is true...

If this is true...

...then this is also true!

Posted Thu Apr 22 09:26:08 2010 Tags:

Global trends

For some time I have been trying to pinpoint what it is that is brewing in Italy and risks spreading elsewhere, as has happened in the past.

I don't need to be decent to stay in power

While following a train of thought during a political/philosophical lecture, I figured that a currently growing trend is to have public figures who are more and more indecent.

In Italy it is very hard to find a public figure you can look up to. It is hard to name a politician who is not involved in some shady exchange of favours or some abuse of their powers, and we have got used to seeing people in power implicated in major corruption scandals, perverted prostitution affairs, or dealings with international criminal organisations.

They do not normally end up in jail, and in fact they remain very firmly in power, because they manage to stretch or change the laws to get away with it, or at least to delay trials long enough to trigger some statute of limitations.

Is there a pattern here that, although maybe not as clearly defined as in Italy, can be found more or less globally?

Yesterday I thought that this could be such a pattern:

I don't need to be decent to stay in power

If I think of it like that, then it is most definitely not just an Italian phenomenon. If you told a Briton or a French person that "one doesn't need to be decent to stay in power", I would not expect them to see anything strange in it. We all find it depressing, but we are all used to it.

It is a pattern with repercussions, though: once it becomes normal in a society, people who get to power are free to abuse it as much as they want, as long as they are careful enough not to end up in jail. Because, well, nowadays one doesn't need to be decent to stay in power.

I don't need to follow the law to stay in power

That first pattern is already quite well accepted in Italy. So well accepted, in fact, that I think we are starting to see what comes next.

At the end of March we are going to have elections for some regional governors. Funnily enough, in Lazio, the very important region around Rome, the centre-right coalition failed to submit the paperwork on time, and is out of the elections.

It is not just red tape: at some point someone has to print the ballots and dispatch them to the voting booths, so one expects the coalition logos and the names of the candidates to be submitted in time, together with the signatures supporting the candidates and whatever else the election process needs.

Well, they missed the deadline, they got there after closing time and the building was, well, closed.

It was a fantastic opportunity for a laugh. Memes blossomed on the Italian intarwebs and we now have 2 or 3 new expressions to mean "stupid".

However, now it's hard to tell what is going to happen. On the one hand, you can't exclude one of the two major coalitions because of a bureaucratic detail like an office closing time. On the other hand, several minor coalitions have been excluded from all sorts of past elections over similar things, and it really would not be fair to start making exceptions now.

But Lombardia, the region around Milan, and Emilia Romagna, the one around Bologna, both very, very important, are having similar kinds of problems.

In both regions the previous governors are running again, for the third time in a row, and most likely they legally can't: they have already been in power for as long as the law allows, so if elected they could be sued and forced to resign. Lots of paper is being shuffled at the moment to figure out whether they can get away with it.

Oh, and the lawyers of the candidate for the Milan region also managed to get to the tribunal after closing time, but apparently there was still someone inside and they managed to shout loud enough, or somesuch.

Anyway, the situation is getting hot. The Lazio coalition that was excluded because of its own incompetence is now hard at work pushing its potential voters to mount a fracas. Chances are that eventually they'll get away with it and manage to take part in the election. If that happens, they will likely get close to winning it.

So this seems to be a new pattern that is emerging:

I don't need to follow the law to stay in power

Which, again, is a pattern with quite some repercussions. It is much more radical than a mere issue of morality: it means feudalism; it means we are culturally ready to accept dictatorship.

So, please do me a favour: do not think for a moment that Italy is just a funny place with lemons and tomatoes, and watch out for these patterns emerging around you.

Posted Fri Mar 5 16:52:31 2010 Tags:

Democratic feedback

The elections have come and gone; the frustration remains.

For the convenience of anyone wishing to give some democratic feedback to the disappointing political forces in play, from calmly explaining your reasons for voting for someone else, to politely telling them to go fart in flour, here is a handful of email addresses:

Don't forget to make it very clear that the email full of insults you sent is not to be taken as a request to be subscribed to their newsletters.

Posted Tue Jun 9 10:45:05 2009 Tags:

Public hygiene

This morning:

  • Go to via Gramsci 12
  • Go to door 10, ground floor
  • Take the white ticket with the blue border
  • Wait (two hours)
  • Get the vaccine
  • Go to door 22, ground floor
  • Pay
  • Go back to door 10, ground floor
  • Hand in the receipt
  • Collect the certificate

I felt like the Logo turtle.

Posted Sat Jun 6 00:57:39 2009 Tags:

My company

I had an interesting conversation in which I was explaining my current work situation. It might be fun to share:

< mornfall> you are working on a timed contract?
< enrico> I'm probably worse than a timed contract: I had to setup a company
< enrico> The government is restricting them so badly that they couldn't hire
          me even temporarily, and they had to buy software from me
< enrico> so I had to become a company
< enrico> I AM THE COMPANY
< mornfall> oh duh
< enrico> I'm my employer
< enrico> and my employee
< enrico> in all this, the government proudly shows statistics of lots of new
          companies being created in Italy as a sign that our economy is so
          good
< mornfall> be nice to your employee :-)
< mornfall> hmh
< enrico> so last friday there was a general strike and my company sent a
          press release to the major labor union telling them that the CEO
          together with all the employees decided to fully take part in the
          strike
< enrico> Signed: Enrico Zini, directory of the company Enrico Zini;
          Employees of the company Enrico Zini: Enrico Zini
< mornfall> that's getting tough
< enrico> I'm very nice to my employees, hes :)
< enrico> but sometimes I wonder if masturbation could be considered sexual
          harassment from my boss
< mornfall> LOL
< mornfall> you need to appoint a secretary
< enrico> I have one: Enrico Zini.  He's a bit messy, though.
< enrico> The only thing which is not called Enrico Zini is the accountant.
< enrico> Incidentally, he's called Vincenzo Zini, but he's isn't a relative
          of mine :)
< mornfall> in that case i think the boss is harassing the secretary not the
          other employee
< enrico> However, he is member of the local LUG and has a Jabber account :)
< mornfall> fun :)
< enrico> oh, ok.  But the employee might be harassing the secretary, or the
          secretary is harassing the employee and the boss
< enrico> IDEA
< enrico> Tonight I'll propose an orgy with the boss, the secretaries and the
          employees!
< enrico> (but not the accountant)

(blogged with mornfall's permission)

Posted Sat Jun 6 00:57:39 2009 Tags:

Off by one

A few days ago I was having a hot pot and a small chat with friends around a table:

X: How old are you?  I was born in 1976
Me: I was born in 1976 as well.  Cool year!

a few minutes pass, I realise something:

Me: So, since you were born in 1976, how old are you?
X: I'm 30
Me: I thought so.  I'm 29 :)

It turns out that in some countries, at the moment you're born you're 1 year old instead of 0: it's just a matter of convention.

And today it's the first day of 2006. It's also the 95th year since the founding of the Republic of China, which is another way of counting years here. The good thing is that if you see that a bottle of milk expires in '95, it means it could still be good.

So, when was the Republic of China founded? In 1912, of course; but since it was already 1 year old when it was founded, we're now in year 95 instead of 94.
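
The off-by-one is just inclusive counting. A throwaway line of Python of my own to make the arithmetic explicit (an illustration, obviously not an official calendar API):

    def roc_year(gregorian_year):
        # The ROC calendar counts its founding year, 1912, as year 1 rather than 0
        return gregorian_year - 1912 + 1

    print(roc_year(2006))  # -> 95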

Things would be even more interesting if the Lunar Calendar used numbers for the years, but it instead uses a "sexagesimal stem-branch cycle", so I cannot say that, according to the lunar calendar, we're still in 2005 until February 29. However, people do have two different birthdays.

And why am I in Taiwan, where they count years since the founding of the Republic of China? I'll tell you if you buy me a glass of wine :)

Anyway, Today is Sweetmorn, the 1st day of Chaos in the YOLD 3172

Posted Sat Jun 6 00:57:39 2009 Tags:

Ethiopia

It is interesting, beautiful and sad at the same time to find yourself redefining the meaning of "Abissinia". And to curse the fact that for the first 30 years of your life you only ever heard that word when some asshole was singing "Faccetta nera".

Posted Sat Jun 6 00:57:39 2009 Tags:

Housing agent in the UK

Yesterday afternoon:

Me: "Where is the gas central heating?"

Agent: "Over there, behind the chimney"

Me: "I see. Beyond that wall is another apartment: where do the fumes go?"

Agent: "Up the chimney"

Me: "Good. And where does it get oxygen from?"

Agent: "It does not need any oxygen, it only needs gas"

Me: "The gas central heater does not have any oxygen or air??"

Agent: "No, it only needs gas"

Me: "I think the laws of physics would have problems with that"

Agent: "You'd have to ask that to British Gas"

Posted Sat Jun 6 00:57:39 2009 Tags:

Hallucinations

Bologna, 14 December 2005.

This morning I was driving to Bologna (it rarely happens, but I hate it when it does).

At Primo Maggio I go straight on towards via Colombo, and out of the corner of my eye I catch a clearance-sale banner on a warehouse on the right.

There's something odd about it. I take a better look.

"No more phone calls! Total clearance sale due to change of management"

I smile: the warehouse was "Il Mobile di Castel Maggiore" (a furniture shop: in Italian "il mobile" is a piece of furniture, but it also reads as "the mobile phone").

Posted Sat Jun 6 00:57:39 2009 Tags:

Italian weather agencies

Most Italian regions have agencies providing good regional weather forecasts. However, I could not find a useful national index of all their forecast pages, so I made one, tweaking an image map found on Wikipedia.

[Clickable map of Italy divided into regions ("Italia suddivisa per regioni"), with each region linking to its weather agency]

I only filled in the regions I was checking in order to organise a specific trip. With time, if I keep using this map, I'll add more. Of course, if you know of a weather forecast page from a regional, publicly funded agency that is missing here, send me an e-mail.

Posted Sat Jun 6 00:57:39 2009 Tags:

More goofing off in literature

After my talk on goofing off and its English version, the good godog points me to another series of links to "futile" literature:

I quote facetia LXI in full:

OF GUGLIELMO, WHO WAS ABUNDANTLY ENDOWED

In the city of Terranova there lived a man named Guglielmo, a carpenter by trade, whom nature had provided for very generously. His fortunate wife told the neighbours about it, and when she died he took as his new wife a naive young girl named Antonia who, once betrothed, learnt from the neighbours what a powerful weapon her husband possessed. On her first night with her husband she trembled greatly, tried to escape him and would not let him proceed. The man understood what the girl was afraid of, and to console her he told her that what she had heard was true, but that he had two of them, a bigger one and a smaller one: "And this one," he added, "so as not to hurt you, is the one I will use tonight; you will see that it will do you good; then, if you like it, we shall try the bigger one." The girl consented, and yielded to the man without tears and without pain. After a month, having grown bolder and more daring, one night, while caressing her husband, she said: "My friend, what if you now used the other one, the bigger one?" And the man, who had one almost the size of a donkey's, laughed at the woman's appetite; and I once heard him tell this story himself, in company.

Posted Sat Jun 6 00:57:39 2009 Tags:

Buildings

From a song in Amharic:

"Your love has grown old

like the buildings built by the Italians"

Posted Sat Jun 6 00:57:39 2009 Tags:
Posted Sat Aug 1 15:48:47 2015

Entries to also be published on truelite.it

Free as in Facebook

Yesterday we were in an airport. We tried to connect to the airport "free" wifi. It had a captive portal that asked for a lot of personal information before one could maybe get on the internet, and we gave up. Bologna Airport, no matter what they do to pretend that they like you, it's always clear that they don't.

I looked at the captive portal screen and I said: «ah yes, "free" wifi. Free as in Facebook».

We figured we had coined an expression that deserves to be reused.

Posted Mon Mar 9 10:58:49 2015 Tags:

Setting up Akonadi

Now that I have a CalDAV server that syncs with my phone I would like to use it from my desktop.

It looks like akonadi is able to sync with CalDAV servers, so I'm giving it a try.

First things first: let's give a meaning to the arbitrary name of this thing. Wikipedia says it is the oracle goddess of justice in Ghana. That still does not hint at all at personal information servers, but it seems quite nice. Ok. I gave up on software having purpose-related names ages ago.

# apt-get install akonadi-server akonadi-backend-postgresql

Akonadi wants a SQL database as a backend. By default it uses MySQL, but I had enough of MySQL ages ago.

I tried SQLite but the performance with it is terrible. Terrible as in, it takes 2 minutes between adding a calendar entry and having it show up in the calendar. I'm fascinated by how Akonadi manages to use SQLite so badly, but since I currently just want to get a job done, next in line is PostgreSQL:

# su - postgres
$ createuser enrico
$ psql postgres
postgres=# alter user enrico createdb;

Then as enrico:

$ createdb akonadi-enrico
$ cat <<EOT > ~/.config/akonadi/akonadiserverrc
[%General]
Driver=QPSQL

[QPSQL]
Name=akonadi-enrico
StartServer=false
Host=
Options=
ServerPath=
InitDbPath=
EOT

I can now use Kontact to connect Akonadi to my CalDAV server, and it works nicely, both with calendar and with addressbook entries.

KDE has at least two clients for Akonadi: Kontact, which is a kitchen sink application similar to Evolution, and KOrganizer, which is just the calendar and scheduling component of Kontact.

Both work decently, and KOrganizer has a pretty decent startup time. I now have a usable desktop PIM application that is synced with my phone. W00T!

Next step is to port my swift little calendar display tool to use Akonadi as a back-end.

Posted Tue Feb 17 15:34:55 2015 Tags:

seat-inspect

Four months ago I wrote this somewhere:

Seeing a DD saying "this new dbus stuff scares me" would make most debian users scared. Seeing a DD who has an idea of what is going on, and who can explain it, would be an interesting and exciting experience.

So, let's be exemplary, competent and patient. Or at least, competent. Some may like or not like the changes, but do we all understand what is going on? Will we all be able to support our friends and customers running jessie?

I confess that although I understand the need for it, I don't feel competent enough to support systemd-based machines right now.

So, are we maybe in need of help, cheat sheets, arsenals of one-liners, diagnostic tools?

Maybe a round of posts on -planet like "one debian package a day" but with new features that jessie will have, and how to understand them and take advantage of them?

That was four months ago. In the meantime, I did some work, and it got better for me.

Yesterday, however, I saw an experienced Linux person frustrated because the desktop's shutdown function was doing nothing whatsoever. Today I found John Goerzen's post on planet.

I felt like some more diagnostic tools were needed, so I spent the day making seat-inspect.

seat-inspect tries to make the status of the login/seat system visible, to help with understanding and troubleshooting.

The intent of running the code is to have an overview of the system status, both to see what the new facilities are about, and to figure out if there is something out of place.

The intent of reading the code is to have an idea of how to use these facilities: the code has been written to be straightforward and is annotated with relevant bits from the logind API documentation.

seat-inspect is not a finished tool, but a starting point. I put it on github hoping that people will fork it and add their own extra sanity checks and warnings, so that it can grow into a standard thing to run if a system acts weird.

As it is now, it should be able to issue warnings if some bits are missing for network-manager or shutdown functions to work correctly. I haven't really tested that, though, because I don't have a system at hand where they are currently not working fine.

Another nice thing is that running seat-inspect -v gives you a dump of what logind/ConsoleKit think about your system. I found it an interesting way to explore the new functionality that we recently grew. The same can be done, and in more detail, with loginctl calls, but I lacked a summary.
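
If you want to poke at the same information by hand, the logind D-Bus API is reachable from a few lines of Python. A minimal sketch, assuming the python3-dbus bindings are installed; seat-inspect itself does quite a bit more than this:

    import dbus

    bus = dbus.SystemBus()
    logind = bus.get_object("org.freedesktop.login1", "/org/freedesktop/login1")
    manager = dbus.Interface(logind, "org.freedesktop.login1.Manager")

    # ListSeats() returns (seat_id, object_path) structs;
    # ListSessions() returns (session_id, uid, user_name, seat_id, object_path)
    for seat_id, path in manager.ListSeats():
        print("seat:", seat_id)
    for session_id, uid, user, seat_id, path in manager.ListSessions():
        print("session {} of {} on seat {}".format(session_id, user, seat_id or "(none)"))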

After writing this I feel a bit more competent, probably enough to sit at somebody's computer and poke into loginctl bits. I highly recommend the experience.

Posted Tue Feb 10 18:06:43 2015 Tags:

Playing with python, terminfo and command output

I am experimenting with showing progress on the terminal for a subcommand that is being run, showing what is happening without scrolling away the output of the main program, and I came up with this little toy. It shows the last X lines of a subcommand's output, then gets rid of everything once the command has ended.

Usability-wise, it feels like a tease to me: it looks like I'm being shown all sorts of information, which is then taken away before I've managed to make sense of it. However, I find it cute enough to share:

#!/usr/bin/env python3
#coding: utf-8
# Copyright 2015 Enrico Zini <enrico@enricozini.org>.  Licensed under the terms
# of the GNU General Public License, version 2 or any later version.

import argparse
import fcntl
import select
import curses
import contextlib
import subprocess
import os
import sys
import collections
import shlex
import shutil
import logging

def stream_output(proc):
    """
    Take a subprocess.Popen object and generate its output, line by line,
    annotated with "stdout" or "stderr". At process termination it generates
    one last element: ("result", return_code) with the return code of the
    process.
    """
    fds = [proc.stdout, proc.stderr]
    bufs = [b"", b""]
    types = ["stdout", "stderr"]
    # Set both pipes as non-blocking
    for fd in fds:
        fcntl.fcntl(fd, fcntl.F_SETFL, os.O_NONBLOCK)
    # Multiplex stdout and stderr with different prefixes
    while len(fds) > 0:
        s = select.select(fds, (), ())
        for fd in s[0]:
            idx = fds.index(fd)
            buf = fd.read()
            if len(buf) == 0:
                # This stream closed: flush any buffered partial line, then
                # drop its entries from all three parallel lists so that
                # their indices stay aligned
                fds.pop(idx)
                if len(bufs[idx]) != 0:
                    yield types[idx], bufs[idx]
                bufs.pop(idx)
                types.pop(idx)
            else:
                bufs[idx] += buf
                lines = bufs[idx].split(b"\n")
                bufs[idx] = lines.pop()
                for l in lines:
                    yield types[idx], l
    res = proc.wait()
    yield "result", res

@contextlib.contextmanager
def miniscreen(has_fancyterm, name, maxlines=3, silent=False):
    """
    Show the output of a process scrolling in a portion of the screen.

    has_fancyterm: true if the terminal supports fancy features; if false, just
    write lines to standard output

    name: name of the process being run, to use as a header

    maxlines: maximum height of the miniscreen

    silent: do nothing whatsoever, used to disable this without needing to
            change the code structure

    Usage:
        with miniscreen(True, "my process", 5) as print_line:
            for i in range(10):
                print_line(("stdout", "stderr")[i % 2], "Line #{}".format(i))
    """
    if not silent and has_fancyterm:
        # Discover all the terminal control sequences that we need
        output_normal = str(curses.tigetstr("sgr0"), "ascii")
        output_up = str(curses.tigetstr("cuu1"), "ascii")
        output_clreol = str(curses.tigetstr("el"), "ascii")
        cols, lines = shutil.get_terminal_size()
        output_width = cols

        # tparm() wants bytes, so fall back to b"" rather than ""
        fg_color = (curses.tigetstr("setaf") or
                    curses.tigetstr("setf") or b"")
        sys.stdout.write(str(curses.tparm(fg_color, 6), "ascii"))

        output_lines = collections.deque(maxlen=maxlines)

        def print_lines():
            """
            Print the lines in our buffer, then move back to the beginning
            """
            sys.stdout.write("{} progress:".format(name))
            sys.stdout.write(output_clreol)
            for msg in output_lines:
                sys.stdout.write("\n")
                sys.stdout.write(msg)
                sys.stdout.write(output_clreol)
            sys.stdout.write(output_up * len(output_lines))
            sys.stdout.write("\r")

        try:
            print_lines()

            def _progress_line(type, line):
                """
                Print a new line to the miniscreen
                """
                # Add the new line to our output buffer
                msg = "{} {}".format("." if type == "stdout" else "!", line)
                if len(msg) > output_width - 4:
                    msg = msg[:output_width - 4] + "..."
                output_lines.append(msg)
                # Update the miniscreen
                print_lines()

            yield _progress_line

            # Clear the miniscreen by filling our ring buffer with empty lines
            # then printing them out
            for i in range(maxlines):
                output_lines.append("")
            print_lines()
        finally:
            sys.stdout.write(output_normal)
    elif not silent:
        def _progress_line(type, line):
            print("{}: {}".format(type, line))
        yield _progress_line
    else:
        def _progress_line(type, line):
            pass
        yield _progress_line

def run_command_fancy(name, cmd, env=None, logfd=None, fancy=True, debug=False):
    quoted_cmd = " ".join(shlex.quote(x) for x in cmd)
    log.info("%s running command %s", name, quoted_cmd)
    if logfd: print("runcmd:", quoted_cmd, file=logfd)

    # Run the script itself on an empty environment, so that what was
    # documented is exactly what was run
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, env=env)

    with miniscreen(fancy, name, silent=debug) as progress:
        stderr = []
        for type, val in stream_output(proc):
            if type == "stdout":
                val = val.decode("utf-8")
                if logfd: print("stdout:", val, file=logfd)
                log.debug("%s stdout: %s", name, val)
                progress(type, val)
            elif type == "stderr":
                val = val.decode("utf-8")
                if logfd: print("stderr:", val, file=logfd)
                stderr.append(val)
                log.debug("%s stderr: %s", name, val)
                progress(type, val)
            elif type == "result":
                if logfd: print("retval:", val, file=logfd)
                log.debug("%s retval: %d", name, val)
                retval = val

    if retval != 0:
        lastlines = min(len(stderr), 5)
        log.error("%s exited with code %s", name, retval)
        log.error("Last %d lines of standard error:", lastlines)
        for line in stderr[-lastlines:]:
            log.error("%s: %s", name, line)

    return retval


parser = argparse.ArgumentParser(description="run a command showing only a portion of its output")
parser.add_argument("--logfile", action="store", help="specify a file where the full execution log will be written")
parser.add_argument("--debug", action="store_true", help="debugging output on the terminal")
parser.add_argument("--verbose", action="store_true", help="verbose output on the terminal")
parser.add_argument("command", nargs="*", help="command to run")
args = parser.parse_args()

if args.debug:
    loglevel = logging.DEBUG
elif args.verbose:
    loglevel = logging.INFO
else:
    loglevel = logging.WARN
logging.basicConfig(level=loglevel, stream=sys.stderr)
log = logging.getLogger()

fancy = False
if not args.debug and sys.stdout.isatty():
    curses.setupterm()
    if curses.tigetnum("colors") > 0:
        fancy = True

if args.logfile:
    logfd = open(args.logfile, "wt")
else:
    logfd = None

retval = run_command_fancy("miniscreen example", args.command, logfd=logfd, fancy=fancy, debug=args.debug)

sys.exit(retval)
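
A quick way to try it, assuming the script above is saved as miniscreen.py; the -- keeps argparse from treating the subcommand's flags as its own:

    $ python3 miniscreen.py --logfile full.log -- rsync -av /src/ /dst/
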
Posted Wed Jan 21 11:13:31 2015 Tags:

Upgrade CyanogenMod with an encrypted phone

CyanogenMod found an update, downloaded it, then rebooted to install it, and nothing happened. It turns out that the update procedure cannot work if the zip file to install is on encrypted media, so a workaround is to move the zip to unencrypted external storage.

As far as I know, my Nexus 4 has no unencrypted external storage.

This is how I managed to upgrade it; I write it down here so I can find it next time:

  1. enable USB debugging
  2. adb pull /cmupdater/cm-11-20141115-SNAPSHOT-M12-mako.zip
  3. adb reboot recovery
  4. choose "install zip from sideload"
  5. adb sideload cm-11-20141115-SNAPSHOT-M12-mako.zip
Posted Fri Dec 19 10:21:29 2014 Tags:

Radicale and DAVdroid

radicale and DAVdroid appeal to me. Let's try to make the whole thing work.

A self-signed SSL certificate

Generating the certificate:

    openssl req -nodes -x509 -newkey rsa:2048 -keyout cal-key.pem -out cal-cert.pem -days 3650
    [...]
    Country Name (2 letter code) [AU]:IT
    State or Province Name (full name) [Some-State]:Bologna
    Locality Name (eg, city) []:
    Organization Name (eg, company) [Internet Widgits Pty Ltd]:enricozini.org
    Organizational Unit Name (eg, section) []:
    Common Name (e.g. server FQDN or YOUR name) []:cal.enricozini.org
    Email Address []:postmaster@enricozini.org

Installing it on my phone:

    openssl x509 -in cal-cert.pem -outform DER -out cal-cert.crt
    adb push cal-cert.crt /mnt/sdcard/
    enrico --follow-instructions http://davdroid.bitfire.at/faq/entry/importing-a-certificate

Installing radicale in my VPS

An updated radicale package, with this patch to make it work with DAVDroid:

    apt-get source radicale
    # I reviewed 063f7de7a2c7c50de5fe3f8382358f9a1124fbb6
    git clone https://github.com/Kozea/Radicale.git
    # move the python code from git into the Debian source tree
    dch -v 0.10~enrico  "Pulled in the not yet released 0.10 work from upstream"
    debuild -us -uc -rfakeroot

Install the package:

    # dpkg -i python-radicale_0.10~enrico0-1_all.deb
    # dpkg -i radicale_0.10~enrico0-1_all.deb

Create a system user to run it:

    # adduser --system --disabled-password radicale

Configure it for mod_wsgi with auth done by Apache:

    # For brevity, this is my config file with comments removed

    [storage]
    # Storage backend
    # Value: filesystem | multifilesystem | database | custom
    type = filesystem

    # Folder for storing local collections, created if not present
    filesystem_folder = /var/lib/radicale/collections

    [logging]
    config = /etc/radicale/logging

Create the wsgi file to run it:

    # mkdir /srv/radicale
    # cat <<EOT > /srv/radicale/radicale.wsgi
    import radicale
    radicale.log.start()
    application = radicale.Application()
    EOT
    # chown radicale.radicale /srv/radicale/radicale.wsgi
    # chmod 0755 /srv/radicale/radicale.wsgi

Make radicale commit to git

    # apt-get install python-dulwich
    # cd /var/lib/radicale/collections
    # git init
    # chown radicale.radicale -R /var/lib/radicale/collections/.git

Apache configuration

Add a new site to apache:

    $ cat /etc/apache2/sites-available/cal.conf
    # For brevity, this is my config file with comments removed
    <IfModule mod_ssl.c>
    <VirtualHost *:443>
            ServerName cal.enricozini.org
            ServerAdmin enrico@enricozini.org

            Alias /robots.txt /srv/radicale/robots.txt
            Alias /favicon.ico /srv/radicale/favicon.ico

            WSGIDaemonProcess radicale user=radicale group=radicale threads=1 umask=0027 display-name=%{GROUP}
            WSGIProcessGroup radicale
            WSGIScriptAlias / /srv/radicale/radicale.wsgi

            <Directory /srv/radicale>
                    # WSGIProcessGroup radicale
                    # WSGIApplicationGroup radicale
                    # WSGIPassAuthorization On
                    AllowOverride None
                    Require all granted
            </Directory>

            <Location />
                    AuthType basic
                    AuthName "Enrico's Calendar"
                    AuthBasicProvider file
                    AuthUserFile /usr/local/etc/radicale/htpasswd
                    Require user enrico
            </Location>

            ErrorLog ${APACHE_LOG_DIR}/cal-enricozini-org-error.log
            LogLevel warn

            CustomLog ${APACHE_LOG_DIR}/cal-enricozini-org-access.log combined

            SSLEngine on
            SSLCertificateFile    /etc/ssl/certs/cal.pem
            SSLCertificateKeyFile /etc/ssl/private/cal.key
    </VirtualHost>
    </IfModule>

Then enable it:

    # a2ensite cal.conf
    # service apache2 reload

Create collections

DAVdroid seems to want to see existing collections on the server, so we create them:

    $ apt-get install cadaver
    $ cat <<EOT > /tmp/empty.ics
    BEGIN:VCALENDAR
    VERSION:2.0
    END:VCALENDAR
    EOT
    $ cat <<EOT > /tmp/empty.vcf
    BEGIN:VCARD
    VERSION:2.1
    END:VCARD
    EOT
    $ cadaver https://cal.enricozini.org
    WARNING: Untrusted server certificate presented for `cal.enricozini.org':
    [...]
    Do you wish to accept the certificate? (y/n) y
    Authentication required for Enrico's Calendar on server `cal.enricozini.org':
    Username: enrico
    Password: ****
    dav:/> cd enrico/contacts.vcf/
    dav:/> put /tmp/empty.vcf
    dav:/> cd ../calendar.ics/
    dav:/> put /tmp/empty.ics
    dav:/enrico/calendar.ics/> ^D
    Connection to `cal.enricozini.org' closed.

DAVdroid configuration

  1. Add a new DAVdroid sync account
  2. Use server/username configuration
  3. For server, use https:////
  4. Add username and password

It should work.

Posted Tue Dec 9 16:35:50 2014 Tags:

Alternate rescue boot entry with systemd

Since systemd version 215, adding systemd.debug-shell to the kernel command line activates the debug shell on tty9 alongside the normal boot. I like the idea of that, and I'd like to have it in my standard 'rescue' entry in my grub menu.

Unfortunately, by default update-grub does not allow customizing the rescue menu entry options. I have just filed #766530 hoping for that to change.

After testing the patch I proposed for /etc/grub.d/10_linux, I now have this in my /etc/default/grub, with some satisfaction:

GRUB_CMDLINE_LINUX_RECOVERY="systemd.log_target=kmsg systemd.log_level=debug systemd.debug-shell"

Thanks to sjoerd and uau on #debian-systemd for their help.

Posted Thu Oct 23 22:06:30 2014 Tags:

Zero-kilometre spelling

Is the international spelling alphabet too globalised for you? Would you like to get back in touch with the place where you were born and raised?

As of today there is a script for that: tell it where you live, and it builds your zero-kilometre spelling alphabet.

$ git clone git@gitorious.org:trespolo/osmspell.git
$ cd osmspell
$ ./osmspell "San Giorgio di Piano"
1: San Giorgio di Piano, BO, EMR, Italia
2: San Giorgio di Piano, Via Codronchi, San Giorgio di Piano, BO, EMR, Italia
3: San Giorgio Di Piano, Via Libertà, San Giorgio di Piano, BO, EMR, Italia
Choose one: 1
Center: 44.6465332, 11.3790398
A Argelato, Altedo
B Bentivoglio, Bologna, Boschi
C Cinquanta, Castagnolo Minore, Castel Maggiore, Cento
D Dosso
E Eremo di Tizzano
F Funo di Argelato, Finale Emilia, Ferrara, Fiesso
G Gherghenzano, Galliera, Gesso
I Il Cucco, Irnerio, Idice
L Località Fortuna, Lovoleto, Lippo
M Malacappa, Massumatico, Minerbio, Marano
N Navile
O Osteriola, Ozzano dell'Emilia, Oca
P Piombino, Padulle, Poggio Renatico, Piave
Q Quarto Inferiore, Quattrina
R Rubizzano, Renazzo, Riale
S San Giorgio di Piano, Saletto
T Torre Verde, Tintoria, Tombe
U Uccellino
V Venezzano Mascarino, Vigarano Mainarda, Veduro
X XII Morelli
Z Zenerigolo, Zola Predosa

The data comes from OSM, and the script is a nice example of how to use its geolocation APIs (fast) and its geographic query API (slow).
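
For reference, here is a minimal sketch, mine and not part of osmspell, of the kind of geolocation query the script builds on, using OSM's Nominatim search API; the User-Agent string is a made-up example:

#!/usr/bin/env python3
# Minimal sketch of an OSM Nominatim geolocation query; not part of
# osmspell, and the User-Agent value is a made-up example.
import json
import urllib.parse
import urllib.request

def geocode(place):
    url = ("https://nominatim.openstreetmap.org/search?format=json&q="
           + urllib.parse.quote(place))
    # Nominatim's usage policy asks for an identifying User-Agent
    req = urllib.request.Request(url, headers={"User-Agent": "osmspell-example"})
    with urllib.request.urlopen(req) as fd:
        return json.loads(fd.read().decode("utf-8"))

for hit in geocode("San Giorgio di Piano"):
    print(hit["display_name"], hit["lat"], hit["lon"])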

Posted Sat Jan 4 00:38:16 2014 Tags:
Posted Tue Apr 7 19:20:12 2015

Pages with tips about Debian.

Resolving IP addresses in vim

A friend on IRC said: "I wish vim had a command to resolve all the IP addresses in a block of text".

But it does:

:<block>!perl -MSocket -pe 's/(\d+\.\d+\.\d+\.\d+)/gethostbyaddr(inet_aton($1), AF_INET)/ge'

If you use it often, put the perl command in a one-liner script and call it from an editor macro. It works in other editors too, and even without an editor at all. And it can be scripted!
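
If perl is not your thing, a rough Python equivalent of the same filter, a sketch of my own rather than a drop-in replacement, could look like this:

#!/usr/bin/env python3
# Rough Python equivalent of the perl filter above: read text on stdin,
# replace each IPv4 address with the result of a reverse DNS lookup.
import re
import socket
import sys

def resolve(match):
    ip = match.group(0)
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return ip  # leave unresolvable addresses as they are

for line in sys.stdin:
    sys.stdout.write(re.sub(r"\d+\.\d+\.\d+\.\d+", resolve, line))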

We live with the power of Unix every day, so much that we risk forgetting how awesome it is.

Posted Wed Mar 7 14:07:07 2012 Tags:

SQLAlchemy, MySQL and sql_mode=traditional

As everyone should know, by default MySQL is an embarrassingly stupid toy:

mysql> create table foo (val integer not null);
Query OK, 0 rows affected (0.03 sec)

mysql> insert into foo values (1/0);
ERROR 1048 (23000): Column 'val' cannot be null

mysql> insert into foo values (1);
Query OK, 1 row affected (0.00 sec)

mysql> update foo set val=1/0 where val=1;
Query OK, 1 row affected, 1 warning (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 1

mysql> select * from foo;
+-----+
| val |
+-----+
|   0 |
+-----+
1 row in set (0.00 sec)

Luckily, you can tell it to stop being embarrassingly stupid:

mysql> set sql_mode="traditional";
Query OK, 0 rows affected (0.00 sec)

mysql> update foo set val=1/0 where val=0;
ERROR 1365 (22012): Division by 0

(There is an even better sql mode you can choose, though: it is called "Install PostgreSQL")

Unfortunately, I've been hired to work on a project that relies on the embarrassingly stupid behaviour of MySQL, so I cannot set sql_mode=traditional globally or the existing house of cards will collapse.

Here is how you set it session-wide with SQLAlchemy 0.6.x; it took me quite a while to find out:

import sqlalchemy.interfaces

# Without this, MySQL will silently insert invalid values in the
# database, causing very long debugging sessions in the long run
class DontBeSilly(sqlalchemy.interfaces.PoolListener):
    def connect(self, dbapi_con, connection_record):
        cur = dbapi_con.cursor()
        cur.execute("SET SESSION sql_mode='TRADITIONAL'")
        cur = None
engine = create_engine(..., listeners=[DontBeSilly()])
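
For what it's worth, SQLAlchemy 0.7 and later replaced pool listeners with the event system; here is an untested sketch of what the equivalent setup should look like there (the connection URL is a placeholder):

from sqlalchemy import create_engine, event

engine = create_engine("mysql://user:password@localhost/db")  # placeholder

# Runs once for every new low-level DBAPI connection in the pool
@event.listens_for(engine, "connect")
def dont_be_silly(dbapi_con, connection_record):
    cur = dbapi_con.cursor()
    cur.execute("SET SESSION sql_mode='TRADITIONAL'")
    cur.close()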

Why it takes all that effort is beyond me. I'd have expected this to be turned on by default, possibly with a switch that insane people could use to turn it off.

Posted Mon Feb 27 19:45:58 2012 Tags:

Tips on using python datetime module

Python's datetime module is one of those bits of code that tend not to do what one would expect them to do.

I have come to adopt some extra usage guidelines in order to preserve my sanity:

  • Avoid using str(datetime_object) or isoformat to serialize a datetime: there is no function in the library that can parse all its possible outputs
  • datetime.strptime silently throws away all timezone information. If you look very closely, it even says so in its documentation
  • Timezones do not exist: all datetime objects have to be naive. Aware means broken.
  • datetime objects must always represent times in UTC
  • datetime.now() is never to be used. Always use datetime.utcnow()
  • Be careful of 3rd party python modules: people have a dangerous tendency to use datetime.now()
  • If a conversion to some local time is needed, it shall be done via either some ugly thing like time.localtime(int(dt.strftime("%s"))) or via the pytz module
  • pytz must be used directly, and never via timezone aware datetime objects, because datetime objects fail in querying pytz:

That’s right, the datetime object created by a call to datetime.datetime constructor now seems to think that Finland uses the ancient “Helsinki Mean Time” which was obsoleted in the 1920s. The reason for this behaviour is clearly documented on the pytz page: it seems the Python datetime implementation never asks the tzinfo object what the offset to UTC on the given date would be. And without knowing it pytz seems to default to the first historical definition. Now, some of you fellow readers could insist on the problem going away simply by defaulting to the latest time zone definition. However, the problem would still persist: For example, Venezuela switched to GMT-04:30 on 9th December, 2007, causing the datetime objects representing dates either before, or after the change to become invalid.

From: http://blog.redinnovation.com/2008/06/30/relativity-of-time-shortcomings-in-python-datetime-and-workaround/

  • Timezone-aware datetime objects have other bugs: for example, they fail to compute Unix timestamps correctly. The following example shows two timezone-aware objects that represent the same instant but produce two different timestamps.
>>> import datetime as dt
>>> import pytz
>>> utc = pytz.timezone("UTC")
>>> italy = pytz.timezone("Europe/Rome")
>>> a = dt.datetime(2008, 7, 6, 5, 4, 3, tzinfo=utc)
>>> b = a.astimezone(italy)
>>> str(a)
'2008-07-06 05:04:03+00:00'
>>> a.strftime("%s")
'1215291843'
>>> str(b)
'2008-07-06 07:04:03+02:00'
>>> b.strftime("%s")
'1215299043'
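
As an illustration of the guidelines above, here is a minimal sketch of mine, assuming pytz is installed: keep naive UTC datetimes internally, compute timestamps without strftime("%s"), and build an aware object only transiently, for display:

import calendar
import datetime as dt

import pytz

def utc_to_local(naive_utc, tzname):
    # Attach UTC explicitly, then let pytz compute the local time
    return pytz.utc.localize(naive_utc).astimezone(pytz.timezone(tzname))

def to_timestamp(naive_utc):
    # Unambiguous Unix timestamp from a naive UTC datetime
    # (truncates microseconds)
    return calendar.timegm(naive_utc.timetuple())

now = dt.datetime.utcnow()  # naive, by convention UTC
print(utc_to_local(now, "Europe/Rome"))
print(to_timestamp(now))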
Posted Thu Jun 25 19:18:25 2009 Tags:

Cute things one could do with debtags

While answering a long message on the debtags-devel mailing list, I accidentally put together the pieces of a fun idea.

This is the bit of message I was answering:

  • It would be very useful if the means for indicating the supported
    data formats was more comprehensive. This could mean a lot of
    expanding in the "works-with-format" section of the vocabulary, which
    doesn't even include formats such as gif or mpg at the moment. I don't
    know how feasible it is to alter underlying debtags functionality, but
    perhaps it would be the easiest to make "works-with-format" a special
    case tag which allows for formats not listed in the vocabulary.

This is my answer:

Good point. The idea has popped up in the past to list supported mime types among the package metadata, so that one could point to a file and get a list of all the packages that can work with it.

I'm not sure it's a good idea to encode mime types in debtags and I'd like to see something ad-hoc for it. In the meantime works-with-format is the best we can do, but we should limit it to the most common formats.

This is the fun idea: if works-with-format is the best we can do, what can we do with it?

Earlier today I worked on resurrecting some old code of mine to expand Zack's ls2rss with Dublin Core metadata extracted from the files. The mime type scanner was ready for action.

Some imports:

import sys
# Requires python-extractor, python-magic, python-apt
# and an unreleased python-debtags from http://bzr.debian.org/bzr/pkg-python-debian/trunk/
import extractor
import magic
from debian_bundle import debtags
import re
from optparse import OptionParser
import apt

# VERSION is used by the option parser below but was never defined in
# the original snippets; any version string will do
VERSION = "0.1"

A tentative mapping between mime types and debtags tags:

mime_map = (
        ( r'text/html\b', ("works-with::text","works-with-format::html") ),
        ( r'text/plain\b', ("works-with::text","works-with-format::plaintext") ),
        ( r'text/troff\b', ("works-with::text", "works-with-format::man") ),
        ( r'image/', ("works-with::image",) ),
        ( r'image/jpeg\b', ("works-with::image:raster","works-with-format::jpg") ),
        ( r'image/png\b', ("works-with::image:raster","works-with-format::png") ),
        ( r'application/pdf\b', ("works-with::text","works-with-format::pdf")),
        ( r'application/postscript\b', ("works-with::text","works-with-format::postscript")),
        ( r'application/x-iso9660\b', ('works-with-format::iso9660',)),
        ( r'application/zip\b', ('works-with::archive', 'works-with-format::zip')),
        ( r'application/x-tar\b', ('works-with::archive', 'works-with-format::tar')),
        ( r'audio/', ("works-with::audio",) ),
        ( r'audio/mpeg\b', ("works-with-format::mp3",) ),
        ( r'audio/x-wav\b', ("works-with-format::wav",) ),
        ( r'message/rfc822\b', ("works-with::mail",) ),
        ( r'video/', ("works-with::video",)),
        ( r'application/x-debian-package\b', ("works-with::software:package",)),
        ( r'application/vnd.oasis.opendocument.text\b', ("works-with::text",)),
        ( r'application/vnd.oasis.opendocument.graphics\b', ("works-with::image:vector",)),
        ( r'application/vnd.oasis.opendocument.spreadsheet\b', ("works-with::spreadsheet",)),
        ( r'application/vnd.sun.xml.base\b', ("works-with::db",)),
        ( r'application/rtf\b', ("works-with::text",)),
        ( r'application/x-dbm\b', ("works-with::db",)),
)

Code that does its best to extract a mime type:

extractor = extractor.Extractor()
magic = magic.open(magic.MAGIC_MIME)
magic.load()

def mimetype(fname):
    keys = extractor.extract(fname)
    xkeys = {}
    for k, v in keys:
        if xkeys.has_key(k):
            xkeys[k].append(v)
        else:
            xkeys[k] = [v]
    namemagic =  magic.file(fname)
    contentmagic = magic.buffer(file(fname, "r").read(4096))
    return xkeys.has_key("mimetype") and xkeys['mimetype'][0] or contentmagic or namemagic

Command line parser:

parser = OptionParser(usage="usage: %prog [options] filename",
        version="%prog "+ VERSION,
        description="search Debian packages that can handle a given file")
parser.add_option("--tagdb", default="/var/lib/debtags/package-tags", help="Tag database to use (default: %default)")
parser.add_option("--action", default=None, help="Show the packages that allow the given action on the file (default: %default)")

(options, args) = parser.parse_args()

if len(args) == 0:
    parser.error("Please provide the name of a file to scan")

And here starts the fun: first we load the debtags data:

# Read full database 
fullcoll = debtags.DB()
tagFilter = re.compile(r"^special::.+$|^.+::TODO$")
fullcoll.read(open(options.tagdb, "r"), lambda x: not tagFilter.match(x))

Then we scan the mime type and look up tags in the mime_map above:

type = mimetype(args[0])
#print >>sys.stderr, "Mime type:", type
found = set()
for match, tags in mime_map:
    match = re.compile(match)
    if match.match(type):
        for t in tags:
            found.add(t)

if len(found) == 0:
    print >>sys.stderr, "Unhandled mime type:", type
else:

If the user only gave the file name, let's show what Debian can do with that file:

    if options.action == None:
        print "Debtags query:", " && ".join(found)

        query = found.copy()
        query.add("role::program")
        subcoll = fullcoll.filterPackagesTags(lambda pt: query.issubset(pt[1]))
        uses = map(lambda x:x[5:], filter(lambda x:x.startswith("use::"), subcoll.iterTags()))
        print "Available actions:", ", ".join(uses)

If the user picked one of the available actions, let's show the packages that do it:

    else:
        aptCache = apt.Cache()
        query = found.copy()
        query.add("role::program")
        query.add("use::"+options.action)
        print "Debtags query:", " && ".join(query)
        subcoll = fullcoll.filterPackagesTags(lambda pt: query.issubset(pt[1]))
        for i in subcoll.iterPackages():
            aptpkg = aptCache[i]
            desc = aptpkg.rawDescription.split("\n")[0]
            print i, "-", desc

\o/

The moral of the story:

  • Debian is lots of fun
  • We have amazing technology just waiting for good ideas.
  • I'd love to see more little scripts like this getting written.
Posted Sat Jun 6 00:57:39 2009 Tags:

Telling gpg not to use the key in the card

So, I created the subkeys for the OpenPGP card, and it works.

Now I'd like to upload some Debian packages, but the uploads fail because my new subkeys aren't yet known to the Debian keyring. I tried to push my subkeys to keyring.debian.org, but uploading afterwards was still rejected. Maybe it takes some time to propagate, maybe there's some other procedure to follow: I don't know.

I didn't manage to figure out what the procedure is for getting a new subkey into the Debian keyring. I wish to replace this paragraph with proper details if I ever find out.

Now, failing to use the subkeys, I had to convince gpg to use my good old main key. The quick and dirty way was to make a backup of the keyring, delete the subkeys, sign and upload.

Seconds after hours of searching had ended in the above crude hack, as normally happens, someone (Holger in this case) suggested the correct way to do it: use --default-key and append an exclamation mark to the end of the key ID.

This was in the gpg manpage, but nowhere near the documentation of --default-key:

Note that you can append an exclamation mark (!) to key IDs or fingerprints.
This flag tells GnuPG to use the specified primary or secondary key and not
to try and calculate which primary or secondary key to use.

So, now I'm happy:

$ gpg --sign  --default-key '797ebfab!'

You need a passphrase to unlock the secret key for
user: [...]

$ gpg --sign
gpg: signatures created so far: xx

Please enter the PIN
[sigs done: xx]
Posted Sat Jun 6 00:57:39 2009 Tags:

How to autologin X without a display manager

Note: this can now be done properly with nodm.

Problem: configure a custom Debian box used to drive some industrial machinery. The system should boot directly into the GUI control application, that runs full screen, with root privileges. Everything should respawn if X is killed or the control application dies.

In theory, you'd run an X display manager with autologin, then run matchbox-window-manager and the control application as the X session. You wish. At the end of the post is an explanation of why this way failed.

So, here is how to get the whole thing to work, without a display manager.

Use init to drive the whole thing:

6:23:respawn:/sbin/getty -L -n -l /usr/local/sbin/autologin

This will respawn everything if it dies, stop respawning if it dies all the time, avoid starting it in single user mode, and not ask for a username.

/usr/local/sbin/autologin contains:

#!/bin/sh
/bin/login -f root MAINAPP=true

This will autologin as root, setting an extra env variable.

Then comes root's ~/.bash_profile, that just starts X if we are doing autologin:

if [ "$MAINAPP" = "true" ]
then
    startx
    logout
fi

If the application was running as a special user, we could have made things simpler and just used startx as the shell for that user; however, we still want root to have bash as the shell, and the above hack does it.

Finally, root's ~/.xsession:

#!/bin/sh

matchbox-window-manager &

# If the touch screen is not calibrated, run the calibration
while [ ! -f /etc/touchscreen-calibration ]
do
        calibrate-touchscreen
done

# Run the main application: if it ends, the session ends
main-application

And there we go, no dependencies at all.

Why not using a display manager

gdm and kdm seem to do autologin, but their dependency list is not acceptable for something that should just respawn an X server in an industrial system that must be kept simple.

xdm on the other hand has a small set of dependencies, but its developers seem to have decided that autologin is "glitz", and that there is no need for it in such a bare-bones display manager.

Dict for "glitz" gives "tasteless showiness". What one has to bear...

wdm seems however even more disconcerting, as it "doesn't actually support autologin but you can set the default user and default password in /etc/X11/wdm/wdm-config. Now, [...] you only need to press Enter twice [...] to login".

Figure how this would look in the manual: "After powering up the unit, attach a USB keyboard and press enter twice to start the system".

And this is why we are not using a display manager.

Posted Sat Jun 6 00:57:39 2009 Tags:

Editing ChangeLog with vim

Turns out vim has a changelog.vim plugin to edit ChangeLog files.

With \o you start a new entry.

A look at /usr/share/vim/vim70/ftplugin/changelog.vim can show some more.

Posted Sat Jun 6 00:57:39 2009 Tags:

Dapper on XEN (part 1/1, unfinished)

I need to do some work on a Dapper system, so it's time to try out xen:

apt-get install linux-image-2.6-xen-686 xen-hypervisor-3.0-i386 xen-utils-3.0 xen-tools

xen-tools tries to recommend xen, which is old and has been requested to be removed from the archive. The recommends needs to be ignored, which is not that trivial to do with aptitude.

The linux-image doesn't happen to have an initrd. The bug has already been reported. One can recreate one using:

mkinitrd -o/boot/initrd-2.6.16-2-xen-686.img  2.6.16-2-xen-686

Then one adds this to /boot/grub/menu.lst:

title Xen
root (hd0,0)
kernel /boot/xen-3.0-i386.gz
module /boot/vmlinuz-2.6.16-2-xen-686 root=/dev/hda1
module /boot/initrd-2.6.16-2-xen-686.img

And it boots lovely:

# xm list
Name                              ID Mem(MiB) VCPUs State  Time(s)
Domain-0                           0      939     2 r-----  4677.9

Now xen-tools provides xen-create-image. I'll try to create the Dapper image (debootstrap on my system has already been changed to be able to install dapper):

xen-create-image --size=4G --swap=128M --dhcp --volume=marvin \
                 --hostname=cavazza --dist=dapper --fs=xfs

But I get:

[...]
Creating swapfile : /dev/marvin/cavazza-swap
Done

Creating disk image: /dev/marvin/cavazza-root
Done

Creating xfs filesystem
Done


Installing the base system.   This will take a while!

Copying files from host to image.
Finished
Something went wrong with the debootstrap installation
Aborting

I love detailed error messages. This one is not.

So I'll have to do it by hand. The documentation says:

Before you can start an additional domain, you must create a configuration file. We provide two example files which you can use as a starting point:

  • /etc/xen/xmexample1 is a simple template configuration file for describing a single VM.
  • /etc/xen/xmexample2 file is a template description that is intended to be reused for multiple virtual machines. Setting the value of the vmid variable on the xm command line fills in parts of this template.

But in Debian there's no trace of the two example files. I've reported it to Guido Trotter. Then I googled for the two files and found them.

Now, on to adapting Dapper to run under Xen. The good Ubuntu wiki has a HOWTO which crudely makes you install the tarballed pre-built installations of Xen 3.0. It's sad that Xen-enabled kernels didn't make it into Dapper proper.

I can make this:

kernel = "/boot/vmlinuz-xen-domu"
memory = 128
name = "Dapper"
disk = [ 'phy:marvin/dapper,hda1,w' ]
dhcp="dhcp"
root = "/dev/hda1 ro"

I used the pre-built kernel as a domU, but:

Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(3,1)

Sigh. I'll try with the Debian Xen-enabled kernels. Installing yaird and linux-image-2.6.16-2-xen-686_2.6.16-14_i386.deb and linux-modules-2.6.16-2-xen-686. It WORKS!

Now, that's not Dapper: that's a debootstrap install of it. No user, no root password, no desktop, no shiny Ubuntu custom config. How to finish the installation? No idea. In the past, I could run base-config, but now it doesn't exist anymore.

It seems that the best bet is not to use debootstrap, but to install the CD in the LVM partition using QEMU and then chroot into it and then install the debian xen kernel to boot it.

A QEMU install will take ages, and I expect some trouble, like getting partman to see LVM. Maybe I better install on a disk image file and then use tar and netcat to bring the ubuntu installation out of the disk image and into the LVM image.

For today, I didn't make it. This job slips another day. Frustration. I hope that at least this description of my efforts so far can be helpful to someone trying similar things.

One suggestion I got was to boot the Dapper CD and install from there. I can't: this computer has no bootable CD.

Update: I found out http://xensource.com/summerofcode.html and it says:

Project Idea 3: Xen Desktop Outside of Domain 0

This project would deliver a Xen desktop to the user of a client system that presents an abstraction of a virtual desktop to the user in which multiple guests share the virtual desktop, each with a subset of the desktop resources (pixels, etc). Smart management of the sound drivers would allow mixing of sound from multiple guests to the single device used for output.

Microphone input could be broadcast to all guests. Technologies such as ALSA already emulate playback settings by downsampling for the hardware etc. so its not hard to imagine a xen-snd-front device munging the data to a common format used for the internal sound card. Issues: synchronization of playback.

So, what I'm trying to do has been proposed as a SoC project and (I guess) is something that can't be done overnight. Although I was planning to run a displayless gdm accessed with XDMCP from domain0's X server, and to just hand out the audio card to the domU using pciback, so my task would have been easier.

Thus I give up using Xen for this one.

I'll work around the lack-of-bootable-cd limitation of this computer by installing Ubuntu using QEMU. Which I found out requires using the "Alternate install" CD. The desktop install CD is a live CD installing with Ubuntu Expresso, and the live CD doesn't seem to work in QEMU.

Or, I'll install in a LVM partition in the laptop and then move it around using the network. This one's probably faster.

Could this experience be an interesting use case for Edgy Eft?

Posted Sat Jun 6 00:57:39 2009 Tags:

Simplify g++ error messages

Do you use templates and the STL, and sometimes get lost in insanely complex compiler errors?

Filter them through this script of mine.

#!/usr/bin/perl -w

use strict;
use warnings;

while (<>)
{
    s/std::basic_string<char, std::char_traits<char>, std::allocator<char> >\s*/std::string/g;
    s/std::set<(.+?), std::less<\1>, std::allocator<\1> >/std::set<$1>/g;
    print;
}

It's quite simple-minded, but it has more than once been a lifesaver.

Update: Ben Hutchings pointed me at stlfilt, which however seems to need some updating for newer versions of g++.

Posted Sat Jun 6 00:57:39 2009 Tags:

Send a fax from the laptop

My bank sent me a PDF form via e-mail. I needed to fill it in, then send it back via fax. Sending it back via e-mail would not work because it's not secure. The bank agrees that this is fantastically silly, but apparently this requirement is not their fault.

Step 1: send a fax with the laptop

  1. apt-get source sl-modem-daemon efax-gtk
  2. patch as instructed in the Debian BTS
  3. pbuilder-satisfydepends, debuild, dpkg -i
  4. slmodemd -c ITALY --alsa hw:0,6
  5. echo ATDmymobilenumber > /dev/ttySL0, and my mobile phone rang
  6. efax-gtk

Believe it or not, at this point I managed to successfully send a test fax.

Background: the laptop's modem is actually a sound card, and is ashamed to admit that it can also work as a modem:

$ lspci
00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio Controller (rev 03)

But the sound card actually has its own bus, which you can query with aplay -l:

$ aplay -l
**** List of PLAYBACK Hardware Devices ****
card 0: Intel [HDA Intel], device 0: ALC861VD Analog [ALC861VD Analog]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 0: Intel [HDA Intel], device 6: Si3054 Modem [Si3054 Modem]
  Subdevices: 0/1
  Subdevice #0: subdevice #0

Then you learn that sl-modem-daemon can drive it both on i386 and on amd64, but that you get "period size 48 is not supported by playback (64)" when trying to dial. But then you find the patch to get rid of that, and it works.

The modem was the last device in the new laptop that I had not yet attempted to use. I can now claim that every single piece of hardware on my ASUS F9E-2P119E laptop can be made to work with Debian. Oh, yes!

Step 2: fill in the form

Much to my surprise, evince allowed me to just click in the form fields and type text. Even checkboxes worked. "Save a copy", however, did not retain the field contents: I had to print to file to get another PDF with the fields filled in. Update: this could be a limitation of that specific PDF, see this thread on the Adobe forums (thanks to Tomas Weber).

However, evince did not allow me to import an image with my signature and paste it in the right place. Inkscape, however, successfully managed to import the PDF as an editable vector drawing that I could change at will. Again, that was impressive.

From there, it was just a matter of pasting the signature in the right place, save as PostScript, give it to efax-gtk and phone the bank to learn that, in fact, the fax was received and was perfectly readable.

Posted Sat Jun 6 00:57:39 2009 Tags:
Posted Sat Jun 6 00:57:39 2009
sw

Software

Billing an Italian public administration

Here's a simple guide to how I managed to bill one of my customers, as is now mandated by law in Italy.

Create a new virtualbox machine

I would never do any of this to any system I would ever want to use for anything else, so it's virtual machine time.

  • I started virtualbox, created a new machine for Ubuntu 32bit with an 8GB disk and 4GB RAM, and placed the .vdi image in an encrypted partition. The web services of Infocert's fattura-pa require "Java (JRE) a 32bit di versione 1.6 o superiore" (a 32-bit Java (JRE), version 1.6 or later).
  • I installed Ubuntu 12.04 on it: that is what dike declares to support.
  • I booted the VM, installed virtualbox-guest-utils, and made sure I also had virtualbox-guest-x11
  • I restarted the VM so that I could resize the virtualbox window and have Ubuntu resize itself as well. Now I could actually read popup error messages in full.
  • I changed the desktop background to something that gave me the idea that this is an untrusted machine where I need to be very careful of what I type. I went for bright red.

Install smart card software into it

  • apt-get install pcscd pcsc-tools opensc
  • In virtualbox, I went to Devices/USB devices and enabled the smart card reader in the virtual machine.
  • I ran pcsc_scan to see if it could see my smart card.
  • I ran Firefox, went to preferences, advanced, security devices, load. Module name is "CRS PKCS#11", module path is /usr/lib/opensc-pkcs11.so
  • I went to https://fattura-pa.infocamere.it/fpmi/service and I was able to log in. To log in, I had to type the PIN 4 times into popups that offered little explanations about what was going on, enjoying cold shivers because the smart card would lock itself at the 3rd failed attempt.
  • Congratulations to myself! I thought that all was set, but unfortunately, at this stage, I was not able to do anything else except log into the website.

Descent into darkness

Set up things for fattura-pa

  • I got the PDF with the setup instructions from here. Get it too, for a reference, a laugh, and in case you do not believe the instructions below.
  • I went to https://www.firma.infocert.it/installazione/certificato.php, and saved the two certificates.
  • Firefox, preferences, advanced, show certificates, I imported both CA certificates, trusted for everything, all my base are belong to them.
  • apt-get install icedtea-plugin
  • I went to https://fattura-pa.infocamere.it/fpmi/service and tried to sign. I could not: I got an error about invalid UTF8 for something or other on Firefox's standard error. Firefox froze and had to be killed.

Set up things for signing locally with dike

  • I removed icedtea so that I could use the site without firefox crashing.
  • I installed DiKe For Ubuntu 12.04 32bit
  • I ran dikeutil to see if it could talk to my smart card
  • When signing with the website, I chose the manual signing options and downloaded the zip file with the xml to be signed.
  • I got a zip file, unzipped it.
  • I loaded the xml into dike.
  • I signed it with dike.
  • I got this error message: "nessun certificato di firma presente sul dispositivo di firma" (no signing certificate present on the signing device), and then this one: "Impossibile recuperare il certificato dal dispositivo di firma" (impossible to retrieve the certificate from the signing device). No luck.

Set up things for signing locally with ArubaSign

  • I went to https://www.pec.it/Download.aspx
  • I downloaded ArubaSign for Linux 32 bit.
  • Oh! People say that it only works with Oracle's version of Java.
  • sudo add-apt-repository ppa:webupd8team/java
  • apt-get update
  • apt-get install oracle-java7-installer
  • During the installation process I had to agree to also sell my soul to Oracle.
  • tar axf ArubaSign*.tar*
  • cd ArubaSign-*/apps/dist
  • java -jar ArubaSign.jar
  • I let it download its own updates. Another time I did not. It does not seem to matter: I get asked that question every time I start it anyway.
  • I enjoyed the fancy brushed metal theme, and had an interesting time navigating an interface where every label on every icon or input field was truncated.
  • I downloaded https://www.pec.it/documenti/Manuale_ArubaSign2_firma%20Remota_V03_02_07_2012.pdf to get screenshots of that interface with all the labels intact
  • I signed the xml that I got from the website. I got told that I needed to really view carefully what I was signing, because the signature would be legally binding
  • I enjoyed carefully reading a legally binding, raw XML file.
  • I told it to go ahead, and there was now a .p7m file ready for me. I rejoiced, as now I might, just might actually get paid for my work.

Try fattura-pa again

Maybe fattura-pa would work with Oracle's Java plugin?

  • I went to https://fattura-pa.infocamere.it/fpmi/service
  • I got asked to verify java at www.java.com. I did it.
  • I told Firefox to enable java.
  • Suddenly, and while I was still in java.com's tab, I got prompted about allowing Infocert's applet to run: I allowed it to run.
  • I also got prompted several times, still while the current tab was not even Infocert's tab, about running components that could compromise the security of my system. I allowed and unblocked all of them.
  • I entered my PIN.
  • Congratulations! Now I have two ways of generating legally binding signatures with government issued smart cards!

Aftermath

I shut down that virtual machine and I'm making sure I never run anything important on it. Except, of course, generating legally binding signatures as required by the Italian government.

What could possibly go wrong?

Posted Thu Jul 2 23:48:36 2015 Tags:

debtags rewritten in python3

In my long quest towards closing #540218, I have uploaded a new libept to experimental. Then I tried to build debtags on a sid+experimental chroot and the result runs but has libc's free() print existential warnings about whatevers.

At a quick glance, there are now things around like a new libapt, gcc 5 with ABI changes, and who knows what else. I estimated how much time it would take me to debug something like that, and used that time to rewrite debtags in python3 instead. It took 8 hours: 5 of pleasant programming, plus the usual tax of another 3 of utter frustration packaging the results. I reckon I still came out ahead of the risk of spending an unspecified number of hours of pure frustration.

So from now on debtags is going to be a pure python3 package, with dependencies on only python3-apt and python3-debian. 700 lines of python instead of several C++ files built on 4 layers of libraries. Hopefully, this is the last of the big headaches I get from hacking on this package. Also, one less package using libept.

Posted Sun Jun 21 18:04:39 2015 Tags:

Work around Google evil .ics feeds

I've happily been using 2015/akonadi-install for my calendars, and yesterday I added an .ics feed export from Google, as a URL file source. It is a link in the form: https://www.google.com/calendar/ical/person%40gmail.com/private-12341234123412341234123412341234/basic.ics

After doing that, I noticed that the fan in my laptop was on more often than usual, and I noticed that akonadi-server and postgres were running very often, and doing quite a lot of processing.

The evil

I investigated and realised that Google seems to be doing everything they can to make their ical feeds hard to sync against efficiently. This is the list of what I have observed Gmail doing to an unchanged ical feed:

  • Date: headers in HTTP replies are always now
  • If-Modified-Since: is not supported
  • DTSTAMP of each element is always now
  • VTIMEZONE entries appear in random order
  • ORGANIZER CN entries randomly change between full name and plus.google.com user ID
  • ATTENDEE entries randomly change between having a CN or not having it
  • TRIGGER entries change spontaneously
  • CREATED entries change spontaneously

This causes akonadi to download and reprocess the entire ical feed at every single poll, and I can't blame akonadi for doing it. In fact, Google is saying that there is a feed with several years' worth of daily appointments that all keep being changed all the time.

The work-around

As a work-around, I have configured the akonadi source to point at a local file on disk, and I have written a script to update the file only if the .ics feed has actually changed.

Have a look at the script: I consider it far from trivial, since it needs to do a partial parsing of the .ics feed to throw away all the nondeterminism that Google pollutes it with.
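
The core idea, in a much simplified sketch of mine (the real script does more, for example it also needs to deal with the reordered VTIMEZONE entries; the feed URL and destination path here are placeholders):

    #!/usr/bin/env python3
    # Simplified sketch: fetch the feed, strip the properties observed
    # above to change at every download, and rewrite the local copy only
    # if the normalised contents differ.
    import urllib.request

    FEED = "https://www.google.com/calendar/ical/example/basic.ics"  # placeholder
    DEST = "/home/enrico/tmp/calendars/google.ics"                   # placeholder

    VOLATILE = (b"DTSTAMP", b"CREATED", b"LAST-MODIFIED", b"TRIGGER")

    def normalised(data):
        return b"\n".join(line for line in data.splitlines()
                          if not line.startswith(VOLATILE))

    new = urllib.request.urlopen(FEED).read()
    try:
        with open(DEST, "rb") as fd:
            old = fd.read()
    except FileNotFoundError:
        old = b""

    if normalised(new) != normalised(old):
        with open(DEST, "wb") as fd:
            fd.write(new)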

The setup

The script needs to be run periodically, and I used it as an opportunity to try systemd user timers:

    $ cat ~/.config/systemd/user/update-ical-feeds.timer
    [Unit]
    Description=Updates ical feeds every hour
    # Only run when on AC power
    ConditionACPower=yes

    [Timer]
    # Run every hour
    OnActiveSec=1h
    # Run a minute after boot
    OnBootSec=1m
    Unit=update-ical-feeds.service

    $ cat ~/.config/systemd/user/update-ical-feeds.service
    [Unit]
    Description=Update ICal feeds

    [Service]
    # Use oneshot to prevent two updates being run in case the previous one
    # runs for more time than the timer interval
    Type=oneshot
    ExecStart=/home/enrico/tmp/calendars/update

    $ systemctl --user start update-ical-feeds.timer
    $ systemctl --user list-timers
    NEXT                         LEFT       LAST                         PASSED UNIT                    ACTIVATES
    Wed 2015-03-25 22:19:54 CET  59min left Wed 2015-03-25 21:19:54 CET  2s ago update-ical-feeds.timer update-ical-feeds.service

    1 timers listed.
    Pass --all to see loaded but inactive timers, too.

To reload the configuration after editing: systemctl --user daemon-reload.

Further investigation

I wonder if ConditionACPower needs to be in the .timer or in the .service, since there is a [Unit] section in both. Update: I have been told it can be in the .timer.

I also wonder if there is a way to have the timer trigger only when online. There is a network-online.target and I do not know if it is applicable. I also do not know how to ask systemd if all the preconditions are currently met for a .service/.timer to run.

Finally, I especially wonder if it is worth hoping that Google will ever make their .ics feeds play nicely with calendar clients.

Posted Wed Mar 25 21:50:21 2015 Tags:

Screen-dependent window geometry

I have an external monitor for my laptop on my work desk at home, and when I work I keep a few windows like IRC on the laptop screen and everything else on the external monitor. Then maybe I move to the sofa to watch a movie, or to the kitchen to cook, and I unplug from the external monitor to bring the laptop with me. Then maybe I go back to the external monitor to resume working.

The result of this (with openbox) is that when I disconnect the external monitor all the windows on my external monitor get moved to the right edge of the laptop monitor, and when I reconnect the external monitor I need to rearrange them all again.

I would like to implement something that does the following:

  1. it keeps a dictionary mapping screen geometry to window geometries
  2. every time a window geometry and virtual desktop number changes, it gets recorded in the hash for the current screen geometry
  3. every time the screen geometry changes, for each window, if a window geometry and virtual desktop number were saved for it under the new screen geometry, they get restored; a rough sketch follows the list.
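
Something along these lines, as a rough untested sketch of mine on top of wmctrl (with deliberately simplistic output parsing):

#!/usr/bin/env python3
# Rough sketch of the idea above: remember window geometries per screen
# geometry, and restore them when that screen geometry reappears.
# Untested, and the wmctrl output parsing is simplistic.
import subprocess
from collections import defaultdict

def screen_geometry():
    # The first line of "wmctrl -d" contains the screen geometry
    out = subprocess.check_output(["wmctrl", "-d"], universal_newlines=True)
    return out.splitlines()[0].split()[3]  # e.g. "3286x1080"

def window_geometries():
    # "wmctrl -lG" lists: window id, desktop, x, y, width, height, ...
    res = {}
    out = subprocess.check_output(["wmctrl", "-lG"], universal_newlines=True)
    for line in out.splitlines():
        f = line.split(None, 7)
        res[f[0]] = tuple(int(x) for x in f[1:6])
    return res

# geometry snapshots indexed by screen geometry
saved = defaultdict(dict)

def snapshot():
    saved[screen_geometry()].update(window_geometries())

def restore():
    for win, (desk, x, y, w, h) in saved.get(screen_geometry(), {}).items():
        subprocess.call(["wmctrl", "-i", "-r", win, "-t", str(desk)])
        subprocess.call(["wmctrl", "-i", "-r", win,
                         "-e", "0,%d,%d,%d,%d" % (x, y, w, h)])

The missing piece is reacting to geometry and window changes, for example by polling or by listening to RANDR events.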

Questions:

  1. Is anything like this already implemented? Where?
  2. If not, what would be a convenient way to implement it myself, ideally in a wmctrl-like way that does not depend on a specific WM?

Note: I am not interested in switching to a different WM unless it is openbox with this feature implemented in it.

Posted Mon Mar 16 21:29:36 2015 Tags:

Reuse passwords in /etc/crypttab

Today's scenario was a laptop with an SSD and a spinning disk, and the goal was to deploy a Debian system on it so that as many things as possible are encrypted.

My preferred option for this is to set up one big LUKS partition on each disk and put an LVM2 Physical Volume inside each partition. At boot, the two LUKS partitions are opened, their contents are assembled into a Volume Group, and I can have everything I want inside.

This has advantages:

  • if any of the disks breaks, the other can still be unlocked, and it should still be possible to access the LVs inside it
  • once boot has happened, any layout of LVs can be used with no further worries about encryption
  • I can use pvmove to move partitions at will between the SSD and the spinning disk, which means I can at any time renegotiate the tradeoffs between speed and disk space.

However, by default this causes cryptsetup to ask for the password once for each LUKS partition, even if the passwords are the same.

Searching for ways to mitigate this gave me unsatisfactory results, like:

  • decrypt the first disk, and use a file inside it as the keyfile to decrypt the second one. But in this case if the first disk breaks, I also lose the data in the second disk.
  • reuse the LUKS session key for the first disk in the second one. Same problem as before.
  • put a detached LUKS header in /boot and use it for both disks, then make regular backups of /boot. It is an interesting option that I have not tried.

The solution that I found was something that did not show up in any of my search results, so I'm documenting it here:

    # <target name> <source device>   <key file>   <options>
    ssd             /dev/sda2         main         luks,initramfs,discard,keyscript=decrypt_keyctl
    spin            /dev/sdb1         main         luks,initramfs,keyscript=decrypt_keyctl

This caches each password for 60 seconds, so that it can be reused to unlock other devices that use it. The documentation can be found at the beginning of /lib/cryptsetup/scripts/decrypt_keyctl, beware of the leopard™.

main is an arbitrary tag used to specify which devices use the same password.

This is also useful to work easily with multiple LUKS-on-LV setups:

    # <target name> <source device>          <key file>  <options>
    home            /dev/mapper/myvg-chome   main        luks,discard,keyscript=decrypt_keyctl
    backup          /dev/mapper/myvg-cbackup main        luks,discard,keyscript=decrypt_keyctl
    swap            /dev/mapper/myvg-cswap   main        swap,discard,keyscript=decrypt_keyctl
Posted Thu Mar 12 22:45:57 2015 Tags:

Another day in the life of a poor developer

try:
    # After Python 3.3
    from collections.abc import Iterable
except ImportError:
    # This has changed in Python 3.3 (why, oh why?), reinforcing the idea that
    # the best Python version ever is still 2.7, simply because upstream has
    # promised that they won't touch it (and break it) for at least 5 more
    # years.
    from collections import Iterable

import shlex
if hasattr(shlex, "quote"):
    # New in version 3.3.
    shell_quote = shlex.quote
else:
    # Available since python 1.6 but deprecated since version 2.7: Prior to Python
    # 2.7, this function was not publicly documented. It is finally exposed
    # publicly in Python 3.3 as the quote function in the shlex module.
    #
    # Except everyone was using it, because it was the only way provided by the
    # python standard library to make a string safe for shell use
    #
    # See http://stackoverflow.com/questions/35817/how-to-escape-os-system-calls-in-python
    import pipes
    shell_quote = pipes.quote

import shutil
if hasattr(shutil, "which"):
    # New in version 3.3.
    shell_which = shutil.which
else:
    # Available since python 1.6:
    # http://stackoverflow.com/questions/377017/test-if-executable-exists-in-python
    from distutils.spawn import find_executable
    shell_which = find_executable
Posted Fri Feb 27 12:02:33 2015 Tags:

Akonadi client example

After many failed attempts I have managed to build a C++ akonadi client. It has felt like one of the most frustrating programming experiences of my whole life, so I'm sharing the results hoping to spare others from all the suffering.

First things first: akonadi client libraries are not in libakonadi-dev but in kdepimlibs5-dev, even though kdepimlibs5-dev does not show up in apt-cache search akonadi.

Then, kdepimlibs is built with Qt4. If your application uses Qt5 (mine was) you need to port it back to Qt4 if you want to talk to Akonadi.

Then, kdepimlibs does not seem to support qmake and does not ship pkg-config .pc files, so if you want to use kdepimlibs your build system needs to be cmake. I ported my code from qmake to cmake, and now qtcreator wants me to run cmake by hand every time I change the CMakeLists.txt file, and it has stopped allowing me to add, rename or delete sources.

Finally, most of the code / build system snippets found on the internet seem flawed in one way or another, because the build toolchain of Qt/KDE applications has undergone several redesigns over time, and the network is littered with examples from different eras. The way to obtain template code to start a Qt/KDE project is to use kapptemplate. I have found no getting-started tutorial on the internet that said "do not just copy the snippets from here, run kapptemplate instead so you get them up to date".

kapptemplate supports building an "Akonadi Resource" and an "Akonadi Serializer", but it does not support generating template code for an akonadi client. That left me with the feeling that I was dealing with some software that wants to be developed but does not want to be used.

Anyway, now an example of how to interrogate Akonadi exists as is on the internet. I hope that all the tears of blood that I cried this morning have not been cried in vain.

Posted Mon Feb 23 15:44:01 2015 Tags:

The wonders of missing documentation

Update: I have managed to build an example Akonadi client application.

I'm new here, and I want to make a simple C++ GUI app that pops up a QCalendarWidget showing the appointments in my local Akonadi.

I open qtcreator, create a new app, hack away for a while, then of course I get undefined references for all Akonadi symbols, since I didn't tell the build system that I'm building with akonadi. Ok.

How do I tell the build system that I'm building with akonadi? After 20 minutes of frantic looking around the internet, I still have no idea.

There is a package called libakonadi-dev which does not seem to have anything to do with this. That page mentions everything about making applications with Akonadi except how to build them.

There is a package called kdepimlibs5-dev which looks promising: it has no .a files but it does have headers and cmake files. However, qtcreator is only integrated with qmake, and I would really like the handholding of an IDE at this stage.

I put something together naively doing just what looked right, and I managed to get an application that segfaults before main() is even called:

/*
 * Copyright © 2015 Enrico Zini <enrico@enricozini.org>
 *
 * This work is free. You can redistribute it and/or modify it under the
 * terms of the Do What The Fuck You Want To Public License, Version 2,
 * as published by Sam Hocevar. See the COPYING file for more details.
 */
#include <QDebug>

int main(int argc, char *argv[])
{
    qDebug() << "BEGIN";
    return 0;
}
QT       += core gui widgets
CONFIG += c++11

TARGET = wtf
TEMPLATE = app

LIBS += -lkdecore -lakonadi-kde

SOURCES += wtf.cpp

I didn't achieve what I wanted, but I feel like I achieved something magical and beautiful after all.

I shall now perform some haruspicy on those obscure cmake files to see if I can figure something out. But seriously, people?

Posted Mon Feb 23 11:36:18 2015 Tags:

Setting up Akonadi

Now that I have a CalDAV server that syncs with my phone I would like to use it from my desktop.

It looks like akonadi is able to sync with CalDAV servers, so I'm giving it a try.

First things first: let's give a meaning to the arbitrary name of this thing. Wikipedia says it is the oracle goddess of justice in Ghana. That still does not hint at all at personal information servers, but it seems quite nice. Ok. I gave up on software having purpose-related names ages ago.

# apt-get install akonadi-server akonadi-backend-postgresql

Akonadi wants a SQL database as a backend. By default it uses MySQL, but I had enough of MySQL ages ago.

I tried SQLite but the performance with it is terrible. Terrible as in, it takes 2 minutes between adding a calendar entry and having it show up in the calendar. I'm fascinated by how Akonadi manages to use SQLite so badly, but since I currently just want to get a job done, next in line is PostgreSQL:

# su - postgres
$ createuser enrico
$ psql postgres
postgres=# alter user enrico createdb;

Then as enrico:

$ createdb akonadi-enrico
$ cat <<EOT > ~/.config/akonadi/akonadiserverrc
[%General]
Driver=QPSQL

[QPSQL]
Name=akonadi-enrico
StartServer=false
Host=
Options=
ServerPath=
InitDbPath=
EOT

I can now use kontact to connect Akonadi to my CalDAV server and it works nicely, both with calendar and with addressbook entries.

KDE has at least two clients for Akonadi: Kontact, which is a kitchen sink application similar to Evolution, and KOrganizer, which is just the calendar and scheduling component of Kontact.

Both work decently, and KOrganizer has a pretty decent startup time. I now have a usable desktop PIM application that is synced with my phone. W00T!

Next step is to port my swift little calendar display tool to use Akonadi as a back-end.

Posted Tue Feb 17 15:34:55 2015 Tags:

seat-inspect

Four months ago I wrote this somewhere:

Seeing a DD saying "this new dbus stuff scares me" would make most debian users scared. Seeing a DD who has an idea of what is going on, and who can explain it, would be an interesting and exciting experience.

So, let's be exemplary, competent and patient. Or at least, competent. Some may like or not like the changes, but do we all understand what is going on? Will we all be able to support our friends and customers running jessie?

I confess that although I understand the need for it, I don't feel competent enough to support systemd-based machines right now.

So, are we maybe in need of help, cheat sheets, arsenals of one-liners, diagnostic tools?

Maybe a round of posts on -planet like "one debian package a day" but with new features that jessie will have, and how to understand them and take advantage of them?

That was four months ago. In the meantime, I did some work, and it got better for me.

Yesterday, however, I've seen an experienced Linux person frustrated because the shutdown function of the desktop was doing nothing whatsoever. Today I found John Goerzen's post on planet.

I felt like some more diagnostic tools were needed, so I spent the day making seat-inspect.

seat-inspect tries to make the status of the login/seat system visible, to help with understanding and troubleshooting.

The intent of running the code is to have an overview of the system status, both to see what the new facilities are about, and to figure out if there is something out of place.

The intent of reading the code is to have an idea of how to use these facilities: the code has been written to be straightforward and is annotated with relevant bits from the logind API documentation.
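
As a taste of that API, here is a minimal sketch, not taken from seat-inspect itself, of the kind of logind query involved, using python-dbus:

# Minimal sketch of a logind query via D-Bus; assumes python-dbus and a
# running systemd-logind. Not code from seat-inspect itself.
import dbus

bus = dbus.SystemBus()
logind = bus.get_object("org.freedesktop.login1", "/org/freedesktop/login1")
manager = dbus.Interface(logind, "org.freedesktop.login1.Manager")

# ListSeats returns (seat_id, object_path) pairs
for seat_id, path in manager.ListSeats():
    print(seat_id, path)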

seat-inspect is not a finished tool, but a starting point. I put it on github hoping that people will fork it and add their own extra sanity checks and warnings, so that it can grow into a standard thing to run if a system acts weird.

As it is now, it should be able to issue warnings if some bits are missing for network-manager or shutdown functions to work correctly. I haven't really tested that, though, because I don't have a system at hand where they are currently not working fine.

Another nice thing of it is that when running seat-inspect -v you get a dump of what logind/consolekit think about your system. I found it an interesting way to explore the new functionalities that we recently grew. The same can be done, and in more details, with loginctl calls, but I lacked a summary.

After writing this I feel a bit more competent, probably enough to sit at somebody's computer and poke into loginctl bits. I highly recommend the experience.

Posted Tue Feb 10 18:06:43 2015 Tags:
Posted Sat Jun 6 00:57:39 2009

Python-related posts.

Custom function decorators with TurboGears 2

I am exposing some library functions using a TurboGears2 controller (see web-api-with-turbogears2). It turns out that some functions return a dict, some a list, some a string, and TurboGears 2 only allows JSON serialisation for dicts.

A simple work-around for this is to wrap the function result into a dict, something like this:

@expose("json")
@validate(validator_dispatcher, error_handler=api_validation_error)
def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
    # Call API
    res = self.engine.list_colours(filter, productID, maxResults)

    # Return result
    return dict(r=res)

It would be nice, however, to have an @webapi() decorator that automatically wraps the function result with the dict:

def webapi(func):
    def dict_wrap(*args, **kw):
        return dict(r=func(*args, **kw))
    return dict_wrap

# ...in the controller...

    @expose("json")
    @validate(validator_dispatcher, error_handler=api_validation_error)
    @webapi
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)

        # Return result
        return res

This works, as long as @webapi appears last in the list of decorators. This is because if it appears last it will be the first to wrap the function, and so it will not interfere with the tg.decorators machinery.
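
A tiny generic illustration of that ordering, nothing TurboGears-specific: the decorator written closest to the function is applied first, so it ends up innermost:

# Tiny generic illustration of decorator ordering
def deco(tag):
    def wrap(func):
        def wrapper(*args, **kw):
            print("enter", tag)
            return func(*args, **kw)
        return wrapper
    return wrap

@deco("outer")
@deco("inner")   # listed last: applied first, so it is the innermost wrapper
def f():
    print("body")

f()
# prints: enter outer / enter inner / body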

Would it be possible to create a decorator that can be put anywhere among the decorator list? Yes, it is possible but tricky, and it gives me the feeling that it may break in any future version of TurboGears:

class webapi(object):
    def __call__(self, func):
        def dict_wrap(*args, **kw):
            return dict(r=func(*args, **kw))
        # Migrate the decoration attribute to our new function
        if hasattr(func, 'decoration'):
            dict_wrap.decoration = func.decoration
            dict_wrap.decoration.controller = dict_wrap
            delattr(func, 'decoration')
        return dict_wrap

# ...in the controller...

    @expose("json")
    @validate(validator_dispatcher, error_handler=api_validation_error)
    @webapi()
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)

        # Return result
        return res

As a convenience, TurboGears 2 offers, in the decorators module, a way to build decorator "hooks":

class before_validate(_hook_decorator):
    '''A list of callables to be run before validation is performed'''
    hook_name = 'before_validate'

class before_call(_hook_decorator):
    '''A list of callables to be run before the controller method is called'''
    hook_name = 'before_call'

class before_render(_hook_decorator):
    '''A list of callables to be run before the template is rendered'''
    hook_name = 'before_render'

class after_render(_hook_decorator):
    '''A list of callables to be run after the template is rendered.

    Will be run before it is returned up the WSGI stack'''

    hook_name = 'after_render'

The way these are invoked can be found in the _perform_call function in tg/controllers.py.

To show an example use of those hooks, let's add some polygen wisdom to every data structure we return:

class wisdom(decorators.before_render):
    def __init__(self, grammar):
        super(wisdom, self).__init__(self.add_wisdom)
        self.grammar = grammar
    def add_wisdom(self, remainder, params, output):
        from subprocess import Popen, PIPE
        output["wisdom"] = Popen(["polyrun", self.grammar], stdout=PIPE).communicate()[0]

# ...in the controller...

    @wisdom("genius")
    @expose("json")
    @validate(validator_dispatcher, error_handler=api_validation_error)
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)
    
        # Return result
        return res

These hooks cannot however be used for what I need, that is, to wrap the result inside a dict. The reason is that they are called in this way:

        controller.decoration.run_hooks(
                'before_render', remainder, params, output)

and not in this way:

        output = controller.decoration.run_hooks(
                'before_render', remainder, params, output)

So it is possible to modify the output (if it is a mutable structure) but not to exchange it with something else.
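
For example, in a hypothetical hook callable with the same signature as add_wisdom above:

def add_metadata(remainder, params, output):
    # This change is seen by TurboGears: it mutates the dict in place
    output["generated_by"] = "webapi"
    # This change is lost: it only rebinds the local name 'output'
    output = dict(r=output)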

Can we do even better? Sure we can. We can assimilate @expose and @validate inside @webapi to avoid repeating those same many decorator lines over and over again:

class webapi(object):
    def __init__(self, error_handler = None):
        self.error_handler = error_handler

    def __call__(self, func):
        def dict_wrap(*args, **kw):
            return dict(r=func(*args, **kw))
        res = expose("json")(dict_wrap)
        res = validate(validator_dispatcher, error_handler=self.error_handler)(res)
        return res

# ...in the controller...

    @expose("json")
    def api_validation_error(self, **kw):
        pylons.response.status = "400 Error"
        return dict(e="validation error on input fields", form_errors=pylons.c.form_errors)

    @webapi(error_handler=api_validation_error)
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)

        # Return result
        return res

This gets rid of @expose and @validate, and provides almost all the default values that I need. Unfortunately I could not find out how to access api_validation_error from the decorator so that I could pass it to the validator, so I am left with the inconvenience of having to pass it explicitly every time.

Posted Wed Nov 4 17:52:38 2009 Tags:

Building a web-based API with Turbogears2

I am using TurboGears2 to export a python API over the web. Every API method is wrapped by a controller method that validates the parameters and returns the results encoded in JSON.

The basic idea is this:

@expose("json")
def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
    # Call API
    res = self.engine.list_colours(filter, productID, maxResults)

    # Return result
    return res

To validate the parameters we can use forms; it's their job after all:

class ListColoursForm(TableForm):
    fields = [
            # One field per parameter
            twf.TextField("filter", help_text="Please enter the string to use as a filter"),
            twf.TextField("productID", help_text="Please enter the product ID"),
            twf.TextField("maxResults", validator=twfv.Int(min=0), default=200, size=5, help_text="Please enter the maximum number of results"),
    ]
list_colours_form=ListColoursForm()

#...

    @expose("json")
    @validate(list_colours_form, error_handler=list_colours_validation_error)
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Parameter validation is done by the form
    
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)
    
        # Return result
        return res

All straightforward so far. However, this means that we need two exposed methods for every API call: one for the API call itself and one for the error handler. For every API call we have to type the name several times, which is error prone and risks getting things mixed up.

We can however have a single error handler for all methods:

def get_method():
    '''
    The method name is the first url component after the controller name that
    does not start with 'test'
    '''
    found_controller = False
    for name in pylons.c.url.split("/"):
        if not found_controller and name == "controllername":
            found_controller = True
            continue
        if name.startswith("test"):
            continue
        if found_controller:
            return name
    return None

class ValidatorDispatcher:
    '''
    Validate using the right form according to the value of the "method" field
    '''
    def validate(self, args, state):
        method = args.get("method", None)
        # Extract the method from the URL if it is missing
        if method is None:
            method = get_method()
            args["method"] = method
        return forms[method].validate(args, state)

validator_dispatcher = ValidatorDispatcher()

This validator will try to find the method name, either as a form field or by parsing the URL. It will then use the method name to find the form to use for validation, and pass control to the validate method of that form.

We then need to add an extra "method" field to our forms, and arrange the forms inside a dictionary:

forms = {}

class ListColoursForm(TableForm):
    fields = [
            # One hidden field to have a place for the method name
            twf.HiddenField("method"),
            # One field per parameter
            twf.TextField("filter", help_text="Please enter the string to use as a filter"),
    #...

forms["list_colours"] = ListColoursForm()

And now our methods become much nicer to write:

    @expose("json")
    def api_validation_error(self, **kw):
        pylons.response.status = "400 Error"
        return dict(form_errors=pylons.c.form_errors)

    @expose("json")
    @validate(validator_dispatcher, error_handler=api_validation_error)
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Parameter validation is done by the form
    
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)
    
        # Return result
        return res

api_validation_error is interesting: it returns a proper HTTP error status, and a JSON body with the details of the error, taken straight from the form validators. It took me a while to find out that the form errors are in pylons.c.form_errors (and for reference, the form values are in pylons.c.form_values). pylons.response is a WebOb Response that we can play with.

So now our client side is able to call the API methods, and get a proper error if it calls them wrong.

But now that we have the forms ready, it doesn't take much to display them in web pages as well:

def _describe(self, method):
    "Return a dict describing an API method"
    ldesc = getattr(self.engine, method).__doc__.strip()
    sdesc = ldesc.split("\n")[0]
    return dict(name=method, sdesc = sdesc, ldesc = ldesc)

@expose("myappserver.templates.myappapi")
def index(self):
    '''
    Show an index of exported API methods
    '''
    methods = dict()
    for m in forms.keys():
        methods[m] = self._describe(m)
    return dict(methods=methods)

@expose('myappserver.templates.testform')
def testform(self, method, **kw):
    '''
    Show a form with the parameters of an API method
    '''
    kw["method"] = method
    return dict(method=method, action="/myapp/test/"+method, value=kw, info=self._describe(method), form=forms[method])

@expose(content_type="text/plain")
@validate(validator_dispatcher, error_handler=testform)
def test(self, method, **kw):
    '''
    Run an API method and show its prettyprinted result
    '''
    res = getattr(self, str(method))(**kw)
    return pprint.pformat(res)

In a few lines, we have all we need: an index of the API methods (including their documentation taken from the docstrings!), and for each method a form to invoke it and a page to see the results.

Make the forms children of AjaxForm, and you can even see the results together with the form.

Posted Thu Oct 15 15:45:39 2009 Tags:

Creating pipelines with subprocess

It is possible to create process pipelines using subprocess.Popen, by just using stdout=subprocess.PIPE and stdin=otherproc.stdout.

Almost.

In a pipeline created in this way, the stdout of every process except the last is open twice: once in the parent script that created the subprocesses, and once as the standard input of the next process in the pipeline.

This is a problem because if a process closes its stdin, the previous process in the pipeline does not get SIGPIPE when trying to write to its stdout, because that pipe is still open on the caller process. If this happens, a wait on that process will hang forever: the child process waits for the parent to read its stdout, the parent process waits for the child process to exit.

The trick is to close the stdout of each process in the pipeline except the last just after creating them:

#!/usr/bin/python
# coding=utf-8

import subprocess

def pipe(*args):
    '''
    Takes as parameters several dicts, each with the same
    parameters passed to popen.

    Runs the various processes in a pipeline, connecting
    the stdout of every process except the last with the
    stdin of the next process.
    '''
    if len(args) < 2:
        raise ValueError, "pipe needs at least 2 processes"
    # Set stdout=PIPE in every subprocess except the last
    for i in args[:-1]:
        i["stdout"] = subprocess.PIPE

    # Runs all subprocesses connecting stdins and stdouts to create the
    # pipeline. Closes stdouts to avoid deadlocks.
    popens = [subprocess.Popen(**args[0])]
    for i in range(1,len(args)):
        args[i]["stdin"] = popens[i-1].stdout
        popens.append(subprocess.Popen(**args[i]))
        popens[i-1].stdout.close()

    # Returns the array of subprocesses just created
    return popens

At this point, it's nice to write a function that waits for the whole pipeline to terminate and returns an array of result codes:

def pipe_wait(popens):
    '''
    Given an array of Popen objects returned by the
    pipe method, wait for all processes to terminate
    and return the array with their return values.
    '''
    results = [0] * len(popens)
    while popens:
        last = popens.pop(-1)
        results[len(popens)] = last.wait()
    return results

And, lo and behold, we can now easily run a pipeline and get the return codes of every single process in it:

process1 = dict(args='sleep 1; grep line2 testfile', shell=True)
process2 = dict(args='awk \'{print $3}\'', shell=True)
process3 = dict(args='true', shell=True)
popens = pipe(process1, process2, process3)
result = pipe_wait(popens)
print result

Update: Colin Watson suggests an improvement to compensate for Python's nonstandard SIGPIPE handling.
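
I have not reproduced the details of his suggestion here, but the usual fix is along these lines: restore the default SIGPIPE behaviour in the children, since they inherit Python's own handler (a sketch):

import signal
import subprocess

def restore_sigpipe():
    # Python replaces the default SIGPIPE handler, and children inherit
    # that; reset it so pipeline members die on SIGPIPE as usual
    signal.signal(signal.SIGPIPE, signal.SIG_DFL)

proc = subprocess.Popen("grep line2 testfile", shell=True,
                        stdout=subprocess.PIPE, preexec_fn=restore_sigpipe)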

Colin Watson has a similar library for C.

Posted Wed Jul 1 09:08:06 2009 Tags:

Passing values to turbogears widgets at display time (the general case)

Last time I dug this up I was not clear enough in documenting my findings, so I had to find them again. Here is the second attempt.

In Turbogears, in order to pass parameters to arbitrary widgets in a compound widget, the syntax is:

form.display(PARAMNAME=dict(WIDGETNAME=VALUE))
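
For instance, to pass an option list to a (hypothetical) widget called colour inside the form:

form.display(options=dict(colour=[(1, "Red"), (2, "Blue")]))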

And if you have more complex nested widgets and would like to know what goes on, this monkey patch is good for inspecting the params lookup functions:

import turbogears.widgets.forms
old_rpbp = turbogears.widgets.forms.retrieve_params_by_path
def inspect_rpbp(params, path):
    print "RPBP", repr(params), repr(path)
    res = old_rpbp(params, path)
    print "RPBP RES", res
    return res
turbogears.widgets.forms.retrieve_params_by_path = inspect_rpbp

The code for the lookup itself is, as the name suggests, in the retrieve_params_by_path function in the file widgets/forms.py in the Turbogears source code.

Posted Sat Jun 6 00:57:39 2009 Tags:

Python scoping

How do you create a list of similar functions in Python?

As a simple example, let's say we want to create an array of 10 elements like this:

a[0] = lambda x: x
a[1] = lambda x: x+1
a[2] = lambda x: x+2
...
a[9] = lambda x: x+9

Simple:

>>> a = []
>>> for i in range(0,10): a.append(lambda x: x+i)
...

...but wrong:

>>> a[0](1)
10

What happened here? In Python, that lambda x: x+i uses the value that i has when the function is invoked, not the value it had when the lambda was defined.

This is the trick to get it right:

>>> a = []
>>> for i in range(0,10): a.append(lambda x, i=i: x + i)
...
>>> a[0](1)
1

What happens here is explained in the section "A Jedi Mind Trick" of the Instant Python article: i=i assigns as the default value of the parameter i the current value of i.

Strangely enough the same article has "A Note About Python 2.1 and Nested Scopes" which seems to imply that from Python 2.2 the scoping has changed to "work as it should". I don't understand: the examples above are run on Python 2.4.4. (Presumably the nested scopes change is about which names are visible inside the lambda, not about when their values are looked up: a closure captures the variable itself, not its value, so the late binding shown above is still the expected behaviour.)

Googling for keywords related to python closure scoping only yields various sorts of complicated PEPs and an even uglier list trick:

a lot of people might not know about the trick of using a list to box variables within a closure.

Now I know about the trick, but I wish I didn't need to know :-(
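
For reference, this seems to be the trick in question: box the variable in a list, so that inner functions can update it without rebinding it (a sketch):

def make_counter():
    count = [0]              # boxed: the closure can mutate the list...
    def increment():
        count[0] += 1        # ...while rebinding 'count' itself would fail
        return count[0]
    return increment

counter = make_counter()
print counter(), counter()   # prints: 1 2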

Posted Sat Jun 6 00:57:39 2009 Tags:

TurboGears RemoteForm tip

In case your RemoteForm mysteriously behaves like a normal HTTP form, refreshing the page on submit, and the only hint that there's something wrong is this bit in Iceweasel's error console:

Errore: uncaught exception: [Exception... "Component returned failure
code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIXMLHttpRequest.open]"
nsresult: "0x80070057 (NS_ERROR_ILLEGAL_VALUE)"  location: "JS frame ::
javascript: eval(__firebugTemp__); :: anonymous :: line 1"  data: no]

the problem may simply be a missing action= attribute on the form.

I found out after:

  1. reading the TurboGears remoteform wiki: "For some reason, the RemoteForm is acting like a regular html form, serving up a new page instead of performing the replacements we're looking for. I'll update this page as soon as I figure out why this is happening."

  2. finding this page on Google and meditating for a while while staring at it. I don't speak German, but often enough I manage to solve problems after meditating over Google results in all sorts of languages unknown or unreadable to me. I will call this practice Webomancy.

Posted Sat Jun 6 00:57:39 2009 Tags:

Passing values to turbogears widgets at display time

In turbogears, I often need to pass data to widgets at display time. Sometimes it works automatically, but sometimes, in cases like passing option lists to CheckBoxLists or number of repetitions in a RepeatingFieldSet, it doesn't.

All the examples use precomputed lists or pass simple code functions. In most of my cases, I want them computed by the controller every time.

Passing a function hasn't worked, as I did not find any obvious way to have the function know about the controller.

So I need to pass things to the display() method of the widgets, but I could not work out how to pass the option list and default list for a CheckBoxList that is part of a WidgetsList in a TableForm.

On IRC came the answer, thanks to Xentac:

you should be able to...
    tableform.display(options=dict(checkboxname=[optionlist]))

And yes, it works. I can pass the default value as one of the normal form values:

    tableform.display(values=dict(checkboxname=[values]), options=dict(checkboxname=[optionlist]))
Posted Sat Jun 6 00:57:39 2009 Tags:

File downloads with TurboGears

In TurboGears, I had to implement a file download method, but the file required access controls so it was put in a directory not exported by Apache.

In #turbogears I've been pointed at: http://cherrypy.org/wiki/FileDownload and this is everything put together:

from cherrypy.lib.cptools import serveFile
# In cherrypy 3 it should be:
#from cherrypy.lib.static import serve_file

@expose()
def get(self, *args, **kw):
    """Access the file pointed by the given path"""
    pathname = check_auth_and_compute_pathname()
    return serveFile(pathname)

Then I needed to export some CSV:

import csv
import StringIO
import cherrypy

@expose()
def getcsv(self, *args, **kw):
    """Get the data in CSV format"""
    rows = compute_data_rows()
    headers = compute_headers(rows)
    filename = compute_file_name()

    cherrypy.response.headers['Content-Type'] = "application/x-download"
    cherrypy.response.headers['Content-Disposition'] = 'attachment; filename="'+filename+'"'

    csvdata = StringIO.StringIO()
    writer = csv.writer(csvdata)
    writer.writerow(headers)
    writer.writerows(rows)

    return csvdata.getvalue()

In my case it's not an issue, as I can only compute the headers after I have computed all the data, but I still have to find out how to serve the CSV file while I'm generating it, instead of accumulating it into one big string and returning that.
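
I suspect the answer is to return a generator and let the server stream it. A sketch, assuming plain CherryPy 3 and its response.stream option (untested, and I have not checked how it interacts with TurboGears; compute_data_rows is the same placeholder as above):

import csv
import StringIO
import cherrypy

class CsvController(object):
    # Without this, CherryPy collects the whole body before sending it
    _cp_config = {'response.stream': True}

    @cherrypy.expose
    def getcsv(self):
        cherrypy.response.headers['Content-Type'] = "application/x-download"
        def generate():
            for row in compute_data_rows():     # rows computed lazily
                buf = StringIO.StringIO()
                csv.writer(buf).writerow(row)
                yield buf.getvalue()
        return generate()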

Posted Sat Jun 6 00:57:39 2009 Tags:

Turbogears quirks when testing controllers that use SingleSelectField

Suppose you have a User that can be a member of a Company. In SQLObject you model it somehow like this:

    class Company(SQLObject):
        name = UnicodeCol(length=16, alternateID=True, alternateMethodName="by_name")
        display_name = UnicodeCol(length=255)

    class User(InheritableSQLObject):
        company = ForeignKey("Company", notNull=False, cascade='null')

Then you want to make a form that allows choosing the company of a user:

def companies():
    return [ [ -1, 'None' ] ] + [ [c.id, c.display_name] for c in Company.select() ]

class NewUserFields(WidgetsList):
    """Fields for editing general settings"""
    user_name = TextField(label="User name")
    companyID = SingleSelectField(label="Company", options=companies)

Ok. Now you want to run tests:

  1. nosetests imports the controller to see if there's any initialisation code.
  2. The NewUserFields class is created.
  3. The SingleSelectField is created.
  4. The SingleSelectField constructor tries to guess the validator and peeks at the first option.
  5. This calls companies.
  6. companies accesses the database.
  7. The testing database has not yet been created because nosetests imported the module before giving the test code a chance to setup the test database.
  8. Bang.

The solution is to add an explicit validator to disable this guessing code that is a source of so many troubles:

class NewUserFields(WidgetsList):
    """Fields for editing general settings"""
    user_name = TextField(label="User name")
    companyID = SingleSelectField(label="Company", options=companies, validator=v.Int(not_empty=True))
Posted Sat Jun 6 00:57:39 2009 Tags:

Turbogears form quirk

I had a great idea:

@validate(model_form)
@error_handler()
@expose(template='kid:myproject.templates.new')
def new(self, id, tg_errors=None, **kw):
    """Create new records in model"""
    if tg_errors:
        # Ask until there is still something missing
        return dict(record = defaults, form = model_form)
    else:
        # We have everything: save it
        i = Item(**kw)
        flash("Item was successfully created.")
        raise redirect("../show/%d" % i.id)

It was perfect: one simple method, simple error handling, nice helpful messages all around. Except, check boxes and select fields would not get the default values while all other fields would.

After two hours searching and cursing and tracing things into widget code, I found this bit in InputWidget.adjust_value:

# there are some input fields that when nothing is checked/selected
# instead of sending a nice name="" are totally missing from
# input_values, this little workaround let's us manage them nicely
# without interfering with other types of fields, we need this to
# keep track of their empty status otherwise if the form is going to be
# redisplayed for some errors they end up to use their defaults values
# instead of being empty since FE doesn't validate a failing Schema.
# posterity note: this is also why we need if_missing=None in
# validators.Schema, see ticket #696.

So, what is happening here is that since check boxes and option fields don't have a nice behaviour when unselected, turbogears has to work around it. So in order to detect the difference between "I selected 'None'" and "I didn't select anything", it reasons that if the input has been validated, then the user has made some selections, so it defaults to "The user selected 'None'". If the input has not been validated, then we're showing the form for the first time, then a missing value means "Use the default provided".

Since I was doing the validation all the time, this meant that Checkboxes and Select fields would never use the default values.

Hence, if you use those fields then you necessarily need two different controller methods, one to present the form and one to save it:

@expose(template='kid:myproject.templates.new')
def new(self, id, **kw):
    """Create new records in model"""
    return dict(record = defaults(), form = model_form)

@validate(model_form)
@error_handler(new)
@expose()
def savenew(self, id, **kw):
    """Create new records in model"""
    i = Item(**kw)
    flash("Item was successfully created.")
    raise redirect("../show/%d"%i.id)

If someone else stumbles on the same problem, I hope they'll find this post and they won't have to spend another two awful hours tracking it down again.

Posted Sat Jun 6 00:57:39 2009 Tags:

Pages related to my visit in Addis Ababa for a Linux training course.

Third day in Addis

Believe it or not, a network that fails often is the best thing to have when you are teaching network troubleshooting.

Various tools useful for networking:

  • ifconfig - configure a network interface
  • dnsmasq - Simple DNS and DHCP server
  • host - DNS lookup utility
  • route - show / manipulate the IP routing table
  • arping - send ARP REQUEST to a neighbour host
  • mii-tool - view, manipulate media-independent interface status (IOW, see if the cable works)
  • nmap - Network exploration tool and security / port scanner

    Examples:

     # Look at what machines are active in the local network:
     nmap -sP 10.5.15.0/24
    
     # Look at what ports are open in a machine:
     nmap 10.5.15.26
    
  • tcpdump - dump traffic on a network

    It can be used to see if there is traffic, and to detect traffic that shouldn't be there.

Useful tip:

    # Convert a unix timestamp to a readable date
    date -d @1152841341

What happens when you browse a web page:

  1. type the address www.google.com in the browser
  2. the browser needs the IP address of the web server:
    1. look for the DNS address in /etc/resolv.conf (/etc/resolv.conf is created automatically by the DHCP client)
    2. try all the DNS servers in /etc/resolv.conf until one gives you the IP address of www.google.com
    3. take the first address that comes from the DNS (in our case it was 64.233.167.104)
  3. figure out how to connect to 64.233.167.104:
    1. consult the routing table to see if it's in the local network:
      • if it's in the local network, then look for the MAC address (using ARP, the Address Resolution Protocol)
      • if it's not in the local network, then send through the gateway (again using ARP to find the MAC address of the gateway)
  4. Send out the HTTP request to the local web server or through the gateway, using the Ethernet physical protocol, and the MAC address to refer to the other machine.
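
The same sequence, minus the routing and ARP details, can be reproduced from Python (a sketch; the address will be whatever the DNS returns for you):

import socket

# Resolve the name using the DNS servers from /etc/resolv.conf
addr = socket.gethostbyname("www.google.com")

# Connect and send the HTTP request; routing and ARP happen
# behind the scenes in the kernel
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((addr, 80))
s.sendall("GET / HTTP/1.0\r\nHost: www.google.com\r\n\r\n")
print s.recv(200)
s.close()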

Troubleshooting network problems:

  1. See if the network driver works:

     With ifconfig, see if you see the HWaddr field. If you do not see it, then the Linux driver for the network card is not working. Unfortunately there's no exact way to tell that it works perfectly.

  2. See if you have an IP address with ifconfig. If you find out that you need to rerun DHCP (for example, if the network cable was disconnected when the system started), then you can do it either by deactivating/reactivating the Ethernet interface using System/Administration/Networking or, on a terminal, running:

    # ifdown eth0
    # ifup eth0
    

    If you don't get an IP, try to see if the DHCP server is reachable by running:

    $ arping -D [address of DHCP server]
    
  3. See if the local physical network works:

     • With sudo mii-tool, see if the cable link is ok. If it's not, then it's a problem in the cable or the plugs, or simply the device at the other end of the cable is turned off.

     • Try arping or ping -n on a machine in the local network (like the gateway) to see if the local network works.

  4. See if the DNS works:

     • Find out the DNS address:

       cat /etc/resolv.conf
    
     • If it's local, arping it.

     • If it's not local, ping -n it.

     • Try to resolve a famous name using that DNS:

       $ host [name] [IP address of the DNS]
    
     • Try to resolve the name of the machine you're trying to connect to. If you can resolve a famous name but not the name you need, then it's likely a problem with their DNS.

  5. If you use a proxy, see if the proxy is reachable: check if the proxy name resolves to an IP, if you can ping it, if you can telnet to the proxy address and port:

    $ telnet [proxy address] [proxy port]
    

    you quit telnet with ^]quit.

  6. If you can connect directly to the web server, try to see if it answers:

    $ telnet [address] 80
    

    If you are connected, you can confirm that it's a web server:

    GET / HTTP/1.0 (then Enter twice)
    

    If it's a web server, it should give you something like a webpage or an HTTP redirect.

When you try to setup a service and it doesn't work:

  1. check that it's running:

    $ ps aux | grep dnsmasq
    
  2. check that it's listening on the right port:

    $ sudo netstat -lp
    
  3. check that it's listening from the outside:

    $ nmap [hostname]
    
  4. check for messages in /var/log/daemon.log or /var/log/syslog

  5. check that the configuration is correct and reload or restart the server to make sure it's running with the right configuration:

    # /etc/init.d/dnsmasq restart
    

dnsmasq:

By default: works as a DNS server that serves the data in /etc/hosts.

By default: uses /etc/resolv.conf to find addresses of other DNS to use when a name is not found in /etc/hosts.

To enable the DHCP server, uncomment:

    dhcp-range=192.168.0.50,192.168.0.150,12h

in /etc/dnsmasq.conf and set it to the range of addresses you want to serve. Pay attention to never put two DHCP servers on the same local network, or they will interfere with each other.

To test if the DHCP server is working, use dhcping (not installed by default on Ubuntu).

To communicate other information like DNS, gateway and netmask to the clients, use this piece of dnsmasq.conf:

    # For reference, the common options are:
    # subnet mask - 1
    # default router - 3
    # DNS server - 6
    # broadcast address - 28
    dhcp-option=1,255.255.255.0
    dhcp-option=3,192.168.0.1
    dhcp-option=6,192.168.0.1
    dhcp-option=28,192.168.0.255

Problems found today:

  • changing the name of the local machine in /etc/hosts breaks sudo, and without sudo it's impossible to edit the file. The only way to fix this is a reboot in recovery mode.

  • dhclient -n -w is different than dhclient -nw

Quick start examples with tar:

    # Create an archive
    tar zcvf nmap.tar.gz *.deb

    # Extract an archive
    tar zxvf nmap.tar.gz

    # Look at the contents of an archive
    tar ztvf nmap.tar.gz

Quick & dirty way to send a file between two computers without web server, e-mail, shared disk space or any other infrastructure:

    # To send
    nc -l -p 12345 -q 1 < nmap.tar.gz

    # To receive
    nc 10.5.15.123 12345 > nmap.tar.gz

    # To repeat the send command 20 times
    for i in `seq 1 20`; do nc -l -p 12345 -q 1 < nmap.tar.gz ; done

Update: Javier Fernandez-Sanguino writes:

Your "XXX day in Addis" is certainly good reading, nice to see somebody reviewing common tools from a novice point of view. Some comments:

  • Regarding your comments on how to troubleshoot network connectivity problems I just wanted to point you to the network test script I wrote and submitted to the debian-goodies package ages ago. It's available at http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=307694 and should do automatically most of the stuff you commented on your blog.

  • Your example to test hosts alive in the network using nmap -sP 10.5.15.0/24 is good. However, newer (v4) versions can do ARP ping in the local network which is much more efficient (some systems might block ICMP outbound), that's the -PR option and should be enabled (by default). See http://www.insecure.org/nmap/man/man-host-discovery.html Also, you might want to add a '-n' there so that nmap does not try to do DNS resolution of the hosts (which might take up some time if your DNS does not include local IPs)

  • tcpdump, it would be wiser to turn novice users to ethereal since it has a much better UI than tcpdump and it is able to dissect (interpret) protocols that tcpdump can't analyse.

  • you are missing arp as a tool in itself, it is useful to debug network issues since if the host is local and does not show up in arp output either a) it's down or b) you don't have proper network connectivity. (If you are missing an ARP entry for your default gateway your setup is broken)

Update: Marius Gedminas writes:

Re: http://www.enricozini.org/blog/eng/third-day-in-addis

In my experience if sudo cannot resolve the hostname (e.g. if you break /etc/hosts), you can still use sudo, but you have to wait something like 30 seconds until the DNS request times out.

I tried to break my /etc/hosts (while keeping a root shell so I can fix it if something goes wrong), but couldn't even get the timeout now. Sudo just said unable to lookup $hostname via gethostbyname() and gave me a root shell.

Posted Sat Jun 6 00:57:39 2009 Tags:

Ninth day in Addis

SSH

To enable remote logins with ssh

apt-get install openssh-server

Then you can login with:

$ ssh efossnet@proxy.dream.edu.et

To verify the host key fingerprint of a machine:

$ ssh-keygen -l -f /etc/ssh/ssh_host_rsa_key.pub

Note: you need to verify it before logging in!

More information at http://www.securityfocus.com/infocus/1806

Example ssh usages

To log in:

    $ ssh efossnet@proxy

To run a command in the remote computer:

    $ ssh efossnet@proxy "cat /etc/hosts"

To copy a file to the remote computer:

    $ scp Desktop/july-18.tar.gz efossnet@proxy:

To copy a file from the remote computer:

    $ scp efossnet@proxy:july-18.tar.gz /tmp/

Beware of brute-force login attempts

Warning about SSH: there are people who run automated scans for ssh servers and try to log in using commonly used easy passwords.

If you have an SSH server on the network, use strong passwords, or if you can it's even better to disable password authentication: in /etc/ssh/sshd_config, add:

    PasswordAuthentication no

To log in using public/private keys:

  1. Create your key:

    ssh-keygen -t rsa
    
  2. Copy your public key to the machine where you want to log in:

    ssh-copy-id -i .ssh/id_rsa.pub efossnet@proxy
    
  3. Now you can ssh using your RSA key

If you use ssh often, read these:

proxy

Problems we had today with the proxy:

ssl does not work

Reason: squid tries to directly connect to the ssl server, but the AAU network wants us to go through their proxy.

Ideal solution: none. There is no way to tell squid to use a parent proxy for SSL connections.

Solution: update the documentation for the Dream university users, telling them to set up a different proxy for SSL connections.

Longer term solution: get the AAU network admins to enable outgoing SSL connections from the Dream university proxy.

Other things that can be done:

  • report a bug on squid reporting the need and requesting the feature
  • download squid source code and implement the feature ourselves, then submit the patch to the squid people

Browsing normal pages returns an error of 'Connection refused'.

In the logs, the line is:

1153294204.912    887 192.168.0.200 TCP_MISS/503 1441 GET http://www.google.com.et/search? - NONE/- text/html

That "/503" is one of the HTTP error codes.

Explanation of the error codes:

Reason: the other proxy is refusing connections from our proxy.

Solution: none so far. Will need to get in touch with the admins of the other proxy to try to find out why it refuses connection to our proxy, and how we can fix the problem.

postfix on smtp.dream.edu.et

Basic information is at http://www.postfix.org/basic.html.

Difference between mail name and smarthost:

  • The mail name is the name of the mail server you're setting up (TODO: need more details on what it's used for)
  • The smarthost is the name of the mail server that will relay mail for you.

Quick way to send test mails:

apt-get install mailx
echo ciao | mail efossnet@localhost

To configure a workstation not to do any mail delivery locally and send all mail produced locally to smtp.dream.edu.et:

  1. install postfix choosing "Satellite system"
  2. put smtp.dream.edu.et as a smarthost.

To setup a webmail: apt-get install squirrelmail (on a working apache setup).

To setup mailing lists: apt-get install mailman, then follow the instructions in /usr/share/doc.

Mail server issues we encountered

When a mail is sent to efossnet@localhost, the system tries to send it to efossnet@yoseph.org

Investigation:

  • "yoseph.org" does not appear anywhere in /etc or /var/spool/postfix
  • postfix configuration has been reloaded
  • postfix logs show that the mail has been 'forwarded'

Cause: the user efossnet had forgotten that he or she had set up a .forward file in the home directory.

Solution:

 rm ~efossnet/.forward

Apache

To add a new website:

  1. cd /etc/apache2/sites-available
  2. sudo cp default course
  3. sudo vi course:

    1. Remove the first line
    2. Add a ServerName directive with the address of your server: ServerName course.dream.edu.et
    3. Customize the rest as needed: you at least want to remove the support for browsing /usr/share/doc and you want to use a different document root.
  4. sudo a2ensite course

  5. sudo /etc/init.d/apache2 reload

More VIM

Undo: u (in command mode)

Redo: ^R (in command mode)

You can undo and redo multiple times.

To recover a lost password for root or for the ubuntu admin user

Boot with a live CD, mount the system on the hard disk (the live CD usually does it automatically), then edit the file /etc/shadow, removing the password:

enrico:$1$3AJfasjJFHa234dfh230:13343:0:99999:7:::

becomes:

enrico::13343:0:99999:7:::

You can edit the file because, in the live CD system, you can always become root.

After you do this, reboot the system: you can log in without password, and set yourself a new password using the command passwd.

Installing packages not on the CDs

To get a package for installing when offline:

  1. apt-get --print-uris install dnsmasq
  2. Manually download the packages at the URLs that it gives you

Otherwise, apt-get --download-only install dnsmasq will download the package for you in /var/cache/apt/archives.

You can install various previously downloaded debian packages with:

dpkg -i *.deb

Backups

There are various ways:

  • dump (for ext2/ext3 file systems) or xfsdump (for xfs file systems).

    Makes a low-level dump of the file system.

    It must be used for every different partition.

    It makes the most exact backup possible, including inode numbers.

    It can do full and incremental backups.

    To see the type of the filesystems, use 'mount' with no parameters.

    To restore: restore or xfsrestore.

  • tar

    Filesystem independent.

    It can work across partitions.

    It correctly backs up permissions and hard links.

    It can do full and incremental backups.

    Example:

     tar lzcpf backup.tar.gz /home /var /etc /usr/local
     tar lzcpf root.tar.gz /
    

    To restore:

     tar zxpf backup.tar.gz
    
  • faubackup

    Filesystem independent.

    Uses hard drive as backup storage.

    Always incremental.

    It cannot do compression.

    Unchanged files in new backups are just links to old backups, and do not occupy space.

    Any old backup can be deleted at any time without compromising the others.

    It can be used to provide a "yesterday's files" service to users (both locally and exported as a read-only samba share...).

    To restore, just copy the files from the backup area.

  • amanda

     apt-get install amanda-client amanda-server
    

    It is a network backup system.

    It can do full and incremental backups.

    You can have a backup server which handles the storage and various backup clients that send the files to backup to the server.

    It takes some studying to set up.

    To restore: it has its own tool.

Some data requires exporting before backing it up:

  • To save the list of installed packages and the answer to configuration questions:

     dpkg --get-selections > pkglist
     debconf-get-selections > pkgconfig
    

    To restore:

     dpkg --set-selections < list
     debconf-set-selections < pkgconfig
     apt-get dselect-upgrade
    

    If you do this, then you only need to back up /etc, /home, /usr/local, /var.

  • To save the contents of a MySQL database:

     mysqldump name-of-database | gzip > name-of-database.dump.gz
    

    To restore:

     zcat name-of-database.dump.gz | mysql
    

You can schedule these dumps to be made one hour before the time you make backups.

Scheduling tasks

As a user:

crontab -e

As root: add a file in one of the /etc/cron.* directories.

In cron.{hourly,daily,weekly,monthly} you put scripts.

In the other directories you put crontab files (man 5 crontab).

If the system is turned off during normal maintenance hours, you can do two things:

  1. Change /etc/crontab to use different maintenance hours
  2. Install anacron (it's installed by default in Ubuntu)

For scheduling one-shot tasks, use at(1):

$ at 17:40
echo "Please tell Enrico that the lesson is finished" | mail efossnet@dream.edu.et
^D

When and how to automate

  1. First, you manage to do it yourself
  2. Then, you document it
  3. Then, you automate it

Start at step 1 and go to 2 or 3 if/when you actually need it.

(credits to sto@debian.org: he's the one from whom I first heard it put so well).

Interesting programs to schedule during maintenance

  • rkhunter, chkrootkit
  • checksecurity
  • debsecan
  • tiger

Important keys to know in a Unix terminal

These are special keys that work on Unix terminals:

  • ^C: interrupt (sends SIGINT)
  • ^\: quit (sends SIGQUIT)
  • ^D: end of input
  • ^S: stop scrolling
  • ^Q: resume scrolling

Therefore, if the terminal looks like it got stuck, try hitting ^Q.

Problems we had today with postfix

  • Problem: mail to efossnet@dream.edu.et is accepted only if sent locally.

    Reason:

     $ host -t mx dream.edu.et
     Host dream.edu.et not found: 3(NXDOMAIN)
    

    Solution: tell dnsmasq to handle a MX record also for dream.edu.et:

    mx-host=dream.edu.et,smtp.dream.edu.et,50

  • The problem was not solved by the previous solution.

    Reason: postfix was making complaints which mentioned localhost as a domain name.

    Solution: fixed by changing 'myhostname' in main.cf to something different than localhost.

    Note: solved by luck. Investigate why this happened.

Problems found yesterday and today

  • there is no way to tell squid to use another proxy for SSL connections: it only does them directly
  • if you want to configure evolution to get mail from /var/mail/user, you need to explicitly enter the path. It would be trivially easier if evolution presented a good default, since it's easy to compute. It would also be useful if below the "Path" entry there were some text telling what path is being requested: the mail spool? the evolution mail storage?
  • In Evolution: IMAP or IMAPv4r1? What is the difference? Why should I care?
  • apt-get --print-uris doesn't print the URIs if the package is in the local cache, and there seems to be no way to make it do so.
  • in /etc/apache2/sites-available/default, is the NameVirtualHost * directive appropriate there? It gets in the way when using 'default' as a template for new sites.

    Otherwise, one can add a new (disabled) site that can be used as a template for new sites instead of default.

  • the default comments put by crontab -e are not that easy to read.

Posted Sat Jun 6 00:57:39 2009 Tags:

Tenth day in Addis

Procedure to check if all the services of Dream University are up and running

If a machine blocks pings, use arping instead.

  1. Test DHCP:

    $ sudo ifdown eth0
    $ sudo ifup eth0
    $ ifconfig
    
  2. Test the DNS:

    # See if the DNS machine is on
    # The network
    $ ping -n 192.168.0.1
    
    # See if the DNS resolves names
    $ host www.dream.edu.et
    
  3. Test the gateway:

    # Ping the gateway
    $ ping gateway
    # Ping an outside host
    $ ping -n 10.4.15.6
    
  4. Test the proxy:

    # Ping the proxy
    $ ping proxy
    # Open a web page and see if it displays
    # See if it caches
    http_proxy=http://proxy.dream.edu.et:3030/ wget -S -O/dev/null http://www.enricozini.org  2>&1 | grep X-Cache
    
  5. Test the mail server:

    $ ping smtp
    $ nmap smtp -p 25 |grep 25/tcp
    $ if nmap smtp -p 25 |grep 25/tcp | grep -q open ; then echo "It works"; fi
    $ send a mail and see if you receive it
    

To do more advanced network and service monitoring, try nagios:

New useful tools seen today

wget - The non-interactive network downloader.

Special devices

  • /dev/null:
    • On read, there is no data.
    • On write, discards data.
  • /dev/zero:
    • On read, reads an infinite amount of zero bytes.
    • On write, discards data.
  • /dev/random, /dev/urandom
    • On read, reads random bits.
    • On write, discards data.
    • Difference: /dev/random is cryptographically secure, but it can hang waiting for system events

Example uses:

wget -O/dev/null http://www.example.org

dd if=/dev/zero of=testdisk bs=1M count=50
mke2fs testdisk
sudo mount -o loop testdisk  /mnt

Tiny little commands

  • true - do nothing, successfully
  • false - do nothing, unsuccessfully
  • yes - output a string repeatedly until killed

Example uses:

  • while /bin/true; do echo ciao; done
  • Using /bin/false as a shell
  • yes | boring-tool-that-asks-lots-of-silly-questions

Some more shell syntax

  • 2>&1 Redirects the standard error to the standard output
  • 2> Redirects the standard error instead of the standard output

Some people run commands ignoring the standard error (command 2> /dev/null): this causes unexpected error messages to go unnoticed; please do not do it.

What to check if a machine is very slow

  • See if the RAM is full: $ free. If it is, see which programs are the fattest using top, pressing M to sort by memory usage.
  • See if there are lots of programs competing for CPU: $ top
  • Check if you have I/O bottlenecks: $ vmstat (but I don't know how to read it)
  • For a desktop on older hardware, you can try xubuntu instead of ubuntu

More VIM command mode

Command mode allows to perform various text editing functions.

You work by performing operations on selected blocks of text.

Some common operations:

  • y: copy ("yank")
  • p: paste
  • P: paste before
  • d: cut ("delete")
  • c: change
  • i: insert
  • a: append
  • .: repeat last operation

Some common blocks:

  • w: word
  • }: paragraph
  • left and right arrow: one character left or right
  • up and down arrow: this line and the one on top or below
  • f letter: from the cursor until the given letter
  • v: selection
  • V: line selection
  • ^V: block selection

Examples:

  • yw: copy word
  • dw: cut word
  • yy: copy line
  • dd: cut line
  • V (select lines) y: copy a selection of lines
  • V (select lines) d: cut a selection of lines
  • p: paste

The best way to learn more vim is always to run vimtutor.

Installing squirrelmail

To install squirrelmail:

  1. apt-get install squirrelmail
  2. /usr/sbin/squirrelmail-config and configure IMAP and SMTP.

    In our case, since we use IMAPS, the IMAP server is imap.dream.edu.et, port 993, secure IMAP enabled and SMTP is smtp.dream.edu.et.

  3. Read /usr/share/doc/squirrelmail/README.Debian.gz (with zless) for how to proceed with setup. A short summary:
    • link /etc/squirrelmail/apache.conf into the apache conf.d directory
    • customise /etc/squirrelmail/apache.conf for example setting up the virtual hosts, or running it only on SSL

To have different virtual hosts over HTTPS, you need to have a different IP for every virtual host: name based virtual hosts do not work on HTTPS.

You can configure multiple IP addresses on the same computer: use network interfaces named: eth0:1, eth0:2, eth0:3... These are called interface aliases.

You cannot set up interface aliases using the graphical network configuration; you need to add them in /etc/network/interfaces:

    iface eth0:1 inet static
          address 192.168.0.201
          netmask 255.255.255.0
          gateway 192.168.0.3
    auto eth0:1

This is the trick commonly used to put different virtual HTTPS hosts on the same computer.

Links

squid documentation:

Shell programming:

Performance analysis:

Setting up mail services:

Posted Sat Jun 6 00:57:39 2009 Tags:

Fifth day in Addis

Samba

To get samba:

    apt-get install samba samba-doc smbclient

To get the Samba Web Administration Tool:

    apt-get install swat netkit-inetd

The configuration is in /etc/samba:

  • One [global] section with the general settings
  • One section per share

One could use swat at http://localhost:901/ but it does not work easily on Ubuntu.

To see what is shared:

    smbclient -L localhost

To access a share:

    smbclient //localhost/name-of-the-share

To add a new user:

    sudo smbpasswd -a username

To change the password of a user:

    sudo smbpasswd username

To test accessing a share as a user:

    smbclient //localhost/web -U yared

Documentation:

    man smb.conf

To force the user or group used to access a share:

    force user = enrico
    force group = www-data

To set the unix permissions for every created file:

    # For files
    create mask = 0664
    # For directories
    directory mask = 0775

Example share configuration for a webspace:

    mkdir /var/www/public
    chgrp www-data /var/www/public
    chmod 0775 /var/www/public

Then, in /etc/samba/smb.conf:

    [web]
       comment = Webspace
       path = /var/www
       writable = yes
       public = no
       force group = www-data
       create mask = 0664
       directory mask = 0775

Example share configuration for a read only directory where only a limited group of people can write:

    [documents]
       comment = Documents
       path = /home/enrico/Desktop/documents
       force user = enrico
       public = yes
       writable = no
       write list = enrico, yared

Print server (CUPS)

Installation:

    apt-get install cupsys

Configuration:

  • On the web (not enabled in Ubuntu):

     http://localhost:631/
    
  • On the desktop:

     System/Administration/Printing
    

Example IPP URIs:

    ipp://server[:port]/printers/queue
    http://server:631/printers/queue
    ipp://server[:port]/...

For example:

    ipp://server/printers/laserjet

"This printer uri scheme can be used to contact local or remote print services to address a particular queue on the named host in the uri. The "ipp" uri scheme is specified in the Internet Print Protocol specifications and is actually much more free form that listed above. All Solaris and CUPS based print queues will be accessed using the formats listed above. Access to print queues on other IPP based print servers requires use of the server supported ipp uri format. Generally, it will be one of the formats listed above."

LDAP Lightweight Directory Access Protocol

Installation:

    apt-get install ldap-utils slapd

The configuration is in /etc/ldap.

To access a ldap server:

    apt-get install gq

Various LDAP HOWTOs:

GRUB

The configuration file is in /boot/grub/menu.lst.

The documentation can be accessed as info grub after installing the package grub-doc.

Quick list of keys for info:

  • arrows: move around
  • enter: enters a section
  • l: goes back
  • u: goes up one node
  • q: quit
  • /: search

Grub trick to have a memory checker:

  1. apt-get install memtest86+
  2. Add this to /boot/grub/menu.lst:

    title Memory test
        root (hd0,5)
        kernel /boot/memtest86+.bin
    

Firewall

With iptables:

    man iptables
    # Only allow in input the network packets
    # that are going to the web server
    iptables -P INPUT DROP
    iptables -A INPUT --protocol tcp --destination port 80 -j ACCEPT
    # To reset the input chain as the default
    iptables -F INPUT
    iptables -P INPUT ACCEPT

Some links:

Squid

Installation:

    apt-get install squid

The configuration is in /etc/squid/squid.conf.

To allow the local network to use the proxy:

    # Add this before "http_access deny all"
    acl our_networks src 10.4.15.0/24
    http_access allow our_networks

To use a parent proxy:

    cache_peer proxy.aau.edu.et     parent    8080  0  proxy-only no-query

Pay attention because /var/spool/squid will grow as the cache is used. The maximum cache size is set in the directive cache_dir.

Information about squid access control is at http://www.squid-cache.org/Doc/FAQ/FAQ-10.html

To check that the configuration has no syntactic errors: squid -k parse.

To match urls:

    acl forbiddensites url_regex [-i] regexp

For info about regular expressions:

    man regex

Example filtering by regular expression:

    acl skype url_regex -i [^A-Za-z]skype[^A-Za-z]
    http_access deny skype

Transparent proxy setup: http://www.tldp.org/HOWTO/TransparentProxy.html

Problems found today

Hiccups of the day:

  • swat does not run on Ubuntu because Ubuntu does not have inetd
  • swat does not allow root login on Ubuntu because root does not have a password
  • smbpasswd -a does not seem to update the timestamp of /var/lib/samba/passwd.tdb
  • cups web admin does not work on Ubuntu
  • LDAP is still not so intuitive to set up

Update: Marius Gedminas writes:

I think it would be a good idea to mention that running

     iptables -P INPUT DROP

in the shell is a Bad Idea if you're logged in remotely via SSH.

Posted Sat Jun 6 00:57:39 2009 Tags:

First day in Addis

First day in Addis Ababa, after the introductory session for this 10 days Linux training.

Interesting new quotes I picked up from the excellent presentation of Dr. Dawit:

Much that I bound I could not free Much that I freed returned to me

(I didn't manage to transcribe the attribution)

And this one for Bubulle, about translation:

When you speak to me in my language you speak to my heart when you speak to me in English you speak to my head

(sb.)

Incomplete list of questions I've been asked, in bogosort -n order:

  • How do I get support?
  • Are the configuration files always the same across different distributions?
  • What is the level of interoperability between the various Linux distributions? And between different Unix-like systems?
  • Does plug and play work well when I change hardware?
  • Can I access NTFS partitions?
  • How do I play multimedia files in restricted formats?
  • I heard that NFS has security problems: can it be secured, or are there other file sharing alternatives?
  • Can I access a desktop remotely?
  • Can I install Linux on a computer where there's Windows already? Do I need to partition?
  • Can I be sure to find drivers for my hardware?

I'm happy to find that we've been successful in building more and more good answers for these questions.

Posted Sat Jun 6 00:57:39 2009 Tags:

Addis course Tasks & Skills questions

  • What does the command find /etc | less do?

  • What does the command ps aux do?

  • What does the command mii-tool do and when would you use it?

  • What does the command host www.google.com do?

  • How do you get the MAC address of your computer?

  • What can you use dnsmasq for?

  • What is in /etc/dnsmasq.conf?

  • What is the use of the dhcp-option configuration parameter of /etc/dnsmasq.conf?

  • What is the difference between chown, chgrp and chmod?

  • What would you use nmap for?

  • How do you check to see if a network service is running on your computer?

  • What does apache2ctl configtest do? When should you run it?

  • Consider this piece of configuration of apache:

     AuthUserFile /etc/apache2/students
     AuthType Basic
     AuthName "Students"
     Require valid-user
    

    What does it do?

    What command would you use to add a new username and password to /etc/apache2/students? (you can write the entire commandline if you know it, but just the name of the command is fine)

  • You created the configuration for a new apache site in /etc/apache2/sites-available. How do you activate the new site?

  • When do you need to add the line Listen 443 to /etc/apache2/ports.conf?

  • What do you normally find in /var/log/syslog, and when would you read it?

  • What does the command smbclient //localhost/web do?

  • What does the command sudo smbpasswd -a enrico do?

  • Where do you look for the explanation of the many directives found in /etc/samba/smb.conf?

  • What is the purpose of the package cupsys?

  • What is the purpose of the command iptables?

  • What is the difference between MDA, MTA and MUA?

  • In a normal mail server configuration, when should you accept a mail coming from outside your local network?

  • Suppose you are a mail software and you need to send a mail to addis@yahoo.com: how do you find out the internet host to which you should connect to send the mail?

  • What is the difference between man 5 postconf and man 8 postconf?

  • What is the different use of SMTP and IMAP?

  • What is a "smarthost" in the context of mail server configuration?

  • What does the command mailq do?

  • What does the command sudo postsuper -d ALL deferred do?

  • Postfix has four mail queues: "incoming", "active", "deferred" and "hold". What is the difference among them?

  • What does the package dovecot do?

  • In the file /etc/dovecot/dovecot.conf, what is the difference between having protocols = imap and protocols = imaps?

  • What happens if I put the line enrico@enricozini.org in the file /home/enrico/.forward?

  • Consider this list of possible strategies for handling mail classified as spam:

    • silently delete it
    • refuse the mail and send a notification to the sender
    • refuse the mail and send a notification to the receiver
    • quarantine the e-mail
    • refuse delivery with a SMTP error
    • deliver with an extra header that says that it's spam

    What are their advantages and disadvantages?

Posted Sat Jun 6 00:57:39 2009 Tags:

Ethiopia

It is interesting, beautiful and sad at the same time to find yourself redefining the meaning of "Abyssinia". And to curse the fact that for the first 30 years of your life you only heard that word when some asshole sang "Faccetta nera".

Posted Sat Jun 6 00:57:39 2009 Tags:

First practical lesson

Notes after today's training session.

Small index of most used shell commands:

  • ls - list directory contents
  • cp - copy files and directories
  • mv - move (rename) files
  • rm - remove files or directories
  • find - search for files in a directory hierarchy
  • cat - concatenate files and print on the standard output
  • more - file perusal filter for crt viewing
  • less - opposite of more (quit with 'q')
  • cd - Change the current directory to DIR. (use "help cd" instead of "man cd")
  • mkdir - make directories
  • rmdir - remove empty directories

Small index of commands useful for combining in pipelines:

  • grep, egrep, fgrep, rgrep - print lines matching a pattern
  • tail - output the last part of files
  • head - output the first part of files
  • sort - sort lines of text files
  • uniq - report or omit repeated lines
  • sed - stream editor
  • wc - print the number of newlines, words, and bytes in files
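
For instance (an illustration of mine, not from the lesson), several of these combine to count which login shells are configured on the system:

     sed 's/.*://' /etc/passwd | sort | uniq -c | sort -rn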

Problems found during the lesson:

  • You set the system default locale to Amharic, and the gdm login will be in Amharic input mode. We didn't find out how to switch it back to input roman characters. Right click on the input field to set the input method doesn't work. Since usernames are not in Amharic, you're locked out.
  • So you CTRL+ALT+F1, login and try dpkg-reconfigure locales. On Ubuntu Dapper, it does not work anymore.
  • So you dig and dig and dig and finally find that you can force a locale in /etc/default/gdm (but not in /etc/gdm/locale.conf, nor in /etc/gdm/gdm.conf).
  • Then the internet works for a bit and you look up how to reconfigure locales in Ubuntu. Turns out you have to use localeconf, which is not installed by default, is not in universe and thus not on the CDs, and needs to be downloaded from the Internet.
  • The Ubuntu wiki is all on https, which defeats any attempt of proxy caching.
  • An Internet proxy needs to be configured 3 times: in Gnome, in Firefox and in Synaptic (well, apt). This is especially tricky when you forget to set up the proxy in Synaptic and seemingly unrelated applications fail, like the Ubuntu language selector, which internally invokes the package manager to download missing langpacks.
  • Some short descriptions in the NAME section of manpages are hard to understand, or wrong. Noted on apt-get, apt-cache and less. Top prize goes to apt-cache:

     NAME
            apt-cache - APT package handling utility -- cache manipulator
     DESCRIPTION
            [...] apt-cache does not manipulate the state of the system but
            does provide operations to search and generate interesting output
            from the package metadata. [...]
    

    So apt-cache is a manipulator that doesn't manipulate. A possible improvement could be "query the APT package cache".

  • The language selector in Ubuntu Breezy doesn't really exit and keeps the package database locked. This seems to be fixed in Dapper, and probably had been fixed in some Breezy update. System updates here are a problem: my Dapper (with some Universe things in it) wanted to download more than 120Mb of data, and the Uni network was giving me 14Kbps. It's been a nice opportunity to teach about fuser -uva and kill.
  • dict, squid and many other packages from 'main' are not on the normal Ubuntu CDs: is there an easy way to build a CD with them? Or do Ubuntu CDs with extra packages already exist? I'll have to find out.
  • cupsys has documentation outside of /usr/share/doc, in /usr/share/cups/doc-root.
  • man works on all commands, except cd, which is an internal shell command and thus needs help instead of man. I should remember to ponder about autogenerating manpages from help output.
  • Is there an index-like manpage with a list of the core Unix commands and their short descriptions? If there's not, it's easy to generate one:

     #!/bin/sh
     # Print the NAME summary of the manpage of every executable
     # in the given directory (default: /bin)
     DIR=${1:-"/bin"}
     (
     find "$DIR" | while read FILE
     do
         if [ -x "$FILE" ] && ! [ -d "$FILE" ]
         then
             LANG=C COLUMNS=2000 man "$(basename "$FILE")" | \
                      grep ^SYNOPSIS -B 100 | grep ^NAME -A 100 | \
                      tail -n +2 | head -n 2 | \
                      grep -v '^[ \t]*$'
         fi
     done
     ) | sort | uniq | sed 's/^ \+//'
    

    Try running it on /bin and /sbin: it's great! Also, since it doesn't redirect stderr, it nicely exposes a number of manpage problems.

Lots of bugs to report when I come home: from here it'll take ages, and lots of money on the hotel internet connection, and some are Ubuntu-specific so I'd need to do everything online with Malone.

As usual, teaching is one of the best ways to find bugs.

I propose an Etch training session a month before release.

Other things to do:

  • Find more info about that Wikipedia live CD with Wikipedia browsable without the Internet.
  • Make a collection of Free technical E-books: even those Indian low-cost book editions are too expensive here, so E-books mean a lot.

Update: Matt Zimmerman writes:

I read your blog entry at http://www.enricozini.org/blog/eng/second-day-in-addis and wanted to respond as follows:

  • localeconf is not the standard way to configure locales in Ubuntu; what documentation told you that? It's an unsupported package from Progeny. If what you wanted was to set the system default locale from the command line, editing /etc/environment is probably the best way.

  • I suggest filing a bug report at https://launchpad.net/products/ubuntu-website about the HTTPS issue; I don't think it's necessary for the entire wiki to be HTTPS, only authentication.

  • Synaptic may be able to use the GNOME proxy settings without introducing undesirable dependencies; please file a wishlist bug

  • dict, squid and other packages from main are not on the Ubuntu CDs because there is no space. The DVD contains these packages.

  • The cupsys documentation bug was quite likely inherited from Debian and should be reported there

  • You can file bugs in Malone via email; this has been possible for a long time now. Please don't reinforce this misconception.

    https://help.launchpad.net/UsingMaloneEmail

Posted Sat Jun 6 00:57:39 2009 Tags:

Fourth day in Addis

Unix file permissions:

    drwxr-xr-x   2 root root    38 2006-07-14 
    |
    +- Is a directory

    drwxr-xr-x   2 root root    38 2006-07-14 
     ---
      |
      +- User permissions (u)

    drwxr-xr-x   2 root root    38 2006-07-14 
        ---
         |
         +- Group permissions (g)

    drwxr-xr-x   2 root root    38 2006-07-14 
           ---
            |
            +- Permissions for others (o)

    drwxr-xr-x   2 root root    38 2006-07-14 
                   ----
                    |
                    +- Owner user

    drwxr-xr-x   2 root root    38 2006-07-14 
                        ----
                         |
            Owner group -+

Other bits:

  • 4000 Set user ID:

    • For executable files: run as the user who owns the file, instead of the user who runs the file
    • For directories: I think it's not used
  • 2000 Set group ID:

    • For executable files: run as the group who owns the file, instead of the group of the user who runs the file
    • For directories: when a file is created inside the directory, it belongs to the group of the directory instead of the default group of the user who created the file (see the example right after this list)
  • 1000 Sticky bit:

    • For files: I think it's not used anymore
    • For directories: only the owner of a file can delete or rename the file
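
A quick way to see the setgid directory rule in action (an illustrative transcript of mine; it assumes a group named students exists, like the one created in the example further down):

     $ mkdir shared
     $ sudo chgrp students shared
     $ sudo chmod 2775 shared    # drwxrwsr-x: the 's' is the setgid bit
     $ touch shared/report.txt
     $ ls -l shared/report.txt   # the new file belongs to group 'students',
                                 # not to the creator's default group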

The executable bit for directories means "can access the files in the directory".

If a directory is readable but not executable, then I can see the list of files (with ls) but I cannot access the files.

To access a file, all the directories of its path up to / need to be executable.
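
A minimal transcript showing the difference (my illustration):

     $ mkdir box; echo hello > box/file
     $ chmod 644 box      # readable but not executable
     $ ls box             # listing the names still works (maybe with warnings)
     file
     $ cat box/file       # accessing the files does not
     cat: box/file: Permission denied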

Commands to manipulate permissions:

  • chown - change file owner and group
  • chgrp - change group ownership
  • chmod - change file access permissions

  • sudo adduser enrico www-data adds the user enrico to the group www-data.

Example setup for a website for students:

    # Create the group 'students' and its directory
    addgroup students
    mkdir /var/www/students
    chgrp students /var/www/students
    chmod 2775 /var/www/students

    # If you don't want other users to read the files of the students:

    chmod 2770 /var/www/students
    adduser www-data students
    # (this way the web server can read the pages)

    # when you add a user to a group, it does not affect running processes:

     - users need to log out and in again
     - servers need to be restarted

Apache:

  • To install apache2 without a graphical interface:

     apt-cache search apache2 | less
     sudo apt-get install apache2
    
  • By default, /var/www is where the static website lives.

  • By default, ~/public_html is the personal webspace for every user, accessible as: http://localhost/~user

  • By default, /usr/lib/cgi-bin contains scripts that are executed when someone browses http://website/cgi-bin/script

  • By default, apache reads the server name from the DNS. If we don't have a name in the DNS and we want to use the IP, we need to set:

     ServerName 10.4.15.158
    

    in /etc/apache2/apache2.conf (set it to your IP address)

  • To access the Apache manual: http://localhost/doc/apache2-doc/manual/

  • http://localhost/doc/apache2-doc/manual/mod/mod_access.html The access control module

  • http://localhost/doc/apache2-doc/manual/mod/mod_auth.html The user authentication module

  • To edit a user password file (see the worked example after this list), use:

     htpasswd - Manage user files for basic authentication
    
  • Example .htaccess file to password protect a directory:

     AuthUserFile /etc/apache2/students
     AuthType Basic
     AuthName "Students"
     Require valid-user
    
  • Information about .htaccess is in http://localhost/doc/apache2-doc/manual/howto/htaccess.html

  • If you need to tell apache to listen on different ports, add a Listen directive to /etc/apache2/ports.conf. Then you can use:

     <VirtualHost www.training.aau.edu.et:9000>
     [...]
     </VirtualHost>
    
  • To setup an HTTPS website:

    • Documentation is in http://localhost/doc/apache2-doc/manual/ssl/
    • How to create a certificate: http://www.tc.umn.edu/~brams006/selfsign.html

    • Create a certificate:

      /usr/sbin/apache2-ssl-certificate -days 365

    • Create a virtual host on port 443:

      [...]

    • Enable SSL in the VirtualHost:

      SSLEngine On
      SSLCertificateFile /etc/apache2/ssl/apache.pem

    • Enable listening on the HTTPS port (/etc/apache2/ports.conf):

      Listen 443
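
Worked example for htpasswd (a sketch of mine; the user names are made up):

     # Create the password file with a first user (-c only the first time)
     sudo htpasswd -c /etc/apache2/students abebe
     # Add further users to the existing file
     sudo htpasswd /etc/apache2/students makeda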

Apache troubleshooting:

  • check that there are no errors in the configuration file:

     apache2ctl configtest
    

    This is always a good thing to do before restarting or reloading apache.

  • read logs in /var/log/apache2/

  • if you made a change but you don't see it on the web, the browser may have cached the old page: try reloading a few times.

To install PHP

  • apt-get install libapache2-mod-php5
  • then by default, every file .php is executed as php code
  • Small but useful test php file:

     <?php phpinfo(); ?>
    

To install MySQL

  • apt-get install mysql-client mysql-server
  • for administration run mysql as root:

    • Create a database with:

      create database students;

  • Give a user access to the database:

     # Without password
     grant all on students.* to enrico;
    
     # With password
     grant all on students.* to enrico identified by "SECRET";
    
  • More information can be found at http://www-css.fnal.gov/dsg/external/freeware/mysqlAdmin.html

To use MySQL from PHP:

    apt-get install php5-mysqli php5-mysql

Problems found today:

  • the apache2 manual in /usr/share/doc/manual can only be viewed through apache because it uses MultiViews. So you need a working apache to read how to get a working apache.

  • chmod does not have examples in the manpage.

Posted Sat Jun 6 00:57:39 2009 Tags:

Buildings

From a song in Amharic:

"Your love has grown old

like the buildings built by the Italians"

Posted Sat Jun 6 00:57:39 2009 Tags:
Posted Sat Jun 6 00:57:39 2009
ppy

Posts for Planet Python.

Custom function decorators with TurboGears 2

I am exposing some library functions using a TurboGears2 controller (see web-api-with-turbogears2). It turns out that some functions return a dict, some a list, some a string, and TurboGears 2 only allows JSON serialisation for dicts.

A simple work-around for this is to wrap the function result into a dict, something like this:

@expose("json")
@validate(validator_dispatcher, error_handler=api_validation_error)
def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
    # Call API
    res = self.engine.list_colours(filter, productID, maxResults)

    # Return result
    return dict(r=res)

It would be nice, however, to have an @webapi() decorator that automatically wraps the function result with the dict:

def webapi(func):
    def dict_wrap(*args, **kw):
        return dict(r=func(*args, **kw))
    return dict_wrap

# ...in the controller...

    @expose("json")
    @validate(validator_dispatcher, error_handler=api_validation_error)
    @webapi
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)

        # Return result
        return res

This works, as long as @webapi appears last in the list of decorators. This is because if it appears last it will be the first to wrap the function, and so it will not interfere with the tg.decorators machinery.
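
As a reminder of why the order matters (plain Python decorator semantics, nothing TurboGears-specific):

def deco_a(func):
    def wrapper(*args, **kw):
        return "a(%s)" % func(*args, **kw)
    return wrapper

def deco_b(func):
    def wrapper(*args, **kw):
        return "b(%s)" % func(*args, **kw)
    return wrapper

@deco_a
@deco_b
def hello():
    return "hello"

# deco_b, the decorator closest to the def, wraps first:
print hello()   # prints a(b(hello))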

Would it be possible to create a decorator that can be put anywhere among the decorator list? Yes, it is possible but tricky, and it gives me the feeling that it may break in any future version of TurboGears:

def webapi(func):
    def dict_wrap(*args, **kw):
        return dict(r=func(*args, **kw))
    # Migrate the decoration attribute to our new function, so that
    # the tg.decorators machinery keeps working
    if hasattr(func, 'decoration'):
        dict_wrap.decoration = func.decoration
        dict_wrap.decoration.controller = dict_wrap
        delattr(func, 'decoration')
    return dict_wrap

# ...in the controller...

    @expose("json")
    @validate(validator_dispatcher, error_handler=api_validation_error)
    @webapi
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)

        # Return result
        return res

As a convenience, TurboGears 2 offers, in the decorators module, a way to build decorator "hooks":

class before_validate(_hook_decorator):
    '''A list of callables to be run before validation is performed'''
    hook_name = 'before_validate'

class before_call(_hook_decorator):
    '''A list of callables to be run before the controller method is called'''
    hook_name = 'before_call'

class before_render(_hook_decorator):
    '''A list of callables to be run before the template is rendered'''
    hook_name = 'before_render'

class after_render(_hook_decorator):
    '''A list of callables to be run after the template is rendered.

    Will be run before it is returned returned up the WSGI stack'''

    hook_name = 'after_render'

The way these are invoked can be found in the _perform_call function in tg/controllers.py.

To show an example use of those hooks, let's add some polygen wisdom to every data structure we return:

class wisdom(decorators.before_render):
    def __init__(self, grammar):
        super(wisdom, self).__init__(self.add_wisdom)
        self.grammar = grammar
    def add_wisdom(self, remainder, params, output):
        from subprocess import Popen, PIPE
        output["wisdom"] = Popen(["polyrun", self.grammar], stdout=PIPE).communicate()[0]

# ...in the controller...

    @wisdom("genius")
    @expose("json")
    @validate(validator_dispatcher, error_handler=api_validation_error)
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)
    
        # Return result
        return res

These hooks cannot however be used for what I need, that is, to wrap the result inside a dict. The reason is that they are called like this:

        controller.decoration.run_hooks(
                'before_render', remainder, params, output)

and not in this way:

        output = controller.decoration.run_hooks(
                'before_render', remainder, params, output)

So it is possible to modify the output (if it is a mutable structure) but not to exchange it with something else.

Can we do even better? Sure we can. We can fold @expose and @validate into @webapi, to avoid repeating the same long stack of decorators over and over again:

class webapi(object):
    def __init__(self, error_handler = None):
        self.error_handler = error_handler

    def __call__(self, func):
        def dict_wrap(*args, **kw):
            return dict(r=func(*args, **kw))
        res = expose("json")(dict_wrap)
        res = validate(validator_dispatcher, error_handler=self.error_handler)(res)
        return res

# ...in the controller...

    @expose("json")
    def api_validation_error(self, **kw):
        pylons.response.status = "400 Error"
        return dict(e="validation error on input fields", form_errors=pylons.c.form_errors)

    @webapi(error_handler=api_validation_error)
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)

        # Return result
        return res

This got rid of @expose and @validate, and provides almost all the default values that I need. Unfortunately I could not find out how to access api_validation_error from inside the decorator so that I could pass it to the validator, so I am left with the inconvenience of having to pass it explicitly every time.

Posted Wed Nov 4 17:52:38 2009 Tags:

Building a web-based API with Turbogears2

I am using TurboGears2 to export a python API over the web. Every API method is wrapped by a controller method that validates the parameters and returns the results encoded in JSON.

The basic idea is this:

@expose("json")
def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
    # Call API
    res = self.engine.list_colours(filter, productID, maxResults)

    # Return result
    return res

To validate the parameters we can use forms; it's their job, after all:

class ListColoursForm(TableForm):
    fields = [
            # One field per parameter
            twf.TextField("filter", help_text="Please enter the string to use as a filter"),
            twf.TextField("productID", help_text="Please enter the product ID"),
            twf.TextField("maxResults", validator=twfv.Int(min=0), default=200, size=5, help_text="Please enter the maximum number of results"),
    ]
list_colours_form=ListColoursForm()

#...

    @expose("json")
    @validate(list_colours_form, error_handler=list_colours_validation_error)
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Parameter validation is done by the form
    
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)
    
        # Return result
        return res

All straightforward so far. However, this means that we need two exposed methods for every API call: one for the API call and one error handler. For every API call we have to type the name several times, which is error prone and risks getting things mixed up.

We can however have a single error handler for all methods:

def get_method():
    '''
    The method name is the first url component after the controller name that
    does not start with 'test'
    '''
    found_controller = False
    for name in pylons.c.url.split("/"):
        if not found_controller and name == "controllername":
            found_controller = True
            continue
        if name.startswith("test"):
            continue
        if found_controller:
            return name
    return None

class ValidatorDispatcher:
    '''
    Validate using the right form according to the value of the "method" field
    '''
    def validate(self, args, state):
        method = args.get("method", None)
        # Extract the method from the URL if it is missing
        if method is None:
            method = get_method()
            args["method"] = method
        return forms[method].validate(args, state)

validator_dispatcher = ValidatorDispatcher()

This validator will try to find the method name, either as a form field or by parsing the URL. It will then use the method name to find the form to use for validation, and pass control to the validate method of that form.

We then need to add an extra "method" field to our forms, and arrange the forms inside a dictionary:

class ListColoursForm(TableForm):
    fields = [
            # One hidden field to have a place for the method name
            twf.HiddenField("method"),
            # One field per parameter
            twf.TextField("filter", help_text="Please enter the string to use as a filter"),
    #...

forms["list_colours"] = ListColoursForm()

And now our methods become much nicer to write:

    @expose("json")
    def api_validation_error(self, **kw):
        pylons.response.status = "400 Error"
        return dict(form_errors=pylons.c.form_errors)

    @expose("json")
    @validate(validator_dispatcher, error_handler=api_validation_error)
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Parameter validation is done by the form
    
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)
    
        # Return result
        return res

api_validation_error is interesting: it returns a proper HTTP error status, and a JSON body with the details of the error, taken straight from the form validators. It took me a while to find out that the form errors are in pylons.c.form_errors (and for reference, the form values are in pylons.c.form_values). pylons.response is a WebOb Response that we can play with.

So now our client side is able to call the API methods, and get a proper error if it calls them wrong.

But now that we have the forms ready, it doesn't take much to display them in web pages as well:

def _describe(self, method):
    "Return a dict describing an API method"
    ldesc = getattr(self.engine, method).__doc__.strip()
    sdesc = ldesc.split("\n")[0]
    return dict(name=method, sdesc = sdesc, ldesc = ldesc)

@expose("myappserver.templates.myappapi")
def index(self):
    '''
    Show an index of exported API methods
    '''
    methods = dict()
    for m in forms.keys():
        methods[m] = self._describe(m)
    return dict(methods=methods)

@expose('myappserver.templates.testform')
def testform(self, method, **kw):
    '''
    Show a form with the parameters of an API method
    '''
    kw["method"] = method
    return dict(method=method, action="/myapp/test/"+method, value=kw, info=self._describe(method), form=forms[method])

@expose(content_type="text/plain")
@validate(validator_dispatcher, error_handler=testform)
def test(self, method, **kw):
    '''
    Run an API method and show its prettyprinted result
    '''
    res = getattr(self, str(method))(**kw)
    return pprint.pformat(res)

In a few lines, we have all we need: an index of the API methods (including their documentation taken from the docstrings!), and for each method a form to invoke it and a page to see the results.

Make the forms children of AjaxForm, and you can even see the results together with the form.

Posted Thu Oct 15 15:45:39 2009 Tags:

Creating pipelines with subprocess

It is possible to create process pipelines using subprocess.Popen, by just using stdout=subprocess.PIPE and stdin=otherproc.stdout.

Almost.

In a pipeline created in this way, the stdout of all processes except the last is opened twice: once in the script that has run the subprocess and another time in the standard input of the next process in the pipeline.

This is a problem because if a process closes its stdin, the previous process in the pipeline does not get SIGPIPE when trying to write to its stdout, because that pipe is still open on the caller process. If this happens, a wait on that process will hang forever: the child process waits for the parent to read its stdout, the parent process waits for the child process to exit.

The trick is to close the stdout of each process in the pipeline except the last just after creating them:

#!/usr/bin/python
# coding=utf-8

import subprocess

def pipe(*args):
    '''
    Takes as parameters several dicts, each with the same
    parameters passed to popen.

    Runs the various processes in a pipeline, connecting
    the stdout of every process except the last with the
    stdin of the next process.
    '''
    if len(args) < 2:
        raise ValueError, "pipe needs at least 2 processes"
    # Set stdout=PIPE in every subprocess except the last
    for i in args[:-1]:
        i["stdout"] = subprocess.PIPE

    # Runs all subprocesses connecting stdins and stdouts to create the
    # pipeline. Closes stdouts to avoid deadlocks.
    popens = [subprocess.Popen(**args[0])]
    for i in range(1,len(args)):
        args[i]["stdin"] = popens[i-1].stdout
        popens.append(subprocess.Popen(**args[i]))
        popens[i-1].stdout.close()

    # Returns the array of subprocesses just created
    return popens

At this point, it's nice to write a function that waits for the whole pipeline to terminate and returns an array of result codes:

def pipe_wait(popens):
    '''
    Given an array of Popen objects returned by the
    pipe method, wait for all processes to terminate
    and return the array with their return values.
    '''
    results = [0] * len(popens)
    while popens:
        last = popens.pop(-1)
        results[len(popens)] = last.wait()
    return results

And, look and behold, we can now easily run a pipeline and get the return codes of every single process in it:

process1 = dict(args='sleep 1; grep line2 testfile', shell=True)
process2 = dict(args='awk \'{print $3}\'', shell=True)
process3 = dict(args='true', shell=True)
popens = pipe(process1, process2, process3)
result = pipe_wait(popens)
print result

Update: Colin Watson suggests an improvement to compensate for Python's nonstandard SIGPIPE handling.
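
I have not checked the exact patch, but the usual fix is along these lines (my sketch): Python installs a SIG_IGN handler for SIGPIPE at startup and children inherit it, so the default disposition needs to be restored in the child before exec:

import signal
import subprocess

def restore_sigpipe():
    # Run in the child between fork and exec, so that the child
    # dies of SIGPIPE as normal Unix processes do
    signal.signal(signal.SIGPIPE, signal.SIG_DFL)

# e.g. in pipe(): subprocess.Popen(preexec_fn=restore_sigpipe, **args[i])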

Colin Watson has a similar library for C.

Posted Wed Jul 1 09:08:06 2009 Tags:

Passing values to turbogears widgets at display time (the general case)

Last time I dug this up I was not clear enough in documenting my findings, so I had to find them again. Here is the second attempt.

In Turbogears, in order to pass parameters to arbitrary widgets in a compound widget, the syntax is:

form.display(PARAMNAME=dict(WIDGETNAME=VALUE))

And if you have more complex nested widgets and would like to know what goes on, this monkey patch is good for inspecting the params lookup functions:

import turbogears.widgets.forms
old_rpbp = turbogears.widgets.forms.retrieve_params_by_path
def inspect_rpbp(params, path):
    print "RPBP", repr(params), repr(path)
    res = old_rpbp(params, path)
    print "RPBP RES", res
    return res
turbogears.widgets.forms.retrieve_params_by_path = inspect_rpbp

The code for the lookup itself is, as the name suggests, in the retrieve_params_by_path function in the file widgets/forms.py in the Turbogears source code.

Posted Sat Jun 6 00:57:39 2009 Tags:

Python scoping

How do you create a list of similar functions in Python?

As a simple example, let's say we want to create an array of 10 elements like this:

a[0] = lambda x: x
a[1] = lambda x: x+1
a[2] = lambda x: x+2
...
a[9] = lambda x: x+9

Simple:

>>> a = []
>>> for i in range(0,10): a.append(lambda x: x+i)
...

...but wrong:

>>> a[0](1)
10

What happened here? In Python, that lambda x: x+i uses the value that i will have when the function is invoked.

This is the trick to get it right:

>>> a = []
>>> for i in range(0,10): a.append(lambda x, i=i: x + i)
...
>>> a[0](1)
1

What happens here is explained in the section "A Jedi Mind Trick" of the Instant Python article: i=i sets the default value of the parameter i to the current value of the loop variable i, and default values are computed once, when the function is defined.

Strangely enough, the same article has "A Note About Python 2.1 and Nested Scopes", which seems to imply that from Python 2.2 the scoping has changed to "work as it should". I don't understand: the examples above are run on Python 2.4.4. (Nested scopes only made the enclosing i visible inside the lambda; the lookup still happens when the lambda is called, so the loop example behaves the same.)

Googling for keywords related to python closure scoping only yields various sorts of complicated PEPs and an even uglier list trick:

a lot of people might not know about the trick of using a list to box variables within a closure.
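
For the record, the boxed-variable trick looks more or less like this (my reconstruction):

def make_counter():
    count = [0]             # a one-element list used as a mutable box
    def increment():
        # Mutating the box from the inner function works; rebinding
        # 'count' itself would just create a local variable instead
        count[0] += 1
        return count[0]
    return increment

c = make_counter()
print c(), c()              # prints: 1 2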

Now I know about the trick, but I wish I didn't need to know :-(

Posted Sat Jun 6 00:57:39 2009 Tags:

TurboGears RemoteForm tip

In case your RemoteForm mysteriously behaves like a normal HTTP form, refreshing the page on submit, and the only hint that there's something wrong is this bit in Iceweasel's error console:

Errore: uncaught exception: [Exception... "Component returned failure
code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIXMLHttpRequest.open]"
nsresult: "0x80070057 (NS_ERROR_ILLEGAL_VALUE)"  location: "JS frame ::
javascript: eval(__firebugTemp__); :: anonymous :: line 1"  data: no]

the problem may simply be a missing action= attribute on the form.

I found out after:

  1. reading the TurboGears remoteform wiki: "For some reason, the RemoteForm is acting like a regular html form, serving up a new page instead of performing the replacements we're looking for. I'll update this page as soon as I figure out why this is happening."

  2. finding this page on Google and meditating for a while while staring at it. I don't speak German, but often enough I manage to solve problems after meditating over Google results in all sorts of languages unknown or unreadable to me. I will call this practice Webomancy.

Posted Sat Jun 6 00:57:39 2009 Tags:

Passing values to turbogears widgets at display time

In turbogears, I often need to pass data to widgets at display time. Sometimes it works automatically, but sometimes, in cases like passing option lists to CheckBoxLists or number of repetitions in a RepeatingFieldSet, it doesn't.

All the examples use precomputed lists or simple callables. In most of my cases I want the values computed by the controller at every request.

Passing a function hasn't worked, as I did not find any obvious way to have the function know about the controller.

So I need to pass things to the display() method of the widgets, but I could not work out how to pass the option list and default list for a CheckBoxList that is part of a WidgetsList in a TableForm.

On IRC came the answer, thanks to Xentac:

you should be able to...
    tableform.display(options=dict(checkboxname=[optionlist]))

And yes, it works. I can pass the default value as one of the normal form values:

    tableform.display(values=dict(checkboxname=[values]), options=dict(checkboxname=[optionlist]))
Posted Sat Jun 6 00:57:39 2009 Tags:

File downloads with TurboGears

In TurboGears, I had to implement a file download method, but the file required access controls so it was put in a directory not exported by Apache.

In #turbogears I've been pointed at: http://cherrypy.org/wiki/FileDownload and this is everything put together:

from cherrypy.lib.cptools import serveFile
# In cherrypy 3 it should be:
#from cherrypy.lib.static import serve_file

@expose()
def get(self, *args, **kw):
    """Access the file pointed by the given path"""
    pathname = check_auth_and_compute_pathname()
    return serveFile(pathname)

Then I needed to export some CSV:

import csv
import StringIO
import cherrypy

@expose()
def getcsv(self, *args, **kw):
    """Get the data in CSV format"""
    rows = compute_data_rows()
    headers = compute_headers(rows)
    filename = compute_file_name()

    cherrypy.response.headers['Content-Type'] = "application/x-download"
    cherrypy.response.headers['Content-Disposition'] = 'attachment; filename="'+filename+'"'

    csvdata = StringIO.StringIO()
    writer = csv.writer(csvdata)
    writer.writerow(headers)
    writer.writerows(rows)

    return csvdata.getvalue()

In my case it's not an issue as I can only compute the headers after I computed all the data, but I still have to find out how to serve the CSV file while I'm generating it, instead of storing it all into a big string and returning the big string.
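
One possible direction (an untested sketch of mine; whether the bytes actually reach the client incrementally also depends on how CherryPy is configured to stream responses) is to build the CSV with a generator and return that instead of the big string:

import csv
import StringIO

def csv_chunks(headers, rows):
    '''Yield the CSV document piece by piece instead of as one big string.'''
    buf = StringIO.StringIO()
    writer = csv.writer(buf)
    writer.writerow(headers)
    yield buf.getvalue()
    for row in rows:
        buf.seek(0)
        buf.truncate()
        writer.writerow(row)
        yield buf.getvalue()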

Posted Sat Jun 6 00:57:39 2009 Tags:

Turbogears quirks when testing controllers that use SingleSelectField

Suppose you have a User that can be a member of a Company. In SQLObject you model it something like this:

    class Company(SQLObject):
        name = UnicodeCol(length=16, alternateID=True, alternateMethodName="by_name")
        display_name = UnicodeCol(length=255)

    class User(InheritableSQLObject):
        company = ForeignKey("Company", notNull=False, cascade='null')

Then you want to make a form that lets you choose a user's company:

def companies():
    return [ [ -1, 'None' ] ] + [ [c.id, c.display_name] for c in Company.select() ]

class NewUserFields(WidgetsList):
    """Fields for editing general settings"""
    user_name = TextField(label="User name")
    companyID = SingleSelectField(label="Company", options=companies)

Ok. Now you want to run tests:

  1. nosetests imports the controller to see if there's any initialisation code.
  2. The NewUserFields class is created.
  3. The SingleSelectField is created.
  4. The SingleSelectField constructor tries to guess the validator and peeks at the first option.
  5. This calls companies.
  6. companies accesses the database.
  7. The testing database has not yet been created, because nosetests imported the module before giving the test code a chance to set up the test database.
  8. Bang.

The solution is to add an explicit validator to disable this guessing code that is a source of so many troubles:

class NewUserFields(WidgetsList):
    """Fields for editing general settings"""
    user_name = TextField(label="User name")
    companyID = SingleSelectField(label="Company", options=companies, validator=v.Int(not_empty=True))
Posted Sat Jun 6 00:57:39 2009 Tags:

Turbogears form quirk

I had a great idea:

@validate(model_form)
@error_handler()
@expose(template='kid:myproject.templates.new')
def new(self, id, tg_errors=None, **kw):
    """Create new records in model"""
    if tg_errors:
        # Ask until there is still something missing
        return dict(record = defaults, form = model_form)
    else:
        # We have everything: save it
        i = Item(**kw)
        flash("Item was successfully created.")
        raise redirect("../show/%d" % i.id)

It was perfect: one simple method, simple error handling, nice helpful messages all around. Except, check boxes and select fields would not get the default values while all other fields would.

After two hours of searching, cursing and tracing things into widget code, I found this bit in InputWidget.adjust_value:

# there are some input fields that when nothing is checked/selected
# instead of sending a nice name="" are totally missing from
# input_values, this little workaround let's us manage them nicely
# without interfering with other types of fields, we need this to
# keep track of their empty status otherwise if the form is going to be
# redisplayed for some errors they end up to use their defaults values
# instead of being empty since FE doesn't validate a failing Schema.
# posterity note: this is also why we need if_missing=None in
# validators.Schema, see ticket #696.

So, what is happening here is that since check boxes and option fields don't have a nice behaviour when unselected, turbogears has to work around it. So in order to detect the difference between "I selected 'None'" and "I didn't select anything", it reasons that if the input has been validated, then the user has made some selections, so it defaults to "The user selected 'None'". If the input has not been validated, then we're showing the form for the first time, then a missing value means "Use the default provided".

Since I was doing the validation all the time, this meant that Checkboxes and Select fields would never use the default values.

Hence, if you use those fields then you necessarily need two different controller methods, one to present the form and one to save it:

@expose(template='kid:myproject.templates.new')
def new(self, id, **kw):
    """Create new records in model"""
    return dict(record = defaults(), form = model_form)

@validate(model_form)
@error_handler(new)
@expose()
def savenew(self, id, **kw):
    """Create new records in model"""
    i = Item(**kw)
    flash("Item was successfully created.")
    raise redirect("../show/%d"%i.id)

If someone else stumbles on the same problem, I hope they'll find this post and they won't have to spend another two awful hours tracking it down again.

Posted Sat Jun 6 00:57:39 2009 Tags:
Posted Sat Jun 6 00:57:39 2009

Pages about Ubuntu.

Live CD on a removable disk, the Debian way

In [live-cd-on-removable-disk] at some point I wrote:

Enrico's note: do we have anything in Debian that we can install and just does that?

Here are the answers:

Sven Mueller writes:

Well, Enrico, a tool I really grew fond of, which auto-configures X on Debian systems, is xdebconfigurator. It is not auto-run on each system start, which I consider a feature on normal systems, but for your proposed usage (i.e. a portable USB-storage based Debian system) it would certainly be the right thing.

Essentially, it never failed on me. Except for VMware virtual machines, where all it did wrong was that it proposed too high resolutions which resulted from my dual-screen Windows setup I ran VMware on. You might want to give it a try.

Tollef Fog Heen writes:

I added the support in casper for doing this almost a year ago and it has saved me lots of debugging time. Booting the live CD that way is almost as fast as booting an installed system. If you couple this with using the persistent storage support in casper, you can get the configure-on-boot support together with persistency.

In a later update, slh is quoted saying that xresprobe doesn't work on AMD64. This is wrong: I wrote that support, based on code by Matthew Garrett, a little more than nine months ago. I wouldn't recommend incorporating it in newly-written code though; rather, use libx86.

And finally, Marco Amadori writes:

Without needing to look for tools external to Debian, there is already the Debian Live software in sid: live-package, that creates a live system, and casper, that generates an initramfs that can configure a Debian system on the fly.

So far there is no hard disk target for live-package, but the "Iso" target can already do the job quite well. At boot time, Casper's initramfs scans all the block devices, so it works also for USB keys and hard drives.

To obtain a hard drive image, you just need to invoke "make-live" with the options to have the required software, then copy the content of the iso (or of the directory ./debian-live/binary) on a partition and install the boot loader.

This is what the future "HD" target of live-package will do; so far it can only build ISO and Netboot images.

Posted Sat Jun 6 00:57:39 2009 Tags:

Live CD on a removable disk

Eros is a hardware guru who happened to be the unknown guy sitting next to me on a plane.

He happens to be a happy Kubuntuer. While chatting, he told me one of his systems is an external hard drive made by copying a Kubuntu live CD image on it.

Why did you do so? I asked.

Because this way I can plug it in any computer, and it'll do hardware detection at boot. However it's a hard drive, so it's fast, and I can keep my home and all my customisations on it.

I had never thought of it.

That's an interesting and smart (ab)use of a live CD.

Now I wonder: what would be required to plug the live CD boot-time hardware detection infrastructure into an existing Debian or Ubuntu installation?

Update: slh on IRC suggests (a bit edited by me):

A lot of the former "obscure black magic" for live CDs isn't needed anymore. What is needed is: a kernel with static usb-storage, libusual, ehci-hcd, ohci-hcd, uhci-hcd (or an appropriate initrd/initramfs). udev takes care of most h/w detection issues these days.

As long as everything needed to boot is contained in a single partition you don't need an fstab: udev, hal and pmount take care of the rest; procfs, sysfs, devpts, usbfs and shm are mounted by sysvinit.

All what is left is a tool to create the xorg.conf while booting (those tools exist and just need to be called early).

Everything else is just a matter of convenience: extending the life span of the USB key by moving volatile data into tmpfs, etc.; if passwordless logins are required then xsession and inittab need to be changed; new ssh host keys generated on boot; small stuff.

With ordinary flash storage, jffs2 and something to reduce write access are a good idea: perhaps unionfs for /var/ and /home/, and bind mounting /tmp/ on /var/tmp/; but that's also not strictly necessary.

Mostly it boils down to running the xorg-creation script at every boot time.

There are various tools to do that. Some are here, but there are surely more. (Enrico's note: do we have anything in Debian that we can install and just does that?)

USB and PS/2 mice have shared the same device since kernel 2.6, so that part of xorg.conf doesn't strictly need to be detected; same for the keyboard (alps and synaptic touchpads can be easily detected), and X.org can use the screen's ddc info although it's not always reliable.

It can boil down to just detecting the video chipset: something like this, that uses PCI IDs from discover1-data.

It can also become a lot easier with X.org's own ddc detection, which almost boils down to configuring input devices and selecting the video driver. If I understand Daniel Stone correctly, X.org will soon improve its detection routines (fail safe X (auto-)configuration) as well in X.org 7.3.

xresprobe is in debian: it's pretty similar to ddcxinfo-kanotix, both forked off RedHat's kudzu package - and all fail miserably on amd64. That's why ddcxinfo has a fallback to 1024*768 @75 Hz which "always works (+manual overrides)".

Posted Sat Jun 6 00:57:39 2009 Tags:


Fixing problems after upgrade to Dapper

Laptop: Asus M3Ae

Problem: Can't mount root partition because of various ACPI errors. Breezy kernel works.

Solution:

  1. Boot with the old kernel.
  2. echo "libata noacpi=1" | sudo tee -a /etc/mkinitramfs/modules (note that sudo echo ... >> file would not work: the shell performs the redirection before sudo runs)
  3. sudo mv /boot/initrd.img-2.6.15-25-686 /boot/initrd.img-2.6.15-25-686.backup
  4. sudo mkinitramfs -o /boot/initrd.img-2.6.15-25-686 2.6.15-25-686

Thanks: Matthew Garrett

Posted Sat Jun 6 00:57:39 2009 Tags:
Posted Sat Jun 6 00:57:39 2009

Goofing around.


Zero-kilometre spelling

Is the international spelling alphabet too globalised, and do you want to get back in touch with the place where you were born and raised?

As of today there is a script for that: tell it where you live, and it builds a zero-kilometre spelling alphabet for you.

$ git clone git@gitorious.org:trespolo/osmspell.git
$ cd osmspell
$ ./osmspell "San Giorgio di Piano"
1: San Giorgio di Piano, BO, EMR, Italia
2: San Giorgio di Piano, Via Codronchi, San Giorgio di Piano, BO, EMR, Italia
3: San Giorgio Di Piano, Via Libertà, San Giorgio di Piano, BO, EMR, Italia
Choose one: 1
Center: 44.6465332, 11.3790398
A Argelato, Altedo
B Bentivoglio, Bologna, Boschi
C Cinquanta, Castagnolo Minore, Castel Maggiore, Cento
D Dosso
E Eremo di Tizzano
F Funo di Argelato, Finale Emilia, Ferrara, Fiesso
G Gherghenzano, Galliera, Gesso
I Il Cucco, Irnerio, Idice
L Località Fortuna, Lovoleto, Lippo
M Malacappa, Massumatico, Minerbio, Marano
N Navile
O Osteriola, Ozzano dell'Emilia, Oca
P Piombino, Padulle, Poggio Renatico, Piave
Q Quarto Inferiore, Quattrina
R Rubizzano, Renazzo, Riale
S San Giorgio di Piano, Saletto
T Torre Verde, Tintoria, Tombe
U Uccellino
V Venezzano Mascarino, Vigarano Mainarda, Veduro
X XII Morelli
Z Zenerigolo, Zola Predosa

The data comes from OSM, and the script is a good example of how to use its geolocation API (fast) and its geographic query API (slow).

Posted Sat Jan 4 00:38:16 2014 Tags:

Poem: "Washing machine"

I thought it was fleece,

but now it is felt.

Posted Tue Dec 3 22:32:23 2013 Tags:

Shops

Christmas songs should only ever be played on Christmas day.

In church.

At midnight.

Unless I happen to be there.

Posted Mon Dec 2 14:07:58 2013 Tags:

Airports

Photo of a commercial in London City airport saying 'In the lap of luxury - Want to reach a captive audience with dwell time? Why advertise anywhere else? - London City Airport Media Sales'

In the airport, we are not travellers. We are a captive audience with dwell time.

In other words, suckers stuck in a room where the only pastime provided is spending money and staring at advertisements selling advertisement space in rooms full of suckers like them.

Posted Fri Nov 22 18:58:00 2013 Tags:

Explanation of umarell

Umarell /uma'rɛl/ (oo-mah-rell), n; pl. Umarells. People in a community who offer all sorts of comments to those who are trying to get some work done, but who are not doing any work themselves.

Etymology and further details

Umarell is a word that entered Italian slang in Bologna and is spreading to nearby towns, occasionally even across Italy. It comes from the Bolognese for "cute/odd little man".

"Umarells" are those people, usually retired men, who spend time watching construction works, often holding their hands behind their back, occasionally commenting on what is going on, sometimes trying to tell the workers what to do.

It's easy to find examples on the internet; the word was popularised by a blog collecting photos, which has even been published into a book.

With some Italian Debian friends, we realised that umarell is the perfect word to describe those people in a community, who offer all sorts of comments to those who are trying to get some work done, but who are not doing any work themselves.

I think that it is a word that fits perfectly, and since I'm likely going to use it blissfully anywhere, here is a page that temporarily explains what it means until the Oxford English Dictionary picks it up.

Posted Fri Sep 20 13:27:07 2013 Tags:

Yet another Ubuntu anecdote

Some posts on planet reminded me of a little Canonical-related story of mine.

Many years ago I briefly contracted for Canonical. It was interesting and fun.

At the time I didn't have any experience of being temporarily hired by a foreign company, so I rang my labour union to get an appointment, to make sure with them that everything was all right.

The phone call went more or less like this:

Me:

Hello. I have received this contract for temporary employment by a foreign company and I wondered if I could book an appointment to come show it to you to see if it's all ok.

Their answer rather cut me short:

Hi. Be careful! People get temporary employment from obscure companies with the headquarters, like, in the Isle of Man, they do the job, the company disappears and they never get paid. There's bad stuff out there!

I looked at the contract, the heading said something like "Canonical ltd, Douglas, Isle of Man".

I was certain that the union people would never have understood what was going on. I politely thanked them for their time and hung up. However, to this day I still regret that I didn't insist:

Uh, yes, the company is indeed in the Isle of Man. But what if I told you that it's owned by an astronaut?

I just signed the contract and had a good time.

Posted Sat Jan 15 10:35:36 2011 Tags:

Mailman defaults

Monopoly Chance: It's the first of the month. / You're flooded with mailman junk / Skip a turn.

Posted Fri Oct 1 12:03:53 2010 Tags:

My rule to see if a framework is worthy of attention

I came up with a little rule:

In order to be worthy of any attention, a framework must be stable enough that I can charge money to train people to use it.

This probably applies to other kinds of software stacks, libraries, development environments and, well, to most software applications.

In the context of python web frameworks, this means that:

  • If it changes API all the time it is not worthy of attention, because my customers won't get value for their money, as they'd continuously need retraining and rewriting their software.
  • If I see lots of DeprecationWarnings it is not worthy of attention, because my customers will see them and blame me for teaching them deprecated stuff.
  • If fixes for bugs affecting the stable version are only distributed "in a recent git" or "in the next development version", and they are not backported into a new bugfix-only stable release, then it is not worthy of attention, because:
    • my customers' business is to develop their own products based on the framework.
    • My customers' business is not to be maintaining in-house stable updates of the framework. Although if the framework's community is nice enough they might end up giving a hand.
  • If it requires virtualenv or can only be obtained through easy_install it is not worthy of attention, because:
    • my customers are not interested in maintaining custom deployment environments over time.
    • My customers are not interested in tracking each and every single library's upstream development to keep their production system free of bugs.
    • My customers are used to getting software through a proper distribution which also takes care of security updates.
    • I am paid to teach them how to use a framework, not a custom python-only package management system.
    • In my experience, if distributions have trouble keeping packages up to date, upstream is doing something fundamentally wrong.

In light of this rule, I regret to notice that very few Python web frameworks are worthy of any attention.

Posted Wed Aug 4 15:32:24 2010 Tags:

On python stable APIs

There is a theory which states that if ever anyone discovers exactly what the Universe is for and why it is here, it will instantly disappear and be replaced by something even more bizarre and inexplicable.

There is another theory which states that this has already happened.

In Debian testing:

/usr/lib/python2.6/dist-packages/sqlalchemy/types.py:547: SADeprecationWarning: The Binary type has been renamed to LargeBinary.

In Debian Lenny:

ImportError: cannot import name LargeBinary

I was starting to think that SQLAlchemy wasn't too bad: I had been using it for 6 months and hadn't seen its API change yet.

But there it is, a beautiful reminder that SQLAlchemy, too, is part of the marvelously autistic Python ecosystem.
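
In the meantime, if your code needs to run on both versions, a small compatibility shim along these lines should do (a sketch, untested against every release):

try:
    # SQLAlchemy 0.6 and later
    from sqlalchemy import LargeBinary
except ImportError:
    # Older releases only have the Binary name
    from sqlalchemy import Binary as LargeBinary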

Posted Mon Jul 19 16:14:25 2010 Tags:
Posted Sat Jun 6 00:57:39 2009

Posts containing useful tips.

Resolving IP addresses in vim

A friend on IRC said: "I wish vim had a command to resolve all the IP addresses in a block of text".

But it does:

:<block>!perl -MSocket -pe 's/(\d+\.\d+\.\d+\.\d+)/gethostbyaddr(inet_aton($1), AF_INET)/ge'

If you use it often, put the perl command in a one-liner script and call that from an editor macro. It works in other editors too, and even without an editor at all. And it can be scripted!
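
If you prefer, the same filter can be a tiny Python script (a sketch; resolveips is a made-up name), usable from vim as :<block>!resolveips:

#!/usr/bin/python

# resolveips - replace IPv4 addresses on stdin with their reverse DNS names
import re
import socket
import sys

def resolve(mo):
    "Resolve one IP address, leaving it alone if resolution fails"
    ip = mo.group(0)
    try:
        return socket.gethostbyaddr(ip)[0]
    except socket.error:
        return ip

for line in sys.stdin:
    sys.stdout.write(re.sub(r"\d+\.\d+\.\d+\.\d+", resolve, line))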

We live with the power of Unix every day, so much that we risk forgetting how awesome it is.

Posted Wed Mar 7 14:07:07 2012 Tags:

SQLAlchemy, MySQL and sql_mode=traditional

As everyone should know, by default MySQL is an embarrassingly stupid toy:

mysql> create table foo (val integer not null);
Query OK, 0 rows affected (0.03 sec)

mysql> insert into foo values (1/0);
ERROR 1048 (23000): Column 'val' cannot be null

mysql> insert into foo values (1);
Query OK, 1 row affected (0.00 sec)

mysql> update foo set val=1/0 where val=1;
Query OK, 1 row affected, 1 warning (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 1

mysql> select * from foo;
+-----+
| val |
+-----+
|   0 |
+-----+
1 row in set (0.00 sec)

Luckily, you can tell it to stop being embarrassingly stupid:

mysql> set sql_mode="traditional";
Query OK, 0 rows affected (0.00 sec)

mysql> update foo set val=1/0 where val=0;
ERROR 1365 (22012): Division by 0

(There is an even better sql mode you can choose, though: it is called "Install PostgreSQL")

Unfortunately, I've been hired to work on a project that relies on the embarrassingly stupid behaviour of MySQL, so I cannot set sql_mode=traditional globally or the existing house of cards will collapse.

Here is how you set it session-wide with SQLAlchemy 0.6.x; it took me quite a while to find out:

import sqlalchemy.interfaces
from sqlalchemy import create_engine

# Without this, MySQL will silently insert invalid values in the
# database, causing very long debugging sessions in the long run
class DontBeSilly(sqlalchemy.interfaces.PoolListener):
    def connect(self, dbapi_con, connection_record):
        # Runs on every new DB-API connection taken by the pool
        cur = dbapi_con.cursor()
        cur.execute("SET SESSION sql_mode='TRADITIONAL'")
        cur.close()

engine = create_engine(..., listeners=[DontBeSilly()])
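
For what it's worth, SQLAlchemy 0.7 and later deprecated PoolListener in favour of the events system; if I read its documentation correctly, the equivalent would be something like:

import sqlalchemy.event

@sqlalchemy.event.listens_for(engine, "connect")
def dont_be_silly(dbapi_con, connection_record):
    cur = dbapi_con.cursor()
    cur.execute("SET SESSION sql_mode='TRADITIONAL'")
    cur.close()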

Why it takes all that effort is beyond me. I'd have expected this to be turned on by default, possibly with a switch that insane people could use to turn it off.

Posted Mon Feb 27 19:45:58 2012 Tags:

Those friendly spammers at Aruba

Aruba decided, out of the blue, to subscribe me to all their newsletters.

The newsletters have no unsubscribe link. Or rather, maybe they do, but it only shows up if you decode the mail with programs I have no intention of using. Unsubscribe link aside, why should I have to unsubscribe from newsletters I never subscribed to in the first place?

I sent this mail to abuse@staff.aruba.it, plus 3 more reports after it, all of which were of course ignored:

Good morning,

I am reporting this spam sent by you yourselves (attached is the mail
with its headers intact).

Could you please proceed with disciplinary measures against yourselves?
Your behaviour on the internet violates the most basic rules of
netiquette, and it is in your interest, as a provider, to educate
yourselves about them and to enforce them.


Kind regards,

Enrico

What can I say: a third-world country deserves third-world ISPs.

Still, it's an excellent excuse to study postfix's header_checks: now the mails from Aruba's newsletters, which by the way are 300KB whoppers each, meet a REJECT right in the SMTP session:

550 5.7.1 Criminal third-world ISP spammers not accepted here.

To do that, I added to /etc/postfix/main.cf:

# Reject aruba spam right away
header_checks = pcre:/etc/postfix/known_idiots.pcre

Then I created the file /etc/postfix/known_idiots.pcre:

/^Received:.+smtpnewsletter[0-9]+\.aruba\.it/ REJECT Criminal third-world ISP spammers not accepted here.
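
You can test the rule without waiting for more spam: postmap can query pcre tables directly (the Received header below is made up):

$ postmap -q "Received: from smtpnewsletter1.aruba.it by mail.example.org" pcre:/etc/postfix/known_idiots.pcre
REJECT Criminal third-world ISP spammers not accepted here.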

In the meantime I've sent an email to the Garante Privacy and one to AGCOM, more out of curiosity than anything else. I don't expect any reply, but if anything happens I'll gladly add it here.

Posted Fri Oct 14 16:14:54 2011 Tags:

Award winning code

Yuwei and I had a fun day at hhhmcr (#hhhmcr) and even managed to put together a prototype that won first prize \o/

We played with the gmp24 dataset, kindly extracted from Twitter into a convenient JSON dataset by Michael Brunton-Spall of the Guardian. The idea was to find ways of making it easier to look at the data and to make sense of it.

This is the story of what we did, including the code we wrote.

The original dataset has several JSON files, so the first task was to put them all together:

#!/usr/bin/python

# Merge the JSON data
# (C) 2010 Enrico Zini <enrico@enricozini.org>
# License: WTFPL version 2 (http://sam.zoy.org/wtfpl/)

import simplejson
import os

res = []
for f in os.listdir("."):
    if not f.startswith("gmp24"): continue
    data = open(f).read().strip()
    if data == "[]": continue
    parsed = simplejson.loads(data)
    res.extend(parsed)

print simplejson.dumps(res)

The results, however, were not ordered by date, as GMP had to use several accounts to tweet, because Twitter kept putting Greater Manchester Police in jail for generating too much traffic. There would be quite a bit to write about that, but let's stick to our work.

Here is code to sort the JSON data by time:

#!/usr/bin/python

# Sort the JSON data
# (C) 2010 Enrico Zini <enrico@enricozini.org>
# License: WTFPL version 2 (http://sam.zoy.org/wtfpl/)

import simplejson
import sys
import datetime as dt

all_recs = simplejson.load(sys.stdin)
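# Note: %a and %b in strptime are locale-dependent; Twitter's created_at
# dates use English names, so this assumes a C/English locale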
all_recs.sort(key=lambda x: dt.datetime.strptime(x["created_at"], "%a %b %d %H:%M:%S +0000 %Y"))

simplejson.dump(all_recs, sys.stdout)

I then wanted to play with Tf-idf for extracting the most important words of every tweet:

#!/usr/bin/python

# tfidf - Annotate JSON elements with Tf-idf extracted keywords
#
# Copyright (C) 2010  Enrico Zini <enrico@enricozini.org>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.

import sys, math
import simplejson
import re

# Read all the tweets
records = simplejson.load(sys.stdin)

# All the tweets by ID
byid = dict(((x["id"], x) for x in records))

# Stopwords we ignore
stopwords = set(["by", "it", "and", "of", "in", "a", "to"])

# Tokenising engine
re_num = re.compile(r"^\d+$")
re_word = re.compile(r"(\w+)")
def tokenise(tweet):
    "Extract tokens from a tweet"
    for tok in tweet["text"].split():
        tok = tok.strip().lower()
        if re_num.match(tok): continue
        mo = re_word.match(tok)
        if not mo: continue
        if mo.group(1) in stopwords: continue
        yield mo.group(1)

# Extract tokens from tweets
tokenised = dict(((x["id"], list(tokenise(x))) for x in records))

# Aggregate token counts
aggregated = {}
for d in byid.iterkeys():
    for t in tokenised[d]:
        if t in aggregated:
            aggregated[t] += 1
        else:
            aggregated[t] = 1

def tfidf(doc, tok):
    "Compute TFIDF score of a token in a document"
    return doc.count(tok) * math.log(float(len(byid)) / aggregated[tok])

# Annotate tweets with keywords
res = []
for name, tweet in byid.iteritems():
    doc = tokenised[name]
    keywords = sorted(set(doc), key=lambda tok: tfidf(doc, tok), reverse=True)[:5]
    tweet["keywords"] = keywords
    res.append(tweet)

simplejson.dump(res, sys.stdout)

I thought this was producing a nice summary of every tweet, but nobody was particularly interested, so we moved on to adding categories to tweets.

Thanks to Yuwei, who put together some useful keyword sets, we managed to annotate each tweet with a place name (e.g. "Stockport"), a social place name (e.g. "pub", "bank") and a social category (e.g. "man", "woman", "landlord"...)

The code is simple; most of the work went into the dictionary of keywords:

#!/usr/bin/python

# categorise - Annotate JSON elements with categories
#
# Copyright (C) 2010  Enrico Zini <enrico@enricozini.org>
# Copyright (C) 2010  Yuwei Lin <yuwei@ylin.org>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.

import sys, math
import simplejson
import re

# Electoral wards from http://en.wikipedia.org/wiki/List_of_electoral_wards_in_Greater_Manchester
placenames = ["Altrincham", "Sale West",
"Altrincham", "Ashton upon Mersey", "Bowdon", "Broadheath", "Hale Barns", "Hale Central", "St Mary", "Timperley", "Village",
"Ashton-under-Lyne",
"Ashton Hurst", "Ashton St Michael", "Ashton Waterloo", "Droylsden East", "Droylsden West", "Failsworth East", "Failsworth West", "St Peter",
"Blackley", "Broughton",
"Broughton", "Charlestown", "Cheetham", "Crumpsall", "Harpurhey", "Higher Blackley", "Kersal",
"Bolton North East",
"Astley Bridge", "Bradshaw", "Breightmet", "Bromley Cross", "Crompton", "Halliwell", "Tonge with the Haulgh",
"Bolton South East",
"Farnworth", "Great Lever", "Harper Green", "Hulton", "Kearsley", "Little Lever", "Darcy Lever", "Rumworth",
"Bolton West",
"Atherton", "Heaton", "Lostock", "Horwich", "Blackrod", "Horwich North East", "Smithills", "Westhoughton North", "Chew Moor", "Westhoughton South",
"Bury North",
"Church", "East", "Elton", "Moorside", "North Manor", "Ramsbottom", "Redvales", "Tottington",
"Bury South",
"Besses", "Holyrood", "Pilkington Park", "Radcliffe East", "Radcliffe North", "Radcliffe West", "St Mary", "Sedgley", "Unsworth",
"Cheadle",
"Bramhall North", "Bramhall South", "Cheadle", "Gatley", "Cheadle Hulme North", "Cheadle Hulme South", "Heald Green", "Stepping Hill",
"Denton", "Reddish",
"Audenshaw", "Denton North East", "Denton South", "Denton West", "Dukinfield", "Reddish North", "Reddish South",
"Hazel Grove",
"Bredbury", "Woodley", "Bredbury Green", "Romiley", "Hazel Grove", "Marple North", "Marple South", "Offerton",
"Heywood", "Middleton",
"Bamford", "Castleton", "East Middleton", "Hopwood Hall", "Norden", "North Heywood", "North Middleton", "South Middleton", "West Heywood", "West Middleton",
"Leigh",
"Astley Mosley Common", "Atherleigh", "Golborne", "Lowton West", "Leigh East", "Leigh South", "Leigh West", "Lowton East", "Tyldesley",
"Makerfield",
"Abram", "Ashton", "Bryn", "Hindley", "Hindley Green", "Orrell", "Winstanley", "Worsley Mesnes",
"Manchester Central",
"Ancoats", "Clayton", "Ardwick", "Bradford", "City Centre", "Hulme", "Miles Platting", "Newton Heath", "Moss Side", "Moston",
"Manchester", "Gorton",
"Fallowfield", "Gorton North", "Gorton South", "Levenshulme", "Longsight", "Rusholme", "Whalley Range",
"Manchester", "Withington",
"Burnage", "Chorlton", "Chorlton Park", "Didsbury East", "Didsbury West", "Old Moat", "Withington",
"Oldham East", "Saddleworth",
"Alexandra", "Crompton", "Saddleworth North", "Saddleworth South", "Saddleworth West", "Lees", "St James", "St Mary", "Shaw", "Waterhead",
"Oldham West", "Royton",
"Chadderton Central", "Chadderton North", "Chadderton South", "Coldhurst", "Hollinwood", "Medlock Vale", "Royton North", "Royton South", "Werneth",
"Rochdale",
"Balderstone", "Kirkholt", "Central Rochdale", "Healey", "Kingsway", "Littleborough Lakeside", "Milkstone", "Deeplish", "Milnrow", "Newhey", "Smallbridge", "Firgrove", "Spotland", "Falinge", "Wardle", "West Littleborough",
"Salford", "Eccles",
"Claremont", "Eccles", "Irwell Riverside", "Langworthy", "Ordsall", "Pendlebury", "Swinton North", "Swinton South", "Weaste", "Seedley",
"Stalybridge", "Hyde",
"Dukinfield Stalybridge", "Hyde Godley", "Hyde Newton", "Hyde Werneth", "Longdendale", "Mossley", "Stalybridge North", "Stalybridge South",
"Stockport",
"Brinnington", "Central", "Davenport", "Cale Green", "Edgeley", "Cheadle Heath", "Heatons North", "Heatons South", "Manor",
"Stretford", "Urmston",
"Bucklow-St Martins", "Clifford", "Davyhulme East", "Davyhulme West", "Flixton", "Gorse Hill", "Longford", "Stretford", "Urmston",
"Wigan",
"Aspull New Springs Whelley", "Douglas", "Ince", "Pemberton", "Shevington with Lower Ground", "Standish with Langtree", "Wigan Central", "Wigan West",
"Worsley", "Eccles South",
"Barton", "Boothstown", "Ellenbrook", "Cadishead", "Irlam", "Little Hulton", "Walkden North", "Walkden South", "Winton", "Worsley",
"Wythenshawe", "Sale East",
"Baguley", "Brooklands", "Northenden", "Priory", "Sale Moor", "Sharston", "Woodhouse Park"]

# Manual coding from Yuwei
placenames.extend(["City centre", "Tameside", "Oldham", "Bury", "Bolton",
"Trafford", "Pendleton", "New Moston", "Denton", "Eccles", "Leigh", "Benchill",
"Prestwich", "Sale", "Kearsley", ])
placenames.extend(["Trafford", "Bolton", "Stockport", "Levenshulme", "Gorton",
"Tameside", "Blackley", "City centre", "Airport", "South Manchester",
"Rochdale", "Chorlton", "Uppermill", "Castleton", "Stalybridge", "Ashton",
"Chadderton", "Bury", "Ancoats", "Whalley Range", "West Yorkshire",
"Fallowfield", "New Moston", "Denton", "Stretford", "Eccles", "Pendleton",
"Leigh", "Altrincham", "Sale", "Prestwich", "Kearsley", "Hulme", "Withington",
"Moss Side", "Milnrow", "outskirt of Manchester City Centre", "Newton Heath",
"Wythenshawe", "Mancunian Way", "M60", "A6", "Droylesden", "M56", "Timperley",
"Higher Ince", "Clayton", "Higher Blackley", "Lowton", "Droylsden",
"Partington", "Cheetham Hill", "Benchill", "Longsight", "Didsbury",
"Westhoughton"])


# Social categories from Yuwei
soccat = ["man", "woman", "men", "women", "youth", "teenager", "elderly",
"patient", "taxi driver", "neighbour", "male", "tenant", "landlord", "child",
"children", "immigrant", "female", "workmen", "boy", "girl", "foster parents",
"next of kin"]
for i in range(100):
    soccat.append("%d-year-old" % i)
    soccat.append("%d-years-old" % i)

# Types of social locations from Yuwei
socloc = ["car park", "park", "pub", "club", "shop", "premises", "bus stop",
"property", "credit card", "supermarket", "garden", "phone box", "theatre",
"toilet", "building site", "Crown court", "hard shoulder", "telephone kiosk",
"hotel", "restaurant", "cafe", "petrol station", "bank", "school",
"university"]


extras = { "placename": placenames, "soccat": soccat, "socloc": socloc }

# Normalise keyword lists
for k, v in extras.iteritems():
    # Remove duplicates
    v = list(set(v))
    # Sort by length, so longer, more specific keywords are tried first
    v.sort(key=lambda x: len(x), reverse=True)
    # Store the result back: rebinding v alone would leave extras unchanged
    extras[k] = v

# Add keywords
def add_categories(tweet):
    text = tweet["text"].lower()
    for field, categories in extras.iteritems():
        for cat in categories:
            if cat.lower() in text:
                tweet[field] = cat
                break
    return tweet

# Read all the tweets
records = (add_categories(x) for x in simplejson.load(sys.stdin))

simplejson.dump(list(records), sys.stdout)

All these scripts form a nice processing chain: each script takes a list of JSON records, adds some bits and passes them on.

In order to see what we have so far, here is a simple script to convert the JSON tweets to CSV so they can be viewed in a spreadsheet:

#!/usr/bin/python

# Convert the JSON tweets to CSV
# (C) 2010 Enrico Zini <enrico@enricozini.org>
# License: WTFPL version 2 (http://sam.zoy.org/wtfpl/)

import simplejson
import sys
import csv

rows = ["id", "created_at", "text", "keywords", "placename"]

writer = csv.writer(sys.stdout)
for rec in simplejson.load(sys.stdin):
    rec["keywords"] = " ".join(rec["keywords"])
    rec["placename"] = rec.get("placename", "")
    writer.writerow([rec[row] for row in rows])

At this point we were coming up with lots of questions: "were there more reports on women or men?", "which place had most incidents?", "what were the incidents involving animals?"... Time to bring Xapian into play.

This script reads all the JSON tweets and builds a Xapian index with them:

#!/usr/bin/python

# toxapian - Index JSON tweets in Xapian
#
# Copyright (C) 2010  Enrico Zini <enrico@enricozini.org>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.

import simplejson
import sys
import os, os.path
import xapian

DBNAME = sys.argv[1]

db = xapian.WritableDatabase(DBNAME, xapian.DB_CREATE_OR_OPEN)

stemmer = xapian.Stem("english")
indexer = xapian.TermGenerator()
indexer.set_stemmer(stemmer)
indexer.set_database(db)

data = simplejson.load(sys.stdin)
for rec in data:
    doc = xapian.Document()
    doc.set_data(str(rec["id"]))

    indexer.set_document(doc)
    indexer.index_text_without_positions(rec["text"])

    # Index categories as categories
    if "placename" in rec:
        doc.add_boolean_term("XP" + rec["placename"].lower())
    if "soccat" in rec:
        doc.add_boolean_term("XS" + rec["soccat"].lower())
    if "socloc" in rec:
        doc.add_boolean_term("XL" + rec["socloc"].lower())

    db.add_document(doc)

db.flush()

# Also save the whole dataset so we know where to find it later if we want to
# show the details of an entry
simplejson.dump(data, open(os.path.join(DBNAME, "all.json"), "w"))
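
Building the index is then something like this (the file names are just examples; the input is the JSON coming out of the pipeline above):

$ ./toxapian gmp24.db < gmp24.json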

And this is a simple command line tool to query the database:

#!/usr/bin/python

# xgrep - Command line tool to query the GMP24 tweet Xapian database
#
# Copyright (C) 2010  Enrico Zini <enrico@enricozini.org>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.

import simplejson
import sys
import os, os.path
import xapian

DBNAME = sys.argv[1]

db = xapian.Database(DBNAME)

stem = xapian.Stem("english")

qp = xapian.QueryParser()
qp.set_default_op(xapian.Query.OP_AND)
qp.set_database(db)
qp.set_stemmer(stem)
qp.set_stemming_strategy(xapian.QueryParser.STEM_SOME)
qp.add_boolean_prefix("place", "XP")
qp.add_boolean_prefix("soc", "XS")
qp.add_boolean_prefix("loc", "XL")

query = qp.parse_query(sys.argv[2],
    xapian.QueryParser.FLAG_BOOLEAN |
    xapian.QueryParser.FLAG_LOVEHATE |
    xapian.QueryParser.FLAG_BOOLEAN_ANY_CASE |
    xapian.QueryParser.FLAG_WILDCARD |
    xapian.QueryParser.FLAG_PURE_NOT |
    xapian.QueryParser.FLAG_SPELLING_CORRECTION |
    xapian.QueryParser.FLAG_AUTO_SYNONYMS)

enquire = xapian.Enquire(db)
enquire.set_query(query)

count = 40
matches = enquire.get_mset(0, count)
estimated = matches.get_matches_estimated()
print "%d/%d results" % (matches.size(), estimated)

data = dict((str(x["id"]), x) for x in simplejson.load(open(os.path.join(DBNAME, "all.json"))))

for m in matches:
    rec = data[m.document.get_data()]
    print rec["text"]

print "%d/%d results" % (matches.size(), matches.get_matches_estimated())

total = db.get_doccount()
estimated = matches.get_matches_estimated()
print "%d results over %d documents, %d%%" % (estimated, total, estimated * 100 / total)

Neat! Now that we have a proper index that supports all sorts of cool things (stemming, tag clouds, full text search with complex queries, lookup of similar documents, keyword suggestions and so on), it was only fair to put together a web service to share it with other people at the event.

It helped that I had already written similar code for apt-xapian-index and dde before.

Here is the server, quickly built on bottle. The very last line starts the server and it is where you can configure the listening interface and port.

#!/usr/bin/python

# xserve - Make the GMP24 tweet Xapian database available on the web
#
# Copyright (C) 2010  Enrico Zini <enrico@enricozini.org>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.

import bottle
from bottle import route, post
from cStringIO import StringIO
import cPickle as pickle
import simplejson
import sys
import os, os.path
import xapian
import urllib
import math

bottle.debug(True)

DBNAME = sys.argv[1]
QUERYLOG = os.path.join(DBNAME, "queries.txt")

data = dict((str(x["id"]), x) for x in simplejson.load(open(os.path.join(DBNAME, "all.json"))))

prefixes = { "place": "XP", "soc": "XS", "loc": "XL" }
prefix_desc = { "place": "Place name", "soc": "Social category", "loc": "Social location" }

db = xapian.Database(DBNAME)

stem = xapian.Stem("english")

qp = xapian.QueryParser()
qp.set_default_op(xapian.Query.OP_AND)
qp.set_database(db)
qp.set_stemmer(stem)
qp.set_stemming_strategy(xapian.QueryParser.STEM_SOME)
for k, v in prefixes.iteritems():
    qp.add_boolean_prefix(k, v)

def make_query(qstring):
    return qp.parse_query(qstring,
        xapian.QueryParser.FLAG_BOOLEAN |
        xapian.QueryParser.FLAG_LOVEHATE |
        xapian.QueryParser.FLAG_BOOLEAN_ANY_CASE |
        xapian.QueryParser.FLAG_WILDCARD |
        xapian.QueryParser.FLAG_PURE_NOT |
        xapian.QueryParser.FLAG_SPELLING_CORRECTION |
        xapian.QueryParser.FLAG_AUTO_SYNONYMS)


@route("/")
def index():
    query = urllib.unquote_plus(bottle.request.GET.get("q", ""))

    out = StringIO()
    print >>out, '''
<html>
<head>
<title>Query</title>
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js"></script>
<script type="text/javascript">
$(function(){
    $("#queryfield")[0].focus()
})
</script>
</head>
<body>
<h1>Search</h1>
<form method="POST" action="/query">
Keywords: <input type="text" name="query" value="%s" id="queryfield">
<input type="submit">
<a href="http://xapian.org/docs/queryparser.html">Help</a>
</form>''' % query

    print >>out, '''
<p>Example: "car place:wigan"</p>

<p>Available prefixes:</p>

<ul>
'''
    for pfx in prefixes.keys():
        print >>out, "<li><a href='/catinfo/%s'>%s - %s</a></li>" % (pfx, pfx, prefix_desc[pfx])
    print >>out, '''
</ul>
'''

    oldqueries = []
    if os.path.exists(QUERYLOG):
        total = db.get_doccount()
        fd = open(QUERYLOG, "r")
        while True:
            try:
                q = pickle.load(fd)
            except EOFError:
                break
            oldqueries.append(q)
        fd.close()

        def print_query(q):
            count = q["count"]
            print >>out, "<li><a href='/query?query=%s'>%s (%d/%d %.2f%%)</a></li>" % (urllib.quote_plus(q["q"]), q["q"], count, total, count * 100.0 / total)

        print >>out, "<p>Last 10 queries:</p><ul>"
        for q in oldqueries[:-11:-1]:
            print_query(q)
        print >>out, "</ul>"

        # Remove duplicates
        oldqueries = dict(((x["q"], x) for x in oldqueries)).values()

        print >>out, "<table>"
        print >>out, "<tr><th>10 queries with most results</th><th>10 queries with least results</th></tr>"
        print >>out, "<tr><td>"

        print >>out, "<ul>"
        oldqueries.sort(key=lambda x:x["count"], reverse=True)
        for q in oldqueries[:10]:
            print_query(q)
        print >>out, "</ul>"

        print >>out, "</td><td>"

        print >>out, "<ul>"
        nonempty = [x for x in oldqueries if x["count"] > 0]
        nonempty.sort(key=lambda x:x["count"])
        for q in nonempty[:10]:
            print_query(q)
        print >>out, "</ul>"

        print >>out, "</td></tr>"
        print >>out, "</table>"

    print >>out, '''
</body>
</html>'''
    return out.getvalue()

@route("/query")
@route("/query/")
@post("/query")
@post("/query/")
def query():
    query = bottle.request.POST.get("query", bottle.request.GET.get("query", ""))
    enquire = xapian.Enquire(db)
    enquire.set_query(make_query(query))

    count = 40
    matches = enquire.get_mset(0, count)
    estimated = matches.get_matches_estimated()
    total = db.get_doccount()

    out = StringIO()
    print >>out, '''
<html>
<head><title>Results</title></head>
<body>
<h1>Results for "<b>%s</b>"</h1>
''' % query

    if estimated == 0:
        print >>out, "No results found."
    else:
        # Give as results the first 30 documents; also use them as the key
        # ones to use to compute relevant terms
        rset = xapian.RSet()
        for m in enquire.get_mset(0, 30):
            rset.add_document(m.document.get_docid())

        # Compute the tag cloud
        class NonTagFilter(xapian.ExpandDecider):
            def __call__(self, term):
                return not term[0].isupper() and not term[0].isdigit()
        cloud = []
        maxscore = None
        for res in enquire.get_eset(40, rset, NonTagFilter()):
            # Normalise the score in the interval [0, 1]
            weight = math.log(res.weight)
            if maxscore is None: maxscore = weight
            tag = res.term
            cloud.append([tag, float(weight) / maxscore])
        max_weight = cloud[0][1]
        min_weight = cloud[-1][1]
        cloud.sort(key=lambda x:x[0])

        def mklink(query, term):
            return "/query?query=%s" % urllib.quote_plus(query + " and " + term)
        print >>out, "<h2>Tag cloud</h2>"
        print >>out, "<blockquote>"
        for term, weight in cloud:
            size = 100 + 100.0 * (weight - min_weight) / (max_weight - min_weight)
            print >>out, "<a href='%s' style='font-size:%d%%; color:brown;'>%s</a>" % (mklink(query, term), size, term)
        print >>out, "</blockquote>"

        print >>out, "<h2>Results</h2>"
        print >>out, "<p><a href='/'>Search again</a></p>"

        print >>out, "<p>%d results over %d documents, %.2f%%</p>" % (estimated, total, estimated * 100.0 / total)
        print >>out, "<p>%d/%d results</p>" % (matches.size(), estimated)

        print >>out, "<ul>"
        for m in matches:
            rec = data[m.document.get_data()]
            print >>out, "<li><a href='/item/%s'>%s</a></li>" % (rec["id"], rec["text"])
        print >>out, "</ul>"

        fd = open(QUERYLOG, "a")
        qinfo = dict(q=query, count=estimated)
        pickle.dump(qinfo, fd)
        fd.close()

    print >>out, '''
<a href="/">Search again</a>

</body>
</html>'''
    return out.getvalue()

@route("/item/:id")
@route("/item/:id/")
def show(id):
    rec = data[id]

    out = StringIO()
    print >>out, '''
<html>
<head><title>Result %s</title></head>
<body>
<h1>Raw JSON record for tweet %s</h1>
<pre>''' % (rec["id"], rec["id"])

    print >>out, simplejson.dumps(rec, indent=" ")

    print >>out, '''
</pre>
</body>
</html>'''
    return out.getvalue()

@route("/catinfo/:name")
@route("/catinfo/:name/")
def catinfo(name):
    prefix = prefixes[name]
    out = StringIO()
    print >>out, '''
<html>
<head><title>Values for %s</title></head>
<body>
''' % name

    terms = [(x.term[len(prefix):], db.get_termfreq(x.term)) for x in db.allterms(prefix)]
    terms.sort(key=lambda x:x[1], reverse=True)
    # terms is sorted by decreasing frequency: the first entry is the max
    freq_max = terms[0][1]
    freq_min = terms[-1][1]

    def mklink(name, term):
        return "/query?query=%s" % urllib.quote_plus(name + ":" + term)

    # Build tag cloud
    print >>out, "<h1>Tag cloud</h1>"
    print >>out, "<blockquote>"
    for term, freq in sorted(terms[:20], key=lambda x:x[0]):
        size = 100 + 100.0 * (freq - freq_min) / (freq_max - freq_min)
        print >>out, "<a href='%s' style='font-size:%d%%; color:brown;'>%s</a>" % (mklink(name, term), size, term)
    print >>out, "</blockquote>"

    print >>out, "<h1>All terms</h1>"
    print >>out, "<table>"
    print >>out, "<tr><th>Occurrences</th><th>Name</th></tr>"
    for term, freq in terms:
        print >>out, "<tr><td>%d</td><td><a href='/query?query=%s'>%s</a></td></tr>" % (freq, urllib.quote_plus(name + ":" + term), term)
    print >>out, "</table>"

    print >>out, '''
</body>
</html>'''
    return out.getvalue()

# Change here for bind host and port
bottle.run(host="0.0.0.0", port=8024)

...and then we presented our work and ended up winning the contest.

This was the story of how we wrote this set of award-winning code.

Posted Sat Oct 16 01:36:08 2010 Tags:

Computing time offsets between EXIF and GPS

I like the idea of matching photos to GPS traces. In Debian there is gpscorrelate, but it's almost unusable for me because of bug #473362, and it has an awkward way of specifying time offsets.

Here at SoTM10 someone told me that exiftool gained -geosync and -geotag options. So it's just a matter of creating a little tool that shows a photo and asks you to type the GPS time you see in it.
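
For example (made-up times): if the GPS display photographed reads 14:05:00 while the photo's EXIF timestamp says 14:03:20, the camera clock is 100 seconds behind the GPS, and the prototype below will suggest -geosync=+01:40.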

Apparently there are no bindings or GIR files for gtkimageview in Debian, so I'll have to use C.

Here is a C prototype:

/*
 * gpsoffset - Compute EXIF time offset from a photo of a gps display
 *
 * Use with exiftool -geosync=... -geotag trace.gpx DIR
 *
 * Copyright (C) 2009--2010  Enrico Zini <enrico@enricozini.org>
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 */


#define _XOPEN_SOURCE 600 /* glibc2 needs this for strptime and setenv */
#include <time.h>
#include <gtkimageview/gtkimageview.h>
#include <libexif/exif-data.h>
#include <stdio.h>
#include <stdlib.h>

static int load_time(const char* fname, struct tm* tm)
{
    ExifData* exif_data = exif_data_new_from_file(fname);
    ExifEntry* exif_time = exif_data_get_entry(exif_data, EXIF_TAG_DATE_TIME);
    if (exif_time == NULL)
    {
        fprintf(stderr, "Cannot find EXIF timetamp\n");
        return -1;
    }

    char buf[1024];
    exif_entry_get_value(exif_time, buf, 1024);
    //printf("val2: %s\n", exif_entry_get_value(t2, buf, 1024));

    if (strptime(buf, "%Y:%m:%d %H:%M:%S", tm) == NULL)
    {
        fprintf(stderr, "Cannot match EXIF timetamp\n");
        return -1;
    }

    return 0;
}

static time_t exif_ts;
static GtkWidget* res_lbl;

void date_entry_changed(GtkEditable *editable, gpointer user_data)
{
    const gchar* text = gtk_entry_get_text(GTK_ENTRY(editable));
    struct tm parsed = {0}; // zero the fields strptime won't set (e.g. tm_isdst)
    if (strptime(text, "%Y-%m-%d %H:%M:%S", &parsed) == NULL)
    {
        gtk_label_set_text(GTK_LABEL(res_lbl), "Please enter a date as YYYY-MM-DD HH:MM:SS");
    } else {
        time_t img_ts = mktime(&parsed);
        int c;
        int res;
        if (exif_ts < img_ts)
        {
            c = '+';
            res = img_ts - exif_ts;
        }
        else
        {
            c = '-';
            res = exif_ts - img_ts;
        }
        char buf[1024];
        if (res >= 3600)
            snprintf(buf, 1024, "Result: %c%ds -geosync=%c%d:%02d:%02d",
                    c, res, c, res / 3600, (res / 60) % 60, res % 60);
        else if (res >= 60)
            snprintf(buf, 1024, "Result: %c%ds -geosync=%c%02d:%02d",
                    c, res, c, (res / 60) % 60, res % 60);
        else 
            snprintf(buf, 1024, "Result: %c%ds -geosync=%c%d",
                    c, res, c, res);
        gtk_label_set_text(GTK_LABEL(res_lbl), buf);
    }
}

int main (int argc, char *argv[])
{
    // Work in UTC to avoid mktime applying DST or timezones
    setenv("TZ", "UTC", 1);
    tzset();

    const char* filename = "/home/enrico/web-eddie/galleries/2010/04-05-Uppermill/P1080932.jpg";

    gtk_init (&argc, &argv);

    struct tm exif_time;
    if (load_time(filename, &exif_time) != 0)
        return 1;

    printf("EXIF time: %s\n", asctime(&exif_time));
    exif_ts = mktime(&exif_time);

    GtkWidget* window = gtk_window_new(GTK_WINDOW_TOPLEVEL);
    GtkWidget* vb = gtk_vbox_new(FALSE, 0);
    GtkWidget* hb = gtk_hbox_new(FALSE, 0);
    GtkWidget* lbl = gtk_label_new("Timestamp:");
    GtkWidget* exif_lbl;
    {
        char buf[1024];
        strftime(buf, 1024, "EXIF time: %Y-%m-%d %H:%M:%S", &exif_time);
        exif_lbl = gtk_label_new(buf);
    }
    GtkWidget* date_ent = gtk_entry_new();
    res_lbl = gtk_label_new("Result:");
    GtkWidget* view = gtk_image_view_new();
    GdkPixbuf* pixbuf = gdk_pixbuf_new_from_file(filename, NULL);

    gtk_box_pack_start(GTK_BOX(hb), lbl, FALSE, TRUE, 0);
    gtk_box_pack_start(GTK_BOX(hb), date_ent, TRUE, TRUE, 0);

    gtk_signal_connect(GTK_OBJECT(date_ent), "changed", (GCallback)date_entry_changed, NULL);
    {
        char buf[1024];
        strftime(buf, 1024, "%Y-%m-%d %H:%M:%S", &exif_time);
        gtk_entry_set_text(GTK_ENTRY(date_ent), buf);
    }

    gtk_widget_set_size_request(view, 500, 400);
    gtk_image_view_set_pixbuf(GTK_IMAGE_VIEW(view), pixbuf, TRUE);
    gtk_container_add(GTK_CONTAINER(window), vb);
    gtk_box_pack_start(GTK_BOX(vb), view, TRUE, TRUE, 0);
    gtk_box_pack_start(GTK_BOX(vb), hb, FALSE, TRUE, 0);
    gtk_box_pack_start(GTK_BOX(vb), exif_lbl, FALSE, TRUE, 0);
    gtk_box_pack_start(GTK_BOX(vb), res_lbl, FALSE, TRUE, 0);
    gtk_widget_show_all(window);

    gtk_main ();

    return 0;
}

And here is its simple makefile:

CFLAGS=$(shell pkg-config --cflags gtkimageview libexif)
LDFLAGS=$(shell pkg-config --libs gtkimageview libexif)

gpsoffset: gpsoffset.c

It's a simple prototype but it's a working prototype and seems to do the job for me.

I currently cannot figure out why, after I click on the text box, there seems to be no way to give the focus back to the image viewer, so that I can control it with the keyboard.

There is another nice offset-computing algorithm that could be implemented: you choose a photo taken at a known place and drag it onto that place on a map; the tool then looks for the nearest point on your GPX trace and computes the time offset from that.

I have seen that there are programs for geotagging photos that implement all such algorithms, and have a nice UI, but I haven't seen any in Debian.

Is there any such software that could be packaged?

If not, the interpolation and annotation tasks can already be performed by exiftool, so it's just a matter of building a good UI, and I would love to see someone pick up the task.

Posted Sun Jul 11 12:34:04 2010 Tags:

Searching OSM nodes in Spatialite

Third step of my SoTM10 pet project: finding the POIs.

I put together a query to find all nodes with a given tag inside a bounding box, and also a query to find all the tag values for a given tag name inside a bounding box.

The result is this simple POI search engine:

#
# poisearch - simple geographical POI search engine
#
# Copyright (C) 2010  Enrico Zini <enrico@enricozini.org>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
#

import simplejson
from pysqlite2 import dbapi2 as sqlite

class PoiDB(object):
    def __init__(self):
        self.db = sqlite.connect("pois.db")
        self.db.enable_load_extension(True)
        self.db.execute("SELECT load_extension('libspatialite.so')")
        self.oldsearch = []
        self.bbox = None

    def set_bbox(self, xmin, xmax, ymin, ymax):
        '''Set bbox for searches'''
        self.bbox = (xmin, xmax, ymin, ymax)

    def tagid(self, name, val):
        '''Get the database ID for a tag'''
        c = self.db.cursor()
        c.execute("SELECT id FROM tag WHERE name=? AND value=?", (name, val))
        res = None
        for row in c:
            res = row[0]
        return res

    def tagnames(self):
        '''Get all tag names'''
        c = self.db.cursor()
        c.execute("SELECT DISTINCT name FROM tag ORDER BY name")
        for row in c:
            yield row[0]

    def tagvalues(self, name, use_bbox=False):
        '''
        Get all tag values for a given tag name,
        optionally in the current bounding box
        '''
        c = self.db.cursor()
        if self.bbox is None or not use_bbox:
            c.execute("SELECT DISTINCT value FROM tag WHERE name=? ORDER BY value", (name,))
        else:
            c.execute("SELECT DISTINCT tag.value FROM poi, poitag, tag"
                      " WHERE poi.rowid IN (SELECT pkid FROM idx_poi_geom WHERE ("
                      "       xmin >= ? AND xmax <= ? AND ymin >= ? AND ymax <= ?) )"
                      "   AND poitag.tag = tag.id AND poitag.poi = poi.id"
                      "   AND tag.name=?",
                      self.bbox + (name,))
        for row in c:
            yield row[0]

    def search(self, name, val):
        '''Get all name:val tags in the current bounding box'''
        # First resolve the tagid
        tagid = self.tagid(name, val)
        if tagid is None: return

        c = self.db.cursor()
        c.execute("SELECT poi.name, poi.data, X(poi.geom), Y(poi.geom) FROM poi, poitag"
                  " WHERE poi.rowid IN (SELECT pkid FROM idx_poi_geom WHERE ("
                  "       xmin >= ? AND xmax <= ? AND ymin >= ? AND ymax <= ?) )"
                  "   AND poitag.tag = ? AND poitag.poi = poi.id",
                  self.bbox + (tagid,))
        self.oldsearch = []
        for row in c:
            self.oldsearch.append(row)
            yield row[0], simplejson.loads(row[1]), row[2], row[3]

    def count(self, name, val):
        '''Count all name:val tags in the current bounding box'''
        # First resolve the tagid
        tagid = self.tagid(name, val)
        if tagid is None: return

        c = self.db.cursor()
        c.execute("SELECT COUNT(*) FROM poi, poitag"
                  " WHERE poi.rowid IN (SELECT pkid FROM idx_poi_geom WHERE ("
                  "       xmin >= ? AND xmax <= ? AND ymin >= ? AND ymax <= ?) )"
                  "   AND poitag.tag = ? AND poitag.poi = poi.id",
                  self.bbox + (tagid,))
        for row in c:
            return row[0]

    def replay(self):
        for row in self.oldsearch:
            yield row[0], simplejson.loads(row[1]), row[2], row[3]
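
A quick usage sketch (the module name poisearch and the coordinates are made up; pois.db is the database built in the previous step):

import poisearch

db = poisearch.PoiDB()
# Search area: a bounding box roughly covering the Girona region
db.set_bbox(2.56, 2.90, 41.84, 42.00)
print db.count("amenity", "fountain")
for name, data, x, y in db.search("amenity", "fountain"):
    print name, x, y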

Problem 3 solved: now on to the next step, building a user interface for it.

Posted Sat Jul 10 15:50:31 2010 Tags:

Importing OSM nodes into Spatialite

Second step of my SoTM10 pet project: creating a searchable database with the points. What a fantastic opportunity to learn Spatialite.

Learning Spatialite is easy. For example, you can use the two tutorials with catchy titles that assume your best wish in life is to create databases out of shapefiles using a pre-built, i386-only executable GUI binary downloaded over an insecure HTTP connection.

To be fair, the second of those tutorials is called "An almost Idiot's Guide", thus making explicit the requirement of being an almost idiot in order to happily acquire and run software that way.

Alternatively, you can use A quick tutorial to SpatiaLite, which is so quick it has examples that lead you to write SQL queries that trigger all sorts of vague exceptions at insert time. But at least it brought me a long way forward, at which point I could just cross-reference things with the PostGIS documentation to find out the right way of doing things.

So, here's the importer script, which will probably become my reference example for how to get started with Spatialite, and how to use Spatialite from Python:

#!/usr/bin/python

#
# poiimport - import nodes from OSM into a spatialite DB
#
# Copyright (C) 2010  Enrico Zini <enrico@enricozini.org>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
#

import xml.sax
import xml.sax.handler
from pysqlite2 import dbapi2 as sqlite
import simplejson
import sys
import os

class OSMPOIReader(xml.sax.handler.ContentHandler):
    '''
    Filter SAX events in a OSM XML file to keep only nodes with names
    '''
    def __init__(self, consumer):
        self.consumer = consumer

    def startElement(self, name, attrs):
        if name == "node":
            self.attrs = attrs
            self.tags = dict()
        elif name == "tag":
            self.tags[attrs["k"]] = attrs["v"]

    def endElement(self, name):
        if name == "node":
            lat = float(self.attrs["lat"])
            lon = float(self.attrs["lon"])
            id = int(self.attrs["id"])
            #dt = parse(self.attrs["timestamp"])
            uid = self.attrs.get("uid", None)
            uid = int(uid) if uid is not None else None
            user = self.attrs.get("user", None)

            self.consumer(lat, lon, id, self.tags, user=user, uid=uid)

class Importer(object):
    '''
    Create the spatialite database and populate it
    '''
    TAG_WHITELIST = set(["amenity", "shop", "tourism", "place"])

    def __init__(self, filename):
        self.db = sqlite.connect(filename)
        self.db.enable_load_extension(True)
        self.db.execute("SELECT load_extension('libspatialite.so')")
        self.db.execute("SELECT InitSpatialMetaData()")
        self.db.execute("INSERT INTO spatial_ref_sys (srid, auth_name, auth_srid,"
                        " ref_sys_name, proj4text) VALUES (4326, 'epsg', 4326,"
                        " 'WGS 84', '+proj=longlat +ellps=WGS84 +datum=WGS84"
                        " +no_defs')")
        self.db.execute("CREATE TABLE poi (id int not null unique primary key,"
                        " name char, data text)")
        self.db.execute("SELECT AddGeometryColumn('poi', 'geom', 4326, 'POINT', 2)")
        self.db.execute("SELECT CreateSpatialIndex('poi', 'geom')")
        self.db.execute("CREATE TABLE tag (id integer primary key autoincrement,"
                        " name char, value char)")
        self.db.execute("CREATE UNIQUE INDEX tagidx ON tag (name, value)")
        self.db.execute("CREATE TABLE poitag (poi int not null, tag int not null)")
        self.db.execute("CREATE UNIQUE INDEX poitagidx ON poitag (poi, tag)")
        self.tagid_cache = dict()

    def tagid(self, k, v):
        key = (k, v)
        res = self.tagid_cache.get(key, None)
        if res is None:
            c = self.db.cursor()
            c.execute("SELECT id FROM tag WHERE name=? AND value=?", key)
            for row in c:
                self.tagid_cache[key] = row[0]
                return row[0]
            self.db.execute("INSERT INTO tag (id, name, value) VALUES (NULL, ?, ?)", key)
            c.execute("SELECT last_insert_rowid()")
            for row in c:
                res = row[0]
            self.tagid_cache[key] = res
        return res

    def __call__(self, lat, lon, id, tags, user=None, uid=None):
        # Acquire tag IDs
        tagids = []
        for k, v in tags.iteritems():
            if k not in self.TAG_WHITELIST: continue
            for val in v.split(";"):
                tagids.append(self.tagid(k, val))

        # Skip elements that don't have the tags we want
        if not tagids: return

        geom = "POINT(%f %f)" % (lon, lat)
        self.db.execute("INSERT INTO poi (id, geom, name, data)"
                        "     VALUES (?, GeomFromText(?, 4326), ?, ?)", 
                (id, geom, tags["name"], simplejson.dumps(tags)))

        for tid in tagids:
            self.db.execute("INSERT INTO poitag (poi, tag) VALUES (?, ?)", (id, tid))


    def done(self):
        self.db.commit()

# Get the output file name
filename = sys.argv[1]

# Ensure we start from scratch
if os.path.exists(filename):
    print >>sys.stderr, filename, "already exists"
    sys.exit(1)

# Import
parser = xml.sax.make_parser()
importer = Importer(filename)
handler = OSMPOIReader(importer)
parser.setContentHandler(handler)
parser.parse(sys.stdin)
importer.done()

Let's run it:

$ ./poiimport pois.db < pois.osm 
SpatiaLite version ..: 2.4.0    Supported Extensions:
        - 'VirtualShape'        [direct Shapefile access]
        - 'VirtualDbf'          [direct Dbf access]
        - 'VirtualText'         [direct CSV/TXT access]
        - 'VirtualNetwork'      [Dijkstra shortest path]
        - 'RTree'               [Spatial Index - R*Tree]
        - 'MbrCache'            [Spatial Index - MBR cache]
        - 'VirtualFDO'          [FDO-OGR interoperability]
        - 'SpatiaLite'          [Spatial SQL - OGC]
PROJ.4 Rel. 4.7.1, 23 September 2009
GEOS version 3.2.0-CAPI-1.6.0
$ ls -l --si pois*
-rw-r--r-- 1 enrico enrico 17M Jul  9 23:44 pois.db
-rw-r--r-- 1 enrico enrico 37M Jul  9 16:20 pois.osm
$ spatialite pois.db
SpatiaLite version ..: 2.4.0    Supported Extensions:
        - 'VirtualShape'        [direct Shapefile access]
        - 'VirtualDbf'          [direct DBF access]
        - 'VirtualText'         [direct CSV/TXT access]
        - 'VirtualNetwork'      [Dijkstra shortest path]
        - 'RTree'               [Spatial Index - R*Tree]
        - 'MbrCache'            [Spatial Index - MBR cache]
        - 'VirtualFDO'          [FDO-OGR interoperability]
        - 'SpatiaLite'          [Spatial SQL - OGC]
PROJ.4 version ......: Rel. 4.7.1, 23 September 2009
GEOS version ........: 3.2.0-CAPI-1.6.0
SQLite version ......: 3.6.23.1
Enter ".help" for instructions
spatialite> select id from tag where name="amenity" and value="fountain";
24
spatialite> SELECT poi.name, poi.data, X(poi.geom), Y(poi.geom) FROM poi, poitag WHERE poi.rowid IN (SELECT pkid FROM idx_poi_geom WHERE (xmin >= 2.56 AND xmax <= 2.90 AND ymin >= 41.84 AND ymax <= 42.00) ) AND poitag.tag = 24 AND poitag.poi = poi.id;
Font Picant de la Cellera|{"amenity": "fountain", "name": "Font Picant de la Cellera"}|2.616045|41.952449
Font de Can Pla|{"amenity": "fountain", "name": "Font de Can Pla"}|2.622354|41.974724
Font de Can Ribes|{"amenity": "fountain", "name": "Font de Can Ribes"}|2.62311|41.979193

It's impressive: I've got all sorts of useful information for the whole of Spain in just 17MB!

Let's put it into practice: I'm thirsty; is there a water fountain nearby?

spatialite> SELECT count(1) FROM poi, poitag WHERE poi.rowid IN (SELECT pkid FROM idx_poi_geom WHERE (xmin >= 2.80 AND xmax <= 2.85 AND ymin >= 41.97 AND ymax <= 42.00) ) AND poitag.tag = 24 AND poitag.poi = poi.id;
0

Ouch! No water fountains mapped in Girona... yet.

Problem 2 solved: now on to the next step, trying to show the results in some usable way.

Posted Sat Jul 10 09:10:35 2010 Tags:

Filtering nodes out of OSM files

I have a pet project here at SoTM10: create a tool for searching nearby POIs while offline.

The idea is to have something in my pocket (FreeRunner or N900) which doesn't require an internet connection, and which can point me at the nearest fountains, post offices, ATMs, bars and so on.

The first step is to obtain a list of POIs.

In theory one can use Xapi but all the known Xapi servers appear to be down at the moment.

Another approach is to obtain it by filtering all nodes with the tags we want out of a planet OSM extract. I downloaded the Spanish one and set to work.

First I tried with xmlstarlet, but it ate all the RAM and crashed my laptop: for some reason, on my laptop the Linux kernels up to 2.6.32 (I don't know about later ones) like to swap out ALL running apps to cache I/O operations, which means that heavy I/O operations swap out the very programs performing them, so the system gets caught in some infinite I/O loop and dies. Or at least this is what I've figured out so far.

So, we need SAX. I put together this prototype in Python, which can process a nice 8MB/s of OSM data for quite some time with a constant, low RAM usage:

#!/usr/bin/python

#
# poifilter - extract interesting nodes from OSM XML files
#
# Copyright (C) 2010  Enrico Zini <enrico@enricozini.org>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
#


import xml.sax
import xml.sax.handler
import xml.sax.saxutils
import sys

class XMLSAXFilter(xml.sax.handler.ContentHandler):
    '''
    A SAX filter that is a ContentHandler.

    There is xml.sax.saxutils.XMLFilterBase in the standard library but it is
    undocumented, and most of the examples using it you find online are wrong.
    You can look at its source code, and at that point you find out that it is
    an offensive practical joke.
    '''
    def __init__(self, downstream):
        self.downstream = downstream

    # ContentHandler methods

    def setDocumentLocator(self, locator):
        self.downstream.setDocumentLocator(locator)

    def startDocument(self):
        self.downstream.startDocument()

    def endDocument(self):
        self.downstream.endDocument()

    def startPrefixMapping(self, prefix, uri):
        self.downstream.startPrefixMapping(prefix, uri)

    def endPrefixMapping(self, prefix):
        self.downstream.endPrefixMapping(prefix)

    def startElement(self, name, attrs):
        self.downstream.startElement(name, attrs)

    def endElement(self, name):
        self.downstream.endElement(name)

    def startElementNS(self, name, qname, attrs):
        self.downstream.startElementNS(name, qname, attrs)

    def endElementNS(self, name, qname):
        self.downstream.endElementNS(name, qname)

    def characters(self, content):
        self.downstream.characters(content)

    def ignorableWhitespace(self, chars):
        self.downstream.ignorableWhitespace(chars)

    def processingInstruction(self, target, data):
        self.downstream.processingInstruction(target, data)

    def skippedEntity(self, name):
        self.downstream.skippedEntity(name)

class OSMPOIHandler(XMLSAXFilter):
    '''
    Filter SAX events in a OSM XML file to keep only nodes with names
    '''
    PASSTHROUGH = ["osm", "bound"]
    TAG_WHITELIST = set(["amenity", "shop", "tourism", "place"])

    def startElement(self, name, attrs):
        if name in self.PASSTHROUGH:
            self.downstream.startElement(name, attrs)
        elif name == "node":
            self.attrs = attrs
            self.tags = []
            self.propagate = False
        elif name == "tag":
            if self.tags is not None:
                self.tags.append(attrs)
                if attrs["k"] in self.TAG_WHITELIST:
                    self.propagate = True
        else:
            self.tags = None
            self.attrs = None

    def endElement(self, name):
        if name in self.PASSTHROUGH:
            self.downstream.endElement(name)
        elif name == "node":
            if self.propagate:
                self.downstream.startElement("node", self.attrs)
                for attrs in self.tags:
                    self.downstream.startElement("tag", attrs)
                    self.downstream.endElement("tag")
                self.downstream.endElement("node")

    def ignorableWhitespace(self, chars):
        pass

    def characters(self, content):
        pass

# Simple stdin->stdout XMl filter
parser = xml.sax.make_parser()
handler = OSMPOIHandler(xml.sax.saxutils.XMLGenerator(sys.stdout, "utf-8"))
parser.setContentHandler(handler)
parser.parse(sys.stdin)

Let's run it:

$ bzcat /store/osm/spain.osm.bz2 | pv | ./poifilter > pois.osm
[...]
$ ls -l --si pois.osm
-rw-r--r-- 1 enrico enrico 19M Jul 10 23:56 pois.osm
$ xmlstarlet val pois.osm 
pois.osm - valid

Problem 1 solved: now on to the next step: importing the nodes in a database.

Posted Fri Jul 9 16:28:15 2010 Tags:

Tweaking locale settings

I sometimes meet Italian programmers who prefer their systems to be in English, so that they get untranslated manpages and error messages.

I also notice that their solutions often leave them with something to complain about.

Some set LANG=C and complain they can't see accented characters.

Some set LANG=en_US.UTF-8 and complain that OpenOffice Calc wants dates in the format MM/DD/YYYY which is an Abomination unto Nuggan, as well as unto Me.

But the locales system can do much better than that. In fact, most times people would be extremely happy with LC_MESSAGES in English, and everything else in Italian:

$ locale
LANG=it_IT.UTF-8
LC_CTYPE="it_IT.UTF-8"
LC_NUMERIC="it_IT.UTF-8"
LC_TIME="it_IT.UTF-8"
LC_COLLATE="it_IT.UTF-8"
LC_MONETARY="it_IT.UTF-8"
LC_MESSAGES=en_US.UTF-8
LC_PAPER="it_IT.UTF-8"
LC_NAME="it_IT.UTF-8"
LC_ADDRESS="it_IT.UTF-8"
LC_TELEPHONE="it_IT.UTF-8"
LC_MEASUREMENT="it_IT.UTF-8"
LC_IDENTIFICATION="it_IT.UTF-8"
LC_ALL=

A way to do this (is there a better one?) is to tell the display manager (GDM, KDM...) to use the Italian locale, and then override the right LC_* bits in ~/.xsessionrc:

$ cat ~/.xsessionrc
export LC_MESSAGES=en_US.UTF-8

That does the trick: English messages, with Proper currency, Proper dates, Properly sorted accented letters, Proper A4 printer paper and Proper SI units. Even Nuggan would be happy.
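
To double-check the split from code, here is a minimal sketch, assuming the environment above is in effect; it should print English locale metadata next to Italian formatting:

#!/usr/bin/env python
# Quick check of the mixed locale setup
import locale, time
locale.setlocale(locale.LC_ALL, "")           # adopt the settings from the environment
print(locale.getlocale(locale.LC_MESSAGES))   # ('en_US', 'UTF-8')
print(locale.getlocale(locale.LC_TIME))       # ('it_IT', 'UTF-8')
print(locale.format_string("%.2f", 1234.56))  # 1234,56 -- the Italian decimal comma
print(time.strftime("%A %x"))                 # Italian weekday name and DD/MM/YYYY date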

Posted Sat Apr 17 15:39:19 2010 Tags:

Temporarily disabling file caching

Does it happen to you that you cp a big, big file (say, of the same order of magnitude as your RAM) and the system becomes rather unusable?

It looks like Linux says "let's cache this", and as you copy it frees more and more RAM to make room for caching the big file you are copying. In the end, all the RAM is full of file data that you are not going to need.

How aggressively this happens depends on how /proc/sys/vm/swappiness is set.

I learnt about posix_fadvise and tried to play with it. The result is this preloadable library that hooks into open(2) and fadvises everything as POSIX_FADV_DONTNEED.

It is all rather awkward: used this way, fadvise will discard existing cache pages if the file is already cached, which is too much. Ideally one would like to say "don't cache this on my account", without stepping on the toes of other system activities.

I also found that I need to hook into write(2) and run fadvise after every single write, because a file cannot be fadvised to be written in its entirety unless fadvise is given the file size in advance, and the preloaded library has no way to know the size of the output file. So, meh.

So, now I can run: nocache cp bigfile someplace/ without trashing the existing caches. I can also run nocache tar zxf foo.tar.gz and so on. I wish, of course, that there were no need to do so in the first place.

Here is the nocache library source code, for reference:

/*
 * nocache - LD_PRELOAD library to fadvise written files to not be cached
 *
 * Copyright (C) 2009--2010 Enrico Zini <enrico@enricozini.org>
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 */

#define _XOPEN_SOURCE 600
#define _GNU_SOURCE /* needed for RTLD_NEXT */
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <dlfcn.h>
#include <stdarg.h>
#include <errno.h>
#include <stdio.h>

typedef int (*open_t)(const char*, int, ...);
typedef ssize_t (*write_t)(int fd, const void *buf, size_t count);

int open(const char *pathname, int flags, ...)
{
    static open_t func = 0;
    int res;
    if (!func)
        func = (open_t)dlsym(RTLD_NEXT, "open");

    // Note: I wanted to add O_DIRECT, but it imposes restriction on buffer
    // alignment
    if (flags & O_CREAT)
    {
        va_list ap;
        va_start(ap, flags);
        mode_t mode = va_arg(ap, mode_t);
        res = func(pathname, flags, mode);
        va_end(ap);
    } else
        res = func(pathname, flags);

    if (res >= 0)
    {
        int saved_errno = errno;
        int z = posix_fadvise(res, 0, 0, POSIX_FADV_DONTNEED);
        // posix_fadvise returns the error code rather than setting errno
        if (z != 0) { errno = z; fprintf(stderr, "Cannot fadvise on %s: %m\n", pathname); }
        errno = saved_errno;
    }

    return res;
}

ssize_t write(int fd, const void *buf, size_t count)
{
    static write_t func = 0;
    ssize_t res;
    if (!func)
        func = (write_t)dlsym(RTLD_NEXT, "write");

    res = func(fd, buf, count);

    if (res > 0)
    {
        int saved_errno = errno;
        int z = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
        // posix_fadvise returns the error code rather than setting errno
        if (z != 0) { errno = z; fprintf(stderr, "Cannot fadvise during write: %m\n"); }
        errno = saved_errno;
    }

    return res;
}
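
As an aside, recent Python versions expose the same primitive as os.posix_fadvise (since Python 3.3), so the experiment can be sketched without a preloaded library at all. A hypothetical nocache_copy, for illustration only:

#!/usr/bin/env python3
# Sketch: copy a file while asking the kernel to drop the copied
# data from the page cache (offset 0, length 0 covers the whole file)
import os, sys

def nocache_copy(src, dst, bufsize=1024*1024):
    fdin = os.open(src, os.O_RDONLY)
    fdout = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        while True:
            buf = os.read(fdin, bufsize)
            if not buf:
                break
            os.write(fdout, buf)
        # Dirty pages can only be dropped after they hit the disk
        os.fdatasync(fdout)
        os.posix_fadvise(fdin, 0, 0, os.POSIX_FADV_DONTNEED)
        os.posix_fadvise(fdout, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fdin)
        os.close(fdout)

if __name__ == "__main__":
    nocache_copy(sys.argv[1], sys.argv[2])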

Updates

Steve Schnepp writes:

Robert Love did an O_STREAMING patch for 2.4. It wasn't merged in 2.6, since POSIX_FADV_NOREUSE should be used instead.

But unfortunately it is currently mapped either to WILLNEED or to a no-op.

It seems that a Google Code project has been spawned to address this.

Posted Mon Mar 8 13:26:28 2010 Tags: