Showing posts with label developer's life. Show all posts

2016-07-09

GitFlow (and friends) with remotes: avoid the busywork

There are lots of places online where one can learn about GitFlow, but it's seemingly always discussed in purely local terms; the details, subtle or not, of using GitFlow when you are pushing and pulling your changes over the network are never mentioned.

Now, GitFlow is a bit long in the tooth, and some of the younger and simpler alternatives do take into consideration things such as Pull (or Merge) Requests; lately some even argue that the original GitFlow should be considered harmful. But still, there's a common basic concept shared by GitFlow and the alternatives: local feature branches. How and when exactly to merge them back into a development branch is one of the big differences between them, but is rarely detailed.

The goal of this post is to gather some tips on how to keep developing in a local feature branch while staying involved in a lively repository, and how to make the final feature merge easier.
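As a taste of the kind of busywork involved, here's a minimal sketch (the remote name `origin` and the branch names `develop` and `feature/x` are assumptions, not anything prescribed by GitFlow itself):

```shell
# Keep a local feature branch current against a moving remote development
# branch (assumed names: remote "origin", branches "develop", "feature/x").
git fetch origin                      # update remote-tracking branches only
git rebase origin/develop feature/x   # replay local work onto the fresh develop

# ...and when the feature is done, merge it back with an explicit merge commit:
git checkout develop
git merge --no-ff feature/x
```

The `--no-ff` keeps a merge commit marking where the feature landed, which is the part most GitFlow variants agree on.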

2015-09-10

List the DTrace providers in your machine

I don't see any official way to list the DTrace providers; you can seemingly only list ALL the probes, the >300K of them (on my Mac right now), and then you have to deal with the multitude of providers instantiated multiple times for different PIDs.

So here's a small AWK script to list the unique providers, how many instances of each there are, and how many providers are attached to each PID:
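A minimal sketch of such a script (assumptions: the provider name is the second column of `dtrace -l` output, and per-process providers carry a trailing PID, e.g. `pid1234`; adjust if your dtrace prints a different layout):

```shell
# List unique DTrace providers, grouping per-PID provider instances.
sudo dtrace -l | awk '
NR > 1 {                              # skip the header line
    prov = $2
    if (match(prov, /[0-9]+$/)) {     # provider instantiated for a PID?
        pid  = substr(prov, RSTART)
        prov = substr(prov, 1, RSTART - 1)
        pidprov[pid "," prov] = 1     # remember provider attached to this PID
    }
    probes[prov]++                    # probes per unique provider
}
END {
    for (p in probes) printf "%8d probes  %s\n", probes[p], p
    for (k in pidprov) { split(k, a, ","); perpid[a[1]]++ }
    for (p in perpid) printf "%8d providers attached to PID %s\n", perpid[p], p
}'
```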

2015-08-07

Fred Brooks vs Dijkstra?

In the 60's, Fred Brooks led the development of the IBM/360 system and its OS. The experience made him write the famous The Mythical Man-Month.

Turns out that Dijkstra also wrote about that system in his EWD255. And everything he complained about in that system is the standard nowadays! It sounds like he considered that a lot of the complexity that should be fixed by the computer was being dumped on the programmer; that the features touted by the IBM/360’s OS were in fact partial fixes to the bad design of the computer:

2015-07-21

6 tips to survive Codility tests

Well, it happened. When applying for a job, I got sent to a Codility test. And even though I guess I already had some good practice with them, I managed to do badly in stupid ways – ways that 2 years ago I had already thought about and even taken notes on. It's just that in the heat of the moment I forgot about those rules of thumb.

And in fact I think these hints should be given by Codility themselves – because otherwise you are practically ambushed, even more so if you didn't take the time to thoroughly explore how they work. So here are my hints-to-self.

The summary is: don't think of this as a coding interview; this is rather about getting something working, ASAP.

2015-07-13

Swapping bottom left and right panes (content and patch summary/file tree) in gitk

Gitk (as of git 2.4.2) has its bottom left and right panes the other way round from what I like: the bottom right shows the file tree or the patch summary, while the bottom left shows the contents of whatever is selected on the right. Isn't that kinda non-standard?

There is no official way to swap those panes, so here's a little hacky patch to do it.

2015-07-06

Towards functional

(another entry in the "let the greats make my points" series…)

At some point I was getting uncomfortably close to the limits of what plain old C can express, and started learning C++. But C++ always had, to me, this air connecting it to UML excesses, to baroque Design Patterns for things that might have been expressed with a handful of words, to "big, spongy frameworks" (as Yegge said?)… the kind of things that made me wary of going into J2EE to begin with, a long time ago.


2015-07-05

Great minds think alike…

… or, "using quotes by well-known people to put forward possibly unpopular points" ;P

Sometimes I have a fully formed opinion to write about; lots of times I don't, but I have an intuition of where I want to go. And those times seeing what some of the "greats" in the field think can be a bit of a beacon.

2015-06-25

Eclipse CDT configuration for big, makefile-based projects

It's been kinda hard to get Eclipse to work well with big CDT projects (say, the Linux kernel, or even a BuildRoot project including the Linux kernel). The documentation for Eclipse is somewhere between non-existent and unhelpful, and I could only find one post that talked specifically about preparing an Eclipse project for the Linux kernel. But that post explains nothing, and does some arbitrary things. So this is the generalization of that.

Telnet autologin with a short Expect script

Couldn't find anything concise, so here's my solution: a sweet short Expect (the CLI variety installed by default in Ubuntu) script to open a telnet session, feed in the user and password, AND continue to an interactive session.


2013-10-26

Codility

Lately I came across Codility, a company which presents itself quite simply: “we test coders”. I had been thinking for some time about trying my hand at some coding competition, and was in the process of searching for a new job, so I became curious. But after a quick look at their website the curiosity turned into disdain. It all sounded like boasting about being a coarse-grained filter to separate the chaff from the wheat, quickly.

Only that the chaff are people too. It sounded pretty dystopian. Probably this is a First World problem, but hey, after interviewing at a couple of places, things seemed sufficiently bleached and dead inside as to not need any automatic filter to dehumanize the interviewing / testing even more. Typically in my interviews I missed feeling anything that would make me want to work with the interviewers; I wanted to find someone with passion, instead of someone boringly fulfilling the recruiter role for a hollow company.

And the Codility premise didn’t seem to help.

But after a couple of days I found myself going back to the thought of what they do. I started realizing how one of the causes that pushed me to look for a new job was that I missed having an environment where I could grow as a programmer - ideally with programmers I could look up to. My job at the moment was not in a software-centered company, so I guess that’s why I was lacking inspiration; everyone had a variety of hats there, and I was feeling little by little more sure that my hat should be more of a software-centered one, while most other people’s hats were mainly in the hardware side.

But that doesn’t change the fact that the software side of things was not looking right. Even my attempts to spice up my own work and make it something I could be proud of met resistance - for example, using the preprocessor a bit too much (say, to implement a dab of generic functions without the full problem of getting into C++) and being told that “you’re working too high level, this is firmware”. Yeah, I wonder what people using these techniques in the 70’s would have thought about even our humblest 8-bit uC with 128 KB of RAM.

So then it hit me. What would have happened if Codility had been used in the recruiting process of my own company? Probably most problems would have disappeared. And it reminded me of that FizzBuzz article, and how an astounding proportion of programmers plainly can’t program their way out of a wet paper bag.
What if Codility was in fact something to look for? What if a company using Codility could be a sign that they are serious about programmers?
Even, if they might be such a force for good, what about working for Codility themselves?

I decided to investigate them a bit more – even the application procedure was interesting – and tried one of their free tests. And turns out it was easy. I finished in half of the allotted time, and after some cursory checking I submitted my solution feeling smug. It worked. I knew because I did it in parallel in my own IDE and could see it was working. Yeah, I'm good.

…except that I made an error controlling the index of an array, which made my solution fucking useless in most cases apart from the small test case I had used. I got a score of 20%.

Well... that was sobering.

(in my defence I’ll say that Codility’s test required using big, signed integers, and their solution checker had problems of its own: their compiler emitted warnings about the long long type being C99-only, but didn’t allow passing compilation flags to, for example, select C99 mode. So I concentrated on the signedness problems… only to fail to realize that the more mundane problem was right there too.)

So I took to practicing with some of their other free tests, of which they have a full page, for those wanting to improve. This helpfulness seemed strange at the time, but turned out to be an interesting detail later.

Why strange? Because until then I was rather impressed by their online testing/evaluation system and was half-sold on their possible goals, but could still see lots of warning signs that made me wonder whether they were a force for good or bad. Every rough corner had a big potential to change from being just an apparent nuisance into a showstopper - for the candidate!, while the recruiting company would just receive a bad report about the candidate, and Codility would come out looking like it did its job - even when that was arguable. And the opacity of Codility didn’t help me trust them. So, to avoid the risk of a letdown in case the ugly face was the real one, I decided to try my hand with Python, even though I am mostly a novice with it: if I was going to invest my time and get tested, why not at least take the chance to practice what I want and have some fun along the way.

Turns out, I think the testing style of the tasks does in fact favour doing just that: there is no time to stop to think about all the corner cases in C. Or rather, with Python you can do the same task using half the brain cycles.

Some tries later I was managing to reliably get good scores and tried interviewing with them. And turns out that the official test scores were good enough, so I finally got to talk to them. Into the Death Star!

Well, what a surprise. From the first moment they seemed downright warm, in the best “mythological startup”-y way. Not only were they rather nice as plain people. Not only were they well aware of the shortcomings of the testing environment - they even seemed mortified about some of the problematic details I mentioned, and asked for specifics so they could fix them, when I had expected them to brush it all aside with a “yep, we know, we have a bug report somewhere, we’ll fix it when we have time”.

Not only did they no longer sound like douchebags - they in fact have such an emphasis on teaching and on making the programming community better that it left me positively dazed. It sounded much better than the best case I had imagined. Remember the “strange page full of tests”? When I found it, I thought it was strange because it seemed quite helpful for a site intent on fucking you up. Now I see it another way: the “fucking you up” part is still there, but it is trying hard to get better / not be worse than it has to be (after all, they are still a filter!). And that part is needed to pay for the other side, the one with a greater-than-themselves potential, with a dream: to teach!

But the intriguing thing is that they have the possibility to teach in a unique way. After all they have a big and growing corpus of programming examples showing how programmers try to do a task and how/where/why they fail. Imagine all that can be learnt from there. Even in a purely "psychology of programming" way it must be terribly interesting.

For example, Peter Norvig said something to the effect that “Design patterns are bug reports against your programming language” - which is great, because now I don’t need to explain myself as much when I say that I dislike the premise of design patterns :P. But, what if the kind of “typical mistake” data that Codility can gather was in fact a source of antipatterns - meaning constructions which are typically done wrong by programmers? Maybe you could even find a blind spot or Achilles’ heel of human thinking; some articulation of thought which, for a big percentage of people, will cause wrong results. What if (some?) patterns made sense as a framework to automatically avoid those dangerous spots? Or inversely, what if you could standardize some antipatterns to let programmers realize when they are treading on especially dangerous ground?

Sounds maybe far-fetched, but don’t we already have examples of something similar in high-school level philosophy courses? P -> Q implies ~Q -> ~P, but everyone has had the impulse at some moment to think ~P -> ~Q - and lots of people still do it, of course. Didn’t the Greeks already have people specializing in convincing others by exploiting just that kind of buggy thinking?

Going slightly further, what if that kind of “problematic construction detection” was built into a compiler or lint-type tool?

But the most interesting thing would be in a context where it touches even more pressing and real problems. For example, in some safety systems (say, railway traffic control), a number N of equivalent-but-different versions of a program run in parallel as a way to get redundancy. The different versions can cross-check their results and orderly shut down the system when a discrepancy is detected, to avoid the system working in an unexpected state; or, if more than 2 versions are running, the result can be decided by majority vote, while the version(s) in the minority can be isolated and taken for corrective measures. The idea is that N independent implementations by N different programmers should have different bugs in different places, affording a measure of safety.
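A toy sketch of that voting scheme for N=3 (all the names are invented for illustration, and impl_c is deliberately buggy for inputs >= 100):

```shell
# Three "independent" implementations of the same spec (square a number).
impl_a() { echo $(( $1 * $1 )); }
impl_b() { echo $(( $1 * $1 )); }
impl_c() { if [ "$1" -lt 100 ]; then echo $(( $1 * $1 )); else echo 0; fi; }

vote() {
    a=$(impl_a "$1"); b=$(impl_b "$1"); c=$(impl_c "$1")
    # any two agreeing results win; on full disagreement, shut down
    if [ "$a" = "$b" ] || [ "$a" = "$c" ]; then echo "$a"
    elif [ "$b" = "$c" ]; then echo "$b"
    else echo "DISCREPANCY: shutting down" >&2; return 1
    fi
}

vote 7      # all three agree -> 49
vote 150    # impl_c is outvoted -> 22500
```

The safety argument is precisely that impl_a and impl_b should not share impl_c's bug.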

…or does it? Turns out that MIT published not long ago a very interesting paper in which they show that this approach to redundancy is NOT as good as assumed, because it rests on the assumption that bugs are equiprobable and uncorrelated. But they aren’t! They depend both on the programmer and, more importantly and interestingly, on the task at hand. The MIT group asked a number of programmers with different backgrounds and experience to solve some problems, and they do demonstrate how bugs are much more probable in certain parts of the task, no matter whether the programmer is a student or an experienced developer. Terribly interesting, even though their sample is too small to do much more than assert that the bug-statistical-independence assumption is false - so N-version programming gets, at the very least, a pretty big warning.

But now imagine what you could do to attack the bug determinism in N-version programming if you had a database of how thousands of programmers tend to fail at different programming tasks.
How BEAUTIFUL could such research be, and how terribly useful could the results be? Things like that are the ones which make me think about getting into a PhD…

Also, I think it is interesting to compare that to machine translation. For a long time it was an intractable problem, which gave useless results. And then, “all of a sudden” we have Google Translate which gives usually pretty good results (if you have never used 90’s style translating software, do believe me, Google Translate is just incredible). What changed? Well, the approach changed. Instead of having a machine “understand” the original text and translate it into a target language, now Google has a huge corpus of translations of the same texts into different languages, and can statistically link parts of a text in language A to parts of the same text in language B. Rosetta stone, a billion times over, with feedback from users and an ever growing corpus.
Again: a big corpus, statistics, translation / finding correspondence.
What could Codility do here with its data? Still thinking on that, but the potential has to be huge…

So. Yep, I finally interviewed with them. And though it all seemed so interesting, in the end the thing was not to be. But I certainly expect them to evolve into something great… if only because if they don’t, the prospect of the nasty-looking part taking over the dreamy part sounds horrible.

So here’s one hoping that they will really be a force for good! And looking forward to companies either using Codility with a measure of taste and care - or improving their recruitment processes for programmers… because if not, I can see myself fully taking advantage of the test periods :P.


[UPDATE: 2 years later, things look less wonderful… ]

2013-10-15

Estimation questions in job interviews

I had heard about those job-interview questions where you are asked to estimate something on the spot, despite having no real data about it.

There was some example about estimating the number of gas stations in the whole USA; I seem to remember it was for a Google job application. The guy had the guts to throw out numbers on a hunch, with the handwaviest idea of how they could be good numbers… and, luckily, in the end the result was not even a bad approximation.

I realize now that back then, when I read that, the “luckily” part is what stuck with me. While it’s true that the guy had the wit to back the numbers and/or the chutzpah to pick them out of thin air, and even though he also had some brilliant moments where he just turned around and cross-checked an imagined number using another set of assumptions… the thing is that in the end he was near the true number.
Luckily.

It was fascinating, and rather intimidating. How in hell would I be able to do something like that?

So, when in a recent job interview I was asked point-blank for such an estimation, I was totally caught off-guard. I was rather well prepared technically, if only because I had been kind-of-practicing lately - but I wasn’t really expecting any more technical questions that day (it was late, after work!), and certainly no estimations. But there I was, with the interviewer just charmingly dropping the question and handing me a piece of paper. Something like: “how many tumbles do all the martial arts practitioners in Kyoto take, between them, on one given day?”.
Oh god oh god oh god. WTF. Wasn’t this the HR part of the interview??

Well, who would have thought. I had never practiced for such a thing (how?), but suddenly I was throwing up numbers all the same. Adrenaline, I guess. One factor, two, a third one, of course this is approximate but we can consider it’s good enough because of blah, … a big multiplication of it all, it’s done.

The interviewer looked at me kind of blankly. Then at the piece of paper.

“Are you sure?”, I think he asked.

I took another look at the numbers, crossed out one of them, fumbled a bit with another and mumbled something, but the final result stayed more or less where it was. Which had to be a good sign, since the interviewer had started by saying that just getting the order of magnitude right would be OK. “Yes, I think it has to be something like this”.

“Oh, I wonder what would the real number be. Hehe”. Change of subject.

What… the … fuck?

You don’t even have it? What kind of dumbness is this? What use is to come up with all kind of assumptions if you are not checking the results?

How can then we know if I was lucky or not??

Yep, I remember even getting annoyed for a moment. Are we playing job interviews or what?

I only managed to understand how wrong I was about it all when I talked to my interviewers some time afterwards. Luck? Numbers? Who cares? The important thing is the process, and whether you can assure yourself and others that your assumptions are the best that can be had.

I knew that the process was important; what I didn’t get earlier was that the process was the only important thing.

The Google interviewee who explained his assumptions to come up with an approximation of the number of gas stations in the USA took advantage of things he knew about geography and population, about his own driving and mileage per gallon, about… whatever. Probably that is in fact what struck me as brilliant, maybe just because I rarely drive, so I probably wouldn’t have had a feeling for that kind of data; seeing him conjure numbers was amazing.

And I did the same! For my assumptions I had the advantage of having lived most of my life in touristic places, and the knowledge of how people flock there (and the difference between a concrete event and a seasonal thing). I knew how different martial arts practitioners are more or less prone to be tumbling around (say, Aikido all the time, Karate rarely :P). I knew what an average tatami looks like, how many people can be training on it at once, how they behave - because I had spent a lot of time on them.
I remember the feeling of flow while I was stringing all the numbers one after another, realising how in fact I could give solid estimates for every factor. I was pulling it off!

... in my head, at least. Because it turns out that I barely explained any of the reasoning behind each number. I just wrote, wrote, crossed out something, started again, mumbled, wrote. I just wanted to be fast and get a lucky result.

So, one ends up feeling even smug about it. Fuck yeah, I aced it. The other looks on blankly, not knowing what to do with a number. “…heh. Yeah, well. OK, let’s move on.”
(looking back, maybe they should have re-stated the goals, since clearly they were expecting a rather longish string of reasoning rather than a single number :P. I can only assume that we cross-guessed each other. Also funny that I could sense his awkwardness but didn't know what to make of it... way to waste a pretty nice hint!)


An even more interesting take is that it is not even about having the background / experience to imagine the correct numbers. I should have been able to sell him my numbers, even if they had been pulled straight out of my ass in real time. Because if they are right, that’s nice, but no one is going to notice, because he didn’t have anything to compare them to. Which means that this is not only about “reasoning being more important than data”; it’s about communication being even more important than reasoning. That might be arguable, and probably for an engineer it’s a tad too much (as opposed, say, to a sales guy); but anyway, for sure we would all prefer working with someone who communicates his idea (maybe to the point of unduly pulling you into it, Reality Distortion Field style) over someone who can’t / doesn’t bother to explain what is going on in his head, where he’s the prima donna to an audience of 1.

Yes, that was a shocking thing to realize. Partly because it sounds so logical now, in hindsight.

And, of course, even the Google interview guy did it. After all, I had been able to glimpse (... or, had been shown) his brilliance. And that was it. The result being “correct” was icing on the cake, because he was already communicating it.

And now I am doing it, too. :P

One lingering question is: why was I so fixated on getting a number? The first thing I thought was that the style of other programming tests had put me in that frame of mind. But there’s a more insidious possibility: what comes to my mind when I think about solving a math problem is to tidily frame the final result, making it stand out from all the noise of the process. That’s what I guess we all did in exams. And I guess that’s even good presentation of the result, which should always be good. Only that in this case the result to be presented was not the final number, but… everything else. I was framing the unimportant part, and not bothering with the rest.

So, one other thing to learn: make sure to check, a couple of times at least, whether the question is actually being answered… another oh-so-logical tip, huh?



2013-09-13

My first hardware bug report?

I have found what looks like a hardware bug in the AT90CAN128! I guess I could be happy in a geeky way, but the number of hours spent tracking the problem down makes it a bit difficult :P.

The thing is that aborting a pending MOb in the CAN controller embedded in the AT90CAN128 can leave CONMOB in an unexpected state. I am not totally sure whether this is a hardware bug… or whether Atmel's library at90CANlib_3_2 is buggy AND misleading.

Atmel Studio 6 and blank Solution Explorer

Atmel Studio 6, which is (incomprehensibly!) based on Visual Studio (2010?), suddenly stopped showing anything in the Solution Explorer pane.
The pane was there, it just remained empty, blank, apart from the "Properties" button (and the "Show all files" button, which shows up too, depending on the frontmost pane).

Using Atmel Studio 6's Projects for libraries

Long story short: Atmel Studio's inflexibility forces quite a strict isolation, which pushes you towards the purist side (a potential advantage) and makes compile-time library configuration very difficult (a clear, actual disadvantage). So it's not a very clear win - though I am rather liking it.

2012-01-16

Safari's Reading List protocol: CalDAV + XBELs


Safari on desktop and iOS has a feature called "Reading List". It is a way to store URLs in iCloud, synchronize them between Safaris, and mark them as read or not. Somewhat like Instapaper maybe.

I was a bit surprised that there are no Chrome or Firefox extensions for tapping into Safari's Reading List. So I wanted to try poking a bit at the protocol; maybe something interesting would appear, or someone could get a head start from this.

Safari always seems to start by contacting p05-bookmarks.icloud.com, which resolves to the Akamai CDN, some p05-bookmarks.icloud.com.akadns.net. Luckily the IP was always the same, and there is no client certificate check, so I could mount a little man-in-the-middle via /etc/hosts.

Then, a couple of socats: one to pose as the original SSL server and resend the plaintext, and a second to receive the plaintext and send it on to the original IP. The first socat shows everything that flows through it (in plaintext) thanks to the -vx option.

socat -vx OPENSSL-LISTEN:443,cert=certs/server.pem,verify=0,reuseaddr,fork TCP:localhost:50000 

socat -v TCP-LISTEN:50000,reuseaddr,fork OPENSSL:17.172.100.37:443

I tried tcpflow instead of the -vx, but tcpflow won't show anything when capturing on the loopback interface on OS X. Seems to be a bug.

So now we can see the exchanges between Safari and iCloud. I was half-expecting something encrypted, but it isn't (apart from the SSL connection). It seems to be CalDAV, used to exchange small XBEL files, gzipped; these can be quickly unzipped by copying the hex from the socat dump and pasting it into an "xxd -r -p | zcat". Each XBEL file is a small XML file with a URL and its status.
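For reference, the unzip-from-hex trick can be tried end to end on a synthetic payload (the payload text here is invented; in practice the hex comes out of the socat -vx dump):

```shell
# gzip a stand-in payload, hex-dump it (as it would appear in the socat
# output), then recover it with the same xxd/zcat pipeline.
cd "$(mktemp -d)"
printf 'hello reading list' | gzip -c | xxd -p > payload.hex
xxd -r -p < payload.hex | zcat    # -> hello reading list
```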

I expected this to be much quicker, but socat and tcpflow kind of conspired against that. I started out planning to use named pipes and tees to connect the socats while showing the exchanges, but each update to the Reading List causes between 3 and 5 connections, which requires the socat option fork, which forks a new socat for each connection, which won't play nice with the pipes.

I guess I should have switched to some python earlier... next time I will. And at this point anyway, if I wanted to keep going into this protocol, looks like throwing together some small client with Python would be the best way forward.

But I am not currently interested in JavaScript, so I guess I won't be making any extension myself, so I guess I should move on to other things.

So, dear Lazyweb... :)

2012-01-15

Tcpflow and connections between local interfaces


Looks like tcpflow doesn't see connections between local interfaces. After a bit of digging, it looks like such connections are "routed internally by the kernel", at least in Linux. There are patches for Linux to force those packets out of one interface and in through another, but even that is only useful if you have an external network connecting both interfaces (though it looks like a simple crossover cable should be enough).

http://www.ssi.bg/~ja/send-to-self.txt

There is another option: using iptables to first make the packets leave the machine towards some external IP, and then using arpd to make a router send back those packets.

http://serverfault.com/questions/127636/force-local-ip-traffic-to-an-external-interface

And I see people reporting that tcpflow -i lo does work for them, capturing flows with local addresses even when those are different from 127.0.0.1.

http://fixunix.com/tcp-ip/327123-capturing-tcpdump-local-traffic.html

The interesting thing is that people seem to take it as well known that Linux routes the traffic between local interfaces through the "lo" interface; but I didn't find any authoritative source explaining the rationale, the configurability, or the implementation. Such a source, I guess, would also make it somewhat easier to find the equivalent things in the BSDs and OS X.

(I surely should go straight to the source code, but that feels fragile. I am not interested in the current implementation, but in the design: how it should work vs. how it works right now. Although surely that kind of networking is pretty baked into the kernel...)

It would be interesting to know whether this is something missing in the BSDs / OS X or in libpcap.

2012-01-04

tcpflow 1.0.2 doesn't work with net expressions

Tcpflow 1.0.2, as built with MacPorts, doesn't work when net expressions are used.
But 1.0.6 does work.

I already submitted a new portfile so it should be available soon.


2012-01-01

Building socat in OS X 10.7 Lion

socat (1.7.2.0 as of this writing) doesn't compile with clang, the standard compiler in Mac OS X 10.7 Lion (10.7.2 as of this writing). It does compile if instead of clang one uses, for example, llvm-gcc-4.2.2.


The developer reports that this is a bug; only gcc is supported in socat, but there are compatibility fallbacks for other compilers. Only, the fallback was missing in the file that fails to compile, xioexit.c. The fix is easy:
@@ -5,6 +5,7 @@
 /* this file contains the source for the extended exit function */
 
 #include "xiosysincludes.h"
+#include "compat.h"
 #include "xio.h"
(if someone is trying to build something like socat, I guess they don't need help with patch files)

This problem was also present in MacPorts' port for socat. I have already reported it and provided a new working portfile, so I guess it won't be long until it is published.

2011-12-05

Absurd path-length limits in Windows APIs

While tidying up old stuff I came across an interesting problem we had back in the gvSIG days (about 4 years ago)...

The problem: if you try to decompress the ZIP on Windows, you run into Windows' inability to create files whose full path is longer than 260 chars. This happens on both NTFS and FAT, and it is a problem (for example) if you want to decompress the ZIP on Windows in order to put it on a USB drive or a CD / DVD.
It is not a FAT limitation: the copy or decompression can be done on a non-Windows OS and it works fine.
It is a Windows API limitation (at least including Windows 2003 Server): although filenames can be longer than 255 chars, full paths cannot exceed 260 chars.
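A quick counter-demo on a POSIX system (the segment names are invented; the point is just that the limit is in the Windows API, not in the filesystem):

```shell
# Create a file whose full path is far beyond 260 chars and read it back;
# the equivalent CreateFile call on Windows would fail with the 260-char limit.
base=$(mktemp -d)
long=$base
for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15; do
    long=$long/twenty_char_segment_x
done
mkdir -p "$long"
printf 'ok' > "$long/file.txt"
echo "path length: $(printf %s "$long/file.txt" | wc -c)"
cat "$long/file.txt"    # -> ok
```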

http://msdn2.microsoft.com/en-us/library/aa365247.aspx

2011-11-23

dtrace'ing paging to disk

How to know which processes are paging to/from disk (as opposed to other VMM management) in OS X, and how much exactly:

sudo dtrace -n 'maj_fault {@[execname] = count()}'

reference (with examples and other options): http://wikis.sun.com/display/DTrace/vminfo+Provider

I had been meaning to look for ways to do this, and tried some of the tools included in OS X (Activity Monitor, top, Instruments, vm_stat, vmmap, ...). But nothing really helped, and/or missed the exact level of information I was after (only real I/O; the relationship between I/O and process; realtime...). Finally I had the inspiration to google for "dtrace pagefaults". Bingo.
(dtrace in this example isn't realtime, but is the best approximation so far, and I'm guessing some tuning should fix it. Heck, it's a one-liner!)

Learning dtrace is still something I'd love to do, and once again it is tempting me to let it jump to the front of the queue...

(Mhm, will Oracle's Java for OS X support dtrace too?)

Oh, and of course Firefox was by far the greatest pagefaulter, even with the great improvements in the v9 betas. (I actually saw it diminish its VM usage by some hundreds of MB when it was freshly launched!... though after some days it has gone back to its habit of hovering over 20% CPU and 2 GB VM even while idle)
But it's interesting that the second offender is Skype, even if more than an order of magnitude below Firefox - and also an order of magnitude above the 3rd and the rest of the offenders. Interesting because it looks very well mannered in Activity Monitor, and it was unused during the whole measuring period. Maybe its P2P routing thing...? Would be good to keep an eye on it.