I'm a missionary in Japan. The name of my mission agency is WEC International. That's supposedly Worldwide Evangelisation for Christ, but I think I have a better idea about what it stands for...
2005-10-06
I am going to rant about email
There's a lot of hype about Zimbra at the moment. Other people hype about Gmail.
Both of these tools are crap. Crap, crap, crap.
Sorry to put it like that, but it's true. We were busy building something much, much better, but, well, various things happened. I realised I couldn't be a professional programmer because I have too much pride within me, and I became a missionary, and the rest of the company realised they should be working on things that made money instead of cool research tools.
Of course, it's easy to see this now, with the benefit of hindsight. The big bonus that these two tools have, over the one I was working on, is availability. You can sign up for them. You can't sign up for mine, because it didn't get there. It got to internal beta, and I can tell you about how cool it was, but you can never know it for yourself, so you have to trust me.
This has downsides and upsides. For me, the upside is that I can tell you about all kinds of things that our product did, and you have no idea whether or not I'm telling you the truth. The downside is that I know how the game ought to be played, and I'm still very disappointed that nobody's playing it well enough yet. Dammit, there ought to have been programmers smarter than me and my team playing with this idea. But the evidence suggests that there weren't.
Let's take a for-instance. I played with the Zimbra demo. It's very cool. It does many of the things we were trying to achieve. It's insanely slow, but then computers and Internet links will steadily get faster, so maybe that's not a problem.
What is a problem is that it doesn't index what it knows. For instance, there's an email in the demo user's mailbox which refers to a physical location. This is great. This is exactly what our project was all about. You refer to a physical location, and you get a map of that location. Excellent. Ten out of ten. But then, it goes horribly wrong.
Does it give you adverts related to that location? No. Does it tell you about other emails in the vicinity of that location? No. Here's what we wanted: Someone's invited you to San Francisco for the weekend, so your mail client shows you a list of hotels in SF, and a list of all the people you've been in contact with from SF, so you can arrange to meet them while you're there. That's what any traveller wants. Is it what Zimbra offers? Um.
Sorry if I'm picking on locations, but that's one area where you can do very cool things, and we had the plan in place, but Zimbra dropped the ball. And that should be a barometer of how likely Zimbra is to take on other interesting areas and do interesting things with them.
Bluntly, what Zimbra did didn't work right in that area. There are many other areas, such as tracking a meme through various conversations, automatically providing a summary of threads, and all sorts of other stuff. No Zimbra there, either. Computers are supposed to do the hard stuff, but so often we let them get away with doing the minimal amount of stuff to look cool, and ignore the fact that they aren't doing useful things for us instead.
So that was Zimbra. It could have been useful, but it was slow and unintelligent. What's next?
Gmail, eh? Gmail doesn't do a hundredth of what we were doing, although again it has the bonus of being there right now. But Gmail supporters, I have three words for you: "search within attachments". There's no reason why Gmail shouldn't do this, and yet it doesn't.
Google has so much metadata at its disposal. It should be able to act as the surrogate brain, but it can't. So someone sent me the list of this year's new students, but they sent it to me as an Excel document. I want to find it by searching for the name of one of the new students. Can Gmail help me here? Can it hell.
Maybe this is because both Zimbra and Google try to be the surrogate brain but at the same time they don't understand how the surrogate brain ought to work. When you remember things about the mail you're trying to find, it's often random things, and therefore you should index everything you possibly can to help the poor human construct his search terms, because he's going to be looking for random features, not just features which are useful to computers.
Here's a real-life example: it was some French guy writing to me about a Perl module, with a patch. I should be able - either directly or through a wizard - to turn that into "locale:fr +perl attachment:patch". Can you do that in Gmail, or even in Zimbra?
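A minimal sketch of how a query like that could be run, assuming a toy in-memory index. The record fields (`locale`, `attachment`, `words`) and the helper names `parse_query`, `matches` and `search` are invented here for illustration; none of this is code from the unreleased client.

```perl
use strict;
use warnings;

# Hypothetical index: one record per message, holding every feature we
# managed to infer (sender locale, attachment kinds, body words).
my @index = (
    { id => 1, locale => 'fr', attachment => ['patch'], words => [qw(perl module fix)] },
    { id => 2, locale => 'en', attachment => ['xls'],   words => [qw(new students list)] },
    { id => 3, locale => 'fr', attachment => [],        words => [qw(perl question)] },
);

# Split a query such as "locale:fr +perl attachment:patch" into
# field:value constraints and bare (optionally "+"-prefixed) words.
sub parse_query {
    my ($query) = @_;
    my (%fields, @words);
    for my $term (split ' ', $query) {
        if ($term =~ /^(\w+):(\S+)$/) {
            my ($field, $value) = ($1, lc $2);
            push @{ $fields{$field} }, $value;
        }
        else {
            $term =~ s/^\+//;
            push @words, lc $term;
        }
    }
    return (\%fields, \@words);
}

# A message matches only when every constraint in the query is satisfied.
sub matches {
    my ($msg, $fields, $words) = @_;
    for my $field (keys %$fields) {
        my $have = $msg->{$field};
        my @have = ref($have) ? @$have : defined($have) ? ($have) : ();
        for my $want (@{ $fields->{$field} }) {
            return 0 unless grep { $_ eq $want } @have;
        }
    }
    for my $word (@$words) {
        return 0 unless grep { $_ eq $word } @{ $msg->{words} };
    }
    return 1;
}

# Return the ids of all indexed messages matching the query.
sub search {
    my ($query) = @_;
    my ($fields, $words) = parse_query($query);
    return map { $_->{id} } grep { matches($_, $fields, $words) } @index;
}

print join(', ', search('locale:fr +perl attachment:patch')), "\n";   # prints "1"
```

The query parsing is the easy part. The hard part is filling the index with those "random features" in the first place, which this sketch simply assumes has already happened.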
That's what we did. We wrote fast, intelligent mail clients which tried to understand the way you think about things, and tried to index mail in just the way you think about mail. We tried to do all sorts of inference from the mail to understand more about the sender, or the recipient, or the subject. And we never got to market.
This is painful for me. I want to say "I told you so", but I don't have any code to show for it, which undermines me completely. So since I can't really say it, I can only suggest to these projects how they can improve. In some ways, I want them to improve, so that there can be a publicly available, mass-market way to handle email in a cool and froody way; in other ways, of course, I don't, because I (and the rest of my team) thought of it first, dammit!
But for now, the most important thing to take away from this is that neither Gmail nor Zimbra is it, whatever it might become. There's something better, and I don't know where it's coming from, but I hope it's coming soon. Not just for you guys, but for me as well!
2005-09-28
SpamMonkey considered "Good Enough"
So a few days ago I finished the first plugin for SpamMonkey, SpamMonkey::Test::check_uridnsbl. SpamAssassin people already know what that does - it looks up the URIs in a message in a DNS blacklist.
This is basically what I wanted to stop comment spam on the blog.
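For anyone who isn't a SpamAssassin person, here is a rough sketch of the general URI blacklist technique. It is not the plugin's actual code: the function names `uri_hosts` and `listed_hosts` and the zone `uribl.example` are made up for illustration, and real blacklists usually expect the registered domain rather than the full host name, which this sketch ignores.

```perl
use strict;
use warnings;

# Pull the host names out of any http/https URIs in a piece of text,
# lower-cased and de-duplicated, in order of first appearance.
sub uri_hosts {
    my ($text) = @_;
    my (%seen, @hosts);
    while ($text =~ m{https?://([A-Za-z0-9.-]+)}gi) {
        my $host = lc $1;
        $host =~ s/\.+\z//;    # drop a trailing dot picked up from prose
        push @hosts, $host unless $seen{$host}++;
    }
    return @hosts;
}

# Check each host against a DNS blacklist zone. The convention is that a
# listed host resolves when queried as "<host>.<zone>" and an unlisted
# one does not. $resolve is a code ref so the lookup can be swapped out;
# the default falls back to the core gethostbyname.
sub listed_hosts {
    my ($text, $zone, $resolve) = @_;
    $resolve ||= sub {
        my $addr = gethostbyname($_[0]);
        return defined $addr;
    };
    return grep { $resolve->("$_.$zone") } uri_hosts($text);
}

# Usage with a fake resolver, so nothing touches the network.
my %fake_zone = ('cheap-pills.example.uribl.example' => 1);
my @bad = listed_hosts(
    'Buy now at http://Cheap-Pills.example/offer or see http://perl.org/',
    'uribl.example',
    sub { $fake_zone{ $_[0] } },
);
print "listed: @bad\n";    # prints "listed: cheap-pills.example"
```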
And as of now, SpamMonkey is doing what I intended it to do - it's a SpamAssassin clone which can filter other types of spam, such as comment spam. It's in operation now on this very Bryar blog. The code to make it connect to Bryar was very simple:
    use SpamMonkey;
    my $sm = SpamMonkey->new;
    $sm->ready();
    # Run the same rules over the submitted comment text
    my $res = $sm->test($params{content});
    if ($res->is_spam) {
        $self->report_error("I think you're a spammer, because your comment: ".
            join("\n", $res->describe_hits));
    }
That should deal with the spammers... for the moment.
2005-09-07
SpamMonkey
Yeah, yeah, I'm not programming... but I wrote something today called SpamMonkey, which is a very, very cut-down, ground-up reimplementation of SpamAssassin. I said I was going to do it, and I did. (I was going to call it Barcelos, or Spam<something else>, but that's another story...)
Specifically it lacks:
- Meta rules (because I can't be bothered to implement the logic in them yet)
- Conditional compilation of rules (because I haven't worked out how to emulate the plugin infrastructure yet)
- Any of the eval tests (because they're all really ugly).
Unfortunately that last one means there are no DNS BL checks, which I really do need to fix, and there's no Bayes yet. I might fix that.
OK, so why? First, because the SA code is somewhat baroque, to say the least. To be fair, most of the complexity is justified, for reasons of optimization or having to handle pathological (MIME) messages. But there's not really any reason to implement your own MIME parser in this day and age. I've already done that.
Second, because I want to be able to use it to test things that aren't mail (like, say, blog comments) for spamminess using the same ruleset. And since this reads SA config files, it does indeed use the same ruleset.
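To make the "same ruleset, different input" idea concrete, here is a small sketch that reads a few SpamAssassin-style `body`, `score` and `describe` lines and applies them to arbitrary text. It is an illustration under stated assumptions, not SpamMonkey's implementation: `parse_rules` and `check_text` are hypothetical names, the rules are made up, and only the simplest `/regex/` and `/regex/i` forms are handled.

```perl
use strict;
use warnings;

# Parse the small subset of SpamAssassin-style config this sketch
# understands:
#   body     NAME /regex/   (optionally followed by the "i" flag)
#   score    NAME number
#   describe NAME free text
sub parse_rules {
    my ($config) = @_;
    my %rules;
    for my $line (split /\n/, $config) {
        next if $line =~ /^\s*(?:#|$)/;    # skip comments and blank lines
        if ($line =~ m{^\s*body\s+(\w+)\s+/(.*)/(i?)\s*$}) {
            my ($name, $re, $flag) = ($1, $2, $3);
            $rules{$name}{re} = $flag ? qr/$re/i : qr/$re/;
        }
        elsif ($line =~ /^\s*score\s+(\w+)\s+(-?[\d.]+)/) {
            $rules{$1}{score} = $2;
        }
        elsif ($line =~ /^\s*describe\s+(\w+)\s+(.+)/) {
            $rules{$1}{describe} = $2;
        }
    }
    return \%rules;
}

# Run every body rule over a piece of text (mail or not) and return the
# total score followed by a description of each rule that hit.
sub check_text {
    my ($rules, $text) = @_;
    my ($total, @hits) = (0);
    for my $name (sort keys %$rules) {
        my $rule = $rules->{$name};
        next unless $rule->{re} && $text =~ $rule->{re};
        $total += $rule->{score} // 1;    # assume 1 when no score line is given
        push @hits, $rule->{describe} // $name;
    }
    return ($total, @hits);
}

my $rules = parse_rules(<<'END');
# made-up rules for illustration
body     DEMO_PILLS  /cheap\s+pills/i
score    DEMO_PILLS  3.5
describe DEMO_PILLS  Mentions cheap pills
body     DEMO_CASINO /online casino/i
score    DEMO_CASINO 2.0
END

my ($score, @hits) = check_text($rules, 'Great post! Cheap  pills here.');
print "score $score: @hits\n";    # prints "score 3.5: Mentions cheap pills"
```

Nothing in `check_text` cares whether the text came from a mail body or a comment form, which is the whole point.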
I might release this in a few days, but I have this horrible premonition that everyone will miss the point and complain that it's just like SpamAssassin but it doesn't do X, Y, and Z. What do you think?