25 July 2008

WTF?! Microsoft Becomes a Platinum ASF Sponsor?

When you were growing up I'm sure you knew some kid who always seemed to have the cruelest intentions. S/he always possessed ulterior motives, trying to better her/his social position using lies and deceit veiled by declarations of pure intentions. Yet, in the end, s/he'd just wind up screwing people yet again, leaving them feeling like fools for wanting to believe that s/he may have actually changed her/his ways. Well, this is kinda what I'm feeling right now, just waiting for the punchline.

I found out about this announcement early this morning, and I'm still perplexed that Microsoft has become a platinum sponsor of the Apache Software Foundation. I certainly trust the folks at the ASF who made this happen, but I can't let go of Microsoft's past FUD against open source and Linux so easily (remember the 'Get the Facts' campaign?). Have the poor behavior and shady tactics surrounding the ISO 'approval' of OOXML as an international standard already been forgotten? This move really smells of a distraction and a method of getting the open source community to assist Microsoft in leveling (and eventually tipping) the playing field in the Windows v. Linux battle. A brilliant move, really - let's pay the folks we're trying to beat so that they will help us beat them.

Of course, I don't discount the fact that someone can turn over a new leaf - and I really would like this to be the case. I just hope that the fox has not been let into the henhouse bearing a bucket of grain.

24 July 2008

A Brief Look at Everything2, Wikipedia, About.com, Freebase, Squidoo, GoogleBase and Google Knol

First there was Everything2, a Perl/MySQL web content-management system for entering, linking, and retrieving information flexibly. I remember first discovering it back when I still read Slashdot sooooooo long ago. Cool idea: less concerned with being an authoritative reference and more about being a sounding board for anyone interested in writing about a given topic. Everything2 was developed by the same guys who developed Slashdot, which is why the Perl/MySQL solution was used. It offers a simple search mechanism for discovering nodes.

Next came Wikipedia, a multilingual, Web-based, free content encyclopedia project whose articles provide links to guide the user to related pages with additional information. Wikipedia was born out of Nupedia, which was an effort to create a new encyclopedia via an elaborate system of peer review that required highly qualified contributors. Wikipedia threw out most of the formal qualification requirements, and peer review became a staple through the high amount of collaboration. Still, Wikipedia is most interested in information that is worthy of notice. Wikipedia also offers a primitive search mechanism for locating information, and it prides itself on the anonymity of its content creators.

About.com provides its own information, managed by its guides: people who have the credentials and experience to back up their knowledge. Guides must also have professional writing experience in their area of expertise. It's a unique set of qualifications, but About.com actually pays its guides to author the content. About.com offers a lot of information, but it is owned by the New York Times Company, which gives it a really commercialized feel. Where Wikipedia definitely has an encyclopedic approach to its information, About.com has a very consumeristic bent to its content.

Then Freebase came about: an open database of the world's information. By drawing information from large open data sets like Wikipedia, MusicBrainz, and the SEC, it contains structured information on many popular topics, like movies, music, people and locations, all reconciled and freely available via its own open API and Metaweb Query Language. Freebase also offers a Javascript template language named the Metaweb Javascript Template Language (MJT), billed as an in-browser web framework.
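To give a feel for the query-by-example style of the Metaweb Query Language, here is a minimal sketch that only builds a query and prints it as JSON. It makes no network call. The type id `/location/citytown` and the `{"query": ...}` envelope are assumptions made for illustration, so check the Freebase documentation before relying on either.

```python
import json

# An MQL query is a JSON template: fill in what you know and leave
# null (None in Python) where you want Freebase to supply the value.
# The type id below is an assumption made for illustration.
query = [{
    "type": "/location/citytown",
    "name": "Boulder",
    "id": None,  # ask for the id of each matching topic
}]

# Assumed envelope: the query wrapped in a "query" key.
envelope = json.dumps({"query": query})
print(envelope)
```

The appeal of this style is that the query has the same shape as the answer: the response is the same structure with the nulls filled in.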

[Interestingly, a company named Powerset developed its own search technology focused on aggregating, summarizing and navigating information, and its first showcase for that technology was a combination of content from Wikipedia and Freebase. (Earlier this month, it was announced that Microsoft acquired Powerset.)]

Another player in this space is Squidoo, offering its own set of information authored by anyone. Squidoo has multiple goals listed on its website: its goal as a platform is to bring the power of recommendation to search; its goal as a co-op is to pay as much money as it can to its lensmasters and to charity; and its goal as a community is to have fun along the way, and meet new ideas and the people behind them. So it pays its authors and charities and promotes having fun authoring content and recommending it.

The first offering from Google was GoogleBase, a way to describe your information so that it is as easy as possible for people to find when they search. In other words, enter your information and make it part of the Google-verse. Oddly, this seems to overlap with Google Knol.

Now comes Google Knol, a system for creating authoritative articles about a specific topic. Knol seems to be somewhere in between Everything2 and Wikipedia but it removes anonymity from the picture by requiring information creators to have a Google account. Knol is yet another way for Google to exploit its business of content-targeted ads.

So how do these information systems differ? IMO they really don't differ much in what they seek to provide; they only differ in their implementation. Each one seems to store its information and make it available in its own unique way. So to compare and contrast each one, I searched for the string 'Boulder, Colorado'. Below are links to the results.

Boulder, Colorado on Everything2 returned a spartan amount of information.

Boulder, Colorado on Wikipedia produced the most information out of any of these systems.

Boulder, Colorado on About.com.

Boulder, Colorado on Freebase.

Boulder, Colorado on Squidoo produced no content.

Boulder, Colorado on GoogleBase produced the results in the same format used by Google Shopping and other Google properties.

Boulder, Colorado on Google Knol rendered nothing.

The bottom line is that all of the sites I mention here are focused on organizing and providing information. The quality of the information must be good, but that's only a contributing factor, IMO. If the quality of the information is good, the real differentiating factor is the value-add around the edges. And right now the biggest value-add seems to be how users are able to discover the information. They all have their own way of attacking the problem.

Wikipedia offers the best content quality hands down. This is surely due to the high amount of collaboration at Wikipedia and the Wikipedia community's ability to police its content so incredibly well. Each entry is typically fairly well-rounded and has been contributed to by multiple people - the wisdom of the crowd at its finest.

Freebase is more interesting to me because it offers an API for accessing the data (and because I'm a software engineer, I'm biased on that front), but Freebase can't hold a candle to the content offered by Wikipedia. Freebase is also not the best at presentation.

Even though Powerset is not a content creator, its ability to aggregate data and present it to users in a more meaningful manner is probably the most compelling, just because it's the most usable by far. But not that many people have even heard of Powerset, and now that it's been swallowed by Microsoft, it may never be heard from again (unless Microsoft leaves it untouched to continue to do what it does).

Google certainly has the most marketing power, and its dominance in so many other web properties gives it a leg up. But its content breadth and depth are sorely lacking currently. Maybe it will catch up over time.

Which one will prevail? Your guess is as good as mine. Competition is a good thing ;-).

16 July 2008

Live From Daryl's House



Last night I flipped on the TV to catch the last 20 minutes of the Conan O'Brien show (IMO, Conan is a comedic genius). As it turned out, Daryl Hall (of Hall and Oates fame in the 1980s) played with KT Tunstall and the performance was really good. As Conan ended, he announced that Daryl has a website where he posts live performances called Live From Daryl's House. So I checked it out immediately and was blown away at the quality of the music.

I really dig live music. I enjoy it because it's an opportunity to see true musicians play raw music without the assistance of studio effects and mixing to perfection before it's heard. I refer to this as an opportunity because I've been disappointed by some musicians who turned out to be really shitty in a live setting. At any rate, Daryl's voice is as good as it's ever been, maybe better, and his band is great. The quality of the music and videos on his website is stellar (despite the damn Flash player being somewhat choppy).

Anyone who enjoyed Hall and Oates back in the early 80s will enjoy Daryl playing some familiar songs from that era as well as some new stuff. What's more, you can download the music for US$0.99 per track!

Eclipse Maven Integration Using m2eclipse

Recently I authored a chapter in Maven: The Definitive Guide. The chapter I wrote is about Maven integration with Eclipse using m2eclipse (m2e), an Eclipse plugin for Maven. I really enjoyed writing this chapter because it allowed me to dig into m2e and understand more about what is really there today. In short, I was blown away!

m2e has come a long way since I last tried it out a couple years ago. Since that time, it has become so feature rich that there's little need for me to use Maven on the command line anymore. Being that I'm a command line freak because of the power and control it offers, I wasn't thrilled about staying in the IDE to build everything. But I commend Eugene and company on the features and polish in m2e today. In fact, I'm already finding it difficult to live without. If you haven't tried out m2eclipse, read through the article and try it out now. Believe me, you'll be very satisfied!

Additionally, we just published an article based on the m2e chapter, titled Introduction to m2eclipse. The article is focused on getting m2e installed and showcasing its major features, and should help folks get started with it quickly. Unfortunately, in the time it took for TheServerSide to publish the article, m2e has gotten more new features and the Maven book chapter has been updated significantly.

BTW, you can also buy a published copy of the Maven book from O'Reilly Media, though I'm not quite sure I'd recommend it for one reason. The updates to the book chapters are coming at such a rapid pace that a printed book will be out of date the day you buy it. Besides, I'm not a huge fan of paper books that simply grow out of date really fast. But if you like a printed book in hand, buy away ;-).

12 July 2008

Integration as a Service

Back at LogicBlaze, we had a product idea for a SOA and messaging appliance with management and monitoring software that could be installed at a customer site. We called this idea LogicBlades because we were talking about using blade servers. I still think this would be a compelling solution for small to medium sized businesses (SMBs) for a lot of reasons. But it would probably require an operations team for offering a service on top of the appliance for monitoring, software updates, etc.

Well, call it a missed opportunity because there are already a bunch of companies in this space including Forum Systems' Forum Sentry, Dajeil XML Acceleration Hardware, Vordel XML and SOA Appliances, IBM WebSphere DataPower SOA Appliances (is every product at IBM somehow linked to WebSphere?!) and Cisco's SONA product line just to name a few.

The even hotter portion of this space involves the inclusion of virtualization with the SOA offering. Vordel is in this space as well, as are Layer 7 Technologies, TIBCO and IBM, with virtualized partitions on servers. Suffice it to say that this market is being attacked.

(I don't have the time to comprehensively research all companies in this space so I'm sure I've missed a few.)

Now, British Telecom is throwing its hat into the ring with a slightly different offering. BT is providing integration as a service with its new managed application and data-integration service in the UK, offered on a 'pay-as-you-grow' basis. The solution consists of a hardware appliance running a hardened Linux with the Sonic ESB and iWay connectors. But the BT product is not installed at the customer site. Instead it's installed in a BT ops facility where BT handles all the management and monitoring. So this solution is really a hosted service instead of something that customers install on-site. From the BT point of view, this is certainly an easier product to manage. Trying to manage remote appliances at customer sites can be an utter nightmare. Still, an on-site solution seems like a larger opportunity if you offer the customer the management and monitoring software for their own use along with training and professional services.

It still seems like there is lots of opportunity in this space, especially for customers who are not willing to bet the farm on big dollar products from big companies and for companies who can innovate further.

11 July 2008

Pandora For the iPhone



I already love Pandora and its concept of radio stations based on artists you already like, but Pandora's Streaming Radio App for iPhone sounds even better and is yet another reason to get an iPhone.

According to the Wired article, you can bookmark songs to your profile, purchase songs from iTunes or even ask Pandora why it chose a particular song. For those that don't know, Pandora was instrumental in creating the Music Genome Project:


A given song is represented by a vector containing approximately 150 genes. Each gene corresponds to a characteristic of the music, for example, gender of lead vocalist, level of distortion on the electric guitar, type of background vocals, etc. Rock and pop songs have 150 genes, rap songs have 350, and jazz songs have approximately 400. Other genres of music, such as world and classical, have 300-500 genes. The system depends on a sufficient number of genes to render useful results. Each gene is assigned a number between 1 and 5, and fractional values are allowed but are limited to half integers.
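To make the vector idea in that description concrete, here is a minimal sketch. It is not Pandora's code. The three-gene songs, the helper names and the use of plain Euclidean distance to pick the closest match are all assumptions made for illustration.

```python
import math

def quantize(value):
    """Clamp a raw score to 1..5 and round it to the nearest half integer."""
    clamped = min(5.0, max(1.0, value))
    return round(clamped * 2) / 2

def distance(song_a, song_b):
    """Euclidean distance between two equal-length gene vectors."""
    if len(song_a) != len(song_b):
        raise ValueError("songs must be scored on the same set of genes")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(song_a, song_b)))

# Three made-up genes per song instead of ~150, just to keep it readable.
seed = [quantize(v) for v in (4.2, 1.0, 3.7)]
candidates = {
    "song_x": [4.0, 1.5, 3.5],  # hypothetical song close to the seed
    "song_y": [1.0, 5.0, 2.0],  # hypothetical song far from the seed
}

# Pick the candidate whose gene vector is nearest to the seed song.
closest = min(candidates, key=lambda name: distance(seed, candidates[name]))
print(closest)
```

With real data, each vector would carry the full set of genes for its genre, which is why songs can only be compared when they are scored on the same genes.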


When I first read about the Music Genome Project, I was immediately intrigued because it's all about the science behind music. This drove me to try out Pandora right away and I've been very happily using it ever since. That was back in 2003, IIRC.

This all culminates in the fact that I don't have an iPhone... yet. So far I've put off purchasing an iPhone because I've heard horror stories about at&t coverage and customer service. Besides, I've been with T-Mobile for years and despite experiencing bad coverage where I live, I get great service almost everywhere else and especially in the EU without changing anything on my phone. So I still like the service. And there's always the possibility of still using the iPhone with a provider other than at&t ;-).

I've also been waiting for the 3G iPhone to be released, which is happening today all over the world with major fanfare. It looks like Apple has yet another major hit on its hands.

Update: It looks like the 3G iPhone release has been plagued with issues on the iTunes server-side, though there doesn't seem to be much detail about the issue yet, especially regarding a solution. :-(

Dead Simple JMS

What if working with JMS were as easy as working with a filesystem? JMS queues would then be as familiar as files on disk, and you could work with them using all the tools you're already comfortable with. Well, it appears that Adam Turnbull has made this a reality.

Adam made use of FUSE (Filesystem in Userspace) and its cousin FUSE-J to map JMS queues to the filesystem. He's asking if anyone would be interested in him open sourcing it. I'd definitely like to get my hands on it for experimentation and use with Apache ActiveMQ, and to try it out with MacFUSE. Whatta ya say, Adam?

As for the name, my suggestion is the title of this blog entry.
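To show why the queue-as-directory idea is appealing, here is a rough sketch that fakes it with an ordinary directory. This is not Adam's implementation, and it involves no FUSE and no JMS at all. The `send` and `receive` helpers and the numbered file names are hypothetical, chosen only to show what "send is a file write, receive is a read plus delete" could feel like.

```python
import os
import tempfile

def send(queue_dir, body):
    """'Send' a message by writing a new numbered file into the queue directory."""
    os.makedirs(queue_dir, exist_ok=True)
    existing = [int(name) for name in os.listdir(queue_dir)]
    next_id = max(existing, default=-1) + 1
    # Zero-padded names keep alphabetical order equal to arrival order.
    with open(os.path.join(queue_dir, f"{next_id:08d}"), "w") as f:
        f.write(body)

def receive(queue_dir):
    """'Receive' the oldest message: read the first file, then delete it."""
    names = sorted(os.listdir(queue_dir))
    if not names:
        return None  # queue is empty
    path = os.path.join(queue_dir, names[0])
    with open(path) as f:
        body = f.read()
    os.remove(path)
    return body

# A throwaway directory stands in for the mounted queue.
orders = os.path.join(tempfile.mkdtemp(), "queues", "orders")
send(orders, "first")
send(orders, "second")
print(receive(orders))
print(receive(orders))
print(receive(orders))
```

With a real FUSE mount, the same effect could come from everyday tools like `ls`, `cat` and `rm` instead of helper functions, which is the whole attraction.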

09 July 2008

Unable to Access GMail This Morning



This morning when attempting to log into GMail I was met with the screenshot you see here, telling me via the old HTTP 502 status code that an upstream system is having issues.

Interestingly, I seem to be experiencing more and more issues with the Ajaxy stuff in GMail recently, such as inability to access the inbox or a message, inability to refresh, inability to log into chat, etc. These issues always seem to manifest themselves via lots of loooooooooooooong pauses and the yellow status messages at the top of the pages stating 'Loading...' More recently I've taken to logging out of GMail and signing back in, and the problems are almost always gone immediately. This tells me that there must be a lot of caching in the GMail web app or long timeouts or both.

Is anyone else seeing this error or experiencing problems such as these?