Archive for the ‘Tools’ category

Why should Apple provide Java at all?

October 31st, 2007

So Long Apple. The Party’s Over

JavaLobby.org has a pretty long thread on this recent blogosphere topic. Apple did not ship a Java 6 with Leopard. Everyone is up in arms. One of the questions I haven’t seen addressed anywhere is why it’s Apple’s job to spend time putting together a Java runtime for the Mac. Why is this the expectation? I really don’t know. Did Sun and Apple agree to some mechanism for Apple to build JVMs for the Mac? Didn’t Sun learn anything from the MS JVM situation many years ago? The path each party took may be different, but the net result is the same: current Java technology not running on a major platform to Sun’s specs and the community’s desires.

Perhaps with the openjdk project someone will be able to build a usable Java 6 for the Mac (although the openjdk project is only for Java 7 and onward, I think). I wrote a bit more on my initial Apple/Java reaction in “Choose Apple In the Enterprise – Get Screwed”, but I don’t think I asked the same question there as I’m asking here. Why doesn’t Sun spend the time making their tech work as they expect on all the major platforms? While we’re at it, why not a 64-bit browser plugin for applets? If they can’t even be bothered to do this, then completely remove applet support from future versions of the Java stack. I don’t think we’re moving away from 64-bit support in the future, so why keep clinging to useless technology that Sun won’t update?

UPDATE – another great post on the subject over at ‘thinking in java‘.



Mandriva upgrade – ouch

October 12th, 2007

Actually, the ‘upgrade’ wasn’t an upgrade so much as a switch back.  I’ve been on ubuntu from dapper through feisty, and switched over to mandriva 2008 yesterday.  I thought I’d backed everything up, but I’d forgotten a few things.

  • Everything in /var

doesn’t sound like much, except that’s where my development php code was, as well as my mysql databases.  *Most* of it was not that useful – demo code, etc.  But, I did have my work development blog there, so I’ve just lost 2 months of posts.  OUCH!  :(

Open Source Risk Mitigation

October 12th, 2007

I’ve been with Open Source Risk Management for about 2 months now, and it’s been quite interesting so far.  The issues and risks that the integration of open source code raises, and how different companies respond to these risks, are probably the crux of the interesting stuff, at least for me.  It doesn’t seem to be the companies with the most at stake who are the most demanding about the audit processes we do, either, which surprised me a bit.  One of the things I do in my role is to talk with technical people about their code – how things link together, where certain bits came from, and so on.

Much of what we’ve been doing so far has been reviews of code before an acquisition, or before a product launch.  The awareness of the risks is a relatively new thing, and we’re (as in the industry) still mostly dealing with the issues after the code is written.  Some new technologies are aiming to help developers address the problem during the coding phase itself, by flagging suspicious code in the developer’s IDE, or by offering libraries of pre-approved code available for integration.

I’ll throw this out there for all of you: do you view integrating open source into your applications or products as risky?  Does the new GPL3 make any difference to that view?  How do you go about keeping track of what components you’re using in your projects and ensuring licensing compliance?

Amazing firefox plugin – useful for researchers

September 19th, 2007

I just stumbled on Zotero, a fantastic firefox plugin for archiving, annotating and searching stuff you find on the web.  There’s very little I can say about it that they don’t say better on their site.  I’ve only been using it today, but it’s simply amazing.  I’ve been looking for something like this for a long time.  I’ve tried using the google notebook, but it’s always so slow to load up while it connects to the server.  I like that the google notebook lets me share stuff live, but I’m ending up not using that aspect much.  I think Zotero is going to be much more useful in my everyday browsing and clipping.  The exporting in RDF is just icing on the cake.

First week on the job

August 27th, 2007

So, I’ve now been at my new workplace (opensourceriskmanagement) for a week.  Well, not quite.  I was only *there* (in Durham) for 2.5 days last week.  Since last Wednesday evening, I’ve been in San Jose, CA, attending meetings with current and prospective clients, and doing code audits.  My Black Duck experience today was less than 100%, in that the server environment wasn’t set up for my code scan, the server crashed twice (necessitating reboots both times), and I really didn’t get going until 2:30.  I’m going to have to stay another night here – so I will have been away from home for 8 days during my first two weeks!

I don’t want to be doing this every week, but I do have to say it’s been a rather exciting first week.  I wish I could say more, but I can’t.  This is probably the most frustrating thing about the job – I’m meeting some interesting and powerful people, and learning about some very interesting behind-the-scenes things going on at very large companies, but I can’t *write* anything about any of it.

I’m used to writing about issues at work, but that’ll need to stop, at least with respect to naming names.  To the extent that I’m able to share some of my more interesting experiences, I will, but I don’t expect to be able to do that too much, at least to start with.

Overall, there’s quite a lot of potential here, though, which I’m very excited about.  I’ll also add that if you’re at a company that uses open source software in your products, and you’d like to have some expert auditing and assessment of your risks and compliance with open source licensing obligations, drop me a line.  :)

DRM is good and necessary

August 12th, 2007

for the social web to evolve to the next level.  Is that at all controversial?  I hope at least the title is, as I’d like to provoke a bit of thought in you, the reader, about the topic of DRM.

I’ve been mulling this and related topics for some time, but not quite in these words.  This morning the connection between what I’ve been thinking about and what’s commonly known as “DRM” jumped out at me, and I wanted to elaborate a bit more.  This is intended both to help me flesh out my thinking on this as well as perhaps get some feedback from the community.

I’ve always been afraid to put too much online, especially in this blog.  Once I started publishing anything online, I was very, perhaps overly, aware of the possibility of anyone reading it.  Issues like looking for a new job were things that I couldn’t write about because my coworkers might read about it.  Financial issues were not something I could write about because, well, they tend to be somewhat personal.  Family and health issues were also pretty much off the table.  While I would have benefited from writing about each topic, writing them all with the ‘same’ identity would have made too much information about me available to too many people.

Keeping separate blogs with different identities is one way of coping with this multiple identity issue.  Using separate user accounts and participating in different forums is another way.  Both have their drawbacks – the complexity and confusion of having to use multiple systems are primary concerns, but I’m sure you can think of some other wrinkles in there as well.

This got me to thinking about the control we have over our own content on the internet.  The current model is that end users contribute actual content – text, images, video, etc. – to discrete servers under our chosen identities.  These central services act as aggregators of the content.  Once something is out there, it’s out there.  There are certain barriers which can be put up which will prevent people from accessing some of that content – forums can be closed or access-limited, for example.  We’ve still no good way to create content and control its distribution at a granular level, nor any way to revoke content once it’s been published.

I realize many people will continue to have this view that “it’s the internet, if you publish it, it’s out there forever”.  Google’s cache, archive.org and other developments have ingrained this “write once, live with it forever” attitude in an entire generation of people.  I’m not suggesting that those services are a bad thing, or that the concept of content being around “forever” is necessarily bad either.  I *am* suggesting that some information shouldn’t fall under that umbrella – content has different meaning based on who is writing it, who the intended audience is, who the actual audience is, and so on.   I am also suggesting that the concept of centralized ‘one time’ publishing and archiving of information is something which is having a suppressing effect on the amount of content created, shared and consumed on the internet.

What are some of the controls that we can exert over our information as it’s published right now?  Consider a tech geek who runs their own blog or community on their own server.  This is someone who embodies all that is possible in terms of ‘control’ over their own information on the internet.  This person can choose to make their information available to the public at large, or only to a select group of people, via registration/invitation.  If the information is open to the public at large, a ‘robots.txt’ file is available to let well-behaved search engine crawlers know what they can index (ignoring the non-well-behaved for this discussion).  Once it’s indexed, our hero has a devil of a time getting it ‘unindexed’.  Google has an ‘immediate’ page removal tool, but that is something which still operates on pages.  You need to serve up a 404 page for the googlebot, but keep the page ‘open’ to the rest of your visitors if your intention was to truly ‘unindex’ the URL, rather than remove it.   How or if other search crawlers offer these sorts of services is beyond the scope of this post.  The point I’m trying to make is that it’s rather difficult and complicated, and that’s for people who have control over their entire publishing mechanism.
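To make the ‘robots.txt’ part concrete, here’s a minimal sketch of how a well-behaved crawler might decide whether it may index a URL. It uses Python’s standard urllib.robotparser module; the rules, the example.com URLs and the crawler_may_fetch helper are my own illustrative assumptions, not anything a particular search engine is known to run.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a personal blog: everything is indexable
# except the /private/ section.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def crawler_may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/blog/post",
                "http://example.com/private/draft"):
        print(url, crawler_may_fetch(ROBOTS_TXT, "Googlebot", url))
```

Note that this only covers the polite case: the file is a request, and nothing here stops a crawler that chooses to ignore it, nor does it remove anything already indexed.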

For people who simply post in hosted content services (blogs, forums, etc.) the control over content is extremely limited.  That’s been the nature of the beast so far, and it’s worked reasonably well, but there seems to be quite a lot lacking in my own ability to control what I’ve said and where it’s been republished/syndicated/etc.  Perhaps the ‘what I’ve said’ issue shouldn’t be able to be modified.  After all, even in the real world, rarely do we let people go back and revise their content (excepting George Lucas’ ability to revise  “Star Wars” ad infinitum).   But who the content gets distributed to, and perhaps how much of that content they receive, is something we’ve had more experience with over the past several years, primarily in the music and movie arena.

The notion of DRM – Digital Rights Management – software controlling what you can and can’t do with something received (usually purchased) isn’t really all that new.  Back in my day, C64 disks were ‘copy protected’.  If you used the product as intended, it worked.  If you tried to use a generic disk backup utility, the drive knocked about (and could break) because the publisher had modified the disk format such that ‘ordinary’ utilities couldn’t read the disk contents, which would prevent copies.  Mr. Nibble got around this by writing new disk copy programs which bypassed that built-in reading, and then publishers pushed back with even harder-to-crack protection.  This arms race eventually subsided, and copy protection, at least at the hardware level, seemed to fade for a while.

But it’s come back with a vengeance, and the stakes are much higher.  Copy protection – DRM – is a basic part of how most music and videos are distributed.  The software players will decode the bits and give you the music only if conditions embedded in the music directly ‘allow’ the player to do so.  Have you paid your license this month for your Yahoo! music subscription?  If not, your player won’t play.  Time-limited DRM is big with Yahoo and Microsoft, who offer ‘all you can eat’ subscription pricing.  Apple’s DRM is not time-sensitive, but hardware-sensitive.  Your purchased tracks can only be transferred to X number of computers, and you can only burn a track collection Y number of times.  These limits are high enough that most people aren’t affected with average use, just like the monthly pricing is set low enough to not be a burden to most people.  But the concept is still in there – the content owner still has a say in how you use the content, and they have technical means to prevent you from taking certain actions.

Contrast this with content you create and publish on the web in the form of images, music, videos and text.  The average user has no control over how their information is used once it’s “out there”.  Yes, we have copyright laws, but tracking down violators and enforcing the laws is often not worth the effort, mostly because the effort is so time consuming.

There’s been a move to incorporating restrictions in content creation tools, albeit at a somewhat coarse level, in networks like facebook.   Facebook has the idea of controlling which pieces of information are shown to specific sets of people (‘my friends’, ‘my groups’, etc.).  While this idea is a step in the right direction, it’s nowhere near as fine-grained as it should or could be.

As I’ve been writing this entry, I’ve stopped a few times (errands to run and such), and already my thinking has changed a bit since this morning’s view.  What I’m now envisioning is content creation that would allow marking up various segments of the content with permission levels.  Delivery of content can be handled much as most web content is delivered today.  When served up by the server, an authenticated user would get access to “extra” layers in the content.
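Here’s a rough sketch of what I mean by marking up segments with permission levels. Everything in it is hypothetical: the Segment type, the numeric levels (0 for the public, higher numbers for more trusted readers) and the render_for function are names made up for illustration, not an existing tool or format.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Segment:
    """One piece of a post plus the minimum level needed to see it."""
    text: str
    min_level: int = 0  # assumption: 0 = public, higher = more trusted


def render_for(segments: Iterable[Segment], viewer_level: int) -> str:
    """Join only the segments the viewer's level entitles them to read."""
    return " ".join(s.text for s in segments if s.min_level <= viewer_level)


# A single post written once, with three hypothetical layers.
POST = [
    Segment("Started a new job this week."),                # everyone
    Segment("The commute is rough.", min_level=1),          # friends
    Segment("I am already interviewing elsewhere.", 2),     # close family
]

if __name__ == "__main__":
    for level in (0, 1, 2):
        print(level, "->", render_for(POST, level))
```

An anonymous reader (level 0) gets just the first sentence, while an authenticated reader at level 2 gets all three, from the same stored post.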

This seems similar to the old RealPlayer idea of a stream being created once, but having multiple levels of quality built into it – the player and server negotiated the level of quality, and the server would serve up the higher quality sections of the file if the player could handle it.  If not, the lower quality portions of the file were streamed down. This wouldn’t necessarily work in a world where people access most data directly (or, with only one layer of software in between – the general purpose browser).  My scheme would require an extra or different layer of software to request the content with the necessary authentication protocol in place.  I’m envisioning this being handled more between agents on behalf of users – perhaps the next generation of RSS readers with identity management built in.  Ideally the software would also respect caching and timeout headers, to help deal with ‘clearing’ out of content which the original author no longer wants around.  I completely understand that something like this depends on the receiving software honoring that sort of request, and it could just as easily ignore it.  Once you have the content, you have the content, right?  While technically true, our general web browsers have the notion of content caching built in, and we don’t generally worry about that too much.  Nothing will give total control, but a decent balance between the wishes of the author and the desires of the consumer would be more closely achieved with this sort of approach.

So, after another half hour or so away from this, this idea is turning into more of a wish for three things:

  • Multi-layered content creation tools which respect identity levels
  • Identity authentication and negotiation at the content serving level
  • Identity management and negotiation at the content consuming level (RSS readers would be a good start)

OK, so it’s not *necessary*, but would certainly be useful. For the identity negotiation aspect to work, I’m thinking that the openid project has a good approach, and incorporating that openid practice would be a good direction to head in.

When an agent requests a piece of content, the server response can include embedded information which indicates a more complete version is available, with links to request the more complete version(s).  Any request for this information would require authentication (via openid).  During this authentication process, if the user/agent is unknown, the original author would be notified of a pending request and the requester’s information, and given the option to grant access to the information or not.
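The request flow above could be sketched roughly like this. It is only a toy model under some loud assumptions: the identity string is taken to have already been verified elsewhere (the openid handshake itself is not shown), and LayeredPost, fetch_public, fetch_full and grant are hypothetical names invented for this sketch, not part of any openid library.

```python
from dataclasses import dataclass, field


@dataclass
class LayeredPost:
    """A post with a public version and a fuller, access-controlled one."""
    public_text: str
    full_text: str
    approved: set = field(default_factory=set)   # identities the author trusts
    pending: list = field(default_factory=list)  # requests awaiting the author

    def fetch_public(self) -> dict:
        # The public response advertises that more is available.
        return {"body": self.public_text, "fuller_version_available": True}

    def fetch_full(self, identity: str) -> dict:
        # Assumption: `identity` was already authenticated before this call.
        if identity in self.approved:
            return {"status": "ok", "body": self.full_text}
        if identity not in self.pending:
            self.pending.append(identity)  # author gets notified of this
        return {"status": "pending", "body": self.public_text}

    def grant(self, identity: str) -> None:
        # The author approves a requester; clear any pending entry.
        if identity in self.pending:
            self.pending.remove(identity)
        self.approved.add(identity)


if __name__ == "__main__":
    post = LayeredPost("Short version.", "Short version plus the private bits.")
    print(post.fetch_full("https://alice.example/"))  # unknown -> pending
    post.grant("https://alice.example/")
    print(post.fetch_full("https://alice.example/"))  # now approved
```

An unknown requester still gets the public text while their request waits for the author, which keeps the simple case as simple as it is today.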

As I explore this more, I’m more conflicted.  On one hand, it sounds plausible, and possibly doable were this to be integrated into some key communication tools (facebook, wordpress, myspace, etc.).  However, it’s complex.  It’s complex to implement and complex to think about.  Complexity rarely wins out over the simple on the internet.  In other ways it may be a solution in search of a problem.  Well, *I’ve* found it a problem – content creation and distribution with different sections of content intended for different audiences.  Has anyone else found the problem of multiple identities and multiple audiences to be enough of a problem to contemplate these sorts of measures?  Or am I just barking up the wrong tree?  Or just simply barking, as my wife suggests?

OSCON 07 – windmill testing

July 26th, 2007

This is one I wish I’d just recorded straight from the board.  This testing framework looks pretty awesome, and it may have a big impact on how people do AJAX/web testing in the coming months.  While it’s been ‘out’ for a while, this was basically a public launch here at OSCON.  I’m going to see if these guys will do a webdevradio interview with me sometime today.  Great looking system, with apparently a lot of power under the hood.  http://windmill.osafoundation.org is where you can learn more about it.  Have a look!

Interview with Prashant Deva

July 14th, 2007

I had a quick interview with Prashant Deva from Placid Systems, talking about the upcoming Virtual Ant product.  Hopefully I will have this up on webdevradio.com in the next week.  It was about 15 minutes, with a couple fluff/testing minutes at the beginning  :)   Prashant was a pretty cool guy, and I wish we’d had a bit more time to delve into the project via a screencast.  This is the issue with audio-only podcasts – I may have to start screencasting at some point – products like Virtual Ant really lend themselves to visual walkthroughs.  There are some pretty cool ideas in Virtual Ant, and I hope the product catches on quickly.

SOLR presentation dry run

June 14th, 2007

I gave my first public run through of my SOLR presentation to the tripug group last night. Whew! It was pretty darn rough. My first runs through had timed out at only 10-12 minutes. I was really worried about how I’d fill the time. What happened, however, was that I ended up adlibbing (not good!) and answering questions (good) and getting a bit lost (not good!).  A few things I learned:

  • Have web-accessible version of presentation.  If anything goes wrong, having at least a web-based version (s5 perhaps?) with screenshots of the code is a saving grace.
  • Have more code ready.  I know this, but haven’t finished all my code samples yet.
  • Focus!  I tried to cover too much in my first run.  While I don’t want to come across as a ‘beginner tutorial’ sort of presentation, there are going to have to be a lot of things I forgo detailing during a 30 minute speech.

I’ve got to finish my printed materials by June 25, so I’ve only got a bit over a week to wrap this up.  Such pressure!

Group generated content

June 2nd, 2007

UGC – User Generated Content – has been all the rage the last year or so.  Actually, it’s been the rage since probably 1996 or 97 or whenever GeoCities made it easier to have URLs with the word “TheTropics” in them with homages to pets rendered on animated rainbow GIF backgrounds.  However, big media and web 2.0 types have gotten around to naming “user generated content” and VC money has started to flow to startups able to ‘monetize’ UGC.

What I haven’t seen much of yet is GROUP generated content (GGC).  In some sense, UGC done over time by multiple people might be considered GGC – forum threads and such might fall under this name.  However, I’m thinking of more ‘real time’ collaborative efforts – live whiteboarding sessions recorded by a group of people might be considered GGC. Conference calls – these things that happen millions of times per day in businesses big and small – are really GGC, but generally there are no artifacts left behind to consider ‘content’.  The phone call and all the content and ideas it represented are just gone.  Group podcasts would also likely fall under GGC, at least in my view.

Will there be a push to capitalize on GGC like there was UGC? Will there be tools coming out to help ease the creation of GGC?