Twilio – phone app proxy service

I had no idea how to describe Twilio, so “phone app proxy service” was the best I could come up with. It’s probably nowhere close, so let me explain what Twilio does.

Register for a Twilio account, and you’ll get a phone number. When someone calls that number, Twilio’s server hits *your* server; specifically, a unique URL you give Twilio. Your URL returns some specific XML which instructs Twilio to play a message, say some text, or ask the caller for some input (key or voice). Twilio then sends the caller’s input back to your URL, which you can parse before sending back more XML to tell Twilio which step to take next.
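To make that round trip a bit more concrete, here is a rough sketch in Python of the two responses a server might send. The element names (Response, Say, Gather, Redirect) and the "Digits" form field follow Twilio's XML format as I understand it, so check their documentation before relying on them. The example.com URLs, the menu wording and the function names are made up for illustration.

```python
from urllib.parse import parse_qs
from xml.etree.ElementTree import Element, SubElement, tostring

# Hypothetical URLs on "your" server; Twilio would be configured to call MENU_URL.
MENU_URL = "https://example.com/menu"
KEY_URL = "https://example.com/handle-key"


def build_menu() -> str:
    """First response: speak a prompt and wait for a single key press."""
    response = Element("Response")
    gather = SubElement(response, "Gather", numDigits="1", action=KEY_URL, method="POST")
    SubElement(gather, "Say").text = "Press 1 for opening hours, or 2 for directions."
    # Reached only if the caller presses nothing before the gather times out.
    SubElement(response, "Say").text = "We did not receive any input. Goodbye."
    return tostring(response, encoding="unicode")


def handle_key(form_body: str) -> str:
    """Second response: parse the posted form body and pick the next step."""
    digits = parse_qs(form_body).get("Digits", [""])[0]
    response = Element("Response")
    if digits == "1":
        SubElement(response, "Say").text = "We are open nine to five."
    elif digits == "2":
        SubElement(response, "Say").text = "We are next to the train station."
    else:
        # Unknown or missing key: apologise and send the caller back to the menu.
        SubElement(response, "Say").text = "Sorry, that is not a valid choice."
        SubElement(response, "Redirect", method="POST").text = MENU_URL
    return tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(build_menu())
    print(handle_key("Digits=1"))
```

In a real app these two functions would sit behind whatever web framework you already use; the point is only that each step of the call is one HTTP request in and one small XML document out.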

That’s it. You can build your own phone-driven service without investing in any hardware, and with very little software. The cost is 3 cents per minute, which is very reasonable in my view.

What sorts of apps would you build with such a service? I’ve got some ideas of my own, and will let you know if/when I build any of them. :)


I'm currently working on a book for web freelancers, covering everything you need to know to get started or just get better. Want to stay updated? Sign up for my mailing list to get updates when the book is ready to be released!

Web Developer Freelancing Handbook

Audible commenting v2

This is an example of the audiblab system for recording and embedding your voice messages.

Visit audiblab.com to try it out yourself…



Interview with Joe Brinkman of DotNetNuke

I put up an interview with Joe Brinkman of the DotNetNuke project over at http://www.webdevradio.com. Grahame helped me clean up the audio (well, did some detective work really). About 15 seconds into the interview (after I’d already done a sanity sound check) a distinct hum crept into the audio, and it was hard to remove without affecting the rest of the sound. Grahame determined it was between 390 and 490 Hz, so I pulled those frequencies out as much as possible. That range also overlaps with our voices, so we sound a little tinny, but I think it works.
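For the curious, here is a small Python sketch of the general idea of cutting a narrow band out of a recording. It is not the tool or the settings I actually used: it is a plain second-order notch filter (the common "audio EQ cookbook" form), centred on 440 Hz with a 100 Hz bandwidth to roughly cover the 390 to 490 Hz range. The 8000 Hz sample rate and the test tones are assumptions chosen only to keep the example small.

```python
import math


def notch_coefficients(center_hz, bandwidth_hz, sample_rate):
    """Normalised biquad notch coefficients for the given centre and bandwidth."""
    w0 = 2 * math.pi * center_hz / sample_rate
    q = center_hz / bandwidth_hz          # narrower band means higher Q
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b = (1 / a0, -2 * math.cos(w0) / a0, 1 / a0)
    a = (-2 * math.cos(w0) / a0, (1 - alpha) / a0)
    return b, a


def apply_notch(samples, center_hz=440.0, bandwidth_hz=100.0, sample_rate=8000):
    """Run the samples through the notch filter and return the filtered list."""
    (b0, b1, b2), (a1, a2) = notch_coefficients(center_hz, bandwidth_hz, sample_rate)
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


def tone(freq_hz, sample_rate=8000, seconds=1.0):
    """A unit-amplitude sine wave, standing in for a hum or a voice component."""
    count = int(sample_rate * seconds)
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate) for n in range(count)]


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


if __name__ == "__main__":
    # Compare levels after the filter has settled (second half of each signal).
    for freq in (440, 400, 2000):
        filtered = apply_notch(tone(freq))
        print(freq, "Hz level after filtering:", round(rms(filtered[4000:]), 3))
```

A tone right at the centre of the notch is removed almost entirely, one far away passes nearly untouched, and anything near the edges of the band is partly reduced. That last part is the trade-off I ran into: whatever part of a voice lives in that band gets thinned out along with the hum.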

I met Joe at the CodeMash conference a little over a week ago. He had quite a lot to say about the project, its goals, how the project has evolved, future plans and challenges, and a host of other interesting info. If this conversation doesn’t pique your interest in DotNetNuke, I’m not sure what would. :)



rss/ical combination

I’ve not seen any signs we’re quite there yet, though searching for “ical/ics” and “rss enclosure” does bring up some interesting ideas. In short, what I am hoping to see is something like the following:

When I’m authoring a blog entry, I can add specific event information (date/time/location/etc) which gets added to the RSS feed for that blog. The data could be in its own format, but it likely makes more sense to use the ical format. Whether this is embedded directly in the RSS or a reference to an automatically created ics file doesn’t really matter to me – both could be fine.
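As a rough illustration of the "reference to an automatically created ics file" option, here is a Python sketch that builds a minimal calendar file for one event and an RSS item that points at it through an enclosure. The enclosure attributes (url, length, type) are the ones RSS 2.0 already uses for podcasts; using them with a "text/calendar" file is my assumption about how this could work, not an existing convention. The function names, the example.com URLs and the event details are all made up.

```python
from xml.etree.ElementTree import Element, SubElement, tostring


def escape_ics_text(value: str) -> str:
    """Escape the characters that iCalendar text values treat specially."""
    return (value.replace("\\", "\\\\")
                 .replace(";", "\\;")
                 .replace(",", "\\,")
                 .replace("\n", "\\n"))


def build_ics(uid, stamp_utc, start_utc, end_utc, summary, location) -> str:
    """Build a one-event calendar; times are strings like 20080115T190000Z."""
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example blog//event sketch//EN",
        "BEGIN:VEVENT",
        f"UID:{uid}",
        f"DTSTAMP:{stamp_utc}",
        f"DTSTART:{start_utc}",
        f"DTEND:{end_utc}",
        f"SUMMARY:{escape_ics_text(summary)}",
        f"LOCATION:{escape_ics_text(location)}",
        "END:VEVENT",
        "END:VCALENDAR",
    ]
    # Lines in an ics file end with CRLF.
    return "\r\n".join(lines) + "\r\n"


def build_rss_item(title, link, description, ics_url, ics_text) -> str:
    """Build an RSS <item> whose enclosure references the generated ics file."""
    item = Element("item")
    SubElement(item, "title").text = title
    SubElement(item, "link").text = link
    SubElement(item, "description").text = description
    SubElement(item, "enclosure", url=ics_url,
               length=str(len(ics_text.encode("utf-8"))), type="text/calendar")
    return tostring(item, encoding="unicode")


if __name__ == "__main__":
    ics = build_ics("meetup-1@example.com", "20080101T120000Z",
                    "20080115T190000Z", "20080115T210000Z",
                    "Web dev meetup", "Main St. Library, Room 2")
    print(ics)
    print(build_rss_item("Web dev meetup next week",
                         "https://example.com/blog/meetup",
                         "Long description, links to maps, and so on.",
                         "https://example.com/blog/meetup.ics", ics))
```

The blog post keeps all its usual context (title, link, long description), and a reader that understands the enclosure type could offer an "add to calendar" button, while older readers would simply ignore it.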

Feeds carrying references to audio files became known as ‘podcasts’. While the main focus – the audio file – was intended for media players, the other data in the RSS stream provides valuable context and can be used on its own in feed readers. Many podcast feeds carry little beyond the audio file, a title and a short description, but that’s just how it is now, not how it has to be. All the data in the RSS feed can be complementary, and in my view, event information is perfectly suited for some form of standardized inclusion process in RSS feeds.

I know that a lot of people ‘subscribe’ to ICS files directly, but I’m not sure that’s the best way to go about it. My biggest concern is that there’s not enough meta-data around the event info. In some cases there doesn’t need to be much, but in other cases, more data would help. By my reading of ICS, it’s not XML based, and there are defined standards for what fields are to be included. I imagine it degrades gracefully, ignoring unknown keywords, but something still feels very utilitarian about it (not necessarily a bad thing, but doesn’t seem to leave room for much innovation).

EXAMPLE: In a blog post, I can add a long description of the event and links to more info, such as maps. I could perhaps even include an audio or video file to describe the event in more detail. Current readers can be extended to incorporate this new info at any time, adding new functionality. Imagine the Google RSS reader offering to add event information from the feed items you’re reading directly into your Google calendar. While I’m not the biggest Google fan, they’re a convenient example because they own both components right now – I can’t think of too many companies that do (does MS? I’d think so).

I’m probably not putting this very eloquently, but I hope I get the idea across. This seems like it’s something that needs to come from one of the big players, or perhaps a consortium, to get momentum, otherwise it’s just someone else’s half-baked code idea :)

Addition:

I think the bigger hurdle here is adoption on the reader side (is it that obvious?). If someone like FeedBurner offered an option to ‘add to ical, add to yahoo cal, add to outlook’, etc. in their feed processing for ical data associated with a blog post, that would probably ensure adoption right there.



interview with alfred green

Alfred’s a friend from way back in Michigan (well, perhaps just a few years ago) who’s also into open source technology, and overall a sharp guy. He interviewed me for a podcast he produces, and it’s finally ‘up’ for listening. We chatted some time ago – September I think, so I’d nearly forgotten about this! Looking at the running time – 98 minutes! – I realized I probably rambled way too much. If you don’t get enough of me on my webdevradio.com podcast, you’ll get your fill of rambling, tech ponderings, personal history, software dev insights, and more, over at his blog.



K as in knife

My last name starts with a K, so whenever I’ve spelled out my name for people on the phone I’ve said “K, as in ‘knife’”. I’ve done this for years, almost without thinking. I was initially inspired by Inspector Clouseau in the first Pink Panther movie, where he was spelling something with a ‘J’, and he said “J, as in ‘jalapeno’”.

Anyway, years ago I started to put together a list of words with confusing/ambiguous first letters. One of the guys at work lent a hand and we nearly completed a list you can use for yourself. He brought in more of the foreign words, and helped in a few tough spots too. I think there are still a couple of entries in this list that need to be filled in or improved on, but I figured I’d post it here as my little public service. Feel free to use this next time you’re on the phone with someone and need to spell something!

  • a as in aisle (or aye)
  • b as in bdellium
  • c as in czar
  • d as in djibouti
  • e as in eight
  • f as in fjord
  • g as in gnaw
  • h as in herb
  • i as in isle (ioan was recommended but I’m split on this one)
  • j as in jalapeno
  • k as in knife
  • l as in Lladro (llano was another suggested alternate)
  • m as in mnemonic
  • n as in Nguyen (or ngwee)
  • o as in oestrogen
  • p as in pneumatic
  • q as in queue (or quay – pronounced ‘kay’)
  • r as in ????
  • s as in scene (or sea)
  • t as in tao (pronounced with the ‘t’ as a ‘d’)
  • u as in uakari
  • v as in “Five” (as in the Roman numeral five – yes, a bit weak, but it still gives a level of confusion suitable for phone conversations)
  • w as in wrench (or ‘why’)
  • x as in xenon
  • y as in yttrium (Yvonne might work better for the phone?)
  • z as in ????????

After I put this together, I did discover this other site which has a similar list, with multiple options for each letter. The original purpose was slightly different than mine (confusing librarians, I believe) but you may still find it useful too!



Linux dictation/speech recognition

Using Linux is, at different times, both frustrating and freeing. I’ve switched between OSX, Windows and Linux many times over the years, and I still tend to find myself coming back to Linux. It’s often because much of the work I do ends up running on LAMP based servers, so running LAMP locally makes it easy to know that what I’m doing will work in the final production system. If I worked for a large company with an unlimited budget, perhaps being MS oriented wouldn’t be so hard, but it’s often the case that where I’m working we simply can’t justify the cost of being 100% MS based with respect to all our technology. Yet the opportunity cost (in terms of time spent making disparate platforms interoperate) of being a cross platform shop isn’t exactly chicken feed.

These are just a few thoughts that I come back to now and then when looking for needed tech for my Linux machines. Most recently, I’ve been looking for quality (or really ANY!) software that would run under Linux and allow me to take audio files and convert them to text. “Speech to text” or “dictation” software just doesn’t seem to exist for Linux. IBM’s ViaVoice *was* available, but is no longer, and apparently was sold off to a Windows-oriented shop. IBM still apparently retains the rights to the Linux version of ViaVoice, but doesn’t seem to be forthcoming about releasing it. This article seems to have more on the state of things…

I’m not even asking for a ‘free’ or ‘open source’ version. I’d be willing to pay something for a dictation package I could run natively, but as things stand it’s one of those things I will need to dual boot for. I was toying with running Windows under VMware, but for something like speech conversion I suspect it would not be effective, given the CPU and RAM requirements I imagine it has.

If I’m missing some obvious way of transcribing audio files on Linux, please let me know!



