Wednesday, November 27, 2013

Dictation (audio file to text how-to)

The pressure is on to take affirmative action to make screencasts and other online video more accessible. Of course, this includes eTextbooks that contain video. One important aspect of that challenge is to make video more accessible to persons who are deaf or have difficulty hearing. For video content creators, this means providing a transcript or, better, providing subtitles to that video so that dialogue may be viewed in the same context as the video. This is fast becoming de rigueur.
The problem is that many videos are created without a script that is followed closely by the speakers in that video. Indeed, many important videos are created in ad hoc fashion (interviews, panel discussions, conference presentations and the like) where scripts would be totally inappropriate.
Creating text from speech has become essential to meeting these expectations, especially where all one has to work with is the speech in the audio track of a video. Speech to text (STT) is a bit more difficult than text to speech (TTS), which has been in use much longer.
MacOS X recently introduced Dictation (speech-to-text) as a feature usable in any application that takes text as input. This is quite an advance over having to purchase a two hundred dollar application to accomplish the same end. However, the first iteration of this system required an internet connection so that speech could be uploaded to Apple's servers where it would be turned into text. This created delays and made it difficult to use for substantial bodies of text. Dictation was given a significant boost in MacOS X 10.9 (Mavericks) with the introduction of Enhanced Dictation, which enables offline use and continuous dictation with live feedback.
Still, this is a system that assumes a live speaker. There is no obviously easy way to route speech from a recorded file through Apple's Dictation system to produce usable text. That's what this post is all about. You can, in fact, route the speech in an audio file through Apple's speech-to-text subsystem and render very usable text output. It isn't intuitive or Apple-easy but it is something that anyone can accomplish with a bit of determination. Here's how.
The application at the center of this process is Audio HiJack Pro by Rogue Amoeba ($32 USD). There are two things to set up with this app. The first is to identify the source of the audio. It could be any app that emits audio but I used QuickTime Player X. Thus, I set that app as the audio source as follows:

ss.01


This will capture the audio from anything that this app plays. My sample audio is from NPR and contains a dramatic reading by noted actor Sam Waterston. It looks like this in QuickTime Player X:


ss.02

This configuration will grab all the audio from QuickTime Player X as it plays the "NPR Gettsyberg Address" audio file. Next, we use Audio HiJack Pro to send that audio to Soundflower (free). To do that we go to the Effects tab and choose Auxiliary Device Output from the 4FX menu.

ss.03


The Auxiliary Device Output plug-in enables us to choose the previously installed Soundflower as the recipient of the HiJacked audio as follows:

ss.04

Once installed, Soundflower becomes an input/output option in your Sound preference pane and everywhere else audio sources and destinations can be specified. In other words, it becomes an integral part of your sound system in MacOS X.

Finally, we set the Dictation input to be Soundflower as follows:

ss.05

At this point, any audio played by QuickTime Player X will be routed to Soundflower and will thus become available to any application that accepts text input and has a Start Dictation menu item. In Pages, that looks like:


ss.06

The following screencast illustrates this process from start to finish:



Here is a larger 720p version.

A very special "Thank You" to Chris Barajas at Rogue Amoeba who patiently walked me through the intricacies of the Auxiliary Device Output plug-in for Audio HiJack Pro.

Tuesday, September 3, 2013

Using Scalable Vector Graphics (SVG) Images in iBooks Author

The primary advantage of Scalable Vector Graphics (SVG) files is that a very small file can be scaled up to yield large images without the aliasing (jaggies) that appears when a bitmapped graphic is scaled up. SVG files are resolution independent, usually non-photographic and carry the suffix *.svg. There are lots of free SVG files available on the Internet and there are many applications for creating SVG files such as the free, open source Inkscape. For an excellent primer on vector graphics, see this Wikipedia article.

However, it is not possible to use SVG images directly in iBooks Author. If you attempt to drag and drop an SVG file onto an iBooks Author project, nothing will happen. You'll get no error messages or feedback of any kind. Similarly, apps in the iWork suite (Pages, Keynote and Numbers) will also refuse to accept SVG files. Since it is important to keep the size of iBooks Author output low for easy downloading and to avoid the 2 GB limit in the iBookstore, we need to pursue this further.

The iBooks Author application has its own Text, Shapes and Graphs menus with which a number of vector graphics can be created. Another option is to use the vector graphics created by Keynote, Numbers and Pages. These can be copied and pasted directly into an iBooks Author project. Graphics created in iBooks Author or any of the iWork suite applications are vector graphics in PDF containers, not SVG files. PDF files can contain text, bit-mapped graphics and vector graphics. The OmniGraffle application is a considerably more sophisticated graphics toolset and is capable of exporting both SVG vector drawings and PDF vector images. The latter are compatible with the iWork suite and iBooks Author.

That's useful but there is an Internet full of already drawn SVG images that are in the public domain or CC licensed. It would be a shame not to have access to that vast library of free vector images. The trick is to use this on-line conversion service to convert SVG to PDF and then drag and drop that PDF directly into an iBooks Author project or into one of the iWork apps or OmniGraffle for further manipulation.
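
For readers who would rather convert files locally than through a web service, the same SVG-to-PDF step can be sketched in a few lines of Python. This is only a sketch under stated assumptions, not the method used in this post: it assumes the `rsvg-convert` command-line tool from librsvg is installed, and the function names are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def svg_to_pdf_command(svg_path, tool="rsvg-convert"):
    """Build the argument list that asks rsvg-convert to write a PDF next to the SVG."""
    svg = Path(svg_path)
    pdf = svg.with_suffix(".pdf")
    # -f selects the output format, -o names the output file.
    return [tool, "-f", "pdf", "-o", str(pdf), str(svg)]


def convert_svg_to_pdf(svg_path):
    """Run the conversion if the tool is installed; return the PDF path, or None."""
    cmd = svg_to_pdf_command(svg_path)
    if shutil.which(cmd[0]) is None:
        return None  # Tool not found (assumption: installed separately); use the online service instead.
    subprocess.run(cmd, check=True)
    return Path(cmd[4])
```

Calling `convert_svg_to_pdf("logo.svg")` would, if the tool is present, leave a `logo.pdf` beside the original that can be dragged into iBooks Author like any other PDF.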

Download an *.ibooks file here that shows how vector graphics created in iBooks Author compare with vector graphics converted from SVG files. The following screencast uses that same multi-touch eBook.


Sunday, July 28, 2013

Using iAd Producer to Create HTML Widgets for iBooks Author (external video widget)

The iAd Producer application from Apple has grown considerably since its inception. Originally, it was a highly specialized application that created advertisements for mobile devices from Apple. Those iAds were composed of sophisticated HTML, CSS and JavaScript.

Since that inception, it has been expanded to create iTunes LPs for music albums sold in the iTunes Store and iTunes Extras for video sold in the iTunes Store. These, too, rely upon HTML, CSS and JavaScript web technologies. Most recently, iAd Producer has added iBooks Author HTML widgets to its repertoire. Thus, the following screencast tutorial shows how easy it is to use iAd Producer to create a high quality HTML widget for iBooks Author without writing a single line of code.

This example focuses on creating an HTML widget that plays a video hosted on an external server. This keeps the size of your *.ibooks file down making for quicker downloads and avoiding becoming a burden to iPads already nearly filled to capacity with other books and media.



Try viewing this video in full screen to catch all of the details. Note that Firefox doesn't like MPEG-4 and will refuse to play this. Try any other modern web browser (Safari, Chrome, etc.).

Download the example book to an iPad to get an even better view of how this looks and feels in the hands of your audience.

Download iAd Producer (free developer registration required)

Tuesday, June 18, 2013

How to Instigate an Identity Crisis in *.ibooks files.


The setting. I wrote four eTextbooks (of six planned) for a private iTunes U course that I am teaching. I used names such as Unit.01 Getting Started, Unit.03 Capture and so on.  These were uploaded to iTunes U via the iTunes U Course Manager. 

The bad move. Once I got the design of Unit.01 decided upon, I duplicated the *.iba file for Unit.01 and edited it to make the shell for Unit.02 and so on. What I should have done was to create a template and work from that. Bad move.

The reason. After a bit of back and forth with some very diligent folks at Apple involving both the iTunes U and iBooks Author teams, we discovered that my bad move caused the *.iba files to have the same internal ID and that caused all of the resulting *.ibooks files exported from them to inherit that identical internal ID.  

The cure. So I opened each *.iba file and made a template from it, closing it and adding the prefix "old_" to its file name in the Finder, just to have a fallback in case something went wrong. After that, I created a new IBA project file from the just created template, saved that under the old name (no prefix) and exported a new *.ibooks file replacing the old, defective, identity-challenged *.ibooks file. Rinse, repeat with each of the other eTextbooks.

Testing has shown that all of these eTextbooks have regained their individual identities. Opening Unit.04 always results in opening Unit.04. Success.
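
A check for this kind of collision could be scripted rather than done by opening each book. The sketch below rests on assumptions the post does not confirm: that an exported *.ibooks file is a ZIP archive laid out like an EPUB (a `META-INF/container.xml` pointing at an OPF package file), and that the identifier worth comparing is the package's `dc:identifier`. The internal ID that Apple's teams found may live elsewhere, and the function names are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import defaultdict

# XML namespaces used by EPUB-style packages.
CONTAINER_NS = "{urn:oasis:names:tc:opendocument:xmlns:container}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def package_identifier(book_path):
    """Return the first dc:identifier in an EPUB-style ZIP package, or None."""
    with zipfile.ZipFile(book_path) as zf:
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find(f".//{CONTAINER_NS}rootfile")
        if rootfile is None:
            return None
        opf = ET.fromstring(zf.read(rootfile.get("full-path")))
        ident = opf.find(f".//{DC_NS}identifier")
        if ident is None or not ident.text:
            return None
        return ident.text.strip()


def find_shared_identifiers(book_paths):
    """Map each identifier used by more than one file to the files that share it."""
    seen = defaultdict(list)
    for path in book_paths:
        seen[package_identifier(path)].append(path)
    return {ident: paths for ident, paths in seen.items()
            if ident is not None and len(paths) > 1}
```

An empty result from `find_shared_identifiers([...])` would suggest that each book carries its own identifier, at least at the package level.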

The moral. Make templates, don't dupe an *.iba file.

Friday, February 8, 2013

How to Easily Incorporate External Video Into an iBooks Author Project

The iBookstore limits the size of *.ibooks files created with iBooks Author (henceforth, IBA) to 2 GB and recommends that you keep the size of your iBook file under one gigabyte if possible, both to avoid taking too much space on your readers' iPads and to spare your readers long download times. Although including video that is internal to your IBA project is a simple drag and drop operation using the Media Widget in IBA, that kind of video will very quickly increase the size of your iBook and may place an unwelcome burden on some of your readers.

The alternative is to include external video in your IBA project using a custom made HTML Widget. The big advantage is that a one megabyte HTML Widget can play a 70 megabyte video in your iBook. The downside, of course, is that the reader must have an active internet connection and the availability of the video must be maintained. If a video used in your iBook should become unavailable, you can provide your readers with a free upgrade correcting that issue via the iBookstore's versioning feature.

Unfortunately, many people are persuaded not to use this approach because it involves writing HTML code but this post will offer you a way around that obstacle. If you can use a text editor, you can modify an HTML widget template that plays an external video that you select for your iBook. Here's how:

As you'll learn from this Apple support document, an HTML widget is nothing more than a collection of text files enclosed in a folder with the suffix ".wdgt" added to it. On the Mac, adding that suffix to the folder name changes the appearance of the folder into a widget icon. The minimum HTML widget contains three files: a Default.png file, an index.html file and an info.plist file. I have prepared an HTML widget that you can use as both an example and a template. It is a ZIP archive containing a complete working widget that you can add to a test IBA project. Once you have it in an IBA project you may use the Preview function to see how it works in the iBooks.app on an iPad. Download that HTML widget here and then double-click on it to extract it.

This example widget plays a video called "Open Access Explained" that is hosted on a server that I have access to. In this tutorial, I will show you how to open the widget and modify it so that it will play another video, one that is on a server that I do not have access to, a video that you choose. All I'll have to do to accomplish this feat is to open the widget, change the Default.png file and edit the text of the index.html and info.plist files so that they reference a different video. It's just that easy.
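
The three-file layout described above can also be assembled by a short script. The following is a minimal sketch, not the contents of the downloadable template: the function name, the bundle identifier and the exact set of property list keys (`CFBundleIdentifier`, `CFBundleName`, `CFBundleDisplayName`, `MainHTML`, `Width`, `Height`) are assumptions modeled on Dashboard-style widgets and should be checked against Apple's support document. The property list is written as `Info.plist`, which is likewise an assumption about capitalization.

```python
import plistlib
import shutil
from html import escape
from pathlib import Path

INDEX_TEMPLATE = """<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>{title}</title></head>
<body style="margin:0">
<video src="{src}" width="{width}" height="{height}" controls autoplay></video>
</body>
</html>
"""


def build_video_widget(dest_dir, name, video_url, poster_png, width, height,
                       bundle_id="com.example.widget.video"):
    """Create <name>.wdgt containing Default.png, index.html and Info.plist."""
    widget = Path(dest_dir) / f"{name}.wdgt"
    widget.mkdir(parents=True, exist_ok=True)

    # Poster image displayed on the page until the reader taps the widget.
    shutil.copyfile(poster_png, widget / "Default.png")

    # The page that plays the externally hosted video.
    html = INDEX_TEMPLATE.format(title=escape(name),
                                 src=escape(video_url, quote=True),
                                 width=width, height=height)
    (widget / "index.html").write_text(html, encoding="utf-8")

    # Property list telling the host which page to load and at what size.
    info = {
        "CFBundleIdentifier": bundle_id,   # hypothetical identifier
        "CFBundleName": name,
        "CFBundleDisplayName": name,
        "MainHTML": "index.html",
        "Width": width,
        "Height": height,
    }
    with open(widget / "Info.plist", "wb") as fh:
        plistlib.dump(info, fh)
    return widget
```

A call such as `build_video_widget(".", "Tour", video_url, "poster.png", 640, 360)` would leave a `Tour.wdgt` folder ready to drag into a test IBA project, with the poster image sized to match the video.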

Of course, that video must be playable on an iPad, so no Flash. These tech specs provide all the necessary details. The great thing about video on the iPad is that the HTML 5 video tag works without having to create multiple fallback versions of your video (*.mp4, *.ogg and *.webm) as one would have to do on a web site. As long as it's using the MPEG-4 H.264 video and AAC audio CODECs, it can be in either a MOV, MP4 or M4V container. More simply, if the video plays on your iPad, it will play in this widget.

The video that I'll be modifying the widget to use in this tutorial is:

http://movies.apple.com/media/us/mac/ibooks-author/2012/tours/apple-ibooks-tour-ipad_ibooks_author-cc-us-20120314_r848-9cie.mov

Because I'll be changing the widget to play another movie with different dimensions, I'll need to create a new Default.png file and change all of the references from the old video to the new video. I'll be using the BBEdit text editor but any plain text editor such as TextEdit will do just as well. Here's a screencast showing how this is done. Click HERE or click the image below.




Caveats: Some video services such as Vimeo and YouTube go to great lengths to tie their hosted video to their own web sites so that they can generate data about you and get paid for exposing advertising to you. Thus, it is just a little more difficult to use these videos in an iBook but it can be done. I may take that topic up in the next post.

NOTES:

• About the "autoplay" attribute. You'll notice that I use the optional "autoplay" attribute in the HTML 5 "video" tag. All HTML widgets take over the entire screen when invoked. Under iOS 6.x (tested), tapping this external video player widget will bring the poster (Default.png) image to full size atop a white background that also displays the widget's title and caption as well as a close button. A standard iOS "play" icon will be superimposed on the center. The video will begin to play automatically without the reader having to tap this play button. How long it takes before autoplay starts depends upon the size of the video and the speed of the network. The operating system tries to estimate when it can play the external video without interruption. The reader can always tap the play button before autoplay kicks in. Simply delete the "autoplay" attribute if you'd rather not have this feature operative.

• About the "controls" attribute. I also use the optional "controls" attribute. This provides the reader with a standard video controller for playback, including audio volume, two "full screen" options and a scrubber for moving the play head to arbitrary points along the time line. Simply delete the "controls" attribute if you'd rather not have this feature be available to your readers. The following image shows these various controls and their effects.
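
To make the two notes above concrete, here is a small sketch that generates the "video" tag with or without each optional attribute. In HTML both are boolean attributes spelled `autoplay` and `controls`: being present switches the feature on and deleting the word switches it off. The helper name `video_tag` is made up for illustration and is not part of the template widget.

```python
from html import escape


def video_tag(src, width, height, autoplay=True, controls=True):
    """Return an HTML5 <video> element; boolean attributes appear only when enabled."""
    attrs = [
        f'src="{escape(src, quote=True)}"',
        f'width="{width}"',
        f'height="{height}"',
    ]
    if controls:
        attrs.append("controls")   # standard playback controller
    if autoplay:
        attrs.append("autoplay")   # start playing without a tap
    return "<video " + " ".join(attrs) + "></video>"


print(video_tag("movie.mp4", 640, 360))
# <video src="movie.mp4" width="640" height="360" controls autoplay></video>
print(video_tag("movie.mp4", 640, 360, autoplay=False))
# <video src="movie.mp4" width="640" height="360" controls></video>
```

Whatever the function returns is what goes inside the body of index.html, in place of the existing tag.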




Resources:

Safari HTML 5 Audio and Video Guide

iBooks Author: About HTML widget creation

Saturday, December 8, 2012

Space-efficient Interactive Lectures in iBooks Author

One of the most useful widgets for iBooks Author is the Keynote widget. It enables you to add a presentation to an interactive iBook using the features of the Keynote.app for MacOS X and iOS, including many of the transitions and builds. You can even convert a PowerPoint presentation to Keynote and bring that content into your iBook as well. The full details are in this technical note.
The one disappointment I had was that this widget does not support voiceover narration. This can seriously diminish the value of a slide presentation. The reader can flip through the slides forward and backward but they have to guess what the presenter might have said. Peter Norvig did a wonderful six slide PowerPoint presentation illustrating this very point. See the slides for President Lincoln's Gettysburg Address here. View the slides on-line or download the presentation as a *.ppt file.




As you'll see, there is something missing, something very important.

Of course there is a way around this. In Keynote, you can add a voiceover and export the presentation as a video to include in your iBooks Author project using the media widget. You can also use screencasting software such as ScreenFlow to capture slides, narration and even a secondary video source such as a PIP (picture-in-picture) of the speaker. The problem with these audio-annotated slideshows done as video is that their file sizes are unnecessarily large. This is a problem for iBooks both because of the 2 GB limit and the time it takes for readers to download very large iBooks. This might well reduce one's audience.
My use of the word "unnecessarily" should signal that there might be an even better workaround and there is – sort of. What I'm about to describe would be a great workaround if it were supported by iBooks Author. It is not currently supported in iBooks Author but users of that application can change that. Request an enhancement right from within iBooks Author like this:




The workaround that's better than a video is called an "enhanced audio" file. It is created in GarageBand and carries the *.m4a file name suffix. You may also see it referred to as an "enhanced podcast" file. What makes the enhanced audio file such a great alternative to video is that it uses one static image of a slide over its entire time on screen instead of 30 frames per second as in a video. If a 50k slide is on-screen for 100 seconds in an enhanced audio file, that image contributes only 50k to the total file size. If a 50k slide is on screen for 100 seconds in a video file, that image contributes 150,000k ([50*30]*100) to the total file size. That's 50k vs 150 MB, a 3000:1 ratio in this example! In real life, the difference is somewhat less than this because good video encoding stores complete images only at the key frames and, in between them, presents incomplete data using a number of perception-based tricks. That fools the eye and takes less space. Still, the difference is quite significant. We'll look at a real world comparison below.
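
The back-of-the-envelope arithmetic in the paragraph above can be reproduced with a few lines of Python. This is a sketch of the naive upper bound only: it assumes every video frame stores a full copy of the slide with no compression between frames, which real encoders do not do. The function names are made up for illustration.

```python
def enhanced_audio_image_kb(slide_kb):
    """An enhanced audio file stores each slide image once, however long it is shown."""
    return slide_kb


def naive_video_image_kb(slide_kb, seconds, fps=30):
    """Upper bound for video: one full copy of the slide per frame, no compression."""
    return slide_kb * fps * seconds


audio_kb = enhanced_audio_image_kb(50)
video_kb = naive_video_image_kb(50, 100)
ratio = video_kb // audio_kb
print(audio_kb, video_kb, ratio)  # 50 150000 3000
```

Changing the slide size, the time on screen or the frame rate shows how quickly the naive video figure grows while the enhanced audio figure stays put.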

Since this is not a how-to post, I'll leave that task to others. There are many fine tutorials on the web teaching you how to use GarageBand to create enhanced audio files. Here's a good one in the form of a PDF.

I created an example using some ancient media describing the beginnings of the Space Shuttle program. Intended for school use, the package contained a cassette audio tape and a set of photo slides. The audio tape has sharp "beeps" to tell the projector operator when to advance to the next slide image. I've left those beeps in for their nostalgia value. Here's the enhanced audio file slide show:



You may download a copy of this file here and it will play (larger) in QuickTime Player X, in the iTunes.app and many other venues that support QuickTime, just not in iBooks Author and the iBooks it creates. This 18 minute presentation is only 21.4 MB in size! Space-efficiency isn't the only advantage of enhanced audio files. The assembly of the static images creates a chapter track that enables the viewer to quickly and easily move to any part of the presentation. This is great for studying a topic where revisiting a difficult section is helpful. Here's a screenshot to illustrate what a chapter track looks like:



Here's a view of the GarageBand project that created this enhanced audio file:



So, what would this presentation cost us in terms of file size if it were a video? I created an *.mp4 version with ScreenFlow using the same assets. That version weighed in at 198.2 MB. You may download a copy of that file here. The *.m4a file tipped the scales at 21.4 MB. The video version is approximately 9.3 times larger than the enhanced audio file yet playing them side by side reveals no important differences. Here's a screencast of that analysis:



If you'd like to download and view a larger version of this video, you may do that here.

The one on the left is the c. 200 MB video and the one on the right is the c. 20 MB enhanced audio file. So, if you are at all impressed by the potential advantages of being able to use an enhanced audio file in an iBooks Author project, send in that enhancement request to Apple as described above and do that ASAP.

By the way, you can add an enhanced audio file to an iBooks Author project and it will play the audio part. No slides though and that's where the biggest advantage is. Here's what that looks like in iBooks Author.



Note that there is an option for "Show Audio As:" that includes an "Image" option. That doesn't make the slides appear though. That's for drag/dropping a single image onto it that will appear throughout the duration of the audio. Yes, I got real excited when I first saw that.


Wednesday, November 21, 2012

New Book: The Coming ePublishing Revolution in Higher Education

It's been quiet here since last January when I talked about the new iBooks Author and its implications for eTextbooks in higher education. That's because writing that piece in January raised questions in my mind that couldn't be set aside. Even while vacationing in Europe its grip never lessened.
First things first, here's where you can buy this book for the munificent sum of $0.99:


(click on the badge above)

Why $0.99? At first, I wanted to make it free since the book is, in part, about making eTextbooks free to students. However, I wanted to know how many copies were actually read as opposed to how many copies were downloaded. My thought is that people are more likely to read what they have paid for, even if it is just a token sum. I suppose we'll see about that. By the way, the copyright is CC-BY-NC-SA so one is free to use or re-use any part of the book as long as they attribute the work to me and share any improvements they might make with me.
So what's the book about? Thinking about the potential implications of new digital authoring software such as iBooks Author, Pages, Sigil et al., I realized that higher education might be a special case with regard to the potential for the dis-intermediation of the academic publishing industry. After all, most of the people who write academic papers, books and textbooks are also employed in the higher education sector. This brought forth the entangled relationships between academics seeking promotion and tenure, the institutions that employ them and commercial publishing houses. I wanted to see if the technical potential to dis-intermediate could actually be translated into action in this byzantine culture. I think that I've gotten a handle on it and laying that understanding out is what the book is about.
Why an iBook that is only readable on the iPad? It's one thing to assert that a single subject matter expert (professor) can develop and deliver an eTextbook without assistance from a publisher and make it available to students at little or no cost. I wanted to test that assertion and use the results of that testing as evidence in support of the idea that dis-intermediation of academic publishing is technically and economically feasible.
If dis-intermediation is technically possible, what's to stop it? As it turns out, the most formidable obstacle has little to do with technology. It's a people problem. It's the culture of academe, exacerbated by recent economic issues, that makes the outcome of this story so difficult to foresee. What I think is achieved in this book is that we now have a better idea as to where we should cast our gaze and what to be looking for. Those who know what to look for will be among the first to understand how this will all turn out.