Open-Captions

Posted: October 23rd, 2011 | Filed under: technical | 1 Comment

What is Open-Captions?

According to research, over 90% of deaf children have hearing parents who “frequently do not have fully effective means of communicating with them”. American Sign Language (ASL) is difficult to learn, especially as a second language.

Open-Captions makes it easy for parents and children to learn and practice American Sign Language together while watching their favorite videos on YouTube. People can find closed captioned videos on any topic with the Open-Captions search engine. The viewer is able to select individual words in the video’s caption stream and see the American Sign Language representation of the word.

Open Captions Example Screenshot

Closed captions have been available on television shows for a long time. They are useful for hearing-impaired people and for people whose first language is not English.
Captions have always been a steady stream of white text on a black background, located at either the top or the bottom of the screen. Video sites like YouTube and Hulu have adopted the same style from television.
Open-Captions changes the way users watch videos by making the captions more interactive.

  • The viewer can pause the video, select a word and watch an ASL representation.
  • The viewer can review earlier captions with the “Previous Caption” button.

The American Sign Language representations come from the SmartSign dictionary maintained by the Georgia Institute of Technology’s Center for Accessible Technology in Sign (CATS).

How did Open-Captions come about?

My interest in building Assistive Technologies has been one of the main motivating forces behind getting a graduate degree in Computer Science.

I built a web-based widget platform for Interactive TV as a graduate student researcher at Georgia Tech. In the process, I developed functionality that lets widget developers access the closed captions associated with the currently playing content.
Harley Hamilton, a researcher at CATS, was interested in leveraging this access to closed captions, and we ended up building a rough prototype. Subsequently, I learnt more about hearing-impaired children from Harley and wrote a small tool that lets you select any word on a webpage and see its American Sign Language representation. This would be very useful for parents who want to learn ASL.
In September this year, I signed up for the HackdayTV weekend hackathon and ended up building the first version of Open-Captions, a mashup of the earlier ASL-related ideas I had worked on.

What happens behind the scenes in Open-Captions?

Disclaimer: Technical content ahead ;)
The entire source code for the project can be found on GitHub.
YouTube has extensive documentation on how to use their APIs for searching, embedding and controlling videos on your site. The search and results pages are built on the PHP library for the YouTube search API. The page on which the video plays is where it gets interesting. There are 2 parts to it:

  • Getting the closed captions of the video and synchronizing them with playback
  • Handling the mouse clicks on the individual words to show the ASL representation

I used Firebug to see what requests YouTube sends whenever you click on the “Interactive Transcript” icon under the video. I removed parameters that were not mandatory (i.e. I would still get the same result if I removed them) and made a PHP curl request for the trimmed URL to get the captions formatted as XML. For this YouTube video the URL for the captions would be


http://www.youtube.com/api/timedtext?lang=en&format=1&v=<VIDEO_ID>&type=track&kind=&hl=en

and the result will be of the form

<transcript>
<text start="0.734" dur="4.299">[Sesame Street theme music]</text>
<text start="5.033" dur="1">Elmo: You okay, Chris?</text>
<text start="6.033" dur="1">Chris: I&#39;m good, I&#39;m good.</text>
<text start="7.033" dur="1">Elmo: You ready?</text>
<text start="8.033" dur="1.067">Chris: Yeah, I&#39;m ready. Oh, hey!</text>
<text start="9.1" dur="1.467">Red light&#39;s on. Hey, hi, everybody.</text>
<text start="10.567" dur="1.034">I&#39;m Chris.</text>
<text start="11.601" dur="1">Elmo: Oh, and Elmo&#39;s Elmo.</text>
<text start="12.601" dur="1.367">Chris: Mm-hmm, he sure is.</text>
<text start="13.968" dur="3.199">I work at Hooper&#39;s Store right here on Sesame Street.</text>
..
..
</transcript>
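
To drive the synchronization later, the XML has to become an array of caption objects. The following is a minimal sketch of that step, not the project's actual parser. It assumes the attributes always appear in the order shown above (start, then dur) and that the result should use the startTime, duration and captions fields that the synchronization code reads. The function names parseTranscript and decodeEntities are made up for this example.

```javascript
// Decode the few entities that appear in the sample transcript.
// This is not a general-purpose HTML entity decoder.
function decodeEntities(s) {
  return s
    .replace(/&#(\d+);/g, function (match, code) {
      return String.fromCharCode(Number(code));
    })
    .replace(/&quot;/g, '"')
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&amp;/g, '&'); // done last so "&amp;lt;" is decoded only once
}

// Turn the transcript XML into [{startTime, duration, captions}, ...].
// Assumes every <text> element looks like the sample: start, then dur.
function parseTranscript(xml) {
  var re = /<text start="([^"]+)" dur="([^"]+)">([\s\S]*?)<\/text>/g,
      result = [],
      m;
  while ((m = re.exec(xml)) !== null) {
    result.push({
      startTime: parseFloat(m[1]),
      duration: parseFloat(m[2]),
      // collapse line breaks inside the element into single spaces
      captions: decodeEntities(m[3]).replace(/\s+/g, ' ').trim()
    });
  }
  return result;
}
```

A regular expression is enough for this flat format; in the browser, a real XML parser such as DOMParser would be the more robust choice.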

As far as I could tell, not every captioned YouTube video can be fetched with the above API call, because some caption files have names. Then I came across this user script, which is useful if you want to download the captions file. The author had a snippet of code for retrieving the name of the captions file:


http://video.google.com/timedtext?hl=en&v=<VIDEO_ID>&type=list

This returns XML containing the name of the captions file. I used that to extend my API call to


http://www.youtube.com/api/timedtext?lang=en&format=1&v=<VIDEO_ID>&type=track&kind=&hl=en&name=<CAPTION_FILE_NAME>
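
Both forms of the URL can come from one helper. This is only a sketch: the project makes the request from PHP, the function name buildCaptionsUrl is invented here, and the endpoint and parameters are taken as-is from the URLs in this post, so they may not match what YouTube serves today.

```javascript
// Build the timedtext URL described in the post.
// trackName is optional: pass it only for videos whose caption file has a name.
function buildCaptionsUrl(videoId, trackName) {
  var params = [
    'lang=en',
    'format=1',
    'v=' + encodeURIComponent(videoId),
    'type=track',
    'kind=',
    'hl=en'
  ];
  if (trackName) {
    // names can contain spaces, so they must be URL-encoded
    params.push('name=' + encodeURIComponent(trackName));
  }
  return 'http://www.youtube.com/api/timedtext?' + params.join('&');
}
```

Encoding the name matters because caption file names are free text and can contain spaces or other characters that are not valid in a query string.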

Then, using JavaScript's setTimeout function and the start and duration values of the captions, I synchronize the captions with the video.

function showAppropriateCaptions(){
   var i,
     len = global_full_captions.length,
     last = global_full_captions[len-1],
     currTime;
   // my_ytPlayer is the handle of the YouTube player
   if(len === 0 || my_ytPlayer.getPlayerState() !== 1){
      // only state 1 (playing) needs any work; -1 means the video has not started
      return;
   }
   // State = 1 => playing video - so get the time and show appropriate captions
   // this will get triggered automatically when the video starts for the very first time
   currTime = my_ytPlayer.getCurrentTime();

   // to-do: a better way to handle hide show instead of doing it on every call
   $('.myCaptionSpan').show(); // the container having the captions
   $('#previous').show(); // the previous button

   if(currTime < global_full_captions[0].startTime){
     // the captions have not started, so call again when the first caption is scheduled
     setTimeout(showAppropriateCaptions, (global_full_captions[0].startTime - currTime)*1000);
     return;
   }
   if(currTime > last.startTime + last.duration){
     // it has ended, no more captions to show
     $('.myCaptionSpan').hide();
     $('#previous').removeClass('enabled').attr('disabled', true);
     return;
   }
   // from here on a caption is on screen, so the previous button is usable
   $('#previous').addClass('enabled').attr('disabled', false);
   if(currTime >= last.startTime){
     // the last caption has no successor, so schedule with its own duration
     createBeautifulCaptions(last.captions);
     setTimeout(showAppropriateCaptions, last.duration*1000);
     return;
   }
   for(i = 0; i < len-1; i++){
     if(global_full_captions[i].startTime <= currTime && global_full_captions[i+1].startTime > currTime){
       // found the appropriate caption
       createBeautifulCaptions(global_full_captions[i].captions);
       // now call the same function at the start of the next caption
       setTimeout(showAppropriateCaptions, (global_full_captions[i+1].startTime - currTime)*1000);
       return;
     }
   }
}
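
The lookup at the heart of this function can also be pulled out into a pure function, which is easier to test without a player. The sketch below is one possible refactoring, not the project's code: findCaptionIndex is a made-up name, it assumes the captions are sorted by startTime, and it replaces the linear scan with a binary search.

```javascript
// Return the index of the caption that should be on screen at `time`,
// or -1 before the first caption starts or after the last one ends.
// Assumes `captions` is sorted by startTime.
function findCaptionIndex(captions, time) {
  var lo = 0,
      hi = captions.length - 1,
      mid,
      last;
  if (captions.length === 0 || time < captions[0].startTime) {
    return -1;
  }
  last = captions[hi];
  if (time > last.startTime + last.duration) {
    return -1;
  }
  // binary search for the last caption whose startTime <= time
  while (lo < hi) {
    mid = Math.ceil((lo + hi) / 2);
    if (captions[mid].startTime <= time) {
      lo = mid;
    } else {
      hi = mid - 1;
    }
  }
  return lo;
}
```

Like the function above, this keeps a caption on screen until the next one starts, so gaps between captions are ignored. The same lookup could also serve the “Previous Caption” button by stepping back one index.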

Now, for every word in that particular line of the caption, I broke it up into span elements and attached click handlers to each. The span elements look like this

<p class="mycaption" >
<span id="beautifulCaptions0">to</span>
<span id="beautifulCaptions1">spend</span>
<span id="beautifulCaptions2">some</span>
<span id="beautifulCaptions3">good</span>
<span id="beautifulCaptions4">time</span>
<span id="beautifulCaptions5">with</span>
<span id="beautifulCaptions6">my</span>
<span id="beautifulCaptions7">good</span>
<span id="beautifulCaptions8">buddy</span>
<span id="beautifulCaptions9">Elmo</span>
<span id="beautifulCaptions10">over</span>
<span id="beautifulCaptions11">here.</span>
</p>
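
Markup like the above can be generated from a caption line with a few lines of string handling. This is a sketch only: captionToSpans and escapeHtml are names invented for the example, and the project's createBeautifulCaptions may build the elements differently.

```javascript
// Escape characters that are special in HTML.
function escapeHtml(s) {
  return s.replace(/&/g, '&amp;')
          .replace(/</g, '&lt;')
          .replace(/>/g, '&gt;')
          .replace(/"/g, '&quot;');
}

// Wrap each whitespace-separated word of a caption line in its own span.
function captionToSpans(line) {
  var words = line.trim().split(/\s+/),
      html = [],
      i;
  for (i = 0; i < words.length; i++) {
    if (words[i] === '') {
      continue; // an empty caption line splits into a single empty string
    }
    html.push('<span id="beautifulCaptions' + i + '">' +
              escapeHtml(words[i]) + '</span>');
  }
  return '<p class="mycaption">' + html.join('\n') + '</p>';
}
```

Rather than attaching one handler per span, a single delegated click handler on the paragraph could read the clicked span's text and pass it on; that is a design alternative, not what the post describes.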

When a word in the captions is clicked, the showASL method is called with the word string as a parameter. After stripping extraneous characters like !, &, ], [, ; etc. and converting the word to lowercase, the method inserts an iframe in the top right-hand corner whose URL points to

http://cats.gatech.edu/cats/MySignLink/dictionary/html/pages/<THE SELECTED WORD>.htm
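
The clean-up and URL construction might look like the sketch below. The set of characters to keep is a guess (letters, digits, apostrophes and hyphens), and normalizeWord and aslPageUrl are hypothetical names; the real showASL method may strip a different set.

```javascript
var ASL_BASE = 'http://cats.gatech.edu/cats/MySignLink/dictionary/html/pages/';

// Lowercase the word and drop everything except letters, digits,
// apostrophes and hyphens. The exact character set is an assumption.
function normalizeWord(word) {
  return word.toLowerCase().replace(/[^a-z0-9'-]/g, '');
}

// Return the dictionary page URL for a clicked word,
// or null when nothing is left after the clean-up.
function aslPageUrl(word) {
  var clean = normalizeWord(word);
  return clean ? ASL_BASE + encodeURIComponent(clean) + '.htm' : null;
}
```

Returning null for an empty result gives the caller an obvious place to skip the lookup for tokens that are pure punctuation.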

The pages showing the ASL have Flash videos embedded, so I styled the iframe so that the Flash video sits at the center of the box. Some of the pages have just images and some have both. Hence, I added the “Full Page View” button under the ASL box so that viewers can see the entire page if they want to.

The SmartSign website covers around 25,000 words, so there will be words in the captions for which no ASL representation exists. For such cases, I get an image from Bing to substitute for the ASL.
The source code is undergoing constant refactoring, and I would love to get more ideas on how to design a better solution.

What lies ahead for Open-Captions?

Currently, I am gathering feedback from hearing-impaired users about Open-Captions and asking what additional features they would like. Harley, the researcher at CATS, is excited about this project too and is reaching out to more people for feedback.
Another idea is to build Open-Captions into the “Khan Academy” of ASL, with videos that support different levels of ASL learning.


JavaScript Notes – 2

Posted: October 21st, 2011 | Filed under: technical | 1 Comment

In this blog post, I introduce custom constructor functions, a method for “classical inheritance” in JavaScript, and prototypal inheritance in JavaScript. The material in this post is condensed from Stoyan Stefanov’s book JavaScript Patterns.
While trying to understand prototypal inheritance in JavaScript, I realized that I needed a firm grasp of the above concepts, so this post touches upon a breadth of topics.

Custom Constructor Functions

3 ways of creating objects in JS –

var myObject1 = {}; // literal notation
var myObject2 = new Object();
var myObject3 = new MyConstructor(); // custom constructor

The third way is what we are interested in here. It looks the same as creating a new object of a class MyConstructor in Java. Below is one way that the function MyConstructor can be defined. (Do remember that MyConstructor is a function at the end of the day. The capitalized first letter is a coding convention to distinguish custom constructors from ordinary functions.)

var MyConstructor = function(name){ 
    this.name = name;
    this.hello = function () {return "Hello "+this.name;};
}
var obj = new MyConstructor("nadu");

When we call MyConstructor with new, 3 things happen inside the function

  • An empty object is created and referenced by the variable this; it inherits the prototype of the function.
  • Properties and methods are added to the object referenced by this.
  • The newly created object referenced by this is returned at the end implicitly (if no other object was returned explicitly).
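
The three steps can be spelled out by hand. The helper below, construct, is a made-up name and only an approximation of what the new operator does, written to mirror the list above.

```javascript
// Rough equivalent of `new Ctor(arg1, arg2, ...)`.
function construct(Ctor) {
  var args = Array.prototype.slice.call(arguments, 1),
      // step 1: an empty object that inherits from the function's prototype
      obj = Object.create(Ctor.prototype),
      // step 2: the constructor adds properties and methods to `this`
      result = Ctor.apply(obj, args);
  // step 3: return `this` unless the constructor explicitly returned an object
  if (result !== null &&
      (typeof result === 'object' || typeof result === 'function')) {
    return result;
  }
  return obj;
}

var MyConstructor = function (name) {
  this.name = name;
  this.hello = function () { return "Hello " + this.name; };
};

var obj = construct(MyConstructor, "nadu");
console.log(obj.hello());                  // Hello nadu
console.log(obj instanceof MyConstructor); // true
```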

In the above example, the hello method was added to this. So each time new MyConstructor gets called, a new hello function is created in memory. Since hello does not change from one instance to another, we can add it to the prototype of MyConstructor instead.

MyConstructor.prototype.hello = function(){
    return "Hello "+ this.name;
}

A prototype is an object, and every function you create automatically gets a prototype property that points to a new blank object. You can add functions to the prototype and they will be accessible to all the objects created using new MyConstructor().
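
The difference between the two approaches is easy to check. In this small sketch, PerInstance and Shared are names chosen for the comparison only.

```javascript
// hello is created inside the constructor: one function per instance
var PerInstance = function (name) {
  this.name = name;
  this.hello = function () { return "Hello " + this.name; };
};

// hello lives on the prototype: one function shared by all instances
var Shared = function (name) {
  this.name = name;
};
Shared.prototype.hello = function () { return "Hello " + this.name; };

var a = new PerInstance("a"), b = new PerInstance("b");
var c = new Shared("c"), d = new Shared("d");

console.log(a.hello === b.hello);       // false
console.log(c.hello === d.hello);       // true
console.log(c.hasOwnProperty("hello")); // false, it is found via the prototype
console.log(d.hello());                 // Hello d
```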

There are 2 different ways inheritance (code-reuse) can be done in JavaScript. One is to create classes to emulate the typical (class-ical) inheritance and the other is using the prototypal nature of the language.

Classical Inheritance

This can be implemented in JavaScript based on custom constructors. Below is just one pattern (the simplest) of implementing classical inheritance. Refer to the book for other implementations.

// parent constructor
function Parent(name){
   this.name = name || "Adam";
}
// add functionality to the prototype
Parent.prototype.say= function(){
    return "Hello "+ this.name;
}

Every object has a hidden link, __proto__, that points to the prototype property of the constructor function that created it.

//child constructor
function Child(){}

//inherit - C will have access to P's properties
inherit(Child,Parent);

var kid = new Child();
console.log(kid.name);//"Adam"
console.log(kid.say()); // "Hello Adam"

// here is how inherit is defined
function inherit(C,P){
    C.prototype = new P();
}

Working Example on JS Fiddle
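
One limitation of this simplest pattern is that arguments passed to the child constructor never reach the parent. The sketch below shows the effect and one common remedy, calling the parent constructor from the child. Child2 is a name invented for this example.

```javascript
function Parent(name) {
  this.name = name || "Adam";
}
Parent.prototype.say = function () {
  return "Hello " + this.name;
};
function inherit(C, P) {
  C.prototype = new P();
}

// the argument is silently ignored: Child never passes it to Parent
function Child(name) {}
inherit(Child, Parent);

var kid = new Child("Seth");
console.log(kid.say());                  // Hello Adam
console.log(kid.hasOwnProperty("name")); // false, name comes from the prototype
console.log(kid.constructor === Parent); // true, a side effect of replacing the prototype

// remedy: run the parent constructor against the new child object
function Child2(name) {
  Parent.call(this, name);
}
inherit(Child2, Parent);

var kid2 = new Child2("Seth");
console.log(kid2.say());                  // Hello Seth
console.log(kid2.hasOwnProperty("name")); // true
```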

Prototypal Inheritance

In Prototypal inheritance, objects inherit from objects. You have an object that you would like to reuse and you want to create a second object that gets functionality from the first one.

var parent = {
    name:"Parent"
};

// the new object which inherits parent
var child = object(parent); 

console.log(child.name); // Parent
console.log(child.prototype); // undefined - only functions get a prototype property
console.log(child.__proto__); // the parent object


// here is how object is defined
function object(o){
    function F(){};
    F.prototype = o;
    return new F();
}

An important thing to understand is that child.__proto__ points to the parent object. The __proto__ link points to the prototype property of the constructor that created the object, which here is the temporary function F, and F.prototype was set to parent. The __proto__ link is what makes the interpreter look up the property name in parent when child has no such property of its own.

The console output showing the values of child.name, child.prototype, child.__proto__
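
Because the lookup follows the link at read time, later changes to the parent are visible through the child, while assigning to the child only shadows the inherited value. The sketch below illustrates this and also shows that ES5's built-in Object.create gives the same link as the object() helper.

```javascript
var parent = {
  name: "Parent"
};

function object(o) {
  function F() {}
  F.prototype = o;
  return new F();
}

var child = object(parent);
console.log(child.hasOwnProperty("name"));            // false
console.log(Object.getPrototypeOf(child) === parent); // true

// the property is looked up in parent each time it is read
parent.name = "Changed";
console.log(child.name); // Changed

// assigning creates an own property on child and leaves parent alone
child.name = "Child";
console.log(child.name);  // Child
console.log(parent.name); // Changed

// ES5 has this pattern built in
var child2 = Object.create(parent);
console.log(Object.getPrototypeOf(child2) === parent); // true
```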


Rest in Peace Steve Jobs

Posted: October 6th, 2011 | Filed under: technical

The man who was and always will be an inspiration to everyone. Rest in Peace.

Steve Jobs

Image Credit: Diana Walker


I am guilty of screen scraping

Posted: October 2nd, 2011 | Filed under: technical | 1 Comment

I love data visualization. I am inspired by the fact that India has a wealth of data that can lead to wonderful visualizations that help Government officials, politicians, journalists and social workers do their daily jobs better. Even the average citizen should be able to see a visualization, understand what’s going on and maybe get inspired to be a more responsible citizen.
I am constantly on the lookout for openly accessible Government data to see if I can do something interesting with it. That led me to build this visual guide of elected representatives of India and to write this blog entry comparing corruption at the higher levels and by average citizens.
There are a lot of websites (usually run by NGOs) that have data about Indian politicians (education, criminal records, attendance in the parliament, participation in the parliament etc.). This website has data on criminal records that got me interested. I emailed them asking (hoping) whether they have any APIs for accessing the data, but I guess that is not high on their priority list. However, what they did have was data arranged in HTML tables across lots of serially numbered pages. So I got into screen scraping mode and scraped out information on around 7800 candidates who stood in the Indian General Elections in 2009. Now, I have their educational qualifications, criminal records, total assets and liabilities (some have really obscene numbers) in my database.
The goal is now twofold.

  1. Create a simple visualization on the map of India to show the educational qualifications, assets and criminal cases of these candidates, with the ability to filter by party and state, so that viewers can see which areas had more educated candidates, which had more candidates with criminal records and where the richest candidates came from.
  2. While building the above, ensure that the API to access the data is well written so that it can be exposed to other developers interested in building more cool visualizations.
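
The filter in goal 1 could start out as small as the sketch below. Everything in it is hypothetical: the field names (party, state, criminalCases), the function names and the placeholder records are invented for illustration and are not the scraped data or its schema.

```javascript
// Placeholder records with made-up values, NOT real candidate data.
var candidates = [
  { name: "Candidate A", party: "Party X", state: "State 1", criminalCases: 0 },
  { name: "Candidate B", party: "Party Y", state: "State 1", criminalCases: 2 },
  { name: "Candidate C", party: "Party X", state: "State 2", criminalCases: 1 }
];

// Keep the candidates that match every field given in `criteria`,
// e.g. { party: "Party X" } or { party: "Party X", state: "State 2" }.
function filterCandidates(list, criteria) {
  return list.filter(function (candidate) {
    return Object.keys(criteria).every(function (key) {
      return candidate[key] === criteria[key];
    });
  });
}

// Count, per state, the candidates with at least one criminal case.
function criminalCountByState(list) {
  var counts = {};
  list.forEach(function (candidate) {
    if (candidate.criminalCases > 0) {
      counts[candidate.state] = (counts[candidate.state] || 0) + 1;
    }
  });
  return counts;
}
```

Keeping the filter independent of the map code would let the same functions back both the visualization and the API mentioned in goal 2.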

This should be fun.