Monday, June 16, 2008

UIMA Hadoop

A complex topic for a post, I know. This is what I have been working on lately, and I have been struggling hard with it, scouring forums for answers I could not find anywhere on the web. Strange but true: some things are still not available on the web. In the process of seeking answers I somehow managed to educate myself. Even more surprising, the answer I was seeking turned out to be good enough to go on their wiki, and I was the one who wrote it up, after piecing the solution together from partial hints from all over the place. Why am I writing about this? Not to brag, but because I want to keep the solution for myself, and moreover because this is probably my first ever contribution of any kind to an open source project and to the community. So here is the solution I posted, along with the link to the wiki:

Running UIMA Apps on Hadoop

And here is the original content, since who knows who may edit the wiki page later.

Problem:

Make a simple UIMA app work on Hadoop.

Assumptions:

     1. You have tested Hadoop and have it running.
     2. You have a standalone UIMA app which has been tested.

How To

     1. Let the UIMA app be a simple name annotation example which uses a type system, nameType, for name annotations. Let the descriptors be nameAnnotator.xml and nameType.xml.
     2. Write map and reduce classes within the application, along with a job specifier.
     3. In these map/reduce classes, annotate the input values they receive.
     4. Create a job jar out of the application.
     5. Run it on Hadoop.

It will not work. There are several other things which have to be taken care of first.

Important Considerations (before creating/running the job jar on Hadoop)

     1. The jar file should contain all the classes and descriptors of the UIMA app, along with the map/reduce classes and the job main class.
     2. All imports in the UIMA descriptors (be it analysis engine, aggregate engine, CAS consumer, etc.) should be imports by name.
     3. Anything that reads a resource should go through the class loader, i.e. input streams should be created using the class loader. For example, to read an XML source:
          XMLInputSource in = new XMLInputSource(
              ClassLoader.getSystemResourceAsStream(aeXmlDescriptor), null);
     4. Last but not least, a ResourceManager should be used when producing any analysis engine, CAS consumer, etc.

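     E.g. for point 4:
     // import org.apache.uima.UIMAFramework;
     // import org.apache.uima.resource.ResourceManager;
     ResourceManager rMng = UIMAFramework.newDefaultResourceManager();
     // str is the path to any of the resources, which can be obtained via
     // ClassLoader.getSystemResource(aeXmlDescriptor).getPath()
     // Both setters below can throw java.net.MalformedURLException.
     rMng.setExtensionClassPath(str, true);
     rMng.setDataPath(str);
     aEngine = UIMAFramework.produceAnalysisEngine(aSpecifier, rMng, null);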
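
Point 2 can be checked mechanically before the job jar is built. The sketch below is only an illustration and not part of the original wiki content: it uses the JDK's XML parser to list every import element in a descriptor that still uses a location attribute instead of a name attribute. The class name ImportByNameCheck, the helper descriptorWithImport and the sample descriptor text are made up for the example, and it assumes the descriptor uses the default namespace without prefixes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper: finds descriptor imports that are not "by name". */
class ImportByNameCheck {

    /** Returns the location value of every import element that has a location attribute. */
    static List<String> importsByLocation(String descriptorXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            descriptorXml.getBytes(StandardCharsets.UTF_8)));
            // Assumption: unprefixed element names, so a plain tag-name match is enough.
            NodeList imports = doc.getElementsByTagName("import");
            List<String> offenders = new ArrayList<>();
            for (int i = 0; i < imports.getLength(); i++) {
                Element imp = (Element) imports.item(i);
                if (imp.hasAttribute("location")) {
                    offenders.add(imp.getAttribute("location"));
                }
            }
            return offenders;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse descriptor", e);
        }
    }

    /** Builds a tiny sample descriptor whose single import carries the given attribute. */
    static String descriptorWithImport(String importAttribute) {
        return "<analysisEngineDescription xmlns=\"http://uima.apache.org/resourceSpecifier\">"
                + "<analysisEngineMetaData><typeSystemDescription><imports>"
                + "<import " + importAttribute + "/>"
                + "</imports></typeSystemDescription></analysisEngineMetaData>"
                + "</analysisEngineDescription>";
    }

    public static void main(String[] args) {
        // Import by location: reported as a problem. Prints [nameType.xml]
        System.out.println(importsByLocation(descriptorWithImport("location=\"nameType.xml\"")));
        // Import by name: nothing to report. Prints []
        System.out.println(importsByLocation(descriptorWithImport("name=\"nameType\"")));
    }
}
```

An import by name such as name="nameType" is looked up on the class path and data path (here nameType.xml at the root of the jar), while an import by location is resolved relative to the descriptor that contains it, which no longer works once the descriptor is read from a stream inside the job jar.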

     Point 4 matters because when we read an XML file without using the class loader, by default it is read from the temporary task directory, i.e.
/tmp/hadoop-root/mapred/local/taskTracker/jobcache/job_200806112341_0002/task_200806112341_0002_m_000000_0/
     but all the resources get unjarred in the
/tmp/hadoop-root/mapred/local/taskTracker/jobcache/job_200806112341_0002/work
     directory.
     So to tell the system to look for the resources in the correct directory, we have to use the ResourceManager. It is what takes care of the resources which UIMA will try to load because of the imports present in its various descriptors.
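
The difference between the two lookups can be seen with the JDK alone. The following is a minimal sketch, not the code from the wiki: a temporary directory stands in for the job's work directory, a URLClassLoader stands in for the task's class path, and the class name ClasspathLookupDemo is made up. A relative File path is resolved against the current working directory of the process, whereas the class loader finds the descriptor wherever its class path points.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical demo: relative file lookup versus class loader lookup. */
class ClasspathLookupDemo {

    /** Opens a resource through the given class loader and fails loudly if it is missing. */
    static InputStream open(ClassLoader loader, String resourceName) {
        InputStream in = loader.getResourceAsStream(resourceName);
        if (in == null) {
            throw new IllegalArgumentException("Not on the class path: " + resourceName);
        }
        return in;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the directory where the job jar gets unpacked.
        Path workDir = Files.createTempDirectory("work");
        Files.write(workDir.resolve("nameAnnotator.xml"),
                "<analysisEngineDescription/>".getBytes(StandardCharsets.UTF_8));

        // A relative path is resolved against the process working directory,
        // which is generally not the directory holding the descriptor.
        File relative = new File("nameAnnotator.xml");
        System.out.println(relative.getAbsolutePath() + " exists: " + relative.exists());

        // The class loader searches its class path, here the stand-in work directory.
        try (URLClassLoader loader = new URLClassLoader(new URL[] {workDir.toUri().toURL()});
             InputStream in = open(loader, "nameAnnotator.xml")) {
            System.out.println("first character via class loader: " + (char) in.read());
        }
    }
}
```

Failing with a clear message when getResourceAsStream returns null is a design choice of this sketch; otherwise the null stream only shows up later as a less obvious error.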

30 comments:

Atish said...

wtf... oops, sorry .. i mean WTF!!!!!!!!!!!!!

Rohan Rai said...

hahahha...wat can i say ...but i can only do 1 thing is to giggle..I hope its not to girly or aishwarya rai types..:)

Ninad said...

Rohan in the third step I am getting

InvalidXMLException: Invalid descriptor at .

do u know why??

Rohan Rai said...

There was another thread:
http://osdir.com/ml/apache.uima.general/2008-06/msg00095.html

It explains a little more of the problem, regarding the need to fake imports.

You can try that.

If your annotator or any other XML is running fine in a standalone UIMA app, then this post and the link should be enough to make it work.

Ninad said...

Can u elaborate more on faking imports...??

I didn't understand it clearly.
