Wednesday, December 14, 2011

S3 / Linux

Here are the instructions for using the s3cmd tool to access your bucket on Amazon S3 and upload data to it:

· Uncompress the downloaded archive: tar zxf s3cmd-1.0.1.tar.gz
· Go inside the directory: cd s3cmd-1.0.1
· Configure it the first time: ./s3cmd --configure
o   Supply the values: access key, secret key
· Create a bucket if you do not have one yet: ./s3cmd mb s3://my-new-bucket-name
· List the contents of your bucket: ./s3cmd ls s3://yourbucket
· Put (write/deposit) a file: ./s3cmd put <localfilename> s3://yourbucket/<remote-folder-optional>/<remote-file-name>
· Make sure the remote path contains no extra ‘/’ characters: S3 does not collapse repeated slashes into a single one, so they become part of the object key.
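
To guard against the extra ‘/’ problem, the remote path can be cleaned up before it is handed to s3cmd. What follows is only a small sketch: s3_uri is a hypothetical helper of my own, not part of s3cmd, and the bucket and file names are placeholders.

```shell
# Hypothetical helper (not part of s3cmd): build an S3 URI from a bucket
# name and a key, squeezing repeated '/' in the key into a single one.
s3_uri() {
    bucket=$1
    key=$2
    # tr -s squeezes runs of '/'; sed drops a leading '/' if present.
    key=$(printf '%s' "$key" | tr -s '/' | sed 's#^/##')
    printf 's3://%s/%s\n' "$bucket" "$key"
}

# Usage (assumes ./s3cmd is already configured; names are placeholders):
# ./s3cmd put report.csv "$(s3_uri yourbucket 'logs//2011/report.csv')"
```

The helper only touches the key part, so the s3:// prefix itself is left alone.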

Friday, December 09, 2011

DATAPIPE Combo -- 1

Another one of the many datapipe combos available out there:

AVRO

HADOOP--PIG--VOLDEMORT--SINATRA 

JAVA,PIG,RUBY/JRUBY

http://datasyndrome.com/post/13707537045/booting-the-analytics-application-events-ruby





Wednesday, December 07, 2011

OOZIE HADOOP And STAGING DIR

For Oozie to work properly, /user/<user>/.staging needs permission 700.
If it does not have it, silent exceptions (such as a local exception in the Oozie log and an FS permission exception in the NameNode log) will possibly waste a lot of your time.
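
A minimal sketch of the fix, assuming the hadoop command is on the PATH and that you run it as the owner of the directory or as the HDFS superuser. The function name fix_staging_perms is my own invention for illustration.

```shell
# Hypothetical helper: set the staging directory of the given user to 700.
# Assumes the `hadoop` CLI is on PATH and the caller is allowed to change
# the mode (directory owner or HDFS superuser).
fix_staging_perms() {
    staging="/user/$1/.staging"
    hadoop fs -chmod 700 "$staging"
}

# Usage (the user name is a placeholder):
# fix_staging_perms alice
```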





Friday, November 18, 2011

SVN Revert

svn merge --dry-run -r 73:68
svn merge -r 73:68
svn commit -m "Reverted to revision 68."

Monday, September 05, 2011

Google Doodle : Freddie Mercury

In spite of being such a heavy user of Google, I have hardly had a chance to say thank you, but today's doodle makes me do so. So thank you, Google.


Tuesday, February 08, 2011

PIG and Skewed JOIN Issue with Empty Files

Infra
     HADOOP 0.20.2
     PIG 0.6

ISSUE
     SKEWED JOIN on EMPTY FILES

RESULT
     JOB FAILS

CONCLUSION
     DON'T DO A SKEWED JOIN IF THE INPUT DATA IS EMPTY, at least if you want to keep the job out of the failed jobs list.
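
One possible way to act on this is to check the inputs before launching the job and fall back to a plain join when something is empty. This is just a sketch under assumptions: it relies on `hadoop fs -test -z` exiting with 0 for a zero-length file on your version, it checks individual files rather than directories, and the function names and the two script names are placeholders I made up.

```shell
# Hypothetical guard: succeed only if none of the given HDFS files is empty.
# `hadoop fs -test -z PATH` is expected to exit 0 when PATH has zero length.
# Note: a missing path also makes -test fail, so it counts as non-empty
# here; add a `-test -e` check first if that matters to you.
inputs_non_empty() {
    for path in "$@"; do
        if hadoop fs -test -z "$path"; then
            return 1
        fi
    done
    return 0
}

# Print the Pig script to run; both file names are placeholders.
pick_script() {
    if inputs_non_empty "$@"; then
        echo join_skewed.pig
    else
        echo join_default.pig
    fi
}

# Usage (paths are placeholders):
# pig "$(pick_script /data/left/part-00000 /data/right/part-00000)"
```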





Recursively Delete A Directory

Sweet and powerful, isn't it? This deletes every directory named <dir> below the current directory:

rm -rf `find . -type d -name <dir>`
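
The backtick form splits the output of find on whitespace, so it can misbehave when a path contains spaces. A slightly more defensive sketch lets find run rm itself; rm_dirs_named is a hypothetical wrapper name, and the second argument (the directory to start from) is my own addition.

```shell
# Hypothetical wrapper: delete every directory called NAME below ROOT
# (ROOT defaults to the current directory).
rm_dirs_named() {
    name=$1
    root=${2:-.}
    # -prune stops find from descending into a directory it is about to
    # delete; -exec ... {} + passes the paths to rm without word splitting.
    find "$root" -type d -name "$name" -prune -exec rm -rf {} +
}

# Usage (names are placeholders):
# rm_dirs_named .svn
# rm_dirs_named target /path/to/project
```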