Wednesday, December 14, 2011

S3 / Linux

Here are the instructions for using the s3cmd tool to access and upload data to your bucket on Amazon S3:

·         Uncompress the file using the command: tar zxf s3cmd-1.0.1.tar.gz
·         Go inside the directory: cd s3cmd-1.0.1
·         Configure it the first time you use it: ./s3cmd --configure
o   Supply the values: access key, secret key
·         Create a bucket: ./s3cmd mb s3://my-new-bucket-name
·         Run the command to list the contents of your bucket: ./s3cmd ls s3://yourbucket
·         Put (write/deposit) a file with: ./s3cmd put <localfilename> s3://yourbucket/<remote-folder-optional>/<remote-file-name>
·         Make sure the remote path contains no extra '/' characters, because S3 does not collapse repeated slashes into a single one; they become part of the object key.
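
One way to avoid stray slashes is to build the remote path with a small helper before calling s3cmd put. This is only a sketch: the function name s3_uri and the example folder and file names are made up for illustration and are not part of s3cmd.

```shell
#!/bin/sh
# s3_uri is a hypothetical helper (not part of s3cmd).
# It builds s3://<bucket>/<folder>/<file>, squeezing repeated '/'
# and trimming any leading or trailing '/' from the key.
s3_uri() {
    bucket=$1
    folder=$2
    file=$3
    # Join folder and file, squeeze runs of '/', then strip the edges.
    key=$(printf '%s/%s' "$folder" "$file" | tr -s '/' | sed -e 's#^/##' -e 's#/$##')
    printf 's3://%s/%s\n' "$bucket" "$key"
}

# Intended use (requires a configured s3cmd, so it is left commented out):
#   ./s3cmd put data.csv "$(s3_uri yourbucket "/reports//2011/" data.csv)"
```

With the folder "/reports//2011/" and the file "data.csv", the helper produces s3://yourbucket/reports/2011/data.csv, and with an empty folder it produces s3://yourbucket/data.csv.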

Friday, December 09, 2011

DATAPIPE Combo -- 1

Another of the various data pipeline combos available out there:

AVRO

HADOOP--PIG--VOLDEMORT--SINATRA 

JAVA,PIG,RUBY/JRUBY

http://datasyndrome.com/post/13707537045/booting-the-analytics-application-events-ruby





Wednesday, December 07, 2011

OOZIE HADOOP And STAGING DIR

For Oozie to work properly, /user/<user>/.staging needs its permissions set to 700.
If it is not, silent exceptions (such as a local exception in the Oozie log and an FS permission exception in the NameNode log) can waste a lot of your time.
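
The permission can be set with hadoop fs -chmod. Below is a minimal sketch, assuming the hadoop command is on the PATH and you are allowed to change the directory; the function name fix_staging_perms and the HADOOP_CMD override variable are made up for this example.

```shell
#!/bin/sh
# fix_staging_perms is a hypothetical wrapper; HADOOP_CMD is an
# illustrative override that defaults to the real hadoop CLI.
fix_staging_perms() {
    user=$1
    # 700 = rwx for the owner only, nothing for group or others.
    "${HADOOP_CMD:-hadoop}" fs -chmod 700 "/user/$user/.staging"
}

# Intended use (requires a running cluster, so it is left commented out):
#   fix_staging_perms yourusername
```

Setting HADOOP_CMD=echo before the call prints the arguments instead of running them, which is handy for checking the path before touching HDFS.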