
Jatin Madaan
Apr 28, 2019 · 1 min read
Hive on Spark simple program
## PySpark code to run a SQL command. Code:
## Importing HiveContext
>>> from pyspark.sql import HiveContext
## Create a SQLContext...
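The excerpt above breaks off at the SQLContext step. A minimal sketch of the same idea, assuming the pyspark shell and a hypothetical table demo_db.demo_table, might look like this:

from pyspark import SparkContext
from pyspark.sql import HiveContext

## In the pyspark shell `sc` already exists; otherwise create one
sc = SparkContext(appName="hive_sql_demo")

## HiveContext exposes tables registered in the Hive metastore
sqlContext = HiveContext(sc)

## Run a SQL statement and show the result (table name is a placeholder)
df = sqlContext.sql("SELECT * FROM demo_db.demo_table LIMIT 10")
df.show()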





To load data from a CSV file (it can be pipe-, tab-, or comma-separated): Step 1: Create a table with the delimiter given in the file. Command:...
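The command itself is cut off in the excerpt. A sketch of both steps, expressed through PySpark's HiveContext so it stays in the same setting as the rest of the blog (database, table, columns, delimiter and file path are all placeholders):

from pyspark import SparkContext
from pyspark.sql import HiveContext

sc = SparkContext(appName="csv_load_demo")
sqlContext = HiveContext(sc)

## Step 1: create a table whose field delimiter matches the file (',' here; use '|' or '\t' for pipe or tab)
sqlContext.sql("""
    CREATE TABLE IF NOT EXISTS demo_db.sales_raw (
        order_id STRING,
        amount   STRING
    )
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    STORED AS TEXTFILE
""")

## Step 2: load the file into the table
sqlContext.sql("LOAD DATA INPATH '/tmp/sales.csv' INTO TABLE demo_db.sales_raw")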

To run Oracle commands on an Oracle server using PySpark. For EMR, first install the driver: sudo su; pip install cx_Oracle==6.0b1. Function 1:...
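The functions themselves are cut off. A minimal cx_Oracle sketch of connecting and running one query (host, port, service name, credentials and query are all placeholders):

import cx_Oracle

## Connection details are placeholders; substitute your own host, port, service name and credentials
dsn = cx_Oracle.makedsn("oracle-host.example.com", 1521, service_name="ORCL")
conn = cx_Oracle.connect(user="app_user", password="app_password", dsn=dsn)

cur = conn.cursor()
cur.execute("SELECT sysdate FROM dual")
print(cur.fetchone())

cur.close()
conn.close()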

There is a simple command, although it will run a MapReduce job; still, in case it is required: last_year=$(hive -e "select...
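The query is cut off in the excerpt. The same pattern, capturing the output of hive -e, rendered in Python rather than shell (the query shown is only a placeholder):

import subprocess

## Placeholder query; the original post's query is not visible in the excerpt
query = "SELECT year(current_date) - 1"

## Equivalent of last_year=$(hive -e "...") : run hive silently and capture stdout
last_year = subprocess.check_output(["hive", "-S", "-e", query]).decode().strip()
print(last_year)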

To copy files to the local machine we can use the command: aws s3 cp s3://bucket_name/folder_name/file_name.txt . There is a dot at the end to...

We can perform almost all hadoop fs commands on the S3 file system as well. E.g.: hadoop fs -du -s -h s3://bucket_name/folder_name returns 10.1 G ...

While running a Hive query using the hive -e or hive -f command, merely writing rc=$? below the hive command will not help; it will only tell if...
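The rest of the explanation is missing here, so the exact caveat is not recoverable from the excerpt. As a general sketch only, one way to make the exit status explicit is to drive the Hive script from Python and check the return code directly (the script path and the failure policy are assumptions):

import subprocess
import sys

## Run the Hive script and capture its exit status explicitly (path is a placeholder)
rc = subprocess.call(["hive", "-f", "/home/hadoop/scripts/load_sales.hql"])

if rc != 0:
    ## Propagate the failure instead of silently continuing
    sys.exit(rc)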


Get parameters such as workflow_name, start_date, end_date and parameter_file as input from a file. Loop through the dates to get all date values...
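A small sketch of the date-loop part in Python, with hard-coded placeholder dates standing in for the values read from the parameter file:

from datetime import datetime, timedelta

## Placeholder values; in the post these come from the parameter file
start_date = datetime.strptime("2019-04-01", "%Y-%m-%d")
end_date = datetime.strptime("2019-04-05", "%Y-%m-%d")

## Walk through every date from start_date to end_date, inclusive
current = start_date
while current <= end_date:
    print(current.strftime("%Y-%m-%d"))
    current += timedelta(days=1)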

To connect to an AWS cluster (EMR or EC2) via the terminal on a Mac: first make sure you download the .pem file from the AWS account. Once the file has been...

Until loop:
until [[ $flag > 1 ]]
do
  [code]
done

alias sr="cd /[path_to_folder]"

import time
import sys
import subprocess
## Getting the start time of the job
start_time = time.time()
## Importing Spark and HiveContext to...
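The excerpt stops at the imports. A minimal sketch of where it appears to be heading, timing a Hive query end to end (app name and query are placeholders; sys and subprocess from the original imports are presumably used in later steps not shown here):

import time
from pyspark import SparkContext
from pyspark.sql import HiveContext

## Record when the job starts
start_time = time.time()

sc = SparkContext(appName="hive_job_timing")
sqlContext = HiveContext(sc)

## Placeholder query; replace with the real workload
sqlContext.sql("SELECT COUNT(*) FROM demo_db.demo_table").show()

## Report how long the job took
elapsed = time.time() - start_time
print("Job finished in %.1f seconds" % elapsed)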
