Below is my PySpark quickstart guide.

At first, we need to install the requests library for Python; every client snippet below uses it to call Livy's REST API:

```bash
pip install requests
```

Creating an interactive session through a POST to `/sessions` gives us a session id. In our case, since it is the first session, the id is 0:

```python
import json
import pprint
import textwrap

import requests

host = 'http://localhost:8998'  # Livy endpoint, matching the curl example below
headers = {'Content-Type': 'application/json'}

session_url = host + '/sessions/0'
```

Now we should generate the code and submit it to the corresponding statements REST API:

```python
statements_url = session_url + '/statements'
data = {
    'code': textwrap.dedent("""\
        import org.apache.spark.SparkContext
        import org.apache.spark.sql.hive.HiveContext

        val sqlContext = new HiveContext(sc)
        val df = sqlContext.sql("use db1")
        val res = sqlContext.sql("select * from tb1 limit 10").collect()
        res.map(t => t(0)).foreach(println)
        """)
}
stm_r = requests.post(statements_url, data=json.dumps(data), headers=headers)
stm_r.json()
```

After that, we can get the running result from the statement REST API using the following code:

```python
statement_url = host + stm_r.headers['Location']
res = requests.get(statement_url, headers=headers)
pprint.pprint(res.json())
```

Livy also handles batch jobs. When we POST the application file to the `/batches` endpoint, Livy will automatically create a new session, identified by a batchId, for this job:

```bash
curl -X POST \
  --data '{"file":"file:/home/yular/livy_code/test.py","conf":{"spark.executor.extraClassPath":"/home/yular/hadoop-lzo-0.6.0.2.2.4.2-2.jar","spark.driver.extraClassPath":"/home/yular/hadoop-lzo-0.6.0.2.2.4.2-2.jar"}}' \
  -H "Content-Type: application/json" \
  localhost:8998/batches
```

The Python file submitted here is test.py; a sketch of its content is shown below.
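The original content of test.py is not reproduced in the post. Since the post later refers to "the Pi example from before" from the Spark examples, a minimal Monte Carlo Pi estimation is a reasonable stand-in; the file name matches the curl command above, but the app name and sample count are assumptions:

```python
# test.py -- hypothetical stand-in for the submitted file, modeled on the
# Monte Carlo Pi estimation from the Spark examples.
import random

from pyspark import SparkContext

NUM_SAMPLES = 100000  # assumed sample count

def sample(_):
    # Draw a random point in the unit square; count it if it falls
    # inside the quarter circle of radius 1.
    x, y = random.random(), random.random()
    return 1 if x * x + y * y < 1 else 0

if __name__ == '__main__':
    sc = SparkContext(appName='PiExample')  # assumed app name
    count = sc.parallelize(range(0, NUM_SAMPLES)).map(sample) \
              .reduce(lambda a, b: a + b)
    print('Pi is roughly %f' % (4.0 * count / NUM_SAMPLES))
    sc.stop()
```

Once submitted, the batch can be inspected through the same REST API; a minimal status check, assuming the batch above was assigned id 0, might look like:

```python
import requests

host = 'http://localhost:8998'
headers = {'Content-Type': 'application/json'}

batch_id = 0  # assumed: taken from the 'id' field of the POST /batches response
r = requests.get(host + '/batches/%d' % batch_id, headers=headers)
print(r.json()['state'])  # e.g. 'starting', 'running', 'success'
```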

In Livy, the structure of the REST API is `/sessions/{sessionId}/statements/{statementId}`; each statement is executed against the session's Spark context. That was a pretty simple example, and the whole client application is built using the requests library.

It is strongly recommended to configure Spark to submit applications in YARN cluster mode. That makes sure that user sessions have their resources properly accounted for in the YARN cluster, and that the host running the Livy server doesn't become overloaded when multiple user sessions are running. Livy is also useful in the case you don't have access to spark-submit on an edge node.

Note that a POST to the statements endpoint returns early and provides a statement URL that can be polled until the statement is complete:
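A minimal polling sketch, reusing `statement_url`, `headers`, `requests`, and `pprint` from the snippets above (the one-second sleep interval is an arbitrary choice):

```python
import time

# Poll until Livy marks the statement's output as available.
while True:
    res = requests.get(statement_url, headers=headers).json()
    if res['state'] == 'available':
        pprint.pprint(res['output'])  # contains status, execution count, data
        break
    time.sleep(1)
```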

Livy also supports Spark 2.0+ for both interactive and batch submission, so you can seamlessly move the same REST-based workflow to a newer Spark version.

Livy requires at least Spark 1.6 and supports both Scala 2.10 and 2.11 builds of Spark. You can get Spark releases at https://spark.apache.org/downloads.html.

If you have questions, there is a user group (http://groups.google.com/a/cloudera.org/group/livy-user) and a dev group (http://groups.google.com/a/cloudera.org/group/livy-dev).

PySpark has the same API, just with a different initial request. The Pi example from before, which comes from the Spark examples, can then be run as shown in the sketch below.
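The PySpark snippets themselves are not reproduced in the post, so here is a minimal sketch under the same assumptions as before (Livy at localhost:8998; `NUM_SAMPLES` as in test.py). The initial request only changes the session kind to `pyspark`; the Pi code is then submitted as a statement and runs against the session's `sc`:

```python
import json
import textwrap

import requests

host = 'http://localhost:8998'
headers = {'Content-Type': 'application/json'}

# The only difference for PySpark is the kind of the session.
data = {'kind': 'pyspark'}
r = requests.post(host + '/sessions', data=json.dumps(data), headers=headers)
session_url = host + r.headers['Location']

# In practice, poll the session until its state becomes 'idle' before
# posting statements.

# Submit the Pi estimation code; it executes against the session's sc.
data = {
    'code': textwrap.dedent("""\
        import random
        NUM_SAMPLES = 100000
        def sample(_):
            x, y = random.random(), random.random()
            return 1 if x * x + y * y < 1 else 0
        count = sc.parallelize(range(0, NUM_SAMPLES)).map(sample).reduce(lambda a, b: a + b)
        print('Pi is roughly %f' % (4.0 * count / NUM_SAMPLES))
        """)
}
r = requests.post(session_url + '/statements', data=json.dumps(data),
                  headers=headers)
print(r.json())
```

The result can then be polled from the statement URL exactly as in the Scala walkthrough above.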