Spark notes
From Simson Garfinkel
Revision as of 06:23, 7 April 2019
Spark on macOS
1. Install Anaconda.
2. pip install pyspark
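A quick way to confirm the two steps took effect is to check from Python whether the pyspark package can be found. The sketch below is only a sanity check, not part of the original notes: the helper name check_spark_prereqs is made up for illustration, and the Java check is an added assumption based on the fact that PySpark needs a Java runtime to start.

```python
import importlib.util
import shutil


def check_spark_prereqs():
    """Report whether pyspark is importable and a java launcher is on PATH.

    Hypothetical helper for illustration; it does not start Spark.
    """
    return {
        # find_spec returns None when the package is not installed
        "pyspark": importlib.util.find_spec("pyspark") is not None,
        # PySpark launches a JVM, so a java executable must be reachable
        "java": shutil.which("java") is not None,
    }


print(check_spark_prereqs())
```

A True for java is only a hint: macOS ships a placeholder /usr/bin/java that exists even when no JDK is installed, so running java -version is the more reliable check.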
Adding files to the nodes:
sc.addFile(filename)
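As a minimal sketch of how sc.addFile is typically paired with SparkFiles.get on the worker side: the driver registers a file, and each task opens its local copy by base name. The file name lookup.txt, its contents, and the helper names below are assumptions made up for this example. The pyspark imports are deferred into the functions so the module can be loaded on a machine without Spark.

```python
import os
import tempfile


def make_lookup_file(directory):
    """Write a small text file on the driver to ship to the nodes."""
    path = os.path.join(directory, "lookup.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write("alpha\nbeta\n")
    return path


def count_lines_on_worker(_):
    """Runs inside a task: open the node-local copy of the added file."""
    from pyspark import SparkFiles

    # SparkFiles.get resolves the base name to the path on this node
    with open(SparkFiles.get("lookup.txt")) as f:
        return sum(1 for _ in f)


def main():
    from pyspark import SparkContext

    sc = SparkContext.getOrCreate()
    with tempfile.TemporaryDirectory() as d:
        # Register the file; Spark copies it to every node that runs a task
        sc.addFile(make_lookup_file(d))
        # Each element triggers one read of the distributed file
        counts = sc.parallelize(range(4), 2).map(count_lines_on_worker).collect()
    print(counts)
    sc.stop()
```

Call main() from a Python session where pyspark is installed. The key point is that tasks should not use the driver's original path; they look the file up by name through SparkFiles.get.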
Tutorials
Tuning
- https://stackoverflow.com/questions/37871194/how-to-tune-spark-executor-number-cores-and-executor-memory
- https://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/
- http://site.clairvoyantsoft.com/understanding-resource-allocation-configurations-spark-application/
Good Blog Entries
- https://developerzen.com/best-practices-writing-production-grade-pyspark-jobs-cb688ac4d20f (mostly about packaging and a shared context)