Difference between revisions of "Spark notes"

From Simson Garfinkel
(Created page with "Adding files to the nodes: sc.addFile(filename)")


     sc.addFile(filename)
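
A minimal sketch of how sc.addFile might be used end to end, assuming PySpark is installed and a local master is acceptable. The file name lookup.txt, the app name, and the helper names write_lookup_file and demo are hypothetical and only there for illustration. The idea is that the driver ships the file with sc.addFile, and each task resolves its local copy with SparkFiles.get.

```python
import os
import tempfile


def write_lookup_file(directory):
    """Write a small two-line text file on the driver and return its path.

    The file name 'lookup.txt' is a made-up example.
    """
    path = os.path.join(directory, "lookup.txt")
    with open(path, "w") as f:
        f.write("alpha\nbeta\n")
    return path


def demo():
    """Ship a file to the nodes and read it back inside tasks.

    Requires PySpark; it is imported here so the helper above
    stays usable without it.
    """
    from pyspark import SparkContext, SparkFiles

    # Assumption: a local two-thread master, for trying this on one machine.
    sc = SparkContext("local[2]", "addfile-demo")
    try:
        path = write_lookup_file(tempfile.mkdtemp())
        sc.addFile(path)  # distribute the file to every node
        name = os.path.basename(path)

        def count_lines(_):
            # On the executor, SparkFiles.get returns the local copy's path.
            with open(SparkFiles.get(name)) as f:
                return sum(1 for _ in f)

        # One task per partition; each reports how many lines it saw.
        return sc.parallelize(range(2), 2).map(count_lines).collect()
    finally:
        sc.stop()
```

Calling demo() from a PySpark environment should return one line count per task. For shipping Python modules rather than data files, sc.addPyFile is the related call.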
==Tutorials==
* https://medium.com/@rbahaguejr/adding-python-files-to-pyspark-job-b725e02c8ab2
==Tuning==
* https://stackoverflow.com/questions/37871194/how-to-tune-spark-executor-number-cores-and-executor-memory
* https://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/
* http://site.clairvoyantsoft.com/understanding-resource-allocation-configurations-spark-application/
==Good Blog Entries==
* https://developerzen.com/best-practices-writing-production-grade-pyspark-jobs-cb688ac4d20f (mostly about packaging and a shared context)

Revision as of 12:10, 26 May 2018