Spark notes

From Simson Garfinkel
Revision as of 12:10, 26 May 2018 by Simson (talk | contribs)

Adding files to the nodes:

   sc.addFile(filename)
  • https://medium.com/@rbahaguejr/adding-python-files-to-pyspark-job-b725e02c8ab2
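A minimal sketch of how sc.addFile is typically paired with SparkFiles.get, assuming PySpark is installed and the file is a small text file. The helper names read_first_line and run are hypothetical, made up for this illustration; only SparkContext.getOrCreate, sc.addFile, sc.parallelize and SparkFiles.get are PySpark calls. For Python modules that tasks need to import, sc.addPyFile(path) is the related call.

```python
import os


def read_first_line(local_path):
    """Read the first line of a local file (runs on an executor)."""
    with open(local_path) as f:
        return f.readline().strip()


def run(filename):
    """Ship `filename` to every node and read it back inside tasks.

    Hypothetical helper: pyspark is imported here so the rest of the
    module can be loaded without a Spark installation.
    """
    from pyspark import SparkContext, SparkFiles

    sc = SparkContext.getOrCreate()
    sc.addFile(filename)  # copies the file to every node for this job

    # SparkFiles.get takes the file's base name, not the original path,
    # and returns the location of the node-local copy.
    basename = os.path.basename(filename)

    def task(_):
        return read_first_line(SparkFiles.get(basename))

    # Two partitions, so the file is read in two separate tasks.
    return sc.parallelize(range(2), 2).map(task).collect()
```

Calling run("lookup.txt") on a cluster (the file name is a placeholder) should return one copy of the file's first line per task. The path is resolved inside the task because the local copy can live in a different directory on each node.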


Tutorials

  • https://medium.com/@rbahaguejr/adding-python-files-to-pyspark-job-b725e02c8ab2

Tuning

  • https://stackoverflow.com/questions/37871194/how-to-tune-spark-executor-number-cores-and-executor-memory
  • https://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/
  • http://site.clairvoyantsoft.com/understanding-resource-allocation-configurations-spark-application/


Good Blog Entries

  • https://developerzen.com/best-practices-writing-production-grade-pyspark-jobs-cb688ac4d20f (mostly about packaging and a shared context)