Configuring a PySpark environment in PyCharm
I followed this blog post, which is fairly reliable:
https://blog.csdn.net/ringsuling/article/details/84448369
Environment variables used:
(The configuration is not working yet.)
C:\file\spark_package\spark-2.4.4-bin-hadoop2.7
Copy
C:\file\spark_package\spark-2.4.4-bin-hadoop2.7\python\pyspark
into
C:\Users\Carry Wan\AppData\Local\Programs\Python\Python37-32\Lib\site-packages
After copying, the package sits at
C:\Users\Carry Wan\AppData\Local\Programs\Python\Python37-32\Lib\site-packages\pyspark
SPARK_HOME C:\file\spark_package\spark-2.4.4-bin-hadoop2.7
PYTHONPATH C:\file\spark_package\spark-2.4.4-bin-hadoop2.7\python
pip install py4j
JAVA_HOME C:\Program Files\Java\jdk1.8.0_162\bin
(JAVA_HOME normally points at the JDK root, without the trailing \bin. Spark's launch scripts append \bin\java themselves, so this value may be why the setup fails.)
Add:
C:\file\spark_package\spark-2.4.4-bin-hadoop2.7\python\lib\py4j-0.10.7-src.zip
C:\file\spark_package\spark-2.4.4-bin-hadoop2.7\python\lib\pyspark.zip
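
The same settings can also be applied from inside a script before importing pyspark, which is sometimes easier to debug than system-wide variables. The following is only a minimal sketch under that assumption: the helper names spark_paths and apply_spark_env are hypothetical (they are not part of PySpark), and the Spark directory is the one listed in these notes.

```python
import glob
import os
import sys


def spark_paths(spark_home):
    """Return the sys.path entries PySpark needs for a given Spark directory."""
    python_dir = os.path.join(spark_home, "python")
    lib_dir = os.path.join(python_dir, "lib")
    # The py4j archive name carries a version (py4j-0.10.7-src.zip in
    # Spark 2.4.4), so look it up instead of hard-coding it.
    py4j_zips = sorted(glob.glob(os.path.join(lib_dir, "py4j-*-src.zip")))
    # pyspark.zip is returned without checking that it exists.
    return [python_dir, os.path.join(lib_dir, "pyspark.zip")] + py4j_zips


def apply_spark_env(spark_home, java_home=None):
    """Set SPARK_HOME (and optionally JAVA_HOME) and extend sys.path."""
    os.environ["SPARK_HOME"] = spark_home
    if java_home:
        os.environ["JAVA_HOME"] = java_home
    added = []
    for entry in spark_paths(spark_home):
        if entry not in sys.path:
            sys.path.insert(0, entry)
            added.append(entry)
    return added


if __name__ == "__main__":
    # Path taken from the notes above; adjust to the local install.
    apply_spark_env(r"C:\file\spark_package\spark-2.4.4-bin-hadoop2.7")
    try:
        import pyspark
        print("pyspark imported from", pyspark.__file__)
    except ImportError as exc:
        print("pyspark still not importable:", exc)
```

If the import still fails after this runs, the printed message at least separates path problems from Java or Spark launch problems, which only show up later when a SparkContext is created.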