wsxGit / spark-nlp-workshop

jsl_colab_setup_with_OCR.sh 1.86 KB
#!/bin/bash
# Default PySpark version. The Spark NLP versions, license, AWS credentials,
# and secret are read from environment variables set by the caller.
PYSPARK="3.1.1"
SPARKNLP="$PUBLIC_VERSION"
SPARKNLP_JSL="$JSL_VERSION"
SPARK_NLP_LICENSE="$SPARK_NLP_LICENSE"
AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID"
AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY"
JSL_SECRET="$SECRET"
# Allow the PySpark version to be overridden with -p <version>
while getopts p: option
do
  case "${option}" in
    p) PYSPARK=${OPTARG};;
  esac
done
SPARKHOME="/content/spark-3.1.1-bin-hadoop2.7"
echo "setup Colab for PySpark $PYSPARK and Spark NLP $SPARKNLP"
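# Replace OpenJDK 11 with OpenJDK 8. The pattern is quoted so the shell
# passes it to apt-get unexpanded.
apt-get update
apt-get purge -y 'openjdk-11*' -qq > /dev/null && apt-get autoremove -y -qq > /dev/null
apt-get install -y openjdk-8-jdk-headless -qq > /dev/null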
# Download and unpack the Spark distribution matching the requested PySpark version.
# Superseded Spark releases are served from archive.apache.org rather than downloads.apache.org.
if [[ "$PYSPARK" == "3.1"* ]]; then
  wget -q "https://archive.apache.org/dist/spark/spark-3.1.1/spark-3.1.1-bin-hadoop2.7.tgz"
  tar -xf spark-3.1.1-bin-hadoop2.7.tgz
  SPARKHOME="/content/spark-3.1.1-bin-hadoop2.7"
elif [[ "$PYSPARK" == "3.0"* ]]; then
  wget -q "https://archive.apache.org/dist/spark/spark-3.0.2/spark-3.0.2-bin-hadoop2.7.tgz"
  tar -xf spark-3.0.2-bin-hadoop2.7.tgz
  SPARKHOME="/content/spark-3.0.2-bin-hadoop2.7"
elif [[ "$PYSPARK" == "2"* ]]; then
  wget -q "https://archive.apache.org/dist/spark/spark-2.4.7/spark-2.4.7-bin-hadoop2.7.tgz"
  tar -xf spark-2.4.7-bin-hadoop2.7.tgz
  SPARKHOME="/content/spark-2.4.7-bin-hadoop2.7"
else
  # Unrecognized version: fall back to Spark 3.1.1
  wget -q "https://archive.apache.org/dist/spark/spark-3.1.1/spark-3.1.1-bin-hadoop2.7.tgz"
  tar -xf spark-3.1.1-bin-hadoop2.7.tgz
  SPARKHOME="/content/spark-3.1.1-bin-hadoop2.7"
fi
export SPARK_HOME=$SPARKHOME
export JAVA_HOME="/usr/lib/jvm/java-8-openjdk-amd64"
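# Install PySpark, Spark NLP, and Spark OCR.
# OCR_VERSION and JSL_OCR_SECRET must be set in the environment.
# The leading "!" is notebook cell syntax; in bash it inverts the exit status, so it is dropped here.
pip install implicits
pip install --upgrade -q "pyspark==$PYSPARK" "spark-nlp==$SPARKNLP" findspark
pip install "spark-ocr==$OCR_VERSION" --user --extra-index-url="https://pypi.johnsnowlabs.com/$JSL_OCR_SECRET" --upgrade --no-deps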
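
The version selection in the script repeats the same download-and-unpack steps in four branches. As a minimal sketch, the mapping from the requested PySpark version to the Spark distribution could be factored into a helper. The function name `spark_dist_for` is hypothetical and is not part of the original script; the version prefixes and distribution names are taken from the branches above.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): map a requested PySpark
# version to the name of the Spark distribution the script unpacks.
spark_dist_for() {
  case "$1" in
    3.1*) echo "spark-3.1.1-bin-hadoop2.7" ;;
    3.0*) echo "spark-3.0.2-bin-hadoop2.7" ;;
    2*)   echo "spark-2.4.7-bin-hadoop2.7" ;;
    *)    echo "spark-3.1.1-bin-hadoop2.7" ;;  # same fallback as the else branch
  esac
}

# SPARK_HOME follows from the distribution name; /content is the Colab
# working directory used by the script.
dist="$(spark_dist_for "${PYSPARK:-3.1.1}")"
echo "SPARK_HOME=/content/${dist}"
```

With a helper like this, the download URL, the tarball name, and `SPARK_HOME` could all be derived from one value, so a new Spark version would only need one extra `case` line.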