Importing larger data sources
To deploy data files larger than 1 GB when Spark is deployed on a separate machine, you need to increase the parameters described below.
Edit the following parameters in the Spark JobServer config file va.conf.
The file is located in the <MashZone NextGen Explorer installation>/va-sjs/config/ directory.
Set request-chunk-aggregation-limit = 2000m in section spray.can/server.
Set max-chunk-size = 2000m in section spray.can/server/parsing.
Set short-timeout = 10 s in section spark/jobserver.
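As a rough sketch, the three settings above might look as follows in va.conf once edited. This assumes the file uses the usual HOCON nesting, so that a section path such as spray.can/server/parsing corresponds to nested blocks. An existing va.conf will most likely already contain these sections along with other keys; in that case, change the values in place rather than adding duplicate blocks.

```
# Sketch only: the nesting is an assumption based on the section paths above.
# Keep any other keys already present in these sections.
spray.can.server {
  # Upper limit for aggregated chunked requests
  request-chunk-aggregation-limit = 2000m

  parsing {
    # Upper limit for a single chunk
    max-chunk-size = 2000m
  }
}

spark.jobserver {
  short-timeout = 10 s
}
```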
Edit the following parameters in the Spark JobServer startup scripts: va-sjs.bat for MS Windows and va-sjs for Linux.
The files are located in the <MashZone NextGen Explorer installation>/va-sjs/bin/ directory.
In va-sjs.bat: Change line set DEFAULT_JVM_OPTS="" to set DEFAULT_JVM_OPTS="-Xmx8192m".
In va-sjs: Change line DEFAULT_JVM_OPTS="" to DEFAULT_JVM_OPTS="-Xmx8192m".