Spark JAR files let you package a project into a single file so it can be run on a Spark cluster.

A lot of developers write Spark code in browser-based notebooks because they’re unfamiliar with JAR files. Scala is a difficult language, and it’s especially challenging when you can’t leverage the development tools provided by an IDE like IntelliJ.

This episode will demonstrate how to build JAR files with the sbt package and sbt assembly commands and how to customize the code that’s included in JAR files. Hopefully it will help you make the leap and start writing Spark code in SBT projects with a powerful IDE by your side!

JAR File Basics

A JAR (Java ARchive) is a package file format typically used to aggregate many Java class files and associated metadata and resources (text, images, etc.) into one file for distribution.

JAR files can be attached to Databricks clusters or launched via spark-submit.

You can build a “thin” JAR file with the sbt package command. Thin JAR files only include the project’s classes / objects / traits and don’t include any of the project dependencies.

You can build “fat” JAR files by adding sbt-assembly to your project. Fat JAR files include all the code from your project and all the code from the dependencies.

Let’s say you add the uJson library to your build.sbt file as a library dependency.

    libraryDependencies += "com.lihaoyi" %% "ujson" % "0.6.5"

If you run sbt package, SBT will build a thin JAR file that only includes your project files. The thin JAR file will not include the uJson files.

If you run sbt assembly, SBT will build a fat JAR file that includes both your project files and the uJson files.

Let’s dig into the gruesome details!

Building a Thin JAR File

As discussed, sbt package builds a thin JAR file of your project.

spark-daria is a good example of an open source project that is distributed as a thin JAR file. This is an excerpt of the spark-daria build.sbt file:

    libraryDependencies += "" %% "spark-sql" % "2.3.0" % "provided"
    libraryDependencies += "com.lihaoyi" %% "utest" % "0.6.3" % "test"
    testFrameworks += new TestFramework("")
    artifactName := …

…jar" customizes the JAR file name that’s created by sbt assembly. Notice that sbt package and sbt assembly require different code to customize the JAR file name.
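To make the difference between sbt package and sbt assembly concrete, here is a minimal build sketch. It is not taken from spark-daria. The project name, version numbers, and the JAR naming pattern are all assumptions chosen for illustration, and the sbt-assembly version in the comment should be checked against that plugin's README. The `artifactName` key belongs to sbt itself, while `assemblyJarName` comes from the sbt-assembly plugin, which is why the two commands need separate naming settings.

```scala
// project/plugins.sbt (the plugin line lives in this separate file, not in build.sbt)
// The version below is illustrative only.
// addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "2.1.1")

// build.sbt
name := "my-spark-project" // hypothetical project name
version := "0.0.1"         // hypothetical version
scalaVersion := "2.11.12"  // Spark 2.3.0 is built against Scala 2.11

// Spark is already on the cluster, so "provided" keeps it out of the fat JAR.
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.0" % "provided"

// uJson is a normal dependency: bundled by `sbt assembly`, left out by `sbt package`.
libraryDependencies += "com.lihaoyi" %% "ujson" % "0.6.5"

// File name of the thin JAR written by `sbt package`.
artifactName := { (sv: ScalaVersion, module: ModuleID, artifact: Artifact) =>
  s"${artifact.name}_${sv.binary}-${module.revision}.${artifact.extension}"
}

// File name of the fat JAR written by `sbt assembly` (key provided by sbt-assembly).
assembly / assemblyJarName :=
  s"${name.value}_${scalaBinaryVersion.value}-${version.value}-assembly.jar"
```

With these settings, sbt package writes the thin JAR under target/scala-2.11/ and sbt assembly writes the fat JAR to the same directory under the second name. Marking Spark as "provided" is a common choice because the cluster already supplies those classes. The slash syntax in the last setting assumes a recent sbt 1.x release; older builds write the same setting as `assemblyJarName in assembly`.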