Using Deequ 1.1 with Spark 3

Ivan G (aloneguid) ・2 min read

If you upgrade AWS Deequ to the latest version (1.1.0 at the time of writing) and use it with Spark 3.0.1, the sbt build fails with the following error:

[error] (update) Conflicting cross-version suffixes in: org.apache.spark:spark-launcher, org.apache.spark:spark-sketch, org.apache.spark:spark-kvstore, org.json4s:json4s-ast, org.apache.spark:spark-catalyst, org.apache.spark:spark-network-shuffle, com.twitter:chill, org.apache.spark:spark-sql, org.scala-lang.modules:scala-xml, org.json4s:json4s-jackson, com.fasterxml.jackson.module:jackson-module-scala, org.json4s:json4s-core, org.apache.spark:spark-unsafe, org.json4s:json4s-scalap, org.scala-lang.modules:scala-parser-combinators, org.apache.spark:spark-tags, org.apache.spark:spark-core, org.apache.spark:spark-network-common
[error] Total time: 6 s, completed 10-Feb-2021 13:07:46

This happens because Deequ has transitive dependencies on libraries built for Scala 2.11, for some unknown reason (a bug?), and they clash with the Scala 2.12 artifacts that Spark 3.0.1 uses. You can fix that by excluding them, using the following build.sbt:

name := "dq"

scalaVersion := "2.12.12"

val sparkVersion = "3.0.1"

libraryDependencies += "org.apache.spark" %% "spark-sql" % sparkVersion % "provided"


// https://mvnrepository.com/artifact/com.amazon.deequ/deequ
// issue with Deequ transitive libs cross-compiled to Scala 2.11
libraryDependencies += ("com.amazon.deequ" % "deequ" % "1.1.0_spark-3.0-scala-2.12")
  .exclude("org.scalanlp", "breeze_2.11")
  .exclude("com.chuusai", "shapeless_2.11")
  .exclude("org.apache.spark", "spark-core_2.11")
  .exclude("org.apache.spark", "spark-sql_2.11")
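To see why the exclusions help, here is a minimal sketch of the kind of check behind the "Conflicting cross-version suffixes" error: the same module appears in the resolved dependency set with more than one Scala binary suffix. This is not sbt's actual implementation. The object name `CrossVersionCheck`, the `conflicts` helper, and the sample module list are hypothetical and only illustrate the idea.

```scala
// Illustrative only: mimics the idea behind sbt's cross-version conflict check.
// CrossVersionCheck and conflicts are hypothetical names, not part of sbt.
object CrossVersionCheck {
  // Matches a module name ending in a Scala binary suffix such as "_2.11" or "_2.12".
  private val Suffixed = """^(.+)_(2\.\d+)$""".r

  /** Returns the base module names that appear with more than one Scala suffix. */
  def conflicts(modules: Seq[String]): Map[String, Set[String]] =
    modules
      .collect { case Suffixed(base, scalaBin) => base -> scalaBin }
      .groupBy(_._1)
      .map { case (base, pairs) => base -> pairs.map(_._2).toSet }
      .filter { case (_, versions) => versions.size > 1 }

  def main(args: Array[String]): Unit = {
    // Sample data (an assumption for illustration, not real resolver output).
    val resolved = Seq(
      "org.apache.spark:spark-sql_2.12",  // added by the %% dependency in build.sbt
      "org.apache.spark:spark-sql_2.11",  // pulled in transitively via Deequ
      "org.apache.spark:spark-core_2.12",
      "org.apache.spark:spark-core_2.11",
      "com.amazon.deequ:deequ"            // no Scala suffix in the artifact name
    )
    conflicts(resolved).foreach { case (module, versions) =>
      println(s"$module: ${versions.toSeq.sorted.mkString(", ")}")
    }
  }
}
```

Each `.exclude(...)` line in the build.sbt above removes one of the `_2.11` artifacts from Deequ's transitive dependencies, so only the `_2.12` variant of each module remains.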

P.S. Originally published on my blog.
