Druid Indexing & Xerces Hell – Loader Constraint Violation

Recently my team mate Vihag came across an java.lang.Linkage error while doing druid indexing with CDH 5.10.2

It was good fun we must say finding out the reason.  hehe taking md5 of class files and comparing and finding out which jar the class is loaded from.

—————————————-

2018-05-11 11:03:04,848 FATAL [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
java.lang.LinkageError: loader constraint violation: when resolving overridden method “org.apache.xerces.jaxp.DocumentBuilderImpl.newDocument()Lorg/w3c/dom/Document;” the class loader (instance of org/apache/hadoop/util/ApplicationClassLoader) of the current class, org/apache/xerces/jaxp/DocumentBuilderImpl, and its superclass loader (instance of <bootloader>), have different Class objects for the type org/w3c/dom/Document used in the signature
at org.apache.xerces.jaxp.DocumentBuilderFactoryImpl.newDocumentBuilder(Unknown Source)
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2541)
at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2503)
at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2409)
at org.apache.hadoop.conf.Configuration.get(Configuration.java:1233)
at org.apache.hadoop.yarn.factory.providers.RecordFactoryProvider.getRecordFactory(RecordFactoryProvider.java:49)
at org.apache.hadoop.mapreduce.TypeConverter.<clinit>(TypeConverter.java:62)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:481)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:469)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.callWithJobClassLoader(MRAppMaster.java:1579)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:469)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:391)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1537)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1534)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1467)
2018-05-11 11:03:04,851 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status 

—————————————-

Vihag has documented the resolution nicely here @ github

—————————————-

The crux of the matter was that The class ‘DocumentBuilderImpl‘ was present in Druid in the xercesImpl-2.9.1.jar and Yarn was loading it from xercesImpl-2.10.0.jar.

So we removed 2.9.1 jar from druid and replaced it with 2.10.0 jar

Also we noticed that 2.10 jar plays well with xml-apis-1.4.01.jar & not xml-apis-1.3.04.jar

Hope this helps you.