Author: KingDragon龙帝 | Source: Internet | 2023-08-27 21:59
Opening this issue to capture an ongoing discussion around the Guava version and StopWatch exceptions that are occurring with Titan 1.1.0/TP3 when using HBase 1.1.x on Hadoop 2.7.1.
The conversation started in pull request #1216 (https://github.com/thinkaurelius/titan/pull/1216) but was moved here as more appropriate.
The current version of Guava in Titan is 18.0, while even the latest versions of HBase still depend on Guava 12.0. (A JIRA was opened against HBase to fix this, but we should not hold our breath on when it gets done.) The way Stopwatch is constructed changed between those versions (newer Guava removed the public constructor in favor of factory methods such as Stopwatch.createStarted()), which can cause the following problem:
Caused by: java.lang.IllegalAccessError: tried to access method com.google.common.base.Stopwatch.<init>()V from class org.apache.hadoop.hbase.zookeeper.MetaTableLocator
    at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:596)
    at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:580)
    at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:559)
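Not from the thread itself, but a generic diagnostic sketch for classpath conflicts of this kind: printing which jar a class was actually loaded from shows whether Titan's Guava 18.0 or the copy bundled with Hadoop/HBase won on a given JVM's classpath. The class name `WhichJar` is hypothetical; substitute `com.google.common.base.Stopwatch` for the argument on a classpath that includes Guava.

```java
import java.security.CodeSource;

// Hypothetical diagnostic: report the jar (or bootstrap classpath) that a
// given class was resolved from. Useful when two versions of the same
// library, such as Guava 12.0 and 18.0, are both reachable.
public class WhichJar {
    public static void main(String[] args) throws ClassNotFoundException {
        // Pass com.google.common.base.Stopwatch when Guava is on the classpath.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        Class<?> clazz = Class.forName(name);
        CodeSource src = clazz.getProtectionDomain().getCodeSource();
        // Classes loaded by the bootstrap loader report a null CodeSource.
        System.out.println(name + " loaded from: "
                + (src == null ? "<bootstrap classpath>" : src.getLocation()));
    }
}
```

Running this inside the same JVM (or Spark executor) that throws the IllegalAccessError tells you which Guava jar the HBase code is actually seeing.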
The problem occurred while following the bulk loading steps in Chapter 32 of the Titan 1.0 documentation.
The problem was reproduced in two separate environments. One was an IBM Open Platform environment running HBase 1.1.1 and Hadoop 2.7.1 installed with Ambari, and the other was an HDP 2.3.0.0-2557 cluster installed with Ambari. In both cases, Spark was running as a standalone cluster on the same nodes used by HDFS and HBase.
In other words, Spark was not managed by YARN in the initial tests.
It would seem this is classpath-related, but there are other factors involved too, such as, perhaps, timing.
The Stopwatch exception in my tests was intermittent.
The line in Chapter 32,

graph.compute(SparkGraphComputer).program(blvp).submit().get()

could be issued one moment and the exception would occur, then issued a second time (same Gremlin shell, same classpath, same everything) and the problem would not occur and the graph would load. This intermittent pattern was repeatable, if not precisely predictable.
In tests yesterday, I configured Spark with an explicit classpath (rather than just pointing it to a directory of jars) and was unable to reproduce the Stopwatch exception in 10 tries. I also varied the number of Spark workers in the standalone cluster from 1 to 2, including 1 remote worker, with no impact on the results. I will post the config I used (which worked).
I do not have a specific configuration right now that consistently reproduces the exception.
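For illustration only (this is not the author's promised configuration; hostnames, paths, and jar versions are hypothetical): the kind of change described above is pinning the executor classpath to an explicit, ordered list via Spark's standard extraClassPath properties, instead of globbing a jar directory. Which jar appears first determines which copy of Stopwatch the HBase code resolves, so an explicit list removes the nondeterminism of directory ordering.

```properties
# Hypothetical sketch of an explicit Spark classpath in the
# SparkGraphComputer properties file. Listing jars in a fixed order
# avoids leaving Guava resolution to directory-scan order.
spark.master=spark://master.example.com:7077
spark.executor.extraClassPath=/opt/titan/lib/guava-18.0.jar:/opt/titan/lib/*
spark.driver.extraClassPath=/opt/titan/lib/guava-18.0.jar:/opt/titan/lib/*
```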
This question comes from the open-source project: thinkaurelius/titan
Excuse me, I tried to follow your steps to run "mvn install", but after installing, titan-dist/titan-dist-hadoop-2/target is empty, and I can't get titan-1.1.0-SNAPSHOT-hadoop2.zip (titan-1.1.0-SNAPSHOT-hadoop1.zip is fine).
1) The Titan version is 1.0, fetched from git.
2) The mvn command is: mvn clean install -DskipTests=true -Dgpg.skip=true -Phadoop2 -Paurelius-release
Should some configuration be added when running mvn?
Best regards