java client 2.81

chrislu 2021-12-05 18:20:33 -08:00
parent edb20ae349
commit 0a88de5e09
5 changed files with 27 additions and 28 deletions

@@ -26,7 +26,7 @@ Then get the seaweedfs hadoop client jar.
 ```
 cd share/hadoop/common/lib/
-wget https://oss.sonatype.org/service/local/repositories/releases/content/com/github/chrislusf/seaweedfs-hadoop2-client/1.7.0/seaweedfs-hadoop2-client-1.7.0.jar
+wget https://oss.sonatype.org/service/local/repositories/releases/content/com/github/chrislusf/seaweedfs-hadoop2-client/2.81/seaweedfs-hadoop2-client-2.81.jar
 ```
 # TestDFSIO Benchmark
@@ -142,4 +142,3 @@ As for `join`, every DataFrame join with itself on one column. Following is the
 | Stddev | 0.934 | 1.381 |
 | Min | 24.006 | 22.275 |
 | Max | 30.991 | 30.279 |
-
@@ -10,12 +10,12 @@ $ mvn install
 # build for hadoop2
 $cd $GOPATH/src/github.com/chrislusf/seaweedfs/other/java/hdfs2
 $ mvn package
-$ ls -al target/seaweedfs-hadoop2-client-1.7.0.jar
+$ ls -al target/seaweedfs-hadoop2-client-2.81.jar
 # build for hadoop3
 $cd $GOPATH/src/github.com/chrislusf/seaweedfs/other/java/hdfs3
 $ mvn package
-$ ls -al target/seaweedfs-hadoop3-client-1.7.0.jar
+$ ls -al target/seaweedfs-hadoop3-client-2.81.jar
 ```
 Maven
@@ -23,7 +23,7 @@ Maven
 <dependency>
 <groupId>com.github.chrislusf</groupId>
 <artifactId>seaweedfs-hadoop3-client</artifactId>
-<version>1.7.0</version>
+<version>2.81</version>
 </dependency>
 or
@@ -31,23 +31,23 @@ or
 <dependency>
 <groupId>com.github.chrislusf</groupId>
 <artifactId>seaweedfs-hadoop2-client</artifactId>
-<version>1.7.0</version>
+<version>2.81</version>
 </dependency>
 ```
 Or you can download the latest version from MavenCentral
 * https://mvnrepository.com/artifact/com.github.chrislusf/seaweedfs-hadoop2-client
-* [seaweedfs-hadoop2-client-1.7.0.jar](https://oss.sonatype.org/service/local/repositories/releases/content/com/github/chrislusf/seaweedfs-hadoop2-client/1.7.0/seaweedfs-hadoop2-client-1.7.0.jar)
+* [seaweedfs-hadoop2-client-2.81.jar](https://oss.sonatype.org/service/local/repositories/releases/content/com/github/chrislusf/seaweedfs-hadoop2-client/2.81/seaweedfs-hadoop2-client-2.81.jar)
 * https://mvnrepository.com/artifact/com.github.chrislusf/seaweedfs-hadoop3-client
-* [seaweedfs-hadoop3-client-1.7.0.jar](https://oss.sonatype.org/service/local/repositories/releases/content/com/github/chrislusf/seaweedfs-hadoop3-client/1.7.0/seaweedfs-hadoop3-client-1.7.0.jar)
+* [seaweedfs-hadoop3-client-2.81.jar](https://oss.sonatype.org/service/local/repositories/releases/content/com/github/chrislusf/seaweedfs-hadoop3-client/2.81/seaweedfs-hadoop3-client-2.81.jar)
 # Test SeaweedFS on Hadoop
 Suppose you are getting a new Hadoop installation. Here are the minimum steps to get SeaweedFS to run.
-You would need to start a weed filer first, build the seaweedfs-hadoop2-client-1.7.0.jar
-or seaweedfs-hadoop3-client-1.7.0.jar, and do the following:
+You would need to start a weed filer first, build the seaweedfs-hadoop2-client-2.81.jar
+or seaweedfs-hadoop3-client-2.81.jar, and do the following:
 ```
 # optionally adjust hadoop memory allocation
@@ -60,12 +60,12 @@ $ echo "<configuration></configuration>" > etc/hadoop/mapred-site.xml
 # on hadoop2
 $ bin/hdfs dfs -Dfs.defaultFS=seaweedfs://localhost:8888 \
 -Dfs.seaweedfs.impl=seaweed.hdfs.SeaweedFileSystem \
--libjars ./seaweedfs-hadoop2-client-1.7.0.jar \
+-libjars ./seaweedfs-hadoop2-client-2.81.jar \
 -ls /
 # or on hadoop3
 $ bin/hdfs dfs -Dfs.defaultFS=seaweedfs://localhost:8888 \
 -Dfs.seaweedfs.impl=seaweed.hdfs.SeaweedFileSystem \
--libjars ./seaweedfs-hadoop3-client-1.7.0.jar \
+-libjars ./seaweedfs-hadoop3-client-2.81.jar \
 -ls /
 ```
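Note: the `-D` and `-libjars` flags above apply to a single invocation only. The same two settings can be made permanent in `etc/hadoop/core-site.xml` — a minimal sketch, assuming a filer at localhost:8888 and the client jar already on the Hadoop classpath:
```
<configuration>
  <!-- register the SeaweedFS filesystem implementation -->
  <property>
    <name>fs.seaweedfs.impl</name>
    <value>seaweed.hdfs.SeaweedFileSystem</value>
  </property>
  <!-- make the SeaweedFS filer the default filesystem -->
  <property>
    <name>fs.defaultFS</name>
    <value>seaweedfs://localhost:8888</value>
  </property>
</configuration>
```
With this in place, `bin/hdfs dfs -ls /` needs no extra flags.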
@@ -112,9 +112,9 @@ $ bin/hadoop classpath
 # Copy SeaweedFS HDFS client jar to one of the folders
 $ cd ${HADOOP_HOME}
 # for hadoop2
-$ cp ./seaweedfs-hadoop2-client-1.7.0.jar share/hadoop/common/lib/
+$ cp ./seaweedfs-hadoop2-client-2.81.jar share/hadoop/common/lib/
 # or for hadoop3
-$ cp ./seaweedfs-hadoop3-client-1.7.0.jar share/hadoop/common/lib/
+$ cp ./seaweedfs-hadoop3-client-2.81.jar share/hadoop/common/lib/
 ```
 Now you can do this:

@@ -5,10 +5,10 @@ The installation steps are divided into 2 steps:
 * https://cwiki.apache.org/confluence/display/Hive/AdminManual+Metastore+Administration
 ### Configure Hive Metastore to support SeaweedFS
-1. Copy the seaweedfs-hadoop2-client-1.7.0.jar to hive lib directory,for example:
+1. Copy the seaweedfs-hadoop2-client-2.81.jar to hive lib directory,for example:
 ```
-cp seaweedfs-hadoop2-client-1.7.0.jar /opt/hadoop/share/hadoop/common/lib/
-cp seaweedfs-hadoop2-client-1.7.0.jar /opt/hive-metastore/lib/
+cp seaweedfs-hadoop2-client-2.81.jar /opt/hadoop/share/hadoop/common/lib/
+cp seaweedfs-hadoop2-client-2.81.jar /opt/hive-metastore/lib/
 ```
 2. Modify core-site.xml
 modify core-site.xml to support SeaweedFS, 30888 is the filer port
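The `core-site.xml` change itself falls outside this diff's context. A sketch of its likely shape, reusing the two properties shown in the Hadoop section with the filer port 30888 mentioned above (the hostname is an assumption):
```
<configuration>
  <property>
    <name>fs.seaweedfs.impl</name>
    <value>seaweed.hdfs.SeaweedFileSystem</value>
  </property>
  <!-- 30888 is the filer port per the step above; localhost is assumed -->
  <property>
    <name>fs.defaultFS</name>
    <value>seaweedfs://localhost:30888</value>
  </property>
</configuration>
```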
@@ -50,9 +50,9 @@ metastore.thrift.port is the access port exposed by the Hive Metadata service it
 Follow instructions for installation of Presto:
 * https://prestosql.io/docs/current/installation/deployment.html
 ### Configure Presto to support SeaweedFS
-1. Copy the seaweedfs-hadoop2-client-1.7.0.jar to Presto directory,for example:
+1. Copy the seaweedfs-hadoop2-client-2.81.jar to Presto directory,for example:
 ```
-cp seaweedfs-hadoop2-client-1.7.0.jar /opt/presto-server-347/plugin/hive-hadoop2/
+cp seaweedfs-hadoop2-client-2.81.jar /opt/presto-server-347/plugin/hive-hadoop2/
 ```
 2. Modify core-site.xml

@@ -1,7 +1,7 @@
 # Installation for HBase
 Two steps to run HBase on SeaweedFS
-1. Copy the seaweedfs-hadoop2-client-1.7.0.jar to `${HBASE_HOME}/lib`
+1. Copy the seaweedfs-hadoop2-client-2.81.jar to `${HBASE_HOME}/lib`
 1. And add the following 2 properties in `${HBASE_HOME}/conf/hbase-site.xml`
 ```
@@ -27,4 +27,4 @@ Two steps to run HBase on SeaweedFS
 Visit HBase Web UI at `http://<hostname>:16010` to confirm that HBase is running on SeaweedFS
 ![](HBaseOnSeaweedFS.png)
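The two `hbase-site.xml` properties referenced above are outside this diff's context. A hedged sketch of what they plausibly are — `fs.seaweedfs.impl` as used throughout these docs, plus HBase's standard `hbase.rootdir` pointed at the filer (the rootdir value here is an assumption):
```
<configuration>
  <!-- assumption: point HBase's root directory at the SeaweedFS filer -->
  <property>
    <name>hbase.rootdir</name>
    <value>seaweedfs://localhost:8888/hbase</value>
  </property>
  <!-- filesystem implementation, as registered elsewhere in these docs -->
  <property>
    <name>fs.seaweedfs.impl</name>
    <value>seaweed.hdfs.SeaweedFileSystem</value>
  </property>
</configuration>
```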

@@ -11,12 +11,12 @@ To make these files visible to Spark, set HADOOP_CONF_DIR in $SPARK_HOME/conf/sp
 ## installation not inheriting from Hadoop cluster configuration
-Copy the seaweedfs-hadoop2-client-1.7.0.jar to all executor machines.
+Copy the seaweedfs-hadoop2-client-2.81.jar to all executor machines.
 Add the following to spark/conf/spark-defaults.conf on every node running Spark
 ```
-spark.driver.extraClassPath=/path/to/seaweedfs-hadoop2-client-1.7.0.jar
-spark.executor.extraClassPath=/path/to/seaweedfs-hadoop2-client-1.7.0.jar
+spark.driver.extraClassPath=/path/to/seaweedfs-hadoop2-client-2.81.jar
+spark.executor.extraClassPath=/path/to/seaweedfs-hadoop2-client-2.81.jar
 ```
 And modify the configuration at runtime:
@@ -37,8 +37,8 @@ And modify the configuration at runtime:
 1. change the spark-defaults.conf
 ```
-spark.driver.extraClassPath=/Users/chris/go/src/github.com/chrislusf/seaweedfs/other/java/hdfs2/target/seaweedfs-hadoop2-client-1.7.0.jar
-spark.executor.extraClassPath=/Users/chris/go/src/github.com/chrislusf/seaweedfs/other/java/hdfs2/target/seaweedfs-hadoop2-client-1.7.0.jar
+spark.driver.extraClassPath=/Users/chris/go/src/github.com/chrislusf/seaweedfs/other/java/hdfs2/target/seaweedfs-hadoop2-client-2.81.jar
+spark.executor.extraClassPath=/Users/chris/go/src/github.com/chrislusf/seaweedfs/other/java/hdfs2/target/seaweedfs-hadoop2-client-2.81.jar
 spark.hadoop.fs.seaweedfs.impl=seaweed.hdfs.SeaweedFileSystem
 ```
@@ -81,8 +81,8 @@ spark.history.fs.cleaner.enabled=true
 spark.history.fs.logDirectory=seaweedfs://localhost:8888/spark2-history/
 spark.eventLog.dir=seaweedfs://localhost:8888/spark2-history/
-spark.driver.extraClassPath=/Users/chris/go/src/github.com/chrislusf/seaweedfs/other/java/hdfs2/target/seaweedfs-hadoop2-client-1.7.0.jar
-spark.executor.extraClassPath=/Users/chris/go/src/github.com/chrislusf/seaweedfs/other/java/hdfs2/target/seaweedfs-hadoop2-client-1.7.0.jar
+spark.driver.extraClassPath=/Users/chris/go/src/github.com/chrislusf/seaweedfs/other/java/hdfs2/target/seaweedfs-hadoop2-client-2.81.jar
+spark.executor.extraClassPath=/Users/chris/go/src/github.com/chrislusf/seaweedfs/other/java/hdfs2/target/seaweedfs-hadoop2-client-2.81.jar
 spark.hadoop.fs.seaweedfs.impl=seaweed.hdfs.SeaweedFileSystem
 spark.hadoop.fs.defaultFS=seaweedfs://localhost:8888
 ```
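The same settings shown in spark-defaults.conf can also be supplied per job instead of cluster-wide. A sketch using `spark-submit --conf`, where `MyApp` and `myapp.jar` are placeholders:
```
spark-submit \
  --conf spark.driver.extraClassPath=/path/to/seaweedfs-hadoop2-client-2.81.jar \
  --conf spark.executor.extraClassPath=/path/to/seaweedfs-hadoop2-client-2.81.jar \
  --conf spark.hadoop.fs.seaweedfs.impl=seaweed.hdfs.SeaweedFileSystem \
  --conf spark.hadoop.fs.defaultFS=seaweedfs://localhost:8888 \
  --class MyApp myapp.jar
```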