separate hadoop2 and hadoop3

parent e0f3dde8a3
commit e1dec20dc6
@@ -6,28 +6,46 @@ SeaweedFS excels on small files and has no issue to store large files. Now it is
 
 ```
 $cd $GOPATH/src/github.com/chrislusf/seaweedfs/other/java/client
 $ mvn install
-$cd $GOPATH/src/github.com/chrislusf/seaweedfs/other/java/hdfs
+# build for hadoop2
+$cd $GOPATH/src/github.com/chrislusf/seaweedfs/other/java/hdfs2
 $ mvn package
-$ ls -al target/seaweedfs-hadoop-client-*.jar
+$ ls -al target/seaweedfs-hadoop2-client-*.jar
+
+# build for hadoop3
+$cd $GOPATH/src/github.com/chrislusf/seaweedfs/other/java/hdfs3
+$ mvn package
+$ ls -al target/seaweedfs-hadoop3-client-*.jar
 ```
 Or you can download the latest version from MavenCentral
-* https://mvnrepository.com/artifact/com.github.chrislusf/seaweedfs-hadoop-client
-* http://central.maven.org/maven2/com/github/chrislusf/seaweedfs-hadoop-client/
+* https://mvnrepository.com/artifact/com.github.chrislusf/seaweedfs-hadoop2-client
+* http://central.maven.org/maven2/com/github/chrislusf/seaweedfs-hadoop2-client/
+
+* https://mvnrepository.com/artifact/com.github.chrislusf/seaweedfs-hadoop3-client
+* http://central.maven.org/maven2/com/github/chrislusf/seaweedfs-hadoop3-client/
 
 # Test SeaweedFS on Hadoop
 
 Suppose you are getting a new Hadoop installation. Here are the minimum steps to get SeaweedFS to run.
 
-You would need to start a weed filer first, build the seaweedfs-hadoop-client-xxx.jar, and do the following:
+You would need to start a weed filer first, build the seaweedfs-hadoop2-client-xxx.jar
+or seaweedfs-hadoop3-client-xxx.jar, and do the following:
 
 ```
 $ cd ${HADOOP_HOME}
 # create etc/hadoop/mapred-site.xml, just to satisfy hdfs dfs. skip this if the file already exists.
 $ echo "<configuration></configuration>" > etc/hadoop/mapred-site.xml
+
+# on hadoop2
 $ bin/hdfs dfs -Dfs.defaultFS=seaweedfs://localhost:8888 \
 -Dfs.seaweedfs.impl=seaweed.hdfs.SeaweedFileSystem \
--libjars ./seaweedfs-hadoop-client-x.x.x.jar \
+-libjars ./seaweedfs-hadoop2-client-x.x.x.jar \
 -ls /
+# or on hadoop3
+$ bin/hdfs dfs -Dfs.defaultFS=seaweedfs://localhost:8888 \
+-Dfs.seaweedfs.impl=seaweed.hdfs.SeaweedFileSystem \
+-libjars ./seaweedfs-hadoop3-client-x.x.x.jar \
+-ls /
 ```
 
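For anyone trying the hunk above locally, here is a minimal sketch of the two prerequisites it assumes: a running filer, and a client jar. It assumes a local `weed` binary; the `x.x.x` version is a placeholder for an actual release.

```
# start a self-contained SeaweedFS (master + volume server + filer) in one process;
# the filer listens on port 8888, matching seaweedfs://localhost:8888 above
$ weed server -filer=true

# instead of building from source, pull a published client jar from Maven Central
# (x.x.x is a placeholder; substitute a real release version)
$ mvn dependency:get \
    -Dartifact=com.github.chrislusf:seaweedfs-hadoop2-client:x.x.x
```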
@@ -57,7 +75,10 @@ $ bin/hadoop classpath
 
 # Copy SeaweedFS HDFS client jar to one of the folders
 $ cd ${HADOOP_HOME}
-$ cp ./seaweedfs-hadoop-client-x.x.x.jar share/hadoop/common/lib/
+# for hadoop2
+$ cp ./seaweedfs-hadoop2-client-x.x.x.jar share/hadoop/common/lib/
+# or for hadoop3
+$ cp ./seaweedfs-hadoop3-client-x.x.x.jar share/hadoop/common/lib/
 ```
 
 Now you can do this:
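Once the jar sits in `share/hadoop/common/lib/` as above, passing `-Dfs.defaultFS`, `-Dfs.seaweedfs.impl`, and `-libjars` on every invocation is no longer necessary: the same two properties can live in `etc/hadoop/core-site.xml`. A minimal sketch, reusing the property names and the example filer address from the hunks above:

```
$ cd ${HADOOP_HOME}
# write a core-site.xml pointing Hadoop at the filer
# (this overwrites any existing core-site.xml; merge by hand if you have one)
$ cat > etc/hadoop/core-site.xml <<'EOF'
<configuration>
  <property>
    <name>fs.seaweedfs.impl</name>
    <value>seaweed.hdfs.SeaweedFileSystem</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>seaweedfs://localhost:8888</value>
  </property>
</configuration>
EOF

# now plain hdfs commands hit SeaweedFS with no extra flags
$ bin/hdfs dfs -ls /
```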
@@ -83,12 +104,12 @@ To make these files visible to Spark, set HADOOP_CONF_DIR in $SPARK_HOME/conf/sp
 
 ## Installation not inheriting from Hadoop cluster configuration
 
-Copy the seaweedfs-hadoop-client-x.x.x.jar to all executor machines.
+Copy the seaweedfs-hadoop2-client-x.x.x.jar to all executor machines.
 
 Add the following to spark/conf/spark-defaults.conf on every node running Spark:
 ```
-spark.driver.extraClassPath /path/to/seaweedfs-hadoop-client-x.x.x.jar
-spark.executor.extraClassPath /path/to/seaweedfs-hadoop-client-x.x.x.jar
+spark.driver.extraClassPath /path/to/seaweedfs-hadoop2-client-x.x.x.jar
+spark.executor.extraClassPath /path/to/seaweedfs-hadoop2-client-x.x.x.jar
 ```
 
 And modify the configuration at runtime:
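The page continues with the runtime configuration, which is cut off here. As an illustrative sketch only (not the original continuation): the same settings can be supplied per job via `spark-submit --conf`, using the `extraClassPath` keys from the hunk above and Spark's standard `spark.hadoop.*` prefix for Hadoop properties. Jar paths, the class name, and the application jar are placeholders.

```
$ bin/spark-submit \
    --conf spark.driver.extraClassPath=/path/to/seaweedfs-hadoop2-client-x.x.x.jar \
    --conf spark.executor.extraClassPath=/path/to/seaweedfs-hadoop2-client-x.x.x.jar \
    --conf spark.hadoop.fs.seaweedfs.impl=seaweed.hdfs.SeaweedFileSystem \
    --conf spark.hadoop.fs.defaultFS=seaweedfs://localhost:8888 \
    --class com.example.MyApp \
    my-app.jar   # placeholder application
```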