Pydoop FSSpec Interface

This package is a Pydoop-based implementation of the FSSpec interface for interacting with the Hadoop Distributed File System (HDFS) from Python.

Features

Compatible with FSSpec and PyArrow
Provides seamless integration with Pydoop
Supports common file system operations such as listing directories, reading and writing files, and more

Installation

Build the wheel:

python3 setup.py sdist bdist_wheel

Install the package using pip:

pip install dist/pydoopfsspec-0.1.0-py3-none-any.whl

Usage

FSSpec instance

import fsspec
from pydoopfsspec import HadoopFileSystem

fsspec.register_implementation("pydoop", HadoopFileSystem)
hdfs = fsspec.filesystem("pydoop", host="hdfs", port=8020, user="root")

PyArrow instance

import fsspec
import pyarrow.fs
from pydoopfsspec import HadoopFileSystem

fsspec.register_implementation("pydoop", HadoopFileSystem)
hdfs = fsspec.filesystem("pydoop", host="hdfs", port=8020, user="root")
fs = pyarrow.fs.PyFileSystem(pyarrow.fs.FSSpecHandler(hdfs))

Unittests

By default, unittest runs test methods in lexicographical order of their names. The test method names therefore carry serial numbers to control the execution sequence.
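The ordering behaviour described above can be illustrated with a small standalone sketch. The class name `OrderedExample`, the method names, and the `calls` list are hypothetical and are not taken from `test_fsspec.py` or `test_pyarrow.py`; they only show how a numeric prefix fixes the order regardless of where each method is defined in the class.

```python
import unittest

calls = []  # records the order in which the test methods actually run


class OrderedExample(unittest.TestCase):
    # Hypothetical tests, deliberately defined out of order.
    # The default loader sorts method names as strings before running them.
    def test_02_read(self):
        calls.append("read")

    def test_01_write(self):
        calls.append("write")

    def test_03_delete(self):
        calls.append("delete")


def run_ordered():
    """Load the tests with the default loader and run them."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(OrderedExample)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_ordered()
    print(calls)
```

Because the comparison is done on strings, zero-padding the numbers (`test_01`, `test_02`, ..., `test_10`) keeps the order stable once there are ten or more tests; without padding, `test_10` would sort before `test_2`.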

Before running the tests, create a /tests directory in HDFS and upload a file named file.txt to it. The file may contain anything. For example, create file.txt locally, then run the following commands:

$HADOOP_HOME/bin/hdfs dfs -mkdir /tests
$HADOOP_HOME/bin/hdfs dfs -put file.txt /tests

Run the unit tests:

python3 test_fsspec.py
python3 test_pyarrow.py
