From 786a6048d25b63af6262e2087f627764d6e05acb Mon Sep 17 00:00:00 2001
From: brendan
Date: Tue, 6 Jun 2023 14:41:47 +1000
Subject: [PATCH] add limitations to docs as per #158

---
 docsrc/index.rst      | 1 +
 docsrc/limitations.md | 7 +++++++
 2 files changed, 8 insertions(+)
 create mode 100644 docsrc/limitations.md

diff --git a/docsrc/index.rst b/docsrc/index.rst
index edfd665..2e6d055 100644
--- a/docsrc/index.rst
+++ b/docsrc/index.rst
@@ -15,6 +15,7 @@ aims to serve as a general purpose python library for importing, analysing, mani
    /examples
    /phase_space_format
    /supported_particles
+   /limitations
    /code_docs
 
 Indices and tables
diff --git a/docsrc/limitations.md b/docsrc/limitations.md
new file mode 100644
index 0000000..968a098
--- /dev/null
+++ b/docsrc/limitations.md
@@ -0,0 +1,7 @@
+# Limitations
+
+The major limitation of this code at the time of writing is that it can only easily handle data that fits in memory. This is partly a result of choosing pandas as the backend - [this page](https://pandas.pydata.org/docs/user_guide/scale.html) describes some of the difficulties of handling large data with pandas, along with some solutions.
+
+As discussed in the above link: if your data is too big to fit in RAM, it should be possible to read and process it in 'chunks', where each chunk fits in memory. This is not supported in most data loaders, but should be possible with minimal extensions - open an issue and we can talk about it!
+
+Beyond this, libraries such as [DASK](https://www.dask.org/) may enable using this library on distributed resources. This is discussed briefly in [this issue](https://github.com/bwheelz36/ParticlePhaseSpace/issues/158), with an example of using DASK on an OpenPMD dataset.
\ No newline at end of file