Replies: 34 comments
-
@casperdcl, I'm not sure how much benefit it would bring, given that it is not that hard to create a Docker image for DVC: Also, testing this would be the same as testing DVC on a Linux machine (or at least, it should be). I doubt that users want to use it as a base image, since it doesn't provide anything other than DVC. I'd prefer not to maintain this one, to be honest.
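For readers wondering what "not that hard" might look like, here is a minimal sketch, written as a shell snippet that generates the Dockerfile. The base image `python:3.11-slim`, the file name `Dockerfile.dvc`, the tag `dvc-local`, and the `/repo` working directory are all assumptions for illustration, not something the DVC project ships.

```shell
#!/bin/sh
# Sketch only: writes a minimal Dockerfile that installs DVC from PyPI.
# Base image, file name, and the /repo workdir are illustrative assumptions.
write_dvc_dockerfile() {
    cat > "$1" <<'EOF'
FROM python:3.11-slim
RUN pip install --no-cache-dir dvc
WORKDIR /repo
ENTRYPOINT ["dvc"]
EOF
}

write_dvc_dockerfile Dockerfile.dvc

# Building and running needs a local Docker daemon, so it is left commented out:
# docker build -t dvc-local -f Dockerfile.dvc .
# docker run --rm -t -v "$PWD":/repo dvc-local status
```

A real project would probably add its own dependencies (TF, PyTorch, a DVC remote extra, and so on) to the same `RUN` step, which is the argument made in this thread for copying the lines into your own image.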
-
I would also argue that when doing some data science stuff in Docker, it's probably easier to install DVC in your own image than to try to adjust a DVC image to your requirements (like installing TF/PyTorch/...).
-
@pared that was my point about
as in people could copy-paste from our docker file into theirs... I assumed it's more complex than just a
-
@casperdcl ok, I didn't quite get it.
That would surely help to build a custom image.
-
@casperdcl Hm,
-
@efiop the
-
@efiop can confirm
-
@casperdcl Ah, got it. yeah,
-
Yes, I did start making a few flavours of docker images for testing (alpine, ubuntu LTS, 2.7, 3.6 etc) ages ago which are probably sitting in a
Now I just use a conda env for dvc testing.
-
btw totally fine with this issue being closed - don't actually have any strong opinions about it.
-
@casperdcl No reason to close it. Docker images (or at least Dockerfiles) would be nice to have, for sure. 🙂
-
I just came here looking for exactly that. I instantly go looking for a Dockerfile so I can easily test the software. That is because I use different software (mostly R), and don't use Python outside Docker (because Python environment libraries change a lot, and nobody seems to agree which one is best - conda, virtualenv, pyenv, pipenv etc. - which, to complicate things further, have different functionality).
-
@nettoyoussef Thanks for the feedback! Do you need pre-built images, or would a Dockerfile in our docs do?
-
@casperdcl thank you for this idea! I agree with @pared and @MrOutis that it is easy to create your own Docker image and that it might create additional support overhead for us. However, a prebuilt Docker image gives value to users, and @nettoyoussef showed an example. It can attract users' attention and improve usability despite the simple implementation. But then documentation plays the major role. Can we make a good documentation page or even a small blog post that explains the motivation behind using a Docker image instead of an installed tool, and when it is needed? Why don't we start with the doc/blog post and then implement the Docker image?
-
Thank you for being so helpful. Personally, a Dockerfile would suffice. The community, however, might benefit more from a pre-built image. Instead of building one from Ubuntu, you could make your life easier and, e.g., start from a miniconda image. I think this can be easy to automate, and maybe you can even delegate this to other teams - a partnership with rocker, for example. Since from what I read
It also makes it easier to implement in existing projects - since you don't have to rebuild the images just to try it.
-
maybe I'm missing something but it looks like they're using
On a related note I like where this is heading https://github.com/iterative/dvc-cml/wiki/Tensorflow-Mnist-for-Github-Actions
-
@casperdcl it is using Dockerfiles. Index.js is there just to support GH users who don't want to or cannot use Docker. You can find it in the workflow files.
-
Right. Seems a bit odd to provide a nodejs action for public use via the standard
-
@casperdcl what would be your suggestions for that project? How should it be organized?
-
Advantages:
Surely we should discuss this in an issue on that repo though?
-
you mean prettier configs, etc, etc?
could you elaborate?
same here, could you elaborate?
-
er, just a general principle of removing as much as possible. Some tools expect files to be in the root so we're mostly stuck there, ofc.
It's cumbersome for us to maintain multiple, well, entrypoints to our actual code. If we want to support both
This way we can use the docker wrapper for the action. Thus running the action will test our docker wrapper as well as the underlying entrypoint. The additional advantage is that all deps are guaranteed to be installed in the docker container.
-
I feel that I'm still missing something :) The Docker entrypoint for the image we provide already does this, right? It already runs Node. And the image itself has the JS bundle pre-installed. There is no very strong reason to support a direct Docker-less action, but that's a separate topic.
-
yes I was making several minor points, I think we're all missing small things but nothing major :)
-
Hi! We are pushing to use the code through Docker. The main reasons are:
The MAIN reason why the JS action is maintained is that there is no way macOS or Windows can run specific native tools in Docker. So if a user were using, e.g., CoreML with Xcode, the only way to make this work for them is through the pure GitHub JS action, and only in GitHub.
-
Hi, we're using the
-
Hi, @hsharrison. Could you please mention your issue in the CML's repo, specifically on iterative/cml#217? This way, it'll be easier and faster for you. Thanks.
-
👋 @hsharrison feel free to open a ticket there. Could you please also describe your pain point?
-
My mistake, sorry for the noise.
-
Hello, I couldn't find instructions for containers on the installation page of DVC, and I found this discussion after searching for docker in the issues list. Reading the comments above, I think there was no clear rationale for adding a container, even though there were at least two attempts:
I think I understand the resistance to adding a Dockerfile or instructions without a clear reason for doing so, especially as it would be something else to maintain along with the code, tests, docs, and pip/snap/etc. installation methods. But I thought that if users explain why they want this, maybe that could help a little.
In my case, I work with HPC climate workflows. These workflows run climate models with large datasets, and we have recently started tests using DVC instead of managing model data the traditional way (hard-coded paths on HPCs). This is looking promising, especially as we are using multiple HPCs across Europe.
One issue with HPCs is that many environments are either offline/secluded, have older versions of Python, or have issues resolving dependencies. It's really very likely that you won't be able to get the same version of a tool installed with
Singularity containers are used where possible to address this portability issue, and to install tools like DVC on the HPC. I am quickly creating a container with python or micromamba where I will just do a
So it would be useful to have, if not a Dockerfile or container, then just a brief explanation in the docs saying that it is recommended to use
Thanks for DVC!
-Bruno
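As a rough illustration of the Singularity route described above, the snippet below writes a small definition file. It is only a sketch: the base image `python:3.11-slim`, the file names `dvc.def` and `dvc.sif`, and the assumption that a plain `pip install` of DVC is enough are all hypothetical choices, not an official recipe.

```shell
#!/bin/sh
# Sketch only: writes a Singularity/Apptainer definition file that installs DVC.
# Base image and file names are illustrative assumptions.
write_dvc_def() {
    cat > "$1" <<'EOF'
Bootstrap: docker
From: python:3.11-slim

%post
    pip install --no-cache-dir dvc

%runscript
    exec dvc "$@"
EOF
}

write_dvc_def dvc.def

# Building needs Singularity/Apptainer and network access, so it would usually
# happen off the cluster; the resulting .sif file can then be copied to the HPC:
# singularity build dvc.sif dvc.def
# singularity run dvc.sif status
```

Because the image is a single file, the same DVC version can be carried to each cluster without depending on the Python that happens to be installed there.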
-
Provide docker images
docker run --rm -t -v $PWD:/repo dvc status
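If such an image existed, the command above could be wrapped in a small shell function so that it feels like a local install. This is a sketch under assumptions: the image tag `dvc-local` is hypothetical, and the image is assumed to use `dvc` as its entrypoint. The `DOCKER` and `DVC_IMAGE` variables are conveniences introduced here, not part of DVC.

```shell
#!/bin/sh
# Sketch only: run DVC from a container as if it were installed locally.
# DVC_IMAGE defaults to a hypothetical local tag. DOCKER can be overridden,
# e.g. DOCKER=echo prints the arguments instead of starting a container.
dvc_in_docker() {
    "${DOCKER:-docker}" run --rm -t \
        -v "$PWD":/repo -w /repo \
        "${DVC_IMAGE:-dvc-local}" "$@"
}

# Usage (requires Docker and an image whose entrypoint is `dvc`):
# dvc_in_docker status
```

Mounting the current directory at `/repo` and setting it as the working directory means DVC sees the project exactly where the command was run.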