diff --git a/README.md b/README.md
index 887b03711567437cbc600745b65c52a689e1ff28..6dd629ca6f1900a8c272e8a61c5ec7c4738f5f80 100644
--- a/README.md
+++ b/README.md
@@ -22,7 +22,7 @@ There are technical reasons in Git's design to make this extra step necessary:
 Even if you work alone, this is a bit of a hassle — it gets much worse if other people have clones, because history change requires everyone involved to confirm the deletion with their clone.
 
 
-### Optional: More technical details
+**Optional, technical details:**
 With the Git LFS extension, you can version control (large) files "in association" with a Git repository.
 Instead of storing a file within the Git repository as a *blob*, Git LFS only stores *pointer files* in the repository, but stores the actual file contents on a (separate) Git LFS server (and locally, in another folder).
 A file tracked by Git LFS gets downloaded only if needed, e.g. when you check out a Git branch containing the tracked file (but it also gets cached locally, if you downloaded it before).
@@ -47,7 +47,7 @@ With this in mind, using Git LFS may be a natural choice, if:
 5. all machines that need to access these files have network access to the LFS server.
 
 
-### Optional: External Git LFS server
+**Advanced option: External Git LFS server**
 As an alternative to storing LFS data on JLU GitLab, you could store LFS data on an external server, while still using JLU GitLab to host your project.
 For example, your workgroup could run their own Git LFS server; you can choose from e.g. [this list](https://github.com/git-lfs/git-lfs/wiki/Implementations).
 The advantage of an external LFS server is independence from JLU GitLab; e.g. you could implement different policies, such as potentially more suitable security practices.
@@ -66,10 +66,10 @@ The following tips make the following assumptions:
 ### Basic use
 1. Set up Git LFS; you have to do this once per machine and repository: `$ git lfs install`.
 2. In your local repository, choose what types of files to track by LFS.
-   - For example, to track all `CSV` files, type: `$ git lfs track "*.csv"`.
+   - For example, to track all CSV files, type: `$ git lfs track "*.csv"`.
    - This will create/change the Git configuration file [`.gitattributes`](.gitattributes).
-     You should track this configuration change in the repository itself, with the usual Git commands `$ git add .gitattributes` and `$ git commit -m "start tracking CSV files with LFS"`.
-     *Note that because the file name `.gitattributes` starts with a dot, it may be hidden from view (on Linux and MacOS, use `$ ls -a` to see it; FIXME: What to do on Windows?).*
+     You should track this configuration change in the repository itself, with the usual Git commands `$ git add .gitattributes` and `$ git commit -m "start tracking CSV files with LFS"`.
+   - *Note: Because the file name `.gitattributes` starts with a dot, it may be hidden from view (to list it, use `$ ls -a` on Linux/macOS, or FIXME: What to do on Windows?).*
 3. Now you can interact with the LFS-tracked files in the usual way to control versions with Git. For example, to make a new snapshot with a file `some_data.csv`, use the usual commands `add`, `commit`, and `push` like any other file in the repository:
 
 ```
@@ -79,22 +79,19 @@ The following tips make the following assumptions:
 ```
 
 
-### Optional: File locking mechanism
-FIXME: Clarify locking mechanism, may be especially relevant for people working in teams, see https://github.com/git-lfs/git-lfs/wiki/File-Locking.
-
-
-### Optional: Ignore LFS files
-You may want to work with a Git repository but ignore the (large) files stored by LFS.
-For example, you may want to clone a repository on a machine that doesn't have access to the Git LFS server, or you simply don't require the large files.
+### Option: Prevent download of LFS files
+You may want to work with a Git repository without downloading any LFS files.
+For example, you may want to clone a repository on a machine where you simply don't require the large files.
+Or you may want to work with a clone on a machine without network access to the LFS server, where LFS downloads would fail with errors.
 
 To temporally ignore the LFS content, you can set the [environment variable](https://en.wikipedia.org/wiki/Environment_variable) called `GIT_LFS_SKIP_SMUDGE` to the value `1`.
-To stop ignoring LFS, set the variable to `0`.
+(And to stop ignoring LFS files, just set the variable to `0`.)
 
 The syntax to set the variable depends on your command line interface:
 - On Windows, type `$ set GIT_LFS_SKIP_SMUDGE=1`.
 - For Bash (e.g. Linux), type `$ export GIT_LFS_SKIP_SMUDGE=1`.
 
-After that, you can e.g. clone the repository without downloading the LFS files:
+After that, you can e.g. clone the repository as usual, without Git attempting to download the LFS files:
 
 ```
 $ git clone <REMOTE-URL> <LOCAL-FOLDER>
@@ -102,6 +99,36 @@ $ git clone <REMOTE-URL> <LOCAL-FOLDER>
 
 TODO?: You can also ignore LFS files permanently, via Git configuration.
 
+
+### Option: Exclude particular files from being tracked by LFS
+The easiest way to track files with LFS is to use a general file pattern, such as `*.csv` for all CSV files.
+However, you may want to keep a *particular* CSV file in the normal Git repository, and exclude it from the LFS pattern.
+This can be done by editing the `.gitattributes` file directly, with any text editor you want.
+
+**Example**
+Let's say you're tracking all CSV files in Git LFS, i.e. your `.gitattributes` file contains a line like this:
+
+```
+*.csv filter=lfs diff=lfs merge=lfs -text
+```
+
+Let's assume you want to exclude `some-directory/my-particular-file.csv` from LFS, and put it into the normal Git repo instead.
+To do that, add the line `some-directory/my-particular-file.csv !filter !diff !merge text` to the `.gitattributes` file, below the `*.csv` line (when several lines match a file, later lines override earlier ones).
+
+After editing, `.gitattributes` should contain these two lines:
+
+```
+*.csv filter=lfs diff=lfs merge=lfs -text
+some-directory/my-particular-file.csv !filter !diff !merge text
+```
+
+*Note: The file `.gitattributes` is just a text file, so you can edit it with any text editor.
+Because the file name starts with a dot, it may be hidden from view (to list it, use `$ ls -a` on Linux/macOS, or FIXME: What to do on Windows?).*
+
+### Option: Lock files to avoid conflicts
+FIXME: Clarify locking mechanism, mainly relevant for people working in teams, see https://github.com/git-lfs/git-lfs/wiki/File-Locking.
+
+
 ## Example(s)
 - [Here](example) you can find an of an analysis script that relies on data stored in Git LFS.
 - TODO: Maybe add more examples with different data (audio, video, images)?
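
The manual `.gitattributes` edit described in the new "Exclude particular files from being tracked by LFS" section could also be scripted. The following is only a sketch under assumptions: the helper name `exclude_from_lfs` and the scratch-directory demo are hypothetical and not part of the README; the path and the attribute line are the ones used in the README example.

```shell
# Hypothetical helper (not part of the README): append an LFS exclusion
# line for one file to .gitattributes, unless that exact line already exists.
exclude_from_lfs() {
    file_path="$1"
    attr_line="$file_path !filter !diff !merge text"
    # -x matches whole lines, -F treats the line as a literal string.
    grep -qxF -- "$attr_line" .gitattributes 2>/dev/null \
        || printf '%s\n' "$attr_line" >> .gitattributes
}

# Demo in a scratch directory, so that no real repository is touched.
demo_dir="$(mktemp -d)"
cd "$demo_dir" || exit 1

# Start from the general LFS pattern used in the README example.
printf '%s\n' '*.csv filter=lfs diff=lfs merge=lfs -text' > .gitattributes

exclude_from_lfs "some-directory/my-particular-file.csv"
exclude_from_lfs "some-directory/my-particular-file.csv"   # second call adds nothing

cat .gitattributes
```

Appending (rather than prepending) keeps the exclusion below the `*.csv` pattern, which matters because later `.gitattributes` lines override earlier ones for the same file. In a real repository you would call the helper from the repository root and then commit the changed `.gitattributes` as usual.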