Commit 90691150 authored by Johannes Keyser

Some smoother formulations etc.

parent 171d15a0
......@@ -2,27 +2,30 @@
<img src="logo.svg" width="100px" />
*Support information how to use Git LFS on JLU GitLab, and if this is a good idea.*
_Support information how to use Git LFS on JLU GitLab, and if this is a good idea._
__Please note: This is work in progress; any [contributions](CONTRIBUTING.md) are welcome!__
_Please note: This project provides an initial overview and is partially incomplete; any [contributions](CONTRIBUTING.md) are welcome!_
[[_TOC_]]
## Why bother with Git LFS?
The main purpose of Git LFS is to treat **data** files *as conveniently as if they were inside* a Git repository, while *actually keeping them outside* of the repository.
There are technical reasons in Git's design to make this extra step necessary:
There are technical reasons in Git's design that necessitate this extra step:
- Large files will bloat the Git repository for everyone who has a clone, and degrade the performance of Git operations.
- Large files will bloat the Git repository for everyone who has a clone of it, and will degrade the performance of Git operations.
Git is optimized for text-based content, not for binary files.
- Git is designed to distribute the entire snapshot history to every clone.
If data are part of the Git repository (at any point in history!), it means it gets replicated on every clone, even if's not needed (any more).
As a result, if data are part of the Git repository (at any point in history!), it gets replicated to every clone, even if it's not needed (any more).
- Git is designed to make it impossible to delete data from the repository's history.
All you can do to "delete data" is force Git to you explicitly re-write the snapshot history.
All you can do to "delete data" is to force Git to explicitly re-write the snapshot history.
Even if you work alone, this is a bit of a hassle — it gets much worse if other people have clones, because history change requires everyone involved to confirm the deletion with their clone.
(The reason behind this design is to ensure data integrity: Nobody can make hidden changes.)
### Optional, technical details
**Optional, technical details:**
With the Git LFS extension, you can version control (large) files "in association" with a Git repository.
Instead of storing a file within the Git repository as a *blob*, Git LFS only stores *pointer files* in the repository, but stores the actual file contents on a (separate) Git LFS server (and locally, in another folder).
A file tracked by Git LFS gets downloaded only if needed, e.g. when you check out a Git branch containing the tracked file (but it also gets cached locally, if you downloaded it before).
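
To make the idea of a *pointer file* more concrete, here is a minimal sketch of what such a pointer contains: a spec version line, the SHA-256 of the real file contents, and the file size in bytes. The helper name `make_pointer` is hypothetical and only for illustration (Git LFS creates pointers itself), and the sketch assumes the `sha256sum` tool is available, as on most Linux systems.

```shell
# Print an LFS-style pointer for a file (illustration only).
# make_pointer is a hypothetical helper name, not a Git LFS command.
make_pointer() {
    file="$1"
    oid=$(sha256sum "$file" | cut -d ' ' -f 1)   # SHA-256 of the file contents
    size=$(wc -c < "$file" | tr -d ' ')          # file size in bytes
    printf 'version https://git-lfs.github.com/spec/v1\n'
    printf 'oid sha256:%s\n' "$oid"
    printf 'size %s\n' "$size"
}
```

In a real repository, `$ git lfs pointer --file=<FILE>` should print the pointer that Git LFS itself would store for that file.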
......@@ -34,6 +37,7 @@ When you `push` a commit that contains a new/changed file tracked by LFS, a *pre
## Is it a good idea to use Git LFS for your project?
You must consider several aspects before uploading any research data to JLU GitLab.
Please read [this information](https://gitlab.ub.uni-giessen.de/jlugitlab/support/-/blob/master/en/Information.md#storage-of-research-data) on research data management.
If in doubt about your specific situation, please consult the department for research data, [forschungsdaten@uni-giessen.de](mailto:forschungsdaten@uni-giessen.de).
......@@ -46,14 +50,15 @@ With this in mind, using Git LFS may be a natural choice, if:
4. all project members who need to access these data files can work with Git LFS, and
5. all machines that need to access these files have network access to the LFS server.
### Advanced option: External Git LFS server
**Advanced option: External Git LFS server**
As an alternative to storing LFS data on JLU GitLab, you could store LFS data on an external server, while still using JLU GitLab to host your project.
For example, your workgroup could run their own Git LFS server; you can choose from e.g. [this list](https://github.com/git-lfs/git-lfs/wiki/Implementations).
The advantage of an external LFS server is independence from JLU GitLab; e.g. you could implement different policies, such as potentially more suitable security practices.
## Practical tips how to use Git LFS
The following tips make the following assumptions:
- You have Git installed on your machine and you know the basics (if you don't, [here is a good starting point](https://git-scm.com/)).
......@@ -62,8 +67,8 @@ The following tips make the following assumptions:
Below, example commands are indicated with a different font and with a leading dollar sign, `$ like this`; to reproduce them, drop the dollar sign `$ `.
- You have the Git LFS extension installed on your machine (you can find [instructions here](https://git-lfs.github.com/)); you can check e.g. with typing `$ git lfs version`.
### Basic use
1. Set up Git LFS; you have to do this once per machine and repository: `$ git lfs install`.
2. In your local repository, choose what types of files to track by LFS.
- For example, to track all CSV files, type: `$ git lfs track "*.csv"`.
......@@ -78,8 +83,8 @@ The following tips make the following assumptions:
$ git push
```
### Option: Prevent download of LFS files
You may want to work with a Git repository but prevent all LFS files from download.
For example, you may want to clone a repository on a machine where you simply don't require the large files.
Or maybe you want to work with a clone on a machine without network access, leading to LFS errors.
......@@ -99,7 +104,6 @@ $ git clone <REMOTE-URL> <LOCAL-FOLDER>
TODO?: You can also ignore LFS files permanently, via Git configuration.
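
One possible way to do this, as a sketch rather than a tested recommendation, is the Git LFS setting `lfs.fetchexclude`, stored in the repository's local Git configuration. The pattern `*` is assumed here to exclude every path from LFS downloads; the throwaway repository only stands in for your own clone.

```shell
# Sketch: tell Git LFS (per repository) not to download any LFS objects.
# A throwaway repository stands in for an existing clone here.
repo_dir=$(mktemp -d)
cd "$repo_dir" || exit 1
git init -q .

# lfs.fetchexclude is a Git LFS setting; the pattern "*" matches every path.
git config --local lfs.fetchexclude "*"

# Read the setting back to confirm that it was stored.
fetch_exclude=$(git config --local --get lfs.fetchexclude)
echo "lfs.fetchexclude = $fetch_exclude"
```

To undo this later, `$ git config --local --unset lfs.fetchexclude` removes the setting, and `$ git lfs pull` should then fetch the files that were skipped.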
### Option: Exclude particular files from being tracked by LFS
The easiest way to track files with LFS is to use a general file pattern, like all CSV files (`*.csv`).
However, you may want to have a *particular* CSV file in the normal Git repository, and exclude it from the LFS pattern.
......@@ -126,7 +130,8 @@ some-directory/my-particular-file.csv !filter !diff !merge text
Because the file name leads with a dot, it may be hidden from view (to list it, use `$ ls -a` on Linux/MacOS, or FIXME: What to do on Windows?).*
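
To check whether such an exclusion has the intended effect, you can ask Git which attributes apply to a given path. The following sketch assumes a general `*.csv` pattern as written by `git lfs track "*.csv"`, followed by the exception line from above; it runs in a throwaway repository, and `other-data.csv` is a made-up file name used only for illustration.

```shell
# Sketch: check which files a .gitattributes setup would hand to LFS.
# Assumes Git is installed; uses a throwaway repository.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
git init -q .

# General LFS pattern first, then the exception for one particular file.
# Later lines override earlier ones, so the order matters.
cat > .gitattributes <<'EOF'
*.csv filter=lfs diff=lfs merge=lfs -text
some-directory/my-particular-file.csv !filter !diff !merge text
EOF

# Ask Git which "filter" attribute applies to each path
# (the files do not need to exist for this check).
tracked=$(git check-attr filter -- other-data.csv)
excluded=$(git check-attr filter -- some-directory/my-particular-file.csv)
echo "$tracked"
echo "$excluded"
```

A path reported as `filter: lfs` is handled by LFS, whereas `filter: unspecified` means the file stays in the normal Git repository.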
### Option: Lock files to avoid conflicts
FIXME: Clarify locking mechanism, mainly relevant for people working in teams, see https://github.com/git-lfs/git-lfs/wiki/File-Locking.
FIXME: Clarify locking mechanism, mainly relevant for people working in teams, see <https://github.com/git-lfs/git-lfs/wiki/File-Locking>.
## Example(s)
......@@ -135,6 +140,6 @@ FIXME: Clarify locking mechanism, mainly relevant for people working in teams, s
## Useful links
- Main website about Git LFS: https://git-lfs.github.com/
- Information on LFS on GitLab: https://docs.gitlab.com/ce/topics/git/lfs/
- A list of LFS server implementations: https://github.com/git-lfs/git-lfs/wiki/Implementations
- Main website about Git LFS: <https://git-lfs.github.com/>
- Information on LFS on GitLab: <https://docs.gitlab.com/ce/topics/git/lfs/>
- A list of LFS server implementations: <https://github.com/git-lfs/git-lfs/wiki/Implementations>