Git repository management

From Hepplestone Research Group

This page is a guide to setting up and using git version control for developing software.

Git is a fantastic tool for storing code, tracking changes, and keeping a history. It allows for easy swapping between different versions and different development branches (such as implementing different features at the same time) within a single codebase. It reinforces good coding practices, such as tracking changes, writing readable comments, and clearly documenting the reasoning behind the improvements in each new software release.

WARNING! Ensure that you are using git/2.37.3 or later! In terminal, run module load git/2.37.3

WARNING! Whenever initialising a git repository, ensure that main is set as the default branch, not master.
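As a rough illustration of the two warnings above, the following sketch checks the git version and creates repositories whose branch is called main. The scratch directory, the repository names (example-code, old-code) and the identity values are placeholders for illustration only.

```shell
set -eu
# Placeholder identity so the sketch runs without a ~/.gitconfig (hypothetical values)
export GIT_AUTHOR_NAME="A Cde" GIT_AUTHOR_EMAIL="a.b.cde@exeter.ac.uk"
export GIT_COMMITTER_NAME="A Cde" GIT_COMMITTER_EMAIL="a.b.cde@exeter.ac.uk"

git --version                      # should report 2.37.3 or later

# Create a throwaway repository whose first branch is called main
work=$(mktemp -d)
git init -q --initial-branch=main "$work/example-code"
git -C "$work/example-code" symbolic-ref --short HEAD    # prints: main

# A repository that started out on master can be renamed in place
git init -q --initial-branch=master "$work/old-code"
git -C "$work/old-code" commit -q --allow-empty -m "Add initial commit"
git -C "$work/old-code" branch -m master main
```

Setting init.defaultBranch in ~/.gitconfig (described later on this page) makes the --initial-branch option unnecessary.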

Accessing the group git

The git host that we use for group work is the Exeter GitLab (we use this because it is offered to us and no one has to create a new login).

Use the following steps to access the Hepplestone git group:

  1. Go to the Exeter GitLab website https://git.exeter.ac.uk
  2. Log in using your university credentials
  3. Request an invite from one of the Hepplestone research group Maintainers (or higher); this is done verbally or via email
  4. FOR MAINTAINER OR HIGHER:
    1. Go to the Hepplestone git group page: https://git.exeter.ac.uk/groups/hepplestone
    2. On the left sidebar of the page, select Manage -> Members
    3. On the Members page, select "Invite members" on the top-right of the page
    4. Enter the person's username, select their role (defaults = Reporter for Masters/code users, Developer for basic coders (safe version), Maintainer for high-level coders)
    5. Optional: set an expiration date for their access
    6. Invite
  5. Go to https://git.exeter.ac.uk/groups/hepplestone to confirm access

Note: A hand-wavy description of the roles follows:

  • Developer:
    • access to creating new group projects (PLEASE CONFIRM)
    • access to existing projects within the group
    • clone repositories
    • create branches
    • commit to branches
    • cannot commit to main (the default) branch of a project
  • Maintainer:
    • Developer+
    • can commit to main branch

Connecting via ssh to remote repository

Firstly, we want to connect to GitLab using ssh (so that we don't have to put in our username and password every time we upload changes to the remote repository). This requires a few steps:

  1. go to git.exeter.ac.uk, go to the left sidebar and click on your avatar (top right of sidebar), go to Preferences.
  2. In Preferences, go to the left sidebar and select "SSH Keys".
  3. Click on "Add new key" on the right-hand side of the page.
  4. Follow the instructions on the page (or here). This will set up your device with an ssh key for a defined amount of time to authenticate your access to GitLab.
    1. Open a terminal
    2. Run ssh-keygen -t rsa -b 2048 [-C "<GitLab>"]. <GitLab> is an optional comment associated with the key.
    3. Press ↵ Enter.
    4. You will then be prompted to provide a filepath/filename for your new id key (with RSA encryption).
      • Pressing ↵ Enter without inserting anything defaults to the filepath in parentheses.
      • Providing just a filename will save it in the present working directory
      • As an example: one could provide "id_rsa_gitlab" to have a separate ssh key for GitLab.
    5. Press ↵ Enter.
    6. Specify a passphrase and confirm. This will be requested of you in the future the first time your session is connecting to GitLab, so effectively once every computer restart.
    7. A public and private key are now generated.
    8. Go to where you saved your ssh key to (e.g. /home/links/<USER>/.ssh/).
    9. Find the associated public key, e.g. id_rsa.pub for default (or, <FILENAME>.pub, where <FILENAME> is the filename you entered when prompted earlier).
    10. Copy the public key file contents.
  5. Back on the GitLab webpage, paste the public key into the "Key" input field.
  6. Provide a title to remind you of what device it relates to.
  7. Leave the Usage type as default and set an expiration date (or leave blank if you want it never to expire).
  8. In terminal, run emacs ~/.ssh/config and ensure the following lines are in the file:
Host git.exeter.ac.uk
  HostName git.exeter.ac.uk
  User <UNI USERNAME, e.g. abc123>
  ForwardAgent yes
  IdentityFile <ssh-key filepath, not the .pub one, e.g. ~/.ssh/id_rsa_gitlab>

You have now set up your ssh protocol for accessing GitLab.

Next, you want to change your local git config to automate more of this.

In terminal, type emacs ~/.gitconfig. Edit the git file so that it looks like:

 [user]
         email = <UNI EMAIL ADDRESS, e.g. a.b.cde@exeter.ac.uk>
         name = <YOUR NAME, e.g. A Cde>
 [push]
         default = simple
 [credential "https://git.exeter.ac.uk/"]
         username = <UNI USERNAME, e.g. abc123>
 [init]
         defaultBranch = main

This sets the name and email recorded in your commits, your default username for requests sent to the remote repository, and main as the default branch for new repositories.
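If you would rather not edit the file by hand, the same settings can likely be written with git config. This is a minimal sketch using the example values from above; the throwaway HOME is an assumption made only so that the sketch does not touch a real ~/.gitconfig.

```shell
set -eu
# Throwaway HOME so this sketch leaves any real ~/.gitconfig alone;
# drop these two lines to apply the settings to your own account.
export HOME=$(mktemp -d)
unset XDG_CONFIG_HOME

git config --global user.email "a.b.cde@exeter.ac.uk"
git config --global user.name "A Cde"
git config --global push.default simple
git config --global credential.https://git.exeter.ac.uk/.username "abc123"
git config --global init.defaultBranch main

git config --global --list         # review what was written
```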

Starting out with git

Clone remote repository:
git clone <ADDRESS> (USE SSH CLONING, button on most repository websites, near top right hand side of webpage)

Or initialise new local repository:
git init

In the new local repository directory, run emacs .git/config. Ensure that it contains the following lines:

 [remote "origin"]
         url = git@git.exeter.ac.uk:hepplestone/<CODENAME>.git
         fetch = +refs/heads/*:refs/remotes/origin/*

The [remote] and url lines are the important ones (the fetch line can likely be ignored as it should be handled by default when you first set an upstream/remote repository). Ensure that the url line references a git ssh address, not an http web address. If it references an ssh address, all pushes and pulls are performed via ssh, and not via http access (which is just more annoying and cumbersome).
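The same [remote "origin"] entry can be written without opening an editor. A minimal sketch, assuming a hypothetical project name example-code in place of <CODENAME>; nothing here contacts the server, it only edits .git/config.

```shell
set -eu
work=$(mktemp -d)
git init -q --initial-branch=main "$work/example-code"
cd "$work/example-code"

# Register the GitLab project as the remote called origin (ssh address)
git remote add origin git@git.exeter.ac.uk:hepplestone/example-code.git
git remote -v                          # lists the fetch and push URLs

# If a remote was first added with an http address, point it at ssh instead
git remote set-url origin git@git.exeter.ac.uk:hepplestone/example-code.git
git config --get remote.origin.fetch   # the fetch line is filled in automatically
```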

You can either set up a repository on the GitLab website, or push it up from an existing/new one created on your local system. The former option is detailed well by the GitLab webpage when you go to create a new repository. I will detail the latter here (pushing up an existing local repository using the command-line). In terminal, cd to your local repository and run the following command:

  git push --set-upstream git@git.exeter.ac.uk:hepplestone/$(git rev-parse --show-toplevel | xargs basename).git $(git rev-parse --abbrev-ref HEAD)

Running this in your local repository will take the name of that directory and use it as the name for the remote repository. Note, you need to have used git init and have made a commit before the local repository can be pushed remotely.
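To see what the two substitutions in that command expand to before pushing anything, a sketch like the following may help. The directory name example-code and the identity values are placeholders; the final line only prints the command rather than running it.

```shell
set -eu
# Placeholder identity so the sketch runs without a ~/.gitconfig (hypothetical values)
export GIT_AUTHOR_NAME="A Cde" GIT_AUTHOR_EMAIL="a.b.cde@exeter.ac.uk"
export GIT_COMMITTER_NAME="A Cde" GIT_COMMITTER_EMAIL="a.b.cde@exeter.ac.uk"

work=$(mktemp -d)
git init -q --initial-branch=main "$work/example-code"
cd "$work/example-code"
git commit -q --allow-empty -m "Add initial commit"   # a commit must exist first

# The two substitutions used in the push command
project=$(git rev-parse --show-toplevel | xargs basename)   # directory name
branch=$(git rev-parse --abbrev-ref HEAD)                   # current branch

echo "git push --set-upstream git@git.exeter.ac.uk:hepplestone/${project}.git ${branch}"
```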

Quick how-to git guide

This should provide one with a quick guide on the basics of using git to track changes in your local git repository and how to keep those changes synced with a remote repository.

Firstly, refer to the previous section, #Starting out with git, to learn how to initialise a local repository.

Next, when coding, remember to regularly stage (git add) and commit (git commit -m "<MESSAGE>") changes made to your working tree. This will allow you to track all slight changes made to a code to determine when things go wrong, where fixes were implemented, why changes were made, etc.

We shall start by discussing why one would use git. If you are working on a code, it is fantastic to be able to keep track of all changes made to it and easily refer back (or revert the code) to an earlier, potentially working version. When developing code, it is common to want to implement changes (such as to improve speed and/or accuracy, to fix bugs, or to implement new features). This should never be performed directly on the main (default) branch; it should always be done within a separate branch, named after the intention of the branch. Once the intended changes have been implemented and tested for parity, they can then be merged back into the main branch without breaking things. This also allows code to be developed by multiple people in unison.

To set up a branch in which to develop features/fixes, follow these steps:

  1. Open a terminal.
  2. cd to the local repository.
  3. Run git branch <NAME>, where <NAME> is the new branch name.
  4. Run git switch <NAME> to switch to branch <NAME>.
  5. Run git branch to list branches and confirm visually that you are in branch <NAME>.
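The steps above can be sketched as follows in a scratch repository. The branch names fix-units and add-logging are hypothetical, as are the identity values.

```shell
set -eu
# Placeholder identity so the sketch runs without a ~/.gitconfig (hypothetical values)
export GIT_AUTHOR_NAME="A Cde" GIT_AUTHOR_EMAIL="a.b.cde@exeter.ac.uk"
export GIT_COMMITTER_NAME="A Cde" GIT_COMMITTER_EMAIL="a.b.cde@exeter.ac.uk"

work=$(mktemp -d)
git init -q --initial-branch=main "$work/example-code"
cd "$work/example-code"
git commit -q --allow-empty -m "Add initial commit"

git branch fix-units          # create the branch
git switch fix-units          # move onto it
git branch                    # the current branch is marked with *

# Equivalent one-step form: create a branch and switch to it together
git switch -c add-logging
```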

Now that you are in the new branch, you can start coding here. Do a little bit of coding (not too much! Remember, small commits!) and return here. Once you have made some changes to your work tree, follow these steps to track and observe them:

  1. Open a terminal in the local repository.
  2. Run git status. This will list all files that have been staged for commit (in green), all tracked files with unstaged changes (red), and all untracked files (also red).
  3. Run git diff <FILE> to view a comparison of the unstaged changes and the previous commit for <FILE>.
  4. If there are files that you want to keep locally, but do not want tracked by git version tracking, simply add them (one on each line) to the .gitignore file in the root directory of the local repository (wildcards etc. can be used to catch multiple files).
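A small sketch of these inspection steps, using hypothetical file names (params.txt, run.log) in a scratch repository:

```shell
set -eu
# Placeholder identity so the sketch runs without a ~/.gitconfig (hypothetical values)
export GIT_AUTHOR_NAME="A Cde" GIT_AUTHOR_EMAIL="a.b.cde@exeter.ac.uk"
export GIT_COMMITTER_NAME="A Cde" GIT_COMMITTER_EMAIL="a.b.cde@exeter.ac.uk"

work=$(mktemp -d)
git init -q --initial-branch=main "$work/example-code"
cd "$work/example-code"
printf 'alpha = 1\n' > params.txt
git add params.txt
git commit -q -m "Add parameter file"

printf 'alpha = 2\n' > params.txt     # edit a tracked file
printf 'run output\n' > run.log       # create a file we never want tracked

git status --short                    # params.txt modified, run.log untracked
git diff params.txt                   # line-by-line view of the unstaged edit

printf '*.log\n' >> .gitignore        # ignore every .log file
git status --short                    # run.log is no longer listed
```

The .gitignore file itself is normally staged and committed, so that everyone working on the repository ignores the same files.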

Now, let us commit these changes to keep track of them:

  1. Open a terminal in the local repository.
  2. Run git add <FILE> to stage the changes in <FILE> for a commit. Here, <FILE> is the file (filepath) to which you have made changes.
  3. Run git commit -m "<MESSAGE>" to commit the staged changes and have them now tracked by the git version tracking. This allows changes to be referred back to at any point in the future now. WARNING! Refer to the #Commit message convention for how to properly format commit messages.
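For illustration, two small commits in a scratch repository might look like this (the file name and messages are made up for the example):

```shell
set -eu
# Placeholder identity so the sketch runs without a ~/.gitconfig (hypothetical values)
export GIT_AUTHOR_NAME="A Cde" GIT_AUTHOR_EMAIL="a.b.cde@exeter.ac.uk"
export GIT_COMMITTER_NAME="A Cde" GIT_COMMITTER_EMAIL="a.b.cde@exeter.ac.uk"

work=$(mktemp -d)
git init -q --initial-branch=main "$work/example-code"
cd "$work/example-code"

printf 'alpha = 1\n' > params.txt
git add params.txt                     # stage the new file
git commit -m "Add parameter file"     # record it in the history

printf 'beta = 5\n' >> params.txt
git add params.txt                     # stage the follow-up edit
git commit -m "Add beta parameter"     # one small change per commit

git log --oneline                      # newest commit first
```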

Once you have made some changes and have committed them all, it is best to push these to the remote repository so that you can access them remotely (from different devices), or so that others can access them. Follow these steps to do so:

  1. Open a terminal in the local repository.
  2. Run git push to push all of the changes in branch <NAME> to the remote repository (assuming you have set a remote repository).
    1. If this is a new branch, an error will come up and the push will fail. The error will tell you how to resolve this (an upstream needs to be set so that the command recognises where to send these commits).
    2. Simply enter that command (i.e. git push --set-upstream origin <NAME>) to set an upstream and push the commits to remote branch <NAME>.
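The push steps can be tried without touching GitLab. In this sketch a local bare repository stands in for the remote (an assumption made purely so the example runs offline); the branch name fix-units is hypothetical.

```shell
set -eu
# Placeholder identity so the sketch runs without a ~/.gitconfig (hypothetical values)
export GIT_AUTHOR_NAME="A Cde" GIT_AUTHOR_EMAIL="a.b.cde@exeter.ac.uk"
export GIT_COMMITTER_NAME="A Cde" GIT_COMMITTER_EMAIL="a.b.cde@exeter.ac.uk"

work=$(mktemp -d)
git init -q --bare --initial-branch=main "$work/remote.git"   # stand-in for GitLab

git init -q --initial-branch=main "$work/example-code"
cd "$work/example-code"
git remote add origin "$work/remote.git"
git commit -q --allow-empty -m "Add initial commit"
git push -q --set-upstream origin main

git switch -q -c fix-units
git commit -q --allow-empty -m "Fix units"

# First push of a new branch: name the remote and the branch once
git push --set-upstream origin fix-units

# Later pushes from this branch need no arguments
git commit -q --allow-empty -m "Test units"
git push
```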

If there are changes made on the remote branch that you are working on that you want to pull down (and you have made no other changes to your local branch), then follow these steps:

  1. Open a terminal in the local repository.
  2. Run git pull to pull all of the changes in the remote branch <NAME> down to your local repository (assuming you have set a remote repository).
    1. If this is a new local branch, an error will come up and the pull will fail. The error will tell you how to resolve this (an upstream needs to be set so that the command recognises where to pull these commits from).
    2. Simply enter that command (i.e. git pull --set-upstream origin <NAME>) to set an upstream and pull the commits from the remote branch <NAME>.
  3. NOTE: You may need to use git fetch to get information of remote branches that are not currently tracked locally.
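A sketch of pulling a branch down onto a second device. As an assumption for running offline, a local bare repository stands in for GitLab, and the directories laptop and desktop stand in for two machines.

```shell
set -eu
# Placeholder identity so the sketch runs without a ~/.gitconfig (hypothetical values)
export GIT_AUTHOR_NAME="A Cde" GIT_AUTHOR_EMAIL="a.b.cde@exeter.ac.uk"
export GIT_COMMITTER_NAME="A Cde" GIT_COMMITTER_EMAIL="a.b.cde@exeter.ac.uk"

work=$(mktemp -d)
git init -q --bare --initial-branch=main "$work/remote.git"   # stand-in for GitLab

# First working copy publishes a branch
git init -q --initial-branch=main "$work/laptop"
cd "$work/laptop"
git remote add origin "$work/remote.git"
git commit -q --allow-empty -m "Add initial commit"
git push -q --set-upstream origin main
git switch -q -c fix-units
git commit -q --allow-empty -m "Fix units"
git push -q --set-upstream origin fix-units

# Second working copy, e.g. on another device
git clone -q "$work/remote.git" "$work/desktop"
cd "$work/desktop"
git fetch                      # learn about remote branches
git switch fix-units           # creates a local branch tracking origin/fix-units

# New work appears on the remote...
git -C "$work/laptop" commit -q --allow-empty -m "Test units"
git -C "$work/laptop" push -q
# ...and is pulled down here
git pull
```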


Commit message convention

Always use the imperative mood, as if you are instructing the code to change, i.e. "Add parameter", "Test code", "Fix typos". Be clear and concise!

Please refer to one of the following links and stick to a particular convention:


Merging branches back into main

Once a branch has achieved its purpose, and has been tested for parity of results with the main branch, it is time to merge this branch back into main. So far, I have done this mostly with the GitLab website, as it is much better at tracking and handling these things (a GUI is just much better for this task, it seems). For now, it is best to be aware that this process exists and is something that should be performed when the time is right.

Whenever pushing commits to the remote repository from a non-default branch, a message will appear in terminal along the lines of:

remote:
remote: To create a merge request for <NAME>, visit:
remote:   https://git.exeter.ac.uk/hepplestone/<CODENAME>/-/merge_requests/...
remote:

This tells you where to go to handle merge requests. To conduct one properly, you will need to look through and agree to all changes between the branch and main, and assign reviewers; comments and discussions can be had regarding the changes. Once the merge is done, the branch is usually set to be deleted from the remote repository. Once the merge request has been completed, return to your local repository and follow these steps to keep it up to date with the remote one:

  1. Open a terminal in the local repository.
  2. Run git fetch to fetch information relating to branch changes
  3. Run git switch main to switch to the main branch
  4. Run git pull to update your main branch to the remote one.
  5. Run git branch -d <NAME> to delete your local copy of the branch <NAME>.
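The clean-up steps above can be rehearsed offline. In this sketch a local bare repository stands in for GitLab and a second clone (reviewer) imitates the accepted merge request; both are assumptions for illustration. The --prune option is an addition to the steps above: it also removes the local record of the deleted remote branch.

```shell
set -eu
# Placeholder identity so the sketch runs without a ~/.gitconfig (hypothetical values)
export GIT_AUTHOR_NAME="A Cde" GIT_AUTHOR_EMAIL="a.b.cde@exeter.ac.uk"
export GIT_COMMITTER_NAME="A Cde" GIT_COMMITTER_EMAIL="a.b.cde@exeter.ac.uk"

work=$(mktemp -d)
git init -q --bare --initial-branch=main "$work/remote.git"   # stand-in for GitLab
git init -q --initial-branch=main "$work/example-code"
cd "$work/example-code"
git remote add origin "$work/remote.git"
git commit -q --allow-empty -m "Add initial commit"
git push -q --set-upstream origin main
git switch -q -c fix-units
git commit -q --allow-empty -m "Fix units"
git push -q --set-upstream origin fix-units

# Imitate an accepted merge request: merge remotely, then delete the branch
git clone -q "$work/remote.git" "$work/reviewer"
git -C "$work/reviewer" merge -q --no-ff -m "Merge branch 'fix-units' into 'main'" origin/fix-units
git -C "$work/reviewer" push -q origin main
git -C "$work/reviewer" push -q origin --delete fix-units

# Back in the local repository: the steps from the list above
git fetch --prune             # fetch, and forget the deleted remote branch
git switch main
git pull
git branch -d fix-units
```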

Note, merge requests and branches do not have to come directly from the main branch. You can make branches of branches and merge these branches back into other branches. It is simply discussed here in the simple case as it is assumed that anything more complex will be performed by those who understand the process and do not need a guide.

Finally, one will, at some point, need to merge development histories that have diverged, and will encounter conflicts that need to be resolved. Ned has had little experience of this so far (other than with himself), as no one else has coded with him, so has little useful information to provide here. However, it seems relatively straightforward: git merge is the command that handles this, and developers then go through each conflicting section of each affected file and choose (or write) the version to keep. Changes that do not affect each other are not conflicts and, hence, are not presented to the developers, as there is nothing to resolve there. When going through this, just talk to him and all parties can puzzle it out together.
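To get a feel for a conflict before meeting one in earnest, the following sketch deliberately creates one in a scratch repository and resolves it. The file, branch names and chosen resolution are all made up for the example; it is one simple way through, not a recommended group procedure.

```shell
set -eu
# Placeholder identity so the sketch runs without a ~/.gitconfig (hypothetical values)
export GIT_AUTHOR_NAME="A Cde" GIT_AUTHOR_EMAIL="a.b.cde@exeter.ac.uk"
export GIT_COMMITTER_NAME="A Cde" GIT_COMMITTER_EMAIL="a.b.cde@exeter.ac.uk"

work=$(mktemp -d)
git init -q --initial-branch=main "$work/example-code"
cd "$work/example-code"
printf 'speed = 1\n' > params.txt
git add params.txt
git commit -q -m "Add parameter file"

git switch -q -c tune-speed
printf 'speed = 2\n' > params.txt
git commit -q -am "Increase speed"

git switch -q main
printf 'speed = 3\n' > params.txt
git commit -q -am "Adjust speed"

# Both branches changed the same line, so the merge stops with a conflict
git merge tune-speed || echo "merge stopped: conflicts to resolve"
git diff --name-only --diff-filter=U      # files still in conflict
cat params.txt                            # shows the <<<<<<< ======= >>>>>>> markers

# Resolve by editing the file to the agreed content, then stage and commit
printf 'speed = 2\n' > params.txt
git add params.txt
git commit -q -m "Merge branch 'tune-speed' into main"
```

If a merge goes wrong part-way through, git merge --abort returns the repository to its state before the merge started.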

Git commands

This section provides a brief description of the key git commands (and their extensions) that have been found useful so far. NOTE: git merge has not been detailed here, it is often best to handle this on the GitLab website instead, and is a more difficult thing to handle, but should be learned at some point.

Help

The following are git commands that provide help and guidance for how to use git in terminal.
git = provide a brief help.
git <COMMAND> -h = provide a brief help with <COMMAND> (e.g. git commit -h).
git <COMMAND> --help = provide an in-depth help with <COMMAND>, like the man command.

Initial git commands

The following are git commands that are only used when initialising/setting up a git repository.
git clone <ADDRESS> = clone a remote repository into a new directory
git init = create an empty git repository, or reinitialise an existing one, in the current directory (or in <DIR> if run as git init <DIR>).

Version history

The following are git commands that allow one to stage and commit code changes. As convention, it is good practice to regularly perform commits, to keep track of changes. Each set of changes should be its own commit, e.g. do not stage bug fixes and a new procedure in the same commit.
git add . = stage all changes in the current directory (recursively) for a commit.
git add <FILE> = stage changes in <FILE> for a commit.
git mv <OLD> <NEW> = move or rename a file and stage the change. WARNING! Prefer git mv to the plain mv command in a git repository; after a plain mv, the change is not staged until you git add the new path and git rm the old one.
git rm <FILE> = delete <FILE> and stage the deletion. WARNING! Prefer git rm to the plain rm command in a git repository; after a plain rm, the deletion is not staged until you also run git rm <FILE>.
git commit -m "<MESSAGE>" = commit staged changes to the local repository history. <MESSAGE> is the commit message. WARNING! Refer to the #Commit message convention for how to properly format commit messages.
git restore <FILE> = discard unstaged changes in <FILE>, i.e. revert the file to its last staged or committed state. The discarded changes cannot be recovered.
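A short sketch of git mv, git rm and git restore in a scratch repository, with hypothetical file names:

```shell
set -eu
# Placeholder identity so the sketch runs without a ~/.gitconfig (hypothetical values)
export GIT_AUTHOR_NAME="A Cde" GIT_AUTHOR_EMAIL="a.b.cde@exeter.ac.uk"
export GIT_COMMITTER_NAME="A Cde" GIT_COMMITTER_EMAIL="a.b.cde@exeter.ac.uk"

work=$(mktemp -d)
git init -q --initial-branch=main "$work/example-code"
cd "$work/example-code"
printf 'alpha = 1\n' > params.txt
printf 'old notes\n' > notes.txt
git add params.txt notes.txt
git commit -q -m "Add parameters and notes"

git mv params.txt settings.txt        # rename and stage in one step
git rm -q notes.txt                   # delete and stage in one step
git status --short                    # both changes are already staged
git commit -q -m "Rename parameter file and remove notes"

printf 'alpha = 99\n' > settings.txt  # an edit we decide to abandon
git restore settings.txt              # discard it (cannot be undone)
cat settings.txt                      # prints: alpha = 1
```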

Recent changes to code

The following are git commands that provide details of recent changes between the current tree and prior commits.
git status = show all unstaged and staged changes to working tree.
git diff = show unstaged changes to working tree.
git diff <FILE> = show unstaged changes of <FILE>.

Connecting with remote repository

The following are git commands that provide means of connecting with a remote repository to share and distribute changes.
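git push = push local commits on the current branch to the remote repository. WARNING! The first push of a new branch will fail, but the error tells you exactly what to do (set an upstream).
git pull = fetch remote commits and merge them into the current local branch.
git fetch = download remote commits and update the remote-tracking branches (e.g. origin/main) without changing your local branches or working tree. git pull is effectively git fetch followed by a merge.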
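The difference between git fetch and git pull can be seen in a sketch like this one. As an assumption for running offline, a local bare repository stands in for GitLab and the directories laptop and desktop stand in for two devices.

```shell
set -eu
# Placeholder identity so the sketch runs without a ~/.gitconfig (hypothetical values)
export GIT_AUTHOR_NAME="A Cde" GIT_AUTHOR_EMAIL="a.b.cde@exeter.ac.uk"
export GIT_COMMITTER_NAME="A Cde" GIT_COMMITTER_EMAIL="a.b.cde@exeter.ac.uk"

work=$(mktemp -d)
git init -q --bare --initial-branch=main "$work/remote.git"   # stand-in for GitLab
git init -q --initial-branch=main "$work/laptop"
cd "$work/laptop"
git remote add origin "$work/remote.git"
git commit -q --allow-empty -m "Add initial commit"
git push -q --set-upstream origin main
git clone -q "$work/remote.git" "$work/desktop"

# A new commit reaches the remote from the laptop
git commit -q --allow-empty -m "Add feature"
git push -q

cd "$work/desktop"
git fetch -q                                        # download only
behind=$(git rev-list --count main..origin/main)    # commits that main lacks
echo "main is behind origin/main by ${behind} commit(s)"
git log --oneline main..origin/main                 # inspect them before merging
git pull -q                                         # fetch + merge: main catches up
```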

Managing worktree branches

The following are git commands that provide more elaborate, but vital, capabilities to work tree handling and roaming.
git tag = list known tags.
git tag <TAG> = create tag <TAG> as a reference to the current version of the code (for easy future referencing back to, such as a 1.0.0 release).
git branch = list local branches.
git branch <BRANCH> = create a new branch from current called <BRANCH>.
git switch <BRANCH> = switch to existing branch <BRANCH>.
git branch -d <BRANCH> = delete local branch <BRANCH>.
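git checkout <BRANCH> = switch to existing branch <BRANCH> (the older equivalent of git switch). Use git checkout -b <BRANCH> to create a new branch and move to it.
git checkout tags/<TAG> = change files to the version found in tag <TAG>, for accessing older versions of the code. This leaves you in a "detached HEAD" state; run git switch <BRANCH> to return to a branch.
git checkout -B <BRANCH> = create branch <BRANCH> at the current commit (or reset it to the current commit if it already exists) and switch to it, carrying any uncommitted changes along. Useful if you start making a new feature and realise it should have been a separate branch before you've done too much work on it.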
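A sketch of tagging a version, looking back at it, and moving uncommitted work onto a new branch. The file, tag and branch names are hypothetical.

```shell
set -eu
# Placeholder identity so the sketch runs without a ~/.gitconfig (hypothetical values)
export GIT_AUTHOR_NAME="A Cde" GIT_AUTHOR_EMAIL="a.b.cde@exeter.ac.uk"
export GIT_COMMITTER_NAME="A Cde" GIT_COMMITTER_EMAIL="a.b.cde@exeter.ac.uk"

work=$(mktemp -d)
git init -q --initial-branch=main "$work/example-code"
cd "$work/example-code"
printf 'version 1\n' > code.txt
git add code.txt
git commit -q -m "Add first version"
git tag 1.0.0                          # label this commit

printf 'version 2\n' > code.txt
git commit -q -am "Update to second version"

git checkout -q tags/1.0.0             # view the tagged files (detached HEAD)
cat code.txt                           # prints: version 1
git switch -q main                     # return to the branch

# Started a feature on main by mistake: carry the uncommitted edit to a new branch
printf 'experimental\n' >> code.txt
git checkout -q -B try-idea
git status --short                     # code.txt still shows as modified
```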