Basics of GitHub

 3 years ago
source link: https://towardsdatascience.com/must-know-tools-for-data-scientists-114d0b52b0a9

Must know tools for Data Scientists

Photo by Richy Great on Unsplash

Ever felt frustrated because you could not recover a small code snippet that was deleted accidentally? Ever felt stuck because you could not reuse an older iteration of your classification model, the one that gave the best accuracy score? Are you still following old-school version control approaches (remember V 0.1, V 0.2, V 1.0…)?

If the answer to any of the above questions is yes, then this tutorial is for you.

Assumption

This tutorial assumes that you already have a GitHub account and that the Git Bash application is installed on your system (a Windows system is assumed). If not, there are plenty of tutorials that can help you with that. The Git Bash screen looks something like this:

Git Bash Screen (Image by Author)

Taking off with Git

Git is a free and open-source version control system that tracks changes to source code (or to any other file you add to it) locally.

To promote the concept of collaborative development, companies like GitHub (a Microsoft subsidiary) have built a cloud-based platform (GitHub platform) on top of Git. Other than supporting version control (standard Git feature), these platforms enable additional features like wikis, bug tracking, task management, etc.

Defining Keywords

Before learning to use GitHub, let’s understand some common terms that you will encounter throughout this tutorial:

  • Repository — In layman terms, this is analogous to a project folder that contains all your project files. Standard practice is to have one repository per project.
  • Branch — A branch is an independent line of development within a repository. Developers often use separate branches to maintain different modules of a project, or to let several team members work on the same piece of code, each on their own branch. At the time this tutorial was written, every newly created repository had a central branch named “master”; repositories created on GitHub since October 2020 name it “main” by default, so substitute that name in the commands that follow if it applies to you.
  • Clone — Cloning is like copying and pasting the repository from one location (the developer’s folder on GitHub) to another (our local folder).
  • Stage & Commit — Creating a new project version in your Git repository is a two-step process. The first step is to collect all the files that should be part of the new version. This is called staging the files. The second step is to create the new version of your project, which is called committing. Only staged files can be committed to a new version.
  • Push & Pull — Given our focus on GitHub, push and pull are about interacting with repositories stored on GitHub’s cloud. A pull is like downloading the latest version, and a push is like uploading your latest version to GitHub.

GitHub Activities When Working Alone

This scenario applies when you are working alone on your repositories, for purposes such as storing your code, files, and projects. Your repository has no authorized collaborators, and you are not an authorized collaborator on someone else’s repository.

a.) Creating your own Repository

Creating a repository is the first thing you will do when working with GitHub. The process is simple and is demonstrated below:

  • Login — Log in to your GitHub account and click New at the top left of the screen.
Home Page — New Button on Top Left (Image by Author)
  • Details — Fill in the short form and click Create repository (sample screenshot for your reference). That’s it, your repository is created. As defined earlier, think of it as a project folder in which you can keep multiple files.
Repository Form (Image by Author)

b.) Cloning a cloud repository to your local system

Cloning downloads the content of your cloud (GitHub) repository into your system folder. Using this process, you can download the content not only from your GitHub repository but from any public repository created by other developers. This is where we will start using Git Bash:

  • Clone Link — Search for the repository you want to clone and copy its cloning link.
Link to clone the repository (Image by Author)
  • Windows Folder Creation — On your Windows drive, create the folder into which you want the repository files to be cloned. Open Git Bash and navigate to that folder using the “cd” command.
Change Folder Location (Image by Author)

The command “cd” is an abbreviation for change directory. Followed by a folder location, it tells the console to change its working location from the current directory to that folder. Followed by a double period (..), it moves the console up one level, to the parent folder in the folder hierarchy.
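As a small illustration of both forms of “cd”, here is a sketch that can be pasted into Git Bash. The folder name github_projects is a hypothetical placeholder, and the folder is created under a temporary location so that nothing on your drive is touched.

```shell
# Hypothetical folder name; in practice use the folder you created for your clones.
workdir="$(mktemp -d)/github_projects"
mkdir -p "$workdir"

cd "$workdir"   # move into the folder
pwd             # print the current working location

cd ..           # step up to the parent folder
pwd
```

The “pwd” command is not part of the original walkthrough; it is used here only to confirm where the console currently is.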

  • Cloning — Once at the folder location, use the “git clone” command to clone the repository.
#### Command
git clone clone_link
Cloning GitHub Repository (Image by Author)

The clone link in the above command is the link we copied in step 1. This command creates a new folder (with the same name as the GitHub repository) in your folder location. This new folder contains all the resources of the cloud repository we have cloned. Two important points to note here:

  • The process explained above checks out only the default branch of the repository (“master” in this tutorial) into your folder. Other branches are downloaded too, but only as remote-tracking references. We gave a brief introduction to branches in the definitions section; more details follow in chapter 2.
  • The clone link used for cloning is saved in your local repository as a remote with the default name “origin”.

Both points matter later, when we push the latest version to, or pull it from, the GitHub repository.
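To see the “origin” remote for yourself, the following sketch clones a repository and then lists its remotes. It makes one loud assumption: instead of a real GitHub link, it uses a bare repository created on the local disk as a stand-in, so that the example runs without a network connection or an account. The names sandbox and demo-repo are hypothetical.

```shell
sandbox="$(mktemp -d)"

# Stand-in for the GitHub repository: a bare repository on the local disk.
git init -q --bare "$sandbox/demo-repo.git"

cd "$sandbox"
clone_link="$sandbox/demo-repo.git"   # on GitHub this would be the link you copied
git clone -q "$clone_link"            # Git may warn that the repository is empty; expected here

cd demo-repo      # the new folder is named after the repository
git remote -v     # lists the remote named "origin" and the link it points to
```

With a real project, only the value of clone_link changes; the “git remote -v” check works the same way.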

c.) Creating Versions (add and commit)

Once cloned, a copy of the cloud repository is available for us to modify. To create versions at every checkpoint, we will take the following steps:

  • Staging — Once you have modified the file or files to your satisfaction (or created a new one), add them to the staging area.
#### Command
git add file_name
Staging (Image by Author)
  • Status Check — To check if the file is added successfully to the staging area, execute the following command
#### Command
git status
Checking Status (Image by Author)

The “git status” command lists all the files you have modified or created in your local repo. Files that have been added to staging are shown in green, whereas files not yet added to staging are shown in red.
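The difference between staged and unstaged files can be tried out in a throwaway repository. The sketch below assumes two hypothetical files, model.py and notes.txt, and uses the “--short” option of “git status”, which replaces the colors with compact status codes that are easier to read in plain text.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Two hypothetical project files.
echo "print('model v1')" > model.py
echo "scratch work" > notes.txt

git add model.py       # stage only model.py
git status --short     # "A" marks a newly staged file, "??" marks an untracked one
```

Here model.py is staged and would be part of the next commit, while notes.txt stays out of it until it is added as well.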

  • Commit — Once you are sure that the files you want to version are in the staging area, commit them by executing the following command.
#### Command
git commit -m "message"
Commit (Image by Author)

Please note the command-line option “-m” followed by “message”. The message is a free-text comment explaining the changes made in the committed version.

That’s it: a new version of your file is now saved in the Git repository (but only on your local system).
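Putting staging and committing together, the sketch below records two versions of a hypothetical results.txt file in a throwaway repository and then lists them. The user name and email are placeholder values set for this repository only, because Git refuses to commit without an identity; “git log --oneline” is an addition to the walkthrough, used here to show the saved versions.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Placeholder identity, stored in this throwaway repository only.
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "accuracy: 0.91" > results.txt          # hypothetical file, first version
git add results.txt
git commit -q -m "Add baseline results"

echo "accuracy: 0.94" > results.txt          # second version of the same file
git add results.txt
git commit -q -m "Improve accuracy after tuning"

git log --oneline    # one line per committed version, newest first
```

Each line of the log starts with a short identifier for that version, which is what makes it possible to return to an older iteration later.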

d.) Syncing the local repository with the cloud repository

In the previous step, we created a new version of the file by committing it to our local repository. In this step, we will push our local repository (with the updated file versions) to the cloud repository. The command to do that is as follows:

#### Command
git push origin master
Push To GitHub (Image by Author)

Decoding the syntax:

  • The push command instructs Git to upload the commits in the local repository to the cloud (GitHub).
  • As explained in the cloning step, “origin” is the name of the remote that stores the link to the GitHub repository that was cloned. When Git encounters the word origin, it knows the cloud location to which the local repository needs to be pushed.
  • The word “master” is the name of the branch to which the local commits will be pushed. When working with some other branch, replace master with that branch’s name.
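The push can be rehearsed end to end without touching GitHub. The sketch below is built on an assumption: a bare repository on the local disk plays the role of the cloud repository, so “origin” points to a folder rather than to a GitHub link. The names sandbox, remote.git and local are hypothetical, and the identity values are placeholders.

```shell
sandbox="$(mktemp -d)"
git init -q --bare "$sandbox/remote.git"               # stand-in for the GitHub repository
git clone -q "$sandbox/remote.git" "$sandbox/local"    # Git may warn that it is empty; expected here
cd "$sandbox/local"

# Placeholder identity, stored in this throwaway repository only.
git config user.name "Demo User"
git config user.email "demo@example.com"
git symbolic-ref HEAD refs/heads/master   # make sure the local branch is named master

echo "hello" > README.md
git add README.md
git commit -q -m "Add README"

git push -q origin master    # upload the commits on master to the remote named origin
git ls-remote origin         # lists the branches that now exist on the remote
```

The “git symbolic-ref” line is only there because some Git installations name the first branch differently; with a repository cloned from GitHub the branch name already matches the remote.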

e.) Downloading subsequent updates from the cloud repository

For first-time access to the cloud repository, we used cloning. Because the cloud repository may be accessible to the whole community, it can receive multiple updates (commits, in Git terminology) that your local clone does not yet have. To download the latest version from the cloud repository, use the following command.

#### Command
git pull origin master
Pull From GitHub (Image by Author)

Note that the command is the same as the push command, except that the word push is replaced with pull.
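To watch a pull bring in someone else’s update, the sketch below again assumes a local bare repository as a stand-in for GitHub, plus two clones of it: a hypothetical “other” copy that publishes changes, and a “mine” copy that falls behind and then catches up. All names and identity values are placeholders.

```shell
sandbox="$(mktemp -d)"
git init -q --bare "$sandbox/remote.git"                      # stand-in for GitHub
git -C "$sandbox/remote.git" symbolic-ref HEAD refs/heads/master

# Another copy of the repository publishes the first version.
git clone -q "$sandbox/remote.git" "$sandbox/other"
cd "$sandbox/other"
git config user.name "Other Copy"
git config user.email "other@example.com"
git symbolic-ref HEAD refs/heads/master
echo "v1" > data.txt && git add data.txt && git commit -q -m "v1"
git push -q origin master

# Your clone is taken now, so it contains v1 only.
git clone -q "$sandbox/remote.git" "$sandbox/mine"

# The other copy publishes a newer version afterwards.
echo "v2" > data.txt && git add data.txt && git commit -q -m "v2"
git push -q origin master

# Your clone catches up with the cloud repository.
cd "$sandbox/mine"
git pull -q origin master
cat data.txt
```

After the pull, data.txt in your clone holds the newer content that the other copy pushed.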

Closing note

Did you know that for many technical job roles, employers now expect you to be an active GitHub member with multiple repositories and contributions?

In our next chapter on GitHub, we will learn how to collaborate with the developer community using GitHub. In the meantime, equipped with the knowledge of this new tool, go ahead and start socializing your projects.

HAPPY LEARNING ! ! ! !
