Merge remote-tracking branch 'odu-laptop2/master'

master
Wirawan Purwanto 2 years ago
commit 9697526833
  1. 75
      cloud-storage/20221116.gdrive-client-id.md
  2. 124
      cloud-storage/20221116.rclone-config-demo.txt
  3. 243
      cloud-storage/20221117.gdrive-rclone-setup.md
  4. 28
      cloud-storage/20230123.gdrive-rclone-sync.md
  5. 22
      terminal/20221122.tmux-cheatsheet.md
  6. 22
      terminal/20221124.tmux-TODO.md

@ -0,0 +1,75 @@
Creating ODU's own Google Drive rclone client ID
================================================
> These are internal ITS notes.
* Date: 2022-11-16
* Executor: Wirawan Purwanto
Log of action for creating new client_id
----------------------------------------
(The links below assume that user ID 1, i.e. `authuser=1`, is the ODU user ID.)
Starting point:
https://console.cloud.google.com/projectselector2/apis/dashboard?authuser=1&supportedpurview=project
* Select a project => a popup dialog box
- Select from: [ODU.EDU]
- Click "NEW PROJECT"
- Created a new project named "ODU-RCS-rclone"
* Enable API & Services
- Enable "Google Drive API"
* Go to "Credentials" tab (on left sidebar)
https://console.cloud.google.com/apis/credentials?authuser=1&project=odu-rcs-rclone&supportedpurview=project
- Click the "Configure Consent Screen" button
- Select "Internal" user type
* App information (for the consent screen)
- App name: rclone for ODU research computing
- User support email: wpurwant@odu.edu
- App logo: ODU-logo-120px.png
App domain:
- Application home page: https://odu.edu/hpc
- Application privacy policy link: (blank)
- Application terms of service link: (blank)
Authorized domains:
- odu.edu
Developer contact info:
- Email: wpurwant@odu.edu
* (next page) Scopes
Added the following access:
Google Drive API .../auth/drive See, edit, create, and delete all of your Google Drive files
Google Drive API .../auth/drive.appdata See, create, and delete its own configuration data in your Google Drive
Google Drive API .../auth/drive.file See, edit, create, and delete only the specific Google Drive files you use with this app
Google Drive API .../auth/drive.metadata View and manage metadata of files in your Google Drive
Google Drive API .../auth/drive.metadata.readonly See information about your Google Drive files
Google Drive API .../auth/drive.photos.readonly View the photos, videos and albums in your Google Photos
Google Drive API .../auth/drive.readonly See and download all your Google Drive files
Google Drive API .../auth/drive.activity View and add to the activity record of files in your Google Drive
All of the scopes above are sensitive except drive.appdata and drive.file.
* Go back to the "Credentials" tab
* Now press "Create Credentials"
- Choose type: OAuth Client ID
- Application type: Desktop App
- Name: rclone client for ODU research computing
Obtained the following ID:
- Your Client ID: 605919805393-odnfmddo2v24ffodmg80j6ht4oi4kftn.apps.googleusercontent.com
- Your Client Secret: GOCSPX-###REDACTED###

@ -0,0 +1,124 @@
Demonstration of an "rclone config" session
-------------------------- BEGIN CONFIG ---------------------------
wpurwant@wahab-01:~$ rclone config
2022/11/17 03:49:36 NOTICE: Config file "/home/wpurwant/.config/rclone/rclone.conf" not found - using defaults
No remotes found - make a new one
n) New remote
s) Set configuration password
q) Quit config
name> wpurwant-gdrive
Type of storage to configure.
Enter a string value. Press Enter for the default ("").
Choose a number from below, or type in your own value
1 / 1Fichier
\ "fichier"
2 / Alias for an existing remote
\ "alias"
3 / Amazon Drive
\ "amazon cloud drive"
...
12 / Google Cloud Storage (this is not Google Drive)
\ "google cloud storage"
13 / Google Drive
\ "drive"
14 / Google Photos
\ "google photos"
...
Storage> drive
** See help for drive backend at: https://rclone.org/drive/ **
Google Application Client Id
Setting your own is recommended.
See https://rclone.org/drive/#making-your-own-client-id for how to create your own.
If you leave this blank, it will use an internal key which is low performance.
Enter a string value. Press Enter for the default ("").
client_id> 605919805393-odnfmddo2v24ffodmg80j6ht4oi4kftn.apps.googleusercontent.com
OAuth Client Secret
Leave blank normally.
Enter a string value. Press Enter for the default ("").
client_secret> ### SEND EMAIL TO ITSResearchAndCloudComputing@odu.edu for this value
Scope that rclone should use when requesting access from drive.
Enter a string value. Press Enter for the default ("").
Choose a number from below, or type in your own value
1 / Full access all files, excluding Application Data Folder.
\ "drive"
2 / Read-only access to file metadata and file contents.
\ "drive.readonly"
/ Access to files created by rclone only.
3 | These are visible in the drive website.
| File authorization is revoked when the user deauthorizes the app.
\ "drive.file"
/ Allows read and write access to the Application Data folder.
4 | This is not visible in the drive website.
\ "drive.appfolder"
/ Allows read-only access to file metadata but
5 | does not allow any access to read or download file content.
\ "drive.metadata.readonly"
scope> drive
ID of the root folder
Leave blank normally.
Fill in to access "Computers" folders (see docs), or for rclone to use
a non root folder as its starting point.
Enter a string value. Press Enter for the default ("").
root_folder_id>
Service Account Credentials JSON file path
Leave blank normally.
Needed only if you want use SA instead of interactive login.
Leading `~` will be expanded in the file name as will environment variables such as `${RCLONE_CONFIG_DIR}`.
Enter a string value. Press Enter for the default ("").
service_account_file>
Edit advanced config? (y/n)
y) Yes
n) No (default)
y/n> n
Remote config
Use auto config?
* Say Y if not sure
* Say N if you are working on a remote or headless machine
y) Yes (default)
n) No
y/n> y
If your browser doesn't open automatically go to the following link: http://127.0.0.1:53682/auth?state=nniLEG-###REDACTED###
Log in and authorize rclone for access
Waiting for code...
Got code
Configure this as a team drive?
y) Yes
n) No (default)
y/n> n
--------------------
[wpurwant-gdrive]
type = drive
client_id = 605919805393-odnfmddo2v24ffodmg80j6ht4oi4kftn.apps.googleusercontent.com
client_secret = GOCSPX-###REDACTED###
scope = drive
token = {"access_token":"###REDACTED###","token_type":"Bearer","refresh_token":"###REDACTED###","expiry":"2022-11-17T05:07:15.32276879Z"}
--------------------
y) Yes this is OK (default)
e) Edit this remote
d) Delete this remote
y/e/d> y
Current remotes:
Name Type
==== ====
wpurwant-gdrive drive
--------------------------- DONE CONFIG ---------------------------
(I was trying to use this off-the-node oauth2 permission thing, but it
was rejected by Google as an invalid way for validating the oauth2)
Please go to the following link: https://accounts.google.com/o/oauth2/auth?access_type=offline&...###LINK_REDACTED
Log in and authorize rclone for access
Enter verification code>

@ -0,0 +1,243 @@
Google Drive & RClone: Setting Up CLI Access to Google Drive Data
=================================================================
> This is the original draft (Nov 17, 2022).
> The published version is here:
> https://wiki.hpc.odu.edu/en/DataMgmt/cloud/grive-rclone-setup
Google Drive is a popular cloud storage platform for backing up and sharing files. This article provides step-by-step guidance for enabling access to your Drive and transferring data between the Drive and ODU HPC using the rclone command-line program. With rclone, you can automate data transfer and synchronization between the Drive and the cluster storage.
> This article assumes that you have installed rclone (or rclone is available) on your system. Refer to [rclone downloads page](https://rclone.org/downloads/) if you need to download and install rclone.
>
> On Wahab HPC, you will use `module load rclone` to make rclone available to your shell environment.
{.is-info}
> This guide can also be used to enable access to Google Drive from Linux, Mac, and Windows desktop.{.is-info}
Setting Up Access
-----------------
> Because of the web access involved somewhere in the steps, it is best that you use the [remote desktop](https://wiki.hpc.odu.edu/GettingStarted#connecting-via-rdp) or [virtual desktop](/open-ondemand/virtual-desktop).
The first step is to issue the `rclone config` command. It guides you through a series of questions, which are broken up and commented on below because of their length. First, we need to create a new **remote**, which is simply a user-defined name for a particular Google Drive storage area. In the following instructions, we will use `my-gdrive` as the name, but feel free to choose a name that best describes your data (it must not contain whitespace or begin with a dash [`-`]).
```
No remotes found - make a new one
n) New remote
s) Set configuration password
q) Quit config
n/s/q> n
name> my-gdrive
```
Rclone prompts for your response after the `>` character. Here, `n` and `my-gdrive` are the responses to the two questions. In the illustration above, no remote has been created yet, so there are only a few options. If you already have one or more remotes, you will see more options.
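The two naming rules mentioned above (no whitespace, no leading dash) can be checked before starting `rclone config`. The following is only a minimal sketch: `is_valid_remote_name` is a hypothetical helper, and it tests just those two rules plus the empty name; rclone itself may apply further restrictions.

```shell
# Hypothetical helper: checks only the two rules stated in this guide
# (no whitespace, no leading dash) and rejects an empty name.
is_valid_remote_name() {
    case $1 in
        '' | -* | *[[:space:]]*) return 1 ;;
        *) return 0 ;;
    esac
}

if is_valid_remote_name "my-gdrive"; then
    echo "name looks OK"
fi
```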
Next, we need to specify the storage type. Type in "drive" for Google Drive.
```
Type of storage to configure.
Enter a string value. Press Enter for the default ("").
Choose a number from below, or type in your own value
1 / 1Fichier
\ "fichier"
2 / Alias for an existing remote
\ "alias"
3 / Amazon Drive
\ "amazon cloud drive"
...
12 / Google Cloud Storage (this is not Google Drive)
\ "google cloud storage"
13 / Google Drive
\ "drive"
14 / Google Photos
\ "google photos"
...
Storage> drive
```
The following steps will ask for a "client ID". It is highly recommended that you use ODU's client ID so that your rclone sessions would perform better (i.e. faster):
```
Google Application Client Id
Setting your own is recommended.
See https://rclone.org/drive/#making-your-own-client-id for how to create your own.
If you leave this blank, it will use an internal key which is low performance.
Enter a string value. Press Enter for the default ("").
client_id> 605919805393-odnfmddo2v24ffodmg80j6ht4oi4kftn.apps.googleusercontent.com
OAuth Client Secret
Leave blank normally.
Enter a string value. Press Enter for the default ("").
client_secret> ### SEND EMAIL TO ITSResearchAndCloudComputing@odu.edu for this value
```
For security reasons, we do not publish the client secret. Please contact us via email to get the client secret value (it will begin with `GOCSPX`).
The next prompt asks what kind of access you want. In more than 99% of cases, you will want to use option one (`drive`), which gives you full read-write access to your data stored in the Drive (you can limit the write/modify access later when using `rclone`).
```
Scope that rclone should use when requesting access from drive.
Enter a string value. Press Enter for the default ("").
Choose a number from below, or type in your own value
1 / Full access all files, excluding Application Data Folder.
\ "drive"
2 / Read-only access to file metadata and file contents.
\ "drive.readonly"
/ Access to files created by rclone only.
3 | These are visible in the drive website.
| File authorization is revoked when the user deauthorizes the app.
\ "drive.file"
/ Allows read and write access to the Application Data folder.
4 | This is not visible in the drive website.
\ "drive.appfolder"
/ Allows read-only access to file metadata but
5 | does not allow any access to read or download file content.
\ "drive.metadata.readonly"
scope> drive
```
Root folder: Do you want to allow access to the entire Drive? Or just a specific subfolder in your Drive? This is where you can specify it. If you leave blank, you will use the root folder of the Drive.
```
ID of the root folder
Leave blank normally.
Fill in to access "Computers" folders (see docs), or for rclone to use
a non root folder as its starting point.
Enter a string value. Press Enter for the default ("").
root_folder_id>
```
> ### What is my folder ID?
> The Google folder ID is shown as a series of letters and digits in the URL of the corresponding folder from the web interface. You can use the "Get link" submenu (or button), which will return an URL like this:
>
> `https://drive.google.com/drive/folders/16hY6ZurF09Ax1GzsxqJDJNNxsv-P8ihe?usp=share_link`
>
> The `16hY6ZurF09Ax1GzsxqJDJNNxsv-P8ihe` string is the root folder ID.
> *(FYI this is a demo folder on Research Computing's Google Drive, it is safe but does not contain anything useful to you, most likely.)*
{.is-info}
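If you prefer not to pick the ID out of the link by hand, it can be extracted with plain shell parameter expansion. This is a sketch only: `folder_id_from_url` is a hypothetical helper name, and it assumes the link has the `.../folders/<ID>?...` shape shown above.

```shell
# Hypothetical helper: print the folder ID contained in a Drive folder link.
# Assumes the link contains "/folders/<ID>", optionally followed by "?..." .
folder_id_from_url() {
    url=$1
    id=${url##*/folders/}   # drop everything up to and including "/folders/"
    id=${id%%\?*}           # drop the query string, if any
    id=${id%%/*}            # drop any trailing path component
    printf '%s\n' "$id"
}

folder_id_from_url 'https://drive.google.com/drive/folders/16hY6ZurF09Ax1GzsxqJDJNNxsv-P8ihe?usp=share_link'
# prints: 16hY6ZurF09Ax1GzsxqJDJNNxsv-P8ihe
```

The printed value is what you would enter at the `root_folder_id>` prompt.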
The next prompt asks for a service account. Skip this by leaving it blank.
```
Service Account Credentials JSON file path
Leave blank normally.
Needed only if you want use SA instead of interactive login.
Leading `~` will be expanded in the file name as will environment variables such as `${RCLONE_CONFIG_DIR}`.
Enter a string value. Press Enter for the default ("").
service_account_file>
```
The next series of prompts are important. Use the auto-config to launch the web browser *on the same machine* to give rclone permission to access your data stored in the Drive storage. This will allow rclone to access your data *from this machine only*.
```
Edit advanced config? (y/n)
y) Yes
n) No (default)
y/n> n
Remote config
Use auto config?
* Say Y if not sure
* Say N if you are working on a remote or headless machine
y) Yes (default)
n) No
y/n> y
If your browser doesn't open automatically go to the following link: http://127.0.0.1:53682/auth?state=SOME_RANDOM_STRING
Log in and authorize rclone for access
Waiting for code...
```
You will need to authorize access from the browser. If you have not yet logged in to your ODU Google Drive account, log in now and then authorize the access request.
<!-- FIXME insert screenshots -->
At this time, on the browser you will see a prompt like this:
> **rclone for ODU research computing** wants access to your Google Account.
> ...
> This will allow **rclone for ODU research computing** to: See, edit, create, and delete all of your Google Drive files.
> Make sure you trust rclone for ODU research computing.
> Despite this scary-sounding warning, you need to allow access. This is what connects the `rclone` program to your data so that it can manipulate them. It is *your* invocation of the rclone program on the *remote* you specify that will "see, edit, create, and delete" the data on your Drive. You can always remove rclone's Drive access from your Google Account settings.{.is-info}
The next steps are finalization:
```
Got code
Configure this as a team drive?
y) Yes
n) No (default)
y/n> n
--------------------
[my-gdrive]
type = drive
client_id = 605919805393-odnfmddo2v24ffodmg80j6ht4oi4kftn.apps.googleusercontent.com
client_secret = GOCSPX*******
scope = drive
token = {"access_token":"###REDACTED###","token_type":"Bearer","refresh_token":"###REDACTED###","expiry":"2022-11-17T05:07:15.32276879Z"}
--------------------
y) Yes this is OK (default)
e) Edit this remote
d) Delete this remote
y/e/d> y
Current remotes:
Name Type
==== ====
my-gdrive drive
```
> If you want to access a shared Drive (sometimes called a team Drive) instead of your personal Drive storage, respond "y" to the question "Configure this as a team drive?". Do not confuse a shared Drive with a Drive location that somebody has shared with you personally.{.is-info}
Voila! Your Drive setup is good to go.
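The "Current remotes" table at the end of the session can also be checked non-interactively. The sketch below relies on `rclone listremotes`, which prints one remote per line followed by a colon; `has_remote` is a hypothetical helper name, and `my-gdrive` is the example remote used in this guide.

```shell
# Hypothetical helper: succeed if a remote with the given name is configured.
# "rclone listremotes" prints names as "name:", one per line.
has_remote() {
    rclone listremotes | grep -Fxq "$1:"
}

# Usage (uncomment on a system where rclone is available):
# has_remote my-gdrive && echo "my-gdrive is configured"
```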
Testing the Drive Access
------------------------
Let us now test whether this access works by listing the contents of the root folder. From the terminal, type (do not include the `$` shell prompt):
```
$ rclone ls --max-depth=1 my-gdrive:
```
If all is well, you should see a listing of the files in the root directory (folders are not listed).
Here is a (redacted) example listing from one of the staff members:
```
$ rclone ls --max-depth=1 wpurwant-gdrive:
-1 BLANK - Old Dominion University, Norfolk Maturity/Capabilities Model Assessment.xlsx
129915 Position Statements and Bios_2020.pdf
67430 NSF_RFI_Response_final.pdf
22627 DEAPSECURE 2.0 brainstorming
-1 DataUp response.docx
20318 DeapSECURE-module-3-MachineLearning
-1 Fabric Benchmarking 2017.docx
-1 ODU Training.docx
-1 ODU Zoom meetings.docx
-1 PEARC19 Champion Related Activities.docx
-1 Research Computing Strategy brainstorming doc.docx
-1 Restricted-data-computing-platforms-ODU-2022.d20220407.pptx
```
The first number on every row is the file size in bytes. A value of -1 indicates a native Google document (Docs, Sheets, Slides); all other files show their actual size.
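Because the size is always the first column, the listing is easy to post-process. The following sketch uses a few lines copied from the example listing as sample input; `native_docs` and `total_bytes` are hypothetical helper names, not rclone features.

```shell
# Sample input: a few rows copied from the example listing above.
listing='   129915 Position Statements and Bios_2020.pdf
       -1 DataUp response.docx
    67430 NSF_RFI_Response_final.pdf
       -1 ODU Training.docx'

# Hypothetical helper: print only the names of native Google documents (size -1).
native_docs() {
    awk '$1 == -1 { sub(/^[ \t]*-1[ \t]+/, ""); print }'
}

# Hypothetical helper: sum the sizes of all regular files (size >= 0).
total_bytes() {
    awk '$1 >= 0 { sum += $1 } END { print sum + 0 }'
}

printf '%s\n' "$listing" | native_docs
printf '%s\n' "$listing" | total_bytes
```

On a live system, the same filters could be fed from `rclone ls --max-depth=1 my-gdrive:` instead of the sample text.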
References
----------
* Official documentation:
https://rclone.org/drive/

@ -0,0 +1,28 @@
Google Drive & RClone: Backing Up, Syncing & Copying
====================================================
Rclone supports several modes of syncing:
- "sync"
- "copy"
What are the differences?
From rclone's website:
> "Sync the source to the destination, changing the destination
only. Doesn't transfer files that are identical on source and
destination, testing by size and modification time or
MD5SUM. Destination is updated to match source, including deleting
files if necessary (except duplicate objects, see below). If you
don't want to delete files from destination, use the copy command
instead." ([ref](https://rclone.org/commands/rclone_sync/))
Unless you want to perform deletion on the destination to match 100%
what's on the source, you will want to first look at the "copy"
operation.
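Since "sync" can delete files on the destination, it helps to preview either operation first with rclone's `--dry-run` flag, which reports what would be transferred or deleted without changing anything. The wrappers below are a sketch only: `preview_copy` and `preview_sync` are hypothetical names, and the paths in the usage lines are placeholders.

```shell
# Hypothetical wrappers: both only preview, because --dry-run is always passed.

# copy: adds or updates files on the destination, never deletes.
preview_copy() {
    rclone copy --dry-run "$1" "$2"
}

# sync: makes the destination match the source, deleting extra files.
preview_sync() {
    rclone sync --dry-run "$1" "$2"
}

# Usage (placeholder paths; uncomment where rclone is available):
# preview_copy "$HOME/results" my-gdrive:backup/results
# preview_sync "$HOME/results" my-gdrive:backup/results
```

Once the preview looks right, the same `rclone copy` or `rclone sync` command can be run without `--dry-run`.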
References:
https://www.carc.usc.edu/user-information/user-guides/data-management/transferring-files-rclone

@ -0,0 +1,22 @@
A Cheatsheet Intro to Tmux
==========================
"Tmux Tutorial"
https://leimao.github.io/blog/Tmux-Tutorial/
Key shortcut commands:
| Key      | Action |
|----------|--------|
| Ctrl+b c | Create a new shell (screen) |
| Ctrl+b d | Detach from the current tmux session |
| Ctrl+b 0 | Switch to screen 0 |
| Ctrl+b 1 | Switch to screen 1 |
| ...      | Switch to screen "n" |
| Ctrl+b 9 | Switch to screen 9 |
| Ctrl+b n | Switch to next screen (will wrap around from the last screen to screen 0) |
| Ctrl+b p | Switch to previous screen |

@ -0,0 +1,22 @@
TODO tutorial / guide for tmux
==============================
## [ ] Terminologies
* "session" : the entire session
* "screen" : a separate shell / CLI / TUI program spawned by tmux and whose display is accessible from tmux.
## [ ] Cheatsheet
See its own cheatsheet draft.
## [ ] Using tmux to have persistent terminal on HPC
### Draft:
The tmux process should be run on the login node instead of the compute node.
Spawn as many screens as needed from one or more screen(s) / shell(s) within the tmux session.
For heavier processes, launch `salloc` first to get a shell on a compute node, then run the heavy process there.
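
A possible starting point for this section, as a sketch: `tmux new-session -A -s NAME` attaches to the session NAME if it already exists and creates it otherwise, which gives a persistent terminal on the login node. The helper name `hpc_tmux` and the default session name `work` are made up for illustration.

```shell
# Hypothetical helper: attach to the named tmux session, creating it if needed.
# Run this on the login node; the session name defaults to "work".
hpc_tmux() {
    tmux new-session -A -s "${1:-work}"
}

# Usage (on the login node):
# hpc_tmux            # attach to or create the session "work"
# hpc_tmux analysis   # attach to or create the session "analysis"
# Inside a tmux screen, run "salloc" first before starting a heavy process.
```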