This is a guide on how to try the data transfer functionality in Reva in your local environment using rclone as the data transfer driver.
A data transfer is initiated through an OCM share by setting the protocol
to type datatx
.
Use rclone version v1.61 or higher. Available at https://rclone.org/.
The rclone server should be run with the server-side-across-configs
flag set to true
which will make HTTP Third Party Copy (TPC) transfers possible:
rclone -vv rcd --server-side-across-configs=true --rc-user=rclone --rc-pass=rclonesecret --rc-addr=0.0.0.0:5572
TPC allows for direct (ie. efficient) Reva to Reva transfers as opposed to streaming the data through rclone
Follow the setup (prerequisites, building, running) of the OCM share tutorial.
Use the data transfer example config for the relevant settings to enable rclone driven data transfer.
At this point you should have a two Reva daemon setup between which we will establish a data transfer driven by rclone.
(assume we are logged in as einstein on the first Reva instance and we have uploaded some data to the /home/my-data
folder)
The tutorial explains transfer between user einstein at cern and user marie at cesnet.
Creating a transfer is similar to creating a regular OCM share through the ocm-share-create
command with the addition of the -datatx
flag. The -datatx
flag signifies that this is a data transfer.
The ocm-share-create
command makes (see example below), via an OCM share, the contents of folder /home/my-data
available for transferring to the grantee.
*Note that only a folder can be transferred!
>> ocm-share-create -grantee f7fbf8c8-139b-4376-b307-cf0a8c2d0d9c -idp cesnet.cz -transfer /home/my-data
+--------------------------------------+-----------------+--------------------------------------+--------------------------------------------------------------------------------------------+-------------------+-------------+--------------------------------------+--------------------------------+--------------------------------+
| # | OWNER.IDP | OWNER.OPAQUEID | RESOURCEID | TYPE | GRANTEE.IDP | GRANTEE.OPAQUEID | CREATED | UPDATED |
+--------------------------------------+-----------------+--------------------------------------+--------------------------------------------------------------------------------------------+-------------------+-------------+--------------------------------------+--------------------------------+--------------------------------+
| edc8f1c3-5f12-4430-8680-95b9034d6592 | cernbox.cern.ch | 4c510ada-c86b-4815-8820-42cdf82c3d51 | storage_id:"123e4567-e89b-12d3-a456-426655440000" opaque_id:"fileid-einstein%2Fmy-data" | GRANTEE_TYPE_USER | cesnet.cz | f7fbf8c8-139b-4376-b307-cf0a8c2d0d9c | 2023-04-11 11:52:08 +0200 CEST | 2023-04-11 11:52:08 +0200 CEST |
+--------------------------------------+-----------------+--------------------------------------+--------------------------------------------------------------------------------------------+-------------------+-------------+--------------------------------------+--------------------------------+--------------------------------+
(assume we are logged in on the receiving Reva instance as marie)
The grantee (ie. the receiver of the transfer) can now discover the transfer share and its details in the same way as with regular shares using the ocm-share-list-received
command to obtain the share id, and subsequent ocm-share-get-received
command using that share id:
>> ocm-share-list-received
+--------------------------------------+-----------------+--------------------------------------+-------------------------------------------------------------------------------+-------------------+-------------+--------------------------------------+--------------------------------+--------------------------------+---------------------+-----------------+
| # | OWNER.IDP | OWNER.OPAQUEID | RESOURCEID | TYPE | GRANTEE.IDP | GRANTEE.OPAQUEID | CREATED | UPDATED | STATE | SHARETYPE |
+--------------------------------------+-----------------+--------------------------------------+-------------------------------------------------------------------------------+-------------------+-------------+--------------------------------------+--------------------------------+--------------------------------+---------------------+-----------------+
| 79a2bf32-4bba-437a-ad8f-ec93211375b5 | cernbox.cern.ch | 4c510ada-c86b-4815-8820-42cdf82c3d51 | opaque_id:"123e4567-e89b-12d3-a456-426655440000:fileid-einstein%2Fmy-data" | GRANTEE_TYPE_USER | cesnet.cz | f7fbf8c8-139b-4376-b307-cf0a8c2d0d9c | 2023-04-11 11:52:08 +0200 CEST | 2023-04-11 11:52:08 +0200 CEST | SHARE_STATE_PENDING | SHARE_TYPE_USER |
+--------------------------------------+-----------------+--------------------------------------+-------------------------------------------------------------------------------+-------------------+-------------+--------------------------------------+--------------------------------+--------------------------------+---------------------+-----------------+
>> ocm-share-get-received 79a2bf32-4bba-437a-ad8f-ec93211375b5
{"id":{"opaqueId":"79a2bf32-4bba-437a-ad8f-ec93211375b5"}, "name":"my-data", "resourceId":{"opaqueId":"123e4567-e89b-12d3-a456-426655440000:fileid-einstein%2Fmy-data"}, "grantee":{"type":"GRANTEE_TYPE_USER", "userId":{"idp":"cesnet.cz", "opaqueId":"f7fbf8c8-139b-4376-b307-cf0a8c2d0d9c"}}, "owner":{"idp":"cernbox.cern.ch", "opaqueId":"4c510ada-c86b-4815-8820-42cdf82c3d51", "type":"USER_TYPE_FEDERATED"}, "creator":{"idp":"cernbox.cern.ch", "opaqueId":"4c510ada-c86b-4815-8820-42cdf82c3d51", "type":"USER_TYPE_FEDERATED"}, "ctime":{"seconds":"1683549473", "nanos":722800878}, "mtime":{"seconds":"1683549473", "nanos":722800878}, "shareType":"SHARE_TYPE_USER", "protocols":[{"transferOptions":{"sourceUri":"https://cernbox.cern.ch/remote.php/dav/ocm/IFs4ZVKVjp7OQsArvCSvXkf8A7emEQ71"}}], "state":"SHARE_STATE_PENDING", "resourceType":"RESOURCE_TYPE_CONTAINER"}
To start the transfer it must be accepted by the grantee.
The grantee (ie. the receiver of the transfer) must now accept the transfer by updating the state
of the transfer to accepted
. That will start the transfer. Optionally the grantee can also specify a path to which the data must be transferred:
>> ocm-share-update-received -state accepted -path /home/transfers 79a2bf32-4bba-437a-ad8f-ec93211375b5
OK
At this point the transfer should have started automatically. In the command example above the data will be transferred into the /home/transfers
folder of the grantee. In this case the final resulting path will read /home/transfer/my-data/
If a path is not provided with the command the transfers will be written into the folder as set by the configuration property data_transfers_folder
of the gateway as follows:
[grpc.services.gateway]
data_transfers_folder = "/home/MyTransfers"
Note that at least one of each must be provided but that the path
command flag overrides the configuration setting (ie. per transfer).
In case the transfer has failed and it is not a driver (rclone) issue, or maybe you want to transfer to another folder, use these 2 steps:
First update the share to pending
:
ocm-share-get-received -state pending 79a2bf32-4bba-437a-ad8f-ec93211375b5
OK
Next accept the transfer, optionally with a different path:
>> ocm-share-update-received -state accepted -path /home/transfers-sec 79a2bf32-4bba-437a-ad8f-ec93211375b5
OK
Now the data will be transferred to the /home/transfers-sec/my-data/
folder.
Whenever transfer shares are accepted corresponding transfer jobs will be created for them. These can be managed.
The transfer driver creates a transfer job for each transfer. These jobs can be managed (request status, retried, cancelled). For this one must first discover the transfer id from the transfers list.
List the transfers using the transfer-list
command to discover their corresponding transfer id:
>> transfer-list
+--------------------------------------+--------------------------------------+
| SHAREID.OPAQUEID | ID.OPAQUEID |
+--------------------------------------+--------------------------------------+
| 2c55dc61-4a06-4f44-9478-78eb1243971b | 0f901f2c-a004-4126-b810-29bf51909035 |
| 1f5de8f0-5565-4694-8eca-f66e578783c8 | f0b3b410-0e39-4591-92f7-8e229650b3c7 |
| 79a2bf32-4bba-437a-ad8f-ec93211375b5 | fe671ae3-0fbf-4b06-b7df-32418c2ebfcb |
+--------------------------------------+--------------------------------------+
Show the current status of a transfer using the transfer-status
command. Possible transfer states are:
cancelled
cancel failed
complete
expired
failed
in progress
new
invalid
transfer-get-status -txId fe671ae3-0fbf-4b06-b7df-32418c2ebfcb
+--------------------------------------+--------------------------------------+--------------------------+-----------------------------------+
| SHAREID.OPAQUEID | ID.OPAQUEID | STATUS | CTIME |
+--------------------------------------+--------------------------------------+--------------------------+-----------------------------------+
| 79a2bf32-4bba-437a-ad8f-ec93211375b5 | fe671ae3-0fbf-4b06-b7df-32418c2ebfcb | STATUS_TRANSFER_COMPLETE | Mon May 8 12:38:08 +0000 UTC 2023 |
+--------------------------------------+--------------------------------------+--------------------------+-----------------------------------+
Retry a transfer using the transfer-retry
command with the transfer id specified. This should restart the transfer job and return the new status of the transfer:
transfer-retry -txId fe671ae3-0fbf-4b06-b7df-32418c2ebfcb
+--------------------------------------+--------------------------------------+---------------------+-------------------------------+
| SHAREID.OPAQUEID | ID.OPAQUEID | STATUS | CTIME |
+--------------------------------------+--------------------------------------+---------------------+-------------------------------+
| 79a2bf32-4bba-437a-ad8f-ec93211375b5 | fe671ae3-0fbf-4b06-b7df-32418c2ebfcb | STATUS_TRANSFER_NEW | 2023-05-08 12:41:07 +0000 UTC |
+--------------------------------------+--------------------------------------+---------------------+-------------------------------+
A running transfer (transfer state in progress
) can be cancelled using the transfer-cancel
command as follows:
transfer-retry -txId fe671ae3-0fbf-4b06-b7df-32418c2ebfcb
+--------------------------------------+--------------------------------------+---------------------+-------------------------------+
| SHAREID.OPAQUEID | ID.OPAQUEID | STATUS | CTIME |
+--------------------------------------+--------------------------------------+---------------------+-------------------------------+
| 79a2bf32-4bba-437a-ad8f-ec93211375b5 | fe671ae3-0fbf-4b06-b7df-32418c2ebfcb | STATUS_TRANSFER_CANCELLED | 2023-05-08 13:50:12 +0000 UTC |
+--------------------------------------+--------------------------------------+---------------------+-------------------------------+
Transfers will be removed from the db using the transfer-cancel
command when the configuration property remove_transfer_on_cancel
and remove_transfer_job_on_cancel
of the datatx service and rclone driver respectively have been set to true
as follows:
[grpc.services.datatx]
remove_transfer_on_cancel = true
[grpc.services.datatx.txdrivers.rclone]
remove_transfer_job_on_cancel = true
Currently this setting is recommended.
*Note that with these settings transfer-cancel
will remove transfers & jobs even when a transfer cannot actually be cancelled because it was already in an end-state, eg. finished
or failed
. So transfer-cancel
will act like a ‘delete’ function.