Delete Kaltura Entries from Amazon S3 Bucket

When setting up a Kaltura cluster, you usually want to store entries on remote storage such as an Amazon S3 bucket. The problem is that entries deleted through the Kaltura KMC are not deleted from the bucket, so the orphaned files keep costing you money.

We have previously written about using S3 with Kaltura, including several fixes to the Kaltura database. To delete Kaltura entries from the S3 bucket automatically, we need to apply the code and database fixes described below.

Check that the batch delete jobs are running

The file batch_config.ini (/opt/kaltura/app/batch/batch_config.ini) on your batch server controls which jobs the server runs. Make sure it includes the following section, and that params.useS3 is set to 1 (enabled) rather than 0.

[KAsyncStorageDelete]
 id = 380
 name = KAsyncStorageDelete
 friendlyName = Storage Delete
 type = KAsyncStorageDelete
 maximumExecutionTime = 300
 maxJobsEachRun = 1
 scriptPath = batches/Storage/KAsyncStorageDeleteExe.php
 scriptArgs =
 maxInstances = 3
 sleepBetweenStopStart = 43200
 startForQueueSize = 0
 enable = 1
 autoStart = 1
 params.useFTP = 1
 params.useSCP = 1
 params.useSFTP = 1
 params.useS3 = 1
 params.chmod = 755

Another interesting setting is sleepBetweenStopStart, which sets the interval between delete job runs, in seconds. Setting it to 12 hours (43200 seconds) leaves a window to recover from an accidental delete.
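A quick way to confirm the flag is on is to grep the config on the batch server. This is only a sketch: the install path is the one used throughout this setup, and the fallback sample file exists solely so the snippet runs on a machine without Kaltura installed.

```shell
# Hypothetical check; on a real install CONF points at the live batch config
CONF=/opt/kaltura/app/batch/batch_config.ini
if [ ! -f "$CONF" ]; then
  # no Kaltura here: build a minimal sample so the check can be demonstrated
  CONF=$(mktemp)
  printf '[KAsyncStorageDelete]\nparams.useS3 = 1\nsleepBetweenStopStart = 43200\n' > "$CONF"
fi
# the S3 delete job only runs when useS3 is enabled
grep -E '^[[:space:]]*params\.useS3[[:space:]]*=[[:space:]]*1' "$CONF" \
  && echo "S3 delete enabled"
```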

Add the allow auto delete permission

We need to enable a flag in the storage_profile.custom_data field in the kaltura database. Log into your kaltura database and check the storage_profile.custom_data field.
The allow_auto_delete flag should be set to 1 (serialized as b:1). If it is not, set it. For example, if your partner id is 100:

update storage_profile set custom_data = 'a:5:{s:17:"allow_auto_delete";b:1;s:11:"path_format";N;s:19:"path_manager_params";s:6:"a:0:{}";s:7:"trigger";i:3;s:18:"url_manager_params";s:6:"a:0:{}";}' where partner_id = 100;
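The custom_data column holds a PHP-serialized array, so the flag is easy to misread by eye. As a sanity check after running the UPDATE, you can test the blob you read back with a regex; a minimal sketch in Python, using the exact blob from the statement above:

```python
import re

# custom_data blob as written by the UPDATE statement above
custom_data = ('a:5:{s:17:"allow_auto_delete";b:1;s:11:"path_format";N;'
               's:19:"path_manager_params";s:6:"a:0:{}";s:7:"trigger";i:3;'
               's:18:"url_manager_params";s:6:"a:0:{}";}')

def allow_auto_delete(blob):
    """Return True if the PHP-serialized blob has allow_auto_delete set to true (b:1)."""
    m = re.search(r's:17:"allow_auto_delete";b:([01]);', blob)
    return bool(m) and m.group(1) == "1"

print(allow_auto_delete(custom_data))  # True when the flag is set
```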

Adding support for S3 in the KAsyncStorageDelete class

We need to add support for the S3 protocol in the KAsyncStorageDelete class (/opt/kaltura/app/batch/batches/Storage/KAsyncStorageDelete.class.php) on your front (API) server.
Open the file and add a line after line 158. The line that should be added is the useS3 check, marked with a comment below:

protected function getSupportedProtocols()
{
	$supported_engines_arr = array();
	if ($this->taskConfig->params->useFTP)  $supported_engines_arr[] = KalturaStorageProfileProtocol::FTP;
	if ($this->taskConfig->params->useSCP)  $supported_engines_arr[] = KalturaStorageProfileProtocol::SCP;
	if ($this->taskConfig->params->useSFTP) $supported_engines_arr[] = KalturaStorageProfileProtocol::SFTP;
	if ($this->taskConfig->params->useS3)   $supported_engines_arr[] = KalturaStorageProfileProtocol::S3; // line to add

	return join(',', $supported_engines_arr);
}

Fixing the doDelFile function

The doDelFile function in the s3Mgr class doesn't actually delete the file from S3. To fix it, open /opt/kaltura/app/infra/general/file_transfer_managers/s3Mgr.class.php
and find the doDelFile function. Replace its last two lines so that, after the edit, the function looks like this:

protected function doDelFile($remote_file)
{
	list($bucket, $remote_file) = explode("/", ltrim($remote_file, "/"), 2);
	// fetch the object first so it can be returned to the caller
	$object = $this->s3->getObject($bucket, $remote_file);
	// actually remove the file from the bucket
	$this->s3->deleteObject($bucket, $remote_file);
	return $object;
}

Testing

  1. Set the sleepBetweenStopStart flag to one minute (60 seconds). Don’t forget to change it back after the test.
  2. Upload a video to the KMC.
  3. Check that it arrived at the S3 bucket using the entry id.
  4. Delete the file from the KMC.
  5. Check that the file was deleted from the S3 bucket.
You can also follow the delete job’s progress in the API log (/opt/kaltura/log/kaltura_api_v3.log) and in the admin console’s Batch Process Control.
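Step 5 can also be scripted by filtering a bucket listing for the entry id. This is a sketch: the bucket name and entry id are placeholders, and a captured sample listing stands in for the live AWS CLI output so the filter can be shown without credentials.

```shell
ENTRY_ID="0_abc123"   # placeholder entry id from the KMC
# on a live system you would capture: LISTING=$(aws s3 ls s3://your-bucket/ --recursive)
LISTING='2019-01-01 12:00:00  1024 content/entry/data/0_abc123_1.mp4'
if printf '%s\n' "$LISTING" | grep -q "$ENTRY_ID"; then
  echo "entry still in bucket"
else
  echo "entry deleted"
fi
```

After a successful delete run, the grep should find nothing and the script reports the entry as deleted.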
