Using AWS Lambda to create an archive of selected files from AWS S3


AWS Lambda is a compute service that runs your code in response to events and automatically manages the compute resources. An overview of the concept, its principles of operation, pricing, and the like is already available on Habr (habrahabr.ru/company/epam_systems/blog/245949), so here I will try to show a practical example of using this service.

So, as the title of this post implies, we will use AWS Lambda to create an archive of selected files stored on AWS S3. Let's go!




Creating a new function in the AWS console


AWS Lambda is currently in "preview", so if you are using it for the first time, you will need to submit a request and wait a few days.

To create a new function in the AWS console, click Create a Lambda function; a form with the parameters for the new function opens. First we specify a name and a description for the new function, and then its code.

The code can either be written directly in the editor or uploaded as a pre-built zip archive. The first option is suitable only for code without additional dependencies, and the second, at the time of writing, did not work through the web console. So at this stage we create the function in the editor without changing the proposed example code, and later upload the code we actually need, with all its dependencies, programmatically.

Role name specifies what access rights the function will have to various AWS resources. I will not dwell on this, except to note that the permissions offered by default when creating a new role grant access to AWS S3, which is sufficient for this example.
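For reference, such a role boils down to an IAM policy document. A rough, purely illustrative sketch of a policy granting S3 read/write access plus CloudWatch logging (not the exact default policy):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["logs:*"],
            "Resource": "arn:aws:logs:*:*:*"
        },
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::*"
        }
    ]
}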

You must also specify the amount of memory to allocate and the execution timeout.
The amount of memory allocated affects the function's price (the more memory, the more expensive). However, the CPU resources allocated to the function are tied to it as well. Since archiving is heavily CPU-bound, select the maximum available memory: Lambda is billed in GB-seconds, so the higher price per second is fully compensated by the reduction in processing time.

Having filled in the form, press Create Lambda function and leave the AWS console; we can now move on to writing our actual function.


The function code, its packaging, and uploading to AWS


For our task we will use several third-party libraries, as well as the grunt-aws-lambda library for convenient packaging and uploading of the function.

Create package.json with the following content:
{
    "name": "zip-s3",
    "description": "AWS Lambda Function",
    "version": "0.0.1",
    "private": "true",
    "devDependencies": {
        "aws-sdk": "^2.1.4",
        "grunt": "^0.4.5",
        "grunt-aws-lambda": "^0.3.0"
    },
    "dependencies": {
        "promise": "^6.0.1",
        "s3-upload-stream": "^1.0.7",
        "archiver": "^0.13.1"
    },
    "bundledDependencies": [
        "promise",
        "s3-upload-stream",
        "archiver"
    ]
}

and install dependencies:
npm install

The bundledDependencies array in package.json lists the dependencies that will be packaged together with our function when it is uploaded.

Then create the file index.js, which will contain the function code.
Here is what a function that does nothing looks like:
exports.handler = function (data, context) {
    context.done(null, '');
};

Calling context.done signals that the operation is complete; at this point AWS Lambda terminates execution, records the time used, and so on.
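Passing a non-null first argument to context.done reports a failure instead of success; a minimal sketch:

exports.handler = function (data, context) {
    if (!data.bucket) {
        // A non-null first argument marks the invocation as failed
        context.done(new Error('bucket is required'));
        return;
    }
    context.done(null, '');
};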

The data object contains the parameters passed to the function. In our case it will have the following structure:
{
    bucket: 'from-bucket',
    keys: ['/123/test.txt', '/456/test2.txt'],
    outputBucket: 'to-bucket',
    outputKey: 'result.zip'
}

Now let's start writing the actual function code.
First, require the needed libraries:

var AWS = require('aws-sdk');
var Promise = require('promise');
var s3Stream = require('s3-upload-stream')(new AWS.S3());
var archiver = require('archiver');
var s3 = new AWS.S3();
Create the objects that will archive the files and stream the resulting archive to AWS S3.

var archive = archiver('zip');

var upload = s3Stream.upload({
    "Bucket": data.outputBucket,
    "Key": data.outputKey
});

archive.pipe(upload);
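If the resulting archives can be large, s3-upload-stream also allows tuning of the underlying S3 multipart upload. A sketch with illustrative values (the method names are from the library's README, as I recall it):

// Optional multipart-upload tuning; the values are illustrative
upload.maxPartSize(20971520);  // 20 MB per part
upload.concurrentParts(5);     // up to 5 parts uploaded in parallel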

Create a promise that calls context.done once the result has finished uploading:
var allDonePromise = new Promise(function (resolveAllDone) {
    upload.on('uploaded', function (details) {
        resolveAllDone();
    });
});

allDonePromise.then(function () {
    context.done(null, '');
});
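It is also prudent to terminate the function if the upload fails, so that it does not simply hang until the timeout. A small sketch using the standard stream 'error' event (not part of the original code):

upload.on('error', function (err) {
    // Fail the invocation instead of waiting for the timeout
    context.done(err);
});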

Fetch the files at the specified keys and add them to the archive; once all of them have been received, finalize the archive:
var getObjectPromises = [];
data.keys.forEach(function (itemKey) {
    // Keys arrive URL-encoded from the caller, so decode them first
    itemKey = decodeURIComponent(itemKey).replace(/\+/g, ' ');
    var getPromise = new Promise(function (resolveGet) {
        s3.getObject({
            Bucket: data.bucket,
            Key: itemKey
        }, function (err, fileData) {
            if (err) {
                // Log and skip files that could not be fetched
                console.log(itemKey, err, err.stack);
                resolveGet();
            } else {
                // Store the file in the archive under its base name
                var itemName = itemKey.substr(itemKey.lastIndexOf('/'));
                archive.append(fileData.Body, { name: itemName });
                resolveGet();
            }
        });
    });
    getObjectPromises.push(getPromise);
});
Promise.all(getObjectPromises).then(function () {
    archive.finalize();
});

The complete code:

var AWS = require('aws-sdk');
var Promise = require('promise');
var s3Stream = require('s3-upload-stream')(new AWS.S3());
var archiver = require('archiver');
var s3 = new AWS.S3();

exports.handler = function (data, context) {
    var archive = archiver('zip');

    var upload = s3Stream.upload({
        "Bucket": data.outputBucket,
        "Key": data.outputKey
    });

    archive.pipe(upload);

    var allDonePromise = new Promise(function (resolveAllDone) {
        upload.on('uploaded', function (details) {
            resolveAllDone();
        });
    });

    allDonePromise.then(function () {
        context.done(null, '');
    });

    var getObjectPromises = [];
    data.keys.forEach(function (itemKey) {
        itemKey = decodeURIComponent(itemKey).replace(/\+/g, ' ');
        var getPromise = new Promise(function (resolveGet) {
            s3.getObject({
                Bucket: data.bucket,
                Key: itemKey
            }, function (err, fileData) {
                if (err) {
                    console.log(itemKey, err, err.stack);
                    resolveGet();
                } else {
                    var itemName = itemKey.substr(itemKey.lastIndexOf('/'));
                    archive.append(fileData.Body, { name: itemName });
                    resolveGet();
                }
            });
        });
        getObjectPromises.push(getPromise);
    });
    Promise.all(getObjectPromises).then(function () {
        archive.finalize();
    });
};


To package and upload the function to AWS, create Gruntfile.js as follows:
module.exports = function (grunt) {
    grunt.initConfig({
        lambda_invoke: {
            default: {
            }
        },
        lambda_package: {
            default: {
            }
        },
        lambda_deploy: {
            default: {
                function: 'zip-s3'
            }
        }
    });
    grunt.loadNpmTasks('grunt-aws-lambda');
};

And the file ~/.aws/credentials with the AWS access keys:
[default]
aws_access_key_id = ...
aws_secret_access_key = ...

Package and upload our function to AWS Lambda:
grunt lambda_package lambda_deploy
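The lambda_invoke task configured in the Gruntfile also lets you run the function locally before deploying. By the library's conventions (as far as I recall; treat the file name as an assumption), it reads the test event from an event.json file next to the code:

{
    "bucket": "from-bucket",
    "keys": ["/123/test.txt", "/456/test2.txt"],
    "outputBucket": "to-bucket",
    "outputKey": "result.zip"
}

grunt lambda_invoke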

Calling the created function from our application


We will call the function from a Java application.
First, prepare the request data:
JSONObject requestData = new JSONObject();
requestData.put("bucket", "from bucket");
requestData.put("outputBucket","to-bucket");
requestData.put("outputKey", "result.zip");

JSONArray keys = new JSONArray();
keys.put(URLEncoder.encode("/123/файл1.txt","UTF-8"));
keys.put(URLEncoder.encode("/456/файл2.txt","UTF-8"));
requestData.put("keys", keys);

And invoke the function itself:
AWSCredentials myCredentials = new BasicAWSCredentials(accessKeyID, secretKey);

AWSLambdaClient awsLambda = new AWSLambdaClient(myCredentials);

InvokeAsyncRequest req = new InvokeAsyncRequest();
req.setFunctionName("zip-s3");
req.setInvokeArgs(requestData.toString());

InvokeAsyncResult res = awsLambda.invokeAsync(req);

The function executes asynchronously: the result we get back immediately only means that the invocation request was successfully received by AWS Lambda; the execution itself will take some time.