Monday, December 12, 2016

AWS tutorial: Retrieve items from DynamoDB using Lambda and API Gateway

Introduction


In the previous tutorial I showed you how to use AWS Lambda and API Gateway to insert items in a DynamoDB table. The example implemented a function which stored the location of the user. In this tutorial we create a Lambda function which retrieves this data from the DynamoDB table and expose this functionality over HTTP using API Gateway.

Create a DynamoDB Global Secondary Index
The DeviceLocation DynamoDB table uses the id as partition key. To retrieve the locations for a given device, a global secondary index must be created to query on deviceId.

1. Open the AWS console and navigate to the DynamoDB section.
2. Select the DeviceLocation table and click the Indexes tab.
3. Select Create Index.
4. Use deviceId as the partition key, use 1 for read/write capacity units and click create. See the create_dynamodb_index screenshot.

create_dynamodb_index

Create the implementation
1. Create a new class named RetrieveLocationRequest with one field named “deviceId” of type String. This class holds the parameter that is used as query input to the DynamoDB table.
2. Create a new class named RetrieveLocationResponse with a field named “locations” of type List<DeviceLocation>. This class holds the data returned to the consumer of the Lambda function.
3. Initialize the locations field to new ArrayList<>(), since we are going to add DeviceLocation objects to it.
4. Create a new class named RetrieveLocationFunction which implements the com.amazonaws.services.lambda.runtime.RequestHandler interface, with RetrieveLocationRequest as the input type and RetrieveLocationResponse as the output type.
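As a rough sketch, the request and response classes from steps 1 to 3 could look like the code below. The classes are declared without `public` only so that they fit in a single file; in the project each would normally be a public class in its own file. The DeviceLocation class shown here is a trimmed stand-in for the one from part 1, and the setter on the response class is my own assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Trimmed stand-in for the DeviceLocation class from part 1 (lat and lng omitted for brevity).
class DeviceLocation {
    private String deviceId;

    public String getDeviceId() { return deviceId; }
    public void setDeviceId(String deviceId) { this.deviceId = deviceId; }
}

// Input of the Lambda function: the device whose locations are requested.
class RetrieveLocationRequest {
    private String deviceId;

    public String getDeviceId() { return deviceId; }
    public void setDeviceId(String deviceId) { this.deviceId = deviceId; }
}

// Output of the Lambda function: all locations found for the device.
class RetrieveLocationResponse {
    // Initialized eagerly so the handler can add items without a null check.
    private List<DeviceLocation> locations = new ArrayList<>();

    public List<DeviceLocation> getLocations() { return locations; }
    public void setLocations(List<DeviceLocation> locations) { this.locations = locations; }
}
```

The getters and setters matter: AWS Lambda uses them to convert between JSON and these objects.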

The RetrieveLocationFunction is the implementation of our Lambda function which queries DynamoDB using the given deviceId. It retrieves all locations for the given device.

Copy the following code and paste it into the handleRequest method of the RetrieveLocationFunction class. Make sure the region used here is the same as the region of the DynamoDB table!

final AmazonDynamoDBClient client = new AmazonDynamoDBClient(new EnvironmentVariableCredentialsProvider());
client.withRegion(Regions.EU_CENTRAL_1); // specify the region you created the table in.
final DynamoDB dynamoDB = new DynamoDB(client);

System.out.println("input = " + input); // Purely for testing. Do not use System.out in production code.

final Table table = dynamoDB.getTable("DeviceLocation");
final Index index = table.getIndex("deviceId-index");
// The collection must be parameterized; with the raw ItemCollection type the for-each loop below does not compile.
final ItemCollection<QueryOutcome> items = index.query("deviceId", input.getDeviceId());

final RetrieveLocationResponse response = new RetrieveLocationResponse();
for (final Item item : items) {
    final DeviceLocation deviceLocation = new DeviceLocation();
    deviceLocation.setDeviceId(item.getString("deviceId"));
    deviceLocation.setLat(item.getDouble("lat"));
    deviceLocation.setLng(item.getDouble("lng"));
    response.getLocations().add(deviceLocation);
}

return response;

Create the Lambda function
After the implementation is ready we need to upload the function and configure it in AWS.

1. Run the Gradle “clean build” task to create the zip distribution which holds our code and dependencies.
2. Open the AWS console and navigate to the Lambda section.
3. Select “Create a Lambda function” and select Blank Function.
4. Click Next.
5. As name specify “retrieveLocations” and as runtime select Java 8.
6. Upload the PROJECT/build/distributions/DISTRIBUTION_NAME.zip file
7. Use com.example.persister.RetrieveLocationFunction as the Handler
8. Use the existing lambda_location_persister role created in part 1 of this series.
9. Leave the rest default and click Next.
10. Click Create Function.
11. To test the function click Test and use the following test data (replace DEVICE_ID with the deviceId of a record in the table; refer to part 1 to insert data into the DeviceLocation table):

{
  "deviceId": "DEVICE_ID"
}

Exposing the functionality using API Gateway

The retrieveLocations function will be exposed as an HTTP GET method through API Gateway.

1. Open the AWS console and navigate to API Gateway.
2. Create a new GET method under the devicelocation resource (created in part 1).
3. As Integration Type choose Lambda Function and select the region the Lambda Function was created in.
4. Specify retrieveLocations as the name and click Save.

The deviceId is passed in as a query parameter in the URL and must be mapped as input to the Lambda function.

1. Open the Method Request settings and open URL Query String Parameters.
2. Add a query parameter named deviceId.
3. Open the Integration Request settings and open Body Mapping Templates.
4. Choose the option: When there are no templates defined (recommended)
5. Add a mapping template for: application/json
6. Add the following template to map the deviceId URL query parameter to the Lambda input (see screenshot body_mapping).

{
    "deviceId": "$input.params('deviceId')"
}

body_mapping


7. Click Save.
8. Click Test, add a known deviceId and click the Test button. You should see the output of the Lambda function.
9. Click Actions -> Deploy API to deploy your API.
10. Your function should now be publicly accessible.
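Once the API is deployed, a client can call the GET method by passing the deviceId as a query parameter. The following is a minimal sketch using only the JDK. The class and method names are my own, and the resource URL you pass in is whatever Invoke URL API Gateway shows for your stage and resource.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.util.Scanner;

// Hypothetical client for the deployed GET method; not part of the tutorial project.
class RetrieveLocationsClient {

    // Appends the URL-encoded deviceId query parameter to the resource URL.
    static String buildUrl(String resourceUrl, String deviceId) {
        try {
            return resourceUrl + "?deviceId=" + URLEncoder.encode(deviceId, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM.
            throw new IllegalStateException(e);
        }
    }

    // Performs the GET request and returns the raw JSON response body.
    static String fetchLocations(String resourceUrl, String deviceId) throws IOException {
        HttpURLConnection connection =
                (HttpURLConnection) new URL(buildUrl(resourceUrl, deviceId)).openConnection();
        connection.setRequestMethod("GET");
        try (InputStream in = connection.getInputStream();
             Scanner scanner = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            // The \A delimiter makes the scanner return the whole stream as one token.
            return scanner.hasNext() ? scanner.next() : "";
        } finally {
            connection.disconnect();
        }
    }
}
```

The deviceId query parameter is exactly what the body mapping template reads with $input.params('deviceId') before handing it to the Lambda function.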

Make sure you delete your resources when you are done with the tutorial to prevent unwanted billing.

Friday, December 2, 2016

AWS Lambda/Java, DynamoDB and API Gateway integration

Introduction

Part 2: http://jcraane.blogspot.nl/2016/12/aws-tutorial-retrieve-items-from.html

In this post I go through a full (Java) example of integrating AWS Lambda, DynamoDB and API Gateway to create a function and expose it as an HTTP resource for other parties to consume.

Before we dive into the details I will give a brief overview of the AWS services used in this example (as taken from the AWS documentation):

  • AWS Lambda: AWS Lambda is a compute service that runs developers' code in response to events and automatically manages the compute resources for them, making it easy to build applications that respond quickly to new information.
  • DynamoDB: Fast and flexible, managed, NoSQL database.
  • API Gateway: Amazon API Gateway is a fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale.

In this example we are going to create a Lambda function which tracks the location (latitude and longitude) of a specific mobile device. The data flow looks like this:

mobile device -> HTTP POST -> API Gateway -> recordLocation (Lambda function) -> DynamoDB (store location)

Creating the application

Prerequisites:

  • IntelliJ IDEA is used for this example, but any IDE will do. Gradle is used as the build system.
  • AWS account to actually deploy and run the example.
  • After you are done with the example, delete any AWS resources you have created to prevent unnecessary billing.

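For the implementation of the Lambda function we use the AWS Java SDK. Note that the DynamoDB classes used later in this post (AmazonDynamoDBClient, DynamoDB, Table, Item) live in the com.amazonaws:aws-java-sdk-dynamodb artifact, so that dependency must be on the classpath as well.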

1. In IntelliJ, select File -> New project and choose Gradle with the Java library. Click Next.
2. groupId=com.example, artifactId=locationpersister. Click Next.
3. Choose a Java 8 JDK and click Next, then click Finish.
4. Create the src/main/java folder if it does not exist already.
5. Open the build.gradle file and add the following dependencies:
compile 'com.amazonaws:aws-lambda-java-core:1.1.0'
compile 'com.amazonaws:aws-lambda-java-events:1.1.0'
6. Add the following code to the build.gradle file:
task buildZip(type: Zip) {
    from compileJava
    from processResources
    into('lib') {
        from configurations.runtime
    }
}
build.dependsOn buildZip
The above code creates a zip archive when the build task is triggered. The zip file can be uploaded directly to AWS Lambda.

7. Create a class com.example.persister.DeviceLocation with the members: lat (double), lng (double) and deviceId (string). This class holds the data that gets submitted to the Lambda function.
8. Create a new class com.example.persister.LocationPersisterFunction. This class will hold the implementation of the Lambda function.
9. Make the LocationPersisterFunction implement the com.amazonaws.services.lambda.runtime.RequestHandler interface.
This interface defines the handleRequest function, which is executed when the Lambda function is triggered. The handleRequest function takes two parameters: the input (of type DeviceLocation, passed in when the function is invoked) and the Context object.

Creating the DynamoDb table

To store the data in DynamoDB we need to create a table.

1. Open the AWS console and navigate to the DynamoDB section.
2. Click Create table and for the name use: "DeviceLocation". Type id (string field) to use as the partition key and do not specify a sort key. Leave everything default and click Create.
3. Please note that the table is created in the selected region. If you click on the table and look at the details you can see the region of the table.

Create the DeviceLocation class

The DeviceLocation class holds the input which is passed to the Lambda function.
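1. Create a new class named DeviceLocation with the following fields: deviceId (string), lat (double) and lng (double). The unique id of each item is generated in the Lambda function and is not part of this class.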
2. Make sure the class contains both setters and getters. The setters are used by AWS Lambda to populate this object based on the passed in JSON when calling the Lambda function.
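A minimal sketch of such a class is shown below. The field names follow the test JSON used later in this post (deviceId, lat, lng). The class is declared without `public` only to keep the sketch in a single file; in the project it would be the public class com.example.persister.DeviceLocation.

```java
// Holds the location data that is passed to the Lambda function as JSON.
class DeviceLocation {
    private String deviceId;
    private double lat;
    private double lng;

    public String getDeviceId() { return deviceId; }
    public void setDeviceId(String deviceId) { this.deviceId = deviceId; }

    public double getLat() { return lat; }
    public void setLat(double lat) { this.lat = lat; }

    public double getLng() { return lng; }
    public void setLng(double lng) { this.lng = lng; }
}
```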

Implementing the Lambda function

1. Open the LocationPersisterFunction
2. Add the following code to the body of the handleRequest method:
final AmazonDynamoDBClient client = new AmazonDynamoDBClient(new EnvironmentVariableCredentialsProvider());
client.withRegion(Regions.EU_WEST_1); // specify the region you created the table in.
final DynamoDB dynamoDB = new DynamoDB(client);
final Table table = dynamoDB.getTable("DeviceLocation");
final Item item = new Item()
        .withPrimaryKey("id", UUID.randomUUID().toString()) // Every item gets a unique id
        .withString("deviceId", input.getDeviceId())
        .withDouble("lat", input.getLat())
        .withDouble("lng", input.getLng());
table.putItem(item);
return null;
3. Make sure you change the region to match the region you created the table in.
4. The above code gets a reference to the DynamoDB DeviceLocation table, creates an item and persists it.
5. Execute the Gradle build task to create a zip archive of our code.
6. Now that the implementation is complete we are ready to create our AWS Lambda function.

Creating our Lambda function

1. Open the AWS console and navigate to the Lambda section.
2. Select Blank function and click next (we do not create a trigger at this stage).
3. As name choose: persistDeviceLocation and select Java 8 as the runtime
4. Upload the /build/distributions/locationpersister-1.0-SNAPSHOT.zip file
5. In the Handler field specify the fully qualified classname which implements our handler: com.example.persister.LocationPersisterFunction
6. In the Role field choose to create a custom role. The create role form opens. Use lambda_location_persister as the Role name and click Allow. The role is created and selected in the Existing Role field. See the screenshot "lambda_role".
7. Leave everything default and click Next
8. Click Create function

lambda_role


Testing the function in the AWS console

After the function is created we are going to test it using the AWS console.
1. Click Test
2. A dialog opens where you can specify the data sent to the Lambda function. Use the following test data:

{
  "deviceId": "deviceId",
  "lat": 52.5,
  "lng": 5.5
}

3. You can modify the test data at any time by clicking Actions -> Configure test event
4. When done, click Test
5. If everything went according to plan you should get an error message which states the following: Status Code: 400; Error Code: AccessDeniedException
6. This is expected at this point. Although we created a custom role, we did not give this role permission to access our DynamoDB table.

Add DynamoDB permissions to our role

1. Open the AWS console and navigate to the IAM section.
2. Click on Roles
3. Click on the lambda_location_persister to open it.
4. Click Attach Policy
5. In the filter field search for DynamoDB
6. Select the AmazonDynamoDBFullAccess policy and click Attach
7. Navigate back to AWS Lambda and test the function again. The function should now succeed.
8. Navigate to DynamoDB, select the DeviceLocation table and click on Items. You should see one item added to the table.
9. If you get a Status Code: 400; Error Code: ResourceNotFoundException error, check that the region you specified in the Lambda implementation corresponds to the region of the DynamoDB table.

Creating the API Gateway
The API Gateway is used to create an HTTP endpoint which is the trigger for the Lambda function. Applications can communicate with this endpoint over HTTP.

1. Open the AWS console and navigate to API Gateway.
2. Create a new API. As the name use: LocationPersisterApi
3. Click Create API
4. Select Actions -> and click Create Resource, see screenshot_create_resource
5. As resource name use: devicelocation, and click Create Resource
6. Select Actions -> and click Create Method and select POST.
7. In the method details select Lambda Function as the Integration Type, the region you created the Lambda function in as the Lambda Region, and persistDeviceLocation (the name of the Lambda function) as the Lambda Function.
8. Click Save and then OK
9. Click Test and paste a test message in the body. This can be the same message as used in the Test section of the Lambda function. After the body is filled in, click Test. If everything is OK you should see an HTTP 200 status code.
10. Select Actions -> Deploy API and select [New Stage]. Specify prod as the stage name.
11. Click Deploy. Your API will be deployed so that it can be accessed from the outside world.
12. Navigate to the prod stage, expand the resources and select the POST method. Copy the URL shown after the 'Invoke URL' text into, for example, Postman. Execute an HTTP POST with a test message as the body. You should see an HTTP 200 status code (success).
13. If you now open the DynamoDB tables and list the items you should see several items added to the table.
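Instead of Postman, the same POST can be sent from plain Java. The sketch below is only an illustration: the class and method names are hypothetical, the invoke URL is the one you copied in step 12, and the JSON is built by simple concatenation on the assumption that the deviceId contains no characters that need JSON escaping.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical client for the deployed POST method; not part of the tutorial project.
class RecordLocationClient {

    // Builds the JSON body in the same shape as the Lambda test event.
    // Assumption: deviceId needs no JSON escaping.
    static String toJson(String deviceId, double lat, double lng) {
        return "{\"deviceId\": \"" + deviceId + "\", \"lat\": " + lat + ", \"lng\": " + lng + "}";
    }

    // Posts one location to the given invoke URL and returns the HTTP status code.
    static int postLocation(String invokeUrl, String deviceId, double lat, double lng) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(invokeUrl).openConnection();
        connection.setRequestMethod("POST");
        connection.setRequestProperty("Content-Type", "application/json");
        connection.setDoOutput(true); // required to send a request body
        try (OutputStream out = connection.getOutputStream()) {
            out.write(toJson(deviceId, lat, lng).getBytes(StandardCharsets.UTF_8));
        }
        try {
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }
}
```

A status code of 200 returned by postLocation corresponds to the success result described in step 12.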


Conclusion

AWS Lambda, DynamoDB and API Gateway together are a powerful way to provision functionality in the cloud without having to provision entire servers or more full-fledged managed services like Elastic Beanstalk. This post showed you how to use those AWS services to create a Lambda function which uses DynamoDB and make it available through API Gateway.

Resources

- The full source code of the example project can be found on Github.
- AWS Lambda
- AWS DynamoDB
- AWS Api Gateway

Sunday, August 21, 2016

IntelliJ TooltipRunner plugin

Live coding during presentations can be a powerful mechanism to captivate an audience and this is preferably done using a tool which is suited for this. This means an IDE which provides some sort of presentation mode to focus on the code at hand.

In the talk Get a Taste of Lambdas and Get Addicted to Streams, Venkat Subramaniam uses this technique of live coding. What is especially useful is that TextMate is set up so that the results of program execution are displayed as a tooltip.

Since I use IntelliJ instead of TextMate, I searched for something similar. Unfortunately there was nothing. There is the convenient presentation mode, but no option (that I could find) to display the results of a Java main execution in a tooltip.

That is why I created a plugin which does the same for IntelliJ. The plugin can be found here or you can install it using the Plugin manager in the Preferences. You can find a demonstration of the plugin on YouTube.

The source code of this plugin can be found on Github: https://github.com/jcraane/intellij-tooltip-runner

Monday, February 22, 2016

Increase max open files in Elastic Search (OSX)

Elastic Search, Logstash and Kibana (ELK) is an end-to-end stack which provides realtime analytics for almost any type of structured or unstructured data. 

When importing large amounts of data using Logstash to Elastic Search (ES), chances are that ES hits the limit on the number of files it can open. This shows up as an error in the ES logs with the following description: (Too many open files)

To deal with this you can increase the maximum files ES (or any process) may open using the following steps:

1. First start ES with the following option: ./elasticsearch -Des.max-open-files. This will show the maximum number of files ES is allowed to open, for example: [2016-02-22 06:44:09,558][INFO ][bootstrap                ] max_open_files [10240]
2. Now execute the following commands to increase the maximum number of files a process may open:

- sudo sysctl -w kern.maxfiles=32000 
- sudo sysctl -w kern.maxfilesperproc=32000

3. Execute the following commands to set the file limit for the terminal process (this is the terminal window you launch ES from):

- ulimit -Sn 32000
- ulimit -Hn 32000

For ES 1.7:
Start Elastic Search with the following command: ./elasticsearch -Des.max-open-files -XX:-MaxFDLimit=true

For ES 2.2
Execute the following commands:

export ES_JAVA_OPTS=-XX:-MaxFDLimit (this increases the maximum files the JVM is allowed to open by default, see JVM configuration for more information)

and then start ES with the following command: 
./elasticsearch -Des.max-open-files (max_open_files should be 32000 now)
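If you want to check the limit independently of ES, a small Java program can report what the JVM itself sees. This is a sketch of my own, not something ES ships. It relies on the com.sun.management.UnixOperatingSystemMXBean interface, which is available on Unix-like platforms such as OSX and Linux. Run it from the same terminal and with the same JVM options (for example -XX:-MaxFDLimit) to see their effect.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

// Reports the file descriptor limits as seen by the current JVM process.
class FileDescriptorLimits {

    // Maximum number of file descriptors this JVM may open, or -1 on non-Unix platforms.
    static long maxFileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            return ((UnixOperatingSystemMXBean) os).getMaxFileDescriptorCount();
        }
        return -1;
    }

    // Number of file descriptors currently open in this JVM, or -1 on non-Unix platforms.
    static long openFileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            return ((UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
        }
        return -1;
    }
}
```

Printing FileDescriptorLimits.maxFileDescriptors() before and after the sysctl and ulimit changes gives a quick way to confirm that the new limit actually reaches the JVM.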


Heap sizes

You may also need to increase the heap size of both Logstash and Elastic Search. 

To increase the heap of Logstash execute the following command before launching Logstash: export LS_HEAP_SIZE=2g

To increase the heap of Elastic Search execute the following command before launching ES: export ES_HEAP_SIZE=8g

When importing large files using Logstash, it may help to increase the number of workers to speed up the import. The default is 1. See the following example of the elasticsearch output plugin in Logstash:

elasticsearch {
     action => "index"
     hosts => ["localhost"]
     index => "logstash-%{+YYYY.MM.dd}"
     workers => 4
     flush_size => 1000
}