Hello Everyone,

In this three-post series, we are going to see how to deploy a deep learning model that performs image classification on AWS and expose it through an API, so that others can interact with it.

Go to Post 1: Link

Go to Post 2: Link

Go to Post 3: You are exactly where you should be.

As stated before, we are going to use three services in AWS. This post covers the third one.

1. AWS API Gateway (Part 3)

AWS API Gateway

In our first and second posts, we addressed a limitation of AWS Lambda: it cannot handle deployment packages larger than 250 MB uncompressed. To get around it, we attached EFS, which is like the Google Drive of AWS: storage that scales as you need it, where you can load your deployment packages.

In the second post, we added deployment packages (both light and heavy) to AWS Lambda using both AWS Lambda Layers and AWS EFS.

In this third post, we will finally deploy our image recognition model using API Gateway and expose it to the outside world.

Let’s get started.

1. Navigate to the AWS API Gateway console, click Create API, then Build a REST API, and fill in the required fields.

Give the API a name and set the Endpoint type to Regional.

There are three endpoint types available: Regional (deployed only in the current region), Edge optimized (deployed at various edge locations around the world), and Private (deployed within your VPC).

2. Now let’s create a resource and a method under it, following the conventions of a REST (Representational State Transfer) API.

3. Under Actions, click Create Resource, enter a Resource Name, and click Create Resource.

4. Under the resource we just created, add a new method: POST (an HTTP verb), since we are receiving an image from the user. Select “Lambda Function” as the integration type and check Use Lambda Proxy Integration, since we want all input handling (mapping rules, input content type, etc.) to be done in Lambda. Then choose the Lambda function that we created in the last post and save.

A window will prompt you to add the required permissions to that Lambda function. Just click OK.

In the Lambda function console, we can see that a trigger was added automatically, so our Lambda function can now be invoked by API Gateway. Also check the Permissions tab, where a resource-based policy was added.
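With Lambda proxy integration, API Gateway expects the function to return a dictionary with a status code, optional headers, and a string body. A minimal sketch of that shape is below; the helper name make_response and the sample payload are my own placeholders, not part of the original function.

```python
import json


def make_response(status_code, payload):
    """Build the response dict a proxy-integrated Lambda returns to API Gateway."""
    return {
        "statusCode": status_code,
        "headers": {"Content-Type": "application/json"},
        # The body must be a string, so the payload is serialized to JSON.
        "body": json.dumps(payload),
        "isBase64Encoded": False,
    }


def lambda_handler(event, context):
    # Placeholder handler: always answers with a small JSON document.
    return make_response(200, {"message": "ok"})
```

If the function returns anything that does not follow this shape, API Gateway typically answers the caller with a 502 error.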

5. Finally, in the Settings tab, add the Binary Media Type */*, a wildcard so that the API accepts any input format, and save the changes.

6. Now we deploy our API so that we get an invoke URL to send our requests to. Under the API’s Actions menu, click Deploy API.

7. Choose a stage name (for example v1, dev, prod, or test) and click Deploy.

8. Now we have an invoke URL, which we can call with curl from Bash.

Now let’s check our function code. Here we print the event.

AWS Lambda integrates with other AWS services to invoke functions. You can configure triggers to invoke a function in response to resource lifecycle events or incoming HTTP requests.

Each service that integrates with Lambda sends data to your function in JSON as an Event. The structure of the event document is different for each event type and contains data about the resource or request that triggered the function. Lambda runtimes convert the event into an object and pass it to your function.
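To make this concrete, here is a minimal sketch of a handler that logs the parts of an API Gateway proxy event we care about. The helper summarize_event is a name I made up for illustration; the keys it reads (httpMethod, path, headers, body, isBase64Encoded) are fields of the REST API proxy event.

```python
import json


def summarize_event(event):
    """Pick out the fields of a proxy event that matter for an image upload."""
    headers = event.get("headers") or {}
    # Header names can arrive in any letter case, so compare case-insensitively.
    content_type = next(
        (value for key, value in headers.items() if key.lower() == "content-type"),
        None,
    )
    body = event.get("body") or ""
    return {
        "httpMethod": event.get("httpMethod"),
        "path": event.get("path"),
        "contentType": content_type,
        "isBase64Encoded": event.get("isBase64Encoded", False),
        "bodyLength": len(body),
    }


def lambda_handler(event, context):
    # Anything printed here ends up in the function's CloudWatch logs.
    print(json.dumps(summarize_event(event)))
    return {"statusCode": 200, "body": json.dumps("Hello from Lambda!")}
```

Printing a summary rather than the whole event keeps the log readable, since the body of an image upload is one very long base64 string.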

9. Now we invoke this URL from our Bash console:

Command: curl --request POST -H "Content-Type: image/jpg" --upload-file "xray.jpg" https://3biakdh1fg.execute-api.ap-south-1.amazonaws.com/v1/xray

In the above command, we send a POST request and specify the content type as image/jpg (various other content types are available). Then we upload the file, in my case an image “xray.jpg” on my desktop, followed by the invoke URL and resource path (https://…/xray).

And we get the default output “Hello from Lambda”, which denotes that our process ran without any errors.
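If you would rather call the endpoint from Python than from curl, the same request can be sketched with the standard library. The function names build_upload_request and post_image are my own; the content type matches the one used in the curl command.

```python
import urllib.request


def build_upload_request(invoke_url, image_bytes, content_type="image/jpg"):
    """Create a POST request that carries the raw image bytes as its body."""
    return urllib.request.Request(
        invoke_url,
        data=image_bytes,
        headers={"Content-Type": content_type},
        method="POST",
    )


def post_image(invoke_url, image_path):
    """Send the image file to the invoke URL and return (status, response text)."""
    with open(image_path, "rb") as image_file:
        request = build_upload_request(invoke_url, image_file.read())
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")
```

Calling post_image with your own invoke URL and the path to “xray.jpg” should return the same body that curl prints.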

10. Now let’s look at the CloudWatch logs of our Lambda function to see what it received when we invoked the API.

11. Select the most recent log stream, based on the time of the invocation.

12. Here the event is a JSON representation of our request, containing all the parameters we need.

The input image that we sent to the invoke URL is under the “body” key of the JSON. It is base64-encoded, so to reconstruct the image we need to base64-decode it.

13. Now we write Python code in lambda_function to decode this base64-encoded image and store it on our EFS drive.
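A minimal sketch of that decoding step is below. It assumes the event comes from the proxy integration described above; the function name save_image_from_event is hypothetical, and the target directory is passed in as a parameter so you can point it at wherever your EFS is mounted.

```python
import base64
import os


def save_image_from_event(event, target_dir, filename="image.jpg"):
    """Decode the request body and write it to target_dir; return the file path."""
    body = event.get("body") or ""
    if event.get("isBase64Encoded", False):
        # Binary payloads arrive base64-encoded, so turn them back into bytes.
        data = base64.b64decode(body)
    else:
        data = body.encode("utf-8")
    os.makedirs(target_dir, exist_ok=True)
    path = os.path.join(target_dir, filename)
    with open(path, "wb") as image_file:
        image_file.write(data)
    return path
```

Inside lambda_handler you would call this with the incoming event and your EFS directory, then hand the returned path to the rest of your code.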

15. Here we can see that the image is decoded and stored under the directory “/efs/demo/workplace”. Using this image.jpg, we can now run image processing with deep learning models and return the outputs. So let’s modify our code.

16. Great, I have now added deep learning code that processes the input image and detects the disease in the given chest X-ray, so our model will return the class label of its findings.

17. Now let’s invoke our API endpoint with the updated code and see if we get back the disease class label.

And just as we expected, we can see the label predicted by our model, which in our case is Cardiomegaly.
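For reference, returning the label from the function can be sketched roughly like this. The function predict_label is a hypothetical stand-in for the model code, which is not shown in this post, and the JSON key "label" is my own choice.

```python
import json


def predict_label(image_path):
    """Hypothetical stand-in for the model call; replace with real inference."""
    raise NotImplementedError("plug in your own model here")


def build_prediction_response(image_path, predictor=predict_label):
    """Run the predictor on the stored image and wrap the label for API Gateway."""
    try:
        label = predictor(image_path)
    except Exception as exc:
        # Report failures as a 500 so the caller gets a readable error.
        return {"statusCode": 500, "body": json.dumps({"error": str(exc)})}
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"label": label}),
    }
```

Passing the predictor in as an argument keeps the response logic easy to try out locally without loading the model.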

And with that, we have our deep learning model running in serverless mode.

We are not limited to sending back only the output class; we can do a lot more.

For instance, at DeepScopy, we provide API endpoints that let our clients use most of our models, and in return we send them a detailed report based on the input image.

Here the client invokes our endpoint using an API key (unique to every client), and in return we provide a URL to the report PDF, stored in one of our S3 buckets, which can be downloaded.

Great, we are finally at the end of our three-post series. We used a lot of services under the AWS hood to make it happen, and this is just the start: there are tons of customizations available.

Now that we have seen how to deploy our deep learning models end to end in this series, it is good to know the limitations of each service.

For AWS Lambda: the maximum execution time is 15 minutes, the maximum RAM available is 3 GB, and the maximum concurrency is 1,000, shared across all your functions in a region (even if you have, say, 100 Lambda functions). Concurrency is the number of requests that your function is serving at any given time. When your function is invoked, Lambda allocates an instance of it to process the event. When the function code finishes running, it can handle another request. If the function is invoked again while a request is still being processed, another instance is allocated, which increases the function’s concurrency.

These are the limits as of writing this article. The concurrency figure is the default quota, which can be raised on request.
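A quick way to reason about concurrency is the usual estimate of requests per second multiplied by the average duration of a request. The sketch below uses that estimate; the traffic numbers in the usage note are made up for illustration, and the default limit of 1,000 is the figure quoted above.

```python
def estimated_concurrency(requests_per_second, avg_duration_seconds):
    """Estimate how many function instances are busy at the same time."""
    return requests_per_second * avg_duration_seconds


def fits_within_limit(requests_per_second, avg_duration_seconds, limit=1000):
    """Check the estimate against a concurrency limit (default: 1,000)."""
    return estimated_concurrency(requests_per_second, avg_duration_seconds) <= limit
```

For example, 50 requests per second that each take 4 seconds keep about 200 instances busy, which fits comfortably, while 300 requests per second at the same duration would need about 1,200 and would be throttled.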

For API Gateway (REST API): the maximum timeout is only 29 seconds (a hard limit at the time of writing). So when the invoke URL is called, you need to send the response back within 29 seconds; otherwise, the connection is terminated and a timeout error is returned.

So make use of Python’s multithreading module if your program is I/O bound, or the multiprocessing module if it is CPU bound, since at the maximum memory setting Lambda allocates up to 2 CPUs to the function.
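As a minimal sketch of the I/O-bound case, a thread pool from the standard library lets several waits overlap. The function fetch is a hypothetical task in which a short sleep stands in for a network or disk wait.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(item):
    """Hypothetical I/O-bound task; the sleep stands in for a network or disk wait."""
    time.sleep(0.1)
    return item * 2


def run_in_threads(task, items, max_workers=4):
    """Run task over items in a thread pool, keeping results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(task, items))
```

For CPU-bound work, threads will not help much because of Python’s global interpreter lock, so separate processes are the usual choice. As far as I know, multiprocessing.Pool and multiprocessing.Queue do not work inside Lambda, so multiprocessing.Process with Pipe is the pattern to reach for there.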

Otherwise, process the request asynchronously. In our case, you can accept the image from the user, send a response back as soon as the upload succeeds, and, once the processing is done, mail the outputs to the client securely via AWS SES.
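One way to sketch this pattern is to have the API-facing function hand the work to a second Lambda function through an asynchronous invocation and answer right away with 202 Accepted. Everything named here is an assumption for illustration: the worker function, the payload key image_path, and the idea that the image was already saved to shared storage. The worker would then run the model and send the mail.

```python
import json


def dispatch_processing(image_path, worker_function_name, lambda_client=None):
    """Queue the image for background processing and answer the caller at once."""
    if lambda_client is None:
        import boto3  # imported lazily; boto3 ships with the Lambda Python runtime

        lambda_client = boto3.client("lambda")
    # InvocationType="Event" makes the call asynchronous: it returns as soon as
    # the request is queued, without waiting for the worker to finish.
    lambda_client.invoke(
        FunctionName=worker_function_name,
        InvocationType="Event",
        Payload=json.dumps({"image_path": image_path}).encode("utf-8"),
    )
    return {"statusCode": 202, "body": json.dumps({"status": "accepted"})}
```

Passing only the file path keeps the payload small, which matters because asynchronous invocations have a much tighter payload size limit than synchronous ones.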

It is best practice to check the pricing of AWS API Gateway: https://aws.amazon.com/api-gateway/pricing/

I will be posting more articles on machine learning and deep learning, and on their intersection with AWS services.

Until then, see you next time.