Computer Vision

Overview

Computer Vision enables you to find meaning in visual content! Analyze images for scenes, objects, faces, and other content. Choose a default model off the shelf, or create your own custom classifier. Develop smart applications that analyze the visual content of images or video frames to understand what is happening in a scene.

Beginner's Guide

Computer vision is not just a way to convert pictures into pixels, and a machine cannot make sense of a picture from raw pixel values alone. It is the ability of a machine to step back and interpret the scene that those pixels represent, which is much harder than it might seem.

For instance, when we see a picture of a model wearing a dress, we automatically identify which part of the body we’re looking at and from which angle. We can figure out the lighting conditions. We may even be able to judge the color and texture of the clothes based on shadows, highlights, and color temperature.

Quickstart

The following services/APIs are provided by Computer Vision as part of Release 1.0.0.

    • Face Detection: detects faces in an image and returns the result as an image or a JSON response

    • Object Detection: detects objects in an image and returns the result as an image or a JSON response

    • Image Classification: takes an image as input and returns a JSON response

APIs

All URIs below are relative to https://prod-kong.dltk.ai

Face Detection Image    POST  /vision/face-detection/image
Face Detection JSON     POST  /vision/face-detection/json
Object Detection        POST  /computer_vision/object_detection
Image Classification    POST  /computer_vision/image_classification
Job Status              GET   /computer_vision/task?task_id={id}

Face Detection Image

Description

This API detects multiple faces in an image and returns the response as an image.

URI
POST /vision/face-detection/image

Headers
ApiKey    Your App’s API Key
Attributes
file      MultiPart File (Form Data)

Request Example:

Response:

Face Detection JSON

Description

This API detects multiple faces in an image and returns the response as JSON.
URI
POST /vision/face-detection/json
Headers
ApiKey    Your App’s API Key
Attributes
file      MultiPart File (Form Data)

Request:
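As a rough illustration of the request described above, the sketch below builds (but does not send) a multipart POST using only the Python standard library. The base URL, path, ApiKey header, and file field come from this page; the helper name, the image/jpeg part type, and the file name are assumptions made for the example.

```python
import uuid
import urllib.request

BASE_URL = "https://prod-kong.dltk.ai"  # base URL stated on this page


def build_face_detection_request(image_bytes, filename, api_key):
    """Build (without sending) a multipart POST for /vision/face-detection/json.

    Hypothetical helper for illustration; it is not part of the DLTK SDK.
    """
    boundary = uuid.uuid4().hex
    # A single form-data part named "file", as listed under Attributes.
    # The image/jpeg part type is an assumption for this sketch.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: image/jpeg\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/vision/face-detection/json",
        data=head + image_bytes + tail,
        method="POST",
        headers={
            "ApiKey": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Sending is left commented out so the sketch has no network side effects:
# with open("faces.jpg", "rb") as f:
#     req = build_face_detection_request(f.read(), "faces.jpg", "Your API Key")
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect the headers and body before any call is made.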

Response:

[
 {
   "x": 57,
   "y": 43,
   "width": 81,
   "height": 81,
   "ratio": 0
 }
]
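A response like the one above could be turned into corner coordinates for drawing or cropping. This is a minimal sketch that assumes x and y are the top-left corner of each face in pixels, which this page does not state explicitly; the function name is hypothetical.

```python
import json


def face_boxes(response_text):
    """Convert each {x, y, width, height} entry to (left, top, right, bottom).

    Assumes x and y mark the top-left corner in pixels (not confirmed here).
    """
    faces = json.loads(response_text)
    return [
        (f["x"], f["y"], f["x"] + f["width"], f["y"] + f["height"])
        for f in faces
    ]


sample = '[{"x": 57, "y": 43, "width": 81, "height": 81, "ratio": 0}]'
print(face_boxes(sample))  # [(57, 43, 138, 124)]
```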

Object Detection

Description

This API detects the objects present in an image and returns their names as an image or in JSON format.

Create Object Detection Job

URI
POST /computer_vision/object_detection/

Headers
ApiKey    Your App’s API Key
Attributes
base64_img or image_url                       base64 string for an image or image url
input_method                                  base64_img or image_url
tasks                                         {"object_detection": true}
configs.output_types                          ["json", "image"]
configs.object_detection_config.tensorflow    True or False
configs.object_detection_config.azure         True or False
Request Example:
{
    "image_url": "image url",
    "input_method": "image_url",
    "tasks": {"object_detection": true},
    "configs": {
        "output_types": ["json", "image"],
        "object_detection_config": {
            "tensorflow": true,
            "azure": true
        }
    }
}
Response:
{
    "job_id": "c30xxxx9-xxxa-xxxxa-xxxa-1dxxxxxxxx654"
}
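When the image is a local file rather than a URL, the same request body can be assembled with the base64_img input method. The sketch below shows one way to do that; the helper name is hypothetical, and it assumes the service expects a plain base64 string without a data-URI prefix, which this page does not specify.

```python
import base64
import json


def build_object_detection_payload(image_bytes, tensorflow=True, azure=False):
    """Assemble the job-creation body using the base64_img input method.

    Field names follow the Attributes list above; sending a plain base64
    string (no "data:image/..." prefix) is an assumption of this sketch.
    """
    return {
        "base64_img": base64.b64encode(image_bytes).decode("ascii"),
        "input_method": "base64_img",
        "tasks": {"object_detection": True},
        "configs": {
            "output_types": ["json", "image"],
            "object_detection_config": {"tensorflow": tensorflow, "azure": azure},
        },
    }


# Three placeholder bytes stand in for real image data.
payload = build_object_detection_payload(b"\xff\xd8\xff")
body = json.dumps(payload)  # JSON text to send as the POST body
print(payload["input_method"])  # base64_img
```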

Check Object Detection Job Status and Get Output

URI
GET /computer_vision/task?task_id={id}

Headers
ApiKey    Your App’s API Key
Response:
{
    "task_status": "SUCCESS",
    "task_id": "c30xxxx9-xxxa-xxxxa-xxxa-1dxxxxxxxx654",
    "output": {
        "azure_detected_objects": [],
        "tensorflow_detected_objects": [
            {
                "object_name": "car",
                "confidence": 0.8257747888565063,
                "bbox": {
                    "x1": 14,
                    "y1": 23,
                    "x2": 501,
                    "y2": 257
                }
            }
        ],
        "tf_output_image": "base64 string of image",
        "azure_output_image": "base64 string of image"
    }
}
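Because the job is created first and its result fetched later, a client typically polls the status endpoint until the task finishes. The sketch below shows that loop with the HTTP call injected as a callable, so it can be tried without network access. Only the "SUCCESS" value appears on this page; the "PENDING" value, the function name, and the retry limits are placeholders chosen for the example.

```python
import time


def wait_for_task(fetch_status, task_id, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Poll until task_status is "SUCCESS", then return the "output" object.

    fetch_status is any callable that takes a task id and returns the parsed
    JSON body of GET /computer_vision/task?task_id={id}.
    """
    for _ in range(max_attempts):
        result = fetch_status(task_id)
        if result.get("task_status") == "SUCCESS":
            return result.get("output", {})
        sleep(interval)
    raise TimeoutError(f"task {task_id} not finished after {max_attempts} polls")


# Demo with canned responses instead of real HTTP calls.
# "PENDING" is a placeholder; this page only shows the "SUCCESS" value.
canned = iter([
    {"task_status": "PENDING"},
    {"task_status": "SUCCESS", "output": {"tensorflow_detected_objects": []}},
])
output = wait_for_task(lambda _id: next(canned), "demo-id", sleep=lambda s: None)
print(output)  # {'tensorflow_detected_objects': []}
```

Capping the number of attempts keeps a stuck job from blocking the caller indefinitely.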

Image Classification

Description

This API identifies what an image represents and returns the predicted class names in JSON format.

Create Image Classification Job

URI
POST /computer_vision/image_classification

Headers
ApiKey    Your App’s API Key
Attributes
base64_img or image_url                         base64 string for an image or image url
input_method                                    base64_img or image_url
tasks                                           {"image_classification": true}
configs.output_types                            ["json"]
configs.img_classification_config.top_n         Number of top predictions to return, e.g. 3
configs.img_classification_config.tensorflow    True or False
configs.img_classification_config.ibm           True or False
configs.img_classification_config.azure         True or False
Request Example:
{
    "base64_img": "base64 string of image",
    "input_method": "base64_img",
    "tasks": {"image_classification": true},
    "configs": {
        "output_types": ["json"],
        "img_classification_config": {
            "top_n": 2,
            "tensorflow": true,
            "ibm": true,
            "azure": true
        }
    }
}
Response:
{
    "job_id": "c30xxxx9-xxxa-xxxxa-xxxa-1dxxxxxxxx654"
}

Check Image Classification Job Status and Get Output

URI
GET /computer_vision/task?task_id={id}

Headers
ApiKey    Your App’s API Key
Response:
{
    "task_status": "SUCCESS",
    "task_id": "c30xxxx9-xxxa-xxxxa-xxxa-1dxxxxxxxx654",
    "output": {
        "ibm_predicted_classes": [
            {
                "class": "canopy",
                "confidence": 0.889
            },
            {
                "class": "shelter",
                "confidence": 0.889
            }
        ],
        "azure_predicted_classes": [
            {
                "class": "umbrella",
                "confidence": 1.0
            },
            {
                "class": "accessory",
                "confidence": 1.0
            }
        ],
        "tensorflow_predicted_classes": [
            {
                "class": "umbrella",
                "confidence": "0.9987213"
            },
            {
                "class": "parachute",
                "confidence": "0.0008525266"
            }
        ]
    }
}
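In the sample above, the tensorflow confidences arrive as strings while the ibm and azure ones are numbers, so it can help to convert them before comparing. The sketch below picks the highest-confidence class per provider; the function name is hypothetical, and the key pattern is taken from the sample response rather than from a formal schema.

```python
def top_class_per_provider(output):
    """Return {provider: (class, confidence)} for each *_predicted_classes list.

    Confidences are passed through float() because the sample response mixes
    numbers and numeric strings.
    """
    suffix = "_predicted_classes"
    best = {}
    for key, predictions in output.items():
        if not key.endswith(suffix) or not predictions:
            continue
        top = max(predictions, key=lambda p: float(p["confidence"]))
        best[key[: -len(suffix)]] = (top["class"], float(top["confidence"]))
    return best


sample_output = {
    "ibm_predicted_classes": [
        {"class": "canopy", "confidence": 0.889},
        {"class": "shelter", "confidence": 0.889},
    ],
    "azure_predicted_classes": [
        {"class": "umbrella", "confidence": 1.0},
        {"class": "accessory", "confidence": 1.0},
    ],
    "tensorflow_predicted_classes": [
        {"class": "umbrella", "confidence": "0.9987213"},
        {"class": "parachute", "confidence": "0.0008525266"},
    ],
}
print(top_class_per_provider(sample_output))
```

On ties, max() keeps the first entry, so the provider's own ordering decides.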

SDK

Installation

DLTK requires a Python version greater than 3.5. Install the DLTK SDK with the following command:
pip install dltk_ai

Creating Client

Create a DLTK client to call the different services.
import dltk_ai
client = dltk_ai.DltkAiClient('Your API Key')
To use these services, register on the DLTK website and create a project, then copy your API key and pass it to the client.

Face Detection

It detects faces and their locations in a given image.
"""
face_analytics(self, image_url=None, features=None, image_path=None, dlib=False, opencv=True, azure=False, mtcnn=False, output_types=["json"])
Args:
    output_types (list): Type of output requested by client: "json", "image"
    image_url: Image URL
    image_path: Local Image Path
    features (list): Type of features requested by client
    dlib: if True, uses dlib for face analytics
    opencv: if True, uses opencv for face analytics
    azure: if True, returns azure results of face analytics on given image
    mtcnn: if True, uses mtcnn for face analytics
"""
image_link = "https://upload.wikimedia.org/wikipedia/commons/thumb/0/01/Unidentified_People_1975_Hammond_Slides.jpg/1600px-Unidentified_People_1975_Hammond_Slides.jpg"

response = client.face_analytics(features=None, image_url=image_link, dlib=True, opencv=True, azure=True, mtcnn=True, output_types=["json", "image"])

Response

{
   "output":{
      "azure":{
         "base64_img":"base64 string",
         "json":{
            "face_locations":[
               {
                  "h":154,
                  "w":154,
                  "x":444,
                  "y":124
               },
               {
                  "h":147,
                  "w":147,
                  "x":818,
                  "y":123
               }
            ]
         }
      },
      "dlib":{
         "base64_img":"base64 string",
         "json":{
            "face_landmarks":[
               {
                  "left_eye":[
                     485,
                     171
                  ],
                  "left_eye_left_corner":[
                     473,
                     172
                  ],
                  "left_eye_right_corner":[
                     498,
                     171
                  ],
                  "mouth":[
                     524,
                     220
                  ],
                  "right_eye":[
                     548,
                     162
                  ],
                  "right_eye_left_corner":[
                     536,
                     165
                  ],
                  "right_eye_right_corner":[
                     560,
                     160
                  ]
               }
            ],
            "face_locations":[
               {
                  "h":155,
                  "w":155,
                  "x":442,
                  "y":133
               }
            ]
         }
      },
      "mtcnn":{
         "base64_img":"base64 string",
         "json":{
            "face_landmarks":[
               
            ],
            "face_locations":[
               
            ]
         }
      },
      "opencv":{
         "base64_img":"base64 string",
         "json":{
            "face_locations":[
               
            ]
         }
      }
   },
   "task_id":"817738ac-cd02-4c4f-a5c3-283f6b82ae4a",
   "task_status":"SUCCESS"
}
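Since each backend reports its own list of face locations, a quick way to compare them is to count the faces each one found. The sketch below assumes the response has already been parsed into a Python dict shaped like the sample above; the function name is hypothetical, and the example input is a trimmed copy of that sample.

```python
def faces_per_backend(response):
    """Count the face_locations entries reported by each backend."""
    return {
        backend: len(result.get("json", {}).get("face_locations", []))
        for backend, result in response.get("output", {}).items()
    }


# Trimmed copy of the sample response (base64 images and landmarks omitted).
sample_response = {
    "output": {
        "azure": {"json": {"face_locations": [
            {"h": 154, "w": 154, "x": 444, "y": 124},
            {"h": 147, "w": 147, "x": 818, "y": 123},
        ]}},
        "dlib": {"json": {"face_locations": [
            {"h": 155, "w": 155, "x": 442, "y": 133},
        ]}},
        "mtcnn": {"json": {"face_landmarks": [], "face_locations": []}},
        "opencv": {"json": {"face_locations": []}},
    },
    "task_status": "SUCCESS",
}
print(faces_per_backend(sample_response))
# {'azure': 2, 'dlib': 1, 'mtcnn': 0, 'opencv': 0}
```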

Object Detection

It detects objects in a given image and marks them.
"""
object_detection(image_url=None, image_path=None, tensorflow=True, azure=False, output_types=["json"])
Args:
    output_types (list): Type of output requested by client: "json", "image"
    image_url: Image URL
    image_path: Local Image Path
    tensorflow: if True, uses tensorflow for object detection
    azure: if True, returns azure results of object detection on given image
"""

#Using local image file
file_path = "../examples/object_detection_sample_1.jpg"

response = client.object_detection(image_path=file_path, tensorflow=True, output_types=["json", "image"])
print(response)

#Using image url
image_url = "https://static2.stuff.co.nz/1462405802/409/14655409.jpg"

response = client.object_detection(image_url=image_url, tensorflow=True, output_types=["json", "image"])
print(response)

Response

{
   "task_status":"SUCCESS",
   "task_id":"c30xxxx9-xxxa-xxxxa-xxxa-1dxxxxxxxx654",
   "output":{
      "tensorflow_detected_objects":[
         {
            "object_name":"car",
            "confidence":0.7981252670288086,
            "bbox":{
               "x1":116,
               "y1":179,
               "x2":196,
               "y2":243
            }
         },
         {
            "object_name":"car",
            "confidence":0.672417938709259,
            "bbox":{
               "x1":34,
               "y1":184,
               "x2":93,
               "y2":231
            }
         },
         {
            "object_name":"truck",
            "confidence":0.6478196382522583,
            "bbox":{
               "x1":436,
               "y1":118,
               "x2":498,
               "y2":179
            }
         },
         {
            "object_name":"car",
            "confidence":0.5425207018852234,
            "bbox":{
               "x1":409,
               "y1":157,
               "x2":437,
               "y2":177
            }
         }
      ],
      "tf_output_image":"base64 string"
   }
}
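Detections with low confidence are often filtered out before further use. The sketch below keeps only detections at or above a threshold and adds each box's pixel area. It assumes the response is a Python dict shaped like the sample above and that (x1, y1) and (x2, y2) are opposite corners of the box; the function name and the 0.6 threshold are arbitrary choices for the example.

```python
def confident_objects(response, min_confidence=0.6):
    """Return (name, confidence, bbox_area) for detections above a threshold.

    Assumes (x1, y1) and (x2, y2) are opposite corners in pixels.
    """
    detections = response.get("output", {}).get("tensorflow_detected_objects", [])
    kept = []
    for d in detections:
        if d["confidence"] >= min_confidence:
            b = d["bbox"]
            area = (b["x2"] - b["x1"]) * (b["y2"] - b["y1"])
            kept.append((d["object_name"], round(d["confidence"], 3), area))
    return kept


# Two of the detections from the sample response above.
sample_detection = {
    "output": {
        "tensorflow_detected_objects": [
            {"object_name": "car", "confidence": 0.7981252670288086,
             "bbox": {"x1": 116, "y1": 179, "x2": 196, "y2": 243}},
            {"object_name": "car", "confidence": 0.5425207018852234,
             "bbox": {"x1": 409, "y1": 157, "x2": 437, "y2": 177}},
        ]
    }
}
print(confident_objects(sample_detection))  # [('car', 0.798, 5120)]
```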

Image Classification

It classifies the objects present in the image and returns their names.
"""
image_classification(image_url=None, image_path=None, top_n=3, tensorflow=True, azure=False, ibm=False, output_types=["json"])

Args:
    top_n: number of top predictions to return
    output_types (list): Type of output requested by client: "json", "image"
    image_url: Image URL
    image_path: Local Image Path
    tensorflow: if True, uses tensorflow for image classification
    azure: if True, returns azure results of image classification on given image
    ibm: if True, returns ibm results of image classification on given image
"""

file_path = "../examples/data/image/image_classification_sample_1.jpg"
image_url = "https://static2.stuff.co.nz/1462405802/409/14655409.jpg"


# Note: Image classification: Using local image file
response = client.image_classification(image_path=file_path)
print(response)

# Note: Image classification: Using image url
response = client.image_classification(image_url=image_url)
print(response)

# Note: top_n=4 predictions
response = client.image_classification(image_path=file_path, top_n=4)
print(response)

# Note: predictions using ibm & azure
response = client.image_classification(image_path=file_path, azure=True, ibm=True)
print(response)

Response

{
   "task_status":"SUCCESS",
   "task_id":"c30xxxx9-xxxa-xxxxa-xxxa-1dxxxxxxxx654",
   "output":{
      "ibm_predicted_classes":[
         {
            "class":"gray color",
            "confidence":0.988
         },
         {
            "class":"road",
            "confidence":0.787
         },
         {
            "class":"arterial road",
            "confidence":0.666
         }
      ],
      "azure_predicted_classes":[
         {
            "class":"tree",
            "confidence":0.999996542930603
         },
         {
            "class":"outdoor",
            "confidence":0.9997731447219849
         },
         {
            "class":"road",
            "confidence":0.9962339401245117
         }
      ],
      "tensorflow_predicted_classes":[
         {
            "class":"jeep",
            "confidence":"0.11386044"
         },
         {
            "class":"recreational_vehicle",
            "confidence":"0.0992422"
         },
         {
            "class":"seashore",
            "confidence":"0.096477225"
         }
      ]
   }
}
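When several providers are enabled, one simple cross-check is to look for class names that more than one of them returned. The sketch below does this by exact string match on the output object, assuming it is already a Python dict; the function name is hypothetical, and providers may use different label vocabularies, so an empty result does not mean they disagree.

```python
from collections import defaultdict


def shared_classes(output):
    """Map each class name reported by 2+ providers to those providers."""
    suffix = "_predicted_classes"
    providers_by_class = defaultdict(set)
    for key, predictions in output.items():
        if key.endswith(suffix):
            for p in predictions:
                providers_by_class[p["class"]].add(key[: -len(suffix)])
    return {
        name: sorted(providers)
        for name, providers in providers_by_class.items()
        if len(providers) > 1
    }


# Class names from the sample response above (confidences omitted).
sample_classes = {
    "ibm_predicted_classes": [
        {"class": "gray color"}, {"class": "road"}, {"class": "arterial road"},
    ],
    "azure_predicted_classes": [
        {"class": "tree"}, {"class": "outdoor"}, {"class": "road"},
    ],
    "tensorflow_predicted_classes": [
        {"class": "jeep"}, {"class": "recreational_vehicle"}, {"class": "seashore"},
    ],
}
print(shared_classes(sample_classes))  # {'road': ['azure', 'ibm']}
```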

Release Notes

The following are the release notes for Release 1.0.0.

  • Face Detection gives an accuracy of 93.5%.

  • Face Detection can be used to detect multiple faces at once.

  • Eye Detection can be used to detect multiple eyes at once.

  • Smile Detection can be used to detect multiple smiles at once.

  • Object Detection can be used to detect up to 100 classes of objects.

  • Image Classification can be used to classify images into up to 100 classes of objects.

  • License Plate Detection can be used to detect license plates.
