Better Brain,
Better Chain.


ABOUT ailia

ailia’s features

Object detection, image classification, feature extraction.
Use models trained in the cloud for your embedded applications!
Get high speed deep learning inference!

ailia is a deep learning middleware specialized in inference on the edge. It integrates easily into your application and runs inference while making the best use of the GPU when available. We support our product end to end, from a clean and consistent API down to the optimized low-level layers, all developed in-house as a fully proprietary solution.


Use models trained in the cloud for your embedded applications.

  • object detection
  • image classification
  • feature extraction

Using models trained in the cloud, you can easily implement your image recognition applications.
You no longer have to write the pre- and post-processing yourself: ailia provides it in a utility class. Furthermore, you can use validated models that are publicly available on the Internet or provided by your business partners.

Catalogue of trained models


Unity plugin included.

A plugin for Unity is included. As webcam input is easily accessible inside Unity, you can take advantage of ailia’s C# API for your image recognition applications.


Supports weights compression.

Includes a proprietary weights compression method. Because the weights are compressed before being sent to the edge device, you can save up to 1/3 of the transmission time and storage.


Multiplatform high-speed GPU inference.

Perform fast inference using the GPU on various platforms.
High-speed inference is possible without depending on a particular hardware vendor.


Demo movie

Object detection with YOLO

Using a trained YOLO model, you can detect the positions of people and cars.
You can also load your own weights, for example weights trained with Darknet.
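
As an illustration of the kind of post-processing a YOLO-style detector typically needs, the sketch below removes duplicate boxes with non-maximum suppression. It is a generic pure-Python example and not ailia's API: the `Detection` record, the function names, and the 0.5 overlap threshold are assumptions made for illustration only.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    # Hypothetical detection record: box corners in pixels, class label, confidence.
    x1: float
    y1: float
    x2: float
    y2: float
    label: str
    score: float


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    inter_h = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = inter_w * inter_h
    area_a = (a.x2 - a.x1) * (a.y2 - a.y1)
    area_b = (b.x2 - b.x1) * (b.y2 - b.y1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(
    dets: List[Detection], iou_threshold: float = 0.5
) -> List[Detection]:
    """Keep the highest-scoring box among same-label boxes that overlap heavily."""
    kept: List[Detection] = []
    # Visit boxes from most to least confident.
    for det in sorted(dets, key=lambda d: d.score, reverse=True):
        # Keep the box unless it overlaps an already kept box of the same label.
        if all(det.label != k.label or iou(det, k) <= iou_threshold for k in kept):
            kept.append(det)
    return kept


if __name__ == "__main__":
    raw = [
        Detection(10, 10, 110, 210, "person", 0.90),
        Detection(12, 14, 108, 205, "person", 0.75),  # near-duplicate of the first box
        Detection(200, 80, 400, 200, "car", 0.80),
    ]
    for det in non_max_suppression(raw):
        print(det.label, det.score)
```

With the sample boxes, the second "person" box overlaps the first almost entirely and is dropped, so one person and one car remain.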

Estimation of face characteristics with Gender/Age/EmotionNet.

After detecting faces with YOLO Face, estimate gender, age, or emotion using the corresponding network.

Feature extraction with VGG16

With VGG16, you can extract features from your images.
Using their distance in feature space, you can compute the similarity between images,
and thus easily build a search-by-image engine.
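
The idea of ranking images by their distance in feature space can be sketched as follows. This is a minimal pure-Python illustration, not ailia's API: the function names are hypothetical, and the short hand-written vectors stand in for the much longer feature vectors a network such as VGG16 would produce.

```python
import math
from typing import Dict, List, Sequence, Tuple


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search_by_image(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k indexed images closest to the query, nearest first."""
    ranked = sorted(
        ((name, euclidean_distance(query, feat)) for name, feat in index.items()),
        key=lambda pair: pair[1],
    )
    return ranked[:top_k]


if __name__ == "__main__":
    # Toy 4-dimensional features; file names are placeholders.
    index = {
        "cat_1.jpg": [1.0, 0.0, 0.0, 0.0],
        "cat_2.jpg": [0.9, 0.1, 0.0, 0.0],
        "car_1.jpg": [0.0, 0.0, 1.0, 1.0],
    }
    for name, dist in search_by_image([1.0, 0.0, 0.0, 0.0], index, top_k=2):
        print(name, round(dist, 3))
```

A smaller distance means more similar images. Cosine similarity is a common alternative when the magnitude of the feature vectors should not matter.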

Pose estimation with Acculus Pose

Supports the pose estimation model provided by Acculus Inc.
Achieve fast pose estimation with an algorithm different from OpenPose.


Customizable recognition.

We include a Python library that converts Keras/Darknet models to a format readable by ailia, so you can easily write your own conversion scripts.


import keras2caffe


python yolo.cfg yolo.weights yolo.prototxt yolo.caffemodel 

The layers/networks below are currently supported. We can also provide help to add new ones.

Supported layers

Keras                     Caffe            ailia
InputLayer                data             data
Conv2D                    Convolution      Convolution
SeparableConv2D           Convolution      Convolution
DepthwiseConv2D           Convolution      Convolution
BatchNormalization        BatchNorm        BatchNorm
Dense                     InnerProduct     InnerProduct
Activation                ReLU, Softmax    ReLU, Softmax
Cropping2D                Crop             Crop
Concatenate, Merge        Concat           Concat
Add                       Eltwise          Eltwise
Flatten                   Eltwise          Eltwise
MaxPooling2D              Pooling          Pooling
AveragePooling2D          Pooling          Pooling
Dropout                   Dropout          Dropout
GlobalAveragePooling2D    Pooling          Pooling
ZeroPadding2D             Pooling          Pooling
Not Implemented           Pooling          Pooling
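
Before converting a model, it can help to check that every layer type it uses appears in the table above. The sketch below is a hypothetical helper, not part of ailia or keras2caffe: the dictionary is transcribed from the table as listed (the "Not Implemented" row is left out), and the function name is an assumption.

```python
from typing import Iterable, List

# Keras layer type -> Caffe/ailia layer type, transcribed from the table as listed.
KERAS_TO_CAFFE = {
    "InputLayer": "data",
    "Conv2D": "Convolution",
    "SeparableConv2D": "Convolution",
    "DepthwiseConv2D": "Convolution",
    "BatchNormalization": "BatchNorm",
    "Dense": "InnerProduct",
    "Activation": "ReLU, Softmax",
    "Cropping2D": "Crop",
    "Concatenate": "Concat",
    "Merge": "Concat",
    "Add": "Eltwise",
    "Flatten": "Eltwise",
    "MaxPooling2D": "Pooling",
    "AveragePooling2D": "Pooling",
    "Dropout": "Dropout",
    "GlobalAveragePooling2D": "Pooling",
    "ZeroPadding2D": "Pooling",
}


def find_unsupported_layers(layer_types: Iterable[str]) -> List[str]:
    """Return the layer types that do not appear in the table, in first-seen order."""
    missing: List[str] = []
    for layer_type in layer_types:
        if layer_type not in KERAS_TO_CAFFE and layer_type not in missing:
            missing.append(layer_type)
    return missing


if __name__ == "__main__":
    # Layer type names as they might be read from a model definition.
    layers = ["InputLayer", "Conv2D", "MaxPooling2D", "LSTM", "Dense"]
    print(find_unsupported_layers(layers))
```

With a Keras model object, the list of layer type names could be built with `[type(layer).__name__ for layer in model.layers]`.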

Supported networks

AlexNet, InceptionV3, XceptionV1, VGG16, SqueezeNet, MobileNet, LeNet, YOLO, ResNet


Usage examples

Interactive signage using Unity.

By capturing the position and pose of the person in front of a digital sign, a 3D character can respond accordingly, creating an enjoyable interaction.

Person detection for a reception system.

By detecting an incoming visitor through image recognition, the system can display information about the visitor's destination, etc.

People counter for physical store customer analytics.

Based on image recognition, estimate the number, gender, and age of visiting customers, and use this data for your marketing analysis.
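
Once gender and age have been estimated for each visitor, the counts used for analysis are a simple aggregation. The sketch below assumes a hypothetical `Visitor` record and 10-year age brackets. It does not run any recognition itself and is not ailia's API.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple


class Visitor(NamedTuple):
    # Hypothetical per-visitor estimate produced by the recognition step.
    gender: str
    age: int


def age_bracket(age: int, width: int = 10) -> str:
    """Map an age to a bracket label such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def summarize(visitors: Iterable[Visitor]) -> Dict[str, object]:
    """Count visitors in total, by gender, and by age bracket."""
    records = list(visitors)
    return {
        "total": len(records),
        "by_gender": Counter(v.gender for v in records),
        "by_age": Counter(age_bracket(v.age) for v in records),
    }


if __name__ == "__main__":
    today = [Visitor("female", 34), Visitor("male", 27), Visitor("female", 38)]
    summary = summarize(today)
    print(summary["total"], dict(summary["by_gender"]), dict(summary["by_age"]))
```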


Specifications and included items

Items                   Specifications
Input format            prototxt, caffemodel, ONNX (coming soon)
Supported OS            Windows, Mac, iOS, Android
Library format          Static, Dynamic
API                     C++, C#, Python (coming soon)
Accelerator             Intel MKL, Accelerate.framework
GPGPU                   MetalPerformanceShaders, RenderScript, C++ AMP
SIMD instruction set    SSE2, AVX, NEON


"Unity" is a trademark or registered trademark of Unity Technologies in Japan and other countries.
Windows is a registered trademark of Microsoft Corporation in the United States and other countries.
Android is a trademark of Google Inc.
Mac is a trademark of Apple Inc.
Other listed products and services are trademarks of each company.

Catalogue of trained models

Trained models verified to work with ailia


Trained models publicly available on the Internet

Use Details Algorithm Project page License Download weights
Object recognition Detect location and type (20 categories) of objects. YOLOv1 Darknet Size: 1.42GB
Facial recognition Recognize the position of the face. YOLOv1 Darknet Size: 156.6MB
Object recognition Recognize the object depicted in the image (1000 categories). High accuracy. VGG16 CC Size: 490MB
Object recognition Recognize the object depicted in the image (1000 categories). High speed. SqueezeNet MIT Size: 4.4MB
Gender estimation / Emotion estimation Estimate the gender or the emotion. miniXception MIT Size: 217.8KB
Gender estimation / Age estimation Estimate the gender or the age. High speed. AlexNet Site reference Size: 712B
Gender estimation / Age estimation Estimate the gender or the age. High accuracy. VGG Site reference Size: 681KB


Trained models provided by Axell and our partners (paid service).

Use Details Download the weights.
Pose estimation Detect the posture of people. Currently in preparation. Please contact us for details.
Detection of facial landmarks. Detect the position of a person’s facial landmarks. Currently in preparation. Please contact us for details.

For reference, some trained models are released on this page.

List of Caffe trained models.