Machine Vision or Computer Vision (II)

Second part of this analysis on emerging technology with new models.

By Osvaldo Callegari*

Artificial intelligence is now applied across a wide range of fields and in many different ways.

As part of these technology evaluations, we present case studies of products such as Microsoft Azure and its intelligence APIs, which enable voice, vision and language features.

Image analysis for insights
You can analyze images to get accurate data about an object.
Some of these methods currently face legal restrictions: the city of San Francisco, for example, has passed a law prohibiting the use of facial recognition.

That said, let's look at the features of the application libraries:

• Tag visual features
◦ Identify and tag the visual features of an image, drawing on a set of thousands of recognizable objects, living things, landscapes and actions. Tagging is not limited to the main subject, such as a person in the foreground; it also covers the setting (indoor or outdoor), furniture, tools, plants, animals, accessories, gadgets, and so on.

• Detect objects
◦ Object detection is similar to tagging, but the API also returns the bounding rectangle coordinates for each tag applied. For example, if an image contains a dog, a cat, and a person, the detection operation lists those objects along with their coordinates in the image.

• Detect brands
◦ Identify commercial brands in images or video from a database of thousands of global logos. You can use this feature, for example, to find out which brands are most popular on social media or most prevalent in media product placement.

• Classify an image
◦ Identify and classify an entire image using a category taxonomy with parent/child hereditary hierarchies.
▪ Categories can be used alone or together with the new tagging models.
▪ Currently, English is the only language supported for tagging and classifying images.

• Describe an image
◦ Generate a description of an entire image in natural language, with complete sentences. Computer Vision (CV) algorithms generate several descriptions based on the objects identified in the image. Each description is evaluated and given a confidence score. The descriptions are then returned as a list ordered from highest to lowest confidence.

• Detect faces
◦ Detect faces in an image and provide information about them. Computer Vision returns the bounding rectangle coordinates, gender, and age of each face it detects.
▪ Computer Vision provides a subset of the functionality found in the Face service (Cognitive Services), which can be used for more detailed analysis, such as facial identification and pose detection.

• Detect image types
◦ Detect the characteristics of an image, such as whether it is a line drawing or how likely it is to be clip art.

• Detect domain-specific content
◦ Use domain models to detect and identify domain-specific content in an image, such as celebrities and monuments. For example, if an image contains people, Computer Vision can use a celebrity domain model that is included with the service to determine whether people who have been detected in the image match known celebrities.

• Detect color scheme
◦ Analyze the use of color in an image. CV can determine whether an image is black and white or color, and in color images, identify dominant and accent colors.

• Generate a thumbnail
◦ Analyze the content of an image to generate an appropriate thumbnail of it. First, Computer Vision generates a high-quality thumbnail after analyzing the objects in the image to determine the area of interest. Computer Vision crops the image to fit the requirements of the area of interest. The generated thumbnail can be presented with a different aspect ratio from the original image depending on your needs.

• Get the area of interest
◦ Read the contents of an image to return the coordinates of the area of interest. This is the same function used to generate a thumbnail, but instead of cropping the image, Computer Vision returns the coordinates of the region's bounding rectangle, so the calling application can modify the original image as needed.
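
The features above return structured results: tags with confidence scores, bounding rectangles for detected objects, and descriptions ranked by confidence. As a minimal sketch, assuming a response shaped like the hypothetical dictionary below (the field names and the helper functions are assumptions for illustration, not a confirmed schema of the service), a calling application might post-process those results like this:

```python
# Hypothetical analysis result. Field names are assumptions for
# illustration; check the service reference for the real schema.
sample_response = {
    "tags": [
        {"name": "grass", "confidence": 0.99},
        {"name": "dog", "confidence": 0.97},
        {"name": "outdoor", "confidence": 0.95},
    ],
    "objects": [
        {"object": "dog", "confidence": 0.91,
         "rectangle": {"x": 40, "y": 60, "w": 200, "h": 150}},
        {"object": "person", "confidence": 0.55,
         "rectangle": {"x": 300, "y": 20, "w": 120, "h": 340}},
    ],
    "description": {
        "captions": [
            {"text": "a dog in a park", "confidence": 0.62},
            {"text": "a dog running on grass", "confidence": 0.88},
        ]
    },
}


def confident_tags(response, min_confidence=0.9):
    """Return the names of tags whose confidence is at least min_confidence."""
    return [t["name"] for t in response.get("tags", [])
            if t["confidence"] >= min_confidence]


def object_boxes(response, min_confidence=0.5):
    """Return (label, left, top, right, bottom) for each detected object."""
    boxes = []
    for obj in response.get("objects", []):
        if obj["confidence"] < min_confidence:
            continue
        r = obj["rectangle"]
        # Convert x/y/width/height into corner coordinates.
        boxes.append((obj["object"], r["x"], r["y"],
                      r["x"] + r["w"], r["y"] + r["h"]))
    return boxes


def best_caption(response):
    """Return the caption text with the highest confidence, or None."""
    captions = response.get("description", {}).get("captions", [])
    if not captions:
        return None
    return max(captions, key=lambda c: c["confidence"])["text"]


print(confident_tags(sample_response))
print(object_boxes(sample_response))
print(best_caption(sample_response))
```

Filtering on a minimum confidence is a design choice made for this sketch; the right cut-off depends on how costly a wrong tag is for the application.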

Extract text in images
You can use Computer Vision to extract text from an image in a machine-readable character sequence using optical character recognition (OCR).

If necessary, OCR corrects the rotation of the recognized text, and it provides the bounding-box coordinates of each word. OCR supports 25 languages and automatically detects the language of the recognized text.
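
To illustrate what "coordinates of each word" means in practice, here is a minimal sketch. It assumes a hypothetical result in which regions contain lines, lines contain words, and each word carries a box written as "left,top,width,height"; that layout and the function names are assumptions for illustration, not a confirmed response format.

```python
# Hypothetical OCR result; the nesting and the box format are assumptions.
sample_ocr = {
    "language": "en",
    "regions": [
        {"lines": [
            {"words": [
                {"boundingBox": "10,12,48,20", "text": "TOTAL"},
                {"boundingBox": "64,12,40,20", "text": "12.50"},
            ]},
            {"words": [
                {"boundingBox": "10,40,60,20", "text": "Thank"},
                {"boundingBox": "76,40,30,20", "text": "you"},
            ]},
        ]}
    ],
}


def parse_box(box):
    """Turn 'left,top,width,height' into a tuple of four integers."""
    left, top, width, height = (int(v) for v in box.split(","))
    return left, top, width, height


def extract_lines(ocr_result):
    """Join the words of every line into one string per line."""
    lines = []
    for region in ocr_result.get("regions", []):
        for line in region.get("lines", []):
            lines.append(" ".join(w["text"] for w in line.get("words", [])))
    return lines


def words_with_boxes(ocr_result):
    """Return (text, (left, top, width, height)) for every recognized word."""
    found = []
    for region in ocr_result.get("regions", []):
        for line in region.get("lines", []):
            for word in line.get("words", []):
                found.append((word["text"], parse_box(word["boundingBox"])))
    return found


print(extract_lines(sample_ocr))
print(words_with_boxes(sample_ocr)[0])
```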

You can also use the Read API to extract printed and handwritten text from images and text-heavy documents. The Read API uses updated models and works on a variety of objects with different surfaces and backgrounds, such as receipts, posters, business cards, letters, and whiteboards. Currently, the Read API is in preview, and English is the only supported language.

Moderation of image content
Computer Vision can detect adult and risqué content in an image and return a confidence score for each.
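
How the two confidence scores are used is up to the calling application. The sketch below shows one possible policy, assuming scores between 0.0 and 1.0; the function name and both cut-off values are hypothetical and would need tuning for a real site.

```python
def moderation_decision(adult_score, racy_score, block_at=0.8, review_at=0.5):
    """Map two confidence scores (0.0 to 1.0) to 'block', 'review' or 'allow'.

    The thresholds are illustrative assumptions, not recommended values.
    """
    top = max(adult_score, racy_score)
    if top >= block_at:
        return "block"
    if top >= review_at:
        return "review"   # send to a human moderator
    return "allow"


print(moderation_decision(0.02, 0.03))
print(moderation_decision(0.10, 0.60))
print(moderation_decision(0.95, 0.20))
```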

Using Docker Containers
Use Computer Vision containers to recognize printed and handwritten text locally, by installing a standard Docker container closer to the data.

Image requirements
Computer Vision can analyze images that meet the following requirements:
    • The image must be presented in JPEG, PNG, GIF or BMP format
    • Image file size must be less than 4 megabytes (MB)
    • Image dimensions must be larger than 50 x 50 pixels
    • For OCR, the input image size must be between 50 x 50 and 4200 x 4200 pixels.
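
These requirements can be checked before an image is sent. The following is a minimal sketch, assuming the caller already knows the file format, size and pixel dimensions; the function and constant names are hypothetical, "larger than 50 x 50" is read as strictly greater on both sides, and 1 MB is taken as 1,048,576 bytes.

```python
ALLOWED_FORMATS = {"JPEG", "PNG", "GIF", "BMP"}
MAX_FILE_BYTES = 4 * 1024 * 1024   # assumption: 1 MB = 1,048,576 bytes
MIN_SIDE = 50
OCR_MAX_SIDE = 4200


def check_image(fmt, size_bytes, width, height, for_ocr=False):
    """Return a list of requirement violations (empty if the image is OK)."""
    problems = []
    if fmt.upper() not in ALLOWED_FORMATS:
        problems.append("format must be JPEG, PNG, GIF or BMP")
    if size_bytes >= MAX_FILE_BYTES:
        problems.append("file size must be less than 4 MB")
    if for_ocr:
        # OCR range is read as inclusive: 50 x 50 up to 4200 x 4200.
        in_range = (MIN_SIDE <= width <= OCR_MAX_SIDE
                    and MIN_SIDE <= height <= OCR_MAX_SIDE)
        if not in_range:
            problems.append("OCR input must be between 50 x 50 and 4200 x 4200 pixels")
    elif width <= MIN_SIDE or height <= MIN_SIDE:
        problems.append("dimensions must be larger than 50 x 50 pixels")
    return problems


print(check_image("png", 120_000, 640, 480))
print(check_image("tiff", 5 * 1024 * 1024, 40, 40))
print(check_image("jpeg", 120_000, 5000, 100, for_ocr=True))
```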

Data security and privacy
As with all instances of Cognitive Services, developers using the Computer Vision service should be aware of Microsoft's policies on customer data. For more information, see the Cognitive Services page in the Microsoft Trust Center.

Information Engineering
Many include Computer Vision within information engineering.
Information engineering comprises the following fields:
    • Machine learning
    • Artificial intelligence
    • Control theory
    • Signal processing
    • Information theory
    • Computer vision
    • Medical imaging
    • Chemoinformatics
    • Autonomous robotics
    • Mobile robotics
    • Telecommunications

Many of the areas originate in computer science.

Figure 1. Graph of areas with their proximity.

An important part of artificial intelligence has to do with the planning or deliberation of a system that can perform mechanical actions, such as moving a robot through some environment.

This type of processing typically needs input data provided by a computer vision system, which acts as a vision sensor and provides high-level information about the environment and the robot.

Other parts that are sometimes described as belonging to artificial intelligence and used in connection with computer vision are pattern recognition and learning techniques.

Researchers at North Carolina State University have developed a new technique that improves the ability of machine vision technologies to identify and separate objects in an image, a process called segmentation.

Image processing and computer vision are important for multiple applications, from autonomous vehicles to the detection of anomalies in medical imaging.

Machine vision technologies use algorithms to segment, or delineate, the objects in an image: for example, to separate the outline of a pedestrian from the background of a busy street.

These algorithms are based on defined parameters, programmed values, to segment images. For example, if there is a change in color that crosses a specific threshold, a computer vision program will interpret it as a dividing line between two objects. And that specific threshold is one of the parameters of the algorithm.

But there is a challenge here. Even small changes in a parameter can lead to very different computer vision results. For example, if a person crossing the street enters and leaves shaded areas, that would affect the color a computer sees, and the computer can "see" the person disappearing and reappearing, or interpret the person and shadow as a single, large object like a car.
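
The sensitivity described above can be reproduced with a toy example. The sketch below is not any production algorithm: it segments a single row of grayscale intensities by placing a boundary wherever neighboring pixels differ by more than a threshold. The pixel values and the function name are made up for illustration.

```python
def segment_row(intensities, threshold):
    """Split a row of pixel intensities into segments.

    A boundary is placed between two neighbors whose absolute
    difference exceeds the threshold.
    """
    if not intensities:
        return []
    segments = [[intensities[0]]]
    for prev, cur in zip(intensities, intensities[1:]):
        if abs(cur - prev) > threshold:
            segments.append([cur])       # start a new segment
        else:
            segments[-1].append(cur)     # extend the current one
    return segments


# Made-up row: street (200), pedestrian in sunlight (120),
# the same pedestrian in shade (95), street again (200).
row = [200, 200, 120, 120, 95, 95, 200, 200]

for threshold in (20, 30, 110):
    print(threshold, len(segment_row(row, threshold)))
# 20 4
# 30 3
# 110 1
```

With a threshold of 20 the shaded half of the pedestrian becomes a separate object, with 30 the pedestrian is one object, and with 110 the pedestrian is not separated from the street at all.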

"Some algorithm parameters can perform better than others in any set of circumstances, and we wanted to know how to combine multiple parameters and algorithms to create better image segmentation using computer vision programs," says Edgar Lobatón, an assistant professor of electrical and computer engineering at NC State and lead author of an article on the work.

Lobatón and student Qian Ge developed a technique that compiles segmentation data from multiple algorithms and aggregates it, creating a new version of the image. This new image is segmented again, based on the persistence of a given segment in all the original input algorithms.
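
The general idea of aggregating several segmentations and keeping what persists can be sketched with simple voting. This is not the topological persistence method from the paper; it is a simplified stand-in, with made-up pixel values and hypothetical function names, that runs the same toy boundary detector with several thresholds and keeps only the boundaries found in most runs.

```python
def boundaries(intensities, threshold):
    """Return indices i where a boundary falls between pixel i-1 and pixel i."""
    return {i for i in range(1, len(intensities))
            if abs(intensities[i] - intensities[i - 1]) > threshold}


def consensus_boundaries(intensities, thresholds, min_fraction=0.6):
    """Keep boundaries that appear in at least min_fraction of the runs."""
    votes = {}
    for t in thresholds:
        for b in boundaries(intensities, t):
            votes[b] = votes.get(b, 0) + 1
    needed = min_fraction * len(thresholds)
    return sorted(b for b, count in votes.items() if count >= needed)


# Made-up row: street (200), pedestrian in sunlight (120),
# the same pedestrian in shade (95), street again (200).
row = [200, 200, 120, 120, 95, 95, 200, 200]
thresholds = [10, 20, 30, 40, 50]

print(consensus_boundaries(row, thresholds))
# [2, 6]
```

The sunlight/shade edge at index 4 is found by only two of the five runs, so it is dropped, while the two pedestrian/street edges persist in every run and are kept.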

"Visually, the results of this technique look better than any given algorithm on its own," says Lobatón. "However, the nature of this work does not align with existing metrics for measuring computer vision accuracy. Therefore, we need to develop a new means of assessing the accuracy of computer vision – that's a future project for us."

Lobatón points out that the new image segmentation technique can be used in real time, processing 30 frames per second. This is due, in part, to the fact that most computational steps can be executed in parallel, rather than sequentially.

The paper, "Consensus-Based Image Segmentation through Topological Persistence," will be presented July 1 at the IEEE Conference on Computer Vision and Pattern Recognition in Las Vegas, Nevada.

Source of the story
Materials provided by North Carolina State University. Note: Content may be edited for style and length.

To sum up, we can say that computer vision is the field that tries to get computers to understand image and video data at a high level.

YouTube example of CogniMem Technologies® interactions

Sef Vision Computer Vision Example

The names and trademarks mentioned belong to their respective owners. Sources consulted: Microsoft Corp., through its agency Salem Viale; Wikipedia, under its Commons license; and information from Sans.org. Sources may vary across the authors and essays consulted.

* To contact the author of this article write to [email protected]

Author: Duván Chaverra Agudelo

Editor-in-Chief at Latin Press, Inc.

Social communicator and journalist with more than 16 years of experience in the media. Passionate about technology and about this industry. [email protected]