International. The use of AI in video surveillance is usually accompanied by the terms "Machine Learning" or "Deep Learning". Telling them apart can be difficult, since both describe programming methods in which a system learns from a set of data.
The text "Understanding AI in video surveillance. Applying Human Intelligence to Computer Programs" written by Brian Carle, director of strategic alliances at Salient Systems, addresses precisely the difference between the two concepts, taking into account their application for video analytics.
The document states that, in the case of Machine Learning, the attributes of the data sought by a system are usually pre-established, or fixed, by human programmers. For example, the system can be programmed to pick out an object that is taller than it is wide, with limbs that move in a specific way, among other features, and label this object as "person".

In Machine Learning, however, programmers may fail to identify the most relevant criteria. Going back to the example, if the algorithm described is used to identify a person, a seated, motionless person may not be detected accurately.
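The limitation described above can be illustrated with a minimal sketch. It assumes a hypothetical `DetectedObject` record and an arbitrary motion threshold, neither of which comes from the Salient Systems text; it only shows how a hand-written rule can miss a case its author did not anticipate.

```python
from dataclasses import dataclass


@dataclass
class DetectedObject:
    """Hypothetical summary of one tracked object (illustrative only)."""
    width: float        # bounding-box width, in meters
    height: float       # bounding-box height, in meters
    limb_motion: float  # 0.0 = motionless, 1.0 = strong limb movement


def label_by_rules(obj: DetectedObject, min_motion: float = 0.2) -> str:
    """Hand-written rule: taller than wide, with moving limbs, means "person"."""
    if obj.height > obj.width and obj.limb_motion >= min_motion:
        return "person"
    return "unknown"


walking = DetectedObject(width=0.5, height=1.8, limb_motion=0.7)
seated = DetectedObject(width=0.9, height=0.8, limb_motion=0.0)

print(label_by_rules(walking))  # the rule fires for a walking person
print(label_by_rules(seated))   # the seated, motionless person is missed
```

The seated person fails both conditions, so the rule returns "unknown" even though a human observer would have no doubt.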
In contrast, with Deep Learning (which is considered superior), video analysis algorithms are fed a large set of data representing an object. This feeding is commonly known as training: the time spent teaching the algorithm to recognize a type of object.
Returning to the example, in this case the system receives thousands of images of people of different genders, clothing styles, ethnic origins and physical positions, taken from different angles, among other variations.

In this way the algorithm can determine which attributes are shared and which are not. In other words, Deep Learning establishes for itself how to weigh the relevance of each characteristic; in a sense, it has a selection criterion formed from that broad and deep knowledge.
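The idea of weighing characteristics from examples can be sketched on a very small scale. The code below is not deep learning: it is a single-layer logistic model trained by gradient descent on ten hand-made feature vectors, all of them hypothetical. A real deep network also learns the features themselves from raw pixels. The sketch illustrates only that the weights come from labelled examples rather than from a programmer.

```python
import math

# Hypothetical, hand-made examples (not real data). Each feature vector is
# [height/width ratio, limb motion, head-and-shoulders score];
# label 1 = person, 0 = not a person.
TRAINING_DATA = [
    ([3.6, 0.7, 0.90], 1),  # walking
    ([3.2, 0.0, 0.80], 1),  # standing still
    ([0.9, 0.0, 0.90], 1),  # seated
    ([1.0, 0.1, 0.85], 1),  # seated
    ([0.4, 0.0, 0.80], 1),  # lying down
    ([3.5, 0.0, 0.10], 0),  # pole
    ([3.0, 0.6, 0.10], 0),  # flag
    ([0.5, 0.0, 0.10], 0),  # parked car
    ([0.6, 0.8, 0.20], 0),  # dog
    ([1.0, 0.0, 0.15], 0),  # box
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Return the estimated probability that the features describe a person."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train(data, epochs=3000, learning_rate=0.2):
    """Fit the weights with plain batch gradient descent on the logistic loss."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, label in data:
            error = predict(weights, bias, features) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


weights, bias = train(TRAINING_DATA)

seated_person = [0.95, 0.0, 0.88]  # unseen example: wide, motionless, person-like
pole = [3.4, 0.0, 0.12]            # unseen example: tall, motionless, not a person

print("learned weights:", [round(w, 2) for w in weights])
print("seated person detected:", predict(weights, bias, seated_person) > 0.5)
print("pole detected as person:", predict(weights, bias, pole) > 0.5)
```

Because seated and motionless people appear in the training examples, the model does not depend on the "taller than wide" or "moving limbs" cues that a programmer might have hard-coded.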
Then, as the Salient Systems text explains, after analyzing thousands of images the algorithm can work out where the nose is located in a photo of a face, not just as an average position but in relation to the other elements, such as the eyes and mouth. "In fact, the algorithm may have identified many other features of this typology that people wouldn't think of".
Along these lines, we can conclude that for video analytics, especially for video surveillance, Deep Learning is the best alternative, since the software developers are responsible for training the system before it is used by a consumer.
Deep Learning Training Process
This is a process that requires a lot of computing power, much more than it takes to detect and classify objects afterwards. The data sets must be large, and the more varied the better. The result is a complex file that the system references to determine whether a detected object matches the classification.
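The split between costly training and cheap detection can be sketched as follows. The model contents, the JSON format and the file name `person_model.json` are assumptions made for illustration; real deep learning models hold millions of learned values in framework-specific formats.

```python
import json
import math
import os
import tempfile

# Illustrative stand-in for the output of a costly training run.
# The feature names and numbers are invented for this sketch.
trained_model = {
    "feature_names": ["aspect_ratio", "limb_motion", "head_shape"],
    "weights": [0.1, -0.3, 8.0],
    "bias": -4.0,
}


def save_model(model, path):
    """Write the trained parameters to a file (done once, by the developer)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(model, f)


def load_model(path):
    """Read the parameters back (done on the deployed system)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def matches_person(model, features, threshold=0.5):
    """Cheap check at detection time: one weighted sum and one sigmoid."""
    score = model["bias"] + sum(
        w * x for w, x in zip(model["weights"], features)
    )
    return 1.0 / (1.0 + math.exp(-score)) >= threshold


model_path = os.path.join(tempfile.gettempdir(), "person_model.json")
save_model(trained_model, model_path)

deployed = load_model(model_path)
print(matches_person(deployed, [0.95, 0.0, 0.88]))  # seated person
print(matches_person(deployed, [3.4, 0.0, 0.12]))   # pole
```

Under these assumptions the deployed system never repeats the training; it only loads the file and evaluates each detected object against it.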
Finally, the text states that: "Since the deep learning process uses the machine to determine the characteristics of objects, it has given rise to analyses that can provide a much more granular classification. For example, older approaches may be to detect a person, but deep learning-based analyses can detect whether the person is a male, female or child".
In conclusion, this technology makes it possible to detect with greater precision the characteristics associated with an individual, as well as the type of vehicle or the information on a license plate, even if the font varies.

