According to recent reports, Google is training AI to understand and replicate human behaviors such as hugging, cooking, and even fighting. With access to a vast database of human behavior, future AI systems may become increasingly human-like in how they interpret and interact with people.
Google, the parent company of YouTube, announced on October 19 the launch of a new video database called AVA. This collection of movie clips is designed to help machines better interpret human actions in everyday life. The videos look normal at first glance, mostly showing three-second scenes of people performing routine tasks like drinking or cooking. However, each clip comes with detailed annotations that specify the actions taking place, including body postures and whether individuals are interacting with others or objects.
This process is similar to how a child learns by being shown examples—like when an adult points to a dog and asks, “Is that a dog?” AVA serves as the AI equivalent of that learning experience.

When multiple people appear in a video, each individual is labeled accordingly, enabling the algorithm to recognize social norms, such as shaking hands when meeting someone. This level of detail helps the AI grasp the nuances of human interaction.
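To make the structure of those annotations concrete, the sketch below shows how one clip's labels could be represented in code, with each person in the scene tracked separately. The class names, fields, and example values are illustrative assumptions, not the published AVA format.

from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class PersonAction:
    # One labeled person within a three-second clip (hypothetical schema).
    person_id: int                          # distinguishes individuals in multi-person scenes
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2) in normalized coordinates
    actions: List[str]                      # body posture plus interactions with people or objects

@dataclass
class ClipAnnotation:
    # All labels attached to one short movie clip (hypothetical schema).
    video_id: str
    start_seconds: float
    end_seconds: float
    people: List[PersonAction]

# A clip in which two people shake hands while one of them also talks.
clip = ClipAnnotation(
    video_id="example_movie_clip",
    start_seconds=902.0,
    end_seconds=905.0,
    people=[
        PersonAction(1, (0.10, 0.20, 0.45, 0.95), ["stand", "shake hands", "talk to a person"]),
        PersonAction(2, (0.50, 0.22, 0.88, 0.96), ["stand", "shake hands"]),
    ],
)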
The AVA database is expected to assist Google in analyzing thousands of videos on YouTube. It can be used to recommend targeted ads or content based on what users are watching. The ultimate aim is to develop "social visual intelligence" in computers—enabling them to understand not just what people are doing, but also what they might do next and what goals they are trying to achieve.
In their research paper, the team described "social visual intelligence" as the ability to comprehend human activities, intentions, and objectives. The AVA database includes 57,600 labeled videos covering 80 different actions, from simple ones like walking or talking to more complex interactions. Each action has over 10,000 video tags, making it one of the most comprehensive datasets of its kind.
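As a rough illustration of how such a labeled collection might be summarized, the snippet below tallies how many labels each action receives. It assumes a simplified CSV layout with a video id, a timestamp, and an action name per row; the file name and column names are assumptions rather than the actual released format.

import csv
from collections import Counter

def count_action_labels(csv_path: str) -> Counter:
    # Tally how many labels each action receives across all clips.
    # Assumes each row carries at least "video_id", "timestamp", and "action".
    counts = Counter()
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            counts[row["action"]] += 1
    return counts

if __name__ == "__main__":
    # With roughly 80 actions and over 10,000 labels per action, the grand
    # total runs well into the hundreds of thousands of individual tags.
    totals = count_action_labels("ava_annotations.csv")
    for action, n in totals.most_common(10):
        print(f"{action}: {n}")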
However, the researchers acknowledged some limitations. They noted that some film clips may portray actions in a more exaggerated or dramatic way than real-life situations. Despite this, they emphasized that the data offers a broader range of scenarios than user-uploaded videos, such as how to care for a pet or plan a birthday party.
“We didn’t think the data was perfect,” the researchers wrote. “But it’s far more diverse than typical user-generated content.”
Additionally, the paper mentioned that the team drew on films and performers from different countries, though it did not address potential issues such as racial or gender bias in the dataset.