AI: children’s photos secretly used to train popular AI tools, report warns

  • Human Rights Watch found that photos of children were being used to train AI tools.
  • The photos span the entirety of childhood, from babies through to teenagers.
  • Campaigners call for governments to adopt policies to protect children’s data from AI-fueled misuse.

Photos of children were secretly used to train popular AI tools without consent, a report has warned. Human Rights Watch (HRW) found that the images span the entirety of childhood.

The watchdog said it found evidence that 170 photos of real children from Brazil had been scraped off the web into LAION-5B, a large data set that companies use to train their AI tools. HRW’s analysis found that LAION-5B contains links to identifiable photos of Brazilian children.


The photos capture intimate moments of babies being born into the gloved hands of doctors, young children blowing out candles on their birthday cake or dancing in their underwear at home, students giving a presentation at school, and teenagers posing for photos at their high school’s carnival.

The report warns that some children’s names are listed in the accompanying caption or the URL where the image is stored. In many cases, their identities are easily traceable, including information on when and where the child was at the time their photo was taken.

This photo taken 22 April 2004 shows schoolchildren making their way home from class in Tokyo. Photo by YOSHIKAZU TSUNO/AFP via Getty Images

One such photo shows a 2-year-old girl, her lips parted in wonder as she touches the tiny fingers of her newborn sister. The caption and the information embedded in the photo reveal not only both children’s names but also the name and precise location of the hospital in Santa Catarina where the baby was born nine years ago on a winter afternoon.

“Children should not have to live in fear that their photos might be stolen and weaponized against them,” said Hye Jung Han, children’s rights and technology researcher and advocate at Human Rights Watch. “The government should urgently adopt policies to protect children’s data from AI-fueled misuse.”


Many of these photos were originally seen by few people and appear to have previously had a measure of privacy; they do not appear to be findable through an ordinary online search.

Some of these photos were posted by children, their parents, or their family on personal blogs and photo- and video-sharing sites. Some were uploaded years or even a decade before LAION-5B was created.

Human Rights Watch warns: “Once their data is swept up and fed into AI systems, these children face further threats to their privacy due to flaws in the technology. AI models, including those trained on LAION-5B, are notorious for leaking private information; they can reproduce identical copies of the material they were trained on, including medical records and photos of real people. Guardrails set by some companies to prevent the leakage of sensitive data have been repeatedly broken.

“These privacy risks pave the way for further harm. Training on photos of real children has enabled AI models to create convincing clones of any child, based on a handful of photos or even a single image. Malicious actors have used LAION-trained AI tools to generate explicit imagery of children using innocuous photos, as well as explicit imagery of child survivors whose images of sexual abuse were scraped into LAION-5B.”


In response, LAION, the German nonprofit that manages LAION-5B, confirmed that the data set contained the children’s personal photos found by Human Rights Watch and pledged to remove them. It disputed that AI models trained on LAION-5B could reproduce personal data verbatim. LAION also said that children and their guardians were responsible for removing children’s personal photos from the internet, which it argued was the most effective protection against misuse.

The photos of children came from at least 10 states across Brazil: Alagoas, Bahia, Ceará, Mato Grosso do Sul, Minas Gerais, Paraná, Rio de Janeiro, Rio Grande do Sul, Santa Catarina, and São Paulo.

If you want to learn more about concerns around AI, computer scientist Sasha Luccioni has a fantastic TED Talk called “AI Is Dangerous, but Not for the Reasons You Think”. It is just 10 minutes long and can be watched on YouTube right now.
