Big Tech's secret race for AI training data intensifies
The pursuit of artificial intelligence (AI) training data has led Big Tech companies into a clandestine market, vying for vast collections of digital content.
Photobucket, once a dominant player and now down to only 2 million users, finds its archive of 13 billion photos and videos to be a potential goldmine for training generative AI models. CEO Ted Leonard disclosed ongoing discussions with tech giants to license this trove, with prices ranging widely based on the content and buyer.
This emerging market is propelled by the need for "foundation" AI models to learn from massive datasets. Companies like Google, Meta, and Microsoft initially relied on freely scraped internet data but now seek legally and ethically sourced content to mitigate copyright and privacy concerns. For instance, Shutterstock has struck significant deals with Meta, Google, Amazon, and Apple, licensing its extensive library for AI training.
The demand for data extends beyond existing web content to include specially created or sourced materials, like podcasts, short-form videos, and even sensitive images used for content moderation training. Companies are willing to pay top dollar for high-quality, "ethically sourced" data that respects copyright and privacy norms.
However, this practice raises legal and ethical questions, especially when it involves personal data from old social media platforms. The industry is grappling with how to ensure privacy and consent when using such data, highlighting the complex balance between technological advancement and ethical responsibility.