This week, two AI conversations are unfolding in parallel: one about infrastructure, the other about ownership.
In New Delhi, the India AI Impact Summit 2026 is underway at Bharat Mandapam. Prime Minister Modi is meeting global technology leaders including Sam Altman and Sundar Pichai. Over 20 heads of state and 60 ministers have gathered to discuss three guiding Sutras: People, Planet, and Progress.
The focus is clear: build computing capacity, democratise AI access, and position India as a central player in the global AI ecosystem.

Union Minister Ashwini Vaishnaw has indicated that India could attract over $200 billion in AI and data infrastructure investment in the next two years, with approximately $70 billion already committed.
It is an impressive display of ambition.
Across the Atlantic, a quieter but equally consequential conversation is taking place. In the United States Congress, a bipartisan bill called the TRAIN Act (Transparency and Responsibility for Artificial Intelligence Networks) has been reintroduced. Its purpose is straightforward: creators would have the right to discover whether their copyrighted work was used to train AI models.
Two conversations. One question.
Who owns the data that powers AI?
The Gap Between Infrastructure and Governance
The India AI Summit highlights themes such as Safe and Trusted AI, Inclusion, and Democratising AI Resources. These are necessary pillars.
India’s Digital Personal Data Protection Act, 2023 establishes a consent-based framework for data processing. But it does not yet clarify how principles such as consent, purpose limitation, and transparency apply when personal data is used for AI model training and refinement.
The law is technologically neutral.
Neutrality, however, is not clarity.
This distinction matters strategically.
India’s digital ecosystem is vast and highly engaged. Global firms view it as an invaluable environment for testing, localisation, and iterative improvement. Offering AI tools at low or no cost benefits users immediately, while simultaneously generating feedback loops that refine models.
Access and data contribution are increasingly intertwined.
Without clear guidance on data provenance and training transparency, the balance between innovation and accountability remains undefined.
What the TRAIN Act Signals
The TRAIN Act establishes a simple principle: creators have a right to know if their work has been used in AI training datasets.
It is a transparency-first approach. Before debates about compensation, there must be visibility.
For Indian creators, this raises a practical question:
If your song, composition, or lyrical work has been scraped into a training dataset, would you ever know?
Under current Indian law, the answer is uncertain.
Europe’s GDPR has begun shaping conversations around AI consent and purpose limitation. India now faces a similar inflection point. Infrastructure and investment alone will not define leadership; governance clarity will, because infrastructure shapes outcomes long before their effects become visible.
What This Means for Composers, Authors, and Publishers
Consider a composer in Kerala whose work becomes part of a training dataset for an AI music generation system. That system is used by a producer elsewhere to generate a commercially successful track. The composer recognises something familiar — but there is no registry, no disclosure, and no traceability mechanism.
This is not speculative fiction. It is a governance gap, one that echoes the concerns raised in the human authorship dilemma when machines begin composing at scale.
The conversations around the TRAIN Act and the India AI Summit converge on the same reality: the next phase of AI development will require clear standards on data provenance and accountability, not only for regulators but for the creators whose work feeds these systems.
A Question Worth Sitting With
India has laid the foundations of data protection. Its AI ambition now calls for operational clarity.
Rather than drafting entirely new regimes, policymakers could clarify how existing principles apply to AI training contexts:
- How does consent function in model training scenarios?
- How does purpose limitation apply to dataset reuse?
- What transparency norms should govern training data?
- How can provenance and traceability be institutionalised?
These are not anti-innovation questions. They are structural questions.
For creators, the issue is more personal. Much like the early career publishing blind spot, structural clarity often arrives after value has already leaked.
Your work may already be in a training dataset. Your melodies. Your lyrics. Your voice.
The question is not whether AI will use creative works. It is whether creators will have the right to know, and whether the law will ensure they have a seat at the table when value is assigned.
The India AI Summit represents national ambition.
The TRAIN Act represents creator visibility.
Both are ultimately conversations about the same issue:
Who owns the data that powers AI?
And more importantly, who gets to decide?