The number of files produced worldwide keeps growing every year (see https://en.wikipedia.org/wiki/Information_explosion). With it grows, in equal measure, the number of digital objects for which we must decide: keep or throw away?
The truth is, the discard scenario becomes the more likely one with each passing year.
And the file formats that will be dropped are already in use today.
The truth is, no one has been able to build up format knowledge for them yet.
A fuzzy concept
In conversations with colleagues, validation plays no role. For one thing, no one is clear about what "valid" means. Valid against a specification? Valid against a profile? Valid because programs can open it? For another, the result has no consequences. If a file is broken, it is archived anyway. If it is not broken, fine.
The truth is, validation is useless.
Do you know how the success of digital preservation is measured? I'll tell you: in terabytes per year. If the numbers go up, that is easy to sell to politicians. Whether preparing the digital objects for long-term availability was difficult doesn't matter. Whether born-digital objects are more at risk? Never mind.
Is that the truth?
It used to be said that long-term digital archiving could only be handled by organizations with a certain minimum of resources. Look around and you'll find dozens of one-man bands and part-time archives. And do you think staffing grows along with the amount of data? Oh, come on!
You know the truth!
That's too exhausting
If you've ever heard of format migration as a principle of long-term preservation, you've read textbook phrases like this one:
"To ensure format migration, the significant properties of the groups of objects to be preserved must be determined."
Have you ever seen an archive that has actually determined and documented significant properties?
The truth is, significant properties are determined after the fact from technical metadata.
So what is long-term digital preservation? Just an expensive backup.