Text will always be a major format for data and it will never be well-organized. According to Phil Karlton’s famous joke, the two hard problems in computing are naming things, cache invalidation, and off-by-one errors. A third problem could be “formatting things”. Most textual data has irregularities in the formatting that make it a pain to process. And much of the work in text processing goes into dealing with formatting issues. These are just the sad facts.