Re: Parse huge text files (x12 files) to XML
Hi Lingan,
I think the best approach in your case would be to wire up a segmented X12 Parser in Tika. Your Parser will be handed a java.io.InputStream; you can wrap that stream in, e.g., a Reader and parse out each segment of the X12 file as you go, rather than loading the whole file into memory. Then you just need to use one of Tika's existing ContentHandlers (or a plain ol' Java SAX ContentHandler, or write your own), and you can start emitting the XHTML that you desire.
You can learn more about how to do this in Chapters 8 and 11.
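To make the idea a bit more concrete, here is a rough sketch of the segment-at-a-time loop using only the JDK's SAX classes. It is not a real Tika Parser: the class name X12SegmentStreamer and its methods are made up for illustration, the '~' segment terminator and '*' element separator are assumed (a real X12 file declares its delimiters in the ISA header), and the table/tr/td layout is just one possible way to shape the output. In Tika, the same loop would live inside your Parser's parse(InputStream, ContentHandler, Metadata, ParseContext) method.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical helper: streams X12 segments to a SAX ContentHandler one at a time.
final class X12SegmentStreamer {

    // Assumed delimiters; real files declare theirs in the ISA header.
    private static final char SEGMENT_TERMINATOR = '~';
    private static final String ELEMENT_SEPARATOR_REGEX = "\\*";
    private static final AttributesImpl NO_ATTRS = new AttributesImpl();

    // Reads the stream character by character, so only one segment is buffered at a time.
    // Returns the number of non-empty segments emitted.
    static int stream(InputStream in, ContentHandler handler) throws IOException, SAXException {
        Reader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.US_ASCII));
        StringBuilder segment = new StringBuilder();
        int count = 0;
        handler.startDocument();
        handler.startElement("", "table", "table", NO_ATTRS);
        int c;
        while ((c = reader.read()) != -1) {
            if (c == SEGMENT_TERMINATOR) {
                if (emitSegment(segment, handler)) count++;
                segment.setLength(0);
            } else if (c != '\r' && c != '\n') {
                segment.append((char) c); // line breaks between segments are skipped
            }
        }
        if (emitSegment(segment, handler)) count++; // trailing segment without a terminator
        handler.endElement("", "table", "table");
        handler.endDocument();
        return count;
    }

    // Emits one <tr> per segment and one <td> per element; returns false for blank segments.
    private static boolean emitSegment(StringBuilder segment, ContentHandler handler) throws SAXException {
        String text = segment.toString();
        if (text.trim().isEmpty()) return false;
        handler.startElement("", "tr", "tr", NO_ATTRS);
        for (String element : text.split(ELEMENT_SEPARATOR_REGEX, -1)) {
            handler.startElement("", "td", "td", NO_ATTRS);
            handler.characters(element.toCharArray(), 0, element.length());
            handler.endElement("", "td", "td");
        }
        handler.endElement("", "tr", "tr");
        return true;
    }

    // A plain JDK ContentHandler that serializes the SAX events it receives as XML text.
    static TransformerHandler xmlSerializer(Writer out) throws TransformerConfigurationException {
        SAXTransformerFactory factory = (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler handler = factory.newTransformerHandler();
        handler.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        handler.setResult(new StreamResult(out));
        return handler;
    }

    public static void main(String[] args) throws Exception {
        byte[] sample = "ISA*00*TEST~\nGS*HC*SENDER~\n".getBytes(StandardCharsets.US_ASCII);
        StringWriter out = new StringWriter();
        int segments = stream(new ByteArrayInputStream(sample), xmlSerializer(out));
        System.out.println(segments + " segments");
        System.out.println(out);
    }
}
```

Because the handler sees each segment as soon as it is read, memory use stays flat no matter how large the file is. Swapping the serializer for one of Tika's ContentHandlers should not require changing the loop itself.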
HTH,
Chris