Feb-15-2017, 07:42 AM
One approach:
read the file in chunks of 20 lines (assuming each separate chunk has the same structure as the two chunks in your example).
you need lines 2 and 12-19 of each chunk.
parse the respective lines (using split, strip, etc.)
write result to your file.
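A minimal sketch of the chunk approach, assuming every chunk is exactly 20 lines and the wanted lines look like "key: value". The names iter_chunks, parse_chunk and convert are made up for illustration, and the CSV output is just one possible result format:

```python
import csv
from itertools import islice

CHUNK_SIZE = 20
# Lines 2 and 12-19 of each chunk, converted to 0-based indices.
WANTED = [1] + list(range(11, 19))


def iter_chunks(lines, size=CHUNK_SIZE):
    """Yield successive lists of up to `size` lines from any iterable of lines."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def parse_chunk(chunk):
    """Return the values of the wanted lines (assumes 'key: value' lines)."""
    values = []
    for i in WANTED:
        line = chunk[i] if i < len(chunk) else ""  # tolerate a short last chunk
        _key, _sep, value = line.partition(":")
        values.append(value.strip())
    return values


def convert(src, dst):
    """Read lines from `src` and write one CSV row per chunk to `dst`."""
    writer = csv.writer(dst)
    for chunk in iter_chunks(src):
        writer.writerow(parse_chunk(chunk))
```

With real files this would be called as convert(open("input.txt"), open("output.csv", "w", newline="")), where both file names are placeholders.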
Alternative:
read the file line by line
check if the line (stripped of leading spaces) starts with one of the words (LOCATION, color, mandatory, viscosity, winter use, chemicals, safe for kids, accessories, cleaning)
parse the respective line accordingly. Note that you need to write a row to the output file when you find the next LOCATION line (and once more at the end of the file, for the last record).
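The line-by-line variant could look roughly like this. It is only a sketch: iter_records and write_rows are hypothetical names, and it assumes the value is whatever follows the keyword and an optional colon:

```python
import csv

FIELDS = ("LOCATION", "color", "mandatory", "viscosity", "winter use",
          "chemicals", "safe for kids", "accessories", "cleaning")


def iter_records(lines):
    """Yield one dict per record; a new record starts at each LOCATION line."""
    record = {}
    for raw in lines:
        line = raw.strip()
        # Find the keyword this line starts with, if any.
        field = next((f for f in FIELDS if line.startswith(f)), None)
        if field is None:
            continue
        if field == "LOCATION" and record:
            yield record  # the previous record is complete
            record = {}
        # Drop the keyword, then any spaces/colon in front of the value.
        record[field] = line[len(field):].lstrip(" :").strip()
    if record:
        yield record  # do not forget the last record


def write_rows(records, dst):
    """Write the records as CSV; missing fields become empty cells."""
    writer = csv.DictWriter(dst, fieldnames=FIELDS, restval="")
    writer.writeheader()
    writer.writerows(records)
```

Unlike the chunk approach, this does not depend on every record having the same number of lines.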
Third approach - RegEx
This should be the first choice if you are familiar with RegEx.
A regex that returns all matches from your example:
[^ \n][\w ]* ?: \w*
There might be a better one, but I'm not that experienced with RegEx.
Maybe combine RegEx with reading chunks of 20 lines.
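A rough sketch of how that regex could be used with re.findall. The find_pairs helper is a made-up name, and the sample text in the usage note is my guess at the "key: value" layout, not your actual file. Note that \w* only captures the first word of a value, so multi-word values would need a different pattern:

```python
import re

# The pattern from above, unchanged.
PATTERN = re.compile(r"[^ \n][\w ]* ?: \w*")


def find_pairs(text):
    """Return (key, value) tuples for every 'key: value' match in `text`."""
    pairs = []
    for match in PATTERN.findall(text):
        key, _sep, value = match.partition(":")
        pairs.append((key.strip(), value.strip()))
    return pairs
```

For example, find_pairs("LOCATION: A\n   color: red\n") gives [("LOCATION", "A"), ("color", "red")]. To combine it with the chunk idea, you could join each 20-line chunk into one string and pass it to find_pairs.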