Log comparison, or log analysis, is an integral part of verifying test results. It can be a daily routine when testing the runtime behavior of applications, databases or drivers, or when verifying the logs generated by automation frameworks.
But this process becomes tedious and cumbersome when the log files get huge (say 30K-40K lines). Though there are many handy text comparison tools such as ‘Beyond Compare’ and ‘WinMerge’, the sheer length of the file makes the comparison a visual pain and prone to human error.
The complexity rises further if the files under comparison differ in their alignment. Log files generally contain dates, build numbers, machine names, and the names of the databases/servers/apps they connect to. These differ from run to run and add to the list of differences that the diff tools show.
“Divide and Rule” could be the mantra for simplifying things quickly in such cases. As a quick workaround, one can split the huge file, based on a pattern, into small manageable files so that comparison through the tools gets easier. This minimizes the impact of differences in the alignment or order of the text in the files under comparison.
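To make the idea concrete, here is a minimal Python sketch of splitting on a pattern. It is only an illustration, not how any particular tool does it: the `=== TEST name ===` marker, the sample log lines and the `split_by_pattern` helper are all assumptions made up for this example. Each log is cut into named sections, and the sections are then compared by name, so a change in their order no longer shows up as a difference.

```python
import re
from typing import Dict, List


def split_by_pattern(lines: List[str], marker: str) -> Dict[str, List[str]]:
    """Group lines into sections; a new section starts at each line matching `marker`.

    `marker` must contain one capture group, which is used as the section name.
    Lines before the first marker go into a section called "preamble".
    """
    pattern = re.compile(marker)
    sections: Dict[str, List[str]] = {"preamble": []}
    current = "preamble"
    for line in lines:
        match = pattern.search(line)
        if match:
            current = match.group(1)
            sections.setdefault(current, [])
        sections[current].append(line)
    return sections


# Hypothetical log excerpts: same tests, different order, one real difference.
log_a = [
    "run started",
    "=== TEST login ===",
    "step 1 ok",
    "=== TEST checkout ===",
    "step 1 ok",
    "step 2 ok",
]
log_b = [
    "run started",
    "=== TEST checkout ===",
    "step 1 ok",
    "step 2 FAILED",
    "=== TEST login ===",
    "step 1 ok",
]

MARKER = r"^=== TEST (\w+) ==="  # assumed section header format
sections_a = split_by_pattern(log_a, MARKER)
sections_b = split_by_pattern(log_b, MARKER)

# Compare section by section, so reordering alone is not reported.
changed = sorted(
    name
    for name in sections_a.keys() | sections_b.keys()
    if sections_a.get(name) != sections_b.get(name)
)
print(changed)
```

In practice each section could just as well be written out to its own small file and opened in the diff tool; keeping everything in memory here simply keeps the sketch short.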
There are, again, any number of file splitter tools available. “GSplit” is one such tool that I tried, and it was simple and easy to use. It can split a file based on a pattern, or based on the number of lines we intend to put in each chunk, and so on. The good part is that it can also unite the pieces back into one big file.
The tool also provides a command line option, and hence can be integrated into automation or continuous integration.
More details about the tool here.
Every problem has multiple solutions, and which one to pick depends on any number of factors. Sometimes we want an easier, quicker way of solving the problem (though not one recommended in the long term), and at other times we want to take the time to research, investigate and find the most robust solution possible. The “Divide and Rule” strategy for this problem of huge file comparison belongs to the short-term, ‘get it quick’ category of solutions.
Ideally, we should avoid generating such huge files if possible, especially when they are generated by an automation framework. In cases where that is unavoidable, we may also look at other long-term options where the human eye is not involved in the comparison at all. Instead, we should just get filtered, semantic, rule-based differences between two files, all automated!
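As a rough sketch of what such a rule-based comparison might look like, the Python snippet below masks the values that legitimately change from run to run before diffing. The three rules (timestamp, build number, `host=` field) and the sample log lines are assumptions for illustration only; real logs would need their own rules. The comparison itself uses `difflib.SequenceMatcher` from the standard library.

```python
import difflib
import re
from typing import List

# Hypothetical masking rules: (pattern, placeholder). Adjust to the real log format.
RULES = [
    (re.compile(r"\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}"), "<TIMESTAMP>"),
    (re.compile(r"build[ #]*\d+(?:\.\d+)*", re.IGNORECASE), "<BUILD>"),
    (re.compile(r"host=\S+"), "host=<HOST>"),
]


def normalize(line: str) -> str:
    """Replace run-specific values with fixed placeholders."""
    for pattern, placeholder in RULES:
        line = pattern.sub(placeholder, line)
    return line


def filtered_diff(old: List[str], new: List[str]) -> List[str]:
    """Return only the lines that still differ after normalization."""
    old_norm = [normalize(line) for line in old]
    new_norm = [normalize(line) for line in new]
    matcher = difflib.SequenceMatcher(a=old_norm, b=new_norm, autojunk=False)
    report: List[str] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        report.extend(f"- {line}" for line in old_norm[i1:i2])
        report.extend(f"+ {line}" for line in new_norm[j1:j2])
    return report


# Made-up sample runs: date, host and build differ, plus one real difference.
old_run = [
    "2024-01-05 10:00:01 host=qa-box-1 Build 101 connected",
    "2024-01-05 10:00:02 rows returned: 42",
]
new_run = [
    "2024-02-09 18:30:11 host=qa-box-7 Build 117 connected",
    "2024-02-09 18:30:12 rows returned: 40",
]

for line in filtered_diff(old_run, new_run):
    print(line)
```

Only the row-count line is reported here; the date, host and build differences are masked away by the rules. An empty report could then serve as the pass condition in an automated run.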
Your Thoughts??