By tapping into the resources offered by the Research Computing Services team, Bruce Wallace and Phil Masson in Systems and Computer Engineering were able to quickly process large amounts of research data and nearly eliminate time-consuming manual input.
They were working with 350 GB of input data and predicted that processing it would take 70 days. In addition, the existing multi-step workflow required a lot of user intervention between steps (manually editing, copying, and moving files and folders), which they were hoping to automate.
After reviewing the researchers’ code, the Research Computing Services team rewrote large sections to make file reading and writing more efficient. This reduced the runtime of the first two steps from 2.5 days to 1 hour. They were also able to split the code into several parts that run in parallel, which is where they noticed most of the speed improvements.
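The article does not say which language or tools the team used, so the following is only a minimal Python sketch of the general idea of splitting work into independent parts and running them in parallel. The names `count_records` and `process_all`, and the per-file unit of work (counting non-empty lines), are hypothetical stand-ins and not the researchers' actual code.

```python
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path
from typing import Callable, List, Sequence


def count_records(path: Path) -> int:
    """Hypothetical unit of work: count non-empty lines in one input file."""
    # Read the file once through a buffered handle instead of reopening
    # or re-scanning it for every record.
    with open(path, "r", encoding="utf-8") as handle:
        return sum(1 for line in handle if line.strip())


def process_all(
    paths: Sequence[Path],
    worker: Callable[[Path], int] = count_records,
    max_workers: int = 4,
    executor_cls=ProcessPoolExecutor,
) -> List[int]:
    """Apply `worker` to every path concurrently; results keep input order."""
    with executor_cls(max_workers=max_workers) as pool:
        return list(pool.map(worker, paths))


if __name__ == "__main__":
    # Assumed layout: input files sitting in a local "input" folder.
    files = sorted(Path("input").glob("*.txt"))
    print(process_all(files))
```

This approach only helps when each part can be processed without waiting on the others. The worker count is an assumption that would normally be tuned to the number of cores on the node.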
Instead of relying on manual user intervention, they automated most of the tasks between the steps of the process using scripts. This allowed the researchers to run most of the process automatically.
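As an illustration of this kind of automation, the sketch below chains processing steps so that the output folder of one step becomes the input folder of the next, replacing manual copying and moving of files. It is an assumption about how such a script could look, not the team's actual scripts; `run_pipeline`, `uppercase_step`, and `merge_step` are hypothetical names, and the two example steps are placeholders for real processing.

```python
from pathlib import Path
from typing import Callable, Sequence

# A step reads files from an input folder and writes results to an output folder.
Step = Callable[[Path, Path], None]


def run_pipeline(steps: Sequence[Step], input_dir: Path, work_dir: Path) -> Path:
    """Run each step in order, feeding each step's output folder to the next."""
    current = input_dir
    for index, step in enumerate(steps, start=1):
        output_dir = work_dir / f"step{index}"
        output_dir.mkdir(parents=True, exist_ok=True)  # no manual folder setup
        step(current, output_dir)
        current = output_dir  # hand-off that was previously done by hand
    return current


def uppercase_step(src: Path, dst: Path) -> None:
    """Placeholder step: write an upper-cased copy of every text file."""
    for path in sorted(src.glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        (dst / path.name).write_text(text.upper(), encoding="utf-8")


def merge_step(src: Path, dst: Path) -> None:
    """Placeholder step: concatenate all text files into one file."""
    parts = [p.read_text(encoding="utf-8") for p in sorted(src.glob("*.txt"))]
    (dst / "merged.txt").write_text("".join(parts), encoding="utf-8")
```

Calling `run_pipeline([uppercase_step, merge_step], Path("raw"), Path("work"))` would run both steps back to back and return the folder holding the final output.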
The expected runtime of 70 days dropped to an actual runtime of 2 days on one of the Research Computing Cloud nodes, an overall performance improvement of 35x.
“I am really impressed and happy with the performance results and that weeks of work dropped to days”, said Wallace.
Research Computing Services works with faculty members to understand their research computing needs and provide the support or computing resources they require.