When implementing data science projects, we regularly have to decide which implementation method will integrate most smoothly with the pipeline. Because the overall design is always complex, the goal is the simplest implementation possible. We focus on simplifying our approaches as much as we can so that we can keep track of every step and modify it easily, with minimal implementation and modification time.
Some tools are more productive than others. Through our experience implementing an optimal machine-learning pipeline in production, we have learned to appreciate the raw strength of combining SAP HANA with SAP Data Services. Reformulating an approach and optimizing it for this combination can save a significant amount of time compared with a vanilla approach that uses Python for data wrangling, cleaning, discovery, and normalization, all of which are major parts of machine-learning pipeline development.