Data Versioning and Its Impact on Machine Learning Models

Authors

  • Vamsi Krishna Eruvaram, Sr. Data Engineer, Lowe's, USA
  • Mohan Raja Pulicharla, Department of Computer Sciences, Monad University, India

DOI:

https://doi.org/10.55662/JST.2024.5101

Keywords:

Machine Learning Models, Data Versioning, ML pipeline

Abstract

Data versioning ensures the reproducibility, transparency, and reliability of machine learning (ML) models. Because ML models depend on diverse datasets that change over time, versioning the data keeps the ML pipeline consistent. By tracking changes in datasets and tying each model to the specific version of the data it was trained on, researchers can reproduce experiments, verify results, and address challenges related to data quality, collaboration, and model training. Effective data versioning practices make ML workflows more robust, foster trust in model outcomes, and support advancements in the field.
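
The idea of aligning a model with a specific version of its data can be illustrated with a minimal sketch. It is not taken from the article: the function names `dataset_version` and `record_training_run`, the 12-character hash prefix, and the toy rows are all assumptions made for illustration. The sketch derives a version identifier from the dataset's content and stores it next to the model's training parameters, so a later run can check whether it is using the same data.

```python
import hashlib
import json


def dataset_version(rows):
    """Return a short content hash for a list of JSON-serializable rows.

    Hypothetical helper: the same rows in the same order always give the
    same identifier, and any change to the data gives a different one.
    """
    canonical = json.dumps(rows, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def record_training_run(model_name, rows, params):
    """Build a metadata record tying a model to the data it was trained on."""
    return {
        "model": model_name,
        "data_version": dataset_version(rows),
        "params": params,
    }


# Toy data (assumed): a second dataset version adds one row to the first.
v1_rows = [{"x": 1.0, "y": 0}, {"x": 2.0, "y": 1}]
v2_rows = v1_rows + [{"x": 3.0, "y": 1}]

run = record_training_run("demo-classifier", v1_rows, {"lr": 0.1})

# Reproducing the experiment later: confirm the data still matches the record.
same_data = run["data_version"] == dataset_version(v1_rows)
changed_data = run["data_version"] != dataset_version(v2_rows)
```

In practice a dedicated tool would usually manage the stored data and its history; the sketch only shows the bookkeeping that makes an experiment traceable to one dataset version.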

Published

29-01-2024

How to Cite

[1]
V. K. Eruvaram and M. R. Pulicharla, “Data Versioning and Its Impact on Machine Learning Models”, J. Sci. Tech., vol. 5, no. 1, pp. 22–37, Jan. 2024, doi: 10.55662/JST.2024.5101.
