A reflective look at the epistemological context of data analytics

Authors

  • Luis Miguel Mejia Giraldo, Universidad La Gran Colombia
  • Ximena Cifuentes Wchima, Universidad La Gran Colombia
  • Bibiana Vélez Medina, Universidad La Gran Colombia
  • John Edward Herrera, Universidad La Gran Colombia
  • Luis Fernando Restrepo Betancur, Universidad La Gran Colombia

DOI:

https://doi.org/10.18634/sophiaj.21v.1i.1440

Keywords:

data science, research field, education and research, epistemology

Abstract

The modern abundance and prominence of data have led to the development of "data science" as a new field of research, along with a body of epistemological reflection on its foundations, methods, and consequences. This article derives from a research exercise on the aims of education, in which the analysis of knowledge provides a systematic treatment and critical review of important open problems and debates in the epistemology of analytics and data science. It proposes dividing the epistemology of data science into five aspects: maximalist and minimalist characterizations, descriptive taxonomies, the knowledge generated by data science, black-box problems, and science in a data-intensive paradigm. Together, these aspects support reflection on how to understand and approach the essentials of interpreting data and uncovering the hidden patterns within them, which is the central challenge of analytics.

Author biographies

  • Luis Miguel Mejia Giraldo, Universidad La Gran Colombia
    Master's degree in Sustainable Development and Environment. Associate Professor, Faculty of Engineering, Universidad La Gran Colombia. Armenia, Colombia. Leader of the GIDA Research Group. Email: mejiagluismiguel@miugca.edu.co
  • Ximena Cifuentes Wchima, Universidad La Gran Colombia
    Master's degree in Sustainable Development and Environment. Dean, Faculty of Engineering, Universidad La Gran Colombia. Armenia, Colombia. Member of the Gerencia de la Tierra Research Group. Email: defingenieria@ugca.edu.co
  • Bibiana Vélez Medina, Universidad La Gran Colombia
    Ph.D. in Educational Sciences. Master's degree in Education. Delegate Rector of Universidad La Gran Colombia. Armenia, Colombia. Leader of the PAIDEIA Research Group. Email: rectoraugca@ugca.edu.co
  • John Edward Herrera, Universidad La Gran Colombia
    Master's degree in Integrated Quality Management Systems. Associate Professor, Faculty of Engineering, Universidad La Gran Colombia. Armenia, Colombia. Email: herreraquijohn@miugca.edu.co
  • Luis Fernando Restrepo Betancur, Universidad La Gran Colombia
    Master's degree in Sustainable Development and Environment. Associate Professor, Faculty of Engineering, Universidad La Gran Colombia. Armenia, Colombia. Leader of the GIDA Research Group. Email: mejiagluismiguel@miugca.edu.co

Published

2025-12-02

Issue

Vol. 21 No. 1 (2025)

Section

Reflection Article

How to cite

Una oteada reflexiva hacia el contexto epistemológico de la analítica de datos [A reflective look at the epistemological context of data analytics]. (2025). Sophia, 21(1), 1-24. https://doi.org/10.18634/sophiaj.21v.1i.1440
