A reflective look at the epistemological context of data analysis

Authors

  • Luis Miguel Mejia Giraldo, Universidad La Gran Colombia (Author)
  • Ximena Cifuentes Wchima, Universidad La Gran Colombia (Author)
  • Bibiana Vélez Medina, Universidad La Gran Colombia (Author)
  • John Edward Herrera, Universidad La Gran Colombia (Author)
  • Luis Fernando Restrepo Betancur, Universidad La Gran Colombia (Author)

DOI:

https://doi.org/10.18634/sophiaj.21v.1i.1440

Keywords:

Data science, research area, education and research, epistemology

Abstract

The modern abundance and prominence of data have led to the development of "data science" as a new field of inquiry, together with a body of epistemological reflection on its foundations, methods, and consequences. This article derives from a research exercise on the purposes of education, in which the analysis of knowledge provides a systematic overview and critical review of important problems and open debates in the epistemology of data analysis and data science. It proposes dividing the epistemology of data science into five aspects: maximalist and minimalist characterizations, descriptive taxonomies, the knowledge generated by data science, black-box problems, and science in a data-intensive paradigm. These aspects support a reflective exercise on understanding and addressing what is essential to interpreting data and the patterns hidden in them, which is the challenge of analysis as such.

Author Biographies

  • Luis Miguel Mejia Giraldo, Universidad La Gran Colombia
    Master's degree in Sustainable Development and Environment. Associate Professor, Faculty of Engineering, Universidad La Gran Colombia. Armenia, Colombia. Leader of the GIDA Research Group. E-mail: mejiagluismiguel@miugca.edu.co
  • Ximena Cifuentes Wchima, Universidad La Gran Colombia
    Master's degree in Sustainable Development and Environment. Dean of the Faculty of Engineering, Universidad La Gran Colombia. Armenia, Colombia. Member of the Territorial Management Research Group. E-mail: defingenieria@ugca.edu.co
  • Bibiana Vélez Medina, Universidad La Gran Colombia
    Doctorate in Educational Sciences. Master's degree in Education. Acting Rector of Universidad La Gran Colombia. Armenia, Colombia. Leader of the PAIDEIA research group. E-mail: rectoraugca@ugca.edu.co
  • John Edward Herrera, Universidad La Gran Colombia
    Master's degree in Integrated Quality Management Systems. Associate Professor, Faculty of Engineering, Universidad La Gran Colombia. Armenia, Colombia. E-mail: herreraquijohn@miugca.edu.co
  • Luis Fernando Restrepo Betancur, Universidad La Gran Colombia
    Master's degree in Sustainable Development and Environment. Associate Professor, Faculty of Engineering, Universidad La Gran Colombia. Armenia, Colombia. Leader of the GIDA Research Group. E-mail: mejiagluismiguel@miugca.edu.co

Published

2025-12-02

Section

Reflection Article

How to Cite

A reflective look at the epistemological context of data analysis. (2025). Sophia, 21(1), 1-24. https://doi.org/10.18634/sophiaj.21v.1i.1440
