A reflective look at the epistemological context of data analysis
DOI:
https://doi.org/10.18634/sophiaj.21v.1i.1440
Keywords:
Data science, research area, education and research, epistemology
Abstract
The abundance and prominence of data in modern life have led to the development of "data science" as a new field of inquiry, together with a body of epistemological reflection on its foundations, methods, and consequences. This article derives from a research exercise on the purposes of education, in which the analysis of knowledge provides a systematic approach to, and a critical review of, important problems and open debates in the epistemology of data analysis and data science. It proposes dividing the epistemology of data science into the following five aspects: maximalist and minimalist characterizations, descriptive taxonomies, the knowledge generated by data science, black-box problems, and science in a data-intensive paradigm. These aspects support a reflective exercise on understanding and addressing what is essential to interpreting data and grasping the patterns hidden in them, which is the challenge of analysis as such.
This work is licensed under a Creative Commons license.

