Topic models are very useful for interpretation

Concept:: Interpretation, reading

What does this have to do with the humanities? Here is the rosy vision. A humanist imagines the kind of hidden structure that she wants to discover and embeds it in a model that generates her archive. The form of the structure is influenced by her theories and knowledge — time and geography, linguistic theory, literary theory, gender, author, politics, culture, history. With the model and the archive in place, she then runs an algorithm to estimate how the imagined hidden structure is realized in actual texts. Finally, she uses those estimates in subsequent study, trying to confirm her theories, forming new theories, and using the discovered structure as a lens for exploration. She discovers that her model falls short in several ways. She revises and repeats.

Note that the statistical models are meant to help interpret and understand texts; it is still the scholar’s job to do the actual interpreting and understanding. A model of texts, built with a particular theory in mind, cannot provide evidence for the theory.[5] (After all, the theory is built into the assumptions of the model.) Rather, the hope is that the model helps point us to such evidence. Using humanist texts to do humanist scholarship is the job of a humanist.

In summary, researchers in probabilistic modeling separate the essential activities of designing models and deriving their corresponding inference algorithms. The goal is for scholars and scientists to creatively design models with an intuitive language of components, and then for computer programs to derive and execute the corresponding inference algorithms with real data. The research process described above — where scholars interact with their archive through iterative statistical modeling — will be possible as this field matures.

I reviewed the simple assumptions behind LDA and the potential for the larger field of probabilistic modeling in the humanities. Probabilistic models promise to give scholars a powerful language to articulate assumptions about their data and fast algorithms to compute with those assumptions on large archives. I hope for continued collaborations between humanists and computer scientists/statisticians. With such efforts, we can build the field of probabilistic modeling for the humanities, developing modeling components and algorithms that are tailored to humanistic questions about texts.

Source:: Topic Modeling and Digital Humanities, Journal of Digital Humanities

Precisely because ❓The humanities care only about archiving, understanding data, and telling good stories, the digital humanities use topic models heavily.

The essence of a topic model is coloring documents and words
Topic modelling in NLP is meant for machines and needs a large dataset. Thematic analysis in anthropology, by contrast, is meant for people and emphasizes the visual element
Large language models work with language, not knowledge
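
To make the "coloring" idea concrete, here is a minimal sketch, assuming LDA fitted with collapsed Gibbs sampling (one common inference method, not necessarily the one the quoted article has in mind). Each token receives a topic index, its "color", and each document ends up with a count of colors. The function name `color_words`, the toy documents, and the hyperparameter values are all illustrative assumptions.

```python
import random
from collections import Counter


def color_words(docs, num_topics=2, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Assign one topic ("color") to every token with collapsed Gibbs sampling.

    docs is a list of token lists. Returns (z, doc_topic), where z[d][i] is
    the color of token i in document d and doc_topic[d][k] counts how many
    tokens of document d currently carry color k.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    # Start from a random coloring and build the count tables from it.
    z = [[rng.randrange(num_topics) for _ in doc] for doc in docs]
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Take the token out of the counts before recoloring it.
                k = z[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1

                # A color is likely if the document already uses it a lot
                # and if that color already "owns" this word.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]

                z[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    return z, doc_topic


# Hypothetical toy archive: two "sea" documents and two "romance" documents.
docs = [
    "whale ship sea whale captain".split(),
    "sea ship captain whale sea".split(),
    "love marriage letter love ball".split(),
    "letter ball marriage love letter".split(),
]

z, doc_topic = color_words(docs, num_topics=2)
for doc, colors, counts in zip(docs, z, doc_topic):
    print(list(zip(doc, colors)), counts)
```

The per-token list `z` is the coloring of words, and each row of `doc_topic` is the coloring of a document. Which color ends up standing for which theme is arbitrary and varies with the seed, so interpreting the colors remains the reader's job, as the quoted passage stresses.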