This page shows the differences between the revision you selected and the current version.
zh:notes:education [2022/01/28 12:35] pzczxs [Topic Discovery] |
zh:notes:education [2024/06/24 10:05] (current version) pzczxs [Extract Simple Cycles] |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== Mapping Technological Trajectories in the Education Domain ====== | ====== Mapping Technological Trajectories in the Education Domain ====== | ||
===== Citation Information ===== | ===== Citation Information ===== | ||
- | To be added later. | + | -Shuo Xu, Congcong Wang, Xin An, Liyuan Hao, and Guancan Yang, 2024. [[https://doi.org/10.1177/01655515221101835|A Novel Developmental Trajectory Discovery Approach by Integrating Main Path Analysis and Intermediacy]]. //Journal of Information Science//, Vol. 50, No. 3, pp. 651-672. |
+ | - | ||
===== Requirements ===== | ===== Requirements ===== | ||
*[[zh:notes:install_dtm|DTM]] | *[[zh:notes:install_dtm|DTM]] | ||
+ | *[[http://mrvar.fdv.uni-lj.si/pajek/|Pajek]] | ||
===== Dataset ===== | ===== Dataset ===== | ||
The dataset comes from the following journals in the Web of Science (WoS). Here, we include all publications of the document types //Article//, //Article; Early Access//, //Article; Proceedings Paper//, //Database Review//, //Reprint//, and //Review// published in or before 2020. | The dataset comes from the following journals in the Web of Science (WoS). Here, we include all publications of the document types //Article//, //Article; Early Access//, //Article; Proceedings Paper//, //Database Review//, //Reprint//, and //Review// published in or before 2020. | ||
Line 192: | Line 193: | ||
Run <color red>CountryMerger.java</color> in the package <color red>cn.edu.bjut.ui</color>. | Run <color red>CountryMerger.java</color> in the package <color red>cn.edu.bjut.ui</color>. | ||
- | <!-- | ||
===== Extract Simple Cycles ===== | ===== Extract Simple Cycles ===== | ||
The direct citation network can be constructed by running the following SQL statement. | The direct citation network can be constructed by running the following SQL statement. | ||
<code sql> | <code sql> | ||
> SELECT ta_ca.target_article_id AS target_artice_id, ca.wos_id AS cited_article_id FROM target_article as ta, target_article_cited_article AS ta_ca, cited_article AS ca WHERE ta.id = ta_ca.target_article_id AND ta_ca.cited_article_id = ca.id AND ca.wos_id IS NOT NULL AND ta.publication_year <= 2020 And ta.publication_year != 0 AND ta.type IN ("Article; Proceedings Paper", "Reprint", "Article; Early Access", "Article", "Review", "Database Review") ORDER by ta_ca.target_article_id ASC; | > SELECT ta_ca.target_article_id AS target_artice_id, ca.wos_id AS cited_article_id FROM target_article as ta, target_article_cited_article AS ta_ca, cited_article AS ca WHERE ta.id = ta_ca.target_article_id AND ta_ca.cited_article_id = ca.id AND ca.wos_id IS NOT NULL AND ta.publication_year <= 2020 And ta.publication_year != 0 AND ta.type IN ("Article; Proceedings Paper", "Reprint", "Article; Early Access", "Article", "Review", "Database Review") ORDER by ta_ca.target_article_id ASC; | ||
+ | </code> | ||
+ | |||
+ | education technology: | ||
+ | <code sql> | ||
+ | > INSERT INTO citation (citing_doi, citing_publication_year, cited_doi, cited_publication_year) | ||
+ | > SELECT a.doi AS citing_doi, a.publication_year as citing_year, ca.doi AS cited_doi, ca.publication_year AS cited_year FROM article AS a, article_cited_article AS a_ca, cited_article AS ca | ||
+ | > WHERE a.id = a_ca.article_id AND a_ca.cited_article_id = ca.id | ||
+ | > AND a.TYPE IN ("Article", "Article; Early Access", "Article; Proceedings Paper", "Database Review", "Reprint", "Review", "Review; Early Access") | ||
+ | > AND a.doi is not null and ca.doi IS NOT NULL AND flag = 0 | ||
+ | > AND ca.journal_id IS NOT NULL | ||
+ | > ORDER BY citing_doi ASC, cited_doi ASC; | ||
+ | |||
+ | > DELETE FROM citation WHERE citing_doi = cited_doi; | ||
</code> | </code> | ||
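The source does not show how the simple cycles are actually extracted from the citation network, so the following is only a minimal Python sketch under stated assumptions: the network is given as a list of (citing, cited) pairs, and the function name `find_simple_cycles` is hypothetical. It enumerates every simple cycle once, rooted at its smallest node; a self-citation such as the ones removed by the `DELETE` statement above appears as a cycle of length one. The running time is exponential in the worst case, so this is suited to small networks only.

```python
from collections import defaultdict


def find_simple_cycles(edges):
    """Enumerate simple cycles in a small directed graph.

    `edges` is an iterable of (citing, cited) pairs (hypothetical input
    format). Each cycle is returned once as a list of nodes, starting
    from its smallest node.
    """
    graph = defaultdict(set)
    for citing, cited in edges:
        graph[citing].add(cited)

    nodes = sorted(set(graph) | {v for targets in graph.values() for v in targets})
    cycles = []
    for start in nodes:
        # Depth-first walk that only visits nodes larger than `start`,
        # so every cycle is reported exactly once.
        stack = [(start, [start])]
        while stack:
            node, path = stack.pop()
            for nxt in sorted(graph.get(node, ())):
                if nxt == start:
                    cycles.append(path)
                elif nxt > start and nxt not in path:
                    stack.append((nxt, path + [nxt]))
    return cycles


if __name__ == "__main__":
    toy_edges = [("A", "B"), ("B", "C"), ("C", "A"), ("B", "A"), ("D", "D")]
    for cycle in find_simple_cycles(toy_edges):
        print(" -> ".join(cycle + [cycle[0]]))
```

Rooting each cycle at its smallest node avoids reporting the same cycle once per rotation, which keeps the output easy to compare with the citation links that are later deleted by hand.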
Line 228: | Line 241: | ||
> DELETE FROM target_article_cited_article WHERE target_article_id = "ISI:000408778800002" AND cited_article_id = 162626; | > DELETE FROM target_article_cited_article WHERE target_article_id = "ISI:000408778800002" AND cited_article_id = 162626; | ||
</code> | </code> | ||
- | --> | ||
- | |||
===== Topic Discovery ===== | ===== Topic Discovery ===== | ||
+ | Several files (//Education.docs//, //Education.word.vocab//, //Education-multi.dat//, and //Education-seq.dat//) need to be generated in advance in the directory <color red>data/DTM</color> by running <color red>Converter2DTM.java</color> in the package <color red>cn.edu.bjut.dtm</color>. | ||
+ | On the basis of these generated files, topic structures can be discovered with DTM for each number of topics $K \in \{5, 10, \cdots, 50\}$ by running the following commands. | ||
<code bash> | <code bash> | ||
- | > run_dim_education.sh 40 | + | > nohup ./run_dtm_education.sh 5 >> Education/log5.txt 2>&1 |
+ | > nohup ./run_dtm_education.sh 10 >> Education/log10.txt 2>&1 | ||
+ | > nohup ./run_dtm_education.sh 15 >> Education/log15.txt 2>&1 | ||
+ | > nohup ./run_dtm_education.sh 20 >> Education/log20.txt 2>&1 | ||
+ | > nohup ./run_dtm_education.sh 25 >> Education/log25.txt 2>&1 | ||
+ | > nohup ./run_dtm_education.sh 30 >> Education/log30.txt 2>&1 | ||
+ | > nohup ./run_dtm_education.sh 35 >> Education/log35.txt 2>&1 | ||
+ | > nohup ./run_dtm_education.sh 40 >> Education/log40.txt 2>&1 | ||
+ | > nohup ./run_dtm_education.sh 45 >> Education/log45.txt 2>&1 | ||
+ | > nohup ./run_dtm_education.sh 50 >> Education/log50.txt 2>&1 | ||
</code> | </code> | ||
+ | |||
+ | To identify a suitable number of topics, the perplexity is calculated for each $K \in \{5, 10, \cdots, 50\}$ by running <color red>DTMTuningParam.java</color> in the package <color red>cn.edu.bjut.dtm</color>. In the end, $K$ is fixed to 30 in our case. | ||
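To make the perplexity criterion concrete, here is a minimal Python sketch. It is not the code in <color red>DTMTuningParam.java</color>, and it makes simplifying assumptions: each document is a list of word ids, `theta` holds per-document topic proportions, and `phi` holds a single set of topic-word distributions (in DTM these vary by time slice, which is ignored here). All function and parameter names are hypothetical. Choosing the $K$ with the lowest perplexity is one common rule and is not necessarily the rule applied above.

```python
import math


def perplexity(docs, theta, phi):
    """Perplexity = exp(-(sum of token log-likelihoods) / (number of tokens)).

    docs  : list of documents, each a list of word ids
    theta : theta[d][k], proportion of topic k in document d
    phi   : phi[k][w], probability of word w under topic k
    """
    log_likelihood = 0.0
    n_tokens = 0
    for d, words in enumerate(docs):
        for w in words:
            # Mixture probability of word w in document d.
            p = sum(theta[d][k] * phi[k][w] for k in range(len(phi)))
            log_likelihood += math.log(p)
            n_tokens += 1
    return math.exp(-log_likelihood / n_tokens)


def pick_num_topics(perplexity_by_k):
    """Return the K with the lowest perplexity (one possible selection rule)."""
    return min(perplexity_by_k, key=perplexity_by_k.get)
```

As a sanity check, a model that assigns uniform probability to a vocabulary of $V$ words has perplexity exactly $V$, so lower values indicate a better fit.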
+ | |||
+ | Then, the top words for each topic and the topic distribution for each document can be output by running <color red>DTMPrinter.java</color> in the package <color red>cn.edu.bjut.dtm</color>. This step generates two files (//Education.twords// and //Education.vartheta//). | ||
+ | |||
+ | ===== Main Path Analysis ===== | ||
+ | |||
~~DISCUSSION:closed~~ | ~~DISCUSSION:closed~~ |