This page shows the differences between the revision you selected and the current version.
Previous revision: zh:notes:at_credit [2020/10/07 14:10] pzczxs [Split Train and Test Sets] | Current revision: zh:notes:at_credit [2022/11/24 13:00] (current version) pzczxs [Citation Information] | ||
---|---|---|---|
Line 8: | Line 8: | ||
===== Citation Information ===== | ===== Citation Information ===== | ||
- | To be added later. | + | -Shuo Xu, Ling Li, Congcong Wang, Xin An, and Guancan Yang, 2022. [[https://doi.org/10.1177/01655515221133530|An Improved Author-Topic (AT) Model with Authorship Credit Allocation Schemes]]. //Journal of Information Science//. |
+ | -Shuo Xu, Ling Li, Liyuan Hao, Xin An, and Guancan Yang, 2021. [[https://doi.org/10.1007/978-3-030-71292-1_18|An Author Interest Discovery Model armed with Authorship Credit Allocation Scheme]]. //iConference: Diversity, Divergence, Dialogue//, pp. 199-207. | ||
===== Create Database ===== | ===== Create Database ===== | ||
- | The database SQL file: <color red>synthetic_biology.sql</color>. This database consists of the following tables: //author//, //cited_author//, //cited_article_author//, //citing_article//, //citing_article_cited_article//, //citing_article_keyword//, //keyword//, //target_article//, //target_article_author//, and //target_article_keyword//. | + | The database SQL file: <color red>synthetic_biology.sql</color>. This database consists of the following tables: //author//, //target_article//, and //target_article_author//. |
===== Fill Missing DOI Information ===== | ===== Fill Missing DOI Information ===== | ||
Line 74: | Line 75: | ||
INSERT target_article_author (target_article_id, author_id, seq_no_original, seq_no, is_reprint_original, is_reprint) VALUES ("WOS:000365103600006", 10851, 6, 6, 0, 0); | INSERT target_article_author (target_article_id, author_id, seq_no_original, seq_no, is_reprint_original, is_reprint) VALUES ("WOS:000365103600006", 10851, 6, 6, 0, 0); | ||
</code> | </code> | ||
+ | |||
+ | ===== Detect and Tokenize Sentences, and Recognize Entities ===== | ||
+ | To run <color red>Converter2Genia.java</color> in the package <color red>cn.edu.bjut.genia</color>. Thus, the articles will be saved in the directory <color red>data/genia</color>. Each article is named by its resulting id. | ||
+ | |||
+ | <code bash> | ||
+ | > ./run_geniass.sh geniass data/genia & | ||
+ | > ./run_geniatagger.sh geniatagger data/genia & | ||
+ | </code> | ||
+ | |||
+ | For each document, two files will be generated with the extension name <color red>.txt.ss</color> and <color red>.txt.ss.tag</color>. To save all <color red>.txt.ss</color> and <color red>.txt.ss.tag</color> files in the directory <color red>data/genia</color>. | ||
===== Authorship Credit Allocation Schemes ===== | ===== Authorship Credit Allocation Schemes ===== | ||
* Arithmetic counting scheme: To run <color red>ArithmeticCredit.java</color> in the package <color red>cn.edu.bjut.credit</color>. | * Arithmetic counting scheme: To run <color red>ArithmeticCredit.java</color> in the package <color red>cn.edu.bjut.credit</color>. | ||
Line 85: | Line 96: | ||
<code matlab> | <code matlab> | ||
> load credits | > load credits | ||
+ | |||
> std(arithmetic(:)) / mean(arithmetic(:)) | > std(arithmetic(:)) / mean(arithmetic(:)) | ||
> std(geometric(:)) / mean(geometric(:)) | > std(geometric(:)) / mean(geometric(:)) | ||
Line 93: | Line 105: | ||
</code> | </code> | ||
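The MATLAB expressions above divide a standard deviation by a mean, i.e. they compute a coefficient of variation for each credit scheme. As a rough illustration only, the sketch below reproduces that calculation in Python for the arithmetic counting scheme. It assumes the common definition in which the author at byline position i of n receives (n + 1 - i) divided by 1 + 2 + ... + n; the page does not show what <color red>ArithmeticCredit.java</color> actually implements, so the formula, the function names, and the byline sizes are all hypothetical. It also flattens only the credits that exist, whereas ''arithmetic(:)'' flattens whatever matrix is stored in ''credits'', so padded zeros (if any) would change the result.

```python
from statistics import mean, stdev


def arithmetic_credits(n_authors):
    """Assumed arithmetic (proportional) counting scheme.

    The author at 1-based byline position i of n receives
    (n + 1 - i) / (1 + 2 + ... + n), so one article's credits sum to 1.
    """
    total = n_authors * (n_authors + 1) / 2
    return [(n_authors + 1 - i) / total for i in range(1, n_authors + 1)]


def coefficient_of_variation(values):
    """Sample standard deviation over the mean.

    statistics.stdev uses the n - 1 normalisation, which is also
    the default of MATLAB's std.
    """
    return stdev(values) / mean(values)


# Hypothetical byline sizes for three articles (not taken from the database).
byline_sizes = [1, 2, 3]

# Flatten all per-author credits into one list, loosely mirroring arithmetic(:).
credits = [c for n in byline_sizes for c in arithmetic_credits(n)]

cv = coefficient_of_variation(credits)
print(f"coefficient of variation: {cv:.4f}")
```

A larger coefficient of variation means the scheme spreads credit less evenly across byline positions, which is presumably why the ratio is computed for each scheme in turn.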
- | ===== Detect and Tokenize Sentences, and Recognize Entities ===== | ||
- | To run <color red>Converter2Genia.java</color> in the package <color red>cn.edu.bjut.genia</color>. Thus, the articles will be saved in the directory <color red>data/genia</color>. Each article is named by its resulting id. | ||
- | |||
- | <code bash> | ||
- | > ./run_geniass.sh geniass data/genia & | ||
- | > ./run_geniatagger.sh geniatagger data/genia & | ||
- | </code> | ||
- | |||
- | For each document, two files will be generated with the extension name <color red>.txt.ss</color> and <color red>.txt.ss.tag</color>. To save all <color red>.txt.ss</color> and <color red>.txt.ss.tag</color> files in the directory <color red>data/genia</color>. | ||
===== Split Train and Test Sets ===== | ===== Split Train and Test Sets ===== | ||
Line 154: | Line 157: | ||
To run <color red>ATArithmeticCreditRunner.java</color>, <color red>ATAxiomaticCreditRunner.java</color>, <color red>ATGeometricCreditRunner.java</color>, <color red>ATGoldenNumberCreditRunner.java</color>, <color red>ATHarmonicCreditRunner.java</color>, and <color red>ATNetworkCreditRunner.java</color> in the package <color red>cn.edu.bjut.ui</color>. | To run <color red>ATArithmeticCreditRunner.java</color>, <color red>ATAxiomaticCreditRunner.java</color>, <color red>ATGeometricCreditRunner.java</color>, <color red>ATGoldenNumberCreditRunner.java</color>, <color red>ATHarmonicCreditRunner.java</color>, and <color red>ATNetworkCreditRunner.java</color> in the package <color red>cn.edu.bjut.ui</color>. | ||
- | ~~DISCUSSION~~ | + | ~~DISCUSSION:closed~~ |