zh:notes:at_credit

Differences

This shows the differences between the selected revision and the current version of the page.

zh:notes:at_credit [2020/10/07 14:09]
pzczxs [Split Train and Test Sets]
zh:notes:at_credit [2022/11/24 13:00] (current)
pzczxs [Citation Information]
Line 8: Line 8:
  
 ===== Citation Information =====
-To be added later.
 +  -Shuo Xu, Ling Li, Congcong Wang, Xin An, and Guancan Yang, 2022. [[https://doi.org/10.1177/01655515221133530|An Improved Author-Topic (AT) Model with Authorship Credit Allocation Schemes]]. //Journal of Information Science//.
 +  -Shuo Xu, Ling Li, Liyuan Hao, Xin An, and Guancan Yang, 2021. [[https://doi.org/10.1007/978-3-030-71292-1_18|An Author Interest Discovery Model armed with Authorship Credit Allocation Scheme]]. //iConference: Diversity, Divergence, Dialogue//, pp. 199-207.
  
 ===== Create Database =====
-The database SQL file: <color red>synthetic_biology.sql</color>. This database consists of the following tables: //author//, //cited_author//, //cited_article_author//, //citing_article//, //citing_article_cited_article//, //citing_article_keyword//, //keyword//, //target_article//, //target_article_author//, and //target_article_keyword//.
 +The database SQL file: <color red>synthetic_biology.sql</color>. This database consists of the following tables: //author//, //target_article//, and //target_article_author//.
  
 ===== Fill Missing DOI Information =====
Line 74: Line 75:
 INSERT target_article_author (target_article_id, author_id, seq_no_original, seq_no, is_reprint_original, is_reprint) VALUES ("WOS:000365103600006", 10851, 6, 6, 0, 0);
 </code>
 +
 +===== Detect and Tokenize Sentences, and Recognize Entities =====
 +To run <color red>​Converter2Genia.java</​color>​ in the package <color red>​cn.edu.bjut.genia</​color>​. Thus, the articles will be saved in the directory <color red>​data/​genia</​color>​. Each article is named by its resulting id. 
 +
 +<code bash>
 +> ./​run_geniass.sh geniass data/genia &
 +> ./​run_geniatagger.sh geniatagger data/genia &
 +</​code>​
 +
 +For each document, two files will be generated with the extension name <color red>​.txt.ss</​color>​ and <color red>​.txt.ss.tag</​color>​. To save all <color red>​.txt.ss</​color>​ and <color red>​.txt.ss.tag</​color>​ files in the directory <color red>​data/​genia</​color>​.
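
The two-files-per-document rule stated above can be checked mechanically. The following is a minimal sketch only, not part of the project: the class name ''GeniaOutputCheck'' and the method ''missingOutputs'' are hypothetical, and it assumes that the converted articles in <color red>data/genia</color> carry a ''.txt'' extension (inferred from the ''.txt.ss'' suffix, not confirmed by the page).

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper (not in the project): reports missing GENIA outputs. */
final class GeniaOutputCheck {
    private GeniaOutputCheck() {}

    /**
     * For every "*.txt" article in dir (assumed input naming), returns the
     * names of expected ".txt.ss" / ".txt.ss.tag" files that do not exist.
     */
    static List<String> missingOutputs(Path dir) throws IOException {
        List<String> missing = new ArrayList<>();
        // The glob must match the whole file name, so "x.txt.ss" is not picked up here.
        try (DirectoryStream<Path> articles = Files.newDirectoryStream(dir, "*.txt")) {
            for (Path article : articles) {
                String name = article.getFileName().toString();
                for (String suffix : new String[] {".ss", ".ss.tag"}) {
                    if (!Files.exists(dir.resolve(name + suffix))) {
                        missing.add(name + suffix);
                    }
                }
            }
        }
        Collections.sort(missing); // directory order is unspecified, so sort for stable output
        return missing;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the directory named on the wiki page.
        Path dir = Paths.get(args.length > 0 ? args[0] : "data/genia");
        for (String name : missingOutputs(dir)) {
            System.out.println("missing: " + name);
        }
    }
}
```

Because both shell scripts are started in the background, a check of this kind is only meaningful once both jobs have finished.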
 ===== Authorship Credit Allocation Schemes =====
   * Arithmetic counting scheme: To run <color red>ArithmeticCredit.java</color> in the package <color red>cn.edu.bjut.credit</color>.
Line 85: Line 96:
 <code matlab>
 > load credits
 +
 > std(arithmetic(:)) / mean(arithmetic(:))
 > std(geometric(:)) / mean(geometric(:))
Line 93: Line 105:
 </code>
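
The MATLAB lines above compute a coefficient of variation (standard deviation divided by mean) for each scheme's credits. As a rough illustration, the sketch below derives credits for a single by-line and computes the same ratio in Java. It is an assumption-laden sketch, not the project's ''ArithmeticCredit.java'': the class ''CreditSketch'' and its methods are hypothetical, and the formulas are the textbook definitions of arithmetic counting, 2(n + 1 - i) / (n(n + 1)), and geometric counting, 2^(n - i) / (2^n - 1), for the author at rank i of n, which may differ from what the project implements.

```java
import java.util.Arrays;

/** Hypothetical illustration; not the project's credit classes. */
final class CreditSketch {
    private CreditSketch() {}

    /** Arithmetic counting: rank i (1-based) of n gets (n + 1 - i) / (n(n + 1) / 2). */
    static double[] arithmetic(int n) {
        double[] credits = new double[n];
        double denominator = n * (n + 1) / 2.0;
        for (int i = 1; i <= n; i++) {
            credits[i - 1] = (n + 1 - i) / denominator;
        }
        return credits;
    }

    /** Geometric counting: rank i (1-based) of n gets 2^(n - i) / (2^n - 1). */
    static double[] geometric(int n) {
        double[] credits = new double[n];
        double denominator = Math.pow(2, n) - 1;
        for (int i = 1; i <= n; i++) {
            credits[i - 1] = Math.pow(2, n - i) / denominator;
        }
        return credits;
    }

    /**
     * Coefficient of variation: sample standard deviation (n - 1 in the
     * denominator, which is also MATLAB's default for std) divided by the mean.
     * Needs at least two values.
     */
    static double coefficientOfVariation(double[] values) {
        double mean = Arrays.stream(values).average().orElse(Double.NaN);
        double sumOfSquares = 0.0;
        for (double v : values) {
            sumOfSquares += (v - mean) * (v - mean);
        }
        return Math.sqrt(sumOfSquares / (values.length - 1)) / mean;
    }

    public static void main(String[] args) {
        double[] credits = arithmetic(3);
        System.out.println(Arrays.toString(credits));
        System.out.println(coefficientOfVariation(credits));
    }
}
```

For a three-author by-line the arithmetic credits are 1/2, 1/3, and 1/6, and the geometric credits are 4/7, 2/7, and 1/7; both sum to 1. Note that the MATLAB commands flatten a whole ''credits'' matrix with ''(:)'', whereas this sketch looks at one by-line only.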
  
-===== Detect and Tokenize Sentences, and Recognize Entities ===== 
-To run <color red>​Converter2Genia.java</​color>​ in the package <color red>​cn.edu.bjut.genia</​color>​. Thus, the articles will be saved in the directory <color red>​data/​genia</​color>​. Each article is named by its resulting id.  
- 
-<code bash> 
-> ./​run_geniass.sh geniass data/genia & 
-> ./​run_geniatagger.sh geniatagger data/genia & 
-</​code>​ 
- 
-For each document, two files will be generated with the extension name <color red>​.txt.ss</​color>​ and <color red>​.txt.ss.tag</​color>​. To save all <color red>​.txt.ss</​color>​ and <color red>​.txt.ss.tag</​color>​ files in the directory <color red>​data/​genia</​color>​. 
  
 ===== Split Train and Test Sets =====
Line 112: Line 115:
 To run <color red>TrainTestSetSplitter.java</color> in the package <color red>cn.edu.bjut.multilabel</color>. In this time, two files <color red>syn_bio.train.docs</color> and <color red>syn_bio.test.docs</color> in the directory <color red>data/multi-label</color> will be generated.
  
-To run <color red>Converter2ATCredit.java</color> in the package <color red>cn.edu.bjut.genia</color>. Several files will be generated for the AT model in the directory<color red>data/at_credit</color>.
 +To run <color red>Converter2ATCredit.java</color> in the package <color red>cn.edu.bjut.genia</color>. Several files will be generated for the AT<sup>credit</sup> model in the directory<color red>data/at_credit</color>.
 ===== Parameter Tuning =====
 To run <color red>ATArithmeticCreditTuningParam.java</color>, <color red>ATAxiomaticCreditTuningParam.java</color>, <color red>ATGeometricCreditTuningParam.java</color>, <color red>ATGoldenNumberCreditTuningParam.java</color>, <color red>ATHarmonicCreditTuningParam.java</color>, and <color red>ATNetworkCreditTuningParam.java</color> in the package <color red>cn.edu.bjut.ui</color>. Note that if one wants to turn on the hyper-authorship strategy, the second parameter is set to <color red>true</color> in these java files, otherwise false.
Line 154: Line 157:
 To run <color red>ATArithmeticCreditRunner.java</color>, <color red>ATAxiomaticCreditRunner.java</color>, <color red>ATGeometricCreditRunner.java</color>, <color red>ATGoldenNumberCreditRunner.java</color>, <color red>ATHarmonicCreditRunner.java</color>, and <color red>ATNetworkCreditRunner.java</color> in the package <color red>cn.edu.bjut.ui</color>.
  
-~~DISCUSSION~~
 +~~DISCUSSION:closed~~
zh/notes/at_credit.1602050957.txt.gz · Last modified: 2020/10/07 14:09 by pzczxs