The differences between the selected revision and the current version are shown here.
zh:notes:techemergence [2020/10/17 15:16] pzczxs [Update PubMed id and PMC id] |
zh:notes:techemergence [2020/10/17 16:19] (current version) pzczxs [Rum the DIM Model] |
||
---|---|---|---|
Line 44: | Line 44: | ||
Running <color red>DownloadByDoi.java</color> in the project <color red>EmergingTopicsConverter</color> generates a file <color red>ref-ids2.txt</color>. Using this file, fetch the full records and cited references in CSV format from All Databases in the Web of Science, and save them in <color red>data/contest/cited/csv</color>. | Running <color red>DownloadByDoi.java</color> in the project <color red>EmergingTopicsConverter</color> generates a file <color red>ref-ids2.txt</color>. Using this file, fetch the full records and cited references in CSV format from All Databases in the Web of Science, and save them in <color red>data/contest/cited/csv</color>. | ||
- | Import the related information from the directory <color red>data/contest/cited/csv</color> into the database with <color red>CSVReferencemporter.java</color> in the project <color red>EmergingTopicsConvert</color>. | + | Import the related information from the directory <color red>data/contest/cited/csv</color> into the database with <color red>CSVReferencemporter.java</color> in the project <color red>EmergingTopicsConverter</color>. |
===== Assign wos_id for references ===== | ===== Assign wos_id for references ===== | ||
Line 52: | Line 52: | ||
===== Merge Authors ===== | ===== Merge Authors ===== | ||
Run <color red>AuthorMerger.java</color> in the project <color red>TechEmergence</color>. | Run <color red>AuthorMerger.java</color> in the project <color red>TechEmergence</color>. | ||
+ | |||
+ | ===== Detect and Tokenize Sentences, and Recognize Entities ===== | ||
+ | Run <color red>Converter2Genia.java</color> in the package <color red>cn.edu.bjut.genia</color> of the project <color red>EmergingTopicsConverter</color>. The articles are then saved in the directories <color red>data/contest-Genia/DIM</color> and <color red>data/contest-Genia/CIM</color>. Each article is named by its resulting id. | ||
+ | |||
+ | <code bash> | ||
+ | > ./run_geniass.sh geniass data/contest-Genia/DIM & | ||
+ | > ./run_geniatagger.sh geniatagger data/contest-Genia/DIM & | ||
+ | > ./run_geniass.sh geniass data/contest-Genia/CIM & | ||
+ | > ./run_geniatagger.sh geniatagger data/contest-Genia/CIM & | ||
+ | </code> | ||
+ | |||
+ | For each document, two files are generated, with the extensions <color red>.txt.ss</color> and <color red>.txt.ss.tag</color>. Save all <color red>.txt.ss</color> and <color red>.txt.ss.tag</color> files in the directories <color red>data/contest-Genia/DIM</color> and <color red>data/contest-Genia/CIM</color>. | ||
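Before moving on to the next step, it may help to confirm that every article actually received both output files. The following is a minimal sketch, not part of the original notes: the function name `check_genia_outputs` is hypothetical, and it assumes that each input article is stored as `<id>.txt`, which is inferred only from the `.txt.ss` and `.txt.ss.tag` extensions.

```bash
#!/usr/bin/env bash
# Sketch: report input articles that lack GENIA output files.
# Assumption (not stated in the notes): each input article is stored as
# <id>.txt, so its outputs are <id>.txt.ss and <id>.txt.ss.tag.
check_genia_outputs() {
    local dir="$1" missing=0 f ext
    for f in "$dir"/*.txt; do
        [ -e "$f" ] || continue            # no .txt files: the glob stayed literal
        for ext in ss ss.tag; do
            if [ ! -s "$f.$ext" ]; then    # output file is missing or empty
                echo "missing: $f.$ext"
                missing=$((missing + 1))
            fi
        done
    done
    echo "$missing missing file(s) in $dir"
    [ "$missing" -eq 0 ]                   # non-zero exit status if anything is missing
}
```

It could be called as `check_genia_outputs data/contest-Genia/DIM` and `check_genia_outputs data/contest-Genia/CIM` once the background jobs have finished. The non-zero exit status makes it usable as a guard in a larger script.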
+ | |||
+ | ===== Run the CIM Model ===== | ||
+ | Run <color red>Converter2DIM.java</color> in the package <color red>cn.edu.bjut.genia</color> of the project <color red>EmergingTopicsConverter</color>. Several files for the DIM model will be generated in the directory <color red>data/contest-DIM/emergence</color>. | ||
+ | |||
+ | ===== Run the DIM Model ===== |