zh:notes:disruptive_index

Differences

This shows the differences between the selected revision and the current version of the page.

zh:notes:disruptive_index [2025/01/21 16:22]
pzczxs [Import Citing Articles]
zh:notes:disruptive_index [2025/06/19 09:11] (current version)
pzczxs [Summarization]
Line 1: Line 1:
 ====== Disruptive Index Calculation ======
 ===== Citation Information =====
-To add later
+  - Shuo Xu, Congcong Wang, Xin An, Yunkang Deng, and Jianhua Liu, 2025. [[https://doi.org/10.1016/j.joi.2025.101685|Do OpenCitations and Dimensions Serve as an Alternative to Web of Science for Calculating Disruption Indexes]]? //Journal of Informetrics//, Vol. 19, No. 3, Article 101685.
+  -
 
 ===== Datasets =====
Line 75: Line 77:
 ===== Dimensions =====
 ==== Download Data ====
-TODO
+For each domain (e.g., SYNTHETIC BIOLOGY), the procedure has three steps. First, the Dimensions API is queried with the DOIs from a specified list (e.g., doi_list1.csv) to retrieve the target articles' metadata (ID, title, authors, year, journal) and their referenced publications (referenced_pubs); running <color red>dimensions_retrieve.ipynb</color> saves the results as JSON files in the target_articles folder. Next, the reference IDs extracted from the referenced_pubs field are used to fetch metadata for the cited articles, which <color red>dimensions_references.ipynb</color> stores in the cited_articles folder. Finally, the combined list of IDs from the target and cited articles is queried to identify the citing articles (those whose reference_ids include any of the input IDs); running <color red>dimensions_citations.ipynb</color> saves the output to the citing_articles folder.
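The three retrieval steps can be pictured as a small ID-propagation pipeline. The following is a minimal Python sketch on in-memory records, not the notebooks' actual code: the record layout (an id plus a reference_ids list) and the helper names cited_ids and citing_records are assumptions made for illustration, and no Dimensions API call is made.

```python
from typing import Any, Dict, List, Set

# Hypothetical record layout: each publication has an "id" and a list of
# "reference_ids" (the IDs it cites). Real Dimensions records carry more fields.
Record = Dict[str, Any]


def cited_ids(targets: List[Record]) -> Set[str]:
    """Step 2 input: collect every ID referenced by the target articles."""
    ids: Set[str] = set()
    for rec in targets:
        ids.update(rec.get("reference_ids", []))
    return ids


def citing_records(corpus: List[Record], focal_ids: Set[str]) -> List[Record]:
    """Step 3: keep records whose reference_ids include any focal ID."""
    return [
        rec for rec in corpus
        if focal_ids.intersection(rec.get("reference_ids", []))
    ]


# Toy data standing in for API responses.
targets = [{"id": "pub.T1", "reference_ids": ["pub.C1", "pub.C2"]}]
corpus = [
    {"id": "pub.X1", "reference_ids": ["pub.T1"]},           # cites the target
    {"id": "pub.X2", "reference_ids": ["pub.C2", "pub.Z"]},  # cites a cited article
    {"id": "pub.X3", "reference_ids": ["pub.Z"]},            # unrelated
]

# Combined ID list of target and cited articles, as in the final step.
focal = {rec["id"] for rec in targets} | cited_ids(targets)
citing = citing_records(corpus, focal)
print(sorted(rec["id"] for rec in citing))
```

In the real workflow each step would page through API results and write JSON files to the corresponding folder; the sketch only shows how the ID sets feed from one step into the next.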
 ==== Import Target Articles with Backward Citations ====
 Import the target articles with the resulting backward citations into the database by running <color red>TargetArticleImporter.java</color> in the package <color red>cn.edu.bjut.dimensions</color>.
Line 168: Line 170:
 > UPDATE article SET doi = "10.3389/FPLS.2016.00706" WHERE doi = "10.3389/F,OLS.2016.00706";
 > UPDATE article SET doi = "10.1016/J.GEB.2019.07.003" WHERE doi = "10.1016/J,GEB.2019.07.003";
+> UPDATE article SET doi = "10.1017/S0140525X21001370" WHERE doi = "10.1017/S0140525X21001370,E120";
+> UPDATE article SET doi = "10.1007/978-3-030-68386-3_18" WHERE doi = "10.1007/978-3-030-68,38,6-3_18";
+> UPDATE article SET doi = "10.1016/J.MULFIN.2018.06.001" WHERE doi = "10.1016/J.MULFIN,2018.06.001";
+> UPDATE article SET doi = "10.1007/978-3-030-68386-3_8" WHERE doi = "10.1007/978-3-030-68,38,6-3_8";
+> UPDATE article SET doi = "10.1002/CPT.1619" WHERE doi = "10.1002/CPT.1619MASSACHUSETTS,USA.*";
 </code>
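Most of the DOI names corrected above are damaged by a stray comma. As a hedged illustration of how such records might be flagged before writing the UPDATE statements by hand, the following Python sketch applies a simple pattern check. The pattern and the helper name suspicious_dois are assumptions for illustration only; they are not part of the project's Java code, and the check cannot detect comma-free errors such as a wrong character.

```python
import re
from typing import List

# Assumed heuristic: a plausible DOI name starts with "10.", followed by a
# numeric registrant code, a slash, and a suffix without commas or whitespace.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/[^\s,]+$")


def suspicious_dois(dois: List[str]) -> List[str]:
    """Return the DOI names that fail the heuristic pattern check."""
    return [d for d in dois if not DOI_PATTERN.match(d)]


# The first entry is a corrected DOI; the others are damaged values
# taken from the statements above.
sample = [
    "10.3389/FPLS.2016.00706",
    "10.3389/F,OLS.2016.00706",
    "10.1007/978-3-030-68,38,6-3_18",
    "10.1002/CPT.1619MASSACHUSETTS,USA.*",
]
print(suspicious_dois(sample))
```

The flagged values still need manual inspection, since the correct DOI name cannot be derived from the damaged one automatically.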
  
+<!--
 <code bash>
 > ./merge-article-doi-wos.sh > merge-article-doi.log
 </code>
+-->
 
 Several cited articles have multiple DOI names attached. These can be resolved by running <color red>CitedArticleMultipleDoiResolver.java</color> in the package <color red>cn.edu.bjut.wos</color>. Note that this operation needs access to the [[https://www.doi.org/the-identifier/resources/factsheets/doi-resolution-documentation|DOI parser]].
Line 184: Line 193:
 > nohup ./update-cited-article-doi-wos.sh > update-cited-article-doi-wos.log 2>&1
 
-> UPDATE cited_article SET doi = "10.1145/2024724.2024911" WHERE id = 5148507; // WOS:000297360000151
+> UPDATE article SET doi = "10.1145/2024724.2024911" WHERE wos_id = "WOS:000297360000151";
 </code>
  
Line 235: Line 244:
 <code bash>
 > ./update-doi.sh
+> ./update-doi-with-excel.sh > update-doi-with-excel.log
 </code>
 ==== Update Publication Year ====
Line 246: Line 256:
 <code bash>
 > nohup ./update-doi-publication-year.sh > update-doi-publication-year.log 2>&1
-> nohup ./check-doi-publication-year.sh > check-doi-publication-year-2.log 2>&1
+> nohup ./check-doi-publication-year.sh > check-doi-publication-year.log 2>&1
 </code>
 
-About 1500 citing articles have no publication year attached at all. In this case, we can supplement the publication years of these citing articles and save them in the Excel file <color red>doi-publication-year20250110.xlsx</color>.
+About 1500 citing articles have no publication year attached at all. In this case, we can supplement the publication years of these citing articles and save them in the Excel file <color red>doi-publication-year20250430.xlsx</color>.
 <code bash>
 > nohup ./update-doi-publication-year-with-excel.sh > update-doi-publication-year-with-excel.log 2>&1
Line 256: Line 266:
 
 Note that 23 citing DOI names actually point to a journal, not a publication. It is very difficult to assign a publication year to these DOI names. Hence, we set the publication year of these DOIs to null.
-==== Normalization ====
-Before normalization, a global DOI set is generated by running <color red>DoiUpdater.java</color> in the package <color red>cn.edu.bjut.indices</color>. Then, the resulting publication years are updated by running <color red>DoiPublicationYearUpdater.java</color> in the package <color red>cn.edu.bjut.indices</color>.
+==== Summarization ====
+<code sql>
+> SELECT id FROM doi WHERE name = "10.1042/0264-6021:3370023";
+UPDATE doi SET preferred_id = null WHERE preferred_id = 811;
+UPDATE citation_open_citations SET cited_article_doi = "10.1042/0264-6021:3440069" WHERE cited_article_doi = "10.1042/0264-6021:3370023" AND citing_article_doi = "10.1111/J.1365-2958.2008.06183.X";
+> DELETE FROM citation_open_citations WHERE id = 791; -- duplicate
+UPDATE citation_open_citations SET cited_article_doi = "10.1042/0264-6021:3480001" WHERE cited_article_doi = "10.1042/0264-6021:3370023" AND citing_article_doi = "10.1039/C1MB05175J";
+DELETE FROM citation_open_citations WHERE id = 2694; -- duplicate
 
-The citations from Web of Science, Dimensions, and OpenCitations can be normalized by running <color red>CitatoinUpdater.java</color>, <color red>CitatoinDimensionsUpdater.java</color>, and <color red>CitatoinOpenCitationsUpdater.java</color> respectively in the package <color red>cn.edu.bjut.indices</color>.
+SELECT * FROM doi WHERE name = "10.3332/ECANCER";
+UPDATE doi SET preferred_id = null WHERE preferred_id = 269085;
+UPDATE citation_open_citations SET citing_article_doi = "10.3332/ECANCER.2013.370" WHERE cited_article_doi = "10.1126/SCIENCE.277.5331.1508" AND citing_article_doi = "10.3332/ECANCER";
+DELETE FROM citation_open_citations WHERE id = 312088; -- duplicate
+
+SELECT * FROM doi WHERE name = "10.5754/HGE10106";
+UPDATE doi SET preferred_id = null WHERE preferred_id = 665378;
+UPDATE citation_open_citations SET citing_article_doi = "10.5754/HGE11387" WHERE citing_article_doi = "10.5754/HGE10106" AND cited_article_doi = "10.1016/S0140-6736(01)06102-5";
+DELETE FROM citation_open_citations WHERE id = 933064; -- duplicate
+
+> UPDATE doi SET preferred_id = null WHERE name = "10.1103/PHYSREVD.69.065012";
+> DELETE FROM citation_wos WHERE citing_article_doi = cited_article_doi;
+> DELETE FROM citation_wos WHERE citing_article_doi IN ("10.1016/J.PHYSLETB.2004.09.028", "10.1016/J.PHYSLETB.2006.05.031", "10.1016/J.PHYSLETB.2023.137856", "10.1103/PHYSREVD.71.105005", "10.1103/PHYSREVD.73.105010", "10.1103/PHYSREVD.74.125016") AND cited_article_doi = "10.1103/PHYSREVD.69.065012";
+> DELETE FROM citation_wos WHERE citing_article_doi = "10.1103/PHYSREVD.82.105029" AND cited_article_doi = "10.1103/PHYSREVD.69.105012";
+</code>
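Two of the clean-up rules in the statements above are generic: a citation whose citing and cited DOI names are identical is dropped, and duplicate citation rows are removed. A minimal Python sketch of the same two rules on a list of DOI pairs might look as follows. The function name clean_citations and the uppercase normalization are assumptions for illustration; the project itself applies these fixes in SQL.

```python
from typing import Iterable, List, Set, Tuple

Pair = Tuple[str, str]  # (citing_doi, cited_doi)


def clean_citations(pairs: Iterable[Pair]) -> List[Pair]:
    """Drop self-citations and duplicate pairs, keeping first occurrences."""
    seen: Set[Pair] = set()
    cleaned: List[Pair] = []
    for citing, cited in pairs:
        # Assumption: DOI names are compared case-insensitively, stored uppercase.
        pair = (citing.upper(), cited.upper())
        if pair[0] == pair[1] or pair in seen:
            continue
        seen.add(pair)
        cleaned.append(pair)
    return cleaned


raw = [
    ("10.1039/C1MB05175J", "10.1042/0264-6021:3480001"),
    ("10.1039/c1mb05175j", "10.1042/0264-6021:3480001"),       # duplicate after normalization
    ("10.1103/PHYSREVD.69.065012", "10.1103/PHYSREVD.69.065012"),  # self-citation
]
print(clean_citations(raw))
```

The DOI-specific corrections in the statements above (for example, redirecting a journal-level DOI to the right article) still have to be made case by case.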
  
 <code bash>
-> ./update-doi-with-excel.sh > update-doi-with-excel.log
 > ./summary.sh > summary.log
 </code>
  
 ==== Calculate Disruptive Index ====
+Before calculation, the related data can be exported by running <color red>CitationExporter.java</color> in the package <color red>cn.edu.bjut.ui</color>.
+<code bash>
+> ./export-citations.sh
+</code>
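For orientation, the exported citation pairs can feed a disruption index computation along the following lines. This is a minimal Python sketch that assumes the commonly used definition DI = (n_i - n_j) / (n_i + n_j + n_k), where n_i counts articles citing the focal article but none of its references, n_j counts articles citing both, and n_k counts articles citing only its references. It is not the project's own implementation; the function name disruption_index and the pair layout are assumptions, and publication-year filtering is omitted.

```python
from typing import Dict, Iterable, Optional, Set, Tuple


def disruption_index(
    focal: str, citations: Iterable[Tuple[str, str]]
) -> Optional[float]:
    """Compute the index for one focal article from (citing, cited) pairs."""
    # Build the reference set of every citing article.
    refs_of: Dict[str, Set[str]] = {}
    for citing, cited in citations:
        refs_of.setdefault(citing, set()).add(cited)

    focal_refs = refs_of.get(focal, set())
    n_i = n_j = n_k = 0
    for citing, refs in refs_of.items():
        if citing == focal:
            continue
        cites_focal = focal in refs
        cites_refs = bool(refs & focal_refs)
        if cites_focal and not cites_refs:
            n_i += 1  # cites the focal article only
        elif cites_focal and cites_refs:
            n_j += 1  # cites the focal article and its references
        elif cites_refs:
            n_k += 1  # cites the references only
    total = n_i + n_j + n_k
    return (n_i - n_j) / total if total else None


# Toy network: F cites R1 and R2; A and D cite F only; B cites F and R1;
# C cites R2 only.
pairs = [
    ("F", "R1"), ("F", "R2"),
    ("A", "F"), ("D", "F"),
    ("B", "F"), ("B", "R1"),
    ("C", "R2"),
]
print(disruption_index("F", pairs))  # (2 - 1) / (2 + 1 + 1) = 0.25
```

The function returns None when no article cites the focal article or its references, so callers can separate undefined values from a genuine zero.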
  
 ~~DISCUSSION:closed~~
zh/notes/disruptive_index.1737447746.txt.gz · Last modified: 2025/01/21 16:22 by pzczxs