JIN Mingzhe 金 明哲
  • Specially-appointed Professor (Part-time)
  • Institute of Interdisciplinary Research
Faculty Profile
Research Field

statistical science, data science, artificial intelligence, language science, digital humanities

Courses Taught

Data Science, Machine Learning, Text Analytics, and Text Mining

Academic Degrees

Ph.D., The Graduate University for Advanced Studies (SOKENDAI)

Brief Biography

JSPS Research Fellowship for Young Scientists, National Institute for Japanese Language and Linguistics
Sapporo Gakuin University, Faculty of Social Information Studies, Associate Professor, Professor
Doshisha University, Faculty and Graduate School of Culture and Information Science, Professor

Achievements and Awards

JSS Publication Award (2015)
The Behaviormetric Society, Chikio Hayashi Award for Excellence (2013)

Media Coverage

Asahi Shimbun, evening edition (2023-08-23)
Considering how large language models might act differently if trained in different languages. Communications of the ACM, Volume 67, Number 5 (2024), Pages 29-31

Research Overview

I conduct research using scientific methods grounded in statistical science and data science. A significant portion of this work concerns classification, prediction, and trend analysis based on text (or corpora). Text refers to information represented using characters, symbols, numbers, and the like. It includes books, articles, poems, scripts, e-books, web pages, social media posts, chat messages, program code, markup languages, genetic information represented by sequences of symbols, musical notes, and more. Research that uses text is therefore relevant to many fields.

Large language models such as GPT have attracted significant attention. Some people even use GPT-generated text to write reports or graduation theses. In my research, I am also interested in scientifically determining whether a given text was generated by GPT. Text mining and text analytics can be applied not only in academic research but also in areas such as daily life and business.

List of Research
Research Keywords

statistical science, data science, text mining, text analytics, quantitative linguistics, stylometry, artificial intelligence, digital humanities

Academic Papers

[1] An Empirical Comparison and Ensemble Learning Methods of BERT Models on Authorship Attribution, Taisei KANDA, Mingzhe JIN. Journal of Japan Society of Information and Knowledge, 34(3), 244-255, https://doi.org/10.2964/jsik_2024_022, 2024/9/30.
[2] Can we spot fake public comments generated by ChatGPT(-3.5, -4)?: Japanese stylometric analysis exposes emulation created by one-shot learning. Wataru ZAITSU, Mingzhe JIN, Shunichi ISHIHARA, Satoru TSUGE, Mitsuyuki INABA. Public Library of Science (PLOS ONE), 19(3), e0299031, DOI: 10.1371/journal.pone.0299031, 2024/03/13.
[3] A CORPUS-BASED STYLISTIC ANALYSIS: DID THE NOVELIST MINAE MIZUMURA ACHIEVE HER LITERARY GOAL?, Guangwei LI, Mingzhe JIN, Yasuko NAKAMURA. PSYCHOLOGIA, 65(2), 273–283. https://doi.org/10.2117/psysoc.2023-B038, 2024/02.
[4] QUANTITATIVE ANALYSIS OF THE CHARACTERISTICS AND HISTORICAL TRANSITION OF EDOGAWA RAMPO’S WORKS, Tetsuya YAMAMOTO, Yasuko NAKAMURA, Hideki OHIRA, Mingzhe JIN. PSYCHOLOGIA, 65(2), 284-295. https://doi.org/10.2117/psysoc.2023-b036, 2024/02.
[5] Analysis of stock market movement prediction with pre-trained language model, Jinyang LI, Mingzhe JIN, Hiroshi YADOHISHA. Artificial Intelligence Frontier, ISSN: 29580-1479, 1(2), 26-39, https://doi.org/10.55375/aif.2023.2.3, 2023/9/15.
[6] Authorship Attribution Using the Nucleus BunSetsu as Stylometric Features in Japanese Writings, Yejia LIU, Mingzhe JIN. Bulletin of Data Analysis of Japanese Classification Society, 12(1), 33-36, https://doi.org/10.32146/bdajcs.12.33, 2023/09.
[7] Improving the Performance of Feature Selection Methods with Low-Sample-Size Data, Wanwan ZHENG, Mingzhe JIN. The Computer Journal, 66(7), 1664-1686, https://doi.org/10.1093/comjnl/bxac033, 2023/7/9.
[8] Distinguishing ChatGPT(3.5, 4)-generated and human-written papers through Japanese stylometric analysis, Wataru ZAITSU, Mingzhe JIN. Public Library of Science (PLOS ONE), 18(8), https://doi.org/10.1371/journal.pone.0288453, 2023/8/9.
[9] Is Word-length Inaccurate for Authorship Attribution?, Wanwan ZHENG, Mingzhe JIN. DSH: Digital Scholarship in the Humanities, 38(2), 875-890, https://doi.org/10.1093/llc/fqac067, 2022/11/01.
[10] A Review on Authorship Attribution in Text Mining, Wanwan ZHENG, Mingzhe JIN. WIREs Computational Statistics, 15(2). 2022/4/4 accepted. First published: 2022/4/22, https://doi.org/10.1002/wics.1584
[11] Authorship Attribution in the Multi-genre Mingled Corpus, Yejia LIU, Mingzhe JIN. Bulletin of Data Analysis of Japanese Classification Society (in Japanese), 11(1), 1-14, https://doi.org/10.32146/bdajcs.11.1, 2022/3/29.
[12] Statistical Modeling and Analysis of Diachronic Changes in Sentence-final Expressions in Modern Novels, Guangwei LI, Mingzhe JIN. Mathematical Linguistics (in Japanese), 2022/06.
[13] A Corpus-based Approach to Explore the Stylistic Peculiarity of Koji Uno's Postwar Works, Xueqin LIU, Mingzhe JIN. DSH: Digital Scholarship in the Humanities, 37(1), 168–184, April 2022/3/23, https://doi.org/10.1093/llc/fqab029.
[14] The Effectiveness of the Maximal Information Coefficients in Real-World Classification Tasks, Yanru CHEN, Wanwan ZHENG, Mingzhe JIN. The Harris Science Review of Doshisha University (in Japanese), 62(3), 17-24, 2021/10/31.
[15] Discriminant Analysis for Corporate Bankruptcy using Financial Numerical and Textual Data, Limeng XU, Mingzhe JIN. Bulletin of Data Analysis of Japanese Classification Society (in Japanese), 10(1), 45–57, https://doi.org/10.32146/bdajcs.10.45, 2021.
[16] Modeling Analysis of Diachronic Changes in Auxiliary Words in Novels, Guangwei LI, Mingzhe JIN. Journal of Japan Society of Information and Knowledge (in Japanese), 31(3), 371-383, 2021/01/29.

Publications

(Sole Author)

ISBN: 4000298968, Fundamentals and Practices of Text Analytics, Iwanami Shoten (Tokyo, in Japanese), 2021

ISBN: 432011261X, Text Analytics, Kyoritsu Shuppan (Tokyo, in Japanese), 2018

ISBN: 462709602X, Data Science with R, Morikita Publishing (Tokyo, in Japanese), 2017

ISBN: 4320123689, Qualitative Data Analysis, Kyoritsu Shuppan (Tokyo, in Japanese), 2016