<?xml version="1.0"?>
<article xlink="http://www.w3.org/1999/xlink" mml="http://www.w3.org/1998/Math/MathML" xsi="http://www.w3.org/2001/XMLSchema-instance" ali="http://www.niso.org/schemas/ali/1.0/" noNamespaceSchemaLocation="http://jats.nlm.nih.gov/publishing/1.1/xsd/JATS-journalpublishing1-mathml3.xsd" article-type="research-article" dtd-version="1.1" lang="en"><front><journal-meta><journal-id journal-id-type="publisher-id">isrdo-SRJSET</journal-id><journal-id journal-id-type="pmc">isrdo-SRJSET</journal-id><journal-id journal-id-type="nlm-ta">isrdo-SRJSET</journal-id><journal-title-group><journal-title>Scientific Research Journal of Science, Engineering and Technology</journal-title><abbrev-journal-title abbrev-type="publisher" pub-type="epub">SRJSET</abbrev-journal-title></journal-title-group><issn>2584-0584</issn><publisher><publisher-name>ISRDO</publisher-name><publisher-loc>Gujarat, India</publisher-loc></publisher></journal-meta><article-meta><article-id pub-id-type="publisher-id">M-10141</article-id><article-id pub-id-type="doi"/><article-categories><subj-group subj-group-type="categories"><subject>Computer Science and Engineering</subject></subj-group></article-categories><title-group><article-title>Data-Driven Healthcare: Exploring Biomedical Text Mining Through NLP Models</article-title></title-group><contrib-group content-type="authors"><contrib id="197" contrib-type="author" corresp="yes"><name><given-names>Md Ariful Islam Sabbir</given-names></name><xref ref-type="aff" rid="aff-1">1</xref><aff id="aff-1"><label>0</label><institution>Shanghai University of Engineering Science</institution><country>China</country></aff></contrib></contrib-group><contrib-group content-type="editors"><contrib contrib-type="editor"/></contrib-group><pub-date pub-type="epub" data-type="pub" iso-8601-date="2025-01-17"><day>17</day><month>01</month><year iso-8601-date="2025">2025</year></pub-date><volume>2</volume><elocation-id>V2-I2-2024</elocation-id><history><date date-type="received" 
iso-8601-date="2024-10-04"><day>04</day><month>10</month><year iso-8601-date="2024">2024</year></date><date date-type="revised" iso-8601-date="2024-10-18"><day>18</day><month>10</month><year iso-8601-date="2024">2024</year></date><date date-type="accepted" iso-8601-date="2024-10-18"><day>18</day><month>10</month><year iso-8601-date="2024">2024</year></date></history><permissions><copyright-statement>&#xA9;2024 Md Ariful Islam Sabbir</copyright-statement><copyright-year>2024</copyright-year><copyright-holder>Md Ariful Islam Sabbir</copyright-holder><license href="https://creativecommons.org/licenses/by/4.0/"><license-p>This is an open access article distributed under the terms of the <ext-link ext-link-type="uri" href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution License</ext-link>, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (ISRDO) and either DOI or URL of the article must be cited.</license-p></license></permissions><self-uri href="https://isrdo.org/journal/SRJSET/currentissue/data-driven-healthcare-exploring-biomedical-text-mining-through-nlp-models-1"/><abstract><p>In recent years, the
expanding volume of biomedical literature, clinical notes, and electronic
health records (EHRs) has presented both a challenge and an opportunity for
healthcare improvement. Biomedical text mining, which employs natural language
processing (NLP) methods, is a viable approach for extracting useful
insights from unstructured biomedical data. This paper analyzes the role
of NLP models in facilitating data-driven healthcare, with an emphasis on core
tasks such as named entity recognition (NER), relation extraction (RE), and
text classification. We show how domain-specific NLP models such as BioBERT,
SciBERT, and ClinicalBERT have been built to cope with the intrinsic complexity
of biomedical language, such as ambiguous terminology, acronyms, and technical
jargon. Biomedical text mining has various healthcare applications, including
drug discovery and repurposing, clinical decision support, and pharmacovigilance. NLP
models allow more informed decision-making, improve patient outcomes, and speed
up personalized medicine research by automating the extraction of relevant
patterns from large-scale biomedical texts. This paper also highlights the key
challenges faced in biomedical text mining, such as data heterogeneity,
imbalanced datasets, and the demand for explainable AI. We then discuss
future directions for biomedical text mining, including the integration
of multimodal data, enhanced semantic understanding, and improved model
interpretability. Overall, this research illustrates how NLP-driven text mining
can turn unstructured data into relevant information in the healthcare
industry.</p></abstract><kwd-group kwd-group-type="author"><kwd>Natural Language Processing (NLP)</kwd><kwd>Biological Text Mining</kwd><kwd>Named Entity Recognition (NER)</kwd><kwd>BioBERT</kwd><kwd>Clinical Decision Support</kwd><kwd>Drug Discovery</kwd><kwd>Explainable AI</kwd></kwd-group><funding-group><funding-statement>no funding</funding-statement></funding-group></article-meta></front><back><sec sec-type="data-availability"><title>Data Availability</title><p>This research was carried out
following the ethical standards of the relevant institutional and national&#xD;
guidelines, with no ethical violations that could cause a conflict of interest.&#xD;
All necessary approvals have been obtained from ethical committees where&#xD;
applicable.&#xD;
&#xD;
The author takes full
responsibility for the content of this paper, and all views expressed are the author's
own and not influenced by third parties.</p></sec><sec sec-type="COI-statement"><title>Conflicts of Interest</title><p>
&#xD;
I, the author of this manuscript, declare that there are no conflicts of interest that could influence the research work presented in this paper. The study was conducted impartially, and the results were not affected by any financial, personal, or professional relationships that could be perceived as conflicts of interest.&#xD;
&#xD;
Potential Conflicts of Interest:&#xD;
The author confirms that there are no financial interests, such as funding, consultancy, ownership of stock or shares, or other forms of economic gain, that could affect the research outcomes. The author also declares that there are no personal relationships with organizations or individuals that could have influenced the research. Additionally, no institutional relationships or commitments affect the integrity and objectivity of this work.
&#xD;
Funding Disclosure:&#xD;
The author has not received financial support for this research. No sponsor influenced the study design, data collection, analysis, or interpretation of the findings, or interfered with the publication of the results.
&#xD;
Intellectual Property:&#xD;
The research presented in this manuscript does not have any undisclosed intellectual property interests, such as patents or commercialization potential, which might present a conflict.&#xD;
&#xD;
Ethical Compliance:&#xD;
This research was carried out following the ethical standards of the relevant institutional and national guidelines, with no ethical violations that could cause a conflict of interest. All necessary approvals have been obtained from ethical committees where applicable.&#xD;
&#xD;
The author takes full responsibility for the content of this paper, and all views expressed are the author's own and not influenced by third parties.
&#xD;
Acknowledgements:&#xD;
I have disclosed all sources of support for this research in the acknowledgments section of the paper. Any collaborations or assistance received in the preparation of this manuscript have also been properly acknowledged.&#xD;
&#xD;
Signed by the author: Md Ariful Islam Sabbir, Shanghai University of Engineering Science
Date: 10 October 2024</p></sec><sec sec-type="author-contributions"><title>Authors&#x2019; Contributions</title><p>Md Ariful Islam Sabbir prepared all sections.</p></sec><sec sec-type="funding-statement"><title>Funding Statement</title><p>No funding was received for this research.</p></sec><ack><title>Acknowledgments</title><p>
I have disclosed all sources of support for this research in the acknowledgments section of the paper. Any collaborations or assistance received in the preparation of this manuscript have also been properly acknowledged.&#xD;
Signed by the author: Md Ariful Islam Sabbir, Shanghai University of Engineering Science
Date: 10 October 2024</p></ack><ref-list content-type="authoryear"><ref id="1"><label>1</label><element-citation publication-type="journal"><p>[1]	A. I. Stoumpos, F. Kitsios, and M. A. Talias, &#x201C;Digital Transformation in Healthcare: Technology Acceptance and Its Applications,&#x201D; Int. J. Environ. Res. Public Health, vol. 20, no. 4, 2023, doi: 10.3390/ijerph20043407.
[2]	S. Zilcha-Mano, M. J. Constantino, and C. F. Eubanks, &#x201C;Evidence-Based Tailoring of Treatment to Patients, Providers, and Processes: Introduction to the Special Issue,&#x201D; J. Consult. Clin. Psychol., vol. 90, no. 1, pp. 1&#x2013;4, 2022, doi: 10.1037/ccp0000694.&#xD;
[3]	M. A. Razzaqe and T. Basak, &#x201C;Text mining in unstructured text: techniques, methods and analysis,&#x201D; World Sci. News An Int. Sci. J., no. 174, pp. 76&#x2013;92, 2022, [Online]. Available: www.worldscientificnews.com&#xD;
[4]	T. ValizadehAslani et al., &#x201C;PharmBERT: a domain-specific BERT model for drug labels,&#x201D; Brief. Bioinform., vol. 24, no. 4, pp. 1&#x2013;10, 2023, doi: 10.1093/bib/bbad226.&#xD;
[5]	P. Pilipiec, M. Liwicki, and A. Bota, &#x201C;Using Machine Learning for Pharmacovigilance: A Systematic Review,&#x201D; Pharmaceutics, vol. 14, no. 2, pp. 1&#x2013;25, 2022, doi: 10.3390/pharmaceutics14020266.&#xD;
[6]	M. Rashida, F. Iffath, R. Karim, and M. S. A. B, Trends and Techniques of Biomedical Text Mining: A Review, vol. 1. Springer International Publishing. doi: 10.1007/978-3-030-93247-3.
[7]	T. Alam and S. Schmeier, &#x201C;Deep Learning in Biomedical Text Mining: Contributions and Challenges&#x201D;.
[8]	J. Lee et al., &#x201C;BioBERT: a pre-trained biomedical language representation model for biomedical text mining,&#x201D; Bioinformatics, no. September, pp. 1&#x2013;7, 2019, doi: 10.1093/bioinformatics/btz682.
[9]	J. Lee et al., &#x201C;BioBERT: pre-trained biomedical language representation model for biomedical text mining,&#x201D; pp. 1&#x2013;8, 2019.
[10]	&#x201C;About PMC - PMC.&#x201D; Accessed: Oct. 04, 2024. [Online]. Available: https://www.ncbi.nlm.nih.gov/pmc/about/intro/&#xD;
[11]	&#x201C;ClinicalTrials.gov &#x2013; What, Why, Which Studies, When | Office of Human Research Affairs.&#x201D; Accessed: Oct. 04, 2024. [Online]. Available: https://www.bumc.bu.edu/ohra/clinicaltrials-gov/clinicaltrials-gov-what-why-which-studies-when/&#xD;
[12]	&#x201C;ISO - Electronic health records explained.&#x201D; Accessed: Oct. 04, 2024. [Online]. Available: https://www.iso.org/healthcare/electronic-health-records&#xD;
[13]	L. Zhao, W. Alhoshan, A. Ferrari, and K. J. Letsholo, &#x201C;Classification of Natural Language Processing Techniques for Requirements Engineering&#x201D;.&#xD;
[14]	L. Fu, Z. Weng, J. Zhang, H. Xie, and Y. Cao, &#x201C;MMBERT: a unified framework for biomedical named entity recognition,&#x201D; pp. 327&#x2013;341, 2024, doi: 10.1007/s11517-023-02934-8.
[15]	M. Huang, P. Lai, P. Lin, Y. You, R. T. Tsai, and W. Hsu, &#x201C;Biomedical named entity recognition and linking datasets: survey and our recent development,&#x201D; vol. 21, no. June, pp. 2219&#x2013;2238, 2020, doi: 10.1093/bib/bbaa054.
[16]	H. Cho and H. Lee, &#x201C;Biomedical named entity recognition using deep neural networks with contextual information,&#x201D; pp. 1&#x2013;11, 2019.&#xD;
[17]	Y. J. Park, G. J. Yang, C. B. Sohn, and S. J. Park, &#x201C;GPDminer: a tool for extracting named entities and analyzing relations in biological literature,&#x201D; BMC Bioinformatics, pp. 1&#x2013;18, 2024, doi: 10.1186/s12859-024-05710-z.
[18]	C. Y. Kesiku and A. Chaves-villota, &#x201C;Natural Language Processing Techniques for Text Classification of Biomedical Documents: A Systematic Review,&#x201D; 2022.
[19]	J. Li et al., &#x201C;A comparative study of pre-trained language models for named entity recognition in clinical trial eligibility criteria from multiple corpora,&#x201D; BMC Med. Inform. Decis. Mak., vol. 7, pp. 1&#x2013;9, 2022, doi: 10.1186/s12911-022-01967-7.
[20]	I. Beltagy, K. Lo, and A. Cohan, &#x201C;SciBERT: A Pretrained Language Model for Scientific Text,&#x201D; 2019.
[21]	K. Huang, J. Altosaar, and R. Ranganath, &#x201C;ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission&#x201D;.
[22]	&#x201C;Fine-tune a pretrained model.&#x201D; Accessed: Oct. 04, 2024. [Online]. Available: https://huggingface.co/docs/transformers/training&#xD;
[23]	M. Neumann, D. King, I. Beltagy, and W. Ammar, &#x201C;ScispaCy: Fast and Robust Models for Biomedical Natural Language Processing,&#x201D; pp. 319&#x2013;327, 2019.
[24]	S. M. Jain, Introduction to Transformers for NLP With the Hugging Face Library. &#xD;
[25]	R. Yacouby, &#x201C;Probabilistic Extension of Precision, Recall, and F1 Score for More Thorough Evaluation of Classification Models,&#x201D; pp. 79&#x2013;91, 2020.</p></element-citation></ref></ref-list></back></article>
