Publications
2025
Ontology-based Protein-Protein Interaction Explanation Using Large Language Models
Nur Bengisu Çam, Hasin Rehana, Jie Zheng, Benu Bansal, Yongqun He, Junguk Hur, Arzucan Özgür
bioRxiv
Evaluating Large Language Models in Data Generation for Low-Resource Scenarios: A Case Study on Question Answering
Ebru Arisoy, Merve Unlu Menevse, Yusufcan Manav, Arzucan Ozgur
GNNMutation: a heterogeneous graph-based framework for cancer detection
Nuriye Özlem Özcan Şimşek, Arzucan Özgür, Fikret Gürgen
BMC Bioinformatics
The BU-MEF System for the Speak & Improve Challenge 2025: Spoken Language Assessment Using Speech and Textual Representations
Merve Unlu Menevse, Ebru Arisoy, Arzucan Ozgur
Alterations in Gut Microbiota–Brain Axis in Major Depressive Disorder as Identified by Machine Learning
Atacan Deniz Oncu, Arzucan Ozgur, Kutlu O Ulgen
OMICS: A Journal of Integrative Biology
Cancer Vaccine Adjuvant Name Recognition from Biomedical Literature using Large Language Models
Hasin Rehana, Jie Zheng, Leo Yeh, Benu Bansal, Nur Bengisu Çam, Christianah Jemiyo, Brett McGregor, Arzucan Özgür, Yongqun He, Junguk Hur
arXiv preprint arXiv:2502.09659
HATECAT-TR: A Hate Speech Span Detection and Categorization Dataset for Turkish
Hasan Kerem Seker, Gökçe Uludogan, Pelin Önal, Arzucan Özgür
evoBPE: Evolutionary protein sequence tokenization
Burak Suyunu, Özdeniz Dolu, Arzucan Özgür
arXiv preprint arXiv:2503.08838
Hashtag activism and framing strategies in the aftermath of George Floyd’s death and the 2020 elections
Basak Taraktas, Kadir Cihan Duran, Suzan Uskudarli
Politics, Groups, and Identities
This article delves into online strategies to demand accountability for Floyd’s murder amid the polarized context of the 2020 presidential elections and the Chauvin trial. By applying signaling theory to the study of hashtag activism, we examine how users strategically emphasized and deemphasized Floyd’s death to adapt to contextual sensitivities. Our analysis, based on an original dataset of approximately 6,000,000 tweets (January 2020–December 2021), employs statistical tools and network analysis to uncover temporal patterns in users’ framing strategies related to Floyd’s death. Users emphasized Floyd’s case, policing reforms, and systemic racism during the summer of 2020, transitioning to broader themes during the elections, and refocused on accountability and justice during the Chauvin trial. This article proposes a novel theoretical application of the signaling theory to the study of online activism, through observable metrics – tweet volume, duration, and hashtag combinations. These metrics capture when and by which messaging activists strategically raise issue salience. Our findings shed light on the differences in activist strategies along partisan lines, with Democrats predominantly associated with justice demands and Republicans with grievances.
VO: The Vaccine Ontology
Jie Zheng, Asiyah Yu Lin, Anthony Huffman, Anna Maria Masci, Rebecca Racz, Guanming Wu, Kallan Roan, Edison Ong, Sirarat Sarntivijai, Joy Hu, Eliyas Asfaw, Hayleigh Kahn, Xingxian Li, Xumeng Zhang, Nilufer Kosar, Jianfu Li, Warren Manuel, Rashmie Abeysinghe, Hasin Rehana, Benu Bansal, Yuanyi Pan, Jinjing Guo, Virginia He, Justin Song, Andrey I Seleznev, Katelyn Hur, Anna He, Alexander Davydov, Qi Yang, Randi Vita, Bjoern Peters, Alan Ruttenberg, Alexander D Diehl, Charles Tapley Hoyt, Paola Roncaglia, Rachael P Huntley, Richard H Scheuermann, Melanie Courtot, Thomas Todd, Samantha Sayers, Fang Chen, Xinna Li, Feng-Yu Yeh, Zuoshuang Xiang, Arzucan Ozgur, Patricia L Whetzel, Mark A Musen, Christopher J Mungall, Wolfgang W Leitner, Licong Cui, Lesley A Colby, Harry LT Mobley, Brian D Athey, Gilbert S Omenn, Lindsay G Cowell, Cui Tao, Junguk Hur, Barry Smith, Yongqun He
bioRxiv
2024
Evaluating the quality of a corpus annotation scheme using pretrained language models
Furkan Akkurt, Onur Güngör, Büşra Marşan, Tunga Güngör, Balkız Öztürk Başaran, Arzucan Özgür, Susan Üsküdarlı
Dealing With Data Scarcity in Spoken Question Answering
Ebru Arısoy, Arzucan Özgür, Merve Ünlü Menevşe, Yusufcan Manav
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), Main Conference Proceedings, 20-25 May 2024, Torino (hybrid)
Do activists prioritize solutions over grievances? A Twitter Study of Black Lives Matter
B. Taraktas, K. C. Duran, S. Üsküdarli
Marmara Üniversitesi Siyasal Bilimler Dergisi
Do social movements shift the focus of their framing from grievances to tactics as they mature? This paper examines the nature of the frames that social movements and activists co-create using the case of Black Lives Matter (BLM). Building on Snow & Benford (1988), we explore whether BLM’s frames have evolved from diagnostic to prognostic frames since the movement’s emergence. We compiled a novel dataset of 269,963 tweets sent under the hashtag “BlackLivesMatter” from Jan. 01, 2020, to Dec. 31, 2021. Using time series and network analysis, we show that frames do not naturally evolve from diagnostic to prognostic frames as movements mature. We find that BLM activists increasingly use prognostic frames while expressing their grievances because injustices and discrimination toward Black people continue. The evidence suggests that tweets on tactics and solutions outnumber the grievance-related frames only after Chauvin’s guilty plea alleviates grievances.
Generative language models on nucleotide sequences of human genes
Musa Nuri Ihtiyar, Arzucan Özgür
Scientific Reports
Evaluating the Quality of a Corpus Annotation Scheme Using Pretrained Language Models
Furkan Akkurt, Onur Gungor, Büşra Marşan, Tunga Gungor, Balkiz Ozturk Basaran, Arzucan Özgür, Susan Uskudarli
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Pretrained language models and large language models are increasingly used to assist in a great variety of natural language tasks. In this work, we explore their use in evaluating the quality of alternative corpus annotation schemes. For this purpose, we analyze two alternative annotations of the Turkish BOUN treebank, versions 2.8 and 2.11, in the Universal Dependencies framework using large language models. Using a suitable prompt generated using treebank annotations, large language models are used to recover the surface forms of sentences. Based on the idea that the large language models capture the characteristics of the languages, we expect that the better annotation scheme would yield the sentences with higher success. The experiments conducted on a subset of the treebank show that the new annotation scheme (2.11) results in a successful recovery percentage of about 2 points higher. All the code developed for this work is available at https://github.com/boun-tabi/eval-ud .
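The comparison described in this abstract comes down to a recovery percentage per annotation scheme. The following is a minimal sketch of such a metric, assuming an exact-match criterion after whitespace normalization; the function name `recovery_rate`, the matching rule, and the toy sentences are illustrative assumptions and are not taken from the paper or its repository.

```python
from typing import Sequence


def recovery_rate(originals: Sequence[str], recovered: Sequence[str]) -> float:
    """Percentage of sentences whose recovered surface form matches the original.

    Assumption: a sentence counts as recovered only on an exact match
    after collapsing runs of whitespace.
    """
    if len(originals) != len(recovered):
        raise ValueError("both lists must have the same length")
    if not originals:
        return 0.0

    def normalize(sentence: str) -> str:
        return " ".join(sentence.split())

    hits = sum(normalize(o) == normalize(r) for o, r in zip(originals, recovered))
    return 100.0 * hits / len(originals)


# Hypothetical toy data: four original sentences and the surface forms
# produced from two different annotation schemes.
originals = ["Kitabı okudum.", "Eve gittik.", "Hava güzel.", "Onu gördüm."]
from_scheme_a = ["Kitabı okudum.", "Ev gittik.", "Hava güzel.", "O gördüm."]
from_scheme_b = ["Kitabı  okudum.", "Eve gittik.", "Hava güzel.", "O gördüm."]

rate_a = recovery_rate(originals, from_scheme_a)
rate_b = recovery_rate(originals, from_scheme_b)
```

Under these assumptions, the scheme with the higher rate is the one whose annotations let the model reconstruct more sentences.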
Dealing with Data Scarcity in Spoken Question Answering
Merve Ünlü Menevşe, Yusufcan Manav, Ebru Arisoy, Arzucan Özgür
Evaluating GPT and BERT models for protein–protein interaction identification in biomedical text
Hasin Rehana, Nur Bengisu Çam, Mert Basmaci, Jie Zheng, Christianah Jemiyo, Yongqun He, Arzucan Özgür, Junguk Hur
Bioinformatics Advances
Leveraging Large Language Models for Extracting Protein-Protein Interactions from Biomedical Corpora
Hasin Rehana, Nur Bengisu Çam, Mert Basmaci, Jie Zheng, Christianah Jemiyo, Yongqun He, Arzucan Özgür, Junguk Hur
Nested named entity recognition using multilayer BERT-based model
Hasin Rehana, Benu Bansal, Nur Bengisu Çam, Jie Zheng, Yongqun He, Arzucan Özgür, Junguk Hur
CLEF Working Notes
Linguistic laws meet protein sequences: A comparative analysis of subword tokenization methods
Burak Suyunu, Enes Taylan, Arzucan Özgür
IEEE
Tweeting through a public health crisis: Communication strategies of right-wing populist leaders during the COVID-19 pandemic
Başak Taraktaş, Berk Esen, Suzan Uskudarli
Government and Opposition
Do Activists Prioritize Solutions Over Grievances? A Twitter Study of Black Lives Matter
Basak Taraktas, Kadir Cihan Duran, Susan Üsküdarlı
Marmara Üniversitesi Siyasal Bilimler Dergisi
Exploring data‐driven chemical SMILES tokenization approaches to identify key protein–ligand binding moieties
Asu Busra Temizer, Gökçe Uludoğan, Rıza Özçelik, Taha Koulani, Elif Ozkirimli, Kutlu O Ulgen, Nilgun Karali, Arzucan Özgür
Molecular Informatics
Detecting Hate Speech in Turkish Print Media: A corpus and a hybrid approach with target-oriented linguistic knowledge
Gökçe Uludoğan, Atıf Emre Yüksel, Ümit Tunçer, Burak Işık, Yasemin Korkmaz, Didar Akar, Arzucan Özgür
Overview of the Hate Speech Detection in Turkish and Arabic Tweets (HSD-2Lang) Shared Task at CASE 2024
Gökçe Uludoğan, Somaiyeh Dehghan, Inanç Arın, Elif Erol, Berrin Yanıkoğlu, Arzucan Özgür
TURNA: A Turkish Encoder-Decoder Language Model for Enhanced Understanding and Generation
Gökçe Uludoğan, Zeynep Balal, Furkan Akkurt, Meliksah Turker, Onur Gungor, Susan Üsküdarlı
https://aclanthology.org/2024.findings-acl.600
TURNA: A Turkish Encoder-Decoder Language Model for Enhanced Understanding and Generation
Gökçe Uludoğan, Zeynep Balal, Furkan Akkurt, Meliksah Turker, Onur Gungor, Susan Üsküdarlı
Findings of the Association for Computational Linguistics ACL 2024
The recent advances in natural language processing have predominantly favored well-resourced English-centric models, resulting in a significant gap with low-resource languages. In this work, we introduce TURNA, a language model developed for the low-resource language Turkish that is capable of both natural language understanding and generation tasks. TURNA is pretrained with an encoder-decoder architecture based on the unified framework UL2 with a diverse corpus that we specifically curated for this purpose. We evaluated TURNA with three generation tasks and five understanding tasks for Turkish. The results show that TURNA outperforms several multilingual models in both understanding and generation tasks and competes with monolingual Turkish models in understanding tasks.
Incorporating Knowledge Graph Embeddings into Graph Neural Networks for Sequential Recommender Systems
Kazim Emre Yüksel, Susan Üsküdarli
IEEE
Incorporating Knowledge Graph Embeddings into Graph Neural Networks for Sequential Recommender Systems
Kazim Emre Yüksel, Susan Üsküdarli
2024 9th International Conference on Computer Science and Engineering (UBMK)
2023
Evaluation of ChatGPT and BERT-based models for Turkish hate speech detection
Nur Bengisu Çam, Arzucan Özgür
IEEE
SIU2023-NST - Hate speech detection contest
Inanç Arın, Zeynep Işık, Seçilay Kutal, Somaiyeh Dehghan, Arzucan Özgür, Berrin Yanikoğlu
IEEE
Can We Explain Privacy?
Gönül Aycı, Arzucan Özgür, Murat Şensoy, Pınar Yolum
IEEE Internet Computing
Explain to Me: Towards Understanding Privacy Decisions.
Gonul Ayci, Pinar Yolum, Arzucan Özgür, Murat Sensoy
PEAK: Explainable Privacy Assistant through Automated Knowledge Extraction
Gonul Ayci, Arzucan Özgür, Murat Şensoy, Pınar Yolum
arXiv preprint arXiv:2301.02079
Uncertainty-aware personal assistant for making personalized privacy decisions
Gonul Ayci, Murat Sensoy, Arzucan Özgür, Pinar Yolum
ACM Transactions on Internet Technology
A Computational Software for Training Robust Drug–Target Affinity Prediction Models: pydebiaseddta
Melih Barsbey, Rıza Özçelik, Alperen Bağ, Berk Atil, Arzucan Özgür, Elif Ozkirimli
Journal of Computational Biology
Pattern recognition for healthcare analytics
İnci M Baytaş, Yifan Peng, Arzucan Özgür
Frontiers Media SA
Improving the filtering of false positive single nucleotide variations by combining genomic features with quality metrics
Kazım Kıvanç Eren, Esra Çınar, Hamza U Karakurt, Arzucan Özgür
Bioinformatics
Visualizing Software Repositories Through Requirements Trace Links
Kadir Ersoy, Ecenur Sezer, Susan Üsküdarlı, Fatma Başak Aydemir
IEEE
A dataset for investigating the impact of context for offensive language detection in tweets
Musa Ihtiyar, Ömer Özdemir, Mustafa Erengül, Arzucan Özgür
Visualizing Software Repositories Through Requirements Trace Links
Kadir Ersoy, Ecenur Sezer, Susan Uskudarli, Fatma Başak Aydemir
2023 IEEE 31st International Requirements Engineering Conference Workshops (REW)
TULAP-An accessible and sustainable platform for Turkish natural language processing resources
Susan Üsküdarlı, Muhammet Şen, Furkan Akkurt, Merve Gürbüz, Onur Güngör, Arzucan Özgür, Tunga Güngör
TULAP - An Accessible and Sustainable Platform for Turkish Natural Language Processing Resources
Susan Uskudarli, Muhammet Şen, Furkan Akkurt, Merve Gürbüz, Onur Gungor, Arzucan Özgür, Tunga Güngör
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations
Access to natural language processing resources is essential for their continuous improvement. This can be especially challenging in educational institutions where the software development effort required to package and release research outcomes may be overwhelming and under-recognized. Access to well-prepared and reliable research outcomes is important both for their developers and for the greater research community. This paper presents an approach to address this concern with two main goals: (1) to create an open-source, easily deployable platform where resources can be easily shared and explored, and (2) to use this platform to publish open-source Turkish NLP resources (datasets and tools) created by a research lab. The Turkish Natural Language Processing platform (TULAP) was designed and developed as an easy-to-use platform to share dataset and tool resources, with support for interactive tool demos. Numerous open access Turkish NLP resources have been shared on TULAP. All tools are containerized to support portability for custom use. This paper describes the design, implementation, and deployment of TULAP with use cases (available at https://tulap.cmpe.boun.edu.tr/). A short video demonstrating our system is available at https://figshare.com/articles/media/TULAP_Demo/22179047.
A Framework for Improving the Generalizability of Drug–Target Affinity Prediction Models
Rıza Özçelik, Alperen Bağ, Berk Atil, Melih Barsbey, Arzucan Özgür, Elif Ozkirimli
Journal of Computational Biology
2022
Boğaziçi University Annotation Tool (BoAT)-Web
Salih Furkan Akkurt, Susan Uskudarli
Boğaziçi University
BoAT v2 -- A Web-Based Dependency Annotation Tool with Focus on Agglutinative Languages
Salih Furkan Akkurt, Büşra Marşan, Susan Uskudarli
https://arxiv.org/abs/2207.01327
BoAT v2 - A Web-Based Dependency Annotation Tool with Focus on Agglutinative Languages
Salih Furkan Akkurt, Büşra Marşan, Susan Uskudarli
Proceedings of the ALTNLP The International Conference and workshop on Agglutinative Language Technologies as a challenge of Natural Language Processing
Cluster-based mention typing for named entity disambiguation
Arda Çelebi, Arzucan Özgür
Natural Language Engineering
Identifying hate speech using neural networks and discourse analysis techniques
Zehra Melce Hüsünbeyi, Didar Akar, Arzucan Özgür
A SHAP-based active learning approach for creating high-quality training data
Nailcan Kara, Yagiz Levent Gume, Umit Tigrak, Gokce Ezeroglu, Serdar Mola, Omer Burak Akgun, Arzucan Özgür
IEEE
BOUN Treebank v2.11
Büşra Marşan, Furkan Akkurt, Suzan Üsküdarlı, Tunga Güngör, Balkız Öztürk, Arzucan Özgür, Onur Güngör, Muhammet Şen, Merve Gürbüz, Utku Türk, Talha Bedir, Şaziye Betül Özateş
Boğaziçi University
Enhancements to the BOUN treebank reflecting the agglutinative nature of Turkish
Büşra Marşan, Salih Furkan Akkurt, Muhammet Şen, Merve Gürbüz, Onur Güngör, Şaziye Betül Özateş, Suzan Üsküdarlı, Arzucan Özgür, Tunga Güngör, Balkız Öztürk
arXiv preprint arXiv:2207.11782
Enhancements to the BOUN Treebank Reflecting the Agglutinative Nature of Turkish
Büşra Marşan, Salih Furkan Akkurt, Muhammet Şen, Merve Gürbüz, Onur Güngör, Şaziye Betül Özateş, Suzan Üsküdarlı, Arzucan Özgür, Tunga Güngör, Balkız Öztürk
Proceedings of the ALTNLP The International Conference and workshop on Agglutinative Language Technologies as a challenge of Natural Language Processing
A framework for automatic generation of spoken question-answering data
Merve Ünlü Menevşe, Yusufcan Manav, Ebru Arisoy, Arzucan Özgür
A Dataset and BERT-based Models for Targeted Sentiment Analysis on Turkish Texts
M Melih Mutlu, Arzucan Özgür
arXiv e-prints
Dataset for Targeted Sentiment Analysis in Turkish
Mustafa Melih Mutlu, Arzucan Özgür
Boğaziçi University
Tweeting through a Public Health Crisis: Communication Strategies of Right-Wing Populist Leaders during the COVID-19 Pandemic
Başak Taraktaş, Berk Esen, Suzan Uskudarli
Government and Opposition
BOUN Treebank
Utku Türk, Furkan Atmaca, Şaziye Betül Özateş, Gözde Berk, Seyyit Talha Bedir, Abdullatif Köksal, Balkız Öztürk Başaran, Tunga Güngör, Arzucan Özgür
Boğaziçi University
Resources for Turkish dependency parsing: Introducing the BOUN treebank and the BoAT annotation tool
Utku Türk, Furkan Atmaca, Şaziye Betül Özateş, Gözde Berk, Seyyit Talha Bedir, Abdullatif Köksal, Balkız Öztürk Başaran, Tunga Güngör, Arzucan Özgür
Language Resources and Evaluation
Interpreting Chemical Words of a Data-driven Segmentation Method as Protein Family Pharmacophores and Functional Groups
Kutlu O Ulgen, Nilgün Karalı, Arzucan Özgür
arXiv preprint arXiv:2210.14642
Exploiting pretrained biochemical language models for targeted drug design
Gökçe Uludoğan, Elif Ozkirimli, Kutlu O Ulgen, Nilgün Karalı, Arzucan Özgür
Bioinformatics
A hybrid deep dependency parsing approach enhanced with rules and morphology: A case study for Turkish
Şaziye Betül Özateş, Arzucan Özgür, Tunga Güngör, Balkız Öztürk Başaran
IEEE Access
Improving Code-Switching Dependency Parsing with Semi-Supervised Auxiliary Tasks
Şaziye Betül Özateş, Arzucan Özgür, Tunga Güngör, Özlem Çetinoğlu
2021
Overcoming the challenges in morphological annotation of Turkish in universal dependencies framework
Talha Bedir, Karahan Şahin, Onur Güngör, Suzan Uskudarli, Arzucan Özgür, Tunga Güngör, Balkız Öztürk Başaran
Overcoming the challenges in morphological annotation of Turkish in universal dependencies framework
Talha Bedir, Karahan Şahin, Onur Güngör, Suzan Uskudarli, Arzucan Özgür, Tunga Güngör, Balkız Öztürk Başaran
Proceedings of The Joint 15th Linguistic Annotation Workshop (LAW) and 3rd Designing Meaning Representations (DMR) Workshop
This paper presents several challenges faced when annotating Turkish treebanks in accordance with the Universal Dependencies (UD) guidelines and proposes solutions to address them. Most of these challenges stem from the lack of adequate support in the UD framework to accurately represent null morphemes and complex derivations, which results in a significant loss of information for Turkish. This loss negatively impacts the tools that are developed based on these treebanks. We raised and discussed these issues within the community on the official UD portal. This paper presents these issues and our proposals to more accurately represent morphosyntactic information for Turkish while adhering to the guidelines of UD. This work aims to contribute to the representation of Turkish and other agglutinative languages in UD-based treebanks, which in turn aids in developing more accurately annotated datasets for such languages.
Crowdsourced mapping of unexplored target space of kinase inhibitors
Anna Cichońska, Balaguru Ravikumar, Robert J Allaway, Fangping Wan, Sungjoon Park, Olexandr Isayev, Shuya Li, Michael Mason, Andrew Lamb, Ziaurrehman Tanoli, Minji Jeon, Sunkyu Kim, Mariya Popova, Stephen Capuzzi, Jianyang Zeng, Kristen Dang, Gregory Koytiger, Jaewoo Kang, Carrow I Wells, Timothy M Willson, IDG-DREAM Drug-Kinase Binding Prediction Challenge Consortium: User oselot (Mehmet Tan); Team N121 (Chih-Han Huang, Edward SC Shih, Tsai-Min Chen, Chih-Hsun Wu, Wei-Quan Fang, Jhih-Yu Chen, Ming-Jing Hwang); Team Let_Data_Talk (Xiaokang Wang, Marouen Ben Guebila, Behrouz Shamsaei, Sourav Singh); User thinng (Thin Nguyen); Team KKT (Mostafa Karimi, Di Wu, Zhangyang Wang, Yang Shen); Team Boun (Hakime Öztürk, Elif Ozkirimli, Arzucan Özgür); Team KinaseHunter (Hansaim Lim, Lei Xie); Team AmsterdamUMC-KU-team (Georgi K. Kanev, Albert J. Kooistra, Bart A. Westerman); Team DruginaseLearning (Panagiotis Terzopoulos, Konstantinos Ntagiantas, Christos Fotis, Leonidas Alexopoulos); Team KERMIT-LAB-Ghent University (Dimitri Boeckaerts, Michiel Stock, Bernard De Baets, Yves Briers); Team QED (Yunan Luo, Hailin Hu, Jian Peng); Team METU_EMBLEBI_CROssBAR (Tunca Dogan, Ahmet S. Rifaioglu, Heval Atas, Rengul Cetin Atalay, Volkan Atalay, Maria J. Martin); Team DMIS_DK (Minji Jeon, Junhyun Lee, Seongjun Yun, Bumsoo Kim, Buru Chang); Team AI Winter is Coming; Team hulab (Gábor Turu, Ádám Misák, Bence Szalai, László Hunyady); Team ML-Med (Matthias Lienhard, Paul Prasse, Ivo Bachmann, Julia Ganzlin, Gal Barel, Ralf Herwig); Team Prospectors (Davor Oršolić, Bono Lučić, Višnja Stepanić, Tomislav Šmuc); Challenge organizers
Nature Communications
Re-narration as a basis for accessibility and inclusion on the World Wide Web
T Dinesh, V Choppella, S Uskudarli
Balancing Methods for Multi-label Text Classification with Long-Tailed Class Distribution
Yi Huang, Buse Giledereli, Abdullatif Köksal, Arzucan Özgür, Elif Ozkirimli
PIDNA at BioASQ MESINESP: Hybrid Semantic Indexing for Biomedical Articles in Spanish.
Yi Huang, Buse Giledereli, Abdullatif Köksal, Arzucan Özgür, Elif Ozkirimli
A novel gene selection method for gene expression data for the task of cancer type classification
N Özlem Özcan Şimşek, Arzucan Özgür, Fikret Gürgen
Biology Direct
BOUN at SemEval-2021 Task 9: Text Augmentation Techniques for Fact Verification in Tabular Data
Abdullatif Köksal, Yusuf Yüksel, Bekir Yıldırım, Arzucan Özgür
Twitter dataset and evaluation of transformers for Turkish sentiment analysis
Abdullatif Köksal, Arzucan Özgür
IEEE
Sentiment analysis of customer comments in banking using BERT-based approaches
Melik Masarifoglu, Umit Tigrak, Sefa Hakyemez, Guven Gul, Erdal Bozan, Ali Hakan Buyuklu, Arzucan Özgür
IEEE
BOUN Treebank v2.8
Utku Türk, Furkan Atmaca, Şaziye Betül Özateş, Gözde Berk, Seyyit Talha Bedir, Abdullatif Köksal, Balkız Öztürk Başaran, Tunga Güngör, Arzucan Özgür
Boğaziçi University
Machine Learning Methodologies to Study Molecular Interactions
Artur Yakimovich, Arzucan Özgür, Tunca Doğan, Elif Ozkirimli
Frontiers Media SA
ChemBoost: A chemical language based approach for protein–ligand binding affinity prediction
Rıza Özçelik, Hakime Öztürk, Arzucan Özgür, Elif Ozkirimli
Molecular Informatics
2020
Analyzing ELMo and DistilBERT on socio-political news classification
Berfu Büyüköz, Ali Hürriyetoğlu, Arzucan Özgür
BOUN-REX at CLEF-2020 ChEMU Task 2: Evaluating Pretrained Transformers for Event Extraction
Hilal Dönmez, Abdullatif Köksal, Elif Ozkirimli, Arzucan Ozgür
EXSEQREG: Explaining sequence-based NLP tasks with regions with a case study using morphological features for named entity recognition
Onur Güngör, Tunga Güngör, Suzan Uskudarli
PLOS ONE
EXSEQREG: Explaining sequence-based NLP tasks with regions with a case study using morphological features for named entity recognition
Onur Güngör, Tunga Güngör, Suzan Uskudarli
PLOS ONE
The state-of-the-art systems for most natural language engineering tasks employ machine learning methods. Despite the improved performances of these systems, there is a lack of established methods for assessing the quality of their predictions. This work introduces a method for explaining the predictions of any sequence-based natural language processing (NLP) task implemented with any model, neural or non-neural. Our method named EXSEQREG introduces the concept of region that links the prediction and features that are potentially important for the model. A region is a list of positions in the input sentence associated with a single prediction. Many NLP tasks are compatible with the proposed explanation method as regions can be formed according to the nature of the task. The method models the prediction probability differences that are induced by careful removal of features used by the model. The output of the method is a list of importance values. Each value signifies the impact of the corresponding feature on the prediction. The proposed method is demonstrated with a neural network based named entity recognition (NER) tagger using Turkish and Finnish datasets. A qualitative analysis of the explanations is presented. The results are validated with a procedure based on the mutual information score of each feature. We show that this method produces reasonable explanations and may be used for i) assessing the degree of the contribution of features regarding a specific prediction of the model, ii) exploring the features that played a significant role for a trained model when analyzed across the corpus.
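The idea of scoring features by the probability change their removal induces can be illustrated with a simplified sketch. This is a generic removal-based (occlusion-style) importance computation, not the EXSEQREG procedure itself; the toy model, its weights, and the feature names are hypothetical and chosen only to make the example runnable.

```python
from typing import Callable, Dict


def removal_importances(
    predict_proba: Callable[[Dict[str, float]], float],
    features: Dict[str, float],
) -> Dict[str, float]:
    """Score each feature by the drop in predicted probability when it is removed."""
    baseline = predict_proba(features)
    scores = {}
    for name in features:
        # Re-run the model with this single feature left out.
        reduced = {k: v for k, v in features.items() if k != name}
        scores[name] = baseline - predict_proba(reduced)
    return scores


# Hypothetical stand-in for a tagger: a fixed linear score clipped to [0, 1].
TOY_WEIGHTS = {"capitalized": 0.5, "locative_suffix": 0.25, "in_gazetteer": 0.125}


def toy_predict_proba(features: Dict[str, float]) -> float:
    score = sum(TOY_WEIGHTS.get(k, 0.0) * v for k, v in features.items())
    return min(1.0, max(0.0, score))


# Features attached to one region (the positions tied to a single prediction).
region_features = {"capitalized": 1.0, "locative_suffix": 1.0, "in_gazetteer": 1.0}
importances = removal_importances(toy_predict_proba, region_features)
ranked = sorted(importances, key=importances.get, reverse=True)
```

In this sketch a larger value means the prediction relied more on that feature; real taggers would need retraining-free ways to mask features, which is an assumption here.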
The Role of Contextual Word Embeddings in Correcting the 'de/da' Clitic Errors in Turkish
H Öztürk, A Değirmenci, O Güngör, S Uskudarli
IEEE
An extended overview of the CLEF 2020 ChEMU lab: information extraction of chemical reactions from patents
Jiayuan He, Dat Quoc Nguyen, Saber A Akhondi, Christian Druckenbrodt, Camilo Thorne, Ralph Hoessel, Zubair Afzal, Zenan Zhai, Biaoyan Fang, Hiyori Yoshikawa, Ameer Albahem, Jingqi Wang, Yuankai Ren, Zhi Zhang, Yaoyun Zhang, Mai Hoang Dao, Pedro Ruas, Andre Lamurias, Francisco M Couto, Jenny Copara, Nona Naderi, Julien Knafou, Patrick Ruch, Douglas Teodoro, Daniel Lowe, John Mayfield, Abdullatif Köksal, Hilal Dönmez, Elif Özkirimli, Arzucan Özgür, Darshini Mahendran, Gabrielle Gurdin, Nastassja Lewinski, Christina Tang, Bridget T McInness, CS Malarkodi, Pattabhi Rk Rao, Sobha Lalitha Devi, Lawrence Cavedon, Trevor Cohn, Timothy Baldwin, Karin Verspoor
22-25 September 2020
The RELX Dataset and Matching the Multilingual Blanks for Cross-Lingual Relation Classification
Abdullatif Köksal, Arzucan Özgür
Findings of the Association for Computational Linguistics: EMNLP 2020
Vapur: A Search Engine to Find Related Protein-Compound Pairs in COVID-19 Literature
Abdullatif Köksal, Hilal Dönmez, Rıza Özçelik, Elif Ozkirimli, Arzucan Özgür
Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020
Resources for Turkish Dependency Parsing: Introducing the BOUN Treebank and the BoAT Annotation Tool
Utku Türk, Furkan Atmaca, Şaziye Betül Özateş, Gözde Berk, Seyyit Talha Bedir, Abdullatif Köksal, Balkız Öztürk Başaran, Tunga Güngör, Arzucan Özgür
Language Resources and Evaluation
We introduce the resources that we developed for Turkish dependency parsing, which include a novel manually annotated treebank (BOUN Treebank), along with the guidelines we adopted, and a new annotation tool (BoAT).
The Role of Contextual Word Embeddings in Correcting the 'de/da' Clitic Errors in Turkish
Hasan Öztürk, Alperen Değirmenci, Onur Güngör, Suzan Uskudarli
28th Signal Processing and Communications Applications Conference (SIU)
One of the most common spelling errors in Turkish concerns the clitic 'de/da'. People often misspell 'de/da' either by inappropriately attaching it as a suffix when it should be written separately, or by writing it separately when it should be a suffix. Since Turkish is a morphologically rich agglutinative language, detecting and identifying such errors is difficult. As such, many widely used spell correction tools do not handle such mistakes well. In this work, we show that a sequence tagger that employs a BERT model, which produces word embeddings that consider the context of a word, obtains higher performance than one using non-contextual word embeddings. Training and evaluation were performed with a dataset derived from a Turkish corpus using a special process, in addition to a manually curated one. The contextual word embeddings obtained during this task are publicly shared with the research community.
NeuroBoun: An inquiry-based approach for exploring scientific literature--a use case in neuroscience
S Uskudarli, Erinç Gökdeniz, Resit Canbeyli
arXiv preprint arXiv:2001.00186
NeuroBoun: An inquiry-based approach for exploring scientific literature -- a use case in neuroscience
S. Uskudarli, E. Gökdeniz, R. Canbeyli
Microblog topic identification using Linked Open Data
Ahmet Yıldırım, Suzan Uskudarli
PLOS ONE
Much valuable information is embedded in social media posts (microposts), which are contributed by a great variety of persons about subjects that are of interest to others. The automated utilization of this information is challenging due to the overwhelming quantity of posts and the distributed nature of the information related to subjects across several posts. Numerous approaches have been proposed to detect topics from collections of microposts, where the topics are represented by lists of terms such as words, phrases, or word embeddings. Such topics are used in tasks like classification and recommendations. The interpretation of topics is considered a separate task in such methods, although they are becoming increasingly human-interpretable. This work proposes an approach for identifying machine-interpretable topics of collective interest. We define topics as a set of related elements that are associated by having been posted in the same contexts. To represent topics, we introduce an ontology specified according to the W3C recommended standards. The elements of the topics are identified via linking entities to resources published on Linked Open Data (LOD). Such representation enables processing topics to provide insights that go beyond what is explicitly expressed in the microposts. The feasibility of the proposed approach is examined by generating topics from more than one million tweets collected from Twitter during various events. The utility of these topics is demonstrated with a variety of topic-related tasks along with a comparison of the effort required to perform the same tasks with words-list-based representations. Manual evaluation of 36 randomly selected sets of topics yielded 81.0% and 93.3% for the precision and F1 scores respectively.
Dependency Parser
Şaziye Betül Özateş, Tunga Güngör, Arzucan Özgür, Balkız Öztürk Başaran
Boğaziçi University
Exploring chemical space using natural language processing methodologies for drug discovery
Hakime Öztürk, Arzucan Özgür, Philippe Schwaller, Teodoro Laino, Elif Ozkirimli
Elsevier Current Trends
The role of contextual word embeddings in correcting the ‘de/da’ clitic errors in Turkish
Hasan Öztürk, Alperen Değirmenci, Onur Güngör, Suzan Uskudarli
IEEE
2019
Detecting Clitics Related Orthographic Errors in Turkish
Ugurcan Arikan, Onur Gungor, Suzan Uskudarli
International Conference on Recent Advances in Natural Language Processing ({RANLP})
For the spell correction task, vocabulary-based methods have been replaced with methods that take morphological and grammar rules into account. However, such tools are fairly immature and, worse, non-existent for many low-resource languages. Checking only if a word is well-formed with respect to the morphological rules of a language may produce false negatives due to the ambiguity resulting from the presence of numerous homophonic words. In this work, we propose an approach to detect and correct the “de/da” clitic errors in Turkish text. Our model is a neural sequence tagger trained with a synthetically constructed dataset consisting of positive and negative samples. The model’s performance with this dataset is presented according to different word embedding configurations. The model achieved an F1 score of 86.67% on a synthetically constructed dataset. We also evaluated the model on a manually curated dataset of challenging samples, where it proved superior to other spelling correctors, with 71% accuracy compared to 34% for the second-best (Google Docs).
Supervised Learning Methods in Classifying Organized Behavior in Tweet Collections
Erdem Beğenilmiş, Susan Uskudarli
International Journal on Artificial Intelligence Tools
The successful use of social media to manipulate public opinion via bots and hired individuals to spread (mis)information to unsuspecting users reached alarming levels due to the manipulations during the 2016 US elections and the Brexit deliberations in the UK. Fake interactions such as “liking” and “retweeting” are staged to foster trust in the posts of bots and individuals, which makes it difficult for individuals to detect the posts that are part of greater schemes. We propose an approach based on supervised learning to classify collections of tweets as “organized” when they exhibit premeditated intent and as “organic” otherwise. Features related to users and posting behavior are used to train the classifiers using 851 data sets totaling above 270 million tweets. Further classifiers are trained to assess the effectiveness of the selected features. The random forest algorithm consistently yielded the best results, with scores greater than 95% for both accuracy and f-measure. For comparison purposes, unsupervised learning methods were used to cluster the same data sets. The Gaussian Mixture Model clustered the [organized vs organic] data set with 99% agreement with the labels. The success of using only behavioral features to detect organized behavior is encouraging.
Overview of the BioCreative VI Precision Medicine Track: mining protein interactions and mutations for precision medicine
Rezarta Islamaj Doğan, Sun Kim, Andrew Chatr-Aryamontri, Chih-Hsuan Wei, Donald C Comeau, Rui Antunes, Sérgio Matos, Qingyu Chen, Aparna Elangovan, Nagesh C Panyam, Karin Verspoor, Hongfang Liu, Yanshan Wang, Zhuang Liu, Berna Altınel, Zehra Melce Hüsünbeyi, Arzucan Özgür, Aris Fergadis, Chen-Kai Wang, Hong-Jie Dai, Tung Tran, Ramakanth Kavuluru, Ling Luo, Albert Steppi, Jinfeng Zhang, Jinchan Qu, Zhiyong Lu
Database
The effect of morphology in named entity recognition with sequence tagging
Onur Güngör, Tunga Güngör, Suzan Uskudarli
Natural Language Engineering
This work proposes a sequential tagger for named entity recognition in morphologically rich languages. Several schemes for representing the morphological analysis of a word in the context of named entity recognition are examined. Word representations are formed by concatenating word and character embeddings with the morphological embeddings based on these schemes. The impact of these representations is measured by training and evaluating a sequential tagger composed of a conditional random field layer on top of a bidirectional long short-term memory layer. Experiments with Turkish, Czech, Hungarian, Finnish, and Spanish produce state-of-the-art results for all these languages, indicating that the representation of morphological information improves performance.
Identifying Image Related Sentences in News Articles
Melike Esma İlter, Lale Akarun, Arzucan Özgür
IEEE
Statistical representation models for mutation information within genomic data
N Özlem Özcan Şimşek, Arzucan Özgür, Fikret Gürgen
BMC bioinformatics
BOUN-ISIK participation: an unsupervised approach for the named entity normalization and relation extraction of bacteria biotopes
Ilknur Karadeniz, Ömer Faruk Tuna, Arzucan Özgür
Linking entities through an ontology using word embeddings and syntactic re-ranking
Ilknur Karadeniz, Arzucan Özgür
BMC bioinformatics
Machine learning-based identification and rule-based normalization of adverse drug reactions in drug labels
Mert Tiftikci, Arzucan Özgür, Yongqun He, Junguk Hur
BMC bioinformatics
Improving the annotations in the Turkish universal Dependency Treebank
Utku Türk, Furkan Atmaca, Şaziye Betül Özateş, Balkız Öztürk Başaran, Tunga Güngör, Arzucan Özgür
Turkish treebanking: Unifying and constructing efforts
Utku Türk, Furkan Atmaca, Şaziye Betül Özateş, Abdullatif Köksal, Balkız Öztürk Başaran, Tunga Güngör, Arzucan Özgür
Turkish tweet classification with transformer encoder
Atıf Emre Yüksel, Yaşar Alim Türkmen, Arzucan Özgür, Berna Altınel
WideDTA: prediction of drug-target binding affinity
Hakime Öztürk, Elif Ozkirimli, Arzucan Özgür
arXiv preprint arXiv:1902.04166
2018
Organized Behavior Classification of Tweet Sets using Supervised Learning Methods
Erdem Beğenilmiş, Suzan Uskudarli
8th International Conference on Web Intelligence, Mining and Semantics (WIMS)
Segmenting hashtags and analyzing their grammatical structure
Arda Celebi, Arzucan Özgür
Journal of the Association for Information Science and Technology
Towards an ontology-driven clinical experience sharing ecosystem: Demonstration with liver cases
María del Mar Roldán-García, Suzan Uskudarli, Neda B Marvasti, Burak Acar, José F Aldana-Montes
Expert Systems with Applications
Past medical cases, and hence clinical experience, are invaluable resources in supporting clinical practice, research, and education. Medical professionals need to be able to exchange information about patient cases and explore them from subjective perspectives. This requires a systematic and flexible approach to case representation for supporting the exchange of processable patient information. We present an ontology-based approach to modeling patient cases and use patients with liver disease conditions as an example. To this end, a novel ontology, lico, that utilizes well-known medical standards is proposed to represent liver patient cases. The utility of the proposed approach is demonstrated with semantic queries and reasoning using data collected from real patients. The preliminary results are promising with regard to the potential of ontology-based medical case representation for building case-based search and retrieval systems, paving the way towards a Clinical Experience Sharing platform for comparative diagnosis, research, and education.
A closed-domain question answering framework using reliable resources to assist students
Caner Derici, Yiğit Aydin, Çiğdem Yenialaca, Nihal Yağmur Aydin, Günizi Kartal, Arzucan Özgür, Tunga Güngör
Natural Language Engineering
Improving named entity recognition by jointly learning to disambiguate morphological tags
Onur Güngör, Suzan Uskudarli, Tunga Güngör
Recurrent neural networks for Turkish named entity recognition
Onur Güngör, Suzan Üsküdarlı, Tunga Güngör
IEEE
Improving Named Entity Recognition by Jointly Learning to Disambiguate Morphological Tags
Onur Gungor, Suzan Uskudarli, Tunga Gungor
27th International Conference on Computational Linguistics (COLING)
Recurrent neural networks for Turkish named entity recognition
Onur Güngör, Suzan Uskudarli, Tunga Güngör
26th Signal Processing and Communications Applications Conference (SIU)
Ontology-based literature mining and class effect analysis of adverse drug reactions associated with neuropathy-inducing drugs
Junguk Hur, Arzucan Özgür, Yongqun He
Journal of biomedical semantics
Semi-Supervised Psychometric Scoring of Document Collections
Burak Suyunu, Gonul Ayci, Mine Öğretir, Ali Taylan Cemgil, Suzan Uskudarli, Hamza Zeytinoglu, Bulent Ozel, Arman Boyacı
IEEE
Semi-Supervised Psychometric Scoring of Document Collections
Burak Suyunu, Gonul Ayci, Mine Öğretir, Ali Taylan Cemgil, Suzan Uskudarli, Hamza Zeytinoglu, Bulent Ozel, Arman Boyacı
International Conference on Data Mining Workshops (ICDMW)
We describe a generic computational approach that can be used in developing methods for psychometric profiling. Our approach is based on semi-supervised analysis of document collections using topic modeling. The method depends on a supervisor providing a set of seed documents, grouped by abstract themes, such as Schwartz values or personality traits, and possibly a separate background document corpus. Instead of casting the problem into a standard classification framework, we interpret the group labels as a guide for finding distinguishing features. During training, we model each group of documents associated with a theme separately by using nonnegative matrix factorization to obtain theme-specific topic distributions. In the analysis, we decompose a new document using the model learned during training to arrive at the theme scores. We demonstrate our approach on two psychometric profiling theories (Schwartz and Big Five), evaluate our Schwartz scores with leave-one-out cross-validation, and compare our Big Five scores to independent surveys, which are much more costly to carry out.
The information revealed by processing semantic topics extracted from collective short posts
Ahmet Yildirim, Suzan Üsküdarli
IEEE
A morphology-based representation model for lstm-based dependency parsing of agglutinative languages
Şaziye Betül Özateş, Arzucan Özgür, Tunga Güngör, Balkız Öztürk
A novel methodology on distributed representations of proteins using their interacting ligands
Hakime Öztürk, Elif Ozkirimli, Arzucan Özgür
Bioinformatics
DeepDTA: deep drug–target binding affinity prediction
Hakime Öztürk, Arzucan Özgür, Elif Ozkirimli
Bioinformatics
2017
Text classification using ontology and semantic values of terms for mining protein interactions and mutations
B Altinel, ZM Husunbeyi, A Ozgur
Proceedings of the BioCreative VI Workshop
Busem at semeval-2017 task 4a sentiment analysis with word embedding and long short term memory rnn approaches
Deger Ayata, Murat Saraclar, Arzucan Özgür
Political opinion/sentiment prediction via long short term memory recurrent neural networks on Twitter
Değer Ayata, Murat Saraçlar, Arzucan Özgür
IEEE
Turkish tweet sentiment analysis with word embedding and machine learning
Değer Ayata, Murat Saraçlar, Arzucan Özgür
IEEE
Automatic query generation using word embeddings for retrieving passages describing experimental methods
Ferhat Aydın, Zehra Melce Hüsünbeyi, Arzucan Özgür
Database
Description of the BOUN System for the Trilingual Entity Detection and Linking Tasks at TAC KBP 2017.
Arda Celebi, Arzucan Özgür
Morphological embeddings for named entity recognition in morphologically rich languages
Onur Gungor, Eray Yildiz, Suzan Uskudarli, Tunga Gungor
arXiv preprint arXiv:1706.00506
Ontology-based literature mining of E. coli vaccine-associated gene interaction networks
Junguk Hur, Arzucan Özgür, Yongqun He
Journal of biomedical semantics
BIOSSES: a semantic sentence similarity estimation system for the biomedical domain
Gizem Soğancıoğlu, Hakime Öztürk, Arzucan Özgür
Bioinformatics
Extracting Adverse Drug Reactions using Deep Learning and Dictionary Based Approaches.
Mert Tiftikci, Arzucan Özgür, Yongqun He, Junguk Hur
CNN-based chemical–protein interactions classification
Atakan Yüksel, Hakime Öztürk, Elif Ozkirimli, Arzucan Özgür
Proceedings of the BioCreative VI Workshop
2016
Automated neuroanatomical relation extraction: a linguistically motivated approach with a PVT connectivity graph case study
Erinç Gökdeniz, Arzucan Özgür, Reşit Canbeyli
Frontiers in neuroinformatics
BioCreative V BioC track overview: collaborative biocurator assistant task for BioGRID
Sun Kim, Rezarta Islamaj Doğan, Andrew Chatr-Aryamontri, Christie S Chang, Rose Oughtred, Jennifer Rust, Riza Batista-Navarro, Jacob Carter, Sophia Ananiadou, Sergio Matos, Andre Santos, David Campos, José Luís Oliveira, Onkar Singh, Jitendra Jonnagaddala, Hong-Jie Dai, Emily Chia-Yu Su, Yung-Chun Chang, Yu-Chen Su, Chun-Han Chu, Chien Chin Chen, Wen-Lian Hsu, Yifan Peng, Cecilia Arighi, Cathy H Wu, K Vijay-Shanker, Ferhat Aydın, Zehra Melce Hüsünbeyi, Arzucan Özgür, Soo-Yong Shin, Dongseop Kwon, Kara Dolinski, Mike Tyers, W John Wilbur, Donald C Comeau
Database
Named entity recognition on Twitter for Turkish using semi-supervised learning with word embeddings
Eda Okur, Hakan Demir, Arzucan Özgür
Towards building a political protest database to explain changes in the welfare state
Cagil Sonmez, Arzucan Özgür, Erdem Yörük
Ontology-based categorization of bacteria and habitat entities using information retrieval techniques
Mert Tiftikci, Hakan Şahin, Berfu Büyüköz, Alper Yayıkçı, Arzucan Özgür
Identifying topics in microblogs using Wikipedia
Ahmet Yıldırım, Suzan Uskudarli, Arzucan Özgür
PLOS ONE
Twitter is an extremely high-volume platform for user-generated contributions regarding any topic. The wealth of content created in real time in massive quantities calls for automated approaches to identify the topics of the contributions. Such topics can be utilized in numerous ways, such as public opinion mining, marketing, entertainment, and disaster management. Towards this end, approaches to relate single or partial posts to knowledge base items have been proposed. However, in microblogging systems like Twitter, topics emerge from the culmination of a large number of contributions. Therefore, identifying topics based on collections of posts, where individual posts contribute to some aspect of the greater topic, is necessary. Models such as Latent Dirichlet Allocation (LDA) propose algorithms for relating collections of posts to sets of keywords that represent underlying topics. In these approaches, figuring out which specific topic(s) the keyword sets represent remains a separate task. Another issue in topic detection is the scope, which is often limited to a specific domain, such as health. This work proposes an approach for identifying domain-independent specific topics related to sets of posts. In this approach, individual posts are processed and then aggregated to identify key tokens, which are then mapped to specific topics. Wikipedia article titles are selected to represent topics, since they are up-to-date, user-generated, sophisticated articles that span topics of human interest. This paper describes the proposed approach, a prototype implementation, and a case study based on data gathered during the heavily contributed periods corresponding to the four US election debates in 2012. The manually evaluated results (0.96 precision) and other observations from the study are discussed in detail.
Sentence similarity based on dependency tree kernels for multi-document summarization
Şaziye Betül Özateş, Arzucan Özgür, Dragomir Radev
Ignet: A centrality and INO-based web system for analyzing and visualizing literature-mined networks
Arzucan Özgür, Junguk Hur, Zuoshuang Xiang, Edison Ong, Dragomir R Radev, Yongqun He
Bioinformatics
The Interaction Network Ontology-supported modeling and mining of complex interactions represented with multiple keywords in biomedical literature
Arzucan Özgür, Junguk Hur, Yongqun He
BioData mining
A comparative study of SMILES-based compound similarity functions for drug-target interaction prediction
Hakime Öztürk, Elif Ozkirimli, Arzucan Özgür
BMC bioinformatics
2015
Retrieving Passages Describing Experimental Methods using Ontology and Term Relevance based Query Matching
Ferhat Aydın, Zehra Melce Hüsünbeyi, Arzucan Özgür
GLASS: a comprehensive database for experimentally validated GPCR-ligand associations
Wallace KB Chan, Hongjiu Zhang, Jianyi Yang, Jeffrey R Brender, Junguk Hur, Arzucan Özgür, Yang Zhang
Bioinformatics
Question analysis for a closed domain question answering system
Caner Derici, Kerem Celik, Ekrem Kutbay, Yiğit Aydın, Tunga Güngör, Arzucan Özgür, Günizi Kartal
Springer International Publishing
A review on computational systems biology of pathogen–host interactions
Saliha Durmuş, Tunahan Çakır, Arzucan Özgür, Reinhard Guthke
Frontiers Media SA
DRENAJ: Distributed social media data collection system
Onur Güngör, Suzan Uskudarli, A Taylan Cemgil
IEEE
Development and application of an interaction network ontology for literature mining of vaccine-associated gene-gene interactions
Junguk Hur, Arzucan Özgür, Zuoshuang Xiang, Yongqun He
Journal of biomedical semantics
Detection and categorization of bacteria habitats using shallow linguistic analysis
Ilknur Karadeniz, Arzucan Özgür
BMC bioinformatics
Literature Mining and Ontology based Analysis of Host-Brucella Gene–Gene Interaction Network
Ilknur Karadeniz, Junguk Hur, Yongqun He, Arzucan Özgür
Frontiers in microbiology
Overview of the ImageCLEF 2015 liver CT annotation task.
Neda Barzegar Marvasti, Maria del Mar Roldan Garcia, Suzan Üsküdarli, José Francisco Aldana Montes, Burak Acar
General overview of imageCLEF at the CLEF 2015 labs
Henning Müller, Mauricio Villegas, Andrew Gilbert, Lucas Piras, Josiah Wang, Krystian Mikolajczyk, Alba García Seco de Herrera, Stefano Bromuri, M Ashraful Amin, Mahmood Kazi Mohammed, Burak Acar, Suzan Uskudarli, Neda Marvasti, José Aldana, María del Mar Roldan García
8–11 September 2015
Amaçlı Sanal Topluluklar İçin Ontoloji Tabanlı Uygulama Üretme Platformu
M. Seyhan, S. Uskudarli
ISBN 9789750621185
General Overview of ImageCLEF at the CLEF 2015 Labs
Mauricio Villegas, Henning Müller, Andrew Gilbert, Luca Piras, Josiah Wang, Krystian Mikolajczyk, Alba G. Seco de Herrera, Stefano Bromuri, M. Ashraful Amin, Mahmood Kazi Mohammed, Burak Acar, Suzan Uskudarli, Neda B. Marvasti, José F. Aldana, María del Mar Roldán García
International Conference of the Cross-Language Evaluation Forum for European Languages
Bir Ontoloji ile Mikroblog Ortamlarının Modellenmesi ile, İçeriklerin Anlamsal Olarak Erişilebilir Hale Getirilmesi ve Sorgulanması
Ahmet Yıldırım, Suzan Üsküdarlı
Anadolu Üniversitesi yayınları, https://ab.org.tr/ab15/bildiri/452.pdf
Extension of the Interaction Network Ontology for literature mining of gene-gene interaction networks from sentences with multiple interaction keywords
Arzucan Özgür, Junguk Hur, Yongqun He
Classification of Beta-Lactamases and Penicillin Binding Proteins Using Ligand-Centric Network Models.
Hakime Öztürk, Elif Ozkirimli, Arzucan Özgür
PloS one
2014
Expanding Machine Translation Training Data with an Out-of-Domain Corpus using Language Modeling based Vocabulary Saturation
Burak Aydın, Arzucan Özgür
Imageclef 2014: Overview and analysis of the results
Barbara Caputo, Henning Müller, Jesus Martinez-Gomez, Mauricio Villegas, Burak Acar, Novi Patricia, Neda Marvasti, Suzan Üsküdarlı, Roberto Paredes, Miguel Cazorla, Ismael Garcia-Varea, Vicente Morell
Springer International Publishing
Self-training a Constituency Parser using N-gram Trees
Arda Celebi, Arzucan Ozgur
European Language Resources Association (ELRA)
Improving Named Entity Recognition for Morphologically Rich Languages using Word Embeddings
Hakan Demir, Arzucan Özgür
Türkçe Soru Cevaplama Sistemlerinde Kural Tabanlı Odak Çıkarımı Rule-Based Focus Extraction in Turkish Question Answering Systems
Caner Derici, Kerem Çelik, Arzucan Özgür, Tunga Güngör, Ekrem Kutbay, Yigit Aydın, Günizi Kartal
Semantic description of liver CT images: an ontological approach
Nadin Kökciyan, Rüştü Türkay, Suzan Üsküdarli, Pınar Yolum, Barış Bakır, Burak Acar
IEEE journal of biomedical and health informatics
Semantic description of liver CT images: An ontological approach
Nadin Kökciyan, Rüştü Türkay, Suzan Uskudarli, Pınar Yolum, Bariş Bakir, Burak Acar
Journal of Biomedical and Health Informatics
Radiologists inspect CT scans and record their observations in reports to communicate with physicians. These reports may suffer from ambiguous language and inconsistencies resulting from subjective reporting styles, which present challenges in interpretation. Standardization efforts, such as the lexicon RadLex for radiology terms, aim to address this issue by developing standard vocabularies. While such vocabularies handle consistent annotation, they fall short in sufficiently processing reports for intelligent applications. To support such applications, the semantics of the concepts as well as their relationships must be modeled, for which ontologies are effective. They enable the software to make inferences beyond what is present in the reports. This paper presents the open-source ontology onlira (Ontology of the Liver for Radiology), which is developed to support intelligent applications such as identifying and ranking similar liver patient cases. onlira is introduced in terms of its concepts, properties, and relations. Examples of real liver patient cases are provided for illustration purposes. The ontology is evaluated in terms of its ability to express real liver patient cases and address semantic queries.
Bayesian pathway analysis of cancer microarray data
Melike Korucuoglu, Senol Isci, Arzucan Ozgur, Hasan H Otu
PloS one
ImageCLEF Liver CT Image Annotation Task 2014
N B Marvasti, N Kökciyan, R Türkay, A Yazıcı, P Yolum, S Uskudarli, B Acar
http://ceur-ws.org/Vol-1180/CLEF2014wn-Image-MarvastiEt2014.pdf
Analyzing stemming approaches for Turkish multi-document summarization
Muhammed Yavuz Nuzumlalı, Arzucan Özgür
Turkish MDS Data Set
Muhammed Yavuz Nuzumlalı, Arzucan Özgür
Association for Computational Linguistics
Turkish Multi-document Summarization (MDS) Corpus
Muhammed Yavuz Nuzumlalı, Arzucan Özgür
Boğaziçi University
A systems pharmacology approach to model tyrosine kinase inhibitor‐induced cardiotoxicity gene interaction networks (844.17)
Sirarat Sarntivijai, Junguk Hur, Arzucan Ozgur, Keith Burkhart, Yongqun He, Gilbert Omenn, Brian Athey, Darrell Abernethy
The FASEB Journal
Predicting gene interactions of tyrosine kinase inhibitors induced cardiotoxicity with the Ontology of Adverse Events-assisted bioinformatics approach.
S Sarntivijai, J Hur, A Ozgur, K Burkhart, Y He, GS Omenn, BD Athey, DR Abernethy
Nature Publishing Group
A Graph-based Approach for Contextual Text Normalization
Cagil Sönmez, Arzucan Özgür
Association for Computational Linguistics
2013
Clinical experience sharing by similar case retrieval
Neda Barzegar Marvasti, Ceyhun Burak Akgül, Burak Acar, Nadin Kökciyan, Suzan Uskudarli, Pınar Yolum, Rüstü Türkay, Barıs Bakır
1st ACM international workshop on Multimedia indexing and information retrieval for healthcare
Bacteria biotope detection, ontology-based normalization, and relation extraction using syntactic rules
Ilknur Karadeniz, Arzucan Özgür
BOUNCE: Sentiment Classification in Twitter using Rich Feature Sets
Nadin Kökciyan, Arda Celebi, Arzucan Ozgur, Suzan Uskudarli
Association for Computational Linguistics
Bounce: Sentiment classification in Twitter using rich feature sets
Nadin Kökciyan, Arda Celebi, Arzucan Ozgür, Suzan Uskudarli
Second Joint Conference on Lexical and Computational Semantics (*SEM)
PHISTO: A New Web Platform for Pathogen-Human Interactions
Saliha Durmuş Tekir, Tunahan Çakır, Emre Ardıç, İlknur Karadeniz, Arzucan Özgür, Fatih Erdoğan Sevilgen, Kutlu Ö Ülgen
Computational Methods in Systems Biology: 11th International Conference, CMSB 2013, Klosterneuburg, Austria, September 22-24, 2013, Proceedings
PHISTO: pathogen-host interaction search tool
Saliha Durmus Tekir, Tunahan Cakir, Emre Ardic, Ali Semih Sayilirbas, Gokhan Konuk, Mithat Konuk, Hasret Sariyer, Azat Ugurlu, Ilknur Karadeniz, Arzucan Ozgur, Fatih Erdogan Sevilgen, Kutlu O Ulgen
Bioinformatics
Münazaraların Twitter'da Etkisinin Araştırılması
Ahmet Yıldırım, Suzan Uskudarli
Akademik Bilişim 2013
Münazaraların Twitter'da etkisinin araştırılması
Ahmet Yıldırım, Suzan Üsküdarlı
https://ab.org.tr/ab13/kitap/yildirim_uskudarli_AB13.pdf
Mikroblog İleti Kümelerinde Konu Algılama Yönteminin İncelenmesi
Ahmet Yıldırım, Suzan Üsküdarlı, Arzucan Özgür
Akademik Bilişim
Word polarity detection using a multilingual approach
Cüneyd Murad Özsert, Arzucan Özgür
Springer Berlin Heidelberg
2012
Content Based Microblogger Recommendation
H Burak Celebi, Suzan Uskudarli
International Conference on Social Computing (SocialCom) Privacy, Security, Risk and Trust (PASSAT)
A social web for another billion
TB Dinesh, Suzan Uskudarli
Proceedings of M4D 2012 28-29 February 2012 New Delhi, India
Alipi: A framework for re-narrating web pages
T. B. Dinesh, S Uskudarli, Subramanya Sastry, Deepti Aggarwal, Venkatesh Choppella
Lyon, France
System and method for generating queries
George Erhart, Valentine Matula, Arzucan Ozgur, David Skiba
Identification of fever and vaccine-associated gene interaction networks using ontology-based literature mining
Junguk Hur, Arzucan Özgür, Zuoshuang Xiang, Yongqun He
Journal of biomedical semantics
Identification of fever and vaccine-associated gene interaction networks using ontology-based literature mining
Arzucan Özgür, Junguk Hur, Zuoshuang Xiang, Yongqun Oliver He
2011
U-Compare bio-event meta-service: compatible BioNLP event extraction services
Yoshinobu Kano, Jari Björne, Filip Ginter, Tapio Salakoski, Ekaterina Buyko, Udo Hahn, K Bretonnel Cohen, Karin Verspoor, Christophe Roeder, Lawrence E Hunter, Halil Kilicoglu, Sabine Bergler, Sofie Van Landeghem, Thomas Van Parys, Yves Van de Peer, Makoto Miwa, Sophia Ananiadou, Mariana Neves, Alberto Pascual-Montano, Arzucan Özgür, Dragomir R Radev, Sebastian Riedel, Rune Saetre, Hong-Woo Chun, Jin-Dong Kim, Sampo Pyysalo, Tomoko Ohta, Jun'ichi Tsujii
BMC bioinformatics
Mining of vaccine-associated IFN-g gene interaction networks using the Vaccine Ontology
Arzucan Özgür, Zuoshuang Xiang, Dragomir R Radev, Yongqun He
J Biomed Semantics
2010
An Operator Provided M-learning Service -- A Preliminary Report
Haluk Bingol, M Gokhan Habiboglu, Suzan Uskudarli, Ahmet Yildirim, Onur Calikus, Cenk Sezgin, Sahin Yelkenci
International Association for Development of the Information Society
Exploring area-specific microblogger social networks
EA Degirmencioglu, S Uskudarli
Proceedings of the WebSci10: Extending the Frontiers of Society On-Line, April
Semantic tagprint - tagging and indexing content for semantic search and content management
Murat Kalender, Jiangbo Dang, Suzan Uskudarli
IEEE
Unipedia: A unified ontological knowledge platform for semantic content tagging and search
Murat Kalender, Jiangbo Dang, Suzan Uskudarli
IEEE
Semantic tagprint-tagging and indexing content for semantic search and content management
Murat Kalender, Jiangbo Dang, Suzan Uskudarli
Fourth International Conference on Semantic Computing (ICSC)
UNIpedia: A unified ontological knowledge platform for semantic content tagging and search
Murat Kalender, Jiangbo Dang, Suzan Uskudarli
Fourth International Conference on Semantic Computing (ICSC)
Literature-Based Discovery of IFN-γ and Vaccine-Mediated Gene Interaction Networks
Arzucan Özgür, Zuoshuang Xiang, Dragomir R Radev, Yongqun He
Journal of Biomedicine and Biotechnology
2009
Screen-replay: a session recording and analysis tool for DrScheme
M Fatih Köksal, RE Başar, S Üsküdarlı
Proceedings of the Scheme and Functional Programming Workshop, Technical Report, California Polytechnic State University, CPSLO-CSC-09
Michigan molecular interactions r2: from interacting proteins to pathways
V Glenn Tarcea, Terry Weymouth, Alex Ade, Aaron Bookvich, Jing Gao, Vasudeva Mahavisno, Zach Wright, Adriane Chapman, Magesh Jayapandian, Arzucan Özgür, Yuanyuan Tian, Jim Cavalcoli, Barbara Mirel, Jignesh Patel, Dragomir Radev, Brian Athey, David States, HV Jagadish
Nucleic acids research
2008
Introducing meta-services for biomedical information extraction
Florian Leitner, Martin Krallinger, Carlos Rodriguez-Penagos, Jörg Hakenberg, Conrad Plake, Cheng-Ju Kuo, Chun-Nan Hsu, Richard Tzong-Han Tsai, Hsi-Chuan Hung, William W Lau, Calvin A Johnson, Rune Saetre, Kazuhiro Yoshida, Yan Hua Chen, Sun Kim, Soo-Yong Shin, Byoung-Tak Zhang, William A Baumgartner Jr, Lawrence Hunter, Barry Haddow, Michael Matthews, Xinglong Wang, Patrick Ruch, Frédéric Ehrler, Arzucan Özgür, Güneş Erkan, Dragomir R Radev, Michael Krauthammer, ThaiBinh Luong, Robert Hoffmann, Chris Sander, Alfonso Valencia
Genome biology
Semantic Tagging and Inference in Online Communities
Ahmet Yıldırım, Suzan Uskudarli
International Conference on Semantic Systems (I-SEMANTICS)
Co-occurrence network of reuters news
Arzucan Özgür, Burak Cetin, Haluk Bingol
International Journal of Modern Physics C
Identifying gene-disease associations using centrality on a literature mined gene-interaction network
Arzucan Özgür, Thuy Vu, Güneş Erkan, Dragomir R Radev
Bioinformatics
2007
DTI Application with Haptic Interfaces
Murat Aksoy, Neslehan Avcu, Susana Merino-Caviedes, Engin Deniz Diktas, Miguel Angel Martín-Fernández, Sıla Girgin, Ioannis Marras, Emma Munoz-Moreno, Erkin Tekeli, Burak Acar, Roland Bammer, Marcos Martin-Fernandez, Ali Vahit Sahiner, Suzan Uskudarli
Extracting interacting protein pairs and evidence sentences by using dependency parsing and machine learning techniques
Günes Erkan, Arzucan Ozgur, Dragomir R Radev
Proceedings of the Second BioCreative Challenge Workshop
Semi-supervised classification for extracting protein interaction sentences using dependency parsing
Gunes Erkan, Arzucan Özgür, Dragomir Radev
2006
Classification of skewed and homogenous document corpora with class-based and corpus-based keywords
Arzucan Özgür, Tunga Güngör
Springer Berlin Heidelberg
Efficient indexing technique for XML-based electronic product catalogs
Arzucan Özgür, Taflan İ Gündem
Electronic Commerce Research and Applications
2005
Text categorization with class-based and corpus-based keyword selection
Arzucan Özgür, Levent Özgür, Tunga Güngör
Springer Berlin Heidelberg
2004
Supervised and unsupervised machine learning techniques for text document categorization
Arzucan Ozgur
Unpublished Master’s Thesis, İstanbul: Boğaziçi University
Social network of co-occurrence in news articles
Arzucan Özgür, Haluk Bingol
Springer Berlin Heidelberg
2002
Pantoto: A participatory model for community information
Susan Uskudarli, T Dinesh
Proceedings DyD’02: Development by Design
1998
1997
Share-Where Maintenance in Visual Algebraic
TB Dinesh, Susan M Üsküdarlı
Advances in Computing Science--ASIAN...
1996
Programming Research Group University of Amsterdam
Susan Üsküdarlı
Proceedings: IEEE Symposium on Visual Languages, September 3-6, 1996, Boulder, Colorado
1995
Specifying Visual Syntax via an Interpretation Tool
S Uskudarli
Programming Research Group, University of Amsterdam
Towards a visual programming environment generator for algebraic specifications
Susan M Uskudarli, TB Dinesh
IEEE
1994
1992
Undated
SIU2023-NST-Nefret Söylemi Tespit Yarısması SIU2023-NST-Hate Speech Detection Contest
Inanç Arın, Zeynep Isık, Seçilay Kutal, Somaiyeh Dehghan, Arzucan Özgür, Berrin Yanıkoglu
Identifying Common Pathogenesis of Diseases Using Literature Mined Gene Interactions
Özge Dinçsoy, Arzucan Özgür, Ahmet Okay Çağlayan
Fares Zeidán-Chuliá, Mervi Gürsoy, Ben-Hur Neves de Oliveira, Vural Özdemir, Eija Könönen and Ulvi K. Gürsoy
Yongqun He, Arzucan Ozgur, Junguk Hur, Jie Zheng
Resources for Turkish Dependency Parsing
Utku Türk, Furkan Atmaca, Saziye Betül, Abdullatif Köksal, Balkız Öztürk Basaran, Tunga Güngör, Arzucan Özgür
The more the merrier: a new dependency treebank for Turkish
Utku Türk, Furkan Atmaca, Şaziye Betül, Gözde Berk, Seyyit Talha Bedir, Abdüllatif Köksal, Balkız Öztürk Başaran, Tunga Güngör, Arzucan Özgür
SCIENTIFIC COMMITTEE