Papers - TABILAB

Hashtag activism and framing strategies in the aftermath of George Floyd’s death and the 2020 elections

Basak Taraktas, Kadir Cihan Duran, Suzan Uskudarli

Politics, Groups, and Identities

This article delves into online strategies to demand accountability for Floyd’s murder amid the polarized context of the 2020 presidential elections and the Chauvin trial. By applying signaling theory to the study of hashtag activism, we examine how users strategically emphasized and deemphasized Floyd’s death to adapt to contextual sensitivities. Our analysis, based on an original dataset of approximately 6,000,000 tweets (January 2020–December 2021), employs statistical tools and network analysis to uncover temporal patterns in users’ framing strategies related to Floyd’s death. Users emphasized Floyd’s case, policing reforms, and systemic racism during the summer of 2020, transitioning to broader themes during the elections, and refocused on accountability and justice during the Chauvin trial. This article proposes a novel theoretical application of the signaling theory to the study of online activism, through observable metrics – tweet volume, duration, and hashtag combinations. These metrics capture when and by which messaging activists strategically raise issue salience. Our findings shed light on the differences in activist strategies along partisan lines, with Democrats predominantly associated with justice demands and Republicans with grievances.

VO: The Vaccine Ontology

Jie Zheng, Asiyah Yu Lin, Anthony Huffman, Anna Maria Masci, Rebecca Racz, Guanming Wu, Kallan Roan, Edison Ong, Sirarat Sarntivijai, Joy Hu, Eliyas Asfaw, Hayleigh Kahn, Xingxian Li, Xumeng Zhang, Nilufer Kosar, Jianfu Li, Warren Manuel, Rashmie Abeysinghe, Hasin Rehana, Benu Bansal, Yuanyi Pan, Jinjing Guo, Virginia He, Justin Song, Andrey I Seleznev, Katelyn Hur, Anna He, Alexander Davydov, Qi Yang, Randi Vita, Bjoern Peters, Alan Ruttenberg, Alexander D Diehl, Charles Tapley Hoyt, Paola Roncaglia, Rachael P Huntley, Richard H Scheuermann, Melanie Courtot, Thomas Todd, Samantha Sayers, Fang Chen, Xinna Li, Feng-Yu Yeh, Zuoshuang Xiang, Arzucan Ozgur, Patricia L Whetzel, Mark A Musen, Christopher J Mungall, Wolfgang W Leitner, Licong Cui, Lesley A Colby, Harry LT Mobley, Brian D Athey, Gilbert S Omenn, Lindsay G Cowell, Cui Tao, Junguk Hur, Barry Smith, Yongqun He

bioRxiv

2024

Evaluating the quality of a corpus annotation scheme using pretrained language models

Furkan Akkurt, Onur Güngör, Büşra Marşan, Tunga Güngör, Balkız Öztürk Başaran, Arzucan Özgür, Susan Üsküdarlı

Dealing With Data Scarcity in Spoken Question Answering

Ebru Arısoy, Arzucan Özgür, Merve Ünlü Menevşe, Yusufcan Manav

LREC-COLING 2024 Main Conference Proceedings: Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, 20-25 May 2024, Torino (hybrid)

Do activists prioritize solutions over grievances? A Twitter Study of Black Lives Matter

B. Taraktas, K. C. Duran, S. Üsküdarli

Marmara Üniversitesi Siyasal Bilimler Dergisi

Do social movements shift the focus of their framing from grievances to tactics as they mature? This paper examines the nature of the frames that social movements and activists co-create using the case of Black Lives Matter (BLM). Building on Snow & Benford (1988), we explore whether BLM’s frames have evolved from diagnostic to prognostic frames since the movement’s emergence. We compiled a novel tweet dataset collected from Twitter that contains 269,963 tweets sent under the hashtag “BlackLivesMatter” from Jan. 01, 2020, to Dec. 31, 2021. Using time series and network analysis, we show that frames do not naturally evolve from diagnostic to prognostic frames as movements mature. We find that BLM activists increasingly use prognostic frames while expressing their grievances because injustices and discrimination toward the Black continue. The evidence suggests that tweets on tactics and solutions outnumber the grievance-related frames only after Chauvin’s guilty plea alleviates grievances.

Generative language models on nucleotide sequences of human genes

Musa Nuri Ihtiyar, Arzucan Özgür

Scientific Reports

Evaluating the Quality of a Corpus Annotation Scheme Using Pretrained Language Models

Furkan Akkurt, Onur Güngör, Büşra Marşan, Tunga Güngör, Balkız Öztürk Başaran, Arzucan Özgür, Susan Üsküdarlı

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Pretrained language models and large language models are increasingly used to assist in a great variety of natural language tasks. In this work, we explore their use in evaluating the quality of alternative corpus annotation schemes. For this purpose, we analyze two alternative annotations of the Turkish BOUN treebank, versions 2.8 and 2.11, in the Universal Dependencies framework using large language models. Using a suitable prompt generated using treebank annotations, large language models are used to recover the surface forms of sentences. Based on the idea that the large language models capture the characteristics of the languages, we expect that the better annotation scheme would yield the sentences with higher success. The experiments conducted on a subset of the treebank show that the new annotation scheme (2.11) results in a successful recovery percentage of about 2 points higher. All the code developed for this work is available at https://github.com/boun-tabi/eval-ud .

Dealing with Data Scarcity in Spoken Question Answering

Merve Ünlü Menevşe, Yusufcan Manav, Ebru Arisoy, Arzucan Özgür

Evaluating GPT and BERT models for protein–protein interaction identification in biomedical text

Hasin Rehana, Nur Bengisu Çam, Mert Basmaci, Jie Zheng, Christianah Jemiyo, Yongqun He, Arzucan Özgür, Junguk Hur

Bioinformatics Advances

Leveraging Large Language Models for Extracting Protein-Protein Interactions from Biomedical Corpora

Hasin Rehana, Nur Bengisu Çam, Mert Basmaci, Jie Zheng, Christianah Jemiyo, Yongqun He, Arzucan Özgür, Junguk Hur

Nested named entity recognition using multilayer BERT-based model

Hasin Rehana, Benu Bansal, Nur Bengisu Çam, Jie Zheng, Yongqun He, Arzucan Özgür, Junguk Hur

CLEF Working Notes

Linguistic laws meet protein sequences: A comparative analysis of subword tokenization methods

Burak Suyunu, Enes Taylan, Arzucan Özgür

IEEE

Tweeting through a public health crisis: Communication strategies of right-wing populist leaders during the COVID-19 pandemic

Başak Taraktaş, Berk Esen, Suzan Uskudarli

Government and Opposition

Exploring data‐driven chemical SMILES tokenization approaches to identify key protein–ligand binding moieties

Asu Busra Temizer, Gökçe Uludoğan, Rıza Özçelik, Taha Koulani, Elif Ozkirimli, Kutlu O Ulgen, Nilgun Karali, Arzucan Özgür

Molecular Informatics

Detecting Hate Speech in Turkish Print Media: A corpus and a hybrid approach with target-oriented linguistic knowledge

Gökçe Uludoğan, Atıf Emre Yüksel, Ümit Tunçer, Burak Işık, Yasemin Korkmaz, Didar Akar, Arzucan Özgür

Overview of the Hate Speech Detection in Turkish and Arabic Tweets (HSD-2Lang) Shared Task at CASE 2024

Gökçe Uludoğan, Somaiyeh Dehghan, Inanç Arın, Elif Erol, Berrin Yanıkoğlu, Arzucan Özgür

TURNA: A Turkish Encoder-Decoder Language Model for Enhanced Understanding and Generation

Gökçe Uludoğan, Zeynep Balal, Furkan Akkurt, Meliksah Turker, Onur Gungor, Susan Üsküdarlı

https://aclanthology.org/2024.findings-acl.600

TURNA: A Turkish Encoder-Decoder Language Model for Enhanced Understanding and Generation

Gökçe Uludoğan, Zeynep Balal, Furkan Akkurt, Meliksah Turker, Onur Gungor, Susan Üsküdarlı

Findings of the Association for Computational Linguistics ACL 2024

The recent advances in natural language processing have predominantly favored well-resourced English-centric models, resulting in a significant gap with low-resource languages. In this work, we introduce TURNA, a language model developed for the low-resource language Turkish and is capable of both natural language understanding and generation tasks. TURNA is pretrained with an encoder-decoder architecture based on the unified framework UL2 with a diverse corpus that we specifically curated for this purpose. We evaluated TURNA with three generation tasks and five understanding tasks for Turkish. The results show that TURNA outperforms several multilingual models in both understanding and generation tasks and competes with monolingual Turkish models in understanding tasks.

Incorporating Knowledge Graph Embeddings into Graph Neural Networks for Sequential Recommender Systems

Kazim Emre Yüksel, Susan Üsküdarli

IEEE

Incorporating Knowledge Graph Embeddings into Graph Neural Networks for Sequential Recommender Systems

Kazim Emre Yüksel, Susan Üsküdarli

2024 9th International Conference on Computer Science and Engineering (UBMK)

2023

Evaluation of ChatGPT and BERT-based models for Turkish hate speech detection

Nur Bengisu Çam, Arzucan Özgür

IEEE

SIU2023-NST hate speech detection contest

Inanç Arın, Zeynep Işık, Seçilay Kutal, Somaiyeh Dehghan, Arzucan Özgür, Berrin Yanikoğlu

IEEE

Can We Explain Privacy?

Gönül Aycı, Arzucan Özgür, Murat Şensoy, Pınar Yolum

IEEE Internet Computing

Explain to Me: Towards Understanding Privacy Decisions.

Gonul Ayci, Pinar Yolum, Arzucan Özgür, Murat Sensoy

PEAK: Explainable Privacy Assistant through Automated Knowledge Extraction

Gonul Ayci, Arzucan Özgür, Murat Şensoy, Pınar Yolum

arXiv preprint arXiv:2301.02079

Uncertainty-aware personal assistant for making personalized privacy decisions

Gonul Ayci, Murat Sensoy, Arzucan Özgür, Pinar Yolum

ACM Transactions on Internet Technology

A Computational Software for Training Robust Drug–Target Affinity Prediction Models: pydebiaseddta

Melih Barsbey, Rıza Özçelik, Alperen Bağ, Berk Atil, Arzucan Özgür, Elif Ozkirimli

Journal of Computational Biology

Pattern recognition for healthcare analytics

İnci M Baytaş, Yifan Peng, Arzucan Özgür

Frontiers Media SA

Improving the filtering of false positive single nucleotide variations by combining genomic features with quality metrics

Kazım Kıvanç Eren, Esra Çınar, Hamza U Karakurt, Arzucan Özgür

Bioinformatics

Visualizing Software Repositories Through Requirements Trace Links

Kadir Ersoy, Ecenur Sezer, Susan Üsküdarlı, Fatma Başak Aydemir

IEEE

A dataset for investigating the impact of context for offensive language detection in tweets

Musa Ihtiyar, Ömer Özdemir, Mustafa Erengül, Arzucan Özgür

Visualizing Software Repositories Through Requirements Trace Links

Kadir Ersoy, Ecenur Sezer, Susan Uskudarli, Fatma Başak Aydemir

2023 IEEE 31st International Requirements Engineering Conference Workshops (REW)

TULAP-An accessible and sustainable platform for Turkish natural language processing resources

Susan Üsküdarlı, Muhammet Şen, Furkan Akkurt, Merve Gürbüz, Onur Güngör, Arzucan Özgür, Tunga Güngör

TULAP - An Accessible and Sustainable Platform for Turkish Natural Language Processing Resources

Susan Uskudarli, Muhammet Şen, Furkan Akkurt, Merve Gürbüz, Onur Gungor, Arzucan Özgür, Tunga Güngör

Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations

Access to natural language processing resources is essential for their continuous improvement. This can be especially challenging in educational institutions where the software development effort required to package and release research outcomes may be overwhelming and under-recognized. Access to well-prepared and reliable research outcomes is important both for their developers as well as the greater research community. This paper presents an approach to address this concern with two main goals: (1) to create an open-source easily deployable platform where resources can be easily shared and explored, and (2) to use this platform to publish open-source Turkish NLP resources (datasets and tools) created by a research lab. The Turkish Natural Language Processing (TULAP) was designed and developed as an easy-to-use platform to share dataset and tool resources which supports interactive tool demos. Numerous open access Turkish NLP resources have been shared on TULAP. All tools are containerized to support portability for custom use. This paper describes the design, implementation, and deployment of TULAP with use cases (available at https://tulap.cmpe.boun.edu.tr/). A short video demonstrating our system is available at https://figshare.com/articles/media/TULAP_Demo/22179047.

A Framework for Improving the Generalizability of Drug–Target Affinity Prediction Models

Rıza Özçelik, Alperen Bağ, Berk Atil, Melih Barsbey, Arzucan Özgür, Elif Ozkirimli

Journal of Computational Biology

2022

Boğaziçi University Annotation Tool (BoAT)-Web

Salih Furkan Akkurt, Susan Uskudarli

Boğaziçi University

BoAT v2 -- A Web-Based Dependency Annotation Tool with Focus on Agglutinative Languages

Salih Furkan Akkurt, Büşra Marşan, Susan Uskudarli

https://arxiv.org/abs/2207.01327

BoAT v2 - A Web-Based Dependency Annotation Tool with Focus on Agglutinative Languages

Salih Furkan Akkurt, Büşra Marşan, Susan Uskudarli

Proceedings of the ALTNLP The International Conference and workshop on Agglutinative Language Technologies as a challenge of Natural Language Processing

Cluster-based mention typing for named entity disambiguation

Arda Çelebi, Arzucan Özgür

Natural Language Engineering

Identifying hate speech using neural networks and discourse analysis techniques

Zehra Melce Hüsünbeyi, Didar Akar, Arzucan Özgür

A SHAP-based active learning approach for creating high-quality training data

Nailcan Kara, Yagiz Levent Gume, Umit Tigrak, Gokce Ezeroglu, Serdar Mola, Omer Burak Akgun, Arzucan Özgür

IEEE

BOUN Treebank v2.11

Büşra Marşan, Furkan Akkurt, Suzan Üsküdarlı, Tunga Güngör, Balkız Öztürk, Arzucan Özgür, Onur Güngör, Muhammet Şen, Merve Gürbüz, Utku Türk, Talha Bedir, Şaziye Betül Özateş

Boğaziçi University

Enhancements to the BOUN treebank reflecting the agglutinative nature of Turkish

Büşra Marşan, Salih Furkan Akkurt, Muhammet Şen, Merve Gürbüz, Onur Güngör, Şaziye Betül Özateş, Suzan Üsküdarlı, Arzucan Özgür, Tunga Güngör, Balkız Öztürk

arXiv preprint arXiv:2207.11782

Enhancements to the BOUN Treebank Reflecting the Agglutinative Nature of Turkish

Büşra Marşan, Salih Furkan Akkurt, Muhammet Şen, Merve Gürbüz, Onur Güngör, Şaziye Betül Özateş, Suzan Üsküdarlı, Arzucan Özgür, Tunga Güngör, Balkız Öztürk

Proceedings of the ALTNLP The International Conference and workshop on Agglutinative Language Technologies as a challenge of Natural Language Processing

A framework for automatic generation of spoken question-answering data

Merve Ünlü Menevşe, Yusufcan Manav, Ebru Arisoy, Arzucan Özgür

A Dataset and BERT-based Models for Targeted Sentiment Analysis on Turkish Texts

M Melih Mutlu, Arzucan Özgür

arXiv e-prints

Dataset for Targeted Sentiment Analysis in Turkish

Mustafa Melih Mutlu, Arzucan Özgür

Boğaziçi University

Tweeting through a Public Health Crisis: Communication Strategies of Right-Wing Populist Leaders during the COVID-19 Pandemic

Başak Taraktaş, Berk Esen, Suzan Uskudarli

Government and Opposition

BOUN Treebank

Utku Türk, Furkan Atmaca, Şaziye Betül Özateş, Gözde Berk, Seyyit Talha Bedir, Abdullatif Köksal, Balkız Öztürk Başaran, Tunga Güngör, Arzucan Özgür

Boğaziçi University

Resources for Turkish dependency parsing: Introducing the BOUN treebank and the BoAT annotation tool

Utku Türk, Furkan Atmaca, Şaziye Betül Özateş, Gözde Berk, Seyyit Talha Bedir, Abdullatif Köksal, Balkız Öztürk Başaran, Tunga Güngör, Arzucan Özgür

Language Resources and Evaluation

Interpreting Chemical Words of a Data-driven Segmentation Method as Protein Family Pharmacophores and Functional Groups

Kutlu O Ulgen, Nilgün Karalı, Arzucan Özgür

arXiv preprint arXiv:2210.14642

Exploiting pretrained biochemical language models for targeted drug design

Gökçe Uludoğan, Elif Ozkirimli, Kutlu O Ulgen, Nilgün Karalı, Arzucan Özgür

Bioinformatics

A hybrid deep dependency parsing approach enhanced with rules and morphology: A case study for Turkish

Şaziye Betül Özateş, Arzucan Özgür, Tunga Güngör, Balkız Öztürk Başaran

IEEE Access

Improving Code-Switching Dependency Parsing with Semi-Supervised Auxiliary Tasks

Şaziye Betül Özateş, Arzucan Özgür, Tunga Güngör, Özlem Çetinoğlu

2021

Overcoming the challenges in morphological annotation of Turkish in universal dependencies framework

Talha Bedir, Karahan Şahin, Onur Güngör, Suzan Uskudarli, Arzucan Özgür, Tunga Güngör, Balkız Öztürk Başaran

Overcoming the challenges in morphological annotation of Turkish in universal dependencies framework

Talha Bedir, Karahan Şahin, Onur Güngör, Suzan Uskudarli, Arzucan Özgür, Tunga Güngör, Balkız Öztürk Başaran

Proceedings of The Joint 15th Linguistic Annotation Workshop (LAW) and 3rd Designing Meaning Representations (DMR) Workshop

This paper presents several challenges faced when annotating Turkish treebanks in accordance with the Universal Dependencies (UD) guidelines and proposes solutions to address them. Most of these challenges stem from the lack of adequate support in the UD framework to accurately represent null morphemes and complex derivations, which results in a significant loss of information for Turkish. This loss negatively impacts the tools that are developed based on these treebanks. We raised and discussed these issues within the community on the official UD portal. This paper presents these issues and our proposals to more accurately represent morphosyntactic information for Turkish while adhering to guidelines of UD. This work aims to contribute to the representation of Turkish and other agglutinative languages in UD-based treebanks, which in turn aids to develop more accurately annotated datasets for such languages.

Crowdsourced mapping of unexplored target space of kinase inhibitors

Anna Cichońska, Balaguru Ravikumar, Robert J Allaway, Fangping Wan, Sungjoon Park, Olexandr Isayev, Shuya Li, Michael Mason, Andrew Lamb, Ziaurrehman Tanoli, Minji Jeon, Sunkyu Kim, Mariya Popova, Stephen Capuzzi, Jianyang Zeng, Kristen Dang, Gregory Koytiger, Jaewoo Kang, Carrow I Wells, Timothy M Willson, IDG-DREAM Drug-Kinase Binding Prediction Challenge Consortium (User oselot: Tan Mehmet; Team N121: Huang Chih-Han, Shih Edward SC, Chen Tsai-Min, Wu Chih-Hsun, Fang Wei-Quan, Chen Jhih-Yu, Hwang Ming-Jing; Team Let_Data_Talk: Wang Xiaokang, Ben Guebila Marouen, Shamsaei Behrouz, Singh Sourav; User thinng: Nguyen Thin; Team KKT: Karimi Mostafa, Wu Di, Wang Zhangyang, Shen Yang; Team Boun: Öztürk Hakime, Ozkirimli Elif, Özgür Arzucan; Team KinaseHunter: Lim Hansaim, Xie Lei; Team AmsterdamUMC-KU-team: Kanev Georgi K., Kooistra Albert J., Westerman Bart A.; Team DruginaseLearning: Terzopoulos Panagiotis, Ntagiantas Konstantinos, Fotis Christos, Alexopoulos Leonidas; Team KERMIT-LAB-Ghent University: Boeckaerts Dimitri, Stock Michiel, De Baets Bernard, Briers Yves; Team QED: Luo Yunan, Hu Hailin, Peng Jian; Team METU_EMBLEBI_CROssBAR: Dogan Tunca, Rifaioglu Ahmet S., Atas Heval, Atalay Rengul Cetin, Atalay Volkan, Martin Maria J.; Team DMIS_DK: Jeon Minji, Lee Junhyun, Yun Seongjun, Kim Bumsoo, Chang Buru; Team AI Winter is Coming; Team hulab: Turu Gábor, Misák Ádám, Szalai Bence, Hunyady László; Team ML-Med: Lienhard Matthias, Prasse Paul, Bachmann Ivo, Ganzlin Julia, Barel Gal, Herwig Ralf; Team Prospectors: Oršolić Davor, Lučić Bono, Stepanić Višnja, Šmuc Tomislav), Challenge organizers

Nature Communications

Re-narration as a basis for accessibility and inclusion on the World Wide Web

T Dinesh, V Choppella, S Uskudarli

Balancing Methods for Multi-label Text Classification with Long-Tailed Class Distribution

Yi Huang, Buse Giledereli, Abdullatif Köksal, Arzucan Özgür, Elif Ozkirimli

PIDNA at BioASQ MESINESP: Hybrid Semantic Indexing for Biomedical Articles in Spanish.

Yi Huang, Buse Giledereli, Abdullatif Köksal, Arzucan Özgür, Elif Ozkirimli

A novel gene selection method for gene expression data for the task of cancer type classification

N Özlem Özcan Şimşek, Arzucan Özgür, Fikret Gürgen

Biology Direct

BOUN at SemEval-2021 Task 9: Text Augmentation Techniques for Fact Verification in Tabular Data

Abdullatif Köksal, Yusuf Yüksel, Bekir Yıldırım, Arzucan Özgür

Relation Extractor

Abdullatif Köksal, Arzucan Özgür

Association for Computational Linguistics

Sentiment Analysis Corpus

Abdullatif Köksal, Arzucan Özgür

Boğaziçi University

Twitter dataset and evaluation of transformers for Turkish sentiment analysis

Abdullatif Köksal, Arzucan Özgür

IEEE

Sentiment analysis of customer comments in banking using BERT-based approaches

Melik Masarifoglu, Umit Tigrak, Sefa Hakyemez, Guven Gul, Erdal Bozan, Ali Hakan Buyuklu, Arzucan Özgür

IEEE

BOUN Treebank v2.8

Utku Türk, Furkan Atmaca, Şaziye Betül Özateş, Gözde Berk, Seyyit Talha Bedir, Abdullatif Köksal, Balkız Öztürk Başaran, Tunga Güngör, Arzucan Özgür

Boğaziçi University

Machine Learning Methodologies to Study Molecular Interactions

Artur Yakimovich, Arzucan Özgür, Tunca Doğan, Elif Ozkirimli

Frontiers Media SA

Chemboost: A chemical language based approach for protein–ligand binding affinity prediction

Rıza Özçelik, Hakime Öztürk, Arzucan Özgür, Elif Ozkirimli

Molecular Informatics

2020

Analyzing ELMo and DistilBERT on socio-political news classification

Berfu Büyüköz, Ali Hürriyetoğlu, Arzucan Özgür

BOUN-REX at CLEF-2020 ChEMU Task 2: Evaluating Pretrained Transformers for Event Extraction

Hilal Dönmez, Abdullatif Köksal, Elif Ozkirimli, Arzucan Özgür

EXSEQREG: Explaining sequence-based NLP tasks with regions with a case study using morphological features for named entity recognition

Onur Güngör, Tunga Güngör, Suzan Uskudarli

PLOS ONE

The state-of-the-art systems for most natural language engineering tasks employ machine learning methods. Despite the improved performances of these systems, there is a lack of established methods for assessing the quality of their predictions. This work introduces a method for explaining the predictions of any sequence-based natural language processing (NLP) task implemented with any model, neural or non-neural. Our method named EXSEQREG introduces the concept of region that links the prediction and features that are potentially important for the model. A region is a list of positions in the input sentence associated with a single prediction. Many NLP tasks are compatible with the proposed explanation method as regions can be formed according to the nature of the task. The method models the prediction probability differences that are induced by careful removal of features used by the model. The output of the method is a list of importance values. Each value signifies the impact of the corresponding feature on the prediction. The proposed method is demonstrated with a neural network based named entity recognition (NER) tagger using Turkish and Finnish datasets. A qualitative analysis of the explanations is presented. The results are validated with a procedure based on the mutual information score of each feature. We show that this method produces reasonable explanations and may be used for i) assessing the degree of the contribution of features regarding a specific prediction of the model, ii) exploring the features that played a significant role for a trained model when analyzed across the corpus.

The Role of Contextual Word Embeddings in Correcting the 'de/da' Clitic Errors in Turkish

H Öztürk, A Değirmenci, O Güngör, S Uskudarli

IEEE

An extended overview of the CLEF 2020 ChEMU lab: information extraction of chemical reactions from patents

Jiayuan He, Dat Quoc Nguyen, Saber A Akhondi, Christian Druckenbrodt, Camilo Thorne, Ralph Hoessel, Zubair Afzal, Zenan Zhai, Biaoyan Fang, Hiyori Yoshikawa, Ameer Albahem, Jingqi Wang, Yuankai Ren, Zhi Zhang, Yaoyun Zhang, Mai Hoang Dao, Pedro Ruas, Andre Lamurias, Francisco M Couto, Jenny Copara, Nona Naderi, Julien Knafou, Patrick Ruch, Douglas Teodoro, Daniel Lowe, John Mayfield, Abdullatif Köksal, Hilal Dönmez, Elif Özkirimli, Arzucan Özgür, Darshini Mahendran, Gabrielle Gurdin, Nastassja Lewinski, Christina Tang, Bridget T McInness, CS Malarkodi, Pattabhi Rk Rao, Sobha Lalitha Devi, Lawrence Cavedon, Trevor Cohn, Timothy Baldwin, Karin Verspoor

22-25 September 2020

The RELX Dataset and Matching the Multilingual Blanks for Cross-Lingual Relation Classification

Abdullatif Köksal, Arzucan Özgür

Findings of the Association for Computational Linguistics: EMNLP 2020

Vapur: A Search Engine to Find Related Protein-Compound Pairs in COVID-19 Literature

Abdullatif Köksal, Hilal Dönmez, Rıza Özçelik, Elif Ozkirimli, Arzucan Özgür

Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020

Resources for Turkish Dependency Parsing: Introducing the BOUN Treebank and the BoAT Annotation Tool

Utku Türk, Furkan Atmaca, Şaziye Betül Özateş, Gözde Berk, Seyyit Talha Bedir, Abdullatif Köksal, Balkız Öztürk Başaran, Tunga Güngör, Arzucan Özgür

Language Resources and Evaluation

We introduce the resources that we developed for Turkish dependency parsing, which include a novel manually annotated treebank (BOUN Treebank), along with the guidelines we adopted, and a new annotation tool (BoAT).

The Role of Contextual Word Embeddings in Correcting the 'de/da' Clitic Errors in Turkish

Hasan Öztürk, Alperen Değirmenci, Onur Güngör, Suzan Uskudarli

28th Signal Processing and Communications Applications Conference (SIU)

One of the most common spelling errors in Turkish is regarding the clitic 'de/da'. People often misspell the 'de/da' either by treating it as a suffix when it should not be one, or by spelling it separately when it should be a suffix. Since Turkish is a morphologically rich agglutinative language, detecting and identifying such errors are difficult. As such, many widely used spell correction tools do not handle such mistakes well. In this work, we show that a sequence tagger model that employs a BERT model, which produces word embeddings that consider the context of a word, obtains higher performance compared to using non-contextual word embeddings instead. Training and evaluation tasks were performed with a dataset that was derived from a Turkish corpus using a special process in addition to a manually curated one. The contextual word embeddings obtained during this task are publicly shared with the research community.

NeuroBoun: An inquiry-based approach for exploring scientific literature--a use case in neuroscience

S Uskudarli, Erinç Gökdeniz, Resit Canbeyli

arXiv preprint arXiv:2001.00186

NeuroBoun: An inquiry-based approach for exploring scientific literature -- a use case in neuroscience

S. Uskudarli, E. Gökdeniz, R. Canbeyli

Microblog topic identification using Linked Open Data

Ahmet Yıldırım, Suzan Uskudarli

PLOS ONE

Much valuable information is embedded in social media posts (microposts) which are contributed by a great variety of persons about subjects that are of interest to others. The automated utilization of this information is challenging due to the overwhelming quantity of posts and the distributed nature of the information related to subjects across several posts. Numerous approaches have been proposed to detect topics from collections of microposts, where the topics are represented by lists of terms such as words, phrases, or word embeddings. Such topics are used in tasks like classification and recommendations. The interpretation of topics is considered a separate task in such methods, albeit they are becoming increasingly human-interpretable. This work proposes an approach for identifying machine-interpretable topics of collective interest. We define topics as a set of related elements that are associated by having posted in the same contexts. To represent topics, we introduce an ontology specified according to the W3C recommended standards. The elements of the topics are identified via linking entities to resources published on Linked Open Data (LOD). Such representation enables processing topics to provide insights that go beyond what is explicitly expressed in the microposts. The feasibility of the proposed approach is examined by generating topics from more than one million tweets collected from Twitter during various events. The utility of these topics is demonstrated with a variety of topic-related tasks along with a comparison of the effort required to perform the same tasks with words-list-based representations. Manual evaluation of randomly selected 36 sets of topics yielded 81.0% and 93.3% for the precision and F1 scores respectively.

Dependency Parser

Şaziye Betül Özateş, Tunga Güngör, Arzucan Özgür, Balkız Öztürk Başaran

Boğaziçi University

Exploring chemical space using natural language processing methodologies for drug discovery

Hakime Öztürk, Arzucan Özgür, Philippe Schwaller, Teodoro Laino, Elif Ozkirimli

Elsevier Current Trends

The role of contextual word embeddings in correcting the ‘de/da’ clitic errors in Turkish

Hasan Öztürk, Alperen Değirmenci, Onur Güngör, Suzan Uskudarli

IEEE

2019

Detecting Clitics Related Orthographic Errors in Turkish

Ugurcan Arikan, Onur Gungor, Suzan Uskudarli

International Conference on Recent Advances in Natural Language Processing (RANLP)

For the spell correction task, vocabulary based methods have been replaced with methods that take morphological and grammar rules into account. However, such tools are fairly immature, and, worse, non-existent for many low resource languages. Checking only if a word is well-formed with respect to the morphological rules of a language may produce false negatives due to the ambiguity resulting from the presence of numerous homophonic words. In this work, we propose an approach to detect and correct the “de/da” clitic errors in Turkish text. Our model is a neural sequence tagger trained with a synthetically constructed dataset consisting of positive and negative samples. The model's performance with this dataset is presented according to different word embedding configurations. The model achieved an F1 score of 86.67% on a synthetically constructed dataset. We also compared the model's performance on a manually curated dataset of challenging samples that proved superior to other spelling correctors with 71% accuracy compared to the second-best (Google Docs) with an accuracy of 34%.

Detecting clitics related orthographic errors in Turkish

Ugurcan Arikan, Onur Güngör, Suzan Uskudarli

Supervised learning methods in classifying organized behavior in tweet collections

Erdem Beğenilmiş, Susan Uskudarli

International Journal on Artificial Intelligence Tools

Supervised Learning Methods in Classifying Organized Behavior in Tweet Collections

Erdem Beğenilmiş, Susan Uskudarli

International Journal on Artificial Intelligence Tools

The successful use of social media to manipulate public opinion via bots and hired individuals to spread (mis)information to unsuspecting users reached alarming levels due to the manipulations during the 2016 US elections and the Brexit deliberations in the UK. Fake interactions such as “liking” and “retweeting” are staged to foster trust in the posts of bots and individuals, which makes it difficult for individuals to detect the posts that are part of greater schemes. We propose an approach based on supervised learning to classify collections of tweets as “organized” when they exhibit premeditated intent and as “organic” otherwise. Features related to users and posting behavior are used to train the classifiers using 851 data sets totaling above 270 million tweets. Further classifiers are trained to assess the effectiveness of the selected features. The random forest algorithm consistently yielded the best results with scores greater than 95% for both accuracy and f-measure. For comparison purposes, unsupervised learning methods were used to cluster the same data sets. The Gaussian Mixture Model clustered the organized vs. organic data set with 99% agreement with the labels. The success of using only behavioral features to detect organized behavior is encouraging.

Overview of the BioCreative VI Precision Medicine Track: mining protein interactions and mutations for precision medicine

Rezarta Islamaj Doğan, Sun Kim, Andrew Chatr-Aryamontri, Chih-Hsuan Wei, Donald C Comeau, Rui Antunes, Sérgio Matos, Qingyu Chen, Aparna Elangovan, Nagesh C Panyam, Karin Verspoor, Hongfang Liu, Yanshan Wang, Zhuang Liu, Berna Altınel, Zehra Melce Hüsünbeyi, Arzucan Özgür, Aris Fergadis, Chen-Kai Wang, Hong-Jie Dai, Tung Tran, Ramakanth Kavuluru, Ling Luo, Albert Steppi, Jinfeng Zhang, Jinchan Qu, Zhiyong Lu

Database

The effect of morphology in named entity recognition with sequence tagging

Onur Güngör, Tunga Güngör, Suzan Üsküdarli

Natural Language Engineering

The effect of morphology in named entity recognition with sequence tagging

Onur Güngör, Tunga Güngör, Suzan Uskudarli

Natural Language Engineering

This work proposes a sequential tagger for named entity recognition in morphologically rich languages. Several schemes for representing the morphological analysis of a word in the context of named entity recognition are examined. Word representations are formed by concatenating word and character embeddings with the morphological embeddings based on these schemes. The impact of these representations is measured by training and evaluating a sequential tagger composed of a conditional random field layer on top of a bidirectional long short-term memory layer. Experiments with Turkish, Czech, Hungarian, Finnish, and Spanish produce state-of-the-art results for all these languages, indicating that the representation of morphological information improves performance.

Identifying Image Related Sentences in News Articles

Melike Esma İlter, Lale Akarun, Arzucan Özgür

IEEE

Statistical representation models for mutation information within genomic data

N Özlem Özcan Şimşek, Arzucan Özgür, Fikret Gürgen

BMC bioinformatics

BOUN-ISIK participation: an unsupervised approach for the named entity normalization and relation extraction of bacteria biotopes

Ilknur Karadeniz, Ömer Faruk Tuna, Arzucan Özgür

Linking entities through an ontology using word embeddings and syntactic re-ranking

Ilknur Karadeniz, Arzucan Özgür

BMC bioinformatics

Machine learning-based identification and rule-based normalization of adverse drug reactions in drug labels

Mert Tiftikci, Arzucan Özgür, Yongqun He, Junguk Hur

BMC bioinformatics

Improving the annotations in the Turkish Universal Dependency Treebank

Utku Türk, Furkan Atmaca, Şaziye Betül Özateş, Balkız Öztürk Başaran, Tunga Güngör, Arzucan Özgür

Turkish treebanking: Unifying and constructing efforts

Utku Türk, Furkan Atmaca, Şaziye Betül Özateş, Abdullatif Köksal, Balkız Öztürk Başaran, Tunga Güngör, Arzucan Özgür

Turkish tweet classification with transformer encoder

Atıf Emre Yüksel, Yaşar Alim Türkmen, Arzucan Özgür, Berna Altınel

WideDTA: prediction of drug-target binding affinity

Hakime Öztürk, Elif Ozkirimli, Arzucan Özgür

arXiv preprint arXiv:1902.04166

2018

Organized Behavior Classification of Tweet Sets using Supervised Learning Methods

E Begenilmiş, S. Uskudarlı

to appear

Organized Behavior Classification of Tweet Sets using Supervised Learning Methods

Erdem Beğenilmiş, Suzan Uskudarli

8th International Conference on Web Intelligence, Mining and Semantics (WIMS)

Segmenting hashtags and analyzing their grammatical structure

Arda Celebi, Arzucan Özgür

Journal of the Association for Information Science and Technology

Towards an ontology-driven clinical experience sharing ecosystem: Demonstration with liver cases

María del Mar Roldán-García, Suzan Uskudarli, Neda B Marvasti, Burak Acar, José F Aldana-Montes

Expert Systems with Applications

Past medical cases, hence clinical experience, are invaluable resources in supporting clinical practice, research, and education. Medical professionals need to be able to exchange information about patient cases and explore them from subjective perspectives. This requires a systematic and flexible methodology for case representation that supports the exchange of processable patient information. We present an ontology-based approach to modeling patient cases and use patients with liver disease conditions as an example. To this end, a novel ontology, lico, that utilizes well-known medical standards is proposed to represent liver patient cases. The utility of the proposed approach is demonstrated with semantic queries and reasoning using data collected from real patients. The preliminary results are promising with regard to the potential of ontology-based medical case representation for building case-based search and retrieval systems, paving the way towards a Clinical Experience Sharing platform for comparative diagnosis, research, and education.

A closed-domain question answering framework using reliable resources to assist students

Caner Derici, Yiğit Aydin, Çiğdem Yenialaca, Nihal Yağmur Aydin, Günizi Kartal, Arzucan Özgür, Tunga Güngör

Natural Language Engineering

Renarration for All

TB Dinesh, S Uskudarli

arXiv preprint arXiv:1810.12379

Improving named entity recognition by jointly learning to disambiguate morphological tags

Onur Güngör, Suzan Uskudarli, Tunga Güngör

Named Entity Recognizer

Onur Güngör, Susan Uskudarli, Tunga Güngör

Boğaziçi University

Recurrent neural networks for Turkish named entity recognition

Onur Güngör, Suzan Üsküdarlı, Tunga Güngör

IEEE

Improving Named Entity Recognition by Jointly Learning to Disambiguate Morphological Tags

Onur Gungor, Suzan Uskudarli, Tunga Gungor

27th International Conference on Computational Linguistics (COLING)

Recurrent neural networks for Turkish named entity recognition

Onur Güngör, Suzan Uskudarli, Tunga Güngör

26th Signal Processing and Communications Applications Conference (SIU)

Ontology-based literature mining and class effect analysis of adverse drug reactions associated with neuropathy-inducing drugs

Junguk Hur, Arzucan Özgür, Yongqun He

Journal of biomedical semantics

Towards an ontology-driven clinical experience sharing ecosystem: Demonstration with liver cases

María del Mar Roldán-García, Suzan Uskudarli, Neda B Marvasti, Burak Acar, José F Aldana-Montes

Expert Systems With Applications

Semi-Supervised Psychometric Scoring of Document Collections

Burak Suyunu, Gonul Ayci, Mine Öğretir, Ali Taylan Cemgil, Suzan Uskudarli, Hamza Zeytinoglu, Bulent Ozel, Arman Boyacı

IEEE

Semi-Supervised Psychometric Scoring of Document Collections

Burak Suyunu, Gonul Ayci, Mine Öğretir, Ali Taylan Cemgil, Suzan Uskudarli, Hamza Zeytinoglu, Bulent Ozel, Arman Boyacı

International Conference on Data Mining Workshops (ICDMW)

We describe a generic computational approach that can be used in developing methods for psychometric profiling. Our approach is based on semi-supervised analysis of document collections using topic modeling. The method depends on a supervisor providing a set of seed documents, grouped by abstract themes, such as Schwartz values or personality traits, and possibly a separate background document corpus. Instead of casting the problem into a standard classification framework, we interpret the group labels as a guide for finding distinguishing features. During training, we process each group of documents associated with a theme separately, using nonnegative matrix factorization to obtain theme-specific topic distributions. In the analysis, we decompose a new document using the model learned during training to arrive at the theme scores. We demonstrate our approach on two psychometric profiling theories (Schwartz and Big Five). We evaluate our Schwartz scores with leave-one-out cross-validation and compare our Big Five scores to independent surveys, which are much more costly to carry out.

The information revealed by processing semantic topics extracted from collective short posts

Ahmet Yildirim, Suzan Üsküdarli

IEEE

A morphology-based representation model for LSTM-based dependency parsing of agglutinative languages

Şaziye Betül Özateş, Arzucan Özgür, Tunga Güngör, Balkız Öztürk

A novel methodology on distributed representations of proteins using their interacting ligands

Hakime Öztürk, Elif Ozkirimli, Arzucan Özgür

Bioinformatics

DeepDTA: deep drug–target binding affinity prediction

Hakime Öztürk, Arzucan Özgür, Elif Ozkirimli

Bioinformatics

2017

Text classification using ontology and semantic values of terms for mining protein interactions and mutations

B Altinel, ZM Husunbeyi, A Ozgur

Proceedings of the BioCreative VI Workshop

BUSEM at SemEval-2017 Task 4A: Sentiment analysis with word embedding and long short term memory RNN approaches

Deger Ayata, Murat Saraclar, Arzucan Özgür

Political opinion/sentiment prediction via long short term memory recurrent neural networks on Twitter

Değer Ayata, Murat Saraçlar, Arzucan Özgür

IEEE

Turkish tweet sentiment analysis with word embedding and machine learning

Değer Ayata, Murat Saraçlar, Arzucan Özgür

IEEE

Automatic query generation using word embeddings for retrieving passages describing experimental methods

Ferhat Aydın, Zehra Melce Hüsünbeyi, Arzucan Özgür

Database

Description of the BOUN System for the Trilingual Entity Detection and Linking Tasks at TAC KBP 2017.

Arda Celebi, Arzucan Özgür

Morphological embeddings for named entity recognition in morphologically rich languages

Onur Gungor, Eray Yildiz, Suzan Uskudarli, Tunga Gungor

arXiv preprint arXiv:1706.00506

Ontology-based literature mining of E. coli vaccine-associated gene interaction networks

Junguk Hur, Arzucan Özgür, Yongqun He

Journal of biomedical semantics

BIOSSES: a semantic sentence similarity estimation system for the biomedical domain

Gizem Soğancıoğlu, Hakime Öztürk, Arzucan Özgür

Bioinformatics

Extracting Adverse Drug Reactions using Deep Learning and Dictionary Based Approaches.

Mert Tiftikci, Arzucan Özgür, Yongqun He, Junguk Hur

CNN-based chemical–protein interactions classification

Atakan Yüksel, Hakime Öztürk, Elif Ozkirimli, Arzucan Özgür

Proceedings of the BioCreative VI Workshop

2016

Segmenting Hashtags using Automatically Created Training Data

Arda Çelebi, Arzucan Ozgur

Automated neuroanatomical relation extraction: a linguistically motivated approach with a PVT connectivity graph case study

Erinç Gökdeniz, Arzucan Özgür, Reşit Canbeyli

Frontiers in neuroinformatics

BioCreative V BioC track overview: collaborative biocurator assistant task for BioGRID

Sun Kim, Rezarta Islamaj Doğan, Andrew Chatr-Aryamontri, Christie S Chang, Rose Oughtred, Jennifer Rust, Riza Batista-Navarro, Jacob Carter, Sophia Ananiadou, Sergio Matos, Andre Santos, David Campos, José Luís Oliveira, Onkar Singh, Jitendra Jonnagaddala, Hong-Jie Dai, Emily Chia-Yu Su, Yung-Chun Chang, Yu-Chen Su, Chun-Han Chu, Chien Chin Chen, Wen-Lian Hsu, Yifan Peng, Cecilia Arighi, Cathy H Wu, K Vijay-Shanker, Ferhat Aydın, Zehra Melce Hüsünbeyi, Arzucan Özgür, Soo-Yong Shin, Dongseop Kwon, Kara Dolinski, Mike Tyers, W John Wilbur, Donald C Comeau

Database

Named entity recognition on Twitter for Turkish using semi-supervised learning with word embeddings

Eda Okur, Hakan Demir, Arzucan Özgür

Towards building a political protest database to explain changes in the welfare state

Cagil Sonmez, Arzucan Özgür, Erdem Yörük

Ontology-based categorization of bacteria and habitat entities using information retrieval techniques

Mert Tiftikci, Hakan Şahin, Berfu Büyüköz, Alper Yayıkçı, Arzucan Özgür

Identifying topics in microblogs using Wikipedia

Ahmet Yıldırım, Suzan Uskudarli, Arzucan Özgür

PLOS ONE

Twitter is an extremely high-volume platform for user-generated contributions regarding any topic. The wealth of content created in real time in massive quantities calls for automated approaches to identify the topics of the contributions. Such topics can be utilized in numerous ways, such as public opinion mining, marketing, entertainment, and disaster management. Towards this end, approaches to relate single or partial posts to knowledge base items have been proposed. However, in microblogging systems like Twitter, topics emerge from the culmination of a large number of contributions. Therefore, it is necessary to identify topics based on collections of posts, where individual posts contribute to some aspect of the greater topic. Models such as Latent Dirichlet Allocation (LDA) propose algorithms for relating collections of posts to sets of keywords that represent underlying topics. In these approaches, figuring out which specific topic(s) the keyword sets represent remains a separate task. Another issue in topic detection is the scope, which is often limited to a specific domain, such as health. This work proposes an approach for identifying domain-independent specific topics related to sets of posts. In this approach, individual posts are processed and then aggregated to identify key tokens, which are then mapped to specific topics. Wikipedia article titles are selected to represent topics, since they are up-to-date, user-generated, sophisticated articles that span topics of human interest. This paper describes the proposed approach, a prototype implementation, and a case study based on data gathered during the heavily contributed periods corresponding to the four US election debates in 2012. The manually evaluated results (0.96 precision) and other observations from the study are discussed in detail.

Identifying topics in microblogs using Wikipedia

Ahmet Yıldırım, Suzan Üsküdarlı, Arzucan Özgür

PLOS ONE

Sentence similarity based on dependency tree kernels for multi-document summarization

Şaziye Betül Özateş, Arzucan Özgür, Dragomir Radev

Ignet: A centrality and INO-based web system for analyzing and visualizing literature-mined networks

Arzucan Özgür, Junguk Hur, Zuoshuang Xiang, Edison Ong, Dragomir R Radev, Yongqun He

Bioinformatics

The Interaction Network Ontology-supported modeling and mining of complex interactions represented with multiple keywords in biomedical literature

Arzucan Özgür, Junguk Hur, Yongqun He

BioData mining

A comparative study of SMILES-based compound similarity functions for drug-target interaction prediction

Hakime Öztürk, Elif Ozkirimli, Arzucan Özgür

BMC bioinformatics

2015

Retrieving Passages Describing Experimental Methods using Ontology and Term Relevance based Query Matching

Ferhat Aydın, Zehra Melce Hüsünbeyi, Arzucan Özgür

GLASS: a comprehensive database for experimentally validated GPCR-ligand associations

Wallace KB Chan, Hongjiu Zhang, Jianyi Yang, Jeffrey R Brender, Junguk Hur, Arzucan Özgür, Yang Zhang

Bioinformatics

Question analysis for a closed domain question answering system

Caner Derici, Kerem Celik, Ekrem Kutbay, Yiğit Aydın, Tunga Güngör, Arzucan Özgür, Günizi Kartal

Springer International Publishing

A review on computational systems biology of pathogen–host interactions

Saliha Durmuş, Tunahan Çakır, Arzucan Özgür, Reinhard Guthke

Frontiers Media SA

DRENAJ: Distributed social media data collection system

Onur Güngör, Suzan Uskudarli, A Taylan Cemgil

IEEE

Development and application of an interaction network ontology for literature mining of vaccine-associated gene-gene interactions

Junguk Hur, Arzucan Özgür, Zuoshuang Xiang, Yongqun He

Journal of biomedical semantics

Detection and categorization of bacteria habitats using shallow linguistic analysis

Ilknur Karadeniz, Arzucan Özgür

BMC bioinformatics

Literature Mining and Ontology based Analysis of Host-Brucella Gene–Gene Interaction Network

Ilknur Karadeniz, Junguk Hur, Yongqun He, Arzucan Özgür

Frontiers in microbiology

Overview of the ImageCLEF 2015 liver CT annotation task.

Neda Barzegar Marvasti, Maria del Mar Roldan Garcia, Suzan Üsküdarli, José Francisco Aldana Montes, Burak Acar

General overview of ImageCLEF at the CLEF 2015 labs

Henning Müller, Mauricio Villegas, Andrew Gilbert, Lucas Piras, Josiah Wang, Krystian Mikolajczyk, Alba García Seco de Herrera, Stefano Bromuri, M Ashraful Amin, Mahmood Kazi Mohammed, Burak Acar, Suzan Uskudarli, Neda Marvasti, José Aldana, María del Mar Roldan García

8–11 September 2015

Amaçlı Sanal Topluluklar İçin Ontoloji Tabanlı Uygulama Üretme Platformu

M. Seyhan, S. Uskudarli

ISBN 9789750621185

General Overview of ImageCLEF at the CLEF 2015 Labs

Mauricio Villegas, Henning Müller, Andrew Gilbert, Luca Piras, Josiah Wang, Krystian Mikolajczyk, Alba G. Seco de Herrera, Stefano Bromuri, M. Ashraful Amin, Mahmood Kazi Mohammed, Burak Acar, Suzan Uskudarli, Neda B. Marvasti, José F. Aldana, María del Mar Roldán García

International Conference of the Cross-Language Evaluation Forum for European Languages

Sosyal Ağlar Üzerinden Deprem Tespiti

Kıvanç Yazan, S Uskudarli

Bir Ontoloji ile Mikroblog Ortamlarının Modellenmesi ile, İçeriklerin Anlamsal Olarak Erişilebilir Hale Getirilmesi ve Sorgulanması

Ahmet Yıldırım, Suzan Üsküdarlı

Anadolu Üniversitesi yayınları, https://ab.org.tr/ab15/bildiri/452.pdf

Extension of the Interaction Network Ontology for literature mining of gene-gene interaction networks from sentences with multiple interaction keywords

Arzucan Özgür, Junguk Hur, Yongqun He

Classification of Beta-Lactamases and Penicillin Binding Proteins Using Ligand-Centric Network Models.

Hakime Öztürk, Elif Ozkirimli, Arzucan Özgür

PLOS ONE

2014

Expanding Machine Translation Training Data with an Out-of-Domain Corpus using Language Modeling based Vocabulary Saturation

Burak Aydın, Arzucan Özgür

ImageCLEF 2014: Overview and analysis of the results

Barbara Caputo, Henning Müller, Jesus Martinez-Gomez, Mauricio Villegas, Burak Acar, Novi Patricia, Neda Marvasti, Suzan Üsküdarlı, Roberto Paredes, Miguel Cazorla, Ismael Garcia-Varea, Vicente Morell

Springer International Publishing

Self-training a Constituency Parser using N-gram Trees

Arda Celebi, Arzucan Ozgur

European Language Resources Association (ELRA)

Improving Named Entity Recognition for Morphologically Rich Languages using Word Embeddings

Hakan Demir, Arzucan Özgür

Türkçe Soru Cevaplama Sistemlerinde Kural Tabanlı Odak Çıkarımı (Rule-Based Focus Extraction in Turkish Question Answering Systems)

Caner Derici, Kerem Çelik, Arzucan Özgür, Tunga Güngör, Ekrem Kutbay, Yigit Aydın, Günizi Kartal

Semantic description of liver CT images: an ontological approach

Nadin Kökciyan, Rüştü Türkay, Suzan Üsküdarli, Pınar Yolum, Barış Bakır, Burak Acar

IEEE journal of biomedical and health informatics

Semantic description of liver CT images: An ontological approach

Nadin Kökciyan, Rüştü Türkay, Suzan Uskudarli, Pınar Yolum, Bariş Bakir, Burak Acar

Journal of Biomedical and Health Informatics

Radiologists inspect CT scans and record their observations in reports to communicate with physicians. These reports may suffer from ambiguous language and inconsistencies resulting from subjective reporting styles, which present challenges in interpretation. Standardization efforts, such as the lexicon RadLex for radiology terms, aim to address this issue by developing standard vocabularies. While such vocabularies handle consistent annotation, they fall short in sufficiently processing reports for intelligent applications. To support such applications, the semantics of the concepts as well as their relationships must be modeled, for which ontologies are effective. They enable the software to make inferences beyond what is present in the reports. This paper presents the open-source ontology onlira (Ontology of the Liver for Radiology), which is developed to support intelligent applications such as identifying and ranking similar liver patient cases. onlira is introduced in terms of its concepts, properties, and relations. Examples of real liver patient cases are provided for illustration purposes. The ontology is evaluated in terms of its ability to express real liver patient cases and address semantic queries.

Bayesian pathway analysis of cancer microarray data

Melike Korucuoglu, Senol Isci, Arzucan Ozgur, Hasan H Otu

PLOS ONE

ImageCLEF Liver CT Image Annotation Task 2014

N B Marvasti, N Kökciyan, R Türkay, A Yazıcı, P Yolum, S Uskudarli, B Acar

http://ceur-ws.org/Vol-1180/CLEF2014wn-Image-MarvastiEt2014.pdf

Analyzing stemming approaches for Turkish multi-document summarization

Muhammed Yavuz Nuzumlalı, Arzucan Özgür

Turkish MDS Data Set

Muhammed Yavuz Nuzumlalı, Arzucan Özgür

Association for Computational Linguistics

Turkish Multi-document Summarization (MDS) Corpus

Muhammed Yavuz Nuzumlalı, Arzucan Özgür

Boğaziçi University

A systems pharmacology approach to model tyrosine kinase inhibitor‐induced cardiotoxicity gene interaction networks (844.17)

Sirarat Sarntivijai, Junguk Hur, Arzucan Ozgur, Keith Burkhart, Yongqun He, Gilbert Omenn, Brian Athey, Darrell Abernethy

The FASEB Journal

Predicting gene interactions of tyrosine kinase inhibitors induced cardiotoxicity with the Ontology of Adverse Events-assisted bioinformatics approach.

S Sarntivijai, J Hur, A Ozgur, K Burkhart, Y He, GS Omenn, BD Athey, DR Abernethy

Nature Publishing Group

A Graph-based Approach for Contextual Text Normalization

Cagil Sönmez, Arzucan Özgür

Association for Computational Linguistics

2013

Clinical experience sharing by similar case retrieval

Neda Barzegar Marvasti, Ceyhun Burak Akgül, Burak Acar, Nadin Kökciyan, Suzan Uskudarli, Pınar Yolum, Rüstü Türkay, Barıs Bakır

1st ACM international workshop on Multimedia indexing and information retrieval for healthcare

N-gram Parsing for Jointly Training a Discriminative Constituency Parser

Arda Celebi, Arzucan Ozgur

Bacteria biotope detection, ontology-based normalization, and relation extraction using syntactic rules

Ilknur Karadeniz, Arzucan Özgür

BOUNCE: Sentiment Classification in Twitter using Rich Feature Sets

Nadin Kökciyan, Arda Celebi, Arzucan Ozgur, Suzan Uskudarli

Association for Computational Linguistics

Bounce: Sentiment classification in Twitter using rich feature sets

Nadin Kökciyan, Arda Celebi, Arzucan Ozgür, Suzan Uskudarli

Second Joint Conference on Lexical and Computational Semantics (*SEM)

Clinical experience sharing by similar case retrieval

Neda Barzegar Marvasti, Ceyhun Burak Akgül, Burak Acar, Nadin Kökciyan, Suzan Üsküdarlı, Pınar Yolum, Rüstü Türkay, Barıs Bakır

PHISTO: A New Web Platform for Pathogen-Human Interactions

Saliha Durmuş Tekir, Tunahan Çakır, Emre Ardıç, İlknur Karadeniz, Arzucan Özgür, Fatih Erdoğan Sevilgen, Kutlu Ö Ülgen

Computational Methods in Systems Biology: 11th International Conference, CMSB 2013, Klosterneuburg, Austria, September 22-24, 2013, Proceedings

PHISTO: pathogen-host interaction search tool

Saliha Durmus Tekir, Tunahan Cakir, Emre Ardic, Ali Semih Sayilirbas, Gokhan Konuk, Mithat Konuk, Hasret Sariyer, Azat Ugurlu, Ilknur Karadeniz, Arzucan Ozgur, Fatih Erdogan Sevilgen, Kutlu O Ulgen

Bioinformatics

Münazaraların Twitter'da Etkisinin Araştırılması

Ahmet Yıldırım, Suzan Uskudarli

Akademik Bilişim 2013

Münazaraların Twitter'da etkisinin araştırılması

Ahmet Yıldırım, Suzan Üsküdarlı

https://ab.org.tr/ab13/kitap/yildirim_uskudarli_AB13.pdf

Mikroblog İleti Kümelerinde Konu Algılama Yönteminin İncelenmesi

Ahmet Yıldırım, Suzan Üsküdarlı, Arzucan Özgür

Akademik Bilişim

Word polarity detection using a multilingual approach

Cüneyd Murad Özsert, Arzucan Özgür

Springer Berlin Heidelberg

2012

Content based microblogger recommendation

H Burak Celebi, Susan Uskudarli

IEEE

Content Based Microblogger Recommendation

H Burak Celebi, Suzan Uskudarli

International Conference on Social Computing (SocialCom) Privacy, Security, Risk and Trust (PASSAT)

A social web for another billion

TB Dinesh, Suzan Uskudarli

Proceedings of M4D 2012 28-29 February 2012 New Delhi, India

Alipi: A framework for re-narrating web pages

T. B. Dinesh, S Uskudarli, Subramanya Sastry, Deepti Aggarwal, Venkatesh Choppella

Lyon, France

System and method for generating queries

George Erhart, Valentine Matula, Arzucan Ozgur, David Skiba

Identification of fever and vaccine-associated gene interaction networks using ontology-based literature mining

Junguk Hur, Arzucan Özgür, Zuoshuang Xiang, Yongqun He

Journal of biomedical semantics

User generated human computation applications

Nadin Kokciyan, Suzan Uskudarli, TB Dinesh

IEEE

Visual language theory

Kim Marriott, Bernd Meyer

Springer Science & Business Media

Identification of fever and vaccine-associated gene interaction networks using ontology-based literature mining

Arzucan Özgür, Junguk Hur, Zuoshuang Xiang, Yongqun Oliver He

2011

U-Compare bio-event meta-service: compatible BioNLP event extraction services

Yoshinobu Kano, Jari Björne, Filip Ginter, Tapio Salakoski, Ekaterina Buyko, Udo Hahn, K Bretonnel Cohen, Karin Verspoor, Christophe Roeder, Lawrence E Hunter, Halil Kilicoglu, Sabine Bergler, Sofie Van Landeghem, Thomas Van Parys, Yves Van de Peer, Makoto Miwa, Sophia Ananiadou, Mariana Neves, Alberto Pascual-Montano, Arzucan Özgür, Dragomir R Radev, Sebastian Riedel, Rune Saetre, Hong-Woo Chun, Jin-Dong Kim, Sampo Pyysalo, Tomoko Ohta, Jun'ichi Tsujii

BMC bioinformatics

Mining of vaccine-associated IFN-g gene interaction networks using the Vaccine Ontology

Arzucan Özgür, Zuoshuang Xiang, Dragomir R Radev, Yongqun He

J Biomed Semantics

2010

Analyzing Tags in Twitter Community

Duygu Saide Akman, Suzan Uskudarli

An Operator Provided M-learning Service -- A Preliminary Report

Haluk Bingol, M Gokhan Habiboglu, Suzan Uskudarli, Ahmet Yildirim, Onur Calikus, Cenk Sezgin, Sahin Yelkenci

International Association for Development of the Information Society

Exploring area-specific microblogger social networks

EA Degirmencioglu, S Uskudarli

Proceedings of the WebSci10: Extending the Frontiers of Society On-Line, April

Semantic tagprint - tagging and indexing content for semantic search and content management

Murat Kalender, Jiangbo Dang, Suzan Uskudarli

IEEE

Unipedia: A unified ontological knowledge platform for semantic content tagging and search

Murat Kalender, Jiangbo Dang, Suzan Uskudarli

IEEE

Semantic tagprint-tagging and indexing content for semantic search and content management

Murat Kalender, Jiangbo Dang, Suzan Uskudarli

Fourth International Conference on Semantic Computing (ICSC)

UNIpedia: A unified ontological knowledge platform for semantic content tagging and search

Murat Kalender, Jiangbo Dang, Suzan Uskudarli

Fourth International Conference on Semantic Computing (ICSC)

Text and network mining for literature-based scientific discovery in biomedicine

Arzucan Ozgur

Citation summarization through keyphrase extraction

Vahed Qazvinian, Dragomir Radev, Arzucan Özgür

Literature-Based Discovery of IFN-γ and Vaccine-Mediated Gene Interaction Networks

Arzucan Özgür, Zuoshuang Xiang, Dragomir R Radev, Yongqun He

Journal of Biomedicine and Biotechnology

2009

A web environment to support teaching introductory programming

Daghan Dinç, Suzan Üsküdarli

IEEE

Screen-replay: a session recording and analysis tool for DrScheme

M Fatih Köksal, RE Başar, S Üsküdarlı

Proceedings of the Scheme and Functional Programming Workshop, Technical Report, California Polytechnic State University, CPSLO-CSC-09

Michigan molecular interactions r2: from interacting proteins to pathways

V Glenn Tarcea, Terry Weymouth, Alex Ade, Aaron Bookvich, Jing Gao, Vasudeva Mahavisno, Zach Wright, Adriane Chapman, Magesh Jayapandian, Arzucan Özgür, Yuanyuan Tian, Jim Cavalcoli, Barbara Mirel, Jignesh Patel, Dragomir Radev, Brian Athey, David States, HV Jagadish

Nucleic acids research

Detecting speculations and their scopes in scientific text

Arzucan Özgür, Dragomir Radev

Supervised classification for extracting biomedical events

Arzucan Özgür, Dragomir Radev

2008

Introducing meta-services for biomedical information extraction

Florian Leitner, Martin Krallinger, Carlos Rodriguez-Penagos, Jörg Hakenberg, Conrad Plake, Cheng-Ju Kuo, Chun-Nan Hsu, Richard Tzong-Han Tsai, Hsi-Chuan Hung, William W Lau, Calvin A Johnson, Rune Saetre, Kazuhiro Yoshida, Yan Hua Chen, Sun Kim, Soo-Yong Shin, Byoung-Tak Zhang, William A Baumgartner Jr, Lawrence Hunter, Barry Haddow, Michael Matthews, Xinglong Wang, Patrick Ruch, Frédéric Ehrler, Arzucan Özgür, Güneş Erkan, Dragomir R Radev, Michael Krauthammer, ThaiBinh Luong, Robert Hoffmann, Chris Sander, Alfonso Valencia

Genome biology

Semantic Tagging and Inference in Online Communities

Ahmet Yıldırım, Suzan Uskudarli

International Conference on Semantic Systems (I-SEMANTICS)

Semantic Tagging and Inference in Online Communities

A Yıldırım, S. Uskudarli

Journal UCS

Co-occurrence network of Reuters news

Arzucan Özgür, Burak Cetin, Haluk Bingol

International Journal of Modern Physics C

Identifying gene-disease associations using centrality on a literature mined gene-interaction network

Arzucan Özgür, Thuy Vu, Güneş Erkan, Dragomir R Radev

Bioinformatics

2007

DTI Application with Haptic Interfaces

Murat Aksoy, Neslehan Avcu, Susana Merino-Caviedes, Engin Deniz Diktas, Miguel Angel Martín-Fernández, Sıla Girgin, Ioannis Marras, Emma Munoz-Moreno, Erkin Tekeli, Burak Acar, Roland Bammer, Marcos Martin-Fernandez, Ali Vahit Sahiner, Suzan Uskudarli