ZORA (Zurich Open Repository and Archive)

Data-driven information extraction and enrichment of molecular profiling data for cancer cell lines

Smith, Ellery; Paloots, Rahel; Giagkos, Dimitris; Baudis, Michael; Stockinger, Kurt (2024). Data-driven information extraction and enrichment of molecular profiling data for cancer cell lines. Bioinformatics advances, 4(1):vbae045.

Abstract

Motivation
With the proliferation of research means and computational methodologies, published biomedical literature is growing exponentially in numbers and volume. Cancer cell lines are frequently used models in biological and medical research that are currently applied for a wide range of purposes, from studies of cellular mechanisms to drug development, which has led to a wealth of related data and publications. Sifting through large quantities of text to gather relevant information on cell lines of interest is tedious and extremely slow when performed by humans. Hence, novel computational information extraction and correlation mechanisms are required to boost meaningful knowledge extraction.

Results
In this work, we present the design, implementation, and application of a novel data extraction and exploration system. This system extracts deep semantic relations between textual entities from scientific literature to enrich existing structured clinical data concerning cancer cell lines. We introduce a new public data exploration portal, which enables automatic linking of genomic copy number variant plots with ranked, related entities such as affected genes. Each relation is accompanied by literature-derived evidence, allowing for deep, yet rapid, literature search, using existing structured data as a springboard.

Availability and implementation
Our system is publicly available on the web at https://cancercelllines.org.

Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: 07 Faculty of Science > Institute of Molecular Life Sciences
Dewey Decimal Classification: 570 Life sciences; biology
Scopus Subject Areas: Life Sciences > Structural Biology; Life Sciences > Molecular Biology; Life Sciences > Genetics; Physical Sciences > Computer Science Applications
Uncontrolled Keywords: Computer Science Applications, Genetics, Molecular Biology, Structural Biology
Language: English
Date: 16 March 2024
Deposited On: 25 Apr 2024 07:26
Last Modified: 31 Dec 2024 02:37
Publisher: Oxford University Press
ISSN: 2635-0041
OA Status: Gold
Free access at: Publisher DOI. An embargo period may apply.
Publisher DOI: https://doi.org/10.1093/bioadv/vbae045
PubMed ID: 38560553
Project Information:
  • Funder: H2020
  • Grant ID: 863410
  • Project Title: INODE - Intelligent Open Data Exploration
Full Text (PDF):
  • Content: Published Version
  • Language: English
  • Licence: Creative Commons: Attribution 4.0 International (CC BY 4.0)
