author_facet Zhang, Jiexin
Zhang, Li
Coombes, Kevin R.
author Zhang, Jiexin
Zhang, Li
Coombes, Kevin R.
spellingShingle Zhang, Jiexin
Zhang, Li
Coombes, Kevin R.
Bioinformatics
Gene sequence signatures revealed by mining the UniGene affiliation network
Computational Mathematics
Computational Theory and Mathematics
Computer Science Applications
Molecular Biology
Biochemistry
Statistics and Probability
author_sort zhang, jiexin
spelling Zhang, Jiexin Zhang, Li Coombes, Kevin R. 1367-4811 1367-4803 Oxford University Press (OUP) Computational Mathematics Computational Theory and Mathematics Computer Science Applications Molecular Biology Biochemistry Statistics and Probability http://dx.doi.org/10.1093/bioinformatics/bti796 Abstract. Background: In the post-genomic era, developing tools to decode biological information from genomic sequences is important. Inspired by affiliation network theory, we investigated gene sequences of two kinds of UniGene clusters (UCs): narrowly expressed transcripts (NETs), whose expression is confined to a few tissues; and prevalently expressed transcripts (PETs) that are expressed in many tissues. Results: We explored the human and the mouse UniGene databases to compare NETs and PETs from different perspectives. We found that NETs were associated with smaller cluster size, shorter sequence length, a lower likelihood of having LocusLink annotations, and lower and more sporadic levels of expression. Significantly, the dinucleotide frequencies of NETs are similar to those of intergenic sequences in the genome, and they differ from those of PETs. We used these differences in dinucleotide frequencies to develop a discriminant analysis model to distinguish PETs from intergenic sequences. Conclusions: Our results show that most NETs resemble intergenic sequences, casting doubts on the quality of such UniGene clusters. However, we also noted that a fraction of NETs resemble PETs in terms of dinucleotide frequencies and other features. Such NETs may have fewer quality problems. This work may be helpful in the studies of non-coding RNAs and in the validation of gene sequence databases. Contact: kcoombes@mdanderson.org Gene sequence signatures revealed by mining the UniGene affiliation network Bioinformatics
doi_str_mv 10.1093/bioinformatics/bti796
facet_avail Online
Free
finc_class_facet Mathematics
Computer Science
Biology
Chemistry and Pharmacy
format ElectronicArticle
fullrecord blob:ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTA5My9iaW9pbmZvcm1hdGljcy9idGk3OTY
id ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTA5My9iaW9pbmZvcm1hdGljcy9idGk3OTY
institution DE-D275
DE-Bn3
DE-Brt1
DE-Zwi2
DE-D161
DE-Gla1
DE-Zi4
DE-15
DE-Pl11
DE-Rs1
DE-105
DE-14
DE-Ch1
DE-L229
imprint Oxford University Press (OUP), 2006
imprint_str_mv Oxford University Press (OUP), 2006
issn 1367-4811
1367-4803
issn_str_mv 1367-4811
1367-4803
language English
mega_collection Oxford University Press (OUP) (CrossRef)
match_str zhang2006genesequencesignaturesrevealedbyminingtheunigeneaffiliationnetwork
publishDateSort 2006
publisher Oxford University Press (OUP)
recordtype ai
record_format ai
series Bioinformatics
source_id 49
title Gene sequence signatures revealed by mining the UniGene affiliation network
title_unstemmed Gene sequence signatures revealed by mining the UniGene affiliation network
title_full Gene sequence signatures revealed by mining the UniGene affiliation network
title_fullStr Gene sequence signatures revealed by mining the UniGene affiliation network
title_full_unstemmed Gene sequence signatures revealed by mining the UniGene affiliation network
title_short Gene sequence signatures revealed by mining the UniGene affiliation network
title_sort gene sequence signatures revealed by mining the unigene affiliation network
topic Computational Mathematics
Computational Theory and Mathematics
Computer Science Applications
Molecular Biology
Biochemistry
Statistics and Probability
url http://dx.doi.org/10.1093/bioinformatics/bti796
publishDate 2006
physical 385-391
description Abstract. Background: In the post-genomic era, developing tools to decode biological information from genomic sequences is important. Inspired by affiliation network theory, we investigated gene sequences of two kinds of UniGene clusters (UCs): narrowly expressed transcripts (NETs), whose expression is confined to a few tissues; and prevalently expressed transcripts (PETs) that are expressed in many tissues. Results: We explored the human and the mouse UniGene databases to compare NETs and PETs from different perspectives. We found that NETs were associated with smaller cluster size, shorter sequence length, a lower likelihood of having LocusLink annotations, and lower and more sporadic levels of expression. Significantly, the dinucleotide frequencies of NETs are similar to those of intergenic sequences in the genome, and they differ from those of PETs. We used these differences in dinucleotide frequencies to develop a discriminant analysis model to distinguish PETs from intergenic sequences. Conclusions: Our results show that most NETs resemble intergenic sequences, casting doubts on the quality of such UniGene clusters. However, we also noted that a fraction of NETs resemble PETs in terms of dinucleotide frequencies and other features. Such NETs may have fewer quality problems. This work may be helpful in the studies of non-coding RNAs and in the validation of gene sequence databases. Contact: kcoombes@mdanderson.org
container_issue 4
container_start_page 385
container_title Bioinformatics
container_volume 22
format_de105 Article, E-Article
format_de14 Article, E-Article
format_de15 Article, E-Article
format_de520 Article, E-Article
format_de540 Article, E-Article
format_dech1 Article, E-Article
format_ded117 Article, E-Article
format_degla1 E-Article
format_del152 Book
format_del189 Article, E-Article
format_dezi4 Article
format_dezwi2 Article, E-Article
format_finc Article, E-Article
format_nrw Article, E-Article
_version_ 1792329796955406347
geogr_code not assigned
last_indexed 2024-03-01T13:14:30.56Z
geogr_code_person not assigned
openURL url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fvufind.svn.sourceforge.net%3Agenerator&rft.title=Gene+sequence+signatures+revealed+by+mining+the+UniGene+affiliation+network&rft.date=2006-02-15&genre=article&issn=1367-4803&volume=22&issue=4&spage=385&epage=391&pages=385-391&jtitle=Bioinformatics&atitle=Gene+sequence+signatures+revealed+by+mining+the+UniGene+affiliation+network&aulast=Coombes&aufirst=Kevin+R.&rft_id=info%3Adoi%2F10.1093%2Fbioinformatics%2Fbti796&rft.language%5B0%5D=eng