
dc.contributor.author: Adie, Euan A
dc.contributor.author: Adams, Richard R
dc.contributor.author: Evans, Kathryn L
dc.contributor.author: Porteous, David
dc.contributor.author: Pickard, Ben S
dc.coverage.spatial: 13 [en]
dc.date.accessioned: 2005-04-07T16:18:47Z
dc.date.available: 2005-04-07T16:18:47Z
dc.date.issued: 2005-03-14
dc.identifier.citation: Speeding disease gene discovery by sequence based candidate prioritization. Euan A Adie, Richard R Adams, Kathryn L Evans, David J Porteous and Ben S Pickard. BMC Bioinformatics 2005, 6:55 [en]
dc.identifier.uri: http://www.biomedcentral.com/1471-2105/6/55
dc.identifier.uri: http://hdl.handle.net/1842/752
dc.description.abstract: Background: Regions of interest identified through genetic linkage studies regularly exceed 30 centimorgans in size and can contain hundreds of genes. Traditionally this number is reduced by matching functional annotation to knowledge of the disease or phenotype in question. However, here we show that disease genes share patterns of sequence-based features that can provide a good basis for automatic prioritization of candidates by machine learning. Results: We examined a variety of sequence-based features and found that for many of them there are significant differences between the sets of genes known to be involved in human hereditary disease and those not known to be involved in disease. We have created an automatic classifier called PROSPECTR based on those features using the alternating decision tree algorithm which ranks genes in the order of likelihood of involvement in disease. On average, PROSPECTR enriches lists for disease genes two-fold 77% of the time, five-fold 37% of the time and twenty-fold 11% of the time. Conclusion: PROSPECTR is a simple and effective way to identify genes involved in Mendelian and oligogenic disorders. It performs markedly better than the single existing sequence-based classifier on novel data. PROSPECTR could save investigators looking at large regions of interest time and effort by prioritizing [en]
dc.format.extent: 727419 bytes
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.publisher: BioMed Central Ltd. [en]
dc.subject: Mendelian disorders [en]
dc.subject: oligogenic disorders [en]
dc.subject: gene [en]
dc.title: Speeding disease gene discovery by sequence based candidate prioritization [en]
dc.type: Article [en]
